# Video AG Website Backend

This is the repository for the backend of the Video AG Website. It is mainly written in Python and uses Flask, SQLAlchemy, ffmpeg, Docker and Kubernetes.
## Running the API (e.g. if you are working on the frontend)

You need to have Docker installed. Then just execute `dev/run.sh` inside the project root directory. The API should be available at `localhost:5000`.
If you get 403 errors, the permissions probably need to be adjusted. For the Docker container to work, the 'other' permission class needs the execute bit set on every directory (including the `api/` directory) and the read bit set on every file.
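One way to apply those permissions is a `find`/`chmod` pass over the checkout (a sketch; run it from the project root and adjust the path if your checkout lives elsewhere):

```shell
# Give 'other' the execute bit on every directory...
find . -type d -exec chmod o+x {} +

# ...and the read bit on every file.
find . -type f -exec chmod o+r {} +
```

`chmod o+x`/`o+r` only add the named bits, so existing owner and group permissions are left untouched.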
In debug mode (`DEBUG=True` in the config `api/config/api_example_config.py`) the login data is:

- Username: `videoag`
- Password: `videoag`
Note: The database sometimes logs warnings that no transaction is running. This is because we always perform a `ROLLBACK` to ensure the next transaction can run flawlessly.
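The pattern behind that note can be sketched as follows. This is illustrative only: it uses the stdlib `sqlite3` in place of Postgres/SQLAlchemy, and all names are invented, not taken from the real code.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE lecture (id INTEGER PRIMARY KEY, title TEXT)")

def run_in_transaction(con, fn):
    """Run fn inside a transaction; commit on success, and always roll back
    afterwards so the connection is clean for the next caller."""
    try:
        result = fn(con)
        con.commit()
        return result
    finally:
        # Rolling back after a commit (or when no transaction is open) is a
        # no-op, but some drivers log a "no transaction is running" warning
        # here -- the harmless warning the note above describes.
        con.rollback()

run_in_transaction(con, lambda c: c.execute(
    "INSERT INTO lecture (title) VALUES (?)", ("Intro",)))
```

The unconditional `ROLLBACK` in the `finally` block guarantees a failed transaction never leaves the connection in an aborted state.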
## Logical Overview

Everything is executed inside a Kubernetes cluster:

- Postgres Database
  - Used to store all non-file data
  - Provided by the K8s cluster (see flux repo)
- API
  - Provides controlled access to the database for the outside world
  - Does NOT have access to any media files
  - Inside the `api/` subdirectory of this repo
- Job Controller
  - Takes jobs from the queue in the database and creates a K8s Job
  - Stores the result in the database when the job finishes/fails
  - Adds jobs to the queue if they are to be executed periodically
  - Inside the `job_controller/` subdirectory of this repo
- The different Jobs
  - Executed by the K8s cluster to do some work
  - e.g. any file operations are performed by jobs
  - Have access to the media files
  - Inside the `job_controller/jobs/` subdirectory of this repo
- File Provider
  - Serves the media files to the outside world
  - Simple nginx image (see flux repo)
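The Job Controller's queue-draining behavior can be sketched as a toy loop. Everything here (the queue representation, job shape and result storage) is invented for illustration; the real controller talks to Postgres and the Kubernetes API.

```python
from collections import deque

# Stand-ins for the database queue and the jobs' result storage.
job_queue = deque([{"id": 1, "type": "transcode"}])
results = {}

def launch_k8s_job(job):
    """Stand-in for creating a Kubernetes Job and waiting for it to finish."""
    return "success"

def controller_tick():
    """Take one job from the queue, run it, store the result in 'the database'."""
    if not job_queue:
        return False
    job = job_queue.popleft()
    results[job["id"]] = launch_k8s_job(job)
    return True

# Drain the queue.
while controller_tick():
    pass
```

Periodic jobs would then simply be re-enqueued by the controller when their schedule fires, which is the "adds jobs to the queue" responsibility listed above.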
## Implementation Overview

In addition to the subdirectories mentioned above, there is also `common_py/`, which contains some code used by the API, the job controller and the jobs.
## Build Pipeline

The Dockerfiles and the GitLab CI pipeline are dynamically generated by `build_pipeline_generator.py` based on the `build_config.py` inside the different modules. Currently, the `build_config.py` can contain the following variables/instructions (also see the API, which currently utilizes all of them):
| Name | Type | Optional | Description |
|---|---|---|---|
| `TARGET_IMAGE_NAME` | `str` | No | The name of the final docker image. Note that this name will be prepended by `development_` or `production_` before the image is put into the registry (as anyone can modify `development_` images in the registry). |
| `BUILD_DEPENDENCIES` | `list[str]` | Yes | Specifies the locations of other modules to load. Each string must be a path to a directory with a `build_config.py`, relative to the directory of the current file. |
| `PIP_REQUIREMENTS_FILE` | `str` | Yes | The path to the pip requirements file, relative to the directory of the current file. |
| `APT_RUNTIME_DEPENDENCIES` | `list[str]` | Yes | List of apt packages to install for the final image. |
| `APT_BUILD_DEPENDENCIES` | `list[str]` | Yes | List of apt packages to install only for the pip install. |
| `DOCKERFILE_EXTRA` | `str` | Yes | Dockerfile statements which are appended to the generated pip and apt install statements. The build context for docker is always the project root and the current workdir is `/code`. The term `$MODULE_DIR` is replaced by the path from the project root to the module directory. |
| `CI_TEST_JOB_TEMPLATE` | `str` | Yes | A GitLab CI job description in yaml using 4-space indentation. The template itself must not have any top-level indentation. The keys `stage`, `needs` and `image` are added automatically to the job. |
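Put together, a `build_config.py` using these variables might look like the following. All concrete values (image name, paths, package names) are invented examples, not the real API configuration:

```python
# Hypothetical build_config.py for a module.

TARGET_IMAGE_NAME = "api"  # becomes development_api or production_api in the registry

# Paths are relative to the directory containing this build_config.py:
BUILD_DEPENDENCIES = ["../common_py"]
PIP_REQUIREMENTS_FILE = "requirements.txt"

APT_RUNTIME_DEPENDENCIES = ["libpq5"]          # needed in the final image
APT_BUILD_DEPENDENCIES = ["libpq-dev", "gcc"]  # only needed for pip install

# Appended after the generated pip/apt statements; $MODULE_DIR expands to the
# module's path from the project root (the docker build context).
DOCKERFILE_EXTRA = """
COPY $MODULE_DIR/src /code/src
"""

# stage, needs and image are added automatically by the generator.
CI_TEST_JOB_TEMPLATE = """
script:
    - python run_tests.py
"""
```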
## Working on the Backend

### Run tests & Coverage

- Execute `dev/run.sh api-test` (execute from the project root!) to run the tests with docker
- The coverage report is put into `api/coverage/report.txt` and `api/coverage/html/index.html`
### Run locally

This shows you how to run the backend components locally and debug them:

- Install all the dependencies with `dev/install_requirements.sh` (execute from the project root!)
  - You may need to install some system packages for psycopg and uwsgi. See the apt requirements in the `build_config.py` files.
- (PyCharm specific, but probably similar for other IDEs):
  - Set `.venv/bin/python` as your python interpreter for this project
  - Mark `api/src`, `common_py/src` and `job_controller/src` as your Sources Root
  - Add a run profile for `api/src/run_tests.py`
    - Working directory: `backend/api/src`
    - Environment variables: `VIDEOAG_CONFIG=../config/api_example_config.py;VIDEOAG_TEST_CONFIG_OVERRIDE=../config/test_config_override.py`
    - Ensure the option `Add sources roots to PYTHONPATH` is enabled
- Execute `dev/run.sh db` (execute from the project root!) to start the database
- Execute your `run_tests.py` profile with the IDE and enjoy debugging!
## Authors

Initially written by Simon Künzel and Dorian Koch
## A few technical notes

### Dockerfiles

#### Caching with apt and pip (currently NOT implemented! Just keeping this here as a note)

There are two types of caching used: one for local building and one in the CI with kaniko.

##### Local Building

The `PIP_CACHE_DIR` and `APT_CACHE_DIR` variables are empty, so pip and apt use their default cache locations. However, the `--mount=type=cache` option on the `RUN` command mounts these cache locations, so the cache can be reused between different builds. Since the locations are mounted, the files put there (during the installation command) are not put into the final image.

Additionally, Docker provides a script for apt to automatically clean the cache (so it does not stay in the final image). However, we don't want that, so we need to remove the `/etc/apt/apt.conf.d/docker-clean` file.
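A generated Dockerfile using this scheme might contain statements like the following. This is a sketch only: the base image, package names and requirements path are invented, and the real Dockerfiles are produced by `build_pipeline_generator.py`.

```dockerfile
FROM python:3.12-slim

# Keep the apt cache: Docker's docker-clean hook would otherwise purge it
# after every install.
RUN rm -f /etc/apt/apt.conf.d/docker-clean

# Mount the default cache locations so downloaded packages persist between
# builds but are never baked into an image layer.
RUN --mount=type=cache,target=/var/cache/apt \
    apt-get update && apt-get install -y --no-install-recommends libpq5

RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
```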
##### CI Building

The `PIP_CACHE_DIR` and `APT_CACHE_DIR` variables are set by the CI to locations (`.cache/pip`, `.cache/apt`) inside the project dir. The cache option of the GitLab CI means these locations are persistent between different job runs. They need to be inside the project dir since GitLab cannot cache locations outside of it. When kaniko then executes the Dockerfile, it provides the environment variables with the caching locations. Pip directly reads the `PIP_CACHE_DIR` env var (note that the `ARG` command in Docker provides the build argument as an environment variable to the container). For apt we need to put the location into `/etc/apt/apt.conf.d/`. The `--ignore-path` option of kaniko ensures that the cache is not included in the final image.

Also see https://github.com/GoogleContainerTools/kaniko/issues/969#issuecomment-2160910028.