diff --git a/architecture.md b/architecture.md
index 47df7d8090fdd9f2c49c6a7dec77da72888f1ad4..1ace0e6218924d6bc456cf3c531b7bdf706f8a41 100644
--- a/architecture.md
+++ b/architecture.md
@@ -9,6 +9,36 @@
 - Optional: Sends monitoring data to webserver
+## Job folders
+Jobs are given a randomly generated uuid. A job folder looks like this:
+
+```
+job_uuid:
+  - audio.mkv
+  - video_language.txt
+  - metadata.json
+  - statefile (new/done/error)
+```
+
+### audio.mkv
+Preprocessed input file. Contains only audio data to conserve disk space.
+
+### video_language.txt
+Contains the video language tag used for processing with [whisper-webvtt-transcriber](https://gitlab1.ptb.de/janhartig/whisper-webvtt-transcriber).
+Is used by the cronjob script (step 3).
+
+### metadata.json
+Used by mailservice (step 4).
+```json
+{
+  "email": "example@example.com",
+  "language": "de",
+  "video_language": "de",
+  "filename": "original_filename.original_file_extension"
+}
+```
+
+### statefile
 State is tracked through the following files in the jobs folder:
 - new: Job has been submitted by user
 - done: Job has been processed without errors