Operations
This page covers what to expect at runtime: log formats, monitoring queries, restart behaviour, and how to diagnose common issues.
Logging
Document Processing emits structured JSON logs to standard output. Each line is a single JSON object, which makes the output trivially ingestible by Datadog, Splunk, ELK, CloudWatch Logs, or any log-aggregation tool.
Log format
{
  "timestamp": "2025-04-04 14:13:52,613",
  "app_name": "reference-face-image-extractor",
  "app_version": "1.0.0",
  "log_type": "application",
  "log_level": "INFO",
  "payload": {
    "message": "Processed file.pdf in 2.34s"
  }
}
Common payload fields
| Field | Description |
|---|---|
| message | Human-readable description of what happened. Always present. |
Lifecycle log lines
A successful document run produces one INFO line when the file finishes:
{ ..., "log_level": "INFO", "payload": { "message": "Processed file.pdf in 2.34s" } }
A failure produces an ERROR line with the exception message and a full Python traceback in a top-level exception field:
{
  ...,
  "log_level": "ERROR",
  "payload": { "message": "Error processing file /data/archive/file.pdf: <exception message>" },
  "exception": "<full traceback>"
}
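As a rough illustration of how these lifecycle lines could be consumed, the sketch below counts completions and failures in a stream of log lines. It assumes only the fields shown above (log_level, payload.message) and the "Processed <file> in <seconds>s" message shape; the function name summarise is hypothetical and not part of the service.

```python
import json
import re

# Matches the completion message shown above, e.g. "Processed file.pdf in 2.34s".
PROCESSED_RE = re.compile(r"^Processed (?P<file>.+) in (?P<seconds>\d+(?:\.\d+)?)s$")


def summarise(lines):
    """Count completion and error lines in an iterable of JSON log lines.

    Hypothetical helper: it relies only on the log_level and payload.message
    fields documented on this page.
    """
    summary = {"processed": 0, "errors": 0, "total_seconds": 0.0}
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a structured log line; skip it
        if not isinstance(record, dict):
            continue
        payload = record.get("payload")
        message = str(payload.get("message", "")) if isinstance(payload, dict) else ""
        match = PROCESSED_RE.match(message)
        if record.get("log_level") == "INFO" and match:
            summary["processed"] += 1
            summary["total_seconds"] += float(match.group("seconds"))
        elif record.get("log_level") == "ERROR":
            summary["errors"] += 1
    return summary
```

The function accepts any iterable of strings, so it can be fed a saved log file or the captured output of docker logs.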
Monitoring
Health checks
# Container is running; live CPU / memory usage
# (docker stats does not report GPU usage; use nvidia-smi on the host for that)
docker stats document-processing
# Recent log lines
docker logs --tail=200 document-processing
# Pull only processing-completion lines
docker logs document-processing 2>&1 | grep '"Processed ' | tail -20
Database schema
When DATA_TARGET=db, the service creates a mobai schema on first run and writes to three tables.
mobai.records
One row per successfully processed document — the primary output table.
| Column | Type | Description |
|---|---|---|
| id | SERIAL PRIMARY KEY | Auto-incrementing row ID. |
| file_name | TEXT | Source file name. |
| id_number | TEXT | National identity number. Indexed. |
| first_name | TEXT | Given name(s). |
| last_name | TEXT | Surname. |
| birth_date | TEXT | YYYY-MM-DD. |
| gender | TEXT | M / F / X. |
| nationality | TEXT | ISO 3166-1 alpha-3. |
| document_type | TEXT | Normalised document type. |
| document_number | TEXT | Document number as printed. |
| issuing_country | TEXT | ISO 3166-1 alpha-3. |
| issue_date | TEXT | YYYY-MM-DD. |
| expiry_date | TEXT | YYYY-MM-DD. |
| face_image | TEXT | Base64-encoded PNG. |
| face_image_quality_score | FLOAT | Face image quality score (higher is better). |
| created_at | TIMESTAMPTZ | Row creation time. Defaults to CURRENT_TIMESTAMP. |
mobai.best_face_images
One row per person (id_number), holding the single best-ranked reference face image. This is the table to query when building a biometric reference database.
| Column | Type | Description |
|---|---|---|
| id | SERIAL PRIMARY KEY | Auto-incrementing row ID. |
| file_name | TEXT | Source file the best image came from. |
| id_number | TEXT | National identity number. Indexed. |
| document_type | TEXT | Document type the image came from. |
| issuing_country | TEXT | ISO 3166-1 alpha-3. |
| source | TEXT | Source label for the extraction path. |
| image_taken | TEXT | Date the image was taken / document issued. |
| score | FLOAT | Face image quality score. Indexed. |
| content | TEXT | Base64-encoded PNG. |
| created_at | TIMESTAMPTZ | Row creation time. |
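To make the "one row per person" idea concrete, the sketch below keeps the highest-scoring row for each id_number. It is an illustration only: it assumes ranking is by the score column alone and that ties keep the first row seen, neither of which is confirmed on this page, and best_per_person is a hypothetical name.

```python
def best_per_person(face_rows):
    """Return {id_number: row} holding the highest-scoring row per person.

    Assumption: "best-ranked" means the largest value in the score column.
    On a tie, the row encountered first is kept.
    """
    best = {}
    for row in face_rows:
        current = best.get(row["id_number"])
        if current is None or row["score"] > current["score"]:
            best[row["id_number"]] = row
    return best
```

Each row is expected to be a mapping with at least the id_number and score keys, matching the column names in the table above.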
mobai.face_images
One row per detected face across the archive — useful for auditing which submissions a person's reference image was sourced from.
| Column | Type | Description |
|---|---|---|
| id | SERIAL PRIMARY KEY | Auto-incrementing row ID. |
| file_name | TEXT | Source file name. |
| id_number | TEXT | National identity number. Indexed. |
| face_width | FLOAT | Width in pixels. |
| face_height | FLOAT | Height in pixels. |
| ofiq_score | FLOAT | Face image quality score. Indexed. |
| face_image | TEXT | Base64-encoded PNG. |
| created_at | TIMESTAMPTZ | Row creation time. |
Database monitoring
When using the database target, common things worth tracking include:
-- Overall throughput: records added today
SELECT COUNT(*) AS records_today
FROM mobai.records
WHERE created_at >= CURRENT_DATE;
-- Distribution of extracted document types
SELECT document_type, COUNT(*) AS n
FROM mobai.records
GROUP BY document_type
ORDER BY n DESC;
-- Proportion of records that include a face image
SELECT
COUNT(*) FILTER (WHERE face_image IS NOT NULL) AS with_face,
COUNT(*) FILTER (WHERE face_image IS NULL) AS without_face
FROM mobai.records;
-- Coverage: distinct persons with a selected best reference image
SELECT COUNT(DISTINCT id_number) AS persons_covered
FROM mobai.best_face_images;
Resumable runs
Long batch runs can be safely interrupted. When the service is restarted with SKIP_PROCESSED_FILES=true, it:
- Queries the configured target (DB / S3 CSV / local CSV) for the list of file_name values already present.
- Subtracts those from the candidate file list.
- Processes only the remainder.
This makes it safe to:
- Stop and restart the container at any time — even mid-document — without re-processing completed work.
- Add new documents to the archive and re-run; only the new files will be processed.
- Switch worker counts mid-run by stopping, adjusting WORKERS, and restarting.
To force a full re-process, set SKIP_PROCESSED_FILES=false. Note that this does not clear the existing target — duplicate file_name entries will be inserted. If you want a clean reprocess, truncate the target tables (or rotate the CSV) first.
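The skip behaviour described in this section amounts to a set difference on file_name. A minimal sketch, assuming names are compared as exact strings (the function remaining_files is hypothetical, not the service's own code):

```python
def remaining_files(candidates, already_processed, skip_processed=True):
    """Return the candidate files still to process, preserving their order.

    Mirrors SKIP_PROCESSED_FILES: when skip_processed is False, every
    candidate is returned, including files already present in the target.
    """
    if not skip_processed:
        return list(candidates)
    done = set(already_processed)  # set membership keeps the filter O(n)
    return [name for name in candidates if name not in done]
```

With skip_processed=False the result includes files that are already in the target, which is why duplicate file_name entries appear unless the target is cleared first.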
Troubleshooting
The container exits immediately
Typically caused by misconfigured environment variables. Run with LOG_LEVEL=DEBUG and look for ValueError lines naming the missing variable. Common culprits:
- DATA_SOURCE or DATA_TARGET not set, or set to an unsupported value.
- S3_DIRECTORY_TO_PROCESS missing when DATA_SOURCE=s3.
- DB_HOST / DB_USER missing when DATA_TARGET=db.
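A quick pre-flight check for the culprits above can be sketched as follows. It covers only the dependencies named in this list and does not validate whether DATA_SOURCE or DATA_TARGET hold a supported value, since the full set of supported values is not listed here; missing_settings is a hypothetical helper.

```python
def missing_settings(env):
    """Return the names of required variables that are unset or empty.

    env is any mapping of variable names to values, for example os.environ.
    """
    required = ["DATA_SOURCE", "DATA_TARGET"]
    if env.get("DATA_SOURCE") == "s3":
        required.append("S3_DIRECTORY_TO_PROCESS")
    if env.get("DATA_TARGET") == "db":
        required += ["DB_HOST", "DB_USER"]
    return [name for name in required if not env.get(name)]
```

Calling missing_settings(os.environ) in the same environment the container receives returns an empty list when none of these variables are missing.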
No documents are processed
Check that the source actually contains files with supported extensions (.pdf, .tif, .tiff). The pipeline only processes these extensions and silently skips others.
If SKIP_PROCESSED_FILES=true and the target already contains entries for all files in the source, the work list will be empty and the service will exit cleanly.
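To see which files in a listing would be skipped for their extension, a small filter along these lines may help. Comparing extensions case-insensitively is an assumption; the service may treat .PDF differently from .pdf, so upper-case extensions are worth checking by hand. The name split_by_extension is hypothetical.

```python
from pathlib import PurePosixPath

# Extensions listed above as supported.
SUPPORTED_EXTENSIONS = {".pdf", ".tif", ".tiff"}


def split_by_extension(names):
    """Split file names into (supported, skipped) by extension.

    Assumption: the comparison ignores case; the service may be stricter.
    """
    supported, skipped = [], []
    for name in names:
        suffix = PurePosixPath(name).suffix.lower()
        if suffix in SUPPORTED_EXTENSIONS:
            supported.append(name)
        else:
            skipped.append(name)
    return supported, skipped
```

An empty supported list for the source directory would explain a run that processes nothing.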
GPU not detected
# Verify the host sees the GPU
nvidia-smi
# Verify Docker can pass the GPU through
docker run --rm --gpus all nvidia/cuda:12.6.0-base-ubuntu22.04 nvidia-smi
If nvidia-smi works on the host but not inside the test container, the NVIDIA Container Toolkit is not installed correctly. See the Toolkit installation guide.
Out-of-memory errors on the GPU
Lower WORKERS. Each worker loads its own copy of the models onto the GPU, so the combined memory footprint must fit within the GPU's available VRAM.
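Because each worker holds its own copy of the models, an upper bound on WORKERS is simple division. The sketch below assumes you have measured the per-worker footprint yourself (for example with nvidia-smi while running a single worker); the function and its parameters are hypothetical, and no memory figures are given on this page.

```python
def max_workers(free_vram_mib, per_worker_mib, headroom_mib=0):
    """Largest worker count whose combined footprint fits in free VRAM.

    All three values are in MiB and must be measured on your own hardware.
    """
    if per_worker_mib <= 0:
        raise ValueError("per_worker_mib must be positive")
    usable = free_vram_mib - headroom_mib
    return max(usable // per_worker_mib, 0)
```

A result of 0 means even a single worker does not fit in the available memory.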
psycopg2.OperationalError on startup
Database connection failure. Check:
- DB_HOST is reachable from inside the container (test with docker exec ... ping <DB_HOST>).
- Credentials are correct.
- If using IAM auth, the container's IAM role has rds-db:connect and the database user is an IAM-enabled role.
- sslmode=require is enforced for IAM-authenticated connections, so your RDS instance must accept SSL.
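A ping only exercises ICMP, and some images do not ship the ping binary, so it can be useful to test the database port directly. The sketch below uses only the Python standard library; port 5432 is the PostgreSQL default and is an assumption about your setup, and can_connect is a hypothetical helper.

```python
import socket


def can_connect(host, port=5432, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Assumption: the database listens on 5432, the PostgreSQL default.
    This checks network reachability only, not credentials or SSL.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A False result points to networking (DNS, security groups, firewalls) rather than credentials or IAM configuration.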
Pre-signed S3 URLs expiring mid-download
The pipeline automatically renews pre-signed URLs on download failure (up to 2 retries per file). If you see persistent 403 Forbidden errors in logs, verify that:
- The credentials have not been rotated since the run started.
- The S3 bucket policy allows access from the container's network.
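The renew-on-failure behaviour can be pictured as a retry loop that requests a fresh URL before every attempt. This is a sketch of the general pattern, not the pipeline's actual code: presign and fetch are hypothetical callables supplied by the caller, and treating OSError as the failure signal is an assumption.

```python
def download_with_renewal(key, presign, fetch, max_retries=2):
    """Fetch key, requesting a fresh pre-signed URL before each attempt.

    presign(key) -> url and fetch(url) -> bytes are hypothetical callables.
    With max_retries=2 there are at most three attempts in total.
    """
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    last_error = None
    for _attempt in range(max_retries + 1):
        url = presign(key)  # never reuse a URL that may have expired
        try:
            return fetch(url)
        except OSError as error:
            last_error = error
    raise last_error
```

If the credentials used by presign are themselves invalid, every renewed URL fails the same way, which matches the persistent 403 case described above.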