Operations

This page covers what to expect at runtime: log formats, monitoring queries, restart behaviour, and how to diagnose common issues.

Logging

Document Processing emits structured JSON logs to standard output. Each line is a single JSON object, which makes the output trivially ingestible by Datadog, Splunk, ELK, CloudWatch Logs, or any log-aggregation tool.

Log format

{
  "timestamp": "2025-04-04 14:13:52,613",
  "app_name": "reference-face-image-extractor",
  "app_version": "1.0.0",
  "log_type": "application",
  "log_level": "INFO",
  "payload": {
    "message": "Processed file.pdf in 2.34s"
  }
}

Common payload fields

Field	Description
`message`	Human-readable description of what happened. Always present.

Lifecycle log lines

A successful document run produces one INFO line when the file finishes:

{ ..., "log_level": "INFO", "payload": { "message": "Processed file.pdf in 2.34s" } }

A failure produces an ERROR line with the exception message and a full Python traceback in a top-level exception field:

{
  ...,
  "log_level": "ERROR",
  "payload": { "message": "Error processing file /data/archive/file.pdf: <exception message>" },
  "exception": "<full traceback>"
}
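
Because every line is a standalone JSON object, a few lines of Python are enough to turn raw container output into a summary. The sketch below relies only on the fields shown above (`log_level`, `payload.message`); the function name `summarize_logs` is illustrative and not part of the service.

```python
import json
from typing import Iterable


def summarize_logs(lines: Iterable[str]) -> dict:
    """Count completion lines and collect error messages from JSON log lines."""
    summary = {"processed": 0, "errors": []}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore anything that is not valid JSON
        if not isinstance(record, dict):
            continue
        payload = record.get("payload")
        message = payload.get("message", "") if isinstance(payload, dict) else ""
        level = record.get("log_level")
        if level == "ERROR":
            summary["errors"].append(message)
        elif level == "INFO" and message.startswith("Processed "):
            summary["processed"] += 1
    return summary
```

One way to use it is to pipe `docker logs document-processing 2>&1` into a script that calls `summarize_logs(sys.stdin)`.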

Monitoring

Health checks

# Container is running and consuming CPU / memory (docker stats does not report GPU usage; use nvidia-smi for that)
docker stats document-processing

# Recent log lines
docker logs --tail=200 document-processing

# Pull only processing-completion lines
docker logs document-processing 2>&1 | grep '"Processed ' | tail -20

Database schema

When DATA_TARGET=db, the service creates a mobai schema on first run and writes to three tables.

`mobai.records`

One row per successfully processed document — the primary output table.

Column	Type	Description
`id`	SERIAL PRIMARY KEY	Auto-incrementing row ID.
`file_name`	TEXT	Source file name.
`id_number`	TEXT	National identity number. Indexed.
`first_name`	TEXT	Given name(s).
`last_name`	TEXT	Surname.
`birth_date`	TEXT	`YYYY-MM-DD`.
`gender`	TEXT	`M` / `F` / `X`.
`nationality`	TEXT	ISO 3166-1 alpha-3.
`document_type`	TEXT	Normalised document type.
`document_number`	TEXT	Document number as printed.
`issuing_country`	TEXT	ISO 3166-1 alpha-3.
`issue_date`	TEXT	`YYYY-MM-DD`.
`expiry_date`	TEXT	`YYYY-MM-DD`.
`face_image`	TEXT	Base64-encoded PNG.
`face_image_quality_score`	FLOAT	Face image quality score (higher is better).
`created_at`	TIMESTAMPTZ	Row creation time. Defaults to `CURRENT_TIMESTAMP`.
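
Since `face_image` is stored as Base64 text, consumers need to decode it before writing an image file. A minimal sketch, assuming the column holds plain Base64 with no data-URI prefix (that detail is an assumption, as is the helper name `decode_face_image`):

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the first eight bytes of every PNG file


def decode_face_image(face_image_b64: str) -> bytes:
    """Decode a Base64 face_image value and check that it looks like a PNG."""
    data = base64.b64decode(face_image_b64)
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data does not start with the PNG signature")
    return data
```

The returned bytes can be written directly to a `.png` file opened in binary mode.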

`mobai.best_face_images`

One row per person (id_number), holding the single best-ranked reference face image. This is the table to query when building a biometric reference database.

Column	Type	Description
`id`	SERIAL PRIMARY KEY	Auto-incrementing row ID.
`file_name`	TEXT	Source file the best image came from.
`id_number`	TEXT	National identity number. Indexed.
`document_type`	TEXT	Document type the image came from.
`issuing_country`	TEXT	ISO 3166-1 alpha-3.
`source`	TEXT	Source label for the extraction path.
`image_taken`	TEXT	Date the image was taken / document issued.
`score`	FLOAT	Face image quality score. Indexed.
`content`	TEXT	Base64-encoded PNG.
`created_at`	TIMESTAMPTZ	Row creation time.

`mobai.face_images`

One row per detected face across the archive — useful for auditing which submissions a person's reference image was sourced from.

Column	Type	Description
`id`	SERIAL PRIMARY KEY	Auto-incrementing row ID.
`file_name`	TEXT	Source file name.
`id_number`	TEXT	National identity number. Indexed.
`face_width`	FLOAT	Width in pixels.
`face_height`	FLOAT	Height in pixels.
`ofiq_score`	FLOAT	Face image quality score. Indexed.
`face_image`	TEXT	Base64-encoded PNG.
`created_at`	TIMESTAMPTZ	Row creation time.

Database monitoring

When DATA_TARGET=db, the following queries are a useful starting point for tracking throughput and coverage:

-- Overall throughput: records added today
SELECT COUNT(*) AS records_today
FROM mobai.records
WHERE created_at >= CURRENT_DATE;

-- Distribution of extracted document types
SELECT document_type, COUNT(*) AS n
FROM mobai.records
GROUP BY document_type
ORDER BY n DESC;

-- Records with and without a face image (counts, from which the proportion follows)
SELECT
  COUNT(*) FILTER (WHERE face_image IS NOT NULL) AS with_face,
  COUNT(*) FILTER (WHERE face_image IS NULL)     AS without_face
FROM mobai.records;

-- Coverage: distinct persons with a selected best reference image
SELECT COUNT(DISTINCT id_number) AS persons_covered
FROM mobai.best_face_images;

Resumable runs

Long batch runs can be safely interrupted. When the service is restarted with SKIP_PROCESSED_FILES=true, it:

Queries the configured target (DB / S3 CSV / local CSV) for the list of file_name values already present.
Subtracts those from the candidate file list.
Processes only the remainder.

This makes it safe to:

Stop and restart the container at any time — even mid-document — without re-processing completed work.
Add new documents to the archive and re-run; only the new files will be processed.
Switch worker counts mid-run by stopping, adjusting WORKERS, and restarting.
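
The skip logic described in the steps above amounts to a set difference on `file_name`. The sketch below illustrates the idea only; `remaining_files` and its parameters are hypothetical names, and the service's actual implementation is not shown on this page.

```python
from typing import Iterable, List


def remaining_files(candidates: Iterable[str], already_processed: Iterable[str]) -> List[str]:
    """Return candidate file names that are not yet present in the target."""
    done = set(already_processed)  # file_name values read back from the target
    # Keep the original candidate order so runs stay predictable.
    return [name for name in candidates if name not in done]
```

An empty result corresponds to the case where every source file already has an entry in the target and there is nothing left to do.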

To force a full re-process, set SKIP_PROCESSED_FILES=false. Note that this does not clear the existing target — duplicate file_name entries will be inserted. If you want a clean reprocess, truncate the target tables (or rotate the CSV) first.

Troubleshooting

The container exits immediately

Typically caused by misconfigured environment variables. Run with LOG_LEVEL=DEBUG and look for ValueError lines naming the missing variable. Common culprits:

DATA_SOURCE or DATA_TARGET not set, or set to an unsupported value.
S3_DIRECTORY_TO_PROCESS missing when DATA_SOURCE=s3.
DB_HOST / DB_USER missing when DATA_TARGET=db.
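
The rules in the list above can be checked before starting the container. This is a sketch of such a preflight check, not the service's own validation: the function name `missing_settings` is illustrative, and it deliberately does not check for unsupported values because the full list of accepted values is not given here.

```python
from typing import List, Mapping


def missing_settings(env: Mapping[str, str]) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    required = ["DATA_SOURCE", "DATA_TARGET"]
    if env.get("DATA_SOURCE") == "s3":
        required.append("S3_DIRECTORY_TO_PROCESS")
    if env.get("DATA_TARGET") == "db":
        required += ["DB_HOST", "DB_USER"]
    return [name for name in required if not env.get(name)]
```

Calling `missing_settings(os.environ)` in the deployment environment returns an empty list when all of the variables named above are present.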

No documents are processed

Check that the source actually contains files with supported extensions (.pdf, .tif, .tiff). The pipeline only processes these extensions and silently skips others.

If SKIP_PROCESSED_FILES=true and the target already contains entries for all files in the source, the work list will be empty and the service will exit cleanly.
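
For a local source directory, a quick extension count shows whether there is anything for the pipeline to pick up. The sketch below assumes a local, recursively scanned directory and compares suffixes exactly as they appear on disk; whether the pipeline itself matches extensions case-insensitively is not stated here, so a suffix such as `.PDF` is reported separately rather than counted as supported. `extension_report` is an illustrative name.

```python
from collections import Counter
from pathlib import Path

SUPPORTED = {".pdf", ".tif", ".tiff"}  # extensions listed in this section


def extension_report(directory: str) -> dict:
    """Count files under a directory by suffix, split into supported and other."""
    counts = Counter(p.suffix for p in Path(directory).rglob("*") if p.is_file())
    return {
        "supported": sum(n for ext, n in counts.items() if ext in SUPPORTED),
        "other": {ext: n for ext, n in counts.items() if ext not in SUPPORTED},
    }
```

A `supported` count of zero with a non-empty `other` mapping points to an archive that needs converting or renaming before it can be processed.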

GPU not detected

# Verify the host sees the GPU
nvidia-smi

# Verify Docker can pass the GPU through
docker run --rm --gpus all nvidia/cuda:12.6.0-base-ubuntu22.04 nvidia-smi

If nvidia-smi works on the host but not inside the test container, the NVIDIA Container Toolkit is not installed correctly. See the Toolkit installation guide.

Out-of-memory errors on the GPU

Lower WORKERS. Each worker loads its own copy of the models onto the GPU, so the combined memory footprint must fit within the GPU's available VRAM.
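
The constraint is simple arithmetic: the number of workers multiplied by the per-worker footprint must not exceed the usable VRAM. The helper below is a worked sketch; the per-worker footprint is something you would measure yourself (for example with nvidia-smi while running a single worker), and all names and figures are hypothetical.

```python
def max_workers(vram_mib: int, per_worker_mib: int, reserved_mib: int = 0) -> int:
    """Largest worker count whose combined footprint fits in the usable VRAM."""
    if per_worker_mib <= 0:
        raise ValueError("per_worker_mib must be positive")
    usable = vram_mib - reserved_mib  # leave headroom for other GPU processes
    return max(usable // per_worker_mib, 0)
```

For example, with 24576 MiB of VRAM and a measured footprint of 5000 MiB per worker, four workers fit (20000 MiB) but five do not (25000 MiB).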

`psycopg2.OperationalError` on startup

Database connection failure. Check:

DB_HOST is reachable from inside the container (test with docker exec ... ping <DB_HOST>; note that a successful ping only shows the host responds, not that the database port is open).
Credentials are correct.
If using IAM auth, the container's IAM role has rds-db:connect and the database user is an IAM-enabled role.
sslmode=require is enforced for IAM-authenticated connections — your RDS instance must accept SSL.
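
To check reachability of the database port itself rather than the host, a plain TCP connection attempt is enough. This sketch uses only the Python standard library; the default port 5432 is the PostgreSQL default and is an assumption about your deployment, and `can_connect` is an illustrative name.

```python
import socket


def can_connect(host: str, port: int = 5432, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and DNS failures
        return False
```

A `False` result points to networking (security groups, DNS, firewall) rather than credentials; a `True` result means the remaining checks in the list above are the more likely cause.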

Pre-signed S3 URLs expiring mid-download

The pipeline automatically renews pre-signed URLs on download failure (up to 2 retries per file). If you see persistent 403 Forbidden errors in logs, verify that:

The credentials have not been rotated since the run started.
The S3 bucket policy allows access from the container's network.
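
To see why rotated credentials defeat the renewal mechanism, it helps to picture the general shape of a renew-and-retry loop. The sketch below is not the pipeline's code; `make_url` and `fetch` are hypothetical callables standing in for URL signing and the HTTP download, and the default of two retries simply mirrors the limit mentioned above.

```python
from typing import Callable


def download_with_renewal(
    make_url: Callable[[], str],
    fetch: Callable[[str], bytes],
    max_retries: int = 2,
) -> bytes:
    """Try a download, requesting a fresh pre-signed URL for every attempt."""
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    last_error = None
    for _ in range(max_retries + 1):  # one initial attempt plus the retries
        url = make_url()  # signed with whatever credentials are current
        try:
            return fetch(url)
        except OSError as exc:  # assumed failure type for this sketch
            last_error = exc
    raise last_error
```

If the credentials used for signing are no longer valid, every renewed URL fails in the same way, so the retries are exhausted and the error persists.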
