Jobs & lifecycle
A job is one unit of work. This page describes its states, transitions, events, and retry semantics.
Every module runs as a job. Jobs are isolated, observable, and resumable.
State machine
┌─────────┐    queue picks    ┌─────────┐    success    ┌──────┐
│ pending │ ────────────────► │ running │ ────────────► │ done │
└─────────┘                   └─────────┘               └──────┘
     │                             │
     │ user cancels                │ fatal error
     ▼                             ▼
┌───────────┐                 ┌────────┐
│ cancelled │                 │ failed │
└───────────┘                 └────────┘
done / failed / cancelled ──── (after 7 days) ────► expired
| State | Meaning |
|---|---|
| pending | Created, sitting in the FIFO queue |
| running | Picked by a worker, executing |
| done | Completed successfully, results downloadable |
| failed | Errored out (see error_message) |
| cancelled | Cancelled via the UI or API |
| expired | More than 7 days since terminal state; result files purged |
Transitions and queue assignment are atomic; a job is never picked twice.
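The state machine above can be written down as a small transition table. This is a minimal sketch, not the service's implementation: the names JobState, TRANSITIONS and can_transition are hypothetical, and it takes cancellation from pending only, which is one reading of the diagram. Add running → cancelled if the service allows it.

```python
from enum import Enum


class JobState(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"
    CANCELLED = "cancelled"
    EXPIRED = "expired"


# Allowed transitions, read off the diagram and the state table.
# Assumption: only the arrows drawn in the diagram are legal.
TRANSITIONS = {
    JobState.PENDING: {JobState.RUNNING, JobState.CANCELLED},
    JobState.RUNNING: {JobState.DONE, JobState.FAILED},
    JobState.DONE: {JobState.EXPIRED},
    JobState.FAILED: {JobState.EXPIRED},
    JobState.CANCELLED: {JobState.EXPIRED},
    JobState.EXPIRED: set(),
}

# States whose results stay downloadable for 7 days before expiry.
TERMINAL = {JobState.DONE, JobState.FAILED, JobState.CANCELLED}


def can_transition(current: JobState, target: JobState) -> bool:
    """Return True if the diagram contains an arrow current -> target."""
    return target in TRANSITIONS[current]
```

A client can use such a table to reject impossible status updates, for example a job that appears to go from done back to running.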
Creation
POST /api/jobs { "queries": [...], "zones": [...] } # creates a scrap job
POST /api/jobs/{type} { ...module-specific params } # typed shortcut
See Jobs API.
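As an illustration of the two creation endpoints, the sketch below builds the POST request without sending it. BASE_URL and build_create_request are hypothetical names, and authentication is left out because this page does not describe it.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://example.invalid"  # assumption: replace with the real host


def build_create_request(params: dict, job_type: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a job-creation request.

    With job_type=None this targets POST /api/jobs; otherwise it targets
    the typed shortcut POST /api/jobs/{type}.
    """
    path = "/api/jobs" if job_type is None else f"/api/jobs/{job_type}"
    body = json.dumps(params).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Example: the generic endpoint with the fields shown above.
request = build_create_request({"queries": ["example query"], "zones": ["example zone"]})
```

Passing the request to urllib.request.urlopen would submit it. The response shape is documented in the Jobs API.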
Observability
GET /api/jobs/{id} # status, counters, metadata
GET /api/jobs/{id}/stream # SSE: status / log / done
The stream closes when the job terminates. Safety timeout: 6 hours. Event payloads: see States & SSE events.
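A client reading the stream needs to split it into events and stop once the job terminates. The sketch below parses standard Server-Sent Events framing from an iterable of text lines. It assumes the status / log / done names arrive in the SSE event: field and treats payloads as opaque strings; iter_sse_events is a hypothetical helper name.

```python
from typing import Iterable, Iterator, Tuple


def _field_value(line: str, prefix: str) -> str:
    """Return the text after 'prefix', dropping one leading space as SSE framing allows."""
    value = line[len(prefix):]
    return value[1:] if value.startswith(" ") else value


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event, data) pairs; stop after the first 'done' event."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
                if event == "done":
                    return
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as keep-alive
        elif line.startswith("event:"):
            event = _field_value(line, "event:")
        elif line.startswith("data:"):
            data.append(_field_value(line, "data:"))
```

The lines can come from any HTTP client that exposes the response body line by line. Because the stream has a safety timeout, a caller should be prepared to fall back to GET /api/jobs/{id} if the stream ends without a done event.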
Results
GET /api/jobs/{id}/download?format=csv|json|xlsx
GET /api/jobs/{id}/items?offset=0&limit=200
Results live 7 days after terminal state, then are purged. The job record remains.
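The offset and limit parameters of the items endpoint lend themselves to a simple paging loop. This is a sketch under two assumptions not stated on this page: that each call returns a list of items, and that a page shorter than limit marks the end. fetch_page is a hypothetical caller-supplied function that performs GET /api/jobs/{id}/items for a given offset and limit.

```python
from typing import Callable, Iterator, List


def iter_items(
    fetch_page: Callable[[int, int], List[dict]],
    limit: int = 200,
) -> Iterator[dict]:
    """Yield every item of a job by walking offset/limit pages."""
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page
        if len(page) < limit:
            # Short (or empty) page: assumed to mean there is nothing left.
            return
        offset += limit
```

Keeping the HTTP call outside the loop makes the paging logic easy to test with a fake fetch_page.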
Errors & retries
A failed job exposes error_message and error_count. error_count is the number of items that errored inside the job, so a job can be done with error_count > 0.
POST /api/jobs/{id}/resume
Creates a new attempt resuming from the last successful item.
Cancellation
POST /api/jobs/{id}/cancel # keeps partial results
DELETE /api/jobs/{id} # cancels and removes record
Concurrency
- Up to 5 simultaneous jobs per user (additional jobs are queued)
- Two lanes: serial (extraction) and parallel (6 slots: verification, pipeline utilities, delivery_check)
- Jobs are independent: re-runs do not wait on the original
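The lane model can be pictured as one semaphore per lane: one slot for the serial lane and six for the parallel lane. The following is an in-process illustration with asyncio, not the service's scheduler; run_in_lane and peak_concurrency are hypothetical names, and the fake jobs only sleep.

```python
import asyncio


async def run_in_lane(lane: asyncio.Semaphore, job):
    """Run one job once a slot in the lane is free."""
    async with lane:
        return await job()


async def peak_concurrency(slots: int, job_count: int) -> int:
    """Submit job_count fake jobs to a lane with `slots` slots; return the peak overlap."""
    lane = asyncio.Semaphore(slots)
    active = peak = 0

    async def fake_job():
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)  # stand-in for real work
        active -= 1

    await asyncio.gather(*(run_in_lane(lane, fake_job) for _ in range(job_count)))
    return peak


if __name__ == "__main__":
    print("parallel lane peak:", asyncio.run(peak_concurrency(6, 20)))
    print("serial lane peak:", asyncio.run(peak_concurrency(1, 5)))
```

Jobs beyond the slot count simply wait for a free slot, which mirrors the "queued" behavior described in the list above.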