Veille API

Recurring monitoring of scrapes and pipelines, with diff buckets and reputation signals.

The Veille API manages recurring monitoring jobs. A veille (watch) replays a source scrape — or an entire pipeline — on a fixed cadence, then computes a diff against the previous run to surface what changed.

See Veille monitoring concepts for the lifecycle, scheduling, and diff model.

All endpoints are mounted under /api/veille and require an authenticated, active session. Responses are JSON. Resource ownership is enforced on every request: cross-user access returns 404. Generic errors are 401 (no session) and 404 (not found / not owned); endpoint-specific causes are listed inline.
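
As an illustration of the conventions above, the sketch below builds a request against the /api/veille prefix and maps the two generic status codes to a short explanation. The base URL, the `session=<value>` cookie format, and the helper names `build_request` and `explain_status` are assumptions made for this example; this reference does not specify how the session is transmitted.

```python
import json
import urllib.request

API_PREFIX = "/api/veille"  # mount point stated in this reference


def build_request(base_url, path="", method="GET", body=None,
                  session_cookie=None):
    """Build (but do not send) a request for a Veille endpoint.

    `base_url` and the cookie format are assumptions; substitute
    whatever your deployment uses for authentication.
    """
    url = base_url.rstrip("/") + API_PREFIX + path
    data = None
    headers = {"Accept": "application/json"}
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if session_cookie is not None:
        # Hypothetical cookie name, used only for illustration.
        headers["Cookie"] = f"session={session_cookie}"
    return urllib.request.Request(url, data=data, headers=headers,
                                  method=method)


def explain_status(status):
    """Map the generic status codes listed above to a short explanation."""
    generic = {
        401: "no authenticated session",
        404: "not found, or owned by another user",
    }
    return generic.get(status, "see the endpoint-specific causes")
```

Passing the returned object to `urllib.request.urlopen` would send it; a 401 or 404 then surfaces as `urllib.error.HTTPError`, whose `code` attribute can be given to `explain_status`.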

Resource model

A Veille object exposes the following fields:

Field	Type	Description
`id`	integer	Stable identifier.
`name`	string	Human-readable label (2–200 chars).
`source_job_id`	string | null	Source scrape replayed on each tick (mutually exclusive with `source_pipeline_id`).
`source_pipeline_id`	string | null	Source pipeline replayed on each tick.
`frequency_days`	integer	Cadence in days, between `1` and `365`.
`status`	string	One of `active`, `paused`, `deleted`.
`next_run_at`	string (ISO8601)	Next scheduled execution.
`last_run_at`	string | null	Timestamp of the most recent completed run.
`last_run_job_id`	string | null	Job id of the most recent run.
`run_count`	integer	Total successful runs.
`created_at`	string (ISO8601)	Creation timestamp.
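
For client code, the field table can be mirrored as a typed record. The following is a minimal sketch: the class name `Veille` and the `from_dict` helper are illustrative choices, and timestamps are kept as the strings the API returns rather than parsed.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Veille:
    id: int
    name: str
    source_job_id: Optional[str]
    source_pipeline_id: Optional[str]
    frequency_days: int
    status: str                     # "active", "paused", or "deleted"
    next_run_at: str                # ISO8601 string
    last_run_at: Optional[str]
    last_run_job_id: Optional[str]
    run_count: int
    created_at: str                 # ISO8601 string

    @classmethod
    def from_dict(cls, data):
        # Read only the documented fields so unknown keys are ignored;
        # a missing key becomes None.
        return cls(**{f.name: data.get(f.name) for f in fields(cls)})
```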

Endpoints

List veilles

GET /api/veille

Returns the caller's active and paused veilles. Soft-deleted entries are excluded.

Response 200 OK — { "items": [Veille, ...] }.

Create a veille

POST /api/veille

Creates a recurring monitor from a completed scrape or pipeline owned by the caller. Exactly one of source_job_id or source_pipeline_id is required.

Request body

Field	Type	Required	Notes
`name`	string	yes	2–200 characters.
`source_job_id`	string	one of	8–64 characters.
`source_pipeline_id`	string	one of	8–64 characters.
`frequency_days`	integer	yes	`1` ≤ value ≤ `365`.

{
  "name": "Plombiers Lyon 3",
  "source_job_id": "job_8f2c91a4",
  "frequency_days": 7
}

Response 200 OK — the newly created veille.

Specific cause: 400 on validation failure (neither or both source fields supplied, source not owned by the caller, source not completed, invalid frequency).
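
Several of the 400 causes can be caught before the request is sent. The sketch below checks only the constraints listed in the request body table; ownership and completion of the source can only be verified by the server. The function name `validate_create_payload` is hypothetical.

```python
def validate_create_payload(payload):
    """Return a list of problems; an empty list means the body looks valid."""
    errors = []

    name = payload.get("name")
    if not isinstance(name, str) or not 2 <= len(name) <= 200:
        errors.append("name must be a string of 2-200 characters")

    job = payload.get("source_job_id")
    pipeline = payload.get("source_pipeline_id")
    # Exactly one source is required: reject "neither" and "both".
    if (job is None) == (pipeline is None):
        errors.append("provide exactly one of source_job_id or source_pipeline_id")
    for field, value in (("source_job_id", job),
                         ("source_pipeline_id", pipeline)):
        if value is not None and not (isinstance(value, str)
                                      and 8 <= len(value) <= 64):
            errors.append(f"{field} must be a string of 8-64 characters")

    freq = payload.get("frequency_days")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if (not isinstance(freq, int) or isinstance(freq, bool)
            or not 1 <= freq <= 365):
        errors.append("frequency_days must be an integer between 1 and 365")

    return errors
```

Applied to the example body above, the function returns an empty list.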

Retrieve a veille

GET /api/veille/{id}

Returns a single veille owned by the caller. Soft-deleted entries return 404.

Update a veille

PATCH /api/veille/{id}

Patches mutable fields. Omitted fields are left untouched.

Request body

Field	Type	Notes
`name`	string	2–200 characters.
`frequency_days`	integer	`1` ≤ value ≤ `365`. Reschedules `next_run_at`.
`status`	string	`active`, `paused`, or `deleted`.

{ "status": "paused", "frequency_days": 14 }

Response 200 OK — the updated veille. Specific cause: 400 invalid field value.
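
Because omitted fields are left untouched, a client can build the PATCH body from only the values it wants to change. A minimal sketch, with the hypothetical helper name `build_patch_body`, applying the same bounds as the table above:

```python
ALLOWED_STATUSES = {"active", "paused", "deleted"}


def build_patch_body(name=None, frequency_days=None, status=None):
    """Return a PATCH body containing only the fields the caller set."""
    body = {}
    if name is not None:
        if not 2 <= len(name) <= 200:
            raise ValueError("name must be 2-200 characters")
        body["name"] = name
    if frequency_days is not None:
        if not 1 <= frequency_days <= 365:
            raise ValueError("frequency_days must be between 1 and 365")
        body["frequency_days"] = frequency_days
    if status is not None:
        if status not in ALLOWED_STATUSES:
            raise ValueError("status must be active, paused, or deleted")
        body["status"] = status
    return body
```

Calling `build_patch_body(status="paused", frequency_days=14)` produces the same body as the example above.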

Delete a veille

DELETE /api/veille/{id}

Soft-deletes the veille. The record is preserved for audit but excluded from all list endpoints and no longer scheduled.

Response 200 OK — { "ok": true }.

Runs

A run is a single execution of the veille plus the diff statistics computed against the previous run. The first run is a baseline (is_baseline: true) and has no diff counters.

List runs

GET /api/veille/{id}/runs

Returns the run history ordered by computed_at descending.

Response 200 OK

{
  "items": [{
    "id": 17,
    "job_id": "job_b71e0d22",
    "prev_job_id": "job_aa44e0f1",
    "is_baseline": false,
    "total_count": 312, "prev_total_count": 305,
    "new_count": 9, "removed_count": 2,
    "modified_count": 24, "unchanged_count": 279,
    "computed_at": "2026-05-27T08:11:04Z",
    "job_status": "done",
    "job_completed_at": "2026-05-27T08:10:48Z"
  }]
}
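
A client that only needs the latest comparison can pick it out of the items array. The sketch below does not rely on the server ordering and skips the baseline run, which carries no diff counters. The helper names are illustrative.

```python
from datetime import datetime


def parse_timestamp(value):
    # Replace the trailing "Z" so the parser also works before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def latest_diff_run(items):
    """Return the most recent non-baseline run, or None if there is none."""
    candidates = [run for run in items if not run.get("is_baseline")]
    if not candidates:
        return None
    return max(candidates, key=lambda run: parse_timestamp(run["computed_at"]))


def net_change(run):
    """Row-count change between the previous and the current run."""
    return run["total_count"] - run["prev_total_count"]
```

For the run shown above, `net_change` gives 312 - 305 = 7, which matches 9 new rows minus 2 removed rows.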

Retrieve a run

GET /api/veille/{id}/runs/{run_id}

Returns the run, including samples — capped previews of the rows in each diff bucket.

Response 200 OK

{
  "id": 17,
  "job_id": "job_b71e0d22",
  "is_baseline": false,
  "new_count": 9,
  "removed_count": 2,
  "modified_count": 24,
  "unchanged_count": 279,
  "total_count": 312,
  "computed_at": "2026-05-27T08:11:04Z",
  "samples": {
    "new": [{ "key": "...", "nom": "..." }],
    "removed": [{ "key": "...", "nom": "..." }],
    "modified": [{
      "key": "...", "nom": "...",
      "before": { "note": "4.3", "nb_avis": 42 },
      "after":  { "note": "3.8", "nb_avis": 51 },
      "changed_fields": ["note", "nb_avis"]
    }]
  }
}
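
Each entry in samples.modified carries enough information to render a per-field change line. A minimal sketch, assuming the before, after, and changed_fields keys shown above (the function name `describe_changes` is hypothetical):

```python
def describe_changes(sample):
    """Return one 'field: before -> after' line per changed field."""
    before = sample.get("before", {})
    after = sample.get("after", {})
    return [
        f"{field}: {before.get(field)} -> {after.get(field)}"
        for field in sample.get("changed_fields", [])
    ]
```

For the modified sample above, this yields `note: 4.3 -> 3.8` and `nb_avis: 42 -> 51`.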

Signal categories

Every non-baseline run classifies each row in the dataset into exactly one bucket:

Category	Meaning
`new`	Row present in the current run, absent from the previous run.
`removed`	Row present in the previous run, absent from the current run (closed/dropped).
`modified`	Row present in both runs with at least one tracked field changed.
`unchanged`	Row present in both runs, identical on tracked fields.

Bucket counts are surfaced as new_count, removed_count, modified_count, and unchanged_count. The matching samples.{new,removed,modified} arrays hold capped previews suitable for UI display.

The removed bucket holds closed or dropped records, that is, records no longer listed at the source.
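
The four definitions above amount to a keyed comparison of two row sets. The sketch below shows one way such a classification could work. It is not the service's implementation: how rows are keyed and which fields are tracked are assumptions, passed in here as parameters.

```python
def classify(previous, current, tracked_fields):
    """Bucket row keys by comparing two runs.

    `previous` and `current` map a stable row key to a row dict.
    The key derivation and the tracked fields are assumptions.
    """
    buckets = {"new": [], "removed": [], "modified": [], "unchanged": []}
    for key, row in current.items():
        if key not in previous:
            buckets["new"].append(key)
        elif any(row.get(f) != previous[key].get(f) for f in tracked_fields):
            buckets["modified"].append(key)
        else:
            buckets["unchanged"].append(key)
    # Rows that existed before but are missing now.
    buckets["removed"] = [key for key in previous if key not in current]
    return buckets
```

Under this model every current row is new, modified, or unchanged, and every previous row is removed, modified, or unchanged. The counters in the list-runs example are consistent with that: 9 + 24 + 279 = 312 (total_count) and 2 + 24 + 279 = 305 (prev_total_count).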

Reputation signals

Reputation signals are a derived view of a run's modified bucket. They isolate rows whose public reputation moved in a way that is timing-sensitive for outreach — typically Google Maps listings whose rating dropped or whose review volume surged between two runs.

Ranking logic (high level)

A modified row becomes a signal when at least one of the following holds:

Rating drop — the average rating decreased by at least 0.2 points.
Review surge — the review count grew by at least 3 since the previous run.

Each signal carries a score that ranks urgency. Larger rating drops dominate; review surges contribute a smaller, additive boost above a low-volume noise floor. Signals are returned sorted by score descending. The exact weighting is an implementation detail and may evolve; do not depend on absolute score values, only on relative order.
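
The two qualifying criteria can be expressed as a small predicate. This sketch covers only the stated thresholds and deliberately leaves out scoring, since the weighting is not documented. The reason labels `rating_drop` and `review_surge` and the function name are hypothetical.

```python
from decimal import Decimal

RATING_DROP_MIN = Decimal("0.2")  # threshold stated above
REVIEW_SURGE_MIN = 3              # threshold stated above


def signal_reasons(note_avant, note_apres, avis_avant, avis_apres):
    """Return which criteria a modified row meets (empty list: no signal)."""
    reasons = []
    # Decimal(str(...)) avoids binary floating-point error at the 0.2
    # boundary and accepts ratings given as numbers or as strings.
    drop = Decimal(str(note_avant)) - Decimal(str(note_apres))
    if drop >= RATING_DROP_MIN:
        reasons.append("rating_drop")
    if avis_apres - avis_avant >= REVIEW_SURGE_MIN:
        reasons.append("review_surge")
    return reasons
```

A row going from a rating of 4.3 with 42 reviews to 3.8 with 51 reviews meets both criteria. Exact decimal arithmetic matters at the boundary: with plain floats, 4.1 - 3.9 evaluates to slightly less than 0.2 and the drop would be missed.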

List signals

GET /api/veille/{id}/runs/{run_id}/signals

Response 200 OK

{
  "items": [{
    "nom": "Garage du Centre",
    "adresse": "12 rue Voltaire, 69003 Lyon",
    "telephone": "+33 4 78 00 00 00",
    "site_web": "https://...", "email": "contact@...",
    "lien_google_maps": "https://maps.google.com/...",
    "note_avant": 4.3, "note_apres": 3.8, "delta_note": -0.5,
    "avis_avant": 42, "avis_apres": 51, "delta_avis": 9,
    "score": 12.0
  }],
  "total": 1
}

Export signals

GET /api/veille/{id}/runs/{run_id}/signals.{fmt}

Streams the same ranked signal list as a downloadable file.

Format	Media type	Extension
`csv`	`text/csv; charset=utf-8`	`.csv`
`json`	`application/json`	`.json`
`xlsx`	`application/vnd.openxmlformats-officedocument.spreadsheetml.sheet`	`.xlsx`

The response sets Content-Disposition: attachment with a filename of the form signaux-reputation-veille-{id}-run-{run_id}.{fmt}.

Specific cause: 400 unsupported fmt (must be csv, json, or xlsx).
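
A client can derive the export path, the filename to expect in Content-Disposition, and the media type from the table above, rejecting unsupported formats before calling the API. A minimal sketch with the hypothetical helper name `export_target`:

```python
EXPORT_MEDIA_TYPES = {
    "csv": "text/csv; charset=utf-8",
    "json": "application/json",
    "xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
}


def export_target(veille_id, run_id, fmt):
    """Return (path, expected filename, media type) for a signal export."""
    if fmt not in EXPORT_MEDIA_TYPES:
        raise ValueError("fmt must be one of csv, json, xlsx")
    path = f"/api/veille/{veille_id}/runs/{run_id}/signals.{fmt}"
    filename = f"signaux-reputation-veille-{veille_id}-run-{run_id}.{fmt}"
    return path, filename, EXPORT_MEDIA_TYPES[fmt]
```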