French legal data

The legal_data module enriches a list of POIs (points of interest) with official records from French public legal data sources. For each input row, the module queries api.gouv.fr (SIRENE, INPI, RNCS) and complements the response with BODACC legal notices and Infogreffe public extracts. The result is a structured profile attached to each company: legal form, capital, registered executives, NAF code, headcount band, headline financials, and a consolidated lead status.

The module is read-only: no credentials are required, no fees are charged by the upstream sources, and no business is contacted as part of the lookup.

No website needed. Unlike legal_ids, which reads identifiers from a website, this module matches each row by name + address against SIRENE and also returns the SIRET/SIREN. Choose it when your list has no website column — for example a Google Maps scrape with names and map links only.

Purpose

French B2B prospecting lists tend to start with a name, an address, and maybe a website. legal_data turns each row into a qualified company record using only public registries.

Typical use cases:

Inputs

legal_data is an enrichment module: it consumes an existing list of POI rather than producing one. The expected input is a poi_list, typically the output of a discovery job.

| Field | Required | Notes |
| --- | --- | --- |
| nom | yes | Company name, used for fuzzy matching. |
| siren | no | If present, used for an exact match (preferred). |
| code_postal | no | Disambiguates fuzzy name matches. |
| lat, lon | no | Geographic fallback when name and SIREN both fail. |

Match resolution follows three tiers, in order:

  1. Exact SIREN lookup when the identifier is provided.
  2. Fuzzy match on nom + code_postal.
  3. Geographic fallback on coordinates within a small radius.

A row that cannot be resolved is returned with empty enrichment columns and an error code (see Errors).
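The three tiers above can be sketched as a simple fall-through. This is a minimal illustration, not the module's actual implementation: the `by_siren`, `by_name`, and `by_geo` callables are hypothetical stand-ins for the real registry lookups, and each is assumed to return a match dict or `None`.

```python
from typing import Callable, Optional, Tuple

Row = dict
# Hypothetical lookup signature: takes an input row, returns a match or None.
Lookup = Callable[[Row], Optional[dict]]


def resolve(row: Row, by_siren: Lookup, by_name: Lookup,
            by_geo: Lookup) -> Tuple[Optional[dict], Optional[str]]:
    """Try each tier in order; return (match, tier) or (None, None)."""
    # Tier 1: exact SIREN lookup, only when the identifier is provided.
    if row.get("siren"):
        match = by_siren(row)
        if match is not None:
            return match, "siren"
    # Tier 2: fuzzy match on nom, disambiguated by code_postal if present.
    if row.get("nom"):
        match = by_name(row)
        if match is not None:
            return match, "fuzzy"
    # Tier 3: geographic fallback, only when both coordinates are present.
    if row.get("lat") is not None and row.get("lon") is not None:
        match = by_geo(row)
        if match is not None:
            return match, "geo"
    # Unresolved: the caller leaves enrichment columns empty and sets an error.
    return None, None
```

Passing the lookups as arguments keeps the tier order testable without touching any upstream registry.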

Outputs

Each input row is augmented with the following columns. Empty values are preserved as empty strings — the module never fabricates a value.

| Column | Type | Description |
| --- | --- | --- |
| legal_form | string | Legal form (SAS, SARL, SA, EI, association, etc.). |
| capital | number | Registered share capital in EUR. |
| founding_date | date | Date of registration in the company register. |
| executives | list | Named executives with role (Président, Gérant, DG). |
| financials | object | Last available revenue and net income, with fiscal year. |
| naf_code | string | Five-character NAF/APE activity code. |
| employees_range | string | INSEE headcount band (e.g. 10-19, 100-199). |

A consolidated lead_status is also returned, taking one of four values: mort, alerte, opportunite, actif. It encodes the combination of administrative state, BODACC signals, and recency of legal events.
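For client code, the output columns can be modelled with typed dictionaries. The sketch below is an assumption about shape only: the column names come from the table above, but the keys inside `executives` and `financials` (`name`, `role`, `revenue`, `net_income`, `fiscal_year`) and the date format are hypothetical, since this page does not specify them.

```python
from typing import List, Literal, TypedDict

LeadStatus = Literal["mort", "alerte", "opportunite", "actif"]
LEAD_STATUSES = ("mort", "alerte", "opportunite", "actif")


class Executive(TypedDict):
    name: str  # hypothetical key
    role: str  # e.g. "Président", "Gérant", "DG"


class Financials(TypedDict, total=False):
    revenue: float      # hypothetical key
    net_income: float   # hypothetical key
    fiscal_year: int    # hypothetical key


class EnrichedColumns(TypedDict, total=False):
    # Missing values arrive as empty strings, so real rows may hold ""
    # where a number or list is declared here.
    legal_form: str
    capital: float
    founding_date: str  # assumed to be serialized as text
    executives: List[Executive]
    financials: Financials
    naf_code: str
    employees_range: str
    lead_status: LeadStatus
    error: str


def has_known_lead_status(row: dict) -> bool:
    """True when lead_status is one of the four documented values."""
    return row.get("lead_status") in LEAD_STATUSES
```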

Lifecycle

Standard job lifecycle — see Jobs lifecycle. Progress is reported per establishment processed. The job is idempotent within a session: re-running on the same input list yields the same enriched columns, modulo upstream registry updates.

Pipeline

needs:     poi_list
produces:  enriched_list

legal_data consumes a poi_list and emits an enriched_list carrying the original rows plus the columns described in Outputs. The enriched list can itself be consumed by downstream enrichment modules (legal_mentions, legal_ids, etc.).

Endpoints

Create a job

POST /api/jobs/legal-data

Request body:

{
  "items": [
    { "nom": "Boulangerie Martin", "code_postal": "75011" },
    { "siren": "552120222" }
  ],
  "source_job_id": "job_01HXYZ..."
}

Either items or source_job_id must be provided. When source_job_id references a completed discovery job, its rows are used as input directly.

Response: a Job resource with id, type, status, and progress fields.
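A minimal sketch of building this request with the Python standard library. The host `https://api.example.com` is a placeholder, and no authentication header is set because this page does not describe one. The function only constructs the request; sending it is left to the caller.

```python
import json
import urllib.request
from typing import List, Optional

BASE_URL = "https://api.example.com"  # placeholder host, not the real one


def build_create_job_request(items: Optional[List[dict]] = None,
                             source_job_id: Optional[str] = None,
                             base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST /api/jobs/legal-data request without sending it."""
    # The endpoint requires at least one of the two inputs.
    if not items and not source_job_id:
        raise ValueError("provide items or source_job_id")
    body = {}
    if items:
        body["items"] = items
    if source_job_id:
        body["source_job_id"] = source_job_id
    return urllib.request.Request(
        base_url + "/api/jobs/legal-data",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The request can then be sent with `urllib.request.urlopen(req)` and the response body parsed as JSON to obtain the job `id`.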

Retrieve a job

GET /api/jobs/{job_id}

Returns the current state, progress counters, and — when done — the download URL for the enriched CSV.
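Clients typically poll this endpoint until the job finishes. The loop below is a sketch under assumptions: `fetch_job` is a hypothetical callable that performs the GET and returns the parsed Job resource, and `done` and `failed` are assumed to be the literal terminal values of `status`.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"done", "failed"}  # assumed literal status values


def wait_for_job(job_id: str,
                 fetch_job: Callable[[str], dict],
                 interval: float = 2.0,
                 timeout: float = 600.0,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll fetch_job(job_id) until the job reaches a terminal status."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        # Give up once the time budget is spent.
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still running after {timeout}s")
        sleep(interval)
```

Injecting `fetch_job` and `sleep` keeps the loop independent of any HTTP client and makes it easy to exercise in tests.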

List jobs

GET /api/jobs?type=legal_data

Limits

Maximum 5,000 rows per job. Larger lists must be split client-side. Global quotas and rate limits: see Limits.
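Client-side splitting can be as simple as slicing the list into consecutive batches. A minimal sketch, where `split_rows` is an illustrative helper name and each batch would be submitted as its own job:

```python
from typing import List

MAX_ROWS_PER_JOB = 5000  # per-job limit stated above


def split_rows(rows: List[dict], max_rows: int = MAX_ROWS_PER_JOB) -> List[List[dict]]:
    """Split rows into consecutive batches of at most max_rows each."""
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    return [rows[i:i + max_rows] for i in range(0, len(rows), max_rows)]
```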

Financial figures depend on the company having filed its accounts (roughly 60 percent of French SMEs). Executive names reflect the last filing; recent changes may take a few weeks to propagate.

Errors

Row-level errors are reported in an error column on the enriched output. Job-level errors transition the job to failed.

| Code | Scope | Meaning |
| --- | --- | --- |
| not_found | row | No match in SIRENE for the provided name and postcode. |
| foreign_business | row | Establishment is not registered in France. |
| ambiguous_match | row | Several candidates with equal score; none selected. |
| source_unavailable | job | One or more upstream public sources are unreachable. |
| quota_exceeded | job | Daily fair-use quota reached; retry the next day. |
| invalid_input | job | Input list is empty or missing required fields. |
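Row-level errors can be triaged from the enriched CSV by grouping on the error column. The sketch below assumes the column is literally named `error` and is empty for successfully enriched rows; `group_by_error` is an illustrative helper name.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_error(rows: Iterable[dict]) -> Dict[str, List[dict]]:
    """Group rows by error code; rows without an error go under 'ok'."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for row in rows:
        code = (row.get("error") or "").strip()
        groups[code or "ok"].append(row)
    return dict(groups)


def read_enriched_csv(text: str) -> List[dict]:
    """Parse enriched CSV text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text)))
```

Rows under `ambiguous_match` are natural candidates for a second pass with a `siren` or `code_postal` added to the input.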

A source_unavailable failure preserves all rows already enriched before the outage. The job can be re-submitted with the remaining rows once the upstream source recovers.
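One way to compute the rows to re-submit is to subtract the rows present in the partial output from the original input. This sketch rests on two assumptions that this page does not confirm: the partial output contains only rows processed before the outage, and each row carries a stable client-side identifier (here a hypothetical `row_id` column) that survives enrichment.

```python
from typing import Callable, Hashable, Iterable, List


def by_row_id(row: dict) -> Hashable:
    """Default key: a hypothetical client-assigned row_id column."""
    return row["row_id"]


def remaining_rows(input_rows: Iterable[dict],
                   enriched_rows: Iterable[dict],
                   key: Callable[[dict], Hashable] = by_row_id) -> List[dict]:
    """Return input rows whose key does not appear in the partial output."""
    processed = {key(row) for row in enriched_rows}
    return [row for row in input_rows if key(row) not in processed]
```

If the list has no such identifier, a different `key` function (for example on name and postcode) can be passed instead, at the cost of collisions between homonymous rows.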

What's next