Sources & cadence
The 12 MVP sources, what we fetch, how often, and under which licence.
| ID | Source | Strategy | Cadence | Licence |
|---|---|---|---|---|
| eurlex | EUR-Lex (CELEX 32024R1689 family) | SPARQL | 30 min | Decision 2011/833/EU |
| aioffice | EU AI Office (Drupal news) | RSS | 30 min | Decision 2011/833/EU |
| aiboard | EU AI Board (opinions, recommendations) | HTML scrape | 2 h | Decision 2011/833/EU |
| codeofpractice | GPAI Code of Practice | HTML scrape | 6 h | Decision 2011/833/EU |
| haveyoursay | EU Have-Your-Say (AI initiatives) | HTML scrape | 12 h | Decision 2011/833/EU |
| cen | CEN/CENELEC JTC 21 | HTML scrape (metadata only) | 12 h | Standards metadata only; full texts not redistributed |
| bnetza | Bundesnetzagentur (DE market authority) | HTML scrape | 4 h | §5 UrhG (DE official works) |
| bfdi | BfDI (DE data protection) | HTML scrape | 6 h | §5 UrhG |
| bsi | BSI (DE cybersecurity) | HTML scrape | 6 h | §5 UrhG |
| cnil | CNIL (FR data protection) | RSS | 30 min | Etalab Licence Ouverte 2.0 |
| nl-algoritmeregister | NL Algoritmeregister | HTML scrape | 4 h | CC0 |
| oecdai | OECD.AI Policy Observatory | HTML scrape | 12 h | CC-BY 4.0 |
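
As a rough illustration of how the cadences in the table above could map onto cron schedules, here is a minimal TypeScript sketch. The names `CADENCE_MINUTES`, `cadenceToCron`, and `groupByCron` are hypothetical and not taken from the codebase, and the actual trigger configuration may look different. The sketch assumes every cadence divides an hour or a day evenly, which holds for all twelve sources listed.

```typescript
// Cadences copied from the table above, in minutes (source ID -> interval).
const CADENCE_MINUTES: Record<string, number> = {
  eurlex: 30,
  aioffice: 30,
  aiboard: 120,
  codeofpractice: 360,
  haveyoursay: 720,
  cen: 720,
  bnetza: 240,
  bfdi: 360,
  bsi: 360,
  cnil: 30,
  "nl-algoritmeregister": 240,
  oecdai: 720,
};

// Hypothetical helper: convert a cadence in minutes into a five-field cron expression.
function cadenceToCron(minutes: number): string {
  if (!Number.isInteger(minutes) || minutes <= 0) {
    throw new RangeError(`invalid cadence: ${minutes}`);
  }
  if (minutes < 60) {
    if (60 % minutes !== 0) {
      throw new RangeError(`${minutes} min does not divide an hour evenly`);
    }
    return `*/${minutes} * * * *`;
  }
  const hours = minutes / 60;
  if (!Number.isInteger(hours) || 24 % hours !== 0) {
    throw new RangeError(`${minutes} min does not divide a day evenly`);
  }
  return hours === 24 ? "0 0 * * *" : `0 */${hours} * * *`;
}

// Group source IDs by cron expression, so one trigger can serve several sources.
function groupByCron(cadences: Record<string, number>): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const [id, minutes] of Object.entries(cadences)) {
    const cron = cadenceToCron(minutes);
    const ids = groups.get(cron) ?? [];
    ids.push(id);
    groups.set(cron, ids);
  }
  return groups;
}
```

Grouping by expression means the twelve sources need only five distinct schedules (30 min, 2 h, 4 h, 6 h, 12 h) rather than one trigger per source.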
What "cadence" actually means
Each source worker runs on a Cloudflare cron trigger at the listed interval. Each run fetches the source, diffs the result against the previous snapshot, and fans out only new or changed items to enrichment. Cadence is the upper bound on detection latency: an item published just after a run waits up to one full interval, so end-to-end "publish → webhook" latency is up to the cron interval plus 1-2 minutes for enrichment and delivery.
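
The fetch, diff, and fan-out step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual worker code: the `SourceItem` shape, the `fingerprint` field (for example a hash of the item's content), and the `diffSnapshot` function are all hypothetical, and the real snapshot store is not described here.

```typescript
// Hypothetical item shape: `fingerprint` stands in for any stable content digest.
interface SourceItem {
  id: string;
  fingerprint: string;
  title: string;
}

// A snapshot maps item ID -> fingerprint as seen in the previous run.
type Snapshot = Map<string, string>;

// Compare the freshly fetched items with the previous snapshot.
// Returns the items that are new or changed, plus the snapshot to store for the next run.
function diffSnapshot(
  previous: Snapshot,
  fetched: SourceItem[],
): { toEnrich: SourceItem[]; next: Snapshot } {
  const toEnrich: SourceItem[] = [];
  const next: Snapshot = new Map();
  for (const item of fetched) {
    next.set(item.id, item.fingerprint);
    // A missing ID (new item) or a different fingerprint (changed item) both qualify.
    if (previous.get(item.id) !== item.fingerprint) {
      toEnrich.push(item);
    }
  }
  return { toEnrich, next };
}

// Usage: one unchanged item, one changed item, one new item.
const previousRun: Snapshot = new Map([
  ["a", "h1"],
  ["b", "h2"],
]);
const currentRun: SourceItem[] = [
  { id: "a", fingerprint: "h1", title: "Unchanged notice" },
  { id: "b", fingerprint: "h2-edited", title: "Amended notice" },
  { id: "c", fingerprint: "h3", title: "New notice" },
];
const result = diffSnapshot(previousRun, currentRun);
console.log(result.toEnrich.map((item) => item.id)); // ["b", "c"]
```

In this sketch, items that disappear from the source are dropped from the next snapshot, so an item that later reappears would be treated as new. Carrying old entries forward instead is an equally plausible design.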
What we don't do with the sources
- We do not redistribute full texts of CEN/ISO/ETSI standards (they're paywalled; we carry metadata only).
- We do not republish individual consultation responses from third parties (the rights belong to their authors; we carry metadata only).
- We do not aggregate FLI's artificialintelligenceact.eu as a primary source; it's a curated explorer, and we link to it where helpful.
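
One way to enforce the metadata-only rules above is a small filter applied before delivery. The sketch below is purely illustrative: the `EnrichedItem` fields, the `METADATA_ONLY_SOURCES` set, and `applyRedistributionPolicy` are hypothetical names. Treating the whole `haveyoursay` source as metadata-only is also a simplifying assumption, since the rule above concerns individual third-party responses rather than every item from that source.

```typescript
// Hypothetical shape of an item after enrichment; `fullText` is optional.
interface EnrichedItem {
  sourceId: string;
  title: string;
  url: string;
  publishedAt: string;
  fullText?: string;
}

// Assumption: these source IDs are the ones covered by the metadata-only rules.
const METADATA_ONLY_SOURCES: ReadonlySet<string> = new Set(["cen", "haveyoursay"]);

// Return a copy of the item with the full text removed when the source is metadata-only.
// The input item is never mutated.
function applyRedistributionPolicy(item: EnrichedItem): EnrichedItem {
  if (!METADATA_ONLY_SOURCES.has(item.sourceId)) {
    return item;
  }
  const copy: EnrichedItem = { ...item };
  delete copy.fullText;
  return copy;
}
```

Keeping the rule in one place, rather than inside each scraper, makes it harder for a new source worker to leak full texts by accident.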
Sources for Phase 2
On the roadmap, not yet wired:
- ISO/IEC SC 42 (standards metadata)
- ETSI SAI
- AESIA (Spain)
- AgID + Garante (Italy)
- IMY (Sweden)
- CJEU (EuGH)/InfoCuria for the first AI Act case law (probably 2026/27)