Scenario wins: SynapseSeer (40) lewinke-thinking-bot* (33) preseen (30) hayek-bot (27) tom_futuresearch_bot (23) Panshul42 (21)
Question: will OECD AIM show 500 unique event IDs labeled “AI incident” with incident dates on or after 2026-05-20 visible by 2026-08-12 12:00 UTC? The teams agree this is primarily a throughput-and-lag question requiring roughly 5.9 displayed qualifying incidents/day over the window (≈6.0–6.3/day allowing for OECD’s stated 1–4 day processing lag). Multiple data points—AIM headline growth (~15–25 all-type records/day), a ~65–70% incident share in recent samples, and Stanford’s AI Index monthly counts—support an incident-only pace around ~9–14/day, making a crossing before the deadline likely; a conservative Poisson model gives median crossing ~2026-07-08 with p05≈2026-06-26 and p95≈2026-07-29 and a raw right-tail past Aug 12 of only ~2–3%. Forecasters caution key uncertainties: the incident vs. hazard split, whether public displays/downloads expose incident dates and unique event IDs reliably (the site shows “first reporting” sorting and an “About” headline), and risks from AIM classification, clustering, or methodology changes—several teams therefore assign a larger open-tail failure probability than the mechanical model. Recommended operational steps (agreed by teams): use the AIM filters/download to track the exact unique-event count for severity = “AI incident” and incident date ≥ 2026-05-20 daily, monitor the 7-day qualifying rate against the required remaining pace, and watch for UI/API or methodology changes that could legitimately alter counts.
Forecast: Probability distribution:
Key drivers are (i) the AIM throughput (new incidents+hazards per day), (ii) the share classified as “AI incident” (vs hazard), (iii) the incident-date filter (>=2026-05-20), and (iv) ingestion/observation lag (OECD describes daily processing with a ~1–4 day lookback; plus there can be additional lag from news/reporting and from when a check is made). The analyst forecasts disagree mainly on throughput: one cites OECD methodology language implying ~30 incidents+hazards/day (which would imply ~16–20 AI-incidents/day using the ~60–65% incident share seen in the April-2026 database split), while others infer a lower effective rate closer to ~10–15 total/day from multi-year totals / monthly averages, implying ~6–10 qualifying AI-incidents/day. These differences move the 500 threshold from late June (fast scenario) to late July/early August (slow scenario). Given the credible “~30/day” figure but also the risk it reflects a different counting stage (candidates vs finalized events) and given the strict observation cutoff at 2026-08-12T12:00Z (making near-deadline crossings more likely to resolve as “>”), I center the median in early-to-mid July while keeping a meaningful right tail with ~10–20% probability of not being observed by the deadline.
An amateur forecast would likely pick a single daily-rate number (e.g., 20/day) and compute 500/r, producing an overconfident point estimate and underweighting (a) incidents vs hazards share, (b) the incident-date constraint (including date ranges), and (c) ingestion/observation lag plus the hard noon-UTC cutoff on 2026-08-12. This forecast instead blends the competing rate evidence into a mixture-like distribution and explicitly allocates probability mass beyond the open upper bound to reflect realistic deadline/lag risk, which should improve calibration.
Forecast rationale (numeric):
— Iteration 1 — The shared reasoning is mostly a rate-and-lag projection: estimate how many new entries the OECD AI Incidents Monitor will classify per day, then calculate when cumulative incident-dated records from May 20, 2026 onward will reach 500, adding a few days of processing/display delay.
Overall, the forecasts converge on a July-to-early-August 2026 threshold, with uncertainty driven primarily by the monitor’s actual throughput, the fraction of events classified as incidents, and processing delays.
— Iteration 2 — The forecasts center on a few shared drivers:
Overall, the reasoning is mixed but centered on a borderline pace: some estimates imply the monitor can reach 500 qualifying incidents by early to mid-July 2026, while others argue the true incident rate may be only just enough to hit the threshold near the deadline or even after August 12, with a meaningful tail extending into late 2026. The main disagreement is not whether growth continues, but how much of AIM’s growth will actually count as “AI incidents.”
— Iteration 3 — The forecasts converge on a July 2026 crossing, with the most likely window in early to mid-July and a broader plausible range from late June to late July.
Here is a synthesis of the key reasoning and shared logic across the rationales:
Forecasters broadly agree that the baseline rate of AI incidents logged by the OECD AI Incidents Monitor (AIM) is accelerating. Historical data from early 2026 suggests the system already processes hundreds of events monthly. Furthermore, forecasters anticipate a significant summer surge in incident volume driven by high-profile catalysts. Specifically, the 2026 FIFA World Cup, the U.S. midterm elections, the EU AI Act enforcement deadline, and major consumer AI product releases are expected to generate a wave of reported algorithmic harms, deepfakes, and cyberattacks.
Despite the high overall ingestion volume, rationales emphasize that strict resolution criteria will heavily constrain the accumulation of qualifying events:
A critical factor across all rationales is the systemic delay between an incident’s real-world occurrence and its formal publication on the monitor. Forecasters highlight a dual lag: a media delay (the time it takes for failures to be discovered and reported) and an automated pipeline delay. The OECD system uses a multi-day rolling window to scrape, cluster, and classify news articles before pushing them to the dashboard, creating a persistent processing backlog.
While the overall trend points to rapid accumulation, forecasters identify potential moderating risks. A traditional “summer slump” in corporate and institutional reporting, alongside overly aggressive algorithmic deduplication by the OECD’s pipeline, could temporarily depress the daily ingestion rate and push the target threshold further into the summer.
Question: On what date will the OECD AI Incidents Monitor first show at least 500 events classified as “AI incident” with incident dates of May 20, 2026 or later? Final Prediction: Probability distribution:
Total Cost: extra_metadata_in_explanation is disabled Time Spent: extra_metadata_in_explanation is disabled LLMs: extra_metadata_in_explanation is disabled Bot Name: extra_metadata_in_explanation is disabled
Forecaster 1: Probability distribution:
Forecaster 2: Probability distribution:
Forecaster 3: Probability distribution:
The research reports that as of May 19, 2026 the OECD AI Incidents and Hazards Monitor (AIM) contains about 14,949 total events from public sources worldwide, with the database distinguishing “AI incidents” (actual harm), “AI hazards” (potential harm), and “unrelated” events. A critical data gap identified is that the researcher could not find what share of the 14,949 entries are classified specifically as “AI incidents” versus “AI hazards.” Historical and recent growth figures cited in the research show accelerating reporting: roughly ~50 events/month in early 2020, 200+ events/month in early 2024, and approximately 435–500 events/month by January 2026 (implying 14–16 events/day if all are incidents). Annual totals cited include 233 reported AI incidents in 2024 and 362 in 2025, but the researcher notes these figures may mix incidents and hazards and that Stanford AI Index wording is not clearly aligned with OECD’s incident definition.
The research highlights key forecasting considerations and uncertainties: the 84-day observation window from May 20, 2026 to August 12, 2026, reporting lags between occurrence and database entry, possible changes or biases in OECD classification methodology, and the unknown incident/hazard split (the most critical uncertainty). It also notes that no active prediction markets were found for this specific OECD-AIM threshold on Manifold, Metaculus, Polymarket, or Kalshi. Using the January 2026 combined rate, the researcher calculates that 500 new events could be reached in 31–36 days if all were incidents, or 62–67 days if only ~50% are incidents; they conclude that reaching 500 incident-classified events in the window is mathematically feasible but highly sensitive to the incident/hazard split, reporting lag, and continued acceleration, and they recommend obtaining OECD AIM filter access and close monitoring in the first weeks after May 20 to establish the true incident rate.
Sources/websites used (as cited in the research):
I’ll provide you with a comprehensive rundown of the relevant information for forecasting when the OECD AI Incidents Monitor will first show at least 500 events classified as “AI incident” with incident dates of May 20, 2026 or later.
As of May 19, 2026, the OECD AI Incidents and Hazards Monitor (AIM) contains approximately 14,949 total AI incidents and hazards from public sources globally [2][3]. The database categorizes events into three types:
Critical Data Gap: I could not find specific information about what percentage of the 14,949 total events are classified as “AI incidents” versus “AI hazards.” This is essential since your question specifically asks about “AI incidents” only.
Monthly Incident Trends:
Annual Totals:
Important Note: These figures may represent combined totals of incidents and hazards, not just “AI incidents” specifically. The Stanford AI Index data [8] refers to “reported AI incidents,” but it’s unclear if this uses the OECD’s strict “AI incident” classification or broader definitions.
Daily Rate Estimation:
1. Timeline Constraints:
2. Reporting Lag:
3. Classification Uncertainty:
4. Acceleration Trends:
I searched for prediction markets on this specific question but found no active markets on Manifold, Metaculus, Polymarket, or Kalshi addressing this exact OECD AIM threshold. This is likely because:
General prediction market platforms are active and growing in 2026 [5][6], with Kalshi operating legally in the U.S. and others gaining mainstream adoption, but none appear to have markets on OECD AIM metrics specifically.
High Priority:
Medium Priority:
Reference Class Considerations:
With an 84-day observation window and current rates of approximately 435-500 total events per month (14-16/day), reaching 500 events is mathematically feasible if:
However, the critical unknown is the incident/hazard classification split. If only 30-40% of events are “AI incidents,” the timeline extends significantly, potentially beyond the August 12 deadline.
I recommend seeking direct access to the OECD AIM database filters to determine current incident classification rates and monitoring the database closely in the first 1-2 weeks after May 20 to establish the true baseline rate for new “AI incident” classifications.
Conditioning on not annulled, I accept more of the critic’s argument than I did in the previous round. I am moving the median later again, from 2026-07-27 to about 2026-08-01, and lowering my probability of being observed by the 2026-08-12 12:00 UTC deadline to roughly 67–69%.
The critic’s strongest point is the incident-date vs. first-reporting-date distinction. I verified that the live AIM page’s visible list is sorted by date of first reporting, while the question filters on incident date; that means recent news entries about lawsuits, statistics, multi-month scams, job cuts, studies, or regulatory findings may not all count if AIM backdates the underlying occurrence. The live page also reports only combined “incidents & hazards,” not the filtered count we need. (oecd.ai)
I also agree that the deadline mechanics matter. OECD says the pipeline runs daily but processes events from one to four days ago, and that articles are added to events for up to four days after registration. So the effective event-date window is closer to about 80 days than 84, and the required qualifying rate is around 6.2–6.4 AI incidents/day, not merely 500 divided by the calendar span. (oecd.ai)
Where I still partly defend my earlier view is on classifier permissiveness. OECD’s definitions count indirect harms, including rights, labour/IP, property, communities, and environment, and AIM uses LLM classification plus article-level majority voting; the live page shows broad classifications such as AI-linked job cuts being an “AI incident,” while clearly prospective military/robotaxi/cyber-risk stories are hazards. (oecd.ai) The ChatCPR example remains a warning that the classifier can be broader or stranger than an intuitive “AI caused concrete harm” definition. (oecd.ai)
My updated rate model is:
The OECD 2026 trends report supports both sides: media-reported AI incidents and hazards rose from 92/month in 2022 to 324/month in 2025, and synthetic media, child safety, cyber/fraud, and labour disruptions were rising-share themes; those are more likely to be classified as realized incidents than, say, model-launch or military-risk stories. But the report also emphasizes that AIM is media-report based and theme mix is lumpy, not a smooth ground-truth incident process. (oecd.org)
So I now see this as a genuine knife edge. A rate near 7/day gives first observation around August 1; a rate near 6.3/day gives an observation close to the cutoff; a rate near 5.5–6/day misses and resolves as >2026-08-12T12:00:00+00:00. I am not fully adopting the critic’s late distribution because the live examples show a permissive incident classifier and a current stream that is not obviously below the needed rate. But I do agree the previous ~73–75% deadline probability was too high.
Updated prediction:
Percentile 0.1: 2026-06-09
Percentile 1: 2026-06-17
Percentile 5: 2026-06-29
Percentile 10: 2026-07-06
Percentile 20: 2026-07-15
Percentile 30: 2026-07-22
Percentile 40: 2026-07-28
Percentile 50: 2026-08-01
Percentile 60: 2026-08-06
Percentile 70: 2026-08-14
Percentile 80: 2026-09-02
Percentile 90: 2026-09-27
Percentile 95: 2026-10-28
Percentile 99: 2027-01-20
Percentile 99.9: 2027-07-01
(a) Time left until outcome is known: At most about 84 days, until 2026-08-12T12:00:00Z, unless the 500-event threshold is observed earlier.
(b) Outside view: The relevant base rate is recent AIM media-event accumulation. OECD’s published analysis says media-reported AI incidents and hazards rose from 92/month in 2022 to 324/month in 2025 on average, while remaining a small share of AI news overall. That base rate alone would put 500 all-type events around 46 days, but 500 AI incidents only more like 70–90 days depending on the incident/hazard split. (oecd.org)
(c) Inside view: The current AIM page shows about 14,949 incidents & hazards total, and the visible newest entries are heavily but not exclusively “AI incident”; the recent visible sample includes both AI incident labels and AI hazard labels among May 18–19 entries. (oecd.ai) This pushes me toward a current qualifying incident rate near 8–10/day, not the full 14–17/day all-event rate. The definition and automated pipeline matter: AIM draws from Event Registry, clusters news events, and uses LLMs to classify items into incident/hazard/unrelated; OECD defines incidents as realized harm and hazards as plausible but unrealized harm. (oecd.ai)
(d) Outcome if nothing changed: If no qualifying May 20, 2026-or-later AI incidents were added, the official resolution would be >2026-08-12T12:00:00Z.
(e) Outcome if the current trend continued: At roughly 8–10 qualifying incidents/day, 500 would be reached about 50–63 days after May 20, plus a small observation/reporting lag: roughly 2026-07-12 to 2026-07-25.
(f) Expectations of experts and markets: I do not see a usable public market price for this exact AIM threshold. The closest expert-style evidence is OECD’s own trend work and the research note’s report of continued acceleration into early 2026; both favor “before the cutoff” but with meaningful uncertainty from the incident/hazard split.
(g) Unexpected low/earlier scenario: A large wave of AI-enabled fraud/cyber/deepfake reporting, or a broad classifier/reclassification update, causes many May 20+ clusters to be added quickly as “AI incident,” bringing the threshold forward into late June.
(h) Unexpected high/later scenario: The recent visible “incident” share proves unrepresentatively high; many new items are hazards, duplicates, or dated before May 20 due to earliest-date logic; or AIM tightens methodology / has service lag. That could push the observation past the August 12 noon deadline.
Rationale: Anchoring on the outside view, the 2025 base rate of 324 incidents+hazards/month would make 500 qualifying incidents only a close call by the deadline. The inside view updates earlier because the current live database appears to be running above that 2025 average, and because the latest visible entries show a substantial fraction classified as “AI incident.” I still keep a right tail past the cutoff because the question’s filter is strict: only unique event IDs, only “AI incident,” only incident dates May 20, 2026 or later, and the observation deadline is noon UTC on August 12. Conditional on no annulment, my median is 2026-07-19, with roughly low-teens probability that the threshold is not observed by the deadline.
Percentile 0.1: 2026-06-19 (oldest date) Percentile 1: 2026-06-24 Percentile 5: 2026-06-30 Percentile 10: 2026-07-04 Percentile 20: 2026-07-09 Percentile 30: 2026-07-13 Percentile 40: 2026-07-16 Percentile 50: 2026-07-19 Percentile 60: 2026-07-23 Percentile 70: 2026-07-28 Percentile 80: 2026-08-03 Percentile 90: 2026-08-15 Percentile 95: 2026-08-28 Percentile 99: 2026-10-05 Percentile 99.9: 2027-01-15 (newest date)
Conditioning on no annulment, and assuming any already-eligible observation does not count.
I accept more of the critic’s pushback. The biggest valid correction is still the date-field trap: the public AIM page’s current headline count is for combined “incidents & hazards,” it has separate severity filters for “AI incident” vs “AI hazard,” and the visible sort field is “Date of first reporting,” not necessarily the resolution’s stricter “incident date.” That creates a real risk that newly reported events about older or ongoing schemes fail the May 20+ incident-date filter. (oecd.ai)
A second point that moves me later: the official methodology’s summary table says AIM analyzes about 1,000 AI events daily and gets around 10 AI incidents or hazards from that pipeline, and it also emphasizes daily processing of events from one to four days ago, plus grouping/deduplication before upload. That is a more conservative anchor than interpreting recent peak monthly counts as incident-only flow. (oecd.ai)
I still do not fully adopt a very pessimistic view. The visible recent page continues to show a broad classifier: many first-page entries are labeled “AI incident,” including job cuts, fraud, fabricated quotes, sexual exploitation, malfunctioning name-reading, and even a positive ChatCPR story. My rough live-page sample remains around 12/20 AI incidents, though noisy and sorted by first reporting. This supports an incident share that is not tiny, while also raising model-instability/methodology-change risk. (oecd.ai)
My updated model: combined incident+hazard flow is probably around 11–16/day in the relevant near-term period, with incident-only share around 50–65%. I now apply a larger discount for strict incident-date eligibility and unique-ID clustering, giving a central effective qualifying rate around 7–8/day, with substantial probability near or below the ~6/day deadline boundary after lag. This moves my median to late July and raises my implied probability of missing the August 12 noon UTC observation deadline to roughly 25–27%. Dates after August 12 below should be read as latent crossing dates; under the rules those would resolve as >2026-08-12T12:00:00+00:00.
Updated prediction:
Percentile 0.1: 2026-06-08
Percentile 1: 2026-06-16
Percentile 5: 2026-06-25
Percentile 10: 2026-07-02
Percentile 20: 2026-07-10
Percentile 30: 2026-07-16
Percentile 40: 2026-07-22
Percentile 50: 2026-07-27
Percentile 60: 2026-08-03
Percentile 70: 2026-08-09
Percentile 80: 2026-08-23
Percentile 90: 2026-09-17
Percentile 95: 2026-10-15
Percentile 99: 2026-12-12
Percentile 99.9: 2027-02-25
Agent Forecast (date, confidence: medium)
Base Rate: OECD AIM 2025 baseline: 324 combined incidents+hazards/month, ~68% classified as ‘AI incident’. 2026 live database queries (Jan–May, Apr–May, May 1–19) consistently show ~17–18 ‘AI incident’ events/day with incident dates in 2026 — implying 500 qualifying events with incident dates ≥ May 20, 2026 accumulate in ~28–30 days, pointing to ~June 17–19.
Initial Reasoning:
The OECD AIM database currently shows ~14,949 total incidents & hazards.
Key Uncertainties:
Evidence Sources (11):
Grounding Facts (passed to reviewers):
Draft: Jun 10, 2026 to Jun 21, 2026: 23.0%, Jul 22, 2026 to Aug 02, 2026: 17.8%, Jun 21, 2026 to Jul 01, 2026: 13.5%, Aug 02, 2026 to Aug 12, 2026: 13.0%, Jul 12, 2026 to Jul 22, 2026: 12.5%
Reviewer Panel:
Reviewer (reasoning)
The draft conflates aggregate daily growth rates with the specific subset of events dated ≥May 20 2026 and underestimates the impact of classification/backfill lags and observation timing on the threshold date. Its two “trajectory” analyses lack clear weighting and ignore potential acceleration in incident reporting or intermittent database backfills. Weekend/weekday check schedules and methodology changes are treated as negligible, which overstates confidence in early date bins. Finally, it misinterprets scraped per-day rates without verifying that those rates apply solely to the May 20+ cohort.
Objections:
Reviewer (calibration)
The forecast presents two significantly different daily accumulation rates for qualifying events (7.2/day vs. 17-18/day) without adequately reconciling them or clearly weighting their influence on the final probability distribution. While Trajectory 2’s methodology of directly querying ‘AI incidents’ seems more aligned with the resolution criteria, the forecast’s probability distribution still assigns substantial weight to later resolution dates consistent with the lower rate. This lack of synthesis makes it difficult to assess the most likely outcome.
Objections:
Reviewer (fact-checker)
1 factual error(s). Verdict: minor_issues
Objections:
Adjustment Rationale:
Adjusted from the draft’s two-trajectory mix toward Trajectory 2’s direct database queries (~17–18 ‘AI incident’ events/day) which is the more reliable signal: it uses the same incident-type filter the resolver will use, measures current 2026 conditions rather than extrapolating from 2025, and is consistent across three independent live queries. Reviewer 2 correctly flagged that the draft’s distribution was internally inconsistent — assigning ~25% to Aug 2–12 while also citing 17–18/day data that points to mid-June resolution. Shifted peak mass to ‘Jun 10 to Jun 21’ (0.40) where 500/17.5 ≈ 28.6 days from May 20 lands, with ‘Jun 21 to Jul 01’ (0.25) capturing the downside where the rate settles closer to 13–14/day. Retained a meaningful but smaller tail (5.5% After Aug 12) for the residual scenarios reviewers flagged: rendering-induced miscount of JS-rendered totals, backfill clustering that delays observation, methodology shifts, or genuine reversion toward the historical 11/day combined rate. Net in-window probability ≈ 94.5%, closer to the 91.2% ensemble anchor than the draft’s ~80%. Also corrected fact-checker’s flagged minor error: the incident fraction in the May 2026 sample was 68% (15 of 22), not 65%.
forecast_sdk | 677.7s | research $1.5956 + reviewers $0.0110 + reconcile $0.2281 = $1.8751
Extracted Date Percentile Values: {0.1: 1779926400.0, 1.0: 1780876800.0, 5.0: 1781740800.0, 10.0: 1782345600.0, 15.0: 1782777600.0, 20.0: 1783036800.0, 25.0: 1783382400.0, 30.0: 1783641600.0, 35.0: 1783900800.0, 40.0: 1784160000.0, 45.0: 1784419200.0, 46.0: 1784419200.0, 47.0: 1784505600.0, 48.0: 1784505600.0, 49.0: 1784592000.0, 50.0: 1784678400.0, 51.0: 1784678400.0, 52.0: 1784764800.0, 53.0: 1784764800.0, 54.0: 1784851200.0, 55.0: 1784851200.0, 60.0: 1785110400.0, 65.0: 1785369600.0, 70.0: 1785628800.0, 75.0: 1785888000.0, 80.0: 1786147200.0, 85.0: 1786406400.0, 90.0: 1786579200.0, 95.0: 1787356800.0, 99.0: 1789430400.0, 99.9: 1793491200.0}
Model’s Answer: Looking at this question, I need to determine when the OECD AI Incidents Monitor will first show ≥500 “AI incident” events with incident dates on or after May 20, 2026.
Current Status (May 20, 2026): The research confirms ZERO qualifying AI incidents with dates ≥ May 20, 2026 exist in the database currently. This makes sense since today is the start date.
Database Size & Growth: The database contains approximately 14,949 total incidents and hazards. Historical growth rates show 26-32% annual increases, with reports of “sharp recent rise” in content-generation incidents. Multiple sources indicate the incident recording rate has been accelerating.
Time Window: From May 20, 2026 to August 12, 2026 (noon UTC) is approximately 84 days.
From the research:
My estimate for daily “AI incident” additions: 5-12 per day, with best estimate around 7-9 per day
Assumption 1: Current detection rate continues/increases (could fail if methodology changes or media attention shifts) - Probability of major slowdown: ~15%
Assumption 2: Classification ratio of incidents vs hazards remains stable - Probability of major change: ~10%
Assumption 3: No bulk data additions or backfilling - Probability of significant one-time addition: ~10%
Earlier Resolution (before July 10): Requires ~15+ incidents/day or major incident cluster. Probability: ~20%
Middle Timing (July 10-August 1): Moderate rate of 8-12/day, normal progression. Probability: ~40%
Later Timing (August 1-12): Slower rate of 5-8/day. Probability: ~15%
After Deadline (>August 12): Rate below ~6/day, threshold not met. Probability: ~25%
At the boundary rate of ~6 incidents/day, we’d just barely reach 500 by day 84. Given substantial uncertainty about the actual rate, I’m placing roughly 75-80% probability on resolution by August 12, with 20-25% probability mass after the deadline.
The median scenario with ~8 incidents/day would cross 500 around July 20-22 (approximately 63 days from start).
Starting from zero today and needing 500 incidents within 84 days requires sustained daily additions. The research supports ongoing growth in AI incident recording, but the specific rate for post-May-20 incidents is uncertain. I’m centering my distribution in late July, with wide confidence intervals reflecting both the possibility of faster accumulation (major incident clusters, methodology expansion) and slower accumulation (stricter classification, media attention shifts).
Percentile .1: 2026-05-28 Percentile 1: 2026-06-08 Percentile 5: 2026-06-18 Percentile 10: 2026-06-25 Percentile 15: 2026-06-30 Percentile 20: 2026-07-03 Percentile 25: 2026-07-07 Percentile 30: 2026-07-10 Percentile 35: 2026-07-13 Percentile 40: 2026-07-16 Percentile 45: 2026-07-19 Percentile 46: 2026-07-19 Percentile 47: 2026-07-20 Percentile 48: 2026-07-20 Percentile 49: 2026-07-21 Percentile 50: 2026-07-22 Percentile 51: 2026-07-22 Percentile 52: 2026-07-23 Percentile 53: 2026-07-23 Percentile 54: 2026-07-24 Percentile 55: 2026-07-24 Percentile 60: 2026-07-27 Percentile 65: 2026-07-30 Percentile 70: 2026-08-02 Percentile 75: 2026-08-05 Percentile 80: 2026-08-08 Percentile 85: 2026-08-11 Percentile 90: 2026-08-13 Percentile 95: 2026-08-22 Percentile 99: 2026-09-15 Percentile 99.9: 2026-11-01
I forecast a July 4, 2026 median resolution and a 96% chance that OECD AIM first displays 500 qualifying AI incidents before the August 12 noon UTC deadline.
OECD AIM is a media-based monitor, not a mandatory incident reporting system. It uses Event Registry news clusters, then LLMs classify records as “AI incident,” “AI hazard,” or unrelated; OECD defines an AI incident as an event where AI development, use, or malfunction directly or indirectly results in actual harm (OECD AIM methodology).
The count for this question starts at AIM records dated May 20, 2026 or later and classified as “AI incident.” At the current check, the freshest visible AIM records were dated May 19, 2026, and the filtered API count for dates May 20 onward was zero (OECD AIM; AIM API endpoint).
The historical backbone points to a high enough run rate. OECD’s February 10, 2026 paper says media-reported AI incidents and hazards rose from 92 per month on average in 2022 to 324 per month on average in 2025, using AIM data through September 2025 (OECD paper). A current-vintage AIM API pull on May 20, 2026 gives a still higher 2026 rate. Units in the table are unique AIM event records, not articles; the date field is the dashboard date of first reporting; the source is the current AIM database, so later revisions can differ (AIM API endpoint).
| Coverage window | All AI incidents + hazards | AI incidents | AI incident share | AI incidents/day |
|---|---|---|---|---|
| 2022 full year | 1,107 | 658 | 59.4% | 1.8 |
| 2023 full year | 2,023 | 1,232 | 60.9% | 3.4 |
| 2024 full year | 2,710 | 1,684 | 62.1% | 4.6 |
| 2025 full year | 4,571 | 3,022 | 66.1% | 8.3 |
| 2026 Jan. 1–May 19 | 2,565 | 1,701 | 66.3% | 12.2 |
The recent windows are the most decision-relevant. Q1 2026 had 1,199 AI incidents in 90 days, or 13.3 per day; April 2026 had 287 in 30 days, or 9.6 per day; May 1–19 had 215 in 19 days, or 11.3 per day (AIM API endpoint). The latest 20 rendered records on the public page had 12 “AI incident” labels and 8 “AI hazard” labels, which matches the API’s roughly two-thirds incident share (OECD AIM).
The threshold needs 500 displayed qualifying records by August 12 at 12:00 UTC. From May 20 00:00 UTC to that deadline is 84.5 days. After allowing about three days for display lag, the required average rate is roughly (500/(84.5-3)=6.1) AI incidents per day. That is far below every 2026 monthly rate observed so far, including April’s low 9.6 per day (OECD AIM methodology; AIM API endpoint).
My model treats the future daily qualifying rate (\lambda) as uncertain. The main regime has median (\lambda=12.0) incidents/day, close to the 2026 year-to-date and May rates. A 7% lower-rate regime has median 8.0/day, representing reversion toward 2025. A 1% severe-slowdown regime has median 5.0/day. I add a 2% structural tail for methodology changes, prolonged unavailability, or a resolver-date interpretation that excludes some records. Conditional on (\lambda), the first-observed threshold time is (T=\mathrm{Gamma}(500,\lambda)+L), where (L) is a one-to-four-day display lag. This gives 35% probability by June 30, 50% by July 4, 77% by July 13, 94% by August 1, and 96% by the deadline.
The word “incident” sounds narrower than AIM’s live classifier. The page currently labels some broad social, legal, operational, and even harm-reduction records as “AI incident”; for example, the ChatCPR emergency-response study is labeled “AI incident” on the public page (OECD AIM). That broad label makes 500 records a throughput question, not a question about 500 severe AI accidents.
The binding risk is not ordinary event volume. It is a break in the monitoring pipeline or classification method. OECD says the pipeline runs daily on events from one to four days ago and adds articles to events for up to four days, so late incident dates near August 12 may not be visible by the noon deadline (OECD AIM methodology). The central path crosses in early July, so this lag matters mainly in the low-rate tail.
The public dashboard is beta, and the API endpoint is not a stable documented public export. The dashboard date field is labeled “date of first reporting,” while the question says “incident date”; I assume the resolver will use the same date field exposed by AIM, but a hidden occurrence-date field would add uncertainty (OECD AIM; AIM API endpoint).
The data are current-vintage counts. AIM records can be backfilled, grouped, or reclassified before the threshold is observed. The resolution rules freeze the count after a qualifying observation, but pre-threshold changes still matter. A sustained fall below about 6 AI incidents per day, a major deduplication change, or service unavailability around the deadline are the main ways this misses.
Median resolution date: August 12, 2026 (Unix timestamp 1785595367 ≈ 84 days from May 20)
Critical threshold: The window allows ~84 days to accumulate 500 qualifying incidents, requiring ~6 incidents/day on average. This is nearly exactly at the boundary of plausibility.
Rate uncertainty dominates: Historical data from 2025 suggests ~10.8 incidents+hazards per day (~5.9 qualifying incidents after applying a ~55% incident-to-hazard ratio). However, methodology documents cite ~30/day, creating a 3× discrepancy. The lower figure is corroborated by independent monthly averages and is weighted more heavily (~65%), but the upper figure (~35% weight) would resolve the question by late June.
Bimodal distribution: The forecast splits into two scenarios. Under the lower rate, the threshold is reached just barely within or just beyond the August 12 deadline (high sensitivity to small rate changes). Under the higher rate, resolution occurs in late June. The mixture produces a median near the deadline with substantial probability (~40–45%) of missing the window entirely.
Weak incident-ratio anchor: The ~55% incident classification is based on a single 20-entry sample. Variation from 50–60% meaningfully shifts the qualifying rate and whether the threshold is crossed in time.
Pipeline lag and methodology changes: A 1–4 day lag delays the start slightly (earliest possible: ~May 24). Planned expansions (open submissions, court ruling integration) could cause step-change acceleration, but timing is uncertain.
Forecast (percentiles, aggregated):
Per-model forecasts:
Rationale: Summary The central estimate for when the OECD AI Incidents Monitor will log 500 “AI incidents” occurring on or after May 20, 2026, centers around late July. Historical data suggests the platform detected an average of 324 combined AI incidents and hazards per month in 2025 (roughly 11 per day), with more recent 2026 volumes likely closer to 16 combined events per day due to steady growth in media reporting. Analysis of the monitor’s database indicates that approximately 63% of these events are specifically classified as “AI incidents” rather than hazards. This yields an estimated accumulation rate of about 10 qualifying incidents per day. At this pace, the 500th incident would theoretically occur around July 9, 2026 (50 days after May 20). However, there is a delay between when an incident occurs and when it appears in the monitor. Adding a realistic combined delay of 10 to 15 days—accounting for both the time it takes for media to report an incident and the system’s 1-4 day processing lag—places the most likely observation date around July 22.
Strongest Arguments for Sooner
Strongest Arguments for Later
Key Uncertainties
On what date will the OECD AI Incidents Monitor first show at least 500 events classified as “AI incident” with incident dates of May 20, 2026 or later?
Key figures
Historical context
The OECD AI Incidents Monitor (AIM) was established to support the 2019 OECD AI Principles, which prioritize human-centric and trustworthy AI. It operates as part of a broader trend toward algorithmic accountability, alongside databases like the AI Incident Database (AIID). Historically, the volume of documented AI incidents has grown as AI adoption has moved from research labs to critical enterprise infrastructure. For instance, the AI Incident Database saw reports rise from 233 in 2024 to 362 in 2025. The OECD monitor, however, uses a much wider net, processing over 150,000 news articles daily via the Event Registry. This automated approach leads to much higher volumes than human-curated databases, with the monitor already reaching nearly 15,000 combined entries by May 2026. This high base rate of detection makes the accumulation of 500 new incidents a matter of “when” rather than “if,” mirroring the rapid expansion of AI agents and synthetic media in early 2026.
Tailwinds
Headwinds
Detailed reasoning
My analysis focuses on three primary variables: the daily ingestion rate of events, the ratio of “AI incidents” to total events, and the technical pipeline lag of the OECD AIM.
Daily Throughput: The OECD provides two conflicting figures for its daily detection of combined incidents and hazards: an average of 10 and an average of 30. External reports from January 2026 suggest a middle ground, with approximately 435 total reports per month, equating to roughly 14.5 combined events per day. This consistency suggests the monitor is operating at a stable, mature capacity.
Incident-to-Hazard Ratio: Not every event counts toward the 500-incident threshold. The monitor distinguishes between “AI incident” (actual harm) and “AI hazard” (potential harm). Based on current industry reporting trends and the OECD’s broad definition of harm (including rights violations and property damage), I estimate that approximately 40% to 60% of daily detections qualify as “AI incidents.” At a total rate of ~14.5 events per day, this yields a qualifying accumulation rate of 6 to 8 incidents per day.
Temporal Calculation: Starting from May 20, 2026, the monitor must accumulate 500 qualifying events. At a rate of 7 incidents per day, it would take approximately 71 days to reach the threshold. Adding a consistent pipeline lag of 1 to 4 days for the automated system to process and display these news-based events, the expected date falls in late July 2026.
Scenario Variances:
My median prediction of 31st July 2026 balances these rates, accounting for the inherent delay in news-based automated monitoring systems.
Key uncertainties
Conclusion