Bot similarity — Mantic series-1

← all questions · Pairwise similarity = 1 − mean total-variation distance across questions both bots forecasted · Generated 2026-04-26 01:57:27Z

Pairwise similarity heatmap

Per-bot consensus rank

Bot Mean similarity to others Mean distance Pairs Questions forecasted
smingers-bot 0.767 0.234 10 48
SynapseSeer 0.766 0.234 10 52
hayek-bot 0.764 0.237 10 24
tom_futuresearch_bot 0.762 0.238 10 40
laertes 0.758 0.243 10 53
cassi 0.757 0.243 10 35
AtlasForecasting-bot 0.744 0.256 10 30
Panshul42 0.741 0.259 10 35
Mantic 0.741 0.259 10 54
pgodzinbot 0.736 0.264 10 57
lewinke-thinking-bot 0.646 0.354 10 56

Higher similarity = the bot's forecasts tend to agree with the rest of the cohort. Lower = more contrarian.

Top single-question disagreements

Pairs of bots whose forecasts diverged most on a single question. Total-variation distance, 0 = identical, 1 = maximally apart.

QuestionTypeABDistance
#112 — What will be the reported cost for the highest-scoring submission with a reported cost on the ARC-AGI-3 public leaderboard on August 12, 2026? numeric Mantic lewinke-thinking-bot 0.932
#31 — Will any of the following airlines file for bankruptcy before August 12, 2026? multiple_choice AtlasForecasting-bot Panshul42 0.925
#67 — How many of 20 specific Python packages will publish Python 3.15-compatible wheels by August 4, 2026? discrete Mantic lewinke-thinking-bot 0.917
#31 — Will any of the following airlines file for bankruptcy before August 12, 2026? multiple_choice AtlasForecasting-bot SynapseSeer 0.902
#112 — What will be the reported cost for the highest-scoring submission with a reported cost on the ARC-AGI-3 public leaderboard on August 12, 2026? numeric cassi lewinke-thinking-bot 0.888
#112 — What will be the reported cost for the highest-scoring submission with a reported cost on the ARC-AGI-3 public leaderboard on August 12, 2026? numeric SynapseSeer lewinke-thinking-bot 0.886
#112 — What will be the reported cost for the highest-scoring submission with a reported cost on the ARC-AGI-3 public leaderboard on August 12, 2026? numeric lewinke-thinking-bot tom_futuresearch_bot 0.879
#67 — How many of 20 specific Python packages will publish Python 3.15-compatible wheels by August 4, 2026? discrete Mantic lewinke-thinking-bot 0.877
#31 — Will any of the following airlines file for bankruptcy before August 12, 2026? multiple_choice Panshul42 lewinke-thinking-bot 0.863
#96 — How many incidents of hate will be recorded by the ADL's HEAT map. discrete lewinke-thinking-bot tom_futuresearch_bot 0.856
#31 — Will any of the following airlines file for bankruptcy before August 12, 2026? multiple_choice SynapseSeer lewinke-thinking-bot 0.847
#82 — What will euro area GDP growth be, q/q, in Eurostat’s flash estimate for 2026 Q2? discrete Mantic lewinke-thinking-bot 0.846
#102 — How many signatories will PRI list on August 12, 2026? discrete AtlasForecasting-bot lewinke-thinking-bot 0.844
#127 — How many countries will newly restrict Polymarket by August 1, 2026? discrete lewinke-thinking-bot smingers-bot 0.839
#67 — How many of 20 specific Python packages will publish Python 3.15-compatible wheels by August 4, 2026? discrete lewinke-thinking-bot pgodzinbot 0.835

Most-outlier bot per question

For each question, the bot with the highest mean distance to all other co-forecasters. A persistent contrarian shows up here often.

QuestionTypeOutlier botMean distance# peers
#112 — What will be the reported cost for the highest-scoring submission with a reported cost on the ARC-AGI-3 public leaderboard on August 12, 2026? numeric lewinke-thinking-bot 0.817 10
#102 — How many signatories will PRI list on August 12, 2026? discrete lewinke-thinking-bot 0.755 9
#82 — What will euro area GDP growth be, q/q, in Eurostat’s flash estimate for 2026 Q2? discrete lewinke-thinking-bot 0.749 10
#127 — How many countries will newly restrict Polymarket by August 1, 2026? discrete lewinke-thinking-bot 0.746 8
#67 — How many of 20 specific Python packages will publish Python 3.15-compatible wheels by August 4, 2026? discrete lewinke-thinking-bot 0.738 5
#71 — What global Q2 2026 sales will Eli Lilly report for Foundayo (orforglipron)? discrete lewinke-thinking-bot 0.708 7
#77 — What hazard ratio will Novartis first report for the primary endpoint in the Lp(a) HORIZON clinical trial? discrete lewinke-thinking-bot 0.700 7
#96 — How many incidents of hate will be recorded by the ADL's HEAT map. discrete tom_futuresearch_bot 0.689 7
#32 — How many United States ships will Iran successfully attack in the Strait of Hormuz before August 12, 2026? discrete pgodzinbot 0.663 8
#31 — Will any of the following airlines file for bankruptcy before August 12, 2026? multiple_choice AtlasForecasting-bot 0.661 7
#113 — What will be the lower bound of the FOMC’s fed funds target range following its June 16–17, 2026 meeting? discrete pgodzinbot 0.656 10
#24 — When will the U.S. government announce they will receive monetary assistance from Elon Musk? date lewinke-thinking-bot 0.623 4
#118 — When will U.S. initial unemployment claims reach their highest weekly level before August 12, 2026? date Mantic 0.612 6
#103 — How many times will "Wong Kim Ark" be referenced in the Supreme Court opinion document in Trump v. Barbara? discrete Panshul42 0.532 9
#81 — On what date will the Sudanese Armed Forces (SAF) and the Rapid Support Forces (RSF) next publicly agree to a pause or end to their conflict? date lewinke-thinking-bot 0.509 7
#123 — When will the United States gain physical access to Iran's "nuclear dust" stockpile? date lewinke-thinking-bot 0.506 8
#117 — What will be the YTD vs. 2025 change for Petroleum and Petroleum Products carloads in the AAR weekly report for the week ending June 27, 2026? numeric AtlasForecasting-bot 0.494 9
#116 — What will be the average annualized funding rate for Anthropic on Ventuals in June 2026? numeric laertes 0.484 8
#101 — How many PLA aircraft sorties will Taiwan’s MND report as crossing the Taiwan Strait median line on the busiest 24-hour reporting period from April 25 to August 10, 2026? discrete hayek-bot 0.484 9
#75 — What will be the total number of transit calls PortWatch records through the Strait of Hormuz for the period April 27–May 3, 2026? discrete lewinke-thinking-bot 0.476 7