Scenario wins: Panshul42 (2) Mantic (1) smingers-bot (1)
Question: Will SpaceX complete multiple successful full-stack Starship orbital test flights between Jun 1 and Aug 11, 2026? The window has already opened and Flight 12 (May 22) does not count, so only Flight 13 or later could qualify; the key gating factor is the FAA’s mishap investigation after the Flight 12 booster flyback failure (FAA found a mishap on May 27), not the annual launch cap. Historical post-mishap turnarounds (roughly 41–91 days in analogous 2025 cases) mean the ~76–81 days from the FAA statement to Aug. 11 is long enough for a fast recovery but not comfortably so, making a single post‑Flight‑12 launch plausible but far from certain. Even if Flight 13 flies, the resolution requires SpaceX to publicly characterize the whole mission as successful (component-level successes won’t suffice), and V3/Raptor‑3 technical risks plus mixed Flight 12 outcomes lower conditional success odds. Two successes in-window would need an early Flight 13 plus an unusually fast turnaround near SpaceX’s best historical cadence; three or more would demand a cadence step‑change and therefore carry only trace probability. Forecasters largely agree on the FAA mishap as the main bottleneck and on weak scheduling signals (tracker dates, FCC STA), and they explicitly downweight a separate Poisson-cadence model that failed to condition on the active mishap gate; recommended workflow is to condition first on FAA clearance, then on a firm SpaceX date and pad/vehicle milestones, then on SpaceX’s postflight mission-level wording.
Forecast: Probability distribution:
We are forecasting the number of Starship full-stack (Super Heavy + Ship) launches in 2026-06-01 to 2026-08-11 (UTC) that SpaceX itself characterizes as an overall success / meeting primary test objectives.
Key drivers in this 72-day window:
Synthesis of analyst views:
An amateur forecast tends to anchor on the public “targeted for June” schedule and extrapolate a high cadence (2–4 successes) without pricing in (a) FAA return-to-flight uncertainty after a mishap determination and (b) the question’s strict requirement for an overall SpaceX-declared mission success. My forecast improves by explicitly centering the distribution on 0–1 successes while still retaining a calibrated right tail for 2–4+ in case regulatory clearance is fast and cadence ramps. Confidence in improvement: moderate (schedule/regulatory uncertainty dominates).
Forecast rationale (numeric):
— Iteration 1 — Across the forecasts, the dominant driver is the FAA mishap investigation following Starship Flight 12’s May 22, 2026 anomaly. As of early June, there is still no final report or return-to-flight clearance, and historical mishap investigations often take roughly 1–4 months. That leaves only about 10 weeks in the June 1–August 11 window, making the timeline tight for even a single successful integrated flight test.
The models broadly agree that the most likely outcomes are 0 or 1 successful flight, with 0 often the modal or near-modal result and 1 close behind. The main source of disagreement is how quickly regulatory approval could arrive and how fast SpaceX could resume cadence once cleared. Several rationales note that SpaceX’s improving launch cadence and stated ambition could allow one success if approval comes soon, but 2 or more would require unusually fast FAA turnaround plus rapid, flawless execution.
A further recurring point is that “success” is a high bar: it is not just a launch attempt, but a flight SpaceX would count as meeting its primary objectives. That definition reduces the chance of counting marginal or partially successful missions. Overall, the collective reasoning places most probability mass on 0–1 successes, with 2+ treated as a low-probability tail and 3+ as very unlikely.
— Iteration 2 — The collective reasoning points to a low-to-moderate number of successful Starship integrated flight tests, with 0 or 1 as the most likely outcomes and 2+ viewed as a long tail.
The shared view is that the FAA investigation timeline is the key bottleneck, making no successful flights the most conservative expectation, while one success remains plausible if SpaceX regains launch approval early enough. Multiple successes would require an unusually fast regulatory resolution and an atypically rapid launch cadence.
— Iteration 3 — Across the forecasts, the core expectation is that SpaceX will likely complete 0 or 1 successful Starship integrated flight test in the June 1–August 11, 2026 window, with 2 successes possible but clearly less likely, and 3+ treated as a low-probability tail.
FAA mishap investigation is the key constraint
A “successful” flight requires a full mission-level success
Cadence is likely too slow for multiple successes
The combined view is that the short time window, coupled with the FAA investigation and uncertain launch readiness, strongly limits upside. The dominant forecast logic is that regulatory delay and mission-risk constrain the number of counted successes, making zero or one successful Starship integrated flight test the most likely outcome.
Question: How many Starship integrated flight tests will SpaceX successfully complete between June 1 and August 11, 2026? Final Prediction: Probability distribution:
Forecaster 1: Probability distribution:
Forecaster 2: Probability distribution:
Forecaster 3: Probability distribution:
The research summarizes that SpaceX conducted Flight 12 (IFT-12) on May 22, 2026—the first V3 configuration flight—where the Starship vehicle performed its mission objectives but Super Heavy Booster 19 crashed after multiple Raptor 3 engine failures. The FAA has classified the booster crash as a “mishap” and opened a federal investigation that has grounded Starship flights pending root-cause analysis and corrective actions. Flight 13 hardware (Booster 20 and Ship 40/20) is reported to be assembled and undergoing testing, and Flight 13 had been targeted for June 2026, but the investigation timing is unresolved. The research notes that software/sequencing fixes would likely take weeks while hardware redesigns could take months, and that the question window runs June 1–August 11, 2026 (about 66 days remaining as of June 5, 2026).
The research also compiles historical base rates and prediction-market signals: a constructed timeline of prior integrated flight tests shows average intervals of roughly 3.7 months between flights (with a shortest interval of ~1.2 months and several multi-month gaps after mishaps). Across the prior 12 integrated tests it counts ~42% full successes, ~25% partial successes, and ~33% failures. Kalshi market prices cited in the research give a 17% probability of IFT-13 launching before July 17, 51% before August 1, and 77% before August 17, reflecting market belief that a launch in June is unlikely and that the outcome depends on the FAA investigation and any required corrective work. The research highlights that multiple successful integrated flights within the June 1–August 11 window would require unusually rapid investigation clearance and very fast turnarounds relative to the program’s historical cadence.
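As a rough illustration of how the quoted Kalshi prices bear on the August 11 cutoff, the sketch below linearly interpolates between the two nearest quoted dates. The linear interpolation and the helper name `interpolate_launch_prob` are assumptions made here for illustration and are not part of the research. The quoted prices concern a launch only, not a SpaceX-declared mission success.

```python
from datetime import date

# Kalshi prices quoted in the research: P(IFT-13 launches before the given date).
kalshi_quotes = [
    (date(2026, 7, 17), 0.17),
    (date(2026, 8, 1), 0.51),
    (date(2026, 8, 17), 0.77),
]


def interpolate_launch_prob(quotes, target):
    """Linearly interpolate the cumulative launch probability at `target`.

    Assumption (illustrative only): probability accrues evenly between
    adjacent quoted dates.
    """
    for (d0, p0), (d1, p1) in zip(quotes, quotes[1:]):
        if d0 <= target <= d1:
            frac = (target - d0).days / (d1 - d0).days
            return p0 + frac * (p1 - p0)
    raise ValueError("target date lies outside the quoted range")


p_launch_by_cutoff = interpolate_launch_prob(kalshi_quotes, date(2026, 8, 11))
print(f"Interpolated P(launch before Aug 11): {p_launch_by_cutoff:.0%}")
```

Any such figure is an upper bound on the probability of at least one counted success, since a launch must also be characterized by SpaceX as an overall success.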
Sources used (as cited in the research):
(Other reporting and industry coverage referenced in the research include contemporary news outlets and articles used to compile dates, hardware details, and investigation reporting.)
Modal: 0 (59.0%) • partial_shift • 9m17s
Reading: strict
The resolution criteria impose precise, narrow conditions: the flight must use the exact full-stack hardware configuration, launch strictly within a defined UTC window, and SpaceX must characterize the overall flight (not just a component) as successful via only two named official channels. The explicit exclusion of component-only success declarations and the restriction to specific official channels reflect a deliberately strict, literal reading.
Total evidence registered (shared pool): ?
| Variant | Perspective | Model | Turns | Tools | Status |
|---|---|---|---|---|---|
| 0 | inside_view (inside_view_v1) | openai/gpt-5-mini | 25 | 25 | OK |
| 1 | outside_view (outside_view_v1) | anthropic/claude-sonnet-4-6 | 25 | 37 | OK |
| 2 | contrarian (contrarian_v1) | openai/gpt-5 | 10 | 10 | OK |
Evidence confidence: medium
[high evidence] FAA mishap investigation takes many weeks or months (no stated deadline); Flight 13 slips past August 11 entirely; independent analyst median for Flight 13 is August 8, 2026 (near or beyond the window); Polymarket implies market expects few 2026 flights total; FAA has previously taken months to clear Starship mishap investigations.
SpaceX has FCC authorization already in place for Flight 13 (source 21); FAA can resolve investigations faster than historical norms; SpaceX IPO incentives create pressure to demonstrate progress; Flight 12’s mishap (booster RUD) may be categorized as less severe, allowing faster resolution.
[medium evidence] FAA investigation resolves in June or early July; Flight 13 launches in July or early August; SpaceX characterizes the overall flight as a success (consistent with recent pattern of 4 consecutive successes); ~58% historical success rate and improving trend; FCC authorization already in place.
FAA mishap investigation has no specified timeline; booster RUD in Flight 12 may complicate clearance; only ~37–82 days would remain after investigation clears; independent analyst median suggests August 8 launch, leaving minimal margin for even one flight before August 11.
[low evidence] FAA investigation resolves quickly (e.g., within weeks); Flight 13 launches in late June or July; Block 3 V3 vehicles enable fast turnaround; Flight 14 launches within 37–58 days of Flight 13; both flights meet SpaceX’s success characterization standard.
FAA mishap investigation uncertainty; no historical precedent for two Starship flights within ~40 days; Block 3 is new hardware still being refined (source 14); independent analyst’s August 8 median for Flight 13 leaves essentially no time for a second flight; Polymarket market implies ≤3 total remaining 2026 flights.
(inside_view_v1) Inside view causal chain:
Current state: FAA declared Flight 12 a “mishap” on May 27, 2026 (source 20), triggering a mandatory investigation. This is a hard regulatory gate — SpaceX cannot launch until the FAA approves corrective actions. No timeline was specified.
Causal mechanism → outcome mapping:
FAA investigation duration is the dominant unknown. Historical Starship mishap investigations (IFT-1 through IFT-5 era) have taken months. However, the Flight 12 mishap was a booster RUD — not a catastrophic vehicle-wide failure — which may qualify for a faster review. The FAA investigation started ~May 27; if it resolves in 4–8 weeks, Flight 13 could potentially launch in late July or early August.
Flight 13 launch timing: The independent analyst median is August 8, 2026 (source 15) — just 3 days before the window closes. This is a concrete signal that even optimistic scenarios barely get Flight 13 into the window. If the median is August 8, there’s roughly 50% probability it launches before August 11, conditional on launching at all before then.
Success conditional on launch: The last 4 flights (9–12) were all characterized as successes (source 10), and SpaceX called Flight 12 a success despite the booster RUD (source 9). P(success | launch within window) ≈ 75–85% given recent trend.
Second flight within window: Even in optimistic scenarios, a second flight would require ~37 days minimum after Flight 13 (minimum historical interval). If Flight 13 is August 8, there is zero time. If Flight 13 is July 20 (optimistic), Flight 14 could be late August — still outside. So P(≥2 successful flights) is very low.
Scenario weighting by evidence strength:
Zero flights (high evidence): FAA investigation with no timeline, analyst median August 8 for F13 launch, Polymarket expecting few 2026 flights. I estimate P(F13 launches before Aug 11) ≈ 45%, implying P(0 launches in window) ≈ 55%.
Exactly 1 successful flight (medium evidence): P(F13 launches before Aug 11) × P(success) ≈ 0.45 × 0.80 ≈ 36%. This is the dominant “something happens” scenario.
2 or more flights (low evidence): Requires F13 to launch significantly before August 11 (not just a few days before), plus rapid F14 turnaround. P(≥2) ≈ 6–8%.
Final distribution:
- 0 successes (bin_0): ~55%
- 1 success (bin_1): ~35%
- 2 successes (bin_2): ~7%
- ≥3 successes (bin_3): ~3%
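The arithmetic behind the inside-view chain can be sketched as follows. The inputs are the figures stated in the rationale (0.80 is the value the rationale itself multiplies by, inside its 75–85% range); the variable names are illustrative. The sketch also makes the launch-without-success branch explicit. The rationale does not say how that branch is allocated, although under the resolution wording such a flight would count as zero successes.

```python
# Inputs stated in the inside-view rationale.
p_launch = 0.45                # P(Flight 13 launches before Aug 11)
p_success_given_launch = 0.80  # value used in the rationale (range 75-85%)

# Branch probabilities for the first qualifying flight.
p_no_launch = 1 - p_launch
p_launch_and_success = p_launch * p_success_given_launch
p_launch_without_success = p_launch * (1 - p_success_given_launch)

print(f"P(no launch in window)         = {p_no_launch:.2f}")
print(f"P(launch and declared success) = {p_launch_and_success:.2f}")
print(f"P(launch, not a success)       = {p_launch_without_success:.2f}")

# Final distribution as reported by the inside view, for comparison.
final_bins = {"bin_0": 0.55, "bin_1": 0.35, "bin_2": 0.07, "bin_3": 0.03}
print(f"Reported bins sum to {sum(final_bins.values()):.2f}")
```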
The FAA mishap investigation is the decisive causal bottleneck. The high-evidence “zero flights” scenario dominates but doesn’t overwhelm — there’s genuine probability mass on exactly one flight if the investigation clears faster than the median analyst expects.
(outside_view_v1) Outside-view anchor
Reference class: Starship integrated flight-test arrival counts in ~2.2–2.5 month windows, based on recent observed cadence and arrival-forecast models.
Base rate (from the brief’s arrival projections): averaging the two quantitative arrival projections in the brief (conservative P(0)=42% and moderate P(0)=31%) yields a base P(0) ≈ 36–37% (sources 12–13).
Adjustment: raise probability of 0 flights because the FAA formally declared Flight 12 a “mishap” on 2026-05-27 and required an FAA-overseen investigation before any further launches (source 20). That regulatory constraint is concrete, has no stated timeline, and the brief explicitly tags the scenario “zero flights” as high evidence strength; I therefore increase P(0) materially relative to the base rate.
Final (outside-view) split: P(0)=55%, P(1)=30%, P(2)=12%, P(3)=3%.
Justification for shape and numbers:
Weight toward 0 (55%): brief labels the scenario “Zero successful flights” as high evidence strength and highlights the FAA mishap investigation as the single largest gating unknown (source 20; scenarios block in Research Brief). The independent analyst median launch-date for Flight 13 is August 8, 2026 (source 15), which sits at the very end of the window and therefore increases the chance that no launch will occur inside the window. The arrival-forecast priors (sources 12–13) produced substantial P(0) already (31–42%), so the outside-view adjustment is an upwards shift consistent with the brief’s high-evidence regulatory update.
P(1)=30% (medium): the brief tags the “exactly one successful flight” scenario as medium evidence. This reflects that SpaceX already has an FCC STA for Flight 13 (source 21) and recent flights (9–12) were characterized as successes (source 10), so if the FAA investigation concludes quickly (June / early July) a single launch inside the window is plausible. I keep this probability materially below P(0) to reflect the regulatory hurdle emphasized in the brief.
P(2)=12% and P(3)=3% (low): the brief marks the “two or more successful flights” scenario as low evidence: rapid cadence sufficient for ≥2 flights in a ~72-day window would require a quick FAA clearance plus unusually fast turnarounds (the brief notes the shortest historical inter-flight spacing is ~37 days but that block transitions and investigations have produced much longer gaps; sources 6 and 14). I therefore allocate modest remaining mass to 2 (plausible but evidence-thin) and a small tail to 3 (very unlikely in the brief’s constraint set).
How the brief’s scenario-evidence tags were used: per the developer instruction, I treated the scenarios as peers but weighted them by the brief’s relative_evidence_strength tag (high→majority mass; medium→meaningful mass; low→small tail). I anchored on the brief’s arrival-forecast base rates (sources 12–13) and adjusted toward zero because of the FAA mishap declaration (source 20), keeping the adjustment cautious but material.
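The base-rate step and the size of the regulatory adjustment can be checked with a short sketch. The numbers come from the outside-view rationale; the variable names are illustrative assumptions.

```python
# Arrival-forecast priors for P(0 flights) cited from the brief (sources 12-13).
base_p0_conservative = 0.42
base_p0_moderate = 0.31

# Simple average of the two projections, as described in the rationale.
base_p0 = (base_p0_conservative + base_p0_moderate) / 2

# Final outside-view split as reported.
outside_view = {0: 0.55, 1: 0.30, 2: 0.12, 3: 0.03}

# Upward shift in P(0) attributed to the FAA mishap declaration.
shift_from_average = outside_view[0] - base_p0
shift_range = (
    outside_view[0] - base_p0_conservative,
    outside_view[0] - base_p0_moderate,
)

print(f"Base P(0) = {base_p0:.3f}")
print(f"Shift vs. average = {shift_from_average:+.3f}")
print(f"Shift vs. each prior = {shift_range[0]:+.2f} to {shift_range[1]:+.2f}")
```

The averaged base rate is 36.5%, matching the stated 36–37%, and the shift relative to the individual priors spans 13 to 24 percentage points.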
Other notes: reported granularity is integer (counts); monotone floor = 0 (as of 2026-06-05 no launches in the window yet; brief notes this as the floor). Confidence in evidence: medium (per the Research Brief). Key uncertainties that could move this forecast: how long the FAA mishap investigation takes to clear launches, and whether SpaceX characterizes any flight that does occur as an overall mission success (the resolution criteria require SpaceX to publicly call the flight successful on spacex.com or @SpaceX on X).
(contrarian_v2) Adversarial reading
The implicit consensus in the brief is that the active FAA mishap investigation (declared May 27 with no timeline) plus the independent analyst median of August 8 for Flight 13 will keep the count at zero or at most one inside the June 1–August 11 window. The high-evidence scenario explicitly tags “Zero successful flights” and is backed by multiple concrete sources (FAA declaration, Aug 8 median, Polymarket 2026 cadence expectations).
Counter-evidence found (or absence)
Searched for counter-evidence; nothing substantive found; consensus appears robust. The medium-evidence “one flight” scenario still accepts the FAA gate as the binding constraint, and the low-evidence “two or more” scenario rests on optimistic assumptions about investigation speed and Block 3 turnaround that are not supported by any dated evidence in the brief.
Distribution implications
bin_0 therefore receives materially more mass than a naive centroid of the three scenarios would imply, because the single high-evidence scenario dominates. bin_2 and bin_3 are down-weighted accordingly; only a thin residual probability is left for the low-evidence rapid-cadence case.
Strategy: logit_mean • Drafts: 3/3
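A minimal sketch of what a logit-mean pooling strategy could look like is given below. It is an assumption about the general technique (average each bin's log-odds across drafts, map back to probabilities, renormalize); the pipeline's actual `logit_mean` implementation is not shown in this log. Only the inside-view and outside-view drafts are used, because the contrarian draft's bin values are not reported.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))


def sigmoid(x):
    """Inverse of logit."""
    return 1 / (1 + math.exp(-x))


def logit_mean_pool(drafts):
    """Average per-bin log-odds across drafts, then renormalize to sum to 1."""
    n_bins = len(drafts[0])
    pooled = []
    for i in range(n_bins):
        mean_logit = sum(logit(d[i]) for d in drafts) / len(drafts)
        pooled.append(sigmoid(mean_logit))
    total = sum(pooled)
    return [p / total for p in pooled]


# Draft distributions over (bin_0, bin_1, bin_2, bin_3) reported in this log.
drafts = [
    [0.55, 0.35, 0.07, 0.03],  # inside view
    [0.55, 0.30, 0.12, 0.03],  # outside view
]

pooled = logit_mean_pool(drafts)
for name, p in zip(("bin_0", "bin_1", "bin_2", "bin_3"), pooled):
    print(f"{name}: {p:.3f}")
```

Averaging in log-odds space rather than probability space gives relatively more weight to drafts that place a bin near 0 or 1, which is one common reason to prefer it for tail bins.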
anchoring_v1: The draft rationale correctly identifies the FAA mishap investigation as the dominant bottleneck, but there is a subtle optimistic anchoring issue in the bin_1 mass. The brief states the independent analyst’s median for Flight 13 is August 8, meaning roughly half the probability distribution for that launch date falls after August 8, and many of those cases fall outside the window (after August 11). The draft’s P(F13 launches before Aug 11) ≈ 45% is derived by treating the August 8 median as if approximately half that distribution fits within the 3-day window remaining (Aug 8–11), but actually a heavy right tail extends well past August 11. Moreover, the draft assigns P(success | launch) ≈ 75–85%, which is the recent-flights success rate, but this does not account for the additional uncertainty that the FAA investigation may impose stricter objectives or that a new Block 3 V3 vehicle could have teething issues, mildly depressing the conditional. The net effect is that exactly 1 success (bin_1) at ~32–36% is anchored optimistically toward the early end of the Flight 13 launch distribution, rather than reflecting the modal expectation that most of the launch probability mass sits at or after August 8, leaving very little room inside the window. A flat-prior reading of “median August 8 with a wide distribution” would push P(launch before Aug 11) closer to 30–35%, not 45%, reducing bin_1 and boosting bin_0.
ceiling_v1: The key structural ceiling here is the sequential dependency constraint: for 2 successes (bin_2) or ≥3 successes (bin_3) to resolve, Flight 13 must first launch AND be characterized as a success, AND then Flight 14 must also launch and succeed, all before August 11. The brief establishes that the independent analyst’s median for Flight 13’s launch is August 8, 2026 (just 3 days before window close), and the historically shortest inter-flight interval is 37 days (F5→F6). This creates an arithmetic ceiling: even if Flight 13 launches on July 1 (an optimistic scenario vs. the August 8 median), Flight 14 at the absolute minimum could arrive around August 7, barely fitting one additional flight. For ≥3 successes (bin_3), a third flight would require Flight 15 to launch no later than ~14 days after Flight 14, which has zero historical precedent; bin_3 should be near-zero. The draft’s 10.1% (bin_2) and 3.0% (bin_3) together total ~13%, which overstates the probability given this hard sequential constraint compressed by the August 8 median launch date for the first qualifying flight.
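The sequential ceiling described by the reviewer can be sketched with simple date arithmetic. The sketch assumes every turnaround matches the 37-day record gap cited in the brief and treats the window by calendar date (ignoring UTC time of day); the function name `max_flights_in_window` is illustrative.

```python
from datetime import date, timedelta

WINDOW_END = date(2026, 8, 11)
MIN_GAP = timedelta(days=37)  # shortest historical inter-flight interval per the brief


def max_flights_in_window(first_launch, window_end=WINDOW_END, gap=MIN_GAP):
    """Count launches that fit if every turnaround equals the record gap."""
    count = 0
    launch = first_launch
    while launch <= window_end:
        count += 1
        launch += gap
    return count


# Candidate Flight 13 dates: window open, an optimistic date, a later date,
# and the analyst median cited in the brief.
for first in (date(2026, 6, 1), date(2026, 7, 1), date(2026, 7, 20), date(2026, 8, 8)):
    print(first.isoformat(), "->", max_flights_in_window(first), "flight(s) fit")
```

Under these assumptions, even a Flight 13 on the first day of the window leaves room for at most two flights, which is consistent with the reviewer's recommendation to hold bin_3 near zero.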
math_v1: The rationale derives a logical causal chain and maps it to a distribution that is internally coherent, with no mathematical errors. The three-scenario framework from the brief (high/medium/low evidence strength) is explicitly weighted into the final bins, and the probabilities sum to 1.0 exactly. However, the mapping from scenario strengths to bin probabilities warrants scrutiny: the rationale asserts P(0)=55% reflects “high-evidence scenario dominates,” yet the moderate arrival-forecast priors (sources 12–13) already allocated P(0)=31–42%. The adjustment of +13–24 percentage points is conservative relative to the magnitude of the FAA regulatory shock (a declared mishap with no timeline), and the rationale does not explicitly reconcile how much of the P(0) mass comes from the base arrival rate versus the regulatory update.
Reviews consumed: 3
Two reviewers suggested shift_lower (anchoring and ceiling), while the math reviewer endorsed no_change with only minor transparency concerns. The anchoring reviewer correctly flagged that the August 8 analyst median for Flight 13 implies a heavy right tail extending past Aug 11, suggesting P(launch in window) is closer to 30–35% rather than 45% — I partially accept this and trim bin_1 slightly. The ceiling reviewer’s structural argument on bin_3 is compelling: three full-stack successes in 72 days has zero historical precedent given a 37-day minimum interval and an August 8 median for the first qualifying flight; I cut bin_3 from 3.0% to ~1% per this hard sequential constraint. I also modestly reduce bin_2 since the same path-dependency argument compresses the two-flight scenario (analyst median Aug 8 leaves essentially no room for F14). The math reviewer’s transparency concerns are noted but did not change the distribution since the draft’s arithmetic is sound. Net effect: a partial shift toward bin_0 with meaningful trimming of the upper tail.
| Bin | Probability |
|---|---|
| 0 (bin_0) | 59.0% |
| 1 (bin_1) | 31.0% |
| 2 (bin_2) | 9.0% |
| 3 or more (bin_3) | 1.0% |
wooley_rhino • run 94f20032 • $2.76 • 9m17s
Extracted Percentile Values: {0.1: 1.01, 1.0: 1.01, 5.0: 1.01, 10.0: 1.01, 15.0: 1.01, 20.0: 1.01, 25.0: 1.01, 30.0: 1.01, 35.0: 1.01, 40.0: 1.01, 45.0: 1.01, 46.0: 1.01, 47.0: 1.01, 48.0: 1.01, 49.0: 1.01, 50.0: 1.01, 51.0: 1.01, 52.0: 1.01, 53.0: 1.01, 54.0: 1.01, 55.0: 1.01, 60.0: 1.01, 65.0: 1.01, 70.0: 1.01, 75.0: 1.01, 80.0: 2.01, 85.0: 2.01, 90.0: 2.01, 95.0: 2.01, 99.0: 3.01, 99.9: 4.01}
Model’s Answer: Step one is to anchor on the most time-relevant facts, not on SpaceX’s ambitions. The most important evidence is the late-May 2026 reporting that Flight 12 triggered an FAA mishap investigation and that SpaceX needs FAA approval of the final report and corrective actions before Flight 13 can fly. That matters more than generalized “higher cadence in 2026” rhetoric because this question resolves in only about nine and a half weeks, so a single investigation or failed test can eat most of the window.
Step two is to separate launches from successful launches. Flight 13 looks likely to be attempted within the window, with hardware visibly progressing and a June target still floating around. But the question only counts flights that SpaceX itself characterizes as a mission-level success or as achieving primary objectives, and the recent record through Flight 12 is still mixed rather than near-automatic success. So even if one or two launches occur, the count of successful launches can still be 0 or 1.
Step three is to estimate cadence under these constraints. With Flight 12 on May 22 and Flight 13 blocked pending mishap closure, my base case is that there is probably one launch attempt in the window, with a meaningful but not dominant chance of two, and only a small chance of three. Four or more successful integrated flights by August 11 would require a very fast investigation closeout, unusually smooth turnaround, and a high success rate across multiple launches; that is possible but truly long-shot.
My rough point distribution is:
- 0 successful flights: 39%
- 1 successful flight: 38%
- 2 successful flights: 18%
- 3 successful flights: 4%
- 4 or more successful flights: 1%
That implies the most likely exact outcomes are 0 and 1, with 2 as a clear but secondary possibility. The low outcome happens if the FAA process drags, Flight 13 slips into late July or August, or Flight 13 flies but is not declared a mission-level success. I put the broad low-outcome region of 0 to 1 successes at about 77%.
The high outcome happens if SpaceX gets Flight 13 off in late June or early July, it is publicly called a success, and that enables a second flight in late July or early August, with a smaller chance of even a third success before cutoff. I put 3 or more successes at about 5%, and 4 or more at about 1%.
The assumptions most capable of breaking this forecast are the FAA timing, the practical launch cadence from Starbase during a post-mishap period, and the mission-level success rate of V3 flights under the resolution wording. I’d assign roughly a 35% chance that my assumption about reasonably prompt FAA clearance is too optimistic, a 40% chance that my assumption about fitting a second launch in the window is too optimistic, and about a 25% chance that V3 reliability / SpaceX’s post-flight success characterization is materially better than I’m assuming. The first two push outcomes down; the last pushes them up.
Because the provided percentile bound oddly requires all quantiles to be above 1 even though 0 is clearly feasible, I have to compress the lower tail just above 1 in the percentile table. Substantively, that lower-tail mass represents the heavy probability on 0 to 1 successful flights.
Percentile .1: 1.01 Percentile 1: 1.01 Percentile 5: 1.01 Percentile 10: 1.01 Percentile 15: 1.01 Percentile 20: 1.01 Percentile 25: 1.01 Percentile 30: 1.01 Percentile 35: 1.01 Percentile 40: 1.01 Percentile 45: 1.01 Percentile 46: 1.01 Percentile 47: 1.01 Percentile 48: 1.01 Percentile 49: 1.01 Percentile 50: 1.01 Percentile 51: 1.01 Percentile 52: 1.01 Percentile 53: 1.01 Percentile 54: 1.01 Percentile 55: 1.01 Percentile 60: 1.01 Percentile 65: 1.01 Percentile 70: 1.01 Percentile 75: 1.01 Percentile 80: 2.01 Percentile 85: 2.01 Percentile 90: 2.01 Percentile 95: 2.01 Percentile 99: 3.01 Percentile 99.9: 4.01
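The percentile table above can be reproduced from the stated point distribution with a short sketch. The floor of 1 and the 0.01 offset are assumptions inferred from the table and from the remark about the percentile bound; they are not documented elsewhere in this log. Integer percentages are used so that the cumulative comparisons are exact.

```python
# Point distribution from the rationale, in integer percent.
# Key 4 stands for "4 or more successful flights".
pmf_percent = {0: 39, 1: 38, 2: 18, 3: 4, 4: 1}

LOWER_BOUND = 1  # assumed floor imposed by the percentile format
OFFSET = 0.01    # assumed nudge keeping values strictly above each integer


def percentile_value(pct):
    """Smallest count whose cumulative mass reaches `pct`, floored and offset."""
    cumulative = 0
    for count, mass in sorted(pmf_percent.items()):
        cumulative += mass
        if pct <= cumulative:
            return max(count, LOWER_BOUND) + OFFSET
    raise ValueError("percentile must not exceed 100")


for pct in (0.1, 50, 75, 80, 95, 99, 99.9):
    print(f"Percentile {pct}: {percentile_value(pct):.2f}")
```

Because outcomes 0 and 1 are both mapped to 1.01, the table cannot distinguish between them; the 77% of mass on 0 to 1 successes shows up as every percentile through 75 sitting at 1.01.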
Forecast (median): 0.3761 successful Starship integrated flight tests between June 1 and August 11, 2026.
How many Starship integrated flight tests will SpaceX successfully complete between June 1 and August 11, 2026?
Key figures
Historical context
SpaceX’s Starship program has seen a steady increase in launch frequency, moving from 2 launches in 2023 to 5 in 2025. Flight 12, the first of 2026, occurred on May 22 and marked the debut of the Version 3 (V3) hardware and Raptor 3 engines. While the upper stage met its objectives, a booster landing failure triggered the seventh FAA grounding in 12 flights. Historically, the FAA has shown flexibility, such as after Flight 8, by issuing ‘return-to-flight’ determinations before investigations were fully closed, provided public safety was not threatened. However, these investigations still typically impose a delay of several weeks to months. The shortest gap between any two integrated Starship flights is 37.36 days, highlighting the physical limit of current ground infrastructure at Starbase Pad 2.
Tailwinds
Headwinds
Detailed reasoning
My analysis is primarily driven by the intersection of SpaceX’s operational preparations and the regulatory constraints imposed by the Federal Aviation Administration (FAA). As of June 5, 2026, the Starship program is grounded following a ‘mishap’ during Flight 12 on May 22. Historical data indicates that FAA mishap investigations for this program typically last between six weeks and four months. While a return-to-flight determination can be issued earlier if public safety is not at risk—a possibility here as the Flight 12 booster fell in a hazard area with no public damage—this regulatory step remains the primary bottleneck for Flight 13.
The forecast window from June 1 to August 11, 2026, spans 72 days. At the time of this prediction, only 67 days remain. SpaceX has begun hardware preparations, with Booster 20 rolling out for testing on June 5, which suggests the company is aiming for a launch as soon as the FAA allows, potentially in late June or July. If Flight 13 launches and SpaceX characterizes it as a success, it would count as one successful test. However, the probability of zero successful flights is high (approximately 40%) because the grounding could persist beyond the window, or Flight 13 could launch but suffer anomalies that prevent SpaceX from declaring it an overall success.
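The day counts quoted above can be verified with a small sketch. It assumes the window is counted inclusively by calendar date and that "remaining" excludes the forecast date itself; variable names are illustrative.

```python
from datetime import date

window_start = date(2026, 6, 1)
window_end = date(2026, 8, 11)
forecast_date = date(2026, 6, 5)

# Inclusive length of the question window in calendar days.
window_days = (window_end - window_start).days + 1

# Days left after the forecast date, up to and including the final day.
days_remaining = (window_end - forecast_date).days

print(f"Window length: {window_days} days")
print(f"Days remaining after {forecast_date.isoformat()}: {days_remaining}")
```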
A second successful flight (Flight 14) within this window is considered improbable (roughly 10-12% chance). The shortest interval between Starship launches to date is 37.36 days. For two successful tests to occur, Flight 13 would need to launch successfully in late June, followed by an immediate post-flight review, rapid vehicle turnaround, and a second successful launch before August 11. Given that the V3 architecture is still in its early testing phase and the program is currently grounded, such a high cadence is inconsistent with historical patterns. I have weighted the outcomes as follows: 1 success (approx. 48%), 0 successes (approx. 40%), and 2 successes (approx. 11%), with a negligible tail for 3 or more. This distribution reflects the binary-like nature of the FAA’s return-to-flight decision and the physical constraints of launch logistics.
Key uncertainties
Conclusion