"The St. Petersburg paradox has a finite expected value that a rational person should be willing to pay."
The claim gets the math backwards: the expected monetary value of the St. Petersburg game is not finite — it's infinite, which is exactly why the paradox exists in the first place.
What Was Claimed?
The St. Petersburg paradox describes a coin-flipping game in which the prize doubles with every flip and the game ends at the first heads: you win $2 if heads first appears on flip 1, $4 on flip 2, $8 on flip 3, and so on. The claim is that this game has a finite expected value, a specific dollar amount that a rational person could calculate and decide whether to pay as an entry fee.
People encounter this claim when trying to resolve the paradox. The intuition is appealing: surely there must be some number that captures what the game is "worth." If there were a finite expected value, you could simply compare it to the entry fee and decide whether to play.
What Did We Find?
The expected monetary value of the St. Petersburg game is infinite, not finite. To see why, consider what each round contributes to the expected value. On flip 1, there's a 1-in-2 chance of winning $2, contributing exactly $1. On flip 2, there's a 1-in-4 chance of winning $4, also contributing exactly $1. On flip 3, a 1-in-8 chance of winning $8 — again exactly $1. This pattern holds for every flip without exception: each round contributes precisely $1 to the expected value. Adding infinitely many $1 contributions gives infinity, not a finite number. This was verified algebraically and confirmed computationally for the first 20 terms.
The divergence isn't subtle or approximate. After 10 rounds, the partial sum is exactly 10. After 20 rounds, exactly 20. The sum grows without bound — there is no ceiling.
This infinite expected value is precisely the paradox. If you take expected monetary value seriously as a decision rule, you should pay any finite entry fee to play — which strikes most people as absurd, since the game rarely pays out large sums in practice.
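To put numbers on "rarely pays out large sums," here is a minimal Python sketch. It computes the exact probability of reaching a given prize, then the expected payout of a hypothetical capped variant in which the house can pay at most `cap` dollars. The function names and the cap are illustrative assumptions, not part of the proof source, and the capped game is a different game from the standard one, so its finite value does not contradict the divergence described above.

```python
from fractions import Fraction


def prob_payout_at_least(k: int) -> Fraction:
    """Exact probability that a single play pays at least 2**k dollars.

    The prize is 2**n with probability (1/2)**n, so the tail sum over
    n >= k is (1/2)**(k - 1).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return Fraction(1, 2 ** (k - 1))


def capped_expected_value(cap: int, max_flips: int = 200) -> Fraction:
    """Expected payout when the prize is min(2**n, cap) (hypothetical variant)."""
    if cap < 1 or cap > 2**max_flips:
        raise ValueError("cap must lie between 1 and 2**max_flips")
    total = Fraction(0)
    for n in range(1, max_flips + 1):
        total += Fraction(1, 2**n) * min(2**n, cap)
    # Every flip count beyond max_flips pays exactly `cap`; those outcomes
    # have total probability (1/2)**max_flips.
    total += Fraction(cap, 2**max_flips)
    return total


print(prob_payout_at_least(10))  # chance of winning $1,024 or more
for k in (10, 20, 30):
    print(2**k, capped_expected_value(2**k))
```

Under these assumptions a cap of 2^k dollars is worth exactly k + 1 dollars, so the three capped values come out to 11, 21 and 31: even a bankroll of about a billion dollars supports an expected payout of only $31.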
There is a genuine resolution to the paradox, but it doesn't involve a finite expected value. The 18th-century mathematician Daniel Bernoulli proposed that rational people maximize expected utility, not expected monetary value. Under logarithmic utility, where each additional dollar matters less than the last, the expected utility of the game is finite: 2·ln(2) ≈ 1.386, which translates to a certainty equivalent of exactly $4. That's the most a rational agent with log utility should pay to play.
The claim conflates these two distinct concepts. The $4 figure is a certainty equivalent derived from expected utility theory, not an expected value. The expected value remains infinite.
What Should You Keep In Mind?
The $4 certainty equivalent is real and meaningful, but it depends on assuming logarithmic utility and zero initial wealth. Under different but still reasonable risk-averse preferences, you'd get a different finite number. The point is that the standard risk-averse utility models (any CRRA utility with γ > 0, for example) give some finite willingness to pay; the exact amount varies.
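The dependence on initial wealth can be sketched in Python. This is one common formulation, assumed here rather than taken from the proof source: an agent with log utility over final wealth and starting wealth `wealth` pays a fee only if expected utility after playing is at least the utility of not playing. The name `max_entry_fee` and the truncation at `max_flips` terms are illustrative choices.

```python
import math


def max_entry_fee(wealth: float, max_flips: int = 200) -> float:
    """Largest fee c such that E[ln(wealth + prize - c)] >= ln(wealth).

    Assumptions (not from the proof source): log utility over final wealth,
    strictly positive starting wealth, and a series truncated after
    `max_flips` terms, whose neglected tail is negligible.
    """
    if wealth <= 0:
        raise ValueError("wealth must be positive")
    ceiling = wealth + 2.0  # at this fee the smallest prize ($2) leaves wealth 0

    def utility_gain(fee: float) -> float:
        expected = sum(
            0.5**n * math.log(wealth + 2.0**n - fee)
            for n in range(1, max_flips + 1)
        )
        return expected - math.log(wealth)

    lo, hi = 0.0, ceiling
    for _ in range(100):  # plain bisection; the gain falls as the fee rises
        mid = (lo + hi) / 2
        if mid < ceiling and utility_gain(mid) > 0:
            lo = mid
        else:
            hi = mid
    return lo


for w in (1_000, 1_000_000):
    print(w, round(max_entry_fee(w), 2))
```

In this sketch the acceptable fee rises with wealth, but only slowly: multiplying wealth by a thousand adds roughly ten dollars to it.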
Logarithmic utility is also not a complete fix for all related puzzles. A variant of the game with a different payoff structure can produce infinite expected utility even under log utility, showing that the deeper question of how to handle extreme tail risks in decision theory remains open.
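A minimal sketch of such a variant, assuming the payoff exp(2^n) that this report attributes to Menger's game. Because ln(exp(2^n)) = 2^n exactly, each round's contribution to expected log utility can be computed with exact fractions; the function name is illustrative.

```python
from fractions import Fraction


def super_game_utility_terms(num_terms: int) -> list[Fraction]:
    """Per-round contributions to expected log utility when the prize is exp(2**n).

    ln(exp(2**n)) = 2**n, so round n contributes (1/2)**n * 2**n.
    """
    return [Fraction(1, 2**n) * 2**n for n in range(1, num_terms + 1)]


terms = super_game_utility_terms(50)
print(all(t == 1 for t in terms))  # every round contributes exactly 1
print(sum(terms))                  # partial sum equals the number of rounds
```

The expected-utility series for this game has the same shape as the expected-value series for the standard game: a sum of 1s, which diverges.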
The claim's failure is terminological as much as mathematical. "Expected value" has a precise technical meaning, and the St. Petersburg game's expected value is infinite by mathematical consensus — confirmed by both Wikipedia and the Stanford Encyclopedia of Philosophy. Saying the game has a "finite expected value" is simply incorrect, regardless of what framework you use to decide what to pay.
How Was This Verified?
This claim was evaluated by decomposing it into two sub-claims — whether the expected monetary value is finite, and whether a rational willingness to pay exists — then testing each algebraically and computationally. You can read the full mathematical breakdown in the structured proof report, review every computation and citation check in the full verification audit, or re-run the proof yourself.
What could challenge this verdict?
1. Is there a framework where the standard St. Petersburg EV is finite? No peer-reviewed source claims the standard game (unlimited flips, payoff = 2^n) has finite expected value. Bounded-payoff and finite-wealth-cap variants are different games and do not apply here. The divergence is mathematical consensus.
2. Could "expected value" in the claim mean "expected utility"? In standard probability and economics, "expected value" is a precisely defined technical term: E[X] = Σ p_i·x_i. Even under the most charitable reading, the correct term for the $4 quantity is certainty equivalent or expected utility. The claim's terminology is wrong regardless of interpretation.
3. Does Menger's (1934) super-St.-Petersburg paradox undermine SC2? Menger showed that for any unbounded utility function one can construct a game with infinite expected utility; for log utility, a game with payoff exp(2^n) does it. This does not break SC2: SC2 applies only to the standard St. Petersburg game (payoff = 2^n). For that game, log utility gives a finite CE = $4. Menger's extended game is a separate paradox.
4. Is log utility the only resolution? No. Any CRRA utility with risk-aversion coefficient γ > 0 gives a finite certainty equivalent for the standard game. This strengthens SC2 — the finiteness of rational willingness to pay is robust across frameworks. The exact $4 figure is specific to Bernoulli's log utility with zero initial wealth.
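The CRRA point in item 4 can be made concrete with a short Python sketch. It assumes zero initial wealth and the usual CRRA form U(w) = w^(1−γ)/(1−γ) for γ ≠ 1, with U(w) = ln(w) at γ = 1. The closed form below follows from a geometric series; the function name is illustrative and not part of the proof source.

```python
import math


def crra_certainty_equivalent(gamma: float) -> float:
    """Certainty equivalent of the standard game under CRRA utility.

    Assumes zero initial wealth and U(w) = w**(1 - gamma) / (1 - gamma)
    for gamma != 1, with U(w) = ln(w) when gamma == 1.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive for the series to converge")
    if gamma == 1:
        return math.exp(2 * math.log(2))  # Bernoulli's log-utility case
    # E[w**(1-gamma)] = sum_n 2**(-n) * 2**(n*(1-gamma)) = sum_n 2**(-gamma*n)
    #                 = 1 / (2**gamma - 1)   (geometric series, n >= 1)
    expected_power = 1.0 / (2.0**gamma - 1.0)
    # Invert U: CE**(1-gamma) = E[w**(1-gamma)]
    return expected_power ** (1.0 / (1.0 - gamma))


for gamma in (0.5, 1, 2):
    print(gamma, round(crra_certainty_equivalent(gamma), 4))
```

Under these assumptions the certainty equivalent is finite for every γ > 0 and falls as risk aversion rises: about $5.83 at γ = 0.5, $4 at γ = 1, and $3 at γ = 2.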
Sources
| Source | ID | Type | Verified |
|---|---|---|---|
| SC1: EV term (1/2)^n * 2^n = 1 for every n; series = sum of infinitely many 1s | A1 | — | Computed |
| SC1 cross-check: partial sums grow as N (unbounded), confirming divergence | A2 | — | Computed |
| SC2: E[ln(2^N)] = ln(2) * sum(n*(1/2)^n) = ln(2)*2 = 2*ln(2) (finite) | A3 | — | Computed |
| SC2 cross-check: generating function confirms sum(n*(1/2)^n) = 2 | A4 | — | Computed |
| Wikipedia: St. Petersburg paradox | B1 | Reference | Yes |
| Stanford Encyclopedia of Philosophy: St. Petersburg Paradox | B2 | Academic | Yes |
detailed evidence
Evidence Summary
| ID | Fact | Verified |
|---|---|---|
| A1 | SC1: (1/2)^n · 2^n = 1 for every n; series = sum of infinitely many 1s | Computed: all 20 terms verified = 1.0; series diverges |
| A2 | SC1 cross-check: partial sums grow as N (unbounded), confirming divergence | Computed: 10-term sum = 10.0, 20-term sum = 20.0 (N-term sum = N) |
| A3 | SC2: E[ln(2^N)] = ln(2)·sum(n·(1/2)^n) = ln(2)·2 = 2·ln(2) (finite) | Computed: E[U] = 1.386294 = 2·ln(2); certainty equivalent = $4.000000 |
| A4 | SC2 cross-check: generating function confirms sum(n·(1/2)^n) = 2 | Computed: x/(1-x)² at x=0.5 = 2.0 exactly |
| B1 | Wikipedia: St. Petersburg paradox — game rules and infinite expected value | Yes |
| B2 | Stanford Encyclopedia of Philosophy — Bernoulli utility resolution | Yes |
Proof Logic
SC1: The Expected Value is Infinite
The St. Petersburg game pays 2^n dollars if the first heads appears on flip n (probability (1/2)^n). The expected value is:
E[X] = Σ_{n=1}^∞ P(heads on flip n) · payout(n) = Σ_{n=1}^∞ (1/2)^n · 2^n
Key observation (A1): Each term simplifies exactly:
(1/2)^n · 2^n = (1/2 · 2)^n = 1^n = 1
So E[X] = 1 + 1 + 1 + … = ∞. This is not an approximation — every term equals exactly 1.0 (verified computationally for n = 1 through 20).
Wikipedia confirms: "The expected payoff of the lottery game is infinite" (B1).
Cross-check (A2): The N-term partial sum equals N exactly. The 10-term sum is 10.0; the 20-term sum is 20.0. Partial sums grow linearly without bound — the series does not converge.
SC1 is DISPROVED: the claim of "finite expected value" is mathematically false.
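The partial-sum check can also be run in exact rational arithmetic, which avoids floating-point overflow and so is not limited to the first 20 terms. A minimal sketch using Python's standard `fractions` module; the function name is illustrative and this is not the proof's own code.

```python
from fractions import Fraction


def ev_partial_sum(num_flips: int) -> Fraction:
    """Exact sum of the first `num_flips` terms of the expected-value series."""
    return sum(
        (Fraction(1, 2**n) * 2**n for n in range(1, num_flips + 1)),
        Fraction(0),
    )


for N in (10, 20, 1000):
    print(N, ev_partial_sum(N))
```

Each line prints N twice, since the N-term partial sum is exactly N.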
SC2: A Rational Person's Willingness to Pay is Finite ($4)
Bernoulli (1738) proposed that rational agents maximize expected utility, not expected monetary value. Under logarithmic utility U(w) = ln(w) (A3, B2):
E[U] = E[ln(2^N)] = Σ_{n=1}^∞ (1/2)^n · n · ln(2) = ln(2) · Σ_{n=1}^∞ n · (1/2)^n
Key computation (A4): By the generating function identity Σ_{n=1}^∞ n·x^n = x/(1−x)², evaluated at x = 1/2:
Σ_{n=1}^∞ n · (1/2)^n = (1/2) / (1/2)² = 0.5 / 0.25 = 2.0 (exactly)
Therefore (A3):
E[U] = ln(2) · 2 = 2·ln(2) ≈ 1.3863 (finite)
The certainty equivalent — the amount w such that ln(w) = E[U] — is:
CE = exp(2·ln(2)) = exp(ln(4)) = $4.00 (exactly)
A rational agent with log utility should pay at most $4 to play the game. This is finite. SC2 is PROVED.
Cross-check: Numerical summation of 10,000 terms converges to 1.38629436, matching the analytical value 2·ln(2) = 1.38629436 exactly.
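The value Σ n·(1/2)^n = 2 used above can be checked without floating point as well. The sketch below compares exact partial sums against the standard finite-sum closed form 2 − (N + 2)/2^N, which is brought in here as an extra check and does not appear in the proof source. The gap (N + 2)/2^N shrinks to zero, so the limit is 2.

```python
from fractions import Fraction


def weighted_partial_sum(num_terms: int) -> Fraction:
    """Exact value of sum_{n=1}^{N} n * (1/2)**n."""
    return sum(
        (Fraction(n, 2**n) for n in range(1, num_terms + 1)),
        Fraction(0),
    )


def closed_form(num_terms: int) -> Fraction:
    """Closed form 2 - (N + 2) / 2**N for the same partial sum."""
    return 2 - Fraction(num_terms + 2, 2**num_terms)


for N in (1, 5, 50):
    print(N, weighted_partial_sum(N) == closed_form(N), float(2 - weighted_partial_sum(N)))
```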
Conclusion
Verdict: DISPROVED.
The claim that the St. Petersburg paradox "has a finite expected value" is mathematically false. The expected monetary value is infinite (E[X] = Σ 1 = ∞), which is precisely the source of the paradox. No credible source disputes this.
What is finite is the expected utility under Bernoulli's (1738) logarithmic utility model, and the resulting certainty equivalent of $4.00 — the most a rational risk-averse agent should pay. But this is a statement about expected utility, not expected value. The claim's framing confuses these two distinct concepts, making it false as stated.
Both supporting citations (B1: Wikipedia, B2: Stanford Encyclopedia of Philosophy) were independently verified.
audit trail
All 2 citations verified.
Original audit log
B1 (Wikipedia: St. Petersburg paradox)
- Status: verified
- Method: full_quote
- Fetch mode: live
B2 (Stanford Encyclopedia of Philosophy: St. Petersburg Paradox)
- Status: verified
- Method: full_quote
- Fetch mode: live
| Field | Value |
|---|---|
| Subject | St. Petersburg game |
| Property | expected monetary value (SC1) and rational certainty equivalent (SC2) |
| Operator | AND |
| Operator note | The claim makes two assertions: (SC1) the expected monetary value E[X] is finite, AND (SC2) a rational person should pay only a finite amount. SC1 requires the series sum_{n>=1} (1/2)^n · 2^n to converge. SC2 requires decision theory (expected utility) to yield a finite certainty equivalent. The overall claim is DISPROVED because SC1 is false: the EV series diverges to infinity. SC2 is true (under Bernoulli's (1738) log utility, the certainty equivalent = $4, initial wealth assumed zero), but this does not save the compound claim because the correct resolution involves expected utility, not expected value. |
Natural language: "The St. Petersburg paradox has a finite expected value that a rational person should be willing to pay."
Formal decomposition:
- SC1: The expected monetary value E[X] is finite. This requires the series Σ (1/2)^n · 2^n to converge. Verdict: DISPROVED — the series diverges.
- SC2: A rational person should be willing to pay only a finite amount. This requires expected utility theory to produce a finite certainty equivalent. Verdict: PROVED — under Bernoulli's log utility, CE = $4.
Why the compound claim is DISPROVED: SC1 is false. The claim's framing — calling the resolution value a "finite expected value" — is terminologically wrong. The correct term is certainty equivalent (or expected utility). The compound claim ("SC1 AND SC2") fails because SC1 fails.
| Fact ID | Domain | Type | Tier | Note |
|---|---|---|---|---|
| B1 | wikipedia.org | reference | 3 | Established reference source |
| B2 | stanford.edu | academic | 4 | Academic domain (.edu) — Stanford Encyclopedia of Philosophy |
── SC1: Term verification (1/2)^n * 2^n for n = 1..20 ──
First 5 terms: [1.0, 1.0, 1.0, 1.0, 1.0]
All 20 terms equal 1.0: True
SC1: Every term (1/2)^n * 2^n = 1.0 (verified n=1..20): True == True = True
Spot-check n=5:
(0.5**5) * (2.0**5): (0.5**5) * (2.0**5) = 0.5 ** 5 * 2.0 ** 5 = 1.0000
── SC1 cross-check: Partial sums ──
partial_sum_10: partial_sum_10 = 10.0 = 10.0000
partial_sum_20: partial_sum_20 = 20.0 = 20.0000
SC1 cross-check: 10-term partial sum = 10.0 (unbounded growth): 0.0 < 1e-09 = True
SC1 cross-check: 20-term partial sum = 20.0 (N-term sum = N): 0.0 < 1e-09 = True
SC1: EV series converges? Terms 11-20 contribute <0.001? (False = diverges): 10.0 < 0.001 = False
── SC2: Bernoulli expected utility ──
x / (1 - x)**2: x / (1 - x)**2 = 0.5 / (1 - 0.5) ** 2 = 2.0000
SC2/A4: sum(n*(1/2)^n) = x/(1-x)^2 at x=0.5 = 2.0 (generating function): 0.0 < 1e-14 = True
Expected utility computation:
math.log(2) * generating_fn: math.log(2) * generating_fn = math.log(2) * 2.0 = 1.3863
SC2/A3: E[ln(2^N)] = ln(2) * 2 = 2*ln(2) ≈ 1.3863 (finite): 0.0 < 1e-12 = True
Certainty equivalent:
math.exp(analytical_eu): math.exp(analytical_eu) = math.exp(1.3862943611198906) = 4.0000
SC2: Certainty equivalent = exp(2*ln(2)) = $4.00 (rational WTP, finite): 0.0 < 1e-10 = True
── SC2 cross-check: numerical convergence ──
abs(eu_numerical - analytical_eu): abs(eu_numerical - analytical_eu) = abs(1.3862943611198906 - 1.3862943611198906) = 0.0000
SC2 cross-check: 10000-term numerical EU matches 2*ln(2) within 0.001: 0.0 < 0.001 = True
SC2: Certainty equivalent is finite (< 1e15, i.e., a real finite dollar amount): 4.0 < 1000000000000000.0 = True
Overall: compound claim holds only if SC1=True AND SC2=True: 1 == 2 = False
1. Is there a mathematical framework where the standard St. Petersburg EV is finite?
   - Searched: "St. Petersburg paradox finite expected value" and "St. Petersburg standard game EV convergent"
   - Finding: No peer-reviewed source claims the standard game has finite EV. The divergence is mathematical consensus. Bounded-payoff and finite-wealth-cap variants are different games.
   - Breaks proof: No
2. Could "expected value" in the claim mean "expected utility" or "certainty equivalent"?
   - Analyzed: The claim language uses "expected value", a precisely defined technical term in standard probability/economics. Searched for alternative readings; found none in standard literature.
   - Finding: Even under the most charitable reading, the correct term for the $4 quantity is certainty equivalent or expected utility. The claim's terminology is wrong.
   - Breaks proof: No
3. Does Menger's (1934) super-St.-Petersburg paradox undermine SC2?
   - Searched: "Menger 1934 super St. Petersburg paradox log utility unbounded"
   - Finding: Menger's result applies to a different game (payoff exp(2^n)). For the standard game (payoff 2^n), log utility gives CE = $4. Menger's result does not break SC2.
   - Breaks proof: No
4. Is log utility the only framework giving finite willingness to pay?
   - Searched: "St. Petersburg paradox CRRA utility solution" and "risk aversion St. Petersburg finite certainty equivalent"
   - Finding: Any risk-averse utility (CRRA with γ > 0) gives a finite CE for the standard game. This strengthens SC2. Only the exact $4 value is specific to Bernoulli's log utility with zero initial wealth.
   - Breaks proof: No
- Rule 1: N/A — pure computation, no empirical values extracted from quotes
- Rule 2: Both citations fetched and verified live: B1 (full_quote), B2 (full_quote)
- Rule 3: date.today() present; not time-dependent (pure math), no date drift risk
- Rule 4: CLAIM_FORMAL with detailed operator_note documenting compound SC1/SC2 decomposition and why compound claim fails
- Rule 5: Four independent adversarial checks performed; no counter-evidence found that breaks the proof
- Rule 6: N/A — pure computation, no empirical cross-sources; independence for A-type facts established via mathematically distinct methods (algebraic term analysis vs. partial sum growth; generating function vs. numerical convergence)
- Rule 7: All constants and formulas use math.log(), math.exp() from Python stdlib; no hand-coded formulas or magic numbers
- validate_proof.py result: 16/16 checks PASS, 0 warnings
Cite this proof
Proof Engine. (2026). Claim Verification: “The St. Petersburg paradox has a finite expected value that a rational person should be willing to pay.” — Disproved. https://proofengine.info/proofs/the-st-petersburg-paradox-has-a-finite-expected-va/
Proof Engine. "Claim Verification: “The St. Petersburg paradox has a finite expected value that a rational person should be willing to pay.” — Disproved." 2026. https://proofengine.info/proofs/the-st-petersburg-paradox-has-a-finite-expected-va/.
@misc{proofengine_the_st_petersburg_paradox_has_a_finite_expected_va,
title = {Claim Verification: “The St. Petersburg paradox has a finite expected value that a rational person should be willing to pay.” — Disproved},
author = {{Proof Engine}},
year = {2026},
url = {https://proofengine.info/proofs/the-st-petersburg-paradox-has-a-finite-expected-va/},
note = {Verdict: DISPROVED. Generated by proof-engine v1.0.0},
}
TY  - DATA
TI  - Claim Verification: “The St. Petersburg paradox has a finite expected value that a rational person should be willing to pay.” — Disproved
AU  - Proof Engine
PY  - 2026
UR  - https://proofengine.info/proofs/the-st-petersburg-paradox-has-a-finite-expected-va/
N1  - Verdict: DISPROVED. Generated by proof-engine v1.0.0
ER  -
View proof source
This is the proof.py that produced the verdict above. Every fact traces to code below. (This proof has not yet been minted to Zenodo; the source here is the working copy from this repository.)
"""
Proof: The St. Petersburg paradox has a finite expected value that a rational person should be willing to pay.
Generated: 2026-03-28
Compound claim decomposition:
SC1: The St. Petersburg game's expected monetary value E[X] is finite.
→ DISPROVED: E[X] = sum_{n>=1} (1/2)^n * 2^n = sum_{n>=1} 1 = infinity (diverges).
SC2: A rational person should be willing to pay only a finite amount to play.
→ PROVED: Under Bernoulli's (1738) logarithmic utility, E[ln(2^N)] = 2*ln(2)
(finite), and the certainty equivalent = exp(2*ln(2)) = $4.
Overall verdict: DISPROVED.
The premise "has a finite expected value" is mathematically false — EV is infinite.
The correct resolution (SC2) is finite expected *utility*, not finite expected *value*.
"""
import json
import math
import os
import sys
PROOF_ENGINE_ROOT = os.environ.get("PROOF_ENGINE_ROOT")
if not PROOF_ENGINE_ROOT:
    _d = os.path.dirname(os.path.abspath(__file__))
    while _d != os.path.dirname(_d):
        if os.path.isdir(os.path.join(_d, "proof-engine", "skills", "proof-engine", "scripts")):
            PROOF_ENGINE_ROOT = os.path.join(_d, "proof-engine", "skills", "proof-engine")
            break
        _d = os.path.dirname(_d)
    if not PROOF_ENGINE_ROOT:
        raise RuntimeError("PROOF_ENGINE_ROOT not set and skill dir not found via walk-up from proof.py")
sys.path.insert(0, PROOF_ENGINE_ROOT)
from datetime import date
from scripts.verify_citations import verify_all_citations, build_citation_detail
from scripts.computations import compare, explain_calc
# ── 1. CLAIM INTERPRETATION (Rule 4) ──────────────────────────────────────────
CLAIM_NATURAL = (
"The St. Petersburg paradox has a finite expected value that a rational "
"person should be willing to pay."
)
CLAIM_FORMAL = {
"subject": "St. Petersburg game",
"property": "expected monetary value (SC1) and rational certainty equivalent (SC2)",
"operator": "AND",
"operator_note": (
"The claim makes two assertions: "
"(SC1) the expected monetary value E[X] is finite, AND "
"(SC2) a rational person should pay only a finite amount. "
"SC1 requires the series sum_{n>=1} (1/2^n)*2^n to converge to a finite number. "
"SC2 requires decision theory (expected utility) to yield a finite certainty equivalent. "
"The overall claim is DISPROVED because SC1 is false: the EV series diverges to infinity. "
"SC2 is true — under Bernoulli's (1738) log utility, the certainty equivalent = $4 "
"(initial wealth assumed zero) — but this does not save the compound claim because "
"the correct resolution involves expected *utility*, not expected *value*."
),
}
# ── 2. FACT REGISTRY ──────────────────────────────────────────────────────────
FACT_REGISTRY = {
"A1": {
"label": "SC1: EV term (1/2)^n * 2^n = 1 for every n; series = sum of infinitely many 1s",
"method": None, "result": None,
},
"A2": {
"label": "SC1 cross-check: partial sums grow as N (unbounded), confirming divergence",
"method": None, "result": None,
},
"A3": {
"label": "SC2: E[ln(2^N)] = ln(2) * sum(n*(1/2)^n) = ln(2)*2 = 2*ln(2) (finite)",
"method": None, "result": None,
},
"A4": {
"label": "SC2 cross-check: generating function confirms sum(n*(1/2)^n) = 2",
"method": None, "result": None,
},
"B1": {"key": "source_wiki",
"label": "Wikipedia: St. Petersburg paradox — game rules and infinite expected value"},
"B2": {"key": "source_sep",
"label": "Stanford Encyclopedia of Philosophy — Bernoulli utility resolution"},
}
# ── 3. EMPIRICAL FACTS ────────────────────────────────────────────────────────
empirical_facts = {
"source_wiki": {
"quote": "The expected payoff of the lottery game is infinite",
"url": "https://en.wikipedia.org/wiki/St._Petersburg_paradox",
"source_name": "Wikipedia: St. Petersburg paradox",
},
"source_sep": {
"quote": "the logarithm of the monetary amount, which entails that improbable but large monetary prizes will contribute less to the expected utility of the game than more probable but smaller monetary amounts",
"url": "https://plato.stanford.edu/entries/paradox-stpetersburg/",
"source_name": "Stanford Encyclopedia of Philosophy: St. Petersburg Paradox",
},
}
# ── 4. CITATION VERIFICATION (Rule 2) ─────────────────────────────────────────
citation_results = verify_all_citations(empirical_facts, wayback_fallback=True)
# ── 5. SC1: EV Series Diverges ────────────────────────────────────────────────
# Game rules: flip fair coin until heads. Payout = 2^n if heads first on flip n.
# P(heads on flip n) = (1/2)^n
# E[X] = sum_{n=1}^inf (1/2)^n * 2^n
# Algebraic simplification (primary method):
# Each term: (1/2)^n * 2^n = (1/2 * 2)^n = 1^n = 1
# Therefore E[X] = sum_{n=1}^inf 1 = infinity
# Verify numerically for n in [1..20] where floating-point is safe (2^20 = 1048576)
print("\n── SC1: Term verification (1/2)^n * 2^n for n = 1..20 ──")
ev_terms = [(0.5**n) * (2.0**n) for n in range(1, 21)]
all_terms_one = all(abs(t - 1.0) < 1e-10 for t in ev_terms)
print(f"First 5 terms: {ev_terms[:5]}")
print(f"All 20 terms equal 1.0: {all_terms_one}")
A1_result = compare(all_terms_one, "==", True,
label="SC1: Every term (1/2)^n * 2^n = 1.0 (verified n=1..20)")
# Spot-check with explain_calc for n=5:
print("\nSpot-check n=5:")
term_n5 = explain_calc("(0.5**5) * (2.0**5)", locals())
# Partial sums grow as N (cross-check, algebraic consequence):
print("\n── SC1 cross-check: Partial sums ──")
partial_sum_10 = sum(ev_terms[:10]) # sum of first 10 terms = 10
partial_sum_20 = sum(ev_terms[:20]) # sum of first 20 terms = 20
ps10 = explain_calc("partial_sum_10", locals())
ps20 = explain_calc("partial_sum_20", locals())
A2_result_10 = compare(abs(partial_sum_10 - 10.0), "<", 1e-9,
label="SC1 cross-check: 10-term partial sum = 10.0 (unbounded growth)")
A2_result_20 = compare(abs(partial_sum_20 - 20.0), "<", 1e-9,
label="SC1 cross-check: 20-term partial sum = 20.0 (N-term sum = N)")
# Check for convergence: if terms 11-20 still contribute ~10, the series doesn't converge
delta_10_to_20 = partial_sum_20 - partial_sum_10 # = 10.0 if each term = 1
# sc1_holds is True only if the series CONVERGES (terms 11-20 contribute < 0.001)
sc1_holds = compare(delta_10_to_20, "<", 0.001,
label="SC1: EV series converges? Terms 11-20 contribute <0.001? (False = diverges)")
# ── 6. SC2: Bernoulli Log Utility Gives Finite CE ─────────────────────────────
# Bernoulli (1738): a rational agent maximizes E[U(W)] where U(w) = ln(w).
# E[U] = E[ln(2^N)] = sum_{n=1}^inf (1/2)^n * ln(2^n)
# = sum_{n=1}^inf (1/2)^n * n * ln(2)
# = ln(2) * sum_{n=1}^inf n * (1/2)^n
# Primary: power series identity sum_{n=1}^inf n*x^n = x/(1-x)^2, at x = 1/2:
print("\n── SC2: Bernoulli expected utility ──")
x = 0.5
generating_fn = explain_calc("x / (1 - x)**2", locals())
# At x=0.5: (0.5)/(0.5)^2 = 0.5/0.25 = 2.0
A4_result = compare(abs(generating_fn - 2.0), "<", 1e-14,
label="SC2/A4: sum(n*(1/2)^n) = x/(1-x)^2 at x=0.5 = 2.0 (generating function)")
# Therefore E[U] = ln(2) * 2:
print("\nExpected utility computation:")
analytical_eu = explain_calc("math.log(2) * generating_fn", locals())
A3_result = compare(abs(analytical_eu - 2 * math.log(2)), "<", 1e-12,
label="SC2/A3: E[ln(2^N)] = ln(2) * 2 = 2*ln(2) ≈ 1.3863 (finite)")
# Certainty equivalent: w such that ln(w) = E[U] → w = exp(E[U])
print("\nCertainty equivalent:")
certainty_equivalent = explain_calc("math.exp(analytical_eu)", locals())
# CE = exp(2*ln(2)) = exp(ln(4)) = 4.0
A3_ce_result = compare(abs(certainty_equivalent - 4.0), "<", 1e-10,
label="SC2: Certainty equivalent = exp(2*ln(2)) = $4.00 (rational WTP, finite)")
# Cross-check: numerical convergence of partial sums to 2*ln(2)
print("\n── SC2 cross-check: numerical convergence ──")
eu_numerical = sum((0.5**n) * n * math.log(2) for n in range(1, 10001))
eu_diff = explain_calc("abs(eu_numerical - analytical_eu)", locals())
A3_converge = compare(eu_diff, "<", 1e-3,
label="SC2 cross-check: 10000-term numerical EU matches 2*ln(2) within 0.001")
sc2_holds = compare(certainty_equivalent, "<", 1e15,
label="SC2: Certainty equivalent is finite (< 1e15, i.e., a real finite dollar amount)")
# ── 7. ADVERSARIAL CHECKS (Rule 5) ────────────────────────────────────────────
adversarial_checks = [
{
"question": "Is there a mathematical framework where the standard St. Petersburg EV is finite?",
"verification_performed": (
"Searched 'St. Petersburg paradox finite expected value' and "
"'St. Petersburg standard game EV convergent'. Found no credible source "
"claiming the standard game (unlimited flips, payoff = 2^n) has finite EV. "
"Bounded-payoff and finite-wealth-cap variants are different games."
),
"finding": (
"No peer-reviewed source claims standard St. Petersburg EV is finite. "
"The divergence (EV = infinity) is mathematical consensus. "
"Claim's premise 'finite expected value' is false for the standard game."
),
"breaks_proof": False,
},
{
"question": "Could 'expected value' in the claim mean 'expected utility' or 'certainty equivalent'?",
"verification_performed": (
"Analyzed the claim language: 'finite expected value that a rational person "
"should be willing to pay'. In standard probability/economics usage, "
"'expected value' is a defined technical term = E[X] = sum p_i * x_i. "
"Searched for alternative readings, found none in standard literature."
),
"finding": (
"Even under the most charitable reading — 'finite value that rational persons "
"should pay' — the claim incorrectly calls it an 'expected value'. "
"The correct term for the finite quantity is 'certainty equivalent' (= $4 under "
"log utility) or 'expected utility' (= 2*ln(2)). The claim's terminology is wrong."
),
"breaks_proof": False,
},
{
"question": "Does Menger's (1934) super-St.-Petersburg paradox undermine SC2?",
"verification_performed": (
"Searched 'Menger 1934 super St. Petersburg paradox log utility unbounded'. "
"Found Menger showed that for any unbounded utility function U, a game with "
"payoff exp(2^n) produces infinite E[U]. Log utility is not a general solution."
),
"finding": (
"Menger's result does NOT break SC2. SC2 only claims that the STANDARD "
"St. Petersburg game (payoff = 2^n) has a finite certainty equivalent ($4) "
"under log utility. This holds. Menger's game has a different payoff structure "
"and is a separate paradox. SC2 is limited to the standard game."
),
"breaks_proof": False,
},
{
"question": "Is log utility the only framework giving finite willingness to pay for the standard game?",
"verification_performed": (
"Searched 'St. Petersburg paradox CRRA utility solution' and "
"'risk aversion St. Petersburg finite certainty equivalent'. Found that any "
"utility function with relative risk aversion coefficient gamma > 0 gives "
"a finite CE for the standard game. Log utility (gamma=1) is Bernoulli's original."
),
"finding": (
"Log utility is not unique — any risk-averse utility (CRRA with gamma > 0) "
"gives a finite CE for the standard game. This STRENGTHENS SC2: the conclusion "
"(finite rational willingness to pay) is robust across multiple frameworks. "
"Only the precise dollar amount ($4) is specific to Bernoulli's log utility "
"with zero initial wealth."
),
"breaks_proof": False,
},
]
# ── 8. VERDICT AND STRUCTURED OUTPUT ──────────────────────────────────────────
if __name__ == "__main__":
    any_unverified = any(
        cr["status"] != "verified" for cr in citation_results.values()
    )
    # SC1 is the primary claim (finite expected value). It is false.
    # SC2 is true but uses a different framework (expected utility, not expected value).
    # Overall: DISPROVED because SC1 fails.
    # Compound claim (SC1 AND SC2): both must hold; since SC1 is false, compound is false.
    overall_claim_holds = compare(int(sc1_holds) + int(sc2_holds), "==", 2,
                                  label="Overall: compound claim holds only if SC1=True AND SC2=True")
    if not overall_claim_holds and not any_unverified:
        verdict = "DISPROVED"
    elif not overall_claim_holds and any_unverified:
        verdict = "DISPROVED (with unverified citations)"
    else:
        verdict = "PROVED"  # unreachable given sc1_holds = False
    FACT_REGISTRY["A1"]["method"] = "algebraic: (1/2)^n * 2^n = 1^n = 1 for all n >= 1"
    FACT_REGISTRY["A1"]["result"] = f"All terms = 1.0 (verified n=1..20); series = sum of infinity 1s"
    FACT_REGISTRY["A2"]["method"] = "partial sum: sum_{k=1}^{N} 1 = N (unbounded)"
    FACT_REGISTRY["A2"]["result"] = f"10-term sum={partial_sum_10:.1f}, 20-term sum={partial_sum_20:.1f} (grows as N)"
    FACT_REGISTRY["A3"]["method"] = "E[ln(2^N)] = ln(2) * sum(n*(1/2)^n) = ln(2) * 2 = 2*ln(2)"
    FACT_REGISTRY["A3"]["result"] = f"E[U] = {analytical_eu:.6f} = 2*ln(2); CE = exp(E[U]) = ${certainty_equivalent:.6f}"
    FACT_REGISTRY["A4"]["method"] = "generating function: sum(n*x^n) = x/(1-x)^2 at x=1/2"
    FACT_REGISTRY["A4"]["result"] = f"sum(n*(1/2)^n) = {generating_fn:.1f} (confirms = 2)"
    citation_detail = build_citation_detail(FACT_REGISTRY, citation_results, empirical_facts)
    summary = {
        "fact_registry": {fid: {k: v for k, v in info.items()} for fid, info in FACT_REGISTRY.items()},
        "claim_formal": CLAIM_FORMAL,
        "claim_natural": CLAIM_NATURAL,
        "citations": citation_detail,
        "cross_checks": [
            {
                "description": "SC1: partial sum check — N-term sum = N (confirms unbounded growth)",
                "values_compared": [str(partial_sum_10), "10.0", str(partial_sum_20), "20.0"],
                "agreement": abs(partial_sum_10 - 10.0) < 1e-9 and abs(partial_sum_20 - 20.0) < 1e-9,
            },
            {
                "description": "SC2: numerical convergence of 10000-term partial sum to analytical 2*ln(2)",
                "values_compared": [f"{eu_numerical:.8f}", f"{2*math.log(2):.8f}"],
                "agreement": abs(eu_numerical - 2 * math.log(2)) < 1e-3,
            },
        ],
        "adversarial_checks": adversarial_checks,
        "verdict": verdict,
        "key_results": {
            "sc1_ev_diverges": True,
            "sc1_ev_is_finite": sc1_holds,
            "sc1_description": "EV = sum(1, 1, 1, ...) = infinity",
            "sc2_expected_utility": round(analytical_eu, 6),
            "sc2_certainty_equivalent_dollars": round(certainty_equivalent, 6),
            "sc2_rational_wtp_is_finite": sc2_holds,
            "overall_claim_holds": overall_claim_holds,
            "note": (
                "SC1 is DISPROVED (EV is infinite). SC2 is PROVED ($4 rational WTP "
                "under Bernoulli log utility). The claim's framing is wrong: "
                "the resolution is via expected *utility*, not expected *value*."
            ),
        },
        "generator": {
            "name": "proof-engine",
            "version": open(os.path.join(PROOF_ENGINE_ROOT, "VERSION")).read().strip(),
            "repo": "https://github.com/yaniv-golan/proof-engine",
            "generated_at": date.today().isoformat(),
        },
    }
    print("\n=== PROOF SUMMARY (JSON) ===")
    print(json.dumps(summary, indent=2, default=str))
Re-execute this proof
The verdict above is cached from when this proof was minted. To re-run the exact
proof.py shown in "View proof source" and see the verdict recomputed live,
launch it in your browser — no install required.
Re-execute from GitHub commit 1ba3732 — same bytes shown above.
First run takes longer while Binder builds the container image; subsequent runs are cached.