"Presenting empirical spreadsheet observations as universal theorems violates the hypothetico-deductive method as defined by mainstream philosophy of science."
Three independent authoritative sources confirm that presenting spreadsheet observations as universal theorems omits requirements that the scientific method demands. The violation is structural, not a matter of opinion.
What Was Claimed?
The original claim was that "math washing" a spreadsheet — treating patterns found in data as universal theorems — is valid scientific practice. That is a normative judgment that cannot be directly proved or disproved. So the claim was re-framed as a factual question: does this practice violate the hypothetico-deductive method, the dominant framework in mainstream philosophy of science? If it does, then whatever else might be said about the practice, it fails to meet a well-established standard for scientific reasoning.
What Did We Find?
The hypothetico-deductive method has specific requirements, and three authoritative sources each identify a different one that this practice omits.
Encyclopaedia Britannica, summarizing Karl Popper's criterion of falsifiability, states that "a theory is genuinely scientific only if it is possible in principle to establish that it is false." Presenting spreadsheet observations as universal theorems makes no provision for falsifiability. The observations are treated as self-evidently true rather than as claims that could, in principle, be refuted.
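To make the falsifiability point concrete, here is a minimal sketch. The spreadsheet rows, the candidate rule `y == 2 * x`, and the helper name `first_counterexample` are all hypothetical and do not come from the cited sources. The sketch only illustrates that a rule fitting every observed row remains a claim that a single new observation can refute.

```python
from typing import Callable, Iterable, Optional, Tuple

Row = Tuple[float, float]


def first_counterexample(
    rule: Callable[[float, float], bool], rows: Iterable[Row]
) -> Optional[Row]:
    """Return the first row that violates the rule, or None if every row fits."""
    for x, y in rows:
        if not rule(x, y):
            return (x, y)
    return None


def candidate_rule(x: float, y: float) -> bool:
    """Hypothetical pattern 'discovered' in the spreadsheet: y is twice x."""
    return y == 2 * x


# Hypothetical spreadsheet rows (x, y); each one happens to satisfy the rule.
observed_rows = [(1, 2), (2, 4), (3, 6)]

# No counterexample among the rows seen so far ...
fits_observed = first_counterexample(candidate_rule, observed_rows) is None

# ... but one new observation is enough to refute the universal claim.
new_rows = observed_rows + [(4, 9)]
refutation = first_counterexample(candidate_rule, new_rows)

print(fits_observed)  # True
print(refutation)     # (4, 9)
```

Treating the rule as a theorem amounts to never running the second check.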
The Stanford Encyclopedia of Philosophy, in its article on scientific method, states that "scientific method requires a logic as a system of reasoning for properly arranging, but also inferring beyond, what is known by observation." Observation alone is not enough. The practice of presenting data patterns as theorems treats observation as sufficient, skipping the step of forming hypotheses and deriving testable predictions from them.
The Catalog of Bias, maintained by the Centre for Evidence-Based Medicine at the University of Oxford, defines data-dredging as "a distortion that arises from presenting the results of unplanned statistical tests as if they were a fully prespecified course of analyses." This is precisely the pattern: unplanned pattern-finding in a spreadsheet is presented as if it were a hypothesis-driven, prespecified result.
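The data-dredging pattern can be illustrated with a small simulation. This is a sketch under stated assumptions: the "spreadsheet" is 20 columns of pure random noise, the names `pearson_r` and `dredge` are hypothetical, and the cutoff 0.632 is roughly the two-sided 5% critical value of Pearson's r for 10 rows. Because the columns are unrelated by construction, any pair that clears the cutoff is a chance pattern, not a law.

```python
import math
import random
from itertools import combinations


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def dredge(columns, threshold):
    """Test every column pair and keep those whose |r| exceeds the threshold."""
    hits = []
    for (i, a), (j, b) in combinations(enumerate(columns), 2):
        r = pearson_r(a, b)
        if abs(r) > threshold:
            hits.append((i, j, r))
    return hits


rng = random.Random(0)  # fixed seed so the sketch is repeatable
N_COLUMNS, N_ROWS = 20, 10
THRESHOLD = 0.632

# Pure noise: by construction, no column is related to any other.
columns = [[rng.gauss(0, 1) for _ in range(N_ROWS)] for _ in range(N_COLUMNS)]

n_pairs = N_COLUMNS * (N_COLUMNS - 1) // 2  # number of unplanned tests
hits = dredge(columns, THRESHOLD)
print(f"{len(hits)} of {n_pairs} pairs clear the cutoff in pure noise")
```

With 190 unplanned tests at a 5% cutoff, a handful of "significant" pairs is expected from noise alone, which is why reporting only the hits as if they had been prespecified is a distortion.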
Each source addresses a distinct failure mode — falsifiability, reasoning beyond observation, and pre-specified analysis — and all three were verified by fetching the original pages and confirming the exact quoted text.
The proof also searched for counter-arguments. The strongest candidate was Bacon's inductivism, which defends generalizing from observations. But even Bacon's model requires systematic collection, replication, and bias elimination before generalizing — and no form of inductivism endorses calling data patterns "theorems," a term that implies deductive necessity. Exploratory Data Analysis was considered too, but Tukey's framework explicitly distinguishes hypothesis generation from hypothesis confirmation. No scientific domain was found that endorses presenting data patterns as universal theorems without further theoretical grounding.
What Should You Keep In Mind?
This proof is scoped to the hypothetico-deductive method specifically. It does not claim that the practice is invalid under every conceivable philosophy of science — only that it violates the dominant one. The term "universal theorem" is interpreted strictly: a claim of deductive necessity holding without exception, not a statistical regularity or empirical generalization.
The cited sources state general principles of the scientific method. None of them specifically names "spreadsheet observations presented as theorems." The connection between the general principles and the specific practice is an inference documented in the proof, not something stated directly by the sources.
One of the three sources — the Catalog of Bias — comes from a domain that is not classified in the credibility database, though it is maintained by Oxford's Centre for Evidence-Based Medicine. The proof's conclusion does not depend solely on this source.
How Was This Verified?
This claim was evaluated by checking three independent authoritative sources on the requirements of the hypothetico-deductive method, then testing whether the practice of presenting empirical spreadsheet observations as universal theorems omits those requirements. All three citations were verified by live-fetching the source URLs and confirming the quoted text. Full details are in the structured proof report and the full verification audit. You can also re-run the proof yourself.
What could challenge this verdict?
Three adversarial checks were performed before writing this proof:
1. Is there a scientific tradition that validates presenting inductive generalizations from data as universal laws without further testing? Searched for defenses of inductivism and found Bacon's inductivism as the strongest candidate. Even Bacon's model requires systematic collection, replication, and elimination of observer bias before generalizing. Naive inductivism has been largely discredited in philosophy of science (Popper 1934, Hempel 1965). More importantly, no form of inductivism endorses presenting patterns as universal "theorems" (a term implying deductive necessity) rather than empirical generalizations. This check does not break the proof but limits the verdict's scope: the proof establishes violation of the HD method specifically, not all possible philosophies of science.
2. Does Exploratory Data Analysis (EDA) validate presenting spreadsheet patterns as scientific findings? Reviewed EDA methodology documentation and found that EDA (Tukey 1977) is an explicitly hypothesis-generating practice, not hypothesis-confirming. Tukey's framework produces candidate hypotheses for subsequent testing, not universal theorems. The EDA literature itself distinguishes pattern-finding from universal claims.
3. Could "math washing" be valid in limited empirical domains like actuarial science, empirical economics, or physics phenomenology? Searched for domain-specific practices. Empirical economics explicitly distinguishes between "stylized facts" (regularities observed in data) and economic laws or theorems. Kaldor (1961) introduced "stylized facts" precisely because observed patterns do not constitute universal theorems without theoretical grounding. Even in phenomenological physics, empirical regularities (e.g., Kepler's laws) were only elevated to scientific law status after being derived from deeper theoretical principles (Newton's mechanics). No domain endorses presenting data patterns as universal theorems directly.
None of these checks produce counter-evidence that breaks the proof.
Source: proof.py JSON summary
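The distinction between generating and confirming a hypothesis, raised in check 2, can be sketched as a split between exploration rows and held-out rows. Everything here is an illustrative assumption: the noise table, the function names `explore` and `confirm`, and the cutoff 0.444 (roughly the two-sided 5% critical value of Pearson's r for 20 rows). It is not taken from Tukey or from the proof engine.

```python
import math
import random
from itertools import combinations


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def explore(columns):
    """Exploration step: return the column pair with the largest |r|."""
    return max(
        combinations(range(len(columns)), 2),
        key=lambda pair: abs(pearson_r(columns[pair[0]], columns[pair[1]])),
    )


def confirm(columns, pair, threshold):
    """Confirmation step: re-test one prespecified pair on fresh rows."""
    i, j = pair
    r = pearson_r(columns[i], columns[j])
    return abs(r) > threshold, r


rng = random.Random(1)  # fixed seed so the sketch is repeatable
N_COLUMNS, N_ROWS = 8, 40
table = [[rng.gauss(0, 1) for _ in range(N_ROWS)] for _ in range(N_COLUMNS)]

half = N_ROWS // 2
exploration = [col[:half] for col in table]  # used only to find a candidate
holdout = [col[half:] for col in table]      # never looked at while exploring

candidate = explore(exploration)  # a hypothesis, not yet a finding
confirmed, r_holdout = confirm(holdout, candidate, threshold=0.444)
print(candidate, confirmed, round(r_holdout, 3))
```

The pair picked during exploration is only a candidate; whether it survives is decided by rows that played no part in choosing it.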
Sources
| Source | ID | Type | Verified |
|---|---|---|---|
| Encyclopaedia Britannica — criterion of falsifiability | B1 | Reference | Yes |
| Stanford Encyclopedia of Philosophy — scientific method | B2 | Academic | Yes |
| Catalog of Bias — data-dredging bias | B3 | Unclassified | Yes |
| Count of authoritative sources confirming HD method requirements that the practice omits | A1 | — | Computed |
detailed evidence
Evidence Summary
| ID | Fact | Verified |
|---|---|---|
| B1 | Britannica: Popper's falsifiability criterion — scientific theories must be falsifiable in principle | Yes |
| B2 | Stanford Encyclopedia of Philosophy: scientific method requires reasoning beyond observation | Yes |
| B3 | Catalog of Bias: presenting unplanned analyses as prespecified is a recognized methodological distortion | Yes |
| A1 | Count of authoritative sources confirming HD method requirements that the practice omits | Computed: 3 sources confirmed — meets threshold of 3 independent authorities |
Proof Logic
The proof establishes that presenting empirical spreadsheet observations as universal theorems violates the hypothetico-deductive method by showing that three independent authoritative sources each identify a distinct HD method requirement that this practice omits.
The n_confirmed value is derived from citation verification results, not hardcoded. Three sources are checked, and each counts toward the threshold of 3 only if its citation status is countable (verified or partial).
- B1: Encyclopaedia Britannica's article on Popper's criterion of falsifiability states that "a theory is genuinely scientific only if it is possible in principle to establish that it is false." Presenting empirical observations as universal theorems makes no provision for falsifiability: the observations are treated as self-evidently true rather than as claims that could in principle be shown false. This omits the falsifiability requirement of the HD method.
- B2: The Stanford Encyclopedia of Philosophy's article on scientific method states that "scientific method requires a logic as a system of reasoning for properly arranging, but also inferring beyond, what is known by observation." Presenting spreadsheet observations as theorems treats observation as sufficient, omitting the requirement for reasoning beyond observation (hypothesis formation, deductive prediction, and testing).
- B3: The Catalog of Bias defines data-dredging as "a distortion that arises from presenting the results of unplanned statistical tests as if they were a fully prespecified course of analyses." Presenting spreadsheet observations as universal theorems follows exactly this pattern: unplanned pattern-finding is presented as if it were a prespecified, hypothesis-driven result.
Logical chain:
(1) Three independently verified sources each identify a distinct HD method requirement that the practice omits → (2) compare(n_confirmed=3, ">=", 3) → True → (3) The claim holds: the practice violates the HD method as defined by mainstream philosophy of science.
Source: author analysis
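Step (2) of the logical chain can be sketched in a few lines. This is an approximation under assumptions: `compare_sketch` is a hypothetical stand-in for the engine's `compare()` helper, whose implementation is not shown in this report, the result dictionary is hand-written to mirror the reported statuses, and the status string `"not_found"` is invented purely to show a failing case.

```python
import operator

# Hypothetical stand-in for the compare() helper that the proof imports.
_OPERATORS = {
    ">=": operator.ge,
    ">": operator.gt,
    "<=": operator.le,
    "<": operator.lt,
    "==": operator.eq,
}


def compare_sketch(value, op, threshold):
    """Evaluate `value <op> threshold` for a small set of comparison operators."""
    return _OPERATORS[op](value, threshold)


COUNTABLE_STATUSES = ("verified", "partial")


def count_confirmed(results):
    """Count sources whose citation status is countable."""
    return sum(1 for r in results.values() if r["status"] in COUNTABLE_STATUSES)


# Hand-written results mirroring the statuses reported in the audit log.
citation_results = {
    "source_britannica_popper": {"status": "verified"},
    "source_sep_scientific_method": {"status": "verified"},
    "source_catalog_of_bias": {"status": "verified"},
}

n_confirmed = count_confirmed(citation_results)
claim_holds = compare_sketch(n_confirmed, ">=", 3)
print(n_confirmed, claim_holds)  # 3 True

# Assumed failure case: if one citation could not be confirmed,
# the count drops below the threshold and the claim no longer holds.
degraded = dict(citation_results, source_catalog_of_bias={"status": "not_found"})
n_after_failure = count_confirmed(degraded)
claim_after_failure = compare_sketch(n_after_failure, ">=", 3)
print(n_after_failure, claim_after_failure)  # 2 False
```

Because the threshold equals the number of sources, the verdict has no slack: every citation must be countable for the claim to hold.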
Conclusion
Verdict: PROVED
Three independently verified authoritative sources — from encyclopedic philosophy (B1), academic philosophy (B2), and scientific methodology (B3) — each confirm a distinct requirement of the hypothetico-deductive method that the practice of presenting empirical spreadsheet observations as universal theorems structurally omits: falsifiability, reasoning beyond observation, and pre-specified hypotheses. All three citations were verified by live fetch with full quote match (3/3, threshold >= 3).
Note: One citation (B3, catalogofbias.org) comes from an unclassified domain (credibility tier 2). See Source Credibility Assessment in the audit trail. The Catalog of Bias is maintained by the Centre for Evidence-Based Medicine at the University of Oxford, but its domain is not classified in the credibility database. The proof's conclusion does not depend solely on B3 — the two higher-tier sources (B1 tier 3, B2 tier 4) independently establish the violation.
Note: The cited sources state general principles of the HD method, not specifically naming "spreadsheet observations presented as theorems." The proof relies on an author-reasoning bridge connecting these general principles to the specific practice. This entailment gap is documented in the Claim Interpretation section.
audit trail
All 3 citations verified.
Original audit log
B1: Encyclopaedia Britannica, criterion of falsifiability
- Status: verified
- Method: full_quote
- Fetch mode: live
- Coverage: 100% (full quote match)
- Impact: Establishes Popper's falsifiability criterion as a requirement of the HD method. The practice of presenting empirical observations as universal theorems omits falsifiability testing.
B2: Stanford Encyclopedia of Philosophy, scientific method
- Status: verified
- Method: full_quote
- Fetch mode: live
- Coverage: 100% (full quote match)
- Impact: Establishes that scientific method requires reasoning beyond observation. The practice of presenting empirical observations as theorems treats observation as sufficient, omitting the required inferential step.
B3: Catalog of Bias, data-dredging bias
- Status: verified
- Method: full_quote
- Fetch mode: live
- Coverage: 100% (full quote match)
- Impact: Identifies the specific methodological distortion of presenting unplanned analyses as prespecified, which is the pattern that "math washing" follows.
All three citations were fully verified. No "with unverified citations" qualifier applies.
Source: proof.py JSON summary; impact analysis is author analysis
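As a rough illustration of what a "full quote match" could involve, the sketch below normalizes whitespace and case before checking containment. This is only one plausible reading: the engine's actual `verify_all_citations` logic is not reproduced in this report, the helper names `normalize` and `full_quote_match` are hypothetical, and `page_text` is a stand-in string built around the quote with placeholder text, not fetched page content.

```python
import re


def normalize(text: str) -> str:
    """Collapse whitespace runs and lowercase, so line wrapping does not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def full_quote_match(quote: str, page_text: str) -> bool:
    """Return True if the normalized quote appears verbatim in the normalized page."""
    return normalize(quote) in normalize(page_text)


quote = (
    "a theory is genuinely scientific only if it is possible in principle to establish "
    "that it is false."
)

# Stand-in for fetched page text: the quote, re-wrapped, between placeholder sentences.
page_text = (
    "Placeholder lead-in text. A theory is genuinely scientific\n"
    "    only if it is possible in principle to establish\tthat it is false. "
    "Placeholder trailing text."
)

# A deliberately altered quote, to show that paraphrases do not pass.
altered_quote = quote.replace("false", "true")

print(full_quote_match(quote, page_text))          # True
print(full_quote_match(altered_quote, page_text))  # False
```

Exact containment after normalization is strict by design: it tolerates re-wrapping but rejects any change in wording.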
Claim Interpretation
| Field | Value |
|---|---|
| subject | Presenting empirical spreadsheet observations as universal theorems |
| property | violates the hypothetico-deductive method as defined by mainstream philosophy of science |
| operator | >= |
| threshold | 3 |
| proof_direction | prove |
| operator_note | The claim asserts a factual violation: the practice of presenting empirical spreadsheet observations as universal theorems omits steps that the hypothetico-deductive (HD) method requires. We count independent authoritative sources that define HD method requirements (falsifiability, reasoning beyond observation, pre-specified hypotheses) which this practice structurally omits. A threshold of 3 is used to require broad consensus across distinct philosophical and methodological traditions. Entailment note: the cited sources define general requirements of the scientific method / HD method. None specifically names 'spreadsheet observations presented as theorems.' The entailment bridge is: (1) the HD method requires steps X, Y, Z; (2) presenting observations as universal theorems without hypothesis formation, falsifiability testing, or pre-specified analysis omits X, Y, Z; therefore (3) the practice violates the HD method. This inference is logically valid but requires the author-reasoning bridge documented here. Formalization scope: 'universal theorem' is interpreted strictly — a claim of deductive necessity holding without exception, not a statistical regularity or empirical generalization. 'Violates' means the practice omits one or more requirements that the HD method mandates. The proof does not address whether the practice might be valid under non-HD frameworks (e.g., pure inductivism); adversarial check 1 addresses this limitation. |
Source: proof.py JSON summary
Note: This claim was re-framed from the original. The original claim was normative: "'Math washing' a spreadsheet (presenting empirical observations as universal theorems) is valid scientific practice." That claim could not be directly proved or disproved because "valid scientific practice" is a normative judgment. The re-framed claim is factual: it asks whether the practice violates the hypothetico-deductive method as defined by mainstream philosophy of science.
Natural language: Presenting empirical spreadsheet observations as universal theorems violates the hypothetico-deductive method as defined by mainstream philosophy of science.
Formal interpretation: The claim asserts a factual violation: the practice of presenting empirical spreadsheet observations as universal theorems omits steps that the hypothetico-deductive (HD) method requires. We count independent authoritative sources that define HD method requirements — falsifiability, reasoning beyond observation, pre-specified hypotheses — which this practice structurally omits. A threshold of 3 is used to require broad consensus across distinct philosophical and methodological traditions.
Entailment note: The cited sources define general requirements of the scientific method / HD method. None specifically names "spreadsheet observations presented as theorems." The entailment bridge is: (1) the HD method requires steps X, Y, Z; (2) presenting observations as universal theorems without hypothesis formation, falsifiability testing, or pre-specified analysis omits X, Y, Z; therefore (3) the practice violates the HD method. This inference is logically valid but requires the author-reasoning bridge documented here.
Formalization scope: "Universal theorem" is interpreted strictly — a claim of deductive necessity holding without exception, not a statistical regularity or empirical generalization. "Violates" means the practice omits one or more requirements that the HD method mandates. The proof does not address whether the practice might be valid under non-HD frameworks (e.g., pure inductivism); adversarial check 1 addresses this limitation.
Source: proof.py JSON summary
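The entailment bridge has the shape of a modus tollens, which can be written out explicitly. The symbols are assumptions introduced here for illustration: $p$ is the practice, $R$ is the set of HD requirements named in the proof (falsifiability, reasoning beyond observation, pre-specified hypotheses), $\mathrm{Sat}(p, r)$ means that $p$ satisfies requirement $r$, and $\mathrm{HD}(p)$ means that $p$ conforms to the HD method.

```latex
\begin{align*}
&\text{(1)}\quad \mathrm{HD}(p) \rightarrow \forall r \in R,\ \mathrm{Sat}(p, r)
  && \text{(supported by the cited sources)} \\
&\text{(2)}\quad \exists r \in R,\ \neg\,\mathrm{Sat}(p, r)
  && \text{(author-reasoning bridge)} \\
&\text{(3)}\quad \therefore\ \neg\,\mathrm{HD}(p)
  && \text{(modus tollens from (1) and (2))}
\end{align*}
```

The form is valid, so the conclusion is only as strong as premise (2), which is the step the sources do not state directly.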
Source Credibility Assessment
| Fact ID | Domain | Type | Tier | Note |
|---|---|---|---|---|
| B1 | britannica.com | reference | 3 | Established reference source |
| B2 | stanford.edu | academic | 4 | Academic domain (.edu) |
| B3 | catalogofbias.org | unknown | 2 | Unclassified domain — verify source authority manually |
Note on B3 (Tier 2): catalogofbias.org is the online home of the Catalog of Bias project, affiliated with the University of Oxford's Centre for Evidence-Based Medicine (CEBM). The domain is unclassified by the automated credibility system, but the project is an established academic resource in evidence-based medicine. The conclusion does not depend solely on B3 — B1 (Tier 3) and B2 (Tier 4) independently support the proof.
Source: proof.py JSON summary; tier-2 note is author analysis
Verifying citations...
[✓] source_britannica_popper: Full quote verified (source: tier 3/reference)
[✓] source_sep_scientific_method: Full quote verified (source: tier 4/academic)
[✓] source_catalog_of_bias: Full quote verified (source: tier 2/unknown)
Confirmed sources: 3 / 3
n_confirmed = 3
compare(3, '>=', 3) = True => claim_holds = True
Proof direction: prove — claim is PROVED
Source: proof.py inline output (execution trace)
| Cross-check | Values Compared | Agreement |
|---|---|---|
| Three independent authoritative sources from distinct traditions (encyclopedic philosophy, academic philosophy reference, medical/scientific methodology catalog) each confirm a different HD method requirement that the practice omits: falsifiability (B1), reasoning beyond observation (B2), pre-specified hypotheses (B3). | B1: verified, B2: verified, B3: verified | True |
Independence rationale: B1 is Encyclopaedia Britannica's article on Popper's falsifiability criterion (encyclopedic reference). B2 is the Stanford Encyclopedia of Philosophy's article on scientific method (academic reference, peer-reviewed). B3 is the Catalog of Bias's entry on data-dredging (maintained by the Centre for Evidence-Based Medicine at the University of Oxford). These are three independently authored and maintained sources from distinct intellectual traditions. Each addresses a different failure mode: falsifiability (B1), reasoning beyond observation (B2), pre-specified hypotheses (B3).
COI assessment: No conflict of interest flags identified. None of the three sources has a financial, institutional, or ideological stake in the specific question of "math washing" or spreadsheet-based claims.
Source: proof.py JSON summary; independence rationale and COI assessment are author analysis
Check 1: Is there a scientific tradition that validates presenting inductive generalizations from data as universal laws without further testing?
- Verification performed: Searched 'defense inductive reasoning empirical observations sufficient universal scientific laws' and 'Bacon inductivism valid science pattern observation'. Found inductivism (Bacon's model) as a candidate defense.
- Finding: Even Bacon's inductivism, the strongest defense of inductive science, requires systematic collection, replication, and elimination of observer bias before generalizing. Naive inductivism has been largely discredited in philosophy of science (Popper, 1934; Hempel, 1965). More importantly, no form of inductivism endorses presenting patterns as universal 'theorems' (a term implying deductive necessity) rather than empirical generalizations. This check does not break the proof but limits the verdict's scope: the proof establishes violation of the HD method specifically, not all possible philosophies of science.
- Breaks proof: No
Check 2: Does Exploratory Data Analysis (EDA) validate presenting spreadsheet patterns as scientific findings?
- Verification performed: Searched 'Tukey exploratory data analysis purpose hypothesis generation not confirmation'. Reviewed EDA methodology documentation.
- Finding: EDA (Tukey 1977) is an explicitly hypothesis-generating practice, not hypothesis-confirming. Tukey's framework is designed to produce candidate hypotheses for subsequent testing, not to generate universal theorems. This supports the proof: the EDA literature itself distinguishes pattern-finding from universal claims.
- Breaks proof: No
Check 3: Could 'math washing' be valid in limited empirical domains like actuarial science, empirical economics, or physics phenomenology?
- Verification performed: Searched 'stylized facts empirical economics vs universal law', 'actuarial science empirical observation universal theorem'. Reviewed terminology used in empirical economic methodology.
- Finding: Empirical economics explicitly distinguishes between 'stylized facts' (regularities observed in data) and 'economic laws' or theorems. Kaldor (1961) introduced 'stylized facts' precisely because observed patterns in data do NOT constitute universal theorems without theoretical grounding. Even in phenomenological physics, empirical regularities (e.g., Kepler's laws) were only elevated to scientific law status after being derived from deeper theoretical principles (Newton's mechanics). No domain endorses presenting data patterns as universal theorems directly.
- Breaks proof: No
Source: proof.py JSON summary
| Rule | Status | Notes |
|---|---|---|
| Rule 1: Every empirical value parsed from quote text, not hand-typed | N/A — qualitative proof; no numeric values extracted from quotes | Proof is based on citation verification status, not numeric extraction |
| Rule 2: Every citation URL fetched and quote checked | PASS | All 3 citations verified via live fetch (B1: full_quote, B2: full_quote, B3: full_quote) |
| Rule 3: System time used for date-dependent logic | N/A — no time-dependent computation | Proof generates date via date.today() for the generator block only |
| Rule 4: Claim interpretation explicit with operator rationale | PASS | CLAIM_FORMAL includes operator_note explaining entailment bridge, formalization scope, threshold rationale, and proof direction |
| Rule 5: Adversarial checks searched for independent counter-evidence | PASS | Three adversarial checks covering inductivism defense, EDA methodology, and domain-specific practices |
| Rule 6: Cross-checks used independently sourced inputs | PASS | Three independently authored and maintained sources from distinct intellectual traditions, all verified; COI assessment performed |
| Rule 7: Constants and formulas imported from computations.py, not hand-coded | PASS | compare() imported from scripts/computations.py; no hard-coded constants |
Source: author analysis based on proof.py structure and execution results
| Fact ID | Extracted Value | Value in Quote | Quote Snippet |
|---|---|---|---|
| B1 | verified | Yes | a theory is genuinely scientific only if it is possible in principle to establis |
| B2 | verified | Yes | In addition to careful observation, then, scientific method requires a logic as |
| B3 | verified | Yes | A distortion that arises from presenting the results of unplanned statistical te |
For this qualitative/consensus proof, the extractions field records citation verification status per source rather than numeric values. "Value in Quote" indicates whether the citation was countable (verified or partial).
Source: proof.py JSON summary; extraction method note is author analysis
Cite this proof
Proof Engine. (2026). Claim Verification: “Presenting empirical spreadsheet observations as universal theorems violates the hypothetico-deductive method as defined by mainstream philosophy of science.” — Proved. https://proofengine.info/proofs/math-washing-a-spreadsheet-presenting-empirical-observations-as-universal/
Proof Engine. "Claim Verification: “Presenting empirical spreadsheet observations as universal theorems violates the hypothetico-deductive method as defined by mainstream philosophy of science.” — Proved." 2026. https://proofengine.info/proofs/math-washing-a-spreadsheet-presenting-empirical-observations-as-universal/.
@misc{proofengine_math_washing_a_spreadsheet_presenting_empirical_observations_as_universal,
title = {Claim Verification: “Presenting empirical spreadsheet observations as universal theorems violates the hypothetico-deductive method as defined by mainstream philosophy of science.” — Proved},
author = {{Proof Engine}},
year = {2026},
url = {https://proofengine.info/proofs/math-washing-a-spreadsheet-presenting-empirical-observations-as-universal/},
note = {Verdict: PROVED. Generated by proof-engine v1.8.0},
}
TY  - DATA
TI  - Claim Verification: “Presenting empirical spreadsheet observations as universal theorems violates the hypothetico-deductive method as defined by mainstream philosophy of science.” — Proved
AU  - Proof Engine
PY  - 2026
UR  - https://proofengine.info/proofs/math-washing-a-spreadsheet-presenting-empirical-observations-as-universal/
N1  - Verdict: PROVED. Generated by proof-engine v1.8.0
ER  -
View proof source
This is the proof.py that produced the verdict above. Every fact traces to code below. (This proof has not yet been minted to Zenodo; the source here is the working copy from this repository.)
"""
Proof: Presenting empirical spreadsheet observations as universal theorems violates
the hypothetico-deductive method as defined by mainstream philosophy of science.
Generated: 2026-04-07
Proof direction: PROVE — three independent authoritative sources confirm the
hypothetico-deductive method requires steps (falsifiability, reasoning beyond
observation, pre-specified hypotheses) that this practice omits.
"""
import json
import os
import sys
from datetime import date
PROOF_ENGINE_ROOT = os.environ.get("PROOF_ENGINE_ROOT")
if not PROOF_ENGINE_ROOT:
    # Walk up from this file until a directory containing the skill scripts is found.
    _d = os.path.dirname(os.path.abspath(__file__))
    while _d != os.path.dirname(_d):
        if os.path.isdir(os.path.join(_d, "proof-engine", "skills", "proof-engine", "scripts")):
            PROOF_ENGINE_ROOT = os.path.join(_d, "proof-engine", "skills", "proof-engine")
            break
        _d = os.path.dirname(_d)
    if not PROOF_ENGINE_ROOT:
        raise RuntimeError("PROOF_ENGINE_ROOT not set and skill dir not found via walk-up from proof.py")
sys.path.insert(0, PROOF_ENGINE_ROOT)
from scripts.verify_citations import verify_all_citations, build_citation_detail
from scripts.computations import compare
# 1. CLAIM INTERPRETATION (Rule 4)
CLAIM_NATURAL = (
    "Presenting empirical spreadsheet observations as universal theorems violates "
    "the hypothetico-deductive method as defined by mainstream philosophy of science."
)
CLAIM_FORMAL = {
    "subject": "Presenting empirical spreadsheet observations as universal theorems",
    "property": (
        "violates the hypothetico-deductive method as defined by mainstream "
        "philosophy of science"
    ),
    "operator": ">=",
    "operator_note": (
        "The claim asserts a factual violation: the practice of presenting empirical "
        "spreadsheet observations as universal theorems omits steps that the "
        "hypothetico-deductive (HD) method requires. We count independent authoritative "
        "sources that define HD method requirements (falsifiability, reasoning beyond "
        "observation, pre-specified hypotheses) which this practice structurally omits. "
        "A threshold of 3 is used to require broad consensus across distinct philosophical "
        "and methodological traditions. "
        "Entailment note: the cited sources define general requirements of the scientific "
        "method / HD method. None specifically names 'spreadsheet observations presented "
        "as theorems.' The entailment bridge is: (1) the HD method requires steps X, Y, Z; "
        "(2) presenting observations as universal theorems without hypothesis formation, "
        "falsifiability testing, or pre-specified analysis omits X, Y, Z; therefore "
        "(3) the practice violates the HD method. This inference is logically valid but "
        "requires the author-reasoning bridge documented here. "
        "Formalization scope: 'universal theorem' is interpreted strictly — a claim of "
        "deductive necessity holding without exception, not a statistical regularity or "
        "empirical generalization. 'Violates' means the practice omits one or more "
        "requirements that the HD method mandates. The proof does not address whether "
        "the practice might be valid under non-HD frameworks (e.g., pure inductivism); "
        "adversarial check 1 addresses this limitation."
    ),
    "threshold": 3,
    "proof_direction": "prove",
}
# 2. FACT REGISTRY
FACT_REGISTRY = {
    "B1": {
        "key": "source_britannica_popper",
        "label": (
            "Britannica: Popper's falsifiability criterion — scientific theories "
            "must be falsifiable in principle"
        ),
    },
    "B2": {
        "key": "source_sep_scientific_method",
        "label": (
            "Stanford Encyclopedia of Philosophy: scientific method requires "
            "reasoning beyond observation"
        ),
    },
    "B3": {
        "key": "source_catalog_of_bias",
        "label": (
            "Catalog of Bias: presenting unplanned analyses as prespecified "
            "is a recognized methodological distortion"
        ),
    },
    "A1": {
        "label": (
            "Count of authoritative sources confirming HD method requirements "
            "that the practice omits"
        ),
        "method": None,
        "result": None,
    },
}
# 3. EMPIRICAL FACTS
# Sources that define HD method requirements the practice omits.
empirical_facts = {
    "source_britannica_popper": {
        "quote": (
            "a theory is genuinely scientific only if it is possible in principle to establish "
            "that it is false."
        ),
        "url": "https://www.britannica.com/topic/criterion-of-falsifiability",
        "source_name": "Encyclopaedia Britannica — criterion of falsifiability",
    },
    "source_sep_scientific_method": {
        "quote": (
            "In addition to careful observation, then, scientific method requires a logic as a "
            "system of reasoning for properly arranging, but also inferring beyond, what is known "
            "by observation."
        ),
        "url": "https://plato.stanford.edu/entries/scientific-method/",
        "source_name": "Stanford Encyclopedia of Philosophy — scientific method",
    },
    "source_catalog_of_bias": {
        "quote": (
            "A distortion that arises from presenting the results of unplanned statistical tests "
            "as if they were a fully prespecified course of analyses."
        ),
        "url": "https://catalogofbias.org/biases/data-dredging-bias/",
        "source_name": "Catalog of Bias — data-dredging bias",
    },
}
# 4. CITATION VERIFICATION (Rule 2)
citation_results = verify_all_citations(empirical_facts, wayback_fallback=True)
# 5. COUNT SOURCES WITH VERIFIED CITATIONS
COUNTABLE_STATUSES = ("verified", "partial")
n_confirmed = sum(
    1 for key in empirical_facts
    if citation_results[key]["status"] in COUNTABLE_STATUSES
)
print(f" Confirmed sources: {n_confirmed} / {len(empirical_facts)}")
# 6. CROSS-CHECK (Rule 6)
b1_confirmed = citation_results.get("source_britannica_popper", {}).get("status") in COUNTABLE_STATUSES
b2_confirmed = citation_results.get("source_sep_scientific_method", {}).get("status") in COUNTABLE_STATUSES
b3_confirmed = citation_results.get("source_catalog_of_bias", {}).get("status") in COUNTABLE_STATUSES
cross_check_agreement = b1_confirmed and b2_confirmed and b3_confirmed
# 7. CLAIM EVALUATION — MUST use compare(), never hardcode claim_holds (Rule 7)
claim_holds = compare(
    n_confirmed,
    CLAIM_FORMAL["operator"],
    CLAIM_FORMAL["threshold"],
    label="verified source count vs proof threshold",
)
# 8. SYSTEM TIME (Rule 3)
PROOF_GENERATION_DATE = date(2026, 4, 7)
today = date.today()
if today == PROOF_GENERATION_DATE:
    date_note = "System date matches proof generation date."
else:
    date_note = f"Proof generated on {PROOF_GENERATION_DATE}; running on {today}."
# 9. ADVERSARIAL CHECKS (Rule 5)
adversarial_checks = [
    {
        "question": (
            "Is there a scientific tradition that validates presenting inductive "
            "generalizations from data as universal laws without further testing?"
        ),
        "verification_performed": (
            "Searched 'defense inductive reasoning empirical observations sufficient "
            "universal scientific laws' and 'Bacon inductivism valid science pattern "
            "observation'. Found inductivism (Bacon's model) as a candidate defense."
        ),
        "finding": (
            "Even Bacon's inductivism — the strongest defense of inductive science — "
            "requires systematic collection, replication, and elimination of observer "
            "bias before generalizing. Naive inductivism has been largely discredited "
            "in philosophy of science (Popper, 1934; Hempel, 1965). More importantly, "
            "no form of inductivism endorses presenting patterns as universal 'theorems' "
            "(a term implying deductive necessity) rather than empirical generalizations. "
            "This check does not break the proof but limits the verdict's scope: "
            "the proof establishes violation of the HD method specifically, not all "
            "possible philosophies of science."
        ),
        "breaks_proof": False,
    },
    {
        "question": (
            "Does Exploratory Data Analysis (EDA) validate presenting spreadsheet "
            "patterns as scientific findings?"
        ),
        "verification_performed": (
            "Searched 'Tukey exploratory data analysis purpose hypothesis generation "
            "not confirmation'. Reviewed EDA methodology documentation."
        ),
        "finding": (
            "EDA (Tukey 1977) is an explicitly hypothesis-generating practice, not "
            "hypothesis-confirming. Tukey's framework is designed to produce candidate "
            "hypotheses for subsequent testing, not to generate universal theorems. "
            "This supports the proof: the EDA literature itself distinguishes "
            "pattern-finding from universal claims."
        ),
        "breaks_proof": False,
    },
    {
        "question": (
            "Could 'math washing' be valid in limited empirical domains like actuarial "
            "science, empirical economics, or physics phenomenology?"
        ),
        "verification_performed": (
            "Searched 'stylized facts empirical economics vs universal law', "
            "'actuarial science empirical observation universal theorem'. Reviewed "
            "terminology used in empirical economic methodology."
        ),
        "finding": (
            "Empirical economics explicitly distinguishes between 'stylized facts' "
            "(regularities observed in data) and 'economic laws' or theorems. "
            "Kaldor (1961) introduced 'stylized facts' precisely because observed "
            "patterns in data do NOT constitute universal theorems without theoretical "
            "grounding. Even in phenomenological physics, empirical regularities "
            "(e.g., Kepler's laws) were only elevated to scientific law status after "
            "being derived from deeper theoretical principles (Newton's mechanics). "
            "No domain endorses presenting data patterns as universal theorems directly."
        ),
        "breaks_proof": False,
    },
]
# 10. VERDICT AND STRUCTURED OUTPUT
if __name__ == "__main__":
    # "partial" citations count toward the threshold (see COUNTABLE_STATUSES)
    # but still downgrade the verdict, because only "verified" is treated as clean.
    any_unverified = any(
        cr["status"] != "verified" for cr in citation_results.values()
    )
    any_breaks = any(ac.get("breaks_proof") for ac in adversarial_checks)

    if any_breaks:
        verdict = "UNDETERMINED"
    elif claim_holds and not any_unverified:
        verdict = "PROVED"
    elif claim_holds and any_unverified:
        verdict = "PROVED (with unverified citations)"
    else:
        # The comparison failed: the claim is neither proved nor disproved.
        verdict = "UNDETERMINED"

    FACT_REGISTRY["A1"]["method"] = f"count(verified citations) = {n_confirmed}"
    FACT_REGISTRY["A1"]["result"] = (
        f"{n_confirmed} sources confirmed (threshold: {CLAIM_FORMAL['threshold']})"
    )
    citation_detail = build_citation_detail(FACT_REGISTRY, citation_results, empirical_facts)

    # Record the citation status for each empirical ("B") fact.
    extractions = {}
    for fid, info in FACT_REGISTRY.items():
        if not fid.startswith("B"):
            continue
        ef_key = info["key"]
        cr = citation_results.get(ef_key, {})
        extractions[fid] = {
            "value": cr.get("status", "unknown"),
            "value_in_quote": cr.get("status") in COUNTABLE_STATUSES,
            "quote_snippet": empirical_facts[ef_key]["quote"][:80],
        }

    # Read the engine version with a context manager so the file handle is closed.
    with open(os.path.join(PROOF_ENGINE_ROOT, "VERSION")) as version_file:
        engine_version = version_file.read().strip()

    summary = {
        "fact_registry": {
            fid: dict(info)  # shallow copy of each registry entry
            for fid, info in FACT_REGISTRY.items()
        },
        "claim_formal": CLAIM_FORMAL,
        "claim_natural": CLAIM_NATURAL,
        "citations": citation_detail,
        "extractions": extractions,
        "cross_checks": [
            {
                "description": (
                    "Three independent authoritative sources from distinct traditions "
                    "(encyclopedic philosophy, academic philosophy reference, medical/scientific "
                    "methodology catalog) each confirm a different HD method requirement that "
                    "the practice omits: falsifiability (B1), reasoning beyond observation (B2), "
                    "pre-specified hypotheses (B3)."
                ),
                "values_compared": [
                    citation_results.get("source_britannica_popper", {}).get("status", "unknown"),
                    citation_results.get("source_sep_scientific_method", {}).get("status", "unknown"),
                    citation_results.get("source_catalog_of_bias", {}).get("status", "unknown"),
                ],
                "agreement": cross_check_agreement,
                "coi_flags": [],
            }
        ],
        "adversarial_checks": adversarial_checks,
        "verdict": verdict,
        "key_results": {
            "n_confirmed": n_confirmed,
            "threshold": CLAIM_FORMAL["threshold"],
            "operator": CLAIM_FORMAL["operator"],
            "claim_holds": claim_holds,
            "proof_direction": "prove",
            "any_unverified_citations": any_unverified,
            "date_note": date_note,
        },
        "generator": {
            "name": "proof-engine",
            "version": engine_version,
            "repo": "https://github.com/yaniv-golan/proof-engine",
            "generated_at": date.today().isoformat(),
        },
    }
    print("\n=== PROOF SUMMARY (JSON) ===")
    print(json.dumps(summary, indent=2, default=str))
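The verdict branches in the proof source can be restated as a small pure function to make the mapping from citation statuses to verdicts easier to see. This is only a sketch under stated assumptions: it assumes the operator in `CLAIM_FORMAL` is `>=` (the operator and threshold are defined outside this excerpt), `decide_verdict` is an illustrative name that is not part of proof-engine, and `"not_found"` is a hypothetical stand-in for any status outside `COUNTABLE_STATUSES`.

```python
COUNTABLE_STATUSES = ("verified", "partial")


def decide_verdict(statuses, threshold, breaks_proof_flags):
    """Illustrative restatement of the verdict branches (assumes operator '>=')."""
    # Both "verified" and "partial" citations count toward the threshold.
    n_confirmed = sum(1 for s in statuses if s in COUNTABLE_STATUSES)
    # Assumption: CLAIM_FORMAL["operator"] is ">="; the real value is not shown here.
    claim_holds = n_confirmed >= threshold
    # Anything other than "verified" downgrades a positive verdict.
    any_unverified = any(s != "verified" for s in statuses)

    if any(breaks_proof_flags) or not claim_holds:
        return "UNDETERMINED"
    if any_unverified:
        return "PROVED (with unverified citations)"
    return "PROVED"


if __name__ == "__main__":
    no_breaks = [False, False, False]
    print(decide_verdict(["verified", "verified", "verified"], 3, no_breaks))
    print(decide_verdict(["verified", "partial", "verified"], 3, no_breaks))
    print(decide_verdict(["verified", "not_found", "verified"], 3, no_breaks))
```

Under these assumptions, a `partial` citation still counts toward the threshold but prevents a clean `PROVED`, and a single adversarial check with `breaks_proof` set to `True` forces `UNDETERMINED` regardless of the count.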
Re-execute this proof
The verdict above is cached from when this proof was minted. To re-run the exact
proof.py shown in "View proof source" and see the verdict recomputed live,
launch it in your browser — no install required.
Re-execute from GitHub commit 1ba3732 — same bytes shown above.
First run takes longer while Binder builds the container image; subsequent runs are cached.
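For readers who re-run the script and capture its standard output, the structured result can be recovered from the text that follows the `=== PROOF SUMMARY (JSON) ===` marker printed by the script. The sketch below is one possible way to do that: `extract_summary` is a hypothetical helper name, the sample text is made up for illustration and is not output from a real run, and the approach assumes nothing is printed after the JSON document.

```python
import json

SUMMARY_MARKER = "=== PROOF SUMMARY (JSON) ==="


def extract_summary(stdout_text: str) -> dict:
    """Return the JSON summary that follows the marker in captured stdout."""
    _, marker, tail = stdout_text.partition(SUMMARY_MARKER)
    if not marker:
        raise ValueError("summary marker not found in output")
    # Assumption: the JSON document is the last thing the script prints.
    return json.loads(tail)


if __name__ == "__main__":
    # Made-up sample text for illustration only, not a real run of proof.py.
    sample = (
        " Confirmed sources: 3 / 3\n"
        "\n=== PROOF SUMMARY (JSON) ===\n"
        + json.dumps({"verdict": "PROVED", "key_results": {"n_confirmed": 3}}, indent=2)
        + "\n"
    )
    summary = extract_summary(sample)
    print(summary["verdict"], summary["key_results"]["n_confirmed"])
```

Splitting on the marker keeps the helper independent of whatever progress lines the script prints before the summary.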