tag: ai
12 proofs
AI-generated code has fewer security vulnerabilities than typical human-written code.
Sources: Perry et al., ACM CCS 2023 (Stanford University), Help Net Security / Veracode 2025 GenAI Code Security Report, CodeRabbit State of AI vs Human Code Generation Report (December 2025) +1 more
AI hallucinations occur on fewer than 5% of factual questions.
Sources: IEEE Communications Society Technology Blog, AllAboutAI LLM Hallucination Test, Artificial Analysis AA-Omniscience Benchmark
AI progress in capabilities has largely plateaued since late 2024.
Sources: Epoch AI — AI capabilities progress has sped up, Epoch AI Substack — Frontier AI capabilities accelerated in 2024, LLM Stats — SWE-bench Verified Leaderboard +1 more
AI will replace over 50% of white-collar jobs by 2035.
Sources: Yale Budget Lab / Fortune (February 2026), Anthropic Labor Market Impacts Research (January 2026), J.P. Morgan Global Research — AI's Impact on Job Growth (2025) +1 more
Current AI systems have already achieved Artificial General Intelligence (AGI).
Sources: Google DeepMind (Morris et al., 2023) — Levels of AGI paper (arXiv:2311.02462), Gary Marcus — 'Rumors of AGI's arrival have been greatly exaggerated' (Substack), Cogni Down Under — 'AGI Still Years Away' analysis (Medium) +1 more
Current AI systems in 2026 have near-zero hallucinations and human-level reasoning across most domains.
Sources: Duke University Libraries Blog (January 2026), Vectara Hallucination Leaderboard Blog (2025), OpenAI SimpleQA benchmark paper (arXiv 2024) +3 more
Deepfake videos are now indistinguishable from real footage to the average human eye.
Sources: iScience (Deepfake detection with and without content warnings, N=1093), Cognitive Research: Principles and Implications (UF study, N=1901), Fortune (Prof. Siwei Lyu, UB Media Forensic Lab)
Let (X,Y) ~ p(x,y). For each contrastive training instance, draw the positive index i* uniformly from {1,...,N}, sample the positive Y_{i*} ~ p(y|X), and sample the N-1 negatives Y_j ~ p(y) iid for j != i*, independent of X. For any measurable scoring function s(x,y), define L_N(s) = - E[ log( exp(s(X,Y_{i*})) / sum_j exp(s(X,Y_j)) ) ]. Then log N - L_N(s) is a lower bound on I(X;Y) for every such s. The Bayes-optimal score is s*(x,y) = log(p(y|x)/p(y)) + c(x), where c(x) is arbitrary (it cancels in the softmax). Under this standard multi-sample setup, the resulting InfoNCE lower bound tightens as N increases, up to its ceiling of log N.
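A minimal numerical sketch of the bound, assuming a jointly Gaussian toy pair where Y = X + noise, so the Bayes-optimal score s* is a closed-form density log-ratio; the batch size, noise variance, and use of NumPy are illustrative assumptions, not part of the claim.

    # InfoNCE bound sketch: jointly Gaussian (X, Y) with the optimal critic
    # known in closed form. All concrete numbers are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    B, N = 4096, 64                # instances, candidates per instance
    s2 = 0.25                      # Var(Y|X); Var(Y) = 1 + s2

    X = rng.normal(size=(B, 1))
    Y_pos = X + np.sqrt(s2) * rng.normal(size=(B, 1))           # ~ p(y|x)
    X_new = rng.normal(size=(B, N - 1))                         # fresh X'
    Y_neg = X_new + np.sqrt(s2) * rng.normal(size=(B, N - 1))   # ~ p(y)
    Y = np.concatenate([Y_pos, Y_neg], axis=1)
    # Positive sits in column 0; the softmax is permutation-symmetric, so
    # fixing i* = 0 instead of drawing it uniformly loses no generality.

    def score(x, y):
        # Bayes-optimal critic s*(x,y) = log p(y|x) - log p(y), up to c(x)
        vy = 1.0 + s2
        return (-0.5 * (y - x) ** 2 / s2 + 0.5 * y ** 2 / vy
                + 0.5 * np.log(vy / s2))

    S = score(X, Y)
    m = S.max(axis=1, keepdims=True)         # stabilized log-sum-exp
    log_prob = S[:, 0] - (m[:, 0] + np.log(np.exp(S - m).sum(axis=1)))
    L_N = -log_prob.mean()
    print("log N - L_N:", np.log(N) - L_N)   # approx 0.80 nats
    print("true I(X;Y):", 0.5 * np.log((1.0 + s2) / s2))  # 0.5*ln(5)

With the optimal score the printed bound sits just below the true mutual information; since the estimate can never exceed log N (about 4.16 nats here), larger N is what lets the bound track larger MI values, which is the tightening the claim refers to.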
Consider a spike-train encoding model where spikes are generated by an inhomogeneous Poisson process with intensity lambda_t = f(eta_t), eta_t = x_t^T beta + h_t^T gamma + b, with a convex parameter space for theta = (beta, gamma, b); note that eta_t is affine in theta. If f is positive, convex, and log-concave, then the point-process log-likelihood, sum_i log f(eta_{t_i}) - integral f(eta_t) dt (sum over spike times t_i), is concave in theta: log f(eta) and -f(eta) are concave in eta, and concavity is preserved under composition with the affine map theta -> eta_t. Therefore every local maximum is global, maximum-likelihood fitting is a convex optimization problem, and the same holds for MAP inference under any log-concave prior on theta.
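A quick empirical check of the convexity consequence, assuming f = exp (positive, convex, and log-concave) and dropping the spike-history term h_t^T gamma for brevity; the simulation sizes, bin width, and use of scipy.optimize are illustrative assumptions.

    # Concavity check for a Poisson encoding model with f = exp.
    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(1)
    T, dt, d = 5000, 0.01, 3
    X = rng.normal(size=(T, d))                   # stimulus covariates x_t
    beta_true = np.array([1.0, -0.5, 0.3])
    y = rng.poisson(np.exp(X @ beta_true) * dt)   # binned spike counts

    def neg_log_lik(beta):
        eta = X @ beta
        # discretized point-process log-likelihood:
        #   sum_t [ y_t * log f(eta_t) - f(eta_t) * dt ],  with log f = eta
        return -(y @ eta - dt * np.exp(eta).sum())

    # A concave log-likelihood means every start reaches the same maximum.
    fits = [minimize(neg_log_lik, rng.normal(size=d)).x for _ in range(3)]
    print(np.round(fits, 3))                      # three agreeing estimates

The same multi-start check works for any f meeting the conditions (e.g. the softplus f(eta) = log(1 + e^eta) is also positive, convex, and log-concave), and a log-concave prior only adds another concave term to the objective.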
The pattern-matching limitations identified in GSM-NoOp are practically surmountable when LLMs are allowed to offload formal reasoning steps to code execution.
Sources: Mirzadeh et al., GSM-Symbolic (ICLR 2025), EmergentMind GSM-Symbolic Analysis, AppleInsider coverage of GSM-Symbolic research +5 more
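A sketch of what offloading looks like on the widely quoted GSM-NoOp kiwi example from Mirzadeh et al.: the model emits a small program in which only quantity-bearing clauses become variables, so the distractor clause has nowhere to act. The program below is our illustration, not code from the paper.

    # GSM-NoOp example (Mirzadeh et al.): Oliver picks 44 kiwis on Friday,
    # 58 on Saturday, and on Sunday double Friday's count, "but five of them
    # were a bit smaller than average." The NoOp clause changes no quantity,
    # so a program-of-thought solution simply never encodes it.
    friday = 44
    saturday = 58
    sunday = 2 * friday          # "double the number he picked on Friday"
    total = friday + saturday + sunday
    print(total)                 # 190, unaffected by the distractor clause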
Training and running today's frontier AI models consumes more electricity than entire small countries.
Sources: International Energy Agency (IEA), Energy and AI 2025 report, Executive Summary, U.S. Energy Information Administration (EIA), Electricity Use in Homes, WorldData.info: Nauru Energy Consumption +2 more
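An order-of-magnitude arithmetic sketch of the comparison this claim makes; both figures below are rough, commonly cited estimates standing in for the real numbers in the IEA and WorldData sources, and should be treated as assumptions.

    # Back-of-envelope scale check; both values are assumed rough estimates.
    training_run_gwh = 50.0    # often-cited estimate for one frontier training run
    nauru_annual_gwh = 35.0    # annual electricity use of a very small country
    print(training_run_gwh / nauru_annual_gwh)   # ~1.4x a country-year of power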
Using AI tools makes humans worse at critical thinking and original problem-solving.
Sources: PsyPost report on Gerlich (2025), Societies 15(1):6, Microsoft Research — Lee et al. (2025), CHI 2025, Harvard Gazette (2025) +1 more