tag: ai
12 proofs
AI-generated code has fewer security vulnerabilities than typical human-written code.
Sources: Perry et al., ACM CCS 2023 (Stanford University), Help Net Security / Veracode 2025 GenAI Code Security Report, CodeRabbit State of AI vs Human Code Generation Report (December 2025) +1 more
AI hallucinations occur on fewer than 5% of factual questions.
Sources: IEEE Communications Society Technology Blog, AllAboutAI LLM Hallucination Test, Artificial Analysis AA-Omniscience Benchmark
AI progress in capabilities has largely plateaued since late 2024.
Sources: Epoch AI — AI capabilities progress has sped up, Epoch AI Substack — Frontier AI capabilities accelerated in 2024, LLM Stats — SWE-bench Verified Leaderboard +1 more
AI will replace over 50% of white-collar jobs by 2035.
Sources: Yale Budget Lab / Fortune (February 2026), Anthropic Labor Market Impacts Research (January 2026), J.P. Morgan Global Research — AI's Impact on Job Growth (2025) +1 more
Current AI systems have already achieved Artificial General Intelligence (AGI).
Sources: Google DeepMind (Morris et al., 2023) — Levels of AGI paper (arXiv:2311.02462), Gary Marcus — 'Rumors of AGI's arrival have been greatly exaggerated' (Substack), Cogni Down Under — 'AGI Still Years Away' analysis (Medium) +1 more
Current AI systems in 2026 have near-zero hallucinations and human-level reasoning across most domains.
Sources: Duke University Libraries Blog (January 2026), Vectara Hallucination Leaderboard Blog (2025), OpenAI SimpleQA benchmark paper (arXiv 2024) +3 more
Deepfake videos are now indistinguishable from real footage to the average human eye.
Sources: iScience (Deepfake detection with and without content warnings, N=1093), Cognitive Research: Principles and Implications (UF study, N=1901), Fortune (Prof. Siwei Lyu, UB Media Forensic Lab)
Let (X,Y) ~ p(x,y). For each contrastive training instance, draw the positive index i* uniformly from {1,...,N}, sample the positive Y_{i*} ~ p(y|X), and sample the N-1 negatives Y_j ~ p(y) iid for j != i*, independent of X. For any measurable scoring function s(x,y), define L_N(s) = - E[ log( exp(s(X,Y_{i*})) / sum_j exp(s(X,Y_j)) ) ]. Then log N - L_N(s) is a lower bound on I(X;Y) for every such s. The Bayes-optimal score is s*(x,y) = log(p(y|x)/p(y)) + c(x), where c(x) is arbitrary (it cancels in the softmax). Under this standard multi-sample setup, the resulting InfoNCE lower bound tightens as N increases, up to its ceiling of log N.
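A minimal numerical sketch of the bound, assuming a jointly Gaussian toy pair where Y = X + noise, so the Bayes-optimal score s* is a closed-form density log-ratio; the batch size, noise variance, and use of NumPy are illustrative assumptions, not part of the claim.

    # InfoNCE bound sketch: jointly Gaussian (X, Y) with the optimal critic
    # known in closed form. All concrete numbers are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    B, N = 4096, 64                # instances, candidates per instance
    s2 = 0.25                      # Var(Y|X); Var(Y) = 1 + s2

    X = rng.normal(size=(B, 1))
    Y_pos = X + np.sqrt(s2) * rng.normal(size=(B, 1))           # ~ p(y|x)
    X_new = rng.normal(size=(B, N - 1))                         # fresh X'
    Y_neg = X_new + np.sqrt(s2) * rng.normal(size=(B, N - 1))   # ~ p(y)
    Y = np.concatenate([Y_pos, Y_neg], axis=1)
    # Positive sits in column 0; the softmax is permutation-symmetric, so
    # fixing i* = 0 instead of drawing it uniformly loses no generality.

    def score(x, y):
        # Bayes-optimal critic s*(x,y) = log p(y|x) - log p(y), up to c(x)
        vy = 1.0 + s2
        return (-0.5 * (y - x) ** 2 / s2 + 0.5 * y ** 2 / vy
                + 0.5 * np.log(vy / s2))

    S = score(X, Y)
    m = S.max(axis=1, keepdims=True)         # stabilized log-sum-exp
    log_prob = S[:, 0] - (m[:, 0] + np.log(np.exp(S - m).sum(axis=1)))
    L_N = -log_prob.mean()
    print("log N - L_N:", np.log(N) - L_N)   # approx 0.80 nats
    print("true I(X;Y):", 0.5 * np.log((1.0 + s2) / s2))  # 0.5*ln(5)

With the optimal score the printed bound sits just below the true mutual information; since the estimate can never exceed log N (about 4.16 nats here), larger N is what lets the bound track larger MI values, which is the tightening the claim refers to.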
Consider a spike-train encoding model where spikes are generated by an inhomogeneous Poisson process with intensity lambda_t = f(eta_t), eta_t = x_t^T beta + h_t^T gamma + b, with a convex parameter space for theta = (beta, gamma, b); note that eta_t is affine in theta. If f is positive, convex, and log-concave, then the point-process log-likelihood, sum_i log f(eta_{t_i}) - integral f(eta_t) dt (sum over spike times t_i), is concave in theta: log f(eta) and -f(eta) are concave in eta, and concavity is preserved under composition with the affine map theta -> eta_t. Therefore every local maximum is global, maximum-likelihood fitting is a convex optimization problem, and the same holds for MAP inference under any log-concave prior on theta.
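A quick empirical check of the convexity consequence, assuming f = exp (positive, convex, and log-concave) and dropping the spike-history term h_t^T gamma for brevity; the simulation sizes, bin width, and use of scipy.optimize are illustrative assumptions.

    # Concavity check for a Poisson encoding model with f = exp.
    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(1)
    T, dt, d = 5000, 0.01, 3
    X = rng.normal(size=(T, d))                   # stimulus covariates x_t
    beta_true = np.array([1.0, -0.5, 0.3])
    y = rng.poisson(np.exp(X @ beta_true) * dt)   # binned spike counts

    def neg_log_lik(beta):
        eta = X @ beta
        # discretized point-process log-likelihood:
        #   sum_t [ y_t * log f(eta_t) - f(eta_t) * dt ],  with log f = eta
        return -(y @ eta - dt * np.exp(eta).sum())

    # A concave log-likelihood means every start reaches the same maximum.
    fits = [minimize(neg_log_lik, rng.normal(size=d)).x for _ in range(3)]
    print(np.round(fits, 3))                      # three agreeing estimates

The same multi-start check works for any f meeting the conditions (e.g. the softplus f(eta) = log(1 + e^eta) is also positive, convex, and log-concave), and a log-concave prior only adds another concave term to the objective.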
The pattern-matching limitations identified in GSM-NoOp are practically surmountable when LLMs are allowed to offload formal reasoning steps to code execution.
Sources: Mirzadeh et al., GSM-Symbolic (ICLR 2025), EmergentMind GSM-Symbolic Analysis, AppleInsider coverage of GSM-Symbolic research +5 more
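A sketch of what offloading looks like on the widely quoted GSM-NoOp kiwi example from Mirzadeh et al.: the model emits a small program in which only quantity-bearing clauses become variables, so the distractor clause has nowhere to act. The program below is our illustration, not code from the paper.

    # GSM-NoOp example (Mirzadeh et al.): Oliver picks 44 kiwis on Friday,
    # 58 on Saturday, and on Sunday double Friday's count, "but five of them
    # were a bit smaller than average." The NoOp clause changes no quantity,
    # so a program-of-thought solution simply never encodes it.
    friday = 44
    saturday = 58
    sunday = 2 * friday          # "double the number he picked on Friday"
    total = friday + saturday + sunday
    print(total)                 # 190, unaffected by the distractor clause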
Training and running today's frontier AI models consumes more electricity than entire small countries.
Sources: International Energy Agency (IEA), Energy and AI 2025 report, Executive Summary, U.S. Energy Information Administration (EIA), Electricity Use in Homes, WorldData.info: Nauru Energy Consumption +2 more
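An order-of-magnitude arithmetic sketch of the comparison this claim makes; both figures below are rough, commonly cited estimates standing in for the real numbers in the IEA and WorldData sources, and should be treated as assumptions.

    # Back-of-envelope scale check; both values are assumed rough estimates.
    training_run_gwh = 50.0    # often-cited estimate for one frontier training run
    nauru_annual_gwh = 35.0    # annual electricity use of a very small country
    print(training_run_gwh / nauru_annual_gwh)   # ~1.4x a country-year of power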
Using AI tools makes humans worse at critical thinking and original problem-solving.
Sources: PsyPost report on Gerlich (2025), Societies 15(1):6, Microsoft Research — Lee et al. (2025), CHI 2025, Harvard Gazette (2025) +1 more