
muckrAIkers

By: Jacob Haimes and Igor Krawczuk

About this listen

Join us as we dig a tiny bit deeper into the hype surrounding "AI" press releases, research papers, and more. In each episode, we highlight ongoing research and investigations, adding some much-needed contextualization, constructive critique, and even a smidge of good-willed teasing to the conversation, trying to find the meaning under all of this muck.

© Kairos.fm
Episodes
  • AI Safety for Who?
    Oct 13 2025
    Jacob and Igor argue that AI safety, as currently practiced, is hurting users rather than helping them. The techniques used to make chatbots "safe" and "aligned," such as instruction tuning and RLHF, anthropomorphize AI systems such that they exploit our instincts as social beings. At the same time, Big Tech companies push these systems for "wellness" while dodging healthcare liability, causing real harm today. We discuss what actual safety would look like, drawing on self-driving car regulations.

    • (00:00) - Introduction & AI Investment Insanity
    • (01:43) - The Problem with AI Safety
    • (08:16) - Anthropomorphizing AI & Its Dangers
    • (26:55) - Mental Health, Wellness, and AI
    • (39:15) - Censorship, Bias, and Dual Use
    • (44:42) - Solutions, Community Action & Final Thoughts

    Links

    AI Ethics & Philosophy

    • Foreign Affairs article - The Cost of the AGI Delusion
    • Nature article - Principles alone cannot guarantee ethical AI
    • Xeiaso blog post - Who Do Assistants Serve?
    • Argmin article - The Banal Evil of AI Safety
    • AI Panic News article - The Rationality Trap

    AI Model Bias, Failures, and Impacts

    • BBC news article - AI Image Generation Issues
    • The New York Times article - Google Gemini German Uniforms Controversy
    • The Verge article - Google Gemini's Embarrassing AI Pictures
    • NPR article - Grok, Elon Musk, and Antisemitic/Racist Content
    • AccelerAId blog post - How AI Nudges are Transforming Up- and Cross-Selling
    • AI Took My Job website

    AI Mental Health & Safety Concerns

    • Euronews article - AI Chatbot Tragedy
    • Popular Mechanics article - OpenAI and Psychosis
    • Psychology Today article - The Emerging Problem of AI Psychosis
    • Rolling Stone article - AI Spiritual Delusions Destroying Human Relationships
    • The New York Times article - AI Chatbots and Delusions

    Guidelines, Governance, and Censorship

    • Preprint - R1dacted: Investigating Local Censorship in DeepSeek's R1 Language Model
    • Minds & Machines article - The Ethics of AI Ethics: An Evaluation of Guidelines
    • SSRN paper - Instrument Choice in AI Governance
    • Anthropic announcement - Claude Gov Models for U.S. National Security Customers
    • Anthropic documentation - Claude's Constitution
    • Reuters investigation - Meta AI Chatbot Guidelines
    • Swiss Federal Council consultation - Swiss AI Consultation Procedures
    • Grok Prompts GitHub repo
    • Simon Willison blog post - Grok 4 Heavy
    50 mins
  • The Co-opting of Safety
    Aug 21 2025

    We dig into how the concept of AI "safety" has been co-opted and weaponized by tech companies. Starting with examples like Mecha-Hitler Grok, we explore how real safety engineering differs from AI "alignment," the myth of the alignment tax, and why this semantic confusion matters for actual safety.

    • (00:00) - Intro
    • (00:21) - Mecha-Hitler Grok
    • (10:07) - "Safety"
    • (19:40) - Under-specification
    • (53:56) - This time isn't different
    • (01:01:46) - Alignment Tax myth
    • (01:17:37) - Actually making AI safer

    Links
    • JMLR article - Underspecification Presents Challenges for Credibility in Modern Machine Learning
    • Trail of Bits paper - Towards Comprehensive Risk Assessments and Assurance of AI-Based Systems
    • SSRN paper - Uniqueness Bias: Why It Matters, How to Curb It

    Additional Referenced Papers

    • NeurIPS paper - Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?
    • ICML paper - AI Control: Improving Safety Despite Intentional Subversion
    • ICML paper - DarkBench: Benchmarking Dark Patterns in Large Language Models
    • OSF preprint - Current Real-World Use of Large Language Models for Mental Health
    • Anthropic preprint - Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

    Inciting Examples

    • Ars Technica article - US government agency drops Grok after MechaHitler backlash, report says
    • The Guardian article - Musk’s AI Grok bot rants about ‘white genocide’ in South Africa in unrelated chats
    • BBC article - Update that made ChatGPT 'dangerously' sycophantic pulled

    Other Sources

    • London Daily article - UK AI Safety Institute Rebrands as AI Security Institute to Focus on Crime and National Security
    • Vice article - Prominent AI Philosopher and ‘Father’ of Longtermism Sent Very Racist Email to a 90s Philosophy Listserv
    • LessWrong blogpost - "notkilleveryoneism" sounds dumb (see comments)
    • EA Forum blogpost - An Overview of the AI Safety Funding Situation
    • Book by Dmitry Chernov and Didier Sornette - Man-made Catastrophes and Risk Information Concealment
    • Euronews article - OpenAI adds mental health safeguards to ChatGPT, saying chatbot has fed into users’ ‘delusions’
    • Pleias website
    • Wikipedia page on Jaywalking
    1 hr and 24 mins
  • AI, Reasoning or Rambling?
    Jul 14 2025

    In this episode, we redefine AI's "reasoning" as mere rambling, exposing the "illusion of thinking" and "Potemkin understanding" in current models. We contrast the classical definition of reasoning (requiring logic and consistency) with Big Tech's new version, which reduces reasoning to a generic statement about information processing. We explain how Large Rambling Models generate extensive, often irrelevant, rambling traces that appear to improve benchmarks, largely due to best-of-N sampling and benchmark gaming.

    Words and definitions actually matter! Carelessness leads to misplaced investments and an overestimation of systems that are currently just surprisingly useful autocorrects.
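
    Why does best-of-N inflate scores? If each independent sample solves a task with probability p, then reporting the best of N samples succeeds with probability 1 - (1 - p)^N. Here is a minimal Python sketch of that effect (the 10% per-sample accuracy, N = 10, and trial count are illustrative assumptions, not figures from the episode):

    import random

    def solve_once(p_correct: float) -> bool:
        # One independent attempt; succeeds with probability p_correct.
        return random.random() < p_correct

    def best_of_n(p_correct: float, n: int) -> bool:
        # Best-of-N scoring: the task counts as solved if ANY sample succeeds.
        return any(solve_once(p_correct) for _ in range(n))

    # Illustrative assumptions: 10% per-sample accuracy, 10 samples per task.
    p, n, trials = 0.10, 10, 100_000
    hits = sum(best_of_n(p, n) for _ in range(trials))
    # Theory: 1 - (1 - 0.10)**10 ≈ 0.651
    print(f"per-sample accuracy: {p:.0%}, pass@{n} ≈ {hits / trials:.3f}")

    Under these assumptions, a model that gets a task right only 10% of the time per sample still clears it about 65% of the time under best-of-N scoring, so headline benchmark gains can reflect sampling budget rather than better reasoning.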

    • (00:00) - Intro
    • (00:40) - OBB update and Meta's talent acquisition
    • (03:09) - What are rambling models?
    • (04:25) - Definitions and polarization
    • (09:50) - Logic and consistency
    • (17:00) - Why does this matter?
    • (21:40) - More likely explanations
    • (35:05) - The "illusion of thinking" and task complexity
    • (39:07) - "Potemkin understanding" and surface-level recall
    • (50:00) - Benchmark gaming and best-of-n sampling
    • (55:40) - Costs and limitations
    • (58:24) - Claude's anecdote and the Vending Bench
    • (01:03:05) - Definitional switch and implications
    • (01:10:18) - Outro

    Links
    • Apple paper - The Illusion of Thinking
    • ICML 2025 paper - Potemkin Understanding in Large Language Models
    • Preprint - Large Language Monkeys: Scaling Inference Compute with Repeated Sampling

    Theoretical understanding

    • Max M. Schlereth manuscript - The limits of AGI part II
    • Preprint - (How) Do Reasoning Models Reason?
    • Preprint - A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers
    • NeurIPS 2024 paper - How Far Can Transformers Reason? The Globality Barrier and Inductive Scratchpad

    Empirical explanations

    • Preprint - How Do Large Language Monkeys Get Their Power (Laws)?
    • Andon Labs Preprint - Vending-Bench: A Benchmark for Long-Term Coherence of Autonomous Agents
    • LeapLab, Tsinghua University and Shanghai Jiao Tong University paper - Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
    • Preprint - RL in Name Only? Analyzing the Structural Assumptions in RL post-training for LLMs
    • Preprint - Mind The Gap: Deep Learning Doesn't Learn Deeply
    • Preprint - Measuring AI Ability to Complete Long Tasks
    • Preprint - GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models

    Other sources

    • Zuck's Haul webpage - Meta's talent acquisition tracker
      • Hacker News discussion - Opinions from the AI community
    • Interconnects blogpost - The rise of reasoning machines
    • Anthropic blog - Project Vend: Can Claude run a small shop?
    1 hr and 11 mins