What the Freakiness of 2025 in AI Tells Us About 2026
Failed to add items
Add to basket failed.
Add to Wish List failed.
Remove from Wish List failed.
Follow podcast failed
Unfollow podcast failed
-
Narrated by:
-
By:
About this listen
It’s probably not possible to satisfactorily condense a 12 month’s worth of weird progress in AI, as well as predictions for the year to come, into one video. But I’m gonna try anyway because it has been a very strange time.
http://matsprogram.org/s26-aie
My new app! https://lmcouncil.ai
Patreon Interview: https://www.patreon.com/posts/robot-in-your-27-146376094
Chapters:
00:00 - Introduction
00:34 - Reasoning Models … and limits
02:54 - A playable world
03:36 - Realism
03:50 - AI Slop gone mainstream
05:03 - DolphinGemma
05:39 - Public Mood
07:34 - AI Enlisted
08:30 - GPT-5
11:05 - Open Weight not out
13:00 - METR Breakout
17:30 - VASA-1
18:28 - Lateral Productivity
20:15 - 1 or 1000 benchmarks needed?
24:54 - Continual Learning + Altman on Superintelligence
28:08 - Automated Information Discovery ft AlphaEvolve
Hassabis on Generality: https://x.com/demishassabis/status/2003097405026193809
https://www.youtube.com/watch?v=PqVbypvxDto
Gemini 3: https://storage.googleapis.com/gweb-uniblog-publish-prod/original_images/gemini_3_table_final_HLE_Tools_on.gif
Reasoning Trade-offs: https://arxiv.org/pdf/2504.13837
DolphinGemma: https://blog.google/technology/ai/dolphingemma/?s=09
Genie 3: https://deepmind.google/blog/genie-3-a-new-frontier-for-world-models/
METR Time Horizon: https://arxiv.org/pdf/2503.14499
https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/
Flaws: https://x.com/ShashwatGoel7/status/2002369517499105443
https://shash42.substack.com/p/how-to-game-the-metr-plot
https://x.com/METR_Evals/status/2002203627377574113
GPT-5 - Altman phd in everything: https://edition.cnn.com/2025/08/14/business/chatgpt-rollout-problems
https://simple-bench.com/
AI Slop: https://www.youtube.com/watch?v=I_3vxoJDD9k
https://www.theguardian.com/technology/2025/dec/16/boost-for-artists-in-ai-copyright-battle-as-only-3-per-cent-back-uk-active-opt-out-plan
Survey: https://x.com/SearchlightInst/status/2001057144842387920/photo/1
Nvidia Nemotron: https://x.com/percyliang/status/2000608134205985169
OpenAI Compute Flywheel: https://x.com/OpenAI/status/2001363007209914399/photo/1
Altman Interview: https://www.youtube.com/watch?v=2P27Ef-LLuQ
AI in Govt: https://x.com/jdcmedlock/status/1939814516503847259
Benchmark Gaming: https://techcrunch.com/2025/04/07/meta-exec-denies-the-company-artificially-boosted-llama-4s-benchmark-scores/
AlphaEvolve: https://deepmind.google/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/
https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/AlphaEvolve.pdf?utm_source=deepmind.google&utm_medium=referral&utm_campaign=gdm&utm_content=
Continual Learning: https://abehrouz.github.io/files/NL.pdf
Job Risk: https://archive.ph/20250708204527/https://www.axios.com/2025/05/28/ai-jobs-white-collar-unemployment-anthropic
GPT4o: https://x.com/AISafetyMemes/status/1916889492172013989
Vasa-1: https://www.microsoft.com/en-us/research/project/vasa-1/
Three Views: https://www.lesswrong.com/posts/K2D45BNxnZjdpSX2j/ai-timelines
Turing Test: https://x.com/tunguz/status/1907185471211422147
Karpathy Year in Review: https://karpathy.bearblog.dev/year-in-review-2025/
LLM Brainrot: https://arxiv.org/pdf/2510.13928
Lateral Productivity: https://www.aisi.gov.uk/frontier-ai-trends-report
Emotional Quotient: https://arxiv.org/pdf/2511.08394
Non-hype Newsletter: https://signaltonoise.beehiiv.com/
Podcast: https://aiexplainedopodcast.buzzsprout.com/
AI Insiders ($9!): https://www.patreon.com/AIExplained