Dwarkesh Podcast

Episodes

David Reich – Why the Bronze Age was an inflection point in human evolution

May 8 2026

David Reich is back.
He and collaborator Ali Akbari just published a paper that overturns a long-standing consensus about human evolution: that natural selection has been dormant in our species since the agricultural revolution.
By scaling ancient DNA sequencing and developing a new statistical method, they found that selection has actually sped up.
Selection went especially bonkers during the Bronze Age (around 3,000 years ago).
That’s when gene frequencies for everything from immune function to body fat to intelligence were most in flux.
Over the last 10,000 years, selection pushed the genetic predictor of cognitive performance up by roughly a full standard deviation, most of it between 4,000 and 2,000 years ago.
After we finished recording, David sketched out on a whiteboard his new heretical model about who the Neanderthals really were. Luckily, I took out my iPhone and managed to record it.
He thinks the standard story (that Neanderthals are some separate archaic lineage we interbred with a little) just doesn’t fit the evidence. Instead, he proposes that Neanderthals are essentially genetically-swamped modern humans.
A small population somewhere around the Caucasus invented Middle Stone Age technology roughly 300,000 years ago and expanded outward. The ones that moved into Europe interbred with local archaic humans, got genetically swamped, and became Neanderthals. The same expansion went into Africa, met much more diverged archaic Africans, and that mixture became us.
This means Neanderthals and modern humans share the same cultural ancestry – the only difference is which archaic humans they mixed with afterward.
David is a brilliant and rigorous scholar. It was a real delight to learn from him again.
Watch on YouTube; read the transcript.
Sponsors
* Cursor was super useful as I prepped for this episode. Whenever I had a question, I’d have Cursor kick off a few different models simultaneously and then compare their responses. I found that this led to better results than I could get out of any individual LLM. If you’ve only used Cursor for coding, you should try using it for research. Check it out at cursor.com/dwarkesh
* Jane Street uses an internal currency called “hive bucks” to allocate compute through a real-time auction – and anyone can change anyone else’s bids or even kill their jobs! Everyone just trusts each other to act in the firm’s best interest, which is what lets the system work in the first place. If this weird and high-trust culture sounds like your kind of thing, Jane Street’s hiring at janestreet.com/dwarkesh
* Crusoe’s ML infra team built fastokens, an open-source tokenizer that delivers a ~9x speedup over Hugging Face and up to 40% faster time-to-first token – on real production workloads! Crusoe achieved these results by parallelizing things and using some clever engineering to handle duplicates without cross-thread coordination. Learn more at crusoe.ai/dwarkesh
Timestamps
(00:00:00) – Ancient DNA suggests strong selection over last 10,000 years
(00:15:45) – Natural selection intensified during the Bronze Age
(00:35:02) – Why didn’t evolution max out intelligence?
(00:57:21) – Evolution is limited by time, not population size
(01:09:02) – Why no farming before the Ice Age?
(01:17:13) – The Neanderthal puzzle David can’t stop thinking about
(01:54:10) – The methodology behind this breakthrough

Get full access to Dwarkesh Podcast at www.dwarkesh.com/subscribe

2 hrs and 13 mins

Reiner Pope – The math behind how LLMs are trained and served

Apr 29 2026

Did a very different format with Reiner Pope - a blackboard lecture where he walks through how frontier LLMs are trained and served.
It’s shocking how much you can deduce about what the labs are doing from a handful of equations, public API prices, and some chalk.
It’s a bit technical, but I encourage you to hang in there – it’s really worth it.
There are fewer than a handful of people who understand the full stack of AI, from chip design to model architecture, as well as Reiner does. It was a real delight to learn from him.
Recommend watching this one on YouTube so you can see the chalkboard.
Reiner is CEO of MatX, a new chip startup (full disclosure - I’m an angel investor). He was previously at Google, where he worked on software efficiency, compilers, and TPU architecture.
Download markdown of transcript here to chat with an LLM.
Wrote up some flashcards and practice problems to help myself retain what Reiner taught. Hope it's helpful to you too!
Sponsors
* Jane Street needs constant access to incredibly low-latency compute. I recently asked one of their engineers, Clark, to talk me through how they meet these demands. Our conversation—which touched on everything from FPGAs to liquid cooling—was extremely helpful as I prepped to interview Reiner. You can watch the full discussion and explore Jane Street’s open roles at janestreet.com/dwarkesh
* Google’s Gemma 4 is the first open model that’s let me shut off the internet and create a fully disconnected “focus machine”. This is because Gemma is small enough to run on my laptop, but powerful enough to actually be useful. So, to prep for this interview, I downloaded Reiner’s scaling book, disconnected from wifi, and used Gemma to help me break down the material. Check it out at goo.gle/Gemma4
* Cursor helped me turn some notes I took on how gradients flow during large-scale pretraining into a great animation. At first, I wasn’t sure of the best way to visualize the concept, but Cursor’s Composer 2 Fast model let me iterate on different ideas almost instantaneously. You can check out the animation in my recent blog post. And if you have something to visualize yourself, go to cursor.com/dwarkesh
Timestamps
(00:00:00) – How batch size affects token cost and speed
(00:32:09) – How MoE models are laid out across GPU racks
(00:47:12) – How pipeline parallelism spreads model layers across racks
(01:03:37) – Why Ilya said, “As we now know, pipelining is not wise.”
(01:18:59) – Because of RL, models may be 100x over-trained beyond Chinchilla-optimal
(01:33:02) – Deducing long context memory costs from API pricing
(02:04:02) – Convergent evolution between neural nets and cryptography

2 hrs and 14 mins

Jensen Huang – TPU competition, why we should sell chips to China, & Nvidia’s supply chain moat

Apr 15 2026

I asked Jensen about TPU competition, Nvidia’s lock on the ever more bottlenecked supply chain needed to make advanced chips, whether we should be selling AI chips to China, why Nvidia doesn’t just become a hyperscaler, how it makes its investments, and much more. Enjoy!
Watch on YouTube; read the transcript.
Sponsors
* Crusoe’s cloud runs on state-of-the-art Blackwell GPUs, with Vera Rubin deployment scheduled for later this year. But hardware is only part of the story—for inference, Crusoe’s MemoryAlloy tech implements a cluster-wide KV cache, delivering up to 10x faster TTFT and 5x better throughput than vLLM. Learn more at crusoe.ai/dwarkesh
* Cursor helped me build an AI co-researcher over the course of a weekend. Now I have an AI agent that I can collaborate with in Google Docs via inline comment threads! And while other agentic coding tools feel like a total black-box, Cursor let me stay on top of the full implementation. You can try my co-researcher out at github.com/dwarkeshsp/ai_coworker, or get started on your own Cursor project today at cursor.com/dwarkesh
* Jane Street spent ~20,000 GPU hours training backdoors into 3 different language models, then challenged my audience to find the triggers. They received some clever solutions—like comparing the base and fine-tuned versions and extrapolating any differences to reveal the hidden backdoor—but no one was able to solve all 3. So if open problems like this excite you, Jane Street is hiring. Learn more at janestreet.com/dwarkesh
Timestamps
(00:00:00) – Is Nvidia’s biggest moat its grip on scarce supply chains?
(00:16:25) – Will TPUs break Nvidia’s hold on AI compute?
(00:41:06) – Why doesn’t Nvidia become a hyperscaler?
(00:57:36) – Should we be selling AI chips to China?
(01:35:06) – Why doesn’t Nvidia make multiple different chip architectures?

1 hr and 43 mins

Michael Nielsen – How science actually progresses

Apr 7 2026

Really enjoyed chatting with Michael Nielsen about how we recognize scientific progress.
It's especially relevant for closing the RL verification loop for scientific discovery.
But it's also a surprisingly mysterious and elusive question when you look at the history of human science.
We approach this question through stories like Einstein (who claimed that he hadn't even heard of the famous Michelson-Morley experiment, which is supposed to have motivated special relativity, until after he had come up with the theory), Darwin (why did it take till 1859 to lay out an idea whose essence every farmer since antiquity must have observed?), Prout (how do you recognize that isotopes exist if you cannot chemically separate them?), and many others.
The verification loop on scientific ideas is often extremely long and weirdly hostile. Ancient Athenians dismissed Aristarchus's heliocentrism in the 3rd century BC because it would imply that the stars should shift in the sky as the Earth orbits the sun. The first successful measurement of stellar parallax was in 1838. That's a 2,000-year verification loop.
But clearly human science is able to make progress faster than raw experimental falsification/verification would imply, and in cases where experiments are very ambiguous. How?
Michael has some very deep and provocative hypotheses about the nature of progress. One I found especially thought-provoking is that aliens will likely have a VERY different science + tech stack than us. This contradicts the common-sense picture of a linear tech tree that I was assuming, and it has some interesting implications for how future civilizations might trade and cooperate with each other.
Watch on YouTube; read the transcript.
Sponsors
* Labelbox researchers built a new safety benchmark. Why? Well, current safety benchmarks claim that attacks on top models are successful only a few percent of the time, but the prompts in those benchmarks don’t reflect how real bad actors actually write. You can read Labelbox’s research here. If this could be useful for your work, reach out at labelbox.com/dwarkesh
* Mercury has an MCP that lets you give an LLM access to your full transaction history, including things like attached receipts and internal notes. I just used it to categorize my 2025 transactions, and it worked shockingly well. Modern functionality like this is exactly why I use Mercury. Learn more at mercury.com
* Jane Street’s ML engineers presented some of their GPU optimization workflows at GTC, showing how they use CUDA graphs, streams, and custom kernels to shave real time off their training runs. You can watch the full talk here. And they open-sourced all the relevant code here. If this kind of stuff excites you, Jane Street is hiring – learn more at janestreet.com/dwarkesh
Timestamps
(00:00:00) – How scientific progress outpaces its verification loops
(00:17:51) – Newton was the last of the magicians
(00:23:26) – Why wasn’t natural selection obvious much earlier?
(00:29:52) – Could gradient descent have discovered general relativity?
(00:50:54) – Why aliens will have a different tech stack than us
(01:15:26) – Are there infinitely many deep scientific principles left to discover?
(01:26:25) – What drew Michael to quantum computing so early?
(01:35:29) – Does science need a new way to assign credit?
(01:43:57) – Prolificness versus depth
(01:49:17) – What it takes to actually internalize what you learn

2 hrs and 3 mins

Terence Tao – Kepler, Newton, and the true nature of mathematical discovery

Mar 20 2026

We begin the episode with the absolutely ingenious and surprising way in which Kepler discovered the laws of planetary motion.
People sometimes say that AI will make especially fast progress at scientific discovery because of tight verification loops.
But the story of how we discovered the shape of our solar system shows how the verification loop for correct ideas can be decades (or even millennia) long.
During this time, what we know today as the better theory can actually make worse predictions.
And the reason it survives this epistemic hell is some mixture of judgment and heuristics that we don’t even understand well enough to actually articulate, much less codify into an RL loop. Hope you enjoy!
Watch on YouTube; read the transcript.
Sponsors
- Jane Street loves challenging my audience with different creative puzzles. One of my listeners, Shawn, solved Jane Street’s ResNet challenge and posted a great walk-through on X. If you want to try one of these puzzles yourself, there’s one live now at janestreet.com/dwarkesh.
- Labelbox can get you rubric-based evals, no matter your domain. These rubrics allow you to give your model feedback on all the dimensions you care about, so you can train how it thinks, not just what it thinks. Whatever you’re focused on—math, physics, finance, psychology or something else—Labelbox can help. Learn more at labelbox.com/dwarkesh.
- Mercury just released a new feature called Insights. Insights summarizes your money in and out, showing you your biggest transactions and calling out anything worth paying attention to. It’s a super low-friction way to stay on top of your business. Learn more at mercury.com/insights.
Timestamps
(00:00:00) – Kepler was a high temperature LLM
(00:11:44) – How would we know if there’s a new unifying concept within heaps of AI slop?
(00:26:10) – The deductive overhang
(00:30:31) – Selection bias in reported AI discoveries
(00:46:43) – AI makes papers richer and broader, but not deeper
(00:53:00) – If AI solves a problem, can humans get understanding out of it?
(00:59:20) – We need a semi-formal language for the way that scientists actually talk to each other
(01:09:48) – How Terry uses his time
(01:17:05) – Human-AI hybrids will dominate math for a lot longer

1 hr and 24 mins

Dylan Patel — Deep dive on the 3 big bottlenecks to scaling AI compute

Mar 13 2026

Dylan Patel, founder of SemiAnalysis, provides a deep dive into the 3 big bottlenecks to scaling AI compute: logic, memory, and power.
He also walks through the economics of labs, hyperscalers, foundries, and fab equipment manufacturers.
Learned a ton about every single level of the stack. Enjoy!
Watch on YouTube; read the transcript.
Sponsors
* Mercury has already saved me a bunch of time this tax season. Last year, I used Mercury to request W-9s from all the contractors I worked with. Then, when it came time to issue 1099s this year, I literally just clicked a button and Mercury sent them out. Learn more at mercury.com.
* Labelbox noticed that even when voice models appear to take interruptions in stride, their performance degrades. To figure out why, they built a new evaluation pipeline called EchoChain. EchoChain diagnoses voice models’ specific failure modes, letting you understand what your model needs to truly handle interruptions. Check it out at labelbox.com/dwarkesh.
* Jane Street is basically a research lab with a trading desk attached – and their infrastructure backs this up. They’ve got tens of thousands of GPUs, hundreds of thousands of CPU cores, and exabytes of storage. This is what it takes to find subtle signals hidden deep within noisy market data. If this sounds interesting, you can explore open positions at janestreet.com/dwarkesh.
Timestamps
(00:00:00) – Why an H100 is worth more today than 3 years ago
(00:24:52) – Nvidia secured TSMC allocation early; Google is getting squeezed
(00:34:34) – ASML will be the #1 constraint for AI compute scaling by 2030
(00:55:47) – Can't we just use TSMC's older fabs?
(01:05:37) – When will China outscale the West in semis?
(01:16:01) – The enormous incoming memory crunch
(01:42:34) – Scaling power in the US will not be a problem
(01:54:44) – Space GPUs aren't happening this decade
(02:14:07) – Why aren't more hedge funds making the AGI trade?
(02:18:30) – Will TSMC kick Apple out from N2?
(02:24:16) – Robots and Taiwan risk

2 hrs and 31 mins

The most important question nobody's asking about AI

Mar 11 2026

Read the full essay here: https://www.dwarkesh.com/p/dow-anthropic
Timestamps
(00:00:00) - Anthropic vs The Pentagon
(00:04:16) - The overhangs of tyranny
(00:05:54) - AI structurally favors mass surveillance
(00:08:25) - Alignment...to whom?
(00:13:55) - Coordination not worth the costs

25 mins

How cosplaying Ancient Rome led to the scientific revolution

Mar 6 2026

Renaissance history is so much wilder and weirder than you would have expected. Very fun chatting with Ada Palmer (historian, novelist, and composer based at the University of Chicago).
Some especially fascinating things I learned from the conversation and her excellent book, Inventing the Renaissance:
Not only did Gutenberg go bankrupt in the 1450s (after inventing the printing press), but so did the bank that foreclosed on him, and so did his apprentices. This is because paper was still very expensive, and so you had to make this big upfront CAPEX decision to print a batch of 300 copies of a book - say the Bible. But he’s in a small landlocked German town where only priests are allowed to read the Bible - so he sells maybe 7 copies. It’s only when this technology ends up in Venice, where you can hand 10 copies to each of 30 ship captains going to 30 different cities, that it starts taking off.
Speaking of which, the printing revolution wasn’t just one single discrete event, just as the computer revolution has been this whole century of going from mainframes -> personal computers -> phones -> social media, each with different and accelerating social impact. Books came first, but they’re slow to print, and made in small batches. The real revolution is pamphlets - much faster, much harder to censor. Pamphlet runners are how you can have Luther’s 95 Theses go from Wittenberg to London in 17 days.
So much other wild stuff from this episode. For example, did you know that the largest and best-funded experimental laboratory in 17th century Europe was very likely the Roman one run by inquisitors? Ada jokes that the Inquisition accidentally invented peer review. The focus of the Inquisition is really misunderstood - it was obsessed with catching dangerous new heretics like Lutherans and Calvinists - it only executed one person for doing science.
And this leads Ada to make an observation that I think is really wise: the authorities and censors are always worried about the exact wrong things given 20/20 hindsight. When the Inquisition raids an underground bookshop during the French Enlightenment, they don’t mind the Rousseau, Voltaire, and Encyclopédie, but they lose their minds about some Jansenist treatises about the technical nature of the Trinity.
More broadly, a lesson for me from this episode is that it’s just really hard to shape history in the specific way that you want. One of the most famous medieval scholars is this guy Petrarch. He survives the Black Death in the 1340s, watches his friends die to plague and bandits, and says: our leaders are selfish and terrible, we need to raise them on the Roman classics so they’ll act like Cicero. So Europe pours money into finding ancient manuscripts, building libraries, and educating princes on classical virtues. Those princes grow up and fight bigger, nastier wars than ever before with new deadlier technology. And this, combined with greater urbanization and endemic plague, results in European life expectancy decreasing from 35 in the medieval period to 18 during the Renaissance (the period which we in retrospect think of as a golden age but which many people living through it thought of as the continuation of the dark ages that had persisted since the fall of Rome).
Anyways, the libraries Petrarch inspires stick around, the printing press makes them accessible to everyone, and 200 years later a generation of medical students is reading Lucretius and asking “what if there are atoms and that’s how diseases work?” which eventually leads to germ theory, vaccines, and a cure for the Black Death (Ada has a longer, more involved explanation of how cosplaying the Romans leads, through a series of many steps, to the scientific revolution). Petrarch wanted to produce philosopher-kings that shared his values. Instead he created a world that doesn’t share his values at all but can cure the disease that destroyed his.
Watch on YouTube; read the transcript.
Sponsors
* Jane Street is still waiting on someone to solve their backdoor puzzle… They’re accepting submissions until April 1st and have set aside $50,000 for the best attempts. Separately, applications are live for Jane Street’s summer ML internships in NY, London, and Hong Kong. Go check all of this out at janestreet.com/dwarkesh.
* Labelbox can help ensure your agents don’t need to rely on overspecified prompts. They tailor real-world scenarios to whatever domain you’re focused on, and they make sure the data you train on rewards real understanding, not just instruction-following. Learn more at labelbox.com/dwarkesh
* Mercury’s personal accounts let you add users, issue cards, and customize permissions. This is super useful for sharing finances with a partner, a roommate… or even an OpenClaw agent. And, if you’re already a Mercury Business user, your personal account is free! See terms and conditions below, and learn more at mercury.com/personal-banking
Eligible Mercury Business...

2 hrs and 2 mins
