Episodes

  • Episode 1: Introducing Vanishing Gradients
    Feb 16 2022
    In this brief introduction, Hugo introduces the rationale behind launching a new data science podcast and gets excited about his upcoming guests: Jeremy Howard, Rachael Tatman, and Heather Nolis! Original music, bleeps, and blops by local Sydney legend PlaneFace (https://planeface.bandcamp.com/album/fishing-from-an-asteroid)!
    5 mins
  • Episode 58: Building GenAI Systems That Make Business Decisions with Thomas Wiecki (PyMC Labs)
    Sep 9 2025
    While most conversations about generative AI focus on chatbots, Thomas Wiecki (PyMC Labs, PyMC) has been building systems that help companies make actual business decisions. In this episode, he shares how Bayesian modeling and synthetic consumers can be combined with LLMs to simulate customer reactions, guide marketing spend, and support strategy. Drawing from his work with Colgate and others, Thomas explains how to scale survey methods with AI, where agents fit into analytics workflows, and what it takes to make these systems reliable. We talk through: • Using LLMs as “synthetic consumers” to simulate surveys and test product ideas • How Bayesian modeling and causal graphs enable transparent, trustworthy decision-making • Building closed-loop systems where AI generates and critiques ideas • Guardrails for multi-agent workflows in marketing mix modeling • Where generative AI breaks (and how to detect failure modes) • The balance between useful models and “correct” models If you’ve ever wondered how to move from flashy prototypes to AI systems that actually inform business strategy, this episode shows what it takes. LINKS: The AI MMM Agent, An AI-Powered Shortcut to Bayesian Marketing Mix Insights (https://www.pymc-labs.com/blog-posts/the-ai-mmm-agent) AI-Powered Decision Making Under Uncertainty Workshop w/ Allen Downey & Chris Fonnesbeck (PyMC Labs) (https://youtube.com/live/2Auc57lxgeU) The Podcast livestream on YouTube (https://youtube.com/live/so4AzEbgSjw?feature=share) Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk) 🎓 Learn more: Hugo's course: Building LLM Applications for Data Scientists and Software Engineers (https://maven.com/s/course/d56067f338)
    1 hr and 1 min
  • Episode 57: AI Agents and LLM Judges at Scale: Processing Millions of Documents (Without Breaking the Bank)
    Aug 29 2025
    While many people talk about “agents,” Shreya Shankar (UC Berkeley) has been building the systems that make them reliable. In this episode, she shares how AI agents and LLM judges can be used to process millions of documents accurately and cheaply. Drawing from work on projects ranging from databases of police misconduct reports to large-scale customer transcripts, Shreya explains the frameworks, error analysis, and guardrails needed to turn flaky LLM outputs into trustworthy pipelines. We talk through: - Treating LLM workflows as ETL pipelines for unstructured text - Error analysis: why you need humans reviewing the first 50–100 traces - Guardrails like retries, validators, and “gleaning” - How LLM judges work — rubrics, pairwise comparisons, and cost trade-offs - Cheap vs. expensive models: when to swap for savings - Where agents fit in (and where they don’t) If you’ve ever wondered how to move beyond unreliable demos, this episode shows how to scale LLMs to millions of documents — without breaking the bank. LINKS Shreya's website (https://www.sh-reya.com/) DocETL, A system for LLM-powered data processing (https://www.docetl.org/) Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk) Watch the podcast video on YouTube (https://youtu.be/3r_Hsjy85nk) Shreya's AI evals course, which she teaches with Hamel "Evals" Husain (https://maven.com/parlance-labs/evals?promoCode=GOHUGORGOHOME) 🎓 Learn more: Hugo's course: Building LLM Applications for Data Scientists and Software Engineers (https://maven.com/s/course/d56067f338) — https://maven.com/s/course/d56067f338
    41 mins
  • Episode 56: DeepMind Just Dropped Gemma 270M... And Here’s Why It Matters
    Aug 14 2025
    While much of the AI world chases ever-larger models, Ravin Kumar (Google DeepMind) and his team build across the size spectrum, from billions of parameters down to this week’s release: Gemma 270M, the smallest member yet of the Gemma 3 open-weight family. At just 270 million parameters, a quarter the size of Gemma 1B, it’s designed for speed, efficiency, and fine-tuning. We explore what makes 270M special, where it fits alongside its billion-parameter siblings, and why you might reach for it in production even if you think “small” means “just for experiments.” We talk through: - Where 270M fits into the Gemma 3 lineup — and why it exists - On-device use cases where latency, privacy, and efficiency matter - How smaller models open up rapid, targeted fine-tuning - Running multiple models in parallel without heavyweight hardware - Why “small” models might drive the next big wave of AI adoption If you’ve ever wondered what you’d do with a model this size (or how to squeeze the most out of it) this episode will show you how small can punch far above its weight. 
    LINKS Introducing Gemma 3 270M: The compact model for hyper-efficient AI (Google Developer Blog) (https://developers.googleblog.com/en/introducing-gemma-3-270m/) Full Model Fine-Tune Guide using Hugging Face Transformers (https://ai.google.dev/gemma/docs/core/huggingface_text_full_finetune) The Gemma 270M model on Hugging Face (https://huggingface.co/google/gemma-3-270m) The Gemma 270M model on Ollama (https://ollama.com/library/gemma3:270m) Building AI Agents with Gemma 3, a workshop with Ravin and Hugo (https://www.youtube.com/live/-IWstEStqok) (Code here (https://github.com/canyon289/ai_agent_basics)) From Images to Agents: Building and Evaluating Multimodal AI Workflows, a workshop with Ravin and Hugo (https://www.youtube.com/live/FNlM7lSt8Uk) (Code here (https://github.com/canyon289/ai_image_agent)) Evaluating AI Agents: From Demos to Dependability, an upcoming workshop with Ravin and Hugo (https://lu.ma/ezgny3dl) Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk) Watch the podcast video on YouTube (https://youtu.be/VZDw6C2A_8E) 🎓 Learn more: Hugo's course: Building LLM Applications for Data Scientists and Software Engineers (https://maven.com/s/course/d56067f338) ($600 off early bird discount for November cohort available until August 16)
    46 mins
  • Episode 55: From Frittatas to Production LLMs: Breakfast at SciPy
    Aug 12 2025
    Traditional software expects 100% passing tests. In LLM-powered systems, that’s not just unrealistic: failing tests are a feature, not a bug. Eric Ma leads research data science in Moderna’s data science and AI group, and over breakfast at SciPy we explored why AI products break the old rules, what skills different personas bring (and miss), and how to keep systems alive after the launch hype fades. You’ll hear the clink of coffee cups, the murmur of SciPy in the background, and the occasional bite of frittata as we talk (hopefully also a feature, not a bug!). We talk through: • The three personas, and the blind spots each has when shipping AI systems • Why “perfect” tests can be a sign you’re testing the wrong thing • Development vs. production observability loops, and why you need both • How curiosity about failing data separates good builders from great ones • Ways large organizations can create space for experimentation without losing delivery focus If you want to build AI products that thrive in the messy real world, this episode will help you embrace the chaos and make it work for you. LINKS Eric's website (https://ericmjl.github.io/) More about the workshops Eric and Hugo taught at SciPy (https://hugobowne.substack.com/p/stress-testing-llms-evaluation-frameworks) Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk) 🎓 Learn more: Hugo's course: Building LLM Applications for Data Scientists and Software Engineers (https://maven.com/s/course/d56067f338) ($600 off early bird discount for November cohort available until August 16)
    38 mins
  • Episode 54: Scaling AI: From Colab to Clusters — A Practitioner’s Guide to Distributed Training and Inference
    Jul 18 2025
    Colab is cozy. But production won’t fit on a single GPU. Zach Mueller leads Accelerate at Hugging Face and spends his days helping people go from solo scripts to scalable systems. In this episode, he joins me to demystify distributed training and inference, not just for research labs but for any ML engineer trying to ship real software. We talk through: • From Colab to clusters: why scaling isn’t just about training massive models, but serving agents, handling load, and speeding up iteration • Zero-to-two GPUs: how to get started without Kubernetes, Slurm, or a PhD in networking • Scaling tradeoffs: when to care about interconnects, which infra bottlenecks actually matter, and how to avoid chasing performance ghosts • The GPU middle class: strategies for training and serving on a shoestring, with just a few cards or modest credits • Local experiments, global impact: why learning distributed systems, even just a little, can set you apart as an engineer If you’ve ever stared at a Hugging Face training script and wondered how to run it on something more than your laptop: this one’s for you. LINKS Zach on LinkedIn (https://www.linkedin.com/in/zachary-mueller-135257118/) Hugo's blog post on Stop Building AI Agents (https://www.linkedin.com/posts/hugo-bowne-anderson-045939a5_yesterday-i-posted-about-stop-building-ai-activity-7346942036752613376-b8-t/) Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk) Hugo's recent newsletter about upcoming events and more! (https://hugobowne.substack.com/p/stop-building-agents) 🎓 Learn more: Hugo's course: Building LLM Applications for Data Scientists and Software Engineers (https://maven.com/s/course/d56067f338) Zach's course (45% off for VG listeners!): Scratch to Scale: Large-Scale Training in the Modern World (https://maven.com/walk-with-code/scratch-to-scale?promoCode=hugo39) 📺 Watch the video version on YouTube: YouTube link (https://youtube.com/live/76NAtzWZ25s?feature=share)
    41 mins
  • Episode 53: Human-Seeded Evals & Self-Tuning Agents: Samuel Colvin on Shipping Reliable LLMs
    Jul 8 2025
    Demos are easy; durability is hard. Samuel Colvin has spent a decade building guardrails in Python (first with Pydantic, now with Logfire), and he’s convinced most LLM failures have nothing to do with the model itself. They appear where the data is fuzzy, the prompts drift, or no one bothered to measure real-world behavior. Samuel joins me to show how a sprinkle of engineering discipline keeps those failures from ever reaching users. We talk through: • Tiny labels, big leverage: how five thumbs-ups/thumbs-downs are enough for Logfire to build a rubric that scores every call in real time • Drift alarms, not dashboards: catching the moment your prompt or data shifts instead of reading charts after the fact • Prompt self-repair: a prototype agent that rewrites its own system prompt—and tells you when it still doesn’t have what it needs • The hidden cost curve: why the last 15 percent of reliability costs far more than the flashy 85 percent demo • Business-first metrics: shipping features that meet real goals instead of chasing another decimal point of “accuracy” If you’re past the proof-of-concept stage and staring down the “now it has to work” cliff, this episode is your climbing guide. LINKS Pydantic (https://pydantic.dev/) Logfire (https://pydantic.dev/logfire) Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk) Hugo's recent newsletter about upcoming events and more! (https://hugobowne.substack.com/p/stop-building-agents) 🎓 Learn more: Hugo's course: Building LLM Applications for Data Scientists and Software Engineers (https://maven.com/s/course/d56067f338) — next cohort starts July 8: https://maven.com/s/course/d56067f338 📺 Watch the video version on YouTube: YouTube link (https://youtube.com/live/wk6rPZ6qJSY?feature=share)
    45 mins
  • Episode 52: Why Most LLM Products Break at Retrieval (And How to Fix Them)
    Jul 2 2025
    Most LLM-powered features do not break at the model. They break at the context. So how do you retrieve the right information to get useful results, even under vague or messy user queries? In this episode, we hear from Eric Ma, who leads data science research in the Data Science and AI group at Moderna. He shares what it takes to move beyond toy demos and ship LLM features that actually help people do their jobs. We cover: • How to align retrieval with user intent and why cosine similarity is not the answer • How a dumb YAML-based system outperformed so-called smart retrieval pipelines • Why vague queries like “what is this all about” expose real weaknesses in most systems • When vibe checks are enough and when formal evaluation is worth the effort • How retrieval workflows can evolve alongside your product and user needs If you are building LLM-powered systems and care about how they work, not just whether they work, this one is for you. LINKS Eric's website (https://ericmjl.github.io/) Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk) Hugo's recent newsletter about upcoming events and more! (https://hugobowne.substack.com/p/stop-building-agents) 🎓 Learn more: Hugo's course: Building LLM Applications for Data Scientists and Software Engineers (https://maven.com/s/course/d56067f338) — next cohort starts July 8: https://maven.com/s/course/d56067f338 📺 Watch the video version on YouTube: YouTube link (https://youtu.be/d-FaR5Ywd5k)
    29 mins