Episodes

  • Everything Hard About Building AI Agents Today
    Jun 13 2025

    Willem Pienaar and Shreya Shankar discuss the challenge of evaluating agents in production where "ground truth" is ambiguous and subjective user feedback isn't enough to improve performance.


    The discussion breaks down the three "gulfs" of human-AI interaction—Specification, Generalization, and Comprehension—and their impact on agent success.


    Willem and Shreya cover the necessity of moving the human "out of the loop" for feedback, creating faster learning cycles through implicit signals rather than direct, manual review.

    The conversation details practical evaluation techniques, including analyzing task failures with heat maps and the trade-offs of using simulated environments for testing.


    Willem and Shreya address the reality of a "performance ceiling" for AI and the importance of categorizing the problems your agent can solve, can learn to solve, or will likely never solve.


    // Bio



    Shreya Shankar

    PhD student in data management for machine learning.


    Willem Pienaar


    Willem Pienaar, CTO of Cleric, is a builder with a focus on LLM agents, MLOps, and open source tooling. He is the creator of Feast, an open source feature store, and contributed to the creation of both the feature store and MLOps categories.


    Before starting Cleric, Willem led the open source engineering team at Tecton and established the ML platform team at Gojek, where he built high scale ML systems for the Southeast Asian decacorn.


    // Related Links



    Website: https://cleric.ai/



    ~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~



    Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore

    MLOps Swag/Merch: [https://shop.mlops.community/]

    Connect with Demetrios on LinkedIn: /dpbrinkm

    Connect with Shreya on LinkedIn: /shrshnk

    Connect with Willem on LinkedIn: /willempienaar


    Timestamps:



    [00:00] Trust Issues in AI Data

    [04:49] Cloud Clarity Meets Retrieval

    [09:37] Why Fast AI Is Hard

    [11:10] Fixing AI Communication Gaps

    [14:53] Smarter Feedback for Prompts

    [19:23] Creativity Through Data Exploration

    [23:46] Helping Engineers Solve Faster

    [26:03] The Three Gaps in AI

    [28:08] Alerts Without the Noise

    [33:22] Custom vs General AI

    [34:14] Sharpening Agent Skills

    [40:01] Catching Repeat Failures

    [43:38] Rise of Self-Healing Software

    [44:12] The Chaos of Monitoring AI

    47 mins
  • Tricks to Fine Tuning // Prithviraj Ammanabrolu // #318
    Jun 11 2025

    Tricks to Fine Tuning // MLOps Podcast #318 with Prithviraj Ammanabrolu, Research Scientist at Databricks.

    Join the Community: https://go.mlops.community/YTJoinIn

    Get the newsletter: https://go.mlops.community/YTNewsletter

    // Abstract



    Prithviraj Ammanabrolu drops by to break down Tao fine-tuning—a clever way to train models without labeled data. Using reinforcement learning and synthetic data, Tao teaches models to evaluate and improve themselves. Raj explains how this works, where it shines (think small models punching above their weight), and why it could be a game-changer for efficient deployment.



    // Bio



    Raj is an Assistant Professor of Computer Science at the University of California, San Diego, leading the PEARLS Lab in the Department of Computer Science and Engineering (CSE). He is also a Research Scientist at Mosaic AI, Databricks, where his team is actively recruiting research scientists and engineers with expertise in reinforcement learning and distributed systems.



    Previously, he was part of the Mosaic team at the Allen Institute for AI. He earned his PhD in Computer Science from the School of Interactive Computing at Georgia Tech, advised by Professor Mark Riedl in the Entertainment Intelligence Lab.



    // Related Links



    Website: https://www.databricks.com/



    ~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~



    Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore

    Join our Slack community [https://go.mlops.community/slack]

    Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)

    Sign up for the next meetup: [https://go.mlops.community/register]

    MLOps Swag/Merch: [https://shop.mlops.community/]

    Connect with Demetrios on LinkedIn: /dpbrinkm

    Connect with Raj on LinkedIn: /rajammanabrolu



    Timestamps:



    [00:00] Raj's preferred coffee

    [00:36] Takeaways

    [01:02] Tao Naming Decision

    [04:19] No Labels Machine Learning

    [08:09] Tao and TAO breakdown

    [13:20] Reward Model Fine-Tuning

    [18:15] Training vs Inference Compute

    [22:32] Retraining and Model Drift

    [29:06] Prompt Tuning vs Fine-Tuning

    [34:32] Small Model Optimization Strategies

    [37:10] Small Model Potential

    [43:08] Fine-tuning Model Differences

    [46:02] Mistral Model Freedom

    [53:46] Wrap up

    54 mins
  • Packaging MLOps Tech Neatly for Engineers and Non-engineers // Jukka Remes // #322
    Jun 10 2025

    Packaging MLOps Tech Neatly for Engineers and Non-engineers // MLOps Podcast #322 with Jukka Remes, Senior Lecturer (SW dev & AI), AI Architect at Haaga-Helia UAS, Founder & CTO at 8wave AI.



    Join the Community: https://go.mlops.community/YTJoinIn


    Get the newsletter: https://go.mlops.community/YTNewsletter


    // Abstract



    AI is already complex—adding the need for deep engineering expertise to use MLOps tools only makes it harder, especially for SMEs and research teams with limited resources. Yet, good MLOps is essential for managing experiments, sharing GPU compute, tracking models, and meeting AI regulations.


    While cloud providers offer MLOps tools, many organizations need flexible, open-source setups that work anywhere—from laptops to supercomputers. Shared setups can boost collaboration, productivity, and compute efficiency.

    In this session, Jukka introduces an open-source MLOps platform from Silo AI, now packaged for easy deployment across environments. With Git-based workflows and CI/CD automation, users can focus on building models while the platform handles the MLOps.

    // Bio

    Founder & CTO, 8wave AI | Senior Lecturer, Haaga-Helia University of Applied Sciences

    Jukka Remes has 28+ years of experience in software, machine learning, and infrastructure. Starting with SW dev in the late 1990s and analytics pipelines for fMRI research in the early 2000s, he's worked across deep learning (Nokia Technologies), GPU and cloud infrastructure (IBM), and AI consulting (Silo AI), where he also led MLOps platform development.


    Now a senior lecturer at Haaga-Helia, Jukka continues evolving that open-source MLOps platform with partners like the University of Helsinki. He leads R&D on GenAI and AI-enabled software, and is the founder of 8wave AI, which develops AI Business Operations software for next-gen AI enablement, including regulatory compliance of AI.


    // Related Links



    Open source-based MLOps k8s platform setup originally developed by Jukka's team at Silo AI - free for any use and installable in any environment, from laptops to supercomputers: https://github.com/OSS-MLOPS-PLATFORM/oss-mlops-platform


    Jukka's new company: https://8wave.ai


    ~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~


    Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore

    Join our Slack community [https://go.mlops.community/slack]

    Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)

    Sign up for the next meetup: [https://go.mlops.community/register]

    MLOps Swag/Merch: [https://shop.mlops.community/]

    Connect with Demetrios on LinkedIn: /dpbrinkm

    Connect with Jukka on LinkedIn: /jukka-remes


    Timestamps:

    [00:00] Jukka's preferred coffee

    [00:39] Open-Source Platform Benefits

    [01:56] Silo MLOps Platform Explanation

    [05:18] AI Model Production Processes

    [10:42] AI Platform Use Cases

    [16:54] Reproducibility in Research Models

    [26:51] Pipeline setup automation

    [33:26] MLOps Adoption Journey

    [38:31] EU AI Act and Open Source

    [41:38] MLOps and 8wave AI

    [45:46] Optimizing Cross-Stakeholder Collaboration

    [52:15] Open Source ML Platform

    [55:06] Wrap up

    56 mins
  • Hard Learned Lessons from Over a Decade in AI
    Jun 6 2025

    Tecton Founder and CEO Mike Del Balso talks about the ML/AI use cases that have become core components generating millions in revenue. Demetrios and Mike trace the maturity curve that predictive machine learning use cases have followed over the past five years, and why a feature store is a primary component of an ML stack.


    // Bio


    Mike Del Balso is the CEO and co-founder of Tecton, where he’s building the industry’s first feature platform for real-time ML. Before Tecton, Mike co-created the Uber Michelangelo ML platform. He was also a product manager at Google where he managed the core ML systems that power Google’s Search Ads business. He studied Applied Science, Electrical & Computer Engineering at the University of Toronto.


    // Related Links


    Website: www.tecton.ai


    ~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~


    Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore

    MLOps Swag/Merch: [https://shop.mlops.community/]

    Connect with Demetrios on LinkedIn: /dpbrinkm

    Connect with Mike on LinkedIn: /michaeldelbalso


    Timestamps:


    [00:00] Smarter decisions, less manual work

    [03:52] Data pipelines: pain and fixes

    [08:45] Why Tecton was born

    [11:30] ML use cases shift

    [14:14] Models for big bets

    [18:39] Build or buy drama

    [20:20] Fintech's data playbook

    [23:52] What really needs real-time

    [28:07] Speeding up ML delivery

    [32:09] Valuing ML is tricky

    [35:29] Simplifying ML toolkits

    [37:18] AI copilots in action

    [42:13] AI that fights fraud

    [45:07] Teaming up across coasts

    [46:43] Tecton + Generative AI?

    49 mins
  • Product Metrics are LLM Evals // Raza Habib CEO of Humanloop // #320
    Jun 3 2025

    Raza Habib, the CEO of LLM eval platform Humanloop, talks to us about how to make your AI products more accurate and reliable by shortening the feedback loop of your evals and quickly iterating on prompts to test what works. He also shares some of his favorite quotes from Dario Amodei of Anthropic.


    // Bio

    Raza is the CEO and Co-founder at Humanloop. He has a PhD in Machine Learning from UCL, was the founding engineer of Monolith AI, and has built speech systems at Google. For the last 4 years, he has led Humanloop and supported leading technology companies such as Duolingo, Vanta, and Gusto to build products with large language models. Raza was featured in the Forbes 30 Under 30 technology list in 2022, and Sifted recently named him one of the most influential Gen AI founders in Europe.


    // Related Links

    Website: https://humanloop.com


    ~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~

    Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore

    MLOps Swag/Merch: [https://shop.mlops.community/]

    Connect with Demetrios on LinkedIn: /dpbrinkm

    Connect with Raza on LinkedIn: /humanloop-raza


    Timestamps:


    [00:00] Cracking Open System Failures and How We Fix Them

    [05:44] LLMs in the Wild — First Steps and Growing Pains

    [08:28] Building the Backbone of Tracing and Observability

    [13:02] Tuning the Dials for Peak Model Performance

    [13:51] From Growing Pains to Glowing Gains in AI Systems

    [17:26] Where Prompts Meet Psychology and Code

    [22:40] Why Data Experts Deserve a Seat at the Table

    [24:59] Humanloop and the Art of Configuration Taming

    [28:23] What Actually Matters in Customer-Facing AI

    [33:43] Starting Fresh with Private Models That Deliver

    [34:58] How LLM Agents Are Changing the Way We Talk

    [39:23] The Secret Lives of Prompts Inside Frameworks

    [42:58] Streaming Showdowns — Creativity vs. Convenience

    [46:26] Meet Our Auto-Tuning AI Prototype

    [49:25] Building the Blueprint for Smarter AI

    [51:24] Feedback Isn’t Optional — It’s Everything

    53 mins
  • Getting AI Apps Past the Demo // Vaibhav Gupta // #319
    May 30 2025

    Getting AI Apps Past the Demo // MLOps Podcast #319 with Vaibhav Gupta, CEO of BoundaryML.


    Join the Community: https://go.mlops.community/YTJoinIn

    Get the newsletter: https://go.mlops.community/YTNewsletter


    // Abstract

    It's been two years, and we still seem to see AI disproportionately more in demos than production features. Why? And how can we apply engineering practices we've all learned in the past decades to our advantage here?


    // Bio

    Vaibhav is one of the creators of BAML and a YC alum. He spent 10 years in AI performance optimization at places like Google, Microsoft, and D.E. Shaw. He loves diving deep and chatting about anything related to Gen AI and Computer Vision!


    // Related Links

    Website: https://www.boundaryml.com/


    ~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~

    Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore

    Join our Slack community [https://go.mlops.community/slack]

    Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)

    Sign up for the next meetup: [https://go.mlops.community/register]

    MLOps Swag/Merch: [https://shop.mlops.community/]

    Connect with Demetrios on LinkedIn: /dpbrinkm

    Connect with Vaibhav on LinkedIn: /vaigup


    Timestamps:

    [00:00] Vaibhav's preferred coffee

    [00:38] What is BAML

    [03:07] LangChain Overengineering Issues

    [06:46] Verifiable English Explained

    [11:45] Python AI Integration Challenges

    [15:16] Strings as First-Class Code

    [21:45] Platform Gap in Development

    [30:06] Workflow Efficiency Tools

    [33:10] Surprising BAML Insights

    [40:43] BAML Cool Projects

    [45:54] BAML Developer Conversations

    [48:39] Wrap up

    50 mins
  • Building Out GPU Clouds // Mohan Atreya // #317
    May 23 2025

    Demetrios and Mohan Atreya break down the GPU madness behind AI — from supply headaches and sky-high prices to the rise of nimble GPU clouds trying to outsmart the giants. They cover power-hungry hardware, failed experiments, and how new cloud models are shaking things up with smarter provisioning, tokenized access, and a whole lotta hustle. It's a wild ride through the guts of AI infrastructure — fun, fast, and full of sparks!


    Big thanks to the folks at Rafay for backing this episode — appreciate the support in making these conversations happen!


    // Bio

    Mohan is a seasoned and innovative product leader currently serving as the Chief Product Officer at Rafay Systems. He has led multi-site teams and driven product strategy at companies like Okta, Neustar, and McAfee.


    // Related Links

    Website: https://rafay.co/


    ~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~

    Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore

    MLOps Swag/Merch: [https://shop.mlops.community/]

    Connect with Demetrios on LinkedIn: /dpbrinkm

    Connect with Mohan on LinkedIn: /mohanatreya


    Timestamps:

    [00:00] AI/ML Customer Challenges

    [04:21] Dependency on Microsoft for Revenue

    [09:08] Challenges of Hypothesis in AI/ML

    [12:17] Neo Cloud Onboarding Challenges

    [15:02] Elastic GPU Cloud Automation

    [19:11] Dynamic GPU Inventory Management

    [20:25] Terraform Lacks Inventory Awareness

    [26:42] Onboarding and End-User Experience Strategies

    [29:30] Optimizing Storage for Data Efficiency

    [33:38] Pizza Analogy: User Preferences

    [35:18] Token-Based GPU Cloud Monetization

    [39:01] Empowering Citizen Scientists with AI

    [42:31] Innovative CFO Chatbot Solutions

    [47:09] Cloud Services Need Spectrum

    48 mins
  • A Candid Conversation Around MCP and A2A // Rahul Parundekar and Sam Partee // #316 SF Live
    May 21 2025

    Demetrios, Sam Partee, and Rahul Parundekar unpack the chaos of AI agent tools and the evolving world of MCP (Model Context Protocol). With sharp insights and plenty of laughs, they dig into tool permissions, security quirks, agent memory, and the messy path to making agents actually useful.


    // Bio

    Sam Partee

    Sam Partee is the CTO and Co-Founder of Arcade AI. Previously a Principal Engineer leading the Applied AI team at Redis, Sam led the effort in creating the ecosystem around Redis as a vector database. He is a contributor to multiple OSS projects, including LangChain, DeterminedAI, LlamaIndex, and Chapel, amongst others. While at Cray/HPE, he created the SmartSim AI framework, which is now used at national labs around the country to integrate HPC simulations like climate models with AI.


    Rahul Parundekar

    Rahul Parundekar is the founder of AI Hero. He graduated with a Master's in Computer Science from USC Los Angeles in 2010 and embarked on a career focused on Artificial Intelligence. From 2010 to 2017, he worked as a Senior Researcher at Toyota ITC, working on agent autonomy within vehicles. His journey continued as the Director of Data Science at FigureEight (later acquired by Appen), where he and his team developed an architecture supporting over 36 ML models and managing over a million predictions daily. Since 2021, he has been working on AI Hero, aiming to democratize AI access, while also consulting on LLMOps (Large Language Model Operations) and AI system scalability. Beyond his full-time role as a founder, he is also passionate about community engagement: he actively organizes MLOps events in SF and contributes educational content on RAG and LLMOps at learn.mlops.community.


    // Related Links

    Websites:

    arcade.dev

    aihero.studio

    ~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~


    Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore

    MLOps Swag/Merch: [https://shop.mlops.community/]

    Connect with Demetrios on LinkedIn: /dpbrinkm

    Connect with Rahul on LinkedIn: /rparundekar

    Connect with Sam on LinkedIn: /sampartee

    Timestamps:

    [00:00] Agents & Tools, Explained (Without Melting Your Brain)

    [09:51] MVP Servers: Why Everything’s on Fire (and How to Fix It)

    [13:18] Can We Actually Trust the Protocol?

    [18:13] KYC, But Make It AI (and Less Painful)

    [25:25] Web Automation Tests: The Bugs Strike Back

    [28:18] MCP Dev: What Went Wrong (and What Saved Us)

    [33:53] Social Login: One Button to Rule Them All

    [39:33] What Even Is an AI-Native Developer?

    [42:21] Betting Big on Smarter Models (High Risk, High Reward)

    [51:40] Harrison’s Bold New Tactic (With Real-Life Magic Tricks)

    [55:31] Async Task Handoffs: Herding Cats, But Digitally

    [1:00:37] Getting AI to Actually Help Your Workflow

    [1:03:53] The Infamous Varma System Error (And How We Dodge It)

    1 hr and 5 mins