AI Engineering Podcast cover art

AI Engineering Podcast

AI Engineering Podcast

By: Tobias Macey
Listen for free

About this listen

This show is your guidebook to building scalable and maintainable AI systems. You will learn how to architect AI applications, apply AI to your work, and the considerations involved in building or customizing new models. Everything that you need to know to deliver real impact and value with machine learning and artificial intelligence.© 2024 Boundless Notions, LLC.
Episodes
  • MCP as the API for AI‑Native Systems: Security, Orchestration, and Scale
    Dec 16 2025
    Summary In this episode Craig McLuckie, co-creator of Kubernetes and founder/CEO of Stacklok, talks about how to improve security and reliability for AI agents using curated, optimized deployments of the Model Context Protocol (MCP). Craig explains why MCP is emerging as the API layer for AI‑native applications, how to balance short‑term productivity with long‑term platform thinking, and why great tools plus frontier models still drive the best outcomes. He digs into common adoption pitfalls (tool pollution, insecure NPX installs, scattered credentials), the necessity of continuous evals for stochastic systems, and the shift from “what the agent can access” to “what the agent knows.” Craig also shares how ToolHive approaches secure runtimes, a virtual MCP gateway with semantic search, orchestration and transactional semantics, a registry for organizational tooling, and a console for self‑service—along with pragmatic patterns for auth, policy, and observability. Announcements Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systemsWhen ML teams try to run complex workflows through traditional orchestration tools, they hit walls. Cash App discovered this with their fraud detection models - they needed flexible compute, isolated environments, and seamless data exchange between workflows, but their existing tools couldn't deliver. That's why Cash App rely on Prefect. Now their ML workflows run on whatever infrastructure each model needs across Google Cloud, AWS, and Databricks. Custom packages stay isolated. Model outputs flow seamlessly between workflows. Companies like Whoop and 1Password also trust Prefect for their critical workflows. But Prefect didn't stop there. They just launched FastMCP - production-ready infrastructure for AI tools. You get Prefect's orchestration plus instant OAuth, serverless scaling, and blazing-fast Python execution. Deploy your AI tools once, connect to Claude, Cursor, or any MCP client. No more building auth flows or managing servers. Prefect orchestrates your ML pipeline. FastMCP handles your AI tool infrastructure. See what Prefect and Fast MCP can do for your AI workflows at aiengineeringpodcast.com/prefect today.Unlock the full potential of your AI workloads with a seamless and composable data infrastructure. Bruin is an open source framework that streamlines integration from the command line, allowing you to focus on what matters most - building intelligent systems. Write Python code for your business logic, and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. With native support for ML/AI workloads, Bruin empowers data teams to deliver faster, more reliable, and scalable AI solutions. Harness Bruin's connectors for hundreds of platforms, including popular machine learning frameworks like TensorFlow and PyTorch. Build end-to-end AI workflows that integrate seamlessly with your existing tech stack. Join the ranks of forward-thinking organizations that are revolutionizing their data engineering with Bruin. Get started today at aiengineeringpodcast.com/bruin, and for dbt Cloud customers, enjoy a $1,000 credit to migrate to Bruin Cloud.Your host is Tobias Macey and today I'm interviewing Craig McLuckie about improving the security of your AI agents through curated and optimized MCP deploymentInterviewIntroductionHow did you get involved in machine learning?MCP saw huge growth in attention and adoption over the course of this year. What are the stumbling blocks that teams run into when going to production with MCP servers?How do improperly managed MCP servers contribute to security problems in an agent-driven software development workflow?What are some of the problematic practices or shortcuts that you are seeing teams implement when running MCP services for their developers?What are the benefits of a curated and opinionated MCP service as shared infrastructure for an engineering team?You are building ToolHive as a system for managing and securing MCP services as a platform component. What are the strategic benefits of starting with that as the foundation for your company?There are several services for managing MCP server deployment and access control. What are the unique elements of ToolHive that make it worth adopting?For software-focused agentic AI, the approach of Claude Code etc. to be command-line based opens the door for an effectively unbounded set of tools. What are the benefits of MCP over arbitrary CLI execution in that context?What are the most interesting, innovative, or unexpected ways that you have seen ToolHive/MCP used?What are the most interesting, unexpected, or challenging lessons that you have learned while working on ToolHive?When is ToolHive the wrong choice?What do you have planned for the future of ToolHive/Stacklok?Contact InfoGitHubLinkedInParting QuestionFrom your perspective,...
    Show More Show Less
    1 hr and 8 mins
  • Context as Code, DevX as Leverage: Accelerating Software with Multi‑Agent Workflows
    Nov 24 2025
    Summary In this episode Max Beauchemin explores how multiplayer, multi‑agent engineering is reshaping individual and team velocity for building data and AI systems. Max shares his journey from Airflow and Superset to going all‑in on AI coding agents, describing a pragmatic “AI‑first reflex” for nearly every task and the emerging role of humans as orchestrators of agents. He digs into shifting bottlenecks — code review, QA, async coordination — and how better DevX/AIX, just‑in‑time context via tools, and structured "context as code" can keep pace with agent‑accelerated execution. He then dives deep into Agor, a new open‑source agent‑orchestration platform: a spatial, multiplayer canvas that manages git worktrees and shared dev environments, enables templated prompts and zone‑based workflows, and exposes an internal MCP so agents can operate the system — and each other. Max discusses session forking, sub‑session trees, scheduling, and safety considerations, and how these capabilities enable parallelization, handoffs across roles, and richer visibility into prompting and cost/usage—pointing to a near future where software engineering centers on orchestrating teams of agents and collaborators. Resources: agor.live (docs, one‑click Codespaces, npm install), Apache Superset, and related MCP/CLI tooling referenced for agent workflows. Announcements Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systemsWhen ML teams try to run complex workflows through traditional orchestration tools, they hit walls. Cash App discovered this with their fraud detection models - they needed flexible compute, isolated environments, and seamless data exchange between workflows, but their existing tools couldn't deliver. That's why Cash App rely on Prefect. Now their ML workflows run on whatever infrastructure each model needs across Google Cloud, AWS, and Databricks. Custom packages stay isolated. Model outputs flow seamlessly between workflows. Companies like Whoop and 1Password also trust Prefect for their critical workflows. But Prefect didn't stop there. They just launched FastMCP - production-ready infrastructure for AI tools. You get Prefect's orchestration plus instant OAuth, serverless scaling, and blazing-fast Python execution. Deploy your AI tools once, connect to Claude, Cursor, or any MCP client. No more building auth flows or managing servers. Prefect orchestrates your ML pipeline. FastMCP handles your AI tool infrastructure. See what Prefect and Fast MCP can do for your AI workflows at aiengineeringpodcast.com/prefect today.Unlock the full potential of your AI workloads with a seamless and composable data infrastructure. Bruin is an open source framework that streamlines integration from the command line, allowing you to focus on what matters most - building intelligent systems. Write Python code for your business logic, and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. With native support for ML/AI workloads, Bruin empowers data teams to deliver faster, more reliable, and scalable AI solutions. Harness Bruin's connectors for hundreds of platforms, including popular machine learning frameworks like TensorFlow and PyTorch. Build end-to-end AI workflows that integrate seamlessly with your existing tech stack. Join the ranks of forward-thinking organizations that are revolutionizing their data engineering with Bruin. Get started today at aiengineeringpodcast.com/bruin, and for dbt Cloud customers, enjoy a $1,000 credit to migrate to Bruin Cloud.Your host is Tobias Macey and today I'm interviewing Maxime Beauchemin about the impact of multi-player multi-agent engineering on individual and team velocity for building better data systemsInterviewIntroductionHow did you get involved in the area of data management?Can you start by giving an overview of the types of work that you are relying on AI development agents for?As you bring agents into the mix for software engineering, what are the bottlenecks that start to show up?In my own experience there are a finite number of agents that I can manage in parallel. How does Agor help to increase that limit?How does making multi-agent management a multi-player experience change the dynamics of how you apply agentic engineering workflows?Contact InfoLinkedInLinksAgorApache AirflowApache SupersetPresetClaude CodeCodexPlaywright MCPTmuxGit WorktreesOpencode.aiGitHub CodespacesOnaThe intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
    Show More Show Less
    1 hr
  • Inside the Black Box: Neuron-Level Control and Safer LLMs
    Nov 16 2025
    Summary In this episode of the AI Engineering Podcast Vinay Kumar, founder and CEO of Arya.ai and head of Lexsi Labs, talks about practical strategies for understanding and steering AI systems. He discusses the differences between interpretability and explainability, and why post-hoc methods can be misleading. Vinay shares his approach to tracing relevance through deep networks and LLMs using DL Backtrace, and how interpretability is evolving from an audit tool into a lever for alignment, enabling targeted pruning, fine-tuning, unlearning, and model compression. The conversation covers setting concrete alignment metrics, the gaps in current enterprise practices for complex models, and tailoring explainability artifacts for different stakeholders. Vinay also previews his team's "AlignTune" effort for neuron-level model editing and discusses emerging trends in AI risk, multi-modal complexity, and automated safety agents. He explores when and why teams should invest in interpretability and alignment, how to operationalize findings without overcomplicating evaluation, and the best practices for private, safer LLM endpoints in enterprises, aiming to make advanced AI not just accurate but also acceptable, auditable, and scalable. Announcements Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systemsWhen ML teams try to run complex workflows through traditional orchestration tools, they hit walls. Cash App discovered this with their fraud detection models - they needed flexible compute, isolated environments, and seamless data exchange between workflows, but their existing tools couldn't deliver. That's why Cash App rely on Prefect. Now their ML workflows run on whatever infrastructure each model needs across Google Cloud, AWS, and Databricks. Custom packages stay isolated. Model outputs flow seamlessly between workflows. Companies like Whoop and 1Password also trust Prefect for their critical workflows. But Prefect didn't stop there. They just launched FastMCP - production-ready infrastructure for AI tools. You get Prefect's orchestration plus instant OAuth, serverless scaling, and blazing-fast Python execution. Deploy your AI tools once, connect to Claude, Cursor, or any MCP client. No more building auth flows or managing servers. Prefect orchestrates your ML pipeline. FastMCP handles your AI tool infrastructure. See what Prefect and Fast MCP can do for your AI workflows at aiengineeringpodcast.com/prefect today.Unlock the full potential of your AI workloads with a seamless and composable data infrastructure. Bruin is an open source framework that streamlines integration from the command line, allowing you to focus on what matters most - building intelligent systems. Write Python code for your business logic, and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. With native support for ML/AI workloads, Bruin empowers data teams to deliver faster, more reliable, and scalable AI solutions. Harness Bruin's connectors for hundreds of platforms, including popular machine learning frameworks like TensorFlow and PyTorch. Build end-to-end AI workflows that integrate seamlessly with your existing tech stack. Join the ranks of forward-thinking organizations that are revolutionizing their data engineering with Bruin. Get started today at aiengineeringpodcast.com/bruin, and for dbt Cloud customers, enjoy a $1,000 credit to migrate to Bruin Cloud.Your host is Tobias Macey and today I'm interviewing Vinay Kumar about strategies and tactics for gaining insights into the decisions of your AI systemsInterview IntroductionHow did you get involved in machine learning?Can you start by giving a quick overview of what explainability means in the context of ML/AI?What are the predominant methods used to gain insight into the internal workings of ML/AI models?How does the size and modality of a model influence the technique and evaluation of methods used?What are the contexts in which a team would incorporate explainability into their workflow?How might explainability be used in a live system to provide guardrails or efficiency/accuracy improvements?What are the aspects of model alignment and explainability that are most challenging to implement?What are the supporting systems that are necessary to be able to effectively operationalize the collection and analysis of model reliability and alignment?"Trust", "Reliability", and "Alignment" are all words that seem obvious until you try to define them concretely. What are the ways that teams work through the creation of metrics and evaluation suites to gauge compliance with those goals?What are the most interesting, innovative, or unexpected ways that you have seen explainability methods used in AI systems?What are the most interesting, unexpected, or challenging lessons that you have learned while working on explainability/...
    Show More Show Less
    1 hr and 1 min
No reviews yet
In the spirit of reconciliation, Audible acknowledges the Traditional Custodians of country throughout Australia and their connections to land, sea and community. We pay our respect to their elders past and present and extend that respect to all Aboriginal and Torres Strait Islander peoples today.