
The Memriq AI Inference Brief – Engineering Edition


By: Keith Bourne

About this listen

The Memriq AI Inference Brief – Engineering Edition is a weekly deep dive into the technical guts of modern AI systems: retrieval-augmented generation (RAG), vector databases, knowledge graphs, agents, memory systems, and more. A rotating panel of AI engineers and data scientists breaks down architectures, frameworks, and patterns from real-world projects so you can ship more intelligent systems, faster.

Copyright 2025 Memriq AI
Episodes
  • Model Context Protocol: The Universal AI Integration Standard Explained
    Dec 15 2025

    Discover how the Model Context Protocol (MCP) is revolutionizing AI systems integration by simplifying complex multi-tool interactions into a scalable, open standard. In this episode, we unpack MCP’s architecture, adoption by industry leaders, and its impact on engineering workflows.

    In this episode:

    - What MCP is and why it matters for AI/ML engineers and infrastructure teams

    - The M×N integration problem and how MCP reduces it to M+N

    - Core primitives: Tools, Resources, and Prompts, and their roles in MCP

    - Technical deep dive into JSON-RPC 2.0 messaging, transports, and security with OAuth 2.1 + PKCE

    - Comparison of MCP with OpenAI Function Calling, LangChain, and custom REST APIs

    - Real-world adoption, performance metrics, and engineering trade-offs

    - Open challenges including security, authentication, and operational complexity
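    To give a concrete feel for the protocol mechanics before you press play: MCP messages are plain JSON-RPC 2.0, and tool invocation goes through the `tools/call` method. A minimal sketch of building such a request, with a hypothetical `get_weather` tool (the tool name and arguments are illustrative, not from the episode):

    ```python
    import json

    def make_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
        """Build a JSON-RPC 2.0 request in the shape MCP uses for tool invocation."""
        message = {
            "jsonrpc": "2.0",        # fixed protocol version string required by JSON-RPC 2.0
            "id": request_id,        # correlates the eventual response with this request
            "method": "tools/call",  # MCP's tool-invocation method
            "params": {"name": tool_name, "arguments": arguments},
        }
        return json.dumps(message)

    request = make_tool_call(1, "get_weather", {"city": "Sydney"})
    ```

    The episode covers how these messages travel over MCP's transports and how OAuth 2.1 + PKCE secures them.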

    Key tools & technologies mentioned:

    - Model Context Protocol (MCP)

    - JSON-RPC 2.0

    - OAuth 2.1 with PKCE

    - FastMCP Python SDK, MCP TypeScript SDK

    - agentgateway by Solo.io

    - OpenAI Function Calling

    - LangChain

    Timestamps:

    00:00 - Introduction to MCP and episode overview

    02:30 - The M×N integration problem and MCP's solution

    05:15 - Why MCP adoption is accelerating

    07:00 - MCP architecture and core primitives explained

    10:00 - Head-to-head comparison with alternatives

    12:30 - Under the hood: protocol mechanics and transports

    15:00 - Real-world impact and usage metrics

    17:30 - Challenges and security considerations

    19:00 - Closing thoughts and future outlook

    Resources:

    - "Unlocking Data with Generative AI and RAG" by Keith Bourne - Search for 'Keith Bourne' on Amazon and grab the 2nd edition

    - This podcast is brought to you by Memriq.ai - an AI consultancy and content studio building tools and resources for AI practitioners.

    20 mins
  • RAG Evaluation with ragas: Reference-Free Metrics & Monitoring
    Dec 14 2025

    Unlock the secrets to evaluating Retrieval-Augmented Generation (RAG) pipelines effectively and efficiently with ragas, the open-source framework that’s transforming AI quality assurance. In this episode, we explore how to implement reference-free evaluation, integrate continuous monitoring into your AI workflows, and optimize for production scale — all through the lens of Keith Bourne’s comprehensive Chapter 9.

    In this episode:

    - Overview of ragas and its reference-free metrics that achieve 95% human agreement on faithfulness scoring

    - Implementation patterns and code walkthroughs for integrating ragas with LangChain, LlamaIndex, and CI/CD pipelines

    - Production monitoring architecture: sampling, async evaluation, aggregation, and alerting

    - Comparison of ragas with other evaluation frameworks like DeepEval and TruLens

    - Strategies for cost optimization and asynchronous evaluation at scale

    - Advanced features: custom domain-specific metrics with AspectCritic and multi-turn evaluation support
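    As a taste of the production-monitoring pattern discussed in the episode: sample a fraction of live traffic for evaluation, then aggregate the scores and alert on regressions. This sketch stubs out the scoring step (real scoring would call ragas, which needs an LLM judge); the sampling, aggregation, and threshold values are illustrative assumptions:

    ```python
    import random
    from statistics import mean

    def should_evaluate(sample_rate: float = 0.1) -> bool:
        """Decide whether to evaluate this request; sampling keeps judge-LLM costs bounded."""
        return random.random() < sample_rate

    def aggregate(scores: list[float], alert_threshold: float = 0.8) -> dict:
        """Aggregate per-request faithfulness scores and flag a regression."""
        avg = mean(scores)
        return {"mean_faithfulness": avg, "alert": avg < alert_threshold}

    # Toy scores standing in for ragas faithfulness results on sampled traffic
    result = aggregate([0.95, 0.90, 0.70, 0.92])
    ```

    The async evaluation and alerting wiring on top of this is exactly what the monitoring-architecture segment walks through.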

    Key tools and technologies mentioned:

    - ragas (Retrieval Augmented Generation Assessment System)

    - LangChain, LlamaIndex

    - LangSmith, LangFuse (observability and evaluation tools)

    - OpenAI GPT-4o, GPT-3.5-turbo, Anthropic Claude, Google Gemini, Ollama

    - Python datasets library

    Timestamps:

    00:00 - Introduction and overview with Keith Bourne

    03:00 - Why reference-free evaluation matters and ragas’s approach

    06:30 - Core metrics: faithfulness, answer relevancy, context precision & recall

    09:00 - Code walkthrough: installation, dataset structure, evaluation calls

    12:00 - Integrations with LangChain, LlamaIndex, and CI/CD workflows

    14:30 - Production monitoring architecture and cost considerations

    17:00 - Advanced metrics and custom domain-specific evaluations

    19:00 - Common pitfalls and testing strategies

    20:30 - Closing thoughts and next steps

    Resources:

    - "Unlocking Data with Generative AI and RAG" by Keith Bourne - Search for 'Keith Bourne' on Amazon and grab the 2nd edition

    - Memriq AI: https://Memriq.ai

    - ragas website: https://www.ragas.io/

    - ragas GitHub repository: https://github.com/vibrantlabsai/ragas (for direct access to code and docs)

    Tune in to build more reliable, scalable, and maintainable RAG systems with confidence using open-source evaluation best practices.

    27 mins
  • Agent Engineering Unpacked: Breakthrough Discipline or Rebranded Hype?
    Dec 13 2025

    Agent engineering is rapidly emerging as a pivotal discipline in AI development, promising autonomous LLM-powered systems that can perceive, reason, and act in complex, real-world environments. But is this truly a new engineering frontier or just a rebranding of existing ideas? In this episode, we dissect the technology, tooling, real-world deployments, and the hard truths behind the hype.

    In this episode:

    - Explore the origins and "why now" of agent engineering, including key advances like OpenAI's function calling and expanded context windows

    - Break down core architectural patterns combining retrieval, tool use, and memory for reliable agent behavior

    - Compare leading frameworks and SDKs like LangChain, LangGraph, AutoGen, Anthropic Claude, and OpenAI Agents

    - Dive into production case studies from Klarna, Decagon, and TELUS showing impact and ROI

    - Discuss the critical challenges around reliability, security, evaluation, and cost optimization

    - Debate agent engineering vs. traditional ML pipelines and best practices for building scalable, observable agents

    Key tools & technologies mentioned: LangChain, LangGraph, AutoGen, Anthropic Claude SDK, OpenAI Agents SDK, Pinecone, Weaviate, Chroma, FAISS, LangSmith, Arize Phoenix, DeepEval, Giskard
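    To make the perceive-reason-act pattern concrete: at its core, an agent maps an observation to a tool invocation and returns the result. This is a deliberately tiny sketch with a hard-coded "reasoning" policy and a toy tool registry; real frameworks like LangChain or LangGraph replace the policy with an LLM call and manage tools for you:

    ```python
    from typing import Callable

    # Hypothetical tool registry; names and tools here are illustrative only.
    TOOLS: dict[str, Callable[[str], str]] = {
        "calculator": lambda expr: str(eval(expr)),  # toy tool; never eval untrusted input
    }

    def agent_step(observation: str) -> str:
        """One perceive-reason-act cycle with a stubbed reasoning policy."""
        # Reason: a real agent would ask an LLM which tool (if any) to call;
        # here the policy is a simple prefix check.
        if observation.startswith("compute:"):
            payload = observation.removeprefix("compute:")
            return TOOLS["calculator"](payload)  # Act: invoke the chosen tool
        return "no tool needed"

    answer = agent_step("compute:2+3")
    ```

    The reliability and evaluation challenges the panel debates come from running loops like this autonomously, with retrieval and memory in the mix.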

    Timestamps:

    00:00 - Introduction & episode overview

    02:20 - The hype vs. reality: failure rates and market investments

    05:15 - Why agent engineering matters now: tech enablers & economics

    08:30 - Architecture essentials: retrieval, tool use, memory

    11:45 - Tooling head-to-head: LangChain, LangGraph, AutoGen & SDKs

    15:00 - Under the hood: example agent workflow and orchestration

    17:45 - Real-world impact & production case studies

    20:30 - Challenges & skepticism: reliability, security, cost

    23:00 - Agent engineering vs. traditional ML pipelines debate

    26:00 - Toolbox recommendations & engineering best practices

    28:30 - Closing thoughts & final takeaways

    Resources:

    - "Unlocking Data with Generative AI and RAG" second edition by Keith Bourne - Search for 'Keith Bourne' on Amazon

    - Memriq AI: https://memriq.ai

    Thanks for tuning into The Memriq AI Inference Brief – Engineering Edition. Stay curious and keep building!

    30 mins