E02 RAG for Embedded Systems Development: When Retrieval Augmented Generation Makes Sense (and When It Doesn't)


About this listen

Ryan and Luca explore Retrieval Augmented Generation (RAG) and its practical applications in embedded development. Following Ryan's recent discussions at the Embedded Systems Summit, we dig into what RAG actually is: a system that splits documents into chunks, stores those chunks in a vector database, and retrieves the most relevant ones so the LLM answers from real source material rather than hallucinating. While it sounds perfect for handling massive datasheets and documentation, the reality is more complex.
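
To make the pipeline concrete, here is a minimal sketch of the chunk, embed, and retrieve loop described above; everything in it is illustrative. The bag-of-words embed() is a stand-in for a real embedding model, the list of tuples stands in for a vector database, and the sample datasheet text is invented:

```python
import math
from collections import Counter

def chunk(text: str, size: int = 120) -> list[str]:
    """Naive fixed-size chunking; real pipelines split on document structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system calls an embedding model."""
    return Counter(text.lower().split())

def norm(v: Counter) -> float:
    return math.sqrt(sum(x * x for x in v.values()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    return dot / (norm(a) * norm(b)) if a and b else 0.0

# Stand-in for a datasheet; the text is invented for the example.
datasheet = (
    "The SPI peripheral supports clock frequencies up to 50 MHz. "
    "The UART supports baud rates up to 3 Mbaud. "
    "Operating temperature range is -40 to 85 degrees Celsius."
)

# The "vector database": embedded chunks stored alongside their text.
store = [(embed(c), c) for c in chunk(datasheet)]

# Retrieval: embed the query, rank chunks by similarity, and hand the
# best hits to the LLM as context instead of the whole document.
query = embed("maximum SPI clock frequency")
best = max(store, key=lambda entry: cosine(query, entry[0]))
print(best[1])
```

Whether the right chunk comes back depends almost entirely on how the document was split, which is exactly where the trouble starts.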

We discuss the critical challenge of chunking: breaking documents into the right-sized pieces for effective retrieval. Make the chunks too big and similarity search becomes useless; make them too small and you lose the surrounding context. Luca shares his hands-on experience trying to make RAG work with datasheets, revealing the gap between theory and practice. With modern LLMs offering larger context windows and better document parsing, we question whether RAG has missed its window of usefulness for most development tasks. The conversation covers when RAG still makes sense (legal contexts, parts catalogs, private LLMs) and explores alternatives such as letting the LLM search documents directly with grep and other Unix tools.
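
For contrast, here is a sketch of the grep-based alternative: instead of maintaining a vector store, the LLM gets a search tool that shells out to grep over the raw documentation. The docs/ path and the tool wiring are our illustration, not something specified in the episode:

```python
import subprocess

def grep_docs(pattern: str, path: str = "docs/", context: int = 2) -> str:
    """Search the documentation tree with grep, returning matches plus
    surrounding lines; an LLM with tool use reads this output directly."""
    result = subprocess.run(
        ["grep", "-rinE", f"-C{context}", pattern, path],
        capture_output=True,
        text=True,
    )
    return result.stdout or "(no matches)"

# The model can iterate on its own queries the way a developer would,
# e.g. narrowing a broad term down to a more specific pattern:
print(grep_docs(r"SPI.*(clock|frequency)"))
```

Because grep matches literal text rather than semantic similarity, this works best when the model can guess the document's vocabulary; in practice it compensates by refining the pattern over several calls, with no chunking or indexing step at all.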

Key Topics:

  • [02:15] What RAG is: Retrieval Augmented Generation explained
  • [04:30] RAG for embedded documentation and datasheets
  • [07:45] The chunking problem: breaking documents into searchable pieces
  • [12:20] Vector databases and similarity search mechanics
  • [16:40] Luca's real-world experience: challenges with datasheet RAG
  • [20:15] Modern alternatives: LLMs using grep and Unix tools
  • [25:30] When RAG still makes sense: legal contexts and parts catalogs
  • [30:45] RAG vs. larger context windows in modern LLMs
  • [35:20] Private LLMs and when RAG becomes relevant again

Notable Quotes:

"Data sheets are inaccurate. You still have to engineer this. You cannot just go and let it go." — Ryan

"It's so difficult to get the chunking right. If you make it too big, that's not useful. If you make it too small, then again, it becomes difficult to search for because you're losing too much context." — Luca

"These days, LLMs are good enough at just ad hoc-ing this. You can do away with all of the complexity of vector stores and chunking." — Luca

"We have the hardware. We can actually prove it one way or another. If it doesn't work on hardware, then it's not right." — Ryan

"RAG is quite tempting and quite interesting, but it's deceptively simple unless you have good reason to believe that you can get it working." — Luca

Resources Mentioned:

  • Google NotebookLM - Tool mentioned for ingesting PDFs and creating AI-generated podcasts from documents
  • Tree-sitter - Incremental parsing library mentioned as an alternative to RAG for code analysis
  • Embedded Systems Summit - Jacob Beningo's conference where RAG and AI topics were discussed

Connect With Us:

  • Try experimenting with modern LLMs and their built-in document parsing capabilities before investing time in a RAG implementation
  • Share your experiences with RAG in embedded development; we'd love to hear what worked and what didn't
  • Consider the trade-offs between public LLMs and private models when deciding whether RAG is worth the complexity for your use case