• Disseminate: The Computer Science Research Podcast

  • By: Jack Waudby
  • Podcast

Disseminate: The Computer Science Research Podcast

By: Jack Waudby
  • Summary

  • This podcast features interviews with Computer Science researchers. Hosted by Dr. Jack Waudby researchers are interviewed, highlighting the problem(s) they tackled, solutions they developed, and how their findings can be applied in practice. This podcast is for industry practitioners, researchers, and students, aims to further narrow the gap between research and practice, and to generally make awesome Computer Science research more accessible. We have 2 types of episode: (i) Cutting Edge (red/blue logo) where we talk to researchers about their latest work, and (ii) High Impact (gold/silver logo) where we talk to researchers about their influential work.


    You can support the show through Buy Me a Coffee. A donation of $3 will help us keep making you awesome Computer Science research podcasts.

    Hosted on Acast. See acast.com/privacy for more information.

    Jack Waudby
    Show More Show Less
activate_mytile_page_redirect_t1
Episodes
  • Haralampos Gavriilidis | SheetReader: Efficient spreadsheet parsing
    Apr 17 2025

    In this episode of the DuckDB in Research series, Harry Gavriilidis (PhD student at TU Berlin) joins us to discuss Sheet Reader — a high-performance spreadsheet parser that dramatically outpaces traditional tools in both speed and memory efficiency. By taking advantage of the standardized structure of spreadsheet files and bypassing generic XML parsers, Sheet Reader delivers fast and lightweight parsing, even on large files. Now available as a DuckDB extension, it enables users to query spreadsheets directly with SQL and integrate them seamlessly into broader analytical workflows.


    Harry shares insights into the development process, performance benchmarks, and the surprisingly complex world of spreadsheet parsing. He also discusses community feedback, feature requests (like detecting multiple tables or parsing colored rows), and future plans — including tighter integration with DuckDB and support for Arrow. The conversation wraps up with a look at Harry’s broader research on composable database systems and data interoperability, highlighting how tools like DuckDB are reshaping modern data analysis.

    Hosted on Acast. See acast.com/privacy for more information.

    Show More Show Less
    41 mins
  • Arjen P. de Vries | faiss: An extension for vector data & search
    Apr 10 2025

    In this episode of the DuckDB in Research series, we’re joined by Arjen de Vries, Professor of Data Science at Radboud University. Arjen dives into his team’s development of a DuckDB extension for FAISS, a library originally developed at Facebook for efficient similarity search and vector operations.


    We explore the growing importance of embeddings and dense retrieval in modern information retrieval systems, and how DuckDB’s zero-copy architecture and tight integration with the Python ecosystem make it a compelling choice for managing large-scale vector data. Arjen shares insights into the technical challenges and architectural decisions behind the extension, comparisons with DuckDB’s native VSS (vector search) solution, and the broader vision of integrating vector search more deeply into relational databases.


    Along the way, we also touch on DuckDB's extension ecosystem, its potential for future research, and why tools like this are reshaping how we build and query modern AI-enabled systems.

    Hosted on Acast. See acast.com/privacy for more information.

    Show More Show Less
    46 mins
  • David Justen | POLAR: Adaptive and non-invasive join order selection via plans of least resistance
    Apr 3 2025

    In this episode, we sit down with David Justen to discuss his work on POLAR: Adaptive and Non-invasive Join Order Selection via Plans of Least Resistance which was implemented in DuckDB. David shares his journey in the database space, insights into performance optimization, and the challenges of working with modern analytical workloads. We dive into the intricacies of query compilation, vectorized execution, and how DuckDB is shaping the future of in-memory databases. Tune in for a deep dive into database internals, industry trends, and what’s next for high-performance data processing!


    Links:

    • VLDB 2024 Paper
    • David's Homepage

    Hosted on Acast. See acast.com/privacy for more information.

    Show More Show Less
    51 mins

What listeners say about Disseminate: The Computer Science Research Podcast

Average Customer Ratings

Reviews - Please select the tabs below to change the source of reviews.

In the spirit of reconciliation, Audible acknowledges the Traditional Custodians of country throughout Australia and their connections to land, sea and community. We pay our respect to their elders past and present and extend that respect to all Aboriginal and Torres Strait Islander peoples today.