From Notebooks to Production: Xorq’s lockfile Approach for Reproducible, Portable ML Pipelines

Summary

In this episode, Hussain shares the story behind xorq: a “lockfile for ML pipelines” that makes notebook work easier to reproduce, debug, and ship. We talk about why the research→production path is still so manual, how schemas (and Arrow) become the contract between systems, and what it takes to run the same pipeline across engines like Snowflake and Databricks. We also dig into escape hatches for imperative code, why feature stores didn’t become the default, and how xorq fits alongside other technologies like Iceberg.

Chapters

00:00 Hussain's Journey in Data Science

06:00 The Need for xorq: Bridging Research and Production

10:38 Challenges in Machine Learning Deployment

17:40 The Role of Lock Files in Data Pipelines

29:51 Understanding Schema Management in Data Systems

34:40 Navigating Declarative and Imperative Transformations

36:39 The Developer's Journey with xorq

38:34 Feature Stores vs. xorq: A Comparative Analysis

43:43 The Future of Feature Stores and Machine Learning

51:41 Reproducibility in Data Pipelines: xorq vs. Git-like Operations

55:47 The Future of xorq and the Data Ecosystem
