
AI Scaling Laws, DeepSeek’s Cost Efficiency & The Future of AI Training


About this listen

In this first episode of Gradient Descent, hosts Vishnu Vettrivel (CTO of Wisecube AI) and Alex Thomas (Principal Data Scientist) discuss the rapid evolution of AI, the breakthroughs in LLMs, and the role of Natural Language Processing in shaping the future of artificial intelligence. They also share their experiences in AI development and explain why this podcast differs from other AI discussions.

Chapters:
00:00 – Introduction
01:56 – DeepSeek Overview
02:55 – Scaling Laws and Model Performance
04:36 – Peak Data: Are we running out of quality training data?
08:10 – Industry reaction to DeepSeek
09:05 – Jevons' Paradox: Why cheaper AI can drive more demand
11:04 – Supervised Fine-Tuning vs Reinforcement Learning (RLHF)
14:49 – Why Reinforcement Learning helps AI models generalize
20:29 – Distillation and Training Efficiency
25:01 – AI safety concerns: Toxicity, bias, and censorship
30:25 – Future Trends in LLMs: Cheaper, more specialized AI models?
37:30 – Final thoughts and upcoming topics

Listen on:
• YouTube: https://youtube.com/@WisecubeAI/podcasts
• Apple Podcasts: https://apple.co/4kPMxZf
• Spotify: https://open.spotify.com/show/1nG58pwg2Dv6oAhCTzab55
• Amazon Music: https://bit.ly/4izpdO2

Our solutions:
• https://askpythia.ai/ - LLM Hallucination Detection Tool
• https://www.wisecube.ai - Wisecube AI platform for large-scale biomedical knowledge analysis

Follow us:
• Pythia Website: www.askpythia.ai
• Wisecube Website: www.wisecube.ai
• LinkedIn: www.linkedin.com/company/wisecube
• Facebook: www.facebook.com/wisecubeai
• Reddit: www.reddit.com/r/pythia/

Mentioned Materials:
- Jevons' Paradox: https://en.wikipedia.org/wiki/Jevons_paradox
- Scaling Laws for Neural Language Models: https://arxiv.org/abs/2001.08361
- Distilling the Knowledge in a Neural Network: https://arxiv.org/abs/1503.02531
- SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training: https://arxiv.org/abs/2501.17161
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning: https://arxiv.org/abs/2501.12948
- Reinforcement Learning: An Introduction (Sutton & Barto): https://web.stanford.edu/class/psych209/Readings/SuttonBartoIPRLBook2ndEd.pdf

