
🔴 Tech Beats Live: LLM Quantization "vLLM vs. llama.cpp"
About this listen
👋🏼 Hey AI heads 🎙️ Join us for the very first Tech Beats Live 🔴, hosted by Kosseila, aka @CloudDude, from @CloudThrill.

🎯 This chill, laid-back livestream will unpack LLM quantization 🔥:
✅ WHY it matters
✅ HOW it works
✅ Enterprise (vLLM) vs. consumer (@Ollama) tradeoffs
✅ WHERE it's going next

We'll be joined by two incredible guest stars to talk about enterprise vs. consumer quants 🗣️:
🔷 Eldar Kurtić, bringing the enterprise perspective with vLLM.
🔷 Colin Kealty, aka Bartowski, publisher of top-downloaded GGUF quantized LLMs on Hugging Face.

🫵🏼 Come learn, and have some fun 😎.

Chapters:
(00:00) Host introduction
(04:07) Eldar intro
(07:33) Bartowski intro
(13:04) What is quantization?
(16:19) Why does LLM quantization matter?
(20:39) Training vs. inference: "The new deal"
(27:46) Biggest misconception about quantization
(33:22) Enterprise quantization in production (vLLM)
(48:48) Consumer LLMs and quantization (Ollama, llama.cpp, GGUF): "LLMs for the people"
(01:06:45) BitNet 1-bit quantization from Microsoft
(01:28:14) How long does it take to quantize a model (Llama 3 70B) with GGUF or llm-compressor?
(01:34:23) What is I-Matrix, and why do people confuse it with IQ quantization?
(01:39:36) What are LoRA and LoRAQ?
(01:42:36) What is sparsity?
(01:47:42) What is distillation?
(01:52:34) Extreme quantization (Unsloth) of big models (DeepSeek) at 2 bits with a 70% size cut
(01:57:27) Will future models (Llama 5) be trained on FP4 tensor cores? If so, why quantize them?
(02:02:15) The future of LLMs on edge devices (Google AI Edge)
(02:08:00) How to evaluate the quality of a quantized model
(02:26:09) Hugging Face's role in the world of LLMs and quantization
(02:33:46) Hugging Face's role in the world of LLMs and quantization
(02:36:41) LocalLLaMA subreddit down (moderator goes bananas)
(02:40:11) Guests' hopes for the future of LLMs and AI in general

Check out the quantization blog: https://cloudthrill.ca/llm-quantizati...

#AI #LLM #Quantization #TechBeatsLive #Locallama #VLLM #Ollama