🔴TechBeats live : LLM Quantization "vLLM vs. Llama.cpp"

About this listen

👋🏼 Hey AI heads 🎙️ Join us for the very first Tech Beats Live 🔴, hosted by Kosseila (aka @CloudDude) from @CloudThrill. 🎯 This chill, laid-back livestream will unpack LLM quantization 🔥: ✅ WHY it matters ✅ HOW it works ✅ Enterprise (vLLM) vs. Consumer (@Ollama) tradeoffs ✅ and WHERE it's going next.

We'll be joined by two incredible guest stars to talk Enterprise vs. Consumer quants 🗣️:
🔷 Eldar Kurtić, bringing the enterprise perspective with vLLM.
🔷 Colin Kealty, aka Bartowski, whose GGUF quants are among the most downloaded on Hugging Face.
🫵🏼 Come learn and have some fun 😎.

Chapters:
(00:00) Host Introduction
(04:07) Eldar Intro
(07:33) Bartowski Intro
(13:04) What Is Quantization?
(16:19) Why LLM Quantization Matters
(20:39) Training vs. Inference: "The New Deal"
(27:46) Biggest Misconception About Quantization
(33:22) Enterprise Quantization in Production (vLLM)
(48:48) Consumer LLMs and Quantization (Ollama, llama.cpp, GGUF): "LLMs for the People"
(01:06:45) BitNet 1-Bit Quantization from Microsoft
(01:28:14) How Long It Takes to Quantize a Model (Llama 3 70B) with GGUF or llm-compressor
(01:34:23) What Is the I-Matrix, and Why Do People Confuse It with IQ Quantization?
(01:39:36) What Are LoRA and QLoRA?
(01:42:36) What Is Sparsity?
(01:47:42) What Is Distillation?
(01:52:34) Extreme Quantization (Unsloth) of Big Models (DeepSeek) at 2 Bits with a 70% Size Cut
(01:57:27) Will Future Models (Llama 5) Be Trained on FP4 Tensor Cores? If So, Why Quantize Them?
(02:02:15) The Future of LLMs on Edge Devices (Google AI Edge)
(02:08:00) How to Evaluate the Quality of a Quantized Model
(02:26:09) Hugging Face's Role in the World of LLMs/Quantization
(02:33:46) Hugging Face's Role in the World of LLMs/Quantization
(02:36:41) LocalLLaMA Subreddit Down (Moderator Goes Bananas)
(02:40:11) Guests' Hopes for the Future of LLMs and AI in General

Check out the quantization blog: https://cloudthrill.ca/llm-quantizati...

#AI #LLM #Quantization #TechBeatsLive #Locallama #VLLM #Ollama
