Accelerating LLMs with TornadoVM: From GPU Kernels to Model Inference

An airhacks.fm conversation with Juan Fumero (@snatverk) about: TornadoVM as a Java parallel framework for accelerating data-parallel workloads on GPUs and other hardware, first GPU experiences with ELSA Winner and Voodoo cards, explanation of TornadoVM as a plugin to existing JDKs that uses Graal as a library, TornadoVM's programming model with @Parallel and @Reduce annotations for parallelizable code, introduction of the Kernel API for lower-level GPU programming, TornadoVM's ability to dynamically reconfigure and select the best hardware for workloads, implementation of LLM inference acceleration with TornadoVM, challenges in accelerating Llama models on GPUs, introduction of tensor types in TornadoVM to support FP8 and FP16 operations, shared buffer capabilities for GPU memory management, comparison of Java Vector API performance versus GPU acceleration, discussion of model quantization as a potential use case for TornadoVM, exploration of Deep Java Library (DJL) and its NDArray implementation, potential standardization of tensor types in Java, integration possibilities with Project Babylon and its Code Reflection capabilities, TornadoVM's execution plans and task graphs for defining accelerated workloads, ability to run on multiple GPUs with different backends simultaneously, potential enterprise applications for LLMs in Java including model distillation for domain-specific models, discussion of Foreign Function & Memory API integration in TornadoVM, performance comparison between different GPU backends like OpenCL and CUDA, collaboration with Intel on Level Zero API and integrated graphics support, future plans for RISC-V support in TornadoVM

Juan Fumero on Twitter: @snatverk
