AI: post transformers

By: mcgrof

About this listen

The transformer architecture revolutionized the world of neural networks and became the springboard for what we know today as modern artificial intelligence. This podcast reviews modern state-of-the-art research papers, starting from the transformer and moving forward.
Episodes
  • GPT-NeoX: Large-Scale Autoregressive Language Modeling in PyTorch
    Sep 7 2025

    This episode describes EleutherAI's GPT-NeoX library, a robust open-source framework for training large-scale autoregressive language models on GPUs, building upon the Megatron and DeepSpeed libraries. It highlights the library's advanced features, such as distributed training, support for a variety of hardware and systems, and cutting-edge architectural innovations. The episode also provides practical guidance on setup, configuration, data preparation, training, inference, and evaluation, alongside details on pretrained models like GPT-NeoX-20B and Pythia. Furthermore, it covers how to export models to Hugging Face and monitor experiments, underscoring the library's widespread adoption in research and industry.


    Source:

    https://github.com/EleutherAI/gpt-neox


    12 mins
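The episode centers on autoregressive language modeling: generating text one token at a time, with each new token conditioned on everything generated so far. The following is a minimal, self-contained sketch of that decoding loop using a toy bigram-count "model"; it illustrates the objective GPT-NeoX trains transformers for at scale, and is not GPT-NeoX's actual API.

```python
# Toy sketch of autoregressive decoding. The "model" is just bigram
# counts; GPT-NeoX trains large transformers for the same next-token
# objective. Nothing here is GPT-NeoX's actual interface.
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count next-token frequencies for each token (a stand-in 'model')."""
    counts = defaultdict(Counter)
    for cur, nxt in zip(tokens, tokens[1:]):
        counts[cur][nxt] += 1
    return counts

def generate(counts, prompt, max_new_tokens=5):
    """Autoregressive loop: each new token conditions on what came before."""
    out = list(prompt)
    for _ in range(max_new_tokens):
        nexts = counts.get(out[-1])
        if not nexts:
            break
        out.append(nexts.most_common(1)[0][0])  # greedy decoding
    return out

corpus = "the cat sat on the mat the cat ran".split()
model = train_bigram(corpus)
print(generate(model, ["the"], 3))
```

Swapping the greedy `most_common` pick for sampling from the count distribution would give the stochastic generation real language models use.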
  • SGLang: Efficient Language Model Program Execution
    Sep 7 2025

    This June 2024 paper introduces SGLang, a framework designed to enhance the efficiency of Large Language Model (LLM) and Vision Language Model (VLM) serving. It achieves this through a co-design of a flexible frontend language and a fast backend runtime. The frontend simplifies programming with primitives for generation and parallelism, while the backend utilizes novel optimizations like RadixAttention for KV cache reuse and compressed finite state machines for faster structured output decoding. These innovations allow SGLang to significantly improve throughput and reduce latency compared to existing systems across various LLM applications and hardware platforms. The framework is open-source, boasts extensive model support, and has seen wide industry adoption due to its performance benefits in complex LM programs.

    Sources:


    https://arxiv.org/pdf/2312.07104

    https://docs.sglang.ai/

    https://github.com/sgl-project/sglang

    17 mins
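The key backend idea the episode highlights, RadixAttention, is that requests sharing a token prefix (for example, a common system prompt) can reuse the KV cache computed for that prefix. The sketch below is a toy illustration of that prefix-matching idea using a plain trie over tokens; a real serving runtime stores KV tensors in a radix tree with eviction, so this is not SGLang's actual data structure or API.

```python
# Toy illustration of the prefix-reuse idea behind RadixAttention:
# requests sharing a token prefix can reuse cached state for it. Here a
# simple trie records previously processed token sequences and reports
# how many leading tokens of a new request are already cached.
class PrefixCache:
    def __init__(self):
        self.root = {}  # token -> child node (a simple trie)

    def insert(self, tokens):
        """Record a processed token sequence in the cache."""
        node = self.root
        for t in tokens:
            node = node.setdefault(t, {})

    def match(self, tokens):
        """Return the length of the longest cached prefix of `tokens`."""
        node, hit = self.root, 0
        for t in tokens:
            if t not in node:
                break
            node, hit = node[t], hit + 1
        return hit

cache = PrefixCache()
cache.insert(["sys", "you", "are", "helpful", "q1"])
# A second request sharing the system prompt reuses 4 cached tokens:
print(cache.match(["sys", "you", "are", "helpful", "q2"]))  # -> 4
```

In a real runtime the matched prefix maps to KV tensors that skip recomputation, which is where the throughput gains come from.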
  • Eleuther: Evaluating LLMs
    Sep 7 2025

    These sources collectively explore various approaches to evaluating and improving Large Language Models (LLMs). Several papers introduce new benchmark datasets designed to test LLMs on complex reasoning tasks, such as the "BIG-Bench Hard (BBH)" suite, the graduate-level "GPQA" questions in science, and "MuSR" for multistep soft reasoning in natural language narratives. A key technique discussed across these sources is Chain-of-Thought (CoT) prompting, which encourages LLMs to show their step-by-step reasoning, leading to improved performance, often surpassing human-rater averages on challenging tasks. Additionally, the "Instruction-Following Eval (IFEval)" introduces a reproducible benchmark for verifiable instructions, allowing for objective assessment of an LLM's ability to follow explicit directives. The "MMLU-Pro Benchmark" further contributes a large-scale dataset across diverse disciplines to rigorously assess model capabilities, emphasizing the need for robust evaluation metrics and challenging data to push the boundaries of AI reasoning.


    Sources:

    https://github.com/EleutherAI/lm-evaluation-harness

    https://github.com/EleutherAI/lm-evaluation-harness/blob/main/lm_eval/tasks/leaderboard/README.md

    https://arxiv.org/pdf/2103.03874 - Measuring Mathematical Problem Solving With the MATH Dataset

    https://arxiv.org/pdf/2210.09261 - Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them

    https://arxiv.org/pdf/2310.16049 - MuSR: Testing the Limits of Chain-of-Thought with Multistep Soft Reasoning

    https://arxiv.org/pdf/2311.07911 - Instruction-Following Evaluation for Large Language Models

    https://arxiv.org/pdf/2311.12022 - GPQA: A Graduate-Level Google-Proof Q&A Benchmark

    https://arxiv.org/pdf/2406.01574 - MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark


    27 mins
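IFEval's contribution, as the episode notes, is that each instruction is verifiable: a program can check whether a response follows it, so scoring needs no human rater. The sketch below shows that idea with a few illustrative checks; the instruction names and scoring are hypothetical stand-ins, not the actual benchmark's checks.

```python
# Toy sketch of IFEval-style "verifiable instructions": each instruction
# is a programmatic predicate applied to a model response, so evaluation
# is objective and reproducible. These checks are illustrative, not the
# benchmark's real instruction set.
def check_min_words(response, n):
    """Instruction: respond with at least n words."""
    return len(response.split()) >= n

def check_contains_keyword(response, word):
    """Instruction: mention a required keyword."""
    return word.lower() in response.lower()

def check_ends_with_period(response):
    """Instruction: end the response with a period."""
    return response.rstrip().endswith(".")

def score(response, checks):
    """Fraction of verifiable instructions the response satisfies."""
    results = [fn(response, *args) for fn, *args in checks]
    return sum(results) / len(results)

resp = "Transformers enable parallel training across long sequences."
checks = [
    (check_min_words, 5),
    (check_contains_keyword, "transformers"),
    (check_ends_with_period,),
]
print(score(resp, checks))  # -> 1.0
```

Because every check is deterministic, two labs running the same responses through the same checks get identical scores, which is the reproducibility property the benchmark is after.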