
OpenELM: Apple's Open Language Model Family
About this listen
The May 2024 sources center on CoreNet, an Apple-developed library for training deep neural networks, and OpenELM, an efficient language model family built with CoreNet. CoreNet, which evolved from the earlier CVNets, is a versatile toolkit supporting a range of tasks, including foundation models such as large language models (LLMs), object classification, and semantic segmentation. A key innovation highlighted is OpenELM's layer-wise scaling strategy, which allocates parameters non-uniformly across transformer layers to achieve higher accuracy with fewer pre-training tokens than other open LLMs. The resources emphasize reproducibility and transparency by providing a complete framework for OpenELM's training and evaluation, including code for inference and fine-tuning on Apple devices via the MLX library, along with detailed benchmarks on both NVIDIA CUDA and Apple Silicon hardware.
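To make the layer-wise scaling idea concrete, here is a minimal sketch in Python. It is not Apple's CoreNet code; the function name, parameter names, and the default scaling ranges are illustrative assumptions. The idea, per the OpenELM paper, is that attention-head counts and FFN widths grow linearly with layer depth instead of being uniform across all layers.

```python
# Illustrative sketch of layer-wise scaling (hypothetical names and
# default ranges, not Apple's actual CoreNet implementation).
# Each transformer layer i gets scaling factors alpha_i (attention)
# and beta_i (FFN) interpolated linearly from min to max with depth.

def layerwise_scaling(num_layers, d_model, head_dim,
                      alpha_min=0.5, alpha_max=1.0,
                      beta_min=0.5, beta_max=4.0):
    """Return a per-layer list of (num_heads, ffn_dim) tuples."""
    configs = []
    for i in range(num_layers):
        # fraction of depth, 0.0 at the first layer, 1.0 at the last
        t = i / (num_layers - 1) if num_layers > 1 else 0.0
        alpha = alpha_min + (alpha_max - alpha_min) * t  # attention scale
        beta = beta_min + (beta_max - beta_min) * t      # FFN multiplier
        num_heads = max(1, int(alpha * d_model / head_dim))
        ffn_dim = int(beta * d_model)
        configs.append((num_heads, ffn_dim))
    return configs

# Example: a toy 4-layer model with d_model=1024 and 64-dim heads.
for i, (heads, ffn) in enumerate(layerwise_scaling(4, 1024, 64)):
    print(f"layer {i}: heads={heads}, ffn_dim={ffn}")
```

Under these (assumed) ranges, early layers are narrow and later layers approach the full width, so the total parameter budget is redistributed toward deeper layers rather than spent uniformly.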
Sources:
https://arxiv.org/pdf/2404.14619
https://machinelearning.apple.com/research/openelm
https://github.com/apple/corenet
https://github.com/apple/corenet/tree/main/projects/kv-prediction