Apple has unveiled OpenELM, a family of lightweight language models that can run locally on devices without a cloud connection.
Here's what we know.
The OpenELM family includes eight models of two types: pre-trained and instruction-tuned. Each type is available in 270M, 450M, 1.1B and 3B parameter sizes.
The models are pre-trained on public datasets totaling about 1.8 trillion tokens, drawn from sources including Reddit, Wikipedia and arXiv.org.
Thanks to efficiency optimizations, OpenELM models can run on ordinary laptops and even some smartphones. Testing was conducted on a PC equipped with an Intel Core i9 CPU and an RTX 4090 GPU, as well as a MacBook Pro with an M2 Max chip.
According to Apple, the models perform well for their size; the instruction-tuned 450M variant is particularly notable. OpenELM-1.1B outperformed OLMo, a comparable open model, by 2.36%, while requiring half as many pre-training tokens.
On the ARC-C benchmark, designed to test knowledge and reasoning skills, the pre-trained version of OpenELM-3B achieved an accuracy of 42.24%. It also scored 26.76% on MMLU and 73.28% on HellaSwag.
The company has released the OpenELM source code and checkpoints on Hugging Face under an open license, including the trained models, benchmark results, and sample instructions.
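Since the checkpoints are published on Hugging Face, a model can in principle be loaded locally with the `transformers` library. The sketch below is an assumption, not Apple's official sample code: the model id follows the naming of the published checkpoints, and the separate tokenizer id reflects the fact that the release reuses the Llama 2 tokenizer rather than shipping its own.

```python
# Hypothetical sketch of running an OpenELM checkpoint locally via Hugging Face.
# Model and tokenizer ids are assumptions based on the public release.
MODEL_ID = "apple/OpenELM-450M-Instruct"   # one of the eight released variants
TOKENIZER_ID = "meta-llama/Llama-2-7b-hf"  # assumed: release reuses Llama 2 tokenizer


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a completion on-device; downloads weights on first call."""
    # Imports are deferred so the module loads without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # OpenELM ships custom modeling code, hence trust_remote_code=True.
    tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("Explain on-device inference in one sentence."))
```

Note that the first call downloads the weights; afterwards inference runs entirely offline, which is the point of an on-device model.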
However, Apple warns that OpenELM models may produce inaccurate, harmful, or otherwise inappropriate responses, as they ship without safety safeguards.
Source: VentureBeat