Key Takeaways:
On Tuesday, 23rd April, Microsoft announced the launch of Phi-3 Mini, the company's latest lightweight AI model and the first of three small language models (SLMs) it plans to release this year.
Phi-3 Mini has 3.8 billion parameters and is trained on a relatively small data set compared to large language models (LLMs) such as GPT-4 – built by OpenAI, the Microsoft-backed startup valued at $80 billion – and Google's Gemini.
SLMs are designed to perform easier tasks and make AI more accessible for companies with limited resources.
Microsoft Releases Small Language Model Phi-3 Mini
Phi-3 Mini is now available on Microsoft's cloud service platform Azure, machine learning model platform Hugging Face, and Ollama – a framework for running AI models on local devices.
Microsoft's latest SLM will also be available through Nvidia's software tool Nvidia Inference Microservices (NIM) and has been optimized for the chipmaker's graphics processing units (GPUs).
The American tech giant has plans to release two more versions of the SLM – Phi-3 Small with 7 billion parameters and Phi-3 Medium with 14 billion parameters.
Parameters are the internal values a model learns during training and serve as a rough measure of its capacity – more parameters generally mean a more capable model. For reference, LLMs like GPT-4 and Gemini are reported to have parameter counts running into the hundreds of billions or more, while Meta's small-scale model, Llama 3, comes with 8 billion parameters.
In December, Microsoft released Phi-2, which performed more or less on par with larger and more capable models like Llama 2. The company claims Phi-3 delivers much better performance than its predecessor and provides responses close to those of LLMs 10 times its size.
SLMs are Cheaper to Run and Better Suited for Personal Devices
Microsoft's VP for generative AI research, Sebastien Bubeck, said Phi-3 is "dramatically cheaper" than other models on the market with similar capabilities. Meanwhile, Eric Boyd, corporate VP of the company's Azure AI Platform, told The Verge that the SLM is as capable as LLMs like GPT-3.5 but in a smaller form factor.
Small AI models are cheaper to run than large language models and are better suited to personal devices like smartphones and computers. Google's Gemma 2B and 7B are good for simple chatbot and language-related tasks, Anthropic's Claude 3 Haiku can read dense research papers with graphs and summarize them instantly, and Meta's recently launched Llama 3 8B can be used for chatbot and coding assistance.
Microsoft Used an LLM to Train Phi-3 Mini and was Inspired by Children’s Books
When asked about Phi-3's training curriculum, Boyd said the team drew inspiration from how children learn from bedtime stories – books that use simpler words and sentence structures to discuss larger topics.
Microsoft developers compiled a list of 3,000 words and asked an LLM to use them to teach Phi-3. The SLM also picks up where its predecessors left off: while Phi-1 focused on coding and Phi-2 learned to reason, Phi-3 is better at both coding and reasoning.
Boyd says companies often find that smaller language models work better for their custom applications. Many companies' internal data sets are small enough for an SLM to handle, and because SLMs consume less computing power, they tend to be the more affordable and apt option.
Phi-3's smaller size is also an advantage: it can run locally on low-power hardware like smartphones and does not need to offload its computing tasks to expensive cloud-based processing centers.