Backed by the government of the United Arab Emirates, the Technology Innovation Institute (TII) announced the launch of Falcon 3, a family of open-source small language models (SLMs) designed to run efficiently on lightweight, single-GPU infrastructure.
Falcon 3 comes in four sizes – 1B, 3B, 7B and 10B – with base and instruct variants, promising to democratize access to advanced AI capabilities for developers, researchers and enterprises. According to Hugging Face, the models already match or outperform popular open-source counterparts in their size class, including Meta's Llama and category leader Qwen 2.5.
This development comes at a time when demand for SLMs – which have fewer parameters and simpler designs than LLMs – is growing rapidly, driven by their performance, affordability and ability to be deployed on resource-constrained devices. They suit a range of applications across industries such as customer service, healthcare, mobile apps and IoT, where typical LLMs may be too computationally expensive to run effectively. According to Valuates Reports, the market for these models is expected to grow at a CAGR of almost 18% over the next five years.
What does Falcon 3 bring to the table?
Trained on 14 trillion tokens – more than double that of its predecessor, Falcon 2 – the Falcon 3 family uses a decoder-only architecture with grouped-query attention to share parameters and minimize memory consumption for the key-value (KV) cache during inference. This enables faster, more efficient performance when handling a variety of text-based tasks.
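The KV-cache savings from grouped-query attention come down to simple arithmetic: the cache scales with the number of KV heads, so sharing each KV head across a group of query heads shrinks it proportionally. A rough sketch, using a hypothetical 7B-class configuration (layer count, head counts and head dimension are illustrative assumptions, not Falcon 3's actual hyperparameters):

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Approximate KV-cache size: two tensors (K and V) per layer,
    each of shape (seq_len, n_kv_heads, head_dim), stored in fp16."""
    return 2 * n_layers * seq_len * n_kv_heads * head_dim * bytes_per_value

# Hypothetical config: 32 layers, 32 query heads, head_dim 128, 32K context.
# Full multi-head attention caches K/V for all 32 heads...
mha = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=32_768)
# ...while grouped-query attention with 8 KV heads caches only 8.
gqa = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, seq_len=32_768)

print(f"MHA cache: {mha / 2**30:.1f} GiB, GQA cache: {gqa / 2**30:.1f} GiB")
# → MHA cache: 16.0 GiB, GQA cache: 4.0 GiB
```

Under these assumed numbers, grouping 32 query heads over 8 KV heads cuts the per-sequence cache by 4x, which is what makes long 32K-token contexts feasible on a single GPU.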
The models support four primary languages – English, French, Spanish and Portuguese – and come with a 32K context window, allowing them to process long inputs such as lengthy documents.
“Falcon 3 is versatile, designed for both general and specialized tasks, providing users with enormous flexibility. Its base model is ideal for generative applications, while the instruct variant excels in conversational tasks such as customer service or virtual assistants,” notes TII on its website.
According to the benchmark results on Hugging Face, while all four Falcon 3 models perform quite well, the 10B and 7B versions are the stars of the show, achieving state-of-the-art results in reasoning, language understanding, instruction following, coding and math tasks.
Among models in the sub-13B size class, the Falcon 3 10B and 7B versions outperform competitors including Google's Gemma 2-9B, Meta's Llama 3.1-8B, Mistral-7B and Yi 1.5-9B. They even beat Alibaba's category leader Qwen 2.5-7B on most benchmarks – including MuSR, MATH, GPQA and IFEval – with the exception of MMLU, a benchmark that tests how well language models understand and process human language.
Applications across industries
Now available on Hugging Face, the Falcon 3 models are intended to serve a wide range of users, enabling cost-effective AI deployments without computational bottlenecks. With their ability to handle domain-specific tasks with fast processing times, the models can support a variety of applications at the edge and in privacy-sensitive environments, including customer service chatbots, personalized recommender systems, data analytics, fraud detection, healthcare diagnostics, supply chain optimization and education.
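For developers wanting to experiment, the models can be loaded with Hugging Face's `transformers` library in the usual way. A minimal sketch – the repo id `tiiuae/Falcon3-7B-Instruct` is an assumption based on TII's Hugging Face organization, not something this article confirms:

```python
def load_falcon3(model_id="tiiuae/Falcon3-7B-Instruct"):
    """Load a Falcon 3 instruct model and its tokenizer from Hugging Face.
    Requires `pip install transformers torch`, network access and enough
    GPU memory; the default model id is an assumed repo name."""
    from transformers import AutoModelForCausalLM, AutoTokenizer  # lazy import
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    return model, tokenizer

# Example usage (downloads the model weights on first run):
# model, tokenizer = load_falcon3()
# inputs = tokenizer("What is grouped-query attention?", return_tensors="pt").to(model.device)
# print(tokenizer.decode(model.generate(**inputs, max_new_tokens=100)[0]))
```

The 1B and 3B variants follow the same pattern and are small enough to run on a single consumer GPU, which is the deployment scenario TII emphasizes.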
The Institute also plans to further expand the Falcon family by introducing models with multimodal capabilities. These models are expected to be launched in January 2025.
It is worth noting that all models were released under the TII Falcon 2.0 license, a permissive license based on Apache 2.0 with an acceptable use policy that encourages responsible development and deployment of AI. To help users get started, TII has also launched the Falcon Playground, a testing environment where researchers and developers can try Falcon 3 models before integrating them into their applications.