Liquid: LFM2-8B-A1B
Model Type: Open weight model, 8B parameters
LFM2-8B-A1B is Liquid AI's efficient on-device MoE model with 8.3B total parameters and only 1.5B active, delivering quality comparable to 3-4B dense models while running faster than Qwen3-1.7B.
> "LFM2-8B-A1B is the best on-device MoE in terms of both quality (comparable to 3-4B dense models) and speed (faster than Qwen3-1.7B)." (Liquid AI)
Overview
Released October 7, 2025, LFM2-8B-A1B brings Mixture-of-Experts efficiency to edge devices. Quantized variants fit comfortably on high-end phones, tablets, and laptops while delivering significantly improved code and knowledge capabilities compared to dense LFM2 models.
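The "total vs. active" split is the core MoE idea: a learned router picks a small subset of expert networks for each token, so only a fraction of the weights run in any forward pass. Below is a minimal, generic top-k routing sketch in PyTorch; the expert count, hidden size, and k are invented for illustration and are not LFM2's actual configuration.

```python
import torch
import torch.nn.functional as F

# Generic top-k MoE routing sketch. num_experts, top_k, and d_model are
# placeholder values, NOT LFM2's real configuration.
num_experts, top_k, d_model = 16, 2, 256

router = torch.nn.Linear(d_model, num_experts)
experts = torch.nn.ModuleList(
    torch.nn.Linear(d_model, d_model) for _ in range(num_experts)
)

def moe_forward(x: torch.Tensor) -> torch.Tensor:
    """x: (num_tokens, d_model). Each token activates only top_k experts."""
    weights, idx = torch.topk(router(x), top_k, dim=-1)  # route per token
    weights = F.softmax(weights, dim=-1)                 # normalize over chosen experts
    out = torch.zeros_like(x)
    for t in range(x.shape[0]):
        for w, e in zip(weights[t], idx[t]):
            out[t] += w * experts[int(e)](x[t])          # only k of 16 experts run
    return out

y = moe_forward(torch.randn(4, d_model))
print(y.shape)  # torch.Size([4, 256])
```

All experts contribute to the parameter count on disk and in memory, but per-token compute scales with k, which is why an 8.3B-total model can decode like a ~1.5B one.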
Key Features
- 8.3B total / 1.5B active parameters (MoE)
- 32K context window
- Faster than Qwen3-1.7B with better quality
- Quality comparable to 3-4B dense models
- Runs on phones, tablets, laptops
- 12 trillion training tokens
- 8 languages: English, Arabic, Chinese, French, German, Japanese, Korean, Spanish
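For a quick start, the model can be loaded with standard Hugging Face transformers APIs. This is a minimal sketch: the `LiquidAI/LFM2-8B-A1B` repo id is inferred from the model name, and LFM2 models may require a recent transformers release, so check the official model card first.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LiquidAI/LFM2-8B-A1B"  # assumed Hugging Face repo id; confirm on the model card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

messages = [{"role": "user", "content": "Explain mixture-of-experts in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```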
Benchmark Performance
LFM2-8B-A1B posts strong instruction-following and math scores while running significantly faster than similarly sized models:
| Benchmark | LFM2-8B-A1B | Llama-3.2-3B | gemma-3-4b-it |
|---|---|---|---|
| MMLU | 64.84 | 60.35 | 58.35 |
| IFEval | 77.58 | 71.43 | 76.85 |
| GSM8K | 84.38 | 75.21 | 89.92 |
| MGSM | 72.4 | 61.68 | 87.28 |
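These scores could in principle be sanity-checked with EleutherAI's lm-evaluation-harness, sketched below. Liquid's exact prompting, few-shot counts, and harness version are not stated here, so locally reproduced numbers may differ from the table.

```python
import lm_eval

# Rough reproduction sketch using lm-evaluation-harness defaults; the repo id
# and task settings are assumptions and may not match Liquid's evaluation setup.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=LiquidAI/LFM2-8B-A1B,dtype=auto",
    tasks=["mmlu", "ifeval", "gsm8k"],
    batch_size=8,
)
print(results["results"])
```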
Recommended Use Cases
- Agentic tasks and function calling
- Data extraction
- RAG pipelines
- Creative writing
- Multi-turn conversations
Not recommended for: knowledge-intensive tasks or complex programming without fine-tuning (Liquid suggests fine-tuning for these). A tool-calling sketch for the agentic use case follows below.
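Since function calling heads the use-case list, here is a hypothetical tool-calling sketch reusing the tokenizer and model loaded earlier. It relies on the generic transformers chat-template `tools` argument; whether LFM2's template consumes tool schemas exactly this way is an assumption, and `get_weather` is a made-up stub.

```python
def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    return f"Sunny, 22 C in {city}"  # stub; a real tool would call an API

messages = [{"role": "user", "content": "What's the weather in Tokyo?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],  # transformers builds a JSON schema from the signature
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
# The model should emit a structured call to get_weather("Tokyo"), assuming
# its chat template supports the tools field.
```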
When to Use LFM2-8B-A1B
Choose LFM2-8B-A1B when you need:
- Best quality/speed balance for on-device MoE
- Deployment on high-end phones and tablets
- Improved code and knowledge vs smaller LFM2 models
- Fast inference with 1.5B active parameters
Choose LFM2-24B-A2B when you need:
- Maximum LFM2 capability
- Server or desktop deployment
Choose LFM2-2.6B when you need:
- Dense model simplicity
- Dynamic hybrid reasoning
- Broader device compatibility
Hardware Requirements
| Device | Quantization | Performance |
|---|---|---|
| Samsung Galaxy S24 Ultra | INT4 | ~45 tok/s decode |
| AMD HX370 CPU | INT4 | ~70 tok/s decode |
| High-end laptop | Q4_K_M | Comfortable fit |
Significantly faster than models with similar active parameters (e.g., Qwen3-1.7B).
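For the Q4_K_M laptop case, a quantized GGUF build can be run with llama-cpp-python along these lines. The GGUF repo id and filename pattern are assumptions; check Liquid's Hugging Face organization for the actual artifacts.

```python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="LiquidAI/LFM2-8B-A1B-GGUF",  # assumed GGUF repo id; verify before use
    filename="*Q4_K_M.gguf",              # glob for the Q4_K_M quantization
    n_ctx=32768,                          # matches the 32K context window
)

resp = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Give me one on-device LLM use case."}],
    max_tokens=64,
)
print(resp["choices"][0]["message"]["content"])
```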
Role in Series
LFM2 text model hierarchy:
- LFM2-24B-A2B: Largest, maximum capability
- LFM2-8B-A1B: Best on-device MoE for quality/speed (this model)
- LFM2-2.6B: Dense model with dynamic reasoning
- LFM2-1.2B / 700M / 350M: Compact edge models