Liquid: LFM2-24B-A2B
Model type: Open-weight model, 24B parameters
LFM2-24B-A2B is Liquid AI's largest foundation model, scaling the LFM2 hybrid architecture to 24 billion parameters while maintaining efficient inference with only 2.3B active parameters per token.
"Quality improves log-linearly from 350M to 24B total parameters, confirming the LFM2 hybrid architecture scales reliably across nearly two orders of magnitude." (Liquid AI)
Overview
Released February 24, 2026, LFM2-24B-A2B demonstrates that Liquid's hybrid architecture scales predictably to larger sizes. Despite having 24B total parameters, it activates only 2.3B per token, enabling deployment on consumer laptops and desktops with 32GB of RAM.
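For a first local test, the model should load through Hugging Face transformers like earlier LFM2 releases. A minimal sketch, assuming the checkpoint is published under the repo id LiquidAI/LFM2-24B-A2B (unverified; check the model page for the exact identifier) and that BF16 weights fit in your memory budget:

```python
# Minimal generation sketch. The repo id is an assumption based on Liquid's
# naming for earlier LFM2 checkpoints; BF16 weights need roughly 48GB.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LiquidAI/LFM2-24B-A2B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Explain mixture-of-experts in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```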
Key Features
- 24B total / 2.3B active parameters (MoE)
- 32K context window
- Fits in 32GB RAM for consumer deployment
- 112 tok/s decode on an AMD CPU
- 293 tok/s decode on an H100 GPU
- 17 trillion training tokens
- 9 languages: English, Arabic, Chinese, French, German, Japanese, Korean, Spanish, Portuguese
Recommended Use Cases
- Agentic tool use and function calling (see the sketch after this list)
- Offline document summarization and Q&A
- Privacy-preserving customer support
- Local RAG pipelines
- Multi-step agent pipelines (fast inner-loop model)
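To make the tool-use case concrete, here is a minimal sketch of one agent inner-loop step, assuming the model is served behind an OpenAI-compatible endpoint (as vLLM and llama.cpp's llama-server both provide); the URL, model id, and get_weather tool are illustrative placeholders, not part of Liquid's documentation:

```python
# Hypothetical single tool-call turn against a locally served LFM2-24B-A2B.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # placeholder tool for illustration
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="LiquidAI/LFM2-24B-A2B",  # assumed model id
    messages=[{"role": "user", "content": "What's the weather in Lisbon?"}],
    tools=tools,
)

# The model may answer directly instead of calling the tool; check first.
message = resp.choices[0].message
if message.tool_calls:
    call = message.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))
```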
When to Use LFM2-24B-A2B
Choose LFM2-24B-A2B when you need:
- Maximum LFM2 capability
- Local deployment on 32GB+ devices
- Best quality for complex reasoning
- Enterprise-grade on-premise deployment
Choose LFM2-8B-A1B when you need:
- Deployment on phones, tablets, or smaller laptops
- Faster inference with smaller memory footprint
- Quality comparable to 3-4B dense models
Choose LFM2-2.6B when you need:
- Smallest dense model with dynamic reasoning
- Maximum portability across devices
Hardware Requirements
| Quantization | RAM/VRAM Required |
|---|---|
| Q4_K_M | ~16GB |
| 8-bit | ~26GB |
| BF16 | ~48GB |
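These footprints follow from parameter count times bits per weight, plus runtime overhead for the KV cache and activations. A back-of-envelope check (the Q4_K_M bits-per-weight figure is approximate):

```python
# Rough weight-only memory for 24B parameters at common precisions.
# Runtime usage is higher once KV cache and activations are included.
PARAMS = 24e9

def weight_gb(bits_per_param: float) -> float:
    return PARAMS * bits_per_param / 8 / 1e9

print(f"BF16 (16 bits):     {weight_gb(16):.0f} GB")   # ~48 GB
print(f"8-bit (8 bits):     {weight_gb(8):.0f} GB")    # ~24 GB, ~26 GB in practice
print(f"Q4_K_M (~4.8 bits): {weight_gb(4.8):.0f} GB")  # ~14 GB, ~16 GB in practice
```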
Fits comfortably in 32GB RAM with quantization. Day-one support for llama.cpp, vLLM, and SGLang.
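For the server path, a minimal vLLM sketch, again assuming the unverified LiquidAI/LFM2-24B-A2B repo id and enough GPU memory for BF16 weights:

```python
# Offline batch inference with vLLM; quantized variants would lower the
# memory requirement below the ~48GB needed for BF16.
from vllm import LLM, SamplingParams

llm = LLM(model="LiquidAI/LFM2-24B-A2B")  # assumed repo id
params = SamplingParams(temperature=0.3, max_tokens=256)
outputs = llm.generate(["Summarize the LFM2 model series in one paragraph."], params)
print(outputs[0].outputs[0].text)
```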
Role in Series
LFM2 text model hierarchy:
- LFM2-24B-A2B: Largest, maximum capability (this model)
- LFM2-8B-A1B: Best on-device MoE for quality/speed
- LFM2-2.6B: Dense model with dynamic reasoning
- LFM2-1.2B / 700M / 350M: Compact edge models