Liquid: LFM2-8B-A1B
Model Type: Open weight model, 8B parameters
LFM2-8B-A1B is Liquid AI's efficient on-device MoE model with 8.3B total parameters and only 1.5B active, delivering quality comparable to 3-4B dense models while running faster than Qwen3-1.7B.
> "LFM2-8B-A1B is the best on-device MoE in terms of both quality (comparable to 3-4B dense models) and speed (faster than Qwen3-1.7B)." (Liquid AI)
Overview
Released October 7, 2025, LFM2-8B-A1B brings Mixture-of-Experts efficiency to edge devices. Quantized variants fit comfortably on high-end phones, tablets, and laptops while delivering significantly improved code and knowledge capabilities compared to dense LFM2 models.
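The "total vs. active" split is the core MoE idea: a learned router picks a small subset of expert networks for each token, so only a fraction of the weights run in any forward pass. Below is a minimal, generic top-k routing sketch in PyTorch; the expert count, hidden size, and k are invented for illustration and are not LFM2's actual configuration.

```python
import torch
import torch.nn.functional as F

# Generic top-k MoE routing sketch. num_experts, top_k, and d_model are
# placeholder values, NOT LFM2's real configuration.
num_experts, top_k, d_model = 16, 2, 256

router = torch.nn.Linear(d_model, num_experts)
experts = torch.nn.ModuleList(
    torch.nn.Linear(d_model, d_model) for _ in range(num_experts)
)

def moe_forward(x: torch.Tensor) -> torch.Tensor:
    """x: (num_tokens, d_model). Each token activates only top_k experts."""
    weights, idx = torch.topk(router(x), top_k, dim=-1)  # route per token
    weights = F.softmax(weights, dim=-1)                 # normalize over chosen experts
    out = torch.zeros_like(x)
    for t in range(x.shape[0]):
        for w, e in zip(weights[t], idx[t]):
            out[t] += w * experts[int(e)](x[t])          # only k of 16 experts run
    return out

y = moe_forward(torch.randn(4, d_model))
print(y.shape)  # torch.Size([4, 256])
```

All experts contribute to the parameter count on disk and in memory, but per-token compute scales with k, which is why an 8.3B-total model can decode like a ~1.5B one.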
Key Features
- 8.3B total / 1.5B active parameters (MoE)
- 32K context window
- Faster than Qwen3-1.7B with better quality
- Quality comparable to 3-4B dense models
- Runs on phones, tablets, laptops
- 12 trillion training tokens
- 8 languages: English, Arabic, Chinese, French, German, Japanese, Korean, Spanish
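For a quick start, the model can be loaded with standard Hugging Face transformers APIs. This is a minimal sketch: the `LiquidAI/LFM2-8B-A1B` repo id is inferred from the model name, and LFM2 models may require a recent transformers release, so check the official model card first.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LiquidAI/LFM2-8B-A1B"  # assumed Hugging Face repo id; confirm on the model card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

messages = [{"role": "user", "content": "Explain mixture-of-experts in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```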
Benchmark Performance
LFM2-8B-A1B posts strong instruction-following and math scores while running significantly faster than similarly sized models:
| Benchmark | LFM2-8B-A1B | Llama-3.2-3B | gemma-3-4b-it |
|---|---|---|---|
| MMLU | 64.84 | 60.35 | 58.35 |
| IFEval | 77.58 | 71.43 | 76.85 |
| GSM8K | 84.38 | 75.21 | 89.92 |
| MGSM | 72.4 | 61.68 | 87.28 |
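These scores could in principle be sanity-checked with EleutherAI's lm-evaluation-harness, sketched below. Liquid's exact prompting, few-shot counts, and harness version are not stated here, so locally reproduced numbers may differ from the table.

```python
import lm_eval

# Rough reproduction sketch using lm-evaluation-harness defaults; the repo id
# and task settings are assumptions and may not match Liquid's evaluation setup.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=LiquidAI/LFM2-8B-A1B,dtype=auto",
    tasks=["mmlu", "ifeval", "gsm8k"],
    batch_size=8,
)
print(results["results"])
```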
Recommended Use Cases
- Agentic tasks and function calling
- Data extraction
- RAG pipelines
- Creative writing
- Multi-turn conversations
Not recommended for: knowledge-intensive tasks or complex programming without fine-tuning (Liquid suggests fine-tuning for these). A tool-calling sketch for the agentic use case follows below.
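Since function calling heads the use-case list, here is a hypothetical tool-calling sketch reusing the tokenizer and model loaded earlier. It relies on the generic transformers chat-template `tools` argument; whether LFM2's template consumes tool schemas exactly this way is an assumption, and `get_weather` is a made-up stub.

```python
def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    return f"Sunny, 22 C in {city}"  # stub; a real tool would call an API

messages = [{"role": "user", "content": "What's the weather in Tokyo?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],  # transformers builds a JSON schema from the signature
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
# The model should emit a structured call to get_weather("Tokyo"), assuming
# its chat template supports the tools field.
```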
When to Use LFM2-8B-A1B
Choose LFM2-8B-A1B when you need:
- Best quality/speed balance for on-device MoE
- Deployment on high-end phones and tablets
- Improved code and knowledge vs smaller LFM2 models
- Fast inference with 1.5B active parameters
Choose LFM2-24B-A2B when you need:
- Maximum LFM2 capability
- Server or desktop deployment
Choose LFM2-2.6B when you need:
- Dense model simplicity
- Dynamic hybrid reasoning
- Broader device compatibility
Hardware Requirements
| Device | Quantization | Performance |
|---|---|---|
| Samsung Galaxy S24 Ultra | INT4 | ~45 tok/s decode |
| AMD HX370 CPU | INT4 | ~70 tok/s decode |
| High-end laptop | Q4_K_M | Comfortable fit |
Significantly faster than models with similar active parameters (e.g., Qwen3-1.7B).
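For the Q4_K_M laptop case, a quantized GGUF build can be run with llama-cpp-python along these lines. The GGUF repo id and filename pattern are assumptions; check Liquid's Hugging Face organization for the actual artifacts.

```python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="LiquidAI/LFM2-8B-A1B-GGUF",  # assumed GGUF repo id; verify before use
    filename="*Q4_K_M.gguf",              # glob for the Q4_K_M quantization
    n_ctx=32768,                          # matches the 32K context window
)

resp = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Give me one on-device LLM use case."}],
    max_tokens=64,
)
print(resp["choices"][0]["message"]["content"])
```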
Role in Series
LFM2 text model hierarchy:
- LFM2-24B-A2B: Largest, maximum capability
- LFM2-8B-A1B: Best on-device MoE for quality/speed (this model)
- LFM2-2.6B: Dense model with dynamic reasoning
- LFM2-1.2B / 700M / 350M: Compact edge models