Liquid: LFM2-8B-A1B

Model Type

Open Weight Model

8B parameters

Recommended Use Cases

Text Generation

LFM2-8B-A1B is Liquid AI's efficient on-device MoE model with 8.3B total parameters and only 1.5B active, delivering quality comparable to 3-4B dense models while running faster than Qwen3-1.7B.

"LFM2-8B-A1B is the best on-device MoE in terms of both quality (comparable to 3-4B dense models) and speed (faster than Qwen3-1.7B)." – Liquid AI

Overview

Released October 7, 2025, LFM2-8B-A1B brings Mixture-of-Experts efficiency to edge devices. Quantized variants fit comfortably on high-end phones, tablets, and laptops while delivering significantly improved code and knowledge capabilities compared to dense LFM2 models.
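
For a quick sense of how the model is used, the sketch below loads and prompts it with Hugging Face transformers. The repo id LiquidAI/LFM2-8B-A1B and the bfloat16/device-map settings are illustrative assumptions (the MoE architecture may also require a recent transformers release), so treat this as a sketch rather than official usage.

```python
# Minimal sketch: loading LFM2-8B-A1B with Hugging Face transformers.
# The repo id below is an assumption for illustration; verify it against the
# official release, and note that MoE support may need a recent transformers version.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LiquidAI/LFM2-8B-A1B"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # full-precision bf16; use quantized builds on phones/tablets
    device_map="auto",
)

# Only ~1.5B of the 8.3B parameters are active per token, so per-token decode cost
# is closer to a ~1.5B dense model than to an 8B one.
prompt = "Explain mixture-of-experts models in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because only the routed experts run for each token, the same script decodes noticeably faster than an 8B dense checkpoint would on the same hardware.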

Key Features

  • 8.3B total / 1.5B active parameters (MoE)
  • 32K context window
  • Faster than Qwen3-1.7B with better quality
  • Quality comparable to 3-4B dense models
  • Runs on phones, tablets, laptops
  • 12 trillion training tokens
  • 8 languages: English, Arabic, Chinese, French, German, Japanese, Korean, Spanish

Benchmark Performance

Strong performance in instruction following and math while running significantly faster than similar-sized models:

Benchmark | LFM2-8B-A1B | Llama-3.2-3B | gemma-3-4b-it
MMLU      | 64.84       | 60.35        | 58.35
IFEval    | 77.58       | 71.43        | 76.85
GSM8K     | 84.38       | 75.21        | 89.92
MGSM      | 72.4        | 61.68        | 87.28

Recommended Use Cases

  • Agentic tasks and function calling
  • Data extraction
  • RAG pipelines
  • Creative writing
  • Multi-turn conversations (see the chat sketch below)

Not recommended for: knowledge-intensive tasks or complex programming; fine-tune the model for these use cases.
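
To ground the multi-turn and data-extraction use cases above, here is a minimal chat sketch driven by the tokenizer's chat template. The repo id and the presence of a chat template are assumptions for illustration; verify both against the actual release.

```python
# Minimal multi-turn chat sketch using tokenizer.apply_chat_template.
# Assumptions: the checkpoint is published as LiquidAI/LFM2-8B-A1B and its
# tokenizer ships a chat template; check the actual release before relying on this.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LiquidAI/LFM2-8B-A1B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [
    {"role": "user", "content": "Extract the destination city from: 'Order #291 ships to Berlin on Friday.'"},
    {"role": "assistant", "content": "Berlin"},
    {"role": "user", "content": "And the order number?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=32)
# Strip the prompt tokens so only the new assistant turn is printed.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```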

When to Use LFM2-8B-A1B

Choose LFM2-8B-A1B when you need:

  • Best quality/speed balance for on-device MoE
  • Deployment on high-end phones and tablets
  • Improved code and knowledge vs smaller LFM2 models
  • Fast inference with 1.5B active parameters

Choose LFM2-24B-A2B when you need:

  • Maximum LFM2 capability
  • Server or desktop deployment

Choose LFM2-2.6B when you need:

  • Dense model simplicity
  • Dynamic hybrid reasoning
  • Broader device compatibility

Hardware Requirements

Device                   | Quantization | Performance
Samsung Galaxy S24 Ultra | INT4         | ~45 tok/s decode
AMD HX370 CPU            | INT4         | ~70 tok/s decode
High-end laptop          | Q4_K_M       | Comfortable fit

Decode speed is significantly faster than models with a similar active-parameter count (e.g., Qwen3-1.7B).
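
For the quantized rows above (e.g., Q4_K_M on a laptop), a minimal sketch with llama-cpp-python is shown below. The GGUF repo id and filename pattern are assumptions for illustration; substitute the artifacts from Liquid AI's actual GGUF release.

```python
# Minimal sketch: running a Q4_K_M GGUF build of LFM2-8B-A1B with llama-cpp-python.
# The repo id and filename pattern are assumptions; replace them with the GGUF
# files actually published for this model.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="LiquidAI/LFM2-8B-A1B-GGUF",  # assumed GGUF repo id
    filename="*Q4_K_M.gguf",              # assumed quant filename pattern
    n_ctx=32768,                          # the model's 32K context window
    n_gpu_layers=-1,                      # offload all layers when a GPU is available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "List three capital cities in Europe."}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])
```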

Role in Series

LFM2 text model hierarchy:

  1. LFM2-24B-A2B: Largest, maximum capability
  2. LFM2-8B-A1B: Best on-device MoE for quality/speed (this model)
  3. LFM2-2.6B: Dense model with dynamic reasoning
  4. LFM2-1.2B / 700M / 350M: Compact edge models
