Gemma 3n 4B

Gemma 3n is Google's mobile-first open model optimized for on-device AI, featuring multimodal capabilities (text, image, audio, video) while running on as little as 2GB of RAM.

Meet Gemma 3n, a model that runs on as little as 2GB of RAM. It shares the same architecture as Gemini Nano.

Google AI

Overview

Gemma 3n is an open model designed specifically for on-device deployment on phones, tablets, and laptops. Created in collaboration with Qualcomm, MediaTek, and Samsung, it shares its architecture with Gemini Nano and introduces efficiency techniques, such as Per-Layer Embeddings (PLE), that let capable multimodal models run directly on consumer devices.

Model Variants

Model	Effective Params	Raw Params	Memory (approx.)	Modalities
Gemma 3n E4B	4B	8B	~3GB	Text, Image, Audio, Video
Gemma 3n E2B	2B	5B	~2GB	Text, Image, Audio, Video
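The relationship between the table's columns can be checked with a little arithmetic. The sketch below is only an illustration: it takes the table's rounded figures at face value, assumes 1GB = 1e9 bytes, and attributes the whole memory footprint to weights, which real deployments will not match exactly. The `summarize` helper is a hypothetical name introduced here, not part of any Gemma tooling.

```python
# Figures copied from the Model Variants table; memory values are approximate.
VARIANTS = {
    "Gemma 3n E4B": {"effective_params": 4e9, "raw_params": 8e9, "memory_gb": 3.0},
    "Gemma 3n E2B": {"effective_params": 2e9, "raw_params": 5e9, "memory_gb": 2.0},
}


def summarize(variant):
    """Return (effective/raw ratio, implied bytes per effective parameter)."""
    ratio = variant["effective_params"] / variant["raw_params"]
    # Assumption: 1 GB = 1e9 bytes and the full footprint is model weights.
    bytes_per_param = variant["memory_gb"] * 1e9 / variant["effective_params"]
    return ratio, bytes_per_param


for name, v in VARIANTS.items():
    ratio, bpp = summarize(v)
    print(f"{name}: {ratio:.0%} of raw parameters counted as effective, "
          f"~{bpp:.2f} bytes per effective parameter")
```

An implied figure at or below one byte per effective parameter is consistent with weights stored at reduced precision, although this page does not state which precision the memory figures assume.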

Key Features

Mobile-first architecture: Engineered from the ground up for edge devices
True multimodal: Native support for text, image, audio, and video inputs
Per-Layer Embeddings (PLE): Caches a portion of the parameters in fast local storage instead of holding them in RAM, reducing the memory footprint
140-language support: Text understanding in 140 languages, multimodal understanding in 35 languages
Offline operation: Full functionality without cloud connectivity
Privacy-preserving: All processing happens locally on-device

Use Cases

Real-time voice assistants and transcription
On-device translation and language understanding
Visual scene understanding and image analysis
Privacy-sensitive applications
Offline-capable intelligent features

Role in Series

The Gemma family offers models optimized for different deployment scenarios:

Gemma 3 270M: Ultra-compact for task-specific fine-tuning
Gemma 3n E2B/E4B: Mobile-first with 2-3GB RAM, full multimodal including audio (this model)
Gemma 3 1B-27B: Cloud/desktop deployment with maximum capability
CodeGemma: Code-specialized variant for development tasks
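For the two Gemma 3n variants, the approximate memory figures quoted on this page suggest a simple selection rule: use the larger variant when it fits, otherwise fall back to the smaller one. The following is a minimal sketch of that rule. `pick_variant` and `MEMORY_GB` are hypothetical names, and the thresholds are the rounded figures from this page rather than measured requirements, so a real application would want extra headroom for the operating system and runtime.

```python
# Approximate memory needs in GB, as quoted on this page (not measured).
MEMORY_GB = {"Gemma 3n E4B": 3.0, "Gemma 3n E2B": 2.0}


def pick_variant(available_gb):
    """Return the largest listed variant that fits in available_gb, or None."""
    fitting = [(need, name) for name, need in MEMORY_GB.items()
               if need <= available_gb]
    # max() compares the memory need first, so the largest fitting variant wins.
    return max(fitting)[1] if fitting else None


print(pick_variant(4.0))
print(pick_variant(2.5))
print(pick_variant(1.0))
```

With these inputs the sketch selects E4B for a 4GB budget, E2B for 2.5GB, and nothing for 1GB, since neither variant is listed as fitting below about 2GB.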
