Gemma 3n: Open Source Multimodal AI

Łukasz Grochal

Google has released Gemma 3n, an open source multimodal large language model built to run efficiently on consumer hardware and distributed through Hugging Face. Available in two compact sizes, E2B and E4B, it runs in roughly 2–4 GB of GPU memory while delivering strong performance. Gemma 3n natively handles text, images, audio, and video through an integrated MobileNet vision encoder, a USM-based audio encoder, and a MatFormer (Matryoshka Transformer) backbone. Per-layer embeddings and KV-cache sharing keep memory use low, and the E4B variant scores above 1300 on LMArena, a first for a model under 10 billion parameters.
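The MatFormer idea can be illustrated with a toy sketch (plain Python, not Google's implementation, with made-up sizes): because training optimizes nested prefixes of each feed-forward layer jointly, a smaller sub-model such as E2B can be sliced directly out of the larger E4B weights without retraining.

```python
# Toy illustration of MatFormer-style nested sub-models. All names and
# dimensions here are hypothetical; real Gemma 3n layers are far larger.

def make_ffn(d_model, d_ff):
    """A dense feed-forward weight matrix W1 (d_ff x d_model) as nested
    lists, filled with arbitrary deterministic values."""
    return [[(i * d_model + j) % 7 / 7.0 for j in range(d_model)]
            for i in range(d_ff)]

def extract_submodel(w1, sub_d_ff):
    """MatFormer-style extraction: keep only the first sub_d_ff hidden
    rows. Because nested prefixes are trained jointly, the prefix is
    itself a usable, smaller feed-forward layer."""
    return w1[:sub_d_ff]

full = make_ffn(d_model=8, d_ff=32)   # stand-in for an E4B-sized layer
small = extract_submodel(full, 16)    # stand-in for an E2B-sized layer

assert len(small) == 16
assert small == full[:16]             # the small model is literally nested
```

The same slicing picture also explains why intermediate sizes between E2B and E4B are possible: any prefix width of the jointly trained layer yields a valid sub-model.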

Deeply integrated with Hugging Face’s open ecosystem, it supports straightforward fine-tuning, deployment, and community contributions. This release highlights a move toward high-quality, accessible, open source AI for developers and researchers on everyday devices.

References
1. Hugging Face (huggingface.co)
2. DeepMind (deepmind.google)