How Open Source AI Is Quietly Deflating the Hype Today

Łukasz Grochal

Recent analysis argues that open-source AI might be the needle that bursts today's AI bubble, because it undercuts the core assumption that only a handful of US tech giants can build and control state-of-the-art models. Studies based on OpenRouter usage and Linux Foundation research show that open models are roughly six times cheaper per token on average than closed ones, yet deliver similar or quickly converging performance for many mainstream tasks, with the gap often closing within weeks of a major proprietary release. If companies systematically picked the best model on price plus quality, global users could save tens of billions of dollars each year.
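To make the scale of that price gap concrete, here is a minimal back-of-the-envelope sketch. The per-million-token prices and the monthly token volume below are hypothetical placeholders chosen only to illustrate the roughly six-times ratio reported in the studies; they are not quotes from any vendor.

```python
# Illustrative cost comparison between a closed and an open model.
# All numbers are hypothetical assumptions, not real vendor prices.

def monthly_cost(tokens_per_month: float, price_per_million_tokens: float) -> float:
    """Dollar cost of a monthly token volume at a given per-million-token price."""
    return tokens_per_month / 1_000_000 * price_per_million_tokens

TOKENS = 5_000_000_000  # assumed mid-sized production workload: 5B tokens/month

closed = monthly_cost(TOKENS, 9.00)  # hypothetical closed-model blended price
open_ = monthly_cost(TOKENS, 1.50)   # hypothetical open-model price, ~6x cheaper

print(f"closed: ${closed:,.0f}/mo, open: ${open_:,.0f}/mo, "
      f"savings: ${closed - open_:,.0f}/mo ({closed / open_:.0f}x)")
# → closed: $45,000/mo, open: $7,500/mo, savings: $37,500/mo (6x)
```

Summed across every organization running inference at this scale, savings of this shape are how the "tens of billions per year" figure arises.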

At the same time, Chinese open-weight ecosystems have surged. Benchmarks and Stanford HAI reports find that leading Chinese model families such as Alibaba's Qwen and DeepSeek are now in a statistical dead heat with US models across multiple leaderboards, and Qwen has overtaken Meta's Llama as the most downloaded LLM family on Hugging Face. Many of these models ship as open weights, enabling fine-tuning, self-hosting and domain adaptation at very low marginal cost. This creates a hard strategic question for US vendors: how do you justify premium API pricing when a comparably strong open model appears days or weeks later, often for free or near free, and can be deployed behind a customer's firewall?

If big government AI programs continue to overpay for proprietary stacks while enterprises quietly pivot to open alternatives, the market could reprice AI platforms sharply in favor of leaner, open-centric players rather than today's most richly valued incumbents.

Chinese AI Models Dominate OpenRouter Top Six in Token Usage

Inside the Claude Code Leak and Anthropic’s Agent Design

NVIDIA’s AI Chip Share in China Drops from 95% to 55%

Google’s TurboQuant makes AI caches smaller and faster

FLUX.2 klein 9B-KV Explained: Speed, Quality, GPUs

KVTC: Nvidia’s 20x LLM Memory Cut Without Retraining

Sora’s Short Life: Inside OpenAI’s Quiet Retreat

Google Stitch (stitch.withgoogle.com): From simple prompt to working app UI

Yann LeCun’s AMI Lab Pioneers Physical-World AI

Claude, Palantir and Who Controls AI in Modern War