Claude Distillation Drama: Anthropic vs Chinese AI Labs

Łukasz Grochal

Anthropic dropped some serious claims last month, alleging that DeepSeek, Moonshot AI, and MiniMax ran huge operations to "distill" capabilities from Claude using about 24,000 fake accounts and over 16 million chats. Distillation basically means training your weaker model on outputs from a stronger one like Claude, boosting your own AI quickly and cheaply. DeepSeek alone hit Claude over 150,000 times, zeroing in on reasoning tricks, grading tasks, and even safe replies to touchy political questions about dissidents and authoritarianism. Anthropic ties this to broken terms of service and blocked access in China, plus national security worries, since distilled models might skip the safety checks baked into Claude against bioweapons or cyber mischief.
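What's alleged here is sequence-level distillation: collect a stronger model's text replies, then fine-tune on them with ordinary next-token training. The textbook variant (Hinton et al.'s knowledge distillation) instead matches the teacher's temperature-softened output distribution. A minimal NumPy sketch of that soft-label loss, purely illustrative — the function names and temperature choice are mine, not anything from Anthropic's report:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    z = logits / temperature
    z = z - z.max()  # shift for stability, doesn't change the result
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Higher temperature exposes the teacher's 'dark knowledge' — the
    relative probabilities it assigns to wrong answers.
    """
    p = softmax(teacher_logits, temperature)   # soft teacher targets
    q = softmax(student_logits, temperature)   # student predictions
    return float(np.sum(p * (np.log(p) - np.log(q))) * temperature**2)
```

In the API-scraping scenario described above there are no teacher logits at all, only text, so the training signal is plain cross-entropy on the collected completions rather than this distribution-matching loss.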

They paint it as industrial-scale theft that undercuts US export controls on AI chips, letting Chinese labs catch up without the full R&D grind. DeepSeek got about 150,000 exchanges, Moonshot over 3.4 million focused on coding and tools, and MiniMax the biggest at 13 million targeting agentic coding. But here's the twist: Anthropic's not spotless either. They've faced lawsuits from authors over scanning millions of books, including pirated copies from sites like Library Genesis, to train Claude. A settlement hit $1.5 billion, with writers claiming copyright grabs, and court documents show Anthropic bought books in bulk, scanned them, then trashed the originals. Elon Musk even fired back online, accusing Anthropic of data theft too, citing community notes about their past dealings.

It's like a game of "thief robbing thief" in AI land, where everyone's scraping web data or model outputs at scale. Anthropic might be pointing fingers partly to protect its edge as competition heats up and US-China AI tensions rise. DeepSeek pushes back, saying its models like R1 and the upcoming V4 are trained on public web data and ebooks only while matching frontier performance for less. Anthropic hasn't filed any lawsuits, just public shaming and calls for industry teamwork on defenses like better detection and chip curbs. OpenAI echoed similar gripes about DeepSeek in letters to lawmakers. Bottom line: distillation is legit when you do it to your own models, shady when rivals do it to yours, but the whole field is built on massive data hauls that blur the lines on what's fair game.
