DeepSeek R1: $294K training cost, ultra‑low inference rates

Łukasz Grochal

DeepSeek, a Chinese AI startup, has disclosed detailed cost and training parameters for its reasoning‑focused large language model R1, sparking interest across the global AI and finance sectors. According to a peer‑reviewed paper in Nature, R1’s final training phase cost approximately USD 294,000, using 512 Nvidia H800 GPUs over about 80 hours. DeepSeek also acknowledges using Nvidia A100 units during preparatory experiments.

R1 is optimized for reasoning tasks such as mathematics, logic, and coding. Its inference pricing is approximately USD 0.55 per million input tokens and USD 2.19 per million output tokens, many times cheaper than rival models such as OpenAI’s o1. Critics warn, however, that the published training cost excludes earlier research, ablation studies, data gathering, and infrastructure overhead.
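To make the per-token pricing concrete, here is a minimal sketch that estimates the cost of a single request at the approximate rates quoted above. The function name and the sample token counts are hypothetical and chosen only for illustration; actual billing may differ (for example, through caching discounts or rate changes).

```python
# Approximate per-million-token rates quoted in the article (USD).
INPUT_USD_PER_MILLION = 0.55
OUTPUT_USD_PER_MILLION = 2.19


def estimate_request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the quoted rates.

    Hypothetical helper for illustration; not part of any DeepSeek SDK.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # Assumed workload: a 2,000-token prompt and an 8,000-token response.
    cost = estimate_request_cost(2_000, 8_000)
    print(f"Estimated cost: ${cost:.5f}")
```

Because reasoning models tend to produce long outputs, the output rate usually dominates the bill: in this assumed example, the 8,000 generated tokens account for most of the roughly two-cent total.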

Regardless, R1’s documentation is seen as a milestone in transparency for large language model development, potentially setting a benchmark for cost efficiency and scientific disclosure in AI.

References
1. Reuters (reuters.com)
2. CNBC (cnbc.com)