DeepSeek, a Chinese AI startup, has disclosed detailed cost and training parameters for its reasoning‑focused large language model R1, sparking interest across the global AI and finance sectors. According to a peer‑reviewed paper in Nature, R1’s final training phase cost approximately USD 294,000, using 512 Nvidia H800 GPUs over about 80 hours. DeepSeek also acknowledges using Nvidia A100 units during preparatory experiments.
The model is optimized for tasks like mathematics, logic, and coding. Inference pricing for R1 is very competitive: approximately USD 0.55 per million input tokens and USD 2.19 per million output tokens, which is many times cheaper than rival models such as OpenAI’s o1. Critics, however, warn that the published cost excludes earlier research, ablation studies, data gathering, and infrastructure overhead.
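The figures above can be sanity-checked with simple arithmetic. The sketch below (the function and variable names are my own, for illustration; the dollar rates and hardware figures are those reported in the article, and real bills would include the excluded overheads critics mention) computes an inference cost estimate at the published per-million-token rates and the implied per-GPU-hour rate of the final training run:

```python
# Illustrative cost arithmetic based on the figures reported for DeepSeek R1.
# Rates and hardware counts are from the article; names are hypothetical.

R1_INPUT_USD_PER_M = 0.55   # USD per million input tokens (published rate)
R1_OUTPUT_USD_PER_M = 2.19  # USD per million output tokens (published rate)

def inference_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate R1 inference cost at the published per-million-token rates."""
    return (input_tokens / 1_000_000) * R1_INPUT_USD_PER_M \
         + (output_tokens / 1_000_000) * R1_OUTPUT_USD_PER_M

# Implied per-GPU-hour rate of the final training phase:
# USD 294,000 across 512 H800 GPUs running for ~80 hours.
gpu_hours = 512 * 80                # 40,960 GPU-hours
implied_rate = 294_000 / gpu_hours  # roughly USD 7.18 per GPU-hour

if __name__ == "__main__":
    print(f"1M input + 1M output tokens: ${inference_cost_usd(1_000_000, 1_000_000):.2f}")
    print(f"Implied training rate: ${implied_rate:.2f}/GPU-hour")
```

At these rates, a workload of one million input and one million output tokens costs about USD 2.74, which is the basis for the "many times cheaper" comparison with rivals such as OpenAI's o1.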
Even so, R1’s documentation is widely viewed as a milestone in transparency for large language model development, potentially setting a benchmark for cost efficiency and scientific disclosure in AI.
