Inside MiniMax M2.5: MoE Design, Speed and Cost

Łukasz Grochal

MiniMax M2.5 is a new frontier‑class large language model from MiniMax that targets high‑end coding, agent and office workloads while keeping costs relatively low and speed very high. Built on a roughly 229B‑parameter Mixture of Experts (MoE) architecture with a long context window of around 200k tokens, it focuses on efficient task decomposition and fast tool use rather than raw size alone. It scores about 80% on SWE‑Bench Verified and runs roughly a third faster than the previous M2.1 generation, which places it in the same performance band as top commercial models such as Claude Opus and recent GPT‑series systems for code and agentic workflows.

The “Lightning” serving variant generates about 100 tokens per second and costs around 0.3 USD per million input tokens, with relatively low output pricing as well, which makes it attractive for always‑on agents, search pipelines and large‑scale automation. Compared with competitors, M2.5 usually trades a bit of general reasoning and multimodal breadth for strong coding, efficient tool calling and good price‑to‑performance. It fits best where execution speed, cost and integration into complex workflows matter more than being the single most capable generalist model.
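To make those figures concrete, here is a rough back‑of‑the‑envelope sketch in Python. It uses only the approximate numbers quoted above (about 0.3 USD per million input tokens and about 100 tokens per second). The workload sizes and function names are hypothetical, and output pricing is left out because the article does not state it.

```python
# Back-of-the-envelope estimate based on the article's approximate figures.
# Output-token pricing is intentionally omitted: the article gives no number for it.
INPUT_PRICE_PER_MILLION_USD = 0.3  # "around 0.3 USD per million input tokens"
TOKENS_PER_SECOND = 100            # "about 100 tokens per second"


def input_cost_usd(input_tokens: int) -> float:
    """Approximate cost of the input side of a workload, in USD."""
    return input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION_USD


def generation_seconds(output_tokens: int) -> float:
    """Approximate time to generate output_tokens at the quoted throughput."""
    return output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    # Hypothetical workload: an always-on agent reading 50M input tokens per day.
    daily_input_tokens = 50_000_000
    print(f"Input cost per day: ${input_cost_usd(daily_input_tokens):.2f}")

    # Hypothetical response length of 2,000 tokens.
    print(f"Time for 2,000 output tokens: {generation_seconds(2_000):.0f} s")
```

Under these assumptions, 50 million input tokens per day would cost roughly 15 USD on the input side, and a 2,000‑token response would take about 20 seconds to generate. Real bills would also include output tokens, so treat this as a lower bound.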
