GPT‑5 by OpenAI: 2M Tokens, Full Multimodality

15/07/25Łukasz Grochal

GPT‑5 is expected to debut around summer 2025, offering a groundbreaking 2 million‑token context window, a vast jump from GPT‑4o’s 128k. Early architectural leaks suggest extensive Mixture-of-Experts routing and advanced retrieval-augmented modules to maintain speed and token efficiency. Benchmarks are projected to reach ~95% on MMLU and ~82% on SWE-bench, indicating near-expert reasoning in general knowledge and coding.

Training is rumored to use multi-cluster setups spanning tens of thousands of A100/H100-class GPUs, pushing costs beyond $250M. GPT‑5 also integrates long-range session memory, enabling persistent recall across multiple days or weeks. Unlike GPT‑4o, it aims for seamless multimodal fusion, natively handling text, code, images, audio, and video. Agentic layers are expected to enhance tool invocation and multi-step planning. This positions GPT‑5 not as AGI, but a major leap in scale, contextual reasoning, and multimodal synthesis, optimized for robust production pipelines.

References

2 sources

medium.comMedium

↗

felloai.comFello AI

↗

25/04/26