OpenAI launches GPT-5

On August 7, 2025, OpenAI released GPT-5, its most advanced language model to date. The rollout marks a significant milestone in generative AI development. With GPT-5, OpenAI focuses on smarter reasoning, broader access, improved reliability, and versatile enterprise deployment.

GPT-5 introduces a new paradigm in model architecture and interaction, bringing major enhancements over GPT-4. This article outlines what GPT-5 is, how it works, who it’s for, and what it might mean for the future of artificial intelligence.

Key technical improvements

GPT-5 introduces a multi-model system with a real-time router. Rather than a single monolithic model, GPT-5 uses a smart router to determine whether a lightweight or a more capable “thinking” model should respond. This routing approach helps optimize speed and accuracy depending on query complexity.

This system improves user experience. Simple requests get fast replies, while more difficult ones trigger deeper computation. For the end user, it feels seamless. GPT-5 delivers expert-level responses when needed and quick answers when it makes sense to conserve time and resources.
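
To make the routing idea concrete, here is a minimal sketch of how a router might choose between a fast model and a "thinking" model. OpenAI has not published how its router decides, so everything here is an assumption for illustration: the `route` function, the cue list, the word-count threshold, and the model labels are hypothetical, and a production router would likely use a learned classifier rather than keyword rules.

```python
from dataclasses import dataclass

# Hypothetical tier labels; these are not OpenAI's internal model names.
FAST_MODEL = "fast"
THINKING_MODEL = "thinking"

# Illustrative cues that a request may need multi-step reasoning (an assumption).
REASONING_CUES = ("prove", "debug", "step by step", "derive", "optimize")


@dataclass
class Route:
    model: str
    reason: str


def route(query: str, max_fast_words: int = 40) -> Route:
    """Pick a model tier from crude surface features of the query."""
    text = query.lower()
    for cue in REASONING_CUES:
        if cue in text:
            return Route(THINKING_MODEL, f"cue: {cue}")
    if len(text.split()) > max_fast_words:
        return Route(THINKING_MODEL, "long query")
    return Route(FAST_MODEL, "short, no reasoning cues")


print(route("What is the capital of France?").model)
print(route("Debug this race condition in my Go worker pool").model)
```

The first query goes to the fast tier and the second to the thinking tier, which mirrors the trade-off described above: spend extra computation only when the request appears to need it.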

One standout feature is the “thinking mode.” It allows GPT-5 to apply advanced reasoning and structured logic. The model can break complex tasks into smaller steps and work through them transparently. This feature significantly enhances GPT-5’s performance in problem-solving scenarios, including scientific reasoning, complex math, and software development.

GPT-5 also improves factual accuracy. It produces more than 40 percent fewer hallucinations than GPT-4, and with the full reasoning mode enabled the reduction reaches 80 percent. The model more reliably says “I don’t know” when appropriate, which increases user trust and limits misinformation.

Expanded capabilities and context

OpenAI has equipped GPT-5 with a 256,000-token context window. By comparison, GPT-4o offers a 128,000-token limit, while o3 supports up to 200,000 tokens. This expanded capacity allows GPT-5 to handle even larger inputs (such as extensive reports, entire books, or massive codebases) without losing coherence or needing truncation.
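
As a rough illustration of what these limits mean in practice, the sketch below checks which of the context windows quoted above could hold a given document. The four-characters-per-token ratio is only a common rule of thumb for English text, and the output reserve is an arbitrary assumption; a real tokenizer would give exact counts.

```python
# Context window sizes as quoted in this article, in tokens.
CONTEXT_WINDOWS = {
    "GPT-5": 256_000,
    "o3": 200_000,
    "GPT-4o": 128_000,
}

CHARS_PER_TOKEN = 4  # rough heuristic for English text (an assumption)


def estimate_tokens(text: str) -> int:
    """Estimate the token count from character length, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def models_that_fit(text: str, reserve_for_output: int = 4_000) -> list[str]:
    """Return the models whose window holds the text plus room for a reply."""
    needed = estimate_tokens(text) + reserve_for_output
    return [name for name, window in CONTEXT_WINDOWS.items() if needed <= window]


document = "x" * 600_000  # stand-in for a report of about 600,000 characters
print(estimate_tokens(document))   # 150000
print(models_that_fit(document))   # ['GPT-5', 'o3']
```

Under these assumptions, a 600,000-character document needs about 150,000 tokens, which exceeds the 128,000-token window but fits in the two larger ones.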

Multimodal input remains a core feature. GPT-5 can interpret both text and images. It offers improved visual reasoning, understanding charts, diagrams, and photos with greater fidelity than previous versions. Testers note its ability to flag inconsistencies between a query and a mismatched image—something GPT-4 struggled with.

The model also demonstrates improved domain expertise. GPT-5 excels at coding, writing, and health-related queries. These areas received focused improvements. It can now write, edit, and debug code across multiple languages with higher accuracy and efficiency. For writing tasks, GPT-5 shows stronger organization, richer metaphors, and an improved sense of tone and audience.

In health contexts, GPT-5 demonstrates improved caution and source citation. The model rarely hallucinates in this domain. It provides nuanced explanations while suggesting follow-up questions and pointing users toward trusted medical sources.

Benchmarks and evaluation

GPT-5 was benchmarked by OpenAI and independent organizations across a diverse set of tasks. It delivered strong results in most categories:

AIME 2025 (math competition): 94.6% accuracy without tool use.
SWE-Bench (software engineering tasks): 74.9%, outperforming Gemini 2.5 Pro (63.8%) and slightly ahead of Claude Opus 4.1 (74.5%).
Aider Polyglot (multilingual code editing): 88%, compared to 83.1% for Gemini 2.5 Pro and 72.0% for Claude Opus 4 (Claude Opus 4.1 data not yet available).
GPQA (graduate-level science questions): 89.4%, ahead of Gemini 2.5 Pro (84.0%) and Claude Opus 4.1 (80.9%).
HealthBench Hard (clinical accuracy): 25.5%, demonstrating caution and consistency on high-stakes health prompts.
TauBench (tool use): mixed results on this benchmark, which evaluates a model’s ability to complete simulated online tasks. GPT-5 scored 63.5% on airline website navigation, slightly below o3 (64.8%), and 81.1% on retail website navigation, just below Claude Opus 4.1 (82.4%).
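
To see where GPT-5 leads and where it trails, the short sketch below computes its margin over the best competing score on each benchmark that lists a comparison. It uses only the figures quoted in the list above; the `margins` helper and the data layout are illustrative.

```python
# (GPT-5 score, competitor scores) in percent, as quoted in this article.
SCORES = {
    "SWE-Bench": (74.9, {"Gemini 2.5 Pro": 63.8, "Claude Opus 4.1": 74.5}),
    "Aider Polyglot": (88.0, {"Gemini 2.5 Pro": 83.1, "Claude Opus 4": 72.0}),
    "GPQA": (89.4, {"Gemini 2.5 Pro": 84.0, "Claude Opus 4.1": 80.9}),
    "TauBench airline": (63.5, {"o3": 64.8}),
    "TauBench retail": (81.1, {"Claude Opus 4.1": 82.4}),
}


def margins(scores: dict) -> dict:
    """Return GPT-5's lead in percentage points over the best listed rival."""
    result = {}
    for benchmark, (gpt5, rivals) in scores.items():
        best = max(rivals.values())
        result[benchmark] = round(gpt5 - best, 1)
    return result


for benchmark, margin in margins(SCORES).items():
    print(f"{benchmark}: {margin:+.1f} points")
```

The output shows leads of 0.4, 4.9, and 5.4 points on SWE-Bench, Aider Polyglot, and GPQA, and deficits of 1.3 points on both TauBench tasks.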

For business users, these results suggest a model that does well not just in academic tasks but also in real-world problem solving and enterprise-grade applications. Strong scores in software engineering and multilingual code editing, together with competitive tool-use results, point to GPT-5’s readiness for roles in automation, IT support, research, and customer-facing AI agents. While no model is flawless, GPT-5’s consistency and domain specialization make it a credible partner for operational efficiency and decision support in many professional settings.

Deployment and access

GPT-5 is available to all ChatGPT users, including those on the free tier. However, usage is capped at lower levels for free users. Paid tiers unlock more generous access and additional features.

The ChatGPT Plus plan, at $20 per month, provides extended GPT-5 usage and priority access. A new ChatGPT Pro plan, priced at $200 per month, offers unlimited usage and access to GPT-5 Pro, a variant optimized for high reasoning and depth. The Pro plan also unlocks advanced voice interactions and extended tool integrations.

For teams and enterprises, OpenAI offers tailored options. The ChatGPT Team plan includes a shared workspace, admin controls, and full access to GPT-5. Enterprise clients can customize access with enhanced security, data privacy, and context windows. They also gain access to tools like record mode, analytics, and business connectors.

On the API side, OpenAI introduced three GPT-5 models:

GPT-5 (full): $1.25 per 1M input tokens and $10 per 1M output tokens
GPT-5 Mini: $0.25 input / $2.00 output per 1M tokens
GPT-5 Nano: $0.05 input / $0.40 output per 1M tokens
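
To compare the three tiers on a concrete workload, here is a small cost calculator built from the prices listed above. The `cost_usd` helper and the example token counts are illustrative assumptions; actual bills depend on OpenAI's current pricing and on any discounts that apply.

```python
# USD per 1M tokens as (input, output), taken from the list above.
PRICES = {
    "GPT-5": (1.25, 10.00),
    "GPT-5 Mini": (0.25, 2.00),
    "GPT-5 Nano": (0.05, 0.40),
}


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost of one request in US dollars."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: a 200,000-token document with a 5,000-token summary.
for model in PRICES:
    print(f"{model}: ${cost_usd(model, 200_000, 5_000):.3f}")
```

For this workload the calculation gives $0.300 for GPT-5, $0.060 for GPT-5 Mini, and $0.012 for GPT-5 Nano, so the choice of tier changes the cost by a factor of 25.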

All variants support a 256k-token context window and multimodal input. The API provides developers with control over verbosity and reasoning depth, enabling flexible integration into existing systems.
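
The sketch below shows what setting those two controls might look like when assembling a request. It only builds and prints a payload and makes no network call. The field names (`reasoning.effort`, `text.verbosity`), the allowed values, and the model identifier are assumptions based on OpenAI's Responses API as generally documented, so verify them against the current API reference before relying on them. The `build_request` helper is hypothetical.

```python
import json

# Assumed allowed values; check the current API reference before use.
EFFORT_LEVELS = ("minimal", "low", "medium", "high")
VERBOSITY_LEVELS = ("low", "medium", "high")


def build_request(prompt: str, effort: str = "medium",
                  verbosity: str = "medium", model: str = "gpt-5") -> dict:
    """Assemble a request body with reasoning depth and verbosity settings."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"unknown reasoning effort: {effort!r}")
    if verbosity not in VERBOSITY_LEVELS:
        raise ValueError(f"unknown verbosity: {verbosity!r}")
    return {
        "model": model,                    # assumed model identifier
        "input": prompt,
        "reasoning": {"effort": effort},   # assumed field name
        "text": {"verbosity": verbosity},  # assumed field name
    }


payload = build_request("Summarize this changelog.", effort="minimal", verbosity="low")
print(json.dumps(payload, indent=2))
```

A quick lookup might use minimal effort and low verbosity, while a code review might use high effort. Validating the values locally catches typos before a request is sent.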

Safety and alignment

Safety remains a top priority in GPT-5’s design. OpenAI reduced hallucinations and deceptive outputs significantly. In trials, GPT-5 hallucinated just 1.6 percent of the time on hard medical prompts, compared to nearly 13 percent for GPT-4o.
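
These two rates can be turned into a relative reduction, which is the form used for the hallucination figures earlier in this article. The sketch below treats "nearly 13 percent" as exactly 13 percent, which is an approximation for illustration only.

```python
def relative_reduction(old_rate: float, new_rate: float) -> float:
    """Return the fractional drop from old_rate to new_rate."""
    return (old_rate - new_rate) / old_rate


gpt4o_rate = 0.13   # "nearly 13 percent", rounded for illustration
gpt5_rate = 0.016   # 1.6 percent

print(f"{relative_reduction(gpt4o_rate, gpt5_rate):.1%}")  # 87.7%
```

Under that approximation, GPT-5 hallucinates about 88 percent less often than GPT-4o on these hard medical prompts.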

The model also demonstrates improved behavior around refusals and sensitive topics. Rather than bluntly refusing, GPT-5 aims for “safe completions,” offering helpful answers within guardrails. This behavior reduces user frustration while preserving safety.

OpenAI conducted over 5,000 hours of red-team testing with internal and external experts. They applied rigorous safety layers, including content filters, reasoning monitors, and output classifiers. These measures mitigate risks in high-stakes domains such as biosecurity and misinformation.

Additionally, GPT-5 is trained to “fail gracefully.” When it cannot answer a query, it often says so clearly. This honesty improves user experience and reduces false confidence in model outputs.

My thoughts on GPT-5

[Image: GPT-5 running on my laptop in Firefox.]

I use ChatGPT regularly, primarily with GPT-4o and o3, so I have a solid baseline for comparison.

The GPT-5 thinking model takes noticeably longer to generate its responses, which in my view is a strength. While o3 felt faster for some tasks, I value the way GPT-5 methodically works through each step, ensuring thoroughness. When I asked it to crunch advanced NBA statistics, it performed well and delivered clear, well-structured results.

The prose quality is a notable improvement. The o3 model occasionally leaned into unnecessary technical jargon, using certain words in ways that felt forced. GPT-5’s thinking model, by contrast, strikes a more natural balance between precision and readability.

It also adapts well to different tones and styles when prompted. In tests with hypothetical “risky” questions, it responded responsibly: offering sensible advice about a cut on my arm, clearly stating that siphoning gasoline is illegal, and explaining that building a Bluetooth jammer is also unlawful.

Looking ahead

GPT-5 reflects OpenAI’s commitment to making powerful AI accessible and useful. By balancing capability with safety, it offers a model ready for widespread deployment.

It does not reach AGI. OpenAI executives clarified that GPT-5 still lacks continuous learning and general autonomy. But it moves the needle. With greater reasoning, better tool use, and broad availability, GPT-5 marks a turning point.

As businesses and individuals integrate it into daily workflows, GPT-5’s true impact will become clearer. Whether used for code generation, scientific research, or everyday assistance, the model represents a new standard for general-purpose AI.

May Horiuchi

Content Specialist at Visla

May is a Content Specialist and AI Expert for Visla. She is an in-house expert on anything Visla and loves testing out different AI tools to figure out which ones are actually helpful and useful for content creators, businesses, and organizations.