OpenAI Launches GPT-5.4, Challenging Assumptions About Reasoning, Coding and Computer Use

How much better can AI get at everyday work tasks? Quite a bit, according to OpenAI, whose newest model launches March 5, 2026. The company bills GPT-5.4 as its most capable and efficient frontier model yet, designed for professional work that demands careful reasoning and precision.


The new model comes in three flavors: standard, Thinking for complex reasoning, and Pro for maximum performance. ChatGPT Plus, Team, and Pro subscribers get early access, along with Enterprise and Education customers. Developers can tap into the API using gpt-5.4 and gpt-5.4-pro.

The headline feature is computer use: GPT-5.4 can navigate desktops, browse websites, and operate software autonomously, much as a person would. Rather than only suggesting what to do, it carries out the task itself, handling spreadsheets, presentations, and documents without step-by-step supervision.

The benchmark results support the claim. GPT-5.4 scored 83% on GDPval, a test measuring knowledge work capabilities. On browser tasks, it achieved a 67.3% success rate, compared with 65.4% for GPT-5.2.

Perhaps most striking, it jumped from 70.9% to 92.8% on screenshot-based web navigation. When tested on 30,000 HOA and property tax portals, it succeeded 95% of the time on first attempts and 100% within three tries.
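To put the portal figures in absolute terms, the arithmetic can be sketched as follows. This is a minimal illustration, assuming the reported percentages apply to the full set of 30,000 portals and are exact rather than rounded, which the announcement does not confirm; the function name is hypothetical.

```python
def portal_breakdown(total: int, first_try_pct: int, within_three_pct: int) -> dict:
    """Convert percentage success rates into absolute portal counts.

    Assumes the percentages are exact whole numbers over `total` portals.
    """
    first_try = total * first_try_pct // 100
    within_three = total * within_three_pct // 100
    return {
        "first_try": first_try,                    # succeeded on attempt one
        "needed_retry": within_three - first_try,  # succeeded on attempt two or three
        "unresolved": total - within_three,        # still failing after three tries
    }


result = portal_breakdown(30_000, 95, 100)
print(result)
```

Under those assumptions, roughly 28,500 portals would succeed on the first attempt and the remaining 1,500 would need a second or third try.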

Efficiency improved as well. The model solves problems with markedly fewer tokens than its predecessor; in computer use tests, it completed sessions three times faster while using 70% fewer tokens. The API now supports context windows of up to one million tokens, enough to work across very large inputs.
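As a rough illustration of what those efficiency figures could mean for a single session, the sketch below applies them to a made-up baseline. The baseline of 100,000 tokens and 600 seconds is purely hypothetical, as is the function name, and the calculation assumes the reported reductions hold uniformly, which real workloads may not.

```python
def projected_usage(baseline_tokens: int, baseline_seconds: float,
                    token_reduction: float = 0.70, speedup: float = 3.0):
    """Apply the reported 70% token reduction and 3x speedup to a baseline.

    The baseline values are supplied by the caller; none come from OpenAI.
    """
    tokens = round(baseline_tokens * (1 - token_reduction))
    seconds = baseline_seconds / speedup
    return tokens, seconds


# Hypothetical earlier session: 100,000 tokens over 10 minutes.
tokens, seconds = projected_usage(100_000, 600.0)
print(f"{tokens} tokens in {seconds:.0f} seconds")
```

Under these assumptions, a session that previously consumed 100,000 tokens over ten minutes would take about 30,000 tokens and a little over three minutes.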

OpenAI focused heavily on reducing errors. Individual claims contain 33% fewer mistakes compared to GPT-5.2, while complete responses are 18% less likely to have errors. The company describes this as its most factual model yet, with built-in safety evaluations for multi-step tasks.

The rollout happens gradually across ChatGPT and related services, with the same high cyber-risk classification as previous models. OpenAI promises expanded safety monitoring, access controls, and request blocking to manage potential misuse.
