OpenAI’s o3-pro beats Gemini and Claude in science and math tests

By Swaleha | Published on June 11, 2025

OpenAI has launched its latest AI model, o3-pro, for ChatGPT Pro users. It scores higher than Gemini 2.5 Pro and Claude 4 Opus in key tests, especially in science, coding, and writing. The model comes with tools but has limitations, including slower responses and no image generation.

New Delhi: OpenAI has launched a new version of its reasoning-focused AI model called o3-pro. The rollout began on June 10, and it’s now available to ChatGPT Pro and Team users, replacing the earlier o1-pro. Enterprise and education users will get access next week, according to the company.

o3-pro builds on the o3 model released earlier this year. It’s designed to be slower but more reliable, particularly in subjects like coding, maths, and science. OpenAI says this model performs better in long-form thinking tasks, making it more useful in areas that need precision over speed.

Designed for challenging questions

The biggest draw of o3-pro is its ability to tackle complex problems more effectively than previous models. OpenAI recommends using it “for challenging questions where reliability matters more than speed.” The model can do everything o3 can, including searching the web, reading uploaded files, understanding images, running Python code, and personalising answers using memory.

The trade-off is speed. o3-pro takes longer to respond than o1-pro, but it gives better answers. In internal reviews, OpenAI testers preferred o3-pro across every major area, including science, programming, business queries, and even writing help.

Benchmarks put it ahead of rivals

The company also tested o3-pro using what it calls the “4/4 reliability” method, where the AI must get the correct answer all four times when asked the same question. According to OpenAI, o3-pro consistently passed this test.
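To make the “4/4 reliability” idea concrete, here is a minimal Python sketch of how such a check could work. It is not OpenAI’s actual evaluation code: the function names, the exact-match comparison, and the stub “model” are all assumptions made for illustration.

```python
from itertools import cycle
from typing import Callable, Iterable, Tuple


def passes_all_attempts(ask: Callable[[str], str], question: str,
                        expected: str, attempts: int = 4) -> bool:
    """Ask the same question `attempts` times; pass only if every answer is correct."""
    answers = [ask(question).strip() for _ in range(attempts)]
    # Exact string match is an assumption; a real grader may be more lenient.
    return all(answer == expected for answer in answers)


def reliability_rate(ask: Callable[[str], str],
                     items: Iterable[Tuple[str, str]],
                     attempts: int = 4) -> float:
    """Share of (question, expected) pairs answered correctly on every attempt."""
    items = list(items)
    passed = sum(passes_all_attempts(ask, q, e, attempts) for q, e in items)
    return passed / len(items)


def make_stub(answers):
    """Hypothetical stand-in for a model: replays a fixed cycle of answers."""
    replay = cycle(answers)
    return lambda question: next(replay)


if __name__ == "__main__":
    steady = make_stub(["42"])                   # always right
    flaky = make_stub(["42", "42", "41", "42"])  # wrong once in four tries
    print(passes_all_attempts(steady, "What is 6 x 7?", "42"))
    print(passes_all_attempts(flaky, "What is 6 x 7?", "42"))
```

Under this rule, a model that is right three times out of four still fails the question, which is why the method rewards consistency rather than a lucky single answer.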

OpenAI claims that o3-pro is now outperforming its main rivals. On AIME 2024, a maths benchmark, o3-pro reportedly scored higher than Google’s Gemini 2.5 Pro. It also beat Anthropic’s Claude 4 Opus on the GPQA Diamond test, which checks for PhD-level science understanding.

What comes next?

For users on ChatGPT Pro, o3-pro is now available directly from the model picker. If you’re into long-form research, coding support, or even content writing with better accuracy, this model might be worth the wait. Just be prepared for slightly longer response times.

OpenAI has said that the same safety measures used in o3 apply to o3-pro.

Pricing and limitations

In the API, o3-pro is priced at $20 per million input tokens and $80 per million output tokens. For reference, one million input tokens is about 7.5 lakh (750,000) words. In Indian currency, that’s roughly ₹1,740 per million input tokens and ₹6,960 per million output tokens.
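For readers who want to estimate a bill, the arithmetic can be sketched in a few lines of Python. The per-token prices are the ones quoted above; the exchange rate of ₹87 to the dollar is an assumption inferred from the rupee figures in this article (1,740 ÷ 20), and the function names are illustrative only.

```python
INPUT_USD_PER_MILLION = 20.0   # quoted price per million input tokens
OUTPUT_USD_PER_MILLION = 80.0  # quoted price per million output tokens
USD_TO_INR = 87.0              # assumed rate, implied by the article's rupee figures


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in US dollars for one request at the quoted prices."""
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


def usd_to_inr(usd: float, rate: float = USD_TO_INR) -> float:
    """Convert dollars to rupees at the assumed exchange rate."""
    return usd * rate


if __name__ == "__main__":
    cost = request_cost_usd(input_tokens=10_000, output_tokens=2_000)
    print(f"${cost:.2f} (about ₹{usd_to_inr(cost):.2f})")
```

At these prices, a 10,000-token prompt with a 2,000-token reply would cost about $0.36, or roughly ₹31 at the assumed rate. Actual charges depend on the live exchange rate and on how many tokens the model spends on a response.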

That said, the model isn’t perfect. OpenAI says temporary chats are currently disabled for o3-pro because of a technical glitch. It can’t generate images either, and users looking to use Canvas, OpenAI’s workspace tool, will have to switch to another model.