OpenAI launches Flex processing for cheaper AI tasks

In an effort to compete more aggressively with rival AI companies such as Google, OpenAI is launching Flex processing, an API option that provides lower prices for using AI models in exchange for slower response times and “occasional resource unavailability.”

Flex processing, available in beta for OpenAI’s recently released o3 and o4-mini reasoning models, is designed for lower-priority and “non-production” tasks such as model evaluation, data enrichment, and asynchronous workloads, OpenAI says.
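As a rough illustration, opting a request into Flex happens at the API-call level rather than per account. The sketch below assumes the OpenAI Python SDK and its documented `service_tier` parameter; the model name, prompt, and timeout value are placeholders, not details from the announcement.

```python
# Hypothetical sketch: building a lower-priority Flex-tier request.
# Assumes the OpenAI Python SDK's documented `service_tier` parameter.

def flex_request_kwargs(model: str, prompt: str) -> dict:
    """Build keyword arguments for a Flex-tier chat completion request."""
    return {
        "model": model,
        "service_tier": "flex",  # opt this request into Flex processing
        "messages": [{"role": "user", "content": prompt}],
        "timeout": 900.0,  # Flex responses can be slow; allow a long timeout
    }

# With the SDK installed, the request would be sent roughly like this:
#   from openai import OpenAI
#   client = OpenAI()  # reads OPENAI_API_KEY from the environment
#   resp = client.chat.completions.create(**flex_request_kwargs("o3", "..."))
#   print(resp.choices[0].message.content)

kwargs = flex_request_kwargs("o3", "Summarize these evaluation logs.")
print(kwargs["service_tier"])
```

Because Flex trades speed for price, batch-style jobs such as evaluations or data enrichment are the natural fit; latency-sensitive production traffic would stay on the standard tier.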

Flex cuts API costs exactly in half. For o3, Flex processing costs $5 per million input tokens (roughly 750,000 words) and $20 per million output tokens, compared with the standard $10 per million input tokens and $40 per million output tokens. For o4-mini, Flex drops the price to $0.55 per million input tokens and $2.20 per million output tokens, from $1.10 and $4.40, respectively.
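The "exactly half" claim is easy to verify from the per-token rates above. The snippet below is a worked check using the article's prices (USD per million tokens); the example token counts are arbitrary.

```python
# Worked check of the price cut: Flex halves both input and output rates.
# Rates are (input, output) in USD per million tokens, from the article.
STANDARD = {"o3": (10.00, 40.00), "o4-mini": (1.10, 4.40)}
FLEX = {"o3": (5.00, 20.00), "o4-mini": (0.55, 2.20)}

def cost(rates: dict, model: str, in_tokens: int, out_tokens: int) -> float:
    """Cost in USD for a request with the given input/output token counts."""
    rate_in, rate_out = rates[model]
    return (in_tokens * rate_in + out_tokens * rate_out) / 1_000_000

# Example batch job: 2M input tokens, 0.5M output tokens on o3.
std = cost(STANDARD, "o3", 2_000_000, 500_000)  # 2*10 + 0.5*40 = 40.0
flx = cost(FLEX, "o3", 2_000_000, 500_000)      # 2*5  + 0.5*20 = 20.0
print(std, flx)  # 40.0 20.0 -> Flex is exactly half
```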

The launch of Flex processing comes as the price of frontier AI continues to climb and competitors release cheaper, more efficient models aimed at budget-conscious developers. On Thursday, Google released Gemini 2.5 Flash, a reasoning model that matches or exceeds DeepSeek's R1 in performance at a lower cost per input token.

In an email to customers announcing the launch of Flex processing, OpenAI also noted that developers in tiers 1-3 of its usage tier hierarchy will need to complete a newly introduced identity verification process to gain access to o3. (Tiers are determined by the amount of money a developer has spent on OpenAI services.) Verification is also required for o3's reasoning summaries and for streaming API support.

OpenAI has previously said that ID verification is intended to stop bad actors from violating its usage policies.
