How much does the OpenAI API cost per token?

OpenAI cost per token ranges from $0.0000002 (GPT-5.4 Nano input) to $0.00018 (GPT-5.4 Pro output). To convert from OpenAI’s published per-million-token rates, divide by 1,000,000. Most production workloads fall between $0.0000002 and $0.000015 per token depending on the model. You can use the formula in our calculator section above to estimate your own OpenAI API cost per token for any model.

Is the OpenAI API free?

OpenAI offers an OpenAI API pricing free tier with limited rate limits and credits for new accounts. Beyond the free credits, all API usage is billed per token. There is no free unlimited tier. The OpenAI API key cost is $0 to generate, but every API call after your free credits are used incurs token charges. ChatGPT Free provides limited access to GPT-5.4 Mini, but this is the consumer product, not the API.

How much does it cost to run a chatbot on the OpenAI API?

A chatbot handling 10,000 conversations per day on GPT-5.4 Mini costs roughly $518 per month. The exact OpenAI API cost depends on average conversation length, model choice, and caching efficiency. Switching from Standard to Mini reduces costs by about 70% for most support use cases.

What is the cheapest OpenAI model?

GPT-5.4 Nano is the cheapest production model at $0.20 per million input tokens and $1.25 per million output tokens. When comparing GPT-5 pricing across the full lineup, Nano costs 12x less than Standard on input. GPT-OSS-20b, an open-source model, is available at $0.03 per million tokens but has limited capabilities.

How does OpenAI API pricing compare to Anthropic Claude?

GPT-5.4 Mini ($0.75 input / $4.50 output) is roughly 25% cheaper on input than Claude Haiku 4.5 ($1.00 input / $5.00 output) and 10% cheaper on output. GPT-5.4 Standard ($2.50 / $15.00) is slightly cheaper on input than Claude Sonnet 4.6 ($3.00 / $15.00) and matches on output. At the flagship tier, Claude Opus 4.6 costs $5.00 / $25.00 per million tokens. Comparing the full OpenAI API price lineup to Claude, OpenAI is generally cheaper at the lower tiers.

What is the OpenAI Batch API and how much does it save?

The Batch API processes requests asynchronously within 24 hours at 50% of standard token prices. GPT-5.4 Standard drops from $2.50/$15.00 to $1.25/$7.50 per million tokens. It is ideal for workloads that do not need real-time responses, like nightly data processing, bulk content generation, or large-scale classification.

How do Azure OpenAI costs compare to the direct API?

Azure OpenAI pay-as-you-go rates are generally comparable to direct OpenAI API pricing. Azure also offers provisioned throughput units (PTUs) for reserved capacity at volume discounts. The main reason to choose Azure is compliance, data residency, or integration with existing Azure infrastructure, not price.

April 09, 2026 , 14 min read

OpenAI API Cost In 2026: Every Model Compared

Q: Is there an OpenAI cost calculator?

OpenAI does not offer a dedicated OpenAI cost calculator, but you can estimate costs with a simple formula: (input tokens / 1,000,000 x input price) + (output tokens / 1,000,000 x output price). OpenAI’s usage dashboard also tracks token consumption in real time and projects monthly spend.

OpenAI API costs range from $0.20 to $30 per million input tokens. Compare GPT-5.4, Mini, Nano pricing and 6 strategies to cut your bill.

By: Lyne Carolyne

Table Of Contents

How Does OpenAI API Pricing Work? What Is The OpenAI API Cost By Model? What Drives Your OpenAI API Cost? How Do You Calculate Your OpenAI API Cost? OpenAI API Cost: Real-world SaaS Examples OpenAI API Cost Vs. Alternatives: Model Comparison Azure OpenAI Cost Vs. Direct API: Which Should You Use? 6 Ways To Reduce Your OpenAI API Cost How Does CloudZero Track And Optimize OpenAI Costs? Frequently Asked Questions About OpenAI API Cost

Quick answer: How much does the OpenAI API cost? OpenAI API cost depends on the model you choose and the number of tokens you process. Input pricing ranges from $0.20 per million tokens (GPT-5.4 Nano) to $30.00 per million tokens (GPT-5.4 Pro). Output tokens cost more, ranging from $1.25 to $180.00 per million. A SaaS team routing 10,000 daily support queries through GPT-5.4 Mini, averaging 500 input tokens and 300 output tokens per query, spends about $518 per month. Costs scale directly with usage, and the model you pick is the single biggest lever on your bill.

How Does OpenAI API Pricing Work?

OpenAI uses a usage-based pricing model. You pay per token processed through the API, with separate rates for input tokens (what you send) and output tokens (what the model generates).

There is no flat monthly fee for API access.

A token is roughly three-quarters of a word in English. Every API call consumes input tokens for the prompt you send and output tokens for the response you receive. Both count toward your bill, but output tokens usually cost 4 to 6 times more than input tokens because they need more compute to generate.

OpenAI bills in increments of one million tokens. A single API call might cost a fraction of a cent, but those fractions compound fast at scale. A SaaS product serving 10,000 daily active users can easily process hundreds of millions of tokens per month.

This usage-based structure makes OpenAI flexible for prototyping, but it also means costs are unpredictable without monitoring. Engineering teams often discover their monthly bill has doubled before anyone notices a spike in token consumption.

Research Report

FinOps In The AI Era: A Critical Recalibration

What 475 executives told us about AI and cloud efficiency.

What Is The OpenAI API Cost By Model?

OpenAI’s current flagship is the GPT-5.4 family, released March 5, 2026, with Mini and Nano variants following on March 17. Here is what each model costs.

GPT-5.4 family pricing

Source: OpenAI API pricing page, verified April 2026.

Batch API offers a 50% discount on all models. Cached input pricing for Mini and Nano follows the same 90% discount pattern as Standard but confirm exact rates on OpenAI’s pricing page.

The difference between Standard and Nano is 12x on input. That gap means model selection alone can turn a $1,200/month workload into a $100/month workload if the task does not need flagship-level reasoning.

Legacy and specialized model pricing

For the full model catalog, including Codex, open-weight, and deep research models, see OpenAI’s complete model list.

OpenAI embeddings pricing

OpenAI offers two embedding models for search, classification, and RAG workloads. Text-embedding-3-small costs $0.02 per million tokens and works well for most retrieval tasks.

Text-embedding-3-large costs $0.13 per million tokens and produces higher-dimension vectors for use cases that need more precision. Both are dramatically cheaper than calling a full LLM, making embeddings the first place to look when building AI-powered search or recommendation features.

OpenAI fine-tuning pricing

Fine-tuning lets you train OpenAI models on your own data. GPT-4.1 is one of OpenAI’s flagship models available for fine-tuning, with training priced at roughly $3.00 per million tokens.

After training, fine-tuned GPT-4.1 inference typically costs around $3.00 per million input tokens and $12.00 per million output tokens, making it more expensive than the base model.

GPT-4.1 Mini offers a lower-cost option, with training at about $0.80 per million tokens and inference at roughly $0.80 input and $3.20 output.

OpenAI may offer discounted pricing for users who opt into data sharing, depending on program terms.

What Drives Your OpenAI API Cost?

Five factors determine how much you actually pay.

Model choice is the biggest lever. GPT-5.4 Nano costs $0.20 per million input tokens. GPT-5.4 Pro costs $30.00 for the same volume. That is a 150x price difference for the same number of tokens.

Token volume compounds quickly. A chatbot handling 10,000 conversations per day averaging 500 input tokens and 300 output tokens processes 240 million tokens monthly. On GPT-5.4 Mini, that costs about $518/month. On GPT-5.4 Standard, the same volume costs $1,725/month.

Context window usage affects every call. GPT-5.4 supports up to 272K tokens at standard pricing. Prompts exceeding that length trigger a surcharge: 2x on input ($5.00 per million) and 1.5x on output ($22.50 per million) for the full session. Long system prompts, retained conversation history, and RAG context all increase token counts.

Caching reduces costs when prompts share common prefixes. OpenAI automatically caches repeated input content. Cached tokens on GPT-5.4 Standard cost $0.25 per million instead of $2.50 — a 90% discount. Applications with consistent system prompts benefit the most.

Output length is often overlooked. Output tokens cost 4 to 6 times more than input tokens across all models. An unconstrained response that returns 2,000 tokens costs six times more than a response capped at 300 tokens, even though the input was identical.

How Do You Calculate Your OpenAI API Cost?

Use this formula to estimate cost per API call:

Cost per call = (input tokens / 1,000,000 x input price) + (output tokens / 1,000,000 x output price)

Here’s an example:

A customer support chatbot sends a 400-token system prompt plus a 100-token user message (500 input tokens) and receives a 300-token response (300 output tokens), using GPT-5.4 Mini.

Input cost: 500 / 1,000,000 x $0.75 = $0.000375
Output cost: 300 / 1,000,000 x $4.50 = $0.00135
Total cost per call: $0.001725 (roughly $0.002)

At 10,000 calls per day, that is $17.25/day or roughly $518/month.

To estimate your own monthly bill, multiply the cost per call by your expected daily call volume and multiply by 30. If your system prompt is reused across calls, apply caching discounts to the shared portion of the input.

For a quick estimate without manual math, OpenAI’s usage dashboard tracks token consumption in real time and projects monthly spend based on current usage patterns.

CloudZero also connects directly to OpenAI to help teams forecast monthly spending with more precision. The integration ingests your OpenAI cost and usage data, then uses your actual token consumption patterns to project future spend by model, feature, customer, or team.

OpenAI API Cost: Real-world SaaS Examples

These examples use GPT-5.4 family pricing as of April 2026 with conservative token estimates.

1. Customer support chatbot

A SaaS company routing 10,000 support queries per day through an AI chatbot. Each query averages 500 input tokens and 300 output tokens.

Model	Monthly input cost	Monthly output cost	Total monthly cost
GPT-5.4 Standard	$375	$1,350	$1,725
GPT-5.4 Mini	$113	$405	$518
GPT-5.4 Nano	$30	$113	$143

Most support chatbots do not need flagship reasoning. GPT-5.4 Mini handles FAQ-style responses and troubleshooting workflows effectively, cutting your total OpenAI cost by 70% compared to Standard.

2. Analytics reporting assistant

A product analytics feature where 2,000 users generate 10 reports per month. Each report averages 2,000 input tokens (data context plus prompt) and 500 output tokens.

Model	Monthly input cost	Monthly output cost	Total monthly cost
GPT-5.4 Standard	$100	$150	$250
GPT-5.4 Mini	$30	$45	$75
GPT-5.4 Nano	$8	$13	$21

Structured reporting is a strong fit for Nano. When the output format is predictable (e.g., JSON, tables, summaries), the cheapest model often performs well enough. Comparing OpenAI model costs across tiers is the fastest path to lower bills.

3. Semantic search with embeddings

Indexing 10 million documents at an average of 500 tokens each using text-embedding-3-small ($0.02 per million tokens).

One-time indexing cost: 5 billion tokens x $0.02/1M = $100
Monthly query cost (1 million queries, 1,000 tokens each): 1 billion tokens x $0.02/1M = $20/month

Embeddings are dramatically cheaper than LLM inference. The OpenAI token cost for embeddings is a fraction of a cent per query, making them the first option for search, classification, and recommendation workloads.

4. Audio transcription (OpenAI Whisper API pricing)

The OpenAI Whisper API cost depends on audio length. A company transcribing 500 hours of customer calls and internal meetings per month pays $0.006 per minute.

500 hours = 30,000 minutes
OpenAI Whisper API cost per minute: $0.006
Monthly cost: 30,000 x $0.006 = $180/month

For teams evaluating OpenAI Whisper API pricing, the per-minute rate makes transcription one of the most affordable OpenAI services, even at high volume.

OpenAI API Cost Vs. Alternatives: Model Comparison

OpenAI is not the only option. As of April 2026, GPT-5.4 Mini at $0.75 input is the most competitive mid-tier model on the market — cheaper than Claude Haiku 4.5 and Gemini 2.5 Flash on input. GPT-5.4 Nano at $0.20 undercuts nearly every alternative. Here is how the full lineup compares.

Model	Provider	Input (per 1M tokens)	Output (per 1M tokens)	Context window	Best for
GPT-5.4 Standard	OpenAI	$2.50	$15.00	272K (up to 1M)	General-purpose, coding, computer use
GPT-5.4 Mini	OpenAI	$0.75	$4.50	400K	High-volume chat, content generation
Claude Sonnet 4.6	Anthropic	$3.00	$15.00	200K (up to 1M	Nuanced writing, instruction-following
Claude Haiku 4.5	Anthropic	$1.00	$5.00	200K	Fast, lightweight tasks
Claude Opus 4.6	Anthropic	$5.00	$25.00	1M	Complex reasoning, agentic tasks, nuanced analysis
Gemini 3.1 Pro (preview)	Google	$2.00	$12.00	1M	Long-context reasoning and analysis
Gemini 2.5 Flash	Google	$0.25	$1.50	1M	Budget inference at scale
DeepSeek V3.2	DeepSeek	$0.28	$0.42	128K	Budget chat, classification, extraction

Sources: OpenAI, Anthropic, Google, DeepSeek. Verified April 2026.

The key takeaway on OpenAI model pricing: GPT-5.4 Mini at $0.75 input is 4x cheaper than Claude Sonnet 4.6 on input, while GPT-5.4 Nano at $0.20 is the cheapest proprietary model available — undercutting Gemini 3.1 Flash-Lite ($0.25) and sitting just below DeepSeek V3.2 ($0.28). For teams running multi-model architectures, mixing providers based on task complexity is the most cost-effective approach.

Azure OpenAI Cost Vs. Direct API: Which Should You Use?

Azure OpenAI Service offers the same OpenAI models pricing through Microsoft’s cloud infrastructure. The OpenAI pricing model on Azure uses two approaches: pay-as-you-go (per-token, similar to the direct API) and provisioned throughput units (PTUs) for reserved capacity.

Pay-as-you-go rates on Azure are generally comparable to OpenAI’s direct pricing, but Azure adds value for enterprises that need data residency, private networking, or integration with existing Azure services. PTUs make sense for workloads with consistent, high-volume inference needs, since reserved capacity avoids per-token metering.

The tradeoff: Azure gives you compliance and infrastructure control. Direct API access gives you faster access to new models and features plus more transparent OpenAI API billing. Many teams use both, with Azure for production workloads and direct API for prototyping.

6 Ways To Reduce Your OpenAI API Cost

Controlling OpenAI costs starts with understanding where you have leverage. These six strategies target the biggest cost drivers in most production workloads.

1. Use the Batch API for non-urgent workloads

OpenAI’s Batch API processes requests asynchronously within 24 hours at 50% off standard OpenAI API pricing. For nightly data processing, bulk classification, or content generation pipelines, this is the single biggest cost lever available. GPT-5.4 Standard drops from $2.50/$15.00 to $1.25/$7.50 per million tokens.

2. Tier your models by task complexity

Not every API call needs the same model. Route simple classification and extraction to Nano ($0.20 input), use Mini ($0.75 input) for chat and summarization, and reserve Standard ($2.50 input) for complex reasoning. A tiered architecture can cut total spend by 60 to 80% compared to running everything on a single model. Review GPT-5 pricing tiers regularly as OpenAI adjusts rates with new releases.

3. Maximize cache hit rates

Structure API calls so shared content (system prompts, instructions, reference material) appears at the beginning of the prompt. OpenAI caches from the start of the input, so consistent prefixes yield the highest savings. Cached input on GPT-5.4 Standard costs $0.25 per million tokens instead of $2.50, a 90% reduction. This is one of the most effective ways to lower your OpenAI pricing per token.

4. Constrain output length

Set max_tokens on every API call. Unconstrained outputs generate more tokens than needed, and output tokens cost 4 to 6 times more than input tokens. Adding a simple instruction like “respond in under 150 words” to your system prompt can cut output token spend by 40% or more. That directly lowers your effective OpenAI API cost per token.

5. Use embeddings instead of LLM calls for search

Semantic search, classification, and recommendation tasks do not need a full LLM response. Text-embedding-3-small costs $0.02 per million tokens. That is 125 times cheaper than GPT-5.4 Standard for input. Pre-compute embeddings once, then query against them for pennies.

6. Monitor usage by feature, team, and customer

Token consumption that is invisible is token consumption that grows unchecked. Set up dashboards that track OpenAI pricing by the dimensions that matter to your business, not just total tokens. When you can see that Feature X costs $0.12 per user per month and Feature Y costs $0.003, you can make informed decisions about model selection, caching, and product pricing. CloudZero is the only cost intelligence platform that can help with that.

How Does CloudZero Track And Optimize OpenAI Costs?

Most teams track OpenAI spend at the billing level: total tokens consumed, total dollars spent.

That tells you what you spent, but not why or where.

CloudZero’s OpenAI integration ingests both cost and usage data from OpenAI’s APIs, then allocates that spend to the business dimensions that actually drive decisions: cost per customer, cost per feature, cost per AI model, cost per environment. This is the same unit economics approach CloudZero applies to AWS, Azure, GCP, Snowflake, and Datadog spend.

The result is granular visibility into questions like: How much does OpenAI API cost for Customer X? Which product feature consumes the most tokens per user? Is our inference spend growing faster than revenue?

CloudZero also helps when your AI usage surpasses expected limits. The platform’s AI-powered anomaly detection monitors OpenAI spend on an hourly basis, comparing the last 36 hours against 12 months of historical data to define what “normal” looks like.

When a deployment, prompt change, or traffic spike pushes token consumption outside that baseline, CloudZero sends automated alerts directly to the engineers who own the affected infrastructure through Slack, Jira, or email. That gives your team time to investigate and fix the issue before a small spike turns into a runaway bill.

Teams using the integration have deployed it in minutes and gained 12 months of historical OpenAI cost data within 48 hours. One customer discovered that OpenAI accounted for 25% of their total cloud spend, a blind spot they had no way to quantify before.

to see how ambitious brands like Grammarly, Skyscanner, Duolingo, Toyota, and more use CloudZero to track and optimize their AI costs across models, teams, and features. You can also get a free cloud cost assessment to find out exactly where your OpenAI costs stand today.

Frequently Asked Questions About OpenAI API Cost

Author: Lyne Carolyne

Lyne Carolyne has several years of experience in FinOps and cloud economics and brings that understanding into the content she creates. Outside work, she's an avid explorer.