Quick Answer
AI pricing covers the cost structures and billing models providers use to charge for AI products: per-token APIs (GPT-4o at $2.50/1M input tokens), per-seat subscriptions (Copilot at $30/user/month), per-conversation billing (Agentforce at $2/conversation), and consumption-based GPU compute (H100 instances at $55.04/hour). There is no standard. The total AI cost is almost always higher than the sticker price.
OpenAI charges per token. Microsoft charges per seat. AWS charges per GPU-hour. Salesforce charges per conversation. Anthropic charges per million tokens with different rates for input and output. And the engineering team that just spun up a fine-tuning job on SageMaker is about to learn what “consumption-based” means when the invoice arrives. (Spoiler: it means “more than you thought.”)
AI pricing has no standard. Gartner forecasts worldwide AI spending will reach $2.5 trillion in 2026, and every dollar of it is priced through a different model. For companies adopting AI across the organization, the question is not just “how much does AI cost?” It is “how do I compare costs across providers when one bills by the word, another by the seat, and a third by the conversation?”
CloudZero processes more than $15 billion in managed cloud and AI spend and helps engineering and finance teams understand what AI actually costs at the unit level. This guide covers the six AI pricing models that dominate the market, what the major providers actually charge, the hidden costs pricing pages leave out, and how to manage AI cost when every vendor bills differently.
One clarification: this article covers the cost of AI, not the use of AI for product pricing optimization (sometimes called “AI-powered pricing” or “dynamic pricing”). For that, Buynomics has a comparison. This guide is about what AI costs to buy, build, and run.
What does AI pricing actually mean?
AI pricing is the collection of cost structures, billing models, and metering approaches that AI providers use to charge for their products and services. Unlike traditional software (one license, one price, one invoice), AI pricing typically combines multiple billing dimensions in a single product: tokens consumed, compute hours used, data processed, features accessed, and seats provisioned.
What makes AI pricing uniquely complex is that it spans two layers, and most guides cover only one of them:
The application layer includes SaaS subscriptions (ChatGPT Plus at $20/month), API credits (from $0.15 to $15.00 per million tokens depending on provider and model tier), per-seat fees (Copilot at $18-$30/user/month), and outcome-based charges (Salesforce Agentforce, Intercom Fin). This is the visible AI cost: the vendor invoice, the expense report, the line item someone approved.
The infrastructure layer includes GPU compute at $1-$55/hour, model training runs, data pipelines, vector databases, and self-hosted inference. This is the invisible AI cost: buried in the cloud bill under generic compute and storage. CloudZero’s ROI in the AI Era report found organizations budget 30-36% of cloud spend for AI, while AI-specific line items show up at just 2.5%. The other 97.5% is ghost spend nobody tracks.
Generative AI pricing and AI service pricing in particular combine both layers in ways traditional software never did. Understanding where the costs live is step one. Knowing what the major providers actually charge is step two.
How the major AI providers price their services
Every major AI provider uses a different pricing model:
| Provider | Pricing model | Consumer/entry | Enterprise/API | Primary cost driver |
| --- | --- | --- | --- | --- |
| OpenAI | Per-token + subscription | ChatGPT Free / Plus $20/mo | GPT-4o: $2.50/$10.00 per 1M tokens | Token volume, model tier |
| Anthropic | Per-token + subscription | Claude Free / Pro $20/mo | Sonnet 4.6: $3.00/$15.00 per 1M tokens | Token volume, caching efficiency |
| Google | Per-token + Vertex AI | AI Studio Free / Advanced $20/mo | Gemini 2.5 Pro: $1.25/$10.00 per 1M tokens | Token volume, Vertex compute |
| Microsoft | Per-seat subscription | Copilot Business $18/user/mo | Enterprise $30/user/mo + M365 base | Seat count (usage-independent) |
| AWS | Consumption-based | Bedrock per-token | SageMaker H100 at $55.04/hr | GPU type, training duration |
| Salesforce | Per-conversation / Flex Credits | Foundations (Free) | $2/conversation or $0.10/action | Conversation/action volume |
For complete breakdowns, see CloudZero’s guides to ChatGPT pricing, Claude API pricing, OpenAI API pricing, and Gemini API pricing. For a token-by-token comparison across all major LLMs, see CloudZero’s LLM API pricing comparison.
The table makes one thing clear: AI pricing comparison across providers is not a spreadsheet exercise. It requires normalizing fundamentally different billing models into common units. This is why organizations working across multiple cloud service providers need a platform that can ingest costs from all of these sources into a single view. CloudZero does this by pulling data from Anthropic, OpenAI, AWS, GCP, Azure, and other providers into one normalized cost model.
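To make the normalization problem concrete, here is a minimal Python sketch that converts three of the billing models above into one common unit: cost per interaction. The list prices come from the table; the workload figures (tokens per interaction, interactions per seat) are hypothetical assumptions chosen for illustration, not vendor data, and the result changes substantially when they change.

```python
# Hypothetical workload assumptions (not from any vendor); replace with your own.
INPUT_TOKENS_PER_INTERACTION = 1_500
OUTPUT_TOKENS_PER_INTERACTION = 500
INTERACTIONS_PER_SEAT_PER_MONTH = 200


def per_token_cost(input_rate_per_m: float, output_rate_per_m: float) -> float:
    """Cost of one interaction under per-token billing (rates in USD per 1M tokens)."""
    return (
        INPUT_TOKENS_PER_INTERACTION * input_rate_per_m
        + OUTPUT_TOKENS_PER_INTERACTION * output_rate_per_m
    ) / 1_000_000


def per_seat_cost(seat_price_per_month: float) -> float:
    """Cost of one interaction when a flat seat fee is spread over monthly usage."""
    return seat_price_per_month / INTERACTIONS_PER_SEAT_PER_MONTH


# List prices quoted in the table above. The Copilot figure excludes the M365 base.
cost_per_interaction = {
    "GPT-4o (per-token)": per_token_cost(2.50, 10.00),
    "Claude Sonnet 4.6 (per-token)": per_token_cost(3.00, 15.00),
    "Copilot Enterprise (per-seat)": per_seat_cost(30.00),
    "Agentforce (per-conversation)": 2.00,
}

for name, cost in sorted(cost_per_interaction.items(), key=lambda kv: kv[1]):
    print(f"{name:32s} ${cost:.5f} per interaction")
```

The point of the sketch is not the ranking it prints but how sensitive that ranking is to the assumptions: halve the interactions per seat and the per-seat cost doubles, while the per-token figures do not move.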
Now that the providers are mapped, the next question is which pricing model you are dealing with and what it means for your budget.
6 AI pricing models every buyer should understand
Six pricing models dominate the AI market. Each creates a different cost management challenge, and understanding which one you are dealing with determines how accurately you can forecast spend.
| Model | How you’re billed | Example | Forecasting difficulty |
| --- | --- | --- | --- |
| Per-token | Input + output tokens processed | GPT-4o $2.50/$10.00 per 1M | High (scales with usage) |
| Per-seat | Flat monthly fee per user | Copilot $18-$30/user/mo | Low (fixed, but usage-blind) |
| Consumption (compute) | GPU time by the hour | H100 at $55.04/hr | High (depends on uptime) |
| Per-conversation/resolution | Each completed interaction | Agentforce $2/conversation | Medium (volume-dependent) |
| Tiered / freemium | Free tier + paid capacity | ChatGPT Free/Plus/Pro | Medium (upgrade creep) |
| Hybrid | Base fee + usage overages | Subscription + per-token | Highest (floor plus moving ceiling) |
- Per-token pricing. The dominant model for AI API pricing. You pay based on input and output tokens processed. Used by OpenAI, Anthropic, Google, and most other LLM providers. Input rates range from $0.15/1M tokens (GPT-4o mini) to $3.00/1M tokens (Claude Sonnet 4.6), with output tokens costing several times more (4x for GPT-4o and 5x for Claude Sonnet 4.6 at the rates listed above). This is token-based pricing in practice: costs scale linearly with usage, which means a popular AI feature can double your bill in a month without anyone changing a setting. CloudZero’s direct Anthropic and OpenAI integrations pull token-level consumption into the same cost view as cloud infrastructure, so teams can see exactly which models drive which costs.
- Per-seat subscription. Fixed monthly cost per user, regardless of usage. Microsoft Copilot at $18-$30/user/month is the canonical example. Also used by Notion AI, Grammarly, and most AI-embedded SaaS. AI subscription pricing is predictable for budgeting but disconnected from actual value: you pay the same whether the user prompts Copilot 100 times a day or lets it gather dust. As Erik Peterson, CloudZero’s founder and CTO, puts it: “The cloud should drive innovation, not headaches. But you can’t innovate efficiently on costs you can’t see.” Per-seat pricing hides the usage signal that tells you whether the investment is working.
- Consumption-based (compute hours). Pay for GPU infrastructure by the hour. Used by AWS SageMaker, Google Vertex AI, and Azure ML. AI infrastructure pricing ranges from $1/hour for inference-optimized chips to $55.04/hour for NVIDIA H100 clusters. This is also how AI chatbot cost scales when running self-hosted inference. Cost drivers: GPU type, training duration, and whether you remembered to shut down the instance over the long weekend. (The instance remembers. It kept running. So did the meter.)
- Per-conversation or per-resolution. Pay per completed interaction. Salesforce Agentforce at $2/conversation, Intercom Fin at $0.99/resolution. This model aligns cost with business outcomes, which sounds elegant until you try to forecast how many conversations your AI will handle next quarter. Salesforce already offers three different pricing models for Agentforce (per-conversation, Flex Credits, per-user), because even the vendor building the product could not settle on one.
- Tiered pricing / freemium. Free tier with limited usage, paid tiers adding more capacity. Used by ChatGPT (Free/Plus/Pro), Claude (Free/Pro/Team), and most consumer-facing AI. This is where shadow AI begins: individual contributors start on free, usage grows, and suddenly the organization has 200 individual subscriptions on personal credit cards. Expense-based SaaS purchasing grew 267% year-over-year, according to Zylo’s 2026 SaaS Management Index. That is shadow AI entering through the expense report, one $20 subscription at a time.
- Hybrid pricing. Combines a base subscription with variable usage charges. Hybrid pricing surged from 27% to 41% of B2B companies in just 12 months, according to Growth Unhinged’s 2025 State of B2B Monetization report. Example: a platform charges a monthly fee plus per-token overages above a threshold. This is the hardest model to forecast: the bill has a fixed floor and a variable ceiling, and the ceiling moves with adoption. More vendors are shifting to it as they mature.
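The forecasting difference between per-token and hybrid billing is easier to see in numbers. The following Python sketch uses the GPT-4o list rates quoted above; the token volumes and the hybrid plan (base fee, included tokens, overage rate) are hypothetical values invented for illustration, not any vendor's actual plan.

```python
def per_token_bill(
    input_tokens: int,
    output_tokens: int,
    input_rate_per_m: float,
    output_rate_per_m: float,
) -> float:
    """Monthly bill under pure per-token pricing (rates in USD per 1M tokens)."""
    return (
        input_tokens * input_rate_per_m + output_tokens * output_rate_per_m
    ) / 1_000_000


def hybrid_bill(
    base_fee: float,
    included_tokens: int,
    used_tokens: int,
    overage_rate_per_m: float,
) -> float:
    """Monthly bill under hybrid pricing: fixed floor plus per-token overage."""
    overage_tokens = max(0, used_tokens - included_tokens)
    return base_fee + overage_tokens * overage_rate_per_m / 1_000_000


# Per-token: doubling usage doubles the bill (GPT-4o rates, hypothetical volumes).
this_month = per_token_bill(400_000_000, 100_000_000, 2.50, 10.00)
next_month = per_token_bill(800_000_000, 200_000_000, 2.50, 10.00)

# Hybrid (hypothetical plan): $500 base, 100M tokens included, $5 per 1M overage.
under_threshold = hybrid_bill(500.0, 100_000_000, 80_000_000, 5.00)
over_threshold = hybrid_bill(500.0, 100_000_000, 300_000_000, 5.00)

print(this_month, next_month)           # 2000.0 4000.0
print(under_threshold, over_threshold)  # 500.0 1500.0
```

Under the per-token plan the bill tracks usage exactly. Under the hybrid plan the bill never drops below the base fee, and once usage crosses the included threshold it grows with adoption, which is the floor-plus-moving-ceiling behavior described above.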
These six models answer “how is AI priced?” But the sticker price is only the beginning. Here is what it leaves out.
The hidden costs of AI that pricing pages do not show you
The pricing table shows the part of the iceberg above the waterline. The actual cost of generative AI in production includes five categories of spend:
- GPU infrastructure. Fine-tuning models requires GPU compute at $1-$55+ per hour. Even organizations using APIs for inference often train custom models or run open-source alternatives. AI server cost for a single H100 cluster running 24/7 at $55.04/hour comes to roughly $40,200/month. CloudZero’s cloud GPU pricing comparison breaks this down by instance type across AWS, GCP, and Azure.
- Data pipelines. Cleaning, labeling, embedding, and indexing data into vector databases often costs more than the model itself. Databricks and Snowflake costs sit under “data infrastructure” on the cloud bill. CloudZero’s ROI in the AI Era report found organizations’ reported AI spending is roughly 12x lower than actual AI-driven cloud consumption. The gap is data pipelines and compute that exist because of AI but never get attributed to it.
- Shadow AI. ChatGPT is now the number-one most expensed application on corporate credit cards, according to Zylo. Eight of the top 50 most-expensed applications are AI-native. This is spending that procurement cannot see, IT cannot govern, and finance cannot forecast. It is the AI equivalent of every department buying their own printer in 1998, except each printer costs $20/month and multiplies when nobody is looking.
- Integration overhead. RAG pipelines, prompt libraries, evaluation frameworks, multi-model orchestration: all require engineering time and infrastructure that never appears on a pricing page.
- Surprise bills. 78% of IT leaders experienced unexpected charges from consumption-based AI pricing, according to Zylo, and 61% had to cut other projects as a result. A misconfigured pipeline, a runaway training job, or a context window nobody optimized can burn through a month’s budget in a weekend. CloudZero’s anomaly detection catches these spikes within hours. Fintech company Upstart used CloudZero’s anomaly alerts to cut total cloud spend by over $16 million. Drift saved $2.4 million in annual AWS costs.
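The GPU figures above are simple arithmetic, and it is worth running them once to see how quickly hourly billing adds up. This short Python sketch uses the $55.04/hour H100 rate quoted in this guide; the 730-hour month (8,760 hours per year divided by 12) and the 72-hour long weekend are assumptions made for illustration.

```python
H100_HOURLY_RATE = 55.04  # USD/hour, the SageMaker H100 figure quoted in this guide
HOURS_PER_MONTH = 730     # assumption: 8,760 hours per year / 12 months
LONG_WEEKEND_HOURS = 72   # assumption: Friday evening to Monday evening


def gpu_cost(hourly_rate: float, hours: float) -> float:
    """Cost of keeping a GPU instance running for the given number of hours."""
    return round(hourly_rate * hours, 2)


monthly_cost = gpu_cost(H100_HOURLY_RATE, HOURS_PER_MONTH)
forgotten_weekend_cost = gpu_cost(H100_HOURLY_RATE, LONG_WEEKEND_HOURS)

print(f"24/7 for a month:        ${monthly_cost:,.2f}")            # $40,179.20
print(f"Left on over a weekend:  ${forgotten_weekend_cost:,.2f}")  # $3,962.88
```

A single idle instance left running over a long weekend costs nearly $4,000 at this rate, before any storage or data transfer charges.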
The hidden cost layer is where most AI budget surprises originate. Managing it requires a different approach than reading pricing pages.
How CloudZero resolves AI pricing chaos
CloudZero provides an AI ROI solution: understanding what AI costs per team, per feature, per customer, across every provider and billing model. How?
- First-ever Anthropic integration. CloudZero was the first cloud cost platform to integrate directly with Anthropic’s Usage and Cost API, pulling Claude token consumption, model usage, and caching efficiency into the same view used for AWS, Azure, and GCP. Teams building with Claude see spend broken down by project, model, operation, and workspace.
- OpenAI with per-user granularity. CloudZero’s OpenAI integration ingests both cost and usage data, enabling cost per user, per model, per token type. No more reconciling three billing dashboards to figure out which model costs what.
- Multi-provider normalization. CloudZero ingests data from AWS, GCP, Azure, Anthropic, OpenAI, Kubernetes, Snowflake, Datadog, MongoDB, New Relic, and any other source through the AnyCost API. All normalized into one cost model. All mappable to business dimensions (teams, features, products, customers). This is AI cost management at scale.

- Unit economics for AI. CloudZero’s cost per customer capability answers the question pricing pages never do: what does each AI-powered feature cost per customer interaction? The difference between “AI costs $200K/month” and “the search feature costs $0.003 per query for profitable customers and $0.14 for unprofitable ones” is the difference between a number and a decision.
- AI-powered anomaly detection. CloudZero’s anomaly detection compares 36 hours of hourly spend against 12 months of history, then alerts the team that owns the affected workload. Not a monthly finance report. Not a weekly rollup. Hourly data, because at $55/hour, every hour counts.

- Budgets that work for consumption pricing. CloudZero’s budgeting tracks spend against plans at the team and product level. When AI API costs trend toward a budget breach, the team that owns the spend knows before the bill arrives.
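For readers who want a feel for what hourly anomaly detection involves, here is a deliberately simple Python sketch. It is not CloudZero's algorithm, whose internals are not described in this guide; it swaps in a basic statistical rule (flag the recent window when its average exceeds the historical mean by more than a chosen number of standard deviations). The function name, the threshold, and the sample spend figures are all hypothetical.

```python
from statistics import mean, stdev


def is_spend_anomaly(
    recent_hourly: list[float],
    baseline_hourly: list[float],
    threshold_sigmas: float = 3.0,
) -> bool:
    """Return True when recent average hourly spend is unusually high.

    Illustrative rule only: compares the recent mean against
    baseline mean + threshold_sigmas * baseline standard deviation.
    The baseline needs at least two data points for stdev().
    """
    baseline_mean = mean(baseline_hourly)
    baseline_sd = stdev(baseline_hourly)
    return mean(recent_hourly) > baseline_mean + threshold_sigmas * baseline_sd


# Hypothetical hourly spend in USD.
baseline = [10, 12, 11, 9, 10, 11, 12, 10]
normal_window = [11, 10, 12]
spike_window = [11, 55, 55]  # e.g. an H100 instance someone forgot to stop

print(is_spend_anomaly(normal_window, baseline))  # False
print(is_spend_anomaly(spike_window, baseline))   # True
```

A production system would also need to handle seasonality, gradual growth, and routing the alert to the owning team, none of which this sketch attempts.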
For AI cost optimization strategies (commitment planning, model right-sizing, prompt optimization), see CloudZero’s guide to AI cost optimization. For the full AI cost management framework, see CloudZero’s AI cost management guide.
Organizations like Toyota, Duolingo, Coinbase, Upstart, Drift, and Skyscanner use CloudZero to manage more than $15 billion in cloud and AI spend. Ready to see what your AI spend actually looks like?