How much does the Mistral API cost per million tokens?

Mistral AI API pricing per million tokens ranges from $0.02/$0.03 (Nemo) to $2.00/$5.00 (Magistral Medium). The flagship Mistral Large 3 costs $0.50/$1.50. The production default Mistral Small 4 costs $0.15/$0.60. Codestral costs $0.30/$0.90. Mistral 7B is deprecated; its successors are Ministral 8B ($0.15/$0.15) and Ministral 3B ($0.10/$0.10).

Is the Mistral AI API free?

The Mistral API free tier provides rate-limited evaluation access. Le Chat's free tier gives roughly 25 messages per day. The Mistral AI API free tier is enough to test models, not enough to serve production traffic. Is Mistral Small API free? Small 4 is available on the evaluation tier with rate limits, but production use requires the Scale plan. Is Mistral free for production? No, both products cap free usage heavily.

How much does Mistral Le Chat Pro cost?

Mistral Le Chat pricing: Free ($0), Pro ($14.99/month), Team ($24.99/user/month), Enterprise (custom). Mistral Le Chat Pro pricing at $14.99 is cheaper than ChatGPT Plus ($20) and Claude Pro ($20). Pro includes Mistral Vibe for coding, extended thinking, and deep research. Le Chat billing is entirely separate from API billing.

How much does Mistral OCR cost?

Mistral OCR pricing for OCR 3 (current version) is available on the API tab of mistral.ai/pricing. OCR 3 supports structured annotations, bounding boxes, and document Q&A. Mistral OCR is one of Mistral's fastest-growing products. Check the official page for current per-page or per-token rates.

What does Codestral cost?

Mistral Codestral pricing: $0.30 input / $0.90 output per million tokens, with 32K context. Supports fill-in-the-middle (FIM) for IDE integration. Codestral inside Le Chat is included with Pro; Codestral through the API bills per token separately.

How does Mistral compare to OpenAI and Claude on price?

Mistral Large 3 at $0.50/$1.50 is 80% cheaper on input and 90% cheaper on output than GPT-5.4 ($2.50/$15.00). Against Claude Sonnet 4.6 ($3.00/$15.00), it's 83%/90% cheaper. Mistral also offers EU data residency and open weights, advantages no US provider matches.

Can I self-host Mistral models for free?

Yes. Mistral AI publishes open weights for Small 4, Medium 3.5, Magistral Small, Devstral Small, Ministral 14B/8B/3B, and others under Apache 2.0 or Modified MIT. Self-hosted models have zero per-token API cost; your Mistral inference cost is GPU compute. Mistral enterprise pricing: contact Mistral for custom deployment with SAML SSO, audit logs, and dedicated support.

May 20, 2026, 10 min read

Mistral API Pricing In 2026: Every Model, Le Chat Plans, And How Costs Compare

Mistral API pricing in 2026: Large 3 at $0.50/$1.50, Small 4 at $0.15/$0.60, Medium 3.5 at $1.50/$7.50 per MTok. Le Chat Pro $14.99/mo. Full comparison with GPT, Claude, and DeepSeek.

By: Lyne Carolyne

Table Of Contents

What Does The Mistral API Cost Per Million Tokens?
How Much Does Le Chat Cost? Consumer Plans Explained
Is Mistral Cheaper Than OpenAI, Claude, and DeepSeek?
How Do Mistral's Caching, Tiers, And Rate Limits Work?
Why Mistral's Three Billing Streams Create a Tracking Problem
Frequently Asked Questions About Mistral Pricing

Quick Answer

Mistral API pricing ranges from $0.02/$0.03 per million tokens (Nemo) to $2.00/$5.00 (Magistral Medium). The current flagship, Mistral Large 3, costs $0.50/$1.50 per MTok. Mistral Small 4 costs $0.15/$0.60. Le Chat consumer plans run separately: Free ($0), Pro ($14.99/month), Team ($24.99/user/month).

Mistral AI pricing trips up developers because it runs two completely separate billing systems under one URL. The API, pay-per-token through la Plateforme, has its own rate card. Le Chat Mistral, the consumer chat interface, has monthly subscriptions. Mistral pricing splits along this line, and understanding which system you’re paying for is the first step to understanding the bill.

A Pro subscription at $14.99/month does not cover API calls. Codestral in Le Chat comes with Pro; Codestral through the API bills at $0.30/$0.90 per MTok. It’s the enterprise equivalent of a restaurant where food and wine have separate checks, and neither menu mentions the other.

Mistral’s model catalog has also grown faster than most pricing guides can track. The 2026 Mistral AI models lineup spans multiple active models across generalist, reasoning, code, edge, multimodal, and audio tiers. Mistral names models the way French wine regions name appellations — precisely, prolifically, and with the assumption you already know what Medium 3.1 vs. Medium 3.5 means. This guide exists to make all of that clear.

This guide covers every current Mistral API price. It also covers the Le Chat consumer plans, how Mistral compares to OpenAI, Claude, and DeepSeek, and the billing mechanics that determine your actual Mistral AI cost.

What Does The Mistral API Cost Per Million Tokens?

Here is every current Mistral AI API price per million tokens.

Flagship and generalist models

Model	Input/MTok	Output/MTok	Context	Notes
Mistral Large 3 (2512)	$0.50	$1.50	262K	Current flagship
Mistral Medium 3.5	$1.50	$7.50	256K	Open weights (Modified MIT). Apr 2026. Highest output cost
Mistral Medium 3	$0.40	$2.00	131K	Balanced workloads
Mistral Small 4	$0.15	$0.60	128K	High-throughput production default
Mistral Small 3.2	$0.08	$0.20	128K	Cheapest current Small model
Mistral NeMo	$0.02	$0.03		Budget multilingual. Cheapest generalist

A note on the Large model line: Mistral Large pricing changed dramatically between generations. Mistral Large 2 pricing per million tokens was $2.00/$6.00. Mistral Large 3 (2512, December 2025) dropped to $0.50/$1.50, a 75% price reduction on both input and output, and that is the current Mistral Large API pricing.
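To make the rate card concrete, here is a minimal sketch of the per-request arithmetic. The rates are copied from the tables above; the dictionary keys are hypothetical labels chosen for this example, not official API model identifiers.

```python
# Rates in USD per million tokens (MTok), copied from the tables above.
# The keys are illustrative labels, not official Mistral model IDs.
RATES = {
    "mistral-large-3": (0.50, 1.50),
    "mistral-large-2": (2.00, 6.00),
    "mistral-small-4": (0.15, 0.60),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the list-price USD cost of one call."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


def price_drop(old_rate: float, new_rate: float) -> float:
    """Return the reduction from old_rate to new_rate as a percentage."""
    return (old_rate - new_rate) / old_rate * 100
```

Under these assumptions, a Large 3 call with 2,000 input tokens and 500 output tokens comes to $0.00175, and price_drop(2.00, 0.50) and price_drop(6.00, 1.50) both give the 75% reduction described above.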

Reasoning models (Magistral)

Model	Input/MTok	Output/MTok	Notes
Magistral Medium	$2.00	$5.00	Frontier chain-of-thought reasoning
Magistral Small 1.2	$0.50	$1.50	Open weights, multimodal reasoning

Magistral models generate more output tokens per request than generalist models because chain-of-thought reasoning is verbose by design. Budget for 2–5x the output tokens you’d expect from a standard model call. For context on how reasoning model costs compare across providers, see CloudZero’s guide to how much AI costs.
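The 2–5x budgeting rule can be turned into a rough cost range. This is a sketch only: the default rates are the Magistral Medium prices from the table above, and the function name and parameters are made up for illustration.

```python
def magistral_budget(
    input_tokens: int,
    baseline_output_tokens: int,
    in_rate: float = 2.00,   # Magistral Medium input, USD per MTok
    out_rate: float = 5.00,  # Magistral Medium output, USD per MTok
    multipliers: tuple = (2, 5),
) -> tuple:
    """Return a (low, high) USD estimate for a reasoning workload.

    baseline_output_tokens is what a standard model would emit; each
    multiplier scales it to approximate chain-of-thought verbosity.
    """
    costs = [
        (input_tokens * in_rate + baseline_output_tokens * m * out_rate) / 1_000_000
        for m in multipliers
    ]
    return min(costs), max(costs)
```

For a workload of 1 million input tokens and a 1 million token output baseline, this gives a range of $12 to $27, compared with $7 if the output multiplier were ignored.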

Code and agentic models

Model	Input/MTok	Output/MTok	Notes
Codestral (2508)	$0.30	$0.90	FIM (fill-in-the-middle), 32K context, IDE integration
Devstral 2 (2512)	$0.40	$2.00	Newest agentic coding model
Devstral Small 1.1	$0.10	$0.30	Open weights, SWE agent tasks

Mistral Codestral pricing at $0.30/$0.90 covers both FIM completions and chat-based code generation. Codestral in Le Chat is included with a Pro subscription; Codestral through the API is a separate per-token charge. Different product, different bill.

Edge and efficient models (Ministral)

Model	Input/MTok	Output/MTok	Notes
Ministral 14B (2512)	$0.20	$0.20	Vision support
Ministral 8B (2512)	$0.15	$0.15	Edge / on-device
Ministral 3B (2512)	$0.10	$0.10	Cheapest Ministral model

Multimodal and specialist models

Model	Input/MTok	Output/MTok	Notes
Pixtral Large (2411)	$2.00	$6.00	Legacy multimodal flagship
Pixtral 12B	$0.10	$0.10	Budget multimodal
Voxtral Small	$0.10	$0.30	Audio: transcription, TTS

For Mistral Small pricing specifically, two current Small models exist: Small 4 ($0.15/$0.60) and Small 3.2 ($0.08/$0.20). Small 3.2 is cheaper; Small 4 is newer and more capable.

Mistral Nemo pricing at $0.02/$0.03 per MTok makes it the cheapest model in Mistral’s entire lineup and one of the cheapest APIs in the LLM market.

Which Mistral model fits which use case? Small for throughput, Large 3 for reasoning, Codestral for code, Magistral for chain-of-thought, Ministral for edge deployment.

The API rates cover the per-token cost. But Mistral API cost depends on more than the rate card: model selection, caching, and tier progression all affect the final number. Most developers searching for Mistral AI pricing also want to know what Le Chat costs, and why it’s a completely different bill.

How Much Does Le Chat Cost? Consumer Plans Explained

Mistral Le Chat pricing is a subscription model entirely separate from API billing. A Le Chat Pro subscription does not include API access, and API credits don’t apply to Le Chat.

Plan	Price	Messages	Key features
Free	$0	~25/day cap	SOTA models, 500 memories, image gen, web search, 40+ connectors
Pro	$14.99/mo	Up to 6x Free	Extended thinking, deep research, 15GB storage, Mistral Vibe
Team	$24.99/user/mo	Up to 6x Free	30GB/user, domain verification, admin API
Enterprise	Custom	Custom	SAML SSO, audit logs, white label
Student	~$7.50/mo	Pro features	Verified .edu email

Is Mistral free? Yes. Both Le Chat and the API offer free tiers. Mistral AI free access through Le Chat gives you roughly 25 messages per day. The Mistral API free tier provides rate-limited access for evaluation. The Mistral free tier on both products is enough to test models, not enough to run production.

Mistral Le Chat Pro pricing at $14.99/month is the cheapest paid AI chat product from a major provider, less than ChatGPT Plus ($20) or Claude Pro ($20). It includes Mistral Vibe for in-chat coding. Codestral and Devstral through the API are billed separately. Think of it this way: Pro is a gym membership; the API is personal training sessions. Both happen at the same facility, but only one of them shows up on the subscription invoice.

The EU’s answer to American AI pricing: match the reasoning quality and undercut on price.

Is Mistral Cheaper Than OpenAI, Claude, and DeepSeek?

Yes, against the flagship US models: Mistral Large 3 undercuts GPT-5.4 and Claude Sonnet 4.6 by 80% or more on input.

Model	Input/MTok	Output/MTok	Context
Mistral Large 3	$0.50	$1.50	262K
Mistral Small 4	$0.15	$0.60	128K
GPT-5.4	$2.50	$15.00	1M+
Claude Sonnet 4.6	$3.00	$15.00	1M
DeepSeek V4 Flash	$0.14	$0.28	1M
Gemini 2.5 Flash	$0.30	$2.50	1M

Mistral Large 3 at $0.50/$1.50 is 80% cheaper than GPT-5.4 on input ($0.50 vs. $2.50) and 90% cheaper on output ($1.50 vs. $15.00). Against Claude Sonnet 4.6 ($3.00/$15.00), it’s 83% cheaper on input and 90% on output. A pipeline processing 1 million output tokens per day costs $1.50/day on Large 3 vs. $15.00/day on Claude Sonnet — $548/year vs. $5,475.

Mistral Small 4 at $0.15/$0.60 competes directly with Gemini 2.5 Flash ($0.30/$2.50). It is half the price on input and roughly one-quarter the price on output. For high-throughput classification, extraction, and chat, Small 4 is one of the most cost-effective production models available.
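The percentages and annual figures above follow from simple ratios. A minimal sketch, using only rates quoted in the comparison table; the function names are illustrative.

```python
def percent_cheaper(our_rate: float, their_rate: float) -> float:
    """Return how much cheaper our_rate is than their_rate, in percent."""
    return (1 - our_rate / their_rate) * 100


def annual_cost(rate_per_mtok: float, mtok_per_day: float, days: int = 365) -> float:
    """Return the yearly USD cost of a steady daily token volume."""
    return rate_per_mtok * mtok_per_day * days


if __name__ == "__main__":
    # Large 3 vs. GPT-5.4 and Claude Sonnet 4.6, input then output.
    for label, ours, theirs in [
        ("vs GPT-5.4 input", 0.50, 2.50),
        ("vs GPT-5.4 output", 1.50, 15.00),
        ("vs Claude Sonnet 4.6 input", 0.50, 3.00),
        ("vs Gemini 2.5 Flash output (Small 4)", 0.60, 2.50),
    ]:
        print(f"{label}: {percent_cheaper(ours, theirs):.0f}% cheaper")
    # 1 MTok of output per day, priced on Large 3 and on Claude Sonnet 4.6.
    print(annual_cost(1.50, 1.0), annual_cost(15.00, 1.0))
```

The annual figures come out to $547.50 and $5,475.00, which the text rounds to $548 and $5,475.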

The one provider Mistral doesn’t undercut: DeepSeek. V4 Flash at $0.14/$0.28 is cheaper than both Mistral models in the comparison above, although budget models such as Nemo ($0.02/$0.03) and Small 3.2 ($0.08/$0.20) still list lower rates. The tradeoff is data residency (EU vs. China), reliability guarantees, and open weights availability. For teams where GDPR compliance and EU hosting are requirements, not preferences, Mistral is the only frontier provider that qualifies natively.

With list prices compared, here’s how billing mechanics such as caching, tiers, and rate limits change what you actually pay.

How Do Mistral’s Caching, Tiers, And Rate Limits Work?

Three mechanics determine your actual Mistral cost beyond the rate card:

1. Prompt caching: 90% discount on repeated prefixes

Cached tokens bill at 10% of the standard input price. Include a cache_key parameter, and Mistral reuses previously computed tokens for matching prefixes. On Large 3, cached input drops from $0.50 to $0.05 per MTok. On Small 4, from $0.15 to $0.015. Mistral’s 90% cache discount matches Anthropic’s cached input pricing and exceeds OpenAI’s 50% discount. Multi-turn conversations with stable system prompts, document preambles, and tool definitions all benefit.
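How much caching saves depends on what share of input tokens actually hit the cache. The sketch below computes a blended input rate from the 10% cached price described above; the cache hit ratio is an assumption you would measure from your own traffic, and the function name is illustrative.

```python
def effective_input_rate(
    list_rate: float,
    cache_hit_ratio: float,
    cached_multiplier: float = 0.10,  # cached tokens bill at 10% of list price
) -> float:
    """Return the blended input rate in USD per MTok.

    cache_hit_ratio is the fraction of input tokens served from cache,
    between 0.0 and 1.0.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    cached_share = cache_hit_ratio * cached_multiplier
    uncached_share = 1.0 - cache_hit_ratio
    return list_rate * (cached_share + uncached_share)
```

With Large 3 at $0.50 per MTok, a workload where 80% of input tokens are cached pays an effective $0.14 per MTok; a fully cached prefix pays the $0.05 floor.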

2. Tier system: spending drives rate limits

Mistral’s API rate limit system has five tiers, a progression that works like airline frequent flyer status, except the currency is API invoices. The free tier provides evaluation access with strict limits. The Scale plan (Tier 1, pay-as-you-go, no monthly minimum) enables production rate limits. Higher tiers unlock through cumulative billing: Tier 4 requires $2,000+ in cumulative spend. For most startups and mid-market teams, Tier 1–2 covers production needs.

3. Self-hosting: $0 per token (with asterisks)

Mistral publishes open weights for many models under Apache 2.0 or Modified MIT licenses. Small 4, Medium 3.5, Magistral Small, Devstral Small, and the Ministral family can all be self-hosted. Your Mistral inference cost becomes GPU compute, with zero per-token fees. It is “free” in the same way a home garden produces “free” vegetables: no ingredient cost, real labor. For high-volume workloads where inference cost dominates, the math favors self-hosting above tens of millions of tokens per day. Below that, the API is cheaper.
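A rough way to locate the break-even point is to compare a fixed daily GPU bill with what the same tokens would cost through the API. This is a sketch under loud assumptions: the GPU hourly price and the blended API rate in the usage note are hypothetical placeholders, and the model ignores engineering labor and whether one instance can actually serve the volume.

```python
def breakeven_tokens_per_day(
    gpu_cost_per_hour: float,
    blended_api_rate_per_mtok: float,
    hours_per_day: float = 24.0,
) -> float:
    """Return the daily token volume where GPU cost equals API cost.

    blended_api_rate_per_mtok is your average USD per MTok across input
    and output for the workload. Above the returned volume, self-hosting
    is cheaper on compute alone.
    """
    daily_gpu_cost = gpu_cost_per_hour * hours_per_day
    return daily_gpu_cost / blended_api_rate_per_mtok * 1_000_000
```

As an illustration only, a hypothetical $2.00/hour GPU instance running around the clock against a hypothetical blended Small 4 rate of $0.30 per MTok breaks even at about 160 million tokens per day. Substitute your own cloud pricing and traffic mix before drawing conclusions.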

The billing mechanics explain how costs accumulate. For teams running Mistral alongside other providers, there’s one more problem most people miss.

Why Mistral’s Three Billing Streams Create a Tracking Problem

Mistral is the only major AI provider where a single team can generate costs across three completely separate billing systems at once.

Self-hosted Mistral Small 4 on GPU instances: billed as cloud compute on your AWS or GCP invoice.
Large 3 and Magistral through la Plateforme API: billed per token on your Mistral account.
Le Chat Pro subscriptions for developers using Vibe: billed as a monthly SaaS charge.

The word “Mistral” appears on none of those invoices in a way that connects them. You get a cloud compute line item, an API charge, and a SaaS subscription: three vendors in the ledger, one provider in reality.
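One way to picture the fix is to tag every line item with a normalized provider before summing. The sketch below is a toy model, not how any particular cost platform works; the LineItem fields and all dollar amounts in the usage note are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LineItem:
    invoice_source: str  # where the charge shows up, e.g. a cloud invoice
    description: str
    provider: str        # normalized tag you assign, e.g. "mistral"
    usd: float


def total_by_provider(items: list) -> dict:
    """Sum line items by normalized provider tag across all invoices."""
    totals = defaultdict(float)
    for item in items:
        totals[item.provider] += item.usd
    return dict(totals)
```

For example, a GPU instance on a cloud invoice, a la Plateforme API charge, and a Le Chat Pro seat would each carry provider="mistral", so total_by_provider reports them as one number even though they arrive on three invoices.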

Add the fact that most teams also route requests to OpenAI, Claude, or DeepSeek, and you’ve got AI costs scattered across five or six dashboards with no shared unit of measurement. The FinOps Foundation’s 2026 State of FinOps survey — which polled 1,192 practitioners representing over $83 billion in annual cloud spend — found that visibility into AI costs is now the top challenge FinOps teams face, with granular monitoring of tokens, LLM requests, and GPU utilization the single most requested tooling capability in the entire survey.

CloudZero stitches these together. It attributes Mistral self-hosted GPU costs, la Plateforme API spend, and Amazon Bedrock charges by team, project, and customer through CostFormation.

According to CloudZero’s FinOps in the AI Era 2026 report, 78% of organizations can’t distinguish AI costs from general cloud spend. Flexera’s 2026 State of the Cloud Report confirms the pattern from a different angle: after five years of decline, wasted cloud spend rose to 29% — driven by cost complexity from AI workloads where pricing is tied to abstract units like tokens, credits, and slots that resist traditional forecasting.

When your Mistral spend alone lives in three separate invoices, before counting other providers, attribution at the model level isn’t optional.

It’s the only way to answer the question that matters: not just “what did Mistral cost this month?” but “was the GDPR-compliant routing strategy worth the price difference over sending everything to DeepSeek?” That’s a unit economics question, not a billing question.

Frequently Asked Questions About Mistral Pricing

Author: Lyne Carolyne

Lyne Carolyne has several years of experience in FinOps and cloud economics and brings that understanding into the content she creates. Outside work, she's an avid explorer.