Table Of Contents
  • What Does The Mistral API Cost Per Million Tokens?
  • How Much Does Le Chat Cost? Consumer Plans Explained
  • Is Mistral Cheaper Than OpenAI, Claude, and DeepSeek?
  • How Do Mistral's Caching, Tiers, And Rate Limits Work?
  • Why Mistral's Three Billing Streams Create a Tracking Problem
  • Frequently Asked Questions About Mistral Pricing

Quick Answer

Mistral API pricing ranges from $0.02/$0.03 per million tokens (NeMo) up to $2.00 input (Magistral Medium, Pixtral Large) and $7.50 output (Medium 3.5). The current flagship, Mistral Large 3, costs $0.50/$1.50 per MTok. Mistral Small 4 costs $0.15/$0.60. Le Chat consumer plans run separately: Free ($0), Pro ($14.99/month), Team ($24.99/user/month).

Mistral AI pricing trips up developers because it runs two completely separate billing systems under one URL. The API, pay-per-token through la Plateforme, has its own rate card. Le Chat Mistral, the consumer chat interface, has monthly subscriptions. Mistral pricing splits along this line, and understanding which system you’re paying for is the first step to understanding the bill.

A Pro subscription at $14.99/month does not cover API calls. Codestral in Le Chat comes with Pro; Codestral through the API bills at $0.30/$0.90 per MTok. It’s the enterprise equivalent of a restaurant where food and wine have separate checks, and neither menu mentions the other.

Mistral’s model catalog has also grown faster than most pricing guides can track. The 2026 Mistral AI models lineup spans multiple active models across generalist, reasoning, code, edge, multimodal, and audio tiers. Mistral names models the way French wine regions name appellations — precisely, prolifically, and with the assumption you already know what Medium 3.1 vs. Medium 3.5 means. This guide exists to make all of that clear.

This guide covers every current Mistral API price. It also covers the Le Chat consumer plans, how Mistral compares to OpenAI, Claude, and DeepSeek, and the billing mechanics that determine your actual Mistral AI cost.

What Does The Mistral API Cost Per Million Tokens?

Here is every current Mistral AI API price per million tokens.

Flagship and generalist models

| Model | Input/MTok | Output/MTok | Context | Notes |
| --- | --- | --- | --- | --- |
| Mistral Large 3 (2512) | $0.50 | $1.50 | 262K | Current flagship |
| Mistral Medium 3.5 | $1.50 | $7.50 | 256K | Open weights (Modified MIT). Apr 2026. Highest output cost |
| Mistral Medium 3 | $0.40 | $2.00 | 131K | Balanced workloads |
| Mistral Small 4 | $0.15 | $0.60 | 128K | High-throughput production default |
| Mistral Small 3.2 | $0.08 | $0.20 | 128K | Cheapest current Small model |
| Mistral NeMo | $0.02 | $0.03 |  | Budget multilingual. Cheapest generalist |

A note on the Large model line: Mistral Large pricing changed dramatically between generations. Mistral Large 2 cost $2.00/$6.00 per million tokens. Mistral Large 3 (2512, December 2025) dropped to $0.50/$1.50, a 75% reduction on both input and output.

Reasoning models (Magistral)

| Model | Input/MTok | Output/MTok | Notes |
| --- | --- | --- | --- |
| Magistral Medium | $2.00 | $5.00 | Frontier chain-of-thought reasoning |
| Magistral Small 1.2 | $0.50 | $1.50 | Open weights, multimodal reasoning |

Magistral models generate more output tokens per request than generalist models, because chain-of-thought reasoning is verbose by design. Budget for 2–5x the output tokens you’d expect from a standard model call. For context on how reasoning model costs compare across providers, see CloudZero’s guide to how much AI costs.
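To make the 2–5x rule of thumb concrete, here is a minimal sketch that turns it into a per-request cost range. The rates come from the Magistral table above; the dictionary keys, the function name, and the sample token counts are illustrative assumptions, not official model IDs or measured usage.

```python
# USD per million tokens (input, output), from the Magistral table above.
# Keys are illustrative labels, not official API model identifiers.
MAGISTRAL_RATES = {
    "magistral-medium": (2.00, 5.00),
    "magistral-small-1.2": (0.50, 1.50),
}

def reasoning_cost_range(model, input_tokens, baseline_output_tokens,
                         low_mult=2.0, high_mult=5.0):
    """Return (low, high) USD cost for one request.

    baseline_output_tokens is what a standard model would emit;
    the multipliers apply the 2-5x budgeting rule for reasoning output.
    """
    in_rate, out_rate = MAGISTRAL_RATES[model]
    input_cost = input_tokens / 1_000_000 * in_rate
    low = input_cost + baseline_output_tokens * low_mult / 1_000_000 * out_rate
    high = input_cost + baseline_output_tokens * high_mult / 1_000_000 * out_rate
    return low, high

# Hypothetical request: 2,000 input tokens, 500 baseline output tokens.
low, high = reasoning_cost_range("magistral-medium", 2_000, 500)
print(f"${low:.4f} to ${high:.4f} per request")
```

Multiplying the range by expected request volume gives a budget band rather than a single number, which is usually the more honest forecast for reasoning workloads.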

Code and agentic models

| Model | Input/MTok | Output/MTok | Notes |
| --- | --- | --- | --- |
| Codestral (2508) | $0.30 | $0.90 | FIM (fill-in-the-middle), 32K context, IDE integration |
| Devstral 2 (2512) | $0.40 | $2.00 | Newest agentic coding model |
| Devstral Small 1.1 | $0.10 | $0.30 | Open weights, SWE agent tasks |

Mistral Codestral pricing at $0.30/$0.90 covers both FIM completions and chat-based code generation. Codestral in Le Chat is included with a Pro subscription; Codestral through the API is a separate per-token charge. Different product, different bill.

Edge and efficient models (Ministral)

| Model | Input/MTok | Output/MTok | Notes |
| --- | --- | --- | --- |
| Ministral 14B (2512) | $0.20 | $0.20 | Vision support |
| Ministral 8B (2512) | $0.15 | $0.15 | Edge / on-device |
| Ministral 3B (2512) | $0.10 | $0.10 | Cheapest |

Multimodal and specialist models

| Model | Input/MTok | Output/MTok | Notes |
| --- | --- | --- | --- |
| Pixtral Large (2411) | $2.00 | $6.00 | Legacy multimodal flagship |
| Pixtral 12B | $0.10 | $0.10 | Budget multimodal |
| Voxtral Small | $0.10 | $0.30 | Audio: transcription, TTS |

For Mistral Small pricing specifically, two current Small models exist: Small 4 ($0.15/$0.60) and Small 3.2 ($0.08/$0.20). Small 3.2 is cheaper; Small 4 is newer and more capable.

Mistral Nemo pricing at $0.02/$0.03 per MTok makes it the cheapest model in Mistral’s entire lineup and one of the cheapest APIs in the LLM market.

Which Mistral model fits which use case? Small for throughput, Large 3 for reasoning, Codestral for code, Magistral for chain-of-thought, Ministral for edge deployment.
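For a rough sense of how model choice moves the bill, the sketch below prices one workload across several generalist models using the rates listed above. The workload size (10M input and 2M output tokens per day) and the dictionary keys are assumptions chosen for illustration; swap in your own volumes.

```python
# USD per million tokens (input, output), copied from the tables above.
# Keys are illustrative labels, not official API model identifiers.
RATES = {
    "large-3": (0.50, 1.50),
    "medium-3": (0.40, 2.00),
    "small-4": (0.15, 0.60),
    "small-3.2": (0.08, 0.20),
    "nemo": (0.02, 0.03),
}

def monthly_cost(model, input_tokens_per_day, output_tokens_per_day, days=30):
    """Estimate USD per month for a steady daily token volume."""
    in_rate, out_rate = RATES[model]
    daily = (input_tokens_per_day * in_rate
             + output_tokens_per_day * out_rate) / 1_000_000
    return daily * days

# Hypothetical workload: 10M input and 2M output tokens per day.
for model in RATES:
    cost = monthly_cost(model, 10_000_000, 2_000_000)
    print(f"{model:10s} ${cost:8.2f}/month")
```

The estimate ignores caching discounts and any reasoning-token overhead, so treat it as an upper-level comparison between models rather than an invoice forecast.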

The API rates cover the per-token cost. But Mistral API cost depends on more than the rate card: model selection, caching, and tier progression all affect the final number. Most developers searching for Mistral AI pricing also want to know what Le Chat costs, and why it’s a completely different bill.

FinOps In The AI Era: A Critical Recalibration

What 475 executives told us about AI and cloud efficiency.

How Much Does Le Chat Cost? Consumer Plans Explained

Mistral Le Chat pricing is a subscription model entirely separate from API billing. A Le Chat Pro subscription does not include API access, and API credits don’t apply to Le Chat.

| Plan | Price | Messages | Key features |
| --- | --- | --- | --- |
| Free | $0 | ~25/day cap | SOTA models, 500 memories, image gen, web search, 40+ connectors |
| Pro | $14.99/mo | Up to 6x Free | Extended thinking, deep research, 15GB storage, Mistral Vibe |
| Team | $24.99/user/mo | Up to 6x Free | 30GB/user, domain verification, admin API |
| Enterprise | Custom | Custom | SAML SSO, audit logs, white label |
| Student | ~$7.50/mo | Pro features | Verified .edu email |

Is Mistral free? Yes. Both Le Chat and the API offer free tiers. Mistral AI free access through Le Chat gives you roughly 25 messages per day. The Mistral API free tier provides rate-limited access for evaluation. The Mistral free tier on both products is enough to test models, not enough to run production.

Mistral Le Chat Pro pricing at $14.99/month is the cheapest paid AI chat product from a major provider, less than ChatGPT Plus ($20) or Claude Pro ($20). It includes Mistral Vibe for in-chat coding. Codestral and Devstral through the API are billed separately. Think of it this way: Pro is a gym membership; the API is personal training sessions. Both happen at the same facility, but only one of them shows up on the subscription invoice.

The EU’s answer to American AI pricing: match the reasoning quality and undercut on price.

Is Mistral Cheaper Than OpenAI, Claude, and DeepSeek?

Yes, against the US flagships. Mistral Large 3 undercuts GPT-5.4 and Claude Sonnet 4.6 by 80% or more on input and by 90% on output.

| Model | Input/MTok | Output/MTok | Context |
| --- | --- | --- | --- |
| Mistral Large 3 | $0.50 | $1.50 | 262K |
| Mistral Small 4 | $0.15 | $0.60 | 128K |
| GPT-5.4 | $2.50 | $15.00 | 1M+ |
| Claude Sonnet 4.6 | $3.00 | $15.00 | 1M |
| DeepSeek V4 Flash | $0.14 | $0.28 | 1M |
| Gemini 2.5 Flash | $0.30 | $2.50 | 1M |

Mistral Large 3 at $0.50/$1.50 is 80% cheaper than GPT-5.4 on input ($0.50 vs. $2.50) and 90% cheaper on output ($1.50 vs. $15.00). Against Claude Sonnet 4.6 ($3.00/$15.00), it’s 83% cheaper on input and 90% on output. A pipeline processing 1 million output tokens per day costs $1.50/day on Large 3 vs. $15.00/day on Claude Sonnet — $548/year vs. $5,475.
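The percentages and annual figures in that comparison follow from two small formulas, sketched here so they can be rerun with other rates. The function names are illustrative; the prices are the ones quoted in the table above.

```python
def percent_cheaper(price, reference):
    """How much cheaper `price` is than `reference`, as a percentage."""
    return (reference - price) / reference * 100

def annual_cost(tokens_per_day, rate_per_mtok, days=365):
    """USD per year for a steady daily token volume at a per-MTok rate."""
    return tokens_per_day / 1_000_000 * rate_per_mtok * days

# Large 3 vs. GPT-5.4 and Claude Sonnet 4.6, using the rates in the table.
print(f"Input vs. GPT-5.4:  {percent_cheaper(0.50, 2.50):.0f}% cheaper")
print(f"Input vs. Sonnet:   {percent_cheaper(0.50, 3.00):.0f}% cheaper")
print(f"Output vs. either:  {percent_cheaper(1.50, 15.00):.0f}% cheaper")

# 1 million output tokens per day for a year.
print(f"Large 3: ${annual_cost(1_000_000, 1.50):,.2f}/year")
print(f"Sonnet:  ${annual_cost(1_000_000, 15.00):,.2f}/year")
```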

Mistral Small 4 at $0.15/$0.60 competes directly with Gemini 2.5 Flash ($0.30/$2.50). Half the price on input, one-quarter on output. For high-throughput classification, extraction, and chat, Small 4 is one of the most cost-effective production models available.

The one provider Mistral doesn’t undercut at the flagship level: DeepSeek. V4 Flash at $0.14/$0.28 is cheaper than both Large 3 and Small 4, though NeMo ($0.02/$0.03) and Small 3.2 ($0.08/$0.20) are priced lower still. The tradeoff is data residency (EU vs. China), reliability guarantees, and open weights availability. For teams where GDPR compliance and EU hosting are requirements, not preferences, Mistral is the only frontier provider that qualifies natively.

With prices compared, here is how the billing mechanics (caching, tiers, and rate limits) change what you actually pay.

How Do Mistral’s Caching, Tiers, And Rate Limits Work?

Three mechanics determine your actual Mistral cost beyond the rate card:

1. Prompt caching: 90% discount on repeated prefixes

Cached tokens bill at 10% of the standard input price. Include a cache_key parameter, and Mistral reuses previously computed tokens for matching prefixes. On Large 3, cached input drops from $0.50 to $0.05 per MTok. On Small 4, from $0.15 to $0.015. Mistral’s 90% cache discount matches Anthropic’s cached input pricing and exceeds OpenAI’s 50% discount. Multi-turn conversations with stable system prompts, document preambles, and tool definitions all benefit.
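Since real traffic is rarely 100% cached, the useful number is the blended input rate at a given cache hit fraction. The sketch below computes it from the 10% cached price stated above; the 70% hit fraction in the example is an assumption, and the function does not call the Mistral API.

```python
def effective_input_rate(base_rate, cached_fraction, cached_multiplier=0.10):
    """Blended USD per million input tokens.

    base_rate:         standard input price per MTok
    cached_fraction:   share of input tokens served from cache (0.0 to 1.0)
    cached_multiplier: cached tokens bill at this fraction of base_rate
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    uncached_fraction = 1.0 - cached_fraction
    return base_rate * (cached_fraction * cached_multiplier + uncached_fraction)

# Large 3 input at $0.50/MTok with a hypothetical 70% cache hit fraction.
print(f"${effective_input_rate(0.50, 0.70):.3f} per MTok")
# Fully cached input reproduces the $0.05 and $0.015 figures quoted above.
print(f"${effective_input_rate(0.50, 1.0):.3f} per MTok")
print(f"${effective_input_rate(0.15, 1.0):.3f} per MTok")
```

Output tokens are never cached, so the overall saving on a request is smaller than the input-side discount suggests, especially for output-heavy workloads.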

2. Tier system: spending drives rate limits

Mistral’s API rate limit system has five tiers, a progression that works like airline frequent-flyer status, except the currency is API invoices. The free tier provides evaluation access with strict limits. The Scale plan (Tier 1, pay-as-you-go, no monthly minimum) enables production rate limits. Higher tiers are unlocked by cumulative billing: Tier 4 requires $2,000+ in cumulative spend. For most startups and mid-market teams, Tier 1–2 covers production needs.

3. Self-hosting: $0 per token (with asterisks)

Mistral publishes open weights for many models under Apache 2.0 or Modified MIT licenses. Small 4, Medium 3.5, Magistral Small, Devstral Small, and the Ministral family can all be self-hosted. Your Mistral inference cost becomes GPU compute, with zero per-token fees. “Free” in the same way a home garden produces “free” vegetables: no ingredient cost, real labor. For high-volume workloads where inference cost dominates, the math favors self-hosting above tens of millions of tokens per day. Below that, the API is cheaper.
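A back-of-the-envelope break-even can be sketched as follows. Every input here is an assumption for illustration: the GPU count and hourly price are hypothetical, the 75/25 input/output mix is a guess, and the calculation ignores engineering labor, idle capacity, and whether the hardware can actually sustain the throughput. Only the Medium 3.5 API rates ($1.50/$7.50) come from the table above.

```python
def blended_api_rate(in_rate, out_rate, input_share=0.75):
    """USD per million tokens for a given input/output token mix."""
    return in_rate * input_share + out_rate * (1.0 - input_share)

def breakeven_tokens_per_day(gpu_cost_per_hour, gpu_count, api_rate_per_mtok):
    """Daily token volume at which always-on GPU cost equals API cost."""
    daily_gpu_cost = gpu_cost_per_hour * gpu_count * 24
    return daily_gpu_cost / api_rate_per_mtok * 1_000_000

# Medium 3.5 API rates from the table; 75% input / 25% output is assumed.
rate = blended_api_rate(1.50, 7.50)

# Hypothetical cluster: 4 GPUs at $2.00 per GPU-hour, running 24/7.
tokens = breakeven_tokens_per_day(2.00, 4, rate)
print(f"Blended API rate: ${rate:.2f}/MTok")
print(f"Break-even: {tokens:,.0f} tokens/day")
```

Under these assumptions the crossover lands in the tens of millions of tokens per day; cheaper models such as Small 4 push the break-even volume much higher, because the API bill being displaced is smaller.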

The billing mechanics explain how costs accumulate. For teams running Mistral alongside other providers, there’s one more problem most people miss.

Why Mistral’s Three Billing Streams Create a Tracking Problem

Mistral is the only major AI provider where a single team can generate costs across three completely separate billing systems at once.

  • Self-hosted Mistral Small 4 on GPU instances: billed as cloud compute on your AWS or GCP invoice.
  • Large 3 and Magistral through la Plateforme API: billed per token on your Mistral account.
  • Le Chat Pro subscriptions for developers using Vibe: billed as a monthly SaaS charge.

The word “Mistral” appears on none of those invoices in a way that connects them. You get a cloud compute line item, an API charge, and a SaaS subscription: three vendors in the ledger, one provider in reality.
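One way to picture the reconciliation is to tag each charge with the invoice it arrived on and the team that caused it, then sum along either axis. This is a minimal sketch with made-up figures and hypothetical team names; it is not how any particular billing tool works.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class LineItem:
    stream: str  # which invoice the charge arrived on
    team: str    # hypothetical owner tag applied during attribution
    usd: float

def totals_by(items, key):
    """Sum USD by an attribute name such as 'team' or 'stream'."""
    totals = defaultdict(float)
    for item in items:
        totals[getattr(item, key)] += item.usd
    return dict(totals)

# Hypothetical month of Mistral-related charges from three invoices.
items = [
    LineItem("cloud_gpu", "search", 1200.00),    # self-hosted Small 4
    LineItem("mistral_api", "search", 300.00),   # Large 3 via la Plateforme
    LineItem("mistral_api", "support", 450.00),  # Magistral via la Plateforme
    LineItem("le_chat_pro", "support", 74.95),   # 5 Pro seats at $14.99
]

print(totals_by(items, "stream"))  # the view each vendor invoice gives you
print(totals_by(items, "team"))    # the view that answers "who spent it?"
```

The per-stream totals are what arrive by default; the per-team totals only exist if every line item carries an owner tag, which is the hard part in practice.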

Add the fact that most teams also route requests to OpenAI, Claude, or DeepSeek, and you’ve got AI costs scattered across five or six dashboards with no shared unit of measurement. The FinOps Foundation’s 2026 State of FinOps survey — which polled 1,192 practitioners representing over $83 billion in annual cloud spend — found that visibility into AI costs is now the top challenge FinOps teams face, with granular monitoring of tokens, LLM requests, and GPU utilization the single most requested tooling capability in the entire survey.

CloudZero stitches these together. It attributes Mistral self-hosted GPU costs, la Plateforme API spend, and Amazon Bedrock charges by team, project, and customer through CostFormation.

According to CloudZero’s FinOps in the AI Era 2026 report, 78% of organizations can’t distinguish AI costs from general cloud spend. Flexera’s 2026 State of the Cloud Report confirms the pattern from a different angle: after five years of decline, wasted cloud spend rose to 29% — driven by cost complexity from AI workloads where pricing is tied to abstract units like tokens, credits, and slots that resist traditional forecasting.

When your Mistral spend alone lives in three separate invoices, before counting other providers, attribution at the model level isn’t optional.

It’s the only way to answer the question that matters: not just “what did Mistral cost this month?” but “was the GDPR-compliant routing strategy worth the price difference over sending everything to DeepSeek?” That’s a unit economics question, not a billing question.


Frequently Asked Questions About Mistral Pricing
