Quick Answer
A new NVIDIA H100 GPU costs $25,000–$40,000 depending on variant (PCIe vs SXM5). Cloud rental runs $1.38–$8.00+ per GPU-hour on-demand, with a market median around $2.29–$3.12/hr. Used H100s sell for $6,000–$22,000, as much as 85% below the 2023 peak.
The H100 GPU cost conversation has changed more in the past six months than in the prior two years. Blackwell B200s are shipping. Used H100 SXM5 cards that sold for $40,000 in late 2023 now move for $6,000–$22,000 on secondary markets. Cloud rates have dropped 64–75% from peak. And every pricing guide in the search results for this topic is written by a company that sells GPU compute, which means the “buy vs rent” analysis conveniently concludes with “rent from us.”
This guide is different. CloudZero doesn’t sell GPUs or cloud compute. We track AI infrastructure costs across providers. The numbers below, and the buy-vs-rent math, include the costs that GPU providers prefer you calculate later: power, cooling, rack infrastructure, and depreciation against a chip that loses 15–20% of its value for every year that Blackwell stays in production.
How Much Does An H100 Cost To Buy?
A new H100 costs $25,000–$40,000 depending on variant. Which H100, from whom, and in what condition determines where you land in that range.
New purchase pricing
| Variant | Price range | Memory | TDP | Key difference |
| --- | --- | --- | --- | --- |
| H100 PCIe 80GB | $25,000–$30,000 | 80GB HBM2e | 350W | Standard PCIe slot, no NVLink |
| H100 SXM5 80GB | $35,000–$40,000 | 80GB HBM3 | 700W | NVLink 4.0 (900 GB/s), 30% faster compute |
| DGX H100 (8-GPU) | $350,000–$400,000+ | 640GB total | 10.2kW | Full system: CPUs, networking, storage included |
The NVIDIA H100 price gap between PCIe and SXM5 is not just a memory bandwidth premium.
SXM5 connects eight GPUs via NVLink at 900 GB/s. Training a 70B parameter model across eight PCIe cards without NVLink is like running a relay race where the runners can only pass the baton through a window. It works. It is not fast. For multi-GPU training, the H100 SXM5 justifies the 40% price premium. For single-GPU inference, the PCIe variant at $25,000 does the same work at half the power draw.
H100 MSRP is a moving target. NVIDIA doesn’t publish a fixed retail price, cards move through channel partners, and street prices fluctuate with supply and demand. The ranges above reflect verified Q1 2026 market pricing from GMI Cloud, Compute Exchange, and ASA Computers ($30,970 list for PCIe 80GB).
Used and refurbished pricing
| Condition | Price range | Discount vs new |
| --- | --- | --- |
| Refurbished | $21,000–$34,000 | 15–20% off new |
| Used (non-refurbished) | $15,000–$28,000 | 30–40% off new |
| eBay secondary market | $6,000–$15,000 | 60–85% off peak |
The eBay prices are real. H100 SXM5 cards that sold for $40,000 in late 2023 now move for $6,000–$15,000 on secondary markets. The math behind that drop: it costs roughly 11x more to run inference on an H100 than on a B300, so anyone operating H100s for inference needs to charge dramatically more than competitors on newer hardware. The depreciation isn’t a market glitch. It’s Blackwell making Hopper-class hardware economically obsolete for certain workloads.
Through the first 24 months, H100s hold approximately 75–85% of acquisition value. After that, expect 15–20% annual depreciation as B200/B300 availability expands.
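The retention and depreciation figures above can be turned into a rough residual-value estimate. The following is a minimal sketch, not a market model: it assumes value declines linearly to a chosen retention fraction at month 24 and then falls by a fixed annual rate, compounded. The function name `residual_value`, the 80% retention midpoint, and the 15% annual rate are illustrative assumptions.

```python
def residual_value(purchase_price, months, retention_24m=0.80, annual_rate=0.15):
    """Estimate resale value after `months` of ownership.

    Assumptions (illustrative, not market data):
    - value declines linearly to `retention_24m` of purchase price by month 24
    - after month 24 it loses `annual_rate` per year, compounded
    """
    if months < 0:
        raise ValueError("months must be non-negative")
    if months <= 24:
        # Linear interpolation between 100% at month 0 and retention_24m at month 24.
        fraction = 1.0 - (1.0 - retention_24m) * (months / 24)
    else:
        years_after = (months - 24) / 12
        fraction = retention_24m * (1.0 - annual_rate) ** years_after
    return purchase_price * fraction


if __name__ == "__main__":
    for m in (0, 12, 24, 36, 48):
        print(f"month {m:2d}: ${residual_value(30_000, m):,.0f}")
```

Swapping in the low and high ends of the ranges gives a band rather than a point estimate, which is probably the more honest way to budget for resale.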

How Much Does It Cost To Rent An H100 Per Hour?
H100 price per hour varies by a factor of 23x across providers, the widest spread in H100 GPU pricing today. The cheapest verified on-demand rate is $1.38/hr. AWS charges over $7.50/hr for the same chip.
GPU cloud pricing has dropped 64–75% since 2023, but cloud GPU cost still varies enough that choosing the wrong provider can more than double your bill. For a broader comparison across GPU generations, see CloudZero’s GPU cloud pricing comparison. Here’s the full H100 cloud pricing comparison at May 2026 rates:
| Provider | $/GPU-hour (on-demand) | Notes |
| --- | --- | --- |
| Thunder Compute | $1.38 | Cheapest verified on-demand |
| Vast.ai (spot) | $0.34–$2.50 | Marketplace, spot with interruption risk |
| Lambda Labs | $2.86 | On-demand |
| RunPod | $1.99 | On-demand, spot from ~$1.80 |
| CoreWeave | $2.50–$3.11 | Reserved pricing lower |
| Oracle Cloud | ~$3.00 | On-demand |
| Azure (NC H100 v5) | $3.40–$6.98 | Varies by region |
| AWS (p5) | $7.50+ | 8-GPU instances, normalized per GPU |
| GCP (A3) | $8.00–$11.00 | 8-GPU instances, normalized per GPU |
Market median: $2.29/hr (Fluence Network) to $3.12/hr (GetDeploying average across 42 providers, 247 listings). Two hours on Thunder Compute’s H100s ($2.76) costs about the same as 15 minutes at GCP’s top listed rate ($2.75).
The H100 AWS cost warrants an asterisk. AWS only offers H100s in 8-GPU p5 instances. The listed price is per instance, not per GPU. Normalizing to per-GPU-hour puts AWS at $7.50+, 3x the market median and 5x the cheapest alternative. The same applies to GCP’s A3 instances. If your workload doesn’t need eight GPUs, you’re paying for seven you’re not using. For H100 GPU cost per hour comparisons, specialized GPU clouds consistently beat hyperscalers by 50–75%.
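The per-instance versus per-GPU distinction is simple arithmetic, sketched here in Python. The $60.00/hr instance rate is a hypothetical figure back-computed from the $7.50 per-GPU number above (8 × $7.50), not a quoted AWS price, and both function names are illustrative.

```python
def per_gpu_rate(instance_rate, gpus_per_instance):
    """Hourly price per GPU when the provider bills per instance."""
    if gpus_per_instance <= 0:
        raise ValueError("gpus_per_instance must be positive")
    return instance_rate / gpus_per_instance


def rate_per_useful_gpu(instance_rate, gpus_per_instance, gpus_needed):
    """Effective hourly price per GPU you actually use.

    The whole instance is billed regardless of how many GPUs do work.
    """
    if not 0 < gpus_needed <= gpus_per_instance:
        raise ValueError("gpus_needed must be between 1 and gpus_per_instance")
    return instance_rate / gpus_needed


if __name__ == "__main__":
    instance_rate = 60.00  # hypothetical $/hr for an 8-GPU instance
    print(f"normalized per GPU:      ${per_gpu_rate(instance_rate, 8):.2f}/hr")
    print(f"effective, 1 GPU needed: ${rate_per_useful_gpu(instance_rate, 8, 1):.2f}/hr")
```

The second function is the one that matters for small workloads: the normalized rate only reflects what you pay if all eight GPUs are busy.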
Should You Buy Or Rent H100 GPUs? The Real TCO Math
Every GPU cloud provider’s blog has a “buy vs rent” calculator that concludes renting is cheaper. They are usually right, but not for the reasons they emphasize.
The break-even calculation most guides get wrong
At $25,000 purchase and $3.00/hr rental, raw break-even is ~8,333 hours, about 347 days of 24/7 usage. Most guides stop here. They shouldn’t. The raw math ignores the part where your $25,000 GPU needs a $50,000 home to live in.
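For reference, the raw break-even figure is just purchase price divided by hourly rental rate. A minimal Python sketch using the numbers from this paragraph (the function name is illustrative):

```python
def raw_break_even_hours(purchase_price, rental_rate_per_hour):
    """Hours of rental that cost as much as buying the card outright.

    Ignores power, cooling, racks, and depreciation on purpose.
    """
    if rental_rate_per_hour <= 0:
        raise ValueError("rental rate must be positive")
    return purchase_price / rental_rate_per_hour


hours = raw_break_even_hours(25_000, 3.00)
days = hours / 24  # continuous 24/7 usage
print(f"{hours:,.0f} hours, about {days:.0f} days")
```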
Hidden costs that shift the break-even by 6–12 months:
| Cost | Range | Notes |
| --- | --- | --- |
| Power | $40–$60/month per GPU | 700W SXM5 at $0.10/kWh |
| Cooling infrastructure | $15,000–$100,000 | Dense GPU clusters need liquid cooling or enhanced HVAC |
| Rack and networking | $5,000–$15,000 per rack | Specialized racks, PDUs, cable management |
| Depreciation | 15–20%/year | Accelerates as Blackwell adoption grows |
With infrastructure costs included, the actual break-even for a purchased H100 server shifts to 18+ months of continuous, near-100% utilization. Most organizations don’t sustain that. Cloud rental is cheaper for teams running GPU workloads fewer than 40 hours per week, or for bursty workloads like periodic training runs and evaluation cycles.
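A rough way to see how utilization and infrastructure move the break-even is sketched below in Python. This is a simplified model under stated assumptions, not CloudZero's methodology: the $15,000 per-GPU infrastructure share and $50/month power figure are illustrative picks from the ranges in the table, the GPU is assumed to draw full power whether busy or idle, and resale value and staff time are ignored.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8,760 / 12)


def break_even_months(purchase_price, infra_cost, power_per_month,
                      rental_rate, utilization):
    """Months until owning becomes cheaper than renting, or None if never.

    Renting is assumed to be billed only for hours actually used;
    owning pays the up-front costs plus power every month.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    rental_per_month = rental_rate * HOURS_PER_MONTH * utilization
    monthly_saving = rental_per_month - power_per_month
    if monthly_saving <= 0:
        return None  # renting never costs more than just powering the card
    return (purchase_price + infra_cost) / monthly_saving


if __name__ == "__main__":
    for label, util in (("24/7", 1.0), ("40 h/week", 40 / 168)):
        months = break_even_months(25_000, 15_000, 50, 3.00, util)
        print(f"{label}: {months:.1f} months")
```

Under these assumptions, continuous use lands between 18 and 19 months, while a 40-hour week pushes the break-even out to several years, well past the point where depreciation has eaten most of the card's value.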
The key metric isn’t cost per GPU-hour. It’s cost per training run, which requires tracking utilization across sessions, not just reading invoices.
When buying makes financial sense
Purchasing makes sense under three conditions: you need GPUs running 24/7 for 18+ months, you have existing data center infrastructure (or budget of $400K+ to build it), and you have in-house ops capability to manage firmware, drivers, and cooling. That describes hyperscalers, large AI labs, and established enterprises with dedicated infrastructure teams. It doesn’t describe most mid-market engineering organizations, and that’s fine. The cloud exists precisely for teams that don’t want to operate their own GPU fleet.
Where Does The H100 Sit In The NVIDIA GPU Lineup?
The H100 specs matter less in a vacuum than in context. Here’s how the NVIDIA H100 specs compare against the full generational lineup, NVIDIA A100 vs H100, H200, and B200:
| Spec | A100 (Ampere) | H100 SXM5 (Hopper) | H200 (Hopper+) | B200 (Blackwell) |
| --- | --- | --- | --- | --- |
| FP16 TFLOPS | 312 | 989 | 989 | ~2,250 |
| Memory | 80GB HBM2e | 80GB HBM3 | 141GB HBM3e | 192GB HBM3e |
| Bandwidth | 2,039 GB/s | 3,350 GB/s | 4,800 GB/s | 8,000 GB/s |
| TDP | 400W | 700W | 700W | 1000W |
| New price | $10,000–$15,000 | $25,000–$40,000 | Not widely retail | $30,000–$50,000 |
| Cloud $/hr | $1.29–$2.50 | $1.38–$8.00+ | $0.50–$4.95 | $4.95–$18.00 |
- H100 vs A100 price: the A100 at $1.29–$2.50/hr cloud is cheaper per hour, but the H100 delivers 3–5x better throughput on transformer workloads via its Transformer Engine. Cost per training hour matters less than cost per training run. An A100 job that takes 100 hours can cost more than an H100 job that takes 30, even at a lower hourly rate.
- H100 vs H200 price: NVIDIA H200 price starts at $0.50/hr cloud from some providers, already cheaper than most H100 on-demand pricing, with 76% more memory (141GB vs 80GB) and 43% more bandwidth. For memory-bound inference workloads, the H200 is the better buy on both performance and price.
- H100 vs B200: NVIDIA B200 price runs $30,000–$50,000 to purchase, with cloud rates at $4.95–$18.00/hr (launch premium). Raw compute is 2.3x faster. The B200 is the GPU equivalent of buying a new car the month your current lease payment becomes attractive: better in every way, and specifically designed to make you regret not waiting.
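The cost-per-run comparison in the A100 bullet can be sketched in a few lines of Python. The 100-hour and 30-hour durations come from that example; the hourly rates are picked from the ranges quoted in this guide ($2.50 as the top of the A100 range, $3.12 as the upper H100 median), and the function names are illustrative.

```python
def run_cost(hours, rate_per_gpu_hour, gpus=1):
    """Total cost of one training run."""
    return hours * rate_per_gpu_hour * gpus


def break_even_rate(slow_hours, slow_rate, fast_hours):
    """Highest hourly rate at which the faster GPU still costs no more per run."""
    return slow_hours * slow_rate / fast_hours


a100 = run_cost(100, 2.50)
h100 = run_cost(30, 3.12)
print(f"A100 run: ${a100:.2f}  H100 run: ${h100:.2f}")

# How expensive can the H100 get before the A100 wins on cost per run?
for a100_rate in (1.29, 2.50):
    limit = break_even_rate(100, a100_rate, 30)
    print(f"A100 at ${a100_rate:.2f}/hr -> H100 wins below ${limit:.2f}/hr")
```

The second loop shows why the claim needs a caveat: against the cheapest A100 rates, an H100 rented at hyperscaler prices can still lose on cost per run.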
For new training infrastructure decisions, B200 is the forward-looking choice. For inference where H100 performance is sufficient, H100 cloud rates are at their lowest point ever, which is good timing for teams that don’t need Blackwell’s compute ceiling but do need to keep inference costs predictable.
Why GPU Cost Tracking Breaks Down Across Providers
Here’s the budget conversation nobody planned for:
A team renting H100s from Lambda Labs for training ($2.86/hr), RunPod for experimentation ($1.99/hr), and AWS for production inference ($7.50/hr) sees three separate invoices measuring usage in three different units. Lambda bills per GPU-hour. AWS bills per instance-hour (8 GPUs bundled). RunPod bills per GPU-second. None of them tell you what a single training run cost end-to-end, because the run probably touched all three.
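To make the unit mismatch concrete, here is a minimal Python sketch that converts the three billing units into GPU-hours and totals cost per run. Everything in it is hypothetical: the `UsageLine` record, the `run_id` tag, and the quantities and unit prices are placeholders for illustration, not real invoice formats or quoted rates, and real provider exports will need their own parsing.

```python
from dataclasses import dataclass


@dataclass
class UsageLine:
    provider: str
    run_id: str              # hypothetical tag linking usage to a training run
    quantity: float          # amount in the provider's own billing unit
    unit: str                # "gpu_hour", "instance_hour", or "gpu_second"
    unit_price: float        # dollars per billing unit (placeholder values)
    gpus_per_instance: int = 1


def to_gpu_hours(line):
    """Convert a usage line to GPU-hours regardless of billing unit."""
    if line.unit == "gpu_hour":
        return line.quantity
    if line.unit == "instance_hour":
        return line.quantity * line.gpus_per_instance
    if line.unit == "gpu_second":
        return line.quantity / 3600
    raise ValueError(f"unknown billing unit: {line.unit}")


def cost_by_run(lines):
    """Sum GPU-hours and dollars per run across all providers."""
    totals = {}
    for line in lines:
        entry = totals.setdefault(line.run_id, {"gpu_hours": 0.0, "cost": 0.0})
        entry["gpu_hours"] += to_gpu_hours(line)
        entry["cost"] += line.quantity * line.unit_price
    return totals


sample = [
    UsageLine("Lambda", "run-42", 40, "gpu_hour", 2.50),
    UsageLine("RunPod", "run-42", 7200, "gpu_second", 0.0005),
    UsageLine("AWS", "run-42", 3, "instance_hour", 60.00, gpus_per_instance=8),
]
for run_id, t in cost_by_run(sample).items():
    print(f"{run_id}: {t['gpu_hours']:.1f} GPU-hours, ${t['cost']:.2f}")
```

The hard part in practice is the `run_id`: none of the three invoices carries it, so the tagging has to happen when the job is launched.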
Add the inference costs from the models these GPUs trained. Add the API costs from OpenAI, Claude, or DeepSeek for the tasks that don’t justify self-hosted inference. Add the cloud cost optimization question: are you using reserved instances, spot pricing, or bleeding money on on-demand because nobody set up the commitments? That’s the real AI cost picture: GPU rentals + API spend + cloud compute, scattered across providers, invisible to anyone without a single pane of glass.
CloudZero attributes GPU cloud spend alongside API and inference costs by team, model, and training run through the CostFormation engine.

According to CloudZero’s FinOps in the AI Era 2026 report, 78% of organizations can’t distinguish AI costs from general cloud spend. When your H100 bill is split across three providers and the model it trained generates inference costs on a fourth, the question isn’t “how much does an H100 cost per hour?” It’s “was the training run worth it?” That’s a unit economics question, the one that separates teams optimizing GPU spend from teams just tracking invoices.

