Monitoring Kubernetes costs can be challenging. Here's how to analyze your K8s costs — and how CloudZero simplifies Kubernetes cost monitoring.
While Kubernetes enables your team to deliver more value, more rapidly, cost discussions around Kubernetes — and Kubernetes cost monitoring — can be difficult.
You have disposable and replaceable compute resources constantly coming and going, on a range of types of infrastructure. Yet at the end of the month, you just get a billing line item for EKS cost and a bunch of EC2 instances.
When you try to do a Kubernetes cost analysis, the bill doesn’t have any context about the workloads being orchestrated — and it certainly doesn’t link those costs to your business contexts, like cost per customer, tenant, or team.
Let’s start at the beginning.
Monitoring costs in Kubernetes involves aggregating, visualizing, and tracking the cost of resources running in Kubernetes clusters. Real-time monitoring tracks these costs continuously, with alerting tools that notify you of cost anomalies that may lead to overspending.
See, a graph can reveal something you might have missed in a command-line output or tabular view. With dashboards and visualizations, you can more easily break down Kubernetes costs by clusters, namespaces, pods, deployments, containers, workloads, and more.
Ideally, you’ll want to check these items’ costs for specific periods of time, pinpoint the most expensive namespaces, and assess how they compare to the rest of your setup. You can then accurately share the cost insights with your team and reconcile them with your cloud bills as well.
That not only improves Kubernetes cost analysis but also provides a solid baseline for optimizing your Kubernetes resources. Essentially, you gain a greater understanding of how you can reduce costs without negatively impacting workloads, service level agreements (SLAs), and more.
So, how do you understand Kubernetes costs? Here’s how to perform a detailed Kubernetes cost analysis.
Now, if you’re running Kubernetes via EKS on AWS (which we’ll assume for the rest of this guide), those costs show up as line items for a set of EC2 instances. Let’s imagine we’re building a social media site that currently consists of three distinct backend services: “chat,” “users,” and “content.”
In a conventional server-based architecture, we might run this with a load balancer in front of a distinct collection of EC2 instances running each service. So, if our “chat” service is running on three t3.medium instances, we can roll the cost of operating those into the total cost of that service.
Does that tell us everything we need to know about the cost of operating our services?
Not quite. To completely understand all this, we must also distribute the costs of external services to the parts of our business that use them.
But, at least regarding direct server costs — the piece that Kubernetes will impact — the answer is basic enough: The server cost of any given service is the cost of the servers it runs on.
Now, imagine our monthly bills are skyrocketing and we need to "debug" our costs. Consider these two scenarios:
Scenario 1: Our “chat” service is a compute monster sitting on twenty c5.18xlarge instances and still running out of CPU, but “users” and “content” are both happily plugging along with clusters of three m5.large each. Clearly, it’s “chat” that’s driving our costs and probably needs some serious rethinking.
Scenario 2: All of our services were running on a single Kubernetes cluster of twenty-one c5.18xlarge machines. The total cost of running that cluster wouldn’t by itself tell us anything about that kind of imbalance, or about which of our features might be responsible for most of our costs. It would be like looking at just a bottom line in place of our entire AWS bill, without anything broken down into individual line items, and then guessing from there.
A detailed Kubernetes cost analysis goes further.
To get closer to thinking about Kubernetes service costs, let's first reconfigure our raw server-based architecture.
What if, instead of running each service on its own separate cluster, we just had one cluster of machines, and each of them hosted some subset of our services?
In fact, let’s make it a little easier to do this on paper.
What if we had each of our instances always host a copy of each of our services? We’d have chosen a pretty silly architecture with some illogical scaling properties, true, but bear with us for a minute.
Now we still have specific, concrete costs for operating each server, but we need some intermediate model to say how much of that cost belongs to each service.
There is a way to easily analyze and manage Kubernetes costs. Schedule a demo of CloudZero to find out how.
What drives the need to scale our cluster up or down? After all, that’s the clearest meaning of “driving costs” — what’s making us unable to operate our cluster using fewer, cheaper resources?
Generally, scaling is driven by two things: CPU and memory. So, if “chat” is using 80% of an instance’s CPU and 20% of its memory while “users” and “content” are each using 3% of each, we can look at these numbers and distribute the total cost of the machine into four buckets: “chat,” “users,” “content,” and “unused.”
It’s still a bit tricky — we need some way of deciding how to weigh the relative cost of memory and CPU. At CloudZero, we’ve approached this by building an econometric model that estimates — all else being equal — the relative marginal costs of adding one additional vCPU and one additional GB of memory to a given instance (to be clear: this is a useful modeling conceit, not an actually available exchange).
You can learn how to measure Kubernetes costs with CloudZero in three easy steps here.
Here’s the model we use:
Let’s say our c5.18xlarge costs $3/hour; it has 72 vCPUs and 144 GB of memory. Let’s say one additional vCPU costs 8.9x as much as one more GB of RAM. This would mean that $2.45 of our hourly cost is attributable to compute and $0.55 to memory. And, further, $2.07 of the $3 belongs to the “chat” service, $0.09 each to “users” and “content,” and the remaining $0.75 is unused.
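To make the arithmetic concrete, here is a small Python sketch of that weighting model, using the c5.18xlarge numbers from the example. The 8.9x vCPU-to-GB weight is the modeling conceit described above, not a published AWS price ratio.

```python
# Split an instance's hourly cost into compute and memory buckets,
# then attribute those buckets to services by their utilization shares.
# The 8.9x vCPU-to-GB weight is an assumed modeling parameter.

VCPU_TO_GB_WEIGHT = 8.9  # one vCPU "costs" 8.9x as much as one GB of RAM

def split_instance_cost(hourly_cost, vcpus, memory_gb):
    """Split an instance's hourly cost into (compute, memory) parts."""
    weighted_cpu = vcpus * VCPU_TO_GB_WEIGHT
    total_weight = weighted_cpu + memory_gb
    cpu_cost = hourly_cost * weighted_cpu / total_weight
    mem_cost = hourly_cost * memory_gb / total_weight
    return cpu_cost, mem_cost

def service_cost(cpu_cost, mem_cost, cpu_share, mem_share):
    """Cost attributable to a service with the given utilization shares."""
    return cpu_cost * cpu_share + mem_cost * mem_share

cpu_cost, mem_cost = split_instance_cost(3.00, vcpus=72, memory_gb=144)
chat = service_cost(cpu_cost, mem_cost, 0.80, 0.20)
users = service_cost(cpu_cost, mem_cost, 0.03, 0.03)
content = service_cost(cpu_cost, mem_cost, 0.03, 0.03)
unused = 3.00 - chat - users - content

print(f"compute ${cpu_cost:.2f}, memory ${mem_cost:.2f}")
print(f"chat ${chat:.2f}, users ${users:.2f}, "
      f"content ${content:.2f}, unused ${unused:.2f}")
```

Running this reproduces the split above: roughly $2.45 compute and $0.55 memory, with about $2.07 attributed to “chat.”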
Now we’re back to a model that shows what engineering-meaningful unit is driving costs — that pesky “chat” service — and with a model that we’ll be able to carry over directly to Kubernetes.
Kubernetes itself is a way of running all of these services across a single underlying cluster of servers, even if it’s a considerably smarter one.
Here, instead of just spinning up an instance of each service on each server and letting it do what it does, each of these services will be encapsulated in a logical set of pods, and then Kubernetes will do its container orchestration magic to schedule those pods onto the cluster as needed.
Exactly the same logic discussed here applies to breaking node costs out into the pods that are using them — the only Kubernetes-specific part of the procedure comes from collecting those metrics about compute and memory usage.
Really, “pod” is the only Kubernetes abstraction that we need directly because it’s the atomic unit of scheduling, and based on it we can reconstruct any higher-level abstractions we might want to use.
AWS bills EC2 instances on an hourly basis, but a variety of pods belonging to various services, namespaces, and so on could spin up and down on a given instance over the course of that hour.
Fortunately, Kubernetes exposes a couple of Prometheus-formatted metrics on its /metrics endpoints that we can use: pod_cpu_utilization and pod_memory_utilization to tell us what is going on minute-to-minute.
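As a rough illustration of what consuming those metrics looks like, here is a minimal Python sketch that parses Prometheus-style exposition text into a lookup table. The sample payload and its label names (`Namespace`, `PodName`) are hypothetical; real agent output differs in detail.

```python
# Minimal parser for simple Prometheus exposition lines of the form:
#   metric_name{label="value",...} 42.0
# The sample below is illustrative, not verbatim agent output.

SAMPLE = """\
pod_cpu_utilization{Namespace="prod",PodName="chat-7f9c"} 80.0
pod_cpu_utilization{Namespace="prod",PodName="users-1a2b"} 3.0
pod_memory_utilization{Namespace="prod",PodName="chat-7f9c"} 20.0
pod_memory_utilization{Namespace="prod",PodName="users-1a2b"} 3.0
"""

def parse_metrics(text):
    """Parse exposition lines into {(metric_name, label_string): value}."""
    samples = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):  # skip blanks and comments
            continue
        name_labels, value = line.rsplit(" ", 1)
        name, _, labels = name_labels.partition("{")
        samples[(name, labels.rstrip("}"))] = float(value)
    return samples

metrics = parse_metrics(SAMPLE)
```

From here, correlating a `PodName` back to the EC2 instance it ran on is what lets us join these numbers to the AWS bill.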
A Kubernetes pod can also reserve memory and CPU, setting a floor on the resources it partitions off for its own usage while running, so that really a running pod is “using” the maximum of pod_cpu_reserved_capacity (if present) and pod_cpu_utilization.
If we’re reserving much more than we’re actually using, our “optimization” might be as trivial as changing that reservation number. But, even so, we’re still driving costs by demanding a ton of CPUs.
Now we have enough information to answer how much our Kubernetes service costs to operate. First, we take our AWS bill with per-instance costs for the cluster nodes.
Then, we collect Prometheus usage metrics from our Kubernetes cluster. We use Amazon’s Container Insights for this collection process, which gives us minute-by-minute data.
However, any collection process will work, so long as we get those usage/reservation metrics and a way of correlating Kubernetes pods back to EC2 instance IDs (and thus to our AWS bill). This is also available directly in Container Insights.
Now, we can get a pod’s hourly utilization as the sum of its per-collection-period totals — the max of reserved and utilized as we discussed before, and effectively zero for collection periods in which a pod didn’t run and so is absent from the metrics — divided by the number of collection periods per hour.
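That hourly calculation can be sketched in a few lines of Python, assuming minute-by-minute collection (60 periods per hour); pods absent from a period's metrics contribute zero for that period.

```python
# Hourly utilization per pod: for each collection period take the max of
# reserved and utilized, sum over the hour, divide by periods per hour.
# Absent pods simply contribute nothing for that period.

PERIODS_PER_HOUR = 60  # minute-by-minute collection

def hourly_utilization(periods):
    """periods: list of per-minute dicts {pod: (reserved, utilized)}.

    Returns {pod: average utilization fraction over the hour}."""
    totals = {}
    for period in periods:
        for pod, (reserved, utilized) in period.items():
            totals[pod] = totals.get(pod, 0.0) + max(reserved, utilized)
    return {pod: total / PERIODS_PER_HOUR for pod, total in totals.items()}

# "chat" runs all hour reserving 50% but using 80%;
# "batch" runs only the first 30 minutes at 40%.
periods = (
    [{"chat": (0.5, 0.8), "batch": (0.4, 0.4)}] * 30
    + [{"chat": (0.5, 0.8)}] * 30
)
util = hourly_utilization(periods)
```

Note how the short-lived “batch” pod ends up at half its running-time utilization, exactly because its absent minutes count as zero.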
Break out the instance’s costs into memory and CPU like before, partition those costs based on utilization, and voila! Per-pod costs!
Thus, service costs are just the sum of pod costs belonging to that service. And, identically, we can construct costs for other higher-level Kubernetes abstractions like namespace in exactly the same way, by summing over pods.
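That roll-up is a plain sum over pods. In this sketch the pod-to-service and pod-to-namespace mapping is hard-coded for illustration; in practice it would come from pod labels or owner references.

```python
# Roll per-pod costs up into service and namespace totals.
# Keys are (pod, service, namespace); costs reuse the earlier example.
from collections import defaultdict

pod_costs = {
    ("chat-7f9c", "chat", "prod"): 2.07,
    ("users-1a2b", "users", "prod"): 0.09,
    ("content-3c4d", "content", "prod"): 0.09,
}

def rollup(pod_costs, key_index):
    """Sum pod costs by one key component: 1 = service, 2 = namespace."""
    totals = defaultdict(float)
    for key, cost in pod_costs.items():
        totals[key[key_index]] += cost
    return dict(totals)

service_costs = rollup(pod_costs, 1)
namespace_costs = rollup(pod_costs, 2)
```

Any other grouping — deployment, team, environment — is the same one-line sum over a different key.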
We’ve also shared Kubernetes cost monitoring best practices to follow in order to optimize your K8s infrastructure continuously.
So that’s how we can calculate the cost of Kubernetes. Put it all into a spreadsheet and off you go — analyzing data for hours at a time until you're numb. Or is there a better way?
CloudZero offers an incredibly simple way to measure Kubernetes costs and view detailed breakdowns of actual cost by cluster, namespace, or pod — down to the hour.
More importantly for your business, you can see which product features those costs correlate to — helping you answer key questions about what it costs to build and run your products.
With this information, you can understand the cost of each individual containerized workload just like you would any other non-containerized resource — or, as is often the case, along with related non-containerized resources like storage or networking — to get a complete understanding of your software’s COGS.
You can also bring that understanding to the individual engineering teams responsible for each component of your solution so they can make better decisions to improve your company’s bottom line.
Best of all, you can do it without crazy spreadsheets or a dedicated financial analyst to help.
With CloudZero Kubernetes cost analysis, you can see exactly what your clusters cost and why. But don’t just take our word for it: schedule a demo to see how efficient analyzing and managing your Kubernetes costs can be.
This blog post was written and reviewed by the CloudZero team. Combined, our team has more than a quarter century of experience in the cloud cost space. Every blog post is extensively researched and reviewed by several members of our team for accuracy and readability.
CloudZero is the only solution that enables you to allocate 100% of your spend in hours — so you can align everyone around cost dimensions that matter to your business.