Monitoring Kubernetes costs can be challenging. Here's how to analyze your K8s costs — and how CloudZero simplifies Kubernetes cost monitoring.
While Kubernetes enables your team to deliver more value, more rapidly, cost discussions around Kubernetes — and Kubernetes cost monitoring — can be difficult.
You have disposable and replaceable compute resources constantly coming and going, on a range of types of infrastructure. Yet at the end of the month, you just get a billing line item for EKS cost and a bunch of EC2 instances.
When you try to do a Kubernetes cost analysis, the bill doesn’t have any context about the workloads being orchestrated — and it certainly doesn’t link those costs to your business contexts, like cost per customer, tenant, or team.
Let’s begin at the start.
Monitoring costs in Kubernetes involves aggregating, visualizing, and tracking the cost of resources running in Kubernetes clusters. Real-time monitoring tracks these costs continuously, with alerting tools that notify you of cost anomalies that may lead to overspending.
See, a graph can reveal something you might have missed from a command-line output or tabular view. With dashboards and visualizations, you can more easily break down Kubernetes costs by cluster, namespace, pod, deployment, container, workload, and more.
Ideally, you’ll want to check these items’ costs for specific periods of time, pinpoint the most expensive namespaces, and assess how they compare to the rest of your setup. You can then accurately share the cost insights with your team and reconcile them with your cloud bills as well.
That not only improves Kubernetes cost analysis but also provides a solid baseline for optimizing your Kubernetes resources. Essentially, you gain a greater understanding of how you can reduce costs without negatively impacting workloads, service level agreements (SLAs), and more.
So, how do you understand Kubernetes costs? Here’s how to perform a detailed Kubernetes cost analysis.
Now, if you’re running Kubernetes via EKS on AWS (which we’ll assume for the rest of this guide), your Kubernetes costs show up as the line item costs associated with a set of EC2 instances. Let’s imagine we’re building a social media site that currently consists of three distinct backend services: “chat,” “users,” and “content.”
In a conventional server-based architecture, we might run this with a load balancer in front of a distinct collection of EC2 instances running each service. So, if our “chat” service is running on three t3.medium instances, we can roll the cost of operating those into the total cost of that service.
Does that tell us everything we need to know about the cost of operating our services?
To completely understand all this, we must also distribute the costs of any external services to the parts of our business that use them.
But, at least regarding direct server costs — the piece that Kubernetes will impact — the answer is basic enough: The server cost of any given service is the cost of the servers it runs on.
Now, imagine our monthly bills are skyrocketing and we need to "debug" our costs. Consider these two scenarios:
Scenario 1: Our “chat” service is a compute monster sitting on twenty c5.18xlarge instances and still running out of CPU, but “users” and “content” are both happily plugging along with clusters of three m5.large each. Clearly, it’s “chat” that’s driving our costs and probably needs some serious rethinking.
Scenario 2: All of our services are running on a single Kubernetes cluster of twenty-one c5.18xlarge machines. The total cost of running that cluster wouldn’t by itself tell us anything about that kind of imbalance, or about which of our features might be responsible for most of our costs. It would be like looking at just the bottom line of our AWS bill, without anything broken down into individual line items, and then guessing from there.
A detailed Kubernetes cost analysis goes further.
To get closer to thinking about Kubernetes service costs, let's first reconfigure our raw server-based architecture.
What if, instead of running each service on its own separate cluster, we just had one cluster of machines, and each of them hosted some subset of our services?
In fact, let’s make it a little easier to do this on paper.
What if we had each of our instances always host a copy of each of our services? We’d have chosen a pretty silly architecture with some illogical scaling properties, true, but bear with us for a minute.
Now we still have specific, concrete costs for operating each server, but we need some intermediate model to say how much of that cost belongs to each service.
What drives the need to scale our cluster up or down? After all, that’s the clearest meaning of “driving costs” — what’s making us unable to operate our cluster using fewer, cheaper resources?
Generally, scaling is driven primarily by two things: CPU and memory. So, if “chat” is using 80% of an instance’s CPU and 20% of its memory while “users” and “content” are both using 3% of each, we can look at these numbers and distribute the total cost of the machine into four buckets: “chat,” “users,” “content,” and “unused.”
It’s still a bit tricky — we need some way of deciding how to weigh the relative cost of memory and CPU. At CloudZero, we’ve approached this by building an econometric model that estimates — all else being equal — the relative marginal costs of adding one additional vCPU and one additional GB of memory to a given instance (to be clear: this is a useful modeling conceit, not an actually available exchange).
Here’s the model we use:
Let’s say our c5.18xlarge costs $3/hour; it has 72 vCPUs and 144 GB of memory. Let’s say one additional vCPU costs 8.9x as much as one more GB of RAM. This would mean that $2.45 of our hourly cost is attributable to compute cost and $0.55 to memory. And, further, $2.07 of the $3 belongs to the “chat” service, $0.09 each to “users” and “content,” and the remaining $0.75 is unused.
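To make the arithmetic easy to check, here is a minimal Python sketch of that split. It is an illustration under the assumptions stated in this example (the $3/hour price, the 8.9x ratio, and the usage percentages from earlier), not CloudZero's actual model, and the function names are hypothetical.

```python
def split_instance_cost(hourly_cost, vcpus, memory_gb, cpu_to_gb_ratio):
    """Split an instance's hourly cost into a CPU part and a memory part.

    cpu_to_gb_ratio is the assumed relative cost of one vCPU versus one GB of RAM.
    """
    cpu_weight = vcpus * cpu_to_gb_ratio  # each vCPU counts as `ratio` GB-equivalents
    mem_weight = memory_gb
    cpu_cost = hourly_cost * cpu_weight / (cpu_weight + mem_weight)
    return cpu_cost, hourly_cost - cpu_cost


def allocate(cpu_cost, mem_cost, usage):
    """Distribute the cost into one bucket per service plus an "unused" bucket.

    usage maps service name -> (cpu_fraction, mem_fraction) of the instance.
    """
    costs = {name: cpu_cost * cpu + mem_cost * mem for name, (cpu, mem) in usage.items()}
    # Whatever no service claimed is the cost of idle capacity.
    costs["unused"] = cpu_cost + mem_cost - sum(costs.values())
    return costs


cpu_cost, mem_cost = split_instance_cost(3.0, vcpus=72, memory_gb=144, cpu_to_gb_ratio=8.9)
costs = allocate(
    cpu_cost,
    mem_cost,
    {"chat": (0.80, 0.20), "users": (0.03, 0.03), "content": (0.03, 0.03)},
)
for name, cost in costs.items():
    print(f"{name}: ${cost:.2f}")
```

Because “unused” is computed as the remainder, the buckets always add back up to the instance’s hourly cost, which is a handy sanity check when reconciling against the bill.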
Now we’re back to a model that shows what engineering-meaningful unit is driving costs — that pesky “chat” service — and with a model that we’ll be able to carry over directly to Kubernetes.
Kubernetes itself is a way of running all of these services across a single underlying cluster of servers, even if it’s a considerably smarter one.
Here, instead of just spinning up an instance of each service on each server and letting it do what it does, each of these services will be encapsulated in a logical set of pods, and then Kubernetes will do its container orchestration magic to schedule those pods onto the cluster as needed.
Exactly the same logic discussed here applies to breaking node costs out into the pods that are using them — the only Kubernetes-specific part of the procedure comes from collecting those metrics about compute and memory usage.
Really, “pod” is the only Kubernetes abstraction that we need directly because it’s the atomic unit of scheduling, and based on it we can reconstruct any higher-level abstractions we might want to use.
AWS bills EC2 instances on an hourly basis, but a variety of pods belonging to various services, namespaces, and so on could spin up and down on a given instance over the course of that hour.
Fortunately, Kubernetes exposes Prometheus-formatted usage metrics on its /metrics endpoints, and a collector such as Amazon’s Container Insights reports per-pod usage as pod_cpu_utilization and pod_memory_utilization, telling us what is going on minute to minute.
A Kubernetes pod can also reserve memory and CPU, setting a floor on the resources it partitions off for its own usage while running. So a running pod is really “using” the maximum of pod_cpu_reserved_capacity (if present) and pod_cpu_utilization, and likewise for memory.
If we’re reserving much more than we’re actually using, our “optimization” might be as trivial as changing that reservation number. But, even so, we’re still driving costs by demanding a ton of CPUs.
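The “maximum of reserved and utilized” rule can be sketched in a few lines of Python. This is only an illustration: the function name and the sample percentages are made up, and a missing reservation is represented here as None.

```python
from typing import Optional


def effective_usage(utilization: float, reserved: Optional[float] = None) -> float:
    """Return what a running pod is treated as "using" for one sample.

    Both values are percentages of the node's capacity. If the pod set no
    reservation (reserved is None), only its measured utilization counts.
    """
    if reserved is None:
        return utilization
    return max(utilization, reserved)


# A pod that reserves 50% of the node's CPU but only uses 5% still ties up 50%.
over_reserved = effective_usage(utilization=5.0, reserved=50.0)
# A pod that bursts past its reservation is charged for what it actually used.
bursting = effective_usage(utilization=70.0, reserved=50.0)
# No reservation at all: fall back to measured utilization.
unreserved = effective_usage(utilization=12.0)
print(over_reserved, bursting, unreserved)
```

The first case is the “trivial optimization” scenario: lowering the reservation changes the attributed cost without touching the workload itself.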
Now we have enough information to answer how much our Kubernetes service costs to operate. First, we take our AWS bill with per-instance costs for the cluster nodes.
Then, we collect Prometheus usage metrics from our Kubernetes cluster. We use Amazon’s Container Insights for this collection process, which gives us minute-by-minute data.
However, any collection process will work, so long as we get those usage/reservation metrics and a way of correlating Kubernetes pods back to EC2 instance IDs (and thus to our AWS bill). This is also available directly in Container Insights.
Now, we can get a pod’s hourly utilization as the sum of its per-collection-period totals — the max of reserved and utilized as we discussed before, and effectively zero for collection periods in which a pod didn’t run and so is absent from the metrics — divided by the number of collection periods per hour.
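As a rough sketch of that averaging step, assuming one sample per minute and usage expressed as a fraction of the node’s capacity (the function name and data shape are hypothetical):

```python
def hourly_utilization(samples, periods_per_hour=60):
    """Average a pod's effective usage over one hour.

    samples holds one (utilized, reserved) pair per collection period in which
    the pod actually ran; reserved may be None. Periods in which the pod was
    absent are simply not in the list, so they contribute zero to the sum.
    """
    total = 0.0
    for utilized, reserved in samples:
        total += utilized if reserved is None else max(utilized, reserved)
    return total / periods_per_hour


# A pod that ran for 30 of the hour's 60 minutes, using 40% of the node's CPU
# while reserving 50%, counts as 25% of the node's CPU for that hour.
half_hour_pod = hourly_utilization([(0.40, 0.50)] * 30)
print(half_hour_pod)
```

Dividing by the number of periods in the hour, rather than by the number of samples seen, is what makes a short-lived pod cheaper than one that ran the whole hour.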
Break out the instance’s costs into memory and CPU like before, partition those costs based on utilization, and voila! Per-pod costs!
Thus, service costs are just the sum of pod costs belonging to that service. And, identically, we can construct costs for other higher-level Kubernetes abstractions like namespace in exactly the same way, by summing over pods.
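Putting those last two steps together, here is a small Python sketch of per-pod costs and the roll-up to services and namespaces. The pod records, their field names, and the $2.45/$0.55 split reused from the earlier example are all assumptions for illustration.

```python
from collections import defaultdict


def pod_costs(cpu_cost, mem_cost, pods):
    """Partition one node-hour's CPU and memory cost across its pods.

    Each pod is a dict with hypothetical keys: name, service, namespace,
    and cpu / mem (hourly utilization as a fraction of the node).
    """
    return {p["name"]: cpu_cost * p["cpu"] + mem_cost * p["mem"] for p in pods}


def roll_up(pods, costs, key):
    """Sum pod costs by any higher-level label, e.g. "service" or "namespace"."""
    totals = defaultdict(float)
    for p in pods:
        totals[p[key]] += costs[p["name"]]
    return dict(totals)


pods = [
    {"name": "chat-1", "service": "chat", "namespace": "prod", "cpu": 0.40, "mem": 0.10},
    {"name": "chat-2", "service": "chat", "namespace": "prod", "cpu": 0.40, "mem": 0.10},
    {"name": "users-1", "service": "users", "namespace": "prod", "cpu": 0.03, "mem": 0.03},
]
costs = pod_costs(cpu_cost=2.45, mem_cost=0.55, pods=pods)
by_service = roll_up(pods, costs, "service")
by_namespace = roll_up(pods, costs, "namespace")
print(by_service)
print(by_namespace)
```

Since every higher-level view is just a different grouping of the same pod costs, the totals stay consistent no matter which abstraction you slice by.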
We’ve also shared Kubernetes cost monitoring best practices to follow in order to optimize your K8s infrastructure continuously.
So that’s how we can calculate the cost of Kubernetes. Put it all into a spreadsheet and off you go — analyzing data for hours at a time until you're numb. Or is there a better way?
CloudZero offers an incredibly simple way to measure Kubernetes costs and view detailed breakdowns of actual cost by cluster, namespace, or pod — down to the hour.
More importantly for your business, you can see which product features those costs correlate to, helping you answer key questions about what it costs to build and run your products.
For instance, you can measure:
With this information, you can understand the cost of each individual containerized workload just like you would any other non-containerized resource — or, as is often the case, along with related non-containerized resources like storage or networking — to get a complete understanding of your software’s COGS.
You can also bring that understanding to the individual engineering teams responsible for each component of your solution so they can make better decisions to improve your company’s bottom line.
Best of all, you can do it without crazy spreadsheets or a dedicated financial analyst to help.
With CloudZero Kubernetes cost analysis, you’ll be able to:
CloudZero is the only solution that enables you to allocate 100% of your spend in hours — so you can align everyone around cost dimensions that matter to your business.