While Kubernetes enables your team to deliver more value, more rapidly, cost discussions around Kubernetes — and Kubernetes cost monitoring — can be difficult.
You have disposable and replaceable compute resources constantly coming and going, on a range of types of infrastructure. Yet at the end of the month, you just get a billing line item for EKS cost and a bunch of EC2 instances.
When you try to do a Kubernetes cost analysis, the bill doesn’t have any context about the workloads being orchestrated — and it certainly doesn’t link those costs to your business contexts, like cost per customer, tenant, or team.
Let’s begin at the start.
What Is Kubernetes Cost Monitoring?
Monitoring costs in Kubernetes involves aggregating, visualizing, and tracking the cost of resources running in Kubernetes clusters. Doing this in real time lets you track those costs continuously, with alerting tools that notify you of cost anomalies that may lead to overspending.
See, a graph can reveal something you might have missed from a command-line output or tabular view. With dashboards and visualizations, you can more easily break down Kubernetes costs by cluster, namespace, pod, deployment, container, workload, and more.
Ideally, you’ll want to check these items’ costs for specific periods of time, pinpoint the most expensive namespaces, and assess how they compare to the rest of your setup. You can then accurately share the cost insights with your team and reconcile them with your cloud bills as well.
That not only improves Kubernetes cost analysis but also provides a solid baseline for optimizing your Kubernetes resources. Essentially, you gain a greater understanding of how you can reduce costs without negatively impacting workloads, service level agreements (SLAs), and more.
So, how do you understand Kubernetes costs? Here’s how to perform a detailed Kubernetes cost analysis.
How To Calculate And Monitor Kubernetes Costs
If you’re running Kubernetes via EKS on AWS (which we’ll assume for the rest of this guide), the costs you start from are the line items associated with a set of EC2 instances. Let’s imagine we’re building a social media site that currently consists of three distinct backend services:
- A “content” service, which contains the full text of users’ posts, together with things like comments and reactions.
- A “users” service, which contains lists of time-ordered references to the content each user has created or reposted.
- A “chat” service, which allows real-time text communication between specific users.
In a conventional server-based architecture, we might run this with a load balancer in front of a distinct collection of EC2 instances for each service. So, if our “chat” service is running on three t3.medium instances, we can roll the cost of operating those into the total cost of that service.
Does that tell us everything we need to know about the cost of operating our services?
- The “content” service probably hits S3 to store block data and has a database to store metadata about posts.
- Maybe it shares this database with the “users” service directly or exposes it via an internal API.
- In any case, the “chat” service also needs its own database, along with somewhere to store ephemeral data about ongoing conversations (“user xyz is typing a message …”).
To completely understand all this, we must distribute the costs of these external services to the parts of our business that use them.
But, at least regarding direct server costs — the piece that Kubernetes will impact — the answer is basic enough: The server cost of any given service is the cost of the servers it runs on.
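That bookkeeping can be sketched in a few lines. This is a minimal illustration, assuming every instance is dedicated to exactly one service; the instance IDs, the hourly prices, and the function name `server_cost_per_service` are hypothetical placeholders rather than real AWS figures.

```python
from collections import defaultdict

# Hypothetical hourly prices in USD; real numbers come from your AWS bill.
HOURLY_PRICE = {"t3.medium": 0.04, "m5.large": 0.10}

# Hypothetical inventory: (instance_id, instance_type, service it is dedicated to).
INSTANCES = [
    ("i-chat-1", "t3.medium", "chat"),
    ("i-chat-2", "t3.medium", "chat"),
    ("i-chat-3", "t3.medium", "chat"),
    ("i-users-1", "m5.large", "users"),
    ("i-content-1", "m5.large", "content"),
]


def server_cost_per_service(instances, hourly_price, hours):
    """Sum the cost of every instance into the one service it is dedicated to."""
    totals = defaultdict(float)
    for _instance_id, instance_type, service in instances:
        totals[service] += hourly_price[instance_type] * hours
    return dict(totals)


if __name__ == "__main__":
    costs = server_cost_per_service(INSTANCES, HOURLY_PRICE, hours=24)
    for service, cost in costs.items():
        print(f"{service}: ${cost:.2f}")
```

The one-instance-one-service mapping is what makes this so simple, and it is exactly the assumption that stops holding once services share a cluster.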
Kubernetes Cost Monitoring Requires A Different Approach
Now, imagine our monthly bills are skyrocketing and we need to “debug” our costs. Consider these two scenarios:
Scenario 1: Our “chat” service is a compute monster sitting on twenty c5.18xlarge instances and still running out of CPU, but “users” and “content” are both happily plugging along with clusters of three m5.large each. Clearly, it’s “chat” that’s driving our costs and probably needs some serious rethinking.
Scenario 2: All of our services were running on a single Kubernetes cluster of twenty-one c5.18xlarge machines. The total cost of running that cluster wouldn’t by itself tell us anything about that kind of imbalance, or about which of our features might be responsible for most of our costs. It would be like looking at just a bottom line in place of our entire AWS bill, without anything broken down into individual line items, and then guessing from there.
A detailed Kubernetes cost analysis goes further.
To get closer to thinking about Kubernetes service costs, let’s first reconfigure our raw server-based architecture.
What if, instead of running each service on its own separate cluster, we just had one cluster of machines, and each of them hosted some subset of our services?
In fact, let’s make it a little easier to do this on paper.
What if we had each of our instances always host a copy of each of our services? We’d have chosen a pretty silly architecture with some illogical scaling properties, true, but bear with us for a minute.
Now we still have specific, concrete costs for operating each server, but we need some intermediate model to say how much of that cost belongs to each service.
Analyze How Your Infrastructure Scales Up Or Down To Measure Kubernetes Cost
What drives the need to scale our cluster up or down? After all, that’s the clearest meaning of “driving costs” — what’s making us unable to operate our cluster using fewer, cheaper resources?
Generally, scaling is driven primarily by two things: CPU and memory. So, if “chat” is using 80% of an instance’s CPU and 20% of its memory while “users” and “content” are both using 3% of each, we can look at these numbers and distribute the total cost of the machine into four buckets: “chat,” “users,” “content,” and “unused.”
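The bucket split can be sketched as below, using the percentages from the example. The function name `utilization_buckets` and the integer-percentage representation are assumptions made for illustration.

```python
def utilization_buckets(usage_pct):
    """Add an 'unused' bucket to per-service (CPU %, memory %) usage of one instance."""
    cpu_used = sum(cpu for cpu, _mem in usage_pct.values())
    mem_used = sum(mem for _cpu, mem in usage_pct.values())
    if cpu_used > 100 or mem_used > 100:
        raise ValueError("services cannot use more than 100% of an instance")
    buckets = dict(usage_pct)
    # Whatever no service is using is tracked as its own bucket.
    buckets["unused"] = (100 - cpu_used, 100 - mem_used)
    return buckets


# Percentages from the example: (CPU %, memory %) of one instance.
USAGE = {"chat": (80, 20), "users": (3, 3), "content": (3, 3)}

if __name__ == "__main__":
    print(utilization_buckets(USAGE)["unused"])  # (14, 74)
```

Keeping "unused" as an explicit bucket means the four buckets always add up to the whole machine, so no cost silently disappears.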
It’s still a bit tricky — we need some way of deciding how to weigh the relative cost of memory and CPU. At CloudZero, we’ve approached this by building an econometric model that estimates — all else being equal — the relative marginal costs of adding one additional vCPU and one additional GB of memory to a given instance (to be clear: this is a useful modeling conceit, not an actually available exchange).
Here’s the model we use:
Let’s say our c5.18xlarge costs $3/hour; it has 72 vCPUs and 144 GB of memory. Let’s say one additional vCPU costs 8.9x as much as one more GB of RAM. The 72 vCPUs then carry a weight of 72 × 8.9 = 640.8 against 144 for memory, which means that $2.45 of our hourly cost is attributable to compute and $0.55 to memory. And, further, $2.07 of the $3 belongs to the “chat” service, $0.09 each to “users” and “content,” and the remaining $0.75 is unused.
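To make the arithmetic easy to check, here is a minimal sketch of that split. It covers only the weighting step described in this example, not the econometric model that produces the 8.9x ratio, and the function names `split_instance_cost` and `allocate` are made up for illustration.

```python
def split_instance_cost(hourly_cost, vcpus, memory_gb, vcpu_to_gb_ratio):
    """Split an instance's hourly cost into (cpu_cost, memory_cost).

    One vCPU is weighted as `vcpu_to_gb_ratio` times one GB of memory.
    """
    cpu_weight = vcpus * vcpu_to_gb_ratio
    total_weight = cpu_weight + memory_gb
    cpu_cost = hourly_cost * cpu_weight / total_weight
    return cpu_cost, hourly_cost - cpu_cost


def allocate(cpu_cost, mem_cost, usage):
    """Charge each bucket its fraction of CPU and memory cost; the rest is 'unused'."""
    costs = {
        name: cpu_frac * cpu_cost + mem_frac * mem_cost
        for name, (cpu_frac, mem_frac) in usage.items()
    }
    costs["unused"] = (cpu_cost + mem_cost) - sum(costs.values())
    return costs


# Fractions of one instance's CPU and memory used by each service.
USAGE = {"chat": (0.80, 0.20), "users": (0.03, 0.03), "content": (0.03, 0.03)}

if __name__ == "__main__":
    cpu_cost, mem_cost = split_instance_cost(3.0, vcpus=72, memory_gb=144, vcpu_to_gb_ratio=8.9)
    print(f"cpu: ${cpu_cost:.2f}, memory: ${mem_cost:.2f}")
    for name, cost in allocate(cpu_cost, mem_cost, USAGE).items():
        print(f"{name}: ${cost:.2f}")
```

Computing "unused" as the remainder rather than from its own percentages guarantees the buckets sum back to the instance's hourly cost.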
Now we’re back to a model that shows what engineering-meaningful unit is driving costs — that pesky “chat” service — and with a model that we’ll be able to carry over directly to Kubernetes.
Kubernetes itself is a way of running all of these services across a single underlying cluster of servers, even if it’s a considerably smarter one.
Here, instead of just spinning up an instance of each service on each server and letting it do what it does, each of these services will be encapsulated in a logical set of pods, and then Kubernetes will do its container orchestration magic to schedule those pods onto the cluster as needed.
Exactly the same logic discussed here applies to breaking node costs out into the pods that are using them — the only Kubernetes-specific part of the procedure comes from collecting those metrics about compute and memory usage.
Really, “pod” is the only Kubernetes abstraction that we need directly because it’s the atomic unit of scheduling, and based on it we can reconstruct any higher-level abstractions we might want to use.
AWS bills EC2 instances on an hourly basis, but a variety of pods belonging to various services, namespaces, and so on could spin up and down on a given instance over the course of that hour.
Fortunately, Kubernetes exposes Prometheus-formatted usage metrics on its /metrics endpoints, and we can use two pod-level metrics, pod_cpu_utilization and pod_memory_utilization (the names Amazon’s Container Insights reports them under), to tell us what is going on minute to minute.
A Kubernetes pod can also reserve memory and CPU, setting a floor on the resources it partitions off for its own use while running. So a running pod is really “using” the maximum of pod_cpu_reserved_capacity (if present) and pod_cpu_utilization, and the same logic applies to memory.
If we’re reserving much more than we’re actually using, our “optimization” might be as trivial as changing that reservation number. But, even so, we’re still driving costs by demanding a ton of CPUs.
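The "maximum of reserved and utilized" rule is small enough to write down directly. This is a sketch under the assumption that both values are expressed in the same unit (say, percent of the node); `effective_usage` is a hypothetical helper name.

```python
from typing import Optional


def effective_usage(utilization: float, reserved: Optional[float] = None) -> float:
    """Return what a running pod is charged for: max(reserved, utilized).

    `reserved` is None when the pod declares no reservation.
    """
    if reserved is None:
        return utilization
    return max(utilization, reserved)


if __name__ == "__main__":
    # A pod that uses 5% of the node's CPU but reserves 40% is charged for 40%.
    print(effective_usage(5.0, reserved=40.0))
    # Without a reservation, only actual utilization counts.
    print(effective_usage(5.0))
```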
How To Understand Kubernetes Cost Per Pod
Now we have enough information to answer how much our Kubernetes service costs to operate. First, we take our AWS bill with per-instance costs for the cluster nodes.
Then, we collect Prometheus usage metrics from our Kubernetes cluster. We use Amazon’s Container Insights for this collection process, which gives us minute-by-minute data.
However, any collection process will work, so long as we get those usage/reservation metrics and a way of correlating Kubernetes pods back to EC2 instance IDs (and thus to our AWS bill). This is also available directly in Container Insights.
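Whatever the collection process, the join back to the bill looks roughly like this. The record layout (`pod`, `instance_id`, `cpu`) and the function name `attach_node_cost` are hypothetical; they are not the Container Insights schema.

```python
def attach_node_cost(pod_samples, instance_costs):
    """Group pod samples by EC2 instance ID and pair each group with that instance's billed cost."""
    by_instance = {}
    for sample in pod_samples:
        by_instance.setdefault(sample["instance_id"], []).append(sample)
    # Every instance seen in the metrics must have a matching line on the bill.
    missing = set(by_instance) - set(instance_costs)
    if missing:
        raise KeyError(f"no bill line for instances: {sorted(missing)}")
    return {iid: (instance_costs[iid], samples) for iid, samples in by_instance.items()}


# Hypothetical data: hourly cost per instance (USD) and one metrics sample per pod.
INSTANCE_COSTS = {"i-0aaa": 3.0, "i-0bbb": 3.0}
POD_SAMPLES = [
    {"pod": "chat-7d9f", "instance_id": "i-0aaa", "cpu": 80.0},
    {"pod": "users-5b2e", "instance_id": "i-0aaa", "cpu": 3.0},
    {"pod": "content-91c0", "instance_id": "i-0bbb", "cpu": 3.0},
]

if __name__ == "__main__":
    for iid, (cost, samples) in attach_node_cost(POD_SAMPLES, INSTANCE_COSTS).items():
        print(iid, cost, [s["pod"] for s in samples])
```

Failing loudly on an unmatched instance ID is a deliberate choice here: a pod whose node is missing from the bill would otherwise drop out of the cost totals without anyone noticing.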
Now, we can get a pod’s hourly utilization as the sum of its per-collection-period values divided by the number of collection periods per hour. Each value is the max of reserved and utilized, as we discussed before, and is effectively zero for collection periods in which the pod didn’t run and so is absent from the metrics.
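As a sketch of that averaging step, assuming minute-by-minute collection (60 periods per hour) and samples given as (utilized, reserved) pairs; `hourly_usage` is a hypothetical name.

```python
def hourly_usage(samples, periods_per_hour=60):
    """Average a pod's usage over one hour.

    `samples` holds one (utilized, reserved_or_None) pair per collection period
    in which the pod ran; periods with no sample contribute zero.
    """
    if len(samples) > periods_per_hour:
        raise ValueError("more samples than collection periods in the hour")
    total = sum(
        utilized if reserved is None else max(utilized, reserved)
        for utilized, reserved in samples
    )
    # Dividing by the full hour, not by len(samples), is what makes absent periods count as zero.
    return total / periods_per_hour


if __name__ == "__main__":
    # A pod that ran for 30 of 60 minutes, using 40% CPU with 50% reserved.
    print(hourly_usage([(40.0, 50.0)] * 30))  # 25.0
```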
Break out the instance’s costs into memory and CPU like before, partition those costs based on utilization, and voila! Per-pod costs!
Thus, service costs are just the sum of pod costs belonging to that service. And, identically, we can construct costs for other higher-level Kubernetes abstractions like namespace in exactly the same way, by summing over pods.
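The roll-up itself is a plain group-and-sum. In this sketch the pod names, labels, and costs are hypothetical, and `roll_up` is an illustrative helper rather than part of any tool mentioned here.

```python
from collections import defaultdict


def roll_up(pod_costs, key):
    """Sum per-pod costs by a higher-level label such as 'service' or 'namespace'."""
    totals = defaultdict(float)
    for pod in pod_costs:
        totals[pod[key]] += pod["cost"]
    return dict(totals)


# Hypothetical per-pod hourly costs (USD) with the labels we want to group by.
POD_COSTS = [
    {"pod": "chat-7d9f", "service": "chat", "namespace": "prod", "cost": 1.25},
    {"pod": "chat-a41c", "service": "chat", "namespace": "prod", "cost": 0.75},
    {"pod": "users-5b2e", "service": "users", "namespace": "prod", "cost": 0.25},
    {"pod": "users-test", "service": "users", "namespace": "staging", "cost": 0.5},
]

if __name__ == "__main__":
    print(roll_up(POD_COSTS, "service"))
    print(roll_up(POD_COSTS, "namespace"))
```

Because every grouping is built from the same per-pod numbers, the service view and the namespace view always reconcile to the same total.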
We’ve also shared Kubernetes cost monitoring best practices to follow in order to optimize your K8s infrastructure continuously.
So that’s how we can calculate the cost of Kubernetes. Put it all into a spreadsheet and off you go — analyzing data for hours at a time until you’re numb. Or is there a better way?
A Better Way To Handle Kubernetes Cost Monitoring And Management
CloudZero offers an incredibly simple way to measure Kubernetes costs and view detailed breakdowns of actual cost by cluster, namespace, or pod — down to the hour.
More importantly for your business, you can see which product features those costs correlate to, helping you answer key questions about what it costs to build and run your products.
For instance, you can measure:
- Cost per feature
- Cost per team or business unit
- Cost per project
- Cost per customer or tenant
With this information, you can understand the cost of each individual containerized workload just like you would any other non-containerized resource — or, as is often the case, along with related non-containerized resources like storage or networking — to get a complete understanding of your software’s COGS.
You can also bring that understanding to the individual engineering teams responsible for each component of your solution so they can make better decisions to improve your company’s bottom line.
Best of all, you can do it without crazy spreadsheets or a dedicated financial analyst to help.
With CloudZero Kubernetes cost analysis, you’ll be able to:
- See a detailed view of costs to run your Kubernetes infrastructure, including trends and patterns for forecasting and allocations
- Seamlessly measure COGS across your containerized and non-containerized environments within a single platform
- Track idle costs to optimize your Kubernetes infrastructure
- Accurately allocate Kubernetes costs by customer/tenant, project, microservices, dev team, product feature, and more
- Use real-time anomaly detection to detect cost anomalies before they become surprise costs