Shipped: LiteLLM is probably under-counting your Claude spend

By Scott Castle // Chief Product Officer at CloudZero

If you run Claude through LiteLLM, some of that spend is probably going uncounted – and you can’t see the gap, precisely because the data isn’t there. Routing through a gateway is messier than it looks: LiteLLM alone can carry Claude more than one way – through the OpenAI-compatible endpoint, or through the Anthropic pass-through proxy that the native SDK and Claude Code use – and each path describes the same call differently. Different spans, different fields, a provider that sometimes isn’t labeled at all. Anything reading that telemetry gets a clean, fully attributed call on one path and a near-blank one on the next. Same model, same spend; the only difference is a routing choice your engineers made for reasons that had nothing to do with cost.

Why this matters

That inconsistency tends to hit your biggest traffic. The pass-through proxy is the default path for Anthropic’s own SDK and Claude Code, so the calls most likely to come through ambiguous are often your highest-volume Claude calls. You shouldn’t have to standardize how every team routes Claude just to see what you’re spending on it. Routing is an engineering decision; allocation shouldn’t depend on it.

What we built

So CloudZero reads all of it. However Claude moves through your LiteLLM gateway, we find the model, tokens, and cost wherever that path puts them, and infer the provider when the gateway leaves it off – including the ragged cases where a given LiteLLM version doesn’t emit what you’d expect. Reconciling what each route reports is the work, and it happens on our side, not yours.
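To make the idea concrete, here is a minimal sketch of that kind of reconciliation in Python. It is not CloudZero's implementation. The record shape, the candidate key lists, and the functions first_present, infer_provider, and normalize are all hypothetical, chosen only to illustrate looking for the same facts in several places and inferring the provider from the model name when the label is missing. The one grounded detail is that OpenAI-style usage reports prompt_tokens and completion_tokens while Anthropic-style usage reports input_tokens and output_tokens.

```python
from typing import Any, Optional

# Hypothetical key names: the real telemetry fields vary by route and LiteLLM version.
MODEL_KEYS = ("model", "request_model")
PROVIDER_KEYS = ("provider", "llm_provider")
INPUT_TOKEN_KEYS = ("input_tokens", "prompt_tokens")        # Anthropic-style, OpenAI-style
OUTPUT_TOKEN_KEYS = ("output_tokens", "completion_tokens")  # Anthropic-style, OpenAI-style
COST_KEYS = ("cost", "response_cost")


def first_present(record: dict[str, Any], keys: tuple[str, ...]) -> Optional[Any]:
    """Return the first non-None value found under any of the candidate keys."""
    for key in keys:
        value = record.get(key)
        if value is not None:
            return value
    return None


def infer_provider(model: Optional[str]) -> Optional[str]:
    """Guess the provider from the model name when the gateway did not label it."""
    if not model:
        return None
    name = model.lower()
    if "/" in name:
        # Assumes a "provider/model" naming style, e.g. "anthropic/<model>".
        return name.split("/", 1)[0]
    if name.startswith("claude"):
        return "anthropic"
    return None


def normalize(record: dict[str, Any]) -> dict[str, Any]:
    """Map a usage record from any routing path onto one common shape."""
    model = first_present(record, MODEL_KEYS)
    provider = first_present(record, PROVIDER_KEYS) or infer_provider(model)
    return {
        "provider": provider,
        "model": model,
        "input_tokens": first_present(record, INPUT_TOKEN_KEYS),
        "output_tokens": first_present(record, OUTPUT_TOKEN_KEYS),
        "cost": first_present(record, COST_KEYS),
    }
```

In this sketch, a record that arrives with no provider label but a model name starting with "claude" still comes out attributed to anthropic, and a record with nothing usable comes out with None fields rather than a guess.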

How to use it

There’s nothing to configure. If you’re already streaming AI usage from your LiteLLM gateway, every routing path is now parsed automatically. Route Claude however suits each team – native Anthropic SDK, Claude Code, or the OpenAI-compatible endpoint – and the usage shows up with model, tokens, and cost attached. From there it flows into the same Dimensions you use for everything else, so you can allocate Claude spend by team, product, or the feature it’s powering, right next to your cloud and SaaS spend.

New to this? Create a LiteLLM connection in CloudZero, drop the Connection API key into your proxy, and you’re streaming.

Author Spotlight

Scott Castle

Scott Castle is the Chief Product Officer at CloudZero and a product and GTM executive with over two decades of experience scaling AI/ML and analytics platforms. Before CloudZero, Scott held senior product and strategy roles at Tecton, Sisense, Periscope Data, and Adobe. Scott focuses on helping engineering and finance teams turn cloud and AI cost visibility into measurable business outcomes.