Talk

On-demand

Virtual

The hidden cost layer: How Platform Engineers are taking ownership of AI token spend

AI spend is everywhere, but most teams are only watching half of it. This talk explores the emerging token cost layer inside developer tooling and what platform engineers can do to bring it under control.

Jun 23, 2026

15 mins

Cloud costs used to live in infrastructure. Then AI workloads moved in. Now the same teams managing GPU clusters and Bedrock spend are also responsible for the token consumption happening inside every developer's IDE, on every laptop, across every coding agent session.

Most teams are flying blind on this layer. No visibility into who is consuming what, on which model, for what purpose. No way to attribute costs by team, project, or task. And no governance over whether engineers are using the right model for the job, or burning expensive tokens on work that gets thrown away.

The reason it's hard to fix is structural. Coding agents generate tool calls constantly, and those tool calls return far more data than the model actually needs. A git status command returns pages of output when the agent only needs one line. A grep returns thousands of matches, most of which might be irrelevant, when a few would do. That bloat enters the context window and gets billed as tokens. Addressing it requires a lightweight agent running at the endpoint, using the hook system built into every major coding agent, to compress output before it reaches the model and to govern other aspects of the agent’s operation.
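As a rough illustration of the compression idea, here is a minimal Python sketch. It assumes the simplest possible strategy (keep the first few lines of a tool's output and replace the rest with a one-line note) and a crude four-characters-per-token estimate. The function names `compress_tool_output` and `estimate_tokens` are hypothetical, and how such a function would be wired into a given coding agent's hook system is not shown, since hook interfaces differ by agent.

```python
def compress_tool_output(output: str, max_lines: int = 20) -> str:
    """Keep the first max_lines lines and replace the rest with a one-line note."""
    lines = output.splitlines()
    if len(lines) <= max_lines:
        # Short output passes through unchanged.
        return output
    omitted = len(lines) - max_lines
    kept = lines[:max_lines]
    kept.append(f"[{omitted} more lines omitted]")
    return "\n".join(kept)


def estimate_tokens(text: str) -> int:
    """Very rough estimate: about four characters per token (an assumption)."""
    return len(text) // 4


if __name__ == "__main__":
    # Simulated grep output with 1,000 matches (made-up data for illustration).
    grep_output = "\n".join(
        f"src/file_{i}.py:12: TODO refactor" for i in range(1000)
    )
    compressed = compress_tool_output(grep_output, max_lines=5)
    print(estimate_tokens(grep_output), "->", estimate_tokens(compressed))
```

A real endpoint agent would likely use tool-aware summaries rather than plain truncation, but the shape is the same: the output is reduced before it enters the context window.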

This talk walks through what that approach looks like across three areas. The first is cost savings, where early results show 10 to 20 percent lower token spend with no change to developer workflow. The second is agent performance, where a smaller context means faster responses and, in some cases, more accurate outputs. The third is governance: full visibility into model usage, token allocation by team and project, and policy controls ranging from hard limits to human-in-the-loop nudges give platform teams actual control over a layer that has been unmanaged until now.
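To make the governance piece concrete, the following Python sketch shows one way token usage might be attributed by team and checked against a budget. Everything here is an assumption for illustration: the `UsageEvent` record, the `tokens_by_team` and `check_policy` helpers, and the 80 percent nudge threshold are hypothetical and do not describe any specific product.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class UsageEvent:
    """One hypothetical record of token consumption from a coding agent session."""
    team: str
    project: str
    model: str
    tokens: int


def tokens_by_team(events: Iterable[UsageEvent]) -> Dict[str, int]:
    """Sum token usage per team."""
    totals: Dict[str, int] = defaultdict(int)
    for event in events:
        totals[event.team] += event.tokens
    return dict(totals)


def check_policy(used: int, hard_limit: int, nudge_ratio: float = 0.8) -> str:
    """Return 'block' at or above the hard limit, 'nudge' near it, else 'allow'."""
    if used >= hard_limit:
        return "block"
    if used >= nudge_ratio * hard_limit:
        # A nudge would prompt a human to confirm rather than stop the work.
        return "nudge"
    return "allow"


if __name__ == "__main__":
    events = [
        UsageEvent("payments", "api", "model-a", 1000),
        UsageEvent("payments", "web", "model-b", 500),
        UsageEvent("search", "index", "model-a", 200),
    ]
    for team, used in tokens_by_team(events).items():
        print(team, used, check_policy(used, hard_limit=1800))
```

The same grouping could be keyed by project or model instead of team; the point is that attribution and policy both start from per-session usage records.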

The token cost problem is already landing on platform engineering's plate. This is a practical look at what you can do about it.

Register for PlatformCon 2026