What an honest cloud bill looks like
Most teams who tell us their cloud bill is "out of control" haven't actually sat with it. They've seen the monthly total climb, asked finance for a breakdown, received a tag-grouped CSV, glanced at it, and given up.
That's not a billing problem. It's a half-hour with the right person standing in front of the right view.
The bill we walked into
A Swiss mid-market client — one platform, one cloud, three product teams. Monthly spend had grown 38 percent over twelve months while traffic grew about 9 percent. The CFO had stopped reviewing the line items in March.
Here's the shape of the spend, before we touched anything.
| Category | Monthly cost | Share | What was actually happening |
|---|---|---|---|
| Compute (always-on) | CHF 41,200 | 34% | Right-sized to peak, not to average. No autoscaling. |
| Object storage | CHF 28,900 | 24% | Three years of build artefacts on standard tier. |
| NAT egress | CHF 18,400 | 15% | Cross-AZ chatter from a misplaced cache. |
| Managed database | CHF 14,100 | 12% | Multi-AZ on three non-production environments. |
| Observability | CHF 9,700 | 8% | Per-host pricing on a fleet that had quietly tripled. |
| Other | CHF 8,300 | 7% | Long tail. |
| Total | CHF 120,600 | 100% |
Four lines explain three quarters of the bill. Each one is a decision someone made, then forgot to revisit.
The screenshot moment
Sitting with the engineering lead and pulling up the cost explorer, we found this within twenty minutes:
This isn't a tooling gap. It's a habit gap.
The four moves
We picked four moves and sized each one before touching anything. These weren't clever — they were just decisions that had been deferred.
Compute: right-size to average, not to peak
Compute was provisioned for peak load with no autoscaling. We turned on horizontal autoscaling, added a small reserved-instance commitment for the always-on baseline, and left a buffer on top.
# Before: a single fleet sized for peak
# After: a baseline group + autoscaled overflow
terraform apply -target=module.compute_baseline # 30% of peak, reserved
terraform apply -target=module.compute_overflow # autoscaled, on-demandRoughly 28 percent off the compute line, with the same headroom on a Friday afternoon. No code changes.
Storage: lifecycle the artefacts
Three years of build artefacts on standard storage was the easy one. Anything over ninety days went to a cooler tier; anything over a year went to archive. The lifecycle policy is six lines.
A bucket policy nobody wrote is a bucket policy that defaults to expensive forever. Storage tiering is the smallest amount of engineering for the largest amount of recovered budget.
Saved roughly CHF 19,000 a month. The work took an afternoon.
Network: move the cache
The NAT egress was almost entirely one cache that had been deployed in the wrong availability zone. Each cross-AZ hit was billed. Move the cache, save the line.
Observability: tier the agents
Per-host pricing on every node — including ephemeral build runners — was the kind of thing nobody notices until somebody runs the bill against the fleet inventory. We tiered: full observability on production, minimal on staging, nothing on ephemeral runners.
Where it landed
| Category | Before | After | Change |
|---|---|---|---|
| Compute (always-on) | CHF 41,200 | CHF 29,700 | −28% |
| Object storage | CHF 28,900 | CHF 9,800 | −66% |
| NAT egress | CHF 18,400 | CHF 3,100 | −83% |
| Managed database | CHF 14,100 | CHF 9,200 | −35% |
| Observability | CHF 9,700 | CHF 4,800 | −51% |
| Other | CHF 8,300 | CHF 7,900 | −5% |
| Total | CHF 120,600 | CHF 64,500 | −47% |
A 47 percent reduction. Six weeks of focused work, no architectural rewrite, no headcount change.
The pattern under it
The bill wasn't out of control. It was unattended.
The same pattern shows up almost every time we open a client's cost explorer for the first time. Defaults compound. Decisions accumulate. Nobody owns the bill until somebody sits down with it.
If your monthly cloud spend has grown noticeably faster than your traffic, the next move probably isn't a re-architecture. It's an afternoon, the cost explorer, and a person who's allowed to make changes.
That's usually where we start too.