CostRoot — Find fixable waste in your Amazon Bedrock spend

The problem

Your bill shows what you spent. It doesn't show what was wasted.

Cost Explorer and CUR tell you which team or model spent the money. They can't tell you that a stable prompt prefix is being re-sent thousands of times with caching off, that a nightly job is paying on-demand rates for work a batch queue would halve, or that retries are quietly double-charging you.

Agentic Bedrock workloads make this worse: fan-out, retry loops, and redundant context are invisible in an aggregate bill and tedious to find by hand in raw invocation logs.

CostRoot reads those logs the way an engineer would — and shows you the specific, fixable patterns, ranked, each with the exact lever.

What stays hidden in the bill

Duplicate & retry calls: Re-runs and retry storms charged in full.
Caching gaps: Prefixes re-sent with caching off, or caching on but expiring.
Batch opportunities: Async, scheduled work paying real-time rates.
Attribution gaps: Spend you can't trace to a team, task, or owner.

What CostRoot detects

Four detectors. Each finds a pattern, names the lever.

Read-only, from your invocation logs. Findings are surfaced for your review — nothing is changed in your account, and no dollar figure is claimed until you confirm you own the fix.

Duplicate & retry

Duplicate and retry calls

Sessionizes traffic per principal and flags repeated and retried invocations — retry loops and redundant context that re-pay for the same work. Findings are evidence-backed and ranked.
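
As a rough illustration of the idea, the sketch below flags a call as a repeat when the same principal sends an identical prompt hash again within a short window. The `Call` fields, the `find_duplicates` helper, and the 60-second window are assumptions made for this example, not CostRoot's actual implementation.

```python
from typing import NamedTuple

class Call(NamedTuple):
    principal: str     # identity that made the invocation (assumed field)
    ts: float          # invocation time in seconds
    prompt_hash: str   # hash of the request body, never the raw text
    input_tokens: int

def find_duplicates(calls, window_s=60.0):
    """Return calls that repeat an identical prompt from the same principal
    within window_s seconds of the previous identical call."""
    last_seen = {}  # (principal, prompt_hash) -> time of the latest call
    repeats = []
    for call in sorted(calls, key=lambda c: c.ts):
        key = (call.principal, call.prompt_hash)
        prev = last_seen.get(key)
        if prev is not None and call.ts - prev <= window_s:
            repeats.append(call)
        last_seen[key] = call.ts
    return repeats

# Hypothetical traffic for illustration only.
calls = [
    Call("role/agent", 0.0, "abc", 1200),
    Call("role/agent", 2.0, "abc", 1200),    # retry of the first call
    Call("role/agent", 500.0, "abc", 1200),  # same prompt, outside the window
    Call("role/other", 1.0, "abc", 1200),    # different principal
]
repeats = find_duplicates(calls)
repeated_tokens = sum(c.input_tokens for c in repeats)
```

Keying on principal plus hash keeps two teams that happen to send the same prompt from being counted as one retry loop.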

Batch 50%

Batch-eligible traffic on on-demand

Spots scheduled, async-shaped workloads paying full on-demand rates where Batch inference applies a flat 50% cut — no model or prompt changes required.
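
The arithmetic behind a batch finding is simple enough to sketch. The 50% figure comes from the description above; the `batch_opportunity` helper, the input numbers, and the idea of an "eligible share" are assumptions for illustration, and the discount should be checked against current pricing for your model and region.

```python
BATCH_DISCOUNT = 0.50  # from the 50% figure above; verify per model and region

def batch_opportunity(on_demand_cost: float, eligible_share: float) -> float:
    """Estimate the spend avoided if the eligible share of on-demand
    traffic were moved to batch inference."""
    if not 0.0 <= eligible_share <= 1.0:
        raise ValueError("eligible_share must be between 0 and 1")
    return on_demand_cost * eligible_share * BATCH_DISCOUNT

# Hypothetical nightly job: 1,000 units of monthly on-demand spend,
# of which 80% is async-shaped and could wait for a batch result.
opportunity = batch_opportunity(1000.0, 0.8)
```

The result is an opportunity, not a saving, until someone who owns the job confirms it can tolerate batch turnaround.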

Caching ~90%

Caching left on the table

Finds large stable prefixes re-sent with cacheRead=0. On subagents Bedrock disables caching by default; cache reads run roughly 90% cheaper than re-sending the prefix.
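
A minimal sketch of this check, assuming each log record has already been reduced to a prefix hash, a prefix token count, and a cache-read token count. The record keys, the `uncached_prefixes` helper, and both thresholds are hypothetical placeholders, not real log field names or Bedrock limits.

```python
from collections import defaultdict

def uncached_prefixes(records, min_calls=3, min_prefix_tokens=1024):
    """Group calls by prefix hash and flag large prefixes that were
    re-sent repeatedly without ever being read from cache."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["prefix_hash"]].append(rec)

    findings = []
    for prefix_hash, recs in groups.items():
        prefix_tokens = recs[0]["prefix_tokens"]
        never_cached = all(r["cache_read_tokens"] == 0 for r in recs)
        if len(recs) >= min_calls and prefix_tokens >= min_prefix_tokens and never_cached:
            findings.append({
                "prefix_hash": prefix_hash,
                "calls": len(recs),
                # Tokens re-sent after the first call, which a cache could have served.
                "resent_tokens": prefix_tokens * (len(recs) - 1),
            })
    # Largest opportunity first.
    return sorted(findings, key=lambda f: f["resent_tokens"], reverse=True)

# Hypothetical records for illustration only.
records = (
    [{"prefix_hash": "p1", "prefix_tokens": 2000, "cache_read_tokens": 0}] * 4
    + [{"prefix_hash": "p2", "prefix_tokens": 2000, "cache_read_tokens": 2000}] * 3
)
findings = uncached_prefixes(records)
```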

Cache TTL · the nuance

Caching enabled, but expiring

The one most teams miss: caching is on, yet a stable prefix is being re-written every time the workload idles past the TTL — paying the write premium on cold re-caches. We catch the cache churn, not just the cache absence.
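
Cache churn can be approximated from call timing alone. The sketch below counts calls on one prefix that arrive after the cache would have expired, assuming a five-minute TTL that resets on every call. The `count_cold_rewrites` helper and the TTL value are assumptions for illustration; check the TTL that applies to your model.

```python
def count_cold_rewrites(timestamps, ttl_s=300.0):
    """Count calls that arrive more than ttl_s seconds after the previous
    call on the same prefix. Each such call is assumed to find the cache
    expired and to pay for a fresh cache write."""
    ordered = sorted(timestamps)
    return sum(1 for prev, cur in zip(ordered, ordered[1:]) if cur - prev > ttl_s)

# Hypothetical call times (seconds) for one stable prefix.
call_times = [0, 60, 120, 900, 960, 2000]
cold = count_cold_rewrites(call_times)
```

In this example two of the five follow-up calls land after an idle gap longer than the TTL, so the prefix would be written to cache three times in total rather than once.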

Attribution

Attribution gaps

Where logs collapse spend to a single gateway or shared role, we mark what can't be traced to a team or task — so you know exactly where visibility ends, rather than trusting a number that quietly isn't there.
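
One way to put a number on where visibility ends is the share of tokens logged under identities known to be shared. This is a sketch under stated assumptions: the `attribution_gap` helper, the `(principal, tokens)` input shape, and the role names are all hypothetical.

```python
def attribution_gap(calls, shared_principals):
    """Return the fraction of tokens logged under shared principals,
    i.e. usage that cannot be traced to a specific team or task."""
    total = sum(tokens for _, tokens in calls)
    if total == 0:
        return 0.0
    shared = sum(tokens for principal, tokens in calls if principal in shared_principals)
    return shared / total

# Hypothetical usage: most traffic flows through one gateway role.
calls = [("role/gateway", 9000), ("role/search-team", 1000)]
gap = attribution_gap(calls, {"role/gateway"})
```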

Privacy-first by construction

Designed read-only first, because that's the only honest way to audit AI spend.

Read-only cross-account role, scoped by ExternalId: A CloudFormation stack you launch creates a role that can read logs and metrics and nothing else. The ExternalId is unique to you, which is the standard defense against the confused-deputy problem.
Prompt bodies hashed, never persisted: Bodies are parsed in place; detectors operate on hashes and token counts. Raw prompt text doesn't need to leave your AWS account, and we never store it.
Every access we make is audited: Each AssumeRole is recorded and queryable by you, giving you a complete log of exactly what we read, and when.
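
To make the hashing point concrete, here is a minimal sketch of reducing a log record to a hash and a token count before any detector sees it. The `reduce_record` helper, the output keys, and the choice of SHA-256 are assumptions for illustration, not a description of CostRoot's internals.

```python
import hashlib

def reduce_record(body: str, input_tokens: int) -> dict:
    """Replace a prompt body with a fixed-length digest so that later
    steps can compare prompts for equality without seeing the text."""
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return {"prompt_hash": digest, "input_tokens": input_tokens}

# Hypothetical bodies: identical text yields identical hashes.
a = reduce_record("You are a helpful assistant. Summarize: ...", 1200)
b = reduce_record("You are a helpful assistant. Summarize: ...", 1200)
c = reduce_record("A different prompt", 40)
```

Equal hashes are enough to spot repeats and stable prefixes, which is why the detectors never need the raw text.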

Zero write permissions requested.
Remediation ships as a diff you review and apply. We never act on your account.

How the scan works

Three steps. Nothing touches your request path.

Launch the read-only stack

One CloudFormation deploy creates a read-only role in your account, locked to an ExternalId. It can read logs and metrics. It can write nothing.
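
For readers who want to see what "locked to an ExternalId" means in IAM terms, the sketch below builds the general shape of such a trust policy. It uses the standard `sts:ExternalId` condition key; the `trust_policy` helper, the account ID, and the ExternalId value are placeholders, and the actual stack may differ.

```python
import json

def trust_policy(scanner_account_id: str, external_id: str) -> dict:
    """Build a trust policy that lets one external account assume the
    role only when it presents the agreed ExternalId."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{scanner_account_id}:root"},
            "Action": "sts:AssumeRole",
            "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
        }],
    }

# Placeholder values for illustration only.
policy = trust_policy("111122223333", "example-external-id")
policy_json = json.dumps(policy, indent=2)
```

The trust policy only controls who may assume the role. What the role can read is governed separately by its permissions policy, which is where the read-only scope would be enforced.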

We scan your invocation logs

Logs are parsed for the four patterns above. Prompt bodies are hashed, not stored; your latency, reliability, and architecture stay exactly as they are.

You get a ranked report

A report of the fixable patterns we found, each with the lever. You confirm which fixes you own — only then does a finding become an actionable number.

Who it's for

Teams running agentic Bedrock workloads.

The design-partner program is for engineering and FinOps teams who run real Bedrock traffic and want a clear-eyed read on where it's wasteful — before committing to anything.

Agent workloads on Lambda or ECS

Multi-step agents, tool-calling, and retrieval pipelines where fan-out and retries add up.

Material, ongoing Bedrock spend

Enough steady traffic that a few percent of waste is worth an engineer's afternoon to fix.

A security review you take seriously

Teams who'd rather grant a read-only, auditable role than route production traffic through a proxy.

Apply

Apply for a free Bedrock cost-leak scan.

We're onboarding a small set of design partners. Tell us about your workload; if it's a fit, we'll run a read-only scan and walk you through what we find — no obligation, no sales pressure.

The scan is read-only and the findings are yours to keep, whether or not we work together.

FAQ

Questions a careful team asks first.

Do you need access to our prompts?

No. Prompt bodies are hashed and reduced to token counts where the logs are parsed; raw text doesn't need to leave your AWS account, and we never persist it.

Can CostRoot change anything in our account?

No. The role is read-only with zero write permissions. Every fix is a recommendation you review and apply yourself.

Is this a proxy or an SDK?

Neither. CostRoot reads invocation logs out of band. There's nothing in your request path, so your latency and reliability are untouched.

What does the scan actually return?

A ranked list of the fixable patterns we found, each with the specific lever — a cache point, a batch migration, a route change — and the evidence behind it.

Why is there no headline savings number?

Because an honest one only exists after you confirm you own each fix lever. From logs alone, a finding is a recoverable opportunity, not a guaranteed dollar amount.

What happens after the scan?

We walk you through the findings. If they're useful and the fit is right, we can talk about an ongoing watch — but the scan stands on its own, no commitment required.