Synthetic example Based on realistic Bedrock usage patterns — not a real customer. Dollar figures are computed by the engine against the current, verified Bedrock price table; the workload itself is generated.
Bedrock Savings Verdict · 8,700 calls · 6-day window · read-only scan
$3,289/mo you can fix

Spend you can fix right now, with the exact change for each. Leaks are detected read-only from your invocation logs; the figure above counts only fixes you've confirmed your team owns.

Read the number this way — Customer-actionable assumes you've confirmed you own each fix lever (the owner-mapping step). From logs alone, every finding starts at $0 actionable until you confirm ownership.

$3,289 recoverable of ~$4,909/mo observed Bedrock inference · $3,289 customer-actionable · 67%
Observed spend
$4,909
Inference run-rate seen in your logs, per month.
Recoverable
$3,289
What a correct, feasible fix saves — before ownership.
You can fix
$3,289
Recoverable on levers you've confirmed your team owns. The headline — nothing else is claimed.
Ranked leaks · each with its fix
#01
$1,519/mo recoverable

Caching left on the table

PROMPT CACHING RAG service
anthropic.claude-sonnet-4-5 · 4,200 calls re-send ~28,400 prefix tokens · cacheRead=0

Enable prompt caching: add a cachePoint after the system/tools block. On subagents, Bedrock disables sub-agent caching by default — turn it on explicitly. Cache reads run ~90% cheaper than re-sending the prefix every call.

#02
$1,116/mo recoverable

Cache expiring between calls

CACHE TTL support-copilot
anthropic.claude-sonnet-4-5 · 899 cold re-writes of a ~72,000-token cached prefix · expired past the 300s TTL

Extend the cache TTL 5m → 1h: a stable prefix is being re-cached — paying the write premium — every time the workload idles past the current 5-minute window. Set a longer cache_control TTL on the stable system/tools block where the model supports it, or keep sessions warm. Net of the longer-TTL write premium.

#03
$458/mo recoverable

Batch-eligible traffic on on-demand

BATCH 50% nightly job
anthropic.claude-sonnet-4 · 2,000 calls · role identifies as a job, regular cadence (cv=0.00) · looks async

Move to Batch inference: submit JSONL to S3 for a flat 50% cut. Confirm this nightly job tolerates ~24h turnaround before switching — it already runs on a fixed schedule.

#04
$197/mo recoverable

Premium model doing cheap work

ROUTE / DOWNSHIFT Opus extractor
anthropic.claude-opus-4-5 · 1,600 calls · median output 80 tok · routing/extraction-shaped

Route or downshift the Opus extractor: enable Intelligent Prompt Routing (~30%, zero code change) as the safe first step. Bigger wins — downshift to Haiku/Nova, or distill (~75%) for this fixed extraction task.

The honest counterpoint: run this same scan against an already-tuned workload and Verdict reports ~$0 you can fix — caching on, no churn, right-sized models. We only ever headline spend you can actually recover.
read-only · prompt bodies hashed (no PII stored) · prices verified against current Bedrock rates before quoting · synthetic sample, not a real customer
Back to CostRoot Apply for a free Bedrock cost-leak scan