Cost Calculator

See what open models save you

Compare closed-source API spend against open models served per-token on Fireworks AI.

What does your workload look like?
Average tokens per request

Enter spend and tokens to estimate monthly request volume

Projected savings

Enter workload information to calculate your savings.

Savings are estimated from input/output token usage and an approximate cache hit rate. See the full math below.

Results are general estimates for internal discussion only. Fireworks model prices are from the published serverless per-token pricing (Standard serving path). Closed-source prices are representative examples. Fireworks does not guarantee any particular cost savings.

How the math works

No black box. Here is exactly how the calculator turns your monthly spend into a projected savings figure on Fireworks.

Step 1

Blended cost per request

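Each model is billed separately for input, cached input, and output tokens, with prices quoted per 1M tokens. The cache hit rate comes from your workload type. In the formula, in and out are the input and output tokens per request, cache is the cache hit rate as a fraction between 0 and 1, and each price is in dollars per 1M tokens.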

cost/req = (in × (1 − cache) × inPrice + in × cache × cachedPrice + out × outPrice) ÷ 1,000,000
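The blended-cost formula can be sketched in Python as follows. This is a minimal illustration and not the calculator's actual source: the function name, parameter names, and the example prices are assumptions chosen only to make the arithmetic easy to check.

```python
def cost_per_request(in_tokens, out_tokens, cache_hit_rate,
                     in_price, cached_price, out_price):
    """Blended dollar cost of one request. Prices are dollars per 1M tokens."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be a fraction between 0 and 1")
    # Input tokens that miss the cache pay the full input price.
    uncached_input = in_tokens * (1 - cache_hit_rate) * in_price
    # Input tokens that hit the cache pay the cached-input price.
    cached_input = in_tokens * cache_hit_rate * cached_price
    # Output tokens are never cached.
    output = out_tokens * out_price
    return (uncached_input + cached_input + output) / 1_000_000


# Hypothetical workload and prices (not taken from any published price list).
example = cost_per_request(
    in_tokens=1_000, out_tokens=500, cache_hit_rate=0.5,
    in_price=3.00, cached_price=0.30, out_price=15.00,
)
print(f"${example:.5f} per request")
```

With these made-up numbers the three terms are 1,500, 150, and 7,500, so the request costs 9,150 ÷ 1,000,000, or about $0.00915.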
Step 2

Implied monthly volume

We back out how many requests your current spend buys at the closed model's cost per request.

requests/mo = monthlySpend ÷ closedCostPerReq
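As a rough sketch of this step, assuming the closed model's cost per request is already known, the division looks like this in Python. The function name and the example figures are hypothetical.

```python
def implied_monthly_requests(monthly_spend, closed_cost_per_req):
    """Requests per month that the current spend buys on the closed model."""
    if closed_cost_per_req <= 0:
        raise ValueError("closed_cost_per_req must be positive")
    return monthly_spend / closed_cost_per_req


# Hypothetical inputs: $10,000 per month at $0.01 per request.
requests_per_month = implied_monthly_requests(
    monthly_spend=10_000.0, closed_cost_per_req=0.01
)
print(f"{requests_per_month:,.0f} requests/mo")
```

The guard against a zero or negative cost per request is an addition for safety in the sketch; the page does not say how the calculator handles that case.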
Step 3

Fireworks cost & savings

Hold that volume fixed, re-price it on the Fireworks model, and take the difference.

fwSpend = requests × fwCostPerReq
savings = monthlySpend − fwSpend
percent = savings ÷ monthlySpend × 100
annual = savings × 12
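The re-pricing step can be written as a short Python sketch. The `Projection` container, the function name, and the example numbers are assumptions for illustration; only the four formulas come from the page.

```python
from dataclasses import dataclass


@dataclass
class Projection:
    fw_spend: float   # projected monthly spend on Fireworks
    savings: float    # monthly dollars saved (negative if Fireworks costs more)
    percent: float    # savings as a percentage of current spend
    annual: float     # monthly savings extrapolated over 12 months


def project_savings(monthly_spend, requests, fw_cost_per_req):
    """Hold request volume fixed and re-price it at the Fireworks cost per request."""
    if monthly_spend <= 0:
        raise ValueError("monthly_spend must be positive")
    fw_spend = requests * fw_cost_per_req
    savings = monthly_spend - fw_spend
    percent = savings / monthly_spend * 100
    annual = savings * 12
    return Projection(fw_spend, savings, percent, annual)


# Hypothetical inputs: $10,000/mo today, 1M requests, $0.004 per request on Fireworks.
projection = project_savings(
    monthly_spend=10_000.0, requests=1_000_000, fw_cost_per_req=0.004
)
print(projection)
```

With these made-up inputs the projected spend is $4,000, so savings are $6,000 per month (60%), or $72,000 per year. Note that the result can be negative when the Fireworks cost per request exceeds the closed model's.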
Assumption | Value | Why
Cache hit rate: Chat & assistants | 50% | Long shared system prompt / context reused across turns
Cache hit rate: Document processing | 10% | Mostly unique input per request, little reuse
Cache hit rate: Agentic workflows | 80% | Many chained calls share a growing, stable prefix
Fireworks pricing | Standard path | Published per-token serverless pricing (input / cached / output per 1M)
Models without a cached rate | cached = input | No prompt-cache discount assumed for that model
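These assumptions could be encoded as a small lookup plus a fallback rule, as in the Python sketch below. The dictionary keys and function name are hypothetical; the three rates and the "cached = input" rule are the ones listed above.

```python
# Cache hit rates by workload type, as fractions. Key names are illustrative.
CACHE_HIT_RATE = {
    "chat_assistants": 0.50,
    "document_processing": 0.10,
    "agentic_workflows": 0.80,
}


def effective_cached_price(in_price, cached_price=None):
    """Fall back to the full input price when a model has no cached rate."""
    return in_price if cached_price is None else cached_price


# Hypothetical model with a $2.00 per 1M input price and no cached rate.
rate = CACHE_HIT_RATE["agentic_workflows"]
cached = effective_cached_price(in_price=2.00)
# Blended input price per 1M tokens: with cached = input, the hit rate has no effect.
blended_input_price = (1 - rate) * 2.00 + rate * cached
print(blended_input_price)
```

When the cached price equals the input price, the two input terms of the Step 1 formula add up to the plain input price, so the cache hit rate no longer changes the result for that model.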

Try your model on Fireworks