LLM API Cost Calculator: Estimate GPT, Claude & Gemini Token Costs (2026)

Every major AI provider bills the same way: per token. If you can count tokens, you can predict your bill before you ever hit the API. This guide shows the simple formula, the current 2026 prices for the big models, and worked examples so you can estimate any workload.

Calculate as you go: the homepage token counter includes a live cost estimator — enter a price per million tokens and see the cost update instantly. Open it →

The cost formula

API pricing has two parts, billed separately:

Input tokens — everything you send (your prompt, system message, and any context).
Output tokens — everything the model generates back.

The math is just:

cost = (input_tokens / 1,000,000 × input_price) + (output_tokens / 1,000,000 × output_price)

Prices are almost always quoted per million tokens. Output tokens usually cost several times more than input tokens (three to eight times more for the models in the table below), so the length of the reply matters a lot.
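The formula above translates directly into a few lines of Python. This is a minimal sketch: the function name estimate_cost and its parameter names are illustrative and not part of any provider SDK, and the prices in the example call are arbitrary.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price: float, output_price: float) -> float:
    """Return the estimated cost in dollars for one API call.

    Both prices are in dollars per million tokens.
    """
    input_cost = input_tokens / 1_000_000 * input_price
    output_cost = output_tokens / 1_000_000 * output_price
    return input_cost + output_cost


# Example: 1,000 input tokens and 500 output tokens at $3.00 / $15.00 per million.
print(f"${estimate_cost(1_000, 500, 3.00, 15.00):.4f}")  # $0.0105
```

Keeping input and output as separate arguments mirrors how the bill is split, which makes it easy to see which side dominates.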

2026 pricing for popular models

Approximate list prices per million tokens, as of mid-2026. Always confirm on the provider's official pricing page before relying on these for budgeting.

Model	Input / 1M	Output / 1M
GPT-5.5 (OpenAI)	$5.00	$30.00
GPT-5.4 (OpenAI)	$2.50	$15.00
Claude Opus 4.8 (Anthropic)	$5.00	$25.00
Claude Sonnet 4.6	$3.00	$15.00
Claude Haiku 4.5	$1.00	$5.00
Gemini 3.5 Flash (Google)	$1.50	$9.00
Gemini 2.5 Pro	$1.25	$10.00
Llama 4 Maverick (Meta)	$0.20	$0.60

Prices change frequently and vary by region, batch mode, and caching. Treat this table as a reference snapshot, not a quote.
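To compare models for one workload, the table above can be loaded into a dictionary and run through the cost formula. The sketch below copies the snapshot prices as listed; the workload of 2,000 input tokens and 500 output tokens is a made-up example, and the names PRICES and cost_per_model are illustrative.

```python
# Snapshot of the table above: (input $/1M, output $/1M). Confirm before budgeting.
PRICES = {
    "GPT-5.5": (5.00, 30.00),
    "GPT-5.4": (2.50, 15.00),
    "Claude Opus 4.8": (5.00, 25.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Claude Haiku 4.5": (1.00, 5.00),
    "Gemini 3.5 Flash": (1.50, 9.00),
    "Gemini 2.5 Pro": (1.25, 10.00),
    "Llama 4 Maverick": (0.20, 0.60),
}


def cost_per_model(input_tokens, output_tokens, prices=PRICES):
    """Return {model: cost in dollars} for one request, cheapest first."""
    costs = {
        model: (input_tokens * inp + output_tokens * out) / 1_000_000
        for model, (inp, out) in prices.items()
    }
    return dict(sorted(costs.items(), key=lambda item: item[1]))


# Hypothetical workload: 2,000 input tokens, 500 output tokens per request.
for model, cost in cost_per_model(2_000, 500).items():
    print(f"{model:<20} ${cost:.5f}")
```

Price is only one axis. The cheapest model on this list is not necessarily adequate for a given task, so treat the ranking as an input to the decision rather than the decision itself.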

Cost scenarios to estimate before launch

A good LLM API cost calculator should model your real workload, not just one sample prompt. Start with these scenarios:

Scenario	What to count	Main cost driver
Customer support bot	System prompt, retrieved docs, user question, answer.	Repeated context plus output length.
Document summarizer	Full document chunk plus summary output.	Large input tokens.
Code review assistant	Diff, files, instructions, review comments.	Code tokenization and long outputs.
Data extraction job	Source text or JSON plus structured result.	Batch volume and repeated schemas.
Agent workflow	Tool descriptions, memory, plan, observations, final answer.	Hidden context and multiple model calls.
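For multi-call scenarios such as the agent workflow, the unit to price is the whole run, not a single request. The sketch below sums tokens across the calls in one run. The Call class, the workflow_cost function, and every token count are hypothetical placeholders; the prices are the GPT-5.4 figures from the table above.

```python
from dataclasses import dataclass


@dataclass
class Call:
    label: str
    input_tokens: int
    output_tokens: int


def workflow_cost(calls, input_price, output_price):
    """Sum the cost in dollars of every model call in one workflow run."""
    total_in = sum(c.input_tokens for c in calls)
    total_out = sum(c.output_tokens for c in calls)
    return (total_in * input_price + total_out * output_price) / 1_000_000


# Hypothetical token counts for one agent run; measure your own.
# Input grows from call to call on the assumption that earlier
# observations are sent again as context.
agent_run = [
    Call("plan", 1_500, 200),
    Call("tool call 1", 2_000, 150),
    Call("tool call 2", 3_000, 150),
    Call("final answer", 3_500, 500),
]

print(f"${workflow_cost(agent_run, 2.50, 15.00):.4f} per run")  # $0.0400 per run
```

In this made-up run the final answer is only 500 tokens, yet the run is billed for 10,000 input tokens, which is the kind of hidden context the table refers to.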

Worked example: a chatbot reply

Say a user sends a 300-word question and your system prompt adds 200 words of context. At roughly 1.33 tokens per English word, those 500 words come to about 665 input tokens. The model replies with 400 words, or about 533 output tokens. Using GPT-5.4 ($2.50 input / $15.00 output):

Input: 665 / 1,000,000 × $2.50 = $0.00166
Output: 533 / 1,000,000 × $15.00 = $0.00800
Total per reply ≈ $0.0097

Tiny — until you multiply. At 10,000 replies a day, that is about $97/day or roughly $2,900/month. This is why estimating tokens before launch matters.
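The same arithmetic can be checked in a few lines of Python. This sketch uses the token counts and GPT-5.4 prices from the example and assumes a 30-day month; the variable names are illustrative.

```python
INPUT_PRICE, OUTPUT_PRICE = 2.50, 15.00   # GPT-5.4, dollars per million tokens
input_tokens, output_tokens = 665, 533    # token counts from the example
replies_per_day = 10_000

per_reply = (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000
per_day = per_reply * replies_per_day
per_month = per_day * 30  # assumes a 30-day month

print(f"per reply: ${per_reply:.4f}")   # per reply: $0.0097
print(f"per day:   ${per_day:.0f}")     # per day:   $97
print(f"per month: ${per_month:.0f}")   # per month: $2897
```

Carrying the unrounded per-reply cost through gives about $2,897 a month, in line with the rounded figure of roughly $2,900.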

Why output length is the biggest lever

Because output is the pricier side, asking for shorter answers is often the most effective cost control whenever replies make up most of the bill. Telling the model "answer in 2 sentences" instead of letting it ramble can cut output tokens, and with them the output portion of your bill, by half or more. We list more tactics in 10 ways to reduce token usage.
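To see how much of a bill is output, it helps to compute the output share directly. The sketch below reuses the chatbot example (665 input tokens, 533 output tokens, GPT-5.4 prices) and compares it with a reply cut to about half the length. The function name reply_cost is illustrative.

```python
def reply_cost(input_tokens, output_tokens, input_price=2.50, output_price=15.00):
    """Cost in dollars of one reply; defaults are the GPT-5.4 prices from the table."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


baseline = reply_cost(665, 533)
shorter = reply_cost(665, 533 // 2)  # same prompt, reply cut to about half

output_share = (533 * 15.00 / 1_000_000) / baseline
saving = 1 - shorter / baseline

print(f"output share of baseline cost: {output_share:.0%}")        # 83%
print(f"total saving from halving the reply: {saving:.0%}")       # 41%
```

Halving the reply halves only the output portion, so the total drops by about 41% rather than 50%. For input-heavy workloads such as document summarization the output share is smaller, and trimming the prompt may matter more.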

How to estimate before you build

Write a realistic sample prompt and a realistic sample reply.
Count the tokens for each with a token counter.
Plug them into the formula above with your chosen model's prices.
Multiply by your expected daily/monthly request volume.
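The four steps above can be sketched end to end. The token count here is a crude stand-in, words multiplied by 1.33, the rough ratio implied by the worked example; a real token counter for your chosen model will give different numbers. The function names, sample texts, prices, and request volume are all illustrative assumptions.

```python
def rough_token_count(text: str, tokens_per_word: float = 1.33) -> int:
    """Very rough stand-in for a real token counter: words times a fixed ratio."""
    return round(len(text.split()) * tokens_per_word)


def monthly_estimate(sample_prompt, sample_reply, input_price, output_price,
                     requests_per_day, days=30):
    """Count tokens, apply the cost formula, then scale by request volume."""
    tokens_in = rough_token_count(sample_prompt)
    tokens_out = rough_token_count(sample_reply)
    per_request = (tokens_in * input_price + tokens_out * output_price) / 1_000_000
    return per_request * requests_per_day * days


# Hypothetical sample prompt and reply, priced at $1.00 / $5.00 per million tokens.
prompt = "Summarize the refund policy for a customer who bought last week."
reply = "Purchases can be refunded within 30 days with a receipt."
print(f"${monthly_estimate(prompt, reply, 1.00, 5.00, 5_000):.2f} per month")
```

Swapping rough_token_count for the output of a real token counter is the one change that most improves the estimate, since token-per-word ratios vary by model, language, and content type.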

Not sure how many tokens your sample text is? Start with how many tokens is my text.

Estimate your cost now: paste a sample prompt into TokenCounter.cc, set your price per million tokens, and read the cost instantly. Open the calculator →

Official pricing sources

Provider prices change, and discounts for caching, batch jobs, and regional billing can change the final number. Before making budget decisions, compare your calculator result against the official pages: OpenAI pricing, Anthropic pricing, and Google Gemini pricing.

For the best estimate, count one realistic input, estimate a realistic output, then multiply by requests per day. If your app uses long reusable context, also check whether prompt caching or batch pricing applies.
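Caching and batch discounts can be folded into the same formula as multipliers. The sketch below is a simplification under stated assumptions: the function name, the parameter names, and the discount values are placeholders rather than any provider's actual rates, and it ignores details such as cache-write surcharges or minimum cacheable lengths that some providers apply.

```python
def discounted_cost(input_tokens, output_tokens, input_price, output_price,
                    cached_fraction=0.0, cached_price_multiplier=1.0,
                    batch_multiplier=1.0):
    """Estimate cost in dollars when part of the input is billed at a cached rate.

    The three discount parameters are placeholders: look up the real
    values on your provider's pricing page.
    """
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    input_cost = (fresh + cached * cached_price_multiplier) * input_price
    output_cost = output_tokens * output_price
    return (input_cost + output_cost) * batch_multiplier / 1_000_000


# Hypothetical: 80% of a 5,000-token prompt is reusable context billed at
# one tenth of the normal input price (an assumed rate, not a quoted one).
full = discounted_cost(5_000, 500, 3.00, 15.00)
cached = discounted_cost(5_000, 500, 3.00, 15.00,
                         cached_fraction=0.8, cached_price_multiplier=0.1)
print(f"no caching: ${full:.4f}, with caching: ${cached:.4f}")
```

With these assumed numbers the per-request cost falls from about $0.0225 to about $0.0117, which shows why long reusable context is worth checking against the official caching terms.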

Frequently asked questions

How do I calculate the cost of an API call?

Divide your input tokens by 1,000,000 and multiply by the input price per million tokens. Do the same for output tokens with the output price, then add the two.

Why are output tokens more expensive than input tokens?

Generating text is more compute-intensive than reading it, so providers price output higher than input: three to eight times higher for the models in the table above.

What does 1 million tokens cost?

It depends on the model. In 2026, input ranges from about $0.10 (Gemini 2.5 Flash-Lite) to $10 and up (Claude Fable 5, GPT-5.5 Pro) per million tokens, with output typically several times higher.

Is this calculator exact?

It gives a close estimate. Exact billing depends on the model's real tokenizer and any caching or batch discounts, but the estimate is close enough for early budgeting.

Token Counter Team

Maintainers of TokenCounter.cc, a free token estimation tool. The team writes about LLM tokenization, prompt efficiency, and AI API costs.