How much does a token cost?

Token costs vary by model and feature mode. The cheapest standard input rate tracked is $0.05 per million tokens (GPT-5 Nano), which works out to $0.00000005 per token. Premium reasoning models reach $150 per million input tokens (o1-pro), and special modes can change the rate further. Use the calculator to estimate your specific usage.

LLM Cost Calculator

Free token cost calculator for the major LLM APIs tracked by ModelPricing.ai. Compare pricing across OpenAI, Anthropic, and Google models — or browse the full pricing comparison.

How Token Pricing Works

LLM APIs charge per token — a unit of text roughly equal to 4 characters or 0.75 words. Every API call has two cost components: input tokens (your prompt) and output tokens (the model's response). Output tokens usually cost more than input tokens, and cache, batch, priority, fast-mode, or data-residency settings can move the final number.
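The two cost components above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual implementation: the function names `estimate_tokens` and `request_cost` are hypothetical, the token estimate relies only on the 4-characters-per-token rule of thumb (a real tokenizer will give different counts), and the rates are the gpt-5-nano row from the comparison table.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token count from the ~4 characters per token rule of thumb."""
    return max(1, round(len(text) / chars_per_token))


def request_cost(input_tokens: int, output_tokens: int,
                 input_rate: float, output_rate: float) -> float:
    """Cost in USD for one call; rates are quoted in USD per 1M tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# gpt-5-nano standard rates from the table: $0.05 input, $0.40 output per 1M.
prompt = "Classify the sentiment of this review: great battery, weak camera."
expected_output_tokens = 50  # assumed response length, for illustration only

cost = request_cost(estimate_tokens(prompt), expected_output_tokens, 0.05, 0.40)
print(f"Estimated cost: ${cost:.8f}")
```

Cache, batch, priority, and similar settings are deliberately left out here; they would adjust the rates passed to `request_cost`.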

LLM Pricing Comparison Table

Model	Input $/1M tokens	Output $/1M tokens	Type	Notes
claude-3-7-sonnet	$3.00	$15.00	Flat
claude-fable-5	$10.00	$50.00	Flat
claude-haiku-3	$0.250	$1.25	Flat
claude-haiku-3-5	$0.800	$4.00	Flat
claude-haiku-4-5	$1.00	$5.00	Flat
claude-mythos-5	$10.00	$50.00	Flat
claude-opus-3	$15.00	$75.00	Flat
claude-opus-4-0	$15.00	$75.00	Flat
claude-opus-4-1	$15.00	$75.00	Flat
claude-opus-4-5	$5.00	$25.00	Flat
claude-opus-4-6	$5.00	$25.00	Flat
claude-opus-4-7	$5.00	$25.00	Flat
claude-opus-4-8	$5.00	$25.00	Flat
claude-sonnet-4-0	$3.00	$15.00	Flat
claude-sonnet-4-5	$3.00	$15.00	Flat
claude-sonnet-4-6	$3.00	$15.00	Flat
gemini-2.0-flash	$0.150	$0.600	Flat
gemini-2.0-flash-lite	$0.075	$0.300	Flat
gemini-2.5-computer-use	$1.25 / $2.50	$10.00 / $15.00	Breakpoint	Threshold: 200K tokens
gemini-2.5-flash	$0.300	$2.50	Flat
gemini-2.5-flash-image	$0.300	$2.50	Multimodal	text, image
gemini-2.5-flash-lite	$0.100	$0.400	Flat
gemini-2.5-flash-native-audio	$0.500	$2.00	Multimodal	text, audio
gemini-2.5-flash-preview-tts	$0.500	$10.00	Flat
gemini-2.5-pro	$1.25 / $2.50	$10.00 / $15.00	Breakpoint	Threshold: 200K tokens
gemini-2.5-pro-preview-tts	$1.00	$20.00	Flat
gemini-3-flash	$0.500	$3.00	Flat
gemini-3-pro-image-preview	$2.00	$12.00	Multimodal	text, image
gemini-3-pro-preview	$2.00 / $4.00	$12.00 / $12.00	Breakpoint	Threshold: 200K tokens
gemini-3.1-flash-image-preview	$0.500	$3.00	Multimodal	text, image
gemini-3.1-flash-lite-preview	$0.250	$1.50	Flat
gemini-3.1-pro-preview	$2.00 / $4.00	$12.00 / $18.00	Breakpoint	Threshold: 200K tokens
gemini-3.5-flash	$1.50	$9.00	Flat
gpt-4.1	$2.00	$8.00	Flat
gpt-4.1-mini	$0.400	$1.60	Flat
gpt-4.1-nano	$0.100	$0.400	Flat
gpt-4o	$2.50	$10.00	Flat
gpt-4o-mini	$0.150	$0.600	Flat
gpt-5	$1.25	$10.00	Flat
gpt-5-codex	$1.25	$10.00	Flat
gpt-5-mini	$0.250	$2.00	Flat
gpt-5-nano	$0.050	$0.400	Flat
gpt-5-pro	$15.00	$120.00	Flat
gpt-5.1	$1.25	$10.00	Flat
gpt-5.1-codex	$1.25	$10.00	Flat
gpt-5.1-codex-max	$1.25	$10.00	Flat
gpt-5.2	$1.75	$14.00	Flat
gpt-5.2-codex	$1.75	$14.00	Flat
gpt-5.2-pro	$21.00	$168.00	Flat
gpt-5.3-codex	$1.75	$14.00	Flat
gpt-5.4	$2.50 / $5.00	$15.00 / $22.50	Breakpoint	Threshold: 272K tokens
gpt-5.4-mini	$0.750	$4.50	Flat
gpt-5.4-nano	$0.200	$1.25	Flat
gpt-5.4-pro	$30.00 / $60.00	$180.00 / $270.00	Breakpoint	Threshold: 272K tokens
gpt-5.5	$5.00 / $10.00	$30.00 / $45.00	Breakpoint	Threshold: 272K tokens
gpt-5.5-pro	$30.00	$180.00	Flat
o1	$15.00	$60.00	Flat
o1-mini	$1.10	$4.40	Flat
o1-pro	$150.00	$600.00	Flat
o3	$2.00	$8.00	Flat
o3-deep-research	$10.00	$40.00	Flat
o3-mini	$1.10	$4.40	Flat
o3-pro	$20.00	$80.00	Flat
o4-mini	$1.10	$4.40	Flat
o4-mini-deep-research	$2.00	$8.00	Flat

How LLM Pricing Works

LLM APIs charge based on tokens, units of text that correspond to roughly 4 characters or 0.75 words in English. Pricing is split between input tokens (your prompt) and output tokens (the model's response), with output typically costing more.

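Some models use breakpoint pricing, where rates increase above a certain context length. In the comparison table, these rows list the base rate first and the rate above the threshold second (for example, gemini-2.5-pro input is $1.25 up to 200K tokens and $2.50 beyond). Multimodal models may also have different rates for text, image, and audio modalities. The calculator uses the standard rate table so you can get a clean baseline before applying provider-specific extras.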
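Breakpoint pricing can be sketched as follows. The rates are the gemini-2.5-pro row from the table; everything else is an assumption made for illustration. In particular, this sketch assumes that once the prompt exceeds the threshold, the higher rates apply to the whole request (input and output) rather than only to the tokens past the threshold. Providers differ on this, so check the provider's own pricing documentation before relying on it. `BreakpointRates` and `breakpoint_cost` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BreakpointRates:
    threshold: int       # input-token count above which the high rates apply
    input_low: float     # USD per 1M input tokens at or below the threshold
    input_high: float    # USD per 1M input tokens above the threshold
    output_low: float    # USD per 1M output tokens at or below the threshold
    output_high: float   # USD per 1M output tokens above the threshold


def breakpoint_cost(rates: BreakpointRates,
                    input_tokens: int, output_tokens: int) -> float:
    """Cost in USD, assuming the tier is chosen by prompt size and then
    applied to the entire request (an assumption, not a provider guarantee)."""
    over = input_tokens > rates.threshold
    in_rate = rates.input_high if over else rates.input_low
    out_rate = rates.output_high if over else rates.output_low
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


# gemini-2.5-pro: $1.25 / $2.50 input, $10.00 / $15.00 output, 200K threshold.
GEMINI_25_PRO = BreakpointRates(200_000, 1.25, 2.50, 10.00, 15.00)

print(f"{breakpoint_cost(GEMINI_25_PRO, 100_000, 1_000):.3f}")  # 0.135
print(f"{breakpoint_cost(GEMINI_25_PRO, 300_000, 1_000):.3f}")  # 0.765
```

Under these assumptions, tripling the prompt from 100K to 300K tokens raises the cost by more than five times, because the request crosses the threshold.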

Tips to Reduce LLM Costs

Right-size your model: Use smaller models (Haiku, GPT-5 Nano) for simple tasks like classification or extraction.
Minimize prompt length: Remove unnecessary context and examples from system prompts.
Cache responses: Store and reuse results for identical or similar queries.
Use model routing: Route simple queries to cheap models and only escalate to expensive models when needed.
Monitor usage: Track costs per endpoint and model to identify optimization opportunities.
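To get a feel for what model routing can save, the sketch below compares sending every request to a larger model against routing a share of simple requests to a cheaper one. The rates are the gpt-5-nano and gpt-5 rows from the table; the traffic volume, token counts, and the 80% share of simple queries are made-up inputs, and `routed_cost` is a hypothetical helper. It also ignores the cost of whatever classifier decides the route.

```python
# (input, output) rates in USD per 1M tokens, taken from the comparison table.
RATES = {
    "gpt-5-nano": (0.05, 0.40),
    "gpt-5": (1.25, 10.00),
}


def per_request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single request at standard rates."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


def routed_cost(n_requests: int, simple_share: float,
                input_tokens: int, output_tokens: int,
                cheap: str = "gpt-5-nano", strong: str = "gpt-5") -> float:
    """Total cost when `simple_share` of requests go to the cheap model
    and the remainder go to the strong model."""
    cheap_cost = per_request_cost(cheap, input_tokens, output_tokens)
    strong_cost = per_request_cost(strong, input_tokens, output_tokens)
    return n_requests * (simple_share * cheap_cost
                         + (1 - simple_share) * strong_cost)


# Assumed workload: 10,000 requests of 1,000 input and 200 output tokens each.
all_strong = routed_cost(10_000, 0.0, 1_000, 200)
routed = routed_cost(10_000, 0.8, 1_000, 200)
print(f"All gpt-5: ${all_strong:.2f}")
print(f"80% routed to gpt-5-nano: ${routed:.2f}")
```

With these assumed inputs, the bill drops from about $32.50 to about $7.54. The saving depends entirely on how many requests the cheaper model can actually handle well.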

Frequently Asked Questions

How do LLM APIs charge for usage?

LLM APIs charge per token, with separate rates for input (prompt) tokens and output (completion) tokens. Prices are typically quoted per million tokens.

What is the cheapest LLM API?

The cheapest standard-rate models currently tracked are GPT-5 Nano ($0.05/1M input) and Gemini 2.0 Flash Lite ($0.075/1M input), followed by GPT-4.1 Nano and Gemini 2.5 Flash Lite at $0.10/1M input.

How can I reduce LLM API costs?

Key strategies include: choosing the right model size for your task, minimizing prompt length, caching frequent responses, batching requests, and using cheaper models for routing/classification before calling expensive models.
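As one illustration of caching frequent responses, here is a minimal in-memory cache keyed on the model name and a normalized prompt. It is a sketch under stated assumptions, not a production design: `ResponseCache` and `fake_model` are hypothetical, the stand-in `fake_model` replaces a real API call, and the normalization (collapsing whitespace and lowercasing) only catches near-identical prompts. Lowercasing may be wrong for case-sensitive tasks, and matching genuinely similar queries would need a different technique.

```python
import hashlib
from typing import Callable, Dict


class ResponseCache:
    """Exact-match cache: repeated prompts are answered without a new API call."""

    def __init__(self, call_model: Callable[[str, str], str]) -> None:
        self._call_model = call_model
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # Collapse whitespace and lowercase so trivially different prompts match.
        normalized = " ".join(prompt.split()).lower()
        return hashlib.sha256(f"{model}\n{normalized}".encode("utf-8")).hexdigest()

    def complete(self, model: str, prompt: str) -> str:
        key = self._key(model, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        response = self._call_model(model, prompt)  # the only billable step
        self._store[key] = response
        return response


def fake_model(model: str, prompt: str) -> str:
    """Stand-in for a real API client, used only to make the sketch runnable."""
    return f"[{model}] echo: {prompt}"


cache = ResponseCache(fake_model)
cache.complete("gpt-5-nano", "What is a token?")
cache.complete("gpt-5-nano", "what is a  token?")  # served from the cache
print(f"hits={cache.hits} misses={cache.misses}")
```

Every cache hit avoids both the input and the output token charge for that request, so the hit rate translates directly into savings.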

Automate Your Cost Estimation

Get programmatic access to tracked model pricing with our API. Or explore our LLM pricing comparison to find the best model for your budget.
