LLM Cost Calculator
Free token cost calculator for the major LLM APIs tracked by ModelPricing.ai. Compare pricing across OpenAI, Anthropic, and Google models — or browse the full pricing comparison.
How Token Pricing Works
LLM APIs charge per token, a unit of text roughly equal to 4 characters or 0.75 words of English. Every API call has two cost components: input tokens (your prompt) and output tokens (the model's response). Output tokens usually cost more than input tokens, and cache, batch, priority, fast-mode, or data-residency settings can raise or lower the final price.
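As a rough sketch of the arithmetic above, the cost of one call is the input token count times the input rate plus the output token count times the output rate, with both rates quoted per million tokens. The function names here are illustrative and not part of any provider SDK, and the 4-characters-per-token rule is only an approximation for English text. The example rates are the gpt-4o row from the comparison table.

```python
import math


def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token rule of thumb."""
    return math.ceil(len(text) / 4)


def call_cost(input_tokens: int, output_tokens: int,
              input_rate: float, output_rate: float) -> float:
    """Cost in USD for one call; both rates are USD per 1M tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# gpt-4o standard rates from the table: $2.50 input, $10.00 output per 1M tokens.
cost = call_cost(input_tokens=2_000, output_tokens=500,
                 input_rate=2.50, output_rate=10.00)
print(f"${cost:.4f}")  # prints $0.0100
```

For real billing, use the token counts the provider returns with each response rather than a character-based estimate.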
LLM Pricing Comparison Table
| Model | Input $/1M tokens | Output $/1M tokens | Type | Notes |
|---|---|---|---|---|
| claude-3-7-sonnet | $3.00 | $15.00 | Flat | |
| claude-fable-5 | $10.00 | $50.00 | Flat | |
| claude-haiku-3 | $0.250 | $1.25 | Flat | |
| claude-haiku-3-5 | $0.800 | $4.00 | Flat | |
| claude-haiku-4-5 | $1.00 | $5.00 | Flat | |
| claude-mythos-5 | $10.00 | $50.00 | Flat | |
| claude-opus-3 | $15.00 | $75.00 | Flat | |
| claude-opus-4-0 | $15.00 | $75.00 | Flat | |
| claude-opus-4-1 | $15.00 | $75.00 | Flat | |
| claude-opus-4-5 | $5.00 | $25.00 | Flat | |
| claude-opus-4-6 | $5.00 | $25.00 | Flat | |
| claude-opus-4-7 | $5.00 | $25.00 | Flat | |
| claude-opus-4-8 | $5.00 | $25.00 | Flat | |
| claude-sonnet-4-0 | $3.00 | $15.00 | Flat | |
| claude-sonnet-4-5 | $3.00 | $15.00 | Flat | |
| claude-sonnet-4-6 | $3.00 | $15.00 | Flat | |
| gemini-2.0-flash | $0.150 | $0.600 | Flat | |
| gemini-2.0-flash-lite | $0.075 | $0.300 | Flat | |
| gemini-2.5-computer-use | $1.25 / $2.50 | $10.00 / $15.00 | Breakpoint | Threshold: 200K tokens |
| gemini-2.5-flash | $0.300 | $2.50 | Flat | |
| gemini-2.5-flash-image | $0.300 | $2.50 | Multimodal | text, image |
| gemini-2.5-flash-lite | $0.100 | $0.400 | Flat | |
| gemini-2.5-flash-native-audio | $0.500 | $2.00 | Multimodal | text, audio |
| gemini-2.5-flash-preview-tts | $0.500 | $10.00 | Flat | |
| gemini-2.5-pro | $1.25 / $2.50 | $10.00 / $15.00 | Breakpoint | Threshold: 200K tokens |
| gemini-2.5-pro-preview-tts | $1.00 | $20.00 | Flat | |
| gemini-3-flash | $0.500 | $3.00 | Flat | |
| gemini-3-pro-image-preview | $2.00 | $12.00 | Multimodal | text, image |
| gemini-3-pro-preview | $2.00 / $4.00 | $12.00 / $12.00 | Breakpoint | Threshold: 200K tokens |
| gemini-3.1-flash-image-preview | $0.500 | $3.00 | Multimodal | text, image |
| gemini-3.1-flash-lite-preview | $0.250 | $1.50 | Flat | |
| gemini-3.1-pro-preview | $2.00 / $4.00 | $12.00 / $18.00 | Breakpoint | Threshold: 200K tokens |
| gemini-3.5-flash | $1.50 | $9.00 | Flat | |
| gpt-4.1 | $2.00 | $8.00 | Flat | |
| gpt-4.1-mini | $0.400 | $1.60 | Flat | |
| gpt-4.1-nano | $0.100 | $0.400 | Flat | |
| gpt-4o | $2.50 | $10.00 | Flat | |
| gpt-4o-mini | $0.150 | $0.600 | Flat | |
| gpt-5 | $1.25 | $10.00 | Flat | |
| gpt-5-codex | $1.25 | $10.00 | Flat | |
| gpt-5-mini | $0.250 | $2.00 | Flat | |
| gpt-5-nano | $0.050 | $0.400 | Flat | |
| gpt-5-pro | $15.00 | $120.00 | Flat | |
| gpt-5.1 | $1.25 | $10.00 | Flat | |
| gpt-5.1-codex | $1.25 | $10.00 | Flat | |
| gpt-5.1-codex-max | $1.25 | $10.00 | Flat | |
| gpt-5.2 | $1.75 | $14.00 | Flat | |
| gpt-5.2-codex | $1.75 | $14.00 | Flat | |
| gpt-5.2-pro | $21.00 | $168.00 | Flat | |
| gpt-5.3-codex | $1.75 | $14.00 | Flat | |
| gpt-5.4 | $2.50 / $5.00 | $15.00 / $22.50 | Breakpoint | Threshold: 272K tokens |
| gpt-5.4-mini | $0.750 | $4.50 | Flat | |
| gpt-5.4-nano | $0.200 | $1.25 | Flat | |
| gpt-5.4-pro | $30.00 / $60.00 | $180.00 / $270.00 | Breakpoint | Threshold: 272K tokens |
| gpt-5.5 | $5.00 / $10.00 | $30.00 / $45.00 | Breakpoint | Threshold: 272K tokens |
| gpt-5.5-pro | $30.00 | $180.00 | Flat | |
| o1 | $15.00 | $60.00 | Flat | |
| o1-mini | $1.10 | $4.40 | Flat | |
| o1-pro | $150.00 | $600.00 | Flat | |
| o3 | $2.00 | $8.00 | Flat | |
| o3-deep-research | $10.00 | $40.00 | Flat | |
| o3-mini | $1.10 | $4.40 | Flat | |
| o3-pro | $20.00 | $80.00 | Flat | |
| o4-mini | $1.10 | $4.40 | Flat | |
| o4-mini-deep-research | $2.00 | $8.00 | Flat | |
How LLM Pricing Works
LLM APIs charge based on tokens, units of text that correspond to roughly 4 characters or 0.75 words in English. Pricing is split between input tokens (your prompt) and output tokens (the model's response), with output typically costing more.
Some models use breakpoint pricing, where rates increase above a certain context length. Multimodal models may also have different rates for text, image, and audio modalities. The calculator uses the standard rate table so you can get a clean baseline before applying provider-specific extras.
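A minimal sketch of how breakpoint pricing could be modeled, using the gemini-2.5-pro row from the table (threshold of 200K tokens). The `Pricing` class and `call_cost` function are hypothetical names. The sketch assumes the whole call is billed at the higher tier once the prompt exceeds the threshold; providers may apply the breakpoint differently, so check the provider's pricing page before relying on it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pricing:
    input_rate: float                 # USD per 1M input tokens (base tier)
    output_rate: float                # USD per 1M output tokens (base tier)
    threshold: Optional[int] = None   # breakpoint in input tokens; None means flat pricing
    input_rate_high: float = 0.0      # USD per 1M input tokens above the threshold
    output_rate_high: float = 0.0     # USD per 1M output tokens above the threshold


def call_cost(p: Pricing, input_tokens: int, output_tokens: int) -> float:
    # Assumption: when the prompt exceeds the threshold, the entire call
    # (input and output) is billed at the higher tier.
    over = p.threshold is not None and input_tokens > p.threshold
    in_rate = p.input_rate_high if over else p.input_rate
    out_rate = p.output_rate_high if over else p.output_rate
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


gemini_25_pro = Pricing(1.25, 10.00, threshold=200_000,
                        input_rate_high=2.50, output_rate_high=15.00)

print(call_cost(gemini_25_pro, 100_000, 2_000))  # prompt below the threshold
print(call_cost(gemini_25_pro, 300_000, 2_000))  # prompt above the threshold
```

Flat-rate models fit the same structure by leaving `threshold` unset.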
Tips to Reduce LLM Costs
- Right-size your model: Use smaller models (Haiku, GPT-5 Nano) for simple tasks like classification or extraction.
- Minimize prompt length: Remove unnecessary context and examples from system prompts.
- Cache responses: Store and reuse results for identical or similar queries.
- Use model routing: Route simple queries to cheap models and only escalate to expensive models when needed.
- Monitor usage: Track costs per endpoint and model to identify optimization opportunities.
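The caching and routing tips above can be sketched together as follows. Everything here is illustrative: the length-based routing rule, the in-memory cache, and `fake_call` (a stand-in for a real API client) are assumptions chosen so the example runs offline, not a recommended production design.

```python
import hashlib
from typing import Callable, Dict, List

# Hypothetical routing rule: short prompts go to a cheap model, the rest escalate.
CHEAP_MODEL = "gpt-5-nano"
EXPENSIVE_MODEL = "gpt-5"

_cache: Dict[str, str] = {}


def pick_model(prompt: str, max_cheap_chars: int = 500) -> str:
    """Choose a model from prompt length alone (a deliberately simple heuristic)."""
    return CHEAP_MODEL if len(prompt) <= max_cheap_chars else EXPENSIVE_MODEL


def cached_completion(prompt: str, call_model: Callable[[str, str], str]) -> str:
    """Return the stored response for an identical prompt, else call the model once."""
    model = pick_model(prompt)
    key = hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = call_model(model, prompt)
    return _cache[key]


# Stand-in for a real API client; records which model each paid call used.
calls: List[str] = []


def fake_call(model: str, prompt: str) -> str:
    calls.append(model)
    return f"[{model}] reply"


cached_completion("Classify: 'great product'", fake_call)
cached_completion("Classify: 'great product'", fake_call)  # served from the cache
print(calls)  # only one paid call was made
```

This cache only matches identical prompts. Reusing results for merely similar queries would need a similarity check, which is out of scope for this sketch.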
Frequently Asked Questions
How do LLM APIs charge for usage?
LLM APIs charge per token, with separate rates for input (prompt) tokens and output (completion) tokens. Prices are typically quoted per million tokens.
What is the cheapest LLM API?
By input price, the cheapest standard-rate models currently tracked are GPT-5 Nano ($0.05/1M input) and Gemini 2.0 Flash Lite ($0.075/1M input), followed by GPT-4.1 Nano and Gemini 2.5 Flash Lite at $0.10/1M input each.
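Which model is cheapest in practice depends on the mix of input and output tokens, not only on the input rate. The sketch below ranks four low-cost rows copied from the comparison table for a given workload; `RATES` and `rank_by_cost` are illustrative names, and the workload sizes are made up for the example.

```python
from typing import Dict, List, Tuple

# Standard rates (USD per 1M tokens) copied from the comparison table: (input, output).
RATES: Dict[str, Tuple[float, float]] = {
    "gpt-5-nano": (0.05, 0.40),
    "gemini-2.0-flash-lite": (0.075, 0.30),
    "gpt-4.1-nano": (0.10, 0.40),
    "gemini-2.5-flash-lite": (0.10, 0.40),
}


def rank_by_cost(input_tokens: int, output_tokens: int) -> List[Tuple[str, float]]:
    """Return (model, cost in USD) pairs sorted from cheapest to most expensive."""
    costs = {
        model: (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
        for model, (in_rate, out_rate) in RATES.items()
    }
    return sorted(costs.items(), key=lambda kv: kv[1])


# Input-heavy workload: 1M input tokens, 100K output tokens.
print(rank_by_cost(1_000_000, 100_000)[0][0])   # prints gpt-5-nano
# Output-heavy workload: 100K input tokens, 1M output tokens.
print(rank_by_cost(100_000, 1_000_000)[0][0])   # prints gemini-2.0-flash-lite
```

The ranking flips for the output-heavy workload because gemini-2.0-flash-lite has the lower output rate.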
How can I reduce LLM API costs?
Key strategies include: choosing the right model size for your task, minimizing prompt length, caching frequent responses, batching requests, and using cheaper models for routing/classification before calling expensive models.
Automate Your Cost Estimation
Get programmatic access to tracked model pricing with our API. Or explore our LLM pricing comparison to find the best model for your budget.