Large Model API Pricing
Explore pricing for our model API, with transparent rates and flexible options to fit your needs. Prices are listed per million tokens (/Mt), and context sizes are given in tokens.
Anthropic's Claude models emphasize AI safety, aiming to be helpful, harmless, and honest assistants with strong reasoning and conversational abilities.
| Model Name | Input Token Range | Context | Input (/Mt) | Cache Write (/Mt) | Cache read (/Mt) | Output (/Mt) | Actions |
|---|---|---|---|---|---|---|---|
| claude-haiku-4-5-20251001 | 1–2,000 | 20,000 | $5 | $13 (5m) | $12 | $10 | Go check it out |
| claude-haiku-4-5-20251001 | 2,000–10,000 | 20,000 | $3 | $8 (5m) | $8 | $4 | Go check it out |
| claude-sonnet-4-5-20250929 | 1–2,000 | 200,000 | $2 | $4 (5m) · $8 (1h) | $4 | $2 | Go check it out |
| claude-sonnet-4-5-20250929 | 2,000–20,000 | 200,000 | $4 | $6 (5m) · $10 (1h) | $6 | $4 | Go check it out |
| claude-3-7-sonnet-20250219 | - | 200,000 | $3 | $3.75 (5m) | $0.3 | $15 | Go check it out |
| claude-sonnet-4-20250514 | - | 200,000 | $3 | $3.75 (5m) · $6.60 (1h) | $0.3 | $15 | Go check it out |
| claude-opus-4-20250514 | - | 200,000 | $15 | $18.75 (5m) | $1.5 | $75 | Go check it out |
| claude-opus-4-1-20250805 | - | 200,000 | $15 | $18.75 (5m) | $1.5 | $75 | Go check it out |
| claude-3-5-sonnet-20241022 | - | 200,000 | $3 | $3.75 (5m) | $0.3 | $15 | Go check it out |
| claude-3-haiku-20240307 | - | 200,000 | $0.25 | - | - | $1.25 | Go check it out |
| claude-3-5-haiku-20241022 | - | 200,000 | $0.8 | - | - | $4 | Go check it out |
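
For models priced by input token range, a request's cost depends on which range its input falls into. The sketch below estimates that cost in Python, using the claude-haiku-4-5-20251001 rates from the table. It rests on several assumptions that the table does not confirm: "/Mt" means per million tokens, the tier is chosen by the request's total input tokens, the chosen tier's rates apply to the whole request, and each range's upper bound is inclusive. The names `Tier`, `pick_tier`, and `estimate_cost` are hypothetical helpers, not part of any API.

```python
from dataclasses import dataclass

MTOK = 1_000_000  # assumption: "/Mt" means "per million tokens"


@dataclass(frozen=True)
class Tier:
    max_input_tokens: int  # upper bound of the input-token range (assumed inclusive)
    input_per_mt: float    # price per million input tokens
    output_per_mt: float   # price per million output tokens


# Rates copied from the claude-haiku-4-5-20251001 rows of the table.
HAIKU_4_5_TIERS = [
    Tier(max_input_tokens=2_000, input_per_mt=5.0, output_per_mt=10.0),
    Tier(max_input_tokens=10_000, input_per_mt=3.0, output_per_mt=4.0),
]


def pick_tier(tiers, input_tokens):
    """Return the first tier whose range covers input_tokens."""
    for tier in sorted(tiers, key=lambda t: t.max_input_tokens):
        if input_tokens <= tier.max_input_tokens:
            return tier
    raise ValueError(f"no pricing tier covers {input_tokens} input tokens")


def estimate_cost(tiers, input_tokens, output_tokens):
    """Estimate request cost, ignoring cache write/read charges."""
    tier = pick_tier(tiers, input_tokens)
    return (input_tokens * tier.input_per_mt
            + output_tokens * tier.output_per_mt) / MTOK


print(f"{estimate_cost(HAIKU_4_5_TIERS, 1_500, 500):.4f}")    # lower tier
print(f"{estimate_cost(HAIKU_4_5_TIERS, 8_000, 1_000):.4f}")  # upper tier
```

If billing is actually marginal (each range's tokens charged at that range's rate), `estimate_cost` would need to split the input across tiers instead.
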
OpenAI's GPT series of models offers state-of-the-art language understanding and generation, delivers outstanding performance across a wide range of tasks, and is among the industry's leading AI model families.
| Model Name | Context | Input (/Mt) | Cache Write (/Mt) | Cache read (/Mt) | Output (/Mt) | Operation |
|---|---|---|---|---|---|---|
| gpt-5-codex | 400,000 | $1.25 | - | $0.125 | $10 | Go check it out |
| openai/gpt-oss-120b | 131,072 | $0.1 | - | - | $0.5 | Go check it out |
| openai/gpt-oss-20b | 131,072 | $0.05 | - | - | $0.2 | Go check it out |
| gpt-5 | 400,000 | $1.25 | $0.1 (5m) · $0.2 (1h) | $0.125 | $10 | Go check it out |
| gpt-5-mini | 400,000 | $0.25 | - | $0.025 | $2 | Go check it out |
| gpt-5-nano | 400,000 | $0.05 | - | $0.005 | $0.4 | Go check it out |
| gpt-5-pro | 400,000 | $15 | $1 (1h) | - | $120 | Go check it out |
| gpt-5-chat-latest | 400,000 | $1.25 | - | $0.125 | $10 | Go check it out |
| gpt-4.1-mini | 1,047,576 | $0.4 | - | $0.1 | $1.6 | Go check it out |
| gpt-4.1-nano | 1,047,576 | $0.1 | - | $0.025 | $0.4 | Go check it out |
| gpt-4.1 | 1,047,576 | $2 | - | $0.5 | $8 | Go check it out |
| gpt-4o-mini | 131,072 | $0.15 | - | $0.075 | $0.6 | Go check it out |
| gpt-4o | 131,072 | $2.5 | - | $1.25 | $10 | Go check it out |
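
The Cache read column matters most for prompts that reuse a long prefix. As a rough illustration, the Python sketch below compares the input cost of a request with and without cached tokens, using the gpt-5 rates from the table ($1.25 input, $0.125 cache read). It assumes "/Mt" means per million tokens and that cached tokens are billed at the cache-read rate instead of the input rate. Cache write charges are left out, and `input_cost` is a hypothetical helper.

```python
MTOK = 1_000_000  # assumption: "/Mt" means "per million tokens"


def input_cost(total_input_tokens, cached_tokens, input_per_mt, cache_read_per_mt):
    """Input-side cost when part of the prompt is served from cache.

    Assumes cached tokens are billed at the cache-read rate and the
    remaining tokens at the normal input rate; cache writes are ignored.
    """
    if not 0 <= cached_tokens <= total_input_tokens:
        raise ValueError("cached_tokens must be between 0 and total_input_tokens")
    uncached_tokens = total_input_tokens - cached_tokens
    return (uncached_tokens * input_per_mt
            + cached_tokens * cache_read_per_mt) / MTOK


# gpt-5 rates from the table: input $1.25/Mt, cache read $0.125/Mt.
no_cache = input_cost(100_000, 0, 1.25, 0.125)
with_cache = input_cost(100_000, 80_000, 1.25, 0.125)
print(f"no cache: {no_cache:.4f}, 80% cached: {with_cache:.4f}")
```

Under these assumptions, a 100,000-token prompt with 80,000 cached tokens costs 0.035 on the input side, versus 0.125 with no cache hits.
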
Google's Gemini model offers high-quality natural language processing capabilities, performs exceptionally well across a wide range of NLP tasks, and boasts powerful multimodal capabilities.
| Model Name | Context | Input (/Mt) | Cache Write (/Mt) | Cache read (/Mt) | Output (/Mt) | Operation |
|---|---|---|---|---|---|---|
| google/gemma-3-12b-it | 131,072 | $0.05 | - | - | $0.1 | Go check it out |
| gemini-2.5-flash | 1,048,576 | $0.3 | $0.083 (5m) | $0.075 | $2.5 | Go check it out |
| gemini-2.5-pro | 1,048,576 | $1.25 | $0.375 (5m) | $0.3125 | $10 | Go check it out |
| google/gemma-3-27b-it | 32,768 | $0.119 | - | - | $0.2 | Go check it out |
| gemini-3.1-flash-lite-preview | 1,000,000 | $1 | $2 (5m) · $2 (1h) | $1 | $2 | Go check it out |
| gemini-2.5-flash-lite-preview-09-2025 | 1,048,576 | $0.1 | $0.083 (5m) | $0.01 | $0.4 | Go check it out |
| gemini-2.0-flash-lite | 1,048,576 | $0.075 | $0.083 (5m) | $0.0188 | $0.3 | Go check it out |
| gemini-2.5-flash-lite | 1,048,576 | $0.1 | $0.083 (5m) | $0.025 | $0.4 | Go check it out |
| gemini-2.5-flash-lite-preview-06-17 | 1,048,576 | $0.1 | - | - | $0.4 | Go check it out |
| gemini-2.5-flash-preview-05-20 | 1,048,576 | $0.15 | - | - | $3.5 | Go check it out |
| gemini-2.5-pro-preview-06-05 | 1,048,576 | $1.25 | - | - | $10 | Go check it out |
| gemini-2.0-flash-20250609 | 1,048,576 | $0.15 | - | - | $0.6 | Go check it out |
Meta's Llama model offers state-of-the-art language understanding capabilities and features an open architecture, making it suitable for a wide range of applications.
| Model Name | Context | Input (/Mt) | Output (/Mt) | Operation |
|---|---|---|---|---|
| meta-llama/llama-3.1-8b-instruct | 16,384 | $0.02 | $0.05 | Go check it out |
| meta-llama/llama-3.3-70b-instruct | 131,072 | $0.13 | $0.39 | Go check it out |
| meta-llama/llama-4-maverick-17b-128e-instruct-fp8 | 1,048,576 | $0.17 | $0.85 | Go check it out |
| meta-llama/llama-4-scout-17b-16e-instruct | 131,072 | $0.1 | $0.5 | Go check it out |
| meta-llama/llama-3.2-3b-instruct | 32,768 | $0.03 | $0.05 | Go check it out |
The Qwen series of models offers powerful natural language processing capabilities and is available in a range of parameter sizes, from lightweight to enterprise-grade solutions.
| Model Name | Context | Input (/Mt) | Output (/Mt) | Operation |
|---|---|---|---|---|
| qwen/qwen3-next-80b-a3b-thinking | 65,536 | $0.15 | $1.5 | Go check it out |
| qwen/qwen3-coder-480b-a35b-instruct | 262,144 | $0.29 | $1.2 | Go check it out |
| qwen/qwen3-235b-a22b-thinking-2507 | 131,072 | $0.3 | $3 | Go check it out |
| qwen/qwen3-235b-a22b-instruct-2507 | 131,072 | $0.15 | $0.8 | Go check it out |
| qwen/qwen-2.5-72b-instruct | 32,000 | $0.38 | $0.4 | Go check it out |
| qwen/qwen3-235b-a22b-fp8 | 40,960 | $0.2 | $0.8 | Go check it out |
| qwen/qwen2.5-vl-72b-instruct | 32,768 | $0.8 | $0.8 | Go check it out |
| qwen/qwen3-32b-fp8 | 40,960 | $0.1 | $0.45 | Go check it out |
| qwen/qwen3-30b-a3b-fp8 | 40,960 | $0.09 | $0.45 | Go check it out |
| Qwen/Qwen3-8B | - | Free | Free | Go check it out |
| qwen/qwen3-next-80b-a3b-instruct | 65,536 | $0.15 | $1.5 | Go check it out |
| qwen/qwen-mt-plus | 4,096 | $0.25 | $0.75 | Go check it out |
| qwen/qwen3-8b-fp8 | 128,000 | $0.035 | $0.138 | Go check it out |
| qwen/qwen2.5-7b-instruct | 32,000 | $0.07 | $0.07 | Go check it out |
Baidu's ERNIE model offers advanced Chinese language understanding and multimodal capabilities, is optimized for Chinese applications, and is competitively priced.
| Model Name | Context | Input (/Mt) | Output (/Mt) | Operation |
|---|---|---|---|---|
| baidu/ernie-4.5-vl-424b-a47b | 123,000 | $0.42 | $1.25 | Go check it out |
| baidu/ernie-4.5-300b-a47b-paddle | 123,000 | $0.28 | $1.1 | Go check it out |
The GLM series of models from Tsinghua University features advanced Chinese language understanding and generation capabilities.
| Model Name | Context | Input (/Mt) | Output (/Mt) | Operation |
|---|---|---|---|---|
| zai-org/glm-4.5 | 131,072 | $0.6 | $2.2 | Go check it out |
| zai-org/glm-4.5v | 65,536 | $0.6 | $1.8 | Go check it out |
| thudm/glm-4.1v-9b-thinking | 65,536 | $0.035 | $0.138 | Go check it out |
Fine-tuned models from Sao10K, optimized for creative and role-playing applications, with enhanced storytelling capabilities.
| Model Name | Context | Input (/Mt) | Output (/Mt) | Operation |
|---|---|---|---|---|
| sao10k/l3-70b-euryale-v2.1 | 8,192 | $1.48 | $1.48 | Go check it out |
| sao10k/l3-8b-lunaris | 8,192 | $0.05 | $0.05 | Go check it out |
| Sao10K/L3-8B-Stheno-v3.2 | 8,192 | $0.05 | $0.05 | Go check it out |
| sao10k/l31-70b-euryale-v2.2 | 8,192 | $1.48 | $1.48 | Go check it out |
A powerful and efficient language model from Mistral AI, designed for both commercial and open-source applications.
| Model Name | Context | Input (/Mt) | Output (/Mt) | Operation |
|---|---|---|---|---|
| mistralai/mistral-nemo | 60,288 | $0.04 | $0.17 | Go check it out |
| mistralai/mistral-7b-instruct | 32,768 | $0.029 | $0.059 | Go check it out |
Advanced AI models from DeepSeek, offering cutting-edge inference capabilities and competitive pricing for enterprise and research applications.
| Model Name | Input Token Range | Context | Input (/Mt) | Cache Write (/Mt) | Cache read (/Mt) | Output (/Mt) | Actions |
|---|---|---|---|---|---|---|---|
| deepseek/deepseek-v3.1 | - | 163,840 | $0.27 | $1 (5m) | $1 | $1 | Go check it out |
| deepseek/deepseek-r1-0528 | 1–32,768 | 163,840 | $1.5 | $0.6 (5m) | $0.9 | $6 | Go check it out |
| deepseek/deepseek-r1-0528 | 32,768–131,072 | 163,840 | $8 | $0.5 (5m) | $0.3 | $4 | Go check it out |
| deepseek/deepseek-r1-0528 | 131,072–204,800 | 163,840 | $3 | $0.7 (5m) | $0.5 | $6 | Go check it out |
| deepseek/deepseek-v3-0324 | - | 163,840 | $0.28 | $0.14 (5m) | $0.14 | $1.14 | Go check it out |
| deepseek/deepseek-v3.1-test | - | 20,000 | Free | - | - | Free | Go check it out |
MiniMax AI's advanced language model delivers powerful conversational AI capabilities, excelling in customer service, content generation, and creative applications, with robust multilingual support and enterprise-grade scalability.
| Model Name | Context | Input (/Mt) | Output (/Mt) | Operation |
|---|---|---|---|---|
| minimaxai/minimax-m1-80k | 1,000,000 | $0.55 | $2.2 | Go check it out |
An innovative AI model from Gryphe that offers professional-grade language understanding capabilities, with a focus on efficiency and adaptability, making it ideal for niche applications.
| Model Name | Context | Input (/Mt) | Output (/Mt) | Operation |
|---|---|---|---|---|
| gryphe/mythomax-l2-13b | 4,096 | $0.09 | $0.09 | Go check it out |
A sophisticated collection of state-of-the-art AI models, featuring advanced reasoning and mathematical proof capabilities, as well as cutting-edge language understanding across multiple domains.