Cheap LLM APIs — total cost ≤ $1 per million tokens
Models whose input price and output price, each quoted in USD per million tokens, sum to at most $1. A good fit for high-volume workloads where small-model-tier quality is acceptable.
118 models match this tag · sorted by lowest total cost.
| # | Model | Provider | Context | In $/1M | Out $/1M |
|---|---|---|---|---|---|
| 1 | Mistral Nemo | Mistral | 131k | 0.0200 | 0.0300 |
| 2 | Llama 3.1 8B Instruct | Meta | 16k | 0.0200 | 0.0500 |
| 3 | Llama 3 8B Instruct | Meta | 8k | 0.0400 | 0.0400 |
| 4 | Llama 3 8B Lunaris | Sao10K | 8k | 0.0400 | 0.0500 |
| 5 | Gemma 3 4B | Google | 131k | 0.0400 | 0.0800 |
| 6 | MythoMax 13B | Gryphe | 4k | 0.0600 | 0.0600 |
| 7 | Granite 4.0 Micro | IBM | 131k | 0.0170 | 0.1100 |
| 8 | Mistral Small 3 | Mistral | 33k | 0.0500 | 0.0800 |
| 9 | Qwen2.5 7B Instruct | Qwen | 33k | 0.0400 | 0.1000 |
| 10 | LFM2-24B-A2B | LiquidAI | 33k | 0.0300 | 0.1200 |
| 11 | Granite 4.1 8B | IBM | 131k | 0.0500 | 0.1000 |
| 12 | Qwen-Turbo | Qwen | 131k | 0.0325 | 0.1300 |
| 13 | Gemma 3 12B | Google | 131k | 0.0400 | 0.1300 |
| 14 | gpt-oss-20b | OpenAI | 131k | 0.0300 | 0.1400 |
| 15 | Qwen3 235B A22B Instruct 2507 | Qwen | 262k | 0.0710 | 0.1000 |
| 16 | Nova Micro 1.0 | Amazon | 128k | 0.0350 | 0.1400 |
| 17 | Gemma 3n 4B | Google | 33k | 0.0600 | 0.1200 |
| 18 | Command R7B (12-2024) | Cohere | 128k | 0.0375 | 0.1500 |
| 19 | Qwen3.5-9B | Qwen | 262k | 0.0400 | 0.1500 |
| 20 | Trinity Mini | Arcee AI | 131k | 0.0450 | 0.1500 |
| 21 | GLM 4 32B | Z.ai | 128k | 0.1000 | 0.1000 |
| 22 | Ministral 3 3B 2512 | Mistral | 131k | 0.1000 | 0.1000 |
| 23 | Nemotron Nano 9B V2 | NVIDIA | 131k | 0.0400 | 0.1600 |
| 24 | Reka Edge | Reka AI | 16k | 0.1000 | 0.1000 |
| 25 | Phi 4 | Microsoft | 16k | 0.0650 | 0.1400 |
| 26 | gpt-oss-120b | OpenAI | 131k | 0.0390 | 0.1800 |
| 27 | Llama 3.2 1B Instruct | Meta | 60k | 0.0270 | 0.2000 |
| 28 | Gemma 3 27B | Google | 131k | 0.0800 | 0.1600 |
| 29 | Nemotron 3 Nano 30B A3B | NVIDIA | 262k | 0.0500 | 0.2000 |
| 30 | Mistral Small 3.2 24B | Mistral | 128k | 0.0750 | 0.2000 |
| 31 | Hermes 2 Pro - Llama-3 8B | Nous | 8k | 0.1400 | 0.1400 |
| 32 | Ministral 3 8B 2512 | Mistral | 262k | 0.1500 | 0.1500 |
| 33 | Mistral 7B Instruct v0.1 | Mistral | 3k | 0.1100 | 0.1900 |
| 34 | Nova Lite 1.0 | Amazon | 300k | 0.0600 | 0.2400 |
| 35 | Rnj 1 Instruct | EssentialAI | 33k | 0.1500 | 0.1500 |
| 36 | Reka Flash 3 | Reka AI | 66k | 0.1000 | 0.2000 |
| 37 | UI-TARS 7B | ByteDance | 128k | 0.1000 | 0.2000 |
| 38 | Ling-2.6-flash | inclusionAI | 262k | 0.0800 | 0.2400 |
| 39 | Qwen3.5-Flash | Qwen | 1000k | 0.0650 | 0.2600 |
| 40 | Hy3 preview | Tencent | 262k | 0.0660 | 0.2600 |
| 41 | Qwen3 14B | Qwen | 41k | 0.1000 | 0.2400 |
| 42 | Qwen3 Coder 30B A3B Instruct | Qwen | 160k | 0.0700 | 0.2700 |
| 43 | ERNIE 4.5 21B A3B | Baidu Qianfan | 120k | 0.0700 | 0.2800 |
| 44 | ERNIE 4.5 21B A3B Thinking | Baidu Qianfan | 131k | 0.0700 | 0.2800 |
| 45 | Llama Guard 4 12B | Meta | 164k | 0.1800 | 0.1800 |
| 46 | Spotlight | Arcee AI | 131k | 0.1800 | 0.1800 |
| 47 | Qwen3 32B | Qwen | 41k | 0.0800 | 0.2800 |
| 48 | Gemini 2.0 Flash Lite | Google | 1049k | 0.0750 | 0.3000 |
| 49 | Seed 1.6 Flash | ByteDance Seed | 262k | 0.0750 | 0.3000 |
| 50 | gpt-oss-safeguard-20b | OpenAI | 131k | 0.0750 | 0.3000 |
| 51 | DeepSeek V4 Flash | DeepSeek | 1049k | 0.1260 | 0.2520 |
| 52 | Llama 4 Scout | Meta | 328k | 0.0800 | 0.3000 |
| 53 | Gemma 4 26B A4B | Google | 262k | 0.0600 | 0.3300 |
| 54 | Qwen3 30B A3B Instruct 2507 | Qwen | 262k | 0.0900 | 0.3000 |
| 55 | Llama 3.2 3B Instruct | Meta | 80k | 0.0510 | 0.3400 |
| 56 | Devstral Small 1.1 | Mistral | 131k | 0.1000 | 0.3000 |
| 57 | MiMo-V2-Flash | Xiaomi | 262k | 0.1000 | 0.3000 |
| 58 | Ministral 3 14B 2512 | Mistral | 262k | 0.2000 | 0.2000 |
| 59 | Step 3.5 Flash | StepFun | 262k | 0.1000 | 0.3000 |
| 60 | Voxtral Small 24B 2507 | Mistral | 32k | 0.1000 | 0.3000 |
| 61 | Llama 3.3 70B Instruct | Meta | 131k | 0.1000 | 0.3200 |
| 62 | Phi 4 Mini Instruct | Microsoft | 128k | 0.0800 | 0.3500 |
| 63 | GPT-5 Nano | OpenAI | 400k | 0.0500 | 0.4000 |
| 64 | Qwen3 8B | Qwen | 41k | 0.0500 | 0.4000 |
| 65 | GLM 4.7 Flash | Z.ai | 203k | 0.0600 | 0.4000 |
| 66 | Qwen3 30B A3B Thinking 2507 | Qwen | 131k | 0.0800 | 0.4000 |
| 67 | Gemma 4 31B | Google | 262k | 0.1200 | 0.3700 |
| 68 | Llama 3.2 11B Vision Instruct | Meta | 131k | 0.2450 | 0.2450 |
| 69 | GPT-4.1 Nano | OpenAI | 1048k | 0.1000 | 0.4000 |
| 70 | Gemini 2.0 Flash | Google | 1049k | 0.1000 | 0.4000 |
| 71 | Gemini 2.5 Flash Lite | Google | 1049k | 0.1000 | 0.4000 |
| 72 | Gemini 2.5 Flash Lite Preview 09-2025 | Google | 1049k | 0.1000 | 0.4000 |
| 73 | Llama 3.3 Nemotron Super 49B V1.5 | NVIDIA | 131k | 0.1000 | 0.4000 |
| 74 | Seed-2.0-Mini | ByteDance Seed | 262k | 0.1000 | 0.4000 |
| 75 | Llama Guard 3 8B | Meta | 131k | 0.4800 | 0.0300 |
| 76 | Qwen3 VL 32B Instruct | Qwen | 131k | 0.1040 | 0.4160 |
| 77 | Hermes 4 70B | Nous | 131k | 0.1300 | 0.4000 |
| 78 | Nemotron 3 Super | NVIDIA | 262k | 0.0900 | 0.4500 |
| 79 | Qwen3 30B A3B | Qwen | 41k | 0.0900 | 0.4500 |
| 80 | Tongyi DeepResearch 30B A3B | Alibaba | 131k | 0.0900 | 0.4500 |
| 81 | Qwen VL Plus | Qwen | 131k | 0.1365 | 0.4095 |
| 82 | Qwen3 VL 8B Instruct | Qwen | 131k | 0.0800 | 0.5000 |
| 83 | R1 Distill Qwen 32B | DeepSeek | 33k | 0.2900 | 0.2900 |
| 84 | Hermes 3 70B Instruct | Nous | 131k | 0.3000 | 0.3000 |
| 85 | Rocinante 12B | TheDrummer | 33k | 0.1700 | 0.4300 |
| 86 | Trinity Large Preview | Arcee AI | 131k | 0.1500 | 0.4500 |
| 87 | DeepSeek V3.2 | DeepSeek | 131k | 0.2520 | 0.3780 |
| 88 | DeepSeek V3.1 Nex N1 | Nex AGI | 131k | 0.1350 | 0.5000 |
| 89 | Olmo 3 32B Think | AllenAI | 66k | 0.1500 | 0.5000 |
| 90 | Qwen3 VL 30B A3B Instruct | Qwen | 131k | 0.1300 | 0.5200 |
| 91 | DeepSeek V3.2 Exp | DeepSeek | 164k | 0.2700 | 0.4100 |
| 92 | Grok 4 Fast | xAI | 2000k | 0.2000 | 0.5000 |
| 93 | Grok 4.1 Fast | xAI | 2000k | 0.2000 | 0.5000 |
| 94 | ERNIE 4.5 VL 28B A3B | Baidu Qianfan | 30k | 0.1400 | 0.5600 |
| 95 | Hunyuan A13B Instruct | Tencent | 131k | 0.1400 | 0.5700 |
| 96 | DeepSeek V3.2 Speciale | DeepSeek | 164k | 0.2870 | 0.4310 |
| 97 | Command R (08-2024) | Cohere | 128k | 0.1500 | 0.6000 |
| 98 | GPT-4o-mini | OpenAI | 128k | 0.1500 | 0.6000 |
| 99 | GPT-4o-mini (2024-07-18) | OpenAI | 128k | 0.1500 | 0.6000 |
| 100 | GPT-4o-mini Search Preview | OpenAI | 128k | 0.1500 | 0.6000 |
| 101 | Llama 4 Maverick | Meta | 1049k | 0.1500 | 0.6000 |
| 102 | Mistral Small 4 | Mistral | 262k | 0.1500 | 0.6000 |
| 103 | Solar Pro 3 | Upstage | 128k | 0.1500 | 0.6000 |
| 104 | Qwen2.5 72B Instruct | Qwen | 33k | 0.3600 | 0.4000 |
| 105 | Cydonia 24B V4.1 | TheDrummer | 131k | 0.3000 | 0.5000 |
| 106 | Grok 3 Mini | xAI | 131k | 0.3000 | 0.5000 |
| 107 | Grok 3 Mini Beta | xAI | 131k | 0.3000 | 0.5000 |
| 108 | Llama 3.1 70B Instruct | Meta | 131k | 0.4000 | 0.4000 |
| 109 | Saba | Mistral | 33k | 0.2000 | 0.6000 |
| 110 | UnslopNemo 12B | TheDrummer | 33k | 0.4000 | 0.4000 |
| 111 | Qwen3 Next 80B A3B Thinking | Qwen | 131k | 0.0975 | 0.7800 |
| 112 | Mistral Small 3.1 24B | Mistral | 128k | 0.3500 | 0.5600 |
| 113 | Qwen3 Coder Next | Qwen | 262k | 0.1100 | 0.8000 |
| 114 | DeepSeek V3 0324 | DeepSeek | 164k | 0.2000 | 0.7700 |
| 115 | GLM 4.5 Air | Z.ai | 131k | 0.1300 | 0.8500 |
| 116 | DeepSeek V3.1 | DeepSeek | 164k | 0.2100 | 0.7900 |
| 117 | Mercury 2 | Inception | 128k | 0.2500 | 0.7500 |
| 118 | Qwen2.5 VL 72B Instruct | Qwen | 32k | 0.2500 | 0.7500 |
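
The ranking above uses the sum of the input and output prices, which implicitly weights input and output tokens equally. Real workloads rarely have a 1:1 mix, so the cheapest model for a given job can differ from the table order. The sketch below is a rough illustration in Python: the prices are copied from four rows of the table, while the `Price` class, the `workload_cost` helper, and the 900M/100M token mix are hypothetical names and numbers chosen for the example, not part of any provider's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """Per-million-token prices in USD, copied from rows of the table."""
    model: str
    in_per_m: float
    out_per_m: float

    @property
    def total(self) -> float:
        # The listing's sort key: input price plus output price.
        return self.in_per_m + self.out_per_m

    def workload_cost(self, input_tokens: int, output_tokens: int) -> float:
        # Cost in USD for a concrete number of input and output tokens.
        return (input_tokens * self.in_per_m
                + output_tokens * self.out_per_m) / 1_000_000


PRICES = [
    Price("Mistral Nemo", 0.0200, 0.0300),
    Price("GPT-4o-mini", 0.1500, 0.6000),
    Price("Llama 3.1 70B Instruct", 0.4000, 0.4000),
    Price("Qwen3 Next 80B A3B Thinking", 0.0975, 0.7800),
]

# Hypothetical input-heavy workload (e.g. classification over long documents).
INPUT_TOKENS = 900_000_000
OUTPUT_TOKENS = 100_000_000

# Order used by the table versus order for this specific token mix.
by_total = sorted(PRICES, key=lambda p: p.total)
by_workload = sorted(
    PRICES, key=lambda p: p.workload_cost(INPUT_TOKENS, OUTPUT_TOKENS)
)

for p in by_workload:
    cost = p.workload_cost(INPUT_TOKENS, OUTPUT_TOKENS)
    print(f"{p.model:30s} total={p.total:.4f}  workload=${cost:,.2f}")
```

On this input-heavy mix, Qwen3 Next 80B A3B Thinking works out cheaper than both GPT-4o-mini and Llama 3.1 70B Instruct even though it ranks below them by total cost, because its input price is much lower. Substituting your own token counts gives a ranking that reflects your actual traffic.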