AI Model Catalog
357 models, including GPT-4o, Claude, Gemini, DeepSeek, and Llama
| Model | Modality | Input ($/M tokens) | Output ($/M tokens) | Context | Released | |
|---|---|---|---|---|---|---|
Tongyi-MAI/Z-Image-Turbo | text→image | — | $0.0055/image | — | — | |
thenlper/gte-base | text→embeddings | $0.01 | Free | — | — | |
intfloat/e5-base-v2 | text→embeddings | $0.01 | Free | — | — | |
sentence-transformers/all-minilm-l6-v2 | text→embeddings | $0.01 | Free | — | — | |
sentence-transformers/paraphrase-minilm-l6-v2 | text→embeddings | $0.01 | Free | — | — | |
sentence-transformers/all-minilm-l12-v2 | text→embeddings | $0.01 | Free | — | — | |
sentence-transformers/multi-qa-mpnet-base-dot-v1 | text→embeddings | $0.01 | Free | — | — | |
baai/bge-base-en-v1.5 | text→embeddings | $0.01 | Free | — | — | |
sentence-transformers/all-mpnet-base-v2 | text→embeddings | $0.01 | Free | — | — | |
thenlper/gte-large | text→embeddings | $0.01 | Free | — | — | |
intfloat/e5-large-v2 | text→embeddings | $0.01 | Free | — | — | |
intfloat/multilingual-e5-large | text→embeddings | $0.01 | Free | — | — | |
baai/bge-large-en-v1.5 | text→embeddings | $0.01 | Free | — | — | |
baai/bge-m3 | text→embeddings | $0.01 | Free | — | — | |
qwen/qwen3-embedding-8b | text→embeddings | $0.01 | Free | — | — | |
liquid/lfm-2.2-6b | text→text | $0.01 | $0.02 | — | — | |
liquid/lfm2-8b-a1b | text→text | $0.01 | $0.02 | — | — | |
ibm-granite/granite-4.0-h-micro | text→text | $0.02 | $0.11 | — | — | |
openai/text-embedding-3-small text-embedding-3-small is OpenAI's improved, more performant version of the ada embedding model. Embeddings are a numerical representation of text that can be used to measure the relatedness between two pieces of text. Embeddings are useful for search, clustering, recommendations, anomaly detection, and classification tasks. | text→embeddings | $0.02 | Free | 8K | Oct 2025 | |
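The relatedness measurement described above is a single API call plus a similarity computation. A minimal sketch with text-embedding-3-small, assuming the official `openai` Python SDK and an `OPENAI_API_KEY` environment variable; the two sample strings are illustrative:

```python
# Minimal sketch: measure relatedness of two texts with text-embedding-3-small.
# Assumes the official `openai` Python SDK and OPENAI_API_KEY set in the environment.
import math
from openai import OpenAI

client = OpenAI()

resp = client.embeddings.create(
    model="text-embedding-3-small",
    input=["How do I reset my password?", "Steps to recover account access"],
)
a, b = (item.embedding for item in resp.data)

# Cosine similarity: closer to 1.0 means the two texts are more related.
dot = sum(x * y for x, y in zip(a, b))
norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
print(f"cosine similarity: {dot / norm:.3f}")
```

The same pattern drives the search and clustering use cases the description mentions: embed every document once, then rank by cosine similarity against the embedded query.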
qwen/qwen3-embedding-4b | text→embeddings | $0.02 | Free | — | — | |
meta-llama/llama-3.1-8b-instruct | text→text | $0.02 | $0.05 | — | — | |
meta-llama/llama-3.2-3b-instruct | text→text | $0.02 | $0.02 | — | — | |
meta-llama/llama-guard-3-8b | text→text | $0.02 | $0.06 | — | — | |
mistralai/mistral-nemo | text→text | $0.02 | $0.04 | — | — | |
google/gemma-3n-e4b-it Gemma 3n E4B-it is optimized for efficient execution on mobile and low-resource devices, such as phones, laptops, and tablets. It supports multimodal inputs—including text, visual data, and audio—enabling diverse tasks such as text generation, speech recognition, translation, and image analysis. Leveraging innovations like Per-Layer Embedding (PLE) caching and the MatFormer architecture, Gemma 3n dynamically manages memory usage and computational load by selectively activating model parameters, significantly reducing runtime resource requirements.
This model supports a wide linguistic range (trained in over 140 languages) and features a flexible 32K token context window. Gemma 3n can selectively load parameters, optimizing memory and computational efficiency based on the task or device capabilities, making it well-suited for privacy-focused, offline-capable applications and on-device AI solutions. [Read more in the blog post](https://developers.googleblog.com/en/introducing-gemma-3n/) | text→text | $0.02 | $0.04 | 33K | May 2025 | |
meta-llama/llama-3.2-1b-instruct | text→text | $0.03 | $0.20 | — | — | |
perplexity/pplx-embed-v1-4b pplx-embed-v1-4B is one of Perplexity's state-of-the-art text embedding models built for real-world, web-scale retrieval. pplx-embed-v1 is optimized for standard dense text retrieval with the 4B parameter model maximizing retrieval quality. | text→embeddings | $0.03 | Free | 32K | Mar 2026 | |
google/gemma-2-9b-it | text→text | $0.03 | $0.09 | — | — | |
meta-llama/llama-3-8b-instruct | text→text | $0.03 | $0.04 | — | — | |
openai/gpt-oss-20b | text→text | $0.03 | $0.14 | — | — | |
qwen/qwen2.5-coder-7b-instruct | text→text | $0.03 | $0.09 | — | — | |
liquid/lfm-2-24b-a2b LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures designed for efficient on-device deployment. Built as a 24B parameter Mixture-of-Experts model with only 2B active parameters per token, it delivers high-quality generation while maintaining low inference costs. The model fits within 32 GB of RAM, making it practical to run on consumer laptops and desktops without sacrificing capability. | text→text | $0.03 | $0.12 | 33K | Feb 2026 | |
amazon/nova-micro-v1 | text→text | $0.04 | $0.14 | — | — | |
cohere/command-r7b-12-2024 | text→text | $0.04 | $0.15 | — | — | |
openai/gpt-oss-120b:exacto | text→text | $0.04 | $0.19 | — | — | |
openai/gpt-oss-120b | text→text | $0.04 | $0.19 | — | — | |
google/gemma-3-12b-it | text→text | $0.04 | $0.13 | — | — | |
google/gemma-3-27b-it | text→text | $0.04 | $0.15 | — | — | |
google/gemma-3-4b-it | text→text | $0.04 | $0.08 | — | — | |
nvidia/nemotron-nano-9b-v2 | text→text | $0.04 | $0.16 | — | — | |
qwen/qwen-2.5-7b-instruct | text→text | $0.04 | $0.10 | — | — | |
sao10k/l3-lunaris-8b | text→text | $0.04 | $0.05 | — | — | |
arcee-ai/trinity-mini | text→text | $0.04 | $0.15 | — | — | |
meta-llama/llama-3.2-11b-vision-instruct | text→text | $0.05 | $0.05 | — | — | |
mistralai/mistral-small-24b-instruct-2501 | text→text | $0.05 | $0.08 | — | — | |
nvidia/nemotron-3-nano-30b-a3b | text→text | $0.05 | $0.20 | — | — | |
qwen/qwen-turbo | text→text | $0.05 | $0.20 | — | — | |
qwen/qwen3-8b | text→text | $0.05 | $0.40 | — | — | |
openai/gpt-5-nano GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low latency environments. While limited in reasoning depth compared to its larger counterparts, it retains key instruction-following and safety features. It is the successor to GPT-4.1-nano and offers a lightweight option for cost-sensitive or real-time applications. | text+image+file→text | $0.05 | $0.40 | 400K | Aug 2025 | |
qwen/qwen3-30b-a3b-thinking-2507 | text→text | $0.05 | $0.34 | — | — | |
mistralai/mistral-small-3.2-24b-instruct | text→text | $0.06 | $0.18 | — | — | |
amazon/nova-lite-v1 | text→text | $0.06 | $0.24 | — | — | |
gryphe/mythomax-l2-13b | text→text | $0.06 | $0.06 | — | — | |
qwen/qwen3-14b | text→text | $0.06 | $0.24 | — | — | |
z-ai/glm-4.7-flash | text→text | $0.06 | $0.40 | — | — | |
microsoft/phi-4 [Microsoft Research](/microsoft) Phi-4 is designed to perform well in complex reasoning tasks and can operate efficiently in situations with limited memory or where quick responses are needed.
At 14 billion parameters, it was trained on a mix of high-quality synthetic datasets, data from curated websites, and academic materials. It has undergone careful improvement to follow instructions accurately and maintain strong safety standards. It works best with English language inputs.
For more information, please see [Phi-4 Technical Report](https://arxiv.org/pdf/2412.08905)
| text→text | $0.06 | $0.14 | 16K | Jan 2025 | |
qwen/qwen3-coder-30b-a3b-instruct | text→text | $0.07 | $0.27 | — | — | |
baidu/ernie-4.5-21b-a3b | text→text | $0.07 | $0.28 | — | — | |
baidu/ernie-4.5-21b-a3b-thinking | text→text | $0.07 | $0.28 | — | — | |
nvidia/nemotron-nano-12b-v2-vl | text→text | $0.07 | $0.20 | — | — | |
qwen/qwen3-235b-a22b-2507 | text→text | $0.07 | $0.10 | — | — | |
google/gemini-2.0-flash-lite-001 | text→text | $0.07 | $0.30 | — | — | |
bytedance-seed/seed-1.6-flash | text→text | $0.07 | $0.30 | — | — | |
openai/gpt-oss-safeguard-20b | text→text | $0.07 | $0.30 | — | — | |
meta-llama/llama-4-scout | text→text | $0.08 | $0.30 | — | — | |
qwen/qwen3-30b-a3b | text→text | $0.08 | $0.28 | — | — | |
qwen/qwen3-32b | text→text | $0.08 | $0.24 | — | — | |
qwen/qwen3-vl-8b-instruct | text→text | $0.08 | $0.50 | — | — | |
alibaba/tongyi-deepresearch-30b-a3b | text→text | $0.09 | $0.45 | — | — | |
neversleep/llama-3.1-lumimaid-8b | text→text | $0.09 | $0.60 | — | — | |
qwen/qwen3-30b-a3b-instruct-2507 | text→text | $0.09 | $0.30 | — | — | |
qwen/qwen3-next-80b-a3b-instruct | text→text | $0.09 | $1.10 | — | — | |
xiaomi/mimo-v2-flash | text→text | $0.09 | $0.29 | — | — | |
allenai/olmo-3-7b-instruct | text→text | $0.10 | $0.20 | — | — | |
bytedance/ui-tars-1.5-7b | text→text | $0.10 | $0.20 | — | — | |
openai/text-embedding-ada-002 text-embedding-ada-002 is OpenAI's legacy text embedding model. | text→embeddings | $0.10 | Free | 8K | Oct 2025 | |
qwen/qwen3.5-flash-02-23 The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to the 3 series, these models deliver a leap forward in performance for both pure text and multimodal tasks, offering fast response times while balancing inference speed and overall performance. | text+image+video→text | $0.10 | $0.40 | 1M | Feb 2026 | |
mistralai/voxtral-small-24b-2507 | text→text | $0.10 | $0.30 | — | — | |
nvidia/nemotron-3-super-120b-a12b NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-agent applications. Built on a hybrid Mamba-Transformer Mixture-of-Experts architecture with multi-token prediction (MTP), it delivers over 50% higher token generation compared to leading open models.
The model features a 1M token context window for long-term agent coherence, cross-document reasoning, and multi-step task planning. Latent MoE enables calling 4 experts for the inference cost of only one, improving intelligence and generalization. Multi-environment RL training across 10+ environments delivers leading accuracy on benchmarks including AIME 2025, TerminalBench, and SWE-Bench Verified.
Fully open with weights, datasets, and recipes under the NVIDIA Open License, Nemotron 3 Super allows easy customization and secure deployment anywhere — from workstation to cloud. | text→text | $0.10 | $0.50 | 262K | Mar 2026 | |
mistralai/mistral-small-creative | text→text | $0.10 | $0.30 | — | — | |
nvidia/llama-3.3-nemotron-super-49b-v1.5 | text→text | $0.10 | $0.40 | — | — | |
mistralai/mistral-embed-2312 | text→embeddings | $0.10 | Free | — | — | |
stepfun/step-3.5-flash | text→text | $0.10 | $0.30 | — | — | |
z-ai/glm-4-32b | text→text | $0.10 | $0.10 | — | — | |
google/gemini-2.0-flash-001 | text→text | $0.10 | $0.40 | — | — | |
google/gemini-2.5-flash-lite-preview-09-2025 | text→text | $0.10 | $0.40 | — | — | |
google/gemini-2.5-flash-lite | text→text | $0.10 | $0.40 | — | — | |
meta-llama/llama-3.3-70b-instruct | text→text | $0.10 | $0.32 | — | — | |
mistralai/ministral-3b-2512 | text→text | $0.10 | $0.10 | — | — | |
reka/reka-edge Reka Edge is an extremely efficient 7B multimodal vision-language model that accepts image/video+text inputs and generates text outputs. This model is optimized specifically to deliver industry-leading performance in image understanding, video analysis, object detection, and agentic tool-use. | image+text+video→text | $0.10 | $0.10 | 16K | Mar 2026 | |
openai/gpt-4.1-nano For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding – even higher than GPT‑4o mini. It’s ideal for tasks like classification or autocompletion. | image+text+file→text | $0.10 | $0.40 | 1M | Apr 2025 | |
bytedance-seed/seed-2.0-mini Seed-2.0-mini targets latency-sensitive, high-concurrency, and cost-sensitive scenarios, emphasizing fast response and flexible inference deployment. It delivers performance comparable to ByteDance-Seed-1.6, supports 256k context, four reasoning effort modes (minimal/low/medium/high), multimodal understanding, and is optimized for lightweight tasks where cost and speed take priority. | text+image+video→text | $0.10 | $0.40 | 262K | Feb 2026 | |
rekaai/reka-flash-3 Reka Flash 3 is a general-purpose, instruction-tuned large language model with 21 billion parameters, developed by Reka. It excels at general chat, coding tasks, instruction-following, and function calling. Featuring a 32K context length and optimized through reinforcement learning (RLOO), it provides competitive performance comparable to proprietary models within a smaller parameter footprint. Ideal for low-latency, local, or on-device deployments, Reka Flash 3 is compact, supports efficient quantization (down to 11GB at 4-bit precision), and employs explicit reasoning tags ("<reasoning>") to indicate its internal thought process.
Reka Flash 3 is primarily an English model with limited multilingual understanding capabilities. The model weights are released under the Apache 2.0 license. | text→text | $0.10 | $0.20 | 66K | Mar 2025 | |
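Because Reka Flash 3 marks its internal thought process with explicit "<reasoning>" tags, callers typically split that block from the final answer before display. A minimal sketch of that post-processing; the raw string below is illustrative, standing in for an actual model response:

```python
# Minimal sketch: separate Reka Flash 3's explicit <reasoning> block from the answer.
# The `raw` string is illustrative; in practice it comes from the model response.
import re

raw = "<reasoning>The user wants a sum; 2 + 2 = 4.</reasoning>The answer is 4."

match = re.match(r"<reasoning>(.*?)</reasoning>\s*(.*)", raw, re.DOTALL)
if match:
    thoughts, answer = match.group(1).strip(), match.group(2).strip()
    print("thoughts:", thoughts)
    print("answer:  ", answer)
else:
    # No reasoning block emitted; treat the whole string as the answer.
    print("answer:  ", raw.strip())
```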
qwen/qwen3.5-9b Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and visual understanding in an efficient 9B-parameter architecture. It uses a unified vision-language design with early fusion of multimodal tokens, allowing the model to process and reason across text and images within the same context. | text+image+video→text | $0.10 | $0.15 | 262K | Mar 2026 | |
rekaai/reka-edge Reka Edge is an extremely efficient 7B multimodal vision-language model that accepts image/video+text inputs and generates text outputs. This model is optimized specifically to deliver industry-leading performance in image understanding, video analysis, object detection, and agentic tool-use. | image+text+video→text | $0.10 | $0.10 | 16K | — | |
mistralai/devstral-small Devstral Small 1.1 is a 24B parameter open-weight language model for software engineering agents, developed by Mistral AI in collaboration with All Hands AI. Finetuned from Mistral Small 3.1 and released under the Apache 2.0 license, it features a 128k token context window and supports both Mistral-style function calling and XML output formats.
Designed for agentic coding workflows, Devstral Small 1.1 is optimized for tasks such as codebase exploration, multi-file edits, and integration into autonomous development agents like OpenHands and Cline. It achieves 53.6% on SWE-Bench Verified, surpassing all other open models on this benchmark, while remaining lightweight enough to run on a single 4090 GPU or Apple silicon machine. The model uses a Tekken tokenizer with a 131k vocabulary and is deployable via vLLM, Transformers, Ollama, LM Studio, and other OpenAI-compatible runtimes.
| text→text | $0.10 | $0.30 | 131K | Jul 2025 | |
qwen/qwen3-vl-32b-instruct | text→text | $0.10 | $0.42 | — | — | |
mistralai/mistral-7b-instruct-v0.1 | text→text | $0.11 | $0.19 | — | — | |
qwen/qwen3-vl-8b-thinking | text→text | $0.12 | $1.36 | — | — | |
allenai/olmo-3-7b-think | text→text | $0.12 | $0.20 | — | — | |
qwen/qwen-2.5-72b-instruct | text→text | $0.12 | $0.39 | — | — | |
qwen/qwen3-coder-next | text→text | $0.12 | $0.75 | — | — | |
openai/text-embedding-3-large text-embedding-3-large is OpenAI's most capable embedding model for both English and non-English tasks. Embeddings are a numerical representation of text that can be used to measure the relatedness between two pieces of text. Embeddings are useful for search, clustering, recommendations, anomaly detection, and classification tasks. | text→embeddings | $0.13 | Free | 8K | Oct 2025 | |
qwen/qwen3-vl-30b-a3b-instruct | text→text | $0.13 | $0.52 | — | — | |
nousresearch/hermes-4-70b | text→text | $0.13 | $0.40 | — | — | |
qwen/qwen3-vl-30b-a3b-thinking | text→text | $0.13 | $1.56 | — | — | |
z-ai/glm-4.5-air | text→text | $0.13 | $0.85 | — | — | |
google/gemma-4-26b-a4b-it Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference — delivering near-31B quality at a fraction of the compute cost. Supports multimodal input including text, images, and video (up to 60s at 1fps). Features a 256K token context window, native function calling, configurable thinking/reasoning mode, and structured output support. Released under Apache 2.0. | image+text+video→text | $0.13 | $0.40 | 262K | — | |
baidu/ernie-4.5-vl-28b-a3b | text→text | $0.14 | $0.56 | — | — | |
nousresearch/hermes-2-pro-llama-3-8b | text→text | $0.14 | $0.14 | — | — | |
tencent/hunyuan-a13b-instruct | text→text | $0.14 | $0.57 | — | — | |
google/gemma-4-31b-it Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. Features a 256K token context window, configurable thinking/reasoning mode, native function calling, and multilingual support across 140+ languages. Strong on coding, reasoning, and document understanding tasks. Apache 2.0 license. | image+text+video→text | $0.14 | $0.40 | 262K | — | |
qwen/qwen3-235b-a22b-thinking-2507 | text→text | $0.15 | $1.50 | — | — | |
allenai/olmo-3.1-32b-think | text→text | $0.15 | $0.50 | — | — | |
upstage/solar-pro-3 Solar Pro 3 is Upstage's powerful Mixture-of-Experts (MoE) language model. With 102B total parameters and 12B active parameters per forward pass, it delivers exceptional performance while maintaining computational efficiency. Optimized for Korean with English and Japanese support. | text→text | $0.15 | $0.60 | 128K | Jan 2026 | |
openai/gpt-4o-mini-2024-07-18 | text→text | $0.15 | $0.60 | — | — | |
openai/gpt-4o-mini | text→text | $0.15 | $0.60 | — | — | |
openai/gpt-4o-mini-search-preview | text→text | $0.15 | $0.60 | — | — | |
google/gemini-embedding-001 | text→embeddings | $0.15 | Free | — | — | |
mistralai/codestral-embed-2505 | text→embeddings | $0.15 | Free | — | — | |
essentialai/rnj-1-instruct | text→text | $0.15 | $0.15 | — | — | |
allenai/olmo-3-32b-think | text→text | $0.15 | $0.50 | — | — | |
meta-llama/llama-4-maverick | text→text | $0.15 | $0.60 | — | — | |
mistralai/ministral-8b-2512 | text→text | $0.15 | $0.15 | — | — | |
cohere/command-r-08-2024 | text→text | $0.15 | $0.60 | — | — | |
deepseek/deepseek-chat-v3.1 | text→text | $0.15 | $0.75 | — | — | |
qwen/qwen3-next-80b-a3b-thinking | text→text | $0.15 | $1.20 | — | — | |
qwen/qwq-32b | text→text | $0.15 | $0.40 | — | — | |
mistralai/mistral-small-2603 Mistral Small 4 is the next major release in the Mistral Small family, unifying the capabilities of several flagship Mistral models into a single system. It combines strong reasoning from Magistral, multimodal understanding from Pixtral, and agentic coding capabilities from Devstral, enabling one model to handle complex analysis, software development, and visual tasks within the same workflow. | text+image→text | $0.15 | $0.60 | 262K | Mar 2026 | |
qwen/qwen3.5-35b-a3b The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a sparse mixture-of-experts model, achieving higher inference efficiency. Its overall performance is comparable to that of the Qwen3.5-27B. | text+image+video→text | $0.16 | $1.30 | 262K | Feb 2026 | |
thedrummer/rocinante-12b | text→text | $0.17 | $0.43 | — | — | |
meta-llama/llama-guard-4-12b | text→text | $0.18 | $0.18 | — | — | |
deepseek/deepseek-chat-v3-0324 | text→text | $0.19 | $0.87 | — | — | |
qwen/qwen3.5-27b The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechanism, delivering fast response times while balancing inference speed and performance. Its overall capabilities are comparable to those of the Qwen3.5-122B-A10B. | text+image+video→text | $0.20 | $1.56 | 262K | Feb 2026 | |
mistralai/mistral-7b-instruct-v0.2 | text→text | $0.20 | $0.20 | — | — | |
openai/gpt-5.4-nano GPT-5.4 nano is the most lightweight and cost-efficient variant of the GPT-5.4 family, optimized for speed-critical and high-volume tasks. It supports text and image inputs and is designed for low-latency use cases such as classification, data extraction, ranking, and sub-agent execution.
The model prioritizes responsiveness and efficiency over deep reasoning, making it ideal for pipelines that require fast, reliable outputs at scale. GPT-5.4 nano is well suited for background tasks, real-time systems, and distributed agent architectures where minimizing cost and latency is essential. | file+image+text→text | $0.20 | $1.25 | 400K | Mar 2026 | |
meituan/longcat-flash-chat | text→text | $0.20 | $0.80 | — | — | |
allenai/molmo-2-8b | text→text | $0.20 | $0.20 | — | — | |
allenai/olmo-3.1-32b-instruct | text→text | $0.20 | $0.60 | — | — | |
meta-llama/llama-guard-2-8b | text→text | $0.20 | $0.20 | — | — | |
minimax/minimax-01 | text→text | $0.20 | $1.10 | — | — | |
mistralai/ministral-14b-2512 | text→text | $0.20 | $0.20 | — | — | |
mistralai/mistral-7b-instruct | text→text | $0.20 | $0.20 | — | — | |
mistralai/mistral-7b-instruct-v0.3 | text→text | $0.20 | $0.20 | — | — | |
mistralai/mistral-saba | text→text | $0.20 | $0.60 | — | — | |
prime-intellect/intellect-3 | text→text | $0.20 | $1.10 | — | — | |
qwen/qwen-2.5-vl-7b-instruct | text→text | $0.20 | $0.20 | — | — | |
qwen/qwen-2.5-coder-32b-instruct | text→text | $0.20 | $0.20 | — | — | |
qwen/qwen2.5-vl-32b-instruct | text→text | $0.20 | $0.60 | — | — | |
qwen/qwen3-vl-235b-a22b-instruct | text→text | $0.20 | $0.88 | — | — | |
x-ai/grok-4-fast | text→text | $0.20 | $0.50 | — | — | |
x-ai/grok-4.1-fast | text→text | $0.20 | $0.50 | — | — | |
x-ai/grok-code-fast-1 | text→text | $0.20 | $1.50 | — | — | |
kwaipilot/kat-coder-pro | text→text | $0.21 | $0.83 | — | — | |
deepseek/deepseek-v3.1-terminus:exacto | text→text | $0.21 | $0.79 | — | — | |
deepseek/deepseek-v3.1-terminus | text→text | $0.21 | $0.79 | — | — | |
qwen/qwen-vl-plus | text→text | $0.21 | $0.63 | — | — | |
qwen/qwen3-coder | text→text | $0.22 | $1.00 | — | — | |
qwen/qwen3-coder:exacto | text→text | $0.22 | $1.80 | — | — | |
arcee-ai/trinity-large-thinking Trinity Large Thinking is a powerful open-source reasoning model from the team at Arcee AI. It shows strong performance on PinchBench, agentic workloads, and reasoning tasks. It is free in OpenClaw for the first five days. Launch video: https://youtu.be/Gc82AXLa0Rg?si=4RLn6WBz33qT--B7 | text→text | $0.22 | $0.85 | 262K | — | |
google/gemini-3.1-flash-lite-preview Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volume use cases. It outperforms Gemini 2.5 Flash Lite on overall quality and approaches Gemini 2.5 Flash performance across key capabilities. Improvements span audio input/ASR, RAG snippet ranking, translation, data extraction, and code completion. Supports full thinking levels (minimal, low, medium, high) for fine-grained cost/performance trade-offs. Priced at half the cost of Gemini 3 Flash. | text+image+video+file+audio→text | $0.25 | $1.50 | 1M | Mar 2026 | |
openai/gpt-5.1-codex-mini GPT-5.1-Codex-Mini is a smaller and faster version of GPT-5.1-Codex. | image+text→text | $0.25 | $2.00 | 400K | Nov 2025 | |
openai/gpt-5-mini GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning tasks. It provides the same instruction-following and safety-tuning benefits as GPT-5, but with reduced latency and cost. GPT-5 Mini is the successor to OpenAI's o4-mini model. | text+image+file→text | $0.25 | $2.00 | 400K | Aug 2025 | |
inception/mercury | text→text | $0.25 | $1.00 | — | — | |
inception/mercury-coder | text→text | $0.25 | $1.00 | — | — | |
bytedance-seed/seed-1.6 | text→text | $0.25 | $2.00 | — | — | |
anthropic/claude-3-haiku | text→text | $0.25 | $1.25 | — | — | |
inception/mercury-2 Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM).
Instead of generating tokens sequentially, Mercury 2 produces and refines multiple tokens in parallel, achieving >1,000 tokens/sec on standard GPUs. Mercury 2 is 5x+ faster than leading speed-optimized LLMs like Claude 4.5 Haiku and GPT 5 Mini, at a fraction of the cost.
Mercury 2 supports tunable reasoning levels, 128K context, native tool use, and schema-aligned JSON output. Built for coding workflows where latency compounds, real-time voice/search, and agent loops. OpenAI API compatible. Read more in the [blog post](https://www.inceptionlabs.ai/blog/introducing-mercury-2). | text→text | $0.25 | $0.75 | 128K | Mar 2026 | |
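Since Mercury 2's selling point is latency, streaming is the natural way to consume it. A minimal sketch over an OpenAI-compatible endpoint, here assumed to be OpenRouter (the model slug matches the row above; the base URL, `OPENROUTER_API_KEY` variable, and prompt are assumptions for illustration):

```python
# Minimal sketch: stream tokens from a latency-focused model as they arrive.
# Assumes the `openai` Python SDK pointed at OpenRouter's OpenAI-compatible API
# and OPENROUTER_API_KEY set in the environment.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

stream = client.chat.completions.create(
    model="inception/mercury-2",
    messages=[{"role": "user", "content": "Write a haiku about parallel decoding."}],
    stream=True,  # with a >1,000 tok/s model, most remaining latency is network overhead
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```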
bytedance-seed/seed-2.0-lite Seed-2.0-Lite is a balanced model designed for high-frequency enterprise workloads, optimizing for both capability and cost. Its overall performance surpasses the previous-generation Seed-1.8. It is well-suited for production tasks such as unstructured information processing, text content creation, search and recommendation, and data analysis. The model supports long-context processing, multi-source information fusion, multi-step instruction execution, and high-fidelity structured outputs—delivering stable quality while significantly reducing cost. | text+image+video→text | $0.25 | $2.00 | 262K | Mar 2026 | |
minimax/minimax-m2 | text→text | $0.26 | $1.00 | — | — | |
qwen/qwen3.5-122b-a10b The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. In terms of overall performance, this model is second only to Qwen3.5-397B-A17B. Its text capabilities significantly outperform those of Qwen3-235B-2507, and its visual capabilities surpass those of Qwen3-VL-235B. | text+image+video→text | $0.26 | $2.08 | 262K | Feb 2026 | |
qwen/qwen3-vl-235b-a22b-thinking | text→text | $0.26 | $2.60 | — | — | |
deepseek/deepseek-v3.2 | text→text | $0.26 | $0.38 | — | — | |
deepseek/deepseek-v3.2-exp | text→text | $0.27 | $0.41 | — | — | |
minimax/minimax-m2.1 | text→text | $0.27 | $0.95 | — | — | |
nex-agi/deepseek-v3.1-nex-n1 | text→text | $0.27 | $1.00 | — | — | |
baidu/ernie-4.5-300b-a47b | text→text | $0.28 | $1.10 | — | — | |
deepseek/deepseek-r1-distill-qwen-32b | text→text | $0.29 | $0.29 | — | — | |
minimax/minimax-m2.7 MiniMax-M2.7 is a next-generation large language model designed for autonomous, real-world productivity and continuous improvement. Built to actively participate in its own evolution, M2.7 integrates advanced agentic capabilities through multi-agent collaboration, enabling it to plan, execute, and refine complex tasks across dynamic environments.
Trained for production-grade performance, M2.7 handles workflows such as live debugging, root cause analysis, financial modeling, and full document generation across Word, Excel, and PowerPoint. It delivers strong results on benchmarks including 56.2% on SWE-Pro and 57.0% on Terminal Bench 2, while achieving a 1495 ELO on GDPval-AA, setting a new standard for multi-agent systems operating in real-world digital workflows. | text→text | $0.30 | $1.20 | 205K | Mar 2026 | |
minimax/minimax-m2.5 | text→text | $0.30 | $1.10 | — | — | |
thedrummer/cydonia-24b-v4.1 | text→text | $0.30 | $0.50 | — | — | |
x-ai/grok-3-mini-beta | text→text | $0.30 | $0.50 | — | — | |
x-ai/grok-3-mini | text→text | $0.30 | $0.50 | — | — | |
kwaipilot/kat-coder-pro-v2 KAT-Coder-Pro V2 is the latest high-performance model in KwaiKAT’s KAT-Coder series, designed for complex enterprise-grade software engineering and SaaS integration. It builds on the agentic coding strengths of earlier versions, with a focus on large-scale production environments, multi-system coordination, and seamless integration across modern software stacks, while also supporting web aesthetics generation to produce production-grade landing pages and presentation decks. | text→text | $0.30 | $1.20 | 256K | Mar 2026 | |
google/gemini-2.5-flash | text→text | $0.30 | $2.50 | — | — | |
minimax/minimax-m2-her | text→text | $0.30 | $1.20 | — | — | |
mistralai/codestral-2508 | text→text | $0.30 | $0.90 | — | — | |
amazon/nova-2-lite-v1 | text→text | $0.30 | $2.50 | — | — | |
nousresearch/hermes-3-llama-3.1-70b | text→text | $0.30 | $0.30 | — | — | |
z-ai/glm-4.6v | text→text | $0.30 | $0.90 | — | — | |
qwen/qwen3-coder-flash | text→text | $0.30 | $1.50 | — | — | |
deepseek/deepseek-chat | text→text | $0.32 | $0.89 | — | — | |
qwen/qwen3.6-plus Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling strong scalability and high-performance inference. Compared to the 3.5 series, it delivers... | text+image+video→text | $0.33 | $1.95 | 1M | — | |
mistralai/mistral-small-3.1-24b-instruct | text→text | $0.35 | $0.56 | — | — | |
z-ai/glm-4.6 | text→text | $0.35 | $1.71 | — | — | |
z-ai/glm-4.7 | text→text | $0.38 | $1.70 | — | — | |
xiaomi/mimo-v2-omni MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability - visual grounding, multi-step planning, tool use, and code execution - making it well-suited for complex real-world tasks that span modalities. 256K context window. | text+audio+image+video→text | $0.40 | $2.00 | 262K | Mar 2026 | |
moonshotai/kimi-k2-0905 | text→text | $0.40 | $2.00 | — | — | |
qwen/qwen-plus-2025-07-28 | text→text | $0.40 | $1.20 | — | — | |
qwen/qwen3.5-plus-02-15 | text→text | $0.40 | $2.40 | — | — | |
deepseek/deepseek-r1-0528 | text→text | $0.40 | $1.75 | — | — | |
deepseek/deepseek-v3.2-speciale | text→text | $0.40 | $1.20 | — | — | |
meta-llama/llama-3.1-70b-instruct | text→text | $0.40 | $0.40 | — | — | |
minimax/minimax-m1 | text→text | $0.40 | $2.20 | — | — | |
mistralai/devstral-2512 | text→text | $0.40 | $2.00 | — | — | |
mistralai/devstral-medium | text→text | $0.40 | $2.00 | — | — | |
mistralai/mistral-medium-3 | text→text | $0.40 | $2.00 | — | — | |
qwen/qwen-plus | text→text | $0.40 | $1.20 | — | — | |
qwen/qwen-plus-2025-07-28:thinking | text→text | $0.40 | $1.20 | — | — | |
thedrummer/unslopnemo-12b | text→text | $0.40 | $0.40 | — | — | |
mistralai/mistral-medium-3.1 Mistral Medium 3.1 is an updated version of Mistral Medium 3, which is a high-performance enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced operational cost. It balances state-of-the-art reasoning and multimodal performance with 8× lower cost compared to traditional large models, making it suitable for scalable deployments across professional and industrial use cases.
The model excels in domains such as coding, STEM reasoning, and enterprise adaptation. It supports hybrid, on-prem, and in-VPC deployments and is optimized for integration into custom workflows. Mistral Medium 3.1 offers competitive accuracy relative to larger models like Claude Sonnet 3.5/3.7, Llama 4 Maverick, and Command R+, while maintaining broad compatibility across cloud environments. | text+image→text | $0.40 | $2.00 | 131K | Aug 2025 | |
openai/gpt-4.1-mini GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider’s polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints. | image+text+file→text | $0.40 | $1.60 | 1M | Apr 2025 | |
baidu/ernie-4.5-vl-424b-a47b | text→text | $0.42 | $1.25 | — | — | |
z-ai/glm-4.6:exacto | text→text | $0.44 | $1.76 | — | — | |
moonshotai/kimi-k2.5 | text→text | $0.45 | $2.20 | — | — | |
undi95/remm-slerp-l2-13b | text→text | $0.45 | $0.65 | — | — | |
qwen/qwen3-235b-a22b | text→text | $0.46 | $1.82 | — | — | |
moonshotai/kimi-k2-thinking | text→text | $0.47 | $2.00 | — | — | |
moonshotai/kimi-k2 | text→text | $0.50 | $2.40 | — | — | |
google/gemini-3-flash-preview | text→text | $0.50 | $3.00 | — | — | |
mistralai/mistral-large-2512 | text→text | $0.50 | $1.50 | — | — | |
openai/gpt-3.5-turbo | text→text | $0.50 | $1.50 | — | — | |
meta-llama/llama-3-70b-instruct | text→text | $0.51 | $0.74 | — | — | |
mistralai/mixtral-8x7b-instruct | text→text | $0.54 | $0.54 | — | — | |
qwen/qwen3.5-397b-a17b | text→text | $0.55 | $3.50 | — | — | |
thedrummer/skyfall-36b-v2 | text→text | $0.55 | $0.80 | — | — | |
z-ai/glm-4.5 | text→text | $0.55 | $2.00 | — | — | |
moonshotai/kimi-k2-0905:exacto | text→text | $0.60 | $2.50 | — | — | |
nvidia/llama-3.1-nemotron-ultra-253b-v1 | text→text | $0.60 | $1.80 | — | — | |
writer/palmyra-x5 | text→text | $0.60 | $6.00 | — | — | |
z-ai/glm-4.5v | text→text | $0.60 | $1.80 | — | — | |
microsoft/wizardlm-2-8x22b | text→text | $0.62 | $0.62 | — | — | |
google/gemma-2-27b-it | text→text | $0.65 | $0.65 | — | — | |
sao10k/l3.3-euryale-70b | text→text | $0.65 | $0.75 | — | — | |
sao10k/l3.1-euryale-70b | text→text | $0.65 | $0.75 | — | — | |
deepseek/deepseek-r1 | text→text | $0.70 | $2.50 | — | — | |
deepseek/deepseek-r1-distill-llama-70b | text→text | $0.70 | $0.80 | — | — | |
aion-labs/aion-1.0-mini Aion-1.0-Mini is a 32B-parameter model distilled from DeepSeek-R1, designed for strong performance in reasoning domains such as mathematics, coding, and logic. It is a modified variant of a FuseAI model that outperforms R1-Distill-Qwen-32B and R1-Distill-Llama-70B, with benchmark results available on its [Hugging Face page](https://huggingface.co/FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview), independently replicated for verification. | text→text | $0.70 | $1.40 | 131K | Feb 2025 | |
openai/gpt-5.4-mini GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient model optimized for high-throughput workloads. It supports text and image inputs with strong performance across reasoning, coding, and tool use, while reducing latency and cost for large-scale deployments.
The model is designed for production environments that require a balance of capability and efficiency, making it well suited for chat applications, coding assistants, and agent workflows that operate at scale. GPT-5.4 mini delivers reliable instruction following, solid multi-step reasoning, and consistent performance across diverse tasks with improved cost efficiency. | file+image+text→text | $0.75 | $4.50 | 400K | Mar 2026 | |
mancer/weaver | text→text | $0.75 | $1.00 | — | — | |
morph/morph-v3-fast | text→text | $0.80 | $1.20 | — | — | |
qwen/qwen2.5-vl-72b-instruct | text→text | $0.80 | $0.80 | — | — | |
eleutherai/llemma_7b | text→text | $0.80 | $1.20 | — | — | |
aion-labs/aion-rp-llama-3.1-8b Aion-RP-Llama-3.1-8B ranks the highest in the character evaluation portion of the RPBench-Auto benchmark, a roleplaying-specific variant of Arena-Hard-Auto, where LLMs evaluate each other’s responses. It is a fine-tuned base model rather than an instruct model, designed to produce more natural and varied writing. | text→text | $0.80 | $1.60 | 33K | Feb 2025 | |
alfredpros/codellama-7b-instruct-solidity | text→text | $0.80 | $1.20 | — | — | |
amazon/nova-pro-v1 | text→text | $0.80 | $3.20 | — | — | |
anthropic/claude-3.5-haiku | text→text | $0.80 | $4.00 | — | — | |
qwen/qwen-vl-max | text→text | $0.80 | $3.20 | — | — | |
aion-labs/aion-2.0 Aion-2.0 is a variant of DeepSeek V3.2 optimized for immersive roleplaying and storytelling. It is particularly strong at introducing tension, crises, and conflict into stories, making narratives feel more engaging. It also handles mature and darker themes with more nuance and depth. | text→text | $0.80 | $1.60 | 131K | Feb 2026 | |
switchpoint/router | text→text | $0.85 | $3.40 | — | — | |
morph/morph-v3-large | text→text | $0.90 | $1.90 | — | — | |
z-ai/glm-5 | text→text | $0.95 | $2.55 | — | — | |
z-ai/glm-5-turbo GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong performance in agent-driven environments such as OpenClaw scenarios. It is deeply optimized for real-world agent workflows involving long execution chains, with improved complex instruction decomposition, tool use, scheduled and persistent execution, and overall stability across extended tasks. | text→text | $0.96 | $3.20 | 203K | Mar 2026 | |
neversleep/noromaid-20b | text→text | $1.00 | $1.75 | — | — | |
xiaomi/mimo-v2-pro MiMo-V2-Pro is Xiaomi's flagship foundation model, featuring over 1T total parameters and a 1M context length, deeply optimized for agentic scenarios. It is highly adaptable to general agent frameworks like OpenClaw. It ranks among the global top tier in the standard PinchBench and ClawBench benchmarks, with perceived performance approaching that of Opus 4.6. MiMo-V2-Pro is designed to serve as the brain of agent systems, orchestrating complex workflows, driving production engineering tasks, and delivering results reliably. | text→text | $1.00 | $3.00 | 1M | Mar 2026 | |
anthropic/claude-haiku-4.5 | text→text | $1.00 | $5.00 | — | — | |
nousresearch/hermes-3-llama-3.1-405b | text→text | $1.00 | $1.00 | — | — | |
nousresearch/hermes-4-405b | text→text | $1.00 | $3.00 | — | — | |
openai/gpt-3.5-turbo-0613 | text→text | $1.00 | $2.00 | — | — | |
perplexity/sonar | text→text | $1.00 | $1.00 | — | — | |
qwen/qwen3-coder-plus | text→text | $1.00 | $5.00 | — | — | |
relace/relace-search | text→text | $1.00 | $3.00 | — | — | |
openai/o3-mini-high OpenAI o3-mini-high is the same model as [o3-mini](/openai/o3-mini) with reasoning_effort set to high.
o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, particularly excelling in science, mathematics, and coding. The model features three adjustable reasoning effort levels and supports key developer capabilities including function calling, structured outputs, and streaming, though it does not include vision processing capabilities.
The model demonstrates significant improvements over its predecessor, with expert testers preferring its responses 56% of the time and noting a 39% reduction in major errors on complex questions. With medium reasoning effort settings, o3-mini matches the performance of the larger o1 model on challenging reasoning evaluations like AIME and GPQA, while maintaining lower latency and cost. | text+file→text | $1.10 | $4.40 | 200K | Feb 2025 | |
openai/o3-mini OpenAI o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, particularly excelling in science, mathematics, and coding.
This model supports the `reasoning_effort` parameter, which can be set to "high", "medium", or "low" to control the thinking time of the model. The default is "medium". OpenRouter also offers the model slug `openai/o3-mini-high` to default the parameter to "high".
The model features three adjustable reasoning effort levels and supports key developer capabilities including function calling, structured outputs, and streaming, though it does not include vision processing capabilities.
The model demonstrates significant improvements over its predecessor, with expert testers preferring its responses 56% of the time and noting a 39% reduction in major errors on complex questions. With medium reasoning effort settings, o3-mini matches the performance of the larger o1 model on challenging reasoning evaluations like AIME and GPQA, while maintaining lower latency and cost. | text+file→text | $1.10 | $4.40 | 200K | Jan 2025 | |
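The `reasoning_effort` parameter described above is a plain field on the Chat Completions request. A minimal sketch with the official `openai` Python SDK and an `OPENAI_API_KEY` environment variable; the prompt is illustrative:

```python
# Minimal sketch: dial o3-mini's thinking time up via reasoning_effort.
# Assumes the official `openai` Python SDK and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

resp = client.chat.completions.create(
    model="o3-mini",
    reasoning_effort="high",  # "low" | "medium" (default) | "high"
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
)
print(resp.choices[0].message.content)
```

As the row above notes, requesting `openai/o3-mini-high` is equivalent to sending `openai/o3-mini` with the effort pinned to "high".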
openai/o4-mini OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-efficient performance while retaining strong multimodal and agentic capabilities. It supports tool use and demonstrates competitive reasoning and coding performance across benchmarks like AIME (99.5% with Python) and SWE-bench, outperforming its predecessor o3-mini and even approaching o3 in some domains.
Despite its smaller size, o4-mini exhibits high accuracy in STEM tasks, visual problem solving (e.g., MathVista, MMMU), and code editing. It is especially well-suited for high-throughput scenarios where latency or cost is critical. Thanks to its efficient architecture and refined reinforcement learning training, o4-mini can chain tools, generate structured outputs, and solve multi-step tasks with minimal delay—often in under a minute. | image+text+file→text | $1.10 | $4.40 | 200K | Apr 2025 | |
openai/o4-mini-high OpenAI o4-mini-high is the same model as [o4-mini](/openai/o4-mini) with reasoning_effort set to high.
OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-efficient performance while retaining strong multimodal and agentic capabilities. It supports tool use and demonstrates competitive reasoning and coding performance across benchmarks like AIME (99.5% with Python) and SWE-bench, outperforming its predecessor o3-mini and even approaching o3 in some domains.
Despite its smaller size, o4-mini exhibits high accuracy in STEM tasks, visual problem solving (e.g., MathVista, MMMU), and code editing. It is especially well-suited for high-throughput scenarios where latency or cost is critical. Thanks to its efficient architecture and refined reinforcement learning training, o4-mini can chain tools, generate structured outputs, and solve multi-step tasks with minimal delay—often in under a minute. | image+text+file→text | $1.10 | $4.40 | 200K | Apr 2025 | |
nvidia/llama-3.1-nemotron-70b-instruct | text→text | $1.20 | $1.20 | — | — | |
qwen/qwen3-max | text→text | $1.20 | $6.00 | — | — | |
qwen/qwen3-max-thinking | text→text | $1.20 | $6.00 | — | — | |
z-ai/glm-5v-turbo GLM-5V-Turbo is Z.ai’s first native multimodal agent foundation model, built for vision-based coding and agent-driven tasks. It natively handles image, video, and text inputs, excels at long-horizon planning, complex coding, and task execution, and works seamlessly with agents to complete the full loop of “perceive → plan → execute”. | image+text+video→text | $1.20 | $4.00 | 203K | — | |
openai/gpt-5.3-codex GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier software engineering performance of GPT-5.2-Codex with the broader reasoning and professional knowledge capabilities of GPT-5.2. It achieves state-of-the-art results on SWE-Bench Pro and strong performance on Terminal-Bench 2.0 and OSWorld-Verified, reflecting improved multi-language coding, terminal proficiency, and real-world computer-use skills. The model is optimized for long-running, tool-using workflows and supports interactive steering during execution, making it suitable for complex development tasks, debugging, deployment, and iterative product work.
Beyond coding, GPT-5.3-Codex performs strongly on structured knowledge-work benchmarks such as GDPval, supporting tasks like document drafting, spreadsheet analysis, slide creation, and operational research across domains. It is trained with enhanced cybersecurity awareness, including vulnerability identification capabilities, and deployed with additional safeguards for high-risk use cases. Compared to prior Codex models, it is more token-efficient and approximately 25% faster, targeting professional end-to-end workflows that span reasoning, execution, and computer interaction. | text+image→text | $1.22 | $9.80 | 400K | Feb 2026 | |
google/gemini-2.5-pro-preview-05-06 | text→text | $1.25 | $10.00 | — | — | |
openai/gpt-5 GPT-5 is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that require step-by-step reasoning, instruction following, and accuracy in high-stakes use cases. It supports test-time routing features and advanced prompt understanding, including user-specified intent like "think hard about this." Improvements include reductions in hallucination and sycophancy, and better performance in coding, writing, and health-related tasks. | text+image+file→text | $1.25 | $10.00 | 400K | Aug 2025 | |
openai/gpt-5-chat GPT-5 Chat is designed for advanced, natural, multimodal, and context-aware conversations for enterprise applications. | file+image+text→text | $1.25 | $10.00 | 128K | Aug 2025 | |
openai/gpt-5.1 GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronger general-purpose reasoning, improved instruction adherence, and a more natural conversational style compared to GPT-5. It uses adaptive reasoning to allocate computation dynamically, responding quickly to simple queries while spending more depth on complex tasks. The model produces clearer, more grounded explanations with reduced jargon, making it easier to follow even on technical or multi-step problems.
Built for broad task coverage, GPT-5.1 delivers consistent gains across math, coding, and structured analysis workloads, with more coherent long-form answers and improved tool-use reliability. It also features refined conversational alignment, enabling warmer, more intuitive responses without compromising precision. GPT-5.1 serves as the primary full-capability successor to GPT-5. | image+text+file→text | $1.25 | $10.00 | 400K | Nov 2025 | |
openai/gpt-5.1-chat GPT-5.1 Chat (AKA Instant) is the fast, lightweight member of the 5.1 family, optimized for low-latency chat while retaining strong general intelligence. It uses adaptive reasoning to selectively “think” on harder queries, improving accuracy on math, coding, and multi-step tasks without slowing down typical conversations. The model is warmer and more conversational by default, with better instruction following and more stable short-form reasoning. GPT-5.1 Chat is designed for high-throughput, interactive workloads where responsiveness and consistency matter more than deep deliberation. | file+image+text→text | $1.25 | $10.00 | 128K | Nov 2025 | |
openai/gpt-5-codex GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks. The model supports building projects from scratch, feature development, debugging, large-scale refactoring, and code review. Compared to GPT-5, Codex is more steerable, adheres closely to developer instructions, and produces cleaner, higher-quality code outputs. Reasoning effort can be adjusted with the `reasoning.effort` parameter. Read the [docs here](https://openrouter.ai/docs/use-cases/reasoning-tokens#reasoning-effort-level)
Codex integrates into developer environments including the CLI, IDE extensions, GitHub, and cloud tasks. It adapts reasoning effort dynamically—providing fast responses for small tasks while sustaining extended multi-hour runs for large projects. The model is trained to perform structured code reviews, catching critical flaws by reasoning over dependencies and validating behavior against tests. It also supports multimodal inputs such as images or screenshots for UI development and integrates tool use for search, dependency installation, and environment setup. Codex is intended specifically for agentic coding applications. | text+image→text | $1.25 | $10.00 | 400K | Sep 2025 | |
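Through OpenRouter, the `reasoning.effort` knob referenced above is a nested request field rather than a top-level SDK argument, so it travels via `extra_body`. A minimal sketch, assuming the reasoning schema from the linked OpenRouter docs and an `OPENROUTER_API_KEY` environment variable; the prompt is illustrative:

```python
# Minimal sketch: request higher reasoning effort from GPT-5-Codex via OpenRouter.
# Assumes the `openai` Python SDK pointed at OpenRouter's OpenAI-compatible API.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

resp = client.chat.completions.create(
    model="openai/gpt-5-codex",
    messages=[{"role": "user", "content": "Write a function that reverses a linked list in Python."}],
    extra_body={"reasoning": {"effort": "high"}},  # trades latency for deeper planning
)
print(resp.choices[0].message.content)
```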
openai/gpt-5.1-codex GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks. The model supports building projects from scratch, feature development, debugging, large-scale refactoring, and code review. Compared to GPT-5.1, Codex is more steerable, adheres closely to developer instructions, and produces cleaner, higher-quality code outputs. Reasoning effort can be adjusted with the `reasoning.effort` parameter. Read the [docs here](https://openrouter.ai/docs/use-cases/reasoning-tokens#reasoning-effort-level)
Codex integrates into developer environments including the CLI, IDE extensions, GitHub, and cloud tasks. It adapts reasoning effort dynamically—providing fast responses for small tasks while sustaining extended multi-hour runs for large projects. The model is trained to perform structured code reviews, catching critical flaws by reasoning over dependencies and validating behavior against tests. It also supports multimodal inputs such as images or screenshots for UI development and integrates tool use for search, dependency installation, and environment setup. Codex is intended specifically for agentic coding applications. | text+image→text | $1.25 | $10.00 | 400K | Nov 2025 | |
google/gemini-2.5-pro | text→text | $1.25 | $10.00 | — | — | |
google/gemini-2.5-pro-preview | text→text | $1.25 | $10.00 | — | — | |
deepcogito/cogito-v2.1-671b | text→text | $1.25 | $1.25 | — | — | |
openai/gpt-5.1-codex-max GPT-5.1-Codex-Max is OpenAI’s latest agentic coding model, designed for long-running, high-context software development tasks. It is based on an updated version of the 5.1 reasoning stack and trained on agentic workflows spanning software engineering, mathematics, and research.
GPT-5.1-Codex-Max delivers faster performance, improved reasoning, and higher token efficiency across the development lifecycle. | text+image→text | $1.25 | $10.00 | 400K | Dec 2025 | |
z-ai/glm-5.1 GLM-5.1 delivers a major leap in coding capability, with particularly significant gains in handling long-horizon tasks. Unlike previous models built around minute-level interactions, GLM-5.1 can work independently and continuously on... | text→text | $1.26 | $3.96 | 203K | — | |
sao10k/l3-euryale-70b | text→text | $1.48 | $1.48 | — | — | |
openai/gpt-3.5-turbo-instruct | text→text | $1.50 | $2.00 | — | — | |
qwen/qwen-max | text→text | $1.60 | $6.40 | — | — | |
openai/gpt-5.2-codex GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks. The model supports building projects from scratch, feature development, debugging, large-scale refactoring, and code review. Compared to GPT-5.1-Codex, 5.2-Codex is more steerable, adheres closely to developer instructions, and produces cleaner, higher-quality code outputs. Reasoning effort can be adjusted with the `reasoning.effort` parameter. Read the [docs here](https://openrouter.ai/docs/use-cases/reasoning-tokens#reasoning-effort-level)
Codex integrates into developer environments including the CLI, IDE extensions, GitHub, and cloud tasks. It adapts reasoning effort dynamically—providing fast responses for small tasks while sustaining extended multi-hour runs for large projects. The model is trained to perform structured code reviews, catching critical flaws by reasoning over dependencies and validating behavior against tests. It also supports multimodal inputs such as images or screenshots for UI development and integrates tool use for search, dependency installation, and environment setup. Codex is intended specifically for agentic coding applications. | text+image→text | $1.75 | $14.00 | 400K | Jan 2026 | |
openai/gpt-5.2-chat GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, optimized for low-latency chat while retaining strong general intelligence. It uses adaptive reasoning to selectively “think” on harder queries, improving accuracy on math, coding, and multi-step tasks without slowing down typical conversations. The model is warmer and more conversational by default, with better instruction following and more stable short-form reasoning. GPT-5.2 Chat is designed for high-throughput, interactive workloads where responsiveness and consistency matter more than deep deliberation. | file+image+text→text | $1.75 | $14.00 | 128K | Dec 2025 | |
openai/gpt-5.2 GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long-context performance compared to GPT-5.1. It uses adaptive reasoning to allocate computation dynamically, responding quickly to simple queries while spending more depth on complex tasks.
Built for broad task coverage, GPT-5.2 delivers consistent gains across math, coding, science, and tool-calling workloads, with more coherent long-form answers and improved tool-use reliability. | file+image+text→text | $1.75 | $14.00 | 400K | Dec 2025 | |
openai/gpt-5.3-chat GPT-5.3 Chat is an update to ChatGPT's most-used model that makes everyday conversations smoother, more useful, and more directly helpful. It delivers more accurate answers with better contextualization and significantly reduces unnecessary refusals, caveats, and overly cautious phrasing that can interrupt conversational flow. | text+image+file→text | $1.75 | $14.00 | 128K | Mar 2026 | |
google/gemini-3.1-pro-preview | text→text | $2.00 | $12.00 | — | — | |
ai21/jamba-large-1.7 Jamba Large 1.7 is the latest model in the Jamba open family, offering improvements in grounding, instruction-following, and overall efficiency. Built on a hybrid SSM-Transformer architecture with a 256K context window, it delivers more accurate, contextually grounded responses and better steerability than previous versions. | text→text | $2.00 | $8.00 | 256K | Aug 2025 | |
google/gemini-3.1-pro-preview-customtools Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves tool selection behavior by preventing overuse of a general bash tool when more efficient third-party or user-defined functions are available. This specialized preview endpoint significantly increases function calling reliability and ensures the model selects the most appropriate tool in coding agents and complex, multi-tool workflows.
It retains the core strengths of Gemini 3.1 Pro, including multimodal reasoning across text, image, video, audio, and code, a 1M-token context window, and strong software engineering performance. | text+audio+image+video+file→text | $2.00 | $12.00 | 1M | Feb 2026 | |
openai/gpt-4.1 GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval. | image+text+file→text | $2.00 | $8.00 | 1M | Apr 2025 | |
mistralai/mixtral-8x22b-instruct Mistral's official instruct fine-tuned version of [Mixtral 8x22B](/models/mistralai/mixtral-8x22b). It uses 39B active parameters out of 141B, offering unparalleled cost efficiency for its size. Its strengths include:
- strong math, coding, and reasoning
- large context length (64k)
- fluency in English, French, Italian, German, and Spanish
See benchmarks on the launch announcement [here](https://mistral.ai/news/mixtral-8x22b/).
#moe | text→text | $2.00 | $6.00 | 66K | Apr 2024 | |
mistralai/pixtral-large-2411 Pixtral Large is a 124B parameter, open-weight, multimodal model built on top of [Mistral Large 2](/mistralai/mistral-large-2411). The model is able to understand documents, charts and natural images.
The model is available under the Mistral Research License (MRL) for research and educational use, and the Mistral Commercial License for experimentation, testing, and production for commercial purposes.
| text+image→text | $2.00 | $6.00 | 131K | Nov 2024 | |
perplexity/sonar-deep-research | text→text | $2.00 | $8.00 | — | — | |
google/gemini-3-pro-preview | text→text | $2.00 | $12.00 | — | — | |
mistralai/mistral-large | text→text | $2.00 | $6.00 | — | — | |
mistralai/mistral-large-2407 | text→text | $2.00 | $6.00 | — | — | |
mistralai/mistral-large-2411 | text→text | $2.00 | $6.00 | — | — | |
perplexity/sonar-reasoning-pro | text→text | $2.00 | $8.00 | — | — | |
openai/o3 o3 is a well-rounded and powerful model across domains. It sets a new standard for math, science, coding, and visual reasoning tasks. It also excels at technical writing and instruction-following. Use it to think through multi-step problems that involve analysis across text, code, and images. | image+text+file→text | $2.00 | $8.00 | 200K | Apr 2025 | |
openai/o4-mini-deep-research o4-mini-deep-research is OpenAI's faster, more affordable deep research model—ideal for tackling complex, multi-step research tasks.
Note: This model always uses the 'web_search' tool which adds additional cost. | file+image+text→text | $2.00 | $8.00 | 200K | Oct 2025 | |
x-ai/grok-4.20 Grok 4.20 is xAI's newest flagship model with industry-leading speed and agentic tool calling capabilities. It combines the lowest hallucination rate on the market with strict prompt adherence, delivering consistently precise and truthful responses.
Reasoning can be enabled/disabled using the `reasoning` `enabled` parameter in the API. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#controlling-reasoning-tokens) | text+image→text | $2.00 | $6.00 | 2M | Mar 2026 | |
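The `reasoning` `enabled` toggle mentioned above travels in the request body; through the OpenAI-compatible SDK it has to go via `extra_body`. A minimal sketch, assuming the reasoning schema from the linked OpenRouter docs, an `OPENROUTER_API_KEY` environment variable, and an illustrative prompt:

```python
# Minimal sketch: turn reasoning off for a fast, cheap pass on Grok 4.20.
# Assumes the `openai` Python SDK pointed at OpenRouter's OpenAI-compatible API.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

resp = client.chat.completions.create(
    model="x-ai/grok-4.20",
    messages=[{"role": "user", "content": "List three uses of text embeddings."}],
    extra_body={"reasoning": {"enabled": False}},  # set True (or omit) to enable reasoning
)
print(resp.choices[0].message.content)
```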
x-ai/grok-4.20-multi-agent Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep research, coordinate tool use, and synthesize information across complex tasks.
Reasoning effort behavior (see the request sketch after this entry):
- low / medium: 4 agents
- high / xhigh: 16 agents | text+image+file→text | $2.00 | $6.00 | 2M | Mar 2026 | |
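A sketch of the effort-to-agent mapping, assuming the `reasoning.effort` field from the same OpenRouter reasoning docs; the effort names come straight from the entry above, and the prompt is a placeholder.

```python
# Sketch: requesting the 16-agent configuration by raising reasoning
# effort. Per the entry: low/medium -> 4 agents, high/xhigh -> 16 agents.
import os

import requests

resp = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": "x-ai/grok-4.20-multi-agent",
        "messages": [{"role": "user", "content": "Survey recent work on KV-cache compression."}],
        "reasoning": {"effort": "high"},  # or "xhigh"; both fan out to 16 agents
    },
    timeout=600,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```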
x-ai/grok-4.20-beta Grok 4.20 Beta is xAI's newest flagship model with industry-leading speed and agentic tool-calling capabilities. It combines the lowest hallucination rate on the market with strict prompt adherence, delivering consistently precise and truthful responses.
Reasoning can be enabled or disabled via the `enabled` field of the `reasoning` parameter in the API, as in the sketch under the non-beta entry above. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#controlling-reasoning-tokens) | text+image→text | $2.00 | $6.00 | 2M | Mar 2026 | |
x-ai/grok-4.20-multi-agent-beta Grok 4.20 Multi-Agent Beta is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep research, coordinate tool use, and synthesize information across complex tasks.
Reasoning effort behavior:
- low / medium: 4 agents
- high / xhigh: 16 agents | text+image→text | $2.00 | $6.00 | 2M | Mar 2026 | |
openai/gpt-4o-search-preview | text→text | $2.50 | $10.00 | — | — | |
openai/gpt-4o | text→text | $2.50 | $10.00 | — | — | |
inflection/inflection-3-productivity | text→text | $2.50 | $10.00 | — | — | |
inflection/inflection-3-pi | text→text | $2.50 | $10.00 | — | — | |
amazon/nova-premier-v1 | text→text | $2.50 | $12.50 | — | — | |
cohere/command-a | text→text | $2.50 | $10.00 | — | — | |
cohere/command-r-plus-08-2024 | text→text | $2.50 | $10.00 | — | — | |
openai/gpt-4o-2024-11-20 | text→text | $2.50 | $10.00 | — | — | |
openai/gpt-4o-2024-08-06 | text→text | $2.50 | $10.00 | — | — | |
openai/gpt-5.4 GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system. It features a 1M+ token context window (922K input, 128K output) with support for text and image inputs, enabling high-context reasoning, coding, and multimodal analysis within the same workflow.
The model delivers improved performance in coding, document understanding, tool use, and instruction following. It is designed as a strong default for both general-purpose tasks and software engineering, capable of generating production-quality code, synthesizing information across multiple sources, and executing complex multi-step workflows with fewer iterations and greater token efficiency. | text+image+file→text | $2.50 | $15.00 | 1M | Mar 2026 | |
anthropic/claude-sonnet-4.6 Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and professional work. It excels at iterative development, complex codebase navigation, end-to-end project management with memory, polished document creation, and confident computer use for web QA and workflow automation. | text+image→text | $3.00 | $15.00 | 1M | Feb 2026 | |
anthracite-org/magnum-v4-72b | text→text | $3.00 | $5.00 | — | — | |
anthropic/claude-3.7-sonnet | text→text | $3.00 | $15.00 | — | — | |
anthropic/claude-sonnet-4.5 | text→text | $3.00 | $15.00 | — | — | |
anthropic/claude-sonnet-4 | text→text | $3.00 | $15.00 | — | — | |
openai/gpt-3.5-turbo-16k | text→text | $3.00 | $4.00 | — | — | |
perplexity/sonar-pro-search | text→text | $3.00 | $15.00 | — | — | |
perplexity/sonar-pro | text→text | $3.00 | $15.00 | — | — | |
sao10k/l3.1-70b-hanami-x1 | text→text | $3.00 | $3.00 | — | — | |
x-ai/grok-3 | text→text | $3.00 | $15.00 | — | — | |
x-ai/grok-3-beta | text→text | $3.00 | $15.00 | — | — | |
x-ai/grok-4 | text→text | $3.00 | $15.00 | — | — | |
anthropic/claude-3.7-sonnet:thinking Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and problem-solving capabilities. It introduces a hybrid reasoning approach, allowing users to choose between rapid responses and extended, step-by-step processing for complex tasks. The model demonstrates notable improvements in coding, particularly in front-end development and full-stack updates, and excels in agentic workflows, where it can autonomously navigate multi-step processes.
Claude 3.7 Sonnet maintains performance parity with its predecessor in standard mode while offering an extended reasoning mode for enhanced accuracy in math, coding, and instruction-following tasks.
Read more in the [blog post](https://www.anthropic.com/news/claude-3-7-sonnet). | text+image+file→text | $3.00 | $15.00 | 200K | Feb 2025 | |
alpindale/goliath-120b | text→text | $3.75 | $7.50 | — | — | |
meta-llama/llama-3.1-405b-instruct | text→text | $4.00 | $4.00 | — | — | |
meta-llama/llama-3.1-405b | text→text | $4.00 | $4.00 | — | — | |
aion-labs/aion-1.0 Aion-1.0 is a multi-model system designed for high performance across various tasks, including reasoning and coding. It is built on DeepSeek-R1, augmented with additional models and techniques such as Tree of Thoughts (ToT) and Mixture of Experts (MoE). It is Aion Lab's most powerful reasoning model. | text→text | $4.00 | $8.00 | 131K | Feb 2025 | |
raifle/sorcererlm-8x22b | text→text | $4.50 | $4.50 | — | — | |
anthropic/claude-opus-4.6 | text→text | $5.00 | $25.00 | — | — | |
openai/gpt-4o-2024-05-13 | text→text | $5.00 | $15.00 | — | — | |
anthropic/claude-opus-4.5 | text→text | $5.00 | $25.00 | — | — | |
anthropic/claude-opus-4.7 Opus 4.7 is the next generation of Anthropic's Opus family, built for long-running, asynchronous agents. Building on the coding and agentic strengths of Opus 4.6, it delivers stronger performance on... | text+image→text | $5.00 | $25.00 | 1M | — | |
anthropic/claude-3.5-sonnet | text→text | $6.00 | $30.00 | — | — | |
openai/gpt-4o:extended | text→text | $6.00 | $18.00 | — | — | |
openai/gpt-4-turbo | text→text | $10.00 | $30.00 | — | — | |
openai/gpt-4-turbo-preview | text→text | $10.00 | $30.00 | — | — | |
openai/gpt-4-1106-preview The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mode and function calling.
Training data: up to April 2023. | text→text | $10.00 | $30.00 | 128K | Nov 2023 | |
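A sketch of the combination this entry describes, vision input plus JSON mode, using the standard OpenAI-compatible content-part message shape. The image URL is a placeholder, and note that the modality column above lists this snapshot as text→text, so a vision-capable variant may be what actually serves image parts.

```python
# Sketch: a vision request that also uses JSON mode, per the entry above.
# The content-part message shape is the standard OpenAI-compatible form;
# the image URL is a placeholder assumption.
import os

import requests

resp = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": "openai/gpt-4-1106-preview",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Describe this chart as JSON with keys 'title' and 'series'."},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/chart.png"}},
            ],
        }],
        "response_format": {"type": "json_object"},  # JSON mode
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```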
openai/o3-deep-research o3-deep-research is OpenAI's advanced model for deep research, designed to tackle complex, multi-step research tasks.
Note: This model always uses the 'web_search' tool, which adds additional cost. | image+text+file→text | $10.00 | $40.00 | 200K | Oct 2025 | |
anthropic/claude-opus-4 | text→text | $15.00 | $75.00 | — | — | |
openai/gpt-5-pro GPT-5 Pro is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that require step-by-step reasoning, instruction following, and accuracy in high-stakes use cases. It supports test-time routing features and advanced prompt understanding, including user-specified intent like "think hard about this." Improvements include reduced hallucination and sycophancy, and better performance in coding, writing, and health-related tasks. | image+text+file→text | $15.00 | $120.00 | 400K | Oct 2025 | |
openai/o1 The latest and strongest model family from OpenAI, o1 is designed to spend more time thinking before responding. The o1 model series is trained with large-scale reinforcement learning to reason using chain of thought.
The o1 models are optimized for math, science, programming, and other STEM-related tasks. They consistently exhibit PhD-level accuracy on benchmarks in physics, chemistry, and biology. Learn more in the [launch announcement](https://openai.com/o1).
| text+image+file→text | $15.00 | $60.00 | 200K | Dec 2024 | |
anthropic/claude-opus-4.1 | text→text | $15.00 | $75.00 | — | — | |
openai/o3-pro The o-series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o3-pro model uses more compute to think harder and provide consistently better answers.
Note that BYOK is required for this model. Set up here: https://openrouter.ai/settings/integrations | text+file+image→text | $20.00 | $80.00 | 200K | Jun 2025 | |
openai/gpt-5.2-pro GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agentic coding and long-context performance over GPT-5 Pro. It is optimized for complex tasks that require step-by-step reasoning, instruction following, and accuracy in high-stakes use cases. It supports test-time routing features and advanced prompt understanding, including user-specified intent like "think hard about this." Improvements include reduced hallucination and sycophancy, and better performance in coding, writing, and health-related tasks. | image+text+file→text | $21.00 | $168.00 | 400K | Dec 2025 | |
openai/gpt-4-0314 | text→text | $30.00 | $60.00 | — | — | |
openai/gpt-4 | text→text | $30.00 | $60.00 | — | — | |
anthropic/claude-opus-4.6-fast Fast-mode variant of [Opus 4.6](/anthropic/claude-opus-4.6): identical capabilities with higher output speed, priced at 6x the standard Opus 4.6 rate.
Learn more in Anthropic's docs: https://platform.claude.com/docs/en/build-with-claude/fast-mode | text+image→text | $30.00 | $150.00 | 1M | — | |
openai/gpt-5.4-pro GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified architecture with enhanced reasoning capabilities for complex, high-stakes tasks. It features a 1M+ token context window (922K input, 128K output) with support for text and image inputs. Optimized for step-by-step reasoning, instruction following, and accuracy, GPT-5.4 Pro excels at agentic coding, long-context workflows, and multi-step problem solving. | text+image+file→text | $30.00 | $180.00 | 1M | Mar 2026 | |
openai/o1-pro The o1 series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o1-pro model uses more compute to think harder and provide consistently better answers. | text+image+file→text | $150.00 | $600.00 | 200K | Mar 2025