Popular Models

APIYI supports 400+ mainstream AI models. This page provides detailed model information, pricing, and usage instructions.

Enterprise-grade Professional and Stable AI Large Model API Hub All models are officially sourced and forwarded, with ~20% off pricing (combining top-up bonuses and exchange rate advantages), aggregating various excellent large models. No speed limits, no expiration, no account ban risks, pay-as-you-go billing, long-term reliable service.

🔥 Currently Recommended Models

The following are currently stably supplied popular models. For complete model list and real-time pricing, visit APIYI Console Pricing Page.

Model Upgrade Recommendations: We recommend using the latest models for best performance, but please note:

Initial instability is common: Newly launched models may experience slow responses, timeouts, or occasional errors due to limited compute capacity at the vendor — this typically stabilizes within days to weeks
Check parameter compatibility: New models may introduce or change parameters (e.g., max_completion_tokens replacing max_tokens). Before upgrading, verify that your API parameters remain compatible with older models
Always test before going live: Before deploying a new model to production, thoroughly validate it in a test environment to ensure output quality and API compatibility meet expectations

Model Categories

🤖 OpenAI Series

🆕 Latest Models

Model Name	Model ID	Context Length	Features	Recommended Scenarios
GPT-5.5 Pro	`gpt-5.5-pro`	1M	Strongest reasoning today, Terminal-Bench 2.0 82.7%; `/v1/responses` endpoint + SVIP group only, very expensive	Top-tier reasoning, research (professional needs)
GPT-5.5 🔥	`gpt-5.5`	1M	SWE-bench Verified 88.7%, hallucinations 60% lower than 5.4, new `xhigh` reasoning tier	Complex agents, professional workflows
GPT-5.4	`gpt-5.4`	1M	Native computer use, GDPval 83%	Complex agents, professional workflows
chat-latest	`chat-latest`	400K	Version-less alias, always points to the latest ChatGPT Instant (currently GPT-5.5 Instant)	Quick writing, conversation
GPT-5.2	`gpt-5.2`	400K	GDPval 70.9% surpassing professionals	Programming planning, structured tasks
GPT-5.3 Codex 🔥	`gpt-5.3-codex`	128K	SWE-Bench Pro SOTA, complex programming and agent tasks	Complex programming, agent tasks
GPT-5.1	`gpt-5.1`	128K	Intelligence-speed balance, SWE-bench 76.3%, 24h cache	General apps, programming

GPT Pro series (e.g. gpt-5.5-pro, gpt-5-pro) usage notes:

/v1/responses endpoint only: cannot use /v1/chat/completions — switch your SDK / code to the Responses API before calling
Very expensive: a single call may cost several dollars, and it is open to the SVIP group only to prevent accidental use on the Default group
Not recommended for non-professional needs: use GPT-5.5 / GPT-5.4 for everyday tasks; Pro suits only research and top-tier work demanding extreme reasoning depth

✅ Stable / Classic

Model Name	Model ID	Context Length	Features	Recommended Scenarios
GPT-5 ⭐	`gpt-5`	128K	Flagship stable version, ultra-strong reasoning	Top-tier reasoning, complex tasks
GPT-5 Mini	`gpt-5-mini`	128K	GPT-5 lightweight, excellent performance	Balance performance and cost
GPT-5 Nano	`gpt-5-nano`	128K	GPT-5 ultra-lightweight	Large-scale batch processing
o3 ⭐	`o3`	200K	Reasoning model, significantly price-reduced	Complex reasoning, math, programming
o4-mini	`o4-mini`	200K	Lightweight reasoning model	Top choice for programming
GPT-4.1 ⭐	`gpt-4.1`	128K	Fast speed, main workhorse	General applications
GPT-4.1 Mini	`gpt-4.1-mini`	128K	Cheaper lightweight version	Cost-sensitive scenarios
GPT-4o	`gpt-4o`	128K	Balanced multimodal capabilities	General scenarios
GPT-4o Mini	`gpt-4o-mini`	128K	Lightweight fast version	Quick response

GPT-5 Series Usage Notes:

Temperature parameter temperature must be set to 1 (only supports 1)
Use max_completion_tokens instead of max_tokens
Do not pass top_p parameter

Image and Video Generation Models have been moved to a dedicated page. Visit Image & Video Generation Models for the full list and pricing.

🎭 Claude Series (Anthropic)

🆕 Latest Models

Model Name	Model ID	Context Length	Features	Recommended Scenarios
Claude Opus 4.7 🔥	`claude-opus-4-7`	1M (Beta)	Coding benchmarks +13% over 4.6, 3x on production tasks, tool errors cut to 1/3, new xhigh reasoning tier	Top-tier coding, complex agents
Claude Opus 4.7 Thinking 🔥	`claude-opus-4-7-thinking`	1M (Beta)	Adaptive thinking, enhanced deep reasoning	Top-tier reasoning tasks
Claude Opus 4.6	`claude-opus-4-6`	1M (Beta)	Terminal-Bench 2.0 #1, 128K output	Top-tier coding, complex agents
Claude Sonnet 4.6 🔥	`claude-sonnet-4-6`	1M (Beta)	Full upgrade, rivals Opus 4.5, great value	Programming top choice, agent dev

✅ Stable / Classic

Model Name	Model ID	Context Length	Features	Recommended Scenarios
Claude Opus 4.5 ⭐	`claude-opus-4-5-20251101`	200K	SWE-bench 80.9%, price reduced to 1/3	Complex programming, top-tier reasoning
Claude Sonnet 4.5 ⭐	`claude-sonnet-4-5-20250929`	200K	World-class coding, SWE-bench 77.2%	Code generation, agent development
Claude Sonnet 4.5 Thinking	`claude-sonnet-4-5-20250929-thinking`	200K	Chain-of-thought mode, deep reasoning	Complex programming reasoning
Claude Haiku 4.5 ⭐	`claude-haiku-4-5-20251001`	200K	High cost-performance, SWE-bench 73.3%, 2x speed	Real-time chat, pair programming
Claude 4 Sonnet	`claude-sonnet-4-20250514`	200K	Battle-tested, top choice for programming	Code generation, analysis
Claude Opus 4.1	`claude-opus-4-1-20250805`	200K	Iterative upgrade, programming-optimized	High-demand programming tasks

Latest: Claude Opus 4.7 improves coding benchmarks 13% over 4.6, cuts tool-call errors to 1/3, and adds an xhigh reasoning tier at the same price as 4.6. Sonnet 4.6 rivals Opus 4.5 and is now the default on claude.ai. Stable: Opus 4.5 and Sonnet 4.5 are battle-tested for production. Haiku 4.5 offers 2x speed at great value.

🌟 Google Gemini Series

🆕 Latest Models

Model Name	Model ID	Context Length	Features	Recommended Scenarios
Gemini 3.5 Flash 🔥	`gemini-3.5-flash`	1M	Terminal-Bench 2.1 76.2%, fully surpasses 3.1 Pro, ~4x faster at ~half the price	Programming top choice, cost-performance king
Gemini 3.1 Pro Preview 🔥	`gemini-3.1-pro-preview`	1M	ARC-AGI-2 77.1% (2x+ over 3 Pro), most advanced reasoning	Complex reasoning, multimodal analysis
Gemini 3 Flash Preview	`gemini-3-flash-preview`	1M	SWE-bench 78%, 3x faster, thinking / nothinking variants available	Programming, cost-performance
Gemini 3.1 Flash Lite 🔥	`gemini-3.1-flash-lite`	1M	GA version, 64% faster than 2.5 Flash, ultra-low price	High concurrency, batch, low cost

Note: Gemini 3 Pro Preview was discontinued on March 9, 2026. Please migrate to Gemini 3.1 Pro Preview.

✅ Stable / Classic

Model Name	Model ID	Context Length	Features	Recommended Scenarios
Gemini 2.5 Pro ⭐	`gemini-2.5-pro`	2M	Official release, programming advantage, strong multimodal	Long text, programming, multimodal
Gemini 2.5 Flash ⭐	`gemini-2.5-flash`	1M	Fast speed, low cost, official release	Quick response scenarios
Gemini 2.5 Flash Lite	`gemini-2.5-flash-lite`	1M	Ultra-lightweight, faster and cheaper	Large-scale simple tasks

Latest: Gemini 3.5 Flash fully surpasses Gemini 3.1 Pro on Terminal-Bench 2.1, MCP Atlas and more, at ~4x speed and ~half the price — today’s cost-performance king. Gemini 3.1 Pro Preview doubles reasoning (ARC-AGI-2 77.1%), Google’s most advanced. Gemini 3.1 Flash Lite is now GA, the cheapest frontier model for high-concurrency. Stable: Gemini 2.5 Pro (2M context) and Gemini 2.5 Flash are GA, ideal for production.

🚀 xAI Grok Series

🆕 Latest Models

Model Name	Model ID	Context Length	Features	Recommended Scenarios
Grok 4.3 🔥	`grok-4.3`	1M	Intelligence Index 53, τ²-Bench 98%, IFBench 81%, 1M context + multimodal	Complex reasoning, general tasks
Grok 4	`grok-4`	Standard	Official version; `grok-4-all` adds native web search	General tasks, real-time info
Grok 4 Fast Reasoning 🔥	`grok-4-fast-reasoning`	200K	Reasoning mode, 93%+ cheaper than Grok-4	Complex reasoning
Grok Code Fast 1 ⭐	`grok-code-fast-1`	256K	SWE-bench 70.8%, high-speed generation	Code generation, agent programming

✅ Stable / Classic

Model Name	Model ID	Context Length	Features	Recommended Scenarios
Grok 3 ⭐	`grok-3`	Standard	Official stable version	Daily use
Grok 3 All	`grok-3-all`	Standard	Native web search enhanced	News, market analysis
Grok 3 Mini	`grok-3-mini`	Standard	Small model with reasoning	Lightweight tasks

🔍 DeepSeek Series

🆕 Latest Models

Model Name	Model ID	Context Length	Features	Recommended Scenarios
DeepSeek V4 Pro 🔥	`deepseek-v4-pro`	1M	1.6T/49B activated, SWE-Verified 80.6 near Claude/Gemini, Hybrid Attention	Complex reasoning, coding, agents
DeepSeek V4 Flash 🔥	`deepseek-v4-flash`	1M	284B/13B activated, just $0.14/M input, open-source SOTA value	High concurrency, batch
DeepSeek V3.2	`deepseek-v3.2`	128K	GPT-5 level, tool-use in reasoning	Complex reasoning, coding

✅ Stable / Classic

Model Name	Model ID	Context Length	Features	Recommended Scenarios
DeepSeek V3.1 ⭐	`deepseek-v3-1-250821`	128K	Mixed reasoning, Think/Non-Think dual modes	Intelligent reasoning, programming
DeepSeek R1	`deepseek-r1`	64K	Reasoning model	Math, reasoning
DeepSeek V3	`deepseek-v3`	128K	Strong comprehensive capabilities	General scenarios

🐘 Chinese Model Series

Zhipu AI (GLM)

🆕 Latest: GLM-5.1 | ✅ Stable / Classic: GLM-5, GLM-4.6

Model Name	Model ID	Context Length	Features	Recommended Scenarios
GLM-5.1 🔥	`glm-5.1`	200K	SWE-Bench Pro 58.4 beats GPT-5.4 / Opus 4.6 / Gemini 3.1 Pro, 744B MoE, MIT open-source	Complex coding, agents
GLM-5 ⭐	`glm-5`	200K	744B params (40B activated), coding aligned with Claude Opus 4.5, open-source	Complex coding, systems engineering
GLM-4.6	`glm-4.6`	200K	Code and reasoning enhanced, stable	Programming, reasoning, agents
GLM-4.5	`glm-4.5`	128K	Standard version, strong overall	General scenarios

GLM-5.1 Features:

744B MoE params, supports long-horizon agent tasks up to 8 hours
SWE-Bench Pro 58.4, strongest coding among open-source models
MIT licensed open-source, excellent value

Alibaba Qwen

🆕 Latest: Qwen3.7-Max | ✅ Stable / Classic: Qwen Max, Plus, Turbo

Model Name	Model ID	Context Length	Features	Recommended Scenarios
Qwen3.7-Max 🔥	`qwen3.7-max`	1M	AA Intelligence Index 56.6 (global top 5, #1 in China), 35-hour long-horizon agent autonomy	Agents, multilingual, long text
Qwen Max ⭐	`qwen-max`	32K	Strongest stable version	General tasks
Qwen Plus	`qwen-plus`	32K	Enhanced version	Cost-effective
Qwen Turbo	`qwen-turbo`	32K	Fast version	Low latency

Moonshot Kimi Series

🆕 Latest: Kimi K2.6 | ✅ Stable / Classic: Kimi K2.5, K2

Model Name	Model ID	Context Length	Features	Recommended Scenarios
Kimi K2.6 🔥	`kimi-k2.6`	256K	1T MoE / 32B activated, SWE-Bench Pro 58.6 surpasses GPT-5.4 and Opus 4.6	Coding, agents
Kimi K2.5	`kimi-k2.5`	200K	Native multimodal, Agent Swarm 100 agents	Multimodal, agents
Kimi K2 Official Release ⭐	`kimi-k2-250711`	200K	Volcano Engine partnership, strong stability	Production environments

🌐 MiniMax Series

🆕 Latest: MiniMax M2.7 | ✅ Stable / Classic: MiniMax M2.5

Model Name	Model ID	Context Length	Features	Recommended Scenarios
MiniMax M2.7 🔥	`MiniMax-M2.7`	Standard	10B params, SWE-bench Pro 56.22%, self-evolving, smallest Tier-1 model	Coding, agents
MiniMax M2.5	`minimax-m2.5`	Standard	230B (10B activated), SWE-bench 80.2%, great value	Coding, agents, office automation

MiniMax M2.7 Features:

Reaches SWE-bench Pro 56.22% with just 10B params, the smallest Tier-1 model
Self-evolving; standard $0.3 / highspeed (MiniMax-M2.7-highspeed) $0.6 per 1M input tokens
Open-sourced model weights

💰 Pricing Information

Billing Methods

Pay-as-you-go: Charged based on actual Token usage
No minimum charge: Use what you pay for, balance never expires
Real-time deduction: Fees deducted from balance immediately after each call

Pricing Advantages

Official source forwarding with slight price advantages
Bulk users can contact customer service for better pricing
New users get 3 million tokens testing credit upon registration

View Real-time Pricing

Visit APIYI Console Pricing Page to view latest pricing for all models.

🛠️ Usage Recommendations

Model Selection Guide

Programming Development

Top performance: Claude Opus 4.7 (+13% coding over 4.6), GPT-5.5 (SWE-bench 88.7%), Claude Sonnet 4.6 (rivals Opus 4.5)
High cost-performance: Gemini 3.5 Flash (surpasses 3.1 Pro at ~half price), GLM-5.1 (SWE-Bench Pro 58.4), Kimi K2.6, DeepSeek V4 Flash
Alternatives: DeepSeek V4 Pro, Qwen3.7-Max, MiniMax M2.7, o4-mini

Text Creation

Top choice: GPT-5.5, GPT-5.4, Gemini 3.1 Pro Preview, Claude Opus 4.7, Claude Sonnet 4.6
Alternatives: chat-latest, Claude Sonnet 4.5, GPT-4.1, GPT-4o, Claude Haiku 4.5, GLM-4.6

Quick Response

Top choice: Gemini 3.5 Flash (~4x speed), Claude Haiku 4.5 (2x faster), GPT-4o Mini
Alternatives: Gemini 3.1 Flash Lite, Gemini 2.5 Flash, Grok 4 Fast, GPT-4.1 Mini

Image Generation

Latest recommendation: GPT Image 1.5 (4x speed boost, precise editing, from $0.01)
Professional design: SeeDream 4.5 (1.2B parameters, 4K quality, $0.035/image), Nano Banana Pro (4K HD, best text rendering)
High cost-performance: Nano Banana ($0.025/image), SeeDream 4.0 ($0.025/image)
Reverse-engineered, cheapest: sora_image, gpt-4o-image

Long Text Processing

Top choice: Gemini 2.5 Pro (2M context)
Alternatives: Claude 4 series (200K context)

Cost Optimization Recommendations

Tiered Usage: Use cheaper models for simple tasks, advanced models for complex tasks
Test Optimization: Test with small models first, use large models after determining needs
Batch Processing: Choose Nano or Mini versions for large volumes of similar tasks
Cache Reuse: Cache results for repeated queries

Model Comparison Testing - Image generation effect comparison
Real-time Price Query - Latest pricing information
API Documentation - Detailed interface specifications
Quick Start - Integration guide

Model list is continuously updated. We will promptly add newly released excellent models. For specific model needs or bulk requirements, please contact customer service.

Basics

Basic API

Image API (Official)

Video API (Official)

Multimodal Understanding API

Text API

🔥 Currently Recommended Models

Model Categories

🤖 OpenAI Series

🆕 Latest Models

✅ Stable / Classic

🎭 Claude Series (Anthropic)

🆕 Latest Models

✅ Stable / Classic

🌟 Google Gemini Series

🆕 Latest Models

✅ Stable / Classic

🚀 xAI Grok Series

🆕 Latest Models

✅ Stable / Classic

🔍 DeepSeek Series

🆕 Latest Models

✅ Stable / Classic

🐘 Chinese Model Series

Zhipu AI (GLM)

Alibaba Qwen

Moonshot Kimi Series

🌐 MiniMax Series

💰 Pricing Information

Billing Methods

Pricing Advantages

View Real-time Pricing

🛠️ Usage Recommendations

Model Selection Guide

Cost Optimization Recommendations

​🔥 Currently Recommended Models

​Model Categories

​🤖 OpenAI Series

​🆕 Latest Models

​✅ Stable / Classic

​🎭 Claude Series (Anthropic)

​🆕 Latest Models

​✅ Stable / Classic

​🌟 Google Gemini Series

​🆕 Latest Models

​✅ Stable / Classic

​🚀 xAI Grok Series

​🆕 Latest Models

​✅ Stable / Classic

​🔍 DeepSeek Series

​🆕 Latest Models

​✅ Stable / Classic

​🐘 Chinese Model Series

​Zhipu AI (GLM)

​Alibaba Qwen

​Moonshot Kimi Series

​🌐 MiniMax Series

​💰 Pricing Information

​Billing Methods

​Pricing Advantages

​View Real-time Pricing

​🛠️ Usage Recommendations

​Model Selection Guide

​Cost Optimization Recommendations

​🔗 Related Resources

🔥 Currently Recommended Models

Model Categories

🤖 OpenAI Series

🆕 Latest Models

✅ Stable / Classic

🎭 Claude Series (Anthropic)

🆕 Latest Models

✅ Stable / Classic

🌟 Google Gemini Series

🆕 Latest Models

✅ Stable / Classic

🚀 xAI Grok Series

🆕 Latest Models

✅ Stable / Classic

🔍 DeepSeek Series

🆕 Latest Models

✅ Stable / Classic

🐘 Chinese Model Series

Zhipu AI (GLM)

Alibaba Qwen

Moonshot Kimi Series

🌐 MiniMax Series

💰 Pricing Information

Billing Methods

Pricing Advantages

View Real-time Pricing

🛠️ Usage Recommendations

Model Selection Guide

Cost Optimization Recommendations

🔗 Related Resources