Overview

gpt-image-2-vip is a reverse-engineered GPT image generation model on the Codex line, available on the API易 platform. It costs the same flat $0.03/image as gpt-image-2-all and uses an identical request/response format. The only meaningful difference is that vip accepts a size field with 30 common sizes (10 aspect ratios × 3 resolution tiers: 1K Fast / 2K Recommended / 4K Detail), including 4K.
🎨 Positioning: use gpt-image-2-vip when you need to lock the output size (e-commerce hero shots, poster templates, video thumbnails, 4K wallpapers, etc.). Just swap the model field to gpt-image-2-vip and add a size field — every other line of code stays identical to gpt-image-2-all.

Chat API

OpenAI Chat Completions format — one endpoint for both text-to-image and reference-image editing, accepts online image URLs directly.

Text-to-Image API

/v1/images/generations — text prompt + size for explicit output dimensions.

Image Editing API

/v1/images/edits — multipart upload with edit/fusion instructions.

Key differences vs gpt-image-2-all

gpt-image-2-vip and gpt-image-2-all are both reverse-engineered channels, same price, same call code. They mirror each other — swap the model field on the same request and behavior is largely identical. The differences:
| Dimension | gpt-image-2-all | gpt-image-2-vip |
| --- | --- | --- |
| Channel | Reverse-engineered ChatGPT web | Reverse-engineered Codex line |
| Price | $0.03 / image | $0.03 / image (flat across all sizes) |
| size parameter | ❌ Not accepted (describe in prompt) | ✅ 30 sizes incl. 4K |
| 4K (e.g. 3840x2160) | ❌ | ✅ 4K Detail tier |
| Generation time | ~30–60 seconds | ~90–150 seconds (on par with the official gpt-image-2) |
| quality parameter | ❌ Not accepted | ❌ Not accepted (do not pass) |
| Endpoints | /chat/completions + /images/generations + /images/edits | Same as left (identical) |
| Response format | url / b64_json (already prefixed) | Same as left |
| Best for | Prompt-driven, size-insensitive | Need locked output size (incl. 4K) |
One-line decision: don’t need strict size, want fastest output → gpt-image-2-all; need locked size or 4K → gpt-image-2-vip; need a quality knob or strict OpenAI-API field parity → use the official gpt-image-2.

Core Features

Locked output size

The size field accepts 30 common sizes — e-commerce hero shots, poster templates, 4K wallpapers all output at exact pixels.

4K High Resolution

The 4K Detail tier covers 2880×2880 / 3840×2160 / 3840×1632 etc., suitable for large deliverables.

Flat pricing across all sizes

1K / 2K / 4K all cost $0.03/image — no surcharge for 4K.

Same call format as -all

Request structure, fields, and response shape are identical to gpt-image-2-all — switch models with just the model string.

High Text Rendering

Stable rendering of Chinese/English text, signs, and poster text — ideal for infographics and marketing assets

Chinese Prompt Friendly

Native understanding of Chinese descriptions without translation

Natural-Language Editing

Edit via conversational descriptions, no masks required, supports multi-turn iteration

Triple Endpoint Support

Compatible with /images/generations, /images/edits, and /chat/completions

Pricing

| Model | Billing | Price | Output |
| --- | --- | --- | --- |
| gpt-image-2-vip | Per-call | $0.03 / image | 1 image per call, size field locks output dimension |
Billing notes:
  • Flat $0.03/image across all 30 sizes — no surcharge for 4K Detail
  • Failed requests are not charged (auth failures, parameter validation errors)
  • For N images, call the API N times in parallel

Group Setup

gpt-image-2-vip lives on the Default group, so no extra group is needed. The reverse channel currently has stable supply, so, unlike the official-relay gpt-image-2, it has no enterprise-group fallback.
| Model | Group | Notes |
| --- | --- | --- |
| gpt-image-2-vip | Default | Codex reverse line, flat $0.03/img, ~90–150s |
Advanced (when you also use gpt-image-2-all and the official-relay gpt-image-2): if your token covers all three models, set the token’s group priority like this:
  • First priority: image2Enterprise (1.2x enterprise group, dedicated stable lane for the official relay)
  • Default fallback: Default (both reverse models live here and route by model)
Result: official-relay gpt-image-2 rides the enterprise lane for stability, while the two reverse models stay on the default group — one token covers all three, no interference.
📖 About the image2Enterprise group: /en/live/2026-04/image2-enterprise-stable

Technical Specs

| Attribute | Value |
| --- | --- |
| Model name | gpt-image-2-vip |
| Channel type | Reverse-engineered (Codex line) |
| Pricing | $0.03 / image, per-call (flat across all sizes) |
| Generation time | ~90–150 seconds (on par with the official gpt-image-2; slower than gpt-image-2-all’s 30–60s) |
| size parameter | ✅ 30 sizes: 10 ratios × 3 resolution tiers (1K Fast / 2K Recommended / 4K Detail) |
| 4K support | ✅ 4K Detail tier (e.g., 3840x2160 / 2880x2880) |
| quality parameter | ❌ Not supported, do not pass |
| n parameter | ❌ Not supported, single image per call |
| Default response format | url (R2 CDN accelerated link, ~1-day validity) |
| Alternative format | b64_json (already prefixed with data:image/png;base64,) |
| Chinese prompts | ✅ Natively supported |
| Capabilities | Text-to-image, single-image editing, multi-image fusion, natural-language editing (all three endpoints) |
⏰ Image URL validity: ~1 day (default). The default url field is an R2 CDN link that expires in about 24 hours; requests after that will 404. For images that need long-term retention, download and persist them to your own storage as soon as possible after generation, or use the b64_json response format.

Endpoints

gpt-image-2-vip is compatible with the exact same three endpoints as gpt-image-2-all. Just swap the model field and add a size if needed:
| Endpoint | Purpose | Content-Type | Best for |
| --- | --- | --- | --- |
| POST /v1/chat/completions | Chat-based (text-to-image / editing / multi-turn / reference images) | application/json | Pass online image URLs directly; one endpoint for both generation and editing |
| POST /v1/images/generations | Text-to-image | application/json | OpenAI Images API standard format; same code can hit both official and reverse channels |
| POST /v1/images/edits | Image editing (single/multi) | multipart/form-data | OpenAI Images API standard format; same code can hit both official and reverse channels |
Domain options: api.apiyi.com is the main domain. You can also use alternate gateway domains such as b.apiyi.com / vip.apiyi.com. Response behavior is identical.

Supported sizes (full 30-size table)

gpt-image-2-vip supports 10 aspect ratios × 3 resolution tiers = 30 sizes. Pass size: "WIDTHxHEIGHT" (lowercase ASCII x) directly in the request body.

1K Fast — drafts and low-cost iterations

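| Ratio | Name | Pixels |
| --- | --- | --- |
| 1:1 | Square | 1280x1280 |
| 2:3 | Portrait | 848x1280 |
| 3:2 | Photo | 1280x848 |
| 3:4 | Portrait | 960x1280 |
| 4:3 | Standard | 1280x960 |
| 4:5 | Social | 1024x1280 |
| 5:4 | Large | 1280x1024 |
| 9:16 | Story | 720x1280 |
| 16:9 | Wide | 1280x720 |
| 21:9 | Cinema | 1280x544 |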
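
2K Recommended: default tier for most production outputs

| Ratio | Name | Pixels |
| --- | --- | --- |
| 1:1 | Square | 2048x2048 |
| 2:3 | Portrait | 1360x2048 |
| 3:2 | Photo | 2048x1360 |
| 3:4 | Portrait | 1536x2048 |
| 4:3 | Standard | 2048x1536 |
| 4:5 | Social | 1632x2048 |
| 5:4 | Large | 2048x1632 |
| 9:16 | Story | 1152x2048 |
| 16:9 | Wide | 2048x1152 |
| 21:9 | Cinema | 2048x864 |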

4K Detail — large deliverables

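| Ratio | Name | Pixels |
| --- | --- | --- |
| 1:1 | Square | 2880x2880 |
| 2:3 | Portrait | 2336x3520 |
| 3:2 | Photo | 3520x2336 |
| 3:4 | Portrait | 2480x3312 |
| 4:3 | Standard | 3312x2480 |
| 4:5 | Social | 2560x3216 |
| 5:4 | Large | 3216x2560 |
| 9:16 | Story | 2160x3840 |
| 16:9 | Wide | 3840x2160 |
| 21:9 | Cinema | 3840x1632 |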
Flat pricing across all 30 sizes: $0.03/image. No surcharge for 4K Detail.
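To keep size strings exact, the three tables above can be held in one lookup structure and validated before a request is sent. The following is a minimal sketch; the names SIZES, TIERS, pick_size, and is_supported are illustrative and not part of the API.

```python
# Pixel sizes per aspect ratio, in tier order (1K Fast, 2K Recommended, 4K Detail),
# copied from the size tables in this document.
SIZES = {
    "1:1":  ("1280x1280", "2048x2048", "2880x2880"),
    "2:3":  ("848x1280",  "1360x2048", "2336x3520"),
    "3:2":  ("1280x848",  "2048x1360", "3520x2336"),
    "3:4":  ("960x1280",  "1536x2048", "2480x3312"),
    "4:3":  ("1280x960",  "2048x1536", "3312x2480"),
    "4:5":  ("1024x1280", "1632x2048", "2560x3216"),
    "5:4":  ("1280x1024", "2048x1632", "3216x2560"),
    "9:16": ("720x1280",  "1152x2048", "2160x3840"),
    "16:9": ("1280x720",  "2048x1152", "3840x2160"),
    "21:9": ("1280x544",  "2048x864",  "3840x1632"),
}
TIERS = ("1K", "2K", "4K")  # illustrative tier keys, not API values
SUPPORTED_SIZES = frozenset(size for row in SIZES.values() for size in row)


def pick_size(ratio: str, tier: str = "2K") -> str:
    """Return the exact size string for a ratio and tier (2K is the default tier)."""
    return SIZES[ratio][TIERS.index(tier)]


def is_supported(size: str) -> bool:
    """True only for the exact strings listed in the tables (lowercase ASCII x)."""
    return size in SUPPORTED_SIZES
```

Checking is_supported(size) on the client avoids a round trip that would otherwise end in a 400 for an off-list or malformed size.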
Picking a tier:
  • 1K Fast: drafts, thumbnails, A/B tests. Fastest output (price is flat, but the iteration loop is shorter).
  • 2K Recommended: the default tier. Covers most production outputs (e-commerce hero shots, posters, infographics).
  • 4K Detail: print, large displays, video thumbnails, desktop / outdoor large format.
Minimal call example (only pass size, do not pass quality):
curl "https://api.apiyi.com/v1/images/generations" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $YI_API_KEY" \
  -d '{
    "model": "gpt-image-2-vip",
    "prompt": "Product shot of a white ceramic mug on a gray desk, soft natural light, clean background",
    "size": "2048x1360"
  }'
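
The same call can be made from Python with only the standard library. This is a sketch under the assumptions of the curl example above (same URL, same YI_API_KEY environment variable); build_request and generate are illustrative helper names, and the parsed JSON is returned as-is because the exact response layout is not shown here.

```python
import json
import os
import urllib.request

API_URL = "https://api.apiyi.com/v1/images/generations"


def build_request(prompt: str, size: str, api_key: str) -> urllib.request.Request:
    """Build the POST request; only model, prompt, and size are sent (no quality, no n)."""
    body = {"model": "gpt-image-2-vip", "prompt": prompt, "size": size}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def generate(prompt: str, size: str = "2048x1360", timeout: float = 300.0) -> dict:
    """Send the request with a conservative 300s timeout and return the parsed JSON."""
    request = build_request(prompt, size, os.environ["YI_API_KEY"])
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```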

Best Practices

1. Compress input images to under 1.5MB (image edit / multi-image fusion)

Compress each image you upload to under 1.5MB (JPEG quality 80-90 / down-sized resolution); apply the same cap per image in multi-image fusion. Sporadic shell_api_error / Unknown error responses are most often triggered by oversized inputs — compressing measurably improves success rate and latency. Output resolution is governed by the size field, not by input size — shrinking the input only speeds things up, it does not hurt quality. Stuffing 4K / 8K into the prompt does not produce a 4K image; resolution is set by size, not by prompt fluff.

2. Pick the size tier by deliverable

1K Fast for drafts, 2K Recommended for production, 4K Detail for print/large displays. Pricing is flat — pick by need.

3. Use lowercase ASCII x in size

Send "size": "2048x1536", not 2048×1536 (multiplication sign) and not 2048X1536 (uppercase X).

4. Do not pass quality or n

quality is rejected. n does not produce extra images: each call returns 1 image regardless, and a request with n=3 is still billed for 3 images. For multiple images, call in parallel.

5. Use a 300s timeout

Typical generation is 90–150s, but image upload / download time and peak-tail latency push it higher. Set 300s as a conservative baseline.

6. Choose response format by need

Use b64_json for direct web rendering; url for server-side storage/forwarding.

7. Share code with -all

Same code works for both — switch model between gpt-image-2-all and gpt-image-2-vip as needed. Use vip when you need locked size, switch back to -all for fastest iteration.

Error Codes and Retries

| Status | Meaning | Suggestion |
| --- | --- | --- |
| 400 | size not in the 30-size set, or malformed | Use the exact strings from the table above |
| 401 | Invalid token | Check Bearer Token |
| 429 | Rate limit / quota exhausted | Exponential backoff retry |
| 500 (4K sporadic) | OpenAI upstream compute fluctuation; 4K Detail tier hits this more often | Drop to 2K Recommended and retry; if 4K is mandatory, switch to official-relay gpt-image-2 + image2Enterprise group |
| 5xx (other) | Transient gateway/backend error | Retry 1–2 times |
| Timeout | Codex peak + 4K long tail | Set client timeout ≥ 300s (conservative) |
Client recommendations:
  • Request timeout starting at 300 seconds (conservative; typical 90–150s, but 4K Detail + peak tails go higher)
  • Use exponential backoff for 5xx and timeouts (2–3 retries recommended)
  • Log the request-id response header for debugging
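
The retry recommendations above can be sketched as a small wrapper. This is one possible shape, not a prescribed client: call_with_backoff, the RETRYABLE set, and the 2-second base delay are assumptions chosen for illustration, and send stands for any zero-argument callable that performs the request with urllib.

```python
import socket
import time
import urllib.error

# Status codes treated as retryable in this sketch: 429 plus common 5xx codes.
RETRYABLE = {429, 500, 502, 503, 504}


def call_with_backoff(send, retries=3, base_delay=2.0, sleep=time.sleep):
    """Call send(); retry retryable HTTP errors and timeouts with exponential backoff.

    400 and 401 are raised immediately because retrying cannot fix them.
    """
    for attempt in range(retries + 1):
        try:
            return send()
        except urllib.error.HTTPError as exc:
            if exc.code not in RETRYABLE or attempt == retries:
                raise
        except (TimeoutError, socket.timeout, urllib.error.URLError):
            if attempt == retries:
                raise
        # Wait 2s, 4s, 8s, ... before the next attempt.
        sleep(base_delay * 2 ** attempt)
```

The sleep parameter is injectable so the wrapper can be exercised in tests without real waiting.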

FAQ

The call format is almost identical to gpt-image-2-all. All three endpoints (/v1/chat/completions, /v1/images/generations, /v1/images/edits) share request fields, response fields, and b64_json prefix behavior. The only differences:
  1. model field: gpt-image-2-vip vs gpt-image-2-all
  2. size field: vip accepts the 30-size set; -all rejects size (size goes into the prompt instead)
Practical pattern: keep one codebase with an if model == 'vip': payload['size'] = ... switch.
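A minimal sketch of that switch follows. build_payload is an illustrative name, and the way the size is folded into the prompt for -all is an assumption about wording, not a documented format.

```python
from typing import Optional


def build_payload(model: str, prompt: str, size: Optional[str] = None) -> dict:
    """Build one request body that works for both reverse-engineered models."""
    payload = {"model": model, "prompt": prompt}
    if size is None:
        return payload
    if model == "gpt-image-2-vip":
        # vip accepts the size field directly.
        payload["size"] = size
    else:
        # -all rejects size, so the desired size is described in the prompt instead.
        payload["prompt"] = f"{prompt} (output size: {size})"
    return payload
```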
gpt-image-2-vip uses the Codex reverse channel — typical 90–150 seconds, on par with the official gpt-image-2 (100–120s) and slower than ChatGPT-web-line gpt-image-2-all (30–60s). For latency-sensitive workloads, prefer gpt-image-2-all; switch to vip only when you need locked size or 4K.
Stick to the 30-size set when choosing size; custom sizes are not supported. Off-list sizes may trigger an upstream invalid_request_error. Pick the closest tier for your deliverable.
Symptom: at the 4K Detail tier (e.g., 3840x2160 / 2880x2880), status_code: 500 errors are easier to trigger, with upstream returning invalid_request_error:
{
  "status_code": 500,
  "error": {
    "message": "An error occurred while processing your request. ... Please include the request ID xxxxxxxx in your message.",
    "type": "invalid_request_error",
    "code": null
  }
}
Root cause: OpenAI compute fluctuation, not your request parameters. The same payload usually goes through at 2K. The Codex reverse channel is more sensitive to large outputs like 4K, especially at peak hours.
Mitigation (by cost-effectiveness):
  1. Prefer 2K Recommended (e.g., 2048x1360 / 2048x2048): significantly higher success rate, same $0.03/image
  2. Send fewer input images for img2img / multi-image fusion: the Codex reverse channel struggles under heavy input load, which pushes 4K failure rates up further; pre-compressing each input image to under 1.5MB also helps
  3. For guaranteed 4K, switch to the official-relay gpt-image-2 + image2Enterprise group. Official-relay 4K is pricier (~$0.3+/image) but markedly more stable, which is appropriate when 4K delivery is a hard requirement.
📖 Field note: /en/live/2026-05/gpt-image-2-vip-4k-tips
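The first mitigation (drop to 2K and retry) can be automated. The sketch below assumes a hypothetical generate(prompt, size) callable that raises a hypothetical UpstreamError carrying the HTTP status; neither name comes from the API. The mapping pairs each 4K Detail size with the 2K Recommended size of the same aspect ratio.

```python
# 4K Detail size -> 2K Recommended size with the same aspect ratio (from the size tables).
FOUR_K_TO_2K = {
    "2880x2880": "2048x2048",
    "2336x3520": "1360x2048",
    "3520x2336": "2048x1360",
    "2480x3312": "1536x2048",
    "3312x2480": "2048x1536",
    "2560x3216": "1632x2048",
    "3216x2560": "2048x1632",
    "2160x3840": "1152x2048",
    "3840x2160": "2048x1152",
    "3840x1632": "2048x864",
}


class UpstreamError(Exception):
    """Hypothetical error type raised by the caller's own HTTP layer."""

    def __init__(self, status_code: int):
        super().__init__(f"upstream returned {status_code}")
        self.status_code = status_code


def generate_with_fallback(generate, prompt: str, size: str):
    """Try the requested size; on a 500 at a 4K size, retry once at the matching 2K size.

    Returns (size_actually_used, result).
    """
    try:
        return size, generate(prompt, size)
    except UpstreamError as exc:
        fallback = FOUR_K_TO_2K.get(size)
        if exc.status_code != 500 or fallback is None:
            raise
        return fallback, generate(prompt, fallback)
```

Returning the size that was actually used lets the caller record that a deliverable was produced at 2K rather than 4K.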
Compressing input images is strongly recommended. Compress each input image to under 1.5MB (JPEG quality 80-90 / down-sized resolution): sporadic shell_api_error / Unknown error responses are most often triggered by oversized inputs, and compressing measurably improves success rate and latency. Note: 1.5MB is the recommended ceiling for reliability and speed; the 10MB figure in the input-image FAQ entry is the gateway hard limit.
Don’t worry about compression hurting quality: output resolution is governed by the size parameter, not by your input size. Shrinking the input only speeds things up.
Stuffing 4K / 8K into the prompt does not actually produce 4K output. If your prompt says 8K ultra HD but you set size to 1280x1280, you still get a 1K-tier image. For 4K, set it in the size field; 1K / 2K / 4K all cost the same flat $0.03/image across the 30-size set.
📖 Source: /en/live/2026-05/gpt-image-2-vip-unknown-error
No surcharge. The 4K Detail tier (3840x2160 / 2880x2880 etc.) costs the same $0.03/image as 1K and 2K.
The n parameter is not supported. This model returns 1 image per call; for multiple images, use repeated / concurrent calls instead.
⚠️ Important: if you pass n=3 in the request, billing will be $0.03 × 3 = $0.09, but only 1 image is actually returned. Drop the n field to avoid wasted charges.
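One way to get several images is to fan out single-image calls on a thread pool. In this sketch, generate_one is a hypothetical callable that sends one request body and returns its result, and max_workers=4 is an arbitrary choice rather than a documented concurrency limit.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_many(generate_one, payload: dict, count: int, max_workers: int = 4) -> list:
    """Run `count` independent single-image calls concurrently.

    Any n field is stripped first, since passing it is billed without returning extra images.
    """
    single = {key: value for key, value in payload.items() if key != "n"}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(generate_one, dict(single)) for _ in range(count)]
        # Results come back in submission order; an exception in any call is re-raised here.
        return [future.result() for future in futures]
```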
This is a reverse-engineered channel using synchronous chat-style responses. Outcomes split into two cases with different billing rules:
1) HTTP 5xx returned → NOT billed
When upstream content policy hard-blocks the request, you’ll see something like:
{
  "error": {
    "message": "Image was not generated as expected. Please adjust the prompt and retry (traceid: 0672821c6951af183dbf847130caaf16)",
    "localized_message": "Unknown error",
    "type": "invalid_request_error",
    "param": "",
    "code": null
  }
}
These hard errors are not billed. Ask the user to adjust the prompt and retry.
2) HTTP 200 with a text “soft refusal” → BILLED
When the model soft-refuses inside the conversation (e.g. “I can’t do that”, “Sorry, this request involves…”), at the protocol level it looks like a normal chat completion, so it is billed. The reverse channel cannot reliably distinguish “refusal text” from “image output” at the protocol layer.
Why we can’t just waive soft refusals
Auto-waiving every soft refusal would mean the platform absorbs every failed upstream call. More importantly, frequently triggering upstream content safety also raises the risk of the supplier’s account being banned; that’s a real supply-side cost we can’t fully eliminate.
Recommendations for integrators
  • Pre-filter and warn users: add a keyword/scenario filter at the frontend or gateway (real-person names, copyrighted characters, sensitive topics) and surface a UI hint like “Celebrity / IP topics may fail and still be billed by upstream policy.” This cuts wasted charges sharply.
  • Monthly reimbursement for consumer products: we understand consumer-facing products can’t fully gate user input. If your monthly spend is large enough ($1000+/month), you can batch your logs monthly (short-latency calls are usually soft refusals) and contact support for a one-off manual credit — no need to file per-call appeals.
📖 Related: 500 errors are usually content-policy hits (not billed)
Do not prepend a data: prefix. The b64_json field already includes the data:image/png;base64, prefix, so you can use it directly as <img src>; to write it to a file, strip the prefix and base64-decode the rest. If your code follows the old “prepend prefix” pattern, you’ll produce a broken data URL, so add a startsWith('data:') check first.
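A short sketch of prefix-safe handling, so the same code works whether or not the value arrives prefixed. to_data_url and to_bytes are illustrative helper names; the image/png default mirrors the prefix documented above.

```python
import base64


def to_data_url(b64_json: str, mime: str = "image/png") -> str:
    """Return a data URL, adding the prefix only when it is missing."""
    if b64_json.startswith("data:"):
        return b64_json
    return f"data:{mime};base64,{b64_json}"


def to_bytes(b64_json: str) -> bytes:
    """Return raw image bytes, dropping a leading data:...;base64, prefix if present."""
    if b64_json.startswith("data:"):
        b64_json = b64_json.split(",", 1)[1]
    return base64.b64decode(b64_json)
```

Writing to_bytes(value) to a file opened in binary mode yields the image file itself, whereas writing the prefixed string directly would store a data URL as text.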
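Input images: keep each image at or under 10MB (the gateway limit), in png / jpg / webp format; for reliability and speed, compress each one to under 1.5MB. Overly large images may hit gateway limits. Each image in multi-image fusion must meet this limit.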
The default url field is an R2 CDN link that expires in about 1 day (24 hours); requests after that will 404.
Strongly recommended: download and persist generated images to your own object storage (S3 / OSS / R2), CDN, or database shortly after generation.
Streaming is not supported. This model returns the image in one shot. If latency matters, show a “generating…” progress indicator on the client side and configure a 300s timeout (conservative).
The OpenAI SDK works directly. Point base_url to https://api.apiyi.com/v1 and set api_key to your API易 token. client.images.generate(model="gpt-image-2-vip", size="2048x1360", prompt=...) works directly.
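Spelled out with the OpenAI Python SDK, this could look like the sketch below. It assumes the openai package is installed and that the token is in the YI_API_KEY environment variable (as in the curl example); make_client, generate_kwargs, and generate_url are illustrative names, and reading result.data[0].url assumes the default url response format described in this document.

```python
import os


def make_client(api_key=None):
    """Create an OpenAI SDK client pointed at the API易 gateway with a 300s timeout."""
    from openai import OpenAI  # imported lazily; requires `pip install openai`

    return OpenAI(
        base_url="https://api.apiyi.com/v1",
        api_key=api_key or os.environ["YI_API_KEY"],
        timeout=300.0,
    )


def generate_kwargs(prompt: str, size: str = "2048x1360") -> dict:
    """Arguments for client.images.generate; quality and n are deliberately omitted."""
    return {"model": "gpt-image-2-vip", "prompt": prompt, "size": size}


def generate_url(prompt: str, size: str = "2048x1360") -> str:
    """Generate one image and return its URL (the link expires in about 1 day)."""
    client = make_client()
    result = client.images.generate(**generate_kwargs(prompt, size))
    return result.data[0].url
```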
When you need a quality knob (low/medium/high), mask-based local repaint, or strict OpenAI-API field parity — use gpt-image-2. See the Official vs Reverse comparison.
gpt-image-2-vip is a reverse-engineered channel (Codex line). Behavior is aligned but pricing/capabilities may not fully match the official version. For full official-API parity, use gpt-image-2.