Overview

gpt-image-2 is OpenAI’s latest flagship image generation model — the upgrade to gpt-image-1.5. Core upgrades: any valid resolution (incl. 2K / 3840×2160 4K), auto high-fidelity on reference images, 20-30% cheaper at the same tier. APIYI’s gateway is fully compatible with the OpenAI Images API — point the official OpenAI SDK’s base_url here for zero-code direct connection.
🎨 Key highlights: Native support for any valid resolution (max 3840×2160 4K) + auto high-fidelity on reference image edits + 20-30% lower cost than 1.5 at same size and quality + native Chinese prompt support. Best for production scenarios that need precise size/quality control, must match the OpenAI official API exactly, or require 4K output.

Text-to-Image API

/v1/images/generations — generate images from text prompts with size / quality / output_format control.

Image Edit API

/v1/images/edits — multipart upload of reference images (up to 16) + edit/fusion instructions, with mask inpainting support.

Why Choose APIYI’s GPT-image-2 Official Relay?

Built on OpenAI’s official channel, deeply optimized for enterprise production workloads across reliability, cost, and integration experience:

Official Channel · Same as Official

Strictly routed through OpenAI’s official relay — requests and responses are 100% identical to OpenAI official: same fields, same error codes, same model behavior. Lossless quality, no silent rewrites.

No Concurrency Limits

Not bound by OpenAI’s Tier-based RPM / TPM ceilings. Enterprise-scale traffic scales linearly — batch generation and peak-load scenarios handled with ease.

Same Price + Up to 15% Off

Default unit price matches OpenAI’s official pricing. Stack with our top-up bonus events for up to 15% off — long-term cost drops noticeably.

Global Zero-Barrier Access

No overseas server or proxy required. Connect directly to api.apiyi.com from domestic data centers, home broadband, or overseas nodes — stable latency, no cross-border re-architecture.

Full Model Lineup

Seamlessly switch to the reverse-engineered gpt-image-2-all ($0.03/image flat), or the cost-leader Nano Banana Pro / 2 — mix and match per scenario.

Professional Enterprise Support

Our team specializes in production image-generation deployments, with deep experience in model selection, tuning, and integration — end-to-end support from PoC to production.

Core Features

Any Resolution (incl. 4K)

Supports any valid output size. Presets cover 1K / 2K / 3840×2160 4K. Custom sizes only need to satisfy basic constraints (edges as multiples of 16, ratio ≤ 3:1).

Auto High-Fidelity

Reference image editing automatically enables high-fidelity processing. Detail, character identity, and text retention are dramatically improved. Do not pass input_fidelity; the request will error.

20-30% Cheaper

1024×1024 high quality drops from the $0.25 range of 1.5 to $0.211/image. 2K/4K output is token-metered and comes down by a similar margin, so long-term cost is noticeably lower.

Chinese + Text Rendering

Native Chinese prompt support. Stable rendering of Chinese/English text in signage, posters, UI screenshots. Fine text is rarely blurry on high quality.

Multi-Image Fusion (up to 16)

image[] array accepts up to 16 reference images. Use “image 1 / image 2 / image 3” in the prompt to reference them by upload order.

Mask Inpainting

Upload an alpha-channel mask. Transparent regions are inpaint areas, opaque regions are preserved.

Multiple Output Formats

Supports png (default) / jpeg / webp. Set output_compression for jpeg/webp to control file size.

OpenAI SDK Direct

Point base_url to https://api.apiyi.com/v1 and call directly with the official OpenAI SDK — zero-code migration.

Pricing

Token-metered (sum of input text + input image + output image tokens). Official per-image pricing reference:

| Quality | 1024×1024 | 1024×1536 | 1536×1024 |
| --- | --- | --- | --- |
| Low | $0.006 | $0.005 | $0.005 |
| Medium | $0.053 | $0.041 | $0.041 |
| High | $0.211 | $0.165 | $0.165 |

Pricing notes:
  • 2K / 4K has no fixed per-image price — billed by actual input + output tokens
  • Edit requests have noticeably higher input tokens than text-to-image due to forced high-fidelity
  • Streaming (stream: true + partial_images: N) costs an extra 100 output image tokens per partial
  • Compared to gpt-image-1.5 at the same size and quality, gpt-image-2 is about 20-30% cheaper
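
To make the per-image reference prices easier to budget with, here is a minimal sketch that looks up a unit price and multiplies it out for a batch. The function name `estimate_batch_cost` and the `group_rate` multiplier are illustrative assumptions, not part of the API; the prices are copied from the table above, and 2K / 4K sizes are deliberately left out because they have no fixed per-image price.

```python
from decimal import Decimal

# Per-image reference prices in USD, copied from the pricing table above.
PRICE_USD = {
    "low": {"1024x1024": Decimal("0.006"), "1024x1536": Decimal("0.005"), "1536x1024": Decimal("0.005")},
    "medium": {"1024x1024": Decimal("0.053"), "1024x1536": Decimal("0.041"), "1536x1024": Decimal("0.041")},
    "high": {"1024x1024": Decimal("0.211"), "1024x1536": Decimal("0.165"), "1536x1024": Decimal("0.165")},
}


def estimate_batch_cost(quality: str, size: str, count: int,
                        group_rate: Decimal = Decimal("1.0")) -> Decimal:
    """Rough estimate: unit price x image count x group rate (hypothetical helper)."""
    try:
        unit_price = PRICE_USD[quality][size]
    except KeyError:
        # 2K / 4K and custom sizes are token-metered, so no fixed price exists.
        raise ValueError(f"no fixed per-image price for {quality} / {size}") from None
    return unit_price * count * group_rate


if __name__ == "__main__":
    # 100 high-quality 1024x1024 images at the default rate.
    print(estimate_batch_cost("high", "1024x1024", 100))
```

Actual bills are token-metered, so treat the result as a planning figure rather than an invoice amount.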

Group Setup

The gpt-image-2 official-relay channel offers two groups. Switch in dashboard → Token Settings → Group:

| Group | Rate | When to use |
| --- | --- | --- |
| Default | 1.0x | Same price as OpenAI's list; first choice when capacity is available; peak hours may see 429 / concurrency squeezes |
| image2Enterprise | 1.2x | Stable, capacity-prioritized fallback when the default group is tight |

Why 1.2x? The rate is calibrated so that a $3,000 single top-up with the 20% bonus lands at roughly OpenAI list price: $3,000 buys $3,600 of credit, and $3,600 / 1.2 = $3,000 of list-price usage. APIYI takes no margin on this lane (tax costs aside) and runs it as a pure supply-priority channel. When the default group is unstable, switch your token to image2Enterprise to ride out the spike.
Token creation UI: billing mode = pay-as-you-go priority, group = image2Enterprise (1.2x), the high-speed list-price GPT-image-2 enterprise group
📖 Stability check (recent call log): /en/live/2026-04/image2-enterprise-stable

Technical Specifications

| Dimension | Value |
| --- | --- |
| Model name | gpt-image-2 |
| Speed | Typically 10–40 seconds at low and 30–90 seconds at medium; high at 2K/4K runs 3–5 minutes |
| Output resolution | Any valid size (1K/2K/4K, max 3840×2160) |
| Quality tiers | auto / low / medium / high |
| Output formats | png (default) / jpeg / webp |
| Chinese prompts | ✅ Native |
| Per call | 1 image (n=1) |
| Reference image limit | 16 (image[]) |
| Mask inpainting | ✅ Supported (alpha channel required) |
| Transparent background | ❌ Not supported (background: transparent errors) |
| Response field | b64_json (raw base64, no prefix) |

Endpoints

| Endpoint | Purpose | Content-Type |
| --- | --- | --- |
| POST /v1/images/generations | Text-to-image | application/json |
| POST /v1/images/edits | Reference editing / multi-image fusion / mask inpainting | multipart/form-data |

Domain selection: api.apiyi.com is the primary domain. Other gateway domains like b.apiyi.com / vip.apiyi.com work identically.
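
As an illustration of the JSON body for POST /v1/images/generations, the sketch below assembles and checks the parameters this page mentions (size, quality, output_format, output_compression). The helper name `build_generation_payload` is hypothetical, and the 0–100 range for output_compression is an assumption based on the OpenAI Images API; verify it against the official reference before relying on it.

```python
import json

ALLOWED_QUALITY = {"auto", "low", "medium", "high"}
ALLOWED_FORMATS = {"png", "jpeg", "webp"}


def build_generation_payload(prompt, size="auto", quality="auto",
                             output_format="png", output_compression=None):
    """Assemble a request body for /v1/images/generations (hypothetical helper)."""
    if quality not in ALLOWED_QUALITY:
        raise ValueError(f"unsupported quality: {quality!r}")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format!r}")
    payload = {
        "model": "gpt-image-2",
        "prompt": prompt,
        "size": size,
        "quality": quality,
        "output_format": output_format,
        "n": 1,  # gpt-image-2 returns one image per call
    }
    if output_compression is not None:
        if output_format == "png":
            raise ValueError("output_compression only applies to jpeg / webp")
        if not 0 <= output_compression <= 100:  # assumed valid range
            raise ValueError("output_compression must be between 0 and 100")
        payload["output_compression"] = output_compression
    return payload


if __name__ == "__main__":
    body = build_generation_payload("A neon street sign", size="2048x1152",
                                    quality="low", output_format="jpeg",
                                    output_compression=85)
    print(json.dumps(body, indent=2))
```

The serialized body would be sent with an `Authorization: Bearer <token>` header; the HTTP call itself is left out so the sketch stays free of network dependencies.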

Size Reference

Preset Sizes

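| size | Meaning | Resolution tier |
| --- | --- | --- |
| auto | Adaptive (default) | Model decides |
| 1024x1024 | Square 1:1 | 1K |
| 1536x1024 | Landscape 3:2 | 1K |
| 1024x1536 | Portrait 2:3 | 1K |
| 2048x2048 | Square 1:1 | 2K |
| 2048x1152 | Landscape 16:9 | 2K |
| 3840x2160 | Landscape 16:9 | 4K |
| 2160x3840 | Portrait 9:16 | 4K |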

Custom Size Constraints

gpt-image-2 accepts any valid size that satisfies all of:
  1. Max edge ≤ 3840px
  2. Both edges are multiples of 16
  3. Aspect ratio ≤ 3:1
  4. Total pixels ∈ [655,360, 8,294,400] (~0.65MP to ~8.3MP)
Valid examples: 1600x1200, 1792x1024, 2048x1536, 2560x1440
Invalid examples: 1000x1000 (not a multiple of 16), 4000x4000 (edge over 3840px), 3840x1000 (ratio > 3:1)
Outputs above 2560×1440 (~3.69MP) are officially marked experimental and may show quality fluctuations. For production, prefer presets like 2048x1152 / 2048x2048 / 3840x2160.
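
The four constraints above are easy to check client-side before a request is sent, which avoids a round trip that ends in a 400. The following is a minimal sketch; `validate_size` is a hypothetical helper name, and it returns the list of violated rules rather than raising so callers can surface all problems at once.

```python
MAX_EDGE = 3840
EDGE_MULTIPLE = 16
MAX_RATIO = 3
MIN_PIXELS = 655_360
MAX_PIXELS = 8_294_400


def validate_size(size: str) -> list[str]:
    """Return the violated size constraints; an empty list means the size is valid."""
    if size == "auto":
        return []
    try:
        width_text, height_text = size.lower().split("x")
        width, height = int(width_text), int(height_text)
    except ValueError:
        return [f"cannot parse {size!r} as WIDTHxHEIGHT"]
    if width <= 0 or height <= 0:
        return ["edges must be positive"]

    problems = []
    if max(width, height) > MAX_EDGE:
        problems.append(f"max edge exceeds {MAX_EDGE}px")
    if width % EDGE_MULTIPLE or height % EDGE_MULTIPLE:
        problems.append(f"both edges must be multiples of {EDGE_MULTIPLE}")
    # Integer comparison avoids floating-point error at exactly 3:1.
    if max(width, height) > MAX_RATIO * min(width, height):
        problems.append(f"aspect ratio exceeds {MAX_RATIO}:1")
    if not MIN_PIXELS <= width * height <= MAX_PIXELS:
        problems.append(f"total pixels outside [{MIN_PIXELS}, {MAX_PIXELS}]")
    return problems


if __name__ == "__main__":
    for candidate in ("1600x1200", "1000x1000", "3840x1000", "3840x2160"):
        print(candidate, validate_size(candidate) or "ok")
```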

Best Practices

Onboarding tip: get the API working with low first, then scale up
We’ve seen new integrators jump straight to quality=high + high resolution and end up waiting ≈ 235 seconds (~4 minutes) per image, only to suspect the API was stuck. high mode has the highest inference complexity, and 4K can stretch close to 5 minutes. Before going to production, integrate end-to-end with quality=low first (auth, SDK, params, timeouts, error handling), then move up to medium / high only as your real quality requirement demands.
1. Integrate with low first

For new integrations, start with quality=low + a preset size to validate the full call chain (auth, params, timeouts, error handling). low is several times faster than high, so functional issues surface quickly without being masked by long latency.
2. Prefer preset sizes

The 8 official presets are tuned for stable speed and quality. Reserve custom sizes for genuinely unusual aspect ratios.
3. Match quality to scenario

Drafts / batch → low; daily / final → medium; text, fine textures, print → high. Note that moving from low to high is more than a gain in visual fidelity; it is also a step change in inference complexity, so latency scales accordingly.
4. Choose JPEG output

For final display, output_format=jpeg + output_compression=85 is faster than PNG and roughly half the size.
5. Lock high for text scenarios

Text rendering is a key strength but lower tiers can still blur. Lock quality=high for signage and poster scenarios.
6. Prepare reference images

Each image ≤ 10MB; PNG/JPEG/WebP supported; up to 16 images; reference order with “image 1 / image 2” in the prompt.
7. Tier your client timeout (high → 600s safety net)

The two parameters that dominate latency are quality and size — especially quality. Configure client timeouts per tier:

| quality | Recommended client timeout | Observed latency |
| --- | --- | --- |
| low | 120 seconds | typically 10–40 seconds |
| medium | 240 seconds | typically 30–90 seconds |
| high | 600 seconds (safety net) | 2K/4K runs 3–5 minutes; long tail observed at 235+ seconds |

For high mode, set 600s as the safety-net timeout to absorb queueing, long-tail variance, and upstream jitter. Show progress in the UI; consider a task queue server-side.
8. Migration notes

Migrating from gpt-image-1.5: drop input_fidelity (forced high-fidelity, will error if passed); avoid background: transparent (not supported).
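
For codebases that still build gpt-image-1.5 style parameter dictionaries, the migration rule above could be applied in one place. This is a sketch under assumptions: `migrate_params_from_1_5` is a hypothetical helper, and it chooses to raise on background="transparent" rather than silently change the output, since only the caller knows whether an opaque result is acceptable.

```python
def migrate_params_from_1_5(params: dict) -> dict:
    """Return a copy of gpt-image-1.5 request params adjusted for gpt-image-2."""
    # input_fidelity is no longer accepted; high fidelity is always on.
    migrated = {key: value for key, value in params.items() if key != "input_fidelity"}
    if migrated.get("background") == "transparent":
        # Not supported by gpt-image-2; let the caller pick a workaround.
        raise ValueError("background='transparent' is not supported by gpt-image-2")
    migrated["model"] = "gpt-image-2"
    return migrated


if __name__ == "__main__":
    old = {"model": "gpt-image-1.5", "prompt": "poster", "input_fidelity": "high"}
    print(migrate_params_from_1_5(old))
```

The input dictionary is left untouched, so the same source parameters can still be sent to gpt-image-1.5 when a transparent background is genuinely required.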

Errors & Retries

| Status | Meaning | Suggested action |
| --- | --- | --- |
| 400 | Invalid parameters (size constraint violation, unsupported field, etc.) | Validate against size constraints; do not pass input_fidelity / background: transparent |
| 401 | Invalid token | Check Bearer Token |
| 403 | Content moderation block | Adjust prompt or pass moderation: low |
| 429 | Rate limit / insufficient balance | Exponential backoff |
| 5xx | Gateway / backend error | Retry 1–2 times |
| Timeout | Long tail | Tier client timeout by quality: low → 120s / medium → 240s / high → 600s (high + 2K/4K runs 3–5 minutes; long tail observed at 235+ seconds) |

Client recommendations:
  • Tier request timeout by quality: low → 120 seconds / medium → 240 seconds / high ≥ 600 seconds (safety net; observed runs take 3–5 minutes, and configuring around 120s/360s causes many false timeouts)
  • Integrate with quality=low first, then move up to medium / high as real quality needs demand
  • Exponential backoff for 5xx and timeouts (suggest 2 retries)
  • Log x-request-id header for support
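
The client recommendations above (tiered timeouts plus exponential backoff with two retries) can be sketched without committing to a particular HTTP library. Everything named here is an assumption for illustration: `call_with_retry`, the `RetryableError` exception a caller would raise for 5xx responses and timeouts, and the choice to treat quality=auto like high.

```python
import time

# Recommended client timeouts per quality tier, in seconds.
TIMEOUT_SECONDS = {"low": 120, "medium": 240, "high": 600}


class RetryableError(Exception):
    """Raised by the caller's send function for 5xx responses and timeouts."""


def call_with_retry(send, quality, max_retries=2, base_delay=2.0, sleep=time.sleep):
    """Call send(timeout) and retry with exponential backoff on RetryableError.

    send is any callable that performs the request with the given timeout.
    """
    # Assumption: unknown tiers such as "auto" get the high safety-net timeout.
    timeout = TIMEOUT_SECONDS.get(quality, TIMEOUT_SECONDS["high"])
    for attempt in range(max_retries + 1):
        try:
            return send(timeout)
        except RetryableError:
            if attempt == max_retries:
                raise
            sleep(base_delay * 2 ** attempt)  # 2s, 4s, ... with the defaults


if __name__ == "__main__":
    attempts = []

    def flaky_send(timeout):
        attempts.append(timeout)
        if len(attempts) < 3:
            raise RetryableError("simulated 502")
        return "ok"

    print(call_with_retry(flaky_send, "low", sleep=lambda seconds: None))
```

Because a request keeps running and is billed even if the client disconnects, retrying after a client-side timeout can generate and bill a second image; a generous timeout is usually cheaper than an eager retry.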

FAQ

gpt-image-2 returns b64_json as a raw base64 string with no data URL prefix, unlike gpt-image-2-all, so the client has to handle the prefix itself. Two client patterns:
  • Write file: base64.b64decode(b64_str) → write to disk
  • Browser render: img.src = 'data:image/png;base64,' + b64_str (prepend manually)
If your code assumes the 1.5-era “already prefixed” behavior, you’ll get a corrupted data URL — handle this explicitly.
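
Both client patterns can be wrapped in small helpers so the prefix handling lives in one place. This is a sketch using only the standard library; the helper names `decode_image` and `to_data_url` are hypothetical, and the prefix check is there so that a string which already carries a data URL prefix is not prefixed twice.

```python
import base64


def decode_image(b64_str: str) -> bytes:
    """Decode the raw base64 string from b64_json into image bytes."""
    return base64.b64decode(b64_str, validate=True)


def to_data_url(b64_str: str, mime: str = "image/png") -> str:
    """Build a data URL for browser rendering, leaving prefixed input unchanged."""
    if b64_str.startswith("data:"):
        return b64_str
    return f"data:{mime};base64,{b64_str}"


if __name__ == "__main__":
    sample = base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii")
    with open("output.png", "wb") as handle:
        handle.write(decode_image(sample))
    print(to_data_url(sample))
```

Pass the MIME type that matches the requested output_format (for example image/jpeg) so the browser interprets the bytes correctly.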
gpt-image-2 forces high-fidelity processing of reference images and no longer accepts input_fidelity. When migrating from 1.5, just remove this field — no replacement needed.
gpt-image-2 does not support background: transparent (will error). Two workarounds:
  • Set background to opaque (or omit) and key out transparency yourself with PIL / sharp / online tools
  • Temporarily fall back to gpt-image-1.5 for scenarios that genuinely need transparency
Each call returns 1 image (n=1). For N images, issue N parallel requests. Each is independently token-billed.
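
One way to issue the N parallel requests is a thread pool, since each call spends most of its time waiting on the network. The sketch below is an assumption-laden illustration: `generate_many` is a hypothetical helper, and `generate_one` stands for whatever function in your code performs a single request and returns its result.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_many(generate_one, prompt, count, max_workers=4):
    """Run `count` independent single-image requests in parallel.

    generate_one(prompt) is supplied by the caller and performs one request.
    Results come back in submission order; the first failure is re-raised.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(generate_one, prompt) for _ in range(count)]
        return [future.result() for future in futures]


if __name__ == "__main__":
    def fake_generate(prompt):
        # Stand-in for a real request; returns a placeholder instead of b64_json.
        return f"image for {prompt}"

    print(generate_many(fake_generate, "red bicycle", 3))
```

Keep max_workers well below the account's rate limit, because every request in the batch counts against it and is billed separately.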
Higher resolution and higher quality require more output image tokens, which naturally takes longer. We’ve seen quality=high + high resolution take ≈ 235 seconds (~4 minutes) per image in real customer integrations, and 3840×2160 + high long-tail can stretch close to 5 minutes. Recommendations:
  • Integrate with quality=low first to validate the call chain, then move up as real quality needs demand
  • Tier client timeout by quality: low → 120s / medium → 240s / high ≥ 600s (safety net)
  • Show “generating” progress in the UI
  • Use 1024×1024 / 1536×1024 1K presets when 4K isn’t needed
Because gpt-image-2 auto-enables high-fidelity processing of reference images, the references themselves convert to large input token counts via the Vision pricing rules. Edit input tokens are noticeably higher than text-to-image — budget accordingly.
Mask requirements:
  • Same size and format as the original image, ≤ 50MB
  • Must have alpha channel: transparent (alpha=0) = inpaint area, opaque = preserve
  • Only applies to the first image
  • Mask is a “soft guide” — the model may extend or contract around the masked region
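
The first two mask requirements can be checked before upload by reading the PNG header directly, with no imaging library. This sketch assumes both the mask and the original are PNG files and that "50MB" means 50 × 1024 × 1024 bytes; `png_info` and `check_mask` are hypothetical helper names. It only recognises PNG color types 4 and 6 as carrying alpha, so a palette PNG that stores transparency in a tRNS chunk would be flagged.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_MASK_BYTES = 50 * 1024 * 1024  # assumed meaning of "50MB"
ALPHA_COLOR_TYPES = {4, 6}  # grayscale+alpha, RGBA


def png_info(data: bytes):
    """Return (width, height, color_type) read from a PNG's IHDR chunk."""
    if len(data) < 26 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # IHDR data starts at byte 16: width, height, bit depth, color type.
    width, height, _bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return width, height, color_type


def check_mask(mask_bytes: bytes, image_bytes: bytes) -> list[str]:
    """Return a list of problems with the mask; empty means it looks usable."""
    problems = []
    if len(mask_bytes) > MAX_MASK_BYTES:
        problems.append("mask is larger than 50MB")
    mask_w, mask_h, mask_color = png_info(mask_bytes)
    image_w, image_h, _ = png_info(image_bytes)
    if (mask_w, mask_h) != (image_w, image_h):
        problems.append("mask dimensions differ from the original image")
    if mask_color not in ALPHA_COLOR_TYPES:
        problems.append("mask has no alpha channel")
    return problems
```

A typical call would read both files with `open(path, "rb").read()` and refuse to upload when `check_mask` returns a non-empty list.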
| Pick | When |
| --- | --- |
| gpt-image-2 (Official) | Need precise size/quality control, must match OpenAI official exactly, want 4K output, need mask inpainting |
| gpt-image-2-all (Reverse) | Want flat $0.03/image, 30–60s render, minimal parameters, strong consistency / Chinese text |

The official OpenAI SDK works with zero code change: point base_url to https://api.apiyi.com/v1 and set api_key to your APIYI token:
from openai import OpenAI
client = OpenAI(api_key="sk-your-key", base_url="https://api.apiyi.com/v1")
resp = client.images.generate(model="gpt-image-2", prompt="...", size="2048x1152", quality="high")
In-flight requests cannot be cancelled. gpt-image-2 uses OpenAI’s official synchronous endpoint: once a request is submitted, it runs to completion with no “cancel” signal. Even if the client disconnects, the server still finishes generation and bills normally. Configure client-side timeouts carefully, and do not assume “disconnect = no charge”.
The default rate limit is 100 RPM (requests per minute). Actual usable RPM is also dynamically adjusted by overall platform concurrency. If your workload needs more, contact us with your estimated QPS / RPM and we can provision additional capacity.
There is no async or callback mode. gpt-image-2 strictly mirrors the OpenAI official API, which is synchronous only. The request blocks until the result is returned (high + 4K realistically takes 3–5 minutes). If you need an async queue or callback mechanism:
  • Wrap it yourself with a task queue (Celery / BullMQ, etc.) at the business layer
  • Or use gpt-image-2-all — generates in 30–60s, easier to poll from the front end
Requests rejected by moderation are not billed. OpenAI’s built-in content moderation rejects unsafe / malformed requests with a 400 error, and no charge is incurred. Typical response:
{
  "status_code": 400,
  "error": {
    "message": "Your request was rejected by the safety system. ...",
    "type": "shell_api_error",
    "code": "moderation_blocked"
  }
}
Other zero-cost errors: 401 (invalid token), 429 (rate limit). Token billing only kicks in once the request actually reaches the model generation stage (i.e., 200 + b64_json received).
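
To act on these responses programmatically, a client can parse the error body and map it to a next step. The sketch below follows the sample response and the status table on this page; the function name `classify_error` and the action labels it returns are illustrative assumptions, not part of the API.

```python
import json


def classify_error(http_status: int, body: str) -> dict:
    """Map an error response to a suggested client action (hypothetical labels)."""
    try:
        error = json.loads(body).get("error", {})
    except (json.JSONDecodeError, AttributeError):
        error = {}  # body was not a JSON object
    if not isinstance(error, dict):
        error = {}
    code = error.get("code")

    if code == "moderation_blocked" or http_status == 403:
        action = "adjust_prompt"
    elif http_status == 401:
        action = "check_token"
    elif http_status == 429 or http_status >= 500:
        action = "retry_with_backoff"
    else:
        action = "fix_request"
    return {"code": code, "message": error.get("message"), "action": action}


if __name__ == "__main__":
    sample = json.dumps({
        "status_code": 400,
        "error": {
            "message": "Your request was rejected by the safety system. ...",
            "type": "shell_api_error",
            "code": "moderation_blocked",
        },
    })
    print(classify_error(400, sample))
```

Checking the error code before the HTTP status matters here, because the moderation rejection in the sample arrives as a 400 and would otherwise be treated as a parameter problem.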
gpt-image-2 is OpenAI’s official flagship, billed by token. If you prioritize flat pricing ($0.03/image) and faster generation (30–60s), see gpt-image-2-all.