POST /v1/images/edits
Image Edit: edit or fuse reference images by instruction
curl --request POST \
  --url https://api.apiyi.com/v1/images/edits \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form model=gpt-image-2 \
  --form 'prompt=Place subject from image 1 into scene from image 2, using color style from image 3' \
  --form image='@example-file' \
  --form mask='@example-file'
{
  "created": 1776832476,
  "data": [
    {
      "b64_json": "iVBORw0KGgoAAAANSUhEUgAA..."
    }
  ],
  "usage": {
    "input_tokens": 1280,
    "output_tokens": 6240,
    "total_tokens": 7520
  }
}

The interactive Playground on the right supports direct local image upload. Fill in your API Key in Authorization (format: Bearer sk-xxx), select image / mask files, fill in prompt and model, and send.
Use case: This page is for “edit / fuse / inpaint based on one or more reference images”. Request format is multipart/form-data. For pure text-to-image, use the Text-to-Image endpoint.
🖥️ Browser Playground limitation (important)
This endpoint returns a raw base64 string (typically several MB) in the response. Because of browser rendering limits, the Playground on the right may show the error 请求时发生错误: unable to complete request ("An error occurred during the request") after the response arrives. The request actually succeeded; the browser just can't render such a long base64 string.
Recommended workflow (beginner-friendly):
  • Copy the Python / Node.js / cURL sample below and run it locally. The sample code decodes the base64 response (for example with base64.b64decode in Python) and writes the image to a file.
  • If you must use the in-browser Playground, use a tiny reference image (< 50KB), set size to the smallest tier (e.g. 1024x1024), and set quality to low.
⚠️ Key differences (when migrating from gpt-image-1.5)
  • Do not pass input_fidelity: gpt-image-2 always uses high fidelity, and passing the parameter returns 400
  • Edit requests have noticeably higher input tokens — references convert to many tokens via Vision pricing; budget accordingly
  • background: transparent not supported — use opaque or post-process
  • Multi-image fusion: max 16 — repeat the image[] field; more than 16 errors out
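The differences above can be applied mechanically to an existing parameter dictionary. The following is a minimal sketch, not an official migration tool: the function name `migrate_edit_params` is hypothetical, and falling back from `transparent` to `opaque` is one possible policy (the alternative is to post-process the output yourself).

```python
def migrate_edit_params(params: dict) -> dict:
    """Return a copy of gpt-image-1.5 edit parameters adjusted for gpt-image-2.

    Hypothetical helper: it only encodes the differences listed on this page.
    """
    migrated = dict(params)  # leave the caller's dict untouched
    migrated["model"] = "gpt-image-2"

    # input_fidelity is rejected with HTTP 400 on gpt-image-2, so drop it.
    migrated.pop("input_fidelity", None)

    # background=transparent is not supported; fall back to opaque (assumed policy).
    if migrated.get("background") == "transparent":
        migrated["background"] = "opaque"

    return migrated
```

The multi-image limit of 16 is not checked here because it depends on the files rather than on the scalar parameters.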
📎 Multi-image fusion order matters
The image[] field accepts multiple reference images. Upload order maps to the "image 1 / image 2 / image 3" references in the prompt. Reference them explicitly:
Place subject from image 1 into scene from image 2, using color style from image 3
Recommended ≤ 10MB per image, formats: png / jpg / webp.
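Checking the reference images locally before uploading avoids a wasted request. The sketch below assumes the limits stated above (at most 16 images, png / jpg / webp); it treats the recommended 10MB size as a hard limit and also accepts the `.jpeg` suffix, both of which are assumptions. The name `check_reference_images` is hypothetical.

```python
from pathlib import Path

MAX_IMAGES = 16                      # documented maximum for multi-image fusion
MAX_BYTES = 10 * 1024 * 1024         # recommended size, enforced here by assumption
ALLOWED_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}  # .jpeg is an assumption


def check_reference_images(paths):
    """Validate reference images and return them as Path objects in upload order."""
    paths = [Path(p) for p in paths]
    if not 1 <= len(paths) <= MAX_IMAGES:
        raise ValueError(f"expected 1 to {MAX_IMAGES} images, got {len(paths)}")
    for index, path in enumerate(paths, start=1):
        # index matches the "image N" wording used in the prompt
        if path.suffix.lower() not in ALLOWED_SUFFIXES:
            raise ValueError(f"image {index}: unsupported format {path.suffix!r}")
        if path.stat().st_size > MAX_BYTES:
            raise ValueError(f"image {index}: larger than 10MB")
    return paths
```

Because the list order is preserved, the first path is "image 1" in the prompt, the second is "image 2", and so on.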

Code Examples

Python (OpenAI SDK · single-image edit)

from openai import OpenAI
import base64

client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://api.apiyi.com/v1"
)

resp = client.images.edit(
    model="gpt-image-2",
    image=open("photo.png", "rb"),
    prompt="Replace the background with a seaside sunset, preserve subject details",
    size="1536x1024",
    quality="high"
)

# b64_json is raw base64 (no prefix) — decode manually
with open("edited.png", "wb") as f:
    f.write(base64.b64decode(resp.data[0].b64_json))

Python (OpenAI SDK · multi-image fusion)

# Reuses `client` and `base64` from the single-image example above.
resp = client.images.edit(
    model="gpt-image-2",
    image=[
        open("person.png", "rb"),
        open("scene.png", "rb"),
        open("style.png", "rb"),
    ],
    prompt="Place subject from image 1 into scene from image 2, using color style from image 3, keep lighting consistent",
    size="1536x1024",
    quality="high"
)

with open("fused.png", "wb") as f:
    f.write(base64.b64decode(resp.data[0].b64_json))

cURL (multi-image fusion)

curl -X POST "https://api.apiyi.com/v1/images/edits" \
  -H "Authorization: Bearer sk-your-api-key" \
  -F "model=gpt-image-2" \
  -F "prompt=Place subject from image 1 into scene from image 2, using color style from image 3" \
  -F "size=1536x1024" \
  -F "quality=high" \
  -F "image[]=@person.png" \
  -F "image[]=@scene.png" \
  -F "image[]=@style.png"

cURL (mask inpainting)

curl -X POST "https://api.apiyi.com/v1/images/edits" \
  -H "Authorization: Bearer sk-your-api-key" \
  -F "model=gpt-image-2" \
  -F "prompt=Replace the sky with pink sunset clouds" \
  -F "size=1024x1024" \
  -F "quality=high" \
  -F "image[]=@photo.png" \
  -F "mask=@mask.png" \
  | jq -r '.data[0].b64_json' | base64 --decode > photo_edited.png

Node.js (Native fetch + FormData · multi-image fusion)

import fs from 'node:fs';

const form = new FormData();
form.append('model', 'gpt-image-2');
form.append('prompt', 'Place subject from image 1 into scene from image 2');
form.append('size', '1536x1024');
form.append('quality', 'high');
form.append('image[]', new Blob([fs.readFileSync('./person.png')]), 'person.png');
form.append('image[]', new Blob([fs.readFileSync('./scene.png')]), 'scene.png');

const resp = await fetch('https://api.apiyi.com/v1/images/edits', {
    method: 'POST',
    headers: { 'Authorization': 'Bearer sk-your-api-key' },
    body: form
});

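// Fail loudly on HTTP errors instead of trying to decode an error body.
if (!resp.ok) {
    throw new Error(`Request failed: ${resp.status} ${await resp.text()}`);
}

const { data } = await resp.json();
fs.writeFileSync('fused.png', Buffer.from(data[0].b64_json, 'base64'));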
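For completeness, the same raw multipart request can be sketched in Python without the OpenAI SDK. This version assumes the third-party `requests` library is installed; the helper names `build_image_parts` and `edit_images` and the 300-second timeout are illustrative choices, not part of the API.

```python
import base64
from pathlib import Path

API_URL = "https://api.apiyi.com/v1/images/edits"


def build_image_parts(paths, field="image[]"):
    """Return requests-style multipart parts, one per image, in upload order."""
    return [(field, (Path(p).name, Path(p).read_bytes())) for p in paths]


def edit_images(api_key, prompt, paths, out_path, **options):
    """POST the images and write the decoded result to out_path (hypothetical helper)."""
    import requests  # third-party; imported lazily so build_image_parts has no dependency

    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        data={"model": "gpt-image-2", "prompt": prompt, **options},
        files=build_image_parts(paths),  # repeated field name keeps the upload order
        timeout=300,  # assumed value; image edits can be slow
    )
    resp.raise_for_status()
    b64 = resp.json()["data"][0]["b64_json"]
    Path(out_path).write_bytes(base64.b64decode(b64))
```

A call might look like `edit_images("sk-your-api-key", "Place subject from image 1 into scene from image 2", ["person.png", "scene.png"], "fused.png", size="1536x1024", quality="high")`.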

Parameter Reference

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| model | text | Yes | - | Fixed: gpt-image-2 |
| prompt | text | Yes | - | Edit / fusion instruction |
| image[] | file | Yes | - | Reference images, can repeat (max 16) |
| mask | file | No | - | Mask image (only applies to first image, alpha channel required) |
| size | text | No | auto | Output size, same as text-to-image |
| quality | text | No | auto | low / medium / high / auto |
| output_format | text | No | png | png / jpeg / webp |
| output_compression | text | No | - | 0–100, only for jpeg / webp |
| background | text | No | auto | auto / opaque (not supported: transparent) |
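The allowed values in this table can be checked client-side before a request is sent. The following is a sketch under assumptions: `build_edit_fields` is a hypothetical helper, and rejecting `output_compression` with png is a deliberate strictness choice (this page only says the setting has no effect for png, not that it is an error).

```python
QUALITIES = {"auto", "low", "medium", "high"}
FORMATS = {"png", "jpeg", "webp"}
BACKGROUNDS = {"auto", "opaque"}  # transparent is not supported


def build_edit_fields(prompt, size="auto", quality="auto", output_format="png",
                      output_compression=None, background="auto"):
    """Validate the text fields of an edit request and return them as a dict."""
    if not prompt:
        raise ValueError("prompt is required")
    if quality not in QUALITIES:
        raise ValueError(f"quality must be one of {sorted(QUALITIES)}")
    if output_format not in FORMATS:
        raise ValueError(f"output_format must be one of {sorted(FORMATS)}")
    if background not in BACKGROUNDS:
        raise ValueError(f"background must be one of {sorted(BACKGROUNDS)}")

    fields = {
        "model": "gpt-image-2",
        "prompt": prompt,
        "size": size,
        "quality": quality,
        "output_format": output_format,
        "background": background,
    }
    if output_compression is not None:
        # Only meaningful for jpeg / webp; rejected for png here by choice.
        if output_format not in {"jpeg", "webp"}:
            raise ValueError("output_compression only applies to jpeg / webp")
        if not 0 <= output_compression <= 100:
            raise ValueError("output_compression must be between 0 and 100")
        fields["output_compression"] = str(output_compression)
    return fields
```

The returned dict holds only the text fields; the image and mask files are attached separately as multipart file parts.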

Mask Inpainting Requirements

  • Same size and format as original, ≤ 50MB
  • Must have alpha channel: transparent (alpha=0) = inpaint area, opaque = preserve
  • Mask only applies to the first image
  • Mask is a “soft guide” — the model may extend or contract around the masked region
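Some of these requirements can be verified locally before upload. The sketch below handles PNG files only and uses just the standard library: it reads the IHDR chunk to compare dimensions and to check that the mask's color type carries an alpha channel (type 4 or 6). Palette PNGs that encode transparency in a tRNS chunk are rejected by this simplified check, and the function names are hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_MASK_BYTES = 50 * 1024 * 1024
ALPHA_COLOR_TYPES = {4, 6}  # grayscale+alpha, RGBA


def read_png_header(data: bytes):
    """Return (width, height, color_type) from the IHDR chunk of PNG bytes."""
    if len(data) < 26 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height, _bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return width, height, color_type


def check_mask(image_bytes: bytes, mask_bytes: bytes) -> None:
    """Raise ValueError if the mask breaks the size, dimension, or alpha rules."""
    if len(mask_bytes) > MAX_MASK_BYTES:
        raise ValueError("mask is larger than 50MB")
    image_w, image_h, _ = read_png_header(image_bytes)
    mask_w, mask_h, color_type = read_png_header(mask_bytes)
    if (image_w, image_h) != (mask_w, mask_h):
        raise ValueError("mask dimensions differ from the original image")
    if color_type not in ALPHA_COLOR_TYPES:
        raise ValueError("mask has no alpha channel")
```

Typical use is `check_mask(Path("photo.png").read_bytes(), Path("mask.png").read_bytes())` with `pathlib.Path`, called against the first image only, since the mask applies to that image.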
Multi-turn iteration: feed the previous output back as the next call’s image[] with a new instruction to incrementally refine. Each round is independently token-billed — watch cumulative cost.
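The iteration loop and its cumulative cost can be sketched as follows. `edit_fn` is a hypothetical callable that you supply; it is assumed to send one edit request and return the new image bytes together with the `usage` object from the response.

```python
def refine(edit_fn, image_bytes, instructions):
    """Apply instructions one round at a time, feeding each output into the next call.

    edit_fn(image_bytes, prompt) -> (new_image_bytes, usage_dict) is an assumed
    interface, not part of the API.
    Returns the final image bytes and the total tokens billed across all rounds.
    """
    total_tokens = 0
    for prompt in instructions:
        image_bytes, usage = edit_fn(image_bytes, prompt)
        total_tokens += usage["total_tokens"]  # each round is billed independently
    return image_bytes, total_tokens
```

Tracking the running total makes it easy to stop early once a budget is reached.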

Response Format

{
    "created": 1776832476,
    "data": [
        {
            "b64_json": "iVBORw0KGgoAAAANSUhEUgAA..."
        }
    ],
    "usage": {
        "input_tokens": 1280,
        "output_tokens": 6240,
        "total_tokens": 7520
    }
}
b64_json is raw base64, without the data:image/...;base64, prefix — different from gpt-image-2-all. Decode it client-side to write a file, or prepend the prefix for browser rendering.
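Both directions can be sketched with the standard library. The helper names are hypothetical, and the default MIME type assumes the default `output_format` of png; pass a different one if you requested jpeg or webp.

```python
import base64


def to_data_url(b64_json: str, mime: str = "image/png") -> str:
    """Prepend the data URL prefix so a browser can render the image directly."""
    return f"data:{mime};base64,{b64_json}"


def decode_image(value: str) -> bytes:
    """Decode raw base64, also tolerating a value that already has a data URL prefix."""
    if value.startswith("data:"):
        value = value.split(",", 1)[1]  # drop "data:image/...;base64,"
    return base64.b64decode(value)
```

Accepting both forms in `decode_image` lets the same client code handle responses with or without the prefix.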
Edit requests’ input_tokens are typically significantly higher than text-to-image at the same size, because reference images are billed per Vision pricing rules. Multi-image fusion adds proportionally more input tokens per image.

Authorizations

Authorization
string
header
required

API Key obtained from APIYI Console

Body

multipart/form-data
model
enum<string>
default:gpt-image-2
required

Model name, fixed as gpt-image-2

Available options:
gpt-image-2
prompt
string
required

Edit/fusion instruction. For multi-image, use 'image 1 / image 2 / image 3' to reference upload order

Example:

"Place subject from image 1 into scene from image 2, using color style from image 3"

image
file[]
required

Reference images. For a single image, send the field once; for multiple images, repeat the same image field (e.g., -F image=@a.png -F image=@b.png, max 16) — upload order maps to image 1 / image 2 / ... in the prompt. Each ≤ 10MB, formats: png/jpg/webp

mask
file

Mask image (optional, only applies to first image). Requirements:

  • Same size and format as original
  • Must have alpha channel (alpha=0 = inpaint area, opaque = preserve)
  • Single file ≤ 50MB
size
string
default:auto

Output size (same as text-to-image). Preset or constraint-satisfying custom size

Example:

"1536x1024"

quality
enum<string>
default:auto

Quality tier

Available options:
auto,
low,
medium,
high
output_format
enum<string>
default:png

Output format

Available options:
png,
jpeg,
webp
output_compression
integer

Output compression (0–100), only effective for jpeg/webp

Required range: 0 <= x <= 100
background
enum<string>
default:auto

Background mode. auto or opaque. Not supported: transparent

Available options:
auto,
opaque

Response

Image generated successfully

created
integer
Example:

1776832476

data
object[]

Generation results (this model returns 1 image per call)

usage
object

Token usage for this call