Overview

VEO 3.1 is Google’s flagship AI video generation series, producing video with synchronized audio natively — fixed 8-second clips from text prompts or reference images. APIYI exposes VEO 3.1 through a reverse-engineered channel that proxies Google Flow, billed per-clip with both synchronous streaming and async task modes.
🎬 Highlights: Native synchronized audio + video output, fixed 8-second clips, Frame-to-Video creative mode, HD portrait/landscape, dramatically lower pricing than Google official (from $0.15), and live progress streaming. Best for short-form video, ad clips, product demos, and social-media assets in high-throughput production scenarios.

Sync API

POST /v1/chat/completions, reuses the OpenAI Chat Completions protocol with stream: true for live progress.

Async API

POST /v1/videos three-step async flow, supports text-to-video and Frame-to-Video uploads — built for batch management.

Why APIYI’s VEO 3.1?

VEO 3.1 is delivered through a reverse-engineered channel (transparent proxy to Google Flow), optimized for production scenarios across price, integration friction, and feature completeness:

Price Killer · Far Below Official Pricing

Starts at $0.15 per 8-second clip — over 80% cheaper than Google’s official pricing. No need to provision Google Cloud / Vertex AI accounts; per-clip billing is fully transparent.

Unlimited Concurrency · Production Scale

APIYI maintains a transparent account pool — linearly scale batch shoots, short-form video matrices, and ad pipelines. No Google account tier ceilings.

Same Per-Clip Pricing + Top-Up Bonuses

Stack top-up bonuses for further savings. Failed generations are not billed — settlement is by successful results only.

Global Zero-Friction Access

No overseas server or proxy required — connect to api.apiyi.com directly from Mainland China data centers, residential networks, or overseas nodes. Skip the Google Flow cross-border setup entirely.

OpenAI-Compatible · Dual-Mode Access

Sync uses /v1/chat/completions (same as chat models); async uses /v1/videos (OpenAI Video API style). Both protocols drop into your existing SDK / engineering code with zero changes.

Professional Support · Enterprise Onboarding

Our team has deep video-generation expertise: prompt engineering, Frame-to-Video reference prep, batch production, and post-processing. Full PoC-to-production technical support for enterprise customers.

Key Features

Native Synchronized Audio

VEO 3.1 outputs video with synchronized native audio (ambient sound, dialogue, score) generated alongside the visuals — no separate audio post-production needed.

Generation Speed Leader

-fast series in 30–60 seconds, standard series in 1–2 minutes — 50% faster than Sora 2, ideal for high-throughput content production.

Frame-to-Video Creative Mode

-fl suffix models accept 1 reference image (start frame) or 2 (start + end frames) to animate static visuals or generate seamless transitions between two frames.

Portrait / Landscape Switching

Portrait 720×1280 (social-media short-form) and landscape 1280×720 (ads, demos) — toggled via the -landscape model suffix.

Live Streaming Progress

Sync mode (/v1/chat/completions + stream: true) returns real-time > 🏃 Progress: XX% text fragments — your frontend can render a progress bar directly.

Async Task Model

Async mode returns a video_id for independent polling and download — ideal for batch management, resume-on-failure, and long-running background jobs.

Pay on Success

Failed generations / content-policy rejections / capacity errors are not billed — you only pay for the videos you actually receive.

Multi-Video Parallel (n parameter)

The sync-mode n parameter generates up to 4 different videos per request (same prompt, multiple results), so you can pick the best variant.

Pricing

Billed per clip (each clip is a fixed 8-second video). Only successfully generated videos are billed — failed tasks are free.

HD Series (720p, Live)

| Model | Description | Resolution | Price |
| --- | --- | --- | --- |
| veo-3.1 | Default portrait | 720×1280 | $0.25 |
| veo-3.1-fl | Portrait + Frame-to-Video | 720×1280 | $0.25 |
| veo-3.1-fast | Portrait + fast | 720×1280 | $0.15 |
| veo-3.1-fast-fl | Portrait + fast + Frame-to-Video | 720×1280 | $0.15 |
| veo-3.1-landscape | Landscape | 1280×720 | $0.25 |
| veo-3.1-landscape-fl | Landscape + Frame-to-Video | 1280×720 | $0.25 |
| veo-3.1-landscape-fast | Landscape + fast | 1280×720 | $0.15 |
| veo-3.1-landscape-fast-fl | Landscape + fast + Frame-to-Video | 1280×720 | $0.15 |

4K Series (Rolling Out)

4K HD variants are rolling out. Model variants will cover the same matrix (portrait / landscape × standard / fast × text-to-video / Frame-to-Video), with naming following the HD series convention. Per-clip pricing will be added to this table once finalized; enterprise customers with batch needs can contact sales for early access.
Billing notes:
  • Per-clip billing: Each 8-second video is a fixed unit price, independent of prompt length, reference images, or n (n=2 means billed for 2 clips)
  • Failures are free: Tasks ending in failed / content-policy rejection / gateway errors are not billed — retry safely
  • Top-up bonuses: See Top-Up Promotions
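
The per-clip rule above can be sketched as a small estimator. This is only an illustration: the prices are copied from the HD table, while `PRICE_PER_CLIP` and `estimate_cost` are hypothetical names, and the result is the charge you would see only if every requested clip succeeds (failed clips are not billed).

```python
# Per-clip prices in USD, copied from the HD pricing table above.
PRICE_PER_CLIP = {
    "veo-3.1": 0.25,
    "veo-3.1-fl": 0.25,
    "veo-3.1-fast": 0.15,
    "veo-3.1-fast-fl": 0.15,
    "veo-3.1-landscape": 0.25,
    "veo-3.1-landscape-fl": 0.25,
    "veo-3.1-landscape-fast": 0.15,
    "veo-3.1-landscape-fast-fl": 0.15,
}


def estimate_cost(model: str, n: int = 1) -> float:
    """Return the maximum charge in USD for one request, assuming all n clips succeed."""
    if model not in PRICE_PER_CLIP:
        raise ValueError(f"unknown HD model: {model}")
    if not 1 <= n <= 4:
        raise ValueError("n must be between 1 and 4")
    # Price does not depend on prompt length or reference images, only on clip count.
    return round(PRICE_PER_CLIP[model] * n, 2)
```

For example, `estimate_cost("veo-3.1-fast", 4)` gives the upper bound for a four-variant sync request on the fast tier.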

Technical Specs

| Dimension | Spec |
| --- | --- |
| Base model name | veo-3.1 (HD) / 4K series TBD |
| Variant axes | Orientation (portrait/landscape) × Speed (standard/fast) × Mode (text-only / Frame-to-Video -fl) |
| Video duration | Fixed 8 seconds (not adjustable) |
| HD resolutions | Portrait 720×1280, landscape 1280×720 |
| 4K resolutions | Rolling out, specs TBD |
| Audio track | ✅ Synchronized native audio |
| Frame-to-Video (-fl) | ✅ Models with -fl suffix; 1 image (start frame) or 2 images (start + end) |
| Sync generation time | -fast series 30–60 sec, standard series 1–2 min |
| Sync progress streaming | /v1/chat/completions + stream: true |
| Async polling | /v1/videos + task ID + /content download |
| n parameter | Sync mode max 4 per request (async mode recommended at 1) |
| Video URL TTL | 24 hours |

API Endpoints

| Endpoint | Method | Purpose | Content-Type |
| --- | --- | --- | --- |
| /v1/chat/completions | POST | Sync streaming generation (recommended for real-time UX) | application/json |
| /v1/videos | POST | Async task: submit text-to-video or Frame-to-Video | application/json or multipart/form-data |
| /v1/videos/{video_id} | GET | Async poll task status | |
| /v1/videos/{video_id}/content | GET | Async download video URL | |
Domain options: api.apiyi.com is the primary endpoint. vip.apiyi.com / b.apiyi.com are equivalent backup gateways with identical behavior.

Getting Started

Token Group

VEO 3.1 runs on APIYI’s default group, so no separate group switch or application is required. Just create a token under the default group from the console’s Token Management page; both Pay-as-you-go (priority) and Per-call billing modes work out of the box.

Online Playground: iCover AI

Want to try VEO 3.1 before writing any code? Use APIYI’s official video-generation testing site, iCover AI:
  • URL: icover.ai/zh/veo
  • How to use: paste a token from the default group (Pay-as-you-go or Per-call) — text-to-video and Frame-to-Video modes both work directly
  • Background: iCover AI is the official AI video-generation playground operated by APIYI. It shares the same backend with the production API, so what you see in the playground is exactly what production calls will return.
Use iCover AI to dial in your prompt first, then port the call into your code via Sync API or Async API.

Key Parameters

Model Variant Naming Rules

VEO 3.1 toggles capabilities via model name suffixes — not separate parameters:
| Suffix | Effect | Default (no suffix) |
| --- | --- | --- |
| -landscape | Landscape (1280×720) | Portrait (720×1280) |
| -fast | Fast tier (speed-first, lower price) | Standard tier |
| -fl | Frame-to-Video (requires uploaded image) | Pure text-to-video |
Combination examples:
  • veo-3.1 — Standard portrait text-to-video (default)
  • veo-3.1-landscape-fast — Fast landscape text-to-video (best value)
  • veo-3.1-landscape-fl — Standard landscape Frame-to-Video
  • veo-3.1-landscape-fast-fl — Fast landscape Frame-to-Video (cheapest image-to-video)
Notes:
  • -fl models require an input_reference image upload and return an error without one; pure text-to-video must not use the -fl suffix
  • Async Frame-to-Video requests must use multipart/form-data (not JSON); upload 1 image for a start frame, or 2 for start + end frames
  • Combining the 3 two-way axes (orientation × speed × mode) yields 8 HD model IDs in total; suffix order is fixed: -landscape, then -fast, then -fl
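
The naming rule can be expressed as a tiny helper. This is a minimal sketch, assuming the fixed suffix order described above; `build_model_id` and `ALL_HD_MODELS` are illustrative names, not part of any SDK.

```python
from itertools import product


def build_model_id(landscape: bool = False, fast: bool = False,
                   frame_to_video: bool = False) -> str:
    """Compose an HD model ID; suffixes are always appended in the same order."""
    parts = ["veo-3.1"]
    if landscape:
        parts.append("landscape")
    if fast:
        parts.append("fast")
    if frame_to_video:
        parts.append("fl")
    return "-".join(parts)


# Enumerate every combination of the three two-way axes (2 x 2 x 2 = 8 IDs).
ALL_HD_MODELS = [build_model_id(*flags) for flags in product([False, True], repeat=3)]
```

Building the ID from flags rather than concatenating strings by hand avoids invalid orderings such as a -fast suffix placed before -landscape.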

n (Number of Videos per Sync Request)

  • Range: 1 to 4, default 1
  • Only the sync mode (/v1/chat/completions) supports n; async mode ignores it
  • Billed per video (n=2 means billed for 2 clips)

Best Practices

1. Validate prompts with -fast first

Run each new prompt at veo-3.1-fast or veo-3.1-landscape-fast first ($0.15, 30–60 seconds), then switch to standard tier for the final asset.

2. Pick orientation by use case

  • Social-media short-form (TikTok, Reels) → portrait (no -landscape)
  • YouTube / ads / product demos → landscape (-landscape)

3. Sync vs async by need

  • Need live progress feedback to users → sync streaming (/v1/chat/completions + stream: true)
  • Background batch processing or long tasks → async task model (/v1/videos + polling)
  • Details: Sync API / Async API

4. Frame-to-Video prompts focus on "motion"

-fl models already define visuals (start frame or start+end frames). The prompt should focus on how the image animates: camera motion, object motion, lighting changes, character expressions. Example: "Camera slowly pushes in, leaves gently swaying, sunlight flickering through branches".

5. Frame-to-Video shines for "transitions"

The strongest Frame-to-Video use case is smooth transitions between two frames (day → night, season changes, expression shifts, object morphing). Describe the transition process and motion changes — no need to detail visuals.

6. Client timeout ≥ 2 minutes

Sync streaming holds the connection until generation completes (-fast ≈ 60 sec, standard ≈ 2 min) — set client timeout to 120 seconds minimum. Async POST submission is sub-second, but use 30 seconds as a baseline.

7. Download videos immediately

Video URLs expire in 24 hours. Production flows must download to your own OSS / CDN as soon as completed to avoid expired links.

8. Run multiple tasks via n or parallel POSTs

  • Same prompt, multiple variants → use n: 4 for 4 results in one call
  • Different prompts in batch → submit multiple async POSTs, each with an independent video_id, then poll independently

Error Codes & Retries

| Status | Meaning | Recommended Action |
| --- | --- | --- |
| 400 | Invalid parameters (model name doesn’t exist, -fl missing image, n out of range) | Validate parameters; Frame-to-Video must use multipart upload |
| 401 / invalid_api_key | Invalid API Key | Check Bearer Token; verify console group setting |
| 403 | Content-policy rejection | Adjust prompt; ensure reference images are non-sensitive |
| 429 / quota_exceeded | Rate limit / quota exceeded / insufficient balance | Exponential backoff; contact sales for higher quota |
| 5xx | Gateway / upstream error | Retry async tasks 1–2 times (no charge) |
| Task failed | Generation failed (mostly content policy or upstream capacity) | See “Content-policy errors” section below; adjust prompt and retry; failed task is not billed |
| video_not_found | video_id doesn’t exist or has expired | Verify ID; query within 24 hours |
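
The retry guidance in the table can be sketched as a single decision function. This is an illustration under assumptions: the 2-second base delay and the helper name `retry_delay` are not from the API, and a 429 caused by insufficient balance will not clear by waiting, so treat the backoff as applying to rate limits only.

```python
from typing import Optional


def retry_delay(status_code: int, attempt: int,
                base: float = 2.0, max_retries: int = 2) -> Optional[float]:
    """Return seconds to wait before the next retry, or None if retrying is pointless.

    attempt is the number of retries already made (0 for the first failure).
    """
    # 400 / 401 / 403 need a changed request, so only 429 and 5xx are retryable.
    retryable = status_code == 429 or 500 <= status_code <= 599
    if not retryable or attempt >= max_retries:
        return None
    # Exponential backoff: base, 2 * base, 4 * base, ...
    return base * (2 ** attempt)
```

A caller would sleep for the returned delay and resend the same request, stopping as soon as the function returns None.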

Content-policy errors (PUBLIC_ prefix)

Any failed task whose error.message / fail_reason starts with PUBLIC_ comes from upstream Google Flow’s official content policy — your prompt, reference image, or generated output triggered Google’s safety filter. It has nothing to do with the APIYI gateway, and these tasks are not billed, so you can safely retry after adjusting.
| Error code | Meaning |
| --- | --- |
| PUBLIC_ERROR_AUDIO_FILTERED | Audio track was filtered (sensitive utterances, certain dialog languages, copyrighted audio, etc.) |
| PUBLIC_ERROR_PROMINENT_PEOPLE_FILTER_FAILED | Hit the public-figure filter (prompt or reference image involves a real well-known person) |
| Other PUBLIC_ERROR_* | Same family: upstream content policy rejection; the field name itself indicates the trigger |
How to handle:
  1. Rewrite the prompt: remove personal names, brands, sensitive terms; if there’s spoken dialog, switch to a generic description (e.g., “the character speaks calmly”).
  2. Swap reference images: avoid using real people (especially celebrities) as start/end frames.
  3. Retry is free: these tasks are not billed, retry freely after adjusting.
Sample failed-task JSON:
{
  "task_id": "video_693742f8-45c9-4608-85e0-1c4b3dea97eb",
  "object": "task",
  "task_type": "sora2_video_generation",
  "model_name": "veo-3.1-fl",
  "platform": "openai",
  "status": "failed",
  "progress": "100%",
  "fail_reason": "PUBLIC_ERROR_AUDIO_FILTERED",
  "error": {
    "message": "PUBLIC_ERROR_AUDIO_FILTERED",
    "type": "task_failed"
  },
  "data": {
    "id": "video_693742f8-45c9-4608-85e0-1c4b3dea97eb",
    "object": "video",
    "model": "veo-3.1-fl",
    "size": "720x1280",
    "status": "failed",
    "error": {
      "code": "",
      "message": "PUBLIC_ERROR_AUDIO_FILTERED"
    }
  }
}
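
A client can separate content-policy rejections from other failures by checking the PUBLIC_ prefix. The sketch below reads only the fields shown in the sample JSON above (`status`, `fail_reason`, `error.message`); the function name `content_policy_code` is hypothetical.

```python
from typing import Optional


def content_policy_code(task: dict) -> Optional[str]:
    """Return the PUBLIC_* code if a failed task was rejected by upstream content policy."""
    if task.get("status") != "failed":
        return None
    candidates = [
        task.get("fail_reason"),
        (task.get("error") or {}).get("message"),
    ]
    for value in candidates:
        if isinstance(value, str) and value.startswith("PUBLIC_"):
            return value
    # Failed for another reason (for example upstream capacity).
    return None
```

When the function returns a code, adjust the prompt or reference image before retrying; when it returns None for a failed task, a plain retry is the more likely fix.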
Recommended client config:
  • Sync request timeout: 120 seconds baseline (standard tier); -fast can drop to 60 seconds
  • Async POST submission timeout: 30 seconds; GET polling interval 5–10 seconds, max wait 10 minutes
  • Exponential backoff retries on 5xx and failed tasks (recommend 2 retries)
  • Log the x-request-id response header for debugging
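
The polling settings above (5–10 second interval, 10 minute ceiling) can be sketched as a loop that is independent of the HTTP client. Assumptions to note: `fetch_status` is a hypothetical callable that wraps GET /v1/videos/{video_id} and returns the parsed JSON, and only the "failed" status is confirmed by the sample on this page; "completed" is an assumed success value that should be checked against real responses.

```python
import time
from typing import Callable

# "failed" appears in the sample task JSON; "completed" is an assumed success value.
TERMINAL_STATUSES = {"completed", "failed"}


def poll_until_done(fetch_status: Callable[[], dict],
                    interval: float = 5.0,
                    max_wait: float = 600.0,
                    sleep: Callable[[float], None] = time.sleep,
                    clock: Callable[[], float] = time.monotonic) -> dict:
    """Call fetch_status every `interval` seconds until the task reaches a terminal status."""
    deadline = clock() + max_wait
    while True:
        task = fetch_status()
        if task.get("status") in TERMINAL_STATUSES:
            return task
        # Give up if the next poll would land past the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"task still running after {max_wait} seconds")
        sleep(interval)
```

Passing `sleep` and `clock` as parameters keeps the loop easy to exercise in tests without real waiting.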

FAQ

Is this an official-relay channel or a reverse-engineered one?
Reverse-engineered. VEO 3.1 is delivered through APIYI’s transparent account pool to Google Flow. Pricing is dramatically lower than Google’s official Veo Studio rates, billing is per clip, and failed generations are not billed. There is currently no official-relay channel; once Google’s official Vertex AI Veo API becomes generally available, we’ll evaluate adding it and update this page accordingly.

How does VEO 3.1 compare with Sora 2?
| Dimension | VEO 3.1 | Sora 2 (Official) |
| --- | --- | --- |
| Price | $0.15–$0.25 / 8 sec (per clip) | $0.40–$8.40 / 4–12 sec (per second) |
| Duration | Fixed 8 sec | 4 / 8 / 12 sec |
| Generation time | 30 sec – 2 min | 3–10 min |
| Audio | ✅ Native sync | ✅ Native sync |
| Frame-to-Video | -fl series | input_reference single image |
| Stability | Reverse-engineered, subject to risk control | Official 99.99% |
| Resolution | 720p (4K rolling out) | 720p / 1024p / 1080p |

Pick VEO for fast, cheap, batch use cases; pick Sora 2 Pro for highest quality and stability. See the Sora 2 Overview.

Can I generate videos longer than 8 seconds?
Not directly. The Google Flow upstream only exposes a fixed 8-second duration, and there is currently no parameter to adjust length. For longer videos, chain Frame-to-Video clips: generate multiple 8-second segments with -fl models, using each clip’s end frame as the next clip’s start frame, then stitch them with ffmpeg.
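
The chaining approach needs two ffmpeg steps: grabbing the last frame of each clip, and concatenating the finished clips. The sketch below only builds the command lines; the helper names are hypothetical, the flags are standard ffmpeg options, and stream-copy concatenation assumes every clip shares the same codec and resolution (likely when all clips come from the same model, but not confirmed here).

```python
from typing import List


def last_frame_cmd(clip_path: str, frame_path: str) -> List[str]:
    """ffmpeg command that saves the final frame of a clip as an image.

    -sseof -1 starts decoding one second before the end; -update 1 keeps
    overwriting the output image, so the last decoded frame is what remains.
    """
    return ["ffmpeg", "-y", "-sseof", "-1", "-i", clip_path,
            "-update", "1", "-q:v", "2", frame_path]


def concat_list(clip_paths: List[str]) -> str:
    """Text for ffmpeg's concat demuxer list file (paths must not contain single quotes)."""
    return "".join(f"file '{path}'\n" for path in clip_paths)


def concat_cmd(list_path: str, output_path: str) -> List[str]:
    """ffmpeg command that joins the listed clips without re-encoding."""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", list_path, "-c", "copy", output_path]
```

Each command list can be passed to `subprocess.run(cmd, check=True)`; write the output of `concat_list` to the list file before running `concat_cmd`.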

Should I use the standard or the fast tier?
  • Highest quality / hero assets → standard (veo-3.1 / veo-3.1-landscape), $0.25/clip
  • Volume / experimentation / internal preview → fast (-fast suffix), $0.15/clip, faster
  • Quality difference between fast and standard is small — fast tier is sufficient for most production use cases

How do I use Frame-to-Video (-fl) models?
The -fl series requires an input_reference image upload:
  • 1 image → start-frame mode: image becomes the video’s opening, AI generates subsequent frames
  • 2 images → start + end mode: first image opens, second image closes, AI generates the transition
Must use multipart/form-data (not JSON). See Async API - Frame-to-Video.

Are failed generations billed?
No. VEO 3.1 bills by successful results: tasks that end in failed, content-policy rejections, gateway 5xx errors, and parameter errors are all not billed. Only videos that actually complete (with a returned URL) are billed.

How long do video URLs stay valid?
24 hours. Download to your own OSS / CDN immediately after generation completes to avoid losing access.

What does the sync streaming response look like?
/v1/chat/completions + stream: true returns SSE format with progress text in each chunk:
data: {"choices":[{"delta":{"content":"> 🏃 Progress: 45.0%\n\n"}}]}
...
data: {"choices":[{"delta":{"content":"> ✅ Video 1 complete, [click here](https://.../xxx.mp4) to view~~~\n\n"}}]}
data: [DONE]
Frontend just needs to parse “progress” and the video URL out of delta.content. Full example in Sync API.
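
Extracting progress and the video URL from those lines might look like the sketch below. It assumes the chunk text keeps the exact shapes shown above ("Progress: NN%" and a markdown link); `parse_sse_line` and both regular expressions are illustrative and may need adjusting if the wording of the fragments changes.

```python
import json
import re

PROGRESS_RE = re.compile(r"Progress:\s*([\d.]+)%")
URL_RE = re.compile(r"\((https?://[^)\s]+)\)")


def parse_sse_line(line: str) -> dict:
    """Parse one SSE line into {'done': bool, 'progress': float or None, 'url': str or None}."""
    result = {"done": False, "progress": None, "url": None}
    if not line.startswith("data:"):
        return result  # blank keep-alive lines and comments carry no payload
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        result["done"] = True
        return result
    chunk = json.loads(payload)
    choices = chunk.get("choices") or []
    content = ""
    if choices:
        content = (choices[0].get("delta") or {}).get("content") or ""
    progress = PROGRESS_RE.search(content)
    if progress:
        result["progress"] = float(progress.group(1))
    url = URL_RE.search(content)
    if url:
        result["url"] = url.group(1)
    return result
```

Feed each line of the response stream through the function and update the progress bar whenever `progress` is not None; the `url` field is filled once a completion fragment arrives.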

What reference image formats does Frame-to-Video accept?
-fl models accept jpeg / png for input_reference, recommended size ≤ 5 MB per image. There is no strict resolution requirement (unlike Sora 2), but the image aspect ratio should match the target video orientation: portrait video → portrait image, landscape → landscape; otherwise the AI will auto-crop.

Can I call VEO 3.1 with the OpenAI SDK?
Yes. Sync mode is fully OpenAI Chat Completions-compatible:
from openai import OpenAI
client = OpenAI(api_key="sk-your-key", base_url="https://api.apiyi.com/v1")
resp = client.chat.completions.create(
    model="veo-3.1-fast",
    messages=[{"role": "user", "content": "A cat flying in the sky"}],
    stream=True,
    n=1
)
for chunk in resp:
    print(chunk.choices[0].delta.content or "", end="")
Async mode also works via client.videos.create(), but Frame-to-Video must use raw requests for multi-file upload (the OpenAI SDK only handles single-file uploads natively).
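
A raw multipart upload for Frame-to-Video could be sketched as follows. Only the endpoint, the `input_reference` field name, the Bearer token, and the 30-second submission timeout come from this page; the `model` and `prompt` form fields, the repeated `input_reference` field for two images, and both helper names are assumptions that should be checked against the Async API page. The third-party `requests` package is required for the actual call.

```python
import mimetypes
from pathlib import Path
from typing import List

API_URL = "https://api.apiyi.com/v1/videos"


def build_frame_files(image_paths: List[str]) -> list:
    """Build the `files` argument: 1 image = start frame, 2 images = start + end frames."""
    if len(image_paths) not in (1, 2):
        raise ValueError("Frame-to-Video takes 1 or 2 reference images")
    files = []
    for path in map(Path, image_paths):
        mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        # Assumption: each image is sent under the same input_reference field name.
        files.append(("input_reference", (path.name, path.read_bytes(), mime)))
    return files


def submit_frame_to_video(api_key: str, model: str, prompt: str,
                          image_paths: List[str], timeout: float = 30.0) -> dict:
    """POST a Frame-to-Video task as multipart/form-data and return the parsed JSON."""
    import requests  # third-party; imported here so the builder above has no dependencies

    if not model.endswith("-fl"):
        raise ValueError("Frame-to-Video requires a model with the -fl suffix")
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        data={"model": model, "prompt": prompt},  # assumed form field names
        files=build_frame_files(image_paths),
        timeout=timeout,
    )
    response.raise_for_status()
    return response.json()
```

Letting `requests` build the multipart body (rather than setting Content-Type by hand) ensures the boundary header is generated correctly.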

Can I run multiple tasks concurrently?
Yes. Each POST /v1/videos returns an independent video_id. Submit and poll in parallel. Default quota covers most business needs; for enterprise batch use cases (>10 concurrent, >100 clips per day), contact sales for a dedicated resource pool.

Can I cancel a submitted task?
No. There’s currently no cancel endpoint; once submitted, a task runs to completion. Validate prompts at -fast first to avoid wasting standard-tier runs.

Can I generate video without audio?
Not currently. VEO 3.1 outputs synchronized audio by default and Google does not expose a parameter to disable it. For audio-free output, strip the track with ffmpeg after download: ffmpeg -i input.mp4 -an output.mp4.

When will the 4K series be available?
The 4K series is in gradual rollout, with model variants following the HD naming convention (covering portrait / landscape × fast / standard × Frame-to-Video). Final per-clip pricing will be reflected in the pricing table above once confirmed; enterprise customers with batch needs can contact sales for early access.

Related resources:
  • Sync API: /v1/chat/completions + stream: true live streaming, text-to-video + Frame-to-Video samples
  • Async API: /v1/videos three-step async flow, Frame-to-Video upload, full Python client example
  • Sora 2 Video Generation: OpenAI official-relay channel comparison
  • Top-Up Promotions: Bonus tiers and applicable channels
  • API Manual: General request, timeout, and retry guidance
  • Google official Veo introduction: deepmind.google/technologies/veo/
VEO 3.1 on APIYI is delivered through a Google Flow reverse-engineered channel for high-value-for-money video generation — leading speed and dramatically lower pricing than official. Two call modes (sync streaming, async task) accommodate different scenarios and integrate seamlessly with your existing OpenAI SDK / engineering code. Open a ticket from your console for any feedback.