Documentation Index
Fetch the complete documentation index at: https://docs.apiyi.com/llms.txt
Use this file to discover all available pages before exploring further.
Overview
VEO 3.1 is Google’s flagship AI video generation series, producing video with synchronized audio natively — fixed 8-second clips from text prompts or reference images. APIYI exposes VEO 3.1 through a reverse-engineered channel that proxies Google Flow, billed per-clip with both synchronous streaming and async task modes.Sync API
POST /v1/chat/completions, reuses the OpenAI Chat Completions protocol with stream: true for live progress.Async API
POST /v1/videos three-step async flow, supports text-to-video and Frame-to-Video uploads — built for batch management.Why APIYI’s VEO 3.1?
VEO 3.1 is delivered through a reverse-engineered channel (transparent proxy to Google Flow), optimized for production scenarios across price, integration friction, and feature completeness:Price Killer · Far Below Official Pricing
Unlimited Concurrency · Production Scale
Same Per-Clip Pricing + Top-Up Bonuses
Global Zero-Friction Access
api.apiyi.com directly from Mainland China data centers, residential networks, or overseas nodes. Skip the Google Flow cross-border setup entirely.OpenAI-Compatible · Dual-Mode Access
/v1/chat/completions (same as chat models); async uses /v1/videos (OpenAI Video API style). Both protocols drop into your existing SDK / engineering code with zero changes.Professional Support · Enterprise Onboarding
Key Features
Native Synchronized Audio
Generation Speed Leader
-fast series in 30–60 seconds, standard series in 1–2 minutes — 50% faster than Sora 2, ideal for high-throughput content production.Frame-to-Video Creative Mode
-fl suffix models accept 1 reference image (start frame) or 2 (start + end frames) to animate static visuals or generate seamless transitions between two frames.Portrait / Landscape Switching
-landscape model suffix.Live Streaming Progress
/v1/chat/completions + stream: true) returns real-time > 🏃 Progress: XX% text fragments — your frontend can render a progress bar directly.Async Task Model
video_id for independent polling and download — ideal for batch management, resume-on-failure, and long-running background jobs.Pay on Success
Multi-Video Parallel (n parameter)
n parameter generates up to 4 different videos per request (same prompt, multiple results) for variety selection.Pricing
Billed per clip (each clip is a fixed 8-second video). Only successfully generated videos are billed — failed tasks are free.HD Series (720p, Live)
| Model | Description | Resolution | Price |
|---|---|---|---|
veo-3.1 | Default portrait | 720×1280 | $0.25 |
veo-3.1-fl | Portrait + Frame-to-Video | 720×1280 | $0.25 |
veo-3.1-fast | Portrait + fast | 720×1280 | $0.15 |
veo-3.1-fast-fl | Portrait + fast + Frame-to-Video | 720×1280 | $0.15 |
veo-3.1-landscape | Landscape | 1280×720 | $0.25 |
veo-3.1-landscape-fl | Landscape + Frame-to-Video | 1280×720 | $0.25 |
veo-3.1-landscape-fast | Landscape + fast | 1280×720 | $0.15 |
veo-3.1-landscape-fast-fl | Landscape + fast + Frame-to-Video | 1280×720 | $0.15 |
4K Series (Rolling Out)
- Per-clip billing: Each 8-second video is a fixed unit price, independent of prompt length, reference images, or
n(n=2 means billed for 2 clips) - Failures are free: Tasks ending in
failed/ content-policy rejection / gateway errors are not billed — retry safely - Top-up bonuses: See Top-Up Promotions
Technical Specs
| Dimension | Spec |
|---|---|
| Base model name | veo-3.1 (HD) / 4K series TBD |
| Variant axes | Orientation (portrait/landscape) × Speed (standard/fast) × Mode (text-only / Frame-to-Video -fl) |
| Video duration | Fixed 8 seconds (not adjustable) |
| HD resolutions | Portrait 720×1280, landscape 1280×720 |
| 4K resolutions | Rolling out, specs TBD |
| Audio track | ✅ Synchronized native audio |
| Frame-to-Video (-fl) | ✅ Models with -fl suffix; 1 image (start frame) or 2 images (start + end) |
| Sync generation time | -fast series 30–60 sec, standard series 1–2 min |
| Sync progress streaming | ✅ /v1/chat/completions + stream: true |
| Async polling | ✅ /v1/videos + task ID + /content download |
n parameter | Sync mode max 4 per request (async mode recommended at 1) |
| Video URL TTL | 24 hours |
API Endpoints
| Endpoint | Method | Purpose | Content-Type |
|---|---|---|---|
/v1/chat/completions | POST | Sync streaming generation (recommended for real-time UX) | application/json |
/v1/videos | POST | Async task: submit text-to-video or Frame-to-Video | application/json or multipart/form-data |
/v1/videos/{video_id} | GET | Async poll task status | — |
/v1/videos/{video_id}/content | GET | Async download video URL | — |
Getting Started
Token Group
VEO 3.1 runs on APIYI’sdefault group — no separate group switch or application required. Just create a token under the default group from the console’s Token Management page; both Pay-as-you-go (priority) and Per-call billing modes work out of the box.
Online Playground: iCover AI
Want to try VEO 3.1 before writing any code? Use APIYI’s official video-generation testing site, iCover AI:- URL:
icover.ai/zh/veo - How to use: paste a token from the default group (Pay-as-you-go or Per-call) — text-to-video and Frame-to-Video modes both work directly
- Background: iCover AI is the official AI video-generation playground operated by APIYI. It shares the same backend with the production API, so what you see in the playground is exactly what production calls will return.
Key Parameters
Model Variant Naming Rules
VEO 3.1 toggles capabilities via model name suffixes — not separate parameters:| Suffix | Effect | Default (no suffix) |
|---|---|---|
-landscape | Landscape (1280×720) | Portrait (720×1280) |
-fast | Fast tier (speed-first, lower price) | Standard tier |
-fl | Frame-to-Video (requires uploaded image) | Pure text-to-video |
veo-3.1— Standard portrait text-to-video (default)veo-3.1-landscape-fast— Fast landscape text-to-video (best value)veo-3.1-landscape-fl— Standard landscape Frame-to-Videoveo-3.1-landscape-fast-fl— Fast landscape Frame-to-Video (cheapest image-to-video)
n (Number of Videos per Sync Request)
- Range:
1to4, default1 - Only the sync mode (
/v1/chat/completions) supportsn; async mode ignores it - Billed per video (n=2 means billed for 2 clips)
Best Practices
Validate prompts with -fast first
veo-3.1-fast or veo-3.1-landscape-fast first ($0.15, 30–60 seconds), then switch to standard tier for the final asset.Pick orientation by use case
- Social-media short-form (TikTok, Reels) → portrait (no
-landscape) - YouTube / ads / product demos → landscape (
-landscape)
Frame-to-Video prompts focus on "motion"
-fl models already define visuals (start frame or start+end frames). The prompt should focus on how the image animates: camera motion, object motion, lighting changes, character expressions. Example: "Camera slowly pushes in, leaves gently swaying, sunlight flickering through branches".Frame-to-Video shines for "transitions"
Client timeout ≥ 2 minutes
-fast ≈ 60 sec, standard ≈ 2 min) — set client timeout to 120 seconds minimum. Async POST submission is sub-second, but use 30 seconds as a baseline.Download videos immediately
completed to avoid expired links.Error Codes & Retries
| Status | Meaning | Recommended Action |
|---|---|---|
400 | Invalid parameters (model name doesn’t exist, -fl missing image, n out of range) | Validate parameters; Frame-to-Video must use multipart upload |
401 / invalid_api_key | Invalid API Key | Check Bearer Token; verify console group setting |
403 | Content-policy rejection | Adjust prompt; ensure reference images are non-sensitive |
429 / quota_exceeded | Rate limit / quota exceeded / insufficient balance | Exponential backoff; contact sales for higher quota |
5xx | Gateway / upstream error | Retry async tasks 1–2 times (no charge) |
Task failed | Generation failed (mostly content policy or upstream capacity) | See “Content-policy errors” section below; adjust prompt and retry; failed task is not billed |
video_not_found | video_id doesn’t exist or has expired | Verify ID; query within 24 hours |
Content-policy errors (PUBLIC_ prefix)
Any failed task whose error.message / fail_reason starts with PUBLIC_ comes from upstream Google Flow’s official content policy — your prompt, reference image, or generated output triggered Google’s safety filter. It has nothing to do with the APIYI gateway, and these tasks are not billed, so you can safely retry after adjusting.
| Error code | Meaning |
|---|---|
PUBLIC_ERROR_AUDIO_FILTERED | Audio track was filtered (sensitive utterances, certain dialog languages, copyrighted audio, etc.) |
PUBLIC_ERROR_PROMINENT_PEOPLE_FILTER_FAILED | Hit the public-figure filter (prompt or reference image involves a real well-known person) |
Other PUBLIC_ERROR_* | Same family — upstream content policy rejection; the field name itself indicates the trigger |
- Rewrite the prompt: remove personal names, brands, sensitive terms; if there’s spoken dialog, switch to a generic description (e.g., “the character speaks calmly”).
- Swap reference images: avoid using real people (especially celebrities) as start/end frames.
- Retry is free: these tasks are not billed, retry freely after adjusting.
- Sync request timeout: 120 seconds baseline (standard tier);
-fastcan drop to 60 seconds - Async POST submission timeout: 30 seconds; GET polling interval 5–10 seconds, max wait 10 minutes
- Exponential backoff retries on 5xx and
failedtasks (recommend 2 retries) - Log the
x-request-idresponse header for debugging
FAQ
Is VEO 3.1 official-relay or reverse-engineered? Is an official channel available?
Is VEO 3.1 official-relay or reverse-engineered? Is an official channel available?
VEO 3.1 vs Sora 2 — which should I choose?
VEO 3.1 vs Sora 2 — which should I choose?
| Dimension | VEO 3.1 | Sora 2 (Official) |
|---|---|---|
| Price | $0.15–$0.25 / 8 sec (per clip) | $0.40–$8.40 / 4–12 sec (per second) |
| Duration | Fixed 8 sec | 4 / 8 / 12 sec |
| Generation time | 30 sec – 2 min | 3–10 min |
| Audio | ✅ Native sync | ✅ Native sync |
| Frame-to-Video | ✅ -fl series | ✅ input_reference single image |
| Stability | Reverse-engineered, subject to risk control | Official 99.99% |
| Resolution | 720p (4K rolling out) | 720p / 1024p / 1080p |
Why is the video duration fixed at 8 seconds? Can I extend it?
Why is the video duration fixed at 8 seconds? Can I extend it?
-fl models using each clip’s end frame as the next clip’s start frame, then stitch with ffmpeg.How do I choose between standard and -fast?
How do I choose between standard and -fast?
- Highest quality / hero assets → standard (
veo-3.1/veo-3.1-landscape), $0.25/clip - Volume / experimentation / internal preview → fast (
-fastsuffix), $0.15/clip, faster - Quality difference between fast and standard is small — fast tier is sufficient for most production use cases
How do Frame-to-Video (-fl) models work?
How do Frame-to-Video (-fl) models work?
-fl series requires input_reference image upload:- 1 image → start-frame mode: image becomes the video’s opening, AI generates subsequent frames
- 2 images → start + end mode: first image opens, second image closes, AI generates the transition
Are failed generations billed?
Are failed generations billed?
failed, content-policy rejections, gateway 5xx errors, and parameter errors are all not billed. Only videos that actually complete (with a returned URL) are billed.How long are video URLs valid?
How long are video URLs valid?
How do I read progress in sync streaming mode?
How do I read progress in sync streaming mode?
/v1/chat/completions + stream: true returns SSE format with progress text in each chunk:delta.content. Full example in Sync API.Which image formats are supported? Reference image size limits?
Which image formats are supported? Reference image size limits?
-fl models accept jpeg / png for input_reference, recommended size ≤ 5 MB per image. No strict resolution requirement (unlike Sora 2), but the image aspect ratio should match the target video orientation: portrait video → portrait image, landscape → landscape; otherwise the AI will auto-crop.Can I use the official OpenAI SDK?
Can I use the official OpenAI SDK?
client.videos.create(), but Frame-to-Video must use raw requests for multi-file upload (the OpenAI SDK only handles single-file uploads natively).Can I run multiple tasks in parallel? What are the rate limits?
Can I run multiple tasks in parallel? What are the rate limits?
/v1/videos returns an independent video_id. Submit and poll in parallel. Default quota covers most business needs; for enterprise batch use cases (>10 concurrent, >100 clips per day), contact sales for a dedicated resource pool.Can I cancel a running task?
Can I cancel a running task?
-fast first to avoid wasting standard-tier runs.Can I disable the audio track?
Can I disable the audio track?
ffmpeg -i input.mp4 -an output.mp4.When does the 4K version launch? What's the price?
When does the 4K version launch? What's the price?
Related Docs
- Sync API —
/v1/chat/completions+stream: truelive streaming, text-to-video + Frame-to-Video samples - Async API —
/v1/videosthree-step async flow, Frame-to-Video upload, full Python client example - Sora 2 Video Generation — OpenAI official-relay channel comparison
- Top-Up Promotions — Bonus tiers and applicable channels
- API Manual — General request, timeout, and retry guidance
- Google official Veo introduction:
deepmind.google/technologies/veo/