Videos
POST /api/v1/tasks — submit a video generation task.
Authorization: Bearer sk-app-…
Video generation is always async. Submit a job, receive a task_id, then either poll GET /api/v1/tasks/{task_id} or set webhook_url to be notified on completion.
How async tasks work
POST returns immediately with a task_id and status "QUEUED". The status moves to "RUNNING" once a worker picks the task up, then ends in one of the terminal states "SUCCESS", "FAILED" or "EXPIRED". Webhooks fire on terminal states.
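The lifecycle above maps naturally onto a small polling helper. The following is a minimal sketch, not an official client: `fetch_task` is a hypothetical callable that you supply (for instance a wrapper around GET /api/v1/tasks/{task_id}), and the interval and timeout defaults are arbitrary assumptions.

```python
import time
from typing import Callable, Dict

# Terminal states as described above; any other status means "keep waiting".
TERMINAL_STATES = {"SUCCESS", "FAILED", "EXPIRED"}


def is_terminal(status: str) -> bool:
    """Return True once a task can no longer change state."""
    return status.upper() in TERMINAL_STATES


def wait_for_task(
    fetch_task: Callable[[str], Dict],
    task_id: str,
    interval_s: float = 5.0,      # assumed default, tune for your workload
    timeout_s: float = 600.0,     # assumed default, tune for your workload
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call fetch_task(task_id) repeatedly until the task is terminal."""
    deadline = time.monotonic() + timeout_s
    while True:
        task = fetch_task(task_id)
        if is_terminal(task["status"]):
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"task {task_id} still {task['status']} after {timeout_s}s"
            )
        sleep(interval_s)
```

Passing the fetch function in keeps the loop independent of any HTTP library and makes it easy to exercise with canned responses.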
Pick a video model
Video models differ substantially: Veo fixes duration to 8s but supports audio tracks; Sora and Kling do 5/10s; Runway needs an image for I2V; Hailuo accepts camera directives inline in the prompt. The matrix below summarises; the per-model section spells out every parameter.
| Model ID | Model | Spec | Capabilities | Top params |
|---|---|---|---|---|
| veo-3.1-quality | Veo 3.1 Quality: Google Veo 3.1, quality tier. 8s fixed length. Accepts an input image as URL or base64 data URI. | ≤ 8s | Audio | prompt, aspect_ratio, audio |
| veo-3.1-fast | Veo 3.1 Fast: Veo 3.1 fast tier, lower price, slightly less detail. | ≤ 8s | Audio | prompt, aspect_ratio, audio |
| sora2 | Sora 2: OpenAI Sora 2, text-to-video, 5s or 10s. | ≤ 10s | — | prompt, duration, aspect_ratio |
| runway-gen3-alpha-turbo | Runway Gen-3 Alpha Turbo: image-to-video, 5s or 10s. | ≤ 10s | — | prompt, image_url, duration |
| runway-gen4 | Runway Gen-4: image- or text-to-video. | ≤ 10s | — | prompt, image_url, duration |
| runway-aleph | Runway Aleph: edit existing video clips with text. | — | — | prompt, video_url |
| kling-v21-master-i2v | Kling v2.1 Master (I2V): Kling 2.1 master image-to-video. | ≤ 10s | — | prompt, image_url, duration |
| kling-v21-master-t2v | Kling v2.1 Master (T2V): Kling 2.1 master text-to-video. | ≤ 10s | — | prompt, duration |
| kling-v25-i2v-pro | Kling v2.5 Pro (I2V): Kling 2.5 pro image-to-video. | ≤ 10s | — | prompt, image_url, duration |
| kling-v25-t2v-pro | Kling v2.5 Pro (T2V): Kling 2.5 pro text-to-video. | ≤ 10s | — | prompt, duration |
| kling-avatar-std | Kling Avatar (Std): Kling avatar standard, lip-sync animated avatar. | — | — | prompt, image_url, audio_url |
| kling-avatar-pro | Kling Avatar (Pro): Kling avatar pro, higher fidelity. | — | — | prompt, image_url, audio_url |
| hailuo-02 | MiniMax Hailuo 02: supports camera directives embedded in the prompt. | ≤ 10s | — | prompt, image_url, duration |
| grok-imagine-t2v | Grok Imagine (T2V): xAI Grok Imagine, text-to-video. | — | — | prompt, duration |
| grok-imagine-i2v | Grok Imagine (I2V): xAI Grok Imagine, image-to-video. | — | — | prompt, image_url |
| topaz-upscale | Topaz Video Upscale: Topaz video enhance / upscale to 4K. | — | — | video_url, target_resolution |
| infinitalk-audio | InfiniTalk (Audio-driven): audio-driven talking avatar. Takes a driver audio and a still character image. | — | — | driver_audio_url, driver_image_url |
Request body (shared)
| Field | Type | Default | Description |
|---|---|---|---|
| model (required) | string | — | Model ID from the model matrix, e.g. "veo-3.1-quality", "veo-3.1-fast", "sora2", "kling-v25-t2v-pro", "hailuo-02". |
| prompt (required) | string | — | Text prompt describing the video. |
| duration | int | — | Length in seconds. Allowed values are per-model (for example 5 or 10 for Sora 2, fixed at 8 for Veo 3.1); see the per-model parameters. |
| resolution | string | "720p" | "480p", "720p" or "1080p" where supported. |
| aspect_ratio | string | "16:9" | "16:9", "9:16" or "1:1". |
| image | string | — | Optional starting image URL or base64 data URI for image-to-video. Veo 3.1 takes this field as image; most other image-to-video models use image_url instead (see the per-model parameters). |
| webhook_url | string | — | POST target for task-complete events. See /docs/webhooks. |
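For readers who prefer a script to curl, the shared body can be assembled as follows. This is a sketch under stated assumptions: the host is taken from the curl examples in this document, the function names `build_submit_request` and `submit` are illustrative, and only the Python standard library is used.

```python
import json
import urllib.request

# Host taken from the curl examples in this document.
API_URL = "https://orux.top/api/v1/tasks"


def build_submit_request(api_key: str, model: str, prompt: str, **optional) -> urllib.request.Request:
    """Build (but do not send) the POST request for a video task."""
    body = {"model": model, "prompt": prompt}
    # Drop unset optional fields so only intended keys reach the API.
    body.update({k: v for k, v in optional.items() if v is not None})
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(request: urllib.request.Request, timeout_s: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout_s) as resp:
        return json.load(resp)
```

Separating construction from sending lets you inspect or log the exact payload before any network call is made.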
Response
| Field | Type | Default | Description |
|---|---|---|---|
| task_id | string | — | Use with GET /api/v1/tasks/{task_id} or webhooks. |
| status | string | — | QUEUED on submit; RUNNING, then SUCCESS, FAILED or EXPIRED later. |
| submit_time | string | — | ISO-8601 timestamp of submission. |
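The three response fields can be turned into a typed record, which is convenient when storing task receipts. The sketch below assumes the timestamp has the shape shown in this document's example (no trailing "Z"); `TaskReceipt` and `parse_receipt` are illustrative names, not part of the API.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TaskReceipt:
    task_id: str
    status: str
    submit_time: datetime


def parse_receipt(payload: dict) -> TaskReceipt:
    """Convert the submit response into a typed record."""
    return TaskReceipt(
        task_id=payload["task_id"],
        # Normalise case so comparisons against "QUEUED" etc. are reliable.
        status=payload["status"].upper(),
        # fromisoformat handles "2026-06-04T22:00:00"; a trailing "Z"
        # is only accepted from Python 3.11 onward.
        submit_time=datetime.fromisoformat(payload["submit_time"]),
    )
```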
Per-model parameters
Pass only the fields listed in the model's table. Orux AI rejects unknown fields with invalid_param. webhook_url is supported by all video models and is strongly preferred over polling in production.
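Because unknown fields are rejected, it can help to check a payload locally before submitting it. The sketch below copies the field lists for a few models from the per-model tables in this document; the helper name `check_params` and the message wording are assumptions, and the server remains the source of truth.

```python
# Field lists copied from the per-model tables in this document
# (only a subset of models is shown).
ALLOWED_FIELDS = {
    "sora2": {"prompt", "duration", "aspect_ratio", "webhook_url"},
    "runway-gen3-alpha-turbo": {"prompt", "image_url", "duration", "camera_motion", "webhook_url"},
    "hailuo-02": {"prompt", "image_url", "duration", "webhook_url"},
    "kling-avatar-std": {"prompt", "image_url", "audio_url", "webhook_url"},
}
REQUIRED_FIELDS = {
    "sora2": {"prompt"},
    "runway-gen3-alpha-turbo": {"prompt", "image_url"},
    "hailuo-02": {"prompt"},
    "kling-avatar-std": {"prompt", "image_url", "audio_url"},
}


def check_params(model: str, params: dict) -> list:
    """Return a list of problems; an empty list means the payload looks valid."""
    if model not in ALLOWED_FIELDS:
        return [f"no local field table for model {model!r}"]
    problems = []
    for name in sorted(set(params) - ALLOWED_FIELDS[model]):
        problems.append(f"unknown field: {name}")
    for name in sorted(REQUIRED_FIELDS[model] - set(params)):
        problems.append(f"missing required field: {name}")
    return problems
```

Here `params` holds everything except `model` itself, so the check mirrors the per-model tables one to one.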
Veo 3.1 Quality: veo-3.1-quality
Google Veo 3.1, quality tier. 8s fixed length. Accepts an input image as URL or base64 data URI.
- duration is fixed at 8 seconds; passing any other value is rejected with invalid_param.
| Field | Type | Default | Description |
|---|---|---|---|
| prompt (required) | string | — | Text prompt describing the desired output. |
| aspect_ratio | enum | 16:9 | Output aspect ratio. One of 16:9, 9:16, 1:1. |
| audio | boolean | true | Include the model-generated audio track. |
| image | string | — | Optional starting image. Accepts an https URL or a base64 data URI ("data:image/png;base64,..."). |
| webhook_url | url | — | HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional; fall back to polling if unset. |
Veo 3.1 Fast: veo-3.1-fast
Veo 3.1 fast tier: lower price, slightly less detail.
| Field | Type | Default | Description |
|---|---|---|---|
| prompt (required) | string | — | Text prompt describing the desired output. |
| aspect_ratio | enum | 16:9 | Aspect ratio. One of 16:9, 9:16, 1:1. |
| audio | boolean | true | Include the audio track. |
| image | string | — | URL or base64 data URI as starting frame. |
| webhook_url | url | — | HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional; fall back to polling if unset. |
Sora 2: sora2
OpenAI Sora 2: text-to-video, 5s or 10s.
| Field | Type | Default | Description |
|---|---|---|---|
| prompt (required) | string | — | Text prompt describing the desired output. |
| duration | enum | 5 | Length in seconds. One of 5, 10. |
| aspect_ratio | enum | 16:9 | Aspect ratio. One of 16:9, 9:16. |
| webhook_url | url | — | HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional; fall back to polling if unset. |
Runway Gen-3 Alpha Turbo: runway-gen3-alpha-turbo
Runway Gen-3 Alpha Turbo: image-to-video, 5s or 10s.
| Field | Type | Default | Description |
|---|---|---|---|
| prompt (required) | string | — | Text prompt describing the desired output. |
| image_url (required) | url | — | Starting frame (image-to-video). |
| duration | enum | 5 | Length in seconds. One of 5, 10. |
| camera_motion | string | — | Optional camera-motion hint, e.g. "zoom_in", "pan_left". |
| webhook_url | url | — | HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional; fall back to polling if unset. |
Runway Gen-4: runway-gen4
Runway Gen-4: image- or text-to-video.
| Field | Type | Default | Description |
|---|---|---|---|
| prompt (required) | string | — | Text prompt describing the desired output. |
| image_url | url | — | Optional starting frame. |
| duration | enum | 5 | Length in seconds. One of 5, 10. |
| webhook_url | url | — | HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional; fall back to polling if unset. |
Runway Aleph: runway-aleph
Runway Aleph: edit existing video clips with text.
| Field | Type | Default | Description |
|---|---|---|---|
| prompt (required) | string | — | Text prompt describing the desired output. |
| video_url (required) | url | — | Source video to edit. |
| webhook_url | url | — | HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional; fall back to polling if unset. |
Kling v2.1 Master (I2V): kling-v21-master-i2v
Kling 2.1 master image-to-video.
| Field | Type | Default | Description |
|---|---|---|---|
| prompt (required) | string | — | Text prompt describing the desired output. |
| image_url (required) | url | — | Starting frame. |
| duration | enum | 5 | Length in seconds. One of 5, 10. |
| webhook_url | url | — | HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional; fall back to polling if unset. |
Kling v2.1 Master (T2V): kling-v21-master-t2v
Kling 2.1 master text-to-video.
| Field | Type | Default | Description |
|---|---|---|---|
| prompt (required) | string | — | Text prompt describing the desired output. |
| duration | enum | 5 | Length in seconds. One of 5, 10. |
| webhook_url | url | — | HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional; fall back to polling if unset. |
Kling v2.5 Pro (I2V): kling-v25-i2v-pro
Kling 2.5 pro image-to-video.
| Field | Type | Default | Description |
|---|---|---|---|
| prompt (required) | string | — | Text prompt describing the desired output. |
| image_url (required) | url | — | Starting frame. |
| duration | enum | 5 | Length in seconds. One of 5, 10. |
| webhook_url | url | — | HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional; fall back to polling if unset. |
Kling v2.5 Pro (T2V): kling-v25-t2v-pro
Kling 2.5 pro text-to-video.
| Field | Type | Default | Description |
|---|---|---|---|
| prompt (required) | string | — | Text prompt describing the desired output. |
| duration | enum | 5 | Length in seconds. One of 5, 10. |
| webhook_url | url | — | HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional; fall back to polling if unset. |
Kling Avatar (Std): kling-avatar-std
Kling avatar standard: lip-sync animated avatar.
| Field | Type | Default | Description |
|---|---|---|---|
| prompt (required) | string | — | Text prompt describing the desired output. |
| image_url (required) | url | — | Avatar source image. |
| audio_url (required) | url | — | Driver audio track. |
| webhook_url | url | — | HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional; fall back to polling if unset. |
Kling Avatar (Pro): kling-avatar-pro
Kling avatar pro: higher fidelity.
| Field | Type | Default | Description |
|---|---|---|---|
| prompt (required) | string | — | Text prompt describing the desired output. |
| image_url (required) | url | — | Avatar source image. |
| audio_url (required) | url | — | Driver audio track. |
| webhook_url | url | — | HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional; fall back to polling if unset. |
MiniMax Hailuo 02: hailuo-02
MiniMax Hailuo 02. Supports camera directives embedded in the prompt.
- Camera directives like [Pan Right], [Tilt Up], [Zoom In], [Tracking Shot] are written inline in prompt; see /docs/api/videos#hailuo-camera.
| Field | Type | Default | Description |
|---|---|---|---|
| prompt (required) | string | — | Text prompt describing the desired output. |
| image_url | url | — | Optional starting frame for image-to-video. |
| duration | enum | 6 | Length in seconds. One of 6, 10. |
| webhook_url | url | — | HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional; fall back to polling if unset. |
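Since Hailuo 02 camera directives are plain text inside the prompt, a typo silently becomes part of the scene description. A small helper can guard against that. This is a sketch: the directive names are copied from the directive set given in this document, the bracket-and-comma format follows the Hailuo example on this page, and `with_camera` is an illustrative name.

```python
# Directive names copied from the directive set listed in this document.
CAMERA_DIRECTIVES = {
    "Pan Right", "Pan Left", "Tilt Up", "Tilt Down", "Zoom In", "Zoom Out",
    "Tracking Shot", "Dolly In", "Dolly Out", "Static", "Handheld",
}


def with_camera(prompt: str, *directives: str) -> str:
    """Prefix the prompt with one bracketed, comma-separated directive group."""
    unknown = [d for d in directives if d not in CAMERA_DIRECTIVES]
    if unknown:
        raise ValueError(f"unsupported camera directive(s): {', '.join(unknown)}")
    if not directives:
        return prompt
    return f"[{', '.join(directives)}] {prompt}"
```

For example, `with_camera("A skateboarder weaves through a rainy night market.", "Tracking Shot", "Pan Right")` yields a prompt that starts with `[Tracking Shot, Pan Right]`.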
Grok Imagine (T2V): grok-imagine-t2v
xAI Grok Imagine, text-to-video.
| Field | Type | Default | Description |
|---|---|---|---|
| prompt (required) | string | — | Text prompt describing the desired output. |
| duration | enum | 5 | Length in seconds. One of 5, 10. |
| webhook_url | url | — | HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional; fall back to polling if unset. |
Grok Imagine (I2V): grok-imagine-i2v
xAI Grok Imagine, image-to-video.
| Field | Type | Default | Description |
|---|---|---|---|
| prompt (required) | string | — | Text prompt describing the desired output. |
| image_url (required) | url | — | Starting frame. |
| webhook_url | url | — | HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional; fall back to polling if unset. |
Topaz Video Upscale: topaz-upscale
Topaz video enhance / upscale to 4K.
| Field | Type | Default | Description |
|---|---|---|---|
| video_url (required) | url | — | Source video. |
| target_resolution | enum | 4k | Target resolution. One of 1080p, 4k. |
| webhook_url | url | — | HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional; fall back to polling if unset. |
InfiniTalk (Audio-driven): infinitalk-audio
Audio-driven talking avatar. Takes a driver audio and a still character image.
| Field | Type | Default | Description |
|---|---|---|---|
| driver_audio_url (required) | url | — | Audio that drives lip-sync. |
| driver_image_url (required) | url | — | Character still image. |
| webhook_url | url | — | HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional; fall back to polling if unset. |
Hailuo 02: camera directives in prompt
Hailuo 02 accepts camera-motion directives written inline in the prompt as bracketed tokens. Multiple directives may be combined inside one pair of brackets, separated by commas, for example [Tracking Shot, Pan Right]. The directive set is:
[Pan Right], [Pan Left], [Tilt Up], [Tilt Down], [Zoom In], [Zoom Out], [Tracking Shot], [Dolly In], [Dolly Out], [Static], [Handheld]
Veo 3.1: image-to-video input formats
Veo accepts three input shapes for the optional starting frame: an https URL to a publicly accessible image, a base64 data URI ("data:image/png;base64,…") inline in the JSON body, or a reference to a previous Orux AI task that produced an image (pass {"task_id":"img_…"} as the image field). Duration is fixed at 8 seconds; resolution defaults to 1080p; audio is on by default.
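The three input shapes can be normalised in one place before building the request. The sketch below is an illustration only: `to_data_uri` and `veo_image_field` are hypothetical helper names, PNG is assumed as the default MIME type, and the checks are shallow prefix tests rather than full validation.

```python
import base64


def to_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def veo_image_field(source):
    """Return a value suitable for the "image" field.

    Accepts raw bytes (encoded to a data URI), an https URL or data URI
    string (passed through), or a {"task_id": ...} reference (passed through).
    """
    if isinstance(source, bytes):
        return to_data_uri(source)
    if isinstance(source, str) and source.startswith(("https://", "data:image/")):
        return source
    if isinstance(source, dict) and set(source) == {"task_id"}:
        return source
    raise ValueError("expected image bytes, an https URL, a data URI, or {'task_id': ...}")
```

Reading a local file with `open(path, "rb").read()` and passing the bytes to `veo_image_field` gives the inline form; keep in mind that base64 inflates the payload by roughly a third, so a URL is usually the lighter option for large images.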
Examples
Submit and poll
# 1. Submit a job
curl https://orux.top/api/v1/tasks \
-H "Authorization: Bearer $ORUX_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "veo-3.1-quality",
"prompt": "A drone shot flying over Tokyo at sunset, cinematic",
"duration": 8,
"resolution": "1080p",
"webhook_url": "https://your.app/webhooks/orux"
}'
# -> { "task_id": "task_01HZX...", "status": "QUEUED", "submit_time": "2026-06-04T22:00:00" }
# 2. Poll until done
curl https://orux.top/api/v1/tasks/task_01HZX... \
-H "Authorization: Bearer $ORUX_API_KEY"
# -> { "task_id": "...", "status": "SUCCESS", "result_urls": ["https://..."], "duration_sec": 38 }
Veo 3.1 with base64 starting frame
curl https://orux.top/api/v1/tasks \
-H "Authorization: Bearer $ORUX_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model":"veo-3.1-quality",
"prompt":"Cinematic dolly-in toward a fox in a moonlit forest",
"aspect_ratio":"16:9",
"audio": true,
"image":"data:image/png;base64,iVBORw0KGgoAAA..."
}'
Hailuo 02 with camera directives in prompt
curl https://orux.top/api/v1/tasks \
-H "Authorization: Bearer $ORUX_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model":"hailuo-02",
"prompt":"[Tracking Shot, Pan Right] A skateboarder weaves through a rainy night market, neon reflections on wet pavement.",
"duration": 6
}'