Videos

POST /api/v1/tasks — submit a video generation task.

Video generation is always async. Submit a job, receive a task_id, then either poll /api/v1/tasks/{id} or set webhook_url to be notified on completion.

How async tasks work

The POST returns immediately with a task_id and status "QUEUED". The status changes to "RUNNING" once a worker picks up the task, then ends in one of the terminal states "SUCCESS", "FAILED" or "EXPIRED". Webhooks fire on terminal states.
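The lifecycle above can be modelled on the client with a small enum. This is a minimal sketch: the status names are taken from the Response table, while the case-insensitive parsing is a defensive assumption rather than documented API behaviour, and the helper names are hypothetical.

```python
from enum import Enum


class TaskStatus(str, Enum):
    """Task states named in this page."""
    QUEUED = "QUEUED"
    RUNNING = "RUNNING"
    SUCCESS = "SUCCESS"
    FAILED = "FAILED"
    EXPIRED = "EXPIRED"


# States after which the task no longer changes (webhooks fire on these).
TERMINAL_STATUSES = frozenset(
    {TaskStatus.SUCCESS, TaskStatus.FAILED, TaskStatus.EXPIRED}
)


def parse_status(raw: str) -> TaskStatus:
    """Normalise case before lookup (assumption: casing may vary)."""
    return TaskStatus(raw.strip().upper())


def is_terminal(raw: str) -> bool:
    """True once the task has reached SUCCESS, FAILED or EXPIRED."""
    return parse_status(raw) in TERMINAL_STATUSES
```

A client would typically stop polling, or acknowledge a webhook, as soon as `is_terminal` returns true.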

Webhook over polling

For tasks that take more than 30 seconds, supply a webhook_url and verify the HMAC signature of each delivery instead of polling.
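Signature verification might look like the sketch below. This page does not state the hash algorithm, the signature encoding, or the header that carries it, so the example assumes HMAC-SHA256 over the raw request body with a hex-encoded digest; check /docs/webhooks for the actual scheme before relying on it. The function name is hypothetical.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, raw_body: bytes, signature_hex: str) -> bool:
    """Return True if signature_hex matches HMAC-SHA256(secret, raw_body).

    Assumptions (not confirmed by this page): SHA-256, hex encoding,
    and that the signature covers the unmodified request body bytes.
    """
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    received = signature_hex.strip().lower()
    # compare_digest runs in constant time to avoid leaking match position.
    return hmac.compare_digest(expected.encode("ascii"), received.encode("utf-8"))
```

Verify against the raw bytes exactly as received; re-serialising the parsed JSON can change whitespace or key order and break the comparison.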

Pick a video model

Video models differ substantially: Veo fixes duration at 8s but supports audio tracks; Sora and Kling produce 5s or 10s clips; Runway Gen-3 Alpha Turbo requires a starting image; Hailuo accepts camera directives inline in the prompt. The matrix below summarises the differences; the per-model sections spell out every parameter.

Model ID	Model	Description	Spec	Capabilities	Top params
`veo-3.1-quality`	Veo 3.1 Quality	Google Veo 3.1, quality tier. 8s fixed length. Accepts an input image as URL or base64 data URI.	≤ 8s	Audio	`prompt`, `aspect_ratio`, `audio`
`veo-3.1-fast`	Veo 3.1 Fast	Veo 3.1 fast tier: lower price, slightly less detail.	≤ 8s	Audio	`prompt`, `aspect_ratio`, `audio`
`sora2`	Sora 2	OpenAI Sora 2: text-to-video, 5s or 10s.	≤ 10s	—	`prompt`, `duration`, `aspect_ratio`
`runway-gen3-alpha-turbo`	Runway Gen-3 Alpha Turbo	Image-to-video, 5s or 10s.	≤ 10s	—	`prompt`, `image_url`, `duration`
`runway-gen4`	Runway Gen-4	Image- or text-to-video.	≤ 10s	—	`prompt`, `image_url`, `duration`
`runway-aleph`	Runway Aleph	Edit existing video clips with text.	—	—	`prompt`, `video_url`
`kling-v21-master-i2v`	Kling v2.1 Master (I2V)	Kling 2.1 master image-to-video.	≤ 10s	—	`prompt`, `image_url`, `duration`
`kling-v21-master-t2v`	Kling v2.1 Master (T2V)	Kling 2.1 master text-to-video.	≤ 10s	—	`prompt`, `duration`
`kling-v25-i2v-pro`	Kling v2.5 Pro (I2V)	Kling 2.5 pro image-to-video.	≤ 10s	—	`prompt`, `image_url`, `duration`
`kling-v25-t2v-pro`	Kling v2.5 Pro (T2V)	Kling 2.5 pro text-to-video.	≤ 10s	—	`prompt`, `duration`
`kling-avatar-std`	Kling Avatar (Std)	Kling avatar standard: lip-sync animated avatar.	—	—	`prompt`, `image_url`, `audio_url`
`kling-avatar-pro`	Kling Avatar (Pro)	Kling avatar pro: higher fidelity.	—	—	`prompt`, `image_url`, `audio_url`
`hailuo-02`	MiniMax Hailuo 02	Supports camera directives embedded in the prompt.	≤ 10s	—	`prompt`, `image_url`, `duration`
`grok-imagine-t2v`	Grok Imagine (T2V)	xAI Grok Imagine, text-to-video.	—	—	`prompt`, `duration`
`grok-imagine-i2v`	Grok Imagine (I2V)	xAI Grok Imagine, image-to-video.	—	—	`prompt`, `image_url`
`topaz-upscale`	Topaz Video Upscale	Topaz video enhance / upscale to 4K.	—	—	`video_url`, `target_resolution`
`infinitalk-audio`	InfiniTalk (Audio-driven)	Audio-driven talking avatar. Takes a driver audio and a still character image.	—	—	`driver_audio_url`, `driver_image_url`

17 models

Request body (shared)

Field	Type	Default	Description
`model` (required)	`string`	—	Model ID from the matrix above, e.g. "veo-3.1-quality", "veo-3.1-fast", "sora2", "hailuo-02".
`prompt` (required)	`string`	—	Text prompt describing the video. Required by every model whose table lists it; topaz-upscale and infinitalk-audio do not take a prompt.
`duration`	`int`	—	Length in seconds. Allowed values depend on the model, for example 5 or 10 for Sora 2, 6 or 10 for Hailuo 02, and fixed at 8 for Veo 3.1.
`resolution`	`string`	`"720p"`	"480p", "720p" or "1080p" where supported.
`aspect_ratio`	`string`	`"16:9"`	"16:9", "9:16", "1:1".
`image`	`string`	—	Optional starting image URL or base64 data URI for image-to-video. Veo 3.1 uses this field; most other models take `image_url` instead (see the per-model tables).
`webhook_url`	`string`	—	POST target for task-complete events. See /docs/webhooks.
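As an illustration of how these fields map onto an HTTP request, the sketch below builds (but does not send) the POST using only the Python standard library. The URL is the one used in the curl examples on this page; the function name and the choice to drop `None` values are assumptions made for the example.

```python
import json
import urllib.request

# Endpoint as it appears in the curl examples on this page.
API_URL = "https://orux.top/api/v1/tasks"


def build_task_request(api_key: str, **fields) -> urllib.request.Request:
    """Build the POST request for a video task without sending it."""
    # Omit unset optional fields so only explicit values reach the API.
    body = {name: value for name, value in fields.items() if value is not None}
    if "model" not in body:
        raise ValueError("model is required")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the task; the response body is expected to carry the fields described under Response.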

Response

Field	Type	Default	Description
`task_id`	`string`	—	Use with GET /api/v1/tasks/{task_id} or webhooks.
`status`	`string`	—	QUEUED on submit; RUNNING / SUCCESS / FAILED / EXPIRED later.
`submit_time`	`string`	—	ISO-8601 timestamp of submission.
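When a webhook is not available, a polling loop over GET /api/v1/tasks/{task_id} can watch the `status` field until it reaches a terminal value. The sketch below takes the HTTP call as an injected `fetch` function so it stays independent of any client library; the interval, backoff factor and timeout are illustrative assumptions, not documented limits.

```python
import time
from typing import Callable, Dict

TERMINAL = {"SUCCESS", "FAILED", "EXPIRED"}


def poll_task(
    fetch: Callable[[str], Dict],
    task_id: str,
    *,
    interval: float = 5.0,       # first delay between polls (assumed value)
    max_interval: float = 30.0,  # cap for the growing delay (assumed value)
    timeout: float = 600.0,      # overall budget in seconds (assumed value)
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call fetch(task_id) until the task reaches a terminal status."""
    deadline = time.monotonic() + timeout
    delay = interval
    while True:
        task = fetch(task_id)
        if str(task.get("status", "")).upper() in TERMINAL:
            return task
        if time.monotonic() + delay > deadline:
            raise TimeoutError(f"task {task_id} still running after {timeout}s")
        sleep(delay)
        # Back off gently so long jobs do not hammer the endpoint.
        delay = min(delay * 1.5, max_interval)
```

Here `fetch` would wrap an authenticated GET and return the decoded JSON object; injecting it also makes the loop easy to exercise with canned responses.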

Per-model parameters

Pass only the fields listed in the model's table; Orux AI rejects unknown fields with invalid_param. webhook_url is supported by all video models and is strongly preferred over polling in production.
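Because unknown fields are rejected, it can help to check a payload locally before submitting it. The sketch below hard-codes the field lists for four models, copied from the per-model tables on this page; the function and table names are hypothetical, and the server remains the source of truth.

```python
# Field lists copied from the per-model tables (subset of models only).
ALLOWED_FIELDS = {
    "sora2": {"prompt", "duration", "aspect_ratio", "webhook_url"},
    "runway-gen3-alpha-turbo": {
        "prompt", "image_url", "duration", "camera_motion", "webhook_url",
    },
    "hailuo-02": {"prompt", "image_url", "duration", "webhook_url"},
    "topaz-upscale": {"video_url", "target_resolution", "webhook_url"},
}

REQUIRED_FIELDS = {
    "sora2": {"prompt"},
    "runway-gen3-alpha-turbo": {"prompt", "image_url"},
    "hailuo-02": {"prompt"},
    "topaz-upscale": {"video_url"},
}


def check_payload(payload: dict) -> list:
    """Return a list of problems; an empty list means the payload looks valid."""
    model = payload.get("model")
    if model not in ALLOWED_FIELDS:
        return [f"unknown model: {model!r}"]
    fields = set(payload) - {"model"}
    problems = [
        f"unknown field: {name}"
        for name in sorted(fields - ALLOWED_FIELDS[model])
    ]
    problems += [
        f"missing required field: {name}"
        for name in sorted(REQUIRED_FIELDS[model] - fields)
    ]
    return problems
```

For example, sending `image_url` to `sora2` is flagged locally instead of coming back from the API as invalid_param.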

Veo 3.1 Quality `veo-3.1-quality`

Google Veo 3.1, quality tier. 8s fixed length. Accepts an input image as URL or base64 data URI.

Audio

≤ 8s

async task

• duration is fixed at 8 seconds; any other value is rejected with invalid_param.

Field	Type	Default	Description
`prompt` (required)	`string`	—	Text prompt describing the desired output.
`aspect_ratio`	`enum`	`16:9`	Output aspect ratio. `16:9`, `9:16`, `1:1`
`audio`	`boolean`	`true`	Include the model-generated audio track.
`image`	`string`	—	Optional starting image. Accepts an https URL OR a base64 data URI ("data:image/png;base64,...").
`webhook_url`	`url`	—	HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional — fall back to polling.

Veo 3.1 Fast `veo-3.1-fast`

Veo 3.1 fast tier — lower price, slightly less detail.

Audio

≤ 8s

async task

Field	Type	Default	Description
`prompt` (required)	`string`	—	Text prompt describing the desired output.
`aspect_ratio`	`enum`	`16:9`	Aspect ratio. `16:9`, `9:16`, `1:1`
`audio`	`boolean`	`true`	Include the audio track.
`image`	`string`	—	URL or base64 data URI as starting frame.
`webhook_url`	`url`	—	HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional — fall back to polling.

Sora 2 `sora2`

OpenAI Sora 2 — text-to-video, 5s or 10s.

≤ 10s

async task

Field	Type	Default	Description
`prompt` (required)	`string`	—	Text prompt describing the desired output.
`duration`	`enum`	`5`	Length in seconds. `5`, `10`
`aspect_ratio`	`enum`	`16:9`	Aspect ratio. `16:9`, `9:16`
`webhook_url`	`url`	—	HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional — fall back to polling.

Runway Gen-3 Alpha Turbo `runway-gen3-alpha-turbo`

Runway Gen-3 alpha turbo — image-to-video, 5/10s.

≤ 10s

async task

Field	Type	Default	Description
`prompt` (required)	`string`	—	Text prompt describing the desired output.
`image_url` (required)	`url`	—	Starting frame (image-to-video).
`duration`	`enum`	`5`	Seconds. `5`, `10`
`camera_motion`	`string`	—	Optional camera-motion hint, e.g. "zoom_in", "pan_left".
`webhook_url`	`url`	—	HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional — fall back to polling.

Runway Gen-4 `runway-gen4`

Runway Gen-4 — image- or text-to-video.

≤ 10s

async task

Field	Type	Default	Description
`prompt` (required)	`string`	—	Text prompt describing the desired output.
`image_url`	`url`	—	Optional starting frame.
`duration`	`enum`	`5`	Seconds. `5`, `10`
`webhook_url`	`url`	—	HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional — fall back to polling.

Runway Aleph `runway-aleph`

Runway Aleph — edit existing video clips with text.

async task

Field	Type	Default	Description
`prompt` (required)	`string`	—	Text prompt describing the desired output.
`video_url` (required)	`url`	—	Source video to edit.
`webhook_url`	`url`	—	HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional — fall back to polling.

Kling v2.1 Master (I2V) `kling-v21-master-i2v`

Kling 2.1 master image-to-video.

≤ 10s

async task

Field	Type	Default	Description
`prompt` (required)	`string`	—	Text prompt describing the desired output.
`image_url` (required)	`url`	—	Starting frame.
`duration`	`enum`	`5`	Seconds. `5`, `10`
`webhook_url`	`url`	—	HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional — fall back to polling.

Kling v2.1 Master (T2V) `kling-v21-master-t2v`

Kling 2.1 master text-to-video.

≤ 10s

async task

Field	Type	Default	Description
`prompt` (required)	`string`	—	Text prompt describing the desired output.
`duration`	`enum`	`5`	Seconds. `5`, `10`
`webhook_url`	`url`	—	HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional — fall back to polling.

Kling v2.5 Pro (I2V) `kling-v25-i2v-pro`

Kling 2.5 pro image-to-video.

≤ 10s

async task

Field	Type	Default	Description
`prompt` (required)	`string`	—	Text prompt describing the desired output.
`image_url` (required)	`url`	—	Starting frame.
`duration`	`enum`	`5`	Seconds. `5`, `10`
`webhook_url`	`url`	—	HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional — fall back to polling.

Kling v2.5 Pro (T2V) `kling-v25-t2v-pro`

Kling 2.5 pro text-to-video.

≤ 10s

async task

Field	Type	Default	Description
`prompt` (required)	`string`	—	Text prompt describing the desired output.
`duration`	`enum`	`5`	Seconds. `5`, `10`
`webhook_url`	`url`	—	HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional — fall back to polling.

Kling Avatar (Std) `kling-avatar-std`

Kling avatar standard — lip-sync animated avatar.

async task

Field	Type	Default	Description
`prompt` (required)	`string`	—	Text prompt describing the desired output.
`image_url` (required)	`url`	—	Avatar source image.
`audio_url` (required)	`url`	—	Driver audio track.
`webhook_url`	`url`	—	HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional — fall back to polling.

Kling Avatar (Pro) `kling-avatar-pro`

Kling avatar pro — higher fidelity.

async task

Field	Type	Default	Description
`prompt` (required)	`string`	—	Text prompt describing the desired output.
`image_url` (required)	`url`	—	Avatar source image.
`audio_url` (required)	`url`	—	Driver audio track.
`webhook_url`	`url`	—	HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional — fall back to polling.

MiniMax Hailuo 02 `hailuo-02`

MiniMax Hailuo 02. Supports camera directives embedded in the prompt.

≤ 10s

async task

• Camera directives such as [Pan Right], [Tilt Up], [Zoom In] and [Tracking Shot] are written inline in prompt; see /docs/api/videos#hailuo-camera.

Field	Type	Default	Description
`prompt` (required)	`string`	—	Text prompt describing the desired output.
`image_url`	`url`	—	Optional starting frame for image-to-video.
`duration`	`enum`	`6`	Seconds. `6`, `10`
`webhook_url`	`url`	—	HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional — fall back to polling.

Grok Imagine (T2V) `grok-imagine-t2v`

xAI Grok Imagine, text-to-video.

async task

Field	Type	Default	Description
`prompt` (required)	`string`	—	Text prompt describing the desired output.
`duration`	`enum`	`5`	Seconds. `5`, `10`
`webhook_url`	`url`	—	HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional — fall back to polling.

Grok Imagine (I2V) `grok-imagine-i2v`

xAI Grok Imagine, image-to-video.

async task

Field	Type	Default	Description
`prompt` (required)	`string`	—	Text prompt describing the desired output.
`image_url` (required)	`url`	—	Starting frame.
`webhook_url`	`url`	—	HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional — fall back to polling.

Topaz Video Upscale `topaz-upscale`

Topaz video enhance / upscale to 4K.

async task

Field	Type	Default	Description
`video_url` (required)	`url`	—	Source video.
`target_resolution`	`enum`	`4k`	Target resolution. `1080p`, `4k`
`webhook_url`	`url`	—	HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional — fall back to polling.

InfiniTalk (Audio-driven) `infinitalk-audio`

Audio-driven talking avatar. Takes a driver audio and a still character image.

async task

Field	Type	Default	Description
`driver_audio_url` (required)	`url`	—	Audio that drives lip-sync.
`driver_image_url` (required)	`url`	—	Character still image.
`webhook_url`	`url`	—	HTTPS endpoint Orux AI will POST a signed event to on terminal status. Optional — fall back to polling.

Hailuo 02: camera directives in prompt

Hailuo 02 accepts camera-motion directives written inline in the prompt as bracketed tokens. Multiple directives may be combined as a comma-separated list. The directive set is:

[Pan Right] [Pan Left] [Tilt Up] [Tilt Down] [Zoom In] [Zoom Out] [Tracking Shot] [Dolly In] [Dolly Out] [Static] [Handheld]

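Combining directives

Separate multiple directives with commas inside a single pair of brackets, for example [Tracking Shot, Pan Right]. Separately, to drive lip-sync video, pass both image and audio inputs; the channel adapter picks the right upstream automatically.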
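A small helper can keep directive spelling consistent with the set listed above. This is a sketch only: the helper name is hypothetical, and placing the bracket group at the start of the prompt simply mirrors the curl example on this page; whether directives may also appear mid-prompt is not stated here.

```python
# Directive names exactly as listed for Hailuo 02 on this page.
HAILUO_DIRECTIVES = (
    "Pan Right", "Pan Left", "Tilt Up", "Tilt Down", "Zoom In", "Zoom Out",
    "Tracking Shot", "Dolly In", "Dolly Out", "Static", "Handheld",
)


def with_camera(prompt: str, *directives: str) -> str:
    """Prefix prompt with one bracketed, comma-separated directive group."""
    unknown = [d for d in directives if d not in HAILUO_DIRECTIVES]
    if unknown:
        raise ValueError(f"unsupported camera directive(s): {unknown}")
    if not directives:
        return prompt
    return f"[{', '.join(directives)}] {prompt}"
```

For instance, `with_camera("A skateboarder weaves through a rainy night market.", "Tracking Shot", "Pan Right")` yields a prompt beginning with `[Tracking Shot, Pan Right]`.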

Veo 3.1: image-to-video input formats

Veo accepts three input shapes for the optional starting frame: an https URL to a publicly accessible image, a base64 data URI ("data:image/png;base64,…") inline in the JSON body, or a reference to a previous Orux AI task that produced an image (pass {"task_id":"img_…"} as the image field). Duration is fixed at 8 seconds; resolution defaults to 1080p; audio is on by default.
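For the base64 option, the data URI can be assembled from raw image bytes with the standard library. The helper name is hypothetical and the default MIME type follows the "data:image/png;base64,…" form quoted above; any size limit on inline images is not documented on this page.

```python
import base64


def image_data_uri(data: bytes, mime: str = "image/png") -> str:
    """Encode raw image bytes as a data URI suitable for the image field."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

A typical call reads the file in binary mode, for example `image_data_uri(open("frame.png", "rb").read())`, and places the result in the `image` field of the request body.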

Examples

Submit and poll

# 1. Submit a job
curl https://orux.top/api/v1/tasks \
  -H "Authorization: Bearer $ORUX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "veo-3.1-quality",
    "prompt": "A drone shot flying over Tokyo at sunset, cinematic",
    "duration": 8,
    "resolution": "1080p",
    "webhook_url": "https://your.app/webhooks/orux"
  }'
# -> { "task_id": "task_01HZX...", "status": "QUEUED", "submit_time": "2026-06-04T22:00:00" }

# 2. Poll until done
curl https://orux.top/api/v1/tasks/task_01HZX... \
  -H "Authorization: Bearer $ORUX_API_KEY"
# -> { "task_id": "...", "status": "SUCCESS", "result_urls": ["https://..."], "duration_sec": 38 }

Veo 3.1 with base64 starting frame

curl https://orux.top/api/v1/tasks \
  -H "Authorization: Bearer $ORUX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model":"veo-3.1-quality",
    "prompt":"Cinematic dolly-in toward a fox in a moonlit forest",
    "aspect_ratio":"16:9",
    "audio": true,
    "image":"data:image/png;base64,iVBORw0KGgoAAA..."
  }'

Hailuo 02 with camera directives in prompt

curl https://orux.top/api/v1/tasks \
  -H "Authorization: Bearer $ORUX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model":"hailuo-02",
    "prompt":"[Tracking Shot, Pan Right] A skateboarder weaves through a rainy night market, neon reflections on wet pavement.",
    "duration": 6
  }'