Unified Video Generation API

This endpoint is the entry point for multi-vendor video generation tasks. Each request first enters the unified task protocol, and the dispatch layer then selects the channel that actually executes it.
  • A single entry point supports both text-to-video and image-to-video.
  • Processing is asynchronous: a successful submission returns a public video task ID.
  • Vendor-specific parameters can be passed through via metadata.
  • Combine it with /v1/video/generations/{task_id} and /v1/videos/{task_id} to form a complete task flow.
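
The submit-then-query flow described above can be sketched in Python as follows. This is a minimal illustration, not a reference client. It assumes that GET /v1/video/generations/{task_id} returns JSON with the same status field as the submission response, and the terminal status names in TERMINAL_STATUSES are guesses, since only queued and in_progress are documented here. The function names, polling interval, and poll limit are hypothetical.

```python
import json
import time
import urllib.request

BASE_URL = "https://octopusx.ai"

# Assumed terminal statuses; only "queued" and "in_progress" are documented.
TERMINAL_STATUSES = {"completed", "succeeded", "failed", "cancelled"}


def http_json(method, url, api_key, body=None):
    """Send one JSON request with the Bearer token and decode the JSON reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(url, data=data, method=method)
    request.add_header("Authorization", f"Bearer {api_key}")
    request.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def run_video_task(payload, api_key, call=http_json, interval=5.0, max_polls=120):
    """Submit a task, then poll the query route until an assumed terminal status."""
    task = call("POST", f"{BASE_URL}/v1/video/generations", api_key, payload)
    task_id = task.get("id") or task.get("task_id")
    for _ in range(max_polls):
        if task.get("status") in TERMINAL_STATUSES:
            return task
        time.sleep(interval)
        # Assumption: the query route answers GET with the same JSON shape.
        task = call("GET", f"{BASE_URL}/v1/video/generations/{task_id}", api_key)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")
```

The call parameter lets the transport be swapped out, so the loop can be exercised without network access. The sketch does not use /v1/videos/{task_id}.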

Method and Path

POST /v1/video/generations

Request Example

curl -X POST https://octopusx.ai/v1/video/generations \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kling-v1",
    "prompt": "A golden retriever running on the beach, with the camera smoothly tracking it",
    "duration": 5,
    "size": "1280x720",
    "metadata": {
      "style": "cinematic"
    }
  }'
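
For callers who are not using curl, the same request can be built with the Python standard library. The sketch below only constructs the request object and leaves sending commented out. The helper name build_generation_request is hypothetical.

```python
import json
import urllib.request


def build_generation_request(api_key, payload):
    """Build (but do not send) the POST request shown in the curl example."""
    return urllib.request.Request(
        "https://octopusx.ai/v1/video/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_generation_request(
    "YOUR_API_KEY",
    {
        "model": "kling-v1",
        "prompt": "A golden retriever running on the beach, with the camera smoothly tracking it",
        "duration": 5,
        "size": "1280x720",
        "metadata": {"style": "cinematic"},
    },
)

# Sending is a single call once a real key is in place:
# with urllib.request.urlopen(request) as response:
#     task = json.loads(response.read())
```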

Response Example

{
  "id": "video_task_abc123",
  "task_id": "video_task_abc123",
  "object": "video",
  "model": "kling-v1",
  "status": "queued",
  "progress": 0,
  "created_at": 1735689600
}

Authentication

Authorization: Bearer YOUR_API_KEY

Body

model
string
required
The model name used for billing and dispatch, such as kling-v1, sora-2, or veo-3. Whether a model is actually available depends on the group assigned to the current token and on the channel configuration.
prompt
string
Text prompt. Usually required for pure text-to-video generation. For light edits based on an image, a clear description is still recommended.
image
string
Input image for image-to-video generation, given as a URL or Base64. Like images and input_reference, its presence indicates that the request carries visual reference input.
images
array<string>
Multi-image reference input. Commonly used for upstreams that require multiple reference images.
input_reference
string | array<string>
Additional reference materials. Some upstreams treat this as the first frame, reference image, or a collection of materials.
duration
integer
Target duration in seconds. Some channels also accept seconds; if both are provided, the exact precedence is determined by the adapter.
seconds
string
Duration in seconds, expressed as a string in the OpenAI-compatible video style. Common values include "5", "10", and "15".
size
string
Size or ratio hint, such as 1280x720, 720x1280, or 16:9. Whether it is recognized depends on the specific channel.
metadata
object
Vendor-specific extension parameters, such as style, negative_prompt, camera_control, aspectRatio, and quality.
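
A request body can be assembled from the fields above with a small helper, sketched here in Python. The helper name build_payload is hypothetical. Rejecting requests that set both duration and seconds is a client-side choice made for this sketch, because precedence between the two is left to the adapter. The API itself is not documented as rejecting such requests.

```python
def build_payload(model, prompt=None, image=None, images=None,
                  input_reference=None, duration=None, seconds=None,
                  size=None, metadata=None):
    """Assemble a request body, leaving out every field that was not provided."""
    if not model:
        raise ValueError("model is required")
    if duration is not None and seconds is not None:
        # Precedence between the two is adapter-defined, so this sketch
        # refuses to send both rather than guess which one wins.
        raise ValueError("pass either duration or seconds, not both")
    fields = {
        "model": model,
        "prompt": prompt,
        "image": image,
        "images": images,
        "input_reference": input_reference,
        # duration is documented as an integer, seconds as a string.
        "duration": int(duration) if duration is not None else None,
        "seconds": str(seconds) if seconds is not None else None,
        "size": size,
        "metadata": metadata,
    }
    return {name: value for name, value in fields.items() if value is not None}
```

For example, build_payload("veo-3", prompt="A red sports car", seconds=5) yields a body whose seconds value is the string "5" and which contains no duration key.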

Response

id
string
Public task ID on the platform. It can be used directly for subsequent queries and proxy downloads.
task_id
string
Task ID field retained for compatibility with legacy APIs, usually the same as id.
object
string
Fixed to video.
status
string
Initial status after submission; common values are queued or in_progress.
progress
integer
Task progress percentage. It is usually 0 when first submitted.
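
One way to read the submission response into a typed object is sketched below in Python. The VideoTask class and parse_task function are hypothetical names. The model and created_at fields appear in the response example but are not described in the field list, so the sketch treats them as optional.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class VideoTask:
    id: str
    object: str
    status: Optional[str]
    progress: int
    model: Optional[str] = None        # present in the example, not in the field list
    created_at: Optional[int] = None   # present in the example, not in the field list


def parse_task(raw):
    """Parse a submission response given as a JSON string, bytes, or dict."""
    data = json.loads(raw) if isinstance(raw, (str, bytes)) else raw
    # task_id is the legacy field and is usually the same as id.
    task_id = data.get("id") or data.get("task_id")
    if not task_id:
        raise ValueError("response carries neither id nor task_id")
    return VideoTask(
        id=task_id,
        object=data.get("object", "video"),
        status=data.get("status"),
        progress=int(data.get("progress", 0)),
        model=data.get("model"),
        created_at=data.get("created_at"),
    )
```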

Use Cases

Text-to-Video

curl -X POST https://octopusx.ai/v1/video/generations \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "veo-3",
    "prompt": "A red sports car driving through a rainy night city",
    "seconds": "5"
  }'

Image-to-Video

curl -X POST https://octopusx.ai/v1/video/generations \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kling-v1",
    "prompt": "Make the person smile and wave",
    "image": "https://.../portrait.png",
    "duration": 5
  }'
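
When the source image is a local file rather than a URL, the image field can carry Base64 instead. The Python sketch below assumes that a bare Base64 string is accepted. The source does not say whether a data URI prefix is needed, so check the target channel. Both function names are hypothetical.

```python
import base64
from pathlib import Path


def image_to_base64(path):
    """Read a local image and return its contents as a Base64 string."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


def image_to_video_payload(model, prompt, image_path, duration):
    """Build an image-to-video body with the image inlined as Base64."""
    return {
        "model": model,
        "prompt": prompt,
        # Assumption: a bare Base64 string, without a data URI prefix.
        "image": image_to_base64(image_path),
        "duration": duration,
    }
```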

Notes

This route is a unified task entry point and does not guarantee that all channels accept the same set of parameters. Fields not recognized by the adapter may be ignored, or may only take effect for certain models.