PrunaAI/p-video

$0 / second

Real-time AI video generation from text, images, and audio. Supports up to 1080p at 48 FPS with built-in audio generation, draft mode for 4x faster previews, and prompt upsampling.

Partner
Public

Input

Prompt

Text prompt describing the video content.

Image

URL of an input image to use as the first frame for image-to-video generation. (Default: empty)

Audio

URL of an audio file to condition the video on. When provided, the duration parameter is ignored. (Default: empty)

Settings

Duration

Duration of the generated video in seconds (1-10). Ignored when audio is provided. (Default: empty, 1 ≤ duration ≤ 10)

Resolution

Resolution of the generated video. Defaults to 720p.

Fps

Frames per second of the generated video.

Aspect Ratio

Aspect ratio of the generated video.

Seed

Random seed for reproducible generation. (Default: empty)

Draft

Generate a lower-quality draft at reduced cost. 720p draft is $0.005/s vs $0.02/s; 1080p draft is $0.01/s vs $0.04/s.
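The per-second prices above make it easy to compare a draft against a final render before submitting a job. The following is a minimal sketch in Python, assuming that billing scales linearly with the length of the output video and that no rounding, minimum charge, or FPS surcharge applies; none of that is stated on this page. The names `PRICE_PER_SECOND` and `estimate_cost` are hypothetical helpers, not part of any API.

```python
# Per-second prices in USD, copied from the Draft setting description.
# Keys are (resolution, draft).
PRICE_PER_SECOND = {
    ("720p", False): 0.02,
    ("720p", True): 0.005,
    ("1080p", False): 0.04,
    ("1080p", True): 0.01,
}


def estimate_cost(duration_s: float, resolution: str = "720p", draft: bool = False) -> float:
    """Estimate the price in USD of one generation.

    Assumption: cost = duration * per-second price, with no minimum charge.
    """
    try:
        rate = PRICE_PER_SECOND[(resolution, draft)]
    except KeyError:
        raise ValueError(f"unsupported resolution: {resolution!r}") from None
    return round(duration_s * rate, 4)


if __name__ == "__main__":
    for resolution in ("720p", "1080p"):
        full = estimate_cost(10, resolution)
        draft = estimate_cost(10, resolution, draft=True)
        print(f"10 s at {resolution}: final ${full:.2f}, draft ${draft:.2f}")
```

Under these assumptions a draft costs one quarter of the final render at both resolutions.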

Save Audio

Whether to preserve audio from the input when doing image-to-video generation.

Prompt Upsampling

Whether to automatically enhance the prompt for better results.

Output

Model Information

P-Video

P-Video is Pruna AI's premium video generation model. It provides an all-in-one endpoint supporting text-to-video, image-to-video, and audio-conditioned generation. It offers up to 1080p resolution at 48 FPS, with configurable duration up to 10 seconds.

Key Features

  • All-in-one endpoint: Text-to-video, image-to-video, and audio-to-video in a single model
  • Draft mode: 4x faster and cheaper preview generation for rapid creative iteration
  • Built-in audio: Native dialogue generation and audio import support
  • Prompt upsampling: Automatically enhances prompts for better results
  • Flexible output: Up to 1080p resolution, 48 FPS, 7 aspect ratios

Usage

Text-to-Video

{
  "prompt": "A sports car drifting through a neon-lit city at night, cinematic aerial shot",
  "duration": 5,
  "resolution": "720p",
  "aspect_ratio": "16:9"
}
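
A payload like the one above is typically sent as the JSON body of an HTTP POST request. The sketch below only builds such a request with Python's standard library and does not send it. The endpoint URL and the Bearer-token header are assumptions used for illustration; this page does not document the real endpoint or authentication scheme, so take both from your provider's API reference.

```python
import json
import urllib.request

# Hypothetical placeholder, NOT a real endpoint: replace it with the URL
# given in your provider's API documentation.
ENDPOINT_URL = "https://api.example.com/inference/PrunaAI/p-video"


def make_request(payload: dict, api_token: str) -> urllib.request.Request:
    """Wrap a payload in a JSON POST request (Bearer auth is an assumption)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
        method="POST",
    )


request = make_request(
    {
        "prompt": "A sports car drifting through a neon-lit city at night, cinematic aerial shot",
        "duration": 5,
        "resolution": "720p",
        "aspect_ratio": "16:9",
    },
    api_token="YOUR_API_TOKEN",
)

# Sending is left commented out because the endpoint above is a placeholder:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```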

Image-to-Video

Provide an image URL to animate a static image. When an image is provided, the aspect_ratio parameter is ignored.

{
  "prompt": "The camera slowly pushes in, the person turns their head and smiles, gentle wind moves their hair",
  "image": "https://example.com/photo.jpg",
  "duration": 5,
  "resolution": "720p"
}

Audio-Conditioned Video

Provide an audio URL to generate video synchronized to the audio. When audio is provided, the duration parameter is ignored and the video duration matches the audio length.

{
  "prompt": "A musician performing on stage with dramatic lighting",
  "audio": "https://example.com/audio.mp3",
  "resolution": "720p",
  "aspect_ratio": "16:9"
}

Draft Mode

Enable draft mode for 4x faster and cheaper generation, ideal for rapid iteration before producing a final render.

{
  "prompt": "A futuristic cityscape at sunset",
  "draft": true,
  "resolution": "1080p"
}

Parameters

Parameter | Type | Default | Description
prompt | string | (required) | Text prompt for video generation
image | string |  | Image URL for image-to-video. Supports jpg, jpeg, png, webp
audio | string |  | Audio URL for audio-conditioned generation. Supports flac, mp3, wav
duration | integer | 5 | Video duration in seconds (1–10). Ignored when audio is provided
resolution | string | "720p" | "720p" or "1080p"
fps | integer | 24 | 24 or 48
aspect_ratio | string | "16:9" | One of: 16:9, 9:16, 4:3, 3:4, 3:2, 2:3, 1:1. Ignored when image is provided
seed | integer | random | Random seed for reproducible generation
draft | boolean | false | Draft mode for faster, cheaper preview generation
save_audio | boolean | true | Include audio in the output video
prompt_upsampling | boolean | true | Enhance the prompt automatically for better results
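
The constraints in this table can be checked on the client before a request is submitted. The following Python sketch validates the documented ranges and drops the parameters that the model ignores (`aspect_ratio` when an image is given, `duration` when audio is given). The function `build_payload` is a hypothetical helper, and omitting ignored fields is a design choice of this sketch; the page only says they are ignored, not that they must be left out.

```python
from typing import Optional

# Allowed values, copied from the parameter table.
RESOLUTIONS = {"720p", "1080p"}
FPS_VALUES = {24, 48}
ASPECT_RATIOS = {"16:9", "9:16", "4:3", "3:4", "3:2", "2:3", "1:1"}


def build_payload(
    prompt: str,
    image: Optional[str] = None,
    audio: Optional[str] = None,
    duration: int = 5,
    resolution: str = "720p",
    fps: int = 24,
    aspect_ratio: str = "16:9",
    seed: Optional[int] = None,
    draft: bool = False,
    save_audio: bool = True,
    prompt_upsampling: bool = True,
) -> dict:
    """Validate arguments and return a request payload as a dict."""
    if not prompt:
        raise ValueError("prompt is required")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(RESOLUTIONS)}")
    if fps not in FPS_VALUES:
        raise ValueError(f"fps must be one of {sorted(FPS_VALUES)}")

    payload = {
        "prompt": prompt,
        "resolution": resolution,
        "fps": fps,
        "draft": draft,
        "save_audio": save_audio,
        "prompt_upsampling": prompt_upsampling,
    }

    if image:
        # aspect_ratio is ignored when an image is provided, so omit it.
        payload["image"] = image
    else:
        if aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
        payload["aspect_ratio"] = aspect_ratio

    if audio:
        # duration is ignored when audio is provided, so omit it.
        payload["audio"] = audio
    else:
        if not 1 <= duration <= 10:
            raise ValueError("duration must be between 1 and 10 seconds")
        payload["duration"] = duration

    if seed is not None:
        payload["seed"] = seed
    return payload
```

For example, `build_payload("A futuristic cityscape at sunset", draft=True, resolution="1080p")` fills in the documented defaults for every field that is not passed explicitly.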

Performance

Configuration | Approximate Time
10s, 720p | ~23s
10s, 720p, draft | ~5s
10s, 1080p | ~43s
10s, 1080p, draft | ~10s
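
The approximate timings above can be turned into two comparison figures: the speedup of draft mode over a final render, and the generation time per second of output video. The short Python sketch below does that arithmetic. The timings are approximate, so the results are only indicative, and the names used are hypothetical.

```python
# Approximate generation times in seconds for a 10 s clip, from the table above.
# Keys are (resolution, draft).
GENERATION_TIME_S = {
    ("720p", False): 23,
    ("720p", True): 5,
    ("1080p", False): 43,
    ("1080p", True): 10,
}
CLIP_LENGTH_S = 10


def draft_speedup(resolution: str) -> float:
    """How many times faster the draft is than the final render."""
    return GENERATION_TIME_S[(resolution, False)] / GENERATION_TIME_S[(resolution, True)]


def time_per_output_second(resolution: str, draft: bool) -> float:
    """Generation seconds per second of video; values below 1.0 beat real time."""
    return GENERATION_TIME_S[(resolution, draft)] / CLIP_LENGTH_S


print(draft_speedup("720p"))   # 4.6
print(draft_speedup("1080p"))  # 4.3
print(time_per_output_second("720p", draft=True))  # 0.5
```

Both speedups are in line with the roughly 4x figure quoted for draft mode. By these timings, 720p draft generation runs faster than real time and 1080p draft generation roughly matches it.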

Strengths

  • Strong input-image consistency
  • Reliable lip sync and native dialogue
  • High-quality subject and background rendering
  • Effective at animating low-resolution assets
  • Particularly strong with close-up subjects and foreground objects

Limitations

  • Not designed for extreme cinematic camera motion or complex multi-scene storytelling
  • Sound effects (SFX) performance is limited
  • Speaker separation can degrade with more than two speakers