PrunaAI/p-video

Partner

$0.02 / second

Real-time AI video generation from text, images, and audio. Supports up to 1080p at 48 FPS with built-in audio generation, draft mode for 4x faster previews, and prompt upsampling.

Public

Input

Prompt

Text prompt describing the video content.

Image

URL of an input image to use as the first frame for image-to-video generation. (Default: empty)

Audio

URL of an audio file to condition the video on. When provided, the duration parameter is ignored. (Default: empty)

Settings

Duration

Duration of the generated video in seconds, from 1 to 10. Ignored when audio is provided. (Default: empty)

Resolution

Resolution of the generated video. Defaults to 720p.

Fps

Frames per second of the generated video.

Aspect Ratio

Aspect ratio of the generated video.

Seed

Random seed for reproducible generation. (Default: empty)

Draft

Generate a lower-quality draft at reduced cost. 720p draft is $0.005/s vs $0.02/s; 1080p draft is $0.01/s vs $0.04/s.
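
The per-second rates above can be turned into a rough cost estimate. The following is a minimal sketch, not an official billing formula: `estimate_cost` is a hypothetical helper, and it assumes that price scales linearly with output seconds and that no other setting (such as fps) changes the rate, which this page does not state.

```python
# Per-second rates in USD, taken from the Draft setting description.
# Keys are (resolution, draft).
RATES_USD_PER_SECOND = {
    ("720p", False): 0.02,
    ("720p", True): 0.005,
    ("1080p", False): 0.04,
    ("1080p", True): 0.01,
}


def estimate_cost(duration_s: float, resolution: str = "720p", draft: bool = False) -> float:
    """Hypothetical helper: estimated price in USD for one generation.

    Assumes billing is linear in output seconds (an assumption, not confirmed here).
    """
    key = (resolution, draft)
    if key not in RATES_USD_PER_SECOND:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    return round(duration_s * RATES_USD_PER_SECOND[key], 4)


print(estimate_cost(10, "720p"))          # full-quality 10 s clip at 720p
print(estimate_cost(10, "1080p", True))   # 10 s draft at 1080p
```

Under these assumptions a 10-second 1080p draft costs the same as 5 seconds of full-quality 720p, which is one way to budget preview iterations before a final render.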

Save Audio

Whether to preserve audio from the input when doing image-to-video generation.

Prompt Upsampling

Whether to automatically enhance the prompt for better results.

Model Information

P-Video

P-Video is Pruna AI's premium video generation model. It provides an all-in-one endpoint supporting text-to-video, image-to-video, and audio-conditioned generation. It offers up to 1080p resolution at 48 FPS, with configurable duration up to 10 seconds.

Key Features

All-in-one endpoint: Text-to-video, image-to-video, and audio-to-video in a single model
Draft mode: 4x faster and cheaper preview generation for rapid creative iteration
Built-in audio: Native dialogue generation and audio import support
Prompt upsampling: Automatically enhances prompts for better results
Flexible output: Up to 1080p resolution, 48 FPS, 7 aspect ratios

Usage

Text-to-Video

{
  "prompt": "A sports car drifting through a neon-lit city at night, cinematic aerial shot",
  "duration": 5,
  "resolution": "720p",
  "aspect_ratio": "16:9"
}

Image-to-Video

Provide an image URL to animate a static image. When an image is provided, the aspect_ratio parameter is ignored.

{
  "prompt": "The camera slowly pushes in, the person turns their head and smiles, gentle wind moves their hair",
  "image": "https://example.com/photo.jpg",
  "duration": 5,
  "resolution": "720p"
}

Audio-Conditioned Video

Provide an audio URL to generate video synchronized to the audio. When audio is provided, the duration parameter is ignored and the video duration matches the audio length.

{
  "prompt": "A musician performing on stage with dramatic lighting",
  "audio": "https://example.com/audio.mp3",
  "resolution": "720p",
  "aspect_ratio": "16:9"
}
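
The two override rules described for image and audio inputs can be made explicit on the client side. The sketch below is illustrative only: `effective_input` is a hypothetical helper that drops the parameters this page says are ignored, so the payload you log or cache reflects what actually influences the result. It does not call any API.

```python
def effective_input(payload: dict) -> dict:
    """Hypothetical helper: return a copy of the payload without ignored parameters."""
    result = dict(payload)
    if result.get("audio"):
        # With audio, the video duration matches the audio length.
        result.pop("duration", None)
    if result.get("image"):
        # With an image, the aspect_ratio parameter is ignored.
        result.pop("aspect_ratio", None)
    return result


request = {
    "prompt": "A musician performing on stage with dramatic lighting",
    "audio": "https://example.com/audio.mp3",
    "duration": 5,
    "resolution": "720p",
    "aspect_ratio": "16:9",
}
print(effective_input(request))
```

Here `duration` is removed because audio is present, while `aspect_ratio` is kept because no image was supplied.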

Draft Mode

Enable draft mode for 4x faster and cheaper generation, ideal for rapid iteration before producing a final render.

{
  "prompt": "A futuristic cityscape at sunset",
  "draft": true,
  "resolution": "1080p"
}

Parameters

Parameter	Type	Default	Description
`prompt`	string	(required)	Text prompt for video generation
`image`	string	—	Image URL for image-to-video. Supports jpg, jpeg, png, webp
`audio`	string	—	Audio URL for audio-conditioned generation. Supports flac, mp3, wav
`duration`	integer	5	Video duration in seconds (1–10). Ignored when audio is provided
`resolution`	string	"720p"	"720p" or "1080p"
`fps`	integer	24	24 or 48
`aspect_ratio`	string	"16:9"	One of: 16:9, 9:16, 4:3, 3:4, 3:2, 2:3, 1:1. Ignored when image is provided
`seed`	integer	random	Random seed for reproducible generation
`draft`	boolean	false	Draft mode for faster, cheaper preview generation
`save_audio`	boolean	true	Include audio in the output video
`prompt_upsampling`	boolean	true	Enhance the prompt automatically for better results
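
The constraints in the table above can be checked before a request is sent. This is a minimal sketch under stated assumptions: `validate_input` is a hypothetical client-side helper, the allowed values are copied from the table, and the service's own validation may differ (for example in how it treats unknown keys or unsupported file extensions).

```python
ALLOWED_RESOLUTIONS = {"720p", "1080p"}
ALLOWED_FPS = {24, 48}
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "4:3", "3:4", "3:2", "2:3", "1:1"}


def validate_input(payload: dict) -> list:
    """Hypothetical helper: return a list of problems found in a request payload.

    An empty list means the payload is consistent with the parameter table.
    """
    problems = []

    prompt = payload.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt is required and must be a non-empty string")

    # duration is only checked when no audio is given, since audio overrides it.
    if "duration" in payload and not payload.get("audio"):
        duration = payload["duration"]
        if isinstance(duration, bool) or not isinstance(duration, int) or not 1 <= duration <= 10:
            problems.append("duration must be an integer from 1 to 10")

    if payload.get("resolution", "720p") not in ALLOWED_RESOLUTIONS:
        problems.append('resolution must be "720p" or "1080p"')

    if payload.get("fps", 24) not in ALLOWED_FPS:
        problems.append("fps must be 24 or 48")

    # aspect_ratio is ignored when an image is provided, so skip the check then.
    if not payload.get("image") and payload.get("aspect_ratio", "16:9") not in ALLOWED_ASPECT_RATIOS:
        problems.append("aspect_ratio must be one of: " + ", ".join(sorted(ALLOWED_ASPECT_RATIOS)))

    return problems


print(validate_input({"prompt": "A futuristic cityscape at sunset", "fps": 30}))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a payload at once.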

Performance

Configuration	Approximate Time
10s, 720p	~23s
10s, 720p, draft	~5s
10s, 1080p	~43s
10s, 1080p, draft	~10s
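
As a quick check of the "4x faster" draft claim against the approximate times above, the ratio of full to draft time can be computed directly. The snippet uses only the four figures from the table; `draft_speedup` is a name chosen for illustration.

```python
# Approximate generation times in seconds for a 10 s clip, from the table above.
# Keys are (resolution, draft).
TIMES_S = {
    ("720p", False): 23,
    ("720p", True): 5,
    ("1080p", False): 43,
    ("1080p", True): 10,
}


def draft_speedup(resolution: str) -> float:
    """Ratio of full-quality time to draft time at the given resolution."""
    return TIMES_S[(resolution, False)] / TIMES_S[(resolution, True)]


for res in ("720p", "1080p"):
    print(f"{res}: draft is about {draft_speedup(res):.1f}x faster")
# prints:
# 720p: draft is about 4.6x faster
# 1080p: draft is about 4.3x faster
```

Both ratios sit slightly above 4, consistent with the stated speedup given that the times are approximate.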

Strengths

Strong input-image consistency
Reliable lip sync and native dialogue
High-quality subject and background rendering
Effective at animating low-resolution assets
Particularly strong with close-up subjects and foreground objects

Limitations

Not designed for extreme cinematic camera motion or complex multi-scene storytelling
Sound effects (SFX) performance is limited
With more than two speakers, speaker separation can degrade