
distil-whisper/distil-large-v3

Distil-Whisper was proposed in the paper Robust Knowledge Distillation via Large-Scale Pseudo Labelling. This is the third and final installment of the Distil-Whisper English series. It is the knowledge-distilled version of OpenAI's Whisper large-v3, the latest and most performant Whisper model to date. Compared to previous Distil-Whisper models, the distillation procedure for distil-large-v3 has been adapted to give superior long-form transcription accuracy with OpenAI's sequential long-form algorithm.


Public
$0.00018 / minute
Project | Paper | License
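As a rough illustration of the per-minute price listed above, the following sketch estimates the cost of transcribing a file of known duration. It assumes billing is strictly proportional to audio length with no rounding or minimum charge, which the page does not state; the function name `estimate_cost_usd` is hypothetical.

```python
from decimal import Decimal

# Price taken from the listing above; the linear billing model is an assumption.
PRICE_PER_MINUTE_USD = Decimal("0.00018")


def estimate_cost_usd(duration_seconds):
    """Estimate the transcription cost for audio of the given length in seconds.

    Assumes cost scales linearly with duration (no rounding, no minimum charge).
    """
    minutes = Decimal(str(duration_seconds)) / Decimal(60)
    return minutes * PRICE_PER_MINUTE_USD


if __name__ == "__main__":
    one_hour = estimate_cost_usd(3600)
    print(f"Estimated cost for one hour of audio: ${one_hour}")
```

Under this assumption, one hour of audio (60 minutes) comes to roughly $0.0108.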

Input

Please upload an audio file

task to perform

optional text to provide as a prompt for the first window. (Default: empty)

temperature to use for sampling (Default: 0)

language that the audio is in; uses the detected language if None; use a two-letter language code (ISO 639-1), e.g. en, de, ja. (Default: empty)

chunk level, either 'segment' or 'word'

Chunk Length S

chunk length in seconds to split audio (Default: 30, 1 ≤ chunk_length_s ≤ 30)
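The constraints listed above can be checked on the client side before a request is sent. The sketch below is only an illustration: the function names `validate_options` and `chunk_bounds` are hypothetical, the parameter names other than `chunk_length_s` are invented for the example, the default of 'segment' for the chunk level is an assumption, and the splitting shown is a simple non-overlapping scheme that may differ from how the service actually windows the audio.

```python
def validate_options(language="", chunk_level="segment", chunk_length_s=30):
    """Check option values against the constraints stated in the Input list.

    Parameter names other than chunk_length_s are hypothetical.
    """
    if chunk_level not in ("segment", "word"):
        raise ValueError("chunk_level must be 'segment' or 'word'")
    if not 1 <= chunk_length_s <= 30:
        raise ValueError("chunk_length_s must satisfy 1 <= chunk_length_s <= 30")
    # An empty language means "use the detected language".
    if language and not (len(language) == 2 and language.isalpha()):
        raise ValueError("language must be a two-letter ISO 639-1 code, e.g. 'en'")
    return {
        "language": language,
        "chunk_level": chunk_level,
        "chunk_length_s": chunk_length_s,
    }


def chunk_bounds(duration_s, chunk_length_s=30):
    """Split [0, duration_s) into consecutive windows of at most chunk_length_s seconds.

    Non-overlapping windows are an assumption made for illustration.
    """
    if not 1 <= chunk_length_s <= 30:
        raise ValueError("chunk_length_s must satisfy 1 <= chunk_length_s <= 30")
    total = float(duration_s)
    bounds = []
    start = 0.0
    while start < total:
        end = min(start + chunk_length_s, total)
        bounds.append((start, end))
        start = end
    return bounds


if __name__ == "__main__":
    print(validate_options(language="en", chunk_level="word"))
    print(chunk_bounds(75, chunk_length_s=30))
```

With the default chunk length of 30 seconds, a 75-second file yields three windows under this scheme, the last one shorter than the others.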

Output

The model is English-oriented.