Whisper is a general-purpose speech recognition model. Trained on a large dataset of diverse audio, it is a multi-task model that can perform multilingual speech recognition, speech translation, and language identification.
Please upload an audio file
task to perform
optional text to provide as a prompt for the first window. (Default: empty)
temperature to use for sampling (Default: 0)
language that the audio is in; uses the detected language if None; use a two-letter language code (ISO 639-1), e.g. en, de, ja. (Default: empty)
chunk level, either 'segment' or 'word'
chunk length in seconds to split audio (Default: 30, 1 ≤ chunk_length_s ≤ 30)
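
The constraints listed above can be checked before a request is sent. The following is a minimal sketch, not the service's actual client code: the function names `build_inputs` and `chunk_windows`, every dictionary key except `chunk_length_s`, and the default chunk level of 'segment' are assumptions made for illustration. The task value is passed through unchecked because the available options are not listed here. The windowing helper uses plain non-overlapping splits, which may differ from how the model actually chunks audio.

```python
from typing import List, Optional, Tuple

VALID_CHUNK_LEVELS = {"segment", "word"}


def build_inputs(
    audio_path: str,
    task: str,
    prompt: str = "",
    temperature: float = 0.0,
    language: Optional[str] = None,
    chunk_level: str = "segment",  # assumed default; the page does not state one
    chunk_length_s: int = 30,
) -> dict:
    """Validate the documented parameters and return them as a dict.

    All key names other than chunk_length_s are hypothetical.
    """
    if chunk_level not in VALID_CHUNK_LEVELS:
        raise ValueError("chunk_level must be 'segment' or 'word'")
    if not 1 <= chunk_length_s <= 30:
        raise ValueError("chunk_length_s must satisfy 1 <= chunk_length_s <= 30")
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if language is not None:
        # Shape check only: two lowercase letters, as in ISO 639-1 codes.
        if not (len(language) == 2 and language.isalpha() and language.islower()):
            raise ValueError("language must be a two-letter code such as 'en'")
    return {
        "audio": audio_path,
        "task": task,
        "prompt": prompt,
        "temperature": temperature,
        "language": language,  # None means: use the detected language
        "chunk_level": chunk_level,
        "chunk_length_s": chunk_length_s,
    }


def chunk_windows(duration_s: float, chunk_length_s: int = 30) -> List[Tuple[float, float]]:
    """Split [0, duration_s] into consecutive, non-overlapping windows."""
    duration = float(duration_s)
    windows = []
    start = 0.0
    while start < duration:
        end = min(start + chunk_length_s, duration)
        windows.append((start, end))
        start = end
    return windows
```

For example, `chunk_windows(75, 30)` splits a 75-second file into three windows, the last one shorter than the rest, and `build_inputs("speech.wav", task="transcribe", chunk_length_s=31)` raises a `ValueError` because the chunk length exceeds 30.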