Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. It was trained on 680k hours of labelled data and demonstrates a strong ability to generalize to many datasets and domains without fine-tuning. The model is based on a Transformer encoder-decoder architecture. Whisper models are available for various languages including English, Spanish, French, German, Italian, Portuguese, Russian, Chinese, Japanese, Korean, and many more.
Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. It was trained on 680k hours of labelled data and demonstrates a strong ability to generalize to many datasets and domains without fine-tuning. The model is based on a Transformer encoder-decoder architecture. Whisper models are available for various languages including English, Spanish, French, German, Italian, Portuguese, Russian, Chinese, Japanese, Korean, and many more.
condition_on_previous_text
booleanprovide the previous output of the model as a prompt for the next window
Default value: true
temperature_increment_on_fallback
numbertemperature to increase when falling back when the decoding fails to meet either of the thresholds below
Default value: 0.2
webhook
fileThe webhook to call when inference is done, by default you will get the output in the response of your inference request