Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Trained on 680,000 hours of labelled audio, it generalizes well to many datasets and domains without fine-tuning. The model is a Transformer-based encoder-decoder trained with large-scale weak supervision.
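As a rough illustration of the two tasks named above, the following sketch runs a Whisper checkpoint through the Hugging Face `transformers` ASR pipeline. It is not part of the original description: the checkpoint name `openai/whisper-tiny`, the helper names `build_generate_kwargs` and `run_whisper`, and the audio path are assumptions chosen for the example, and it presumes `transformers`, a backend such as PyTorch, and `ffmpeg` are installed.

```python
from typing import Optional

VALID_TASKS = ("transcribe", "translate")


def build_generate_kwargs(task: str = "transcribe",
                          language: Optional[str] = None) -> dict:
    """Build the generation options that select Whisper's task.

    Hypothetical helper for this sketch. "transcribe" keeps the spoken
    language; "translate" produces English text.
    """
    if task not in VALID_TASKS:
        raise ValueError(f"task must be one of {VALID_TASKS}, got {task!r}")
    kwargs = {"task": task}
    if language is not None:
        # Source language of the audio, e.g. "french". If omitted,
        # the model attempts to detect it.
        kwargs["language"] = language
    return kwargs


def run_whisper(audio_path: str,
                task: str = "transcribe",
                language: Optional[str] = None,
                model_id: str = "openai/whisper-tiny") -> str:
    """Run ASR or speech translation on one audio file (hypothetical helper)."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path, generate_kwargs=build_generate_kwargs(task, language))
    return result["text"]
```

Calling `run_whisper("sample.wav")` would transcribe a local file, while `run_whisper("sample.wav", task="translate", language="french")` would request an English translation of French speech. No fine-tuning step is involved, which is the zero-shot use the description refers to.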