Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. It was trained on 680,000 hours of labeled audio and generalizes well to many datasets and domains without fine-tuning. The model uses a Transformer encoder-decoder architecture.
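As a rough illustration of the two tasks mentioned above, the following sketch runs a Whisper checkpoint through the Hugging Face `transformers` ASR pipeline. It is a minimal example under stated assumptions, not a reference implementation: the checkpoint name `openai/whisper-tiny`, the helper names `build_generate_kwargs` and `run_whisper`, and the audio file path are all illustrative, and decoding an audio file from a path assumes `ffmpeg` is installed.

```python
from typing import Optional

# Assumed checkpoint name on the Hugging Face Hub; any Whisper checkpoint should work.
DEFAULT_MODEL_ID = "openai/whisper-tiny"


def build_generate_kwargs(task: str = "transcribe", language: Optional[str] = None) -> dict:
    """Build generation options for Whisper (hypothetical helper).

    task: "transcribe" keeps the spoken language, "translate" outputs English.
    language: optional source-language hint; if omitted, the model detects it.
    """
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unsupported task: {task!r}")
    kwargs = {"task": task}
    if language is not None:
        kwargs["language"] = language
    return kwargs


def run_whisper(
    audio_path: str,
    task: str = "transcribe",
    language: Optional[str] = None,
    model_id: str = DEFAULT_MODEL_ID,
) -> str:
    """Transcribe or translate one audio file and return the text (hypothetical helper)."""
    # Imported lazily so the helper above can be used without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path, generate_kwargs=build_generate_kwargs(task, language))
    return result["text"]


if __name__ == "__main__":
    # "sample.wav" is a placeholder path; replace it with a real audio file.
    print(run_whisper("sample.wav", task="transcribe"))
```

No fine-tuning step appears in the sketch, which matches the description above: the pre-trained checkpoint is used as-is, and switching `task` to `"translate"` requests English output instead of a same-language transcript.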