How to use OpenAI Whisper with per-sentence and per-word timestamp segmentation using DeepInfra
Published on 2023.04.05 by Yessen Kanapin

Getting started

To use DeepInfra's API, you'll need an API key.

  1. Sign up or log in to your DeepInfra account
  2. Navigate to the Dashboard / API Keys section
  3. Create a new API key if you don't have one already

You'll use this API key in your requests to authenticate with our services.
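
As a minimal sketch of how the key ends up in a request, the Python snippet below reads it from an environment variable and builds the `Authorization: Bearer ...` header. The variable name `DEEPINFRA_API_KEY` and the helper `auth_headers` are assumptions made for illustration, not names defined by DeepInfra.

```python
import os


def auth_headers(env_var: str = "DEEPINFRA_API_KEY") -> dict:
    """Build the Authorization header from an API key stored in the environment.

    The environment variable name is a hypothetical convention chosen for this
    sketch; use whatever name fits your setup.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set {env_var} to your DeepInfra API key first")
    return {"Authorization": f"Bearer {api_key}"}
```

Keeping the key in the environment rather than in source code makes it harder to commit it to version control by accident.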

Running speech recognition

Whisper is a speech-to-text model from OpenAI. Given an audio file containing speech, it produces a transcript that is, by default, segmented with per-sentence timestamps. It comes in several sizes (small, base, large, etc.), along with English-only variants; see deepinfra.com for the full list. We also host whisper-timestamped, which provides timestamps for individual words in the audio. You can call it through our REST API:

curl -X POST \
  -F "audio=@/home/user/all-in-01.mp3" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  'https://api.deepinfra.com/v1/inference/openai/whisper-timestamped-medium.en'
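
For readers who prefer Python, the curl call above might be reproduced roughly as follows, using only the standard library. This is a sketch under assumptions: the helper name `build_whisper_request` is hypothetical, the `audio` form field and the endpoint are copied from the curl command, and the `application/octet-stream` part type is a generic choice rather than something the API is known to require.

```python
import json
import urllib.request
import uuid

# Endpoint copied from the curl example
WHISPER_URL = "https://api.deepinfra.com/v1/inference/openai/whisper-timestamped-medium.en"


def build_whisper_request(audio_bytes: bytes, filename: str, api_key: str,
                          url: str = WHISPER_URL) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST with one 'audio' field."""
    boundary = uuid.uuid4().hex  # random separator between multipart sections
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url,
        data=head + audio_bytes + tail,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Sending the request needs network access and a valid key, so it is left commented out:
# with open("/home/user/all-in-01.mp3", "rb") as f:
#     request = build_whisper_request(f.read(), "all-in-01.mp3", "YOUR_API_KEY")
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Building the request separately from sending it makes it easy to inspect the headers and body before any audio leaves your machine.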

To see additional parameters and how to call this model, check out the documentation page for the complete API reference and examples.
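
Once a response comes back, the per-word timestamps can be flattened into a simple list. The sketch below assumes the JSON has a top-level `segments` list whose items each carry a `words` list with `text`, `start`, and `end` fields. That shape is an assumption based on typical whisper-timestamped output, so check it against the documentation page before relying on it. The helper name `word_timestamps` and the sample values are made up for illustration.

```python
import json


def word_timestamps(response: dict) -> list:
    """Flatten a transcription response into (word, start_sec, end_sec) tuples.

    Assumes the hypothetical shape: segments -> words -> {text, start, end}.
    Segments without a "words" list are skipped.
    """
    words = []
    for segment in response.get("segments", []):
        for word in segment.get("words", []):
            words.append((word["text"].strip(), float(word["start"]), float(word["end"])))
    return words


# Made-up sample in the assumed shape; these are not real model outputs.
sample = json.loads("""
{
  "text": "Hello world",
  "segments": [
    {
      "start": 0.0,
      "end": 0.9,
      "text": "Hello world",
      "words": [
        {"text": "Hello", "start": 0.0, "end": 0.4},
        {"text": "world", "start": 0.5, "end": 0.9}
      ]
    }
  ]
}
""")

for text, start, end in word_timestamps(sample):
    print(f"{start:6.2f}s - {end:6.2f}s  {text}")
```

The same loop works for sentence-level output if you read `start`, `end`, and `text` from each segment instead of from its words.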

If you have any questions, just reach out to us on our Discord server.
