google/gemini-1.5-flash
$0.075 in, $0.30 out
Gemini 1.5 Flash is Google's foundation model that performs well at a variety of multimodal tasks such as visual understanding, classification, summarization, and creating content from image, audio and video. It's adept at processing visual and text inputs such as photographs, documents, infographics, and screenshots. Gemini 1.5 Flash is designed for high-volume, high-frequency tasks where cost and latency matter.

You can POST to our OpenAI Chat Completions-compatible endpoint.
Passing the URL of an image is the easiest way to perform OCR:
curl "https://api.deepinfra.com/v1/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPINFRA_TOKEN" \
  -d '{
    "model": "google/gemini-1.5-flash",
    "max_tokens": 4092,
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image_url",
            "image_url": {
              "url": "https://url.com/to/shakespeare.png"
            }
          }
        ]
      }
    ]
  }'
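The request above sends only an image. In the OpenAI Chat Completions message format, a text part can sit alongside the image part in the same "content" array, which lets you state what to extract. The sketch below assumes this endpoint accepts mixed text and image parts in that way; build_ocr_payload and the prompt wording are hypothetical names chosen for illustration, not part of the API.

```shell
# Hypothetical helper: prints a chat-completions payload that pairs a text
# instruction with an image URL. Values are inserted verbatim, so they must
# not contain double quotes or backslashes (no JSON escaping is done here).
build_ocr_payload() {
  local image_url="$1"
  local prompt="$2"
  cat <<EOF
{
  "model": "google/gemini-1.5-flash",
  "max_tokens": 4092,
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "$prompt" },
        { "type": "image_url", "image_url": { "url": "$image_url" } }
      ]
    }
  ]
}
EOF
}

# Usage (needs DEEPINFRA_TOKEN in the environment):
# build_ocr_payload "https://url.com/to/shakespeare.png" "Transcribe all text in this image." |
#   curl "https://api.deepinfra.com/v1/openai/chat/completions" \
#     -H "Content-Type: application/json" \
#     -H "Authorization: Bearer $DEEPINFRA_TOKEN" \
#     -d @-
```

Piping the payload into curl with -d @- keeps the JSON out of the shell's quoting rules, which is the same approach as reading the body from standard input.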
Another option is to read the image from a file and send it as a base64-encoded data URL:
# -w 0 disables line wrapping so the encoded image stays on one line (GNU coreutils option)
BASE64_IMAGE=$(base64 -w 0 shakespeare.png)
# The unquoted EOF delimiter lets the shell expand $BASE64_IMAGE inside the JSON body
curl "https://api.deepinfra.com/v1/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPINFRA_TOKEN" \
  -d @- <<EOF
{
  "model": "google/gemini-1.5-flash",
  "max_tokens": 4092,
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "image_url",
          "image_url": {
            "url": "data:image/png;base64,$BASE64_IMAGE"
          }
        }
      ]
    }
  ]
}
EOF
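The file example hardcodes image/png and relies on the -w flag of GNU base64. A minimal sketch of a more portable variant follows: it picks the MIME type from the file extension and strips newlines with tr instead of using -w. The helper name image_data_url and the extension-to-MIME mapping are assumptions made for illustration; extend the mapping to whatever formats you actually send.

```shell
# Hypothetical helper: prints a data URL for a local image file.
image_data_url() {
  local file="$1"
  local mime
  # Guess the MIME type from the extension (assumed mapping, extend as needed).
  case "$file" in
    *.png) mime="image/png" ;;
    *.jpg|*.jpeg) mime="image/jpeg" ;;
    *) echo "unsupported image type: $file" >&2; return 1 ;;
  esac
  # Read the file on stdin and delete newlines, so the output is a single
  # line whether or not the local base64 wraps its output.
  printf 'data:%s;base64,%s' "$mime" "$(base64 < "$file" | tr -d '\n')"
}

# Usage: IMAGE_URL=$(image_data_url shakespeare.png)
# then place "$IMAGE_URL" in the "url" field of the request body.
```

Very large images produce very long strings; passing the body on standard input, as the heredoc example does, avoids command-line length limits.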
© 2025 Deep Infra. All rights reserved.