google/
$0.04 in, $0.13 out
Gemma 3 introduces multimodality, supporting vision-language input and text output. It handles context windows of up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling. Gemma 3-12B is Google's latest open-source model and the successor to Gemma 2.

You can POST to our OpenAI Chat Completions-compatible endpoint.
Passing the URL of an image is the easiest way to perform OCR.
curl "https://api.deepinfra.com/v1/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPINFRA_TOKEN" \
  -d '{
    "model": "google/gemma-3-12b-it",
    "max_tokens": 4092,
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image_url",
            "image_url": {
              "url": "https://url.com/to/shakespeare.png"
            }
          }
        ]
      }
    ]
  }'
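The request above sends only the image. A minimal sketch of a variant that adds a text instruction next to the image and prints just the reply text is shown here. It assumes the endpoint accepts the OpenAI-style "text" content part and returns the OpenAI response shape (choices[0].message.content), and that jq is installed. The function names build_ocr_payload and ocr_url and the default prompt are hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the default prompt are illustrative assumptions.

# build_ocr_payload IMAGE_URL PROMPT
# Prints a Chat Completions request body with one text part and one image part.
# No JSON escaping is done, so the arguments must not contain double quotes
# or backslashes.
build_ocr_payload() {
  local image_url="$1" prompt="$2"
  cat <<EOF
{
  "model": "google/gemma-3-12b-it",
  "max_tokens": 4092,
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "$prompt" },
        { "type": "image_url", "image_url": { "url": "$image_url" } }
      ]
    }
  ]
}
EOF
}

# ocr_url IMAGE_URL [PROMPT]
# Posts the payload and prints only the assistant's reply.
# Assumes jq is installed and the response follows the OpenAI schema.
ocr_url() {
  local prompt="${2:-Transcribe all text in this image.}"
  build_ocr_payload "$1" "$prompt" |
    curl -sS "https://api.deepinfra.com/v1/openai/chat/completions" \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $DEEPINFRA_TOKEN" \
      -d @- |
    jq -r '.choices[0].message.content'
}

# Example call (not run automatically):
# ocr_url "https://url.com/to/shakespeare.png"
```

Keeping the payload builder separate from the network call makes it easy to inspect the JSON before sending it.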
Another option is to read the image from a file and send it as a base64-encoded data URL.
# -w 0 disables line wrapping (GNU coreutils base64)
BASE64_IMAGE=$(base64 -w 0 shakespeare.png)
curl "https://api.deepinfra.com/v1/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPINFRA_TOKEN" \
  -d @- <<EOF
{
  "model": "google/gemma-3-12b-it",
  "max_tokens": 4092,
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "image_url",
          "image_url": {
            "url": "data:image/png;base64,$BASE64_IMAGE"
          }
        }
      ]
    }
  ]
}
EOF
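The data URL in the file-based request hard-codes image/png. A small helper along these lines could build the data URL for other files. This is a sketch: the function name image_data_url is hypothetical, and whether the endpoint accepts JPEG input is an assumption, since only PNG is shown above.

```shell
#!/usr/bin/env bash
# Sketch only: image_data_url is an illustrative helper, not part of any API.

# image_data_url FILE
# Prints a data: URL for FILE, choosing the MIME type from the file extension.
image_data_url() {
  local file="$1" mime
  case "${file##*.}" in
    png|PNG)           mime="image/png" ;;
    jpg|jpeg|JPG|JPEG) mime="image/jpeg" ;;  # assumption: JPEG is accepted
    *) echo "unsupported image extension: $file" >&2; return 1 ;;
  esac
  # Reading from stdin and stripping newlines avoids relying on the
  # GNU-specific -w 0 flag.
  printf 'data:%s;base64,%s' "$mime" "$(base64 < "$file" | tr -d '\n')"
}

# Example (not run automatically):
# IMAGE_URL=$(image_data_url shakespeare.png)
```

The resulting string can be placed in the "url" field of the image_url part in place of the hard-coded data:image/png prefix.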
© 2025 Deep Infra. All rights reserved.