nvidia/
$0.010 / 1M tokens
llama-nemotron-embed-vl-1b-v2 is a multimodal embedding model that maps text queries and document images to dense vector representations for retrieval systems. It is designed for visually complex content such as charts, tables, and infographics.
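
To illustrate how dense vectors are typically used for retrieval, here is a minimal sketch that ranks document embeddings against a query embedding by cosine similarity. It makes no API calls: the three-dimensional vectors, the file names, and the helper functions `cosine_similarity` and `rank_documents` are all hypothetical, and the choice of cosine similarity as the scoring function is an assumption rather than something this page specifies. Real embeddings from the model would have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction; treat it as unrelated to everything.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings standing in for model output.
query = [0.9, 0.1, 0.0]  # assumed embedding of a text query
docs = {
    "chart_page.png": [0.8, 0.2, 0.1],   # assumed embedding of a document image
    "table_page.png": [0.1, 0.9, 0.2],
    "infographic.png": [0.0, 0.2, 0.9],
}

ranking = rank_documents(query, docs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

In this toy data the query vector points in nearly the same direction as the first document's vector, so that document ranks first. A production system would usually store the document vectors in a vector index instead of scanning them in a loop.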

© 2026 DeepInfra. All rights reserved.