sentence-transformers/
$0.005 / 1M tokens
The CLIP model maps text and images into a shared vector space, enabling applications such as image search, zero-shot image classification, and image clustering. The model can be used directly after installation, and its performance is reported as zero-shot accuracy on the ImageNet validation set. Multilingual versions supporting 50+ languages are also available.
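As a minimal sketch of the shared text-image embedding space, the snippet below uses the sentence-transformers library to score an image against candidate captions; the checkpoint name (clip-ViT-B-32) and the image filename are assumptions, and the hosted API may expose this differently.

```python
from PIL import Image
from sentence_transformers import SentenceTransformer, util

# Load a CLIP model that embeds text and images into the same vector space.
model = SentenceTransformer("clip-ViT-B-32")

# Encode an image (hypothetical file) and a few candidate captions.
img_emb = model.encode(Image.open("two_dogs.jpg"))
text_emb = model.encode([
    "Two dogs playing in the snow",
    "A cat sitting on a couch",
    "A plate of spaghetti",
])

# Cosine similarity between the image and each caption acts as a
# zero-shot classifier: the highest-scoring caption is the predicted label.
scores = util.cos_sim(img_emb, text_emb)
print(scores)
```

The same encode-and-compare pattern extends to image search (rank images by similarity to a text query) and clustering (group images by their embeddings).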