sentence-transformers/clip-ViT-B-32

The CLIP model maps text and images into a shared vector space, enabling applications such as image search, zero-shot image classification, and image clustering. After installing the sentence-transformers library, the model can be used with a few lines of code, and its performance is reported as zero-shot accuracy on the ImageNet validation set. Multilingual versions of the model are also available, covering 50+ languages.
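
A minimal usage sketch with the sentence-transformers library (the image path and candidate captions below are placeholders): the model encodes an image and several texts into the same vector space, and cosine similarity then ranks how well each caption matches the image.

```python
from PIL import Image
from sentence_transformers import SentenceTransformer, util

# Load the CLIP model; weights are downloaded on first use
model = SentenceTransformer("clip-ViT-B-32")

# Encode an image and a few candidate captions into the shared vector space
img_emb = model.encode(Image.open("two_dogs_in_snow.jpg"))  # placeholder image path
text_emb = model.encode([
    "Two dogs playing in the snow",
    "A cat sitting on a table",
    "London at night",
])

# Cosine similarity between the image and each caption (higher = better match)
cos_scores = util.cos_sim(img_emb, text_emb)
print(cos_scores)
```

The same pattern covers zero-shot classification (use class names or prompts such as "a photo of a dog" as the texts) and image search (pre-compute image embeddings and query them with an encoded text).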

Visibility: Public
Price: $0.005 / Mtoken
Embedding dimensions: 512
Owner: demoapi
Revision: 61c3f1c0c7fbd01c285c063696513d859cad52eb
Updated: 2023-03-03T07:37:03+00:00