
sentence-transformers/clip-ViT-B-32-multilingual-v1

This model is a multilingual version of the OpenAI CLIP-ViT-B32 model. It maps text and images to a common dense vector space, so that an image and its matching text lie close together. It pairs a text embedding model that supports 50+ languages with the image encoder from the original CLIP model. The text model was trained with Multilingual Knowledge Distillation: a multilingual DistilBERT model was trained as the student to align with the vector space of the original CLIP image encoder across many languages.
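Because text and images share one vector space, matching a caption to an image comes down to comparing embeddings by cosine similarity. The sketch below illustrates that idea under stated assumptions: the helper names (`cosine_similarity`, `rank_texts`, `embed_with_models`) are hypothetical, the toy vectors are made up for illustration, and the image encoder ID `clip-ViT-B-32` is assumed to be the matching sentence-transformers checkpoint. The model-loading function is never called at import time, since it requires `sentence-transformers`, Pillow, and a model download.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_texts(image_embedding: Sequence[float],
               text_embeddings: Sequence[Sequence[float]]) -> List[int]:
    """Return text indices ordered from most to least similar to the image."""
    scores = [cosine_similarity(image_embedding, t) for t in text_embeddings]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def embed_with_models(image_path: str,
                      texts: List[str]) -> Tuple[List[float], List[List[float]]]:
    """Assumed usage: encode one image and several captions.

    Images go through the original CLIP encoder (assumed ID below);
    text in any supported language goes through the multilingual model.
    """
    from PIL import Image
    from sentence_transformers import SentenceTransformer

    image_model = SentenceTransformer("clip-ViT-B-32")  # assumed image encoder ID
    text_model = SentenceTransformer(
        "sentence-transformers/clip-ViT-B-32-multilingual-v1"
    )
    image_embedding = image_model.encode([Image.open(image_path)])[0]
    text_embeddings = text_model.encode(texts)
    return image_embedding.tolist(), text_embeddings.tolist()


if __name__ == "__main__":
    # Toy 2-dimensional vectors standing in for real embeddings.
    toy_image = [1.0, 0.0]
    toy_texts = [[0.0, 1.0], [1.0, 0.0], [0.7, 0.7]]
    print(rank_texts(toy_image, toy_texts))
```

With real embeddings, the output of `embed_with_models` can be passed straight to `rank_texts`; the captions may be written in different languages because they all pass through the same multilingual text model.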


Public
$0.005 / Mtoken
512

200b64f20b3cef15ade0d31b1392519a46024087

2023-03-03T07:30:50+00:00