We present a sentence similarity model based on the Sentence Transformers architecture, which maps sentences to a 384-dimensional dense vector space. The model uses a pre-trained BERT encoder and applies mean pooling on top of the contextualized word embeddings to obtain sentence embeddings. We evaluate the model on the Sentence Embeddings Benchmark.
We present a sentence similarity model based on the Sentence Transformers architecture, which maps sentences to a 384-dimensional dense vector space. The model uses a pre-trained BERT encoder and applies mean pooling on top of the contextualized word embeddings to obtain sentence embeddings. We evaluate the model on the Sentence Embeddings Benchmark.
You can use cURL or any other http client to run inferences:
curl -X POST \
-H "Authorization: bearer $DEEPINFRA_TOKEN" \
-F 'inputs=["I like chocolate"]' \
'https://api.deepinfra.com/v1/inference/sentence-transformers/paraphrase-MiniLM-L6-v2'
which will give you back something similar to:
{
"embeddings": [
[
0.0,
0.5,
1.0
],
[
1.0,
0.5,
0.0
]
],
"input_tokens": 42,
"request_id": null,
"inference_status": {
"status": "unknown",
"runtime_ms": 0,
"cost": 0.0,
"tokens_generated": 0,
"tokens_input": 0
}
}
webhook
fileThe webhook to call when inference is done, by default you will get the output in the response of your inference request