The ALBERT model is a transformer-based language model developed by Google researchers, designed for self-supervised learning of language representations. The model uses a combination of masked language modeling and sentence order prediction objectives, trained on a large corpus of English text data. Fine-tuning the model on specific downstream tasks can lead to improved performance, and various pre-trained versions are available for different NLP tasks.
You can use cURL or any other HTTP client to run inferences:
curl -X POST \
-d '{"input": "Where is my [MASK]?"}' \
-H "Authorization: bearer $DEEPINFRA_TOKEN" \
-H 'Content-Type: application/json' \
'https://api.deepinfra.com/v1/inference/albert-base-v1'
which will give you back something similar to:
{
  "results": [
    {
      "sequence": "where is my father?",
      "score": 0.08898820728063583,
      "token": 2269,
      "token_str": "father"
    },
    {
      "sequence": "where is my mother?",
      "score": 0.07864926755428314,
      "token": 2388,
      "token_str": "mother"
    }
  ],
  "request_id": null,
  "inference_status": {
    "status": "unknown",
    "runtime_ms": 0,
    "cost": 0.0,
    "tokens_generated": 0,
    "tokens_input": 0
  }
}
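The same request can be made from Python using only the standard library. This is a minimal sketch, not official client code: the endpoint URL and header format are taken from the curl example above, while the function name and the use of the DEEPINFRA_TOKEN environment variable are assumptions for illustration.

```python
import json
import os
import urllib.request

API_URL = "https://api.deepinfra.com/v1/inference/albert-base-v1"

def fill_mask(text, token=None):
    """POST the input text to the endpoint and return the parsed JSON response."""
    # Read the token from the environment, mirroring $DEEPINFRA_TOKEN in the curl example.
    token = token or os.environ.get("DEEPINFRA_TOKEN", "")
    payload = json.dumps({"input": text}).encode("utf-8")
    req = urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Authorization": f"bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Each result carries the completed sequence, its score, and the predicted token.
    for result in fill_mask("Where is my [MASK]?")["results"]:
        print(f'{result["token_str"]}: {result["score"]:.4f}')
```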
webhook
The webhook to call when inference is done. By default you will get the output in the response of your inference request.
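A webhook could be supplied in the request body as sketched below. Note this is an assumption about the field's placement: the parameter name comes from the description above, but the exact request shape and the callback URL are hypothetical.

```python
import json

# Hypothetical request body including the webhook parameter alongside the input.
# The callback URL is a placeholder; the server would POST the result to it
# instead of (or in addition to) returning it in the inference response.
payload = json.dumps({
    "input": "Where is my [MASK]?",
    "webhook": "https://example.com/my-callback",  # assumed field placement
})
```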