Browse deepinfra models:

All categories and models you can try out and directly use in deepinfra:
Search

Category/all

BAAI/bge-large-en-v1.5 cover image
featured
512
$0.010 / Mtoken
  • embeddings

BGE embedding is a general Embedding Model. It is pre-trained using retromae and trained on large-scale pair data using contrastive learning. Note that the goal of pre-training is to reconstruct the text, and the pre-trained model cannot be used for similarity calculation directly, it needs to be fine-tuned

Austism/chronos-hermes-13b-v2 cover image
fp16
4k
$0.13 / Mtoken
  • text-generation

This offers the imaginative writing style of chronos while still retaining coherency and being capable. Outputs are long and utilize exceptional prose. Supports a maxium context length of 4096. The model follows the Alpaca prompt format.

BAAI/bge-base-en-v1.5 cover image
512
$0.005 / Mtoken
  • embeddings

BGE embedding is a general Embedding Model. It is pre-trained using retromae and trained on large-scale pair data using contrastive learning. Note that the goal of pre-training is to reconstruct the text, and the pre-trained model cannot be used for similarity calculation directly, it needs to be fine-tuned

BAAI/bge-m3 cover image
fp32
8k
$0.010 / Mtoken
  • embeddings

BGE-M3 is a versatile text embedding model that supports multi-functionality, multi-linguality, and multi-granularity, allowing it to perform dense retrieval, multi-vector retrieval, and sparse retrieval in over 100 languages and with input sizes up to 8192 tokens. The model can be used in a retrieval pipeline with hybrid retrieval and re-ranking to achieve higher accuracy and stronger generalization capabilities. BGE-M3 has shown state-of-the-art performance on several benchmarks, including MKQA, MLDR, and NarritiveQA, and can be used as a drop-in replacement for other embedding models like DPR and BGE-v1.5.

CompVis/stable-diffusion-v1-4 cover image
$0.0005 / sec
  • text-to-image

Stable Diffusion is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input.

Gryphe/MythoMax-L2-13b-turbo cover image
fp8
4k
Replaced
  • text-generation

Faster version of Gryphe/MythoMax-L2-13b running on multiple H100 cards in fp8 precision. Up to 160 tps.

HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1 cover image
fp8
64k
Replaced
  • text-generation

Zephyr 141B-A35B is an instruction-tuned (assistant) version of Mixtral-8x22B. It was fine-tuned on a mix of publicly available, synthetic datasets. It achieves strong performance on chat benchmarks.

Lykon/DreamShaper cover image
$0.0005 / sec
  • text-to-image

DreamShaper started as a model to have an alternative to MidJourney in the open source world. I didn't like how MJ was handled back when I started and how closed it was and still is, as well as the lack of freedom it gives to users compared to SD. Look at all the tools we have now from TIs to LoRA, from ControlNet to Latent Couple. We can do anything. The purpose of DreamShaper has always been to make "a better Stable Diffusion", a model capable of doing everything on its own, to weave dreams.

Phind/Phind-CodeLlama-34B-v2 cover image
fp16
4k
$0.60 / Mtoken
  • text-generation

Phind-CodeLlama-34B-v2 is an open-source language model that has been fine-tuned on 1.5B tokens of high-quality programming-related data and achieved a pass@1 rate of 73.8% on HumanEval. It is multi-lingual and proficient in Python, C/C++, TypeScript, Java, and more. It has been trained on a proprietary dataset of instruction-answer pairs instead of code completion examples. The model is instruction-tuned on the Alpaca/Vicuna format to be steerable and easy-to-use. It accepts the Alpaca/Vicuna instruction format and can generate one completion for each prompt.

Qwen/Qwen2-7B-Instruct cover image
bfloat16
32k
$0.07 / Mtoken
  • text-generation

The 7 billion parameter Qwen2 excels in language understanding, multilingual capabilities, coding, mathematics, and reasoning.

XpucT/Deliberate cover image
$0.0005 / sec
  • text-to-image

The Deliberate Model allows for the creation of anything desired, with the potential for better results as the user's knowledge and detail in the prompt increase. The model is ideal for meticulous anatomy artists, creative prompt writers, art designers, and those seeking explicit content.

bigcode/starcoder2-15b cover image
fp16
16k
Replaced
  • text-generation

StarCoder2-15B model is a 15B parameter model trained on 600+ programming languages. It specializes in code completion.

bigcode/starcoder2-15b-instruct-v0.1 cover image
fp16
Replaced
  • text-generation

We introduce StarCoder2-15B-Instruct-v0.1, the very first entirely self-aligned code Large Language Model (LLM) trained with a fully permissive and transparent pipeline. Our open-source pipeline uses StarCoder2-15B to generate thousands of instruction-response pairs, which are then used to fine-tune StarCoder-15B itself without any human annotations or distilled data from huge and proprietary LLMs.

codellama/CodeLlama-34b-Instruct-hf cover image
fp16
4k
Replaced
  • text-generation

Code Llama is a state-of-the-art LLM capable of generating code, and natural language about code, from both code and natural language prompts. This particular instance is the 34b instruct variant

codellama/CodeLlama-70b-Instruct-hf cover image
fp16
4k
Replaced
  • text-generation

CodeLlama-70b is the largest and latest code generation from the Code Llama collection.

cognitivecomputations/dolphin-2.6-mixtral-8x7b cover image
bfloat16
32k
$0.24 / Mtoken
  • text-generation

The Dolphin 2.6 Mixtral 8x7b model is a finetuned version of the Mixtral-8x7b model, trained on a variety of data including coding data, for 3 days on 4 A100 GPUs. It is uncensored and requires trust_remote_code. The model is very obedient and good at coding, but not DPO tuned. The dataset has been filtered for alignment and bias. The model is compliant with user requests and can be used for various purposes such as generating code or engaging in general chat.