Published on 2023.08.09 by Nikola Borisov. Fork of Text Generation Inference. The Text Generation Inference open source project by Hugging Face looked like a promising framework for serving large language models (LLMs). However, Hugging Face announced that they will change the license of the code starting with version v1.0.0. While the previous license, Apache 2.0, was permissive, the new on...
Published on 2023.08.02 by Nikola Borisov. The easiest way to build AI applications with Llama 2 LLMs. The long-awaited Llama 2 models are finally here! We are excited to show you how to use them with DeepInfra. This collection of models represents the state of the art in open source language models. They are made available by Meta AI and the l...
Published on 2023.04.12 by Yessen Kanapin. How to deploy Databricks Dolly v2 12b, an instruction-tuned causal language model. Databricks Dolly is an instruction-tuned, 12-billion-parameter causal language model based on EleutherAI's pythia-12b. It was pretrained on The Pile, GPT-J's pretraining corpus. [databricks-dolly-15k](http...
Published on 2023.04.05 by Yessen Kanapin. How to run OpenAI Whisper with per-sentence and per-word timestamp segmentation using DeepInfra. Whisper is a Speech-To-Text model from OpenAI.
Published on 2023.03.17 by Nikola Borisov. How to deploy google/flan-ul2 - simple. (open source ChatGPT alternative) Flan-UL2 is probably the best open source model available right now for chatbots. In this post we will show you how to get started with it very easily. Flan-UL2 is large: 20B parameters. It is a fine-tuned version of the UL2 model using the Flan dataset. Because this is quite a large model, it is not eas...
Published on 2023.03.08 by Iskren. A short intro on running Stable Diffusion on DeepInfra. I'm glad you asked
© 2026 Deep Infra. All rights reserved.