We use essential cookies to make our site work. With your consent, we may also use non-essential cookies to improve user experience and analyze website traffic…

DeepInfra raises $107M Series B to scale the inference cloud — read the announcement

Fork of Text Generation Inference.
Published on 2023.08.09 by Nikola Borisov
Fork of Text Generation Inference.

The text generation inference open source project by huggingface looked like a promising framework for serving large language models (LLM). However, huggingface announced that they will change the license of code with version v1.0.0. While the previous license Apache 2.0 was permissive, the new one is restrictive for our use cases.

Forking the project

We decided to fork the project and continue to maintain it under the Apache 2.0 license. We will continue to contribute to the project and keep it up to date. We will accept pull requests from the community, and we will keep the project truly open source and free to use.

Here is a link to the code: https://github.com/deepinfra/text-generation-inference

We hope that in time a community of other developers and organizations that want to keep this project truly open source will form around it.

License changes mid-flight

Sadly it is becoming more and more common for popular open source projects to change their license after they gain some traction. This happened with MongoDB, Grafana, ElasticSearch, and many others. As a developer, when you decide to adopt a particular open source project, you start investing time and effort into using it. You build your application around it, and you start depending on it. Then, suddenly, the license changes, and you might be forced to find an alternative.

Imagine if meta changes the license of pytorch. Or if tomorrow huggingface decides to change the license of transformers in a similar way to prohibit commercial use.

We believe that the changing of the license of open source projects mid-flight is a unfriendly move towards the community.

If you need any help, just reach out to us on our Discord server.

Related articles
OpenClaw Cost Optimization: Cut AI API Costs by 90%OpenClaw Cost Optimization: Cut AI API Costs by 90%<p>A single ask in an OpenClaw session can cost more than a full evening of casual ChatGPT use. Ask your agent something simple, like which calendar event clashes with your flight, and the request that hits the API carries far more than your 12-token question. It also carries your SOUL.md, the tool schemas registered on [&hellip;]</p>
NVIDIA Nemotron 3 Super 120B API Benchmarks: Latency & CostNVIDIA Nemotron 3 Super 120B API Benchmarks: Latency & Cost<p>About NVIDIA Nemotron 3 Super 120B A12B NVIDIA&#8217;s Nemotron 3 Super 120B A12B is an open-weight large language model released on March 11, 2026. It features 120B total parameters with only 12B active per forward pass, delivering exceptional compute efficiency for complex multi-agent applications such as software development and cybersecurity triaging. The model uses a [&hellip;]</p>
Long Context models incomingLong Context models incomingMany users requested longer context models to help them summarize bigger chunks of text or write novels with ease. We're proud to announce our long context model selection that will grow bigger in the comming weeks. Models Mistral-based models have a context size of 32k, and amazon recently r...