How to deploy google/flan-ul2 - simple. (open source ChatGPT alternative)

Published on 2023.03.17 by Nikola Borisov


Flan-UL2 is probably the best open-source model available right now for chatbots, and in this post we will show you how to get started with it very easily. Flan-UL2 is large: 20B parameters. It is a fine-tuned version of the UL2 model, trained on the Flan dataset. Because the model is so large, it is not easy to deploy on your own machine. If you rent a GPU on AWS, it will cost you around $1.5 per hour, or about $1080 per month running around the clock. With DeepInfra model deployments you only pay for inference time, and we do not charge for cold starts. Our pricing is $0.0005 per second of inference on an Nvidia A100, which translates to about $0.0001 per token generated by Flan-UL2.

Also check out the model page at https://deepinfra.com/google/flan-ul2, where you can run inferences in the browser and check the docs/API for calling the model via curl or deepctl.

Getting started

First install the deepctl command line tool.

curl https://deepinfra.com/get.sh | sh

Login to DeepInfra (using your GitHub account)

deepctl login

This will open your browser so you can log in to DeepInfra using your GitHub account. When you are done, come back to the terminal.

Deployment

Deploying the google/flan-ul2 model is as easy as running the following command:

deepctl deploy create -m google/flan-ul2

This command will set up everything for you, and you can use the model right away.
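
If you want to verify that the deployment is up, deepctl should also let you list your deployments. The command below assumes a deploy list subcommand exists alongside deploy create; run deepctl --help to confirm the exact name in your version:

deepctl deploy list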

Inference

You can use it with either our REST API or our deepctl command line tool. Here is how to use it with the command line tool:

deepctl infer -m google/flan-ul2 -i prompt="Hello, how are you?"
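
You can make the same call against the REST API with curl. The sketch below assumes the endpoint follows the https://api.deepinfra.com/v1/inference/<model> pattern and accepts the same prompt field that deepctl sends; confirm the exact request and response schema on the model's docs page. Replace $DEEPINFRA_TOKEN with your API token.

curl -X POST \
    -H "Authorization: bearer $DEEPINFRA_TOKEN" \
    -F prompt="Hello, how are you?" \
    https://api.deepinfra.com/v1/inference/google/flan-ul2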

To see the full documentation for how to call this model, check out its documentation page, or run:

deepctl model info -m google/flan-ul2

If you want a list of all the models you can use on DeepInfra, you can run:

deepctl model list
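
The list is fairly long, so standard shell filtering helps. For example, to find the Flan models:

deepctl model list | grep flan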

There is no easier way to get started with arguably one of the best open-source LLMs. This was quite easy, right? You did not have to deal with Docker, transformers, PyTorch, etc. If you have any questions, just reach out to us on our Discord server.