We present GPT-Neo 1.3B, a transformer model designed using EleutherAI's replication of the GPT-3 architecture. With 1.3B parameters, this model was trained on the Pile, a large-scale curated dataset, for 380 billion tokens over 362,000 steps. It is intended for text generation: the model learns an inner representation of the English language and can generate text from a prompt.
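As a minimal sketch of prompted generation (this assumes the model is consumed through the Hugging Face transformers library, where the checkpoint is published as EleutherAI/gpt-neo-1.3B):

```python
# Minimal sketch: load GPT-Neo 1.3B and generate text from a prompt.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-1.3B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-1.3B")

inputs = tokenizer("EleutherAI has", return_tensors="pt")
output_ids = model.generate(**inputs, do_sample=True, max_new_tokens=50)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```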
Prompt
Text to generate from
Max New Tokens
Maximum number of new tokens to generate (Default: 2048, 1 ≤ max_new_tokens ≤ 100000)
Temperature
Temperature to use for sampling. 0 means the output is deterministic. Values greater than 1 encourage more diversity (Default: 0.7, 0 ≤ temperature ≤ 100)
Top P
Sample from the set of tokens with the highest probability such that the sum of their probabilities is higher than p. Lower values focus on the most probable tokens; higher values sample more low-probability tokens (Default: 0.9, 0 < top_p ≤ 1)
Top K
Sample from the k most probable tokens. 0 means off (Default: 0, 0 ≤ top_k < 100000)
Repetition Penalty
A value of 1 means no penalty; values greater than 1 discourage repetition, values smaller than 1 encourage it (Default: 1.2, 0.01 ≤ repetition_penalty ≤ 5)
Stop Sequences
Up to 4 strings that will terminate generation immediately. Separate items with commas
Num Responses
Number of output sequences to return. Incompatible with streaming (Default: 1, 1 ≤ num_responses ≤ 2)
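The parameters above map directly onto the sampling arguments of most generation APIs. The sketch below is one illustrative mapping onto transformers' generate(); the stop-sequence handling shown is a client-side truncation for illustration, not necessarily how this service implements it server-side.

```python
# Hedged sketch: the documented parameters passed to transformers' generate().
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-1.3B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-1.3B")

prompt = "I have this dream about"
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(
    **inputs,
    do_sample=True,           # sample instead of greedy decoding
    temperature=0.7,          # default above; higher values add diversity
    top_p=0.9,                # nucleus sampling threshold (Top P)
    top_k=0,                  # 0 disables top-k filtering (Top K)
    repetition_penalty=1.2,   # >1 discourages repetition
    max_new_tokens=100,       # cap on newly generated tokens (page default: 2048)
    num_return_sequences=1,   # Num Responses
    pad_token_id=tokenizer.eos_token_id,  # GPT-Neo defines no pad token
)
text = tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Stop sequences: truncate at the first occurrence of any stop string
# after the prompt (illustrative client-side handling).
for stop in ["\n\n"]:
    cut = text.find(stop, len(prompt))
    if cut != -1:
        text = text[:cut]
print(text)
```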
Example Output
I have this dream about the day I got a job at a tech company. I just woke up on a plane. I sat down on the floor and started getting work done. After getting up around 6 p.m., I looked around and