The Pythia Scaling Suite includes 16 models (2 per size), ranging from 70M to 12B parameters, all trained on the Pile dataset. The models are designed for interpretability research and match or exceed the performance of similar models. The suite includes 154 intermediate checkpoints per model.
text to generate from
maximum length of the newly generated text, in tokens (Default: 2048, 1 ≤ max_new_tokens ≤ 100000)
Temperature
temperature to use for sampling. 0 means the output is deterministic. Values greater than 1 encourage more diversity (Default: 0.7, 0 ≤ temperature ≤ 100)
Sample from the smallest set of highest-probability tokens whose probabilities sum to more than p. Lower values focus on the most probable tokens. Higher values sample more low-probability tokens (Default: 0.9, 0 < top_p ≤ 1)
Sample from the k most probable tokens. 0 turns this off (Default: 0, 0 ≤ top_k < 100000)
Repetition Penalty
repetition penalty. A value of 1 means no penalty, values greater than 1 discourage repetition, and values smaller than 1 encourage repetition (Default: 1.2, 0.01 ≤ repetition_penalty ≤ 5)
Up to 4 strings that will terminate generation immediately. Separate items with commas
Num Responses
Number of output sequences to return. Incompatible with streaming (Default: 1, 1 ≤ num_responses ≤ 2)
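The defaults and ranges listed above can be checked on the client before a request is sent. The following is a minimal sketch of such a check, not the service's own validation code. The names max_new_tokens, temperature, top_p, top_k, repetition_penalty and num_responses come from the descriptions above; the class name GenerationParams and the fields prompt, stop and stream are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GenerationParams:
    """Generation settings, using the defaults listed above."""

    prompt: str                      # hypothetical name for the text to generate from
    max_new_tokens: int = 2048
    temperature: float = 0.7
    top_p: float = 0.9
    top_k: int = 0                   # 0 turns top-k filtering off
    repetition_penalty: float = 1.2
    stop: List[str] = field(default_factory=list)  # hypothetical name for the stop strings
    num_responses: int = 1
    stream: bool = False             # hypothetical flag for streaming output

    def validate(self) -> None:
        """Raise ValueError if a setting is outside its documented range."""
        if not 1 <= self.max_new_tokens <= 100000:
            raise ValueError("max_new_tokens must be between 1 and 100000")
        if not 0 <= self.temperature <= 100:
            raise ValueError("temperature must be between 0 and 100")
        if not 0 < self.top_p <= 1:
            raise ValueError("top_p must be greater than 0 and at most 1")
        if not 0 <= self.top_k < 100000:
            raise ValueError("top_k must be at least 0 and below 100000")
        if not 0.01 <= self.repetition_penalty <= 5:
            raise ValueError("repetition_penalty must be between 0.01 and 5")
        if len(self.stop) > 4:
            raise ValueError("at most 4 stop strings are allowed")
        if not 1 <= self.num_responses <= 2:
            raise ValueError("num_responses must be 1 or 2")
        if self.stream and self.num_responses > 1:
            raise ValueError("num_responses > 1 is incompatible with streaming")


if __name__ == "__main__":
    params = GenerationParams(prompt="I have this dream about", stop=["\n\n"])
    params.validate()  # passes silently when every value is in range
    print(params)
```

Keeping the bounds in one place makes it easy to reject a bad value, such as top_p = 0, before it reaches the model.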
I have this dream about the day I got a job at a tech company. I just woke up on a plane. I sat down on the floor and started getting work done. After getting up around 6 p.m., I looked around and