Pythia Scaling Suite is a collection of models developed to facilitate interpretability research. It covers 8 model sizes from 70M to 12B parameters, with two models per size: one trained on the Pile and one trained on the deduplicated Pile. The models are designed for scientific research, especially interpretability research. They match or exceed the performance of similarly sized models such as OPT and GPT-Neo.
text to generate from
maximum length of the newly generated text (Default: 2048, 1 ≤ max_new_tokens ≤ 100000)
Temperature
temperature to use for sampling. 0 means the output is deterministic. Values greater than 1 encourage more diversity (Default: 0.7, 0 ≤ temperature ≤ 100)
Sample from the smallest set of highest-probability tokens whose cumulative probability exceeds p. Lower values focus on the most probable tokens. Higher values sample more low-probability tokens (Default: 0.9, 0 < top_p ≤ 1)
Sample from the best k (number of) tokens. 0 means off (Default: 0, 0 ≤ top_k < 100000)
Repetition Penalty
repetition penalty. Value of 1 means no penalty, values greater than 1 discourage repetition, smaller than 1 encourage repetition. (Default: 1.2, 0.01 ≤ repetition_penalty ≤ 5)
Up to 4 strings that will terminate generation immediately. Separate items with commas
Num Responses
Number of output sequences to return. Incompatible with streaming (Default: 1, 1 ≤ num_responses ≤ 2)
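The defaults and ranges listed above can be collected into a single request payload and checked before sending. The following is a minimal sketch, not the service's client: the class name GenerationParams and the field names prompt and stop are assumptions made for illustration, while the remaining field names, defaults, and bounds are taken from the parameter descriptions. It only builds and validates a dictionary; no endpoint or transport is assumed.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class GenerationParams:
    """Hypothetical container mirroring the documented generation parameters."""

    prompt: str  # text to generate from (field name is an assumption)
    max_new_tokens: int = 2048
    temperature: float = 0.7
    top_p: float = 0.9
    top_k: int = 0  # 0 means top-k filtering is off
    repetition_penalty: float = 1.2
    stop: List[str] = field(default_factory=list)  # field name is an assumption
    num_responses: int = 1

    def validate(self) -> None:
        """Raise ValueError if any value falls outside its documented range."""
        if not 1 <= self.max_new_tokens <= 100000:
            raise ValueError("max_new_tokens must be in [1, 100000]")
        if not 0 <= self.temperature <= 100:
            raise ValueError("temperature must be in [0, 100]")
        if not 0 < self.top_p <= 1:
            raise ValueError("top_p must be in (0, 1]")
        if not 0 <= self.top_k < 100000:
            raise ValueError("top_k must be in [0, 100000)")
        if not 0.01 <= self.repetition_penalty <= 5:
            raise ValueError("repetition_penalty must be in [0.01, 5]")
        if len(self.stop) > 4:
            raise ValueError("at most 4 stop strings are allowed")
        if not 1 <= self.num_responses <= 2:
            raise ValueError("num_responses must be 1 or 2")

    def to_payload(self) -> dict:
        """Validate, then return the parameters as a plain dictionary."""
        self.validate()
        return asdict(self)


if __name__ == "__main__":
    params = GenerationParams(prompt="I have this dream about", temperature=0.0)
    print(params.to_payload())
```

Setting temperature to 0, as in the usage lines, requests deterministic output according to the description above. Validating on the client side is a design choice that surfaces out-of-range values before a request is made.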
I have this dream about the day I got a job at a tech company. I just woke up on a plane. I sat down on the floor and started getting work done. After getting up around 6 p.m., I looked around and