Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the repository for the 7B fine-tuned model, optimized for dialogue use cases and converted for the Hugging Face Transformers format.
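Because the weights are in the Hugging Face Transformers format, the model can be loaded with the standard `transformers` API. The snippet below is a minimal sketch, not the only way to run it: the repo id `meta-llama/Llama-2-7b-chat-hf`, the half-precision dtype, and the `device_map` setting are assumptions, and access to the Llama 2 weights is gated behind Meta's license acceptance.

```python
# Minimal sketch of loading the chat model with Hugging Face Transformers.
# The repo id below is an assumption; the Llama 2 weights are gated, so you
# must be logged in to the Hub with an account that has been granted access.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision so the 7B weights fit on one GPU
    device_map="auto",          # requires the `accelerate` package
)
```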
System prompt
Tweak the overall style and tone of the conversation by giving the model some 'master' instructions. (Default: Be a helpful assistant)

Max new tokens
Maximum length, in tokens, of the newly generated text. (Default: 2048, 1 ≤ max_new_tokens ≤ 100000)

Temperature
Temperature to use for sampling. 0 makes the output deterministic; higher values encourage more diversity. (Default: 0.7, 0 ≤ temperature ≤ 1)

Top p
Sample from the smallest set of highest-probability tokens whose cumulative probability exceeds p. Lower values focus on the most probable tokens; higher values sample more low-probability tokens. (Default: 0.9, 0 < top_p ≤ 1)

Top k
Sample from the k most probable tokens. 0 means off. (Default: 0, 0 ≤ top_k < 100000)

Stop sequences
Up to 4 strings that will terminate generation immediately. Separate items with commas.
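These settings correspond to standard arguments of the Transformers text-generation API. The sketch below is illustrative only: it assumes the `model` and `tokenizer` from the loading example, the usual Llama 2 chat prompt template with `[INST]`/`<<SYS>>` markers, and a recent `transformers` release for the stop-string handling; the user message and stop string are hypothetical examples.

```python
# Illustrative generation call wiring up the parameters described above.
# Assumes `model` and `tokenizer` from the loading example; the prompt uses
# the standard Llama 2 chat template ([INST] ... [/INST] with a <<SYS>> block).
system_prompt = "Be a helpful assistant"            # 'master' instructions
user_message = "Explain what top-p sampling does."  # hypothetical user turn

# The tokenizer prepends the <s> BOS token automatically.
prompt = f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_message} [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output_ids = model.generate(
    **inputs,
    do_sample=True,       # required for temperature/top_p/top_k to take effect
    max_new_tokens=2048,  # maximum length of the newly generated text
    temperature=0.7,      # lower values are more deterministic, higher more diverse
    top_p=0.9,            # nucleus sampling threshold
    top_k=0,              # 0 disables top-k filtering
    # Stop sequences: recent transformers releases accept `stop_strings` together
    # with a `tokenizer=` argument; older releases need a StoppingCriteria instead.
    stop_strings=["[INST]"],  # hypothetical stop string
    tokenizer=tokenizer,
)

# Strip the prompt tokens and decode only the newly generated text.
new_tokens = output_ids[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```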