Qwen2.5-Coder-7B is a powerful code-specific large language model with 7.61 billion parameters. It's designed for code generation, reasoning, and fixing tasks. The model covers 92 programming languages and has been trained on 5.5 trillion tokens of data, including source code, text-code grounding, and synthetic data.
Input
Text to generate from.
Max New Tokens
Maximum length of the newly generated text. If explicitly set to None, it will be the model's max context length minus the input length. (Default: 512, 1 ≤ max_new_tokens ≤ 1000000)
Temperature
Temperature to use for sampling. 0 means the output is deterministic. Values greater than 1 encourage more diversity. (Default: 0.7, 0 ≤ temperature ≤ 100)
Top P
Sample from the set of tokens with the highest probability such that the sum of probabilities is higher than p. Lower values focus on the most probable tokens; higher values sample more low-probability tokens. (Default: 0.9, 0 < top_p ≤ 1)
Min P
Float that represents the minimum probability for a token to be considered, relative to the probability of the most likely token. Must be in [0, 1]. Set to 0 to disable this. (Default: 0, 0 ≤ min_p ≤ 1)
Top K
Sample from the best k (number of) tokens. 0 means off. (Default: 0, 0 ≤ top_k < 1000)
Repetition Penalty
Repetition penalty. A value of 1 means no penalty; values greater than 1 discourage repetition, and values smaller than 1 encourage it. (Default: 1, 0.01 ≤ repetition_penalty ≤ 5)
Num Responses
Number of output sequences to return. Incompatible with streaming (Default: 1, 1 ≤ num_responses ≤ 2)
Response Format
How to format the response.
Presence Penalty
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. (Default: 0, -2 ≤ presence_penalty ≤ 2)
Frequency Penalty
Positive values penalize new tokens based on how many times they appear in the text so far, increasing the model's likelihood to talk about new topics. (Default: 0, -2 ≤ frequency_penalty ≤ 2)
User
A unique identifier representing your end-user, which can help monitor and detect abuse. Avoid sending us any identifying information; we recommend hashing user identifiers. (Default: empty)
Seed
Seed for the random number generator. If not provided, a random seed is used. Determinism is not guaranteed. (Default: empty, 0 ≤ seed < 9223372036854776000)
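To see how these parameters fit together, here is a minimal request sketch in Python. The endpoint URL, API key, and exact payload shape are illustrative assumptions rather than a definitive client; only the parameter names and defaults come from the descriptions above.

```python
import requests

# Hypothetical endpoint and key, shown for illustration only;
# substitute your provider's actual inference URL and credentials.
API_URL = "https://api.example.com/v1/inference/Qwen/Qwen2.5-Coder-7B"
API_KEY = "YOUR_API_KEY"

payload = {
    "input": "def fibonacci(n):",  # text to generate from
    "max_new_tokens": 256,         # cap on newly generated tokens (default 512)
    "temperature": 0.7,            # 0 would make the output deterministic
    "top_p": 0.9,                  # nucleus sampling threshold
    "top_k": 0,                    # 0 disables top-k filtering
    "min_p": 0,                    # 0 disables the minimum-probability filter
    "repetition_penalty": 1,       # 1 means no penalty
    "presence_penalty": 0,
    "frequency_penalty": 0,
    "num_responses": 1,            # values > 1 are incompatible with streaming
    "seed": 42,                    # reproducibility hint; determinism not guaranteed
}

resp = requests.post(
    API_URL,
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```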
Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). For Qwen2.5-Coder, we release base and instruction-tuned language models at 1.5, 7, and (coming soon) 32 billion parameters. Qwen2.5-Coder brings significant improvements over CodeQwen1.5 in code generation, code reasoning, and code fixing, scaling training up to 5.5 trillion tokens spanning source code, text-code grounding, and synthetic data.
This repo contains the 7B Qwen2.5-Coder base model (7.61 billion parameters, covering 92 programming languages).
We do not recommend using base language models for conversations. Instead, you can apply post-training (e.g., SFT, RLHF, or continued pretraining), or use this model for fill-in-the-middle tasks.
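As a concrete illustration of fill-in-the-middle, here is a minimal sketch using Hugging Face transformers, assuming the `<|fim_prefix|>`, `<|fim_suffix|>`, and `<|fim_middle|>` special tokens documented for the Qwen2.5-Coder series. The surrounding code snippet and generation settings are illustrative, not a prescribed recipe.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-Coder-7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Fill-in-the-middle: the model generates the code that belongs
# between the given prefix and suffix.
prompt = (
    "<|fim_prefix|>def quicksort(arr):\n"
    "    if len(arr) <= 1:\n"
    "        return arr\n"
    "    pivot = arr[len(arr) // 2]\n"
    "<|fim_suffix|>"
    "    return quicksort(left) + middle + quicksort(right)\n"
    "<|fim_middle|>"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated middle section, not the prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```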
For more details, please refer to our blog, GitHub, Documentation, and arXiv paper.
Detailed evaluation results are reported in this 📑 blog.
For requirements on GPU memory and the respective throughput, see results here.