MistralLite is a fine-tuned Mistral-7B-v0.1 language model, with enhanced capabilities of processing long context (up to 32K tokens). By utilizing an adapted Rotary Embedding and sliding window during fine-tuning, MistralLite is able to perform significantly better on several long context retrieve and answering tasks, while keeping the simple model structure of the original model.
MistralLite is a fine-tuned Mistral-7B-v0.1 language model, with enhanced capabilities of processing long context (up to 32K tokens). By utilizing an adapted Rotary Embedding and sliding window during fine-tuning, MistralLite is able to perform significantly better on several long context retrieve and answering tasks, while keeping the simple model structure of the original model.
f4c81328d87796560fa33e35a328716cc502726d
2023-11-15T13:11:26+00:00