DeepSeek-R1 Model Fine Tuning Series

Contents:

- 1. Introduction
  - 1.1 Project background
  - 1.2 Introduction to LoRA and QLoRA
  - 1.3 LLaMA architecture and Qwen architecture
    - LLaMA architecture
    - Qwen architecture
- 2. Environmental preparation
  - 2.1 Unsloth installation (GPU version; not available for now)
  - 2.2 Creating a Python project
  - 2.3 Python dependency libraries
  - 2.2 LoRA (peft) installation
  - 2.3 WandB settings
  - 2.4 Pull the model with ModelScope
  - 2.5 Test model usage
- 3. Training data
  - 3.1 Prepare the dataset
  - 3.2 Data cleaning
  - 3.3 Training data
  - 3.3 Training the model and saving it
  - 3.4 Merging model files
  - 3.4 Evaluation and monitoring of the training process
    - Evaluation (eval/) related information
    - Training (train/) related information
1. Introduction
Content of this article:
- Model loading and preprocessing: explains in detail how to load the pretrained model and tokenizer, and how to process the input datasets.
- LoRA configuration: introduces how to configure the model with LoRA and fine-tune it efficiently to save computing resources.
- Training process: shows how to configure training parameters, train with SFTTrainer, and record training logs through WandB.
- Model saving and evaluation: how to save the fine-tuned model and validate it against a suitable evaluation set.
- Model merging: shows how to merge multiple model weights through a weighted average to obtain a stronger model.
1.1 Project background
This document describes how to efficiently fine-tune DeepSeek-R1-Distill-Qwen-1.5B (Qwen architecture) on a Mac laptop: transformers handles data processing, LoRA is used to fine-tune the model, WandB monitors the training process, and ModelScope downloads the model. (The training set is about 20,000 records.)
- Because I was training locally on a Mac laptop without GPU support, I gave up on the larger DeepSeek-R1-Distill-Qwen-7B (Qwen) model.
The services and libraries used are as follows:
Service | Version | Purpose |
---|---|---|
Unsloth | | Data processing and model fine-tuning. |
Transformers | | Hugging Face library used to load and fine-tune DeepSeek-R1. |
WandB | | Real-time monitoring and visualization of the training process. |
LoRA | | Low-rank adaptation technique for fine-tuning. |
ModelScope | | Used to download the DeepSeek-R1-8b model. |
Python | 3.11 | Used to execute Python scripts and training tasks. |
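As an illustration of how ModelScope and transformers fit together here, the following is a minimal sketch (not the article's final script; the model ID is an assumption based on the 1.5B Qwen-architecture distill mentioned above):
# Minimal sketch: download the model with ModelScope, then load it with transformers.
# The model ID is an assumption; replace it with the checkpoint you actually use.
from modelscope import snapshot_download
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = snapshot_download("deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B")
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir)  # default dtype works on CPU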
1.2 Introduction to LoRA and QLoRA
Here is a table of differences between LoRA and QLoRA:
Characteristic | LoRA (Low-Rank Adaptation) | QLoRA (Quantized LoRA) |
---|---|---|
Core principle | Reduces the number of trainable parameters via low-rank matrix decomposition | Builds on LoRA and adds quantization to further reduce storage and compute requirements |
Key advantages | Fewer parameters to update during training, improving fine-tuning efficiency | In addition to the low-rank matrices, quantization further reduces memory usage, suiting resource-constrained environments |
Storage requirements | Lower than full fine-tuning, but not as memory-efficient as QLoRA | Significantly reduces memory usage; suitable for memory-constrained devices |
Computational efficiency | Improves training efficiency and reduces compute consumption | Quantized low-precision computation further improves efficiency and reduces overhead |
Applicable scenarios | Limited compute, but no need for extreme compression | Extremely limited memory and compute, especially edge devices |
Applicable hardware | Works on most hardware, including high-performance computing environments | Especially suited to memory-limited hardware such as edge devices and low-memory servers |
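To make the difference concrete, here is a minimal sketch using the Hugging Face peft and bitsandbytes APIs; the rank, alpha, target modules, and model ID are illustrative assumptions, and the QLoRA variant needs a CUDA GPU:
# Minimal sketch: the same LoRA adapter config, with and without 4-bit quantization (QLoRA).
# r, lora_alpha and target_modules are illustrative values, not the article's final settings.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,                                   # rank of the low-rank update matrices
    lora_alpha=16,                         # scaling factor
    target_modules=["q_proj", "v_proj"],   # which linear layers get adapters
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# LoRA: load the base model normally, then attach the adapters.
base = AutoModelForCausalLM.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B")
lora_model = get_peft_model(base, lora_config)

# QLoRA: same adapters, but the frozen base weights are quantized to 4 bit
# (requires bitsandbytes and a CUDA GPU, so this part does not run on a Mac CPU).
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
qlora_base = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    quantization_config=bnb_config,
)
qlora_model = get_peft_model(qlora_base, lora_config)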
1.3 LLaMA architecture and Qwen architecture
Characteristic | LLaMA architecture | Qwen architecture |
---|---|---|
Developer | Meta (Facebook) | Alibaba (Qwen team); used by DeepSeek for its distilled R1 models |
Design goals | Efficient and lightweight | Chinese optimization, multilingual support |
Parameter sizes | 7B, 13B, 33B, 65B, etc. | 7B, 14B, etc. |
Open-source status | Open source | Partially open source; some details not fully public |
Applicable scenarios | Resource-limited environments | Chinese and multilingual tasks |
LLaMA architecture

- Full name: Large Language Model Meta AI (LLaMA)
- Developer: developed by Meta (formerly Facebook).
- Features:
  - Efficiency: LLaMA aims to achieve high performance with fewer parameters, focusing on computational efficiency.
  - Lightweight: the parameter counts are relatively small (7B, 13B, 33B, 65B), yet high-quality data and training methods bring performance close to, or even beyond, larger models.
  - Open source: Meta released the LLaMA weights and code for use by the research community.
- Application scenarios:
  - Suitable for resource-limited environments such as on-premises or mobile devices.
  - Works well across NLP tasks, with good performance and efficiency in generation, question answering, text classification, etc.
Qwen architecture

- Developer: developed by Alibaba's Qwen team; DeepSeek uses the Qwen architecture for its distilled R1 models.
- Features:
  - Customized design: Qwen is optimized for Chinese and specific tasks; some architectural details are not fully disclosed.
  - Multilingual support: the Qwen series has strong Chinese support and also performs well on English and multilingual tasks.
  - Flexible sizes: the Qwen series includes models of different sizes (such as 7B and 14B), suitable for different scenarios.
- Application scenarios:
  - Qwen is suitable for tasks such as text generation, automated content creation, dialogue systems, and speech synthesis.
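If you are unsure which architecture a downloaded checkpoint actually uses, a quick check with transformers works (the local path below is an assumption; point it at your model directory):
# Inspect which architecture a checkpoint declares in its config.json.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("./models/DeepSeek-R1-Distill-Qwen-1.5B")  # assumed local path
print(config.model_type)      # e.g. "qwen2" for Qwen-based distills, "llama" for Llama-based ones
print(config.architectures)   # e.g. ["Qwen2ForCausalLM"]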
2. Environmental preparation
2.1 Unsloth installation (GPU version; not available for now)
- Unsloth is a tool for data processing and model fine-tuning. You can install it with the following commands:
- Not tried on a Mac; a graphics card (GPU) is required.
## Official website: /unslothai/unsloth
#01 Create a project and set up a Python virtual environment (Python 3.11)
#02 Install unsloth (CPU version)
brew install llvm   # Homebrew clang version 19.1.7
echo 'export PATH="/opt/homebrew/opt/llvm/bin:$PATH"' >> ~/.zshrc
source ~/.zshrc
pip install torch
pip install numpy
pip install "unsloth[colab-new] @ git+/unslothai/"
#03 Version Check
python -c "import torch; print(torch.__version__)"
# Output: 2.6.0
#04 Import
from unsloth import FastLanguageModel
After the installation is complete, you can use Unsloth to preprocess data and to load and fine-tune models.
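As a reference for when a GPU is available, loading a model through Unsloth looks roughly like this (a sketch only; the model name, sequence length, and 4-bit flag are assumptions, and it will not run on the CPU-only Mac setup described here):
# Sketch: load a model and tokenizer through Unsloth (requires a CUDA GPU).
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/DeepSeek-R1-Distill-Qwen-1.5B",  # assumed model id
    max_seq_length=2048,
    load_in_4bit=True,   # quantize the frozen weights to fit smaller GPUs
)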
- Not used for the time being:
#01 Docker is recommended for Linux servers
#02 Pull the image
docker pull modelscope-registry.us-west-1.cr.aliyuncs.com/modelscope-repo/modelscope:ubuntu22.04-py310-torch2.3.1-1.22.2
#03 Start
2.2 Creating a Python project
#01 The environment is python3.11
#02 Project Directory
Unsloth-DeepSeek-R1-8b/
├── data/ # Store training data, verification data, etc.
│ ├── raw/ # Raw data
│ └── processed/ # Preprocessed data
│
├── models/ # Store model files
│ ├── checkpoints/ # Store model checkpoints during training
│ └── final_model/ # Store the final fine-tuned model
│
├── scripts/ # Store training scripts, data processing scripts, etc.
│ ├── train.py # Training script
│ ├── data_preprocessing.py # Data preprocessing script
│ └── evaluate.py # Evaluation script
│
├── logs/ # Store training log files
│ └── training_logs.txt # Logs during training
│
├── wandb/ # Store wandb related configuration and records
│ └── wandb_config.py # wandb configuration file
│
├── environment/ # Environment configuration file
│ ├── requirements.txt # Python dependencies for the project
│ └── environment.yml # Conda environment configuration file (if using Conda)
│
├── main.py # Main run file, start training or other tasks
└── README.md # Project description file, containing instructions on how to use and run
#03 Create a directory
# Create the subdirectories
mkdir -p data/raw
mkdir -p data/processed
mkdir -p models/checkpoints
mkdir -p models/final_model
mkdir -p scripts
mkdir -p logs
mkdir -p wandb
mkdir -p environment
# Create the files
touch scripts/train.py
touch scripts/data_preprocessing.py
touch scripts/evaluate.py
touch logs/training_logs.txt
touch wandb/wandb_config.py
touch environment/requirements.txt
touch environment/environment.yml
touch main.py
touch README.md
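Since the skeleton includes wandb/wandb_config.py, here is a minimal sketch of what it might contain (the project name, run name, and config keys are assumptions; scripts/train.py would call init_wandb() once before building the trainer):
# wandb/wandb_config.py -- minimal sketch; names and config values are assumptions.
import wandb

def init_wandb(run_name: str = "deepseek-r1-lora"):
    # Expects the API key in the WANDB_API_KEY environment variable (or a prior `wandb login`).
    return wandb.init(
        project="unsloth-deepseek-r1-8b",  # assumed project name, mirroring the repo name
        name=run_name,
        config={"base_model": "DeepSeek-R1-Distill-Qwen-1.5B", "method": "LoRA"},
    )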