DeepSeek-R1 Model Fine Tuning Series

Contents:

- 1. Introduction
  - 1.1 Project background
  - 1.2 Introduction to LoRA and QLoRA
  - 1.3 LLaMA architecture and Qwen architecture
    - LLaMA architecture
    - Qwen architecture
- 2. Environmental preparation
  - 2.1 Unsloth installation (GPU version; not available for now)
  - 2.2 Creating a Python project
  - 2.3 Python dependency libraries
  - 2.2 LoRA (peft) installation
  - 2.3 WandB settings
  - 2.4 Pull the model with ModelScope
  - 2.5 Test model usage
- 3. Training data
  - 3.1 Prepare the dataset
  - 3.2 Data cleaning
  - 3.3 Training data
  - 3.3 Training the model and saving it
  - 3.4 Merging model files
  - 3.4 Evaluation and monitoring of the training process
    - Evaluation (eval/) related information
    - Training (train/) related information
1. Introduction
Content of this article:
- Model loading and preprocessing: explains in detail how to load the pretrained model and tokenizer, and how to process the input datasets.
- LoRA configuration: introduces how to configure the model with LoRA and fine-tune it efficiently to save computing resources.
- Training process: shows how to configure training parameters, train with SFTTrainer, and record training logs through WandB.
- Model saving and evaluation: how to save the fine-tuned model and validate it against a suitable evaluation set.
- Model merging: shows how to merge multiple model weights through a weighted average to obtain a stronger model.
1.1 Project background
This document describes how to efficiently fine-tune DeepSeek-R1-Distill-Qwen-1.5B (Qwen architecture) on a Mac laptop: transformers handles data processing, LoRA is used to fine-tune the model, WandB monitors the training process, and ModelScope downloads the model. (The training set is about 20,000 records.)
- Because I was training locally on a Mac laptop without GPU support, I gave up on the larger DeepSeek-R1-Distill-Qwen-7B (Qwen) model.
The services and libraries used are as follows:
Service | Version | Purpose |
---|---|---|
Unsloth | | Data processing and model fine-tuning. |
Transformers | | Hugging Face library used to load and fine-tune DeepSeek-R1. |
WandB | | Real-time monitoring and visualization of the training process. |
LoRA | | Low-rank adaptation technique for fine-tuning. |
ModelScope | | Used to download the DeepSeek-R1-8b model. |
Python | 3.11 | Used to execute Python scripts and training tasks. |
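As an illustration of how ModelScope and transformers fit together here, the following is a minimal sketch (not the article's final script; the model ID is an assumption based on the 1.5B Qwen-architecture distill mentioned above):
# Minimal sketch: download the model with ModelScope, then load it with transformers.
# The model ID is an assumption; replace it with the checkpoint you actually use.
from modelscope import snapshot_download
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = snapshot_download("deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B")
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir)  # default dtype works on CPU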
1.2 Introduction to LoRA and QLoRA
Here is a table of differences between LoRA and QLoRA:
Characteristic | LoRA (Low-Rank Adaptation) | QLoRA (Quantized LoRA) |
---|---|---|
Core principle | Reduces the number of trainable parameters via low-rank matrix decomposition | Builds on LoRA and adds quantization to further reduce storage and compute requirements |
Key advantages | Fewer parameters to update during training, improving fine-tuning efficiency | In addition to the low-rank matrices, quantization further reduces memory usage, suiting resource-constrained environments |
Storage requirements | Lower than full fine-tuning, but not as memory-efficient as QLoRA | Significantly reduces memory usage; suitable for memory-constrained devices |
Computational efficiency | Improves training efficiency and reduces compute consumption | Quantized low-precision computation further improves efficiency and reduces overhead |
Applicable scenarios | Limited compute, but no need for extreme compression | Extremely limited memory and compute, especially edge devices |
Applicable hardware | Works on most hardware, including high-performance computing environments | Especially suited to memory-limited hardware such as edge devices and low-memory servers |
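To make the difference concrete, here is a minimal sketch using the Hugging Face peft and bitsandbytes APIs; the rank, alpha, target modules, and model ID are illustrative assumptions, and the QLoRA variant needs a CUDA GPU:
# Minimal sketch: the same LoRA adapter config, with and without 4-bit quantization (QLoRA).
# r, lora_alpha and target_modules are illustrative values, not the article's final settings.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,                                   # rank of the low-rank update matrices
    lora_alpha=16,                         # scaling factor
    target_modules=["q_proj", "v_proj"],   # which linear layers get adapters
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# LoRA: load the base model normally, then attach the adapters.
base = AutoModelForCausalLM.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B")
lora_model = get_peft_model(base, lora_config)

# QLoRA: same adapters, but the frozen base weights are quantized to 4 bit
# (requires bitsandbytes and a CUDA GPU, so this part does not run on a Mac CPU).
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
qlora_base = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    quantization_config=bnb_config,
)
qlora_model = get_peft_model(qlora_base, lora_config)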
1.3 LLaMA architecture and Qwen architecture
Characteristic | LLaMA architecture | Qwen architecture |
---|---|---|
Developer | Meta (Facebook) | Alibaba (Qwen team); used by DeepSeek for its distilled R1 models |
Design goals | Efficient and lightweight | Chinese optimization, multilingual support |
Parameter sizes | 7B, 13B, 33B, 65B, etc. | 7B, 14B, etc. |
Open-source status | Open source | Partially open source; some details not fully public |
Applicable scenarios | Resource-limited environments | Chinese and multilingual tasks |
LLaMA architecture

- Full name: Large Language Model Meta AI (LLaMA)
- Developer: developed by Meta (formerly Facebook).
- Features:
  - Efficiency: LLaMA aims to achieve high performance with fewer parameters, focusing on computational efficiency.
  - Lightweight: the parameter counts are relatively small (7B, 13B, 33B, 65B), yet high-quality data and training methods bring performance close to, or even beyond, larger models.
  - Open source: Meta released the LLaMA weights and code for use by the research community.
- Application scenarios:
  - Suitable for resource-limited environments such as on-premises or mobile devices.
  - Works well across NLP tasks, with good performance and efficiency in generation, question answering, text classification, etc.
Qwen architecture

- Developer: developed by Alibaba's Qwen team; DeepSeek uses the Qwen architecture for its distilled R1 models.
- Features:
  - Customized design: Qwen is optimized for Chinese and specific tasks; some architectural details are not fully disclosed.
  - Multilingual support: the Qwen series has strong Chinese support and also performs well on English and multilingual tasks.
  - Flexible sizes: the Qwen series includes models of different sizes (such as 7B and 14B), suitable for different scenarios.
- Application scenarios:
  - Qwen is suitable for tasks such as text generation, automated content creation, dialogue systems, and speech synthesis.
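If you are unsure which architecture a downloaded checkpoint actually uses, a quick check with transformers works (the local path below is an assumption; point it at your model directory):
# Inspect which architecture a checkpoint declares in its config.json.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("./models/DeepSeek-R1-Distill-Qwen-1.5B")  # assumed local path
print(config.model_type)      # e.g. "qwen2" for Qwen-based distills, "llama" for Llama-based ones
print(config.architectures)   # e.g. ["Qwen2ForCausalLM"]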
2. Environmental preparation
2.1 Unsloth installation (GPU version; not available for now)
- Unsloth is a tool for data processing and model fine-tuning. You can install it with the following commands:
- Not tried on a Mac; a graphics card (GPU) is required.
## Official website: /unslothai/unsloth
#01 Create a project and set up a Python virtual environment (Python 3.11)
#02 Install unsloth (CPU version)
brew install llvm   # Homebrew clang version 19.1.7
echo 'export PATH="/opt/homebrew/opt/llvm/bin:$PATH"' >> ~/.zshrc
source ~/.zshrc
pip install torch
pip install numpy
pip install "unsloth[colab-new] @ git+/unslothai/"
#03 Version Check
python -c "import torch; print(torch.__version__)"
# Output: 2.6.0
#04 Import
from unsloth import FastLanguageModel
After the installation is complete, you can use Unsloth to preprocess data and to load and fine-tune models.
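As a reference for when a GPU is available, loading a model through Unsloth looks roughly like this (a sketch only; the model name, sequence length, and 4-bit flag are assumptions, and it will not run on the CPU-only Mac setup described here):
# Sketch: load a model and tokenizer through Unsloth (requires a CUDA GPU).
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/DeepSeek-R1-Distill-Qwen-1.5B",  # assumed model id
    max_seq_length=2048,
    load_in_4bit=True,   # quantize the frozen weights to fit smaller GPUs
)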
- Not used for the time being:
#01 Docker is recommended for Linux servers
#02 Pull the image
docker pull modelscope-registry.us-west-1.cr.aliyuncs.com/modelscope-repo/modelscope:ubuntu22.04-py310-torch2.3.1-1.22.2
#03 Start
2.2 Creating a Python project
#01 The environment is python3.11
#02 Project Directory
Unsloth-DeepSeek-R1-8b/
├── data/ # Store training data, verification data, etc.
│ ├── raw/ # Raw data
│ └── processed/ # Preprocessed data
│
├── models/ # Store model files
│ ├── checkpoints/ # Store model checkpoints during training
│ └── final_model/ # Store the final fine-tuned model
│
├── scripts/ # Store training scripts, data processing scripts, etc.
│ ├── train.py # Training script
│ ├── data_preprocessing.py # Data preprocessing script
│ └── evaluate.py # Evaluation script
│
├── logs/ # Store training log files
│ └── training_logs.txt # Logs during training
│
├── wandb/ # Store wandb related configuration and records
│ └── wandb_config.py # wandb configuration file
│
├── environment/ # Environment configuration file
│ ├── requirements.txt # Python dependencies for the project
│ └── environment.yml # Conda environment configuration file (if using Conda)
│
├── main.py # Main run file, start training or other tasks
└── README.md # Project description file, containing instructions on how to use and run
#03 Create a directory
# Create the subdirectories
mkdir -p data/raw
mkdir -p data/processed
mkdir -p models/checkpoints
mkdir -p models/final_model
mkdir -p scripts
mkdir -p logs
mkdir -p wandb
mkdir -p environment
# Create the files
touch scripts/train.py
touch scripts/data_preprocessing.py
touch scripts/evaluate.py
touch logs/training_logs.txt
touch wandb/wandb_config.py
touch environment/requirements.txt
touch environment/environment.yml
touch main.py
touch README.md
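Since the skeleton includes wandb/wandb_config.py, here is a minimal sketch of what it might contain (the project name, run name, and config keys are assumptions; scripts/train.py would call init_wandb() once before building the trainer):
# wandb/wandb_config.py -- minimal sketch; names and config values are assumptions.
import wandb

def init_wandb(run_name: str = "deepseek-r1-lora"):
    # Expects the API key in the WANDB_API_KEY environment variable (or a prior `wandb login`).
    return wandb.init(
        project="unsloth-deepseek-r1-8b",  # assumed project name, mirroring the repo name
        name=run_name,
        config={"base_model": "DeepSeek-R1-Distill-Qwen-1.5B", "method": "LoRA"},
    )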