Coal mine safety macromodel: fine-tuning the internlm2 model to realize intelligent Q&A for coal mine accidents and coal mine safety knowledge

Coal Mine Safety Big Model ---- Mining Road Smart Protector

Using data such as historical coal mine accident cases, accident handling reports, safety rules and regulations, technical documents, and the question bank of the entrance examination for coal mine employees, we fine-tuned the internlm2 model to realize intelligent Q&A for coal mine accidents and coal mine safety knowledge.

Synopsis of this project:

In recent years, the state has been paying more and more attention to coal mine safety production. In order to ensure the safety of coal mine operations, it is especially important to improve the safety knowledge of the practitioners. In view of this, there is an urgent need for an efficient and integrated solution that can integrate all kinds of knowledge related to coal mine safety, and provide a precise and rapid information query, learning and decision support platform for coal mine enterprise leaders, safety management personnel, and miners.
To achieve this goal, we utilize rich data resources including historical coal mine accident cases, accident handling reports, safety operation procedures, rules and regulations, technical documents, and the question bank of the entry exam for coal mine employees, etc., and by fine-tuning the InternLM2 model, we constructed a coal mine safety grand model that is specialized in intelligent quizzes on coal mine accidents and coal mine safety knowledge.

Project Code:Intelligent Quiz on Safety Knowledge - Safety Big Model

More quality content please pay attention to the public number: Ting, artificial intelligence; will provide some related resources and quality articles, free access to read.

The project is characterized as follows:

Support regular questions in the field of coal mine safety, such as: single-choice questions, multiple-choice questions, judgment questions, fill in the blanks, etc. (for the main person in charge of coal mines and safety management personnel, coal mine operators of all kinds)
Support for safety regulations, rules and technical documents (e.g. Mine Safety Law of the People's *, Coal Mine Construction Safety Regulations)
Support historical coal mine accident cases, accident handling report query, provide detailed analysis of the cause of the accident, accident prevention measures and emergency response knowledge

form	foundation	name (of a thing)	releases	download link	Fine-tuning methods
dialog model	InternLM2-Chat-1_8B	CoalMineLLM_InternLM2-Chat-1_8B	V1.0	OpenXLab	QLora
dialog model	InternLM2-Chat-7B	CoalMineLLM_InternLM2-Chat-7B	V1.0	OpenXLab	QLora
dialog model	InternLM2-Math-7B	CoalMineLLM_InternLM2-Math-7B	V1.0	OpenXLab	QLora
dialog model	InternLM2-Chat-20B	CoalMineLLM_InternLM2-Chat-20B	V1.0	OpenXLab	QLora
dialog model	InternLM2-Chat-1_8B	CoalMineLLM_InternLM2-Chat-1_8B-full	V1.0	OpenXLab	global parametric fine-tuning
dialog model	CoalMineLLM_InternLM2-Chat-7B	CoalMineLLM_InternLM2-Chat-7B-4bit	V1.0	OpenXLab	W4A16 Quantification

1. Architecture diagram

2. Rapid access

2.1 Local Demo Deployment

git clone /yaosenJ/
cd CoalQA
conda create -n CoalQA python=3.10.0 -y
conda activate CoalQA
conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=12.1 -c pytorch -c nvidia
pip install -r 
cd web_app
streamlit run streamlit_app.py --=127.0.0.1 -- 6006

2.2 Online experience

CoalMineLLM_InternLM2-Chat-1_8 version experience address:/apps/detail/milowang/CoalQAv1

3. Detailed guidelines

3.1 Environment configuration

git clone /yaosenJ/
cd CoalQA
conda create -n CoalQA python=3.10.0 -y
conda activate CoalQA
conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=12.1 -c pytorch -c nvidia
pip install -r

3.2 Data construction

3.2.1 Generating InternLM2 models from documents such as rules and regulationsXtunerTraining data format

1.1 Entering the following command will generate CoalQA_data.json in the data directory

python  Generate_QAdata.py  ../data/CoalQA_data.json

The CoalQA_data.json content structure is shaped like this:

[
  {
    "input": "Please provide specific requirements for the freezing method of construction of shafts in new mine shafts." ,
    "output": "The freezing method for shaft freezing in new mine shafts needs to comply with the following requirements: the freezing depth must be extended to more than 10m through the weathering zone to the stable bedrock, the first freezing hole must be cored in its entirety, the direction and skewness of the hole must be determined during drilling, and the skewness must be corrected in time if it exceeds the prescribed level, the freezing pipe must be seamless and welded or threaded, and the water level changes in the hydrological observation holes must be observed frequently after the start of freezing. frequently observe the water level changes in the hydrological observation holes, and test excavation can be carried out only after it is determined that the freezing wall has been interlocked."
  }, .
  ...

]

1.2 Take the above generated json, processed into Xtuner trainable data format, run the following command can be

python format_internlm2.py

[
    {
        "conversation": [
            {
                "system": "You are a knowledgeable person in the field of coal mine safety, and you are very familiar with the relevant coal mine safety rules and regulations, technical and other documents. Please answer professionally and correctly the coal mine safety related questions users want to ask." ,
                "input": "Please provide specific requirements for the freezing method of construction of shafts in new mine shafts." ,
                "output": "The following requirements need to be observed for the construction of freezing method of shaft freezing in new mine shafts: the freezing depth must be extended to more than 10m to the stable bedrock through the weathering zone, the first freezing hole must be cored in its entirety, the direction and skewness of the hole must be measured when drilling, and the skewness must be corrected in time if it exceeds the stipulated requirements, the freezing pipe must be made of seamless steel pipe and welded or threaded, and the hydrological observation hole must be observed frequently for the water in the hole after freezing has begun. frequently observe the water level changes in the hydrological observation holes, and test excavation can only be carried out after determining that the freezing wall has been interlocked."
            }
        ]
    }, {
    {
        "conversation": [
            {
                "system": "You are a knowledgeable person in the field of coal mine safety and you are very familiar with the relevant coal mine safety regulations, rules, systems, technologies and other documents. Please answer professionally and correctly the coal mine safety related questions that users want to ask." ,
                "input": "What are the specific requirements in the coal mine safety regulations for the freezing method of construction of vertical shafts in new mines?" ,
                "output": "The freezing method of vertical shaft freezing construction in new mines needs to comply with the following requirements: the freezing depth must be extended through the weathering zone to the stable bedrock for more than 10m, the first freezing hole must be cored in its entirety, the direction and skewness of the hole must be determined during drilling, and the skewness must be corrected in time when it exceeds the prescribed level, the freezing pipe must be made of seamless steel pipe and welded or threaded, and the hydrological observation hole must be observed frequently after the start of the freezing process. frequently observe the water level changes in the hydrological observation holes, and test excavation can only be carried out after determining that the freezing wall has been interlocked."
            }
        ]
    }
...
]

3.2.2 Generating the InternLM2 model from the organized question bankXtunerTraining data format

Just enter the following command

python Generate_Question_bank.py

Show multiple choice generation here

[
    {
        "conversation": [
            {
                "system": "You are a knowledgeable person in the field of coal mine safety and you answer many questions. The user gives a multiple choice question and you are able to choose more than one correct option from several options." ,
                "input": "Digging roadways cannot be blasted under which of the following conditions (). \nA, when there is an omen of water breakout in the digging face or in the gunnel \nB, when the overrun distance of the water probe hole is not enough \nC, when the distance between empty roofs exceeds the regulations \nD, when the support of the digging face is not firm", , "input": "ABCD":"\nB
                "output": "ABCD"
            }
        ]
    },
    ...
]

If you want to generate training data for other topic types, please replace the content with the required content in the corresponding positions

csv_filename = '... /data/multiple-choice.csv'
#csv_filename = '... /data/mono choice.csv'
#csv_filename = '... /data/judgment.csv'
#csv_filename = '... /data/fill-in-the-blanks.csv'
#csv_filename = '... /data/short-answer.csv'

json_filename = '. /data/multiple_choice.json'
#son_filename = '... /data/single_choice.json'
#json_filename = '. /data/true_or_false.json'
#json_filename = '... /data/single_choice.json' #json_filename = '... /data/fill_in.json'
#json_filename = '. /data/shot_answer.json'

 "system": "You are a knowledgeable person in the field of coal mine safety and you answer many questions. The user gives a multiple choice question where you can choose more than one correct option out of several."
#"system": "You are a knowledgeable person in the field of coal mine safety and you can answer many questions. The user gives a single choice question and you are able to choose one correct option out of several."
#"system": "You are a knowledgeable person in the field of coal mine safety and you answer many questions. The user gives a judgment question and then you make a judgment whether it is correct or incorrect."
#"system": "You are a knowledgeable person in the field of coal mine safety and you answer many questions. The user gives a fill-in-the-blank question, and then you fill in the appropriate answer, in parentheses, in the question."
#"system": "You are a coal mine safety expert and you are familiar with the relevant mine safety rules, regulations, systems, technologies and other documents. Please answer professionally and correctly the coal mine safety related questions that the user wants to ask."

3.2.3 Multi-round dialog generation for coal mine accidents

"""
""{accident}""
With the accident investigation incident report provided above, could you please help me generate the text of a multi-round dialog in the format [
    {
        "conversation": [
            {
                "system": "You are a knowledgeable person in the field of coal mine safety, providing knowledge about coal mine safety procedures, accident prevention measures and emergency response." ,
                "input": "Hello!" ,
                "output": "Hello, I am a knowledge professional in the field of coal mine safety, how can I help you?"
            },
            {
                "input": " ",
                "output": " "
            }, ...
        ]
    }
]
"""


{
        "conversation": [
            {
                "system": "You are a knowledgeable person in the field of coal mine safety, providing knowledge about coal mine safety protocols, accident prevention measures and emergency response." ,
                "input": "Hello!" ,
                "output": "Hello, I am a knowledge professional in the field of coal mine safety, how can I help you?"
            },
            {
                "input": "What accident happened on March 22, 2023 at Zichang County Zhongda Jiaojiagou Coal Co." ,
                "output": "On March 22, 2023, at about 2:56, other accidents occurred in the 5309 railroad track downstream of ZiChang County Zhongda Jiaojiagou Coal Industry Co Ltd, resulting in one death and a direct economic loss of 1.75 million yuan."
            }, {
            {
                "input": "What was the direct cause of the accident?" ,
                "output": "The direct cause of the accident is that the mine will not meet the nominal pressure strength requirements of the pipeline as a gas discharge pipe, in the process of gas discharge pipe gas tightness test gas discharge pipe connecting flange leakage, the site operators in violation of the safety measures in the case of gas discharge pipe is not relieved with a plum wrench to tighten the leakage of the flange screws, the pipeline in the process of maintenance of force The pipeline burst out of balance during the repair process, and the operator was struck by the shockwave that spewed out, leading to the accident."
            }, {
            {
                "input": "What are the indirect causes of the accident?" ,
                "output": "Indirect causes of the accident include:\n1. ineffective implementation of production safety responsibilities and lax implementation of management systems. \n2. poor on-site safety management and violation of rules and risky operations. \n3. loopholes in technical management and low business quality of technicians. \n4. Poor safety education and training, low safety awareness among workers. \n5. deficiencies in safety supervision."
            },
            {
                "input": "How to improve the implementation of production safety responsibility in enterprises?" ,
                "output": "Ways to improve the implementation of production safety responsibilities in enterprises include:\n1. clarifying the management system of equipment and materials, ensuring strict procurement control, and contracting to clarify the specific model and scope of use of products. \n2. developing and implementing a management system for the use and recycling of pipelines, with clear responsibilities. \n3. Strengthen routine maintenance to ensure the safe use of equipment and materials. \n4. Strictly implement safety management systems such as the "Approval System for Safety Technical Measures" to ensure that the responsibility for production safety is in place."
            }, {
            {
                "input": "How to strengthen on-site safety management and avoid unauthorized risky operations?" ,
                "output": "Ways to strengthen on-site safety management include:\n1. comprehensively analyzing and identifying operational risks, and formulating detailed safety technical measures. \n2. ensuring that site operators strictly comply with safety operating procedures and prohibiting unauthorized and risky operations. \n3. Enhance on-site supervision and management to ensure that there are safety officers and persons-in-charge to carry out on-site safety supervision. \n4. realize professional teaming to ensure that work such as laying and testing of gas pumping pipelines is carried out by a professional team."
            },
            {
                "input": "How to improve technical management?" ,
                "output": "Ways to improve the level of technical management include:\n1. Strengthening the study and mastery of safety technical specifications, and ensuring that technicians understand the relevant standards and requirements. \n2. scientifically preparing safety technical measures and selecting safer processes and methods. \n3. Strictly implement the system of reviewing safety technical measures to ensure that the measures are scientific and reasonable. \n4. Ensure that operators strictly implement safety technical measures to avoid unauthorized operation. \n5. Conduct comprehensive risk analysis and research to identify and prevent possible safety risks."
            },
            {
                "input": "How to improve the effectiveness of safety education and training?" ,
                "output": "Ways to improve the effectiveness of safety education and training include:\n1. developing detailed training programs to ensure that workers have a comprehensive grasp of technical safety measures. \n2. improving the practicality of the training and enhancing the safety awareness and skills of workers through hands-on training. \n3. Enhance training on safety risk identification in the working environment and improve workers' awareness of self-insurance and mutual protection. \n4. Conduct regular training assessments to test the effectiveness of training and ensure that the training content is deeply rooted in people's minds."
            },
            {
                "input": "How to enhance safety supervision?" ,
                "output": "Ways to enhance safety supervision include:\n1. clarifying the duties of mine-based safety inspectors to ensure that they have a full grasp of the production safety situation in coal mines. \n2. strengthening the division of labor in safety supervision to ensure that responsibility for all supervisory work is assigned. \n3. conduct regular and irregular safety inspections to detect and rectify potential safety hazards. \n4. establish a sound safety supervision assessment system to ensure that safety supervision is put in place."
            }
        ]
    }

Building a knowledge graph of coal mine accidents using the GLM-4 model. Not open source for now

Additional details

1. Merge two json file script: merge_json.py
2. Format json text script: format_json.py
3. Scripts that disrupt the order of data in json:

See the DATA catalog for related data:Intelligent quiz for safety knowledge

4. Model fine-tuning

4.1 Internlm2 fine-tuning

Environment Configuration

Creating the Environment

conda create -n internlm2 python=3.10
conda activate internlm2
conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=12.1 -c pytorch -c nvidia

Installation of environment packages

cd ~
git clone -b v0.1.18 /InternLM/XTuner
cd XTuner
pip install [-e .]

Download the Project Warehouse

git clone /yaosenJ/

Download model

Go to the finetune directory

cd CoalQA/finetune

Execute the following command to download the internlm2-chat-7b model parameter file:

python download_model.py

4.2 Details of the model fine-tuning process

This document provides a detailed guide to the model fine-tuning process using the XTuner tool. The process includes transformations, merging, training, and setting up network presentations for models of different sizes (1.8B and 20B).

request
- XTuner
- DeepSpeed
- Huggingface Transformers
- SSH and Git access

4.2.1 Environment installation

# If you are on the InternStudio platform, clone an existing pytorch environment locally:
#pytorch 2.0.1 py3.10_cuda11.7_cudnn8.5.0_0

studio-conda xtuner0.1.17
# If you are on another platform:
#conda create --name xtuner0.1.17 python=3.10 -y

#Activate the environment
conda activate xtuner0.1.17
# Go to your home directory (~ means "current user's home path")
cd ~
#Create the version folder and enter to follow this tutorial
mkdir -p /root/xtuner0117 && cd /root/xtuner0117

# Pull the version 0.1.17 source code
git clone -b v0.1.17 /InternLM/xtuner
#For those who don't have access to github, pull from gitee.
#git clone -b v0.1.15 /Internlm/xtuner

#Get into the source directory
cd /root/xtuner0117/xtuner

#Install XTuner from source
pip install -e '. [all]'

4.2.2 1.8B Model Training

Data preparation

# Create a data folder inside the ft folder to store the data
mkdir -p /root/ft/data && cd /root/ft/data

Prepare the model

# Create the target folder and make sure it exists.
#-The # -p option means that if the parent directory does not exist it will be created along with it, and no error will be reported if the target folder already exists.
mkdir -p /root/ft/model

# Copy the contents to the target folder. The -r option recursively copies the entire folder.
cp -r /root/share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b/* /root/ft/model/

If you need to download it yourself, you can use the transformers library

from transformers import AutoModel

#Specify the model name
model_name = 'internlm/internlm2-chat-1_8b'

#Loading Models
model = AutoModel.from_pretrained(model_name)

#Specify the directory in which to save the model
model_save_path = '/root/ft/model'

#Saving Models
model.save_pretrained(model_save_path)

Save this code asdownload_model.py, and then run the script from the command line:

python download_model.py

This script automatically downloads the model and saves it to the specified/root/ft/model Catalog.

Download Configuration File

#XTuner provides several out-of-the-box configuration files that users can view with the following commands:
#List all built-in configuration files
#xtuner list-cfg

#If we want to find the supported profiles in the internlm2-1.8b model
xtuner list-cfg -p internlm2_1_8b

# Create a folder where the config files are stored
mkdir -p /root/ft/config

# Use the copy-cfg function in XTuner to copy the config file to the specified location
xtuner copy-cfg internlm2_1_8b_qlora_alpaca_e3 /root/ft/config

Modifying Configuration Parameters

#Modify the model address (at the position on line 27)
- pretrained_model_name_or_path = 'internlm/internlm2-1_8b'
+ pretrained_model_name_or_path = '/root/ft/model'

#Modify the dataset address to the local json file address (at the location on line 31)
- alpaca_en_path = 'tatsu-lab/alpaca'
+ alpaca_en_path = '/root/ft/data/personal_assistant.json'

# Modify max_length to reduce graphics memory consumption (at line 33)
- max_length = 2048
+ max_length = 1024

# Reduce the number of training rounds (at line 44)
- max_epochs = 3
+ max_epochs = 2

#Increase the total number of saved weights files (at the position on line 54)
- save_total_limit = 2
+ save_total_limit = 3

#Modify how many rounds of evaluation are performed (in place of line 57)
- evaluation_freq = 500
+ evaluation_freq = 300

#Modify the questions for the specific evaluation (at the position of lines 59 to 61)

# Load map_fn in OpenAI format (at line 15)
- from .map_fns import alpaca_map_fn, template_map_fn_factory
+ from .map_fns import openai_map_fn, template_map_fn_factory

# Change the address that was alpaca to be the address of the json file (at line 102)
- dataset=dict(type=load_dataset, path=alpaca_en_path), + dataset=dict(type=load_dataset, path=alpaca_en_path), #factory
+ dataset=dict(type=load_dataset, path='json', data_files=dict(train=alpaca_en_path)),

# Change dataset_map_fn to the generic OpenAI dataset format (at the position on line 105)
- dataset_map_fn=alpaca_map_fn, # change dataset_map_fn to generic OpenAI dataset format (at line 105)
+ dataset_map_fn=None, # change dataset_map_fn to generic OpenAI dataset format (position 105)

model training

#Specify the save path
xtuner train /root/ft/config/internlm2_1_8b_qlora_alpaca_e3_copy.py --work-dir /root/ft/train

#utilization deepspeed to accelerate training.
xtuner train /root/ft/config/internlm2_1_8b_qlora_alpaca_e3_copy.py --work-dir /root/ft/train_deepspeed --deepspeed deepspeed_zero2

Convert to Huggingface format

Create a catalog: Create a storage directory for the converted Huggingface model:
```
mkdir -p /root/ft/huggingface/i8000
```

model transformation: Model transformation using the provided configuration and weights files:

xtuner convert pth_to_hf /root/ft/config/internlm2_1_8b_qlora_alpaca_e3_copy.py /root/ft/train_deepspeed/iter_18000.pth /root/ft/huggingface/i8000 --fp32

Consolidation model: Merge models and resolve dependencies:

mkdir -p /root/ft/final_model_8000
export MKL_SERVICE_FORCE_INTEL=1
xtuner convert merge /root/ft/model /root/ft/huggingface/1i8000 /root/ft/final_model_18000

test model: Test the model by starting a dialog:

xtuner chat /root/ft/final_model_18000 --prompt-template internlm2_chat

Model continuation training

xtuner train /root/ft/config/internlm2_1_8b_qlora_alpaca_e3_copy.py --work-dir /root/ft/train_deepspeed --resume /root/ft/train_deepspeed/iter_8500.pth  --deepspeed deepspeed_zero1

Web Presentation Setup

Preparing the environment：

mkdir -p /root/ft/web_demo && cd /root/ft/web_demo
git clone /InternLM/
cd /root/ft/web_demo/InternLM

Running Demo Use Streamlit:

streamlit run /root/ft/web_demo/InternLM/chat/web_demo.py -- 127.0.0.1 -- 6006

Access the demo via SSH tunnel：

ssh -CNg -L 6006:127.0.0.1:6006 root@ -p developer port number

4.2.3. 20B model training

Similar to the 1.8B model training process, 20B model training involved creating the appropriate catalogs for the configuration, data, and final model. In addition, this process involves model training using multiple GPUs and converting the model to Huggingface format.

Data preparation

Preparing data for large-scale 20B model training.

# Create a directory dedicated to the 20B model data
mkdir -p /root/ft20b/data && cd /root/ft20b/data

Prepare the model

Preparing the model involves creating a target folder and copying the pre-trained 20B model to a specified location.

# Create a directory to hold the 20B model files
mkdir -p /root/ft20b/model

#Copy the pre-trained models to the newly created directory
cp -r /root/share/new_models/Shanghai_AI_Laboratory/internlm2-chat-20b/* /root/ft20b/model/

Download Configuration File

Download and prepare the configuration file for the 20B model for training.

# List all configuration files that support the 20B model
xtuner list-cfg -p internlm2_20b

#Create a directory to store the configuration files for the 20B model
mkdir -p /root/ft20b/config

# Copy the required configuration files to the newly created directory
xtuner copy-cfg internlm2_20b_qlora_alpaca_e3 /root/ft20b/config

Modifying Configuration Parameters

The profile is adjusted according to the training requirements to optimize the training of the 20B model.

#Modify key parameters such as model paths and dataset paths to fit the20Bmould
- pretrained_model_name_or_path = 'internlm/internlm2-20b'
+ pretrained_model_name_or_path = '/root/ft20b/model'

- alpaca_en_path = 'tatsu-lab/alpaca'
+ alpaca_en_path = '/root/ft20b/data/specific_dataset.json'

- max_length = 2048
+ max_length = 1024

- max_epochs = 3
+ max_epochs = 2

- save_total_limit = 2
+ save_total_limit = 3

- evaluation_freq = 500
+ evaluation_freq = 300

model training

Use DeepSpeed and multi-GPU configurations to accelerate the training process of the 20B model.

#Specify the save path and start training
xtuner train /root/ft20b/config/internlm2_20b_qlora_alpaca_e3_copy.py --work-dir /root/ft20b/train_deepspeed --deepspeed deepspeed_zero2

Convert to Huggingface format

Create a catalog for the converted Huggingface model and perform the conversion.

# Create a directory to hold the converted Huggingface model
mkdir -p /root/ft20b/huggingface

# Perform the model conversion
xtuner convert pth_to_hf /root/ft20b/config/internlm2_20b_qlora_alpaca_e3_copy.py /root/ft20b/train_deepspeed/iter_2600.pth /root/ft20b /huggingface

2.7 Model merging

Merge converted models and resolve dependencies.

# Create a directory named final_model to store the merged model files
mkdir -p /root/ft20b/final_model

#Merge the models
xtuner convert merge /root/ft20b/model /root/ft20b/huggingface /root/ft20b/final_model

test model

Test the merged model by starting a dialog.

#Start a chat test with the model
xtuner chat /root/ft20b/final_model --prompt-template

 internlm2_chat

This section provides detailed guidance to ensure that the training process for the 20B model is properly managed and executed.

4.2.4 Fine-tuning the 20b Configuration Sample

max_length = 4096
pack_to_max_length = True

#parallel
sequence_parallel_size = 1

#Scheduler & Optimizer
batch_size = 4  # per_device
accumulative_counts = 16
accumulative_counts *= sequence_parallel_size
dataloader_num_workers = 0
max_epochs = 50

=》

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03              Driver Version: 535.54.03    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id         | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A100-SXM4-80GB          On  | 00000000:89:00.0 Off |                    0 |
| N/A   65C    P0             334W / 400W |  59119MiB / 81920MiB |    100%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-SXM4-80GB          On  | 00000000:B3:00.0 Off |                    0 |
| N/A   66C    P0             358W / 400W |  59119MiB / 81920MiB |    100%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+

4.2.5 Other considerations

If you finish training on a single card, you can't continue training on a dual card.

The reason:

Root causes of the problem: The model checkpoints that were attempted to be loaded were saved in an environment with a data parallel (DP) world size of 1 (i.e., a single GPU or a single training process). However, the environment in which the current attempt to resume training has a data parallel world size of 2 (i.e., two GPUs or two training processes).

ZeRO limitations: DeepSpeed's ZeRO-Optimizer State Partitioning relies on a specific world size configuration and does not currently support automatic adjustment to the new world size. In other words, if you train on one GPU and save a checkpoint, you must do it on the same number of GPUs when loading this checkpoint for recovery training.

Optimized Performance ConfigurationThis includes setting the maximum sequence length, batch size, and other DeepSpeed-specific settings.

5. Retrieval of enhanced RAG generation

Module Purpose: Based on the user's question, retrieve the corresponding information to enhance the professionalism of the answer, so as to make CoalQA's answer more professional and reliable. The content of the search includes but is not limited to the following points:

Coal Mine Safety Related Q&A on
Cases related to coal mine safety

5.1 Technical route:

The technical routes for each part of this RAG system are:

Embedding Model: BAAI/bge-small-en-v1.5
LLM Base: InternLM2-Chat-1.8B InternLM2-Chat-7B InternLM2-Chat-20B InternLM2-Math-7B
Use BAAI/bge-reranker-large to do post-search fine ranking
Vector database:
- FAISS: is a Facebook open source efficient vector similarity search library . It provides a variety of indexing structures and search algorithms for large-scale vector retrieval , and can quickly find the most similar to a given vector vector.FAISS specializes in handling ultra-high-dimensional vector data , in image retrieval , recommendation systems and other scenarios have a wide range of applications .

5.2 Environment building:

git clone /yaosenJ/
cd rag
pip install -r

5.3 Local dataset vectorization

PDF data

The need to quantize the pdf data into the corresponding directory

JSON data

Build QA pairs and generate JSON files (single/multiple conversations) into the corresponding directory

The data format is as follows

# One-round conversations
{
  "conversation": [
      {
          "system": "You are a knowledgeable person in the field of coal mine safety and you answer many questions. The user gives a judgment question and then you make a judgment as to whether it is correct or incorrect." ,
          "input": "The state rewards units that have made remarkable achievements in improving production safety conditions, preventing production safety accidents, and participating in rescue and relief work; individuals are encouraged to do so." ,
          "output": "Error"
      }
  ]
}
#Multiple rounds of dialog
{
  "conversation": [
      {
          "system": "You are a knowledgeable person in the field of coal mine safety, providing knowledge about coal mine safety protocols, accident prevention measures, and emergency response." ,
          "input": "Hello!" ,
          "output": "Hello, I am a knowledge professional in the field of coal mine safety, how can I help you?"
      },
      {
          "input": "What accident occurred on September 11, 2022 in the underground trackless rubber wheelbarrow alley yard of the Xiashuo Coal Mine of Shanxi Sanyuan Coal Industry Company Limited of Jinneng Holding Group?" ,
          "output": "On September 11, 2022, a transportation accident occurred in the underground trackless rubber-wheeled vehicle alley car yard of the underground trackless rubber-wheeled vehicle of Shanxi Sanyuan Coal Industry Joint Stock Company Limited of Jinneng Holding Group, resulting in one death."
      },.
      {
          "input": "Please tell me what was the immediate cause of this accident?" ,
          "output": "The direct cause of the accident was that when the equipment changeover operation was carried out downhole, Ji Jingjing, the deputy shift leader of the electromechanical machine transportation team, directed and participated in the unloading operation in violation of the regulations without taking and confirming the effective anti-pouring measures, and he himself was in the danger zone of the vertical loading equipment tipping, and he was injured and died of the injuries caused by being smashed and crushed by the front beam of the end bracket and the side guards which were tipped over all of a sudden."
      }, {
      {
          "input": "So what were the indirect causes of this accident?" ,
          "output": "Indirect causes of the accident include serious deficiencies in on-site supervision and management, lax gatekeeping in the approval of procedures and measures, weak awareness of safety risks among employees, inadequate implementation of the responsibilities of the higher-level main enterprises, and inadequate supervision by local safety supervision departments."
      },.
      {
          "input": "In order to avoid similar accidents, what measures should coal mining enterprises take?" ,
          "output": "In order to avoid similar accidents, coal mining enterprises should strictly implement safety regulations, strengthen on-site safety management, ensure that safety technical measures are in place, strengthen the study and judgment of safety risks, establish a perfect safety technology management system, implement the main responsibility for safety production, strengthen safety education and training, improve the safety awareness of employees and their ability to recognize risks, and at the same time, strengthen the safety supervision and ensure that the supervision work is meticulous and seriously."
      }, {
      {
          "input": "Thank you for your detailed answer!" ,
          "output": "You're welcome, I'm glad I could help. If you have any other questions, feel free to keep asking."
      }
  ]
},

5.4 Constructing a Vector Database

1. Configuration file modification

Rewrite the document as needed:

#llm storage location
model_dir = (base_dir, 'model')

# vectorized model path and model name
embedding_path = (model_dir, 'embedding_model') # embedding
embedding_model_name = 'BAAI/bge-small-zh-v1.5'

# Rerank model path and model name
rerank_path = (model_dir, 'rerank_model') # embedding
rerank_model_name = 'BAAI/bge-reranker-large'

# recall number of documents
retrieval_num = 3

#Number of documents to be retained after refinement.
select_num = 3

prompt_template = """
    You are a helpful quiz agent. \n
    Your task is to analyze and synthesize the information retrieved back in order to provide meaningful and efficient answers.
{content}
Question: {query}
"""

2. Local call

Run the Build Local Knowledge Base script

python data_generate.py

The main steps of vectorization are as follows:

Load pdf dataset and extract text
Use RecursiveCharacterTextSplitter to split text according to a certain block size and the size of the overlap between blocks.
Loading the BAAI/bge-small-en-v1.5 model
Construct FAISS indexes from document sets (i.e., high-performance vector databases)

5.5 Related Text Recall and Elaboration

The faiss is utilized to find out the most relevant document to the question entered by the user and then the recalled text is spliced with the user's original input and input to llm. the retrieval code is as follows:

def get_retrieval_content(self, querys) -> str:
        """
            Input: User questions, whether it is necessaryrerank
            ouput: retrieved content
        """
        #print(querys)
        output = []
        content = []
        for query in querys:
            
            documents = .similarity_search(query, k=self.retrieval_num)
            
            for doc in documents:
                (doc.page_content)
            (f'Contexts length:{len(content)}')
            if self.rerank_flag:
                model = self.data_processing_obj.load_rerank_model()
                documents = self.data_processing_obj.rerank(model, query, content, self.select_num)

                for doc in documents:
                    (doc)
                (f'Selected contexts length:{len(output)}')
                (f'Selected contexts: {output}')
            else:
                (f'Selected contexts: {content}')
        return output if self.rerank_flag else content

5.6 Summary of RAG Specific Processes

Building a vector DB from a dataset
Embedding of user-entered questions
Search in vector database based on embedding results
Recall data reordering
Generate final results based on user questions and recall data

5.7 Enhancing Coal Mine Accident QA with Neo4j and Langchain Integration of Unstructured and Graph Knowledge

Enhancing QA for coal mine accidents using Neo4j and Langchain integration of unstructured and graph knowledge, seeCoalMineLLM-InternLM2-Chat-1_8B version - CoalMineLLM-InternLM2_Chat-1_8B

6. Deployment

6.1 Local deployment

Load streamlit apps directly using pytorch natively

   cd CoalQA
   conda create -n CoalQA python=3.10.0 -y
   conda activate CoalQA
   conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=12.1 -c pytorch -c nvidia
   pip install -r 
   cd web_app
   streamlit run streamlit_app.py --=127.0.0.1 -- 6006

6.2 openxlab deployment:

Load streamlit apps directly using pytorch natively

streamlit run  --=127.0.0.1 -- 6006

Select Configuration to create the application, and if more hardware resources are needed, request them here

6.3 Based onLMDeployquantitative deployment:

Introduction to LMDeploy

LMDeploy byMMDeploy cap (a poem)MMRazor Jointly developed by the team, it is a full suite of lightweight, deployment and service solutions covering LLM tasks. This powerful toolkit offers the following core features:

Efficient Reasoning: LMDeploy increases request throughput to 1.8x vLLM by introducing key features such as persistent batching (aka sequential batching), blocking KV caching, dynamic splitting and fusion, tensor parallelism, and a high-performance CUDA kernel.
Efficient Quantization: LMDeploy supports both weighted-only quantization and k/v quantization, with 4-bit inference performance 2.4x that of FP16. Quantization quality has been confirmed by OpenCompass evaluation.
Easily Distribute Servers: Utilizing the Request Distribution Service, LMDeploy makes it easy and efficient to deploy multi-model services on multiple machines and cards.
Interactive reasoning model: by caching the attention k/v during multiple rounds of dialogs, the engine remembers the history of the dialogs and thus avoids repetitive processing of historical conversations.

6.3.1 Environment installation

pip install:

pip install lmdeploy

Since v0.3.0, the default precompiled packages are in theCUDA 12 on the compiler. However, if it is necessary toCUDA 11+You can install lmdeploy by.

export LMDEPLOY_VERSION=0.3.0
export PYTHON_VERSION=38
pip install /InternLM/lmdeploy/releases/download/v${LMDEPLOY_VERSION}/lmdeploy-${LMDEPLOY_VERSION}+cu118-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux2014_x86_64.whl --extra-index-url /whl/cu118

6.3.2 Talking to Models with LMDeploy

Using LMDeploy to talk to the model, you can run the downloaded 1.8B model by executing the following command

lmdeploy chat /group_share/internlm2_chat_1_8b_qlora_18000

6.3.3 LMDeploy model quantification (lite)

Setting the Maximum KV Cache Cache Size

The --cache-max-entry-count parameter controls the maximum percentage of remaining video memory that can be used by the KV cache to 0.5 percent.

lmdeploy chat /group_share/internlm2_chat_1_8b_qlora_18000 --cache-max-entry-count 0.5

Quantification using W4A16

LMDeploy uses the AWQ algorithm for model 4bit weight quantization. The inference engine TurboMind provides a very efficient 4bit inference cuda kernel with more than 2.4 times the performance of FP16. It supports the following NVIDIA graphics cards:

Turing architecture (sm75): 20 series, T4
Ampere architecture (sm80,sm86): 30 series, A10, A16, A30, A100
Ada Lovelace Architecture (sm90): 40 Series

Before running, first install a dependency library.

pip install einops==0.7.0

Model quantization can be accomplished by executing only one command.

lmdeploy lite auto_awq \
   /group_share/internlm2_chat_1_8b_qlora_18000  \
  --calib-dataset 'ptb' \
  --calib-samples 128 \
  --calib-seqlen 1024 \
  --w-bits 4 \
  --w-group-size 128 \
  --work-dir /group_share/internlm2_chat_1_8b_qlora_18000-4bit

6.3.4 LMDeploy service (serve)

Start the API server with the following lmdeploy command to reason about the model:

lmdeploy serve api_server \
    /group_share/internlm2_chat_1_8b_qlora_18000-4bit \
    --model-format hf \
    --quant-policy 0 \
    --server-name 0.0.0.0 \
    --server-port 23333 \
    --tp 1

That is, you can get the FastAPI interface

7. Case Show

Project Code:

Intelligent Quiz on Safety Knowledge - Safety Big Model

More quality content please pay attention to the public number: Ting, artificial intelligence; will provide some related resources and quality articles, free access to read.

Reference Links
BGE Github
- BAAI/bge-small-en-v1.5: embedding model for building vector DB
- BAAI/bge-reranker-large:: rerank model for rearranging retrieved article paragraphs
InternLM2
- Chat-1.8B model
- Chat-7B model
- Math-7B model
- Chat-20B model
LangChain
- (computer) file
- Github Repositories
FAISS
- (computer) file
- Github Repositories