Understanding RAID (Redundant Array of Independent Disks) in One Article

RAID stands for Redundant Array of Independent Disks. The technology can significantly increase the I/O read and write speed of hard disk devices, and it offers several data redundancy and backup mechanisms to choose from, which reduces the risk of losing data when a data disk is damaged.

RAID combines multiple hard disk devices into a disk array with larger capacity and better safety. Data is cut into segments that are stored on different physical disks, and reads and writes are spread across those disks to improve the overall performance of the array. At the same time, copies of important data are kept in sync on different physical disks, which provides effective data redundancy.

Several commonly used RAID modes are RAID0, RAID1, RAID5 and RAID10 (RAID One Zero).

I. Several modes of RAID

1. RAID 0

RAID 0 combines multiple physical hard disks (at least two), in hardware or in software, into one large volume and writes data to each member disk in turn. In the ideal case this multiplies the read/write performance by the number of disks, but if any one disk fails, the data of the whole array is destroyed. In plain terms, RAID 0 effectively increases disk throughput but has no ability to recover from data errors.

As shown in the figure, the data is split across the hard disk devices, with each of the two disks storing its own part, which is what improves read and write speed.

Minimum number of hard disks required: 2
Pros: faster data access.
Cons: There is no hard disk redundancy, so the risk of data loss increases. Not optimized for hard drives of different capacity sizes.
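
To make the "write to each disk in turn" idea concrete, here is a minimal Python sketch. It is a toy model only: the function names and the chunk size are my own assumptions, and real RAID 0 implementations work on fixed-size blocks at the device level.

```python
def stripe(data, n_disks, chunk_size):
    """Distribute fixed-size chunks of data across n_disks in round-robin order."""
    disks = [[] for _ in range(n_disks)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        disks[chunk_index % n_disks].append(data[offset:offset + chunk_size])
    return disks


def read_back(disks):
    """Reassemble the data by reading chunks in the same round-robin order."""
    out = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


disks = stripe(b"ABCDEFGH", n_disks=2, chunk_size=2)
print(disks)             # [[b'AB', b'EF'], [b'CD', b'GH']]
print(read_back(disks))  # b'ABCDEFGH'
```

Each disk holds only every other chunk, so losing either list makes the original data unrecoverable, which is exactly the weakness listed under Cons.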

2. RAID 1

RAID 1 is a mirroring mode. No matter how many disks the array has, only one disk's worth of capacity stores data, and every other disk is a mirror of it. With two disks the utilization is 50%; with N disks it is 1/N.

This mode is the most wasteful of space, so it is a poor choice for bulk storage. It fits cases where the data set is small, capacity is not a concern, and the safety requirements are high. Compared with RAID 0, writes are slower because every write goes to all mirrors, while reads can be served by any mirror and so can be faster than reading from a single disk.

Minimum number of hard disks required: 2
Pros: Survives the failure of all disks but one (N-1 of N).
Cons: Usable space is limited to the capacity of one hard disk. Not optimized for drives of different capacities.

3. RAID 5

RAID 5 stores parity information for the data on each disk on the other disks. The parity is not kept on one dedicated disk; it is distributed across all member disks, and the parity for a given group of data blocks is stored on a different disk from those blocks. The advantage is that if any one disk is damaged, its contents can be rebuilt from the remaining disks.

That description is rather abstract. How exactly does RAID 5 provide redundancy for one disk and rebuild the data after a disk fails?

First, review the XOR (exclusive or) operation:

A	B	A XOR B
0	0	0
0	1	1
1	0	1
1	1	0

0 is an even number and 1 is an odd number, and the XOR result is the parity: it is 0 when the count of 1 bits is even and 1 when it is odd. Treat A and B in the table above as two hard disks, with 0 and 1 as the data stored on them, and treat the result as a third hard disk. The data on A and B is XORed and the result is written to the third disk. What if there are also disks C, D, E and F? Then the result is A XOR B XOR C XOR D XOR E XOR F.
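
The truth table can be reproduced in a few lines of Python. This is only an illustrative sketch; the helper name xor_blocks is my own, and the "disks" are just byte strings.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR equal-length byte blocks together, column by column."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# The four rows of the truth table, packed into one byte per "disk".
a = bytes([0b0011])
b = bytes([0b0101])
parity = xor_blocks(a, b)
print(format(parity[0], "04b"))  # 0110

# The same helper works for any number of disks, e.g. A to F.
print(xor_blocks(b"\x01", b"\x02", b"\x04", b"\x08", b"\x10", b"\x20"))  # b'?'
```

The last line prints b'?' because the six single-bit values XOR to 0x3f, which is the ASCII code of the question mark.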

This is the principle behind RAID 5, but it is not the whole story, because where the parity is stored determines the RAID mode. If all parity blocks are placed on one dedicated disk, the mode is RAID 3; if parity blocks and data blocks are spread across every disk, it is RAID 5. RAID 5 can be seen as an upgraded RAID 3: with a dedicated parity disk, every data write also requires a write to the parity disk, so that disk carries the heaviest load and is the most likely to fail.

In the figure above, Parity marks the parity blocks. Each disk holds parity blocks computed from the corresponding blocks on all the other disks, so the failure of any single disk is not a problem: XORing the corresponding blocks on all the remaining disks yields the missing block. This also explains why RAID 5 requires at least 3 hard disks, and all of them should have the same capacity.
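
The rebuild described above can be sketched in Python. Everything here is a simplified model under my own assumptions: the function names are hypothetical, all chunks must have the same length, and the way the parity position rotates is just one possible rotation, not necessarily the layout mdadm uses.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def build_raid5(data_chunks, n_disks):
    """Lay out chunks in stripes of n_disks - 1 data blocks plus one parity block."""
    per_stripe = n_disks - 1
    if len(data_chunks) % per_stripe:
        raise ValueError("chunk count must fill whole stripes")
    disks = [[] for _ in range(n_disks)]
    for start in range(0, len(data_chunks), per_stripe):
        stripe_data = data_chunks[start:start + per_stripe]
        stripe_no = start // per_stripe
        parity_disk = (n_disks - 1 - stripe_no) % n_disks  # rotates each stripe
        parity = xor_blocks(stripe_data)
        remaining = iter(stripe_data)
        for disk_no in range(n_disks):
            disks[disk_no].append(parity if disk_no == parity_disk else next(remaining))
    return disks


def rebuild(disks, failed):
    """Recompute every block of the failed disk from the surviving disks."""
    survivors = [disk for i, disk in enumerate(disks) if i != failed]
    return [xor_blocks(row) for row in zip(*survivors)]


chunks = [b"AA", b"BB", b"CC", b"DD", b"EE", b"FF"]
disks = build_raid5(chunks, n_disks=3)
print(rebuild(disks, failed=1) == disks[1])  # True
```

The same call works for any single failed disk, because a lost data block and a lost parity block are both recovered by XORing whatever is left in the stripe.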

Minimum number of hard disks required: 3
Pros: One-drive fault tolerance and efficient use of storage space.
Cons: Not optimized for drives of different capacities.
Note: The parity blocks combined take up exactly the capacity of one hard disk.

4. RAID 10

First of all, note that RAID 10 is read "RAID one zero", not "RAID ten". It is a combination of RAID 1 and RAID 0, as shown in the figure below.

RAID 10 requires at least 4 hard disks. The disks are first paired into RAID 1 arrays to keep the data safe, and the two RAID 1 arrays are then striped with RAID 0 to further improve read and write speed.

In theory, up to 50% of the disks can fail without data loss, as long as the disks in the same RAID 1 array do not all fail. RAID 10 inherits the high read and write speed of RAID 0 and the data safety of RAID 1, and it outperforms RAID 5 when cost is left out of the picture, so it is now a widely used storage technology.
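
The "up to 50%, but not a whole mirror" rule is easy to check in code. The sketch below is a model only: the function name is my own, and the assumption that sdb/sdc and sdd/sde form the two mirror pairs is for illustration, since the actual pairing depends on how the array was built.

```python
def raid10_survives(mirror_pairs, failed):
    """Data survives as long as no mirror pair has lost all of its disks."""
    failed = set(failed)
    return all(not set(pair) <= failed for pair in mirror_pairs)


# Assumed pairing, for illustration only.
pairs = [("sdb", "sdc"), ("sdd", "sde")]

print(raid10_survives(pairs, {"sdb", "sdd"}))  # True: one disk lost per pair
print(raid10_survives(pairs, {"sdb", "sdc"}))  # False: a whole pair is gone
```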

RAID 10 stripes the data and mirrors each stripe across a pair of disks. In other words, RAID 1 is the bottom layer, and RAID 0 then combines the RAID 1 arrays, treating each one as a single hard disk. You may wonder whether it is possible to build RAID 0 arrays first and then mirror two of them with RAID 1. That is RAID 01, which has serious flaws and is rarely used, for the following reasons:

RAID 10 is essentially RAID 0 at the top level, so it scales well: adding more RAID 1 arrays adds capacity. RAID 01 is essentially RAID 1 at the top level and inherits its weakness: however many RAID 0 arrays are added, each one is just another mirror, so the capacity of the new array is never really usable.
The bottom layer of RAID 10 is RAID 1, which keeps the system running when one disk fails. The bottom layer of RAID 01 is RAID 0, so a single failed disk takes its whole RAID 0 array offline.

So in both scalability and data safety, RAID 10 is far better than RAID 01, which is why RAID 01 is hardly ever used.
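
The difference in data safety can be counted directly. The following Python sketch enumerates every possible two-disk failure in a four-disk array and checks which layout survives. It follows the simple model used in this article (a RAID 0 group is lost as soon as one of its disks fails); the function names and disk numbering are my own.

```python
from itertools import combinations

GROUPS = [(0, 1), (2, 3)]  # two groups of two disks each


def raid10_ok(failed):
    """RAID 10: each group is a mirror, so every group needs one healthy disk."""
    return all(any(d not in failed for d in group) for group in GROUPS)


def raid01_ok(failed):
    """RAID 01: each group is a stripe, so one group must be fully healthy."""
    return any(all(d not in failed for d in group) for group in GROUPS)


two_disk_failures = [set(c) for c in combinations(range(4), 2)]
print(sum(raid10_ok(f) for f in two_disk_failures))  # 4
print(sum(raid01_ok(f) for f in two_disk_failures))  # 2
```

Of the six possible two-disk failures, RAID 10 survives four and RAID 01 only two.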

II. Managing RAID with the mdadm command

On Linux, RAID arrays are created with the mdadm command. The name stands for "multiple devices admin", and the syntax is "mdadm [parameters] device names".

Common mdadm parameters

Parameter	Function
-a	Add a device to the array
-n	Specify the number of devices
-l	Specify the RAID level
-C	Create an array
-v	Show the process
-f	Mark a device as faulty (simulate a failure)
-r	Remove a device
-Q	View summary information
-D	View details
-S	Stop the array
-x	Specify the number of spare disks

1. Creating RAID 10

The following operations are performed in VMware Workstation.

1.1 Creating RAID 10

Does creating a RAID 10 array right away sound difficult? It is not. Assuming we already have four hard disks, /dev/sdb, /dev/sdc, /dev/sdd and /dev/sde, each 20 GB in size, a single command creates the array under the name /dev/md0.

mdadm -Cv /dev/md0 -n 4 -l 10 /dev/sdb /dev/sdc /dev/sdd /dev/sde

-Cv: Create a disk array and show the process

-n 4: this array has 4 disks

-l 10: the type of array created is RAID10
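
If you script array creation, it can help to assemble the command from its parts before running it. The Python sketch below only builds and prints the command line; it does not execute mdadm. The helper name mdadm_create_argv is my own, and it uses only the flags shown in the parameter table above.

```python
import shlex


def mdadm_create_argv(array, level, devices, spares=0):
    """Build the argument list for creating an array; nothing is executed."""
    active = len(devices) - spares  # -n counts active members only
    argv = ["mdadm", "-Cv", array, "-n", str(active), "-l", str(level)]
    if spares:
        argv += ["-x", str(spares)]
    return argv + list(devices)


disks = ["/dev/sdb", "/dev/sdc", "/dev/sdd", "/dev/sde"]
print(shlex.join(mdadm_create_argv("/dev/md0", 10, disks)))
# mdadm -Cv /dev/md0 -n 4 -l 10 /dev/sdb /dev/sdc /dev/sdd /dev/sde
```

Printing the command first makes it easy to double-check the device names, since creating an array on the wrong disks destroys their contents.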

The RAID 10 array is created by a single command. Honestly, I expected to have to create several RAID 1 arrays by hand and then combine them with RAID 0.

After the command runs, the array goes through an initialization that takes about a minute. The mdadm -D /dev/md0 command shows the details of the array.

1.2 Formatting disk arrays

After creating the array, format it before use. The formatting command is still mkfs.

mkfs.ext4 /dev/md0

1.3 Mounting a Disk Array

Create a directory /raid and then mount /dev/md0 on it

mkdir -p /raid
mount /dev/md0 /raid

Now run the lsblk command, and you can see that the output differs from that of ordinary partitions

All four disks in the array show md0 beneath them, and all of them are of type raid10.

To make the mount permanent, also add an entry to the configuration file /etc/fstab

/dev/md0                /raid                   ext4    defaults        0 0
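
For readers unfamiliar with /etc/fstab, the six whitespace-separated columns of that entry can be labeled with a small Python sketch. The field names are my own descriptive labels, and this sketch insists on all six columns being present.

```python
FSTAB_FIELDS = ("device", "mount_point", "fs_type", "options", "dump", "fsck_pass")


def parse_fstab_line(line):
    """Split one fstab entry into its six columns and label them."""
    parts = line.split()
    if len(parts) != len(FSTAB_FIELDS):
        raise ValueError("expected six whitespace-separated fields")
    return dict(zip(FSTAB_FIELDS, parts))


entry = parse_fstab_line("/dev/md0    /raid    ext4    defaults    0 0")
print(entry["mount_point"], entry["fs_type"])  # /raid ext4
```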

1.4 Other RAID Modes

Other RAID modes are created the same way. Only the number of hard disks and the RAID level change, so creating the other array types is not repeated here.

2. Disk damage and replacement

Sooner or later, a hard disk in the array will fail. A RAID 10 array allows up to 50% of its disks to be broken at the same time, which is a generous fault tolerance. So when one disk fails, how is it replaced?

Start by querying the details of the array

mdadm -D /dev/md0

You can see that there are four hard disks in the md0 disk array.

Suppose the /dev/sdb drive has failed. First, mark the drive as faulty

mdadm /dev/md0 -f /dev/sdb

Then remove the drive

mdadm /dev/md0 -r /dev/sdb

Then look at the current state of this disk array

You can see that the array status has been degraded, which means that at least one disk in the RAID array has failed or is offline, preventing the array from operating in a fully redundant manner. Since we removed the /dev/sdb disk, the degraded status is normal.

Next, add a new hard disk /dev/sdf to the array with the -a parameter to replace the broken /dev/sdb

mdadm /dev/md0 -a /dev/sdf

Then look at the current state of the disk array

mdadm -D /dev/md0

The state now also shows "recovering", along with a recovery progress figure. The newly added hard disk is recognized as a spare disk, and the spare is being rebuilt.

Wait a while and check the array again; the state is back to normal.
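
If you want a script to check for these states instead of reading the output by eye, a small parser is enough. In the Python sketch below, the SAMPLE text is hand-written for illustration, not captured output, and the exact field names (including "Rebuild Status") are an assumption that should be checked against your own mdadm -D output.

```python
# Hand-written sample in "key : value" form, NOT captured mdadm output.
SAMPLE = """\
             State : clean, degraded, recovering
    Rebuild Status : 12% complete
"""


def parse_detail(text):
    """Collect 'key : value' lines into a dictionary."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def array_states(fields):
    """Return the comma-separated State values as a list."""
    return [s.strip() for s in fields.get("State", "").split(",") if s.strip()]


states = array_states(parse_detail(SAMPLE))
print(states)                                              # ['clean', 'degraded', 'recovering']
print("degraded" in states or "recovering" in states)      # True
```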

3. Using a spare disk for hot recovery

The previous section removed a failed hard disk by hand and replaced it with a new one. That process has two problems:

A failed hard disk is rarely noticed immediately. It often takes some time before the failure is discovered, so the response may come too late.
Replacing the disk is a manual job, possibly even requiring a shutdown to install the new disk, and every manual step carries the risk of an operator error.

There is a way to avoid both the shutdown and the manual removal: keep a spare hard disk attached. The spare normally sits idle, but as soon as a disk in the RAID array fails, it takes over automatically and the data is rebuilt onto it without intervention, which is a very useful feature.

All it takes is the -x parameter when the array is created.

The following RAID 5 example demonstrates this feature. RAID 5 requires at least 3 hard disks, so I prepared 3 hard disks plus 1 spare disk, four in total. The following figure shows the four disks: sdb, sdc, sdd and sde.

Next, build a RAID 5 array from these four hard disks.

mdadm -Cv /dev/md0 -n 3 -l 5 -x 1 /dev/sdb /dev/sdc /dev/sdd /dev/sde

The parameters were explained earlier, so I won't repeat them. The only addition is -x 1, which declares one spare disk.

Then view the array details with mdadm -D /dev/md0

The array is still initializing. Wait for the initialization to finish before moving on to the next step

The array contains four hard disks, but only three are active members of the RAID. The fourth is listed as one Spare Device, that is, the spare disk.

Next, format and mount the array in the same way as in sections 1.2 and 1.3.

Use the df -h /raid command to view the disk usage of the /raid directory

You can see that a RAID5 of three 20G hard disks, mounted to the /raid directory, has a usable storage size of 40G, as expected.
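
The 40G figure follows from the capacity rules given in part I. As a quick sanity check, here is a small Python sketch of those rules. The function name is my own, equal-sized disks are assumed, RAID 10 is modeled as two-way mirror pairs, and filesystem overhead is ignored.

```python
def usable_capacity(level, n_disks, disk_size):
    """Usable capacity for equal-sized disks, following the rules in part I."""
    if level == 0:
        return n_disks * disk_size          # every disk stores data
    if level == 1:
        return disk_size                    # one disk of data, the rest mirrors
    if level == 5:
        if n_disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (n_disks - 1) * disk_size    # one disk's worth of parity
    if level == 10:
        if n_disks < 4 or n_disks % 2:
            raise ValueError("this model needs an even count of at least 4 disks")
        return n_disks * disk_size / 2      # half the disks are mirrors
    raise ValueError("level not covered by this sketch")


print(usable_capacity(5, 3, 20))   # 40
print(usable_capacity(10, 4, 20))  # 40.0
```

The spare disk is not counted in n_disks, because it holds no data until it takes over for a failed member.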

Now mark the hard disk /dev/sdb as faulty and see what happens

mdadm /dev/md0 -f /dev/sdb

Immediately afterwards, run mdadm -D /dev/md0 to view the details of the disk array

The spare disk /dev/sde has automatically taken over and started rebuilding. After a while, query the md0 disk array again

You can see that /dev/sde has completely replaced the bad /dev/sdb drive and is already working properly.
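
The takeover behavior can be summarized as a tiny state model. This Python sketch only mimics the bookkeeping described above (which disks are active, which are spare); the function name and the state labels are my own, and no real rebuild happens.

```python
def handle_failure(active, spares, failed):
    """Drop the failed disk and promote the first spare, if one is available."""
    if failed not in active:
        raise ValueError("disk is not an active member")
    remaining = [disk for disk in active if disk != failed]
    if spares:
        return remaining + [spares[0]], spares[1:], "rebuilding"
    return remaining, spares, "degraded"


active, spares, state = handle_failure(
    ["/dev/sdb", "/dev/sdc", "/dev/sdd"], ["/dev/sde"], "/dev/sdb"
)
print(active, spares, state)
# ['/dev/sdc', '/dev/sdd', '/dev/sde'] [] rebuilding
```

Without a spare, the same failure leaves the array in the degraded state until someone adds a replacement disk by hand, as in section 2.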

4. Delete RAID

4.1 Unmounting

First, remove the relevant mounts from the /etc/fstab file

After that, manually unmount

umount /dev/md0

4.2 Stopping a disk array

mdadm -S /dev/md0

After stopping the array, you will notice that the /dev/md0 device has disappeared.

Next, just shut down the system and pull out the hard disk.
