In the previous chapter we deployed bcache; in this chapter we use Ceph together with bcache, letting bcache speed up Ceph's data disks.
1 Ceph architecture
A standard Ceph cluster might have the following architecture, with SSD/NVMe devices storing metadata and SATA disks storing data. In such an architecture, the upper limit of the read/write rate of the physical SATA disks determines the upper limit of the whole Ceph storage cluster (the barrel effect). So what if we add a caching layer in front of the SATA disks in this architecture?
The official documentation gives the structure of such a caching layer as follows.
The overall idea is the same, except that this cache tier has not been actively maintained for a long time and is still not considered production-ready. Bcache, on the other hand, has stood the test of time and is the more mature option among Ceph caching technologies.
With bcache, the storage architecture then becomes the following.
In this architecture we set bcache to writeback mode. In a production environment we then only need to focus on one problem: raising the cache hit rate. In fact, that is exactly what most production environments do.
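As a minimal sketch (assuming the bcache0 device created in the previous chapter; your device name may differ), the mode switch and hit-rate check both go through sysfs:

# set bcache0 to writeback mode
echo writeback > /sys/block/bcache0/bcache/cache_mode
# verify: the active mode is shown in square brackets
cat /sys/block/bcache0/bcache/cache_mode
# check the cumulative cache hit ratio (a percentage)
cat /sys/block/bcache0/bcache/stats_total/cache_hit_ratio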
2 Deployment
In Chapter 13 we covered single-node Ceph deployment, so if you don't have a single-node Ceph environment yet, jump back there and build a minimal Ceph environment in your own VM first.
When deploying the Ceph OSDs, all we need to do is point the --data disk at the bcache device, and we're done.
ceph-volume --cluster ceph lvm create --bluestore \
--data /dev/bcache0 --block.db /dev/sde1
ceph-volume --cluster ceph lvm create --bluestore \
--data /dev/bcache1 --block.db /dev/sde2
ceph-volume --cluster ceph lvm create --bluestore \
--data /dev/bcache2 --block.db /dev/sde3
Here --data specifies the data disks bcache0, bcache1, and bcache2, while partitions 1, 2, and 3 of the sde SSD hold the corresponding BlueStore DB (metadata).
At this point we can check the status of the Ceph cluster.
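For example, with the standard status commands (output omitted here; the three new OSDs should appear as up and in):

# overall cluster health, capacity and IO
ceph -s
# OSD tree: the newly created OSDs and their host
ceph osd tree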
3 Reflections
Each data disk now corresponds to a three-level device stack, as follows:
sdd
└─bcache0
└─ceph--f015264a--34a3--484e--b17a--1811290fea04-osd--block--c6b8e971--5246--46db--93f8--0ceda4626015
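For reference, the view above comes straight from lsblk run against the backing disk (sdd is just the example disk from the listing; substitute your own):

# show the bcache device and the Ceph LV stacked on top of sdd
lsblk /dev/sdd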
3.1 What exactly is this long string?
If you know LVM, you can see at a glance that the right-hand part is the LV (Logical Volume) name, while the left-hand part is the VG (Volume Group) name.
We know that LVM has a three-level structure: LV - VG - PV. The VG can be inspected with vgdisplay, and pvdisplay shows the PV behind it. A PV is usually a partition or a whole disk; in this Ceph cluster it is a bcacheX device, where X is the bcache number. From the bcache number, combined with lsblk, we can find the drive letter of the back-end data disk that actually stores the data.
(Note that in the lsblk output the VG and LV names use a double hyphen -- as the separator, while vgdisplay and lvdisplay use a single hyphen -, so filter on the last segment when searching.)
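A rough sketch of that lookup chain, using the VG name from the lsblk output above (single hyphens here) and standard LVM/lsblk tooling:

# which PV (a bcache device) backs this Ceph VG?
vgdisplay -v ceph-f015264a-34a3-484e-b17a-1811290fea04 | grep "PV Name"
# or list all PV -> VG mappings at once
pvs -o pv_name,vg_name
# from the bcache device, walk down to the backing SATA disk and the cache SSD
lsblk -s /dev/bcache0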
3.2 Why do we need to know this? Or, put another way, what is the point of knowing it?
We know that a Ceph data disk shows up in the cluster as a number such as osd.2. Now the question: if osd.2 goes bad and the disk needs to be replaced, how do we know which physical disk osd.2 corresponds to?
The relationships are summarized in the following figure.
From the figure we can see how an OSD number maps to a disk and how that relates to the LVM VG/LV names. Through the OSD's installation directory we already know which disk each OSD corresponds to, so it is tempting not to care about the LV and VG at all.
But imagine a scenario where bcache2 has failed: the ceph entry shown by lsblk naturally no longer exists. How do you then correlate the OSD number with the disk? This is where the VG and PV come in handy.
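One possible lookup, sketched here with osd.2 as the example (the paths are just the default ceph-volume layout, nothing specific to this setup):

# the OSD's block symlink points at its LV (/dev/<VG>/<LV>)
ls -l /var/lib/ceph/osd/ceph-2/block
# map that LV to its VG and to the PV (the bcache device) backing it
lvs -o lv_name,vg_name,devices
# Ceph itself also records the backing devices in the OSD metadata
ceph osd metadata 2 | grep -i device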
The author therefore recommends not being a mere tool user when studying: aim to know not only that something works, but also why it works.
3.3 Is there a simple command that shows this correspondence at a glance?
ceph-volume lvm list
This command prints, for each OSD, its LV/VG and the underlying devices directly.
A final note:
So far we have completed a minimal Ceph cluster. For a distributed storage system like Ceph, knowing the concepts alone is not enough, and maintaining it is a difficult task; from here on we will move from theory to practice, focusing on Ceph operation and maintenance skills.