liwen01 2024.07.21
Preamble
The squashfs file system is widely used in embedded Linux systems. Its two main features are that it is read-only and that it achieves a high compression ratio. On systems with limited flash space, resources that never need to be modified can be packed into a compressed read-only file system image to save space.
Another feature is that data is compressed in independent blocks, so a read only has to decompress the blocks it touches. This makes access to the data more flexible, but it also introduces read amplification: reading a few bytes still means reading and decompressing the whole block that contains them.
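As a rough illustration of read amplification, the sketch below works out which blocks a read touches and how many bytes have to be decompressed to serve it. The 128KB default is the block size mksquashfs uses when none is given. The function name is made up for this example, and the model ignores caching and the shorter last block of a file, so treat the result as an upper bound.

```python
def blocks_for_read(offset, length, block_size=128 * 1024):
    """Return (first_block, last_block, bytes_decompressed) for a read.

    Hypothetical helper: a compressed block can only be decompressed
    as a whole, so even a tiny read costs at least one full block.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return first, last, (last - first + 1) * block_size


# A 100-byte read that straddles a block boundary touches two blocks.
first, last, cost = blocks_for_read(offset=131000, length=100)
print(f"blocks {first}..{last}, {cost} bytes decompressed, "
      f"amplification x{cost / 100:.0f}")
```

A larger block size usually compresses better but makes this cost higher, which is the trade-off to keep in mind when choosing the block size.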
(i) Making a squashfs file system
mksquashfs packs files and folders into a squashfs image file. For example, to pack the squashfs-root folder into an image (the output name squashfs-root.sqsh is only an example), use the command:
mksquashfs squashfs-root squashfs-root.sqsh -comp xz
The -comp xz option selects xz as the compression method.
(1) Compression ratio test
squashfs is a compressed read-only file system, so let's run a quick test of its compression capability.
Use /dev/zero to generate zero-filled test files in the squashfs_zero folder, for example:
dd if=/dev/zero of=file1 bs=256K count=1
Create the following test directories and files:
biao@ubuntu:~/test/squashfs/squashfs_zero$ tree
.
├── test1
│ ├── file1
│ ├── file1_1
│ └── file1_2
├── test2
│ ├── file2
│ ├── file2_1
│ └── file2_2
├── test3
│ ├── file3
│ ├── file3_1
│ └── file3_2
└── test4
├── file4
├── file4_1
└── file4_2
4 directories, 12 files
biao@ubuntu:~/test/squashfs/squashfs_zero$
The directory sizes are as follows:
biao@ubuntu:~/test/squashfs/squashfs_zero$ du -h
1.5M ./test3
2.1M ./test2
2.1M ./test1
1.7M ./test4
7.3M .
biao@ubuntu:~/test/squashfs/squashfs_zero$
Pack squashfs_zero into an image file using xz compression:
mksquashfs squashfs_zero squashfs_zero.sqsh -comp xz
The size of the resulting image is:
biao@ubuntu:~/test/squashfs$ ll -h squashfs_zero.sqsh
-rw-r--r-- 1 biao biao 4.0K Jun 26 23:48 squashfs_zero.sqsh
biao@ubuntu:~/test/squashfs$
Here the 7.3M squashfs_zero folder has been compressed into a 4K squashfs_zero.sqsh image. This test is an extreme case, of course, because the files contain nothing but zeros. With random data the compression ratio would be very different.
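The difference between zeros and random data can be seen without building an image at all. The sketch below is only an approximation of what happens inside mksquashfs: it uses Python's standard lzma module (which produces the .xz format by default) on a single 256K buffer, so the numbers will not match the image sizes, which also include metadata.

```python
import lzma
import os

SIZE = 256 * 1024  # same size as the dd example: bs=256K count=1

zeros = bytes(SIZE)             # what /dev/zero would give us
random_data = os.urandom(SIZE)  # what /dev/urandom would give us

zeros_xz = lzma.compress(zeros)
random_xz = lzma.compress(random_data)

print(f"zeros : {SIZE} -> {len(zeros_xz)} bytes")
print(f"random: {SIZE} -> {len(random_xz)} bytes")
```

The zero-filled buffer shrinks to a tiny fraction of its size, while the random buffer does not shrink at all, because random data has no redundancy for the compressor to exploit.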
(ii) Squashfs data analysis
(1) Data Layout
A squashfs image file contains up to the following 9 parts: Superblock, Compression options, Data blocks & fragments, Inode table, Directory table, Fragment table, Export table, UID/GID lookup table, Xattr table.
"Up to" means that some parts are optional, such as the Compression options part.
They are stored in the image file in the order listed above.
(2) Create a test image file
Use /dev/urandom to generate random test files in the squashfs_urandom folder, for example:
dd if=/dev/urandom of=filex bs=10K count=50
Create the following test directories and files:
biao@ubuntu:~/test/squashfs/squashfs_urandom$ tree
.
├── test1
│ ├── file1
│ ├── file1_1
│ └── file1_2
├── test2
│ ├── file2
│ ├── file2_1
│ └── file2_2
├── test3
│ ├── file3
│ ├── file3_1
│ └── file3_2
└── test4
├── file4
├── file4_1
└── file4_2
4 directories, 12 files
biao@ubuntu:~/test/squashfs/squashfs_urandom$
Most parts of a squashfs file system are compressed by default. To make the later data analysis easier, we leave the Data blocks & fragments, Inode table, Directory table and Fragment table uncompressed.
The image is created with the following command:
mksquashfs squashfs_urandom squashfs_urandom.sqsh -comp xz -noF -noX -noI -noD
(3) Viewing Image Information
To see summary information about a squashfs image, use the unsquashfs command with the -s option:
unsquashfs -s squashfs_urandom.sqsh
The output is as follows:
biao@ubuntu:~/test/squashfs$ unsquashfs -s squashfs_urandom.sqsh
Found a valid SQUASHFS 4:0 superblock on squashfs_urandom.sqsh.
Creation or last append time Wed Jun 26 23:28:18 2024
Filesystem size 5032.60 Kbytes (4.91 Mbytes)
Compression xz
Block size 131072
Filesystem is exportable via NFS
Inodes are uncompressed
Data is uncompressed
Fragments are uncompressed
Always-use-fragments option is not specified
Xattrs are uncompressed
Duplicates are removed
Number of fragments 2
Number of inodes 37
Number of ids 1
biao@ubuntu:~/test/squashfs$
Here we can see that the parts for which we passed the -noI, -noD, -noF and -noX options are stored uncompressed.
(4) Superblock parameter analysis
The superblock is at the very beginning of the image file and has a fixed size of 96 bytes. Its contents are as follows:
biao@ubuntu:~/test/squashfs$ hexdump -s 0 -n 96 -C squashfs_urandom.sqsh
00000000 68 73 71 73 11 00 00 00 ec 5c 7a 66 00 00 02 00 |hsqs.....\zf....|
00000010 02 00 00 00 04 00 11 00 cb 01 01 00 04 00 00 00 |................|
00000020 ac 02 00 00 00 00 00 00 16 9d 4e 00 00 00 00 00 |..........N.....|
00000030 0e 9d 4e 00 00 00 00 00 ff ff ff ff ff ff ff ff |..N.............|
00000040 60 98 4e 00 00 00 00 00 2e 9b 4e 00 00 00 00 00 |`.N.......N.....|
00000050 6e 9c 4e 00 00 00 00 00 00 9d 4e 00 00 00 00 00 |n.N.......N.....|
00000060
biao@ubuntu:~/test/squashfs$
Parsing the superblock data
Here are a few of the more important fields:
- The first 4 bytes are the squashfs magic, with the value hsqs.
- block size is the maximum length of each data block, here 128KB. squashfs supports block sizes from 4KB to 1MB.
- compressor indicates the compression type. Here 4 means xz. The GZIP, LZMA, LZO, LZ4 and ZSTD compression formats are also supported.
- frag count is the number of entries in the fragment table, that is, the number of fragment blocks in the image (2 here).
- The superblock ends with the start offsets of the individual tables.
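The fields can also be decoded programmatically. The following is a minimal sketch that unpacks the 96 bytes from the hexdump above with Python's struct module. The field names are our own labels for the squashfs 4.0 little-endian layout (five 32-bit fields, six 16-bit fields, eight 64-bit fields), not something printed by any tool.

```python
import struct

# The 96 superblock bytes, copied from the hexdump above.
SUPERBLOCK_HEX = """
68 73 71 73 11 00 00 00 ec 5c 7a 66 00 00 02 00
02 00 00 00 04 00 11 00 cb 01 01 00 04 00 00 00
ac 02 00 00 00 00 00 00 16 9d 4e 00 00 00 00 00
0e 9d 4e 00 00 00 00 00 ff ff ff ff ff ff ff ff
60 98 4e 00 00 00 00 00 2e 9b 4e 00 00 00 00 00
6e 9c 4e 00 00 00 00 00 00 9d 4e 00 00 00 00 00
"""

# Our own names for the fields, in on-disk order.
FIELDS = (
    "magic", "inode_count", "mod_time", "block_size", "frag_count",
    "compressor", "block_log", "flags", "id_count",
    "version_major", "version_minor",
    "root_inode", "bytes_used", "id_table", "xattr_table",
    "inode_table", "dir_table", "frag_table", "export_table",
)
COMPRESSORS = {1: "gzip", 2: "lzma", 3: "lzo", 4: "xz", 5: "lz4", 6: "zstd"}


def parse_superblock(raw):
    """Decode a little-endian squashfs 4.0 superblock into a dict."""
    if len(raw) < 96:
        raise ValueError("the superblock needs 96 bytes")
    # 5 x u32 + 6 x u16 + 8 x u64 = 96 bytes
    sb = dict(zip(FIELDS, struct.unpack("<5I6H8Q", raw[:96])))
    if sb["magic"] != 0x73717368:  # b"hsqs" read as a little-endian u32
        raise ValueError("not a squashfs image")
    return sb


raw = bytes.fromhex("".join(SUPERBLOCK_HEX.split()))
sb = parse_superblock(raw)
print("compressor :", COMPRESSORS.get(sb["compressor"], "unknown"))
print("block size :", sb["block_size"], "= 2 **", sb["block_log"])
print("fragments  :", sb["frag_count"])
for name in ("inode_table", "dir_table", "frag_table",
             "export_table", "id_table"):
    print(f"{name:12s}: {sb[name]:#x}")
```

With the bytes above this reports xz, a block size of 131072, 2 fragments, and an inode table at 0x4e9860, matching the values read by hand. An xattr_table of 0xffffffffffffffff means the image has no xattr table.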
(5) Inode table data analysis
From the superblock we know that the inode table starts at offset 0x4e9860:
biao@ubuntu:~/test/squashfs$ hexdump -s 0x4e9860 -n 718 -C squashfs_urandom.sqsh
004e9860 cc 82 02 00 b4 01 00 00 00 00 9b e9 78 66 02 00 |............xf..|
004e9870 00 00 60 00 00 00 ff ff ff ff 00 00 00 00 00 20 |..`............ |
004e9880 03 00 00 00 02 01 00 20 01 01 02 00 b4 01 00 00 |....... ........|
004e9890 00 00 c3 e9 78 66 03 00 00 00 60 20 03 00 ff ff |....xf....` ....|
004e98a0 ff ff 00 00 00 00 00 d0 07 00 00 00 02 01 00 00 |................|
004e98b0 02 01 00 00 02 01 00 d0 01 01 02 00 b4 01 00 00 |................|
004e98c0 00 00 cf e9 78 66 04 00 00 00 60 f0 0a 00 ff ff |....xf....`.....|
004e98d0 ff ff 00 00 00 00 00 80 0c 00 00 00 02 01 00 00 |................|
004e98e0 02 01 00 00 02 01 00 00 02 01 00 00 02 01 00 00 |................|
004e98f0 02 01 00 80 00 01 01 00 fd 01 00 00 00 00 b1 e9 |................|
004e9900 78 66 01 00 00 00 00 00 00 00 02 00 00 00 3a 00 |xf............:.|
004e9910 00 00 11 00 00 00 02 00 b4 01 00 00 00 00 f9 e9 |................|
004e9920 78 66 06 00 00 00 00 00 00 00 00 00 00 00 00 00 |xf..............|
004e9930 00 00 00 78 00 00 02 00 b4 01 00 00 00 00 01 ea |...x............|
004e9940 78 66 07 00 00 00 00 00 00 00 00 00 00 00 00 78 |xf.............x|
004e9950 00 00 00 18 01 00 02 00 b4 01 00 00 00 00 08 ea |................|
004e9960 78 66 08 00 00 00 00 00 00 00 01 00 00 00 00 00 |xf..............|
004e9970 00 00 00 f0 00 00 01 00 fd 01 00 00 00 00 ea e9 |................|
.........
.........
biao@ubuntu:~/test/squashfs$
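Analysis of the data
The first two bytes (cc 82) are the header of the metadata block rather than part of an inode: in the value 0x82cc, bit 15 is set, which means the block is stored uncompressed, and the lower 15 bits give its length, 0x2cc = 716 bytes. This is why 2 + 716 = 718 bytes are dumped above.
There are a few parameters to note here:
(a)inode_type
inode_type is the type of the inode. The value 2 means a regular file. The other basic types are 1 (directory), 3 (symlink), 4 (block device), 5 (character device), 6 (FIFO) and 7 (socket); values 8 to 14 are the extended variants of the same seven types, in the same order.
(b)block_sizes
block_sizes describes the on-disk size of each data block of the file (the size after compression, if the block is compressed). The value is encoded and has to be decoded: bit 24 is set when the block is stored uncompressed, and the lower 24 bits hold the size in bytes.
Why do some inodes have multiple block_sizes entries? Because the superblock defines the maximum size of a block. A file larger than that is split across several blocks, and its inode has one block_sizes entry per block.
Every file has its own inode, and the inodes are stored one after another in the inode table.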
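To make the decoding concrete, here is a minimal sketch that parses the first inode of the hexdump above (the bytes right after the 2-byte metadata block header cc 82). It handles only basic file inodes (type 2), and the function and field names are our own labels for the squashfs 4.0 layout.

```python
import struct

# First inode from the hexdump above, without the leading "cc 82".
INODE_HEX = (
    "02 00 b4 01 00 00 00 00 9b e9 78 66 02 00 00 00 "  # common header
    "60 00 00 00 ff ff ff ff 00 00 00 00 00 20 03 00 "  # basic file fields
    "00 00 02 01 00 20 01 01 "                          # block_sizes[]
)

NO_FRAGMENT = 0xFFFFFFFF
UNCOMPRESSED_BIT = 1 << 24  # set in a block_sizes entry: stored uncompressed


def parse_basic_file_inode(raw, block_size):
    """Decode one basic file inode (inode_type 2)."""
    inode_type, mode, _uid_idx, _gid_idx, _mtime, number = \
        struct.unpack_from("<4H2I", raw, 0)
    if inode_type != 2:
        raise ValueError("only basic file inodes (type 2) are handled")
    blocks_start, frag_index, _frag_offset, file_size = \
        struct.unpack_from("<4I", raw, 16)
    # Without a fragment, the tail of the file is stored as a last,
    # shorter block, so the block count is rounded up.
    if frag_index == NO_FRAGMENT:
        n_blocks = -(-file_size // block_size)
    else:
        n_blocks = file_size // block_size
    entries = struct.unpack_from(f"<{n_blocks}I", raw, 32)
    blocks = [(e & (UNCOMPRESSED_BIT - 1), bool(e & UNCOMPRESSED_BIT))
              for e in entries]
    return {"number": number, "mode": mode, "blocks_start": blocks_start,
            "file_size": file_size, "blocks": blocks}


inode = parse_basic_file_inode(bytes.fromhex(INODE_HEX), block_size=131072)
print(f"inode {inode['number']}, mode {inode['mode']:o}, "
      f"data at {inode['blocks_start']:#x}, {inode['file_size']} bytes")
for size, uncompressed in inode["blocks"]:
    print(f"  block of {size} bytes, uncompressed={uncompressed}")
```

For this inode the two block_sizes entries decode to 131072 and 73728 bytes, both uncompressed, which add up to the file size of 204800 bytes.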
(6) Directory table data analysis
From the superblock we know that the directory table starts at offset 0x4e9b2e:
biao@ubuntu:~/test/squashfs$ hexdump -s 0x4e9b2e -n 320 -C squashfs_urandom.sqsh
004e9b2e 1c 81 02 00 00 00 00 00 00 00 02 00 00 00 00 00 |................|
004e9b3e 00 00 02 00 04 00 66 69 6c 65 31 28 00 01 00 02 |......file1(....|
004e9b4e 00 06 00 66 69 6c 65 31 5f 31 58 00 02 00 02 00 |...file1_1X.....|
004e9b5e 06 00 66 69 6c 65 31 5f 32 02 00 00 00 00 00 00 |..file1_2.......|
004e9b6e 00 06 00 00 00 b4 00 00 00 02 00 04 00 66 69 6c |.............fil|
004e9b7e 65 32 d4 00 01 00 02 00 06 00 66 69 6c 65 32 5f |e2........file2_|
004e9b8e 31 f4 00 02 00 02 00 06 00 66 69 6c 65 32 5f 32 |1........file2_2|
004e9b9e 02 00 00 00 00 00 00 00 0a 00 00 00 34 01 00 00 |............4...|
004e9bae 02 00 04 00 66 69 6c 65 33 68 01 01 00 02 00 06 |....file3h......|
004e9bbe 00 66 69 6c 65 33 5f 31 98 01 02 00 02 00 06 00 |.file3_1........|
004e9bce 66 69 6c 65 33 5f 32 02 00 00 00 00 00 00 00 0e |file3_2.........|
004e9bde 00 00 00 f8 01 00 00 02 00 04 00 66 69 6c 65 34 |...........file4|
004e9bee 28 02 01 00 02 00 06 00 66 69 6c 65 34 5f 31 60 |(.......file4_1`|
004e9bfe 02 02 00 02 00 06 00 66 69 6c 65 34 5f 32 03 00 |.......file4_2..|
004e9c0e 00 00 00 00 00 00 01 00 00 00 94 00 00 00 01 00 |................|
004e9c1e 04 00 74 65 73 74 31 14 01 04 00 01 00 04 00 74 |..test1........t|
004e9c2e 65 73 74 32 d8 01 08 00 01 00 04 00 74 65 73 74 |est2........test|
004e9c3e 33 8c 02 0c 00 01 00 04 00 74 65 73 74 34 20 80 |3........test4 .|
004e9c4e 60 70 17 00 00 00 00 00 00 90 01 01 00 00 00 00 |`p..............|
004e9c5e 60 a8 4d 00 00 00 00 00 00 f0 00 01 00 00 00 00 |`.M.............|
004e9c6e
biao@ubuntu:~/test/squashfs$
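Parsing the data:
After the 2-byte metadata block header (1c 81), the table begins with a directory header, which consists of three 32-bit fields: count (the number of entries that follow, minus one), start (the offset, relative to the start of the inode table, of the metadata block that holds the inodes of these entries) and inode number (the base inode number for the entries).
Each directory header is followed by at least one directory entry, which consists of offset (the position of the inode inside that metadata block), inode offset (added to the inode number in the header to give the inode number of the entry), type, name size (the length of the name, minus one) and the name itself.
The resulting inode number corresponds to the inode number in the inode table.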
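As a sketch of how these two structures are read, the code below parses the first directory listing from the hexdump above (the bytes right after the metadata block header 1c 81). The function name and dictionary keys are our own, and only one header with its entries is handled.

```python
import struct

# First directory listing from the hexdump above, without "1c 81".
DIR_HEX = (
    "02 00 00 00 00 00 00 00 02 00 00 00 "           # directory header
    "00 00 00 00 02 00 04 00 66 69 6c 65 31 "        # entry: file1
    "28 00 01 00 02 00 06 00 66 69 6c 65 31 5f 31 "  # entry: file1_1
    "58 00 02 00 02 00 06 00 66 69 6c 65 31 5f 32 "  # entry: file1_2
)


def parse_directory_run(raw, pos=0):
    """Parse one directory header and the entries that follow it.

    Returns (entries, position of the next header).
    """
    count, start, base_inode = struct.unpack_from("<3I", raw, pos)
    pos += 12
    entries = []
    for _ in range(count + 1):  # count is stored minus one
        offset, inode_delta, entry_type, name_size = \
            struct.unpack_from("<HhHH", raw, pos)
        pos += 8
        name_len = name_size + 1  # name_size is stored minus one
        name = raw[pos:pos + name_len].decode("utf-8")
        pos += name_len
        entries.append({
            "name": name,
            "type": entry_type,
            "inode_number": base_inode + inode_delta,
            "inode_block": start,    # metadata block inside the inode table
            "inode_offset": offset,  # byte offset inside that block
        })
    return entries, pos


entries, next_pos = parse_directory_run(bytes.fromhex(DIR_HEX))
for e in entries:
    print(f"{e['name']:8s} type={e['type']} inode={e['inode_number']} "
          f"at block {e['inode_block']:#x} + {e['inode_offset']:#x}")
```

The offsets 0x0, 0x28 and 0x58 are the positions of the three inodes inside the inode table's metadata block: the first inode is 40 bytes long (two block_sizes entries) and the second is 48 bytes long (four entries).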
(7) Data blocks and fragments analysis
(a)Data blocks
Our test image uses xz compression with its default settings, so nothing needs to be recorded in the Compression options part, which means that part is absent from the image.
The data blocks therefore follow immediately after the superblock.
From the inode table and directory table we know that file1, with inode number 2, is stored at the very beginning (the block start recorded in its inode is 0x60).
Because our data is uncompressed, the data starting at offset 0x60 of the image file should be identical to the contents of file1.
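This comparison is easy to script. The helper below is a small sketch (the function name is our own) that checks whether the bytes at a given offset of an image are identical to the contents of a file. It only makes sense when the data blocks are stored uncompressed, as in this test image, and when the file is stored entirely in data blocks rather than in a fragment.

```python
def data_matches(image_path, file_path, offset=0x60):
    """Return True if the image holds the file's bytes at `offset`.

    Assumes the file's data blocks are uncompressed and contiguous.
    """
    with open(file_path, "rb") as f:
        expected = f.read()
    with open(image_path, "rb") as img:
        img.seek(offset)
        actual = img.read(len(expected))
    return actual == expected
```

For the test image in this article the call would be data_matches("squashfs_urandom.sqsh", "squashfs_urandom/test1/file1"), run from the directory that contains both the image and the source folder.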
(b)fragments
A fragment block is designed to hold many small files, packed together into a single block. The tail end of a larger file, that is, the remainder that does not fill a whole block, may also be stored in a fragment block.
To find out what data is stored in the fragment blocks, see the fragment table.
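The fragment entries of our test image happen to be visible already: the hexdump of the directory table above covers 320 bytes, 34 more than the table itself, and those last 34 bytes appear to be a metadata block (header 20 80, meaning 32 uncompressed bytes) holding the two fragment entries. The sketch below decodes them, assuming the squashfs 4.0 entry layout of a 64-bit start offset, a 32-bit size encoded like block_sizes, and 32 unused bits. The names are our own.

```python
import struct

# Last 32 bytes of the directory-table hexdump, after the header "20 80".
FRAGMENT_HEX = (
    "60 70 17 00 00 00 00 00 00 90 01 01 00 00 00 00 "
    "60 a8 4d 00 00 00 00 00 00 f0 00 01 00 00 00 00 "
)
UNCOMPRESSED_BIT = 1 << 24  # same encoding as the block_sizes entries


def parse_fragment_entries(raw):
    """Decode 16-byte fragment entries: u64 start, u32 size, u32 unused."""
    entries = []
    for start, size, _unused in struct.iter_unpack("<QII", raw):
        entries.append({
            "start": start,
            "size": size & (UNCOMPRESSED_BIT - 1),
            "uncompressed": bool(size & UNCOMPRESSED_BIT),
        })
    return entries


fragments = parse_fragment_entries(bytes.fromhex(FRAGMENT_HEX))
for index, frag in enumerate(fragments):
    print(f"fragment {index}: start={frag['start']:#x} "
          f"size={frag['size']} uncompressed={frag['uncompressed']}")
```

The second fragment block starts at 0x4da860 and is 0xf000 bytes long, so it ends at 0x4e9860, exactly where the superblock says the inode table begins.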
(iii) How squashfs works
(1) Mounting the file system:
When squashfs is mounted, the system first reads the superblock to get basic information about the file system and the location of each table.
(2) Accessing files or directories:
- The system gets the locations of the inode table and directory table from the superblock.
- When accessing a directory, the system looks up the directory table to get the name and inode number of each file and subdirectory in that directory.
- Using the inode number, it fetches the inode of the file or directory from the inode table, which gives the metadata of the file and the location of its data blocks.
- For small files, or the tail ends of large files, it uses the information in the inode to look up the fragment table and find where the fragment data is stored.
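The lookup steps above can be sketched with a toy model. The dictionaries below are made-up stand-ins for the directory table and inode table, keyed by inode number as in the description. A real implementation instead follows the block and offset references stored in the directory entries and decompresses metadata blocks along the way.

```python
# Toy stand-ins for the on-disk tables; all numbers are illustrative.
ROOT_INODE = 1
DIRECTORY_TABLE = {
    1: {"test1": 5},   # root directory -> name: inode number
    5: {"file1": 2},
}
INODE_TABLE = {
    1: {"type": "dir"},
    5: {"type": "dir"},
    2: {"type": "file", "blocks_start": 0x60,
        "block_sizes": [131072, 73728], "fragment": None},
}


def lookup(path):
    """Walk the path one component at a time, starting at the root."""
    inode_number = ROOT_INODE
    for name in path.strip("/").split("/"):
        if not name:
            continue  # "/" or an empty component
        if INODE_TABLE[inode_number]["type"] != "dir":
            raise NotADirectoryError(path)
        try:
            inode_number = DIRECTORY_TABLE[inode_number][name]
        except KeyError:
            raise FileNotFoundError(path) from None
    return inode_number, INODE_TABLE[inode_number]


number, inode = lookup("/test1/file1")
print(number, inode)
```

Once the inode is found, the file's data is read from blocks_start using the block_sizes list, and from the fragment table if the inode refers to a fragment.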
(iv) Squashfs advantages and disadvantages
(1) Advantages
High compression ratio: SquashFS supports compression algorithms such as gzip, lzma, lz4 and xz, which can significantly reduce the size of the file system and save storage space.
Read-only: ideal for environments where data must be protected from modification, such as embedded systems and read-only operating system images.
Efficient random access: because data is compressed in independent blocks, SquashFS can read any part of a file without decompressing the whole image, which suits read-heavy scenarios.
Fragment packing: with fragment blocks and the fragment table, SquashFS packs small files and file tails together, which reduces wasted space and improves storage efficiency.
Storage and performance optimization: file data, directories and inodes can all be compressed, which reduces the storage footprint and the amount of data that has to be read from flash.
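Data integrity: the SquashFS format itself does not define checksums, but some compressors (xz, for example) embed a check value in each compressed block, so corruption in compressed data is often detected when the block is decompressed.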
(2) Disadvantages
Read-only: files and directories in a SquashFS file system cannot be modified in place. When the file system needs to be updated or changed, the entire image must be regenerated.
Compression overhead: although reads are generally fast, decompression still costs CPU time. On low-performance embedded systems this can have a noticeable impact on system performance.
Memory consumption: decompression can consume a lot of memory when reading large files, which can become a bottleneck, especially on resource-constrained embedded systems.
Summary
This article described the parts that make up a squashfs file system image, how they work together, and the advantages and disadvantages of squashfs.
Here is a question to think about: if the root file system is squashfs and the main executable also lives in the root file system, how can the root file system be upgraded without using a dual-partition backup scheme?
Is it safe for the main program to write the new squashfs image directly to the mtdblock device that holds the root file system? Is there a risk that the update of the root file system goes wrong?