File System (XI): Introduction to the Linux Squashfs Read-Only File System

liwen01 2024.07.21

Preface

The squashfs file system is widely used in embedded Linux systems. Its main features are that it is read-only and achieves a high compression ratio. On systems where flash space is tight, resources that never need to be modified can be packed into a compressed read-only file system image to save space.

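Another feature is that data is compressed in independent blocks, so it can be decompressed block by block. This makes reading the data more flexible, but it also introduces read amplification: reading a few bytes still requires decompressing the whole block that contains them.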
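To make the read amplification concrete, here is a minimal Python sketch. It assumes a 128KB block size (the value reported by unsquashfs later in this article) and that every block touched by a read must be decompressed in full. The function name read_amplification is made up for illustration, and the shorter last block of a file is ignored.

```python
def read_amplification(offset, length, block_size=128 * 1024):
    """Ratio of bytes that must be decompressed to bytes actually requested.

    Assumption: any block touched by the read is decompressed completely.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    first_block = offset // block_size
    last_block = (offset + length - 1) // block_size
    blocks_touched = last_block - first_block + 1
    return blocks_touched * block_size / length


# A 4KB read inside one block decompresses 128KB.
print(read_amplification(0, 4096))                 # 32.0
# The same read straddling a block boundary touches two blocks.
print(read_amplification(128 * 1024 - 2048, 4096))  # 64.0
```

A smaller block size lowers this ratio but usually costs some compression ratio, so the choice is a trade-off.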

(i) Making a squash file system

With mksquashfs you can pack files and folders into a squashfs image file. For example, to pack the squashfs-root folder into a squashfs image, use the command:

mksquashfs squashfs-root squashfs-root.sqsh -comp xz

Here squashfs-root.sqsh is the output image file (the name is only an example), and -comp xz selects xz as the compression method.

(1) Compression ratio test

squashfs is a read-only compressed file system, so let's briefly test its compression capability.

Use /dev/zero to generate zero-filled files in the squashfs_zero folder, for example:

dd if=/dev/zero of=file1 bs=256K count=1

Create the following test directories and files:

biao@ubuntu:~/test/squashfs/squashfs_zero$ tree
.
├── test1
│   ├── file1
│   ├── file1_1
│   └── file1_2
├── test2
│   ├── file2
│   ├── file2_1
│   └── file2_2
├── test3
│   ├── file3
│   ├── file3_1
│   └── file3_2
└── test4
    ├── file4
    ├── file4_1
    └── file4_2

4 directories, 12 files
biao@ubuntu:~/test/squashfs/squashfs_zero$

The file sizes are as follows:

biao@ubuntu:~/test/squashfs/squashfs_zero$ du -h
1.5M    ./test3
2.1M    ./test2
2.1M    ./test1
1.7M    ./test4
7.3M    .
biao@ubuntu:~/test/squashfs/squashfs_zero$

Pack squashfs_zero into an image file using xz compression:

mksquashfs squashfs_zero squashfs_zero.sqsh -comp xz

The file sizes are as follows:

biao@ubuntu:~/test/squashfs$ ll -h squashfs_zero.sqsh 
-rw-r--r-- 1 biao biao 4.0K Jun 26 23:48 squashfs_zero.sqsh
biao@ubuntu:~/test/squashfs$

Here the 7.3M squashfs_zero folder is compressed into a 4.0K squashfs_zero.sqsh. Of course this test is extreme, because every file contains only zeros; with random data the compression ratio would be very different.
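The difference between zero data and random data can be reproduced without building an image. The sketch below uses Python's standard lzma module rather than mksquashfs, so the numbers are only indicative: mksquashfs compresses block by block and adds its own metadata. The helper name xz_ratio is an assumption made for this example.

```python
import lzma
import os

SIZE = 256 * 1024  # same size as the dd example above


def xz_ratio(data):
    """Return compressed size divided by original size, using the xz format."""
    compressed = lzma.compress(data, format=lzma.FORMAT_XZ)
    return len(compressed) / len(data)


zero_ratio = xz_ratio(bytes(SIZE))          # all-zero data
random_ratio = xz_ratio(os.urandom(SIZE))   # incompressible data

print(f"zeros:  {zero_ratio:.4f}")
print(f"random: {random_ratio:.4f}")
```

The zero buffer shrinks to a tiny fraction of its size, while the random buffer does not shrink at all.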

(ii) Squashfs data analysis

(1) Data Layout

A squashfs image file contains up to the following 9 parts: Superblock, Compression options, Data blocks & fragments, Inode table, Directory table, Fragment table, Export table, UID/GID lookup table, and Xattr table.

"Up to" means that some parts are optional, such as the Compression options part.

In the image file they are stored in the order listed above.

(2) Create a test image file

Use /dev/urandom to generate random data for the files in the squashfs_urandom folder, for example:

dd if=/dev/urandom of=filex bs=10K count=50

Make the following test file directory and test files:

biao@ubuntu:~/test/squashfs/squashfs_urandom$ tree
.
├── test1
│   ├── file1
│   ├── file1_1
│   └── file1_2
├── test2
│   ├── file2
│   ├── file2_1
│   └── file2_2
├── test3
│   ├── file3
│   ├── file3_1
│   └── file3_2
└── test4
    ├── file4
    ├── file4_1
    └── file4_2

4 directories, 12 files
biao@ubuntu:~/test/squashfs/squashfs_urandom$

Most of the components of a squashfs file system are also compressed. To make the later data analysis easier, we leave the Data blocks & fragments, Inode table, Directory table, and Fragment table uncompressed.

The command is as follows:

mksquashfs squashfs_urandom squashfs_urandom.sqsh -comp xz  -noF -noX -noI -noD

(3) Viewing image information

To see summary information about a squashfs image, use the unsquashfs command:

unsquashfs -s squashfs_urandom.sqsh

The output is as follows:

biao@ubuntu:~/test/squashfs$ unsquashfs -s squashfs_urandom.sqsh 
Found a valid SQUASHFS 4:0 superblock on squashfs_urandom.sqsh.
Creation or last append time Wed Jun 26 23:28:18 2024
Filesystem size 5032.60 Kbytes (4.91 Mbytes)
Compression xz
Block size 131072
Filesystem is exportable via NFS
Inodes are uncompressed
Data is uncompressed
Fragments are uncompressed
Always-use-fragments option is not specified
Xattrs are uncompressed
Duplicates are removed
Number of fragments 2
Number of inodes 37
Number of ids 1
biao@ubuntu:~/test/squashfs$

Here we can see that the parts for which we passed a -no option are stored uncompressed.

(4) Superblock parameter analysis

The superblock is at the very beginning of the image file and has a fixed size of 96 bytes. Its contents are as follows:

biao@ubuntu:~/test/squashfs$ hexdump  -s 0 -n 96 -C squashfs_urandom.sqsh 
00000000  68 73 71 73 11 00 00 00  ec 5c 7a 66 00 00 02 00  |hsqs.....\zf....|
00000010  02 00 00 00 04 00 11 00  cb 01 01 00 04 00 00 00  |................|
00000020  ac 02 00 00 00 00 00 00  16 9d 4e 00 00 00 00 00  |..........N.....|
00000030  0e 9d 4e 00 00 00 00 00  ff ff ff ff ff ff ff ff  |..N.............|
00000040  60 98 4e 00 00 00 00 00  2e 9b 4e 00 00 00 00 00  |`.N.......N.....|
00000050  6e 9c 4e 00 00 00 00 00  00 9d 4e 00 00 00 00 00  |n.N.......N.....|
00000060
biao@ubuntu:~/test/squashfs$

Parsing the superblock data

Here are some of the more important fields:

The first 4 bytes are the squashfs magic, with the value hsqs.
block size is the maximum length of each data block, here 128KB (0x00020000). squashfs supports block sizes from 4KB to 1MB.
compressor indicates the compression type. Here 4 means xz; the other supported compressors are GZIP, LZMA, LZO, LZ4, and ZSTD.
frag count is the number of fragment blocks, that is, the number of entries in the fragment table; here it is 2.
The last six 8-byte fields are the start positions of the tables.
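The fields above can be decoded with a few lines of Python. This is a minimal sketch, assuming the little-endian SquashFS 4.0 superblock layout; the field names in FIELDS are labels chosen for this example, and DUMP is simply the 96 bytes from the hexdump above.

```python
import struct

# 96-byte SquashFS 4.0 superblock: 5 x u32, 6 x u16, 8 x u64, little-endian.
SUPERBLOCK = struct.Struct("<5I6H8Q")
FIELDS = (
    "magic", "inode_count", "mod_time", "block_size", "frag_count",
    "compressor", "block_log", "flags", "id_count",
    "version_major", "version_minor",
    "root_inode", "bytes_used", "id_table", "xattr_table",
    "inode_table", "dir_table", "frag_table", "export_table",
)
COMPRESSORS = {1: "gzip", 2: "lzma", 3: "lzo", 4: "xz", 5: "lz4", 6: "zstd"}


def parse_superblock(raw):
    """Decode the first 96 bytes of an image into a dict of named fields."""
    if len(raw) < SUPERBLOCK.size:
        raise ValueError("need at least 96 bytes")
    sb = dict(zip(FIELDS, SUPERBLOCK.unpack_from(raw)))
    if sb["magic"] != 0x73717368:  # the bytes "hsqs" read as a little-endian u32
        raise ValueError("not a squashfs image")
    return sb


# The 96 bytes shown in the hexdump above.
DUMP = bytes.fromhex(
    "68737173 11000000 ec5c7a66 00000200"
    "02000000 04001100 cb010100 04000000"
    "ac020000 00000000 169d4e00 00000000"
    "0e9d4e00 00000000 ffffffff ffffffff"
    "60984e00 00000000 2e9b4e00 00000000"
    "6e9c4e00 00000000 009d4e00 00000000"
)

sb = parse_superblock(DUMP)
print(COMPRESSORS[sb["compressor"]], sb["block_size"], sb["frag_count"])
print(hex(sb["inode_table"]), hex(sb["dir_table"]), hex(sb["frag_table"]))
```

For a real image, pass the result of open("squashfs_urandom.sqsh", "rb").read(96) instead of DUMP. An xattr_table value of 0xffffffffffffffff means the table is absent.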

(5) Inode table data analysis

From the superblock we know that the inode table starts at 0x4e9860:

biao@ubuntu:~/test/squashfs$ hexdump  -s 0x4e9860 -n 718 -C squashfs_urandom.sqsh     
004e9860  cc 82 02 00 b4 01 00 00  00 00 9b e9 78 66 02 00  |............xf..|
004e9870  00 00 60 00 00 00 ff ff  ff ff 00 00 00 00 00 20  |..`............ |
004e9880  03 00 00 00 02 01 00 20  01 01 02 00 b4 01 00 00  |....... ........|
004e9890  00 00 c3 e9 78 66 03 00  00 00 60 20 03 00 ff ff  |....xf....` ....|
004e98a0  ff ff 00 00 00 00 00 d0  07 00 00 00 02 01 00 00  |................|
004e98b0  02 01 00 00 02 01 00 d0  01 01 02 00 b4 01 00 00  |................|
004e98c0  00 00 cf e9 78 66 04 00  00 00 60 f0 0a 00 ff ff  |....xf....`.....|
004e98d0  ff ff 00 00 00 00 00 80  0c 00 00 00 02 01 00 00  |................|
004e98e0  02 01 00 00 02 01 00 00  02 01 00 00 02 01 00 00  |................|
004e98f0  02 01 00 80 00 01 01 00  fd 01 00 00 00 00 b1 e9  |................|
004e9900  78 66 01 00 00 00 00 00  00 00 02 00 00 00 3a 00  |xf............:.|
004e9910  00 00 11 00 00 00 02 00  b4 01 00 00 00 00 f9 e9  |................|
004e9920  78 66 06 00 00 00 00 00  00 00 00 00 00 00 00 00  |xf..............|
004e9930  00 00 00 78 00 00 02 00  b4 01 00 00 00 00 01 ea  |...x............|
004e9940  78 66 07 00 00 00 00 00  00 00 00 00 00 00 00 78  |xf.............x|
004e9950  00 00 00 18 01 00 02 00  b4 01 00 00 00 00 08 ea  |................|
004e9960  78 66 08 00 00 00 00 00  00 00 01 00 00 00 00 00  |xf..............|
004e9970  00 00 00 f0 00 00 01 00  fd 01 00 00 00 00 ea e9  |................|
.........
.........
biao@ubuntu:~/test/squashfs$

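Analysis of the data

There are a few parameters to note here:

(a)inode_type

inode_type is the type of the inode. The value 2 means a regular file. The other basic types are 1 (directory), 3 (symbolic link), 4 (block device), 5 (character device), 6 (FIFO), and 7 (socket); values 8 to 14 are the extended variants of the same types, in the same order.

(b)block_sizes

Each block_sizes entry describes the on-disk size of one data block (which may be compressed), and the value has to be decoded: bit 24 is set when the block is stored uncompressed, and the lower 24 bits hold the size in bytes. For example, 0x01020000 in the dump above is an uncompressed block of 0x20000 (131072) bytes.

Why do some inodes have multiple block_sizes entries? The superblock defines the maximum size of a block, so a file larger than that is split into several blocks, each with its own block_sizes entry.

Every file has a corresponding inode, and the inodes are stored sequentially in the inode table.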
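As an illustration, the first inode in the dump above can be decoded by hand. The sketch below assumes the SquashFS 4.0 layout of a basic file inode (a 16-byte common header, four 32-bit fields, then one 32-bit size entry per block); the function and key names are made up for this example, and INODE_DUMP holds the first 42 bytes of the hexdump.

```python
import struct

NO_FRAGMENT = 0xFFFFFFFF          # frag index value meaning "no tail in a fragment"
UNCOMPRESSED_BLOCK = 1 << 24      # flag bit in a data block size entry
UNCOMPRESSED_METADATA = 1 << 15   # flag bit in a metadata block header


def decode_metadata_header(raw):
    """Return (payload size, stored uncompressed?) from a 2-byte metadata header."""
    (word,) = struct.unpack_from("<H", raw, 0)
    return word & 0x7FFF, bool(word & UNCOMPRESSED_METADATA)


def parse_basic_file_inode(raw, block_size):
    """Decode a basic file inode (inode_type 2) starting at raw[0]."""
    itype, mode, uid_idx, gid_idx, mtime, number = struct.unpack_from("<4H2I", raw, 0)
    if itype != 2:
        raise ValueError("not a basic file inode")
    blocks_start, frag_index, frag_offset, file_size = struct.unpack_from("<4I", raw, 16)
    # Without a fragment, the tail of the file occupies one extra (short) block.
    n_blocks = file_size // block_size
    if frag_index == NO_FRAGMENT and file_size % block_size:
        n_blocks += 1
    entries = struct.unpack_from(f"<{n_blocks}I", raw, 32)
    blocks = [(e & (UNCOMPRESSED_BLOCK - 1), bool(e & UNCOMPRESSED_BLOCK))
              for e in entries]  # (on-disk size, stored uncompressed?)
    return {"number": number, "mode": mode, "blocks_start": blocks_start,
            "file_size": file_size, "blocks": blocks}


# First 42 bytes at 0x4e9860: metadata header followed by the first inode.
INODE_DUMP = bytes.fromhex(
    "cc82"
    "0200 b401 0000 0000 9be97866 02000000"
    "60000000 ffffffff 00000000 00200300"
    "00000201 00200101"
)

meta_size, meta_uncompressed = decode_metadata_header(INODE_DUMP)
inode = parse_basic_file_inode(INODE_DUMP[2:], block_size=131072)
print(meta_size, meta_uncompressed)
print(inode["number"], oct(inode["mode"]), hex(inode["blocks_start"]), inode["file_size"])
print(inode["blocks"])
```

The two block sizes add up to the file size (131072 + 73728 = 204800), and the header size of 716 bytes plus the 2-byte header matches the 718 bytes passed to hexdump.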

(6) Directory table data analysis

From the superblock we know that the directory table starts at 0x4e9b2e:

biao@ubuntu:~/test/squashfs$ hexdump  -s 0x4e9b2e -n 320 -C squashfs_urandom.sqsh          
004e9b2e  1c 81 02 00 00 00 00 00  00 00 02 00 00 00 00 00  |................|
004e9b3e  00 00 02 00 04 00 66 69  6c 65 31 28 00 01 00 02  |......file1(....|
004e9b4e  00 06 00 66 69 6c 65 31  5f 31 58 00 02 00 02 00  |...file1_1X.....|
004e9b5e  06 00 66 69 6c 65 31 5f  32 02 00 00 00 00 00 00  |..file1_2.......|
004e9b6e  00 06 00 00 00 b4 00 00  00 02 00 04 00 66 69 6c  |.............fil|
004e9b7e  65 32 d4 00 01 00 02 00  06 00 66 69 6c 65 32 5f  |e2........file2_|
004e9b8e  31 f4 00 02 00 02 00 06  00 66 69 6c 65 32 5f 32  |1........file2_2|
004e9b9e  02 00 00 00 00 00 00 00  0a 00 00 00 34 01 00 00  |............4...|
004e9bae  02 00 04 00 66 69 6c 65  33 68 01 01 00 02 00 06  |....file3h......|
004e9bbe  00 66 69 6c 65 33 5f 31  98 01 02 00 02 00 06 00  |.file3_1........|
004e9bce  66 69 6c 65 33 5f 32 02  00 00 00 00 00 00 00 0e  |file3_2.........|
004e9bde  00 00 00 f8 01 00 00 02  00 04 00 66 69 6c 65 34  |...........file4|
004e9bee  28 02 01 00 02 00 06 00  66 69 6c 65 34 5f 31 60  |(.......file4_1`|
004e9bfe  02 02 00 02 00 06 00 66  69 6c 65 34 5f 32 03 00  |.......file4_2..|
004e9c0e  00 00 00 00 00 00 01 00  00 00 94 00 00 00 01 00  |................|
004e9c1e  04 00 74 65 73 74 31 14  01 04 00 01 00 04 00 74  |..test1........t|
004e9c2e  65 73 74 32 d8 01 08 00  01 00 04 00 74 65 73 74  |est2........test|
004e9c3e  33 8c 02 0c 00 01 00 04  00 74 65 73 74 34 20 80  |3........test4 .|
004e9c4e  60 70 17 00 00 00 00 00  00 90 01 01 00 00 00 00  |`p..............|
004e9c5e  60 a8 4d 00 00 00 00 00  00 f0 00 01 00 00 00 00  |`.M.............|
004e9c6e
biao@ubuntu:~/test/squashfs$

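Parsing the data:

After the 2-byte metadata block header (1c 81), the table begins with a directory header structure consisting of count, start, and inode number. count is the number of entries that follow minus one, start is the position (relative to the start of the inode table) of the metadata block that holds the inodes of these entries, and inode number is the base value from which the inode number of each entry is computed.

Each directory header must be followed by at least one directory entry. An entry consists of offset (the position of the inode inside its metadata block), inode offset (the difference between this entry's inode number and the one in the header), type, name size (the length of the name minus one), and the name itself.

The inode number obtained this way corresponds to the inode number stored in the inode table.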
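The first directory in the dump (the contents of test1) can be decoded as follows. This is a sketch under the assumption of the SquashFS 4.0 directory layout: a 12-byte header of three 32-bit fields, then entries made of four 16-bit fields followed by the name. The function name parse_directory is hypothetical; DIR_DUMP is copied from the hexdump, without the leading 1c 81 metadata header.

```python
import struct


def parse_directory(raw):
    """Decode one directory header and its entries.

    Returns (start, entries); each entry is
    (name, inode number, inode type, offset of the inode in its metadata block).
    """
    count, start, base_inode = struct.unpack_from("<3I", raw, 0)
    pos = 12
    entries = []
    for _ in range(count + 1):          # count is stored minus one
        offset, inode_delta, itype, name_size = struct.unpack_from("<HhHH", raw, pos)
        pos += 8
        name_len = name_size + 1        # name size is stored minus one
        name = raw[pos:pos + name_len].decode("utf-8")
        pos += name_len
        entries.append((name, base_inode + inode_delta, itype, offset))
    return start, entries


# Bytes from 0x4e9b30 in the dump above: one header and three entries.
DIR_DUMP = bytes.fromhex(
    "02000000 00000000 02000000"
    "0000 0000 0200 0400 66696c6531"
    "2800 0100 0200 0600 66696c65315f31"
    "5800 0200 0200 0600 66696c65315f32"
)

start, entries = parse_directory(DIR_DUMP)
for name, number, itype, offset in entries:
    print(name, number, itype, offset)
```

The offsets 0, 40, and 88 are the positions of the three inodes inside the inode metadata block, which agrees with the inode sizes seen in the previous section (40 bytes for file1, 48 bytes for file1_1).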

(7) Data blocks fragments analysis

(a)Data blocks

The image file we are testing uses xz compression with default settings, so nothing is recorded in Compression options; that component is empty.

The Data blocks therefore follow immediately after the Superblock.

From the inode table and the directory table we know that file1, whose inode number is 2, is stored at the very beginning.

Because the data here is uncompressed, the data starting at address 0x60 of the image file should be identical to the contents of file1.
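This comparison is easy to automate. The sketch below is only valid under the conditions of this test image: the data blocks are uncompressed, the file's blocks are stored contiguously, and none of the file lives in a fragment block. The function name image_region_matches_file is made up for this example.

```python
def image_region_matches_file(image_path, file_path, offset=0x60):
    """Check that the image bytes starting at offset equal the file's contents.

    Assumption: the file is stored uncompressed and contiguously at offset.
    """
    with open(file_path, "rb") as f:
        expected = f.read()
    with open(image_path, "rb") as img:
        img.seek(offset)
        actual = img.read(len(expected))
    return actual == expected
```

With the files from this article, the call would be image_region_matches_file("squashfs_urandom.sqsh", "squashfs_urandom/test1/file1"), where 0x60 is the blocks start value read from the inode of file1.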

(b)fragments

The fragments block is designed to store small files, packing several of them into a single block. The tail of a larger file, the part left over after its last full block, may also be stored in a fragments block.

Which data is stored in which fragment block is recorded in the fragment table.
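A fragment table entry is small enough to decode by hand. The sketch below assumes the SquashFS 4.0 entry layout (a 64-bit start position, a 32-bit size word using the same bit 24 "uncompressed" flag as block_sizes, and 32 unused bits). The 32 bytes in FRAG_DUMP are the last two lines of the directory table hexdump above, after the 20 80 metadata header; reading them as the two fragment entries is an interpretation based on the format and on frag count being 2, not something the dump itself states.

```python
import struct

UNCOMPRESSED_BLOCK = 1 << 24  # same flag bit as in block_sizes entries


def parse_fragment_entries(raw, count):
    """Decode count 16-byte fragment entries from raw."""
    entries = []
    for i in range(count):
        start, size_word, _unused = struct.unpack_from("<QII", raw, i * 16)
        entries.append({
            "start": start,                                   # position in the image
            "size": size_word & (UNCOMPRESSED_BLOCK - 1),     # on-disk size in bytes
            "uncompressed": bool(size_word & UNCOMPRESSED_BLOCK),
        })
    return entries


# Bytes at 0x4e9c4e..0x4e9c6d in the directory table dump above.
FRAG_DUMP = bytes.fromhex(
    "6070170000000000 00900101 00000000"
    "60a84d0000000000 00f00001 00000000"
)

fragments = parse_fragment_entries(FRAG_DUMP, count=2)
for frag in fragments:
    print(hex(frag["start"]), frag["size"], frag["uncompressed"])
```

Under this reading, the image holds two uncompressed fragment blocks of 102400 and 61440 bytes, which is consistent with the "Number of fragments 2" line printed by unsquashfs.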

(iii) How squashfs works

(1) Mounting the file system

When squashfs is mounted, the system first reads the superblock to get the basic information about the file system and the location of each table.

(2) Accessing files or directories

The system gets the locations of the inode table and the directory table from the superblock.
When a directory is accessed, the system looks it up in the directory table and gets the name and inode number of each file and subdirectory in it.
The inode of a file or directory is then read from the inode table at the location given by its directory entry, which provides the file's metadata and the location of its data blocks.
For small files, or the tail of a large file, the system uses the information in the inode to look up the fragment table and get the location of the fragment data.
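The starting point of every lookup is the root inode field of the superblock. In the SquashFS 4.0 format this is an inode reference: the bits above the low 16 give the position of a metadata block relative to the start of the inode table, and the low 16 bits give the offset inside that block once it is uncompressed. The sketch below decodes the value 0x2ac from the superblock dump; both function names are made up for this example, and the address calculation only holds when the metadata block is stored uncompressed, as in this test image.

```python
def split_inode_ref(ref):
    """Split an inode reference into (metadata block start, offset in block)."""
    return (ref >> 16) & 0xFFFFFFFF, ref & 0xFFFF


def uncompressed_inode_address(inode_table_start, block_start, offset):
    """Absolute position of an inode in the image file.

    Assumption: the metadata block is stored uncompressed, so the inode can be
    addressed directly; the +2 skips the 2-byte metadata block header.
    """
    return inode_table_start + block_start + 2 + offset


block_start, offset = split_inode_ref(0x2AC)  # root inode from the superblock
address = uncompressed_inode_address(0x4E9860, block_start, offset)
print(block_start, offset, hex(address))
```

The root inode therefore sits 32 bytes before the directory table start (0x4e9b2e), at the very end of the inode table. The same two numbers appear in the directory table: the header's start field and each entry's offset field locate the inode of that entry.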

(iv) Squashfs advantages and disadvantages

(1) Advantages

High compression ratio: SquashFS uses compression algorithms such as gzip, lzma, lz4, and xz, which can significantly reduce the size of the file system and save storage space.

Read-only: Ideal for environments where data integrity needs to be protected, such as embedded systems and read-only operating system images.

Efficient random access: SquashFS supports efficient random read access and is suitable for read-frequent scenarios.

Fragmentation: With Fragment Table, SquashFS can effectively handle small files, reduce storage fragmentation and improve storage efficiency.

Storage and performance optimization: Supports compression of files, directories, and inodes, reducing storage footprint and I/O operations and improving performance.

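Data integrity: The image is read-only, so its contents cannot be modified accidentally at run time. Note that the SquashFS 4.0 format itself does not define per-block checksums; any corruption check comes from the compressor's stream format (for example the check embedded in xz streams) or from mechanisms outside the file system.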

(2) Disadvantages

Read-only: SquashFS is read-only, so files and directories in the file system cannot be modified directly. When the file system needs to be updated or changed, the entire file system image must be regenerated.

Compression overhead: Although reads are fast, decompression still requires CPU resources. On low-performance embedded systems this may affect system performance.

Memory consumption: Decompression can consume a lot of memory when reading large files, which can become a bottleneck, especially on resource-constrained embedded systems.

Summary

The above describes the data components of the squashfs file system and how they work with each other as well as the advantages and disadvantages of the squash file system.

Here's a question: if the root file system is squashfs and the main executable is also located in the root file system, how can the root file system be upgraded without a dual-partition backup scheme?

Can the main program write the new squashfs image directly to the mtdblock device that holds the root file system? Is there a risk that the update will leave the root file system in an abnormal state?

-------------------End-------------------

For more articles, follow the liwen01 official account.