Preamble
Recently, our online services started acting up in strange ways: static resources on web pages failing to load, back-end requests erroring out for no obvious reason, Redis throwing errors...
When I SSHed into the server it got even worse: the terminal lagged on every command I typed...
If I hadn't seen this before I might have panicked, but after many years as a Linux user I know that "the storage is full" is usually what's behind all of these anomalies!
The question then becomes: where did the disk space go?
Initial troubleshooting
Checking with df
Start with the most common command:
df -h
This shows each disk and its mount point.
After running it on my server, I found that the free space on the root partition was down to a few hundred KB...
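For illustration, the output looks something like this (hypothetical numbers, not my actual server; the key signal is the Use% column of the root filesystem):
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        40G   40G  356K 100% /
/dev/vdb1       200G   80G  120G  40% /data
tmpfs           2.0G     0  2.0G   0% /dev/shm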
Digging deeper with du
Use du to see how much space each subdirectory occupies, then sort the results with sort:
sudo du -h --max-depth=1 / | sort -hr
Parameter notes:
- sudo: counting the root directory directly requires root privileges.
- --max-depth=1: only show the current directory and its immediate subdirectories.
- sort -hr: sort by human-readable size, in descending order.
To see the total space used by a directory:
du -sh /
Better Tools
The du approach above still isn't great, mainly because it prints too much output.
This time I used the ncdu tool.
ncdu is a more user-friendly disk usage analyzer with an interactive interface.
Many distributions don't ship it by default, so you need to install it first:
sudo apt install ncdu
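That's for Debian/Ubuntu. On RHEL/CentOS-family systems the package should also be available (via EPEL on older releases), roughly:
sudo dnf install ncdu    # Fedora / newer RHEL derivatives
sudo yum install ncdu    # older CentOS, needs EPEL enabled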
Analyzing Root Usage with ncdu
sudo ncdu -x /
The -x flag limits the scan to the current filesystem, so it won't cross mount points. Several other disks are mounted under the root directory, and the earlier df analysis already showed that it's the main disk that's full, so I only need to look at that one.
Once started, ncdu scans every file and then drops into an interactive interface where you can see how much space each directory occupies, sorted from largest to smallest (navigate with the arrow keys, quit with q).
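The interactive view looks roughly like this (numbers are made up, purely to show the format):
--- / -----------------------------------------------
   18.5 GiB [##########] /var
    4.2 GiB [##        ] /usr
    1.1 GiB [          ] /home
  512.0 MiB [          ] /opt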
The main culprit
In ncdu it was immediately obvious: the /var/lib/docker directory takes up more than 70% of the storage space. Outrageous!
Analyzing with du
sudo du -h --max-depth=1 /var/lib/docker | sort -hr
The breakdown looked roughly like this (just an example, not real data):
10G /var/lib/docker/overlay2
5G /var/lib/docker/volumes
1G /var/lib/docker/containers
500M /var/lib/docker/image
So it's basically Docker images, container logs, and volumes eating up the disk.
Cleaning up unused resources in Docker
Now that the problem is found, fixing it is straightforward.
First, stop any containers you no longer need, then run the cleanup commands Docker provides.
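For reference, a minimal way to review and stop containers you no longer need (the container ID below is a placeholder):
docker ps -a                 # list all containers, including stopped ones
docker stop <container-id>   # stop a container you no longer need
docker rm <container-id>     # optionally remove it afterwards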
Clean up unused resources (images, containers, volumes and networks)
Docker provides the docker system prune command to clean up unused resources:
docker system prune -a
- -a: also removes all unused images (including images not associated with any container).
- Note: this command does not remove resources still in use by running containers, but proceed with caution anyway.
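If you want to go further or be more selective, docker system prune also accepts a --volumes flag and filters; for example (use with care, pruned data cannot be recovered):
docker system prune -a --volumes              # additionally prune unused volumes
docker system prune -a --filter "until=24h"   # only touch resources older than 24 hours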
Purge only unused volumes
If /var/lib/docker/volumes is what's taking up the space:
docker volume prune
Clean up only unused networks
If /var/lib/docker/network is the heavy one:
docker network prune
Deleting Unwanted Containers, Images, and Volumes
Deleting non-running containers
docker container prune
Deleting unused images
docker image prune -a
Deleting unused volumes
docker volume prune
Cleaning up old images and unused tags
If you build or pull a lot of images, many of the older versions are probably no longer needed.
List images sorted by size
docker images --format "{{.Repository}}:{{.Tag}} {{.Size}}" | sort -k2 -h
Deleting a Specified Image
docker rmi <image-id>
What's so big?
While analyzing the storage usage, I also found one particularly outrageous file: the log file under the following directory had grown past 500 GB...
/var/lib/docker/containers/e4b5a99b429a612885417460214ea40a6a49a3360c29180af800ff7aef4c03df/
Finding the offending container
Let's find out which container dumped all of this.
The log file path contains the container ID: e4b5a99b429a612885417460214ea40a6a49a3360c29180af800ff7aef4c03df. Use it to find out which container it is:
docker ps | grep e4b5a99b429a
If the container doesn't show up in the list, it may have been stopped or deleted. In that case, use docker inspect to check its details:
docker inspect e4b5a99b429a
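If the full inspect output is too noisy, a --format template can pull out just the name and state (assuming the container metadata still exists):
docker inspect --format '{{.Name}} {{.State.Status}}' e4b5a99b429a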
Analyzing Log Contents
Use tail or less to view the log contents and check for abnormal output:
sudo tail -n 50 /var/lib/docker/containers/e4b5a99b429a612885417460214ea40a6a49a3360c29180af800ff7aef4c03df/*-json.log
I won't post the actual log contents; they looked normal enough. The container has simply been running for a very long time, and the log built up over time...
See the next subsection for ways to deal with this issue.
Cleaning up Docker log files
If /var/lib/docker/containers takes up a lot of space, it is probably because container log files have grown too large.
Viewing Log Files
The logs for each container are stored at /var/lib/docker/containers/<container-id>/<container-id>-json.log.
Use the following command to find the largest log file:
sudo find /var/lib/docker/containers/ -type f -name "*.log" -exec du -h {} + | sort -hr | head -n 10
Manual log cleaning
Empty the log file of a specific container:
sudo truncate -s 0 /var/lib/docker/containers/<container-id>/<container-id>-json.log
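If many containers have bloated logs, the same trick can be applied to all of them in one go (a shortcut; the logs are gone for good, so make sure you don't need them):
sudo sh -c 'truncate -s 0 /var/lib/docker/containers/*/*-json.log'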
Setting log file size limits
Limit the log size in Docker's configuration file (recommended):
- Edit the Docker configuration file (usually /etc/docker/daemon.json):
sudo nano /etc/docker/daemon.json
- Add or modify the following configuration:
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
- max-size: a single log file is capped at 10 MB.
- max-file: at most 3 log files are kept.
- Reload Docker:
sudo systemctl restart docker
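Note that settings in daemon.json only apply to containers created after the restart; existing containers keep their old logging configuration until they are recreated. The same limits can also be set per container at run time, e.g. (nginx here is just an example image):
docker run -d --log-driver json-file --log-opt max-size=10m --log-opt max-file=3 nginx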
Migrating /var/lib/docker to a new disk
Another option is to move /var/lib/docker onto another disk, relieving the storage pressure on the current one. This is common practice, especially when the machine has multiple disks.
Method 1: Move /var/lib/docker directly to a new disk
Stopping the Docker Service
You need to stop the Docker service before migrating the data:
sudo systemctl stop docker
Move /var/lib/docker to the new location
Assuming the new disk is mounted at /mnt/new-disk, run:
sudo mv /var/lib/docker /mnt/new-disk/docker
Creating a symlink
Link the new path back to /var/lib/docker so Docker can keep using the default path:
sudo ln -s /mnt/new-disk/docker /var/lib/docker
Starting the Docker Service
sudo systemctl start docker
Verify
Run the following command to ensure that Docker is working properly:
docker info
Method 2: Modify the Docker configuration file to specify the new storage location
Stopping the Docker Service
sudo systemctl stop docker
Move /var/lib/docker to the new disk
Migrate the existing data to the new disk's mount point. For example, with the new disk mounted at /mnt/new-disk:
sudo mv /var/lib/docker /mnt/new-disk/docker
Modifying the Docker Configuration
Edit Docker's configuration file (usually /etc/docker/daemon.json) and specify the new storage path:
sudo nano /etc/docker/daemon.json
Add or modify the following:
{
"data-root": "/mnt/new-disk/docker"
}
Starting the Docker Service
sudo systemctl start docker
Verify
Double-check that Docker is running properly:
docker info | grep "Docker Root Dir"
You should see the new path (e.g. /mnt/new-disk/docker).
Method 3: Mount the new disk directly at /var/lib/docker
If you want to use the new disk directly as the mount point for /var/lib/docker, do the following.
Formatting a new disk
Assuming the new disk is /dev/sdb1, first create a filesystem on it (e.g. ext4):
sudo mkfs.ext4 /dev/sdb1
Mounting a new disk
Mount the new disk at /var/lib/docker:
sudo mount /dev/sdb1 /var/lib/docker
Migrating existing data
If /var/lib/docker already contains data, it has to be copied onto the new disk before you mount over the directory. Mount the new disk somewhere temporary (e.g. /mnt/new-disk) and copy the data across:
sudo rsync -a /var/lib/docker/ /mnt/new-disk/
Then unmount it from /mnt/new-disk and mount it at /var/lib/docker as shown above.
Modifying /etc/fstab to mount automatically at boot
Edit the /etc/fstab file and add a line for the mount:
/dev/sdb1 /var/lib/docker ext4 defaults 0 2
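Device names like /dev/sdb1 can change across reboots, so it's often safer to reference the filesystem by UUID; look it up with blkid and use it in fstab (the UUID below is a placeholder):
sudo blkid /dev/sdb1
UUID=<uuid-from-blkid> /var/lib/docker ext4 defaults 0 2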
Caveats
- Data migration risks: before moving or rebuilding /var/lib/docker, be sure to back up important data (e.g. persistent volume data).
- Permission issues: make sure the new directory has the same permissions as the original:
sudo chown -R root:root /mnt/new-disk/docker
- Checking the mount point: make sure the new disk mounts successfully and is set to mount automatically, so the path doesn't disappear after a reboot.
With any of the methods above, you can move /var/lib/docker onto another disk to relieve storage pressure and improve the storage layout.
Rebuild /var/lib/docker
If you still run out of space after cleaning up, you can rebuild Docker's storage directory from scratch (this removes all containers, images, and data).
Stop the Docker service:
sudo systemctl stop docker
Backup existing Docker data (optional):
sudo mv /var/lib/docker /var/lib/docker.bak
Create a new empty directory:
sudo mkdir /var/lib/docker
Start the Docker service:
sudo systemctl start docker
Summary
Troubleshooting this disappearing-disk-space problem on the Linux server turned into a complete hands-on exercise in storage analysis and optimization.
Key Steps Summarized:
- Initial troubleshooting of storage usage
  - Used du, ncdu and similar tools to quickly locate the directories taking up the most space.
  - Found that the /var/lib/docker directory accounted for the bulk of the usage.
- In-depth positioning of the specific issue
  - Found the path of the offending container log file and confirmed, via the container ID, which container was generating huge logs.
  - Used docker inspect and docker logs to analyze the log contents further.
- Fixing the problem
  - Emptied the oversized container log file with truncate to release the space immediately.
  - Modified the Docker configuration file (daemon.json) to limit log file size and prevent the problem from recurring.
- Validation and optimization
  - Restarted the Docker service and verified that everything works properly.
  - Used docker system prune to clean up unused resources and put a log management strategy in place.
Personal Takeaways and Reflections:
Working through this problem drove home a few points:
- Importance of system monitoring: Timely monitoring of storage usage can prevent problems from escalating.
- Log management best practices: log files growing without bound are a common cause of disk exhaustion, so sensible size limits must be set.
- Efficient use of tools: du, ncdu, and the various Docker commands greatly sped up the troubleshooting.
- Routine maintenance habits: periodically cleaning up unused container resources (e.g. stopped containers, unused images) keeps the system running healthy.
This exercise not only solved the disk space problem, but also gave me a deeper understanding of Linux system administration and Docker operations. In future ops work I'll pay more attention to monitoring and optimization so that similar problems are caught early.