
The Mystery of the Disappearing Hard Drive Space: The Whole Process of Linux Server Storage Troubleshooting and Optimization


Preface

Recently, our online services have been showing some strange problems: static resources on web pages failing to load, backend requests erroring out for no obvious reason, Redis throwing errors...

When I SSHed into the server, things were even worse: even typing a command lagged...

If I hadn't run into this before I might have panicked, but after years of using Linux I know that "the disk is full" is usually what's behind all of these anomalies!

The question then arises, "Where did the hard disk space go?"

Initial Checks

Let's start with df.

First, the most common command:

df -h

This shows each disk and its mount point.

After running it on my server, I found that free space on the root filesystem was down to a few hundred KB...
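The output looked roughly like this (illustrative numbers only, not the real ones; device names and mount points will differ on your machine):

Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        40G   40G  300K 100% /
/dev/vdb1       200G   80G  120G  40% /mnt/data
tmpfs           2.0G     0  2.0G   0% /dev/shm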

And then use du to look at the specifics.

Use the du command to see how much space each subdirectory occupies, and then sort the subdirectories with the sort command.

sudo du -h --max-depth=1 / | sort -hr

Parameter Description:

  • sudo: counting the root directory directly requires root privileges.
  • --max-depth=1: Displays only the occupancy of the current directory and its next level subdirectories.
  • sort -hr: Descending order by size.

To view the total space used by a directory:

du -sh /

Better Tools

The du approach above still isn't ideal, mainly because it lists too much output.

This time I used the ncdu tool

ncdu is a more user-friendly disk usage analysis tool that supports an interactive interface.

However, many distributions don't have it built in and you need to install it first:

sudo apt install ncdu

Analyzing Root Occupancy with ncdu

sudo ncdu -x /

The -x option limits the scan to the current filesystem, so it won't cross mount points. Several other disks are mounted under the root directory, and the earlier df analysis showed that it was the main disk that was full, so I only need to look at the main disk.

After ncdu starts, it scans every file and then drops into an interactive interface where you can see how much space each directory takes up, sorted from largest to smallest.

The Main Culprit

In ncdu it's immediately obvious: the /var/lib/docker directory alone accounts for more than 70% of the used space. There's our culprit!

Analyzing with du

sudo du -h --max-depth=1 /var/lib/docker | sort -hr

The approximate breakdown looks something like this (just an example, not real data):

10G     /var/lib/docker/overlay2
5G      /var/lib/docker/volumes
1G      /var/lib/docker/containers
500M    /var/lib/docker/images

So it's basically the Docker images, container logs, and volumes that are eating up the disk.
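Incidentally, Docker can summarize its own disk usage by images, containers, local volumes, and build cache, which makes a quick cross-check against the du numbers:

# Summarize Docker's disk usage; add -v for a per-image / per-container breakdown
docker system df
docker system df -v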

Cleaning up unused resources in Docker

Now that we've found the problem, fixing it is straightforward.

First, stop the containers that are no longer needed.
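For example, something along these lines (a sketch only; review the list before stopping anything, and the container names here are made-up placeholders):

# List all containers, including stopped ones
docker ps -a

# Stop the containers you no longer need (replace with your own names or IDs)
docker stop old-web old-worker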

Then run the cleanup commands that Docker provides.

Clean up unused resources (images, containers, volumes and networks)

Docker provides the docker system prune command for cleaning up unused resources.

docker system prune -a
  • -a: deletes all unused images (including images not associated with any container).
  • Note: this command does not remove resources that are still in use by running containers, but proceed with caution anyway.
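One detail worth knowing: by default docker system prune does not remove volumes. If you want unused volumes cleaned up in the same pass, there is a --volumes flag (double-check what is about to be deleted before confirming):

# Also remove unused volumes along with containers, networks and images
docker system prune -a --volumes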

Purge only unused volumes

If /var/lib/docker/volumes is what's taking up the space:

docker volume prune

Clean up only unused networks

If /var/lib/docker/network is the heavy one:

docker network prune

Deleting Unwanted Containers, Images, and Volumes

Deleting non-running containers

docker container prune

Deleting unused images

docker image prune -a

Deleting unused volumes

docker volume prune

Cleaning up old images and unused tags

If you build or pull images frequently, many of the older versions may no longer be needed.

List images sorted by size

docker images --format "{{.Repository}}:{{.Tag}} {{.Size}}" | sort -k2 -h

Deleting a Specified Image

docker rmi <image-id>

What's so big?

While analyzing the storage usage I also found one particularly outrageous offender: the directory below had grown past 500 GB...

/var/lib/docker/containers/e4b5a99b429a612885417460214ea40a6a49a3360c29180af800ff7aef4c03df/

Finding the offending container

Let's find out which container is responsible for all this.

The log file's path contains the container ID: e4b5a99b429a612885417460214ea40a6a49a3360c29180af800ff7aef4c03df

Use it to find out which container it is:

docker ps | grep e4b5a99b429a

The container did not show up in the list (docker ps only lists running containers), so perhaps it had been stopped or deleted. In that case docker inspect can still be used to check the details:

docker inspect e4b5a99b429a
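If the full inspect output is too noisy, a Go-template format string can pull out just the interesting fields, for example the container name and the path of its log file:

# Print only the container name and the path of its json log file
docker inspect --format '{{.Name}} {{.LogPath}}' e4b5a99b429a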

Analyzing Log Contents

You can use tail or less to look at the log contents and check for abnormal output:

sudo tail -n 50 /var/lib/docker/containers/e4b5a99b429a612885417460214ea40a6a49a3360c29180af800ff7aef4c03df/e4b5a99b429a612885417460214ea40a6a49a3360c29180af800ff7aef4c03df-json.log

I won't post the actual log contents; they looked normal enough. The container had simply been running for a long time, and the logs had accumulated over time...

See the next subsection for ways to deal with this issue.

Cleaning up docker log files

If /var/lib/docker/containers is taking up a lot of space, the likely cause is that container log files have grown too large.

Viewing Log Files

Each container's logs are stored at /var/lib/docker/containers/<container-id>/<container-id>-json.log (with the default json-file log driver).

Use the following command to find the largest log file:

sudo find /var/lib/docker/containers/ -type f -name "*.log" -exec du -h {} + | sort -hr | head -n 10

Manual log cleaning

Empty the log file of a specific container. Use truncate rather than deleting the file, because Docker keeps the log file open and deleting it would not free the space until the container or the daemon is restarted:

sudo truncate -s 0 /var/lib/docker/containers/<container-id>/<container-id>-json.log

Setting log file size limits

Limit the log size in Docker's configuration file (recommended):

  1. Edit the Docker daemon configuration file (usually /etc/docker/daemon.json):

    sudo nano /etc/docker/daemon.json
    
  2. Add or modify the following configuration:

    {
      "log-driver": "json-file",
      "log-opts": {
        "max-size": "10m",
        "max-file": "3"
      }
    }
    
    • max-size: Maximum 10 MB for a single log file.
    • max-file: 3 log files are retained.
  3. Restart Docker:

    sudo systemctl restart docker
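Note that this daemon-level setting only applies to containers created after the restart; existing containers keep their old log configuration. If you only want to cap the logs of a single container, the same options can also be passed when the container is created, roughly like this (the image name is just a placeholder):

# Per-container log limits: 10 MB per file, at most 3 rotated files
docker run -d --log-driver json-file --log-opt max-size=10m --log-opt max-file=3 nginx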
    

Migrating /var/lib/docker to a new disk

You can mount /var/lib/docker on another disk to relieve the storage pressure on the current one. This is common practice, especially when the machine has several disks.

Method 1: Migrate /var/lib/docker directly to a new disk

Stopping the Docker Service

You need to stop the Docker service before migrating the data:

sudo systemctl stop docker

Move /var/lib/docker to the new location

Assuming the new disk is mounted at /mnt/new-disk, run:

sudo mv /var/lib/docker /mnt/new-disk/docker

Create a symlink

Link the new path back to /var/lib/docker so that Docker keeps working with its default path:

sudo ln -s /mnt/new-disk/docker /var/lib/docker

Starting the Docker Service

sudo systemctl start docker

Verify

Run the following command to ensure that Docker is working properly:

docker info

Method 2: Modify the Docker configuration file to specify the new storage location

Stopping the Docker Service

sudo systemctl stop docker

Move /var/lib/docker to the new disk

Migrate the existing data to the new disk's mount point. For example, with the new disk mounted at /mnt/new-disk:

sudo mv /var/lib/docker /mnt/new-disk/docker

Modifying the Docker Configuration

Edit Docker's daemon configuration file (usually /etc/docker/daemon.json) and specify the new storage path:

sudo nano /etc/docker/daemon.json

Add or modify the following:

{
  "data-root": "/mnt/new-disk/docker"
}

Starting the Docker Service

sudo systemctl start docker

Verify

Double-check that Docker is running properly:

docker info | grep "Docker Root Dir"

You should see the new path (e.g. /mnt/new-disk/docker).

Method 3: Mount the new disk directly at /var/lib/docker

If you want to use the new disk directly as the mount point for /var/lib/docker, proceed as follows.

Format the new disk

Assuming the new partition is /dev/sdb1, first create a filesystem on it (e.g. ext4):

sudo mkfs.ext4 /dev/sdb1
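A small safety check before formatting: confirm that /dev/sdb1 really is the new, empty disk, since mkfs wipes whatever is on the partition:

# Show block devices, their sizes and any existing filesystems
lsblk -f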

Mount the new disk

Mount the new disk at /var/lib/docker:

sudo mount /dev/sdb1 /var/lib/docker

Migration of existing data

If /var/lib/docker already contains data, that data needs to be copied onto the new disk first (here the new disk is temporarily mounted at /mnt/new-disk):

sudo rsync -a /var/lib/docker/ /mnt/new-disk/

Then mount the new disk at /var/lib/docker.
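Putting method 3 together, the whole sequence looks roughly like this (a sketch, assuming the new partition is /dev/sdb1 and /mnt/new-disk is used as a temporary mount point):

sudo systemctl stop docker                      # stop Docker before touching its data
sudo mkdir -p /mnt/new-disk
sudo mount /dev/sdb1 /mnt/new-disk              # mount the new disk at a temporary location
sudo rsync -a /var/lib/docker/ /mnt/new-disk/   # copy the existing data over
sudo umount /mnt/new-disk                       # unmount from the temporary location
sudo mount /dev/sdb1 /var/lib/docker            # remount the disk at its final path
sudo systemctl start docker                     # bring Docker back up

# Note: the old data under /var/lib/docker is now hidden beneath the mount;
# it can be removed later, once everything is confirmed to work, to reclaim the space.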

Modify /etc/fstab so the disk is mounted automatically at boot

Edit the /etc/fstab file and add a line for the mount:

/dev/sdb1  /var/lib/docker  ext4  defaults  0  2
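Device names like /dev/sdb1 can change between reboots, so a more robust variant is to reference the partition by UUID (look it up with blkid and put that in the fstab line instead):

# Find the partition's UUID
sudo blkid /dev/sdb1

# Then reference it in /etc/fstab instead of the device name, for example:
# UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /var/lib/docker  ext4  defaults  0  2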

Caveats

  1. Data migration risk: before moving or rebuilding /var/lib/docker, be sure to back up important data (e.g. persistent volume data).

  2. Permission issues: Make sure the permissions of the new directory are the same as the original directory:

    sudo chown -R root:root /mnt/new-disk/docker
    
  3. Check the mount point: make sure the new disk is mounted successfully and that automatic mounting is configured, so the path does not go missing after a reboot.

With any of the methods above, /var/lib/docker can be moved onto another disk, relieving the storage pressure and improving the storage layout.

Rebuild /var/lib/docker

If you still run out of space after cleaning up, you can rebuild Docker's storage directory from scratch (this removes all containers, images, and data):

Stop the Docker service:

sudo systemctl stop docker

Backup existing Docker data (optional):

sudo mv /var/lib/docker /var/lib/docker.bak

Create a new empty directory:

sudo mkdir /var/lib/docker

Start the Docker service:

sudo systemctl start docker

Summary

Troubleshooting this disappearing-disk-space problem on a Linux server turned into a complete hands-on exercise in storage analysis and optimization.

Key Steps Summarized:

  1. Initial troubleshooting of storage usage
    • Used du and ncdu to quickly locate the directories taking up the most space.
    • Found that the /var/lib/docker directory was consuming the bulk of the storage.
  2. Pinpointing the specific issue
    • Located the path of the oversized container log file and used the container ID in the path to confirm which container was producing it.
    • Used docker inspect and docker logs to analyze the log contents further.
  3. Cleanup
    • Emptied the oversized container log file with truncate to release the space immediately.
    • Modified the Docker configuration file (daemon.json) to limit log file size and prevent the problem from recurring.
  4. Validation and optimization
    • Restarted the Docker service and verified it was working properly.
    • Used docker system prune to clean up unused resources and put a log management strategy in place.

Personal Takeaways and Reflections:

Solving this problem drove home a few lessons:

  • Importance of system monitoring: Timely monitoring of storage usage can prevent problems from escalating.
  • Log Management Best Practices: Excessive growth of log files is a common cause of storage occupancy, and reasonable log size limits must be set.
  • Efficient use of tools: du, ncdu, and the Docker CLI greatly speed up troubleshooting.
  • Routine Maintenance Habits: Periodic cleanup of useless container resources (e.g., stopped containers, unused images) keeps the system running healthy.

This practice not only solved the disk space problem, but also gave me a deeper understanding of Linux system management and Docker operation and maintenance. In my future operation and maintenance work, I will pay more attention to system monitoring and optimization to prevent similar problems in advance.