golang optimizations for traversing directory operations

In a flash go1.23 is almost released, time flies.

But today we're going to rewind three and a half years back to focus on an optimization introduced in go 1.16 regarding handling directories.

The most impressive thing about the new changes in go1.16 is probably the massive refactoring of the io package, but this refactoring actually introduces one more optimization, which this post is going to talk about.

This article defaults to a Linux environment, but this optimization is common on BSD systems as well.

Optimization when traversing directories

Traversing directories is a very common requirement, especially for directories with a large number of files, and the performance of the traversal is directly related to the overall program performance.

go1.16 adds several new interfaces for traversing directories:，(*).ReadDir，。

The most significant feature of these interfaces is the use of theIndicates that instead of。is an interface that provides an interface similar to theThe methodology:

type DirEntry interface {
        Name() string
        IsDir() bool
        Type() FileMode
        Info() (FileInfo, error)
}

It also provides a method called Info in order to get the value of the。

What's so amazing about this interface? Let's look at the performance test:

func IterateDir(path string) int {
    // go1.16 (used form a nominal expression) 就是这么实现(used form a nominal expression)，To test this we expanded it into a pair of(*).ReadDir(used form a nominal expression)调用
	f, err := (path)
	if err != nil {
		panic(err)
	}
	defer ()

	files, err := (-1)
	if err != nil {
		panic(err)
	}
	length := 0
	for _, finfo := range files {
		length = max(length, len(()))
	}
	return length
}

func IterateDir2(path string) int {
    // 1.16之前遍历目录(used form a nominal expression)常用方法之一
	f, err := (path)
	if err != nil {
		panic(err)
	}
	defer ()

	files, err := (-1)
	if err != nil {
		panic(err)
	}
	length := 0
	for _, finfo := range files {
		length = max(length, len(()))
	}
	return length
}

func BenchmarkIter1(b *) {
	for range {
		IterateDir("../test")
	}
}

func BenchmarkIter2(b *) {
	for range {
		IterateDir2("../test")
	}
}

The test directory is a directory of 5000 files located on the Btrfs file system. Our test case traverses the directory and finds the filename length of the file with the longest name.

Here are the test results:

You can see that the optimized traversal is 480% faster than the original. Why is there such a big improvement with a different function? Read on if you want to know the answer.

Principles of Optimization

Before we go any further, let's look at how the old interface got information about the files in the directory. The answer is to traverse the directory to get the path and then call theGet complete file information:

func (f *File) Readdir(n int) ([]FileInfo, error) {
	if f == nil {
		return nil, ErrInvalid
	}
	_, _, infos, err := (n, readdirFileInfo)
	if infos == nil {
		// Readdir has historically always returned a non-nil empty slice, never nil,
		// even on error (except misuse with nil receiver above).
		// Keep it that way to avoid breaking overly sensitive callers.
		infos = []FileInfo{}
	}
	return infos, err
}

this oneIt will change its behavior according to the value of the second parameter, and depending on the value it will either follow the behavior of the old code before 1.16 or adopt new optimizations. This function is implemented differently on different systems, so let's take a look at the *nix implementation:

func (f *File) readdir(n int, mode readdirMode) (names []string, dirents []DirEntry, infos []FileInfo, err error) {
	...

	for n != 0 {
		// Getting data for a catalog item using a system call
        // Meta-information about directory entries is generally stored in the data of the directory itself，So reading this information is very similar to reading a normal file
		if >= {
			 = 0
			var errno error
			, errno = (*)
			(f)
			if errno != nil {
				return names, dirents, infos, &PathError{Op: "readdirent", Path: , Err: errno}
			}
			if <= 0 {
				break // EOF
			}
		}

		buf := (*)[:]
		reclen, ok := direntReclen(buf)
		if !ok || reclen > uint64(len(buf)) {
			break
		}
        // Pay attention to this line.
		rec := buf[:reclen]

		if mode == readdirName {
			names = append(names, string(name))
		} else if mode == readdirDirEntry {
			// Here's the code to look at later
		} else {
			info, err := lstat( + "/" + string(name))
			if IsNotExist(err) {
				// File disappeared between readdir + stat.
				// Treat as if it didn't exist.
				continue
			}
			if err != nil {
				return nil, nil, infos, err
			}
			infos = append(infos, info)
		}
	}

	if n > 0 && len(names)+len(dirents)+len(infos) == 0 {
		return nil, nil, nil,
	}
	return names, dirents, infos, nil
}

ReadDirent corresponds to a system call on Linux.getdentsThis system call reads the directory entry information into a block of memory, and then the program can parse the data in this block of memory to get some information about the directory entry, which generally includes the file name, the type of the file, and whether the file is a directory or not.

After reading this information, the old code will use the file name to call lstat again, which is also a system call to get more complete information about the file, including the owner of the file, the size of the file, the date of modification of the file, and so on.

What's wrong with the old code? The big problems don't exist, and the interface is fairly easy to use, but there are some minor flaws:

Most of the time traversing a directory is mainly about getting attributes such as the names or types of files in the directory, obviouslyReturns too much information. This unused information wastes a lot of memory, and getting it takes extra time - lstat needs to go to disk io to get it, and files in directories aren't stored as tightly together as directory item information, they're scattered, so the burden of reading their meta-information one by one would be So the burden of reading their meta-information one by one would be very large.
Too many system calls are used. Since we have a lot of files in our test directory, but getdents may have to be called multiple times, let's just assume it's twice here. For each directory entry, you need to use lstat to get the details of the file, so that's another 5000 system calls, which adds up to 5002. The overhead of system calls is huge, and accumulating more than 5000 of them will bring about a performance degradation that is visible to the naked eye. In fact, linux itself has an optimization for lstat, there will not really be a situation where you have to repeatedly enter the system call 5000 times, but tens to hundreds of times is still needed.

The optimized code actually changes only one line, which is the(n, readdirDirEntry), the second parameter changes. The new code will go with the logic commented out above:

// rec := buf[:reclen] In case you forgot.recWhere is it from?
de, err := newUnixDirent(, string(name), direntType(rec))
if IsNotExist(err) {
	// File disappeared between readdir and stat.
	// Treat as if it didn't exist.
	continue
}
if err != nil {
	return nil, dirents, nil, err
}
dirents = append(dirents, de)

Replacing lstat is the function newUnixDirent, which can obtain a portion of the file's metadata without relying on additional system calls:

type unixDirent struct {
	parent string
	name string
	typ FileMode
	info FileInfo
}

func newUnixDirent(parent, name string, typ FileMode) (DirEntry, error) {
	ude := &unixDirent{
		parent: parent,
		name: name,
		typ: typ,
	}
    // Detect whether the file type information is valid
	if typ != ^FileMode(0) && !testingForceReadDirLstat {
		return ude, nil
	}

	info, err := lstat(parent + "/" + name)
	if err != nil {
		return nil, err
	}

	 = ().Type()
	 = info
	return ude, nil
}

The file name and type are obtained when parsing the directory entry, so it's straightforward to set them. However, not every file system supports storing the file type in the directory entry data, so the code does a fallback to use lstat to get the information again once it finds that the file type is invalid data.

If only the filename and the type of the file are used, then the entire traversal logic process ends here, and there is no need to call lstat if the file system provides support. so the entire traversal requires only two system calls. This is why the optimized solution is nearly five times faster.

For users who want to use other information such as file sizes, the optimization scheme actually has benefits, since lstat is now called deferred and on demand:

func (d *unixDirent) Info() (FileInfo, error) {
if ! = nil {
return , nil
}
    // Called only once
return lstat( + "/" + )
}

This also minimizes unnecessary system calls.

So the overall optimization principle is: try to make full use of the information provided by the file system itself + reduce system calls. The larger the directory to be traversed, the more obvious the optimization effect.

Support for optimization

As stated above, being able to do optimization requires the file system to store file type information in the directory entry data of the directory. This needs to be supported by the file system.

If the file system doesn't support it you end up relying on lstat to read the metadata of a specific file.

The information on the different file systems is really scattered, and there are quite a few outdated, so I spent a few days looking at the code + checking the docs to do some organizing:

btrfs, ext2, ext4: these filesystems support optimization, and man pages plus filesystem code confirms this.
OpenZFS: This filesystem is not in the Linux kernel, so it is not mentioned in the man pages, but it does support optimization.
xfs: optimization is supported, but you have to create the filesystem with something like -f -n ftype=1The only way to do this is by selecting the
F2FS, EROFS: the docs don't mention it, but see that it's supported in the kernel's code, which is located atxxx_readdirThis function is nearby.
fat32, exfat: the document did not mention, but look at the kernel code found to be supported, but the fat family of file types are not so fancy, only directories and ordinary files of these two kinds, so the code is very rough to determine the directory entry whether the dir flag is set up, there is a directory does not have a general count of ordinary files. Doing so is normal, because fat originally does not support other file types, after all, the file system does not even support soft links, not to mention the Unix Domain Socket and named pipes.
ntfs: supported, however asmarginal notesSaid, because ntfs and other filesystems handle type differently, resulting in although the filesystem itself supports most of the file types, but type information can only get the file is not a directory. So for files that are not directories, it will go to the disk and read the inode of the file and then get the file type from the inode - in fact, it is equivalent to executing lstat once, compared to lstat, it reduces the context switch when entering the system call, so the optimization effect on ntfs is not as good as on other file system.

If you look at it this way, basically all the major common file systems support this optimization.

That's why go1.16 introduced this optimization, not only is it widely supported but it's a big boost, and who doesn't love free acceleration.

How to use this optimization in other languages

As you can see here, you should find that this optimization is actually at the system level, and golang merely adapts it.

Indeed, so this optimization is not only available to golang, but also to c/c++/python.

First of all, how to take advantage of the c: directly with the system provided readdir function on the line, this function will call getdents, and then you can naturally eat the optimization. Cautions and go's the same, need to detect the file system support set d_type.

c++: Same as c. Also libstdc++'s filesystem takes the readdir implementation, so optimizations can be obtained with the filesystem standard library:

// /gcc-mirror/gcc/blob/master/libstdc++-v3/src/filesystem/#L270
inline file_type
get_file_type(const std::filesystem::__gnu_posix::dirent& d [[gnu::unused]])
{
#ifdef _GLIBCXX_HAVE_STRUCT_DIRENT_D_TYPE
  switch (d.d_type)
  {
  case DT_BLK:
    return file_type::block;
  case DT_CHR:
    return file_type::character;
  case DT_DIR:
    return file_type::directory;
  case DT_FIFO:
    return file_type::fifo;
  case DT_LNK:
    return file_type::symlink;
  case DT_REG:
    return file_type::regular;
  case DT_SOCK:
    return file_type::socket;
  case DT_UNKNOWN:
    return file_type::unknown;
  default:
    return file_type::none;
  }
#else
  return file_type::none;
#endif
}

// If the operating system and file system do not support，then fallback tolstat
// /gcc-mirror/gcc/blob/master/libstdc++-v3/include/bits/fs_dir.h#L342
file_type
_M_file_type() const
{
    if (_M_type != file_type::none && _M_type != file_type::symlink)
	    return _M_type;
    return status().type();
}

The only difference is that stat is also called if the target file is softwired.

python: use can get optimized, the bottom line is to use readdir like c:/python/cpython/blob/main/Modules/#L16211, the implementation methods and even the class names are very similar to golang, so I won't post the code.

summarize

go although performance has been criticized, but in the system programming is not ambiguous, the basic common optimization have to do, you can often pay attention to the new version of the release notes to see go in this regard to do the effort.

Looking at a simple optimization, the feasibility verification behind it is really complicated, especially since different file systems are very different in how to store extra metadata, and it took a lot of time just to look at the code.

The previously mentioned ntfs will be a bit discounted in terms of optimization, so I purposely took a Windows device and tested it under the same conditions:

As you can see there is almost no difference. If I hadn't looked at the ntfs driver for linux, I wouldn't have known that it would produce such a result. So this optimization doesn't work well on Windows, but it works on Linux and macOS.

Bold assumptions are made, careful proof is sought, and therein lies the joy of systems programming and performance optimization.

consultation

Logic for exfat's fuse driver to populate d_type:/relan/exfat/blob/master/libexfat/#L28

The Linux ntfs driver needs to get the inode of the file to get the correct file type:/torvalds/linux/blob/master/fs/ntfs3/#L337