
Efficient File Processing: A Practical Guide to Python pathlib

2024-12-12 20:40:18

When working with file paths in Python, using pathlib is highly recommended.

pathlib handles file paths in an object-oriented way, which avoids many pitfalls and makes many path-related operations easier.

This article summarizes common ways of using pathlib to process file paths.

1. Common operations

First, let's look at how to use pathlib for some routine file path operations.

1.1 Constructing paths

To build a path object, simply pass the file or folder path string to Path.

from pathlib import Path

fp = "D:\\temp\\pathlib"
path = Path(fp)
path

# path object
# WindowsPath('D:/temp/pathlib')

When the path object is constructed, Path automatically picks the class that matches the operating system the code runs on: WindowsPath on Windows, PosixPath on Linux.
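The flavor of a path object depends on the operating system running the code, not on the string passed in. A minimal sketch of this, assuming the pure path classes PureWindowsPath and PurePosixPath (which this article does not otherwise use) are acceptable for illustration, since they can be instantiated on any system:

```python
import os
from pathlib import Path, PosixPath, PurePosixPath, PureWindowsPath, WindowsPath

# Path() instantiates the concrete class for the OS the interpreter runs on.
current = Path("temp") / "pathlib"
expected_cls = WindowsPath if os.name == "nt" else PosixPath
is_native = isinstance(current, expected_cls)

# Pure paths perform no I/O, so either flavor can be built on any OS.
win = PureWindowsPath("D:\\temp\\pathlib")
posix = PurePosixPath("/tmp/pathlib")

print(is_native)       # True
print(win.as_posix())  # D:/temp/pathlib
print(posix.parts)     # ('/', 'tmp', 'pathlib')
```

Pure paths are handy for manipulating Windows-style paths on Linux (or the reverse) when no file system access is needed.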

1.2 Joining and splitting paths

The most troublesome part of joining and splitting paths as strings is handling the path separators used by different systems (\ and /).

Path objects avoid this nuisance.

new_path = path.joinpath("abc")
new_path
# WindowsPath('D:/temp/pathlib/abc')

new_path = Path(fp, "")
new_path
# WindowsPath('D:/temp/pathlib/')

Whether you call joinpath or create a Path object directly, the path is joined without specifying a path separator.
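A related option is the `/` operator, which pathlib overloads for joining. The sketch below uses PureWindowsPath so the result is the same on any operating system; the file name `notes.txt` is a hypothetical placeholder.

```python
from pathlib import PureWindowsPath

base = PureWindowsPath("D:\\temp\\pathlib")

# Three equivalent ways to append path segments.
via_joinpath = base.joinpath("abc", "notes.txt")
via_constructor = PureWindowsPath(base, "abc", "notes.txt")
via_operator = base / "abc" / "notes.txt"

print(via_joinpath == via_constructor == via_operator)  # True
print(via_operator)  # D:\temp\pathlib\abc\notes.txt
```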

Splitting paths with Path is also convenient: it provides several properties for getting file information.

my_path = Path(fp, "program.py")
my_path
# WindowsPath('D:/temp/pathlib/program.py')

# The full name of the file
my_path.name
# 'program.py'

# The directory containing the file
my_path.parent
# WindowsPath('D:/temp/pathlib')

# File name (without suffix)
my_path.stem
# 'program'

# File suffix
my_path.suffix
# '.py'

# Change the file suffix
my_path.with_suffix(".go")
# WindowsPath('D:/temp/pathlib/program.go')
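A few further properties are worth knowing when a file name has more than one suffix. This is a small sketch on a hypothetical file name, `archive.tar.gz`, using PurePosixPath so it behaves identically on every system:

```python
from pathlib import PurePosixPath

p = PurePosixPath("/tmp/pathlib/archive.tar.gz")  # hypothetical file name

print(p.parts)     # ('/', 'tmp', 'pathlib', 'archive.tar.gz')
print(p.suffix)    # .gz  (only the last suffix)
print(p.suffixes)  # ['.tar', '.gz']
print(p.stem)      # archive.tar
print(p.with_name("notes.txt"))  # /tmp/pathlib/notes.txt
```

Note that `stem` strips only the last suffix, so multi-part extensions need `suffixes` instead.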

1.3 Relative and absolute paths

To convert a relative path to an absolute path, the resolve method of the Path object is recommended.

path = Path("")
path
# WindowsPath('')

# Convert to absolute path
path.resolve()
# WindowsPath('D:/projects/python/samples/')
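Besides making a path absolute, resolve also collapses `..` segments. The following sketch runs inside a temporary directory so that it works on any machine; the folder name `samples` is taken from the output above, while `main.py` is a hypothetical file that does not need to exist.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp).resolve()
    sub = base / "samples"
    sub.mkdir()

    # A path with a redundant ".." segment; the file itself is never created.
    messy = sub / ".." / "samples" / "main.py"
    resolved = messy.resolve()

    is_abs = resolved.is_absolute()
    same = resolved == sub / "main.py"
    relative = resolved.relative_to(base)

print(is_abs)               # True
print(same)                 # True
print(relative.as_posix())  # samples/main.py
```

The relative_to method goes the other way, turning an absolute path back into one relative to a given parent.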

1.4 Traversing directories

Traversing directories is also a common file path operation.

fp = "D:\\temp\\pathlib\\a"
path = Path(fp)

for f in path.glob("*.txt"):
    print(f)

# D:\temp\pathlib\a\
# D:\temp\pathlib\a\
# D:\temp\pathlib\a\

The glob method only traverses files directly inside the directory; to include files in subdirectories as well, use the rglob method.

for f in path.rglob("*.txt"):
    print(f)

# D:\temp\pathlib\a\
# D:\temp\pathlib\a\
# D:\temp\pathlib\a\
# D:\temp\pathlib\a\sub_a\sub_1.txt
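To try the difference between glob and rglob without touching real files, here is a self-contained sketch that builds a small tree in a temporary directory. Apart from `sub_a/sub_1.txt`, the file names are hypothetical.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub_a").mkdir()
    for rel in ("1.txt", "2.txt", "notes.md", "sub_a/sub_1.txt"):
        (root / rel).touch()

    # Both methods return generators with no guaranteed order, so sort the results.
    top_level = sorted(p.name for p in root.glob("*.txt"))
    recursive = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.txt"))

print(top_level)  # ['1.txt', '2.txt']
print(recursive)  # ['1.txt', '2.txt', 'sub_a/sub_1.txt']
```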

1.5 Reading and writing files

The traditional way of reading and writing files is generally a two-step process: first open the file with the open function, then read or write.

# write
with open("d:\\", "w") as f:
    f.write("abcdefg")

# read
with open("d:\\", "r") as f:
    content = f.read()
    print(content)
    # abcdefg

With Path objects, reading and writing are simpler and the code is clearer.

fp = "d:\\"
path = Path(fp)
path.write_text("uvwxyz")

content = path.read_text()
print(content)
# uvwxyz
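In practice it is usually worth passing an explicit encoding, since the default depends on the platform. A minimal sketch, assuming UTF-8 and a hypothetical file name `demo.txt` inside a temporary directory:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo.txt"  # hypothetical file name

    # write_text opens, writes, and closes the file in one call;
    # it returns the number of characters written.
    written = path.write_text("uvwxyz", encoding="utf-8")
    text = path.read_text(encoding="utf-8")
    raw = path.read_bytes()

print(written)  # 6
print(text)     # uvwxyz
print(raw)      # b'uvwxyz'
```

For binary data, read_bytes and write_bytes work the same way without any encoding step.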

2. Easier operations

In addition to the common operations above, Path also makes the following slightly more complex file path operations easier.

2.1 Checking whether a file or directory exists

fp = "D:\\temp\\pathlib\\a"
path = Path(fp)

path.is_dir() # True
path.is_file() # False
path.exists() # True
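These checks never raise for a path that is missing; they simply return False. The sketch below compares a directory, a file, and a path that was never created, using hypothetical names inside a temporary directory:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    file = folder / "a.txt"       # hypothetical file name
    file.touch()
    missing = folder / "missing"  # never created

    # Each tuple is (exists, is_dir, is_file).
    folder_checks = (folder.exists(), folder.is_dir(), folder.is_file())
    file_checks = (file.exists(), file.is_dir(), file.is_file())
    missing_checks = (missing.exists(), missing.is_dir(), missing.is_file())

print(folder_checks)   # (True, True, False)
print(file_checks)     # (True, False, True)
print(missing_checks)  # (False, False, False)
```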

2.2 Creating directories

When creating a directory, the Path object can handle the usual error cases for us automatically.

path = Path("D:\\temp\\a\\b\\c\\d")
path.mkdir(exist_ok=True, parents=True)

The exist_ok and parents parameters save a lot of checks when creating folders.

exist_ok=True means that if folder d already exists, it is not created again and no error is raised; without it (the default is exist_ok=False), an existing folder raises FileExistsError.

parents=True means that any missing parent folders of folder d are created automatically; without it (the default is parents=False), a missing parent folder raises FileNotFoundError.
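The effect of the two parameters can be observed directly. This sketch recreates the a/b/c/d layout inside a temporary directory, so it is safe to run anywhere; the flag variable names are illustrative only.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "a" / "b" / "c" / "d"

    # Default parents=False: the missing parent folders cause an error.
    try:
        target.mkdir()
        missing_parent_error = False
    except FileNotFoundError:
        missing_parent_error = True

    # With both flags set, the call is safe to repeat.
    target.mkdir(parents=True, exist_ok=True)
    target.mkdir(parents=True, exist_ok=True)
    created = target.is_dir()

    # Default exist_ok=False: an existing folder causes an error.
    try:
        target.mkdir()
        exists_error = False
    except FileExistsError:
        exists_error = True

print(missing_parent_error, created, exists_error)  # True True True
```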

2.3 Automatic path normalization

When manipulating paths with Path, there is no need to worry much about path separators across different operating systems.

On Windows, Linux-style path separators can also be used; for example, both of the following work fine.

fp = "D:\\temp\\pathlib\\a"
path = Path(fp)

fp = "D:/temp/pathlib/a"
path = Path(fp)
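The normalization can be checked by comparing the resulting objects. The sketch below uses the pure path classes so it runs on any system. It also shows a caveat: the tolerance works in one direction only, because a POSIX path treats a backslash as an ordinary character.

```python
from pathlib import PurePosixPath, PureWindowsPath

a = PureWindowsPath("D:\\temp\\pathlib\\a")
b = PureWindowsPath("D:/temp/pathlib/a")
c = PureWindowsPath("D:/temp//pathlib/./a")  # duplicate slash and "." are collapsed

print(a == b == c)  # True
print(a)            # D:\temp\pathlib\a

# On the POSIX side, the whole Windows-style string is a single path component.
posix = PurePosixPath("D:\\temp\\pathlib\\a")
print(len(posix.parts))  # 1
```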

3. Comparison with os.path

pathlib is mainly intended to replace os.path. The comparison between them is organized as follows:

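| Path operation | pathlib | os / os.path / glob |
| --- | --- | --- |
| Read a file's entire contents | path.read_text() | open(path).read() |
| Get absolute file path | path.resolve() | os.path.abspath(path) |
| Get file name | path.name | os.path.basename(path) |
| Get parent directory | path.parent | os.path.dirname(path) |
| Get the file extension | path.suffix | os.path.splitext(path)[1] |
| File name (without extension) | path.stem | os.path.splitext(path)[0] |
| Relative path | path.relative_to(parent) | os.path.relpath(path, parent) |
| Verify that the path is a file | path.is_file() | os.path.isfile(path) |
| Verify that the path is a directory | path.is_dir() | os.path.isdir(path) |
| Create a directory | path.mkdir(parents=True) | os.makedirs(path) |
| Get current directory | Path.cwd() | os.getcwd() |
| Get home directory | Path.home() | os.path.expanduser("~") |
| Find files by pattern | path.glob(pattern) | glob.glob(pattern) |
| Recursive file search | path.rglob(pattern) | glob.glob(pattern, recursive=True) |
| Normalize path separators | Path(name) | os.path.normpath(name) |
| Join paths | Path(parent, name) | os.path.join(parent, name) |
| Get file size | path.stat().st_size | os.path.getsize(path) |
| Traverse the file tree | path.walk() (Python 3.12+) | os.walk(path) |
| Move or rename a file | path.rename(target) | os.rename(path, target) |
| Delete file | path.unlink() | os.remove(path) |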
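Several of these correspondences can be verified side by side. The following sketch checks a handful of them on a hypothetical file, `report.txt`, created in a temporary directory. One nuance it works around: os.path.splitext applied to a full path keeps the directory part, so the file name is extracted first when comparing against path.stem.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "report.txt"  # hypothetical file name
    path.write_text("hello", encoding="utf-8")
    s = str(path)

    # Each entry pairs the pathlib result with its os / os.path counterpart.
    pairs = {
        "name": (path.name, os.path.basename(s)),
        "parent": (str(path.parent), os.path.dirname(s)),
        "suffix": (path.suffix, os.path.splitext(s)[1]),
        "stem": (path.stem, os.path.splitext(os.path.basename(s))[0]),
        "is_file": (path.is_file(), os.path.isfile(s)),
        "size": (path.stat().st_size, os.path.getsize(s)),
    }
    all_match = all(left == right for left, right in pairs.values())

print(all_match)      # True
print(pairs["stem"])  # ('report', 'report')
print(pairs["size"])  # (5, 5)
```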

Compare the two approaches to appreciate the benefits of pathlib's improvements.