Refer: https://stackabuse.com/introduction-to-the-python-pathlib-module/
How to use the Pathlib Module?
To use the pathlib module conveniently within our scripts, we import all the classes in it using:
from pathlib import *
As a first task, let's retrieve the current working directory and home directory objects, respectively, using the code below:
current_dir = Path.cwd()
home_dir = Path.home()
print(current_dir)
print(home_dir)
We can choose to import pathlib instead of importing all the classes. In that case, all the subsequent uses of classes within the module should be prefixed with pathlib.
import pathlib
current_dir = pathlib.Path.cwd()
home_dir = pathlib.Path.home()
print(current_dir)
print(home_dir)
Why use the Pathlib Module?
If you have been working with Python for a while, you may wonder why the Pathlib module is needed when modules such as os, os.path, and glob are already available. This is a fair question. Let's address it with an example.
Let's say we want to build the path to a file called "output/output.xlsx" within the current working directory. The following code does this using the os.path module, with the os.getcwd and os.path.join functions.
import os
outpath = os.path.join(os.getcwd(), 'output')
outpath_file = os.path.join(outpath, 'output.xlsx')
Alternatively,
outpath_file = os.path.join(os.path.join(os.getcwd(), 'output'), 'output.xlsx')
Though the code works, it looks clunky and is neither readable nor easy to maintain. Imagine how it would look if we wanted to create a new file inside multiple nested directories.
The same code can be rewritten using the Pathlib module, as follows:
from pathlib import Path
outpath = Path.cwd() / 'output' / 'output.xlsx'
This format is easier to parse mentally. In Pathlib, the Path.cwd() class method returns the current working directory, and the / operator is used in place of os.path.join to combine parts of the path into a compound path object. The nested function calls of the os.path module are replaced by the Path class, which represents the path and lets us chain methods and attributes on it. Overloading the / operator in this way makes the code readable and easy to maintain.
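As a quick sanity check that the two styles describe the same location, the sketch below builds the path both ways and compares the results. The variable names joined_str and joined_path are illustrative and do not come from the original example.

```python
import os
from pathlib import Path

# Build the path with nested os.path.join calls; the result is a plain string.
joined_str = os.path.join(os.path.join(os.getcwd(), "output"), "output.xlsx")

# Build the same path with the / operator; the result is a Path object.
joined_path = Path.cwd() / "output" / "output.xlsx"

# Wrapping the string in Path lets us compare the two directly.
print(Path(joined_str) == joined_path)
```

Neither version touches the file system: both only compute where the file would live.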
Another benefit of the Pathlib approach is that it creates a Path object rather than a string representation of the path. This object has several handy methods and attributes that make life easier than working with raw strings that represent paths.
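To give a feel for what a path object offers, here is a minimal sketch of a few commonly used attributes. It uses PurePosixPath, a pathlib class not mentioned above, only so that the output is the same on every operating system; the path itself is hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical path, used only for illustration; nothing is read from disk.
report = PurePosixPath("/home/user/output/output.xlsx")

print(report.name)                 # output.xlsx
print(report.stem)                 # output
print(report.suffix)               # .xlsx
print(report.parent)               # /home/user/output
print(report.with_suffix(".csv"))  # /home/user/output/output.csv
```

The same attributes are available on the Path objects returned by Path.cwd() and Path.home().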
Performing Operations on Paths
The classic os.path module is mostly used for manipulating path strings. To do something with the path, for example creating a directory, we need the os module. Together, os and os.path provide a set of functions for working with files and directories, such as os.mkdir for creating a directory, os.rename for renaming a file or directory, os.path.getsize for getting the size of a file, and so on.
Let's write some of these operations using the os module and then rewrite the same code using the Pathlib module.
Sample code written using the os module:
if os.path.isdir(path):
    os.rmdir(path)  # removes the directory only if it is empty
If we use Pathlib module's path objects to achieve the same functionality, the resulting code will be much more readable and easier to maintain as shown below:
if path.is_dir():
    path.rmdir()  # removes the directory only if it is empty
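The snippets above assume that a path variable already exists. A self-contained sketch of the Pathlib version might look like the following; it works inside a temporary directory so that nothing outside it is touched, and the directory name "scratch" is purely illustrative.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "scratch"  # hypothetical directory name
    path.mkdir()                  # counterpart of os.mkdir
    created = path.is_dir()       # counterpart of os.path.isdir

    if path.is_dir():
        path.rmdir()              # counterpart of os.rmdir; directory must be empty
    removed = not path.exists()

print(created, removed)  # True True
```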
It is cumbersome to find path-related utilities in the os module. The Pathlib module solves the problem by replacing the utilities of the os module with methods on path objects. Let us understand it even better with some code: