How to handle AWS S3 paths using cloudpathlib?



You might be aware that pathlib.Path cannot properly deal with S3-like paths: it treats the URL as an ordinary file path and collapses the double slash after the scheme. cloudpathlib is an easy-to-use package that does handle AWS S3 paths as well as paths for other cloud providers. In this example we use cloudpathlib.CloudPath instead of pathlib.Path to instantiate an S3Path object from an S3 URL string. The interface of an S3Path closely follows that of a PosixPath object. For example, we can access .parts on it to obtain all components of the S3Path.
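The original gist is not reproduced here, so the following is only a minimal sketch of the idea. The bucket name and key are hypothetical, and the helper name inspect_s3_url is made up for illustration. The cloudpathlib part assumes the package was installed with its S3 extra (pip install "cloudpathlib[s3]") and is skipped if it is not available. Creating the path object does not contact AWS.

```python
from pathlib import PurePosixPath

S3_URL = "s3://my-bucket/data/2021/report.csv"  # hypothetical bucket and key

# pathlib treats the URL as an ordinary POSIX path and drops the empty
# component between the two slashes, so the original URL is lost.
mangled = PurePosixPath(S3_URL)
print(str(mangled))   # s3:/my-bucket/data/2021/report.csv
print(mangled.parts)  # ('s3:', 'my-bucket', 'data', '2021', 'report.csv')


def inspect_s3_url(url):
    """Return the class name, parts and file name of a cloud path (illustrative helper)."""
    # Imported lazily so the pathlib part above runs without cloudpathlib.
    from cloudpathlib import CloudPath

    # CloudPath dispatches to S3Path based on the "s3://" prefix.
    s3_path = CloudPath(url)
    return type(s3_path).__name__, s3_path.parts, s3_path.name


try:
    print(inspect_s3_url(S3_URL))
except ImportError:
    # Raised when cloudpathlib (or its S3 dependency) is missing.
    print("cloudpathlib[s3] is not installed; skipping the CloudPath part")
```

The pure pathlib lines show why a plain Path is a poor fit for S3 URLs, while the CloudPath object keeps the scheme and bucket intact.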


GitHub gist with code

dependencies: python3.9, cloudpathlib==0.10.0

How to get all files in a directory and delete them?


Obtain all files in a pathlib.Path directory and delete them


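If we have a directory as a Path object we can call .iterdir() on it to obtain a generator that yields every entry in that directory (files as well as subdirectories) as Path objects. To delete all files in a given directory we can call .unlink() on every Path object returned by iterdir(). Note that .unlink() only removes files and symbolic links, so subdirectories need to be skipped or removed with .rmdir(). Passing missing_ok=True (available since Python 3.8) avoids a FileNotFoundError if a file has already been removed by another process between listing and deleting, which guards against race conditions.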
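As the gist itself is not shown here, this is a small sketch of the approach, run against a temporary directory so nothing real is deleted. The helper name delete_files and the file names are assumptions for illustration only. Subdirectories are skipped rather than removed.

```python
import tempfile
from pathlib import Path


def delete_files(directory: Path) -> int:
    """Delete every file directly inside `directory` and return how many were handled."""
    deleted = 0
    # Materialise the listing first so we do not modify the directory
    # while it is still being iterated.
    for entry in list(directory.iterdir()):
        if entry.is_file():
            # missing_ok=True: no error if the file vanished in the meantime.
            entry.unlink(missing_ok=True)
            deleted += 1
    return deleted


with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    for name in ("a.txt", "b.txt", "c.log"):  # hypothetical file names
        (demo_dir / name).touch()
    (demo_dir / "subdir").mkdir()

    removed = delete_files(demo_dir)
    remaining = sorted(p.name for p in demo_dir.iterdir())
    print(removed, remaining)  # 3 ['subdir']
```

The is_file() check is a design choice: without it, .unlink() would raise an error when it reaches a subdirectory.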


GitHub gist with code

dependencies: python3.9

How to create a directory and all of its parent directories if none or some of them do not exist yet?



If you want to create a directory and all of its parent directories while working with pathlib.Path objects, call Path.mkdir with parents=True. Setting exist_ok=True ensures that no FileExistsError is raised if the directory already exists. This makes the call idempotent: we can run it over and over again and always end up with the same result.
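A short sketch of this, again using a temporary directory. The nested directory names are hypothetical. The last part shows what happens without parents=True when an intermediate directory is missing.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "data" / "raw" / "2021"  # hypothetical nested path

    # Creates "data", "data/raw" and "data/raw/2021" in one call.
    target.mkdir(parents=True, exist_ok=True)
    # Running it again is harmless because of exist_ok=True.
    target.mkdir(parents=True, exist_ok=True)
    created = target.is_dir()

    # Without parents=True a missing parent raises FileNotFoundError.
    try:
        (Path(tmp) / "missing" / "child").mkdir()
        missing_parent_error = False
    except FileNotFoundError:
        missing_parent_error = True

    print(created, missing_parent_error)  # True True
```

Note that exist_ok=True only silences the error when the existing path is a directory. If a regular file already occupies the path, FileExistsError is still raised.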


GitHub gist with code

dependencies: python3.9