How to work with tarball/tar files in Python

Pratik Choudhari

TAR stands for Tape Archive. The format bundles a set of files into a single file, which is especially helpful when archiving older files or sending many files over a network.

Python's standard library includes the tarfile module, which reads and writes tar files and supports gzip, bz2, and lzma compression.

In this article, we will see how tarfile is used to read and write tar files in Python.

Reading a tar file

The tarfile.open function is used to read a tar file. It returns a tarfile.TarFile object.

The two most important arguments this function takes are the filename and operation mode, with the former being a path to the tar file and the latter indicating the mode in which the file should be opened.

The operation mode can optionally be paired with a compression method. The mode string then takes the form mode[:compression], for example r:gz to read a gzip-compressed archive.

The supported compression methods and their abbreviations are:

gz: gzip compression
bz2: bzip2 compression
xz: lzma compression

Example:

    import tarfile

    with tarfile.open("sample.tar", "r") as tf:
        print("Opened tarfile")
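To make the mode[:compression] syntax concrete, here is a minimal sketch that writes a gzip-compressed archive and reads it back. The file names hello.txt and sample.tar.gz are hypothetical and are created in a temporary directory only so the example runs on its own.

```python
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample file, created only so the example is self-contained.
    src = os.path.join(tmp, "hello.txt")
    with open(src, "w") as f:
        f.write("hello")

    archive = os.path.join(tmp, "sample.tar.gz")

    # "w:gz" creates a new gzip-compressed archive.
    with tarfile.open(archive, "w:gz") as tf:
        tf.add(src, arcname="hello.txt")

    # "r:gz" reads it back, stating the compression explicitly.
    with tarfile.open(archive, "r:gz") as tf:
        names_explicit = tf.getnames()

    # Plain "r" (equivalent to "r:*") lets tarfile detect the compression.
    with tarfile.open(archive, "r") as tf:
        names_auto = tf.getnames()

print(names_explicit, names_auto)
```

In practice plain "r" is usually enough for reading, since the module detects the compression transparently. The explicit form matters mostly when writing.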

Extracting tar file contents

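After opening a file, its contents can be extracted with the tarfile.TarFile.extractall method. The most important arguments accepted by the method are:

path: the directory to extract into (defaults to the current working directory)
members: an optional subset of the archive's members to extract; it must be a subset of the list returned by getmembers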

Example:

    import tarfile

    with tarfile.open("sample.tar", "r") as tf:
        print("Opened tarfile")
        tf.extractall(path="./extraction_dir")
        print("All files extracted")
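The extractall method can also be limited to part of an archive through its members argument. The sketch below builds a small throwaway archive (the names a.txt, b.txt and c.log are hypothetical) and extracts only the .txt files. As an assumption about good practice rather than something the article covers, it also passes filter="data" on Python versions that provide extraction filters, which is a safer choice for archives from untrusted sources.

```python
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Build a small hypothetical archive with two text files and one log file.
    archive = os.path.join(tmp, "sample.tar")
    with tarfile.open(archive, "w") as tf:
        for name in ("a.txt", "b.txt", "c.log"):
            path = os.path.join(tmp, name)
            with open(path, "w") as f:
                f.write(name)
            tf.add(path, arcname=name)

    dest = os.path.join(tmp, "extraction_dir")
    with tarfile.open(archive, "r") as tf:
        # Keep only the TarInfo objects whose names end in ".txt".
        txt_members = [m for m in tf.getmembers() if m.name.endswith(".txt")]

        # Assumption: use the "data" filter where this Python version has it.
        extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tf.extractall(path=dest, members=txt_members, **extra)

    extracted = sorted(os.listdir(dest))

print(extracted)  # ['a.txt', 'b.txt']
```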

Extracting a single file

To extract files selectively, pass either a tarfile.TarInfo object or the member's name as a string to the tarfile.TarFile.extract method.

To list everything inside a tar file, use the tarfile.TarFile.getmembers method, which returns a list of tarfile.TarInfo instances.

Example:

    import tarfile

    with tarfile.open("./sample.tar", "r") as tf:
        print("Opened tarfile")
        print(tf.getmembers())
        print("Members listed")

Output:

    Opened tarfile
    [<TarInfo 'sample' at 0x7fe14b53a048>, <TarInfo 'sample/sample_txt1.txt' at 0x7fe14b53a110>, <TarInfo 'sample/sample_txt2.txt' at 0x7fe14b53a1d8>, <TarInfo 'sample/sample_txt3.txt' at 0x7fe14b53a2a0>, <TarInfo 'sample/sample_txt4.txt' at 0x7fe14b53a368>]
    Members listed
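The raw TarInfo representations are not very readable. A rough sketch of a friendlier listing follows, using the name and size attributes and the isdir method of each TarInfo. It builds its own small archive in a temporary directory, with a hypothetical layout that mirrors the sample folder above.

```python
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical directory layout mirroring the "sample" folder above.
    folder = os.path.join(tmp, "sample")
    os.makedirs(folder)
    with open(os.path.join(folder, "sample_txt1.txt"), "w") as f:
        f.write("12345")

    archive = os.path.join(tmp, "sample.tar")
    with tarfile.open(archive, "w") as tf:
        tf.add(folder, arcname="sample")  # adds the directory recursively

    with tarfile.open(archive, "r") as tf:
        # getnames returns just the member names as strings.
        names = tf.getnames()
        # Each TarInfo carries metadata such as the name and size in bytes.
        summary = [
            (m.name, "dir" if m.isdir() else "file", m.size)
            for m in tf.getmembers()
        ]

for name, kind, size in summary:
    print(f"{kind:4} {size:>6} {name}")
```

When only the names are needed, getnames is the shorter route; getmembers is useful once sizes or file types matter.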

Single file extraction

    import tarfile

    file_name = "sample/sample_txt1.txt"

    with tarfile.open("sample.tar", "r") as tf:
        print("Opened tarfile")
        tf.extract(member=file_name, path="./extraction_dir")
        print(f"{file_name} extracted")
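The example above passes the member name as a string. As a sketch of the other option, the code below looks up a TarInfo object with getmember and passes that to extract instead. The archive and its two files are hypothetical and are created in a temporary directory so the snippet is self-contained.

```python
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical "sample" folder with two small text files.
    folder = os.path.join(tmp, "sample")
    os.makedirs(folder)
    for i in (1, 2):
        with open(os.path.join(folder, f"sample_txt{i}.txt"), "w") as f:
            f.write(f"file {i}")

    archive = os.path.join(tmp, "sample.tar")
    with tarfile.open(archive, "w") as tf:
        tf.add(folder, arcname="sample")

    dest = os.path.join(tmp, "extraction_dir")
    with tarfile.open(archive, "r") as tf:
        # getmember raises KeyError if the name is not in the archive.
        member = tf.getmember("sample/sample_txt2.txt")
        tf.extract(member, path=dest)

    # Only the requested file should now exist under the destination.
    extracted = sorted(os.listdir(os.path.join(dest, "sample")))
    with open(os.path.join(dest, "sample", "sample_txt2.txt")) as f:
        content = f.read()

print(extracted, content)
```

Looking the member up first gives a clear KeyError for a mistyped name before anything is written to disk.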

Writing a tar file

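To add files to an existing tar file, open it in append mode ("a") and call the tarfile.TarFile.add method with the path of the file to add. The optional arcname argument sets the name the file gets inside the archive. Append mode only works on uncompressed archives; to create a new archive, open it in write mode ("w") instead.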

    import tarfile

    file_name = "sample_txt5.txt"

    # Append mode ("a") only works on uncompressed archives.
    with tarfile.open("./sample.tar", "a") as tf:
        print("Opened tarfile")
        print(f"Members before addition of {file_name}")
        print(tf.getmembers())
        # Store the file inside the archive's "sample" directory.
        tf.add(file_name, arcname=f"sample/{file_name}")
        print(f"Members after addition of {file_name}")
        print(tf.getmembers())
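For comparison, here is a small sketch that creates a brand-new archive in write mode and then appends a second file to it. All file names and the docs/ prefix are hypothetical, and everything lives in a temporary directory.

```python
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create two hypothetical source files to put into the archive.
    paths = {}
    for name in ("first.txt", "second.txt"):
        paths[name] = os.path.join(tmp, name)
        with open(paths[name], "w") as f:
            f.write(name)

    archive = os.path.join(tmp, "new.tar")

    # "w" creates the archive, overwriting any existing file of that name.
    with tarfile.open(archive, "w") as tf:
        tf.add(paths["first.txt"], arcname="docs/first.txt")

    # "a" appends to the existing, uncompressed archive.
    with tarfile.open(archive, "a") as tf:
        tf.add(paths["second.txt"], arcname="docs/second.txt")

    with tarfile.open(archive, "r") as tf:
        final_names = tf.getnames()

print(final_names)  # ['docs/first.txt', 'docs/second.txt']
```

Because compressed archives cannot be opened in append mode, adding a file to a .tar.gz generally means writing a new archive with "w:gz" that contains both the old and the new members.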
