Compression options in Linux, gzip vs pigz

When working on the Linux command line, at some stage you are going to come across a .gz file. This is a gzip-compressed file. Unlike a .zip, a .gz holds a single compressed stream, so when you need to store many files and directories they are usually bundled with tar first, which gives you a .tar.gz.
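As a rough sketch of the basics, the commands below compress a single file, inspect the result, and read it back. It assumes GNU gzip 1.6 or newer (for the -k option), and demo.txt is a made-up file name used only for illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# demo.txt is a hypothetical sample file: the numbers 1 to 1000, one per line.
seq 1 1000 > demo.txt

# -k keeps the original; without it gzip replaces demo.txt with demo.txt.gz.
gzip -k demo.txt

# -l lists the compressed size, uncompressed size and ratio of the .gz file.
gzip -l demo.txt.gz

# -t tests integrity; -dc decompresses to stdout and leaves the .gz in place.
gzip -t demo.txt.gz && gzip -dc demo.txt.gz | tail -n 1
```

The same options (-k, -l, -t, -d, -c) are also accepted by pigz, so the commands should work unchanged if you swap the program name.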

The two main utilities for creating or managing a .gz file are gzip and pigz. In simple terms, gzip is more widely available but can be significantly slower because it is single-threaded; that is to say, it only uses one CPU core at a time. Pigz, on the other hand, uses all available cores in parallel by default, which speeds it up significantly at the cost of potentially impacting other running processes that compete for the same CPU resources.

Pigz stands for parallel implementation of gzip.

The choice of which to use depends on the available server resources and what the job is. If you are doing local compression and resource availability is of no concern, go with pigz; it will save you time. If you plan to compress a file during transport to another server, use gzip, as pigz can become unreliable when compressing in-stream over the network.
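If CPU contention is the worry, one possible middle ground is to cap the number of threads pigz uses with its -p option. The sketch below is only an illustration under a few assumptions: it uses nproc from GNU coreutils, picks half the cores as an arbitrary limit, and falls back to gzip when pigz is not installed. The compress_stream function name and the ssh target in the comment are hypothetical.

```shell
# Use half of the available cores (an arbitrary choice), but at least one.
cores=$(nproc)
threads=$(( cores / 2 ))
[ "$threads" -lt 1 ] && threads=1

# Hypothetical helper: compress stdin to stdout with pigz if present, else gzip.
compress_stream() {
    if command -v pigz >/dev/null 2>&1; then
        pigz -p "$threads" -c
    else
        gzip -c
    fi
}

# Compress a generated stream and check that the result is a valid .gz.
out=$(mktemp)
seq 1 100000 | compress_stream > "$out"
gzip -t "$out" && echo "stream OK"

# For transport to another server the same pipe shape applies, for example:
#   tar cf - /var/srv/files/ | gzip -c | ssh user@backup-host 'cat > files.tar.gz'
# (user@backup-host is a placeholder, not a real host.)
```

Whichever program does the compressing, the output is ordinary gzip data, so the receiving side can decompress it with plain gzip.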

A simple example of gzip:

# time gzip SW_DVD9_Win_Server_STD_CORE_2019_1809.5_64Bit_English_DC_STD_MLF_X22-34333.ISO
real 2m27.418s
user 2m10.083s
sys 0m7.362s

A simple example with the same process using pigz:

# time pigz SW_DVD9_Win_Server_STD_CORE_2019_1809.5_64Bit_English_DC_STD_MLF_X22-34333.ISO
real 0m40.679s
user 2m24.256s
sys 0m4.941s

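As you can see, on a quad-core server gzip took 2 minutes 27 seconds while pigz took about 40 seconds, which is quite a big time saving. Note that the total CPU time (user) is roughly the same for both; pigz simply spreads that work across the cores, so the more cores available, the faster it gets.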
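To put numbers on that, the timings above can be turned into a speedup figure with a little arithmetic. This is just a worked calculation using awk; the variable names are illustrative and the values are the real and user times from the two runs, converted to seconds.

```shell
# Timings from the runs above, converted to seconds.
gzip_real=147.418   # 2m27.418s
pigz_real=40.679    # 0m40.679s
pigz_user=144.256   # 2m24.256s

# Wall-clock speedup: how many times faster pigz finished.
speedup=$(awk -v a="$gzip_real" -v b="$pigz_real" 'BEGIN { printf "%.2f", a / b }')

# user / real approximates how many cores pigz kept busy on average.
busy_cores=$(awk -v u="$pigz_user" -v r="$pigz_real" 'BEGIN { printf "%.2f", u / r }')

echo "wall-clock speedup: ${speedup}x"
echo "average cores kept busy by pigz: $busy_cores"
```

That works out to a speedup of about 3.6x with roughly 3.5 cores busy, which is close to what you would hope for from four cores once disk I/O and other overhead are taken into account.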

You can also pipe tar through pigz, just as you would with gzip, to compress full directories. For example:


tar cf - /var/srv/files/ | pigz > filename.tar.gz


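In the above command, tar is used with cf (create, file) and - as the file name, so the archive of everything in the directory /var/srv/files/ is written to standard output rather than to disk. It is then piped to pigz, which compresses it into filename.tar.gz.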
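Here is a self-contained sketch of the same pipeline that you can run safely, followed by a listing and an extraction to check the result. It makes a few assumptions: a temporary directory stands in for /var/srv/files/, the file names are made up, and gzip is used as a fallback if pigz is not installed. The -C option is not part of the original command; it is added here so that the paths stored in the archive are relative.

```shell
# Build a small throwaway tree that stands in for /var/srv/files/.
src=$(mktemp -d)
mkdir -p "$src/files/sub"
echo "one" > "$src/files/a.txt"
echo "two" > "$src/files/sub/b.txt"
archive="$src/files.tar.gz"

# Prefer pigz, fall back to gzip; both write the same .gz format.
if command -v pigz >/dev/null 2>&1; then gz=pigz; else gz=gzip; fi

# -C switches into the parent directory first, so members are stored as files/...
tar -C "$src" -cf - files | "$gz" -c > "$archive"

# List the contents (t = list, z = gzip-compressed, f = archive file).
tar -tzf "$archive"

# Extract into a separate directory to confirm the round trip.
dest=$(mktemp -d)
tar -xzf "$archive" -C "$dest"
cat "$dest/files/sub/b.txt"
```

Because pigz produces standard gzip output, the regular z option of tar can read the archive even on a machine that does not have pigz.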