
Mastering `gzip`: File Compression in Linux

File compression is an essential technique for reducing the disk space files occupy and the time it takes to transfer them. In the Linux ecosystem, gzip (short for GNU zip) is one of the most popular compression tools due to its efficiency and simplicity. It uses the DEFLATE algorithm, which combines LZ77 and Huffman coding.

Syntax of gzip

The basic syntax of the gzip command is as follows:

gzip [OPTION]... [FILE]...

When you run gzip on one or more files, each file is replaced by a compressed copy with a .gz extension; the original is removed once compression succeeds.

Creating Sample Files for Compression

Before diving into examples, let's create some sample files to work with:

# Create a text file
echo "This is a sample text file for gzip compression." > sample1.txt

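# Create another text file with 100 numbered lines (brace expansion requires bash)
# Redirecting the whole loop overwrites sample2.txt, so re-running does not append duplicates
for i in {1..100}; do echo "Repeated line ${i}"; done > sample2.txt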

These commands create two text files, sample1.txt and sample2.txt, which we'll use to demonstrate gzip compression.

Examples of Using gzip

Here are a few examples of how to use the gzip command:

Basic Compression

gzip sample1.txt

This command compresses sample1.txt into sample1.txt.gz and deletes the original sample1.txt.
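To see this behavior for yourself, here is a minimal sketch. It assumes a scratch directory created with mktemp (the variable name workdir is only illustrative) so that nothing else on disk is touched.

```shell
# Work in a throwaway directory (workdir is a hypothetical name)
workdir=$(mktemp -d)
cd "$workdir"

# Recreate the sample file and compress it
echo "This is a sample text file for gzip compression." > sample1.txt
gzip sample1.txt

# Only sample1.txt.gz should be listed; the original is gone
ls
```

For a file this small, the .gz version may actually be larger than the original, because the gzip header and trailer add a fixed overhead. Compression pays off on larger or more repetitive input.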

Keeping the Original Files

gzip -k sample2.txt

The -k option (available in gzip 1.6 and later) keeps the original file after compression, so you end up with both sample2.txt and sample2.txt.gz.
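A short sketch of checking the result, assuming a scratch directory (workdir is an illustrative name) and using seq instead of brace expansion so it also runs in plain sh:

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Build the repetitive sample file
for i in $(seq 1 100); do echo "Repeated line ${i}"; done > sample2.txt

# Compress but keep the original
gzip -k sample2.txt

# Both files should now exist side by side
ls -l sample2.txt sample2.txt.gz

# -l lists the compressed size, uncompressed size, and ratio
gzip -l sample2.txt.gz
```

The gzip -l listing is a quick way to judge how much space a given file actually saved.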

Adjusting Compression Level

gzip -9 sample1.txt

The -9 option tells gzip to use the maximum compression level. Levels range from -1 (fastest) to -9 (smallest output); when no level is given, gzip uses -6.
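To get a feel for the trade-off, you can compare output sizes without writing any .gz files by sending the compressed data to wc. This is only a sketch: big.txt and workdir are hypothetical names, and the exact numbers depend on your data and gzip version.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Generate a larger repetitive file so the levels have something to work on
for i in $(seq 1 5000); do echo "Repeated line ${i}"; done > big.txt

# -c writes the compressed stream to stdout; wc -c counts its bytes
fast=$(gzip -1 -c big.txt | wc -c)
best=$(gzip -9 -c big.txt | wc -c)

echo "original: $(wc -c < big.txt) bytes"
echo "gzip -1:  ${fast} bytes"
echo "gzip -9:  ${best} bytes"
```

Prefixing each gzip call with time would show the speed side of the comparison as well.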

Decompressing .gz Files

gzip -d sample1.txt.gz

The -d option decompresses the file. Alternatively, you can use gunzip:

gunzip sample1.txt.gz
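Since gzip is lossless, a compress and decompress round trip should give back exactly the same bytes. A minimal sketch of verifying that, assuming a scratch directory and a hypothetical reference copy named original_copy.txt:

```shell
workdir=$(mktemp -d)
cd "$workdir"

echo "This is a sample text file for gzip compression." > sample1.txt

# Keep a reference copy to compare against (hypothetical name)
cp sample1.txt original_copy.txt

gzip sample1.txt          # produces sample1.txt.gz, removes sample1.txt
gzip -d sample1.txt.gz    # restores sample1.txt, removes sample1.txt.gz

# cmp is silent and exits 0 when the files are identical
cmp sample1.txt original_copy.txt && echo "round trip OK"
```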

Viewing Compressed File Contents

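gzip -dc sample1.txt.gz | less

The -d option decompresses and the -c option writes the result to standard output, so together they stream the uncompressed contents to less while leaving sample1.txt.gz untouched. Note that -c on its own compresses, so gzip -c sample1.txt | less would send binary compressed data to the pager. On most Linux systems, zcat sample1.txt.gz is equivalent to gzip -dc sample1.txt.gz.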
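In a script, where an interactive pager is not useful, the decompressed stream can be piped to any other command instead. The following sketch assumes a scratch directory and uses head to show only the first lines; zcat is assumed to be the GNU version found on most Linux systems.

```shell
workdir=$(mktemp -d)
cd "$workdir"

for i in $(seq 1 100); do echo "Repeated line ${i}"; done > sample2.txt
gzip sample2.txt

# Decompress to stdout; the .gz file on disk is left as it is
gzip -dc sample2.txt.gz | head -n 3

# zcat is typically shorthand for gzip -dc
first_line=$(zcat sample2.txt.gz | head -n 1)
echo "$first_line"
```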

Recursively Compressing Files in a Directory

gzip -r directory_name

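The -r option descends into the specified directory and its subdirectories and compresses each file it finds individually. It does not bundle the directory into a single archive.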
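A small sketch of what that looks like on disk, assuming a scratch directory and two hypothetical files, a.txt and nested/b.txt:

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Build a tiny directory tree to compress
mkdir -p directory_name/nested
echo "top level file" > directory_name/a.txt
echo "nested file" > directory_name/nested/b.txt

gzip -r directory_name

# Every regular file now carries a .gz suffix; the directories are unchanged
find directory_name -type f | sort
```

If you want one archive for the whole directory instead, the usual approach is to bundle it with tar and compress the resulting stream.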

Options Table for gzip

| Option | Shorthand | Description |
| --- | --- | --- |
| --best | -9 | Optimize for the best compression ratio (slowest) |
| --fast | -1 | Optimize for the fastest compression |
| --keep | -k | Keep (do not delete) input files |
| --decompress | -d | Decompress files |
| --force | -f | Force compression, overwrite files without prompting |
| --recursive | -r | Recursively compress files in directories |
| --verbose | -v | Show the name and percentage reduction for each file processed |
| --stdout | -c | Write output to standard output |
| --test | -t | Test compressed file integrity |
| --name | -N | Store or restore the original name and timestamp |
| --no-name | -n | Do not store or restore the original name and timestamp |
| --quiet | -q | Suppress all warnings |
| --help | -h | Display a help message and exit |
| --version | -V | Display version information and exit |
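Two of these options are handy together when you need to trust an archive: -v to see what compression achieved, and -t to check integrity afterwards. The sketch below assumes a scratch directory and simulates damage with a hypothetical truncated copy named truncated.gz.

```shell
workdir=$(mktemp -d)
cd "$workdir"

for i in $(seq 1 100); do echo "Repeated line ${i}"; done > sample2.txt

# -k keeps the original, -v reports the size reduction
gzip -kv sample2.txt

# -t reads the whole file and exits with status 0 if it is intact
if gzip -t sample2.txt.gz; then
  echo "sample2.txt.gz is intact"
fi

# Simulate corruption by keeping only the first 50 bytes
head -c 50 sample2.txt.gz > truncated.gz
gzip -t truncated.gz 2>/dev/null || echo "truncated.gz failed the integrity check"
```

Because -t communicates through its exit status, it slots easily into backup scripts that should stop when an archive is damaged.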

Remember that -9 can be noticeably slower than the default level (-6), often for only a modest further reduction in size. It is best reserved for cases where output size matters more than speed, such as archival or preparing files for transfer over slow network connections.

Using gzip in combination with other Linux commands and shell scripts can lead to highly efficient and automated data management tasks. The tool's versatility makes it suitable for a range of applications, from backing up personal files to managing large datasets on servers.
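As one illustration of combining gzip with other commands, the sketch below bundles a directory with tar, compresses the stream through a pipe, and then compresses individual log files with find. The logs directory, the file names, and logs_backup.tar.gz are all hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical log directory with two small files
mkdir logs
echo "log entry A" > logs/app.log
echo "log entry B" > logs/db.log

# Bundle the directory with tar and compress the stream with gzip
tar -cf - logs | gzip -9 > logs_backup.tar.gz

# List the archive contents without extracting anything
gzip -dc logs_backup.tar.gz | tar -tf -

# Compress each .log file in place, one .gz per file
find logs -type f -name '*.log' -exec gzip {} +
```

The pipe form makes the division of labor explicit: tar handles bundling, gzip handles compression. In practice, tar -czf does the same thing in one step.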
