Understanding Pathname Expansion in Linux
Pathname expansion, also known as filename expansion or wildcard expansion, is a powerful feature of the Linux shell that allows you to specify multiple filenames using special characters. This feature is widely used in shell scripting and command-line operations for tasks like file manipulation, searching, and more.
What is Pathname Expansion?
Pathname expansion occurs when the shell scans a command for special wildcard characters and replaces them with an appropriate list of filenames that match the given pattern. The most commonly used wildcard characters are:
*: Matches any number of characters (including zero).
?: Matches exactly one character.
[...]: Matches any one of the enclosed characters.
The shell processes the wildcard characters after variable expansion and command substitution but before executing the command. This article will delve into the details of how pathname expansion works, including its behavior with hidden files.
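The ordering described above can be seen in a minimal sketch, assuming bash or another POSIX-style shell. The scratch directory and the variable names (demo, pattern, unquoted, quoted) are only illustrative. Because the variable is expanded first, its unquoted value is then treated as a pattern, while the double-quoted form is passed through untouched.

```shell
# Work in a throwaway directory so the demo does not touch real files.
orig_dir=$PWD
demo=$(mktemp -d)
cd "$demo" || exit 1
touch a.txt b.txt

pattern='*.txt'            # single quotes keep the pattern literal here

unquoted=$(echo $pattern)  # variable expands first, then the glob is matched
quoted=$(echo "$pattern")  # double quotes suppress pathname expansion

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

# Clean up the scratch directory.
cd "$orig_dir" && rm -rf "$demo"
```

Quoting a variable is therefore the usual way to stop a stored pattern from being expanded by accident.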
How Pathname Expansion Works
When you type a command with a wildcard character and hit Enter, the shell interprets the character and substitutes it with filenames that match the pattern. Here's a simplified step-by-step process:
Tokenization: The shell splits the command into tokens (words). Pathname expansion applies to each token separately.
Scan for Wildcards: The shell looks for wildcard characters (*, ?, [...]) in each token.
Matching and Replacement: For each token containing a wildcard, the shell scans the relevant directory and generates a list of filenames that match the pattern.
Sort and Replace: The shell sorts the list of filenames alphabetically (by default, according to the current locale's collation order) and replaces the wildcard token with the sorted list.
Command Execution: Finally, the command is executed with the expanded filenames.
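One way to observe that the shell, not the command, performs the expansion is to count the arguments a command receives. This is a minimal sketch assuming a POSIX-style shell; count_args is a hypothetical helper and the .log filenames are made up for the demo.

```shell
# Hypothetical helper: reports how many arguments it actually received.
count_args() {
  echo "$#"
}

orig_dir=$PWD
demo=$(mktemp -d)
cd "$demo" || exit 1
touch one.log two.log three.log

expanded=$(count_args *.log)    # the shell passes three separate filenames
literal=$(count_args '*.log')   # quoting passes the single word *.log

echo "unquoted pattern: $expanded argument(s)"
echo "quoted pattern:   $literal argument(s)"

# Clean up the scratch directory.
cd "$orig_dir" && rm -rf "$demo"
```

The helper never sees the asterisk in the unquoted case; it only sees the filenames the shell substituted before running it.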
The * Wildcard
The asterisk (*) is a wildcard that matches zero or more characters. When used in a command, it can refer to any number of files that share a common pattern.
Example of Using *
Suppose your current directory contains the following files:
file1.txt
file2.txt
file3.txt
document1.txt
document2.txt
You can use * to list all .txt files like so:
ls *.txt
This command would expand to the following (the matches are sorted, so the document files come before the file entries):
ls document1.txt document2.txt file1.txt file2.txt file3.txt
Using * with Prefix and Suffix
You can also use * with both a prefix and a suffix to narrow down the list further. For example:
ls file*.txt
This command would expand to:
ls file1.txt file2.txt file3.txt
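Since * can also match zero characters, a prefix pattern matches the bare prefix too. The sketch below assumes a POSIX-style shell. The extra filenames (file.txt, file22.txt, profile.txt) and the count_words helper are hypothetical additions for illustration, and counts are used instead of listings so the result does not depend on locale sort order.

```shell
# Hypothetical helper: reports how many words it received.
count_words() {
  echo "$#"
}

orig_dir=$PWD
demo=$(mktemp -d)
cd "$demo" || exit 1
touch file.txt file1.txt file22.txt profile.txt

# file*.txt requires the name to start with "file"; * may match nothing.
prefix_count=$(count_words file*.txt)   # file.txt, file1.txt, file22.txt

# *file*.txt also allows characters before "file".
any_count=$(count_words *file*.txt)     # the three above plus profile.txt

echo "file*.txt matched $prefix_count name(s)"
echo "*file*.txt matched $any_count name(s)"

# Clean up the scratch directory.
cd "$orig_dir" && rm -rf "$demo"
```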
The ? Wildcard
The question mark (?) is another wildcard, but unlike *, it matches exactly one character. This is useful when you know the structure of the filenames but are unsure about certain individual characters.
Example of Using ?
Let's consider the same directory as before:
file1.txt
file2.txt
file3.txt
document1.txt
document2.txt
If you want to list all file entries that have exactly one character (here, a single digit) before the .txt extension, you can use the ? wildcard like this:
ls file?.txt
This command would expand to:
ls file1.txt file2.txt file3.txt
The ?
matches exactly one character, so it doesn't include files that have
more than one character in that position (like if you had file10.txt
, for
instance).
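The file10.txt case can be checked with a short sketch, assuming a POSIX-style shell and a scratch directory created only for the demo. Each ? accounts for exactly one character, so two of them are needed to reach the two-digit name.

```shell
orig_dir=$PWD
demo=$(mktemp -d)
cd "$demo" || exit 1
touch file1.txt file2.txt file10.txt

one_char=$(echo file?.txt)     # exactly one character between "file" and ".txt"
two_chars=$(echo file??.txt)   # exactly two characters in that position

echo "file?.txt  -> $one_char"
echo "file??.txt -> $two_chars"

# Clean up the scratch directory.
cd "$orig_dir" && rm -rf "$demo"
```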
Combining * and ?
You can also combine wildcards, or give several patterns in one command, for more complex matching. For example, suppose you want to list files that start with either file or document and end in a single character (a digit, in this directory) followed by .txt. You could use:
ls file?.txt document?.txt
This would expand to the following (each pattern is expanded separately, in the order given):
ls file1.txt file2.txt file3.txt document1.txt document2.txt
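Both wildcards can also appear inside a single pattern. The following is a minimal sketch assuming a POSIX-style shell. The patterns and the extra file.txt entry are illustrative choices, not taken from the examples above.

```shell
orig_dir=$PWD
demo=$(mktemp -d)
cd "$demo" || exit 1
touch file.txt file1.txt file2.txt file3.txt document1.txt document2.txt

# ?o*.txt: any first character, then "o", then anything, then ".txt".
second_o=$(echo ?o*.txt)

# file*?.txt: "file", then at least one more character before ".txt",
# because ? must match one character even when * matches nothing.
at_least_one=$(echo file*?.txt)

echo "?o*.txt    -> $second_o"
echo "file*?.txt -> $at_least_one"

# Clean up the scratch directory.
cd "$orig_dir" && rm -rf "$demo"
```

Here file.txt is left out of the second result because nothing remains for the ? to match.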
Expansion of Character Classes
Suppose you have the following files in your directory:
file1.txt
file2.txt
file3.txt
fileA.txt
fileB.txt
fileC.txt
Now, let's say you want to list only the files that end with 1.txt, 2.txt, or A.txt.
You can use the following command:
ls file[12A].txt
The [12A] part is a character class, which means "match one of the characters 1, 2, or A." So, in this example, ls file[12A].txt would expand to:
ls file1.txt file2.txt fileA.txt
Range within [...]
You can also specify a range of characters:
ls file[1-3].txt
Here [1-3] would match any single character from 1 to 3, so the command would expand to:
ls file1.txt file2.txt file3.txt
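Ranges and individual characters can be mixed within one set of brackets. The sketch below assumes a POSIX-style shell and a scratch directory. The bracket contents are illustrative choices.

```shell
orig_dir=$PWD
demo=$(mktemp -d)
cd "$demo" || exit 1
touch file1.txt file2.txt file3.txt fileA.txt fileB.txt

# A digit range restricts the position to 0 through 9.
digits=$(echo file[0-9].txt)

# Two ranges in one bracket: 1 through 2, or A through B.
mixed=$(echo file[1-2A-B].txt)

echo "file[0-9].txt    -> $digits"
echo "file[1-2A-B].txt -> $mixed"

# Clean up the scratch directory.
cd "$orig_dir" && rm -rf "$demo"
```

Unlike ?, a range such as [0-9] only accepts the listed characters, which makes it the better choice when a position must hold a digit.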
Negation within [...]
You can also negate a character class by placing a ! or ^ as the first character inside the brackets. The ! form is the portable one specified by POSIX; bash also accepts ^.
For example:
ls file[!A-C].txt
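This command would list files named file, followed by a single character that is not A, B, or C, followed by .txt. In other words, it skips the files that end with A.txt, B.txt, or C.txt. In our example directory, the expansion would be: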
ls file1.txt file2.txt file3.txt
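A runnable version of the negation idea, as a minimal sketch assuming a POSIX-style shell run as a script; the filenames and variable names are illustrative. It uses the ! form, since that is the one POSIX specifies.

```shell
orig_dir=$PWD
demo=$(mktemp -d)
cd "$demo" || exit 1
touch file1.txt file2.txt fileA.txt fileB.txt

# [!0-9] matches one character that is NOT a digit.
non_digits=$(echo file[!0-9].txt)

# [!A-Z] matches one character that is NOT an uppercase letter.
non_letters=$(echo file[!A-Z].txt)

echo "file[!0-9].txt -> $non_digits"
echo "file[!A-Z].txt -> $non_letters"

# Clean up the scratch directory.
cd "$orig_dir" && rm -rf "$demo"
```

A negated class still has to match exactly one character, so a name like file.txt would not match either pattern.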
Pathname Expansion and Hidden Files
In Linux, hidden files start with a dot (.). When using pathname expansion, the shell generally does not include hidden files unless the wildcard pattern explicitly starts with a dot.
For example, ls * would list all non-hidden files, but it wouldn't list hidden files like .bashrc or .gitignore. If you wish to include hidden files in pathname expansion, you can use patterns that start with a dot, like ls .* or ls .[!.]* (the latter excludes the . and .. entries).
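The difference between the two kinds of pattern can be seen in a small sketch, assuming a POSIX-style shell with default globbing options. The filenames are made up for the demo.

```shell
orig_dir=$PWD
demo=$(mktemp -d)
cd "$demo" || exit 1
touch visible.txt .hidden .config

# A bare * skips names that begin with a dot.
plain=$(echo *)

# .[!.]* asks for a leading dot followed by a non-dot character,
# so it picks up hidden files without matching . or ..
dotted=$(echo .[!.]*)

echo "*      -> $plain"
echo ".[!.]* -> $dotted"

# Clean up the scratch directory.
cd "$orig_dir" && rm -rf "$demo"
```

Note that .[!.]* would also skip a name that starts with two dots, such as a hypothetical ..backup file.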
Examples with Hidden Files
# List all hidden files that end with `.bak`
ls .*.bak
This would list all hidden .bak files like .file.bak and .data.bak.
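# List all files, including hidden ones, whose name is a single character (after the leading dot, for hidden files)
ls ? .[!.]
This would list both visible and hidden files with a single-character name, assuming such files exist. The pattern .[!.] is used rather than .? because, in many shells, .? can also match the .. entry and would make ls list the parent directory.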
Conclusion
Understanding pathname expansion is crucial for effective shell scripting and command-line usage. It allows you to manipulate and operate on groups of files easily, making many tasks more convenient and streamlined.
Knowing how pathname expansion interacts with hidden files also adds another layer of sophistication to your command-line skills. It enables you to perform operations explicitly on hidden or non-hidden files, giving you greater control over your file management tasks.