Sorting Tabular Data with the sort Command in Linux
Tabular data is prevalent in computing for representing data sets, databases, or
simple lists. In Linux, the sort command is a powerful tool for arranging
tabular data in a specific order, be it numerical, alphabetical, or custom.
Preparing Tabular Data
Before sorting, we need to create a file with tabular data. We will use vim to
create and edit a file named employees.txt that contains a list of employees
with their department and salary.
Creating Tabular Data Using vim:
- Open vim by typing vim employees.txt in your terminal.
- Enter insert mode by pressing i.
- Type the following tabular data:
John Doe Sales 50000
Jane Smith Marketing 55000
Bob Johnson Engineering 60000
Alice Brown HR 52000
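- To ensure that the columns are separated by tabs, press Tab between each field:
  once after the full name and once after the department. The first and last
  name are separated by an ordinary space, so each line has three fields.
- Save and exit by pressing Esc, typing :wq, and then hitting Enter.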
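If you would rather not type the tabs by hand, the same file can be produced non-interactively. The following is a minimal sketch, assuming a POSIX shell with printf and awk available; it overwrites employees.txt in the current directory.

```shell
# Write four tab-separated records. printf reuses the format string for each
# group of three arguments and expands every \t to a real tab character.
printf '%s\t%s\t%s\n' \
  'John Doe' 'Sales' '50000' \
  'Jane Smith' 'Marketing' '55000' \
  'Bob Johnson' 'Engineering' '60000' \
  'Alice Brown' 'HR' '52000' > employees.txt

# Count the tab-separated fields on each line; every line should report 3.
awk -F'\t' '{ print NF }' employees.txt
```

If any line reports a number other than 3, a tab is missing or a space was typed in its place.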
Sorting Tabular Data
With the data prepared, let's explore how we can sort it using different criteria.
1. Sort by Name (Alphabetical Order)
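sort -t $'\t' -k1,1 employees.txt
sort: Invokes the sort command.
-t $'\t': Sets the field separator to a tab. Without it, sort splits fields on whitespace, so "John" and "Doe" would count as two separate fields. The $'\t' form is Bash quoting for a literal tab character.
-k1,1: The -k option specifies which key (column) to sort on. 1,1 means the key starts and ends at the first field, which is the Name column in our data.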
2. Sort by Department
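sort -t $'\t' -k2,2 employees.txt
-t $'\t' -k2,2: With a tab as the field separator, this sorts the lines on the second field, which in our example is the Department. The key runs from the start of field 2 to the end of field 2. With the default whitespace splitting, field 2 would be the surname instead.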
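Because the names contain a space, the field separator decides what "field 2" actually is. The sketch below compares both behaviours on the sample data. It builds the tab with printf so that it also runs in shells without $'\t' quoting, and it sets LC_ALL=C so the ordering does not depend on the locale; both choices are assumptions made for reproducibility, not requirements.

```shell
# Recreate the sample file with real tabs between the three fields.
printf 'John Doe\tSales\t50000\nJane Smith\tMarketing\t55000\nBob Johnson\tEngineering\t60000\nAlice Brown\tHR\t52000\n' > employees.txt

tab=$(printf '\t')   # a literal tab character

# Default separator: field 2 is the surname (Brown, Doe, Johnson, Smith).
LC_ALL=C sort -k2,2 employees.txt

# Tab separator: field 2 is the department (Engineering, HR, Marketing, Sales).
LC_ALL=C sort -t "$tab" -k2,2 employees.txt
```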
3. Sort by Salary (Numerical Order)
sort -t $'\t' -k3,3n employees.txt
-t $'\t' -k3,3n: With a tab as the field separator, this sorts on the third field (Salary). The n modifier compares the key by its numeric value rather than character by character.
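The sample salaries all have five digits, so they happen to sort the same way with or without n. A small sketch with hypothetical values of different lengths (salaries.txt is not part of the original example) shows where the two comparisons diverge:

```shell
# Hypothetical salaries with 5, 4, and 6 digits.
printf '50000\n9000\n120000\n' > salaries.txt

# Character-by-character comparison: 120000, 50000, 9000.
LC_ALL=C sort salaries.txt

# Numeric comparison: 9000, 50000, 120000.
LC_ALL=C sort -n salaries.txt
```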
4. Combining Sorts
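sort -t $'\t' -k2,2 -k3,3n employees.txt
-t $'\t' -k2,2 -k3,3n: With a tab as the field separator, this sorts with a primary key of the second field (Department) and a secondary key of the third field (Salary). The secondary key is consulted only when two lines have the same department, so within each department, entries are sorted by salary.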
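In the sample file every department appears only once, so the secondary key never comes into play. The sketch below uses a hypothetical file, staff.txt, in which two departments each have two employees (Carol White and Dan Green are invented for illustration):

```shell
# Hypothetical data: two Sales rows and two HR rows, tab-separated.
printf 'John Doe\tSales\t50000\nCarol White\tSales\t48000\nAlice Brown\tHR\t52000\nDan Green\tHR\t51000\n' > staff.txt

tab=$(printf '\t')   # a literal tab character

# Department first, then salary ascending within each department:
# Dan Green (HR 51000), Alice Brown (HR 52000),
# Carol White (Sales 48000), John Doe (Sales 50000).
LC_ALL=C sort -t "$tab" -k2,2 -k3,3n staff.txt
```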
5. Reverse Sorting
sort -t $'\t' -k3,3nr employees.txt
-t $'\t' -k3,3nr: With a tab as the field separator, this sorts on the third field (Salary). The n modifier compares numerically and the r modifier reverses the result of comparisons, so the output runs from the highest salary to the lowest.
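A common use of a reverse numeric sort is picking out the top entry. As a minimal sketch on the sample data, piping the result into head keeps only the highest-paid employee:

```shell
# Recreate the sample file with real tabs between the three fields.
printf 'John Doe\tSales\t50000\nJane Smith\tMarketing\t55000\nBob Johnson\tEngineering\t60000\nAlice Brown\tHR\t52000\n' > employees.txt

tab=$(printf '\t')   # a literal tab character

# Highest salary first; head keeps only the first row (Bob Johnson, 60000).
LC_ALL=C sort -t "$tab" -k3,3nr employees.txt | head -n 1
```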
6. Check Sorted Data
sort -c -t $'\t' -k2,2 employees.txt
-c -k2,2: The -c option checks whether the file is already sorted by the specified key, in this case the Department column (with a tab as the field separator). It is a separate option, not a key modifier, so it cannot be appended to the key as in -k2,2c. If the file is not sorted, sort does not rearrange the lines; it reports the first out-of-order line and exits with a non-zero status.
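Since the check communicates its result through the exit status, it is convenient to use in scripts. The following is a sketch under the assumption that the sample file is in its original, unsorted order; by_department.txt is a hypothetical output file name.

```shell
# Recreate the sample file in its original (unsorted) order.
printf 'John Doe\tSales\t50000\nJane Smith\tMarketing\t55000\nBob Johnson\tEngineering\t60000\nAlice Brown\tHR\t52000\n' > employees.txt

tab=$(printf '\t')   # a literal tab character

# -c only checks; the diagnostic on stderr is discarded here.
if LC_ALL=C sort -c -t "$tab" -k2,2 employees.txt 2>/dev/null; then
  echo "employees.txt is sorted by department"
else
  echo "employees.txt is not sorted by department"
fi

# Sort into a new file, then run the same check on the result.
LC_ALL=C sort -t "$tab" -k2,2 employees.txt > by_department.txt
LC_ALL=C sort -c -t "$tab" -k2,2 by_department.txt && echo "by_department.txt is sorted"
```

The first check takes the else branch because Marketing follows Sales in the original file; the second check succeeds silently and the echo runs.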
Each command reads the employees.txt file, and the result depends on the keys
(-k) and on the modifiers and options used (such as n, r, and the -c check).
The sorting commands print the reordered lines to standard output based on the
specified criteria; the file itself is left unchanged.
Conclusion
Sorting tabular data in Linux using the sort command is a straightforward
process that can be customized in various ways. By mastering the different flags
and understanding how to define sort keys, you can efficiently handle the
organization of data in your day-to-day tasks, making data analysis, reporting,
and management much more effective and tailored to your needs.