Sorting Tabular Data with the sort Command in Linux
Tabular data is common in computing: data sets, database exports, and simple
lists are often stored as rows and columns. In Linux, the sort command is a
powerful tool for arranging such data in numerical, alphabetical, or custom order.
Preparing Tabular Data
Before sorting, we need to create a file with tabular data. We will use vim to
create and edit a file named employees.txt that contains a list of employees
with their department and salary.
Creating Tabular Data Using vim:
- Open vim by typing vim employees.txt in your terminal.
- Enter insert mode by pressing i.
- Type the following tabular data:
John Doe Sales 50000
Jane Smith Marketing 55000
Bob Johnson Engineering 60000
Alice Brown HR 52000
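- To ensure that the columns are separated by tabs, press Tab between each field. The space inside each name is an ordinary space, so every line has exactly three tab-separated fields: Name, Department, and Salary.
- Save and exit by pressing Esc, typing :wq, and then hitting Enter.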
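For scripted setups, the same file can be produced without an editor. The following is a minimal sketch, assuming a POSIX shell with printf and awk available. It overwrites employees.txt in the current directory, and the awk check at the end is an extra verification step, not part of the walkthrough above.

```shell
# Write the four sample rows with a literal tab (\t) between fields.
# printf reuses the format string for every group of three arguments.
printf '%s\t%s\t%s\n' \
  'John Doe' 'Sales' '50000' \
  'Jane Smith' 'Marketing' '55000' \
  'Bob Johnson' 'Engineering' '60000' \
  'Alice Brown' 'HR' '52000' > employees.txt

# Count the tab-separated fields on each line; every line should report 3.
awk -F '\t' '{ print NF }' employees.txt
```

If any line reports a number other than 3, a tab is missing or was replaced by spaces.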
Sorting Tabular Data
With the data prepared, let's explore how we can sort it using different criteria.
1. Sort by Name (Alphabetical Order)
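sort -t $'\t' -k1,1 employees.txt
- sort: Invokes the sort command.
- -t $'\t': Sets the field separator to a tab ($'\t' is Bash syntax for a literal tab character). Without it, sort starts a new field at every run of blanks, so "John Doe" would count as two fields.
- -k1,1: The -k option specifies which key (column) to sort on. 1,1 means the key starts and ends at the first field, which is the Name column in our data.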
2. Sort by Department
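sort -t $'\t' -k2,2 employees.txt
- -t $'\t': Uses the tab as the field separator, so the two-word name counts as a single field and Department is field 2. With the default separator, field 2 would be the last name instead.
- -k2,2: Sorts the lines on the second field, which in our example is the Department. The key starts and ends at field 2, so it does not extend into the Salary column.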
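The effect of the field separator on -k2,2 can be seen by running the same key with and without -t. This is a sketch under a few assumptions: the rows helper and the TAB variable are illustrative names, the data is fed through a pipe instead of employees.txt so the example is self-contained, and LC_ALL=C is set only to make the ordering reproducible across locales.

```shell
TAB=$(printf '\t')   # portable way to hold a literal tab character

# Sample rows, tab-separated, with an ordinary space inside each name.
rows() {
  printf '%s\t%s\t%s\n' \
    'John Doe' 'Sales' '50000' \
    'Jane Smith' 'Marketing' '55000' \
    'Bob Johnson' 'Engineering' '60000' \
    'Alice Brown' 'HR' '52000'
}

# Default separator: field 2 is the last name (Brown, Doe, Johnson, Smith).
by_last_name=$(rows | LC_ALL=C sort -k2,2 | cut -f 1)

# Tab separator: field 2 is the department (Engineering, HR, Marketing, Sales).
by_department=$(rows | LC_ALL=C sort -t "$TAB" -k2,2 | cut -f 1)

printf 'default separator:\n%s\n\ntab separator:\n%s\n' \
  "$by_last_name" "$by_department"
```

The first list comes out ordered by last name and the second by department, which is why the tab separator has to be stated explicitly for this file.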
3. Sort by Salary (Numerical Order)
sort -t $'\t' -k3,3n employees.txt
- -t $'\t': Uses the tab as the field separator, so Salary is field 3.
- -k3,3n: Sorts on the third field (Salary) and interprets the key as a number. The n modifier makes sort compare by numeric value rather than character by character.
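All salaries in the sample file have five digits, so text order and numeric order happen to agree. The difference shows up as soon as the numbers differ in length. The sketch below adds a hypothetical row (Eve Adams, not part of employees.txt) to illustrate it; rows and TAB are illustrative names, and LC_ALL=C is assumed only for reproducible text comparison.

```shell
TAB=$(printf '\t')   # literal tab character

rows() {
  printf '%s\t%s\t%s\n' \
    'John Doe' 'Sales' '50000' \
    'Eve Adams' 'Intern' '9000'   # hypothetical row, not in employees.txt
}

# Without n, keys are compared as text: "5" sorts before "9".
as_text=$(rows | LC_ALL=C sort -t "$TAB" -k3,3 | cut -f 3 | paste -s -d ' ' -)

# With n, keys are compared by numeric value.
as_number=$(rows | LC_ALL=C sort -t "$TAB" -k3,3n | cut -f 3 | paste -s -d ' ' -)

printf 'text order:    %s\n' "$as_text"
printf 'numeric order: %s\n' "$as_number"
```

Text order puts 50000 before 9000, while numeric order puts 9000 first.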
4. Combining Sorts
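sort -t $'\t' -k2,2 -k3,3n employees.txt
- -t $'\t': Uses the tab as the field separator, so Department is field 2 and Salary is field 3.
- -k2,2 -k3,3n: Sorts with a primary key of the second field (Department) and a secondary key of the third field (Salary), compared numerically. Within each department, entries are sorted by salary. In the sample data every department appears only once, so the secondary key matters only when a department has several employees.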
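To see the secondary key at work, a department needs more than one employee. The following sketch adds a hypothetical second Sales row (Carol White, not part of employees.txt); rows and TAB are illustrative names, and LC_ALL=C is assumed for a reproducible department order.

```shell
TAB=$(printf '\t')   # literal tab character

rows() {
  printf '%s\t%s\t%s\n' \
    'John Doe' 'Sales' '50000' \
    'Alice Brown' 'HR' '52000' \
    'Carol White' 'Sales' '45000'   # hypothetical row, not in employees.txt
}

# Primary key: department (text). Secondary key: salary (numeric).
sorted=$(rows | LC_ALL=C sort -t "$TAB" -k2,2 -k3,3n)
printf '%s\n' "$sorted"
```

HR comes first, and the two Sales rows follow in ascending salary order: Carol White (45000), then John Doe (50000).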
5. Reverse Sorting
sort -t $'\t' -k3,3nr employees.txt
- -t $'\t': Uses the tab as the field separator, so Salary is field 3.
- -k3,3nr: The r modifier reverses the result of comparisons for this key, so the command sorts by the third field (Salary) in reverse numerical order, from highest to lowest.
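A common use of a reverse numeric sort is picking the top entry. This sketch pipes the sample rows through sort and keeps only the first line; rows, TAB, and top_earner are illustrative names, and the data is inlined so the example runs on its own.

```shell
TAB=$(printf '\t')   # literal tab character

rows() {
  printf '%s\t%s\t%s\n' \
    'John Doe' 'Sales' '50000' \
    'Jane Smith' 'Marketing' '55000' \
    'Bob Johnson' 'Engineering' '60000' \
    'Alice Brown' 'HR' '52000'
}

# Highest salary first, then keep only the first line.
top_earner=$(rows | sort -t "$TAB" -k3,3nr | head -n 1)
printf '%s\n' "$top_earner"
```

With the sample data this prints the row for Bob Johnson (Engineering, 60000).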
6. Check Sorted Data
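sort -t $'\t' -c -k2,2 employees.txt
- -t $'\t': Uses the tab as the field separator, so Department is field 2.
- -c: Checks whether the input is already sorted instead of sorting it. -c is a global option rather than a key modifier, so it is given separately from -k.
- -k2,2: Restricts the check to the Department column. If the file is already sorted by department, sort prints nothing and exits with status 0. If it is not, sort does not rearrange the lines; it reports the first out-of-order line and exits with status 1.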
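Because the check reports its result through the exit status, it fits naturally into a script condition. The sketch below is one way to wrap it, assuming a POSIX shell; check_department_order, rows, and TAB are illustrative names, the diagnostic message is discarded with 2>/dev/null, and LC_ALL=C is set for a reproducible comparison.

```shell
TAB=$(printf '\t')   # literal tab character

# Sample rows in their original, unsorted order.
rows() {
  printf '%s\t%s\t%s\n' \
    'John Doe' 'Sales' '50000' \
    'Jane Smith' 'Marketing' '55000' \
    'Bob Johnson' 'Engineering' '60000' \
    'Alice Brown' 'HR' '52000'
}

# Reads rows on standard input and reports whether field 2 is in order.
# sort -c exits with 0 when the input is sorted and 1 when it is not.
check_department_order() {
  if LC_ALL=C sort -t "$TAB" -c -k2,2 2>/dev/null; then
    echo "sorted by department"
  else
    echo "not sorted by department"
  fi
}

rows | check_department_order
rows | LC_ALL=C sort -t "$TAB" -k2,2 | check_department_order
```

The first call reports that the original order is not sorted by department; the second, which sorts the rows first, reports that it is.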
Each command reads the employees.txt file, and the result depends on the keys
(-k), the key modifiers (such as n and r), and options such as -c. None of these
commands changes the file itself: sort writes the sorted lines to standard
output, so redirect the output or use the -o option to save the result.
Conclusion
Sorting tabular data in Linux using the sort command is a straightforward
process that can be customized in various ways. By mastering the different flags
and understanding how to define sort keys, you can efficiently handle the
organization of data in your day-to-day tasks, making data analysis, reporting,
and management much more effective and tailored to your needs.