Sorting Tabular Data with the sort Command in Linux

Tabular data is prevalent in computing for representing data sets, databases, or simple lists. In Linux, the sort command is a powerful tool for arranging tabular data in a specific order, be it numerical, alphabetical, or custom.

Preparing Tabular Data

Before sorting, we need to create a file with tabular data. We will use vim to create and edit a file named employees.txt that contains a list of employees with their department and salary.

Creating Tabular Data Using vim:

  1. Open vim by typing vim employees.txt in your terminal.
  2. Enter insert mode by pressing i.
  3. Type the following tabular data:
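John Doe       Sales          50000
Jane Smith     Marketing      55000
Bob Johnson    Engineering    60000
Alice Brown    HR             52000
  4. Press Tab, not Space, between the fields on each line, so that every line has exactly three Tab-separated columns. The names contain a space, so the Tab is what marks where one column ends and the next begins. (The listing above is aligned with spaces only for display.)
  5. Save and exit by pressing Esc, typing :wq, and then hitting Enter.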
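
If you would rather not type the Tab characters by hand, the same file can be generated from the shell. This is a minimal sketch rather than part of the vim workflow above: it assumes a POSIX shell with printf and awk available, and it overwrites any existing employees.txt in the current directory.

```shell
# Write the four records; printf reuses the format for each group of three
# arguments, placing a literal Tab (\t) between the fields.
printf '%s\t%s\t%s\n' \
  'John Doe' 'Sales' '50000' \
  'Jane Smith' 'Marketing' '55000' \
  'Bob Johnson' 'Engineering' '60000' \
  'Alice Brown' 'HR' '52000' > employees.txt

# Count the Tab-separated fields on each line; every line should report 3.
awk -F '\t' '{ print NF }' employees.txt
```

If any line reports a number other than 3, a Space was probably typed where a Tab was expected.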

Sorting Tabular Data

With the data prepared, let's explore how we can sort it using different criteria.

1. Sort by Name (Alphabetical Order)

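sort -t $'\t' -k1,1 employees.txt
  • sort: Invokes the sort command.
  • -t $'\t': Sets Tab as the field separator. By default, sort starts a new field at every run of blanks, which would split "John Doe" into two fields. $'\t' is Bash syntax for a literal Tab character.
  • -k1,1: The -k option specifies which key or column to sort on. 1,1 means the key starts and ends at the first field, which is the Name column in our data.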
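
To see the effect without touching any file, the sketch below feeds the same four records to sort through a pipe. It rests on two assumptions that are not part of the article: the Tab is produced with printf so the snippet also runs in shells without $'\t', and LC_ALL=C is set so the comparison is plain byte order regardless of your locale.

```shell
TAB=$(printf '\t')   # a literal Tab character, used as the field separator

# The four sample records, with a Tab between fields.
records=$(printf '%s\t%s\t%s\n' \
  'John Doe' 'Sales' '50000' \
  'Jane Smith' 'Marketing' '55000' \
  'Bob Johnson' 'Engineering' '60000' \
  'Alice Brown' 'HR' '52000')

# Sort on the Name field only.
by_name=$(printf '%s\n' "$records" | LC_ALL=C sort -t "$TAB" -k1,1)
printf '%s\n' "$by_name"
# Names appear in the order: Alice Brown, Bob Johnson, Jane Smith, John Doe
```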

2. Sort by Department

sort -t $'\t' -k2,2 employees.txt
  • -k2,2: With Tab as the field separator (-t $'\t'), sorts the lines on the second field, which in our example is the Department. The key starts and ends at field 2; writing -k2 alone would extend the key from field 2 to the end of the line.
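
Because the names themselves contain a space, which field counts as "second" depends on the separator. The following sketch compares sort's default blank-based splitting with an explicit Tab separator on the same records. The variable names are illustrative, and LC_ALL=C is an assumption made to keep the ordering independent of locale.

```shell
TAB=$(printf '\t')   # a literal Tab character

records=$(printf '%s\t%s\t%s\n' \
  'John Doe' 'Sales' '50000' \
  'Jane Smith' 'Marketing' '55000' \
  'Bob Johnson' 'Engineering' '60000' \
  'Alice Brown' 'HR' '52000')

# Default separator: fields split at blanks, so field 2 is the surname.
default_split=$(printf '%s\n' "$records" | LC_ALL=C sort -k2,2)

# Explicit Tab separator: field 2 is the Department.
tab_split=$(printf '%s\n' "$records" | LC_ALL=C sort -t "$TAB" -k2,2)

echo "Default splitting (ordered by surname):"
printf '%s\n' "$default_split"   # Brown, Doe, Johnson, Smith
echo "Tab splitting (ordered by department):"
printf '%s\n' "$tab_split"       # Engineering, HR, Marketing, Sales
```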

3. Sort by Salary (Numerical Order)

sort -t $'\t' -k3,3n employees.txt
  • -k3,3n: With Tab as the field separator (-t $'\t'), sorts on the third field and compares the keys as numbers. Without the n modifier, sort compares the keys character by character, so a value such as 9000 would be placed after 50000.
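
All four salaries in employees.txt have five digits, so the difference between text and numeric comparison does not show up there. The sketch below adds a hypothetical record ("Temp Worker" with a salary of 9000, not part of the article's data) to make it visible. LC_ALL=C is again an assumption to keep the result locale-independent.

```shell
TAB=$(printf '\t')   # a literal Tab character

# One record from the article plus one hypothetical record with a 4-digit salary.
records=$(printf '%s\t%s\t%s\n' \
  'John Doe' 'Sales' '50000' \
  'Temp Worker' 'Sales' '9000')

# Text comparison: "5" sorts before "9", so 50000 comes first.
lexical=$(printf '%s\n' "$records" | LC_ALL=C sort -t "$TAB" -k3,3)

# Numeric comparison: 9000 is smaller than 50000, so it comes first.
numeric=$(printf '%s\n' "$records" | LC_ALL=C sort -t "$TAB" -k3,3n)

echo "Without n:"; printf '%s\n' "$lexical"
echo "With n:";    printf '%s\n' "$numeric"
```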

4. Combining Sorts

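sort -t $'\t' -k2,2 -k3,3n employees.txt
  • -k2,2 -k3,3n: With Tab as the field separator (-t $'\t'), this sorts with a primary key of the second field (Department) and a secondary key of the third field (Salary). The secondary key is consulted only when two lines have the same department, so within each department, entries are sorted by salary.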
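
In employees.txt every department appears only once, so the secondary key never comes into play. To illustrate the tie-breaking, the sketch below uses two records from the article plus two hypothetical ones ("Carol White" and "Dan Green" are invented for this example) so that each department has two members.

```shell
TAB=$(printf '\t')   # a literal Tab character

# Two departments with two employees each; Carol and Dan are hypothetical.
records=$(printf '%s\t%s\t%s\n' \
  'John Doe' 'Sales' '50000' \
  'Carol White' 'Sales' '45000' \
  'Bob Johnson' 'Engineering' '60000' \
  'Dan Green' 'Engineering' '58000')

# Primary key: Department (text). Secondary key: Salary (numeric).
by_dept_salary=$(printf '%s\n' "$records" |
  LC_ALL=C sort -t "$TAB" -k2,2 -k3,3n)

printf '%s\n' "$by_dept_salary"
# Order: Dan Green, Bob Johnson (Engineering), then Carol White, John Doe (Sales)
```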

5. Reverse Sorting

sort -t $'\t' -k3,3nr employees.txt
  • -k3,3nr: The r modifier reverses the result of comparisons, so with Tab as the field separator (-t $'\t') this command sorts by the third field (Salary) in reverse numerical order, from highest to lowest.
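
A common use of a descending sort is to pick out the top entry. The sketch below pipes the reversed output into head to keep only the highest-paid employee. Combining sort with head is an illustration added here, not a step from the article.

```shell
TAB=$(printf '\t')   # a literal Tab character

records=$(printf '%s\t%s\t%s\n' \
  'John Doe' 'Sales' '50000' \
  'Jane Smith' 'Marketing' '55000' \
  'Bob Johnson' 'Engineering' '60000' \
  'Alice Brown' 'HR' '52000')

# Sort by salary from highest to lowest, then keep only the first line.
top_earner=$(printf '%s\n' "$records" |
  LC_ALL=C sort -t "$TAB" -k3,3nr | head -n 1)

printf '%s\n' "$top_earner"
# The line shown is the record for Bob Johnson (salary 60000)
```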

6. Check Sorted Data

sort -t $'\t' -k2,2 -c employees.txt
  • -c: Checks whether the file is already sorted by the specified key, here the Department column (the second Tab-separated field). Note that c is a separate option, not a key modifier like n or r, so it is written on its own rather than appended to -k2,2. sort does not rearrange or print the lines; if the file is not sorted, it reports the first out-of-order line and exits with a non-zero status.
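
Since -c communicates its result through the exit status, it fits naturally into a shell conditional. The sketch below checks the records first in their original order and then after sorting them by department. The variable names are illustrative, and the wording of the diagnostic that sort prints for the unsorted case varies between implementations.

```shell
TAB=$(printf '\t')   # a literal Tab character

records=$(printf '%s\t%s\t%s\n' \
  'John Doe' 'Sales' '50000' \
  'Jane Smith' 'Marketing' '55000' \
  'Bob Johnson' 'Engineering' '60000' \
  'Alice Brown' 'HR' '52000')

# Original order: Sales comes before Marketing, so the check fails
# and sort prints a diagnostic to standard error.
if printf '%s\n' "$records" | LC_ALL=C sort -t "$TAB" -k2,2 -c; then
  original_state='sorted'
else
  original_state='not sorted'
fi

# Sort by department first, then run the same check on the result.
sorted=$(printf '%s\n' "$records" | LC_ALL=C sort -t "$TAB" -k2,2)
if printf '%s\n' "$sorted" | LC_ALL=C sort -t "$TAB" -k2,2 -c; then
  sorted_state='sorted'
else
  sorted_state='not sorted'
fi

echo "original order: $original_state"
echo "after sorting:  $sorted_state"
```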

Each command is applied to the employees.txt file, and the effect varies depending on the keys (-k) and the options (like n, r, c) used. None of these commands changes employees.txt itself: sort reads the file and writes the reordered lines to standard output, based on the specified criteria.

Conclusion

Sorting tabular data in Linux using the sort command is a straightforward process that can be customized in various ways. By mastering the different flags and understanding how to define sort keys, you can efficiently handle the organization of data in your day-to-day tasks, making data analysis, reporting, and management much more effective and tailored to your needs.

What Can You Do Next 🙏😊

If you liked the article, consider subscribing to Cloudaffle, my YouTube Channel, where I keep posting in-depth tutorials and all edutainment stuff for software developers.
