This is nice. I use `column` for pretty-printing CSV/TSV, but this fixes two tiny gaps: one in `sort` (skipping header lines) and one in `jq` (parsing CSV input; `jq` supports `@csv` for output conversion but not input).
$ cat example.csv
color,shape,flag,index
yellow,triangle,1,11
red,square,1,15
red,circle,1,16
red,square,0,48
purple,triangle,0,51
red,square,0,77
# pretty printing
$ column -ts',' example.csv
color   shape     flag  index
yellow  triangle  1     11
red     square    1     15
red     circle    1     16
red     square    0     48
purple  triangle  0     51
red     square    0     77
# sorting with skipped headers is a mess.
$ (head -n 1 example.csv && tail -n +2 example.csv | sort -r -k4 -t',') | column -ts','
color   shape     flag  index
red     square    0     77
purple  triangle  0     51
red     square    0     48
red     circle    1     16
red     square    1     15
yellow  triangle  1     11
> `jq` supports `@csv` for output conversion but not input
Actually, `jq` can cope with trivial CSV input like your example: `jq -R 'split(",")'` will turn each line of a CSV into an array. To then sort it in reverse order by 3rd column and retain the header, the following fell out of my fingers (I'm beyond certain that a more skilled `jq` user than me could improve it):
I like these command line tools, but I think they can cripple someone who is actually learning a programming language. For example, here is a short program that does your last example:
The whole point of UNIX userland is to not have to write a custom program for every simple case that just needs recombining some existing basic programs in a pipeline...
That's just it though: the last example is not a simple case, which is why it's awkward by the commenter's own admission. Command line tools are fine, but you need to know when to set the hammer down and pick up the chainsaw.
As far as shell scripting goes, this is hardly anything to write home about. Looks simple enough to me.
It just retains the header by printing the header first as is, and then sorting the lines after the header. It's immediately obvious how to do it to anybody who knows about head and tail.
And with Miller it's even simpler than that, still on the command line...
To me the last example is still simple. When I encounter this in the wild, I don't really care about preserving the header.
tail -n +2 example.csv | sort -r -k4 -t','
Or more often, I just do this and ignore the header
sort -r -k4 -t',' example.csv
Keeping the header feels awkward, but using `sort` to reverse sort by a specific column is still quicker to type and execute (for me) than writing a program.
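One caveat with the `sort` invocations above: without `-n`, `sort` compares the key as text, so it only produces numeric order here because every `index` value happens to have two digits. A quick Python sketch of the difference (the values `9` and `100` are made up for illustration, they are not in example.csv):

```python
# Hypothetical index values; 9 and 100 do not appear in example.csv.
values = ["48", "9", "100", "77"]

# Text comparison, roughly what `sort -r` does without -n.
as_text = sorted(values, reverse=True)

# Numeric comparison, roughly what `sort -rn` does.
as_numbers = sorted(values, key=int, reverse=True)

print(as_text)     # ['9', '77', '48', '100']
print(as_numbers)  # ['100', '77', '48', '9']
```

With `sort` itself, adding `-n` (something like `sort -t',' -k4,4 -rn`) should give the numeric ordering.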
I thought the Go code looked way too complex and Python would be simpler. Yes and no.
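import csv
filename = 'example.csv'
sort_by = 'index'
reverse = True
# newline='' lets the csv module handle line endings itself
with open(filename, newline='') as f:
    lines = list(csv.DictReader(f))
# Convert the sort column to int so the sort is numeric, not lexical
for line in lines:
    line[sort_by] = int(line[sort_by])
lines.sort(key=lambda line: line[sort_by], reverse=reverse)
# Print the header first, then the sorted rows
print(','.join(lines[0].keys()))
for line in lines:
    print(','.join(str(v) for v in line.values()))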
I thought about that but 1) it seemed like cheating to write to standard out, 2) you're assuming that the column to sort by is an integer whereas I broke that code up a little bit.
But yours has the advantage of being able to support more complex CSVs.
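For what it's worth, here is a minimal sketch that tries to address both points: it goes through the `csv` module for output as well (so quoted fields with embedded commas survive the round trip), and it doesn't assume the sort column is an integer. The names `sort_key` and `sort_csv` are made up for this example, and putting numeric-looking values ahead of other text is just one possible choice.

```python
import csv
import io


def sort_key(value):
    """Order numeric-looking values numerically, and before any other text."""
    try:
        return (0, float(value), "")
    except ValueError:
        return (1, 0.0, value)


def sort_csv(text, sort_by, reverse=False):
    """Return CSV text sorted by the sort_by column, with the header kept on top."""
    reader = csv.DictReader(io.StringIO(text, newline=""))
    rows = list(reader)
    rows.sort(key=lambda row: sort_key(row[sort_by]), reverse=reverse)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    # Made-up sample with a quoted comma and mixed-width numbers.
    sample = 'color,shape,flag,index\nred,"square, big",0,9\nyellow,triangle,1,100\n'
    print(sort_csv(sample, "index", reverse=True), end="")
```

Working on strings instead of files keeps it easy to try in a REPL; for a real file you would pass `open(filename, newline='')` to `csv.DictReader` instead.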