
'sort | uniq' is another special case of this, and it is much better to replace it with 'sort -u'.

The 'sort' in 'sort | uniq' doesn't know you are going to throw away all the duplicates, so it has to sort and emit every one of them, whereas 'sort -u' can drop them as it goes.
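
To make the difference concrete, here is a rough Python sketch of the two strategies. It is only an analogy, not how sort(1) is implemented, and the function names are made up for the example:

```python
from itertools import groupby

def sort_then_uniq(lines):
    # Analogue of `sort | uniq`: sort every line, duplicates included,
    # then collapse runs of adjacent equal lines.
    return [key for key, _ in groupby(sorted(lines))]

def sort_unique(lines):
    # Loose analogue of `sort -u`: duplicates are discarded first,
    # so only the distinct lines ever get sorted.
    return sorted(set(lines))

lines = ["b\n", "a\n", "b\n", "c\n", "a\n"]
assert sort_then_uniq(lines) == sort_unique(lines)
```

Both give the same result; the second just does less work when the input is mostly duplicates.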

If anyone is wondering, here is an implementation of the Python approach I have lying around:

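  #!/usr/bin/env python2
  # Count how often each line occurs on stdin (Python 2 syntax).
  import sys
  from collections import defaultdict
  c = defaultdict(int)
  for line in sys.stdin:
      c[line] += 1
  # Sort by count, ascending, so the most frequent lines print last.
  top = sorted(c.items(), key=lambda (k,v): v)
  for k, v in top:
      # k still ends in a newline; the trailing comma suppresses print's own.
      print v, k,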

Just for fun, here's a version using `Counter` from the same `collections` module, which makes that blissfully simple:

    #!/usr/bin/env python2
    import sys
    from collections import Counter

    for pair in Counter(sys.stdin).most_common():
        print pair
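
For anyone on Python 3, a rough port of the `Counter` version might look like the following. The `count_lines` helper is my own naming, added only so the counting can be tried without piping anything in:

```python
#!/usr/bin/env python3
import sys
from collections import Counter

def count_lines(lines):
    # Counter accepts any iterable of hashable items, so a file object
    # such as sys.stdin can be passed in directly.
    # most_common() returns (line, count) pairs, highest count first.
    return Counter(lines).most_common()

if __name__ == "__main__":
    for line, count in count_lines(sys.stdin):
        # Each line keeps its trailing newline, so suppress print's own.
        print(count, line, end="")
```

Unlike the `print pair` version, this prints the count followed by the line, roughly like 'uniq -c' but ordered by frequency.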
