
You can do this without sorting:

    awk '!x[$0]++'
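
For anyone puzzling over why that works, here is a rough expansion of the same idea. It is only a sketch: `dedupe` and `seen` are made-up names, and the sample input is invented for illustration. An unset array element counts as 0, so the pattern is true only the first time a line shows up, and the post-increment then marks it as seen.

```shell
# dedupe: hypothetical helper, same effect as awk '!x[$0]++'
dedupe() {
    awk '{
        # seen[$0] is unset (treated as 0) the first time a line appears
        if (seen[$0] == 0) print
        # count the line so later copies are skipped
        seen[$0]++
    }' "$@"
}

# Sample input invented for illustration; first occurrences come out in input order.
printf 'b\na\nb\nc\na\n' | dedupe
```

Unlike `sort | uniq`, this keeps the first occurrence of each line in its original position.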



That's usually faster where possible, but it may cause problems on large data sets, since it loads the entire set of unique strings (and their counts) into an in-memory hash table.
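
To make the trade-off concrete, a small side-by-side sketch. It assumes a `sort` that can fall back to temporary files when the input does not fit in memory (GNU sort does this; other implementations may differ). `dedupe_hash` and `dedupe_sorted` are hypothetical names, not standard tools.

```shell
# dedupe_hash: keeps first-seen order, but every distinct line stays in awk's memory.
dedupe_hash() { awk '!x[$0]++' "$@"; }

# dedupe_sorted: output comes back sorted, not in input order.
# LC_ALL=C forces plain byte ordering so the result does not depend on locale.
dedupe_sorted() { LC_ALL=C sort -u "$@"; }

# Sample input invented for illustration.
printf 'b\na\nb\nc\na\n' | dedupe_hash
printf 'b\na\nb\nc\na\n' | dedupe_sorted
```

If the original order matters and the set of distinct lines fits in RAM, the awk version is the simpler choice. Otherwise `sort -u` is the safer fallback.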


I use something like this every day:

    awk '!($0 in a){a[$0]; print}'

I rarely, if ever, use uniq to remove duplicates. Sorting is expensive.
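
For context on why uniq drags a sort along with it: uniq only collapses adjacent duplicate lines. A quick sketch with made-up input (`adjacent_only` and `sorted_first` are just illustrative variable names):

```shell
# uniq alone: the two "a" lines are not adjacent, so both survive.
adjacent_only=$(printf 'a\nb\na\n' | uniq)

# Sorting first makes duplicates adjacent, at the cost of the sort and the original order.
sorted_first=$(printf 'a\nb\na\n' | LC_ALL=C sort | uniq)

printf '%s\n---\n%s\n' "$adjacent_only" "$sorted_first"
```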



