
Yes, if piping cat into grep into awk into grep again is a Jedi-like trick, it's no wonder the Republic collapsed.

You could condense the first three stages: cat /var/log/nginx-access.log | grep "GET" | awk -F'"' '{print $6}'

down into: awk -F'"' '/GET/{print $6}' /var/log/nginx-access.log as well.
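A quick way to convince yourself the two forms are equivalent is to run both against a few made-up log lines. This is only a sketch: the sample data, the temp file, and the variable names are mine, and it assumes nginx's default "combined" log format, where the user agent is the sixth quote-delimited field.

```shell
# Hypothetical sample data in nginx "combined" format (not from the thread).
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [01/Jan/2024:00:00:01 +0000] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (X11; Linux x86_64)"
10.0.0.2 - - [01/Jan/2024:00:00:02 +0000] "POST /login HTTP/1.1" 302 0 "-" "curl/8.4.0"
10.0.0.3 - - [01/Jan/2024:00:00:03 +0000] "GET /robots.txt HTTP/1.1" 404 153 "-" "curl/8.4.0"
EOF

# Original three stages: cat, grep, awk.
three_stage=$(cat "$log" | grep "GET" | awk -F'"' '{print $6}')

# Condensed form: awk does the filtering and the field extraction itself.
one_stage=$(awk -F'"' '/GET/{print $6}' "$log")

printf '%s\n' "$one_stage"
rm -f "$log"
```

Both variables should end up holding the same two user-agent strings; the POST line is dropped in either case.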

Or with the fourth stage (cut -d" " -f1): awk -F'"' '/GET/{split($6,a," ");print a[1]}' /var/log/nginx-access.log

Add the fifth stage (grep -E "^[[:alnum:]]") by doing:

awk -F'"' '/GET/{split($6,a," ");if(a[1]~/^[[:alnum:]]/){print a[1]}}' /var/log/nginx-access.log

And the 6th and 7th (sort | uniq -c):

awk -F'"' '/GET/{split($6,a," ");if(a[1]~/^[[:alnum:]]/){b[a[1]]++}}END{for(i in b){print b[i] " " i}}' /var/log/nginx-access.log | sort -rn
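Spelled out over several lines, the counting step might look like the sketch below. The sample log and variable names are invented for illustration, and it again assumes the "combined" log format. Note that split() fills the array starting at index 1, so the first word of the user agent is a[1].

```shell
# Hypothetical sample data in nginx "combined" format (not from the thread).
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [01/Jan/2024:00:00:01 +0000] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (X11; Linux x86_64)"
10.0.0.2 - - [01/Jan/2024:00:00:02 +0000] "GET /a HTTP/1.1" 200 10 "-" "Mozilla/5.0 (Macintosh)"
10.0.0.3 - - [01/Jan/2024:00:00:03 +0000] "GET /b HTTP/1.1" 200 10 "-" "curl/8.4.0"
10.0.0.4 - - [01/Jan/2024:00:00:04 +0000] "POST /login HTTP/1.1" 302 0 "-" "Mozilla/5.0 (X11)"
10.0.0.5 - - [01/Jan/2024:00:00:05 +0000] "GET /c HTTP/1.1" 200 5 "-" "-"
10.0.0.6 - - [01/Jan/2024:00:00:06 +0000] "GET /d HTTP/1.1" 200 10 "-" "curl/8.4.0"
10.0.0.7 - - [01/Jan/2024:00:00:07 +0000] "GET /e HTTP/1.1" 200 10 "-" "Mozilla/5.0 (Windows NT 10.0)"
EOF

counts=$(awk -F'"' '
/GET/ {
    split($6, a, " ")              # a[1] is the first word of the user agent
    if (a[1] ~ /^[[:alnum:]]/)     # skip agents such as "-"
        b[a[1]]++                  # tally per first word
}
END { for (i in b) print b[i] " " i }' "$log" | sort -rn)

printf '%s\n' "$counts"
rm -f "$log"
```

On this sample the POST request and the "-" agent are filtered out, leaving Mozilla/5.0 counted three times and curl/8.4.0 twice. The for (i in b) loop visits keys in no particular order, which is why the trailing sort -rn is still needed.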

Which is still just about a one-liner and a lot faster than the original. Actually you could make it shorter, even faster, and more awky by tinkering with the field separator to get rid of the split() and the if:

awk -F'[[:space:]]|"' '/GET/ && $17~/^[[:alnum:]]/{a[$17]++}END{for(i in a){print a[i] " " i}}' /var/log/nginx-access.log | sort -rn
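If the $17 looks like a magic number: with a regex separator, every single space or quote ends a field, so adjacent separators produce empty fields that still count. A small sketch on one made-up "combined" log line (the line and variable names are mine) shows where things land:

```shell
# Hypothetical log line in nginx "combined" format (not from the thread).
line='10.0.0.1 - - [01/Jan/2024:00:00:01 +0000] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (X11; Linux x86_64)"'

# Split on every whitespace character or double quote, then pick out fields.
method=$(printf '%s\n' "$line" | awk -F'[[:space:]]|"' '{print $7}')
agent=$(printf '%s\n' "$line" | awk -F'[[:space:]]|"' '{print $17}')

printf 'method=%s agent=%s\n' "$method" "$agent"
```

Here field 7 is the request method and field 17 is the first word of the user agent. The numbering depends on the log layout: a referer containing spaces, or a request line with a different number of tokens, would shift it, so treat $17 as specific to this format rather than universal.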

And then replace the last sort with gawk's builtin asort() function (it is a GNU extension, not part of POSIX awk), but that's left as an exercise for the student ;)

But why learn the basics when you can do Big Data and be buzzword-compliant instead?



Nice straw-man. My contention is simply that knowing better than to "cat | grep" is the basics.



