You could condense the first three stages: cat /var/log/nginx-access.log | grep "GET" | awk -F'"' '{print $6}'
down into: awk -F'"' '/GET/{print $6}' /var/log/nginx-access.log
Or with the fourth stage (cut -d" " -f1): awk -F'"' '/GET/{split($6,a," ");print a[1]}' /var/log/nginx-access.log
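One detail worth checking before building on this: split() in awk numbers the pieces from 1, so the first word lands in a[1] and a[0] stays empty. A minimal sketch, with a made-up user-agent string standing in for $6:

```shell
# Hypothetical user-agent string, standing in for field $6 of a log line.
ua='Mozilla/5.0 (X11; Linux x86_64)'

# split() fills a[1]..a[n]; a[1] is the first space-separated word.
first=$(printf '%s\n' "$ua" | awk '{n = split($0, a, " "); print a[1]}')

# a[0] is never set by split(), so it prints as an empty string.
zero=$(printf '%s\n' "$ua" | awk '{n = split($0, a, " "); print "[" a[0] "]"}')

echo "$first"   # Mozilla/5.0
echo "$zero"    # []
```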
Add the fifth stage (grep -E "^[[:alnum:]]") by doing:
awk -F'"' '/GET/{split($6,a," ");if(a[1]~/^[[:alnum:]]/){print a[1]}}' /var/log/nginx-access.log
And the 6th and 7th (sort | uniq -c):
awk -F'"' '/GET/{split($6,a," ");if(a[1]~/^[[:alnum:]]/){b[a[1]]++}}END{for(i in b){print b[i] " " i}}' /var/log/nginx-access.log | sort -rn
Which is still just about a one-liner and a lot faster than the original. Actually you could make it shorter, even faster, and more awky by tinkering with the field separator to get rid of the split() and the if:
awk -F'[[:space:]]|"' '/GET/ && $17~/^[[:alnum:]]/{a[$17]++}END{for(i in a){print a[i] " " i}}' /var/log/nginx-access.log | sort -rn
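If the $17 looks like magic, here is a rough sketch that shows where it comes from. It assumes the usual combined log format and an awk that understands POSIX character classes. The log lines are invented sample data, not real traffic. Every single space or double quote counts as a separator, so adjacent separators produce empty fields, and the first word of the user agent ends up in field 17.

```shell
# Hypothetical sample data in combined log format; not taken from a real server.
log=$(mktemp)
cat > "$log" <<'EOF'
1.2.3.4 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 512 "-" "curl/8.0.1"
1.2.3.5 - - [10/Oct/2023:13:55:37 +0000] "GET /a HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64)"
1.2.3.6 - - [10/Oct/2023:13:55:38 +0000] "GET /b HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 10.0)"
1.2.3.7 - - [10/Oct/2023:13:55:39 +0000] "POST /c HTTP/1.1" 200 512 "-" "curl/8.0.1"
EOF

# Number every field of the first line to see which one holds the user agent.
awk -F'[[:space:]]|"' 'NR==1{for(i=1;i<=NF;i++) printf "%d=<%s>\n", i, $i}' "$log"

# Run the one-liner against the sample file; the POST line is skipped.
result=$(awk -F'[[:space:]]|"' '/GET/ && $17~/^[[:alnum:]]/{a[$17]++}END{for(i in a){print a[i] " " i}}' "$log" | sort -rn)
printf '%s\n' "$result"   # 2 Mozilla/5.0, then 1 curl/8.0.1

rm -f "$log"
```

Note the field number only holds as long as the request line has exactly three words and the referer contains no spaces, so the -F'"' version is the more forgiving of the two.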
And then replace the last sort with gawk's built-in asort() function (a GNU extension, so plain POSIX awk won't have it), but that's left as an exercise for the student ;)
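For the impatient student, one possible approach, assuming gawk is installed. asort() sorts values and throws away the original indices, so this sketch packs each count and user agent into one zero-padded string before sorting and unpacks it again when printing. The array names (count, packed, parts) and the sample log lines are made up for illustration.

```shell
# Hypothetical sample lines in combined log format (not real traffic).
log=$(mktemp)
cat > "$log" <<'EOF'
1.2.3.4 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 512 "-" "curl/8.0.1"
1.2.3.5 - - [10/Oct/2023:13:55:37 +0000] "GET /a HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64)"
1.2.3.6 - - [10/Oct/2023:13:55:38 +0000] "GET /b HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 10.0)"
EOF

if command -v gawk >/dev/null 2>&1; then
  sorted=$(gawk -F'[[:space:]]|"' '
    /GET/ && $17 ~ /^[[:alnum:]]/ { count[$17]++ }
    END {
      # Pack "count agent" into one string; zero padding makes string order match numeric order.
      for (ua in count) packed[++n] = sprintf("%012d %s", count[ua], ua)
      n = asort(packed)               # ascending sort, returns the number of elements
      for (i = n; i >= 1; i--) {      # walk backwards for descending counts
        split(packed[i], parts, " ")
        print parts[1] + 0, parts[2]  # "+ 0" strips the zero padding
      }
    }' "$log")
  printf '%s\n' "$sorted"
else
  echo "gawk not found; asort() is not available in POSIX awk" >&2
fi

rm -f "$log"
```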
But why learn the basics when you can do Big Data and be buzzword-compliant instead?