
No, you can't work with JSON very easily in awk. Any kind of hierarchical format like JSON and XML will give you a headache in awk, and CSV can be difficult as well.


While not as complete as a full CSV parsing lib, finding this made working with CSV in awk much easier: http://www.gnu.org/software/gawk/manual/html_node/Splitting-...

  gawk -vFPAT='[^,]*|"[^"]*"'
http://stackoverflow.com/questions/4205431/parse-a-csv-using...


That regexp fails on fields containing embedded '"' characters, but I guess you can grep for embedded double quotes ("") first.

Are there multiple ways of encoding '"' in CSV fields? I don't know -- but the people who write the CSV libs I use do know!

Edit: And as your link notes, it fails for embedded newlines (\n). IMNSHO, awk needs CSV (and JSON, etc.) built in, preferably via a plugin architecture. But then, why not just use the Perl superset?
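To make the embedded-quote problem concrete, here is a rough sketch of a per-line CSV splitter in plain awk. It assumes the common convention that a quote inside a quoted field is written as a doubled quote (""), and split_csv is a made-up helper name, not anything built into awk. It still does not handle embedded newlines.

```shell
# Sketch only: split one CSV line into fields, honouring quoted fields
# and doubled quotes (""). Embedded newlines are NOT handled.
csv_fields() {
  awk '
  # split_csv is a hypothetical helper: fills out[1..n] and returns n
  function split_csv(line, out,    n, c, i, len, inq, field) {
    n = 0; field = ""; inq = 0; len = length(line)
    for (i = 1; i <= len; i++) {
      c = substr(line, i, 1)
      if (inq) {
        if (c == "\"") {
          # a doubled quote inside a quoted field is one literal quote
          if (substr(line, i + 1, 1) == "\"") { field = field "\""; i++ }
          else inq = 0
        } else field = field c
      } else if (c == "\"") inq = 1
      else if (c == ",") { out[++n] = field; field = "" }
      else field = field c
    }
    out[++n] = field
    return n
  }
  { n = split_csv($0, f); for (i = 1; i <= n; i++) print i ": " f[i] }
  '
}

printf '%s\n' 'a,"b,c","say ""hi"""' | csv_fields
# 1: a
# 2: b,c
# 3: say "hi"
```

Walking the line character by character is slow compared to FPAT, but it gets the doubled-quote case right, which the regexp above does not.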


Arnold Robbins created FPAT to parse CSV, but it doesn't really do that very well. I agree that it would have been better to just hardcode a CSV mode. CSV is common, so you shouldn't have to think hard in order to parse it, and FPAT is hard. PHP makes parsing CSV a breeze. You could write a good CSV parser in gawk and @include it in other scripts as a solution short of hacking gawk itself. But it's generally easier just to find some other way, such as swapping CSV for TSV -- which works better in awk.
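For the TSV route, a minimal sketch (the sample data is made up, and it assumes no field contains a tab or a newline):

```shell
# Hypothetical data: name<TAB>note<TAB>amount.
# The comma inside the note is harmless because the separator is a tab.
printf 'alice\tpens, blue\t3\nbob\tpaper\t5\n' |
  awk -F'\t' '{ total += $3 } END { print total }'
# 8
```

No FPAT and no quoting rules: setting the field separator with -F'\t' is all it takes, which is why TSV fits awk better.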

Hierarchical formats like JSON are a little different, because they don't fit the awk model very well. You could add functions to work with JSON, but working with it this way wouldn't be very awk-like. You're better off preprocessing the JSON into records with another tool to make it more awk-friendly, or simply using another language altogether.


Man, seconded. I'm not sure if a CSV-ized awk is a sensible idea, but I'd love to have it if it were. CSV might be #1 on my list of "things that will cause problems for you because they are slightly harder than you think they are".


I hear you, re CSV.

Join the dar... cough, Perl side, we have cookies. :-) We have CSV parsers and everything else, all the way up to e.g. good web libraries and the best OO among the scripting languages (Moose, roughly like the Common Lisp OO environment; more or less standard for new Perl projects today).

And there is more! You can reuse almost everything you know from awk! Run: perldoc perlrun

Check out the -n, -p, -i and -E flags. And, as many have noted, there is a2p.

http://perldoc.perl.org/5.16.2/perlrun.html

http://perldoc.perl.org/5.16.2/a2p.html
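A few rough awk-to-Perl equivalents, as a sketch. The sample input is made up, and -a, -l and -F are further perlrun flags beyond the ones listed above:

```shell
# awk -F, '{ print $2 }' is roughly:
# -n loops over lines, -a autosplits into @F, -F sets the separator,
# -l handles the trailing newline.
printf 'a,b,c\nd,e,f\n' | perl -F, -lane 'print $F[1]'
# b
# e

# awk '{ sub(/foo/, "bar"); print }' is roughly:
# -p loops over lines and prints each one after the code runs.
printf 'foo one\nfoo two\n' | perl -pe 's/foo/bar/'
# bar one
# bar two

# -i edits the file in place, which plain awk has no flag for.
tmp=$(mktemp)
printf 'foo\n' > "$tmp"
perl -pi -e 's/foo/bar/' "$tmp"
cat "$tmp"
# bar
rm -f "$tmp"
```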

But the main reason is that we have fun. It's an insane programming language which throws all this "minimal mathematical notation" stuff out the window in favor of some linguistic inspiration, but it still works wonderfully. (Do insist on keeping to the coding standards in your group. Seriously. At a minimum -- lie and say that you do that, when people interview for a job at your place. :-) )


Which is why, in similar fashion to awk, we have utilities to deal with JSON [0] and CSV [1] output.

[0] https://github.com/stedolan/jq

[1] https://github.com/onyxfish/csvkit



