Handbook of Text manipulation on Unix (ibm.com)
316 points by AbyCodes on Mar 17, 2012 | 23 comments



Related to this, pyp is worth a look if you're interested in doing text manipulation with Python's libraries, but on the command line:

http://code.google.com/p/pyp/



The Unix Programming Environment by Kernighan and Pike and The AWK Programming Language are still the best books one can read about Unix text manipulation, and about Unix, period. (Part of the point is that in Unix, text is supposed to be the universal language.)


I like how it's laid out: it starts with the most specific tools, which are easy to understand, and eventually leads to the pocketknives of sed and awk, which beginners might not need until they've exhausted the potential of the previous commands.


Unix for Poets is a great set of exercises for someone wanting to learn more about text manipulation with Unix tools.

http://www.iro.umontreal.ca/~felipe/IFT6010-Automne2011/reso...


Thanks for this! I really like these kinds of summaries, because while I love grep and cut and wc and perl, there are commands in here I really haven't heard of.

Plus I enjoy stringing together one-off filters longer than my arm.


If you like this, then check out Unix Power Tools. It's full of exactly this kind of stuff, with broader and deeper coverage. I highly recommend it -- I consider it one of the top ten or so books for a new programmer to spend some time with.


One useful addition to the section on streams would have been process substitution:

http://tldp.org/LDP/abs/html/process-sub.html

This lets a command read from or write to more streams than just the standard ones (stdin, stdout and stderr).
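
A minimal sketch of what that looks like in practice, assuming bash (zsh and ksh have it too, plain POSIX sh does not). The file names and contents are made up for illustration. Each `<(...)` is replaced by a file name connected to that command's output, so comm gets two sorted inputs without any temporary sorted files:

```shell
#!/usr/bin/env bash
# Hypothetical sample data: two unsorted lists in a scratch directory.
tmpdir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$tmpdir/a.txt"
printf 'fig\nkiwi\napple\n' > "$tmpdir/b.txt"

# comm needs sorted input. Each <(sort ...) expands to a path such as
# /dev/fd/63 that comm opens like an ordinary file.
# -12 suppresses the lines unique to either file, leaving only the common ones.
common=$(comm -12 <(sort "$tmpdir/a.txt") <(sort "$tmpdir/b.txt"))
printf '%s\n' "$common"

rm -rf "$tmpdir"
```

With a plain pipe you could feed only one of the two inputs this way; the other would have to be a real file.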


It's bash-specific (sh doesn't support it).


It's not bash-specific. Other shells have it.


But not the Bourne shell.


Not POSIX.


Also take a look at my 3 e-books on awk, sed and perl: http://www.catonmat.net/books/


I once wrote this introduction to UNIX (which is unfortunately not complete; I lost the DocBook sources) that also provides an introduction to text manipulation.

http://danieldk.eu/Writings/unixsystems.pdf


Good post. How can I tell if a tool supports UTF-8 (or some other encoding) or not?


join was new to me. I like it....

Always happy to learn a new command.
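
For anyone else who hasn't met it, a small sketch of join, assuming two made-up files that are already sorted on their first field (join expects sorted input and matches on the first field by default):

```shell
#!/usr/bin/env bash
# Hypothetical sample data: id/name pairs and id/role pairs, sorted by id.
tmpdir=$(mktemp -d)
printf '1 alice\n2 bob\n3 carol\n' > "$tmpdir/users.txt"
printf '1 admin\n3 guest\n' > "$tmpdir/roles.txt"

# Pair up lines whose first field matches. Lines without a partner
# (id 2 here) are dropped by default.
joined=$(join "$tmpdir/users.txt" "$tmpdir/roles.txt")
printf '%s\n' "$joined"

rm -rf "$tmpdir"
```

This prints "1 alice admin" and "3 carol guest", roughly what an inner join on the id column would give you in SQL.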


This used to be a great site (ignore its very un-PC site name):

http://bashcurescancer.com/

It seems the site is down.


Thanks for this; I had never heard of csplit. Too bad the OS X version sucks.


Sort of related: rpl[1] is an often overlooked tool for replacing text across multiple files. It's terser than "perl pie" and has a few nice features, like simulation mode.

[1] http://www.laffeycomputer.com/rpl.html


So sad that the writer lets himself down in the first line.


What are you referring to?


Possibly to the use of "A basic tenant" when the writer really meant "A basic tenet".


He must be referring to the use of "tenant" when "tenet" was meant.



