Pipeable Ruby - forget about grep / sed / awk / wc ... (github.com)
67 points by adulau 2170 days ago | 40 comments

Very nice indeed, but I think a lot of the appeal of grep/sed/awk/wc/etc. is that (a) they are cross-platform (somewhat, with irritating differences in some cases) and available in every unix-y environment, and (b) people are simply very used to them. There are also going to be circumstances where a single interpreter reading a command is not as succinct, powerful, or flexible as a series of piped commands. As an alternative to perl -pe, though, it is very interesting indeed.

  # grep --- all lines including current date
  ls -al | ???
  ls -al | pru 'include?(Time.now.strftime("%Y-%m-%d"))'
Unless I'm missing something, that would be:

  ls -al | grep `date +"%Y-%m-%d"`

This works too:

    ls -al | grep $(date +%Y-%m-%d)

This particular example can even be shorter:

    ls -al | grep $(date +%F)

ah I like this syntax, thank you.

Admittedly a nice tool for ruby users. But for 'grep/sed/awk' users? Forget and learn a new syntax?

I don't quite see the benefit of pru from the examples for 'grep/sed/awk' users, memory and speed issues put aside.

Even the 'number of files by date' example can be done with, e.g., awk:

  ls -al |awk '{++a[$6]} END {for (i in a) printf "%s: %d\n", i, a[i]}'
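To see what that awk one-liner does, here is a small sketch run against made-up listing lines instead of real `ls -al` output. It assumes the date sits in field 6, which depends on the ls implementation and locale; the `count_by_field6` name and the sample rows are only for illustration. The `sort` is added because awk's `for (i in a)` visits keys in an unspecified order.

```shell
# Count lines per distinct value of field 6 using an awk associative array.
count_by_field6() {
  awk '{ ++a[$6] } END { for (i in a) printf "%s: %d\n", i, a[i] }' | sort
}

# Made-up sample rows shaped like a long listing with an ISO date in field 6.
printf '%s\n' \
  '-rw-r--r-- 1 rw users 120 2011-04-19 10:01 a.txt' \
  '-rw-r--r-- 1 rw users 300 2011-04-20 11:15 b.txt' \
  '-rw-r--r-- 1 rw users  42 2011-04-20 12:30 c.txt' | count_by_field6
```

With these three rows it reports one file for 2011-04-19 and two for 2011-04-20.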

"forget about classic unix filters that have been used for 30 years"

no thanks, link-bait!

a few years ago, i proposed a -x switch to ruby that would run a "puts $_.instance_eval { ... }" over each line of ARGF, but matz didn't care for it. wouldn't have done as much as pru does, but it would have had the advantage of being available by default with just a standard ruby install.
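For anyone curious what that proposed switch would have felt like, it can be roughly approximated today with a shell wrapper around `ruby -ne`. This is only a sketch of the idea as described above, not the actual proposal; the `rx` name is made up, and the snippet is pasted into the Ruby source unescaped, so it is for trusted input only.

```shell
# rx 'CODE': evaluate CODE with each input line as self and puts the result.
# Hypothetical helper; $1 is interpolated directly into the Ruby program.
rx() {
  ruby -ne "puts \$_.instance_eval { $1 }"
}

# Only run the demo when a ruby interpreter is available.
if command -v ruby >/dev/null 2>&1; then
  printf 'a b\nc d\n' | rx 'split(" ")[1]'   # second column of each line
fi
```

Note that a predicate such as `include?("x")` would print true or false per line here rather than filtering, which is one of the things pru adds on top.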

Very cool - but never ever ever forget about grep/sed/awk/wc...

I think the consensus is: a neat project that some will perhaps adopt, but not a replacement for tools that have been crafted and tuned over years.

A nice property of grep is that it's fast. I'd like to know how this compares on the speed front.

Probably not even close. (http://lists.freebsd.org/pipermail/freebsd-current/2010-Augu...)

grep/sed/awk are highly specialized programs optimized to do one thing very well.

A tool like this is one you might pick for reasons other than performance, perhaps (maybe you're a Ruby programmer and can't grok grep/sed/awk?)

Most people these days have a utf8 locale; basically nothing from that email applies when that is the case, and grep runs much slower.
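If the locale is what slows things down, a common workaround is to force the C locale for just that one command, so grep compares bytes instead of decoding multibyte characters. A minimal sketch, assuming a plain ASCII pattern; `grep_bytes` is a made-up wrapper name, and whether it helps at all depends on the grep version and the data, so time it on your own files.

```shell
# Run grep in the C locale for this invocation only (byte-wise matching).
grep_bytes() {
  LC_ALL=C grep "$@"
}

# Made-up sample input; -c prints the number of matching lines.
printf '%s\n' 'foobar one' 'nothing here' 'two foobar' | grep_bytes -c 'foobar'
```

The trade-off is that character classes and case-insensitive matching then no longer understand non-ASCII text.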

I thought Ruby was an answer to Perl, and yet it does not support the equivalent of "perl -pe"? Or did I miss something?

Many people (including Matz) discourage Ruby's Perlisms (or Perlish roots?), but at least for now, they're still there. (I say at least for now because Matz has said that they may go away at some point.)

Check man ruby and you'll see familiar (if you're used to Perl one-liners) command-line switches: -p, -l, -n, -F, -a, -i and so on.
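A few of those switches in action, as a rough sketch: `-n` loops over input lines in `$_`, `-p` does the same and prints `$_` after each pass, and `-a` autosplits each line into `$F`. The wrapper function names are made up for the example.

```shell
# grep-like: print only the lines matching /foo/ (a bare regex tests $_).
ruby_grep_foo() { ruby -ne 'print if /foo/'; }

# sed-like: replace the first 5 on each line, then -p prints the line.
ruby_sed_five() { ruby -pe '$_.sub!(/5/, "five")'; }

# awk-like: -a splits each line on whitespace into $F; print field two.
ruby_second_col() { ruby -ane 'puts $F[1]'; }

# Only run the demo when a ruby interpreter is available.
if command -v ruby >/dev/null 2>&1; then
  printf 'foo 5 5\nbar 1 2\n' | ruby_grep_foo
  printf 'foo 5 5\nbar 1 2\n' | ruby_sed_five
  printf 'foo 5 5\nbar 1 2\n' | ruby_second_col
fi
```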

See also here for some familiar friends: http://www.zenspider.com/Languages/Ruby/QuickRef.html#19

You missed something. Ruby can do the same thing.

One could consider PyLine to be the Python equivalent to Pipeable Ruby:


Or funcpy:


Seriously? People still do this?

  ps -ef | grep foo | grep -v grep
This always works:

  ps -ef | grep '[f]oo'
And, if you're running a modern operating system:

  pgrep foo
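Why the bracket version works: the regex `[f]oo` matches the text "foo", but grep's own command line then contains the literal characters `[f]oo`, which that regex does not match. A small sketch with a simulated process list (made-up sample data, with `fake_ps` standing in for `ps -ef`):

```shell
# Simulated command column of `ps -ef`: one real process plus the grep
# process started by the pipeline, whose command line is passed in as $1.
fake_ps() {
  printf '%s\n' '/usr/bin/foo --daemon' "$1"
}

# Plain pattern: grep also finds its own command line (two lines printed).
fake_ps 'grep foo' | grep 'foo'

# Bracket pattern: only the real process is printed.
fake_ps 'grep [f]oo' | grep '[f]oo'
```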

I use both, but 'grep -v grep' is far easier to type.

And I almost always want the full ps output, or I'm searching for a commandline argument rather than the process name, so pgrep is out.

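  function psg () {
    [[ -z "$1" ]] && return                 # nothing to search for
    local _first=${1:0:1} _rest=${1:1}      # split off the first character
    # One-character pattern: require one more non-']' character so the
    # grep command line itself ('[f][^]]') cannot match.
    [[ -z "$_rest" ]] && _rest='[^]]'
    local _com="ps -ef | grep '[$_first]'$_rest"
    eval "$_com"
  }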

Wow, that's certainly one way to do it. I'm still not seeing an issue with 'grep -v grep'.

  ps -ef | grep foo | grep -v grep
Will fail if foo="grep"
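That failure is easy to reproduce with a simulated process list. In this sketch (made-up sample data, hypothetical `fake_ps` and `naive_psg` helpers) there is a long-running grep we actually want to find, and the `grep -v grep` stage throws it away together with the pipeline's own grep.

```shell
# Simulated command column of `ps -ef`: a long-running grep we want to find,
# plus the grep started by our own pipeline (its pattern is passed as $1).
fake_ps() {
  printf '%s\n' 'grep -r TODO /home/rw/src' "grep $1"
}

# The pipeline from the comment above, run against the simulated listing.
naive_psg() {
  fake_ps "$1" | grep "$1" | grep -v grep
}

naive_psg grep || echo 'nothing found: the real grep process was filtered out too'
```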

Drop it in your .bashrc and just type "psg foo"

Check out pgrep -f

Funny, just yesterday I was wondering what a shell would look like if it was an interpreter for a modern language. The important thing about the commands is the composition, not the language they are embedded in.

I think making something as instantly usable as the shell is quite hard. But an environment in a better language, suited to the harder tasks that just get messy in the shell while still inheriting a shell-like way of doing things, could be an interesting project.

This is not that kind of shell. "Pipeable Ruby" does mostly the same as the "perl -pe" switch with some additional syntactical shortcuts

For a shell you'd want rush (http://rush.heroku.com/) or maybe iPython.

  # seq $((2**32)) | pru "/find me/" >/dev/null
  gems/pru-0.1.3/lib/pru.rb:13:in `[]': failed to allocate memory (NoMemoryError)
Guess I'll try to not forget about grep and sed just yet...

Out of curiosity, I tried this and it worked fine:

  seq $((2**32)) | perl -ne '/find me/ && print'

There isn't much of a point to this because it doesn't change the fundamental type of data that is being piped around. You're still dealing with strings or lists of strings. The advantage of something like PowerShell, an OS built on Common Lisp, or Smalltalk is that objects can be passed around rather than just strings.

How fast is it?

Go home pru...

  rw@raccoon:~> du -h messages
  19M messages

  rw@raccoon:~> time grep -e "foobar" < messages
  real 0m0.030s
  user 0m0.022s
  sys 0m0.008s

  rw@raccoon:~> time pru /foobar/ < messages
  real 0m0.796s
  user 0m0.722s
  sys 0m0.071s

A valid question, but a counter-question is: which is more expensive: development time, or run time?

The answer is usually (but not always) developer time.

I'm using grep/awk/sed all day long, and I never need more than 10 seconds to build a command group...

Interesting tool, but not really a replacement. Most of the examples can be written much more simply with grep/sed/awk, etc.

Similar project for C++: http://github.com/lvv/scc

  # --- print second column 
  ls -al | awk '{print $2}'
  ls -al | pru 'split(" ")[1]'
  ls -al | scc -n 'F(1)'

  # --- count and average of all integers on second position
  ls -al | awk '{ s += $2; } END {print "average" ,int(s/NR);print "count ",int(NR)}'
  ls -al | pru 'split(" ")[1]' '"average #{mean(&:to_i)}\ncount #{size}"'
  ls -al | scc 'int c=0; WRL c+=F(1); FMT("average %s\ncount %s") %(c/NR) %NR'

  # --- count lines
  ls -al | wc -l  
  ls -al | pru -r 'size'
  ls -al | scc 'WRL;NR+1'

  # -- replace a 5 with five
  ls -al | sed 's/5/five/'
  ls -al | pru 'gsub(/5/,"five")'
  ls -al | scc -n 'RR(line,R("5"),"five")'

  # every second line
  ls -al | pru 'i % 2 == 0'
  ls -al | scc -n 'NR % 2 ? line : ""'

  # sum up df's used-space column
  df | awk '{n+=$3;};  END{print n}'
  df | pru  ?????
  df | scc 'int n=0; WRL n+=F(2); n'
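One detail of the awk version of the df example that is easy to miss: the header line goes through the same `n+=$3`, but a non-numeric field like "Used" counts as 0 in awk, so it does not disturb the sum. A quick sketch on made-up df-style rows (`sum_used` is just a name for the example):

```shell
# Sum the third whitespace-separated column; non-numeric text adds 0.
sum_used() {
  awk '{ n += $3 } END { print n }'
}

# Made-up rows shaped like df output, header included.
printf '%s\n' \
  'Filesystem 1K-blocks Used Available Use% Mounted on' \
  '/dev/sda1 1000 400 600 40% /' \
  'tmpfs 500 100 400 20% /tmp' | sum_used
```

Here the sum comes out as 500, the total of the two data rows.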

No, I don't want to forget about no damn grep, sed, and the like.
