Hacker News
What's the most popular Ruby standard library? (omniref.com)
82 points by timr 219 days ago | comments



I'm really impressed by test_unit hitting #2. Most people who wander off into Ruby bring their TDD enthusiasm back to the other languages they use. That's quite an impact on the entire industry when you think about it. Kudos to all involved for that :)

-----


Thanks! The name's "test/unit", but I won't hold it against you ;-). It's pretty sweet that people have gotten so much mileage out of code I wrote, and if it's helped us as a profession - even a little bit - to write better code, I'm thrilled.

-----


I wouldn't read that much into it. That will get pulled in as a dep for most packages, including Rails.

-----


It's one of the most frequently required libraries in individual files, too. So it's bigger than just being a dependency that's required by Rails.

-----


True, but that itself is no small feat.

-----


Nice write-up. I find this sort of thing fascinating. When are you guys thinking of tackling more languages? Maybe something that lends itself to static analysis a bit more?

Also, how long did that regex take to run on 5.0e8 lines of Ruby?

-----


About 2 hours. The regex is IO-bound, running on a text column with 87GB + 68GB of Postgres TOAST. It's on a new 1TB 3k IOPS EC2/EBS SSD volume, which seems to be able to sustain about 20MB/s.
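
As a rough sanity check, the figures in the comment above can be plugged into a few lines of Ruby. This is only a back-of-the-envelope sketch: it assumes decimal units (1 GB = 1000 MB) and that the column and its TOAST data are each read once.

```ruby
# Back-of-the-envelope check of the quoted scan time.
# Assumptions: decimal units, one full sequential pass over the data.
main_gb       = 87     # text column size, from the comment
toast_gb      = 68     # TOAST size, from the comment
rate_mb_per_s = 20.0   # sustained throughput, from the comment

total_mb = (main_gb + toast_gb) * 1000
seconds  = total_mb / rate_mb_per_s
hours    = seconds / 3600.0

puts format("%d GB at %.0f MB/s takes about %.1f hours",
            main_gb + toast_gb, rate_mb_per_s, hours)
```

Under those assumptions, the estimate lands close to the "about 2 hours" reported.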

-----


20MB/sec is pretty dreadful, and 2 hours is not timely for that quantity of data.

I'm guessing you're paying a high price for the convenience of using a database? The kind of query you did, I'd run using grep on the command line against the source, possibly combined with a summarizing program written in Ruby.
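
A minimal sketch of what such a summarizing program might look like in Ruby. The pattern `REQUIRE_LINE` and the helper `count_requires` are hypothetical names made up for illustration; this is not the regex or code used for the article.

```ruby
# Hypothetical pattern: matches lines such as `require "json"` or
# `require 'test/unit'`. Commented-out requires are not matched.
REQUIRE_LINE = /^\s*require\s+['"]([^'"]+)['"]/

# Tally how often each library is required across an enumerable of lines.
def count_requires(lines)
  counts = Hash.new(0)
  lines.each do |line|
    match = REQUIRE_LINE.match(line)
    counts[match[1]] += 1 if match
  end
  counts
end

sample = [
  %(require "test/unit"),
  %(require 'json'),
  %(  require "test/unit"),
  %(# require "yaml"),
  %(puts "hello"),
]

counts = count_requires(sample)
# Print most-required first, ties broken alphabetically.
counts.sort_by { |name, n| [-n, name] }.each { |name, n| puts "#{n}\t#{name}" }
```

To use it as the second stage of a grep pipeline, `count_requires($stdin.each_line)` would read the matched lines from standard input instead of the sample array.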

-----


I love the power of the Unix shell. find | cat | grep would get the job done just as well if you had all the source in an accessible file tree, but I don't think you'd see any performance increase, as the bottleneck is still random reads from EBS.

The single 3k IOPS EBS volume being used delivers a max theoretical speed of 24MB/s with 8k pages. I'm fine living with 20MB/s in practice.

In fact, Postgres does inline (de)compression and optimizes for sequential reads, so it's likely the shell would be slower for this workload given the apples-to-oranges characteristics. I'd love to see any performance tests making this sort of comparison; they're always educational.
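
The 24MB/s ceiling follows from multiplying IOPS by page size. A quick sketch of that arithmetic, assuming one 8 KB page per I/O operation and 1 MB = 1000 KB:

```ruby
# Theoretical throughput ceiling for the volume described above.
# Assumptions: every I/O operation reads exactly one 8 KB page.
iops     = 3_000   # provisioned IOPS, from the comment
page_kb  = 8       # page size in KB, from the comment
observed = 20.0    # MB/s sustained in practice, from the comment

ceiling_mb_per_s = iops * page_kb / 1000.0
utilization      = observed / ceiling_mb_per_s

puts format("ceiling: %.0f MB/s, observed: %.0f MB/s (%.0f%% of ceiling)",
            ceiling_mb_per_s, observed, utilization * 100)
```

On these numbers the observed rate is roughly five sixths of the theoretical maximum.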

-----


Even with a database it's dreadful. At 20MB/sec, they'd have to value their waiting time very low for it not to be cheaper and faster to buy a small server outright and put a couple of SSDs in it, at least if they do this kind of analysis more than a couple of times.

Or even load it up with enough memory to keep everything in RAM during normal operations. I can't remember the last time I worked on a system that did less than a couple of hundred MB/sec... And we generally buy servers in the $3k-$6k range, so nothing ridiculous.

-----


Probably even faster using LC_ALL=C and parallel grep [0]

[0] http://www.gnu.org/software/parallel/man.html#example__paral...

-----


If I'm not mistaken, you didn't have to require the "date" stdlib until Ruby 1.9, so that might account for its low spot on the list. I'm not sure, but the same might be true for "time".

-----


A little backwards. In 1.9 you don't need to require "date" to use the basic Date class.

-----


Good point. You can definitely instantiate a Time instance without requiring anything:

https://www.omniref.com/ruby/2.1.2/symbols/Time#annotation=1...

-----


Clarifying (since I think your post is easy to misinterpret if someone does not follow the link): The Time class is part of core Ruby. Requiring "time" from the stdlib adds some additional methods. Thus it is not necessary to require "time" to use the Time class, but requiring "time" _does_ add additional functionality.
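
A short sketch of the distinction described above. Whether `Time.parse` is already defined before the explicit `require` depends on what else has been loaded (for example by RubyGems), so the snippet prints that status rather than assuming it.

```ruby
# Time is a core class: no require is needed for basic use.
now = Time.now
puts now.year

# May print true or false depending on what has already been loaded.
puts "Time.parse defined before require: #{Time.respond_to?(:parse)}"

require "time"   # stdlib extension: adds Time.parse, Time.strptime, and more

parsed = Time.parse("2014-07-01 12:00:00 UTC")
puts parsed.year
puts "Time.parse defined after require: #{Time.respond_to?(:parse)}"
```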

-----


Very interesting. I'm now wondering if Travis CI is collecting any code usage statistics? I'd imagine that they would have a more application-centric view of the RubyGems ecosystem. Also, since they are actually executing code, they could potentially collect data on constants and method calls, I believe.
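
To illustrate how method-call data could be gathered at runtime in principle, here is a minimal sketch using Ruby's built-in TracePoint (available since Ruby 2.0). The helper `count_calls` and the `Greeter` class are hypothetical; nothing here describes what Travis CI actually does.

```ruby
# Count method calls observed while a block runs.
# `count_calls` is a hypothetical helper for illustration only.
def count_calls
  counts = Hash.new(0)
  trace = TracePoint.new(:call, :c_call) do |tp|
    # Key each call by "DefiningClass#method_name".
    counts["#{tp.defined_class}##{tp.method_id}"] += 1
  end
  trace.enable { yield }
  counts
end

class Greeter
  def greet(name)
    "hello #{name}"
  end
end

counts = count_calls do
  greeter = Greeter.new
  3.times { greeter.greet("world") }
end

puts counts["Greeter#greet"]
```

The resulting hash also contains entries for C-implemented methods called inside the block, which is the kind of usage data a static regex over `require` lines cannot see.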

-----


I think the title is a tad misleading. I saw this and immediately thought "Why would anyone even ask? It's obviously Rails", then I checked the link and saw that they meant the standard library.

-----


Thanks for pointing that out. We added "standard" to the title.

-----



