Of course, most products don't really need to do calculations outside of a database, and that's why Ruby has taken off so well. But it's still important to realize that choosing Ruby over PyPy/Go/whatever really can be a very expensive choice in the long run when your product suddenly relies on some unique math solutions, and having half your product in Ruby and half in C is awful to have to deal with.
The solution to this is a project like PyPy for Ruby, but it doesn't seem to be coming...
Why? In many cases, only a tiny fraction of your code needs to be fast. In my experience, it is very reasonable to code that part in some fast language while delegating the bigger (and often more complex) part to a higher-level language such as Ruby.
A ton of work has to go into computation-friendly data organization (see all of numpy, scipy, etc) before you can ever actually optimize effectively.
EDIT: since all of my responses are of the form: "oh no I did it once with great results", I'll retract the part where I said "have never really tried it". I'm not arguing it's never worked. I'm arguing that the set of likely performance problems and the set of easy-to-optimize-with-C problems don't overlap as much as people would like to believe.
For many types of problems, putting the data into an efficient format before passing it to your C/C++ code is trivial in terms of both difficulty and time spent.
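For example, a fixed array of doubles can be laid out in the exact byte format native code expects before it ever crosses the language boundary. A minimal sketch using Python's struct module (purely to illustrate the idea; the thread's commenters were doing the equivalent from Ruby):

```python
import struct

# Pack a list of doubles into a contiguous, C-layout byte buffer.
# Native code can then read it as a plain double[] with no
# per-element conversion at the language boundary.
values = [1.0, 2.5, 3.25]
buf = struct.pack("%dd" % len(values), *values)

print(len(buf))  # 8 bytes per double -> 24

# Round-tripping shows the layout is lossless:
print(list(struct.unpack("%dd" % len(values), buf)))
```

The whole "efficient format" step is one call; the expensive part is deciding on the layout, not producing it.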
I prototyped everything in Ruby, profiled, and then rewrote a handful of the critical parts in C and C++.
Even when you give up on writing true "hybrid" code (i.e. "just write modules for the slow parts in C++", which is a gigantic pain), and instead try to write driver code that calls standalone compiled applications, you find yourself having to write a lot of duplicate code to parse/validate I/O on both sides of the language boundary. It sucks, and there's no way to make it not suck.
I'd go so far as to say that for problems where processing speed is a known requirement from the start, you should just give up on interpreted languages. Fast interpreted languages have been right around the corner for as long as I've been writing code, but they sure seem to be taking their time.
Ruby provides some very useful features, but a lot of those come at rather high costs. Those costs are exacerbated if you do not understand how the internal components of the language are laid out, and the purpose behind this layout.
Ruby tries to be a lot of things to a lot of people, so when people learn how to use it for one task they automatically assume that their skills will carry over. This sort of approach might work reasonably well with a straightforward compiled language, but it simply can't be that easy for an interpreted language like Ruby, with its reams of special context-sensitive behaviour.
For example, consider the "pointers to pointers to pointers" complaint. Nothing is stopping you from having a low level C struct for all your performance sensitive data. Granted, you would have to write a ruby wrapper for this data for when you need to export it back out to ruby, but wrapper generation could be automated.
Sure, you could just say, "You know what. I'm just going to use C" but what if your project could really use some of those ruby features outside of the performance sensitive parts of the code? It's always a tradeoff.
There is SWIG support for Ruby, as well as the Ruby FFI implementation and RubyInline, offering various degrees of simplicity, but none of them are hard. The Ruby extension interface is more work, but even that is fairly simple to work with. I've taken C libraries and wrapped them in Ruby in minutes using SWIG for example.
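The thread's examples are Ruby-centric, but to illustrate just how thin an FFI call can be, here is the same pattern in Python's ctypes (the library lookup is a platform assumption: this sketch assumes a Unix-like system, with a glibc-specific fallback name):

```python
import ctypes
import ctypes.util

# Load the system C math library. find_library handles the
# platform-specific name; "libm.so.6" is a glibc fallback.
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")

# Declare the C signature (double cbrt(double)) so ctypes converts
# arguments and the return value correctly; restype defaults to int.
libm.cbrt.argtypes = [ctypes.c_double]
libm.cbrt.restype = ctypes.c_double

print(libm.cbrt(27.0))
```

Ruby's FFI gem follows the same shape: attach a function by name and signature, then call it like any other method.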
These days I often choose to write in Ruby first, even for applications whose finished version I intend to write entirely in C. Prototyping in Ruby and rewriting in C once I have a working version is faster for me than iterating on a C version.
The fact that the speed-critical parts of your problem could be reduced to a "few, small bits" indicates that we're talking about entirely different things.
I was in a problem domain where writing the time-critical pieces in a different language meant ensuring that huge amounts of working-set data were available to that optimized code. Efficient access to that data meant storage in native format, which meant that supporting two languages would require a nasty I/O layer for the interpreted language. Not worth it.
And that's just for the problems where it was possible to consider using Perl/Python at all. For some problems, the memory overhead of Perl/Python data structures ruled out use on all but the smallest problems. Again, not worth it.
It's nice that the speed-critical parts of your problem were reducible to a few lines of code, but that's not often the case. For most real problems, I/O is a huge concern, and can't be dismissed just by pre-processing the input.
It's certainly doable, and definitely the right way to go if you want number crunching in a high-level language, but it's not a free lunch. It's certainly not a quick one-time conversion and then you can forget about it.
If more people thought of FFI/RPC/etc. as just another pipe-and-filter framework, they'd be less reluctant to write properly defined, performance-critical pieces in a more performant environment.
Then again, your use of a profiler and FFI/pipe/whatever is a secret weapon that seems to be one of the things that separate 1x from 10x engineers. :-)
After profiling a backfill script which was looking like it was going to take about two weeks to run, I found an area where certain bit-twiddling was being done which was extremely slow in Ruby. I rewrote this section in C, and was able to make it 1000x faster than the Ruby equivalent (this was MRI 1.8.5, several years ago). The total runtime for the script was brought down to about 8 hours (from about 165 TPS to around 7,000 TPS).
Fortunately, I've found that Ruby makes it exceptionally easy to write C extensions, and I often think of reusing C libraries in cases where I'm worried about performance.
Yes, you can make image transformations for your webapp fast by piping in a C-extension, because the C code can control the i/o and data organization as well as the algorithms. The integration surface is tiny. You can't, however, make the language's memory usage and garbage collection faster. At least not as easily. You won't be "dropping into C" to fix your garbage collection problems, at least not in the way people dream of when they think of "performance problems" at the outset.
Whenever I see "I'll write in Ruby and optimize in C when I need to" I assume the context of "my program is well suited to be written in Ruby." For instance, I certainly wouldn't write embedded code in ruby. However, I would venture to say that these situations are the exception rather than the rule.
These days most software has access to fairly reasonable computational resources. If you're writing microcode for some industrial control system, or calculating stock price variations with microsecond resolution, then certainly stick to C or ASM or what have you. However, in an age when even a fridge will have a few hundred MB of memory, and when a lot of code is meant to be just "good enough", I think even Ruby will suffice.
Think of all the saved time ...
The engine is built in C++ and the logic is coded in some higher level scripting language. Seems to work fine for them.
Agreed. Contrary to what lbrandy said, I've tried it, and it works very well. Writing Ruby extensions in C, or wrapping C libraries for Ruby, is really easy. Wrapping a PNG library for Ruby allowed us to draw about a trillion pixels' worth of images based on the human genome across a modest cluster of ~100 cores in a few days [1,2,3]. For reference, a terapixel image is about ~4TB of uncompressed data and ~800GB compressed, with Microsoft putting up comparable numbers for a project they ran on a fancier HPC cluster in 2010. We were heavily compressing it into PNG tiles on the fly, so ours came down to ~60GB.
Regarding pointers to pointers to ints--if you keep all the data within the heap of the Ruby extension, as we did with the wrapped PNG library, then it can be stored in whatever simple data structure you want.
When generating huge numbers of png images, it's worth it to tweak the compression settings. Run png optimizers like optipng/pngcrush on your output, and use the settings it finds.
LodePNG is easy to use, but doesn't let you tweak as many options as libpng.
The tiles I've inspected are 15-30% larger than necessary: besides using sub-optimal compression heuristics, they include unnecessary metadata (creation/modification time, aspect ratio).
-Write your application in Go
-Write application in Ruby
-Profile Ruby application to identify code to be rewritten in C
-Learn C if you don't already know it
-Rewrite various parts of your application in C
-Run Valgrind to search for memleaks
-Ensure that your target systems all have required libs installed
Now replace Go with Ruby and Ruby with Perl. Go has libraries for 99% of what people are doing with Ruby. If you are in that 1%, then you need to evaluate whether it is worth writing the library you need or whether you should use another language.
At least from 1000 feet away, it seems like Ruby has this problem solved better than Go does. Maybe it's just that the nature of the solution is more apparent.
Anywhere you want to publish them.
>Can I install things from this page in that way?
Sure. Just do a "go install github.com/foo/bar". Bitbucket, github, google code, and launchpad are all in the defaults, but you can configure your import path to add whatever locations you like, and then "go list" will show packages available there, and the other go commands will be able to build/install/etc. those. The same syntax is used for import statements, so your code can depend on any code, anywhere:
I think they make it pretty clear: http://golang.org/cmd/go/ gives you pretty much everything about packages and so does "go help". Go pretty much does everything out of the box with the included toolset, from installing dependencies to formatting code to refactoring.
Option A only seems simpler because you ignore the time/complexity in coding the task at hand.
That's what they used to say about Java...
But don't worry the JVM can save you. http://code.google.com/p/jgo/
It's a "cute" platform that I am sure will beat Ruby in a few benchmarks. But I don't see any reason why anyone would seriously consider it for a real world app.
I don't think Ruby's libraries honestly provide a whole lot better coverage than Go. The stdlib is very well considered. It's fast, it scales, it's thread safe in the appropriate places. It's not a toy implementation. What it doesn't provide, it provides hooks for. Third party libs cover all the major bases. And unlike Ruby, writing interfacing code in Go is a snip. I don't see libraries as a downside.
(With one exception. Why oh why did they not include a bigdecimal? Argh, a ratio type is not an adequate replacement. Try rounding a ratio some time. Blarg.)
Also realistically 60-70% of all libraries created for other languages are bitrotted to junk or pointlessness by this point.
If you can't see why an app that fits within the worked problemspace of Go (and even with the current libraries plenty of things are wholly doable in Go) and will require much less server resources might be a good fit, then you shouldn't be making those decisions, because that is the thought pattern of a zealot.
Look, Ruby is well supported, established, and proven, but it is also slow. For a lot of things that is fine, but if you are dealing with something that requires a lot of server-side work on low margins, it will kill you dead. Horses for courses.
P.S. Lots of people including me write lots of real world applications in Go...
That said, Go wasn't necessarily the better choice, because of the subtlety that made the Go program overflow silently. Writing the solution in Scala or even in JS would have probably given him less of an issue.
I recently wrote an image searching program (find a set of images as sub-images in another set of images) and I wrote most of it in Racket (a Scheme dialect). I wrote the inner loop that calculates the correlation matrix in C and made an FFI call.
I'm not familiar with Ruby's FFI, but I found making FFI calls dead simple in Racket.
It's pretty much `gem install ffi` and off you go. I've used a Ruby gem that talks to a hashtable storage library via FFI and found it perfectly stable.
Because it is a huge pain in the ass, and very rarely does any good. As soon as you are writing C, you are dealing with all the headaches that go with it for both development and deployment. You've lost a big part of the benefit of using scripting language X. And this part:
>In many cases, only a tiny fraction of your code needs to be fast
Isn't true. People repeat it a lot, but I've seen no evidence to support it. Most people are using scripting languages for web apps which really are just generally slow all over. They would need to write 80% of it in C to get good performance. Seems like using a language that is as productive as scripting languages, but is also fast, makes a lot more sense.
But was started more than a year ago (the first commit was made in April 2012)
> Would you trust it enough to use it in production in the next two to five years?
This round includes many more community contributions including more Scala, Erlang, Haskell, Python, Java, PHP, etc. And Go 1.1.
It also seems like Matz and ruby core team don't really give a shit about performance.
It is very possible to achieve a high level of performance on an interpreted language!!
There's an argument that removing the GIL would require a lot of effort for little gain (and in some cases decreased performance). It's just not a priority for them.
Of course, every use case is different (sometimes very different), but I haven't ever hit a problem that has caused me to curse the GIL.
I was wrong.
Sure, the author is wrong, but not because Ruby is slow (it is, though). He's wrong because he did not find the right solution for the problem. Instead, he wrote a brute-force solution with a simple optimization, and did not realize that this was the real cause of his code under-performing, blaming Ruby's slowness for it when he really should have blamed the Dunning-Kruger effect.
(That said, programming competitions are a pretty unrealistic subset of what actual software engineering is like. C++ is a fine language for programming competitions, because it's fast, excels at the sort of numeric manipulation that's common there, has a "good enough" standard library with the STL for common tasks featured there, and the common pitfalls of C++ development - memory management and slow build times, for example - won't bite you in programs of competition-length. That doesn't mean the equation doesn't change if you're, say, building a web app or parsing a simple file format.)
To me, the obvious solution is to generate the palindromes, not iterate through every number and check whether it is a palindrome. Unless generating palindromes is a very hard problem, iterating over just the palindromes should be the preferred approach, as it means a substantially smaller set of iterations.
As it happens, generating a series of palindromes is easy, and obviously massively faster. As a first crude approximation of the difference, take his example program, change it to step through the space with "step(10)", convert to string, replace the last digit with the first, check that the resulting number still falls within the range, and then check if it's a palindrome. Just this crude change cuts the number of iterations required to 1/10th, and it hints strongly at the effects of applying the same method recursively.
Having to handle varying length palindromes slightly complicates matters, but not much.
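For the curious, a half-mirroring generator along those lines takes only a few lines. This is a generic sketch in Python (not a patch to the original Ruby program); it builds each palindrome from its first half, handling odd and even lengths:

```python
def palindromes_up_to(limit):
    """Generate all palindromes 1..limit by mirroring half-values,
    instead of testing every integer in the range."""
    result = []
    length = 1
    while 10 ** (length - 1) <= limit:
        half_len = (length + 1) // 2
        for half in range(10 ** (half_len - 1), 10 ** half_len):
            s = str(half)
            # Even lengths mirror the whole half; odd lengths drop
            # the middle digit from the mirrored part.
            p = int(s + s[::-1]) if length % 2 == 0 else int(s + s[-2::-1])
            if p <= limit:
                result.append(p)
        length += 1
    return result

# Only ~2*sqrt(limit) candidates get built, versus limit candidates
# for the check-every-number approach.
print(len(palindromes_up_to(999)))  # 108 palindromes below 1000
```

Varying lengths are handled by the outer loop, so the complication the parent mentions really is minor.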
ruby is slow: sound
ruby's slowness is his main problem: not-sound
> memoizing, storing values/lookups that would be used later, limiting search spaces, etc
He did none of that. Kudos to OP for benchmarking & profiling though -- the speed of Go is impressive.
What I'm trying to say, (stating the obvious):
When the input is large enough, it doesn't matter what language you're programming in. And programming competitions should (and generally do) only focus on that.
But I'll admit that I'm a huge fan of optimizing my algorithms with bit-level operators, extreme care of memory allocation and other tools C offers.
Going a layer deeper, blame the competitions themselves. They very rarely have good data sets to actually weed out algorithmically poor solutions, so you can often get away with brute force or mostly brute force solutions -- if you use a low-level programming language.
For instance many times a O(N^2/100) will pass the test data just because it's written in C++ instead of something slower. That's a failure of the test data.
Languages that demand IDEs tend to be more complicated in many ways. They can be faster, but they also tend to demand more code obfuscation: more code that needs to be written, like type annotations, longer names, and importing of libraries. My pet peeve is that I absolutely love Ruby, but it indeed is not meant for ultimate performance. On the other hand, languages like Dart might allow for some compromise, if you can get stuff done with less power at your hands: less metaprogramming and fewer shortcuts. Except that Dart is trying to tackle asynchronous programming in the standard libraries, which is itself quite complicated (Future what?)
Go and Dart are not themselves too tuned for performance yet, even though they tend to be very quick when compared to Ruby. They tend to solve different problems though.
Ruby has a couple of architects, Matz and SASADA for the VM. Go has a few. Dart has many, with some tending to the standard libraries.
Programming is unfortunately way too complicated. In my opinion, languages like Ruby are wonderful when you don't have a need to hide the source-code. On the other hand, languages like Go and Dart might be good when you do have a need to hide the source-code when deployment time comes.
I was solving that problem with PHP and got it running reasonably fast (1s per range) for ranges up to around 10^60. PHP is probably faster than Ruby, but it's still in the same area.
The first trick you can use is that you don't have to go through all numbers between start and end, rather you can directly only traverse the palindromes (i.e. you will only have to increment one half of the integer).
The second trick is that only numbers consisting of 0s and 1s (and a 2 at the start/end or the middle) can satisfy the condition (exception: the number "3"). This further massively reduces the search space.
There are more criteria that you can use to narrow down the numbers to consider (e.g. the number of non-zero digits can't be greater than 9). But in the end, cracking the 10^100 range is probably not possible in a language like Ruby or PHP. Using something like C++ on the other hand it becomes a triviality.
The first two parts were very easy with only brute force, by working through the square root of the full range once, caching the numbers, and using that cache. I don't see why people (including the author) recalculated the numbers for each given range, especially in part B, which required 10k such ranges.
That said, my Python code, as a relative newbie, took 10s to cache all fair and square numbers from 1 to 10^14. I ran through from 1 to 10^7, checked if palindromic using strings, and if good, checked the square (there were only 39 such numbers in total). I then ran all 10k answers in seconds.
Part C was MUCH more tricky, until I discovered the same trick most others found: that only some of the square root palindromes are valid, with digits 0, 1, and 2. That said, as a Python n00b, my code took 40 mins to run the range from 1 to 10^100. I asked the Google staff if pre-caching is allowed, and they said yes, as long as my code to generate the pre-cache is provided with my submission.
Properly optimizing my python would probably give similar performance. But the point is that with a bit of forethought, even poorly-optimized python code by a newbie was sufficient to solve all three parts of this question.
If anyone is curious, the rule is as follows. Any palindrome where the sum of the squares of the digits add up to strictly less than 10(9 or less) will be the square root of a "fair and square number". From there, I constructed them using a recursive method which finds the first half of numbers fitting that description and mirrors them to get the palindrome.
I ended up getting an incorrect solution on C-large-2, but that was because I failed to realize that I would lose precision when finding the square roots of the bounds of the range. While integers in Python can grow indefinitely, math.sqrt() will give you a double-precision floating point number, which can represent square roots of 10^14-scale numbers accurately but not square roots of 10^100-scale numbers. The diff which turns my program into one that gives correct answers has literally 3 changed lines (to use decimal.Decimal instead of the math library for doing square roots).
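The precision cliff is easy to demonstrate. In this sketch, 10^50 is just an arbitrary stand-in for the contest's 10^100-scale bounds:

```python
import math
from decimal import Decimal, getcontext

x = 10 ** 50 + 1
n = x * x  # a perfect square with 101 digits

# math.sqrt goes through a 64-bit double, which carries only ~16
# significant digits -- far too few for numbers of this size:
print(int(math.sqrt(n)) == x)   # False

# decimal.Decimal can carry enough precision for an exact root:
getcontext().prec = 110
print(int(Decimal(n).sqrt()) == x)  # True
```

With the context precision set comfortably above the digit count of the root, Decimal's sqrt of a perfect square comes back exact.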
Then for each of the 1000 inputs, you run through the set of 46k fair and square numbers, counting how many are between A and B. No square roots required.
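Scaled down, the whole precompute-then-count approach fits in a few lines. This sketch brute-forces roots below 10^6 to keep it quick; the digit-based generation described elsewhere in the thread is what pushes the bound up to 10^100:

```python
from bisect import bisect_left, bisect_right

def is_pal(n):
    s = str(n)
    return s == s[::-1]

# Precompute every fair-and-square number once: squares of palindromic
# roots that are themselves palindromes (roots limited to < 10**6 here).
FAIR = [x * x for x in range(1, 10 ** 6) if is_pal(x) and is_pal(x * x)]

def count_in_range(a, b):
    # Each query is now two binary searches instead of a full scan.
    return bisect_right(FAIR, b) - bisect_left(FAIR, a)

print(count_in_range(1, 100))  # 1, 4, and 9 -> 3
```

The precompute runs once; every subsequent range query is O(log n).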
Really, for pretty much anything I could conceivably think of to call "magic" in Ruby (e.g. method_missing), there's equivalent "magic" in Python (e.g. __getattr__). And yet PyPy is pretty darned fast.
Something like the inject method, at least in my limited use of ruby with these kind of Google Code questions, seemed problematic.
In general, I'd much rather be able to implement my solutions as quickly as possible, as the challenge is usually in 1) finding the correct algorithm and 2) implementing it correctly and fast. If you can do that, choice of language usually doesn't matter, except in some of the problems you might get in the finals (I've never gotten that far; I usually rank somewhere in the top 2000). Also, for me, the ease with which I get things like arbitrary-size integers, sets and dictionaries far outweighs any speed gains from something like C++.
In these competitions, it's about implementing a solution as fast as possible, while still getting it working under the time limit (5/8 minutes). It's mostly about having the right idea and implementing it.
In very many cases, you would like to be able to manipulate and play with algorithms and have efficient mutable data structures. For this, Java/C++ (Essentially C with data structures) is very suitable.
However, in some cases you'd like to perform advanced simulations, mathematical calculations, or manipulations where immutable data structures are more suitable. In those cases, you would LIKE to have magic: arbitrary-precision numbers, ratios, (efficient) immutable data structures, and a more functional style would not only eliminate work but also reduce the number of errors you're likely to make.
The important part is that the implementation uses the right idea, not that the implementation is fast. Usually it's handy with a lower-level language to implement the algorithm right, but at times, a higher level (functional) language is better when it comes to implementation speed.
I fail to see how "magic" properties of a language matter if you've got reasonable algorithmic chops and focus on solving the problems rather than trying to be clever.
If you're familiar with knives, why not bring a knife to a gun fight? If knives are your forte, surely you should use them?
Execution speed is nice, I don't disagree. But it's not always the be-all and end-all deciding factor for what tool to use to solve a problem in a specific context.
Competitions can be stressful, and choosing a tool that feels like an extension of yourself (be that Ruby, COBOL or whatever) is, to me, the better route. Compared to something that executes faster on your computer but not in your head.
That doesn't mean you'll lose (à la your Steven Seagal analogy); maybe you're so much better than everyone else that you can overcome the disadvantage. But you'll surely be starting in a bad position compared to someone holding a 'gun', even if he or she is somewhat less skilled with the gun than you are with the knife.
In the case of the OP, for example, a trivial optimization (the kind you do without even thinking in these competitions) would've brought his execution time down 10000x. Basically, you are asked for the number of "fair and square" numbers in a certain range, over and over again, in 10000 different cases. He could have just taken his solution, pre-generated all those numbers once, and then run all those test cases by counting in that pregenerated list, instead of generating it for every single test case (all 10^5 of them). It turns out that in the whole span of 1->10^14 there are only 39 such numbers, and it would've taken his program probably less than a minute to generate all of them (I'm being generous here, since it ran 10^5 test cases in 53 min).
In fact, if you read the problem analysis later provided by google you will see that in fact you were expected to make this optimization to solve the problem. The fact that C-small had 100 test cases, C-large-1 had 10^5 and C-large-2 had 100 should've been a strong hint that C-large-1 required some pre-caching across test cases.
That wasn't the context in which I was writing my original reply, however. I was responding to a post that mentioned "magic" as a reason to disregard Ruby for competitions, which I didn't agree with. (The Ruby solution in the submission was slow even using a while loop, which is far from magic.)
If you're able to solve problems in the same amount of time using a fast and a not so fast language, of course you pick the faster one in a competition. I'm not arguing against that.
> If the goal of the competition is to slice apples
The ultimate goal in a competition is to solve the problem within a fixed amount of time. This requires you to actually implement the solution to be able to run it. If you are able to do that more easily with a language that is inferior in execution speed, that will probably be the best strategy. At least compared to using a really fast language that you are not as comfortable with.
As long as you start under 21 feet away, the knife is surprisingly effective. See http://www.policeone.com/edged-weapons/articles/102828-Edged... for more.
It is a question of knowing your tool, and knowing how to stay within its limitations.
It's just that for performance reasons some languages really are better than others.
If you have only 1-5 seconds of computation time, you're going to choose a compiled language every time.
In any case, I see both python and ruby on the page you linked to, so perhaps I'm missing something.
and on that page it says:
Each submitted source program must be written in C, C++ or Pascal, it must be smaller than 100 KB, the evaluation server must be able to compile it in less than 10 seconds and at most 256 MB of memory. No other extensions different than the ones listed above will be run.
Not that I disagree with your premise - if you need to write highly performant code, lower level languages will pretty much always be faster. But abstraction is extremely beneficial in many other circumstances.
from math import sqrt
import sys

def string_palindrome(num):
    s = str(num)
    return s == s[::-1]

sys.stdin.readline()  # throw away first line (number of cases)
for count, line in enumerate(sys.stdin, start=1):  # cases are numbered from 1
    found = 0
    start, finish = [int(num) for num in line.split(" ")]
    sqrt_start = int(sqrt(start))
    sqrt_finish = int(sqrt(finish))
    for x in xrange(sqrt_start, sqrt_finish + 1):
        square = x * x
        if string_palindrome(square) and start <= square <= finish:
            found += 1
    print("Case #%d: %d" % (count, found))
pypy 2.0-beta2: 1.6 seconds
pypy 1.9: 2.1 seconds
ruby 1.9.3p194: 3.7 seconds
python 3.3.0: 5.9 seconds
python 2.7.1: 6.2 seconds
Edit: Even though the integer version of _palindrome is slower in pypy, there's another optimization that works there, but is slow in cpython. It brings the run time down to 0.6 seconds!
def string_palindrome(num):
    num_str = str(num)
    num_len = len(num_str)
    for i in xrange(0, num_len // 2):
        if num_str[i] != num_str[num_len - i - 1]:
            return False
    return True
ispalindrome = lambda a: a[::-1] == a
(define (palindrome? lst)
  (equal? lst (foldl cons empty lst)))

(define ITERATIONS 10000000)

(for ([i (in-range ITERATIONS)])
  (palindrome? (string->list (number->string i))))

time racket palindrome.rkt
I took some code off SO and it bumped up the runtime substantially:
(require srfi/1 srfi/26)
(define (digits->list num (base 10))
(unfold-right zero? (cut remainder <> base) (cut quotient <> base) num))
(define (palindrome? lst)
(equal? lst (foldl cons empty lst)))
(define ITERATIONS 10000000)
(for ([i (in-range ITERATIONS)])
(palindrome? (digits->list i)))
racket palindrome.rkt 11.66s user 0.55s system 99% cpu 12.293 total
"ruby 1.9.3p194: 3.7 seconds"
which, btw, is a straight port of the original Ruby script that takes 53 minutes to run
so: 3.7 seconds seems quite impossible
There are actually several people who solved this problem (including both large inputs) using Ruby.
And for those asking about Python, most Code Jam problems are apparently calibrated to be solvable using Python. Keep in mind that it's one of the languages Google uses internally and Guido used to work there, so they probably want to support it. From https://code.google.com/codejam/problem-preparation.html:
"Where possible, if you allow multiple programming languages, make it so that all languages are fast enough to solve the problem. Occasionally that's impossible; in Code Jam we generally calibrate by making sure a somewhat optimized Python program can solve a problem. Sometimes, though, there's just no way to have Python succeed with the best algorithm and C++ fail with a worse one. In those cases we'll typically decide that sometimes in life, you just need to use a faster programming language. We try to avoid making that decision if we can."
As others have stated, the contests are designed such that language shouldn't matter and problem sets are deemed solvable with Python as a baseline. For this particular question, the official contest analysis goes into detail on how to reduce the solution space:
Now the issue becomes the trade off between development speed vs runtime performance.
Is it possible that you might solve the problem with a worse solution in a better performing language? Yes.
Is it possible you might finish a solution within the time constraints in a more expressive language? Yes.
If something is CPU bound in production, then going with a better performing language is a viable consideration. However, my goal in programming contests is often to become a better programmer in the process. As a result, I choose to focus more on improving algorithmic complexity which is language agnostic.
(For the pedantic: I understand that language performance is different from implementation performance.)
Edit: Corrected links, thanks victorhn.
And the most difficult version (L2) was solved by 19 contestants.
I don't want to detract from the article's main point with a cliche "algorithmic optimization beats fine tuning", but I think it's worth mentioning. Unless I'm mistaken, this particular problem can be solved about 10,000,000x more efficiently.
> Given two numbers X and Y, return how many numbers in that range
> (inclusive) are palindromes, and also the square of a palindrome.
If the original algorithm runs in 10^N time, this would be 10^(N/2). For N=14 as mentioned, that's roughly a speedup of 10,000,000.
Edit: It seems I misinterpreted the question (which sounds like you're supposed to enumerate all palindrome numbers X>=N>=Y, and also print out the squares of those numbers). This is actually asking for a particular type of palindromes -- those which are palindromic squares of palindromes. As macobo helpfully points out, you can drastically optimize mathematically from this additional filter.
From memory, there were under 50,000 such palindromes in the range 0..10^100.
Go is not an acronym
See also: "ham" (amateur radio) is not an acronym
OP's last benchmark was only for the iteration part. Look just above it and he posts the Ruby times at around 5m.
Still, pretty drastic.
for each range (start to sqrt(end))
    for each number in the range
        check if the number is fair and square
The right approach is to first compute all fair and square numbers between 0-10^14, store the result, and then simply check how many are found in each range.
It turns out there are only 39 fair and square numbers between 0 and 10^14.
I participated in Code Jam 2013 and passed the qualification round using Python; after precomputing the fair and square numbers, checking ranges was pretty much instantaneous (for large input 1).
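A minimal Ruby sketch of that precompute-then-query approach (my own illustration, not the submission described above) looks like this:

```ruby
# A fair and square number is a palindrome that is also the square of a
# palindrome, so it suffices to enumerate palindromic roots m with
# m*m <= 10**14 and keep the squares that are themselves palindromes.
def palindrome?(n)
  s = n.to_s
  s == s.reverse
end

LIMIT = 10**14

# Build every palindromic root by mirroring a "half": 12 -> 1221 (even
# length) and 121 (odd length). Halves up to 9999 cover all roots below 10**8.
fair_and_square = []
(1..9_999).each do |half|
  s = half.to_s
  [s + s.reverse, s + s.reverse[1..-1]].each do |p|
    sq = p.to_i**2
    fair_and_square << sq if sq <= LIMIT && palindrome?(sq)
  end
end
fair_and_square.sort!

# A query is then just a count over the tiny precomputed list (a binary
# search would be faster still, but with 39 entries it hardly matters).
def count_in_range(list, lo, hi)
  list.count { |n| n.between?(lo, hi) }
end

puts fair_and_square.size                      # 39, matching the count above
puts count_in_range(fair_and_square, 1, 1000)  # 1, 4, 9, 121, 484 -> 5
```

Because roots are generated by mirroring halves, only about 20,000 candidates ever get squared, so the precomputation itself is near-instant even in Ruby; after that, every range query is a lookup over a 39-element list.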
EDIT: I modified the Go program to use int64 instead of int (the upper bound of the first case is not a valid 32-bit integer), and the execution time is now much higher: 2 minutes 41 seconds on my laptop.
Presumably that was the target GOARCH the OP was aiming for since int on GOARCH=amd64 is 64-bit (int mirrors the native bit-ness of the architecture in Go instead of virtually always being 32-bit like it is in C/C++).
Using int64s on a 32-bit GOARCH (which I assume is your current GOARCH if int is overflowing after 32-bits) will certainly result in a lot of slowness compared to using int64 or int (which will be the same as int64) on a 64-bit GOARCH.
Also as far as speed of execution goes, it'll likely depend greatly on the version of Go you're using. The RC releases of Go 1.1 are much, much faster than Go 1.0.3 for a lot of code.
Seriously, in a language designed around 2010, a common "int" variable can be 32-bit? 64-bit should be the minimum, because 32 bits is too small to express common values these days. It's 2013: if you actually want a 32-bit integer, use int32. Better yet, make "int" always a particular size and use "register_t", "int_fast_t", or something similar for a native-width int.
You can't even assign a smaller type to a larger one in Go without a cast, but you can recompile it on a different arch and, like this program, have it completely fail because the size of the variables changed. And baked-in coroutines so that you won't run out of address space on 32-bit? Really?
Go is so archaic in many ways like this. It's like somebody from the 70s tried to come up with a better C than K&R.
I share your annoyance to some degree and think it gets even worse when you mix CGO into the equation since people writing CGO will often leave C ints as Go ints when porting structs and function signatures, which is a terrible idea (because on a 64-bit system with a 64-bit C/C++ compiler and a 64-bit Go compiler, int will be 32-bit in C but 64-bit in Go).
OTOH, having int be variable-sized allows you to easily have maps or other int-indexed structures that naturally grow in capacity with the current architecture and I'm glad they didn't just peg everything to 64-bit as a 'modern' shortcut because I do a lot of Go programming that targets 32-bit ARM CPUs and the performance would be shit if the Go runtime just assumed a 64-bit baseline.
In any case, well written Go code will generally use the specific int type that makes sense for the data at hand, int16, int32, int64, etc, and not simply int. A programmer declaring a lot of un-qualified 'int' variables (for things other than loop indexes and such) in his or her Go code is (in most cases) doing Go wrong.
"It's like somebody from the 70s tried to come up with a better C than K&R."
That's basically what Go was meant to be and I suspect the Go team would consider your description of Go to be more flattering than you perhaps meant it to be.
That's why you have a -fint-is-32-bit flag or some way to bend the standard to make it run faster on those systems when needed. There's no way Go with its shared memory between threads is going to survive any future shift to 128-bit computing anyway, so just making int always 64-bit is the best choice; if your maps or indexed structures are so vast, then you'll need better abstractions than coroutines.
> [better C than K&R is] basically what Go was meant to be and I suspect the Go team would consider your description of Go to be more flattering than you perhaps meant it to be.
Well it certainly was not intended to be flattering, but I suspect you are correct.
UPDATE: After updating go from 1.0.2 to 1.1, this matches my results - it now takes about 2 minutes, 47 seconds to run through the input (and Code Jam confirms the output is correct). This seems like a significant detail to me, so I'll update the post.
More importantly, the argument that Ruby is "too slow" seems to hinge on the L1 input for the Fair and Square problem being solvable in "fast" languages with the same unoptimized algorithm used for the small S input. If you look at the Code Jam results, this is simply false. In fact, a large percentage of Ruby users passed the Fair and Square L1 problem, higher than in many "fast" languages:
Language % of S input solvers successful on F&S L1 input
Ruby ---------- 29.6 (112 L1, 378 S)
C ------------- 20.4 (199 L1, 973 S)
Java ---------- 27.6 (1183 L1, 4285 S)
Go ------------ 25.4 (15 L1, 59 S)
(from http://www.go-hero.net/jam/13/languages/0 )
I think this means you need to rethink your conclusions.
In fact, algorithm contests are usually designed so that the small inputs are solvable by brute-force algorithms, but you need to optimize to solve the larger inputs and harder problems. It's not just people using Ruby and other "slow" languages who have to rethink their approach.
You are supposed to use int when you don't care about its size (int = native word size). If you are writing a program where it matters you should explicitly use int32 or int64.
Notice that your solution runs in O(sqrt(N)) time. If N were up to 10^14, with 10^3 test cases, your solution could have to deal with up to 10^10 iterations... which is dangerous even for a fast language, since your CPU can only do about 10^9 things per second. As a rule of thumb, one should only ever deal with at most 10^8 things. Even just incrementing an integer 10^10 times can take some time (fortunately, not every one of the test cases was 10^14, so your Go solution took under a second).
However, Problem C of Google Code Jam qualification 2013 can be easily solved, even with a slow language, by making some observations:
1) Instead of iterating through numbers and checking if they are palindromic, you can just iterate through the numbers with half the number of digits and generate the palindrome by mirroring the half. A naive solution that uses this would run in O(N^(1/4) log(N)^1.6) instead of your O(N^(1/2) log(N)^1.6) solution. (The log(N)^1.6 is due to the time taken for multiplication, assuming the Karatsuba method).
2) With a small amount of math you can figure out that for any fair and square number N = M^2, the sum of the squares of the digits of M must be less than 10 (otherwise a carry occurs during the squaring and the result is no longer a palindrome). Combined with M itself being a palindrome, solutions are of the form:
case 1) Contains up to nine digits 1. (e.g. 101, 111111111)
case 2) Contains up to one digit 2 and up to four digits 1. (e.g. 1002001, 2)
case 3) Contains up to two digits 2 and up to one digit 1. (e.g. 200010002, 202)
case 4) Contains one three. (i.e., just 3)
A naive solution using observation 2 may run in O(log(N)^5.6) time.
3) Use combinatorics to count the ways to arrange the digits in the four cases shown above, rather than iterating through them. As it turns out, this is unnecessary, since even an O(log(N)^6) solution passed the 10^100 input. Since there are only about 40,000 fair and square numbers from 1 to 10^100, you can easily precompute them and find them by binary search.
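Incidentally, the "at most 10^8 things" rule of thumb from the start of this comment is easy to sanity-check yourself. The sketch below (my own; absolute timings vary by machine and Ruby version) times a bare counting loop and extrapolates:

```ruby
require 'benchmark'

# Sanity check of the "at most 10^8 things" rule of thumb: time a bare
# counting loop. Absolute numbers vary by machine and Ruby version.
n = 10**7
seconds = Benchmark.realtime do
  i = 0
  i += 1 while i < n
end

puts format('%d increments: %.3fs', n, seconds)
puts format('naive extrapolation to 10^10 iterations: ~%ds', seconds * 1_000)
```

Even a loop that does nothing but increment takes on the order of a second per 10^7..10^8 iterations in MRI, which is why 10^10 iterations is off the table regardless of algorithmic cleverness elsewhere.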
This is one of the (many) reasons why although languages like Ruby and Python are great for rapid development, on-the-fly updates and other reasons, it can pay dividends to be fluent in a high-performance language. It wouldn't even need to be C (although this wouldn't be hard to do in C) -- a language like D would give you excellent performance and nearly as easy development as Ruby.
In the future, I obviously need to work on coming up with better algorithms for solving these kinds of problems, but I'll probably continue to try and use Go because it's nice to have that fallback to sheer speed when my algorithm isn't great.
I benchmarked the I/O improvement I was thinking about. It actually only resulted in an improvement of about 30%. Worthwhile, probably, but not as "dramatic" as I thought it would be. Mea culpa.
It was only an EXAMPLE. It might be irrelevant to this particular issue, but look at the context of my comment: it isn't irrelevant to the subject of learning to code more efficiently.
But I would still suggest that, within the context of programming competitions, a certain kind of "programming efficiency" is close to irrelevant. A sloppily coded solution that nevertheless implements the proper algorithm can be many orders of magnitude faster than an efficiently coded implementation of a slow algorithm. Programming contests are more about having the insight to implement an algorithm having low big O, not so much about efficient coding practices. This isn't to say that efficient coding practices aren't important generally, just that there are much more important things to focus on in programming competitions.
I'm sure if the competition was based on speed of implementation there would be different conclusions.
I really hope the Ruby community actually admits Ruby is slow, because until you admit there is a problem, there will never be any desire to fix it.
Most of the time, they come up with Rails examples or serving dynamic web pages, where the bottleneck is the database, and the higher the request volume gets, the more you can solve with caching.
This is, in my view, a very limited scope of Ruby usage.
They deny any fault on Ruby's part, suggesting that it is a dynamic language and it's wrong to compare it to compiled languages, or that the algorithms were done wrong.
They keep suggesting that under xxx usage, Ruby is fast enough to get the job done.
Then there is the argument of throwing more money and hardware at the problem to grow and scale. But the truth is that most of the time Ruby will require more than a few dedicated servers to keep going. It may not be as drastic as iron.io's 28-to-1, but for a startup, lowering running costs is important too.
Then there is the argument that comes up which I hate most: you are not going to worry about needing that many servers, or about scaling forcing a move to another language, unless you are AirBnb, Twitter, etc. I mean, what? I may not have worded it correctly, but most of the time those read like "you are never going to be successful enough to have to worry about it."
Scripting languages should be used for what they are for: a thin layer on top of other, more performant languages.
Take Twitter: Ruby/Rails allowed them to bootstrap their idea, "probe" a market, and make it work; only then did they think about performance and scaling. That's what scripting languages are for.
Why should Ruby (and scripting languages) be used for RAD and not performance? That's because implementations range from really slow to somewhat slow. Self, LuaJIT, V8 and IonMonkey, and increasingly PyPy are ample evidence that this need not be the case, and that's just listing the dynamically-typed languages.
The bar has been raised the past few years. Increasingly we can pick three of: productivity, libraries, performance. Ruby lacks the last, which makes it a less interesting prospect for future greenfield projects compared to its competition.
For example the implementation of the Mandelbrot algorithm in RUBY takes 47 MINUTES to complete, while it takes LESS THAN 30 SECONDS when doing the same computation using much faster languages such as JAVA, SCALA, C or FORTRAN. According to this performance test, those languages are more than 100 times faster than Ruby, Python or Perl.
Source: Computer Language Benchmarks http://benchmarksgame.alioth.debian.org/u32q/performance.php...
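To make concrete what that Mandelbrot benchmark is exercising, here is a minimal escape-time check (a sketch of my own, not the benchmark's actual code):

```ruby
# A minimal escape-time check of the kind the Mandelbrot benchmark exercises:
# a tight floating-point loop with no allocation, which is exactly where
# interpreters pay the most per iteration.
def mandelbrot_iterations(cr, ci, max_iter = 50)
  zr = zi = 0.0
  max_iter.times do |i|
    zr, zi = zr * zr - zi * zi + cr, 2.0 * zr * zi + ci
    return i if zr * zr + zi * zi > 4.0
  end
  max_iter
end

puts mandelbrot_iterations(0.0, 0.0)  # interior point, never escapes: 50
puts mandelbrot_iterations(2.0, 2.0)  # exterior point, escapes at once: 0
```

In the benchmark this inner loop runs once per pixel over a large grid, so it is pure arithmetic with no I/O or database to hide behind, which is why the 100x gap shows up undiluted.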
Other benchmark speed comparisons between programming languages, showing similar results:
Between specific programs, using specific programming language implementations.
Everyone here who has done any C/C++ or Ruby development would probably agree that if you're looking for number crunching, Ruby is just not cutting it unless you're into rockin' C extensions and getting what you want... for now.
Ruby, within a few years, will probably have some VM action, whether the JVM with JRuby or Rubinius wins out, who cares.
In any case, in like 5 or 6 years when we all have 50/100 core machines, Ruby's gonna be just fine for almost everything. Maybe by then we will be done arguing and just agree that Ruby is the nicest language to look at and develop in, even if it has subpar performance. Because if you're arguing something different, you've probably never gone from Ruby to C or Java or even Python... just so you all know, it blows.
(All the usual benchmark disclaimers apply: your mileage may vary, your app is always the best benchmark, etc., etc.)
Ruby is best when it's used for the things it was meant for, but that hasn't kept people from trying to use it for more stuff. That's how Ruby found a good use in servicing web apps, first with CGI and then with dedicated instances in Rails and the like.
Not including eval seems to be a "discipline and bondage" fetish of language designers.
Yes, you can write pretty awful code with eval, but it allows metaprogramming as well. I'd like to be the one to judge whether something is awful or awesome.
In Dart the code declaration is taken very seriously in many ways, and avoiding eval buys them more compile-time, loading-time and perhaps runtime optimization opportunities. I find that more than the language designers themselves, it's the end-users or developers that ask for more "bondage and discipline" from their tools. People who come from Java and C++ backgrounds, for example. They just can't have enough compile-time stuff aided by an IDE.
The actual disclaimers shown on the benchmarks game website apply ;-)
Sounds good. I am a happy Ruby developer.
Porting his code to PyPy would be very interesting, though.
I treat it as a challenge, though, as it forces me to be smarter about my implementation.
Are you sure? Can you post a number of which you think Python can't handle?
time python -c 'print (5**(5**5))*(3**(3**3))' \
| python -c 'import sys; print hex(int(sys.stdin.read()))' \
| wc -c
>>> 3**1000 132207081948080663689045525975214436596542203275214816767783138506080619639097776968725823559509545821006189118620803878014774228964841274390400117588618041128947815623617812548034440554705439703889581746536825491613622083028969185564040848989376093732421718463599386955167650189414350385648747165832010614366132173102768902855220001
[snipped because HN does not allow 20k comments]
I heard some talk about allowing Python at the International Olympiad in Informatics http://www.ioinformatics.org/index.shtml (in addition to Pascal and C(++)) but that's probably far off
Java compiles the bytecode on the fly. Back in older JVMs where Java interpreted the bytecode it was also slow as hell.
Ruby: 3180 seconds
Ruby (profiled and optimized): 342 seconds
Go (first attempt): 0.7 seconds
This doesn't say "don't use Ruby, it's too slow." It says: prototype in your comfort zone, then optimize. Isn't this what we all agree on?
Edit: auto correct fail.
I allocate the array ahead of time, as soon as I know the size.
The equivalent C/C++ version took under a second.
One thing that Code Jam does differently (and imho better) is that they give you an input set and you have 8 minutes (or so) to upload the solution. This is expected to give more than enough space for language speed variance.
Instead of execution time, coding time is much more precious in such competitions.
Micro-optimizing has its place, but that place is usually not a programming competition; that's the leverage using Ruby instead of C gives you, as you can express more complex ideas and algorithms more easily (which should be the focus).
This doesn't mean you shouldn't know the cost of operations.
Unfortunately, in such competitions you'll be competing against people who write C++ algos faster than normal people writing Python.
That doesn't mean that there aren't also people who have done the same and still prefer python (or some other language) not for execution, but development speed (and bignum, larger standard library...)
Using a faster language implementation could have allowed you to use the same brute-force algorithm and come in under the time limit, but your real problem is your algorithm, not your language implementation.
You can write fast code in a fast language and get very fast running time.
You can also write slow code in a fast language, or fast code in a slow language, and still have acceptably fast running time.
But you wrote slow code in a slow language, the worst of both worlds.
[master] clifff@fair_and_square: ruby -v
jruby 1.7.3 (1.9.3p385) 2013-02-21 dac429b on Java HotSpot(TM) 64-Bit Server VM 1.6.0_43-b01-447-11M4203 [darwin-x86_64]
[master] clifff@fair_and_square: time ruby fair_and_square.rb C-large-1.in.bak
time ruby -Xcompile.invokedynamic=true file2.rb C-large-1.in
Edit: as electrograv pointed out, another optimization is to generate palindromes from sqrt(start) to sqrt(end) rather than testing each number in the range for palindrome-ness. This saves you a factor of 10^3.5.
--> "You may always want to consider other languages too."
No one should ever agree with an ultimatum on language choice: if you need hardcore speed, use a lower-level language, not an interpreted garbage-collected scripting language. Duh.
 Richard Buckland's awesomeness - http://www.youtube.com/watch?v=JTAkUs-NjxU#t=44m21s
Or even if you don't have a side project or startup or something, go build something cool and open source it and do that, rather than working on some artificial problem.
You can click the little down arrow all you want, but it won't keep me from thinking you're mostly wasting your time with these competitions, and it won't put any money in your pocket.
Seems to me challenging the brain is never a waste of time. We would not have some of the medical technologies we have today if it were not for massively parallel GPUs, which of course were developed for the sole purpose of playing games.
That's a big debate! For a lot of people, I would, though, yes. There are serious economists who posit that universities are mostly about signalling, rather than actually learning much. I don't think I agree with that entirely, and believe it depends a lot on what's studied. It also depends on the opportunity costs, which may not be that high for many people that age.
> GPUs, which of course were developed for the sole purpose of playing games.
I don't believe the creation of games is a zero-sum game at all; there can be many winners, and, as you say, there are other side effects that are beneficial. The creation of games also falls into one of business, side-project or open source project as mentioned in my other post, no?
Artificial coding competitions aren't really about creating much of anything though, from what I can tell. A bunch of people solve the same thing and then throw away what they made.
I agree that challenging your brain isn't a waste, but it seems that in a world that is desperate for coders, there are more productive ways of challenging your brain in a way that's perhaps even more satisfying, because of the positive feeling of having helped to create something useful.
Fair enough. For myself, it was what led me to my career. I found that a large number of exams followed a format similar to this very coding competition. Many times on the job I have thought back to exam questions for inspiration on the problems I was solving. In that sense, I most definitely do not classify my education as "dicking around".
Which brings me to my original point: when video games were being developed, these admitted benefits were certainly not obvious at the time. We simply wanted to render things faster and at higher resolutions. Now other applications of the same technology are ubiquitous in many regards.
As a final tangential point, I don't think the coding competition is wasteful just because it lacks immediate intrinsic value. See the exams example above, and really, the discussion here. Also, perhaps this was his so-called "downtime". Many people would argue that not all facets of our life should be strictly geared towards productivity. He may already run a business or hold a day job (maybe his blog gives clues otherwise; I don't know).
I'm clicking the down arrow, because not only are you being a dick, you're adding nothing to the discussion.
No thanks for the name calling.