As a Python programmer who often works with these sorts of problems, this reinforces my perception that Julia, while not suited to every programming problem, offers a real productivity gain when it comes to mathematical/algorithmic computing.
Sure, I could sit and tweak the Cython code or rewrite the hot loops in C++, but it seems to me Julia hits a sweet spot where I am less likely to have to trade programmer performance for computational performance.
I will note, however, that the code in the OP doesn't seem like the clearest implementation of the algorithms in either language; some of that heavy nesting should be factored out, for a start.
This kind of generality isn't necessary for a lot of user-level code, where there's a specific application, but it is quite important when writing libraries that need to be as reusable as possible. And this is where the Cython solution has some problems – sure, you can make it pretty fast, but in doing so, you lose the generality that the original Python code had, and your code becomes as monomorphic as C.
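As a toy illustration of that generality point (my example, not from the post): the pure-Python version below works for ints, floats, Fractions, or Decimals alike, whereas a Cython-accelerated equivalent pins the element type down in its signature and, like C, accepts exactly one.

    from fractions import Fraction

    def running_max(xs):
        """Clip a sequence to be non-decreasing; works for any ordered type."""
        out, hi = [], None
        for x in xs:
            hi = x if hi is None or x > hi else hi
            out.append(hi)
        return out

    # Duck typing: exact rationals work unchanged...
    print(running_max([Fraction(1, 3), Fraction(1, 4), Fraction(1, 2)]))
    # ...while a typed Cython signature such as
    #     def running_max(double[::1] xs): ...
    # is fast but monomorphic, accepting only C doubles.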
In general, if anyone has performance improvements or code cleanups to the Cython PAVA code (the implementation used by scikit-learn), please benchmark and send a pull request over to the scikit-learn team. I'm sure they'll be very happy to accept improvements.
The comment

    # solution[i:k + 1] is a decreasing subsequence, so

should presumably read

    # solution[i:k] is a decreasing subsequence, so

and the loop

    for i in range(n):
        solution[i] = y[i]

can be written more simply as

    solution[:n] = y[:n]
Consider applying it to this sequence:
1, 2, 3, 4, 5, 6, 7, 6, 5, 4, 3, 2, -30.
I cannot read Julia yet, so I am not sure about the Julia version, but the others do indeed look very much like a quadratic-time algorithm, at the least. The behavior will of course depend on the input, so on certain inputs they may have a runtime that scales linearly. But complexity addresses the worst-case behavior, and as I showed, coming up with such bad cases is not hard, and they need not be very unnatural either.
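To make the failure mode concrete, here is a toy Python sketch (my illustration, not the code from the OP) of a naive PAVA that rescans from the front after every pool merge. On the sequence above, the tail drags the whole increasing run down through repeated re-pooling, and each merge pays a full scan, which is where the quadratic behavior comes from.

    # A deliberately naive PAVA: after every merge it restarts the scan from
    # the front, so each merge costs O(n) and the whole run is quadratic on
    # adversarial inputs like the sequence above.
    def naive_pava(y):
        solution = list(y)
        weight = [1] * len(y)
        i = 0
        while i < len(solution) - 1:
            if solution[i] > solution[i + 1]:
                # pool the violating pair into a weighted average...
                w = weight[i] + weight[i + 1]
                solution[i] = (weight[i] * solution[i]
                               + weight[i + 1] * solution[i + 1]) / w
                weight[i] = w
                del solution[i + 1]
                del weight[i + 1]
                i = 0  # ...and naively rescan from the start
            else:
                i += 1
        return solution, weight

    bad = [1, 2, 3, 4, 5, 6, 7, 6, 5, 4, 3, 2, -30]
    print(naive_pava(bad))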
Does Julia do any inlining? Or are there any plans for it?
If so, that would be one aspect where it could shine. As mentioned in the comment the Cython version is unfactored and manually inlined for performance. If Julia inlines in its JIT pass, such measures would be unnecessary and would allow one to write cleaner code.
I'm not claiming I've written the fastest possible implementations or anything though - please feel free to improve any of this code and submit to scikit-learn if you find some improvements!
Oh, I never thought that you were claiming so. Thanks for clarifying the implementation. Julia is indeed something that I want to pick up. BTW, what data structure does this snippet construct?
ydash = [1 => y]
> please feel free to improve any of this code and submit to scikit-learn if you find some improvements!
Ah! Then you would all hound me for my insistence on keeping https://bitbucket.org/sreangsu/libpav LGPL, a discussion I don't want to engage in. BTW, I am not even sure if libpav would be faster as it stands now, because it does some really stupid stuff, like unnecessary allocation/deallocation on the heap. It's an old piece that I wrote years ago to entertain myself (proof: comments like these
// I hereby solemnly declare
// that 'it' is not aliased
// with anything that may get
// deleted during this object's
// lifetime. May the runtime
// smite me to my core otherwise.
It's such a strange coincidence: I started hacking on it again just a few days ago, and lo and behold, a thread on HN on PAVA. Now I am motivated enough to fix my crap :) Once I get rid of that damned linked list I think it would be decent enough, although there is a fair chance that it would be faster than the current Julia version.
That said, libpav does a lot more than just isotonic regression; it's more of a marriage between isotonic regression and large-margin methods like SVMs. Even for just the isotonic regression case, it can work with any abstraction of an iterator (so it is container agnostic), can take an ordering operator, and is of course fully generic (meaning it can handle float32 and float64 without any change). Something like the sketch below, roughly.
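Not libpav itself, just a rough Python analogue of the kind of interface described above: any iterable in, a caller-supplied ordering operator, and no assumption about the numeric type beyond arithmetic and comparison.

    import operator

    def pava_generic(values, violates=operator.gt):
        """Isotonic regression of any iterable, O(n) stack-based PAVA.

        violates(a, b) == True means the pair (a, b) is out of order, so
        passing operator.lt instead fits a decreasing sequence.
        """
        blocks = []  # stack of [mean, count] pools
        for v in values:  # container agnostic: any iterable works
            mean, count = v, 1
            # pool backwards while the new block violates the ordering
            while blocks and violates(blocks[-1][0], mean):
                m2, c2 = blocks.pop()
                mean = (mean * count + m2 * c2) / (count + c2)
                count += c2
            blocks.append([mean, count])
        out = []
        for mean, count in blocks:
            out.extend([mean] * count)
        return out

    print(pava_generic([1.0, 3.0, 2.0, 4.0]))  # [1.0, 2.5, 2.5, 4.0]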
So if I can call scipy and get the result easily in one line, I would do that. When implementing new algorithms, it will indeed be easier with Julia.
PyPy unfortunately cannot be used by most people who would benefit a lot from it, because numpy/scipy aren't currently supported.
I am curious whether you tried running your PyPy code with Numba.
I would argue that my personal choice comes down to 1) familiarity with the language (i.e., pick the most comfortable language for the task at hand) and 2) the performance required.
Implement the algorithm inside a "with nogil" block and you will see some nice speed-ups. It won't look pretty, but for certain performance-sensitive code it will be worth it.
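A minimal Cython sketch of that suggestion (a hypothetical function, not scikit-learn's actual code): inside a nogil block Cython only allows pure C-level operations on typed data, which is exactly the constraint that makes the loop fast, and it also lets other threads run in the meantime.

    # cython: boundscheck=False, wraparound=False
    def scale_inplace(double[::1] y, double factor):
        cdef Py_ssize_t i, n = y.shape[0]
        with nogil:  # no Python objects may be touched in here
            for i in range(n):
                y[i] *= factor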
Yeah, the active set implementation in Cython is relatively slow - see the earlier discussion of how I benchmarked the existing algorithm and demonstrated that the PAVA implementation was faster.
Then again, until two days ago the Cython active set algorithm was the production implementation of isotonic regression in scikit-learn, so it's not like it was a strawman example that I made up.
Also, how are you timing the Python? When I run this on a sample size of 10 using IPython's %timeit, it takes a few microseconds, which is two orders of magnitude off from what you report.
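For reference, one way to reproduce such a measurement outside IPython (the function name isotonic and the array y are placeholders here, not anything from the post); %timeit does essentially this with automatic repeat counts:

    import timeit
    setup = "from __main__ import isotonic, y"  # hypothetical function/data
    best = min(timeit.repeat("isotonic(y)", setup=setup, repeat=5, number=1000))
    print(best / 1000)  # best-of-5 per-call time, in seconds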
Basically, the reason MATLAB/Numpy/R folks harp so much about vectorizing code is because their interpreters are slow, and within a vectorized function they can punt to some C/C++/Cython implementation. But Julia loops compile down to almost ideal assembly; they are remarkably fast. In fact, verbose loops acting in-place almost always beat the vectorized alternatives because they don't need to allocate space for the result.
The wonderful thing about Julia is that you can code it vectorized first, and if that's not fast enough (or if it uses too much memory), you can easily rewrite it in the same language.
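To illustrate the Python side of that claim with a toy measurement (CPython plus NumPy assumed; my example, not from the post): the interpreted loop pays per-element interpreter overhead, while the vectorized call dispatches once into C.

    import numpy as np
    import timeit

    y = np.random.rand(1_000_000)

    def loop_sum(a):
        total = 0.0
        for x in a:  # every iteration round-trips through the interpreter
            total += x
        return total

    print(timeit.timeit(lambda: loop_sum(y), number=3))  # slow: Python loop
    print(timeit.timeit(lambda: y.sum(), number=3))      # fast: one call into C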
I wouldn't want to claim an informed opinion about whether it is a good style or not, but a block with no side effects isn't that far off from a function with no side effects.
2. http://benchmarksgame.alioth.debian.org could be a bit more representative of broad-based algorithmic performance.
3. There are lots of Python libraries for application features other than handpicked algorithms. I would be interested to see benchmarks of the marshaling code in IJulia (IPython with Julia).
(Or is Julia making a copy and I just don't see it?)
So that doesn't really give Python a leg up over Julia.
My standard approach is to use Python to read in all my data, parse it, format it, and beat it into the shape I want. Then, if I need the performance, I pass the data to a highly tuned C function for the slowest number-crunching parts. Then I take the data back from C and parse it, format it, summarize it, and write it out to the file formats I want using Python. C may be faster for number crunching, but Python is much nicer for data handling, and there the performance difference is negligible.
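A minimal sketch of that workflow, assuming a shared library libcrunch.so exposing double crunch(double *data, size_t n); both the library and the function are hypothetical stand-ins for the "highly tuned C function":

    import ctypes
    import numpy as np

    lib = ctypes.CDLL("./libcrunch.so")  # hypothetical tuned C library
    lib.crunch.argtypes = [ctypes.POINTER(ctypes.c_double), ctypes.c_size_t]
    lib.crunch.restype = ctypes.c_double

    data = np.loadtxt("input.txt")  # Python does the parsing/formatting
    ptr = data.ctypes.data_as(ctypes.POINTER(ctypes.c_double))
    result = lib.crunch(ptr, data.size)  # C does the number crunching
    print("result =", result)  # and Python writes the output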
There's work on interfacing with the C++ ABI, though. See Cpp.jl and Clang.jl (whose C++ support is still a WIP).
I used to recommend Mathematica to anyone wishing to "play around with math" and "get a feel for it", due to the user-friendly interface and the 'Manipulate' function, which lets you pop up a few sliders for realtime manipulation of parameters in functions, with instant visual feedback, in one line of code. But I always thought of it as either a "toy" or a "learning tool", compared to Matlab, which is used by "serious engineers". (Mind you, I like neither; I'm a Python and C person, but I'm also closer to what some would consider a "software engineer" and definitely not a mathematician or scientist.)
The second is the language itself. Matlab is a procedural language heavily inspired by Fortran. Mathematica, on the other hand, is a largely functional language that takes several design cues from Lisp. Historically, most engineers came from a Fortran rather than a Lisp background, and most engineering programming is still taught in a "Fortran-inspired" way, so the Matlab language is simply more comfortable for most engineers to think and work in.
The code is mostly wrapped in Compile statements and is very procedural in style.
As a quick test, I compared their quicksort with the built-in Sort, obtaining results about 30 times faster than theirs.
Looking forward to Julia's JIT improvements. This is only the early days.
The value proposition of Julia (and Python with Cython) is that you don't have to choose between the productivity of a dynamic language (IPython/IJulia are incredibly productive environments for ML/data science, IMO) and the ability to easily optimize hotspots in your code as you come across them - whether that be adding type annotations or disabling bounds checking in Julia, dropping into Cython, etc.
On the other hand, this code is written in such a low-level style (nested loops mutating arrays, all functions top-level with no environment-dependence) that it's hard to see any other advantage in code like this over just using C++. In fact, C++ can leverage closures now with little or no performance loss, unlike either of these languages, so higher-order-function-style programming becomes feasible even when high performance is a goal. Also, I think there's some benefit to static typing for catching errors early (especially as the program gets larger) and for documenting return types, which I always appreciate as a code reader (!).
If it's sufficiently easy to link to C++ (or Rust), you could still implement parts in C++ (or Rust) and use the REPL for playing around, that might make a good combination...
Many languages that have native-code compilers (AOT or JIT) can be used. It is just that many seem to think C and C++ are the only game in town.
Personally I look forward to improved code generation in alternative language compilers, so that we go back to the time where C and C++ were just two among many languages with available compilers one could choose from.