    s = 0.
    n = s.shape
    s += 100. * x[i + 1] - x[i] ** 2.) ** 2. + (1 - x[i]) ** 2
One should read
    s = 0.
    n = x.shape[0]
    for i in range(0, n - 1):
        s += 100. * (x[i + 1] - x[i] ** 2.) ** 2. + (1 - x[i]) ** 2
> a claimless python to c++ converter
The idea of the change was that it's more important to convey the problem it solves rather than how it's done ;-)
Simple and Effective Type Check Removal through Lazy Basic Block Versioning could be adapted to make it unnecessary to compile more than one version of the function, at a (probably) minor cost in performance. It's geared more towards jitted code, but something like it, where it compiles different versions of the basic blocks where the types matter and falls back to interpreted code if users pass in some unexpected types, just might work -- or the Python interpreter throws an exception if the types don't make sense.
Numba can handle vectorized code (numpy-style operations) directly, in addition to explicit loops. The former is accelerated less in comparison to plain Python calling numpy (since if you can use numpy operations directly, it's already really fast), but the numpy bits in Numba can also be automatically parallelized by Numba. Explicit loops in Numba are accelerated hundreds of times over Python loops (and you can use e.g. prange to write parallel loops, too). Point is, the two paradigms can be mixed and matched at will within Numba.
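To illustrate, here is a minimal sketch (the function name and data are made up, not from the article) of a single jitted function mixing a vectorized numpy expression with an explicit prange loop:

    import numpy as np
    from numba import njit, prange

    @njit(parallel=True)
    def scaled_row_sums(a):
        # Vectorized numpy-style expression inside the jitted function;
        # with parallel=True Numba can parallelize it automatically.
        b = a * 2.0 + 1.0
        # Explicit loop, parallelized across CPU cores via prange.
        out = np.empty(a.shape[0])
        for i in prange(a.shape[0]):
            s = 0.0
            for j in range(a.shape[1]):
                s += b[i, j]
            out[i] = s
        return out

    scaled_row_sums(np.random.rand(2000, 2000))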
It seems like every example is Cython, but then the author generalizes the conclusions to Numba as well. It would be much more "honest" to show side-by-side comparisons of Numba, Cython, and Pythran, since these all have different syntaxes and are fairly different tools.
Another example is that you don't have to rewrite functions for different argument types in Numba, but you do in Cython (see "convolve_laplacian" example, which can work with a simple decorator as a numba function). There again, the impression is given that Numba suffers from the same issue as Cython (and as mentioned elsewhere in the comments here, it's possible that Cython has a way around this, but I don't know the details).
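For what it's worth, a minimal sketch of the idea (the trivial stencil below is a stand-in, not the article's convolve_laplacian): one @njit-decorated function is specialized per argument type on first call, so the same source serves float32 and float64 arrays.

    import numpy as np
    from numba import njit

    @njit
    def laplacian_1d(x):
        # Trivial 1-D stencil; the point is the decorator, not the math.
        out = np.zeros_like(x)
        for i in range(1, x.shape[0] - 1):
            out[i] = x[i - 1] - 2.0 * x[i] + x[i + 1]
        return out

    # Numba compiles a specialization for each argument type it sees,
    # so no source changes are needed for different dtypes.
    laplacian_1d(np.linspace(0., 1., 100, dtype=np.float32))
    laplacian_1d(np.linspace(0., 1., 100, dtype=np.float64))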
One issue with mixed-language programming that we learned about decades ago is that debugging across the language interface is typically hard, as is general tool support. I recently asked the local Python expert about HPC-style profiling of Python calling C(++) libraries, for instance. (I couldn't make TAU work.)
For (2), type agnosticism: this can be emulated with a "fused type" in Cython. See this example: http://cython.readthedocs.io/en/latest/src/userguide/numpy_t...
But I think the major inconvenience with Cython vectorization is really not about `float32` and `float64`. You get `float64` NumPy arrays by default from floating-point calculations. The actual inconvenience is that the vectorized function cannot take a scalar input like the NumPy ones can. To remain polymorphic, I usually have to perform an `is_scalar` check on the input in a Python wrapper before sending the input data to the Cython function.
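A minimal sketch of that wrapper pattern (`_smooth_cython` is a placeholder name for the compiled, array-only routine):

    import numpy as np

    def smooth(x):
        # Pure-Python wrapper that keeps the user-facing API polymorphic.
        if np.isscalar(x):
            # Promote the scalar to a 1-element array, then unwrap the result.
            return _smooth_cython(np.asarray([x], dtype=np.float64))[0]
        return _smooth_cython(np.asarray(x, dtype=np.float64))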
Oh, and if you set target='gpu', your function works on GPUs, too. Or you can use target='parallel' to make it parallelize automatically across CPU cores.
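Presumably this refers to Numba's @vectorize decorator; a minimal sketch with the 'parallel' target (the function name is made up, and the exact GPU target name depends on the Numba version):

    import numpy as np
    from numba import vectorize

    @vectorize(['float64(float64, float64)'], target='parallel')
    def damped_sum(a, b):
        # Scalar kernel; Numba broadcasts it element-wise across CPU cores.
        return a + 0.5 * b

    x = np.random.rand(10_000_000)
    y = np.random.rand(10_000_000)
    damped_sum(x, y)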
edit: also, I would be very surprised if you had seen any Julia job ads, since v1.0 hasn't been released yet. That doesn't mean it can't make my scientific life easier in the meanwhile.
All of these things play to Python's main strengths: a huge community with connectors to every API and format, plus the ability to conveniently integrate code at several levels of complexity & maturity as you prototype.
To do matrix multiplication on many matrices "stacked" together in one step, use numpy.matmul: https://docs.scipy.org/doc/numpy/reference/generated/numpy.m... (and so there's no need to slice up the array, convert to matrix, etc.)
Note that the Numpy devs are trying to (if they haven't already) get rid of the "matrix" class and just use arrays, but of course dealing with legacy code is always an issue. Once that's out of the way, people won't be distracted by "matrices" to do matrix operations, and hopefully they'll see you can do matrix operations on arrays directly. (And yes, in Python 3 you can use the @ syntax to the same effect.)
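A quick sketch of both points, using plain arrays only (no matrix class):

    import numpy as np

    # A "stack" of 10 independent 3x3 matrices held in one array.
    a = np.random.rand(10, 3, 3)
    b = np.random.rand(10, 3, 3)

    # matmul broadcasts over the leading stack dimension:
    c = np.matmul(a, b)      # shape (10, 3, 3), one product per pair

    # Python 3.5+ infix form, same operation on ordinary arrays:
    assert np.allclose(c, a @ b)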
I think the Python language is very elegant, and the principle that there should be only one obvious way to do something has served it quite well. Unfortunately, in numpy you quite often have multiple ways to do things (often in addition to Python's own mechanisms, e.g. sum, numpy.sum and the .sum() method). I deal with students who have little programming experience, and this can be confusing. One of the reasons I chose Julia for my lectures was that these issues do not exist in Julia. Julia is quite clean and simple in this regard.
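By way of illustration, the three spellings a student meets for the same reduction:

    import numpy as np

    a = np.arange(5.0)
    sum(a)       # Python built-in, iterates element by element (slow on big arrays)
    np.sum(a)    # NumPy free function
    a.sum()      # NumPy array method -- all three give 10.0 here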
However I completely see that for an experienced programmer (or a scientist with good programming experience), this is not a problem.
But for somebody learning to solve numerical problems, it is quite helpful that Julia code tends to be closer to the mathematical formulation. In addition, for the tests I made, Julia code tends to be faster than vectorized numpy code (I can share the code if there is interest). The only major argument against Julia, in my opinion, is that it is still a young language with a small ecosystem (much smaller, in fact, than Python's or R's).
Anyhow, that experience surely doesn't map onto Julia, a completely different language. So I'd be curious to see what your use case is; it might give me a different perspective on Julia (which I have only played with a couple of times back when it was even younger).
Do not hesitate to tell me if I missed something to optimize the Python code. If somebody has Numba, Pythran, ... installed, I would be interested to see the speed-up compared to the vanilla Python version on your machine.
So in short, for my cases: the fastest Julia test case (with loops and avoiding unnecessary allocation) was about 10x faster than fastest python 3 test case (with vectorization).
The runtimes with vectorization are relatively similar (Julia is only about 25% faster than Python). Explicit loops and careful memory management are clearly beneficial in Julia.
Moral of the story is, these Python tools are built for microbenchmarks and can do okay there, but without the full stack optimized together and without a type system that's exploitable for all of the performance tricks, they fall apart in real-world code.
I agree that Python has a library advantage in data science + ML. R has a library advantage in the area of statistics. But Julia has quite a few advantages in the core math areas of scientific computing and algorithm development. There is headway being made into DS+ML as well. Julia's pandas/dataframe equivalent is JuliaDB which adds out-of-core and online stats functionality, so it's more at the level of pandas+dask. Flux.jl is still in its early stages but it's quite a unique ML framework which can directly incorporate any Julia function at any level, and then has some working experiments with compiling to things like JS and XLA.
But in the broad view of things, every language covers the SciPy+NumPy functionality pretty satisfactorily (ex: Julia's Base library has most of it, and the top 20 packages cover the rest), but from there they all have tradeoffs in what areas the community is specializing in.
And yet, for my particular area (audio signal processing), Julia is just objectively worse than Python in expressivity, library support, and even speed.
But I'll keep trying. Maybe next year.
(But it's true that idiomatic Fortran starts indexing from 1.)
Any stochastic process will have variability for a given set of inputs. I think there are more fields within the umbrella of Science where purely functional programming isn't an achievable ideal--let alone a desirable one--than there are where this is a good fit.
You can make the argument that all programming should be functional on its own if you want, and I'm open to that. But it's pretty sketchy to me to just shoehorn all scientific work into a category of work that should definitionally be functional.
There are huge numbers of scientists who do not agree with that at all.
If you're modelling stochastic processes, it's important to be able to set the random seed so you can reproduce your simulations. So for a given set of inputs you should get the same output, given that one of the inputs is your seed.
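A minimal numpy sketch of that point (the seed value is arbitrary):

    import numpy as np

    # Same seed in, same draws out: the simulation becomes reproducible.
    rng = np.random.default_rng(12345)
    sample_a = rng.normal(size=5)

    rng = np.random.default_rng(12345)
    sample_b = rng.normal(size=5)

    assert np.array_equal(sample_a, sample_b)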
Shedskin and Pythran look somewhat similar to me (disclaimer: I've contributed quite a bit to Shedskin but have never used Pythran so far), except that in Shedskin you don't even need annotations like you do with Pythran (the downside being that the only fine control you have over the native types used is through the transpiler options). Also, Shedskin development is not very active these days -- to say the least -- and there's zero support for Python 3, while Pythran is under fairly active development and has beta support for Python 3.
If you're interested in Python / native implementations, you might be interested by Nuitka as well: http://nuitka.net/
However, I think this might be a bit misleading for people who do not know Cython:
> the Cython language, which is a Python-ish programming language, but not Python.
Actually, http://cython.org/ states:
> The Cython language is a superset of the Python language that additionally supports calling C functions and declaring C types on variables and class attributes.
In my experience with Cython, this description is quite accurate: code can be annotated with C types and then compiled to efficient C code by Cython; if you don't use annotations, then you can still compile to C code but with less speed advantage.
I haven't used Cython since version 0.17 (quite old now), but IIRC the major drawback was that it mainly targeted writing extension modules for Python; it could generate standalone executables, but they would still require a Python interpreter to be embedded (that was the price for seamless interoperability between Cython/compiled code and "pure Python" code).
It's a little more Python-like than just -ish.