Python in the Scientific World (neopythonic.blogspot.com)
69 points by fogus on Nov 5, 2009 | 29 comments


I would be incredibly happy if we were taught in Python or anything other than the hideous monster that is Matlab on my MSc course. After using Ruby and Python it's rather painful to use such a clunky environment.


One difficulty with Python for teaching is that it is rather complex to get students set up and running with python+ipython+numpy+scipy+a compiler+... There are so many combinations, every student has a different version of a different operating system with different other crap installed. Matlab is matlab is matlab. Sage seems to come reasonably close as an alternative, though in practice there are often still issues with setup. (Maybe minor enough that the other advantages of python outweigh...)

The other advantage of matlab is that it is more friendly to non-programmers. There are a lot of people that get lots of real things done in matlab due to all the semi-idiotic design features that annoy serious programmers. (Monolithic namespace, "everything is a matrix", one function per file, etc.) The huge advantage of "copy on write" is that you don't even have to really understand the concept of "different copies of the same data". That's a big hurdle coming to python-- a hurdle you might not want to get over if you don't care about programming, and just want to solve your problem.
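To make the "different copies of the same data" hurdle concrete, here is a minimal Python sketch using plain lists (the variable names are only illustrative). Assignment binds a second name to the same object, so a change made through one name shows up through the other unless you copy explicitly:

```python
import copy

a = [1, 2, 3]
b = a              # no copy: b is another name for the same list
b[0] = 99          # the change is visible through a as well

c = copy.copy(a)   # explicit (shallow) copy: c is independent of a
c[1] = -1          # a is unaffected

print(a)  # [99, 2, 3]
print(b)  # [99, 2, 3]
print(c)  # [99, -1, 3]
```

In Matlab the assignment `b = a` behaves as if it copied the data, which is exactly the concept a non-programmer never has to meet.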


Yes, if you know what program you want to write, then Python (+ NumPy, Matplotlib, et al.) beats Matlab. But if you are a) exploring a problem space speculatively and b) skilled in your knowledge domain but not aiming to produce a piece of software as your final goal, then Matlab wins hands down. It doesn't try to be a general-purpose programming language; it has been honed over years to do exactly what control engineers want to do day-to-day.

It's like Python is a better programming environment than Excel too, but spreadsheets aren't going away anytime soon (it remains to be seen whether Python + OOo wins over the long term).

Mathworks could do a lot worse than bolt on an optional Python front-end. What people really pay for in Matlab is the toolboxes.


As a controls engineer, I agree. I've been trying to learn/switch to python after years of Matlab, but the installation effort, programming environment and lack of controls tools compared to Matlab have hindered me.

Actually if anyone has any suggestions to address those issues, I'd appreciate it.


Installing sage (http://www.sagemath.org/) addresses the installation effort pretty well. You get a giant binary of everything bundled together and working. It addresses the programming environment to some degree (it gives you a "notebook" similar to Mathematica). I used to be a hardcore matlab guy, now I find some tasks are easier in python.


Have you tried R?


I have toyed with R, and I think it's great. I love the data-focus of the language, and the capabilities of ggplot (for which there does not seem to be an equal in Python). But to be honest, as a controls guy and not a software guy, learning new languages is not a trivial task for me. Python seems to meet both my computational and application aspirations.


I am curious as to why you would hate matlab so much other than the clunkiness. EDIT: And of course cost/openness

Matlab works very well for many disciplines, and the sheer amount of documentation makes it very easy to use for people who aren't really programmers. My main peeve with matlab is that to write a function it is required to make a separate file; however, most problems that are solved with matlab don't really need functions in the first place.

SciPy/NumPy and matplotlib can be nice to use but if I need to get an engineering problem done quickly I can find functions that I need much quicker in matlab and the matrix as the central object is very helpful.


Yes, the closed system is a big thing. Anything interesting I may create will have to be re-written in another language if I want to extend it or share it with others. While it's not reasonable to use something like C to test out (computer vision in this case) algorithms in practical classes, Python is both simple and open.

The problem of having many possible combinations as mentioned by lliiffee doesn't seem like much of a big deal. Surely a faculty could just specify their recommended setup?


Sure, you can specify it. Specifying is no problem!


  > My main peeve with matlab is that to write a function it
  > is required to make a separate file however most problems
  > that are solved with matlab don't really need functions 
  > in the first place. 
That's just where the madness begins! My current pet peeve is that you can't index the result of a function directly:

  x = 10:-1:1;
  
  a = sort(x)(1:5);   % illegal!  
  
  tmp = sort(x);      % depressing :(
  a = tmp(1:5);       

  f = @(x,i)x(i);     % a little better
  a = f(sort(x),1:5); 
Every release gets a bit better, but there are still a lot of strange historical limitations.
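For comparison, Python lets you index the return value of a function directly. A rough equivalent of the snippet above, using only built-ins (so a list rather than a Matlab vector):

```python
x = list(range(10, 0, -1))   # 10, 9, ..., 1, like 10:-1:1
a = sorted(x)[:5]            # index the function result directly
print(a)                     # [1, 2, 3, 4, 5]
```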


You can define a second function in a file, with the constraint that it can only be used by the main function (I think). The following could happily go in one file named foo.m:

  function rtn = foo(x)
  rtn = bar(x) + 1

  function rtn = bar(x)
  rtn = x^2
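By way of contrast, a Python module can hold any number of functions, and all of them can be imported from outside the file. A sketch of the same pair in a hypothetical foo.py (the file name is only for illustration):

```python
# foo.py (hypothetical file name)

def bar(x):
    """Helper: square the input."""
    return x ** 2

def foo(x):
    """Main function: square the input and add one."""
    return bar(x) + 1

print(foo(3))  # 10
```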


Yea!

My comment exactly. Having worked with Matlab professionally, I find it astounding that it survives when there's any plausible alternative. Well, not really: Matlab is pathological as a development environment but somehow does a good job of facilitating the back-of-an-envelope magic of the physicists.

It's the crystal meth of numeric programming - a quick rush followed by a giant crash.


It's really amazing to see how something is finally starting to supplant ancient Fortran code in sci-comp.

I think one of the major reasons for Python being that choice is that its syntax and "Python way of doing things" are really geared at making it difficult to obfuscate the code into meaninglessness. Clarity is always important in research.


Sorta kinda, but not really. Python isn't supplanting Fortran; it's supplanting Matlab/IDL for prototyping and visualization, and half-baked command languages for interfacing with packages of data reduction routines. Which is still nice, especially for things like (to take astronomy) AIPS/IRAF, specialized packages of routines that are controlled by the worst command language possible.


Also note that the code at the core of packages like numpy is often still written in Fortran or C. For example, I'm pretty sure all the linear algebra processing in numpy is handled by the popular LAPACK written in Fortran. Note that software like Matlab and IDL also generally uses these well-understood packages.

Fortran isn't going anywhere, we're just hiding it in the background.


Everyone uses LAPACK. If you came along with your own versions of those routines, no-one would trust them. LAPACK et al are the most optimized, most debugged code on the planet. Hardware vendors design around LAPACK. It's incredibly important.

Also check out Sage: http://www.sagemath.org/


Yep. The Intel F9x compiler smokes everything - at least in part because its versions of LAPACK/BLAS/ATLAS are so aggressively optimized.


Also, f2py (distributed with numpy) makes it easy to wrap fortran subroutines in python.


I mostly see matlab used for algorithm development, which is like super short snippets. O(n^2) software-engineering complexity creep isn't that big a deal when n is a page or two of code.


My guess is that scientific computation is a lot of fun. Plenty of clean mathematics. Probably not too much dingy boiler room stuff. I can see how one would be attracted to using languages like Python (and maybe Scheme too) for such tasks.


There is lots and lots of "dingy boiler room stuff" in scientific computing. What the math is doing is clean. Doing that as efficiently as possible in a computer is generally not.


And there is all the data handling/mangling that needs to be done.


Oh man. Nonono.

It's huge amounts of glue and connect-this-program-to-that-program-with-a-pile of regexes. Most of the number-crunching's in hand-optimized C or F90...
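As a rough illustration of that kind of glue, here is a sketch that pulls numbers out of another program's text output with a regex. The output format and the names `SAMPLE_OUTPUT`, `ENERGY_RE` and `parse_energies` are made up for the example and do not correspond to any real code's output:

```python
import re

# Hypothetical text output from some external simulation code.
SAMPLE_OUTPUT = """\
step 1  total energy = -1.2345E+02 eV
step 2  total energy = -1.2351E+02 eV
converged after 2 steps
"""

# Capture the step number and a float written in E notation.
ENERGY_RE = re.compile(
    r"step\s+(\d+)\s+total energy\s*=\s*([-+]?\d+\.\d+E[-+]?\d+)"
)

def parse_energies(text):
    """Return a list of (step, energy) tuples found in text."""
    return [(int(step), float(energy))
            for step, energy in ENERGY_RE.findall(text)]

print(parse_energies(SAMPLE_OUTPUT))  # [(1, -123.45), (2, -123.51)]
```

Real glue code tends to grow one of these patterns per program and per output version, which is where the pile comes from.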

During my PhD, I designed/wrote a transition-state prediction algorithm in Python, hooking it up to atomistics codes written in Fortran. One of my colleagues, now one of my cofounders at Timetric, wrote a standards-compliant XML library in ANSI F95 - http://uszla.me.uk/FoX/ - which, believe it or not, is one of the rare cases where XML made life a lot better!


On a computer, you have to also be concerned about error propagation and efficiency, and that can take what might be a very beautiful, simple solution on paper and explode it into a big ugly mess in code.

That doesn't mean it isn't fun, but it's a lot less clean than you might think.
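A small example of the error-propagation point, using only the Python standard library: adding 0.1 ten times in a plain loop does not give exactly 1.0, while `math.fsum`, which keeps track of the rounding error lost at each step, does.

```python
import math

values = [0.1] * 10

# Plain left-to-right floating-point accumulation.
naive = 0.0
for v in values:
    naive += v

# Summation that tracks the rounding error lost at each step.
careful = math.fsum(values)

print(naive == 1.0)    # False
print(careful == 1.0)  # True
```

The on-paper answer is a one-liner; getting the same answer out of the machine already takes a second algorithm.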


In NLP there is a large amount of boilerplate and preprocessing code. The barrier to entry is actually quite high. New machine learning model for machine translation? "Sorry I'm not convinced your model would work with a sophisticated multitext grammar using available translation lexicons etc etc".

Building a convincing baseline is hard, which means that it is difficult to show your approach works in general.


The Python NLTK helps a fair bit with some of that, especially at an introductory level.

Of course, I hand-wrote a lot of my NLP algorithms in Perl, back in the day. Now that's a good use of a time machine...


For beginners there is also an on-line book:

http://www.nltk.org/book


As well as a printed version, published by O'Reilly, for those of you who prefer dead trees to pixels (like me).



