Python-Based Tools for the Space Science Community (lanl.gov)
76 points by neokya 1399 days ago | 28 comments

Fortunately, Python is becoming the new programming language for science and engineering, replacing FORTRAN and Matlab. Unfortunately, I do not see many opportunities for growth in other sectors.

This is only partly true, and in a limited way. Fortran is still king for simulations or anything that needs to be fast. Other possibilities are C and assembler, and that's about it. Python + numpy has become very popular very fast for exploring data and the results of simulations, and it's a great environment for this. Note that when you use numpy you're calling Fortran (and C?) routines. The Fortran language and its compilers continue to evolve, and Fortran will probably remain the language of choice for large-scale computation for some time. Other languages, such as Java and Lisps, are sometimes used for big simulations, but Python is just too slow to be used for anything but prototyping.

This depends very strongly on the nature of the simulation. Lots of simulations are handled just perfectly by modelling your system as arrays. Consider, for example, nonlinear waves. Your simulation typically consists of FFT -> k-domain operator -> IFFT -> x-domain operator (repeat for i=0...T/dt).

No reason whatsoever not to do this in python, though of course the FFT is just fftw/fftpack and the x/k domain operators are numpy ufuncs (all written in C/Fortran).
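
To make the loop described above concrete, here is a minimal sketch of a first-order split-step Fourier integrator. The choice of equation is an assumption made for illustration (the comment does not name one): the focusing nonlinear Schrödinger equation i u_t + (1/2) u_xx + |u|^2 u = 0, whose sech soliton keeps its shape. The function name `split_step_nls` and the grid parameters are hypothetical, and `numpy.fft` stands in for whatever FFT backend is actually used.

```python
import numpy as np


def split_step_nls(u0, length, dt, n_steps):
    """First-order split-step Fourier solver for i u_t + 0.5 u_xx + |u|^2 u = 0.

    u0 is the initial field sampled on a periodic grid of total size `length`.
    """
    n = u0.size
    # Angular wavenumbers matching numpy's FFT ordering.
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
    # k-domain operator: exact solution of the linear (dispersive) part over dt.
    linear_phase = np.exp(-0.5j * k**2 * dt)

    u = u0.astype(complex)
    for _ in range(n_steps):
        u_hat = np.fft.fft(u)                   # FFT
        u_hat *= linear_phase                   # k-domain operator
        u = np.fft.ifft(u_hat)                  # IFFT
        u *= np.exp(1j * np.abs(u) ** 2 * dt)   # x-domain (nonlinear) operator
    return u


# Illustrative setup: a sech soliton on a periodic box of length 40.
length = 40.0
n = 256
x = np.linspace(-length / 2, length / 2, n, endpoint=False)
u0 = 1.0 / np.cosh(x)
u = split_step_nls(u0, length, dt=1e-3, n_steps=1000)
```

Both sub-steps only multiply by unit-modulus phases, so the discrete L2 norm is conserved up to round-off, and the Python loop does nothing but steer array operations. A symmetric (Strang) splitting would be more accurate per step; the simpler ordering is kept here because it mirrors the description above.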

On the other hand, for particle simulations, you need more complicated logic/data structures to handle multipole methods, so the python array model might not work so well.

Yes, it seems that there are some types of simulations that could be structured as numpy operations steered by a little Python code. But can this kind of code be run effectively on large multi-processor machines?

Array operations tend to be highly parallelizable by nature. Numpy operations can certainly be parallelized, and even distributed. Take a look at blaze.


That looks interesting. I know that numpy calculations should be straightforwardly parallelizable; my question, out of curiosity, was whether they were in practice.

Note also that a common bottleneck for array processing is memory bandwidth. Multithreading something that is memory bound will not get you much speed.

There are tons of optimisations and new representations that can be experimented with for arrays. While NumPy is already reasonably fast, I am convinced you can get much faster by expanding it (within or outside it). String/Object arrays are nowhere near as useful as they could be, either.

numexpr (a tool used to optimize performance of numpy code) has support for parallelizing operations. See http://code.google.com/p/numexpr/wiki/MultiThreadVM

With the growing use of LLVM to compile numeric Python down to code nearly as fast as C, there should be a lot of new opportunities to replace old Fortran code, although this will probably happen over the next 5 years or so.

> Python is just too slow to be used for anything but prototyping.

Hardware time is often cheaper than engineer time, so Python may be faster/cheaper if you consider total time to value.

In the scientific computing environments that I have in mind, hardware is often fixed: you have your several-million-dollar supercomputer on site, or a fixed compute budget at a supercomputer center. Now, do you want your result in one day or 100 days? Because that's the compute time ratio we're talking about.

Not really, no, at least not without some context. Lots of people use python and numpy on very large computers. Also, running time is not the interesting metric: dev + runtime is. The tradeoffs depend on your team, the problem, etc...

While I wish Matlab just disappeared, Python is still far from replacing it in some fields (e.g. electrical engineering), in no small part due to the massive number of toolboxes available for it.

Just curious (being new to Python): why is that? Is there something particular about the syntax or how it compiles that makes it preferred for scientific cases?

It's more likely that enough people who happened to be scientists enjoyed programming in python enough to implement scientific tools in it. (IMHO as a non-scientist)

Somewhat related: I just discovered PyEphem, and have been using it to pin down a time and location from the stars that appear in XKCD #1190.



I wonder if Randall Munroe also used it to calculate the positions of those stars? He has mentioned it in some other places.

(I did not wade into that 970 page thread to see if someone had already made a similar suggestion)

If you save down one of the images and up the gamma a little, you can see a tremendous amount of detail. Far more than could practically be done by hand, and afaict, more than PyEphem has in its catalogue of "bright stars".

I suspect he used something like Stellarium and then ran the images through some filters.

This is one of the few (only?) Python packages that can read and write files in NASA's CDF format: http://cdf.gsfc.nasa.gov. Installing under OS X is fairly straightforward following the directions in the documentation.

I can't think of another discipline where the first example from the time section would use a Julian Day:

  >>> x=Ticktock([2452331.0142361112, 2452332.0142361112], 'JD')
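
For anyone unfamiliar with Julian Days, here is a rough standard-library sketch of what those two numbers mean as calendar times. It is not SpacePy's implementation: the helper `julian_day_to_utc` is hypothetical, and it assumes the JD values are counted on a UTC-like scale with no leap-second handling. The only fact relied on is that JD 2451545.0 corresponds to 2000-01-01 12:00.

```python
from datetime import datetime, timedelta, timezone

# Reference point: JD 2451545.0 is 2000-01-01 12:00 (the J2000 epoch).
JD_J2000 = 2451545.0
J2000_EPOCH = datetime(2000, 1, 1, 12, 0, 0, tzinfo=timezone.utc)


def julian_day_to_utc(jd):
    """Convert a Julian Day number to a timezone-aware datetime.

    Assumption: leap seconds and the UTC/TT distinction are ignored.
    """
    return J2000_EPOCH + timedelta(days=jd - JD_J2000)


# The two Julian Days from the Ticktock example above.
times = [julian_day_to_utc(jd) for jd in (2452331.0142361112, 2452332.0142361112)]
for t in times:
    print(t.isoformat())
```

The first value lands on 25 February 2002 at about 12:20:30, and the second is exactly one day later.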

The documentation mentions that "OSX install has become a challenge. With the Enthought transition to Canopy we cannot figure out clean install directions for 3rd party packages and therefore can no longer recommend using EPD for SpacePy." They have other directions including some using MacPorts.

Installation directions for Linux and Windows seem straightforward.

Enthought going to Canopy as some weird hybrid of a distribution, an IDE and a package manager may have something to do with it.

I've had far fewer issues with Anaconda, and have found that OS X behaves just as well as Scientific Linux or Mint for stuff like this.

This is a familiar story for this type of software. It has consistently been my experience that installation on OSX is likely to be a headache, whereas on linux it's effortless. Never tried Windows.

"OSX install has become a challenge. With the Enthought transition to Canopy we cannot figure out clean install directions for 3rd party packages and therefore can no longer recommend using EPD for SpacePy."

Um... use Anaconda? http://continuum.io/anaconda

The tool used to install different versions of Anaconda and its component packages is called [conda](http://docs.continuum.io/conda/intro.html). [pythonbrew](https://github.com/utahta/pythonbrew) in combination with [virtualenvwrapper](http://virtualenvwrapper.readthedocs.org/en/latest/) is also great.

Can anyone clarify what this means: 'License Agreement based on Python Software Foundation License'?

haha, thanks ;)
