NumPy Tricks and Pitfalls (jupyter.org)
262 points by vladf on Dec 23, 2017 | 26 comments

I uploaded to Azure Notebooks in case anyone wants to run it w/o setting up an environment:


Click Clone, Sign in, then Run

Similar in CoCalc


Click "Open in CoCalc", then select all, copy.

+1 for CoCalc! It's great to see the number of notebook hosting services flourish.

Anyone know of, or can offer, a comparison of the existing notebook hosting services?

I'm an engineering student who does some programming. What are some good resources for gaining the background I'd need to understand more of this notebook? I have used tools like NumPy, but probably not very efficiently, since I don't understand their strengths and weaknesses. What books or online courses should I look into to know about memory, flops, and things like that?

I should probably ask this somewhere else, but I'm not sure what the appropriate forum is. If you could point me to where I should ask, that would be great. Thank you!

Here's a great mini-book that explains how NumPy can improve efficiency, at an appropriate level of detail for what you're asking: http://www.labri.fr/perso/nrougier/from-python-to-numpy/
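To give a flavor of what that kind of efficiency looks like, here is a minimal sketch (my own toy example, not taken from the book or the notebook) comparing a plain Python loop with the equivalent single NumPy call. The function names are made up for illustration.

```python
import numpy as np

def sum_of_squares_loop(values):
    """Pure-Python loop: the interpreter handles one element at a time."""
    total = 0.0
    for v in values:
        total += v * v
    return total

def sum_of_squares_vectorized(values):
    """Same computation expressed as one NumPy call over the whole array."""
    arr = np.asarray(values, dtype=np.float64)
    return float(np.dot(arr, arr))

data = np.arange(1000, dtype=np.float64)
loop_result = sum_of_squares_loop(data)
vec_result = sum_of_squares_vectorized(data)
```

Both give the same number; the vectorized version moves the loop into compiled code, which is usually where the speedup comes from. Timing it yourself (e.g. with `timeit`) on larger arrays is the best way to see the difference on your machine.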

> What books or online courses should I look into to know about memory, flops, and things like that?

What you are after are books/courses on "architecture", I think. A classic book on the subject is Hennessy & Patterson.


You shouldn’t be afraid to fail, man! Instead of doing something in Matlab, try it in numpy and just do some research to figure out what you want to do.

Eventually you’ll pick up tips from all your online resources, and you’ll realize you could have done some things better the whole time.

Yes! There are many ways to learn. One way is essentially the brute-force method: Google every single thing until you get it. That's how I learned NumPy when doing some basic image analysis.

I use a depth-first tree traversal to learn a subject. Start at something broad like "physics" and work your way down each sub-topic.

Yeah, but you better define some stopping criteria or else you'll end up with a PhD if you're not careful.

Ah, that's a shame, it looks like that notebook is no longer available. Thanks for pointing it out. I'll remove the reference.

What's the most fun thing you guys have made with NumPy?

Considering the top numeric Python libraries have dependencies on NumPy, including data manipulation (pandas), machine learning (scikit-learn) and deep learning (TensorFlow), you may want to narrow down the question.

I still don't really understand why someone would choose to use NumPy when the numerical story in Rust is now so good.

Perhaps for engineers, but for the rest of the scientific community it's pretty obvious: Rust doesn't have nearly the coverage that NumPy does. On a quick check, I couldn't find a GMRES solver in Rust; GMRES is extremely useful for solving large linear systems and hardly an obscure algorithm.

FWIW this is the same situation NumPy was in a while back, but instead of boldly asserting that what they had was enough, the community looked to other prevalent languages to figure out what was missing, and packages like pandas and statsmodels came out of that.

As someone who likes Rust: Python is just easier to code in for exploration, and the kinds of advantages you get from using Rust aren't particularly relevant there.
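For anyone who hasn't used it, this is roughly what calling GMRES looks like on the Python side. It's a minimal sketch, assuming SciPy is installed; the tridiagonal test matrix is just something I made up so the solver converges quickly, not a realistic problem.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import gmres

n = 100

# Toy system (assumption for illustration): a diagonally dominant
# tridiagonal matrix stored in sparse CSR format.
A = diags([-1.0, 4.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

# gmres returns the approximate solution and a status flag;
# info == 0 means the solver reported convergence.
x, info = gmres(A, b)

# Check the answer independently by computing the residual norm.
residual = np.linalg.norm(A @ x - b)
```

The solver only needs matrix-vector products, which is why it scales to large sparse systems where a dense factorization would be out of reach.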

The entire surrounding ecosystem. Which includes not just the core numpy/scipy libraries but the stats, modeling, ML, plotting, etc. Plus the wealth of other Python stuff it can integrate with (I work at an almost entirely Python shop -- data science people work in the same language as pipeline and application people, which is really an overlooked thing). Plus the ease of installation and management from Anaconda. Plus the notebook format for easy sharing. Plus... well, lots and lots and lots of things. "We have a fast numerical library" is step one of about ten thousand to achieving parity with what the Python/numpy/scipy ecosystem does.

They don't compete, since you could reimplement your compute kernels in Rust and use them directly, or as (g)ufuncs on top of the NumPy array container, which is used by the whole science and data stack (visualization, stats, etc.). NumPy's own compute APIs are for prototyping, or for computation that isn't CPU-bound.

So for the majority of NumPy users, Rust competes with the likes of Numba, Cython, etc., which are hard to beat if you're already coding in Python.
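To illustrate the ufunc idea, here's a rough sketch of how a scalar kernel plugs into the NumPy container. A plain Python function stands in for the compiled kernel here (that substitution is my assumption; a real Rust, Numba, or Cython kernel would be registered through its own tooling), and `clipped_ratio` is a made-up name.

```python
import numpy as np

def clipped_ratio(a, b):
    """Hypothetical scalar kernel: a / b, or 0.0 where b is zero."""
    return a / b if b != 0 else 0.0

# Wrap the scalar function as a ufunc with 2 inputs and 1 output.
# np.frompyfunc gives ufunc semantics (broadcasting, elementwise
# application) but NOT compiled speed; it returns object arrays.
clipped_ratio_ufunc = np.frompyfunc(clipped_ratio, 2, 1)

x = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
y = np.array([2.0, 0.0, 4.0])  # broadcast across both rows of x

# Convert the object-dtype result back to a float array.
result = clipped_ratio_ufunc(x, y).astype(np.float64)
```

The point is the interface: once a kernel is a ufunc, callers get broadcasting and array handling for free, regardless of which language the kernel itself was written in.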


Every once in a while /r/rust will get a numerics post, and the discussion will mostly concede that there's still a ways to go to make this field ergonomic.

NumPy is great because of all the frameworks built on top. What's the story with Rust?

Numerical computation is just one part of scientific code. Linear algebra, data analysis, modelling, and visualization are equally important.

TensorFlow doesn’t have a Rust interface.

Isn't this a Rust interface for TensorFlow?


I avoid Rust due to the annoying, symbol-heavy syntax that requires too much Shift-key usage.
