
A supercomputer often won’t make your code run faster - deafcalculus
https://lemire.me/blog/2017/12/11/no-a-supercomputer-wont-make-your-code-run-faster/
======
knolan
I see this so often with academics. They’re not developers and only have basic
coding ability, so they make all kinds of basic mistakes, like loading a bunch
of images, converting them all to float (IR images, for example) and wondering
where their memory went. I took some spaghetti code that took a couple of days
to process thousands of IR images, in batches, and wrote a replacement that ran
in under a minute, simply because the original didn’t manage memory at all.
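
A minimal sketch of the memory point, not the original code (which isn't
shown here). It assumes NumPy, 16-bit frames of 480x640, and a made-up
generator standing in for reading files from disk; all names are hypothetical.
The idea is to keep one frame and one accumulator in memory instead of a float
copy of every image.

```python
import numpy as np

HEIGHT, WIDTH = 480, 640  # assumed frame size, purely illustrative


def fake_ir_frames(count, seed=0):
    """Yield synthetic 16-bit frames one at a time (stand-in for reading files)."""
    rng = np.random.default_rng(seed)
    for _ in range(count):
        yield rng.integers(0, 2**14, size=(HEIGHT, WIDTH), dtype=np.uint16)


def mean_frame_streaming(frames):
    """Average frames while holding only one frame plus one accumulator."""
    total = None
    count = 0
    for frame in frames:
        if total is None:
            total = np.zeros(frame.shape, dtype=np.float64)
        total += frame  # upcast happens during the add; no float copy is retained
        count += 1
    if count == 0:
        raise ValueError("no frames given")
    return total / count


# Memory per frame: float64 takes 4x the space of the raw uint16 data.
raw_bytes = HEIGHT * WIDTH * np.dtype(np.uint16).itemsize
float_bytes = HEIGHT * WIDTH * np.dtype(np.float64).itemsize

mean_image = mean_frame_streaming(fake_ir_frames(20))
```

Converting thousands of such frames to float64 up front multiplies the raw
data size by four, whereas the streaming version's footprint stays constant
regardless of how many frames there are.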

Even something as simple as reading CSV files efficiently can have a massive
effect.

Lots of universities switched from FORTRAN to Matlab in the last decade, but
many of the researchers who learned FORTRAN try to write Matlab like FORTRAN,
with messy nested for loops and no knowledge of vectorisation.
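
To illustrate the loops-versus-vectorisation contrast, here is a small sketch
in Python with NumPy rather than Matlab (the principle is the same). The
function names and the gain/offset operation are made up for the example.

```python
import numpy as np


def scale_loops(image, gain, offset):
    """FORTRAN-style: visit every element with nested for loops."""
    rows, cols = image.shape
    out = np.empty((rows, cols), dtype=np.float64)
    for i in range(rows):
        for j in range(cols):
            out[i, j] = gain * image[i, j] + offset
    return out


def scale_vectorised(image, gain, offset):
    """Vectorised: one whole-array expression; the loop runs inside NumPy."""
    return gain * image.astype(np.float64) + offset


image = np.arange(12, dtype=np.uint16).reshape(3, 4)
slow = scale_loops(image, 0.5, 1.0)
fast = scale_vectorised(image, 0.5, 1.0)
```

Both functions return the same array. The vectorised one pushes the
per-element work into compiled code, which is usually where the speed-up comes
from on large arrays.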

~~~
qubex
I'm one of those people: I'm not actually a scientist, but I studied
mathematics and build my own agent-based macroeconomic models (amongst other
things). I am also a largely self-taught programmer, meaning my code is an
ungodly spaghettified mess that can be used to scare small children.

I am very much aware that my ‘programs’ are of the “big untidy spaghetti
script” variety, and I simply consider them a kind of work-in-progress,
notepad, or prototype for later professional implementation. I also have a
couple of professional programmers ( _cough_ ahem, coders, they desire to be
referred to as _coders_ ) that are entirely adept at taking my horrid final
‘thing’ and converting it into production-quality code that doesn't take down
the enterprise (with the same tools I use, incidentally: _Mathematica_ ,
Python (>3.3) including SymPy and NumPy, and ABAP/SQL/Java for interfacing
with the SAP ERP system).

The important part of this is that I am never under any illusion that
something I toss together and make capable of ‘running’ is definitive code
that can be put into production or used as-is, and that under no
circumstances must the coders have any bright ideas about fudging the
underlying mathematics.

~~~
knolan
> The important part of this is that I am never under any illusion that
> something I toss together and make capable of ‘running’ is definitive code
> that can be put into production or used as-is, and that under no
> circumstances must the coders have any bright ideas about fudging the
> underlying mathematics.

But there should be some cross-germination between you and your coders? You
must have learned ways to make your spaghetti code less cumbersome for them,
and they have probably learned something about the underlying economics, so
that they know what you’re doing and how to keep everything suited to your
needs. Similarly, you’re not reinventing the wheel each time, so you must be
using tools developed by them more and more as time goes on?

~~~
qubex
Most definitely. We communicate and collaborate and to a certain degree cross-
pollinate (mainly I learn, there’s not so much extra macroeconomic depth to
add)... but what I was emphasising was the clarity about the separation of
rôles.

~~~
knolan
That's great. It sounds like a practical and productive setup.

------
rootbear
It isn't just badly done math loops that can cripple performance. Years ago,
some users were complaining that it was taking forever to load their data into
their analysis program. It turned out they were reading thousands of structs,
_one element at a time_ with the Unix read(2) _system call_! I taught them
about buffering and the read time went down by a factor of ten or more; I
forget the exact numbers.
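
A rough sketch of the difference, in Python rather than the original C and
with a hypothetical record layout (a 32-bit int plus a 64-bit float). For
simplicity it issues one unbuffered read per record rather than per field.
`os.read` on a raw descriptor maps to one read(2) call each time, while a
buffered file object serves the same small reads from a large in-memory
buffer.

```python
import os
import struct
import tempfile

RECORD = struct.Struct("<id")  # hypothetical layout: int32 id + float64 value


def write_records(path, count):
    """Write `count` fixed-size binary records to `path`."""
    with open(path, "wb") as f:
        for i in range(count):
            f.write(RECORD.pack(i, i * 0.5))


def read_unbuffered(path):
    """One read(2) system call per record via os.read on a raw descriptor."""
    records, calls = [], 0
    fd = os.open(path, os.O_RDONLY | getattr(os, "O_BINARY", 0))
    try:
        while True:
            chunk = os.read(fd, RECORD.size)
            calls += 1
            if len(chunk) < RECORD.size:
                break  # end of file (or a truncated trailing record)
            records.append(RECORD.unpack(chunk))
    finally:
        os.close(fd)
    return records, calls


def read_buffered(path, buffer_size=64 * 1024):
    """Same small reads, but served from a 64 KiB user-space buffer."""
    records = []
    with open(path, "rb", buffering=buffer_size) as f:
        while True:
            chunk = f.read(RECORD.size)
            if len(chunk) < RECORD.size:
                break
            records.append(RECORD.unpack(chunk))
    return records


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "records.bin")
    write_records(path, 1000)
    slow_records, syscalls = read_unbuffered(path)
    fast_records = read_buffered(path)
```

Here the unbuffered version makes one system call per record plus one to
detect end of file, while the buffered version needs only a handful of reads
to pull in the whole 12,000-byte file.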

