> In most programming languages, linear algebra for these kinds of standard operations is handled by underlying libraries: BLAS and LAPACK. Most open source projects use an implementation called OpenBLAS, a C implementation of BLAS/LAPACK that achieves much higher performance than "simple" code by using CPU-specialized kernels tuned to the sizes of the CPU's caches. Open source projects like R and SciPy also ship with OpenBLAS because of its generally good performance and open licensing, though OpenBLAS is handily outperformed by Intel MKL, a vendor-optimized BLAS/LAPACK implementation for Intel CPUs (which works on AMD CPUs as well).
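
As a concrete aside (not in the quote): Julia 1.7+ routes BLAS/LAPACK calls through libblastrampoline, so you can ask a running session which implementation it is actually using:

    using LinearAlgebra

    BLAS.get_config()       # e.g. reports OpenBLAS on a stock Julia install
    BLAS.get_num_threads()  # thread count is another knob worth checking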

Much of this is more complicated than the quote suggests. Most open source software doesn't ship with any assumptions about a particular BLAS/LAPACK implementation at all, and on HPC systems you are generally expected to choose one as appropriate and compile your code against it. It is generally only when you download a precompiled version that you're handed a particular implementation, and even then nothing stops you from using another one if you compile from source, since BLAS and LAPACK just present a standard API. Generally, for performance reasons, you want to compile specifically for your platform; precompiled wheels from Conda, PyPI, etc. leave performance on the table.
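
Because the API is standard, a backend can even be swapped without recompiling anything. A minimal Julia sketch, assuming the MKL.jl package is installed:

    using LinearAlgebra
    BLAS.get_config()   # stock Julia: OpenBLAS

    using MKL           # MKL.jl swaps the loaded backend to Intel MKL
    BLAS.get_config()   # now reports MKL

    A = rand(2000, 2000); B = rand(2000, 2000)
    C = A * B           # this matrix multiply now runs through MKL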

On forward-thinking cluster teams these days, sysadmins use tools like Spack and EasyBuild, and software is made available to users either directly or by request, so it's usual to log into a cluster, find multiple implementations available, and pick one to compile your code against. More often than not, however, it's still on you to compile your dependencies against the implementation you need. It's a worthwhile exercise in HPC to try different implementations and check the performance characteristics of your code on the particular machine.
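
A minimal sketch of that kind of check, assuming BenchmarkTools.jl; run it once per implementation (e.g. before and after `using MKL` above) and compare:

    using LinearAlgebra, BenchmarkTools

    n = 4096
    A, B, C = rand(n, n), rand(n, n), zeros(n, n)

    # In-place dgemm through whichever BLAS backend is currently loaded.
    @btime mul!($C, $A, $B)

    # Thread count matters too; sweep it as part of the comparison.
    BLAS.set_num_threads(8)
    @btime mul!($C, $A, $B)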

If you look at the LinearSolve.jl defaulting system: if non-standard BLASes are installed on the system, they take priority in the default, so this all still works just fine (and would reduce compilation). The extra handling is mostly to make sure the default desktop setup works sufficiently well, since that's the baseline for most people.
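
For reference, solve(prob) is what goes through that defaulting logic; passing an algorithm explicitly bypasses it:

    using LinearSolve

    A = rand(100, 100); b = rand(100)
    prob = LinearProblem(A, b)

    sol = solve(prob)                     # heuristics pick a method from
                                          # size, type, and loaded BLAS
    sol = solve(prob, LUFactorization())  # or name a factorization directly
    sol.u                                 # the solution vector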


This is true, but for desktop and laptop compute, I'd estimate something like 95% of users won't change the defaults. I'm also not sure how good our solvers are on clusters, but on desktops the Julia versions are often faster than MKL, which is the fastest BLAS outside Julia that I know of.
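
A quick way to get a feel for this on your own machine: time the LAPACK-backed LU against the pure-Julia recursive LU (depending on the LinearSolve.jl version, RecursiveFactorization.jl may need to be loaded for RFLUFactorization):

    using LinearSolve, BenchmarkTools

    A = rand(200, 200); b = rand(200)
    prob = LinearProblem(A, b)

    @btime solve($prob, LUFactorization())    # BLAS/LAPACK-backed LU
    @btime solve($prob, RFLUFactorization())  # pure-Julia recursive LU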
