
I think the problem the author of HLearn and SubHask faced isn't really a problem with thunking or whatever. Compiler optimizations are always great and avoiding indirection is good (it's also not super hard to do it manually for 99% of cases). But rather, I believe the problem they were hinting at is that the current Haskell numeric hierarchy is insufficient to capture the patterns we want for the wide range of numerical computing we do today. But even then, the tools we need to design the hierarchy we do want are also a bit out of reach, and implementing them will take time.

SubHask, for example, uses hmatrix and other LAPACK/BLAS bindings under the hood, which is good of course -- but almost all of the actual work in SubHask, then, is in designing a reusable hierarchy that works for all the 'weird' cases you find in this domain. Even 'non-weird' things like rank/dimension-polymorphic products require a lot of type-level infrastructure to accomplish today, and the features to support that are constantly evolving. (An example of APL-style rank-polymorphic stuff I typed up recently can give you an idea of how much work it takes to do some 'seemingly trivial' things, though the examples at the end are rather nice.[1]) SubHask also has an extremely fine-grained hierarchy, with lots of little classes to handle cases like 'weird containers' with appropriate laws -- and figuring that out is a lot of work too.
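To give a flavor of what 'fine grained' means here, a toy sketch of the kind of splitting involved (class names and laws here are mine and purely illustrative -- SubHask's real hierarchy is far larger and also generalizes containers, metrics, etc.):

    {-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}

    -- Splitting Num-like structure into small, lawful pieces.
    class Semigroup' a where
      (<+>) :: a -> a -> a            -- law: associativity

    class Semigroup' a => Monoid' a where
      zero :: a                       -- law: identity for <+>

    class Monoid' a => Group a where
      neg :: a -> a                   -- law: x <+> neg x == zero

    -- Scalar action of r on a, distributing over <+>.
    class Group a => Module r a where
      (.*) :: r -> a -> a

    instance Semigroup' Double where (<+>) = (+)
    instance Monoid'    Double where zero  = 0
    instance Group      Double where neg   = negate
    instance Module Double Double where (.*) = (*)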

Haskell's type system is getting more expressive all the time, so hopefully this will become less tiresome in the future, and iterating and honing the design will be far simpler. But getting the design right is much more important and the other stuff (better underlying numeric libraries, improved performance, etc) can come after.

I think you're spot on that the lack of attention here creates a negative feedback loop, as well. Another bad example of this was that very little code took the numerical stability of algorithms into account. The SubHask/HLearn author tried tackling this in an interesting way too[2].
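To make "numerical stability" concrete, the kind of rewrite involved looks like this (my own toy example, not taken from HerbiePlugin): the two definitions are algebraically equal, but the naive one cancels catastrophically at large x.

    -- Naive form: sqrt (x + 1) and sqrt x agree in almost all of their
    -- digits, so the subtraction cancels catastrophically.
    naiveDiff :: Double -> Double
    naiveDiff x = sqrt (x + 1) - sqrt x

    -- Algebraically identical rewrite that avoids the cancellation.
    stableDiff :: Double -> Double
    stableDiff x = 1 / (sqrt (x + 1) + sqrt x)

    main :: IO ()
    main = do
      let x = 1.0e16 :: Double
      print (naiveDiff x)   -- 0.0: at this magnitude, x + 1 rounds back to x
      print (stableDiff x)  -- ~5.0e-9, close to the true value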

[1] https://gist.github.com/thoughtpolice/c408773f43972e48f93377...
[2] https://github.com/mikeizbicki/HerbiePlugin




> But rather, I believe the problem they were hinting at is that the current Haskell numeric hierarchy is insufficient to capture the patterns we want for the wide range of numerical computing we do today. But even then, the tools we need to design the hierarchy we do want are also a bit out of reach, and implementing them will take time.

I think part of this issue boils down to working around the laziness and immutability of the language. The "patterns we want for [...] numerical computing" not infrequently come down to (a) build a matrix, (b) modify that matrix, (c) apply that matrix, or parts of it, to vectors, (d) modify vectors, (e) partition matrices into submatrices to work on, etc. Notice the "modify" in so many of these.

These aren't really hard problems in imperative languages. The naive algorithm for Arnoldi iteration[0] is inherently iterative and really isn't that hard to write with ok performance in an imperative language... but try writing it naively in a functional language! So, tools and abstractions need to be built to make these kinds of algorithms more natural to write functionally without costing too much in performance. In other words: the tools and abstractions are only necessary because of the limitations of a language that leans so hard on the compiler for performance.

I think this is one of the underlying things that differentiates Haskell from other languages: in Haskell, abstractions can be absurdly reusable, allowing for really elegant code... if you know which abstraction to use. If you don't, things can get difficult. In imperative languages, abstractions can't quite reach the level of reusability they have in Haskell... but you don't have to rely on them as much to solve the problems you want to solve.

Think about the Arnoldi iteration. The iterative version in a number of languages is 10 lines, but the functional version needs some way of drilling down into an array in a mutable way (a monad) in order to match the iterative version's performance...
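To make that concrete, here's roughly what a single step looks like with hmatrix's immutable vectors (a quick sketch of my own, so the details are approximate): every orthogonalization pass builds a brand-new vector, where the imperative loop would just overwrite one buffer in place.

    import Numeric.LinearAlgebra

    -- One Arnoldi step via modified Gram-Schmidt: orthogonalize A times q_k
    -- against the basis qs built so far (assumed non-empty), returning the
    -- next Hessenberg column and the new basis vector. Note that every
    -- 'v - scale h q' allocates a fresh vector.
    arnoldiStep :: Matrix Double -> [Vector Double] -> ([Double], Vector Double)
    arnoldiStep a qs = (hs ++ [hNext], scale (1 / hNext) w)
      where
        (w, hs)          = foldl ortho (a #> last qs, []) qs
        ortho (v, acc) q = let h = q <.> v in (v - scale h q, acc ++ [h])
        hNext            = norm_2 w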

[0] https://en.wikipedia.org/wiki/Arnoldi_iteration#The_Arnoldi_...


Of course, insisting on an immutable API that allows data sharing is going to produce different results than a mutable API.

There exist mutable counterparts such as Data.Vector.Unboxed.Mutable and everything in the ST monad.

The question is whether you allow yourself to use them.
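For instance (a toy of my own, not from any particular library), you can fill a buffer imperatively and still hand back an ordinary pure Vector:

    import Control.Monad (forM_)
    import qualified Data.Vector.Unboxed as V
    import qualified Data.Vector.Unboxed.Mutable as MV

    -- Fill a mutable buffer with in-place writes inside ST, then freeze it
    -- into an immutable Vector. V.create runs the ST action and does the
    -- freezing for us, so the result is pure.
    ramp :: Int -> V.Vector Double
    ramp n = V.create $ do
      mv <- MV.new n
      forM_ [0 .. n - 1] $ \i -> MV.write mv i (fromIntegral i)
      pure mv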


> The iterative version in a number of languages is 10 lines, but the functional version needs some way of drilling down into an array in a mutable way (a monad) in order to match the iterative version's performance

Sure, but IMHO the monadic code necessary to implement something like Arnoldi is neither harder to write nor uglier than the same implementation in an imperative language.

If anything, this highlights a strength of the language: we can capture an imperative idea in a functional setting, and even have the ability to nicely isolate the mutability from the rest of our functional program (with `ST`).
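For example, here's a rough sketch (mine, and simplified) of the orthogonalization step at the heart of Arnoldi, written against mutable unboxed vectors: a single buffer is updated in place, the mutation never escapes runST, and callers just see a pure function.

    import Control.Monad (forM_)
    import Control.Monad.ST (runST)
    import qualified Data.Vector.Unboxed as V
    import qualified Data.Vector.Unboxed.Mutable as MV

    -- Modified Gram-Schmidt step: subtract from v its projection onto each
    -- q, updating one mutable buffer in place, and return the projection
    -- coefficients alongside the orthogonalized vector.
    orthogonalize :: [V.Vector Double] -> V.Vector Double
                  -> ([Double], V.Vector Double)
    orthogonalize qs v = runST $ do
      w  <- V.thaw v                 -- private mutable copy of v
      hs <- mapM (step w) qs
      w' <- V.unsafeFreeze w         -- safe: w is never touched again
      pure (hs, w')
      where
        step w q = do
          let n = MV.length w
          h <- sum <$> mapM (\i -> (q V.! i *) <$> MV.read w i) [0 .. n - 1]
          forM_ [0 .. n - 1] $ \i ->
            MV.modify w (subtract (h * q V.! i)) i
          pure h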



