
I'm posting this as a top-level comment, but it's really a reply to the discussion downthread about compilers being able to work magic if we let them. Better still, why not help them?

Something I took for granted for the longest time about Haskell (which remains the only language I know of with the feature) is the ability to write user-defined "rewrite rules". You can say, "okay, GHC, I know for a fact that if I use these functions in such-and-such way, you can replace it by this instead".

  {-# RULES
    "foo/bar"  forall x.
      foo x (bar x) = superOptimizedFooBar x
    #-}
A rule like this would be based on the programmer's knowledge of FooBar theory, which tells her that such an equality holds. The compiler hasn't studied lax monoidal FooBaroids and cannot be expected to infer this on its own. :)

Now, anywhere a user of this code writes something like

  foo [1,2,3] (bar [1,2,3])
the compiler will substitute

  superOptimizedFooBar [1,2,3]
in its place. This is a nice way to bring the compiler "closer" to the programmer, and allow the library author to integrate domain-specific knowledge into the compiler's optimizations.
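To make this concrete, here is a minimal self-contained sketch of what such a module could look like. The functions foo, bar, and superOptimizedFooBar are hypothetical stand-ins (the NOINLINE pragmas keep GHC from inlining the definitions before the rule has a chance to match; compile with -O and -ddump-rule-firings to watch it fire):

```haskell
-- Hypothetical library functions: foo adds elementwise,
-- bar negates, so foo x (bar x) is always all zeros.
foo :: [Int] -> [Int] -> [Int]
foo xs ys = zipWith (+) xs ys
{-# NOINLINE foo #-}

bar :: [Int] -> [Int]
bar = map negate
{-# NOINLINE bar #-}

-- The fast path the author knows is equivalent.
superOptimizedFooBar :: [Int] -> [Int]
superOptimizedFooBar xs = map (const 0) xs

{-# RULES
  "foo/bar"  forall x.  foo x (bar x) = superOptimizedFooBar x
  #-}
```

The rule is only sound if the two sides really are equal for every x, which is exactly the FooBar-theory knowledge the compiler doesn't have.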

You can also "specialize" by using faster implementations in certain cases. For example,

  timesFour :: Num a => a -> a
  timesFour a = a + a + a + a

  timesFourInt :: Int -> Int
  timesFourInt x = x `shiftL` 2

  {-# RULES
    "timesFour/Int"  timesFour = timesFourInt
    #-}
If you call timesFour on a Double, it uses addition (ha!), but calling it on an Int uses bit-shifting instead, because this rule fires.
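For the common case where you just want a monomorphic copy compiled and used at call sites of that type, GHC also has a built-in SPECIALIZE pragma, so you don't need a hand-written rule at all. A small runnable sketch:

```haskell
import Data.Bits (shiftL)

timesFour :: Num a => a -> a
timesFour a = a + a + a + a
-- Ask GHC to compile a dedicated Int version and rewrite
-- call sites at type Int to use it:
{-# SPECIALIZE timesFour :: Int -> Int #-}

-- Hand-written fast path, for comparison: *4 is a left shift by 2.
timesFourInt :: Int -> Int
timesFourInt x = x `shiftL` 2
```

SPECIALIZE only removes the dictionary-passing overhead of the Num constraint; swapping in a genuinely different algorithm (like the shift) still needs an explicit rule.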

High-performance Haskell libraries like vector, bytestring, text, pipes, or conduit capitalize on this feature, among other techniques. When compiling code written using libraries like this, this is how it goes:

- rule #1 fires somewhere
- it rewrites the code into something that matches rule #2, "clearing the way" for it to fire
- rule #2 fires
- rule #3 fires
- rule #1 fires again
- rule #4 fires

and so on, triggering a "cascade" of optimizations.
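The classic example of a rule that feeds its own output back into the rule matcher is "map/map" fusion: two traversals collapse into one, and the result can collapse with yet another. A self-contained sketch using a local copy of map (base already ships an equivalent rule for the Prelude's map; the phased NOINLINE keeps the definition from inlining before the rule can match):

```haskell
-- A local map so we can attach our own rule to it.
mymap :: (a -> b) -> [a] -> [b]
mymap _ []     = []
mymap f (x:xs) = f x : mymap f xs
{-# NOINLINE [1] mymap #-}

{-# RULES
  "mymap/mymap"  forall f g xs.  mymap f (mymap g xs) = mymap (f . g) xs
  #-}

-- mymap (+1) (mymap (*2) (mymap (+3) xs)) rewrites twice in a row,
-- cascading down to a single traversal: mymap ((+1) . (*2) . (+3)) xs
triple :: [Int] -> [Int]
triple xs = mymap (+1) (mymap (*2) (mymap (+3) xs))
```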

The promise of Haskell is that we already have a "sufficiently smart compiler": today, with good libraries, GHC is capable of turning clear, high-level, reusable functional code with chains of function compositions and folds and so on into tight, fast loops.


I must add, though, that getting rewrite rules to fire in cascades to get "mad gainz" requires one to grok how the GHC inliner/specializer works.


Data.Vector also utilizes an internal representation, called a "bundle", that makes fusion explicit and hence predictable (inevitable, even):


but this relies on rewrite rules too, e.g. the previous module contains this rule:

  {-# RULES
  "zipWithM xs xs [Vector.Stream]" forall f xs.
    zipWithM f xs xs = mapM (\x -> f x x) xs   #-}

My biggest concern with something like this is how it affects debuggability and the principle of least surprise, for a couple of reasons. The biggest is behavior in the presence of bugs. If there is a bug in foo, but you only ever use foo in the context foo x (bar x), then the whole expression gets rewritten and you see correct behavior even though your code has a bug, one that will suddenly surface when you use foo in a way that the rule doesn't match. Or there's a bug in superOptimizedFooBar: you correctly write "foo x (bar x)", yet you get wrong results, and you may waste time trying to debug foo and bar without realizing a rule replaced them. And there is also the possibility of bugs in the rules themselves. In general, I'm a little hesitant to use something that replaces my code behind my back.

It is very interesting, though, and is a good avenue to explore. I've got some reading to do :)

You're right, of course. There are ways around this: for one, without optimization (i.e. -O0) you don't have any rewrite rules firing, so any discrepancies in behavior can be tracked down this way.

In practice, most of the libraries that include tons of rewrite/specialise/inline (ab)use are either "core" libraries (like vector/bytestring) or have a large userbase (e.g. Conduit), and rules don't really change too much from version to version, so this has never actually had the detrimental effects that "silent replacement of code" might have in the worst case.

This might[0] sound similar to apologetics for dynamically-typed languages: the only real answer to your question is that rewrite rules are ultimately a way to improve performance, and they come with caveats usually associated with replacing clearer algorithms with faster ones. (I'm thinking of the adaptive sort from the C++ STL, which iirc uses different sorting algorithms for differently-sized containers to maximize expected performance. It's not exactly intuitive that vectors of size 100 and vectors of size ~10000 are sorted differently, is it?)

Of course, the only verifiably-correct way to do these things is through the use of machine-checkable proofs of the transforms. The well-known "foldr/build" rule that many optimizations of linked-list functions reduce to, for instance, has been proven outside of GHC, and there are usually similar proofs for other rules known to the library author. The appeal of dependently-typed languages is how they are a nice vehicle for the inclusion of those proofs in library code: if the compiler can't verify that your rule makes sense, it can complain instead. You then end up supplying a proof, or saying "just believe me, okay?" ;)
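For reference, the foldr/build rule as it appears in GHC.Base reads roughly as in the comment below; the rank-2 type of g is what makes the equation provable by parametricity ("free theorems"). A small demonstration that both sides agree on a concrete g:

```haskell
{-# LANGUAGE RankNTypes #-}
import GHC.Exts (build)

-- The rule in GHC.Base is (approximately):
--   "fold/build"  forall k z (g :: forall b. (a -> b -> b) -> b -> b).
--     foldr k z (build g) = g k z

-- A concrete producer: builds the list [1,2,3].
g :: forall b. (Int -> b -> b) -> b -> b
g c n = c 1 (c 2 (c 3 n))

lhs, rhs :: Int
lhs = foldr (+) 0 (build g)   -- before the rule fires
rhs = g (+) 0                 -- what the rule rewrites it to
```

When the rule fires, the intermediate list never exists at runtime: the consumer's cons and nil are handed straight to the producer.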

[0]: "In practice"? Seriously? Next thing I know I'll be parroting "in the real world" and so on...

This post has a good overview of what can be gained by the use of these techniques:


> Using just these two rules, I saw a 225% increase in the number of nodes processed per second. Under the right circumstances, GHC can even outperform C code by applying these kinds of inter-procedural optimizations.

Also, we have QuickCheck, which allows us to check properties by randomly generating testcases. The post I linked mentions this.
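A sketch of what that looks like in practice (assuming the QuickCheck package is installed): the equation a rewrite rule asserts is just an ordinary property, so you can test it at concrete types. Here is the map/map fusion law:

```haskell
import Test.QuickCheck (quickCheck)

-- The equation behind a "map/map" fusion rule, tested on
-- randomly generated lists at a concrete type:
prop_mapFusion :: [Int] -> Bool
prop_mapFusion xs = map (+1) (map (*2) xs) == map ((+1) . (*2)) xs

main :: IO ()
main = quickCheck prop_mapFusion
```

This doesn't prove the rule, of course, but a hundred random counterexample attempts catch most honest mistakes.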

I wonder if there will be a time when we start seeing this with JavaScript transpilers (JS->JS), which are normally used to bring feature parity and compatibility to runtimes but are starting to become obsolete as the runtimes catch up.

For example

    let x = (await grabSomeIntegers()).map(double).filter(biggerThan(3)).map(subtract(1));
can be optimized automatically to use a single loop, but one can rewrite it to be even faster (even if insignificantly, in this trivial example) because you know they will be integers. But then you let go of DRY.

How do Haskell library authors deal with that?

In Haskell your example would look like

  x = do
    ints <- grabSomeIntegers
    return (ints & map (*2) & filter (> 3) & map (subtract 1))
(where x & f = f x) which the compiler desugars into something like

  x = fmap (map (subtract 1) . filter (> 3) . map (*2)) grabSomeIntegers
Now the compiler can very well perform optimizations on the big composed function. Even then, as you say, this has its limits: polymorphic/DRY code still suffers a performance penalty because it has to work the same way with all allowed inputs.
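(For the curious: & has been in Data.Function since base 4.8, so the pipeline form is directly runnable. Note that "subtract 1" is needed rather than "(-1)", which Haskell parses as negative one.)

```haskell
import Data.Function ((&))

pipeline :: [Int] -> [Int]
pipeline ints = ints & map (*2) & filter (> 3) & map (subtract 1)
-- pipeline [1..6] gives [3,5,7,9,11]
```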

... right? Nope, that doesn't have to be the case: Haskell supports more than just your $grandparent's parameterized types!

The vector package uses something called data families to have different in-memory representations for vectors of different types, without "letting go of the DRY" at all.


Here, Vector a is a data family: how a value of type Vector a "looks" in memory depends on a. So you'll notice a bunch of lines of the form

  newtype instance Vector Int16 = MV_Int16 (P.Vector Int16)
What this means is "if we have a vector of values of type Int16, use a primitive vector (essentially a raw array) to represent it in memory". (P is an alias for the Data.Vector.Primitive module.) Now I can write code that uses the beautiful user-facing API of vector:
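Here is a minimal self-contained data family in the same spirit (Box, BoolBox, and CharBox are hypothetical names, not part of vector): one abstract type, with per-element-type representations chosen by each instance.

```haskell
{-# LANGUAGE TypeFamilies #-}
import Data.Bits (popCount)

-- One abstract type; its in-memory guts depend on the element type.
data family Box a

-- Bools packed as bits in a single Int (a sketch of bit-packing)
newtype instance Box Bool = BoolBox Int

-- Chars stored as an ordinary String
newtype instance Box Char = CharBox String

countTrue :: Box Bool -> Int
countTrue (BoolBox bits) = popCount bits

size :: Box Char -> Int
size (CharBox s) = length s
```

Functions on Box Bool can exploit the packed representation directly, while the user-facing type is the same Box for both.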


Written using vectors, code like in your example is perfectly DRY; it would look something like this (untested code!):

  combobulate :: Num a => IO (Vector a) -> IO (Vector a)
  combobulate grab = do
    as <- grab
    return (as & V.map (*2) & V.filter (> 3) & V.map (subtract 1))
and under the hood, whenever I use a Vector Animal[0], it uses whatever fast general-purpose array implementation is available, but the moment I call this with a vector of Int16 or Bool or Complex Double, it switches to the fast implementations in the Primitive/Internal modules.

The key is the data families: the types expose a simple, consistent API, hiding the actual guts of the memory representation from the user. The library author is free to perform whatever ugly/evil/unsafe manipulations she feels like behind the scenes on the internal representations, or not.

[0] Defining numeric operations on animals is left as an exercise to the careful reader.

I guess that's one of the benefits of a great type system. Thanks for the detailed explanation.

Yup. A good one isn't enough, though; it's the constant evolution that's awesome: type/data families are pretty new.

Your example of rewrite rules reminds me a lot of expression templates in C++, though there's no direct support by the language and it's done through the type system instead. Some libraries (e.g. eigen or the boost library for linear algebra) use expression templates to produce optimized code for a chain of functions or operators.

The specialization example is the same as template specialization in C++.

I've heard a little about generics/templating/whatever-you-call-the-things-between-angle-brackets techniques in C++ from far away.

> The specialization example is the same as template specialization in C++.

This sounds very interesting to me, because you're saying it's accomplished entirely in the type system: do you have an example or a reference you can point me to?

I'm guessing there's a type-level tag for an algorithm, and the compiler is told how to combine tags corresponding to a sequence of operations on vectors that looks a certain way. Something like do_the_thing<t, DotWith<Self>, HasFastNorm<t>> = do_the_thing<Norm<t>>? (just thinking aloud in pseudo-C++, nothing to see here, move along) In that case, it would be closer to something using a function on types, which Haskell has in the form of type families.

Oh, this is interesting!


What you have towards the end of that post is actually something like

  type family Vector a where
    Vector (Ptr a) = ...
    Vector a       = ...
This is a closed type family. (Think of it as a pattern-match on types where the cases are tried in order.) Very cool!
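A tiny runnable version of the same idea, with a hypothetical family Rep (cases are tried top to bottom, first match wins):

```haskell
{-# LANGUAGE TypeFamilies #-}
import Foreign.Ptr (Ptr)

-- Closed type family: a pattern-match on types.
type family Rep a where
  Rep (Ptr a) = Int   -- pointers: represent as a machine word (sketch)
  Rep a       = a     -- everything else: represent as-is

-- Rep Bool reduces to Bool via the second (fallthrough) equation:
rep :: Rep Bool
rep = True
```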

I still don't know how the linear algebra/Eigen examples might work, though.
