I think that's why it's so short: the Y combinator makes perfect sense if you explain it in the context of expansion rather than recursion. The best demonstration of how combinators work isn't a bunch of algebraic derivations and explanations relative to tail recursion; it's seeing how they expand to resolve the given function. For me, it was stepping through a factorial function in Racket.
In the end I wound up including it in the Heresy standard library.
For me it was more interactive. You start with (Python syntax)
def factorial(n):
    return 1 if n <= 1 else n * factorial(n - 1)
def factmaker(recurse):
    def fact(n):
        return 1 if n <= 1 else n * recurse(n - 1)
    return fact

factorial = factmaker(factmaker)  # ...?
The key here for me was to accept `factorial = factmaker(factmaker)` and poke at `recurse` instead: how do we make it take a function as its argument? Really, the only thing you can do with it is replace `recurse` with `recurse(recurse)`, so that `recurse` can take a function as its argument:
def factmaker(recurse):
    def fact(n):
        return 1 if n <= 1 else n * recurse(recurse)(n - 1)
    return fact

factorial = factmaker(factmaker)
With a bit of abstraction to separate out the logic, restore the original form of factmaker, and handle more arguments, you can eventually get:
def Y(maker):
    def outer(f):
        def recurse(*a, **kw):
            return f(f)(*a, **kw)
        return maker(recurse)
    return outer(outer)

def factmaker(fact):
    def factorial(n):
        return 1 if n <= 1 else n * fact(n - 1)
    return factorial

factorial = Y(factmaker)
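None of this is factorial-specific, by the way. Here's a compact, runnable form of the same self-application trick (my own sketch, not from the thread; `fibmaker` is a hypothetical name, and the inner lambda delays `x(x)` so it works in an eager language like Python):

```python
# Self-application fixed point, eager-language style: the lambda
# around x(x) delays the self-application until it's actually called.
Y = lambda f: (lambda x: f(lambda *a, **kw: x(x)(*a, **kw)))(
    lambda x: f(lambda *a, **kw: x(x)(*a, **kw))
)

# Any "maker" that takes the recursive call as a parameter will do.
fibmaker = lambda fib: lambda n: n if n < 2 else fib(n - 1) + fib(n - 2)
fib = Y(fibmaker)

print(fib(10))  # 55
```

The point is that `Y` packages the "pass yourself to yourself" plumbing once, so each individual definition only has to abstract over its recursive call.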
Personally, the explanation in "The Little Schemer" was the first that made it click for me. Its unique style of making you do the exercises even if you don't do them (it's hard to explain; just go and read the book) gives you the impression that you really understand what is happening.
Full disclosure: I ended up answering there as well, but I won't say which one is mine.
I have been working through SICP and I ended up taking a weekend break from it to really understand everything in the article.
Here's how it works. Say we have a recursive function like map:
map f [] = [] -- We'll ignore the base case since there's no recursion here
map f (x:xs) = f x : map f xs -- This is the interesting part
How? Well, let's abstract over the recursive call, replacing it with a function that we'll add as a parameter to our definition:
map' _ f [] = [] -- Still boring
map' g f (x:xs) = f x : g g f xs -- Pay attention for later!
All we've done so far is replace `map'` where it occurred on the right-hand side of its own definition with `g`, and then add `g` as a parameter of the function. We end up with a double `g` on the right-hand side because of that added parameter.
OK, now what? Well, we somehow need to make `g` be `map'`, the thing we're defining, which means we somehow need to pass our definition of `map'` to itself, so that (in the second case) we'd end up with something that evaluates to:
map' map' f (x:xs) = f x : map' map' f xs
So how do we make this happen? Well, what is `map' map' f (x:xs)`? It's `map'` applied to itself, then to the other arguments we need. So the key, it seems, is to figure out how to apply `map'` to itself. Well, in the lambda calculus that's pretty easy:
why f x y = f f x y
lambda f. lambda x. lambda y. ((f f) x) y
map = why map'
map f (x:xs) = why map' f (x:xs)
== (by def. of `why`)
map' map' f (x:xs)
== (by def. of `map'`)
f x : map' map' f xs
Voilà. So that wasn't too hard. It's basically two steps: (1) abstract over the recursive call, then (2) figure out how to pass the function you're defining to itself. If you look at the Y combinator, you'll see that that's exactly what's going on there.
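For readers who prefer an eager language, the same two steps translate directly; here's a Python sketch (my own translation of the derivation above — the names `map_`, `why`, and `my_map` are mine, mirroring `map'` and `why`):

```python
# Step 1: abstract over the recursive call; g stands in for map_ itself.
def map_(g, f, xs):
    if not xs:                            # the boring base case
        return []
    return [f(xs[0])] + g(g, f, xs[1:])   # note the double g

# Step 2: self-apply, so that g really is map_.
def why(f):
    return lambda *args: f(f, *args)

my_map = why(map_)
print(my_map(lambda x: x * 2, [1, 2, 3]))  # [2, 4, 6]
```

Same shape: `why` does nothing but apply a function to itself before handing over the remaining arguments.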
If this didn't make a lot of sense to you, it's probably because I wrote it on my iPad at 4:30 in the morning. It really is just those two steps: abstract over the recursive call, then self-apply. ...
To get deeper into the recursive mindset (if you're not there yet) and to build up deliberately to the Y combinator, take a look at one of my favorite books on functional programming, The Little Schemer. (And watch out for the jelly stains!)
0. And it's ill-typed, but never mind that. It's not possible to define the Y combinator (or any other fixed-point combinator) in the simply typed lambda calculus, so just imagine that my Haskell-style syntax is untyped, like the basic lambda calculus.
1. Although, again, there's no way to give fix a good type in the simply typed lambda calculus.
Also, would the Y combinator be a useful abstraction-aid in a compiler (for a typed language), or does it exist merely as a curiosity?
I wouldn't ask otherwise, but after reading this comment I have to: why did you use that syntax instead of Scheme, which is what TFA uses?
EDIT: also, I second "The Little Schemer" suggestion. I'm working (for the nth time) through "The Seasoned Schemer" in preparation for "The Reasoned Schemer" and it still is very enjoyable. It's a bit like SICP but with a sense of humour and unique, engaging style of presentation. Well worth the read!