

The Anatomy Of Search Technology: Crawling Using Combinators - pathdependent
http://highscalability.com/blog/2012/5/28/the-anatomy-of-search-technology-crawling-using-combinators.html

======
walrus
Ok, I'll bite. From the article:

    
    
      Now let's see how combinators (which we discussed in the previous blog
      posting) might make doing some of these computations easier.
    

So, how are they defined in the previous blog posting?

    
    
      A combinator is an atomic operation on a cell of a database that is
      associative and preferably commutative.
    

No, it's not. A combinator is a function with no free variables. Even the
example is wrong:

    
    
      "Add(n)" is an example of a simple combinator; it adds n to whatever
      number is in the cell. 
    

If "Add" was a combinator, the cell would have to be one of the parameters of
"Add". Otherwise, the cell is a free variable.

To be clear, I'm just complaining about the misuse of the term "combinator",
since it's a word with a strict mathematical definition and no other common-
language interpretation (like "function" or "operation" have). I'm not
commenting on the actual content of the article.

~~~
greglindahl
The initial value of the cell is (eventually) one of the parameters of add(n)
-- when you compute the final value. Before you get to that point, the various
add(n) operations aimed at a given cell are combined. In the diagram in the
first posting in the series, 18 add(1) operations on the same cell turn into a
single add(18) operation. It's only when the cell is read (or the bucket is
merged) that the final value of the cell is computed.
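A rough sketch of that merge-before-read behavior (my own toy version, not Blekko's actual code):

```python
# Pending add(n) operations on a cell are merged associatively as
# they arrive; the stored value is only touched when the cell is read.

class LazyCell:
    def __init__(self, initial=0):
        self._value = initial
        self._pending = 0  # merged sum of all queued add(n) operations

    def add(self, n):
        # Because addition is associative (and commutative), queued
        # operations combine into one without reading the stored value.
        self._pending += n

    def read(self):
        # Only now are the merged operations applied to the initial value.
        self._value += self._pending
        self._pending = 0
        return self._value

c = LazyCell()
for _ in range(18):
    c.add(1)          # 18 add(1) operations...
print(c.read())       # ...collapse into a single add(18); prints 18
```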

~~~
walrus
I understand, but I still think it's iffy calling it a combinator. Maybe
calling it lazy evaluation (sort of) would be better?

(I'm not sure if you saw my edit before you replied. I added the last two
sentences only a few minutes before your response.)

------
PaulHoule
I found this to be a remarkably bad article. From it you'd learn nothing
whatsoever about what it takes to build a moderate-scale (10 million pages)
web crawler, never mind a large-scale one.

~~~
greglindahl
The goal of this article was only to talk about how combinators make crawling
easier. If you'd like a more general introduction to the topic of crawling, I
provided some references in the third paragraph.

