

Search Architecture - kilimchoi
http://instagram-engineering.tumblr.com/post/124162066737/search-architecture

======
ChuckMcM
That reads a lot like the architecture that FB came up with for Graph Search
as well, essentially creating queries which are s-expressions for edges and
then solving the graph for a given start node (things like 'user:me' but can
also be 'user:foo' or 'image:bar').
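
To make the "s-expressions for edges" idea concrete, here is a rough Python sketch. It is not FB's or Instagram's actual query language; the edge names ('follows', 'posted', 'liked'), the `EDGES` table, and the `evaluate` function are all made up for illustration. A query is a nested tuple, and evaluating it walks edges outward from a start node.

```python
# Hypothetical toy graph: (edge_type, source_node) -> set of target nodes.
EDGES = {
    ("follows", "user:me"): {"user:foo", "user:baz"},
    ("posted", "user:foo"): {"image:bar", "image:qux"},
    ("posted", "user:baz"): {"image:cat1"},
    ("liked", "user:me"): {"image:bar", "image:cat1"},
}


def evaluate(query):
    """Evaluate a nested-tuple s-expression into a set of node ids."""
    if isinstance(query, str):
        # A bare node id such as 'user:me' is the start of a traversal.
        return {query}
    op, *args = query
    if op == "and":
        return set.intersection(*(evaluate(a) for a in args))
    if op == "or":
        return set.union(*(evaluate(a) for a in args))
    # Anything else is treated as an edge type: follow that edge from
    # every node produced by the single inner expression.
    (inner,) = args
    result = set()
    for node in evaluate(inner):
        result |= EDGES.get((op, node), set())
    return result


# "Images posted by people I follow that I have also liked."
query = ("and", ("posted", ("follows", "user:me")), ("liked", "user:me"))
result = evaluate(query)
```

With the toy data above, `result` contains 'image:bar' and 'image:cat1'. Swapping the start node for 'user:foo' or 'image:bar' reuses the same evaluator, which is the appeal of expressing everything as edges.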

That is a really great system when your documents are reasonably simple and
nominally identical (pictures, comments) and the ranking signals are all
organic (liked, followed, tagged, etc).

It gets a bit more interesting when you try to infer content of the images. If
you do the image interpretation like Google does or Alchemy Vision[1] then you
get a document that itself has semantics in addition to the metadata. So 'find
all the cats' is now a corpus-wide search of several billion images which do
not have a graph relationship but may have been tagged 'cat'. Clearly it's a
bit easier if you constrain it with 'cats in pictures from people I follow'
but that doesn't help a whole lot if you're liberal with your follow button
:-). You can build something like the Mustang architecture paper that Google
wrote or something like the combinator-based inverted graph stuff that we did
at Blekko, or I'm sure there are other ways to skin that particular cat.
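
As a rough illustration of the difference between the corpus-wide lookup and the follow-constrained one, here is a toy Python sketch. It is a plain inverted index over tags, not Mustang or anything Blekko built; the data and the names (`IMAGE_TAGS`, `IMAGE_OWNER`, `FOLLOWS`, `find_tag`, `find_tag_from_followed`) are hypothetical.

```python
from collections import defaultdict

# Hypothetical corpus: tags inferred per image, plus who posted each image.
IMAGE_TAGS = {
    "image:1": {"cat", "sofa"},
    "image:2": {"dog"},
    "image:3": {"cat"},
    "image:4": {"cat", "dog"},
}
IMAGE_OWNER = {
    "image:1": "user:foo",
    "image:2": "user:foo",
    "image:3": "user:baz",
    "image:4": "user:foo",
}
FOLLOWS = {"user:me": {"user:foo"}}


def build_inverted_index(image_tags):
    """Map each tag to the set of images carrying it."""
    index = defaultdict(set)
    for image, tags in image_tags.items():
        for tag in tags:
            index[tag].add(image)
    return index


def find_tag(index, tag):
    """Corpus-wide: every image with the tag, no graph involved."""
    return set(index.get(tag, ()))


def find_tag_from_followed(index, tag, user):
    """Graph-constrained: keep only images posted by accounts `user` follows."""
    followed = FOLLOWS.get(user, set())
    return {img for img in index.get(tag, ()) if IMAGE_OWNER[img] in followed}


index = build_inverted_index(IMAGE_TAGS)
all_cats = find_tag(index, "cat")
my_cats = find_tag_from_followed(index, "cat", "user:me")
```

Here `all_cats` covers images 1, 3 and 4, while `my_cats` drops image 3 because 'user:baz' is not followed. The constraint only shrinks the result when the follow set is small, which is the point about being liberal with the follow button.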

I'm always interested in reading about this stuff for that reason. So many
interesting problems in computer science live in a forest of acyclic graphs.

[1] Disclaimer: IBM acquired both Blekko and AlchemyAPI. See
[http://www.alchemyapi.com/products/alchemyvision](http://www.alchemyapi.com/products/alchemyvision)

