Please consider releasing a dataset, even to a limited set of qualified folks under NDA.

There are lots of ways to spread the work out. A number of the people here have backgrounds in search (which is really all about ranking and not much about searching). Even a closed Kaggle competition could yield some really interesting algorithms. (Anthony is a great guy and would be happy to discuss it, I'm sure. Note that a closed competition is run under NDA, with only the most qualified participants.)

Spreading it out might be fun; I hope you'll consider it. Lots of smart guys love HN and would probably want to help.

