Someone needs to give these guys a big wad of cash so they can index more sites. I got some interesting results even on some moderately obscure images, but many others returned no results.
Even with a small index, though, this is pretty damn cool.
At the risk of being spammy, I really do think they'd benefit from using the service my company has developed - http://www.80legs.com. They'll be able to reach and compare millions of images per day for less than $100 (depending on the computational complexity of their image comparison).
edit:
OK, I'm watching the video where you, or someone, is explaining the service. I was wondering how limited the access to data is. I'm under the impression that you crawl the web no matter what, and clients piggyback on the crawl stream and run analytics on it. So who makes the calls on the web crawling method? You, I guess? What if a client wants to crawl and analyze only a specific domain, or only a specific country, for example? And how often does the crawl run, and what exactly gets crawled? Let's say a client wants to implement a news.google.com or Google Alerts (or even TinEye), just as an example: would it analyze a web crawl stream from you and pull data out of your system to its own servers? How would such a presumably large data set get transferred to the client? What is provided as crawled data? Only an HTML stream, or the whole page including JS, CSS and pictures? Or would a client need to get image links from your crawl and fetch the images themselves? Sorry for all the questions and the incoherence :) but it does look interesting.
1. Who makes the crawling method?
We give you the ability to write your own crawling logic. You don't have to, though; we have a default crawler that runs as well.
2. Crawling specific pages?
You can specify pages to crawl using regular expressions or your own custom code.
3. How often and what gets crawled?
Up to you :)
4. How does the data get transferred over?
It's better if you don't transfer all the data over. You can push in your own compute functions to process the data you crawl, and just return much smaller result data sets.
5. What's provided as crawl data?
You can specify the results you return in your code. It's up to you.
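To make answers 2 and 4 a bit more concrete, here is a rough sketch in Python. It is not the actual 80legs API, and every name in it (`URL_FILTER`, `should_crawl`, `process_page`, the `.de` rule) is a hypothetical stand-in. It only illustrates the idea of filtering pages with a regular expression and returning a small result set (image links) instead of shipping whole pages back.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical rule: only crawl pages whose host ends in .de
# (one way to express a "country-specific" crawl).
URL_FILTER = re.compile(r"^https?://[^/]+\.de(:\d+)?(/|$)")


def should_crawl(url):
    """Return True if the URL passes the regular-expression filter."""
    return URL_FILTER.match(url) is not None


class ImageLinkCollector(HTMLParser):
    """Collects absolute URLs of <img> tags found in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative links against the page URL.
                self.image_urls.append(urljoin(self.base_url, src))


def process_page(url, html):
    """Hypothetical compute function: reduce a page to its image links."""
    collector = ImageLinkCollector(url)
    collector.feed(html)
    return collector.image_urls
```

The point of the design is that only the list returned by `process_page` would need to leave the crawler, which is much smaller than the raw HTML.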
Apropos 4: those processed data sets could get huge too, which is why I was asking.
If the crawling logic can be under the client's control, but there is also a provided crawler that runs well, and at those prices, it sounds like an excellent product you've got there :) Let's just go over the TinEye example we have here: they need to crawl the web and retrieve images. The quoted price is "Only pay $2 per million pages crawled", but what would they pay for retrieving the images?
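For scale, the quoted crawl price works out as below. This is only back-of-the-envelope arithmetic on the "$2 per million pages crawled" figure; the function name is made up, and it deliberately leaves out any image retrieval or compute charges, since those are not stated in this thread.

```python
# USD per million pages, taken from the quoted "Only pay $2 per million pages crawled".
PRICE_PER_MILLION_PAGES = 2.00


def crawl_cost(pages):
    """Crawl-only cost in USD for a given number of pages (hypothetical helper)."""
    return pages / 1_000_000 * PRICE_PER_MILLION_PAGES


if __name__ == "__main__":
    for pages in (1_000_000, 50_000_000):
        print(pages, "pages ->", crawl_cost(pages), "USD")
```

So at that rate, $100 would cover crawling 50 million pages, before whatever the image retrieval itself costs.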
I see so much potential in this service. That is one hell of a product at one hell of a price. Congratulations on it, I hope you do well with it!
I don't know about them, but this looks very promising. Will you also offer search/parse primitives, like simple ways to parse classifieds or forums or other standard formats?
We plan on providing some basic utility functions that will help our customers, but one of the great things is that you can use third-party code in your own code. I actually mean two things when I say this. First, you can use available Java libraries for parsing HTML. Second, we'll be creating an app store model, whereby developers can sell or license their code to other customers.
Does anyone remember the name of that free and open index of the web that some folks are compiling? They let you download it; it's like a terabyte. I think they are out of Seattle or Vancouver. Anyway, it would be interesting if 80legs could make that available for searching.
It's my understanding that Sun Microsystems was toying with the idea some time back - I'm not sure if they ever followed through. It probably wouldn't have been a free service, though.
Nope, this is totally free and open. I can't find the website again, but I heard about them here. They are creating a larger index of the web than Google's. The article was on the HN front page for a while because it detailed the technology they use to scrape the web.
Heh, I did the same with my usual avatar with similar results. They've been about for a while - it is impressively useful & they need more funds for sure.
Idée Inc has been around for quite a while doing cool image stuff - they are almost certainly funded. This is a really impressive pattern matching search.
A long time ago I saw something like this on OpenClipArt. The cool thing, though, was that it provided a Java applet where you could draw what you wanted and it would try to match that. I find that more useful and awesome, though this is useful too. I know basically the same thing could be done with Paint or something, but it's cooler in the browser. Someone please take note of this and implement a feature like it.