

Show HN: getimgs - a search project I built in university - Finbarr
http://getimgs.appspot.com/search.html

======
Finbarr
I built this as part of my dissertation in the 4th year of my CS degree, over a
year ago. I was talking to somebody about it yesterday and they suggested I
post it on HN, so here it is.

Essentially, you are looking at a search engine that tries to extract relevant
images from the content of each result page, to make it easier for the user to
decide which results are most relevant. The search portion of the project is a
fairly thin client that uses the Google AJAX Search API and calls my own API to
decorate the search results.

The API fetches the web page in real time and uses a range of features in the
page's HTML source to make inferences about the usefulness of a given
image. The search engine tries to display the logo of the page at the bottom
left of each result, and a representative content image to the right. It works
really well for certain kinds of searches, e.g., 'news', and not so well for
others.
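
As a rough illustration of the kind of inference described above, here is a
minimal Python sketch using only the standard library. The specific features
(a "logo" substring in an image's attributes, and declared width times height)
and the names ImageCollector, logo_score, content_score and pick_images are
assumptions made for this example, not the features or code the project
actually uses.

```python
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collects the attributes of every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            # attrs is a list of (name, value) pairs; value may be None
            self.images.append({k: (v or "") for k, v in attrs})


def _to_int(value):
    """Parse a width/height attribute; 0 when absent or not a plain integer."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return 0


def logo_score(img):
    """Assumed heuristic: 'logo' in src, id, class or alt hints at a site logo."""
    text = " ".join(img.get(k, "") for k in ("src", "id", "class", "alt"))
    return 1 if "logo" in text.lower() else 0


def content_score(img):
    """Assumed heuristic: a larger declared area suggests a content image."""
    if logo_score(img):
        return 0
    return _to_int(img.get("width")) * _to_int(img.get("height"))


def pick_images(html):
    """Return (logo_src, content_src); either is None if nothing qualifies."""
    parser = ImageCollector()
    parser.feed(html)
    logos = [i for i in parser.images if logo_score(i) > 0]
    content = [i for i in parser.images if content_score(i) > 0]
    logo = logos[0].get("src") if logos else None
    best = max(content, key=content_score).get("src") if content else None
    return logo, best
```

A real scorer would need more signals than this (position in the document,
surrounding markup, actual image dimensions), since many pages omit width and
height attributes entirely.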

One important point to note is that it currently breaks for all Wikipedia
results: my special logic for parsing those pages is out of date now that
they've updated their HTML :s

