

Ask YC: Regex web search? - aneesh

Do any of the major search engines offer a way to query the search engine using a regular expression?  I'm not even looking for complete regex support; even using wildcards to represent missing letters would be great, so that I can search for "Olymp*" and have it include results for "Olympics".  Google doesn't seem to support this.  Are the pages indexed in a certain way (i.e., by word) that would make this kind of search prohibitively difficult?
======
crazyirish
So in your case, doing Olymp* is actually not too difficult (that's just
prefix matching, close to stemming), but doing actual regex matching across
the internets would be hard sauce. One of the traditional ways of storing an
index like this is word -> {documents}. Doing a search for a set of words is
then not too expensive, but for full regex support they would have to look at
every word entry. That's just sadpanda to the max.
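
To make the word -> {documents} idea concrete, here is a minimal sketch in Python. It is only a toy under stated assumptions, not how any real search engine is built: the function names (`build_index`, `prefix_search`, `regex_search`) and the three sample documents are made up for illustration. It shows why a prefix like Olymp* can be answered with a binary search over a sorted vocabulary, while an arbitrary regex has to be tested against every term.

```python
import bisect
import re
from collections import defaultdict


def build_index(docs):
    """Map each lowercased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index


def prefix_search(index, vocab, prefix):
    """Documents containing a word that starts with `prefix`.

    `vocab` is the sorted list of index keys, so all matching terms sit
    next to each other and can be located with a binary search.
    """
    prefix = prefix.lower()
    start = bisect.bisect_left(vocab, prefix)
    hits = set()
    for term in vocab[start:]:
        if not term.startswith(prefix):
            break  # past the contiguous run of matching terms
        hits |= index[term]
    return hits


def regex_search(index, pattern):
    """Documents containing a word fully matched by `pattern`.

    There is no ordering to exploit here, so every term is tested.
    """
    rx = re.compile(pattern)
    hits = set()
    for term, doc_ids in index.items():
        if rx.fullmatch(term):
            hits |= doc_ids
    return hits


# Hypothetical sample data, purely for illustration.
docs = {
    1: "The Olympics open in Beijing",
    2: "Mount Olympus is in Greece",
    3: "Regular expressions are fun",
}
index = build_index(docs)
vocab = sorted(index)

print(prefix_search(index, vocab, "Olymp"))
print(regex_search(index, r".*ions"))
```

The prefix lookup touches only the handful of terms that share the prefix, whereas the regex lookup walks the whole vocabulary, which is the "look at every word entry" cost at web scale.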

------
adatta02
Google will also "fill in the blank" according to this
<http://www.google.com/intl/en/help/features.html>

[http://www.google.com/search?hl=en&client=firefox-a&...](http://www.google.com/search?hl=en&client=firefox-a&rls=org.mozilla%3Aen-US%3Aofficial&hs=JQl&q=olym+*&btnG=Search)
matches "Olympics" but perhaps more interesting is
[http://www.google.com/search?q=Isaac+Newton+discovered+*&...](http://www.google.com/search?q=Isaac+Newton+discovered+*&btnG=Search)

------
brianr
This is probably overkill, but Amazon Web Services offers something called
"Grep the Web" which you can use to run offline regex searches of the web:
[http://www.amazon.com/Alexa-Web-Search/b?ie=UTF8&node=26...](http://www.amazon.com/Alexa-Web-Search/b?ie=UTF8&node=269962011)

------
neilk
<http://www.google.com/codesearch> does full regex search, but I'm not aware
of any mainstream web search engine that does.

