

Greplin Wants You To Redesign Wikipedia Search - tilt
http://techcrunch.com/2011/07/26/greplin-wikipedia/

======
redthrowaway
The reason Wikipedia's search is broken isn't (only) because it doesn't take
advantage of the interesting data representations you can use; it's also just
a really shitty search algo. It doesn't make any guesses, it doesn't look at
related terms, and it often won't bring up the relevant article unless you
specifically type that article's name.

There's a huge opportunity to beautifully represent information on Wikipedia
and make search easy, intuitive, and informative, but it all rests upon having
a search algo that isn't utter crap. The WMF will have to fix that before any
of this will have any practical benefit.

As an ironic aside, it seems that Wikipedia isn't OSS. I can find the repo for
MediaWiki, but Wikipedia seems to be a heavily modified version thereof, and
it appears to be proprietary. If anyone can find the repo, I'd appreciate a
link.

~~~
DieBuche
What do you mean by heavily modified? Wikipedia is running version 1.17wmf,
basically 1.17 with some fixes pulled from trunk. You'll find it here:
[http://svn.wikimedia.org/viewvc/mediawiki/branches/wmf/1.17w...](http://svn.wikimedia.org/viewvc/mediawiki/branches/wmf/1.17wmf1/)

You'll find the installed extensions here:
<http://en.wikipedia.org/wiki/Special:Version>

------
JohnsonB
Most people use Google to search Wikipedia because it's simply easier:
instead of navigating to Wikipedia and then typing in your search, you just
type it where you usually do. Nothing you redesign on wikipedia.org will ever
change this fact; people's shortest path to a specific Wikipedia article will
still be their browser search bar or a homepage set to Google, etc.

If Wikipedia's search can be improved, it's simply from an accuracy standpoint.
I would guess that Google employs more sophisticated spelling correction,
stemming, autocompletion, etc. Developing all these features to Google's
level is very expensive, and in that case it might make a lot of sense to just
use Google's technology for the Wikipedia search.
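
To make the spelling-correction point concrete, here is a minimal sketch in Python. It assumes a tiny in-memory list of article titles; the titles and the `suggest_titles` helper are hypothetical and are not what Wikipedia or Google actually run. It leans on the standard library's `difflib.get_close_matches` to guess which title a misspelled query probably meant.

```python
import difflib

# Hypothetical sample of article titles; a real index would hold millions.
TITLES = ["Bermuda Triangle", "Sino-Indian War", "Wireless router", "Firmware"]


def suggest_titles(query, titles=TITLES, limit=3, cutoff=0.6):
    """Return up to `limit` titles that closely resemble `query`, best first."""
    # Compare in lowercase so capitalisation does not count as a typo.
    lowered = {title.lower(): title for title in titles}
    matches = difflib.get_close_matches(
        query.lower(), list(lowered), n=limit, cutoff=cutoff
    )
    return [lowered[match] for match in matches]
```

Calling `suggest_titles("bermuda triangel")` should rank "Bermuda Triangle" first, while a query that resembles nothing in the list returns an empty list. This only covers typos; stemming and related-term lookup would need separate machinery.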

------
chime
A few years ago, I decided that I wanted to make a really good Wikipedia
search engine. I downloaded the entire Wikipedia dataset, I downloaded
DBPedia, I set up a graph database, wrote code to query everything using
SPARQL/RDF, and after much trial and error, went a completely different route:
the Google Ajax/JS Search API.

<http://chir.ag/wiki/cat+disease+brain>

I did this because in the end, Google still gave very good results very fast.
I made mine search-as-you-type, so if I wasn't satisfied with the results, I
could re-edit my search text instantly.

I know that, for no-corporate-alliance reasons, Wikipedia probably won't use
Google search. If you're thinking of answering Greplin's call with an approach
based on textual analysis, my gut instinct says you will have a VERY hard time
beating Google. However, if you go the DBPedia route, you might end up with
something really neat. Powerset did something similar and got bought up by MS.

~~~
asarazan
It should be noted that the challenge is for design, not for actual search
algorithm implementation or anything technical.

~~~
a3_nm
I hope that the winning entry will be an actual implementation. It's not hard
to have ideas to improve Wikipedia's search; the hard part is writing the
code.

------
sgdesign
The thing is, Wikipedia's search is pretty straightforward: type keyword,
arrive on article page.

Anything you'd add to that would detract from the search experience, not
improve it. Seems to me the reason people use Google to search Wikipedia is
that it saves them a step, not that Wikipedia's search is inherently
broken.

In any case I'd like to take part in the contest, but so far I can't really
think of a way to improve Wikipedia's search…

~~~
pjscott
The problem is when your search doesn't bring you immediately to the right
article -- if it's not a simple search term like "Bermuda Triangle" -- and you
end up looking at a page of results like this:

[http://en.wikipedia.org/w/index.php?search=wireless+router+f...](http://en.wikipedia.org/w/index.php?search=wireless+router+firmware)

Those are actually pretty good results, but their presentation is lacking. The
snippets of article text are too short to give a proper idea of what the
article is about and how it relates to the query terms. The formatting is
straightforward but not particularly good or bad. The article metadata in
green is nigh-useless to the vast majority of people; do you want text like
"11 KB (1,170 words) - 12:19, 10 July 2011" cluttering up each search result?
The section headings, like "Content Pages" and "Multimedia" are simple links,
and it's not visually obvious under what circumstances you might want to click
them, or what will happen if you do.

That's just off the top of my head, and I'm not a designer; just someone who
cares about usability. I imagine a good designer could make something quite a
lot nicer than Wikipedia's default search pages.

------
ramidarigaz
I type "en<tab>" and then enter my query (in Chrome, at least). Doesn't get
much better than that.

However, if I didn't have that feature, I'd probably use Google anyway.
Tweaking the CSS won't change that. They'd really have to make Wikipedia's
search kick ass in some other way.

~~~
pjscott
Does Chrome's search give you realtime term completion? If you start typing in
the search box on Wikipedia, it will give you suggestions that are the titles
of articles. It's a substantial usability improvement over just typing
something, since it gives you the opportunity to skip directly to a page more
often, rather than going through the vigorously mediocre search results page.
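
For anyone wondering what title completion like this boils down to, here is a rough sketch in Python. It assumes a small sorted list of titles held in memory; the titles and the `complete` helper are hypothetical, and this is not how Wikipedia or Chrome implement it. Keeping the titles sorted means every title sharing a prefix sits in one contiguous run, so a binary search finds the start of that run.

```python
import bisect

# Hypothetical article titles, sorted case-insensitively so that all titles
# sharing a prefix are adjacent.
TITLES = sorted(
    ["Bermuda", "Bermuda Triangle", "Berlin", "Firmware", "Wireless router"],
    key=str.lower,
)
KEYS = [title.lower() for title in TITLES]


def complete(prefix, limit=5):
    """Return up to `limit` titles that start with `prefix`, ignoring case."""
    needle = prefix.lower()
    # Binary search for the first key that is >= the prefix.
    start = bisect.bisect_left(KEYS, needle)
    results = []
    for index in range(start, len(KEYS)):
        if len(results) >= limit or not KEYS[index].startswith(needle):
            break
        results.append(TITLES[index])
    return results
```

With this sample list, typing "berm" would offer "Bermuda" and "Bermuda Triangle", which is the skip-straight-to-the-page behaviour described above.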

~~~
DieBuche
Of course it does.

~~~
pjscott
Huh. Maybe I need to update Chrome, because it's not working for me. Weird.

EDIT: It was just a hiccup when I checked; the problem is gone now. Thank you
for pointing that out; from now on I'll use this method to search Wikipedia.

------
jasonwilk
My girlfriend still types Facebook into Google to go to Facebook. I don't
think it's an issue with Wikipedia.

------
wturner
I hope the winner gets built as a customizable addendum to the MediaWiki code
and released as an extension. This way, if Wikipedia doesn't use it, other
MediaWiki instances can.

------
achille
Here's a better search; I've mapped it to the keyword 'w':
[http://google.com/search?q=site:en.wikipedia.org/wiki/%20%s&...](http://google.com/search?q=site:en.wikipedia.org/wiki/%20%s&btnI=I%27m+Feeling+LuckyI%27m%20Feeling%20Lucky)

thus you can type 'w india china war' and it returns: Sino-Indian_War
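
For readers unfamiliar with keyword searches, the browser simply swaps the `%s` placeholder for the URL-encoded query. A minimal sketch of that substitution in Python follows; the `TEMPLATE` here is a simplified, assumed form of the URL above (without the "I'm Feeling Lucky" parameter), and `expand_keyword` is a hypothetical helper, not a browser API.

```python
from urllib.parse import quote_plus

# Simplified, assumed template: `%s` marks where the typed query goes.
TEMPLATE = "http://google.com/search?q=site:en.wikipedia.org+%s"


def expand_keyword(template, query):
    """Replace the `%s` placeholder in `template` with the URL-encoded query."""
    # quote_plus turns spaces into '+' and escapes characters such as '&'.
    return template.replace("%s", quote_plus(query))


url = expand_keyword(TEMPLATE, "india china war")
```

Here `url` ends in `india+china+war`, restricted to en.wikipedia.org by the `site:` operator.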

~~~
_delirium
> thus you can type 'w india china war' and it returns: Sino-Indian_War

That's the first result for Wikipedia's internal search, too:
[http://en.wikipedia.org/w/index.php?title=Special:Search&...](http://en.wikipedia.org/w/index.php?title=Special:Search&search=india+china+war)

------
dmbass
Using the actual Wikipedia search involves an extra step where I google for
"wikipedia". I can skip this step if I just google for the thing I want
directly on Google.

I think that says more about how good Google is than how bad Wikipedia is
(although I do agree that the Wikipedia results page needs work).

