A recent employee survey showed only 10% of WMF staff approved of the Executive Director, probably in large part due to things like this.
A critical take on the project as it has been handled: http://permalink.gmane.org/gmane.org.wikimedia.foundation/82...
https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2... is a pretty good (but dated) overview from Wikipedia's weekly newspaper, and there are a few other pieces in the Signpost and a few blog posts across the web.
This was a hypothesis about his removal when he was initially removed, and it has since been refuted by multiple sources.
Because Google is the best, it only gets better, and that introduces a huge barrier to entry for competitors. It used to be possible to know what people were searching for to end up on a given Wikipedia article, but that data is now available only asynchronously (and in limited form) through Webmaster Tools.
In my mind, the most interesting aspect of the announcement should not be how much money they have to spend, but how they plan on solving this paradox.
Example of a search query URL. Verified pages appear to get more weight.
Despite many real (though also some exaggerated) counter-examples, Facebook does have features to protect privacy.
One of those things is that you generally can't get very useful results from search unless you're friends with someone, or a friend of a friend (depending on the user's privacy settings). You can't see general trends, or even search for every person with a given first and last name in an area, for example.
It used to be more open, but they've heavily restricted the breadth of data returned from searches within the past few years.
On top of which, Graph Search is still disabled in many regions, for reasons that were never explained.
Ditto. I have seen this as a crippling deficiency in the Facebook platform. Not being able to search properly in my own posts or in the groups I'm a member of really sucks. For all the engineering prowess, open sourced tools, etc., shown by Facebook, the lack of a working search makes the company seem incompetent from the top down.
Some time ago I started storing important information (like others' links, my own comments, etc.) outside of Facebook, where I can find it easily. I also started a Facebook page and added Notes to it to make it easier to document, find and share things. It seems ridiculous that I'd have to do this just to have access to my own information, but that's been the sad state for years.
"1) Public curation mechanisms for quality;
2) Transparency, telling users exactly how the information originated;
3) Open data access to metadata, giving users the exact data source of the information;
4) Protected user privacy, with their searching protected by strict privacy controls;
5) No advertising, which assures the free flow of information and a complete separation from commercial interests;
6) Internalization, which emphasizes community building and the sharing of information instead of a top-down approach."
My first thought: How will transparency impact SEO? Will spammers be able to better game the algorithm when they know its internals?
However, I am excited at the prospect of a Wikipedia-like public curation system for the entire web. I admit I'm flabbergasted that the whole thing ever worked, but it does.
The Mozilla / Open Directory Project tried this. Curation doesn't scale, and it often assumes a single unifying ontology, which is particularly problematic in a cross-cultural context. Besides, 'quality' is not a unidimensional metric in a result set: consider timeliness, authority, notability, uniqueness, comprehensibility, etc.
Most search engines already include a URL. I can see a [crawldate] button, like the [cache] or [translate] buttons on each hit, adding some information, but it would be of dubious additional utility for most searches.
We already have DuckDuckGo; more entrants are welcome, but this is hardly a unique offering, nor a trustworthy one given Snowden's revelations regarding the scale of systematic Five Eyes traffic monitoring and recording.
DDG, or Google or Bing with plugins, can supply this. Not groundbreaking.
"6) Internalization, which emphasizes community building and the sharing of information instead of a top-down approach."
This is so amorphous as to be a non-point.
So out of six points, two (33%) are only useful in edge cases, one (17%) is too vague to be useful, and the other three (50%) are already implemented by others and have been tried before.
I would like to see the input of the former Blekko guys on this, https://news.ycombinator.com/user?id=ChuckMcM + https://news.ycombinator.com/user?id=greglindahl
Wikipedia is a pretty big exception to that assertion. Perhaps DMOZ (a clone of Yahoo circa 1996) is not the only way to do curation. Perhaps Wikipedia could apply what has worked for Wikipedia, i.e. develop a set of POV-neutral criteria for organizing collections of links and then invite everyone to participate.
It's really easy to be negative. But that's something that might at least be an interesting research project for the #1 open-curation system in the world.
The article is written once then modified or evolved occasionally by (almost exclusively) humans, but very frequently read. It is intended to be intelligible, being structured and based in natural language. It has a very well defined scope within a flat namespace, and often clear relations to multiple formal ontologies. It is structured to be consumed in part or in whole, and may contain rich media and strong supporting contextual information (related pages).
By contrast a search result summarizes a set of potential information sources that may answer a search query in whole or in part, to various definitions of "answer". It is generally written once, by a computer, and thrown away after some period of caching. It is intended to be concise. Each component result has relatively poor context, relying upon the searcher to interpret timeliness, authority, notability, uniqueness, comprehensibility, etc. with the limited information presented, typically a very short content excerpt. It is structured to be scanned, classically in a ranked fashion from "best hit" to "worst hit", and is generally a wall of text.
Wikipedia successfully attracts people to contribute to the former, but the latter - where the information product is generated on the fly and lasting impact is amorphous (nothing particularly concrete for contributors to point to and say "I did that! Warm and fuzzies!") - is a very different beast.
I too believe there is room for innovation ... there is potentially low-hanging fruit like inter-linguistic semantic queries (not keyword search) ... but no such key problem areas are identified in the paper's summary.
I'm imagining the edit wars and debates that take place on contentious wordings or facts in some parts of Wikipedia, but on a much wider scale involving hundreds of SEO consultants each aware that changing a particular criterion will have a quantifiable impact on their clients' bottom line. It doesn't sound like it would be fun to police.
And even the page text is not immune from the problem you describe. Grading and prioritizing sources is a fundamental part of producing a "reasonably neutral version of the truth." It's what determines what gets cited and how prominently it influences the article.
So while I wouldn't equate text and links in terms of the difficulty of managing POV-neutrality, I would say they sit on a spectrum.
DuckDuckGo is a meta-search engine! It relies mainly on the Yahoo BOSS API, which uses Bing search (for most countries). The Yahoo BOSS API went from free to expensive in early 2015, and the future of Yahoo (the tech company, not the Alibaba stock) is uncertain.
We definitely need more search engines; only 6 or 7 exist with broad international coverage. Search HN for the list; we've had this discussion before.
I was Blekko's founder/CTO. It's worth noting that our founding team was the Open Directory Project's founding team, and Blekko's curation data was even better than DMOZ's in its day. Check it out: https://github.com/blekko/slashtag-data
They'll have to keep the servers outside the USA then. It's illegal for European organisations to transfer personal data to the USA now that Safe Harbour is invalid.
Google ranks everything based on popularity, not on quality. Popularity and quality are two independent concepts and not necessarily related. That's something Wikimedia understands but Google doesn't.
Google does take plenty of quality features into account. PageRank is one, of course, but that isn't some corporate conspiracy; it's simply a good feature.
Interesting discussion on a search engine that does sort of the opposite of Google:
If a search engine let you differentiate and sort between content match vs. PageRank match vs. AdWords spend, we might be able to mitigate the issue somewhat.
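As a sketch of what that could mean concretely (a hypothetical result model in Python; the field names and scores are invented for illustration), the idea is to keep the signals separate and let the searcher choose the sort key rather than baking one blend in:

    from dataclasses import dataclass

    @dataclass
    class Hit:
        url: str
        content_score: float  # text match between query and document
        link_score: float     # PageRank-style popularity signal
        ad_spend: float       # advertiser budget; ideally weighted at zero

    hits = [
        Hit("https://example.org/deep-dive", 0.9, 0.2, 0.0),
        Hit("https://example.com/landing-page", 0.4, 0.8, 5000.0),
    ]

    # Let the searcher pick the ranking signal they trust.
    for h in sorted(hits, key=lambda h: h.content_score, reverse=True):
        print(h.url)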
I'd not heard of the CMD+ENTER method before, so thanks for the heads up. Still not entirely sure what Firefox is submitting in that case. Will test.
I wasn't referring to this site in the parent comment, but to my freelance site. I'll put that site in my profile for 24 hours just in case anyone wants to take a look.
EDIT: Fixed, as noted in reply to nl's comments. A recent change led to a redirect line being mistakenly commented out.
I don't work in this area, but I'd say that 99% of the time I hear someone complaining about how Google is favoring sites that pay for advertising over them, I find that they are making these incredibly basic errors.
For me, http://www.linguaquote.com/ gives a 404. It's only when I go to https://www.linguaquote.com/ that it works.
So I'm afraid it's not quite as simple as you make out in this case.
In other news, I've just pinpointed the missing line in the recently changed nginx config for Linguaquote: the http://www block had its redirect commented out. Still, for this site in particular, Google Webmaster Tools is set up for the https version, where no errors have been reported, and SSL Labs gave an A+ for the stapling, PFS, Heartbleed, etc. efforts I went to. I don't think this redirect was having an adverse effect on ranking, but I don't expect this site to hit the front page just yet; there's much more content to add before aiming for that.
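For anyone wanting to sanity-check that kind of fix, here's a minimal sketch in Python using requests (the expected 301 and Location target are assumptions about how the restored server block is configured):

    import requests

    # Fetch the plain-HTTP URL without following redirects; once the
    # redirect line is restored we'd expect a 301/302 pointing at https.
    resp = requests.get("http://www.linguaquote.com/",
                        allow_redirects=False, timeout=10)
    print(resp.status_code)               # expect 301 or 302, not 404
    print(resp.headers.get("Location"))   # expect https://www.linguaquote.com/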
I'd love to see an example (especially including the sites that should rank).
Will leave it there a bit longer in case you do come back to this thread as still genuinely interested in your opinion on the matter.
I have no idea what the current algorithm looks like but I'd be shocked if it somehow switched to evaluating the 'quality' of content, however one might do that with an algorithm.
Basically, when judging quality, Google is making assumptions like: "This page has a lot of backlinks, and those backlinks themselves have a lot of backlinks... Therefore this page is of high quality." This approach puts all the power in the hands of content providers (bloggers) who are funded by big companies (or well-funded startups) and who serve the interests of those companies.
Google wrongly assumes that content providers serve the interests of consumers and that they can be trusted, which is not the case.
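For the curious, the recursion being described ("backlinks whose backlinks have backlinks...") is essentially PageRank's power iteration. A toy sketch in Python on a made-up three-page link graph:

    # Minimal power-iteration sketch of the backlink recursion described
    # above (PageRank-style); the link graph is invented for illustration.
    links = {"a": ["b"], "b": ["a", "c"], "c": ["a"]}  # page -> outlinks
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    d = 0.85  # damping factor

    for _ in range(50):
        new = {p: (1 - d) / len(pages) for p in pages}
        for p, outs in links.items():
            for q in outs:
                new[q] += d * rank[p] / len(outs)
        rank = new

    print(rank)  # pages with more (and better-ranked) backlinks score higher

Pages with more, and better-ranked, backlinks end up with higher scores, which is exactly why well-funded link-building pays off.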
If anyone's interested, here's a demo of my own project: https://tuvalie.com/fae/?q=Albert%20Einstein
Treat yourself to this relatively expensive USB stick:
The original implementation of the Google search engine would get obliterated today, though I guess you have to start somewhere.
This holds true for nearly every human endeavor.
I imagine they will build a proof of concept and get more money for it later. Have volunteers work on building it to save money. Open source the project and have others look into fixing the issues with it.
So apparently it's not the Wikimedia Foundation's own money funding this project.
Before I give, I go to GuideStar, hit free preview (they try to trick you into a paid membership), download the last few years of Form 990s, and see if everything looks copacetic. I look at who is making the most money; there is usually one person making a very good living. California non-profits are much easier to scrutinize than Delaware non-profits.
In any case, https://wikimediafoundation.org/wiki/Financial_reports
And the Internet Archive is a lot more deserving.
For example something like the history of the Alaskan panhandle as seen from the US perspective is totally different when seen from a Canadian perspective.
I never use Wikipedia as a primary source of info; even for the linked sources, I try to use at least three independent sources.
I would certainly like to see "reliable and trustworthy information", but who do I trust to decide what is reliable?
For example, the Portuguese Wikipedia is shared by Brazil, Portugal (and other countries), with many controversial matters, especially around colonization over the last 500 years, where both sides have academic work to support their contradictory views. Whose views prevail?
An example of things being done differently is some ex-Yugoslav countries (Serbia, Croatia, Bosnia, Montenegro, etc.), whose languages are more similar to each other than Portuguese and Brazilian Portuguese are; each one has its own Wikipedia, with different articles on the same subject depending on its point of view. Lately, I've been seeing more of the Serbo-Croatian Wikipedia, which I think aims to unite the others.
I don't know which way is better, I'm just a user.
Another reason you can't trust anyone (and this is general to the Internet) is that shilling, and commercial and political interests aiming to change perception, are everywhere: on Reddit or Facebook, with or without sources. It's the worst aspect of the internet for me these days.
No need. It's generating $23 billion per year in operating income and growing, and there are no serious challengers. It's far more likely their search engine will be constrained under piles of government oversight in the coming years. About the time governments start worrying about a tech product is about the time it's just beginning to become less central; the exact same thing happened to both IBM and Microsoft.
It's very, very sad. And it's also a shameful moment for the WMF.
edit: and don't just think it's me saying it. The WMF has had a mass exodus of staff in the last week or so. If you speak to any WMF non-executive staff members directly, you'll quickly find out that morale is at an all time low, and confidence in the WMF Board is sitting at something like 12%.
The Knight Foundation is about as upstanding as you can get, so it can't be that (full disclosure, I've received funding from them, so I'm definitely not unbiased on that point). So, what exactly is it that's so shameful here?
See my comment here for just a few comments on this issue:
Frankly, there's a lot more - to understand the issue better you might want to read Liam Wyatt's blog posts:
It's not nice to be the only one here on HN pointing out that there are some absolutely massive problems going on at the WMF at the moment. But I'm an outsider who was once an insider, and I still know enough influential people, through Facebook and other channels, to see that there is a crisis happening right now within the WMF.
edit: I should note that, as an outsider who doesn't ever really want to be hugely involved in Wikipedia-related matters again (for various personal reasons not necessarily related to Wikipedia or the WMF), I don't have any hesitation in stating what I see; nobody can really come back at me, so I have no fear of reprisals.
The grant application you are looking at was only revealed due to a MASSIVE amount of controversy and pressure within the Wikimedia Foundation.
The community representative on the board (James Heilman) was removed the other day, in part because of concerns around this grant. You might want to look at the Wikipedia Signpost article he wrote about this:
Many people have questioned this. Lila, their Executive Director, seems to have conjured this up out of thin air, without consulting any WMF staff members or anyone in the various communities. Even highly influential, well-respected people like Tim Starling appear to have been blindsided by this.
Here is what Lila Tretikov wrote about the search engine:
It was my mistake to not initiate this ideation on-wiki. Quite honestly, I really wish I could start this discussion over in a more collaborative way, knowing what I know today. Of course, that’s retrospecting with a firmer understanding of what the ideas are, and what is worthy of actually discussing. In the staff June Metrics meeting in 2015, the ideation was beginning to form in my mind from what I was learning through various conversations with staff. I had begun visualizing open knowledge existing in the shape of a universe. I saw the Wikimedia movement as the most motivated and sincere group of beings, united in their mission to build a rocket to explore Universal Free Knowledge. The words “search” and “discovery” and “knowledge” swam around in my mind with some rocket to navigate it. However, “rocket” didn’t seem to work, but in my mind, the rocket was really just an engine, or a portal, a TARDIS, that transports people on their journey through Universal Free Knowledge.
From the perspective I had in June, however, I was unprepared for the impact uttering the words “Knowledge Engine” would have. Can we all just take a moment and mercifully admit: it’s a catchy name. Perhaps not a great one or entirely appropriate in our context (hence we don’t use it any more). I was motivated. I didn’t yet know exactly what we needed to build, or how we would end up building it. I could’ve really used your insight and guidance to help shape the ideas, and model the improvements, and test and verify the impacts.
However, I was too afraid of engaging the community early on.
Why do you think that was?
I have a few thoughts, and would like to share them with you separately, as a wider topic. Either way, this was a mistake I have learned enormously from.
(this can be found here: https://meta.wikimedia.org/wiki/User_talk:LilaTretikov_(WMF)...)
That's a very, very real problem. An executive director of the Wikimedia Foundation should never have felt too afraid of engaging with the wider community on an issue as fundamental as this one.
It's even more concerning that a half-thought-through idea didn't get discussed, and yet a grant application was made. All those who say the application is only for $250,000 are entirely missing the point: the entire project would be $2.5 million, and this is just the first, initial stage.
It's even worse when Jimmy Wales states that:
"To make this very clear, no one in top positions has proposed or is proposing that WMF should get into the general "searching" or "try to be Google". 
Yet that is precisely what is being done here.
The WMF appears to have known about this, because they seem to have made a large number of hires dedicated to search - which, I hear through contacts, was questioned at the time, as it seemed an odd way to allocate WMF resources.
There have been, I believe, 5 or 6 influential staff departures from the WMF in the last week. In fact, they appear to be haemorrhaging staff currently, with no real sign of abatement.
None of this is at all satisfying to me. I was very, very involved in Wikipedia years ago: I started their Admin Noticeboard, did lots of article work, and helped kick off some key things, one of which was a tag that I have to admit I have some mixed feelings about. But for such an important project, it saddens me greatly to say that, as an outsider now, it looks like things are being badly mismanaged.
I hope for everyone's sake (and not just the folks at the WMF) that this can be resolved. It's not like governance issues can't be addressed: when Sue Gardner was in charge of the WMF, things not only ran like clockwork, but she ensured maximum transparency, and we all trusted her implicitly because she earned that trust. I can't say the same for the current executive team.
I mean, there are certainly styles I'm unfamiliar with, and I'm always open to new experiences; that could be the case here. Hers just instantly set off red flags in my intuition. I hope she didn't always write like that, as it might mean whoever brought her in either fell for a con or was part of it.
I've been critical of her "rocket" imagery, but I like to think I'm understanding about the odd use of English in the rest of her comments. Especially as I'm monolingual; heaven only knows how I would sound if I tried to learn and speak Russian amongst Russians...
In a few weeks I'll publish the source code and do a Show HN.
I wish this project a lot of luck.
DuckDuckGo has also started to crawl the web with its own bot (right now they're using Yandex's API).
We need more competition from different countries. Just think about the censorship done by Baidu or how Google never plays by its own rules.
It's also interesting to think about ways to monetize a search engine. For kairos.xyz I was thinking about paid accounts (1 euro per month) providing more features, like the ability to search from the command line. For example, you write "kairos Richard Stallman" and it prints basic information about Richard Stallman in your terminal.
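A rough sketch of what such a command could look like as a thin client, in Python (the /api/summary endpoint and its JSON shape are invented for illustration; kairos.xyz exposes no such API as far as this thread shows):

    #!/usr/bin/env python3
    # Hypothetical "kairos <query>" command-line client.
    import sys
    import requests

    query = " ".join(sys.argv[1:])
    # Invented endpoint: imagine the paid account exposing a JSON summary API.
    resp = requests.get("https://kairos.xyz/api/summary",
                        params={"q": query}, timeout=10)
    print(resp.json().get("summary", "no result"))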
$ curl -4s http://kairos.xyz/ | grep title
$ curl -6s http://kairos.xyz/ | grep title
<title>Welcome to nginx on Debian!</title>
$ host kairos.xyz
kairos.xyz has address 188.8.131.52
kairos.xyz has IPv6 address 2604:180:0:a54::24d9
"If you see this page, the nginx web server is successfully installed and working on Debian. Further configuration is required.
For online documentation and support please refer to nginx.org.
Please use the reportbug tool to report bugs in the nginx package with Debian. However, check existing bug reports before reporting a new bug.
Thank you for using debian and nginx."
I meant here: http://kairos.xyz/
"Who is Richard Stallman?
10 USD to EUR
help - about"
The only thing I can think of right now is that the HTTP "Host" header field is not being sent. I have several sites on the same server; nginx is used as a reverse proxy and uses the Host field to route traffic to different ports.
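That theory is easy to test by hitting the bare IP with and without an explicit Host header, e.g. with a quick Python sketch (the IP is taken from the `host kairos.xyz` output above; whether the vhost is keyed on exactly "kairos.xyz" is an assumption):

    import requests

    # Without a matching Host header, nginx serves the default vhost
    # (the Debian welcome page); with it, it should route to the site.
    ip = "188.8.131.52"
    print(requests.get("http://" + ip + "/", timeout=10).text[:80])
    print(requests.get("http://" + ip + "/",
                       headers={"Host": "kairos.xyz"}, timeout=10).text[:80])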
Surely this is not designed to be written from scratch, so...
- Are they using known lexical & semantic scanners?
- Is it focused on English language first?
- What crawlers will scan content?
- I'll assume it's an open platform, but what license for contributors?
- What database architecture will hold the graph?
- How does it know the mark of authority, and is this based primarily on human input or machine learning?
I'm sure $2.5M won't touch the sides, but if it's a well-directed project, with healthy user contribution and based on interesting technologies, they might develop a good backbone architecture. Ambitious, for sure.
Maybe Wikipedia should launch a video encyclopedia to try to provide a 5 minute video of every article, for people who like videos more than reading.
I guess the foundation grants aren't really a response to the Jimmy banners; the company matches probably are.
EDIT: Not because it's written in PHP. Because it's architected poorly.
I get really concerned when I hear that the person who holds the vision and direction for the Wikimedia Foundation didn't really participate in its projects beforehand, and I get even more concerned when I see her branching off into proposals for search technology that appear to be far outside the scope of Wikimedia projects.
Nobody has ever thought search in Wikipedia or the various projects was particularly effective. However, bringing everything together doesn't just involve searching, and frankly there are a number of more pressing governance and community issues that need to be managed.
Perhaps I'm being a bit unfair here, but she was profiled when she first joined the WMF Board, and the following was said about her:
At the meeting, she described the impact on friends and family of the Chernobyl nuclear disaster, and the difficulty of getting reliable information in the face of “so much secrecy.”
Yet we see that this is precisely what happened with this grant proposal. A major grant was applied for and awarded and not even WMF staffers knew about it. You can see on the mailing list that it was a total shock when it was finally revealed.
I'm watching this train wreck from afar, though closer than others because some of my friends are deeply involved in Wikipedia and the WMF. I'm always amazed that a leadership change can completely kill an organisation. I've seen it in the corporate world, and I see it all the time in the volunteer world as well. The Wikimedia Foundation seems to be yet another victim of the appointment of a clueless leader: someone with no experience in the area or with the group they are meant to be leading, thrashing around, making changes without really understanding how the systems work or the history of the organisation, and without relying on the experience and sage advice of the many expert and dedicated people around them, ultimately leaving a great deal of unnecessary turmoil, ill-will and, frankly, destruction in their wake.
There was "Wikia Search" by Wikipedia founder Jimmy Wales:
"Wikia Search was a short-lived free and open-source Web search engine launched by Wikia, a for-profit wiki-hosting company founded in late 2004 by Jimmy Wales and Angela Beesley.
Wikia Search followed other experiments by Wikia into search engine technology and officially launched as a "public alpha" on January 7, 2008. The roll-out version of the search interface was widely criticized by reviewers in mainstream media. After failing to attract an audience, the site closed by 2009."
I used Wikia Search back then; it was good enough (like Bing compared to Google at the time).
It was based on Apache Nutch and Solr/Lucene (and Hadoop?) ...
Maybe you can rely on the Lucene or SphinxSearch projects to kick-start it.
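For anyone unfamiliar with what those projects provide, the heart of a Lucene- or Sphinx-style engine is an inverted index mapping terms to documents. A toy sketch in plain Python with made-up documents (a real deployment would of course use Lucene/Solr or Sphinx directly):

    from collections import defaultdict

    # Two invented documents to index.
    docs = {
        1: "wikipedia plans a new search project",
        2: "lucene powers many open source search engines",
    }

    # Inverted index: term -> set of document ids containing it.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)

    def search(*terms):
        """Return doc ids containing every query term (boolean AND)."""
        hits = [index.get(t.lower(), set()) for t in terms]
        return set.intersection(*hits) if hits else set()

    print(search("search"))          # {1, 2}
    print(search("open", "search"))  # {2}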
I hope Wikipedia brings some innovation to search, untethered from advertising revenue.
How long will it be until Google can algorithmically generate its own Wikipedia articles? Wikipedia relies upon people coming to its site for contributions and donations; without search, Wikipedia risks being subsumed by Google. They are in the difficult position of thinking about the future without pissing off Google.
Computers are getting more and more powerful, and Wikipedia needs to do this to stay relevant. I think this is the right decision.
I think this is an incredibly good use of their money. Google is the world's biggest surveillance machine; I hope that Wikipedia can do to them what they've already done to Encarta.
It's hard to get worked up over some other team's morale. If it's such a crappy place, just quit. They could probably literally go across the street and get a new job. I don't really care; it's all way too inside baseball for me.
You might never have contributed in a significant fashion to Wikipedia and other WMF projects, but I did. Sure, I didn't get employed, but then again I know a lot of people I met and have continued to be friends with who are still deeply involved. You may not care about your friends' morale, and you might think it's easy for people to "go across the street and get a new job", but then you seem like a pretty thoughtless person.
Of course, you've not understood at all what the larger issues are. You must have a bit of a comprehension issue, because I supplied quite a few links that you apparently read that explained the underlying problems.
Just remember though: you don't care :-)
1) Being genuinely confused about what the big deal is isn't the same thing as caring. It's asking for confirmation of a conclusion.
2) If you're in a situation where you're unhappy, then you have a responsibility to make yourself happy. Staying around in a crappy situation and whining about it doesn't help, and neither does insulting people.
3) Wikimedia is in San Francisco. If I had to guess, I would say there are literally a hundred other tech organizations in that city alone, including nonprofits with a societal purpose; 18F comes to mind. Again, see 2.
1) You literally wrote "I don't really care".
2) They aren't whining. Saying so is pretty much a personal attack, and it's certainly insulting. I don't appreciate it, and I'd say neither do they. Funny how that works both ways.
But interestingly enough, as has been pointed out already - people ARE leaving in droves.
I'm no longer involved in Wikipedia, but I can still be unhappy with the direction they are taking.
3) If you think that just leaving a non-profit you're emotionally invested in is an easy decision, then you really haven't thought things through. If you think it's elementary to just step out of one job and into another, that's also thoughtless.
Jimmy Wales already tried to make a "Google killer" ten years ago. It was tilting at windmills, to say the least. Letting individuals help manage algorithmic search results was harder than you could imagine, and let's not even get into the difficulty of building an effective crawler.
One of Wikia's former CEOs, Gil Penchina, notoriously undervalued search as a result of this very public gaffe. By the time I came in, it took over five seconds to do a simple on-wiki search. Searching across wikis took so long they actually just sent the search to Google and had you abandon the site. I personally fixed a lot of these problems, and that part was pretty cool.
So now let's get to the subject at hand, which is a search feature based on an authoritative knowledge graph. Something like this should adequately surface factual information in an intuitive manner -- optimally based on natural language. Wikia already tried this, too. They brought on a very seasoned advisor who played a crucial role in the semantic web movement far back into the early oughts. I remember going to semantic web meetups in Austin when I was in grad school quite some time ago now to hear this guy talk.
This guy was essentially the SF-based manager or lead for a small team located in Poland whose job it was to take some of the "structured data" at Wikia and attempt to build some kind of knowledge graph on top of it. This project was unsuccessful.
So why did it fail? We'll start with a lack of product direction. Wikia had, and probably still has, a very junior product organization mostly interested in the site's UI and (recently) a focus on "fandom" (yuck). The team allocated to the project was based in Poland (Poznan, to be exact), primarily kids coming out of a technical school on their first job. Your assumption that communication was a problem would be correct. However, the subject-matter expert was so entrenched in his area of specialization that the problem was compounded even further on the native-English-speaker side. There was too much getting into the weeds, and not enough focus on incremental progress.
To make things worse, they tried using a proprietary, not-ready-for-primetime data store because it most closely matched the SME's preconceptions about how the data should be structured. There was absolutely no existing business use case for this data store, and problems getting it to work turned even building a simple demo into a death march.
Either way, what I'm saying is, $250,000 is not enough to solve this problem. We have attempted to solve this problem before in the MediaWiki world. It's not going to magically get better. To make something like this work, you need:
1) Best-in-class UX people who would know how a knowledge graph provides a significant improvement over existing solutions
2) Leadership that can bridge the gap between SMEs and implementers
3) Very skilled engineering resources with backgrounds in less conventional technologies
This is a massive investment that no one is willing to spend on what is essentially a media play.
About six months later, I had built a proof of concept that sucked data out of MediaWiki infobox templates into Neo4j, a well-supported graph database. I was able to answer questions like "Which cartoon characters are rabbits?" and "Which movie won the most Oscars in 1968?" using the Cypher query language.
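Roughly what that looks like, as a minimal sketch in Python using the official Neo4j driver (the Character/Species labels, relationship type, and property names here are invented for illustration, not the actual Infobox-derived schema):

    from neo4j import GraphDatabase

    # Connect to a local Neo4j instance (credentials are placeholders).
    driver = GraphDatabase.driver("bolt://localhost:7687",
                                  auth=("neo4j", "password"))

    with driver.session() as session:
        # Hypothetical Cypher for "which cartoon characters are rabbits?"
        result = session.run(
            "MATCH (c:Character)-[:HAS_SPECIES]->(:Species {name: 'rabbit'}) "
            "RETURN AS name"
        )
        for record in result:
            print(record["name"])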
At that point in time, Wikia had decided they were tired of investing in structured data, and wanted to re-skin the site for a third time in as many years to make it look more like BuzzFeed.
Structured data is cool. In many cases, unsupervised learning may be what you're actually looking for. But in the end it has to satisfy a real user's needs.
Wikipedia has five million English articles; Wikia has over 20 million. As far as capitalizing on this wealth of knowledge goes, the devil is truly in the details. But it's a real shame that all of that information isn't put to better use than encouraging the socially maladjusted to take quizzes about which anime character they're most like.
Or I'm just making the number up. Doesn't really matter to me.
Search for any news item and you'll have all articles published more than 2 minutes ago included in your results, all blog posts, everything. They consume it all, and offer the output in near-real-time.
Wikimedia don't have the resources to do that. And they especially won't without advertising to pay for it.
Bing cost MS $5.5 billion, and that was in their field of expertise.
So they are trying to compete, using $2.5 million, with software backed by multiple billions of dollars, hundreds of thousands of servers, tons of data, thousands of developers, ML integration, etc.?
Good luck with that. Many have tried, backed by many times more resources than this $2.5 million; unfortunately, all failed.
(Edit: Who decided that enter is not equal to enter?)
Search "Are cookies really bad for me" and find an answer that supports what you want to hear.
Especially if they can make something that actually does rival Google... other companies have spent billions and not gotten very close.