Wikimedia Foundation director resigns after uproar over “Knowledge Engine” (arstechnica.com)
68 points by timemachine on March 1, 2016 | 39 comments



Why did the community react with such vitriol to the search engine product? I would like a free, open, privacy conscious search engine to exist.


Relations between those what run wiki and those what own wiki have always been strained.

I'd imagine the LYING about the new product idea is a bigger cause of vitriol than the product itself.

Wikipedia editors had been asking to see documentation about the Knight Foundation grant for several months, but Wikimedia was not forthcoming with the details.

Earlier this month, documents related to the grant were leaked to and published by The Signpost, Wikipedia's online newspaper. In a special report, The Signpost published the 13-page grant agreement and ran an article asserting that the "Knowledge Engine" would be, contrary to statements by Jimmy Wales and other board members, some type of generalized Internet search engine.


But were they actually LYING about the new product proposal?

Or maybe they were just hashing out what they'd like the proposal to be, and how it's articulated, before putting it forth for community review?

If you've ever been involved in anything "management-esque" you'll know this is quite a common route to take. It's often better to wait a bit before unveiling a proposal -- until it's gelled, and you have a clear articulation of what it's going to be -- rather than dump some half-baked statement of it on everybody (which would just burn through their time, and have other negative consequences).


> If you've ever been involved in anything "management-esque" you'll know this is quite a common route to take.

Yeah, but there is a difference between ideas that aren't ready to be broadcast, and betraying the trust someone has placed in me with the understanding there are just some things I have to operate on in secret. And if you've ever been involved in anything "management-esque," you'll know that a mutual acceptance of what barriers exist doesn't preclude being deceitful and dishonest in protecting that knowledge.

I won't say the line is perfectly clear, but let's look at the scenario in question: responding in a hairsplitting manner that serves to mislead someone asking pointed questions about ideas behind the curtain. It occurred at a non-profit that values transparency, the ideas were in flux but firm enough to seek grant funding, and there were no externally-imposed legal constraints to stay quiet on it. Well, that seems quite firmly in the "lie of omission" category.


I get your general point. About general principles and such.

But please understand that as an outsider, it's difficult to see (from what you've written above) where trust was broken, or what other lines have been crossed. You do seem quite strident in your overall judgement of what happened. But I still don't have a clear picture (or any picture really) as to what did happen.

You know, an event chronology, or the like? That might be helpful.


Amen.

It's an unfortunate effect of social media and the information overload. People get flooded with info. They spot hypocrisy. They think they are doing God's work by pointing it out. They get encouraged by the likes and retweets. And we get an endless cycle of people who have no idea what to do about hypocrisy upvoting each other.

The architecture needs to change. We have hit peak hypocrisy detection. You can detect hypocrisy in anything by spending a few days looking into things just like the "activist" editors have. The kind of people who do things about hypocrisy...well...where's the algo for that?


I think we are close, but we have yet to reach peak hypocrisy detection. We could have a wiki of hypocrisy, where you can search for a certain person and find out all the times they contradicted themselves, possibly with justifications. (Maybe evidence changed their minds and they stand corrected.) We could have a hypocrisy rating weighted by influence on the matter. For example:

Mr. John Smith has contradicted himself x% of the time in matters where he has a strong influence and provided a justification backed by evidence for only y%. On z% of the topics he has contradicted himself, he has offered a threshold of evidence required to change his mind again. He has violated j% of those thresholds. All this gives him an overall coherency rating of k%, which puts him nth in the ranking of his colleagues.

That would be peak hypocrisy.


Because they are spending donated money on something with a huge chance of failure and limited upside.

Some probably even suspect that it is a vehicle for funneling the donated money to the top management's personal pockets since that's a good explanation for why they would go ahead with something so ill-advised.


Even ignoring how the money is used, some of the recent donation drives have been really off-putting. Multiple times I've been scanning an article on my phone to find the section I need, only for the page to grey-out and become non-interact-able because a donation request banner[1] has loaded over the top of the article. It's not a huge deal (scroll to the top, close it, and continue), but annoying your users is a pretty poor way to solicit donations IMO.

[1] Similar to https://upload.wikimedia.org/wikipedia/commons/f/f1/FY1415Mo...


And while this may sound far fetched, the lack of transparency hurt the cause of the executives. "If you aren't hiding something, why are you hiding something?"


Those search engines exist, among them

  - https://search.disconnect.me
  - Duck Duck Go

Look at it this way: they would try to create a competitive advantage. Since they are also behind Wikipedia, they could make better use of it, and in the process they could create an API to access information in a way that other search engines can't.

A better proposal would be to create a project to start adding semantics to the Wikipedia project, maybe an RFC and standards so all search engines can benefit from it.

Another option could be an open sourced project to host blogs making use of the above described tech.

Then you could perform a search based on the "intended meaning" instead of on the words included and who references it.

Edit: Removed Yandex per comment below after research.


Sorry, Yandex?


I stand corrected. I erroneously implied that because DDG uses Yandex's results, Yandex follows the same privacy policy, which it does not.

" 2.3. Data collected by or transmitted to Yandex, the Company or Partners, in the course of accessing, interaction and operation of the Site and provision of the Services may include, without limitation, the following Data: (i) Internet Protocol (IP) address and location; (ii) cookie information; (iii) browser identification information; (iv) information on your software and hardware; (v) date and time of accessing the Site and the Services; (vi) information of third parties websites referred the Site or the Services; (vii) information related to your activity in the course of the Services use, including, without limitation, search queries history, search results provided to you in response to your query, web pages you visited by reference from the search results; (viii) other information."

https://yandex.com/legal/privacy/


Neither of those search engines is even remotely equivalent to a free, nonprofit-managed engine that doesn't track users and provides Free source code and documentation to operate a clone of the index.


No - they may look similar, at a very superficial level.

But when you think about it more, they're quite different from what's being proposed for the Knowledge Engine.


When this event was posted here a few days ago[1] I spent probably an hour or so reading through the timeline[2] posted in one of the comments.

To a non-wikipedian such as myself it's a little opaque, but builds to quite a climax. It probably helps one to understand the depth of the resentment Tretikov engendered amongst some of the staffers.

  1: https://news.ycombinator.com/item?id=11176955

  2: http://mollywhite.net/wikimedia-timeline/


As much as I value Wikipedia as a resource, I've personally never understood the culture behind it. Long-term, to me, it's unclear what the future holds for it.


It's both a never ending disaster and the greatest resource in the history of the world.


That's most of human nature you're describing there.


That's a great way to put it. With the usual things-can-always-be-improved caveat, I suspect that the same things that make Wikipedia such a valuable resource (crowdsourcing, broad scope, etc.) will continue to inevitably lead to tensions--whether organizationally or over things like inclusion/deletion.


Sounds like a description of democracy too. :-)


A very simple and well-written version of the thoughts I've been having around Wikipedia.


I feel like it would be a good resource/library to have once we can just upload information straight to our minds or accessible by thought.


Given the amount of politics involved in editor circles I don't even think valuing it as a resource is that great.


Show me an editor circle free of politics and I'll show you a circle of one.


> Long-term, to me, it's unclear what the future holds for it.

Why specifically? (Honest question).


To me, there are three parts to Wikipedia: content, community, and organization. Pretty obvious the content has the potential to live on, though without a solid community, it'll get rotten pretty quickly. As for the organization, it seems to me pretty driven by the founder, though I might be wrong, and I'm not sure what to make of it. Basically, my opinion is that if the community dies, Wikipedia dies.


The Wikimedia Foundation should treat its treasure trove of donations as an endowment and use it to guarantee the survival of Wikipedia. If they just invested it, they may never need to do a donation drive again.


Disappointing to see this die. I would really like there to be some open, public API with a knowledge engine backend that could be part of an open source alternative to Siri, Cortana, Echo, etc.


Some projects like Wikidata and DBpedia provide public APIs. You can query them using SPARQL, for example.
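
As a rough sketch of what that looks like for Wikidata: the snippet below only builds the GET URL for the public Wikidata Query Service endpoint and does not send it. The query itself (five items that are an instance of "house cat") and the helper name `build_sparql_url` are illustrative assumptions, not anything from the projects' docs. DBpedia has its own endpoint, which isn't shown here.

```python
from urllib.parse import urlencode, parse_qs, urlparse

# Public SPARQL endpoint of the Wikidata Query Service.
WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

# Illustrative query (an assumption for this sketch):
# five items that are an instance of (P31) house cat (Q146), with English labels.
EXAMPLE_QUERY = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 5
""".strip()


def build_sparql_url(query: str, endpoint: str = WIKIDATA_SPARQL_ENDPOINT) -> str:
    """Return a GET URL asking the endpoint for JSON results (hypothetical helper)."""
    return endpoint + "?" + urlencode({"query": query, "format": "json"})


if __name__ == "__main__":
    # Prints the URL; fetching it with any HTTP client would return JSON bindings.
    print(build_sparql_url(EXAMPLE_QUERY))
```

Fetching the printed URL with any HTTP client should give you the result bindings as JSON. Whether that is pleasant to explore with is a separate question.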


Having actively looked into using Wikidata as part of a system... Hahaha ... yeah it's great for Wikipedia but pretty useless for anyone else. Don't get me started on how useless SPARQL is for "explorative search".


Good riddance. Everything about her has been a disaster. From her single edit history to her statement that she was afraid to address the community.

She even has her name on a damn SOFTWARE PATENT. What were they thinking when they put her in charge?


If you have a list of programming heroes, it would be a nice exercise to find out how many of them have their names on a software patent. Probably quite a few of them.

I don't like software patents, but having your name on one isn't the same as being a patent troll.


If she has a software patent, it could mean she understands how the system works. You wouldn't hire a lawyer who doesn't have a title; on the other hand, one who has written a law would appear more knowledgeable.

What I suspect is the problem is this

"The Board tasked me with making changes to serve the next generation and ensure our impact in the future,"

I wonder whether this is not politically correct wording for "let's make money".


> If she has a software patent, it could mean she understands how the system works. You wouldn't hire a lawyer who doesn't have a title; on the other hand, one who has written a law would appear more knowledgeable.

You're being too considerate. It's not like hiring a lawyer who's written a law or has a title. It's like saying a lawyer needs to commit a crime to be a better lawyer.


Except that software patents are not criminal, they are legal. You may wish they weren't but they are. His analogy is far more appropriate than yours.

I also wish software patents didn't exist. But I'm not foolish enough to think I can pretend they don't.


> It's like saying a lawyer needs to commit a crime to be a better lawyer.

That's a ridiculous analogy. I am no fan of software patents either but likening them to a crime is just absurd.


a bit like file sharing then


I don't know her work history, but I can easily speculate innocuous reasons why her name could be on a patent.

1/ A previous employer may have filed it on her work. There doesn't need to be any specific agency or effort on her part to do so.

2/ In her early career, when patent trolls were not so prevalent, she might not have considered patents to be a bad thing. Her current thoughts on patents might be quite different.



