Show HN: Explore Wikipedia edits made by institutions, companies and governments (ailef.tech)
261 points by ailef on Dec 3, 2022 | 94 comments
Hi HN!

Wikiwho is a tool that scans Wikipedia edits and extracts those coming from specific IP ranges associated with known organizations. I made this as a for-fun side project two years ago.
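
As a rough illustration of the idea (not the actual Wikiwho code), matching an edit's IP against known ranges can be sketched with Python's standard `ipaddress` module. The organization names, the `ORG_RANGES` mapping and the `find_org` helper are made up for this example, and the addresses are reserved documentation ranges rather than real organization ranges.

```python
import ipaddress

# Hypothetical mapping for illustration only; these are reserved
# documentation ranges, not real organization ranges.
ORG_RANGES = {
    "Example Agency": ["192.0.2.0/24"],
    "Example Corp": ["198.51.100.0/25", "203.0.113.0/24"],
}

# Parse the CIDR strings once so lookups don't re-parse them.
PARSED = [
    (org, ipaddress.ip_network(cidr))
    for org, cidrs in ORG_RANGES.items()
    for cidr in cidrs
]


def find_org(ip_string):
    """Return the organization whose range contains ip_string, or None."""
    try:
        ip = ipaddress.ip_address(ip_string)
    except ValueError:
        return None  # not an IP address (e.g. a registered username)
    for org, network in PARSED:
        if ip in network:
            return org
    return None


print(find_org("198.51.100.17"))   # Example Corp
print(find_org("198.51.100.200"))  # None (outside the /25)
print(find_org("SomeUser"))        # None
```

A linear scan like this is fine for a few thousand ranges; a larger dataset would probably want a sorted list or a prefix tree instead.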

If you want to read more on how it works I've written a short blog article about it here: https://ailef.tech/2020/04/18/discovering-wikipedia-edits-ma...

I had already posted it here at the time (previous discussion: https://news.ycombinator.com/item?id=22907200) but I've now decided to release the code openly, hence the repost.

If you're interested, you can check the repo here: https://github.com/aileftech/wikiwho (disclaimer: the code is a bit clumsy).

Cheers!




One amusing edit: http://wikiwho.ailef.tech/diff/7d674d710c8d1328f0f74f6b351ff...

IPs from the European Parliament editing out connections to Cambridge Analytica from Alex Phillips' (member of the European Parliament) page: http://wikiwho.ailef.tech/diff/584f4588ec12334300a448f39ae4c...


Good to see that the latter was reverted within 5 minutes, and the removed content remains there to this day.


Virgil Griffith had earlier built a similar tool and paid dearly for it after exposing CIA and FBI edits: "16 Aug 2007: CIA and FBI computers used for Wikipedia edits ... The program, WikiScanner, was developed by Virgil Griffith of the Santa Fe Institute" https://www.reuters.com/article/oukin-uk-security-wikipedia-...


Yes, I've mentioned it also in my original blog post. What do you mean though by "paid dearly for it"? I was not aware of anything particular happening.


Maybe pertaining to persecution like this: https://www.coindesk.com/policy/2020/12/11/virgil-griffith-s...


It's possible, but as much as I dislike what's happened to him I think it seems a bit far-fetched to see it as retribution for creating WikiScanner.


Came here to mention Virgil, glad someone beat me to it. =)

It's truly amazing how much can be gleaned from a perusal of the edit history for a page from the beginning. I assume much goes undetected.


My wife's best friend used to work for Wikipedia and said they had to block any edits made from the IP range belonging to the US Capitol building. Maybe they need to just do this for all legislative offices of all countries.

Not that it stops the same people from just making these edits from home or learning to use a VPN.


It seems that such blocks would mostly just reduce transparency.


One of the Navy Network Information Center (NNIC)'s top edits is the Subic rape case.[0] Predictably, and unfortunately, it's essentially all edits trying to play down the case or cast doubt.

I've been a Wikipedia editor for a few years now (mostly botany-related pages) and I've been quite the pessimist about the platform for almost as long. But browsing this website has gotten me really demoralized, to be honest. Many of these edits seem to come from individual contributors, but some of them look downright coordinated.

[0] https://en.wikipedia.org/wiki/Subic_rape_case


I am convinced the military has teams dedicated to maintaining their wikipedia information.

And that isn't including a high ranking commander having their subordinates do something for them unofficially.


> One of the Navy Network Information Center (NNIC)'s top edits is the Subic rape case.

What's the tl;dr of what happened here? It looks to me like someone in the US military raped a Filipino girl, and then either blackmailed her family or did something shady in order to get her to recant. Meanwhile there were edits on the page to disregard her initial allegations[0]

[0] http://wikiwho.ailef.tech/diffs/c5885fe11dbfd31923f7554cf41c...


I completely lost trust in Wikipedia once I realized recent history is literally being rewritten by biased editors.

For one, check the edit history of anything Russia/Ukraine war related, it's a complete shitshow.


> For one, check the edit history of anything Russia/Ukraine war related, it's a complete shitshow.

I went and checked randomly a few edits. I’m not seeing anything I would describe as a “complete shitshow”. Can you please provide examples and tell us what you find objectionable?


You should look better then, I won't do your job for you. Keep in mind, English is not the only language on Earth, and definitely not the only one on Wikipedia.


History was always written by the victors, we only see it happen in real time now.


> History was always written by the victors, we only see it happen in real time now.

A modern spin on this: history is no longer written by the victors, but by people with literally no life outside of writing crap on the internet


This was true of medieval clerics and Enlightenment historians as well. It's almost always the idle writing history because everyone else is too busy working or living their own lives. The exception to this rule is patronage history (e.g. paying others to write flattering histories) but that's exactly what this kind of astroturfing on Wikipedia is.


Not sure how that links to the article considering that these IP ranges are linked to professional institutions.


> A modern spin on this: history is no longer written by the victors, but by people with literally no life outside of writing crap on the internet

There’s perhaps a narrow sense in which this is true of how it is written, but that's really not the sense the original gets at, anyhow.

Which elements of the writing of those “people with literally no life” gets distributed, amplified, and becomes crystallized as history is still actively shaped by the victors, not only of the clashes between societies, but of the class and subculture conflicts within them.

And that editorial and distribution control, not the actual mechanical act of writing, is what the original was about.


> A modern spin on this: history is no longer written by the victors, but by people with literally no life outside of writing crap on the internet

Still acting on the basis of the propaganda they drank like kool-aid, but pretty much. Hard to rule out that some biased editors are actually paid by organizations/parties, though.


A simpler way to describe history that is happening in real time is things that are happening now. This is not a case of "history being written by the victors," this is an example of populations being propagandized by their governments and oligarchs about the things those governments and oligarchs are currently doing.

The sense of alienation is overwhelming here. We're not reading stories about something that once happened to someone, we're being sold stories about what is happening to us, and what we are doing, right now.

-----

edit: Walter Lippmann did this stuff for a living, and wrote a lot about it and the political implications of imbalances of information. I don't think it's such a huge difference that people 100 years ago only got two newspapers a day worth of information (early and late editions), and we now get our information rations in smaller portions.


What’s a pivotal work of Walter Lippmann one could read?


...and to a much greater degree than people realize, really. As a simple example, take WW2. There were several polls over the decades in Germany asking which country contributed the most to the fall of Nazi Germany.

In the beginning, it was mostly attributed to Russia (something like 80%) with the USA mostly protecting Europe from getting integrated into Russia after the fall. But over time this opinion has been overturned to being mostly the USA with Russia playing a minor role... And the fact that Russia could've likely just taken over Europe completely has been forgotten entirely.


>> And the fact that Russia could've likely just taken over Europe completely has been forgotten entirely.

Well, seeing how Russia can't even take over Russia today, people have doubts! That's a joke, don't get all serious ;-)


s/Russia/Soviet Union/g.

The huge role of the Soviet Union was pushed by the Soviet Union as part of its Great Patriotic War narrative and is continued to this day by Russia's propaganda. Without the US Lend-Lease, the situation could have turned out much differently for the Soviets (as even Stalin himself admitted).


And the Royal Merchant Marine making huge efforts at resupply, and the UK's own material response.


I wasn't trying to claim that the USSR was the sole contributor to the defeat of the fascists, sorry if it sounded like that.

I just wanted to make a pretty strong example of the winners rewriting history and how this propaganda becomes fact for the society.

I'm sure everyone here agrees that it's good that the fascists lost the war and that the USA enabled Europe to stay democratic. It was a very brutal period of time in which human life was sadly undervalued.


Ironically, a better example of this would be your own claim, which focuses solely on the crimes of Nazism without any mention of Communist crimes. Since the USSR was on the Allied side, Communism never quite turned into the embodiment of evil that Nazism has become. Due to this, today many academics are proud to call themselves Communists, whereas you would be hard-pressed to find any self-proclaimed Nazis, at least in the mainstream of academia. All because history is written by the victors.


For the record, I never claimed anything beyond that the public opinion changed over the years, likely because of propaganda. A lot of factors were at play, including the never-ending resistance by good people, both in the Reich and in occupied territory, the categorical extermination of educated people who didn't subscribe to the Nazi beliefs, and more. Summing everything up to be because of "n" has always been incorrect.

Which is why I didn't attribute truth to either of these claims. Instead I pointed out that the USSR would've likely taken over Europe without the direct involvement of the USA. I don't think that this is controversial at all; or do you believe they would've stopped in the middle of today's Germany if they weren't forced to?

Beyond that, Nazis are hardly vilified, because they're literally villainous. Please remember that Nazism isn't National Socialism, what the Nazi regime was based on. It was one specific version of it, which includes mass murdering vast amounts of people. You can make an argument that national socialism isn't necessarily bad, but doing the same for the Nazis means you also condone mass murder, as a simple example: exterminating queer people and people with disabilities was core to their doctrine.

I'm utterly at a loss how I'm supposedly ignoring the crimes of the USSR. Would I have been glad that the USA protected Europe from it if I considered them faultless?


I'd like to add a bit of historical context: communism as an ideology, field of study, and set of ideas was quite widespread across Europe and beyond (well before the Russian Revolution, and after).

Karl Marx's ideas were very revolutionary for their time (a good primer can be found in the book The Value of Everything) and set a lot of ideas in motion. And there was also debate in that regard on how to institute communism. So communism could never reach the “evil” moniker like Nazism, even though it was apparent the Soviet Union was quite a brute. So the ideology was kind of separated cognitively from its implementation by the Soviet Union.

In the Cold War period, the SU was definitely seen as something that had to be defeated. There was a lot of fear of nuclear escalation between the superpowers. The Soviet Union was seen as different at best, and something to be defeated in all cases. And yes, also evil. Just watch some action movies from that time to get a general idea.


And with revision log.


Of course it is gonna be a shitshow. It’s still happening! Everybody has a vested interest in making those pages reflect their version of reality. And the non-cynic part of me says that is fine and to be expected. In the fog of war there is no true, objective reality yet. It takes a good long while for humanity to settle down and cooler heads can then piece together an accurate view of events.

There is plenty of good content on Wikipedia… you just won’t find it for highly potent current events unfolding right this instant.


Of course, this only catches the relative "amateurs". Proper professionals change IP address before making their edits, or they register a "volunteer" account (this also prevents routine disclosure of the IP address, but still makes the IP address potentially discoverable by higher-ranking Wikipedia functionaries).
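
To make the limitation concrete: an IP-based approach can only see edits whose author field is itself an IP address, which is how Wikipedia has traditionally recorded logged-out edits. Here is a minimal sketch; the `edits` records and their field names are invented for illustration and are not a real API schema.

```python
import ipaddress


def is_anonymous(author):
    """True if the author field parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(author)
        return True
    except ValueError:
        return False


# Invented sample records; field names are assumptions, not a real schema.
edits = [
    {"author": "192.0.2.44", "page": "Example article"},
    {"author": "HelpfulVolunteer", "page": "Example article"},
    {"author": "2001:db8::1", "page": "Another article"},
]

visible = [e for e in edits if is_anonymous(e["author"])]
hidden = [e for e in edits if not is_anonymous(e["author"])]
print(len(visible), "edits attributable by IP;", len(hidden), "invisible to this approach")
```

Anything made from a registered account lands in the `hidden` bucket no matter which network it came from, which is exactly the gap described above.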


Yes, this is unfortunately a very big issue and one of the reasons I didn't develop it further.


Why are you posting it again then?


In case you don't realize, I want to let you know that your post comes off as entitled and hostile.

It might not be a perfect tool, but it is interesting and freely given. Tools need not be perfect to be shared freely


I found it bewildering that, if they see a “very big issue” with it, so much that they abandoned the project for that reason, that they would repost it again on HN without mentioning that upfront. One thing I missed is that the project hadn’t been open previously, which does explain the repost now, but still it would have been appropriate to mention the insight about its limitations in the submission. Personally I felt misled by the submission text upon reading the GP comment, hence my reaction.


An IP-based tool is still useful to catch casual malfeasance. The various Twitter bots logging edits from various government IP address ranges for example did highlight interesting edits sometimes. It's just important to remember that this is only the tip of the iceberg.

Ultimately it's the structural weakness of Wikipedia that limits how much you can know about the contributors.


I have to say, this is depressing to see just how corrupt brands and institutions are to protect their own bullshit.

Nevertheless, great work and a very useful project.


You really want to feel depressed? Look up Unusual Whales [1], a day trading app focusing on investment flows in the markets who does an annual report on insider trading by US officials, including Congress and SCOTUS. Even went so far as to create a congressional ETF. The focus is US pols, but no reason to suspect other jurisdictions are any better.

There's a reason both parties put a HUUGE emphasis on culture war - it distracts us from their true crimes that just happen to transcend party lines.

[1] https://unusualwhales.com/politics


Their collection of ETFs has a standout in their anti Jim Cramer ETF

https://unusualwhales.com/etfs


What is one to infer from these? That congresspeople, like others that are afforded the right, also execute trades?


That congresspeople use insider information gained from sitting on various committees to execute suspiciously-timed trades [1], such as:

-heavy buying of semiconductors in the lead up to passage of the CHIPS act

-mass sell-off by members and their families within the day of the first briefing on the coming COVID policy mandates

-purchase of defense stocks by members who were briefed on the Russian invasion of Ukraine. One member was particularly brazen, purchasing Lockheed Martin a day before the invasion, then tweeting "War can be profitable" in an attempted swipe at media [2]

-regular returns exceeding 100% on companies that directly stand to benefit from policy decisions made by Congress people, on purchases usually executed shortly before public policy votes are held

This is not normal

[1] https://unusualwhales.com/politics/article/conflicts

[2] https://www.newsweek.com/marjorie-taylor-greene-bought-defen...


Is this data immediately accessible? What's to stop me from just following this data and copying their trades 1:1?


Public officials in the US need to file their financial activities within a certain time limit, and this is accessible to anyone with a Bloomberg Terminal.

Usually by the time this info is made public the best profits would already have been made. There are also plenty of examples of members not filing at all. Of course nothing happens to them...

This was a bit of a scandal when Pelosi's activities became a bit too obvious (e.g [1] - and I'm not saying this is just a Democrat issue, they all do it), but the furor very quickly died down after democracy came under threat during the midterms because the other party are neo-fascists or communists, depending on which billionaire-owned news source you follow.

[1] https://www.cnbc.com/2022/02/09/congress-moves-towards-banni...


The inference is that they use insider information


A more charitable conclusion would be that brands and institutions want to ensure that accurate information is available. If a company is 5th largest yet they're shown as 6th then of course they want to correct the entry. However, I agree it's probably abused to remove any unflattering information.

Wikipedia does have a number of editing guidelines, dispute resolutions, etc. It appears they have tried to combat abuse, but I'm sure HN readers can find many failures :)



Crazy (your last link):

> The reality is that waterboarding is not a form of torture and is in fact used in the training of US forces. Much of the debate surrounding the use of torture stems from politically motivated individuals who do not understand the techniques themselves or the complex nature of military operations.


Here's another one about waterboarding, very odd...

http://wikiwho.ailef.tech/diffs/156bbb833d79c589d7791bc8d02f...


Remember when Wikipedia was never to be used as an authoritative source on any subject? Pepperidge Farms remembers.

Interesting how that suddenly changed around the time that Smith-Mundt was repealed.

But don't listen to me, I suffer from realistic dreams and an imperfect memory, that never happened, we have always been at war with eastasia.


There was a fun time when Ed Summers made a tool to monitor Wikipedia edits from a pool of IPs in real time, and it turned into a worldwide effort with Twitter bots monitoring many governments and big corporations, highlighting a lot of cringe edits and poor attempts to remove some info from Wikipedia. Many bots are still active; you can find the source code and a list of bots here: https://github.com/edsu/anon

There is also an analysis of old edits (2002-2014) using the IP ranges collected for the bots: https://jarib.github.io/anon-history/, source code: https://github.com/jarib/anon-history


It seems my poor VM was hugged to death! I just resized it to hopefully sustain the load.


This is amazing. Thank you for making this!

I want to highlight one part from the About page[0], as it's important enough to bear repeating:

> Any information that you find with this tool must be independently verified. The mapping between IP ranges and organizations has been compiled from multiple sources and has not been manually verified so it is certain that it contains inaccuracies.

I do have a question about this tool. Is there a page that lists all the organizations in the dataset?

[0] http://wikiwho.ailef.tech/about.html

[edited to ask a question]


Thanks, I'm glad you liked this!

It's not exactly a page but there's this JSON file in the repo https://github.com/aileftech/wikiwho/blob/master/resources/r...


This is really a hobby of mine. I will sometimes go through articles and see if I can find suspicious edits. It grew out from this thread: https://news.ycombinator.com/item?id=30840671

But now, I realized I can 10x this stupid hobby with ipinfo.io

You can get company name and VPN detection from IP address. I work for IPinfo, so not plugging the API service. Use the search box on the homepage to check individual IPs for free.


Posted this on our Slack. If OP needs a company database or something, I can ask my team. I heard that we collaborated with Wikipedia on a project like this before.


LastPass are very good at deleting their security incidents

https://news.ycombinator.com/item?id=15756044


Apologies - the references are now outdated and I’m too lazy to find them again


I'm not sure why people are so downhearted by this kind of thing. It's well known that anything controversial on Wikipedia is going to have a lot of edit warring and people trying to control a narrative. What Wikipedia is best at is non-controversial topics. And that has always been the case. Even for controversial topics it's honestly surprising how more or less correct the articles tend to be.

Wikipedia is not where you read up on the war crimes that may have or may not have been committed by some country or the views of a politician on immigration. Wikipedia is where you read up on Mars's atmosphere or how solar wind works.


When it comes to current events, I think the advantage that Wikipedia has over traditional encyclopedias is not authoritative but rather referential. With most articles (such as historical events or scientific breakthroughs), Wikipedia converges to a pretty authoritative treatment of the subject and can be more accurate than a single-source encyclopedia. With current events articles, Wikipedia provides a starting point for further reading (including via the citation pointers) whereas a single-source encyclopedia may be out of date and completely devoid of content on the subject.


I agree. When you look at the sources used for controversial topics and dive down to the original primary sources (Wikipedia has rules against using primary sources for many subjects) you find that the actual information isn't what was reported.


This is an amazing tool, but I'd caution it's using IP lists for groups of specific organizations. One of the 2 IP lists provided by the author is just the US military: https://gist.github.com/artfulhacker/a6eb800e58f2eb6f9231

The section "MOST ACTIVE ORGANIZATIONS" shouldn't be taken as a list of the most active organizations in the world editing Wikipedia, just the most active organizations that made an IP list the tool is using.

An edit from any of the very large number of static IP addresses on these lists will be attributed to the organization that maintains that IP range. This means a seaman in the Navy editing a TV show article in their free time may be pooled into the "Navy Network Information Center (NNIC)". It doesn't necessarily mean the NNIC has a special interest in editing 'Breaking Bad' episode synopses.


I am surprised how many people goof off at work writing wikipedia pages about their favorite TV shows.

Especially you, Canadians... lol: http://wikiwho.ailef.tech/organization/46a20a0820d609f90314e...


It might be the case that government employees are mostly just goofing off at work on Wikipedia. That's a perfectly normal, likely, and very plausible explanation of reality.

Another potential explanation is a practice of obfuscation. There might be one critical edit of some government corruption or something else buried in a list that looks like 99% tv shows. Those TV shows might have been edited solely to bury the 1 critical edit they needed to make that will be ignored because it's surrounded by hundreds of other benign looking edits. Without examining each and every edit, it's hard to completely dismiss this kind of explanation.


It’s a good thing to remember… just because it comes from a corporate IP doesn’t mean much. It is highly likely most of these edits were done by employees goofing around on Wikipedia while on their employer’s network.

If you broke down comments on HN by IP, I bet the distribution would be highly concentrated to tech companies IP ranges. Assuming that every comment was posted for nefarious reason would be flawed. The truth is much less exciting. It’s just dudes goofing around at work…


Someone is a fan of Madonna in there.


You have to check out the malcontents who made edits to the USAF base at Alconbury:

http://wikiwho.ailef.tech/diffs/0db36f4f426c89e6fb7d41e00747...


This tool is interesting and useful, but it's important to remember that it is constrained by the lists of mapped IP addresses. It will only show edits made by these specific sources, and no others:

https://github.com/ruebot/gccaedits-ip-address-ranges/blob/m...

https://gist.github.com/artfulhacker/a6eb800e58f2eb6f9231

It would be a very basic mistake to conclude that these are the only groups editing Wikipedia articles to control the narrative, just because they're the only ones being sampled.


My favourite would have to be Eli Lilly bullying some random American school. http://wikiwho.ailef.tech/diff/6209781691e7fdc225681c7848504...


The US Army has made six hundred contributions to a list of ethnic slurs:

http://wikiwho.ailef.tech/page/11014515

They appear as “PEO STAMIS” here, which appears to be an IT group?


When I think about it, a lot of ethnic slurs that were introduced to me were in a context that was militarily adjacent.


I suppose that makes sense considering most conflicts involve groups who (accurately or not) wouldn't consider their enemies members of their own race.


In the conspiracy community, wikipedia articles relevant to conspiracy theories often have extremely revealing edits and have been known as a source of checking for coverup for a while. The truth, I've found, is often so shocking that many if not most people, would rather reject it (attack the messenger, bury head in sand, etc) than admit their worldview has been so wrong for so long.

I've even seen sections in the edits go poof for very big stuff.


What do you mean by 'revealing edit'? Is that an edit that goes against a conspiracy theory? Do you have grounds for suspecting those edits originate on behalf of affiliated persons instead of someone trying to present a balanced viewpoint but whom you happen to disagree with?

I used to look only occasionally at edit sections for sensitive articles involving crimes by powerful players but stopped because I never spotted any suspicious changes.

(Also, if removing edits from page history on Wikipedia was actually common I don't believe no one in the editor community blew the whistle yet. Of course some people would claim such whistleblowers are tracked and eliminated by omnipresent evil illuminati before they go public... and at that point I can't take this line of argument seriously)


How is "institution" or "company" defined? I couldn't find any international news providers, for example Newscorp, BBC, Daily Mail.

Probably not a major issue though. As companies move more to remote and/or cloud based access from service providers - zscaler etc - that IP data is lost (or rather hidden by the provider), it certainly becomes easier to be anonymous to sites like wikipedia.


I think I'm most puzzled why the Navy is making so many edits on Boyz II Men founding member Marc Nelson. The read I'm getting is they are by Marc Nelson himself since a lot of them seem to reflect his personal regrets about leaving right before they blew up.

Incredibly odd.


Why did Pfizer do all these edits on a musical? http://wikiwho.ailef.tech/diffs/c1b9f1c2d5cc1361ccf28b0108bd...


Because people do non-work things on their work computers.

This is one of the reasons I think the statement "edits made by companies" is a bit too strong... these are edits made from company owned IPs, but we have no idea if these were made based on direction from the company leadership or just by employees doing non-work related things from work computers.


It probably was an employee working for Pfizer.


Org/Page Search not working here fwi.

Great transparency, now where are the foreign service firm and their proxies.

You are telling me the FSB and the United Front aren't editing Wiki?

I've had them edit my stuff, so know it's happening. Just not sure the scale.


> Org/Page Search not working here fwi.

I think it was because the server was overloaded due to it being on the front page. It seems to be working for me now (you need to enter at least two characters for suggestions to appear).

As for the rest, I just collected some IP ranges on the internet. They might include foreign service firms or not; it wasn't feasible to check them manually. If you know about any IP range, though, I'll be glad to merge it into the repo.


You said their proxies, which doesn't give an easy way to identify them. No one is going to do this in Russia on a Kremlin computer. They will let their boys do that.


wow, it looks that someone from Air Force Materiel Command was very hot on Mireille Mathieu http://wikiwho.ailef.tech/page/154437

Mireille Mathieu is a French old-school singer https://fr.wikipedia.org/wiki/Mireille_Mathieu


This is a great idea. Thanks for making and sharing.


You could link to the wikipedia article in each page


Pretty cool, please make it run under HTTPS


Seconded. My browser won't even connect over HTTP. This should be a must these days! It's so easy, Let's Encrypt makes it take 5 minutes to get a cert trusted by all modern browsers.


No TLS? Really? What year is this?


Right now the site is not loading.


Hugged to death?


Yes, indeed. It should be working fine now, though.


You might want to consider using a CDN like CloudFlare. The information on your site looks very cacheable.

That would reduce load because requests for the same content would get served by them rather than hitting your server directly.
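
For what it's worth, the origin side of this usually comes down to sending a `Cache-Control` header so a shared cache knows it may keep the response. Below is a minimal sketch as a tiny WSGI app. I don't know what stack the site actually runs on, so the app, the page body and the one-hour `CACHE_SECONDS` value are all assumptions for illustration, and how a given CDN treats the header depends on its configuration.

```python
from wsgiref.util import setup_testing_defaults

CACHE_SECONDS = 3600  # assumed value; tune to how often the data changes


def app(environ, start_response):
    """Tiny WSGI app that marks its response as cacheable by shared caches."""
    body = b"<html><body>static diff page</body></html>"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # "public" allows shared caches (such as a CDN) to store the response;
        # max-age is how long, in seconds, it may be reused.
        ("Cache-Control", f"public, max-age={CACHE_SECONDS}"),
    ]
    start_response("200 OK", headers)
    return [body]


# Call the app directly to inspect the headers it would send.
captured = {}


def fake_start_response(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)


environ = {}
setup_testing_defaults(environ)
app(environ, fake_start_response)
print(captured["headers"]["Cache-Control"])  # public, max-age=3600
```

Since historical diffs don't change once computed, a long lifetime on those pages seems safe; pages like "most active organizations" would want a shorter one.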



