
Which countries are mentioned the most on Hacker News? - bemmu
http://www.bemmu.com/which-countries-are-mentioned-the-most-on-hacker-news
======
nl
Many comments remark that the count for Jordan is skewed by the large number
of stories mentioning Michael Jordan (both of them).

Apparently people are unaware that this problem is at least partially solved
in modern named entity recognition systems.

The first story from a search for "Jordan"[1] is "Machine-Learning Maestro
Michael Jordan on the Delusions of Big Data and Others".

Stanford NLP[2] tags "Michael Jordan" as PERSON.

The first story mentioning Jordan as a country is "Internet Blocking Begins In
Jordan". It tags Jordan as LOCATION.

spaCy tags "Michael" and "Jordan" as a compound person in the first
example[3] and "Jordan" as GPE (geo-political entity) in the second[4].

[1]
[https://hn.algolia.com/?query=Jordan&sort=byPopularity&prefi...](https://hn.algolia.com/?query=Jordan&sort=byPopularity&prefix=false&page=0&dateRange=all&type=story)

[2] [http://corenlp.run/](http://corenlp.run/) (annoyingly the state can't be
passed via URL parameters)

[3]
[https://api.spacy.io/displacy/index.html?full=Machine-Learni...](https://api.spacy.io/displacy/index.html?full=Machine-Learning%20Maestro%20Michael%20Jordan%20on%20the%20Delusions%20of%20Big%20Data%20and%20Others)

[4]
[https://api.spacy.io/displacy/index.html?full=Internet%20Blo...](https://api.spacy.io/displacy/index.html?full=Internet%20Blocking%20Begins%20In%20Jordan)

~~~
lmm
I imagine it's a lot easier to tag "Michael Jordan" (correctly spelled) as a
person than "Micheal Jordan".

~~~
nl
Ha. Fixed, thanks.

(Note that the headlines were correctly spelled, so the computer part worked.
The failure was at the human end.)

------
itcrowd
I wonder how well this corresponds to the classic Heatmap xkcd comic; i.e. is
the distribution of mentions not just a proxy measure for the geographical
location of HN users? Surely, you're more likely to submit a post about your
home country since you can relate to that.

[https://xkcd.com/1138/](https://xkcd.com/1138/)

~~~
mahranch
But would that really work in this case? China is the number 2 mentioned
country and while I don't have any demographic information, I find it hard to
believe that there would be that many Chinese on HN. Well, not just any
Chinese, but Chinese nationalists (people who still feel like China is
"home")?

I guess it's possible it could be the work of their "reputation management"
online guerrilla tactics. Though, the people engaging in that work generally
aren't even Chinese, they're usually subcontractors paid by the Chinese to
promote a "positive China" (in my experience, promoting a positive China
usually means downplaying or mitigating whatever bad news has come out that
day, or if nothing is going on, Japan bashing is on the docket).

For whatever reason, people are under the false assumption that China only
tries to manipulate public opinion and spread propaganda domestically. This
couldn't be further from the truth. I remember Digg.com having to sue and
fight to get the Chinese to stop gaming their system (source:
[https://en.wikipedia.org/wiki/Internet_Water_Army#Legal_prob...](https://en.wikipedia.org/wiki/Internet_Water_Army#Legal_problems))

People know that the U.S government does it, but they either forget or ignore
the fact that _other_ countries do it too. Maybe they just underestimate the
scope, or are wholly ignorant of it because they believe they haven't
encountered such propaganda before (believing something like that would be
"obvious"). But it isn't. Hell, RussiaToday (RT.com) exists almost exclusively
for this purpose (to spread Russia's narrative of world events). China has
its own government-run/directed version as well, Xinhua.com. You see these
two websites submitted in excess to reddit and other social media sites.

And do you know what's so great about reddit and HN to a person looking to
spread or push an ideology or narrative? The points. They can keep track of
what works, what doesn't, how well it works, and they can keep track of their
employees. They can see how well each employee is doing at their job. It
provides a measurement for them. A quantifiable piece of information to gauge
the value of their work. Something they can bring to their bosses.

I know I'm getting off topic here but I'm actually (surprisingly) not
surprised India is so high if xkcd's premise holds. While I don't see many
Chinese commenters, I do see loads of highly intelligent Indians here. I met
one last year on HN who I had do some freelance work for me.

~~~
itcrowd
> _But would that really work in this case? China is the number 2 mentioned
> country and while I don 't have any demographic information, I find it hard
> to believe that there would be that many Chinese on HN._

I follow you up to this point (although I don't necessarily agree)

> _Well, not just any Chinese, but Chinese nationalists_

Why would you have to be a nationalist to post about China? In fact, you could
submit a "China is bad" post. Everything after this point seems to miss my
point completely.

Also, a sister comment didn't read the part of the submission where it stated
that "This only counts news stories posted, not comments."

~~~
mahranch
> Why would you have to be a nationalist to post about China?

Because the guy I replied to said: "_Surely, you're more likely to submit a
post about your home country_"

I'm an American but my grandparents are from Ireland. I personally feel no
connection to Ireland nor do I have any idea what's going on over there. It's
any other country as far as I'm concerned. I wouldn't post news about the
country to HN. However, if I did, I would be an Irish nationalist. Feeling any
sort of loyalty to a country (whatever the reason or justification) is
nationalism. It's literally the definition of the word.

> "This only counts news stories posted, not comments."

I'm not sure that distinction matters for my comment. It may for the study,
but my comment was a bit of a tangent anyways.

------
vishnuks
HN Search on Algolia says India is mentioned more than China. India has 37,236
results but China has only 15,228.

[https://hn.algolia.com/?query=china&sort=byPopularity&prefix...](https://hn.algolia.com/?query=china&sort=byPopularity&prefix&page=0&dateRange=all&type=story)

~~~
aidos
Interestingly, if I type _china_ I get 15,228 results but then after I hit
_enter_ it drops to 8,682. 8,318 if I wrap it in quotes.

Ah, actually, if I leave it for a minute it bounces back up to original
number.

This is all very curious behaviour!

Edit: it's the same for India too, btw.

~~~
aws_ls
I observed that it shows results like 'Indie', 'Indians', etc. when you type
'India', until you press <enter>. On pressing <enter> there is a finality to
the search, and it removes other possibilities.

------
sbensu
Here[1] are the countries sorted by stories per capita, which shows which
countries are overrepresented.

If you want to play with the script that fetches the population numbers, go to
Tools > Script Editor and add your Wolfram Alpha API key.

[1]
[https://docs.google.com/spreadsheets/d/1UX3OvN6MXDkuCiLpKPW2...](https://docs.google.com/spreadsheets/d/1UX3OvN6MXDkuCiLpKPW2xB74UQz8F9KVapm7Xy_JLCg/edit#gid=0)
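
For anyone who would rather not open the spreadsheet, the per-capita ranking boils down to something like the sketch below. This is only an illustration: the country names, story counts and populations are made-up placeholders (not figures from the article or from Wolfram Alpha), and `stories_per_million` is a name I picked for this example.

```python
def stories_per_million(story_counts, populations):
    """Return (country, stories per million inhabitants), sorted high to low."""
    rates = {}
    for country, stories in story_counts.items():
        population = populations.get(country)
        if not population:  # skip countries with missing or zero population
            continue
        rates[country] = stories * 1_000_000 / population
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


# Placeholder inputs for illustration only (not real data).
example_counts = {"Aland": 200, "Bolandia": 1000, "Cestia": 50}
example_populations = {"Aland": 1_000_000, "Bolandia": 100_000_000, "Cestia": 0}

ranking = stories_per_million(example_counts, example_populations)
for country, rate in ranking:
    print(f"{country}: {rate:.1f} stories per million people")
```

Note how the country with the most stories in absolute terms drops below a much smaller one once population is taken into account, which is the whole point of the per-capita view.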

------
pavlov
At least Finland isn't a poor second to Belgium [1] in this ranking.

[1]
[http://www.montypython.net/scripts/finland.php](http://www.montypython.net/scripts/finland.php)

~~~
infinidim
Suomi mainittu

------
DanBC
Does this under-count the US? A lot of stories will be about "California", and
they don't feel the need to say "California, the US".

~~~
pbhjpbhj
Many stories are pertinent to a particular geography but fail to mention it at
all. Commenters seem often to assume all readers are in the USA.

------
oneeyedpigeon
This looks like it's searching literally for just the country name in text
(and I guess that's all it can really do). Of course, this slightly favours
countries with a name that might have another meaning (China, Turkey) or might
represent another geographic location (Georgia).

~~~
jmadsen
I'm guessing the number of stories about people's "China" tea sets & "Turkey"
dinners are statistically minimal on Hacker News.

Couldn't tell you how much Georgia is skewed, as a quick scan shows they are
almost all from the state. Being number 40 on the list, I suspect it isn't
that interesting to most people.

~~~
oneeyedpigeon
A quick scan revealed a story about going 'cold turkey', but I'm sure you're
right. I was actually expecting some thanksgiving-related stories; I'm sure
they're there!

------
merb
Actually, he talks about "popularity"; however, without knowing how many
people are posting here from Japan, this says nothing at all. Also, as long as
you don't analyze the content of the posts, you don't know whether it's
popularity or not. Maybe they say "I dislike something in Japan"? Statistics
only make sense when you have the bigger picture; everything else is just
guessing how it could be and not how it is.

~~~
sgdesign
I think the post makes it pretty clear that it's about how popular _stories_
about various countries are, it's not trying to establish a popularity ranking
of the countries themselves.

------
wcummings
HN barely pays attention to anything outside of silly valley, in or outside of
the US. Lots of spam and low quality marketing content for shitty west coast
startups.

------
xioxox
Interesting, though a lot of posts seem to be about the US by default.

I see he's missing Scotland, Wales and Northern Ireland as mappings for the
UK. It would also be useful to add capital or major cities as synonyms (e.g.
Paris, London, Berlin...).
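
A rough sketch of what such a synonym mapping could look like, assuming the counting is done by matching names in story titles (the thread suggests so, but the article's actual code is not shown here). The `SYNONYMS` table and the helper names are hypothetical. Matching is case-sensitive and on whole words, so "UK" does not fire on "Ukraine" and "Paris" does not fire on "Parisian".

```python
import re

# Hypothetical synonym table for illustration; the article's actual mapping
# is not shown in this thread.
SYNONYMS = {
    "United Kingdom": ["United Kingdom", "UK", "Scotland", "Wales",
                       "Northern Ireland", "London"],
    "France": ["France", "Paris"],
    "Germany": ["Germany", "Berlin"],
}


def build_pattern(names):
    """Compile one case-sensitive regex matching any name as a whole word."""
    # Longest names first so multi-word names are tried before shorter ones.
    ordered = sorted(names, key=len, reverse=True)
    body = "|".join(re.escape(name) for name in ordered)
    return re.compile(r"(?<!\w)(?:" + body + r")(?!\w)")


PATTERNS = {country: build_pattern(names) for country, names in SYNONYMS.items()}


def countries_in_title(title):
    """Return the set of canonical countries mentioned in a story title."""
    return {country for country, pattern in PATTERNS.items()
            if pattern.search(title)}


print(countries_in_title("Scotland votes on independence"))
```

City synonyms bring their own ambiguity (London, Ontario or Paris, Texas would still be counted), so this trades one kind of error for another.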

~~~
bertil
I would love to have the ability to tag a post as “relevant outside of the US”
vs. “not relevant outside of the US” for things like tax, identity theft and
banking.

------
rurban
I find it curious that Austria ranks last in countries mentioned (with >100),
but ranks #4 in upvotes. This is a very interesting and big discrepancy. Are
the Austrians really that silent and excellent?

But from my personal experience of popular European culture, they are louder
than their more northern European fellows (and they do publish a lot) and of
the same kind of excellence.

~~~
creshal
It's lumped together with Germany in a big chunk of the few submissions where
it gets mentioned, so that could explain the similar scores.

------
techaddict009
Surprisingly, title-wise the result is a bit different:

India appears in more titles than China:
[https://hn.algolia.com/?query=India&sort=byPopularity&prefix...](https://hn.algolia.com/?query=India&sort=byPopularity&prefix&page=0&dateRange=all&type=story)

[https://hn.algolia.com/?query=China&sort=byPopularity&prefix...](https://hn.algolia.com/?query=China&sort=byPopularity&prefix&page=0&dateRange=all&type=story)

And United States appears in fewer than either:

[https://hn.algolia.com/?query=United%20States&sort=byPopular...](https://hn.algolia.com/?query=United%20States&sort=byPopularity&prefix&page=0&dateRange=all&type=story)

~~~
scrollaway
"United States" doesn't account for "US", "USA" or "America"

------
alphabravodelta
The count for the country Georgia is wrong, because it's conflated with the
U.S. state of Georgia. I wonder how it's possible to distinguish the two.

~~~
bemmu
I suspect there are some other homonyms in there as well. Luckily most country
names are not used that much for other meanings.

~~~
beshrkayali
"Jordan" is. Most of these posts are about Michael Jordan, not the country
Jordan. That messes up the entire analysis/rating.

~~~
nl
Michael Jordan is a homonym as well. Apparently there was some basketballer
with the same name as the famous AI researcher[1].

[1]
[https://en.wikipedia.org/wiki/Michael_I._Jordan](https://en.wikipedia.org/wiki/Michael_I._Jordan)

------
captainmuon
It is a bit unfortunate he omitted Korea (both Koreas, since it is ambiguous).
I would expect North Korea to have shown up quite often. There appears to be a
certain morbid fascination with that country on this site (which I can somehow
understand).

------
taneliv
Might be fun to see EU countries combined under a single label.

------
known
Are they different from
[http://stackoverflow.com/](http://stackoverflow.com/)?

------
ekianjo
Average score is a very poor measure, because averages get skewed heavily by
extremes. Not a proper way to analyze such series.
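
To make that concrete, here is a tiny sketch with invented scores (not taken from the HN data): a single runaway story moves the mean far away from what a typical story gets, while the median barely notices.

```python
from statistics import mean, median

# Invented story scores for one country: six ordinary stories and one hit.
scores = [1, 2, 2, 3, 3, 4, 900]

avg = mean(scores)    # dominated by the single 900-point story
mid = median(scores)  # the middle value of the sorted scores

print(f"mean:   {avg:.1f}")
print(f"median: {mid}")
```

Comparing countries by median score (or showing both numbers) would be one way to keep a single front-page hit from deciding the ranking.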

~~~
sgdesign
You should write your own post! I'd be curious to see score compared to
country population.

------
personlurking
How can I be notified when a particular country is mentioned on HN, whether
stories or comments?

------
codeshaman
#51-#54 have practically the same combination of colors on the flags...

~~~
anaolykarpov
Almost, because Belgium has black while the rest have blue

------
awl130
really? you just left out korea completely? that sucks. the confounding of
north and south korea would have been informative nonetheless.

------
markatkinson
I see South Africa is nowhere on the list :/

------
thomasilk
Austria has the least mentions but is in 4th place when it comes to upvotes
per post. That's what I'd call efficiency ;)

------
wmboy
No mentions of New Zealand? :(

------
amasad
Jordan's results are wrong because it's counting things like "Michael Jordan"
etc.

------
bymafmaf
Well, Turkey has double meanings which makes it wrongly #24 :)

------
pvsukale1
feeling kinda proud seeing India at no. 3

------
amar-singh
Mostly US ...

------
cel1ne
On a related note: Most articles in my medium.com-digests originate in the US
and focus on US-discourse.

It's not interesting to read yet another debate on state health-care or
dating, if you're living in Europe where state health-care is really a
no-brainer and dating isn't that structured.

I asked the medium-team if I could somehow configure my digest, so that I have
more diversity in the origins of the articles they send me.

They declined, so I stopped reading it. It's a shame. I backed them on
kickstarter originally.

~~~
spinningarrow
What sort of structure does dating have in the US? (Genuinely curious, I've
never thought about this before)

~~~
bertil
The idea that a meeting is a date or not has to be clarified. You can’t just
have a drink with a colleague and leave the romantic interest to be determined.
Americans, as I understand it, tend to have “the talk” which is a clear idiom
for an actual conversation where they mutually decide to “be exclusive”, i.e.
not take dates with other people. Timing, number of dates and all that are
also fairly established.

I have regularly seen in several countries in Europe a relationship going from
nonexistent to committed, public and exclusive in a matter of hours. “Having
the talk” is often understood as an intent to break up (because it sounds like
“We need to talk”). Whether a meeting has a romantic interest is never really
expressed: it’s generally either obvious or purposely vague.
That has left many American friends in Europe very frustrated with the dating
scene, because it comes off as unreadable. Europeans in the US can find the
formalism icky, but they generally adapt more easily.

There are also far more differences between European countries (wolf-calls are
apparently common in Italy; Scandinavia can come off as the opposite) than with
the US, but formalism is certainly the big one.

~~~
babebridou
Thanks for this post. As a European I wasn't aware these rules actually
existed. I thought they were merely "Hollywood" tropes to have explicit
romantic interest in fiction while keeping a G rating.

~~~
bertil
A lot of it might have started that way (French kissing, for instance, has a
complicated and not well documented history of misunderstood euphemisms), but
they have influenced culture and expectations. Most people have lived through
fewer relationships and break-ups in real life than they’ve seen in movies.

------
wott
Themes:

1 United States: masters of the Universe

2 China: evil commies, but biggest capitalists

3 India: H-1B source

4 United Kingdom: mother of colonial masters

5 Japan: funny robots

6 Russia: evil soviets conquering the free world

7 Canada: mounted police

8 Germany: nazi cars

9 Australia: other indigenous people genocidal colonists, but more discrete

10 France: messy/ordered, free/binding systems no one understands; Parisss

11 Israel: origin of half of the problems of the world, but nothing can be
said against them for they are friends of the masters (even masters cannot say
anything), so let us talk about drones

12 Spain: collapsing tomayto economics

13 Brazil: bikini crimelords, evil socialists

15 Pakistan: like H-1B, but terrorists

16 Netherlands: technical devices for growers

17 Sweden: boring devices

18 Greece: collapsing goat economics, might become evil socialists

19 Italy: romantic bunga-bunga

20 Ireland: collapsing potayto economics, less Irish than 4th generation
American-Irish masters.

21 Mexico: builds and feeds Silicon Valley

22 Switzerland: safe, safes, savings

~~~
fennecfoxen
> 9 Australia: other indigenous people genocidal colonists, but more discrete

Psst. _Discrete_ is when you're doing math with integers and exploring
non-continuous phenomena. _Discreet_ is when you're trying to keep a low profile.
Pass it on.

> 11 Israel: origin of half of the problems of the world

Certainly _involved_ in half the problems, but is it the origin or is it the
destination of the problems? One way or another: some day long after they're
finally vaporized by atom bombs, history will surely look back on us and
wonder what was wrong with humanity in our day and age (while pointing fingers
about the next brewing crisis, obviously)

~~~
i26
Israel is neither involved in half the problems of the world nor is it the
cause of most of the problems with which it is involved.

Look at all the problems of the world. The vast majority have nothing to do
with Israel. This holds by a count of problems as well as by weighting by
magnitude or people affected. (Certainly it is featured in the media more
often than other conflicts.)

As for the problems with which Israel is involved, I don't want to start a
political discussion, but I must dispute wott's assertion and say it is more a
target of violence than an instigator.

------
ommunist
No wonder it's American-centric. Are the statistics for the USSR and Russia
aggregated into a single line for "Russia"? These are two different countries.

~~~
ascorbic
I can't find any reference to the USSR on that page. Funnily enough I can't
find any reference to Yugoslavia, Zaire, Czechoslovakia, the Ottoman Empire,
Babylonia, Olmec or Mesopotamia either.

~~~
ommunist
Have you checked the whole 1.1 GB uncompressed JSON for "The USSR"? Should be
more than twenty.

~~~
ascorbic
I'm sure it is, as I'm sure some of my other examples are. There are even 123
results for Mordor. Still no reason for any of them to go in the rankings.

