As one of the lead researchers at the Web Ecology Project, I can confirm that this was also true in 2009 (in a small sample of 1M tweets, we found it was ~61% at the time). It doesn't shock me at all that it could be as low as 50% at this point.
We did not find Japanese to be as high up there as this group did. I can't remember for sure, but I think when we ran ours, Twitter.jp was perhaps still running as a separate domain?
I'm not the most plugged-in person when it comes to the Japanese Internet, but over the last 6 months or so I've started to see Twitter rise pretty fast here. First it was on the sidebars of the portals, then the Japanese word for "tweet" started to show up in the headlines of the scandal rags that I scan on the train ride to the office, then folks at my office started asking me whether I had a Twitter account yet. So you might be seeing some growth in addition to possible sampling issues.
This should be a no-brainer, i.e. catering to the non-English users of the Web.
I wish I could say we were smart enough to have taken that into account when we started, but, like most, we were oblivious to the fact, and only ended up awakening to the opportunity by virtue of our community leading the way, with their use and volunteer localizations of our offering.
At this point, about 50% of our users are non-English (our offering is available in 30 languages - again, all localized by volunteers).
We are monetizing in 40 countries, using PayPal to sell low-cost, subscription-based web services and virtual goods. In order of sales: US, UK, Canada, Italy, Germany, Spain, Japan, France, and Brazil.
So, my best advice to all those running a start-up - think global. You will be able to monetize once you get there.
Personally, I find that Google Translate is mostly good enough to tell what people are saying about me (and sometimes for me to fix the problem.)
Still, many people say that Orkuit was killed (from the USian point of view) by the influx of Brazilians. It will be interesting to see if/how this plays out on Twitter. Will you just have people engaging with other people in their own language? Will the mixing be a positive for everyone? (This has been my experience so far.) Or will one language become dominant, with speakers of other languages largely abandoning the site, as in the case of orkut (which I can't even remember how to spell at this point)?
I think the larger implication is not that you can't understand tweets, but that monetization may be pretty hard if the majority of tweets aren't in English.
I've worked in advertising, and there are fewer great non-English ad networks out there (ones that can deal with large volumes and will also yield high CPMs). Also, the non-English portion is fragmented into dozens of different languages.
It isn't impossible, but it is certainly harder to build an international ad sales team.
Perhaps foreign ad networks are finding native partners, instead of going after English speakers. There are >800M Mandarin speakers; surely that's enough to attract a lot of advertisers, just maybe not US ones.
The title makes me think that this is considered a bad thing ...
For me, it's more of a good sign: Twitter is getting traction internationally. How bad can that be for them?
The comment about monetization is something I hadn't thought of in the first place, but what makes you think there are no good ad networks in the Spanish-, Japanese- or French-speaking world (just to name a few)?
Granted, I'm French, living in the Netherlands, and I tweet in both English (for my public profile) and French (mostly on my private profile), so this is a topic that reaches me.
Mainstream American websites generally follow a 66 percent international and 33 percent domestic ratio. The 50 percent figure meshes well with that ratio given the number of English speaking foreigners.
I am from the Philippines. All of my friends on Twitter tweet in their own dialects unless they intend something to be read by other people who only understand English.
Jon Beilin of our group wrote and open sourced (MIT/X11-license) a Python language detection module that we used on our Twitter database. http://www.webecologyproject.org/2009/09/code-release-google...