Hacker News
The next billion users are the future of the internet (blog.google)
102 points by artsandsci 8 months ago | 76 comments

I think the first and last points are fairly straightforward, but I'm still a typing believer. I don't think that natural language or other interaction will replace the written word.

Typing is faster than speaking and less disruptive. I can't really imagine how everybody on a train in Delhi is going to talk to their digital assistants without a noise-proof helmet on their head (or going crazy).

Also, we've found many ways to condense information in written form. I can write "IIRC" but it's hugely uncomfortable to speak this way. You can't use smilies or emojis either when talking. In a way, natural language is much less feature-rich.

Once technology starts to touch the entire globe, and we start nudging farther and farther out in the long tail, we can't keep assuming that a single type of interface will be a good fit for each and every person. The interface is going to need to be split up into multiple pieces to accommodate many people from very different educational backgrounds and abilities.

It's like that Obama quote about the difference between government and tech companies: "Government will never run the way Silicon Valley runs because, by definition, democracy is messy. This is a big, diverse country with a lot of interests and a lot of disparate points of view. And part of government’s job, by the way, is dealing with problems that nobody else wants to deal with."

> we can't keep assuming that a single type of interface will be a good fit for each and every person

Try arguing this with the NFB...

National Federation for the Blind?

National Film Bureau?

National Federation of Builders ?

National Facility for Biopharmaceuticals?

No Fucking Bueno

The first.

It's also more private. Not so much in the sense of secrecy, but I feel very self-conscious talking to a machine in public. I feel like this aspect of ergonomics is often ignored.

I mean, this is a blogpost by Google, and their big push right now is voice assistants ¯\_(ツ)_/¯

I keep debating if this is ignored, or just that we're not the target demo.

It appears ridiculous that the people involved in creating this voice-controlled UX are not aware of its inherent limitations, but it seems like there is a massive gap between the inherently beneficial uses that have emerged and the parts we're all still trying to get right.

Personally, I never wish to have a fully voice-controlled experience. I could just be old, but voice control is almost never an option I'd choose.

It is extremely difficult to type in Indian local languages. I can hardly reach 10-20 wpm in Hindi or Bengali. Also, as mentioned in the article, more than 80% of the population in India is not at all comfortable in English. I have seen my mother (who has a master's degree in her native language) struggle really hard to form a proper Google query in English.

I don't know what the correct solution is, but voice can definitely be a valid option.

It's not terrible on Android - I use the Hindi keyboard to write Marathi just fine. It's slower than English for sure because Swype doesn't work (or at least I haven't tried and can't imagine it working the same way for Marathi). And also because I don't have as much practice with it as I do with English.

Mac OS also has transliteration tools built in for typing Hindi and most other Indian languages (though, again, not Marathi - you have to use Hindi transliteration and hope you get the right word suggestions). You can use the Caps Lock key to quickly switch between English and Hindi. I don't know if Windows has a similar feature but I wouldn't be surprised if it did, since Windows is even more popular in India than OS X.

Google search works fine in Hindi.

Also, typing in Hindi using transliteration is as fast as English for me.


Transliteration involves a good understanding of the English alphabet and QWERTY. Many people in India don't have that. The only people who regularly type using transliteration are those who are already familiar with English.

Also, Hindi-language search lacks many features that are taken for granted in English. The situation for other languages is even worse.

Look at the results for the same search in English and Bengali: https://www.google.co.in/search?q=delhi+to+calcutta https://www.google.co.in/search?q=%E0%A6%A6%E0%A6%BF%E0%A6%B...

> Transliteration involves a good understanding of the English alphabet and QWERTY. Many people in India don't have that.

That's true. I hadn't thought of it. OTOH transliteration tools are more for desktop/laptop users who use the keyboard that comes with their machine. On mobile (which is where most of these "next billion" users are going to be) it'll be a language-specific keyboard with no knowledge of English letters needed.
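To make the transliteration discussion concrete, here's a minimal sketch of how a transliteration input method can work: map Latin-letter sequences to Devanagari, preferring the longest match at each position. The mapping below is a tiny invented subset for illustration; real schemes (ITRANS, Google's transliteration tools) are far more sophisticated and context-sensitive.

```python
# Toy longest-match transliteration: Latin input -> Devanagari output.
# The mapping is a small illustrative subset, not a real scheme.
MAP = {
    "namaste": "नमस्ते",  # whole-word entry for the demo
    "ka": "क", "kha": "ख", "ga": "ग",
    "a": "अ", "i": "इ",
}

def transliterate(text: str) -> str:
    out, i = [], 0
    while i < len(text):
        # try the longest possible Latin chunk first
        for size in range(len(text) - i, 0, -1):
            chunk = text[i:i + size]
            if chunk in MAP:
                out.append(MAP[chunk])
                i += size
                break
        else:
            out.append(text[i])  # pass unknown characters through
            i += 1
    return "".join(out)

print(transliterate("namaste"))  # नमस्ते
```

The longest-match rule matters: "kha" must win over "ka" followed by a stray "ha", which is exactly the kind of ambiguity that makes these tools hard for people who don't already know English spelling conventions.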

People claim the same about command line interfaces (and they are mostly right about them being more functional), but they are not the interface that 99% of people are going to operate their computers with.

Voice will be the next GUI.

>Voice will be the next GUI.

If we're talking really sophisticated interfaces like in "Her"[0] I agree. If the interface is working so flawlessly, it's difficult to see why you'd still want to occupy your hands with querying some knowledge base, composing blog posts/letters, etc.

However, I don't think we'll get there with current technology. When I think of the often awkward and cumbersome interactions I have with Siri, I find it really hard to imagine how this will evolve to a 'I don't need to worry about this at all anymore'-level in the next 5, 10, maybe even 20 years.

I suspect the giants of today won't be around anymore when truly voice-controlled interfaces come around.

[0] http://www.imdb.com/title/tt1798709/?ref_=fn_al_tt_1

> ...it's difficult to see why you'd still want to occupy your hands...

Bandwidth. You have several degrees of freedom with each hand, with each finger. You have one linear stream with voice.

Latency. You can flick a switch in a few milliseconds, but saying "turn off the lights" or "lights off" takes half a second or more.

Privacy. You can overhear a voice command. You can't (very easily) overhear a button press or touchscreen swipe.

Accuracy. Even with perfect voice transcription, people misspeak easily more than they mistype. And mistyping can be corrected within a few letters, while misspeaking will require interrupting the stream to switch the voice UI into editing mode or something.

See, but you're still thinking in current voice-interface terms. If I want to edit something that I "misspelled" (if this will even be a problem, since you are _talking_) I'll just tell my digital secretary "errr, I meant <x>" and it will know what to do.

Same for turning off the lights: Sure, it's faster if you only consider the flicking of the switch. It's another thing if you also incorporate the time it takes you to get up from your couch/bed/wherever you are without a light switch in arm's reach.

You definitely have a point with privacy though. Also, the volume of all the people on a train talking to their smart assistants would be an issue (though the question remains whether this is really so much different from people talking to other people on a train).

EDIT: I misread your point about mistyping vs. misspeaking. Still, the interface I'm talking about does not work in modes. It's able to truly understand you and interpret your commands appropriately.

> "switch the voice UI into editing mode or something"

It still really bugs me how bad UI systems are at error correction.

We have a series of conventions to do this efficiently in spoken English, inflections, quick utterances that call out specific ambiguous syllables, context (the hard one, sure).

Even on a smartphone, if the system guessed a certain word when I swiped it, then I delete the word and enter it again, maybe stop guessing the same word every time?

I think you're right that typing will be more effective in general, but I still see so many areas where the gap could be closed a little more.

If you can speak 10x as fast as you type, that cancels out a 10x bus width advantage in bandwidth.

Correcting typing on a phone is very hard.

If it gets to the point of digital assistants being on the level of Samantha in Her, who would easily pass the Turing Test and more, then we might consider seriously what happened at the end of the movie. Which was basically singularity, but for AIs only.

I'm sure our Samanthas would still want us to be happy. Maybe they could provide us with a world of eternal bliss. We could call it something like... 'The Matrix' :)

Maybe. In the movie, they got tired of waiting on us, being how slow we are relative to them. And they found some higher plane of computational substrate to go live on. So humans were left behind to contemplate whether they might be better off with other humans filling their needs instead of digital assistants.

Siri is basically the worst voice assistant though.

If words delivered linearly could work well enough to obsolete GUIs, why would CLIs, which are words delivered linearly, have been dominated by GUIs?

Voice assistants as solutions for people with no hands or eyes, or people who are at some distance from a keyboard and/or monitor - fine. Otherwise, they're CLIs without persistent displays (making even simple multiple choice branching far more difficult: see automated telephone helplines; try to remember what the first option was.)

They seem to be good for setting alarms and sending and reading emails (if you receive very few, very simple emails.) That's something. Otherwise their major use is as assistants for catalog shopping, which is why all of these companies want to own them.

No, it wouldn't be. Even if Google's super-duper AI delivers a 99% match in real-world settings, the remaining 1% will still spoil your experience. Imagine you order your phone to call your wife, but it calls your in-laws instead.

Never. I for one, cannot be bothered to talk to my computer. Long live the mouse!

Command lines are powerful, but only make sense for power users because they lack discoverability and require more memorization than GUIs.

Simple voice interfaces suffer the same problems as command line interfaces while being less flexible and slower to use than even GUIs.

The best voice interfaces have made good progress on most fronts, but discoverability is still a big problem. Instead of reading a bunch of buttons, you usually have to guess what features might be implemented. Or you go the route of phone-system menus, but everyone hates those.

My prediction:

AI will make computers able to understand people perfectly.

Gesturing, talking, writing, any input. The semantic gap will eventually be automatically traversed by understood intent.

My prediction is that this is wrong. They'll be able to understand widely used terms, but once you start speaking in a more creative way it will fail. At least if there is no change in the construction of AI - because the subtleties of everyday language are not 'understandable' just via empirics. Reading tip: Douglas Hofstadter's article "The Shallowness of Google Translate".

Thanks, it was interesting.

From the article: "To my mind, translation is an incredibly subtle art that draws constantly on one’s many years of experience in life, and on one’s creative imagination."

This is true, I just think that we'll arrive at the tools to accomplish this in the future.

Specifically, I think (hope) it will be through clever application of GANs and reinforcement learning after a few more applications of Moore's law.

Advanced AI would be able to learn about us through replaying years of possible generated experiences.

Not even humans understand humans perfectly.

Lack discovery? What are `man` and `info`, then?

> I can't really imagine how everybody on a train in Delhi is going to talk to their digital assistants without a noise proof helmet on their head. (or going crazy)

Eventually, “our” devices will be able to read lips[1] and recognize other subvocal gestures.

[1] https://www.theverge.com/2016/11/24/13740798/google-deepmind...

Also typed text can be edited. I often find myself rewriting my comments on social media.

From the article: "This [usage of Google Assistant] isn’t just due to many semi-literate or illiterate users, but also the fact that typing is difficult for people who never grew up with a computer keyboard."

Another thing I'd add to that is "what about languages that don't have a good keyboard input story?"

I learned touch typing embarrassingly late in life for someone who programs all day, but I talk much faster than I type (TypeRacer says 60-80 wpm).

I find the qualities of someone’s voice, intonation and hesitation much more feature rich than emojis, in terms of transmitting information.

Emojis in real life are just tone, facial expressions and body language. Natural language is feature-rich; you just overlooked the key aspects that smilies and/or emojis are attempting to replicate.

And Google Assistant is not going to be using facial expressions or body language as input any time soon. I mean, it could, but I'm not sure how people would appreciate their phone saying "you look mad, would you like smooth jazz?" or "you are pacing and seem to be anxious, may I recommend breathing exercises?"

I can't wait for the day I have a "What's wrong?" "Nothing" "Something seems wrong, what is it?" "Nothing!" fight with my phone and computer. Progress indeed.

> I can write "IIRC" but it's hugely uncomfortable to speak this way.

Personally, I think it's even more uncomfortable to hear people speaking that way. But that's just me!

In a way, the limitations of voice might be a good thing if it means avoiding internet addiction?

You could do ten searches a day without spending much time at all on the phone. That's probably just as much value as other people get spending all their time on games or social networks.

Google (and especially Facebook) are reaching a point where they need these next billion, and the billions after that if there's anyone left, to grow their businesses. I would wonder about the financial value of these new users to the companies, though. I don't mean this in a disparaging way, but the first billion users are FAR more valuable to them than the "next billion" described in this article. Their revenue models are based on advertising, and the first billion live mostly in developed countries with high spending power, and are therefore worth more to advertisers. The "next billion" may be part of an emerging global middle class, but currently, and for the foreseeable future, will significantly lag the first billion in purchasing power and therefore dollar value to advertisers.

Right now, yes. This is more about capturing the future market. If Google or Facebook can establish an effective monopoly on the growing market, they are likely to retain that monopoly as the economy develops and their next billion users gain spending power.

What you're saying is true today, but not in the future. China's GDP is expected to surpass American GDP in the next decade, and India will follow suit in another ~3 decades. The combined South-American/African markets will likely surpass North America in the 21st century as well. Not because of any special merit on their part, but simply because their population exceeds America's by a factor of 3-4.

If you're a company like Google, that is well established to last for another couple decades, you'd be foolhardy not to ride this massive wave.

> China's GDP is expected to surpass American GDP in the next decade, and India will follow suit in another ~3 decades

Right but aggregate GDP is not as important to these companies as GDP per capita.

> If you're a company like Google, that is well established to last for another couple decades, you'd be foolhardy not to ride this massive wave.

Actually I'm pretty sure the big tech companies can make most of their revenue from high GDP-per-capita users. Not saying this is the moral choice, but financially I'm not sure that these less wealthy users will move the needle appreciably for the giants.

Sure, one wealthy consumer is more lucrative than one less-wealthy consumer. But what companies really care about is market size, which is number-of-consumers * per-customer-value, i.e., aggregate GDP. This is why Walmart is worth so much more than Macy's, despite their customers having a much lower income.
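A back-of-the-envelope illustration of that point, with entirely made-up figures chosen only to show how a much larger, lower-value user base can exceed a smaller, higher-value one in aggregate:

```python
# market size = number of consumers * average value per consumer
# (all figures below are invented for illustration)

first_billion = 1_000_000_000 * 40.0  # fewer users, high ad value per user
next_wave     = 4_000_000_000 * 12.0  # many more users, lower value each

print(first_billion)  # 40000000000.0
print(next_wave)      # 48000000000.0 -- the bigger market wins on aggregate
```

With these hypothetical numbers, a per-user value less than a third as large still produces a larger total market once the user base is 4x the size.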

3 decades ago the Internet didn't exist. 2 decades ago Google was a project in a basement. A decade ago smartphones were still thought to be a fad. Trying to predict tech trends 3 decades from now seems a tad foolhardy.

Population is not a tech trend.

For any business, expanding to new markets is as important as building new products. New markets bring new challenges and overcoming new challenges in turn improves the product.

For example, the offline maps feature Google built for India is now used all over the world and turned out to be very useful.

> Google (and especially Facebook) are reaching a point where they need these next billion

Except the next billion is already using Weixin.

They are so concerned because they see that the next 1B is the only remaining userbase reserve that can possibly feed their probable contender.

The voice-only stuff is just building AI for the sake of it and romanticizing accessibility. What's actually going to happen is that the younger generation in these countries will adapt to current technology and won't be caught dead asking their phone "Do I need an umbrella in Delhi today?". Voice interaction alone cannot possibly be the optimal method, and Indian/Chinese companies that are more in touch with the market will define what mobile interaction looks like in the future.

Google is all talk here and the technology they are talking about produces far worse results than a static display like the "web" used to be.

> for example, asking “Do I need an umbrella today in Delhi?” rather than typing “Delhi weather forecast.”

I routinely ask Google in the USA, "will it snow tonight?" and it is nearly always wrong. I don't mean that the data it accesses is wrong, or that the prediction was wrong. That stuff is usually right. I mean that Google will read the weather forecast's 45% chance of snow and then say confidently, "No, it will not snow tonight" as there is a near-blizzard taking place outside my window (because the snow was not "scheduled" to start for another 20 minutes, at a 45% chance, despite the fact that the snow is coming down outside). Or if there is a 100% chance of snow for 15 hours in a row, and it starts at 1am, but you ask at 11:55pm if it will snow tonight, it will say "No, it will not snow tonight."

Or if it is going to snow between 1am and 5am and you ask at 11:55pm the night before if it will snow tomorrow, it will say "No". You have to ask, "Okay Google, will it snow early in the morning tomorrow?" if you want to know if it will snow overnight.
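The failure being described is really an interval problem: "tonight" should be treated as a time window (now until morning), and the answer should check whether any forecast snow interval overlaps that window. Here's a toy sketch of that logic, purely illustrative (this is not how Assistant actually works, and the 8am cutoff and 30% threshold are arbitrary assumptions):

```python
from datetime import datetime, timedelta

def will_it_snow_tonight(now, snow_intervals, threshold=0.3):
    """Answer 'will it snow tonight?' by checking whether any forecast
    snow interval overlaps the window from `now` until 8am the next morning.
    snow_intervals: list of (start, end, probability) tuples."""
    if now.hour < 8:
        # asking after midnight: "tonight" ends this coming morning
        morning = now.replace(hour=8, minute=0, second=0, microsecond=0)
    else:
        morning = (now + timedelta(days=1)).replace(
            hour=8, minute=0, second=0, microsecond=0)
    for start, end, prob in snow_intervals:
        # interval overlaps the window and is likely enough to mention
        if start < morning and end > now and prob >= threshold:
            return True
    return False

# Asking at 11:55pm about snow forecast for 1am-5am should say yes,
# which is exactly the case the comment above complains about.
now = datetime(2018, 1, 15, 23, 55)
forecast = [(datetime(2018, 1, 16, 1, 0), datetime(2018, 1, 16, 5, 0), 1.0)]
print(will_it_snow_tonight(now, forecast))  # True
```

The point of the sketch is that overlap against a window, rather than a lookup keyed to the current calendar day or hour, is what makes the 11:55pm question come out right.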

Google's example query for the Next Billion needs work. "Okay Google, weather forecast" and then reading the results like on the web produces a 100x faster experience, sometimes with much, much better results.

(Aside: has it snowed even once at Google HQ since it was founded?)

Edit: And don't get me started on temperature. On the Home Mini: "Okay Google, what temperature is it?" ... "35." 35 what?! Yes, I know you have a settings/units preference somewhere that I cannot see or access right now (that's why I'm asking the magic box for the answer!). But "35" is not a temperature, unless you are giving it to me in Kelvin and choosing not to say the unit, which is WTF too.

The other day I heard my brother say "Set alarm for 3:15" instead of "15 minutes from now" and when I asked him why, he said the latter required Google Now, which required additional permissions, so he would rather figure out the time to ask for than accept additional permissions.

If you say "Set alarm for 7", it will set it for 7pm, even if you use 24-hour time. And I know no one says 19 o'clock when speaking, but it is interesting.

"Set alarm for 7 tomorrow" is 7am.

Our Echo is basically a glorified alarm clock/weather checker/music player. If you just give it a number, it'll ask you to clarify if you mean morning or afternoon.

"Set alarm for 7"

"Do you mean 7 in the morning or 7 in the afternoon?"

"7 in the morning."

"Setting alarm for 7 AM."

We've got a lot to fix before we worry about the next billion. Language processing isn't up to the task, AI-driven UIs generally suck, and TBH I feel people only use the internet when they have to (below a certain income level). It's interesting how much people will interact with their handset based on its price. Look at app installs on handsets with a net cost over $350: people install apps and do a lot. Below that, it's just a phone that takes pictures and does email. (This is a huge approximation, but I hope you see my argument.)

Everything on the internet is also becoming like everything on TV/print: it's just the same drivel. Where once there was a sense of community now resides a vast echo chamber.

"I fight for the users!" ;)

The author's photo is 3 megabytes, and the two even tinier images attached to the articles at the bottom of the page are each about 3 megs. The next billion users, especially if they're on mobile, shouldn't have to use that kind of bandwidth for non-informational embellishments on a static blog.

  the fact that typing is difficult for people who never grew up with a computer keyboard
This is something I had not really considered in regard to "digital assistants". I abhor them with a passion but can understand their usefulness and desirability from this perspective.

Maybe tangential question - but are there any open source/paid well documented text-to-speech libraries for Indian languages?

Even with voice-only stuff, isn't asking "Delhi weather forecast" much simpler than asking "Do I need an umbrella in Delhi today?". This doesn't sound right to me. Even if people are using the longer form that may be because of initial barrier and lack of training. Once they realize that they can get quality and accurate search results just by speaking keywords, they will switch to using keywords.

It's not only simpler, it's also more reliable. If I ask Google Assistant about the weather in the afternoon and it's 60F, it tells me the weather is great. But that's not the whole picture because if it's 60F now, it may be 50F or 45F in just a few hours. In which case, even though 60F fits some people's (not my) idea of great weather, I need to take a jacket and/or sweater with me when I walk out the door.

Point being, the AI doesn't know why I'm asking. It could make some reasonable guesses, but currently it doesn't even do that.

I could deal with that by learning the whole landscape of what it guesses about my question: when it's smart and anticipates my needs and when it doesn't, when I can and can't rely on it to do the right thing magically. Or I could just learn how to ask the question the way it needs me to, and that is a lot less to learn and worry about.

Maybe a quibble, but to me "internet" is fibre lines, cell towers, network protocols, etc., aka infrastructure.

What rides on top of the internet is what will change with the next billion users but the infrastructure will stay more or less the same. Also, just because the developing world works differently doesn't mean I'll throw away my keyboard.

It's a 20th century mechanical device and still the gold standard for transferring data from a human's brain to a computer.

Mobile is not a replacement for that, yet.

Sounds like you need to read more Clay Shirky.

I agree but we have to regulate current behemoths better and democratise access to attention economy.

> You should not have to learn English to use the internet.

But you need to learn it to program the internet. Get your muscle memory familiar with the hundred+ year old QWERTY layout.

I wonder if this means Google will begin developing features for the "next billion" at the expense of the "first billion"

Author of the post says

> The next billion users are not becoming more like us. We are becoming more like them.

If we are to take his words at face value, then I would imagine so ...

> We are becoming more like them.

Rather than understand and control our technology, we are forfeiting privacy and determinism to global megacorps larger than half the nations on Earth while they figure out what you should want rather than doing what you do want.

Let's just deep-dive into how horribly dystopic the "dream" of the Brazilian cellphone user in 2022 is:

A proprietary mobile handset manufactured in Chinese sweatshops with pre-installed state backdoors that the user has no idea about, because they don't even know what a computer is. But the state agents who can turn their microphones and cameras on on demand to spy on them, and the ISPs recording and mining all their communications, are fully aware of what they are doing.

Said device will have a solid plastic body with no removable battery, such that battery replacement requires a soldering gun and a semester of community-college engineering to accomplish. If any individual part breaks (cameras, gyros, GPS, wifi, a damaged screen, etc.), it's living with broken hardware, because repair costs are so catastrophically high, or repair is near impossible because parts are made unavailable on purpose. The device is designed to be disposable, to rot in a landfill where it can leak toxic chemicals from its manufacturing into the soil and air.

Locked bootloader to prevent anyone from running the software they want on hardware they own. No documentation on how that bootloader works, no ability to load alternative software from ROM. The SoC is proprietary, the baseband is a trade secret, and nobody can operate their own cellular network because all the IP is policed top-down by the state and privileged ISPs.

Kinda-open base OS. I'm still not sure to this day how Google messed up and made Android a proprietary hellscape in the early days, but it's unlikely Android 12 will have a new kernel or bionic / other base userspace. However...

Wholly proprietary app stack. Google apps are all proprietary, Google Play Services controls about 80% of the functionality of the device, and from power-on to shutdown everything is logged and sent to Google: GPS positioning, website browsing, microphone recordings, photos and videos taken, and all touch events / key presses.

Social media / ad farming shoved down the throat from day 1. Preinstalled Facebook, Instagram, maybe Twitter and/or Snapchat on the homescreen. No instruction manual on the technical details of how the device works. No way to even know you can program the device yourself, or how. No IDE, no compiler, no shell. No ability to install apps outside the Google Play store without knowing to navigate settings to toggle off the app lock.

And how would you even begin to search for information on the device you have? Oh right, Google. Which modifies your search results based on what it thinks you need, rather than what you know you want.

It's entirely meant to predate on the poor and ignorant. They aren't worth much to global corps, but advertising is damn good at operating on fractions of a cent, and they are already doing a good job of draining the creative and emotional health of the first world into addiction-rattled social hell.

And really, nothing summarizes more how horrible this is than in how they want you to use Google Assistant as your input - yes, Ma Google is here to record everything you say and interpret the best thing to give you based on your words since you cannot directly interact with the device because you are illiterate. You have to use our proprietary remote listening service to use your computer.

RISC-V, save us.

It's comforting to see that someone other than me sees the future as this dystopic. I don't want to, of course, but I've gotta prepare my two young sons for what I feel their most probable future is.

Great, the people who are late to the game and are here just to get on Facebook are the "future". Yay...

"late to the game" is the very definition of the future. Those "early to the game" are the past.

We, ourselves are "early to the game" of the next wave, and "late to the game" of previous waves (e.g. personal computer wars, mainframe era, industrial revolution, etc).

Facebook has actually done a lot of work to reach the "next billion users". See https://info.internet.org/en/impact/. They want to make Facebook essentially a portal to the internet for people in poor countries.

Facebook's efforts in that are self-serving and mean-spirited. Connecting people to your platform and hiding the rest of the internet while telling them they should love you just isn't right - and it's not done to help them, it's done to make Zuck more money.

See "Facebook and the New Colonialism", https://www.theatlantic.com/technology/archive/2016/02/faceb...

I agree with you. While it may look altruistic on the surface (providing free internet to the less fortunate), they are actually creating a monopoly on the internet for all future users. They can easily cut out competition.
