
What happens behind the scenes when we type www.google.com in a browser? (2015) - MrXOR
https://github.com/vasanthk/how-web-works
======
jamesmcintyre
I worked with the guy who wrote this repo. He's one of the smartest, nicest,
and genuinely curious guys I know.

He also ran teams and his leadership style was very humble and encouraging in
a workplace that was, at times, the opposite.

So glad to see he's still publishing awesome content like this.

Be sure to check out some of his other popular repos, great stuff!

~~~
mayakumar
I roger that. His name is `Vasanth Krishnamoorthy`. I still have the good
fortune of working with him in `WalmartLabs`. Extremely talented, multi-
faceted, curious and above all a wonderful human being. I have learnt plenty
through my interactions with him and am still learning from him :-). I am
serendipitously lucky to have met him in my life.

------
MrXOR
Another similar context:

[https://github.com/alex/what-happens-when](https://github.com/alex/what-happens-when)

[https://news.ycombinator.com/item?id=8902105](https://news.ycombinator.com/item?id=8902105)

[https://news.ycombinator.com/item?id=9189096](https://news.ycombinator.com/item?id=9189096)

------
vinayms
The main problem with these kinds of exposition topics is that the steps
involved are quite dense. By that I mean something like density of real
numbers; between any two real numbers there are any number of real numbers.
Then, the writer just chooses to expand on those steps that they are
comfortable with, or the ones that get more eyeballs. I mean, why not explain
how signal travels across various media like air, underwater cables, to and
from satellite etc?

Now, don't get me wrong, I am not saying this is useless. I am just saying
that if one chooses to give a 30k feet perspective, they had better stay there
and not bounce between 40K and 10K.

~~~
bgravinz
I’m not seeing how not sticking to 30k is really detrimental to how most people
would likely consume a doc like this, which would probably be to sample
various parts to find what they’re interested in and then dig in.

~~~
tempguy9999
I would most politely wish to note that one is tended to _read_ documents. One
should rarely consume them unless in the spirit of espionage.

Equally, in the comfort of one's own parlour, one may _listen_ to one's
wireless. Not 'consume', as was once heard.

The butler informs me on this occasion that I am correct. I bow to him in
this. Uncommonly sharp fellow.

And so to the Drone's Club, where literature, wireless nor dolly birds are
permitted.

------
AndrewStephens
I love a good reductionist deep dive into things we use everyday. This is a
great overview to a wide range of topics.

I wrote something similar a couple of years ago[0] which skips the keyboard
and display but goes further into the world of IP packet transmission (and
HTTP/2). The parent link is better written though.

[0]
[https://sheep.horse/2017/10/how_you_are_reading_this_page.ht...](https://sheep.horse/2017/10/how_you_are_reading_this_page.html)

------
xurukefi
The most amazing part about this is that regardless of how much more detail
one would want to put into it, it is probably practically infeasible to make
it complete. It's a wonderful example to demonstrate the importance of the
concept of abstraction in CS.

------
jimmaswell
I would've liked to see ps/2 keyboards gone over. I still use one.

~~~
notyourwork
Do you plug it into a USB port or a ps/2 port on motherboard?

~~~
jimmaswell
The motherboard.

------
turtles
In the TLS handshake: "The server generates its own hash, and then decrypts
the client-sent hash to verify that it matches"

The server decrypts a hash? But that's not how hashes work.

~~~
sowbug
The client encrypts its hash before sending it to the server. Thus the server
must decrypt it to compare it to the one it generated.

~~~
tristanperry
Hashes are one-way, so cannot be decrypted. The server can _compare_ the
results of a hash (by doing the hash itself, and comparing the results),
though.

~~~
sowbug
You and turtles are suffering from the cryptographic equivalent of a
hypercorrection, in the same way that well-intentioned people insist on the
propriety of the grammatically impossible phrase "between you and I" (which
should be "between you and me," because prepositions take objects, not
subjects.) The two of you have had the irreversibility of one-way hashes
drilled into your heads, just as many of us were taught when young not to say
"me and Susie were playing on the swingset." And you have an allergic reaction
to anyone using "decrypt" and "hash" in the same sentence, which can lead to
that allergy triggering a false positive. In this case that's what's
happening.

Cryptographic hashes are irreversible. That's the point of such a device. But
there is nothing stopping someone from taking the _result_ of a cryptographic
hash and then encrypting it, and then that someone or someone else decrypting
that ciphertext to recover the hash result. E(H(S), k) leads to an encrypted
hash, and D(E(H(S), k), k) recovers the hash. It's computationally infeasible
to retrieve S. But nobody wanted to do that; they just wanted to know H(S).
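
A minimal sketch of that E/D round trip, assuming a toy XOR keystream cipher stands in for a real one. This is not how TLS encrypts anything, and `S`, `k` and the helper names are made up for illustration:

```python
import hashlib
import hmac


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key + counter. Illustration only, not secure."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def E(data: bytes, key: bytes) -> bytes:
    """Toy symmetric 'encryption': XOR the data with the keystream."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


# XOR with the same keystream undoes itself, so D is the same function.
D = E

S = b"handshake messages so far"  # hypothetical input
k = b"shared session key"         # hypothetical shared key

h = hashlib.sha256(S).digest()    # H(S): cannot be reversed to get S
ciphertext = E(h, k)              # E(H(S), k): the "encrypted hash"
recovered = D(ciphertext, k)      # D(E(H(S), k), k): the hash again, not S

# The receiver computes its own H(S) and compares it with the decrypted one.
server_hash = hashlib.sha256(S).digest()
matches = hmac.compare_digest(recovered, server_hash)
```

Decrypting gives back H(S) only; nothing here recovers S from the hash.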

You are correct that the server compares the result of the hash (which in
context can also be called a "hash," such as "I used SHA-256 on my term paper,
and then I spray-painted the hash on the face of the town clock tower, thus
proving the existence of my term paper before the class deadline"). Nobody's
arguing that. But how did it obtain the thing it's comparing its own result
to, without a man in the middle (M) also obtaining that thing?

(I'm actually not sure whether TLS sends the actual hash or bases subsequent
computations on the assumption that both sides can independently derive it.
But if it does the former, it's totally fine to say "it decrypts the hash,"
which is the objection of the parent of this thread.)

~~~
tristanperry
Thanks, TIL :)

------
EE84M3i
I think this would be better if it had more details about font rendering and
layout engines.

~~~
jdbernard
I agree. We should fork it, flesh out those details and submit a pull request.
:)

------
fouronnes3
It's amazing how detailed this is, and yet there are still steps missing!

------
unixpickle
Of course, Google in particular is behind a very complex distributed network.
Distributed DBs are mentioned, but it would be cool to know more about how web
requests are distributed and routed in this system.

~~~
ibdkhb
I did this years ago so it might be different now, but I blocked google.com
and some subdomains at the firewall, found a link to a document hosted on
Google Docs (google.com domain) in a google.co.uk search result, and clicked
it. Instead of the link failing because of the firewall block, a google.se
(Swedish) server started sending the IP traffic for the blocked document. I
never tried blocking google.se and repeating the test to see which other
Google domain would send the document next, but it's clear Google have written
their own routing to get information around some restrictions, be they
deliberate or misconfigured. It's also an excellent way to probe which servers
have blocks in place, i.e. censorship. The rerouting is also pretty much
instant, i.e. subsecond, so their ability to pass instructions to other
servers in a timely manner is obvious. I wonder if their servers are using
swarm intelligence in areas or not? They did custom build their own machines,
which would have given them the opportunity to tear up the rule book somewhat.

------
ryanf323
Very comprehensive. However, it is missing how the client machine will ARP the
gateway for the MAC address if it is not in its tables.
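
A rough sketch of that missing step, assuming a simulated LAN rather than real packets. The addresses, the `LAN_HOSTS` table and the function names are all hypothetical:

```python
import ipaddress
from typing import Dict, Tuple

# Hypothetical hosts that would answer ARP on the local segment.
LAN_HOSTS = {"192.168.1.1": "aa:bb:cc:dd:ee:01"}


def next_hop(dst: str, iface_net: str, gateway: str) -> str:
    """Pick the IP whose MAC is needed: the destination if on-link, else the gateway."""
    if ipaddress.ip_address(dst) in ipaddress.ip_network(iface_net):
        return dst
    return gateway


def fake_arp_reply(ip: str) -> str:
    """Stand-in for broadcasting 'who has <ip>?' and receiving the owner's reply."""
    return LAN_HOSTS[ip]


def resolve_mac(ip: str, arp_cache: Dict[str, str]) -> Tuple[str, bool]:
    """Return (mac, arp_was_sent). ARP is only sent on a cache miss."""
    if ip in arp_cache:
        return arp_cache[ip], False
    mac = fake_arp_reply(ip)
    arp_cache[ip] = mac  # remember the answer for later packets
    return mac, True


cache: Dict[str, str] = {}
# 203.0.113.10 is a documentation address standing in for a remote web server.
hop = next_hop("203.0.113.10", "192.168.1.0/24", "192.168.1.1")
mac, sent_arp = resolve_mac(hop, cache)        # cache miss: ARP for the gateway
mac2, sent_again = resolve_mac(hop, cache)     # cache hit: no ARP needed
```

The remote server is off-link, so the frame is addressed to the gateway's MAC while the IP header still carries the server's address.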

~~~
mizzao
Perhaps you should submit a PR on the repo?

~~~
ryanf323
I should and I will

------
theandrewbailey
I remember an interview where I answered a similarly in-depth question about
what happens when you load a file, from file system traversal, down to the
heads moving across the platter (because almost no one used SSDs in 2009).

------
beamatronic
I used to work with some guys from Taos Consulting. They asked a similar
interview question. “What happens when you ping Google”. Your answer was
expected to take at least 3 hours.

~~~
yread
I also got asked a similar question in an interview (as a programmer) and I
answered it well and they hired me. But I think it's an absolutely terrible
question for judging whether a person will be a good hire.

------
0xEFF
Do you find this a useful interview question in 2019? What is or isn’t useful
about the answer from candidates?

~~~
irq11
This question is now burnt. Nobody should be using it. (In reality, it was
burnt a long time ago, but it’s crispy and black now.)

If you do use it, you’re merely selecting for people who have read posts like
this on HN and reddit. Candidates who have memorized any of this will look
vastly superior to those who haven’t.

It’s telling that there are already two sibling comments who are arguing that
this is a good question - explains a lot about why technical interviews suck.

~~~
brokenmachine
Just the fact those people are interested enough to read HN/Reddit for their
own sake can't be a bad thing though.

I'm not an employer but I know I'd rather my workmates were the kind of people
who are actually interested in computers enough to read about them for
pleasure, not just those "straight by the book" types.

~~~
toast0
The problem with this question is if someone gives a good answer, it can be
hard to tell if they studied it or they actually know. Maybe you can suss this
out with follow ups.

If they don't give a good answer, maybe they haven't looked into networking
details and debugging for some reason -- a lot of junior people haven't, but
they may have the aptitude to learn and be great at it, but just don't have
the knowledge base yet. Although it depends on exactly what you're hiring for,
too. If you need the person like me, who will find and fix your weird problems
with networking, maybe they should know this, or be able to make fairly
plausible guesses; but most people on my team don't need to do that (although
it's always nice to have more).

~~~
aaaaaar
"it can be hard to tell if they studied it or they actually know."

What is the difference? If they studied it, they now know?

Is it because as an interviewer, you are looking for knowledge by experience,
not via book-learning?

~~~
toast0
The difference is if they studied the answer, but didn't grasp the material,
they got information to pass the test (maybe), but probably didn't get useful
information.

I guess if you stop at each point and ask 'what could go wrong here, and how
would you debug it' and they answer that well, then they've gotten the
information enough.

------
mandeepj
relevant -
[https://www.google.com/search/howsearchworks/?fg=1](https://www.google.com/search/howsearchworks/?fg=1)

~~~
RyanShook
A lot is written about the algorithms and search results from Google but I
haven’t seen much on crawling and indexing. I imagine there are multiple teams
in Google who help maintain the integrity and completeness of their indexes
but would love to know more about it considering it might be the largest
single repository of knowledge.

~~~
shereadsthenews
I believe email is significantly larger than the web. Wouldn't that be your
largest repo of knowledge, in aggregate?

~~~
RyanShook
Yes, I agree. I just meant the Google index is probably the largest
centralized collection of knowledge. Maybe not though, just a guess.

------
tianshuo
This is almost the same open-ended interview question I've used for many
years. It works a charm because it filters the engineers from the chaff (those
who just copy-paste code from the internet w/o understanding why).

~~~
mrunkel
Me too. This gives a quick insight into what the applicant knows and what they
are interested in.

This is usually one of the three questions I require as part of an
application.

The other two are a technical question tailored for the position and the third
a 'throwaway' that the applicant answers as they see fit. In the past it's been
things like "What is the airspeed of an unladen swallow? (European or
African)"

~~~
dev_north_east
> "What is the airspeed of an unladen swallow? (European or African)"

What are you expecting to get from something like that?

~~~
mrunkel
Either they get the reference and say something funny in response, they don't
get the reference and google it, or they ask a question like "what are you
expecting from something like that?"

In any case, it's a chance to show themselves as a person.

------
amelius
If only we knew!

------
techslave
i invented this question circa 2009. i have to assume, one of many that
independently invented it.

in my version, it’s not “what is behind the scenes”, it’s “tell me everything
you can in as much detail as you like, what has to happen to visit
www.google.com”. of course in 2009 google had only just become google.com.

~~~
shereadsthenews
I've noticed that when anyone claims to have invented something on HN, they
get voted down. But I have no reason to doubt it. Please tell us more about
how you came to use this question, and why you think it wasn't in circulation
before that. I know someone asked me this in an interview in 2010, so it was
in wider use by then.

------
0815test
Doesn't it depend on the browser? Does Google Chrome still do that thing where
if you fire the browser up and then type www.google.com in it, it actually
_searches Google for that string_ and returns a results page, instead of just
browsing there?

~~~
chromeguy66
The behaviour of the URL bar is modifiable in most modern browsers. The
default on most is automatic detection of URLs - if something looks like a
URL, it's treated like one and if it doesn't, it's sent as a query to your
search engine of choice.
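
A simplistic sketch of that kind of detection, assuming two regex checks and an `https://` default. It is not any browser's actual algorithm, and `resolve_input` and the patterns are made up for illustration:

```python
import re
from urllib.parse import quote_plus

# Example search engine template; a browser would use the user's configured one.
SEARCH_TEMPLATE = "https://www.google.com/search?q={}"

# Something like "scheme://..." is taken as a URL outright.
SCHEME_RE = re.compile(r"^[a-z][a-z0-9+.-]*://", re.IGNORECASE)
# Dotted hostname with an alphabetic TLD, optional port and path.
HOSTNAME_RE = re.compile(r"^([a-z0-9-]+\.)+[a-z]{2,}(:\d+)?(/.*)?$", re.IGNORECASE)


def resolve_input(text: str) -> str:
    """Decide whether typed text is navigated to or sent to the search engine."""
    text = text.strip()
    if SCHEME_RE.match(text):
        return text
    if " " not in text and HOSTNAME_RE.match(text):
        return "https://" + text  # assumed default scheme
    return SEARCH_TEMPLATE.format(quote_plus(text))


navigate = resolve_input("www.google.com")
search = resolve_input("how web works")
```

Real browsers also consult history, known TLD lists and user settings, which is why the same input can behave differently between them.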

