
How the Internet works - Jonhoo
https://thesquareplanet.com/blog/how-the-internet-works/
======
jasode
_> In the interest of making the topic easier to comprehend, we will start
with a fairly high-level overview, and then dive into each individual
component as we go along. Our journey starts with a user called Alice. Alice
owns a laptop, and wants to send e-mail to a person called Bob. ... [wall of
text...]_

I understand the enthusiasm to spread knowledge but a wall of text like that
isn't going to be helpful for novices. When technical people try to impart
knowledge, the text they write usually suffers from the Curse of Knowledge[1].
For example, "DHCP" and "UDP" are mentioned several times, which is a level of
detail that is not necessary for an introductory overview.

Instead, explaining how the internet works would require lots of diagrams,
pictures, videos, etc. illustrating different scenarios. E.g., sending SMTP is
different from HTTP and both use DNS. All these "mental buckets" (and many
others) are difficult to separate for novices. Instead of drilling into a
timeline of bytes sent from Alice to Bob, new concepts can be taught as
"progressive layers" of detail (with the timeline rewound each time.)
Answering the question of "how the internet works" is so open-ended that
apparently, some tech interviewers like to use it for evaluating job
candidates.

[1] [https://en.wikipedia.org/wiki/Curse_of_knowledge](https://en.wikipedia.org/wiki/Curse_of_knowledge)

~~~
Jonhoo
I see your point, but I think it's still important to give an end-to-end
description, allowing people to tie it all together. The wall of text, if
you'd read it, uses progressive layers in several places (like core Internet
routing), specifically to try and avoid information overload for the reader.

If a reader were to actually _use_ networks for something (like try to write a
program that uses TCP), they would obviously need much more detailed
knowledge. At that point, diagrams become necessary, as do much deeper
explanations, but the goal of this post was to give an overview of what things
are needed to provide end-to-end communication, and why.

------
aristus
[http://carlos.bueno.org/2013/03/internet-shape.html](http://carlos.bueno.org/2013/03/internet-shape.html)

------
heydonovan
Have you seen Code.org's video on routing? Might be worth linking to, as it is
very simple to understand:

[https://www.youtube.com/watch?v=AYdF7b3nMto](https://www.youtube.com/watch?v=AYdF7b3nMto)

------
vezzy-fnord
I'd honestly recommend that people just pick up a copy of Andrew S. Tanenbaum
and David J. Wetherall's _Computer Networks_ and devote some time to getting
through it.

~~~
dayjah
I'd also recommend "How the World Was One" ( [http://www.amazon.com/How-World-Was-Arthur-Clarke/dp/0553074...](http://www.amazon.com/How-World-Was-Arthur-Clarke/dp/0553074407) )

It does an astonishingly good job of describing how networks evolved and why
they function as they do today (as of the mid-90s; not much has changed AFAICT).

~~~
danso
Just bought a used paperback version from Amazon, thanks!

------
markc
Apologies in advance for a near useless comment, but this typo? made me smile:

"When Bob "sumbits" this form, another HTTP request will be sent"

It took a moment to decide that it probably wasn't intentional :)

~~~
Jonhoo
Hehe, thanks, fixed.

------
pyvpx
The "Tier" system is out-dated at best and shouldn't be regurgitated in this
day and age.

~~~
Jonhoo
I'd be happy to get rid of it if it is indeed out-dated as you say, though
that was not my impression. Could you give me some references giving
information about how core Internet routing is now being done?

~~~
dvanduzer
Plenty of people will still talk about providers that way, but it has _never_
been about topology the way you describe. BGP is a mesh, not a tree. Packets
that do have to leave one network are routed to the closest BGP peer. Plenty
of small providers peer directly with other small providers (and always have).

It's also quite misleading to talk about the hostname once you've mentioned
BGP. At that layer, the routers have completely "forgotten" the hostname, and
are only looking at IP address/prefix to move that packet.
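
To illustrate the point about prefixes, here is a minimal Python sketch of a longest-prefix-match lookup. The table entries and next-hop labels (`upstream-provider`, `peer-a`, `peer-b`) are made up for illustration, and real BGP route selection weighs many more attributes than prefix length. The sketch only shows that the lookup key is an IP address, never a hostname.

```python
import ipaddress

# Hypothetical forwarding table: prefix -> next-hop label (illustrative values only).
ROUTES = {
    "0.0.0.0/0": "upstream-provider",
    "203.0.113.0/24": "peer-a",
    "203.0.113.128/25": "peer-b",
}


def next_hop(destination: str) -> str:
    """Return the next hop of the most specific prefix that contains destination."""
    addr = ipaddress.ip_address(destination)
    best = None  # (network, hop) of the longest matching prefix seen so far
    for prefix, hop in ROUTES.items():
        network = ipaddress.ip_network(prefix)
        if addr in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, hop)
    if best is None:
        raise LookupError(f"no route to {destination}")
    return best[1]
```

Calling `next_hop("203.0.113.200")` selects the /25 entry over the /24 and the default route, because it is the most specific match.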

I understand why one might want to illustrate the fact that mail delivered via
the SMTP protocol might be retrieved from the same server via the HTTP
protocol running on different ports. But this is not a feature of TCP. The
term "TCP multiplexing" commonly refers to reusing the same end-to-end TCP
session for multiple application requests (e.g. more than one HTTP GET to the
same server/port). Very different than IP multiplexing.

But email is also a very bad example to use to illustrate IP multiplexing,
because email uses a store-and-forward mechanism very unlike HTTP lookups. A
sender's email client just doesn't do lookups for a remote email provider. A
mail client has a provider-specified SMTP server for all outbound mail
traffic, regardless of destination. The client does use DNS to find that
server's IP address. But the sender's provider's outbound email server
performs an entirely different category of DNS lookup to figure out the
destination mail server for the domain. This difference makes it a
particularly bad idea to mention SMTP at all, only to say that it's
unimportant.
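
The two kinds of lookup can be sketched roughly as follows in Python. The DNS data here is a toy in-memory table, and every hostname and address in it is invented for illustration. A real implementation would query a resolver instead of a dictionary.

```python
# Toy DNS data; all names and addresses are made up for illustration.
A_RECORDS = {
    "smtp.alice-isp.example": "192.0.2.10",
    "mx1.bob-mail.example": "198.51.100.25",
    "mx2.bob-mail.example": "198.51.100.26",
}
MX_RECORDS = {
    # (preference, mail host); a lower preference value is tried first.
    "bob-mail.example": [(20, "mx2.bob-mail.example"), (10, "mx1.bob-mail.example")],
}


def client_submission_target(configured_smtp_host: str) -> str:
    """Mail client: resolve only the provider-specified SMTP server (address lookup)."""
    return A_RECORDS[configured_smtp_host]


def relay_delivery_target(recipient: str) -> str:
    """Provider's outbound server: MX lookup on the recipient's domain, then address lookup."""
    domain = recipient.rsplit("@", 1)[1]
    _preference, mail_host = min(MX_RECORDS[domain])
    return A_RECORDS[mail_host]
```

Note that `client_submission_target` never sees the recipient's domain at all, while `relay_delivery_target` is driven entirely by it.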

I'm sorry this is such harsh criticism. It is extremely difficult to learn
this material, let alone teach it to someone else in a digestible fashion. So
please don't let this discourage you from trying to write a document like
this. But it would be a lot more effective if you just didn't try at all to
explain quite so much material in one post. Hope that helps a little?

~~~
Jonhoo
I never refer to it as a tree, only as, effectively, a tiered mesh. While it's
true that this breaks down with multi-hop BGP forwarding, I think it conveys
enough about how the routing works to give a layperson an understanding that is
relatively close to the truth, no? I could potentially cut down the section on
Internet routing significantly, and simply mention that it is a routing fabric
that does relatively greedy hop-by-hop forwarding, though I'm not sure that
will be more digestible.

You're right that BGP doesn't care about hostnames, or even individual IPs,
but I'm not sure that distinction is actually relevant to someone who is being
exposed to this just now.

No, I disagree. IP has no multiplexing features beyond protocol multiplexing.
Port numbers that allow multiplexing a single IP among multiple applications
on the same host only appear in UDP/TCP. What you are referring to is
specifically called HTTP multiplexing, and has little to do with TCP.
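
As a rough illustration of port-based multiplexing, the Python sketch below binds two UDP sockets to the same loopback address and lets the operating system pick the ports. The `mail` and `web` names are only labels chosen for this example. It assumes a working loopback interface.

```python
import socket


def demo_port_demultiplexing() -> dict:
    """Send two datagrams to one IP address; the port decides which socket gets each."""
    host = "127.0.0.1"
    mail = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    web = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Port 0 asks the OS for any free port, so the two sockets get distinct ports.
        mail.bind((host, 0))
        web.bind((host, 0))
        for sock in (mail, web):
            sock.settimeout(2.0)  # avoid hanging forever if a datagram is lost
        sender.sendto(b"for mail", mail.getsockname())
        sender.sendto(b"for web", web.getsockname())
        return {
            "mail": mail.recvfrom(1024)[0],
            "web": web.recvfrom(1024)[0],
            "same_ip": mail.getsockname()[0] == web.getsockname()[0],
            "different_ports": mail.getsockname()[1] != web.getsockname()[1],
        }
    finally:
        for sock in (mail, web, sender):
            sock.close()
```

Both receivers share one IP address, so the only thing separating the two datagrams is the destination port in the UDP header.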

Again, technically, you are right. However, there is nothing in SMTP that
precludes the sender from contacting the destination SMTP server directly
(well, except that spam detection systems will freak out). The fact that SMTP
_supports_ hop-by-hop forwarding, and in many cases this is the only mode that
is used, does not mean that this is necessitated by the protocol. And in the
interest of making the content easier to understand, I decided describing
multi-hop SMTP was simply unnecessary.

I appreciate the feedback; it's always hard to piece together relatively
complex posts like this. As I mentioned elsewhere, I specifically wanted this
to be a single post such that readers could follow the communication flow end-
to-end. I believe this will improve reader comprehension, though of course
YMMV.

~~~
empath75
If I were writing an introduction to lay people, I'd probably just explain the
process of loading facebook.com, something people will be very familiar with.

Set up a very simple home network topology: a router/modem and a wired
Ethernet connection to a laptop. Explain all 4 layers of the TCP/IP model.
Explain DNS.
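
One way to show the layering is a toy encapsulation sketch in Python. The bracketed headers are invented placeholders (real TCP, IP, and Ethernet headers are binary and carry many more fields), and the address, port, and `router` label are assumptions made for this example.

```python
# Toy headers, outermost last; the application data itself is the innermost layer.
HEADERS = [
    ("transport", b"[TCP dst_port=80]"),
    ("internet", b"[IP dst=192.0.2.1]"),
    ("link", b"[ETH dst=router]"),
]


def encapsulate(application_data: bytes) -> bytes:
    """Sender side: each lower layer prepends its own header to what it was handed."""
    frame = application_data
    for _layer, header in HEADERS:
        frame = header + frame
    return frame


def decapsulate(frame: bytes) -> bytes:
    """Receiver side: strip headers from the outside in to recover the application data."""
    for layer, header in reversed(HEADERS):
        if not frame.startswith(header):
            raise ValueError(f"missing {layer} header")
        frame = frame[len(header):]
    return frame
```

For instance, `encapsulate(b"GET / HTTP/1.1")` yields the request wrapped in a transport, then an internet, then a link header, and `decapsulate` undoes the wrapping in the opposite order.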

BGP seems sort of advanced for an introduction to the topic, to be honest, and
so few people use SMTP directly rather than just using Gmail's web interface
that it's rather obscure.

