We can't send email more than 500 miles (2002) (web.mit.edu)
1034 points by alanpage on July 8, 2020 | 135 comments



I've had enough years to become wiser, become a fanatic for configuration management, and get over the embarrassment: I'm the consultant that screwed things up. Some background: the Stat department was running a variety of systems besides the Solaris workstations, and there was, within UNC-CH, a separate support organization that was cheaper and more comfortable with Microsoft products where Stat was sending their support dollars. When that organization needed Unix support, they called my employer, Network Computing Solutions, and I showed up.

There was effectively no firewall at UNC-CH at the time (something something academic freedom something something), and the Stat Solaris machines were not being regularly patched. Uninvited guests had infested them, and it appeared the most likely entry point was sendmail - at the time, it was the most notorious vulnerability on the internet. Since my preference to wipe and reload was unacceptable - too much downtime and too many billable hours - the obvious thing to do was update sendmail. The rest is history.


Can I get your autograph? The 500 mile email story has been a story since I was a teenager.

Thank you for the comment. It was delightful to hear your take on the (mildly apocryphal, but highly enjoyable) tale.


How about a joint autograph from both the protagonist—Trey Harris, and the antagonist—the consultant to the Stats department :) ?

Trey Harris also on HN [0].

[0] https://news.ycombinator.com/threads?id=TreyHarris


Yes. Seriously... I’m willing to pay for the shipping and everything. I’ve told this story countless times to people over the years. This and the “OpenOffice can’t print on Tuesday” bug are two of my favourite troubleshooting stories.


HN thread for the OpenOffice story:

https://news.ycombinator.com/item?id=8171956



I would also be ready to pay for a signed copy of this story. I’d frame it up in the office :)


Can I get a signed copy on dot matrix printout please


Finding a working dot matrix printer may be the most expensive part!


My father recently retired his printer (Epson LQ-850) not because the printer had failed, but because driver support was lacking. We had used the same printer since ~1992. He was super bummed; he had stockpiled ribbons and still had a full box of tractor-feed paper.

The explanation he was given by the person in charge of administration on his university-department computer: At least on the system he was using, modern drivers expected a reply from the printer in response to commands, but apparently, the LQ-850 only receives commands but does not reply.


They still make and sell new ones. Niche market for them because they can print onto transfer paper (with or without putting ink onto the top paper).


We need a video of you both discussing this tale. It has become part of our culture.


This is absolutely one of the key formative stories that helped me to think about systems at light speed scales.

I'm currently at the very early stages of building a science museum and will eventually try to incorporate this story into an exhibit about light speed. This along with "Nanoseconds", foot long pieces of wire like what Grace Hopper handed out, can truly help to bring this topic to life.

I'm also attempting to use this as the basis for a blockchain-based "proof of proximity" in which a very high number of round-trip encryptions of the previous block's hash are stored in a bitcoin block. The number of round trips would be high enough that devices even a few hundred feet apart couldn't complete the task before the next block.


I actually don't understand the last part of the OP story - can someone explain it?


The sending mail server was configured with a zero timeout for connections. If it didn't IMMEDIATELY get a response from the destination mail server it would fail. This immediate failure took 3 milliseconds, a long enough time that some servers could actually respond back before that happened... but if the server was too far away (more than 500 miles) the connection would fail before the first packet could even get there due to the finite speed of light.
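The arithmetic behind that explanation can be checked in a few lines. This is only a sketch: the 3 ms figure comes from the story, the constants are the standard definitions of the speed of light and the international mile, and the function name is made up for illustration.

```python
# Distance light covers in a vacuum during the ~3 ms that the "zero" timeout
# actually took, according to the story.
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre
METERS_PER_MILE = 1609.344            # international mile


def light_distance_miles(seconds: float) -> float:
    """Return how many miles light travels in a vacuum in `seconds`."""
    return SPEED_OF_LIGHT_M_PER_S * seconds / METERS_PER_MILE


print(round(light_distance_miles(0.003), 1))  # 558.8
```

That is the "a little over 500 miles" limit, ignoring slower media and the return trip.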


As a tech/coding newb I was assuming it related to speed of signal travel to approximate 500 ish miles.

Thanks so much for confirming it.

I‘m going to share this one with my sons who will appreciate the humour.


Thanks for writing the TL;DR version!


I read this story years ago and thought it was hilarious. Could've happened to anyone. In my book you're a near-celebrity and it's great that you can verify the story! Thanks for making things a little more interesting and a lot more fun :)


I’ve read this story many times, it’s hilarious (and could happen to anyone). Thanks for filling in that background - HN is so great for these kind of moments!


This should be added as an appendix to the original story. The original has been a favorite of mine for years and I love the addition.


> and get over the embarrassment

It's not that bad, these things happen.

It makes an interesting story though



2015 Comments from the original author:

https://news.ycombinator.com/threads?id=TreyHarris


I love that this took a perfect storm of having a statistician and sys admin both bent on finding the cause of a weird intermittent problem in their own way.

This could have happened a million times where the story was a lot less interesting:

"Hey, I'm having weird intermittent problems sending email."

"Hmm, we're using the wrong version of Sendmail. All fixed, case closed."


Best part of reading this is coming away having learned of the existence of units, the CLI tool. How did I spend 20 years on the shell and not need or discover this?


One thing I got bitten by was the handling of Fahrenheit/Celsius, because it's a non-linear conversion between the two. When you ask to convert `10 degC` to `degF` you get 18, which is the delta of ºF corresponding to increment of 10ºC. To get the absolute temperature, you have to ask to convert `tempC(10)` to `tempF` which is 50, as expected.

https://www.gnu.org/software/units/
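A minimal sketch of the distinction described above, written in plain Python rather than with units itself. The two function names are hypothetical; they only mirror the difference between a temperature difference (`degC`) and an absolute temperature (`tempC(...)`).

```python
def delta_c_to_delta_f(delta_c: float) -> float:
    """Convert a temperature *difference*: scale only, no offset."""
    return delta_c * 9 / 5


def temp_c_to_temp_f(temp_c: float) -> float:
    """Convert an *absolute* temperature: scale, then add the 32-degree offset."""
    return temp_c * 9 / 5 + 32


print(delta_c_to_delta_f(10))  # 18.0 (a 10 degree C rise is an 18 degree F rise)
print(temp_c_to_temp_f(10))    # 50.0 (10 degrees C on the thermometer is 50 degrees F)
```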


"Non-linear" threw me off for a second - I almost never see the mathematically correct definition of linear in computer science spaces. For anyone wondering, Celsius to Fahrenheit is an affine transform, technically not linear, because you have to add an offset, not just multiply.


On the other hand, an equation of the form y = a x + b is a linear equation. If you have Celsius and want Fahrenheit you accomplish that by applying a linear equation (F = 1.8 C + 32), so I certainly can't fault people for saying that the transformation they are doing is linear.

I wonder what people would say for something using an equation of the form y = a x^2 + b x + c to transform something? I can't say that I've heard anyone talk of quadratic transformations. On the other hand, I can't think of ever transforming anything with a quadratic equation, so never had the need to speak of it.

(Also, he called it a linear conversion, not a linear transformation).
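For anyone who wants to see the two senses of "linear" side by side, here is a small sketch. The helper names are invented for illustration; the check is the additivity property f(a + b) = f(a) + f(b), which a linear map in the strict sense must satisfy and an affine one with a nonzero offset does not.

```python
def c_to_f(c: float) -> float:
    """Celsius to Fahrenheit: a linear *equation*, but an affine map."""
    return 1.8 * c + 32


def scale_only(c: float) -> float:
    """The same conversion without the offset: linear in the strict sense."""
    return 1.8 * c


def is_additive(f, a: float, b: float, tol: float = 1e-9) -> bool:
    """Check f(a + b) == f(a) + f(b) up to floating-point tolerance."""
    return abs(f(a + b) - (f(a) + f(b))) < tol


print(is_additive(scale_only, 10, 20))  # True
print(is_additive(c_to_f, 10, 20))      # False: the +32 offset gets counted twice
```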


FWIW, units on macOS (not GNU) handles conversion of `10 degC` to `degF` correctly, although it dates back to 1993.

It seems that GNU units at some point added support for several non-linear units, which may have prompted them to rethink their syntax.


Be aware that currencies are stuck with rates from several years ago and don’t update.


Running `sudo units_cur` does the trick for me.

  $ units
  Currency exchange rates from FloatRates (USD base) on 2020-05-12
  $ sudo units_cur
  $ units
  Currency exchange rates from FloatRates (USD base) on 2020-07-09
(GNU units, packed by Debian)


In case anyone else needs this:

  # systemctl edit --force --full units-currency-update.service
  
  [Unit]
  Description=Update units(1) currency rates
  
  [Service]
  Type=oneshot
  Nice=19
  ExecStart=/usr/bin/units_cur
  
  # systemctl edit --force --full units-currency-update.timer
  
  [Unit]
  Description=Update units(1) currency rates
  
  [Timer]
  OnCalendar=daily
  AccuracySec=3h
  Persistent=true
  
  [Install]
  WantedBy=timers.target
  
  # systemctl daemon-reload
  # systemctl enable units-currency-update.timer


Looking at the source of the default configuration (cat /usr/share/misc/units.lib), I believe it only defines conversions for currencies that are pegged to another one (mainly to EUR or USD).

    You have: 10 franc
    You want: dollar
    conformability error
     1.5244902 euro
     1 usdollar
    You have: 10 franc
    You want: euro
     * 1.5244902
     / 0.655957


I'm tempted to say it shouldn't even attempt to support currency conversion, as constantly in flux as it is.


  $ units
  Currency exchange rates from FloatRates (USD base) on 2019-06-05


I wonder if this could be addressed with periodic updates.


I didn’t look too deep into it, my understanding was that the source it uses to update itself has been taken offline. There are workarounds involving data massaging and a cron but honestly that’s a lot more work than typing “1000 chf to usd” into ddg and getting the converted amount. But if you know something I don’t, maybe you could share for everyone’s benefit?


I also discovered `units` because of this tale... but I was lucky enough to read it back in the early days (pre 2005 at least).


'units' was new to me too. The version I have on my Mac wouldn't accept 'millilightseconds' but it would take 'milli-c-seconds' - presumably the units.lib database is a little different from one in the original article.


Though sadly millilightseconds is not supported on macOS, at least, so you have to go:

    3 millilightyears / 365 / 86400
Of course, round 365 to whatever average number of days you believe in :-)


  $ units
  You have: mph
  You want: kph
          * 1.609344
          / 0.62137119


I have

  alias units='units --verbose'
in my shell rc which makes the output much more understandable:

  You have: mph
  You want: kph
          mph = 1.609344 kph
          mph = (1 / 0.62137119) kph
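For readers puzzled by the default two-line output, a quick sketch of what the two numbers are: the first is the multiplier from "have" to "want", the second is simply its reciprocal. The only input is the standard definition of the mile (1609.344 m); the variable name is made up.

```python
METERS_PER_MILE = 1609.344

# 1 mph expressed in kph: metres per mile divided by metres per kilometre.
kph_per_mph = METERS_PER_MILE / 1000

print(f"* {kph_per_mph:.8g}")      # * 1.609344   (multiply mph by this to get kph)
print(f"/ {1 / kph_per_mph:.8g}")  # / 0.62137119 (or divide mph by this)
```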


I find the reciprocal thing useless, so I have units='units --compact --one-line', which gives just the number you want.


Me too! so awesome


I used to collect these kind of stories:

When I flush my toilet my computer reboots: http://www.techtales.com/tftechs.php?m=199712#66 (the first story on the page)

If I buy vanilla ice-cream my car wont start: http://www.netscrap.com/netscrap_detail.cfm?scrap_id=501

A specific cargo routing crashes system: https://www.jakepoz.com/debugging-behind-the-iron-curtain/

Tape-drive failure only within large print jobs: http://patrickthomson.tumblr.com/post/2499755681/the-best-de...

Interplanetary debugging with the Mars Rover: https://www.eetimes.com/the-trouble-with-rover-is-revealed/#


In my intern days some time around 10 years ago, a PI at the NASA GRC facility told me about a problem of this flavor an old grad student of his had.

The guy was working on an optical sensor in a light-tight lab. Every morning, he came in, calibrated the sensor, and performed measurements. All morning, it held calibration with negligible drift. But when he came back from lunch, each time, the calibration had drifted off.

Could it be related to the time of day? He tried taking his lunch an hour earlier and an hour later. Each time, the calibration was rock solid until right after lunch.

In spite of protocol, he tried eating lunch in the lab, no one else in or out. Before lunch: good calibration. After lunch: bad calibration.

He tried not eating lunch at all. That day, the calibration held all day.

How could an optical sensor have any concept of whether its user had eaten lunch? It turned out, it only had to do with the lunch box. The sensor was fiber coupled, and it was sensitive to changes in transmission losses generated by changes to local radii of the patch cord. Every morning, the grad student set his lunch box down on the lab bench, nudging the fiber into some path. After eating, he’d replace his lunch box on the bench, nudging the fiber into a different path.

After that, the fiber was secured with fixed conduit, and lunch boxes no longer entered the lab.





I love that after all these years there are still people discovering this classic for the first time.


That's a great non-cynical non-humbuggy view to have

I see something like this and have to consciously think of https://xkcd.com/1053/


thanks, I was looking for this comic for some unrelated reason.


Obvious feature request for HN: page that lists most reposted stories / links. :)


That would be the ‘past’ link, though it doesn’t turn anything up in this case as the title on this post is different. (The usual title is “The case of the 500-mile email,” but this copy is missing the subject line for some reason so the submitter used a representative phrase instead.)


Ha, I was just thinking if one wanted to farm karma they could just set up a cron to post this monthly. Seems to be a perennial favorite!


Yes, this story gets posted a lot, and many of us might know it, but at the same time, I like to think about the ones that didn't. They will learn something new today. XKCD said it better than I could: https://xkcd.com/1053/


Genuine perennial favs are worth repeating every so often --- a year or two's interval seems reasonable, and is vouched by HN.

A marker of aging for me was seeing, a decade or two after I'd first read them in the local paper as a callow youth, repeats of previous features, by topic if not the actual text. Eventually the thought occurred to me that perhaps the versions I'd remembered were themselves not original.

People tell, and repeat, and embellish, stories. Sometimes because the young'uns and whippersnappers and new arrivals haven't heard them yet. Sometimes because they're just damned good stories and we enjoy the retelling.


Yes, I’ll continue upvoting this classic when it pops up in the years ahead.


I did. And it was not only informative but also hilarious.


One of the most endearing xkcd strips. A bit of a lightbulb moment when I figured this out in my life.


Generally people submit links to past submissions that have comments that may be of interest.


The reason people share links from the past isn't some passive aggressive "UGH reposts amirite?!" like what you appear to be doing -- it's because past discussion on a fun read has lots of fun morsels of comments, and it's fun to revisit them alongside today's discussion.

HN doesn't have a rule that there should only be one canonical submission for every individual link or topic. That you thought your comment would contribute anything leads me to believe that you aren't aware of that.


I am well aware of HN policy regarding reposts. I just found it amusing that this one has so many.


Maybe next time just link to the search query. Most people familiar with forums are well aware that some topics come up more often than others.


It's pretty common for previous threads to be posted in the current thread.


Check again. We tend to enumerate previous threads that got traction and have past discussion. It's nice to go back to read those.

But most of what this person linked had 0 to 1 comments and like 2 points. I mean, yes, I would expect an interesting story from 2002 to have at least 30 failed submissions on HN that never got traction over 18 years.


I remember reading this one here in the 2000s, here's one from 2008:

https://news.ycombinator.com/item?id=385068


The ending of this makes it sound super clean. 3 ms * speed of light => ~560 miles. "It all makes sense!"

But ... isn't the speed of light through fiber actually like 2/3 of the speed of light in a vacuum? And that fiber isn't going to be laid out in a straight line exactly to the destination. So I think really there must have been a fair bit of uncertainty around that ~3ms to abort a connection.



The FAQ was more entertaining than the actual story!


Since the story has been posted umpteen times, this is what should be posted these days. Very entertaining, thanks!


Yes! Thank you GP this too was my first time reading that follow-up FAQ.


My goodness, the questions people sent in. "I've never met a rich nitpicker" comes to mind.


Was gonna say. Speed of light through cables or fiber optics is roughly 2/3 the speed of light through vacuum. Also I don't see how it would know it has established a connection until the round-trip has happened. All in all, it probably waited more like 10 ms, if this story were true, which it probably isn't.
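A rough way to put numbers on these objections. This is a sketch under stated assumptions: the 2/3 velocity factor is the figure quoted in the comments, the path is treated as a straight line, and the function and parameter names are hypothetical.

```python
C_MILES_PER_S = 299_792_458 / 1609.344  # speed of light in a vacuum, miles per second


def max_distance_miles(timeout_s: float,
                       velocity_factor: float = 1.0,
                       round_trip: bool = False) -> float:
    """Farthest straight-line distance a signal can cover before the timeout.

    velocity_factor is the fraction of c at which the signal propagates;
    round_trip halves the distance because the reply has to come back too.
    """
    distance = C_MILES_PER_S * velocity_factor * timeout_s
    return distance / 2 if round_trip else distance


print(round(max_distance_miles(0.003)))                     # 559: vacuum, one way
print(round(max_distance_miles(0.003, 2 / 3)))              # 373: fiber, one way
print(round(max_distance_miles(0.003, 2 / 3, True)))        # 186: fiber, round trip
print(round(max_distance_miles(0.010, 2 / 3, round_trip=True)))  # 621: 10 ms, fiber, round trip
```

Under these assumptions, a wait closer to 10 ms would also land in the neighbourhood of 500 to 600 miles, which fits the uncertainty the commenters point out.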


I am surprised that signals travel faster in copper (3c/4) than in fiber (2c/3). Does anyone have an explanation?


The number for copper is not a fixed quantity, it varies with the type of cable. Electric fields are outside of the copper wire, not inside. The copper conductor acts as a wave guide for the electromagnetic wave. So it turns out that things around the cable have an effect on the speed of propagation [1], particularly the insulator. Bare copper wire in a vacuum would be very close to c. In the case of fibre, the issue is index of refraction.

[1] https://en.wikipedia.org/wiki/Velocity_factor


And the time it takes to get to that fiber. It’s a fictive piece probably?


Truly this is a worldly gem. Thank you for submitting this. :)

It's easy to forget that, even though transmissions still travel at near lightspeed, it still takes more than an instant to reach its destination, even digitally. I should keep this in mind, I think.


IMO even though this has been posted a bunch of times it’s important to repost these sort of campfire ghost stories so future developers can think more creatively about strange errors.


It should be recommended reading for new IT support staff. While you get unhelpful "it's not working" requests all the time, when the user does give you information on the working / not working scenarios, you should always consider them, even if it doesn't make any sense.


Is there a book written about these campfire programming stories?

This one and the microsoft bedlam one are some of my favorites.


Not sure, but there are plenty of stories out there, so one could probably be compiled.


Only the other week we were doing some testing on a new HCI (hyper-converged infrastructure) that I'm doing the network for. At the end of the test period, we were having some storage sync issues. Everything seemed to ping OK, except my colleague happened to notice that large jumbo frames over 8000 bytes were getting dropped. I double-checked that we hadn't inadvertently changed the network configuration. It was only by chance, during another test looking at transceiver signal levels, that a customer engineer saw an alarm on the RX level. It was then we remembered that one test was to remove a module. I then noticed some error counts. We shut down that particular link until we could visit the site. Sure enough, that fibre wasn't quite clicked in anymore. There was enough of a bridge across the fibre air gap for small packets, but the gap was just wide enough that large packets statistically couldn't be corrected enough to work.
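The "statistically" part can be illustrated with a simple model. This is a sketch, not a description of the actual link: it assumes independent bit errors, ignores any error correction, and the bit error rate of 1e-5 is a made-up value chosen only to show the effect of frame size.

```python
def frame_survival_probability(bit_error_rate: float, frame_bytes: int) -> float:
    """Probability that every bit of a frame arrives intact,
    assuming each bit fails independently with probability bit_error_rate."""
    return (1 - bit_error_rate) ** (frame_bytes * 8)


HYPOTHETICAL_BER = 1e-5  # assumption for illustration, not a measured value

small = frame_survival_probability(HYPOTHETICAL_BER, 64)    # ping-sized frame
jumbo = frame_survival_probability(HYPOTHETICAL_BER, 9000)  # jumbo frame

print(f"64-byte frame survives:   {small:.3f}")  # roughly 0.99
print(f"9000-byte frame survives: {jumbo:.3f}")  # roughly 0.49
```

With the same marginal link, small pings almost always get through while about half of the jumbo frames are lost, which matches the symptom described.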


Made me think about the precision required for some errors to occur. I have had sort of similar things occur, where it's almost impossible to reproduce the problem when you try!

As a hobbyist sound engineer, usually regular cables are the first I check, but maybe I should extend that to fibre?

Interesting error nevertheless, and honestly, checking fibre cabling for those kinds of errors would probably be a bit lower on my list, unless I saw a lot of transceiver errors.


A while back, Andreas Zwinkau collected these kinds of stories. Please send him any that are not yet listed.

http://beza1e1.tuxen.de/lore/index.html


If somebody is tracking apt/yum downloads of packages, they might see a sudden spike for "units". Just installed and it is a nifty little useful tool.


Love this story.

I'm certain it will continue to be reposted to this website until it ceases operations or the heat death of the universe, whichever comes first.


Love this story, funnily enough it was the first story I ever read on HN on my first day of work as a junior developer.


Can someone please explain to me the POP reference? I do not understand what this author means by that.

I also would like help understanding what $ units gives? The command looks to be "units", but where do the numbers he entered in come from? I would appreciate this extra context.


$ is the shell prompt; he's not typing it. "3 millilightseconds" is the distance light travels in 3 milliseconds, the time a "zero" timeout would take to actually time out. (This comes directly from the definition of the lightsecond: how far light travels in one second.) "miles" is what he wants to see that distance converted to. Turns out it's 558 miles; one mile is 0.00179 of 3 millilightseconds.


Point Of Presence (POP) is the location where two or more networks meet each other to connect.

Here is a good explanation: https://blog.stackpath.com/point-of-presence/


Point of Presence is where the university network would connect with the internet


POP I believe is "point of presence"; I believe its meaning is "the point at which our network connects to the internet".


Edit: Found it - it's /usr/share/units/definitions.units (on Pop!_OS, so probably the same on Ubuntu/Debian).

The FAQ[0] mentions:

> units on SunOS doesn't know about "millilightseconds."

> Yes. So? I used to populate my units.dat file with tons of extra prefixes and units. And actually, I think I was using AIX to run units; I don't know if it knew about millilightseconds. Take a look at the units.dat shipped with Linux these days. It definitely knows about millilightseconds.

I tried locate for units.dat but couldn't find it. Does anyone know where it is? I'm not keen on running a system-wide find.

[0]: https://www.ibiblio.org/harris/500milemail-faq.html


units -U gives you the location of the definitions file.


macOS: /usr/share/misc/units.lib

ubuntu: /usr/share/units, after `apt-get install -y units`


Hat tip to @nfriedly from back in 2011:

FYI units on OS X doesn't recognize millilightseconds, but you can do this:

  You have: 3 lightyears / 365 / 24 / 60 / 60 / 1000
  You want: miles
   * 559.21802
   / 0.0017882113


    % units
    586 units, 56 prefixes
    You have: 3 lightyear*millisec/year
    You want: mile
     * 558.84719
     / 0.0017893979
or you can patch up the library file

    /usr/share/misc/units.lib
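The two workarounds above give slightly different answers (559.21802 vs 558.84719). A plausible explanation, and this is an assumption inferred from the numbers rather than something checked against units.lib, is that the light-year there is defined with a year of about 365.2422 days, so dividing by a flat 365 leaves a small excess. A sketch:

```python
C_M_PER_S = 299_792_458
METERS_PER_MILE = 1609.344
SECONDS_PER_DAY = 24 * 60 * 60
ASSUMED_DAYS_PER_YEAR = 365.2422  # assumption: tropical year used in the lightyear definition


def three_light_ms_in_miles(days_divided_out: float) -> float:
    """Start from 3 lightyears and divide back down to 3 light-milliseconds."""
    three_lightyears_m = 3 * C_M_PER_S * ASSUMED_DAYS_PER_YEAR * SECONDS_PER_DAY
    meters = three_lightyears_m / days_divided_out / SECONDS_PER_DAY / 1000
    return meters / METERS_PER_MILE


print(f"{three_light_ms_in_miles(365):.3f}")                    # 559.218
print(f"{three_light_ms_in_miles(ASSUMED_DAYS_PER_YEAR):.3f}")  # 558.847
```

Dividing by the library's own `year`, as the second workaround does, cancels the year length exactly and gives the true 3 millilightsecond figure.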


GNU units is available through Homebrew, and does.


You can also type "3/1000 lightseconds to miles" in Google.


GNU Units is spectacular.


This story makes me smile every time it comes up. It's fascinating how many arbitrarily coded limits we keep breaking as we make our tech go faster without re-assessing the original assumptions :)


Does anyone know of stories similar to this? I would like to read more.


I don't have time to type it up more completely, but back in the early to mid 90s I administrated, among others, some AT&T 3B2 servers. ( https://en.wikipedia.org/wiki/3B_series_computers )

There was an odd situation where some of the systems were unable to make connections to other systems ....some largish distance away. It was fairly variable, but some systems were almost always unreachable, and some were occasionally reachable.

Long investigation, but in summary, this happened because the default packet TTL was, for some reason, set to a fairly low value in a minor kernel update. I simply increased that number in the kernel, recompiled, and all of the problems went away.
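A toy model of why a low TTL behaves like a distance limit. It is only a sketch: the TTL and hop counts are invented for illustration, and real paths vary as routing changes, which may be part of why some hosts were only occasionally reachable.

```python
def reaches_destination(initial_ttl: int, routers_on_path: int) -> bool:
    """Simulate IP TTL handling: every router decrements the TTL and
    discards the packet if the TTL reaches zero before delivery."""
    ttl = initial_ttl
    for _ in range(routers_on_path):
        ttl -= 1
        if ttl <= 0:
            return False  # this router drops the packet
    return True


LOW_TTL = 15  # hypothetical "too low" default

print(reaches_destination(LOW_TTL, 10))  # True: a nearby host, few routers in between
print(reaches_destination(LOW_TTL, 20))  # False: a distant host, too many hops
print(reaches_destination(64, 20))       # True: raising the TTL fixes it
```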


https://www.reddit.com/r/talesfromtechsupport/comments/29hzl... is another classic. (I remember seeing it back in the 90s.)


Not as funny but still a very interesting read https://en.wikipedia.org/wiki/The_Cuckoo%27s_Egg


The style is very reminiscent of The Register's BOFH line of stories (though without the mischievousness)--now those are mostly fiction, but they're amusing nonetheless...



Haha, this was interesting. I posted the same story with more or less the same title a couple of months ago [1], and no one saw it and there were no comments. This time it got almost 1000 points and lots of comments. What I think is interesting is how the same thing can get such different traction. I wonder what factors make a thing get traction or not?

[1] https://news.ycombinator.com/item?id=22164691


Since I've worked with Linux email servers (sendmail, qmail, postfix, exim, etc) practically my whole professional life (since around 1997, but used BBS since 91 - I'm 41 now), this story really amused me and got my attention! I love this kind of email debugging! LOL


In network engineering circles, this one is second only to the RFC for IP over avian carriers.


There was also this "implementation" of IP over avian carriers:

https://www.blug.linux.no/rfc1149/


There is a popular GitHub repository for similar debugging stories: https://github.com/danluu/debugging-stories



I've heard this story before, but I didn't realize it was as recent as 2002. It feels like something from a much earlier bygone era, like the early 90s.


In fact it is from 1994-1997, as explained in the FAQ linked above.


In a way, Hacker News acts like a Leitner deck.


This is one of my favorite bugs ever. Loved to read it again.


Debugging software is not unlike detective work.


Anybody know the original year?


This is wild!


So old! DUPE


I suggest changing "mail" to "email" in the title.


Ok, we've e'd the mail above.


The one thing that throws a wrench in this story for me: Lattes.

Latte's in 1994? In North Carolina? No way. Maybe on the West Coast, but I moved to Cali in 1989 and they were a rarity until the mid-late 90's. There were only 425 starbucks in the US in 1994 (from their site). The "fancy coffee" craze was just a blip on the radar in the mid 90's but gaining momentum.

;-)


> The "fancy coffee" craze was just a blip on the radar in the mid 90's but gaining momentum.

Friends premiered in 1994, with The Central Perk being a major set piece of the show. I mean, yeah, it's New York City and not North Carolina, but college towns anywhere are going to be early in trends.

A latte in 1994 seems plausible to me. I remember getting them from a Gloria Jeans in my local suburban mall around 1990 or so.

You're not wrong about the shape of the trajectory, but all throughout the 80s the coffee shop/latte trend was slowly building steam (heh) before it went hockey stick in the mid-90s.


Oh yeah, that's right. Good point! I just illustrated how unhip I was in the early 90's.


In ‘94 (if not earlier), I was drinking lattes at a mom and pop coffee shop in a tiny town in the Midwest. And at another indie coffee shop at the nearest major university campus. That place was open 24 hours and busy at all hours. I didn’t even know what Starbucks was, but I sure knew lattes and cappuccinos.

So yeah, lattes in ‘94 in a major college town seems totally plausible.


Agreed, I graduated high school in 1995, and I was dating a college girl in Rome, Georgia. Our favorite hangout was a coffee shop that served, among other things, lattes and frappes.


> Latte's in 1994? In North Carolina? No way

Definitely possible. Chapel Hill isn't like the rest of North Carolina, so I'd expect something like that to appear here before other parts of the state. And I remember the Books-a-Million in Wilmington started adding a cafe / Starbucks-like area for "fancy coffee" about 1996 or so. I have no problem believing there were shops serving lattes in Chapel Hill during the era this story is described as happening in. And to be fair, the author even says in the FAQ that he's not sure about the exact date(s). It could have been as late as 1997.


From the FAQ

> My guess, from the office I remember being in, the coworkers I remember speaking about this to, and some other such irrelevant but timely details, place it somewhere between 1994 and 1997.


> Latte's in 1994? In North Carolina?

We drank lattes in Louisiana in the 80's. Time to upgrade your stereotypes.


just get it in the pipes


Tubes, alright?! The Internet is a series of tubes!

https://youtu.be/f99PcP0aFNE?t=127


lol, thanks for the correction.



