
How efficiently does Morse code encode letters? - CarolineW
https://www.johndcook.com/blog/2017/02/08/how-efficient-is-morse-code/
======
e28eta
I never got around to actually learning Morse Code, but I think the English
language corpus is probably dramatically different from the messages that are
actually sent via Morse Code. My understanding is that abbreviations are
commonly used, greatly skewing the letter frequencies:
[https://en.m.wikipedia.org/wiki/Morse_code_abbreviations](https://en.m.wikipedia.org/wiki/Morse_code_abbreviations)

~~~
topspin
There are also a large number of Q codes that cover practically everything
related to communication and also naval, aeronautic and rescue concerns. Today
much of this isn't in use, but amateur radio operators use a subset of these
routinely; a typical fast Morse code exchange is often composed almost
entirely of Q codes and call signs.

[1]
[https://en.wikipedia.org/wiki/Q_code](https://en.wikipedia.org/wiki/Q_code)

~~~
khedoros1
I became aware of Q codes through their use in the book Seveneves. It provided
a brief glimpse into a subculture that I've never had much contact with, and I
ended up in a bit of a Wikipedia crawl.

~~~
FiatLuxDave
I was lucky enough as a kid to inherit my grandfather's collection of old
sci-fi books. I first became aware of Q codes through the short story "QRM
Interplanetary" from 1942
([https://en.wikipedia.org/wiki/Venus_Equilateral](https://en.wikipedia.org/wiki/Venus_Equilateral)).

I find it interesting that Q codes have changed so little over the last
century. I have some ham radio friends who will use Q codes when talking
face-to-face with each other, and the codes they use are basically the same
ones from the original list.

------
WalterBright
I have patents 6418323, 7831208, 6850782 which cover using Morse code on a
phone to send and receive text messages.

The advantage is you don't have to look at the phone's display when composing
or 'reading' text messages. The phone can be in your pocket and you can
send/receive messages without bothering others.

Obviously, it never caught on, but I thought it was a fun idea.

~~~
jlgaddis
This is actually something I would use -- say, while driving. Emoji would
probably throw me off, though.

I learnt Morse at age 12 or so, actually used it regularly on amateur radio
for quite a while, but haven't used it for probably about 20 years. I can
still "read" it when I hear it, without even thinking about it.

------
tzs
The article is looking at efficiency in terms of time to transmit a given
message. For that you want to assign the shorter codes to the symbols with the
highest frequency. Morse did a decent job of that, except that assigning
'---' to 'O' seems way off, since '---' is in the top 5 for longest code
length, whereas 'O' is one of the 5 most frequent symbols.
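
A rough way to check this is to weight each letter's duration by how often the letter occurs. The sketch below is only an illustration: the frequency table holds rounded, commonly quoted figures (an assumption, they vary by corpus), and it assumes a dash lasts three dot lengths with one dot length of silence between elements. The helper names `duration`, `mean_duration`, and `swapped` are made up for this example.

```python
import string

# International Morse code for A-Z, in alphabetical order.
CODES = (".- -... -.-. -.. . ..-. --. .... .. .--- -.- .-.. -- "
         "-. --- .--. --.- .-. ... - ..- ...- .-- -..- -.-- --..").split()
MORSE = dict(zip(string.ascii_uppercase, CODES))

# Approximate English letter frequencies in percent. These are rounded,
# commonly quoted figures and an assumption here; they vary by corpus.
FREQ = dict(E=12.7, T=9.1, A=8.2, O=7.5, I=7.0, N=6.7, S=6.3, H=6.1, R=6.0,
            D=4.3, L=4.0, C=2.8, U=2.8, M=2.4, W=2.4, F=2.2, G=2.0, Y=2.0,
            P=1.9, B=1.5, V=1.0, K=0.8, J=0.15, X=0.15, Q=0.1, Z=0.07)


def duration(code):
    """Length of one letter in dot units, including gaps between elements."""
    return sum(1 if e == "." else 3 for e in code) + (len(code) - 1)


def mean_duration(table):
    """Frequency-weighted average letter duration for a code table."""
    total = sum(FREQ.values())
    return sum(FREQ[c] * duration(table[c]) for c in FREQ) / total


def swapped(table, a, b):
    """Copy of the table with the codes for letters a and b exchanged."""
    out = dict(table)
    out[a], out[b] = table[b], table[a]
    return out


print(duration(MORSE["O"]), duration(MORSE["M"]))  # 11 7
print(round(mean_duration(MORSE), 2))
print(round(mean_duration(swapped(MORSE, "O", "M")), 2))
```

Under these assumptions, giving 'O' the shorter '--' and moving the rarer 'M' to '---' lowers the average, which is the sense in which the assignment looks inefficient on paper.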

I wonder what would change, if anything, if instead we considered things from
the receiving side, and put a bound on the acceptable error rate? That '---'
for 'O' really stands out when listening to code. The only other letter that
has a '---' in it is J ('.---'), which only occurs about 2% as often as 'O'.
Maybe 'O' being so distinct and easy to hear, and frequent enough that you
will have an 'O' every 10 or so characters, helps keep the listener
synchronized?

Early Morse code was sent fully by hand, and so the timing would not be
precise. The timing is supposed to be, in units of the length of a dot: 1 for
a dot, 3 for a dash, 1 for the space between adjacent dots and dashes within a
character, 3 for the space between characters in a word, and 7 for the space
between words. A good, experienced operator would hit that timing very
accurately, but less experienced operators could be quite a bit off.
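
Those timing rules are easy to turn into a small calculator. This is a minimal sketch that handles only the letters A-Z and single spaces between words; the function names `char_units` and `message_units` are made up for this example.

```python
import string

# International Morse code for A-Z, in alphabetical order.
CODES = (".- -... -.-. -.. . ..-. --. .... .. .--- -.- .-.. -- "
         "-. --- .--. --.- .-. ... - ..- ...- .-- -..- -.-- --..").split()
MORSE = dict(zip(string.ascii_uppercase, CODES))

# Durations in units of one dot length, as listed above.
DOT, DASH, ELEMENT_GAP, CHAR_GAP, WORD_GAP = 1, 3, 1, 3, 7


def char_units(code):
    """Duration of one character, including the gaps between its elements."""
    marks = sum(DOT if e == "." else DASH for e in code)
    return marks + ELEMENT_GAP * (len(code) - 1)


def message_units(text):
    """Duration of a whole message (letters and spaces only)."""
    total = 0
    for w_index, word in enumerate(text.upper().split()):
        if w_index:
            total += WORD_GAP  # gap before every word except the first
        for c_index, ch in enumerate(word):
            if c_index:
                total += CHAR_GAP  # gap before every letter except the first
            total += char_units(MORSE[ch])
    return total


print(message_units("PARIS"))    # 43
print(message_units("awkward"))
```

"PARIS" comes to 43 units, or 50 once a trailing word gap is added.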

Someone whose timing is off might shorten the gap between characters enough
to run the dashes at the end of one character into the start of the next. For
example, in the word 'awkward', the 'wk' sequence is '.-- -.-'. If the sender
did not leave as big a gap as they should between the characters, the trailing
'--' from the 'w' and the leading '-' from the 'k' could run together, giving
a '---'. Even in that situation I don't think it would sound like an 'O',
because with an 'O' you go into it trying to make 3 evenly spaced things, and
we are good at that. We might get the spacing off, but we'll get it off the
same way uniformly throughout the 'O'. An accidental 'O' from running two
things together won't have that uniformity, and so I think it would stand out.
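
To see how much the gaps matter on paper, here is a small sketch that drops the gap from 'wk' and lists every way the resulting run of dots and dashes could be split back into letters. It looks only at the symbols, not at the timing or sound described above, and the function name `parses` is made up for this example.

```python
import string

# International Morse code for A-Z, in alphabetical order.
CODES = (".- -... -.-. -.. . ..-. --. .... .. .--- -.- .-.. -- "
         "-. --- .--. --.- .-. ... - ..- ...- .-- -..- -.-- --..").split()
MORSE = dict(zip(string.ascii_uppercase, CODES))
INVERSE = {code: letter for letter, code in MORSE.items()}


def parses(signal):
    """All letter sequences whose codes concatenate to the given signal."""
    if not signal:
        return [""]
    results = []
    # Letters use between 1 and 4 elements, so try each prefix length.
    for n in range(1, min(4, len(signal)) + 1):
        letter = INVERSE.get(signal[:n])
        if letter is not None:
            results.extend(letter + rest for rest in parses(signal[n:]))
    return results


merged = MORSE["W"] + MORSE["K"]  # '.---.-' with the character gap lost
candidates = parses(merged)
print(merged, len(candidates))
print("WK" in candidates, "JA" in candidates, "EOA" in candidates)
```

The intended 'WK' is only one of many readings; 'JA' and 'EOA' (with the accidental 'O') fit the same symbols, so the receiver has to rely on timing and context to choose.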

In other words, I'm guessing that the apparently anomalous assignment of a
seemingly too-long code to 'O' actually serves to make communication more
accurate in the presence of inexperienced senders and receivers.

~~~
rsfinn
On the other hand, the letter O in the older American Morse was relatively
short because it contained a "long" internal gap: "dit-dit" (as opposed to I
being "didit" and the "word" EE being "dit, dit"). If the regular gap is the
length of a dot, and the inter-letter gap is the length of three dots, the
length of the gap in O was two dots.

International Morse eliminated the "long" internal gap (according to Wikipedia
[1], this had an advantage on the first long undersea cables), so O had to be
re-encoded. '---' ("dahdahdah") was the only three-element code not already
being used for a letter. (It happens to be the number 5 in American Morse.)

When I was a Novice-class ham many years ago, I found that older hams would
sometimes send the prosign C (in the sense of "confirm" or "yes") as
didit-dit instead of dahdidahdit. I never really understood why; I just went
along with it. Turns out didit-dit is C in American Morse.

[1]
[https://en.wikipedia.org/wiki/American_Morse_code](https://en.wikipedia.org/wiki/American_Morse_code)

------
gumby
It's not just frequency/density but also the value for forward error
correction: codes (code points) that can more easily be mistaken for others
should have such different semantics that an error is obvious. For example,
though e and a are two of the most common letters in the English language,
perhaps use the shortest code for e and the next shortest for q, so that a
misunderstanding causes "went" to become "wqnt" rather than "want". And indeed
perhaps e does not deserve the shortest code because of its importance.
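
One crude way to look for such confusable pairs is to list the letters whose codes are a single slip away from a given letter: one element flipped, dropped, or added. This is only a sketch under that assumption; real confusions depend on timing and on the operator's ear, and the function name `neighbors` is made up for this example.

```python
import string

# International Morse code for A-Z, in alphabetical order.
CODES = (".- -... -.-. -.. . ..-. --. .... .. .--- -.- .-.. -- "
         "-. --- .--. --.- .-. ... - ..- ...- .-- -..- -.-- --..").split()
MORSE = dict(zip(string.ascii_uppercase, CODES))
INVERSE = {code: letter for letter, code in MORSE.items()}


def neighbors(letter):
    """Letters whose code differs from this one by a single element edit."""
    code = MORSE[letter]
    variants = set()
    for i in range(len(code) + 1):
        for element in ".-":
            variants.add(code[:i] + element + code[i:])  # one element added
    for i in range(len(code)):
        variants.add(code[:i] + code[i + 1:])            # one element dropped
        flipped = "-" if code[i] == "." else "."
        variants.add(code[:i] + flipped + code[i + 1:])  # dot/dash swapped
    return {INVERSE[v] for v in variants if v in INVERSE}


print(sorted(neighbors("E")))  # ['A', 'I', 'N', 'T']
print(sorted(neighbors("O")))
```

With International Morse, 'E' sits one slip away from 'A', which is exactly the "went"/"want" case.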

~~~
sp332
English is already pretty redundant; you only need about one bit per character
for most text. You can see this if you compress a block of English text:
you'll probably get about 12.5% of the original size, which is one bit per
eight-bit character.
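
This is easy to try with Python's standard `zlib` module. The sketch below measures compressed bits per character on a short sample taken from this thread; the function name `bits_per_char` is made up for this example. A general-purpose compressor on a sample this small should be expected to land well above one bit per character, since that figure is closer to what very good models of English approach on long texts.

```python
import zlib

# Short ASCII sample; any block of English text can be substituted.
SAMPLE = (
    "Early Morse code was sent fully by hand, and so the timing would not be "
    "precise. The timing is supposed to be, in units of the length of a dot: "
    "1 for a dot, 3 for a dash, 1 for the space between adjacent dots and "
    "dashes within a character, 3 for the space between characters in a word, "
    "and 7 for the space between words. A good, experienced operator would hit "
    "that timing very accurately, but less experienced operators could be "
    "quite a bit off."
)


def bits_per_char(text, level=9):
    """Compressed size in bits divided by the number of input characters."""
    raw = text.encode("ascii")  # one byte per character for ASCII text
    return 8 * len(zlib.compress(raw, level)) / len(raw)


print(round(bits_per_char(SAMPLE), 2))
```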

~~~
gumby
Yes, but Morse is being decoded in real time by a piece of meat (a trained NN
admittedly, but different senders have different "fists" so there's a lot of
variation). The problem is that some character errors are ambiguous.

~~~
ddingus
Hwoever thows errs pars fiarle easly

People on both ends tend to adapt quickly. A few brief, known exchanges will
typically result in longer ones going fairly well.

And "the meat" does all sorts of stuff. Names get shortened or changed for
brevity or style. Gark might be gary, for example.

One group I observed used a lot of first letter strings:

BBQSTSP?

Y

K

Barbecue, same time same place?

Yes

OK

------
diego898
If you liked this article, you should check out a blog post giving a visual
introduction to information theory[1] by Chris Olah

[1]: [https://colah.github.io/posts/2015-09-Visual-Information/](https://colah.github.io/posts/2015-09-Visual-Information/)

------
leovonl
SOS was meant to be easily recognizable and trivial to generate:

    
    
        ... --- ... --- ... --- ... (and so on)

~~~
sverige
Actually, it is sent as a single nine-element signal with no gaps between the
letters, then repeated. Graphically this looks like ...---... ...---...
...---... and so on.

Incidentally, graphical representations of Morse code are weak tea. It has to
be heard to be understood.

------
jonah
Note that in addition to the Morse Code that roughly equates to "printable"
ASCII, there are prosigns [procedural signals] which are control characters
that affect things like transmissions and message formatting.

[https://en.wikipedia.org/wiki/Prosigns_for_Morse_code](https://en.wikipedia.org/wiki/Prosigns_for_Morse_code)

------
toddh
They also had specialized code books for different industries, which
increased compression quite a bit.

