
Always bet on text (2014) - ColinWright
http://graydon2.dreamwidth.org/193447.html?HN2
======
coffeemug
_> Human rights are moral principles or norms that describe certain standards
of human behaviour, and are regularly protected as legal rights in national
and international law._

I think it would be about as hard to find (or construct) a picture that
precisely conveys the concept of human rights as it would be to construct
appropriate text to precisely convey the emotional payload of, say, this
photo:
[https://upload.wikimedia.org/wikipedia/en/b/b8/Kevin-Carter-...](https://upload.wikimedia.org/wikipedia/en/b/b8/Kevin-Carter-Child-Vulture-Sudan.jpg).

If we were, for some reason, forced to eliminate all forms of communicating
information but one, the most sensible form of communication for us to keep
would almost certainly be text. But we aren't, so why would we force ourselves
into voluntary communication medium asceticism?

~~~
mikekchar
I was watching a TV programme on haiku the other day. One of the things they
stressed was that because haiku is so compressed (17 mora in Japanese, where a
mora can be thought of as either a vowel sound or a consonant followed by a
vowel sound in English terms) that you have to be careful of what grammar
constructs to use. Grammar does not usually lead to imagery and if you can
strip away the grammar, you can have more mora to use for words that evoke
imagery.

This made me think that haiku, in particular, is good at evoking emotions and
imagery that the reader/listener is already familiar with. There are some
incredibly powerful haiku that in just a few words can really tug at your
emotions. But you need to have already experienced the imagery for it to have
any meaning.

I find that picture to be similar in nature to a haiku. There is very little
in the picture itself, but the elements it has evoke powerful emotions because
we know what they mean. The starving child, the vulture sitting ready, the
question of "why is the child starving" (war)... You assemble these ideas in
your head and it generates a huge number of thought processes and feelings.

To me, the difference between such a picture and a haiku is that the picture
is not something I have ever seen before. I couldn't imagine it without seeing
it first. Someone could write the most poignant haiku about that image, but it
would mean nothing to me, who has never seen such a war. But, to the people of
that village, I'm sure such a haiku would be incredibly powerful -- possibly
even more so than the photograph. They would fill in their own experience in
the gaps left by the words. For someone personally familiar with the
experience, the photograph might be too real and obvious -- even gauche.

I wonder if text is an ideal mechanism for discussing topics for which the
audience already has some experience -- something to relate to and hang on
those words. Pictures, on the other hand, are good for describing things that
people have not seen before and may not relate to.

------
rawdisk
2014\. Has this been posted before?

Anyway, I strongly agree. And I think it takes balls to state this opinion
because you will be opposed by so many.

I also think the Bourne shell, which accepts good ole text as input (as
someone downthread points out), is my most powerful application. Among other
things because it is everywhere, it's relatively small, fast, and seems to
have an infinite lifetime; it appears forever protected from obsolescence.
It's reliable.

Stating this opinion never fails to draw protest. It's just an opinion. Relax.

One time I stated it to what I thought was a sophisticated audience that I was
sure could handle it. Somebody still went bananas, claiming that "make" could
do everything the shell can do. I must be wrong but at the time I thought
"Doesn't make just run the shell?"

There will always be people who are hell bent on arguing against plain text.
And the Bourne shell. Why is anyone's guess.

Yet no matter how much internet commentators might complain, I doubt these two
things are ever going to disappear. They might get buried beneath 20 layers of
abstraction, but they will still be there.

Year after year, they just work. And for that I'm thankful.

~~~
hardcoreluddite
"Any one language cannot solve all the problems in the programming world and
so it gets to the point where you either keep it simple and reasonably
elegant, or you keep adding stuff. If you look at some of the modern desktop
applications they have feature creep. They include every bell, knob and
whistle you can imagine and finding your way around is impossible. So I
decided that the shell had reached its limits within the design constraints
that it originally had. I said ‘you know there’s not a whole lot more I can
do and still maintain some consistency and simplicity’. The things that people
did to it after that were make it POSIX compliant and no doubt there were
other things that have been added over time. But as a scripting language I
thought it had reached the limit."

From an interview with Steve Bourne in 2009.

[http://www.computerworld.com.au/article/279011/a-z_programmi...](http://www.computerworld.com.au/article/279011/a-z_programming_languages_bourne_shell_sh/)

~~~
plonh
Bash went too far though. For example the utter disaster that is arrays, that
cause far more trouble than they help.

------
gpcz
The Art of Unix Programming dedicates a chapter to the advantages of using
text for the input and output of programs:
[http://www.catb.org/esr/writings/taoup/html/textualitychapte...](http://www.catb.org/esr/writings/taoup/html/textualitychapter.html)
. They also note that Unix and its derivatives are one of the oldest operating
system paradigms still in active use, and that it has migrated to everything
from smartphones to supercomputers. In a way, by using a Unix-like OS you are
betting on text.

On the other hand, I would argue that engineering, architecture, and industry
rely heavily on visual languages such as technical drawings, blueprints,
control/process flow diagrams, etc. Text is a good bet, but it's inadequate
for a lot of problems related to the physical world that require spatial
reasoning.

~~~
douche
I think one of the commonalities there is that that kind of pictorial
information still shares many of the advantages of text, namely that they are
searchable, asynchronous, and permanent.

Speech is often problematic, because it is so ephemeral, unless extra pains
are taken to record it. Ideally, you'd have voice recordings or at least good
minutes taken of an in-person meeting or telephone call (which actually just
turns it back into text...), so that the information doesn't just disappear
into fallible meat-memory. The other problem is that it is a real-time
communication. You can only consume the content at a single rate - you can't
speed through the filler or go back and reread something that didn't at first
make sense. With recorded video or audio, you can sort of do this, but it is
so slow, compared to reading speed, and it's not like you can just Ctrl-F
through it.

~~~
varjag
Additionally, speech is also less coherent than text, unless it's a rehearsed
performance. Most people do very fluid re-formulation and editing even when
using instant messaging. With speech, backtracking is basically impossible.

------
sandworm101
I disagree. There are countless places where text is probably the least best
means. The real issue with text, with ALL text, is that it requires the viewer
to be a reader, to have a similar linguistic background to the writer.

Take, for example, a sign system for beaches. Which is a better means of
communicating to a wide variety of people, a sign saying "Warning Sharks" or
this?
[http://images.fineartamerica.com/images-medium-large/shark-w...](http://images.fineartamerica.com/images-medium-large/shark-warning-sign-computer-artwork-.jpg)

If you are writing code or dev notes then text is no doubt the way to go. But
if you are trying to communicate with strangers (ie prospective customers)
then symbols have their place.

~~~
Falkon1313
Symbols are very limited and many require even more cultural background
similarities than text. You soon run out of simple symbols and end up with
ambiguous or meaningless abstractions (what does "three nested red triangles"
mean?), lots of icons that look almost alike, or symbols that allude to things
based on cultural assumptions, outdated metaphors, or linguistic relations
that strangers are equally unlikely to understand.

Pictionary and charades are completely based around the idea that
communication without words is more challenging.

~~~
Symbiote
The yellow-diamond example follows American/Australian road signs, where that
means "warning" (if it's that consistent?).

In Europe, and much of the rest of the world, an ISO-derived sign would be a
black shark in a black-bordered yellow triangle. These are most commonly seen,
worldwide, as a lightning bolt on an electricity pole.

The system uses colour and shape to distinguish between warnings
(black-bordered yellow triangles), positive orders (blue circles), negative orders
(red-bordered white circles) and safety/evacuation (green rectangles).

The Vienna system for traffic signs, used by over half the world, is similar,
except warnings are red-bordered triangles.

That, at least, means "no cycling", "cycle path" and "warning: cyclists", or
"wear ear protection" and "do not wear ear protection" reuse the same symbol
on a different coloured/shaped sign.

~~~
tomjakubowski
> and safety/evacuation (green rectangles).

[https://twitter.com/charliearchy/status/647889800876433408](https://twitter.com/charliearchy/status/647889800876433408)

Ah, that totally clarifies the meaning of this sign: "evacuate in case of
Godzilla attack."

------
nicklaf
Natural language is a great querying tool for quickly combining existing
knowledge in order to generate moderately intricate thoughts. It feels easy,
but don't forget that it took many lifetimes of experience to accumulate the
database you are drawing upon. It's similarly easy to throw together a
Smalltalk prototype as it is to have a casual conversation. Not so easy is to
construct a language--natural or artificial--from scratch (let alone try to
teach it to somebody).

On the other hand, in advanced mathematics, diagrams are used all over the
place (look at category theory). When they're not used, the reader is
essentially forced to imagine spatially what is being said, often down a blind
alley (to use the phrase almost literally).

Diagrams are essential to communicating non-obvious abstract ideas in an
efficient manner. In addition, given a domain-specific technical language,
there are many possible permutations of words in a sentence, whereas the space
of sensibly drawn diagrams is much smaller. Often, after fully comprehending
an idea that was previously expressed in words, the final step in congealing
the idea in your mind (and convincing yourself of its soundness) is to draw a
definitive diagram which covers all the cases, and can be understood in a
glance.

I should also mention that many of the worst typographical errors in
mathematics completely change the meaning and jumble the thought, something
which is far more difficult to do when using a diagram.

Also, as I remarked in my other comment, diagrams complement text, with either
one enhancing the other. Neither should completely replace the other.

~~~
ccalvert
I agree, diagrams are great -- in their place. But to the author's point, can
you draw me a diagram that conveys the same information as the sample sentence
from the article:

"Human rights are moral principles or norms that describe certain standards of
human behaviour, and are regularly protected as legal rights in national and
international law."

I don't think you can, but if you could, it would likely be very large and
complex. This speaks to flexibility and information density inherent in text.

~~~
nicklaf
The diagram would indeed be enormously complex, because that sentence draws
upon a vast amount of existing human knowledge. You'd also have to invent a
visual language for describing this knowledge, which at present probably
doesn't exist. One may as well try looking at fMRI images of human brains and
try to infer the way that sentence triggers existing memories in order to
figure out how the brain represents hierarchies of knowledge in memory.

I mostly use diagrams for describing completely new, abstract ideas in
engineering or mathematics. Once the idea is understood, then yeah, it's
faster to "query the database", and simply utter a word or write down a
symbol.

------
unoti
When it comes to radio, voice takes a lot more bandwidth than digital
signals[1]. This has important practical implications. A typical voice signal
takes about 3kHz of bandwidth, whereas a CW (continuous wave, Morse code
signal) can be done in about 500Hz. The lower bandwidth lets you pack _all_ of
your transmitter power into that tiny little bandwidth, and get your signal
out that much further than the same power would give you with voice. Also,
when receiving, you can narrow your receiver down to just the tiny little
bandwidth window you're looking at, and ignore everything higher and lower,
leading to less interference. These are the key reasons why Morse code is
still alive and well today.
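
As a rough worked example of the narrow-filter point: if we assume the noise a receiver lets in is proportional to its bandwidth (a flat noise floor, which is a simplification on my part), then narrowing from a 3 kHz voice filter to a 500 Hz CW filter is worth 10*log10(3000/500) dB. The helper name below is made up for illustration.

```shell
# Hypothetical helper: decibel advantage of narrowing a receiver filter
# from $1 Hz to $2 Hz, assuming noise power scales linearly with bandwidth.
bandwidth_gain_db() {
    LC_ALL=C awk -v wide="$1" -v narrow="$2" \
        'BEGIN { printf "%.1f\n", 10 * log(wide / narrow) / log(10) }'
}

# Voice (about 3 kHz) versus CW (about 500 Hz), the figures quoted above.
bandwidth_gain_db 3000 500
```

That works out to a little under 8 dB, before counting any benefit from a human ear picking a single tone out of the noise.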

Other digital modes exist that give you almost the same kinds of benefits as
CW for very low bandwidth, most notably PSK31.

[1] Technically, the speed of the signal switching causes you to need more
bandwidth. So if you're doing a really fast digital transmission, like a 56k
modem, that requires more bandwidth than a 9600 baud modem transmission. This
makes intuitive sense. What's kinda surprising though is that even really fast
morse code requires more bandwidth than slow morse code.

------
shuzchen
I agree with the sentiment but not with the arguments put forward. It's just a
little ironic that the first example put forward (carvings in a stone tablet)
is represented as a jpg image. Even the text of the article itself,
considering the point of view of the pixels stored in the graphics buffer,
takes just as much space as any other image. Arguably, all text is image, but
that's a huge discussion.

Is there really less information required to represent text than data? It just
seems easier to encode ascii text because we've settled on an encoding schema
that optimizes for the english alphabet. The example of the twitter icon using
2000 bytes is only because the author decided to use a png. Using the font-
awesome typeface, the twitter icon is just 2 bytes. And I can whip up in 5
minutes a typeface where the twitter logo uses 1 byte. We can come up with an
encoding format where dingbats and logos take up 1 byte and the english
alphabet uses 2TB. It'd be a useless encoding for practical purposes, but goes
against the idea that text is inherently more informationally compact than
images.

~~~
lvh
I don't think that's a valid argument for informational density. You have to
whip up that font where something interesting maps to a low code point and
then _share it ahead of time_ before you can take advantage of that alleged
density. As an extreme example: if I create a Huffman tree where the bit
string "1" maps to the entire concatenated contents of the Library of
Congress, does that affect the information density of the Library of Congress?

~~~
shuzchen
But I'm not exactly making an argument for information density, except that
"it's kinda a weird thing and we probably don't know how to calculate it
without assuming a whole lotta things". In my opinion, your extreme example
actually supports my argument. Yes, you have to "share it ahead of time", but
that's precisely what happened with the ascii encoding. It's just ascii is
built into most systems by default and codified by a standards body, whereas
our ad-hoc encoding is not.

But does that mean ascii is inherently denser? I wouldn't use ascii to
communicate with aliens because to understand it you require knowledge of the
majority of the english language (again, english knowledge is shared ahead of
time). In fact, the Voyager Golden Record has line drawings on the cover, and
not a single character.

------
sudeepj
But as a general form of communication between humans, I think visual is
powerful. Even illiterate people and children understand visuals. Expressing
complex abstractions in text, and understanding them, at present requires a
human to undergo at least 15 to 20 years of education, and even then some
proficiency with the language.

On the computing front, I agree that text is pervasive and reliable. This
leads me to ask: is it because, from the invention of text through to the
current education system, we have been conditioned to prefer text? Until now,
the human race has not had the computing power to tackle non-text
communication at a massive scale, and hence text was the natural choice. Maybe
in the future we will explore or invent ways to handle visual info the same
way as text (to a large extent).

------
pnt12
I think the Unix shell is great exactly because it operates mostly on plain
text. It makes most programs easily interoperable through piping.

~~~
jwmerrill
Piping text is great until you want to make something that is robust to edge
cases. The various conventions for separating data, and the ways these
interact and must be escaped, are a usability disaster. This is why tools like
find end up tacking on a million flags and options instead of actually
encouraging you to compose simple operations.
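
A minimal sketch of the kind of edge case I mean, assuming a find and tr that handle NUL bytes (the GNU and BSD versions do) and a scratch directory from mktemp. The file names and helper functions are made up for illustration. It counts the same three files twice, once as lines and once as NUL-terminated records.

```shell
# A literal newline, written portably (no $'\n' needed).
nl='
'

# Create three files in a fresh scratch directory; one name contains a newline.
make_demo_dir() {
    dir=$(mktemp -d) || return 1
    touch "$dir/plain" "$dir/has space" "$dir/has${nl}newline"
    echo "$dir"
}

# Count files by treating find's output as one name per line.
count_by_line() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Count files by counting the NUL terminators that -print0 emits.
count_by_nul() {
    find "$1" -type f -print0 | tr -cd '\0' | wc -c | tr -d ' '
}

demo=$(make_demo_dir)
echo "line-based count: $(count_by_line "$demo")"
echo "NUL-based count:  $(count_by_nul "$demo")"
rm -rf "$demo"
```

With the three files above, the line-based count comes out as 4 and the NUL-based count as 3. The second pipeline is only right because every stage was told about the delimiter separately.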

~~~
username223
The UNIX shell was designed for, and makes sense in, a friendly, collaborative
environment. "This breaks for filenames with spaces." "Well then don't do
that. If punching yourself in the face hurts, then stop doing it." Feeding
malicious input to programs on your own computer is self-destructive, and
UNIX's creators felt no need to defend against it. It's "an elegant weapon for
a more civilized age."

Nowadays, even trying to read a page of "hyper-text" requires doing battle
with several levels of malicious software.

~~~
simoncion
> This breaks for filenames with spaces.

Useful things in bash:

    
    
      export IFS=$'\n'
    

or

    
    
      export IFS=$'\n'
      for I in `$COMMAND`; do
        $OTHER_COMMAND "${I}"
      done
    

or, generally:

    
    
      find ./ -print0 | xargs -0 $COMMAND

~~~
pdkl95
> export IFS

Be careful exporting a changed IFS. That will inherit into child commands,
which can cause unexpected breakage. A better model is to simply set IFS in
the bash script (without export) or even better, as a one-off prefix to a
command

    
    
        # leaves IFS set for following commands
        IFS=$'\n'
        $COMMAND
    
        # only sets IFS for one command
        IFS=$'\n' $COMMAND
    

It is common to see saving/restoring IFS to protect against bugs elsewhere
from a non-standard IFS:

    
    
        oldIFS="$IFS"
        IFS=$'\n'
        # ...stuff...
        IFS="$oldIFS"
    

If you do this, a trap is a better idea to guarantee the restore happens.
Better yet, let bash handle that for you automagically by using a local
variable.

    
    
        cmd_with_nonstandard_ifs() {
            local IFS=$'\n'
            $COMMAND
        }
    
        cmd_with_nonstandard_ifs
        # IFS is back to normal here
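
To see the scoping in action, here is a small sketch. The helper names are hypothetical, and the newline is written as a literal so it does not depend on $'\n'. It counts how many fields a string splits into, first under the default IFS and then inside a function that declares IFS local.

```shell
# A literal newline, written portably.
nl='
'

# Hypothetical helper: print how many fields "$1" splits into under
# whatever IFS is in effect when it is called.
count_fields() {
    set -f          # turn off globbing so only word splitting is observed
    set -- $1       # deliberately unquoted: this is where IFS is applied
    set +f
    echo "$#"
}

# Same count, but with IFS limited to newline for the duration of the call.
count_lines() {
    local IFS="$nl"
    count_fields "$1"
}

text="a b${nl}c d e"
count_fields "$text"    # default IFS: splits on spaces and the newline
count_lines "$text"     # newline-only IFS: splits into two lines
count_fields "$text"    # IFS is back to the default here
```

The three calls print 5, 2 and 5, so the change never leaks out of count_lines.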
    

Also, unless you're using a _really_ ancient version of bash, you shouldn't
use backticks for command substitution. Use $() instead.

    
    
        # backticks need to be escaped when nested
        FOO="`basename "\`command_that_outputs_a_path\`"`"
    
        # much easier to read in the modern form
        FOO="$(basename "$(command_that_outputs_a_path)")"
    

The thing is, you probably don't even need to mess with IFS - shell globbing
handles a lot of these things for you

    
    
        $ ls -1
        a bc
        a b c d e
        ddd eee
        $ for file in * ; do echo "[$file]" ; done
        [a bc]
        [a b c d e]
        [ddd eee]
    

and arguments can be expanded correctly with "$@"

    
    
        show_args() {
            for arg in "$@" ; do
               echo "arg that supports spaces: [$arg]"
            done
        }
        show_args "foo bar"
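
For contrast, a quick sketch of what happens when a wrapper forwards its arguments in other ways. The function names are made up for illustration; each wrapper passes its arguments on and reports how many arrived.

```shell
# Print the number of arguments received.
count_args() { echo "$#"; }

# Three hypothetical wrappers that differ only in how they forward arguments.
forward_quoted()   { count_args "$@"; }   # one word per original argument
forward_unquoted() { count_args $@; }     # re-split on IFS: spaces break words
forward_star()     { count_args "$*"; }   # everything joined into one word

forward_quoted   "foo bar" baz
forward_unquoted "foo bar" baz
forward_star     "foo bar" baz
```

These print 2, 3 and 1 respectively; only the quoted "$@" form preserves the original argument boundaries.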
    

> find | xargs

Of course, this is always a nice option that bypasses the need for bash.

Minor suggestion: "find ." and "find ./" are identical.

~~~
simoncion
All good advice. I was intentionally making a correctness/terseness tradeoff.
:)

> ...and arguments be expanded correctly with "$@"

That assumes that your bash is contained within a script, yes? It doesn't work
for commands entered directly in a shell?

> ...unless you're using a really ancient version of bash, you shouldn't use
> backticks for command substitution.

I was -stupidly- unaware that backticks _would_ nest. I always use $() when I
want to nest command substitutions.

> "find ." and "find ./" are identical.

Oh, I know. I have the largely unjustifiable habit of _always_ putting a path
in a find invocation, as well as always spelling "the current directory" as ./

> > find | xargs

> Of course, this is always a nice option that bypasses the need for bash.

Don't forget -print0 and -0 if you expect to have to handle arguments that
contain spaces! ;)

~~~
pdkl95
> That assumes that your bash is contained within a script, yes?

Or a function (that is, anywhere the $1, $2, ... variables are available), as
"$@" (must have the double-quotes!) just copies $1, $2, ... without changing
the word splitting.
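
If you want to try it at a prompt, a minimal sketch: the standard `set --` builtin assigns the positional parameters of the current shell, after which "$@" behaves the same as it would inside a script or function. The two sample values are arbitrary.

```shell
# Give the current shell two positional parameters, one containing a space.
set -- "foo bar" "baz"

# Quoted "$@" yields one word per parameter, spaces intact.
for arg in "$@"; do
    printf '[%s]\n' "$arg"
done
```

This prints [foo bar] and [baz] on separate lines.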

> as well as always spelling "the current directory" as ./

That's actually a very good habit to be in for a lot of other commands. It's
unnecessary with find, but it's harmless either way.

------
george_ciobanu
That text can be translated is not an advantage; images can't be translated
because they don't need to be. The twitter image is expensive to store but also
contains a different kind of information, that would be very hard to convey
over text with the same precision and accuracy - how does one describe it
pixel by pixel? I could go on - I agree with the idea but text is only king
when text is the best tool, or you have the mind to fill the gaps (like in a
novel) and you don't mind variance between readers, or the data is high
structure but low resolution like concepts. Images are high resolution and
it's important to recall them exactly, with perfect accuracy, thus they
require a lot of information. Plus I'm not sure the best storage was used for
that icon. Text also compresses very well, so it must not be quite the
densest. Like I said, I could go on.

------
legulere
In my opinion formulas aren't text. Yes you could read them out loud but that
way they would be harder to understand than looking at the formula. Try
understanding maths really written down in text without formulas.

It's also wrong to speak about text being the oldest communication technology
when it developed from pictograms.

~~~
DougWebb
I think the author intended that both formulas and pictograms are forms of
text. They're both made up of abstract symbols that can be composed to create
meaning.

~~~
jameshart
Right, but "plain text" is just a one dimensional stream of symbols. A
mathematical formula is a two dimensional arrangement of symbols - there's a
degree of additional complexity there which moves it from 'text' closer to
'diagram'.

A scatter plot is just abstract symbols creating meaning, too - but it's
clearly not just text.

------
jameshart
This is true, but please, please don't make the mistake of confusing 'text'
with 'ascii'.

For example, yes, text is searchable - but remember that different human
languages consider different symbols as 'equivalent' for search and so you
need to take care matching the searched string to the text.

And while humans can readily encode information into text, extracting that
meaning again unambiguously enough for a computer to use is hard. Parsing is
one of the foundational techniques in computer science, and natural language
processing is still on the frontiers of development. Meanwhile, we're still
dealing with Excel spreadsheets that pop up a red flag and ask a human for
help whenever they see what looks like a number stored as text.

So yes, bet on text, but don't assume that because text is fundamental that it
is simple.

------
nnethercote
After this sentence:

"This blog post is likely to take perhaps 5000 bytes of storage, and could
compress down to maybe 2000; by comparison the following 20-pixel-square image
of the silhouette of a tweeting bird takes 4000 bytes:"

in my browser I see a little broken image icon which represents a missing
picture. How apt.

------
doctorstupid
People are noting counterexamples where pictures are more efficient, such as
warning symbols which are resilient to cultural translation. However, I would
argue that such cases are actually forms of text. By 'text', the author really
means collections of symbols. In that sense, ancient hieroglyphics are texts
and perhaps emojis are too.

~~~
wutbrodo
I think that stretches the definition a little too far. A warning sign of that
type is different from textual symbols in that it's readily understandable to
someone seeing it for the first time. I agree with the author's thesis in
general but that is a rather good exception to point out.

~~~
doctorstupid
You raise an interesting point. Perhaps we shouldn't stipulate that a symbol
must be understood the first time - the meaning of most symbols is probably
taught in some way - but rather that once learned, the meaning is unambiguous.
The silhouette of a shark on a beach sign is quite unambiguous, whereas an
artist's illustration of a shark may be open to interpretation. In other
words, perhaps it is the level of ambiguity which distinguishes symbols from
other graphic forms.

------
makemoniesnow
The Twitter bird picture can be compressed losslessly to under a thousand
bytes with ImageOptim.

------
WorldWideWayne
Text is just symbols and symbols are an even older and more stable form of
communication.

If I use text to make a sign, only people who can read the language will
understand it. But if I use a well known symbol, everybody will understand it.

~~~
wrenky
Symbols are about as flexible as text. If you have a symbol of a sun, I might
interpret that as heat, light, the sun, a god, or anything really, depending
on the area you are from. Knowing a language ensures that when you say X you
mean X.

And while symbols CAN be used for discussion, could this conversation be
relayed in symbols?

~~~
WorldWideWayne
You're right, they can, but they don't have to be. A flame icon is typically
used for heat or fire I think.

My point is - there's probably no "always".

------
douche
This, a thousand times this.

If they come up with something more efficient and information dense, I'd be
happy to see it, but until we get direct brain/digital connections, I don't
see it happening.

------
piker
Great article. Sales folks will (correctly?) disagree, but nobody is inventing
the 21st century's polio vaccine by watching Vine posts.

~~~
GFK_of_xmaspast
I don't think a lot of them are doing it by reading the hacker news internet
forum either.

------
_yosefk
Dreamwidth's response to worrydream... Meaning Bret Victor's ideas along the
lines of "kill math."

Visualising a lot of abstract stuff easily expressible in text is hard, and
there's all the other stuff tfa mentions.

I wanted to write about this for a long time, perhaps tfa got it done and I
won't...

