
The Craft of Text Editing (1999) - youjiuzhifeng
https://www.finseth.com/craft/
======
cmyr
I've been learning a lot lately by following along with the development of
xi[1], a new text editor written in Rust. Through reading that project's RFCs
I've then come across other interesting projects, like swiboe[2] and wi[3].

What are the other canonical resources on this topic? It feels like tons of
the interesting thought is scattered around various blogs and usenet posts and
the like. I'd love to create a nice collection of good writing on text-editing
/ tools, but I'm not sure where to start.

[1] [https://github.com/google/xi-editor](https://github.com/google/xi-editor)

[2] [https://github.com/swiboe/swiboe](https://github.com/swiboe/swiboe)

[3] [https://github.com/wi-ed/wi](https://github.com/wi-ed/wi)

~~~
omtose
Kakoune[1] has been posted recently here on HN to great reception. As a C++
developer, I think it has a very high quality codebase, especially considering
how non-trivial it is. As a user, it's been my main text editor for over a
year. It also has a vibrant community; jump in on IRC if you have questions or
ideas.

[1] [https://github.com/mawww/kakoune](https://github.com/mawww/kakoune)

~~~
zellyn
I think getting Kakoune UI support into Xi would be a very interesting
project…

~~~
omtose
Are you talking about Xi as a frontend for kakoune? If it's the UI part that
you're interested in, there is also kakoune-qml[1] made by one of the regular
users.

[1] [https://github.com/doppioandante/kakoune-qml](https://github.com/doppioandante/kakoune-qml)

~~~
zellyn
I think creating a Xi frontend that mimicked Kakoune's modal keybindings and
highlighting would be an interesting exercise, and probably stretch the set of
supported ideas in Xi in a good way.

------
Todd
A paper that covers some of the data structures used in editors is:

[https://www.cs.unm.edu/~crowley/papers/sds.pdf](https://www.cs.unm.edu/~crowley/papers/sds.pdf)

The gap buffer, in particular, is a great example of a simple yet powerful
idea that is perfectly suited to the problem domain.
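
For readers who haven't met one, here is a minimal sketch of a gap buffer in Python. The class and method names are illustrative assumptions, not taken from the paper; the idea is only that the text lives in one array with a movable hole at the cursor, so typing at the cursor never shifts the rest of the text.

```python
class GapBuffer:
    """Text stored in one list, with a movable gap at the cursor (hypothetical sketch)."""

    def __init__(self, text="", gap_size=16):
        self.buf = list(text) + [None] * gap_size
        self.gap_start = len(text)      # cursor; the gap is buf[gap_start:gap_end]
        self.gap_end = len(self.buf)

    def __len__(self):
        return len(self.buf) - (self.gap_end - self.gap_start)

    def move_cursor(self, pos):
        """Slide the gap so it starts at text position pos."""
        pos = max(0, min(pos, len(self)))
        while self.gap_start > pos:     # gap moves left: copy one char across the gap
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:     # gap moves right
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def insert(self, s):
        """Insert s at the cursor by filling the gap, growing it when it runs out."""
        for ch in s:
            if self.gap_start == self.gap_end:
                self._grow()
            self.buf[self.gap_start] = ch
            self.gap_start += 1

    def delete_before(self, n=1):
        """Backspace n characters: just widen the gap to the left."""
        self.gap_start = max(0, self.gap_start - n)

    def _grow(self):
        extra = max(16, len(self.buf))
        self.buf[self.gap_end:self.gap_end] = [None] * extra
        self.gap_end += extra

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])


g = GapBuffer("hello world")
g.move_cursor(5)
g.insert(",")
print(g.text())
```

Moving the cursor costs time proportional to the distance moved, which is why the structure suits the usual pattern of many edits in one place.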

~~~
polm23
The text in that PDF didn't show up correctly for me, but there's an HTML
version of the paper here:

[https://www.cs.unm.edu/~crowley/papers/sds/sds.html](https://www.cs.unm.edu/~crowley/papers/sds/sds.html)

~~~
bernardlunn
Was irony intended?

------
dws
Fun blast from the past. The original version of this shipped with Mark of the
Unicorn's Mince/Scribble package for CP/M. (Mince = Mince Is Not Complete
Emacs)

~~~
ScottBurson
Glad to see someone remembers that! (I was a cofounder.)

The sources for the original version of Mince were lost long ago. It's too
bad; they were very clear (thanks to the skills of Jason Linhart, the primary
author, as well as Craig) and would have made a great example for study.

~~~
kabdib
I used MINCE quite a bit, and it was great. MINCE was what you used for Emacs
if you couldn't get to an ITS machine :-)

Thanks, it was a really nice editor.

------
AlexanderDhoore
I once built a text editor using a rope[1] data structure where every line was
a node. The tree was augmented[2] with information about line numbers, titles
in the document... for very fast navigation. I don't think primitive data
structures like a gap buffer are useful anymore. They come from a time when
saving on memory was more important than it is now.

EDIT: I forgot it was also a self balancing tree! Very cool stuff.

[1] [https://en.wikipedia.org/wiki/Rope_(data_structure)](https://en.wikipedia.org/wiki/Rope_\(data_structure\))

[2] [https://en.wikipedia.org/wiki/Interval_tree#Augmented_tree](https://en.wikipedia.org/wiki/Interval_tree#Augmented_tree)
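
A rough sketch of the general idea in Python, not the editor described above: one tree node per line, each node augmented with the number of lines in its subtree so that "go to line N" takes time proportional to the tree height. All names here are hypothetical, and the insert below does no rebalancing, which a real implementation would need.

```python
class Node:
    def __init__(self, line, left=None, right=None):
        self.line = line
        self.left = left
        self.right = right
        # Augmented data: how many lines live in this subtree.
        self.count = 1 + size(left) + size(right)


def size(node):
    return node.count if node else 0


def build(lines, lo=0, hi=None):
    """Build a balanced tree from a list of lines."""
    if hi is None:
        hi = len(lines)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    return Node(lines[mid], build(lines, lo, mid), build(lines, mid + 1, hi))


def line_at(node, index):
    """Return the line with 0-based number index by following subtree counts."""
    while node:
        left = size(node.left)
        if index < left:
            node = node.left
        elif index == left:
            return node.line
        else:
            index -= left + 1
            node = node.right
    raise IndexError("line number out of range")


def insert_line(node, index, line):
    """Insert line before position index and return the subtree root.

    No rebalancing is done here (assumption: kept out for brevity).
    """
    if node is None:
        return Node(line)
    left = size(node.left)
    if index <= left:
        node.left = insert_line(node.left, index, line)
    else:
        node.right = insert_line(node.right, index - left - 1, line)
    node.count += 1
    return node


root = build(["a", "b", "c", "d"])
root = insert_line(root, 1, "x")
print([line_at(root, i) for i in range(size(root))])
```

The same augmentation trick extends to other per-subtree summaries (byte counts, heading counts), which is what makes navigation by those keys fast.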

~~~
geocar
A dissent: Saving memory isn't strictly orthogonal to editing performance.

A paged gap buffer (as described with an array index) remains ideal when you
need to make a small number of surgical changes (insertions, deletions, etc)
to a very large file, especially given the fact that all modern systems have
page mapping hardware, so anything you implement is _effectively_ on top of a
paged gap buffer anyway. What we're really searching for is a better
program-visible structure.

To that end, the biggest difficulty in writing an efficient text editor is
that most text-editing material is (ahem) textbook, and neglects the fact that
fork() copies on write, making most real operations asynchronous, and that
combining writev() and mmap() can produce whatever memory layout you want (a
plain old stupid byte array if that's convenient). The kernel will memcpy your
page table for you, so there's no sense in _also_ doing it in user-space. And
so on.

Consider at which point a write() and an mmap() (or on OSX a
mach_vm_remap()) will be faster, and just how much faster it will be. Imagine
programming something as simple as a plain byte array but with instant inserts
(memmove) across multi-gigabyte buffers. Then consider the cost of a
write()+mmap() syscall combination in the worst case (a couple hundred
micros?) and you'll never use a complicated (linked list) data structure
again.

~~~
maxbrunsfeld
How would a gap buffer handle distributed editing operations like
search-and-replace, or multi-cursor typing? The data structure seems optimized
for editing in one place at a time.

~~~
geocar
There aren't a lot of structures that have an amortised cost-per-edit; you're
really only ever considering the cost of doing a single insert/deletion
operation, and you're really only ever trading that performance against the
complexity.

A paged gap buffer is actually more like a tree, so instead of a single gap in
the middle of a file, you have a gap in the middle of a block, and one block
mapped per modified space. Cost of insert/delete is limited to the cost of a
memmove within a block (cheap!), and your extent map never grows beyond 2x the
number of changes. These upper bounds are incredibly good for edits, and there
are only pathological cases that do better.
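
A loose user-space model of that cost argument, in Python. It does not touch mmap or page tables; it only shows that when the buffer is a sequence of small blocks, an insert or delete moves bytes inside one block. The class name, the tiny block size, and the linear block lookup are all assumptions made for illustration (a real extent map would be indexed).

```python
BLOCK = 8  # tiny block size for demonstration; a real page would be far larger


class PagedBuffer:
    def __init__(self, data=b""):
        self.blocks = [bytearray(data[i:i + BLOCK])
                       for i in range(0, len(data), BLOCK)] or [bytearray()]

    def _locate(self, pos):
        """Map an absolute offset to (block index, offset inside that block)."""
        for i, blk in enumerate(self.blocks):
            if pos <= len(blk):
                return i, pos
            pos -= len(blk)
        raise IndexError("offset out of range")

    def insert(self, pos, data):
        i, off = self._locate(pos)
        blk = self.blocks[i]
        blk[off:off] = data                      # byte movement stays inside this block
        if len(blk) > BLOCK:                     # overflow: split into block-sized pieces
            self.blocks[i:i + 1] = [blk[j:j + BLOCK]
                                    for j in range(0, len(blk), BLOCK)]

    def delete(self, pos, n):
        while n > 0:
            i, off = self._locate(pos)
            blk = self.blocks[i]
            if off == len(blk):                  # at a block boundary: continue in next block
                i, off = i + 1, 0
                blk = self.blocks[i]
            take = min(n, len(blk) - off)
            del blk[off:off + take]
            n -= take
            if not blk and len(self.blocks) > 1:
                del self.blocks[i]               # drop blocks that became empty

    def to_bytes(self):
        return b"".join(self.blocks)


buf = PagedBuffer(b"abcdefghijklmnop")
buf.insert(4, b"XY")
print(buf.to_bytes())
```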

But what about search?

Search is faster too! Because _virtual memory_ has your entire file
contiguous, a search is as fast as a scan[1], which might embolden you to try
indexing your file, which might _really_ impress your users.

[1] [https://en.wikipedia.org/wiki/Rabin%E2%80%93Karp_algorithm](https://en.wikipedia.org/wiki/Rabin%E2%80%93Karp_algorithm)
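
For reference, a minimal sketch of the Rabin-Karp scan linked above, written in Python over a contiguous bytes object. The function name, base, and modulus are arbitrary choices for illustration. The rolling hash lets each window be checked in constant time, with a direct comparison only when the hashes agree.

```python
def rabin_karp(text: bytes, pattern: bytes, base=256, mod=1_000_003):
    """Return the start offsets of every occurrence of pattern in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    high = pow(base, m - 1, mod)          # weight of the byte leaving the window
    p_hash = t_hash = 0
    for i in range(m):                    # hash the pattern and the first window
        p_hash = (p_hash * base + pattern[i]) % mod
        t_hash = (t_hash * base + text[i]) % mod
    hits = []
    for i in range(n - m + 1):
        # Hash match is only a hint; confirm with a real comparison.
        if p_hash == t_hash and text[i:i + m] == pattern:
            hits.append(i)
        if i < n - m:                     # roll: drop text[i], append text[i + m]
            t_hash = ((t_hash - text[i] * high) * base + text[i + m]) % mod
    return hits


print(rabin_karp(b"abracadabra", b"abra"))
```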

------
gf263
I never understood why these webpages can't have like, 4 lines of CSS to make
them much more readable. Preserve the older aesthetic I guess?

~~~
massysett
Maybe because the author's expertise is something other than writing HTML, so
he picked up an old book on HTML, marked it up, and that's it. That the
browser can render HTML written 20 years ago is quite a virtue. If it only
takes 4 lines of CSS to make it more readable, then the page's lack of
readability is more an indictment of the browser (which could do this tidying
itself) rather than of the author, who should not have to continually update
HTML so it renders well on recently-invented devices.

~~~
stinkytaco
> then the page's lack of readability is more an indictment of the browser
> (which could do this tidying itself)

I _strongly_ disagree with this. The browser should do nothing it is not
explicitly made to do, which is one of the reasons it can still render HTML
from 20 years ago. We used to have browsers that tried to do that kind of
thing and we're only now extracting ourselves from that mess.

It is 100% on the website author to make their page more readable.

~~~
massysett
The whole point of vanilla HTML is that it has few presentation details. It
has some headings, bold, italic, and such. If the page does not specify
margins or font size, the browser absolutely should set these so it is most
readable on the device. If I write plain HTML today I am not optimizing for
some VR headset that will be used twenty years hence. The headset should
render the plain HTML in a manner faithful to the semantic markup, not so it
looks the same way it looked on Netscape with a VGA screen.

------
erikb
> In its most general form, text editing is the process of taking some input,
> changing it, and producing some output.

Funny how similar that definition is to the "programming" one.

~~~
fmoralesc
That's because that is just what a Turing machine does.

------
z3t4
As someone currently working on a code editor I love this stuff, but there's
usually more focus on the technical part than the human part. With today's
hardware we can do millions of stupid things every second and it will still
feel snappy. We should spend more time trying to optimize for the humans
instead of their computer.

------
jwhitlark
I liked it so much I bought a hard copy a couple years ago. Lots to learn in
that book.

