
Formalizing Text Editors in Coq - gbrown_
https://arxiv.org/abs/2006.03525
======
p0llard
This looks more like someone proved some trivial lemmas about lists and wrote
the simplest possible DSL to do some list manipulations rather than anything
close to resembling a serious piece of work on formally verified text editors.

This is first-lecture-learning-Coq level stuff to be completely honest; it's
interesting if you've never seen Coq before, but it's not really what it
claims to be at all. The amount of work to go from this to "formalizing full-
blown text editors, such as vim" is enormous.

If we remove the proof scripts (and duplicated statements of Lemmas/Theorems
in Gallina _and_ mathematical notation) since these would never be included in
a published paper (only as an accompanying artefact unless a novel
tactic/proof scripting method is the actual topic of the paper), the length of
the paper almost halves and the amount of substance becomes clearer.

~~~
throwawaygh
_> ...the amount of substance becomes clearer_

IMO this attitude is why formal methods based on proof assistants never took
off.

That entire PL theory research community seems to think that implementation
work is unimportant and not even worth a few page "tools/casestudy" workshop
paper. Even a tiny four page write-up gets a "why isn't this a homework
assignment" style response.

That, by itself, is I guess not terrible. But POPL proceedings for 20+ years
have been littered with 2 pages of substance and 18 pages of greek-letter-
masturbation. Most of those papers build insanely complex calculi to capture a
simple idea that every junior engineer can understand in half a day. And this
complaint always fell on deaf ears. At least, as long as the author was able
to say "logical relations" or "linear types" or whatever the hell keywords
y'all are applying your elimination rules on these days.

The only thing that will ever push that field forward is _shitloads of working
and well-explained/documented code_. But hacking out some lines of
ocaml/gallina (or, god forbid, Python/C++... if I have to listen to an SML
acolyte complain about parallelism in Python one more time...) isn't valued as
much as hacking out lines of \Gammas and \vdashes. Taking the time to explain
short code examples gets you "that's a homework problem" responses. As if
explaining a digestible piece of code concisely is some sort of admission of
intellectual impotence. Smart people realize this quickly and either get out
(if they want to have an impact on the world) or learn how to write 1+1=2 in a
super complicated way (if they just want that piece of paper).

Anyways, you're right. The paper is a far cry from "formalizing a text editor
in coq". But the "your implementation work doesn't deserve my time or
attention" attitude is a big reason why this research community hasn't really
had much impact even after several decades of outsized investment. IMO.

~~~
jiggawatts
This. This is the problem with academia in a nutshell.

I get mercilessly downvoted whenever I point out that it makes _zero_ sense to
use Greek letters in computer science papers. There is no rich history of
Ancient Greek computer science that is being referenced for consistency with
the new research!!! It's 100% intellectual wankery, nothing more.

I experienced the same thing first-hand with AI research. Nobody is going to
get published for using a straightforward numerical approach like Automatic
Differentiation in a paper, so they spend 90% of their effort symbolically
differentiating the formulas and end up with something that is numerically
unstable but oh so pretty and impressive looking when laid out with LaTeX.

Publishing in science is now all about showing off, and not at all about the
advancement of science. It's politics, not practice.

PS: Go back and compare CS papers published in the 60s, 70s, and even 80s with
stuff being put out now. It's night & day!

~~~
yters
Yes and no.

Yes, a major takeaway from my many years in academia is that sophistication is
valued over simplicity. A bunch of fancy symbols and equations = important, so
much of what is published appears, at least to me, to be much more complicated
than it needs to be.

No, in that the sophistication, in the good cases, is there because these
subjects involve a great deal of subtlety and complexity, and need a formal
framing of the issue so that it can be analyzed in an unambiguous manner. So,
there is a reason for the obfuscation, although it is mostly ingrown overkill
at this point.

The best papers can both deal with the great complexity and sophisticated
formalisms, while boiling it all down to an accessible delivery.

However, such papers are few and far between.

~~~
lou1306
In Coordination 2019 there is a lovely paper (W Kokke, J Garrett Morris, Philip
Wadler. Towards Races in Linear Logic) that uses emoji as formal notation.
Maybe it could be the way to go? Like, if your formal language describes cars
and buses, why not use and instead of alpha and beta?

~~~
lou1306
(The above comment ended with "why not use :car and :bus instead of alpha and
beta?". Apparently HN does not support emoji?)

------
jkaptur
Interesting to contrast this with [1], which works on formalizing _rich_ text
editors (for HTML).

1\. [https://medium.engineering/why-contenteditable-is-terrible-1...](https://medium.engineering/why-contenteditable-is-terrible-122d8a40e480)

------
benrbray
Neat! It would be great to have a universally agreed-upon standard for how
text editors behave (especially with regard to embedded / structured content,
cursor position, etc.) and which operations they should support.

I recently came across ProseMirror [1], which seems to be a fantastic step in
this direction. I've been experimenting with using it to build a Markdown
editor with better support for math and pre-defined HTML blocks (for e.g.
stating theorems, corollaries, etc. in math notes).

[1]: [https://prosemirror.net/](https://prosemirror.net/)

~~~
edjroot
Indeed, ProseMirror is highly regarded for its code quality, just like its older
sibling, CodeMirror. It even raised quite a bit of money on Indiegogo.

Are you planning on open-sourcing your editor? I'd love to use something like
this on a project of mine. (I currently use the much less ambitious EasyMDE
[1], which is based on CodeMirror.)

[1]: [https://github.com/Ionaru/easy-markdown-editor](https://github.com/Ionaru/easy-markdown-editor)

~~~
benrbray
Yes, that's the plan! I've long lamented that editors like Typora and Roam are
closed-source, preventing users like me from tweaking them for our specific use
cases. So I'm hoping to build something that others can modify if they need to.

My priorities are math support, citations/wikilinks and the ability to define
custom document structure (theorems, etc.). I basically want users to be able
to specify a ProseMirror schema with CSS to go along with it.

Can I ask about your specific use case? Any features you've been really
missing? No promises they'll make it in, but I'm curious :)

------
sanxiyn
It would be interesting to prove an implementation using a gap buffer correct
against the same specification.

~~~
neel_k
Gap buffers are zippers, which makes this particular case really easy: they
arise as derivatives of the list functor, and so for generic categorical
reasons they are isomorphic to a list plus a position in it. This means you
can prove things about lists-with-positions and basically automatically
transport them to statements about gap buffers.
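
To make the "list plus a position" correspondence concrete, here is a minimal Python sketch. It is an illustration only, not code from the paper: the names `GapBuffer`, `model_step` and `agrees` are hypothetical, the buffer is the two-stack (zipper) view of a gap buffer rather than an array with a physical gap, and it assumes single-character inserts and a starting position inside the text.

```python
class GapBuffer:
    """Zipper-style buffer: the text left of the cursor and the text right of it."""

    def __init__(self, text="", pos=0):
        # Assumes 0 <= pos <= len(text).
        self.left = list(text[:pos])             # characters before the cursor
        self.right = list(reversed(text[pos:]))  # characters after it, nearest one last

    def insert(self, ch):
        # Assumes ch is a single character.
        self.left.append(ch)

    def delete_back(self):
        if self.left:
            self.left.pop()

    def move_left(self):
        if self.left:
            self.right.append(self.left.pop())

    def move_right(self):
        if self.right:
            self.left.append(self.right.pop())

    def to_model(self):
        """Forget the representation: return the (text, position) pair."""
        text = "".join(self.left) + "".join(reversed(self.right))
        return text, len(self.left)


def model_step(state, cmd):
    """Reference semantics on a plain (text, position) pair."""
    text, pos = state
    op = cmd[0]
    if op == "insert":
        return text[:pos] + cmd[1] + text[pos:], pos + 1
    if op == "delete_back":
        return (text[:pos - 1] + text[pos:], pos - 1) if pos > 0 else state
    if op == "move_left":
        return text, max(pos - 1, 0)
    if op == "move_right":
        return text, min(pos + 1, len(text))
    raise ValueError(f"unknown command: {cmd!r}")


def agrees(text, pos, commands):
    """Run the same commands on both sides and compare after every step."""
    buf = GapBuffer(text, pos)
    state = (text, pos)
    for cmd in commands:
        getattr(buf, cmd[0])(*cmd[1:])
        state = model_step(state, cmd)
        if buf.to_model() != state:
            return False
    return True
```

A test like this only samples command sequences; a proof assistant would establish the same agreement for all of them, which is where the isomorphism does the work.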

The specification in this note is not an ideal top-level spec, though: it just
says there are a series of editor commands that can change one string into
another. This is just a basic sanity check, but you can't do much more than
that here because the editor state is too impoverished -- it's just the
current string.
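
For a sense of how weak that kind of spec is, here is a small Python sketch of a reachability check. The command set (`insert` at the end, `backspace`) and the names `run` and `witness` are assumptions made up for this example, not the paper's actual DSL.

```python
def run(doc, commands):
    """Interpret a command list against a state that is just the current string."""
    for cmd in commands:
        if cmd[0] == "insert":
            doc = doc + cmd[1]      # append text at the end
        elif cmd[0] == "backspace":
            doc = doc[:-1]          # drop the last character (no-op on "")
        else:
            raise ValueError(f"unknown command: {cmd!r}")
    return doc


def witness(source, target):
    """Build one command list that turns source into target."""
    keep = 0
    while keep < min(len(source), len(target)) and source[keep] == target[keep]:
        keep += 1                   # length of the common prefix
    deletes = [("backspace",)] * (len(source) - keep)
    inserts = [("insert", ch) for ch in target[keep:]]
    return deletes + inserts
```

Checking `run(s, witness(s, t)) == t` for any pair of strings shows that every document is reachable from every other one, and that is about all a string-only state lets you say.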

However, there is an interesting line of attack in (1) enriching the state
enough to easily model richer things like command history and undo (e.g., by
making the state be the whole history of documents and commands), (2) giving
each update command a semantics in terms of simple whole-state changes on the
enriched state, and then (3) showing that a more efficient representation
correctly implements the simple whole-state semantics.
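
Steps (1) to (3) can be sketched in Python roughly as follows. Everything here is hypothetical and chosen for brevity: the two commands (`append`, `truncate`), the `"undo"` marker, and the names `step`, `spec_run`, `Editor` and `impl_run`. The spec keeps the whole history of documents; the implementation keeps only the current string plus a stack of inverse commands.

```python
def step(doc, cmd):
    """Apply one non-undo command to a document string."""
    op, arg = cmd
    if op == "append":
        return doc + arg
    if op == "truncate":
        return doc[:len(doc) - min(arg, len(doc))]
    raise ValueError(f"unknown command: {cmd!r}")


def spec_run(script):
    """Simple semantics: the state is the full list of past documents."""
    history = [""]
    for cmd in script:
        if cmd == "undo":
            if len(history) > 1:
                history.pop()          # undo = forget the newest document
        else:
            history.append(step(history[-1], cmd))
    return history[-1]


class Editor:
    """Cheaper representation: current document plus a stack of inverse commands."""

    def __init__(self):
        self.doc = ""
        self.inverses = []

    def apply(self, cmd):
        op, arg = cmd
        if op == "append":
            self.inverses.append(("truncate", len(arg)))
        elif op == "truncate":
            n = min(arg, len(self.doc))
            # Remember the text that is about to be removed.
            self.inverses.append(("append", self.doc[len(self.doc) - n:]))
        self.doc = step(self.doc, cmd)

    def undo(self):
        if self.inverses:
            self.doc = step(self.doc, self.inverses.pop())


def impl_run(script):
    ed = Editor()
    for cmd in script:
        if cmd == "undo":
            ed.undo()
        else:
            ed.apply(cmd)
    return ed.doc
```

The correctness statement is then that `impl_run(script) == spec_run(script)` for every script; the sketch can only spot-check that on examples, whereas a proof would cover all of them.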

About five years ago, Conor McBride taught a course on dependently-typed
programming in Agda in which one of the projects was implementing provably
correct text editors, using zippers/gap buffers and interaction via Mealy
machines. The link is:
[https://github.com/pigworker/CS410-15](https://github.com/pigworker/CS410-15)

------
firstbabylonian
Are there any open source libraries that implement this in practice?

------
_pmf_
Is Emacs vs. Vi decidable?

~~~
nurettin
As someone who saw the war happen back in the early 2000s, vim conceded almost
immediately (because it was the efficient move). And people still can't get
over the fact that users continue to use vim anyway.

------
emj
I wish arxiv could be more like a blogging platform with standards and less a
PDF repository.

~~~
elcomet
Arxiv is a pre-print platform for scientific publications, not a blogging
platform. Other platforms are designed for this, such as Medium.

You can use [https://www.arxiv-vanity.com/](https://www.arxiv-vanity.com/) to
convert papers to HTML and read them in the browser though.

