
Building a Text Editor for a Digital-First Newsroom - aaronbrethorst
https://open.nytimes.com/building-a-text-editor-for-a-digital-first-newsroom-f1cb8367fc21
======
kaennar
One of the most fascinating things I learned from my father was how early
1980s word processors were designed for the magazine he worked for: the
different ways to handle fonts and screen placement, and how to include data
from a local database.

It was truly an awakening to the hidden wonders of the world.

------
TeMPOraL
RE the flat paragraph structure - did they just reimplement Emacs in the
browser?

[https://www.gnu.org/software/emacs/manual/html_node/elisp/Te...](https://www.gnu.org/software/emacs/manual/html_node/elisp/Text-Properties.html)

(Also, from occasionally messing with the XML in .docx files, I think Word
flattens styling like that as well, though I can't be sure.)
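
To illustrate what "flat" means here, a minimal Python sketch: the paragraph is plain text plus a list of offset ranges carrying a mark, rather than a tree of nested tags. The `Span` and `flatten` names are made up for this example and are not the Emacs, Word, or ProseMirror API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    mark: str   # hypothetical mark name, e.g. "bold" or "italic"


def flatten(text, spans):
    """Split text into consecutive runs, each tagged with the marks covering it."""
    # Every span boundary is a point where the active set of marks may change.
    # Spans are assumed to lie within the bounds of the text.
    cuts = sorted({0, len(text), *(s.start for s in spans), *(s.end for s in spans)})
    runs = []
    for lo, hi in zip(cuts, cuts[1:]):
        marks = frozenset(s.mark for s in spans if s.start <= lo and hi <= s.end)
        runs.append((text[lo:hi], marks))
    return runs


runs = flatten("Hello brave world", [Span(0, 11, "bold"), Span(6, 17, "italic")])
for chunk, marks in runs:
    print(repr(chunk), sorted(marks))
```

Overlapping bold and italic never nest; "brave" simply carries both marks, which is the property that makes this kind of structure easy to edit.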

~~~
nwmcsween
Any sufficiently complicated editor contains an ad-hoc, informally-specified,
bug-ridden, slow implementation of half of emacs?

I prefer vim though..

~~~
TeMPOraL
Vim ain't sufficiently complicated (that can arguably be seen as a good
thing). That's why it's been mostly reimplemented _in_ Emacs.

~~~
nwmcsween
Vim has vimscript (ugh), with Lua and Python gaining popularity, so emacs sort
of got it right IMO anyways.

~~~
thomastjeffery
Neovim's main goal seems to be replacing vimscript with lua in a backwards-
compatible way.

------
dugmartin
I'm surprised she didn't mention Marijn in the "thank yous" at the end -
without him giving away his time building ProseMirror they would have nothing
to build upon.

------
atonse
I've invested in Mobiledoc (mostly because it was for two ember apps) and it's
mostly been fine, but is there any reason why I shouldn't be moving to
ProseMirror? Seems to be a lot more powerful and have things like
collaborative editing and track changes as well.

~~~
JasonSage
Just my experience: I had a rough time with ProseMirror trying to use it
before it was released. It didn't use modern JS modules for a long time, the
source code was a mess as far as I could tell, and JavaScript variable errors
would surface with every version.

Maybe the author has overcome these issues, but I'd be inclined to go with a
different editor. Also, ProseMirror's architecture is extremely close to
DraftJS, which I've had much fewer issues with.

~~~
marijn
Yeah thanks for dropping some deeply outdated impressions in here. (Also not
even fair—the code was never a mess and at the time it wasn't clear how ES
modules would be used so using CommonJS was a completely reasonable choice.)

~~~
iamcasen
Hey there,

Just wanted to offer some feedback. Prosemirror's docs are woefully
incomplete. The way controls are added is extremely confusing. Why do we need
some node module called "prose-mirror-example-setup" ?

I know I'd personally prefer being able to add all the markup and style for
the editor, and then just use prosemirror underneath the hood for state
management.

Another thing that I found quite difficult, was programmatically inserting
images at the cursor.

Finally, storing state as markdown has turned out to be a nightmare. Not
really your fault, but perhaps something you should be aware of.

~~~
pier25
I agree the docs are obtuse, and implementing features is hard, but document
state is not saved as markdown. The whole point of PM is to separate content
from presentation.

------
manigandham
Cool, but it's yet another case of re-engineering the same thing. (edit: it's
not entirely new).

Splitting editors into an abstracted structure for finer control over data and
presentation is something that's been done for a decade now in various text
editors. Medium and Basecamp started similar projects and there's Mobiledoc
which is trying to standardize the format. It'd be nice if all this effort
focused on a single project so we could all just get a decent editor instead
of different versions with various states of functionality.

FYI, my running list of editors:
[https://gist.github.com/manigandham/65543a0bc2bf7006a487](https://gist.github.com/manigandham/65543a0bc2bf7006a487)

~~~
zawerf
It might be helpful to update your list with some feature comparisons. For
example:

Quill vs (CKEditor and TinyMCE, Draft, ProseMirror, Trix):
[https://quilljs.com/guides/comparison-with-other-rich-text-e...](https://quilljs.com/guides/comparison-with-other-rich-text-editors/)

Slate vs (Draft, ProseMirror, Quill):
[https://github.com/ianstormtaylor/slate#why](https://github.com/ianstormtaylor/slate#why)

I personally went with Slate because I needed first-class React support but
not as low level as Draft.

~~~
philcockfield
I really like Slate....but the ability to do collaborative editing is
relatively new.

Have you done anything with collaborative editing with Slate? If so, how are
you finding it?

~~~
zawerf
Collaboration isn't a requirement for me. I would definitely go with quill
instead if it were. They put a lot of thought into making the delta format
natural for operational transforms:
[https://quilljs.com/docs/delta/](https://quilljs.com/docs/delta/)

Your document state is always represented by a list of deltas. Two people
trying to change the same document would just have their individual list of
ops "rebased" together using OT before being concatenated into the existing
list of ops, forming a new document that is still just a list of ops.
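
A rough Python sketch of that "rebase, then concatenate" idea, assuming insert-only ops and a single server that decides the order. `Insert`, `apply_op`, `rebase`, and `replay` are hypothetical names for illustration, not Quill's Delta API, which also has retain and delete ops.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    index: int  # character offset in the document the author was looking at
    text: str


def apply_op(doc, op):
    """Apply one insert op to a document string."""
    return doc[:op.index] + op.text + doc[op.index:]


def rebase(op, over):
    """Shift `op` so it can be applied after the concurrent op `over`."""
    # Ties are shifted too, so the op accepted first keeps its text first.
    # A real OT system needs a tie-break that every site applies identically.
    if over.index <= op.index:
        return Insert(op.index + len(over.text), op.text)
    return op


def replay(ops):
    """The document is just the result of applying the op list in order."""
    doc = ""
    for op in ops:
        doc = apply_op(doc, op)
    return doc


history = [Insert(0, "Hello world")]
alice = Insert(5, ",")   # both edits were made against "Hello world"
bob = Insert(11, "!")

# The server accepts Alice first, then rebases Bob over her op.
history = history + [alice, rebase(bob, alice)]
print(replay(history))
```

The result is still only a list of ops, so the next pair of concurrent edits can be handled the same way.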

~~~
parthdesai
I second using quill. It's really modular and the code itself is so clean.

------
zellyn
Anyone know if Oak is (or will be) released as Open Source?

~~~
lambda
ProseMirror, the underlying editor they are using, is:
[https://prosemirror.net/](https://prosemirror.net/)

~~~
zellyn
Yep. That was clear from the article.

------
rayalez
I'm choosing an editor for my project right now. Can someone compare
ProseMirror to Slate and Quill? What is the difference, which one is better?
What do I need to know to choose one?

~~~
pfooti
I've used a lot of quill in the past couple of years. It has some tricky bits,
but seems similar in a lot of ways to ProseMirror - separate internal
representation of data that gets rendered into HTML, delta-based state
changes, etc. At this point, I use quill out of momentum - it was the only
game in town when I started working with rich text, and now I know how it
works. I also like it enough that I've never felt the need to go find
something else; it's always done what I want.

I think to fully experience all three of those editors, you should build the
following: something that generates a rich text input using the editor,
updates a data store (can be just in-memory if you want), and then re-presents
the rich text in a read-only format.

With Quill in particular, there's two ways to approach the second part - you
could dump the data into another instance of the quill editor, but one flagged
as read-only, and let quill handle rendering out all the elements. I don't do
that, myself, as it's a bit heavy and relies on the DOM to be present to
create the end result (meaning rendering rich text into emails on the server
side can be tricky). Instead I use a library (that I wrote) that converts the
quill representation into an HTML string that can be later injected into
whatever place it needs to go.
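
Not my actual library, but a minimal Python sketch of the approach, assuming the ops are plain dicts in Quill's delta shape (`insert` plus optional `attributes`). It only handles bold, italic, and link, skips embeds, and treats every newline as a paragraph break, which is far less than a real converter has to do.

```python
from html import escape

# Hypothetical mapping from inline attribute names to HTML tags.
INLINE_TAGS = {"bold": "strong", "italic": "em"}


def render_inline(text, attrs):
    """Escape one text run and wrap it in tags for its inline attributes."""
    html = escape(text)
    for name, tag in INLINE_TAGS.items():
        if attrs.get(name):
            html = f"<{tag}>{html}</{tag}>"
    if "link" in attrs:
        html = f'<a href="{escape(attrs["link"])}">{html}</a>'
    return html


def delta_to_html(ops):
    """Render a list of insert ops to an HTML string without touching a DOM."""
    paragraphs, current = [], []
    for op in ops:
        if not isinstance(op.get("insert"), str):
            continue  # embeds (images, mentions, ...) are skipped in this sketch
        for i, part in enumerate(op["insert"].split("\n")):
            if i > 0:  # each newline closes the current paragraph
                paragraphs.append("".join(current))
                current = []
            if part:
                current.append(render_inline(part, op.get("attributes", {})))
    if current:  # trailing text without a final newline
        paragraphs.append("".join(current))
    return "".join(f"<p>{p}</p>" for p in paragraphs)


ops = [
    {"insert": "Read "},
    {"insert": "this", "attributes": {"bold": True}},
    {"insert": " & that\n"},
]
print(delta_to_html(ops))
```

Because it is pure string building, the same function can run on a server when rendering an email.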

In general, I think that's one of the important pain points for any major
modern text editor: converting the internal data representation into a
canonicalized HTML string. This gets further complicated
by your framework - it's easy enough to inject an HTML string, and angular 1
lets you even compile that so you can have custom elements there, but angular
2 has no such facility (compiling; you can still inject). So rendering custom
elements into angular 2 is feasible but quite tricky.

You'll need to figure out how to handle custom elements too, like at-mentions
and so on; those are really domain-specific. How I handle at-mentions in quill
(with a pop-up autocomplete) depends on the structure of the overall system as
well as the framework I'm in - angular 1, 2 and react all suggest different
approaches for this.

Anyway, this is all a long response that means "it depends". Each editor has
its own strengths and weaknesses that are exposed by the specific data and
presentation needs of your project and the other frameworks in play. But
building an atmention or other custom element and converting from the internal
data structure to HTML is a good test of functions for all of those.

------
nojvek
Is Oak going to be opensourced? Looks very interesting.

------
priyadarshy
I've been watching this product for a while:
[https://www.nuclino.com/](https://www.nuclino.com/) it seems they too have
built something really impressive on top of ProseMirror (it's a lot like
Quip/Paper but more interesting). ProseMirror really feels like the only
editor on top of which you can build an entire user experience and app.

------
netzone
So, we had this product (or the larger one anyway) demoed for my work and our
takeaway was that only really large organisations could possibly use it. It
needs too many human hands between writing an article and getting it published.
We want to move to something more automated, and this system would have been a
serious setback.

------
wemdyjreichert
Oak was the codename of Java IIRC... May confuse some old people like me

------
Phrodo_00
This isn't even a text editor. It's closer to a web publishing tool. The
article also didn't talk about the normal text editor challenges like the way
the text is stored in memory and modified.

~~~
gbrown_
Agreed.

I carry the personal view that when someone says "text editor" they are
implying plain text. As such the leading paragraph of the article
makes me twinge a little when "text editor" is used rather than "word
processor".

The article then continues on to describe the editor which sounds very much
like a WYSIWYG editor. Indeed, the ProseMirror[1] homepage says the following.

    ProseMirror tries to bridge the gap between Markdown text editing and
    classical WYSIWYG editors.

    It does this by implementing a WYSIWYG-style editing interface for
    documents more constrained and structured than plain HTML

So whilst I've made a point of it reading oddly (and I would have thought this
would also be the view of most others on HN) I am not trying to disparage the
author's use of the term "text editor". But rather I am curious as to why such
nomenclature happened to be used here as I seldom see it outside of
descriptions of things such as Vim and Emacs.

[1] [http://prosemirror.net/](http://prosemirror.net/)

~~~
amelius
What would you call Adobe Indesign?

~~~
kccqzy
It's a digital publishing system.

------
venning
_Data Warning: 100 MB._

Viewing the header illustration for more than a few seconds crashes my iPad
Safari tab. Scrolling until it is off the viewport prevents this.

EDIT: Dev Tools reports 74.8 MB transferred _with_ an ad blocker. The
illustration is a single 69.2 MB gif.

EDIT: 101 MB in total when you scroll to the bottom. Three more gifs: 18.1 MB,
7.1 MB, 4.7 MB.

~~~
tedmiston
Instapaper Text view reduces it to a slightly more reasonable 41 MB.

[https://www.instapaper.com/text?u=https%3A%2F%2Fopen.nytimes...](https://www.instapaper.com/text?u=https%3A%2F%2Fopen.nytimes.com%2Fbuilding-a-text-editor-for-a-digital-first-newsroom-f1cb8367fc21)

(This may require a free Instapaper account to view.)

~~~
flukus
UBlock brings it down to 204KB because the images are on a CDN. Being on a CDN
is probably why the author is oblivious to the file sizes.

------
AJRF
I'd love to work for this engineering team. Their tech blog and GitHub are
full of interesting projects.

~~~
zem
the BBC sounds like it has a really good tech team too. wonder what it is
about newspapers that leads to this sort of tech culture.

~~~
gh02t
A desire to stay relevant. Print publishers often get criticized for being
behind the curve, but a few like NYT and BBC have invested heavily in the
technology of their business. Not sure how much it's paid off, but my
impression is they aren't struggling as much as some.

I hope they manage to find the right combination to stay around, I loathe the
dystopian future where Buzzfeed and Breitbart are the only news outlets.

~~~
scottmf
Buzzfeed isn’t comparable to Breitbart at all. Buzzfeed does real journalism
these days.

~~~
gh02t
Their news coverage is pretty strongly slanted to the left. Just because I
tend to agree with them doesn't mean I think they are really that much better
than Breitbart in terms of providing objective reporting.

In any event, what I meant was that I don't want a future where we only have
sources of "news" that serve to reinforce our pre-existing beliefs, with
everybody siloed into their own particular little bubbles.

------
sshine
> If you’re like most people in America, you use a text editor nearly every
> day.

Filter bubble detected.

~~~
venning
The author does not say "work in a text editor", only "use". Considering she's
including text editors on mobile devices, which the majority of America will
utilize today, that's not a particularly bold statement.

~~~
optimuspaul
I suppose when I send a text I am using a text editor of sorts

------
econnor
I'm speechless.

------
zmix
XML

------
briandear
"We’re planning to begin work soon on a collaborative editing feature that
would allow more than one user to edit an article at the same time"

You mean like Google Docs or Quip?

------
smacktoward
_> this new story editor needs to combine the advanced features of Google Docs
with the intuitive design focus of Medium_

This part made me sad. Imagine how much more powerful our tools could be if we
didn't insist on jamming them all into web browsers!

~~~
amelius
But at least they run everywhere.

~~~
noir_lord
Where everywhere is some subset of browsers in the wild and often not that
consistently anywhere.

I mean, the web is awesome, but it’s far from perfect from a programming point
of view. That, of course, doesn’t matter, for other reasons.

~~~
crcl
> Where everywhere is some subset of browsers in the wild and often not that
> consistently anywhere.

Yes, but it's easier to download a version of the browser that works. The
alternative is for the developer to build native apps for each OS they want to
support. In my experience, it's easier to build cross-browser than it is to
build cross-platform.

------
jopsen
If you have professional writers why not just teach them markdown and git?

Instead of spending engineering $$$ building, maintaining and operating a
complicated editor?

~~~
ktpsns
Because professional writers are not professional coders. Markdown covers the
basic text markups but does not define how to format all these embedded media
a modern newspaper has. Just to give an example: Inspect the syntax of
wikipedia, which is the MediaWiki wikitext
([https://www.mediawiki.org/wiki/Wikitext](https://www.mediawiki.org/wiki/Wikitext))
language. It is simple and therefore Wikipedia pages are rather simple. I
expect the NYtimes to have higher aims in markup.

~~~
combatentropy

>> why not just teach them markdown and git?

I myself am surprised they don't just use markdown. Git is another story.

> writers are not professional coders.

But I have heard of writers clinging to old versions of WordPerfect, which
were not WYSIWYG. They preferred the keyboard shortcuts to clicking around
with the mouse.

Furthermore if you're just writing the text of the article, you don't even
need markdown, just a textarea. They used to do these on typewriters after
all. Newspaper articles traditionally lacked anything but plain text. No bold,
italics, or subheadings.

> Markdown covers the basic text markups
> but does not define how to format all these embedded media
> a modern newspaper has

Traditionally the jobs were separate --- and I assume they still are. There
are the reporters, who research and write the text of the article. Then there
is the layout team who lay it out, decorate it, etc. I can see how markdown
would be no good for them. They would want a graphical user interface.

~~~
acdha
> Traditionally the jobs were separate --- and I assume they still are.

Is this really a good assumption? Journalists are making complex documents
with central interactive features and lots of data visualization. It seems a
lot like classic waterfall thinking to assume that reporters could perfectly
envision that and toss it over the wall for a separate team.

~~~
combatentropy
Well yes, I think so, for these reasons:

1\. Most articles have simple layouts, like this one,
[https://www.nytimes.com/2018/04/12/us/politics/trump-trans-p...](https://www.nytimes.com/2018/04/12/us/politics/trump-trans-pacific-partnership.html)

2\. Most reporters were journalism majors, which teaches research,
interviewing, and writing. Graphic design is a separate discipline. It is
possible for one person to be good at both, but rare.

3\. Companies tend to cut up work, especially big companies, like the New York
Times.

4\. Complex articles like this one,
[https://www.washingtonpost.com/graphics/2018/entertainment/v...](https://www.washingtonpost.com/graphics/2018/entertainment/video-game-movies/),
aren't tossed over the wall. But that doesn't mean that one
person did it all either. I'm sure there was back and forth between the writer
and designers, and likely an editor oversaw it all.

------
jbob2000
Whew, what a gross way to over-engineer a text editor. If the NYTimes
structured themselves like a dev team, they could use common sense tools to
get much better results.

Why is the journalist responsible for how it's presented? Wouldn't it make
more sense to have some kind of product manager or designer handle this? And
why is the journalist responsible for programming the look and feel of
their article? Why not have a front end dev do that?

~~~
ax0ar
The journalist doesn't only write an article, he also structures and plans how
the story is going to be presented because that's also part of the job. Hiring
another person in-between the article and the journalist is like hiring an
extra person to only hammer a nail when there is a construction worker to do
that job already.

They develop the underlying technology to let journalists do what they want
without needing to hire more employees.

~~~
jstewartmobile
In industries that still make money--even ones with better software-chops than
the New York Times--writing/researching an article, and presenting it, are
_still_ separate jobs--and for good reason! They require orthogonal skillsets
and a decent amount of time and hands-on experience if you want a professional
result.

No one is "only" hammering a nail here.

