The spec is still not settled, but it is safe to use as a guideline for an implementation. If libraries align their command API implementations with the spec, it will be easy for them to switch over to native commands as browser support gets stronger.
Wouldn't it be more desirable to have these WYSIWYG editors serialize to a non-HTML markup (like textile or markdown) to reduce the hassle of user-input sanitization on the back-end? (e.g. stripping script and iframe tags). What's best-practice these days for storing and displaying rich-edit user input?
I doubt Markdown, BBCode or anything similar is a good idea here. That's just introducing extra complexity - and what for? The point of markdown is that it's simple for humans to read and write directly, which isn't applicable here.
The downsides to markdown should be obvious:
* more code, both server and client-side (to implement the to-and-from conversion)
* more bugs (due to more code and the complexity of escaping valid input that happens to be markup in one or the other)
* fewer features (if the editor supports some HTML that doesn't map 1-to-1 to markdown, you're in trouble)
* less future-proof/platform independent (HTML isn't going anywhere, but that markdown variant you're using with the custom extensions you needed might be subtly different in whatever language/platform/toolkit you'd prefer in 5 years).
HTML is by far the better choice. If there's an improvement to be had here, it's in using the (compatible) XHTML5 serialization to ease parsing. And it's quite likely already using that, since that's what browsers' rich-text-editing generally produces.
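To make the escaping point concrete, here's a rough sketch in Python. The function name and the set of escaped characters are my own assumptions for illustration, not the rules of any particular Markdown library; real variants disagree on what needs escaping, which is part of the problem.

```python
import re

# Characters that commonly carry meaning in Markdown. This set is an
# assumption for illustration; each Markdown variant has its own rules.
_MARKDOWN_SPECIALS = re.compile(r"([\\`*_{}\[\]()#+\-.!])")


def escape_markdown_text(text: str) -> str:
    """Backslash-escape characters that Markdown might read as markup."""
    return _MARKDOWN_SPECIALS.sub(r"\\\1", text)


if __name__ == "__main__":
    # Plain user text that would otherwise turn into emphasis.
    print(escape_markdown_text("2 * 3 * 4 and snake_case"))
```

Every conversion in either direction needs something like this, and it has to agree with whatever parser reads the result back.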
I suppose it's strange to say this on HN where the markup is, well, atrocious, but after using Markdown for ages in varieties of places, it's simply more pleasant to use for the "advanced" user who doesn't want to memorize hotkeys or highlight and press buttons to give their text some basic simple formatting.
It's a hit on reddit, GitHub and more for a good reason. They could have whitelisted things as well, but they chose not to.
If you would be happy with markdown, you'll be happy with a whitelist-based HTML sanitizer. HTML sanitization is only a hassle if you take the blacklist approach in an attempt to allow lots more than what markdown can do.
I've used antisamy, but there are many others and I don't know which is best. But I would call the whitelist approach, in general, best practice.
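For anyone wondering what the whitelist approach looks like in practice, here is a minimal sketch using only Python's standard library. This is not how antisamy works internally; the tag list, attribute list and URL prefixes are all assumptions I picked to roughly match what markdown can express, and a production sanitizer handles many more edge cases (unbalanced tags, for one).

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist, roughly what Markdown can express.
ALLOWED_TAGS = {"p", "br", "b", "i", "em", "strong", "ul", "ol", "li",
                "a", "blockquote", "code", "pre"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_URL_PREFIXES = ("http://", "https://", "mailto:")
VOID_TAGS = {"br"}
# For these tags the content is dropped too, not just the tag itself.
DROP_CONTENT_TAGS = {"script", "style", "iframe"}


class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT_TAGS:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue
            if name == "href" and not value.strip().lower().startswith(SAFE_URL_PREFIXES):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in ALLOWED_TAGS or tag in VOID_TAGS:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text: str) -> str:
    """Return html_text with everything outside the whitelist removed."""
    parser = WhitelistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

The key property is that anything not explicitly listed disappears by default, so a new tag or attribute you never thought about can't slip through the way it can with a blacklist.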
If you are going to manipulate the document structure on the server, JSON might be an alternative. There is an interesting project on GitHub that does HTML -> JSON conversion: https://github.com/gregory80/fastFrag
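As a rough idea of what working with the structure as JSON could look like, here is a small Python sketch using only the standard library. The node shape (tag/attrs/children) is my own assumption and is not fastFrag's format.

```python
import json
from html.parser import HTMLParser

# Elements that never have a closing tag (partial list, for illustration).
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class TreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = {"tag": "root", "attrs": {}, "children": []}
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "attrs": dict(attrs), "children": []}
        self.stack[-1]["children"].append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open element; ignore stray closers.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i]["tag"] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        self.stack[-1]["children"].append(data)


def html_to_tree(html_text: str) -> list:
    """Parse an HTML fragment into nested dicts and strings."""
    builder = TreeBuilder()
    builder.feed(html_text)
    builder.close()
    return builder.root["children"]


if __name__ == "__main__":
    tree = html_to_tree("<p>Hello <b>world</b></p>")
    print(json.dumps(tree, indent=2))
```

Once the document is a plain tree of dicts, server-side transformations become ordinary data manipulation instead of string munging.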
Whenever I see "new rich HTML editor", I notice that there is no table editing. What's the point of an editor that is missing the most needed function? You could argue that it is not needed, but the lack of this function leads to copy-paste from Word/Excel with bad markup.
I've been thinking about this a lot with Hallo (http://hallojs.org/), and I'm starting to come to the conclusion that the traditional way of doing tables in WYSIWYG editors is simply wrong.
What if a table editor were essentially a database query, with data coming from your CMS or some other system? Then instead of building a generic table by hand you could do things like "insert a table of attendees of this event, with columns displayName, company, and email".
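A minimal sketch of that idea in Python, assuming the query has already returned a list of records. The function name and the sample attendee are made up for illustration; nothing here is Hallo's actual API.

```python
from html import escape


def render_table(rows, columns):
    """Render a list of dict records as an HTML table with the given columns."""
    head = "".join(f"<th>{escape(col)}</th>" for col in columns)
    body = "".join(
        "<tr>"
        + "".join(f"<td>{escape(str(row.get(col, '')))}</td>" for col in columns)
        + "</tr>"
        for row in rows
    )
    return f"<table><thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>"


if __name__ == "__main__":
    # Hypothetical query result; in practice this would come from the CMS.
    attendees = [
        {"displayName": "Ada Example", "company": "Example Corp", "email": "ada@example.com"},
    ]
    print(render_table(attendees, ["displayName", "company", "email"]))
```

The editor would then only need to store the query and the column list, and the markup stays clean because nobody types the table by hand.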
I'll try to do some experiments with this in the summer.
If you've used Drupal, what you're describing is simply the correct usage of one of its most powerful pieces: Views. However, even with its somewhat decent "query" editing interface, it's still a complex task that requires some critical data-oriented thinking skills. It's important, I think, for people to be able to format an electronic table about as easily as they can do it with a pen and paper. It's a fundamental formatting operation, IMO, and there should always be a fast/intuitive way of making them.
I'm a contributor to Aloha, and the project recently underwent a huge rewrite to cut down the cruft. The next release will use a lightweight UI based on jQuery, and switching to a newer command API based on W3C's editing API will also make things faster (and easier).
The main issue with the documentation was the rapid changes to the API in the past. Since the API is getting stable, there will be more focus on improving the documentation.
Sorry, just re-read my comment and realised I came across as yet another of those entitled and grumpy anonymous commenters you find online. For what it’s worth, I like Aloha as integrated into the Aloha site, I just failed miserably despite much effort to integrate seamlessly into a recent project. I was aware of the jQuery UI rewrite but the branch didn’t seem to have had any commits in the previous two weeks when I looked so I assumed progress was slow. Glad to hear there’s more definitive plans.
I stand by my comment about the lack of documentation though. I appreciate that it’s often difficult to get people to contribute to docs over code, but it’s pretty imperative for a project like this that there’s at least an easy integration walkthrough and a handful of examples showing common customisations.
I among others will work on the dev-jqueryui branch in June. I hope we will manage to stabilize it and merge it to dev in this time.
It should be noted that we plan to make the UI an optional part of Aloha (with the default being a jQuery UI implementation) to make it easier to integrate into existing systems. The heavy Ext JS dependency is part of why Aloha is so huge.
I also plan to look into using Google Closure Compiler in ADVANCED_OPTIMIZATIONS mode to shrink the size to the absolute minimum.
Sorry, I was mocking the notion that someone is supposed to interop with Word and that not magically converting from Word is somehow their fault instead of Word's. I can only imagine it's non-standard. In fact, it's probably handled by the user-agent.
Looking at the examples, I don't understand how this is "a better approach" to editing, because the HTML it outputs does not follow best practices. The HTML it outputs should look like a competent human had authored it, but currently it uses line-break hacks for paragraphs instead of semantic paragraph elements (two <br> elements aren't really a paragraph, it just looks like it), adds underline elements with Ctrl+U (these should be reserved for links and defined in stylesheets anyway) etc.
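As an illustration of the paragraph point, here is a rough Python sketch of post-processing that turns runs of two or more <br> tags into real paragraphs. It is a regex over a flat fragment, which is an assumption on my part: it won't cope with <br> tags nested inside other elements, and it isn't what any of the editors discussed here actually does.

```python
import re

# Two or more consecutive <br> tags, allowing whitespace between them.
BR_RUN = re.compile(r"(?:<br\s*/?>\s*){2,}", re.IGNORECASE)


def br_runs_to_paragraphs(fragment: str) -> str:
    """Split on runs of two or more <br> tags and wrap each chunk in <p>."""
    chunks = [chunk.strip() for chunk in BR_RUN.split(fragment)]
    return "".join(f"<p>{chunk}</p>" for chunk in chunks if chunk)


if __name__ == "__main__":
    print(br_runs_to_paragraphs("First line<br>still first<br><br>Second"))
```

A single <br> is left alone as a line break; only the double-break hack gets promoted to a paragraph boundary.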
You might want to read this bit from that section:
"The b element should be used as a last resort when no other element is more appropriate. In particular, headings should use the h1 to h6 elements, stress emphasis should use the em element, importance should be denoted with the strong element, and text marked or highlighted should use the mark element."
From the looks of it, while it's just about legal, it's the worst option going. <em> etc. are far preferable and 'right'.
Only I don't want my text to be "strong", I want it to be bold, and I don't want my italicized items to be "emphasized", I bloody want them in italics.
I don't consider my bolds and italics to be mere style and much less I consider them mere suggestions.
Bold and italic are typographic conventions of author _intent_ (that is: semantics) with centuries of use. I put them there on purpose, and I don't want them converted to anything else via styling. Sure, someone can style "b" as "purple text with a yellow dotted underline", but I might as well make my _actual_ intention clear.
Outside of its proper context, "semantic" is just a BS notion that got popular with designers and co because it sounds sophisticated, giving rise to inane arguments similar to how many angels fit on the head of a pin.
Very nice. We discussed a lot whether to use the same <iframe> approach to text editing when working on a project of ours called Quabel. In the end, we went even further with a pure DOM-based editor (via contenteditable) and no hidden <textarea>. However, we encountered lots of cross-browser issues because of the differences in contenteditable behavior between FF / Chrome / Opera, and later dropped contenteditable altogether. I chuckled when noticing the odd editor behavior when pressing backspace at the beginning of a list item, because we had exactly the same kind of oddities.
Feature request - ability to add 'notes' to a document. I'd imagine these as toggleable divs that would be able to be inserted anywhere in the doc, then made visible or not by toggling a class. When 'on' the visible portion would just be a small box/space/marker that, when clicked, opened up a larger div with the full note.
I'd tried to do this with the YUI and Dojo editors a couple years ago, but my JS-fu wasn't good enough. I'm possibly better now, and maybe will try my hand at adding that to this editor (which looks nice for a lot of applications), but someone else with better skills could probably lay the foundation for a 'notes' system much better than me :)
> Feature request - ability to add 'notes' to a document. I'd imagine these as toggleable divs that would be able to be inserted anywhere in the doc, then made visible or not by toggling a class. When 'on' the visible portion would just be a small box/space/marker that, when clicked, opened up a larger div with the full note.
I think this suggestion fully qualifies as something they don't want to do, "[to create] unmaintainable tag soups and inline styles".
Who said inline styles? A class and an ID on a span tag inside a larger block would be all that's required. The clicking stuff would be applied outside of the document editing area.
This is the sort of reaction that makes me hesitant to even attempt to fork/patch it, because if this sort of thing wouldn't be accepted back into the main trunk, I'm stuck maintaining a separate branch.
Want good WYSIWYG HTML? Try out Apple Pages, export to EPUB, extract and look at the XHTML files. After pretty-printing it's almost like human-authored markup. It seems that we still need the Apples and Adobes of this world.
The code isn't officially released anywhere, but someone grabbed the source, prettied it up, and put it up here: https://github.com/benjamn/kix-standalone
Obviously you can't use it for anything without a proper license.
wysihtml5 is a decent editor. It's fairly lightweight in comparison to the other editors. I feel that its approach to DOM changes makes it easier to create cross-browser consistency. However, I feel the event system could really use some work and I find myself working against the editor to add new features. I would prefer if the framework made it easier to add new features rather than having to change the core to get the desired result. (and maybe I just need to spend more time with it.)
Oh, and I should add that the built-in parser isn't very flexible and requires lengthy configuration.
The problem I have with most editors is that they try to be like a piece of desktop software. "Install me and get all of these features including themes and plugins." That's not what I want. I want an API. I want a consistent cross-browser approach to wysiwyg html editing that doesn't weigh in at 200+k minified.
What does Aloha come in at? (Web Inspector is telling me 1.6MB!)
I haven't used Aloha but my impression is that it's trying to be like all the other editors like TinyMCE or CKEditor.
wysihtml5 is lightweight, which is nice. It provides a command API so that I can wire up commands to my own toolbar and provides state management. But I don't like that some built-in commands don't offer up enough externalization, like autolinking. Or that I can't cancel an event like beforecommand. beforecommand doesn't even tell me what command is being fired. (or if it does, I couldn't find out how.)
We officially support IE7 and IE8 in Aloha. I say officially, because, although it does seem to work, it doesn't work very well (at all). IE does some weird stuff with the DOM, especially inside a contenteditable (certain combinations of DOM elements completely mess up the structure if it's inside a contenteditable, but not if it's outside).