Show HN: Margin – A lightweight, flexible markup language for structured thought (margin.love)
210 points by nobody_nothing 52 days ago | 61 comments

Hey Show HN!

I'd love some early feedback on Margin, a markup language for hierarchically structured thought.

Margin came out of my desire to build a lightweight to-do app – though I soon realized what I really wanted was a markup language to capture structured thought. One that was not only human-readable, but also easily machine parsable.

Most importantly, Margin doesn't impose strict hierarchical categories (e.g. "Header 1", "Project", "Task"). Instead it allows the user/application to define those categories. The ultimate goal is to democratize your to-do lists, notes, writing, etc., making them portable and platform-independent.

Aside from feedback on the specs & philosophy of Margin, which would be very appreciated, I could use some serious help with the parser (https://margin.love/parser/). I'm only a casual coder, so this problem was a difficult one for me. The parser is both incomplete and buggy, but hopefully it gets across the basics of how Margin is supposed to work.


Sorry, one more note because I'm currently trying to write a parser for this. You should specify precisely which characters are ignored at the beginnings and ends of lines. I would strongly suggest using UTF-8 values to specify them, too, since that's nice and unambiguous. Include in that set of ignored characters exactly which whitespace characters you are choosing to ignore, even if it's just spaces and/or tabs. There are lots of whitespace characters. Lots.

One issue with ambiguity now is that you specify that "dashes" are ignored, even though (I assume) you probably meant "hyphens". Using UTF-8 would solve that.

Also, you might consider adding "+" to the set of ignored characters. It's frequently used for lists. Just something to think about.
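A minimal Python sketch of what an explicit ignore set could look like, with each character named by its Unicode code point. The set `IGNORED_EDGE_CHARS` and the helper `strip_edges` are assumptions for illustration, not taken from the Margin spec:

```python
# Hypothetical ignore set; the real Margin spec may choose different characters.
IGNORED_EDGE_CHARS = "".join((
    "\u0020",  # SPACE
    "\u0009",  # CHARACTER TABULATION (tab)
    "\u002D",  # HYPHEN-MINUS (the ASCII "-", not an en or em dash)
    "\u002B",  # PLUS SIGN
    "\u002A",  # ASTERISK
    "\u003E",  # GREATER-THAN SIGN
))


def strip_edges(line: str) -> str:
    """Remove every listed character from both ends of the line."""
    return line.strip(IGNORED_EDGE_CHARS)


print(repr(strip_edges("\t- Some Item ")))
print(repr(strip_edges("\u2013 not a hyphen")))  # EN DASH is not in the set
```

Because the set is spelled out character by character, a true dash such as U+2013 is left in place while the ASCII hyphen is stripped, which is the ambiguity raised above.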

EDIT: I should probably start filing these as issues on your GitHub.

I like the indexes, though without a query language implied I'm not sure how useful they are. I can see why you wouldn't want to specify a query language in a markup spec though.

I made a query language for use in situations just like this years ago, never ended up using it much though.


Another question: if an item has multiple annotations, should the annotations be considered ordered? Or is ordering defined to be non-meaningful? I don’t know if there’s an obvious answer to this, but some applications might benefit from semantic ordering.

These probably seem like really picky questions but I’m thinking about how to write a parser for this and these are the questions that are coming up.

My first instinct is no, the order of annotations should not matter to a parser. Or, put more precisely: that would be outside the scope of the spec, and keeping track of annotation order shouldn't be required of a valid parser. (Though, of course, if an application wanted to keep these annotations ordered it should be allowed to do so.)

Though it's a great question, and I could see myself changing course if a good use case were pitched whereby the ordering of annotations mattered specifically in a way that a use-case-specific app couldn't function if those children weren't already ordered for it in the plaintext.

There's also a case to be made that, because annotations are just a special type of child items, and because item order is generally meaningful in Margin, then we might as well keep annotations ordered too for good measure.
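One way a parser could sidestep the question is to keep annotations in source order and let each application decide whether order matters. The sketch below assumes the bracket syntax seen in this thread (`[key: value]` or a bare `[value]`); the regex and the function name `parse_annotations` are illustrative, not the official grammar:

```python
import re

# Assumed annotation syntax: square brackets with no nesting.
ANNOTATION = re.compile(r"\[([^\[\]]*)\]")


def parse_annotations(text: str) -> list:
    """Return annotations as (key, value) pairs in the order they appear."""
    pairs = []
    for body in ANNOTATION.findall(text):
        key, sep, value = body.partition(":")
        if sep:
            pairs.append((key.strip(), value.strip()))
        else:
            pairs.append((None, body.strip()))  # bare annotation, no key
    return pairs


ordered = parse_annotations("Frankenstein [author: Mary Shelley] [year: 1818]")
unordered = dict(ordered)  # an app that ignores order can use a plain mapping
print(ordered)
print(unordered)
```

An order-agnostic application loses nothing by converting the list to a mapping, while an order-sensitive one keeps the original sequence.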

This looks really interesting. For a “casual coder”, I think you did a good job designing the spec for a new markup language. That’s not easy.

One weird edge case I’m thinking about: what happens if you have two lines, A and B, and A is indented with a tab and B is indented with 3 spaces? Is A the parent of B?

More generally: how do you compare tabs and spaces in the context of indentation levels?

This is a great question and something I've pondered. I recently learned that mixing tabs and spaces in certain Python environments, such as Google Cloud Functions, simply isn't allowed -- uploading such code will cause an error.

Unfortunately, that might be the cleanest solution for Margin, too: to simply disallow the intermingling of tabs and spaces as hierarchical tokens within a single parent.

In that scenario, you could use tabs in one part of the document and spaces in the other -- as long as all the direct children of one parent item follow the same rules.

What if you would just be a little bit stricter and say that each level of indentation is either 1 tab or 2 spaces? 3 spaces is then interpreted as one level of indentation of an item starting with a space.
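The stricter rule proposed here is easy to state in code. This is only a sketch of that one suggestion (one level is one tab or exactly two spaces), not of what the Margin spec actually does; `indent_level` is a hypothetical helper:

```python
def indent_level(line: str) -> tuple:
    """Return (level, rest), consuming one tab or two spaces per level."""
    level = 0
    i = 0
    while True:
        if line.startswith("\t", i):
            level += 1
            i += 1
        elif line.startswith("  ", i):
            level += 1
            i += 2
        else:
            break
    return level, line[i:]


print(indent_level("\tA"))    # one tab
print(indent_level("   B"))   # three spaces: one level plus a leading space
print(indent_level("\t  C"))  # a tab followed by two spaces
```

Under this rule the tab-indented line A and the three-space line B both sit at level 1, so they are siblings, and B's text simply begins with a space.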

Did you look at org-mode? That seems to me like the closest existing implementation to what you’re attempting.

Org-mode is awesome. But only in the context of Emacs. Markdown is less capable, but almost ubiquitous. I went from org-mode[^0] to markdown some years ago, and the only thing I am missing is org-tables.

Currently I am using a set of markdown editors: Ulysses[^1], Typora, iA Writer and a few others. That makes me interface agnostic. The most important thing is the content, the (markdown)text I am producing. I can't say that about text in org-mode. It's tied to Emacs.


In Ulysses I started using an annotation system myself, that's why this project here caught my attention. I am experimenting with brackets, double brackets, colons, double colons. Still in the process of experimenting.


[^0]: was using org-mode to write documentation, journal entries, and as a task management system

[^1]: Ulysses is way more than an editor. It is like Evernote with markdown targeting authors. It's my 'Zettelkasten'

> Org-mode is awesome. But only in the context of Emacs.

Really? I agree with [Karl Voit's 2017 essay](https://karl-voit.at/2017/09/23/orgmode-as-markup-only/) regarding that. To quote:

> You can type Org mode in vim, notepad.exe, Atom, Notepad++, and all other text editors out there. And in my opinion it does have advantages compared to the other, common lightweight markup standards such as Markdown, AsciiDoc, Wikitext or reStructuredText.

Voit then provides a very compelling case for why Org-mode syntax is a well-designed [lightweight markup language](https://en.wikipedia.org/wiki/Lightweight_markup_language), and works well outside of Emacs.

Org mode looks great, thanks for calling my attention to it. For anyone looking for a brief introduction, they have a compact guide[1] on their site that took me a minute to find.

It looks great for coders who are ready to learn new syntax. Though what I'd really like to avoid with Margin is anything that makes the language more complex than it needs to be.

For example, org mode defines headlines as:

    The headlines in Org start with one or more stars, on the left margin. For example:
    * Top level headline
    ** Second level
    *** Third level
With Margin, there's no concept of a "headline." For the thinker, the headline could be represented by any level of the plain text hierarchical tree. And the thinker would then be free to choose (or conform to) any Margin-based app that corresponds with their model. In other words, it works best however you want to use it.

[1] https://orgmode.org/guide/

I understand the different approach, but the ecosystem that leverages it aligns quite nicely with your goals. It’s anything from a todo app to a recipe book to a financial planner.

I doubt there’s any way to take advantage of the common objectives, but if you look at how the libraries build on the syntax, there might be lessons to be shared.

How are you going to distinguish (bulleted) lists from multi-line paragraphs without a minimum of syntax?

P.S.: Whitespace is syntax too ;-) -- signed, Guido van Rossum

I really like it. I like the annotations feature. And I like how the items are isolated.

Pretty much all you need. Would love to see it augmented with rich text. :)

Good work!

This is great, thanks for sharing. I've been looking for this to create a to-do app that can save to a text format as a backup and can seamlessly be edited through the UI or text for maximum portability, so basically a UI client for modifying this type of markup plus some added features.

Great to hear! As I said to @brigandish below (and will shamelessly repeat): if, when building the app, you find yourself wanting to contribute to the JS parser[1] in any way (or even to a non-JavaScript parser), it'd be much appreciated. My biggest need right now is a good parser that'll encourage people to try this out.

[1] https://github.com/gamburg/margin/blob/master/parser/js/Marg...

Btw, have you seen https://treenotation.org/ ? That is the other one I was looking at, but it's more abstract and you would need to construct the markup language in it first.

I see, I'll post any feedback and eventually the project I'm doing. Thanks.

How do you deal with multiline data? For example, what if I wanted to add a multiline code snippet?

Important question. Still thinking through this: https://github.com/gamburg/margin/issues/2

Are you familiar with ArchieML?

No, but looking at it now, it's definitely an inspiring model for how to do something like this right.

I'd say the goals of Margin are similar, but more focused on thought that is specifically hierarchical in nature -- whereas ArchieML seems to be more focused on structuring text in key:value pairs (if I'm understanding it correctly).

The hope with Margin is that lots of people already store their notes, to do lists, and random thoughts in a format that might already be (or almost be) valid Margin. It's intentionally non-technical, and its syntax should make sense to those who don't know or care what Margin is.

Wow, the syntax of ArchieML seems a lot harder to learn than something with a more regular syntax like, say, YAML. I wonder if the NYT still uses that. Also, having no feedback while you write in a gdoc seems like a recipe for disaster (maybe they had or have some sort of gdoc plugin to give syntax feedback?)

It’s still used in many news orgs. The writers mostly don’t touch the markup, so it works out.

I'm very eager to see this project develop. Looking at the examples, here are my thoughts:

- It seems like annotations should be excluded from the "value" field. If I write something like "Frankenstein [author:Mary Shelley]", I'm probably going to want to access the title "Frankenstein" on its own after parsing.

- It would be nice to be able to omit the brackets when there's no whitespace inside an annotation. For example, you could write "broccoli type:vegetable" instead of "broccoli [type: vegetable]"

- I like that you can use different list decorators, like -, >, or *. But these don't show up anywhere except the "raw_data" field; it would be nice if there was a field that contained just the decorator, something like:

       "raw_data" : "\t- Some Item",
       "value" : "Some Item",
       "decorator" : "-",
       "annotations": {},
       "children": []
Then you could easily use different decorators to encode semantic information. For instance, a Pros/Cons list could use '+' for Pros and '-' for Cons, and that distinction would be easy to access programmatically. If you do this, I think "[ ]" and "[x]" should end up in the decorator field. Or maybe it could be a "prefix" field and "suffix" field, to hold decorators at either the beginning or end?
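A rough Python sketch of an item parser that produces the fields suggested above, with the decorator split out and annotations removed from `value`. The decorator set, the regex, and the name `parse_item` are assumptions; checkbox prefixes such as "[ ]" are not handled here:

```python
import re

DECORATORS = ("-", "+", "*", ">")  # assumed set of leading list markers
ANNOTATION = re.compile(r"\s*\[([^\[\]]*)\]")  # assumed: brackets, no nesting


def parse_item(raw: str) -> dict:
    """Split one line into decorator, annotation-free value, and annotations."""
    text = raw.strip()
    decorator = ""
    # Treat a marker as a decorator only when it stands alone before the text.
    if text[:1] in DECORATORS and text[1:2] in ("", " "):
        decorator, text = text[0], text[1:].lstrip()
    annotations = {}
    for body in ANNOTATION.findall(text):
        key, sep, value = body.partition(":")
        if sep:
            annotations[key.strip()] = value.strip()
        else:
            annotations[body.strip()] = True
    return {
        "raw_data": raw,
        "value": ANNOTATION.sub("", text).strip(),
        "decorator": decorator,
        "annotations": annotations,
        "children": [],
    }


print(parse_item("+ Frankenstein [author:Mary Shelley]"))
```

With the decorator exposed, a Pros/Cons application could branch on `item["decorator"]` without re-reading `raw_data`.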

Just wanted to say, I think the second point is more awkward than it's worth. Some examples:

- check that video Alice mentioned at time 3:42
- don't forget to text Bob O:)
- my favorite thing is:strawberries

It makes parsing more complicated, as well as creating unexpected results in non-obvious cases (for example the last one could just be a typo, and what if it was instead written as 'is:strawberries, cake'?). It also isn't intuitive in the case of your first example, since I think of Mary Shelley as a single thing.

Alternatively, annotations could just read to EOL, unless it's explicitly guarded off. That would depend on how people use it, I suppose.
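To make the concern concrete, here is a small Python sketch of a naive bracket-free matcher. The pattern `BARE` is a hypothetical rule (word, colon, run of non-space characters), not anything proposed in the Margin spec; it shows how each of the three example lines would yield an unintended annotation:

```python
import re

# Hypothetical bracket-free rule: key, colon, then any non-whitespace run.
BARE = re.compile(r"(\w+):(\S+)")


def bare_annotations(text: str) -> list:
    """Return every (key, value) pair the naive rule would extract."""
    return BARE.findall(text)


for line in (
    "check that video Alice mentioned at time 3:42",
    "don't forget to text Bob O:)",
    "my favorite thing is:strawberries",
):
    print(line, "->", bare_annotations(line))
```

A timestamp, an emoticon, and a typo all match, so a bracket-free syntax would need carve-outs (digits, punctuation) that complicate both the spec and the parser.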

This is the syntax used in todo.txt, which I've found works well, especially with syntax highlighting to guard against typos. You're right that it would be harder to parse, especially if you carved out an exception for digits and special characters to account for the examples you mentioned. I'll admit that I'm not especially well positioned to weigh the benefit of syntactic sugar versus the development cost of parsing, so perhaps you're right.

Very enjoyable idea. The ideas first, tidy up later ethos is one with which many other software engineers will agree.

I haven’t written any Margin — I’ve only read through your documentation — but maybe this feedback will be useful. (Apologies if you’re already doing all of this, in which case it may be helpful to others at least.)

I love the idea of flexibility in how Margin can be used. Because of this, though, you will find emergent behavior among your user base. People will do things you could never have imagined and expect Margin not to break their workflow.

If you capture these as a standard test suite or set of test cases, it will be very helpful to anyone implementing a Margin parser in another language.

You already have a reference parser in JS and if you get to the million user mark (!) you might like to provide it in C instead. Something like libSYCK did for YAML.

For now, having a test suite is the next best thing. While Margin is not prescriptive about how you use it, you’ll want Margin parsers to be strictly in agreement in how they interpret Margin, for it to be widely adopted! Then when someone sees Margin, finds it to be a good idea to use in their language of choice, and writes their own parser, they can confidently announce “passes 100% of the Margin test suite!”.
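A language-neutral suite could be as simple as a table of inputs and expected results that any parser is run against. The Python sketch below is only an illustration: the cases, the `(depth, value)` result shape, and the stand-in `toy_parse` are all hypothetical and say nothing definitive about how Margin must be interpreted:

```python
# Hypothetical conformance cases: (name, Margin source, expected (depth, value) list).
CASES = [
    ("flat items", "A\nB", [(0, "A"), (0, "B")]),
    ("tab nesting", "A\n\tB", [(0, "A"), (1, "B")]),
    ("decorator ignored", "- A", [(0, "A")]),
]


def run_suite(parse) -> list:
    """Return the names of the cases that the given parser gets wrong."""
    return [name for name, source, expected in CASES if parse(source) != expected]


def toy_parse(source: str) -> list:
    """Stand-in parser: depth is the tab count, value drops tabs, spaces, hyphens."""
    items = []
    for line in source.splitlines():
        depth = len(line) - len(line.lstrip("\t"))
        items.append((depth, line.strip("\t -")))
    return items


failures = run_suite(toy_parse)
print("all cases pass" if not failures else f"failing cases: {failures}")
```

Keeping the cases as plain data means the same table could be exported to JSON and reused by parsers written in other languages.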

Very elegant way to encourage adoption of the language! This is now on the list[1], thanks.

[1] https://github.com/gamburg/margin/issues/4

"An outliner (or outline processor) is a specialized type of text editor (word processor) used to create and edit outlines, which are text files which have a tree structure, for organization."[1]

[1] https://en.wikipedia.org/wiki/Outliner

This is awesome! Thanks for sending, I actually didn't know about these.

The big difference here would be that Margin isn't an app, but a markup language that's meant to be used as an open & portable storage system for apps. A storage system that is, by design, human readable plain text.

I do love that there's an (albeit niche) interest in apps that let you think/notate in this way.

There should be an internet law that every supposedly new invention can be found pretty much wholesale in GNU emacs

Quite promising: one thing that would probably accelerate adoption would be to provide at least one language parser that can go to/from the format to a data structure of some kind.

Agreed! This is my biggest need. If you want to contribute, or know of any open source libraries that might help me build a more robust parser, let me know -- I'm only a casual coder, and this project really stretched my capabilities of recursive thought ::o

The parser in its current form is here: https://github.com/gamburg/margin/blob/master/parser/js/Marg...

You could consider parsing into Pandoc AST (just need to output the JSON representation of it). Then you can automatically interface with all the output formats supported by Pandoc, including HTML. See pandoc --from json, and https://pandoc.org/using-the-pandoc-api.html
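As a hedged illustration of the Pandoc route, the Python sketch below turns a small nested item tree into Pandoc's JSON AST as a `BulletList`. The input shape (a list of `(value, children)` tuples) and the function names are assumptions, and the `pandoc-api-version` value is a placeholder that has to match the installed Pandoc:

```python
import json


def inlines(text: str) -> list:
    """Convert plain text to Pandoc inlines: Str nodes separated by Space."""
    out = []
    for i, word in enumerate(text.split()):
        if i:
            out.append({"t": "Space"})
        out.append({"t": "Str", "c": word})
    return out


def bullet_list(items: list) -> dict:
    """Build a BulletList block; each item is a list of blocks."""
    rows = []
    for value, children in items:
        blocks = [{"t": "Plain", "c": inlines(value)}]
        if children:
            blocks.append(bullet_list(children))  # nested list for child items
        rows.append(blocks)
    return {"t": "BulletList", "c": rows}


def to_pandoc_json(items: list, api_version=(1, 22)) -> str:
    """Wrap the list in a Pandoc document; api_version is an assumed placeholder."""
    doc = {
        "pandoc-api-version": list(api_version),
        "meta": {},
        "blocks": [bullet_list(items)],
    }
    return json.dumps(doc)


print(to_pandoc_json([("Some Item", [("Child item", [])])]))
```

Piping the printed JSON into `pandoc --from json --to html` should then yield nested list markup, provided the API version in the document is compatible with the local Pandoc build.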

Thanks! This is useful.

As a long time user of vimwiki with an awful lot of hacks and plugins to work for doing something like this, margin seems like the kind of thing I was originally looking for. Vimwiki is great but suffers from having almost no machine readability, which makes integrating it with things like time tracking difficult without large, complex plugins.

However, margin doesn't really seem to have any equivalent of links between files, which I find essential for organising thoughts in anything beyond small notes. This seems like something that's contrary to the philosophy in a sense, but what would be the margin-ish equivalent of something like that, and (following from that) things like file tags? Would those be done with annotations?

Good question. I do think this would be achievable through annotations if an application were determined to do it. Annotations are sort of the catch-all for added functionality.

The philosophical question is an interesting one. I think for the most part, if items are meaningfully related, the Margin philosophy would probably be to store them within the same document parent (or file).

This shows great promise. I've been looking for various ways to impose semantics on markdown for note taking, and they all turn out to be too complex. This is very simple, has goals which are distinct from markdown, and has the potential to build something queryable.

I like this a lot and will think about using it in an app I'm planning to build, but I have one quibble with something in the docs:

> you wouldn’t build a web app in markdown

I'm willing to bet it's been done and has already been a Show HN. After all, there's "My blog is now generated by Google Docs"[1] currently on the front page of HN. If not then it's only a matter of time :)

[1] https://news.ycombinator.com/item?id=23134101

Great to hear you'd consider using it. If, when building the app, you find yourself wanting to contribute to the JS parser[1] in any way (or even to a non-JavaScript parser), it'd be much appreciated. My biggest need right now is a good parser that'll encourage people to try this out.

And fair, re: "you wouldn’t build a web app in markdown". I'm sure it's been done :)

[1] https://github.com/gamburg/margin/blob/master/parser/js/Marg...

This is excellent! The one thing I really like is the annotations. Been looking for a way to embed k:v attributes in a parseable markdown, for making metadata of codebases and datasets.

How are URLs handled? One solution would be to just reuse Markdown's [text](link) notation, as a subset of annotations. That would be the least-mental-friction route for Markdown users.

Love this idea! Thanks for adding it to the GitHub.

You mention that apps can safely ignore annotations of unknown types, but you should probably specify that they are still expected to preserve such annotations even if they don't use them for anything, i.e. ignore doesn't mean "discard".
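The difference between ignoring and discarding can be shown with a round trip. In this Python sketch a hypothetical app understands only a `status` annotation, yet writes every annotation back out; the item shape and the `[key: value]` output format are assumptions for illustration:

```python
def serialize(item: dict) -> str:
    """Write an item back as text, emitting every annotation it carries."""
    parts = [item["value"]]
    parts += [f"[{key}: {value}]" for key, value in item["annotations"].items()]
    return " ".join(parts)


def mark_done(item: dict) -> dict:
    """Hypothetical app action: touches only 'status', copies the rest through."""
    annotations = dict(item["annotations"])
    annotations["status"] = "done"
    return {**item, "annotations": annotations}


item = {"value": "Buy milk", "annotations": {"priority": "high", "status": "open"}}
print(serialize(mark_done(item)))
```

The app never interprets `priority`, but the annotation survives the edit, so another Margin application can still rely on it.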

Alternately, use Org mode.

On first look this appears like a great foundation for a new ecosystem of outliners. Awesome!

Processing my 6000-line Dynalist export (perfectly compatible due to the permissive syntax!) took about 1 second on an i7 from 2015.

This project looks awesome. I created bytebase.io, a notes app that shares much of the same philosophy - flexible hierarchy, simplicity, user control.

We'll definitely consider supporting import/export to Margin.

Beautiful site. And I agree, very similar philosophy to Margin. Just requested access, excited to give it a try.

I'd love to post these notes from my phone to my server, translated into org-mode as they go :)

Does it deal in any fashion with longer blocks of text than a single line? Paragraphs under a heading?

Not yet, but it's on the list! https://github.com/gamburg/margin/issues/2

Definitely a necessary feature if people are going to use this to store text. Do let me know if you have any ideas about what kind of syntax might be best for multiline blocks -- keeping in mind it should be both easy to type and easy to read (e.g. ideally avoiding '\n' or special newline characters).

For Mac users, I found that these two apps together cover a lot of my needs: Things 3 and Quiver.

This is awesome! I've had thoughts along these lines, I like your annotation idea. Excited to see how this project grows.

My understanding is that this is a nested list parser. Nested lists are defined by the indentation level. Metadata for items can be specified by square brackets.

I don't see how this is different than just opening notepad.exe and writing stuff.

Exactly! The idea here is to formalize the already intuitive way many of us use plain text. Once the syntax is formalized, it can be utilized by applications. This way your data remains in a portable, non-proprietary format that can be reliably parsed - whether you choose to use notepad.exe or a Margin-specific application, you already have everything you need.
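The "nested list by indentation" reading described above can be sketched in a few lines of Python. This is a simplified illustration that counts only leading tabs and ignores decorators and annotations; it is not the project's parser, and `parse_tree` is a hypothetical name:

```python
def parse_tree(source: str) -> list:
    """Turn tab-indented lines into nested {'value', 'children'} dicts."""
    root = {"value": None, "children": []}
    stack = [(-1, root)]  # (depth, node) pairs along the current branch
    for line in source.splitlines():
        if not line.strip():
            continue  # skip blank lines
        depth = len(line) - len(line.lstrip("\t"))
        node = {"value": line.strip(), "children": []}
        # Climb back up until the top of the stack is a shallower item.
        while stack[-1][0] >= depth:
            stack.pop()
        stack[-1][1]["children"].append(node)
        stack.append((depth, node))
    return root["children"]


notes = "Groceries\n\tbroccoli\n\tmilk\nChores"
print(parse_tree(notes))
```

Once plain notes are available as a tree like this, an application can query or rewrite them while the file itself stays editable in any text editor.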

My first thought is I hate tabs.

So use spaces. Solved.

Is it possible to run locally?

Run what?

> Lightweight markup

> The plain text language

> Margin is a lightweight markup language

> Margin can thrive within any plain text editor

> Margin is not an app, but a markup language.

The author's experimental parser? It's linked: https://github.com/gamburg/margin
