So this is literally Lisp, just with curly braces instead of parentheses :).
I don't understand this part of the readme:
> The advantage of Mark over S-expressions is that it is more modern, and can directly run in browser and node.js environments.
Does this mean I'm so out of date with JS that this syntax is actually legit JS? Or does Mark run its own parsers, at which point it's just like sexps, except it uses the "more modern" curly brace instead of "less modern" parenthesis?
EDN, in addition to being simpler, also has an optimized transport abstraction, Transit, which serializes to application/transit+msgpack (binary) or application/transit+json (often vastly faster to deserialize than MessagePack, because it hits the language-native JSON parsers for performance in web browsers, Python, etc.). It surprised me how big a deal hitting native JSON parsers is: EDN was at the top of our profiler for 100 kB payloads, but transit+json is zippy. The abstraction also handles value de-duplication and has a human-readable verbose writer.
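For a flavor of that, here's a minimal sketch of a transit-js round trip (assuming the cognitect transit-js package and its documented writer/reader entry points):

    var transit = require('transit-js');
    var writer = transit.writer('json');          // or 'json-verbose' for the human-readable form
    var reader = transit.reader('json');
    var payload = writer.write(transit.map(['name', 'Ada']));  // a transit map of key/value pairs
    var value = reader.read(payload);             // decodes via the native JSON parser underneath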
No, it's close to S-expressions, but it's not a programming language like Lisp.
> just with curly braces instead of parenthesis
Well, and more syntax than S-expressions: it's got both objects and arrays as fundamental structures instead of just lists, and it has commas as noise characters.
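To make that concrete, roughly the same record in each notation (the Mark line follows the examples in the README; treat it as my reading, not gospel):

    (point (x 100) (y 200))     an S-expression: nothing but nested lists
    {point x:100 y:200}         a Mark object: a type name plus named properties
    [100, 200]                  plus JSON-style arrays, commas and all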
It's cool and I love it, but it's irrelevant in the context of a universal data exchange language. In such case, you'd want to have those primitives defined in the exchange language spec itself - even if you'd end up implementing them as reader macros in CL (which I don't recommend - reader macros turn your READs into EVALs, which you obviously shouldn't do on untrusted input).
> you'd want to have those primitives defined in the exchange language spec itself
I agree with this: certain things need to be in the spec.
> even if you'd end up implementing them as reader macros in CL (which I don't recommend - reader macros turn your READs into EVALs, which you obviously shouldn't do on untrusted input).
This I don't agree with, because technically #\( & #\" are reader macros … they're just very well-defined reader macros. Presumably a spec which defined hash tables, regular expressions or whatever would define them as well as the Lisp spec defines lists and strings (and if not, well — it's a bad spec!).
> I agree with this: certain things need to be in the spec.
That was my main point. In the second part of the comment I didn't mean to discourage use of reader macros - it was more of an aside that the general facility of CL-style reader macros literally makes READ "shell out" into EVAL, so you need to (diligently) disable it for untrusted input (or reimplement a limited READ by hand). So we can't say "oh, but S-exps in Common Lisp can have anything through reader macros". Presumably if hash table literals were specified as a part of basic syntax, we could depend on it being standard and part of the safe subset of READ's duties; as it is however, we can't depend on it for arbitrary inputs.
> it's got both objects and arrays as fundamental structures instead of just lists
I'm a bit sad that various Lisps never standardized on a format for this; had they, then maybe we would have S-expressions as a popular data interchange format.
Common Lisp doesn't have a standard representation of hash tables, unfortunately. Also, CL didn't clean up the Lisp space completely; right now, there are Schemes, there's Clojure, LFE, Hy, and a bunch of other niche Lisps, each with their own idiosyncrasies around syntax.
> Common Lisp doesn't have a standard representation of hash tables, unfortunately.
Ah, but you mentioned objects & arrays, not hash tables *grin*. Agreed that hash tables would have been nice, although that does then get into issues such as canonical representations (which matter e.g. for hashing).
> Also, CL didn't clean up the Lisp space completely; right now, there are Schemes, there's Clojure, LFE, Hy, and bunch of other niche Lisps, each with their own idiosyncrasies around syntax.
It would be nice if folks who want to use Lisp would use Common Lisp rather than reïnventing various forms of more-or-less round wheel. It's a remarkably well-engineered language (not perfect, of course: argument order, upcasing, pathnames & environments all leap to mind as problematic areas), and so far as I can tell quite a bit better than any of the alternatives.
In particular, it'd be really nice to see people using Schemes for serious engineering work to use Lisp instead. It's just not well-suited to writing large systems, except by grafting on an ad-hoc, informally-specified, potentially bug-ridden subset of Lisp.
Perhaps more like EDN, since it doesn't have a runtime. But yeah, it's S-exps with curly braces. Which, in my opinion, look worse than round parentheses … but that's opinion.
I really like edn. I wish it were more widely used.
It hits a sweet spot for me between yaml and json. Yaml is easy to type/read, but I feel it's a bit too complex on the parsing side. And json is a pain to type, so I'm reluctant to use it for human entered configuration files.
It is sad that so much junior talent is wasted on attempts at improvement via blind reiteration. This is one possible consequence of a mentorship vacuum: bright minds look for challenges, even ones that have been adequately overcome long ago. Imagine the good that would come of directing such energy with a clear purpose.
There is nothing sad in that. Yes, it might be non-optimal, but by trying out your own approach you will find out its limitations first-hand and understand the problem and reasoning behind alternate approaches much better.
Junior talent doesn't become senior talent by just doing the right things, but by doing the wrong things and learning from them.
Junior talent becomes senior talent by virtue of acquiring experience. Doing the wrong things and learning is one way of acquiring experience, but being directed by a mentor is a much more efficient and productive way of acquiring that seniority. A lot of lessons learned from reinventing wheels can be distilled down into a conversation or a pair-programming session, but in the absence of senior leadership, it becomes a week-long hacking session on a library that will ultimately rot in Git forever because it's foundationally unsound.
Yup. There really is nothing wrong with reinventing the wheel for educational purposes. The problem starts when that reinvented wheel gains a good README/webpage, and gets picked up by an ecosystem driven by novices.
>Yes, it might be non-optimal, but by trying out your own approach you will find out its limitations first-hand and understand the problem and reasoning behind alternate approaches much better.
The problem here is that "modern" is usually a justification for regression, relative to older technologies, usually done by people who never bothered to look at the old technologies before declaring them obsolete.
Wow, that was a surprisingly insightful talk; thanks for linking!
I'm intrigued in particular by the talk's conclusion about the (I guess, again) disappearing distinction between volatile and non-volatile storage. To date, I've been a vocal advocate of hierarchical filesystems (not UNIX's specifically, but as a unit of user-facing abstraction). The talk set me reflecting on whether I'm not just supporting another historical "wrong path". Lots more thinking ahead of me here. So thanks.
I would rather have said it's like the QML language (the basis of Qt's QML UI framework). Can't find a link to the reference for plain QML. QML is like JSON, but with typed objects, and objects can in turn have children, so you can build tree structures from objects. It's actually very nice; I wish it had parsers for more languages.
JSON isn’t valid JS – its representation of strings allows U+2028 and U+2029 to appear unescaped, but JavaScript string literals don’t.
Not sure how else executing (valid) JSON in a browser would be a recipe for disaster? `eval` was the standard way to parse JSON from trusted sources for a long time.
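For context, the old idiom and its hazard, as a sketch (stealCookies is a stand-in for any attacker-supplied code):

    // the pre-JSON.parse idiom - fine when the source is trusted:
    var data = eval('(' + trustedJsonText + ')');
    // but eval runs arbitrary code, so a malicious "JSON" payload executes:
    var payload = '{"a": (function(){ stealCookies(); return 1; })()}';
    eval('(' + payload + ')');   // runs stealCookies() instead of raising a parse error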
> Does this mean I'm so out of date with JS that this syntax is actually legit JS? Or does Mark run its own parsers, at which point it's just like sexps, except it uses the "more modern" curly brace instead of "less modern" parenthesis?
Firstly, I highly respect Lisp personally, and I have no intention of downplaying it. As some have seen, the Lisp spirit is actually in the design of Mark.
Secondly, to clarify what I mean by 'being more modern': of course, it does not mean that changing from () to <> or {} makes it more modern or somehow better.
Being 'more modern' means Mark takes a JS-first or web-first approach in its design. Whether we like it or not, JS has dominated the web. JSON is successful, partly because it takes a JS-first approach. Mark inherits this approach.
Being JS-first means there'll be the least adoption barrier on the web.
Being JS-first of course does not mean JS-only. Mark is designed to be generic and usable from other programming languages, like JSON.
It's literally Common Lisp, down to the use of pre-expressions with no binding as pragmas. It's like Lisp someone dropped on a chair, and now all the parentheses have a funny bump.
But it's not literally Lisp in the sense that the meta-syntactic stuff isn't there.
Lisp has strong typing, so 1 is 1, not "1" or #\1; unless Mark has a built-in way of annotating types, that doesn't give it any advantage over S-expressions.
Lisp expressions also don't have any annoying type clutter that you have to have at every node in the syntax. Like (1 "1") is just a list of two things; we don't need the word "list" anywhere.
One major weakness of JSON is lack of a corresponding "infoset"; that is, an equivalence predicate. When are two JSON blobs "the same"? There's no sign of anything like this here.
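For instance, nothing in the JSON spec says whether these two texts denote "the same" value:

    {"a": 1, "b": 2.0}
    {"b": 2e0, "a": 1}

Key order differs, and 2.0 vs 2e0 are different spellings of what most parsers decode to the same number.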
Another is the lack of support for binary data. There's no sign of support for binary data here.
Finally, there's this claim:
> The advantage of Mark over S-expressions is that it is more modern, and can directly run in browser and node.js environments.
Is it more modern? I don't think I care.
Can it directly run in browser and node.js environments? What does that mean? It seems to need a parser. But then, S-expression parsers certainly directly run in browser and node.js environments.
---
IMO, SPKI SEXPs are much more sensible than this design and many, many other designs.
> IMO, SPKI SEXPs are much more sensible than this design and many, many other designs
Yes, yes, ten thousand times yes! I really don't understand why, more than two decades on, the world has stuck with XPKI & ASN.1, and has invented XML & JSON, when SPKI solved the PKI problem for good & canonical S-expressions solved the flexible- and human-readable–data-exchange problems for good.
Since you both seem to know the spec: how would you encode key/value pairs? Or would you have to have a list of nested lists, like
(my_dict (key value) (key value) (key value))
An unordered quality for data can be useful (e.g. it allows you to reorder data to stream "important" stuff first), but I don't see it anywhere in here.
With canonical S-expressions, unordered sets are a problem because part of the point is to be able to have a single canonical sequence of bytes, which can be hashed or compared bytewise for equality.
In general, I'd resist specifying data as arbitrary key-value pairs, but if I decided that I indeed needed them, I'd do exactly as you suggest — and I'd mandate that they be sorted lexicographically by their keys.
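Continuing the example above, the canonical form would then be something like:

    (my_dict (alpha 1) (beta 2) (gamma 3))

with the pairs sorted by key, so that equal dictionaries always serialize to the same bytes and hash identically.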
Each existing format has advantages and disadvantages for particular purposes.
Benefit of HTML: You can actually write it by hand and easily see where each element begins and ends, even when the document is longer than a screenful. Mark has the "}}}}}" problem with larger documents, so it is not as suitable for human-written markup.
It is not clear to me how mixed content like <cite>Hello <i>world</i></cite> is expressed in Mark. I expect it will be pretty convoluted.
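(Going by the README's examples, presumably something like the following; an educated guess, since the spec's whitespace handling isn't obvious to me:

    {cite 'Hello ' {i 'world'}}

Note the explicit quoting and the space that has to live inside the string.)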
Benefits of JSON: Maps directly to simple data structures: lists, dictionaries and simple values. Similar data structures are supported in almost any language. Mark has "type names" and anonymous text content, which complicates serialization and deserialization a lot, and is sure to give interoperability (and perhaps security) problems.
So - worst of both worlds? Instead of trying to be an overall worse alternative to all the formats, they should rather focus on a specific niche where Mark can be a better alternative.
Take configuration files, for example. They don't have large amount of textual content like HTML, and they don't need to be transferred between disparate systems.
{size width:100 height:100}
vs
<size width="100" height="100"></size>
vs
{"size": {"width":100,"height":100 }}
In this case, the Mark syntax is simpler and cleaner. Mixed content is not needed, which would make the format simpler. Yeah it is basically the same as S-expressions, but that is not a bad thing.
The HTML example is much better than "}}}}}" though, since you can e.g. add a new item at the end of the list without needing a specialized editor to locate the right position. This is one of the reasons for the redundancy of repeating the tag name in the end-tag. In theory Lisp should have the same problem, but code usually (hopefully) rarely has nested blocks larger than a screen, so it is not a big issue in practice, even if )))) looks ugly. Bottom line is that code has a different structure than typical hypertext documents, so just because a notation is suitable for one does not mean it is suitable for the other.
But when S-expressions are used to represent a document, not a program, then it is also not free to refactor deeply nested content. So S-expressions are no better than XML/HTML/JSON/Mark when encountering deeply nested content.
HTML has the problem of </div></div></div></div></div></div></div>. Lisp has the problem of '))))))))'. JSON has the problem of '}}}}}}'. And YAML has the problem of deep indentation.
When it comes to the worst-case scenario, no one wins. :-(
Oh, that is pretty cool. I didn't know about json5. This would also be quite nice for config files. Regular json is not nice for config files due to lack of comments.
Json5 is still not as "editable" as it looks, though. You need to separate values with commas (except after the last value), so there is more syntactic noise. So you get:
{
  size: {width:100, height:100 /*yay*/},
}
This is not an issue when the text is machine-generated (as JSON typically is), but is an issue when it is edited by hand, as config files often are.
Yeah, nicer to read and write. More complex to parse though. S-expressions are incredibly simple to parse. But I guess every language has a YAML parser these days.
but for config you can stick to a self-documented subset of YAML that works
so in practice you don't notice any brokenness
it's not like a web browser that has to work on a diversity of third party sources
I'm not even sure how to read that matrix anyway, and it does say:
> The YAML Test Suite currently targets YAML Version 1.2. ... some frameworks implement 1.1 or 1.0 only
Comparing {mark} to XML, it doesn't seem to support namespaces which makes the claim to be extensible somewhat dubious. How am I supposed to add custom objects without risking name clashes? Namespaces also make XML kind of fully typed without being tied to a single programming language.
Another strength of XML is support for mixed content which seems rather awkward in {mark}. The following
<p>Some <b>bold</b> text</p>
apparently needs to be written as
{p 'Some' {b 'bold'} 'text'}
It would be more honest to mark support for mixed content as "verbose" in the feature table.
Besides, the name {mark} seems like a bad idea. How could you find relevant results when searching for {mark} using a search engine?
I'm enough of a type-safety bigot that I would have started with schema-first, as I want schemas for all the things. :-)
FWIW, I would suggest avoiding the (IMO) mistake of using your markup language for the schema.
E.g. like json-schemas where we need a "properties" map, "type": "string" (how many times do I have to type "type"), all sorts of syntactical overhead.
Personally, I think IDLs are much cleaner, as you can design a purpose-specific grammar. More work up front, and you don't get a parser for free, but again personally I think it's more pleasant in the long-run for developers to read and write.
Granted, not sure how that jibes with your lisp/etc. way of thinking, but my two cents.
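To illustrate the contrast, roughly the same two-field record both ways (the IDL side is protobuf-flavored, just as an example of the genre):

    {"type": "object", "properties": {"name": {"type": "string"}, "age": {"type": "integer"}}}

vs.

    message Person { string name = 1; int32 age = 2; }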
I think 'less verbose' just means 'no end tags'. Which I guess is great if you don't mind a long string of brackets at the end of your document.
While it would be cool to have something that was like JSON but could deal with complex documents, I also don't see how this is a huge improvement over XML.
Defining the type of objects is a must when you want to exchange things in a strongly typed environment (Java on the server, TypeScript on the client, for ex).
So +1 for {mark}.
Do you handle multiple typing?
(We use that a lot in Neo4J, and we think it is really neat)
Another comment:
Coming from a Semantic Web background, and using N3 as the exchange format and N3.parse() as my client-side lib, I would advise having a UID parameter to uniquely identify objects, and a refId syntax, so any parameter can reference other objects in the data structure. That helps when you want to transmit a graph [1].
My humble 2 cents.
[1]: I would add that it is also useful when you retrieve some refIds that are not defined in the current data structure. You can then ask the server to dereference these refIds, and send another (portion of the) graph, that you can connect with the existing data structure.
Let's say you transmit an object of type Person that is also a Student and a MartialArtist. Your inheritance graph may define that a Student is also a Person, so not sending the Person type could be fine. But would you define a common subtype for Student+MartialArtist, just because your data serialization handles only one single type per object? Obviously not! You want to send your object with types "Student" and "MartialArtist", i.e. multiple types.
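A hypothetical sketch of what that could look like in Mark-ish notation (the id/ref/type property names are my invention, not anything in the spec):

    {person id:'p1' type:['Student','MartialArtist'] teacher:{ref id:'p2'}}

A receiver that doesn't have p2 yet could then ask the server to dereference it and splice the returned subgraph in.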
You either go with JSON because everything talks JSON, or you go with something that doesn't have an explicit parsing step, like flatbuffers or capnproto.
If you don't care about parsing CPU efficiency, then gzipped JSON beats protobufs, CBOR, etc. when what you care about is bytes sent over the wire.
If you do care about CPU efficiency, then protobufs, CBOR, etc. are worse than flatbuffers or capnproto.
There is not a lot of space for a new standard between these two existing categories.
I'd heard of CSON before. I didn't remember hearing about CBOR (but I had an implementation of CBOR already starred on GitHub apparently). However, given the following:
> CBOR is defined in an Internet Standards Document, RFC 7049. The format has been designed to be stable for decades.
I see no reason to go with CSON over CBOR. In fact just the opposite.
> The advantage of Mark over S-expressions is that it is more modern
Is it more modern because it is newer? There is mention of how adoption is limited, but wouldn't the adoption of a completely new syntax be even more limited :-)
Being 'more modern' means Mark takes a JS-first or web-first approach in its design. Whether we like it or not, JS has dominated the web. JSON is successful, partly because it takes a JS-first approach. Mark inherits this approach.
The biggest problem with these ideas is that json is already supported in the browser.
There might be a use case where your data is better represented in LDIF because it's hierarchical, but there's no built in LDIF support, so now you're importing a ton-o-javascript just to parse some new format.
At this point, we should realize JSON isn't meant to be human-readable anyway. If you need to hunt through it, you put it into some type of JSON viewer so you can see the tree and query it. It's an interchange format that's more compact than XML.
If you're shipping data between non-browser things like backend services, there are already binary formats like protobuff that have typing and can be optimized for small payloads.
Besides the nonsensical "advantage over S-expressions" statement in the README, the biggest issue I have with this is that Mark maps only to JavaScript, not to other languages where dicts/maps/hashes and arrays/slices/lists are two different things. Makes me wonder if it just has not occurred to the author that there are languages != JS.
If all other languages have no problem supporting XML, they'll have no problem supporting Mark.
It's just that in languages like JS and Lua, where an object can be a map and a list at the same time, they'll have the convenience of mapping a Mark object into just one object, instead of many.
Another way to support Mark in other languages is just to use a map for both properties and contents. E.g. in Java, the key in a map can be an integer. Of course, the performance will not be as good as a primitive array, but it can be one man's quick-and-dirty solution.
General JS arrays (not TypedArrays) are indeed actually maps.
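Easy to see in a console:

    var a = [10, 20, 30];
    a.label = 'mixed';     // arrays happily take named properties too
    a.length;              // still 3 - named properties don't count
    Object.keys(a);        // ["0", "1", "2", "label"] - indices are just string keys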
Thanks for the several comments pointing out the unclearness of what 'being more modern' means.
I've updated the README to be: "The advantage of Mark over S-expressions is that it takes a more modern, JS-first approach in its design, and can be more conveniently used in web and node.js environments."
The nice thing about standards is that you have so many to choose from.
- Andrew Tanenbaum, Computer Networks, 2nd ed., p. 254.
I think all developers go through some experience where they want to just "unify" everything because that will supposedly make it easier for them and other developers.
Over time, as you become more experienced (or, I guess, jaded), you realize that the reality of a "GUT" technology platform or programming language is a pipe dream, and the effort to get people to use said new format/language/tech is more than what you get in return.
Anyway, to be short about it: I think most should just pick the best tool for the job and stop rebuilding things that don't need to be rebuilt. And if you do, please make sure you have a plan for how you are going to replace all the old working stuff.
> I think most should just pick the best tool for the job and stop rebuilding things that don't need to.
I think you just contradicted yourself. Sometimes the best tool for the job is something new, something improved over what already exists.
I don't think the author intends to "replace all the old working stuff". But if this tool is better for new projects, then why not? I don't get all the negativity... do people here really love XML/JSON/YAML that much? There's a whole lot to complain about in all of those!!
I am not averse to new formats. I am averse to formats that try to “unify”.
And yeah, I don't have a problem with XML or JSON. Those two, combined with flatbuffers or other binary protocols, cover most of my use cases... like really, what's with all the XML negativity?
XML is only semi-structured/typed without schema. JSON and Mark are always typed.
Full formal schema definition, as in XML, is often a burden for ad-hoc scripting, which is common in JS. JSON/Mark provides sufficient type info for these ad-hoc usages.
JSON is not "fully typed". It just happens to have different syntax for strings, numbers, and booleans. But the application code still needs to come up with a way to distinguish between timestamps, enums, different object types, etc.
XML uses the same syntax for strings, integers, and booleans, but it has mature schema/typing tools that make it easy to apply more precise typing, which you'd want to do anyway to identify timestamps, enums, and different object types.
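E.g. a standard XSD fragment typing a timestamp and an enum, for illustration:

    <xs:element name="created" type="xs:dateTime"/>
    <xs:element name="status">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="active"/>
          <xs:enumeration value="archived"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:element>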
Cons: Not seeing any advantage over JSON. If you want a type for objects just add a type field and have your code read it. Then you can use any of the existing parsers.
I made my own little language called Geneva [0] for similar ideas but it acts as code and can be parsed as JSON. I also came up with a spec for doing this for HTML [1] (but no code to do this yet).
Thanks for the feedback on the security aspect. It is something that Mark definitely needs to consider carefully.
The current Mark implementation does not call arbitrary constructors during parsing. The constructors are created from scratch. But application users might want Mark to call their custom class constructors. I'm thinking of passing in a callback function to Mark.parse().
When used for mixed content, Mark is not necessarily always slower than JSON. Many existing JSON-based DOM solutions, like JsonML and virtual-dom, need to use several JS objects to represent one element, but Mark uses only one JS object.
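Compare the two encodings of the earlier example (the JsonML line follows its published array convention; the Mark line mirrors the README):

    ["p", "Some ", ["b", "bold"], " text"]    JsonML: one array per element, plus an attribute object when attributes are present
    {p 'Some ' {b 'bold'} ' text'}            Mark: a single JS object per element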
However, I don't have time to do some benchmarking at the moment.
I like the idea, but I don't think the benefits outweigh the negative implications of implementing it.
I mean, JSON as a data format for API stuff is good enough as it is; you'd need some serious reason to change from JSON, and these reasons just don't cut it.
> The advantage of Mark over S-expressions is that it is more modern, and can directly run in browser and node.js environments.
… with the right translator to JavaScript, which also happens to be true of S-expressions.
His table is incorrect, incidentally: S-expressions support mixed content (if I understand what he means) and are also fully generic.
He doesn't have a good example of the benefits of his proposal over S-expressions: 'more modern' just means 'undiscovered bugs.'
I respect his enthusiasm and hard work, but I believe what the world needs is hard work on existing things rather than hard work reïnventing the wheel.
> Mark utilizes a novel feature in JavaScript that a plain JS object is actually array-like: it can contain both named properties and indexed properties.
Where can I read more about this feature of JavaScript?
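It isn't a dedicated feature so much as a consequence of JS property keys being strings: any plain object can hold "indexed" entries. A quick sketch:

    var el = {name: 'div'};
    el[0] = 'first child';    // the key 0 is coerced to the string "0"
    el[1] = 'second child';
    el.name;                  // 'div'
    el[1];                    // 'second child'

MDN's material on property accessors and array-like objects covers this.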
So it's basically just XML with curly braces or sexps...
And why bring YAML into the mix? YAML isn't used for transfer, I hope? It should be compared to TOML as well; TOML seems a lot better than YAML, especially for configs: https://github.com/toml-lang/toml
Or msgpack? Which also seems useful. Why not protobuf? Or just s-exps which is basically what this is.
Great initiative. I'd say why not improve the leading format, that is, JSON ;-) ? I'm collecting all data markup flavors and extensions (HJSON, JSON 1.1, JSONX, SON, etc.) on the Awesome JSON - What's Next page @ https://github.com/json-next/awesome-json-next Cheers.
Good point. If it's not JSON-next but a completely new format, then you will have to compete with JSON, all the JSON-next formats, and all the other alternative formats. Good luck.
PS: Answering myself - the leading data format might actually be the humble Comma-separated values (CSV) format! Love it really :-) Let's make it better and improve it - let's welcome csv,version: 1.1 -> https://csvalues.github.io
Looks like Mark is more in the tradition of YAML; that is, like YAML it is a superset of JSON but wants to be its own format (not just a humble extension). For example, my better-JSON format flavor is called JSON v1.1 to make it clear it's just humble JSON, but improved :-).
> Mark utilizes a novel feature in JavaScript that a plain JS object is actually array-like: it can contain both named properties and indexed properties.
wouldn't that make introspecting objects very annoying?
In the Mark implementation, care has been taken so that the indexed contents are not enumerable. So e.g. when you run a for ... in loop on a Mark object, you'll only see the properties, not the contents.
This is one of the differences between a Mark object and an array: array contents are enumerable by default.
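Roughly the mechanism, as a sketch (not Mark's actual source):

    var el = {name: 'div'};
    Object.defineProperty(el, 0, {value: 'child', enumerable: false});
    for (var k in el) console.log(k);   // logs only "name"
    el[0];                              // 'child' - still reachable by index
    Object.keys(el);                    // ["name"]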
Not sure how many more xkcd 927s this thread will have, but personally what bugs me the most is the thought that being slightly more versatile than JSON means Mark is worth the trouble of adopting. Data structures are well represented by JSON; markup is mostly well represented by XML. I rarely really want to mix the two. Additionally, this doesn't feel like a language I would want to write documents in any more than I do XML.
I think we're fine with separate languages for data and markup.
Data and markup/document separation might not always be that clear cut.
The latest trend in CMS systems, driven by the latest content editors like Quill, Draft.js, ProseMirror, and Slate.js, is to use JSON to represent the content, instead of HTML or Markdown. Using object notation gives rise to a cleaner API and data model.
So the wall between data and markup, JSON | XML may collapse one day.
I would strongly suggest a renaming. "Mark" is so generic that you can already tell you want to write it as "{mark}", which has a pronounceability issue, and the name is too similar to the big player that is Markdown (like a bright star making it impossible to see a dimmer one right next to it), making "mark" sound to me more likely to be a Markdown renderer than a JSON replacement. Given that both Mark and Markdown are markup languages of various sorts, the names are just too close to each other; it would be different if Markdown were what it is and you were proposing "mark" as the name of a library that lets you put marks on a map based on geographical criteria, or something totally different.
Mark reserving all number-only keys is statistically likely to become a problem as a project grows larger. I'd suggest finding a different way to get out-of-band data to be fully out-of-band, rather than trying to carve out a chunk of keys.
Somewhat similarly, defining a "pragma" as "something surrounded with braces that isn't a legal object" means that if you ever want to change the definition of an object in the future, you can't, because you will turn things that used to be pragmas into objects, or less likely (because you'll try to avoid this going forward) but still possible, vice versa. You need to concretely specify what a pragma is unambiguously, in a way that you can evolve either without affecting the other. It also means errors in generation become legal pragmas instead of errors, which will cause surprises, and on the flip side, errors in parsing objects can turn them into legal pragmas rather than parse errors.
I would reserve saying "Mark is a superset of JSON" for the case when you really can feed any JSON to a Mark parser and get a (roughly) equivalent structure. Alternatively, go through the documentation with a text find option and make sure every time you say "superset" it is qualified as a "feature superset". Especially in light of "(Mark does not support hexadecimal integer. This is a feature that Mark omits from JSON5.)" The word superset should either be qualified every time or mean a strict superset; "Mark is a nearly-feature-superset of JSON" would be more accurate.
In general, a review of http://seriot.ch/parsing_json.php may be appropriate; mark addresses only one serious issue, and the other fixes are ultimately fairly superficial (the trailing comma issue, for instance, is almost never a problem for me because ninety-nine-point-I-don't-know-how-many-nines percent of the time, JSON is a thing my tools generate; the cases where that is a serious issue have generally already moved on to another format like YAML, same for comments). Also, per my comment about parse errors turning objects into pragmas, if you expect this to become a big cross-language standard it is worth reviewing a snapshot of the variability in JSON parsers, which is a simpler format. A more complicated format should expect to see even more subtle divergences in its multiple implementations and things like "misreading an object as a pragma" to become even more likely at scale.
Hexadecimal integers are not part of JSON. It is new syntax introduced in JSON5 (not JSON version 5). I don't think this feature is that useful, thus I did not incorporate it into Mark.
(I can't downvote direct replies, so it wasn't me.) I was not suggesting that you incorporate it. I was suggesting modifying your marketing copy to incorporate the fact that it will not be a superset anymore. Superset is a word we should guard and not let it become "sort of superset-ish, maybe, mostly", but should mean superset. If you don't have every feature, you should not say it's a superset. Since not only is there nothing wrong with feature elimination, but when done well is a downright good thing, it's not like this is some sort of major problem for the marketing or something; just say you used some taste in what you brought over.
And again let me emphasize, since you seem to be saying it again in some other replies, that "{mark} is a superset of JSON", if you mean that syntactically (as opposed to features wise), MUST mean that every valid JSON document will produce a valid {mark} parse. Nothing less than that qualifies it as a superset. Given that you reserve numeric keys I don't think that is the case; whether the grammar is a superset is harder to determine so I haven't tried. That would be something best served by taking a very complete JSON parser test suite from someone and validating that all their corner cases that are supposed to parse in JSON, parse in {mark}. Based on my own experience in the world of parsing, the odds of you passing that first try are very low; if you manage, major kudos to you as that would be a very difficult test. (Though I would imagine that since the grammar largely came from JSON a lot of the surprises would be the ways in which your parser turns out to deviate from the grammar rather than grammar errors.)