The cost of parsing JSON (v8.dev)
543 points by s9w on Sept 18, 2019 | hide | past | favorite | 293 comments

I actually think the previous title of this article, which was something about JSON.parse being faster than object instantiation, was clearer, because in English the cost of something implies that it is a negative, whereas here the performance cost is a benefit relative to another solution with a higher cost.

Maybe I'm being picky, though.

I was expecting something about the extent to which JSON processing in the world contributes to global warming or some such

Parsing 1GB of flat JSON data is equal to 1.61 metric cow farts.

Don't tell people that. Those who believe climate change is a hoax will then do it more out of spite against the alleged hoaxers.

Think I'm joking?: https://www.scientificamerican.com/article/not-so-conservati...

Not having a sense of humour will hurt your cause more than anything.

Given the choice between a secret fool who has fun, and a joyless thought-policing jerk who happens to be right on an issue, people will choose the fool every time; and frankly I can't blame them.

> Given the choice between a secret fool who has fun, and a joyless thought-policing jerk who happens to be right on an issue, people will choose the fool every time; and frankly I can't blame them.

It always amazes me that people who need so strongly to express their individualism in such ways are willing to tie their own puppet strings and offer to dance to another's will.

Everybody knows it's the nautical cow farts that contribute over 80% of JSO2N emissions. It's causing ACID compliant rain.

Literally parsing carbon-shittons of JSON right now out of spite. It's worth the spot instance cost! Especially since the power usage + carbon created is on the other side of the world from me! Mwahahah!

relaxes in pure pristine air-shed

I agree. I was expecting something about protocol buffers or a binary based representation of JSON.

I think this fragment catches the spirit of this piece:

A good rule of thumb is to apply this technique for objects of 10 kB or larger — but as always with performance advice, measure the actual impact before making any changes.
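For reference, the technique the article describes looks roughly like this (the data and names here are just illustrative):

```javascript
// Two ways to embed the same large, "cold" data object in a script.
// The article's claim: because JSON is a much simpler grammar than a
// JavaScript object literal, the JSON.parse version parses faster
// for sufficiently large objects (their rule of thumb: 10 kB+).

// 1. Plain object literal, handled by the full JS parser.
const dataLiteral = { name: 'widget', count: 42, tags: ['a', 'b'] };

// 2. JSON string literal parsed at runtime: scanned once as a
//    string, then run through the simpler JSON grammar.
const dataParsed = JSON.parse('{"name":"widget","count":42,"tags":["a","b"]}');
```

As the quoted advice says, measure on your actual payloads before committing to this.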

Although it may still not be worth it. At work I have a hand-rolled utility for mocking the backend using a .har file (which is JSON). I use it to reproduce bugs found by the testers, who are kind enough to supply me both with such a file and a screencast.

On a MacBook Pro a 2.6MB .har file takes about 140ms to parse and process.

I find this really interesting, because at some point the absolute performance benefits of `JSON.parse` are overshadowed by the fact that it blocks the main thread.

I worked on an app a while ago which would have to parse 50mb+ JSON objects on mobile devices. In some cases (especially on mid-range and low-end devices) it would hang the main thread for a couple seconds!

So I ended up using a library called oboe.js [1] to incrementally parse the massive JSON blobs, putting liberal `setTimeout`s between steps to avoid hanging the main thread for more than about 200ms at a time.

This meant it would often take 5x longer to fully parse the JSON blob than just using `JSON.parse`, but it was a much nicer UX: the UI never (perceptibly) hung or froze during that process, and the user wasn't waiting on the parsing to use the app anyway, since there was still more input I needed from them at that time. So even though parsing now often took 15+ seconds, the user was often spending 30+ seconds inputting more information, and the UI stayed fluid the whole time.
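The yielding pattern described above, as a generic sketch (oboe.js drives the steps from a streaming parse instead; all the names here are mine):

```javascript
// Process items in batches and schedule the next batch with
// setTimeout(..., 0) so the event loop can handle input and paint
// between batches, rather than blocking until everything is done.
function processInChunks(items, handleItem, chunkSize = 1000, onDone) {
  let i = 0;
  function step() {
    const end = Math.min(i + chunkSize, items.length);
    for (; i < end; i++) handleItem(items[i]);
    if (i < items.length) {
      setTimeout(step, 0); // yield to the event loop before continuing
    } else if (onDone) {
      onDone();
    }
  }
  step();
}
```

Each batch gives the UI a chance to run between chunks, trading total throughput for responsiveness, which is the 5x slowdown mentioned above.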

If you really need to work with large data files (1MB+), JSON is a terrible format. You should look into FlatBuffers. It's like having indexed JSON where there is no parsing cost. You can have millions of rows and nested objects and it will only read the bytes it needs.

It is a length-prefix-encoded format, so it's pretty safe to work with in a streaming manner too.

Good article on how Facebook used them for their mobile app:


Why not just use promises?

side note: legit question, I don't do web/app dev

Because JSON.parse blocks the thread it's in, and JS is single threaded [1].

So even if you put it behind a promise, when that promise actually runs, it will block the thread.

In essence, using promises (or callbacks or timeouts or anything else like that) allows you to delay the thread-blocking, but once the code hits `JSON.parse`, no other javascript will run until it completes. And since no other javascript will run, the UI is entirely unresponsive during that time as well.
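A minimal illustration of that point (the helper name is hypothetical):

```javascript
// Wrapping JSON.parse in a promise only defers the work; it does not
// move it off the main thread. The callback passed to .then runs on
// the same single thread, and no other JS runs while it executes.
function parseAsync(text) {
  return Promise.resolve().then(() => JSON.parse(text));
}

// The parse is postponed to a microtask, but once it starts, the UI
// is frozen exactly as if JSON.parse had been called directly.
const pending = parseAsync('{"status":"still blocking"}');
```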

[1] Technically there are web workers, and I looked into them to try to solve this problem. Unfortunately any complex objects that get sent to or from a worker need to be serialized (no pass-by-reference is allowed except for a very small subset of "C style" arrays called TypedArrays). So while you could technically send the string to a worker and have the worker call `JSON.parse` on it to get an object, when you go to pass that object back, the JavaScript engine needs to do an "implicit" `JSON.stringify` in the worker, then a `JSON.parse` in the main thread. That makes it entirely useless for my use case.

But continuing with that same thought process, I very nearly went for an architecture that used a web-worker, did the `JSON.parse` in the worker, then exposed methods that could be called from the main thread to get small amounts of data out of the worker as needed. Something like `worker.getProperty('foo.bar.baz')` which would only take the parsing hit for very small subsets of the data at a time. But ultimately the oboe.js solution was simpler and faster at runtime.
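A sketch of that `getProperty` design (all names are hypothetical; only the path lookup is shown runnable, with the worker wiring as comments):

```javascript
// Resolve a dot path like 'foo.bar.baz' against the parsed object.
// The worker would run this and post back only the small result,
// so the structured copy cost stays tiny per request.
function getPath(obj, path) {
  return path.split('.').reduce((o, key) => (o == null ? undefined : o[key]), obj);
}

// Worker side (sketch): hold the parsed blob, answer path queries.
// let data = null;
// onmessage = ({ data: msg }) => {
//   if (msg.type === 'load') data = JSON.parse(msg.text);
//   else postMessage({ id: msg.id, value: getPath(data, msg.path) });
// };
```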

Another trick most people don't realize: not only is the fetch API asynchronous, but response.json() does the conversion in a background thread and is non-UI-blocking.

If you have a large JSON object, you can use the fetch API to work with it. If you need to cache it, use the Cache Storage API. Unlike localStorage, which will freeze the UI, Cache Storage won't.

It's slightly slower since it needs to talk to another thread, but who cares, as long as the UI stays responsive to do other things.

This is a common misconception, but response.json() still blocks the main thread.

It looks like it doesn't, but the exact same symptoms will happen even while awaiting fetch's json().

I'm guessing with Oboe.js you solved this by capturing a stream(?) of JSON but only parsing relevant chunks as they appear and match the selector? Or do you simply load the larger chunks at once (either by a request or embedding JSON into the template server side) instead of streaming?


I could see the value in this for sure. I currently have a problem of loading a ton of JS for some users who have thousands of objects embedded in the view with Rails using toJSON() in a <script>. It’s creating far too much weight on the frontend. I’ve been considering fetching it via a simple REST request instead.

Thank you for the excellent explanation!

I think of js entirely from a node.js perspective where I conceptualize it as an async task. Is this also wrong?

Node suffers from the same issues, but it's generally not as noticeable in most cases. A similar situation in Node would cause the server to be unable to respond to any other requests during the `JSON.parse` execution. But in the Node world, you have more options for getting around those problems (like load balancing requests among several Node processes).

But both server-side and client-side JS use the same system: the event loop. It's basically a message queue of events that get stacked up, and the JS engine grabs the oldest event in that queue, one at a time, and processes it to completion. Anything "async" just throws a new event into that queue to be processed. The secret sauce is that any I/O is done "outside" the JS execution, so other events can be processed while the I/O is waiting to complete.
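The run-to-completion behaviour described above, in miniature:

```javascript
// Synchronous code always runs to completion before any queued event
// is processed; even a 0 ms timer waits its turn in the queue.
const order = [];
setTimeout(() => order.push('timer callback'), 0); // queued as an event
order.push('sync work'); // the current task runs to completion first
// Only after the current task finishes does the loop pick up the timer.
```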

Take a look at this link, or search up the JS event-loop if you want to get a better explanation. It's deceptively simple.


> Is this also wrong?

Yes. The Node.js JavaScript runtime is based on V8, the same engine that runs in Chrome. JavaScript is single-threaded, so anything that is not I/O-bound will block the main thread. If you don't want to block the thread because you have a long-running calculation/parsing task, you can use worker threads[1]. This will run your task in a separate thread and not block the main one.

[1] https://nodejs.org/dist/latest-v12.x/docs/api/worker_threads...

Fun detail: node internally will use thread pools to do CPU-intensive tasks that would normally block the main thread.

For example: https://github.com/nodejs/node/blob/master/src/node_crypto.c...

I generally use that as an example when explaining to people why Node isn't a great fit for a lot of workloads. They have to use these features internally, but you as the user with a CPU-intensive job don't have access to those features.

and not to beat a dead horse, but worker threads again wouldn't work in this exact situation even in Nodejs. They suffer from the same problems that web-workers do, meaning they use a structured copy algorithm to send data between workers (with the exception of TypedArrays), and therefore would hang the "main thread" just as long as if you did the `JSON.parse` directly in it.

It's a really annoying problem, and I'm actually really happy to see that many others have the exact same thoughts I had at the time, and that I wasn't just missing something obvious!

Maybe the worker could parse the JSON to build an index and then send over just the index. The main thread could then use the index to access small substrings of the original giant JSON string, parse those and cache the result?
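That index idea could look something like this (a sketch, names mine): scan the raw JSON array text once, recording the offsets of each top-level element, so individual elements can later be parsed with `JSON.parse(text.slice(start, end))` without ever materialising the whole structure at once.

```javascript
// Returns [start, end) offsets of each top-level element of a JSON
// array, tracking nesting depth and skipping over string contents
// (including escaped quotes) so commas inside values are ignored.
function indexTopLevelArray(text) {
  const spans = [];
  let depth = 0, start = -1, inString = false, escaped = false;
  for (let i = 0; i < text.length; i++) {
    const c = text[i];
    if (inString) {                 // skip string contents,
      if (escaped) escaped = false; // honouring backslash escapes
      else if (c === '\\') escaped = true;
      else if (c === '"') inString = false;
      continue;
    }
    if (c === '"') {
      inString = true;
      if (depth === 1 && start === -1) start = i; // string element begins
      continue;
    }
    if (c === '[' || c === '{') {
      if (depth === 1 && start === -1) start = i; // nested element begins
      depth++;
    } else if (c === ']' || c === '}') {
      depth--;
      if (depth === 1) { spans.push([start, i + 1]); start = -1; } // nested element ends
      if (depth === 0 && start !== -1) { spans.push([start, i]); start = -1; } // last primitive
    } else if (depth === 1) {
      if (c === ',') {
        if (start !== -1) { spans.push([start, i]); start = -1; } // primitive ends
      } else if (!/\s/.test(c) && start === -1) {
        start = i; // primitive element begins
      }
    }
  }
  return spans;
}
```

The offsets are plain numbers, so sending them from a worker is cheap, unlike sending the parsed objects themselves.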

what if you used multiple async ajax requests to load different parts of the UI in place of loading all 50MB at once? could that be what the OP meant by "promiseS"?

Promises are a way to deal with async code. Parsing JSON is synchronous and CPU-bound, so promises offer no benefit. And since web pages are single-threaded[0], there isn't really any way you can parse JSON in the background and wait on the result.

[0]: There is now the Web Workers API which does allow you to run code in the background. I've never used it, but I have heard that it has a pretty high overhead since you have to communicate with it through message passing, so it's possible you wouldn't actually gain anything by using it to parse a large JSON object.


Promises still run on "the main thread" so a CPU intensive task in a promise is still going to block things. You could use a promise if you delegated your CPU task to another process or some C code that did actual threading.

I believe you'd use Workers (WebWorkers?) https://developer.mozilla.org/en-US/docs/Web/API/Workers to actually do it off the main thread entirely inside JS.

Promises still run in the main thread.

You could try to use a web worker, but then you run into the problem that they don't have shared memory, so you need to pass data back some other way.

IndexedDB runs on web workers, so that might be a good way to push the work off the main thread but still share the result.

JSON.parse is not interruptible. All the answers about promises and single threading are interesting but that's the crux of the issue.

Because JS is single threaded. If a task is too big you need to split it or the UI can't be updated until the task is done.

Is that relevant when comparing parsing JSON with parsing literal objects? I don't know much about JavaScript engines, but I'd expect that parsing literal objects in code also blocks the main thread.

I would probably break that up similarly to how you did in that case as well, though I might use multiple server requests (chunks) and/or a websocket for the data feed.

What was the memory overhead for the application?

I don't remember details about memory stuff, it was a few years ago now, but I was pleasantly surprised to see that it wasn't nearly as bad as I first assumed it would be.

And I did originally plan on using something like a websocket, but turns out with some minor changes on the server side we could start streaming data while it was still being gathered, and oboe.js is actually able to start parsing data even while it's still downloading from a normal XHR request, and is designed to be as efficient as possible (so it throws away string data as soon as it's not needed any more).

So there weren't really any additional benefits to be had from using websockets and breaking it up into multiple distinct requests would probably have been slower!

(I just realized I forgot to add the link to oboe.js! But I highly recommend it. It seems it's only gotten better since the last time I used it.)

[1] http://oboejs.com

I'm asking purely out of curiosity - what was the content of such a large JSON object?

It was a carton scanning app, so basically a massive array of objects (carton data) which were needed so the app could function and route cartons and validate deliveries entirely offline. Due to some unfortunate limitations from our clients and some edge cases, we couldn't filter down the data on the server ahead of time. So we ended up having to keep that massive amount of data on the device, and at the end of the day 95% of it would be unused, but we wouldn't know which 95% until the device was already offline.

It was a system where the goalposts moved many times during the development. If I were to do it again, I wouldn't use JSON, but after having the goals change a few times and then having the original server-side components get co-opted to work on other projects, it was hard to justify the time that would be spent switching to a different, more appropriate wire format.

I kept reading that as cartoons and I was just so confused for a second...

You can also pretty easily use a web worker now, they work well. Here's [1] an example with React hooks.

Example fibonacci worker code that doesn't block the UI, even at larger calculations

  const fib = n => (n < 2 ? n : fib(n - 1) + fib(n - 2))
  onmessage = msg => {
    console.log('fibonacci worker onmessage', msg)
    postMessage({ num: msg.data, result: fib(msg.data) })
  }
[1] https://github.com/bharathnayak03/react-webworker-hook

Web workers won't work in this case because they need to serialize all data going into and out of them (with the exception of TypedArrays).

So passing a string to a worker and having it JSON.parse it works great. But when you go to pass that object back to the main thread, it implicitly does a JSON.stringify and a JSON.parse back on the main thread (technically it's called a "Structured Copy", but it's mostly the same thing), putting you in the exact same situation.

Good to know, thanks for clarifying.

And thanks to you for helping show me that I wasn't the only one to try that!

This whole thread has been really nice to read, because I beat my head against a wall for a long time before I finally found a solution, and I'm glad to read that I wasn't the only one who found this deceptively hard on first thought (or second, or third...)

If this really NEEDS to be a client-side-only solution, I still believe a worker is the only way to go. Only, in this case, it needs to behave like an API. So your worker not only parses the JSON, but also responds to post messages with only the data that is requested from it.

Why? Because a large JSON structure is most probably just a large JSON structure, but you most probably don't need it as a whole. You may need a total count of items, you may need a paginated set of items, or only a certain item or a set of fields of items — well, an API.

> it implicitly does a JSON.stringify and a JSON.parse back on the main thread (technically it's called a "Structured Copy", but it's mostly the same thing)

Except, funnily enough, JSON.stringify + JSON.parse is usually the recommendation, as it's either comparable to or faster than the structured copy the engine itself does :/
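The round trip in question, for reference:

```javascript
// The JSON.stringify + JSON.parse round trip used as a deep copy.
// It drops anything JSON can't represent (functions, undefined; Dates
// become strings), but for plain data it is often as fast as, or
// faster than, the engine's structured copy.
const original = { id: 1, nested: { values: [2, 3] } };
const copy = JSON.parse(JSON.stringify(original));
```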

Web workers are depressingly bad...

You might be interested in a tool I wrote to serve .har files called server-replay: https://github.com/Stuk/server-replay

It also allows you to overlay local files, so you can change code while reusing server responses.

I mean, I get it, but I think performance is overrated in this particular case; unless it’s a significant and/or very noticeable difference, stick to object literals, please. I’d probably fire someone if I started to see `JSON.parse(…)` everywhere in a codebase just for “performance reasons” … remember, code readability and maintainability are just as important (if not more).

> I'd probably fire someone if I started to see `JSON.parse(...)`

I've had the privilege of working in organizations that consider mistakes to be the cornerstone of resilient systems. Because of that, comments like this scare me, even when intentionally hyperbolic. Moreover, if the product works well and is being maintained easily, why would you micromanage like that? Sounds like a minor conversation only worth having if the technical decision is having a real impact.

Thomas J. Watson:

> Recently, I was asked if I was going to fire an employee who made a mistake that cost the company $600,000. No, I replied, I just spent $600,000 training him. Why would I want somebody to hire his experience?

You probably wouldn't want to work for somebody who fired people so easily anyway. This is one reason I find it stupid when people defend companies or are super loyal to their employers: companies don't care about you, especially companies that fire on a whim without concern that they're fucking with somebody's life. Best to work somewhere that treats you like a human instead of as a cog.

To be honest, I understand the bit of backlash that I’ve received here and I think it’s well-deserved since I should’ve worded my statement better. Thank you for your comments.

You all are correct re firing someone over mistakes and seemingly trivial matters. I was mostly referring to software engineers who make impactful decisions without good reason and/or without properly assessing the trade-offs.

I think it’s fair to say that we all want performant software, but at the same time, if I have a software engineer on my team who can’t back their decisions with some form of data and/or understanding of the trade-offs, unless they’re at the junior level, they’re not the type of software engineer who I want on my team.

I said “performance reasons” precisely because, over and over and over again in my career, I’ve watched software engineers commit unreadable messes of code that were clearly premature optimizations and/or optimizations where the performance gains weren’t significant enough to justify the costs of the unreadable and hard-to-maintain code enabling them.

I once had a software engineer unexpectedly spend almost a week rewriting a critical part of a Java codebase using the JNI because he thought it’d “make it faster” — and it did — but then all types of new native code-related issues ensued that cost the company, including a major security vulnerability that was just impossible before. On top of that, it turned out that the performance gains that we noticed were mostly significant during the startup period of the JVM, so it really wasn’t worth it. And this was a very brilliant software engineer, but he was consistently making poor decisions like this. To be clear though, he wasn’t fired! I just use that story as a realistic example. (Part of me still thinks that he just wanted to learn/use the JNI and that project seemed like the perfect target. Lol.)

But yes, it’s more complex than simply firing individual contributors for sure and I regret wording my statement that way, but I hope you all can understand the real point that I’m making.

Edit: I’d like to point out that, in my anecdote above, in hindsight, if anything, I was probably the one who looked incompetent when the suits started asking the expected questions re the sudden set of new issues, because I did my best to shield that software engineer from them (or at least I’d like to think that I did). I know the feeling of messing up at that level and I knew that he was most likely already beating himself up, so I couldn’t just let him take the fall, or worse, throw him under the bus. These tend to be complex situations in real life!

> Part of me still thinks that he just wanted to learn/use the JNI and that project seemed like the perfect target. Lol.

As a dev who sometimes goes off chasing windmills, that's 99% of the reason why I do it. I find something nice to tinker with, and when my brain goes "ooh, shiny" I stop giving a shit about anyone's bottom line.

To be fair, it usually turns out for the better for the project and its code base! But sometimes it doesn't, and I figure that's just the cost of doing business. Companies should be willing to take these kinds of informed risks in order to improve their employees' ability, and therefore the quality of their product. However, a lot of management only sees the short term gain, because long term gain isn't incentivized for them. They just wanna do well and get a promotion.

Well, guess what, it's the same for me. Except for me to do well, I have to be learning new things constantly. So tough poop, management, I'll be chasing my white whale every once in a while. Deal with it.

Lol. That’s the spirit!

> Companies should be willing to take these kinds of informed risks in order to improve their employees' ability, and therefore the quality of their product.

Perhaps they should be willing, but your description of this distraction does not include informing the company and allowing them to determine whether it's a risk they are willing to accept. You decided for them because you didn't want to receive the answer "no" in return. This isn't right.

> "This isn’t right."

I'm afraid the morality of this situation isn't so black-and-white.

In industry, there is always a tension between production and research: cranking out widgets vs. getting better at cranking out widgets.

A dev who spends 100% of their time cranking out widgets is stagnating. That's actually not what your employer wants, despite the fact that their agile process seems to imply that ticket cranking shall be the whole of your focus.

If you ask employers if they expect you to improve your skills over time, they would absolutely say "yes". But if you ask for permission to chase a specific white whale, you will hear "no". Everyone agrees they should be saving for the future, but "not this paycheck".

Taking the naive moral approach here and spending 100% of your time on tickets is not "what's right". If anything, that's you being taken advantage of by your employer -- sacrificing the advancement of your career in the name of short-term sprint velocity gains. On top of that, stagnation is not what your employer really wants anyway.

(edit: the above excludes companies which have explicit "20% time").

I think I work in a similar manner to the GP... it's transparent to the org. It's not a case of "I'll head down this path or investigate this _or_ get my work done"; it's an _and_ situation. Sometimes the rabbit trail is the best thing; sometimes you just have to get the thing done. Either way, it's still getting done.

But it is an informed risk. I was hired to do work in 4 different languages, on the frontend and the backend, plus CSS and HTML. I get to weigh in on UX decisions, I get to design service infrastructure.

Was I born this way? No. I need this overhead, that's just part of being a dev (within reason).

If you require me to do lots of things, there's overhead. If you want a ticket drone for your Scrumfall projects, get a ticket drone.

> And this was a very brilliant software engineer, but he was consistently making poor decisions like this.

That is something I've noticed. Brilliance doesn't go hand in hand with making prudent and wise decisions.

> I did my best to shield that software engineer from them

I've found rather painfully that you shouldn't shield guys like that when they go off on their own to make mistakes.

Another thing: you have a team of people who are familiar with how a codebase is put together, how it does things, and what sorts of things go wrong. It's a bad idea to disrupt that "just because" Goofus rewrites a module to use fad X. Great! Before, there were five programmers who knew how that module worked, and now there is one.

Assuming you are managing a dev (through a lead role, seniority, or as a manager), you absolutely should shield team members from direct demands from up that chain; that's what most of your job is. Assuming the employee was acting within the rules you've laid out, then you should shield them and consider adjusting your rules to prevent a repeat. If, to contrast, your company has CI tooling set up with automatic deploys and reviews and whatnot, but then someone edits a file on production... that might be a fireable offense.

Additionally to contrast - if you're a co-worker and not a manager then you may need to examine your relationship (are you a mentor and thus secretly leading them or just a colleague). If a pure colleague makes a mistake you shouldn't stick your neck out too much - except to force your common manager to properly defend them.

Everyone who is fired should be fired by their manager and not anyone else in the org - that's how a team is strong and healthy.


Managers, in a healthy company, own the mistakes their subordinates make.

> you absolutely should shield team members from direct demands from up that chain - that's what most of your job is

The other part of your job is keeping your manager informed about subordinates that are being problematic. Up and rewriting a critical piece of infrastructure 'because' is problematic.

(I'm assuming that you mean that the other person is another subordinate to the same manager, rather than being someone subordinate to you)

It's a bit of a delicate balance. The golden rule is that Snitches get Stitches, but if someone is being unproductive with their time and your manager isn't aware of that fact then letting them know isn't a terrible idea. But it isn't your place to measure how your co-workers are accomplishing their tasks - assuming management isn't out to lunch then performance reviews should fall on their shoulders. Maybe your coworker cleared a rewrite with your manager and your manager was satisfied with the justification and decided that explaining the full reasoning would be a waste of time until the experimental phase was completed.

In theory good management should prevent you from feeling like you need to look over other people's shoulders, because that is their job. So if you are feeling that way you might want to talk to your manager about it, maybe they are bad at managing and are letting things slip through the cracks, maybe they find that allowing someone to experiment with a rewrite is worth the training time - it may be possible that you just need to talk it through with them and find more confidence in their management ability.

> That is something I've noticed. Brilliance doesn't go hand in hand with making prudent and wise decisions.

Reminds me that John Carmack's wife told him that she wouldn't allow him to bankrupt the family with his space hobby company (Armadillo Aerospace) :)

My comment wasn’t meant against you personally (for all I know it was just for emphasis and not serious, and from your comment it seems like it), just against the attitude of firing people for small things, rather than, for example, teaching them to do better.

I agree with everything you said, except that I'm not sure that JSON.parse all over the place is going to add any significant unreadability. I think most likely it would always look the same and be just as readable as object literals once the initial getting used to it period would be over with. Hell, I think it's a lot more readable than the use of !! which I consider an abomination, but everyone keeps doing that for developer speed = productivity purposes.

It might be more difficult to format a string literal.

I'm supposing template literals take care of that. Actually, there might be advantages.

I've found that the readability of fast code vs. slow code is often negligible - certainly it is in the specific example under discussion. I prefer to make a habit of using faster idioms in that case, so that when speed does matter I'm already covered. I don't consider that premature optimization.

Except it isn't, because most of your developers will be using an environment that includes syntax highlighting and probably some linting. Except within string literals.

Code inside string literals is less readable and more inclined to be wrong/buggy.

I was arguing the general principle of not being afraid of premature optimization, not this particular example. You make good points.

Eh, depends on the reasons for firing people easily.

Firing easily for honest errors is moronic, fully agreed, especially if the person is learning from them. My code changes caused more than one sev0 before, but never was I personally blamed for them, as it was always some bigger underlying system issue that wouldn't have allowed me to make those mistakes if the systems were more robust (and if I had been a little wiser and hadn't pushed "seemingly safe" changes outside of business hours). I learned a lot from those mistakes.

Firing easily for a long history of non-improvement and not meshing well with the team (underperforming, causing a lack of cohesion within the team, etc.) is good for the team, but in principle it is similar to the "good king" kind of approach, so it all relies on the "king" having a straight head.

P.S. My last paragraph does not imply "culture fit" or any superficial stuff like that as a good reason for firing, I meant more fundamental sort of issues, like refusing to listen to people, never even attempting to improve (given you have some hiccups, just like most of us), etc.

I’m not against firing people if they are a bad fit, are incompetent or are toxic to the work environment. I’m against firing people for small things, not giving a chance to improve, firing on a whim or simply discarding people. Treat people like humans, but that doesn’t mean you can’t fire people who are a negative force on your business.

I usually find the opposite. Firing someone for cause is interminable or outright impossible, unless they're breaking the law or embarrassing you in front of customers.

I have only once fired for cause. Every other time, I've gone through the process, the PIP, and then terminated them according to their contract, usually with "more generous than legally or contractually required" severance. Lets me sleep at night and has the nice side effect of keeping the peace with the remaining team.

counterpoint: you are just as free to take the same liberty with your employer. You can drop them like a bad date, and take a job somewhere else.

additional counterpoint: part of your job as a grown-up, responsible adult is your ability to manage and endure risk and loss, especially the risk of your job disappearing overnight. Outside of circumstances of extreme poverty, or extreme disability, in which our government has safety nets in place (let's save the debate of sufficiency for another time; the fact remains they are in place), losing your job should not "fuck up your life" so much as be a temporary setback. This is especially true for this industry.

Some of us can't just lose our health insurance on a whim.

Thankfully, most other developed countries' healthcare systems don't penalise individuals quite so significantly as the broken system you Americans keep voting for.

I’m not saying the UK or other European countries have perfect healthcare systems either, but at least we aren’t tied to a job we don’t like because losing our company’s health scheme is too scary to consider.

If you're not paying for it with your money, then you have to pay with your time: public healthcare systems, like those in Europe, are known for long wait times for patients requiring surgery or other costly procedures.

Also, traveling to the US for treatment is still a thing, because new, advanced treatments are developed and first implemented in the US, so all that money spent gives you something in return.

Unless you're rich and regularly wipe your ass with $xx,xxx bills, you end up spending your time in the US system too: getting your insurance and care provider to agree with what is covered, what isn't, and how much you have to pay.

Sometimes it takes almost a year to resolve.

BTW, even basic surgeries in the US can have a price tag of close to $100k. I've had to fight off more than one ridiculous bill like this in the last 5 years. If you're talking about medical tourism coming into the US, I can't imagine you're talking about anything but very well off people.

> Also, traveling to the US for treatment is still a thing, because new, advanced treatments are developed and first implemented in the US, so all that money spent gives you something in return.

It sounds like you’re saying America is the only country in the world developing new and advanced treatments and the only country people travel to for such surgery. Clearly that’s not even remotely true (and even if it were, which it isn’t, it still doesn’t justify just how badly broken your healthcare system is for domestic users).

Whatever time I spend in the waiting room in Canada waiting for treatment, which is honestly less time than I wait for the cable company to show up to fix things, is more than made up for by the fact that I spend literally zero time dealing with hospital bills.

Considering US hospital bills can easily be tens of thousands, a couple of hours wait at even $1000/hr. billable lost opportunity is still cheaper than the US alternative.

In the US, you generally need to pay with both your money and your time. I've waited three months for an appointment with a specialist, had them only tell me to go to another specialist, and paid for the privilege.

From the moment my GP refers me to a hospital for whatever reason they need to look at me, they have 8 days to respond, and must have a diagnosis within 30 days. Treatment is usually not long after and almost always proportional to the situation.

If a potentially life-threatening disease is suspected, diagnosis and treatment must have begun after no more than 2 weeks. Most of the time it's a matter of days. If the public hospitals cannot do that, I'm free to go to private hospitals without paying anything.

How is that waiting a very long time?

You don't lose it on a whim. In the US, you can file for COBRA to extend your benefits, which allots you plenty of time to apply for Medicaid if the circumstances were extraordinary (why do I get this weird feeling most people on this forum have just never been poor or in this situation?).

That plus your emergency savings fund should more than suffice to hold you over for 6 months to find your next role. I'll save my survivorship-bias story about how I coped with this exact situation 5 years ago, because I know everyone's situation is unique, but the lessons of growing up with 2 unemployed parents and living month to month, not knowing if the bank was going to repossess our house, have stayed with me, I guess.

It's an uneven power dynamic, though. The employers typically hold many more cards than the employees.

"typically"? I think you mean "always". This notion of it being an equal relationship in both directions is ridiculous.

Well, there are those rare cases where the company relies on a person or small group to continue their business, but it’s not a typical situation.

Love the quote. Though I have some people working with me for whom I would still struggle with that quote. There is an assumption in it that the employee grows from the experience... yet I face people who seemingly make an effort not to grow.

I would also micromanage this way. Developers leave, code remains. If your code base is full of `JSON.parse(...)` in a few years because of some developer who thought "how clever to do this instead of object literals", it's not the author who has to live with their decision, it's the next code maintainer.
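For concreteness, the pattern being debated looks roughly like this (illustrative config values; per the article, it only pays off for data around 10 kB or larger):

```typescript
// Illustrative sketch of the two forms being compared (hypothetical data).

// Plain object literal: the engine parses this as full JavaScript syntax.
const configLiteral = { locale: "en-US", features: { darkMode: true } };

// JSON.parse over a string literal: the string's much simpler grammar can
// be parsed faster by the engine, at the cost of readability and tooling.
const configParsed = JSON.parse(
  '{"locale":"en-US","features":{"darkMode":true}}'
);

console.log(configParsed.features.darkMode); // true
```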

I see too many programmers being too clever and then leaving their clever code to become someone else's issue. My advice is be simple and make readable code. No one wants to maintain the clever code of another person.

Firing instead of teaching when the person can learn isn't management.

The problem isn't that maintainable code isn't worth the effort; the problem is that firing people until someone matches your demands is not the most effective way to GET maintainable code.

Tough life that maintainer is going to have, seeing `JSON.parse(...)` being wrapped around object literals in code. This truly is going to cost them many man hours and lots of hair pulled out in stress.

Seriously though, there's clever code and then there's just nitpicking. Micro-optimizations with JSON.parse() look ugly and nullify some editor conveniences, but they're IMO very far from being a fireable offense.

They are not a fireable offense in my book either, but they sure as hell wouldn't pass my code review. I've had to deal with too much crap like this in the past. Self-proclaimed senior devs that micro-optimize everything and leave a mess, then leave. Love them.

One should always optimize for easy maintenance. Performance is always a secondary goal, because it doesn't matter how fast (you think) your code is if you can't understand it.

Well I agree, you don't fire the person because they put JSON.parse(...) even if they put it in 1000 times. That would be silly.

The question is WHY did they do that? I'd probably get them to learn about performance tuning and do some profiling, make something faster. When they find out that it's slow because of something they didn't predict, hopefully they'll decide for themselves that they can't predict what will be slow, so no point complicating the code. If they don't get that, maybe explain it to them.

Basically the person who put JSON.parse all over the code was learning.

If they come back and arrogantly say "I'm right, and I'll carry on doing it, you won't stop me", then that could be an attitude problem that might lead to questioning whether they should be working there.

There are more nuances like if the person is claiming to be a senior developer/architect then the trigger for firing them might be more likely to be pulled. But still it is worth thinking about it first.

> I’d probably fire someone if I started to see `JSON.parse(…)` everywhere in a codebase just for “performance reasons” …

Yep, and I'd fire you for doing that! There are better ways to manage than showing off your authority. Oh, and by the way, do you guess some JSON.parse statements for performance would be the worst thing in your codebase(s)? I mean, I cannot believe that would be the worst thing in your codebase. Also, if it really helps to use some JSON.parse for creating big objects for performance reasons, who cares? Instead of firing 'someone', maybe you can add an annotation to it for readability (or, if that is below your imaginary level, ask the developer if he/she can add that).

Sry, but I hate people that misuse their authority by imposing their subjective opinions.

You're extrapolating quite a bit from a simple comment, which tells me that you'd probably be a poor manager as well. Then again, I'm extrapolating quite a bit as well.

Seeing something like JSON.parse throughout the code is definitely a code smell and could decrease the maintainability of the codebase, and that's a very tangible problem. Obviously you shouldn't fire someone over something like this if it's the first offense, but it definitely raises red flags and should make you monitor things a little more closely. If they show a pattern of dogmatism and poor judgement, you're probably better off finding someone else with better judgement. You're not going to find a perfect employee, but some employees are just better at making decisions for a larger project than others.

"We apologize for the fault in the subtitles^Hjavascript. Those responsible have been sacked."

"Those responsible for sacking the people who have just been sacked have been sacked"

"The directors of the firm hired to continue the credits after the other people had been sacked, wish it to be known that they have just been sacked."

Relevant "The IT Crowd" scene: https://www.youtube.com/watch?v=pGFGD5pj03M

Interesting how typescript plays into this - I mean back in the wild old days of plain old JS I would be totally fine with putting a JSON.parse here and there, especially on the hot path.

But now with static types - this would totally wreck static type checking. And you would need to spend additional cycles to validate that the data is actually correct.

Definitely a change request in the PR.

The perf advantage probably has to be really big to warrant the loss of static checks.

Typescript gives you type checking if you import from a JSON file. (Node handles JSON imports and webpack will happily build that for you into a JSON.parse in a bundle.)

To be honest I think it's a big mistake on the part of typescript to not have a JSON.parse<T>.

That would just obscure the lie. I'd rather see an explicit `as T` cast at the call site to make the "trust me, typechecker, I know what shape this is" claim be in-your-face instead of hidden behind a type parameter.

(This reply assumes you're not asking for TypeScript to make a major philosophical shift and start generating runtime code to validate types. If you are, that's a discussion worth having but goes way deeper than `JSON.parse`.)
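A minimal sketch of what that explicit cast looks like (the `Config` shape and the payload are hypothetical):

```typescript
interface Config {
  theme: string;
  retries: number;
}

const raw = '{"theme":"dark","retries":3}';

// JSON.parse returns `any`; the `as Config` cast is a visible, unchecked
// claim at the call site: nothing is validated at runtime.
const config = JSON.parse(raw) as Config;

console.log(config.retries); // 3
```

If the string turns out to have a different shape, the cast silently lies, which is exactly why having it be explicit and greppable is valuable.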

Can JS static typing not evaluate constant expressions to infer types?

TypeScript has great type inference, but there's no way to get it to parse JSON strings at type-checking time.

> Oh, and by the way, would some JSON.parse statements for performance be the worst thing in your codebase(s) you guess?

One thing I have seen from managers who don’t work regularly in the codebase — they tend to over-focus on things like whitespace and function names more than correct abstractions, separation of concerns, etc.

Would you also fire yourself?

That’s the compiler/minifier’s role anyway, to use the best construct when appropriate.

See Java’s whole “abc”+”ced” vs StringBuilder performance issues. When programmers have to alter readability for performance, it doesn’t necessarily mean they shouldn’t do it, but it means the precompiler is not advanced enough.

I wish I could upvote this more than once.

Readability is crucial in code. If you have to go through and change the JSON that's being parsed and it takes a nontrivial amount of time, that's a big setback. Sure, it's 1.7x faster (in V8) to parse JSON, but how long does it take to parse 10kb of an object literal in the first place? Given that these static, large objects are not commonplace in a codebase, is it worth the tradeoff?

The precompiler, such as Babel, could introduce a plugin for this sort of optimization. We only write ASM when it's going to significantly change the performance characteristics, and typically when a particular code path is run many, many times throughout an application. If an object literal like this is getting parsed that frequently, there are better ways to optimize so that it doesn't need to happen at all anyway.

I could see this being very useful in a variety of applications, such as server-side rendering. However, it would be best for this to happen in an optimization phase, as you're already bundling at that point.

That was a weird era in Java history. They changed the compiler in the subsequent major version to perform that transformation automatically, but by then people had had 2 years to stare at perf graphs looking for bottlenecks.

It never was clear to me why they didn't do both of those in the same release. Backward compatibility wasn't the problem (they were already breaking that left and right).

Even better, I think Eclipse’s Java compiler introduced the optimization in a given version, but Maven hadn’t yet. So it wasn’t optimized in production, but was optimized on the developer’s machine. What a time to be alive.

Java devs saw that "abc" + "def" involved expensive String concatenation, so as a performance improvement they pro-actively, and effectively manually, changed to use explicit StringBuffer concatenations.

When the compiler switched to generate StringBuilder (unsynchronized) concatenations for "abc" + "def" nobody benefited, because they had already changed to use StringBuffer (synchronized).

Now they had to go and undo all of their hard, manual optimization work.

I feel like the same would/might play out here.

define: hyperbole

noun exaggerated statements or claims not meant to be taken literally.

They say in the linked article that this should only be used for objects about 10kb and larger.

I'd argue that if you have 10kb or larger object literals in your codebase, you are already missing the mark on readability and maintainability in some ways.

Where you'll usually find this:

- exporting data from server to client for initialization

- localization data

- environment variables (feature maps, configuration etc)

- preloading datasets for graphs/tables

If it's exclusively going to be used for heavyweight operations like these, it's probably better to benchmark against protobuf decoding. I guess using JSON has a "works out of the box" appeal, and doesn't require defining any protobuf schema. But personally I don't see defining proto files as too prohibitive in terms of development cost.

> benchmark against protobuf decoding

Protobuf isn't built into the browser, so it can't bypass the JS parse & execute time. Instead you'd be parsing protobuf's JS, executing it, parsing proto, and producing objects. It'd be worth doing, sure, but it'd almost certainly be the slowest option by far since it's doing way more stuff in JS than either of the other two options and the JS syntax parse is the slow part.

These benchmarks indicate better protobuf performance [1]. Compute time these days is often dominated by memory transfer rates. The "slowness" of JavaScript seems to be offset by there being less data to begin with. Collapsing a 100KB resource down to 50 or 25KB is usually worth it, even if you have to do more operations in JavaScript. Not to mention end-to-end load time (which is probably what people are usually trying to optimize for) can be lower by reducing how much data needs to travel over the wire or radio.

At the end of the day, who knows if the use case hits edge cases or stresses parts of the implementation that is not optimized for JSON decode or protobuf. Getting meaningful performance data ultimately needs to be experimental, and resists categorical answers about whether X is faster than Y.

1. https://www.npmjs.com/package/protobufjs#performance

This article goes into a bit more detail: https://auth0.com/blog/beating-json-performance-with-protobu...

> These benchmarks indicate better protobuf performance [1].

We're exclusively talking about cold start performance here. Single, one-time object creation. Hence why JS syntax parse is the dominant factor and not execution performance. Those benchmarks are not that; they are hot performance. That's a completely different thing.

> Not to mention end to end load time (which is probably what people are usually trying to optimize for) can be lower by reducing how much data needs to travel over the wire or radio.

Wire transfer size would need to be looked at differently. The JS code & JSON string are both also going to be compressed unless you're not using a compressed Content-Type for some reason.

What is the "completely different thing" you're referring to here? Between:

1. Having a static JSON string, and decoding that string.


2. Having a static blob, and using protobufs to decode that blob.

these two things accomplish the same thing. I'm not sure why you seem to think one is a "cold start" and the other is "hot" - they're both "single, one-time object creation". The former is going to be parsing ints and floats as ascii, and reading in "true" and "false". Regardless of compression, the memory-inefficient JSON encoding is going to be used (whether it's over the wire, or just as an intermediate representation during parsing). I've used protobuf decoding for things like localizations and configurations before - the "cold start" use case you're talking about - and it does in many circumstances result in faster loading. My napkin paper reasoning is that this will be much more heavily weighted to booleans and integers that are much more efficiently encoded in protobufs than JSON, so maybe if you had a use case that almost entirely decoded strings your performance differences may not be the same.

Are you including the cost of loading protobuf itself? You seem to be basing your argument on an assumed already present & loaded protobuf library.

You need to benchmark starting from nothing at all. Your link that you seem to be basing this off of has a loaded and fully JIT'd protobuf. That's not the start state.

You can measure the impact on loading time, and the size of the protobuf implementation you're using probably has an impact on the threshold at which it becomes more efficient. I don't doubt that parsing a 500 character long JSON string is probably faster than loading a protobuf to do it instead. In fact, apparently this JSON parsing trick is only effective beyond 10K or so. But past a certain threshold memory bandwidth is more crucial than loading code. If your data consists mostly of booleans and integers then JSON can often be an order of magnitude larger in size than protobufs. If it's compressed, then decompressing it takes clock cycles and the parsing code is still parsing the larger uncompressed JSON text. A protobuf library can often skip compression altogether by virtue of using normal ints and bits for numbers and booleans. So while the protobuf library does have some additional overhead it's often higher throughput for many types of data.

You’re repeatedly missing the point. This is about optimizing startup time.

The comparison should be:

cost of downloading payload + runtime cost of parsing JSON


cost to download protobuf lib + parse and execute JS protobuf lib + download payload + runtime cost of parsing

Specifically, the article talks about how parsing JS is more costly than JSON - this cost will apply to the protobuf library which certainly far exceeds 10KB. There is no way the math will work on your favor until you get to MBs of data.

> Specifically, the article talks about how parsing JS is more costly than JSON - this cost will apply to the protobuf library which certainly far exceeds 10KB.

I would suggest reading the links I posted. The minimal protobuf library, which is suitable for working with static decoding, is 6.5KB [1]. Again, you're right that the size of the protobuf library will be an important factor in dictating the scale at which it's more effective than JSON parsing but your sense of the factors is off - a light protobuf library doesn't reach 10kB let alone "far exceeds 10KB".

Furthermore, if your pages use the protobuf library already for other uses like decoding and encoding RPC messages then loading and parsing the protobuf library is basically free - you're going to be doing this anyway.

1. https://www.npmjs.com/package/protobufjs#installation

I currently work on a project using protobufjs. Our generated static classes are ~500KB and ~1.5MB, or around 140KB gzipped. The schemas are not that large, and this does not even include any network code (not part of protobufjs).

Also cache initialization scenarios, larger datasets used for common dropdown/select lists like countries w/ ISO codes etc.

None of these need readability though (or are particularly readable to begin with, regardless of if object literals or parsed).

Either way it's really more data than object at that point so it's appropriate to store it as JSON. Normally I'd place such data in a different file, but I can imagine that that might not be best for webpages.

> fata

you created a noun to describe "fat data" from a typo.

“Fata” is the word for a bucket in icelandic. Quite apt indeed.

Now I feel bad for fixing the typo...

But is it pronounced FAY-ta or FAT-a?

FAY-ta is a type of cheese. Not unlike this comment thread.

Sadly we pronounce it FEH-ta in the states

Like a lot of people who have given interviews, I have my own set of very odd stories.

I interviewed a developer and asked him to explain on the whiteboard how the system he was currently working on worked. As he talked he drew two boxes. He drew a line between those boxes. Then as he talked he kept drawing over the line between the boxes. (Now, he was junior to mid-career, so I didn't expect a magnum opus, but we value people who can explain themselves, because at least if they're wrong we find out before the mess gets too big. But I digress.)

Your analysis reminded me of that interaction. What kind of information architecture do you have if you're building objects that big?

I mean, as others have said, if this is the main payload being transferred from client to server, it's probably going to arrive as JSON and you're going to turn it into Objects.

If it's not that data (they're talking about cold loads) how many other categories do you have that can approach 10k?

Configuration? We have libraries for that and they often read a JSON file.

Lookup tables for fixed relationships of data in the system? Maybe, but that complicates your testing situation.

How many of those categories get loaded more than once per session? Are these really such large startup bottlenecks that we tackle this instead of other problems? GP implied incompetence but I get more of a whiff of desperation here.

Smells like a God Object pattern to me.

> remember, code readability and maintainability are just as important (if not more).

I don't know about that. Prioritising making your own job easier over the experience of all your end users feels like a much more fireable offense to me.

In this particular case I'm still a little wary of it because it feels like it's optimising for a current implementation with no idea what the future performance implications might be (or current implications in non V8 engines?) but this trend of prioritising developer experience over everything feels like a very bad one to me. It's the same reason given to justify making every web site a React app with no thought toward the extra JS payload you're sending when it's not needed.

This hack is supposed to be for huge data: 10kb or more, thus comfortably more than a page. If the >10kb wall o' code was wrapped in a parse-as-JSON-at-runtime function call which was preceded by a three-line comment describing a quick and dirty benchmark showing that it saves a useful number of milliseconds on page load in a fairly typical use case, and if the web resource was intended to be loaded many millions of times, I would nod and approve when reviewing the code. The way the original objector writes, it sounds as though nothing would suffice to justify this hack, certainly not a mere benchmark and 3 lines of comments preceding it. That attitude seems like unreasonable blinkered zealotry, or some other kind of tunnel vision, e.g. someone who has just never thought seriously about the appropriate tradeoffs in maintaining a web resource which gets loaded millions of times a month.

Users like code with fewer bugs and rapid response time for new feature requests, right? If you start firing people for taking the time to write readable and maintainable code, you'll be doing a greater disservice to the users than those developers were.

It depends. Say you spend 8 hours of dev work to save 1 second of processing time per call. It will take 28,800 calls (8 hours = 28,800 seconds) before your time investment pays off.

This assumes the cost of dev time is equal to the cost of CPU time. In some cases the additional speed is going to return more value than the cost of the dev working. And other times the additional value of getting the product to market is going to win out.

The graph in the article includes results for other engines too, not only V8.

I would fire middle managers for firing individual contributors for trivial, easily correctible issues like that.

We need managers that mentor and train, instead of firing someone over silly things.

It'd certainly be a good idea to understand exactly what the alternative is when you see JSON.parse() before deciding it's bad or firing anyone, right? There are definitely some legit cases for JSON.parse(). Not to mention that a full round of you setting clear expectations, giving examples of what's recommended and what's not, giving people a chance to learn & grow, and documenting repeat offenses, should all be done before booting someone...?

Deep-copying JSON objects using stringify+parse is not just faster, but less problematic and less code than writing a recursive object copy routine.
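A quick sketch of the round-trip copy (with the usual caveats: it drops functions and `undefined`, turns `Date`s into strings, and throws on cyclic references):

```typescript
const original = { user: { name: "Ada", tags: ["admin"] }, count: 3 };

// One line replaces a hand-written recursive copy for plain data.
const copy = JSON.parse(JSON.stringify(original));

// Mutating the copy leaves the original untouched.
copy.user.tags.push("editor");
console.log(original.user.tags.length); // 1
```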

First paragraph...

> This knowledge can be applied to improve start-up performance for web apps that ship large JSON-like configuration object literals

Third paragraph...

> A good rule of thumb is to apply this technique for objects of 10 kB or larger — but as always with performance advice, measure the actual impact before making any changes.

I'd fire people who don't RTFM

I wouldn’t mind having this in my build step, as it’s all minified and unreadable anyway, so what do I care, but I agree with you fully.

Not only would you be missing out on readability, none of your linters will catch errors within that string any more and if you use something like prettier, well, god help you. You’re almost guaranteed to introduce more wasted time than you’ll save with this doing it manually.

Well, they are suggesting it for literals that are 10 kB or larger. That means they aren't really talking about code that's in your normal codebase - it's quite rare to have a literal that large. It is more likely this is relevant for backend tools that autogenerate JavaScript code to be sent to a client.

For the main two apps I work on, there are some configurations that differ between client deployments; this includes i18n strings, configuration settings/options, theme options, and a couple of images (base64 encoded) for theming. Switching to JSON.parse had a pretty significant impact, from over 200ms to under 100ms for my specific use case (IIRC). Memory usage was also reduced.

I don't remember the specific numbers... it was an easy change in the server handler for the base.js file that injects a __BASE__ variable.

    var clientConfig = JSON.Stringify(base.Env.Settings.ToClient(null)).Replace("\"", "\\\"");
    // NOTE: JSON.parse is faster than direct JS object injection.
    ClientBase = $"{clientTest}\nwindow.__BASE__ = JSON.parse(\"{clientConfig}\")";
    return Content($"{ClientBase}\n__BASE__.acceptLanguage=\"{lang}\";", "application/javascript");
The top part is actually a static variable that gets reused for each request, the bottom is the response with the request language being set for localization in the browser app.

I totally agree that inlining `JSON.parse` of string literals in source is a bad idea and I would reject it in a code review except under the most extreme circumstances (and even then try to identify a better solution).

On the other hand, knowing the performance characteristics, this is something that compilers could do as an optimization. Who knows if that's worth the effort, but this kind of research is part of determining that.

The JSON.parse approach might also be useful if the same data needs to be used in non-JavaScript code too.

You could then use the same string in JSON.parse(...) in your JavaScript, json_decode(...) in your PHP, JSON::Parse's parse_json(...) in your Perl, json.loads(...) in Python, and so on.

If you do have constant data that needs to match across multiple programs, it will probably be better in many or even most applications to store the constant data in one place and have everything load it from there at run time, but for those cases where it really is best to hard code the data in each program, doing so as identical JSON strings might reduce mistakes.

> I’d probably fire someone if I started to see `JSON.parse(…)`

Guys - I think he was being hyperbolic. Ya know, like everyone does on the Internet. If he had said "if I had to look at JSON.parse(...) lines constantly, I'd jump off a building!" I doubt you all would be calling 911 over an attempted suicide.

Seriously, chill.

If I used this one weird trick, I'd want it to be compile time checked.

I'd stick that JSON in a separate file, get typescript to compile it "just to check it's OK" then get the compiled code and include it as a string using something like https://webpack.js.org/loaders/raw-loader/, I guess (not used it before).

There might be a leaner way to do this (maybe the whole thing can be done as a webpack loader in one step), but something like this.
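Something in this spirit might look like the fragment below (the rule shape follows raw-loader's documented usage; the `.large.json` filename convention is made up for illustration):

```typescript
// webpack.config.ts (fragment): a hypothetical rule that imports matching
// JSON files as raw strings, so they can be passed to JSON.parse at
// runtime instead of being inlined as object literals by the bundler.
const config = {
  module: {
    rules: [
      {
        test: /\.large\.json$/,
        type: "javascript/auto", // opt out of webpack's built-in JSON handling
        use: "raw-loader",
      },
    ],
  },
};

export default config;
```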

They mentioned that it should only be used for very large objects (say, 10k), so if you're seeing ~10k, hard-coded objects throughout your code, you should probably fire someone. If it's in just a few places, there should be a comment describing it (e.g. "large object constructed from DB query, use JSON to make page load faster").

I believe you can use "Interceptors" or the Adapter pattern on the front end to easily use JSON.parse once for all your http calls instead of littering it throughout the code base.
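Something like this (names are illustrative; in a real app the interceptor would wrap fetch/axios, but here it is shown as a pure function over response bodies):

```typescript
// A single response "interceptor" that applies JSON.parse in one place,
// so individual call sites never repeat it.
type Interceptor = (body: string) => unknown;

const parseOnce: Interceptor = (body) => JSON.parse(body);

// Simulated response bodies flowing through the interceptor:
const bodies = ['{"ok":true}', "[1,2,3]"];
const parsed = bodies.map(parseOnce);

console.log((parsed[1] as number[]).length); // 3
```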

Why do you care? It's syntax and can be automated via build tools, so you need not hurt your eyes with syntax that you consider unpleasant.

Which is the crux of the issue here: your opinion.

TFA says this could make sense for objects over 10kb. They clearly aren’t advocating doing it everywhere in a code base.

No, they’re not.

Most development time is going to be spent on reading code that's already written, so yes, they do matter. With the speeds mentioned it's not going to be appreciable until you hit a massive scale, which, let's face it, most of us aren't working with.

Most dev time for people refactoring code - yes... but not for new projects.

And as you say some people do write at scale.

> code readability and maintainability are just as important (if not more).

This is wrong; that’s all I was saying. Code it right and it is readable anyway.

Well, we're talking about injected variables...

    const injectedValue = JSON.parse("$SERVER_JSON_VALUE.replace("\"","\\\"")");
    // vs
    const injectedValue = $SERVER_JSON_VALUE;
Generally, for a single value in the codebase, this is emphatically NOT a huge issue... and if it saves 80-120ms or so on load, that's a significant impact. Not to mention the lower memory overhead while doing so.

Deliberately provocative conversation piece:

If you're concerned enough about performance, or message passing costs are enough of an overall performance bottleneck, that parsing your messages even 1.7x as fast is worth changing the way you code, you probably shouldn't be using JSON as your message format in the first place.

We're talking about JavaScript in the browser though... what other message format is more readily and performantly processed in-browser using JavaScript than JSON?

Once WebAssembly gains APIs to change the DOM quickly and the big JS frameworks switch to WebAssembly for their internal engines, you could make the case for usage of binary formats like protobuf. IMO this trend of piling on technology after technology to handle bloat instead of designing websites to be lean is wrong but it's certainly the direction we are walking into. Websites will become even more opaque and complex. Definitely not looking forward to it.

>Once WebAssembly gains APIs to change the DOM quickly and the big JS frameworks switch to WebAssembly for their internal engines

Any idea what the timeline for such changes could be? Personally I'd welcome the possibility of compiling complex web apps down to WASM, but I can't see the things you mention happening any time soon

You don’t need webassembly. We already have typed arrays to represent binary data and do fast slices.
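
The typed-array point can be sketched without any library (the offsets and field names here are invented for illustration): a DataView lets you read fields straight out of raw bytes at fixed offsets, which is essentially what flat binary formats do instead of a full parse.

```javascript
// Write two "fields" into a raw buffer at fixed offsets.
const buf = new ArrayBuffer(12);
const view = new DataView(buf);
view.setUint32(0, 42, true);    // field a, little-endian
view.setFloat64(4, 1.5, true);  // field b

// "Decoding" is just offset math: lazy, allocation-free reads,
// with no need to scan or materialize the whole message first.
const a = view.getUint32(0, true);
const b = view.getFloat64(4, true);
```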

That’s essentially what flatbuffers is. Slightly larger than protobuf but insanely fast to parse since it doesn’t need to scan the whole file. It’s both memory and CPU efficient. Netflix uses it in their app because TVs can be low powered devices.

That’s why Netflix feels so much lighter than Amazon, HBO, or Hulu. They all freeze my Vizio TV but Netflix is smooth.


> Websites will become even more opaque and complex. Definitely not looking forward to it.

Why is that an issue? Do websites need to be open source? How many people, including software developers, will actually view the source for a 3rd party website and/or try to debug it? Beyond screen-scraping and learning purposes I don't see a use case for it.

I'd be quite happy with my browser(s) downloading and executing binary blobs if it means better usage of my devices' resources and bandwidth.

I don’t think this is a one-or-the-other situation.

“Bloat” is a problem on, say, news sites with horrible ads and it’d be great if they kept it lean. Obviously they don’t need webassembly and binary formats.

But... there are also incredibly powerful tools (google maps and docs, quake in the browser, streaming services, etc) that push the boundaries, which these kinds of tech will enhance, or make possible in the first place.

The browser DOM wouldn’t change simply because you are accessing it from a different language. I really get the impression that people who advocate web assembly as JavaScript replacement do so out of some ignorance of JavaScript and almost complete ignorance of the DOM.

I'd like to see protobuf take off, but we're talking 2025, 2028?

https://google.github.io/flatbuffers/ Is what you want in the browser.

I'm not necessarily assuming JavaScript in the browser; we could just as easily be talking back-end services written in Node.

That said, I'm going to also submit that, if you're shoving big enough messages at a fast enough clip that you feel motivated to be this worried about deserialization speed in the browser, you've also got bigger fish to fry.

Either way you cut it, it's at least worth stopping to think about whether you're being penny wise and pound foolish.

Protobuf.js, gRPC, etc, etc, etc. It's not like you can't send arbitrary binary data over HTTP or WS.

For typical cases JavaScript protos (jspb) should use JSON wire format. You would only use binary wire format depending on whether your message type is suited to it, e.g., lots of internal byte arrays.

Lots of repetitive data with a relatively flat structure can be a good argument to get away from JSON, too.

Let's set aside binary formats for a moment. I once sped up populating a large-ish table of data by an order of magnitude - and achieved a pretty decent reduction in data volume, too - just by switching the format to CSV.

Is CSV so different than JSON?

  ["my", "data", 1, 2, 3]

I'm surprised it is 10x faster.

Infrastructure for protobufs is pretty effective. It is parsed in browser, but several implementations show better performance than JSON decoding. A unified set of libraries (or rather, code generators) to encode and decode in various languages is a plus, too. Both my current and last company used protobufs to encode API request data, as well as for things like config loading.


So I guess we should "transpile" static objects into strings that we call with JSON.parse?

Not sure if I should end this comment with a /s or not.

Sometimes you can have massive static objects. Like a list of emojis and their codepoints.[1] It could be useful in cases like that. You'd have to experiment, though.

[1] https://raw.githubusercontent.com/joypixels/emoji-toolkit/ma...

Since we compile JavaScript to an unreadable mess anyway... Why not?

That's...what the article proposes, so yes?

The key part here is

> As long as the JSON string is only evaluated once the JSON.parse approach is much faster.

So doing this would only make sense for top-level declarations that run once: an object literal's compiled code gets cached and reused, whereas JSON.parse has to re-parse its string argument on every execution (e.g. when creating an object in a loop).
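
A minimal sketch of that trade-off (values invented): the literal below is parsed once at compile time and merely instantiated per call, while the JSON.parse version re-scans its string argument on every call.

```javascript
// Parsed once by the JS engine; each call just instantiates the object.
function viaLiteral() {
  return { foo: 42, bar: "baz" };
}

// The string is re-parsed on every single call, so inside a hot loop
// this loses; it only wins for one-shot data like top-level config.
function viaJsonParse() {
  return JSON.parse('{"foo":42,"bar":"baz"}');
}
```

Both produce structurally identical objects; only the per-call cost differs.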

WebPack, properly configured, will let you import plain old .json files through the normal es6 import mechanism. It doesn't use this technique though: the JSON file is treated as normal JS source (preprocessed to remove unnecessary quotes etc.) and wrapped with an export.

Commenting on the security issue from the end of the explainer for visibility.

I’m having flashbacks to the Java serialization vulnerabilities from a couple of years ago.

ECMAScript and JSON do not have the same set of escape characters:

``` Note: It’s crucially important to post-process user-controlled input to escape any special character sequences, depending on the context. In this particular case, we’re injecting into a <script> tag, so we must (also) escape </script, <script, and <!--. ```

JSON is a syntactic subset of JavaScript in ES2019 [1].


Why would you ever be escaping HTML in client-side JS? You should be using appropriate DOM APIs (which don't include innerHTML) to manipulate the document.

If the JSON itself contains strings with markup included, and you're injecting directly into a script tag in the HTML document.

Though, if you're dealing with a typed object server-side and/or loading into a .js file request, it's less of an issue, if you aren't supporting html markup in the object to begin with. In my own use case, both are true.

Hence why I said "You should be using appropriate DOM APIs"...

The appropriate DOM APIs don't take HTML strings in the first place. You shouldn't be passing HTML strings to JS.

I think you’ve missed the vulnerability. You can use appropriate DOM APIs all you want and be hit by a malicious escaper if you don’t serialize in the right way. At some point, your js is inserted into the document via a script tag. If you use the JSON parser to initialize your data more quickly, and your data has user input anywhere within it, then if you aren’t careful about encoding/decoding your string, the attacker’s can abruptly interrupt your own script simply by inserting `</script>//attacker script here`. It is comparable to an SQL injection attack, but instead it is HTML injection that is made possible if you use JSON.stringify without taking the differing character specification into account (the example in the write up shows one good way to do this).
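
A sketch of the kind of server-side serializer the parent describes (this is not the article's exact code, just one safe variant): escape everything that could terminate the string literal or the surrounding <script> element before embedding.

```javascript
// Assumption: the return value is spliced into a page as
//   <script>const data = <output of this function>;</script>
function toSafeScriptLiteral(data) {
  const json = JSON.stringify(data);
  const escaped = json
    .replace(/\\/g, "\\\\")         // backslashes first, so later escapes survive
    .replace(/'/g, "\\'")           // the payload is wrapped in single quotes below
    .replace(/</g, "\\u003C")       // defuses </script, <script and <!--
    .replace(/\u2028/g, "\\u2028")  // U+2028/U+2029 are legal in JSON strings
    .replace(/\u2029/g, "\\u2029"); // but not in pre-ES2019 JS string literals
  return "JSON.parse('" + escaped + "')";
}
```

With this, an attacker-controlled value like `</script><script>...` comes out with every `<` encoded as `\u003C` inside the string literal, so the browser's HTML parser never sees a closing script tag.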

The attacker is escaping JSON.parse to escape HTML.

Well I guess this means that if you have 1k+ lines of static JSON, it would be better to consider converting it to a string and using JSON.parse instead.

I'm not sure I can find a use case of such a big object declaration. Usually what you do is to get it from somewhere ( file, db - with nodejs, xhr ) where it's been parsed with JSON.parse anyway.

I guess somebody needed to do that, but I am totally with you that this sounds like a follow up request or read for metadata to me.

This makes total sense, it really just means that the time it takes to compile JSON.parse on a string literal is offset by how much simpler and faster parsing a JSON object is than a js one.

It seems that you could define a subset of JS Objects that are exactly those you could define with JSON (no functions, no recursion), and the browser could always read these more efficiently. With modern tooling, these could probably even be automatically discovered at compile time, and all your `const a = {b: "c"}` objects could be changed into `const a = SimpleObject({b: "c"})`.

Without that, you could probably use Typescript today to spit out JSON whenever the compiler saw that it was more efficient.
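
One hedged sketch of what such a build-time transform could emit (the helper name is made up; this is not an existing Babel plugin or TypeScript feature): stringify a statically known, JSON-safe literal at build time and output a JSON.parse call in its place.

```javascript
// Hypothetical helper a Babel-style plugin could use: if a literal's value
// is statically known and JSON-safe, emit `JSON.parse('...')` source instead.
function toJsonParseSource(literal) {
  const json = JSON.stringify(literal);
  if (json === undefined) {
    throw new Error("not JSON-serializable (function, undefined, symbol...)");
  }
  // escape for the single-quoted JS string literal we are emitting
  const payload = json.replace(/\\/g, "\\\\").replace(/'/g, "\\'");
  return `JSON.parse('${payload}')`;
}

// e.g. `const a = {b: "c"}` could be rewritten to:
// const a = JSON.parse('{"b":"c"}');
```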

There is no way of knowing someone won’t do a.foo = window.alert later though, unless it’s a frozen object

That's also true of an object parsed with JSON.parse

JSON.parse won't parse a function call/literal. Direct injection would.

You're right about that, but I don't think you're following the argument. The argument was that there could be a `SimpleObject` that's limited and parses quicker. A `SimpleObject` wouldn't parse a function call, just like JSON.parse.

As OP said, "subset of JS Objects". This subset of JS objects wouldn't support function calls.

Don't we already have that, plus a transport medium with JSON?

The speed-up is at object instantiation. `a.foo = window.alert` could only be done post-instantiation.

Why isn’t the Javascript compiler storing an intermediate form after parsing the code? Then surely it would be faster to just execute the bytecode?

It is, I believe. The article seems to be saying you can get to that intermediate form more quickly by parsing the object from a string than you can parsing it as a POJO.

This is the XOR AX,AX of the 21st century

What's XOR AX,AX of the 20th century?

This is assembly. The instruction does a boolean "exclusive or" of a register named "AX" (think of a register as a hardware variable) with itself, storing the result back into it. This always leaves the register containing 0, because 0 XOR 0 = 0 and 1 XOR 1 = 0.

And for some weird reason, doing that was faster than just doing MOV AX, 0 (which is literally "Move 0 into AX", so "AX = 0" in more familiar syntax).

Edit: Oh man, my fellow HN'ers, as I did, all jumped on the occasion to show off :)

Ah! Wow, that is surprising. Why was it so much faster?

In the beginning (8086/8088) because it was a shorter instruction encoding than the literal "move a zero into AX" instruction, so it created smaller code and saved a memory read (even if that read was a prefetch memory read).

Later Intel actually special cased it in the instruction decoding path for later CPUs (starting somewhere around the PII/PIII era, but I don't remember the exact timeframe) so that it also does not consume an execution unit and it /effectively/ executes in zero cycles.

The reason why it is special cased is that on a pipelined or OoO CPU, xor ax,ax would otherwise be significantly slower than a straight mov, as xor has a dependency on its operand registers.

On a similar note, on many RISC architectures NOP is actually something like ADD r0, r0, r0, and that too is usually special cased in the hazard stall and result forwarding logic (although usually the special-cased part is “ignore hazards that involve r0”)

Adding the same register to itself and then assigning the value to itself is not really a nop. Why is that instruction forbidden and converted to a nop instead?

Because in some RISC architectures, the first (R0) or last (R31 or whatever is largest register number) is also a 'special' register in that it is hardwired to zero, any reads return zero, and writes are simply discarded.

In this instance, if the register always reads as zero, and can never be changed, then ADD R0,R0,R0 is, in effect, a NOP, so it gets special cased, and doing so avoids having to allocate an additional opcode explicitly to the "NOP" instruction.

Anticipating the question of "why is Rx (0 or 31 or ??) hardwired to zero?", that one is because it is useful for:

1) obtaining a zero without having to perform a load from memory

2) creating additional addressing modes by reusing another existing addressing mode

For #2, if the RISC arch. implements, say, base plus index addressing where two registers are added together to obtain a final address, using R0 as one of the inputs creates a direct addressing mode from the base plus index mode.

So base plus index could be written as load R5, R6+R7. Substituting R0 (assuming it is the hardwired to zero register) for R6 (or R7) results in directly addressing from the value in the other register, converting an 'indexed' addressing mode into a 'direct' mode, without having to add a 'mode select bit' to the actual instruction. The result being that the chip only needed hardware for a single addressing mode, but the programmer has two addressing modes available for their use. If memory serves, the DEC Alpha made large use of tricks like this. The hardware only implemented a small handful of addressing modes (say 4-5) yet the full set of addressing modes exposed to the programmer was two or three times larger due to creative uses of "zero stuffing" into the actual hardware modes.

1995, in the Pentium Pro. As the other guy says, the real special part is that it writes zero to ax without reading ax. So it breaks a dependency chain if you do this in a loop.

The important tuning piece for OOO processors is removing dependencies between calculations.

There's a bunch of overhead associated with reading a constant from the program and moving that to the register.

Math operations, on the other hand, are very simple to do because it's baked right into the CPU, only requiring one instruction and a few cycles

No, the reason is that `xor ax, ax` has a shorter encoding, because it only takes 3 bits to encode a register on 32-bit x86, and far more bits are required for a useful immediate move. It would be ridiculous to have a short encoding for `mov rN, #imm` where the immediate has to be in the range 0-7.

> It would be ridiculous to have a short encoding for `mov rN, #imm` where the immediate has to be in the range 0-7.

Such instruction didn't exist but I wouldn't call it ridiculous. It would actually be quite useful as setting a register to 3 is far more common than setting a register to 7911.

M68000 has a MOVEQ #imm, Dn

Since its instructions are always multiples of 16bit though, the immediate value is 8 bit, so slightly more useful, but the most common use of it was certainly setting a register to 0

If you're going to do something like that, you might as well steal from the register numbering bits instead.

(many architectures have done this: Intel has lots of short-form encodings for "op {al, ax, eax, rax} #imm", and ARM has Thumb, which basically chops the register count in half for all instructions to shorten their encoding)

Trivia: MIPS and RISC-V have the r0 register equal to zero to optimize this case in a more elegant way.

So does x64, almost. More accurately, so do the underlying Intel and AMD micro-archs that host x64.

Shorter (in assembled bytes) and faster than MOV AX,0


Goes back to the 8080, in fact.

The op "XOR" performs the "exclusive or" operation between two things and stores the result in the first operand. "XOR"ing a number with itself always results in 0, which is then stored in the operand itself.

This operation was almost always faster than moving a constant (like 0) into a register, because of all of the overhead of making the constant and reading and writing.

xor'ing a value with itself produces 0, so XOR AX,AX is equivalent to "AX = 0". It's a common trick in assembly code because it's faster (and/or maybe more compact, can't remember) than the more obvious MOV AX,$0.

Fastest way to zero a register.

I've wondered about that with JSON.parse(JSON.stringify(obj)) being the fastest way to deep clone a plain JS object in any JS engine. It always looks dumb when it shows up, but it works well enough and I've seen it described as a best practice by some types of data manipulation libraries.

Would like to know this as well.

I know using spread

const clone = {...source};

Is faster than


And seeing as in this case the source is already a JS object, it would probably be slow to stringify and then parse.

Keep in mind that both of those options are only shallow clones. If you've got any bit of nesting and need a deep clone, the code starts to get a lot more complex. In general, in places where I "need" a lot of deep clones I try to use an immutable library with good reuse/sharing tools, but when you have to deep clone a plain JS object, JSON.parse(JSON.stringify(obj)) is incredibly fast in most browsers, despite the fact that you would think string allocation might be a bottleneck.
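
A quick illustration of the shallow-vs-deep distinction (sample data invented; the usual JSON round-trip caveats apply: functions and undefined are dropped, Dates become strings):

```javascript
const source = { a: 1, nested: { b: 2 } };

// Spread is a shallow copy: top-level keys are copied,
// but nested objects are shared by reference.
const shallow = { ...source };
shallow.nested.b = 99; // also mutates source.nested!

// The JSON round-trip produces a fully independent deep copy.
const deep = JSON.parse(JSON.stringify(source));
deep.nested.b = 42; // source stays at 99
```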

Title correction: it's faster to parse and initialize using JSON than to parse and compile the code required to initialize Objects directly. If the code is already compiled, it's much faster to initialize in code. At least that's my understanding of the linked article.

I think your understanding is correct. The relevant quote:

> As long as the JSON string is only evaluated once, the JSON.parse approach is much faster compared to the JavaScript object literal, especially for cold loads.

So that means that theoretically someone could write a Babel plugin that replaces objects with some sort of memoized JSON.parse in order to get the best of both worlds? Could that actually work in practice?

Edit: I think you would also need to identify and only optimize objects that are handled immutably. However, my understanding is that you can do that in a compiler/transpiler fairly easily.

> some sort of memoized JSON.parse in order to get the best of both worlds? Could that actually work in practice?

Probably not, because the kind of object literals that can be converted to JSON.parse are the kind where you have different objects with the same shape, whereas reusing a memoized object would result in multiple references to the same object.

EDIT: wait nvm, that's what you addressed with your edit.
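
The semantic pitfall being discussed can be sketched in a few lines (names invented for illustration): a literal yields a fresh object on every evaluation, while a memoized JSON.parse hands out one shared, mutable reference.

```javascript
// Object literal: a new object each time it's evaluated.
function freshLiteral() {
  return { x: 1 };
}

// Naively memoized JSON.parse: every caller gets the SAME object,
// so a mutation in one place leaks into all the others.
let cached;
function memoizedParse() {
  return cached ?? (cached = JSON.parse('{"x":1}'));
}
```

This is why a transform would have to prove the object is only read, never mutated, before memoizing.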

So is this insight limited to literals at page load, or are there situations where it might be faster to construct a JSON string and then parse it compared to directly creating the object?

I'm thinking of compression libraries, or things like FlatBuffers.

IE has a similar issue with adding page elements. I was trying to add a bunch of tr/td elements to a page. My goal was to load a 5,000+ row Excel file. I tried using document.createElement but it was painfully slow. I switched to string concatenation and element.innerHTML. It was about 30 times faster.

This was only in IE 11 and Edge. Chrome and Firefox took basically the same time for both.

I added them all to a new, detached tbody. After my appending loops I removed the existing tbody, then appended the new tbody to the table.
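
A hedged sketch of the technique described (helper and variable names are invented, and no HTML escaping is shown): build all rows as one string, let innerHTML parse it once into a detached tbody, then swap the tbody in.

```javascript
// Turn row arrays into one HTML string (trusted data only: no escaping here).
function rowsToHtml(rows) {
  return rows
    .map(r => `<tr>${r.map(c => `<td>${c}</td>`).join("")}</tr>`)
    .join("");
}

// Browser-only part, shown as comments so the sketch stays runnable anywhere:
// const tbody = document.createElement("tbody"); // detached: no reflow per row
// tbody.innerHTML = rowsToHtml(rows);            // one parse, not N createElement calls
// const old = table.querySelector("tbody");
// if (old) old.replaceWith(tbody); else table.appendChild(tbody);
```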

That makes sense. The JS object syntax is much more complex than JSON's.

Is there a way to provide v8 with compiled code instead of unparsed JS?

While it isn’t exactly what you’re looking for, this is the closest thing I’m aware of for pure JS: https://prepack.io/ and outside of that I guess you’re looking at WASM, although I know this isn’t exactly what you were asking for.

But if your goal is to compile a little closer to v8, then you’d have to set v8 as your compilation target, making that only really worthwhile on the server (since you’d be missing out on all other browser runtimes). I could see such a thing being pretty useful for something like Cloudflare Workers, where startup time is vital and your execution environment is guaranteed to be v8. Unless of course they’ve managed to always keep them hot.

Thanks. I was thinking more about desktop apps (Electron, CEF, etc).

Isn’t that what Web Assembly (WASM) is attempting to do? It’s a bytecode, not machine code, so it’s processor agnostic too.

I don't know about doing this directly with v8, but when used in the browser, the code will be cached after being compiled a certain number of times.

How many times?

According to the fine article, twice in three days.

You're right, I didn't realize that.

Couldn't this same performance boost be achieved by adding an optional "strict mode"-like flag for object literals to v8? Adding JSON.parse(…) everywhere you need an object literal seems exceptionally kludgy, even for JS.

something like let map = {{ x:7, y:13 }};

where the double-brackets promise you're only going to do JSONish stuff in there

    Because the JSON grammar is much simpler than
    JavaScript’s grammar, JSON can be parsed more
    efficiently than JavaScript.
Hmm.. shouldn't that hold for most programming languages then?

Let's try it for PHP:

    time php -r 'for ($i=0;$i<10000000; $i++) $data = [1,2,3];'
    real 0m0,173s
    user 0m0,161s
    sys  0m0,012s

    time php -r 'for ($i=0;$i<10000000; $i++) $data = json_decode("[1,2,3]");'
    real 0m4,125s
    user 0m4,120s
    sys  0m0,005s
So for 10 million repetitions, a small PHP structure is about 20x faster than parsing JSON. But to test the point of the article, one should use the same data structure it uses (https://raw.githubusercontent.com/WebKit/webkit/ffdd2799d323...) and parse it only once.

Better test:

    time for f in `seq 1000` ; do php -r '$data=[1,2,3];';  done
    real 0m11.169s
    user 0m6.765s
    sys  0m4.477s

    time for f in `seq 1000` ; do php -r '$data=json_decode("[1,2,3]");';  done
    real 0m9.997s
    user 0m6.093s
    sys  0m3.974s

Awesome. So it does hold true for PHP as well.

I believe you are doing it wrong here, since the code is parsed only once and then executed in a loop in the first sample. If you parsed one array with 10000000 elements it should be faster.


> shouldn't that hold for most programming languages then?

Only for interpreted languages that have a native implementation of the JSON parser (that is, the JSON parser isn't itself interpreted).

But you're not parsing the array 10000000 times in the first example. Use `eval`.

Not sure if eval is the right choice, as the article compares parsing JSON to using a native JS structure as well.

The correct comparison would be to try it on the 7MB json data the article is based on.

The possible syntax for Object literals is a lot more complicated than what JSON can contain. Hence the JSON parser is faster than the JS parser for the same object.

JS object literals allow such fun things as:

    const obj = {
        "foo": "bar",
        foo1: function () { return "bar" },
        foo2: () => "bar",
        foo3 () { return "bar" },
        get foo4 () { return "bar" },
        ["foo5"]: () => "bar"
    };

Please don't quote with code blocks. For mobile users:

> Because the JSON grammar is much simpler than JavaScript’s grammar, JSON can be parsed more efficiently than JavaScript.

Your example compares parsing 1 time with parsing 10000000 times.

PHP does not re-parse code during execution.

    $ time php -r 'for ($i=0;$i<10000000; $i++) $data = eval("return [1,2,3];");'

    real 0m10.568s
    user 0m10.539s
    sys 0m0.028s

This completely depends on the implementation of JSON in your language. That happens to be heavily optimized in JavaScript for obvious reasons.

A lot of the overhead in there is the call to the function json_decode. Function calls are much more expensive than operators in PHP. So this really isn't apples-to-apples.

The differences in data structures between the two languages may also play a significant role, what with PHP using complex z-val structs behind the scenes for almost all data types.

PHP doesn't run inside Chrome's JS engine as far as I know... probably apples and oranges here.

try json_decode(..., true) to get a similar data structure?
