Lazy deserialization (v8project.blogspot.com)
65 points by AndrewDucker 65 days ago | 20 comments

> A snapshot contains everything needed to fully initialize a new Isolate, including language constants (e.g., the undefined value), internal bytecode handlers used by the interpreter, built-in objects (e.g., String), and the functions installed on built-in objects (e.g., String.prototype.replace) together with their executable Code objects.

This sounds suspiciously like the Smalltalk image. The Smalltalk undefined value was involved in some paradoxes, therefore it couldn't be completely defined/instantiated declaratively.

I wonder if the (de)serialization mechanism used here could be re-targeted in a manner resembling the Parcel technology developed for VisualWorks Smalltalk? Basically, the runtime state of an application could be serialized into a "parcel," which could then be rapidly deserialized and more or less directly injected into the runtime image.

> This sounds suspiciously like the Smalltalk image.

Lars Bak was a major contributor to both V8 and Strongtalk.

> The Smalltalk undefined value was involved in some paradoxes

Could you elaborate on this? Or provide some references? This sounds interesting, but when I tried DDG and Google for "smalltalk undefined paradoxes" the results were not exactly satisfactory.

nil was the sole instance of the class UndefinedObject, which was a subclass of Object, whose superclass evaluated to nil.

Another paradox: Every instance has a Class. A Class is also an instance of a Class. That object also has a Class, and that instance of a Class also has a Class, and so on. It was turtles all the way down.
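For anyone who wants to poke at this kind of knot without a Smalltalk image handy, Python has a loosely similar one. This is only an analogy, not a description of how Smalltalk resolves it: in Python the "class of the class of ..." chain does not go down forever, it runs into `type`, which is its own class, and the root class has no base. The helper name `class_chain` is made up for illustration.

```python
# Analogy in Python, NOT Smalltalk: follow the "class of" relation repeatedly
# and observe that it ends in a cycle instead of an infinite regress.

def class_chain(obj, limit=5):
    """Return the names of type(obj), type(type(obj)), ... for `limit` steps."""
    chain = []
    current = obj
    for _ in range(limit):
        current = type(current)
        chain.append(current.__name__)
    return chain

# None is the sole instance of NoneType, roughly like nil and UndefinedObject.
print(class_chain(None))

# The chain is tied off by making `type` an instance of itself.
print(type(type) is type)

# The root class has no base class (Python reports None here).
print(object.__base__)
```

So the regress is cut by a deliberate cycle at the top rather than by an endless tower of classes.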

hrmmm... sounds like a coinductive description https://en.wikipedia.org/wiki/Coinduction

(i've done a bit of modelling using coinduction, and it has a very funky but precise relationship with immutable object oriented program descriptions)

Good work - but can we just take a minute to look at this...

> Over the past two years, the snapshot has nearly tripled in size, going from roughly 600 KB in early 2016 to over 1500 KB today.

1.5MB per snapshot... per tab effectively! It's crazy how wasteful we've become.

I have thought for a while, and continue to think, that developers avoid "premature" optimization too fiercely. Yes, there are diminishing returns with optimization effort, but too often people interpret "avoid premature optimization" as "never optimize unless it feels slow, and even then only if it feels slow when it's the only thing running. Otherwise, blame everything else that is running first!"

Not to mention that the "never optimize until it feels slow" judgement calls are usually made on MacBook Pros with fast internet.

Oh absolutely. I have been noticing this is starting to happen with SSDs as well. A lot of modern games run awful on mechanical drives.

that's unfortunately because we are pushing fancier and fancier models with higher and higher resolution textures

Indeed, Knuth's quote on this is immediately followed by a reminder not to neglect the critical minority of code where optimization does matter.

Sadly, the promise of zero cost abstraction is a huge siren call. And not likely to change anytime soon.

"Zero cost abstraction" is like a sarcastic joke at this point in web development. If you pull up performance comparisons for web backends, a lot of the popular ones based on interpreted languages are absolutely abysmal when compared to c++ or Java code (Node is a good example). Many definitely have streamlined development workflows and have nice, high-level abstractions, but at a cost to performance. I don't mean that Node is useless or something, sometimes it makes sense, but it still forces you to compromise.

Front-end frameworks are even worse. A lot of older (but not "ancient") PCs are unusable on the modern web because of poorly optimized JS or Adobe Flash (a decent portion of this issue is also due to the inherent inefficiency of JS and Flash). Fortunately, Google has been making strides with V8, Mozilla did awesome with Firefox "Quantum" and everyone is slowly ditching Flash, but performance still seems to be an ever-present issue.

Fully agreed. I might extend it beyond web.

It does seem to be getting comical.

I vaguely remember Lars Bak saying ~8 years ago that the snapshot size was 50 KB (at the time).

Couldn’t they do this with a copy-on-write initial heap that all the engine instances clone?

I think that's pretty close to what they're doing, really. Though usually COW refers to doing things on a page-by-page basis, which probably would not be as effective: it would copy entire pages when only a little was needed, so unless you were careful to put related functions close together, you'd copy a lot more of the heap than you really want.
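To put rough numbers on the page-granularity point, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration: the 4096-byte page size, the 100-byte function size, the two heap layouts, and the helper name `pages_touched`. It says nothing about V8's actual heap layout.

```python
# Hypothetical model: how many bytes does page-granular copying move when
# only a few small objects are actually needed?

PAGE_SIZE = 4096  # assumed page size in bytes

def pages_touched(objects, touched):
    """objects maps name -> (offset, size); returns page indices that
    overlap any of the named objects in `touched`."""
    pages = set()
    for name in touched:
        offset, size = objects[name]
        first = offset // PAGE_SIZE
        last = (offset + size - 1) // PAGE_SIZE
        pages.update(range(first, last + 1))
    return pages

needed = ["f0", "f1", "f2", "f3"]

# Layout A: each 100-byte function sits on its own page.
scattered = {f"f{i}": (i * PAGE_SIZE, 100) for i in range(4)}
# Layout B: the same functions packed back to back.
packed = {f"f{i}": (i * 100, 100) for i in range(4)}

bytes_needed = sum(size for _, size in packed.values())
scattered_copy = len(pages_touched(scattered, needed)) * PAGE_SIZE
packed_copy = len(pages_touched(packed, needed)) * PAGE_SIZE

print(bytes_needed, scattered_copy, packed_copy)
```

In this toy layout, the same 400 bytes of functions cost four pages when scattered and one page when packed, which is the "be careful to put related functions close together" caveat in numbers.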

You're probably right though. I bet they end up closer and closer to that. Possibly even sharing pages that can't be modified, like code objects. That's what the last sentence or two seems to allude to. Eventually, they'll pretty much be reinventing shared libraries for the JavaScript world. Not that that's a bad thing.

Shared libs with security-related side-channel leakage?

@dang Could this be replaced with the non-mobile link?


Updated. Thanks!

Whoops! Forgot you're sharing the load with dang now. Thanks!
