Introducing the B3 JIT compiler (webkit.org)
213 points by basugasubaku on Feb 15, 2016 | 128 comments



Really cool article. Posts like this always make me wonder what the state of programming would be if browsers hadn't sucked up almost all of the world's compiler optimizers.


Look at what Mike Pall did with LuaJIT2 - I assume if the world wasn't so focused on the web, we would see more of it in other languages. But really, things aren't that bad. Microsoft has enough good people working on RyuJIT, PyPy has some of the best people advancing metatracing JITs, and Mike Pall is a god among men.


The Father, The Son, The Holy Ghost, and Mike Pall.


To be fair, GPU vendors sucked up a ton too.

But considering that optimized scalar code performance has moved, what, maybe 40% over the last two decades, I'm going to say "not much". Compilers are sexy, but they're very much a solved problem. If we were all forced to get by with the optimized performance we saw from GCC 2.7.2, I think we'd all survive. Most of us wouldn't even notice the change.


I'd disagree. Classical compiler work is very mature, yes - things like register allocation and backend IR-based optimization are well-trodden ground.

But in the context of JIT compilers for dynamically typed languages, in particular the space involving runtime inferred hidden type models, there is a TON of work left on the table.

It hasn't received much attention in academia, IMHO largely because of a historical perspective among language theorists that optimizing dynamic languages is "not classy" work. I hope that perspective changes over time.


> Compilers are sexy, but they're very much a solved problem.

Not for all of the other widely-used languages that still have incredibly simple interpreters. Think how much energy could have been saved if Ruby, Python, and PHP were all as fast as your average JS engine.


> Think how much energy could have been saved if Ruby, Python, and PHP

I would think we'd see even better improvements if developers would move towards statically typed languages as well.


Exactly. Ruby would probably have massive adoption with JS-like speed.


My implementation of Ruby, JRuby+Truffle, is as fast as V8:

http://stefan-marr.de/downloads/crystal.html


Note that usually being "as fast as" a production JSVM means also proving that you can start up as fast as JSVMs do. Have you done this?


Search around for "Substrate VM". I see it referenced in some slide decks, and it's designed to make JVM startup much faster.

Here's an old slide deck that talks about it: http://www.oracle.com/technetwork/java/jvmls2013wimmer-20140...


Wow, thanks for the reminder - I had forgotten about it already. I remember the promise of Truffle + Graal + Substrate VM.

My god, I can't believe it was 3 years ago that I read about it on HN.


I know about it. I'm not concerned with future hypotheticals, but current actual measurements.

That's the currency I deal in.


There seem to be measurements in the slides I referenced, so it wasn't exactly "future work". But of course it sucks that this work is not yet publicly available (AFAIK), probably because it's not 100% production-ready yet. But I'm sure it's getting there.


Ruby has massive adoption, as do Python and Perl and PHP. Environments with objectively better performance like the JVM and .NET have not, in fact, done all that well comparatively (which is to say they've done fine too and achieved "massive" adoption, just not that much better than their slower competitors).

In fact, looking at the market as it stands right now I'd say that performance concerns are almost entirely uncorrelated with programming environment adoption.


Considering the massive speed improvements we have been seeing in JavaScript, but also in languages like Ruby, I'd say compilers are a solved problem for statically compiled languages, but perhaps not as much for interpreted, highly expressive languages.


Sort of. Lisp had a good combo of expressiveness and speed in the 80s, but it reached that point on the efficiency frontier by different techniques, like heavier use of macros. The newer techniques like trace compilation can make life even better, but the language design decisions in Ruby/Python/etc. that made that sort of thing necessary for speed didn't really pay for themselves, from the perspective of smug Lisp weenies like me who were happy enough with our language and just wanted pragmatics like libraries.


Did you just call Javascript a "highly expressive" language? Really?!?


LOL! It is expressive though. It's got really powerful first-class functions. It's got classes. It's got prototypes. It's got generators. It's got other things that I don't even remember (but will probably have to learn, to implement them, make them fast, and then fix the bugs).

I actually think that the reason why JS is so odd is that it is so expressive. That tends to happen with kitchen sink languages like C++.

Relevant: "There are only two kinds of programming languages: those people always bitch about and those nobody uses."


Cool. "We took a dynamic language and made it fast(er again)".

And you're gonna apply that to the darkest corners of ES6? Whee, language theory nerd paradise!

I can't wait for the compilers targeting that using all sorts of really odd idioms that in effect work like fuzz testing.

https://www.destroyallsoftware.com/talks/the-birth-and-death...


More expressive than a less imaginative subset of C. More expressive than assembler. Much less expressive than any decent high-level language.

Javascript is low level, primitive and clumsy. There is absolutely no excuse for it being so crappy.

If you still think it is "expressive", you've never seen an expressive language.


> If you still think it is "expressive", you've never seen an expressive language.

Are you aware who Filip Pizlo is?


Of course - a low-level guy. It is quite likely that he has either not been exposed to any expressive languages or rejects them for ideological reasons (JikesRVM is quite a strong symptom of the latter case).


I have no idea what the Jikes RVM has to do with "ideology". The point is that you're trying to tell someone who has a significant academic and industrial PL pedigree that they don't know what an expressive language is.

You seem to have an intense dislike of JavaScript. I'll be the first to admit that JavaScript has a lot of problems. But not being expressive enough is not one of them. JS is, if anything, too dynamic and expressive.


Yes, if he thinks that JS is expressive, he does not know what expressive is. And his track record clearly lacks anything related to any expressive language. I can only explain it with ideology, because ignorance is not an option here.

And, no, JS is not expressive. It has strict limits on the available level of abstraction (unless you go into eval, of course).


ajross is right. People choose the languages they like regardless of performance.

JS perf is important to users though. Faster execution means fewer watts spent rendering and interacting with your favorite web page.

(Fun fact: B3's backend contains a machine description language that gets compiled to C++ code by a ruby script, opcode_generator.rb. We use Ruby a lot.)


Why didn't you use JavaScript for that purpose (in a similar vein to how LuaJIT uses Lua for dynasm)?


I'm not a big fan of self-hosting. I like that you can build JavaScriptCore without using JavaScriptCore.


I wasn't really referring to self-hosting. I was wondering whether it made sense to write tools like JavaScriptCore's offlineasm in JavaScript rather than Ruby (as LuaJIT does by using Lua for its dynasm tool). The LuaJIT build process actually builds a cut-down copy of Lua for the purposes of running dynasm, which is then used to build parts of LuaJIT.


We spend time optimizing JavaScriptCore's build time. It's a hard problem and it's important when managing such a large project. People inevitably have to do clean builds and that sucks when you have >400k lines of C++ code.

So, naturally, we want to avoid building JSC twice. ;-)


Suppose that you are developing not to push this platform, but simply to do better on this platform. What does self-hosting gain you in that case?


Lately quite a few apps are saving energy by moving from Ruby/Python/PHP to Go.


And Go proves munificent's point: it doesn't have many compiler optimizations either. (This may change with the WIP SSA backend, but the point remains that Go gained huge popularity in spite of having a non-optimizing compiler.)


Yeah, more interesting is having them rediscover Turbo Pascal compile speeds.

EDIT: I wonder why pointing out the positive effect of re-discovering that not all compilers need to have C and C++ compile speeds, and that this was once upon a time mainstream, is worthy of downvotes.


Can't critique this one: Go was an attempt to re-create the Oberon experience in a modern setting with some additions from other languages. Rather than accidental re-discovery, getting Oberon (not Pascal) speed out of the compiler was an explicit design goal. One of the few examples of modern IT really learning from the past.

Unfortunately, they didn't learn about the stuff between Oberon and 2007 that would've been nice to have in a modern, app language. ;)


It is not a critique, though apparently it was understood as such.

The remark is aimed at those who think compiled languages can only compile as slowly as C and C++ do, since they never used anything else, and then jump for joy when they use Go.

Yet if it wasn't for the VM detour of the last 20 years, that experience would probably be a current one, instead of being re-discovered.


"The remark is tailored to those..."

Ahh. Ok.

"Yet if it wasn't for the VM detour of the last 20 years"

You mean the C++ and VM detours? ;)


I mean having the JVM not adopt the tooling model of Eiffel, OCaml and others where you get to choose bytecode/JIT for developing and AOT compilation for release builds.


Oh i hear you. That makes sense. I wanted all of them to do incremental compilation or interactive development for build then AOT for deploy. We're on same page there.


For the curious (I am sure pjmlp is well aware of this), D and Go compile speeds are pretty neck and neck. DMD was faster, then Go caught up; not sure which one is faster now, it depends. If I may add, D is a lot less impoverished than Go, although its coroutine story is not that strong.


Are you claiming Go runs at the speed of Ruby/Python/PHP etc.? Because from what I've read, migration to Go from the above-mentioned languages led to a lot of hardware / memory savings. I'd think Java can certainly be considered as having a highly optimized compiler / runtime, but it has almost the same performance as Go and much higher memory usage compared to Go.

http://benchmarksgame.alioth.debian.org/u64q/go.html


> Are you claiming Go runs at the speed of Ruby/Python/PHP etc?

No. I'm claiming it doesn't perform anywhere near the level of optimization of GCC and LLVM.

> I'd think Java can certainly be considered having highly optimized compiler / runtime. But it has almost same performance as Go and much higher memory usage compared to Go.

I disagree with what seems to be your implication that compiler optimizations don't matter (and almost everyone else who works on compilers would also disagree), but I don't really want to turn this thread into a critique of the benchmarks game, so let's just leave it at that.


> … and much higher memory usage compared to Go.

Please don't jump to the conclusion that huge differences in the default memory allocation of tiny 100 line programs means there will be similar huge differences between ordinary large programs.

Notice that even for those tiny 100-line programs, the difference can be more like 2x when memory actually needs to be allocated for the task.


"Compiler optimization" in this thread is referring to speeding up compilation time†, not runtime of generated code. Go produces fast code, but the go compiler does not generate that code particularly quickly.

† Which can be seen as a runtime cost for interpreter+JIT languages, but that's a different issue. We're talking time-to-steady-state when the interpreter is fed a file (which is something Javascript engine devs worry about a lot), not benchmark-speed-at-steady-state.


No, it's the opposite. The Go compiler generates medium quality code but compiles fast.


If Ruby, Python and PHP would suddenly disappear, even more effort and energy would have been saved.


    Compilers are sexy, but they're very much a solved problem.
This may be true for sequential languages, but is very much false for the compilation of concurrency and parallelism. It's basically not known how to do this well. Part of the problem is that CPU architectures with parallel features have not yet stabilised.

For sequential languages the problem has shifted: how can I get a performant compiler easily? The most interesting answer to this question is PyPy's meta-tracing, and that work is from 2009, and far from played out.


> Most of us wouldn't even notice the change.

A 40% decrease in optimization is enough to drop framerates from 60fps to 30fps easily, so I'm pretty sure we would notice it.


> optimized scalar code performance has moved, what, maybe 40% over the last two decades

I'm not convinced. Raw single-thread number-crunching performance has improved somewhere around _two to three fold_, clock-for-clock, on Intel x86, over that of 10-15 years ago. What methodology do you use to attribute only a fraction of those gains to language optimizers? And even if you are correct, why is it meaningful? Who is going to have invested energy in optimising the shit out of mundane codegen when hardware performance will have just come and stolen your thunder a few months later?

The problem we have now is that CPUs are gaining ever more complex behaviour, peculiarities, and sensitivities. I'd say compiler engineering is far from a "solved problem", even for statically-typed languages.


> The problem we have now is that CPUs are gaining ever more complex behaviour, peculiarities, and sensitivities.

With mainstream CPUs, exactly the opposite is happening. CPUs are getting more complex under the hood, but less sensitive to code quality. For example, a lot of the scheduling hazards in the P6 microarchitecture have been eliminated in subsequent iterations. Branch delay slots are a thing of the distant past, so are pipeline bubbles for taken branches, indirect branch prediction is extremely capable, even the penalty on unaligned accesses is minimal.


Well, sure, but SIMD more than compensates for all of that, given how hard autovectorization is. In fact, I think with things like AVX and NEON becoming ubiquitous, you can get more benefit out of writing in assembly (or intrinsics) than any time I can think of in the past 10 years.
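For a flavor of what that hand-written code looks like, here is a minimal sketch using AVX intrinsics (hypothetical function; it assumes n is a multiple of 8) - the kind of loop an autovectorizer may or may not produce on its own:

    #include <immintrin.h>

    // Add two float arrays 8 lanes at a time with AVX.
    void add_arrays(float *out, const float *a, const float *b, int n) {
        for (int i = 0; i < n; i += 8) {
            __m256 va = _mm256_loadu_ps(a + i);
            __m256 vb = _mm256_loadu_ps(b + i);
            _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
        }
    }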


> Compilers are sexy, but they're very much a solved problem

No they're not, and won't be for long (ever?). However it does not matter because they are "good enough".

Compilers are driven by heuristics which provide "reasonable" results in most cases for common architectures. But they still leave a lot on the table. Compiler writers have to trade compile time against execution time. Now we're not talking about an order of magnitude, but rather ~20%-30% in some workloads. When it matters (I guess for people like Google/Facebook/Amazon/... it translates into the electricity bill and the number of racks to add to the datacenter), people may have to get down to the assembly level for a very small (and hot) part of the program.


That seems a bit chicken-and-the-egg-y though: if the web didn't become a global phenomenon then there would be far less interest in improving the browsers, far less business need for programmers, and far fewer people working on whatever they'd be working on if they weren't working on browsers.


They're solving a non-issue. The rest of the world is perfectly fine with the statically typed, well designed languages that are easy to compile. And only the web world is so obsessed with smart compilers compensating (impressively, but still far from being sufficient) for multiple deficiencies in the language design.


> The rest of the world is perfectly fine with the statically typed, well designed languages that are easy to compile.

Python, Ruby, PHP, and Perl aren't "the rest of the world"? As far as compilers are concerned, all of those languages have more troublesome semantics than JavaScript does.

> And only the web world is so obsessed with smart compilers compensating (impressively, but still far from being sufficient) for multiple deficiencies in the language design.

You have no idea how much compilers have to compensate for the deficiencies in C and C++'s design.


Nobody really cares about their performance. They're just fine with their simple interpreters. Web is different, there is no choice, no fallback to C.

And no, thank you kind sir, but I've got a very good idea of what compilers are doing wrt. C deficiencies, I was writing OpenCL compilers for 6 years at least. Besides aliasing stupidity and byte-addressing there is nothing really bad to compensate for.


> Web is different, there is no choice, no fallback to C.

asm.js and Web Assembly.

> Besides aliasing stupidity and byte-addressing there is nothing really bad to compensate for.

Aliasing issues, C++'s heavy reliance on virtual methods, too many levels of indirection in the STL, overuse of signed integers (due to "int" being easier to type) interfering with loop analysis, const being useless for optimization, slow parsing of header files...


All of this!

For example in JS, once you prove that "o.f = v" is not going to hit a setter (or you put a speculative check to that effect), then you know that this effect does not interfere with "v = o.g" (provided you check that it's not a getter). JS VMs are really good at speculating about getters and setters. The result is that alias analysis, and its clients like common subexpression elimination, are super effective. It's only a matter of time before we're creating function summaries that list all of the abstract heaps that a procedure can clobber so even a function call has bounded interference.

This is totally different from C. There, making sure that accesses to o->f and o->g don't interfere is like pulling teeth. And the user can force you to assume that they always interfere by using -fno-strict-aliasing, which is hugely popular.
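A tiny sketch of that from the compiler's point of view (made-up function, not WebKit or LLVM code): because dst and src could point to the same int, the load of *src has to be redone after every store.

    void scale(int *dst, const int *src, int n) {
        for (int i = 0; i < n; i++)
            dst[i] = *src * 2;   // *src reloaded every iteration unless the
                                 // compiler can prove dst and src never alias
    }

C's restrict qualifier, or type-based alias analysis when the pointee types differ, is the usual escape hatch - and -fno-strict-aliasing takes the second one away.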

The fallback-to-C paths on the web aren't that great, though. At least not yet. Once you fall back to C, you're sandboxed into a linear heap with limited access to the DOM and JS heap. But we'll fix that eventually. :-)


Web Assembly is not even there yet, and asm.js up until recently was more of a toy and a standalone runtime; I have not seen it being routinely used as a fallback, nothing like, say, Python with C modules.

As for virtual methods, it is a problem of a bad C++, devirtualisation may help, but nobody cares in general. We have a cool curiously recurring template pattern instead.
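For those who haven't seen it, a minimal sketch of the pattern (illustrative types only) - the "virtual" call is resolved at compile time, so there is nothing left to devirtualise:

    template <typename Derived>
    struct Shape {
        double area() const {
            // Statically dispatched: no vtable involved.
            return static_cast<const Derived *>(this)->areaImpl();
        }
    };

    struct Circle : Shape<Circle> {
        double radius;
        double areaImpl() const { return 3.14159 * radius * radius; }
    };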

But for the integers you're right. And signedness is still the most annoying source of bugs in the infamous LLVM instcombine.

Headers are a frontend issue, nothing to do with the optimisations.


> Web Assembly is not even there yet, and asm.js up until recently was more of a toy and a standalone runtime; I have not seen it being routinely used as a fallback, nothing like, say, Python with C modules.

That's because (a) page performance is frequently gated on things other than JavaScript, so people don't go through a lot of trouble to write C++; (b) many modules that would be written in C in Python are provided by the browser itself; (c) JS itself is usually fast enough, since the gap between JS and C++ is much less than the gap between Python and C++.

> As for virtual methods, it is a problem of a bad C++, devirtualisation may help, but nobody cares in general.

Huh? Tons of people care about devirtualization! Much of the reason Firefox builds go to the trouble of PGO (and it is a huge amount of trouble) is to get devirtualization.
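To illustrate what devirtualization buys (illustrative types, not Firefox code): once the compiler can prove the dynamic type - here via final, with PGO via observed receiver types - the virtual call becomes a direct, inlinable call.

    struct Node {
        virtual int eval() const = 0;
        virtual ~Node() = default;
    };

    struct Constant final : Node {
        int value;
        int eval() const override { return value; }
    };

    int evalConstant(const Constant &c) {
        // Constant is final, so this call needs no vtable lookup
        // and can be inlined.
        return c.eval();
    }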

> As for virtual methods, it is a problem of a bad C++

So I could say the same thing about JavaScript: if it's slow, you're writing "bad JS". But you would rightly reject that as invalid: if the code people write in practice is slow, then the problem is with the language encouraging people to write slow code. The point is that the same thing applies to C++.


See? The evolution of the language and its ecosystem was driven by the peculiar web needs. No surprise the language became what it is now. No surprise it sucks shit every time it spills out of its native domain.

As for C++ - there is a choice. With JS the choice is much more limited. You either rely on non-standard asm.js behaviour, lose DOM access with Web Assembly, or tolerate the stupidity and limitations of a language that has far outgrown its tiny niche.

Btw., I have yet to hear how a language with a fixed syntax and no macros can be perceived as "highly expressive".


Re: int, isn't it the reverse? I.e. don't compilers find loops using ints easier to optimise because they can assume the counter does not wrap?

Also, what do you mean by levels of indirection in the STL? Pointer chasing or, I suspect, layers upon layers of tiny functions that need to be inlined for reasonable performance?


> statically typed, well designed languages that are easy to compile

Which ones are these? All of the statically-typed and well-designed languages I can think of are, at the least, hard to compile well, if not hard to compile in the first place. (Haskell and Rust both come to mind; there are few Haskell compilers other than GHC, and no Rust compilers other than rustc.) The languages that are easier to compile are either not statically-typed in a useful way, or not well-designed, or both.


SML? Pascal?

Edit: I was really referring to language families, so including things like OCaml, Modula, etc.


Rust is actually quite easy to compile. Ada is easy to compile. ML-like languages are easy to compile. Oberon is trivial to compile.


I've never heard anyone say Ada was easy to compile. Also, I heard its uptake was slowed by how hard it was to get the early compilers working. So, where did you see an easy one? Seriously, because Ada still needs a CompCert (or at least FLINT) style certified compiler. An easy Ada compiler would be a nice start on that.


I always thought that Ada was easy to compile in the backend (i.e. the moral equivalent of what B3 does) but challenging on the frontend because of how large the language is. Could be wrong though.

The frontend is usually the hard part IMO. In WebKit, we have 100,000 lines of code for our "frontend" (i.e. the DFG compiler) and 50,000 lines of code for our "backend" (i.e. B3). That split is really interesting.


It's what I thought, too. The split may come from the fact that the front-end is more complex, richer in information, harder to analyze, and harder to transform. The more you can do with it the more code it takes to represent. That's my hypothesis.

Far as Ada, I figured it would be difficult due to the large number of language features and complexities of analyzing a program for safety w/ all its rules. And any interactions that might emerge in that between features and rules. Could be easier than I think but my default belief is that it's not easy.


I am talking about the backend optimisations, not the frontend parts. The Ada frontend has tons of sweet static information for the backend to consume.


That makes more sense. Did you ever work on any Ada compiler I might have heard of?


Nothing fancy, I was simply implementing a subset of Ada for fun. I find it is the quickest way to learn or explore a language - write a compiler for a subset of it.

The frontend was awful and I never finished it, but the LLVM-style backend was very easy to build.


You are probably right, assuming input programs are correct. Rust (and Haskell) is not easy to _type-check_ though.


You can't compile Rust without types, and inside functions most types come from inference and coercions, so "type checking" is really "type resolution".


> Rust (and Haskell) is not easy to _type-check_ though.

Is that really the case? It seems to me that a Prolog-like resolution system, coupled with a constraint solver, would get most of the job done with little effort.

There are certainly many rules to keep track of, but in some cases the newer rules are strict generalisations of the older rules. For example, Haskell's original type classes are just a special-case of modern multi-parameter, flexible instances/contexts, etc. type classes. Likewise, we can implement constraint kinds, type class instance resolution and type equalities using one constraint solver.


Hindley-Milner is already quite Prolog-like. But you're right, CLP(fd) rocks for typing.

My usual approach to implementing a type system is to derive a flat list of Prolog equations from the AST and leave it to Prolog to solve. If you ask what to do with error messages, I've got a comprehensive answer, but it is not something to type on a mobile phone.


I'd really appreciate further comments on what to do with type errors for prolog/minikanren-like tools as typecheckers


Ok, I've got a proper keyboard now.

My approach to typing error reporting is quite compatible with the Prolog codegen. What I do: for each AST node that produces one or more type equations, I record the location data (cache it and give it an index), and then issue a Prolog term like with_location(Id, Term) for each of the terms generated by that location.

The with_location semantics is simple, but may depend on the presence of backtracking. If it's plain Hindley-Milner, there is no backtracking, and it simply does call/1(Term), continues if true, and stores Id and the failed term with all the current bindings and fails (fail/0) if false.

If there is backtracking, I store a stack of location ids and failure reasons instead, but this required patching the Prolog execution engine a little bit (luckily, I'm using my own Prolog for this).

Now, the next step is to decipher the failed term. Most of the equations are in a trivial form, equals(A, B), so I simply report that prolog_to_type(A) failed to unify with prolog_to_type(B). For the other (rare) kinds of equations I define custom pretty-printing rules.


Thanks, looks nice and pretty straightforward, will try it out


I don't understand this perspective. Is the point that JavaScript just works, such that you don't have to worry about it compiling at runtime?

To me this is the difference between knowing that your program will probably work after it compiles, vs. not knowing until after it's deployed.

What I think people really enjoy in dynamic languages like JavaScript is the instant feedback, just reloading the browser. This can be accomplished with decent IDEs for most statically typed languages.


I don't believe anyone's arguing that the overhead of static checking is not worthwhile, merely that it involves serious work on the part of the compiler developers that should not be dismissed.


While they have sucked up a lot, it is far from certain they would all have been employed optimizing compilers if that hadn't happened. Demand tends to create supply.

Also, many of the improvements done for JS will certainly trickle down to Python, Ruby, PHP eventually.


Good to see the WebKit team (mostly Apple) continue putting serious energy into JS performance. Takes a bit of guts to throw out the whole LLVM layer in order to get compilation performance...

It's also encouraging to see them opening up about future directions rather than just popping full-blown features from the head of Zeus every so often. (Not that they owe us anything... ;-)

(Edit: it's also damned impressive for 2 people in 3-4 months.)


Gutsy move indeed. Though I wonder, what was the development cost in real life cash for gaining those 5% of performance?


3 months. Two people working on it (me and @awfulben).


Nice work, Fil. Looks cool.


Thanks! :-) Looking forward to more write-ups about TurboFan!


Ok, thats impressive. I'll show myself out.



Out of curiosity, did you look at improving LLVM compile times? I didn't get that out of the article.

Awesome write up btw.


As Phil said, the development cost was not that high all things considered. But even if it was - think about the number of user-hours that 5% is leveraged across, and the resulting time savings and battery life savings for those users. It would be worth investing even more than we did for 5%!


Considering there are now 1 billion active iOS devices, that 5% saving is huge!

I wonder if this will make it in time for iOS 10.


tl;dr: B3 will replace LLVM in the FTL JIT of WebKit. LLVM isn't performing fast enough for a JIT mainly because it's so memory hungry and misses optimisations that depend on JavaScript semantics. They got around a 5x compile-time reduction and anywhere from 0% up to around a 10% performance boost in general.


Actually the bigger reason is compile time - better optimizations based on JavaScript semantics are a secondary advantage.


I think that's mostly accurate, in the sense that we wouldn't have done this if it was only motivated by specializing for JavaScript semantics. We had gotten pretty good at having our high-level compiler (DFG) burn away the JavaScript crazy and leave behind fairly tight code for LLVM to optimize.

But as soon as we realized that we had such a huge compile-time opportunity, of course we optimized the heck out of the new compiler for the kinds of things that we always wished LLVM could do - like very lightweight patchpoints and some opcodes that are an obvious nod to what dynamic languages want (ChillDiv, ChillMod, CheckAdd, CheckSub, CheckMul, etc).
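Roughly, a chill opcode is one that never traps. A quick sketch of ChillDiv's semantics in plain C++ (the real thing is a B3 IR opcode, not a helper function, and the edge cases below are my reading of them) - it matches what a JS expression like (a / b) | 0 needs:

    #include <cstdint>
    #include <limits>

    int32_t chillDiv(int32_t a, int32_t b) {
        if (b == 0)
            return 0;                          // x / 0 -> 0, no trap
        if (a == std::numeric_limits<int32_t>::min() && b == -1)
            return a;                          // INT_MIN / -1 -> INT_MIN, no overflow trap
        return a / b;
    }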


But isn't it true that some of the things you ended up doing would make sense for LLVM, or would most of them be invalidated by the kinds of optimization passes that are common in LLVM?

E.g. stuff like making the in-memory IR representation more cacheable certainly sounds like it's all upside, and LLVM should just learn from your project.


LLVM's use of a very rich (and hence not as memory efficient) IR is deeply rooted. Phases assume that given any value, you can trace your way to its uses, users, and owners. The LLVM code I've played with assumes this all over the place, so removing the use lists and owner links as B3 does would be super hard. B3 can do it because we started off that way.
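A rough sketch of the difference, with illustrative structs rather than the real LLVM or B3 classes:

    #include <vector>

    struct BasicBlock;

    struct LLVMStyleValue {
        std::vector<LLVMStyleValue *> operands;
        std::vector<LLVMStyleValue *> uses;   // def-use chain, kept up to date on every mutation
        BasicBlock *parent;                   // owner link back to the container
    };

    struct B3StyleValue {
        std::vector<B3StyleValue *> args;     // just the arguments; a pass that needs
                                              // use information computes it on the side
    };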


There is a tradeoff between having a dense and compact IR and being able to modify it very easily. LLVM tends to focus on the second. Even if there are plans to balance it a bit more, this will take a lot of time (and benchmarking).


what is memory usage like in terms of number of allocations and peak RSS with B3 vs. LLVM?


>> "tl;dr: B3 will replace LLVM in the FTL JIT of webkit. LLVM isn't performing fast enough for JIT mainly because it's so memory hungry and misses optimisations that depend on javascript semantics. They got an around 5x compile time reduction and from 0% up to around 10% performance boost in general." [1]

Is this a knock on LLVM then?

I wonder specifically whether this brings to light any concerns over Swift (another dynamic language, created by the same person who created LLVM as well). [2]

Seems weird that the original creator of LLVM was able to make a dynamic language such as Swift - without any problems.

[1] https://news.ycombinator.com/item?id=11105231

[2] http://nondot.org/sabre/


You're reading it too politically. It says right up in the first paragraph:

> While LLVM is an excellent optimizer, it isn’t specifically designed for the optimization challenges of dynamic languages like JavaScript.

LLVM was, more-or-less, designed around C++ semantics. B3 is not going to be a better C++ compiler than LLVM. LLVM is not going to be a better JS compiler than B3. They're taking different evolutionary paths.


LLVM produces fast code for things that aren't C++ as well. Maybe you could say it's biased towards static languages.

Also, its compilation speed is supposed to be fast in terms of AOT compilation, but not necessarily fast enough for JITing JavaScript in web pages.


Not a knock on LLVM. The conditions are just different. There is no software that is 100% great at every single thing - that's why Windows Embedded CE failed while iOS and Android thrived on phones and mobile devices.

VxWorks is great at driving martian rovers, but would be horrible at running Adobe Photoshop or Maya.

Why does everything either need to be a snub or a boon?

LLVM is great at generating compiled binaries - it just takes resources (time + memory) to do so - in practical terms, the compilation time only affects the programmers and developers running multiple compiles over a work day, working on the codebase itself. For consumers who are running the programs, LLVM makes great optimized binaries.

In a JS engine, the compiler doesn't have access to the source code until a user loads up the page, so it's in essence doing "compilation" on the spot. And in this case compilation memory and time usage matters.

Swift itself is a "compiled" language - meaning that in development, a developer needs to hit that "compile" button in Xcode to generate a binary file which is then distributed and run. In this case, LLVM can use all cores on that big honking Mac Pro or shiny MacBook Pro 15" Retina all it wants - take all 16 CPUs and 32 gigs of RAM and crank it and make a nice binary that comes out optimized to run on an iPad Air.

Of course things like Java conflate the two styles (source code compiled to bytecode which is then run on the JVM) - but in the Java ecosystem there are significant optimizations in both the compile phase (ie javac helloworld.java) as well as just-in-time compilation in the JVM itself.


LLVM is designed for languages that are compiled ahead of time like C and C++. In these cases it's not that important how much time and memory it takes to compile something.

If you're working on a JIT, this matters. A lot. WebKit isn't the first project that has used or worked with LLVM in this manner and abandoned it. PyPy also investigated using LLVM and stopped doing that.

LLVM isn't bad, it just isn't good at solving a problem it wasn't designed for. If anything, I think it's remarkable LLVM has worked out as well for WebKit as it did.


Swift is generally ahead-of-time compiled, so compile speed doesn't matter as much. Many of the core parts of Swift also avoid highly dynamic semantics. The most dynamic parts are when interfacing with Objective-C classes, an area that has not been heavily optimized yet.


Swift is not a dynamic language. It's statically typed.


On the typing front, maybe you'd find this recent paper entertaining: "Is Sound Gradual Typing Dead?" (link below)

The conclusion is that it's mostly dead or worse, because of the performance hit when typechecked code calls untyped code.

(And also because of the details of Typed Racket, which perfectly reasonably translates into ordinary untyped Racket, but that is then mangled by GNU lightning (!).)

But...

"At the same time, we are also challenged to explore how our evaluation method can be adapted to the world of micro-level gradual typing, where programmers can equip even the smallest expression with a type annotation and leave the surrounding context untouched. We conjecture that annotating complete functions or classes is an appropriate starting point for such an adaptation experiment."

http://www.ccs.neu.edu/racket/pubs/popl16-tfgnvf.pdf


This comment, and the parent comment, seem to be conflating "dynamic language" with "dynamically-typed language".


What's a "dynamic language"?


Dynamic languages are (generally) not compiled ahead of time. It is possible to have static typing in a dynamic language in the sense that the code is interpreted at runtime (but the types are set statically in the code), and there is no "binary" that one runs.

It used to be called an interpreted language or a "scripting" language - but I think the vocabulary has shifted so that the word "dynamic" better encompasses what these languages are about.


By that definition, AFAIK, Swift is neither a dynamic language nor dynamically typed.


According to Apple itself, Swift is a compiled language, and there is no use of the word "dynamic" or "dynamic language" on Apple's website:

https://developer.apple.com/swift/

As to whether it is dynamically or statically typed - it's easy to tell - do you need to indicate a variable is an int or a char or a string before you start using it? and do you have to cast the variable to different types when using functions that expect a certain type? If so - it's statically typed. Dynamically typed languages allow you to give a character "5" to a function that expects an integer (ie the number 5) and then automatically convert it, so for example the final result of 5+"5" is int(10). Or when you try to print it, it gives you the string literals "1" and "0" without you having to recast it yourself.

Obviously having dynamic typing makes things easier for humans, but the computer has to keep predicting what the programmer is going to do with that variable, so it has some runtime and compile-time weaknesses in performance and memory usage, as well as less strictness in compile-time checking - which might allow certain types of errors to sneak by. Static typing is very direct - you have to instruct the computer to do every cast from one type to another, etc., and the statically-typed compiler is a sadist - it will fail all your code over and over again until you get all the types right. The difference is extremely noticeable even for a novice programmer.

https://developer.apple.com/library/ios/documentation/Swift/...

says that Swift is "type-safe" (compiler is a sadist and will halt on all type errors) but has type inference so you don't have to explicitly indicate the typing of a variable. I would consider it to be heavily on the statically-typed side though - since it seems the language won't let an integer variable all of a sudden also be a string - you have to cast it properly first.

Not sure where the confusion comes from; probably from people attaching buzzwords to everything that is new, regardless of applicability.

Either way, you are correct, swift is not "dynamically-typed" and neither is it a "dynamic/interpreted" language.


> As to whether it is dynamically or statically typed - it's easy to tell - do you need to indicate a variable is an int or a char or a string before you start using it?

Modern statically typed languages often don't require this, because of type inference, so it's a bad test.
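For instance, a trivial sketch in C++ (any statically typed language with inference would do): the variable below has no type annotation, yet its type is fixed at compile time.

    #include <string>

    int main() {
        auto total = 1.5 * 4;                 // statically deduced as double
        // total = std::string("oops");       // would be a compile-time error
        return 0;
    }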

> and do you have to cast the variable to different types when using functions that expect a certain type?

Modern statically-typed languages may not require you to do this (e.g., Scala if the types involved have appropriate implicit conversions defined.)

> Dynamically typed languages allow you to give a character "5" to a function that expects an integer (ie the number 5) and then automatically convert it, so for example the final result of 5+"5" is int(10).

That's the canonical example of weak typing; many (maybe most) dynamically-typed languages do not do this type of conversion.

A better test for a dynamically-typed language is what kind of error is produced by sending a value of an unexpected type (treating, as expected, anything that can be handled by the applicable implicit conversion rules) to a function: if it is a compile-time error, the language is statically typed. If it is a runtime error, it is a dynamic language.


> Dynamically typed languages allow you to give a character "5" to a function that expects an integer (ie the number 5) and then automatically convert it, so for example the final result of 5+"5" is int(10)

This really has nothing to do with the dynamic/static axis. Many definitely dynamically typed languages (Python, Lisp variants, etc.) do not allow this kind of conversion. This is just poor language design.

Often languages which allow this kind of implicit conversions are called weakly typed, but that's another very ill defined term. I like "stringly" typed.


It's worth noticing that most of the optimizations here are for "space" - reducing the working set size or the number of memory accesses. CPUs have gotten much faster than memory, blah blah. This is the sort of thing where microbenchmarks may mislead you, because your WSS is probably not realistic.

I think we don't have great tools for helping with this sort of optimization. One can use perf to find cache misses, but that doesn't necessarily tell the whole story, as you might blame some other piece of code for causing a miss. Maybe I should try cachegrind again...


Cool stuff! Does anyone know why the geometric mean is used for averaging benchmark scores rather than the usual arithmetic mean?


I would say that geometric mean is the usual way of averaging benchmark scores. It has the property that a given relative speedup on a component benchmark always has the same effect on the aggregate score. With an arithmetic mean the component benchmarks with a longer runtime will dominate the aggregate. Normalizing the results before applying the arithmetic mean doesn't really help either -- the first X% improvement to a component benchmark would still be valued more than the second X% speedup.
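A quick numeric illustration of that property (made-up scores): doubling either score multiplies the geometric mean by the same factor, sqrt(2), while the arithmetic mean moves by wildly different amounts depending on which score you improved.

    #include <cmath>
    #include <cstdio>

    int main() {
        double a = 100.0, b = 10000.0;                // two benchmark scores
        std::printf("%.1f\n", std::sqrt(a * b));      // geometric mean: 1000.0
        std::printf("%.1f\n", std::sqrt(2 * a * b));  // double either score: ~1414.2
        std::printf("%.1f\n", (2 * a + b) / 2);       // arithmetic mean: 5100.0
        std::printf("%.1f\n", (a + 2 * b) / 2);       // vs 10050.0 if you double the big one
        return 0;
    }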


Interestingly, many of their complaints around pointer chasing, etc., are things LLVM plans on solving in the next 6-8 months. I'm a bit surprised they never bothered to email the mailing list and say "hey guys, any plans to resolve this" before going and doing all of this work. But building new JITs is fun and shiny, so ...


LLVM instruction selection is slow; there is a "fast path" which hasn't received much attention (it is only used for -O0 in clang). The new instruction selector work has just started and will take a couple of years. Consider the tradeoff between spending 3 months on it and waiting a few years for LLVM to be improved (without any guarantee of LLVM reaching the same speed as what they did). See also some thoughts from an LLVM developer on optimizing for high-level languages: http://lists.llvm.org/pipermail/llvm-dev/2016-February/09546...


The problem with the path they've taken is it has a finite end.

This is why, when they compare it to v8/etc, it's kind of funny. They all have the same curve.

Basically all of these things, all of them, end up with roughly the same deficiencies once you cherry-pick the low-hanging fruit[1], and then they stall out, and get replaced a few years later when someone decides thing X can't do the job and they need to write a new one. None of them ever get to a truly good state.

Rinse, wash, repeat.

The only way these things make real progress is by doing what LLVM did - someone works on them for years.

Let me quote a former colleague at IBM - "there is no secret silver bullet to really good compilers, it's just a lot of long hard work". If you keep resetting that hard work every couple years, that seems ... silly.

TL;DR If you really believe they've totally gotten everywhere they need to be in 3 months, I've got a bridge to sell you.

[1] For example, good loop vectorization and SLP vectorization is hard.


Danny, you mentioned loop and SLP vectorization. One thing that bothered me while watching the development of the SLP and Loop vectorizers over the last two years was that developers who cared about SPEC added functionality that increased compile time. I remember one change that added a second phase that scans the entire IR in the SLP vectorizer to get a 1.4% win in one of the SPEC programs. I hoped the FTL project would be able to enable the vectorizers, but to my disappointment the vectorizers are too slow to justify the boost on javascript workloads.


I imagine they did, considering the head of LLVM is a (probably distant) coworker of theirs.


"The head of LLVM" - no such thing exists, but okay. (People have such weird ideas about how these projects work in practice).


I realise that, but you understand the intent of the comment, which is the point.


tl;dr

https://webkit.org/blog-files/kraken.png

https://webkit.org/blog-files/octane.png

seriously though, dang, how many years of coding to get to that level of expertise


I'm really wondering about the politics behind all this. I mean both LLVM and WebKit are Apple projects (even though they're Open Source). So it would have been reasonable to expect an improvement of LLVM instead of ditching it altogether.


I highly doubt there are any politics behind it. LLVM is an AOT C/C++ compiler at its core, and the tradeoffs it makes don't always make sense for dynamic, JIT compiled languages with extreme emphasis on compilation speed like JS. Personally, I expected this to happen.


Specialising LLVM for dynamic compilation might well come at the expense of LLVM's strengths as an AoT compiler, and it would probably not be easy in any case.

In addition to that, dedicated implementations can take various shortcuts to make their job easier - there are some nice examples given in the link. LuaJIT is an example of a compiler project that benefits from being heavily specialised to a particular job, to remarkable effect.


If you read the technical explanation you will see that there are zero politics behind this.



