ARM chips have an instruction with JavaScript in the name (stackoverflow.com)
406 points by kdeldycke 5 days ago | 295 comments

If anybody else was curious, the performance win from this instruction appears to be about 1-2% on general JavaScript workloads: https://bugs.webkit.org/show_bug.cgi?id=184023#c24

I would argue a solid 1-2% can get you a promotion at HW companies. Stack up five of these improvements and that's a generational improvement.

1-2% for a single language at the cost of polluting an entire instruction set.

Would you say the same if the language was C or C++?

Yes, this is necessitated by some of JS's big warts, but the sheer amount of JavaScript in existence weighs heavily when considering the trade-offs. You cannot ignore HTML/JS if you're targeting UI applications; it is table stakes.

Yeah, but isn't the whole point of ARM to be a reduced instruction set? How reduced are we, really, if we're dedicating transistors to the quirks of a single language?

RISC is a misleading name: the concepts behind its design are not really about a "Reduced Instruction Set" as in "small" per se, nor do CISC machines necessarily have a large instruction set.

It is much more about the design of the instructions. RISC instructions generally take a small, fixed amount of time and are conceptually based on a sort of minimum unit of processing, with a weak to very weak memory model (delay slots, pipeline data hazards, required alignment of data, etc.), with the compiler/programmer combining them into usable higher-level operations.

CISC designs, on the other hand, happily encode large, arbitrarily complex operations that take unbounded amounts of time, and have very strong memory models (x86 in particular is infamous here: you can pretty much safely access memory, without any alignment, at any time; even though the result will often be slow, it won't crash).

As an example, the PDP-8 has fewer than 30 instructions but is still definitely a CISC architecture; some ARM variants have over 1000 instructions but are still definitely RISC.

RISC is about making building processors simpler, not about making instruction sets and programming with them necessarily simpler.

Well, it's not like x86 unaligned access was some cute thing they did just to make it extra complex. It made complete sense in 1976 when designing an ISA that might support implementations with an 8 bit memory bus and no cache. Why eat more bus cycles than you need?

Fast forward a decade or so, and all processors had caches and multi-byte memory buses, so unaligned access and compact instruction streams were no longer necessary.

But processors these days are complex multi-core beasts with IEEE FPUs, SIMD units, an MMU, memory controller, cache controller, PCIe northbridge, all kinds of voltage/thermal interplay, and even an iGPU. The ISA is overemphasized.

When the ARM instruction in question does something the software is perfectly capable of doing itself, just to score a 1-2% performance improvement, it is definitely CISC based on the definitions you listed above.

This JavaScript-related instruction is a bitwise operation on a single value; whatever its internal complexity, it is still a simple operation. CISC-ness generally relates more to accessing memory or interacting with the hardware in complex ways.

From the definition given:

> CISC designs on the other hand happily encode large, arbitrarily complex operations that take unbounded amounts of time, and have very strong memory models

The operation it accelerates is very simple; it's just not really required much outside of JavaScript. It is already identical to an existing instruction, just with slightly different overflow behavior that JavaScript relies on.
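For the curious, the conversion it accelerates is ECMAScript's ToInt32 (the semantics behind JS's `|0` operation), which truncates toward zero and wraps modulo 2^32 rather than saturating. A quick sketch:

```javascript
// A sketch of the semantics FJCVTZS implements in hardware: ECMAScript
// ToInt32, which truncates toward zero and wraps modulo 2^32, instead of
// clamping to INT_MIN/INT_MAX the way a plain truncating conversion would.
const toInt32 = (x) => x | 0;

console.log(toInt32(3.9));          // 3: truncate toward zero
console.log(toInt32(2 ** 32 + 5));  // 5: wraps modulo 2^32 instead of clamping
console.log(toInt32(NaN));          // 0: NaN quietly becomes 0, no trap
```

Engines hit this conversion constantly when a double-typed value flows into bitwise or array-index operations, which is why a dedicated instruction is worth anything at all.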

Considering the number of phones and tablets in the world, ARM's primary purpose right now, by a large margin, is interpreting web pages.

Consumers and even manufacturers are both focused on performance as a primary metric, not the size of an instruction set.

I'd argue at this point ARM serves more value as an instruction set which isn't encumbered by the mass of x86 patents and historical legal baggage, thus meaning it's something that can reasonably be implemented by more than just two companies on the planet.

It being RISC is secondary to that.

> Yeah, but isn't the whole point of ARM to be a reduced instruction set?

It was the point at, say, the original ARMv1.

Yeah, the problem is not that it's javascript per se, it's that it's a quirk. FCVTZS would have been just fine.

I'd argue that the whole point of ARM (as in ARM holdings) is to make money for their shareholders! :-)

One of the most popular and fastest growing languages

What is the cost of this pollution?

I am genuinely curious, I don't know much about instruction set design.

This uses existing rounding modes with pre-set flags, so it costs 1 entry in the LUT and a small number of muxes, one per flag, assuming the worst case.

I guess it's however many bytes the instruction occupies in L1 cache.

The pollution would be in the chip complexity.

It's not a random esoteric programming language, it's JavaScript.

Is that significant enough to justify the instruction?

A recent AMD microarch was 10-20% faster than the previous one, despite running on the same physical process. No single component was responsible; there were changes to several components, each of which increased speed by only 1-4%.

It is if it also increases battery life by a similar amount ;-)

It increases the die area, power consumption, and instruction set size by a minuscule amount, so probably.

A great question for the folks who actually do CPU design! As somebody who writes JS rather frequently I’m not complaining.

It depends on the cost (in gates and timing); it looks like this might be just a few gates.

Nothing justifies the prolonging of Javascript torture.

Nothing justifies the prolonging of C torture either, except for C's wide spread. Why do you think modern CPUs still expose mostly C-abstract-machine-like interface instead of their actual out-of-order, pipelined, heterogeneous-memory-hierarch-ied internal workings?

>> Why do you think modern CPUs still expose mostly C-abstract-machine-like interface instead of their actual out-of-order, pipelined, heterogeneous-memory-hierarch-ied internal workings?

Because exposing that would be a huge burden on compiler writers. Intel tried to move in that direction with Itanium. It's bad enough with every new CPU having a few new instructions and different timings; the compiler folks would revolt if they had to care how many virtual registers existed and all the other stuff down there.

But why C? If you want languages to interface with each other, it always comes down to C as the lowest common denominator. It's even hard to call C++ libraries from a lot of things. Until a new standard down at that level comes into widespread use, hardware will be designed to run C code efficiently.

> Until a new standard down at that level comes into widespread use hardware will be designed to run C code efficiently.

Exactly this has hindered any substantial progress in computer architecture for at least 40 years now.

Any hardware today needs to simulate a PDP-7, more or less… Otherwise the hardware is doomed to be considered "slow" should it not match the C abstract machine (which is mostly a PDP-7) closely enough. As there is no alternative hardware available, nobody invests in alternative software runtime models, which in turn makes investing in alternative hardware models unattractive, as no current software could profit from it. Here we've come full circle.

It's a trap. Especially given that improvements in sequential computing speed are already difficult to achieve and it's known that this will become even more and more difficult in the future, but the computing model of C is inherently sequential and it's quite problematic to make proper use of increasingly more parallel machines.

What we would need to overcome this is a computer that is built, as was last done many years ago, as a unit of hardware and software developed hand in hand from the ground up. Maybe this way we could finally overcome the "eternal PDP-7" and move on to some more modern computer architectures (embracing parallelism in the model from the ground up, for example).

My favorite article on this: "C Is Not a Low-level Language: Your computer is not a fast PDP-11"

Hacker news post: https://news.ycombinator.com/item?id=16967675

and again: https://news.ycombinator.com/item?id=21888096

GPUs are pretty different, despite the exposed interface being about the same C.

It is not the same C; it is a dialect full of extensions, and it only applies to OpenCL, which was one of the reasons why it failed and forced Khronos to come up with SPIR, playing catch-up with the polyglot PTX environment of CUDA.

OpenCL 3.0 is basically OpenCL 1.2, which is OpenCL before SPIR was introduced.

I don't have words to describe how exciting that would be. The only way I could see it happen is if the existing legacy architecture would be one (or many) of the parallel processes so that efforts to make it run legacy software don't consume the entire project. I really do think it possible to make a "sane" machine language that doesn't need layers of abstraction or compilers and is easy to learn.

>> Especially given that improvements in sequential computing speed are already difficult to achieve and it's known that this will become even more and more difficult in the future...

That's perfect. As performance stops increasing just by shrinking transistors, other options will have a chance to prove themselves.

>> but the computing model of C is inherently sequential and it's quite problematic to make proper use of increasingly more parallel machines.

IMHO Rust will help with that. The code analysis and ownership guarantees should allow the compiler to decide when things can be run in parallel. Rust also forces you to write code that will make that easier. It's not a magic bullet, but I think it will raise the bar on what we can expect.

> If you want languages to interface with each other it always comes down to C as the lowest common denominator

Nope, "always" only applies to OSes written in C and usually following POSIX interfaces as the OS ABI.

C isn't the lowest common denominator on Android (JNI is), on Web or ChromeOS (Assembly / JS are), on IBM and Unisys mainframes (language environments are), on Fuchsia (FIDL is), just as a couple of examples.

CPUs expose a "mostly-C-abstract-machine-like" interface because this allows chip designers to change the internal workings of the processor to improve performance while maintaining compatibility with all of the existing software.

It has nothing to do with C, specifically, but with the fact that vast amounts of important software tend to be distributed in binary form. In a hypothetical world where everybody is using Gentoo, the tradeoffs would be different and CPUs would most likely expose many more micro-architectural details.

> Why do you think modern CPUs still expose mostly C-abstract-machine-like interface

I don’t think that, because they don’t. Your premise is hogwash.

Modern RISC-derived CPUs for the most part expose a load-store architecture driven by the historical evolution of that microarchitectural style and, if they are SMP, a memory model that C and C++ have only recently adapted to with standards. Intel's ISA most assuredly was not influenced by C. SIMD isn't reminiscent of anything in standard C either.

Also you might want to look into VLIW and the history of Itanium for an answer to your other question.

> Also you might want to look into VLIW and the history of Itanium for an answer to your other question.

There's only one question. What do you mean?

But Itanium wasn't out of order. How does that even come close to answering a question about exposed out-of-order machinery?

There is one CPU that exposes its out of order inner workings, the VIA C3 ("ALTINST"). The unintended effects are so bad that people that accidentally discovered it referred to it as a backdoor: https://en.wikipedia.org/wiki/Alternate_Instruction_Set

> [..] referred to it as a backdoor

Because it was implemented in a flawed way.

> "In 2018 Christopher Domas discovered that some Samuel 2 processors came with the Alternate Instruction Set enabled by default and that by executing AIS instructions from user space, it was possible to gain privilege escalation from Ring 3 to Ring 0."

It's not like other languages are well-adapted to that either. That's a hard target to code for.

Why couldn’t Haskell compilers make good use of that?

The "sufficiently smart compiler" [1] has been tried often enough, with poor enough results, that it's not something anyone counts on anymore.

In this case, the most relevant example is probably the failure of the Itanium. Searching for that can be enlightening too, but here's a good start: https://stackoverflow.com/questions/1011760/what-are-the-tec... (For context, the essential Itanium idea was to move complexity out of the chip and into the compiler.)

Also, don't overestimate Haskell's performance. As much fun as I've had with it, I've always been a bit disappointed with its performance. Though for good reasons, it too was designed in the hopes that a Sufficiently Smart Compiler would be able to turn it into something blazingly fast, but it hasn't succeeded any more than anything else. Writing high-performance Haskell is a lot like writing high-performance Javascript for a particular JIT... it can be done, but you have to know huge amounts about how the compiler/JIT will optimize things, and you have to write in a very particular subset of the language that is much less powerful and convenient than the full language, with little to no compiler assistance, and with even slight mistakes able to trash the performance hardcore as some small little thing recursively destroys all the optimizations. It's such a project that it's essentially writing in a different language that just happens to integrate nicely with the host.

[1]: https://duckduckgo.com/sufficiently smart compiler

This is reminiscent of my experience designing SQL queries to be run on large MySQL databases.

I had to write my queries in such a way that I was basically specifying the execution plan, even though in theory SQL is practically pure set theory and I shouldn’t have to care about that.

You can stuff Haskell-derived Bluespec into an FPGA.

Well, it turned out that for running scalar code with branches and stack frames, exposing too much to the compiler was not helpful, especially as transistor budgets increased. So as long as we program with the usual functions and conditionals, this is what we have.

I can swap out a cpu for one with better IPC and hardware scheduling in 10 minutes but re-installing binaries, runtime libraries, drivers, firmware to get newly optimized code -- no way. GPU drivers do this a bit and it's no fun.

For a long time I thought the JS hate was just a friendly pop jab. From working with backend folks I've realized it comes from an at least somewhat patronizing view that JS should feel more like backend languages; except its power, and its real dev audience, is in browsers, where it was shaped and tortured by the browser wars, not to mention it was created in almost as many days as Genesis says the world was built in.

Huh? Python and powershell are all over the backend, and it’s hard to argue against JavaScript while you’re using those. At least in my opinion.

I think it has more to do with the amount of people who are bad at JavaScript but are still forced to sometimes work with it because it’s the most unavoidable programming language. But who knows, people tend to complain about everything.

So, there are excuses for Javascript's terribleness, but that doesn't stop it from being objectively terrible.

Sad to see the word "objective" become the next bullshit intensifier because people can't separate their own subjective opinions from the realm of verifiable fact.

"Terribleness" isn't an objective property.

You can compare Javascript to other languages and note that many of its notorious problems have no rational justification, and are unnecessary. That's what I call objectively terrible.

Yeah, that's not objective. Sorry.

What do you think "objective" means?

The comparative approach I mentioned can be used to eliminate personal feelings about such issues - not in all cases (types might be an example), but certainly in some.

Many users of Javascript, including myself, recognize that it has many weaknesses. There's even a book that acknowledges this in its title: "Javascript: The Good Parts."

Denying this seems to be denying objective reality.

You may be confusing "objective" with "universal," thinking that I'm claiming some unsituated universal truth. But that's not the case. Any statement is only ever true within some context - the language that defines it, the semantics of the statement, the premises that it assumes.

In this case, there is a shared context that crosses programming languages, one that allows us in at least some cases to draw objective conclusions about programming language features. "The bad parts" implied by Crockford's title include many such features. We can examine them and conclude that while they might have some historical rationale, they are not good features for a programming language to have.

In many cases this conclusion is possible because there's simply no good justification - the title of this post is an example. Having all numbers be floating point has many negative consequences and no significant positive ones - the only reason for it is historical. Such features end up having consequences, such as on the design of hardware like ARM chips. That is an objectively terrible outcome.
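To make the floating-point point concrete, a quick console sketch (nothing here beyond standard IEEE-754 double behavior):

```javascript
// Every JS number is an IEEE-754 double, so integers are only exact up to 2^53
// and decimal fractions are approximated in binary.
console.log(Number.MAX_SAFE_INTEGER);  // 9007199254740991 (2^53 - 1)
console.log(2 ** 53 === 2 ** 53 + 1);  // true: the +1 is absorbed by rounding
console.log(0.1 + 0.2 === 0.3);        // false: classic binary-fraction rounding
```

BigInt and typed arrays were later bolted on precisely because one numeric type couldn't cover these cases.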

You can of course quibble with such a statement, based on a rigid application of a simplistic set of definitions. But you'd do better to try to understand the truth conveyed by such a statement.

What do YOU think "objective" means?

None of the properties you're talking about are objective. Objective doesn't mean Crockford wrote a book about it or "lots of people agree with me".

Objective means factual. You're putting the word "objective" in front of your own and others' opinions to arrogate the credibility of objectivity onto statements that are not based in observation of material reality.

More people holding an opinion doesn't make it a fact. "Terribleness" or "justifiableness" are not matters of fact, they are both matters of opinion.

Do you understand? You keep repeating your opinion and then using the word "objective" to claim that your opinion is fact. You think I am disagreeing with your opinion, rather I am disagreeing with you stating your opinion is a fact. No matter how many people agree with you it will never be a fact, it will always be an opinion because "terribleness" is not a matter of fact! "Terribleness" is the result of a value judgement.

There are no such things as "objective conclusions", objectivity is not a manner of reasoning. You're looking for something more like "observations", "measurements", hard facts.. none of which apply to "terribleness" because it can't be materially observed--only judged.

"Objectively" isn't an intensifier unless used in the form "Objectively [something that isn't objective]." Why would actual facts need to be intensified? What kind of insane argument would anyone have where facts and opinions are compared directly?

I know it sounds stronger to say your opinions are facts but it is okay to have opinions. Just remember that the difference between opinions and facts is a difference of kind rather than a difference of degree. You can argue an opinion, you can attempt to persuade me to your way of thinking if you show your reasoning.

You can just look up some dictionary definitions, like "not influenced by personal feelings or opinions in considering and representing facts." I've explained how that applies in this case - we can use comparative analysis to draw factual conclusions.

Focusing on the specific word "terrible" is a bit silly. Sure, it's hyperbolic, but I used it as a way to capture the idea that Javascript has many features that are comparatively worse at achieving their goals than equivalent features in many other languages. This is something that can be analyzed and measured, producing facts.

Crockford's book title is simply an example of how even a strong advocate of Javascript recognizes its weaknesses. You may not understand how it's possible for it to objectively have weaknesses, but that doesn't mean it doesn't. In this case, an objectively bad feature would be one that has negative consequences for programmers and can be replaced by a feature that achieves the same goals more effectively, without those negative consequences.

If there's anyone who'll argue in favor of such features on technical rather than historical grounds, then it would certainly undermine the objectivity claim. But the point is that there are (mis)features in Javascript which no-one defends on technical grounds. That is a reflection of underlying objective facts.

I'm also not making some sort of ad populum argument. As I pointed out, any claim of objective fact has to be made in some context that provides it with semantics. In some languages, the expression "1"+"1" is a type error, in others it produces "11". Both of those outcomes are objective facts in some context. What your objection really amounts to is saying that there's no semantic context in which my claim could be true. That's clearly not the case.

Perhaps a different example would help: people don't write programs any more by toggling binary codes into machines via switches. That's because we've come up with approaches that are objectively more effective. We can factually measure the improvements in question. The features I was referring to fall into the same category.

I'm going to repeat the closing from my last comment, because you're still doing the same thing:

You can of course quibble with such claims, based on a rigid application of a simplistic set of definitions. But you'd do better to try to understand the truth conveyed by the claims, and engage with that.

Again, you think I am disagreeing with your opinion by pointing out that it is an opinion and not a matter of fact. You're only continuing to restate your opinion and insist it is fact.

Claims of objective facts? Objective facts in some context? Badness isn't a matter of fact--it's a matter of opinion, I say again, you're making a value judgement and asserting that as a fact. You may as well tell me an onion is a planet and I can live on it if I believe hard enough.

You think I am disparaging your argument by saying it is mere opinion, as though it isn't good enough to be a real, true fact. I am not; I am merely pointing out that your statement is actually an opinion which you are incorrectly claiming to be a fact.

> I'm also not making some sort of ad populum argument. As I pointed out, any claim of objective fact has to be made in some context that provides it with semantics. In some languages, the expression "1"+"1" is a type error, in others it produces "11". Both of those outcomes are objective facts in some context. What your objection really amounts to is saying that there's no semantic context in which my claim could be true. That's clearly not the case.

"Objective fact" isn't claimed. You seem to be missing that an opinion even if backed up by evidence still isn't itself a fact and thus isn't objective. This isn't a matter of context. The difference between opinion and fact is not like the difference between true and false.

I don't know how you're lost on this. "JS is bad" is an opinion. "JS is objectively bad" is still an opinion but claims to be a fact, because "badness" isn't an objective property. Whether or not something is bad is not a matter of fact, it's a matter of opinion.

The "+" operator performs both string concatenation and addition in JS. <-- That is a fact, anyone can fire up a JS interpreter and confirm this for themselves.

The "+" operator being overloaded to perform string concatenation and addition with implicit type coercion is bad. <-- That's an opinion. While anyone can observe the behavior, they have to make a judgement on whether or not it is desirable and desirability is not a matter of fact.
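A quick console session shows the asymmetry (plus the implicit-coercion cousin that comes up elsewhere in the thread):

```javascript
// "+" is overloaded for concatenation; "-" is numeric only, so mixed
// operands behave asymmetrically under implicit type coercion.
console.log(1 + "2");  // "12": the number is coerced to a string, then concatenated
console.log(1 - "2");  // -1: the string is coerced to a number, then subtracted
console.log([] * 2);   // 0: [] -> "" -> 0 before multiplying
```

Both observations are facts; whether the behavior is desirable is, as the comment says, the judgment call.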

Normally, JS devs haven't really encountered these notorious problems for many years now.

You sound like a complete novice, a fanboy, or someone who knows only JS. There are many issues; it is possible not to touch them or to work around them, and TS and Flow help. JS solved just a few: 'use strict', strict comparison, arrow functions, string literals. Core problems are still there: implicit types, prototype declarations, number is float, toString/inspect division, typeof null. Every JavaScript programmer has to walk that way of embarrassment.


I've been programming for a decade in many languages including assembly, C#, Rust, Lisp, Prolog, F# and more, focusing on JS in the last 5 years.

Virtually no one writes plain JavaScript; most people, including me, write TypeScript, but Babel with extensions is normally used. Your reply exhibits your ignorance of the JS world.

I have occasionally written JavaScript since 2007, experimented a lot over the last 5 years, and read through the ES5 specification several times. I've worked as a C++, PHP, Python, and Ruby developer, and experimented with a few other languages.

Using "JS" instead of "TypeScript" brings confusion. TS solves some issues, and I've mentioned it, but still:

    typeof null
Template literal interpolation helps, but if a string (not a literal string) slips by, it is a mess

    1 - "2"
    1 + "2"
Check out another comment [1]: Object, Function, etc. are defined as constructors. It is not solved by "class"; it is still a function with a bit of sugar:

    class Foo {}
    Foo instanceof Function
Globals, with a few exceptions, are defined as constructors; DOM elements are defined as constructors; inheritance is defined via constructors

    class Bar extends Foo {}
You can internalize how it works, and there are some good explanations [2], but the design is error-prone and terrible.

[1] https://news.ycombinator.com/item?id=24815922

[2] https://yehudakatz.com/2011/08/12/understanding-prototypes-i...

C++ has WAY more spec footguns than JS (and that's without counting all the C undefined behaviors, which alone outweigh all the warts of JS combined). PHP also beats out JS for warts (and outright bad implementation, like left-to-right association of ternaries). Ruby has more than its fair share of weirdness too (try explaining eigenclass interactions to a new Ruby dev). Even Python has weirdness, like loops having an `else` clause that is actually closer to a `finally` clause.

`typeof null === "object"` is a mistake (like with most of the big ones, blame MS for refusing to ratify any spec that actually fixed them).
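For the record, the quirk is easy to see in any console:

```javascript
// The typeof-null quirk: null reports as "object", yet it is not an
// instance of Object and has no properties at all.
console.log(typeof null);             // "object"
console.log(null instanceof Object);  // false
```

It can't be fixed now: a corrected `typeof` was proposed for ES5.1-era harmony and rejected because too much deployed code tests for `"object"`.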

If you're having issues accidentally replacing `+` with `-` then you have bigger issues (eg, not unit testing). I'd also note that almost all the other languages you list allow you to overload operators which means they could also silently fail as well. In any case, garbage in, garbage out.

Foo being an instance of function is MUCH more honest than it being an instance of a class because the constructor in all languages is actually a function. This is even more true because you are looking at the primitive rather than the function object which contains the primitive.

I have not claimed JS is the weirdest. But I have not claimed "Normally C++/PHP/Ruby/Python devs don't really encounter these notorious problems, for many years now" either.

Eigenclass (singleton_class) was explained in another thread. I have not encountered Python's for/else [1] yet.

Right, typeof null was exposed by Microsoft IE 2 (?). The web is many times bigger now, yet even such a small mistake is not fixed.

I have an issue with + being a concatenator; I prefer string interpolation and separate operators. Implicit type conversion often does not make sense and spoils a lot:

    [] * 2
    foo = {}
    bar = {}
    foo[bar] = 1  // just throw please
    // Object.keys(foo) is ["[object Object]"]
> they could also silently fail as well.

But they don't. If only these rules were defined as a library, I am sure it would have been ditched long ago. Actually, this may be an argument in favor of operator overloading in JavaScript, as a way to fix it.

> Foo being an instance of function is MUCH more honest

    class Foo

    TypeError (already initialized class)
    # wrong one
    NoMethodError (undefined method `call' for #<UnboundMethod: Foo(BasicObject)#initialize()>)
    # does not allow unbound

new constructs an object and calls initialize. Same in JavaScript

    function Foo () { console.log(this) }
    new Foo    // Foo {}
    Foo()      // Window
It kind of makes sense: new creates an object of constructor.prototype and calls the constructor. I can't see how that is MUCH more honest than if new created an object of the prototype and called prototype.constructor. By that logic Object.create is not honest

    Object.create(Object.prototype) // expects [[Prototype]] not constructor
And even if it was

    foo = {}
    bar = Object.create(foo)
    bar.__proto__ === foo 
    bar.__proto__.__proto__ === Object.prototype
    bar.__proto__.__proto__.__proto__ === null 

    class Foo {}
    class Bar extends Foo {}
    bar = new Bar 
    bar.__proto__ === Bar.prototype 
    bar.__proto__.__proto__ === Foo.prototype
    bar.__proto__.__proto__.__proto__ === Object.prototype
    bar.__proto__.__proto__.__proto__.__proto__  === null
I don't need constructor except in new, otherwise I use it only to access prototype. Absence of languages adopting this approach confirms its usability issues.

> This is even more true because you are looking at the primitive rather than the function object which contains the primitive.

Could you please expand this part? "Primitive" has specific meaning in JavaScript.

[1] https://book.pythontips.com/en/latest/for_-_else.html

`__proto__` doesn't necessarily equal `.prototype`.

    var foo = Object.create(null)
    //now foo.prototype and foo.__proto__ are both undefined
    foo.prototype = {abc:123}
    //foo.__proto__ is still undefined. Need to use Object.setPrototypeOf()
In older JS code, I've seen people trying to abuse prototypes. One result of this kind of thing is often retained references to those hidden `__proto__`s, leading to memory leaks.

Also, `__proto__` is deprecated. If you're writing JS, you should be using `Object.getPrototypeOf()` instead.

> Could you please expand this part? "Primitive" has specific meaning in JavaScript.

    var fn = function () {}
    fn.bar = "abc"

    Object.keys(fn) //=> ["bar"]

    (1).__proto__ === Number.prototype //=> true
JS is torn on the idea of whether something is primitive or an object. You see this (for example) in Typescript with the primitive number being different from the Number type which represents a number object. To get at the primitive, you must actually call `.valueOf()` which returns the primitive in question. Meanwhile, you can attach your own properties to the function object -- a fact exploited by many, many libraries including modern ones like React. You can also add your own `.valueOf()` to allow your code to better interact with JS operators, but I believe that to pretty much always be a bad practice.
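A small sketch of that primitive/wrapper split:

```javascript
// number (primitive) vs. Number (wrapper object): same value, different types.
const prim = 1;
const boxed = new Number(1);

console.log(typeof prim);               // "number"
console.log(typeof boxed);              // "object"
console.log(boxed.valueOf() === prim);  // true: valueOf unwraps the primitive
console.log(boxed === prim);            // false: strict equality sees the box
```

Property access on `prim` works because the engine transparently boxes it for the duration of the lookup, which is why `(1).__proto__ === Number.prototype` holds.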

Yes, I know about a null [[Prototype]] and who the owner of prototype and __proto__ is. I like how it is not a real property

    var foo = {
      __proto__: null
    }
As you can see, I do not store it and do not modify it; it is an assertion, kind of like instanceof.

> Also, `__proto__` is deprecated

Object.getPrototypeOf would be too verbose in this example. I could have defined

    Object.defineProperty(Object.prototype, "proto", {
      get() { return Object.getPrototypeOf(this) }
    })
but why bother? We both know that I meant [[Prototype]].

> Primitive

> fn.bar = "abc"

is syntactic sugar for

fn["bar"] = "abc"

I do not follow.

> (1).__proto__ === Number.prototype

A number is primitive

    typeof 1            //=> "number"
    1 instanceof Number //=> false
a number is wrapped in a Number when we access a property

    Number.prototype.foo = function () { return this }
    1..foo()                   //=> Number {1}
    typeof 1..foo()            //=> "object"
    1..foo() instanceof Number //=> true
A function is not primitive

    typeof Math.max              //=> "function"
    Math.max instanceof Function //=> true
    Math.max instanceof Object   //=> true
it is an object, and we can attach properties to objects.

As I remember, valueOf is complicated by hints exposed in JavaScript [1]. I've played with removing it

    delete Array.prototype.toString
    delete Object.prototype.toString
    Uncaught TypeError: Cannot convert object to primitive value
Unfortunately it converts to a string and then converts the string to a number.

    delete Array.prototype.valueOf 
    delete Object.prototype.valueOf 
[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

The same thing. I work in the JS world, and interact with it nonstop.

Oh, JS has no issues, solved by ClojureScript, PureScript, Elm.

Virtually no JS programmers use these languages. Babel and TS are what they use.

The other issues you mention are solved by using ESLint which flags code like this.

I do not encounter these issues in my life as a professional JS programmer, neither do my colleagues; and I'm not on my first project, don't worry. For all practical purposes they are non-existent.

anyways, we are all happy for wasm, it's not that we love JS so much.

I actually like JavaScript, with a few changes it can become a good language, safe for a novice programmer.

Agreed. Check out AssemblyScript, you might like that. It's early, though.

> Virtually no one writes plain JavaScript

Which is because plain Javascript is objectively terrible, as I pointed out.

That's like pointing out that C++ after template transformations are applied is terrible. Yeah, so what? Nobody writes it like that.

"C++ is C because nobody writes C", you can't be serious.

That's not what I said.

And that's your opinion. I find javascript quite enjoyable and easy to use, without producing errors. YMMV.

JavaScript has good parts, I write it a lot. But it is ignorant to close eyes on its warts

    1 + '2'                     // "12"
    1 - '2'                     // -1
    Number.MAX_SAFE_INTEGER + 2 // 9007199254740992, silently off by one
and the entire WAT series stem from the "don't raise" ethos. JavaScript exposes the constructor instead of the prototype, which messed up a lot; in Ruby terms:

    Object.alias_method :__proto__, :class
    Object = Object.instance_method(:initialize)
    Class = Class.instance_method(:initialize)
    Class.__proto__.alias_method :prototype, :owner
    new = ->(constructor) { constructor.owner.new }
    Person = Class.prototype.new do
      def initialize
      end
    end
    def Person.foo
      "foo"
    end
    puts Person.foo

    john = new.call Person
    def john.bar
      "bar"
    end
    puts john.bar
    def (Person.prototype).baz
      "baz"
    end
    puts john.__proto__.baz
Does anyone want to adopt this feature in their language?

Operator overloading can lead to ambiguities in dynamic languages. Ruby, python, and any number of other languages have it much worse because they can be overloaded any way you want while JS overloads are (currently at least) set in stone by the language.

If you could only choose one number type, would it be floats or ints? Crockford would say decimal, but the rest of us using commodity hardware would choose floats every time. It's not the language implementer's fault someone doesn't understand IEEE 754. This max-safe-integer issue exists in ALL languages that use IEEE 754 doubles. In any case, BigInt is already in browsers and will be added to the spec shortly.

Perl and Lua have separate arithmetic and concatenation operators

    1 + "2"
    1 - "2"
    1 . "2"  | 1 ..  "2"
    1 . "-2" | 1 .. "-2"
Python and Ruby throw an exception; use explicit type conversion and string interpolation

    1 + '2'
    Traceback (most recent call last):
    TypeError (String can't be coerced into Integer)
    1 + '2'.to_i
Hardly any language is perfect. I have not encountered much operator overloading in Ruby (Nokogiri?) but I believe C++ got it bad.

One number type in Lua:

    > 0 | 0xffffffff          --> 4294967295
    > 0 | 0xffffffffffffffff  --> -1
    > 0 | 0x7fffffffffffffff  --> 9223372036854775807
limited but better than JavaScript:

    0 | 0xffffffff // -1
BigInt is an improvement

    0n | 0xffffffffn // 4294967295n
it is strict

    1n - "2"
    1 + 1n
    Uncaught TypeError: Cannot mix BigInt and other types, use explicit conversions
and works nicely with string interpolation.
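For instance (a quick sketch):

```javascript
// BigInt implements toString, so template literals just work:
const big = 2n ** 64n;
console.log(`value: ${big}`); // "value: 18446744073709551616"
```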

Numbers are a sane example; one can argue it was for good. How about `{} + []`? I believe I could disable this part in a JavaScript engine and no one would notice. And the misleading `object[key]`, where it calls toString; sure, I have not tried that in a decade, but it is stupid. UTF-16:

    ""[1] # there were emoji
You've said nothing about constructor-oriented programming. It is a unique feature; I have not heard of any other language adopting it yet. The post you replied to contains a sketch for Ruby. Actually, I got it wrong: every JavaScript function is a closure (a Ruby method is not a closure), and the Prototype method was a class method (not an instance method). Fixed, but ugly:

    def function(&block)
      Class.prototype.new.tap do |c|
        c.define_method(:initialize, block)
      end
    end

    def function_(object, name, &block)
      object.class_eval do
        define_method(name, &block)
      end
    end

    Person = function { |name|
      @name = name
    }
    function_(Person.prototype, :name_) {
      @name
    }

    john = new.call Person, 'john'
    puts john.__proto__ == Person.prototype
    puts john.name_

    def function__(object, name, &block)
      object.singleton_class.class_eval do
        define_method(name, &block)
      end
    end

    function__(john, :name__) {
      @name
    }
    puts john.name__
By the way, you could just say "Yes, I know JavaScript has some problems". It is not a secret; everyone knows.

I completely agree that separate operators are a MUST for dynamic languages.

Lua allows operator overloading with metatables, as do Ruby and Python with classes.




> One number type in Lua:

Not quite true. Lua had only 64-bit floats, like JS, until version 5.3, and the blazing fast LuaJIT still only has floats. Well, to be honest, it has hidden 32-bit integers for the sake of bitwise operations, just like JS (well, JS uses 31 bits with a tag bit, which is probably a lot faster).

> How about `{} + []`? I believe I can disable this part in JavaScript engine and no one would notice.

That's very simple. {} at the beginning of a line is an empty block rather than an object (yay C). "Disabling" that would break the entire language.

> UTF-16

UCS-2 actually. Back in those days, Unicode was barely a standard, and that in name only. Java did/does use UCS-2, and JS for marketing reasons was demanded to look like Java. I don't want to go into this topic, but Python, PHP, Ruby, C/C++, Java, C#, and so on all have long histories not at all compatible with UTF-8.

> You've said nothing about constructor oriented programming. Unique feature, I have not heard any other language adopted it yet.

I'll give you that JS prototypal inheritance is rather complex due to them trying to pretend it's Java classes. Once again though, the deep parts of both Python and Ruby classes are probably more difficult to explain. Lua's metatables are very easy to understand on the surface, but because there's no standard inheritance baked in, every project has its own slightly different implementation with its own footguns.

Closures are almost always preferred over classes in modern JS. Likewise, composition is preferred over inheritance, and the use of prototype chains, while not necessarily a code smell, does bear careful consideration.

If someone insists on using deep inheritance techniques, they certainly shouldn't be using class syntax as it adds yet another set of abstractions on top. Object.create() and inheriting from `null` solves a ton of issues.
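A sketch of that style, using only `Object.create` and plain objects (the `counterProto` name is made up for illustration):

```javascript
// delegation without class syntax: objects share behavior via a prototype
const counterProto = {
  inc() { this.n += 1; return this.n; }
};
function makeCounter() {
  const c = Object.create(counterProto);
  c.n = 0;
  return c;
}
const c = makeCounter();
console.log(c.inc());                                   // 1
console.log(Object.getPrototypeOf(c) === counterProto); // true
// inheriting from null sidesteps Object.prototype entirely:
const dict = Object.create(null);
console.log('toString' in dict);                        // false
```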

> By the way, you can say "Yes, I know JavaScript has some problems". It is not a secret, everyone knows.

I'd say if you take the top 20 languages on the tiobe index, it sits in the middle of the pack with regard to warts and weirdness. Maybe people are just attracted to weird languages.

I have no grudge against operator overloading when done consciously — complex numbers, matrix multiplication. I've tried to implement JavaScript arithmetic in Ruby, failed so far.

Sorry, I should have been clear: in Lua, "one number type" means float, my bad. I meant that the Lua 5.3 integer still works like a JavaScript Number. In the end we have to know about ToInteger, ToInt32, ToUint32, Number.MAX_SAFE_INTEGER [1]. It is not one number type but an encoding of several number types, a union.

Prior to 5.3, and in LuaJIT, it has different limitations

    > print(string.format("%18.0f",9007199254740991 + 1))
      9007199254740992
    > print(string.format("%18.0f",9007199254740991 + 2))
      9007199254740992
they extended the original "one number type" without changing the interface. In any case, neither version converts to a Bignum like Ruby does.

    => 115792089237316195423570985008687907853269984665640564039457584007913129639935
Ruby unified Fixnum and Bignum as Integer in 2.4. I can't see the benefit of Number/BigInt over Float/Integer. I'd rather have a 3.6f literal.

Yes, I know how WAT works. I meant ToPrimitive [2]

    [] * {}
I've disabled this code in Firefox; I have not done extensive testing, but it looks like no one depends on it. We infer types with TypeScript and Flow, but the VM already knows them; it could report such cases without external tools. I think of it as an extension of Firefox Developer Edition: lint in the browser.

Object.prototype.toString is not as useful as Ruby's or Python's

    class Foo {}
    `${new Foo}`
    //"[object Object]"

    class Foo; end
    Foo.new
    #=> #<Foo:0x0000560b7a10df20>

    >>> class Foo:
    ...     pass
    >>> Foo()
    <__main__.Foo object at 0x7fb53aecf1f0>

And there is no separate inspect/__repr__

    #=> #<Date: 2020-10-20 ((2459143j,0s,0n),+0s,2299161j)>
    #=> "2020-10-20"
> UCS-2 actually

Oh, so a DOM UTF-16 string gets broken by a UCS-2 JavaScript function. I understand it is not easy to fix; Ruby fixed it in 1.9, Python in 3.0, and new languages (Rust, Elixir) come with UTF-8. Microsoft Windows has code pages, UCS-2, and UTF-16.
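A sketch of the code-unit issue, using a character outside the BMP (U+1F4A9, chosen arbitrarily):

```javascript
const s = "\u{1F4A9}"; // one code point, two UTF-16 code units
console.log(s.length);                      // 2
console.log([...s].length);                 // 1 -- iteration is code-point aware
console.log(s.codePointAt(0).toString(16)); // "1f4a9"
```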

Maybe the Python way? b"binary", u"utf-8" (but together, not the Python fiasco); Ruby has "# Encoding: utf-8", and transformation tools could mark all unspecified strings as "b" or "u".

> Once again though, the deep parts of both Python and Ruby classes are probably more difficult to explain.

No: every Ruby object contains instance variables and has a link to a class which defines its instance methods; we call it the singleton_class

    foo = Object.new
    foo.class == Object
    #=> true
    foo.singleton_class == Object
    #=> false

    def foo.bar
    end
    foo.singleton_class.instance_method(:bar)
    #=> #<UnboundMethod: #<Class:#<Object:0x000055cb33808e10>>#bar() (irb):7>
There is an ancestors chain

    Object.ancestors
    #=> [Object, Kernel, BasicObject]
There is a bit of syntactic sugar

    Foo = Class.new
There are a few revelations with main (methods are defined on Object)

    def baz
    end
    Object.instance_method(:baz)
    => #<UnboundMethod: Object#baz() (irb):19>
Nothing like the audible "click" I had when I understood that a "function" is a "constructor"

    constructor Foo {}
    // you can call me as function too
and that, unlike in any other language, [[Prototype]] is hidden. I've read through ES5 to be sure there are no hidden traps left.

Every JavaScript programmer has to go through this list, either beforehand or by experience. I do not want to undermine TC39's effort -- arrow functions, string interpolation in template literals, strict BigInt, Object.create -- these are great advancements. I don't feel the same way about "class"; the underlying weirdness is still there.

Make [[Prototype]] visible

    Object = Object.prototype
    Function = Function.prototype
now it is easy to reason about

    typeof Object // "object"

    Foo = class {}.prototype         // redefine with sweetjs macro
    Bar = class extends Foo.constructor {}.prototype

    new Foo.constructor              // redefine with sweetjs macro

    Object.constructor.create(Bar)   // redefine as Reflect.create
once redefined:

    Foo = class {}
    Bar = class extends Foo {}
    new Foo
I've shown it in another comment [3].

Languages are weird. There are a lot of C++ developers; I've been there, and there is no way to know all the dark corners. Python's ideology hurts. Java took the EE way. C# was tied to Microsoft. K&R C is beautiful, hard to write safely, packs a lot into the code. PHP has its bag of problems. SQL is not composable; CTEs help. Go has its ideology. Ruby: performance. And JavaScript because browser; not bad once you know and avoid the skeletons in the closet.

Lua metatables look like a proxy/method_missing to me.

[1] https://www.ecma-international.org/ecma-262/5.1/#sec-9.5

[2] https://www.ecma-international.org/ecma-262/5.1/#sec-9.1

[3] https://news.ycombinator.com/item?id=24815922

> Object.prototype.toString is not as useful as Ruby, Python

I don't know that returning that info would be good or secure in JS

> Oh, DOM UTF-16 string broken by UCS-2 JavaScript function. I understand it is not easy to fix, Ruby fixed in 1.9, Python in 3.0, new languages (Rust, Elixir) come with UTF-8. Microsoft Windows has code pages, UCS-2, UTF-16.

The best and most compatible answer is moving JS to UTF-32. JS engines already save space by encoding strings internally as latin1 instead of UCS-2 when possible (even around 90% of the text on Chinese sites is still latin1). IMO they should have made backtick strings UTF-32, but that doesn't exactly transpile well.

> No, every Ruby object contains variables and has a link to a class which defines instance methods, we call it singleton_class



or adding all the things with a few instances, .constructor, etc


I'll let you decide which implementation is easier to work through, but I have a definite opinion that Ruby's system is more complex (and Python layers their OOP on top of what is basically a hidden prototypal system).

> I've read through ES5 to be sure there are no hidden traps left.

You'll love newer ES versions then. The upcoming private fields are an even bigger mess.

JS needs a "use stricter" mode which really slices away the bad parts. Even better, just add a `version="es20xx"` requirement to use newer features and have browsers ignore what they don't know, so you could even compile and link to multiple compilation levels of the same script and have the browser choose.

In truth, JS would be in the best place of any top-20 language if Eich had just been allowed to make a scheme variant as he had planned.

Yes, template strings were a clean start.

Of course a prototype-based language is simpler than a class-based one. Ruby's system is more complex. It provides more tools: Class, Module, class and instance methods, variables (as depicted in the picture). You asked about the eigenclass (singleton_class these days); that's Class:a -> A, a very simple concept.

And yet Ruby inheritance is much easier to use: it is all around, and it just works. No one does this in JavaScript; it is too complex. There were many attempts at building OOP people could understand on top of JavaScript in the 2000s. No one does this for Ruby.

> https://i.stack.imgur.com/FPPdI.png

With fix above:


I'll let you decide which implementation is easier to work through, but I have a definite opinion that the current JavaScript system is more complex.

> You'll love newer ES versions then. The upcoming private fields are an even bigger mess.

I follow.

Netscape and Mozilla tried the version approach. Modules are a clean start, with 'use strict' by default. It is not Visual Basic; already good.

Sure, until you parachute into a code base where several generations of contractors added features that communicate over a shared global object. This is bad per se, but becomes worse when your language allows one to add fields on the fly, and you end up with a container full of similar fields because eventually nobody knows exactly what's in the object any more...

And that seems pretty simple to fix. "The same level of awareness that created a problem cannot be used to fix the problem" - and maybe that's why they hired you, to fix these problems. I've been that guy before. What was really fun was the codebase I had to fix that used mexican slang words as variable names, and I only speak English. So much fun. But I sucked it up, accepted the challenge, and I improved it.

It really doesn't take a super-genius to code javascript simply, efficiently, and without errors. Funny that a lot of programmers that think they're very smart are the ones that either shit on javascript, or make a lot of stupid errors with it, or both.

I've seen similar abuses with global maps in other languages (essentially the same). This is an architecture fault rather than a language fault.

As you say, that is a problem with any language and project with a revolving door of developers. Perhaps those companies should learn their lesson and hire at least one or two good, permanent senior devs to keep things on track.

Like always, the human factor outweighs almost everything else.

I'd love to have a way to take a `Date` object and set it as the value of a `datetime-local` input. It sure feels like that should be straightforward to do, without requiring any weird conversions like it does.
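One workaround I've seen (a sketch; it assumes minute precision is enough, and the DOM line is left commented since it needs a browser):

```javascript
// datetime-local expects "YYYY-MM-DDTHH:MM" in *local* time, but
// Date.prototype.toISOString always emits UTC, so shift the timestamp
// by the timezone offset first and then truncate:
const now = new Date();
const shifted = new Date(now.getTime() - now.getTimezoneOffset() * 60000);
const value = shifted.toISOString().slice(0, 16);
console.log(value); // e.g. "2020-10-20T14:30"
// document.querySelector('input[type="datetime-local"]').value = value;
```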

Wow. `zonedDateTimeISO` looks like exactly what I was hoping for. I'm kind of surprised that there isn't anything analogous to `strftime`, though.

That’s not what “objective” means.

"When I use a word, it means just what I choose it to mean —neither more nor less."

It's not often that one comes across a quote from one of the greatest works of art:


That whole conversation is quite interesting and perhaps this part can also be thought of as alluding to "prescriptivism versus descriptivism".

I think it goes beyond prescriptivism/descriptivism.

Descriptivism refers generally to describing how a community uses language, so the most common usages end up being the primary definitions. Once documented, these tend to become prescriptive.

In that context, a single author who uses a word in an unusual way would likely not have any impact on the descriptive or prescriptive definitions of that word, unless of course their usage becomes common.

Humpty was arguing for the benefits of using words in unusual ways, which potentially violates both prescriptivism and descriptivism.

Impressionistic use of words is one example of this, where the words used might convey a certain feeling, perhaps via their connotations, even though their literal definitions may not be exactly appropriate. This kind of usage is generally found in literature where language is being used as an artistic tool rather than a prosaic tool of communication.

tl;dr: my HN comments is a art

So you're saying we should applaud reducing it by 1-2% right?

To answer the question of whether it's worth it to add this specialized instruction: it really depends on how much die space it adds, but from the look of it, it's specialized handling of an existing operation to match an external spec; that can be not too hard to do, and it significantly reduces software complexity for tasks that do that operation. As a CE with no real hardware experience, it looks like a clear win to me.

Why whine about it? No matter your age, JavaScript will almost certainly still be in use until you retire.

This is like saying "nothing justifies the prolonging of capitalist torture". On some level it's correct, but it's also being upset at something bordering on a fundamental law of the universe.

There will always be a “lowest common denominator” platform that reaches 100% of customers.

By definition the lowest common denominator will be limited, inelegant, and suffer from weird compatibility problems.

If it wasn’t JavaScript it would be another language with very similar properties and a similar history of development.

Given that the instruction set already has a float to integer conversion it seems likely that the overhead of implementing this would be small and so given the performance (and presumably energy) win quoted elsewhere seems like a good move.

It would be interesting to know the back story on this: how did the idea feed back from JS implementation teams to ARM. Webkit via Apple or V8 via Google?

Correct, the overhead is minimal - it basically just makes the float->int conversion use a fixed set of rounding and clamping modes, irrespective of what the current mode flags are set to.

The problem is JS's double->int conversion was effectively defined as "what wintel does by default", so on arm, ppc, etc you need a follow on branch that checks for the clamping requirements and corrects the result value to what x86 does.

Honestly it would not surprise me if the perf gains are due to removing the branch rather than the instruction itself.

Not quite. JS is round-towards-zero, i.e. the same as C. If you look at the x86 instruction set, then until SSE2 (when Intel specifically added an extra instruction to achieve this) this was extremely awkward to achieve. x86 always did round-to-nearest by default.

The use of INT_MIN as the overflow value is an x86-ism however, in C the exact value is undefined.

> The problem is JS's double->int conversion was effectively defined as "what wintel does by default",

No, JavaScript "double to int conversion" (which only happens implicitly in bitwise operations such as |, &, etc) is not like any hardware instruction at all. It is defined to be the selection of the low-order 32-bits of a floating point number as-if it were expanded to its full-width integer representation, dropping any fractional part.
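A quick sketch of that definition, observable through the bitwise-OR operator:

```javascript
// ToInt32: truncate toward zero, take the low 32 bits, reinterpret as signed
console.log(1.9 | 0);         // 1
console.log(-1.9 | 0);        // -1
console.log(2 ** 32 + 5 | 0); // 5  (only the low 32 bits survive)
console.log(2 ** 31 | 0);     // -2147483648 (wraps into the signed range)
console.log(NaN | 0);         // 0
console.log(Infinity | 0);    // 0
```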

Interesting. So it's as much an x86 legacy issue as JS and presumably JS followed x86 because it was more efficient to do so (or maybe by default).

It sounds, too, like the performance gains will depend on how often the branch is taken, which seems highly dependent on the values being converted?

> Interesting. So it's as much an x86 legacy issue as JS and presumably JS followed x86 because it was more efficient to do so (or maybe by default).

Most languages don't start with a spec, so the semantics of a lot of these get later specced as "uhhhhh whatever the C compiler did by default on the systems we initially built this on".

Seeing as JavaScript was designed and implemented in two weeks, I'm betting this is the answer

Today's JavaScript is so divorced, so radically different from the original implementation as to be considered a different language, though.

Isn’t modern JS backward compatible with 1.1?

Mostly. See https://tc39.es/ecma262/#sec-additions-and-changes-that-intr... for a comprehensive list of backward-incompatible changes in the spec.

Using that list to answer your question is a bit tricky, since it also includes backward-compatibility breaks with newer features. But, e.g.,

> In ECMAScript 2015, ToNumber applied to a String value now recognizes and converts BinaryIntegerLiteral and OctalIntegerLiteral numeric strings. In previous editions such strings were converted to NaN.


> In ECMAScript 2015, the Date prototype object is not a Date instance. In previous editions it was a Date instance whose TimeValue was NaN.

sound like backward-incompatible changes to a JS 1.1 behavior.

Another notable example is the formalization of function-in-block semantics, which broke compatibility with various implementations in order to find a least-bad compromise everyone could interop on. I'm not sure if JS 1.1 even had blocks though, much less functions in blocks...

>Another notable example is the formalization of function-in-block semantics, which broke compatibility with various implementations in order to find a least-bad compromise everyone could interop on. I'm not sure if JS 1.1 even had blocks though, much less functions in blocks...

can you explain what you mean? Did early js implementations not have functions inheriting the block scope?

If they’re referring to lexical block scope then indeed, js historically did not have block scoping, it was function and global scoped.
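A sketch of the difference, contrasting pre-ES2015 `var` with block-scoped `let`:

```javascript
function demo() {
  if (true) {
    var a = 1; // function-scoped: hoisted to the top of demo()
    let b = 2; // block-scoped: exists only inside this if-block
  }
  return [a, typeof b]; // a is visible here; b is not
}
console.log(demo()); // [1, "undefined"]
```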

There are also differences in how eval works, etc

No. Modern javascript engines are compatible with 1.1, but the syntax of modern javascript is not.

Modern JS really isn't even compatible with JavaScript. When people talk about "modern JS" it usually includes TypeScript and a bunch of NodeJS-isms, none of which are part of the TC39 spec. It's only by proximity (and perversity) that it even gets to be considered JS.

Rounding mode was defined then and hasn't changed since though.

Your comment made me realize this is true now. Through the 80s it was the other way around. You counted as having a language with a spec, even if there was no implementation, but an implementation without a spec was a "toy"

> Through the 80s it was the other way around. You counted as having a language with a spec, even if there was no implementation, but an implementation without a spec was a "toy"

Not to my recollection. I don’t recall anyone at uni discussing a C standard until 1989, and even by 2000 few compilers were fully compliant with that C89 spec.

There were so many incompatible dialects of FORTRAN 77 that most code had to be modified at least a bit for a new compiler or hardware platform.

All of the BASIC and Pascal variants were incompatible with each other. They were defined by “what this implementation does” and not a formal specification.

C was specified by K&R in 1978. Pascal had a specification for the core language, and BASIC was largely seen as a toy.

> C was specified by K&R in 1978.

No it wasn’t. K&R was far removed from many C implementations of the time, wasn’t ever written as a formal spec, and had gaping holes of undefined behavior.

An educational textbook isn’t a formal language specification.

C was the K&R book, PASCAL was the User manual and Report. These were the platonic ideals of the languages.

The specification was the language, the fact that there was an implementation was a bonus. I never once in my comments above said "formal" so perhaps we are meaning two very different things by "specification." No version of Cs specification since K&R has done away with undefined, nor implementation defined behavior.

It was 100% [ed] due to [/ed] the default wintel behavior. All other architectures have to produce the same value for that conversion.

The branch always has to be taken if executing JavaScript, because otherwise how would you tell if the value was correct or not? You'd have to calculate it using this method regardless and then compare!

If you always take a branch, it's not a branch.

Or did you mean by "taken" that the branch instruction has to be executed regardless of whether the branch is taken or not?

JavaScript JITs always emit this instruction when ToInt32 is required, since checking would be more expensive in user code. And the instruction always used the JS rounding method, since that's cheaper in silicon. I used branch since the parent used "branch".

What are you defining as correct?

"correct" in this case being consistent with the JavaScript specification.

Unless I misread the current arm docs, I don't think this is still present in the ISA as of 2020?

The whole RISC/CISC thing is long dead anyway, so I don't really mind having something like this on my CPU.

Bring on the Mill (I don't think it'll set the world on fire if they ever make it to real silicon, but it's truly different)

To understand RISC, ignore the acronym it stands for, instead just think fixed-instruction, load-store architecture. That's what RISC really means today.

No variable-length instructions. No arithmetic instructions that can take memory operands, shift them, and update their address at the same time.

Someone once explained it like “not a reduced set, but a set of reduced instructions”. Not r(is)c, but (ri)sc.

Pretty much what you say, I just liked the way of describing it.

It seems to me people are ignoring that the C stands for Complexity. What's reduced is Complexity of the instruction set, not the size of it (or even the instructions themselves). In the context of the coinage of the term, they almost certainly could have called it "microcode-free ISA", but it wouldn't have sounded as cool.

Doesn't the C stand for Computer?

Oops I'm wrong about the name but not about the spirit. This is the original paper: https://dl.acm.org/doi/pdf/10.1145/641914.641917

Is ARM even microcode free these days?

I dont think so

> No variable-length instructions

AESE / AESMC (AES Encode, AES Mix-columns) are an instruction pair in modern ARM chips in which the pair runs as a singular fused macro-op.

That is to say, a modern ARM chip will see "AESE / AESMC", and then fuse the two instructions and execute them simultaneously for performance reasons. Almost every "AESE" encode instruction must be followed up with AESMC (mix columns), so this leads to a significant performance increase for ARM AES instructions.

> think fixed-instruction, l[...]. That's what RISC really means today.

> No variable-length instructions.

So, ARM is not a RISC instruction set, because T32 (Thumb-2) instructions can be 2 or 4 bytes long.

Similarly, RISC-V has variable-length instructions for extensibility. See p. 8ff of https://github.com/riscv/riscv-isa-manual/releases/download/... (section "Expanded Instruction-Length Encoding").

What is with the variable-length instruction aversion? Why is it better to load a 32-bit immediate with two 4-byte instructions (oh, and splitting it into 12/20-bit parts is non-intuitive because of sign extension, thanks RISC-V authors) than with one 5-byte instruction?

Fixed width instructions allow trivial parallel instruction decoding.

With variable length instructions one must decode a previous one to figure out where the next one will start.

> With variable length instructions one must decode a previous one to figure out where the next one will start.

People said the same thing about text encodings. Then UTF-8 came along. Has anyone applied the same idea to instruction encoding?

That would eat precious bits in each instruction (one in each byte, if one only indicates ‘first’ or ‘last’ bytes of each instruction).

It probably is better to keep the “how long is this instruction” logic harder and ‘waste’ logic on the decoder.

That doesn't sound so bad to me, in comparison to fixed-size instructions. One bit in each byte gone to have the most common instructions take one byte instead of four. I can imagine that allowing much denser code overall.

Or it could be one bit out of every pair of bytes, to support 2-, 4-, and 6-byte instructions. I don't know much about ARM Thumb-2, except that it does 2- or 4-byte instructions, so clearly someone thought this much was a good idea.

Instruction encodings are below my usual abstraction layer; I'm just speculating...

That's still a bit of dependency between instructions, not much but it is there, and that does make making parallel decoder harder.

It's not all in favor of fixed width encoding though. Variable length has an advantage of fitting more things into the cache. So it's all a balancing act of ease of instruction decoding vs space used.

If you want to be able to decode a bunch of instructions in parallel you really want it to be simple.

You don't have to make it self-synchronizing like utf-8, but a prefix length encoding is a real asset.

All sorts of boundary issues occur when you're near the end of a page or cache line. How many instruction bytes should you load per cycle? What happens when your preload of a byte that happens to be in a subsequent page trips a page fault?

By comparison, 4byte instructions that are always aligned have none of those problems. Alignment is a significant simplification for the design.

(Somewhere I have a napkin plan for fully Huffman-coded instructions, but then jumps are a serious problem as they're no longer byte aligned!)

> What happens when your preload of a byte that happens to be in a subsequent page trips a page fault?


...and even ARM breaks instructions into uops, just like x86.

Of course it does. The level of abstraction required for modern pipelining and OoO scheduling is still beneath ARM. I'm not that familiar with the details of ARM on paper, but it's not that low-level by research standards.

Arm v8 is the "current" 64-bit version of the ISA and you almost certainly have it inside your phone. You might have a version older than v8.3.

I think you are thinking of the old Java-optimizing instructions on the older ARM processors.

Those never really took off…

What does it mean for the RISC/CISC thing to be dead? The distinction between them is more blurred than it used to be?

RISC / CISC was basically IBM-marketing speak for "our processors are better", and never was defined in a precise manner. The marketing is dead, but the legend lives on years later.

IBM's CPU advancements of pipelining, out-of-order execution, etc. were all implemented into Intel's chips throughout the 90s. Whatever a RISC machine did, Intel proved that the "CISC" architecture could follow suit.


From a technical perspective: all modern chips follow the same strategy. They are superscalar, deeply-pipelined, deeply branch predicted, micro-op / macro-op fused "emulated" machines using Tomasulo's algorithm across a far larger "reorder buffer register" set which is completely independent of the architectural specification. (aka: out-of-order execution).

Ex: Intel Skylake has 180 64-bit reorder buffer registers (despite having 16 architectural registers). ARM A72 has 128 ROB registers (despite having 32 architectural registers). The "true number" of registers of any CPU is independent of the instruction set.

Since RISC wasn't coined by IBM (but by Patterson and Ditzel) this is just plain nonsense. RISC was and is a philosophy that's basically about not adding transistors or complexity that doesn't help performance and accepting that we have to move some of that complexity to software instead.

Why wasn't it obvious previously? A few things had to happen: compilers had to evolve to be sophisticated enough, mindsets had to adapt to trusting these tools to do a good enough job (I actually know several who in the '80s still insisted on assembler on the 390), and finally, VLSI had to evolve to the point where you could fit an entire RISC on a die. The last bit was a quantum leap, as you couldn't do this with a "CISC" and the penalty for going off-chip was significant (and has only grown).

You can't understand the RISC/CISC "debate" until you spend a few minutes skimming through the IBM 360 mainframe instruction set.


Not even remotely. Nothing in the RISC philosophy says anything about pure data-path ALU operations. In fact this instruction is pretty banal compared to many other FP instructions.

It strikes me as ironic that an architecture that used to pride itself on being RISC and simple is heading in the same direction as intel-levels of masses of specialist instructions.

I don't mean this as a criticism, I just wonder if this is really the optimum direction for a practical ISA

Ironically this instruction only exists because JavaScript accidentally inherited an Intel quirk in its double-to-int conversion. This instruction should be "Floating Point Intel Convert", not "Floating Point JavaScript Convert", but w/e, trademark laws and all that.

In practice the instruction made code a few percent faster, primarily because it allowed dropping a few branches at the end of every JavaScript int conversion on ARM. IMO, ARM has always been at the "edge" of RISC (at least until AArch64 threw out all of its quirks) and the exception proves the rule here. Instructions like this exist specifically because they accelerate execution time, rather than for assembler-programmer convenience. That's the same underlying reason why RISC had such an advantage over CISC before micro-op decoders and register renaming became a thing. It's not so much that "having fewer instructions is better", but that things like memory operands or repeat prefixes or what have you primarily act as programmer convenience at the expense of code efficiency. Stuff like FJCVTZS is used by compiled code to gain a measurable speed benefit, ergo it stays in.
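For the curious, the conversion FJCVTZS implements is ECMAScript's ToInt32, which is also what the `x | 0` idiom produces: truncate toward zero, then wrap modulo 2^32 into the signed 32-bit range. A sketch of the semantics (function name is mine):

```javascript
// JS ToInt32: NaN and ±Infinity map to 0; everything else is
// truncated toward zero and wrapped modulo 2^32 into [-2^31, 2^31).
function toInt32(x) {
  if (!Number.isFinite(x)) return 0;
  let n = Math.trunc(x) % 0x100000000;   // wrap modulo 2^32
  if (n >= 0x80000000) n -= 0x100000000; // reinterpret as signed
  if (n < -0x80000000) n += 0x100000000;
  return n + 0; // normalize -0 to +0
}
```

Without the instruction, a JIT has to emit a truncating convert plus fix-up branches for the out-of-range and NaN cases; FJCVTZS folds all of that into one operation.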

Maybe the threshold for counting as RISC rises over time as the technology advances?

Emery Berger argues that the systems community should be doing exactly this -- improving infrastructure to run JS and Python workloads:


We need to incorporate JavaScript and Python workloads into our evaluations. There are already standard benchmark suites for JavaScript performance in the browser, and we can include applications written in node.js (server-side JavaScript), Python web servers, and more. This is where cycles are being spent today, and we need evaluation that matches modern workloads. For example, we should care less if a proposed mobile chip or compiler optimization slows down SPEC, and care more if it speeds up Python or JavaScript!

Personally I think this would be unfortunate as I don't think JavaScript is the path forward, but computers have always existed to run software (d'oh), which means natural selection will obviously make this happen if there is a market advantage.

However I see all high performance web computing moving to WASM and JavaScript will exist just as the glue to tie it together. Adding hardware support for this is naive and has failed before (ie. Jazelle, picoJava, etc).

I partially agree. I wish TypeScript compiled directly into a JIT without compiling into JavaScript, and I wish TypeScript had a strict mode that always actively requires type definitions.

> Adding hardware support for this is naive and has failed before (ie. Jazelle, picoJava, etc).

The hardware support being added here would work just as well for WASM (though it might be less critical).

Pray tell in which way this will help WASM?

WASM would have at least one f64 -> int32 conversion routine with the same semantics as JS, if possibly not the only one.

maybe this will lead to the revival of Transmeta-like architectures? I always had a soft spot for reprogrammable microcode.

WASM is _designed_ to be easily jitted, without the expensive machinery we had to put in place to do this for x86, so the whole point is to not require a new architecture.

Because of this I find WASM the best direction yet for a "universal ISA", as it's very feasible to translate it to even strange new radical architectures (like EDGE, Prodigy, etc). (Introducing a new ISA is almost impossible due to the cost of porting the world. RISC-V might be the last to succeed.)

"WASM is _designed_ to be easily jitted, without the expensive machinery we had to put in place to do this for x86"

I'm not sure this is actually true, given that the original intent was for WASM to be easy to compile using existing JS infrastructure, not in general. So given that, it would make sense to carry over JS fp->int semantics into WASM. WASM is in effect a successor to asm.js.

It's certainly also not too hard to compile/jit for new architectures, but that was not the initial intent or what guided the early/mid-stage design process.

If you examine the current WASM spec, trunc is actually specified to trap on out-of-range inputs (with saturating variants added later), so it doesn't simply inherit the exact behavior from JS.

True, but that wasn't my point with "expensive machinery". By that I was referring to:

* discovering which instructions are executed (tracking control flow),
* mapping x86 code addresses to translated addresses,
* discovering the shape of the CFG and finding loops,
* purging translations that get invalidated by overwrites (like self-modifying code),
* sprinkling the translated code full of guards so we can make assumptions about the original code,

etc. All this is unnecessary for WASM, and you can in fact make a quick and dirty one-pass translation for the 1st gear. I don't know, but it seems reasonable that they would keep the same FP->int semantics as JS, though you rarely do that in non-JS code.

Is there really reason to believe that "porting the world" has gotten that much harder in the decade since RISC-V was released?

It would mean JavaScript would have to compile to the same bytecode, lest there be multiple instruction sets.

Unlike Java there's no official bytecode and all implementations do it differently. I don't think any high-performance implementation uses bytecodes; instead they use threaded code for their 1st tier, and native code for all the others.
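For readers unfamiliar with the term, "threaded code" here means pre-resolving each operation to a directly callable handler, so the hot loop does no opcode decoding. A toy sketch (names and the tiny op set are made up for illustration):

```javascript
// Toy "threaded code" tier: compile a stack-machine program into an
// array of closures once, then execute the closures directly. The
// execution loop contains no switch-on-opcode decode step.
function compileThreaded(ops) {
  const handlers = {
    push: (arg) => (stack) => { stack.push(arg); },
    add:  ()    => (stack) => { stack.push(stack.pop() + stack.pop()); },
    mul:  ()    => (stack) => { stack.push(stack.pop() * stack.pop()); },
  };
  const code = ops.map(([op, arg]) => handlers[op](arg)); // decode once
  return () => {
    const stack = [];
    for (const step of code) step(stack); // no decoding here
    return stack.pop();
  };
}

// (2 + 3) * 4
const result = compileThreaded([
  ["push", 2], ["push", 3], ["add"], ["push", 4], ["mul"],
])();
```

Real engines thread machine-level stubs rather than JS closures, but the structure is the same.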

The quote isn't really saying that JS-specific instructions need be added to the ISA though.

In that sense it's not saying anything different from what we have been doing for the past 60 years.

The only significant thing that has changed is that power & cooling is no longer free, so perf/power is a major concern, especially for datacenter customers.

> In that sense it's not saying anything different from what we have been doing for the past 60 years.

Yes it is? The essay's point is that "standard" hardware benchmarks (C and SPEC and friends) don't match modern workloads, and should be devalued in favour of better matching actual modern workloads.

You think Intel designs processors around SPEC? (Hint: they don't).

ADD: It was an issue a long time ago. Benchmarks like SPEC are actually much nicer than real server workloads. For example, running stuff like SAP would utterly thrash the TLB. Curiously, AMD processors can address 0.25 TB without missing in the TLB, much better than Intel.

Yeah WASM killing JS is an outside bet for this decade. Could happen. (Please Lord).

Curious, have you actually written production software targeting WASM? Because I have, in Rust, the language with the best toolchain for this by an order of magnitude.

Despite all that, and me being very familiar with both Rust and JS, it was a big pain. WASM will remain a niche technology for those who really need it, as it should be. No one is going to write their business CRUD in it, it would be a terrible idea.

I don't dispute it, but could you elaborate on the painful parts?

I find the crossing in and out of JavaScript to be less than ideal. However, I don't see why WASM couldn't evolve to require less of that, ie. expose more of what you need JavaScript for today.

> However, I don't see why WASM couldn't evolve to require less of that, ie. expose more of what you need JavaScript for today

It can, and it is. Designers are already doing all they can to make it an appealing target for a variety of languages on multiple platforms.

We are no longer in 2005. Javascript, especially in its Typescript flavor, is a perfectly capable modern language.

It's not that it isn't capable, it's that it has more gotchas than most other languages of its size (and no, just because a gotcha is well defined doesn't mean that it doesn't trip programmers up).

It's also, despite a couple decades of hard work by some very good compiler/JIT engineers, at a considerable disadvantage perf-wise to a lot of other languages.

Third, its most common runtime environment is a poorly thought-out collection of DP and UI paradigms that don't scale to even late-1980s levels, leading to lots of crutches. (AKA, just how far down your average infinite-scrolling web page can you go before your browser either takes tens of seconds to update, or crashes?)

Many of the gotchas wouldn't ever show up if only people committed to writing JS as sensibly as they write their programs in other languages when they're being forced to do things a certain way.

But when it comes to JS and even other dynamic languages, people for some reason absolutely lose their minds, like a teenager whose parents are going out of town and are leaving them home by themselves overnight for the first time. I've seen horrendous JS from Java programmers, for example, that has made me think "Just... why? Why didn't you write the program you would have written (or already did write!) were you writing it in Java?" Like, "Yes, there are working JS programmers who do these kinds of zany things and make a real mess like this, but you don't have to, you know?"

It's as if people are dead set on proving that they need the gutter bumpers and need to be sentenced to only playing rail shooters instead of the open world game, because they can't be trusted to behave responsibly otherwise.


That is simply not true; the Io language is prototype-based yet simple to reason about. A fix for JavaScript:

    Object = Object.prototype
    Function = Function.prototype
That's how most dynamic languages work, no need for explanations, obvious assertions:

    Object.__proto__ === null
    Function.__proto__ === Object 

    object = new Object.constructor
    object.__proto__ === Object
    object.constructor === Object.constructor
    object.toString === Object.toString

    object.private = function () { return 'hello' }
    Object.shared  = function () { return 'world' }
    Object.constructor.static = function () { return '!' }
It shows hidden complexity — what does instanceof mean?

    // Object instanceof Function
    Object.constructor.__proto__.constructor === Function.constructor
    // Object instanceof Object
    Object.constructor.__proto__.__proto__.constructor === Object.constructor
    // Function instanceof Function
    Function.constructor.__proto__.constructor === Function.constructor
    // Function instanceof Object
    Function.constructor.__proto__.__proto__.constructor === Object.constructor
Why would anyone want this? Let's get rid of .constructor on both sides:

    Object.constructor.__proto__ === Function
    Object.constructor.__proto__.__proto__ === Object
    Function.constructor.__proto__ === Function
    Function.constructor.__proto__.__proto__ === Object
Now it is obvious — the constructor is a Function, and a Function's __proto__ is Object. JavaScript programmers jump through ".prototype" hoops which should not be exposed.
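For anyone trying to follow along, the built-in relationships being described can be checked directly in any engine (variable names below are mine):

```javascript
// The actual prototype-chain facts in standard JavaScript:
const objProto = Object.prototype;
const funProto = Function.prototype;

const chainRoot    = Object.getPrototypeOf(objProto) === null;     // true: chain root
const funsAreObjs  = Object.getPrototypeOf(funProto) === objProto; // true: functions are objects
const objectIsAFun = Object.getPrototypeOf(Object) === funProto;   // true: Object is a function
const objInstFun   = Object instanceof Function;                   // true
const funInstObj   = Function instanceof Object;                   // true
```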

... what? This comment is incomprehensible. What does it have to do with the parent comment (besides exemplifying the "teenager left at home for the first time" and creating-a-mess stuff)?

None of this crazy reflective, meta-level programming belongs in the LOB application logic of any program. Save it for your programmer workspace/REPL environment--which is something that 90+% of JS programmers aren't using anyway. They're all editing "dead" files in Visual Studio Code and SublimeText.

> Object.constructor.__proto__.constructor === Function.constructor // Object instanceof Object

... and that's not even what instanceof means.

Do you ever type "prototype"? Do you ever type "prototype" in any language but JavaScript? How it differs from other languages accessing [[Prototype]] / class directly, not through constructor?

I thought that was caused by prototype inheritance, but no: the Io language [1] is prototype-based yet easy to understand. The only difference is that it does not hide [[Prototype]]. I've redefined Object and Function to show how it works. You would get the same result in "dead" files; a REPL just makes it easy to reproduce.

Imagine it is another language

    object.private = function () { return 'hello' }
    Object.shared  = function () { return 'world' }
    Object.constructor.static = function () { return '!' }

    object.private = function() { return 'hello' }
    Object.prototype.shared = function () { return 'world' }
    Object.static = function () { return '!' }
Can you see the difference? Or, straight from the Io tutorial:

    Contact = (class {
      constructor(name, address, city) {
          this.name = name
          this.address = address
          this.city = city
      }
    }).prototype
    holmes = new Contact.constructor("Holmes", "221B Baker St", "London")
    Contact.fullAddress = function() {
      return [this.name, this.address, this.city].join('\n')
    }
No "prototype" (except from class but it can be redefined).

> ... and that's not even what instanceof means.

Thank you, it should check [[Prototype]] not constructor

    // Object instanceof Object
    Object.constructor.__proto__ === Function
    // Object instanceof Object
    Object.constructor.__proto__.__proto__ === Object
    // Function instanceof Function
    Function.constructor.__proto__ === Function
    // Function instanceof Object
    Function.constructor.__proto__.__proto__ === Object
[1] https://iolanguage.org/

... what?

> Do you ever type "prototype"?

Not every day or even once in an average week, unless you're doing something wrong.

> You would get same result in "dead" files.

You don't understand. You shouldn't even be futzing with these things in _any_ file in the first place. Keep this sort of thing inside your REPL, and keep it out of anything checked into source control or anything distributed to other people. 99% of ordinary LOB logic simply does not require (or justify) the use of these kinds of meta-programming facilities. Don't start using clever code-reflection tricks unless you're writing a reflection framework.

Dropping this highly relevant link here for the second time:


> straight from io tutorial `Contact = [...]`

Er... okay. In JS:

    class Contact {
      constructor(name, address, city) {
        this.name = name;
        this.address = address;
        this.city = city;
      }

      getFullAddress() {
        return this.name + "\n" + this.address + "\n" + this.city;
      }
    }

    let holmes = new Contact("Holmes", "221B Baker St", "London");
    console.log(holmes.getFullAddress());

which prints:

    Holmes
    221B Baker St
    London

> Not every day or even once in an average week, unless you're doing something wrong.

Do you have to understand it or not? There is a ton of information on the web (viewed 191k times) [1], [2], etc. Acting like it is intuitive is disingenuous.

No, it's you who doesn't understand. I've shown particular code that is error-prone in JS but fine in other languages. That is the issue. It is not solved. Every novice runs into it. It is simple to solve. Instead you claim "you should not need this" — tell that to 191k viewers. Yes, you know what's going on, yet you still claim "you don't need this".

    class Foo
    foo = Foo.new
    foo.class == Foo // metaprogramming! *You shouldn't even be futzing with these things in _any_ file*
Sure, I know about method definition in a class. Extend it (same class); it is a useful technique.

And please don't post the same link twice if you can't read the response.

[1] https://stackoverflow.com/questions/9959727/proto-vs-prototy...

[2] https://javascript.info/class-inheritance

> that's you who don't understand

Because your comments are incomprehensible.

> I've shown particular code that is error prone in JS but is ok in other languages.

It's not OK in Java. It's not OK in C. It's not OK in C++. Because you can't even do that in any of those languages.

> still you claim "you don't need this"

Look at the languages I listed above. Think about the program you're trying to write. Do you think it's impossible to write the program you want to write in those languages, where you are not even allowed to do your clever tricks?

Look at the underscores in the name __proto__. That's a huge, massive sign in your face that means, "If you find yourself touching this a lot, it's because you're doing something wrong." How could it be clearer? Imagine it were called __proto$do_not_use$__ instead. But you use it anyway. And then you ask, "why is this so difficult?". Where's the surprise?

Think about this whole thread. You're doing something, experiencing pain, and then complaining about the pain that you feel. Yes, of course you feel the pain. Every time you mention the pain, you are actually giving an argument against yourself: it's not the right thing. So stop doing it. The fact that you got instanceof wrong is further evidence that you're not the best person to tell us which road to take to get to the party--you don't know the city.

(As for the code snippet you wrote in this comment, it's not clear at all what you're even trying to say. Once again: incomprehensible.)

I've mentioned a prominent example; there is an underlying problem:

    class Foo {}
    typeof Foo 
    Foo instanceof Function 
Is there underscore here? How about

    typeof Object
    typeof Function
    Object instanceof Function
    Object instanceof Object
    Function instanceof Function
    Function instanceof Object
How many underscores are there? And it looks like you don't know about

without an underscore, and you claim I don't know this land.

You got burned by Java and C++; too bad for you. There are other languages — Python and Ruby (the last example). JS's [[Prototype]] is the equivalent of a class hierarchy. You've shown many times that you can't comprehend this; the normal reaction would be to investigate and understand first, but not in your case. It shows you in a bad light.

> there is underlying problem

No, you haven't ever defined _any_ problem, only abstract things that bother you.

There is only one problem that matters: writing a (correct) program. If you can do it without meta-programming, then you should.

If you start meta-programming when you don't have to, then you have already lost. If you are doing this when you shouldn't, and then you experience pain and say, "This system needs to be fixed", then make sure you fix the correct part of the system: yourself.

> JS [[Prototype]] is equivalent of class hierarchy.

Which nobody is futzing around with (because they can't, and they shouldn't futz with it anyway) on the level that you're insisting we do, in the ordinary course of writing programs. (The level where you are seeing "problems".) Once again: keep that stuff in your REPL. If you are editing a regular source file that will be committed to the repo and you are typing out "prototype" or "__proto__" or "setPrototypeOf" more than once or twice every 10k lines, then you're almost definitely doing something wrong.

> You've got burned by Java and C++, bad for you.

No, it's that the world is burning because of thousands of programmers who can't behave responsibly because they pick up a dynamic language and say, "There are no adults around to discipline me, so now is the time to go nuts."

(And the irony is that this thread exists because of your complaints about JS and how it burns you, where I started out arguing that JS is fine at the beginning. But now you're the champion for JS?)

You've shown your position — ignorance. You don't have to repeat yourself. You would not save that "burning world". I am sorry for all junior developers who are in contact with you and whom you "discipline". I am quite grown up, and I rate such worldviews as "no more than middling, or a terrible mistake".

Honestly this is really spot-on. I think part of the problem is people have heard that javascript is bad and just expect it to be bad, so they never really try to learn it and just copy and paste something that works and maybe expand on the bad examples when they should just be learning how to use javascript.

Javascript requires some self-discipline, where other languages discipline you if you try to fuck things up yourself. It's kind of like a drunken sailor on leave going on a bender on-shore - the minute they leave the ship all bets are off. Some people just can't wrap their head around using weak/dynamic typing, because it loosens the rules. But that doesn't mean there aren't any rules.

I code in a dozen other languages and Javascript is still my favorite because it's so easy to use, no boilerplate, and it doesn't give me any "gotchas" because javascript actually does follow some very basic easy to understand rules. The people that don't bother to learn those rules invariably shit on javascript. It's not fair, it's not real criticism.

And practically all of the "examples" of bad javascript is almost always code that nobody should ever write, in any language, because it's just stupid to write code in those ways. But people think "aHa, aNoThEr 'gOtChA'" and the myth of javascript being somehow inferior continues. And to anyone who actually knows how easy javascript is to use, it just looks silly, pathetic, and obnoxious.

Regarding gotchas, it's bearable. I only have a couple on my short list: == vs === and xs.includes(x) vs x in xs, and only the latter is not reducible to a trivial rule of thumb. TS is helpful in this regard, possibly there are more in plain JS.
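Both gotchas on that short list are easy to demonstrate (a quick sketch):

```javascript
// 1) `==` coerces its operands before comparing; `===` does not.
const looseEq  = ("0" == 0);  // true: string coerced to number
const strictEq = ("0" === 0); // false: different types

// 2) `in` tests property *keys* (for arrays, the indices), not values.
const xs = ["a", "b", "c"];
const byIndex  = 0 in xs;          // true: index 0 exists
const byValue  = "a" in xs;        // false: "a" is not an index
const byMember = xs.includes("a"); // true: tests values
```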

Regarding performance, modern JS is plenty fast, but it's not in the 'terrible' category. It's memory usage, perhaps ;) https://benchmarksgame-team.pages.debian.net/benchmarksgame/.... For performance critical code JS is not the answer, but good enough for most uses, including UIs.

Regarding UI paradigms, I'm not sure what the problem is, or what significantly better alternatives are. I did MFC/C++ in '90s and C++/Qt in the '00s, and both were vastly inferior to modern browser development. React+StyledComponents is a wonderful way to build UIs. There are some warts on the CSS side, but mostly because Stackoverflow is full of outdated advice.

MFC was a mess in the '90s. I dabbled with MFC multiple times, mostly to support other people's applications, and it was overwhelmingly bad. For C++ I mostly just used the native Win16/32 APIs because it turned out to be less kludgy for small GUIs associated with various services I was writing. OTOH, I did a _LOT_ of development with VB, and later Delphi, for the frontend of those services. Both of which were significantly better GUI development tools than Visual Studio at that point.

I've since moved on to other development areas, but have done a bit of both JS (on and off from about 2005->2015), and a bit of QT here and there too. The latter seems to have gotten better in the past few releases AFAIK. C# is at least ok too.

So in the case of MFC and early QT its not surprising that you prefer react+JS as neither of those were exactly the pinnacle of UI development in the late 1990's early 2000s.

What you describe in a later posting about MFC was never really a problem with the VCL; pretty much no one dealt with repaint messages unless you were writing a custom component. I remember showing a number of MFC programmers BC++ Builder (which is also still a thing) when they were singing the MFC tune. They tended to get really quiet after that.

PS/edit: Crazy, I just went and looked at the latest Visual Studio documentation and MFC is still an option, and at first glance seems basically the same as it was 25 years ago. Although the section on creating Win32 applications is in front in the manual. I guess I'm not surprised that MFC is still a thing, but I am surprised that it doesn't have a giant deprecation warning. I frankly can't imagine creating a new MFC application in 2020.

Why are MFC and QT inferior to the Javascript way? I don’t know much about either.

On top of my head:

* Apps eventually degenerate into a mess of interconnected components triggering events on each other. Building UIs felt more like writing Verilog than writing functions. Nothing like chasing event-ordering bugs through library components. With React I never saw the 'enqueue a new event for component X' anti-pattern. All callbacks flow from children to parents in a fairly regular manner.

* Related, there is no need to manually manage repaint events in React. Visual updates happen automagically. See, for example, https://stackoverflow.com/questions/952906/how-do-i-call-pai... for a hair pulling session related to manual repainting.

* On the ergonomics side, having a markup language allows for terse but readable definitions of component trees. I see 2020 Qt has introduced a markup language, this was not the case circa 2005.

* Hooks give an extra layer of niceness. Declaring a component is as easy as declaring a function. Local state becomes almost as simple as declaring a local variable.

* The inspector capabilities of modern browsers are very handy when debugging layout and styling issues.

Thanks for the detailed explanation.

It's obviously interesting to understand performance on real life JS and Python workloads and maybe to use this to inform ISA implementations.

I don't think that it's being suggested that ISAs should be designed to closely match the nature of these high level languages. This has been tried before (e.g. iAPX 432 which wasn't a resounding success!)

“We need more performance, should we fix the underlying performance problem in our software? No, we should design our CPU’s to accommodate our slow, bottlenecked language!”

Asking for CPU features to speed up Python is like trying to strap a rocket to a horse and cart. Not the best way to go faster. We should focus on language design and tooling that makes it easier to write in "fast" languages, rather than bending over backwards to accommodate things like Python, which have so much more low-hanging fruit in terms of performance.

> “fast” languages

What are those?

C++, Rust, .NET would qualify; Go (doesn't have a "release mode" as such, but achieves decent speed), Julia, etc.

Anything that makes an actual attempt at optimising for performance.

That's rather silly. He says:

"For example, we should care less if a proposed mobile chip or compiler optimization slows down SPEC, and care more if it speeds up Python or JavaScript!"

But anything that impacts SPEC benchmarks (and the others we use for C code) is also going to impact Python performance. If you could find a new instruction that offers a boost to the Python interpreter performance that'd be nice, but it's not going to change the bigger picture of where the language fit in.

> But anything that impacts SPEC benchmarks (and the others we use for C code) is also going to impact Python performance.

Say you work on an optimisation which improves SPEC by 0.1% (pretty good), but it improves Python by 0.001% (not actually useful).

Meanwhile there might be an optimisation which does the reverse and may well be of higher actual value.

Because SPEC is a compute & parallelism benchmark, while Python is mostly about chasing pointers, locking, and updating counters.

Please, no. Let the hardware do what it's best at: being simple and running fast. Let the interpreter/compiler layer do what it does best: flexibility. There have been attempts to move the hardware 'upwards' to meet the software and it hasn't generally worked well. No special-purpose language-supporting hardware exists now that I'm aware of - Lisp machines, Smalltalk machines, Rekursiv, Stretch, that 1980s object-oriented car crash by Intel whose name escapes me...

Edited to be a touch less strident.

You do understand that current hardware exists to support C, right?

What aspect of currently popular CPU instruction sets ‘exists to support C’?

Strong sequential consistency is a big one. Most architectures that have tried to diverge from this for performance reasons run into trouble with the way people like to write C code (but will not have trouble with languages actually built for concurrency).

Arguably the scalar focus of CPUs is also to make them more suited for C-like languages. Now, attempts to do radically different things (like Itanium) failed for various reasons, in Itanium's case at least partially because it was hard to write compilers good enough to exploit its VLIW design. It's up in the air whether a different high-level language would have made those compilers feasible.

It's not like current CPUs are completely crippled by having to mostly run C programs, and that we'd have 10x as many FLOPS if only most software was in Haskell, but there are certainly trade-offs that have been made.

It is interesting to look at DSPs and GPU architectures, for examples of performance-oriented machines that have not been constrained by mostly running legacy C code. My own experience is mostly with GPUs, and I wouldn't say the PTX-level CUDA architecture is too different from C. It's a scalar-oriented programming model, carefully designed so it can be transparently vectorised. This approach won over AMD's old explicitly VLIW-oriented architecture, and most GPU vendors are now also using the NVIDIA-style design (NVIDIA calls it SIMT). From a programming-experience POV, the main difference between CUDA programming and C programming (apart from the massive parallelism) is manual control over the memory hierarchy instead of a deep cache hierarchy, and a really weak memory model.

Oh, and of course, when we say "CPUs are built for C", we really mean the huge family of shared-state imperative scalar languages that C belongs to. I don't think C has any really unique limitations or features that have to be catered to.

> Now, attempts to do radically different things (like Itanium) failed for various reasons, in Itanium's case at least partially because it was hard to write compilers good enough to exploit its VLIW design. It's up in the air whether a different high-level language would have made those compilers feasible.

My day job involves supporting systems on Itanium: the Intel C compiler on Itanium is actually pretty good... now. We'd all have a different opinion of Itanium if it had been released with something half as good as what we've got now.

I'm sure you can have a compiler for any language that really makes VLIW shine. But it would take a lot of work, and you'd have to do that work early. Really early. Honestly, if any chip maker decided to do a clean-sheet VLIW processor and did compiler work side-by-side while they were designing it, I'd bet it would perform really well.

Thank you for an interesting comment - seems to imply that Intel have markedly improved the Itanium compiler since they discontinued Itanium which is interesting!

I guess any new architecture needs to be substantially better than existing out of order, superscalar implementations to justify any change and we are still seeing more transistors being thrown at existing architectures each year and generating some performance gains.

I wonder if / when this stops then we will see a revisiting of the VLIW approach.

I doubt it. Another major disadvantage of VLIW is instruction density: if the compiler cannot fill all instruction slots, you lose density (wasting cache, bandwidth, etc.).

Didn't later Itanium CPU microarchitectures internally shift to a much more classic design so they could work around the compiler issues?

I've never heard of that. If true, that might be a large hole in my theory :)

To quote David Kanter at Realworldtech

> Poulson is fundamentally different and much more akin to traditional RISC or CISC microprocessors. Instructions, rather than explicitly parallel bundles, are dynamically scheduled and executed. Dependencies are resolved by flushing bad results and replaying instructions; no more global stalls. There is even a minimal degree of out-of-order execution – a profound repudiation of some of the underlying assumptions behind Itanium.


Given the large amount of security problems OoO has caused, there is a chance that we may revisit the experiment in the future with a less rigid attitude and greater success.

> in Itanium's case at least partially because it was hard to write compilers good enough to exploit its VLIW design

This is half true. The other half is that OOO execution does all the pipelining a "good enough" compiler would do, except it does so dynamically at runtime, benefiting from just-in-time profiling information. Way back in the day OOO was considered too expensive; nowadays everybody uses it.

OOO parallelism runs hard into limits too (see the heroic measures current designs take to increase IPC a teensy bit). Parallelism-friendly languages have the potential to break through this barrier given parallelism-friendly processors. (And they do in GPU land...)

AIUI it's not pipelining but out-of-order execution where the big win comes from: it allows some hiding of e.g. memory fetch latency. Since data may or may not be in cache, it's apparently impossible for the compiler to know this, so it has to be done dynamically (but I disclaim being any kind of expert in this).

That is very true, thanks for clarifying. It is one of the main reasons it is hard to build 'good enough' VLIW compiler, though I haven't paid attention to the field in >10 years. OOO = 'out of order' :)

A shocking amount, but in many cases it's also what doesn't exist, or isn't optimized. C is designed to be a lowest-common-denominator language.

So: the flat memory model, large register machines, single stack registers. When you look at all the things people think are "crufty" about x86, it's usually through the lens of "modern" computing. Things like BCD, fixed point, capabilities, segmentation, call gates, all the odd 68000 addressing modes, etc. Many of those things were well supported in other environments but ended up hindering, or being unused by, C compilers.

On the other side you have things like the inc/dec instructions, which influenced the idea of the unary ++ and -- rather than the longer, more generic forms. So while the latency of inc is possibly the same as add, it still has a single-byte encoding.

Address generation units basically do C array indexing and pointer arithmetic, e.g. a[i], p + i, where a is a pointer of a particular size.


In C, the address of something like a[i] is more or less:

    (char*)(a) + (i * sizeof(*a))
And I think it will do a lot more than that for "free", e.g. a[i+1] or a[2*k+1], though I don't know the details.

By having address calculations handled by separate circuitry that operates in parallel with the rest of the CPU, the number of CPU cycles required for executing various machine instructions can be reduced, bringing performance improvements.[2][3]

Here's a (doubly-indirected) example: https://news.ycombinator.com/item?id=24813376
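
The base + index × element-size arithmetic above can be made visible from JavaScript with typed arrays. This is only an illustrative sketch (the values and the `byteOffset` variable are made up for the example); it shows the same offset computation an AGU performs in hardware:

```javascript
// a[i] over 8-byte elements is conceptually base + i * sizeof(*a).
const buf = new ArrayBuffer(4 * Float64Array.BYTES_PER_ELEMENT);
const a = new Float64Array(buf);
a[2] = 3.5;

// Recompute the byte offset by hand, the way an AGU effectively does:
const byteOffset = 2 * Float64Array.BYTES_PER_ELEMENT; // i * sizeof(*a) = 16
const elem = new Float64Array(buf, byteOffset, 1)[0];  // reads a[2] → 3.5
```

The point is just that `a[2]` and "the 8-byte float at base + 16" name the same memory; hardware address generation does this multiply-and-add as part of the load itself.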

And what about that has anything to do with C specifically? Every useful programming language requires cause to precede effect, and every architecture that allows load-store reordering has memory barrier instructions. Specifically, where would code written in C require the compiler to generate one of these instructions, where code hand-written for the processor's native instruction set would not?

It matches C's semantics exactly, to the point where ARM chose specific acquire/release semantics to match the "sequential consistency for data-race-free programs" model without requiring any global barriers or unnecessarily strong guarantees, while still allowing reordering.

(I should note that I believe this is actually C++'s memory model that C is using as well, and perhaps some other languages have adopted it too.)

Yep. They have a compiler to bring it down to the metal so IDK what you're saying.

--- EDIT ---

@saagarjha, as I'm being slowposted by HN, here's my response via edit:

OK, sure! You need some agreed semantics for that, at the low level. But the hardware guys aren't likely to add actors in the silicon. And they presumably don't intend to support e.g. hardware-level malloc, nor hardware-level general expression evaluation[0], nor hardware-level function calling complete with full argument handling, nor fopen, nor much else.

BTW "The metal which largely respects C's semantics?" C semantics were modelled after real machinery, which is why C has variables which can be assigned to, and arrays which follow very closely actual memory layout, and pointers which are for the hardware's address handling. If the C designers could follow theory rather than hardware, well, look at lisp.

[0] IIRC the PDPs had polynomial evaluation in hardware.

The metal which largely respects C's semantics? For example, here are some instructions that exist to match C's atomics model: https://developer.arm.com/documentation/den0024/a/Memory-Ord...

I've done work on a proprietary embedded RTOS that has had high level versions of those barriers at least a decade before the C atomics model was standardized (and compiles them to the closest barrier supported by the target architecture).

I suspect that the OS and Architecture communities have known about one-way barriers for a very long time, and they were only recently added to the Arm architecture because people only recently started making Arm CPUs that benefit from them. And that seems like a more likely explanation than them having been plucked from the C standard.

Moreover, one-way barriers are useful regardless of what language you're using.

Note that I am specifically pointing to those exact barriers, and not "any old barriers". C's memory orderings don't really lower down to a single instruction on any other platform that I'm aware of because of subtle differences in semantics.

[0] Close, it was the VAX-11 and is the poster child for CISC madness.

Let the hardware do best what it's good at, being simple and running fast. Let the interpreter/compiler layer do its thing best, flexibility.

Yeah, this is pretty much the opposite of what actually works in practice for general-purpose processors though – otherwise we'd all be using VLIW processors.

The complex processors like the PDPs were followed by RISC processors because they were simpler. The hardware has to run code, I get that, but VLIW didn't work. RISC did. The x86 decodes its opcodes into micro-ops, which are load/store RISC-y simple things. Simplicity was always the way to go.

I do take your point about VLIW, but I'm kind of assuming that the CPU has to, you know, actually run real workloads. So move the complexity out of the languages. Or strongly, statically type them. Or just don't use JS for server-side work. Don't make the hardware guys pick up after bad software.

When RISC first appeared they really were simpler than competing designs.

But today I think it's hard to argue that modern pipelined, out of order processors with hundreds of millions of transistors are in any sense 'simple'.

If there is a general lesson to be learned it's that the processor is often best placed to optimise on the fly rather than have the compiler try to do it (VLIW) or trying to fit a complex ISA to match the high level language you're running.

I guess if "as simple as possible but no simpler" then I'd agree.

> ...rather than [...] trying to fit a complex ISA to match the high level language you're running

Again agreed, that was the point I was making.

It seems like every 2 months I feel the burn of JS not having more standard primitive types and choices for numbers. I get this urge to learn Rust or Swift or Go which lasts about 15 minutes... until I realize how tied up I am with JS.

But I do think one day (might take a while) JS will no longer be the obvious choice for front-end browser development.

Funny, I have a small JavaScript app I have abandoned because I find developing JS so awful. Now that I have ramped up in Rust I am very tempted to rewrite it as Rust has first-class WASM support. Unfortunately I'd still need JavaScript for the UI bits.

IMO: Rust isn't the easiest language to learn, but the investment pays off handsomely and the ecosystem is just wonderful.

EDIT: I meant "to learn" which completely changes the statement :)

>But I do think one day (might take a while) JS will no longer be the obvious choice for front-end browser development.

I think that day might be sooner than anyone thinks- Chromium is dominant enough now that their including Dart as a first-class language (or more likely, a successor to Dart) will likely be a viable strategy soon.

Of course, the wildcard is Apple, but ultimately Dart can compile down to JS: being able to write in a far superior language that natively runs on 80% of the market and transpiles to the rest is suddenly much more of a winning proposition.

I feel like Kotlin is a much better language than Dart, has many more use cases, and compiles down to JavaScript also.

Dart is weak for functional programming / functional way of thinking. Right there, dart lang loses me as a potential user.

If you want that, you can start with TypeScript and name your number types. Doesn’t do anything real at the moment, but subsections of code can then be compiled as assemblyscript to wasm.

I dunno how you folks do it, but I admire anyone that can stick with JS. It's just so...bonkers, every aspect about it, from the obvious "wat" moments, lack of integers, to the mental overhead of libs, modules, frameworks, templates, and packing. But I'm glad I have coworkers that grok it.

I like trying new languages and have played with dozens, and JS and Prolog are the only languages that have made me actually scream.

You should definitely try to branch out. At the very least it gives you new ways of thinking about things.

Go or Dart is probably the best bet for a "JS killer" in terms of maturity, tooling, and targeting the front end, followed by Haxe, Swift, and/or Rust (they may be even better, but frankly I'm not as familiar with them).

Honestly, modern JS has far fewer "wat" moments than most languages; modern JavaScript is incredibly well designed. And JavaScript does have integers now.

Nowadays it feels like the opposite: the committee takes so long to dot the i's and cross the t's that features take multiple years to make it through the approval process and be ready to use (I'm looking at you, optional chaining).

You could split the difference and go C# or Java and use one of their web stacks.

JS supports

    bools (Boolean)
    floats (Number)
    ints (x|0 for i31 and BigInt)
    arrays (Array, and 11-ish variants of TypedArray)
    linked lists (Array)
    sets (Set and WeakSet)
    maps (Map and WeakMap)
    structs (Object)
    genSym (Symbol)
    functions (Function)
    strings (String)
What are they missing that you are dying to have?
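
To make a few of those concrete, here is a quick sketch of the integer-ish options from that list (the values are illustrative; `| 0` routes through the ToInt32 operation discussed elsewhere in the thread, and "i31" refers to the small-integer fast path inside engines):

```javascript
// Number is always a 64-bit float, but | 0 coerces through ToInt32:
const wrapped = (2 ** 32 + 5) | 0;     // wraps modulo 2^32 → 5

// BigInt gives true arbitrary-precision integers:
const big = 2n ** 64n;                 // 18446744073709551616n, exact

// TypedArrays give fixed-width two's-complement integer storage:
const words = new Int32Array([1, -1]); // each slot is a real int32
```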

With WebAssembly that day is nearing. However it is certainly many years out.

Are you saying you'd like to have specific int32, int64, float32, float64 types?

Strong typing and static typing - yes please.

In terms of adding types to deeply dynamic languages I think Julia had the best approach: sub-typing and union types.

It has the advantage that it is possible to give a reasonable type to most functions without a rewrite (even if the types would be terribly long to accommodate the weak underlying type system)

Yes this specifically.

What other numeric types would you like to see?

In addition to the existing doubles, ES2020 added support for signed integers.

My guess is that this is what JIT people asked for.

The JSC JIT people seemed kind of surprised by this, which was weird. Maybe the V8 JIT people asked for it?

Is there a source for that? I was under the impression this initially shipped on iOS devices so it'd be weird for JSC to be surprised by it.

According to this bug it was shipped six months after the bug was created and they didn't seem super familiar with it.

It was public knowledge that Apple had a shipped implementation using this instruction before that patch was merged.


I also don't really see any indication that the project maintainers don't know about the instruction.

Looks like an example of JSC not being developed in the open more than anything else.

Are you referring to the tweet linked there? That tweet is wrong in a lot of ways, because that instruction gives you a 1-2% boost, not a 30% boost like they claimed.

They're talking about a 30% boost overall on a new release which is an amalgamation of many pieces, including use of this new instruction.

Can you give an example of something the maintainers said that made them seem "not super familiar with it"?

Don't sufficiently advanced compilers infer what the real type of a variable is, in the most important cases?

First comment on question: “The JavaScript engine has to do this operation (which is called ToInt32 in the spec) whenever you apply a bitwise operator to a number and at various other times (unless the engine has been able to maintain the number as an integer as an optimization, but in many cases it cannot). – T.J. Crowder”

Edit: From https://www.ecma-international.org/ecma-262/5.1/#sec-9.5

  9.5 ToInt32: (Signed 32 Bit Integer)

  The abstract operation ToInt32 converts its argument to one of 2³² integer values in the range −2³¹ through 2³¹−1, inclusive. This abstract operation functions as follows:

  Let number be the result of calling ToNumber on the input argument.
  If number is NaN, +0, −0, +∞, or −∞, return +0.
  Let posInt be sign(number) * floor(abs(number)).
  Let int32bit be posInt modulo 2³²; that is, a finite integer value k of Number type with positive sign and less than 2³² in magnitude such that the mathematical difference of posInt and k is mathematically an integer multiple of 2³².
  If int32bit is greater than or equal to 2³¹, return int32bit − 2³², otherwise return int32bit.

  NOTE Given the above definition of ToInt32:

    The ToInt32 abstract operation is idempotent: if applied to a result that it produced, the second application leaves that value unchanged.

    ToInt32(ToUint32(x)) is equal to ToInt32(x) for all values of x. (It is to preserve this latter property that +∞ and −∞ are mapped to +0.)

    ToInt32 maps −0 to +0.

Remember not to refer to outdated specs; the modern version is at https://tc39.es/ecma262/#sec-toint32 . The changes look like editorial modernizations (i.e., I don't think there have been any bugfixes to this low-level operation in the 9 years since ES 5.1 was published), but it's better to be safe than sorry, and build the right habits.
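
Those spec steps transcribe almost directly into JavaScript. The sketch below is purely illustrative (the `toInt32` helper name is made up; engines implement this natively, and `x | 0` exposes the same operation):

```javascript
// Illustrative transcription of the spec's ToInt32 steps.
function toInt32(x) {
  const number = Number(x);                          // 1. ToNumber
  if (Number.isNaN(number) || number === 0 || !Number.isFinite(number)) {
    return 0;                                        // 2. NaN, ±0, ±∞ → +0
  }
  const posInt = Math.sign(number) * Math.floor(Math.abs(number)); // 3.
  // 4. posInt modulo 2^32, forced to a non-negative result:
  const int32bit = ((posInt % 2 ** 32) + 2 ** 32) % 2 ** 32;
  // 5. Values in the upper half map to negative int32s:
  return int32bit >= 2 ** 31 ? int32bit - 2 ** 32 : int32bit;
}

toInt32(2 ** 32 + 5);  // 5, same as (2 ** 32 + 5) | 0
toInt32(-1.9);         // -1: truncation toward zero, not floor
```

Note that step 3 truncates toward zero rather than flooring, which is exactly the x86-style behavior the FJCVTZS instruction reproduces in one step.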

We do, but there are still times when a double -> int conversion is necessary, this is true in every other language as well.

The real problem is that JS inherited the x86 behavior, so everyone has to match that. The default ARM behavior is different. All this instruction does is perform a standard FPU operation, but instead of passing the current mode flags to the FPU, it passes a fixed set irrespective of the current processor mode.

As far as I can tell, any performance win comes from removing the branches after the ToInt conversion that are normally used to match x86 behavior.

Many important cases are polymorphic so have to be able to handle both.
