The widening gap between memory and core speeds suggests to me that traditional RISC philosophy is not the way forward for performance and efficiency; fixed-length instructions, load-store restrictions, and delay slots may have made implementation easier and faster back when memory could keep up with the CPU and instruction decoding was the bottleneck, but now that memory is often the bottleneck, it makes sense to have denser, more complex instruction encodings and the other code-density features that are usually left out of RISCs.
Variable-length instructions are especially beneficial to code density, since often-used instructions can be encoded in fewer bytes, leaving rarer ones to longer sequences; it also allows for easy extension. Relaxing the restriction that only load/store instructions can access memory can reduce code size by eliminating many instructions whose sole purpose is to move data between memory and registers; it also means fewer explicitly named registers are needed (instructions that read memory implicitly use internal temporary registers the CPU can manage itself), reducing the number of bits needed to specify registers.
Other considerations like the number of operands and how many of them can be memory references also contribute to code density - 0- and 1-operand ISAs require far more instructions for data movement, while 3-operand ISAs may waste encoding space if, much of the time, one source operand does not need to be preserved. Two operands are a natural compromise, and this is what e.g. ARM Thumb does.
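As a rough illustration of both the memory-operand and the operand-count points (a sketch only; exact code generation varies by compiler, flags, and register allocation), compare how a simple read-modify-write tends to come out on a load-store, 3-operand machine versus a 2-operand machine with memory operands:

    /* Illustrative only; real compilers may pick different registers
     * or use compressed encodings. */
    void bump(int *p) {
        *p += 1;
        /* Load-store, 3-operand RISC (e.g. RISC-V, uncompressed):
         *   lw   a1, 0(a0)        load from memory
         *   addi a1, a1, 1        add immediate
         *   sw   a1, 0(a0)        store back
         *   -> 3 instructions, 12 bytes
         *
         * 2-operand ISA with memory operands (e.g. x86-64):
         *   add  dword ptr [rdi], 1
         *   -> 1 instruction, 3 bytes
         */
    }

Three instructions versus one for the same statement is exactly the kind of gap the operand-count and addressing-mode choices feed into.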
This is why I find the description of "compressed RISC-V" linked in the article ( http://www.eecs.berkeley.edu/~waterman/papers/ms-thesis.pdf ) interesting - benchmark analysis shows that 8 registers are used 60% of the time, and 2-operand instructions are encountered 36/31% statically/dynamically. These characteristics are not so far from those of an ISA that has remained one of the most performant for over 2 decades: x86. It's a denser ISA than regular RISCs, and requires more complex decoding, but not as complex as e.g. VAX. I think the decision to have 8 architectural registers and a 2-operand/1-memory format put x86 in an interesting middle-ground where it wasn't too CISC to implement efficiently, but also wasn't RISC enough to suffer its drawbacks. I'd certainly like to see how an open-source x86 implementation could perform in comparison.
The real problem is energy efficiency; it rears its head everywhere, from embedded systems and tablets to supercomputers. Even desktops are affected - you can cut the cost of a machine by using a less costly power supply.
Most of the time you are not constrained by the instruction cache on RISC CPUs, because most of the time is spent in some tight loop. And you can see how hard it is to create an energy-efficient x86 implementation, in no small part because the x86 decoder is complex.
> Variable-length instructions are especially beneficial to code density, since often-used instructions can be encoded in fewer bytes,
Speaking of this, I find it interesting that ARM went back to a fixed 32-bit instruction width for ARMv8 (from 16/32 in Thumb-2). Any idea why they chose to do this?
ARMv8 is targeted at very high-end phones but mainly at servers (of course it will creep down into cheap feature phones eventually). My server has 16 GB of RAM, which is small for an ARMv8 server. So memory pressure may not be such a problem.
However it's also worth saying that Cortex-A53 can run Thumb-2 instructions. Not sure about Cortex-A57.
I can understand RISC-V's use in academic settings or if you truly want open hardware.
But what's the commercial benefit? It's an open core, and it lacks patents because the performance-critical aspects have already been patented by others in their own designs, so how does it stack up in terms of performance? Can an open effort, the way Linux was for software, produce a processor design as fast as the proprietary ones?
Second comes the issue of fabrication: is somebody ready to fab this? Or are you just going to throw it on a large FPGA? If you're throwing it on an FPGA, then why take jabs at the other ISAs when you'll be running on non-open, proprietary silicon anyway?
Lastly, who cares? I'm guessing embedded is out, since there they care about the cost of each chip: the cheaper and more performant, the better. Perhaps you're running something mission-critical or are totally tied to an architecture, but then you're a dinosaur; the industry is trending towards abstracting the hardware away anyway. Do you really care which piece of silicon your app runs on?
All of the above's probably really biased, misguided and wrong, but I'd like to hear what other HN'ers have to say.
Their F.A.Q. states that they expect performance no worse than ARM's, which sounds like a big deal (there wasn't anything like that in the OpenCore movement, like… ever!).
Fabrication: these guys do it ( http://www.lowrisc.org/ ), and don't forget about the Chinese production companies that use custom MIPS now; this is a great alternative for them. Actually, this applies to any government that needs verifiable hardware not tampered with by NATO.
I think open ISAs such as SPARC, OpenRISC and RISC-V, and their related ecosystems and tools, provide the following opportunities:
- Educational: engaging more people in hardware design and creating a much bigger community that can understand and design complex systems without having to start from scratch. Moreover, a larger community reaches improvements and maturity more quickly than proprietary solutions do.
- Commercial: I agree that they may not be able to compete with proprietary solutions in a few applications in the near future. However, in many applications the combination of an open ISA and a proprietary solution enables faster customization and shorter development time. A good example is NavSpark/Venus (http://navspark.mybigcommerce.com/), a GNSS solution based on Leon3 with an attractive price and features comparable to the state of the art.
Looks cool! Disappointed that there's no option to trap on integer overflow. Languages don't support it because processors don't support it, and processors don't support it because languages don't require it; a vicious cycle that someone needs to break.
Most of the time you don't need integer overflow. When you need it, you can insert a check.
I consider the overflow-trapping integer addition instructions in MIPS a mistake. They take up CPU real estate, they slow down the design, and in the end they are not even used!
The handling of division overflow in MIPS is fairer: if you have a division that may overflow, the compiler inserts a check. Resources are wasted only there, not everywhere.
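For illustration, the kind of guard a compiler (or programmer) can insert around a signed division looks roughly like this; the abort() is just a stand-in, and a real compiler would emit an inline check plus a trap:

    #include <limits.h>
    #include <stdlib.h>

    /* Sketch of a compiler-style guard around signed division. */
    static int guarded_div(int a, int b) {
        if (b == 0 || (a == INT_MIN && b == -1))
            abort();   /* division is undefined or would overflow */
        return a / b;
    }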
I'm going to assume you meant "most of the time you don't need integer overflow checks", because that makes more sense with the rest of your comment.
I couldn't disagree more. If you look at real programs, overflow would be a bug for the vast majority of the integer arithmetic instructions. Therefore overflow checking should be the default. Wraparound overflow is sometimes useful and should be available but it should not be the default.
For example, http://www.cs.utah.edu/~regehr/papers/overflow12.pdf finds that there are ~200 instances of overflowing integer arithmetic instructions in all of SPEC CINT2000 (many of which are bugs). Every other integer arithmetic instruction in SPEC CINT2000 (orders of magnitude more than 200, of course) should never overflow; overflow would definitely be a bug and overflow checks would be useful in catching such bugs.
See http://blog.regehr.org/archives/1154 for a more complete argument. I really hope that if an ISA really does become "the standard ISA for all computing devices" as RISC-V aspires to, that it supports integer overflow traps.
The dirty secret is: nobody cares because the ISA doesn't matter.
When programming a modern microcontroller, I regularly think: "Gee, I wish I had more pins." "Gee, I wish I had documentation on that peripheral." "Gee, I wish I had better tool support." or "Gee, I wish I had more RAM/Flash/MHz."
I never think "Gee, I wish I had a better ISA".
I applaud the effort to make an open microprocessor especially in light of the increasing efforts to put trusted module crap in our computers. However, this has no commercial advantage in any way other than that.
> The dirty secret is: nobody cares because the ISA doesn't matter.
I disagree!! While normal operations don't matter, think about features like 'trap on integer overflow': if it were widespread in the popular ISAs, we would have languages that use this semantic, and as a result fewer bugs.
Another interesting feature could be Azul's Vega real-time GC support, but I don't know whether this requires a change to the ISA or if it's just an MMU feature.
Hardware capabilities/segmentation would also require support in the ISA.
That said, I agree with you that RISC-V is just 'yet another ISA' without interesting technical features; its main feature is that it is open and you can implement it without paying someone for the privilege.
> While normal operations don't matter, think about features like 'trap on integer overflow': if it were widespread in the popular ISAs, we would have languages that use this semantic, and as a result fewer bugs.
You need to study history, son. :)
All of the ISAs from the '70s and early '80s HAD an overflow feature. It was wiped out when we jumped to 32-bit architectures because overflow was so much less common.
GC at the hardware level was, I believe, done by the Lisp Machine. However, standard RISC chips could run rings around it.
Modern ISAs aren't simply a bug in amber that solidified the mistakes of yesteryear, never to be rectified. Modern ISAs have many features precisely to correct the mistakes made in the past.
> All of the ISAs from the '70s and early '80s HAD an overflow feature.
Really? The first ISA I learned was the 6809; my memory is a bit fuzzy, but I don't remember any 'trap on overflow' feature in that ISA. Does x86 have this?
Plus, if you believe that new languages are always better than old languages, it is you who needs to study history better.
About GC at the hardware level: the Lisp machine failed, so what? There are many different ways to help support GCs.
Plus, hardware GC wasn't the only feature of Lisp machines.
I don't get your point about "modern ISAs"; plus, I'm not sure you can lump the x86-64 and ARMv8 ISAs into the same category, even though both are 'modern ISAs'.
Well, the concept of trapping on a condition didn't appear until deep pipelining. However, hardware very much supported condition codes. In fact, the reason condition codes disappeared was deep pipelining: you had to wait for the full result in order to compute carry, overflow, etc.
However, I can tell you from painful personal experience that the 6809 ISA had an overflow condition code.
Lisp Machines had hardware support for GC. That does not mean the GC was a hardware function.
RISC machines were faster than the Lisp processors, but not at the time when both were actively developed, and mostly only on the smaller benchmarks. At the time RISC processors were appearing, several Lisp chips based on RISC were developed (SPUR, Symbolics, Xerox, ...). They never reached the market, mostly because the AI Winter had already set in. Lisp Machine software was then ported, as virtual machines, to Sun SPARC (Xerox) and DEC Alpha (Symbolics).
And you'd better read before replying: I said trap on integer overflow, not branch on integer overflow.
The former gives you the overflow check 'for free'; the latter reduces code density, which impacts the instruction cache, which can reduce performance.
And in the early days, performance was above everything else, CPUs being so slow.
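To make the density point concrete, here is a rough sketch (register names and instruction sequences are illustrative, and __builtin_add_overflow is a GCC/Clang extension) of what a checked add costs with a branch versus a trap:

    /* With only a flags register and a conditional branch, every
     * checked add needs an extra instruction, roughly:
     *     add  eax, esi
     *     jo   overflow_handler    ; one extra branch per operation
     * With a trapping add (MIPS 'add' as opposed to 'addu'), the
     * check comes with the instruction itself:
     *     add  $v0, $a0, $a1       ; traps on overflow, no extra code
     */
    int checked_add(int a, int b) {
        int r;
        if (__builtin_add_overflow(a, b, &r))
            __builtin_trap();
        return r;
    }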
In this case, an open ISA would allow more stability over time. Moreover, once we have an open and widely adopted ISA, one can build an open microcontroller. After that, embedded developers will stop getting bad surprises when some specific microcontroller is deprecated or out of stock and the whole board (and likely large parts of the firmware) has to be redesigned.
And you're right: the ISA doesn't matter. It could be ARM, AVR or OpenRISC / RISC-V, but it's very desirable for it to be stable and embeddable into an SoC without purchasing additional licenses.