If EE/CS departments of colleges adopt RISC V hardware for teaching their students, providing cheap microcontrollers and boards to students at the start of their semester classes, those precocious little buggers are going to build Doom clones and help port their favorite flavor of Linux onto them. When you've got a generation of top talent tinkering with an ISA that doesn't suck like x86, you're going to see adoption in actual industry.
RISC V is not much more than a slightly tweaked MIPS, the previous "preferred academic architecture", and we all know what happened to that... much to the chagrin of those who thought MIPS would be "the future of computing".
My prediction is that it will probably replace MIPS where it's currently used (very low-end tablets and phones, and a lot of home routers), but won't really displace ARM or x86. A "pure RISC" like MIPS and now RISC V just doesn't have the "CISCness" that makes successful CPUs beyond the theoretical.
Look at the results for the Loongson. It has at least twice as much cache as the others and 4-way OoO, yet after normalising for clock frequency and process power, it manages to be dead last in efficiency on 3 out of 4 of the benchmarks.
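To make "normalising for clock frequency" concrete, here is a minimal sketch of the arithmetic. The chip names and numbers are made-up placeholders, not figures from the linked benchmark, and the sketch ignores the process-node correction entirely.

```python
def per_mhz(score: float, clock_mhz: float) -> float:
    """Benchmark score divided by clock frequency (higher is better)."""
    return score / clock_mhz

# Hypothetical (score, clock in MHz) pairs, for illustration only.
chips = {
    "chip_a": (2000.0, 1000.0),
    "chip_b": (1800.0, 800.0),
}

# Per-clock score for every chip.
normalised = {name: per_mhz(score, mhz) for name, (score, mhz) in chips.items()}

# Names ordered from best to worst per-clock score.
ranking = sorted(normalised, key=normalised.get, reverse=True)
```

With these placeholder inputs the chip with the lower raw score ranks first once clock speed is divided out, which is the kind of reordering the comparison relies on.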
I've been following trends in the industry since the 90s. The "RISC dream" is still very much a dream. Trying to get an overly simple and restrictive ISA to perform well is sheer folly. Things that the RISC proponents thought would matter (decode complexity, number of architectural registers, addressing mode complexity) actually didn't, and what they thought wouldn't matter (memory bandwidth/latency, clock speed, instruction density), did. A denser, more intel-ligent (pun fully intended) ISA is what x86 and ARM are moving toward, while the MIPS/RISC advocates are going backwards.
This is a processor comparison. The Loongson is the only 90nm chip here. No wonder it has no chance against 32nm and 40nm chips. The 60nm A8 also performs horribly.
But in general, these more complex instructions are where most of the microarchitectural optimisation happens. The BCD instructions mentioned in the above link were originally implemented in microcode and took dozens of cycles; for example, AAD took 60 cycles on an 8086, but only ~4 on a Skylake (and can be scheduled in parallel with the uops of other instructions) --- despite the little use it has today. A look at instruction timing tables shows that the common idea of little-used instructions receiving no optimisation or even becoming much slower with each new processor generation is mostly false (there are exceptions, like the ill-fated P4.)
Hey, that's unfair to C! If an OS is reckless enough to give a process memory when it asks for it, it should have no business worrying about what the process does with the allocation!
In all seriousness though, the language can't be held responsible for poor programming practices.
For any insecurities resulting from C code, blame the compiler!
All the attention from RISC-V comes down to the fact that the creators like to trash talk proprietary ISAs like x86 and ARM and they like to put RISC-V under the banner of being "new and clean". This leads the average joe (and by "average joe" I mean software engineers without any background in hardware or in CPU architecture) to dream of a "Linux of CPUs", thinking they could hack something overnight and magically have a working open hardware, even though the CPU is just a small part of the entire system.
In academia, RISC-V is a very good thing since it allows research into new kinds of architecture without being bound by proprietary ISAs. But at the industrial level, it's just hype.
Wrong. I believe most people here are focused on high performance implementation (servers, desktops, laptops) and Risc V won't get there for a long time, if at all. High performance needs very high investment and close work with the fabs to tweak a design to make the most of new process nodes. Only big players can afford this. So if you have only this in mind, yes Risc V may sound like hype.
But there's a huge domain of embedded designs, with small scale CPU where the cost matters a lot. There are many niche players there, like Cortus, Andes, BA Semin and now SiFive. ARM is also there with Cortex M, but it's expensive and the line-up has holes (big gap between M and A/R series). Yes, ARM is expensive at the low end. ARM is cheap at the high end, when compared to Intel. Not at the low end.
So far all these players have suffered from a limited ecosystem. With dedicated ISAs, each one had to support their toolchain with limited resources, and limited 3rd party support. What Risc V offers to them is the ability to leverage a significant ecosystem. They can focus on the CPU core implementation, with value added customization (extra instructions, accelerators, ...) and support, and leverage all the tooling around Risc V with much less effort. This makes a big difference. And all of them are moving to Risc V, it's their future. As a user, it does make a big difference to me. Instead of a lagging GCC toolchain, I'll soon have access to recent GCC and LLVM based tools for example.
For IoT in general, Risc V offers the possibility to have an open system, which in the end means more competition and lower costs, with no lock-in from some big actors "owning" the ISA you're using. A lot of people are looking at this seriously.
So no, Risc V is not hype. And there are already industrial changes. You just need to look at the deeply embedded world.
I'm still hoping they can somehow get access to a flash process, so that they can integrate the CPU core (and peripherals, of course) with the flash in the same chip. This board still uses QSPI to talk to an external flash, which houses the code.
I'm not sure how programming is done. With "traditional" microcontrollers that have their own flash, you use JTAG (or even serial on the older ones) to program the flash directly, then just remove the JTAG lines, reset the board, and it runs your code. With this, I guess you have to program the SPI flash directly, or (perhaps?) use JTAG to talk to the microcontroller core and control its SPI that way?
Anyway, my understanding is that flash memory is "deep analog magic" which the people behind RISC-V haven't mastered yet, so they can't design the memory into the same chip as the processor core.
I'm still a little unclear on RISC-V's goals - are they looking at the microcontroller market or are they looking more to offer an alternative for ARM and x86 CPUs?
In the microcontroller market there's a lot of competition right now, especially with devices like the ESP32 going for $8 with wifi and bluetooth.
Their goal is to be the standard ISA for all general purpose computers; microcontrollers, embedded controllers, servers, HPC, workstations, laptops, smartphones. Microcontrollers happen to be one of the quickest markets to enter with a new ISA.
> devices like the ESP32 going for $8 with wifi and bluetooth
Well, give Espressif some time, they've been a member of the RISC-V foundation for a while; so I suspect they'll be dropping Tensilica for RISC-V at some point. Maybe we'll see a module like this with RISC-V cores in the nearish future. For now it surely isn't competing in the same part of the market. For what it's worth, the FE310 seems to basically lead in power and peak performance in its category; there are surely some use cases which it is already enabling.
Maybe a bit cynical, but given their churn in the lower level specifications (supervisor mode and up) my guess is:
Main goal: write lots of theses and papers and obtain degrees.
Secondary goal: have an unencumbered ISA to help with the main goal.
Everything else depends on those who adopt the ISA. If some org is willing to spend the multi-million $$$ necessary to design a high-performance desktop class CPU (and more to make it a reality), it will happen.
Those low-end chips are easier to do, and an easier market to serve: The ISA's license fees are a bigger chunk of the chip's price, so RISC-V dropping that to 0 helps more. You also don't need to build a high-end chip that blasts everything else out of the water before your newcomer is even considered, since in the microcontroller market there are several pockets of "good enough".
If that works out, there might be attempts to scale up later (when they have experience and financial resources to work with).
It's a good sign that NVIDIA and Qualcomm are both integrating RISC-V for embedded controllers on their chipsets. This means that there will be experience and maybe R&D going into RISC-V cores at these two massive vendors of embedded and mobile CPUs. It is not inconceivable that RISC-V could... cross-contaminate to their high-performance application processor business.
The examples of contributions from big players using BSD-like licenses, especially in the embedded space, prove otherwise.
Thus making the base RISC-V ISA kind of useless for software that one might actually care about.
Useful ISA extensions are likely to remain royalty free at least, if not standardized at the level of the foundation; but if they don't, it's not the end of the world.
To reply to your question, as a developer I don't care about ARM vendor specific features, because Android hides them for me.
However as a consumer, they do impact updates.
Throw in that we already have a mainstream (if you squint) desktop OS that is a compile away (and has driver support for a lot of hardware) and if I were Intel I'd be worried in the mid-term.
Hell for the first time since the first AMD64's came out Chipzilla is looking vulnerable from the AMD direction, a surprise to a lot of people (including myself, I bought a Ryzen 1700 for work back in late May and I've been astounded by the performance per £ on my workloads).
Intel shares with Nvidia (also currently using large dies) a limited form of performance leadership at the high end that may be eroded in the face of these shifts. Under current conditions, most of the market is going to see more benefit from more cores than from faster cores. That might be less true once we're talking about a baseline of 16 or 32 cores, but 2 is proving to be too few for current workloads.
And that's just their nearest competitor. ARM SOCs have consumed the mobile market and are creeping upwards. My ARM Chromebook makes a pretty good Linux desktop too, and you can practically trip over RaspPi devices.
In the end I went with an i7-7700HQ because I simply couldn't make my old laptop last any longer, 8GB ram just wasn't enough for my workflow.
If the mobile Ryzens are amazing I'm going to regret that maybe.
In contrast, the simplicity of RISC means such optimisations are either not possible or much harder --- if by definition all instructions already run in one clock cycle on one execution unit, then the only choices are to add more execution units (something the CISCs can also do with the same amount of difficulty) or attempt to recognise sequences of instructions and combine them for execution on dedicated hardware; the key difficulty here being that recognising and combining instruction sequences is much harder than decoding a single instruction to a sequence of uops or dispatching it to hardware internally.
REP MOVS/STOS is not optimized at all. The Architectures Optimization Reference Manual says that SIMD instructions are faster. The only reasons to use REP MOVS/STOS are to deal with unaligned parts of the data and to avoid the startup cost of checking via CPUID whether SIMD is available when you have only a few bytes. Further, the Intel manual tells you not to use the other string instructions at all.
> if by definition all instructions already run in one clock cycle
Then you are using a different definition of RISC than RISC-V does
> the key difficulty here being that recognising and combining instruction sequences is much harder than decoding a single instruction to a sequence of uops or dispatching it to hardware internally.
Why is that supposed to be more difficult? Matching combinable instructions is trivial.
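For what it's worth, the simple adjacent-pair case can be sketched as a small peephole pass. This is only an illustration under loud assumptions: the `Insn` tuple and the `li_fused` internal op are hypothetical names invented here, real decoders do this in hardware rather than software, and it covers only a two-instruction `lui`+`addi` pattern, not longer sequences such as whole loops.

```python
from typing import List, NamedTuple

class Insn(NamedTuple):
    """Toy instruction record (hypothetical, not a real decoder format)."""
    op: str   # mnemonic, e.g. "lui" or "addi"
    rd: str   # destination register
    rs: str   # source register ("" if the op has none)
    imm: int  # immediate operand

def fuse_pairs(insns: List[Insn]) -> List[Insn]:
    """Replace each adjacent lui+addi pair that builds a constant in one
    register with a single hypothetical 'li_fused' internal op."""
    out: List[Insn] = []
    i = 0
    while i < len(insns):
        cur = insns[i]
        nxt = insns[i + 1] if i + 1 < len(insns) else None
        if (nxt is not None
                and cur.op == "lui" and nxt.op == "addi"
                and nxt.rd == cur.rd and nxt.rs == cur.rd):
            # lui places imm in the upper bits; addi adds the low part.
            out.append(Insn("li_fused", cur.rd, "", (cur.imm << 12) + nxt.imm))
            i += 2  # both instructions consumed
        else:
            out.append(cur)
            i += 1
    return out

program = [
    Insn("lui", "a0", "", 0x12345),
    Insn("addi", "a0", "a0", 0x678),
    Insn("addi", "a1", "a0", 1),
]
fused = fuse_pairs(program)
```

The pair is fused only when the `addi` overwrites the same register the `lui` wrote, so the intermediate value is never visible to later instructions.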
That might've been true for a period some time ago, but before and after that it has been the preferred way to do memcpy() or memset() since it can operate on entire cachelines at a time. It's one of the fastest, if not the fastest, and also the absolute smallest (which helps with icache consumption), while leaving the SIMD registers free to do more... useful things than shuffling data around.
How is it "trivial" to match a round of AES or SHA, or a multiplication or division algorithm, or even a memory copy/store loop, so they can be replaced with the optimal hardware implementation...?
I don’t know much of CPU IP licensing, but I would think one pays more for the design of the CPU than for the ISA.
And that’s probably for the better, because if one pays for the ISA, but the RISC-V ISA is free, how are ‘they’ going to get “financial resources to work with”?
You have a couple of options:
- Use a ready made CPU design (eg. Cortex-M) and pay license $$ for each chip sold.
- Use an existing ISA (eg. ARMv4), build your own CPU around it and pay license $ for each chip sold.
- Use RISC-V, build your own CPU around it and pocket those $ you'd otherwise pay for the ISA license (or reduce your price to become more competitive, or something in between)
(And then there's #4: Design your own ISA.
Now you also have to deal with standard software: compilers, kernels, some libraries. Still may make sense in some cases, but why go through the trouble if all you want is to save some pennies per chip on the ISA?)
In all but the first case, your CPU design is a one-time cost that becomes marginal over the (hopefully many) chips you sell. Especially microcontrollers can be _very_ high volume where this makes sense (more so with the IoT hype that's going on).
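The amortisation argument above can be sketched with a few lines of arithmetic. All figures here (design cost, royalty, volumes) are invented placeholders chosen for illustration, not real licensing terms from ARM or anyone else.

```python
def cost_per_chip(design_cost: float, isa_royalty: float, volume: int) -> float:
    """One-time design cost spread over the volume, plus any per-chip ISA royalty."""
    return design_cost / volume + isa_royalty

# Hypothetical inputs: same in-house core design effort in both cases.
DESIGN_COST = 2_000_000.0   # one-time cost of building your own core
LICENSED_ROYALTY = 0.05     # per-chip fee for a licensed ISA (assumed)
RISCV_ROYALTY = 0.0         # no per-chip ISA fee

low_volume = 100_000
high_volume = 10_000_000

licensed_low = cost_per_chip(DESIGN_COST, LICENSED_ROYALTY, low_volume)
licensed_high = cost_per_chip(DESIGN_COST, LICENSED_ROYALTY, high_volume)
riscv_high = cost_per_chip(DESIGN_COST, RISCV_ROYALTY, high_volume)

# Fraction of the per-chip cost that is ISA royalty at each volume.
royalty_share_low = LICENSED_ROYALTY / licensed_low
royalty_share_high = LICENSED_ROYALTY / licensed_high
```

Under these assumed numbers the royalty is a negligible share of per-chip cost at low volume but a noticeable one at high volume, which is why the saving matters most for very high volume microcontrollers.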
Which is why I think it's a wise idea for them to aim for that market first and build up experience and a war chest.
At this point, it would be very lovely and educational to see an Oberon system (http://www.projectoberon.com/) ported to these small boards. (It could definitely use more memory to run a desktop system, though.)
Then there was the Sceptre board, but it went out quite fast.
But given Astrobe, there are a few more boards to target.
Posting the direct link for the Oberon image.
I have the RioRand® EP2C5T144 Altera Cyclone II FPGA Mini Development Board ($23 on Amazon).
You'll also need a clone of a USB blaster JTAG cable, which goes for $10 or less.
The ecosystem is starting to pull industry players.
Aside from GroupGets and MassDrop, who else is operating in the crowd-funded, discounted bulk-purchase collaborative.. consumption.. space?