
SiFive U8-Series Core IP - ingve
https://www.sifive.com/press/sifive-announces-new-u8-series-core-ip-for-high-performance
======
sanxiyn
This is a press release. There is a blog post going a bit more technical here:
[https://www.sifive.com/blog/incredibly-scalable-high-
perform...](https://www.sifive.com/blog/incredibly-scalable-high-performance-
risc-v-core-ip)

------
monocasa
The floating point unit is optional on the U8, which is a non standard choice
for a OoO design. Normally if you've got an OoO frontend, sticking a FPU is a
drop in the ocean (or can be).

Is this an artifact of them using BOOM (which is just an assumption on my
part), since BOOM makes it ridiculously easy to add or remove backend
execution units?

~~~
reitzensteinm
There are an absolute ton of customization options available. In the context
of the decisions that can be made, making FP optional isn't surprising.

I do have to wonder about whether they've got tools for real customers that
can effectively test the tradeoffs being made, at least for performance
options like cache size & associativity. Even with all the options at your
disposal and an intimate knowledge of your problem domain, beating a handful
of well made and broadly tested off the shelf designs won't be easy.

I _love_ this stuff, but I'm a bit skeptical if it's actually something the
market is asking for. If there were more substantial options like crypto
accelerators, GPGPU, video encoding/decoding, FPGA, mixing big and lots of
small cores, vector extensions, being able to size up the ALU for specific
operations... I could see the pain of custom silicon paying off.

Having accelerators sharing the cache hierarchy is not something the big
players have really done much of yet and probably won't until there's so much
dark silicon they can more or less throw the kitchen sink in.

[https://scs.sifive.com/core-
designer/customize/f9c3a7e1-08b7...](https://scs.sifive.com/core-
designer/customize/f9c3a7e1-08b7-464e-8bf9-39da62e2e35b/)

~~~
brucehoult
One benefit of having customers customize their own cores, is they can iterate
their settings (on FPGA) in hours and test the result on their _actual_
workload, not just guess how some semi-standard core might perform on their
workload based on SPEC or CoreMark or something.

The announcement says the U87 with (Cray-like) vector processing will be
available in H2 2020.(Also ARM SVE-like, but ARM hasn't shipped anything with
it yet, and I think hasn't even announced anything with it? Fujitsu has, in a
supercomputer) Meantime, the U84 has only scalar integer and FP, with nothing
equivalent to NEON on ARM. That might or might not matter to you.

Mixing a reasonable number of cores (up to nine) in the automated SoC
configuration tool was announced alongside the 7-series (dual issue in order)
cores in October last year. That continues. If you want more cores than that
some engineering time would be required to verify it.

[https://sifive.cdn.prismic.io/sifive%2F5ec09861-351b-420c-b6...](https://sifive.cdn.prismic.io/sifive%2F5ec09861-351b-420c-b6e3-e2b76843044f_linley+report+-+sifive+raises+risc-v+performance.pdf)

Adding QuickLogic eFPGA to a SiFive core has been announced:

[https://www.prnewswire.com/news-releases/quicklogic-teams-
wi...](https://www.prnewswire.com/news-releases/quicklogic-teams-with-sifive-
to-make-efpga-technology-available-via-designshare-portfolio-300939385.html)

~~~
reitzensteinm
Very cool, thank you!

While you can iterate pretty quickly on an FPGA, I'd still be worried that
cutting down a chip to optimize for $/perf works until it doesn't. Something
off the shelf would be pretty resilient to algorithm tweaks (ad absurdum you
have an Intel chip which performs black magic to run almost anything you throw
at it well).

It's pretty hard to overcome the economy of scale of just buying a mass
produced chip that's twice what you need but ten times cheaper because they're
made in batches of a million.

I should note that my skepticism is coming from a place of "I want this to be
a thing". Using the web page to design a chip is an instance of "live in the
future and build what's missing" if I've ever seen one.

~~~
mlyle
This is not to beat commodity processor/controller silicon. If you are going
to build a SoC for whatever reason (peripherals, integration, etc) -- you
might as well make the integrated processing exactly what you want.

------
jabl
So, is the U8 core based on BOOM, or is it a ground up design? Or if neither,
then what is it?

And the HBM2E+ is some doohickey so customers can combine a RISC-V core with
HBM2 memory?

(and if somebody from SiFive happens to read this, there's a typo, presumably
HMB2E+ means HBM2E+)

~~~
brucehoult
Fixed now, thanks.

~~~
sanxiyn
I still see HMB2E+.

~~~
brucehoult
Really fixed :-)

------
phkahler
How can it support the vector extensions? I didnt think the spec was done yet.
Is it?

~~~
brucehoult
The U84, available to lead customers now, doesn't have the vector extensions.
The U87, announced to be made available in H2 2020, will.

The vector extension spec is 99% done. There shouldn't be anything more than
minor tweaks from now.

Note that part of the RISC-V extensions philosophy is that specs are not
finalized until (preferably) multiple implementations have been made and shown
to work as expected in the real world.

~~~
sanxiyn
Is there another implementation of vector extension?

------
giomasce
AFAIU one of the nice thing of the SiFive IP cores is that they were
distributed with some kind of free license (although they still required other
non-free IPs to be actually used in a chip). Is this still (or was it ever)
true?

~~~
_chris_
AIUI, the U5 series is BSD/open-sourced as "Rocket", and a version of their
shared L2 cache is available as the sifive-inclusive cache. I believe Berkeley
uses these artifacts as part of their FireSim
([https://fires.im](https://fires.im)) FPGA-based infrastructure.

------
dclusin
Curious what about this technology makes it front page noteworthy? Has the
deep learning community been hotly anticipating the arrival of this
technology, or is it more along the lines of stuffing the ballot box?

~~~
monocasa
It's the announcement of the highest perf RISC-V core that you'll be able to
put your hands on so far.

~~~
lsllc
"you'll be able to put your hands on" ... the million dollar question: When???

~~~
brucehoult
If you're a company making your own SoCs already and want a high performance
RISC-V core for a mid-range tablet or phone or an SBC equivalent to the
Raspberry Pi 4 or something like that then the answer is "now".

If you're someone wanting to buy a quantity one chip from Digikey or Mouser
then it'll be ... when someone decides to make and sell a retail chip. Which
might well be someone making their own chip for a tablet or set-top box etc
and selling the surplus production.

