Highlights: The version of BOOM in the paper (two years ago) had the same IPC as...

_chris_ · on Sept 27, 2017

Hi, author here. Just to point out to others (and to be fair to the A15), IPC is just one part of the final performance equation. =)
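For context, the "performance equation" mentioned here is presumably the classic iron law of processor performance. A sketch, with symbol names chosen for illustration rather than taken from the comment:

```latex
T_{\text{exec}}
  = N_{\text{instr}} \times \text{CPI} \times t_{\text{cycle}}
  = \frac{N_{\text{instr}}}{\text{IPC} \times f_{\text{clk}}}
```

Here N_instr is the dynamic instruction count, IPC = 1/CPI, and f_clk = 1/t_cycle is the clock frequency. Two cores with equal IPC can therefore still differ in execution time if their instruction counts or achievable clock frequencies differ.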

microcolonel · on Sept 27, 2017

Chris, I worked hard to get you on that pedestal, don't go jumping off. ;- )

Fair enough, you didn't reach the same frequencies, but that's what the other 1.4mm² and the process shrink are for.

ARM[v7] maybe does a little bit more per instruction, what with those conditions and 14+ character non-mnemonic mnemonics; but ultimately instruction counts should be pretty close, right?

Update: also probably SIMD [or vectors], breakpoints, more interesting memory management, the handling of bizarre FP corner cases, maybe power management [high-frequency DVFS? :- )], and other things go in that additional 1.4mm².

_chris_ · on Sept 27, 2017

> Chris, I worked hard to get you on that pedestal, don't go jumping off. ;- )

O:-)

> ARM[v7] maybe does a little bit more per instruction, what with those conditions and 14+ character non-mnemonic mnemonics; but ultimately instruction counts should be pretty close, right?

Great question. I just so happen to have written a tech report on this very topic! https://arxiv.org/abs/1607.02318

Basically, performance should be identical between ARMv8 and RISC-V, given the RISC-V core implements macro-op fusion to combine things like pair loads together.
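For readers unfamiliar with macro-op fusion, here is a toy sketch of the idea for the pair-load case mentioned above: a decoder scans the instruction stream and merges two adjacent loads off the same base register into one internal operation. The `Load` type, the `fuse_pair_loads` helper, the 8-byte stride, and the exact fusion conditions are all assumptions made for illustration; they are not BOOM's implementation or the rules from the tech report.

```python
from dataclasses import dataclass

WORD_BYTES = 8  # assumption: 64-bit loads at consecutive 8-byte offsets


@dataclass(frozen=True)
class Load:
    """Hypothetical decoded load: rd <- mem[base + offset]."""
    rd: str
    base: str
    offset: int


def fuse_pair_loads(stream):
    """Return a new stream in which adjacent, compatible loads are fused.

    Non-load instructions are plain strings here and pass through untouched.
    A fused pair is represented as the tuple ("ldpair", first, second).
    """
    fused = []
    i = 0
    while i < len(stream):
        cur = stream[i]
        nxt = stream[i + 1] if i + 1 < len(stream) else None
        can_fuse = (
            isinstance(cur, Load)
            and isinstance(nxt, Load)
            and cur.base == nxt.base                   # same base register
            and nxt.offset == cur.offset + WORD_BYTES  # consecutive words
            and cur.rd != cur.base                     # first load must not clobber the base
            and cur.rd != nxt.rd                       # distinct destinations
        )
        if can_fuse:
            fused.append(("ldpair", cur, nxt))
            i += 2  # consume both loads
        else:
            fused.append(cur)
            i += 1
    return fused


# Illustrative input: two stack loads, an add, then an unrelated load.
program = [
    Load("a0", "sp", 0),
    Load("a1", "sp", 8),
    "add a2, a0, a1",
    Load("a3", "sp", 24),
]
fused_program = fuse_pair_loads(program)
```

In this sketch the four-instruction `program` becomes three internal operations, since the first two loads are merged. The instruction count seen by the ISA is unchanged; only the number of operations the pipeline has to track drops.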

microcolonel · on Sept 27, 2017

> Great question. I just so happen to have written a tech report on this very topic! https://arxiv.org/abs/1607.02318

> Basically, performance should be identical between ARMv8 and RISC-V, given the RISC-V core implements macro-op fusion to combine things like pair loads together.

Yeah, I drew my conclusions from your papers. :- )

I really should diversify my sources, I bring nothing to this exchange.

wmf · on Sept 27, 2017

Wasn't that comparison totally rigged by omitting a bunch of stuff (e.g. SIMD) from the BOOM core that wasn't used by a particular benchmark?

_chris_ · on Sept 27, 2017

I was using CoreMark in that report, which is what ARM used to market their cores. The 32b ARMv7 cores have NEON SIMD, but I believe no FMAs, whereas BOOM is 64b and includes double-precision FMA units.

Also there's no current RISC-V extension for SIMD or vector ops, so I didn't maliciously "omit" things to "totally rig" the comparison. But even running SPECint is not going to fire up the SIMD unit.

pertymcpert · on Sept 27, 2017

Uhhh yes ARMv7 NEON has vector FMAs.

SPEC with modern compilers will definitely use SIMD. Utilization in hot regions is variable, but it's definitely beneficial.

adrian_b · on Sept 27, 2017

Not all ARMv7 cores have FMA, only the newer models.

Models with FMA: Cortex-M4, Cortex-M7, Cortex-A5, Cortex-A7, Cortex-A15, Cortex-A17. I do not remember which Cortex-R cores have FMA. Models without FMA: Cortex-A8, Cortex-A9.