
An update on GNU performance - ingve
https://community.arm.com/tools/b/blog/posts/update-on-gnu-performance
======
wyldfire
The original title, "An update on GNU performance", is missing some context
that's mostly implicit from the source. It would be great if it said either
"An update on GNU toolchain performance on ARM" or "An update on GCC
performance on ARM".

~~~
badosu
Sigh, still hoping for the day it means Hurd with GNU toolchain...

~~~
2trill2spill
If you don't mind me asking, why Hurd? Is it because it's a microkernel, or
are there other things I'm missing?

~~~
flukus
Having had to work on a locked-down Linux system (stupid corporate rules) for
the last couple of months, the dedication to user freedom sounds enticing:
[https://www.gnu.org/software/hurd/community/weblogs/ArneBab/...](https://www.gnu.org/software/hurd/community/weblogs/ArneBab/technical-advantages-of-the-hurd.html)
. As a user, being able to install packages without root, start my own
services, and mount network drives without root would make my life a lot
easier.

~~~
jcelerier
> As a user being able to install packages without root and start my own
> services and mount network drives without root would make my life a lot
> easier.

... but that's not a problem with Linux, that's a problem with your company's
rules. They would impose similar rules whatever the OS.

~~~
flukus
"Locked down" was probably not the right word; I'm just a normal user with no
root access. It is a Linux problem because the admins are trying to do
something pretty sensible: not let a single user screw up the machine for
everyone. Restricting users is how Linux achieves this; Hurd took a different
approach.

------
amelius
Does anyone know why "char" is unsigned on ARM/gcc?

To me it seems like a weird design choice that only complicates porting
software from x86.

~~~
Someone
Performance. Historically, ARM didn’t have a _“load byte and sign extend”_
instruction ([http://www.drdobbs.com/architecture-and-design/portability-t...](http://www.drdobbs.com/architecture-and-design/portability-the-arm-processor/184405435)),
making loading a _signed char_ and promoting it to an _int_ slower than
loading an _unsigned char_ and promoting it to an _int_.

In C, a function argument or return value of type _char_ gets promoted to
_int_. So, code that uses _char_ a lot does a lot of such promotions.

~~~
loeg
> In C, a function argument or return value of type char gets promoted to int.
> So, code that uses char a lot does a lot of such promotions.

I think you're confusing standard C and machine/implementation-specific
behavior. (If not for yourself, for people who read your comment.)

------
ilaksh
I thought that media codecs were handled outside of general purpose code. Like
maybe a special chip or on a graphics processor.

~~~
richardwhiuk
Nope. They could be GPGPU accelerated, but if you aren't rendering to screen
then it's not typically graphics processor territory.

Plus, modern codecs are quite linearly data-dependent, which is a bad fit for
a GPU.

~~~
ilaksh
But aren't movies rendered to the screen?

My understanding is most smartphones at least have hardware decoding for H264.

~~~
semi-extrinsic
Exactly: a GPU needs specific hardware (a dedicated ASIC block) for H264
acceleration. It's not accelerated by the usual GPU hardware, which is
sufficiently general-purpose that an accelerator running on it would not be
codec-specific.

------
mrob
This article includes a misleading graph. The "performance improvement" bar
graph has bars that extend down into performance degradation, making the
improvement look bigger than it really is. If you're going to label the bars
as "improvement" then the y-axis should start at 1.

~~~
wyldfire
No, it's just normalized data. See the legend: "Throughput/latency improvement
over glibc 2.27".

Rather than show the new metric and the baseline, they showed all metrics as a
function of the baseline. All of them are an improvement.

This is a common idiom for compiler benchmarks -- IMO it's not misleading at
all.

> making the improvement look bigger than it really is.

They even highlight the 1.0 line. IMO it would be confusing for them to start
the y-axis at 1.0.

~~~
mrob
I don't recall seeing normalized data like this on a bar chart before. It's
standard for line charts where the x-axis is sampled from continuous data, not
bar charts. The 1.0 value is effectively the zero point, so bars should extend
up and down from 1.0 (in this case only up because no performance got worse).

~~~
jcranmer
This kind of normalized bar graph is the most common way things like SPEC
results are reported, although I think it's more common to also show one
category of bars fixed at 1.0 to label what everything is being normalized
against. (Annoyingly, throughput and latency are both shoved into the same
graph, which gives the misleading impression that you're comparing one
against the other.)

------
oytis
Nice job! Would expect more implementation details from the article though.
"Pattern-matching capabilities" is not very descriptive.

