
Usability Improvements in GCC 9 - chx
https://developers.redhat.com/blog/2019/03/08/usability-improvements-in-gcc-9/
======
saagarjha
Great! This directly addresses one of the reasons I use Clang most of the
time, and it's great to see the GCC team realizing that error-message
quality-of-life is important. As an aside:

    
    
      $ gcc-9 -c cve-2014-1266.c -Wall
      cve-2014-1266.c: In function ‘SSLVerifySignedServerKeyExchange’:
      cve-2014-1266.c:629:2: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
        629 |  if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
            |  ^~
      cve-2014-1266.c:631:3: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘if’
        631 |   goto fail;
            |   ^~~~
    

this is a cute example ;)

~~~
umvi
It saves us frequent users of C++/python from ourselves :P

------
umvi
Another major improvement in GCC 9 is to gcov - starting with GCC 9, gcov can
emit coverage results in JSON format as well as take multiple coverage files
as arguments, writing all the results to stdout, one JSON object per line
(this is huge).
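A minimal sketch of consuming that one-object-per-line output in Python. The field names here ("files", "lines", "line_number", "count") follow the gcov JSON schema as introduced in GCC 9, but treat them as assumptions and check them against your gcov version:

```python
import json


def coverage_from_json_lines(stream):
    """Aggregate line coverage from gcov output where each line of the
    stream is one JSON object (one per input coverage file).

    Schema assumption: each object has "files" -> [{"lines" ->
    [{"line_number": ..., "count": ...}]}], per the GCC 9 gcov docs.
    """
    covered, total = 0, 0
    for raw in stream:
        obj = json.loads(raw)
        for f in obj.get("files", []):
            for line in f.get("lines", []):
                total += 1
                if line.get("count", 0) > 0:
                    covered += 1
    return covered, total
```

Since each line is an independent JSON object, the parsing itself can also be farmed out to worker processes, which is exactly what makes the parallel-wrapper approach below feasible.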

This will allow much of the code coverage calculation process to be done in
parallel (i.e. _much_ faster than is currently possible). This is because
currently gcov writes everything out to files, which forces the wrapper to be
single threaded lest one gcov process overwrite the output file of another.

For fun, I wrote a parallelized GCC 9 gcov wrapper in Python that generates an
LCOV coverage report that genhtml can consume. Unofficial/anecdotal
benchmarking shows incredible gains over lcov on my own personal projects
(700 ms vs. 90 seconds). I'm sure it could be improved even more.

[https://github.com/RPGillespie6/fastcov](https://github.com/RPGillespie6/fastcov)

~~~
jcranmer
lcov can be really, really slow in how it handles its own file format. I ended
up rewriting the genhtml stuff myself in Python (originally, it was to drive a
treemap view, but extending it to handle the actual detailed coverage wasn't
difficult), and then moved the "merge two files into one file" into Python as
well for a >10x speedup. And Python is not known for its blazing speed.
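The "merge two files into one file" step is, at its core, just summing per-line hit counts per source file. A sketch of that core, assuming the .info files have already been parsed into nested dicts (the lcov .info parsing itself is omitted, and this data layout is my assumption, not lcov's):

```python
from collections import defaultdict


def merge_coverage(a, b):
    """Merge two coverage mappings of the shape
    {filename: {line_number: hit_count}} by summing hit counts --
    the heart of an lcov-style "combine tracefiles" step.
    """
    merged = defaultdict(lambda: defaultdict(int))
    for cov in (a, b):
        for path, lines in cov.items():
            for line_no, count in lines.items():
                merged[path][line_no] += count
    # Convert back to plain dicts for a stable, serializable result.
    return {p: dict(l) for p, l in merged.items()}
```

Doing this with dict lookups instead of repeatedly re-reading and re-writing a text format is where most of the speedup over lcov's own merge tends to come from.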

After that, I ended up shifting to tackling the generation of the initial data
from .gcda and .gcno, which is where I discovered bugs in both lcov and gcov
in the collection process. One bug is that gcov somehow manages to compute
negative edge counts on certain cycles in the graph, and when that happens, it
double-counts those cycles.

------
nemetroid
My main issue with g++ compiler output is that for each coding error, I get a
few lines of useful information, followed by a hundred "note:" lines detailing
the finer details of increasingly improbable failed substitutions.

~~~
Zpalmtree
Clang by default shows only 20 error lines, or something like that, so you
have to scroll up a ton less when compiling. I swapped to clang just for the
improved error messages.

------
lrsjng
"With -fdiagnostics-format=json, the diagnostics are emitted as a big blob of
JSON to stderr"
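That blob is straightforward to post-process. A sketch that flattens it into classic `file:line: kind: message` lines; the field names ("kind", "message", "locations", "caret") follow the schema described in the article for GCC 9, but verify them against your compiler version:

```python
import json


def summarize_diagnostics(blob):
    """Flatten GCC's -fdiagnostics-format=json output (a JSON array
    captured from stderr) into one human-readable line per diagnostic.

    Schema assumption: each diagnostic has "kind", "message", and a
    "locations" list whose entries carry a "caret" with file/line.
    """
    out = []
    for diag in json.loads(blob):
        loc = (diag.get("locations") or [{}])[0].get("caret", {})
        out.append("%s:%s: %s: %s" % (
            loc.get("file", "?"), loc.get("line", "?"),
            diag.get("kind", "?"), diag.get("message", "")))
    return out
```

In other words, a tool can keep emitting the old colon-delimited format for humans while consuming the structured form internally.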

~~~
codetrotter
I’ve previously been thinking about what would be the consequences of
implementing a subset of Unix tools to work with a binary protocol.

For example, tables would be output with type info and proper unambiguous
delimiting of some sort.

Unrelated to that: several years ago I wrote a program as two distinct parts
that communicated over a pair of Unix pipes; one part was the GUI, which
embedded a web view, and the other contained the program logic and wrote HTML
to stdout. The code was quite ugly, but it did the job at the time.

The idea of outputting JSON sits somewhere between these two things. But the
main takeaway here, IMO, is the idea of outputting a specific format
controlled by a flag. Of course, other tools have done this in the past.

Consider a few Unix utilities reimplemented to work a bit more tightly
together in terms of how data is represented: using, for example, Cap'n Proto
to serialize the data piped between them, plus a library for destructively
converting it to a plain-text representation, and then some method (or
combination of methods) for deciding when to use binary piping. For example:

- A user-provided flag. Tiresome to type.

- A user-provided environment variable. Problematic when mixing tools that
implement serialization with pure plaintext tools.

- The shell could be aware of which programs implement this serialization and
invoke them with the flags or env vars corresponding to whether or not each
program on each side of each pipe in a pipeline supports the serialization.
Possibly the way to go.

In this way one could incrementally rebuild the Unix tools to work this way
without having to change everything at once, while staying compatible with
the universal interface of plaintext forever, since no one wants to (nor
could) implement this in every Unix tool out there.

Once you have a couple of tools supporting this, the possibility of new ways
to interact with pipelines becomes available.
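A simpler, already-established variant of this negotiation is for each tool to check whether its stdout is a terminal and pick a format accordingly (the way ls and grep decide on columns and colors). A sketch of that fallback, using JSON lines as a stand-in for the binary serialization described above:

```python
import json
import sys


def emit(records):
    """Write a list of dict records to stdout, choosing the format by
    consumer: human-readable tab-separated text on a terminal,
    machine-readable JSON lines when piped to another program.

    A stand-in for the flag/env-var/shell negotiation discussed above.
    """
    if sys.stdout.isatty():
        for r in records:
            print("\t".join(str(v) for v in r.values()))
    else:
        for r in records:
            print(json.dumps(r))
```

The shell-level negotiation would subsume this check, but isatty gives you a working degraded mode today without any cooperation from the rest of the pipeline.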

I still love and will use the classic command-line. But in some situations the
possibilities enabled by this would be very compelling indeed.

~~~
Longhanks
What you describe is PowerShell. It is available for most Linux distributions
and for macOS.

~~~
pantalaimon
IMHO what kills PowerShell is that it's so incredibly verbose.

~~~
svnpenn
Even with the built-in aliases?

------
favorited
> One concern I’ve heard when changing how GCC prints diagnostics is that it
> might break someone’s script for parsing GCC output

Honestly though, who would build a tool which depends on the (presumably
undocumented) text output of another application, and then expect it not to
break?

~~~
hedora
Compiler error string formats are well-documented. Tools have been relying on
the : delimited format mentioned in the article since at least the mid-90’s
(probably longer).

That makes them even more stable than flavors of the decade like xml or json.

In fact, the : delimited compiler error format is so well established that it
will probably outlive the json schema the article presents, and maybe even
json-as-defacto-standard.

I’m not saying the json thing is bad (it provides rich information that will
be useful to some tools), but its shelf life will probably be much less than
25 years.

~~~
DyslexicAtheist
> it will probably outlive the json

your comment reminds me of a hilarious but true-ringing anecdote here¹ about
json and _'The Sins of the Father'_ :-)

¹
[https://news.ycombinator.com/item?id=8708617#8709223](https://news.ycombinator.com/item?id=8708617#8709223)

------
argd678
I’ve been using Terraform a lot lately, and aside from having no debugger and
not even line numbers on errors, I wish it had used a real language and had
usable errors like this. Great to see; it’s the human computers that are the
most expensive to run, so you don’t want to waste their time.

------
iheartpotatoes
I would be really interested to know what kind of testing was done to ensure
that a feature like this won't break anything. I see that he left the
original messages and added more guidance, so it would be possible to run
regressions against GCC 8. But the addition of the new guidance must require
a certain degree of code partitioning and hand-inspection, given how complex
GCC is under the hood.

------
brian_herman__
Nice

------
cryptonector
> What is the optimizer doing?
> ...

Nice! Now, can we get warned about UB?

~~~
bonzini
You can at run time with ubsan. At compile time it doesn't make sense,
because undefined behavior covers cases where, in the overwhelmingly common
case, you won't actually see undefined behavior. It'd have 99% false
positives, pretty much by definition.

~~~
cryptonector
This is nonsense. The optimizer makes decisions based on UB at compile-time.
It should warn about this, especially when deleting code.

> in the overwhelmingly common case you won't see undefined behavior.

You think so. But this actually can be a source of serious security
vulnerabilities. I want at least the option to know.

~~~
bonzini
Would you like to have a warning for every "for(i=a; i<=b; i++)" loop, where
the compiler assumes the loop is not infinite?

Or for every a[i++] in a loop (including a[i] with i++ in the for loop of
course), where the compiler assumes it can convert the induction variable to
*p++?

Both of these are undefined behavior if "i" is signed and it overflows.

