Firstly, apparently IBM chips have hardware transactional memory now:
> PowerPC / PowerPC64 / RS6000
> GCC now supports Power ISA 2.07, which includes support for Hardware Transactional Memory (HTM).
> S/390, System z
> Support for the Transactional Execution Facility included with the IBM zEnterprise zEC12 processor has been added.
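For the curious, these facilities surface in C through GCC's HTM built-ins. A minimal sketch, assuming a Power ISA 2.07 target built with -mcpu=power8 (which enables -mhtm); the shared counter and the fallback path are hypothetical:

    /* Sketch: GCC's Power HTM built-ins (requires -mhtm, e.g. via
       -mcpu=power8).  shared_counter and the fallback are made up. */
    static volatile long shared_counter;

    void increment(void)
    {
        if (__builtin_tbegin (0))
        {
            /* Transaction started: this update commits atomically. */
            shared_counter++;
            __builtin_tend (0);
        }
        else
        {
            /* Transaction failed to start or aborted: fall back to a
               lock (elided here) or retry. */
        }
    }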
Also on the 390:
> S/390, System z
> The hotpatch feature allows functions to be prepared for hotpatching. A certain number of bytes is reserved before the function entry label, plus a NOP is inserted at its very beginning to implement a backward jump when applying a patch. The feature can be enabled either via the command-line option -mhotpatch for a whole compilation unit, or per function using the hotpatch attribute.
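The per-function form might look like this; the attribute name comes from the notes above, but the exact argument syntax has varied between GCC releases, so treat it as a sketch:

    /* Illustrative only: reserve hotpatch space for one function on
       S/390.  Attribute arguments differ across GCC releases. */
    void __attribute__ ((hotpatch)) request_handler(void)
    {
        /* GCC pads before the entry label and emits a leading NOP so
           a patch can later overwrite it with a backward jump. */
    }

    /* Whole-translation-unit form:
         gcc -march=zEC12 -mhotpatch -c handler.c */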
This is huge: you can now do this on a laptop.
Although it seems obvious that this might be a good idea, why would it:
1) use exorbitant amounts of memory; and
2) be "pretty awesome" instead of, say, mildly useful?
edit: And both questions were satisfactorily answered in the time it took me to peruse the preamble of the release notes.
Functions can be inlined across module boundaries, even when they're not declared inline. You can turn virtual functions into regular functions, if you know that the virtual function is never overridden, or if you can derive the exact type. You can change calling conventions for functions. You can do better escape and aliasing analysis. If a function is only called once, then you can probably optimize it a lot better because you know exactly how it will be called.
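A toy example of the first point (file and function names made up): compiled separately at -O2 the call below stays a real call, but with -flto the link step re-optimizes both modules together.

    /* util.c */
    int square(int x) { return x * x; }

    /* main.c */
    extern int square(int);
    int main(void) { return square(7); }

    /* Build:  gcc -O2 -flto util.c main.c
       At link time the optimizer sees both bodies, can inline square()
       into main(), and can fold the call to the constant 49. */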
As with all optimizations, not every program sees a significant benefit. Programs with heavy inner loops, like physics simulators and graphics processors, will not gain much, since the local optimizer already handles them well. Programs like compilers, interpreters, and web browsers will see larger benefits. And the benefits can be high: 30% improvements in running time are not unheard of.
In short, LTO is a high-cost, high-benefit optimization.
It uses a lot of memory because it requires keeping a representation of more or less the entire program in memory at one time, in a format which is amenable to analysis. It is not out of the ordinary for an optimizer's internal representation to be on the order of 1000x the size of the source code.
> Early removal of virtual methods reduces the size of object files and improves link-time memory usage and compile time.
> Function bodies are now loaded on-demand and released early improving overall memory usage at link time.
> As of this time no releases of GCC 4.9 have yet been made.
It is also nice to see more parity with Clang when it comes to diagnostics and ASan/UBSan.
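UBSan arrived in this release as -fsanitize=undefined; a minimal sketch of what it catches:

    /* overflow.c -- build and run with:
         gcc -fsanitize=undefined -O1 overflow.c && ./a.out
       UBSan reports the signed overflow at run time instead of
       letting it silently wrap. */
    #include <limits.h>

    int main(void)
    {
        int x = INT_MAX;
        return x + 1;   /* signed integer overflow: undefined behavior */
    }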
> -march=nehalem, westmere, sandybridge, ivybridge, haswell, bonnell, broadwell, silvermont
> -mtune=intel can now be used to generate code running well on the most current Intel processors, which are Haswell and Silvermont for GCC 4.9.
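One quick way to see the new names in action (hypothetical file name; exact output depends on the release):

    /* vecadd.c -- compare the generated code:
         gcc -std=c99 -O3 -march=haswell -S vecadd.c   # may use AVX2
         gcc -std=c99 -O3 -march=bonnell -S vecadd.c   # SSE-era Atom ISA
       The same loop vectorizes differently for each target. */
    void vecadd(float *restrict a, const float *restrict b, int n)
    {
        for (int i = 0; i < n; i++)
            a[i] += b[i];
    }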
I wonder if anyone is working to standardize something like this? It would be way more useful than all those _s() functions they added to C11.
I also disagree with some of the interface choices. For example, when strlcpy() truncates, it tells you how many characters you needed, not simply "error" as strcpy_s() does. And the use case for memcpy_s() is extremely limited. It just seems like the _s() functions were rushed in without regard for what makes sense.
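To make the contrast concrete (strlcpy() is a BSD extension, not in glibc; on Linux it's typically available through libbsd's <bsd/string.h>, linked with -lbsd):

    #include <stdio.h>
    #include <bsd/string.h>   /* strlcpy(); link with -lbsd on Linux */

    void copy_name(char *dst, size_t dstsize, const char *src)
    {
        size_t needed = strlcpy(dst, src, dstsize);
        if (needed >= dstsize)
            /* Truncated -- and we know exactly how much space a full
               copy needs: needed + 1 bytes, counting the NUL. */
            fprintf(stderr, "truncated: need %zu, have %zu\n",
                    needed + 1, dstsize);
        /* strcpy_s() in the same situation reports only an error;
           the caller learns nothing about the required size. */
    }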
I don't personally notice a change, though; IMHO releases have come out fairly consistently since the 4.0 timeframe.