
In case someone cares about these things, I compared the build times and the binary sizes for 1.9 vs 1.8.3 using the open source project we maintain [1]. This is on a 6-core i7-5820K:

Build time with 1.8.3:

   real	0m7.533s
   user	0m36.913s
   sys	0m2.856s

Build time with 1.9:

   real	0m6.830s
   user	0m35.082s
   sys	0m2.384s

Binary size:

   1.8.3 : 19929736 bytes
   1.9   : 20004424 bytes

So... looks like the multi-threaded compilation indeed delivers better build times, but the binary size has increased slightly.

[1] You can git-clone and try yourself: https://github.com/gravitational/teleport




Unless you perform a proper statistical analysis, it's unfair to draw a conclusion from a single run.

Furthermore, when I see a second run that's faster than the first one, I immediately wonder if it's the cache being cold for the first run and warm for the second.

While I have your attention, https://zedshaw.com/archive/programmers-need-to-learn-statis... is worth reading.


In fairness, the phrase he used was "looks like". I don't think his comment was intended to suggest that he'd done rigorous and exhaustive wide-spectrum analysis of compile times and executable size, just that expectations matched the result for his project.


Thanks :) I'm no stranger to the scrutiny of Hacker News. I did 3 builds in a row and threw out the 1st one (cache); the last two were within 0.1s of each other, so I copied & pasted the latter.


So basically there's no speedup.


I'm pretty sure he means the last two runs of the same compiler.


"Programmers Need To Learn Statistics Or I Will Kill Them All"... What an insufferable asshat.

PSA: there is no reason to behave like this, and it's an incredibly effective way to alienate a bunch of people. You either offend people directly with the murder implication, or they don't take you seriously because you sound like you're throwing a temper tantrum so extended that you managed to write it all up as a blog post.


Or you can stop being offended by words put out on the internet by strangers... which is what I always recommend to basically everyone.


Or you can not be offended and still criticise someone for being an asshat.


I'm not offended. I'm just not going to waste my time reading an article by someone behaving like a child.


It's like Doonesbury, but it came from the 80's: http://imgur.com/82QXoAj


... maybe it's meant to be a bit ironic/salty/sarcastic/venting?


So, honest question from a non-statistician:

How, concretely, should I go about doing this particular analysis of compile time for one project? How many times should I run the build for each of the 2 compilers, and what should I do with the results so that I can 1. draw a conclusion and 2. come up with fair numbers for how they compare?

I would hope someone could teach this hopefully simple and very concrete thing to the HN crowd, and I do hope the answer is not "go learn statistics".


You first need to create a clean slate each time you run the experiment: no cache, no FILESYSTEM cache, etc. Maybe a tonne of single-use Docker images? Even then, filesystem caches will mess you up a little.

Beyond that, you need to run the same build "several" times to see what the variance is. Without getting specific: if the builds are within a couple of percent of each other, do "a few" and take the mean. If they're all over the place, do "lots" and only stop once the mean stabilises. There are specific methods to define "lots" and "a few", but it's usually obvious for large effects and you don't need to worry too much about it.

If you're trying to prove that you've made a 0.1 improvement on an underlying process that is normally distributed with a stddev of, like, 2, then you're going to have to run it a lot and do some maths to show when to stop and accept the result.
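
Here's a minimal sketch of such a harness in Go, for the curious. The ./... build target, the run count, and the throw-away-the-first-run policy are illustrative assumptions, not a rigorous methodology:

    // buildbench.go: time repeated builds, discard the cold-cache run,
    // and report the mean and sample standard deviation.
    package main

    import (
        "fmt"
        "math"
        "os/exec"
        "time"
    )

    func timedBuild() (time.Duration, error) {
        start := time.Now()
        err := exec.Command("go", "build", "./...").Run()
        return time.Since(start), err
    }

    func main() {
        const runs = 10 // illustrative; raise this until the mean stabilises
        var samples []float64
        for i := 0; i <= runs; i++ {
            d, err := timedBuild()
            if err != nil {
                panic(err)
            }
            if i == 0 {
                continue // throw out the first run: cold caches
            }
            samples = append(samples, d.Seconds())
        }
        var sum float64
        for _, s := range samples {
            sum += s
        }
        mean := sum / float64(len(samples))
        var sq float64
        for _, s := range samples {
            sq += (s - mean) * (s - mean)
        }
        stddev := math.Sqrt(sq / float64(len(samples)-1))
        fmt.Printf("n=%d mean=%.3fs stddev=%.3fs\n", len(samples), mean, stddev)
    }

If the stddev comes out small relative to the difference between the two compilers, the mean is probably a fair number; if not, crank up the run count.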


I want measurements with filesystem cache because I'm interested in estimating the speed of the compile-test-edit cycle. If you want to estimate the impact on emerge then you'll want no filesystem cache.

It's all about measuring based on what you intend to use the measurements for.


If the measurements are all over the place, why not take the fastest? The average is no good, because it'll be influenced by the times it wasn't running as fast as possible.

I don't myself lose much sleep over worrying about the times it runs faster than possible.


I agree with this sentiment. Any time worse than the fastest is due to noise in the system (schedulers etc). So the fastest is the lowest noise run.

Of course, as I said in another comment, it depends what you want to do with the measurement. If you plan to estimate how long a run will take on an existing system, then you need to accept the noise and use the mean (or median).
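
To make the distinction concrete, here's a small sketch computing all three summaries from a set of wall times (the sample values are made up):

    package main

    import (
        "fmt"
        "sort"
    )

    func main() {
        // made-up per-run wall times, in seconds
        times := []float64{7.53, 6.91, 6.83, 7.02, 6.88}
        sort.Float64s(times)
        fastest := times[0] // the lowest-noise run
        var sum float64
        for _, t := range times {
            sum += t
        }
        mean := sum / float64(len(times))
        median := times[len(times)/2] // exact for odd n; average the middle two for even n
        fmt.Printf("fastest=%.2fs mean=%.2fs median=%.2fs\n", fastest, mean, median)
    }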


There are people who have thought about this, e.g., http://onlinelibrary.wiley.com/doi/10.1002/cpe.2939/full

Personally I think it's a better idea to instrument your programs and count the number of memory (block) accesses or something. That metric might actually be useful to a reader a few years in the future. The fact that your program was running faster on a modern x86 processor from the year 2010 tells me nothing about how it would perform today, unless the difference was so large that you never needed statistical testing in the first place...

edit: I'm not sure if this paper is accessible to everyone, so here is an alternate link https://hal.inria.fr/inria-00443839v1/document
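
A portable stand-in for that kind of counter in Go is allocations per operation, which the standard testing package can report. A sketch, with a made-up workload:

    package main

    import (
        "bytes"
        "fmt"
        "testing"
    )

    func workload() {
        var b bytes.Buffer
        for i := 0; i < 100; i++ {
            b.WriteString("x")
        }
        _ = b.String()
    }

    func main() {
        // average heap allocations per call, over 1000 calls
        fmt.Printf("allocs/op: %.0f\n", testing.AllocsPerRun(1000, workload))
    }

Allocation counts aren't memory block accesses, but they share the useful property of not depending on whichever CPU you happened to benchmark on.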


Aren't go programs statically linked? The change in binary size might be completely unrelated to changes in the compiler.


Yes, the other guys are just being pedantic because the runtime attempts to load libc dynamically (but it is not required; DNS behaviour may just change without it).


> Aren't go programs statically linked?

Not by default. You have to set CGO_ENABLED=0 to get a fully statically linked binary (it disables cgo, so libc isn't pulled in at all).


Well, Go code is statically linked, but the runtime may try to dynamically load libc for DNS resolving. Use of cgo of course drastically changes everything.


Nowadays the toolchain also supports generating dynamic libraries.


And by default, this feature is not used.


So what? That doesn't make "Go code is statically linked" a fact, given that it depends on compiler flags.

Now if it said "Go code is usually/by default statically linked", then yes.


Do native Go programs use libc?


For a few things, like the system DNS resolver in the net package (it can be switched to the pure Go version with a compile-time or run-time switch) and getting the user's home directory in the os/user package.
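
For reference, a sketch of forcing the pure Go resolver from code; GODEBUG=netdns=go is the run-time switch and the netgo build tag is the compile-time one:

    package main

    import (
        "context"
        "fmt"
        "net"
    )

    func main() {
        r := &net.Resolver{PreferGo: true} // skip the cgo/libc path
        addrs, err := r.LookupHost(context.Background(), "example.com")
        fmt.Println(addrs, err)
    }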


To expand on this:

    $ cat foo.go 
    package main
    
    import (
     	"fmt"
    )
    
    func main() {
    	fmt.Println("Hello")
    }
    $ go build foo.go
    $ file foo
    foo: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped
compared to:

    $ cat bar.go
    package main
    
    import (
    	"os/user"
    	"fmt"
    )
    
    func main() {
    	u, err := user.Current()
    	fmt.Println(u, err)
    }
    $ go build bar.go
    $ file bar
    bar: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, not stripped


You can usually build fully static by doing

    CGO_ENABLED=0 go build
even when using os/user or net/


I'm at a computer with go1.4.2 freebsd/amd64 ATM (earlier it was go1.8.1 linux/amd64 IIRC), and the above os/user example results in a dynamically linked ELF even when built with CGO_ENABLED set to 0.


I am pretty sure they don't.


Nope.


Runtime has changed too... slightly



