
Porting dl.google.com from C++ to Go - swah
http://talks.golang.org/2013/oscon-dl.slide
======
skriticos2
So what I take from this is that the previous implementation sustained a huge
amount of code rot and new code got layered over it with a staple instead of
proper refactoring.

So he put the whole mess in a bin and redid it cleanly with Go. Now it's
much nicer. Some of Go's attributes helped along the way.

Did I miss something?

~~~
acqq
I guess, what you miss is that the author was the author of LiveJournal,
memcached, now also of groupcache in Go. So the guy who rewrote the program
was much, much more capable than any "typical" programmer.

~~~
fbuilesv
What relevance does that add to this presentation? It's interesting by itself.

What do the presentation (or our discussion) gain from the fact that Brad is a
super smart guy "much more capable than the 'typical' programmer"? Why are we
even bringing up that point? :)

~~~
jdale27
A bunch of "typical" programmers wrote a program in C++, and it turned out
crappy. A "super smart guy" rewrote it in Go, and it turned out nice. How much
of the delta between crappy and nice was due to the language, and how much to
the programmer? (And how much is due to the new programmer learning from the
old programmer's mistakes?)

As with all anecdotal language advocacy, you have to take it with a grain of
salt, and in this case maybe more than usual.

~~~
chrissnell
Brad even acknowledges that the original program was nice in its day. C++
complicated things but it sounded like the bigger problem was developers
leaving the team and new features being hackishly implemented.

~~~
scottlamb
And don't underestimate "changing environment". As Brad said, "in 2007, using
local disk wasn't restricted ... in 2012, containers get tiny % of local disk
spindle time ... cluster file systems own disk time on your machine, not you."

Within Google today, if you're directly using local disk instead of the Google
storage stack, you're going to have a bad time. Even more so if you're calling
read() from a single-threaded event loop.

------
joebo
I don't understand the need for the payload server from the slides. That makes
me wonder - why not just use a HTTP server to serve the static files (e.g.
nginx)? I'm sure I'm missing the obvious, but I'm probably not the only person
wondering it.

~~~
awj
From the looks of it, they wanted support for putting files in place before a
release date and easy per-file header/caching/access control. Add in a few
other miscellaneous features _and_ make it available to everyone and you're at
a point where an HTTP server probably won't cut it.

~~~
qznc
Slide 62 mentions the proprietary bits: ACL policies and RPC storage access.
Does an off-the-shelf httpd support ACLs? How easy would it be to make them
support google storage instead of a file system?

~~~
nknighthb
(I wrote a rough equivalent of "payload_server" in Go at my current employer
to solve authentication, access control, and some other business logic
issues.)

> _Does an off-the-shelf httpd support ACLs?_

Not really. You inevitably end up writing custom code to conform to your
particular requirements and/or existing systems. If you want high-performance,
you end up writing it in C as a module for Apache/nginx/whatever.

> _How easy would it be to make them support google storage instead of a file
> system?_

Unless said storage system is presented to userspace through ordinary file
interfaces, same as above. There's no general turn-key solution built into
webservers. The problem space is too wide.

Using Go in this way gets you good performance, simple architecture,
maintainability, and easy deployment with total flexibility to do whatever you
need in order to solve your version of the problem. There are no
straitjackets; you don't have to conform to (or find ways around) anyone
else's conception of the problem space.
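
To make that concrete, here is a minimal sketch of the general shape, not the code from the slides or from any real deployment. The rule in `allowed`, the `X-Token` header, and the `payload` handler are all invented for illustration; a real policy and storage backend would replace them.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// allowed is a hypothetical access rule: anything under /public/ is open,
// everything else needs a placeholder token header.
func allowed(r *http.Request) bool {
	if strings.HasPrefix(r.URL.Path, "/public/") {
		return true
	}
	return r.Header.Get("X-Token") == "secret" // stand-in for a real check
}

// withACL wraps any handler and rejects requests that fail the rule.
func withACL(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !allowed(r) {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// payload stands in for whatever actually serves the bytes
// (local files, an RPC to a storage system, and so on).
func payload(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "payload for %s", r.URL.Path)
}

// statusFor sends one request through the wrapped handler in-process
// and returns the HTTP status code.
func statusFor(path, token string) int {
	req := httptest.NewRequest("GET", path, nil)
	if token != "" {
		req.Header.Set("X-Token", token)
	}
	rec := httptest.NewRecorder()
	withACL(http.HandlerFunc(payload)).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("/public/a.zip", ""))
	fmt.Println(statusFor("/private/b.zip", ""))
	fmt.Println(statusFor("/private/b.zip", "secret"))
}
```

The access check is ordinary Go sitting in front of an ordinary handler, so swapping in a different rule or backend doesn't touch the serving code.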

------
STRML
Maybe I'm showing my allegiance to my platform of choice, but the subtle dig
on nodejs wasn't warranted on slide 25
([http://talks.golang.org/2013/oscon-dl.slide#25](http://talks.golang.org/2013/oscon-dl.slide#25)).
As everyone's pal `substack` will tell you, use streams! Instead of explicit
buffering, handling backpressure, etc., it's as simple as:

    readable.pipe(writable);

Additionally the link to `http-proxy` on slide 30 is misleading; 60% of that
file is comments, and about 50% of what's left is websocket support, with the
rest being header parsing & redirect parsing. The actual proxying bit is very
simple and straightforward, and if you don't need every feature `http-proxy`
offers you can do it yourself with streams in < 10 lines.

~~~
bradfitz
It wasn't a dig on nodejs. It was a dig at event-based programming, on which
I've wasted years of my life in many languages. Node.js isn't unique in that
regard.

As I mentioned in my talk, that code looks like fine Node.js code.

But it's still event-based, and the flow isn't readable. In the actual
presentation I went through the code to show how control flow jumps around. I
picked a Javascript project (and the top hit I got from a search) because
people know Javascript.

Websocket support doesn't matter. In Go, you can also just io.Copy(websocket,
src).

I agree Stream makes Node.js code better.

~~~
test-it
Have you by any chance seen the async keyword in C# 5.0? It allows one to
write event-based code without callbacks obscuring the control flow. From what
I've heard Python is in the process of copying this feature. Iced CoffeeScript
does something similar.

~~~
bradfitz
This answer sums up my feelings best:

[http://stackoverflow.com/questions/7479276/what-is-the-
main-...](http://stackoverflow.com/questions/7479276/what-is-the-main-
difference-between-net-async-and-google-go-light-weight-thread)

It's good for C#, but still a language wart that could be built-in. I like
that Go only has one set of APIs for everything, not the sync way and the
async way.

It's sad that C#, which started out as a fixed-up Java, is now growing its own
warts.

Of course, Go's not perfect either.

~~~
dragonwriter
> I haven't yet tried Go, but I don't see how it could match the performance
> of C# API with a single function. C#'s async methods offload any IO to the
> process IO completion port threads, thus freeing the current thread to do
> more work.

Go generally uses synchronous functions, but a function in Go can be the
subject of a "go" statement (sharing the name of the language should give an
indication of how central this feature is), which causes the function to be
executed as a goroutine (that is, asynchronously using an M:N threading
model).
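
A rough sketch of what that looks like in practice. `slowLen` is an invented stand-in for a blocking call (it just sleeps), and `lengths` is a hypothetical helper; neither comes from the talk.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// slowLen stands in for a blocking call such as network or disk I/O.
// It is plain synchronous code with no callbacks.
func slowLen(s string) int {
	time.Sleep(10 * time.Millisecond)
	return len(s)
}

// lengths runs slowLen once per input, each call in its own goroutine,
// and waits for all of them before returning.
func lengths(inputs []string) []int {
	out := make([]int, len(inputs))
	var wg sync.WaitGroup
	for i, s := range inputs {
		wg.Add(1)
		go func(i int, s string) { // the "go" statement starts a goroutine
			defer wg.Done()
			out[i] = slowLen(s) // each goroutine writes only its own slot
		}(i, s)
	}
	wg.Wait()
	return out
}

func main() {
	fmt.Println(lengths([]string{"go", "gopher", "goroutine"}))
}
```

The function being called stays synchronous; only the call site decides whether it runs concurrently.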

~~~
wolf550e
You meant to reply to test-it, but instead it looks like you're teaching a
member of the Go team how Go works.

------
packetslave
See also
[https://github.com/golang/groupcache](https://github.com/golang/groupcache)
for the peer-to-peer memcached replacement mentioned in the slides.

~~~
otterley
I can't wait till it's stable and has a man page!

~~~
kisielk
It's a library meant to be used in your Go application, it's not going to have
a man page. It already has API documentation:
[http://godoc.org/github.com/golang/groupcache](http://godoc.org/github.com/golang/groupcache)

~~~
wbl
So what do you think section 3 of the manual is for?

~~~
pjmlp
UNIX and libc syscalls?

~~~
e12e
"The table below shows the section numbers of the manual followed by the types
of pages they contain.

    
    
      1 Executable programs or shell commands
      2 System calls (functions provided by the kernel)
      3 Library calls (functions within program libraries)
      4 Special files (usually found in /dev)
      5 File formats and conventions eg /etc/passwd
      6 Games
      7 Miscellaneous (including macro packages and
        conventions), e.g. man(7), groff(7)
      8 System administration commands (usually only for root)
      9 Kernel routines [Non standard]"
    

So section 3 is a little wider than that, eg:

    
    
        $ apropos apache::xmlrpc
        Apache::XMLRPC::Lite (3pm) - mod_perl-based
           XML-RPC server with minimum configuration

------
JulianMorrison
What this actually means: groupcache is awesome. You just act as if the cache
is full, and if it isn't, it will be. Where did the data come from? That's
pluggable. And no concern of the part that just serves it up. Very subtle,
very nice.
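
A toy illustration of that idea using only the standard library. This is not groupcache's API; `Getter`, `Cache`, and `NewCache` are names invented for this sketch, and it ignores eviction, peers, and everything else that makes the real thing interesting.

```go
package main

import (
	"fmt"
	"sync"
)

// Getter is the pluggable part: it knows where the data really comes from.
type Getter func(key string) ([]byte, error)

// Cache is a toy read-through cache. Callers only ever call Get.
type Cache struct {
	mu     sync.Mutex
	items  map[string][]byte
	getter Getter
}

// NewCache builds an empty cache backed by the given getter.
func NewCache(g Getter) *Cache {
	return &Cache{items: make(map[string][]byte), getter: g}
}

// Get returns the cached value, filling the cache from the getter on a miss.
// Holding the lock while loading keeps the sketch short; it also serializes
// all loads, which a real cache would avoid.
func (c *Cache) Get(key string) ([]byte, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.items[key]; ok {
		return v, nil
	}
	v, err := c.getter(key)
	if err != nil {
		return nil, err
	}
	c.items[key] = v
	return v, nil
}

func main() {
	loads := 0
	c := NewCache(func(key string) ([]byte, error) {
		loads++ // count how often the backing source is consulted
		return []byte("data for " + key), nil
	})
	c.Get("a")
	v, _ := c.Get("a")
	fmt.Println(string(v), "loads:", loads)
}
```

The serving code never asks whether the value was cached; it calls Get and the cache fills itself as needed.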

~~~
revelation
What this actually means (2): Go standard library has reasonable performance.

This won't work so beautifully when whoever implemented all the magic bits
under your business logic didn't do so to deliver reasonable performance for
your use case.

Or if the underlying code is in fact _wrong_. This turns into the kind of code
you can see from Line 242:

[https://code.google.com/p/google-api-go-
client/source/browse...](https://code.google.com/p/google-api-go-
client/source/browse/googleapi/googleapi.go#242)

~~~
jamesaguilar
What I'm about to say is a little trollish, but if you use words like,
"Business logic," whatever you're writing probably doesn't need to be faster
than Go. And anyway, if performance sensitive bits are your thing, cgo is
probably the best FFI outside of C# that I've ever used.

Also, it's a sign of a defective language that the library guards against old
environments? Really?

~~~
revelation
Business logic is the word used in the slides.

~~~
jamesaguilar
Damn! Your close reading has completely disarmed my bad attitude.

------
e98cuenc
These slides are practically unreadable on an iPhone. They are split in half
and it's impossible to get a full page on the screen (I can only see the right
half of the previous slide and the left half of the next slide).

Does anybody have an alternative way to read these slides? The content itself
seems quite interesting.

~~~
fosap
Yes, the page source. It's very clean.

------
fizx
How does groupcache handle consensus?

Edit: Scanned the source, looks like a best-effort distributed lock, rather
than any sort of consensus protocol. This works for a cache setting, where
e.g. having a split-brain scenario and duplicating the work is no big deal.

~~~
tux21b
I was thinking about that too, but since it is not possible to change existing
items and the computation of those items must be re-entrant (groupcache just
tries to avoid duplicate computations but does not guarantee it), there seems
to be no reason for any distributed consensus. In fact, the groupcache design
is astonishingly simple.
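
The "tries to avoid duplicate computations" part can be sketched within a single process as below. This is an illustration of the technique, not groupcache's actual code; `Group` and `Do` here are written for this example, and nothing is cached once a call finishes.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// call tracks one in-flight computation.
type call struct {
	wg  sync.WaitGroup
	val string
}

// Group merges concurrent requests for the same key.
type Group struct {
	mu sync.Mutex
	m  map[string]*call
}

// Do runs fn for key unless a call for that key is already in flight,
// in which case it waits for that call and shares its result.
func (g *Group) Do(key string, fn func() string) string {
	g.mu.Lock()
	if g.m == nil {
		g.m = make(map[string]*call)
	}
	if c, ok := g.m[key]; ok {
		g.mu.Unlock()
		c.wg.Wait() // someone else is already computing this key
		return c.val
	}
	c := new(call)
	c.wg.Add(1)
	g.m[key] = c
	g.mu.Unlock()

	c.val = fn()
	c.wg.Done()

	g.mu.Lock()
	delete(g.m, key) // later calls compute again; this is not a cache
	g.mu.Unlock()
	return c.val
}

func main() {
	var g Group
	var computed int32
	var wg sync.WaitGroup
	for i := 0; i < 5; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.Do("k", func() string {
				atomic.AddInt32(&computed, 1)
				time.Sleep(20 * time.Millisecond)
				return "value"
			})
		}()
	}
	wg.Wait()
	// Often 1, but nothing guarantees it: a late caller may recompute.
	fmt.Println("computations:", atomic.LoadInt32(&computed))
}
```

Because the values never change and recomputing is harmless, an occasional duplicate costs some work but never correctness, which is why no consensus is needed.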

------
azth
Pretty disingenuous on slide 58 to attempt to make the Go code look shorter
than it actually is. Note how he left out all the verbose error checking code.

~~~
bradfitz
Sorry, wasn't my intention. But it's only 4 more lines. In my defense, I only
showed three pages of C++, and not all of it, which would've been longer than
the whole presentation. So I cut less from the Go snippets than the C++
snippets.

~~~
ubershmekel
Your intention was to show the good Go code vs the bad C++ code. The different
syntax highlighting colors and the missing error handling code really leave me
feeling amiss.

My main gripe and reason for not jumping on the Go bandwagon is the error
handling strategy, it was exactly what I wanted to see in the Go version.

But thanks for the info anyhow.

~~~
PommeDeTerre
Additionally, I think that this comparison would only be valid after the Go
version experiences an equivalent amount of developer turnover, features added
in a rush by developers inexperienced with the software, external
infrastructure changes, and the other pressures that the C++ implementation
was apparently subjected to over the years.

The Go code may look good in 2012 and 2013, while it's still fresh. But I'd be
very curious to see how it looks in 2017 or 2018, assuming it's still even
being used then.

------
hosay123
Either I'm having deja vu, or despite the date on the presentation, this is at
_least_ a year or two old

~~~
skyebook
I don't know if it's that old, but there was definitely a talk on this
presented some time back. From what I remember, there was a lot of HN
discussion around "well that's great but it's just a download server" and some
other people saying "yeah but it's a complicated download server".

~~~
fizx
Groupcache is the interesting new part of the presentation, and it was just
open-sourced a few days ago.

------
YZF
Interesting story. Is this a "port" or a "rewrite from scratch"? It's kind of
hard to tell.

~~~
bradfitz
Parts of both.

Some of the logic is ported from C++ to Go almost line-for-line.

Some of the architectural parts are completely redone.

But it has the same binary name and flags and RPC interface

------
artagnon
While I won't dispute that Go has some cute primitives, I thought the examples
were terrible. On slide 25, it talks about why a simple operation is painful
([http://talks.golang.org/2013/oscon-dl.slide#25](http://talks.golang.org/2013/oscon-dl.slide#25)),
and then goes on to evangelize io.Copy() on slide 31. Okay, so the standard
library saves me from open-coding it:

    
    
      func Copy(dst Writer, src Reader) (written int64, err error) {
          // If the reader has a WriteTo method, use it to do the copy.
          // Avoids an allocation and a copy.
          if wt, ok := src.(WriterTo); ok {
              return wt.WriteTo(dst)
          }
          // Similarly, if the writer has a ReadFrom method, use it to do the copy.
          if rt, ok := dst.(ReaderFrom); ok {
              return rt.ReadFrom(src)
          }
          buf := make([]byte, 32*1024)
          for {
              nr, er := src.Read(buf)
              if nr > 0 {
                  nw, ew := dst.Write(buf[0:nr])
                  if nw > 0 {
                      written += int64(nw)
                  }
                  if ew != nil {
                      err = ew
                      break
                  }
                  if nr != nw {
                      err = ErrShortWrite
                      break
                  }
              }
              if er == EOF {
                  break
              }
              if er != nil {
                  err = er
                  break
              }
          }
          return written, err
      }
    

Uh, big deal?

The chunk of what's important isn't explained at all:

\- runtime/ takes care of memory management quite efficiently with a decent
tracing gc in runtime/mgc0.c. I haven't benchmarked it against other stop-the-
world collectors, but it should be no match for truly concurrent gc.

\- runtime/proc.c schedules various blocking and non-blocking (called netpoll,
which resolves to epoll on systems where it is available) calls. It seems to
account for number of cores and use native threads, but I'm not sure how it
interacts with the Linux scheduler.

\- runtime/malloc.goc is the core memory allocator/deallocator. Seems to be a
relatively straightforward arena allocator using a bitmap.

I didn't have time to go through groupcache, but the presentation certainly
didn't tell me much about it.

------
codereflection
I don't remember where I saw this, but somewhere, someone from Google said
that all of their code changes every 5 to 6 months (or some reasonably short
amount of time). That clearly sounded... optimistic at best. It's nice to see
that even companies like Google have 5-year-old code that is legacy and
causing problems.

------
__Joker
I still don't understand why Google doesn't offer the option to download via
torrent. Downloading Android Studio from dl.google.com last week over a slow
connection was a horrible experience. I had to retry three times before I
managed to get a successful download.

------
_random_
It seems that Go is a good replacement for Python as well?

~~~
rbanffy
As someone who wrote a lot of Python code and who is learning Go while
implementing a somewhat important application, I can tell you Go is a good
replacement for Python if it fits the problem better (mine was concurrency).

Go is pleasant, but there are Go problems and Python problems (and C problems,
Lisp problems, and so on).

------
c0rtex
Aside: Does anyone know how these slides are generated?

~~~
bradfitz
[http://godoc.org/code.google.com/p/go.talks/present](http://godoc.org/code.google.com/p/go.talks/present)

[http://godoc.org/code.google.com/p/go.talks/pkg/present](http://godoc.org/code.google.com/p/go.talks/pkg/present)

------
godbolev
Does anyone have a link to the video?

------
CoryG89
too long... ?

~~~
fosap
Read the source code. IMO cleaner, and scrolling in the "right" way
(vertical).

------
IzzyMurad
Too many Google employees in Hacker News trying to advocate Go...

~~~
tptacek
Yes. Nobody here wants to learn how Brad Fitzpatrick reasons about, designs,
and implements server software. It's all spam.

~~~
drhayes9
Second this. One of my favorite things is getting a slice, however small, of
how truly great programmers get their work done, why they chose what they did,
and what the results were.

We actually need _more_ of this on HN, not less.

~~~
ballard
This doesn't reveal any secrets into systems engineering, but it slices:

    
    
        package main
    
        import "fmt"
        
        func main() {
            cake := make([]byte, 2600)
            copy(cake, []byte("Happy Birthday, Lundberg !"))
            lundberg := cake[:8]
            milton := cake[:0]
            fmt.Println("Lundberg got :", string(lundberg))
            fmt.Println("Milton got :", string(milton))
        }
    
        // :-)  (Ran out of UTF8 smileys and movie fact checkers).
        // And yes could've used string but bytes are awesomer.

~~~
pjmlp
Slices aren't Go specific.

