The ups and downs of porting 50k lines of C++ to Go (togototo.wordpress.com)
218 points by logicchains on Mar 7, 2015 | 105 comments



Do read the answer of the author regarding the performance:

"The throughput of the Go program is quite competitive with the C++ one, although the server's IO-bound so most of the time is just spent in socket write/read syscalls. The latency is at least an order of magnitude worse, due to Go's garbage collector, which is amplified by the use of an older Go version. If the server was latency-critical I don't think it could have been written in Go, at least not until the new GC planned for 1.5 or 1.6 is released (assuming we could upgrade to a newer kernel by the time it's released)."


Author here. Just a note that by latency-critical, I'm referring to >10 millisecond latencies. If you can tolerate occasional pauses of 400-500 milliseconds, then the GC wouldn't be a problem. Also note that the GC slowness came from having to scan a fairly large heap (a lot of cached stuff); it could be avoided by storing all that off-heap, but I suspect that would complicate the code significantly.

Finally, note that by "at least an order of magnitude worse" I'm comparing it to hyper-optimised C++ that's designed for sub-millisecond latencies, as the C++ server used the same framework used in latency-critical HFT software.


The GC pause time is mostly proportional to the number of pointers in the heap, not the absolute size. For example, if your heap consists of a single 50GB []byte, the GC pause will be negligible.

This means that you can often control the pause time effectively if you are able to reduce the number of pointers in use. I have a server that has a large-ish working heap size (10-30GB) that consists of several very large maps; initially it gave pauses of 2-3 seconds. I got rid of pointer types from the map keys/values and the pause time became a few hundred milliseconds. At this point, the GC pause was mostly caused by internal overflow pointers from the hashmap implementation. I implemented my own pointer-free map type and brought the pauses down to < 1ms.

I also filed an issue (https://golang.org/issue/9477) and Dmitry Vyukov kindly implemented a change so that pointer-free maps are not scanned by the GC. This will be in Go 1.5 and I will delete my custom map.


How did you get around the fact that Go doesn't allow you to modify value-type entries in maps? For instance, if I have a map[int]myFoo, I can't do myMap[24].myParam = 3, I have to create a new myFoo and assign it to myMap[24]. Whereas if I have a map[int]*myFoo, myMap[24].myParam = 3 works fine.


Effectively, you need to keep a small heap and, if possible, use "simple" data structures. In my case, I implemented a trie-based Levenshtein distance algorithm from Python code[0]. I used maps because it was basically a direct translation of the Python code. This resulted in millions of maps, and often the index would just stop in the garbage collector for a couple of seconds. I solved it using my own map, which is traversed linearly.

    type SpellCheckerMap struct {
        Runes []rune
        Nodes []*SpellCheckerNode
    }
With only a few runes, I can just go through the list to find the index of the following node, saving the computation of the map hash key. It is pretty fast; here is an example of a spelling suggestion over a 500,000-word corpus[1].

Basically, the garbage collection cost forced me to think about a better data structure for my case. Maybe not that bad all in all.

[0]: http://stevehanov.ca/blog/index.php?id=114 [1]: https://www.chemeo.com/search?q=asparine


Semi-offtopic:

There are more compact representations; e.g. you can store the dictionary in a deterministic acyclic minimized finite state automaton (which can be stored in a flat array/slice). This gives you O(n) time lookups, where n is the length of the word, and removes much of the redundancy in a dictionary.

Words within an edit distance can be found by computing a Levenshtein automaton for the word (which can be done in linear time) and computing the intersection language of the dictionary automaton and the Levenshtein automaton.

This approach is fast and very compact. I have a Java implementation:

https://github.com/danieldk/dictomaton


I am a chemical engineer, so reading "deterministic acyclic minimized finite state automaton" sends me back to my first years at university. As I want to implement instant search with partial string matching, this could be very interesting: the goal is to match "Octane, 2,4,6-trimethyl-" when typing "trimethyl".

For the moment I am looking at the LinkedIn approach[0]. Sorry for keeping this semi-offtopic thread alive, but these problems are so interesting that I cannot stop.

[0]: https://github.com/jamra/gocleo


Oh, this is very cool. I wonder how many CAT (Computer-Aided Translation) tools out there (many of which are notoriously slow) could be significantly sped up using this approach.

I'd bet most of them, although they're almost all closed-source so we will likely never know.


One issue is that the server uses Google Protocol Buffers. The following definition, for instance:

    message Foo {
        required string a = 1;
        required int32 b = 2;
        optional int64 c = 3;
    }
Generates a struct like:

    type Foo struct {
        A *string
        B *int32
        C *int64
    }
This is not particularly friendly to the garbage collector compared to a POD struct with no pointers.


Protocol Buffers lets you reuse the memory of your objects: simply reuse the object you parse your messages into. Pool these objects, and you've just saved yourself a huge number of heap allocations.

https://developers.google.com/protocol-buffers/docs/referenc...


This would work if we were only using them for messaging, but the protobuffer objects are also used as the datastructure in which the information they contain is stored. This was done in the name of simplicity, to avoid creating a separate internal datastructure for each kind of message.

The flow of data is currently:

various external sources -> merged into protocol buffer struct -> later sent to client

To avoid keeping a pile of protocol buffer structs on the heap, this would need to change to:

various external sources -> merged into some internal datastructure -> later converted into protocol buffer struct and sent to client.


> This would work if we were only using them for messaging, but the protobuffer objects are also used as the datastructure in which the information they contain is stored. This was done in the name of simplicity, to avoid creating a separate internal datastructure for each kind of message.

Did you consider keeping strings (serialized messages) in your cache rather than message objects? I do this in C++ code simply for memory efficiency. Here it would also allow you to avoid these extra pointers.


FYI, if you don't need required/optional, non-zero defaults and extensions, try `syntax = "proto3"`. It generates much better Go code.


How can you have a POD struct without pointers if one of your fields is a string?


Couldn't it be done with a fixed-size array of runes or bytes? It's definitely possible in C/C++.


Not without wasting a lot of space depending on the difference between average size of string vs maximum size of string.


We generally know the maximum string sizes for each property, and they're pretty small, so this wouldn't have been a problem.


You can use a string class like the one in Folly that decays from a fixed array to heap allocated space if the string gets too long.


I believe it's also possible to use variable-size arrays in C or C++ (and end up with a variable-size struct).

Go does allow fixed-size arrays in structs, and they're stored inline, so on a 32-bit platform `struct { foo [8]uint }` is 32 bytes, whereas `struct { foo []uint }` is 12 bytes (a slice header) and `struct { foo string }` is 8 (a string header); double those on 64-bit.


Lots of small maps becoming little arrays. The more things change, the more they stay the same. (There is production Smalltalk code from the late 80's running in large multinationals that's basically what you just described.)


> If you can tolerate occasional pauses of 400-500 milliseconds, then the GC wouldn't be a problem.

You might want to spend some time optimizing allocations (http://blog.golang.org/profiling-go-programs for some info). I successfully eliminated these pauses in my project by putting data on the stack instead of the heap in critical points of my code. This can be done (though it's ugly) even when the size of particular allocations aren't known at compile time (i.e. use a fixed stack allocation with an appropriate upper limit, check for the limit and go to the heap only if necessary).

> I suspect that would complicate the code significantly

Maybe if your code has many hot spots with these allocations, but it's worth it to manually optimize these currently.


I had already done a bit of profiling: originally the GC was taking up around 40% of the total running time, and I managed to reduce it to 10% by removing the biggest source of allocations. As it's not latency-critical there wasn't an immediate need for further optimisation.


May I ask why you've chosen Go over Java, which is becoming very popular in the HFT industry, even for latency-critical code?

The code generation tools are better (what you call "compile time IO"), the IDEs are much better, it has generics which you seem to miss, performance is better (the GCs are state-of-the-art), monitoring is much better, the language is also regular and simple, and you don't have to write inheritance-heavy code if you don't like.

As someone who likes both Java and Go, I find it surprising that anyone would choose the latter for long-running server code, especially where performance matters. Go is great for quick command-line apps or very simple services, but when you need to build an important server, Java wins out every time.

Certainly for your particular requirements and preferences, Java seems to have all of Go's advantages and (almost) none of the disadvantages.


Taking from the author's paragraph:

- Emacs - Java IDEs may look better, but this doesn't seem to be an issue for the author. Also, other tools are very mature for such a young language as Go.

- Goroutines - Java has no built-in equivalent or one opinionated way of green-threading, only frameworks requiring months of expertise. From the perspective of someone who needs to switch languages quickly, this is very discouraging.

- No inheritance - Java design patterns are clearly a no-no in the described context (they were presented as an anti-pattern).

- Built-in, effective templating - again, one easy path to start with, and also powerful enough to write your own tools if needed.

Go is simply easier/faster to start with, and most of the performance problems were not GC-bound, so the JVM's maturity wasn't such an advantage.


> -Goroutines

https://github.com/puniverse/quasar (I'm the main author)

> Built-in, effective templating

So does Java (and for a long time): http://docs.oracle.com/javase/7/docs/api/javax/annotation/pr...

> and most of the performance problems were not GC-bound

The author's complaints were about GC pauses, and besides, Java is faster even for non-GC bound tasks.

> Go is simply easier/faster to start with

Maybe, but just a little, and mostly because Java has (too?) many libraries to choose from.

Even if the differences you highlight are indeed true (a point on which I disagree), those differences are, at best, quite small, while the advantages in Java's favor are much bigger, certainly in those areas that matter to the author.


How do you avoid too many allocations in Java? The GC is certainly slower with more allocated elements, but I seldom see any code example in Java which doesn't behave as if allocations cost nothing. More than that, it seems that the whole language is based on that premise. As I consider the info you already provided a good argument for Java, I hope you can provide some good links.


Java has a generational garbage collector, so short-lived allocations pretty much don't cost anything, and by and large it's exactly the short-lived allocations that are the ones you could have optimized away in e.g. hand-tuned C++.


What about an object always being handled only through a pointer? Does that mean that an array of 10M objects is actually an array of 10M pointers plus 10M allocated objects, all of which have to be allocated, deallocated and traversed by the garbage collector? And what if it's not an array, but some more complex form? Is there a clean way to group a lot of objects to be treated by the allocator, deallocator and the GC as a single allocation unit? I understand that some language lawyers think that's not important ("just use `new`; the VM should care, not you"), but for somebody like me who's used to the C level of control, and who actually cares about and measures the performance differences which can determine the number of servers needed to solve the problem, it really is.


This kind of stuff matters to Java developers, too, as, perhaps surprisingly, Java has become a high-performance language, especially when it comes to concurrency (as it offers low-level support for memory fences, and includes state-of-the-art implementations of concurrent data structures).

As pjmlp said, the issue of "array of structs" is being addressed in Java 10. In the meantime, for contiguous memory allocation, you can make use of off-heap memory (which also helps those "more complex forms"). But the flip side is that Java's memory allocation is a lot faster than C's (i.e. in throughput -- not latency, as there are GC pauses), and most GCs are copying collectors that automatically arrange objects contiguously (though it's far from being good enough for arrays of object, as you need to follow references, and every object carries additional overhead, which is precisely why this is being addressed in Java 10).


Currently not; this is part of the Java 9-10 roadmap.

However, all JVMs have state-of-the-art profilers, e.g. VisualVM, Java Flight Recorder and many others, that help track down which data structures might need some help.


> How do you avoid too many allocations in Java?

Most of the time you don't need to, because GCs are that good, and escape analysis can avoid heap allocation automatically[1]. You won't get C++ latency, but you'll do better than Go. If you really need to avoid allocations for some reason, you can go off-heap for C++ performance (Java HFT applications do that for the most performance-critical things; it requires more work, but overall, less than C++).

Also, I'm not saying that Java is always the better choice (the JVM needs some time to warm up; Go's runtime is statically compiled into the native binary artifact), but in this particular case it seems to be exactly what the author was asking for, that it's really the obvious choice.

[1]: http://psy-lob-saw.blogspot.com/2014/12/the-escape-of-arrayl...


I didn't personally make the decision. We don't have any teams using Java, we do have a team using Go. Ergo, Go was chosen.


Ah, OK, then. It's as good a reason as any, I guess. You should know, though, that Java is now quite popular in HFT circles, especially in the UK, with high-performance libraries and monitoring tools[1] specifically tailored to that industry. In spite of a difference in marketing, you'll find that Java can get you much closer to C++ than Go can. You have better control over execution and a runtime of higher quality overall.

Even though Go's runtime is statically linked while Java's isn't, Java is very much a C++ replacement in many circumstances, while Go makes for a terrific Python replacement if you need fast scripts and command line tools (which is why you mostly see Python->Go and C++->Java transitions). Not that Go is always more appropriate than Python or Java is always better suited than C++, but at least those are the common alternatives, and the ones that make the most sense considering the design decisions of those languages.

For your particular needs, I have no doubt you'll find Java to be the more appropriate choice.

[1]: Like this: http://openhft.net/


> May I ask why you've chosen Go over Java, which is becoming very popular in the HFT industry, even for latency-critical code?

Hype.


Wow, 400-500ms pauses? Is that the case in newer Go versions as well? Seems like that would wipe out viability for a whole load of applications.


Go is actively working on this. Go 1.5 will have concurrent GC and is shooting to stop the world for only 10ms. You can read more here: https://docs.google.com/document/d/16Y4IsnNRCN43Mx0NZc5YXZLo...


Not really. If you manage your allocations properly, you will never have such long GC pauses.

N.B. we use Go for real-time bidding (in the context of programmatic buying of ad space), and can easily respond to 6000 QPS on a single server within a 100ms time frame (from the SSP's POV), with a working set of about 2-3 GB that we constantly keep in memory.


Like the author's case, yours sounds like a much better fit for Java. Why did you pick Go?


Probably hype. There is a lot of hype in the industry, unfortunately.


That sounds really nice, but how often does that working set change?


Thanks. More than that, I also know that there are many in-house implementations written in C++ with significantly worse performance than that of good garbage-collected libraries. Sloppy programming can always make slow products. It's still good to know where the limits are, and I thank you for the honest report.


Did you make use of sync.Pool at all for your common garbage generation cases?


Not yet: simplicity was deemed more important than latency for the present, and it was believed that pooling would bring unneeded complexity. If latency becomes an issue then pooling would be the next step taken.


> Go also forced me to write readable code: the language makes it impossible to think something like “hey, the >8=3 operator in this obscure paper on ouroboromorphic sapphotriplets could save me 10 lines of code, I’d better include it. My coworkers won’t have trouble understanding it as the meaning is clearly expressed in the type signature: (PrimMonad W, PoshFunctor Y, ReichsLens S) => W Y S ((I -> W) -> Y) -> G -> Bool”.

I like this description. It's one of the reasons why I prefer OCaml to Haskell, but I've found it hard to verbalise.


I'm still trying to force myself to like Go. The basic problem is that Go is too regular for me, which makes it painful to read other people's code.

I don't have this problem with functional languages or C written in a free flowing coding style like djb's.

Is it even possible in Go to have an individual style?


Too regular? I gotta be honest, that comment makes very little sense to me. My problem diving into most codebases (over 10 years spent contracting) was always multiplied horrifically by non-idiomatic code in language X.

Bob the developer has his own ideas about what + should mean... and uses gobs of hard-to-puzzle-through magic all over the place. You need to run this C++-alike code through Bob's Pre-Processor -- but it's fine, because if Qt can do it, Bob can do it.

I want to read the code to understand the goal and the means for achieving it, not to be impressed with your brevity or cleverness.

Go was/is developed for teams, which to some degree means designing for the lowest common denominator. So far, over the last couple of years, I have -- in general -- come to see these trade-offs as wise. Go is exceptionally pragmatic for working on a team.


> Too regular? I gotta be honest, that comment makes very little sense to me.

Not agreeing or disagreeing with the parent, but I believe he may be referring to a common problem in software development that is best summed up by a quote by Yaron Minsky: "You can't pay people enough to carefully debug boring boilerplate code. I've tried."

If your Go code is just a series of `if x, err := foo(); err != nil { ... }`, it can become easy during code review to miss subtle bugs because in one function somewhere, someone wrote `err == nil` instead. This is not unique to Go; many programmers feel the same way about Java.

I find that one of Go's weaknesses is its limited ability to abstract common patterns. I also believe that for some languages (e.g. Haskell, Common Lisp, Scala) their main weakness is their apparently unlimited ability to abstract common patterns, sometimes beyond the understanding of most programmers. I'm not smart enough to come up with (a) an objective way to measure how little/how much a language allows abstraction, (b) a language that would hit a sweet spot of allowing abstraction without going into the deep end.


The problem is the sweet spot moves, and it can also be different between individuals and teams. I wouldn't even be surprised to learn that the sweet spot for a team is a lower level of abstraction than any individual team member is capable of, partially due to communication reasons and partially because "abstraction comfort" is not really on a line and you probably ought to target the minimum for any given "element" of it on your team.

It's why I've said before that while I'd very much like to work with Go, it's not my personal favorite. I'm comfortable with Haskell, but I would be engaging in malpractice to put code into the source control system that requires that level of fluency with abstraction to understand. Go has a really solid balance for large teams. If you're currently in a startup with 5 well-chosen, high-skill engineers, you won't have any clue what I mean by that, but when you've got hundreds of engineers who may touch some bit of code, where most of them are trying to spend as little of their cognitive budget as possible on it so even the very smart and very skilled ones are pretty much just stabbing the code until it does what they want it to, you start to appreciate a language that limits how hard they can twist the knife.


>write `err == nil` instead

I actually made this mistake a couple of times. Shows why it's important to unit test all error paths (of the 10k lines, at least 3k were unit tests).


The opposite of too regular is not necessarily "abuse of the C++ preprocessor" or unwarranted cleverness.

The same phenomenon exists in natural languages: The sentence structure in some passages of the bible is just too regular for me to be interesting, great authors on the other hand have an individual style while still writing with great clarity.

Go's style is simply too paratactic for me.


It is not just regular. It is dull.

Go is a miser. It gives you very few features to play with, forcing you to think hard about how to solve real problems with these features in the most economical way. The result is much less fun, and the code you write is dull. But it is still useful, often as useful as what you would write in a more fun language. And the probability that you will change your code because "feature X is the new hype" is considerably lower.

Go is not perfect in any sense of the word; every now and then its feature set turns out to be genuinely limited, and you have to apply hacks. But sometimes, when you are forced to think hard, it turns out that you can combine two or more dull features in a creative way to do something you thought was only possible with new, interesting features. This offers a perspective for looking at language features: which are really necessary, and which are there just because programmers want to have fun?

Parsimony is a virtue appreciated everywhere in science and technology, except programming, where languages are rated by the number of features they include, and complexities get added to systems until they explode. The Go perspective is a very, very valuable one in such a world.


> Parsimony is a virtue appreciated everywhere in science and technology, except programming, where languages are rated by the number of features they include, and complexities get added to systems until they explode.

You're laughably wrong.

There is an important school of thought in programming language design that emphasizes and heralds "language features" that are orthogonal, internally consistent and lean. In this school of thought, it is a plus (sometimes imposed as a necessity) when something can be expressed in the language by simply combining the primitive built-in "language features" in a certain way - as opposed to creating another "feature" altogether. And it is a plus when the list and specification of language features is short. And no - people who belong to this school of thought won't pine for more "features" so that they can make the whole thing more complicated for minor conveniences.

Ask yourself. Which of these two languages are more likely to be complemented by language nerds/snobs - C++, or Scheme...

And some people who apparently belong to this school of thought are criticizing Go for not being orthogonal enough. I wouldn't know about that - you would have to ask them about that. But if history is any indication, when something in Go is pointed out as not being orthogonal or simple, the goal post will change to nebulous claims of "pragmatism".


> There is an important school of thought...

We are advocating the same school of thought. However, languages designed by such schools rarely go mainstream - Scheme, Lua, Smalltalk, to name a few that I know of. People acknowledge them, put them on the altar and ignore them. Go is the first language in decades I know of that boasts simplicity as a feature and has gone mainstream; C is the last one before Go. There are things I dislike about Go (esp. the lack of generics), but I find it worth defending. I apologize for my somewhat cynical intonation, if that caused confusion.

> Ask yourself. Which of these two languages are more likely to be complemented by language nerds/snobs - C++, or Scheme...

I don't understand this question.

> And some people who apparently belong to this school of thought are criticizing Go for not being orthogonal enough.

I am very eager to hear such criticisms, but I haven't come across any of them. Most "criticisms" on Go are simply "Go lacks feature X", which are simply invalid. A valid criticism is "It is very tricky for me to solve problem X with Go, and adding feature Y will make things easier", but I don't always buy it unless the author is smart enough to convince me that the difficulty lies in the limitation of Go instead of the author's intelligence.


I think some of those "Go lacks feature X" are the arguments you're eager to hear, and not simply invalid. Go lacking generics but having generic map/array types built in is an example of going against the Scheme school of thought. They're not built up from primitive features, but handed down as language features.

I think this post makes those criticisms well: https://www.quora.com/Do-you-feel-that-golang-is-ugly/answer...


In addition to those, there is iota: completely special-cased syntactic sugar that is not reusable outside of a single context. You can't use it more generally, and it silently produces plain integers, which can make code non-obvious.


> Ask yourself. Which of these two languages are more likely to be complemented by language nerds/snobs - C++, or Scheme...

I don't understand this question.

I think the OP means complimented (aka praised), not complemented (aka enhanced)


> Ask yourself. Which of these two languages are more likely to be complemented by language nerds/snobs - C++, or Scheme...

I realized that you might have meant "compliment" instead of "complement". The answer depends entirely on what you call a "language nerd". For me, Scheme is a beauty to behold while C++ is an abomination.

But the depressing fact is that C++ is used way more than Scheme, even in areas where Scheme is clearly more suited. This is why I am enthusiastic about Go; it can hardly represent the entire "simplicity and orthogonality" school, but it makes very practical use of its simplicity: very regular code, a friendly learning curve, intuitive tooling, rich IDE plugins, fast compilation, simple deployment, etc. Eventually some will start to think about the reason for these advantages and appreciate simplicity on practical grounds.

Arguably, Go (and C) differs from the other languages of the orthodox "simplicity and orthogonality" school. Conceptually, Go is much more complex than, say, Smalltalk. But most, if not all, constructs in Go can be mapped to their machine representations in a simple and efficient way. So rather than adhering strictly to conceptual simplicity, Go actually prioritizes "practical simplicity" and applies conceptual simplicity whenever possible. The result is a well-thought-out trade-off that I appreciate: the language is simple enough without sacrificing too much performance or making the compiler hard to write. It is much trickier to write a good compiler for the other conceptually simple languages.

Anyway, I still loathe the lack of generics in Go. People always point to Russ Cox's generic dilemma (http://research.swtch.com/generic) when the topic of generics is brought up. Come on, you have to make trade-offs, and there are cases where it is really useful...


> we had for instance a library for interacting with a particular service that was generated from an XML schema, making the code perfectly type-safe, with different functions for each datatype. In many languages that allow compile-time metaprogramming, like C++ and D, IO cannot be performed at compile time, so such schema-driven code generation would not be possible.

Actually this would be possible in D since you can read files at compile-time with the "import(filename)" syntax. Then you can use compile-time parser generators to parse it.


I didn't realise that was possible; I've updated the post to reflect that.


Avoiding code generation is a stated goal of the language. Too bad the hype train swings in random directions.


For what it's worth I'd personally have preferred to use D, but politically it wasn't an option.


"“hey, the >8=3 operator in this obscure paper on ouroboromorphic sapphotriplets could save me 10 lines of code, I’d better include it. My coworkers won’t have trouble understanding it as the meaning is clearly expressed in the type signature: (PrimMonad W, PoshFunctor Y, ReichsLens S) => W Y S ((I -> W) -> Y) -> G -> Bool”."

This was worth it - the fact that a language's idioms are "don't write clever code" is extremely positive.


Especially when reviewing that latest code drop from the off-shoring partner company.


I've also found the lack of parametric polymorphism a huge pain point. Type safety is constantly sacrificed to allow for code-reuse and nicer APIs, leading to really awful code using type switching at best, inscrutable amounts of reflection at worst. This seems to plague the Google developers as well, just look at the Go App Engine APIs.


I ported part of a distributed system I have in production to Go about 2 years ago. I remember doing a lot of reflection trying to factor out some common SQL code. Like this:

    func CreateRecord(record interface{}) (err error) {
        t := reflect.TypeOf(record)
        v := reflect.ValueOf(record)
        // ... walk t's fields and v's values to build the INSERT statement
    }
Where I named the struct the same thing as the table it mapped to.


There are often alternatives. One of the ones I picked up from Haskell (of all places) is having an instance of something in hand solely for its type. I don't know what all you were about to do with that record, but in this case, you may have been able to use:

    type Record interface {
        Create() Record
    }
which declares an interface of things that know how to create another copy of themselves and return it. Well, the type doesn't guarantee that the same type will come out, but you can document that. Then you don't need reflection to create a new instance, as long as you can get an instance of the correct type from somewhere else. You presumably have some arguments that go in, but it's reasonably likely (though not certain) that there's some sort of regularity you can exploit and put into the type signature of Create() up there.

I've used this pattern myself in a generic "game server" that implements a network protocol and manages creating "rounds" of games and other high-level bookkeeping, where you pass it in a "prototypical" game object that provides a "CreateNew()" method on it, thus making it so the core engine can create new games without ever actually knowing what the game itself is.

If it's good enough for Haskell it's good enough for Go.

(Haskell, a bit confusingly, calls this a Proxy, perhaps because the instance is standing in as a proxy for the type? I was never quite sure where the name came from.)

I've actually written quite a bit of Go now without needing reflection, and the only "interface{}"s in the system are either A: things that legitimately can be "anything" (on a per-instance basis, i.e., not ADTs that really want to just have one type in them but legitimately on an object-by-object basis may contain an "anything") or B: things I really want to say are "encodable/decodable as JSON by the standard encoding/json library but there's no clear way to say that in the type". The latter annoys me in theory a great deal more than it annoys me in practice.

Mind you, I acknowledge there's a point where you'll be left with no other options but an interface{} or reflection, but people do end up reaching for it more quickly than is strictly speaking necessary. And it isn't necessarily a compliment for Go that it makes you think a bit harder for this sort of thing.


The lack of polymorphism and parameterized types makes Go the C of 2010s. This practically means that in a few years we will have someone, somewhere, creating Go++ and the story will repeat itself.


Go supports several types of runtime polymorphism, including interfaces and closures. It just doesn't support polymorphism through OO-style inheritance.

C supports runtime polymorphism, for that matter, it just requires a bit of boilerplate to set up and use function pointers and tagged dispatch.

You might be right about Go++ (ObjectiveGo?) being inevitable, though.


The most painful situation isn't at all what you mean; it has nothing to do with inheritance at all.

In Go, it is impossible to do "container" polymorphism. That is, write a function that manipulates, say, arrays or maps of "somethings", where the only thing it does with the somethings is assign them, read them (to return one), or check whether two "somethings" are equal to one another.

The necessity of this functionality can be seen by the fact the Go stdlib can ACTUALLY DO THIS for the built-in functions, but no facility is provided to do the same for user code.


"The necessity of this functionality can be seen by the fact the Go stdlib can ACTUALLY DO THIS for the built-in functions, but no facility is provided to do the same for user code."

Right. Go has generic types. Channels and maps are generic types. It just doesn't have user defined generics. This is a lack.

Getting around this is creating a new level of cruft on top of Go. "Generate" and reflection are being used to work around the lack of generics. It's going to be tempting to use "generate" to invoke a macro language. That may not end well.
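A sketch of how "generate" gets pulled into this: `go generate` scans source files for directives like the one below and runs the named command. The tool name and flags here are invented for illustration, standing in for the real template-expansion tools people use to stamp out per-type copies of container code.

```go
// intset_gen.go would be produced by running `go generate`; "settool" is a
// hypothetical template-expansion tool, not a real command.

//go:generate settool -type=int -out=intset_gen.go
package collections
```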


Swift looks a bit like "Go++" to me, at least on a purely syntactic level. I expect Apple to open source it sooner rather than later, at which point it will become a viable option to develop web-based services in.


Why wait for Apple, when there are OCaml, Haskell, F#, Standard ML, Rust, ATS to choose from?


ATS' type system looks particularly intimidating. Do you know of any company that uses it for RESTful services?

What I think makes Go attractive is not just the static typing but a combination of C-like syntax, GC, a simple type system (admittedly arguable) and a broad range of libraries. None of the languages you have named combines the first three but Swift does. If it is released as open source under a reasonable license the library situation, which is its biggest weakness at this point, should take care of itself.


What?!?

Swift's syntax is hardly any different from languages in the ML family.

All the languages in the ML family have automatic memory management. Be it RC, GC, or compiler dataflow models like affine types.

Swift's type system, is just like any ML language.

Finally it doesn't have web stacks like Yesod and Snap(Haskell), ASP.MVC/WebSharper (F#), Ocsigen and Eliom (OCaml).

Unless Apple decides to rewrite WebObjects from Java into Swift that is.


>Swift's syntax is hardly any different from languages in the ML family.

Call it superficial if you like but I consider the syntactic features that Swift derives from curly brace languages (not just the curly braces themselves but the syntax for generics, the lack of double semicolons, etc.) what makes it a lot more appealing than any ML I've been exposed to.

Beyond my own preferences I think this matters for language adoption. The name "Go++" implies it will be recruiting from Go users. (It's probably why Scala went with curly braces.) Here being associated with Apple might actually be a detrimental factor for Swift.

>All the languages in the ML family have automatic memory management. Be it RC, GC, or compiler dataflow models like affine types.

I never said they didn't. What I said was that none of the languages you listed combined garbage collection and a curly brace syntax. Rust and ATS are memory safe but their memory safety features require greater programmer effort to use (especially ATS', I imagine).

>Swift's type system, is just like any ML language.

You're right. I was mistaken to claim the type system as an advantage Swift has over other candidates for "Go++".

That said, I am not sure about the OO model, which Swift carries over from Objective-C. I have not used OO in OCaml. How does it compare to Swift's?

>Unless Apple decides to rewrite WebObjects from Java into Swift that is.

While that is not impossible it is unlikely. Regardless, I expect the users of the language to port something like Java's Spark long before that happens. It will only happen if and when Apple makes the language open source, of course.


>>Simple, regular syntax. When I found myself desiring to add the name of the enclosing function to the start of every log string, an Emacs regexp find-replace was sufficient, whereas more complex languages would require use of a parser to achieve this.

It would be really wonderful to have a series of tutorials on this subject. It might be a good reason for me to learn to use Emacs, anyway ..


A good place to start is http://www.emacswiki.org/emacs/RegularExpression, followed by https://www.gnu.org/software/emacs/manual/html_node/emacs/Re... and https://www.gnu.org/software/emacs/manual/html_node/emacs/Re.... They're more references than tutorials though; unfortunately I'm not aware of any comprehensive tutorials on the subject.


If you are already using emacs, I recommend setting regular expression find/replace as your default search mode. Ctrl-s, etc, so it becomes second nature.


Another way to use search in Emacs is to go to a place that you need to edit/insert text. Instead of using the movement commands or the mouse, use forward or backwards search to search for the point you need to be at: like "(str" in "(String s) ...".

I don't know if it is faster for me. But it can feel more ergonomic, since you don't have to make so much effort into going to a specific point. I'm getting better at it, though: maybe soon it will become second-nature.


That way also makes writing kbd macros a lot easier


No inheritance. I’ve personally come to view inheritance-based OO as somewhat of an antipattern in many cases, bloating and obscuring code for little benefit

So everywhere you would use inheritance, you use composition instead? The stuff you'd have stuck in the superclass, you stick somewhere else and stick in your struct?


> So everywhere you would use inheritance, you use composition instead? The stuff you'd have stuck in the superclass, you stick somewhere else and stick in your struct?

No, Go provides interfaces (aka traits or roles), which you can use to share composable functions (aka methods).


No, Go provides interfaces (aka traits or roles), which you can use to share composable functions (aka methods).

I wasn't talking about creating conventions/apis/facades, rather about how one reuses code.


I feel many of the pros and cons listed in this article are not related to the porting at all.

btw, I think Go is really not a replacement for C++. In my experience, Go is more a replacement for Java, to improve development speed.


The pros and cons are all based on what was learned from the porting process. Unfortunately the confidential nature of the software prevents discussing it in greater detail.

Go can be a replacement for C++ for programs that didn't need to be written in C++. There aren't many programs like that around nowadays however, as most of the time C++ is only used when it's really necessary, such as for extremely latency-sensitive applications or applications requiring precise memory/allocation control.


I wouldn't bet on that. I would expect C++ programs to have been written ages ago and not having been rewritten because of a combination of "it works"/"I can't read it"/"It's not my code". The most famous C++ to Go port must be dl.google.com (http://talks.golang.org/2013/oscon-dl.slide#1), which arguably never needed to be written in C++ in the first place, except it was probably the only reasonable choice at the time.

In other words: legacy.


i think in the long run, having to maintain code that is readable helps a lot. even though c++ itself is easy to follow, it starts to get complicated once you get deeper and deeper into oo where everything is inherited from something else. this is one thing i like about python modules: they are highly readable. and then write something in Cython if it's time critical. same with Go, the code feels very clean and easy to maintain. i haven't done parametric polymorphism in c++, so no idea about it.


> and then write something in cpython if its time critical

I believe you mean Cython? :)


updated :)


"Since one of my reasons for getting into programming was the opportunity to get paid to use Emacs, this is definitely a huge plus."

Wow! Wish I could say the same!


"one of my reasons for getting into auto repair was the opportunity to get paid to use Snap-On tools"

I feel text editor loyalty too, but this statement does feel a bit strange :)


Please, use Dialyzer for Erlang.

BTW, I don't know Go, but Erlang has per-process GC, so there won't be a large heap to scan.


Go doesn't have per-process GCs, because goroutines share memory. Structures are not copied or moved across channels, a pointer to the structure is copied and both sender and receiver get access to the same object in memory.


I'm a PHP programmer. Can someone explain why the lack of parametric polymorphism is a big deal?


Say you need a Vector2D type:

    type Vector2D struct {
        x int
        y int
    }
Except, hold on, I have a routine that needs floats. In a dynamic language, I'd leave off the type hints; with a decent type system I'd parameterise the `int` type; in Go I have to reimplement the whole type:

    type Vector2DFloat struct {
        x float64
        y float64
    }
Lather, rinse and repeat for complex numbers, vectors of vectors, etc. The only way around this is (a) to use the `interface{}` type (in which case you're just using a very verbose dynamic language) or (b) to rely on lots of text-based code generation.


Maybe my brain hasn't started firing on all cylinders yet this morning... but are you saying that:

If I have some function in Go (pseudocode - I don't know Go but it should get my point across)

f(float x) { }

and call

f(vec.x)

Go can't cast it to a float?

I'm not being sarcastic or anything - I'm genuinely curious on Go and your example.


No, what he means is that you would need to write these two functions:

        func DotProductInt(v1, v2 Vector2D) int {
             return v1.x*v2.x + v1.y*v2.y
        }

        func DotProductFloat(v1, v2 Vector2DFloat) float64 {
             return v1.x*v2.x + v1.y*v2.y
        }
If Go allowed for parametric polymorphism, the type of the elements could be abstracted away like this (not real syntax, obviously):

        type Vector2D<t> struct {
             x, y t
        }

        func DotProduct(v1, v2 Vector2D<t>) t {
             return v1.x*v2.x + v1.y*v2.y
        }


To answer your question, no, but it's not really the key issue. The problem is storage; I can't store a value of 1.5 in my original Vector2D, so I had to reimplement it.

If that sounds silly (why not just change the original Vector2D to floats?) imagine you're implementing a game, for example, and you want to store game objects in sets. You can't just write one type-safe Set structure – you have to implement Monster1Set, Monster2Set, PlayerSet, Coin1Set...

Basically, any time you want to go outside of Go's built-in array and dictionary types, you run into problems.


What is the fascination of software people with LOC as a code measure / metric?

It seems indicative of nothing, not quality, especially not readability nor maintainability.

I've never understood this apart from the very early days of punch cards and memory/storage limitations which placed physical limitations on computation.


> It seems indicative of nothing, not quality, especially not readability nor maintainability.

It's not meant to be indicative of any of those things (though I'd argue, all else being equal, maintaining an application with more LOC is harder than one with less).

It's indicative of complexity and scale. A programmer can read through an entire 500 LOC program and will know everything about the program. This becomes much more difficult for a 50K LOC program and outright impossible for a 5 million LOC program.

Taking a 50K LOC program and bringing it down to 10K (regardless if that's by refactoring, removing unneeded code, or rewriting in a new language) makes it much easier for each developer to know/understand a larger portion of the program.

Prior to my current job I had only worked on relatively small applications (<25K LOC) and I was blown away by the difference between working on things like that and working on something measured in the millions.

[And in this specific context I doubt it would be on the front page of HN if somebody took 500 lines of C++ and rewrote them in a hundred lines of Go, i.e. knowing the LOC is useful to determine if this was a meaningful undertaking or not]


AFAIR one of the few things that software engineering research has consistently shown is that bug count is correlated with LOC, esp. with churn, i.e. number of lines changed. (I can't recall exactly how robust the effect is, but I remember reading about it in "Making Software: What Really Works, and Why We Believe It" and you can probably find a few cites in there.)


Writing 50k lines of code takes a lot of time, and every line has a chance of spawning a bug or undesired behaviour. Of course it's not a perfect metric of a system, and of course a line of code written by some experienced C guru will be completely different from one written by a newcomer to the language. But it is also true that the fewer lines of code, the fewer potential points of failure in a program.

Also, if you were given the task to read and analyze a piece of code, you'd surely wish it were shorter! As long as the software performs the desired tasks and readability is cared for, shorter tends to be better.


Most software engineers still view software construction as the task of manipulating text files. That is why they have such a fascination with lines of code. People coming from a different programming tradition such as Smalltalk or APL know that this makes no sense.


"It also allows parallel/async code to be written in the exact same way as concurrent code, simply by setting GOMAXPROCS to 1."

Aargh! If your code has race conditions with GOMAXPROCS > 1, it's broken. "Share by communicating, not by sharing". (Ignore the bad examples in "Ineffective Go". Send the actual data over the channel, not a reference to it. Don't try to use channels as a locking mechanism.)


What's the point of using GC in high performance software? Seriously. In my opinion, it makes no sense.



