Experience report on a large Python-to-Go translation (gitlab.com)
274 points by psxuaw 16 days ago | 94 comments



Here's another experience report: I ported a small 1 KLOC PHP project to Go this week (in some spare time between large C++ compile times). The primary goal was to reduce the number of supported languages we use.

The port happened in the mechanical line-by-line way, copying each PHP file to a *.go and fixing all the syntax. The project was small enough that automation wasn't interesting.

I agree with the "1/3 time spent debugging the result". Another complicated facet was the lack of insertion-order preserving maps, that PHP applications end up relying on heavily. The error/exception impedance mismatch was not a problem in practice at all.
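
For anyone hitting the same gap, here is a minimal sketch of one common workaround: a map paired with a slice that remembers first-insertion order. `OrderedMap` and its methods are hypothetical names for illustration, not something from the port or from the standard library.

```go
package main

import "fmt"

// OrderedMap is a hypothetical helper: a map plus a slice that records
// the order in which keys were first inserted.
type OrderedMap struct {
	keys   []string
	values map[string]string
}

// NewOrderedMap returns an empty, ready-to-use OrderedMap.
func NewOrderedMap() *OrderedMap {
	return &OrderedMap{values: make(map[string]string)}
}

// Set stores value under key. A new key is appended to the order slice;
// an existing key keeps its original position.
func (m *OrderedMap) Set(key, value string) {
	if _, exists := m.values[key]; !exists {
		m.keys = append(m.keys, key)
	}
	m.values[key] = value
}

// Get returns the value for key and whether it was present.
func (m *OrderedMap) Get(key string) (string, bool) {
	v, ok := m.values[key]
	return v, ok
}

// Keys returns the keys in insertion order. It returns a copy so callers
// cannot disturb the internal order.
func (m *OrderedMap) Keys() []string {
	return append([]string(nil), m.keys...)
}

func main() {
	m := NewOrderedMap()
	m.Set("zebra", "1")
	m.Set("apple", "2")
	m.Set("zebra", "3") // update: position unchanged
	for _, k := range m.Keys() {
		v, _ := m.Get(k)
		fmt.Println(k, v)
	}
}
```

Deletion is left out on purpose: removing a key from the slice is O(n) unless you add an index, which is where a sketch like this starts growing into a real library.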

According to cloc, the original PHP project (excluding vendor) is 1.0 KLOC, the resulting Go application is 1.2 KLOC. I imagined Go would have been more verbose than this, but actually most lines remained 1:1 conversions, and the Go standard library happened to cover a lot of small utility functions that had to be separately written in PHP (e.g. for string suffix matching).

Another interesting point is the number of comment lines in cloc appeared to drop dramatically, since real type annotations are much less verbose than PHPDoc.


Sounds about right! Here's another: https://benhoyt.com/writings/learning-go/ ... I ported a medium-sized web backend in Python to Go.

"Due to Go’s static typing and because I was using fewer libraries, I expected that the code would end up being more than twice as many lines of code. However, it was only 1900 lines of Go (about 50% more than the 1300 lines of Python)."

"The porting effort was very smooth, and a lot of the business logic was almost mechanical, line-for-line porting of the original Python. I was surprised how well many Python concepts translate to Go, right down to the things[:20] slice notation."


Small nitpick: in a statically typed language, you don't have "type annotations", since an annotation is usually optional. If you're talking about a function declaration, I think it's called a parameter declaration?


Small nitpick: some statically typed programming languages have optional type annotations.


You're right, that is the word I was looking for.


I'd like to compliment the author on the quality of this post. It's very well written, data/example driven, fair, and educational.

Overall, a joy to read. Thank you!


He's been slightly famous for technical writing for a long time

https://en.m.wikipedia.org/wiki/Eric_S._Raymond


Came here to say this. Engaging, thoughtful, truly well written... This excellent piece is a real contribution to the body of knowledge of language design, and I'm grateful I got to read it.


I freakin' love python, but also it feels like the community is top notch among languages.


Go is probably more verbose because it’s missing list comprehensions, for example.

It needs map, filter, reduce to reduce line count. Swift, while probably not as performant as Go, makes it easy to write in a more Pythonic style.

   [1,2,3,4,5,6,7,8,9].filter {$0 % 2 == 0}.map {$0 * 2}.reduce(0, +)

   ["550", "a", "6", "b", "42", "99", "100"].compactMap{Int($0)}.filter {$0 < 100}

https://github.com/melling/SwiftCookBook/blob/master/functio...


This is probably my single biggest complaint. I spend an inordinate amount of time when writing Go doing tedious, fiddly stuff like this using loops and indexes and accumulators and stuff to perform extremely common operations that could be represented by a single word.
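
To show what that looks like in practice, here is a rough Go rendering of the two Swift one-liners above, written with plain loops and accumulators. The function names are made up for this sketch.

```go
package main

import (
	"fmt"
	"strconv"
)

// sumOfDoubledEvens mirrors filter(even).map(*2).reduce(0, +).
func sumOfDoubledEvens(nums []int) int {
	sum := 0
	for _, n := range nums {
		if n%2 == 0 {
			sum += n * 2
		}
	}
	return sum
}

// parseBelow mirrors compactMap{Int($0)}.filter{$0 < limit}: strings that
// do not parse as integers are skipped.
func parseBelow(items []string, limit int) []int {
	var out []int
	for _, s := range items {
		n, err := strconv.Atoi(s)
		if err != nil {
			continue
		}
		if n < limit {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	fmt.Println(sumOfDoubledEvens([]int{1, 2, 3, 4, 5, 6, 7, 8, 9})) // 40
	fmt.Println(parseBelow([]string{"550", "a", "6", "b", "42", "99", "100"}, 100)) // [6 42 99]
}
```

Two expressions in Swift become roughly twenty lines here, which is the tedium being described, although each loop is easy to step through.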


Note that this is the case because Go lacks generics, so it's not a quick or easy fix
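
For what it's worth, type parameters did eventually land in Go 1.18, after this thread. With them the helpers can be written once. A minimal sketch follows; `Filter`, `Map` and `Reduce` are hypothetical names defined here, not standard library functions.

```go
package main

import "fmt"

// Filter returns the elements of xs for which keep returns true.
func Filter[T any](xs []T, keep func(T) bool) []T {
	var out []T
	for _, x := range xs {
		if keep(x) {
			out = append(out, x)
		}
	}
	return out
}

// Map applies f to every element of xs.
func Map[T, U any](xs []T, f func(T) U) []U {
	out := make([]U, 0, len(xs))
	for _, x := range xs {
		out = append(out, f(x))
	}
	return out
}

// Reduce folds xs into a single value, starting from init.
func Reduce[T, A any](xs []T, init A, f func(A, T) A) A {
	acc := init
	for _, x := range xs {
		acc = f(acc, x)
	}
	return acc
}

func main() {
	nums := []int{1, 2, 3, 4, 5, 6, 7, 8, 9}
	evens := Filter(nums, func(n int) bool { return n%2 == 0 })
	doubled := Map(evens, func(n int) int { return n * 2 })
	total := Reduce(doubled, 0, func(acc, n int) int { return acc + n })
	fmt.Println(total) // 40
}
```

The call sites are still wordier than the Swift version because Go has no short closure syntax, so generics shrink the helpers more than they shrink the calling code.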


I'm really hoping to see this soon:

https://blog.golang.org/why-generics


But that assumes reducing line count is the goal; one could argue that functional constructs like that sacrifice length in favor of complexity / cleverness.

Mind you I do like and prefer functional style, I've done a lot of the iteration style processing back in Java 1.5 and never gone back to that yet. Functional style expresses the 'what', whereas iterative style spends a lot of code expressing the 'how'.

I'd spread the operations out over multiple lines, one operation per line at least. But that's a personal style preference.


[edit: fixed links]

This was discussed recently on the go-nuts mailing list: https://groups.google.com/d/msg/golang-nuts/u-L7PRa2Z-w/kfUS...

There was also discussion around an earlier post he made about the work: https://groups.google.com/d/msg/golang-nuts/WstriKt2jTA/lsZy...


I'm sad that ESR has not (yet) responded to Nigel Tao's comments on the first thread, which pretty much capture all my thoughts from reading the blog post exactly.


> The problem directed the choice of Go, not the other way around. I seriously considered OCaml or a compiled Lisp as alternatives. I concluded that in either case the semantic gap between Python and the target language was so large that translation would be impractical. Only Go offered me any practical hope.

I wish they expanded more on this point. Do they mean that rewriting in, say, Lisp would be longer because it wouldn't be a 'port' and more like writing a new program from scratch?

EDIT: Spoke too soon. Reading more carefully, I answered my own question.

> Python reposurgeon was 14 KLOC of dense code. At that scale, any prudent person in a situation like this will perform as linear and literal a translation as possible;


Yeah, that point doesn't seem to make a lot of sense. When people talk about Python and Go being similar, or filling a similar niche, I gather that this refers mostly to the short compilation time, rich standard library and ease of use. They are quite different in other ways.

Go might also be more similar in syntax to Python than Lisp is, but the article talks about the "semantic gap" (not syntactic gap) which seems much narrower in Lisp vs. Python than in Go vs. Python.

> I did examine two automated tools for Python to Go translation, but rejected them because I judged the generated Go code would have been a maintainability disaster.

Perhaps indicating that the languages are not all that similar semantically.


I'd be interested in reading more about that, too.

I'm familiar with both Python and Common Lisp, and to me they've always felt very similar.


Interesting. I'd have expected more than a 50% code expansion going to Go, maybe even 3x or 5x.

Similarly, he's using 40x speedup as a rule of thumb. I usually think of Python as 20x slower than C.

Personally I'd be loath to convert a working Python system to Go, but it sounds like he had good reasons. I do wonder a bit whether divide-and-conquer or a C extension might not have worked instead.


"Interesting. I'd have expected more than a 50% code expansion going to Go, maybe even 3x or 5x."

This has been my extensive experience as well. I wouldn't be able to use Go if it was that much more verbose than Python. It certainly isn't as succinct as Python by any means, but it's not the night-and-day nightmare a lot of HN posters seem to think it is... provided you actually learn the language.

In fact, one of the questions that I've found coming up in my head is... if you take the huge, huge pile of features that Python or other languages bring to the table that Go doesn't, and all their corresponding disadvantages in terms of having to learn them all, and how they all interact, etc.... and that's all you get in real, production code... is it really worth it? Because let's not mince words... it's a long list of features, all of which superficially seem awesome. And I can craft one-liners that would be a dozen lines or more of Go... but usually those one-liners have a lot of single-letter variables in them to focus on all the awesome syntax and features. When I get into real code with real variable names, the advantage fades fast.

I find this to be food for thought. I still haven't fully integrated it into my worldview. But I can definitely say I feel like the cost/benefit matrix I had in my head even three years ago has shifted a lot. Perhaps it would be fair to say I haven't necessarily lowered my estimate of the benefits of all the fancy features, but my estimates of their costs have gone significantly up. A lot of them are really cheap in the moment you're writing them down for the first time and using them, but carry hidden long-term costs that I feel like younger me was not accounting for properly, especially if you are not the only developer.


You wouldn't take the huge pile of features in an all or nothing. You get to pick and choose. I think the author's wish list is similar to most people who are experienced with more expressive languages. I don't see calls for Python f-strings or async syntax to end up in Go. But something like a list-comprehension syntax I think would be really popular (based on what I saw when it was introduced to Python, originally 'why' but now many people's favorite feature).

Interesting how Python is now considered a huge pile of features, considering how long it took for things like a ternary operator to end up in the language. Maybe Go is in its Python 1.5.2 lifecycle stage ;)


"Interesting how Python is now considered a huge pile of features"

I've been tracking Python since 1.5.2 was common and 2.0 was just coming out. For me, it's not been a problem because the new features were added incrementally.

But I've seen new programmers try to come at it in 2019 and 2020, swallowing what I got spread out in ~15 major releases over ~20 years all in one big chunk, and learning even the core language now is definitely "a pile of features", to say nothing of the various library ecosystems. I don't even mean this necessarily as a criticism per se, just a description. It's not an easy-to-learn language anymore, and I don't recommend that people model it as such. It's a power tool for developers, not an easy learning language.

And, yes, the "you can pick and choose the features" is a non-starter in a team environment. At best, the team can pick and choose, and that takes a rather strong hand and alignment to achieve. It is a very common case that you'll just end up with what your teammates use.


> And, yes, the "you can pick and choose the features" is a non-starter in a team environment. At best, the team can pick and choose, and that takes a rather strong hand and alignment to achieve. It is a very common case that you'll just end up with what your teammates use.

I wonder if this wouldn't be much of a problem by, say, 2025. Already I'm used to having multiple linters run on save that fix up spacing and indentation like Prettier does. I'd expect people to make tools that'll complain if you're about to check in code that's too complicated for your team (uses lots of parentheses in a line without temporary variables, uses overly-complicated reduce() calls, etc.).

Prettier: https://prettier.io


That's a good point. One of the original appeals of Python is that it started as a pedagogical language, and it shows. Even Python2 today still has most of that feel.

Python3 on the other hand, just isn't. You're pretty much required to constantly deal with Unicode and its related issues, and this is a burden for beginners. In Python2, you can kick that can down the road almost indefinitely.


That hasn't been my experience. In Python 2 your program would appear to work fine initially, but the first time your data happens to contain a non-ASCII character you would get a mysterious UnicodeDecodeError somewhere far away from where the actual problem is.

In Python 3 it just works again, like it did in Python 1.5, before the whole str/unicode implicit-conversion mess was introduced in Python 2.


> You get to pick and choose.

In theory; in practice though, you need to have all developers (and yourself) strictly aligned on what features to use. In my limited experience with Scala and working with Scala developers, there are as many styles and preferences as there are developers on a codebase.


I've been saying for a long time that every abstraction has a cost. Sometimes that cost is hard to quantify or externalized, but our inability to quantify it doesn't mean the cost doesn't exist.

Abstractions have to at least pay for themselves many times over to be worth the extra cognitive burden and we need to get better at measuring these trade-offs. The success of languages like Go hints that there are more costs than we have traditionally been willing to acknowledge.


The one-time cost of learning a good abstraction is strictly less than the ongoing cost of understanding and then continually reimplementing it by hand. The purpose of a high-level language is to make programs more concise and clear; a language that doesn't do this may somehow become popular but that shouldn't be mistaken for successful.


> The one-time cost of learning a good abstraction is strictly less than the ongoing cost of understanding and then continually reimplementing it by hand.

There is a danger of a "no true Scotsman" fallacy here - it's easy to define a "good" abstraction as one that is worth the one-time cost of learning it.

So, given that there are both "good" and "bad" abstractions in every language, is the net gain from learning them all greater than the cognitive effort of having to learn them all?

I'd argue that no, it's not. Absolutely reducing the total number of abstractions in the language reduces the cognitive load of working in that language, even if it requires us to be more verbose.


I look on it as falling out of the https://en.wikipedia.org/wiki/Rule_of_least_power. When I see reuse of an abstraction I know not only what it does but what it doesn't do. If I skip that and start writing custom code, well, that code could do anything at all so everyone has to continually reread it carefully. I find poring over tedious boilerplate to be a waste of precious lifetime compared to learning better building blocks.


There's more to the cost of using an abstraction than just the cost of learning it. Example: an abstraction that makes a piece of code easier to test may introduce indirections that make it harder to read. That trade-off may be worth it, but not acknowledging that there is a trade-off is the problem I'm getting at.

The religious adherence to "use all abstractions" is as dangerous as the religious adherence to "use no abstractions." My point is we have to get better at quantifying this stuff otherwise we'll forever be stuck in this cycle of arguing which abstractions are appropriate and when but never actually being able to quantify when we are right.


"The one-time cost of learning a good abstraction is strictly less than the ongoing cost of understanding and then continually reimplementing it by hand."

In theory, I agree with you. However, the main thrust of my post is that the practice is not working out as that theory predicts.

I haven't fully worked out all the bits and pieces, but I do know that just repeating that the theory must be correct because it must be correct even if it contradicts the evidence is not the correct way forward. Theory bows to evidence, not the other way around.


I have pretty good mastery of Python2, and in my hands, a few lines of Python would blow up into a page of Go. This really frustrated me and was one of the reasons I bounced off of it.

Other reasons: no fork(), poor make integration, weird source dir structure, and I once caught one of the go tools trying to write system files in /usr/lib or something (which fortunately failed on permissions). Can't recall the specifics of that last, and perhaps it was somehow misconfigured.

Python2 has a good feature set, and one of the downsides of Python3 is that it greatly enlarges what you need to know without much expanding what you can do. It's a real problem, and it would be a shame if it ended up where C++ is these days. ("Poor Joe--a C++ spec fell off a high shelf and killed him...")


Well... fancy features/abstractions are pretty low cost, if you use just one. It's when they interact that they really get expensive unless they are almost flawlessly designed to work together (and even then, they just get somewhat more expensive).

It's like the first donut has 100 calories, the second has 200, the third has 400, and the fourth has 800. If you eat the whole box, it's really going to hurt. Rather than eat them all, you need to pick the one or two that you really want, and leave the rest in the box.


When you have a system that is at big scale, even a 2x speedup can relieve a huge amount of problems.

I think the fact that Go, which is very inexpressive for a modern language, only expanded the code size that much should make people consider whether we should ever be using interpreted, dynamically typed languages. Perhaps various JITted/compiled and statically typed languages can offer the same productivity on medium to large sized projects while just being a lot faster on top of it.

People who love Python may especially enjoy Nim, or F#


Having done a significant amount of programming in both Python-like and statically typed languages, I think the idea that they offer similar productivity (i.e., time to correct, maintainable implementation) is simply false.

And in a number of cases I've turned an existing program in a language like C or Java into a faster program in Python. One of the benefits of a fluent language is that you can explore more of the possibility space and look for better solutions. That's difficult and costly in static languages.


Nim is the closest language to Python. Translating from Python is a breeze and you get speeds comparable with pure C.


You seem like a Nimmer. What would you describe as the biggest downsides of Nim wrt Python? (aside from the fact that it's currently a niche language.)


The only two that I can think of are due to static typing:

You cannot enforce types only where you want, like mypy. Yet, type inference and automated conversions help quite a bit.

There's an unofficial REPL in the compiler but you cannot do dir() or tab-complete methods/procs. You have to rely on documentation, nimsuggest or IDEs instead.


If you like F#, it’s based on OCaml.


> it sounds like he had good reasons

Meh.

> Subversion-to-git conversion of the Gnu Compiler Collection history, at over 280K commits (over 1.6M individual change actions), was the straw that broke the camel’s back. Using PyPy with every optimization on semi-custom hardware tuned for this job still yielded test conversion times of over 9 hours, which is death on the recipe-debugging cycle.

Just how often does one need to convert the Gnu Compiler Collection from Subversion to Git in under 9 hours?


> Just how often does one need to convert the Gnu Compiler Collection from Subversion to Git in under 9 hours?

For this particular repository, once the conversion is complete, it's complete. But there are plenty of other old projects out there with large repositories in old version control systems. Reposurgeon is a general tool.

Also, one repository conversion does not mean one run of reposurgeon and you're done. Reposurgeon is meant to be run multiple times since each run is likely to uncover issues that have to be addressed, and the way to address them is to do another run with updated parameters. Reposurgeon also has an interactive mode where you can explore a repository and test the results of various possible transformations.


I was wondering that, too. My guess is that he's using this as a (large) set of test data to test his "recipes" and has to do that a lot. Having to run many nine-hour trials serially is painful.

Not sure I'd have bothered, but it's a defensible choice. And produced this interesting example of comparable implementations.


There would probably be a much longer list of issues if ESR had converted to Rust instead, but the syntax for error returns is quite interesting. Rust and Go both opt not to have exceptions, instead they use error return values.

The original Python code using exceptions was:

    sink = transform3(transform2(transform1(source)))

Making that use error return values looks quite verbose in Go, but Rust has syntax specifically for that case, making it quite manageable:

    sink = transform3(transform2(transform1(source)?)?)?
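
For comparison, here is roughly what the same chain looks like in Go with explicit error returns. The three `transform` functions are hypothetical stand-ins with made-up bodies; only the shape of `pipeline` matters.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// transform1, transform2 and transform3 are hypothetical stand-ins;
// each returns a result and an error.
func transform1(s string) (string, error) {
	if s == "" {
		return "", errors.New("transform1: empty input")
	}
	return strings.TrimSpace(s), nil
}

func transform2(s string) (string, error) {
	return strings.ToUpper(s), nil
}

func transform3(s string) (string, error) {
	return s + "!", nil
}

// pipeline is the Go spelling of transform3(transform2(transform1(source))):
// every step is followed by its own error check.
func pipeline(source string) (string, error) {
	a, err := transform1(source)
	if err != nil {
		return "", err
	}
	b, err := transform2(a)
	if err != nil {
		return "", err
	}
	sink, err := transform3(b)
	if err != nil {
		return "", err
	}
	return sink, nil
}

func main() {
	fmt.Println(pipeline(" hello "))
	fmt.Println(pipeline(""))
}
```

One line of Python or Rust becomes about a dozen lines of Go, all of it mechanical.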


> if ESR had converted to Rust instead,

Rust would have solved three of his complaints with Go: Sum types, iterators, and generics.

Borrow checking can get especially tricky with graph algorithms though, so perhaps that would have been a new issue.


For graph algorithms there is a good Petgraph[1][2] crate, so many common operations can be done in no time.

[1] https://crates.io/crates/petgraph

[2] https://github.com/petgraph/petgraph


It's ugly either way but for clarity you can move the error checks into the transformation leaving the call point clean.

i.e. transform1 returns (result, error) and transform2 accepts (result, error) and short-circuits if err is not nil.

It allows the expressive and succinct description of a transformation list but with a bunch of messiness hidden.
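
A minimal sketch of that idea, assuming string-valued transforms (the function bodies are invented for illustration). It relies on Go allowing a multi-value call to be passed directly as the argument list of another call.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// All three functions are hypothetical. Each one after the first accepts
// the previous (result, error) pair and passes an incoming error through.
func transform1(s string) (string, error) {
	if s == "" {
		return "", errors.New("transform1: empty input")
	}
	return strings.TrimSpace(s), nil
}

func transform2(s string, err error) (string, error) {
	if err != nil {
		return "", err // short-circuit: skip the real work
	}
	return strings.ToUpper(s), nil
}

func transform3(s string, err error) (string, error) {
	if err != nil {
		return "", err // short-circuit: skip the real work
	}
	return s + "!", nil
}

func main() {
	// The call site keeps the nested shape of the Python original.
	sink, err := transform3(transform2(transform1(" hello ")))
	fmt.Println(sink, err)

	sink, err = transform3(transform2(transform1("")))
	fmt.Println(sink, err)
}
```

The cost is that every function's signature now carries an error parameter, so the transforms are awkward to call on their own.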


This isn't ideal, because it forces each successive function that takes a Result type to add handling code in the case of an error.

fn(fn(fn()?)?)? is a bit gnarly but better than duplicate code like that, imo.


something something Maybe monad


He evaluated Rust and Go a few years ago and strongly preferred Go.

https://blog.ntpsec.org/2017/01/18/rust-vs-go.html


the final ")?)?)?" is kinda unsettling, but I can't explain why.


ESR tried Rust a while back, and decided to go with Go in general.


Adding lookbehinds to the regexp library is a terrible idea.

> The regexp implementation provided by this package is guaranteed to run in time linear in the size of the input.

Python's is exponential, because it inherits all the non-regular "regular" expression mess (such as lookbehind and backrefs) from perl.
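
As a concrete illustration, Go's regexp package refuses to compile a lookbehind, and a capture group often covers the same need. The pattern and the helper names below are made up for this sketch.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether Go's regexp package accepts the pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

// dollarAmount matches a literal '$' and captures the digits after it,
// standing in for the Perl-style lookbehind (?<=\$)\d+.
var dollarAmount = regexp.MustCompile(`\$(\d+)`)

// amountAfterDollar returns the digits following the first '$' in s.
func amountAfterDollar(s string) (string, bool) {
	m := dollarAmount.FindStringSubmatch(s)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	fmt.Println(compiles(`(?<=\$)\d+`)) // lookbehind: rejected
	fmt.Println(compiles(`\$(\d+)`))    // capture group: accepted
	fmt.Println(amountAfterDollar("total: $42"))
}
```

Not every lookbehind can be rewritten this way, but many of the simple ones can.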

One would assume esr would have marinated in unix culture for long enough to be aware of this.


Some projects need performant regexes, and some honestly just don't. I agree that keeping the base regex library linear is admirable, but it'd be nice if they offered a well-marked thing like regex.slow_and_perl_like in the stdlib



This would contravene the training-wheels nature of Go.

Also, Perl-style (ir-)regular expressions, despite their popularity, are not a worthwhile abstraction IMO and thus should not be enshrined in the standard library of programming languages, even if you want them in your library eco-system for end-user facing API compatibility.


I love reading ESR, and it's such a pleasure to see how great his writing is in 2020. This made me hark back to The Cathedral and the Bazaar - and also echoed my usage.

I do however find that when you have a Python project that is heavily dependent on third-party libraries, these things get significantly larger and more problematic. That's not really a commentary on Go so much as a byproduct of the longevity of Python.


I do not get why people do these total rewrites, especially for working Python systems. Why throw out the baby with the bathwater? Python is fundamentally a composing toolkit. Rewrite the slow bits in C++/Rust/Go and wrap it. That's how all major Python components like Numpy, Scipy, Tensorflow, PyTorch etc. do it. And that's a major reason why Python dominates today.

Align with the core strengths of Python's philosophy and its toolset and get the benefit. Why fight it ?


> Rewrite the slow bits in C++/Rust/Go and wrap it.

For reposurgeon, you can't. Not the author, but I have done some hacking on it. "The slow bits" are not things you can rewrite in a different language and wrap. They are much too integral to the code.

> That's how all major Python components like Numpy, Scipy, Tensorflow, PyTorch etc. do it.

And what all of these have in common is that the "slow bits" are not like those in reposurgeon. The wrapped speeded-up things are basically fast implementations of appropriate basic data types, like Numpy arrays and vectors. Those aren't the kinds of things that are slow in reposurgeon.


So which parts and things are the slow bits of reposurgeon? ESR seemed to be saying[1] that the last time he tried profiling it was seven years ago.

[1] http://esr.ibiblio.org/?p=8161&cpage=1#comment-2065946


"Rewrite the slow bits" only works when you have a solid hot loop that you can rewrite. But what if most of your program is the hot loop?

This particular problem is not particularly numerically oriented. There's nothing he could feed into a external library to speed it up. It is highly algorithmic code, pretty much the worst case for Python.


> Align with the core strengths of Python's philosophy and its toolset and get the benefit. Why fight it ?

In this particular case, it's worth noting that reposurgeon highlights some core weaknesses in Python's philosophy and its toolset. The two biggest are the amount of memory required for even basic objects and the GIL.


I think it is fine when people are doing rewrites on their time and dime. It is certainly better than the situation where language fans just make drive-by comments to authors on github etc to rewrite stuff in xyz-lang because it is so much better.


Because a nine-hour runtime is really long when you're trying out different rules to make sure that the conversion happens the way you want. It's almost like a REPL with a nine-hour response time (not quite, but it's in that direction). That's so completely unworkable that a rewrite into something faster is basically your only option.


> The main barrier to translation was that, while at 14KLOC of Python reposurgeon was not especially large, the code is very dense.

Reasoning about somebody else’s dense code is probably one of my least favorite activities. When I hear about a language being “expressive” or having “flexible syntax” I shudder.


I think the author was reasoning about his own dense code, though.


significant mastery ahead!

This is a success story and a teaching document.

.. have to point to this : "Now that I’ve seen Go strings… holy hell, Python 3 unicode strings sure look like a nasty botch in retrospect. " (!)


I wish he had been a bit more explicit there, TBH. What's better about Go strings?


Skimming over a post from a quick search [0], it looks like Go strictly uses byte strings and manipulates them with various keywords and functions. No encoding/decoding between byte strings (and picking an encoding) and unicode strings like in python.

[0] https://blog.golang.org/strings


I think that’s not really the right way to put it. Rather, Go offers UTF-8 strings and byte slices, with a simple typecast in either way, with various keywords and functions to DTRT to each. One still must worry about encoding & decoding if one doesn’t want UTF-8, and one must worry about invalid UTF-8 in a byte slice when casting, but in general Go does what one would expect with a minimum of fuss.


Go strings are just a sequence of bytes in no particular encoding, i.e. they can contain arbitrary data. Converting strings to byte slices and vice versa works always without ever changing a single bit.
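
A small sketch that shows both views at once: the string is raw bytes under the hood, while `range` and the `unicode/utf8` helpers interpret those bytes as UTF-8 on demand. `counts` and `roundTrips` are hypothetical helpers written for this example.

```go
package main

import (
	"bytes"
	"fmt"
	"unicode/utf8"
)

// counts returns the length of s in bytes and in Unicode code points.
func counts(s string) (byteLen, runeLen int) {
	return len(s), utf8.RuneCountInString(s)
}

// roundTrips reports whether converting raw bytes to a string and back
// yields the same bytes, whether or not they are valid UTF-8.
func roundTrips(raw []byte) bool {
	return bytes.Equal([]byte(string(raw)), raw)
}

func main() {
	s := "h\u00e9llo" // U+00E9 (e with acute accent) takes two bytes in UTF-8
	fmt.Println(counts(s))

	// range decodes UTF-8 as it goes: i is a byte offset, r a code point.
	for i, r := range s {
		fmt.Printf("%d:%c ", i, r)
	}
	fmt.Println()

	bad := []byte{'o', 'k', 0xff} // 0xff never occurs in valid UTF-8
	fmt.Println(utf8.Valid(bad), roundTrips(bad))
}
```

The invalid byte survives the conversion untouched. It only turns into U+FFFD if you decode it, for example by ranging over the string.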


He wrote about this in more detail in an earlier blog post (about halfway down the page).

http://esr.ibiblio.org/?p=8161


If it was too slow in Python and is now moving to Go, could there be a time when there is a need to move to Rust/C/C++ for even faster performance? Go seems an odd choice based on performance considerations alone.


As he said, given the size and complexity of the transition, one of his primary goals was to do as much of a "literal" translation first as possible. IOW, this wasn't so much of a "rewrite from scratch" (which is almost always a bad idea), as "do a translation first; then do refactoring to be more idiomatic / performant later". Obviously some parts will need to be rewritten, but the lower you can keep the rewrite:translation ratio, the better.

I'm not an expert in Rust, but from what I know it seems like moving to Rust would require much more rewriting than Go.

The other thing, as he says in the post, is that you get diminishing returns: The Go code was 20x faster than the Python code, but the whole 7-hour operation was only 10x faster, because at some point the external SVM calls start to dominate. So even if a rewrite into Rust could gain him another dozen percentage points in speed, it's unlikely that a rewrite into Rust would have much of an impact on his end-to-end performance; and it would almost certainly make the code less accessible.
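
The diminishing-returns point is Amdahl's law. As a back-of-the-envelope sketch, taking the 20x and 10x figures above at face value and assuming everything outside the translated code did not speed up at all:

```go
package main

import "fmt"

// overallSpeedup applies Amdahl's law: p is the fraction of the original
// runtime spent in the part that got faster, s is that part's speedup.
func overallSpeedup(p, s float64) float64 {
	return 1 / ((1 - p) + p/s)
}

// fractionNeeded inverts the formula: the p required for a local speedup s
// to produce the given overall speedup.
func fractionNeeded(overall, s float64) float64 {
	return (1 - 1/overall) / (1 - 1/s)
}

func main() {
	p := fractionNeeded(10, 20)
	fmt.Printf("fraction in translated code: %.3f\n", p)
	fmt.Printf("20x there -> %.1fx overall\n", overallSpeedup(p, 20))
	fmt.Printf("40x there -> %.1fx overall\n", overallSpeedup(p, 40))
}
```

Under those assumptions about 95% of the original runtime was in the translated code, and doubling its speed again would only move the end-to-end figure from 10x to roughly 13x.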


Someone once told me a tale of a student who had every variable defined as a single element array. When queried, she responded: "What if I need two of them?"

If you consider the "could", there's always a way to make things arbitrarily more complex. Reality is that your target is simply "sufficient".

Also another provided reason was language consolidation, so ease of developer-project-flexibility was probably not the lowest of concerns


Especially with node.js/TypeScript one can reach similar performance to Go with arguably much nicer programming language to work with.


Evidence, please.


Pretty interesting. It is scary to make your "learn a new language" task to port 14,000 lines of code, but with that in mind, this all seems to have gone well. Some random thoughts:

> I had to write my own set-of-int and set-of-string classes

map[int]struct{}, map[string]struct{}

   ints[42] = struct{}{} // insert
   delete(ints, 42) // delete 
   for i := range ints { ... } // iterate
   if _, ok := ints[42]; ok { ... } // exists?

> Catchable exceptions require silly contortions

I am not sure why go has panic/recover, but it's not something to use. panic means "programming error", recover means "well, the code is broken but I'm just a humble generic web framework and I guess maybe the next request won't be broken, so let's keep running". It is absolutely not for things like "timeout waiting for webserver" or "no rows in the database" as other languages use exceptions for. For those, you return an error and either wrap it with fmt.Errorf("waiting for webserver: %w", err) or check it with errors.Is and move on. Yup, you have to remember to do that or your program will run weirdly. It's just how it is. There is not something better that maybe with some experimentation you will figure out. You have to just do the tedious, boring, and simple thing.
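
A minimal sketch of that wrap-and-check pattern. `ErrNoRows`, `lookup` and `userName` are hypothetical names invented here; only `fmt.Errorf` with `%w` and `errors.Is` are real standard library pieces.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoRows is a hypothetical sentinel, standing in for something like a
// database layer's "no rows" error.
var ErrNoRows = errors.New("no rows")

// lookup is a hypothetical query that fails for unknown ids.
func lookup(id int) (string, error) {
	if id != 1 {
		return "", ErrNoRows
	}
	return "alice", nil
}

// userName adds context with %w so callers can still test for ErrNoRows.
func userName(id int) (string, error) {
	name, err := lookup(id)
	if err != nil {
		return "", fmt.Errorf("looking up user %d: %w", id, err)
	}
	return name, nil
}

// isMissing reports whether err wraps ErrNoRows anywhere in its chain.
func isMissing(err error) bool {
	return errors.Is(err, ErrNoRows)
}

func main() {
	_, err := userName(2)
	fmt.Println(err)            // looking up user 2: no rows
	fmt.Println(isMissing(err)) // true
}
```

Using `%v` instead of `%w` would produce the same message but break the `errors.Is` check, since the original error would no longer be in the chain.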

I have used recover exactly once in my career. I wrote a program that accepted mini-programs (more like rules) via a config file that could be reloaded at runtime. We tried to prove them safe, but recover was there so that we could disable the faulty rule and keep running in the event some sort of null pointer snuck in. (I don't think one ever did!)
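
Something along these lines, as a sketch only: the `Rule` type and `runRule` are invented for illustration and are not the actual program.

```go
package main

import "fmt"

// Rule is a hypothetical user-supplied mini-program.
type Rule struct {
	Name     string
	Apply    func(input map[string]int) int
	Disabled bool
}

// runRule calls r.Apply and converts a panic into an error, disabling the
// rule so that it is skipped on later runs.
func runRule(r *Rule, input map[string]int) (result int, err error) {
	if r.Disabled {
		return 0, fmt.Errorf("rule %q is disabled", r.Name)
	}
	defer func() {
		if p := recover(); p != nil {
			r.Disabled = true
			err = fmt.Errorf("rule %q panicked: %v", r.Name, p)
		}
	}()
	return r.Apply(input), nil
}

func main() {
	var missing *int // deliberately nil
	bad := &Rule{Name: "bad", Apply: func(map[string]int) int { return *missing }}

	_, err := runRule(bad, nil)
	fmt.Println(err, bad.Disabled)

	_, err = runRule(bad, nil) // second attempt is refused without running
	fmt.Println(err)
}
```

The named return values are what let the deferred function report the failure to the caller.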

> Pass-by-reference vs. pass-by-value

I feel like the author wants []*Type instead of []Type here.
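
A small sketch of the difference, in case it helps. `Commit` is a made-up type; the point is only that `range` hands you a copy of each element.

```go
package main

import "fmt"

// Commit is a hypothetical record type.
type Commit struct{ Mark int }

// markValues ranges over []Commit. Each c is a copy of the element, so the
// assignment is lost when the loop moves on.
func markValues(cs []Commit) {
	for _, c := range cs {
		c.Mark = 1
	}
}

// markPointers ranges over []*Commit. Each c is a copy of the pointer, which
// still refers to the original struct, so the assignment sticks.
func markPointers(cs []*Commit) {
	for _, c := range cs {
		c.Mark = 1
	}
}

func main() {
	values := []Commit{{}, {}}
	markValues(values)
	fmt.Println(values[0].Mark, values[1].Mark) // 0 0

	pointers := []*Commit{{}, {}}
	markPointers(pointers)
	fmt.Println(pointers[0].Mark, pointers[1].Mark) // 1 1
}
```

With a []Commit you can still mutate in place by indexing (cs[i].Mark = 1), so a slice of pointers is a choice about sharing rather than a requirement.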

> Absence of sum/discriminated-union types

True. Depending on what your goals are, there are many possibilities:

   type IntOrString struct { i int; s string; iok, sok bool }
   func (i IntOrString) String() (string, error) { if i.sok { return i.s, nil } else { return "", errors.New("is not a string") }}
   func NewInt(x int) IntOrString { return IntOrString{i: x, iok: true} }
   ...

This uses more memory than interface{}, but it's also very clear what you intend for this thing to be.

I will also point out that switch can bind the value for you:

   switch x := foo.(type) {
   case int:
      return x + 1
   case string:
      i, _ := strconv.Atoi(x) // error ignored for brevity; check it in real code
      return i + 1
   }

And that you need not name types merely to call methods on them:

   if x, ok := foo.(interface { IntValue() int }); ok { return x.IntValue() }

You can also go crazy like the proto compiler does for implementations of "oneof" and have infinite flexibility. It is not very ergonomic, but it is reliable.
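
In the same spirit as the generated "oneof" code, though much simplified, here is a sketch of a closed union built from an interface with an unexported marker method. All the type and function names are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
)

// IntOrString is a hypothetical closed union: the unexported marker method
// means only types in this package can implement it.
type IntOrString interface{ isIntOrString() }

// IntVal and StringVal are the two variants.
type IntVal int
type StringVal string

func (IntVal) isIntOrString()    {}
func (StringVal) isIntOrString() {}

// increment adds one to either variant, parsing the string form first.
func increment(v IntOrString) (int, error) {
	switch x := v.(type) {
	case IntVal:
		return int(x) + 1, nil
	case StringVal:
		i, err := strconv.Atoi(string(x))
		if err != nil {
			return 0, err
		}
		return i + 1, nil
	default:
		// The compiler does not check exhaustiveness, so keep a fallback.
		return 0, fmt.Errorf("unexpected variant %T", v)
	}
}

func main() {
	fmt.Println(increment(IntVal(41)))
	fmt.Println(increment(StringVal("41")))
	fmt.Println(increment(StringVal("x")))
}
```

You get a fixed set of variants and a type switch over them, but nothing warns you when a new variant is added and a switch somewhere is not updated.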

> Keyword arguments

   type Point struct { X, Y float64 }

   func EuclideanDistance(a, b Point) float64 { ... }

   EuclideanDistance(Point{X: 1, Y: 2}, Point{3, 4})
> No map over slices

This one is like returning errors. You will press a lot of buttons on your keyboard. It is how it is.

I personally hate typing the average "simple" for loop:

   func fooToInterface(foos []foo) []interface{} {
       var result []interface{}
       for _, f := range foos {
           result = append(result, f)
       }
       return result
   }
But it's also not that hard. I used to be a Python readability reviewer at Google. I always had the hardest time reading people's very aggressive list comprehensions. It was like they HAD to get their entire program into one line of code, or people wouldn't think they were smart. The result was that the line became a black box; nobody would read it, it was just assumed to work.

I really like seeing the word "for" twice when you're iterating over two things.


I feel about list comprehensions the way I feel about regular expressions. Below a certain point of complexity, both are vastly superior ways of expressing what's going on. Above that point, comprehensibility drops off fast, and they immediately become inferior tools.

For example, something like:

    [some_func(y) for x, y in some_dict.items() if some_condition(x)]
...is, at least in my mind, eminently more readable than the imperative equivalent, and less PEBCAK-risky (i.e. what if you have to do two similar iterations and forget to use a different accumulator between the two?).

However, I totally agree with you re: "don't get too clever". Pretty much the instant you have nested comprehensions, or try to get clever iterating over multiple data structures in a single expression, it immediately becomes much worse than the imperative form, and you should feel a little bashful and bust out some loops, functions, and accumulators.

Same is true for regex. Something like:

    ($match) =~ qr/\A(?:foo|bar)[.]com (\d+)-baz$/
...while it does require you to know regex, is way simpler and more robust than writing the equivalent stack of many string slicing conditions, and less error prone. However, once the "too clever" rubicon is crossed (subjective, but my personal rule of thumb is: more than 50chrs of non-literals or more than one lookaround expression), like list comprehensions, it rapidly becomes much harder to understand than the equivalent long, explicit, string-slicing form.
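
To put the two side by side in Go (a sketch, with matchRegex and matchManual as invented names): the pattern carries over to Go's regexp package nearly unchanged, with the caveat that Go's $ without the m flag only matches at the very end of the text.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var re = regexp.MustCompile(`\A(?:foo|bar)[.]com (\d+)-baz$`)

// matchRegex returns the captured digits if s matches the pattern.
func matchRegex(s string) (string, bool) {
	m := re.FindStringSubmatch(s)
	if m == nil {
		return "", false
	}
	return m[1], true
}

// matchManual does the same job with string slicing.
func matchManual(s string) (string, bool) {
	var rest string
	switch {
	case strings.HasPrefix(s, "foo.com "):
		rest = strings.TrimPrefix(s, "foo.com ")
	case strings.HasPrefix(s, "bar.com "):
		rest = strings.TrimPrefix(s, "bar.com ")
	default:
		return "", false
	}
	if !strings.HasSuffix(rest, "-baz") {
		return "", false
	}
	digits := strings.TrimSuffix(rest, "-baz")
	if digits == "" {
		return "", false
	}
	for _, r := range digits {
		if r < '0' || r > '9' {
			return "", false
		}
	}
	return digits, true
}

func main() {
	fmt.Println(matchRegex("foo.com 12-baz"))
	fmt.Println(matchManual("foo.com 12-baz"))
}
```

The manual version is several times longer and has more places to get an edge case wrong, which is the point being made above.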

As with many things, I think that skill in these areas is a matter of knowing when to stop.


Much of what passes for wisdom in this field is just people's way of feeling good about themselves because they're convinced they're smarter than others. The list comprehension or whatever it is is just a construct to enable that thought pattern.


> I really like seeing the word "for" twice when you're iterating over two things.

You will if you write a Python list or generator comprehension that iterates over two things, won't you?


> It was like they HAD to get their entire program into one line of code, or people wouldn't think they were smart.

Lol. Call sign of a junior developer...


I liked this summation because the migration happened for a truly valid reason: Python really was a bottleneck. Not that I expected ESR to succumb to hype driven development, but it's nice to see for sure.

On the article itself: I just knew that error handling would have the biggest write-up, even when the one writing was someone like ESR. Gods, the error handling in Go is odious.

Now my obligatory opinion: If only [insert language here, Go in my current job]'s promise of producing more maintainable code was true; the reality is that it's just the same nigh unmaintainable hell I've found in nearly every other project I've worked on. At least Python is nice to read, even (mostly) when awfully written. Oh, how I miss it.


The missing 'keyword arguments' could have been replaced with a struct passed to a function, no? Unless I'm missing something from Python, in Go you could replace this type of function:

     func f(x int, y int, c string) 
with something like this:

     type funcOptions struct {
          x, y int
          c    string
     }
     func f(o funcOptions) {} 
 
     f(funcOptions{x:3, y:-1, c: "hello"})
So the readability hit would have been more minimal.


The problem there is that you then have a struct populating the namespace, which means it’s an entity to track, something to think about, something which populates autocomplete buffers. That’s not necessarily terrible, but it does impose complexity that simple keyword arguments do not.


This is very similar to the pattern the AWS SDK uses.

In reality, it's a small benefit to readability. You gain parameter names, which is very nice, but you end up with huge function call lines.

    describeTableOutput, err := dynamodbSvc.DescribeTableWithContext(ctx, &dynamodb.DescribeTableInput{
      TableName: aws.String("users"), // TableName is a *string in the SDK
    })
With the main sources of pollution here being:

1) The forced inclusion of the package specifier on the type, rather than being able to directly reference the type via an import alias or something. You can alias the package name, but that's not a great general solution.

2) The forced inclusion of the type name at all. What, exactly, would be the type inference challenge if you were allowed to do something like this?

    describeTableOutput, err := dynamodbSvc.DescribeTableWithContext(ctx, &{
      TableName: "users",
    })
I understand that if the parameter in the signature were an interface, this would not work. But it's not; it's a struct. It feels to me like this kind of inference should be allowed when a parameter is a struct, but maybe I'm missing some subtle corner case where it would not work.

3) All the context stuff; both as a function parameter, and the words "WithContext". I hate this. To be clear: Context is awesome. Every function which has the possibility of making network calls should accept a context. And that's the problem; modern Go libraries litter it everywhere, adding "WithContext" mirrors of existing functions to maintain backcompat. Context really should be "contextual"; omnipresent, overrideable, and only getting in the way when it's needed. 95% of functions which interface with a context accept the context and pass the context; they're middlemen. They shouldn't have to even care about the context. Something like:

    func (d DynamoDB) DescribeTable(input *DescribeTableInput) (*DescribeTableOutput, error) {
      ctx := getContext()
    }

    func main() {
      setContext(context.Background())
      describeTableOutput, err := dynamodbSvc.DescribeTable(&{
        TableName: "users",
      })
    }
In other words, available via magic global functions. I'm sure there's some reason why this wouldn't work, or would have unintentional negative consequences, but I illustrate it only for the point that there are ways to accomplish this goal that are (probably) cleaner than making it a function parameter.


Yeah, this is a design pattern known as the Parameter Object (from Fowler's Refactoring book; see https://wiki.c2.com/?ParameterObject). I am of two minds on it though, since it does add boilerplate to your codebase, especially if you start adding things like the builder pattern, or if you want to make arguments mandatory (e.g. via asserts).

I use IntelliJ, which can offer inline parameter name hints. I think that's a good middle ground, but it doesn't make things more readable outside of that editor.


Yes, though you also have to consider default values as a possible source of error.


True, but you could provide default values as well if you use something like this: https://medium.com/@meeusdylan/go-reduce-function-parameters...

You'd need extra logic to know a value has not been set at all though. At which point the complexity might not outweigh simply... not using them.
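
One way that extra logic can look, as a sketch that is not taken from the linked article: use a pointer field so that nil means "not set" and an explicit zero stays distinguishable. defaultY and intPtr are made up for the example.

```go
package main

import "fmt"

// funcOptions uses a pointer for y so that "not set" (nil) can be told
// apart from an explicit zero.
type funcOptions struct {
	x int
	y *int
	c string
}

const defaultY = -1 // hypothetical default

func intPtr(v int) *int { return &v }

// f fills in defaults for anything the caller left unset.
func f(o funcOptions) string {
	y := defaultY
	if o.y != nil {
		y = *o.y
	}
	c := o.c
	if c == "" {
		c = "hello" // an empty string cannot be passed explicitly here
	}
	return fmt.Sprintf("x=%d y=%d c=%s", o.x, y, c)
}

func main() {
	fmt.Println(f(funcOptions{x: 3}))               // y falls back to the default
	fmt.Println(f(funcOptions{x: 3, y: intPtr(0)})) // explicit zero is kept
}
```

The string field shows the cheaper approach and its cost: treating the zero value as "unset" means callers can never ask for the zero value.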


One of the better reads in a long time.

The translation assistant you wrote is actually very interesting. It is heavily rule-based, and it's surprising to see that it actually helps at all.

But the scale of the project itself still seems pretty limited, so reimplementation could still be an option.

Overall, a good read and an interesting approach.


If you want a real surprise, try a rewrite in Elixir.


Hmm down votes... Too bad b/c I'm serious. I ported a corpus generator some years ago from Ruby to both Go and Elixir. To my surprise it was easier to port to Go, but the Elixir version ran much faster.


Or many other languages.



