What I learned in 2017 writing Go (commandercoriander.net)
254 points by zeveb 9 months ago | 68 comments

>> ...a careless approach to a main function can result in needless complexity. Compare two versions of the same main function: before and after. One version is over 400 lines. The other is about 40 lines. That’s an order of magnitude. One will be easy to change.

A potentially contrarian view: a long-ish straight linear function without duplicated code is often much easier to understand than a smaller function broken into subroutine callouts (because you don't have to chase down the implementation of the subroutines to figure out what's going on).

While I tend to agree with this, I think a counterargument is: long functions mean more local variables that are in scope. This makes it harder to understand the interaction between different variables at different scopes. A separate function introduces a separate namespace, and all the data flow in and out has to be explicit, through function parameters/returns.

I would love an IDE that unrolls subroutine calls inline as a view mode.

I'm working on this very thing right now as a side project. It unrolls functions, if statements, for/while loops, etc., and uses Jupyter notebooks to do so AND allows you to interact with the result. Kind of like an interactive log.

It's early days but I'm making progress! https://github.com/ebanner/pynt

Peek definition does this, sort of. Definitely exists in VS and VS Code, and probably in the JetBrains things.

That is really interesting. They would have to be made to stand out somehow, but it might be helpful. I can only imagine what some library heavy code would look like.

Yes, agreed about the counter-argument. See my nearby comment here:


I think the important thing is that modularity (breaking up longer functions into shorter ones) helps, but only if not done blindly. One has to spend some time doing it, since there are many possible ways it can be done, and some of them will be better than others. So it can pay off in the long term, but only if one is willing to put in the effort and time, and be ready to throw away some not-so-good attempts. I've done this sometimes and seen that it does help.

That's not always true; the best case for inlining one-use functions is when you don't introduce new local variables.

I've seen 1k-LoC functions that were incredibly easy to understand, and 10-liners that were complete gibberish messes.

I do however agree that declaring new variables every 10 lines over hundreds of lines would be hell to follow.

Good naming helps. If you strive to make it obvious what each subroutine does, you can treat them as small black boxes (until you need to look into a specific function). Obviously, there needs to be some balance, without going crazy in either direction.

Having a lot of single-use functions, even if properly named, still splits the context in that many pieces.

For some cases that results in more confusing code.

I partly agree with acangiano and partly with you. On the whole, I tend to agree more with him, with this proviso: don't just name the functions well, name their parameters well too. That can make the overall program easier to understand, and we can still get the benefit of modularity and black boxes, while not creating more confusion. YMMV, of course.

I agree about modularity and encapsulation being important.

Usually I try to weigh the overhead of having more function prototypes to track, indirections in the main function, and whatnot against the overhead of having it all inlined.

A lot of things become more evident as you inline one-use functions, because the whole context is now local.

Yes, it's a tradeoff.

Interesting, your comment reminded me of inline functions in C++ and C99; not the same thing (or at least goal) as what we were discussing, but closely related:


I agree with you. Sometimes a single long-ish function is preferable to a flock of shorter ones.

Still, I think the GP has a point - clear naming is very important, regardless of one's attitude towards function size. ;-)

I find myself increasingly doing this in Rust:

    let lines = {
        let mut ret = vec![];
        // Open file
        // Read into ret
        ret
    };
    let processed_data = {
        let mut ret = vec![];
        // Open another file
        // Use lines
        // Construct ret
        ret
    };

You get the best of both worlds this way: scoped, meaningful names like with sub-functions and the continuity of a long one.

"Everything is an expression" is an underrated feature of Rust.

In at least Java, and probably more languages, you can also use braces to limit variable scope. E.g.:

  List<String> lines;
  {
    int whatever;
    lines = new ArrayList<>();
    // modify lines
  }

  List<Object> processed_data;
  {
    int whatever;
    processed_data = new ArrayList<>();
    // modify processed_data, using lines
  }

In Go something similar to this is actually more common because of `defer`, e.g.:

    if err := func() error {
      f, err := os.Open(filename)
      if err != nil {
        return err
      }
      defer f.Close()
      // work with f
      return nil
    }(); err != nil {
      return err
    }
I do this a lot in unit tests where a pattern is repeated.

IMO the advantage of small functions is that they act as a form of documentation. In the given example (short main) I can see that it loads some configuration without having to worry about how it does it. With the long main, I'm forced to see the details.

Also, if we're talking about OOP, small methods are better. With small methods it's easier to change behavior by subclassing. In several occasions, I've been forced to make a subclass with a copy of a big method with only one line changed.

>because you don't have to chase down the implementation of the subroutines to figure out what's going on

If the subroutines are sensibly named, you shouldn't need to. I'd argue they're also easier to follow than blocks of code within a long function as well-written small functions will depend only on their inputs, whereas code in a large function could depend on any of the params to that large function.

A viewpoint shared by the go-kit FAQ (https://gokit.io/faq/#dependency-injection-mdash-why-is-func...), which I tend to agree with.

If you want to find a specific place in a large function, you need to read all the code line by line. In the case of smaller functions, you can locate the branch of interest much faster, provided the naming is good enough.

It really depends on the context of the problem being solved. Some problems become more complex when modularized like this.

Splitting functions is basically adding indirections and spreading the context all over the place.

For reusable tasks that works wonders, but for single-use functions that usually obfuscates the intent more than anything.

I don't want to follow fifteen 10-lines one-use functions when I could read one 150-line function (which would most likely be a lot less due to less boilerplate).

I have to disagree with this. If the function is larger, then the cognitive overhead of understanding it (and following the flow) is much higher. If a 1000-line method is divided into 10 methods of 100 lines each, and those 10 methods are named appropriately, then it is much easier to understand the flow. It also gives a high-level picture of what the function does, and if needed, I can always jump into a particular subroutine.

I concur on gometalinter.

It can be a pain sometimes to satisfy it, but most of the time, my code ends up better than it was before.

Well, except for gocyclo. Some functions just are complex - simplifying them or breaking them up into several smaller functions just to keep each function small is not always the best idea or even possible. But that can easily be disabled on a per-function basis.

The `pkg` package is a horrible convention. Having a `cmd` directory makes sense, as each binary needs to be a separate package, but putting all your libraries under a `pkg` directory is unneeded.

I hate the pattern primarily because it overloads the name "pkg" which is already used in your go workspace at least one directory up to store compiled library files.

The word "lib" makes more sense. But, also, its unnecessary.

I've always used `lib` in every other language. I see no reason to change....

Some projects dump everything in the root of the repo. I find that to be annoyingly messy.

Commingling non-Go dirs with Go code makes it hard to navigate, since some dirs will be Go packages, and some won't. You may want to have documentation (/docs, perhaps), test data (/test?), build output dirs (/build or /dist), Docker/Kubernetes/Helm dirs (/docker, /k8s, etc.), Protobuf files (/proto), and so on, usually all in the root of the project.

Secondly, these directories might conflict with Go package names. We have a project that has a package called "documents", which is about persistent documents, which we'd have preferred to call "docs", but the root already had a /docs folder for actual documentation.

Personally, I prefer "src".

Good points.

More fundamentally, it feels a bit screwy for a multi-faceted 'project' to live wholly inside $GOPATH in the first place. But at the same time, it's natural.

Typically, unless I'm writing a shared library, I find myself putting everything under an `internal` folder. It keeps people from importing possibly unstable code until it's exposed.
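A sketch of that layout (names invented): the go tool refuses to compile imports of a package under `internal/` from any code not rooted at `internal/`'s parent directory.

```
myproject/
├── cmd/
│   └── myproject/
│       └── main.go      # may import myproject/internal/server
└── internal/
    └── server/
        └── server.go    # not importable from other repositories
```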

I tend to agree, though I also find having everything be in the root of the repository equally rough for larger projects.

Delve's main doesn't seem all that exemplary. It's irritating to open up main only to find that the princess is in another castle.

For dependency management we use Dep (https://github.com/golang/dep) combined with (https://github.com/getstream/vg). This works pretty well.

I didn't know about vg and it looked promising, but I was disappointed to see that it still requires that I initialize all my projects in a subdirectory of $GOPATH/src. Why does go (and/or its tools) enforce this? Why can't I put my projects in ~/code? Why do I need a "github.com" subdirectory? Feels so silly.

Main developer of vg here. In principle vg should work for packages outside of GOPATH when using full isolation mode (which has its own set of problems, see the README for details). Its dep integration didn't work then, though, last time I tried. As far as I can tell, the main reason Go tooling needs GOPATH is that everything uses absolute imports, and relative ones can be translated to absolute ones by always working from inside the workspace.

I've solved my main issue with GOPATH in a very easy way though. I just create symlinks in my normal directories (like ~/work) to my projects in GOPATH. This allows me to go to my own projects easily while still having everything in GOPATH.
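Concretely, something like this (the project path and ~/work location are hypothetical):

```shell
# Link a GOPATH-resident project into a normal working directory,
# so `cd ~/work/myproject` works while the code stays in GOPATH.
GOPATH="${GOPATH:-$HOME/go}"
mkdir -p "$GOPATH/src/github.com/example/myproject" "$HOME/work"
ln -sfn "$GOPATH/src/github.com/example/myproject" "$HOME/work/myproject"
```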

I use zsh and there is an easy way to find your golang projects using cdpath:

    # setup easy access to cd paths
    setopt auto_cd
    cdpath=($HOME/code/go/src/github.com \
      $HOME/code/go/src/github.com/collinvandyck \
      $HOME/code)

There is a detailed explanation of how package names are used by the compiler here: https://golang.org/cmd/go/#hdr-Description_of_package_lists

Specifically this bit is relevant here:

> Every package in a program must have a unique import path. By convention, this is arranged by starting each path with a unique prefix that belongs to you. For example, paths used internally at Google all begin with 'google', and paths denoting remote repositories begin with the path to the code, such as 'github.com/user/repo'.

Most (all?) Go tooling assumes this convention. Conceptually, I don't see it as that different from how other languages determine import paths, except in Go they are guaranteed to be unique, which I consider a good thing. In Java, for example, you still have file namespaces; they're just inside the project itself (looking at you, `src/java/com/example/package`). Or Python, which looks for the root module name in `PYTHONPATH` and then recurses down for submodule names.

Most if not all languages use file structures to namespace imports in one way or another; Go just applies it globally instead of at a project level. I agree it raises the barrier to entry for someone to just come in and run `go run` on a project, but I find it works very well once you've got it set up. Convention over configuration is a big thing in Go tooling.

I found the article's stance on vendoring to be quite restrained. I can't wait until Dep becomes _the_ solution. That and getting rid of GOPATH (I work across multiple machines, and need to bootstrap new environments regularly).

How do people differentiate between packages being worked on and packages installed?

Let's say I have my-package (version 1.0) installed and in use in a lot of my other projects. I'm working on version 1.1, but need to keep 1.0 as the reference for those other projects.

Right now I just fiddle with $GOPATH but that feels like a hack and is an annoyance if I have to swap around a bit.

Right now my-package might be in a broken state because I'm working on it, so I can't build anything relying on it if I go with the single $GOPATH hierarchy that every discussion seems to assume.

My main method of keeping track of this is creating symlinks to all my actual projects in my regular ~/work directory. Apart from this, most people use vendoring, but I use (and maintain) virtualgo instead: https://github.com/GetStream/vg

It supports working on a dependency of a project locally with ease, via "vg localInstall", which is an extreme pain with vendoring. It also supports version pinning of executables.

Generally we use vendoring, i.e. you have 1.1 in your gopath, but vendor 1.0 in your projects (as long as they're standalone projects).
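With dep, that pin lives in the consuming project's Gopkg.toml, something like the following (the package name is invented for illustration):

```toml
# Gopkg.toml in a consuming project: keep building against 1.0.x
# even while the 1.1 work-in-progress sits in GOPATH.
[[constraint]]
  name = "github.com/example/my-package"
  version = "~1.0.0"
```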

It's also important to limit cross package dependencies as much as possible to avoid these situations.

Yes, vendoring is the best choice here. I'd also recommend everyone give dep[1] a shot.

[1] https://github.com/golang/dep

Having tried to migrate from godep to dep a couple of times now, I always gave up after seeing how many additional unneeded files dep would pull into my /vendor folder.

Just a few days ago using dep instead of godep on a medium-sized web application gave me a delta of 1,920,834 additions and 1,942 deletions (_after_ manually pruning _test files), mostly caused by depending on aws-sdk-go.

Do you have any strategy for keeping your vendor small with dep or do you just accept it the way it is?

There seems to be this: https://github.com/golang/dep/issues/858 , but it looks like it's not implemented yet and the status is highly unclear.

Running "dep prune" after "dep ensure" should remove a lot of unnecessary files (this will be integrated into ensure somewhat soon). Also, in a lot of cases it's not needed to commit the vendor directory, so its size doesn't really matter then. Finally, I really recommend checking out the #vendor channel on the Go slack if you have any more questions regarding dep, everyone there is really helpful.

Do you have production applications running that do exclude their vendor folder? Does this work reliably? How do you handle dependencies that do not use semver?

What would be one of the cases where you'd say you need to commit vendor?

None of our production applications have their vendor directory committed. By running "dep ensure", dep restores the exact same vendor directory from the lock file; we do this in CI, where we build our binaries. For dependencies that don't support semver, dep uses the master branch by default and pins a commit to support reproducible builds. The only case where this has caused problems for us was when one of our dependencies force-pushed to master, so the commit that was pinned didn't exist anymore. This resulted in a failing build though, not an inconsistent one.

Other than working around that issue the only case where you actually need to commit vendor (at the moment) is when you want your project to compile reproducibly by only running "go get project". If you are fine with telling your user to run "dep ensure" (or make) first this is not needed. This is not usually an issue when working with colleagues, but can be nice for publicly released projects.

Thanks for all the insights! I will give this a go soon I guess.

No, I've mostly accepted it at this point since I don't care too much about the size of /vendor (as long as it's not outrageous). I have noticed that it brings in extra files, though, which a `dep prune` typically helps alleviate.

0. They could use something like "kardianos/govendor" to check the location of the used packages (vendor, internal, external, ...).

1. Using environment variables is not a hack in general. Persisting the environment in version control requires some kind of make-like wrapper though.

2. $GOPATH is not limited to one directory.

This is coming from a comparatively spoiled Rails developer, but: why would anyone bother using a language in production if it lacks fundamental things taken for granted like dependency management and stable testing frameworks/practices? Do people just not do enough due diligence and get blinded by a language's honeymoon period?

While Go’s dependency management story isn’t an encouraging one, from a testing perspective I personally have few complaints with Go. Mocking with interfaces is straightforward, and despite the basic testing package, I find it quite easy and straightforward to both write and run Go tests — a testing “framework” is not required.

I do generally use the testify assert package for convenience[0], but it’s not strictly necessary.

[0]: https://github.com/stretchr/testify

Dep (which is very similar to bundler) really solves most of the problems with Go dependency management and is the official tool everyone is converging to now: https://github.com/golang/dep

For the problems it doesn't solve I've created virtualgo at my job, which is similar to rvm or virtualenv: https://github.com/GetStream/vg

I just discovered VG yesterday. I am super excited to really try it out. That's what I loved about Python: Virtualenv, and not having to install packages globally. So I am really looking forward to trying VG out.

Always nice to hear :). Be sure to report any issues you have on GitHub btw, the setup and usage should be as painless as possible.

> why would anyone bother using a language in production if it lacks fundamental things taken for granted like dependency management and stable testing frameworks/practices?

That one's easy. Go is a brilliant low-level language. You hack on servers, tooling, IPC, batch pipelines, etc. Nothing that people endlessly keep pondering, from "deps" to generics to gopath to "practices", really hinders the enthusiastic hacker much from just getting busy with it, and producing. A Go coder doesn't envy the "spoiled Rails developer" because he feels blessed by a low-level systems language that actually has primitive types other than int, sane pointers, concurrency primitives, UTF-8, heck, simply strings to begin with, fast compilation times, no header-includes mess, etc. =)

I've worked with Eno (the OP) on the product that he's talking about (Cloud Foundry). It's worth noting that the first versions of Cloud Foundry were written in Ruby/Sinatra for the reasons that you're talking about.

We switched to Go because Ruby's concurrency model was obtuse/inefficient and scaling the codebase was challenging even with excellent developers and great engineering practices. The performance gains with Go were a really nice bonus.

If you're solving Google-style problems like writing a PAAS, then the benefits of Go outweigh the headaches. If you're writing a more straightforward CRUD-style Web API, then Rails might be a better choice.

To provide historical context: once upon a time, the Ruby and Rails community didn't have Bundler.

But that didn't stop early adopters from building libraries and pushing Rails to where it is right now.

The very same could be said of Rails. I for one absolutely hated working in Ruby and especially Rails because I'm incredibly spoiled from Clojure. Why would anyone bother using a language without live coding that has tons of syntax issues, a braindead ORM, inconsistent conventions, extremely noisy documentation, where most devs care more about pretty superficial things than actual simplicity and composition. Do people just not do enough due diligence and get blinded by a language/framework's honeymoon period?

(I don't mean to bash Ruby, just trying to put your argument in a different perspective.)

You realize that bundler isn't a part of Ruby, correct? You're comparing apples to oranges.

FYI, bundler was planned to be integrated into Ruby 2.5, but will now likely be in the following ruby release.

Does Go have a comparable tool that is as widely adopted as Bundler is?

It looks like `dep` will be this tool for Go.

While dependency management is still an ongoing pain, it is miles ahead of Rails just for the fact that they are compile-time dependencies. The last time I was responsible for deploying a Rails app, the standard deployment method was to pull and build the gems on the production system as part of the deployment. Before Docker solved this, I always thought of dependency management as one of Ruby/Rails's weaker points.

Mocking with interfaces is so straightforward, it just kind of happens without much thought if it is something you are used to from other languages. The first thing I did in my very first Go project was mock out the repository/DB tier so I could learn Go without focusing on learning a new DB at the same time (I was ultimately using Dynamo for something).

Different perspective:

Often these things are done in a half-assed way that isn't well suited to production deployments (e.g. not secure, requires connections to random hosts on the Internet from production boxes).

"Testing" means many things to many people in the context of many different kinds of software. Defining a fixed framework for it can be detrimental to some proportion of that space.

I cannot imagine a deploy to production where you pull dependencies on the production box. You build on a build server or locally and then deploy the compiled/bundled system.

But still, for starting a project where you can pull dependencies for development with a dependency manager and get set up, let's say, 25% faster, I'd choose that.

For testing, I would agree a framework is not a must-have.

Some applications require higher efficiency in terms of resource usage, e.g. RAM, CPU, etc. A Go solution is better in that scenario. A cloud software vendor would use a more efficient language, especially if the application is not directly billable to customers.
