
Go Module Mirror and Checksum Database Launched - ingve
https://blog.golang.org/module-mirror-launch
======
programd
Vendor. Everything. Always.

I appreciate the fact that Google is investing in infrastructure to make the
Go ecosystem robust, I really do.

Having said that I encourage everyone to do a simple experiment.

Check out your code into some random directory. Copy the directory to a USB
drive and walk it over to an air gapped machine (no wi-fi, no ethernet, clean
dev environment installed). Copy the directory to the box, make some small
code change and try to build your binary.

If your build fails you're doing it wrong.

In real life one of the following will happen[1]:

\- Your network connection to the Internet will be down

\- Your network connection to the corporate network will be down

\- The services you depend on (Github, Google proxy, Docker hub) will be down

\- The services you depend on will throttle you

\- The services you depend on will serve you wrong or corrupt data

\- The services you depend on will be offline for maintenance

\- The certificates on the service you depend on were not renewed in time

\- The account you use on services you depend on will be inexplicably
suspended

Murphy's law being what it is, some or all of the above will happen when your
SaaS is on fire, your customers are screaming, and you're trying to roll out a
fix. Meanwhile your company is losing approximately your personal lifetime
earnings in revenue every half an hour.

Vendor. Everything. Always.

(and have backups, lots of backups)

[1] Based on highly educational career experience

~~~
ndarwincorn
I don't see any of those as reasons to commit dependencies to the same VCS
repo your source control lives in.

Your first point is particularly laughable. If your network connection to the
internet is down and you're a SaaS provider, you're probably going to need to
fix that before you can roll out a fix to your hosted software.

That said, they're all good points to consider when building out a deployment
pipeline, and in general mirroring your dependencies is a great idea. But
committing a vendor directory is a poor solution/mitigation for most of those
risks.

~~~
programd
> Your first point is particularly laughable. If your network connection to
> the internet is down and you're a SaaS provider, you're probably going to
> need to fix that before you can roll out a fix to your hosted software.

Don't assume that your SaaS runs on the same network as your build machines.
In most cases it doesn't - or at least it shouldn't. Your SaaS
might be broken in some way, you're trying to create a build to fix it on your
secure corp network and your connectivity to the Net disappears. And it's 3 AM
in San Francisco and everyone who can fix it is asleep, and meanwhile your
European customers are not happy.

Let's be clear. You can mitigate the effect of all these problems one way or
another. But the mitigations are easier if you have fewer moving parts and
fewer dependencies on systems you don't control.

Let's face it, ultimately it all comes down to limiting your business risk at
least cost (because money). One simple way to do that is if your build
environment travels with your code down the years.

~~~
ndarwincorn
> Don't assume that your SaaS runs on the same network as your build machines.

That wasn't the assumption. It's laughable because wherever your hosting and
build infrastructure live, you'll need to restore your internet connection
before you can deploy whatever software fix you're trying to ship ('deploy'
in the sense that your customers have access to the deployed fix).

------
cube2222
For anybody wondering if vgo is production ready, we've been using it since it
came out in Go 1.11 and so far haven't had any problems.

Existing workflows don't break because there's the vendor command and
dependency management is instantaneous. You can also roll it out gradually as
it doesn't have any problems depending on non-module repositories (using
commit hashes and dates)

There's a great blog post series too if you're interested in the internals of
go modules: [https://research.swtch.com/vgo](https://research.swtch.com/vgo)
(it's also one of the good counterexamples to people claiming go ignored x
years of PL research and history)

~~~
eknkc
I found the editor support to be still somewhat hit and miss though. The VS
Code Go extension still does not work as well on module-enabled projects, for
example.
Otherwise we did not face any issues either.

~~~
calcifer
That's a great reason to try out Goland, which has had excellent module
support since day 1.

~~~
weberc2
If you can afford that $100-200/year subscription.

~~~
sethammons
I bought it and stopped using it. It does work with modules, but I don't like
the IDE itself and went back to vs code even though it breaks on me regularly.
The UI/UX is just too foreign for me in GoLand. I had to search Google to find
out how to search the project. Everything I wanted to do was non-discoverable
for me.

~~~
konart
CMD+SHIFT+F just like any other editor (vscode, sublime etc) out there?

Also you have the action menu on CMD+SHIFT+A that will give you any command
the IDE can offer.

It's as discoverable as vscode if not more (you can even use IDE Features
Trainer plugin if you want to)

~~~
sethammons
Pretty sure cmd+shift+f (maybe it was just cmd+f) would search the open file.

~~~
scns
With Shift: search whole Project. Without: search in open editor

------
philips
I like the simple plain text signing format for the API calls.

For example:
[https://sum.golang.org/lookup/go.etcd.io/etcd@v0.4.0](https://sum.golang.org/lookup/go.etcd.io/etcd@v0.4.0)

Is signed with:

    
    
      — sum.golang.org Az3grobHchAJWrV4M34o1kLnZV4vrGSfFA+2Q9VClbmWqBjsnN4GzK1xB1RaYGSo0jIjWH9GDcR3Tja5sadw2ESoKwg=
    

The format is documented here:
[https://godoc.org/golang.org/x/mod/sumdb/note](https://godoc.org/golang.org/x/mod/sumdb/note)
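
To make the signature line above a little more concrete, here is a minimal sketch that splits it into its parts. It assumes the layout described in the note package documentation: the base64 payload is a 4-byte big-endian key hash followed by the raw signature bytes. `parseSigLine` and `sigLine` are hypothetical names, and the sketch does not verify anything; real verification needs the signer's public key and is what the `note` package itself is for.

```go
package main

import (
	"encoding/base64"
	"encoding/binary"
	"errors"
	"fmt"
	"strings"
)

// sigLine holds the pieces of one note signature line (hypothetical type).
type sigLine struct {
	Name    string // signer name, e.g. "sum.golang.org"
	KeyHash uint32 // first 4 decoded bytes, identifying the signing key
	Sig     []byte // remaining bytes: the raw signature
}

// parseSigLine splits a line of the form "<U+2014> <name> <base64>".
// It only parses; it does NOT check the signature.
func parseSigLine(line string) (sigLine, error) {
	const prefix = "\u2014 " // the dash character that starts signature lines
	if !strings.HasPrefix(line, prefix) {
		return sigLine{}, errors.New("not a signature line")
	}
	fields := strings.Fields(strings.TrimPrefix(line, prefix))
	if len(fields) != 2 {
		return sigLine{}, errors.New("want a name and a base64 signature")
	}
	raw, err := base64.StdEncoding.DecodeString(fields[1])
	if err != nil {
		return sigLine{}, err
	}
	if len(raw) < 5 {
		return sigLine{}, errors.New("signature too short")
	}
	return sigLine{
		Name:    fields[0],
		KeyHash: binary.BigEndian.Uint32(raw[:4]),
		Sig:     raw[4:],
	}, nil
}

func main() {
	// The signature line quoted in the comment above.
	line := "\u2014 sum.golang.org Az3grobHchAJWrV4M34o1kLnZV4vrGSfFA+2Q9VClbmWqBjsnN4GzK1xB1RaYGSo0jIjWH9GDcR3Tja5sadw2ESoKwg="
	s, err := parseSigLine(line)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%s key=%08x sig=%d bytes\n", s.Name, s.KeyHash, len(s.Sig))
}
```

The key hash is what lets a client pick the right public key when a note carries signatures from several signers.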

Also, as an aside, I am building a transparency tool for arbitrary binary
downloads with the rget project:
[https://github.com/merklecounty/rget](https://github.com/merklecounty/rget)

~~~
rsc
I really like rget, btw.

I posted a trivial little command-line demo of a client and server for some
others who were thinking about a transparency tool. It implements the
cacheable GET API we designed for the checksum database but obviously it
applies to arbitrary key-value pairs. It might be a better foundation than
stuffing things into CT logs (or not, you do get some interesting
infrastructure that way).

[https://rsc.io/tlogdb](https://rsc.io/tlogdb)

~~~
philips
Thanks!

I am almost certain I am going to need to move off of the CT log thing but I
hope to onboard a few projects first[1].

But, tlogdb looks like a great starting point (sent you a PR if you have a
moment). I am trying to understand how all of the parts fit together at the
moment, particularly whether I need to use Trillian or whether a direct SQL
store to Spanner/CockroachDB, like the one tlogdb implements, is sufficient
for a while.

The big advantage of implementing something like tlogdb is that it becomes a
URL map which means that developers don't need to publish additional metadata
(at the expense of the service needing to download entire files to calculate
the digests).

All of that to say thanks for all of the work on sumdb. Great starting point.

[1]
[https://github.com/merklecounty/rget/issues/10](https://github.com/merklecounty/rget/issues/10)

------
donatj
Is there a justifiable need for the proxy? It seems like we’re just adding a
centralized point of failure, the lack of which is something I have
absolutely touted as a feature.

For verification we already have the go.sum file validating the hash. We’ve
got vendoring if you’re concerned about dependencies disappearing.

The touted speed feature seems like a non-issue? The absolute largest Go
package I’ve ever pulled took maybe a minute to grab its dependencies, and
then I had them and never had to again.

The cynical side of me feels like this is a way for Google to collect
analytics on the Go ecosystem. The slightly less cynical part of me thinks
people from other ecosystems demanded it due to cargo culting the idea.

The fact that it makes it more difficult to use private modules is pretty
irritating as well as I am going to need to make sure all our developers and
build systems are aware when we move to 1.13. Having to set GOPRIVATE on _each
machine_ is absurd. It should be definable in the go.mod so it can be known
via git.

~~~
nindalf
> The cynical side of me feels like this is a way for Google to collect
> analytics on the Go ecosystem

> other ecosystems demanded it due to cargo culting the idea

And there it is, the hallmark of a typical comment on a thread like this

* Cynical

* Expresses concern about data being collected even though this data can't possibly be used for anything bad. (If you disagree, go to crates.io and tell me what concerns you have).

* Expresses disapproval of a large corporation even though they haven't done anything wrong here.

* Pithy dismissal of other programmers by calling them names ("cargo cult"). This change improves speed of downloads by 6-7x on poor connections and 3x on good connections. Clearly this is helpful for situations like CI which typically involve clean builds. What's the issue here exactly?

* Furious attack on a free product that's completely optional to use. If you want to keep vendoring your dependencies, go ahead and do that. Why are you angry that some people are going to use this proxy?

~~~
donatj
"Furious attack". For serious? I gave a light hearted criticism of adding a
single lynchpin to the entire Go ecosystem. I think that's a legitimate source
of concern.

I was by no means pithy, I was by no means being dismissive. Thinking you need
something just because you had it elsewhere is the definition of cargo culting
- and is something the Go developers have otherwise largely avoided.

Just because something is _faster_ doesn't make it better or worth it,
particularly when it comes with tradeoffs and is already really fast - which
is my argument - that the tradeoffs aren't worth the benefit.

~~~
nindalf
This is not a single point of failure. It's by no means a legitimate concern.
Go read the proposal. If you don't want to use proxy.golang.org, use
goproxy.io. Or host your private one. Fairly sure you'll be able to use github
soon as well. Do whatever.

I notice you couldn't back up your FUD about analytics being collected.

Go programmers are happy that they have a solution where their dependencies
(and specific versions of those dependencies) will be available forever
without them having to take the trouble of vendoring. Your dismissive
response? Just vendor. They're happy that clean builds are faster, saving CI
time. Your response? It doesn't need to be faster.

~~~
donatj
You're doing the same thing you're criticising, being dismissive of my
concern.

> If you don't want to use proxy.golang.org, use goproxy.io

Yes, I can do that, but as I mentioned in my original post having to get an
entire team of people and fleet of CI systems to use a non-standard
configuration which is not communicable in the project repo is a minor PITA.

If and when proxy.golang.org has a bad day, builds will fail. Go developers
will have a bad day. Having the ability to work around it doesn't make it not
a lynchpin.

As for the "FUD about analytics being collected" it was an offhanded quip. I
thought it read pretty clearly as _not entirely serious_ hence the prefix "The
cynical side of me feels" and not "I feel". It doesn't merit the effort of
defending.

------
joppy
"Below is an example of such a tree"... and there is a picture of a binary
tree, with no indication as to how the tree is used.

I appreciate trying to explain the interesting technology behind
authenticating modules, but this diagram appears to explain nothing at all.

~~~
Verath
There was a talk at GopherCon that I believe goes into more detail about how
this works and the cryptography behind it (I only skimmed it so far)
[https://youtu.be/KqTySYYhPUE](https://youtu.be/KqTySYYhPUE)
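
For readers stuck on the same diagram: a rough sketch of how such a binary tree is typically used is to hash every record into a leaf and then hash pairs of children upward until a single root remains, so that one root hash commits to every record. The code below is an assumption-laden illustration using Certificate Transparency (RFC 6962) style hashing, not the checksum database's actual code; `leafHash`, `nodeHash` and `treeHash` are hypothetical names and the records are made up.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

type hash = [sha256.Size]byte

// leafHash hashes one record, prefixed with 0x00 to mark it as a leaf.
func leafHash(data []byte) hash {
	return sha256.Sum256(append([]byte{0x00}, data...))
}

// nodeHash hashes two child hashes, prefixed with 0x01 to mark an interior node.
func nodeHash(left, right hash) hash {
	buf := append([]byte{0x01}, left[:]...)
	buf = append(buf, right[:]...)
	return sha256.Sum256(buf)
}

// treeHash returns the root hash over records. The left subtree takes the
// largest power of two strictly smaller than len(records).
func treeHash(records [][]byte) hash {
	switch len(records) {
	case 0:
		return sha256.Sum256(nil)
	case 1:
		return leafHash(records[0])
	}
	k := 1
	for k*2 < len(records) {
		k *= 2
	}
	return nodeHash(treeHash(records[:k]), treeHash(records[k:]))
}

func main() {
	// Made-up records standing in for "module version hash" log entries.
	records := [][]byte{
		[]byte("example.com/a v1.0.0"),
		[]byte("example.com/b v0.2.1"),
		[]byte("example.com/c v2.3.0"),
	}
	fmt.Printf("root:          %x\n", treeHash(records))

	// Changing any single record changes the root.
	records[1] = []byte("example.com/b v0.2.1 (tampered)")
	fmt.Printf("tampered root: %x\n", treeHash(records))
}
```

Because the root changes whenever any record changes, a signed root is enough for clients to detect a log that has been rewritten.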

------
w8rbt
This is probably a dumb question, but here goes... I'm testing go modules with
an internal project that uses a private github.com repo. It depends on other
public Go packages, and modules are nice for managing those, but the general
public will probably never see or want to use this code.

Are modules in private repos impacted by this in any way? Would they show up
in the index somehow?

~~~
FiloSottile
If the private one is the main module, you don't have to do anything, it
doesn't get sent anywhere because it's just what's on your disk. Indeed, you
can call it "module foo", which is not a go-gettable name.

If you _depend on_ private modules, you need to follow these instructions:
[https://tip.golang.org/cmd/go/#hdr-Module_configuration_for_...](https://tip.golang.org/cmd/go/#hdr-Module_configuration_for_non_public_modules)

In any case, private modules never show up in the index, mirror, or checksum
database.
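
As a rough illustration of what that configuration does, the sketch below shows one way a comma-separated list of glob patterns (the kind of value GOPRIVATE takes) can be matched against the leading elements of a module path. This is an approximation written for this thread, not the go command's own code; `matchesPrivate` is a hypothetical name and the hostnames are invented.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchesPrivate reports whether a leading prefix of modulePath matches any
// of the comma-separated glob patterns in globs.
func matchesPrivate(globs, modulePath string) bool {
	elems := strings.Split(modulePath, "/")
	for _, glob := range strings.Split(globs, ",") {
		if glob == "" {
			continue
		}
		// A glob with n path elements is compared against the first n
		// elements of the module path.
		n := strings.Count(glob, "/") + 1
		if len(elems) < n {
			continue
		}
		prefix := strings.Join(elems[:n], "/")
		if ok, _ := path.Match(glob, prefix); ok {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical setting, e.g. GOPRIVATE=*.corp.example.com,github.com/acme
	globs := "*.corp.example.com,github.com/acme"
	for _, m := range []string{
		"git.corp.example.com/team/lib",
		"github.com/acme/secret",
		"github.com/acmecorp/tool",
		"golang.org/x/mod",
	} {
		fmt.Printf("%-32s private=%v\n", m, matchesPrivate(globs, m))
	}
}
```

Modules that match are the ones the go command should fetch directly, without consulting the public mirror or checksum database.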

~~~
etc_passwd
...but references to them are stored by google in logs and used per the
privacy policy at
[https://proxy.golang.org/privacy](https://proxy.golang.org/privacy)

------
tyri_kai_psomi
Also, I would like to point out that since JFrog's module proxy, GoCenter, was
publicly hosted first, it will more than likely have been the first to host a
given module version. That means you are now open to the situation where v1.0
of a module on GoCenter and v1.0 of the same module on Google's mirror contain
different code.

I am not sure how the Go team plans to handle this, but the only obvious
solution would be for Google's proxy to reach out to GoCenter first to see if
they have a copy of a specific version, download and cache, and serve that.
Does anyone know if that happens?

~~~
justinclift
Why would two versions - specifically tied to a given commit - of say "v1.2.3"
contain different code?

~~~
tyri_kai_psomi
Well, you mentioned it in your comment: tags are not specific commits. Go
Modules standardize around SemVer, and fall back to hashes only when they
need to. The reality is anyone can git push -f and change what a tag points
to currently.

So the scenario goes like this:

1\. Someone requests a module from GoCenter that is currently not in its
caches.

2\. GoCenter downloads the requested version v1.2.3, and stores that.

3\. Someone bad/careless/ignorant/evil (depending on circumstances, I guess)
does a git push -f.

4\. The module is requested from Google's module proxy, which now loads a
"v1.2.3" into its local caches, but it contains different code than the
"v1.2.3" on GoCenter.
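
The scenario above ends with two different file trees that both carry the label v1.2.3. A content hash is what makes that visible. The sketch below approximates the "h1:" style hash recorded in go.sum as I understand it: SHA-256 over a sorted listing of per-file SHA-256 sums. Treat the exact format as an assumption; `hash1`, the module path and the file contents are all made up for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"sort"
)

// hash1 hashes a set of files (name -> content) into an "h1:" style string.
// It builds one line per file, "<sha256 hex>  <name>\n", in sorted name
// order, and hashes that listing.
func hash1(files map[string][]byte) string {
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names)

	h := sha256.New()
	for _, name := range names {
		fmt.Fprintf(h, "%x  %s\n", sha256.Sum256(files[name]), name)
	}
	return "h1:" + base64.StdEncoding.EncodeToString(h.Sum(nil))
}

func main() {
	// What one proxy cached before the force push (hypothetical content).
	before := map[string][]byte{
		"example.com/lib@v1.2.3/go.mod": []byte("module example.com/lib\n"),
		"example.com/lib@v1.2.3/lib.go": []byte("package lib\n\nfunc Answer() int { return 42 }\n"),
	}
	// What another proxy cached afterwards, under the same version.
	after := map[string][]byte{
		"example.com/lib@v1.2.3/go.mod": []byte("module example.com/lib\n"),
		"example.com/lib@v1.2.3/lib.go": []byte("package lib\n\nfunc Answer() int { return 41 }\n"),
	}
	fmt.Println("before:", hash1(before))
	fmt.Println("after: ", hash1(after))
	fmt.Println("same content:", hash1(before) == hash1(after))
}
```

A client that has the first hash recorded, in go.sum or via the checksum database, would see a mismatch if it were later served the second tree.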

Outside of git push -f, there are other hacky techniques you can use to
"update" the code a tag points to.

~~~
justinclift
Hmmm, maybe my understanding of tags is incorrect.

When I tag something in git, that _is_ for a specific commit isn't it?

------
tjpnz
[https://proxy.golang.org/privacy](https://proxy.golang.org/privacy)

A Google privacy policy that's actually reasonable.

------
bndw
Does anyone know if the Go Proxy server is using the original download
protocol[0] defined by vgo?

    
    
        GET baseURL/module/@v/list fetches a list of all known versions, one per line.
        GET baseURL/module/@v/version.info fetches JSON-formatted metadata about that version.
        GET baseURL/module/@v/version.mod fetches the go.mod file for that version.
        GET baseURL/module/@v/version.zip fetches the zip file for that version.
    
    

[0] [https://research.swtch.com/vgo-module](https://research.swtch.com/vgo-module)

~~~
secure
Yes, see also
[https://youtu.be/WDbbIS7m9bU?t=298](https://youtu.be/WDbbIS7m9bU?t=298)
(sorry, slide deck not yet available it seems)
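
For anyone who wants to poke at those four endpoints, here is a small sketch that only builds the request URLs; it makes no network calls. One detail that is not in the list above and is included here as an assumption from the go command's documentation: upper-case letters in the module path and version are written as `!` plus the lower-case letter. `proxyURLs` and `escapePath` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"strings"
)

// escapePath replaces every upper-case ASCII letter with '!' followed by its
// lower-case form, so paths survive case-insensitive file systems.
func escapePath(s string) string {
	var b strings.Builder
	for _, r := range s {
		if 'A' <= r && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r + 'a' - 'A')
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// proxyURLs returns the four download-protocol URLs for one module version.
func proxyURLs(baseURL, module, version string) map[string]string {
	prefix := strings.TrimSuffix(baseURL, "/") + "/" + escapePath(module) + "/@v/"
	v := escapePath(version)
	return map[string]string{
		"list": prefix + "list",      // all known versions, one per line
		"info": prefix + v + ".info", // JSON metadata about the version
		"mod":  prefix + v + ".mod",  // the go.mod file for the version
		"zip":  prefix + v + ".zip",  // the zip file for the version
	}
}

func main() {
	urls := proxyURLs("https://proxy.golang.org", "github.com/BurntSushi/toml", "v0.3.1")
	for _, kind := range []string{"list", "info", "mod", "zip"} {
		fmt.Printf("%-4s GET %s\n", kind, urls[kind])
	}
}
```

Since every response is addressed by a plain URL, any static file server or caching HTTP proxy can in principle sit in front of it.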

------
eduren
This looks great.

I'm interested in the possibility of running a private mirror. Does anybody
know if they've released the source for proxy.golang.org? I can't seem to find
it in the golang github org.

~~~
philips
[https://github.com/goproxy/goproxy](https://github.com/goproxy/goproxy) as
well

