
Into the Borg – SSRF inside Google production network - red0point
https://opnsec.com/2018/07/into-the-borg-ssrf-inside-google-production-network/
======
menage
> it’s not because of the design of Borg interfaces

As the original creator of the Borglet status page, I think it's not really
accurate to describe it as "designed". :-)

It was more a case of gradual semi-random evolution, with people (mostly me,
at least in the old days - I've no idea how much it might have changed in the
last eight years) adding things that seemed like they could help Google
engineers trying to track down issues with their Borg jobs. User-friendliness
for random SSRF hackers was definitely not high on the priority list.

~~~
kevincox
As a user of the borglet pages they aren't pretty but they are quite
functional and to-the-point. I can appreciate the no-frills design.

------
tptacek
This is a great find. SSRF is a really unappreciated vulnerability; it is
usually game-over. That it came from a Caja audit adds some tasty irony.

A friendly word of advice: when you find flaws like this (you, the reader, not
you, the guy who wrote this post), think carefully before disclosing internal
network details you discover like this writer did. The internal details of a
target network don't become public domain simply because you found a
vulnerability. There are firms that get _extremely_ itchy about this kind of
stuff getting published, and I can't blame them.

~~~
outworlder
I'm still amazed that this individual published as much data as he did. I
don't see where it says they have permission to do so.

Especially because:

> I hope they won’t beat me with a stick for disclosing any of this

This tells me that this wasn't cleared with them. Doesn't sound like a smart
move. If your gut is telling you it may be a bad idea, it could be because it
is...

~~~
londons_explore
The page he accessed probably contained _thousands_ of Google borg jobs. Many
about secret or future projects for example. Many giving implementation
details for secret sauce algorithms (eg. oh - they precompute all possible
misspellings for their spell corrector via this mapreduce!). Simply knowing
the fleet wide CPU, network and RAM usage for Gmail would give a competitor a
lot of knowledge into the probable running costs of the service for example.

In this case he exposed a tiny proportion of what he accessed.

Sounds fairly reasonable to me.

~~~
asfasgasg
Uh, how does that follow? In what other domain do you get a pass for doing a
little unnecessary harm because you could have done much more?

------
puzzle
> Google is still relying on Borg for its internal production infrastructure,
> but I can tell you it’s not because of the design of Borg interfaces!

No matter how spartan, the Borg status pages are more helpful than most
Kubernetes UIs out there when it comes to debugging a problem in depth, i.e.
past CPU and memory graphs. Part of that is made possible by applications
exposing debugging endpoints and telling Borg about them.

~~~
GauntletWizard
Yeah, I would kill for the k8s pod contract to have something like the borg
status line.

~~~
jbeda
It's something we've talked about on and off. It never seems to happen.

------
hkr_mag
For those who are new to the world of SSRF vulnerabilities, check the SSRF
Bible (full disclaimer: I'm with Wallarm):
[https://docs.google.com/document/d/1v1TkWZtrhzRLy0bYXBcdLUed...](https://docs.google.com/document/d/1v1TkWZtrhzRLy0bYXBcdLUedXGb9njTNIJXa3u9akHM/edit)

------
justicezyx
Very impressive findings.

Disclaimer: I am part of Borg team.

~~~
londons_explore
Time to make everything accessible via stubby only, none of this http
nonsense...

~~~
asfasgasg
Integrity and privacy for all streams.

------
how2cflags
Web Archive link to the article above, as I was unable to load the page due to
the system load on it.
[https://web.archive.org/web/20180720170255/https://opnsec.co...](https://web.archive.org/web/20180720170255/https://opnsec.com/2018/07/into-the-borg-ssrf-inside-google-production-network/)

------
heipei
borglet status page in full resolution:
[https://opnsec.com/wp-content/uploads/2018/07/borg2.png](https://opnsec.com/wp-content/uploads/2018/07/borg2.png)

------
luhn
Not quite relevant to the article: Since it seems to be tricky to properly
sanitize URLs for SSRF, I had an idea for safely calling user-defined URLs:
Set up an unprivileged non-VPC Lambda function that calls a URL and call all
user-defined URLs through the Lambda function. I think it should be
bulletproof, anything I'm overlooking?

~~~
bpicolo
GCP does it by requiring a header on requests to metadata systems.

Require a header on your internal services and make sure you never send that
header with user-requested URLs and you can guarantee safety there.
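
A minimal sketch of that idea, for illustration only. GCP's metadata server checks for `Metadata-Flavor: Google`; the header name `X-Internal-Request`, its value, and both function names below are hypothetical stand-ins, not anything from a real service.

```python
# Hypothetical header that every internal service insists on.
REQUIRED_HEADER = "X-Internal-Request"
REQUIRED_VALUE = "1"


def is_allowed(headers: dict[str, str]) -> bool:
    """Internal-service side: accept a request only if it carries the header."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    normalized = {name.lower(): value for name, value in headers.items()}
    return normalized.get(REQUIRED_HEADER.lower()) == REQUIRED_VALUE


def headers_for_user_url(requested: dict[str, str]) -> dict[str, str]:
    """Fetcher side: drop the internal header from any user-driven request."""
    return {
        name: value
        for name, value in requested.items()
        if name.lower() != REQUIRED_HEADER.lower()
    }


if __name__ == "__main__":
    attacker_headers = {"x-internal-request": "1", "Accept": "*/*"}
    print(is_allowed(headers_for_user_url(attacker_headers)))
```

This only holds if the attacker has no way to control headers on the forged request, which depends on the specific SSRF.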

~~~
kerng
Still allows tons of attacks and recon, like port scans, or even crashes of
processes- since internal things might not go through the same fuzzing
scrutiny as external endpoints.

------
paulddraper
tl;dr SSRF is server-side request forgery, where you can gain access to
private resources by convincing privileged servers to make requests for you.

If you are using network access for security, either don't, or blacklist
private IPs (and use public DNS) for untrusted URLs.
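
A rough sketch of the blacklist approach, using only the Python standard library. The function names are made up for illustration, and it resolves through the system resolver rather than a dedicated public DNS server, which is an assumption you would likely change in practice.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_public_address(ip: str) -> bool:
    """Return True only for addresses outside private/internal ranges."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False  # unparseable (e.g. scoped IPv6): treat as unsafe
    # Unwrap IPv4-mapped IPv6 such as ::ffff:127.0.0.1 before checking.
    if addr.version == 6 and addr.ipv4_mapped is not None:
        addr = addr.ipv4_mapped
    return not (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_multicast
        or addr.is_reserved
        or addr.is_unspecified
    )


def url_resolves_to_public_only(url: str) -> bool:
    """Return True if every address the URL's host resolves to is public."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        port = parts.port or (443 if parts.scheme == "https" else 80)
        infos = socket.getaddrinfo(parts.hostname, port, type=socket.SOCK_STREAM)
    except (ValueError, socket.gaierror):
        return False
    return bool(infos) and all(is_public_address(info[4][0]) for info in infos)
```

Checking and then fetching by hostname leaves a gap: the name can resolve differently the second time (DNS rebinding), and redirects need the same check on every hop. Connecting to the exact address that was validated closes the first hole.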

------
dilyevsky
Ha! I spent so much time squinting at that borglet page. Such nostalgia

------
elvinyung
> I should mention that Borg, like Kubernetes, relies on containers like
> Docker

Hmm, containers _like_ Docker, or Docker? I thought Google used lmctfy since
long before Docker?

~~~
powera
cgroups were written at Google, and have been used internally for a very long
time; they provide "container"-like limits on resource usage for a group of
processes.

I assume that Google isn't using Docker internally for production services,
but don't know for sure (and I assume anyone who does know for sure can't tell
you).

~~~
bogomipz
I've heard a few times that Google's containers actually run inside of VMs.
I'm curious if anyone knows what their VM implementation is or what it's based
on?

~~~
rrdharan
Mostly upstream KVM with a custom replacement for QEMU:

[https://cloudplatform.googleblog.com/2017/01/7-ways-we-harde...](https://cloudplatform.googleblog.com/2017/01/7-ways-we-harden-our-KVM-hypervisor-at-Google-Cloud-security-in-plaintext.html)

~~~
jsolson
Lots of bits of KVM turned off, though. Makes it really interesting when I
work with people on Open Source stuff. I find out all sorts of things KVM can
apparently do that mostly leave me going "you put WHAT in host ring zero?!" :)

(note: as implied, I work on our userland QEMU replacement)

------
dandigangi
I found this really cool to read into. Some of it went over my head, but it
was a great read nonetheless. Thanks for sharing.

~~~
dandigangi
Why was this downvoted? I really don't understand the users of this site
sometimes.

------
pjjw
i can't help but very lol @ kubernetes as the successor to borg :0

