Ah sorry - we're eventually going to implement karma sharing to take care of cases like this. Often I search to see if someone else has already submitted the 'better' URL but I forgot to do that here.
There's a lot of randomness in which submissions get noticed and achieve liftoff from /newest. At least it evens out over time if you continue to post good submissions!
Always nice to see a vendor respond quickly and adequately. Even better when they go above and beyond, as seems to be indicated here by the timeline posted and the assessment of what fixes were put in place by the researcher(s). Good job Tailscale.
Very well written. Also quite worrying given they're supposed to be a security company and these kinds of issues are well known.
Then again it does seem like the entire universe applies "eh probably nobody will try and hack it" to services listening on local TCP interfaces.
They certainly don't care about multi-user machines, though I suppose there are so many local root exploits these days you're basically trusting your users anyway in that situation.
Linux is used by dozens of "security companies" and still has vulnerabilities from time to time. So does Apple in the products where they care about security, etc.
More important are three things to realize:
1. their handling of the incident, which wasn't just fast but arguably absurdly fast, to the point that I'm pretty sure multiple employees dropped everything the moment they read the mail to focus solely on fixing and analyzing it
2. it's Windows-only (at least its main problem is), stemming from a workaround for a feature "missing" in Windows, from a company which is relatively young and started out in the Linux/UNIX space.
3. it is exploitable due to a fundamental design flaw of browsers
What I'm trying to say is: it isn't that surprising (or worrying) that such a thing happened; what matters is how they handle it and make sure that it will not happen again.
It's not missing. Windows has named pipes. They just don't use the same API as Unix sockets so you would have to do some work to do it properly (or use an existing library that abstracts both interfaces).
In any case Windows has had support for Unix sockets since 2017 so there's really no excuse.
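For what it's worth, the Unix-socket path is small even in a scripting language. A minimal sketch (POSIX path shown; on Windows 10 1803+ the same AF_UNIX calls work with a filesystem path, whereas named pipes need a different API):

```python
import os
import socket
import tempfile

# Minimal AF_UNIX round trip. The socket is a filesystem object,
# so ordinary chmod/chown permissions apply to who may connect.
path = os.path.join(tempfile.mkdtemp(), "demo.sock")

srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
srv.bind(path)
srv.listen(1)

cli = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
cli.connect(path)
conn, _ = srv.accept()

cli.sendall(b"ping")
data = conn.recv(4)

cli.close()
conn.close()
srv.close()
```

Unlike a loopback TCP port, there is nothing here for a browser to reach: no IP, no port, and filesystem permissions gate access.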
> Linux is used by dozens of "security companies" and still has vulnerabilities from time to time. So does Apple in the products where they care about security, etc.
Right but neither of them are companies whose sole product is a security product.
I agree their response time is good - they probably realised how bad it looks!
Best exploit write up I’ve read. All the details, interspersed with nonchalant humor.
“None of these words are in the Bible.
This is certainly true of the King James Version, but many of these words can be found in what might as well be the Old Testament of Same-Origin Policy: the WhatWG Fetch Standard, which defines the CORS rules we are being accused of violating.”
The quality of her writing is wonderful, clear, to the point, funny and of course all the technical details are well explained. For a non native English reader like me (I am French), a real pleasure to read. You can congratulate her!
> In theory, there is no path for a malicious Tailscale control plane to remotely execute code on your machine, unless you happen to run network services that are designed to allow it, like an SSH server with Tailscale-backed authentication.
Now I feel less crazy for not using Tailscale SSH for similar reasons.
I'd like to see a security evaluation of Tailscale, on a per feature basis.
I'd like to see tailscaled run with far fewer privileges.
Is there a Tailscale alternative that just does Wireguard + NAT traversal and doesn't try to do key management?
Yep. Same boat. Absolutely zero interest in granting them ssh authZ; transport wrapping is all I want to outsource. Just deliver my bits and I pay you, tyvm. My suspicions have been proven correct here.
Unfortunately reading about this remote RCE vector has me wondering whether I can use the product at all without all this bloat (taildrop, ssh, etc) affecting me. Going to have my team look at zerotier this week, I’ve heard a few ok things.
Saw that when it came out, yikes, but here it makes my point for me.
The zerotier software failed - as such you could (in the simplest terms) bypass the transport “firewall”. At no point could you execute code on my machines. At no point could you spoof any authorization layers outside of what’s required to reach my ports. So when the model catastrophically failed here, attackers still cannot login to my machine. Other attacks might make this possible (e.g. code exec in the agent), but were not found - I suspect due to the lack of attack surface.
All software can have serious bugs, which is why you do defense in depth. Never depend on just one thing for your entire security perimeter.
Outside of narrow, very well-defined cases where proofs of security are possible, it might be impossible to create perfectly secure computing systems due to the insolubility of the halting problem and the sheer size of the combinatorial space.
If you watch the CVE announcements it's a continuous stream of serious bugs in all kinds of major software applications including OSes, web browsers, networking hardware, VPNs, cryptographic libraries, and so on. Microsoft, Apple, Cisco, etc. have serious vulnerabilities fairly often.
Wireguard never had, and probably will not have, a serious vulnerability (one allowing bypassing a tunnel). The attack surface is small, and you can carefully review the code, even formally verify it. The devices could all tunnel out to a nearby VM in the cloud.
This vulnerability is very critical, and was discovered by an undergrad (not a security team): code execution on the local machine, taking over tailscaled, hijacking the coordination server, adding nodes, SSHing into machines, SMB shares, etc. The users are owned if attacked, and this was supposed to be a security-focused product.
Part of the problem is the feature bloat, that Wireguard deliberately avoided. Like, I want a mesh VPN, not an alternative to OpenSSH or Dropbox as well. The integrations add code, and it’s hard to secure a larger code base.
The response from Tailscale has been excellent though. Hopefully they will take measures to prevent such issues. This is a VPN after all!
> Wireguard never had, and probably will not have, a serious vulnerability (one allowing bypassing a tunnel).
True, but even the bare minimum WireGuard VPN still has a lot of stuff other than WireGuard. There's going to be a configuration protocol, software to create a tunnel device on the system, a management protocol, software updates, a UI, identity management or some kind of login/auth system, etc.
No one is asking for “perfect security”. We all agree with you, that is impossible. What many security professionals want from this product is a stable network transport tunnel with well-defined attack surface. We understand defense in depth, which is why we disable ssh authZ in tailscale, for example.
Now imagine you are an enterprise user of tailscale, you diligently elected not to trust it with login to your boxes, but you still got pwned because of “taildrop”, a feature no one on your team uses, wants, or knew was enabled.
Software vulnerabilities happen at a rate that highly correlates with size of attack surface. The attack surface here is pretty clearly too high (bad “defense in depth” as you say), and I hope they provide mechanisms in the future for disabling all this bloat, otherwise offerings like zerotier will eat their lunch.
> I'd like to see tailscaled run with far fewer privileges.
Yeah - I have a dislike for services running as root when it's not necessary, and then getting users to escalate to root to interact with them routinely.
One thing I was thinking about was trying to identify the Linux capabilities which let tailscaled run, and then look at if it's feasible to adjust the default systemd unit to run it as a non root user. Closely followed by then trying to harden up the service with as many of the recommendations as possible in "systemd-analyze security".
Despite there being a pretty good range of restrictions available, it seems to be pretty rare that service definitions actually come locked down... Might be something for the tailscale team to look at in future?
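As a sketch of what that drop-in could look like (the directive names are real systemd options, but the exact capability set tailscaled needs is an assumption on my part; it at least needs CAP_NET_ADMIN for the TUN device and routing):

```ini
# /etc/systemd/system/tailscaled.service.d/harden.conf (sketch, untested)
[Service]
User=tailscale
Group=tailscale
AmbientCapabilities=CAP_NET_ADMIN CAP_NET_RAW
CapabilityBoundingSet=CAP_NET_ADMIN CAP_NET_RAW
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true
StateDirectory=tailscale
DeviceAllow=/dev/net/tun rw
RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6 AF_NETLINK
```

"systemd-analyze security tailscaled" would then tell you which other restrictions are still worth adding.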
Software shipped by the distro maintainers I find is often properly locked down with systemd features, but third party stuff is always hit and miss. Definitely agree Tailscale should be shipping with the bare minimum privileges required.
I use a cheap public VPS and Wireguard and it works over my ISP connection at home. My servers run here at home but are only publicly visible at my VPS public ISP address. Is that the same as what you're asking for with 'NAT traversal'?
If so, the config is straightforward for techies, just Wireguard config, it routes into my home server and I use Apache Reverse Proxy to route to the backend services.
Yeah, the whole "run a local HTTP server as a control panel" thing is iffy for non-security-centric stuff, let alone VPN software.
I guess it is because it's easy? But now even Windows can make Unix sockets; that seems like a reasonably easy and secure solution for "talk with some daemon portably".
> Is there a Tailscale alternative that just does Wireguard + NAT traversal and doesn't try to do key management?
I really wish there was a NAT traversal protocol or library that wasn't overly complex and just focused on the 90% case. It would help not just Tailscale but anyone building p2p tech.
You usually still have to punch with IPv6 as there is usually a stateful firewall in the way. You just get 100% success vs the 80-90% you get with V4 (and getting worse as CGN gets more common).
This is correct. It's very annoying for any p2p like application, because the punching is a coordinated and time sensitive dance that just circumvents particular firewall bs. The firewall approach comes from this heavily flawed idea of the client initiated model of communication, extrapolated to client=consumers and server=service providers. It's just awful that the majority of the nodes on the internet aren't even reachable by default.
Anyway, it would be much better to leave the socket APIs to handle this, possibly with OS safeguards and privileges. Writing p2p applications is analogous to being constantly protected "for your own good" by a guardian. /rant
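In the simplest UDP case the "dance" is: both sides learn each other's public endpoint from a rendezvous server, then transmit simultaneously so each stateful firewall sees an outbound packet before the inbound one arrives. A loopback toy of that pattern (no real NAT involved, and the endpoints are read locally instead of learned from a rendezvous server):

```python
import socket

# Both "peers" bind a UDP port; in a real punch these would be the
# NAT-mapped public endpoints exchanged via a rendezvous server.
a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
a.bind(("127.0.0.1", 0))
b.bind(("127.0.0.1", 0))

# Simultaneous sends: each side's outbound packet creates the
# firewall/NAT state that lets the peer's packet back in.
a.sendto(b"punch", b.getsockname())
b.sendto(b"punch", a.getsockname())

a.settimeout(2)
b.settimeout(2)
msg_at_a, _ = a.recvfrom(16)
msg_at_b, _ = b.recvfrom(16)
```

The timing-sensitive part is exactly this simultaneity, which is why it is painful to bolt on per-application instead of having the OS or a standard library handle it.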
It's used like that because for a long time that approach worked. Users are terrible at securing their own machines and will click yes on anything just to get the thing they want, so putting up a stateful firewall allowing only outgoing connections was a very effective measure.
Much less relevant now that even Windows comes with a half-decent, reasonable default firewall out of the box. Then again, "user clicking the allow button till it works" is still a problem.
Do they have enough logs to reach out to people that were affected? As far as vulnerabilities go, this set is one of the worst ones I've seen this decade, and they seem rather straightforward.
Would be nice to get a blog post from them that goes a bit into impact, not just a report that tells you to update. It's nice that they responded quickly, but I feel like this shouldn't have happened in the first place for a network security company and it makes the Windows client feel like a bit of an afterthought. Looks like they have a PR open to switch it to named pipes, I hope that is properly reviewed by someone that knows Windows APIs before it's merged.
I received this email as well, I probably should have clarified to say that it would be interesting to know if any of this was ever actively exploited. I assume this hasn't happened, considering the sentence in their report, but this is a client vulnerability, so logs may not have reached their servers (I know nothing about their telemetry setup or what is actually logged, which is why I mentioned that a blog post about their part of the procedure might have been nice).
1) They probably don't get notified.
2) They also have an incentive to say it isn't being exploited publicly, even if they are searching for cases internally.
In any case the response feels solid. Most companies will just force an update for a vulnerability without disclosing what the problem was, and then blame the public for not having the software updated. That gives me confidence in the company. The silent approach is also problematic because it drains the time of hosts having to update their systems.
My guess is the client sends some kind of "goodbye" message when it gets reconfigured to another coordination server, and that message has enough information to determine if it originated from this attack.
Related: note also that Tailscale's tailnet 100.* subnet is some form of CGNAT public IP block. I think Tailscale thought long and hard about this, and landed on it because it was a path of lesser resistance to break fewer things. And if you squint, the addresses fit the stated purpose.
But even if browsers now implement PNA the tailnet itself is public address space, so that vector still exists. I wonder if browsers (and eventually standards) will be pressured to treat those blocks as private.
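You can see the odd status of that block with Python's ipaddress module: 100.64.0.0/10 (RFC 6598 shared address space) is considered neither private nor global, which is roughly why browsers' private-network heuristics don't cover it:

```python
import ipaddress

# 100.100.100.100 is the address Tailscale uses for its MagicDNS resolver;
# any 100.64.0.0/10 address behaves the same way here.
tailnet_ip = ipaddress.ip_address("100.100.100.100")
lan_ip = ipaddress.ip_address("192.168.1.10")

# RFC 6598 shared address space: not private, but not global either.
flags = (tailnet_ip.is_private, tailnet_ip.is_global)
lan_flags = (lan_ip.is_private, lan_ip.is_global)
```

So a PNA-style browser check keyed on "is this a private range?" passes tailnet addresses straight through unless the standard explicitly adds 100.64.0.0/10 to the protected set.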
The real takeaway here is that you should never treat any network boundary as critical for security. This is true whether it's a physical boundary or a virtual one (with TS being one example of the latter).
If your private net is full of trivial to access things with no access control or horribly insecure services, that's a huge problem. There are many many many ways to hop over firewalls. Hostile JS on web sites is just one.
Network boundaries are only first lines of defense in what should be a defense in depth strategy. Never depend on any one single boundary completely.
My personal criteria is: if it's not secure enough to be connected directly to the Internet with no firewall, it's broken. Make it that secure and then put it on a secure network.
We're crazy to allow any program to talk on the network by default. Then when we run js we allow the browser, a user executed program, to decide what level of network control it will exercise. This laissez-faire attitude to controlling communication paths, or even awareness, makes lateral movement so much easier.
The lack of integrated authentication services is one reason why so many things are completely open. It's too hard to set up and manage user credentials, and in any case, programs shouldn't have access to user credentials; they should get delegated permission. AD has made everything too hard. We need a TOFU-like dynamic machine identity exchange, which then allows individual users to execute programs with particular network capabilities.
Geez, yeah, I was going to say that it's clearly bonkers that DNS rebinding can trick the browser into communicating with loopback addresses, until I got to the part of the article that explained how there actually is a mitigation for that. Hopefully Firefox fixes that soon, because I shudder to think how many applications are vulnerable.
> If you run non-HTTPS web services on your Tailnet, and those services are unauthenticated or rely on Tailscale for authentication, implement an allowlist of expected HTTP Host headers to prevent malicious Javascript from accessing these services.
In my opinion, this should be done not only for non-HTTPS services, but for all services: the "default" virtual host (used where there is no Host header, or when it has an unexpected value) should have nothing except a static 4xx error page. This not only avoids DNS rebinding attacks, but also avoids automated attacks in which the attacker doesn't know the correct hostname for the service (mostly automated scans for vulnerable PHP scripts and similar).
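A sketch of that check as it might look in a tiny WSGI middleware (the allowlisted hostnames are made-up placeholders):

```python
# Placeholder hostnames - substitute your service's real names.
ALLOWED_HOSTS = {"internal.example.com", "internal.example.ts.net"}


def host_allowlist(app):
    """Wrap a WSGI app; answer 421 for any unexpected Host header.

    This defeats DNS rebinding because the browser keeps sending the
    attacker's hostname in Host even after the record is rebound, and
    also shrugs off IP-based scans that never learn the right name.
    """
    def guarded(environ, start_response):
        host = environ.get("HTTP_HOST", "").split(":")[0].lower()
        if host not in ALLOWED_HOSTS:
            start_response("421 Misdirected Request",
                           [("Content-Type", "text/plain")])
            return [b"unknown host\n"]
        return app(environ, start_response)
    return guarded


# Demo: a trivial inner app, one blocked and one allowed request.
statuses = []
def _start(status, headers):
    statuses.append(status)
def _app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]

guarded = host_allowlist(_app)
blocked = guarded({"HTTP_HOST": "evil.attacker.example"}, _start)
allowed = guarded({"HTTP_HOST": "internal.example.com"}, _start)
```

The same idea expressed in nginx is a catch-all `default_server` block that returns an error, with named `server_name` blocks for the real services.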
In addition, it's basically required if you're using a reverse proxy service, e.g. Cloudflare or Akamai. CF websites have been found via Shodan or Censys because site info is left open via https://[ip]:443.
I don't see a writeup of how this was fixed. Merely checking the Host header is insufficient -- the vulnerability would still be wide open to anyone who can open TCP sockets to localhost.
Windows has APIs (named pipes, DCOM (eww) and such) that allow authenticated local access to services. Unixes have unix sockets.
Generally speaking, allowing privileged operations because a specific user asked over a TCP socket is asking for trouble: there are quite a few ways that unwitting processes could open a socket on behalf of an attacker without realizing that it is asserting its identity and thus granting privilege.
All the major clouds get this IMO entirely wrong with their services that issue secrets to instances (e.g. AWS IMDS).
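This is the concrete difference: over a Unix socket the daemon can ask the kernel who connected, rather than trusting whatever the client claims. A Linux sketch using SO_PEERCRED (Windows named pipes expose the equivalent via GetNamedPipeClientProcessId):

```python
import os
import socket
import struct
import tempfile

path = os.path.join(tempfile.mkdtemp(), "ctl.sock")
srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
srv.bind(path)
srv.listen(1)

cli = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
cli.connect(path)
conn, _ = srv.accept()

# Ask the kernel for the peer's pid/uid/gid - unforgeable identity,
# unlike anything a client asserts over a loopback TCP connection.
pid, uid, gid = struct.unpack("3i", conn.getsockopt(
    socket.SOL_SOCKET, socket.SO_PEERCRED, struct.calcsize("3i")))
```

A control daemon can then decide per-request whether that uid is allowed the privileged operation, instead of treating "can open a TCP connection to localhost" as an identity.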
With TCP being connection-oriented I think it's not too hard to get right, especially if the OS won't reuse a closed socket right away. Definitely worth considering though. Of course it's doable without netstat if you can track down the right APIs: https://stackoverflow.com/questions/47659365/find-process-ow...
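On Linux the lookup that Stack Overflow answer describes boils down to two steps: find the socket inode for the local port in /proc/net/tcp, then find which process holds a file descriptor for that inode. A rough, Linux-only sketch (no error handling beyond permissions, and it can only inspect processes you own):

```python
import glob
import os
import socket


def pid_for_local_port(port):
    """Return the pid of the process listening on TCP `port`, or None.

    Linux-only: /proc/net/tcp lists each socket's hex local address
    and inode; /proc/<pid>/fd/* symlinks name that inode. Only
    processes owned by the caller (or all, as root) are inspectable.
    """
    inode = None
    with open("/proc/net/tcp") as f:
        next(f)  # skip the header row
        for line in f:
            fields = line.split()
            port_hex = fields[1].split(":")[1]
            if int(port_hex, 16) == port and fields[3] == "0A":  # LISTEN
                inode = fields[9]
                break
    if inode is None:
        return None
    target = "socket:[%s]" % inode
    for fd in glob.glob("/proc/[0-9]*/fd/*"):
        try:
            if os.readlink(fd) == target:
                return int(fd.split("/")[2])
        except OSError:
            continue  # process exited, or fd not readable by us
    return None


# Demo: who owns this freshly bound listening port? We do.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
owner = pid_for_local_port(srv.getsockname()[1])
```

The caveat from the comment above stands: there's an unavoidable race between the connection arriving and the lookup, so this identifies the peer at best-effort, unlike SO_PEERCRED on a Unix socket.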
The concept of DNS rebinding and DNS records pointing to a private/localhost IP address is particularly interesting and I remember when I first came across it in the wild. It's not exactly re-binding in the classic attack sense described in the article: some US sportsbooks make you download a geolocation service that verifies your location in order to place bets. The sportsbook's front end communicates with it through a DNS record pointing back to 127.0.0.1, and opens up a WebSocket to talk to the service. I imagine the WebSocket is used to bypass the same-origin policy but perhaps someone more knowledgeable can speak to that.
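For the classic attack, the defensive check on the fetching side is simple in principle: resolve first, refuse non-public addresses, and pin the result. A minimal sketch of the first half (a real client must also connect to the returned IP directly rather than re-resolve the name, or the second lookup can still be rebound):

```python
import ipaddress
import socket


def resolve_public_only(hostname):
    """Resolve `hostname`, raising if any answer is non-public.

    The kind of guard a URL-fetching service can apply against DNS
    rebinding / SSRF. Sketch only: the caller must then connect to
    one of the returned addresses, not to the hostname again.
    """
    addrs = {info[4][0] for info in socket.getaddrinfo(hostname, None)}
    for addr in addrs:
        ip = ipaddress.ip_address(addr)
        if ip.is_loopback or ip.is_private or ip.is_link_local:
            raise ValueError(f"{hostname} resolves to non-public {ip}")
    return addrs


# Demo: localhost must be rejected.
try:
    resolve_public_only("localhost")
    rejected = False
except ValueError:
    rejected = True
```

Note this is the server/fetcher-side mitigation; the 127.0.0.1 sportsbook trick above is a legitimate use of the same DNS mechanics, which is part of why browsers can't simply ban such records outright.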
The client app is not indicating that 1.32.3 for Windows is available yet but the download link on the site has been updated.
Tailscale client downloads are extremely slow at the moment, so I suggest you distribute one copy manually around your tailnet rather than bogging down their servers even more.
The Windows client caches the current version for a while, so may not yet have v1.32.3 available on your device.
In that case, you can still pull the latest release from http://pkgs.tailscale.com/stable.
Tailscale admin here, politely requesting client update push capability. Being able to see endpoint version is helpful, I will be suspending unpatched endpoints in the near future.
Or get Tailscale into the Windows Store so it could auto-update for all the endpoints out in the wild I don't control (company laptops, users' home PCs, etc). Trying to get ten people to manually update today was a pain; I can't imagine even triple that.
I've read the description several times and find it hard to follow:
..an attacker-controlled website visited by the node..rebinds DNS for the peer API to an attacker-controlled DNS server making peer API requests in the client,
including accessing the node’s Tailscale environment variables
I enjoyed the explanation very much. Even though I don't use mesh VPNs (yet), the architectural discussion of the vulnerability entailed numerous useful bits of background on browser and network infrastructure. Commendable work!
i might be mistaken but i think there was something else weird about the windows loopback interface. i can't remember what it was, but something like binding the loopback interface would bind on all interfaces by default maybe?
> The speed and quality of Tailscale's response to our report is unlike any vendor interaction I have experienced, and suggests a deep commitment to keeping their customers safe.
I have mixed feelings here as a Tailscale customer.
Yes a quick response is great, but this actual security issue is pretty terrible IMHO.
Anything other than an immediate response would have been akin to lighting their company on fire and walking away.
> Anything other than an immediate response would have been akin to lighting their company on fire and walking away.
Have we forgotten Zoom, who reinstalled itself secretly on user machines with an RCE-vulnerable server, which they described as “working as intended?” They’re still wildly popular today with organizations despite the insane lack of regard for security and their users’ safety.
Mistakes happen. I applaud Tailscale for moving so quickly and doing the right thing.
As the reporting party of the much less severe (and much less interesting) TS-2022-003[1] I can confirm that vendor interaction is also swift when not responding would not light the company on fire. In fact there were several other vendors that were affected that have yet to mitigate it.
> Anything other than an immediate response would have been akin to lighting their company on fire and walking away.
Companies get away with sweeping customer security issues under the rug less and less, but still far too often. I honestly wish we as a people would hold other players of this game to as high a standard as you do here for this company.
> We can ask Tailscale to open a path on an SMB share. Windows being Windows, it will send your username (and a hash of your login password) to this server, unprompted, despite having no reason to consider the server trustworthy.
Wow, I used to think Linux security was miles ahead of Windows security more than 20 years ago because of insanity like this. Fast forward 20 years. NTLMv2 is common, so cracking a password actually requires guessing the entire password instead of just 8 characters. But password guesses are much cheaper, so we haven’t gained much.
Microsoft, how long will it take you to fix this for real? Opening a URL or UNC path should not, without an opt-in, authenticate at all. If configured to authenticate, it should prove, zero-knowledge, to the server that the supplied password (e.g. the logged-in user password) matches the server’s expected password. No further information should be leaked.
Server certs are a different issue. If OpenSSH, by default, sent SHA256(logged in user’s password) to the server, even after verifying the cert, it would get laughed out of the toolbox of security-conscious users.
I'm not sure I've ever seen a detailed technical writeup of a vulnerability before that started with such clear and concise instructions on the exact steps needed to defend against it. In particular, making clear the priority of what to patch is excellent. If I'm a user of a product where a bug was found, I'm definitely interested in learning about what the bug was, how it was discovered, and whether I should be worried about other bugs in the future, but the absolute first thing I want is to do whatever I can to make sure I'm not affected by it.
Listing what to patch and/or change in code might be more "boring" than the narrative of how the bug was found, and it might spoil the surprise, but I think sometimes we focus so much on the fun of the process of finding the bugs or revel in the cleverness of an attack (and those things are fun!) that we forget that the real point is to make our stuff safer. There's plenty of time for fun, but make sure you patch things first!
Also interesting was this recommendation: "Keep using Tailscale!
The speed and quality of Tailscale's response to our report is unlike any vendor interaction I have experienced, and suggests a deep commitment to keeping their customers safe."
Every product has security issues now and then. The real challenge is building robust processes to remediate them (and to ensure that class of issue doesn't recur). The teams that deliver, by being transparent and fixing their stuff in a timely way, get my business.
I call this the "Vulnerability view" instead of a "Remediation view", and it's something I feel a lot of security people and tooling get wrong when sharing information with those outside our bubble.
It is dead easy to export a vulnerability scan or penetration test report and throw it at the developers, but you will get much better outcomes and better rapport if you tell them what they need to do (i.e. patch to version x.x.x) versus telling them what is wrong ("the sky is falling!").
Releasing a patch and a detailed write-up on the same day seems like a bit of an unfortunate choice, especially for a WTF!! vulnerability like this. In software that doesn't auto-update, no less...
Looking at the timeline, it looks like Tailscale opted to allow for public release on the day of the patch:
> Sat 19 November: Coordinated Disclosure time proposed by Tailscale, accepted by us, Tailscale shares planned Security Bulletins and blog post
> Tue 22 November, 5:06AM: Blog draft shared with Tailscale (a bit last minute, sorry!!!)
> Tue 22 November, 7:00AM: Coordinated disclosure time
Because the code is open source anyway, I'm guessing they assume attackers would see the announcement of a vulnerability, browse the recent pull requests and figure it all out themselves anyway. Delaying publishing of the details saves maybe a few days of exposure to risk for motivated attackers, especially as the author seems to have done her work together with one other person in just over a week.
They've also sent out emails it seems, so people know they should update ASAP and why. With the extremely limited number of people running Tailscale (and the even smaller subgroup running it on Windows specifically) I don't think it's an attack hackers will rush to roll out. Mitigations also exist (e.g. block access from the browser to 100.100.100.100), so even in situations where you cannot update you can protect yourself.
Especially as the fixes seemingly have been going into their public GitHub branch for days, since the report. I wonder if that was a conscious choice or negligence; maybe I'm missing something? I would expect these to be released as patches/merged in when the vulnerability is published, like a lot of other security-critical open source software does.
Malicious actors will monitor patches and reverse-engineer them anyway, so probably better to make some noise in this case and make sure people update as fast as possible.
> If you visit my website, I am granted the honour and the privilege of executing arbitrary Javascript on your computer.
>
> This is a pretty bad idea
This is why I disable javascript by default, but I suspect that on this page it's needed to fix the theme or something, because the text is light grey on a white background, and all monospace sections are completely illegible.
Edit: I don't mean to hate on the author, the content of the article is really interesting!
> I suspect that on this page it's needed to fix the theme or something, because the text is light grey on a white background, and all monospace sections are completely illegible.
You seem to be correct. I found a single <script> tag in the source, with the following code:
(() => {
let v = localStorage.getItem("color-scheme"),
a = window.matchMedia("(prefers-color-scheme: dark)").matches,
cl = document.documentElement.classList,
setColorScheme = v => (!v || v === "auto" ? a : v === "dark") ? cl.add("dark") : cl.remove("dark");
setColorScheme(v);
window.setColorScheme = v => {
setColorScheme(v);
localStorage.setItem("color-scheme", v)
};
})();
Though I don't see the point of this, since the "light" theme, as you've pointed out, is completely illegible.
i would still rather see less, considering where tailscale software sits in my privacy/security. at some point i'd ask why do i pay to use this swiss cheese? (not saying that's the case, but if i were to continue to see more issues)
Yep agreed. I think/hope an incident such as this will create a step change in their security processes. As you said, it only takes a few of these incidents to make it very difficult for the business to recover, especially given the tech savvy customer base.
ps. she's looking for an employer rn // hire her!