This may be surprising to some people, but it's not a new phenomenon, and I feel it's fairly well known in TLS circles. Everyone who's done scans for TLS vulnerabilities should know this (speaking as one of the authors of the ROBOT attack).
What's often going on is that the www domain is delegated via DNS to a third-party service provider, while the apex domain is hosted on some appliance and just does a simple redirect. The reason for that is that you can't CNAME an apex domain to the provider the way you can a subdomain.
During our ROBOT investigations we found a particularly weird example: old versions of Cisco ACE load balancers had a particularly severe variant of our vulnerability. Cisco told us these devices were out of support, so they weren't going to get fixed. Turns out: Cisco used one of them to redirect cisco.com to www.cisco.com.
Another implication of this result: even if the transport layer security appears to be solid, it may only reflect the CDN setup, while the actual upstream servers are insecure.
Which means the latest-and-greatest TLSv1.3 server you see may ultimately be backed by a vulnerable TLSv1.0 upstream server (or even unencrypted HTTP!).
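For what it's worth, probing for this is straightforward. Here's a rough sketch using Python's ssl module that checks whether a host still completes a handshake when offered only TLS 1.0 (the host names are just placeholders, and depending on your OpenSSL build you may have to lower the security level before TLS 1.0 is even offered):

    import socket
    import ssl

    def accepts_tls10(host, port=443):
        """Return True if the server completes a handshake when we offer only TLS 1.0."""
        ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
        ctx.check_hostname = False       # we only care about the protocol version here
        ctx.verify_mode = ssl.CERT_NONE
        ctx.minimum_version = ssl.TLSVersion.TLSv1
        ctx.maximum_version = ssl.TLSVersion.TLSv1
        # Some OpenSSL builds refuse TLS 1.0 at the default security level; on those
        # you may additionally need: ctx.set_ciphers("DEFAULT@SECLEVEL=0")
        try:
            with socket.create_connection((host, port), timeout=5) as sock:
                with ctx.wrap_socket(sock, server_hostname=host) as tls:
                    return tls.version() == "TLSv1"
        except (ssl.SSLError, OSError):
            return False

    # Probe the www host and the apex separately - they may be entirely different machines.
    for name in ("www.example.com", "example.com"):
        print(name, "accepts TLS 1.0:", accepts_tls10(name))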
This explains why nvidia.com has a terrible TLS config [1] while all their other domains are ok [2] (in particular, the domains hosting the driver downloads are ok [3]).
DNS is hierarchical. If your DNS server manages example.com then you can set a delegation for a subdomain - e.g. www.example.com - to someone else. I.e. you could say "for this subdomain the nameservers from Google's cloud service are responsible". You can go further and e.g. configure the Google cloud service so that abc.www.example.com goes to yet another DNS service.
So a company could say "we manage the DNS for our domain in-house, but for www. we let some other company do it and pay them for it". This is pretty standard.
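To make that concrete, here's a small sketch with the third-party dnspython package (example.com is of course a placeholder): a delegated www label answers with its own NS set, distinct from the parent zone's.

    import dns.resolver  # third-party: pip install dnspython

    # Compare the nameservers responsible for the apex and for the www label.
    # If www.example.com is delegated (or CNAMEd) to a CDN, the answers differ.
    for name in ("example.com", "www.example.com"):
        try:
            answer = dns.resolver.resolve(name, "NS")
            servers = sorted(rr.target.to_text() for rr in answer)
            print(name, "->", ", ".join(servers))
        except dns.resolver.NoAnswer:
            # No NS records of its own: the label lives inside the parent zone
            # (typically just an A/AAAA or CNAME record, no delegation).
            print(name, "-> no delegation of its own")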
> Does this mean that third parties can modify DNS entries?
Not only DNS entries - third parties can wiretap/modify arbitrary content in all web pages if you use a CDN. That's essentially how CDNs work: you have to trust a third party if you delegate your requests to a third party. I'm not saying it's necessarily bad, but there is no way around it. And in the case of Cloudflare, the DNS is already fully controlled by Cloudflare.
CNAME flattening makes things kind of work, but if your CDN is using DNS to provide different IPs to different clients, doing a host lookup at your normal authoritative server isn't going to work as well; ECS can help a bit, and maybe using anycast for apex domains helps, but it's a challenge to do well unless you delegate the whole domain.
> Additionally, 24.71% of these redirections contains [...] HTTP intermediate URLs.
One problem I've found is that if you have a redirect from HTTP to HTTPS, you don't notice an extra redirect from HTTPS to HTTP (e.g. a redirect from the HTTPS plain domain to HTTP www, which then on the next request gets upgraded from HTTP www to HTTPS www).
Combine that with the fact that it is surprisingly difficult to prevent the AWS load balancer from redirecting from HTTPS to HTTP. Literally Apache, Nginx, Tomcat, Jetty all get it wrong out of the box. I had to write this blog post to prevent myself from getting it wrong in the future: https://www.databasesandlife.com/jetty-redirect-keep-https/
The particular application I discovered this on did not accept HTTP traffic, which is why I noticed the extra redirect. Most people don't.
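A quick way to catch this kind of chain is to walk the redirects by hand instead of letting the HTTP client follow them silently. A minimal sketch using the requests library (example.com is a placeholder):

    from urllib.parse import urljoin

    import requests

    def redirect_chain(url, max_hops=10):
        """Follow redirects manually and return every URL visited along the way."""
        chain = [url]
        for _ in range(max_hops):
            resp = requests.get(url, allow_redirects=False, timeout=10)
            if resp.status_code not in (301, 302, 303, 307, 308):
                break
            url = urljoin(url, resp.headers["Location"])
            chain.append(url)
        return chain

    chain = redirect_chain("https://example.com/")
    for hop in chain:
        print(hop)
    if any(hop.startswith("http://") for hop in chain[1:]):
        print("warning: plain-HTTP hop in the middle of the chain")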
> Literally Apache, Nginx, Tomcat, Jetty all get it wrong out of the box.
Absolutely correct - it's surprisingly difficult to get a really good working configuration with no weird fallbacks or side effects while still having working HTTP/2. I have five server blocks per domain name when using nginx (some are reusable though):
* Port 80 with wrong domain - close connection, nothing legit does this
* Port 80 with right (www. and .) domain - redirect to https://
* Port 443 with no SNI - close connection, nothing legit does this
* Port 443 with www. SNI (http2 enabled) - redirect
* Port 443 with . SNI (http2 enabled) - display proper page
Not surprisingly, it tremendously cut down the number of exploit scanners and bad bots in the logs (they seem to rely on everyone not enforcing SNI) and managed to hide the server from Shodan :D (a quick way to test SNI enforcement from outside is sketched below). This is of course not to diss nginx; other servers, e.g. Caddy and Apache, were worse or even impossible to configure to the same point.
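Checking from the outside whether a server really enforces SNI is easy with Python's ssl module: do one handshake with server_hostname set and one without, and compare. A sketch (the hostname is a placeholder):

    import socket
    import ssl

    HOST = "www.example.com"  # placeholder

    def handshake(server_name):
        """Attempt a TLS handshake; report whether the server handed us a certificate."""
        ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
        try:
            with socket.create_connection((HOST, 443), timeout=5) as sock:
                with ctx.wrap_socket(sock, server_hostname=server_name) as tls:
                    der = tls.getpeercert(binary_form=True)
                    return f"handshake ok, got a {len(der)}-byte certificate"
        except (ssl.SSLError, OSError) as exc:
            return f"handshake failed: {exc}"

    print("with SNI:   ", handshake(HOST))
    print("without SNI:", handshake(None))  # a strict setup should refuse or close here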
Nothing about TLS itself - a lot about getting the same setup as I can with nginx. E.g. getting rid of the behaviour where a random TLS certificate from a random block is served when no SNI matches (wtf?) and having the connection closed instead.
Hmm, are you using the latest versions of Caddy? That definitely shouldn't be the case. If you are, please file a bug report! If you aren't, please upgrade. :)
CNAME in this case makes no difference vs an A/AAAA record. The HTTP side of things doesn't care what kind of DNS record it was, but will be affected by what the fqdn is, since it will affect cookie propagation and other security things in the browser.
It's usually better not to have the same content at two URLs. Whichever one you choose, the other should redirect to it; or if you're picky like me, trying to go to the non-published one should send you to the home page on the other --- this reduces the tendency of others to link to the wrong form.
It should catch all TLS connections that have no SNI or that try to connect straight to the IP, and if the snakeoil certificate is generated right it doesn't instantly reveal the real hostname.
Interesting. From my observation, fewer and fewer sites actually use the www subdomain. Some give a plain 404, domain not found, certificate errors and similar when browsing them with (or without) www, depending on how they are set up. It's a mess, really. Usually they behave the same either way, including TLS configuration etc.
Also something I observe daily: When I tell someone to go to "example.com" some will punch in "www.example.com", others will simply do "example.com". There is also the ultra rare occurrence of someone typing in "http://www.example.com".
Personally I just configure www to point at the same place as the regular domain and have the www as a "subject alternative name" in the certificates. And in Nginx it's as easy as adding it to the "server_name". I suspect it's the same for all other major HTTPDs.
These URL mangling behaviours of Chrome are getting really insane. It kind of works for very well known, ultra large sites like Facebook, Youtube, Google and other large corporations, but how are you supposed to know whether you're at the right place for the myriad of smaller and/or personal websites if most of the URL is hidden?
The way you are using "www." and "." is incorrect. The dot isn't part of the local hostname "www". It is a separator or delimiter for the textual elements in the domain name. "Redirecting from (or to) ." is a weird and as far as I know completely wrong way to talk about redirecting from or to the domain name one level up in the hierarchy. The only time a dot plays any role in a domain name (other than that of a simple separator) is when it's used at the end of the fully qualified domain name.
What if I'm talking about redirecting an entire TLD "www" to ""?
Obviously it's a separator; I just found writing "" and "www" too weird. Not to mention that in this context, where you're not really a recursive DNS resolver, it really doesn't matter how one denotes the presence or absence of a subdomain. In addition, some things use "@" (instead of the dot I used).
Either way, "." by itself can only meaningfully refer to the root, not some arbitrary enclosing domain name. There is no way you will convince me that using "." as a shorthand for "parent domain" is a good idea, for that and all the other reasons.
I understand the idea of highlighting the apex domain, due to malicious sites. bankofamerica.com.site.eu/login, and so on. I wonder if they could have done some other kind of highlighting or UI trick to promote the "site.eu" portion of the link, rather than hiding the subdomains.
I wonder how generalizable your experience is, though. I imagine this could very well be a trend in tech circles, where it's widely known that "www" is mostly optional and the domain name is the relevant part. (And people tend to enter a lot of URLs manually, so they shorten them as much as possible.)
I could imagine that this might be different among non-technical users, where the exact meaning of domains, subdomains and the "https" and "www" prefixes are less well-known.
My (also non-data backed) guess would be it's a generational thing.
When I was a kid in the early 2000s, most web addresses were "www.domain.com" - that's what you'd see on written material like advertising or articles. These days if an ad has a web address it usually doesn't include the `www.` subdomain.
I would say the opposite. From my experience technical users are the ones that insist on "www" while non-technical users just follow the general trend of eliding it and are unaware of what it means exactly.
Exactly. The point is that there is a difference between www and nowww. The only reason they usually lead to the same result nowadays is a social convention and those can change arbitrarily. Tech people are aware of this but non-tech people notice and learn this convention by association and might be (and from my experience, are) unaware that it is not a technically guaranteed one.
I use www. for everything because of cookies: when cookies are set on the plain domain they apply to every subdomain automatically, while cookies set on www. are exclusive to www.
I like your idea but it's based on false assumptions. See RFC6265. Depending on the implementation on the backend side (specifically the "Set-Cookie" header) it's possible to handle a cookie generated on "www.example.com" on "example.com" or "sub.example.com" ... and all other possible combinations.
With the correct configuration, www.example.com cookies do not leak over all subdomains, but it's impossible to configure example.com cookies to _not_ leak over all subdomains.
A quick summary from a layman: Most websites that have both a www-domain (www.example.com) and a plain domain (example.com) tend to have better security configurations on the www-domain. It's worth noting that they seem to have scanned only for some known weaknesses, and that there is no law stating that www-domains must be stronger. Also, some websites that implement both www and plain may implement a redirect to a weaker configuration (oops). They analyzed about 1.3 million websites, so it's a decent chunk of the internet.
Of note to me is that they didn't comment on why these might be configured differently. As in, does following a popular tutorial to configure Apache or Nginx neglect the plain domain? You couldn't determine the reason through a web scan, so I don't blame the researchers for not offering one. They also don't offer any way to fix the disparity. Obviously, the best practice is to just configure your server's www and plain domains correctly, but clearly most of the internet isn't following that best-practices checklist.
The crap apex-domain configuration is probably explained by the fact that you can't CNAME an apex (yes, some CDNs/systems have workarounds).
So www gets put behind a nicely configured CDN with decent TLS settings, whereas the apex gets a crappy HTTP redirection service or, maybe worse, the actual origin itself, serving up a half-baked config.
www is more secure to use than the root domain because of how cookies are set. Setting a cookie for the www subdomain ensures that it’s only available for that subdomain and not globally across all subdomains. It makes sense that sites that use www are generally more secure.
> www is more secure to use than the root domain because of how cookies are set.
> Setting a cookie for the www subdomain ensures that it’s only available for that subdomain and not globally across all subdomains.
No, www is not more secure because of cookies. As you said, it depends on how cookies are set, which is up to the host, as this is defined in the HTTP headers coming from the server. It's possible to access a cookie from "www.example.com" on "example.com" or "sub.example.com" if it's configured like that on the backend. Also the browser needs to comply. (Protip: most browsers do)
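For illustration, this is roughly what the two Set-Cookie variants look like when built with Python's http.cookies module (names and values are made up): omitting the Domain attribute keeps the cookie host-only, while setting Domain=example.com makes compliant browsers send it to every subdomain.

    from http.cookies import SimpleCookie

    # Host-only cookie: no Domain attribute, so a compliant browser only sends it
    # back to the exact host that set it (e.g. www.example.com).
    host_only = SimpleCookie()
    host_only["session"] = "abc123"
    host_only["session"]["secure"] = True
    host_only["session"]["httponly"] = True
    print(host_only.output())
    # -> Set-Cookie: session=abc123; HttpOnly; Secure

    # Domain cookie: explicitly scoped to example.com, so it is sent to
    # example.com *and* every subdomain (www, sub, ...).
    shared = SimpleCookie()
    shared["session"] = "abc123"
    shared["session"]["domain"] = "example.com"
    shared["session"]["secure"] = True
    print(shared.output())
    # -> Set-Cookie: session=abc123; Domain=example.com; Secure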
Try document.cookie = 'a=b' in a browser console on the non-www domain and then check document.cookie on the www variant.
You'll see it isn't set. (The cookie will only be accessible on subdomains if you include the domain explicitly, like document.cookie = 'a=b;domain=toplevel.domain')
Yeah, because the WHATWG folks changed the spec after 20 years - not because IE implemented the wrong thing. It's not surprising that folks in this thread remember the old spec, which was valid for longer.
> Further analysis of the top domains dataset shows that 53.35% of the plain-domains that show one or more weakness indicators (e.g. expired certificate) that are not shown in their equivalent www-domains
And Chrome still wants to fool the user into thinking these two domains are the same.[0]
I asked a question about this exact topic on Information Security over 4 years ago. It received very little attention, but the one answer it got came to the same general conclusion (the possibility of protocol differences):
However, if these are hosted on the same server, there is a long-standing bug in Apache that forces the same protocol and cipher suite for all virtual hosts: