
Userdir URLs like https://example.org/~username/ are dangerous - hannob
https://blog.hboeck.de/archives/899-Userdir-URLs-like-httpsexample.orgusername-are-dangerous.html
======
rlpb
> All userdir URLs on one server run on the same host and thus are in the same
> origin. It has XSS by design.

The ship has now sailed, but I still think it's worth pointing out that it is
Javascript's security model that is broken by design, that XSS vulnerabilities
are the result, and the same origin policy is an incomplete workaround as
illustrated by the article. This becomes clear when you consider that userdir
URLs pre-date Javascript.

The public suffix list is yet another incomplete workaround for the security
flaw in the same origin policy (that "same origin" isn't a concept that can be
clearly defined).

The modern web stack is a security house of cards, as demonstrated.

~~~
vortico
Honestly I can't think of a much better system that can support a billion
websites. What would you change/overhaul about the current system if you could
ignore compatibility?

~~~
yoloClin
Firstly, I'd fix the damn spelling of the referer header instead of everybody
putting up with it for close to 30 years.

I don't think JavaScript and HTML were a bad choice, but some things would
have made life a lot easier had they been strongly enforced sooner: secure
cookies by default, SameSite=lax, removal of the referer header, and CSP.
Enforcing them earlier would have stopped bad developer practices while also
removing a fair chunk of application security complexity, but at least we're
moving towards a better world on those now.
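
For reference, the hardened defaults listed above map onto a handful of
response headers (values purely illustrative):

```http
Set-Cookie: session=opaque-token; Secure; HttpOnly; SameSite=Lax
Referrer-Policy: no-referrer
Content-Security-Policy: default-src 'self'
```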

I don't know if it'd be technically possible to implement, but additional
characters to mark unsafe strings would have a huge impact on webapp security.
Reflection of untrusted data at the moment generally relies on one of HTML
encoding, URL encoding, or JavaScript escaping, and escaping in a safe way is
highly context-dependent (I've seen an unescaped "\n" cause injection within
JavaScript contexts). A way of effectively storing the level of trust a chunk
of data has across multiple transports - marking untrusted data within
HTML/JS, SQL statements, and interpreted languages like Bash or PHP - would
eliminate a bunch of vulnerabilities and would probably have mitigated a
number of notable historic vulnerabilities and hacks.
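
A toy sketch of that idea (the type and method names here are entirely
hypothetical, invented for illustration): wrap untrusted input in a type that
refuses to be embedded until it has been escaped for an explicit context.

```python
import html
import json

class Tainted:
    """Wraps untrusted input and refuses to be embedded until it is
    escaped for an explicit context. A toy sketch, not a real taint
    system."""

    def __init__(self, value: str):
        self.value = value

    def __str__(self):
        # Accidental embedding (e.g. in an f-string) fails loudly.
        raise TypeError("untrusted data must be escaped for a context first")

    def for_html(self) -> str:
        return html.escape(self.value)

    def for_js(self) -> str:
        # json.dumps escapes quotes, backslashes and control characters
        # (including the bare "\n" mentioned above) into a JS string literal.
        return json.dumps(self.value)

name = Tainted("<script>alert(1)</script>")
safe = f"<p>Hello {name.for_html()}</p>"
```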

~~~
squiggleblaz
> HTML encoding, URL encoding or JavaScript escaping, and escaping in a safe
> way is highly context-dependent (I've seen an unescaped "\n" cause injection
> within JavaScript contexts)

I have had a hard time convincing co-workers that if you have php generating
sql generating (! yes!) html generating javascript, you need to escape the
string for javascript since it's embedded in javascript. Then you need the
string escaped for html since it's embedded in html. Then you need the string
escaped for sql since it's embedded in sql. Only then can you chuck it into
the middle of the string. It is better to not do such craziness; but once
you've decided to do such craziness, you must do it properly. The similarities
between js and mysql escaping are irrelevant; it must be escaped properly each
time it is embedded in another language.
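
The rule above - escape once per embedding, innermost language first - can be
sketched like this (Python stand-ins for the PHP/SQL/HTML/JS stack, with
sqlite3 playing the role of the database):

```python
import html
import json
import sqlite3

untrusted = 'Robert"; alert(1); //'

# Innermost embedding first: the value becomes a JS string literal.
js = f"greet({json.dumps(untrusted)});"

# That JS sits inside an HTML attribute, so the whole JS fragment is
# HTML-escaped before being embedded in HTML.
page = f'<button onclick="{html.escape(js, quote=True)}">hi</button>'

# Outermost embedding: the HTML is stored via SQL. Rather than escaping
# by hand, use a bound parameter.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE pages (body TEXT)")
db.execute("INSERT INTO pages (body) VALUES (?)", (page,))
stored = db.execute("SELECT body FROM pages").fetchone()[0]
assert stored == page  # survives the round trip unmangled
```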

~~~
benibela
Escape characters are one of the most stupid things in the computing world.

The formats could be so simple: first the length of the data, then raw data of
that length
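
A minimal sketch of such a length-prefixed format (the 4-byte big-endian
length is an arbitrary illustrative choice):

```python
import struct

def encode(payload: bytes) -> bytes:
    # 4-byte big-endian length, then the raw bytes: no escape
    # characters needed, any byte value may appear in the payload.
    return struct.pack(">I", len(payload)) + payload

def decode(buf: bytes) -> tuple[bytes, bytes]:
    # Returns (payload, remaining bytes after it).
    (n,) = struct.unpack(">I", buf[:4])
    return buf[4:4 + n], buf[4 + n:]

wire = encode(b'no "quotes" or \\backslashes\\ to worry about\n')
payload, rest = decode(wire)
```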

~~~
EamonnMR
But then we would never be able to treat text as a stream. All of your one
pass algorithms would become two pass.

~~~
jl258
I’m curious what you mean by this — why does coding for length prevent
streaming? The receiving end can certainly treat the text as a stream still.
Do you mean that the sender cannot “stream” if they are generating the
contents on the fly? That seems trivial to solve - just break it up into
chunks, assuming a little bit of buffering is OK.
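
A sketch of that chunked approach (illustrative 4-byte length prefixes; the
zero-length terminator chunk mirrors HTTP/1.1 chunked transfer encoding):

```python
def write_chunks(parts):
    # The sender can stream without knowing the total size up front:
    # each chunk carries its own length, and a zero-length chunk marks
    # the end of the message.
    for part in parts:
        if part:
            yield len(part).to_bytes(4, "big") + part
    yield (0).to_bytes(4, "big")

def read_chunks(stream: bytes) -> bytes:
    # The receiver still processes the data in one pass.
    out, i = bytearray(), 0
    while True:
        n = int.from_bytes(stream[i:i + 4], "big")
        i += 4
        if n == 0:
            return bytes(out)
        out += stream[i:i + n]
        i += n

wire = b"".join(write_chunks([b"generated ", b"on ", b"the fly"]))
assert read_chunks(wire) == b"generated on the fly"
```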

------
renewiltord
This is a non-problem. There is no real threat model. It's like how you can
cast to my TV if you're in my house. Sure, anyone could attack the party by
posting porn on the TV but that's really not going to happen because I'm not
going to invite you and put you in the wifi if you're the kind of person who
does that.

~~~
strenholme
Also, example.com/~username style URLs usually are 1990s-style static web
sites where the only Javascript is something like a marquee. They are not the
kinds of sites which have extensive Javascript or a web forum people can sign
in to; _at most_ a site like that would have a guestbook.

~~~
arkanciscan
Marquees were HTML

~~~
masswerk
MSIE-only extension.

Edit: The "true" marquee was running in the now defunct status bar
(window.status) via JS.

~~~
arkanciscan
::Vietnam flashback music::

------
Wowfunhappy
The only place I've seen these used are universities (eg each professor gets a
webspace) and other small companies and organizations. I think it's relatively
safe to assume Alice isn't going to attack Bob—if she did want to, she has
better ways to do so.

Now, if someone knows of a host selling ~username spaces to members of the
public, that's a different story.

~~~
Polylactic_acid
When I see ~username I know I'm about to load a page with some cream tiled
background. These days every user just gets their own subdomain like
username.usersites.com

~~~
op00to
Heh. I use two userdir websites for work.

------
vortico
I've never seen a userdir URL used for anything but raw static HTML with
little or no JS. Generally the type of people who use these hosts don't
use cookies or local storage at all. They just want to post a few web pages,
photos, or documents. Can someone find an example?

~~~
simonw
Eighteen years ago my university let me host PHP scripts in a /~user
directory, and I wrote my own simple blogging software and ran it there.

It was vulnerable to the attacks described here. Thankfully no-one on that
domain ever exploited it in that way.

~~~
333c
Today my college lets me do the same. I wonder if I could even possibly use
`mod_wsgi` to run something like a Python server.

------
johnklos
Ridiculous. The solution isn't to call public_html bad. The solution is to
encourage people who are going to run things like webmail and other sites with
Javascript like that to use a specific virtualhost for webmail.

~~~
indymike
It would be nice if the spec included a way to scope an origin to a path.
It's almost as if the spec was designed to kill off public_html-style shared
hosting... either that, or talk the control panel software people into using
subdomains instead of paths for userdirs.

------
chc4
User-specific subdomains have their own set of problems too. This is why, for
example, GitHub Pages switched their hosting from username.github.com to
username.github.io - even with HttpOnly cookies, you can still _set_ cookies
for other subdomains.

[https://github.blog/2013-04-05-new-github-pages-domain-githu...](https://github.blog/2013-04-05-new-github-pages-domain-github-io/)
and
[https://github.blog/2013-04-09-yummy-cookies-across-domains/](https://github.blog/2013-04-09-yummy-cookies-across-domains/)
are relevant
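
A minimal illustration of the cookie-tossing issue those posts describe
(hypothetical values): a page served from any one subdomain could reply with
a header like

```http
Set-Cookie: session=attacker-chosen-value; Domain=.github.com; Path=/
```

and that cookie would then be sent to every other *.github.com site.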

~~~
hannob
Have you read the post?

It specifically mentions that and even names github.io as an example.

------
RandomGuyDTB
I've only ever seen userdir URLs hosting static content - ideally, static HTML
with lesson plans or a resume or something. Why would anyone host a web
application on a userdir? This seems like a complete non-issue.

Also - why would a blog be threatened by XSS? A blog is a collection of static
HTML files and I assume the creator would be using a composer or just a text
editor + browser to add and edit pages. Why would anyone log into a blog?
Nowadays anything like a comment section is generally handled by iframes to
some other service (like Disqus) anyway.

~~~
jlgaddis
> _Also - why would a blog be threatened by XSS? A blog is a collection of
> static HTML files and I assume the creator would be using a composer or just
> a text editor + browser to add and edit pages. Why would anyone log into a
> blog?_

Perhaps you've never heard of WordPress [0], the 17-year-old blogging software
used by ~60 million web sites as of 2012 [1] and that is, today, "the platform
of choice for over 35% of all sites across the web" [2]?

\---

[0]: [https://wordpress.org](https://wordpress.org)

[1]:
[https://web.archive.org/web/20160129215921/http://www.forbes...](https://web.archive.org/web/20160129215921/http://www.forbes.com/sites/jjcolao/2012/09/05/the-
internets-mother-tongue/)

[2]: [https://wordpress.org/about/](https://wordpress.org/about/)

~~~
RandomGuyDTB
I have experience using Wordpress; in sixth grade I tried out the Wordpress
editor and felt writing HTML was easier. I admit it's popular with less-savvy
users and with companies with money to burn, but it seems unprofessional to
use Wordpress for anything you can write by hand. I don't think anyone would
be using blogging _software_ if they didn't have their own domain.

------
redstripe
You'd think this was just an archaic relic but Bambora payment processing
(which hasn't updated their site since uppercase HTML tags and framesets were
a thing) has a userdir feature that lets their customers upload static files
their "secure webspace" which is hosted on the same domain for all customers.

~~~
strenholme
[https://www.bambora.com/en/us/](https://www.bambora.com/en/us/) looks like a
modern web site to me; if there was a URL we could look at, it would be
interesting to see this.

------
kylek
The tildeverse has some words for you

~~~
Polylactic_acid
Those sites have no backend code so there isn't really much risk of cookies
being shared.

------
zelon88
I would argue that the root issue you're describing which leads to XSS on the
same host is allowing un-sanitized input leading to code injection.

It is entirely possible to design your own application that uses unique
userdirs, without any of Apache's built-in functionality, and is completely
immune to XSS, because none of the users can place malicious scripts on the
server.

Likewise it is also entirely possible to build a web app that doesn't use
userdirs at all in any form for anything and still be vulnerable to XSS
because of underlying input sanitization or code injection problems.

So whatever you're going to do, just do your homework and do it well.

------
masswerk
I actually do not understand why this hasn't been addressed over all these
years. As far as the evaluation of origin is concerned, we could easily
implement something like:

If the path field of a URL contains a tilde directory, do the following: 1)
split the path after the tilde directory, 2) add the first part to the host
field, 3) use the second part as the new path field.

If there are any concerns regarding compatibility, make it opt-in via a HTTP
header field.
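
A sketch of that proposed rule (hypothetical - no browser computes origins
this way today):

```python
from urllib.parse import urlsplit

def effective_origin(url: str) -> str:
    """Fold a leading /~user/ path segment into the host for origin
    comparisons, per the proposal above. Illustration only."""
    parts = urlsplit(url)
    segments = parts.path.split("/")
    # segments[0] is "" for an absolute path; check the first real one.
    if len(segments) > 1 and segments[1].startswith("~"):
        user = segments[1][1:]
        return f"{parts.scheme}://{user}.{parts.hostname}"
    return f"{parts.scheme}://{parts.hostname}"

# Two users on the same host end up in distinct origins.
assert effective_origin("https://example.org/~alice/blog/") == \
    "https://alice.example.org"
```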

~~~
tgv
The tilde is just a convention. There are also sites that map it to
/home/<user> or something. Perhaps extending the Accept-Origin headers could
work.

But the basic message is: don't expect any protection from cross-directory
script attacks, so use domain names for anything serious.

------
boomlinde
Not really a problem if the host only serves static content. The author
argues that there still is what I guess could be called a phishing vector:

 _> All of this is primarily an issue if people run non-trivial web
applications that have accounts and logins. If the web pages are only used to
host static content the issues become much less problematic, though it is
still with some limitations possible that one user could show the webpage of
another user in a manipulated way._

It is really _without_ limitations possible that one user could show the
webpage of another user in a manipulated way - but for a static site that is
true whether or not they share a domain, because the phisher can simply read
your site content, copy it, and manipulate it.

------
BiteCode_dev
The title makes it look like the problem comes from the "~username" URL
naming scheme, while it really comes from any feature that lets users host
static files they upload AND link to them somewhere else on your website.

The site takes the example of Apache mod_sites, but alone that's not enough.
You need mod_sites, plus the ability to inject a link served by it somewhere
else.

Well, yes. If you let users input anything and then inject it anywhere in
your website, it's a security hole. Not a surprise.

So if you let your users host entire applications on your website, and let
other users of your website visit those applications - well, I mean, sure,
it's a problem.

The best sandbox is the one you don't need to use.

------
mrslave
Today we have affordable wildcard CA-signed SSL certificates (thanks Let's
Encrypt!) so there's no excuse for not following
[https://username.host/](https://username.host/) structures.

------
Alex3917
> While this issue should be obvious to anyone knowing basic web security, I
> have never seen it being discussed publicly.

It was discussed here:
[https://news.ycombinator.com/item?id=15141594](https://news.ycombinator.com/item?id=15141594)

I'm still of the opinion that there should be an RFC around tilde urls, making
it official that they're for hosting static content that isn't vouched for by
the owner of the main domain. This has been an important part of our culture
for decades, and it should be formally recognized and codified so that people
can do it in a secure way.

------
epx
It made me nostalgic - my (static) webpage was like /~that in 1995 or so.

------
torstenvl
What? No. mod_userdir is not dangerous. The resulting URLs are not dangerous.

Having multiple unrelated unvetted outside users with the ability to make your
server serve arbitrary JavaScript is dangerous. The route you take to get
there is kind of beside the point.

If anything, mod_userdir is helpful in that it's obvious when content is under
the control of a different user. Perhaps browsers should treat x.com/~x/ as a
different server from x.com/~y/ for XSS purposes.

------
Groxx
Sure, if you're running _entirely separate systems_ behind each username.

But in that case this has been true since XHR first came into existence
15-20 years ago:
[https://en.wikipedia.org/wiki/XMLHttpRequest](https://en.wikipedia.org/wiki/XMLHttpRequest)

(Sub)domains are a fundamental requirement for web security; they essentially
always have been.

------
zemnmez
you could fix this pretty effectively with the Content-Security-Policy:
sandbox header, which would just deny all same origin policy access
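
As a sketch (assuming Apache with mod_headers enabled, and a hypothetical
userdir path), that could look like:

```apache
# Deny same-origin privileges to everything served from userdirs.
# "sandbox" without "allow-same-origin" gives each response a unique,
# opaque origin; "allow-scripts" still permits the page's own JS to run.
<DirectoryMatch "^/home/.+/public_html">
    Header set Content-Security-Policy "sandbox allow-scripts"
</DirectoryMatch>
```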

------
frobozz
Most Userdir URLs like that just go to 404 pages anyway. That is the bigger
problem.

------
jefftk
Summary: host user pages as username.example.org instead so browsers will
consider them to be separate origins
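
One hedged sketch of that setup, using Apache's mod_vhost_alias (hostnames
and paths are illustrative; it also needs a wildcard DNS record and
certificate):

```apache
<VirtualHost *:443>
    ServerAlias *.example.org
    # %1 is the first dot-separated part of the requested hostname,
    # so alice.example.org serves /home/alice/public_html.
    VirtualDocumentRoot /home/%1/public_html
</VirtualHost>
```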

------
perlgeek
No danger if ~username/ can only contain static pages. No tokens to be stolen
etc. in XSS scenarios.

------
mcintyre1994
I didn’t know about the public suffix list, that’s a cool idea to maintain as
a community resource. Nice!

------
6510
When roboform was still popular a lot of its users had it fill out username
and password automatically.

------
kalonis
(I am so glad the title is not "Userdir URLs considered harmful".)

------
dvfjsdhgfv
This is not a problem of userdir but of Javascript. Userdir is older; then JS
arrived and introduced many kinds of vulnerabilities, and some of these have
since been addressed, with mixed results. With userdir you're perfectly fine
if you don't use JS.

------
mister_hn
So maybe 90% of academic websites are built in this way

------
singularity2001
what about [https://example.org/+rm+-rf+/](https://example.org/+rm+-rf+/) ?

------
mholt
-

~~~
bsagdiyev
Looks like he has edited it away now, but the parent comment was basically
saying he's been telling customers to avoid shared hosting by using shared
hosting in the form of Heroku, etc., instead.

