The Document Base URL Element (developer.mozilla.org)
59 points by theandrewbailey 7 months ago | 25 comments

This feature was recently used to bypass email link filtering systems: https://www.securityweek.com/phishers-use-new-method-bypass-...

If you want you can disallow its usage on your website with a Content-Security-Policy directive: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Co...
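For illustration, the directive in question is `base-uri`. Here is a minimal sketch of adding it to a set of response headers. The helper name `with_base_uri_policy` and the plain-dict header representation are assumptions made for this example and are not tied to any particular framework.

```python
def with_base_uri_policy(headers, allowed="'none'"):
    """Return a copy of `headers` with a CSP `base-uri` directive added.

    Hypothetical helper: `headers` is assumed to be a plain dict of
    response headers. `'none'` forbids any <base href>; `'self'` would
    allow only same-origin base URLs.
    """
    updated = dict(headers)
    directive = f"base-uri {allowed}"
    existing = updated.get("Content-Security-Policy")
    # Append to an existing policy rather than overwriting it.
    updated["Content-Security-Policy"] = (
        f"{existing}; {directive}" if existing else directive
    )
    return updated


print(with_base_uri_policy({"Content-Type": "text/html"}))
```

With `'none'`, a browser that enforces the policy should ignore any `<base href>` injected into the page.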

We've (the boomerang dev team) seen many frameworks do this. For many parts of boomerang, we need to get the url of an object (img, a, iframe, anything that could have a url), and just looking at the object's `src`/`currentSrc` or `href` attributes is insufficient.

What we need to do is create a new `link` element, attach it to the window, assign the url to the link's `href` attribute, and then read that attribute back out. This process converts all relative URLs to absolute URLs, taking BASE into consideration.

Note that you MUST add the link to the document of the correct window. Adding it to the document of an anonymous iframe will not work because, while anonymous iframes inherit a lot of things from the parent, the baseURI isn't one of them.
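Outside a browser, the same relative-to-absolute conversion can be roughly approximated. This is not boomerang's code, only a sketch using Python's `urllib.parse.urljoin`; the function name `absolute_url` and the example URLs are made up, and `urljoin` may differ from a browser's URL parser in edge cases.

```python
from urllib.parse import urljoin


def absolute_url(document_url, base_href, raw_url):
    """Resolve `raw_url` roughly as a browser would for this document.

    Hypothetical helper: `base_href` is the value of the page's
    <base href>, or None if the page has no such element.
    """
    # A <base href> may itself be relative, so resolve it against the
    # document URL first; without one, the document URL is the base.
    base = urljoin(document_url, base_href) if base_href else document_url
    return urljoin(base, raw_url)


page = "https://example.com/app/page.html"
print(absolute_url(page, None, "img/logo.png"))
print(absolute_url(page, "https://cdn.example.net/v2/", "img/logo.png"))
```

The first call resolves against the page's own directory, the second against the `<base href>`, which is why reading the raw attribute alone is not enough.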

Completely off-topic, but seeing the Firefox logo in B&W makes me realize for the first time that you can sort of see an upside-down Godzilla in the negative space surrounded by the firefox. They should make it more intentionally lizard-like.


Ember used to use <base> tags but moved away from them due to user confusion and problems with SVG: https://www.emberjs.com/blog/2016/04/28/baseURL.html

My personal use case: my company built a system where developers could build html/css/js "apps" that semi-technical end users could deploy into their accounts. Those end users could tweak the html/css/js in limited ways and just the modified part would be deployed to a different location, but would have an injected <base> tag to point back to the developer's app so that end users could reference the original app's images and other static assets. If I were to build it again I might choose an actual templating system instead, but it would be one extra thing our users would have to learn.

I've always seen <base> as somewhat analogous to `with` in JavaScript, whose usage is strongly discouraged [1]. Recent exploits using <base> seem to confirm that a bit.

[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

It is interesting to read the web developer perspective on this element. Perhaps it is of little use to them.

As an end user who prefers using a non-graphical http client to retrieve web content to a file (then view the file in another program -- often a browser), I use this element every day.

When I know I will use a browser to view the file, I use a client that writes the base href according to the target domain to the beginning of the file.

"As an end user who prefers using a non-graphical http client to retrieve web content to a file"


Years ago I worked on a proprietary framework that depended on the base tag. Let me say that, in my experience, you are far better off just having a decent router that builds URLs from the root for you. Far, far fewer weird pathing problems.

It's invaluable when you want to prototype or verify something from "production" without access to the backend and dev tools are "not enough": 1) save HTML of any page (you know, view-source:URL, ctrl+s), 2) put <base href="URL"> there, 3) open it from localhost or even as a file:///, 4) if you want to make changes to say CSS, again save the original and (absolutely) link your local duplicate. You have to rewrite url()s there, though: sadly, there is no @base for CSS.

Saved me some time of waiting for credentials or slow preview from backend.
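Step 2 of that workflow can be scripted. This is a rough sketch, assuming the saved page has an ordinary `<head>` tag; `inject_base` is a made-up name, and the regex is a shortcut, not a real HTML parser.

```python
import re
from html import escape


def inject_base(html, base_url):
    """Insert <base href="base_url"> into a saved HTML page.

    Hypothetical helper. The tag goes right after the opening <head>
    so it precedes any relative URL; if no <head> is found, it is
    prepended to the file instead.
    """
    tag = f'<base href="{escape(base_url, quote=True)}">'
    new_html, count = re.subn(
        r"(<head\b[^>]*>)",
        lambda m: m.group(1) + tag,  # lambda avoids backslash handling in the replacement
        html,
        count=1,
        flags=re.IGNORECASE,
    )
    return new_html if count else tag + html


saved = "<html><head><title>Shop</title></head><body><img src='logo.png'></body></html>"
print(inject_base(saved, "https://example.com/shop/"))
```

Opened from localhost or file:///, the page's relative images and scripts then load from the original site.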

As some other comment points out, this element has been used to bypass security checks of web based email clients [0]:

> Avanan says cybercriminals have found a simple way to bypass this security feature by using a <base> tag in the HTML header – basically splitting the malicious URL. Using this method, Safe Links only checks the base domain and ignores the rest – the link is not replaced and the user is allowed to access the phishing site.

To me, this seems more an oversight, if not a bug, of the email link protection system.

[0] https://www.securityweek.com/phishers-use-new-method-bypass-...
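To make the split concrete, here is a sketch of how a scanner or crawler could compute the URLs a browser would actually follow. It says nothing about how Safe Links is implemented; the class and function names and the `phish.example` domain are invented for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the first <base href> and every <a href> in a document."""

    def __init__(self):
        super().__init__()
        self.base_href = None
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "base" and self.base_href is None and attrs.get("href"):
            self.base_href = attrs["href"]  # only the first <base href> counts
        elif tag == "a" and attrs.get("href"):
            self.hrefs.append(attrs["href"])


def effective_links(html, document_url=""):
    """Return each link resolved against the document's <base href>, if any."""
    parser = LinkCollector()
    parser.feed(html)
    base = urljoin(document_url, parser.base_href) if parser.base_href else document_url
    return [urljoin(base, href) for href in parser.hrefs]


mail = (
    '<html><head><base href="https://phish.example"></head>'
    '<body><a href="/login/steal.html">Invoice</a></body></html>'
)
print(effective_links(mail))
```

The raw `href` here contains no host at all. Only after resolving it against the base does the full destination become visible, so that resolved value is the one a filter would need to check.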

It’s an outright bug and probably a regression: 90s spam filters checked it but since <base href> has fallen out of style I’d bet most of the developers were unaware that this corner of HTML existed.

Both of the frontenders I work with didn't know it existed until I was talking about having to take care of it in the crawler I was writing.

This breaks relative anchor links. I wouldn't recommend using it unless you really have to.
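A quick sketch of why: a fragment-only link such as `#install` is resolved against the base URL, not the current page. The URLs are invented, and `urljoin` only approximates browser behaviour.

```python
from urllib.parse import urljoin

page = "https://example.com/docs/guide.html"  # the document being viewed
base = "https://example.com/"                 # value of its <base href>

# Without <base>, "#install" stays on the current page.
without_base = urljoin(page, "#install")
# With <base>, the same link points at the base URL plus the fragment.
with_base = urljoin(base, "#install")

print(without_base)  # https://example.com/docs/guide.html#install
print(with_base)     # https://example.com/#install
```

So a same-page link has to spell out the page's own path, e.g. `docs/guide.html#install`, to keep working.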

I found this useful when writing a webproxy for fun. You normally need to rewrite all the links to feed back into the webproxy, but this is useful for relative links that slip by because they were added dynamically by JavaScript.

I remember finding the base element very useful back in the days of hand writing the HTML of my Geocities page, what with its ridiculously long base URL.

I've pretty much always found that the added difficulty in doing same-page anchor links is reason enough to avoid the base tag.

When is this ever useful?

We have a system where all of our static assets are prefixed with the SHA1 of the git commit that produces them. It's a great cache-busting technique for CDNs. But this also means we cannot use paths like /img/background.jpg but rather /<insert 40 hex chars here>/img/background.jpg when we refer to such assets. Of course most of the time we can compute this by JavaScript on the client side but it's often easier just to use <base> and get rid of that piece of JavaScript.
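A sketch of how that resolution plays out, assuming the `<base href>` ends in the commit hash plus a trailing slash. The hash and the CDN host below are placeholders.

```python
from urllib.parse import urljoin

# Placeholder 40-character hash standing in for the real git commit SHA1.
COMMIT = "0123456789abcdef0123456789abcdef01234567"
base = f"https://cdn.example.com/{COMMIT}/"  # assumed <base href> value

# A path-relative reference picks up the commit prefix from the base.
prefixed = urljoin(base, "img/background.jpg")
# A root-relative reference (leading slash) bypasses the prefix entirely.
unprefixed = urljoin(base, "/img/background.jpg")

print(prefixed)
print(unprefixed)
```

Under these assumptions the markup would have to reference `img/background.jpg` without a leading slash, because a root-relative path ignores the path part of the base.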

Another use case is to change all links to open in a new tab by setting only "target" on the <base>. This happens on marketing sites with a lot of third-party links, where the marketing department naturally wants those links to open in a new tab instead of the current tab.

Some websites do use <base target="_blank"> to make links open in a new tab by default.

However, this can create a vulnerability as the linked website now has a reference to your site through window.opener and can take the user away from your site:


If you use a CI system that deploys each build of your webapp, where it might be deployed at a random URL (<domain>/build1/index.html vs. <domain>/build2/index.html), then you would make all the URLs in your app relative, so that they are effectively prefixed by the base URL specified in the document head when the actual requests get sent. You can then write a different base href into the document for each build, and everything will magically work.

Also, if you ever use relative URLs for assets or requests, and you deploy to different URLs (or your deployed URL paths don't map 1:1 to your filesystem paths), it can come in handy.

Angular also uses this to determine where the client side routing begins - https://angular.io/guide/deployment#base-tag .

It's pretty useful when making phishing pages. Rather than having to host all the tiny files and images yourself or going through what is likely an enormous HTML file and editing all the srces and hrefs, you can just throw a <base> tag on there and be done with it.

Admittedly this is probably not the use case you're looking for.

It's useful if you're developing a static HTML page, loading it from your local filesystem via a file:// URL but want it to pretend like it's at the expected http:// URL. Of course, you wouldn't ever publish it like this.

I'm unsure if there's any utility to publishing a publicly-visible page using this tag.

This was useful in the era of framesets.
