
The Lazy Man's URL Parsing - joezimjs
http://www.joezimjs.com/javascript/the-lazy-mans-url-parsing/?utm_source=linkedin.com&utm_medium=referral&utm_campaign=linkedingroups
======
yahelc
This is a great trick, but there's an irritating IE bug: IE omits the leading
slash on the pathname attribute (whereas all other browsers include it).

It can be corrected by doing:

    parser.pathname = parser.pathname.replace(/(^\/?)/,"/");

Further, because you're creating a DOM element, it's less performant than
using a RegExp solution. That performance degradation will typically only
surface if you're parsing a large number of URLs (for example, in a loop),
rather than just 1 or 2.
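
For comparison, here is a rough sketch of what a RegExp-based approach could look like. The names `parseUrl`, `normalizePathname`, and `URL_PARTS` are made up for illustration, and the pattern is a simplified variant of the one in RFC 3986 Appendix B, not something taken from the article. It does not split out the port or credentials.

```javascript
// Hypothetical helper: make sure a pathname starts with exactly one added "/"
// when the leading slash is missing (the IE quirk described above).
function normalizePathname(pathname) {
  return pathname.replace(/^\/?/, "/");
}

// Simplified variant of the RFC 3986 Appendix B pattern (an assumption,
// not the article's code): scheme, authority, path, query, fragment.
var URL_PARTS = /^(?:([^:\/?#]+):)?(?:\/\/([^\/?#]*))?([^?#]*)(\?[^#]*)?(#.*)?$/;

function parseUrl(url) {
  var m = URL_PARTS.exec(url);
  if (!m) {
    // The pattern can fail on input containing line breaks.
    throw new Error("Could not parse URL: " + url);
  }
  return {
    protocol: m[1] ? m[1] + ":" : "",
    host: m[2] || "",
    pathname: normalizePathname(m[3] || ""),
    search: m[4] || "",
    hash: m[5] || ""
  };
}

var parts = parseUrl("http://example.com:8080/a/b?x=1#top");
console.log(parts.host);     // "example.com:8080"
console.log(parts.pathname); // "/a/b"
```

Whether this actually beats the DOM approach in a given browser would need measuring; the sketch only shows the shape of the alternative.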

~~~
exogen
For the DOM element, it seems like you could just create the element once and
then keep it around for reuse. Should be safe since JavaScript is single-
threaded.
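
Something along these lines, as a minimal sketch. `makeUrlParser` and its `doc` parameter are hypothetical names; passing the document in (rather than using the global) is only there so the sketch can also be exercised outside a browser.

```javascript
// Hypothetical factory: creates the <a> element once and reuses it
// for every subsequent parse call.
function makeUrlParser(doc) {
  var link = doc.createElement("a"); // created a single time

  return function parse(url) {
    link.href = url; // the browser re-parses the URL on assignment
    return {
      protocol: link.protocol,
      hostname: link.hostname,
      // Work around the missing leading slash mentioned for IE.
      pathname: link.pathname.replace(/^\/?/, "/"),
      search: link.search,
      hash: link.hash
    };
  };
}
```

In a browser this would be used as `var parseUrl = makeUrlParser(document);` followed by `parseUrl("http://example.com/a/b")`. Returning a plain object copy means callers never hold on to the shared element.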

~~~
zachrose
My understanding is that modifying and querying DOM elements is what's slow,
not just creating them.

<http://jsperf.com/lazy-url-parsing>

~~~
exogen
I'm actually surprised it's only ~40% slower (in my Chrome anyway).
Considering how much less code it is to maintain, that's totally worth it IMO.
I'm sure there are a thousand other functions you'd want to optimize in a real
app before this.

------
unfletch
The interface is officially called "URL decomposition IDL attributes." IIRC
it's only implemented by <a/>, <area/> and the Location object.

Here are the canonical docs: [http://dev.w3.org/html5/spec-
LC/urls.html#interfaces-for-url...](http://dev.w3.org/html5/spec-
LC/urls.html#interfaces-for-url-manipulation)

And here's a prettier (but less detailed) version:
[http://developers.whatwg.org/urls.html#interfaces-for-url-
ma...](http://developers.whatwg.org/urls.html#interfaces-for-url-manipulation)

------
matttthompson
Just came across this clever solution yesterday, actually. Unfortunately, it
doesn't work for URLs containing a username and password (e.g.
<http://username:password@example.com>). Glad to have found URI.js, though--
it's exactly what I was looking for.

~~~
joezimjs
I was noticing that too.

------
mmahemoff
`var link = $('<a/>').attr('href', 'http://example.com')[0];` in jQuery,
just to show how simple it can be.

A stylistic point is I'd just call it something like "link". Assigning a link
element to "parser" is being cute and it's not what the object actually is,
even if it's being used with intent to parse.

------
sparknlaunch12
HN Link: [http://www.joezimjs.com/javascript/the-lazy-mans-url-
parsing...](http://www.joezimjs.com/javascript/the-lazy-mans-url-
parsing/?utm_source=linkedin.com&utm_medium=referral&utm_campaign=linkedingroups)

Takes you here:

"This webpage has a redirect loop. The webpage at
<http://www.joezimjs.com/500.shtml/> has resulted in too many redirects.
Clearing your cookies for this site or allowing third-party cookies may fix
the problem. If not, it is possibly a server configuration issue and not a
problem with your computer. Here are some suggestions: Reload this webpage
later. Learn more about this problem. Error 310 (net::ERR_TOO_MANY_REDIRECTS):
There were too many redirects."

Instead of here: [http://www.joezimjs.com/javascript/the-lazy-mans-url-
parsing...](http://www.joezimjs.com/javascript/the-lazy-mans-url-parsing/)

------
Kevin_Marks
The original seems to be down now. Note that there are many ways to name the
bits you parse a URL into: [http://tantek.com/2011/238/b1/many-ways-slice-url-
name-piece...](http://tantek.com/2011/238/b1/many-ways-slice-url-name-pieces)

~~~
bmelton
Even worse, the original seems to be down due to an infinite redirect loop.

~~~
joezimjs
Sorry about "the original". It wasn't actually supposed to have that query
string on the end. That was for Google Analytics tracking elsewhere and I
forgot to remove it when I posted it here.

I'm not sure why it produced a redirect loop because it worked many times for
a lot of people. It may have had something to do with my server getting
overloaded. I'm on a Hostgator reseller account, so I have limited resources
and when HG saw the massive CPU usage from all of you people checking this
out, they shut down my site for a bit.

------
IsaacSchlueter
Incidentally, the Node.js `require('url').parse(str)` method is designed to
present the same API, except that it also includes the auth section as an
'auth' member.
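
A quick sketch of what that looks like in Node.js, using the credentialed URL mentioned elsewhere in this thread, with a path, query string, and fragment added purely for illustration. Note that `url.parse` is the legacy API and newer Node.js releases mark it as deprecated, so this shows the shape of the result rather than a recommendation.

```javascript
var url = require("url");

// The path, query, and fragment here are assumed for the example.
var parsed = url.parse("http://username:password@example.com/some/path?x=1#top");

console.log(parsed.auth);     // "username:password"
console.log(parsed.hostname); // "example.com"
console.log(parsed.pathname); // "/some/path"
console.log(parsed.search);   // "?x=1"
console.log(parsed.hash);     // "#top"
```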

------
foxhop
If you are coding with Python instead of JavaScript, check out my complete URI
module (miniuri.py):
[https://bitbucket.org/russellballestrini/miniuri/src/tip/min...](https://bitbucket.org/russellballestrini/miniuri/src/tip/miniuri.py)

------
elliottcarlson
Cached copy of the page: [http://www.joezimjs.com.nyud.net/javascript/the-
lazy-mans-ur...](http://www.joezimjs.com.nyud.net/javascript/the-lazy-mans-
url-parsing/)

------
ww520
Laziness is the mother of all inventions. Neat trick.

------
dudus
I wonder if it works in Internet Explorer 6 or 7.

~~~
Trezoid
Do people building anything real world care about IE6 any more? Hell, do they
really care about 7?

~~~
stilist
If you care about enterprise you probably care about IE 7.

~~~
mkopinsky
In the hospital where I work, most of the computers got upgraded to IE7 about
6 months ago. There's talk of upgrading to IE8, but that's probably a year off
if I had to guess.

The way I see it, about half of my salary is for caring about IE7. IE7 hacks
and stupidities occupy no more than 5-10% of my time, but the way I see it,
80% of my work is damn fun and I would do it for free. I am glad that there is
the remaining 20% of IE7 hacking, boss-assigned-task-doing, and so on for
which I (consider myself to) get paid 5x my actual salary.

