

Probably the worst URL scheme ever - jakub_g
http://www.bvb.de/?_%1B%E7%F4%9D
[english version] http://www.bvb.de/?Z%1B%E4%F4%9D

Can anyone point to any advantage of having unreadable URLs like the ones on the linked page?
======
masnick
LinkedIn URLs are by far the worst. For example, the first profile that came
up when I searched for Paul Graham:

[http://www.linkedin.com/profile/view?id=23081590&authTyp...](http://www.linkedin.com/profile/view?id=23081590&authType=NAME_SEARCH&authToken=v2DV&locale=en_US&srchid=b2875520-dc20-4e1a-a17b-17460da676f9-0&srchindex=1&srchtotal=3061&goback=%2Efps_PBCK_Paul+graham+_*1_*1_*1_*1_*1_*1_*2_*1_Y_*1_*1_*1_false_1_R_*1_*51_*1_*51_true_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2&pvs=ps&trk=pp_profile_name_link&_mSplash=1)

~~~
raverbashing
You can have a public URL for linkedin in the format /in/CustomName

E.g.: www.linkedin.com/in/barackobama

~~~
smackfu
Too bad that seems like only an external reference method, and the site never
uses those URLs internally.

~~~
masnick
Yeah, I want to know what they're doing internally with these insane URLs.
It's just ugly -- poor craftsmanship IMHO.

~~~
mseebach
Yeah, nothing screams good craftsmanship like armchair-quarterbacking someone
else's work that clearly works well.

FWIW, it seems to be some sort of encoded state, essentially HATEOAS.

~~~
smackfu
Works well?

"Oh you want the URL of your LinkedIn Profile? Don't use the URL in the
address bar, silly! Use this other URL we are providing randomly down the
page."

~~~
mseebach
I was referring to the URL-scheme of the LinkedIn app in general, not just the
profile page - I somehow missed that this subthread was only about the profile
page. Yes, I agree they dropped the ball completely on that. Especially tragic
as they could fix the 80% case with a front-end one-liner
(window.history.replaceState).

------
cedricd
The elephant in the room here of course is this:
<https://news.ycombinator.com/x?fnid=b7VO4wED8MRumCeiX5fCnF>

~~~
ancarda
I never understood why HN has such a peculiar URL for accessing pages. It
times out after a while too, is that to stop crawlers?

~~~
Robin_Message
They are ids to look up closures in a database. They time out to stop the
database overflowing ;) It's called continuation-based web development [1],
popular with Lisp and Smalltalk-based web servers (because who else has
continuations?)

[1] <http://en.wikipedia.org/wiki/Continuation#In_Web_development>

~~~
jgrahamc
There's no database.

EDIT: to all the people arguing with me. Read the source code to Hacker News.

    
    
        (= fns* (table) fnids* nil timed-fnids* nil)
    
        ; count on huge (expt 64 10) size of fnid space to avoid clashes
    
        (def new-fnid ()
          (check (sym (rand-string 10)) ~fns* (new-fnid)))
    
        (def fnid (f)
          (atlet key (new-fnid)
            (= (fns* key) f)
            (push key fnids*)
            key))
    
        (mac afnid (f)
          `(atlet it (new-fnid)
             (= (fns* it) ,f)
             (push it fnids*)
             it))
    

They are in memory. Which is why they expire randomly when the HN process is
restarted.
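
For readers who don't speak Arc, here is a rough Python sketch of the same idea: a process-local table that maps random ids to closures. The names `fns`, `new_fnid` and `fnid` mirror the Arc source above; `call_fnid`, the 62-character alphabet and the "expired" message are assumptions added for illustration, not part of the HN code.

```python
import secrets
import string

# Assumption: letters and digits only (62 symbols); the Arc comment
# talks about a 64-symbol space.
ALPHABET = string.ascii_letters + string.digits

# In-memory table of id -> closure. It lives and dies with the process.
fns = {}


def new_fnid(length=10):
    """Return a random id that is not already in the table."""
    while True:
        key = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if key not in fns:
            return key


def fnid(f):
    """Store closure f under a fresh id and return that id."""
    key = new_fnid()
    fns[key] = f
    return key


def call_fnid(key, *args):
    """Hypothetical handler for /x?fnid=KEY: run the stored closure if it still exists."""
    f = fns.get(key)
    if f is None:
        return "Unknown or expired link."
    return f(*args)
```

Clearing `fns` (which is effectively what a process restart does) makes every previously issued id fall through to the "expired" branch.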

~~~
gcr
Racket (arc's host language) keeps continuations on the filesystem, or you can
write your own "stuffer" to do what you want with them (store them in a
database or whatever). But you have to keep them somewhere or else (assuming
the server uses continuations) you can't keep track of the user's path through
your code as they click through links and such.

Racket does have an option to serialize the continuations, gzip them, sign
them with HMAC, and then send all of that to the client so the server doesn't
have to keep track of anything, but HN doesn't use it.

See [http://docs.racket-lang.org/continue/#(part._.Advanced_.Cont...](http://docs.racket-lang.org/continue/#\(part._.Advanced_.Control_.Flow\))
for a quick introduction.
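
The "serialize, gzip, sign with HMAC, send to the client" idea can be sketched in Python. This is not Racket's stuffer API: it serializes a plain dict instead of a real continuation, and `SECRET_KEY`, `stuff` and `unstuff` are hypothetical names chosen for the example.

```python
import base64
import hashlib
import hmac
import json
import zlib

# Assumption: a secret known only to the server.
SECRET_KEY = b"hypothetical-server-secret"
TAG_LENGTH = 32  # size of a SHA-256 digest in bytes


def stuff(state: dict) -> str:
    """Serialize, compress and sign state so it can travel in a URL."""
    payload = zlib.compress(json.dumps(state, sort_keys=True).encode("utf-8"))
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(tag + payload).decode("ascii")


def unstuff(token: str) -> dict:
    """Verify the signature and restore the state, or raise ValueError."""
    raw = base64.urlsafe_b64decode(token.encode("ascii"))
    tag, payload = raw[:TAG_LENGTH], raw[TAG_LENGTH:]
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch")
    return json.loads(zlib.decompress(payload).decode("utf-8"))
```

The trade-off is the one described above: the server stores nothing, but the URL grows with the amount of state it carries.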

~~~
epochwolf
HN doesn't use Racket. It's a custom Lisp based on Scheme.

~~~
gcr
Sure it does. HN is written in "Arc":

<https://github.com/wting/hackernews>

Arc runs on Racket:

<http://en.wikipedia.org/wiki/Arc_(programming_language)>

See the "OS" section where it says "runs on the Racket compiler"

See also the Arc source code, <https://github.com/Pauan/ar/blob/arc/nu/arc>
Note the "#lang racket" at the top.

------
waffle_ss
I think THOMAS (<http://thomas.loc.gov/>), the search engine provided by the
US Library of Congress for searching federal legislation, has the worst URLs
I've seen. Here's a random one:

    
    
      http://thomas.loc.gov/cgi-bin/bdquery/D?d113:1:./temp/~bdGqLa:@@@T|/home/LegislativeData.php|
    

And here's a link I got to the Patriot Act (HR 3162):

    
    
      http://thomas.loc.gov/cgi-bin/query/D?c107:44:./temp/~c107DgA33R::

~~~
twistedpair
Making sense of that in linear time would be a great interview question.

~~~
networked
The _./temp/~c107DgA33R_ bit looks like a reference to a cached internal state
of the system, so you can probably make about as much sense of it as you can
of <https://news.ycombinator.com/x?fnid=H1QJE8EOaO2OkA28owXZ4H>.

------
franze
i have seen worse.

i.e. sites that show one page with the URL <http://www.example.com> and another
page with the URL <http://www.example.com> and, even after another click, the
URL <http://www.example.com> with completely new content

and sites that use a logic like this
[http://www.example.com/357893857435/sfjsfsfsfd/this-should-b...](http://www.example.com/357893857435/sfjsfsfsfd/this-should-be-the-seo-part-of-the-url)
where
[http://www.example.com/357893857435/sfjsfsfsfd/this-is-shoul...](http://www.example.com/357893857435/sfjsfsfsfd/this-is-should-be-the-seo-part-of-the-url)
and
[http://www.example.com/357893857435/sfjsfsfsfd/tHIS-is-the-s...](http://www.example.com/357893857435/sfjsfsfsfd/tHIS-is-the-sEo-part-of-the-URL)
show the same page, oh and of course
<http://www.example.com/357893857435/sfjsfsfsfd> also shows the same page.

oh, and cases where <http://www.example.com/click1/click2/click3/item-id/123>
shows the same page as <http://www.example.com/click1/item-id/123>, which shows
the same page as
[http://www.example.com/click1/click2/click3/click4/item-id/1...](http://www.example.com/click1/click2/click3/click4/item-id/123)

all of the examples above are far worse than bvb.de

~~~
tsahyt
Quite frankly, I despise the use of "SEO URLs" altogether. It's basically a
waste of bytes.

~~~
nkozyra
Sometimes, sometimes not. You can often compress a lot of

&parameter=value into a simple /value/

rewrite. This - by most accounts - makes it _more_ SEO friendly. Granted,
putting full sentences that match an article title is never _necessary_, but
there are a lot of SEO tricks that can make a url not only "nicer" looking but
also shorter.
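
A minimal sketch of that kind of rewrite, assuming a hypothetical back end at `/index.php` that expects `category` and `item` parameters (neither appears in the thread):

```python
import re
from typing import Optional

# Hypothetical short form: /<category>/<item-id>/
ROUTE = re.compile(r"^/(?P<category>[^/]+)/(?P<item>\d+)/?$")


def rewrite(path: str) -> Optional[str]:
    """Expand a short path into the query-string URL the back end expects."""
    match = ROUTE.match(path)
    if match is None:
        return None
    return f"/index.php?category={match['category']}&item={match['item']}"


print(rewrite("/shoes/123/"))  # /index.php?category=shoes&item=123
```

Here the public URL is shorter than the internal one and carries no sentence-length slug.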

------
hamoid
I still think the one used by the Spanish Congress is worse. URL for legal
document 162/000609:

[http://www.congreso.es/portal/page/portal/Congreso/Congreso/...](http://www.congreso.es/portal/page/portal/Congreso/Congreso/Iniciativas?_piref73_2148295_73_1335437_1335437.next_page=/wc/servidorCGI&CMD=VERLST&BASE=IW10&FMT=INITXDSS.fmt&DOCS=1-1&DOCORDER=FIFO&OPDEF=ADJ&QUERY=%28162%2F000609*.NDOC.%29)

~~~
masklinn
Legal portals are generally garbage, here's the French one for Article L511-1
of the environmental code:
[http://www.legifrance.gouv.fr/affichCodeArticle.do?idArticle...](http://www.legifrance.gouv.fr/affichCodeArticle.do?idArticle=LEGIARTI000023491026&cidTexte=LEGITEXT000006074220)

~~~
claudius
§ 1353 of the BGB (German Civil Code) can be found at
<http://www.gesetze-im-internet.de/bgb/__1353.html> (literally ‘laws on the internet’).

~~~
mhd
So close, with minimal effort they could map that to '/bgb/1353'. It seems
that dejure.org actually works that way ->
<http://dejure.org/gesetze/BGB/1353> seems to map to
<http://dejure.org/gesetze/BGB/1353.html>, but they graciously ignore any kind
of file extension...

~~~
claudius
Well they already managed to get an overview/full-text at /bgb/, so I am quite
happy for now…

------
dbbolton
So, this page is named ?_(null)çô(null)

? How does that even begin to make sense?

~~~
hosay123
It's an underscore followed by 4 bytes, possibly the integer 2650072859 or
468186269. If they're intentionally trying to obfuscate their URLs to prevent
crawling, it might be further encrypted somehow.
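
Both readings can be checked with a few lines of Python; the only input is the query string from the submitted URL, minus the leading underscore:

```python
import struct
from urllib.parse import unquote_to_bytes

# The four percent-encoded bytes that follow the underscore.
raw = unquote_to_bytes("%1B%E7%F4%9D")

# Interpret the same four bytes as an unsigned 32-bit integer in both byte orders.
(big_endian,) = struct.unpack(">I", raw)
(little_endian,) = struct.unpack("<I", raw)

print(big_endian)     # 468186269
print(little_endian)  # 2650072859
```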

------
ozh
How about <http://www.tsa.gov/tsa-pre%E2%9C%93%E2%84%A2> ? It's "TSA-Pre✓™"
and you have to type in the checkmark and the trademark symbols, otherwise it
404s

Edit: oh, they fixed that and it's redirected. Too bad, that was funny :)

~~~
ramses0
"I can't imagine the skill required to do this without the experience to know
it's a bad idea" (can't find the source for this quote, but you should get the
sentiment).

...ahhh... found it: "How do you attain the skills required to do this
while not also learning not to?"
[http://news.ycombinator.com/item?id=4711355](http://news.ycombinator.com/item?id=4711355)

~~~
blcknight
The way you end up with URLs like <http://www.tsa.gov/TSA-Pre✓™> is a CMS
that replaces spaces with dashes in the title to make the URL. No skill
required.
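
A sketch of how that could play out, assuming the page title was "TSA Pre✓™" (a guess reconstructed from the URL). `naive_slug` is the behaviour described above; `ascii_slug` is a hypothetical stricter alternative, not anything the TSA site is known to use.

```python
import re
import unicodedata
from urllib.parse import quote


def naive_slug(title: str) -> str:
    """Replace spaces with dashes and leave everything else alone."""
    return title.replace(" ", "-")


def ascii_slug(title: str) -> str:
    """Keep only ASCII letters and digits, joined by single dashes."""
    decomposed = unicodedata.normalize("NFKD", title)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")


title = "TSA Pre✓™"  # assumed page title

# Percent-encoding the naive slug gives the same escapes as the tsa.gov URL.
print(quote(naive_slug(title)))  # TSA-Pre%E2%9C%93%E2%84%A2
print(ascii_slug(title))
```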

~~~
whalesalad
Glad to see they finally fixed that with a redirect to a sane url.

------
glyphobet
It's Latin-1 encoded and then URL-encoded:

    >>> from urllib.parse import unquote
    >>> print(unquote('%1B%E7%F4%9D', encoding='latin-1'))
    çô

------
insteadof
Up until last week, <http://www.tsa.gov/tsa-pre✓™> was the canonical version.
Now it redirects to a version that doesn't require any ALT-key keyboard
jockeying.

------
jakub_g
Can anyone tell the advantages of having unreadable URLs like this? The only
thing I can think of is reducing bandwidth usage through short URLs :)

~~~
Ovid
And reducing bandwidth via a brilliant anti-SEO strategy.

* Useless URLs? Check.

* Eye-gougingly ugly design? Check.

* Densely packed content with tiny font? Check.

And have lots of fun reading the source.

------
lysol
This is nitpicking, but the worst URL scheme is actually smb:

~~~
raverbashing
And the worst mime type is an unfunny one.

------
bdg
My first thought when I see this:

This is a band-aid to try and prevent an XSS or SQLi flaw somewhere on the
site.

------
cmsj
I counter with <http://www.ctshirts.co.uk> just go there and even the front
page transmutes itself into a horrifying URL of doom.

Depending on which subsection of the site you're in, you sometimes also get
|||||||| on the end of the URL.

I actually know someone who works in the e-commerce dept of the company and I
think we should all berate him for how terrible a person he is ;)

~~~
cmsj
I just noticed that %7C is the URL encoded form of..... | so it's always just
full of pipes!

------
msturm
Not sure what it is but the URLs on the Japanese BVB site look better:
<http://www.bvb.jp/>

~~~
homosaur
What the heck, this site is reasonable, attractive, and legible, why is this
not the MAIN SITE?!?

------
bobbo3
Here's a NOT SAFE FOR WORK url.

    
    
        http://www.adultwork.com/ViewProfile.asp?UserID=1945109&Keywords=&KeySearch=1&TargetURL=http%3A%2F%2Fwww%2Eadultwork%2Ecom%2FSearch%2Easp%3FRefreshVar%3D02%252F05%252F2013%2B17%253A11%253A24%26cboCountryID%3D158%26cboCountyID%3D146%26cboAPID%3D0%26rdoRatings%3D0%26cboLastUpdated%3D01%252F01%252F2003%26intAgeFrom%3D25%26intAgeTo%3D33%26DF%3D1%26cboLastLoginSince%3DX%26strSelPostCode%3D%26HotListSearch%3D0%26rdoKeySearch%3D1%26strPostCodeArea%3D%26SearchTab%3DProfile%26cboRegionID%3D11%26question_69%3D%26question_70%3D%26question_2%3D%26question_3%3D%26question_57%3D%26question_27%3D%26question_42%3D%26strKeywords%3D%26intHalfHourRateFrom%3D%26intHalfHourRateTo%3D%26dteAvailableAnotherDay%3D%26hdteToday%3D02%252F05%252F2013%26cbxSelIsEscort%3DON%26strTown%3D%26dteMeetDate%3D%26intMiles%3D%26intMilesUSA%3D%26rdoOrderBy%3D7%26intMeetDuration%3D%26cbxGenderID%3D2%26cboSCID%3D0%26cbxPreferenceID%3D55%26intHourlyRateFrom%3D%26intHourlyRateTo%3D%26intHotListID%3D0%26PageNo%3D1%26SS%3D0%26strSelUsername%3D%26dteMeetTime%3DX%26intMeetPrice%3D%26cboBookingCurrencyID%3D28%26intOvernightRateFrom%3D%26intOvernightRateTo%3D%26strSelZipCode%3D%26CommandID%3D1&NavUserIDs=1768061x1983764x1816348x1873822x1896482x1964251x548903x1052569x1188136x1431008x1780228x1788349x1801647x1475155x1635012x1964725x1995120x1985169x1657721x1678563x1620768x591995x1551539x1579011x1996472x1694586x1198128x1916266x1945109x883257x1097958x273262x1891436x1578047x1797390x1415157x1825574x1935666x1119043x929033x1935510x1957223x1468772x1873269x1494092x1120357x1282956x1284275x1107421x639826

------
joosters
Take THAT, evil web crawlers! Maybe they are trying to get poorly-written
spiders to crash when they hit the site?

~~~
jakub_g
Actually when you google site:www.bvb.de there are some readable URLs (not
sure how they got there), upon navigating to which they do 302 redirects.

~~~
joosters
It's just like tinyurl.com, but in reverse :)

~~~
X4
hahaha :D

Let's put the tech sensation aside. I'm glad to know that the HN Folk have a
good sense of humor :)

BTW: you can add multiple routes pointing to the same url, but allow only the
SEO URLs to be indexed. This keeps the cryptic URLs for the entertainment of
the Users/Crawlers.

~~~
joosters
How would you do this? (Leaving aside the question of 'why?')

I suppose you could try blocking crawlers from the raw URLs with an aggressive
robots.txt and then put a sitemap (with friendly/SEO URLs in) somewhere for
them to discover instead. Would that work?

Paranoid web spiders could flag the site as suspicious, though. Such schemes
might make it seem like the website is presenting one view to the spider, and
another to real visitors. Almost like it was trying to hide malware from a
scanner.
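
The robots.txt half of that idea can at least be checked locally with Python's standard `urllib.robotparser`. The `/raw/` prefix, the example.com host and the sitemap URL are assumptions for the sketch; whether a real search engine would index the site the way you hope is a separate question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block everything under /raw/ and advertise a sitemap
# that would list only the friendly URLs.
ROBOTS_TXT = """\
User-agent: *
Disallow: /raw/
Sitemap: http://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def crawlable(url: str) -> bool:
    """True if a well-behaved crawler may fetch url under the rules above."""
    return parser.can_fetch("*", url)


print(crawlable("http://www.example.com/raw/1B-E7-F4-9D"))   # False
print(crawlable("http://www.example.com/best/watch/casio"))  # True
```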

~~~
X4
You could simply add "index" to the page when accessed with /seo/url and
noindex when accessed with the cryptic url. Additionally you can enforce that
using .htaccess or nginx rules also. Your Framework and HTTP-Router class just
has to support multiple URLs per page.

Basically your CMS or Framework must allow multiple routes like
site.com/best/watch/casio and site.com/→@ðŋ]æ~@¢“«¢“¹²³»«@€^ to link to the
same page.
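
A minimal sketch of that routing table, with the index/noindex decision attached to each route. The paths, the page id and the `robots_meta` helper are hypothetical, and a real framework would have its own router API.

```python
from typing import NamedTuple, Optional


class Route(NamedTuple):
    page_id: str     # which page to render
    indexable: bool  # whether crawlers may index this particular URL


# Two routes for one page: the friendly one is indexable, the cryptic one is not.
ROUTES = {
    "/best/watch/casio": Route("casio-watch", True),
    "/x/9f3a7c": Route("casio-watch", False),
}


def robots_meta(path: str) -> Optional[str]:
    """Return the robots meta tag for path, or None if no route matches."""
    route = ROUTES.get(path)
    if route is None:
        return None
    directive = "index, follow" if route.indexable else "noindex, follow"
    return f'<meta name="robots" content="{directive}">'


print(robots_meta("/best/watch/casio"))  # <meta name="robots" content="index, follow">
print(robots_meta("/x/9f3a7c"))          # <meta name="robots" content="noindex, follow">
```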

I've used that in the past to switch languages dependent on url-path +
browser-language. /en/my-article would show that English article to a German
visitor, but everything else on the site like nav, terms etc. would be German.
To access the English site, the German visitor would have to click the
appropriate flag. I could have easily added the feature to read that same
article in German, by a click on a flag in the bread-crumb's mini drop-down.
Example: blog»my-article[v] a click on [v] would open blog»mein-artikel etc.

------
emidln
A former coworker of mine created django-unfriendly[1], which seems like it's
probably worse. On the other hand, django-unfriendly is meant to obfuscate on
purpose.

[1] - <https://github.com/tomatohater/django-unfriendly>

------
mustermann
Add the one below to the collection: [http://www.wg-gesucht.de/1-zimmer-
wohnungen-in-Berlin.8.1.0....](http://www.wg-gesucht.de/1-zimmer-wohnungen-in-
Berlin.8.1.0.0.html?filter=cbf7a0c31b5135fe440b64aef5ecf43a9530846d0bc2cf61a9)

------
helipad
Look! Another fixed-width, left-aligned German website. Brimming with
nostalgia here.

------
romaniv
Weird characters aside, having URLs of the form example.com?stuff has many
advantages. For one, you don't need any weird magic to get relative URLs
working properly.
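
The relative-URL point can be illustrated with `urllib.parse.urljoin`, which applies the standard resolution rules. The example.com URLs are made up for the sketch.

```python
from urllib.parse import urljoin

# Same site, two URL styles for the current page.
query_style = "http://example.com/?page=about"
path_style = "http://example.com/en/articles/about"

# With query-style URLs every page lives at "/", so relative links always
# resolve against the site root.
print(urljoin(query_style, "?page=contact"))  # http://example.com/?page=contact
print(urljoin(query_style, "style.css"))      # http://example.com/style.css

# With path-style URLs the same relative link resolves against the current
# "directory", so it changes with the depth of the page.
print(urljoin(path_style, "style.css"))       # http://example.com/en/articles/style.css
```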

------
dexcs
Worst website hoster ever: <http://www.fcbayern.telekom.de/de/splash.php>

~~~
sebslomski
Nice try. See you crying in 3 weeks.

------
greggman
What makes this bad? Or rather what makes this objectively bad? I feel like
URL schemes espoused by people who judge them are basically bs. Success of the
website IMO seems like the only objective measure and by that measure pretty
much any URL scheme is fine given the schemes used on some of the most popular
sites use schemes that people like jakub_g complain about

I can see where maybe an API URL might have objective better and worse schemes
but a content URL? Show me the research results, not just fashion opinions

~~~
greggman
It's a legit point and a legit question. Instead of voting it down answer the
question with some objective facts

------
verandaguy
SANITIZE _ALL_ THE SERVER CONTENTS!

------
jnardiello
Hey, I see nothing wrong here.

------
pkhamre
Best football club in the world with the worst URL scheme in the world?

------
ante_annum
http is a perfectly normal url scheme

 _ducks_

------
XarotheOne
Haha that is epic.

EDIT: I apologise for the really bad comment. I have read the rules now and
will only put high quality posts from now on. Thank you.

~~~
endgame
<http://thebestpageintheuniverse.net/c.cgi?u=epic>

~~~
XarotheOne
Actually thanks for that, I didn't realise so many people were against it but
I shall refrain from using the word "epic" in a context that it should be
used.

