

Show HN: Two-faced URLs - ipsin
http://brokenthings.org/

======
ipsin
This is a URL shortener that gives a link that sends "preview" processes one
place and actual users somewhere else. It works for google plus and facebook
and several other "preview" bots. Proof-of-concept because I don't like URL
shorteners in general, but it links to a restricted list of sites.

------
pak
Yet another reason for URL shorteners to be banished from the web (not that
this will ever happen). Besides the arcane input requirements of Twitter and
the semi-retarded behavior of certain email clients, I haven't found a good
use case for them that couldn't have been solved better with proper link
formatting or better content presentation.

When the web is just a sea of opaque pointers to pointers to pointers, with
various hilarity like these URLs mixed in, we'll all wonder why the hell we
stood by and let it happen.

~~~
covati
_I'm biased, since I have run more than one shortener/tracker myself. This
isn't meant to be a rant, more of a defense for the uninformed._

It's a bit naive to think that most websites are referenced by a simple link
to static content. The web has become so complex and urls now reflect that.
Short urls are useful for many people for many different purposes.

For an example of how complex urls are, take a look at SEO'd blog post or
newspaper urls. They have become a string of keyword-rich terms that help to
increase SEO. The content that is served is brought up dynamically and may in
fact change over time. Ads, sidebars, and comments are brought in after the
fact, often based on content or origin of the user or if they have a cookie.
It's complicated - url redirection (which already may be happening in this
flow) doesn't add much to it.

Advocating for "proper link formatting or better content presentation"
reflects the fact that you've never had to dig deep into the complex SEO,
user, customization, and other general business requirements that arise when
running a CMS.

Short urls are valuable because they allow people an ability to simplify the
increasingly long urls for use in many different mediums without the worry of
breaking or losing portions of the url.

Yes, they often do allow the creator to change them, but that's no different
than what any website operator can do natively on their website.

There are problematic situations that can arise, such as when a url shortener
goes down. And this has happened, and it hasn't caused any catastrophic
problems. It's inconvenient, but it often only affects old content on twitter
or facebook. I'm not saying it's ideal, it definitely doesn't add value to the
internet. But it's not _that_ bad.

Regarding the 'arcane' input requirements for twitter or facebook. Somehow I
think you missed the point of these networks. Twitter's 140 characters is part
of what defines them. There will always be short message mediums, they are
there because people have a desire for short bursts of information. They
aren't going away.

I hope that helps to explain why shorteners arose because there was a need,
not just to add complexity or _more evil_ to the internet.

~~~
pak
> Advocating for "proper link formatting or better content presentation"
> reflects the fact that you've never had to dig deep into the complex SEO,
> user, customization, and other general business requirements that arise when
> running a CMS.

Erm. I don't think you understand what I meant. I meant if a URL is too long
to fit within a block of text in your medium, enable hyperlinking it with
short, descriptive link text the way just about any media _other_ than twitter
or plain text (or HN markdown, but I digress) allows. Have we already
regressed from the concept of the well-made hyperlink?

I appreciate that designing URL structure for a site can be difficult. I don't
think it's as hard as you make it out to be, because just about any free CMS
package will take care of SEO'd URLs for you and set up redirects even if you
change your content. Hell, you can customize all of this in Wordpress without
touching code. But that's an orthogonal problem, and shortened URLs neither
contribute to SEO nor constitute a viable long-term URL structure for any
site.

> Short urls are valuable because they allow people an ability to simplify the
> increasingly long urls for use in many different mediums without the worry
> of breaking or losing portions of the url.

This is what I don't understand. OK, sure, URLs get long, but they could be
long before we started putting keywords in them, that's not new. Besides email
or twitter, what media are you possibly talking about where I'm dropping bits
of a URL on the floor? And if bad email clients can't linkify long URLs
properly and twitter won't allow proper link formatting, why are we just
rolling over and enabling their problems?

> Twitter's 140 characters is part of what defines them. There will always be
> short message mediums, ...

They could and should make an exception for URLs. Imagine tweeting:

    URL shortener controversy all over again!
    [Read it on HN|http://news.ycombinator.com/item?id=2734728]

and the URL portion and formatting doesn't count toward the limit, only the
link text part. Visually, that's only 55 characters worth of information, and
that's how the limit should be formulated. You can't tell me that
<http://bit.ly/blsa23848> is more informative or contributes meaningfully to
Twitter's character limit by virtue of it being shorter than my formatted link
_with link text_. In fact, it's _less_ informative, and obscures the link's
true destination without deshortening (and as the OP shows, deshortening is
fallible).
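
To make the proposed counting rule concrete, here is a minimal sketch in Python. The `[link text|URL]` syntax is the hypothetical one from the example above, not anything Twitter supports, and `visible_length` is an illustrative name.

```python
import re

# Hypothetical link syntax from the example above: [link text|URL]
LINK_RE = re.compile(r"\[([^|\]]+)\|([^\]]+)\]")

def visible_length(tweet: str) -> int:
    """Count characters, with each [text|url] link counted as its text only."""
    return len(LINK_RE.sub(r"\1", tweet))

tweet = (
    "URL shortener controversy all over again!\n"
    "[Read it on HN|http://news.ycombinator.com/item?id=2734728]"
)
print(visible_length(tweet))  # 55, counting the line break
```

Under this rule the URL and the bracket/pipe markup cost nothing against the limit; only the text a reader actually sees is counted.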

------
ipsin
It's actually several things, including user agent strings and the HTTP Accept
and Accept-Language headers.

The underlying script allows you to route between n different destinations
based on header matching and netmasks. The shortened URL is just a hash of the
rule set, with a sequence number appended in case of collisions.
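
A rough sketch of how such routing could look, in Python. This is not the actual script: the rule format, the choice of SHA-256, the 8-character code length, the example destinations, and the `192.0.2.0/24` documentation netmask are all assumptions made for illustration.

```python
import hashlib
import ipaddress
import json
import re

# Hypothetical rule set: the first matching rule wins, the last is the default.
RULES = [
    {"header": "User-Agent", "pattern": "facebookexternalhit",
     "dest": "http://example.com/preview"},
    {"netmask": "192.0.2.0/24", "dest": "http://example.com/preview"},
    {"dest": "http://example.com/real"},
]

def route(rules, headers, client_ip):
    """Return the destination of the first rule that matches the request."""
    ip = ipaddress.ip_address(client_ip)
    for rule in rules:
        if "header" in rule:
            value = headers.get(rule["header"], "")
            if not re.search(rule["pattern"], value, re.IGNORECASE):
                continue
        if "netmask" in rule and ip not in ipaddress.ip_network(rule["netmask"]):
            continue
        return rule["dest"]
    return None

def short_code(rules, seq=0, length=8):
    """Hash a canonical encoding of the rule set; append seq after a collision."""
    canonical = json.dumps(rules, sort_keys=True).encode()
    code = hashlib.sha256(canonical).hexdigest()[:length]
    return code if seq == 0 else f"{code}{seq}"

print(route(RULES, {"User-Agent": "facebookexternalhit/1.1"}, "203.0.113.5"))
print(route(RULES, {"User-Agent": "Mozilla/5.0"}, "203.0.113.5"))
print(short_code(RULES))
```

Because the code is derived from the rules themselves, the same rule set always maps to the same short URL, and the sequence number is only needed when two different rule sets collide on the truncated hash.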

~~~
phillco
Clever. I assume the only way for Facebook and Twitter to circumvent this
would be to fake all of the HTTP headers sent by, say, a modern version of
Firefox. I don't think it's cheating - even though the tools are bots, their
job _is_ to preview the content their users are about to see.

Even then, the author could probably find out which IP addresses are used by
Twitter and Facebook's lookup engines and send them the innocuous version.
However, I just did a test on my own server, and Facebook used two different
IPs (69.171.228.246 and 69.171.224.245) for the same request. Pretty big
range.

------
billpg
That's one thing that mildly annoys me about twitter. It clearly knows where a
shortened link is heading because it reveals the full URL on the mouseover.
Given that information, why does it need the shortened link any more?

------
bgarbiak
This actually could be quite useful. Sometimes people need to post a nicely
described link with an adequate thumbnail, yet the Facebook bot finds only
some random image. With this little service one can prepare a dummy page for
illustrative purposes and then let brokenthings.org handle the bot and the
redirection.

------
lawn
I've been waiting for this since url shorteners came to be. Sadly I don't
think shorteners will lose popularity anytime soon and I will continue to
avoid them as I've always done.

------
rmccue
At a guess: HEAD requests give one URL, GET requests give another?

~~~
MattBearman
I think it knows the user agent of specific bots and serves different pages
based on that, as social networks tend to do a full GET request in order to
grab a screenshot.

------
TobbenTM
This is not working for me. Whenever I change the source link, Facebook gives
me no preview.

------
makethetick
Very clever/sneaky!

------
jasondavies
To defend against this, social networking sites should simply cache the link
that the preview bot sees, and send users there directly instead.
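
One way this defense might be sketched, in Python. `PreviewCache` and the `resolve` callback are hypothetical names; a real `resolve` would follow redirects exactly as the preview bot does, and here it is stubbed out so the example runs without network access.

```python
from typing import Callable, Dict

class PreviewCache:
    """Remember where the preview bot landed, keyed by the posted link."""

    def __init__(self, resolve: Callable[[str], str]):
        self._resolve = resolve            # hypothetical: follows redirects as the bot
        self._final: Dict[str, str] = {}

    def preview(self, posted_url: str) -> str:
        """Called when the link is posted; stores the bot's final destination."""
        final = self._resolve(posted_url)
        self._final[posted_url] = final
        return final

    def click_target(self, posted_url: str) -> str:
        """Send users to what was previewed, not back through the posted link."""
        return self._final.get(posted_url, posted_url)

# Stub resolver standing in for the bot's redirect-following fetch.
cache = PreviewCache(lambda url: "http://example.com/preview")
cache.preview("http://short.example/abc")
print(cache.click_target("http://short.example/abc"))
```

The trade-off is that users never pass through the shortener again, so a link that is later changed legitimately would keep pointing at the cached destination.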

------
inportb
Clever use of content negotiation!

------
lbarrow
This is great! Lots of fun.

