

Ask HN: "How-Tos" for Rolling Your Own Short URLs? - daveambrose

I'm interested in working on a small side project to roll my own short URL, similar to how TechCrunch recently unveiled their version at http://tcrn.ch/.

If you know of any detailed How-Tos for creating your own version of a short URL, I'd appreciate it.

(P.S. To keep the question on topic, I'm not interested in debating the merits of whether short URLs promote spam, act as a middleman, etc., as was covered some weeks ago here by joshu.)
======
ivankirigin
I made one for Tipjoy where I mapped a content ID to a short string, and loaded
a frame like the Digg Bar where people could donate to the site and view the
content. For example: <http://tipjoy.com/2w/>

Here is the Python code I used to "shrink" and unshrink the content ID.

    
    
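      def shrink( id ):
          # Encode a non-negative integer ID using the 66 URL-safe characters below.
          validCharacters = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ._-~"
          base = len(validCharacters)
          if id == 0:
              return validCharacters[0]  # otherwise 0 would shrink to an empty string
          characterArray = ""
          d = id
          while d > 0:
              ind = d % base
              characterArray = validCharacters[ind] + characterArray
              d = d // base  # integer division; plain "/" yields a float on Python 3
          return characterArray
      
      def unshrink( code ):
          # Decode a string produced by shrink() back into the integer ID.
          validCharacters = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ._-~"
          base = len(validCharacters)
          val = 0
          revCode = code[::-1]
          power = 0
          for r in revCode:
              i = validCharacters.find(r)
              if i==-1:
                  # a bare "raise" has no active exception to re-raise here
                  raise ValueError("invalid character in code: %r" % r)
              val += i * (base**power)
              power += 1
          return val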
    

If you think this code is bad for some reason, please tell me in the comments.
I spent little time reviewing it when I wrote it about a year ago.

------
zaidf
input.php - input box asking for url. call create.php on form submit.

create.php - generates random string, adds url and random string to urls table

redirect.php - takes random string. finds url associated with it. redirects

.htaccess - catches http://domain.com/<random string> and maps it to
redirect.php
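
For anyone not on PHP, here is a rough sketch of the same flow as a single Python WSGI app. It is only an illustration under some assumptions: the in-memory `urls` dict stands in for the urls table, the `/_create` path, the `url` query parameter and the 6-character lowercase alphabet are all made up for the example, and the input form is left out.

```python
# Minimal sketch of the create/redirect flow; every name here is hypothetical.
import secrets
import string
from urllib.parse import parse_qs

ALPHABET = string.ascii_lowercase + string.digits
urls = {}  # stands in for the "urls" table: random string -> destination URL


def create(url, length=6):
    """create.php equivalent: pick an unused random string and store the URL."""
    while True:
        code = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if code not in urls:
            urls[code] = url
            return code


def app(environ, start_response):
    """WSGI app; the path routing plays the role of the .htaccess rule."""
    path = environ.get("PATH_INFO", "/").lstrip("/")
    # "_" is not in ALPHABET, so this path can never clash with a short code.
    if path == "_create":
        query = parse_qs(environ.get("QUERY_STRING", ""))
        target = query.get("url", [""])[0]
        if not target.startswith(("http://", "https://")):
            start_response("400 Bad Request", [("Content-Type", "text/plain")])
            return [b"expected an http(s) url parameter\n"]
        code = create(target)
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [code.encode("ascii") + b"\n"]
    if path in urls:
        # redirect.php equivalent: send the browser on to the stored URL.
        start_response("301 Moved Permanently", [("Location", urls[path])])
        return [b""]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"unknown short URL\n"]
```

To try it locally, something like `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` from the standard library should be enough; a real deployment would want persistent storage instead of the dict.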

~~~
callmeed
and make sure to redirect with a 301

~~~
zacharydanger
i.e. header($final_destination, true, 301)

~~~
zacharydanger
Wow, I really botched that up.

header("Location: " . $final_destination, true, 301)

In my defense, I generally wrap this call into my own redirect function.

------
simonw
I wrote up my experiences building a vanity short URL service a few weeks ago:
<http://simonwillison.net/2009/Apr/11/revcanonical/> - algorithms and code
included

------
mdakin
I would associate each URL with a sequential number. And express the number
using base-36 (0...9 + a...z). I don't see the need to use obfuscation for
this application but if desired I'd hash the URL + a secret and prepend that
fixed-length hash to the base-36 sequential number. I'd express the hash in
base-36 too.
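
A minimal Python sketch of that idea, for illustration only. The `SECRET` value, the 4-digit `TAG_LEN` and all function names are assumptions, and it uses HMAC-SHA256 as the "hash of URL + secret", which is one reasonable choice rather than the only one.

```python
import hashlib
import hmac

DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"
SECRET = b"change-me"  # hypothetical secret; keep the real one out of source control
TAG_LEN = 4            # hypothetical fixed length of the hash prefix


def to_base36(n):
    """Express a non-negative integer in base 36 (0-9, then a-z)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    out = []
    while n:
        n, r = divmod(n, 36)
        out.append(DIGITS[r])
    return "".join(reversed(out))


def tag_for(url):
    """Fixed-length base-36 tag derived from a keyed hash of the URL."""
    digest = hmac.new(SECRET, url.encode("utf-8"), hashlib.sha256).digest()
    value = int.from_bytes(digest[:8], "big") % 36 ** TAG_LEN
    return to_base36(value).rjust(TAG_LEN, "0")


def make_code(seq, url):
    """Prepend the tag to the base-36 sequential number."""
    return tag_for(url) + to_base36(seq)


def split_code(code):
    """Return (tag, sequential number); the caller still has to compare the
    tag against tag_for() of the URL stored under that number."""
    if len(code) <= TAG_LEN:
        raise ValueError("code too short")
    return code[:TAG_LEN], int(code[TAG_LEN:], 36)
```

Because the prefix has a fixed length, the sequential number can always be recovered by slicing, and a guessed code only resolves if its tag matches the one computed for the stored URL.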

~~~
dagobart
Well, originally, tinyurl just went for sequential short URLs too. But as
those were predictable, abuse was afoot: "Early on, the resulting URL aliases
of the service were predictable, and were exploited by users to create vulgar
associations. The URL aliases dick and cunt were made to redirect to the White
House websites of U.S. Vice President Dick Cheney and Second Lady Lynne
Cheney." Source: <http://en.wikipedia.org/wiki/Tinyurl#Early_abuses>

~~~
mdakin
Interesting... tack that hash on. :) Probably does not need to be a hash
actually. A few random base-36 digits would do it. You'd just need to store
the random digits with the URL in whatever sort of persistent storage system
you're utilizing.
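
Roughly like this in Python, as a sketch under assumptions: the `records` dict stands in for the persistent storage, and `SALT_LEN` and the function names are invented for the example.

```python
import secrets

DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"
SALT_LEN = 3   # hypothetical number of random base-36 digits
records = {}   # sequential number -> (random digits, url); stand-in for real storage


def to_base36(n):
    """Express a non-negative integer in base 36."""
    out = ""
    while True:
        n, r = divmod(n, 36)
        out = DIGITS[r] + out
        if n == 0:
            return out


def add_url(url):
    """Store the URL under the next number and return its short code."""
    seq = len(records) + 1
    salt = "".join(secrets.choice(DIGITS) for _ in range(SALT_LEN))
    records[seq] = (salt, url)
    return salt + to_base36(seq)


def lookup(code):
    """Return the URL only if the random digits match the stored ones."""
    salt, number = code[:SALT_LEN], code[SALT_LEN:]
    try:
        seq = int(number, 36)
    except ValueError:
        return None
    record = records.get(seq)
    if record is None or record[0] != salt:
        return None
    return record[1]
```

With three random digits, someone who knows a valid sequential number still has a 1 in 36**3 chance per guess of hitting the right code, so longer prefixes are worth considering if guessing matters.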

------
ashleyw

        1.to_s(36) #=> "1"
        10.to_s(36) #=> "a"
        100.to_s(36) #=> "2s"
        123456.to_s(36) #=> "2n9c"
    

Ruby. Where the number is the id of the record in the database or something.

~~~
ashleyw
Also forgot to mention, you can do the opposite with:

    
    
        "2n9c".to_i(36) #=> 123456

------
jlees
Create your own URL shortening service with Heroku and Shorty:
<http://brad.posterous.com/create-your-own-url-shortening>

Some useful stuff to go on there even if you're not using Heroku.

------
chacha102
Here is a related question. Is creating a hash of letters and numbers instead
of just using an ID beneficial in any way?

~~~
chacha102
By an ID I simply mean starting at 1 and moving up from there

~~~
lucumo
It prevents people from iterating over all URLs in your database. Whether you
consider that a plus or a minus is mostly up to you.

It also prevents people from easily seeing how many URLs you've shortened.

------
ScottWhigham
You can write the basics of that sort of thing from start to finish in half a
day - easy stuff. You need:

1) Database table to store the "codes" - two "main" columns: Code,
DestinationUrl

2) Random code generator to generate the "Code"

That's really the gist of it.
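
As a sketch of those two pieces in Python with the standard `sqlite3` module: the table name `short_urls`, the 6-character codes and the retry count are assumptions for the example; only the Code and DestinationUrl columns come from the list above.

```python
import secrets
import sqlite3
import string

ALPHABET = string.ascii_letters + string.digits


def open_db(path=":memory:"):
    """Create the table with the two main columns if it does not exist yet."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS short_urls ("
        " Code TEXT PRIMARY KEY,"
        " DestinationUrl TEXT NOT NULL)"
    )
    return db


def random_code(length=6):
    """Random code generator; the length is an arbitrary choice."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def shorten(db, url, attempts=5):
    """Insert the URL under a fresh code, retrying if the code is taken."""
    for _ in range(attempts):
        code = random_code()
        try:
            with db:  # commits on success, rolls back on error
                db.execute(
                    "INSERT INTO short_urls (Code, DestinationUrl) VALUES (?, ?)",
                    (code, url),
                )
            return code
        except sqlite3.IntegrityError:
            continue  # primary key clash: that code already exists
    raise RuntimeError("could not find an unused code")


def resolve(db, code):
    """Look up the destination for a code, or None if it is unknown."""
    row = db.execute(
        "SELECT DestinationUrl FROM short_urls WHERE Code = ?", (code,)
    ).fetchone()
    return row[0] if row else None
```

Letting the primary key reject duplicates keeps the uniqueness check in the database rather than in a separate lookup before each insert.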

~~~
MoeDrippins
Exactly. The "win" for tinyurl.com (etc.) was that someone had the foresight to
see the need at all and do it, not that it was hard to do.

Amazon got patents on less.

~~~
dagobart
Wasn't it services like learn.to, beam.to and similar that came first? With
the intent to have freely pickable shorthand URLs, where the shortness of them
was just a byproduct?

I'm not sure of this, but iirc the learn.to/beam.to etc services were around
in the early web already, mid-90s, while tinyurl and friends came later.

~~~
nimbix
Those services were different because they were meant to provide an
alternative URL for _your_ website, not someone else's. Also, they required
you to go through a signup process and I remember some of them displaying ads.

And of course, there was no Twitter back in those days to create an artificial
need for short URLs.

