Ask HN: how can I generate youtube style id?
4 points by jacktang 2352 days ago | 18 comments
Given a sample URL such as http://www.youtube.com/watch?v=xqvObyU2ITs, is there an available algorithm to generate a YouTube-style video id (xqvObyU2ITs)? Thanks!

-Jack




def generate_code(len = 5)
  chars = ("a".."z").to_a + ("A".."Z").to_a + (0..9).to_a
  (1..len).map { chars[rand(chars.size)] }.join
end

-----


That rocks dude.

-----


Here's a very quick and dirty one for Ruby:

require 'zlib'

base = "whatever"

salt = "whatever"

Zlib.crc32(base + salt).to_s(36)

This will generate 6-7 character strings. Not sure how likely collisions are, but they should be rare enough that a simple check/regenerate should work.
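
For a rough feel for the collision rate: CRC32 has only 2^32 possible outputs, so the birthday approximation applies. A minimal Python sketch (the name collision_probability is my own, and it assumes the CRC outputs behave like uniform random values, which is only approximately true):

```python
import math

def collision_probability(n_ids, space_size):
    """Approximate chance that at least two of n_ids random ids collide.

    Uses the birthday approximation 1 - exp(-n(n-1) / (2N)).
    """
    exponent = -n_ids * (n_ids - 1) / (2 * space_size)
    # expm1 keeps precision when the probability is very small
    return -math.expm1(exponent)

CRC32_SPACE = 2 ** 32  # number of distinct 32-bit CRC values

for n in (1_000, 10_000, 100_000):
    print(n, round(collision_probability(n, CRC32_SPACE), 4))
```

Under that assumption the chance passes 50% at around 77,000 ids, so the check/regenerate step starts to matter once the table grows.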

-----


First thought is that they're generating a globally unique video ID number and base-62 encoding it to keep the URL shorter. Maybe not though?

You could do that easily by just base-62 encoding your table's auto-generated primary key.
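
A minimal Python sketch of that idea, in case it helps. The function names and the digit order (0-9, a-z, A-Z) are my own choices, not anything YouTube is known to use:

```python
import string

# Digit order is an arbitrary choice here: 0-9, then a-z, then A-Z.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)  # 62

def base62_encode(number):
    """Encode a non-negative integer as a base-62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number > 0:
        number, remainder = divmod(number, BASE)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))

def base62_decode(text):
    """Decode a base-62 string back to the integer it represents."""
    number = 0
    for char in text:
        number = number * BASE + ALPHABET.index(char)
    return number

print(base62_encode(125))  # 125 = 2*62 + 1, so this prints "21"
```

Decoding the id in the URL gives back the primary key, so no extra column is needed to look the row up.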

-----


Or, using C with /dev/urandom: http://alan.dipert.org/post/84526522/random-strings-with-c

-----


What's wrong with starting with '0' for the first item and incrementing up from there? Guaranteed uniqueness.

-----


a) hard to do in a distributed fashion without a central coordinator, b) may leak information about your operations to competitors. Of course, OP may not care about these things.

-----


I understand (b), and in fact I considered it a potential weakness, but why does (a) matter if you're selling from a single site? Even if you got multiple simultaneous orders, storing the last license generated in a DBMS should make it multiprocess-safe.

-----


uh, in python:

  import string, random
  ''.join(random.sample(string.ascii_letters + string.digits, 12))
I hope that's what you were asking for. If not, you might want to clarify your question.

-----


In the morning's light, that won't do what you think. random.sample() samples without replacement, so no character will be repeated. Try this instead:

    import random, string
    alphanum = string.ascii_letters + string.digits
    ''.join(random.choice(alphanum) for _ in range(12))
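
If the ids should also be hard to guess, Python 3.6+ has the secrets module, which draws from the OS random source instead of the random module's generator. A small sketch; make_id is a made-up name, and 11 characters is only chosen to match the length of the id in the question:

```python
import secrets
import string

ALPHANUM = string.ascii_letters + string.digits  # 62 symbols

def make_id(length=11):
    """Return a random alphanumeric id; characters may repeat."""
    return "".join(secrets.choice(ALPHANUM) for _ in range(length))

print(make_id())
```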

-----


You're quite right; I should have caught that.

-----


You want to use an ID that can predictably be unique for something like this. You shouldn't use a random string.

-----


What do you mean by "predictably unique"? Do you mean "guaranteed unique"?

As a way to generate unique ids this isn't horrible. 62^12 is about 3.2 × 10^21 values, or roughly 71 bits. That is fewer bits than a random UUID (122), so a collision is more likely than with a UUID, but still very unlikely at modest scale.

Guaranteed uniqueness is preferred, yes. But the level of effort needed to guarantee uniqueness across a large application / dataset / etc is much higher than "unique enough", just as it's a lot more expensive to prove a number is prime than to generate a number that is 99.9999% probably prime.
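
To put rough numbers on "unique enough": a 12-character id over 62 symbols has 62^12 possible values. The sketch below (helper names are mine; it assumes every id is drawn uniformly and independently) computes the equivalent bit count and, via the birthday approximation, how many ids you can issue before the collision chance reaches a chosen threshold:

```python
import math

def id_bits(alphabet_size, length):
    """Bits of entropy in a uniformly random id."""
    return length * math.log2(alphabet_size)

def ids_before_risk(alphabet_size, length, risk):
    """Approximate id count at which P(any collision) reaches `risk`.

    Inverts the birthday approximation p = 1 - exp(-n^2 / (2N)).
    """
    space = alphabet_size ** length
    return math.sqrt(-2 * space * math.log1p(-risk))

# Bits in a 12-character base-62 id
print(round(id_bits(62, 12), 1))
# Ids you can issue before the collision risk reaches one in a million
print(f"{ids_before_risk(62, 12, 1e-6):.2e}")
```

That works out to about 71 bits, against 122 random bits in a version 4 UUID, and on the order of 80 million ids before the collision chance reaches one in a million.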

-----


Well, even at 99.9999%, we should handle the 0.0001% exception ;) Do I understand correctly that if a collision occurs, we just generate the id again?

-----


That would be guaranteeing uniqueness. Don't bother guaranteeing uniqueness. The 99.9999% was an understatement. With 62^12 (about 3.2 × 10^21) possible ids, the chance that any two given ids match is about 3 in 10^22. Those are lower odds than being killed by a meteorite.

-----


:) thanks

-----


Hi, how do I keep the string/id unique? In other words, how do I deal with an id conflict?

-----


It's generating a big (BIG) random number. The odds of a conflict are many billions of times lower than the odds of getting hit by a meteorite.

http://en.wikipedia.org/wiki/UUID#Random_UUID_probability_of...
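
If you would still rather handle a conflict than rely on the odds, the usual pattern is to check and retry. A minimal in-memory sketch; issue_id and the used_ids set are hypothetical stand-ins for a database insert against a unique-indexed column:

```python
import random
import string

ALPHANUM = string.ascii_letters + string.digits

def random_id(length=12):
    """Random alphanumeric id; characters may repeat."""
    return "".join(random.choice(ALPHANUM) for _ in range(length))

def issue_id(used_ids, length=12, max_attempts=5):
    """Return an id not already in used_ids, and record it there."""
    for _ in range(max_attempts):
        candidate = random_id(length)
        if candidate not in used_ids:
            used_ids.add(candidate)
            return candidate
    raise RuntimeError("could not find a free id; consider a longer length")

used = set()
print(issue_id(used))
```

With a real database, let the unique constraint reject the duplicate and retry on that error, so two processes cannot claim the same id between the check and the insert.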

-----



