Hacker News new | past | comments | ask | show | jobs | submit login
Scribd says: Beat our programmers at a coding challenge (scribd.com)
143 points by jennylin on May 24, 2012 | hide | past | favorite | 107 comments

For anyone that is doing CS at Columbia University or considering CU, if you like these types of coding challenges, definitely look into Professor Ross's Programming and Problem Solving class. It was one of the best CS classes I took at Columbia.

Here is his website: http://www.cs.columbia.edu/~kar/teaching.html

Each semester consists of 4 different open ended programming problems like these. You work as a team to compete against other members of the class. There's no tests and each class is run as an open seminar where people talk about their strategy and implementation and consider the best approach to solving these problems.

This was my favorite from my year: http://www.cs.columbia.edu/~kar/4444f02/node18.html

I wish lecture notes and videos were available for this course...

You might want to check out Peter Norvig's CS212 on Udacity. From what I've seen over the past 6 weeks it's an excellent course problem solving and `good taste` in programming.

Yes, that is already on my list, after I complete the compilers class from Coursera. Although the bad side of these kind of courses is that not all of the domains are interesting to everyone, e.g. I almost fell in sleep when Norvig talked about Poker rules :)

I thought I was the only one who found the Poker part boring.

It always blows my mind how many CS academics have these god-awful mid-90s era webpages.

It's worse than that. This webpage presents the information in a clear, readable text format with no unnecessary clutter. Why, it's such an old design that it's similar to what Gutenberg was cranking out on his printing press and to what you can buy today on your Kindle. Imagine that!

They're probably too busy doing Computer Science to bother keeping with the late trends in web design.

I absolutely love it. I think it shows you have reached a certain level of academic distinction when you can get away with a css free homepage, preferably hosted unix style at http://dept.yourinstitution.edu/~yourinitials

It's like when people get to a certain level of wealth they stop sending status signals.

On the game rules page: http://www.scribd.com/jobs/botrace_rules

The game starts off with Apples Bananas and Oranges, but ends with Apples, Bananas and Melons.

Robots are a wise choice for safety's sake when dealing with quantum fruit.

Fixed, thanks :) We'll deploy a new page tomorrow.

I've got a coding challenge for the scribd team: make documents downloads one-click away from the document page and without mandatory login.

You may need to consider a better way to rank players - percentage of games won is problematic (as was discovered during the Tron Google AI challenge) and something like TrueSkill or bayeselo would be more robust.

Yeah, we're seeing the first problems with that right now- newly uploaded bots progress faster than they should because they play against a more "diluted" pool of opponents than the oldtimers did. We'll think of something.

I also think your alpha-beta bot is objectively better than the (currently) #1 bot.

Easiest thing to throw together now is a script which just feeds the whole game history into bayeselo as a bunch of fake chess matches, with the white move advantage set to 0. That's what was done for Tron.


Make every bot play each other once or multiple times.

The more I think about it, the more I like it as a recruiting venue[1]:

* A good way to measure the potential recruit skills

* Gives a good fun experience to go over later at the interview, both verifying it is the recruit who coded it and setting common background beforehand

* You get people who showed some interest in your company, seeing the contest

* You get people who are interested in hacking around puzzles

[1] (I don't know if scribd sees it that way)

Adding to what you said, someone who discovers, attempts and solves this solution is already better than 70% of the programmers out there. They show promise of being motivated to crack a puzzle that robs them of their sleep, and take the time to learn and code against the Scribd API.

Even if they don't "win", they are greatly superior than a normal resume pusher.

Absolutely. We've already used this game for some talking points during interviews.

Smart that they implemented a 10-second time limit, because my first thought was to write a bot that would pass the current game state to Mechanical Turk and have a person solve it. It would be interesting to see how a human fared against Scribd's developers' bots.

I'm pretty sure I was the guinea pig for this system and I have to say, it was one of the more enjoyable interview tasks I have had to perform.

There is just something extremely satisfying about watching your little robot run across a field collecting little fruit.

If we started asking this as an interview question, NOBODY would pass. Time after time we have "senior" developers who fail to write very simple functions, like to count the number of occurrences of each letter in a string.

For me, personally, it sounds like an awesome question, but it's not clear how you would judge it. Let interviewees' bots fight?

Wait...when you say fail are you saying they fail to produce something optimal or fail to produce working code?

They fail to produce working code. I've written about this a number of times.

Most candidates CANNOT write a string reverse function without the off-by-one error.

...Must keep working on Gridspy.... Must not enter fun contest...


Btw. here's a game, played between our two best bots so far:


Seems to be returning an error page at 10:19am EST

Can we use nacl?

We do plan to add some additional languages. Next on our list are: Haskell, Lua, Python.

Sorry to be specific is the sandbox chrome or just a js engine? If it is chrome I could compile and run a nacl module to improve speed.

It's using SpiderMonkey right now. Using nacl as sandbox is a neat idea though.


there's a tiresome amount of dick referencing in the programming world.

(fyi, one half of the team who built the game does not possess a dick)

I don't get it...

Was there some innuendo somewhere that I missed?

Thank you.

Comparing yourself to others is a good way to gain perspective on how far you've come, and more importantly, how much further you have to go to attain some desired level of skill. Taken to extremes predictably ends up hurting more than helping.

How could this be framed better? It's already pretty friendly.

For people that havn't studied algorithms much, where is a good start on game playing algorithms? Pointers to papers, or names of data structures to look up? Perhaps http://erikdemaine.org/papers/AlgGameTheory_GONC3/paper.pdf ?

You should look at the basics of AI and various search algorithms.

The first priority would be to quickly pick up the vocabulary: what is a state, what is a utility function and so on.

I recently took a class on AI and we used the (rather popular) Artificial Intelligence: A Modern Approach by Russell and Norvig [1]. I didn't actually read the book, but I've heard good things about it so it's definitely worth a look.

[1]: http://aima.cs.berkeley.edu/

All the lectures for the course are available online as well[2]. The professor my semester (Dan Klein) was a brilliant lecturer, so the lectures are worth watching if you have the time. The lecture notes[3] are also online.

[2]: http://webcast.berkeley.edu/playlist#c,d,Computer_Science,9C...

[3]: http://inst.eecs.berkeley.edu/~cs188/fa11/lectures.html

Of course, if you want to participate in this game, you are doubtless in a hurry. So you might want to skim through lectures 2, 3, 6, 7, 10, 11 in roughly that order--they seem to be the most pertinent.

Combinatorial game theory would be the wrong tool for this, I believe, since this isn't a strictly finite turn-based combinatorial game. (And in my opinion combinatorial game theory is more useful for analysis and proof techniques than for creating AI algorithms, but I've only had one class on it.) The blog mentioned "fruitwalk-ab" uses minimax (with the alpha-beta optimization of course), which is the bread-and-butter algorithm for these kinds of games and I expected it to be in 1st place. Sure enough, it is at the moment. (Edit2: No longer.)

In an intro machine learning course you'd learn about minimax and others, but skip paying any money and just read the basic algorithms here and look up the wiki pages for even more examples: http://www.stanford.edu/~msirota/soco/blind.html (The introduction has some term definitions.)

Edit: Also, the obligatory plug for Artificial Intelligence: A Modern Approach http://www.amazon.com/Artificial-Intelligence-Modern-Approac...

Thanks for the info, interesting stuff. I've got more first-gen minimax going right now, sorta works but gets stuck in a repeating move pattern. Interesting stuff to work on though, i really appreciate the pointers. Hopefully this weekend i can work out a better heuristic valuation of a game state, for a first run i'm just counting my objects vs theirs(very simple), but i need to somehow weight it such that it prefers picking objects up sooner than later. And of course it would better match if it also took into account groupings, but thats for after i get basic board-clearing working.

Hey, honest question here, is it impossible to brute force this and consider (almost) every possible move and get to a non possible to loose point? Given that the board is restricted to 15x15 and the fruits to 81... I'm not saying this is easy, I'm asking if the game AI has an upper limit so to say...

The instructions say you need to make a move in 10 seconds, but that's the only ceiling I can imagine for this approach.

Ohh, I missed that, that makes more sense now, thought ten seconds is quite a lot, you will very probably not be able to brute force every possibilities before every move. Thanks!

Search space = 4^(15x15) = 2.9 × 10^135


Given that each move must be made in 10 seconds, here's my approach. I'm proposing that you use machine learning to evaluate moves, to approximate the exact (search-based) scoring. I'm not implementing this, but feel free to ask me for more details:

Training phase:

Create a random board configuration. Exhaustively explore the search space to find whether this is a win +1, lose -1, or draw 0 for Player 1. You now have a training example: (board configuration, game outcome)

Now, train a neural network (or other non-linear ML model, e.g. SVM) to predict the expected outcome based upon the board configuration.

Deployment phase:

Port the neural network to Javascript. For each possible move, use the neural network to predict the outcome of that move. Pick the move with highest expected outcome. The neural network will run in constant time, most likely well under 10 seconds per move.

> Create a random board configuration. Exhaustively explore the search space to find whether this is a win +1, lose -1, or draw 0 for Player 1

Given that the search space can grow O(4^mn), this can be done only for endgame configurations. Further, not knowing any bounds on the board size makes the input representation difficult to define for a such machine learning approach. And, your target should probably be the weights of an evaluation function, rather than the exact game outcome.

As for the learning algorithm, I know TD-learning was found to be a good approach in various chess programs.

> For each possible move, use the neural network to predict the outcome of that move. Pick the move with highest expected outcome.

You would likely still want to run an alpha-beta search to pick the move to minimize the prediction error.

Hey, I'm writing a bot for this.

One thing though: I think resetting the board state should call new_game() again. This likely only matters when testing your bot, but it's nice for variables that need to be instantiated per each game.

It's pretty fun! My current greedy solution is called: SoGreedy. :)

Fixed, thanks!

Your bot is doing pretty well already. :)

[Sorry for bug report here, didn't find another way] I think there is a bug with get_my_item_count() and get_opponent_item_count(). These two functions don't return decimal. When players have 0.5, they return 0. This works locally though.

Example of failed match : http://www.scribd.com/job_game/match/295703 This line for example 1 / 1 / 0 / 0 / 1 - first number is the fruit type - second number is the total fruits - third number is my fruits - fourth number is opponent fruit - fifth number is irrelevant (You'll never see 0.5 in the third or fourth number..)

Bot compilation seems to be stuck on this guy: http://www.scribd.com/jobs/botrace_bot/58


So when I upload a bot now, I get a 504 gateway timeout, but it does eventually seem to work. But then my games played resets to 88 instead of 0.

I'm getting a 404 when going to any bot page.

Log into www.scribd.com, that'll make viewing of the bot pages work. We'll deploy a fix for that later today.


I think there is some kind of bug where a bot is stopping. Example : http://www.scribd.com/job_game/match/275573 I simulated two boards where this bug happens online, and it doesn't happen locally.

Also, the API is wrong here : "return the index of that fruit (starting with 0)" => It starts at 1

Thanks! We'll fix the API docs.

About the bug: Try to trace() out a few messages in your make_move() function, that might tell you whether something went wrong (the logs will appear at the bottom of the game replays.)

OK. Will do that thanks !

I was going to post how they should do a clearer job of describing the challenge (i.e. using bullet points, headers) and main rules, but I guess there's a separate webpage devoted to the challenge:


Scribd has Kramm. You have already lost.

I think they need to reevaluate how they score a tie, out of 58 games I have 4 losses and 30 wins and the rest are ties, but my percentage of games won is only 50%ish. Seems like a tie shouldn't be put in the same category as a loss.

I see there's a limit on how long the moving function can run. Is there a similar limit on how long the new_game() function can run?

10 seconds as well.

If would probably make it easier if they had an interactive version, so that we could learn the game before starting to program.

The API is really simple. The simplest version of a bot is:

    function make_move() { return EAST; }
We also provide a downloadable environment so you can test your bot locally.

Why are the fruit counts represented as floats?

You can get half a fruit if you and your opponent pick it up at the exact same time.

They didn't really have a choice in the matter! In JavaScript, all numbers are floats.

Incorrect: http://www.khronos.org/registry/typedarray/specs/latest/

Besides, there's Number.prototype.toFixed().

What are the grid-size and fruit-count limits like on the server -- the same as in the standalone client?

Yep! Width and height are between 5 and 15, and there are up to five different types of fruits on the grid.

This looks like a perfect application for using a genetic algorithm solution with two competitive agents.

Looks like there are permissions problems when emailing botrace@scribd.com

How long will this contest be up? Or will this be ongoing?

We're planning to keep this running indefinitely.

We may also add additional languages in the future, like Haskell and Lua, and possibly some advanced playing modes.

Might I suggest Go? It's already sandboxable, demonstrated by the "Try Go" box right on the front page of golang.org.

Count this as a vote for more languages.

Stingy v1 James Ojaste 1200.0%

What's happened with the ranking?

Seems like the upload a bot link is down...

It should be mentioned that you need to register in order to submit. That was my issue.

However, while my bot was compiling, I moved away from the page. Later on trying to resubmit, I kept on getting errors, until I changed the name of my bot submission.

More helpful error messages would greatly improve the whole process. :)

this is cool. submitted mine, cross-posted to a few places mine's are prefixed with bosky101 :) ~B

can't upload a bot seriously ....

Aw, that sucks. We're still fixing glitches in the system, but uploading actually should work.

What kind of error are you seeing?

I go redirected to this link www.scribd.com/jobs/botrace_upload_bot

which then displays: Oops! Something went wrong. We're working on a fix. Please visit our homepage for some recommended reading to take your mind off of it.

Try creating a new account on www.scribd.com before you go to www.scribd.com/jobs/botrace_upload_bot

I'm getting 504 Gateway Time-out The server didn't respond in time. on http://www.scribd.com/job_game/process_uploaded_bot

Same here when trying to upload my bot the first time.


Uploading bot revisions is broken right now- you can try to upload a new bot, or wait for our fix which should go live in a couple of hours. :)

Also, feel free to contact us at botrace@scribd.com if you see any more issues.

Same here when trying to update the bot.

I can't delete my old bot, and my new bot is compiling indefinitely.

This was so much fun! Thanks scribd. :) Humble Hodor and Greebly Wartfinger gooooo.

compiling indefinitely...

Ah, the new generation of the "design contest."

Thanks, but no thanks. Hire better developers if you need your algorithm.

That doesn't really make any sense. Scribd is trying to trick you into solving an undergrad-level AI problem for them? Sounds unlikely. Designing and writing the game probably took more effort than writing a bot would!

Coincidentally, I really like the game design. I took an AI course recently and we had some really similar using Pacman and Python, and that was fun too, but I would much rather use JavaScript and compete online than use Python and the annoying framework code they gave us.

odd seeing another "tikhon" commenting about scribd.

Tikhon Jr?


I get confused all the time. I thought cardinality[ tikhon ] was 1.

It seems unlikely to me that scribd are actually looking for a solution to this problem, or an isomorphic one. Have you any basis for thinking that?

There are many standard approaches to the general class of two player board games (they are clearly aware of these, with an alpha-beta based bot).

I would guess that what is going to differentiate winners from losers in this specific game are the exact specifics of the setup. So if they are looking for a solution to an isomorphic problem, it must be a very similar problem - as the general type of challenge is well understood.

It just looks like a neat game to me; I doubt its a cleverly disguised problem from 'social reading and publishing' that they are trying to get a cheap solution to..?

I'm with you. It just looks like a lot of fun. My ten year old is really interested in just playing for the sake of play. So we will!

Your comment is totally removed from reality. Did you even read the page? Do you think that Scribd is in the business of picking up fruits from grids?

They are trying to break into the picking fruit from grids business. Hardly anyone is doing it so it is wide open for disruption

Probably worth a billion dollars too!

Wouldn't be so quick to down-vote his comment. I don't think scribd is stealing code or that submissions will be directly applicable, but there's some cross-over between a game like this and searching/crawling a db/docs/network. It's a great way for them to find devs with skills they need.

Finding good devs is essentially their stated intent, and so they're not looking to use the results of the contest for free but rather identify talented people.

Downvotes on the comment recognize the wrongly cynical views of the poster.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact