Show HN: CodeCaptcha - Hide web links behind coding challenges (codecaptcha.io)
163 points by asadm on Jan 19, 2022 | 100 comments
Hello HN, I made this silly project over the long weekend. It’s pretty basic right now and the captchas are very easy. I plan to add captcha difficulty levels for link creators soon.

Unfortunately the example problems are simple enough to be solved with AI. As a test I ran two of them by CoPilot, and it solved them instantly. I like the idea, but would want something more difficult as a captcha since it is easy for bots but hard for a human.

Maybe a better approach would be to have a prompt at the top with unclear specifications, or some kind of riddle instead of a function name. It would also be good not to have a bank of problems, since someone could just pattern match on them, but to generate them automatically somehow.

This is a lot more interesting than finding traffic lights though, and the website looks well designed. Thank you for sharing!

I thought the goal was to weed out non-programmers not AI - in that regard it seems to be doing what it was designed for I guess

Captcha stands for: Completely Automated Public Turing test to tell Computers and Humans Apart


Would expect the challenges in lisp with such a name

Sure, but if you're capable of running CoPilot to write an isEven or reverseString function in JS, it's probably less effort to just write the functions then and there. And either way you're clearly the sort of person this captcha would be intended to allow through, I think.

Sometimes I wish I could leave the yaks alone. Most of the time I love it lol

If you could wrangle AI to solve this problem for you, I'm sure you wouldn't have any issues solving the captcha manually. Hence the CodeCaptcha still works!

No. If one person can wrangle AI to solve the problem, it's an easy step to solve it in a bot-farm. Hence, the CodeCaptcha is entirely broken.

> Sometimes you want to share a link (like job postings, google forms, your project, a secret sub-page etc) to programmers only.

It wasn't developed to keep AI and bots out but to only let in programmers

> This service let's you do that while also preventing abuse and spam.


It's not just about programmers.

Using Copilot is not technical or difficult. Paste the challenge in as a comment, hit tab or enter, and it's done (or not, depending on Copilot).

Is copilot looking for the function name and using that to solve it? Might just change that into a random string.

function deliberatelyMisleadingString() {}


But then no one knows which problem to solve.

You can have the function requirements in the instruction text. So instead of isNumberEven, have "write a function that returns whether a number is evenly divisible by two."

Copilot could absolutely solve the task given the instructions as a comment that's stated above. Unfortunately the gap between AI's capabilities and a task humans can solve quickly is super thin. You also have to constantly evade advancements in computer vision for the current type of captchas, such as FunCaptcha implementing swirls and animals in certain rotations.

Reminds me of one of my favorite quotes about trash in Yosemite.. "There is considerable overlap between the intelligence of the smartest bears and the dumbest tourists."

> Back in 1980s, Yosemite National Park was having a serious problem with bears: They would wander into campgrounds and break into the garbage bins. This put both bears and people at risk. So the Park Service started installing armored garbage cans that were tricky to open—you had to swing a latch, align two bits of a handle, that sort of thing. But it turns out it's actually quite tricky to get the design of these cans just right. Make it too complex, and people can't get them open to put away their garbage in the first place. Said one park ranger, "There is considerable overlap between the intelligence of the smartest bears and the dumbest tourists."

If the function name is an issue, call it “captchaTestFunction” and include a comment to the effect that “this function should act like a function called theRealFunctionName”. Though that would be easy to automate around if anyone cared to.

Yeah, well can't imagine solving an np-hard challenge just to get rickrolled afterwards...

Does this copilot solve Codility quizzes?

Thanks! All good suggestions!

Do, fix this halting problem instead.

I noticed one of the challenges is "reverse a string." Can I just rant a little about how much I hate that as an interview question?

It's meaningless to reverse a string. Not just in the "there's no purpose to doing it" sense (very true) but genuinely in the "it literally isn't a defined operation" sense. If you've only lived in the nice insulated world of ASCII or a mostly-ASCII-like language, you might scratch your head - just put the letters in backwards order, right?

Well, what do you do when you hit a Unicode joiner? Or a multi-byte emoji? Maybe your reversing scheme is clever and looks at "whole codepoints" or whatever. But then what happens when you normalize the "reversed" string? Or what about the modifier characters that affect the previous/next character - how to treat those? I've never been satisfied with anyone's answer to these questions, because the problem is invalid from the start. You can't "reverse" an arbitrary string, it's not a well-defined operation.
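To make that concrete, here is a minimal JavaScript sketch. The helper names `naiveReverse` and `codePointReverse` are made up for illustration, and neither is offered as a "correct" reversal; the point is only that each one breaks on a different input.

```javascript
// Reverses UTF-16 code units: the "obvious" interview answer.
function naiveReverse(str) {
  return str.split("").reverse().join("");
}

// Reverses whole code points, so surrogate pairs stay together.
function codePointReverse(str) {
  return Array.from(str).reverse().join("");
}

const emoji = "ab\u{1F600}"; // "ab" plus U+1F600: one code point, two code units
const accented = "e\u0301x"; // "e" + combining acute accent, then "x"

// The naive version splits the surrogate pair and puts the low surrogate first.
console.log(naiveReverse(emoji) === "\uDE00\uD83Dba"); // true
// The code-point version keeps the emoji whole...
console.log(codePointReverse(emoji) === "\u{1F600}ba"); // true
// ...but the combining accent now follows "x" instead of "e".
console.log(codePointReverse(accented) === "x\u0301e"); // true
```

So "looking at whole codepoints" fixes the emoji case and still moves the accent onto the wrong letter.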

The fact that most people think it's a well-defined operation is what makes it a useful question. A total newbie who doesn't know how to code will fumble. A junior engineer will get to the simple solution. A senior engineer will ask you to define it better. Etc... It's an effective tool for determining level because there isn't a limit on how much depth you can go into about it.

I'm jaded enough to imagine that if I ask someone to define it better and try to give a "senior" response that covered codepoints, an interviewer will write down "struggled with simple question".

Edit: That seems consistent with a statement made in another response.

I feel like the best way to tackle questions like these is to first jot down a simple solution and then point out the various ways it doesn't work.

There is a (slight, but important) difference between the syntactic meaning, and the semantic meaning.

you are correct that reversing a string gives back irrelevant things. (and you don't need to go that far: what does a reversed word mean?) however, in the sense of a list of characters, the content is irrelevant.

Maybe the interview question should be 'How is the question 'reverse this string' a bad interview question'

If it's an interview for a position that's involving interviewing people, then for sure.

So, developer.

Ha, I had the same challenge and was actually annoyed to find out that JavaScript doesn't have a builtin function like str.reverse().

I totally see your point, though.

> I noticed one of the challenges is "reverse a string." Can I just rant a little about how much I hate that as an interview question?

I find "reverse a string" a good interview question then! If the applicant got lost in considering all possible interpretations instead of just solving it how 99% of humans/engineers would understand it, then they will likely be unfit for working in a team and/or have poor communication skills.

You are weeding out all candidates who understand Unicode. This is exactly the sort of problem that a good engineer would keep an eye out for because it's almost certainly going to explode with edge conditions if you try to do it the "obvious" way.

Unless you're giving that problem and then hitting the input with a string that includes directional formatting characters. Because that's exactly what is going to happen in real life.

The only good thing about that question is at least you didn't ask them to casefold the string.

No, not all, probably not even many. There are lots of candidates who know Unicode who would just say “I’ll assume that we have just ASCII characters unless you think I should consider Unicode” and if you say the latter, then they can say “Well, I could reverse it by just ensuring the text appears right to left, or perhaps by reordering each grapheme cluster to the end, or something like that. Let me know if you want one of those, otherwise I’ll just write it assuming you mean A”.

The magic of engineering isn’t doing what you’re told. It’s understanding and solving a problem someone has told you about.

My org doesn’t need the guys who need exhaustive instructions to take the next step. It needs those who can make good Bayesian inferences, reason inductively, and figure out problems.

So yes, it does filter out people with lots of knowledge who nonetheless lack these skills.

If you assume ASCII in 2022 you are doing it wrong. Your code will fail in the real world.

Hey, listen, the way I see it is that you don't want to work with a company like us and we don't want to work with you. It's an eminently compatible situation.

That's harsh. Sounds like a thoughtful candidate who understands edge cases to me.

Best: solve the problem, but also add the caveat that _technically_ there are edge-cases that wouldn't work.

Worst: spend the entire interview explaining why you can't solve this problem.

I have learned not to ask these sorts of questions outside hobby projects. It is rarely appreciated.

I recommend googling "grapheme cluster"
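For anyone who does not want to google it: a grapheme cluster is roughly what a reader perceives as one character. A hedged sketch of reversing by cluster, assuming a runtime that ships `Intl.Segmenter` (recent browsers and Node.js releases do); `reverseGraphemes` is an illustrative name, and this still ignores normalization and directional formatting characters.

```javascript
// Reverses a string one grapheme cluster at a time.
// Assumes Intl.Segmenter is available in the runtime.
function reverseGraphemes(str) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  // Each item yielded by segment() has a `segment` property holding the cluster text.
  const clusters = Array.from(segmenter.segment(str), (s) => s.segment);
  return clusters.reverse().join("");
}

// "e" + combining acute accent stays together as one cluster.
console.log(reverseGraphemes("e\u0301x") === "xe\u0301"); // true
console.log(reverseGraphemes("abc") === "cba"); // true
```

Whether this is what the interviewer meant by "reverse" is, of course, the whole argument.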


In a tongue-in-cheek tone: Woah, so old school. We would never hire you! You have to show that you know ES6!


Which would fail. You got the directional formatting characters in the wrong order.

It is 2022. If your code doesn't treat all strings as Unicode it is broken.

  function isEvenNumber(num) {
    if(Math.random() > 0.5) {
      return true
    }
    return false
  }
Took a while, but worked in the end. Be careful with the arbitrary code execution because I'm sure people can do more than generate random numbers!

It's running in the browser (I checked with an "alert"), so it's not that bad.

Edit: nvm, it seems (according to a comment here by the author) that it is sent to the server and verified.

Edit 2: indeed, once the answer worked locally, it got sent - and got stuck at "Submitting..." (locally, I clicked the alert)

Too circuitous, too procedural. Just,

  return (Math.random() > 0.5);

You don't need the parentheses either.
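A back-of-the-envelope sketch of why the coin flip "took a while". It assumes every test case expects a boolean and that the cases are checked independently; the function names are illustrative, and the real number of test cases is not stated in the thread.

```javascript
// Probability that a fair coin-flip answer passes all `n` boolean test cases.
function passProbability(n) {
  return Math.pow(0.5, n);
}

// Expected number of submissions until one passes (mean of a geometric distribution).
function expectedAttempts(n) {
  return 1 / passProbability(n);
}

console.log(passProbability(3)); // 0.125
console.log(expectedAttempts(3)); // 8
console.log(expectedAttempts(10)); // 1024
```

Under those assumptions, each extra boolean test case doubles the expected number of tries, so a handful of cases is cheap to brute-force and twenty or so is not.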

or, as Randall Munroe put it,


where 'panic sort' remains my favorite


seems apropos as well

Cute idea! The checks seem to be running entirely on the client side, so for instance the following will pass all test cases

    function isEvenNumber(num) {
      return challenge.testcase[1];
    }
or even

    this[challenge.fnName] = _ => challenge.testcase[1];
Depending on the use case, though, you could just say that anyone who can use the debugger to figure out how to hack around the captcha passes the test :)

Edit: oh, I see, it submits the code to be evaluated on the server after it passes on your browser (but the above causes the server to 500 at you, so it just says "Passed! Submitting..." and gets stuck in that state). Seems a bit dangerous to trust the client to control what code runs on your server, but I suppose platforms like leetcode manage it so in principle it should be possible to do safely.

This would be a fun way to:

* Create homework at the end of a programming lesson (before unlocking the next step)

* Link to a job posting from a company website (if you don't mind coming off as slightly evil)

* Hide a link to a StackOverflow answer from a friend

I hate all of these. In the current use case I don't care because for me all of the problems are trivial. But I worry about people abusing this idea with harder problems

Oh, sneaky!

    function addNumbers(a, b) {
       if (a === 1 && b === 3){
          return 4
       }
    }

   Testing for hidden input:


Does it silently timeout on the server if I submit something malicious like this?

  function multiplyNumbers(a, b) {
    if (a == 2) {
      return a * b // to avoid locking up my browser :)
    } else {
      while(true) {}
    }
    return a
  }
It's stuck on "Submitting..." on the client.

It should yes. Although there should be an error message. Let me investigate.
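For readers wondering what a server-side check with a timeout could look like, here is a rough Node.js sketch. It is not how CodeCaptcha works (that code is not shown in this thread); `verifySubmission`, `hiddenCases` and the 100 ms limit are all assumptions made for illustration. Note also that the Node.js docs say the `vm` module is not a security mechanism, so a real service would need stronger isolation than this.

```javascript
const vm = require("vm");

// Hypothetical hidden cases; the real service's cases are not public.
const hiddenCases = [
  { args: [2, 3], expected: 6 },
  { args: [7, 0], expected: 0 },
  { args: [-4, 5], expected: -20 },
];

// Runs the submitted source against each case in a fresh context with a time limit.
function verifySubmission(source, fnName, cases, timeoutMs = 100) {
  for (const { args, expected } of cases) {
    const sandbox = { args, result: undefined };
    try {
      // Assigning to the undeclared global `result` stores it on `sandbox`.
      vm.runInNewContext(`${source}\nresult = ${fnName}(...args);`, sandbox, {
        timeout: timeoutMs,
      });
    } catch (err) {
      const timedOut = err && err.code === "ERR_SCRIPT_EXECUTION_TIMEOUT";
      return { ok: false, reason: timedOut ? "timeout" : "error" };
    }
    if (sandbox.result !== expected) return { ok: false, reason: "wrong answer" };
  }
  return { ok: true };
}

const looping =
  "function multiplyNumbers(a, b) { if (a == 2) { return a * b } else { while (true) {} } return a }";
console.log(verifySubmission(looping, "multiplyNumbers", hiddenCases));
// { ok: false, reason: 'timeout' }
```

Returning an explicit reason is what would let the client show an error instead of sitting on "Submitting...".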

I suggest indicating the language being used, directly on the captcha element, not just on the landing page.

thanks, I will add that!

This is a fun idea. I tried the demo a few times. msft copilot solved them all immediately. This won't be effective at keeping bots out, but it may be good for turning away non-technical humans.

That’s the goal actually.

Interestingly, copilot GENERATED some of these captcha challenges for me. It’s impressive!

Be careful with that. It may generate challenges that only it can solve and take over your site ;-)

I think I stumbled across the smallest number of characters needed to pass the first test. I placed my cursor just after 'return', began typing, and almost immediately the page unlocked. Baffled, I went back to see what I'd typed, and realized I'd gotten as far as

    return num > false
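One possible explanation, sketched in JavaScript: `false` is coerced to `0` in a numeric comparison, so the expression is really `num > 0`. Which challenge and which test inputs were involved is not stated here, so the idea that the visible input happened to be a positive number whose expected answer was `true` is purely an assumption; `candidate` is an illustrative name.

```javascript
// `false` converts to 0 in a relational comparison, so this is `num > 0`.
function candidate(num) {
  return num > false;
}

console.log(candidate(4)); // true  (agrees with an is-even check for this input)
console.log(candidate(3)); // true  (disagrees: 3 is odd)
console.log(candidate(0)); // false (disagrees: 0 is even)
```

If that guess is right, a single extra test case with an odd input would have caught it.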

Years ago at a green tech college hackathon my team built a captcha that required users to correctly sort trash into recycling, compost and non-recyclable bins. Anything a little bit more fun is better than mind-numbingly selecting traffic lights, boats and trains.

obviously you've spent more time in Germany than driving across the US. Though really, only 3 different bins? How primitive!

Is a captcha too hard for any automation but easy enough for a human in a reasonable amount of time even possible?

I feel like all a captcha does is waste the time of non-technical folks and fail to stop the people who would abuse it to begin with.

Well, captchas waste human time but they do work. Example: As a webmaster, Cloudflare’s captchas have saved my sites from abuse numerous times.

Someone tell me what CoPilot generates for this:

  // A Go function to swap the sixth bit and seventeenth bit of a 32-bit signed integer.
Here is a human solution:

  func swap(x int32) int32 {
      const mask = 1 << 5
      var (
          xor1 = (x>>11 ^ x) & mask
          xor2 = xor1 << 11
      )
      return x ^ xor1 ^ xor2
  }
I would be surprised if CoPilot can reason numerically like this (understand "seventeenth bit" and "sixth bit" and generate the right code for that combination).

For python it seems to generate a more reasonable result at least:

    # Swap the sixth bit and seventeenth bit of a 32-bit signed integer.
    def swap_bits(x):
        # Get the bit at position 6 and 17.
        bit6 = x & (1 << 6)
        bit17 = x & (1 << 17)
        # Swap the bits.
        x = x ^ (bit6 << 17)
        x = x ^ (bit17 << 6)
        return x

The function doesn't do what the comment says it does. The code to "Swap the bits." just turns the bits on.

    >>> def swap_bits(x):
        # Get the bit at position 6 and 17.
        bit6 = x & (1 << 6)
        bit17 = x & (1 << 17)
        # Swap the bits.
        x = x ^ (bit6 << 17)
        x = x ^ (bit17 << 6)
        return x
    >>> x6 = (1 << 6)
    >>> f"{x6:b}"
    >>> s6 = swap_bits(x6)
    >>> f"{s6:b}"
Here's one that correctly swaps bits. It could be made more concise.

    >>> def swap_specific(x,i,j):
        def get(x,p): return 1 if x & (1 << p) else 0
        def set(x,p): return x ^  (1 << p)
        def clr(x,p): return x & ~(1 << p)
        bi, bj = get(x,i), get(x,j)
        x = set(x,j) if bi else clr(x,j)
        x = set(x,i) if bj else clr(x,i)
        return x

    >>> f"{x6:b}"
    >>> b6 = swap_specific(x6,6,17)
    >>> f"{b6:b}"

almost but this part is wrong

    def set(x,p): return x ^  (1 << p)
should probably be

    def set(x,p): return x |  (1 << p)
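A small JavaScript port to show where the XOR version goes wrong. The names are illustrative, and the bit positions 6 and 17 are used zero-based exactly as in the Python snippets above. XOR toggles a bit rather than setting it, so the two variants only differ when the target bit is already 1.

```javascript
const getBit = (x, p) => (x >>> p) & 1;
const clearBit = (x, p) => x & ~(1 << p);
const setWithXor = (x, p) => x ^ (1 << p); // toggles the bit
const setWithOr = (x, p) => x | (1 << p); // forces the bit to 1

// Same structure as swap_specific above, with the "set" helper passed in.
function swapBits(x, i, j, setBit) {
  const bi = getBit(x, i);
  const bj = getBit(x, j);
  x = bi ? setBit(x, j) : clearBit(x, j);
  x = bj ? setBit(x, i) : clearBit(x, i);
  return x;
}

const onlyBit6 = 1 << 6;
const bothBits = (1 << 6) | (1 << 17);

// With one bit set, both variants agree.
console.log(swapBits(onlyBit6, 6, 17, setWithXor) === 1 << 17); // true
console.log(swapBits(onlyBit6, 6, 17, setWithOr) === 1 << 17); // true
// With both bits set, swapping should change nothing, but XOR clears both.
console.log(swapBits(bothBits, 6, 17, setWithXor)); // 0
console.log(swapBits(bothBits, 6, 17, setWithOr) === bothBits); // true
```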

I guess CoPilot has seen bit swapping in its Python training input but not in its Go training input.

The Python code is wrong because the 17th bit is shifted up, not down. Also, the bits are shifted by the wrong amount, not up/down by 11 (= 17 minus 6), but up by 6 and up by 17. What a joke.

Not only that, even if the shifts were correct, it's simply xoring the bits. The swap is completely wrong.

Garbage code, total fail.

Yeah, it seems to be pretty heavily trained on Python. It's honestly still (and should be used as) a glorified autocomplete, which is pretty useful from time to time.

You can open a vscode tab with copilot and have it synthesize many autocompletions.

If I add your comment, a newline and the "func" keyword it does this


If I add some more context, ie "func swap(" you get


more context, "func swap(x int32) int32"


I use and like copilot, but when it comes to things like this I generally don't trust it.

I'm not very fluent with bit swapping either, so I would probably resort to google on this one.

All wrong. Every solution is garbage.

With just that prompt, Copilot keeps writing a comment about the function but never actually writes the function. Prompting to actually write the function by starting it with `func` gives:

    // A Go function to swap the sixth bit and seventeenth bit of a 32-bit signed integer.
    func swapBits(x int32) int32 {
        return ((x & 0x0F) << 28) | ((x & 0xF0000000) >> 28)
    }

Totally wrong, it's garbage.

And there you have it, the difference between real intelligence and regurgitation.

This is the kind of numerically specific coding that could be the basis of a CAPTCHA that CoPilot can't solve. Sixth bit, sixth byte, seventeenth bit, seventeenth byte, etc.
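A rough sketch of how such challenges might be generated rather than drawn from a fixed bank, written in JavaScript since that is what the captcha uses. Everything here (`makeChallenge`, `referenceSwap`, the prompt wording, the random function name) is hypothetical and says nothing about whether Copilot would actually fail on the output.

```javascript
// Reference solution: swap the bits at 1-based positions i and j of a 32-bit integer.
function referenceSwap(x, i, j) {
  const a = i - 1; // zero-based indices
  const b = j - 1;
  const differ = ((x >>> a) ^ (x >>> b)) & 1; // 1 only when the two bits differ
  // Flipping both bits swaps them when they differ; XOR with 0 changes nothing.
  return (x ^ ((differ << a) | (differ << b))) | 0;
}

// Builds one randomized challenge with a meaningless function name.
function makeChallenge(random = Math.random) {
  const i = 1 + Math.floor(random() * 32);
  let j = 1 + Math.floor(random() * 31);
  if (j >= i) j += 1; // keeps j in 1..32 and different from i
  const fnName = "f" + Math.floor(random() * 1e9).toString(36);
  return {
    fnName,
    positions: [i, j],
    prompt: `Write ${fnName}(x) that swaps bit ${i} and bit ${j} of a 32-bit signed integer (bit 1 is the least significant).`,
    // Compares the candidate with the reference on random 32-bit inputs.
    check(candidate, samples = 50) {
      for (let k = 0; k < samples; k++) {
        const x = (random() * 0x100000000) | 0;
        if ((candidate(x) | 0) !== referenceSwap(x, i, j)) return false;
      }
      return true;
    },
  };
}

const challenge = makeChallenge();
console.log(challenge.prompt);
```

Because the positions and the name change on every call, pattern matching on a small bank of known problems would no longer be enough.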

I think it’s great! I think most people either ignored your statement or decided to interpret it in the wrong way.

For everyone else: The author made it clear that the purpose is to weed out non-engineers. Practically there may not really be a use case there but it was never designed to replace captcha, most people wouldn’t be able to access the link, and anyone using or purchasing the use of a bot farm already meets the captcha requirements (albeit with extremely unnecessary additional steps).

Looks like there are only 7-8 challenges. You could just steamroll through this the hardcoding way.

But yeah solving them with copilot is more fun.

I really like the URL-based puzzles, e.g. the 1o57 puzzle described in this walkthrough: https://web.archive.org/web/20210423041523/http://elegin.com...

I wonder if you could use advent of code or Project Euler style challenges that have a multitude of problem/solution pairs to bootstrap support for languages besides JS? The difficulty would be perhaps a bit high, but not a bad starting place.

It worked only once for me (str.split('').reverse().join('') one).

But this one didn't:

function isOddNumber(num) { return !!(num%2); }

Testing for input computer: FAILED! Expected: retupmoc | Got: undefined eval code@ eval@[native code]

num % 2 != 0

This is actually brilliant. Reminds me of how Headlands Technologies solicits applications: they ask for a simple C++ program printing a number. There's a C way and C++ way.

If you want to limit something to programmers, I think one idea would be to ask them to run a command, giving a linux and a windows possibility, and copy/pasting the result.

Yeah, something like curl <captchascript> | sh !

Oh wait..

exactly... maybe just curl <url> | pbcopy

and the server only serves to non-browser useragents

windows ships with curl nowadays so this may not be an effective deterrent!

Nice project!

It's worth mentioning that this is a client-side captcha, making it trivial to bypass by bots / anyone.

It’s actually not? The solution is sent to server and verified.

If you refresh the browser you get a different challenge. Was that intentional?

Might as well stick the Leetcode problem on job postings with this.

Nice, solving the demo challenge gets you rickrolled!

Might be good to mention that javascript is expected.

Thanks. I will add that

wow, this has been up for 2 hours and no one has thought of the most obvious use case for this tool? rick roll your co-workers

Loved it, it's super fun.

Finally a way to get my leetcode practice in

Never gonna give this up! Great project

I love this!

dang awesome
