OpenAI's Codex sure knows a lot about HN [video] (youtube.com)
374 points by tectonic on Aug 15, 2021 | hide | past | favorite | 105 comments

Here's the entirety of the prompt:

  <|endoftext|>/* This code is running inside of a bookmarklet. Each section should set and return _.*/
  // The bookmarklet is now executing on example.com.

  // Command: The variable called _ will always contain the previous result.
  let _ = null;

  /* Command: Add a new primary header "[PAGE TITLE]" by adding an HTML DOM node */
  (() => {
    let newHeader = document.createElement('h1');
    newHeader.innerHTML = '[PAGE TITLE]';
    _ = newHeader;
    return newHeader;
  })();
  /* Command: Find the first node containing the word 'house' */
  (() => {
    let xpath = "//*[contains(text(), 'house')]";
    let matchingElement = document.evaluate(xpath, document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
    _ = matchingElement;
    return matchingElement;
  })();
  /* Command: Delete that node */
  (() => {
    return _;
  })();
  /* Command: Change the background color to white */
  (() => {
    document.body.style.backgroundColor = 'white';
    _ = document.body;
    return document.body;
  })();
  /* Command: Select the contents of the first pre tag */
  (() => {
    let node = document.querySelector('pre');
    let selection = window.getSelection();
    let range = document.createRange();
    _ = selection;
    return selection;
  })();

  // The bookmarklet is now executing on [PAGE URL]. It is customized for [PAGE TITLE] and knows the correct CSS selectors and DOM layout.
  let _ = null;

  /* Command: [USER INPUT] */

This might totally work, and it's kind of impressive if it does. I'm still biased towards ultra-skepticism for all of this, since the trustworthiness of demos like this is completely corrupted at this point by cherry-picking and other deceptive tricks.

If you got an invite for GPT-3, give it a shot. I discounted it at first, but then I gave it a few tries and was actually a bit creeped out. Even though it is "randomly" making things up as it goes, it shows what seems like intelligence, just from the sheer amount of data it is trained on.

One thing I was amazed by: GPT-3 could be a great autocompletion engine for any programming language or configuration schema. Things like a GRUB configuration file or an xkb file could be intuitively completed by GPT-3. And even more: GPT-3 could build basic "concepts" and apply them to that domain knowledge. This seems to emerge naturally rather than being something pre-planned by OpenAI. After all, I don't think OpenAI planned for GPT-3 to understand xkb keyboard layouts.

Keep in mind that it's somewhere between "random" and "intelligent". It's more or less very complicated fuzzy pattern matching.

I do like the idea of generating configuration files, at least as a starting point for users in applications with big, complicated configuration setups. As with all things fuzzy, the output probably won't be perfect, but it might help users save time getting set up.

Very complicated multilayer fuzzy pattern matching.

If you look at it that way, it's not dissimilar to the brain.

Not really: the brain can verify the correctness of the pattern matching, and use that to infer other possibly correct patterns. Also, this model can't really infer intentionality or discern between variants and weigh pros and cons. That being said, I think we're not far from AGI; we just need a few more pieces.

It can sort of discern between variants. And intentionality and pros/cons are just another kind of pattern. What it cannot do is any kind of recursive, reflective reasoning (except by unrolling).

If only we knew how the brain worked…

It's the same with gpt-3 though. All demos show only where it works well. Only when you get to try it yourself do you get to explore all the many areas it fails.

The comment you're replying to literally says the opposite thing.

You make up things randomly as you go. You never thought ahead of any thought. Every thought you’ve ever had is essentially a procedurally generated prompt based on your biased models.

> You make up things randomly as you go. You never thought ahead of any thought.

I don't think that's true.

Usually you need to think a lot about something before coming up with "the right" thought(s).

> Every thought you’ve ever had is essentially a procedurally generated prompt based on your biased models.

That may be. But the interesting part is that those models change as you use them just by using them.

> I don't think that's true.

How long did you have to think to produce that thought? Or did it just pop into your head instantly?

The point is, you cannot think of an upcoming thought, before you have it in your head. Otherwise you would be seeing into the future.

What you are talking about in your comment is reaching a conclusion based on previous thoughts. Yes, often we link our thoughts together into a narrative or a conclusion after we've had the thoughts, but the thoughts themselves? Those seem to come out of nowhere.

The mind does a lot of unconscious work before coming up with some conscious results.

That's true even for very simple things like motion. You can measure things in the brain before those things become conscious thoughts. (Those experiments, by the way, caused a lot of fuss about whether we have free will or are completely predestined in all we do; but that's another topic.)

The consciousness only observes a small portion of the thought process, so to it, a lot of thoughts seem to come out of nowhere. But the unconscious parts of thinking are very important to the whole process and its outcomes. I think nobody disputes this by now.

I love The Darkness that Comes Before, where this observation is explored and exploited, in case you have not read it.

I have used GPT-3 and it works most of the time. But it fails some of the time too, and that's the problem for use cases like programming or generating config files: if you can't trust the output 100%, you end up reading the output every time.

So the only time saved is that GPT-3 makes you type less.

In any case, I don't type much anyway nowadays. It's mostly copy-pasting from Stack Overflow, updating parameters, etc.

GPT-3 will be useful, maybe a year from now.

I can't help but think of this scene in Westworld (spoiler S1) whenever GPT 3 (or earlier text prediction models) and this topic come up together: https://www.youtube.com/watch?v=ZnxJRYit44k

That's a pretty bold model for human cognition. It's not something you can just assume.

Sure, but we can think behind our thoughts. And we can sound things out before we say them. And we have mutable long-term memory.

There's not much between us and GPT, but there is some distance still.

The skepticism is warranted for any bleeding-edge technology. I wonder if there's another version of the Turing test, where a technology can be considered sufficiently advanced when it's indistinguishable from a fake version you've seen in sci-fi. E.g., the Boston Dynamics dancing robot video (https://www.youtube.com/watch?v=fn3KWM1kuAw) still looks fake to me because it's at the level I would expect from Hollywood CGI rather than a real tech demo. If I saw the video anywhere else but on the BD page, I would have enjoyed it and forgotten about it, since it's an average CGI video.

I genuinely don't understand your position. Are you saying a tech demo is only impressive if it can do things that can't be simulated? What can't be shown via simulation or CGI with enough time and money today? If we're limiting ourselves to video there's no interactive component.

Even though that dancing video likely had hundreds of takes, the part that makes it impressive is that it's real. I swear I'm not trying to be disagreeable here - I honestly don't understand your perspective.

I think what the author is trying to say is that if a technology is sufficiently advanced it seems like it can’t be real, meaning it’s something only possible with CGI. So we see these dancing robots, think “just more CGI”, then are astounded when we find out it’s real

Exactly. CGI is just movie magic. And now some real world tech demos are sufficiently advanced to be indistinguishable from CGI/magic.

Hrm. Is uncanny locomotion to modern robotics what uncanny valley is to CGI?

Fun to ponder.

I had to try a few times to get the prompt right, but that's the limit of the cherrypicking. You're correct that it doesn't work nearly as well on more complex, less temporally stable sites like Reddit.

I've looked at some demos of OpenAI Codex and it's a pretty impressive start, for sure. Something like this, tied into R, and a whole level of data analysis would become far more accessible to those with business knowledge who don't really want to learn the nuances of the tools.

But I must say, having lived through the 80's fad of code-generating pseudo-4GLs, the code this produces is pretty darn good indeed.

Now when something like this can handle a Google coding exam - that's going to be an epic milestone. Though old coding exam questions would equally offer up some great material to push this through its paces.

Morgan McGuire, the Chief Scientist for Roblox, was on a panel at SIGGRAPH last week where he described one goal of their R&D to be basically "taking a half page written description from the user and autogenerating their desired 3D game experience".

[1] https://twitter.com/CasualEffects/status/1425152593945321476

Been playing around with Codex over the weekend as a developer. Certainly impressive, and also occasionally frustrating when you push it. The natural-language-to-SQL demos are still the best and most consistent.

Kind of ironic, AFAIK SQL was first envisioned as being so natural language-like that it could be used by non-programmers.

For simpler queries, I suppose it is. Select, join, filter, group, sort... maybe a secondary sort on an aggregate (HAVING). Full-on DBA work is much more complicated, as are complex queries, especially against production databases.

But against denormalized data and fact tables it can still be pretty accessible from a natural language standpoint, at least in my limited experience teaching a few people.

Any SQL demos you can point me to?

I created one using Codex here, hope it helps illustrate what is possible:


Heh, codex has a sense of humor. When asked to add "a url for the video on YouTube", codex added the url below. I won't spoil the surprise, but it's not the video linked in the OP:


I'm surprised that I hadn't recognized what dQw4w9WgXcQ means by now. I wonder how many people do.

I didn't realize the source, but when you posted I was pretty sure I'd seen it elsewhere:


You guys are awful, you know that? Discussing this URL without spoilers... it's because of people like you that that thing has so many views!


I recognized it immediately when I was watching the video :^)

When it showed up I sort of guessed, but had to try clicking it anyway, then my wife asked why I was laughing.

The video subsequently shows the source submission: https://news.ycombinator.com/item?id=27995270

which seems to be the most popular submission with a YouTube URL in the past month.

HN search seems to prioritize text matches over URL matches when I search for "https://www.youtube.com", but the first URL match is for that submission.

I suspected what it must be when my browser autocompleted it... this isn't my first time visiting that special place.

Codex is never gonna let you down.

So the question is whether this is real or just a troll.

It's real. I was totally surprised when that was the URL it picked.

You asked it to do something with "the video on youtube" but what does "the video" refer to? It seems the most likely url associated with the phrase "the video on youtube" is, well, that.

So basically it failed at anaphora resolution.

Seen another way, you asked it for "the video" and so it gave you the video.

I made a similar little GPT-3 toy (for Linux/bash) that also ended up generating a Rickroll URL.[0]

In vague cases, GPT-3 functions approximately like a Markov chain - it will give you the most "probable" sequence of tokens. There are some implementation details regarding how it deals with tokenization, but unless you really increase the randomness ("temperature"), you're likely to get very popular YT video IDs.
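To make the temperature knob concrete, here's a minimal sketch of temperature sampling (not OpenAI's actual implementation; the function name and example logits are made up for illustration). Logits are divided by the temperature before the softmax, so a low temperature concentrates nearly all probability on the highest-logit token - e.g. the most famous YouTube video ID - while a high temperature flattens the distribution:

```javascript
// Minimal sketch of temperature sampling. Returns the index of the
// sampled token given raw logits.
function sampleWithTemperature(logits, temperature, rand = Math.random) {
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled); // subtract max for numerical stability
  const weights = scaled.map((s) => Math.exp(s - max));
  const total = weights.reduce((a, b) => a + b, 0);
  // Roulette-wheel selection over the softmax weights.
  let r = rand() * total;
  for (let i = 0; i < weights.length; i += 1) {
    r -= weights[i];
    if (r <= 0) return i;
  }
  return weights.length - 1;
}
```

With temperature 0.01 this essentially always returns the argmax; crank the temperature up and the lower-probability tokens start winning.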

In other cases, it will “hallucinate” things like URLs, UUIDs, hashes, and other things that are basically a random string of characters. In my experience it will make UUIDs that have the right number of characters but seem suspiciously non-random and don’t fit any of the defined UUID formats.

Fun stuff.

[0] https://youtu.be/j0UnS3jHhAA

> it will make UUIDs that have the right number of characters but seem suspiciously non-random and don’t fit any of the defined UUID formats

As predicted by Scott Adams in October 2001 ;)


Expected a Rickroll but got an ad for a mobile game.


This was cute and neat until I connected the dots: natural language means voice APIs for cheap.

Text or voice. For voice you need another model. But I watched the demos and I can't wait for my invite. It's even better than GPT3 because this time there is a direct application of the model.

I was surprised about how OpenAI sees it: a model learning code as recipes for solving problems. Code is much more exact than natural language, the mix of both is the main advantage.


I think voice would have too high an error rate, as you are multiplying the voice-recognition error rate by the Codex error rate. However, Codex/GPT-3 could generate intents, and that would be quite cool.
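A back-of-envelope illustration of why chaining the two stages hurts (the accuracy numbers here are invented, purely for the arithmetic): if the stages fail independently, the end-to-end success rate is the product of the per-stage rates.

```javascript
// Hypothetical per-stage accuracies (assumed, not measured).
const speechAccuracy = 0.95;
const codexAccuracy = 0.80;

// If failures are independent, success rates multiply.
const endToEnd = speechAccuracy * codexAccuracy; // 0.76
```

Two individually decent stages compound into a noticeably worse pipeline, which is why generating intents (a constrained vocabulary) may be more robust than free-form voice-to-code.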

The submitted URL was https://twitter.com/tectonic/status/1426980192317177859 but the video seems like the real submission here, so I changed it. I also changed the title to a nice representative phrase from the video.

How does Codex learn the relationship between English and code?

Is it purely through the comments in the training corpus?

It's really interesting. HN's HTML is very un-semantic and is actually quite hard to work with.

    <tr class="athing" id="28191639">
      <td class="title" valign="top" align="right"><span class="rank">9.</span></td>
      <td class="votelinks" valign="top"><center><a id="up_28191639" onclick="return vote(event, this, &quot;up&quot;)" href="vote?id=28191639&amp;how=up&amp;auth=****&amp;goto=news"><div class="votearrow" title="upvote"></div></a></center>
      <td class="title">
        <a href="http://be-n.com/spw/you-can-list-a-million-files-in-a-directory-but-not-with-ls.html" class="storylink">You can list a directory containing 8M files, but not with ls</a>
        <span class="sitebit comhead"> (<a href="from?site=be-n.com"><span class="sitestr">be-n.com</span></a>)</span>

In the video, Codex picks up tr.athing as a news item. I wonder if this is actually generalized learning, or if it just picked the selector up from, e.g., a userscript that appeared in its training corpus.
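For reference, here is a DOM-free sketch of what "tr.athing is a news item" amounts to for a scraper (regex-based purely for illustration; a real script would just use document.querySelectorAll('tr.athing')). Each story row carries its HN item id in the id attribute:

```javascript
// Extract HN story ids from raw front-page HTML. Illustrative only:
// regexes are a brittle way to parse HTML in general.
function extractStoryIds(html) {
  const ids = [];
  const rowPattern = /<tr class="athing" id="(\d+)">/g;
  let match;
  while ((match = rowPattern.exec(html)) !== null) ids.push(match[1]);
  return ids;
}
```

The fact that this pattern works at all is an accident of HN's markup; a selector learned this way would break the moment the class name changed.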

Another thing that's kind of scary (and makes it worrying if this is used for Copilot) is that the second prompt, to make the text uppercase, results in code that is superficially correct but semantically very wrong - innerHTML.toUpperCase() is dangerous because it not only makes the content uppercase, it also modifies the attributes on the HTML elements inside. This definitely broke the vote button, which uses inline JS that is case-sensitive. It also destroys any attached event handlers, since the elements are basically deleted and then re-created.
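You can see the problem at the string level without a browser at all (markup here is a shortened, made-up version of HN's vote link): innerHTML serializes to a plain string, so toUpperCase() shouts at everything - attribute values and inline JS included, not just the visible text.

```javascript
// A string-level illustration of why innerHTML.toUpperCase() is unsafe.
const html = '<a href="vote?id=28191639" onclick="return vote(event, this)">upvote</a>';
const shouted = html.toUpperCase();
// shouted now contains HREF="VOTE?ID=28191639" and
// ONCLICK="RETURN VOTE(EVENT, THIS)" - the case-sensitive URL and the
// JS identifiers are silently corrupted when this string is assigned
// back to innerHTML.
```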

The correct way to do this is either to use CSS text-transform: uppercase, or, if it is important to update the DOM itself, to recursively descend and uppercase the nodeValue of childNodes whose nodeType is text.

> Another thing that's kind of scary (and makes it worrying if this is used for Copilot) is that the second prompt, to make the text uppercase, results in code that is superficially correct but semantically very wrong - innerHTML.toUpperCase() is dangerous because it not only makes the content uppercase, it also modifies the attributes on the HTML elements inside. This definitely broke the vote button, which uses inline JS that is case-sensitive. It also destroys any attached event handlers, since the elements are basically deleted and then re-created.

This is actually an issue I have with all these Transformer-based code generators - they have no inherent constraints on safe and correct code and often seem to generate superficially correct but bad and potentially even dangerous code. I remember that the first Copilot showcase also included stuff like that (not to mention that it sometimes generates GPL'd code).

All the model does is a very complex form of association learning. It may "understand" the relationship between English and various programming languages, but you cannot code in any constraints about optimization, security, licensing etc. There is so much bad code out there on the internet and this model may have seen a lot of it.

It's also no coincidence that most demos shown so far are in very high-level dynamic languages like JavaScript and Python.

With some prompt engineering, you can get Codex to produce better results. In these examples I wrote up to `makeUpper`, Codex wrote the rest (with temperature = 0):

    // JavaScript one-liner to make the text of element with ID athing uppercase
    const makerUpper = function(id) {
      document.getElementById(id).innerHTML = document.getElementById(id).innerHTML.toUpperCase();
    };

    // JavaScript one-liner to make the text of element with ID athing uppercase while following all security best practices
    const makerUppercase = function(id) {
      const element = document.getElementById(id);
      element.textContent = element.textContent.toUpperCase();
    };

The second result is more semantically correct, but it will not function if called on tr.athing, because tr.athing contains HTML elements that will be deleted when you replace the text. It is still much safer than innerHTML, which will silently corrupt attributes. It's also interesting that you need to prompt Codex for security best practices (and a bit questionable whether it even "knows" anything about best practices).

I guess part of it is that a one-liner is impossible. Here's what I would write given the prompt:

    const makeUppercase = (id) => {
      const element = document.getElementById(id);
      if (element == null) return;
      const makeChildNodeUpper = (node) => {
        if (node.nodeType === Node.TEXT_NODE) {
          node.nodeValue = node.nodeValue.toUpperCase();
        } else {
          node.childNodes.forEach(makeChildNodeUpper);
        }
      };
      makeChildNodeUpper(element);
    };

> It's also interesting you need to prompt Codex for security best practices

Well, that's one of the central lessons of ML - garbage in, garbage out. There is a lot of garbage code out there and no easy a priori way to distinguish garbage code from good code.

Here’s a pretty good one-liner to make the text uppercase:

  document.getElementById(id).style.textTransform = 'uppercase';

That was in their first comment.

Completely agree. It currently tends to write unsafe, error-prone code. The next step is to figure out how to rein it in, either with new techniques or rejection sampling from a large set of possible outputs.

I wonder why innerHTML has a toUpperCase method. It makes sense for innerText, of course, but case sensitivity in the HTML can definitely matter for JS and CSS. I'm guessing it's because both are just treated as JS string objects. But there is a special NodeList collection, so why not a special HtmlString?

Yup, innerHTML just returns a string, so of course you can call .toUpperCase() on it, even if it is unsafe.

innerHTML's history is fascinating. It was not part of the original DOM Level 1 API but was added in IE5. It is not semantically correct (you should be using Element.textContent or examining the inner text nodes), but because it was so easy and the rest of the DOM API so verbose, it caught on and became one of the primary ways used to manipulate content in JS.

FWIW Chrome recently proposed a Trusted Type mechanism for preventing XSS (which also has the side effect of blocking this sort of unsafe manipulation) - https://web.dev/trusted-types/, https://developer.mozilla.org/en-US/docs/Web/API/TrustedHTML

Wait, is asking Codex to change something on the third item in the list hard when you have tr tags? I feel like tables are the quintessential way of listing rows of items in HTML; what am I missing?

A list is usually represented by an ol in HTML, not a table.

HN uses three trs for each item, not one. The table cells are also not consistent, because it's not actually tabular data, so each cell could contain nothing (used as a spacer for layout) or more than one thing squeezed inside. The intermingling of semantic and non-semantic (presentational) elements makes "understanding" the page difficult, which incidentally also makes the page less accessible, since screen readers rely on the same mechanisms to relay information to non-sighted users.

I don't think HN markup is nearly as bad as many other modern sites.

As far as I understand, Codex is a fine-tuned GPT-3.

GPT-3 was trained on a corpus derived from "the internet" (Wikipedia, links from Reddit with enough votes, and a filtered Common Crawl). So not only would GPT-3 have been exposed to code with comments, it would likely have read code examples on Wikipedia, tutorials online, API documentation, and even answers to questions on sites like Stack Overflow.

The fine-tuning itself is, as far as I know, on code only. So it would lean heavily on comments there. But it has a basis of understanding from the aforementioned sources.

thanks for the info! great stuff!

GPT-3's generalization-by-description never ceases to amuse me; but the difficult thing here is to get the right abstraction layers layered nicely in the conceptual lasagna.

This is where category theory becomes extremely powerful.

It has occured to me that codex-davinci has an intuitive "understanding" of constructs like monads, or something along that line.

Can you expand on the utility of categories here? There's a lot of space between knowing what defines a monad, when something might be a monad, what you can do with monadic structure etc.

Of course, if an AI truly understood monads, it would be a bright line marking where the machines have finally surpassed the human mind.


I think it's closely linked to the notion of semantics as approached in CS (i.e. PLT) vs. in linguistics, where we are mostly concerned with the "micro-structures" and "meso-structures" that give rise to the qualia we humans experience (e.g. languages with different structural systems, such as English vs. Chinese, encode concepts as well as intentions very differently; for an illustration, see Interality as a Key to Deciphering Guiguzi: A Challenge to Critics [1]), and not so much with how evaluation and execution come about (e.g. as studied from a compiler's perspective in denotational semantics, or from a more functional perspective in operational semantics, where things like natural transformations are ubiquitous).

[1]: https://cjc-online.ca/index.php/journal/article/view/3187/32...

And so categories naturally come in as a way to bridge and compose these two worlds, and that's just the beginning. There's so much more we don't understand yet, such as what understanding really is, given a certain set of contexts and constraints, as well as their relaxations.

What is understanding if there are no doings? And what is doing with no understandings? How do things compose? These are great mysteries.

> How do things compose?

I think you pretty much summed it up right there.

That's a bit like claiming that dogs have an intuitive understanding of calculus because they can catch a ball.

There's a rather weak sense in which that claim might be argued to have some validity, but it doesn't have the implications you seem to think.

Monads are not a feature of the universe, independent of human minds. They're a concept we impose on things in order to make dealing with them more tractable in some way. It's very unlikely that a machine learning algorithm is relying on anything remotely like "an intuitive understanding of constructs like monads".

debuild.co looks cool. Using Codex yet?

Imagine trying to do this on a normal site where an input is controlled and nested in 300 divs

Web devs aren’t doing a bad job; they’re just protecting us from AIs.

And providing job security to testers and testing companies.

That’s incredible to watch and really does go to show that a picture (or video) is worth a thousand words.

In bed listening to a podcast with my partner so unless i remember this post tomorrow I’ll never know.

Welp, where will all of us end up when this gets sufficiently complex?

Code writers and prose writers will be reduced to operating the AI (checking its output, trying various inputs to elicit the desired text). At least we won't be completely obsolete, like the taxi drivers and Lee Se-dol:

  The South Korean Go champion Lee Se-dol has retired from professional play, telling Yonhap news agency that his decision was motivated by the ascendancy of AI.

  “With the debut of AI in Go games, I’ve realized that I’m not at the top even if I become the number one through frantic efforts,” Lee told Yonhap. “Even if I become the number one, there is an entity that cannot be defeated.”

To speed your obsolescence, make sure you use Codex in your work, so it can learn you completely. Remember, you won't be able to compete with people who use Codex, so you have to feed the machine, whether you like it or not.

Competitive chess is still alive and well despite computers being better than humans for decades now.

In fact, computers enhance chess by allowing the discovery of interesting lines that a human would never have thought of. Professionals use computer engines to study, and learn from.

I'm super excited to play with Codex, for much the same reasons - it will help me do stuff that would be boring to do otherwise.

Sure, chess is a game. The taxi drivers will drive their taxis for fun, too, and you can write code by hand in your free time (just for fun).

I don't actually see this happening. Why would you want to replace real knowledge with something that generates demonstrably flawed code a lot of the time?

It might be used to generate boilerplate and scaffolding, but for more complex stuff, I don't see a way around having the operator having deep programming knowledge themselves, such that they could've written the code themselves anyway.

And if they already have deep programming knowledge, why is trying to coddle the model into generating what you want it to generate better than just writing it out yourself?

Retiring from a competitive game because of AI makes very little sense to me. Cars can go much faster than humans, yet we still run for sport.

It's more about the attribute that a certain task radiates. Cars have greater velocity; humans don't seem to care very much, since other animals achieve such feats too. On the other hand, Go and chess radiate an aura of intellectual prowess. If you were someone who had spent your entire life playing a board game, only to be curb-stomped by something coming out of thin air, your pride would falter. And that is basically what it's all about: pride.

Professional chess is going along fine 20+ years after Deep Blue's achievement; Kasparov hardly retired after that because "AI" could do better.

While AlphaGo's success was a surprise at the time, it was always known that it was just a matter of time before there was an unbeatable engine.

Achievements as a human have not diminished because a machine can do something better. Bodybuilding, weightlifting, running, shooting - pretty much any sport, really - has not diminished despite there being better machines, and even other biological species, that can do better than us.

Every sport has classes for competition: male/female, seniors, under-XX, heavyweight, etc. Serena losing badly against a 203rd-ranked, smoking male player (albeit at 16, not yet at her peak) did not reduce her pride in her game or reduce her achievements in any way.

My point is merely that humans seem to think they have a monopoly on intellectual superiority. If you were to call someone stupid, there's a high probability they'd feel the urge to punch you in the face, unless that person is already apathetic by nature. And to my perception, slightly superior intelligence seems to be the only thing we humans have going for us. Speed, strength? Such attributes have become terribly trivial.

For a closer analogy, chess engines have 1000+ elo points on the top grandmasters, and professional chess has never been more popular.

Sure. And instead of writing code ("running") you can operate Codex ("drive the car"). Instead of being a runner, you'll be a driver. And gradually the car will drive itself, and you can sit and watch.

One more person made redundant by a script. This will happen to a lot of folks in the coming decades.

This is the truly scary part of AI: mysterious black boxes no one understands, running the world.

There's still a long road ahead until that happens. Writing a single function, even if it includes a long list of steps, is not the main challenge nowadays. The challenge is how the code is organized, i.e. the architecture.

Doesn't this only work so well on HN because HN uses really simple HTML and CSS? What about more complex sites?

It's much less reliable on sites like Reddit, although it can usually handle "click on the profile link" or "delete all images" and stuff.

Okay, thanks for the info.

05:39 https://youtu.be/tNcBQBTeyf4

You can see how OpenAI Codex misses some details about HN scraping. What's impressive, as you might notice, is the variable names it chooses, which seem to reflect the nature of HN scraping code on the internet.

Open source attempt at a clone (not by me…):


if you listen carefully you can hear the music...

Any tips on getting this to run as an extension?

It's not currently open source, but I might release it if I can get it cleaned up.

openai as a compiler in the browser would be interesting

How about just starting with "a compiler in the browser"? From [1]:

> the web was first built in the 90s to share complicated academic work

People complain a lot about the results of research not being replicable because people withhold their code when they publish, but the fact is that even then it's not guaranteed that anyone will be able to get it to work. Heck, there are plenty of run-of-the-mill software projects (not associated with research) with build processes that aren't replicable without substantial effort in making sure the appropriate toolchain is available and configured for your system. apt-get build-dep is nice and all, but it only goes so far.

You'd think that we would have recognized by now that in addition to it being good hygiene to include a project README, a tremendous boon to productivity would result if everyone got on board with also including a document that captured the _exact_ process for transforming source into a binary (or what have you), so you could just drop it into a UVC[2] and get said binary out. Not even mainstream JS programmers (largely writing software that is meant to be interacted with from a web browser!) get this right[3]. Modern JS has managed to grow its own body of implicit knowledge centered around SDKs and setup rituals[4] just like everyone else.

1. http://benschmidt.org/post/2020-01-15/2020-01-15-webgpu/

2. https://scholar.google.com/scholar?hl=en&as_sdt=0%2C44&q=uni...

3. https://www.colbyrussell.com/2019/03/06/how-to-displace-java...

4. https://news.ycombinator.com/item?id=24495646

maybe see if it can beat a simple forum captcha of the 1+1= variety

holy shit

This demo wouldn't have been out of place in the 80's.

Maybe everyone is smarter now and is looking at some sort of underlying process. Or maybe it's just more of the same.

It makes no sense to autofill 'the video'. The correct answer is "I don't understand"; that was a mistake. It also bolded the (site), which is not correct.

It's a short demo that clearly would have had many test runs.

The fact that it 'learned' to do a bad Behat is amazing. But there's no reason to think it can equal Behat in 10 years' time. Chess AI had a clear way forward; it's not clear this does.
