Doing more than a human can isn't impressive. Most computer problems for any pur...

chaxor · on May 11, 2023

It does, using this method.

My immediate thought as well was '... Yeah, well vimdiff can do that in milliseconds rather than 22 seconds' - but that's obviously missing the point entirely. Of course, we need to tell people to use the right tool for the job, and that will be more and more important to remind people of now.

However, it's pretty clear that they used this task to show what was done with an example that is very simple to understand. Of course it can do more semantic understanding related tasks, because that's what the model does.

So, without looking at the details we all know that it can summarize full books, give thematic differences between two books, write what a book might be like if a character were swapped in from another book, etc.

If it doesn't do these things (not just badly, but can't at all) I would be surprised. If it does them, but badly, I wouldn't be surprised, but it also wouldn't be mind bending to see it do better than any human at the task as well.

Frost1x · on May 11, 2023

>Of course it can do more semantic understanding related tasks, because that's what the model does.

The problem is that marketing has eroded any such faith. Too often, simple examples are given to the consumer to extrapolate intended functionality from, because then there's no false advertising involved. It's used over and over again in products: the examples are well selected and don't actually give a good representation of the perceived functionality.

As such, I personally can't make the leap to "of course it can do more semantic understanding related tasks", like a diff that's not as simple, one where perhaps a character's overall personality is shifted over the course of the book, not just a single line that defines their profession.

This isn't to say the demonstrative example isn't neat in its own right given what's going on here. It is. I'm just saying I can't make such leaps from examples given by any product. When I work with vendors of traditional software, this happens all the time: people dance around a lack of functionality you obviously want or need in order to make a sale. It's only when you force them to be explicit on the specific cases, especially in writing, that I have any faith at all.

xcv123 · on May 12, 2023

LLMs are specifically designed and trained for semantic understanding related tasks. That's the entire point. They are trained to solve natural language understanding tasks and they build a semantic model in order to solve the tasks.

chaxor · on May 12, 2023

I did say it may do certain tasks with low performance. What you're saying is not really understandable (or simply wrong). The fact that it does understand how to do a 'simple' task such as finding where text is different is actually somewhat impressive given the typical training data for these models. But I suppose you need to actually understand the field of NLP to understand why that is.

If you're expecting perfection or magic then you will be disappointed.

ehnto · on May 12, 2023

I think it's fair to say most people don't know what a diff tool is, but do know how to ask questions. That, I feel, is the democratizing factor that AI is introducing: giving high powered computing to people without the need for specialized knowledge.

chaxor · on May 12, 2023

The real goal would be for the model to determine what tool is needed to achieve the task, and use that to achieve it. Using the model to write code provides a way to get explainability and correctness guarantees (by reading and executing the code, not the model internals). So in this case identifying that a diff program is required may have taken a couple of seconds, and the diff would take milliseconds - still yielding a faster and more correct output (as the model alone may produce an approximate diff or hallucinate).
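
To make that concrete, here is a rough sketch of the kind of code a model might hand off to once it has decided the job is a diff. It uses Python's standard `difflib`; the function name `changed_lines` and the sample text are made up for illustration, and this is not what the demo in question actually ran.

```python
import difflib


def changed_lines(old_text: str, new_text: str) -> list[tuple[str, list[str], list[str]]]:
    """Return (tag, old_lines, new_lines) for every non-matching region.

    Hypothetical helper: tag is one of 'replace', 'delete' or 'insert',
    as reported by difflib.SequenceMatcher.get_opcodes().
    """
    old = old_text.splitlines()
    new = new_text.splitlines()
    # autojunk=False so frequently repeated lines in long texts are not skipped
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # identical regions are not interesting here
        changes.append((tag, old[i1:i2], new[j1:j2]))
    return changes


if __name__ == "__main__":
    # Made-up sample: one line that defines a character's profession is changed.
    before = "Alice is a doctor.\nShe lives in Paris.\nThe end."
    after = "Alice is a lawyer.\nShe lives in Paris.\nThe end."
    for tag, old_part, new_part in changed_lines(before, after):
        print(tag, old_part, "->", new_part)
```

The output is deterministic and you can read the code to see why it is right, which is the explainability part. The model's job shrinks to picking the tool and describing the result.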

EGreg · on May 11, 2023

Of course it can very soon, since those were also written by humans. Like AlphaZero vs Rybka.