Neat! We did something in a similar spirit after having too many beers once: https://github.com/LinusU/monofile
The idea was to be able to mix languages in a single file, including the implementation
I've been working with members of the Qwen team on OpenDevin [1] for about a month now, and I must say they're brilliant and lovely people. Very excited for their success here!
This is neat! Which model are you using in the backend?
We're working on a similar--but general-purpose--LLM-based software agent[1]. General purpose tasks are very hard, and we too are finding that tackling narrow use cases one at a time works much better!
I have an open PR to start moving in this direction [2]
To be honest, at the moment it's just GPT-4 + a meaty system prompt. More of the work goes into parsing the results to interpret what is code, and updating the preview in real time.
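For illustration, result parsing of that kind often boils down to pulling fenced code blocks out of the model's reply. A minimal sketch, assuming fenced-markdown output (the function name and regex here are my own, not the project's actual code):

```python
import re

def extract_code_blocks(text):
    """Pull fenced code blocks (with an optional language tag) out of an LLM reply."""
    # Non-greedy match between ``` fences; DOTALL lets the body span multiple lines.
    pattern = re.compile(r"```(\w+)?\n(.*?)```", re.DOTALL)
    return [(lang or "text", body.strip()) for lang, body in pattern.findall(text)]

reply = "Here is the page:\n```html\n<h1>Hi</h1>\n```\nDone."
blocks = extract_code_blocks(reply)  # [("html", "<h1>Hi</h1>")]
```

In practice a streaming version is trickier, since a fence may still be open while the preview updates.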
I was looking for Hungarian and didn't see it, unfortunately. I learned a smattering of it some twenty years ago in a study abroad. I remember it was difficult, but fun, to learn.
Three things made it difficult, as I remember: one, it is agglutinative, so you get strings of suffixes on the ends of nouns, verbs, etc.; two, its vocabulary has zero overlap with English, or even with the Romance & Germanic ancestors of English; three, vowel harmony takes a bit of practice. None of these is particularly demanding, but they have no equivalents in English.
But a very aesthetically satisfying language once you get the hang of it.
This is indeed a good writeup. Just one small quibble:
> We’ll need to clarify copyright law when it comes to disseminating derivative AI-generated works.
Generated content can be either derivative or transformative, and this distinction is important. It's not automatically derivative, because:
- a model can receive new knowledge and skill demonstrations from the user at test time, which effectively take it out of its initial training distribution (in-context learning)
- a model can draw from multiple sources, performing cross-input analysis such as finding inconsistencies or ranking quality (comparison and cross-referencing)
- a model can learn from experimental feedback, such as running code or a complex simulation to see the outcomes, and iterating over the search space. For example, AlphaTensor discovered an improved matrix-multiplication algorithm (models can discover new knowledge from the environment; they are not restricted to learning from human text)
So models can get new information from users, from cross-input analysis, or from experiment-based learning. In all of these cases they do more than derivative work.
We're very aware that we'll need great agents to be able to compete with Devin and others. We're currently setting up evaluation pipelines to evaluate various agents against SWE-bench.
Our thesis is that a community experimenting with various agents and agent architectures will outpace a private company on a single track. We're building the notion of an "agent hub" out of the gate: anyone can plug into the Agent interface and contribute their work. We're also discussing how to build a meta-agent that farms out specific tasks to sub-agents.
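To give a flavor of what "plug into the Agent interface" could mean, here's a minimal sketch of a pluggable agent registry. All names here are hypothetical, not OpenDevin's actual API:

```python
from abc import ABC, abstractmethod

class Agent(ABC):
    """Hypothetical plug-in interface; illustrative only."""

    @abstractmethod
    def step(self, observation: str) -> str:
        """Given the latest observation, return the agent's next action."""

class EchoAgent(Agent):
    """Trivial contributed agent: just acknowledges the observation."""
    def step(self, observation: str) -> str:
        return f"noop: {observation}"

# Contributors register their agent implementation under a name.
AGENT_REGISTRY = {"echo": EchoAgent}

agent = AGENT_REGISTRY["echo"]()
result = agent.step("tests are failing")
```

The point of the registry pattern is that the evaluation pipeline can loop over every registered agent and run each against the same benchmark tasks.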
It's early days though--we've only just gotten things wired together in a sort-of working demo. Stay tuned!
The thing I'm most proud of is the installation--you just put a single base64-encoded line at the top of your script, and everything works magically.