> “What are you doing?”, asked Minsky.
> “I am training a randomly wired neural net to play Tic-Tac-Toe” Sussman replied.
> “Why is the net wired randomly?”, asked Minsky.
> “I do not want it to have any preconceptions of how to play”, Sussman said.
> Minsky then shut his eyes.
> “Why do you close your eyes?”, Sussman asked his teacher.
> “So that the room will be empty.”
> At that moment, Sussman was enlightened.
There is no "emergent" phenomenon currently in AI. I am thinking that the closest concept to what the author is asking for is a generative model, but even this does not really come close to performing tasks like "discovering" a new programming language. And it certainly would not discover an "emergent" anything. If it did "discover" anything it would be an object very statistically similar to the objects used to construct the model in the first place (e.g., deep fakes).
Everything in current AI is data first (read: data only). You can sometimes synthesize data and use that, but you always start with real-world data, and the models you end up with are parameterizations of the data you trained with. Always. So, while there is a future where an appropriate technical answer can be supplied to this question ... we are as yet not in that future.
I consider this a technically valid answer and not a dismissal of the author's question, btw.
This is a very interesting claim!
Are you making this claim on a purely empirical basis, as in:
(1) "no AI has ever 'discovered' anything notably new/important/useful/interesting/valuable"
or are you instead claiming/suspecting that:
(2) AI could not 'in principle' discover/generate anything falling into the positive categories above?
Suppose we want to learn a "natural" programming language. Training data would be example programs that we believe should be easy to express in any language. Since each of those programs will be expressed in a particular language, we'll need a notion of program equivalence across languages.

As a toy framework, let "language" mean a basis of combinators in pure lambda calculus. This is convenient because we have a (Hilbert-Post-)complete theory of behavioral equivalence among programs (H*; see Barendregt's classic book), and because the combinatory basis problem has been well studied since the 1950s.

Applying machine learning, we can try to "fit a combinatory basis to data" in the sense of finding a finite weighted set of combinators, giving more weight to language primitives with shorter spellings. The loss function is the Kolmogorov complexity of the training data; actually, gradients are better behaved if we use a softmax-Kolmogorov complexity. I used gradient descent to update the weights of existing language primitives, and greedy sparse dictionary learning to propose new primitives. Most of the work was in proving equivalence and approximating Kolmogorov complexity.
It was a cute experiment, but hopelessly far from practical.
 http://fritzo.org/thesis.pdf (2009)
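To make the "softmax-Kolmogorov" idea concrete, here's a minimal sketch (my own toy reconstruction, not the thesis's actual machinery): treat a softmax over primitive weights as a probability distribution, score a corpus by its Shannon code length under that distribution (a smooth stand-in for Kolmogorov complexity), and descend the gradient.

```python
import math

def softmax(weights):
    """Turn primitive weights into a probability distribution."""
    zs = {k: math.exp(w) for k, w in weights.items()}
    total = sum(zs.values())
    return {k: z / total for k, z in zs.items()}

def code_length(program, weights):
    """Shannon code length (in nats) of a token sequence under
    softmax(weights): a smooth stand-in for Kolmogorov complexity."""
    p = softmax(weights)
    return sum(-math.log(p[tok]) for tok in program)

def grad_step(weights, corpus, lr=0.1):
    """One gradient-descent step on the corpus's total code length.
    d/dw_k of -log p_tok is p_k - [k == tok] (softmax cross-entropy)."""
    grads = {k: 0.0 for k in weights}
    p = softmax(weights)
    for prog in corpus:
        for tok in prog:
            for k in weights:
                grads[k] += p[k] - (1.0 if k == tok else 0.0)
    return {k: w - lr * grads[k] for k, w in weights.items()}

# Toy basis: S, K, I plus one candidate composite primitive B,
# and a corpus of three "programs" spelled in that basis.
weights = {"S": 0.0, "K": 0.0, "I": 0.0, "B": 0.0}
corpus = [["B", "S", "K"], ["B", "B", "I"], ["S", "K", "K"]]

before = sum(code_length(prog, weights) for prog in corpus)
weights = grad_step(weights, corpus)
after = sum(code_length(prog, weights) for prog in corpus)  # smaller
```

The step shifts weight toward the primitives the corpus actually uses, which is the sense in which the basis is being "fit to data"; proposing genuinely new primitives (the dictionary-learning part) is the harder piece and is not shown here.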
Our phones should learn a private language with us. My dog learns after one repetition; Zoom should learn to arrange my windows as I like, at least after 47 repetitions.
For the microdose-at-work developers, the trippiest arena would be in the visual realm, where our brains aren't so crippled by convention. Make a tablet/pen drawing app that develops a common language with the user.
That depends. If there are two parties going for Australia, let them eat each other while you go for LatAm. Closer to the end of the game, the 2 Australian armies count for very little against the number of armies you get for winning at least one battle every turn. Pro-tip: hold off on exchanging any sets as long as you can. Have fun!
My next thought is to use a language-agnostic spec and train an AI to create programs that adhere to that spec. OpenAPI for RESTful interfaces would be good.
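A sketch of what "adheres to that spec" could mean operationally (the spec fragment, names, and scoring rule below are hypothetical, just one way to set up a fitness function):

```python
# Minimal OpenAPI-style fragment: required paths -> allowed methods.
spec = {
    "/users": {"get", "post"},
    "/users/{id}": {"get", "delete"},
}

def spec_adherence(candidate_routes, spec):
    """Fraction of (path, method) pairs required by the spec that a
    candidate program's route table actually implements."""
    required = {(path, m) for path, methods in spec.items() for m in methods}
    implemented = set(candidate_routes) & required
    return len(implemented) / len(required)

# A generated program exposing 3 of the 4 required pairs:
candidate = [("/users", "get"), ("/users", "post"), ("/users/{id}", "get")]
score = spec_adherence(candidate, spec)  # 0.75
```

A real version would also exercise the endpoints and validate response schemas, but even this crude coverage score gives a training signal that is independent of the target language.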
IMO, using A.I. to generate programs (or languages) is the ultimate nerd snipe. I, sovietmudkip, have to admit defeat else my own productivity and happiness will suffer.
There is someone else out there whose personal incentives align with engaging in this difficult endeavor. But for me, I measure myself in projects realized, and that holds me back sometimes.
If you decide the language generator itself should be doing the compiling and processing, then congratulations, you are interested in the GPT-3 self-prompting community, and should go read everything Gwern wrote about GPT-3, then join the OpenAI GPT-3 Slack. The short answer is that text transformers output a vector of likely 'next tokens' based on a token input. You can choose from this output vector using whatever rule you like, and feed it back in as input.
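That feedback loop, reduced to a toy (the bigram "model" here is a placeholder for a real transformer, which would return a full logit vector):

```python
def toy_model(tokens):
    """Stand-in for a transformer: score possible next tokens given
    the sequence so far, via a fixed bigram table."""
    table = {"the": {"cat": 2.0, "dog": 1.0},
             "cat": {"sat": 3.0, "ran": 1.0},
             "sat": {"<end>": 1.0},
             "dog": {"<end>": 1.0},
             "ran": {"<end>": 1.0}}
    return table.get(tokens[-1], {"<end>": 1.0})

def generate(prompt, decode=lambda scores: max(scores, key=scores.get)):
    """Autoregressive loop: run the model, apply a decoding rule of
    your choice to its output scores, append, and feed back in."""
    tokens = list(prompt)
    while tokens[-1] != "<end>":
        scores = toy_model(tokens)     # forward pass
        tokens.append(decode(scores))  # pick from the output vector
    return tokens

generate(["the"])  # greedy decoding: the -> cat -> sat -> <end>
```

Swapping `decode` for temperature sampling, top-k, etc. is exactly the "whatever rule you like" part; the model itself never changes.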
If you’re wondering how an AI might ‘talk to itself’ and program its own behavior internally, then you might like papers like this as a starting point: https://www.sciencedirect.com/science/article/pii/S105120041.... Short answer: because of how they are wired, NNs tend to have activation ‘areas’ as they process things, and this is represented as the connections between and weights of data flowing through very large matrices; not a thing that’s super easy for humans to interpret as ‘language’.
I believe OpenAI also is publishing more on interpreting how AIs work and think behind the scenes, so you may want to check their blog / published papers.
Let me explain: Python, for example, is a language specification which has different implementations in different languages, the most popular being CPython (commonly referred to as Python).
So if you fed the existing implementations of Python to an AI and asked it to create a new one in some other language, I think this could very well be a fruitful experiment.
Then how would you train that? Show it the computer languages that have come before? Would it weight them by how widely they are used, to get an idea of what humans prefer? The prospect of some Java-C-COBOL mutant language does seem a logical output, perhaps.
Areas in which I'd like to see AI focus would be optimisation and code auditing, which may well prove easier as you have a solidly defined goal. Such endeavours would also prove invaluable down the line if you wanted to have AI come up with a programming language, since to do that you would need to train on how humans communicate/think and how computers communicate/think, and meet in the middle.
[EDIT - spelongs]
Yes, I think this kind of requirement would lead us to a very interesting specification exercise which would address questions like:
(1) what are the different use cases relevant to identifying 'as yet unaddressed' programming language design issues?
(2) what are the 'success metrics' of a new language design?
(3) what are the 'unresolved shortcomings' of existing languages?
(4) What are the issues related to 'non-human language design'?
This last question is kind of the opposite of 'readability as we know it': you could anticipate a language suited to creating non-human-intelligible code, which might require tools to translate it into human-intelligible form, a bit like the assembly-language to higher-level-language issue, but in reverse.
Not sure it could be done, but it would be interesting to see if an AI could come up with new abstractions that it sees in many different examples... maybe a new type of iteration or assignment.
You'd want to somehow codify the effectiveness of each language: add up stats on the frequency and severity of errors, maybe weighted by time spent either developing or repairing per desired product, and use that as a metric of overhead.
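A back-of-envelope version of that metric (all severity weights and the numbers below are made up for illustration):

```python
# Hypothetical severity weights for classifying observed errors.
SEVERITY_WEIGHT = {"trivial": 1, "moderate": 5, "critical": 25}

def language_overhead(error_counts, dev_hours, repair_hours, features_shipped):
    """Lower is better: severity-weighted error count plus total hours,
    normalized by how much desired product actually got built."""
    error_cost = sum(SEVERITY_WEIGHT[sev] * n for sev, n in error_counts.items())
    return (error_cost + dev_hours + repair_hours) / features_shipped

# Two languages, same feature target, different error/time profiles:
lang_a = language_overhead({"trivial": 10, "critical": 1}, 100, 20, 8)
lang_b = language_overhead({"trivial": 30, "moderate": 4}, 80, 40, 8)
# lang_a < lang_b, despite lang_a's single critical bug
```

Getting real numbers for these inputs across languages is of course the hard part; the formula is the easy part.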
For the language to be useful, it has to be useful to humans, not just AIs.
Otherwise, the only use we could get out of something like that is that an AI could use it to write down a portable version of itself (or anything else), which we could "use" only to store, replicate, and reconstitute whatever the AI wrote.
This does not have to be true in order to translate languages.
But in the case of current programming languages, they all compile to the same standard instruction set architectures following the same basic laws of boolean logic, so it should not be surprising that PLs are translatable. Likely, the model has discovered some basic statistical representation of those things.
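One toy way to make "translatable" concrete: call two implementations equivalent if they agree on every input over some finite domain. Here the "two languages" are just two styles of writing the same function, standing in for two surface syntaxes over the same underlying logic:

```python
def equivalent_on(f, g, domain):
    """Brute-force behavioral equivalence check: do the two
    implementations agree on every input in a finite domain?"""
    return all(f(x) == g(x) for x in domain)

# abs() written two ways: a branch, and a shift/xor bit trick
# (the idiom compilers use to avoid branches on two's-complement ints).
abs_branchy = lambda x: x if x >= 0 else -x
abs_bitwise = lambda x: (x ^ (x >> 31)) - (x >> 31)

equivalent_on(abs_branchy, abs_bitwise, range(-1000, 1000))  # True
```

Exhaustive testing only works on tiny domains, which is why real cross-language claims need a semantic theory (like the H* theory mentioned upthread) rather than enumeration; but it illustrates that "same ISA, same Boolean laws" is what makes the question well-posed at all.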
I'm unclear on what doesn't have to be true?
I intended to observe that a system capable of recognizing idioms in multiple languages, built on the large training sets of those languages' existing code, suggests the possibility of idiom-level combinational synthesis of a novel language. Most easily as a coarse-grained polyglot (combining large chunks of such code), and increasingly speculatively as grain size decreases (e.g., different languages in the same file, or function, or expression), or as language options are pruned (a manticore: representational redundancy is removed as a feature present in multiple languages becomes available in only some, or one, of its forms).
If you try to generate language artifacts, maybe you need to gather the sources of major projects, which large companies can do easily with their internal codebase.
If you want that, use Smalltalk, a Lisp or a Forth.
Why do you want yet another language?
It's interesting stuff, though; maybe you'll make softdev easier :)
Well, once you had developed a tool for automatically generating Domain Specific Languages (presumably by getting the tool to educate itself about the domain in question), you could try to get it to develop a language specific to the blockchain domain (perhaps intending to come up with an alternative to Solidity?).