A note: looking at the code, this isn't a seq2seq Keras model. The core model code is a fork of the base Keras text generation example (https://github.com/fchollet/keras/blob/master/examples/lstm_...), which works like a char-rnn: the previous 80 characters predict the 81st character, and the generated characters are fed back into the model. In the server implementation, the server keeps predicting characters until it hits a break character, at which point it serves the generated characters to the user.
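For anyone curious, here's a minimal sketch of that generation loop, assuming a trained model and character index mappings set up like the lstm_text_generation example; the 80-character window, greedy argmax sampling, and the newline break character are assumptions, not details taken from this repo:

    # Sketch of the char-rnn-style generation loop described above.
    # `model`, `chars`, `char_indices`, `indices_char` are assumed to come
    # from a setup like the Keras lstm_text_generation example.
    import numpy as np

    maxlen = 80  # assumed window size

    def generate_line(model, seed, chars, char_indices, indices_char, max_chars=200):
        generated = ''
        window = seed[-maxlen:]
        for _ in range(max_chars):
            # one-hot encode the current 80-character window
            x = np.zeros((1, maxlen, len(chars)))
            for t, ch in enumerate(window):
                x[0, t, char_indices[ch]] = 1.0
            preds = model.predict(x, verbose=0)[0]
            next_char = indices_char[int(np.argmax(preds))]
            if next_char == '\n':  # break character: stop and serve the line
                break
            generated += next_char
            window = (window + next_char)[-maxlen:]  # feed prediction back in
        return generated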
In a seq2seq implementation, you need to predict all output characters simultaneously (i.e. the model input is all characters in a line and the model output is all characters in the next line), which in Keras involves using a TimeDistributed(Dense()) layer (see the Keras seq2seq example: https://github.com/fchollet/keras/blob/master/examples/addit...). This also requires more sequence ETL and a lot more training time.
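A rough sketch of what that shape looks like, following the pattern in the addition example (the layer sizes and sequence lengths below are placeholders, not anything from this repo):

    # Encode the input line, repeat the encoding once per output timestep,
    # and emit one softmax per output character via TimeDistributed(Dense()).
    from keras.models import Sequential
    from keras.layers import LSTM, RepeatVector, TimeDistributed, Dense, Activation

    INPUT_LEN, OUTPUT_LEN, num_chars = 80, 80, 100  # hypothetical sizes

    model = Sequential()
    model.add(LSTM(128, input_shape=(INPUT_LEN, num_chars)))  # encoder
    model.add(RepeatVector(OUTPUT_LEN))                       # one copy per output char
    model.add(LSTM(128, return_sequences=True))               # decoder
    model.add(TimeDistributed(Dense(num_chars)))              # per-timestep logits
    model.add(Activation('softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam')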
You are totally right: I was mixing it up with another project I'm working on where I am using seq2seq (using only TensorFlow) :) I will update the text of the repo. Thank you!
This reminds me a bit of auto-sklearn[1], which automatically selects the machine learning algorithm to use and its parameters (so it isn't quite doing it at the code level like this one).
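For reference, the usual auto-sklearn usage pattern looks roughly like this; the dataset and the time budget are just placeholders:

    # auto-sklearn searches over algorithms and hyperparameters on its own.
    import autosklearn.classification
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

    automl = autosklearn.classification.AutoSklearnClassifier(
        time_left_for_this_task=120)  # let it search for two minutes
    automl.fit(X_train, y_train)
    print(automl.score(X_test, y_test))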
Yea, I also have a similar project in the making: https://github.com/kootenpv/xtoy . This one uses evolutionary search to find a machine learning model, but mainly focuses on taking any kind of data and coming up with a prediction (missing data, text data, date/time data, etc.). It's a lot of fun :)
> It would be very fun to experiment with a future model in which it will use the python AST and take out variable naming out of the equation.
So what if we used the AST as a source for the code structure? There is also other metadata, such as the filename (e.g. reducer.js), the path (./components), project dependencies (package.json for JavaScript projects), and the number of GitHub stars and forks.
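A toy sketch of the "take variable naming out of the equation" idea using Python's own ast module; purely illustrative, since real preprocessing would need to handle scopes, attributes, imports, and so on:

    # Parse the source and rename user-defined names to generic placeholders
    # so a model sees structure rather than naming. Requires Python 3.9+ for
    # ast.unparse.
    import ast, builtins

    BUILTINS = set(dir(builtins))

    class NormalizeNames(ast.NodeTransformer):
        def __init__(self):
            self.mapping = {}

        def visit_Name(self, node):
            if node.id in BUILTINS:  # leave builtins like print() alone
                return node
            if node.id not in self.mapping:
                self.mapping[node.id] = 'var%d' % len(self.mapping)
            node.id = self.mapping[node.id]
            return node

    source = "total = price * quantity\nprint(total)"
    tree = NormalizeNames().visit(ast.parse(source))
    print(ast.unparse(tree))  # var0 = var1 * var2 / print(var0)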
There was a novel about this: a programmer was coding a robot whose silicon brain was a sphere, as I recall, and he programmed it by just thinking. But his mind wandered and he accidentally hard-coded the Declaration of Independence (I think) into the thing. It ended up being the nav computer on a cargo ship with one operator on board. They crash-landed on a habitable planet and the operator worked with the brain to make it a new body form: a horse. They then ran into, of course, people in a medieval state of history. And the duo from space has to figure out how to survive, which involves acquiring useful things from the villagers, while protecting their true identity. The horse turns out to be quite helpful, but has an occasional glitch.