This is really cool stuff - the network structure reminds me a lot of Graves' MDRNN [1] and Grid LSTM [2], as well as some work I helped with (ReNet [3]).
I wonder if the structure over frequency/time is too "regular" - in general for sound the frequency correlation and the time correlation are on wildly different scales.
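One rough way to check that hunch on a piano roll is to compare the lag-1 correlation along the time axis with the one along the pitch axis. This is only a sketch: `lag_correlation` is a hypothetical helper, and the roll below is synthetic (held notes, independent pitches), so the numbers it gives say nothing about real music; swap in your own data.

```python
import numpy as np

def lag_correlation(roll, lag, axis):
    """Pearson correlation between the roll and a copy of itself shifted by `lag` along `axis`."""
    a = np.moveaxis(roll, axis, 0)
    x = a[:-lag].ravel().astype(float)
    y = a[lag:].ravel().astype(float)
    return float(np.corrcoef(x, y)[0, 1])

# Synthetic piano roll (time steps x pitches), purely illustrative:
# each note is held for `hold` steps, and pitches are independent of each other.
rng = np.random.default_rng(0)
n_steps, n_pitches, hold = 256, 48, 8
onsets = rng.random((n_steps // hold, n_pitches)) < 0.1
roll = np.repeat(onsets, hold, axis=0).astype(np.uint8)

time_corr = lag_correlation(roll, lag=1, axis=0)   # neighbouring time steps
pitch_corr = lag_correlation(roll, lag=1, axis=1)  # neighbouring pitches
print(f"time: {time_corr:.2f}  pitch: {pitch_corr:.2f}")
```

If the two values come out very different on real data, that would be some evidence the two axes want different treatment (different hidden sizes, strides, or context lengths).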
Also, if you are looking to go further, you might reconsider adding NADE or RBM [4] on top, or latent variables in the hiddens [5][6] to add more stochasticity.
There was some alternate work by Kratarth Goel extending RNN-RBM to LSTM and DBN; it might give you some ideas to look at [7]. I know when we messed with bidirectional LSTM + DBN for MIDI generation it led to this kind of "jumbled/dissonant" sound you seem to be having - don't know what to make of it here. You might consider bi-directionality over the notes, though it makes the generation way more annoying.
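To spell out the "annoying" part as I understand it: once each note's probability depends on notes on both sides, you can't sample a time step in one low-to-high pass any more, and you end up with something like Gibbs sampling over the notes. A minimal sketch, where `toy_conditional` is a made-up stand-in for a real bidirectional model (not anyone's actual network):

```python
import numpy as np

def toy_conditional(notes, i):
    """Hypothetical stand-in for a bidirectional model: P(note i is on | all other notes)."""
    left = notes[i - 1] if i > 0 else 0
    right = notes[i + 1] if i < len(notes) - 1 else 0
    logit = -1.0 - 2.0 * (left + right)  # toy rule: discourage adjacent notes
    return 1.0 / (1.0 + np.exp(-logit))

def gibbs_sample_notes(n_notes, n_sweeps, rng):
    """Resample each note given all the others, for several full sweeps."""
    notes = rng.integers(0, 2, size=n_notes)  # random initial on/off pattern
    for _ in range(n_sweeps):
        for i in range(n_notes):
            notes[i] = rng.random() < toy_conditional(notes, i)
    return notes

rng = np.random.default_rng(0)
chord = gibbs_sample_notes(n_notes=12, n_sweeps=20, rng=rng)
print(chord)
```

Compared to a single ancestral pass, this costs `n_sweeps` model evaluations per note per time step, and you have to decide how many sweeps is enough.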
Awesome work! I will definitely be sharing around and checking out your code.
[1] http://arxiv.org/pdf/0705.2011.pdf
[2] http://arxiv.org/abs/1507.01526
[3] http://arxiv.org/abs/1505.00393
[4] http://www-etud.iro.umontreal.ca/~boulanni/ICASSP2013.pdf
[5] http://arxiv.org/abs/1411.7610
[6] http://arxiv.org/abs/1506.02216
[7] http://arxiv.org/pdf/1412.6093.pdf