I am co-writing a paper with a Ph.D. student, who is currently running experiments on other datasets. We are also trying different architectures that combine multiple LSTMs (stacked LSTMs, residual connections with batch normalization, bidirectional LSTMs, and so on).