Have you tried your classifier on a different accelerometer dataset? That would be a good test of generalization.

I am co-writing a paper with a Ph.D. student, and he is currently testing the classifier on other datasets. We are also trying different architectures that combine multiple LSTMs (stacked layers, residual connections with batch normalization, bidirectional LSTMs, and so on).
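
To give a rough idea of what combining these pieces can look like, here is a minimal sketch in PyTorch. It is only an illustration, not the architecture from the paper: the framework, the class names (`ResidualBiLSTMBlock`, `StackedResidualBiLSTM`), and all sizes (9 input channels, 128 time steps, 32 hidden units, 6 classes) are assumptions chosen for the example.

```python
import torch
from torch import nn


class ResidualBiLSTMBlock(nn.Module):
    """One bidirectional LSTM layer followed by batch norm and a skip connection."""

    def __init__(self, hidden_size: int):
        super().__init__()
        if hidden_size % 2 != 0:
            raise ValueError("hidden_size must be even to split across two directions")
        # Each direction gets hidden_size // 2 units, so the concatenated
        # output has the same width as the input and can be added back to it.
        self.lstm = nn.LSTM(hidden_size, hidden_size // 2,
                            batch_first=True, bidirectional=True)
        self.norm = nn.BatchNorm1d(hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.lstm(x)  # (batch, time, hidden_size)
        # BatchNorm1d expects (batch, channels, time), so swap axes around it.
        out = self.norm(out.transpose(1, 2)).transpose(1, 2)
        return x + out  # residual connection


class StackedResidualBiLSTM(nn.Module):
    """Stack of residual bidirectional LSTM blocks with a linear classifier on top."""

    def __init__(self, n_features: int = 9, hidden_size: int = 32,
                 n_blocks: int = 2, n_classes: int = 6):
        super().__init__()
        # Project raw sensor channels to the working width used by every block.
        self.project = nn.Linear(n_features, hidden_size)
        self.blocks = nn.ModuleList(
            [ResidualBiLSTMBlock(hidden_size) for _ in range(n_blocks)]
        )
        self.classify = nn.Linear(hidden_size, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.project(x)  # (batch, time, hidden_size)
        for block in self.blocks:
            h = block(h)
        # Average over time so both LSTM directions contribute equally.
        return self.classify(h.mean(dim=1))  # (batch, n_classes) logits


# Placeholder input: 4 windows of 128 time steps with 9 sensor channels.
model = StackedResidualBiLSTM()
windows = torch.randn(4, 128, 9)
logits = model(windows)
print(logits.shape)  # torch.Size([4, 6])
```

Splitting the hidden units evenly between the two directions keeps the block's output the same width as its input, which is what makes the plain `x + out` skip connection possible without an extra projection.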

