If you were using old-generation instances (t1, m1, c1, etc.), your experience would have been VERY different from what you would get on current-generation instances (c3, c4, m3, t2). Did you try network-optimized instances with (very) low-latency networking?

