As with all scientific papers, this has to be taken with a grain of salt.
The paper makes quite extensive assumptions about how speech is encoded, encrypted and fragmented. For example, the method only works with a VBR (variable bitrate) codec, and that is a deal breaker. Most consumer-grade VoIP is not heavily compressed (64 kbit/s isn't a big deal nowadays) and usually does not use a VBR codec (e.g. G.711, a CBR codec, is used quite often).
So, at least for this method, encrypted VoIP data is usually secure.
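To make the CBR point concrete, here is a rough sketch of why packet sizes carry no information with a codec like G.711. The 20 ms packetization interval and the VBR frame sizes are assumptions made up for illustration, not figures from the paper, and the sketch assumes the encryption adds only a fixed per-packet overhead.

```python
def cbr_payload_bytes(bitrate_bps: int, frame_ms: int) -> int:
    """Payload bytes per packet for a constant-bitrate codec."""
    return bitrate_bps * frame_ms // (8 * 1000)


def distinct_sizes(sizes) -> int:
    """Number of different packet sizes an eavesdropper would observe."""
    return len(set(sizes))


# G.711 runs at 64 kbit/s; 20 ms per packet is an assumed packetization interval.
g711_sizes = [cbr_payload_bytes(64_000, 20) for _ in range(5)]

# Hypothetical VBR payload sizes in bytes, invented for this example only.
vbr_sizes = [38, 62, 20, 74, 46]

print("G.711 sizes:", g711_sizes, "distinct:", distinct_sizes(g711_sizes))
print("VBR sizes:  ", vbr_sizes, "distinct:", distinct_sizes(vbr_sizes))
```

With CBR every packet has the same length, so the sequence of sizes tells an observer nothing about what was said; only a VBR stream produces the varying lengths the method relies on.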