Speech recognition would probably be best fed in the same way: find neutral-sounding speeches for which transcripts exist.

The best sources would be parliamentary speeches with transcripts and the closed captioning for national news programs. The main constraints are storage space and computational power.
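To get a rough feel for the storage side, here is a back-of-the-envelope sketch. The audio format (16 kHz, 16-bit, mono, uncompressed) and the corpus sizes are purely my assumptions for illustration, not figures from any real dataset, and the function name is made up.

```python
def raw_audio_bytes(hours, sample_rate_hz=16_000, bytes_per_sample=2, channels=1):
    """Size of uncompressed PCM audio in bytes.

    The defaults (16 kHz, 16-bit, mono) are an assumed format, chosen only
    to make the arithmetic concrete.
    """
    seconds = hours * 3600
    return int(seconds * sample_rate_hz * bytes_per_sample * channels)


def to_gib(num_bytes):
    """Convert bytes to gibibytes (2**30 bytes)."""
    return num_bytes / 2**30


if __name__ == "__main__":
    # Hypothetical corpus sizes, in hours of recorded speech.
    for hours in (100, 1_000, 10_000):
        size = raw_audio_bytes(hours)
        print(f"{hours:>6} h of audio: about {to_gib(size):.1f} GiB uncompressed")
```

Under those assumptions one hour of speech is 115,200,000 bytes (about 110 MiB), so storage grows linearly with hours collected. The transcripts themselves are tiny by comparison, and compressed audio would take less space, so this is closer to an upper bound than a prediction.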

