Structure is an "editing problem", but once you go beyond a fairly straight interview or talk show format, the amount of post-production work increases dramatically. With the help of some automation, I can edit a 20-minute interview in an hour or so. Structuring a 1-hour show from different clips, inserting various breaks and so on, would probably add a good day to the whole process.
When I say it's an editing problem, I mean it's not something you can easily change as a consumer. On the other hand, cutting out silences and speeding up slow speakers are things you can fix 'in post' yourself using a generic podcast app.