Lars here, the guy who gets the honorable mention at the end of the post "for brainstorming Redshift performance" with Austin (the author of the post) :-)
I would be interested to know what their monthly Redshift bill is. The work they’ve done is really impressive, I’m just wondering if the cost savings justify all the time they’ve invested. Sometimes the right answer in these situations is just to throw more CPUs at the problem.
The problems they solved here are vanilla optimization for Redshift. Adding sort/dist keys on tables and pre-aggregating immutable data is stuff you're going to have to do at some point; throwing more CPU at it only helps so much.
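For anyone who hasn't done this before, here's a minimal sketch of both techniques in Redshift SQL. The table and column names (clicks, clicks_daily, user_id, clicked_at) are hypothetical, not from the post:

    -- Hypothetical event table: DISTKEY co-locates rows for joins,
    -- SORTKEY lets the planner skip blocks on time-range filters.
    CREATE TABLE clicks (
        user_id    BIGINT,
        clicked_at TIMESTAMP,
        url        VARCHAR(2048)
    )
    DISTKEY (user_id)
    SORTKEY (clicked_at);

    -- Pre-aggregate immutable history into a daily rollup so that
    -- dashboards scan the small table instead of the raw events.
    CREATE TABLE clicks_daily
    DISTKEY (user_id)
    SORTKEY (day)
    AS
    SELECT user_id,
           DATE_TRUNC('day', clicked_at) AS day,
           COUNT(*) AS click_count
    FROM clicks
    GROUP BY 1, 2;

The rollup only works because yesterday's events never change; you'd re-run it incrementally for new days rather than rebuilding the whole table.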
I'd like to know the actual footprint of their data. They mention some of their tables have "infinite rows", yet from their screenshots, the largest query is on "link_web_production.exit_link", scanning 3.9 million rows.
If you care to dig a little deeper into the things we discussed, we've written them up in a longer blog post:
https://www.intermix.io/blog/top-14-performance-tuning-techn...