
For anyone setting up a similar system, our product https://fivetran.com automates the most annoying part of this: getting the data into Redshift! We support MySQL, Postgres, Salesforce, and lots of other data sources.



Yeah, but is that really the most annoying part? Dimensional modeling, getting all the joins right, and scripting all the transforms (the T in ETL) are pretty annoying too. Getting data into Redshift is just the beginning, and it skips the transform part entirely.


Sorry if I'm getting this mixed up.

Snowplow is shown on your pricing page. Doesn't Snowplow already connect directly with Redshift? Or was Fivetran built before Snowplow finished their new setup?


Yes, this is a common point of confusion. There are three options:

1. Run your own Snowplow collector that writes to your Redshift.
2. Use Snowplow Inc's hosted collector that writes into your Redshift.
3. Use Fivetran's Snowplow collector that writes to your Redshift.

We have customers running each of these configurations. With options 1 and 2, they're just using us to sync other data into the same Redshift cluster that their Snowplow collector writes to.


Side note - you have caching disabled on the HTML, CSS, and images of the site. Any reason why?


How do you deal with events with varying schema?


Assume that any one event type has a consistent schema and create a separate table with the custom properties for each event type.
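
To make that concrete, here is a rough sketch of the idea in Python, not how any particular product does it. The event field names (`event_id`, `type`, `properties`), the `event_` table prefix, and the blanket `VARCHAR(256)` column type are all assumptions made up for illustration; a real loader would infer proper column types.

```python
from collections import defaultdict


def split_events_by_type(events):
    """Group events by type so each type can be loaded into its own table.

    Each row keeps the event id plus that event's custom properties.
    The keys "event_id", "type" and "properties" are hypothetical.
    """
    tables = defaultdict(list)
    for event in events:
        row = {"event_id": event["event_id"], **event.get("properties", {})}
        tables[event["type"]].append(row)
    return dict(tables)


def create_table_sql(event_type, rows):
    """Build a CREATE TABLE statement for one event type.

    Columns are taken from the first row only, which leans on the
    assumption that one event type always has the same schema.
    Every column is VARCHAR(256) purely to keep the sketch short.
    """
    columns = list(rows[0].keys())
    col_defs = ", ".join(f'"{name}" VARCHAR(256)' for name in columns)
    return f'CREATE TABLE "event_{event_type}" ({col_defs});'


if __name__ == "__main__":
    sample_events = [
        {"event_id": 1, "type": "page_view", "properties": {"url": "/home"}},
        {"event_id": 2, "type": "purchase", "properties": {"sku": "A1", "amount": 9.99}},
        {"event_id": 3, "type": "page_view", "properties": {"url": "/pricing"}},
    ]
    for event_type, rows in split_events_by_type(sample_events).items():
        print(create_table_sql(event_type, rows))
```

If an event type's schema does change over time, the first-row shortcut breaks, so you would need to add columns or version the table at that point.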



