Hightouch | Remote (North America) | Full-time | Backend, Fullstack, and Frontend Engineers
Some background on Hightouch - our mission is to help companies leverage their customer data to grow. We started with the problem of “Reverse ETL” or helping companies sync data from their data warehouse (e.g. Snowflake, Databricks, etc.) to 200+ SaaS tools (Salesforce, Marketo, Facebook Ads, etc.) without coding. Since then, we’ve evolved into a suite of tools around the warehouse (identity resolution, data enrichment, event streaming, etc.). We’ve raised a Series B and scaled to $40m+ ARR in 3 years with 800+ customers including Fortune 500 co’s like Spotify, the NBA, PetSmart, etc.
We are hiring for:
From that listing "Sync Speed: Customers want to sync a lot of data to important destinations like Facebook and Snapchat, which requires us to analyze every part of our syncing process and find where we can optimize to sync data more quickly"
I'm curious about this. What workflow requires syncing high volumes of data from a CDW to Facebook & Snapchat at low latency? It's my understanding that businesses mostly use those platforms for advertising. I'm struggling to think of a use case where you want to adjust your advertising with low latency and lots of data. I could understand feeding lots of data from your CDW into an ML model that updates your ads through the FB Ads API, but I can't see why:
1. it needs to go straight from CDW to FB?
2. it needs to be a lot of data?
3. it needs to be fast?
Perhaps there is some other use-case besides adjusting ads.
4. Also, why do you use the word "syncing" rather than "sending"? I tend to think of syncing as involving multiple programs that can edit data (e.g. Google Docs, distributed consensus, etc.). Are Facebook and Snapchat actually updating the data you send, so that you have to sync in the other direction? Or is it just one-way?
I work on the syncing team at Hightouch. These are great questions and also good feedback on how we could be clearer when describing the problems we need to solve.
1. We also support the case you describe, in which an ML model processes data and then updates properties in a destination. However, customers still get a lot of value out of keeping downstream systems synced with their warehouse tables. For instance, you can define which people you want to receive different campaigns and make sure that's consistent across all your ad platforms. You can also use it for simple projects like easily keeping Airtable in sync with a Postgres database.
2. Some people have warehouse tables with many billions of rows.
3. If you have a billion rows, you need to hit a very high rows-per-second rate in order to run a sync in a feasible amount of time. Also, we have an event collection product, which allows customers to feed events into Hightouch in real time, and a personalization API product, which allows customers to hit an API and get a low-latency response for how a given user's experience should be personalized. Processing the new data flowing into the events API, and getting it ready in the personalization API for fast fetching, also needs to be fast.
4. It's true that syncing often implies some bi-directionality. In this case we think about "syncing" the destination system's state to the state present in the source system. It's nice because you can use the source system as the source of truth and trust that any edits you make will be reflected elsewhere, possibly across many destinations.
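To put rough numbers on point 3, here is a back-of-the-envelope sketch. The function names and the example figures (a one-hour window, 1,000 rows per second) are illustrative assumptions, not Hightouch's actual targets or limits.

```python
def required_rows_per_second(total_rows: int, window_hours: float) -> float:
    """Throughput needed to sync total_rows within window_hours."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    return total_rows / (window_hours * 3600)


def sync_duration_hours(total_rows: int, rows_per_second: float) -> float:
    """Hours a sync of total_rows takes at a sustained rows_per_second."""
    if rows_per_second <= 0:
        raise ValueError("rows_per_second must be positive")
    return total_rows / rows_per_second / 3600


if __name__ == "__main__":
    one_billion = 1_000_000_000

    # Hypothetical target: finish a billion-row sync in one hour.
    print(f"{required_rows_per_second(one_billion, 1):,.0f} rows/s")  # 277,778 rows/s

    # Hypothetical sustained rate of 1,000 rows/s for the same table.
    days = sync_duration_hours(one_billion, 1_000) / 24
    print(f"{days:.1f} days")  # 11.6 days
```

At a hypothetical 1,000 rows per second, a billion-row sync takes over eleven days, which is why per-row throughput matters so much at this scale.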
1. > However, customers still get a lot of value out of keeping downstream systems synced with their warehouse tables. For instance, you can define which people you want to receive different campaigns and make sure that's consistent across all your ad platforms.
Ah thank you that's a very helpful example for understanding your product.
2. > Some people have warehouse tables with many billions of rows.
I don't doubt customers have multi-billion-row tables in their CDWs, but I guess I'm not seeing why you would need to send billions of rows to FB (or any other downstream system that isn't an OLAP database) rather than some much smaller payload distilled from that data via an ML model or SQL query. I admittedly have never run an FB ad campaign, but Meta has 3.35 billion daily active users [1] across all of their products. If they are sending billions of rows to FB, do your customers have individualized ad campaigns for every single FB user? Perhaps I'm just not familiar enough with the state of modern digital advertising.
3. > If you have a billion rows, you need to hit a very high rows per second number in order to run a sync in a feasible amount of time.
I wonder why you have to send billions of rows to FB every time. Surely you send it once for initial setup and then incrementally sync smaller deltas? And presumably your customer is OK with the initial setup being slower. Unless your customer is doing billions of writes to their CDW in between syncs?
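As a rough illustration of the incremental idea in this comment, here is a minimal sketch that diffs two snapshots of a table keyed by primary key and keeps only the rows that would need to be sent. It is an assumption about how such a delta could be computed, not a description of Hightouch's implementation; `compute_delta`, `Delta`, and the sample rows are hypothetical.

```python
from typing import Any, Dict, List, NamedTuple

Row = Dict[str, Any]


class Delta(NamedTuple):
    added: Dict[str, Row]    # keys present only in the current snapshot
    changed: Dict[str, Row]  # keys in both snapshots whose row contents differ
    removed: List[str]       # keys present only in the previous snapshot


def compute_delta(previous: Dict[str, Row], current: Dict[str, Row]) -> Delta:
    """Compare two snapshots keyed by primary key and return what changed."""
    added = {k: row for k, row in current.items() if k not in previous}
    changed = {
        k: row
        for k, row in current.items()
        if k in previous and previous[k] != row
    }
    removed = sorted(k for k in previous if k not in current)
    return Delta(added, changed, removed)


if __name__ == "__main__":
    previous = {
        "u1": {"campaign": "A"},
        "u2": {"campaign": "B"},
        "u3": {"campaign": "A"},
    }
    current = {
        "u1": {"campaign": "A"},
        "u2": {"campaign": "C"},
        "u4": {"campaign": "A"},
    }
    delta = compute_delta(previous, current)
    print(delta.added)    # {'u4': {'campaign': 'A'}}
    print(delta.changed)  # {'u2': {'campaign': 'C'}}
    print(delta.removed)  # ['u3']
```

Only three of the four distinct keys produce any work here; the unchanged row `u1` is never sent. This in-memory version is only meant to show the shape of the computation, since a real warehouse-scale diff would not fit in Python dictionaries.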
You're right that the day-to-day delta we need to sync is typically much smaller. However, customers often want to change something for a large fraction of their dataset, which requires a large update. Also, we support our Personalization API product, for which customers frequently do want to refresh the personalizations they're showing to all of their users. For this we do need to be regularly syncing all the data.
We are hiring for:
Full Stack Engineer, AI Decisioning: https://boards.greenhouse.io/hightouch/jobs/5404573004
Product Manager, Insights Products: https://boards.greenhouse.io/hightouch/jobs/5401455004
Our Talent Team is committed to responding to everyone who applies!