
Grouparoo: Open-source app to sync customer data with 3rd party tools - kine
https://www.grouparoo.com
======
bleonard
Thanks for checking us out! Co-founder here, happy to answer any questions.
There is so much to do in this space, but we’re excited to be getting started.

No engineer wakes up in the morning excited to sync data to Marketo, so we
started there: run `npm install` and you can get back to building the core
product. We make data self-serve for your non-technical colleagues and we
handle all the exhausting integration stuff you don’t want to think about (API
nuances, rate limiting, retrying, batching, etc).
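
To make the "batching" and "retrying" chores concrete, here is a minimal sketch of the kind of helper code a hand-rolled sync service tends to need. The function names, default delays, and batch sizes are assumptions for illustration; this is not Grouparoo's code or any vendor's API.

```typescript
// Hypothetical helpers, for illustration only.

// Split records into fixed-size batches so each API call stays under a
// destination's per-request limit.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (batchSize < 1) throw new Error("batchSize must be at least 1");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Exponential backoff: wait longer after each failed attempt (attempt 0 is
// the first retry), capped at maxMs. The defaults are assumed values.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}
```

A caller would loop over `toBatches(records, 100)`, send each batch, and sleep for `backoffDelayMs(attempt)` before retrying a batch that was rate limited.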

~~~
mariushn
First, thanks for providing the `npm install` way to run it. Too many apps
require Docker and that's it.

Question: Could we use Grouparoo to replace mixpanel? Would we need to build
the client side to collect events and dump that into Grouparoo?

~~~
bleonard
At the moment, it depends what you are using Mixpanel for. If it's about
collecting events and storing them, then we have a JS web library [1] for that
and we'll happily store them in our database and let you do things with them.
It will handle anonymous users and convert/merge them when they log in.

[1] [https://www.npmjs.com/package/@grouparoo/client-web](https://www.npmjs.com/package/@grouparoo/client-web)

------
coob
Wow, this is exactly what I started looking for at the start of the day.

Unfortunately I can't use it yet as there's no Postgres SSL support - filed a
GitHub issue here:
[https://github.com/grouparoo/grouparoo/issues/734](https://github.com/grouparoo/grouparoo/issues/734)

~~~
bleonard
Thanks! That should be an easy one. What was your goal when you started the
day? :-)

~~~
coob
Syncing our customer user data to Zendesk without having to write our own
syncing service.

~~~
bleonard
Sounds perfect for us. We have some questions on that issue. Let's make it
happen!

------
shakedown1
At a glance, this looks cool - I'm in the middle of designing an internal
Customer Data Platform/Single Customer View to serve as a single data source
for various internal & external marketing tools. Looks like this might be
worth looking at.

Does it have an API to access the user profile and group data ad-hoc?

Can you stream data in?

Can it trigger a destination sync when the underlying data changes?

Can it do profile merging (visitor -> known customer stitching)?

How can you do reporting/analytics?

~~~
bleonard
Great questions, thanks!

> Does it have an API to access the user profile and group data ad-hoc?

We’ve seen a few approaches here. 1) Yes, there are APIs 2) The Postgres
database this runs on is in your data center, so you can read it directly 3)
You can write back to your own product database as a “destination”

> Can you stream data in?

We support events via an API. We’ll store the events and allow you to create
profile properties from them. I’m very interested in creating a Kafka or other
message bus sort of integration too that brings in data and/or triggers
recalculations. No one has needed that yet, though, so it’s just on our
eventual list.

> Can it trigger a destination sync when the underlying data changes?

We have schedules, table queries, and events to know profile data has changed.
When it changes, it then recalculates groups. When properties or groups that
are being sent to a destination change, it automatically exports there. “Hey
Mailchimp, the user changed their first name and should now be tagged as VIP.”
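
As a rough illustration of that flow, the sketch below compares a profile's state before and after recalculation and produces an export payload only when something a destination cares about has changed. The `Snapshot` and `ExportPayload` types and the `diffForExport` function are hypothetical names, not Grouparoo's actual data model or API.

```typescript
// Hypothetical types, for illustration only.
interface Snapshot {
  properties: Record<string, string>;
  groups: string[];
}

interface ExportPayload {
  changedProperties: Record<string, string>;
  groupsAdded: string[];
  groupsRemoved: string[];
}

// Returns what should be sent to a destination, or null if nothing changed.
function diffForExport(before: Snapshot, after: Snapshot): ExportPayload | null {
  const changedProperties: Record<string, string> = {};
  for (const key of Object.keys(after.properties)) {
    const value = after.properties[key];
    if (value !== undefined && before.properties[key] !== value) {
      changedProperties[key] = value;
    }
  }
  const groupsAdded = after.groups.filter((g) => before.groups.indexOf(g) === -1);
  const groupsRemoved = before.groups.filter((g) => after.groups.indexOf(g) === -1);
  const changed =
    Object.keys(changedProperties).length > 0 ||
    groupsAdded.length > 0 ||
    groupsRemoved.length > 0;
  return changed ? { changedProperties, groupsAdded, groupsRemoved } : null;
}
```

In the Mailchimp example, the payload would carry the new first name under `changedProperties` and `"VIP"` under `groupsAdded`; an unchanged profile yields `null` and no export.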

> Can it do profile merging (visitor -> known customer stitching)?

We have the concept of anonymous id before login. When we realize two profiles
are the same (usually after logging in from another device or something), the
profiles are merged and everything recalculated.
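
A minimal sketch of what such a merge could look like, assuming a simplified `Profile` shape and a "known profile wins on conflict" rule. Both are assumptions made for illustration and do not describe Grouparoo's actual schema or merge policy.

```typescript
// Hypothetical profile shape, for illustration only.
interface Profile {
  id: string;
  userId?: string; // set once the visitor has logged in
  anonymousIds: string[];
  properties: Record<string, string>;
}

// Fold an anonymous profile into a known one. The known profile keeps its id,
// collects all anonymous ids (deduplicated), and wins on property conflicts
// (an assumed rule).
function mergeProfiles(known: Profile, anonymous: Profile): Profile {
  const allIds = [...known.anonymousIds, ...anonymous.anonymousIds];
  return {
    id: known.id,
    userId: known.userId,
    anonymousIds: allIds.filter((id, i) => allIds.indexOf(id) === i),
    properties: { ...anonymous.properties, ...known.properties },
  };
}
```

After a merge like this, groups and exports would be recalculated from the combined profile.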

> How can you do reporting/analytics?

This hasn’t been a focus so far outside of our ETL mechanics. You can see who
has been imported and exported and with what and when and all of that.

Things get more interesting around properties and their values, but we haven’t
gotten there yet. We’ve seen some success at pointing tools like Metabase at
the Grouparoo database.

~~~
shakedown1
Cool thanks, just installed it now to have a look. Any timeline for when S3
and Redshift will be available as a source & destination?

~~~
bleonard
Let me know how it goes: brian [at] grouparoo.com

Redshift is available now. We've talked about S3, but we're looking for input
on what kind of formats to read/write. Email me or make an issue [1] with what
you were hoping for.

[1]
[https://github.com/grouparoo/grouparoo/issues/new?assignees=...](https://github.com/grouparoo/grouparoo/issues/new?assignees=&labels=enhancement&template=feature_request.md)

------
kine
I worked with the Grouparoo guys years ago at TaskRabbit. They're a super
impressive team. Excited to watch their progress with Grouparoo!

------
ramziq
Grouparoo made it dead simple to connect a MySQL database and generate
information. It's easy to install, and the integrations make it easy to send
the right data to each tool you use. No code is necessary to change what data
gets sent.

------
desk_minion
Oh wow, this reminds me of Hightouch
([https://hightouch.io](https://hightouch.io)) and Rudder
([https://rudderstack.com](https://rudderstack.com)). It's interesting that
all of these are positioned around data warehouses, which is generally very
messy to deal with.

~~~
bleonard
Thanks for the links.

Overall, there's an interesting organizational dynamic that we've seen around
data enablement. Marketing and other operational teams need it and it's often
locked in the product space. It's usually not a priority for the eng team
because they are focused on the core product, but the data is there. The
important stuff (ETL copy of the product db) is usually not a huge mess.

We are inspired by warehouse tools like Looker that made that accessible to
more people, giving them autonomy to be successful. Grouparoo takes that one
step further to add on top of the data and make it actionable in all the other
places that people want it.

~~~
mmaia
In my experience this scenario is very accurate. Add to it the growing usage
of tools like Pipedrive, Pipefy, and Airtable, which sometimes the prod-eng
team is not even aware of.

As a Rudder and Fivetran user, I can see a very complementary use case for
Grouparoo: the first two are responsible for unifying events and external
data in the DW, and Grouparoo syncs user data to other tools.

Two other tools that I saw in this space (not Open Source):
[https://www.calixa.io/](https://www.calixa.io/)
[https://windsor.io/](https://windsor.io/)

~~~
bleonard
Overall, we believe events are an overused approach in this space. They
clearly have their place for impression data, but product dbs or data
warehouses have lots of good stuff that isn't being used enough to drive
goals. Then there's all the issues with getting that right in these other
tools that no engineer wants to deal with (rate limiting, formatting,
schedules, idempotency, etc).

So I hope there's a place because it's the gap we saw and are hoping to solve
and share with the community. Email me if you want to chat more: brian [at]
grouparoo.com

------
borisjabes
Very cool to see. A lot of folks (Notion, Figma, Loom) use our service
([https://getcensus.com](https://getcensus.com)) for this purpose if folks are
curious to check out a SaaS version of the concept.

We support all the major data warehouses (incl. straight Postgres), connect to
a bunch of different applications, and don't store any of your customer data!

Bonus: we recently added native support for dbt :-)

------
recur
This was such a massive problem at the large company where I used to work.
Excited to see the progress!

------
shaanr
I had a demo of Grouparoo a few weeks ago -- super cool platform that solves
so many ETL / integration challenges.

------
soumyadeb
Congrats guys from the team @ RudderStack. Glad to see more open-source
products in this broad customer data space.

Is it fair to say this is more like Segment's Personas product? We see a
bunch of use cases for personas (which we don't have in RudderStack) so can
point to you guys.

Congrats again on the launch.

~~~
bleonard
Yeah, the group-building part and syncing those (for example to Static Lists
in Marketo) is like Personas. It was always funny to me that you had to pay
quite a bit for Personas for Segment to actually, you know, segment.

The gap we saw was around understanding the user and segmenting in a way that
could be shared across multiple services and the product. And doing so in a
way where you controlled the data and the total cost was managed.

It's nice to see there are others open in this space. The trends are certainly
in the open direction to provide a lot of value and control in a way that's
good for everyone.

------
bartekr
Can’t wait to try this out sometime soon. I’ve worked with all the founders
and they’re awesome.

------
jslakro
What if I need to store all user data (not limited to profile information) in
Google Sheets and I provide only an interface for a specific usage? I cannot
see how well this scenario fits. The use cases are quite simple.

~~~
bleonard
Everything in Grouparoo is somehow tied to a user, so if you have a list of
"locations" or something, that's not a good fit at this point. We basically
need a "foreign key" to a property of the user.

If you do have that, we have support for google sheets[1]. You share a sheet
with a service account and that allows it to be a source in Grouparoo. From
there, you can make groups and send to destinations.

We'd be curious about your use cases. Feel free to make an issue[2] with what
you are hoping for and we can discuss there.

[1] [https://www.grouparoo.com/blog/google-sheets-source](https://www.grouparoo.com/blog/google-sheets-source)

[2]
[https://github.com/grouparoo/grouparoo/issues/new?assignees=...](https://github.com/grouparoo/grouparoo/issues/new?assignees=&labels=enhancement&template=feature_request.md)

------
edmundito
I've worked with the founders in the past and they're legit! I'm excited to
see where they'll go with Grouparoo.

------
nloui
Looking forward to tracking this project. I totally get it.

