Launch HN: ClearBrain (YC W18) – predictive models for app conversions and churn
83 points by bmahmood on Jan 31, 2018 | 20 comments
Hi, I’m Bilal, cofounder of ClearBrain (https://clearbrain.com). ClearBrain helps you automatically build predictive models of which users are most likely to convert or churn in your app. Think AmazonML for marketing analysts.

Our founding team comes from Optimizely & Google, where we built similar predictive tools for our marketing teams. At each company, we kept building the same components of a predictive pipeline: JavaScript snippets to collect data, ETL jobs to transform that data, and cron jobs to run a regression. We were spending hours a week maintaining these pipelines, but the time-consuming part wasn’t the algorithms (those are open source); it was the transformations.

So with ClearBrain we decided to automate the data transformation steps. We built our system in Spark ML (Scala), Data Pipeline, and Go. Instead of instrumenting yet another JavaScript snippet, we use existing data in Segment (YC S11) and Heap (YC W13) through standard integrations. And because every Segment/Heap dataset has the same schema, our system can process it with the same transformations into a machine-readable feature matrix. When a customer selects a user action tracked in Segment/Heap to predict, the transformed matrix is run through a logistic regression via Spark ML, which outputs a probability score for each user's likelihood of performing that action, based on users who performed it in the past.
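To make the transformation step concrete, here is a minimal sketch in plain Python of turning a flat event log into a per-user feature matrix plus a binary label for the action being predicted. It is not ClearBrain's actual pipeline (which the post says runs on Spark ML); the event names, the `(user_id, event_name)` layout, and the `build_feature_matrix` helper are all made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (user_id, event_name) pairs. Real Segment/Heap
# records carry more fields (timestamps, properties); this is simplified.
events = [
    ("u1", "page_view"), ("u1", "page_view"), ("u1", "signup"),
    ("u2", "page_view"),
    ("u3", "page_view"), ("u3", "add_to_cart"), ("u3", "signup"),
]

def build_feature_matrix(events, target_event):
    """Turn a flat event log into per-user count features and a 0/1 label."""
    counts = defaultdict(Counter)
    for user_id, event_name in events:
        counts[user_id][event_name] += 1

    # Feature columns are every event except the one being predicted.
    feature_names = sorted({name for _, name in events} - {target_event})
    users = sorted(counts)
    matrix = [[counts[u][f] for f in feature_names] for u in users]
    # Label is 1 if the user ever performed the target event.
    labels = [1 if counts[u][target_event] > 0 else 0 for u in users]
    return users, feature_names, matrix, labels

users, feature_names, matrix, labels = build_feature_matrix(events, "signup")
for user, row, label in zip(users, matrix, labels):
    print(user, dict(zip(feature_names, row)), "label:", label)
```

A real pipeline would presumably also restrict features to events that happened before the target action, so the label does not leak into the inputs; that is omitted here for brevity.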

This distills the predictive modeling process to a simple UI to identify high-probability users in minutes. We’ve built the tool with marketers in mind, to help them identify which users may convert or churn, and export those users to marketing tools like Facebook Ads, Hubspot, etc. We’ve also found good reception from startups that have marketing objectives but lack the resources to deploy ML-driven campaigns themselves.

We look forward to feedback from the HN community! :)


Looks very interesting. We're looking for something like this. The website also talks about the "Facebook Aha moment"... So would you guys also be able to predict which action(s) we should push our users towards to convert better, and which action(s) are less important?

Any plans to connect to Amplitude? Or could we feed events directly to you via a JS API? (We don't use Heap or Segment, that's why I'm asking.)

Thanks! Yep, we have a feature called "Benchmarks" which uses a decision tree analysis to identify the thresholds in distinct events that lead to an increased probability of reaching your conversion goal. We wrote a blog post that explains in more detail how this works: https://blog.clearbrain.com/posts/discover-your-products-7-f...
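As a rough illustration of the threshold idea described above, the sketch below fits a single-split decision tree (a "stump") on one event count and reports the cutoff that best separates converters from non-converters. It is only an assumption about how such a benchmark could be computed, not a description of the Benchmarks feature; the data, the `projects_created` feature, and the helper names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_threshold(values, labels):
    """Return (threshold, weighted_impurity) for the best split `value >= threshold`."""
    best = (None, float("inf"))
    n = len(values)
    # Skip the minimum value: splitting there leaves the left side empty.
    for t in sorted(set(values))[1:]:
        left = [y for v, y in zip(values, labels) if v < t]
        right = [y for v, y in zip(values, labels) if v >= t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical data: projects created in the first week, and whether the user converted.
projects_created = [0, 1, 1, 2, 3, 4, 5, 6]
converted        = [0, 0, 0, 0, 0, 1, 1, 1]

threshold, impurity = best_threshold(projects_created, converted)
print(f"best split: projects_created >= {threshold} (weighted Gini {impurity:.3f})")
```

On this toy data the split lands at 4 projects, because every user at or above that count converted and no one below it did. Real data is noisier, so the chosen threshold marks a jump in conversion probability rather than a clean boundary.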

Amplitude is definitely on our roadmap as one of the next integrations we're looking to support in 2018!

It's actually a first year statistics homework assignment to do an "AHA Moment" action analysis.

A random forest automatically gives you a "variable importance plot".

Bilal, nice to meet you. I have a high-growth SaaS startup that this looks perfect for. What's the best way to get onboarded and give this a material try?

Nice to meet you as well! We've helped several SaaS startups predict upgrades and reduce churn, so definitely sounds like a great fit.

Shoot me an email at bilal@clearbrain.com and I can help you get started. You can also get set up immediately at https://www.clearbrain.com

Are you concerned about the European Union’s new General Data Protection Regulation (https://syncedreview.com/2018/01/31/will-new-eu-regulations-...)?

We're mindful of GDPR and are continually working to ensure ClearBrain is compliant with the upcoming regulation, in both how we collect and how we process user data.

With respect to the points raised in the article - ClearBrain actually does not use deep learning techniques as a basis for our predictive models. Predictions in ClearBrain are based either on logistic regression or decision tree paradigms.

From the beginning, we also wanted to make sure ClearBrain wasn't merely a black box. We wanted to provide insight into how the models are performing, so we expose analyses such as feature importance, attribute benchmarks, and indications of which actions are informing the models.

This helps not only with interpretability and actionability in customer workflows, but also with some of the GDPR issues noted.

How correlated is your model's output with a simple measure of the volume of interactions a user has with an app?

Main question... is this a domain where ML really adds a lot of value beyond the basic concept of High Interaction (lots of time spent and clicks/pages visited) -> High Conversion Probability?

Great question! It is true that generic engagement/activity metrics tend to be highly correlated to conversion, and the absence of any activity tends to be correlated to churn. We see those features show up often.

But the propensity models built in ClearBrain tend to be more specific. The target variable can be defined as any client- or server-side event you've tracked in Segment, or any trait/attribute of your user. As such, common use cases tend to be around predicting conversion events at discrete stages of a user journey - separate models for who will move from plan type A --> B --> C, etc. So even if a generic engagement metric shows up as highly correlated for these discrete stages, the engagement benchmark would differ for each stage, and hence still be helpful for differentiating groups of users.
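One rough way to check the question raised in this exchange on your own data is to correlate the model's scores with a raw activity count. The sketch below computes a Pearson correlation in plain Python; the numbers are invented, and `event_counts` and `model_scores` are hypothetical names rather than anything ClearBrain exposes.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-user data: total tracked events and the model's predicted score.
event_counts = [3, 10, 25, 40, 80]
model_scores = [0.05, 0.20, 0.15, 0.60, 0.70]

r = pearson(event_counts, model_scores)
print(f"correlation between raw activity and model score: {r:.2f}")
```

A correlation close to 1 would suggest the model adds little beyond counting interactions. A noticeably lower value, or one that varies by target event, would fit the point above that stage-specific models pick up more than generic engagement.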

If you use Heap or Segment, using ClearBrain is a no brainer. Good job Bilal and team!

This is a great team and product!


Hi and congrats on ClearBrain! Hope all is well, super excited to see where this goes.

Love what Bilal and the team at ClearBrain are working on. There's really no integration work if you already use Segment for analytics.

What kinds of ML models and algorithms are used? Can I train models based on my own data?

ClearBrain models are primarily based on logistic regression. We automatically run some tuning steps, such as ridge regression and class balancing of your data, for feature selection and regularization. We connect directly to your data in Segment, Heap, or Redshift out of the box, so we can definitely help with your own data!
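For readers unfamiliar with the terms in this answer, here is a minimal sketch of logistic regression with an L2 (ridge) penalty and balanced class weights, trained by plain gradient descent. It is an illustration under stated assumptions, not ClearBrain's implementation: the hyperparameters, the single `engagement` feature, and the "n / (2 * class count)" weighting heuristic are all choices made for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, l2=0.1, lr=0.1, epochs=2000):
    """Batch gradient descent for logistic regression with an L2 (ridge)
    penalty on the weights and balanced class weights."""
    n, d = len(X), len(X[0])
    n_pos = sum(y)
    n_neg = n - n_pos
    # Balanced weights: each class contributes equally to the loss overall.
    w_pos, w_neg = n / (2 * n_pos), n / (2 * n_neg)
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(bias + sum(w * x for w, x in zip(weights, xi)))
            err = (p - yi) * (w_pos if yi == 1 else w_neg)
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        for j in range(d):
            # The l2 term shrinks weights toward zero (the bias is not penalized).
            weights[j] -= lr * (grad_w[j] / n + l2 * weights[j])
        bias -= lr * grad_b / n
    return weights, bias

def predict_proba(weights, bias, x):
    return sigmoid(bias + sum(w * xj for w, xj in zip(weights, x)))

# Hypothetical data: one scaled engagement feature, with converters as the rare class.
engagement = [[0.0], [0.1], [0.2], [0.3], [0.4], [0.5], [0.9], [1.0]]
converted = [0, 0, 0, 0, 0, 0, 1, 1]

weights, bias = train_logistic(engagement, converted)
low = predict_proba(weights, bias, [0.0])
high = predict_proba(weights, bias, [1.0])
print(f"score at low engagement: {low:.2f}, at high engagement: {high:.2f}")
```

Without the class weights, a model trained on data where only 2 of 8 users convert would lean toward predicting "no conversion" for everyone. The weighting is one common way to counter that, at the cost of scores that no longer match the raw base rate.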

I think it was a great choice to choose Heap/Segment as data providers.

I've been using ClearBrain for a few months. It works. Give it a try.

How is this different from Custora (also a YC alum)?

I'm not really familiar with this space, but is there a non-enterprise option here? I would love to try something like this out but the price point is too high for me.

Thanks for the comment, and sorry to hear the price seems prohibitive for your company size. :( Our goal is definitely to provide a solution that scales with different company sizes, but you're right that below a certain size, it may not make sense to use heavyweight machine learning tools. Happy to chat at bilal@clearbrain.com though to discuss your use cases and see if I can recommend some options or even alternatives!
