Launch HN: Metaplane (YC W20) – Datadog for Data
163 points by kzh_ on Nov 15, 2021 | 67 comments
Hey HN! We’re Kevin, Guru, and Peter from Metaplane (https://metaplane.dev). Metaplane is a data observability tool that continuously monitors your data stack, alerts you when something goes wrong, and provides relevant metadata to help you debug.

Data teams are often the last to know about data-related issues. They commonly find out only when an executive messages them about a broken dashboard. This is comparable to finding out about your servers being down only when your end users report it! In software engineering, this problem is solved with observability tools like Datadog and SignalFx. These monitor your system over time by tracking metrics (like CPU, memory usage or any arbitrary value), and sending alerts when they hit thresholds or are anomalous.

Metaplane solves this problem for data teams. We continuously monitor our users’ data warehouse tables and columns, testing for things like row counts, freshness, cardinality, uniqueness, nullness, and statistical properties like mean/median/min/max, as well as schema changes. After we build up a baseline of data points for each of these tests, we send alerts on anomalies to the user's Slack channel. Each alert includes metadata like upstream/downstream tables and BI dashboards affected by the issue, so that the user can assess how important the issue is and how quickly it should be addressed.
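The approach described above (build a baseline from past data points, then flag values that deviate too far from it) can be sketched with a simple z-score check. This is a hypothetical, standalone illustration of the general technique, not Metaplane's actual model:

```python
import statistics

def is_anomaly(history, latest, z_threshold=3.0):
    """Flag `latest` if it deviates from the historical baseline
    by more than `z_threshold` standard deviations."""
    if len(history) < 2:
        return False  # not enough data points to form a baseline yet
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return latest != mean
    return abs(latest - mean) / stdev > z_threshold

# Daily row counts for a table, followed by a sudden drop:
row_counts = [10_000_000, 10_100_000, 9_900_000, 10_050_000]
print(is_anomaly(row_counts, 1_000_000))   # True: the drop is flagged
print(is_anomaly(row_counts, 10_020_000))  # False: within normal variation
```

A production system would of course need to handle seasonality, trends, and user feedback, which is where the "Mark as anomaly" / "Mark as normal" signals come in.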

We're particularly careful about alert fatigue and false positives. Since we can't ask users to set manual thresholds (they would be changing all the time), we have to make a reasonable prediction based on past data, which can result in false positives and false negatives. If we under-alert, we miss important issues, but if we over-alert, users become desensitized and start ignoring alerts. Our solution is to include "Mark as anomaly" and "Mark as normal" buttons with each alert, for users to provide feedback to the model.

To give a common example, Metaplane can tell you that a revenue metric in a Snowflake column has spiked from $100 to $10,000 in an unexpected way. The alert includes upstream dependencies in dbt and downstream Looker dashboards that are impacted. Another example is if a table in Redshift that is usually updated every day hasn’t been updated in over 48 hours. A third example is if a table in BigQuery that typically increments 10M rows every day suddenly adds only 1M rows because of an upstream vendor bug. These are all what we think of as “silent data bugs” — all systems are green, but your data is just wrong!
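The freshness check in the Redshift example is conceptually simple: compare a table's last update time against its typical update interval, with some slack. A rough sketch (hypothetical helper, not Metaplane's implementation):

```python
from datetime import datetime, timedelta, timezone

def is_stale(last_updated, expected_interval, grace=2.0):
    """A table is stale if it hasn't been updated within
    `grace` times its expected update interval."""
    age = datetime.now(timezone.utc) - last_updated
    return age > expected_interval * grace

# A daily-loaded table last touched 3 days ago:
last = datetime.now(timezone.utc) - timedelta(days=3)
print(is_stale(last, timedelta(days=1)))  # True: over the 48h grace window
```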

Over the last eight months, we've caught problems like these for data teams at dozens of companies including Imperfect Foods, Drift, Vendr, Reforge, Air Up, Teachable, and Appcues.

Today, we’re excited to launch our self-serve product and free plan with the HN community. Setting up monitoring for your data stack takes less than 10 minutes. Here's a 4 minute demo video to see how it works: https://www.loom.com/share/1aa54eb8b45548e180f6ab3a4a580cc5. We make money by charging for more tests and team/enterprise features. You can use our new free plan or try out all of our features in a 30 day trial, no credit card required.

Our goal is to help data teams of any size be the first to know about data issues. We think observability will become as much of a no-brainer to data teams as it is to software engineers today. Starting on AWS?—get Datadog. Bringing on Snowflake?—get a data observability tool (hopefully ours!). Eventually we want to support more use cases that you’d expect from a Datadog for data, like log centralization and diagnostics, spend monitoring, performance insights, and deep integration with upstream applications. For now, we’re just starting where the pain is highest.

We'd love to hear your ideas, experiences, and feedback, and will be answering any questions in the comments!

Looks great! Congrats on the launch.

How does this differ from a data reliability platform like Datafold? https://www.datafold.com/

And can this replace what https://atlan.com/ does as well?

Good question! Both Datafold and Atlan support data monitoring as a secondary feature, but have different main focuses:

Datafold is primarily known for their Data Diff regression testing that simulates the result of a PR on your data within a CI/CD workflow. There’s definitely a need for proactively preventing data issues from occurring in the first place, but issues introduced via code are only one subset of potential data quality issues.

Metaplane is focused on catching the symptoms first via continuous monitoring. Regression tests don’t replace the need for observability, and vice-versa.

Atlan is primarily known for their data workspace features that make collaboration easier, like a data dictionary, SQL editor, and governance.

Data collaboration is a huge unsolved problem and data monitoring does play a role there. But Metaplane is focused squarely on the problem of detecting data issues and giving you relevant metadata to prioritize and debug.

thanks for the reply! great breakdown of the space

For those of us in smaller businesses, any chance of supporting Google Analytics as an integration? The ability to detect statistical anomalies across random pages/events would be a godsend, especially for organizations that don't have a proper data warehouse and dedicated BI staff.

I guess that'd be kinda a crossover between data integrity and marketing... in this case, an anomaly IS the data, where an unexpected increase/decrease in pageviews somewhere is something we'd love an alert on, but only after some threshold. As you pointed out, doing that manually just results in a bunch of false positives for those of us who aren't professional BI or statisticians.

Hi! We at Trackingplan.com (YC W22) do this. It's just a snippet you put on your site (you can install it using Google Tag Manager), and it automatically documents and monitors all your integrations, schema- and traffic-wise.

It's a different approach from Metaplane, which we are actually evaluating for our DWH and backend monitoring. Trackingplan is more about creating trust in the data collection side of the problem.

Drop me a line, josele (at) trackingplan (dot) com if you want to know more!

Not yet, but eventually! We’re focused on data in warehouses and transactional DBs right now, just to limit the amount of integrations we need to build to start. We definitely plan to integrate with application sources like Google Analytics down the line though. Upstream applications are ultimately the sources of data truth, after all.

I wanted to +1 what you said about “organizations that don’t have a proper data warehouse and dedicated BI staff.” At the end of the day, a huge number of companies (maybe even most?) don’t have dedicated data teams but still want to be alerted about data anomalies. Heck, we at Metaplane even fall into that camp.

Sounds good, thank you!

We're actually thinking of building something like that and would love to have a chat to learn more about use cases.

Sure, would be happy to chat. Still new here and not sure how private messages work. May I email the address in your profile?

In short, in many of the small businesses I've worked for, it was trivially easy to collect analytics data, yet monumentally difficult to analyze or make use of it in a meaningful way. The signal to noise ratio was incredibly low. One aspect of that was the need to manually discover unusual events, usually by creating manual dashboards and alert thresholds. But lacking a strong data science background, designing proper statistical tests to identify significant events was not something I could easily do.

Sure drop me an email in my profile.

Check out www.cliff.ai

Do they support Google Analytics? Nothing on their website or release notes indicates this, as far as I could find.

We support one click GA import. Shoot an email to aayush.jain@cliff.ai - happy to walk you through

This is awesome! We did this in-house back in 2014, and it quickly became an unmaintainable mess.

With dbt and Snowflake poised to take over this space, I can see this fitting right in on top of these tools. One idea would be to build a dbt integration into Metaplane.

I'm curious - how did you settle on the pricing? I can see it being a differentiator from Datafold and Supergrain, depending on your feature set.

Amazing how many companies use dbt + Snowflake right? Such a different world from 2014…

Good idea, we actually do have a dbt integration that pulls in lineage and job metadata from your dbt manifests: https://docs.metaplane.dev/docs/dbt. Eventually we want to let you configure Metaplane tests from your dbt YAML.

Pricing is still in flux to be honest. We wanted to start with a price that was approachable for small teams, comparable to other tools in your stack, and could be paid for without going through a whole procurement process. But we’re trying to stay as flexible on pricing as possible!

I’m building our company’s first data platform right now (Fivetran, dbt, Snowflake), so I’ll definitely check this out!!

1) Do you have Metabase on your roadmap? Lightdash?

2) I see that you alert on schema changes, which is great. Can you monitor for schema changes of a Postgres database? Reason I ask: Fivetran (and others) will try to buffer some schema changes from you to prevent data loss (drop columns, rename columns, etc). There is some more complex nuance I have in mind here, but it’s a bit too long to type out on my phone, :)

1) An integration with Metabase Cloud is on our roadmap for Q1! We'd love to integrate with Lightdash, but they don't have a public API just yet[1].

2) Several of our customers use us to alert on schema changes in Postgres, specifically so they can get ahead of application database changes that will end up in the warehouse, so you're definitely not alone! Here's a link on how to connect postgres: https://docs.metaplane.dev/docs/postgres
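For readers curious how this kind of Postgres schema monitoring can work in principle: one common approach is to periodically read `information_schema.columns` and fingerprint the result, alerting when the fingerprint changes. A minimal sketch (the query in the comment and the helper below are illustrative, not Metaplane's implementation):

```python
import hashlib

# Rows as information_schema.columns would report them, e.g. via:
#   SELECT table_name, column_name, data_type
#   FROM information_schema.columns WHERE table_schema = 'public';
def schema_fingerprint(columns):
    """Hash a sorted list of (table, column, type) rows so any
    added, dropped, renamed, or retyped column changes the digest."""
    canon = "\n".join(",".join(row) for row in sorted(columns))
    return hashlib.sha256(canon.encode()).hexdigest()

before = [("users", "id", "integer"), ("users", "email", "text")]
after = [("users", "id", "integer"), ("users", "email_address", "text")]
print(schema_fingerprint(before) != schema_fingerprint(after))  # True: rename detected
```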

That's an excellent stack and one we kept front and center when building out Metaplane, so definitely let us know if you have any feedback or suggestions here!

[1]: https://github.com/lightdash/lightdash/issues/632

All sounds great! I'll share it with my team.

My plan was to monitor the postgres database in the staging environment, so we can be alerted to schema changes before they are released into production (and hopefully stop the production deploy).

I have a goal of moving this even further upstream into the CI build for the source application itself (Ruby on Rails in this case), so that the application's test suite will fail if a developer introduces a breaking schema change. Note: this is a pretty tricky problem to solve without a) the tests being way too brittle OR b) super slow end-to-end tests. I have some goals of introducing something that is a mashup of: Spectacles [1], Pact [2], and dbt models [3].

[1] https://www.spectacles.dev [2] https://pact.io [3] https://docs.getdbt.com/docs/building-a-dbt-project/using-so...
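One lightweight way to get a CI suite to fail on breaking schema changes, independent of the tools linked above, is to commit an expected-schema snapshot and assert the application's actual schema still covers it. A hypothetical sketch (table and column names are made up):

```python
# Hypothetical snapshot of the columns downstream consumers depend on.
EXPECTED = {
    "orders": {"id", "user_id", "total_cents", "created_at"},
}

def breaking_changes(expected, actual):
    """Report columns that downstream consumers expect but the
    application schema no longer provides."""
    missing = {}
    for table, cols in expected.items():
        lost = cols - actual.get(table, set())
        if lost:
            missing[table] = lost
    return missing

# A developer renames total_cents -> amount_cents:
actual = {"orders": {"id", "user_id", "amount_cents", "created_at"}}
print(breaking_changes(EXPECTED, actual))  # {'orders': {'total_cents'}}
```

A test like this stays cheap (no end-to-end run) while still catching drops and renames before they reach the warehouse.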

That sounds like a great plan. We're planning to build our public API and CI/CD integrations early next year, so that developers can know what the downstream impact of their changes might be, and whether it could introduce unexpected results. We may be able to slot right in there with Pact.

Mitigating the impact with monitoring is where we're at right now, but we're with you that preventing errors can be even more important.

If it's interesting to you, we're happy to open up a shared slack channel to dig into the nuance as well! Just email me (guru@metaplane.dev) with the email you'd like to be added.

Very cool. I'll reach out.

When Nick Schrock created dagster, he argued that many "data cleaning" tasks which people attribute to "data engineering" aren't actually "cleaning", but are architecture problems. I believe schema changes also fall into this category. I'm extremely new to data engineering, but when I think about "What are the things which will break this system?", an application engineer thinking "I'm going to rename this column and my tests pass, so this should be fine" will break things all the time. (The same goes for dropping a column, or changing a one-to-many into a many-to-many.)

This might solve a problem I have right now - I’m persisting a kafka topic to clickhouse using the clickhouse kafka engine and realized that to get reliable monitoring on this pipeline I’ll have to roll my own service that polls clickhouse and sends metrics to datadog, then write datadog monitors. Looking forward to exploring Metaplane.

Thanks for sharing your use case :) We don't support Clickhouse yet, but this is exactly the kind of problem that we want to solve for data teams. You shouldn’t need to write and deploy code to monitor every metric or dataset that needs to be accurate and timely.

Looking forward to hearing what you think, and please reach out to team@metaplane.dev because we’d love to explore building this integration for you!

I looked at the pricing page. It's a very big jump from $0 to $400 per month, with no middle ground, like $30 per month. Any reason?

We wanted to make Metaplane free for individuals and approachable for teams, and found that $400/mo is justified by the cost of engineering time saved and is comparable to other paid tools that a smaller data team might use (like dbt, Fivetran, Hightouch).

That said, we’re still experimenting with pricing, and I can see the argument for a tier that’s more suited to individual paid plans. We also want the free plan to be pretty generous — is there a specific constraint that feels too limiting? Thanks for your feedback!

One note about the pricing page itself - I found the "black vs grey" links for the two tiers of "Growth" pretty confusing.

I incorrectly assumed that once I enter the Growth plan at $400/mo, I get access to dbt/lineage. But those are only "checked" on the $800/mo version of the Growth plan.

So it really feels like you have 4 plans: Free, Growth, Business?, and Enterprise.

Thanks for the feedback on pricing — it's definitely a work in progress (and we'll make the visual distinction clearer on our website). Our goal is to be as accessible as possible, so we're flexible with the plans right now. Pricing shouldn't be the reason that a team misses out on data issues.

When you sign up, you'll have access to everything immediately, so you can connect and try it out; when we start enforcing the limits, we'll give you ample notice.

I'm curious if you have opinions on the plans and pricing. Do the plans make sense as 1. individual 2. team for warehouse 3. team for whole stack 4. enterprise?

This looks great! I'd love to see how your tool could integrate with product analytics use cases. I see the product analytics use cases as downstream from the more core data/BI use cases you've described here - e.g. product analytics teams are often the ones spotting the bugs you mention. Even if core data/BI teams do notice the upstream bugs, what do you think would be the best way to notify downstream teams/dashboards about pending issues? Maybe something to think about.

Of particular interest would be the ability to correlate upstream events/logs with key product metrics - e.g. DAP/WAP/MAP, click-through-rates, revenue, etc.

Really looking forward to seeing how the product grows!

I love this, and the comment you made about getting set up in 10 mins with any size team.

e.g. there are just 2 of us - yet we need this.

Also, apologies - but I literally hear "Meatplane" in my head every time I read 'Metaplane'. (But I do like the name. Suck it, Zuck: Metaplane is just way more meta-ier than 'Meta'.)

Haha you're definitely not alone — this happened so much we ended up getting http://meatplane.com. Maybe we'll consider a rebrand in the near future :)

So click me and pay for me

Tell me that you'll data me

Watch me like you'll never let me down

'Cause I'm Aler-tin' on a meat-plane

Don't know when I'll be backed-up again

Oh Ops, I hate to go


How do you compare yourselves to Monte Carlo Data?

The main difference is how we get into the hands of data teams. To use Monte Carlo you’ll need to book a demo with their sales team to see the product, go through an implementation process, and potentially pay as much for a data monitoring tool as you do for your database.

We want every data team to have observability as soon as they have data in a warehouse, and that means: 1) letting you implement the tool within 10 minutes, without talking to us, 2) providing a free plan and paid plans that make sense for modern data teams, and 3) focusing on being as helpful as we can with as little configuration as possible.

Monte Carlo has built a strong team and has done a great job telling the story of data observability for larger companies. Overall we want to support even the smallest data teams, who we feel are being underserved by other companies in this space.

Congrats on the launch! Any plans to integrate with Apache Superset? https://superset.apache.org/

Thanks, and definitely! We're making a big push in the coming months to keep building out our downstream BI integrations. The Superset API is quite nice so we're looking forward to working with it.

This sounds like a very cool product! Any plans on supporting integration with Microsoft SQL Server?

Integrating with Microsoft SQL Server is definitely on our roadmap in the coming months. If you're up to discuss your use case, please reach out to team@metaplane.dev because we’d love to explore building this integration for you!

Looks amazing, and timing on bringing the solution is great. Any chance there will be on-prem version?

Great question! While right now we are focused on our cloud native application, we do plan on supporting an on-prem version for companies that require hosting themselves.

The reason our customers haven’t required this is that we’ve tried to take security seriously from day one, and Metaplane doesn’t store any customer data (just the metadata). We have a SOC 2 Type II report, support IP whitelisting and SSH/reverse SSH tunnels, and are always exploring other integration options like AWS PrivateLink.

That said, we definitely understand the need to keep even metadata on-prem, so we plan on tackling that later next year.

Congrats on the launch. Well done.

Another security approach is to enable your customers to close their inbound firewall ports and link listeners. This helps both cloud and on-prem models achieve far stronger security.

Example here (disclosure: I am a founder of the company behind this solution) with both open source and OEM/SaaS models:

https://github.com/openziti-incubator/zdbc (code for one implementation - a wrapper around the JDBC drivers)

https://netfoundry.io/zero-trust-database-security/ (blog post with links to developer example video, whitepaper, etc)

Congrats on the launch! Data Quality is an important area that our customers always ask about on Select Star (https://selectstar.com). Looking forward to integrating with you guys one day.

Offtopic, what is the template used for the landing page or the inspiration? I think I have seen quite a few like this. Example : https://conjure.so/

Check out https://tailwindcss.com/ — it's been gaining popularity as a CSS system and comes with great defaults

I use Tailwind UI :) The landing page format (big headline / hero image below, etc.) looked similar to other sites.

Btw congratulations on the launch. The tool looks great.

I am a huge fan of this team and their tool. We already use it and it has caught a bunch of issues before they became bigger problems.

We've already had those "wow I'm glad we have this tool" moments, just a couple of months in.

Congrats on the launch! How does this differ from Bigeye or Anomalo?

Our customers think of Bigeye, Anomalo, and Monte Carlo very similarly (needing to go through a sales process, spending quite a bit of money), so this answer to a previous question about Monte Carlo might be useful: https://news.ycombinator.com/item?id=29228070 (linking to avoid redundancy)

That doesn't seem like real differentiation. What specifically do you do differently than Bigeye or Anomalo? Or is the real value-add that I don't have to talk to a human?

I guess I just have a hard time seeing why this would help people solve real data quality issues.

Congrats Metaplane and welcome to the market. Like you, Bigeye also works for small (but mighty) data teams.

This is an exciting emerging space with folks (including multiple YC companies) tackling the problem from different directions. I helped compile a list here: https://medium.com/@vikati/the-rise-of-data-monitoring-7221e...

Congrats on the launch guys! Love the self service approach you've taken to this problem

My team has worked on a library for a similar purpose:


Load any document, profile it, and monitor the profiles for changes that would impact downstream applications.

Very common problem, you all are in a great space! Very interested and will check it out!

Nice work! Love the support for merging profiles together and also profiling unstructured data. I could’ve used this all the time back in my research days, instead of having to do the profiling by hand and from scratch every time.

We definitely want to explore suggesting data tests based on profiling. Don’t be surprised if you see a fork!

This is cool!! Have you experimented with getting this to run on large datasets? i.e. running it with Spark or on your data warehouse?

This looks AWESOME, congratulations team!

How is it different from soda.io tools?

Very cool - any thoughts on going open source? This would make it an easier comparison in my mind to dbt...

I recently stumbled on an open-source tool with a similar premise: https://github.com/elementary-data/elementary-lineage

you can check it out

Congrats Kevin and Guru on the launch. Happy to see this on here, I was wondering when you’d do your Launch HN. These are real problems that a lot of data teams have and I’m glad to see this making it into the wild!

Thanks Ian — our earlier conversations definitely helped shape our thinking about this space!

Hi! Out of curiosity: Why post so early? It’s only 5:48 am on the US west coast.

We’re based on the US east coast in Boston and just posted when we woke up :)

We live in a global society.

While it's 5:48am on the US west coast, it's 13:48 in the UK

And it's always 5pm in Ireland!

UK? Bah! I’m waaay ahead of you guys ;-) (CEST)

To be the Datadog for anything, you need a very aggressive sales team who are willing to hold customers financially hostage in exchange for C-grade customer service. (Though their technology is certainly acceptable) Be better than Datadog.

Also you need to ensure the price of the monitoring solution is more expensive than the resource being monitored.
