Launch HN: Embrace (YC S19) – Spot and fix bugs in mobile apps
90 points by efutoran on Aug 12, 2019 | 33 comments
Hi HN!

I’m Eric, the co-founder of Embrace (https://embrace.io).

Embrace is a debugging and performance monitoring platform that gives developers the information and context they need to monitor and solve errors, crashes, and performance issues. Think of it as what you wish Crashlytics had evolved into combined with the session replay capability of Fullstory.

Before Embrace, I co-founded the mobile gaming company Scopely, where we made six top-grossing mobile games including Yahtzee, Walking Dead, and Star Trek. The pain I felt while developing those games sparked the idea for Embrace. Customers and I would find bugs that were impossible to reproduce by the development and QA teams, and the analytics and logging tools we had in place just weren’t enough to solve them.

We had crashes under control, so we cared more about startup freezes, failed purchases, and out-of-memory app closes. Without reproducing the issue, we couldn’t tell if the error was caused by a fundamental code issue, something with my device settings, a network problem, or just a very unfortunate combination of all of the above. The solution seemed simple: I wanted to look up my session and see all the user interactions, networking and logging together to find out what caused my issue.

After talking with my friends at other mobile app companies, I knew I wasn’t alone. Things worked well in development, but we saw unexpected errors in production and we never had enough information to solve them. I wanted more than just a stack trace to help developers fix the problem, so my co-founder Fredric—who has now built three mobile analytics companies—and I started Embrace.

We've talked with many mobile developers and companies and we saw many common problems with apps, such as slow app starts resulting from too many blocking network calls on startup, and we have built features into our platform to help solve these problems. We also saw processes that were more cumbersome than they needed to be. Often when developers had to fix an issue they would try to combine data from backend logs, different monitoring tools, and feedback from bug reporters to build a picture of what was going wrong, but in the end it still wasn’t enough. There was always that one log message that they realized they should have added, and they had to wait another release cycle before figuring out a fix for the issue.

You can add Embrace’s SDK to your app to start collecting the info I had been missing when building apps. We intercept network calls, track views, monitor CPU usage, capture crashes, and automatically collect many more metrics to provide developers with the context that they have told us helps them solve problems. Add logs and breadcrumbs that you define, and we are able to get you as close to replaying user sessions as possible without capturing video. You’ll be able to see the network calls before a failed purchase or whether or not the device was in low-power mode when it crashed.
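
To make the "session as a timeline" idea concrete, here is a minimal Python sketch. It is not Embrace's SDK or data model; the `Event` and `Session` names, the event kinds, and the timestamps are all hypothetical. It only shows how events from different sources can be interleaved by time so you can ask what happened just before a failure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(order=True)
class Event:
    # Only the timestamp takes part in ordering.
    timestamp_ms: int
    kind: str = field(compare=False)    # e.g. "network", "view", "breadcrumb"
    detail: str = field(compare=False)


@dataclass
class Session:
    events: List[Event] = field(default_factory=list)

    def record(self, timestamp_ms: int, kind: str, detail: str) -> None:
        self.events.append(Event(timestamp_ms, kind, detail))

    def timeline(self) -> List[Event]:
        # Sort by time so events from different sources interleave.
        return sorted(self.events)

    def before(self, timestamp_ms: int, window_ms: int) -> List[Event]:
        # Events in the window leading up to (but excluding) a given moment.
        return [e for e in self.timeline()
                if timestamp_ms - window_ms <= e.timestamp_ms < timestamp_ms]


# Hypothetical session: what happened in the 200 ms before the failure log?
session = Session()
session.record(1250, "network", "POST /purchase -> 500")
session.record(1000, "view", "CheckoutScreen")
session.record(1300, "log", "purchase failed")
session.record(1200, "breadcrumb", "tapped Buy")
leading_up = session.before(1300, window_ms=200)
```

Here `leading_up` holds the breadcrumb and the failed network call, in time order, which is the kind of context a bare stack trace does not give you.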

We are fortunate to have 40+ customers already, including Wish, OkCupid, AllTrails, and Home Depot. We helped solve the 2nd largest, long-standing crash for Wish by providing their developers with context they were lacking. Developers for a subscription revenue app were able to identify that a critical network call did not occur as expected when users took a certain path through their app. The most recent customer I visited solved two bugs using info from our tool the day after they integrated.

We look forward to answering any questions you have and hearing what challenges you face with your mobile apps. We are free to use in development, so any feedback you have on the service would be much appreciated!

Wow. Great tech. Congratulations!

A few questions:

What's the typical amt of overhead due to the sdk on the app in terms of RAM, CPU, Network and how much does that translate to battery drain? What improvements, if any, are in the pipeline with regards to this?

How does the sdk behave in 2G environments? Is it built for billions [0]?

How did you ensure the sdk itself isn't causing issues given the myriad of functionality it supports? Interested in knowing how you built resiliency.

On the server side, what does the architecture look like in terms of storage, reporting, and collection of data from phones?


[0] https://developer.android.com/topic/billions

Thanks for all the great questions!

The overhead is fairly minimal since we mostly intercept things that happen in the app and don't really have any things that run continuously at high frequency, e.g. we add about 0.5ms to network calls on average devices. The RAM usage is limited since we put caps on how much data we capture in a session. We do give devs the option to configure the SDK to capture screenshots when errors occur, which is the largest consumption of RAM we incur, but that can be disabled.

We make network calls for logs and submitting session data, so that does use some bandwidth, but as mentioned above we put caps on how much data we capture in a session to avoid payloads getting excessively large.
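
One way to picture a per-session cap is a bounded buffer. The Python sketch below is an illustration only: the thread does not say how the SDK enforces its caps or whether it drops the oldest or the newest data, so the drop-oldest policy and the `CappedBuffer` name are assumptions.

```python
from collections import deque


class CappedBuffer:
    """Keeps at most `max_items` entries, dropping the oldest first (assumed policy)."""

    def __init__(self, max_items: int) -> None:
        self._items = deque(maxlen=max_items)
        self.dropped = 0

    def add(self, item: str) -> None:
        if len(self._items) == self._items.maxlen:
            # The deque evicts its oldest entry on append; count the loss.
            self.dropped += 1
        self._items.append(item)

    def snapshot(self) -> list:
        return list(self._items)


# Five log lines into a buffer capped at three.
buf = CappedBuffer(max_items=3)
for i in range(5):
    buf.add(f"log {i}")
```

Memory use and payload size stay bounded no matter how chatty the app is, and the `dropped` counter lets the receiving side know the session was truncated.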

We've had some customers who have been pretty sensitive to battery drain, and from working with them we've solved a couple of issues that did affect battery drainage, but no longer do.

We haven't invested a ton of effort yet to optimize the SDK specifically for 2G environments. That said we do have customers using our service with large user bases in India and the Philippines, where many of their users are on less-than-stellar network connections. As we expand to serve more markets where 2G is more common, we will be focusing on SDK performance for that.

As to how we ensured the SDK wasn't causing problems.... blood, sweat, and tears? Jokes aside, we had great early adopters who worked through some painful bugs with us. We've also had some devs at companies that we think have high code standards give us pointed feedback. Our basic thesis is that development for mobile is hard, and developing an SDK for mobile that does not impact the app it's integrated in is really hard, so we also used our own tool to figure out when things were not going right with the SDK.

The backend stack uses a bunch of wonderful OSS tech like nginx, Kafka, Cassandra, Clickhouse, Redis, MySQL, React, Gin, and Django. We are dealing with data volumes that pose fun engineering challenges, and we wouldn't be able to do it if we weren't standing on the shoulders of giants.

If I missed the mark or you're looking for more depth, don't hesitate to follow up!

> If I missed the mark or you're looking for more depth, don't hesitate to follow up!

Thanks a lot for taking time to reply. I hope you blog about the challenges you overcame at some point, esp wrt power consumption and resilience.

This product looks like something Google should have built themselves!

Good luck.

Thanks for the kind words! I will talk to our SDK folks and see if I can get them interested in writing a blog post and not just writing code :)

It looks really cool, but I feel your 'Live' pricing @ $1000/m is highly prohibitive to developers not in the first world. Although I suspect that's not your target market. I wouldn't be able to recommend to anyone to add that cost onto their app, without massive differences from what one gets with other platforms.

we definitely understand that that price point won't work for everybody developing an app. we have considered offering a trimmed-down version of the service at a lower price point in the future, but our current focus is on the full-blown service. we take a pretty hands-on approach in how we work with our customers, so in addition to the tech part of the platform we have a very engaged customer success team. the decision to take that approach has also impacted how we think about pricing.

It seems they have a Startup plan

Yes but if I have clients that aren't startups, and want to use a product like this, I'd much more easily recommend a cheaper alternative.

I think the question I generally ask is what are the alternatives? There are so many developers tools but very few built for the uniqueness of mobile.

yes, we do. we know money's often tight when you start so we are excited to find ways to work with other startups. if you are a startup and sign up for an account on our dash, our CS folks will reach out to you about this.

This sounds really useful. Can you intercept communication with GPS as well? One of the most frustrating bugs I had to deal with was GPS misbehaving on like 1% of Android users with no pattern among OS version, manufacturer, carrier, physical location etc. Maybe it would have been helpful in that situation.

We find that customers typically use session breadcrumbs to collect data around subsystems like Bluetooth and GPS to figure out what is going wrong. The 1%-failure use case is definitely a pattern we see quite a bit and we've had customers use our platform to figure out infrequent GPS tracking and Bluetooth connectivity problems.

Interesting. How are you actually capturing the network request data if it's HTTPS?

There are lifecycle hooks you can plug into within the app to make these types of captures. This tool is an example that uses them: https://github.com/kasketis/netfox/blob/master/README.md

Yep, there are definitely a lot of ways to tackle network capture. I haven't looked in-depth at that particular tool to know how close they are to what we do. We're always excited to see how other people are tackling this. Is there something that you think netfox does better than other tools you've looked at?

No, I was just pointing out you don’t need a man in the middle proxy to analyze http traffic.

On iOS we do this by swizzling network calls and on Android we use interceptors that are automatically added to your code using a Gradle plugin. By the time we get the data, the network stack has already handled SSL termination, so handling HTTPS calls is no different for the SDK than handling HTTP calls.
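
For readers unfamiliar with swizzling, the closest Python analogue is monkey-patching: swap a method for a wrapper that records metadata and then calls the original. The sketch below is only an analogy, not the iOS or Android mechanism; `HttpClient`, `install_capture`, and the `captured` list are hypothetical names.

```python
import time
from typing import Callable, Dict, List

captured: List[Dict] = []


class HttpClient:
    """Stand-in for an app's networking layer (hypothetical)."""

    def get(self, url: str) -> int:
        return 200  # pretend every request succeeds


def install_capture(cls: type, method_name: str) -> Callable:
    """Replace cls.method_name with a wrapper that records call metadata."""
    original = getattr(cls, method_name)

    def wrapper(self, url, *args, **kwargs):
        start = time.perf_counter()
        status = None
        try:
            status = original(self, url, *args, **kwargs)
            return status
        finally:
            # Runs on success and on exceptions; status stays None on failure.
            captured.append({
                "method": method_name.upper(),
                "url": url,
                "status": status,
                "duration_ms": (time.perf_counter() - start) * 1000.0,
            })

    setattr(cls, method_name, wrapper)
    return original  # returned so the caller can restore it later


install_capture(HttpClient, "get")
result = HttpClient().get("https://example.com/api/items")
```

The wrapper sits inside the app, above the transport layer, so it sees the same URL and status code whether the call went over HTTP or HTTPS.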

So would this work with React Native or Xamarin apps, for example?

We currently support RN, but Xamarin is not yet supported.

Looks interesting! Is there an easy way to mask PII from the analytics tool?

We have a few ways to ensure you don't send us PII. Sending identifiers like user name or email exists as an option in the SDK, but it is an opt-in integration step and not done automatically. We also have an option to capture screenshots, but again this is something that customers have full control over so that sensitive parts of the app are not captured. And we built a remote kill-switch for screenshots in case a customer didn't get their integration quite right or something unforeseen happened. If you have sensitive data in URLs we allow specific URL patterns to not be captured. Was there any specific type of PII you were primarily interested in seeing masked?
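
As an illustration of URL-pattern exclusion, here is a small Python sketch. The pattern syntax, the `EXCLUDED_PATHS` list, and the `url_to_capture` function are assumptions made for the example, not the SDK's configuration format. Stripping the query string is an extra precaution added here and is not something the thread describes.

```python
from fnmatch import fnmatchcase
from typing import Iterable, Optional
from urllib.parse import urlsplit

# Hypothetical deny-list; patterns are matched against the URL path.
EXCLUDED_PATHS = ["/users/*/reset-password", "/auth/*"]


def url_to_capture(url: str, excluded: Iterable[str] = EXCLUDED_PATHS) -> Optional[str]:
    """Return the URL to record, or None if the call must not be captured."""
    parts = urlsplit(url)
    if any(fnmatchcase(parts.path, pattern) for pattern in excluded):
        return None  # matches an excluded pattern: record nothing
    # Assumption: drop the query string and fragment, which often carry tokens.
    return f"{parts.scheme}://{parts.netloc}{parts.path}"
```

With these patterns, `url_to_capture("https://example.com/auth/login")` returns `None`, while a product URL is recorded without its query string.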

Sure - I'm more concerned about inadvertent leakage of email addresses or passwords than passing along an identifier to the API as part of the implementation, especially if it's capturing network traffic.

We do support network traffic capture, but this is not enabled by default and is typically only used on specific endpoints with specific status codes that customers are having problems with. We do not capture request and response bodies both for privacy and data usage reasons. We don't want to double the bandwidth usage when somebody integrates us nor do we want them to accidentally send us sensitive data.

Looks really interesting!

How does it behave with slow/unreliable networks? Do you buffer events if they don’t go through immediately? What’s typical data usage for the SDK?

We do buffer session, log, and event data and retry it when network connections improve. The data usage will vary from application to application since it is heavily dependent on how much network activity the app has, how many logs are captured, and a few other integration options which are specified by the developer. Session payloads are typically in the 5kB to 10kB range, and will increase to the 20kB range for network heavy apps that are making a lot of calls for assets.
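
A buffer-and-retry loop of this kind might look like the Python sketch below. It is a simplified illustration under assumptions: `RetryQueue`, the `send` callback, and the `max_pending` cap are hypothetical, and a real SDK would also persist the queue to disk and back off between attempts.

```python
from collections import deque
from typing import Callable, Deque


class RetryQueue:
    """Holds payloads until a send succeeds, preserving order."""

    def __init__(self, send: Callable[[bytes], bool], max_pending: int = 100) -> None:
        self._send = send
        # Bounded so a long offline period cannot grow memory without limit.
        self._pending: Deque[bytes] = deque(maxlen=max_pending)

    def submit(self, payload: bytes) -> None:
        self._pending.append(payload)
        self.flush()

    def flush(self) -> int:
        """Send queued payloads in order; stop at the first failure."""
        sent = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break  # network still bad; keep the payload for the next attempt
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._pending)


# Fake transport: fails while offline, succeeds once the network is back.
network = {"online": False}
delivered = []


def fake_send(payload: bytes) -> bool:
    if network["online"]:
        delivered.append(payload)
    return network["online"]


queue = RetryQueue(fake_send)
queue.submit(b"session-1")   # offline: stays queued
queue.submit(b"log-1")       # offline: stays queued
network["online"] = True
flushed = queue.flush()      # connection improved: both go out in order
```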

Looks pretty fantastic. Is the sdk open source?

Not yet. We will be open sourcing our React Native SDK in the coming month, and the native SDKs will be open sourced later this year.

Does it work in app-extensions such as custom iOS Keyboards?

Sadly not. We have not seen a lot of demand for this yet. It'd be great to hear more about your use case.


Yes, we have people using Flutter with our service

The docs don't say anything about it. Does the integration work the same way as for native Android?

Yes, the integration is the same as for native Android. We're happy to work through any questions you have about it if you want to test it out.
