
Launch HN: Embrace (YC S19) – Spot and fix bugs in mobile apps - efutoran
Hi HN!<p>I’m Eric, the co-founder of Embrace (<a href="https://embrace.io" rel="nofollow">https://embrace.io</a>).<p>Embrace is a debugging and performance monitoring platform that gives developers the information and context they need to monitor and solve errors, crashes, and performance issues. Think of it as what you wish Crashlytics had evolved into, combined with the session replay capability of Fullstory.<p>Before Embrace, I co-founded the mobile gaming company Scopely, where we made six top-grossing mobile games including Yahtzee, Walking Dead, and Star Trek. The pain I felt while developing those games sparked the idea for Embrace. Customers and I would find bugs that the development and QA teams could not reproduce, and the analytics and logging tools we had in place just weren’t enough to solve them.<p>We had crashes under control, so we cared more about startup freezes, failed purchases, and out-of-memory app closes. Without reproducing the issue, we couldn’t tell if the error was caused by a fundamental code issue, something with my device settings, a network problem, or just a very unfortunate combination of all of the above. The solution seemed simple: I wanted to look up my session and see all the user interactions, networking, and logging together to find out what caused my issue.<p>After talking with my friends at other mobile app companies, I knew I wasn’t alone. Things worked well in development, but we saw unexpected errors in production and we never had enough information to solve them. 
I wanted more than just a stack trace to help developers fix the problem, so my co-founder Fredric, who has now built three mobile analytics companies, and I started Embrace.<p>We've talked with many mobile developers and companies and saw many common problems with apps, such as slow app starts resulting from too many blocking network calls on startup, and we have built features into our platform to help solve these problems. We also saw processes that were more cumbersome than they needed to be. Often when developers had to fix an issue they would try to combine data from backend logs, different monitoring tools, and feedback from bug reporters to build a picture of what was going wrong, but in the end it still wasn’t enough. There was always that one log message they realized they should have added, and they had to wait another release cycle to figure out a fix for the issue.<p>You can add Embrace’s SDK to your app to start collecting the info I had been missing when building apps. We intercept network calls, track views, monitor CPU usage, capture crashes, and automatically collect many more metrics to provide developers with the context that they have told us helps them solve problems. Add logs and breadcrumbs that you define, and we are able to get you as close to replaying user sessions as possible without capturing video. You’ll be able to see the network calls before a failed purchase or whether the device was in low-power mode when it crashed.<p>We are fortunate to have 40+ customers already, including Wish, OkCupid, AllTrails, and Home Depot. We helped solve the second-largest long-standing crash for Wish by providing their developers with context they were lacking. Developers for a subscription revenue app were able to identify that a critical network call did not occur as expected when users took a certain path through their app. 
The most recent customer I visited solved two bugs using info from our tool the day after they integrated.<p>We look forward to answering any questions you have and hearing what challenges you face with your mobile apps. Embrace is free to use in development, so any feedback you have on the service would be much appreciated!
======
ignoramous
Wow. Great tech. Congratulations!

A few questions:

What's the typical amount of overhead the SDK adds to the app in terms of RAM,
CPU, and network, and how much does that translate to battery drain? What
improvements, if any, are in the pipeline with regard to this?

How does the SDK behave in 2G environments? Is it built for billions [0]?

How did you ensure the SDK itself isn't causing issues given the myriad
functionality it supports? Interested in knowing how you built resiliency.

On the server side, what does the architecture look like in terms of storage,
reporting, and collection of data from phones?

Thanks.

[0]
[https://developer.android.com/topic/billions](https://developer.android.com/topic/billions)

~~~
fnewberg
Thanks for all the great questions!

The overhead is fairly minimal since we mostly intercept things that happen in
the app and don't really have anything that runs continuously at high
frequency, e.g. we add about 0.5ms to network calls on average devices. The
RAM usage is limited since we put caps on how much data we capture in a
session. We do give devs the option to configure the SDK to capture
screenshots when errors occur, which is the largest consumption of RAM we
incur, but that can be disabled.

We make network calls for logs and submitting session data, so that does use some
bandwidth, but as mentioned above we put caps on how much data we capture in a
session to avoid payloads getting excessively large.

We've had some customers who have been pretty sensitive to battery drain, and
from working with them we've solved a couple of issues that did affect battery
drainage, but no longer do.

We haven't invested a ton of effort yet to optimize the SDK specifically for
2G environments. That said we do have customers using our service with large
user bases in India and the Philippines, where many of their users are on
less-than-stellar network connections. As we expand to serve more markets
where 2G is more common, we will be focusing on SDK performance for that.

As to how we ensured the SDK wasn't causing problems.... blood, sweat, and
tears? Jokes aside, we had great early adopters who worked through some
painful bugs with us. We've also had some devs at companies that we think have
high code standards give us pointed feedback. Our basic thesis is that
development for mobile is hard, and developing an SDK for mobile that does not
impact the app it's integrated in is really hard, so we also used our own tool
to figure out when things were not going right with the SDK.

The backend stack uses a bunch of wonderful OSS tech like nginx, Kafka,
Cassandra, Clickhouse, Redis, MySQL, React, Gin, and Django. We are dealing
with data volumes that pose fun engineering challenges, and we wouldn't be
able to do it if we weren't standing on the shoulders of giants.

If I missed the mark or you're looking for more depth, don't hesitate to
follow up!

~~~
ignoramous
> If I missed the mark or you're looking for more depth, don't hesitate to
> follow up!

Thanks a lot for taking time to reply. I hope you blog about the challenges
you overcame at some point, esp wrt power consumption and resilience.

This product looks like something Google should have built themselves!

Good luck.

~~~
fnewberg
Thanks for the kind words! I will talk to our SDK folks and see if I can get
them interested in writing a blog post and not just writing code :)

------
mad_tortoise
It looks really cool, but I feel your 'Live' pricing @ $1000/m is highly
prohibitive to developers not in the first world. Although I suspect that's
not your target market. I wouldn't be able to recommend that anyone add that
cost onto their app without massive differences from what one gets with other
platforms.

~~~
Axsuul
It seems they have a Startup plan

~~~
mad_tortoise
Yes, but if I have clients that aren't startups, and want to use a product
like this, I'd much more easily recommend a cheaper alternative.

~~~
efutoran
I think the question I generally ask is: what are the alternatives? There are
so many developer tools but very few built for the uniqueness of mobile.

------
frakkingcylons
This sounds really useful. Can you intercept communication with GPS as well?
One of the most frustrating bugs I had to deal with was GPS misbehaving on
like 1% of Android users with no pattern among OS version, manufacturer,
carrier, physical location etc. Maybe it would have been helpful in that
situation.

~~~
efutoran
We find that customers typically use session breadcrumbs to collect data
around subsystems like Bluetooth and GPS to figure out what is going wrong.
The 1%-failure use case is definitely a pattern we see quite a bit and we've
had customers use our platform to figure out infrequent GPS tracking and
Bluetooth connectivity problems.
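
To illustrate the breadcrumb idea, here is a minimal sketch in Python. It is not Embrace's API: the `BreadcrumbTrail` class, its `max_items` cap, and the GPS messages are all hypothetical, chosen only to show how timestamped notes around a subsystem can be kept in a bounded per-session timeline.

```python
import time
from collections import deque


class BreadcrumbTrail:
    """Hypothetical bounded trail of timestamped breadcrumbs for one session."""

    def __init__(self, max_items=100, clock=time.time):
        # A deque with maxlen silently drops the oldest entry when full,
        # which keeps memory use capped.
        self._items = deque(maxlen=max_items)
        self._clock = clock

    def add(self, message):
        self._items.append((self._clock(), message))

    def timeline(self):
        # Oldest-to-newest list of (timestamp, message) pairs.
        return list(self._items)


# Made-up GPS events, purely for illustration.
trail = BreadcrumbTrail(max_items=3)
for message in [
    "gps: permission granted",
    "gps: provider=network",
    "gps: fix timeout after 30s",
    "gps: retrying with provider=gps",
]:
    trail.add(message)
```

With a cap of three, the first breadcrumb is dropped and the timeline keeps only the three most recent events leading up to the problem.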

------
martinald
Interesting. How are you actually capturing the network request data if it's
HTTPS?

~~~
ec109685
There are lifecycle hooks you can plug into within the app to make these types
of captures. This tool is an example that uses them:
[https://github.com/kasketis/netfox/blob/master/README.md](https://github.com/kasketis/netfox/blob/master/README.md)

~~~
efutoran
Yep, there are definitely a lot of ways to tackle network capture. I haven't
looked in-depth at that particular tool to know how close they are to what we
do. We're always excited to see how other people are tackling this. Is there
something that you think netfox does better than other tools you've looked at?

~~~
ec109685
No, I was just pointing out you don’t need a man in the middle proxy to
analyze http traffic.
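
A minimal Python sketch of the general idea, assuming nothing about how netfox or Embrace actually do it: the capture happens inside the app by wrapping the call site, so the request is observed before it is ever encrypted. The `instrumented` wrapper and `fake_send` stand-in are hypothetical names.

```python
import time


def instrumented(send, on_capture):
    """Wrap a hypothetical send(method, url) -> status callable to record metadata."""

    def wrapper(method, url):
        start = time.perf_counter()
        status = None
        try:
            status = send(method, url)
            return status
        finally:
            # Runs on success and on failure; status stays None if send raised.
            on_capture({
                "method": method,
                "url": url,
                "status": status,
                "duration_ms": (time.perf_counter() - start) * 1000.0,
            })

    return wrapper


def fake_send(method, url):
    # Stand-in for a real HTTPS client call; always succeeds.
    return 200


captured = []
send = instrumented(fake_send, captured.append)
result = send("GET", "https://example.com/api/items")
```

The caller still gets the normal return value, and `captured` now holds one record with the method, URL, status, and elapsed time.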

------
jefflinwood
Looks interesting! Is there an easy way to mask PII from the analytics tool?

~~~
efutoran
We have a few ways to ensure you don't send us PII. Sending identifiers like
user name or email exists as an option in the SDK, but it is an opt-in
integration step and not done automatically. We also have an option to capture
screenshots, but again this is something that customers have full control over
so that sensitive parts of the app are not captured. And we built a remote
kill-switch for screenshots if a customer didn't get their integration quite
right or something unforeseen happened. If you have sensitive data in URLs we
allow specific URL patterns to not be captured. Was there any specific type of
PII you were primarily interested in seeing masked?

~~~
jefflinwood
Sure - I'm more concerned about inadvertent leakage of email addresses or
passwords than passing along an identifier to the API as part of the
implementation, especially if it's capturing network traffic.

~~~
efutoran
We do support network traffic capture, but this is not enabled by default and
is typically only used on specific endpoints with specific status codes that
customers are having problems with. We do not capture request and response
bodies both for privacy and data usage reasons. We don't want to double the
bandwidth usage when somebody integrates us nor do we want them to
accidentally send us sensitive data.
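
As a rough illustration only (this is not our SDK's configuration format, and the patterns, status codes, and `capture_record` function below are made up), rules like these can be sketched in Python as an exclusion list plus an opt-in list of endpoint and status-code pairs, recording metadata only:

```python
import re
from urllib.parse import urlsplit

# Hypothetical rules: paths that must never be captured, and the only
# endpoint/status combinations that are captured at all.
EXCLUDED = [re.compile(r"/account/reset/")]
CAPTURE = [(re.compile(r"^/api/purchase"), {500, 502, 503})]


def capture_record(method, url, status):
    """Return the metadata to capture for this call, or None to skip it."""
    path = urlsplit(url).path
    if any(pattern.search(path) for pattern in EXCLUDED):
        return None
    for pattern, statuses in CAPTURE:
        if pattern.search(path) and status in statuses:
            # The query string and the request/response bodies are left out.
            return {"method": method, "path": path, "status": status}
    return None
```

Here `capture_record("POST", "https://example.com/api/purchase?token=abc", 500)` yields a record containing only the method, path, and status, while a 200 on the same endpoint or any call to an excluded path yields `None`.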

------
benkuhn
Looks really interesting!

How does it behave with slow/unreliable networks? Do you buffer events if they
don’t go through immediately? What’s typical data usage for the SDK?

~~~
efutoran
We do buffer session, log, and event data and retry it when network
connections improve. The data usage will vary from application to application
since it is heavily dependent on how much network activity the app has, how
many logs are captured, and a few other integration options which are
specified by the developer. Session payloads are typically in the 5kB to 10kB
range, and will increase to the 20kB range for network heavy apps that are
making a lot of calls for assets.
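
For anyone curious what buffer-and-retry looks like in general, here is a minimal Python sketch. It is an assumption-laden illustration and not our SDK code: the `UploadQueue` class, its `max_pending` cap, and the `fake_upload` stand-in are hypothetical.

```python
from collections import deque


class UploadQueue:
    """Hypothetical bounded queue that keeps payloads until an upload succeeds."""

    def __init__(self, upload, max_pending=50):
        self._upload = upload                      # callable(payload) -> bool
        self._pending = deque(maxlen=max_pending)  # cap memory; oldest dropped first

    def enqueue(self, payload):
        self._pending.append(payload)

    def flush(self):
        """Try to send pending payloads in order; stop at the first failure."""
        sent = 0
        while self._pending:
            if not self._upload(self._pending[0]):
                break  # network still bad; keep the payload for the next flush
            self._pending.popleft()
            sent += 1
        return sent

    def pending(self):
        return len(self._pending)


# Simulated connectivity: uploads fail until online["up"] is set to True.
online = {"up": False}
delivered = []


def fake_upload(payload):
    if online["up"]:
        delivered.append(payload)
        return True
    return False


queue = UploadQueue(fake_upload)
queue.enqueue({"session": 1})
queue.enqueue({"session": 2})
first_attempt = queue.flush()   # offline: nothing is sent, nothing is lost
online["up"] = True
second_attempt = queue.flush()  # back online: both payloads go out in order
```

A real client would trigger `flush` from a connectivity callback or a backoff timer; the sketch leaves that scheduling out.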

------
ec109685
Looks pretty fantastic. Is the sdk open source?

~~~
efutoran
Not yet. We will be open sourcing our React Native SDK in the coming month,
and the native SDKs will be open sourced later this year.

------
bithavoc
Does it work in app-extensions such as custom iOS Keyboards?

~~~
efutoran
Sadly not. We have not seen a lot of demand for this yet. It'd be great to
hear more about your use case.

------
verttii
Flutter?

~~~
efutoran
Yes, we have people using Flutter with our service.

~~~
verttii
The docs don't say anything about it. Does the integration work the same way
as for native Android?

~~~
efutoran
Yes, the integration is the same as for native Android. We're happy to work
through any questions you have about it if you want to test it out.

