Launch HN: Berbix (YC S18) – Instant ID checks to fight fraud and stay compliant
106 points by ericlevine on Nov 21, 2019 | 51 comments
Hi everyone!

We’re Steve and Eric, the founders of Berbix (https://www.berbix.com). We make it easy to instantly verify photo IDs. Our goal is to empower platforms to accurately identify their users while being responsible stewards of sensitive information.

Today, we’re launching our self-service ID checks (https://www.berbix.com/pricing) to organizations of any size that need to answer the question: Are you who you say you are?

We’re taking a privacy-first approach to identity verification. Your government-issued photo ID is one of the most sensitive pieces of information you own, and sharing it with a company online can be scary. We’ve invested significant effort to try to do this the right way from a security perspective, ensuring all images that ever leave our system are aggressively watermarked, and enforcing short retention policies to automatically purge data. We aren’t—nor do we ever intend to be—in the business of selling personal data.

Everyone knows that, unless you’re a credit card processor, you’d be crazy to collect credit card numbers directly instead of using a system like Stripe, because of PCI compliance. But there’s no equivalent standard for identity documents. It’s still the wild west when it comes to best practices around this extremely sensitive data. Companies inevitably will need to collect this data, whether to comply with regulations to verify age, confirm the identity behind a GDPR or CCPA request, or deter fraud on a marketplace. It may come across as self-serving, but we’d rather have a privacy-oriented company collect that data on their behalf.

We were the product and engineering leaders of the Trust & Safety team at Airbnb for several years where we were tasked with stopping all bad things from happening on Airbnb—both online and offline. This was a challenging problem as it included your typical online fraud like chargebacks, account takeovers, and wire scams in addition to much more novel offline risks like property damage and personal safety issues.

We learned to distinguish between “premeditated” bad actors who come to a platform with the intent to cause harm and “opportunistic” bad actors who would swipe a $20 bill on a nightstand, as an example. Some techniques work well against one group, but not the other. One effective means to fight both is to check a government-issued photo ID. Premeditated bad actors will often leave to find another platform with fewer protections, and opportunistic bad actors will think twice before doing something malicious in the moment when they know their ID has been checked.

Historically, checking IDs online has been hard. It required a 5-figure contract with a legacy ID verification provider, would take minutes or more, and the quality of the data returned left a lot to be desired. We knew there had to be a better way, and so we started Berbix. Our product returns a result in 2 seconds or less and leverages the machine- and human-readable components of a photo ID to maximize accuracy.

We’ve designed Berbix in a way that we, as developers, would want to use it (https://docs.berbix.com), with backend API libraries that make an integration simple and intuitive. We offer client-side SDKs for a number of platforms including React, iOS, Android and more (https://github.com/berbix). We make integration simple enough to be completed in a matter of minutes, while also providing flexibility to offer custom configurations if desired. Using our API, you can request the information you need to verify your users, while isolating your servers from ever handling the sensitive user-submitted ID images directly.

We’d love feedback from the HN community. Looking forward to hearing your thoughts!

privacy-first approach

So, let's take a look at your terms.[1]

However, we cannot guarantee that unauthorized third parties will never be able to defeat our security measures or use your personal information for improper purposes. You should always use caution before sharing your sensitive personal information online.

we each agree to resolve any claim, dispute, or controversy (excluding any claims for injunctive or other equitable relief as provided below) arising out of or in connection with or relating to these Website Terms of Use, or the breach or alleged breach thereof, by binding arbitration by JAMS, Inc.

Not only mandatory arbitration, but arbitration with JAMS, which has problems.[2] Not the American Arbitration Association consumer rules. The AAA will often send small claims to Small Claims Court, which is cheaper, and has real judges.

These Website Terms of Use, and any rights and licenses granted hereunder, may not be transferred or assigned by you, but may be assigned by Berbix without restriction.

So if you exit by being acquired by Google or Facebook or Tencent, they get all the data.

There's nothing in the terms which places any legal responsibilities on Berbix beyond minimal compliance with the law. The terms are no better than the average web site, and worse than many.

So, "privacy last".

[1] https://terms.berbix.com/terms/website

[2] https://www.sfgate.com/news/article/PRIVATE-JUSTICE-Can-publ...

Thanks for taking the time to carefully read our terms of service! We share your point of view that these aren't merely a series of boxes to check.

Our “privacy-first” claims are three-fold:

1. We limit the data returned to our customers and enforce a maximum data retention period after which data is permanently deleted. We encourage customers to reduce the amount of data they need and the number of days it must be stored.

2. We built our own image watermarking service to protect the sensitive images we process and store. This helps ensure that the images cannot be used to verify an identity on any other service.

3. We completed our SOC 2 Type 1 examination in March 2019. This is an intensive security audit performed by an accredited third party. We perform these annually.

And thanks for the feedback regarding JAMS specifically. We are in the process of revising our terms that were first drafted early this year before our public launch.

We really do appreciate this feedback, which we will take into account as we continue to iterate on both our Terms of Service and Privacy Policy to best reflect our business practices. As part of this effort, we are making a commitment to always publish a record of all changes to our terms.

Commenting as an uninvolved bystander: your entire reply sounds like corporate-speak to me, and it's off-putting. I get that you're trying to state your intent, and perhaps English isn't your native language, but it's deterring me from looking further into Berbix. Also, your second claim really just seems to be lock-in, not user-friendly positioning.

If your primary business is trust, you have to commit contractually to back it. Not just make "claims".

FYI: SOC 2 Type 1 carries no weight for corporate/data privacy/infosec because anyone can get it with a dozen .docx templates from the internet. A Type 2 report is substantial because it requires the auditor to observe your actual operations for the previous 3-6 months. If you got your Type 1 report in March, does that mean that a Type 2 report will be available to prospective customers any day now?

How can you force your customers to have a retention period for data you provide them? They could just keep a copy, and you'd never know.

Oh! You're the company who -- when I was requesting the non-consensual tome of personal data that Sift keeps on me (and basically everyone else in the US and other countries) -- refused to accept my straight-faced selfie and instead specifically insisted over and over again that I need to look "joyful or happy" and try again.

I get (in retrospect, after research) that you're asking for a real-time face pose change for better identity verification, but do you realize how dystopian it feels when someone is fighting with an opaque bureaucracy and the process demands that they smile about it?

You should try expressing a rationale up-front so it's not so Orwellian.

Completely understand and sympathize with that. We absolutely can (and will) do a better job of conveying the intent of the different checks here. The pose change requested is randomized, but I get that this can be frustrating.

I know you already get this, but for posterity, the idea here is to make sure the person submitting their ID is actually in front of the computer (and can react to a prompt). Attempting to use a still photo is a common way a bad actor may try to circumvent these protections. Obviously correctly identifying someone in the case you described is extremely important given the sensitivity of some of these data access requests.

Orwellian isn’t exactly the vibe we’re looking for, though, so we can do better here.
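For readers wondering what a randomized pose prompt might look like in practice, here is a minimal sketch. It is not Berbix's implementation: the pose names, the `Challenge` class, the 30-second window, and the idea that some separate detector reports which pose it saw are all assumptions made for illustration. The point is only that the prompt is chosen unpredictably and expires quickly, so a pre-recorded still photo is unlikely to match.

```python
import secrets
from dataclasses import dataclass

# Hypothetical prompt list; the thread itself only mentions "joyful or happy".
POSES = ("smile", "turn_left", "turn_right", "neutral")


@dataclass
class Challenge:
    pose: str                  # the pose the user is asked to make
    issued_at: float           # timestamp (seconds) when the prompt was shown
    ttl_seconds: float = 30.0  # assumed validity window, not a documented value

    def accepts(self, detected_pose: str, now: float) -> bool:
        """Accept only a matching pose submitted within the validity window."""
        fresh = (now - self.issued_at) <= self.ttl_seconds
        return fresh and detected_pose == self.pose


def issue_challenge(now: float) -> Challenge:
    # secrets.choice draws from a cryptographically strong source, so the
    # prompt cannot be predicted from earlier prompts.
    return Challenge(pose=secrets.choice(POSES), issued_at=now)
```

A "Why do we ask this?" note shown next to the prompt would cover the UX complaint without changing any of this logic.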

So you require someone's PICTURE to deliver the data you gathered on that person? To further augment your digital stash? Or train your models to recognize said person? (after which you delete the picture, logical - storage space costs money)

I hope I'm wrong somewhere.

If I'm not, I don't think I want to do business with you, or to ever have my ID checked by you if it means you'll get to keep my data- then ask me for an up-to-date picture to improve your collection when I object to that.

So the purpose of taking a picture of yourself is to make sure that the photo as depicted on the ID matches the person who is completing the flow. This is important as a stolen ID should not be usable for the purposes of online identity verification. We’re not in the business of selling your data, but of providing a secure, privacy-oriented way for businesses that have to perform ID checks to do so. In the situation described above, we’re providing identity verification services for Sift in the context of the data subject access requests they’re receiving.

As OP in this thread, this is a complete mis-read of the situation; you should re-read the other comments and consider removing this one.

They say the photo is to make sure someone is not using a copied ID - I believe them. Makes sense.

They say the photo will be removed - I believe them. GDPR, California laws, good will, storage is expensive, etc.

What I won't trust anyone with, is what they will do with the data created from this photo that is not the photo itself.

That's really nice to hear -- thanks for the reply! A "Why do we ask this?" link would probably be optimal.

I do get the aim, but it took me a while --- I'd wondered if it was simply data collection for more classifier training or something, which felt like a dodgy extra ask along with a verification service (even if it's the same strategy as recaptcha).

>when I was requesting the non-consensual tome of personal data that Sift keeps on me

I hope Berbix has a plan here for the fact that when sketchy companies use their service, it will make them look sketchy by association.

We're rather selective in terms of which customers we serve given that we see the verification of someone’s identity as a privilege rather than a right.

With respect to Sift, we see them as good stewards given their narrow focus of preventing fraud as opposed to data brokers that sell your data to ad networks, hedge funds, etc.

Sift has taken a privacy-conscious approach to responding to data access requests by ensuring the individual is in fact who they say they are. We’re proud to help them prevent fraudulent access to sensitive personal information.

disclaimer: I know some of the folks involved with Berbix

I love this, for the same reason I love Stripe (and before them authorize.net et al): I don't want every vendor I use that has KYC requirements to implement their own controls on the screenshots/scans/videos/whatever of government IDs, because then there will be a huge amount of variation and some of them will inevitably get it very very wrong.

Putting it all under one roof (or a few roofs) allows those few companies to get storing and handling this toxic data really right, extract a reasonable amount of revenue, and everybody wins (Berbix wins because money, their customers win because they don't have to pay the opportunity cost or real cost of developing this in house, and end users win because of better/safer handling of ID information).

Not really following your logic. Why do you care as a customer if there’s variation in KYC requirements from vendor to vendor?

Also not sure why this makes handling ID information “safer.” The vendor using this service still has access to all the images (ID and selfie) and data returned to them from this API that they can store a copy of on their own servers forever. This is not like your Stripe analogy where you only get the token and not the actual card number.

A few questions:

1. "images that leave our system..."

Why do they at all??


2. You're a startup. Meaning: 90%+ chance you die. What procedures have you put in place to make sure that ALL the data is destroyed in case your company changes hands, so that it cannot be used by somebody whose ideas and privacy are different than yours, simply by buying you (or your carcass after you are bankrupt)?


3. What use is watermarking in case of a giant data breach? The fact that we know that YOU lost our data doesn't help us any. What are your plans for data storage such that a breach in your systems does not allow easy exfiltration?

Appreciate the thoughtful questions.

On the point of why images leave our system at all, we provide a way to show our work to our customers — they won’t trust our results if they can’t see that they’re accurate. When they access information on our dashboard, if we render the images, they’ve left our systems. To be clear, we’re not syndicating this information to any third parties, just showing this information directly to our customer (who is the owner and controller of this data).

As for what procedures we put in place, we enforce short retention periods for the data we store in our systems for precisely the reason you are worried about. At the expiration of that period, the data is permanently deleted. Furthermore, in the event of a change of control, the contracts we’ve put in place with our existing customers govern how the information can be used. This is super important to us as we personally take privacy extremely seriously.

The aggressive watermarking is important for several reasons. First, in the worst case scenario, we can trace how a breach happened and when. Second, it is watermarked in such a way that the images become much less functional than they would be otherwise — the intent is to ensure that the images cannot be used to verify an identity on any other service. We take security very seriously — we’ve already secured SOC 2 certification and continue to invest heavily in security using industry best practices.

Cannot be used out of embarrassment, or you are actually ‘de-facing’ the identity?

For example, the Shutterstock logo sure doesn’t stop people from using those. Plus there are open source tools to reverse such watermarking.

I submitted a data request to a third party processor recently (to Sift, after they were mentioned in an NYT article) and they sent me a link to your service to submit ID and two selfie photos.

The consumer facing experience on this is not the best. Here I am filing a request to a third party processor for data that I never personally sent them. And in order to handle that, I have to send even more sensitive information to yet another third party processor. See the irony here?

Sift’s email said the ID data would be retained for no more than 14 days, while Berbix’s privacy policy says the retention period is the shorter of “until no longer needed” or 3 years from my last interaction with your customer.

Who’s right here, and if your customer quotes end users a retention period that’s shorter than 3 years, how do you hold them to that?

Absolutely understand where you’re coming from. It can be jarring to be asked to go through those steps by a set of companies with whom you have no direct relationship. That said, data access requests can contain some extremely sensitive information and it’s important companies responding to such requests don't share information with the wrong person.

Regarding your question on data deletion: we abide by the retention policies chosen by our customers, which are typically much shorter than 3 years. For Sift specifically, the retention policy is indeed 14 days, after which point we automatically delete all the personally identifiable information we've collected on Sift's behalf. We'll be taking in your feedback, however, as this could be made clearer in both our privacy policy and our product.
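As an illustration of how a per-customer retention policy like this could be enforced, here is a rough sketch. Only the 14-day figure for Sift comes from this thread; the `Record` type, the 30-day default, and the `purge` function are hypothetical and do not describe Berbix's actual system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Retention per customer, in days. The 14-day value is the one quoted above;
# the default is an assumption made for this example.
RETENTION_DAYS = {"sift": 14}
DEFAULT_RETENTION_DAYS = 30


@dataclass
class Record:
    customer: str
    collected_at: datetime


def is_expired(record: Record, now: datetime) -> bool:
    """A record is expired once its age exceeds the customer's retention period."""
    days = RETENTION_DAYS.get(record.customer, DEFAULT_RETENTION_DAYS)
    return now - record.collected_at > timedelta(days=days)


def purge(records: list[Record], now: datetime) -> list[Record]:
    """Return only the records still inside their retention window."""
    return [r for r in records if not is_expired(r, now)]


if __name__ == "__main__":
    now = datetime(2019, 11, 21, tzinfo=timezone.utc)
    records = [
        Record("sift", now - timedelta(days=15)),  # past 14 days, dropped
        Record("sift", now - timedelta(days=3)),   # still retained
    ]
    print(len(purge(records, now)))
```

A real purge job would also have to delete the underlying images and any derived data, which a sketch like this cannot show.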

Congratulations! Know your customer/online ID verification sure looks like the business to be in lately. Given that there's https://veriff.com, which seems to be the most advanced one and is also a YC graduate, what's the advantage/edge you guys have? Selling on price looks like a race to the bottom.

Or are you focusing on USA customers only? GDPR and all that.

Thank you! And yes, this space is definitely becoming increasingly important. Our focus has been to provide a lightweight, low-friction means by which to confidently check IDs. While we have been primarily serving North America-based customers, our product can work well for any US or Canadian IDs or ICAO compliant travel documents (which includes many European IDs).

Like Veriff, there are a lot of companies in this space. What is the differentiator other than low-friction? Most of the competitors offer ~$1 per use and are highly automated and frictionless.

Congrats on the launch!

Do you verify consistency of components on a license through access to a DMV database (e.g. name matches address matches license number) or is this closer to a surface-level check of subtle visual indicators present on a license? Watermarks, holograms, etc.

Curious if the core asset here is a well-trained CV model, or if there's a data/partnership moat as well.

Thanks and great question! Our patent-pending fake ID detection provides an additional layer of fraud prevention on top of the surface-level checks of visual indicators that are typical in online ID checks. This gets closer to a DMV database check without the high cost (several dollars) of checking against motor vehicle records.

Gotcha, thanks! When you say "provides an additional layer of fraud prevention", you mean you're verifying against some external service like Checkr or something?

If not, very curious how you solve for false negatives in your KYC. That itself is a meaty problem domain. I remember when Coinbase was scaling, tons of folks were complaining about being unable to access funds because they were told that they weren't providing proper identifying information, even when they, in fact, were.

If it's easy to check the database and they don't, wouldn't that be negligent?

"Your Honor, Mister Evil here laundered $270Million of drug money through our service, but there was no way for us to know that when he signed up with the false name 'Mr Boobyface', and we didn't want to pay $2 to check if the name was valid. We're an innocent bystander here!"

Nice! ID checks are so prevalent that I really wouldn't mind a stripe-esque provider mediating it.

I stumbled on the Indian government's effort to address this problem via https://digilocker.gov.in/. I think they basically store photos of you, your govt IDs (driving license, social security etc), and your records (educational, financial), and digitally verify them with the issuer of those records (in most cases, other govt agencies). They also link it with your mobile phone number and/or personal email (for MFA). Is it fair to say Berbix is doing the same but addressing a global market? Or is Berbix a complementary product, that is, you'd simply build on top of a service like digilocker as requester and/or issuer of verifiable documents [0]?

With that in mind, do you also plan to expand verifiability to other forms of documents, too, other than IDs?

Congratulations on the launch.

[0] https://partners.digitallocker.gov.in/

there are a bunch of them out there already...

identity mind

How is your service different from the KYC service providers out there targeting fintech startups?

Many of the existing KYC providers for fintech startups rely on credit reports to perform identity checks. Our perspective is that relying on credit reports alone is becoming increasingly problematic given the widespread data breaches which expose the underlying data. Bad actors have access to full name, date of birth, address history, and social security numbers for countless individuals.

We believe that going forward, fintech startups are going to have to have increasingly robust KYC programs, which will include ID checks. Berbix is an effective tool to collect and check IDs instantly as part of a robust KYC program. To that end, we’re currently serving multiple customers in the fintech space with such programs.

Every European challenger bank or fintech I've signed up for an account with (Monzo, N26, bunq, TransferWise) has required an ID document check of some kind, sometimes including taking a video, and there have been multiple providers there.

Is your market research limited to the US?

Actually, it has become standard practice where I live to have a government ID check as part of the KYC process for these new fintech companies, usually accompanied by a live photo capture.

Maybe it is not the case in the US? Interesting.

Congrats on the launch.

Couple questions

- Is the verification completely automated?

- Sounds really expensive at $1 per check, why?

Thank you! Super excited to show what we’re working on. Yes, our product is fully automated - our customers can choose to review items that we’ve flagged as problematic, but they can also choose to reject them outright. As for the pricing, we’ve found that our price is competitive in this space, and we can provide significant volume discounts.

Drop me a line at (my first name) @berbix.com and I’ll shoot over a demo link so you can try it!

Berbix looks good.

We need to compare the face on the user's ID card against other user-provided photos to see if they are the same person. Will the 'images.front.face_image' watermark interfere with face analysis?

Twilio's Lookup API can return extra data from various databases. Have you considered doing something similar with Berbix Verify? We would be interested in sex-offender registration status, violent crime conviction status, and home geographic area. Various demographic info could also be useful: education level, employment status, income/asset level, etc.

This looks really interesting. Took a quick glance at the site but didn’t see anything about use case examples. Do you have a hunch about what use cases will be most frequent? E.g. for landlords, selling cars, etc.

If this is going to further the trend of websites requiring me to upload my ID, well then that is bad and you can definitely count me out.

"Pay-as-you-go pricing for growing startups."

$100 / month is NOT "Pay-as-you-go" pricing.

Pay-as-you-go can mean no committed term: stop going, stop paying. If it’s “month to month”, that counts.

Pay-as-you-go doesn’t necessarily mean usage-based pricing.

Love this! We put it on the roadmap as an integration for our company Moonlight in Q1.

Do users typically redirect to Berbix, or is there a hidden API integration?

Our customers typically integrate with us using one of our client-side SDKs (https://docs.berbix.com/docs/client-sdks). Our web SDKs embed an iframe into your website (or spawn a modal) that will take the end user through the image collection flow. Our mobile SDKs get baked into mobile apps so that there’s no additional app to download to complete the end user flow. In either case, as soon as the end user finishes the flow, you can fetch the data about the transaction via our API (or using our backend SDKs).

Who's liable in the case of an "oops", and a fulfilled GDPR data request winds up being a GDPR violation?

Our customers can specify their own risk profile, and in general, customers using us for more sensitive use cases (e.g. returning all the data they possess about individuals) tend to pick our most thorough checks. In addition, we return a number of signals to our customers to help them determine how confident they should be about a given verification. That said, it is ultimately the responsibility of the business that receives the data request to determine whether they're confident enough about the identity of a "data requester" before acting on their request.

The unfortunate reality is that very often, businesses do not do enough due diligence before responding to data requests. [1] Berbix makes such diligence easier and more secure, such that more companies should be able to adopt better processes around data access requests.

[1] "GDPArrrrr: Using Privacy Laws to Steal Identities" https://i.blackhat.com/USA-19/Thursday/us-19-Pavur-GDPArrrrr...

ID checking for age verification seems like a moat for the big few adult sites.

Can this be used for age verification for adult sites?

Looks really cool and I have some ideas and side projects that could use this.

Why is there no demo on your site?

Thanks! You can check out short videos of our mobile [1] and web [2] flows.

If you want to give it a try yourself, you can sign up and try our "test mode" for free!

[1] https://www.youtube.com/watch?v=cX4lQlCWXq0

[2] https://www.youtube.com/watch?v=O3nV0AXbsL8

Very cool. Someday we might need to verify personal training certs
