Show HN: Open-source user onboarding and KYC flow made with Svelte (~50kb) (github.com/ballerine-io)
254 points by alonp99 on Oct 30, 2022 | 69 comments
Hi everyone,

We’re building an open-source identity and risk management platform, and we’ve just released the first chunk of code, a fully customizable KYC flow & UI, to the public. We’ve chosen to use Svelte so our flows would be lightweight (it is ~50kb gzipped).

Next up:

- Adding forms and components so it can be used as a full onboarding flow.

- Releasing an open-source case management dashboard, for manual approval of users.

- Releasing an open-source rule engine, to help automate decisions.

We’d love to hear your feedback, suggestions, or any questions you’ve got. And if the rest of the project is relevant or interesting to you, follow us and we’ll update you once new things are available.

Thanks!



This looks great!

A somewhat tangential question: what are the ways KYC can be made less vulnerable to identity theft? Do people who verify the uploaded document use some automated government services to check for stolen documents?

It seems like a process like the one in this flow (upload a document and a selfie) is useless if a document is stolen (since in most cases you can just look up the person's social account and download their selfie). And even worse, it would give a badge of authenticity to the scammer.

And if the backend people do use some government service to verify the document, then what is the value of submitting a selfie?


Hey, thanks for the comment.

It's true, a document and a selfie are not enough in some cases. There are a couple of techniques for making a better guess at whether the selfie was live, but they are not as good as liveness checks.

We have already started working on a liveness step with a customizable challenge, meaning developers can configure what action the user should perform in this step (like turning the head in a specific direction, performing hand gestures, and more).

The mobile SDKs will have more sophisticated tools to detect fraud; we will write about it soon.
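
For anyone wondering what a developer-configurable challenge step could look like, here is a minimal sketch in Python. It is not Ballerine's SDK: `ChallengeConfig`, `pick_challenge`, and the action names are all hypothetical, and it only covers picking the actions, not detecting them on camera.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ChallengeConfig:
    """Hypothetical developer-facing config for a challenge step."""
    allowed_actions: tuple[str, ...]  # actions the developer has enabled
    steps: int = 2                    # how many actions to ask the user for
    timeout_seconds: int = 10         # time allowed per action


def pick_challenge(config: ChallengeConfig, rng: random.Random) -> list[str]:
    """Pick a random sequence of distinct actions from the enabled ones."""
    if config.steps > len(config.allowed_actions):
        raise ValueError("steps cannot exceed the number of allowed actions")
    return rng.sample(list(config.allowed_actions), k=config.steps)


# Assumed action names, purely for illustration.
config = ChallengeConfig(
    allowed_actions=("turn_head_left", "turn_head_right", "raise_hand", "smile")
)
challenge = pick_challenge(config, random.SystemRandom())
```

Choosing the sequence at request time, rather than shipping a fixed one, is what makes a pre-recorded clip less likely to match.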


That sounds good - thanks for clarifying. Excited to see this project grow!


It's not what they're doing here, but you should be able to read the biometric chip using a phone and verify the data that it contains server-side (since it's signed). Not sure how easy it is to get hold of the public keys though.

That would also be a nice feature if it could be implemented here; it might be possible with Web NFC :)


Many services do some sort of “liveness check” to verify there’s a live human interacting with the webcam for the selfie.


Hm, do they stream user's camera feed to their server? Seems like a questionable practice if the user is not made aware of that.


I've seen cool computer vision demos that run client-side in the browser, so I hope they don't stream the feed. OTOH if they do stream, warning the user would go a long way.


If you send only the result, it’s easy to spoof.


Good point!


> since in most cases you can just look up the person's social account and download their selfie

Are these reviewed by a human? If so, why not 'just?' require the selfie to contain the document, some written nonce, and/or a weird body position, like a flag semaphore position?
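
A rough sketch of the written-nonce idea, in Python. It assumes the server hands out a short code, the user writes it on paper and holds it up in the selfie, and a reviewer types in what they read. The function names, the 6-digit format, and the 5-minute window are assumptions for illustration, not anything from the project.

```python
import hmac
import secrets
import time
from typing import Optional

NONCE_TTL_SECONDS = 300  # assumption: the selfie must be taken within 5 minutes

# session_id -> (nonce, issued_at); in-memory only, for the sake of the sketch
_issued: dict[str, tuple[str, float]] = {}


def issue_nonce(session_id: str, now: Optional[float] = None) -> str:
    """Create a short, human-writable code tied to one verification session."""
    nonce = f"{secrets.randbelow(10**6):06d}"
    _issued[session_id] = (nonce, time.time() if now is None else now)
    return nonce


def check_nonce(session_id: str, transcribed: str, now: Optional[float] = None) -> bool:
    """Check the code a reviewer read off the selfie. Each nonce is single-use."""
    entry = _issued.pop(session_id, None)  # removed on first check, pass or fail
    if entry is None:
        return False
    nonce, issued_at = entry
    current = time.time() if now is None else now
    if current - issued_at > NONCE_TTL_SECONDS:
        return False
    return hmac.compare_digest(nonce.encode(), transcribed.strip().encode())
```

Because the code is generated per session and expires, a selfie downloaded from a social account cannot contain it.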


I like the idea of asking to hold the same document in the selfie. Not a foolproof solution but way better than asking for any non-specific selfie.


Superb! This is excellent work. KYC is an important area for many applications. And Svelte rocks.

In my direct experience, many coding teams have a challenging time working with KYC because there can be so many edge cases, and the flows sometimes must deliberately block immediate onboarding.

Metrics for KYC are often probabilistic, such as involving total lifetime value (TLV) versus risk assessment. This tends to make first-time user experience (FTUX) harder for UX designers to reason about, thus harder for developers to implement well.

Great project. Thank you for sharing this.


> KYC is an important area for many applications.

Agreed. Superb for malicious hacks and information leaks.


It's more productive to direct criticism at the legal systems that necessitate KYC rather than the technology that allows compliance. Given that Hacker News is a technology-oriented forum, readers must be forgiven for interpreting your comment as criticism of the referenced project, rather than the policy that gives rise to its necessity.

With that said, KYC is a prerequisite to doing business in a number of fields, and I applaud any efforts toward an open source and auditable implementation.


It is malicious technology just like AI surveillance. And just like AI surveillance, being enforced by regulation doesn't make it moral. It _will_ eventually be hacked and you may even endanger the lives of your customers by leaking financial information paired with physical addresses.

For whatever reason, harvesting personal data like big tech does is bad, but if you slap on a KYC sticker and say it's for fighting terrorism, it's all good.


It's not for fighting terrorism; it's for mitigating fraud, which is rife in finance.


KYC and more broadly AML efforts are meant to detect and prevent both profit-motivated crime like fraud and terrorism. It's a shame that a 2011 report by the UN showed only 0.2% success rate (criminal enterprises kept 99.8% of illicit earnings) after two decades of the stuff. Things haven't exactly gotten better since then, with as little as 0.1% successfully confiscated and the proportion attributable to money laundering regulations possibly as low as 0.02% (https://www.emerald.com/insight/content/doi/10.1108/JMLC-01-...). It's possibly the least effective policy experiment of all time.

The whole endeavor might be so entrenched now that it's "sovereign-complete" and will require complete regime change to really change it, much like other things, but from more optimistic views there's probably already enough people working at the layer of law to try and make things better (here's a random congressional testimony from 2017 https://www.judiciary.senate.gov/imo/media/doc/Cassara%20Tes... "Modernizing AML Laws to Combat Money Laundering and Terrorist Financing"). People working at the layer of tech might help best not by making the most streamlined implementations of everything possible, though open source is certainly better than not, but by making the experience as brutally inconvenient and complaint-generating (directed/redirected at law makers) as possible, doing the minimum necessary to follow the laws and regulations as currently written to minimize their harms, since their gains are pretty much nothing. As they currently tend to exist, KYC setups end up being more useful for crime than against it.


Regardless. Keeping CID financial data is like handling nuclear toxic waste with a plastic bucket. It's not going to end well for you or the customer. Find a better way.


Does KYC help to prosecute Microsoft support scammers?

KYC's purpose is to prevent you from transferring money anonymously. However there still are methods to do this (for example, gift cards or betting on sports).


I made a few flippant and really vague remarks at my bank and I regretted it later, when I realized they were genuinely trying to KYC and not just be nosy in my personal life.


This comment is just optimizing for outrage and contextless criticism without any informative aspect


Thanks! That was part of our motivation behind it. You can read this blog post about KYC UX, based on our experience previously working together at a neobank, by Nitzan, who is leading Product on this project: https://vaulted-law-a70.notion.site/Creating-the-perfect-KYC...


Not to detract from the project itself, but neither this HN post nor the GitHub repo ever defines any of the initialisms in use. Being slightly unfamiliar with what KYC/KYB even stand for, I had to guess and search all around the docs for clues that my guess was correct. Might be worthwhile to add a single expansion somewhere.


Thanks, yeah you’re right, we’ll add it to our GitHub readme page. And for the rest, I’ll just add here that KYC/B stands for “Know your customer/business” and it is common in the onboarding process of financial services, healthcare, gambling, education, etc. in which a company is gathering information/documents and verifying who they give service to.


KYC is such a recognized term that it has surpassed its spelled out meaning, like radar or CAPTCHA.


I didn't know what KYC meant until I worked in fintech.


If you have to ask.


Yeah, we can't have people trying to learn here, this is a site for hackers! (/s)


Just a heads-up: On desktop, in all of the demos, I get stuck on the "Upload ID" step. Clicking on any of the buttons triggers the following error in the console:

Firefox: > Uncaught (in promise) DOMException: The object can not be found here.

Chromium: > Uncaught (in promise) DOMException: Requested device not found

Both at DocumentOptions.svelte:34

It tries to open the camera on mobile.

This should probably offer the option to upload a file, but at minimum it should handle the error.


Thanks! We are on it!


AML - Anti-Money Laundering

CFT - Combating the Financing of Terrorism

KYC - Know Your Customer

KYB - Know Your Business

https://complyadvantage.com/insights/what-is-know-your-busin...

I really wish the repo / docs spelled this out. Even if it is broadly used, there is always someone who hasn't seen an acronym or initialism before.


Seems like one of those things where if you need this flow you know what those mean.


Maybe you don't know you need the flow and seeing a description makes you go holy shit, I gotta rethink some things.


It's a surprisingly well-built piece of open source software.

I should add that I absolutely hate KYC practices and all the evil it brings to the world. I sincerely hope this will not succeed and not gain any widespread adoption.

Having said that, I must also admit that this looks like a top-notch open source project to me. I like the live demos, the well-built UI/UX with tons of attention to detail, the way it's documented, and the code. It's so nicely done!


Thanks! I guess…

One of the things we’d like to achieve with it being open source is actually a community that will contribute and collaborate on new practices that are best for both the companies and their customers, and less so for criminals.


It feels to me that the community to collaborate on something that is wanted by no one, costs money for no reason and is imposed on the world's businesses by a government of a single particular country ... is quite small.


It would be useful if KYC and KYB were expanded once in the readme. I can Google them, of course, but for semi familiar readers having one expanded use towards their first use can help prevent disruptions in reading.


As far as I know, it is Know Your Business (KYB) and Know Your Customer (KYC). I face the same struggle as you and I decided to maintain a list of acronyms: https://github.com/d-edge/foss-acronyms#acronyms


Awesome, thanks for sharing and open sourcing this!

It seems however the UI components are still in a private repository: https://github.com/ballerine-io/ballerine/tree/main/packages

Is it normal?


They are all migrated into “web-sdk”

We will remove ui-components for now to avoid confusion, thanks for the heads up


What kind of automation are you using? For the most part I'd rather enter my SSN and answer a few questions about my background before uploading a bunch of government documents. There is also likely to be increasing regulation around having users upload some of those documents that your app won't be able to handle.


Hey, thanks for the comment. You’re right, eKYC is definitely the future of identity verification, and after building the basic identity stack, we plan to add eKYC methods like verification using only phone numbers and ID numbers. eKYC is not yet a globally supported solution, though.

Part of our vision is to standardize this data to allow companies to partner and “vouch” for verified users - so users won’t need to send their documents all over and by doing so we may reduce leakage exposure.

About the automation part: the flows are vendor agnostic and can be connected to any vendor. We are in the process of open-sourcing a backend where you can orchestrate IDV, Risk, Fraud, Document classification, and OCR vendors.


Hilariously, it is because of increasing regulation that users will need to upload more sensitive data to more services.

Even requesting to delete your data in the EU requires the submission of sensitive data in order to verify your identity and fulfill the request.


Of course it is. They regulate what information you have to collect and then how you're supposed to collect it, while trying to make sure you don't collect too much or store it in the wrong way and that you show the correct messages on your website.


> EU requires the submission of sensitive data in order to verify your identity

The EU created eIDAS to enable people to authenticate without submission of sensitive data by using digital signatures, based on public key cryptography, using an id card with an embedded hardware security module and a pin. Those are strong factors: "what you have" is not easily copied and "what you know" may have low entropy, but the embedded hsm has anti-hammering. When interacting with companies in a kyc flow, or to claim rights under gdpr, the people sign a statement of purpose and time, creating data that can not be reused to authenticate as them at a later time for a different purpose.

But no one implements that. Instead companies implement the worst possible authentication method from the set of allowed methods, the bottom of the barrel solution that is only still legal in the EU due to the industry lobbying for backwards compatibility with existing manual workflows from the age of snail mail: uploading photos of identity documents and smiling into a webcam. Those are weak factors: "what you have" and "who you are" using a webcam to scan documents and biometry are vulnerable to deepfakes. But most importantly this method lacks an inherent protection against reuse. This leaves the customers vulnerable to identity theft should their data ever be stolen.

In the worst case the collected data is stored raw, without any mitigation like timestamp and purpose watermarks, or those marks are easily removed. In the worst-worst case the data is also accessible by anyone with even the most far fetched claim to need to know, without rate limiting or misuse detection, so that phishing any internal account is enough to put all this sensitive data on sale at a darknet market.

I do not agree that the EU requires that. It allows it and failed to require that companies offer at least one better method as well.
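
To make the reuse-protection point concrete, here is a small Python sketch of a statement bound to a purpose and a time. Big caveat: eIDAS-style signing uses a public-key signature produced on the card, and Python's standard library has no asymmetric signing, so this swaps in an HMAC with a demo key purely to show the binding. `sign_statement`, `verify_statement`, the purpose strings, and the 10-minute window are all hypothetical.

```python
import hashlib
import hmac
import json

# Stand-in for the private key held inside the ID card (assumption for the demo).
DEMO_KEY = b"demo-key-not-a-real-card-key"


def sign_statement(subject: str, purpose: str, issued_at: int, key: bytes = DEMO_KEY) -> dict:
    """Produce a tagged statement saying who, for what purpose, and when."""
    statement = {"subject": subject, "purpose": purpose, "issued_at": issued_at}
    payload = json.dumps(statement, sort_keys=True).encode()
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"statement": statement, "tag": tag}


def verify_statement(signed: dict, expected_purpose: str, now: int,
                     max_age_seconds: int = 600, key: bytes = DEMO_KEY) -> bool:
    """Accept only an untampered statement for this purpose that is still fresh."""
    payload = json.dumps(signed["statement"], sort_keys=True).encode()
    expected_tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected_tag, signed["tag"]):
        return False  # statement was altered after signing
    statement = signed["statement"]
    if statement["purpose"] != expected_purpose:
        return False  # signed for a different service or purpose
    age = now - statement["issued_at"]
    return 0 <= age <= max_age_seconds  # too old (or from the future) is rejected
```

A leaked copy of such a statement is of little use elsewhere: it fails the purpose check at any other service and the freshness check at any later time, which is exactly what a photo of an ID lacks.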


The company I work for, Onfido, has recently released something similar to this: https://onfido.com/solutions/studio/

Hosting this yourself is one thing, but there are big advantages to outsourcing it.


We used Onfido for quite a while, we think you guys are great. I think the solutions you're comparing here are different in more ways than where it is hosted.

For example, this project is about not being tied to a specific vendor; all modules are completely vendor agnostic.

Let’s say that you start with one vendor and, after a while, you are either not satisfied with the results and want to switch vendors without changing your internal operations workflow, or you have a new geography to cover and need another vendor for it that fits the rest of the tools you are using. With this project you can use the same tools to send your documents to whichever vendor you want.

We built it for customization from the ground up, and since it's open source, that customization is at the code level. We expect to see UI packs, flow and vendor compositing, liveness challenges, etc. all developed in the community and for the community to use.
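
As a rough illustration of what "vendor agnostic" can mean in code, here is a minimal adapter sketch in Python. It is not the project's actual interface: `IdvVendor`, `VerificationResult`, the two fake vendors, and the country routing are hypothetical, and the vendor classes use placeholder checks where a real adapter would call a vendor API.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class VerificationResult:
    """Standardized result shape shared by every vendor adapter."""
    vendor: str
    approved: bool
    reason: str


class IdvVendor(Protocol):
    """Anything with a name and a verify_document method can be plugged in."""
    name: str

    def verify_document(self, document: bytes) -> VerificationResult: ...


class VendorA:
    name = "vendor-a"

    def verify_document(self, document: bytes) -> VerificationResult:
        ok = len(document) > 0  # placeholder check, not a real verification
        return VerificationResult(self.name, ok, "non-empty upload" if ok else "empty upload")


class VendorB:
    name = "vendor-b"

    def verify_document(self, document: bytes) -> VerificationResult:
        ok = len(document) >= 4  # placeholder check, not a real verification
        return VerificationResult(self.name, ok, "looks complete" if ok else "too short")


def verify(document: bytes, country: str,
           routes: dict[str, IdvVendor], default: IdvVendor) -> VerificationResult:
    """Route by country; callers only ever see VerificationResult."""
    vendor = routes.get(country, default)
    return vendor.verify_document(document)
```

Switching vendors, or adding one for a new geography, then means changing the `routes` mapping rather than the code that consumes the results.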


> there are big advantages to outsourcing it

I'm not sure I disagree, but you should elaborate. I don't see pricing anywhere on that page, so it's impossible to decide if the advantages outweigh the costs anyway.


If your chief risk officer is any good, they will start raising red flags all over if you suggest in-housing KYC fully.

You suddenly have to think about stuff like:

- where are you storing the passport data? Who has access to it? How is it processed?

- what’s the process in case of a data breach, what regulators must be notified

- what data sources are you querying? Sanctions lists, etc. how often are they updated?

- do you need to support one country only, or multiple? Each country will have their own quirks in the process

That’s just off the top of my head.


Again, good points! This is exactly what we want to take off the dev/CTO's shoulders when going into the CRO’s office. We might need to do a better job communicating what the project is about.

- You can choose where to store the data: you can either keep it in the vendor's hands, if that's more comfortable for the CRO, or use your own AWS and GCS for specific geographies (e.g. GDPR). As for who has access to the data, we have RBAC built into the back office. We’ve seen that most companies actually build their own back office, since identity verification providers usually won't give them something operational for manual approvals, so getting one that is built with best practices in mind and maintained by a community is a big benefit from a security perspective.

- Each company that deals with such processes has its own regulatory framework, and it is obliged to assign personnel to build and maintain a policy and make sure the company acts according to it.

- You can orchestrate that entirely, using whatever data source you need, by using an existing integration or adding a new one.

- This is a big plus for having a community that is spread across multiple geographies and can help build local solutions that will make the system global (stuff like which data you need to collect, how long data should be stored, in what area it should be stored, and which vendors work best with specific data).


It makes sense, I didn’t really mean my comment as a dismissal of your product btw; I understand that you’re trying to build a set of tools that cover the whole range of what a company might need to do KYC — you’re not really competing with onfido in that sense, as the next onfido could use your product as a building block (or acquire you).


Depending on a 3rd party for your customer data seems like a really big risk, no?


Hey, as someone who’s had to implement this in the past: no.

If anything, outsourcing your KYC makes a lot of sense for most companies. You reduce risk as vendors like onfido have spent literally hundreds of millions building their platform (and would already use a number of public and private data sources to prevent fraud)

If you’re building a bank, it might make sense to in-house it, otherwise it’s just exposing yourself to increased fraud rates and potential data breaches.


Very much this. Same reason it's nice to use Stripe so you're not the one holding your customer's credit cards. Holding anyone's passport photo sounds horrible.


Solid points. From what we’ve seen, it differs between companies.

Each company has its own preferences: some find it less optimal to send IDs to a 3rd party, and some don’t want to host it themselves, and this is why we built it to be entirely modular. It means you can either use a vendor in the backend (for fraud, file storage, AML, or anything else) or do stuff on your own. We want to make sure the underlying infrastructure won't break when you mix and match those, and we do that by standardizing the data and how all modules talk with each other.

From what we’ve seen it really depends on what service the company is giving, who are the customers, and sometimes the size of the company.


Awesome work, great idea to open-source this !!


This looks great, congrats on the launch!

What's the story around user privacy regulations? The intersection between KYC regulations and things like GDPR is a very hard problem, because the former requires collection and storage, while the latter requires deletion, editing, updating, export, and general user control.

Does Ballerine do anything to help this process?


Thanks! We are trying to take that stuff away from developers and have it built into the infrastructure, such as built-in RBAC, access logs in the back office, and integrations with S3 and GCS that are controllable at a per-user level.
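
For readers unfamiliar with the pattern, a minimal Python sketch of role-based access control with an access log might look like the following. The roles, permissions, and `BackOffice` class are hypothetical and are not taken from the project.

```python
from dataclasses import dataclass, field

# Assumed role-to-permission mapping, for illustration only.
ROLE_PERMISSIONS: dict[str, set[str]] = {
    "reviewer": {"view_document"},
    "admin": {"view_document", "delete_document"},
    "support": set(),
}


@dataclass
class BackOffice:
    # Each entry: (user, action, resource, allowed)
    access_log: list[tuple[str, str, str, bool]] = field(default_factory=list)

    def authorize(self, user: str, role: str, action: str, resource: str) -> bool:
        """Check the role's permissions and record the attempt either way."""
        allowed = action in ROLE_PERMISSIONS.get(role, set())
        self.access_log.append((user, action, resource, allowed))
        return allowed
```

Logging denied attempts as well as granted ones is what makes the log useful later, for example when investigating a suspected phished account.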


I think the sort of thing that would really help would be the ability to write data retention policies into the flow and have them enforced by the system.

For example, when adding a step that says the user needs to upload Photo ID, requiring the developer to set a retention policy on that, like 30 days, or minimum of 90 days and the time it's used, or retaining until the account is closed. Then automatically deleting after that point.

Being able to also generate a report of all of these policies, simplified down together, would be really handy. If you're updating a privacy policy you probably want to tell your lawyers how you handle data, and having the tool provide a summary of all the policies would be useful.

This only addresses the deletion and retention part. There's more around export and update, as users have the right to correct any data you hold about them if they believe it is incorrect or out of date. That could actually re-trigger the flows too!
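
To sketch what I mean (Python, hypothetical names throughout, and covering only two of the policy shapes above): each step must declare exactly one retention rule, the system can compute when the data becomes deletable, and a plain-text summary can be generated for the lawyers.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class RetentionPolicy:
    step: str
    days_after_upload: Optional[int] = None          # e.g. 30
    days_after_account_closed: Optional[int] = None  # e.g. 0 = delete at closure

    def __post_init__(self) -> None:
        # Force the developer to pick exactly one rule for every step.
        if (self.days_after_upload is None) == (self.days_after_account_closed is None):
            raise ValueError(f"{self.step}: set exactly one retention rule")


def delete_on(policy: RetentionPolicy, uploaded: date,
              account_closed: Optional[date]) -> Optional[date]:
    """Return the date the data becomes deletable, or None if it must be kept for now."""
    if policy.days_after_upload is not None:
        return uploaded + timedelta(days=policy.days_after_upload)
    if account_closed is None:
        return None  # account still open, keep the data
    return account_closed + timedelta(days=policy.days_after_account_closed)


def policy_report(policies: list[RetentionPolicy]) -> str:
    """One line per step, suitable for pasting into a privacy review."""
    lines = []
    for p in policies:
        if p.days_after_upload is not None:
            lines.append(f"{p.step}: delete {p.days_after_upload} days after upload")
        else:
            lines.append(f"{p.step}: delete {p.days_after_account_closed} days after account closure")
    return "\n".join(lines)
```

A scheduled job could then compare `delete_on(...)` with today's date and delete whatever is due.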


That's amazing feedback, Dan. I would love to have a short call to pick your brain if you're up for it. I'll reach out on LinkedIn.


Looks great! Way to go! Thanks for open sourcing it


Very cool stuff - how do you plan monetising it?


We are now focused on giving a complete solution for identity and risk management that is open-source and free.

For monetization, we have a few methods: 1. Aside from the ability to bring your own vendor and use us for free, we do offer a fast way to start verifying users by using vendors we have commercial agreements and pre-set integrations with. 2. Enterprise features common in other OSS companies.


This makes tons of sense. A solid KYC solution typically includes having a human in the loop at some point in the process. In Fintech, this can even include a zoom conversation to help with risk assessment.


hello,

It looks like some core functionality, like the backend services, is not open-sourced.

For example, 'ballerine-backend'.

Is it true that only the front end is open-sourced for now?


The front end was the first part that we open-sourced, and we are working to migrate the rest. We are planning to make a complete identity and risk stack a commodity.


Congrats and good luck!


Congrats on the launch!


Thanks a lot!



