Hey HN!
I’m Gal, co-founder at Checksum (https://checksum.ai). Checksum is a tool for automatically generating and maintaining end-to-end tests using AI.
I cut my teeth in applied ML in 2016 at a maritime tech company called TSG, based in Israel. When I was there, I worked on a cool product that used machine learning to detect suspicious vessels. Radar data is pretty tough for humans to parse, but a great fit for AI – and it worked very well for detecting smuggling, terrorist activity, and that sort of thing.
In 2021, after a few years working in big tech (Lyft, Google), I joined a YC company, Seer (W21), as CTO. This is where I experienced the unique pain of trying to keep end-to-end tests in a good state. The app was quite featureful, and it was a struggle to get and maintain good test coverage.
Like the suspicious vessel problem I had previously encountered, building and maintaining E2E tests had all the hallmarks of a problem where machines could outperform humans. Also, in early user interviews, it became clear that this problem wasn’t one that just went away as organizations grew past the startup phase, but one that got even more tangled up and unpleasant.
We’ve been building the product for a little over a year now, and it’s been interesting to learn that some problems were surprisingly easy, and others unusually tough. To get the data we need to train our models, we use the same underlying technology that tools like Fullstory and Hotjar use, and it works quite well. Also, we’re able to get good tests from relatively few user sessions (in most cases, fewer than 200 sessions).
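To make that concrete, here’s a loose sketch of the kind of client-side event capture that session-replay tools rely on. This is illustrative TypeScript, not our actual recorder – the real thing captures far more (DOM snapshots, mutations, network timing, and so on) and computes much more robust selectors:

    // Illustrative only: a bare-bones recorder for the kinds of user
    // events session-replay tools collect.
    type RecordedEvent = {
      type: string;       // e.g. "click" or "input"
      selector: string;   // locator for the target element
      value?: string;     // input value, when relevant
      timestamp: number;
    };

    const events: RecordedEvent[] = [];

    function cssPath(el: Element): string {
      // Grossly simplified selector logic for illustration.
      return el.tagName.toLowerCase() + (el.id ? "#" + el.id : "");
    }

    document.addEventListener("click", (e) => {
      events.push({
        type: "click",
        selector: cssPath(e.target as Element),
        timestamp: Date.now(),
      });
    }, true);

    document.addEventListener("input", (e) => {
      const el = e.target as HTMLInputElement;
      events.push({
        type: "input",
        selector: cssPath(el),
        value: el.value,
        timestamp: Date.now(),
      });
    }, true);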
Right now, the models are really good at improving test coverage for featureful web apps that don’t have much coverage (i.e., generating and maintaining a bunch of new tests), but making existing tests better has been a tougher nut to crack. We don’t have much of a place yet in organizations where test coverage is high but test quality is medium-to-poor, but we’re keen to develop in that direction.
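For a sense of what the output looks like, here’s a hypothetical generated test. The flow, selectors, and URLs are all invented for illustration, and I’m using Playwright syntax just as an example of the kind of standard framework such tests target:

    // Hypothetical generated test (Playwright syntax; the app, flow,
    // and assertions here are made up for illustration).
    import { test, expect } from "@playwright/test";

    test("user can create and save a report", async ({ page }) => {
      await page.goto("https://app.example.com/login");
      await page.getByLabel("Email").fill("qa@example.com");
      await page.getByLabel("Password").fill(process.env.TEST_PASSWORD!);
      await page.getByRole("button", { name: "Sign in" }).click();

      await page.getByRole("link", { name: "Reports" }).click();
      await page.getByRole("button", { name: "New report" }).click();
      await page.getByLabel("Title").fill("Q3 summary");
      await page.getByRole("button", { name: "Save" }).click();

      await expect(page.getByText("Report saved")).toBeVisible();
    });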
We’re still early, and spend basically all of our time working with a small handful of design partners (mostly medium-sized startups struggling with test coverage), but it felt like time to share with the HN community.
Thanks so much, happy to answer any questions, and excited to hear your thoughts!
-- I think the name doesn't sell me (or most people, probably), because "checksum" reads as a security/crypto term. When I saw the HN post say Checksum, I thought it was going to be some crypto thing, not something about end-to-end tests. Maybe a name like "Tested" or "Covered" would click better with potential customers.
-- The demo video doesn't leave me feeling like I understand what the product is doing (though I could also be misunderstanding the product). It might help more if the demo showed the following (ideally in less than 5-10 seconds, or most users might tune out):
1. A quick setup step for Checksum
2. A set of generated tests
3. Passing tests
Seeing those steps would give me, as an end user, the feeling of "wow, this must be something I can quickly set up that will give me test coverage out of the box."