Hacker News
Ask HN: Selling a database engine as a solo developer?
10 points by senderista 12 months ago | 10 comments
After the database startup I was working at ran out of money, I spent 2 months working full-time on the database kernel and made good progress, taking it in a rather different direction than I could have at the company. I improved single-thread throughput/latency by literally 100x, to the point that I think it's faster than any generally available in-memory key-value database I know of. After some health issues that kept me from working for most of this year, I'm ready to resume work on the project, but I need to somehow make it financially viable; otherwise I'll have to find a "real job" to replenish our savings.

From what I've read on HN and elsewhere, selling tools to developers seems hopeless, so I'd like to target businesses instead. But I have no sales experience, I have terrible people skills, and I'm sure I'd lose all motivation to continue the project if I spent most of my time selling. I've identified about half a dozen small companies (which you've probably never heard of) that sell to the embedded database market. Rather than compete with them, I'm wondering if I should pursue a "white label" reselling arrangement that would allow them (under a non-exclusive license) to resell my product with their branding and sales, and my (presumably indirect) support. Has anyone here tried such a white-label arrangement for selling a database engine, or for any shrink-wrapped software for that matter? (I can only find SaaS examples, where it seems quite popular.)




Could the reason your DB is so fast be that it's missing something in terms of safety/reliability that other databases provide? That would be my first concern on hearing about a 100x performance increase.

Anyway, I don't know the best monetization strategy for your specific niche, but here's a random idea. You need proof that this thing works and is worth using, beyond one random guy saying "Trust me, bro".

Can you find one company willing to take a chance on your technology for free? Or perhaps free for a generous period like 5-10 years?

You'll give them access, give them some support to implement your database in their product, and even put something in your will granting them a license to the source code if you become incapacitated, reducing their risk.

The catch is that if they find your software useful (and I don't know what metrics a contract could use to determine this), they'll cooperate with you on a full-throated endorsement: a complete case study, some social media with you, and a defined number of links or mentions on their website and blog.

Then you have something to point to when selling to other companies: "We did a trial run with Acme and they got these results with our software. Here's the case study showing how effective it was for them."


> Can you find one company willing to take a chance on your technology for free? Or perhaps free for a generous period like 5-10 years?

That's an interesting idea! Do you think startups or established companies would be more receptive to such a proposal?


> Could the reason your DB is so fast be that it's missing something in terms of safety/reliability that other databases provide? That would be my first concern on hearing about a 100x performance increase.

That's a great question! First off, the database does already implement "ACI" from "ACID" (currently uses snapshot isolation but I have a design for serializable isolation, which shouldn't impact transaction latency by much). But the main reason it's so fast is that it eliminates IPC (this is a single-node system, so network latency isn't a factor). With 2 IPC round-trips for begin/commit transaction, it's impossible to get transaction latency below 10-20us. But after eliminating IPC, it's pretty easy to get latency under 1us (I currently have update transaction latency <200ns and I'm not done optimizing, perhaps even <100ns is possible).
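To make the cost of those round trips concrete, here's a toy measurement (illustrative only, not the engine itself): a "commit" as a direct in-process call versus a "commit" that pays one IPC round trip to a server process over pipes. The absolute numbers are dominated by interpreter overhead, but the gap between the two shows why eliminating IPC is the big win:

```python
import os
import time

N = 1000
store = {}

def commit_inproc(k, v):
    # "Transaction" as a direct function call into an in-process engine.
    store[k] = v

t0 = time.perf_counter_ns()
for i in range(N):
    commit_inproc(i, i)
inproc_ns = (time.perf_counter_ns() - t0) / N

# Client/server split: every commit pays one IPC round trip over pipes.
c2s_r, c2s_w = os.pipe()
s2c_r, s2c_w = os.pipe()
if os.fork() == 0:
    for _ in range(N):          # server: read request, ack the commit
        os.read(c2s_r, 1)
        os.write(s2c_w, b"a")
    os._exit(0)

t0 = time.perf_counter_ns()
for _ in range(N):
    os.write(c2s_w, b"c")       # client: send commit, wait for ack
    os.read(s2c_r, 1)
ipc_ns = (time.perf_counter_ns() - t0) / N
os.wait()

print(f"in-process: {inproc_ns:.0f} ns/commit, IPC: {ipc_ns:.0f} ns/commit")
```

Even in this crude form, the IPC path typically lands in the tens of microseconds per round trip while the in-process path stays well under a microsecond, which is the same order-of-magnitude gap described above.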

The elephant in the room here is durability, which is perhaps a more nuanced question in the domains I'm targeting than many HN readers would expect. There are a few points I'd like to make about durability:

1. Full transactional durability cannot be achieved on generally available hardware (even NVMe SSDs) without fatally compromising latency. Durable transaction latency can never be lower than round-trip storage latency, which for an NVMe SSD is at least 10-30us, 1-2 orders of magnitude higher than the sub-microsecond latencies I'm targeting.

2. Full transactional durability isn't necessarily a requirement for many applications that use an in-memory database primarily as a concurrency control mechanism rather than as the system of record for data. Many of these applications require no durability at all (think Software Transactional Memory), and many others only need on-demand checkpoints.

3. Even traditional databases like Postgres are rarely run in fully durable mode (much like they're rarely run in fully serializable mode), because of the inevitable compromise to latency (and throughput, in poorly designed systems).

4. What seems to me to be an acceptable compromise is to offer on-demand checkpoints (with the option to run automatically on system exit), along with asynchronous durable logging that can keep up with throughput and doesn't compromise latency. I'm also considering an API that would designate an individual transaction as a "durability synchronization point", so that once commit_transaction() has returned, the caller knows that that transaction and all previously committed transactions are durable.
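As an aside on point 3: the relaxed-durability knobs in Postgres live in postgresql.conf. The parameter names below are real Postgres settings; the values are just illustrative:

```
# postgresql.conf: trading durability for commit latency
synchronous_commit = off   # commits return before the WAL record reaches disk
wal_writer_delay = 200ms   # background WAL flush cadence
commit_delay = 0           # no artificial group-commit delay
```

With synchronous_commit = off, a crash can lose the last few hundred milliseconds of commits but cannot corrupt the database, which is exactly the kind of tradeoff many deployments quietly accept.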

I'm fairly confident that with careful design (io_uring and O_DIRECT) I can make SSD I/O throughput keep up with in-memory transaction throughput, but I can't be sure until I prototype it.
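A minimal sketch of how the asynchronous logging and the "durability synchronization point" semantics could fit together. To be clear, this is my toy illustration of the proposed API, not the actual design: a real implementation would use io_uring with O_DIRECT and group commit, not a Python thread calling fsync():

```python
import os
import queue
import tempfile
import threading

class AsyncLog:
    """Toy write-ahead log: commit() returns immediately by default; pass
    durable=True to block until this record and all earlier ones are on disk
    (the "durability synchronization point" idea)."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.q = queue.Queue()
        self.cv = threading.Condition()
        self.next_lsn = 0      # last LSN handed out
        self.durable_lsn = 0   # highest LSN known to be on disk
        threading.Thread(target=self._writer, daemon=True).start()

    def _writer(self):
        while True:
            lsn, record = self.q.get()
            os.write(self.fd, record)
            os.fsync(self.fd)  # a real engine would batch this (group commit)
            with self.cv:
                self.durable_lsn = lsn
                self.cv.notify_all()

    def commit(self, record: bytes, durable: bool = False) -> int:
        with self.cv:          # assign LSNs and enqueue in a single order
            self.next_lsn += 1
            lsn = self.next_lsn
            self.q.put((lsn, record))
        if durable:
            with self.cv:
                self.cv.wait_for(lambda: self.durable_lsn >= lsn)
        return lsn

# Usage: two asynchronous commits, then one durable synchronization point.
path = os.path.join(tempfile.mkdtemp(), "log.bin")
log = AsyncLog(path)
log.commit(b"txn-1\n")
log.commit(b"txn-2\n")
log.commit(b"txn-3\n", durable=True)  # returns only once all three are on disk
print(os.path.getsize(path))          # 18: three 6-byte records
```

Because the log queue is FIFO and there's a single writer thread, a durable commit waiting on its own LSN implicitly waits for every earlier transaction too, which matches the semantics described in point 4.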


Fascinating writeup. That said, lacking the D guarantee in ACID might give a lot of potential customers pause, because nobody has time to figure out the nitty-gritty tradeoffs of a database nobody else uses. Even if you're right, it's a bit of a hard sell. I'd work on refining your explanation so it's even easier to digest, possibly with some pretty graphics or video.

Perhaps figuring out which industry/company has a specific performance bottleneck and pitching your database to them might be the right move.


Why would any company take a risk on adopting a single-person passion-project DBMS left over from a failed startup?

The best you can hope for is to make a logo, slap it on Github, and then I'll add it to my list:

https://dbdb.io/browse?tag=failed-company


It's already on there LOL


I am going to guess Helium?


Nope. I don't feel comfortable disclosing which one, for various reasons, but there are enough breadcrumbs for anyone who cares to guess without much effort.


I can't speak to your specific questions, sorry. But maybe you had some contact with customers/prospects of the database startup you worked at?


We had only one customer who actually deployed the platform in a prototype. Their feedback was generally positive, but they seemed wary of investing in a platform without a solid company behind it, which I understand. Open-sourcing is something of a risk mitigation, but likely not enough when the only person who understands the codebase is me. This will undoubtedly be a major hurdle in sales conversations, and I'm not sure how to address it, since I'm not interested in collaborators at this point.



