By analogy, any software project that includes writing a database is, de facto, a database project.
It's like a web framework. Coding one when CGI came out was way less work than now. If you want to match what Django or Rails does now, it takes years.
I read somewhere it would cost millions if you tried to hire devs to recode Django at today's market rates.
I follow a project called Handmade Hero by a longtime game and game tooling programmer where he codes a complete game from scratch with absolutely zero dependencies except Windows. It's an excellent educational resource and I've learned a lot about how games work, including game engines such as Unreal or Unity. This has enhanced my knowledge of how I might better use those engines and I feel that it's been a valuable use of my time.
In practice the index overhead per packet was only 2-3 bits. This was accomplished by lossy indexes, using hashes of just the right size to minimise false hits. The trade-off is that an occasional extra lookup is worth the vastly reduced size of the compressed indexes.
To this day, I'm not aware of general-purpose, lossy, write-once hashtables that get close to such low overhead.
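For a rough idea of the mechanism (a toy sketch, not my production code; the fingerprint width, the hash mixer and the one-byte-per-slot layout are all placeholder choices):

    /* Lossy, write-once index: store only a few bits of hash ("fingerprint")
     * per slot instead of the full key.  A fingerprint match may be a false
     * hit, so the caller verifies against the underlying data; a mismatch
     * definitely means "not here".  fp[] must be zero-initialized (calloc). */
    #include <stdint.h>
    #include <stddef.h>

    #define FP_BITS  3
    #define FP_MASK  ((1u << FP_BITS) - 1)

    typedef struct {
        uint8_t *fp;       /* one byte per slot for clarity; packing the      */
        size_t   nslots;   /* fingerprints would get closer to 2-3 bits/entry */
    } lossy_index;

    static uint32_t mix64(uint64_t key)
    {
        key ^= key >> 33;
        key *= 0xff51afd7ed558ccdULL;
        key ^= key >> 33;
        return (uint32_t)key;
    }

    static void lossy_put(lossy_index *ix, uint64_t key)
    {
        uint32_t h = mix64(key);
        ix->fp[h % ix->nslots] = (uint8_t)(((h >> 24) & FP_MASK) | 0x80u);  /* 0x80 = occupied */
    }

    /* 1 = key *might* be present (go verify), 0 = definitely absent. */
    static int lossy_maybe_has(const lossy_index *ix, uint64_t key)
    {
        uint32_t h = mix64(key);
        uint8_t slot = ix->fp[h % ix->nslots];
        return (slot & 0x80u) && ((slot & FP_MASK) == ((h >> 24) & FP_MASK));
    }

The trick is sizing the fingerprint and the table so the cost of the occasional false hit stays below the savings from the tiny index.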
Competitors would use MySQL and insert a row per packet. The per-row overhead alone was more than my entire index. But it worked out for them: just toss 50k of hardware at it.
But... it does eat up a lot of engineering time to write such bespoke software. Just compressing the hashes (a common info retrieval problem) is a huge area, now with SIMD-optimised algorithms and everything.
I used this same library to encode telephone porting (LNP) instructions. That is a database of about 600M entries, mapping one phone number to another. With a bit of manipulation when creating the file, you go from a 12GB+ naive encoding as strings (one client was using nearly 50GB after expanding it into a hashtable) to under a GB. Still better than any RDBMS can do, and small enough to easily toss in-RAM on every routing box.
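For a rough idea of how that kind of shrinkage is possible (an illustrative sketch only, not the actual file format; varint_put and encode_sorted are made-up names): treat the phone numbers as sorted integers and store only the varint-encoded gap to the previous one, so most entries cost a few bytes instead of a 12+ byte string; the mapped-to number can be delta-encoded the same way.

    #include <stdint.h>
    #include <stddef.h>

    /* Encode one unsigned integer as a LEB128-style varint; returns bytes written. */
    static size_t varint_put(uint8_t *out, uint64_t v)
    {
        size_t n = 0;
        while (v >= 0x80) {
            out[n++] = (uint8_t)(v | 0x80);
            v >>= 7;
        }
        out[n++] = (uint8_t)v;
        return n;
    }

    /* Delta-encode a sorted array of phone numbers; returns bytes used in 'out'. */
    static size_t encode_sorted(const uint64_t *nums, size_t count, uint8_t *out)
    {
        uint64_t prev = 0;
        size_t used = 0;
        for (size_t i = 0; i < count; i++) {
            used += varint_put(out + used, nums[i] - prev);  /* store the gap */
            prev = nums[i];
        }
        return used;
    }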
Some day I'd like to write it in Rust and implement vectorized encoding and more compression schemes. Like an optimized SSTable just for integers.
Same thing if you read the Dremel paper. Worrying about bits helps when scaling.
IIRC, when we moved to a highly-customized-Lucene-based system in 2011, we dropped the server count on the cluster from around 400 nodes to around 15.
This is a false dichotomy.
Maybe during the days of the dot-com boom, it was true enough, because scaling a single server "vertically" became cost prohibitive very quickly, especially since truly large machines came only from brand-name vendors. That was, however, a very long time ago.
A naive interpretation of Moore's law implies CPU performance today is in the high hundreds of times what it was back then. Even I/O throughput has improved by a factor somewhere in the mid tens, IIRC. More importantly, cost has come down, too.
The purchase price premium for getting the highest-performance CPUs (and the mobo they need) in a server over the lowest cost-per-performance option is about 3x. Considering that this is, necessarily, a single server, the base for that premium isn't exactly tremendous. The total cost would seem to be on the same order of magnitude as a team of programmers.
Of course, in the instant example, the database was particularly specialized, including what strikes me as a unique feature, a lossy index. I'd expect data integrity to be one of the huge challenges of databases, which, if relaxed, makes writing a custom one a more reasonable proposition.
Or a modest number, on the order of a dozen, for something like read slaves, rather than the multiple dozens, if not hundreds, of the distributed system.
But the gains become more expensive as you move up the scale. So, at least starting down the software path is often very cheap, with many large gains to be had. Similarly, at least looking at the software before you scale to the next level of hardware tends to be a great investment.
It's not about always looking at software; it's about regularly going back to software, rather than treating it as a one-time push.
I'm a bit confused.. are you agreeing or disagreeing? My point was to call out a false dichotomy and offer a third option.
> It's not about always looking at software
Yet that's exactly what happens. Software engineers completely dominate the field, including management, so they always look at software and only software.
- working in very small but, more importantly, constant memory (predictability is a must for a reliable embedded app),
- providing provable transactional characteristics, with a guarantee to roll back to the previous state if a transaction is interrupted, while remaining consistent and able to accept new transactions,
- minimum storage overhead -- the amount of overhead translated directly into the competitiveness of the device at the time, as multiple vendors tried to provide their own solutions for the same preexisting platform,
- a storage write pattern that would provide wear leveling for the underlying flash.
I ended up writing my own database (less than 1k LOC total, including encoding/decoding data using BER-TLV) that would meet all the characteristics, take a few tens of bytes of stack, and add a few bytes of overhead per write. The database would vacuum itself and coalesce record state automatically. It had some O(n^3) algorithms but THAT'S OK since the amount of data could never be so large that it could pose any problems.
The project took 2 years to complete. I spent maybe a month designing, implementing and perfecting the database. I wouldn't say that Akin's law of spacecraft design applies here. I would probably have spent more than that integrating an existing product, and ended up with an inferior product anyway.
> There’s an old adage in software engineering — “never write your own database”. So why did the OneNote team go ahead and write one as part of upgrading OneNote’s local cache implementation?
All those rules work for most but not all projects. It's the same as saying "You shall always obey traffic rules". Maybe I should, but sometimes I may not want to brake on a yellow light when I have a clearly impatient driver tailgating me.
As we gain experience we learn the world is not black and white. Akin's law #8 says:
"In nature, the optimum is almost always in the middle somewhere. Distrust assertions that the optimum is at an extreme point."
You are free to violate any of those on the right in the interest of the ones to the left.
Not correct. Example: a stranger asks, "are you alone at home?" or "what is the keycode for the door?", etc.
“I am not going to answer that” is a true, correct, and appropriate response to both of those questions.
But besides, most kids learn to lie pretty well to circumvent various restrictions - learning from the adults who tell them not to lie but do it themselves.
I am in favor of leading by example. If I am lying, how can I possibly demand that kids do otherwise? When they find out, they lose confidence in me. But when I tell them to only lie in extreme situations to "enemies", and act accordingly myself, they are much more likely to become truthful by default as well.
Some simplifications are useful in some contexts, but the world is an unimaginably complex thing and trying to dumb it down can only take you so far.
Wait so does this truth exist, or is it also a lie on some level?
It's not truth at all. Just an observation. It happens to fit with my perceptions. That's it.
You can artificially define truth by tying it to a particular frame of reference, but that's not "the truth", as that frame of reference is not 100% transferable to others anyway. The idea of umwelt, as I understand it, seems to work here. Still, it's just an observation, an impression.
I'm not really saying anything new here. Descartes was saying something similar quite a long time ago. Then again, he could have meant something else entirely and there's no way for me to know which it is. I just assume my understanding is close enough to the intended meaning. I may well be completely wrong on this.
Basically, there's nothing you can be really sure about, including the fact that you can't be really sure about anything. There are only things that appear to work for you and, possibly, others. You can use them. Just don't believe them unquestioningly.
Me: It is correct to teach kids that they should tell the truth in every scenario, even if as adults we don't.
You: Not correct. [...]
Exceptions exist in every situation for every single thing you say or do or think. Pointing out exceptions doesn't get you anywhere. Every adult knows this. A child's mind doesn't, and depending on the maturity level, cannot comprehend them. There is an established, successful method of instruction: start with simple rules/laws/examples, and then layer complexity later on. All of our systems of education are based on this. Nobody tells a child to also look up for a possible meteor or a skyscraper collapsing, or a vulture about to attack them before crossing the street. We just teach them to look both ways. Programmers are taught to take for granted that a bit in memory retains its value once set. When you're learning how to program you don't need to account for CPU bugs or cosmic radiation.
Me: "It is correct to teach kids that they should tell the truth in every scenario, even if as adults we don't."
"...teach children that truth is optional."
Optional means maybe. I said default is truth. And when I say every, I mean every.
But there are exceptions, in the case of enemies. So if they lie to someone, it means this one is a (temporary) enemy. Which is a serious implication. They do understand that, usually.
They also understand the concept of friend and enemy very early, I bet you agree. (Not that they can always correctly sort it out, but we adults can't always do that either.)
So they very intuitively understand the exception in the case of a meeting with a potentially dangerous person and, I bet, instinctively act accordingly and don't tell them where others are hiding, for example. (Or break down and cry, also a valid strategy.)
If you have a special case where an RDBMS can't fill your need, then you obviously have to build your own. But these cases are so rare that they're the exception that proves the rule.
Once you understand the problem it isn't really difficult to implement it. The trick is to decide what kind of properties you really need and only implement what is absolutely necessary to achieve it.
I did this in ANSI C using some of the stuff I had already implemented for the project. For example, I already had a BER-TLV parser/serializer with very specific properties (managing collections within a buffer of specified size, etc.), so I reused it for the file format and then again on the application layer for the record format.
The basic database is a KV store kept in the form of an append-only log file. The entries are records of modifications performed to the database. The keys and the values are binary and their structure is managed by the application. The application supplies callbacks to perform some operations (for example, given a base version of a record and a patch, calculate the patched version of the record).
The transactions were basically an identifier and a flag (is this entry the end of the transaction?).
All algorithms are very simple and focused on constant use of memory.
For example, the coalescing operation was basically reading the old file and writing live records to a new file. I would traverse the old file entirely for each record (O(n^3)) and, using the callbacks supplied by the application, write the end result of all committed transactions to the new file. After this was done for all records I would switch the files.
Fetching any record meant doing a linear search through all database entries to find all entries related to the record, taking the base version and then successively applying all patches until the entire database file had been traversed. This is inefficient as hell, but the use case was that the database was very rarely actually read. The typical use was to just write data to the database, and it was only read back after a failure or in some very specific background tasks.
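A stripped-down sketch of that fetch path (hypothetical: a simplified fixed header instead of BER-TLV, uncommitted transactions ignored; the patch callback is application-supplied as described):

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    typedef struct {
        uint16_t key_len;
        uint16_t val_len;
        uint8_t  is_base;   /* 1 = base version of the record, 0 = patch */
        uint8_t  txn_end;   /* 1 = this entry ends its transaction       */
    } entry_hdr;

    /* Application callback: apply 'patch' to the record image in 'rec',
     * in place, within a caller-provided fixed-size buffer. */
    typedef int (*apply_patch_fn)(uint8_t *rec, size_t *rec_len, size_t rec_cap,
                                  const uint8_t *patch, size_t patch_len);

    /* Linear scan of the whole log: take the latest base version of the key,
     * then apply every later patch in order.  0 = found, 1 = not found, -1 = error. */
    static int fetch_record(FILE *log, const uint8_t *key, size_t key_len,
                            uint8_t *out, size_t *out_len, size_t out_cap,
                            apply_patch_fn apply)
    {
        entry_hdr h;
        uint8_t kbuf[64], vbuf[256];
        int found = 0;

        rewind(log);
        while (fread(&h, sizeof h, 1, log) == 1) {
            if (h.key_len > sizeof kbuf || h.val_len > sizeof vbuf) return -1;
            if (fread(kbuf, 1, h.key_len, log) != h.key_len) return -1;
            if (fread(vbuf, 1, h.val_len, log) != h.val_len) return -1;

            if (h.key_len != key_len || memcmp(kbuf, key, key_len) != 0)
                continue;                        /* entry belongs to another record */

            if (h.is_base) {                     /* restart from this base version  */
                if (h.val_len > out_cap) return -1;
                memcpy(out, vbuf, h.val_len);
                *out_len = h.val_len;
                found = 1;
            } else if (found) {                  /* patch on top of what we have     */
                if (apply(out, out_len, out_cap, vbuf, h.val_len) != 0) return -1;
            }
        }
        return found ? 0 : 1;
    }

Coalescing is essentially the same loop, run once per live record, with the end result written to the new file instead of returned.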
- it requires an operating system,
- its memory footprint, even in the best case, is way larger than the total memory available on the system,
- it can't work in a constant amount of memory (no dynamic allocations after application startup, compile-time verifiable stack usage),
- it can't provably continue from any power cycle -- there is no guaranteed automated recovery method in case of an unfortunate power cycle,
- its data storage overhead was unacceptable for the application.
The standard build of SQLite uses an OS, but there is a compile-time option to omit the OS dependency. It then falls to the developer to implement about a dozen methods on an object that will read/write from whatever storage system is used by the device. People do this. We know it works. We once had a customer use SQLite as the filesystem on their tiny little machine.
Likewise, the use of malloc() is enabled by default but can be disabled at compile-time. Without malloc(), your application has to provide SQLite a chunk of memory to use at startup. But that is all the memory that SQLite will ever use, guaranteed. Internally, SQLite subdivides and allocates the big chunk of memory, but we have mathematical proof that this can be done without ever encountering a memory allocation error. (Details are too long for this reply, but are covered in the SQLite documentation.)
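In code, that startup handoff looks roughly like this (a sketch; it assumes a build with SQLITE_ENABLE_MEMSYS5, and the buffer size and minimum allocation size here are arbitrary examples):

    #include "sqlite3.h"

    static char heap[512 * 1024];   /* all the memory SQLite will ever use */

    int init_db(sqlite3 **db)
    {
        /* Hand SQLite a fixed buffer before any other use of the library; the
         * last argument is the minimum allocation size of the internal allocator. */
        int rc = sqlite3_config(SQLITE_CONFIG_HEAP, heap, (int)sizeof(heap), 64);
        if (rc != SQLITE_OK) return rc;
        return sqlite3_open("cache.db", db);
    }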
Finally, we do have proof that SQLite can continue after a power cycle - assuming certain semantics provided by the storage layer. Hence, the proof depends on your underlying hardware and those methods you write to access the hardware for (1) above. But assuming those all behave as advertised, SQLite is proof against data loss following an unexpected power cut. We have demonstrated this by both code analysis, and experimentally.
So probably you were correct to write your own database in this case. My point is that SQLite did not miss your requirements by quite as big a margin as you suppose. If you had had a bigger hardware budget (SQLite needs about 0.5MB of code space and a similar amount of RAM, though the more RAM you feed it the faster it runs) then you might have been able to save yourself about two years of development effort.
SQLite is a fantastic product; it is just not aimed at an extremely constrained platform like the one I was working on.
The device was a platform the company had already invested into a lot (a few thousand units, committed to buy another few tens of thousands) and not cheap (a few hundred dollars per unit). The application was bid to extend the lifetime of that hardware by cramming new features into the existing platform. In the end, after considering many products, it was clear to us that every byte saved was a byte available for other features, and memory was our limiting factor. So it was not even a question of whether we could fit a particular database, but how much space we would be left with for the really important features.
I don't remember object (*.o) sizes, but the entire database was 1k LOC of ANSI C while the entire application was about 70k LOC of heavily optimized code. The memory requirements were a few tens of bytes of stack (not really important, as other parts of the application were using more stack) and a hundred bytes of statically allocated memory. It even came down to dumbing down algorithms to keep object sizes down. I learned a lot on that project.
I admit I did not do much research on the provable characteristics of SQLite back then (a decade ago) once it was clear it could never fit our application. The research was mostly aimed at proving we needed our own database. The management did not agree ("Never roll your own database..."), so I just ended up doing it as a skunkworks project. It worked, the product is still in use, and it has never failed a single transaction (out of tens of billions processed).
I may even write my own blog post in the spirit of the one in the title of this thread; it just never occurred to me that it would be interesting to the general public.
Still, I agree it's better not to roll your own stuff unless it is absolutely critical.
“One Size Fits All”: An Idea Whose Time Has Come and Gone
The End of an Architectural Era (It’s Time for a Complete Rewrite)
They basically show that classical RDBMSes are inefficient by a factor of at least 10 in every conceivable application. I tend to trade a little of that in for the kinds of dramatically simpler mechanisms discussed in TFA.
I got older and realized I now had to maintain 8 different projects, including my in house versions of things. Giving up control of a project to another person made me realize I had to eventually trust someone else to implement things correctly.
Nowadays I am back to implementing my own solutions since I am more worried about efficiency. It would be nice if the compiler could inline the library implementation WELL (it usually sucks at this with static linking, even with -ffunction-sections, etc.). I think there is a need for a language-level facility for controlling what aspects of a third-party library wind up in your final binary. Or more modular kinds of libraries.
I just wish design decisions made by others weren't so baked in, like datatypes used or tag bits and what not.
I'm not sure whether that doesn't rank very high for people, doesn't even occur to them to be a problem, or they just don't give a shit about anybody else.
Having a variety of high-quality, widely-used libraries to choose from is vital, though. That's why I don't use brand new stacks at work, even if they are "better" in some way.
My first programming roles were in a mainframe shop with a major defense contractor, which had some brilliant mainframe system programmers over the years. Their major unclassified systems (payroll, shop order control, etc.) were internetworked with a homegrown real-time system, and they all had a home-grown disk access system (random and sequential) that was surreal in its speed and reliability, all coded in 370 assembler. On the business applications side, they had a thorough API that was callable from even COBOL programs.
By the mid-1980s, upper management decided they had to "standardize" and began developing replacement systems using IBM's IMS. Performance was unusably bad. I left around that time to join the UNIX RDBMS world, so I don't know if they ever found a solution that could actually be rolled out.
Working with the older stuff was actually fun because things just worked, and the customers of our legacy systems were really happy.
It’s all about how you spend the time you have. Don’t build stuff that you can get off shelf. You’re not going to write a better database, even if you think you can. You may get great performance for the current system on day one (after spending a huge amount of time and effort on developing it, perhaps with zero value created) but over the lifetime of the database you’ll incur huge costs that you probably can’t even fully foresee. The details here are not clear enough, but working from first principles (“I want to build an Evernote-like app”) I can’t imagine an experienced developer suggesting you should write a database (cache/file system).
The best software development is mundane: glue together what’s there, buy the resources you need to get sufficient performance. Switch a component if you need something to be faster, re-architect key parts, and buy the new components off the shelf too. It’s fun to build new stuff with custom algorithms you work out, but instead of that you can go home at 5pm, try building a database in your spare time and learn why you shouldn’t, and still have space to relax.
As Sam Altman wrote recently, the productivity tip that most people are astoundingly ignorant of is: choose what you do carefully.
It’s telling that this article does not begin: “we had a bottleneck that could not be solved.”
b. Embedded software
c. High performance computing
e. Military / Aviation software
In each of the above fields there are categories of software where you can run circles around general solutions by writing something custom, and yes, some of those involve writing your own database.
Apart from that, I agree with your list but I’d consider them to be very specialised domains that are not like the vast majority of software development, especially building conventional apps like OneNote (a document database).
ha, just kidding.
I’ve never worked with anyone who tried to implement their own database, but I have worked with people who implemented their own network protocols, JS SPA frameworks and service discovery layers, and they were all really bad ideas.
It gets even worse when said person leaves the company, their system is still critically important and full of bugs, and has the inevitable little-to-no documentation. Other devs have to maintain this mess, and nobody truly understands it.
The best advice is really to keep it simple. Build the system out of small, isolated units with clear APIs between them. The components should be isolated at runtime, so they can be restarted or fail separately, as well as in the code - separate modules / files / libraries. That way it's easier to understand how everything fits together just by looking at it.
The simple part wasn't just a platitude or a generalizing statement. A DB is really a beast that can easily turn into a giant ball of spaghetti with a ton of features, settings and tweaks and a never-ending list of bugs. A lot of the work and thinking I did on it (including a rewrite after an initial prototype) was aimed at making it simple by cutting unnecessary features.
A lot of projects have switched away from BDB for that reason; others still rely on BDB 5.3.
Also, I kind of agree that BDB is a bit of a nightmare to work with.
Then again, ACI does assume you aren't having CPU/memory/etc. failures, so to the extent that you count anything as guaranteed, it is probably possible to 'guarantee' durability. My original point was more that "minimally adequate" isn't that hard compared to going from there to a high-performance, general-purpose system (and also an offhand snipe at SQL).
I also helped write a log-structured OLTP-ish database that sits in front of MySQL.
I have only two regrets about these systems. First, MySQL was, in retrospect, the wrong backing store. Second, I used Thrift serialization. Thrift is not so great.
Every sufficiently advanced system does that, be it a CRM that allows custom fields, or a project management or ticket system, a workflow system etc. And basically all enterprise software contains elements of these systems.
For a new application that only supports one database, it's probably a better choice.
The result is also a nightmare: it makes for really difficult upgrades and a complex product for the customer, as they have to have some database knowledge and can shoot themselves in the foot quite easily.
I still don't know which approach is the worst between EAV and exposing the data schema...
That sounds familiar: https://pthree.org/2012/12/14/zfs-administration-part-ix-cop...
I never got over it, really. On the plus side, it was super fast, that I cannot deny. But it always seemed super gross to me.
I was just always amazed that there never were any problems with corrupt files.
And we didn't need full SQL semantics, because we had partially implemented them in our scripting language anyway and used only cursor APIs. PostgreSQL's MVCC was still a great inspiration for how to design the actual row storage.
It served us well, but 15 years later, a bit too many custom features make it hard to switch to a standard database.
Then you stick it on a filesystem that does not work the way you think it does, or even worse, you stick it on a remote filesystem (e.g. CIFS/SMB) that does not work the way you think it does and is now backed by an FS that also does not work the way you think it does.
Your DB may work fine but make sure you onboard the lessons that all the others have learnt through bitter experience.
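One classic example of such a lesson (my own POSIX-specific sketch, with made-up function names): replacing a file via write-temp-then-rename isn't durable until the containing directory is fsync'd too, because the rename lives in the directory entry, not the file.

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Durably replace 'path' with 'len' bytes from 'buf'. */
    int durable_replace(const char *dir, const char *path, const char *tmp,
                        const void *buf, size_t len)
    {
        int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) return -1;
        if (write(fd, buf, len) != (ssize_t)len || fsync(fd) != 0) {
            close(fd);
            return -1;
        }
        close(fd);

        if (rename(tmp, path) != 0) return -1;  /* atomic swap of the name */

        int dfd = open(dir, O_RDONLY);          /* now persist the rename  */
        if (dfd < 0) return -1;
        int rc = fsync(dfd);
        close(dfd);
        return rc;
    }

And on a remote filesystem like CIFS/SMB, even this sequence may not mean what you think it does, which is exactly the parent's point.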
WordPress average page generation: 1 sec
SunSed CMS: 0.03 sec
Database is almost always the reason for slow applications.
Edit: if you do caching right, WordPress becomes as fast.
A custom-made database for just a specific problem can be a lot faster than a generic one. But it depends on your knowledge.
I'm still unsure whether a database would benefit me, or if keeping everything flat is easier for ~50GB on a single machine. I'm leaning towards yes, as does the article, but my log files do not need much maintenance so I do not make persistent transformations.
By definition. First sentence in the wiki. I don't know what your definitions are.
It would be easier than, for instance, a hash table; or rather, in networking, a deterministic solution is more common.
I'm obviously not a systems programmer though so I can't provide much else.
I know tons of people who roll their own flat-file storage engine and are perfectly happy with it, even scaling up to their moderate couple of tens-of-thousands of users. Nothing fancy.
Personally, I had to write my own database (just like what the author of the article wound up doing), and had a delightful time learning all sorts of things, and now it is one of the most popular databases out there (https://github.com/amark/gun) and I encourage others to try (if they have time) building one themselves!
The wisdom in that is usually pretty sage. Reach for a SQL database unless there are reasonable and concrete reasons not to. Then, evaluate whether there’s a need to roll your own, or if there’s something you can grab right off the shelf.
I’d argue there are plenty of perfectly valid reasons to roll your own, and it’s not particularly difficult if you build on an already-proven storage foundation (file system, LMDB, etc.)