
Having been a DBA for a decade already, I can say that resolving issues like the ones you described has been the sweetest experience of my professional career. Unfortunately, I don't believe that fixing the problem in the database, by understanding and resolving it logically, is the way things will be done in the near future. You know, RDBMSs aren't sexy ... developers developers developers, web developers web developers web developers :) The newest API and/or language is much more important to newcomers nowadays. I cannot believe it, it's beyond me, but that's the reality. No one wants to use the database to its fullest anymore; everything needs to be done on the gazillion app servers using Java/.NET or the trendy 3rd/4th generation language of the day.

My only hope is that the data is the one thing that is not going away, and that this forces us to think about it logically (eventually devs/web devs will start doing data manipulation etc. the right way; I guess you're pretty familiar with the approach I have in mind).

Thanks




I remember that c. 2002-2005 it was RDBMSes that were the overused, overhyped solution. Basically every webapp had to be backed by a database. Everybody was moving to 3-tier architectures (even for things as simple as a TODO list), and frameworks like Struts, Rails, Symfony and Django all sprang up to manage the complexity of talking to the DB. You'd see all these tips on blogs (themselves among the first database-backed webapps, even though they worked perfectly well as flat files) about how to normalize your schema, shard your data for scale, pick ID spaces and make your joins perform well.

Meanwhile, a small minority of people were like "Guys? GUYS! You can just use flat files and in-memory data structures for that. Why bother with a database when 3 hashtables and a list will do the same thing several orders of magnitude faster?" They had outsized accomplishments relative to their shrillness, though: Yahoo, Google, PlentyOfFish, Mailinator, ViaWeb and Hacker News were all built primarily with in-memory data structures and flat files.

These things run in cycles. The meta-lesson is to let your problem dictate your technology choice instead of having your technology choice dictate your problem, and build for the problem you have now rather than worry about the problems that other people have. For many apps, a hashtable is still the right solution to get a v0.1 off the ground, and it will scale to hundreds of thousands of users.


Fully agree with you - there are problems and problems, and 'use the right tool for the right job' should always be the way to go.

Hashtables or flat files, huh? It's hard to believe these would be useful as a 'data storage engine' (so to speak) for a non-trivial app. You would have to implement everything yourself:
- concurrency: several users updating one and the same entry
- data consistency: returning the data as of the start of the 'select' for the user; what about others who updated the entry but haven't committed yet?
- security: fine-grained access to specific data entries (what can be extracted by whom), and auditing (who did what at what time, etc.)
- backup/recovery, etc.

More code most of the time means more bugs. Not everyone is Google, Amazon or Facebook, with the resources to create good custom solutions and then support and improve them.

In most cases, a simple RDBMS will provide you with enough 'standard' functionality (think security, concurrency, consistency, etc.) that you don't have to reinvent the wheel by coding these on your own from scratch.

But in the end, it again comes down to the nature of the problem (whether you'll need security, concurrency, consistency, etc. at all). It's good to know what the database gives you as a tool, and I don't think that's the case nowadays: too many Java/.NET and other developers have no idea what they can get out of a database. And I really hate it when someone spends several weeks writing something that could be done in several hours by using a specific database feature.
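To make the 'standard functionality' point concrete, here is a minimal sketch (not from the thread) using SQLite through Python's built-in `sqlite3` module. The `accounts` table, the `transfer` function and the sample names are hypothetical, chosen only for illustration. It shows the database undoing a half-finished update on its own, which is one of the things you would otherwise have to code by hand.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction (hypothetical helper)."""
    # Used as a context manager, the connection commits if the block succeeds
    # and rolls back if any statement raises.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )

conn = sqlite3.connect(":memory:")
# The CHECK constraint lets the database enforce consistency for us.
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

transfer(conn, "alice", "bob", 30)       # succeeds and is committed

try:
    transfer(conn, "alice", "bob", 500)  # debit would make alice negative
except sqlite3.IntegrityError:
    pass                                 # the credit to bob was rolled back too

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)
```

The second transfer fails on the debit, and the credit that had already been applied to `bob` is undone without any application code handling it.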


Most of the time, you can ignore all those concerns when you're starting out. Just run a single-threaded webserver and make one HTTP request = one transaction. You can keep your registered users in memory and store pointers to them in a list as your ACL, and log audit trails as a linked list. Snapshot data to a file periodically for crash resistance, and load it back in at system start.
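A rough sketch of the snapshot-and-reload idea described above, in Python. This is an assumption-laden illustration, not the code of any site mentioned in the thread: the `SnapshotStore` class, its method names and the JSON file format are all hypothetical, and a plain Python list stands in for the linked-list audit trail.

```python
import json
import os
import tempfile

class SnapshotStore:
    """All state lives in memory; a JSON file is only a crash-recovery copy."""

    def __init__(self, path):
        self.path = path
        self.users = {}        # registered users, keyed by name
        self.audit_log = []    # append-only audit trail
        self.load()

    def load(self):
        # At system start, restore the last snapshot if one exists.
        if os.path.exists(self.path):
            with open(self.path, encoding="utf-8") as f:
                state = json.load(f)
            self.users = state["users"]
            self.audit_log = state["audit_log"]

    def snapshot(self):
        # Write to a temp file first, then rename over the old snapshot,
        # so a crash mid-write does not destroy the previous good copy.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump({"users": self.users, "audit_log": self.audit_log}, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)

    def register(self, name, email):
        # With a single-threaded server, each request handler runs to
        # completion, so this update cannot interleave with another one.
        self.users[name] = {"email": email}
        self.audit_log.append(["register", name])

# Usage: register a user, snapshot, then simulate a restart.
path = os.path.join(tempfile.mkdtemp(), "state.json")
store = SnapshotStore(path)
store.register("alice", "alice@example.com")
store.snapshot()

restored = SnapshotStore(path)
print(restored.users)
```

Anything written after the last call to `snapshot()` is lost on a crash, which is why a design like this suits services where users will tolerate occasional data loss.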

Sure, you won't be able to run on more than one core, and this won't work if your data exceeds RAM. And I wouldn't do this for anything mission-critical; it's better for the types of free or cheap services where users will tolerate occasional downtime or data loss. But if you never hit the disk and never need to make external connections to databases or the network, you can easily do upwards of 10K req/sec on a single core these days, even in a scripting language like Python. If you assume 10,000 simultaneous actives (more than e.g. r/thebutton and most websites, and usually equivalent to around 1M registered users), that is one req/user/sec, which is pretty generous engagement.
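The back-of-the-envelope numbers above work out as follows. The figures are the ones quoted in the comment; only the variable names are mine.

```python
requests_per_second = 10_000     # single-core throughput claimed above
simultaneous_actives = 10_000    # users active at the same moment
registered_users = 1_000_000     # rough registered base for that many actives

# Request budget available to each active user.
per_user_rate = requests_per_second / simultaneous_actives

# Share of the registered base that is active at any one time.
active_share = simultaneous_actives / registered_users

print(per_user_rate, active_share)
```

So the budget is one request per active user per second, with about 1% of the registered base online at once.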

Hacker News is built with an architecture like this. There've been a couple massive fails related to it (like when PG caused a crash-loop by live editing the site's code and corrupting the saved data in the process), but by and large it seems to work pretty well.


Yup, I'm actively using this architecture, in Python, with gevent (so that calls to external services don't stall your web server). I spawn long running computations in separate processes. I've described it here:

http://www.underengineering.com/2014/05/22/DIY-NoSql/

It really does achieve 10k simple requests/transactions per second on a single core (or a $5/month VPS). The software support for it could be better, though. I should really release some code that helps with things like executing a function in another process, or verifying that some code executes atomically.



