Why do I get the feeling that you're an op and look down on development people? If that's really true, try to start developing some project and see how you like frequent schema changes, trying to synchronise schemas with peers, resolving relation issues when merging features, etc. On the other hand if you abstract your interaction with data enough, you can change the whole backend later once it's stable and not care about it up-front.
Unfortunately, what I hear you saying is: it's worse for ops, so no one should use it.
> On the other hand if you abstract your interaction with data enough, you can change the whole backend later once it's stable and not care about it up-front.
I have to disagree with this. By forcing yourself to work with a data layer so abstracted that you can't even reference whether you're dealing with a JSON document or a set of twelve joinable tables, you're going to write the most tortured and inefficient application. Nontrivial applications require leaky abstractions.
And you lose the ability to do data constraints well.
Put clearly: declarative data constraints (including referential integrity, but also check constraints and the like) are the single most important feature of RDBMSs for most applications.
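To make that concrete, here is a minimal sketch of declarative constraints doing the rejecting instead of application code. It uses Python's built-in `sqlite3` module purely so that it runs without a server; the `customer` and `invoice` tables and the `try_insert` helper are made up for illustration, and the same `REFERENCES` and `CHECK` clauses exist in PostgreSQL.

```python
import sqlite3

# In-memory database; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE customer (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    );
    CREATE TABLE invoice (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        amount      NUMERIC NOT NULL CHECK (amount > 0)
    );
""")
conn.execute("INSERT INTO customer (id, email) VALUES (1, 'a@example.com')")
conn.commit()


def try_insert(customer_id, amount):
    """Attempt an insert; return None on success or the constraint error text."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO invoice (customer_id, amount) VALUES (?, ?)",
                (customer_id, amount),
            )
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


ok = try_insert(1, 25)          # valid row
bad_fk = try_insert(99, 25)     # no such customer: referential integrity
bad_check = try_insert(1, -5)   # negative amount: check constraint
invoice_count = conn.execute("SELECT COUNT(*) FROM invoice").fetchone()[0]
```

The two bad rows never reach the table, and nothing in the application had to know the rules to get that guarantee.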
> On the other hand, if you abstract your interaction with data enough, you can change the whole backend later once it's stable and not care about it up-front.
This is a conception of data that is more true in theory than it is in practice. In practice, if you want to query your data efficiently, you'll need to worry about how it's stored. You'll have to worry about the failure cases.
Of course, it is definitely application dependent. If you're just writing a WordPress replacement, you can probably choose whatever data store you want and just write an abstraction layer on top of it (especially if you don't care about performance). On the other hand, if you're looking at querying and indexing terabytes (or more) of data, you'll have to work very closely with your data store to extract maximum performance.
"If that's really true, try to start developing some project and see how you like frequent schema changes, trying to synchronise schemas with peers, resolving relation issues when merging features, etc. On the other hand if you abstract your interaction with data enough, you can change the whole backend later once it's stable and not care about it up-front."
I can sort of sympathize with this a little. I used to use MySQL for schema prototyping and then move stable stuff to PostgreSQL back when PostgreSQL lacked an alter table drop column capability.
However today, this is less of a factor. Good database engineering is engineering. It's a math intensive discipline. Today I work often with intelligent database design approaches, while trying to allow for agility in higher levels of the app.
Don't get me wrong, NoSQL is great for some things. However it is NEVER a replacement for a good RDBMS where this is needed.
> If that's really true, try to start developing some project and see how you like frequent schema changes, trying to synchronise schemas with peers, resolving relation issues when merging features,
How does something like MongoDB actually help with this, though? Certainly a lack of a schema lets you be more nimble in changing it, but you still have to write code to handle whatever schema you decide on rather than letting a battle-tested RDBMS handle it. I think NoSQLs have their uses, but not forcing correctness on your data as a feature is not one of them. But I also believe in static typing.
Migration tools are great once you stop changing the schema very often, as happens when a project starts. This of course depends on the project... if you can have a full design from the start, it's probably going to work too. If you don't know the exact requirements or way to get there - not so much.
If you change your schema once every 2-3 days, something is wrong with whoever is leading the software project. That's like writing software with zero planning or no knowledge of the problem domain.
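A rough sketch of what "you still have to write code to handle whatever schema you decide" tends to look like. Plain Python dicts stand in for documents from a schemaless store; the field names, the `mail` to `email` rename, and both helper functions are hypothetical.

```python
def normalize_user(doc):
    """Upgrade older document shapes at read time (hypothetical rename: 'mail' -> 'email')."""
    doc = dict(doc)  # copy so the stored document is left untouched
    if "email" not in doc and "mail" in doc:
        doc["email"] = doc.pop("mail")
    return doc


def validate_user(doc):
    """Hand-written checks that a declared schema would otherwise enforce on every write."""
    errors = []
    if not isinstance(doc.get("email"), str) or "@" not in doc["email"]:
        errors.append("email must be a string containing '@'")
    age = doc.get("age")
    if not isinstance(age, int) or isinstance(age, bool) or age < 0:
        errors.append("age must be a non-negative integer")
    return errors


docs = [
    {"email": "a@example.com", "age": 30},
    {"mail": "b@example.com", "age": 41},      # written before the rename
    {"email": "c@example.com", "age": "n/a"},  # nothing stopped this write
]
report = [validate_user(normalize_user(d)) for d in docs]
```

The schema has not gone away; it now lives in `normalize_user` and `validate_user`, and it only applies to code paths that remember to call them.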
I don't care if it is a startup or not. Come up with a very simple idea, draw the models in an ER diagram, implement that stuff.
It's very hard to imagine that tomorrow suddenly all relationships need to be changed. Even if that is the case, scrap your Repository/Entity model and start from the beginning.
Nothing can help you much if the fundamentals are wrong.
Everything else in a software project changes frequently, especially during the early days. "Fundamentals" don't help you know more ahead of time and it's really nice to be able to quickly adjust when you come across something you hadn't anticipated.
I tried building a small side project with Postgres about 8 months ago (after not really doing RDBMS stuff for 18 months before) and was amazed at how inflexible it felt, and how much frustration used to seem normal.
I write up a schema, send it to everyone else on the team, get feedback. If users are involved, get it from them too....
Take in all the feedback, write up a new draft, wash, rinse, repeat until running out of shampoo (i.e. feedback).....
Once things are pretty stable, do a prototype, address any oversights, do the real thing.
As a startup, changing your mind is normal, to the point that it's far better to have the tools to implement changes quickly than to plan and document them well. The only documentation is the general gist behind the database.
If it prototypes well, then further refine it with ER diagrams for future maintenance.
Why is planning everything without validation better than above?
Disagree here. Figure an hour of planning saves 10 hrs dev time and 100 hrs bugfixing.
That doesn't mean spending months planning. It does mean doing your best to plan over a few days, then prototype, review, and start implementing. If things change, you now have a clearer idea of the issues and can better address them.
The worst thing you can do is go into development both blind and without the important tools you need to make sure that requirements are met: tools like check constraints, referential integrity, and the like.
Maintenance takes up about 50% of all IT budgets. Most individual pieces of software will spend 2-6 times (considering the average life cycle of an in-production software product to be 2-4 years) more money on being maintained than on being developed.
Data migration is a massive problem for any organization with data sets at any scale. RDBMS, in general, has gotten in the way of those migrations. People aren't looking at NoSQL just because they cannot sit still but instead are looking to find a better experience with handling data.
I'm not sure if NoSQL is the right answer to that but let's give it a chance and see what happens when people are migrating MongoDB data in 3-4 years.
Integration issues are best handled with good APIs. Migration is a bigger issue, but one thing that good use of an RDBMS gives you is the ability to ensure your migrated data is meaningful. Not sure you can do that with key-value-type stores.
Right there with you. The excuses I hear for not wanting to use a good ol' RDBMS just do not make sense to me sometimes. CREATE TABLE too hard? Time consuming? Difficult?
Those who do not study the history of databases are doomed to repeat it. Soon we'll add back row-level write locks, transaction logging, schemas, multiple indexes, and one day they'll wake up with MongoSQL.
You say that like it's a failure on the devs' part, but that's kind of like blaming regular users for not switching to Linux because they don't like editing network configuration files.* It's masking the problem that there are real developer-friendliness issues with the existing databases. And taunting users will not get them to switch back.
* but then Linux distros get network autoconfiguration and suddenly it's obvious that it was the right solution all along.
This is exactly why I do not believe that what we do is engineering. Engineers do not refuse to use proven techniques and technologies because they are "too hard" and instead use easier methods that have serious shortcomings.
The effort required to get something like Postgres running in the first place. With Redis and CouchDB, it's just `sudo pacman -S redis|couchdb`, maybe edit the config file, `sudo rc.d start redis|couchdb`. With Postgres you have to create databases, users, etc. etc. While all this stuff is critical in production, it's really not something that you should have to do to just code some stuff.
Well, this is completely bogus. Setting up a new user on PostgreSQL takes me roughly 30 seconds. Now if you use the stock system-db user mapping for dev environment, it's something you do only when you install your system. Mine has been installed for 3 years now.
The parent is referring to a new user trying to set up postgres.
And in the above Linux networking example, this response is analogous to the person who says "But I wrote my own scripts to fix my ethernet configuration, I don't see what the big deal is!"
Edit: That came off a bit harsh. I just don't think the fact that it's not a hassle for us means it's not a hassle for other people. And I don't think that mongo's easy configuration means that it should be used over postgres in all cases, just that postgres should take notice that people like mongo's easy configuration and step up their game in that department.
If all you are doing is writing code, that's sufficient on an RPM-based platform. On Debian, use apt-get instead. On Windows, use the 1-click installer.
PostgreSQL comes with a default user and a default database, so the criticism on this thread is a bit..... incorrect.
Now it is true you have to set up a system user if you are compiling PostgreSQL from source. However, that's really optional in most cases unless the code you are writing is, well, a patch to PostgreSQL.......
Postgres installs with a default superuser ("postgres" on Ubuntu) and a default database (also "postgres"), so that's not the real problem.
Installing software via a package system is trivial and required for any system, so that can't be the issue.
The package distribution invariably chooses a default location for your data and initializes it, so that requires no additional effort at all.
You have to start and stop the service, but the package distribution should make that trivial as well ("service postgresql start|stop" on Ubuntu). And again, I don't see any difference here.
So the only possible area I see for problems is connecting for the first time. This is somewhat of an issue for any network service, because you need to prevent anyone with your IP from connecting as superuser. The way Ubuntu solves this is by allowing local connections to postgres if the system username matches the database username. So, you have to "su" to the user "postgres", and then do "psql postgres". Now you're in as superuser.
The default "postgres" superuser doesn't have a password (default passwords are bad) and only users with a password can connect over the network. But, you can add a password (which then allows that user to connect over a network), or create new users. If the new username matches a system username, that user can connect locally. If you gave the new user a password, they can connect over a network.
Do you see any fat in the above process that can be streamlined without some horrible side-effect (like allowing anyone with your IP to connect as superuser)? I'm serious here -- if you do see room for improvement, I really, really, really, want to hear what the sticking point is so that it can be fixed.
Well, mostly the migration part that devs don't want to deal with.
Most modern languages have migration utilities (Flyway for Java, Rails migrations for Ruby, Python should have its de facto migration tool for Django by now or else they fail hard, and JS... well... let's wait until Node.js users decide to use an RDBMS).
Well here's a fuck you back from a dev: my time is finite and everyone wants a piece of it. If I can save an hour a day by never having to think about my database? If I can shave a week or two of labor off a project?
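For anyone who hasn't used one of these tools, the core idea is small: keep an ordered list of schema changes, record how far the database has got, and apply only what's missing. A minimal sketch of that idea follows. It is not how Flyway or Rails implement it; it uses Python's built-in `sqlite3` and SQLite's `user_version` pragma so it runs anywhere, and the `post` table and `migrate` function are made up for illustration.

```python
import sqlite3

# Ordered list of schema changes; tables and columns are illustrative only.
MIGRATIONS = [
    "CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT NOT NULL)",
    "ALTER TABLE post ADD COLUMN body TEXT NOT NULL DEFAULT ''",
    "CREATE INDEX post_title_idx ON post (title)",
]


def migrate(conn):
    """Apply every migration newer than the version stored in the database."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version, statement in enumerate(MIGRATIONS[current:], start=current + 1):
        conn.execute(statement)
        # PRAGMA does not accept bound parameters; version is an int generated here.
        conn.execute(f"PRAGMA user_version = {version}")
        conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]


conn = sqlite3.connect(":memory:")
first = migrate(conn)    # applies all three changes
second = migrate(conn)   # nothing left to apply
columns = [row[1] for row in conn.execute("PRAGMA table_info(post)")]
```

Changing the schema then means appending one statement to the list, which every teammate's database picks up the next time `migrate` runs.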
It's really easy to work with. This is why people keep using it.
Back at you, boy. I'm a dev. I hate dealing with another dev who wastes my time just because he isn't in love with RDBMSs and decides to write more code and add more infrastructure components (that includes a message queue, unless you absolutely have no choice).
My time is finite. Ops time is finite. Obviously you decided to dick around with mine and Ops'. How 'bout I send you to the QA department to write automation and software tools so you don't dick around with production code?
You can write with any language and any storage systems you'd like there.
Let me guess, you're the guy who makes sure there's a "Senior" in his title and you use django because the docs are so great.
I'm sorry you work with incompetent people. Sounds like you're in a cubicle farm somewhere. While you're in a meeting swinging your seniority around, I'll be over here shipping products faster than your team.
First, I'm no senior. My title is simply "developer". I work with people who share the same opinions. Most of us are on the same page and that's how we build our culture. The ones that aren't don't last long.
Second, I'm not using Django.... and what's wrong if I do?
Third. I respect the people around me. In return, they respect each other, so we don't throw around the word "incompetent" or think that we're better than anybody else.
No pirates. No ninjas. No rockstars. No racers (dhh?) as well. Just grown-ups doing their job with a bit of love, passion, and respect. All balanced.
Fourth. I have no cubicle. I work in an open space and I love it. I don't need my special office (I had one a few years ago and it sucked).
Fifth. My project manager attends meetings and delivers mostly good news to us. He's the best PM I've been with (so far). If we have a meeting, that's usually when shit has hit the fan and we need to have an honest conversation. Other than that, e-mails are sufficient.