An even more egregious form of negative work is a developer who is stuck using out of date programming practices AND has a large amount of influence at a company.
At the other extreme is the developer who is so entranced by the "newer is better" mentality that they rewrite everything in an attempt to conform to "latest best practices", increasing complexity massively while introducing a bunch of bugs and huge dependencies no one ever actually needed. I've experienced that (and had to undo the mess) a few times.
Relatedly, just as there are "10x" developers, there are "-10x" developers as well: it takes the average developer 10 times as long to fix the damage as it takes one of these to cause it.
At a previous company, the tech influencers believed in the archaic "do everything in the database." While we were technically using the .Net stack, we weren't allowed to do any actual business logic in C#. Instead it all had to be done in MS-SQL procedures (or at least as much as possible, with very little CLR glue).
Similarly, at my current company, we had a product where the initial devs wanted to jump on the RXJS and Socket.io bandwagons. The only problem was the rest of the company was using standard REST endpoints and promises to do the same thing, so any new devs who joined that team suddenly had massive cognitive overhead to overcome. Any changes to the codebase were made by people who only half understood what they were doing, and so the complexity compounded. Thankfully, I was given the chance to rewrite the whole codebase to match what our other products looked like, so now the code is much saner to work with.
Can you explain why you feel that "do everything in the database" is archaic? A lot of logic (especially authentication logic) can be put in the database only. Not to mention that I won't trust anything that has only application-level security, with nothing at the database level to check/limit it.
2. Deployments, rollback, replication, synchronization - they don't work very well with db procedures.
3. Unless you connect directly to the db, you must have some logic on the server side, and you usually end up replicating logic from the db on the server side.
4. Database languages (even advanced ones like PL/SQL) are not expressive enough.
5. It's much easier to scale out the server than the database (and if you are using Oracle/SQL Server etc. it's also cheaper), and you don't want your database's CPU to be clogged with logic code execution.
6. Unit testing (or any testing) is extremely difficult.
7. Debugging is hard and convoluted (and it usually doesn't work inside your IDE).
and a whole lot more.
Nothing is absolute or completely obsolete, but it has been considered a bad practice for a long time by most industry professionals.
As far as I know the most popular article about it is:
I occasionally have to work with a big application which is essentially written 100% in SQL. You simply can't easily change parts without testing the whole thing from start to finish, because automated testing at a granular level is horrible. And SQL does not lend itself to encapsulation; it does everything to make it hard to break stuff down into manageable pieces.
And in SQL everything you do is just so complicated: tons of boilerplate you would not have in a real programming language. It's called a "query language" after all, not a "programming language", so don't use it as one.
It's not that logic going into SQL is the problem (in the cases I usually see); it's usually a poor database design and a load of code at the application level to compensate for the poor database design.
>Care to elaborate a bit? Especially, why do you think unit testing and debugging of SQL is not hard?
An SQL query returns results directly in tables, which you can check in all kinds of ways. You can create any number of temporary tables with the same schema as your business tables, and check all kinds of invariants.
There's absolutely no reason why unit testing SQL should be harder than anything else, considering a single query as the "unit" of testing.
In fact, thanks to the built-in checks, constraints and types an RDBMS has, you are freed from having to unit test all kinds of crap too (similar to having less to unit test in Haskell vs Ruby).
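To make the "single query as the unit" idea concrete, here's a minimal sketch in Postgres-flavored SQL. The "orders" table and the invariant are invented for illustration; the point is just the temp-table-plus-invariant shape.

    -- Build a throwaway copy of the business table and seed it with known rows.
    CREATE TEMP TABLE test_orders (LIKE orders INCLUDING ALL);
    INSERT INTO test_orders (id, customer_id, total)
    VALUES (1, 10, 50.00), (2, 10, 75.00), (3, 11, 20.00);

    -- The query under test: per-customer totals. The HAVING clause encodes
    -- the invariant; the test passes if zero rows come back.
    SELECT customer_id, SUM(total) AS customer_total
    FROM test_orders
    GROUP BY customer_id
    HAVING SUM(total) < 0;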
Well, you'll have to store your mock data somewhere, which in this case means more databases, often on other servers; so you'll have your SPs connecting to another DB in order to access their data.
It's not very hard, it's just impractical. Mainly because a database is a giant bag of statefulness (to put it scientifically).
You need to prepare test data, you need to update and maintain the test data. That is already a big barrier to entry.
The actual testing involves three simple steps: set the initial state of the database, run your queries/procs, verify the results. This will be unbearably slow even for a small test set. So you start to make things complicated by trying to be smart, like only reverting the state you modified, using SQLite for tests and Postgres for production, or running the database server on a RAM filesystem, etc.
I've seen a few people go down the rabbit hole and no one came up with a solution I could be happy with.
I don't see that writing code to test SPROCs is that hard: just have a set of inputs that match all of the use cases, including all the edge ones, run that through, and check that the results in the db are as expected (roughly the sketch below).
Debugging is possibly slightly harder in that you might have to have a third monitor for Toad or work Manager, but if you code your sprocs properly in the first place you should not have that many problems that jump between code and SQL.
Just saying it's hard doesn't help; by that logic we ought to still be coding in GWBASIC.
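Roughly, that set-up / run / verify cycle looks like this, sketched against a hypothetical apply_discount procedure in Postgres syntax (the table, procedure and expected value are all made up):

    BEGIN;

    -- 1. Set the initial state.
    INSERT INTO customers (id, balance) VALUES (1, 100.00);

    -- 2. Run the procedure under test.
    CALL apply_discount(1, 10);

    -- 3. Verify the result; raise if the expectation is violated.
    DO $$
    BEGIN
        IF (SELECT balance FROM customers WHERE id = 1) <> 90.00 THEN
            RAISE EXCEPTION 'apply_discount: expected balance 90.00, got %',
                (SELECT balance FROM customers WHERE id = 1);
        END IF;
    END $$;

    -- 4. Leave no state behind.
    ROLLBACK;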
What's the problem with source controlled database logic?
Have your statements (including those that create stored procedures on setup, migrations, etc.) in text files, and just commit those to Git or whatever.
The problem is ensuring deployment matches up. It's very easy to end up with subtle differences between a newly deployed database instance and a database instance that was deployed with an old version and then updated. For code you would think very hard before deploying each version as a patch to the previous version - it's easy and effective to just deploy the code afresh each time, with a complete artifact built from a specific VCS tag. It's much harder to do that with databases; the tooling just isn't there and it's hard to ensure that you wipe out previous logic but retain previous data, because data and logic are commingled. You could possibly build a system for this kind of thing, but the standardized tooling just isn't there.
I don't understand what's hard about mass-overwriting your previous stored procedures with new ones.
You're right about non-SProc code; just deploy all of it. Do the same thing with SProc code!
What's tough about keeping all your code in files that start with "CREATE OR REPLACE FUNCTION <funcname>", and just firing them all at the DB (within a transaction, if you like)?
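For what it's worth, a file like this can be replayed at the database on every deploy; CREATE OR REPLACE makes it idempotent. Names and the query are hypothetical, Postgres syntax:

    -- funcs/order_total.sql: one file per function, committed to source control
    -- and re-run against the database on every deploy (inside a transaction).
    CREATE OR REPLACE FUNCTION order_total(p_order_id integer)
    RETURNS numeric
    LANGUAGE sql
    AS $$
        SELECT COALESCE(SUM(quantity * unit_price), 0)
        FROM order_lines
        WHERE order_id = p_order_id;
    $$;

    -- Renamed or removed functions still need an explicit drop, or they linger as crud.
    DROP FUNCTION IF EXISTS order_total_old(integer);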
I don't actively advocate putting all the code in sprocs, but I can see advantages. I also don't advocate using PHP, and yet people demonstrably build some great websites with it.
Your approach is a bit naive. You will accumulate a lot of crud if you don't drop any functions you've deleted or renamed. That crud could even set people up to make mistakes, like using a function that shouldn't exist any more and does stuff that harms the integrity of the data.
Back when I was writing Python and using Django, I found the migrations system provided by django-south was really good for exactly this.
A migration was a way to roll a database forwards or backwards; there were tools to create simple ones, and one was able to write whatever Python & SQL one wished in order to handle more complex cases. One might even archive off a column somewhere when deleting it, and load it back up when restoring it, if one wished.
Since the migrations were all just source code, they were perfectly well-suited to source control.
It was a really powerful system; I'm surprised that it hasn't seen wider acceptance.
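(django-south migrations themselves are Python, but the forward/backward idea described above can be sketched as a pair of plain-SQL scripts; the table, column and file names here are invented for illustration.)

    -- 0007_drop_legacy_flag.up.sql: archive the column's data somewhere, then drop it.
    CREATE TABLE IF NOT EXISTS articles_legacy_flag_backup AS
        SELECT id, legacy_flag FROM articles;
    ALTER TABLE articles DROP COLUMN legacy_flag;

    -- 0007_drop_legacy_flag.down.sql: restore the column and reload the archived data.
    ALTER TABLE articles ADD COLUMN legacy_flag boolean;
    UPDATE articles a
    SET legacy_flag = b.legacy_flag
    FROM articles_legacy_flag_backup b
    WHERE a.id = b.id;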
You have an export file in your repo containing all your stored procedures, and part of the deployment process is to apply those procedures to the db, updating as needed.
Absolutely. SQL is great, and having the DB just give you the correct data from the get-go is convenient. But where complex logic is necessary, implementing it correctly in SQL is much more difficult than in something more expressive.
You _could_ delve into cursors or long merge statements or what have you, but business logic in particular will be read and altered numerous times by several different people. In that case, a language made specifically for expressive statements is significantly easier to deal with. That's not to say that I believe the opposite is true and that _everything_ should be done in programming space. I just think there is a better balance that can be achieved, and defaulting to having everything in either category probably means you're not balancing correctly.
Write your stored procedures in any language you like. Postgres supports pgSQL, Tcl, Perl, and Python out of the box, with third-party bindings for some other languages.
When procedural coding is necessary, use a procedural language, in your stored procedures. What's the problem?
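For example, the same trivial helper once in PL/pgSQL and once in PL/Python; the function is hypothetical, and the PL/Python variant assumes the plpython3u extension is available on the server:

    -- Plain PL/pgSQL...
    CREATE OR REPLACE FUNCTION clamp_pct(p integer)
    RETURNS integer
    LANGUAGE plpgsql
    AS $$
    BEGIN
        RETURN LEAST(GREATEST(p, 0), 100);
    END;
    $$;

    -- ...and PL/Python, for when procedural code reads better that way.
    CREATE EXTENSION IF NOT EXISTS plpython3u;
    CREATE OR REPLACE FUNCTION clamp_pct_py(p integer)
    RETURNS integer
    LANGUAGE plpython3u
    AS $$
        return max(0, min(p, 100))
    $$;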
Valid point. I should have been more clear in my original post. The issue was more that it was institutionalized to write all stored procedures in MS-SQL, rather than utilizing available CLR options (at least for the vast majority of the time). It should be noted that I'm complaining about a specific company's development method and not the practice altogether.
And your point is? Yes, sometimes you have complex business logic, but throwing away all the benefits of using an RDBMS is a sign that your developers can't hack SQL properly.
I would be the first one to admit my SQL chops are probably lacking, so perhaps this is just my own bias revealing itself. I'm not saying to _not_ use stored procedures. I'm just saying that throwing literally _all_ of the business logic in them feels a lot like a silver bullet. I've always felt that any of the more "standard" backend languages would be a better choice for that complexity, since their expressiveness helps to describe that complexity in a way that is easier to grok for a larger number of developers.
> Can you explain why you feel that "do everything in the database" is archaic
I think it is because I want to be able to compile and test from end to end without having a certain database on hand. I consider the (specific choice of) DB to be an implementation detail, just like whatever file system the application might reside on once deployed.
I know this is an idealistic point of view, and sometimes you end up with terribly slow ORM multi-join code where a stored procedure update might have been very simple - but that's an optimization I'd like to defer until it's actually needed, because of the flimsy guarantees and poor integration of SP tooling (if I misspell something in a stored procedure, can I be sure it's caught on my dev machine without having to run through integration tests? etc.).
I've been in this job long enough to know that there is no silver bullet in tech stacks, methodologies, techniques, etc. Yet it took me a while to get there, and even then it means nothing, because wherever I look I see S/W houses being governed by adherents of this or that 'true religion'. I think that most of that is just cargo cults - i.e. whatever happens to work for this company (and which people there believe in) is taken to be the way to go for any other company/product/case.
Are SPs bad/evil/nice/safe/etc? I don't know and I cannot tell unless we are talking about something concrete. Senior devs were saying a few years ago that they are the holy grail. Senior devs are saying now that they are the devil. Go figure.
Is TDD the holy grail? Dunno - these days TDD seems to be synonymous with progress and modernity. Any opposing view seems to belong to cavemen, but wasn't that so with OOP just a few months before functional became the way to go?
Anyway - you catch my drift. I'm doubly cautious when I hear people speak with the greatest conviction about this and that these days, unless they are speaking from a concrete example.
Well, for one thing, auth logic can only live in the database when your application uses no third-party authentication providers, and while "archaic" is maybe a strong word, that's a less and less common situation these days. Even in the "enterprise" world, single sign-on via LDAP is generally the order of the day, and good luck doing that in a stored procedure...
1. Everybody recognizes that "The Database" has now become its own unique product, which just happens to provide remote service-calls to other products over an SQL channel
2. I'm not the one responsible for managing the multi-tenant clusterfuck it will become
No separation between the application, model and database layers? That makes it a huge problem to make changes to just one of them, if you want to support a different database, for example, or move one of the layers somewhere else.
From what I gathered one of the biggest problems of stored procedures is that they do not play well at all with source control and associated tooling such as code review.
Ok. I did not know that. As I said this is what I have gathered from what I have read around as a common complaint. Out of curiosity, what is the testing-deployment cycle of stored procedures in source control? How do you assure yourself that the database uses a procedure coming from a specific commit?
How do you assure yourself that your non-database code uses code coming from a specific commit?
In general, you have a deployment procedure that overwrites all the old code with all-new code, and from there on out you trust, right? Why not use the same approach?
>At a previous company, the tech influencers believed in the archaic "do everything in the database." While we were technically using the .Net stack, we weren't allowed to do any actual business logic in C#.
There's nothing archaic about that, actually.
It ensures that your domain logic remains DRY and is not repeated for every new application or service that you connect to your data.
I guess it depends on your application. To me having multiple applications accessing the same database logic (or domain) feels very archaic, or at least hard to maintain.
You'll end up with highly coupled application(s) and a schema that is very hard to change and nearly impossible to deploy small changes to in continuous deployment scenarios without maintenance windows. In my experience it's hard to maintain high performance and zero deployment downtime in applications using only SQL databases, with multiple applications sharing tables/views.
Then there are always additional platforms for specific areas that are just way harder in a relational database, such as journaling (append-only data), queuing, (distributed) caching and searching, which will scatter your logic outside the database anyway.
I've found it really hard to maintain DRY principles and find a balance between readability and performance in large SQL-based applications without an extremely disciplined team.
Valid point. Sadly, the system I was working on at said company was monolithic, so I don't believe that advantage was being utilized. It does open up that potential for the future, though.
I feel that this could be solved in other ways as well. Such as with microservices to wrap your database and provide endpoints for other services to consume. This is more like how we do it at my current company.
I still can't quite come to terms with it, but I've worked with someone who managed to be at both ends of the spectrum at once. Insisted on using some newfangled thing for a major part of the product, and in doing it their way – and then, nobody was allowed to change it.
Mercifully, their work has generated so many bugs that they've been maintaining that product for the last year and haven't had time to touch anything else.
I used to work with a guy like that and it created a ton of tech debt. He would write new services using a new technology for each one, not documenting or maintaining any of them.
What's sad is that it looks great on his or her resume to do that. That developer might leave a trail of carnage, but they don't care; on to the new shiny job for them. It never catches up with them.
I can't really blame developers for this -- they're just responding to incentives. So long as hiring managers penalize candidates that don't have experience with trendy stacks, you're actively doing your employees a disservice by prohibiting them. The only thing I can personally do to combat this problem is be conscious of it, not engage in that kind of hiring behavior in my office, and hope that the culture changes.
Giving employees opportunities for side projects helps somewhat. Allowing for gradual migrations to new technologies helps as well.
While I agree with you, who is in control there? Who lets the developer pick the tech and leave a trail of carnage?
Neomaniacs gonna succumb to neomania. The bigger question is why the system permits that, rather than steer those urges to try something new into useful experiments that might advance the status quo.
I think rewriting code for style purposes is extremely naive. It introduces risk and sabotages `git blame`. I still see a lot of time wasted doing exactly that, as enthusiastic developers try to enlighten the rest of the team or make things tidy. I recommend against switching to something like "standard.js" in a live project.
Agreed. There should be a standard way that everybody on the team formats their code on check-in, and that format should not change. That way, it can be enforced by a single git pre-commit hook to clang-format. If somebody wants to work on the code formatted in a different way, well then that's just more calls to clang-format, but it never touches the repository that way.