The author argues that using the database purely to persist data, while handling all logic/validation in the server language (e.g. by using an ORM to map your server language to the database), has proven less robust over time.
Instead, he suggests pushing knowledge of your data into the database itself, as this has proven more resilient over time and takes advantage of the power of these databases.
I liked the argument (which is better expressed there than in my paraphrasing), but my largest reservation is that expressiveness is a big consideration when choosing a language, and in this case the alternative to using Ruby/JS/Python + ORM in your MVC framework is to use SQL functions, which are syntactically clunky/dated at the very least.
It seems like Andl is looking to bridge this to some extent.
For example, you can finally, in the year 2016, split strings natively in TSQL:
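For reference, a minimal sketch of the new built-in (assuming a SQL Server 2016 database at compatibility level 130 or higher):

  SELECT value
  FROM STRING_SPLIT('a,b,c', ',');
  -- returns three rows: 'a', 'b', 'c'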
Compare that to this SO question:
Note there are 32 answers on that question, many of which fail under various edge cases. Also notice that no one finds this situation absurd in the slightest (Stockholm Syndrome is alive and well in the MSSQL community, apparently). This is just one example of the nonsense you have to put up with on SQL Server; there are many others.
And while PG is getting yet another new language capability, it appears Microsoft is pulling SQLCLR support (in Azure, so far):
...which probably doesn't matter at the end of the day because almost no DBA would ever let you use it in production.
While it would harm my personal career, nothing would make me happier than to see Postgres crush MSSQL; Microsoft has earned nothing less with its absolute disdain for its users.
Oh, forget about string splitting and the like. The language itself (T-SQL, and SQL in general) is a disaster. No modularization possibilities (wanna do this particular "SELECT" but with slightly different something? well, copy-paste the whole thing), sensible code autocompletion is next to impossible (type "SELECT _" wat? from wat?), don't get me started on type conversions, dumbest possible error messages, the "IN (@multipleValues)" disaster, and oh my.
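To spell out the "IN (@multipleValues)" point, a small sketch (hypothetical table and column names):

  DECLARE @ids varchar(100) = '1,2,3';
  SELECT * FROM Orders WHERE CustomerId IN (@ids);
  -- This compares CustomerId against the single string '1,2,3', not against the list 1, 2, 3,
  -- so you end up reaching for dynamic SQL, a split function, or table-valued parameters instead.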
The MSSQL equivalent of GROUP_CONCAT in MySQL or STRING_AGG in Postgres is ridiculous. Yes, I understand it's more versatile and extensible to use a subselect (which can mostly be optimized away), output it as XML, prefix each field with your delimiter, and then ignore the first character of the concatenation of those fields, but come on. Throw your users a bone once in a while...
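For anyone who hasn't seen it, the workaround in question looks roughly like this (hypothetical table and column names):

  SELECT p.Id,
         STUFF((SELECT ',' + t.Name
                FROM Tags t
                WHERE t.PostId = p.Id
                FOR XML PATH('')), 1, 1, '') AS TagList
  FROM Posts p;
  -- Note: even this XML-encodes characters like '<' (you get '&lt;') unless you also
  -- append ", TYPE).value('.', 'nvarchar(max)')" to the subselect.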
"Or a version that works correctly if the data might contain characters such as <"
What other edge cases are you missing? Do you care? Microsoft certainly doesn't.
TSQL is an embarrassment; it is a perfect example of the arrogance of Microsoft that is luckily finally being corrected on many teams (C#, Azure) but persists in the MSSQL camp.
Where I work, we have a DBA and use SQL CLR in production. In fact I'm in the process of writing a SQL CLR user defined function today. It's interesting that Microsoft have dropped support for them in Azure, but they're still supported in SQL Server 2016. Furthermore, starting from Visual Studio 2013 efforts were made to make them easier to deploy direct from Visual Studio, so clearly some people in Microsoft think they're worth supporting.
I'd say as long as you have a sane database structure where you don't try to do all your processing in a single database instance, SQL CLRs are fine. For example, use them in databases that handle scheduled processes rather than in ones that can't have any downtime, and it'll be easier to redeploy them without much ceremony.
I wasn't able to accomplish it and neither could this guy:
And if your answer is RTFM then you've probably not used SQL Server before, or were fortunate enough to have always worked on a perfectly configured instance, because a lot of things (say, debugging) often don't work on a brand new install on a brand new Windows box.
There are tons of things like this where people have been crying for help from Microsoft for years, but the SQL Server team clearly does not give one fuck about their users' problems.
That being said, I don't mind helping out. Here are the relevant notes I made when following the "Developer's Guide to SQL Server CLR Integration" on Pluralsight:
"To configure assembly signing, open project properties, SQLCLR, scroll down to find and select the 'Signing...' button, check the 'Sign the assembly' checkbox, choose or create a 'Strong name key file' (if select <New...>, set key file name e.g. CLRLogReader, and set a password, OK to finish), OK to finish, then close project properties tab."
From the troubleshooting section of the same course...
"If you want to deploy an assembly that uses EXTERNAL_ACCESS (or UNSAFE) permissions instead of SAFE permissions, you need to perform some extra steps in order to grant those permissions to the login deploying the assembly. There are a few different ways to set those permissions (including enabling the EXTERNAL ACCESS ASSEMBLY permission for the DBO and switching the TRUSTWORTHY property on, or signing the assembly with a certificate), but the recommended method is to sign the assembly with an asymmetric key (recommended because it is both easy to do and secure). In VS, load properties for SQL Database project, SQLCLR tab, scroll down to find and click on Signing... button. Check the 'Sign the assembly' option, create/select strong key file, OK. Build project, make sure this process completes without errors. Then in SSMS will need to set up a user that can use the key that was used in the build process. Example:
  CREATE ASYMMETRIC KEY CLRLog
      FROM EXECUTABLE FILE = 'Path to SQL CLR DLL';  -- note: this DLL is the one created in the build process carried out in VS
  CREATE LOGIN CLRLogin FROM ASYMMETRIC KEY CLRLog;
  GRANT EXTERNAL ACCESS ASSEMBLY TO CLRLogin;
Note that you only have to configure this assembly signing once; subsequent deployments will reuse the key and login we configured to deploy the assembly, so deployment then becomes as straightforward as deployment of SAFE assemblies."
Hope these notes prove useful.
I've also used CLRs in production since SQL Server 2005. Not sure which circles you run in, but it's quite different than what I've seen. I also use Service Broker, so maybe I'm just an odd duck?!
Generally one wants either to extract the Nth element from a delimited string or to convert a delimited string into a table; are you thinking of a different scenario?
Somewhere on an official Microsoft page, there is something to the effect that SQLCLR is "not recommended", which gives your typical DBA a perfect excuse to say "no", with no interest in the particulars of why Microsoft might have written that warning.
Service Broker performs reliable store-and-forward message delivery inside a single SQL Server instance and between instances (on the same machine or over a network). The inter-instance delivery is particularly nice insofar as it can use certificates for authentication (which is very handy when you have disparate Active Directory environments in play).
We've built a couple of systems that use it and have been very pleased with the results. I can't figure out why it's so obscure. The documentation isn't terrible, but it definitely has a steep learning curve.
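To give a feel for the moving parts, a minimal single-instance sketch (hypothetical names; assumes the broker is enabled on the database, and a real setup would usually have separate initiator and target services):

  CREATE MESSAGE TYPE OrderMessage VALIDATION = WELL_FORMED_XML;
  CREATE CONTRACT OrderContract (OrderMessage SENT BY INITIATOR);
  CREATE QUEUE OrderQueue;
  CREATE SERVICE OrderService ON QUEUE OrderQueue (OrderContract);

  DECLARE @h uniqueidentifier;
  BEGIN DIALOG CONVERSATION @h
      FROM SERVICE OrderService
      TO SERVICE 'OrderService'
      ON CONTRACT OrderContract
      WITH ENCRYPTION = OFF;
  SEND ON CONVERSATION @h MESSAGE TYPE OrderMessage (N'<order id="1"/>');
  -- A reader then does RECEIVE ... FROM OrderQueue and ends the conversation.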
As a web developer who has done mostly Rails for a long time, I default to ORMs. And it seems to me that the last 20 years of architecture theory boils down to "How do I put a layer in front of my database?" Once your schema becomes a public API, your app is frozen.
But on the other hand, I believe in using all the features your database has to offer. When I was doing J2EE in the early oughts, everyone spoke about being able to swap out one RDBMS for another, and I eventually decided that kind of lowest-common-denominator approach was not worth the cost in present-day productivity. Plus, building on an open source project is less risky than something with outrageous licensing fees, so why not use it? So my own Rails projects have real foreign keys, CHECK constraints, some SQL functions (even some in C), views, and big queries. If I can express an invariant, I trust Postgres more than myself to enforce it.

I've also had some projects lately where building a JSON response for the front-end involved hundreds of ActiveRecord objects, and dodging that by using the JSON-construction functions in pg 9.4+ is a huge performance win. You can even still use the ActiveRecord DSL to build the query, using your scopes and associations and Ruby code, but then do a `to_sql` and wrap the whole thing in Postgres JSON functions (rough sketch below). I feel like there could almost be a gem for that.

Anyway, sorry for rambling. :-) I agree that Andl could make it much more appealing for people to "put the logic in the database," and I feel like I have been moving in that direction for a long time. Maybe we will see an ActiveRecord Andl adapter?!
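The JSON-construction trick looks roughly like this (a sketch with hypothetical table and column names; json_agg and json_build_object are available in 9.4+):

  SELECT json_agg(json_build_object(
           'id', o.id,
           'total', o.total,
           'customer', c.name)) AS payload
  FROM orders o
  JOIN customers c ON c.id = o.customer_id
  WHERE o.created_at > now() - interval '30 days';
  -- One round trip, one JSON document, and no hundreds of ActiveRecord objects in between.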
Remember who was putting the money behind those ideas.
IBM, for example, had a vested interest in folks swapping out Oracle for another database... DB2 for Mainframe, or DB2 UDB for AIX/x86. Enterprise is a zero-sum game, and zeroing out the competition is a way to grow the business.
It's bad to the database by making a great show of holding its nose while dealing with the lowly bit box, making scrupulously sure to avoid taking advantage of the fact that the database has locality and enforceability. I've encountered true production race conditions that Rails validations can't solve. It is a professional embarrassment when clients receive duplicate emails about the multi-umptillion-dollar software we've sold them. Meanwhile, a single constraint in Postgres prevents it from ever happening again.
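A sketch of the kind of constraint I mean (hypothetical table and column names):

  -- One row per (recipient, notification) pair, so a race between two app servers
  -- can never produce a duplicate send.
  ALTER TABLE sent_emails
    ADD CONSTRAINT one_send_per_notification UNIQUE (recipient_id, notification_id);
  -- The app records the send with INSERT ... ON CONFLICT DO NOTHING (9.5+) and only
  -- emails when the row was actually inserted.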
And it's bad to the programmer by exposing a massive interface that you are meant to inherit from for construction. Ironically, it's named after the Active Record pattern, which is about making a thin interface to a table, on top of which you then build your own business-specific domain translation layer. But in Rails that's not really how it works.
I used to think the mismatch could be solved by lurching towards sets. But I don't think it would be a very easy transition, and more to the point, I don't have anything stronger to back up my argument than a collection of grumpiness about the ways various ORMs obscure the power of relational modelling.
It's just that databases begin with the design assumption that you want state to be consistent.
"Our definition is strictly more general than CRDTs  in the following sense: From any state-based CRDT we can obtain a FJ-QUA by using the same state and initial state, the same query and update functions, a fork function that creates a new replica and then merges the forker state, and a join function that uses the merge. Note that the definition of strong eventual consistency in , just like ours, requires that updates can be applied to any state."
My favourite gotcha.
It solves distributed statefulness!
... so long as someone else solves statefulness!
Not a bad tradeoff, and I highly recommend the CR papers if you haven't read them. I'd say most programs have a natural expression in terms of revisions, and you're left with only a small set of circumstances where state simply must evolve consistently and atomically (airplane seat reservations, for instance).
I would still like a framework for reasoning about these circumstances that integrates naturally with CR though.
It's like saying you have an O(1) solution to all TSP graphs with the teeny, tiny caveat that you start with a solution to the TSP graph you're working on.
Concurrent revisions in particular has been implemented to work in concurrent, parallel and incremental computing, with no change in semantics. So it's pretty darn close to a magic wand, but it can't violate the laws of logic.
Moreover, while this isn't usually taken advantage of by contemporary databases or schedulers, in research databases knowing details about the transactions that are going to execute allows dramatic improvements in throughput compared to ad-hoc queries (for a variety of reasons, from being able to find a synchronization-free execution that preserves certain invariants to being able to optimize for data locality to trading latency for throughput with group commit... the list goes on).
My personal experience has been that for the vast majority of websites, there simply isn't enough computational work going on over the data to outweigh these benefits.
All that being said: I still hate stored procedures with a fiery passion for other reasons (mostly that they often fail to integrate nicely into the rest of the application's deployment mechanisms, especially if you're using a hosted database solution which prevents custom extensions) and try not to let them anywhere near a database unless there are performance reasons to do so.
(BTW, just to show I'm not talking out of my ass, here's the latest and greatest in serializable performance within a datacenter (aka centralized): http://research.microsoft.com/pubs/255848/SOSP15-final227.pd.... To quote:
"FaRM achieves a peak throughput of 140 million TATP transactions per second on 90 machines with a 4.9 TB database, and it recovers from a failure in less than 50 ms." It also performs 4.5 million TPC-C New Order transactions per second with a 1.9 ms 99% latency [this includes 10% remote partition transactions]).
Probably depends on the plan followed for data replication, as some approaches seem to make it easier to add new DB nodes as you scale. I've not tried it myself, but I think Apache Samza has some promising ideas in this area:
Indeed, moving logic into a relational language is a great idea. If only SQL weren't such a poor pseudo-relational language.
I never used it myself, so I can't comment much about it. The examples on the wiki make it look as if it would perhaps integrate with code better, but that's just a guess.
You defined "created" for me in the context I was using it.
I know a project that uses Oracle 9. The customer has wanted to migrate to another version for the last 5 years, but he can't: there are too many hidden obstacles, so he even has to keep supporting old hardware (because Oracle 9 doesn't run on newer Windows, and old Windows doesn't run on newer hardware). And it's because 50% of a huge system was written in Oracle PL/SQL and it's just not realistic to rewrite it.
It's because databases are seen as a liability. They are often seen as complicated to test, "un-agile", hard to manage... in fact, DBA used to be a job in itself. Now developers are asked to be both developers and DBAs, which leads to over-engineered application layers and under-engineered data layers, or the rise of things like MongoDB.
Losing the DBAs from an organisation might look good on the balance sheet, but it's going to cost you big time in the long run *
* assuming competent DBAs
With PLV8 for Postgres you can write your stored procedures in JS, making your entire stack essentially the same language. I've run a couple of production apps like this and it's worked better than I expected. No DML is allowed, just execution of the stored procedures, which lets us refactor the underlying tables pretty easily.
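A minimal sketch of what that looks like (assuming the plv8 extension is installed; table and function names are hypothetical):

  CREATE EXTENSION IF NOT EXISTS plv8;

  CREATE OR REPLACE FUNCTION order_summary(order_id int) RETURNS json AS $$
    // plv8.execute runs SQL from inside the JS function
    var rows = plv8.execute('SELECT id, total, status FROM orders WHERE id = $1', [order_id]);
    return rows.length ? rows[0] : null;
  $$ LANGUAGE plv8;

  -- The app only gets EXECUTE on functions like this, never direct table access:
  -- SELECT order_summary(42);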
PL/SQL, for all its problems, does make it pretty straightforward to write a data API - AddX, AddYToX, AssociateXAndZ. It provides a nice abstraction over whatever table structure you may have and also makes it very clear, especially in the case of Table per Hierarchy type structures, what fields are required in what cases.
The key thing is to limit yourself to what the language is good at - reading and writing data. You shouldn't need to do things like string manipulation, date conversions etc - that is what the calling code is responsible for doing. For me it makes sense to associate the data with the logic specifically relating to it as frontends can come and go without the underlying structure changing.
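For example, such a data API might look roughly like this (Oracle-style PL/SQL, hypothetical names):

  CREATE OR REPLACE PACKAGE customer_api AS
    PROCEDURE add_customer(p_name IN VARCHAR2, p_id OUT NUMBER);
  END customer_api;
  /
  CREATE OR REPLACE PACKAGE BODY customer_api AS
    PROCEDURE add_customer(p_name IN VARCHAR2, p_id OUT NUMBER) IS
    BEGIN
      -- The caller never touches the CUSTOMERS table directly; the package decides
      -- how keys are generated and which columns are required.
      INSERT INTO customers (id, name)
      VALUES (customers_seq.NEXTVAL, p_name)
      RETURNING id INTO p_id;
    END add_customer;
  END customer_api;
  /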
I'd be keeping ORMs just to feel more secure and be more robust on the testing side.
You want scalability? You have to drop your constraints and foreign keys in your DB. In fact, don't use a DB with a schema.
You want safe data? Use a good old relational DB with well designed schema and you're good to go.
And it works very well!
"This version of Andl is free for any kind of experimental use, especially helping to make it better. For now, the licence does not grant rights for distribution or commercial use. That will have to wait until I can choose the right licence, which depends a lot on who might want to use it."
Thank you very much, goodbye, have a nice day.
> That will have to wait until I can choose the right licence, which depends a lot on who might want to use it
I'm not involved in any way, but I heard about it from the author's activity on the Postgres mailing lists, and I'm very intrigued to see where the project goes. I thought other HNers might be interested too. I'm glad to see people are experimenting with ways to take the relational model beyond SQL.
> The language handler plandl is a DLL written in C, which calls a C++ function, which in turn uses COM to start up the CLR runtime and load the Andl Postgres entry point. Similar capabilities can be provided for Mono.
It's hard enough getting someone to install a non-core contrib module. This sounds like it'd be impossible to get running.
For Mono there's a bit of work involved to load it up but apart from that it's smooth enough. As far as the end-user is concerned, they should just need to install mono and they're good.
Source: I wrote a similar module for FreeSWITCH which has lots of users both on Windows/CLR and on Mono.
Is Andl only for stored procedures?
If so why would it be better than say python?
If I were to choose, I'd prefer a lispier SQL with support for basic things like map, filter, reduce, ... Is there something like that?
If not, why?
UPDATE: Nevermind, I just found the other links (quite small on top of the headline), so most things are explained.
However, the article suggests people use Python or Perl for queries instead of SQL. Has anyone seen this in production (apart from stored procedures)?
Sounds like SQL.
Further to that point, if you ignore the ORM issue there really isn't much going for this Andl thing.
I'm hesitant to dump too much on it as I don't fully understand the implications of what the article is talking about just yet.
I think its a good start but I'm not sold on moving my logic there yet.
I like my storage to know as little as possible about the rest of the world; it only needs to know what my data is and where it is.
Just keep in mind, I understand fully that just because I find my current situation all fine and dandy that doesn't mean there isn't a better way of doing things.
I just don't see it with this Andl thing yet.
Thanks to Edgar F. Codd (not Larry Ellison :) )
By the way, I'll look at Andl. Even if I don't like the ORM idea.
I don't know much about Ellison's story. He certainly has had an unusually long and successful career. I think he had a partner but I can't remember his name now.
Did Ellison really have a visionary connection with computer science? (e.g. along the lines of Gates' "put a computer on every desk") It could be true, but almost nobody thinks of him that way. Certainly relational databases have had an enormous impact on computing and society. Did he ever make any technical contributions, the way Gates and Jobs did? (note: I'm not getting into an argument about Jobs' tech chops :) )
The core technology originated with IBM, and I think IBM had just as much of a role in commercializing it as Oracle, but I don't know much beyond that.
I think there is a meme that Oracle was more of a ruthless competitor, skilled in acquiring other successful companies whose names we don't remember anymore. But perhaps they didn't really move things forward on a technological level the way that Apple and yes even Microsoft did.
tl;dr - yes, Ellison is/was a pretty good programmer, enjoyed programming, and could handle serious technical discussions. But there seems to be a consensus that the guy's real genius was in business. Sounds kind of like Gates in that respect, more so than Jobs, who I believe was never that strong a coder.
1. It's not open-source; the license on https://github.com/davidandl/Andl specifically forbids distribution and commercial use.
2. It only runs in Microsoft Windows.
Datalog https://en.wikipedia.org/wiki/Datalog is much more interesting as an alternative query language — it has better abstraction capabilities than SQL, and it's much less verbose, but as far as I know there's no implementation of Datalog in Postgres.
sounds like .net LINQ
> The language handler plandl is a DLL written in C, which calls a C++ function, which in turn uses COM to start up the CLR runtime and load the Andl Postgres entry point. Similar capabilities can be provided for Mono
Seems like a lot of boilerplate