Hacker News: Making SQL Server run on Linux (microsoft.com)
334 points by kogir on Dec 18, 2016 | 140 comments



That's some seriously cool tech. The great thing from Microsoft's perspective is not needing to maintain two completely separate codebases, beyond the SQLPAL and the Linux host extension, which are fairly small in comparison.

Also, having the SQLPAL descend directly from the Windows SQLOS should mean that Windows and Linux SQL Server have similar performance, without paying the significant overhead typical of platform emulation.

Good shit.


> The great thing from Microsoft's perspective is not needing to maintain two completely separate codebases...

This is a great thing for customers too. The systems will work very similarly and less code means fewer bugs and security risks.


"Because of SQL Server’s long history on a single operating system, Windows, ..."

It did come out of the UNIX-based Sybase SQL Server, though:

https://en.wikipedia.org/wiki/Sybase

I wonder if there was an OS abstraction layer at that time (Sybase ported it to OS/2 together with Microsoft first) that was then thrown away by Microsoft after they licensed the source code and went on their own?


There was a fairly large amount of re-engineering work around the 2000/2005 era, and a number of significant feature additions and changes since, and I doubt that non-Windows compatibility was considered at any point during those developments.


Microsoft remade SQL Server and optimized it for Windows after they acquired the code.


What are the advantages of even using SQL Server when I can use Postgres far more easily on Linux?


Being able to use the thousands of commercial products that require MSSQL.

Encryption is easier to manage, and it has built-in solutions for HIPAA and PCI.

SQL Reporting Services and the entire ecosystem of apps that run on top of MSSQL.

Better clustering, HA, revision control, and backup.

Most likely better enterprise support, and you'll be able to port "legacy" applications that were designed with win/sql in mind without having to rewrite your entire DB access and query code.


> it has built-in solutions for HIPAA and PCI

What? Here's HIPAA, point by point [0].

- Do periodic risk assessments and be able to produce the reports if demanded.

- Have a designated HIPAA compliance officer.

- Have a security policy, train employees on it, and ensure it has sufficient teeth to terminate violators.

- Follow the principle of least privilege when granting access to PHI.

- Limit physical access to facilities.

- No shared accounts, and lock workstations when idle.

- Encrypt data at rest and in transit.

- Log data access.

- Protect data against corruption.

I suppose an RDBMS can help protect data integrity through foreign key constraints and other validation tactics, but the rest of this can't really be performed at the database engine layer (unless your users are logging in to MSSQL directly?) so much as in the EMR application and the physical, configuration-management (full-disk encryption plus a screensaver password), and bureaucratic environment it's deployed in.

[0] https://www.hhs.gov/hipaa/for-professionals/security/laws-re...


> - Encrypt data at rest and in transit.

> - Log data access.

> - Protect data against corruption.

It's these three that I imagine MSSQL can help with. I believe it has a built-in storage encryption setup for tables/databases, and it can definitely be set up to log access to the database and the queries run against HIPAA data.


It's got cell-level TDE if you want, which is also transparent to the application; that makes it pretty easy to implement, as you don't need to handle the encryption within your app.

All the key management, split knowledge, key rotation and most importantly access management and a full audit trail are also a "few clicks" away.

Having a turnkey solution for one of the most annoying things to implement correctly should not be underestimated.

Anyone who has done any compliance work knows just how many weirdly stitched-together solutions are out there.

Something along the lines of "we manage split knowledge to our keys by splitting the key and keepass file into 2 files and storing them on USB keys in separate vaults"....

Now, technically this can pass a compliance check, but it's a pretty darn silly situation that exists simply because no better solution works out of the box. It also quite often means that any time you restart your DB there's an entire process of fetching the keys, decrypting the tables, copying their content into temporary views, re-encrypting the files, and removing the keys from the server.

For MSSQL it wasn't that long ago that you had to do this "by hand" too: usually a stored procedure would handle the encryption, with a key that was then encrypted by several KEKs. While that afforded some split knowledge, it wasn't true m-of-n, which also meant that with more than two DBAs you had to make sure all of them knew all of the KEKs, and avoid the situation where the three DBAs who are not off during a given week all know the exact same password for a KEK.
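For reference, the modern TDE setup being contrasted here is a short T-SQL sequence. This is a sketch of the standard steps only (certificate, database, and password names are placeholders), not a hardened configuration:

```sql
-- Sketch of a standard TDE setup; all names are placeholders.
USE master;
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password here>';
CREATE CERTIFICATE TdeCert WITH SUBJECT = 'TDE certificate';

USE MyDatabase;
CREATE DATABASE ENCRYPTION KEY
    WITH ALGORITHM = AES_256
    ENCRYPTION BY SERVER CERTIFICATE TdeCert;

ALTER DATABASE MyDatabase SET ENCRYPTION ON;
-- Back up TdeCert and its private key immediately; without them
-- the encrypted database cannot be restored elsewhere.
```

Everything below the certificate (encrypting pages on write, decrypting on read) is handled by the engine, which is the "transparent" part the parent comment is describing.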


What security benefit does TDE provide over full disk encryption on the database server?


I think it's more of a performance benefit to have MSSQL be aware of and handle encryption directly. Of note in the article is how much they bypass the host OS for things like the disk read cache, etc.


Full disk encryption isn't sufficient for compliance (at least for PCI), since you also need to encrypt the sensitive data in use and manage the keys in a very specific manner.

FDE won't even cover data at rest unless you can do split keys and auditing.

This isn't about security; this is about compliance.


How can the access log help you? Presumably only the sysadmin and the EMR application should have SQL accounts; the only person this guards against is the sysadmin, unless the application somehow passes through the user ID on whose behalf it's querying? Otherwise, you know that a query happened, but not who ran it.


You need to be able to trace individual queries, which MSSQL also makes easy, since you can pass the user identity through and map it to a set of credentials on the DB side automatically.

But in general, again, this isn't about security, it's about compliance, so it's more about doing things by the book -- and a very specific one.
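One way an application can pass the end-user identity through (a sketch using SQL Server 2016's session context; the key and user names are made up) is to stamp each connection, so audit records can be tied back to a person rather than to the shared application account:

```sql
-- The application sets this once per request/connection (names are made up):
EXEC sp_set_session_context @key = N'app_user', @value = N'jdoe';

-- Auditing, triggers, or row-level security predicates can then read it:
SELECT SESSION_CONTEXT(N'app_user');
```

Whether this satisfies an auditor depends on the specific compliance regime, but it addresses the "who ran it" gap raised above.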


Wouldn't it be easier just to run SQL Server on Windows at that point, to take advantage of the MS ecosystem?

Edit: I am not familiar with much of anything Microsoft related.


It is hard to run Windows Server in the style favored by serious devops shops -- CI for infrastructure, totally automated bring-up, etc. There are lots of companies that don't/won't run SQL Server because they don't think they can do that with Windows, or don't want to try to port their tooling.


It's really not; it's that most devops engineers don't want to learn or even bother, and most Windows sysadmins simply don't know how.

Windows automation kinda died when system admins stopped needing to know how to work with the JET DB to do any serious AD work.

I can't think of a single CI or automation thing that works on Linux that you can't do on Windows; if pushed, I can probably come up with considerably more things where the opposite is true.

Licensing is the only thing that really makes Windows machines a pain to work with, unless you are big enough for a self-reporting volume licensing scheme.


> > It is hard to run Windows Server…

> It's really not; it's that most devops engineers don't want to learn or even bother, and most Windows sysadmins simply don't know how.

From a hiring/project/management perspective, this is my definition of "hard".

From a sysadmin perspective, a huge problem is that the amount of knowledge needed to be an effective Windows sysadmin absolutely dwarfs -- by several orders of magnitude -- the amount needed to be effective on Linux.

How many sysadmins have you heard suggest, in all seriousness, that the solution is downtime and a reinstall on Windows? On Linux? Once a sysadmin knows how to move files around and run strace and grep, very little is hidden from them on Linux, while on Windows they still may not even know how to take a backup. It doesn't help that Microsoft refuses to publish documentation that makes them look bad -- like the fact that it was impossible to take an online backup of a server running Microsoft Exchange 2007 prior to SP2.

> I can't think of a single CI or automation thing that works on Linux that you can't do on Windows; if pushed, I can probably come up with considerably more things where the opposite is true.

I agree wholeheartedly with that statement.

I hope that UNIX sysadmins want to learn, and bother to learn, the Windows environment well: UNIX sysadmins already know how a system is supposed to work, and it is by and large the sheer number of inexperienced Windows sysadmins that makes Windows look stupid. UNIX sysadmins have a better chance of running Windows well simply because they won't put up with "have you tried reinstalling everything?"

There's also some really good ideas in the Windows ecosystem that Linux (and even OSX) could really benefit from.

> Licensing is the only thing that really makes Windows machines a pain to work with, unless you are big enough for a self-reporting volume licensing scheme.

An aside: I'm not so sure this is true anymore, at least not with Azure for infrastructure, and Open programs for desktops/laptops.


"> I can't think of a single CI or automation thing that works on Linux that you can't do on Windows; if pushed, I can probably come up with considerably more things where the opposite is true.

I agree wholeheartedly with that statement.

I hope that UNIX sysadmins want to learn, and bother to learn, the Windows environment well: UNIX sysadmins already know how a system is supposed to work, and it is by and large the sheer number of inexperienced Windows sysadmins that makes Windows look stupid. UNIX sysadmins have a better chance of running Windows well simply because they won't put up with "have you tried reinstalling everything?""

PowerShell is the key that unlocks Windows administration, I think. Certainly, as somebody who works on both platforms, learning PowerShell made a big difference to how I work and think when using Windows.

Having said that, Windows Server is still fundamentally bad. The logging is a mess, remote access is awkward compared to SSH on Linux, the failure to properly integrate package management hurts every day, the Registry is still...there, and the list goes on. The disparity of DevOps tooling is another issue. I would agree with the comment that automating Windows takes several times as long as Linux. It's just not happy fun.

At this point, Windows is probably not retrievable. The Windows Server folks will keep improving it bit by bit, but I can't see an argument for continuing to flog the horse and use Windows Server outside of full-dress Microsoft corporate environments.


Windows Server has native SSH.

Also, WMI has always worked and still does.

Logging is great; you just need to configure it and register your own triggers and monitors. It really depends on what you want to log.

It's very easy to log every action, even completely custom ones, and send it to the event log; you can also register custom events if you want to add your own application logging, or use the stubs and just push the information into the text fields.

Again, this simply comes down to two things: tooling and knowing Windows internals well. Sadly, because there are usually very few negative side effects to rebooting a Windows machine, and because the market is flooded with GUI administrators holding MCSA/MCSE certs gained by memorizing exam dumps, it's hard to find people who know, or want to know, how to administer Windows machines to their full potential.


Windows Server has the promise of native SSH, but it isn't part of the OS builds yet. I think that it's fair to say that Windows remote access today is PowerShell and RDP (which is awesome, but not for this).

To me, that's sort of where Windows is now. It has some really impressive stuff: Windows 2000 was an avalanche of innovations, but most of it was built for the corporate LANs of a decade ago. Put Windows Server on a cloud, and most of this stuff is not relevant or useful, and Windows is not great at some fundamentals.

Logging is one of those. If you are interested in this kind of thing, I'd genuinely encourage you to spin up a Fedora or Debian system and watch it work. We are currently in transition between logging systems, but the feature sets are impressive and the error messages are useful. Windows logging is even more patchwork, and the (Microsoft) products that bother with the core logging system tend to fill it with chaff. It might get better one day, but it's not a good experience today, and hasn't ever been.

Another is software management. Package-based software management has to be deeply integrated into the OS for it to provide its full value. Debian have done this for years, and Red Hat have caught up. Microsoft have a muddle of multiple software management systems, and individual admins can't fix it. We just have to work around it and do the best that we can, or decide that the platform is what it is, and move on.


My experience at a largely Windows based Fortune 1000 company was that Windows is that hard. I dealt with 12-ish factor apps on Linux exclusively, but the teams dealing with legacy stuff on Windows really struggled. If you're very careful with the tooling you may get lucky, but I watched as things like Chef, Sensu, and Collectd all took significantly more effort on the Windows platforms.


More recent versions of Windows Server have definitely improved, but this is still pretty accurate. When given a set of tasks to do for Windows and Linux, the latter is always easier. If it takes x amount of time to automate something on Linux, it's 5x on Windows. Every time. Some of this is due to MS themselves, some of it is due to the software vendors, but it's a very real thing.


Can you give an example? In virtually every case I've seen, it's people not knowing PowerShell or the Windows API, or not having the right tools for the job.


I wasn't the Windows guy, so my memory is a bit hazy, but I remember that trying to get Chef properly daemonized on Windows was a nightmare. Likewise, trying to get collectd installed via Chef was great -- if you had a fully supported version of Windows. IIRC the collectd MSI only worked on some versions of Windows (and 32-bit support was a whole other matter).

In an ideal world you wouldn't have to deal with 32-bit versions of Windows, or EOL'd 64-bit versions. The real world with lots of legacy stuff doesn't work that way. Meanwhile EOL'd versions of Linux chugged away just fine.

Iteration on the Windows instances was also quite a bit slower than with the Linux ones. Applying all the requisite compliance stuff resulted in a number of additional reboot cycles. I could get a Linux instance bootstrapped in half the time, without having to bake a custom image.


Chef is the wrong tool for the job; use SCCM. Even if you still want to use Chef as a front end, have the SCCM agents installed and drive them from it.

This is a tooling issue. Don't expect to use a tool that was, for all intents and purposes, hacked together rather than a native solution, and still get good results.


So, basically, Windows doesn't play nice with others and Windows doesn't do automation. That's why things take longer and why it's more expensive. I'm reading through this:

https://www.reddit.com/r/sysadmin/comments/21z62d/sccm_2012_...

And this:

https://www.windows-noob.com/forums/topic/9357-how-can-you-m...

And I'm seeing three things:

1.) SCCM is not nearly as easy to automate as the alternatives.

2.) SCCM is expensive.

3.) SCCM has mediocre support for Linux.

And there ya go. Meanwhile, things like Ansible, Chef, and Puppet can handle your Linux, BSD, and OS X servers with aplomb. Where Ansible, Chef, and Puppet use mature, cross-platform scripting languages, SCCM uses PowerShell. Windows is the odd man out here, and skills from nearly every other OS don't transfer well to dealing with Windows. That's why Windows is seen as more tedious.

And, sure, our Windows guys were dropping to Powershell constantly from their Chef cookbooks. But then you get to deal with maintaining PS 2 and PS 4 scripts. The versions of Ruby and Python you'll find with Ansible/Chef/Puppet are significantly better with backwards compatibility in this regard.

But, let's say that we had bought into SCCM. That doesn't really fix the problems with the other parts of the stack, nor does it fix the inconsistencies across Windows versions.


>So, basically, Windows doesn't play nice with others and Windows doesn't do automation. That's why things take longer and why it's more expensive. I'm reading through this:

And you got this from what exactly?

Windows does automation just fine; you are throwing this word around without even trying to explain what it is you want to automate.

SCCM has OK-to-decent support for Linux; again, depending on your use cases and distro.

I don't understand why you are even bringing up random Reddit and forum posts about a five-year-old version of SCCM.

SCCM isn't expensive; it's usually "free" when you buy enough licenses.

Also, you missed a very critical point: Chef, Puppet, and Ansible can invoke the SCCM agent.

Their built-in agents for Windows suck, so you either write your own automation (e.g. PowerShell) and invoke it via Chef/Puppet/whatever, or you use a decent agent that was actually built for the purpose.

You are constantly complaining about things that ultimately stem from a single source: incorrect tooling.


Exactly this. Think about shops that have all their server tooling built for Linux and want some nice third-party tool, but it requires SQL Server. Either the team grins and bears supporting a one-off OS and application, or they find another solution and MSFT loses a license.


Maybe, but this also gives you more hosting options if you are in the cloud.

Also, tools like Puppet and Chef, for example, have an easier time provisioning Unix-like machines than Windows (although this has been changing, and the situation could be much better; I haven't dealt with it for a bit).

Also, you don't have to deal with licenses for Windows Server instances, which can be another huge pain point.


There are quite a few products that require a Linux host for the main app server and a MSSQL or Oracle DB.

Oracle support is being dropped left and right; MSSQL is there because it was the de facto enterprise DB for ages, and it also has a free version via MSSQL Express.

And lastly, having more options is never a bad thing.


> Oracle support is being dropped left and right

By whom? I think "big" enterprise software (SAP, Peoplesoft, etc.) has and will support Oracle for a long time. Many of their customers have huge investments in Oracle, not only financial but training and institutional knowledge.


> There are quite a few products that require a Linux host for the main app server and a MSSQL or Oracle DB.

Are there? I'm genuinely curious what these products are that run on Linux and use SQL Server.


I should've applied the "require" just to the second part. Just off the top of my head, a lot of IBM products (e.g. Lotus) can be installed on Linux and run on MSSQL or Oracle via an ODBC connector rather than their internal DB2 database.


You'd never run MS SQL Server and these other servers on the same machine either way, so you could easily run SQL Server on Linux and Reporting Services on Windows Server.

Although if you're virtualizing Windows Server, it may be effectively free to run an additional Windows Server instance (since you pay per core regardless); but if your hypervisor's resources are already saturated, you could take SQL Server off of it and put it onto an inexpensive Linux server.


Many superlatives here. Better, easier, faster... than what?

Against which other products did you compare? There are many database systems supporting Linux: IBM DB2 and Oracle Database, to mention a few.


The parent comment asked for a comparison to postgres. So that's the benchmark here. Oracle will of course be different but is also paid.


> having to rewrite your entire DB access and query code.

Anyone who has to do that is doing (or did, in the past) something seriously wrong. For me, the biggest reason not to change database after you accumulate a couple dozen terabytes of data is the inconvenience of dumping and restoring it and the downtime it'll cause.


Why? I am in the process of moving from Oracle to Postgres right now (we no longer want to pay the Oracle tax). It has taken me months of work, but I am almost done. I've been very, very impressed with Postgres so far and am amazed by how much better its documentation is than Oracle's. This required rewriting thousands of queries (I now have a pretty good list of differences between Oracle and Postgres!), but it feels good to be almost done. But now I read that all the work I have been doing is seriously wrong. Why?


> (I now have a pretty good list of differences between Oracle and Postgres!)

Publishing that on a blog somewhere would make things easier for a lot of people. Have you considered it?


It all depends on how much of your "program" lives in the database, in terms of things like triggers, stored functions, etc.

That's the argument against those things - it makes it difficult to switch your database. But there are also some compelling performance arguments in some cases, so in a well-designed system those parts of the code that live in the DB also tend to be important ones.


> That's the argument against those things - it makes it difficult to switch your database. But there are also some compelling performance arguments in some cases, so in a well-designed system those parts of the code that live in the DB also tend to be important ones.

But if they only exist for the critical parts then porting is a lot easier.


This is pretty much the kind of "wrong" I meant. ;-)

In this case, it's still wrong, but faster than doing it right.


I doubt any of those 100,000,000,000 commercial products that need MSSQL would run on Linux either.

In fact, even SQL Server Management Studio is a Windows-only application.


Actually, a lot of them do. It's not hard to find a ticketing system or a call center management system that is written in Java or PHP and only works with the MSSQL or Oracle DBMS.

The app stack is usually more platform-agnostic than you think. Even if it's ASP.NET, you're most likely going to be able to run it or easily port it to .NET Core.

Basically, having cross-platform .NET is pointless without having cross-platform everything else, especially MSSQL.


I was surprised how easy it was to get a couple of ASP.NET apps running under Docker with the dotnet-onbuild base containers... I think having SQL Server under Linux/Docker is a really nice option. Just being able to use it in CI/CD environments on Linux is a pretty big thing.
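For illustration, spinning up the Linux build in a container was a one-liner at the time of the public preview (the image name and environment variables here are those of the 2016-era preview and may have changed in later releases):

```shell
# Run the 2016-era SQL Server on Linux preview in Docker
# (adjust the image name/tag and password for current releases).
docker run -d --name mssql \
  -e 'ACCEPT_EULA=Y' \
  -e 'SA_PASSWORD=YourStrong!Passw0rd' \
  -p 1433:1433 \
  microsoft/mssql-server-linux
```

That's what makes the CI/CD case attractive: a disposable instance per pipeline run, torn down afterwards.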

I've moved away from the MS environments over the past few years, but have missed some of the things that MSSQL makes very easy and that, by comparison, PostgreSQL makes relatively hard. Of course you're paying for it. Having the option to run under Linux is really nice though, IMHO.


> but have missed some of the things that MSSQL makes very easy, and by comparison PostgreSQL makes relatively hard.

Could you give some examples?


Namely, in-the-box replication and failover (checkbox-level interaction via the GUI) compared to a hard-to-configure or expensive-to-license add-on.

IIRC geo indexing was easier in MSSQL, although much more configurable in pgsql.

Not on pg, but LINQ to Entities for .NET is really nice, and the pg adapter (IIRC) wasn't nearly as good. The Node adapters for each are about equal, though.

Out-of-the-box performance on MSSQL, and query performance in general, was better last time I compared (admittedly about two years ago).


They don't have to. Let's say an enterprise wants to start moving its backend to the cloud. There is a very significant price difference between Windows instances and Linux instances. The client commercial product can still run on Windows and the database on Linux.


For that matter, it's been surprisingly easy to get ASP.NET apps to run with Mono or .NET Core in some cases. YMMV of course, and the performance will vary, but it's an option.


Not Windows-only anymore. Microsoft is releasing various tools such as SSMS for Linux too.


No, currently it is Windows-only. Maybe someday we can say it's not Windows-only anymore -- but not today.


Even then... you don't need the DB and SSMS to be on the same box.

The installs are on separate ISOs now, if I'm not mistaken...


I believe so, yes. It's still not a terribly pleasant experience, but WAY better than it was.


I'm wondering if MS will go the electron route for SSMS, considering how well it's worked out for VS Code.


VS Code now has a cross-platform extension for querying MSSQL.

https://marketplace.visualstudio.com/items?itemName=ms-mssql...

Still missing several things that SSMS has, but it's a nice start!


Very nice. I actually meant it would be nice to see a full-blown application built similarly... just the same, it's good to see the extension in VS Code. I'm consistently impressed with the application, but I'd like the original multi-document behavior back instead of tabs.


There are also alternatives to SSMS that run on Linux, like JetBrains DataGrip.


DataGrip is great, but it's got a long way to go before it's comparable to SSMS.


What is it missing?


SQL Agent support is a pretty glaring omission. To be fair, they don't support pgAgent either, so I regularly end up needing to fire up SSMS or pgAdmin to configure maintenance and summarization jobs.


I love Postgres, but the lack of parallel query processing (the same query leveraging multiple threads) has been a deal-breaker for some projects. Thankfully, Postgres is finally starting to make some progress here.

Also of note: in November, MS added a number of previously enterprise-only features to all editions of SQL Server 2016 (Service Pack 1 and later). For example, In-Memory OLTP and columnstore indexes are available even in the free Express edition now. Native Postgres, to my knowledge, doesn't have comparables for these yet.

https://blogs.msdn.microsoft.com/sqlreleaseservices/sql-serv...


> I love Postgres, but lack of parallel query processing ... for some projects.

It has surprisingly little effect on many workloads, though -- the engine is fairly conservative about when it will use it (and there are easy ways to accidentally make it not consider parallelism).

For many workloads you are often better off using the many CPU cores to run different queries rather than having them cooperate on a smaller number.

Data warehousing and "big data" are places where this isn't the case, though, and other features MS SQL Server has over PG and others shine here too: the columnstore indexes, in-memory processing, and so forth. You get these in the free Express edition too, as of 2016 SP1, though that edition is only really useful for small projects and prototypes due to its other limitations (10GB per database, and at most 4 CPU cores and 1.5GB of RAM put to use per instance), so you'll be paying for Standard edition at least for real work.
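For context, Postgres 9.6's then-new parallelism is gated by a planner setting, which is also one of the "easy ways to accidentally make it not consider parallelism" (the table name below is made up; the plan shape is a rough sketch):

```sql
-- Postgres 9.6: allow up to 4 workers per Gather node; 0 disables parallelism.
SET max_parallel_workers_per_gather = 4;

EXPLAIN SELECT count(*) FROM big_fact_table;
-- A parallel plan looks roughly like:
--   Finalize Aggregate
--     -> Gather (Workers Planned: 4)
--          -> Partial Aggregate
--               -> Parallel Seq Scan on big_fact_table
```

Session settings, certain functions not marked parallel-safe, and writes inside the query are among the things that quietly suppress the parallel plan.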


I've always found MS SQL pretty impressive in terms of features for the money, though it gets pricey pretty quickly, and I've always found it nicer to work with than Oracle or DB2 by comparison... although I do wish they'd support the more common LIMIT syntax that many other DBs use. That said, from a devops perspective MSSQL doesn't have any real peers.
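To illustrate the syntax gripe (table and column names here are made up), the same "first 10 rows" query reads differently across engines:

```sql
-- Postgres / MySQL / SQLite:
SELECT name FROM products ORDER BY price LIMIT 10;

-- SQL Server (2012 and later), standard OFFSET/FETCH form:
SELECT name FROM products ORDER BY price
OFFSET 0 ROWS FETCH NEXT 10 ROWS ONLY;

-- Older SQL Server idiom:
SELECT TOP 10 name FROM products ORDER BY price;
```

Note that SQL Server's OFFSET/FETCH requires an ORDER BY clause, unlike LIMIT.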

Some of the analytical tooling around MSSQL is pretty good too. I've had to track down performance issues in the past, and even the general query analyzer integrated with Management Studio is great.

I do hope they consider going the Electron route for a cross-platform Management Studio; it's worked really well for VS Code, and I can imagine it would be a great fit here as well, even if it's quite a bit outside their current tooling.


For you personally there probably are no advantages. For large organizations that are already heavily invested in SQL Server, however, there may be benefit in being able to migrate off of Windows, especially if most of the rest of the server infrastructure is already Linux based.


My common answer to this is query parallelism. For those of us in the DWH/OLAP space it's a big deal that the F/OSS engines are just starting to tackle, and it's not an easy problem to solve.

I might not mind paying for MSSQL if I didn't also have to pay for Windows.


Quite a lot of enterprise features made it into the free SQL Server Express this winter.

[1]: https://blogs.msdn.microsoft.com/sqlreleaseservices/sql-serv...


Other limitations (10GB per database, and at most 4 CPU cores and 1.5GB of RAM put to use per instance) mean Express is only useful for small projects and/or prototypes, though -- you'll be paying for Standard edition for real work. Those previously enterprise-only features landing in Standard is a huge and convenient change too. It's worth noting that Developer edition is also free these days (as of 2016 RTM, IIRC), so for prototyping work (unless you need to support older versions) you are better off with that than Express, as it has always basically been the full Enterprise edition.


PostgreSQL has features MSSQL does not have, and vice versa.

PG is still very lacking when it comes to multi-node clusters for high availability for example.


OLAP for one. Initial support for basic OLAP conventions was introduced in 9.5 (grouping sets, rollups, cubes, upserts), but there's still a lot missing (like nested aggregations), and a lot of performance tuning that needs to be done.

We cannot run our BI on Postgres or we would. But with Microsoft's latest work, even products like Oracle Hyperion can run against SQL Server running on Linux.
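As a sketch of what the 9.5 additions look like (the sales table and columns are made up), the grouping-sets family replaces what used to require unioning several GROUP BY queries:

```sql
-- Explicit grouping sets: per (region, product), per region, and a grand total.
SELECT region, product, sum(amount)
FROM sales
GROUP BY GROUPING SETS ((region, product), (region), ());

-- ROLLUP is shorthand for exactly that hierarchy:
SELECT region, product, sum(amount)
FROM sales
GROUP BY ROLLUP (region, product);

-- CUBE expands to all combinations:
--   GROUP BY CUBE (region, product)
--   = GROUPING SETS ((region, product), (region), (product), ())
```

The missing pieces the parent mentions -- nested aggregations and the performance tuning around these plans -- sit on top of this foundation.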


You almost certainly already know this, but PostgreSQL behaves very differently depending on its config parameters. We've been able to get decent OLAP performance at least to the dozens of TB scale.

Very interested to hear how you made the decision to use MS SQL, and the test cases that drove that decision.


Not OP, but for us the reasons we switched from Postgres to SQL Server were the clustered columnstore indexes and query parallelization. Before switching, we spent a ton of time optimizing query performance with "clever" use of indexes, CTEs, etc. With SQL Server, we've been able to just use the columnstore indexes and write simple, straightforward queries that run in ~200ms vs. 10-20 seconds in Postgres.

And before you ask: we spent A TON of time and money on configuration, paying multiple Postgres experts to help with optimization.
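For readers unfamiliar with the feature being credited here, a clustered columnstore index is declared once and then applies transparently to queries (a sketch with made-up table and column names):

```sql
-- A clustered columnstore index stores the table column-wise and
-- compressed, which is what makes wide aggregate scans fast.
CREATE TABLE dbo.fact_sales (
    sale_date  date,
    product_id int,
    amount     money
);

CREATE CLUSTERED COLUMNSTORE INDEX ccix_fact_sales
    ON dbo.fact_sales;

-- Aggregates then read only the columns they touch, in batch mode:
SELECT product_id, SUM(amount)
FROM dbo.fact_sales
GROUP BY product_id;
```

This is why the "simple, straightforward queries" approach works: the storage layout does the optimization that hand-tuned b-tree indexes and CTEs were approximating.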


I mean, if you were already jumping over to using columnstore tables why not use Cassandra? I guess CQL has a lot of limits that SQL doesn't, but if your use-case matches a columnstore it's an option to consider.


Cassandra really isn't a columnstore in the same sense that the word is used for OLAP systems. In the latter, efficient queries can be run across lots of values in a small number of columns across many rows. In Cassandra, queries spanning many rows are not particularly efficient. Sometimes wide-column tables are used for OLAP-type use cases in Cassandra, but that's something of a different approach. More info on that here: http://dbmsmusings.blogspot.com/2010/03/distinguishing-two-m...


My guess would be the integration with everything that wasn't a columnstore... Being able to do both in the same dbms is an advantage. Also, doing everything in Cassandra is hard, and takes a lot of planning and effort that may have not been available.

I can't speak to their specific use cases though. I've been a bit of a RethinkDB fan myself in terms of straddling paradigms. I hope they shake out into a foundation structure with as much as possible intact.

I do think that once PostgreSQL gets some of the clustering/HA features in the box and shaken out, it will become the default option for most uses. As it stands MSSQL does have some compelling advantages, some of which you pay for.


> I do think that once PostgreSQL gets some of the clustering/HA features in the box and shaken out, it will become the default option for most uses. As it stands MSSQL does have some compelling advantages, some of which you pay for.

Or even an add-on that's not so difficult to configure. EnterpriseDB has made strides here, but justifying the license cost is harder for me now that they've moved to their new Unicore model.

`repmgr` from 2ndQuadrant is getting really close to ideal and it's included in the PGDG yum and apt repositories, my biggest issue is the manual plumbing involved to get clients to automatically pick up the new master after a failover and automatically rewinding a failed master and bringing it back up as a standby (both are done out of the box by EDB's solution).


Interesting. I guess it sometimes shows that paying the license fee can be worth it from a business's point of view if it avoids the consultants' fees associated with an open source solution.


What version(s) of PostgreSQL were you using?


9.5, however we did experiment with 9.6 when it was in beta, due to the parallel sequential scan feature, but it didn't seem to help much.


Timing. Postgres 9.5 was not on the horizon when we were building out our solution. Lacking OLAP support, the query complexity to achieve similar functionality would have been very cumbersome, error-prone, and non-performant (excessive use of unions, massive case clauses, lots of repetition, inscrutable window functions - a mess). Further, our finance team later insisted on using Oracle EssBase, where Postgres in any version will never be an option.


> like Oracle Hyperion can run against SQL Server running on Linux.

Do I know you? :) Did you actually try HFM or Planning with realistic workloads? I'm very curious... I expect Oracle will soon say SQLServer on Linux is not supported, any day from now. Their Linux support is mostly geared towards Exalytics, which is geared towards Oracle licenses. Then again, these days they're so desperate to sell extra licenses, they might even accept it.


They may not "support" it, but they would have to do a lot of work to prevent it, since the Linux and Windows implementations are identical. An application cannot tell the difference.


It's been a couple of years since I had to deal with Hyperion FM but even back then I thought it was reasonably well known that SQL Server support was being removed in future versions?


Nah, waaay too many Hyperion EPM customers are on MSSQL for that to ever happen. In fact, despite all claims that products are tested "oracle-first", they still work better with SQLServer...

It doesn't matter anyway, the whole suite is slowly being strangled in favour of crappy cloud versions.


I pray for the day when Postgresql clustering is as good/easy as AlwaysOn Availability Groups.


Paying works better than praying: https://www.postgresql.org/about/donate/

/semi-serious :)


My employer spends money with EnterpriseDB so we've got that covered 8)



thanks!

The sql server integration services seem interesting.


I have used it recently. The main thing I like is that it organizes code visually as workflows and it has many pre-packaged "tasks" you can use. This means I didn't have to come up with some structure for 900 script files.

I did run into several annoying errors when trying to import from excel.


That's one of the cases where the HN discussion is really great, while the original article is rather bad.


Better tooling, a better query optimizer, and more reassuring "enterprise support", if that's your kind of kink.


From a development perspective clustered indexes and SSDT are the best features. Clustered indexes vastly improve query performance on tables where you often want the data by something other than insert order. SSDT lets you version your database in a really easy way and generates change scripts on demand.


You can optionally cluster a table based on index order with PostgreSQL: simply create the index and then `ALTER TABLE foo CLUSTER ON bar_ix`. Clustered indexes are actually an issue I run into with SQL Server, because, like MySQL, the table data is always indirectly referenced by the clustered index, so any query that uses a secondary index requires a second pointer deref to find the data in the table heap. This has a benefit to insert/update performance, but I'd argue that most tables are read-heavy rather than write-heavy.
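The full sequence in PostgreSQL looks like this (names hypothetical). One caveat worth knowing: unlike SQL Server's clustered indexes, Postgres's CLUSTER is a one-time physical rewrite and is not maintained as rows are inserted or updated:

```sql
-- PostgreSQL: physically rewrite the table in index order.
-- Not maintained on subsequent writes, so re-run CLUSTER periodically.
CREATE INDEX bar_ix ON foo (bar);
ALTER TABLE foo CLUSTER ON bar_ix;  -- record the default clustering index
CLUSTER foo;                        -- perform the rewrite now
```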


SQL Server supports heaps (tables w/o clustered indexes) on which indexes point directly to the referenced row. It's nice that they allow you to choose between those and tables with clustered indexes where any secondary index references the primary key. See https://msdn.microsoft.com/en-us/library/hh213609.aspx


You can use INCLUDE to put some columns in the index. Doesn't always help but speeds up many queries. AFAIK Postgres doesn't have that feature, not sure about MySQL.


Using INCLUDE comes at the cost of bloating your indexes, making scans take considerably longer; not having to chase that extra pointer works well enough in most cases (and if you are able to benefit from Heap-Only Tuples you can even have in-place UPDATEs without needing to rewrite the pointer in every index).
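For reference, a covering index with INCLUDE in T-SQL looks like this (table and column names hypothetical):

```sql
-- SQL Server: the key column orders the index; INCLUDE columns are carried
-- in leaf pages only, so a query touching just these columns never has to
-- chase the clustered index / heap pointer at all
CREATE NONCLUSTERED INDEX ix_orders_customer
    ON dbo.orders (customer_id)
    INCLUDE (order_date, total);
```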



Parallel queries have been baked in for a long time. I think Postgres only got this recently.


It will probably allow me to eliminate some Windows VMs that exist strictly to provide some enterprisey Java apps with an MSSQL database.


We did a performe comparison on handling Geo data.

SQL server came out on top.


In what way? I can't even tell if you mean features or performance. PostGIS is unmatched as far as features, as far as I know.


While it is true that SQL Server is missing a fair number of features compared to PostGIS (and Oracle Spatial), I've always found that its spatial performance has been really great for the common features it does support.


Ok, thanks, that makes it worth checking out!


Don't know if the comment was edited, but it says performance comparison? (If I interpret the typo correctly)


it wasn't edited


I said "performe" a typo from my phone, you really didn't know that I meant performance ?

Queries returned results faster in sql server, after checking we had set everything up optimally in everything.


That's one area, where at the time (about 4 years ago), Mongo worked out really well for me... iirc there was a bug I experienced with ElasticSearch, and RethinkDB didn't have geo indexing yet.

It doesn't surprise me though... MSSQL tends to do very well out of the box for most workloads.


Consider that PG only just got fairly rudimentary parallel queries in 9.6 this year, SQL Server had it in (IIRC) 2000. For DW and general DSS workloads that is significant.


Well using Postgres on Windows is just as easy as on Linux. Yet still SQL server is the king :)


Depends who you ask, for you probably not that much. For mickeysoft however... new territory to sell licenses to at some point.

[edit] https://www.microsoft.com/en-us/sql-server/sql-server-2016-p...

It would have been nice if someone were able to explain why the above reasoning about selling licenses is incorrect? There is no other reason for MS to do this besides money.


Maybe next time try without saying "mickeysoft".


Licensing is probably a big reason... integration and migration to azure is probably another reason... MS pricing for SQL is pretty competitive in terms of what cloud offerings are available. I wish their compute would come down a bit across the board though.

I keep thinking it would be cool to see an interactive graph of connection latency, and throughput for 1k, 1mb, 1gb xfers between the major data centers for the big cloud providers. If I could get compute from DO/Linode and use services from Azure and have under 20ms latency for data queries, I might use Azure for services, and Linode or DO for compute.

I know that's too much for some things, but it may be worth the cost in latency for many use cases... saving a few hundred a month on web/service VMs and leveraging Azure beyond that.

(as for downvotes, probably mickeysoft)


Here's a little lighter reading from ars on the subject http://arstechnica.com/information-technology/2016/12/how-an...


Slightly tangential, and while I know nothing about the implementation details of the SQLPAL, the SSCLI (and later CLR/CoreCLR) took the same approach to port Windows based code to Unixes [1] and makes for interesting code reading. For e.g. The SSCLI PAL's implementation of SEH [2].

[1] https://github.com/lewischeng-ms/sscli/tree/master/pal/unix

[2] https://github.com/lewischeng-ms/sscli/blob/adbd6bbad0816f65...


Feels exactly like how they abstracted DirectX from direct graphics driver access, which coincidentally was called HAL.


Not only graphics; the HAL in the Windows NT era was (as the name says) a set of common abstractions for talking to different hardware in a standardized way.


So SQLPAL (great name) is in the current SQL Server for Windows? It's quite an engineering achievement, though you could hopefully automate a lot of the actual abstraction, and I bet the testers were on overtime. I'd be interested to know the timeline, is any version of SQLPAL in SQL Server 2016? Or is it SQL.Next?


The article mentions win32. I remember it from the Window NT 3.1 days. Is it still used? Is the "32" in the name there because someone thought that 32 bits should be enough for anyone?


It's merely the name of the API, even on 64-bit Windows, where it ends up being a misnomer :-)


Anybody looked into migration to the Linux platform? Realistically, what would be the advantages/cost?


Not having to run Windows is a big factor. (Just so you know: I run a lot of Windows servers, so I'm not being knee-jerk snarky. This is a real cost).


Yes, and the cost savings go well beyond a Windows license + SA. Being able to leverage an existing toolchain for configuration management and deployment is a huge cost savings, given that the Windows alternatives are expensive, more difficult to use, and break a lot more (IMHO).



I was thinking more of Ansible, but yeah. I've done SQL Server installs on Windows Core, and it is painful. Aside from the obscure PowerShell commands, you have to write a config file - just to get the damn thing installed. Rubbish. (I hear it's gotten better in 2016 though).


Usually SQL Server licensing dwarfs the cost of Windows though?


It's not just the financial costs...


It remains to be seen whether SQL Server on Linux will require less care and feeding than on Windows.


Being able to yum/apt-get install and edit a config file or run a couple quick commands is a lot easier than needing to figure out how to feed SETUP.EXE the right arguments and deal with a mess of arcane windows CLI tools + a mix of PowerShell to configure SQL Server.

Automating the installation of PostgreSQL on Linux with ansible is a couple lines of tasks in a playbook + a template for postgresql.conf and pg_hba.conf. Automating the installation of SQL Server on Windows (as well as the installation of Windows in general if you aren't using System Center) is a huge pain in the ass.

That's enough less "care and feeding" for me.


Looks like there's chocolatey packages for the express versions: https://chocolatey.org/packages?q=sqlserver


Could you tell me how much RAM does sql server consume on Linux?


My experience of running SQL Server on Windows is that it's basically reasonably painless.

About the most annoying thing about running Windows is updates, as they mostly want to restart the box.

If you delay your updates, it can take 40+ minutes to restart once you decide to install them


The only issue I've ever had with MSSQL is that the Express version comes configured out-of-the-box with a set of defaults that make it nearly useless. TCP/IP connections are turned off, and you have to go through and set all the protocols to use the normal SQL port.


I've seen SQL Express used basically like SQLite a number of times, where it's only being accessed by a local installation of something.

Also if you're installing it manually (and thus having to configure it manually) the odds are good that you're a developer in which case it'll also only be accessed locally.

It's a sane default to not listen to the network. A lot of users won't need it. Those that do can enable it.

The port thing is handled automatically when you have your network clients working correctly.

For the record I'm a SQLite/Postgres guy, the only reason I even know about the port thing is having to help a customer troubleshoot some software that uses SQL express a few weeks ago. Definitely not a MS defender normally, but your complaints are really silly.


Microsoft Linux is still not released, so wait till it is.


Are there any plans for SQL Server to run on MacOS?



I've not heard of any. I would imagine that the real value for Microsoft is in expanding the number of server platforms on which SQL Server can run and MacOS doesn't really fall into that category.




