Hacker News new | past | comments | ask | show | jobs | submit login
Catching SQL errors at build time (github.com)
108 points by houqp 8 days ago | hide | past | web | favorite | 35 comments

On Microsoft SQL server theres a tool in SQL Server Data Tools called "Database Projects". This lets you have the whole database schema in a file tree that can be compiled into a binary object that the SqlPackage.exe executable can diff or publish migrations against a running database.

This provides compile-time analysis of your whole SQL schema.

It's a goddamned buggy disaster and the usability is basically zero, but the concept is cool.

We started using this in Visual Studio for managing the database schema of legacy projects and it works fantastically. You can update the schema files from the database and reverse that to update the database from the schema files. It allows us to put the database schema under version control with all the benefits of branching and merging for a project without traditional migrations.

I haven't found it to be buggy in my usage.

Same here. The previous method was a massive SQL file. Moving to the DB project made such a difference in maintainability.

We've used it on our on-premise product for the past maybe 5 years. I still think it's the future for maintaining a SQL database, and have wanted to do something similar for Postgres/OSS.

I'm curious what bugs you've run into, we have generally stopped being worried about the reliability of the diff scripts (we use those, though, we don't generate them on every deployment i.e. the .exe).

There are a few tools providing this workflow on open source databases. I'm the author of one, https://github.com/skeema/skeema, for MySQL and MariaDB. Or for Postgres check out https://github.com/djrobstep/migra.

I've got the sqlpackage.exe hooked into my CI system - I trust that part, generally.

It just has a lot of jagged edges to cut yourself on.

1) Recompiling views - it does not properly figure out the dependency graph of the views before recompiling them, so if view A depends on view B which depends on table C and you're making changes to all 3 views, I have seen changes to View B not properly appear in View A.

2) Performance is bad on large projects. It's gotten better, but still the compile step is very slow. I have a legacy project that used to take 45 minutes to compile in VS2012, then "20 minute coin-toss success or crash" in VS2015, and now finally seems to work okay.

3) It uses the SQL server's server-level case-sensitivity for diffing schema. Most SQL servers have their schema case-insensitive (it's a dangerous server-wide change to change it). This means changes that things that differ only by case will not get published. That can produce cosmetic problems at the very least. We had a very embarassing bug where somebody was formatting a date in a view as "yyyy-mm-dd" which means year minute day. The change to "yyyy-MM-dd" did not get published.

4) The SSDT release for a build-server is really painful. Before it was accessible through some elaborate parameters in the MSBuild headless installer, but now it's been released as a Nuget package... and they forgot to include master.dacpac and msdb.dacpac, so if your databases have dependencies on those you have to copy them internally.

5) Releasing something that you have to access globally as a nuget package is... odd. I still haven't found the exact correct workflow for getting it onto our CI server. Everything I try feels like an ugly hack.

6) None of the build tool installers set up the needed environment variables, leaving you to figure it out yourself on every developer machine and the CI server. This includes path config, which will tempt you to reference the SqlPackage.exe by its fully resolved path... BUT

7) New versions of the SSDT suite break old versions. So don't reference the old version of SqlPackage.exe by its fully-resolved file path because it's broken.

8) It uses the SqlCmd.exe SQL preprocessor but many tools in the MS suite have minimal-to-no support for the directives. Imagine using a C editor where half the tools freak out at the sight of a #DEFINE. The powershell tool "Invoke-SqlCmd" in particular is a serious "unsafe at any speed" quality product.

9) The differ, in particular, fails at (8). Any inter-database references should be done with precompiler aliasing so you can swamp out that inter-database dependency... which is cool, but the differ then finds false-positives because it thinks.

[$(MyDatabaseReference)] in your source is different from [MyDatabase] in the running SQL server, when that's just before/after the token replacement.

10) Timing out on connecting to a database. When connecting to a large database, it can time-out when opening the connection. The setting to fix this is in the registry - there's no file or command-line setting to change it.

11) The Publish UI often opens a modal behind the main Visual Studio UI. Modal behind UI = "why is my window locked!!!"

12) Canceling a compile in-progress basically is the "crash the IDE" button.

I could go on and on...

That's not getting into a lot of the features that we see everywhere else in the MS stack that are conspicuously missing from Database Projects (like Nuget references).

>It's a goddamned buggy disaster and the usability is basically zero, but the concept is cool.

Yep, the concept is pretty nice, the implementation can be harsh.

Another nice aspect of this is the tooling (Dlls) can be programmed yourself with a .NET language. Ive used the same tooling to build a pre-deploy check for any suspicious DROPs and ALTERs that force manual intervention - it gives you the ability to parse T-SQL exactly how the engine will do it.

Wait, so instead of using SqlPackage.exe you're using the .NET objects to do the deployment, and are auditing the changes as you deploy them? That's cool! Anything posted public on this?

Nothing public and the assembly is basically undocumented but it's basically start with this assembly and go down the rabbit hole :)


Our use case is SqlPackage builds the diff SQL, then we sanity check it using our custom code built on that parse for any DROPs / ALTERs and log/notify for confirmation

I can see how something like this is a nice addition to your tool set, but you really should have good automated tests at build time anyway to catch SQL errors.

I agree, end-to-end tests have been really valuable for catching SQL issues in my experience.

> Identify unsafe queries that could potentially lead to SQL injections

This feature seems like a perfect use case for static analysis and would be a great tool in addition to automated tests.

Just don't use string mangling to combine parameters and queries. The most is switch/case in a loop or something like it doing state machine execution to combine fixed strings. And that only because one does not accidentally build something general enough to allow for SQL injection with such harsh restrictions.

And static analysis is not limited to build time so it can provide value even earlier.

Or later, if your program grows from a small hack to an important application.

This can even be done locally if your tests start and populate an in-memory database like Derby. Before anyone says it, I’m not talking about ordinary unit tests, but they can be part of the test suite.

With embedded databases, I'm fine of not just checking syntax of queries, but the logic that relies on the store.

Not a panacea, but very nice regardless.

First class support for SQL type checking and domain model mapping: https://github.com/nikita-volkov/hasql-th

Very cool! I maintain a similar tool: sqlc https://github.com/kyleconroy/sqlc

In addition to catching SQL errors, it generates type-safe Go wrapping code. Right now it supports PostgreSQL and MySQL.

I’m (obviously) a huge fan of the SQL first approach and hope more people adopt it for their own projects.

Check out OctoSQL if you like a SQL first approach: https://github.com/cube2222/octosql

Disclosure: I'm one of the authors.

We have had compile-time checks on SQL in Scala for years, I believe slick 3 (https://scala-slick.org/) was the pioneer in the language.

One detail that I really liked from Slick is that the compilation fails if your queries can't be performed against your database schema (think about missing columns, or wrong names or types).

While compile-time checks can be very useful, in my experience, there is nothing like having small integration tests that run against a real database which can easily be mounted by docker.

Reminds me of the !sql in rust-postgres-macros:


With the difference: it uses the PostgreSQL parser, not a generic SQL parser

Ooh, that looks nice!

Though it doesn't seem particularly maintained

Interesting. I had been using postguard in Node.js which also statically checks queries. But I wonder, how does this linter handle conditionally concatenated SQL strings in Go?

Good question. This is actually pretty straight forward to implement and it's on my todo list :) We just need to iterate through all phi SSA nodes recursively, see: https://github.com/houqp/sqlvet/blob/master/pkg/vet/gosource....

That reminds me on sqlj https://en.m.wikipedia.org/wiki/SQLJ

I only heard about that in lectures but unfortunately could not really use it (sqlj was only compatible with Java pre-generics, so that was no option). Nice to see this idea revived.

Not really the same but https://www.jooq.org/ is really nice

That's why I love using sql query builders on top of code generation from the migrations. Helps with writing new, type-safe queries and if migrations break the queries, we will know at compile time, its truly amazing.

Cool and all, but wouldn't one prefer db tests?

Both are useful I'd say. Imagine a plug-in for your IDE that performs the static analysis and catches the syntax errors and some trivial semantic errors (typos in some field's name for instance) each time you save the file. I definitely find value in such a tool.

That strongly depends on whether you believe that static languages are useless in the face of unit tests, or whether you believe that static languages are improved by unit tests.

The dynamic language camp would have no use for a tool like this.

Why use a typed language like Go at all then?

Types and static compile-time checking catch different types of errors than tests do. I don't know about you, but I am not smart enough to write bug-free code consistently; I can use as much help as I can get.

In any case, it takes quite a bit longer to run my test suite than it does to compile.

Unfortunately, I tried to give it a spin, and it said:

    ERROR: sqlvet only supports projects using go modules for now.
Ah well.

I love podcasts and I'm glad for any service that helps me find useful content. That said, I'm not quite getting how I'm supposed to be using this.

From a consumer viewpoint, I have a front page recommending episodes and a weekly summary. I get that, cool.

Now if I want to heart episodes to tell other listeners about episodes I particularly enjoy...I go to the website and search/submit the episode...I guess?

Anyway, I'm interested - I just wanted to share my first impression in case it was useful.

Wrong thread :-p

Woah, uh...that's weird. Thanks!

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact