
Don't test with SQLite when you use Postgres in Production - edofic
http://michael.robellard.com/2015/07/dont-test-with-sqllite-when-you-use.html
======
pimterry
Better: Don't test _only_ with SQLLite when you use Postgres in Production.

His points are all valid, you definitely shouldn't release something to
production that you haven't tested thoroughly in a separate identical
environment.

That doesn't mean you should never test with SQLLite though.

A good pattern I see all the time is to have a final stage of system tests
that run slowly but very accurately in a production-equivalent environment,
and to also have a preceding test stage that uses SQLLite instead. By using
SQLLite in memory it's much much faster and easier to manage and easier to
parallelize etc etc, so you find out quicker if your code is fundamentally
broken.

The goal of testing is to find out as quickly as possible (minimal context-
switching, easy to check locally before commit) whether your application will
work. That means checking the things that are most likely to break most of the
time as early as possible. It's typically not going to be a complex database
incompatibility issue that makes your tests fail. It's going to be the SQL
query you wrote 30 seconds ago, or the leftover query that doesn't work at all
the table structure you've just moved to, etc etc. These are things you can
check perfectly effectively in a lot of cases, and much quicker, with SQLLite
and friends.

Quick test stages that catch all the obvious problems early, final definitive
production-equivalent test stage that makes sure it'll definitely definitely
definitely work in real life. If you do it well, you can set it up so they
both use exactly the same test definitions, and just swap out the DB config.

~~~
DiThi
I came here to say something along those lines. Our project has a SQLite
backend for hassle-free local development, and we use Postgres in all remote
servers. Tests run both in local and remote.

~~~
jeltz
What hassle did you expereince when using PostgreSQL locally? On the projects
I have been in we have ran PostgreSQL in local development almost painlessly.
PostgreSQL is quite developer firendly, at least for my use cases.

I suggest using syncrhonous_commit=off to sped up the test suite and to use
unix sockets (assuming you do not run Windows).

~~~
DiThi
The problem with using PostgreSQL was difficulty to make it work everywhere,
we have at least one dev in each platform. I use the pure JS version of SQLite
(sql.js in npm). The server is fairly small, having the bulk of the code in
the client (SPA).

If it wasn't for the MySQL I had to implement after the SQLite work, I would
say it's better to investigate the quickest way to configure a local Postgres
DB. There are portable windows binaries.

------
andrewvc
The comments here are full of people bucking against this advice. I've worked
at TWO companies now where people said NO we want to use SQLIte in dev. Both
switched in under a year despite fierce internal opposition. Everyone had
changed their tune once they hit growth in users and complexity. Why? Because
being ideologically right is not as nice as being sure you're writing code
that works.

The fact is if you have a serious site there's a good chance SOME specific
feature will start looking appealing ( maybe array values, JSON , window
functions, a pg extension...). These features are written because people use
them, we aren't all writing dead simple CRUD apps. Or you your ORM will
interact with one slightly differently than the other. I don't expect to
convince any of you doubters, but the first time you get a bug in prod that
slipped through the testing process but really could have been caught in dev,
you'll ask yourself why you weren't using pg there too.

There's no sane reason to complicate your life by running two databases unless
you either have a dead simple app, or are one of the jabbering idealists I see
on here. Productive programmers simplify the problem and minimize their tech
stacks.

~~~
LoSboccacc
I am for both:

we run a brutaly different dev environment than production: windows, 32 bit,
hsqldb, windows codepage against linux, 64bit, postgres, utf8 codepage

we also have a beta environment that's a perfect mirror of production down to
the vm vendor and package version and an alpha environment that's on a cheaper
vendor and uses a more updated version of production os/packages (and has
experimental features of our software enabled)

this maybe slows down development a little, but catches a brand whole class of
interesting error that are normally hidden and wait to happen till the next
time you need to upgrade something in the production environment

then again we do have some sane lib that hides the horrible differences
between databases so we have a whole class of bugs that's managed by that
layer. (no it's not an ORM I hate those)

~~~
drdaeman
Seems that someone's running wild, downvoting every comment they disagree
with.

I believe neither maintaining a code that's compatible with multiple
databases, nor using a single one is a mistake per se. Whatever rocks your
boat _is_ the attitude - one just needs to be aware of consequences (both good
and bad ones) of chosen path.

But, yeah, the linked article's warning is completely valid - testing on a
single platform, then deploying to another is likely to encounter some issues
one day.

~~~
LoSboccacc
> I believe neither maintaining a code that's compatible with multiple
> databases

me too! but there are soo many compatibility layers you can get and bolt on
for free and they also work around version issues and things like that... of
course if you need the latest version then you're in a different boat, but the
generic problem of talking to a database is basically solved by now

------
lkrubner
I will tell a story that I think is very strange.

Last year I worked at a small startup that was focused on medical records.
They used PostGres in production, but SQLLite in development. The frontend was
pure AngularJS. They had a massive Python code base for pulling data from the
database, turning it into JSON, and then sending it to the frontend.

But then things began to change.

PostGreSQL gained the ability to handle JSON natively. We began the
preliminary process of switching over to use that ability. It would mean less
work for the Python code.

Here is where it got strange: some of the team was so deeply committed to
using SQLLite, that they began hacking SQLLite to add the ability of SQLLite
to handle JSON natively. That is, any feature that PostGreSQL had, which we
used, they wanted SQLLite to have.

On the one hand, the technology they developed was amazingly cool. On the
other hand, it was a massive effort that had nothing to do with the goals of
the startup.

I could not fathom why we didn't simply use PostGreSQL in both development and
production.

~~~
justinclift
Just to point out... there's no capital G in PostgreSQL's name. It's either
"PostgreSQL" or "Postgres". ;)

Also to be super pedantic... it's SQLite, not SQLLite. ;)

------
dspillett
This is one of those things that I would hope goes without saying, but
obviously doesn't... Always test against what you expect to see in production.
If you test against something else first (in this case mocking through an in-
memory DB) to make the testing of other parts faster/easier then that is fine,
but once those tests are done you still need to do a final full test against
the real stack(s) you expect to see in production.

On the mocking thing: I thought the point of that was to completely avoid DB
calls for speed when testing other logic layers, so even a fast in-memory DB
isn't needed and anything (i.e. just some code that manufactures a response
object) that produces a result set in the right format will do? In that case
even using an in-memory DB is less efficient than the other option so is at
best a "half optimisation". Am I missing a point here?

~~~
plorkyeran
If the in-memory DB has the same feature set as the real DB, using the in-
memory DB for tests should be a single flip you switch, while writing code to
generate fake result sets for every tests requires writing all of that code.
90% of the benefits for 1% of the cost.

~~~
jeltz
But they do not have the same feature sets. And PostgreSQL trivially be tuned
to work almost like an in-memory database.

------
dfox
We are mostly PostgreSQL shop and on last few projects I wanted to make it
possible to use SQLite for development and testing, but I've fount that it is
simply not possible. not because there are things that work on sqlite and
break badly on PostgreSQL, but other way around. Just introducing Boolean or
DateTime column with server_default into your SQLalchemy model causes that you
can not use SQLite.

------
sgt
Postgres is relatively light weight. We use Postgres heavily in integration
tests and it's quite fast. I don't see that SQLite would speed things up
significantly other than causing other potential issues due to it being a
different database.

~~~
StavrosK
You just made me wonder whether disabling fsync for postgres would make your
tests behave non-identically to a database that does fsync, and how much
faster they would be. Interesting question...

~~~
spacemanmatt
I'm pretty sure the behavior would be identical except for durability.

~~~
StavrosK
I think so too. Okay, fsync, off you go for dev.

------
mgamer
I use MySQL in my system but the advice should be applicable to Postgres as
well: keep your test database in a ramdisk. By moving my MySQL to ramdisk I
got almost a tenfold improvement when running tests on a build server. Not so
much (but also significant) improvement when running tests on my development
machine.

~~~
jackweirdy
Why do you think there was a difference between tests on a build server and on
your development machine? Was it just a case of the build server being
configured for performance?

~~~
mgamer
I'm pretty sure that it's down to hard disk performance. I don't recall exact
numbers but difference in IOPS between my development machine and a build
server in the cloud (Azure) was greatly in favour of the former.

------
jgeraert
I tend to abstract away my queries using database views. This way the query in
application becomes very simple (select * from view where $simplepredicate).

In my test database (sqllite,hsqldb,derby,...) i create actual tables
containing test data with the same definition as the views.

This allows me to have rather complex queries that are database specific and
still have fast running database queries.

The views themselves i test separately. They are also easier to fix in
production than code releases, and can also be replaced by materialized views
if necessary.

Inserts/updates are typically simple too.

------
linuxlizard
NASA rule of thumb I read somewhere.

"Test what you fly. Fly what you test."

------
atmosx
A few notes:

\- All the concerns listed in this article are addressed by ORMs.

\- SQLite supports CTE[1] (subqueries).

\- It's safer to use the exact same setup in development and production, but
it's slower for applications with many tests. It's a trade-off and that's all
there is to say.

[1] [http://stackoverflow.com/questions/21819183/how-to-use-
ctes-...](http://stackoverflow.com/questions/21819183/how-to-use-ctes-with-
update-delete-on-sqlite)

~~~
dietrichepp
ORMs are not a panacea, for most applications it makes sense to bypass the ORM
for operations which translate well to SQL but become convoluted or slow when
expressed using the ORM's API. ORMs can't hide all of the differences between
database implementations anyway, not without hiding some of the functionality
that you actually want. It's unfortunate, but unavoidable.

------
mangeletti
I've always preferred this (using Django):

1\. Local development is done with the simplest of everything (local memory
cache, SQLite3 database, console based email backend, local static file
storage (vs S3, etc.)). The result is that there is very little overhead and
everything is easy to work on and test quickly. This also gives me the ability
to quickly wipe out things locally (erase and make new migrations for trivial
changes, delete the database file, clear cache more quickly by just restarting
the Django server, etc.).

2\. Final testing takes place on a staging server (Heroku), which is free, and
which can contain all the production-level backing services that production
will have. This server will be treated like production (no in-place hot-fixes
and database wipes / restarts). Separate dev accounts will of course be used
for things like APIs, email, etc.

3\. Production (Heroku).

This gives me the best of both worlds; the simplicity of local development
with simple backing services, with the comprehensiveness of a staging server
with better parity.

~~~
acveilleux
This makes sense when you use an ORM that will transparently work. You trust
the ORM to insulate you. If you have needed to go direct to SQL for something
then you're looking at putting in test-specific code in your production code
to simulate/fake/disable the custom SQL bit...

Just run postgres on a ramdisk, or on an SSD with sync disabled and a scripted
setUp/tearDown.

------
ahachete
Testing with a different database would only be good if you:

a) Also test with your real production database b) You restrict yourself to
using the lowest common feature denominator of both databases (which is
probably a pretty low figure)

Still, different databases behave differently in many aspects. And despite how
cool SQLite is, PostgreSQL is so much more advanced (specially in SQL
querying) that I don't see the point losing all those features. As I have
already mentioned in HN, check this amazing presentation: [http://use-the-
index-luke.com/blog/2015-02/modern-sql](http://use-the-index-
luke.com/blog/2015-02/modern-sql) before deciding to restrict yourself to a
subset of the SQL supported by PostgreSQL.

Given that it's easy to start PostgreSQL from a unit test, and how lightweight
PostgreSQL is, I see no real point in using SQLite for testing. Use
PostgreSQL!

------
msluyter
On the project I work on, we face a similar problem. We use H2 for an in
memory test database, but now we're frequently running into the problem where
we'd love to use a specific postgres feature (say, json support) that isn't
supported by H2.

That, and occasionally we find syntax differences that cause a headache when
doing a database setup/teardown. A single minor SQL difference requires us to
create separate H2/Postgres flyway configurations.

I think a better option than H2 or SQLite, that we're currently investigating,
is using Docker to bring up a local postgres instance for testing.

(All of that assumes a certain dedication to using Postgres. If you want to be
database agnostic, then you may in fact be better off not using Postgres in
dev/test just to force yourself to remain compatible with other DBs.)

------
serve_yay
I agree. Bite the bullet and learn to work with Pg in your dev environment.
The dev ergonomics aren't worth all the other stuff listed here. Think about
all the time you'll spend reasoning about the differences between the two.
This has nothing to do with your project.

~~~
XorNot
In the age of containers this able thing is baffling to me. I have a couple of
things that need postgres in production, and a single script which launches a
two docker containers with tmux running postgres and apache with live
scrolling logs, and which binds the postgres port to my local host.

I simply cannot understand where the friction on this is existing today. Even
pre-docker its fairly easy to setup any of the big dbms in their own little
environments (5 versions of mysql on the same server for a migration project).

~~~
acveilleux
Plus with a few tweaks to postgres relaxing ACID compliance (i.e.: no sync)
and using an SSD or better a ramdisk for storage performance will be fantastic
for unit tests.

------
mnkypete
Starting Postgres locally for unit testing is quite easy and done with a few
commands (e.g. Powershell)

[https://gist.github.com/tobiasviehweger/cbfd9a1a55bff0862f9e](https://gist.github.com/tobiasviehweger/cbfd9a1a55bff0862f9e)

~~~
spacemanmatt
When I work with PostgreSQL heavily, I don't ever take a server down if I
don't have to. I just have test harnesses create fresh test databases from
templates.

------
code_duck
I've run Linux as my desktop for a long time, so I've always found it natural
to set up my desktop as closely matched to the deployment server environment
as possible. It's very simple to set up MySQL on Debian or Ubuntu, and only
slightly more trouble on OSX.

Using SQLite would actually be more trouble and I've never seen a need to use
it all for development. Any time I've tried to use SQLite, it's been pointless
as some of my queries written to run on MySQL or Postgres fail due to lack of
support for some feature or another. And I already have plenty of MySQL and pg
running on my systems, so...

------
weddpros
We're using Vagrant for developers' environment (with
mongodb/redis/rabbitmq/consul.io)

I can't recommend it enough. Maybe a local memory database could be faster...
until your environment grows beyond just a db

------
K0nserv
Extending his argument further. Strive to have a local environments that is as
close to production as possible. Use Memcache in production? Use Memcache for
dev/testing too! Use Elasticsearch in production? Use Elasticsearch in
dev/testing!

This is all really simple to setup to. Think about how a new developer gets up
to speed and starts coding in your company. Does it involve downloading and
installing Postgres.app[0]? If it does it's no good. Starting a local
environment should require a single command. `vagrant up` is one option if you
use vagrant. Local environments should ideally use the same method of
provisioning as production servers. It can however be simpler, one of our
codebases has a bootstrap.sh file that sets everything up, it works
surprisingly well. No version conflicts, no weird bugs due to slightly varying
versions for different developers and no fake service such as SQLLite.

For the life of me I can't understand the test speed issues that people talk
about. We have a pretty small codebase with some 2k test(94% coverage). That
takes about 6-7 minutes to run inside a vagrant VM using all the same services
as production. 6-7 minutes is a long time, but you shouldn't be running your
whole test suite during development. During development only run the one or
two tests that are relevant for the feature you are building or the bug you
are fixing. These are typically really fast even with a proper database. The
whole test suite should be ran before pushing and by CI. If your database is
still too slow look at using a RAMdisk or tweaking the configuration like
people have suggested in the comments

0: [http://postgresapp.com/](http://postgresapp.com/)

------
rbanffy
While I agree testing well is crucial, running your tests on one RDBMS that's
fast and running them less frequently (think "before merging into production")
is a good compromise. PostgreSQL is slower here than SQLite and if your tests
take 20 minutes to run, your developers won't test as often as you would like
them to.

It also prevents you from doing things that only one RDBMS does (I was bitten
by this because SQLite supports timestamps with timezone data and, at that
time, the MySQL we used in production didn't) making your app more "robust"
(you are using an ORM for a reason, right?).

Imagine you only test your app in PostgreSQL and, because of that, your app
makes assumptions about the environment only PostgreSQL satisfies. You simply
can't move to anything else without extensive rewriting. Now, when the
workload changes, you can't just change your storage backend to one that suits
your workload. You need to make PostgreSQL do whatever your new needs are.
PostgreSQL is probably a good enough solution, but it's not the optimal
solution for all problems.

~~~
electrum
Aside from a few seconds to spin up a new PostgreSQL instance, it should not
be any slower than SQLite. We have a wrapper [1] (forked from [2]) that starts
a new PostgreSQL instance from Java unit tests. We have the same for MySQL
[3]. This allows every developer to automatically run tests against the target
database without needing to set anything up.

(The common alternative in Java is to use H2 or Derby, which are similar in
concept to SQLite)

1\. [https://github.com/airlift/testing-postgresql-
server](https://github.com/airlift/testing-postgresql-server)

2\. [https://github.com/opentable/otj-pg-
embedded](https://github.com/opentable/otj-pg-embedded)

3\. [https://github.com/airlift/testing-mysql-
server](https://github.com/airlift/testing-mysql-server)

~~~
anthonybsd
Hmm, I'm actually on PostgreSQL right now using H2, this might come useful.
Thanks.

~~~
electrum
Just add the Maven dependency and see their own tests for an example:

[https://github.com/airlift/testing-postgresql-
server/blob/ma...](https://github.com/airlift/testing-postgresql-
server/blob/master/src/test/java/io/airlift/testing/postgresql/TestTestingPostgreSqlServer.java)

[https://github.com/airlift/testing-mysql-
server/blob/master/...](https://github.com/airlift/testing-mysql-
server/blob/master/src/test/java/io/airlift/testing/mysql/TestTestingMySqlServer.java)

Though you want to use @BeforeClass / @AfterClass or similar rather than
creating one for each test.

------
jbb555
No, I _will_ do this.

Because it's so much easier and it still lets me test 95% of my code. And the
alternative is not testing at all because there is limited time for testing
and setting up a proper database for this is so much more trouble.

The choices are not good test vs bad test. They are test-with-issues vs. no
test.

(Obviously you have to do SOME testing with the real DB but this article is
talking about unit tests (or related))

~~~
desas
How is it so much trouble? Install postgres & any extensions you use on your
dev/test boxes, this is a one-off cost that takes minutes.

Have your test bootstrapper run "createdb somethingunique" and then "export
APP_DBCONN_STR='somethingunique'". Adding this to your bootstrapper will take
minutes.

------
gitaarik
If you're writing unit tests to test the business logic of your app, you
shouldn't need a database at all. You should write your business logic so that
it isn't dependent on a database, so you can really test the business logic
and you don't have to mock the database.

If you're talking integration tests, then of course you should use an
environment as close to production as possible.

~~~
hakanderyal
For some apps, business logic depends on database functionality. For example,
I rely on Postgres to prevent duplicate records, or delete cascades.

In any case, using SQLite when doing TDD, and testing with postgres when you
are done implementing is an acceptable trade-off for most use cases.

------
bane
SQLite is awesome for rapid prototyping and proofs of concept work. But you
should pretty much always move to a real RDBMS once you start nearing beta.
There's some advantages to using SQLite early on when you're still figuring
things out, but once you've nailed down the schema and queries and done some
early validation testing, you pretty rapidly run out of advantages and it's
just better to go with something like Postgres.

It doesn't really take a lot of discipline or work to switch over, and once
you're entering beta candidate territory, that's when features and performance
tuning start to take over and that's where something like Postgres starts to
shine in comparison.

Even if you keep with the same schema and queries, just moving over to
Postgres on a separate box, you'll probably start to see immediate performance
improvement _and_ you'll get better scaling performance almost immediately.

------
falcolas
My two cents on this topic - as usual, the answer is neither black nor white.
At the end of the day, an in-memory SQLite database works very nicely as a
stub for most standard SQL queries as part of a unit test mocking system.

It allows you to create, populate, test, and teardown an entire relational DB
in hundredths of a second, which makes it ideal for unit tests where you want
to clean the slate between tests to ensure that you aren't accidentally
creating silent dependencies on a database state created from a previous test.

On the other hand, when you're done doing your Red-Green-Refactor cycle for a
new feature, you want to immediately run integration tests, which will
exercise your program against not only your production model database, but the
other APIs you had mocked out for regular unit testing.

Mocking is good for iterative development, and SQLite is a great 80% tool for
mocking.

------
nbevans
He claims SQLite doesn't have CTE's but I believe it actually does.

~~~
zimpenfish
Definitely seems to.

[https://www.sqlite.org/lang_with.html](https://www.sqlite.org/lang_with.html)

------
azinman2
It would be nice if you could embed it and run in memory so it's as
convienent.

And yes, you can do CTEs in SQLite but that's besides the point.

------
Jean-Philipe
I also started using SQLLite for testing (through an ORM), but at some point
it became too painful to maintain and didn't find all of our bugs. It's not
even any faster than Postgres.

------
skarap
This depends on what the other options are. If you have

* test with SQLite

* test with Postgres

then using the same environment as production (Postgres) is of course better,
but if the real options are

* test with SQLite

* don't test

SQLite is much better!

------
ju-st
My experience with testing with Sqlite and Mysql when the customer uses MS SQL
is very bad. JDBC only theoretically abstracts the real DB...

~~~
WatchDog
Since when does JDBC abstract the real DB in theory? All JDBC does is provide
a common interface to execute a statement and receive result sets.

------
Fudgel
What about if you're using an ORM, wouldn't that fix the issues mentioned?

~~~
cmdkeen
No. It's perfectly possible to create queries via an ORM which work with one
database engine but fail in another - whilst not SQLite and Postgres I've
managed to do this several times with Entity Framework and Oracle trying to
use CROSS APPLY thinking it is connecting to MS SQL Server.

~~~
collyw
Its perfectly possible to create queries that work with configuration of the
same database and fail in the other.

------
whalesalad
Can we all go through and s/SQLLite/SQLite? The inconsistency in the article
and the comments is overwhelming.

~~~
onli
Seriously. How is someone that does not even know the correct name of the
software supposed to know enough about said software to know whether it is
similar enough to be a good test replacement?

I know this is a bit too much against the person, but here that is directly
relevant to the topic at hand.

Not that he is totally wrong, but he is wrong in the universality of his
argument. It depends on what postgres-specific features the software is using
and how the communication with the database is handled. If it is handwritten
SQL, it is not a good idea regardless. If it is a ORM that maps the two very
well and there is noting postgres-specific about the software, it is unlikely
to be a problem.

------
ryan-allen
Really??!!

------
matthewmacleod
I reckon it's okay to do this in the right circumstances:

\- It's a relatively simple app \- You're using an ORM \- You aren't using any
advanced SQL features \- It's only to make local development easier \- There's
still a full CI test run with your production database

That said, if your test suite is large enough that database performance is an
issue during testing then either your app is too complex for the above to
apply, or you are probably doing something else wrong.

------
WatchDog
So the alternative is to mock out every database call or use a full database
to run the tests. Using an in-memory database is convenient, keeps your tests
portable and most importantly it helps catch a lot of bugs that might of been
missed with mocking. It might not be as good as using the real DB, but its
better nothing.

~~~
jeltz
PostgreSQL will be close enough to an in memory database if you turn off
synchronous_commit. For most test ssuties I would suspect much more time will
be spent plannign the queries (which can be improved with prepared staatments)
than the time spent on disk IO if you just reduce the durability requirements
of your test database.

------
PythonicAlpha
Some comments on the arguments:

(1) SQLite did never claim to be as "complete" as other databases -- it is and
will be a "lightweight" database.

(2) Everybody with marginal knowledge of different databases should know, that
using different databases always puts you on risk and needs extra testing. You
would also not recommend to develop your application on Linux, use a
crosscompiler and ship the product on Windows untested. And with that said,
most of the arguments can be deleted.

(3) When you are using an ORM, most of the arguments are obsolete, too.

~~~
empthought
> When you are using an ORM, most of the arguments are obsolete, too.

At the cost of pretty much everything you get out of using something that
isn't SQLite.

------
CHY872
This is way simplistic. Frequently testing with the real database wastes a
tonne of time. I've worked with a team where running a single test with the
true database took a minute, and there were 200 such tests.

Thankfully, the only guarantee they needed was that provided by a key-value
store, so a ConcurrentHashMap was used on dev machines.

Then the true database was used by the CI server, and commits only occurred
when all of the tests passed with the true db.

All of the OP's reasons would (in many cases) pale in comparison to 'My tests
take hours to run'.

~~~
jeltz
If every test takes a minute then the database is probably not the culprit. I
would suspect the test suite does something really stupid which a faster
database would not fix.

We have a slow, unoptimized test suite (it does some really stupid things)
which hits the database in most test cases and it still manages to complete
6166 tests in 4 minutes on my laptop (this includes setting up and tearing
down the database).

