

Migrating an SQLite based django app to PostgreSQL - iuguy
http://stevelord.tumblr.com/post/86674737/migrating-an-sqlite-based-django-app-to-postgresql

======
idm
The biggest problem I have with developing on SQLite is the limited capacity
to perform DB schema migrations. I use South.

SQLite doesn't provide much by way of schema changes, at least not in a manner
that the Django ORM can make use of. I just run Postgres locally now, because
I can be confident that any migrations I test locally will work on the
production system.

------
Rexxar
It would be very nice to have a "postgreSQLite" with all functionalities of
postgreSQL but could be used as sqlite when only one application use the
database.

~~~
wheels
I'm not sure if this is one of the things preventing such, but it's an issue I
had with our DB:

When an embedded database is interacting with something like, say, Apache,
which uses fork to create multiple processes, your locks have to be multi-
process-safe as opposed to multi-thread-safe. For something that's designed
with single-process, but multi-threaded access in mind, that can be a bit of a
chore.

~~~
markessien
Is there any difference in the logic? Locking access to a variable will work
the same way, even if it's multiple processes trying to access it vs multiple
threads. Is the difference just in the mechanism of the lock, or is there some
type of difference in behaviour in accessing shared items between threads and
processes?

~~~
wheels
In a database you have to lock access at the file level, naturally. To do
multi-process locks you need to store your locks either in shared memory on in
an mmap block so that they're shared between processes and then you have to
deal with the case of processes crashing or being killed without releasing
their locks, which can then trigger a deadlock because the lock is held by a
process that doesn't exist anymore.

To resolve that you can pop up to the process tree, assuming you saved the PID
along with the lock and see if that process still exists (and hope its PID
hasn't been reused) and if not iterate over all locks held by that process and
free them to break the deadlock, but that sort of overhead at something as
fundamental as locking code can be annoying.

Also note that your semaphores then have to usually be implemented in assembly
or you have to have a library that implements atomic increments of reference
counts. Qt's QAtomicInt will do the trick if abused a little (just because it
has an assembly implementations for multiple architectures). pthread_* doesn't
have support for this as far as I know.

So, yeah, it's doable and some DBs do handle multi-process concurrent access,
but it can be a pain in the ass. I believe SQLite does it by just locking
complete tables on write access rather than using more fine-grained locks.
MySQL embedded may also support multi-process access -- not sure.

~~~
markessien
I see the problem with the process crashing. Why do you need to implement the
semaphores in assembly? Threads can pre-empt the same way as processes, no? If
it works there, why would it not work across processes?

~~~
wheels
Because there are no POSIX APIs for bare atomic reference counting. Presumably
pthreads implementations do implement atomic reference counting in assembly,
it's just not exposed through the API. For more on Linux see:

    
    
      man futex

------
tk999
sorry for the stupid question. Why people use PostgreSQL instead of MySQL?
Pros and Cons?

~~~
idm
ACID: Atomicity, Consistency, Isolation, Durability

See ma.gnol.ia for an example of what can happen with a DBMS that doesn't
provide this. That's a big reason why I use Postgres instead of MySQL. With
Postgres, the database won't get corrupted by a partially complete operation.

On the other hand, MySQL gets better performance by skipping the whole ACID
part... so if you are willing to work with this (i.e. you have a good disaster
recovery strategy) then you can theoretically handle a higher load by using
MySQL.

~~~
teej
ACID isn't an excuse to skip redunancy and backups.

~~~
idm
Good point, and I didn't mean to suggest that the one precludes the other.
However, the ma.gnol.ia story is a tale of redundancy - they were backing up
corrupted data for weeks, and once they realized this, it was too late. The
database took weeks to fail, and their backups were worthless.

