
Ask HN: Do you have a Oracle to PostgreSQL migration success story? - unixhero
Do you have a story from a successful Oracle to PostgreSQL migration to share?<p>What did the existing stack do?<p>Why did you want to make the switch?<p>- Technical hurdles<p>- Useful tools in order to faciliate the migration<p>What did the business case look like:<p>- What was the cost of the migration?<p>- What were the savings in the end?<p>- Any new possibilities post-migration that opened up?
======
seanharr11
1\. We (an NFL team) had an Oracle 9i database (which Oracle pulled support
for in 2010) storing NFL player's Scouting information for > 20 years.

2\. We needed to make the switch to a modern solution because as of June 2016,
we had been running w/o support for 6 years (I got here in 2016), and the
RDBMS (9i) was quite slow and not compatible with modern drivers and
applications.

3\. Our biggest technical hurdles were unique to the tool that we used for
migration (etlalchemy, described in (4) below). i.) Finding the correct
cx_Oracle (Python) driver to communicate with the correct Oracle Instant
Client version, and then finding the exact version of Python libraries which
supported that driver (b/c Oracle pulled support for 9i so long ago, finding
the old instantclient binaries was nasty work). ii.) Handling FK constraint
violations when importing data into PostgreSQL. Because you can't turn
constraint checks like you can on MySQL (SET FOREIGN_KEY_CHECKS=0;), data that
violates constraints must be resolved before the migration. As a solution, I
added support to etlalchemy (the next bullet below) to "dump bad rows" to
STDOUT, and skip the foreign key that is violated during the "constraint
migration" step.

4\. Regarding tools, we used
[https://github.com/seanharr11/etlalchemy](https://github.com/seanharr11/etlalchemy)
to facilitate the migration. I created this open source tool 2 years ago,
published it to github, and used it to carry out 99% of this project. The only
thing that had to be migrated manually were the Oracle functions (most of
which we tossed out as they were used for ColdFusion/Report Generation). The
most powerful feature of this tool is its ability to migrate schema (including
column types) between 2 different RDBMS's. It also loads data quite fast
leveraging PostgreSQL's 'COPY FROM' bulk import, and MySQL's 'LOAD DATA
INFILE'. (Constraints and Indexes are also migrated with etlalchemy out-of-
the-box)

5\. From a business standpoint, we are just completing the project now and are
likely going to choose MySQL as our final destination. This is mainly because
our database is very OLTP-oriented. Our applications perform very small
SELECTS, INSERTS and UPDATES to our database at high frequency (think football
scouts updating player statuses, live stats feeds from the NFL, etc...), and
InnoDB (MySQL storage engine) happens to do a very good job of facilitating
this type of database (high-frequency simple updates). PostgreSQL would have
been a better choice if we were to run less frequent queries involving more
computation (OLAP-oriented), but we don't (yet) have a huge Machine
Learning/Data Mining requirement here. From experience, PostgreSQL has a huge
toolkit of functions, very cool indexes, and other nice-to-haves, we just
don't need them for our business case. Both choices have a much stronger
community, and answers are found via StackOverflow/Google rather then reading
the Oracle manual until your eyes bleed.

6\. If you want to read about my experience, I have posted a quick write-up
here, including some more background info on etlalchemy:
thelaziestprogrammer.com/sharrington/databases/migrating-between-databases-
with-etlalchemy

If you decide to give the tool a try (it is battle-tested), please let me know
how it works for you. And please, if you have any questions/feature-
requests/issues, reach out to me. I am trying to get some momentum going for
this project, as it would solve lots of people's database migrations problems
as it solved mine.

~~~
unixhero
In horror I see a lot of business application stacks driven by Oracle.
Probably because of some nifty set feature early on. Again and again we pay
dearly for it.

A small and growing part of me wants to cut through the bulls hit and look
into eliminating it wherever I can sell the idea.

~~~
seanharr11
My experience with my current employer is that Oracle/SQL Server/Windows are
used because of the lack of trust in open source solutions, usually by
management who grew up in the Stone Age of Software.

All 3 of the aforementioned slow EVERY process in the building down,
especially when it comes to application development.

