
Best practices for migrating an Oracle database to Amazon PostgreSQL - yannikyeo
https://aws.amazon.com/blogs/database/best-practices-for-migrating-an-oracle-database-to-amazon-rds-postgresql-or-amazon-aurora-postgresql-migration-process-and-infrastructure-considerations/
======
wmorein
Some more context and backstory from a senior Amazon architect:
[https://perspectives.mvdirona.com/2018/11/1227/](https://perspectives.mvdirona.com/2018/11/1227/)

~~~
nateburke
Wow! Getting it from Jassy is one thing, but hearing an engineering mind like
James Hamilton sounding off like this is a surprise. I can't imagine what
Oracle must have done to deserve a dressing-down like this.

~~~
taf2
I don’t know maybe something like this [https://www.businessinsider.com/larry-
ellison-says-its-embar...](https://www.businessinsider.com/larry-ellison-says-
its-embarrassing-for-amazon-to-rely-on-oracle-tech-2018-8) ?

------
sleepybrett
This and the Corretto announcement. Those are some shots over the bow.

[https://aws.amazon.com/fr/corretto/](https://aws.amazon.com/fr/corretto/)

~~~
xrd
I'm not sure I understand how Amazon can offer something like this. IANAL,
clearly, but isn't anything one or two degrees away from "java" something that
Google just fought a long court battle over? Is Corretto different because it
is running inside Amazon's data centers? Was the Google blunder that they
supposedly used actually Oracle (acquired from Sun) JDK code inside their
Android/Dalvik?

~~~
pavon
This is fine. Amazon is just redistributing OpenJDK according to the terms of
its license, which is indisputably legal.

The problem with Android was that it was using, Harmony, a clean-room
implementation of JDK whose license is incompatible with OpenJDK. It was
believed at the time that APIs were not copyrightable, and thus that clean-
room implementations were not derivative works. The courts disagreed. Now
Google just uses OpenJDK for Android and they are back in the clear again.

------
seanharr11
We did this with the New England Patriots, only we moved to an on-prem
PostgreSQL instance. Specifically, we were on Oracle 8, so just finding the
compatible drivers was half of the battle. I wrote a Python utility that
carries these (nasty) migrations out from end-to-end. It works well, probably
doesn't scale well, and needs tests. IT also doesn't solve the problems of
Oracle-specific features, we just dropped those, which is probably not
suitable for most folks, though many of these can/should be done at the
application level.

Anyway, I made a promise to myself that when the project hits 300 stars I'd
write tests and refactor for scalability. I need 3 more. Probably will be
looking for some help if anyone's interested...
[https://www.github.com/seanharr11/etlalchemy](https://www.github.com/seanharr11/etlalchemy)

~~~
shshhdhs
You’re now at 309! Sounds like some people want it!

~~~
seanharr11
I've begun the rationalization process...I should be able to start this
sometime in January. I've begun drawing up the plans though...

------
matchagaucho
The irony of being this transparent with CIOs about DB migration details is it
instantly triggers an opportunity cost discussion.

 _" This looks like it'll cost $3M-$5M to migrate away from Oracle. But Oracle
only increased our licensing costs $500K per year and I already have a team of
5 Oracle DBAs!"_

Okay... but those costs compound each year.

Don't get me wrong... I'm all in the "migrate to cloud" camp. But DB migration
is one of _the_ most difficult and political conversations to have these days.

------
irrational
We've been working on exactly this for more than a year. Well, not to Amazon
PostgreSQL, but to PostgreSQL on our own linux setup. Using ora2pg makes
moving the data over fairly simple (once you get all the settings right so the
data flows quickly instead of taking forever). The thing that takes a long
time is converting and testing all of the thousands of sql
statements/triggers/procedures/etc. from Oracle syntax to Postgres syntax.
Here is a partial list of things to look for:

replace nvl with coalesce

replace rownum <= 1 with LIMIT 1

replace listagg with string_agg

replace recursive hierarchy (start with/connect by/prior) with recursive

replace oracle insensitive query format with insensitive_query()

replace minus with except

replace SYSDATE with CURRENT_TIMESTAMP

replace trunc(sysdate) with CURRENT_DATE

replace trunc(datelastupdated) with DATE(datelastupduted) or
datelastupdated::date

replace artificial date sentinels/fenceposts like to_date(’01 Jan 1900’) with
'-infinity'::date

remove dual table references

replace decode with case statements

replace unique with distinct

replace to_number with ::integer

replace mod with % operator

replace merge into with INSERT ... ON CONFLICT… DO UPDATE/NOTHING

change the default of any table using sys_guid() as a default to
gen_random_uuid()

oracle pivot and unpivot do not work in postgres - use unnest

ORDER BY NLSSORT(english, 'NLS_SORT=generic_m') becomes ORDER BY
gin(insensitive_query(english) gin_trgm_ops)

replace UNISTR( with U&’

Oracle: uses IS NULL to check for empty string; postgres uses empty string and
null are different

Fix string IS NULL comparisons: field1 IS NULL becomes COALESCE(field1::text,
'') = ''

Fix string IS NOT NULL comparisons: field2 IS NOT NULL becomes (field2 IS NOT
NULL AND field2::text != '')

If a varchar/text column has a unique index a check needs to be made to make
sure empty strings are changed to nulls before adding or updating the column.

PostgreSQL requires a sub-SELECT surrounded by parentheses, and an alias must
be provided for it. - SELECT * FROM ( ) A

Any functions in the order by clause must be moved to the select statement
(e.g. order by lower(column_name))

Any sort of numeric/integer/bigint/etc. inside of a IN statement must not be a
string (including 'null' \- don't bother trying to use null="", it won't
work).

Concatenating a NULL with a NOT NULL will result in a NULL.

Pay attention to any left joins. If a column from a left join is used in a
where or select clause it might be null.

For sequences, instead of table.nextval use nextval()

Recommendation: change all varchar2 columns to be not null and set the default
to be ''. This fixes a lot of the issues with the difference between how
oracle and postgres treat empty strings and nulls.

~~~
aidos
Curious about the below, what happens on Oracle?

>>> Pay attention to any left joins. If a column from a left join is used in a
where or select clause it might be null.

~~~
irrational
It's a side effect of empty strings and nulls being different on Postgres, but
not on Oracle.

~~~
aidos
Oh really? You’d get empty strings for every column in the outer joined table?

------
40acres
I had a long chat with a senior engineer at Amazon a few days ago and this
migration came up, apparently the first whispers that lead to this migration
occurred back in 2010. Just goes to show how long it takes to complete these
projects at scale.

------
latchkey
My experiences with Oracle databases back in the late 90's is partly what
drove me to really strongly adopt open source. My webdev company was doing a
contract for Ascend Communications who made ISDN modems, to build their whole
website content management system.

I wrote an adapter that tied PHP (pre-1.0) and Java together before we had
servlets and Tomcat. We used Java for the ‘backend’ and PHP for the ‘frontend’
rendering. Java talked to Oracle, pulled data out of the db and sent it to PHP
for rendering the HTML. Oracle was forced on me because Ascend had a contract
with them. This project was a nightmare on many levels, but Oracle was the
majority of my drama.

Eventually, this project drove me to figure out servlets and Apache Jserv
(first open source servlet engine)… which then led to myself and a couple
random guys from Italy who had joined to work on JServ, to get Sun to open
source Tomcat and create Java Apache (later renamed to Jakarta). If it wasn’t
for all that, other Java stuff might not have happened at Apache (Hadoop,
Lucene, etc).

------
atonse
I remember the main thing I learned after getting Oracle certified (15 years
ago) is that I never want to work with Oracle again.

Wow the guts of that thing are a god-awful mess.

------
JDWolf
I had an exec level interviewer ask me about migrating from sql server to pg
and deep down I knew it would be a nightmare but also knew that I would have
to say its ok in order to continue on with the interview process, I paused and
said that it was possible but would likely require a lot of work....luckily
didn't get that job

~~~
weliketocode
Was he/she testing your database knowledge or hoping you'd be able to fix the
current issue at hand?

~~~
JDWolf
I think it was a "cost saving" proposal because the technical interviewer had
already gone through the database nitty gritty.

------
mberger
Someone tell me how to migrate all the PL/SQL I've written and I'll be all
ears.

~~~
iaabtpbtpnn
PL/pgSQL is close enough in terms of language features to be nearly syntax
compatible after some ora2pg-like transformations, aside from autonomous
transactions, which apparently are now supported by Postgres 11 procedures (I
haven't tried this). Your main problem will be all the DBMS_* package calls...
orafce helps a bit, but not if you've gone really far into Oracle-land.
Procedures you wrote for DB system maintenance may not have a meaningful
translation.

Alternatively, you could pay EnterpriseDB.

------
exabrial
Yikes, call the burn unit.

Context: [https://www.webpronews.com/larry-ellison-amazon-oracle-
datab...](https://www.webpronews.com/larry-ellison-amazon-oracle-database-aws)

~~~
glogla
> You know who else uses Oracle? Another company that hates us, SAP uses
> Oracle everywhere. SAP ten years ago said I hate Oracle, I’m getting off of
> Oracle, I can’t stand these guys, especially this guy that goes on TV and
> makes fun of us. They say we have this great new database called Hanna.

He can't even write the name correctly.

~~~
cronix
To be fair, it's a transcription of the Fox News video interview that's at the
end of the GP's article. He didn't write it.

~~~
glogla
Ah, that makes sense.

------
etlnerd
In my opinion, Diyotta is the fastest and most efficient ETL tool for
migrating databases (Oracle, PostgreSQL, etc) to Amazon Redshift. Check out
this page for more information: [https://www.diyotta.com/solutions/diyotta-
and-amazon-redshif...](https://www.diyotta.com/solutions/diyotta-and-amazon-
redshift)

------
crb002
It is Game Over if AWS starts hosting IBM mainframe DB2. Most customers would
happily pay a 10% premium.

In the CAP theorem context if you need CA you can't beat an IBM mainframe. IBM
can also co-locate nodes of it's latest Power9/GPU supercomputer racks.

~~~
gaius
What’s in it for IBM? Mainframe hosting (e.g. being a 1970s style computer
bureau) is the only possible competitive advantage they have in cloud.

------
corebit
I would love to see a technical justification for doing this, especially that
references strengths gained in a cloud environment.

~~~
discodave
I imagine a major justification, especially for Amazon, is to stop paying
Oracle $$$.

I'm not sure if you consider that a technical, or a business justification,
but I don't really see a distinction here.

Some more strictly technical justifications would be:

* Aurora has the potential to be a better architecture in the long run because it has something closer to an active/active redundancy, rather than Oracle's primary/secondary architecture, where fail-overs are manual and/or prone to failure. * Aurora can also scale out to larger database sizes than Oracle.

~~~
grumpydba
Oracle has had active/active for years. It's called RAC and costs around one
arm.

~~~
chasil
Oracle Standard Edition used to cost around $8k/core.

Now Standard Edition 2 costs $17k/core, but I believe it includes a license to
use Real Application Clusters (RAC), justifying the price hike.

Not much justification if you're only running a single instance, though.

~~~
grumpydba
This is insanely expensive indeed. And limited to one socket per node.

~~~
chasil
Here is the price list:

[https://www.oracle.com/assets/technology-price-
list-070617.p...](https://www.oracle.com/assets/technology-price-
list-070617.pdf)

SE2 processor license is $17,500.

Enterprise processor license is $47,500.

~~~
thorin
Interesting, I don't think I've ever seen a comprehensive price list before
and I've worked with Oracle for 20 years. Generally I've been somewhere with a
site licence so you can use almost anything or used XE and bought standard one
for production.

