Hacker News new | past | comments | ask | show | jobs | submit login
Ora2Pg – Oracle to PostgreSQL database schema converter (ora2pg.com)
281 points by jka 6 months ago | hide | past | favorite | 71 comments

A few other really handy tools when migrating from Oracle to Postgres:

Orafce - An extension to bundle in Oracle functions -https://github.com/orafce/orafce

OracleFDW - A foreign data wrapper to query directly into Oracle from Postgres - https://github.com/laurenz/oracle_fdw

also check out ora_migrator https://github.com/cybertec-postgresql/ora_migrator (another extension from Cybertec) which uses oracle_fdw under the hood.

intro blog post: https://www.cybertec-postgresql.com/en/ora_migrator-moving-f...

CYBERTEC also offers a more advanced Oracle to PostgreSQL migration product: https://www.cybertec-postgresql.com/de/produkte/cybertec-mig... Here's a talk from two weeks ago by the CEO: https://www.youtube.com/watch?v=dDRrlgnH7N8

Disclaimer: I work at CYBERTEC

I've used this for migrating some applications. It worked well enough. In addition, the "orafce" extension was also useful in reducing required changes to the actual SQL queries the applications made.

In my case, the applications were relatively simple, but many of the queries used sequences explicitly, and the lack of a common syntax there was a slight pain point.

Surprised that Oracle hasn't yet sued these developers for some api violation or similar bullshit.

I mean, this is software that consumes an Oracle API, not software that replicates and exposes an Oracle API.

Larry's lawyers would probably see that as a technicality which can be resolved in court.

I wouldn’t put it past them but I think a more analogous case would be Postgres copying all the SQL for their default schemas for compatibility reasons.

Just like Oracle copied IBM's SQL in the late 70's...

People always say this but it’s not at all clear cut to me that this is covered by the appeals court’s ruling.

AWS DMS basically does this and then some.

Looks to be another migration service.

Wouldn't assuming an oracle api make it a derivative work though...

Oracle has enough lawyers and resources to make it so even if you ultimately win a suit, you lose.

I mean I don’t think the Oracle license is viral? I don’t know much about proprietary software licensing though.

The default state of copyrighted works (and most commercial licenses) is "no derivatives allowed".

A viral license is "derivative works allowed only if they are distributed under the same license".

A permissive open source license is "derivative works allowed and they can be licensed freely".

I imagine any license oracle has offered on it's db is the no derivative works kind not an open source kind.

I understand copyright but I’m not sure an open source library that uses a proprietary API is considered a derivative work. Is there case law here?

Oh, so the other context here is the Google Oracle lawsuit. In it Oracle is arguing that google implementing portions of the java standard library api in android was copyright infringement.

It's due in front of the supreme court October 7th (Wednesday after next).

It's... certainly a novel interpretation of what is copyrightable, and if they win it's going to cause quite a few problems (e.g. wine is going to be infringing windows copyright, and windows is going to be infringing Linux's copyright (with WSL and some of their other apis), and so on and so forth. Literally everyone, including oracle (see: sql, cloud apis, etc), has been acting like APIs aren't copyrightable.

So anyways, no, no case law (yet) since no one has even established that APIs are copyrightable.

I’m aware of that but there are very different details here. They’re not implementing an API compatibility layer, they wrote a one time migration tool. Beyond that in the case of Oracle vs Google the source implementing the API interface was copied directly. WSL, Wine, and SQL do not do this.

> Beyond that in the case of Oracle vs Google the source implementing the API interface was copied directly.

This is incorrect. It was not. All* the source was rewritten.

* Minus a dozen lines or so, but that was found to be deminimis copying and not copyright infringement.

> They’re not implementing an API compatibility layer, they wrote a one time migration tool.

Writing a one time migration tool however does involve implementing a version of the API... I'm not sold on this distinguishing feature being of any importance, and I think the onus is on you to make a legal argument for why it is of importance (since proving the non-existence of a valid argument here is basically impossible)

Suppose, for example, Google had implemented there vm by creating a one time translation tool that translated java std lib apis to an internal google std lib apis in the bytecode, would that have changed the suit?

I’m not talking about rangeCheck, I’m talking about the code they took from Apache Harmony (37 APIs, something like 12kloc) and the corresponding structure, sequence, and organization of the Java APIs. I don’t feel like relitigating this either as the courts have already decided this was infringing. Also, the de minimis was overturned on appeal.

> Writing a one time migration tool however does involve implementing a version of the API

No, it doesn’t. It involves calling the API, and translating the data structures it returns to another API. If they provided a function identical to one in the Oracle API that did different things in Postgres, that might be different.

>I think the onus is on you to make a legal argument for why it is of importance

I’m not a lawyer so any legal argument I make is unlikely to hold water. Instead I will merely suggest that if this project reused 12,000 lines of SQL included in Oracle Database that would probably constitute infringement, but I don’t believe it does include that.

> Suppose, for example, Google had implemented there vm by creating a one time translation tool that translated java std lib apis to an internal google std lib apis in the bytecode, would that have changed the suit?

I’m not a lawyer, I have no idea and I find counterfactuals mostly uninteresting, especially when the exact details of Oracle vs Google are so frequently glossed over.

> I’m talking about the code they took from Apache Harmony

Which Oracle did not own and which Google used with a valid (up to the Apache Harmony code not being infringing by virtue of copying the API) license, I've literally never seen any suggestion that if Google had written this code themselves anything would have changed.

> Also, the de minimis was overturned on appeal.

Ah, you're right. Either way it's a separate question from the api portion of the case.

> I've literally never seen any suggestion that if Google had written this code themselves anything would have changed.

That’s not what I’m suggesting. If Google had written it themselves it would have had to have been identical, and that is why it’s infringement. The code is the same in the Android SDK as it is in Harmony and Java SE. It doesn’t matter where they got it from, it’s copyrighted, and the copyright belongs to Oracle.

I’m no fan of Oracle, but I also believe that Google has a history of playing fast and loose with copying other people’s technologies and asking for forgiveness later. My overall opinion about the consequences of that case are similar to those expressed in this article: https://www.wired.com/insights/2014/05/non-apocalypse-apis-c...

> which were taken from Apache Harmony, an open-source cleanroom Java implementation developed by the Apache Software Foundation (ASF)

> - https://en.wikipedia.org/wiki/Google_v._Oracle_America

In what world does Oracle own the copyright to this code?

It might (if you buy Oracle's api argument) be an illegal derivative work of Oracle's api, but even that doesn't make them own it.

I'm also very curious to see a source that this is the same code as in Java SE, that seems highly unlikely...

It’s an illegal derivative. Sun never granted a license. And of course the interface declaring code will be the same, otherwise the types would be incompatible and the documentation would be confusing: http://www.fosspatents.com/2012/04/oracle-asks-court-to-clea...

Historically functionally required aspects of a work have not been considered subject to copyright. Oracle's position is contrary to a long history of reimplementation of APIs.

It’s not functionally required because there are multiple (infinite, even) ways of implementing the functionality of most APIs, and the specific choice of function names and types and such chosen is itself a creative process. This was the ruling the appeals court came to.

If I want to reimplement POSIX I need creat, open, ioctl, etc. Otherwise it just doesn't work!

POSIX is an open standard.

Honestly curious: Is there any objective reason for going with Oracle DB over PostgreSQL? Putting emotions aside :).

Or is it that Oracle runs on the fumes of legacy lock ins?

I think many tend to forget that Oracle actually offers an excellent database product, with extremely good replication/failover feature and really good performance. It's not a bad database, and it scales amazingly well.

The answer is still: No, not really.

The price tag of Oracles database license is simply to high, compared to any benefit you might get. It's much better to try to redesign your application to work around any feature you might be missing in Postgresql, or MariaDB.

I mostly agree with this comment. Oracle makes a fantastic RDBMS (so does Microsoft, if we're talking about expensive enterprise RDBMS), but while Postgres has caught up with a lot of the more important features, I disagree with your conclusion.

For many organisations, the licensing costs of an Oracle or Microsoft RDBMS are trivial compared to the costs of the projects they're supporting, and it's easy to find where the value might be. Everything that's difficult about running an RDBMS at scale (clustering, failover/replication, statistics collection, plan management, monitoring, online index rebuilding, table partitioning...) is easier with the features and management tools those companies provide. Add actually decent support to that, and you've got a very clear value proposition.

A Postgres instance will require a lot more hands on management, and a deeper understanding of how the database engine works. At smaller scale, these factors are much less likely to cause problems, and many organisations are unlikely to ever reach actually large scales (many TB instance, running hundreds or thousands of transactions per second). But if you want to start scaling an RDBMS application up, you're going to have to confront those costs in one way or another. After a point, licensing expensive features can start to look better than deploying more specialised labor.

I'm not sure if I remember this correctly, so don't take this at face value:

I remember some colleagues rolling out a project to thousands of postal service branch offices. Oracle was very happy to provide 24/7 emergency DBA services in case anything went wrong with any of the instances (local to the branch offices).

Just an additional guarantee of "if one of them behaves weirdly, we will be right on it" (as a service).

Unless you get a Postgres-based consultancy to do the same for you (and ofc they're not first party), that support is just unbeatable (in terms of what you're selling to your customer, the post offices).

What happens when the Oracle staff doesn't perform to your expectations and the contract?

This is the aspect that baffles me. I've heard people say with a straight face that they'd prefer to use Oracle "so there's somebody to sue" if things go wrong.

But it also seems to be uncontroversial that no good could plausibly come out of attempting to sue Oracle.

Good question. I don't know.

You have to live with them anyway?

You obviously don't try to sue them, their company is 90% legal staff [0]. (Sorry I can't find the original source)

[0] https://am21.mediaite.com/tms/cnt/uploads/2011/06/org-charts...

You use Ora2Pg

Oracle provide a lot of other software, similar to SAP, which are often used by enterprises. A lot of the time it all comes bundled together including a support contract. The database also has a few proprietary features which no other RDBMS provides.

The support is pretty good although it's usually only needed because the products themselves are filled with obscure bugs, edge cases and combinations of features which don't work well together.

PL/SQL - also if you plan on using Oracle Cloud which seems to have a decent serverless compute offering. I'm still curious why IBM hasn't gotten Amazon to host some System Z DB2 mainframes to compete.

Mainframe is the CA of CAP.

Personally I chose SQLite sharded on S3 and dynamodb.

> I'm still curious why IBM hasn't gotten Amazon to host some System Z DB2 mainframes to compete.

Why would they use AWS? They have their own cloud: https://cloud.ibm.com/catalog#services

There are some areas where Oracle indeed offers more features (mainly in the administration/monitoring area) compared to Postgres, but PL/SQL vs. PL/pgSQL as a procedural language PL/pgSQL is not one of them. Both languages are pretty much on par

Not System Z but Power: https://cloud.google.com/ibm/

> Personally I chose SQLite sharded on S3

This sounds interesting, would you mind sharing details?

If you are starting from zero, there is no reason to choose Oracle.

But you can do a lot with Oracle XE and Apex (which are free). and you can link them up with Postgresql through HA or FDW.

Apparently there are some certifications that make Oracle one of a handful of options for regulated industries.

If you bill your clients for licenses and add some markup on that then you make money from it.

Also some customers feel more safe when they know they've paid $10 000 or probably more for an oracle license.

Source: I once suggested to my boss that maybe we could save around $10 000 pr customer site and also save ourselves tons of work by using Postgres instead.

Oracle's "package" feature is useful for organizing methods. A package is the SQL equivalent of a static class. One can put SQL methods inside a package.

Firebird db is the only other database I know of that has package functionality. For a portable database, that is pretty impressive...

In my experience, no. I used Oracle in a professional setting for 15 years. I’ve only used Postgres for 2-3 years, but so far it has been superior to a Oracle in every way.

No need to deal with vacuum and a lot of other performance related features.

A TIMESTAMP WITH TIME ZONE data type that makes sense and is usable.

The first that comes to mind is cost.

I think he was asking the other way around. Unless you are getting a sales commission, of course - then it does make sense to go with Oracle because of the cost. :)

Also Oracle salespeople.

I can deal with the salespeople, they are actually very nice.

I'm fed up with support (it takes so much work to open a ticket) and with the documentation (excessively verbose and full of marketing).

This is fantastic.

I am monthly approached by Oracle the opposite way tho, to migrate services I architect from Postgres to Oracle db. Never succumbed to this date (I would rather switch to farming TBH).

With this tool, I can go offensive :) Somebody also mentioned in the comments orafce extension - even more awesome, I can tell them that I already use Oracle db , its just inside Postgres which is technical detail :D.

> I would rather switch to farming

Beware of John Deere though, I heard they're the Oracle of farming ;)

So happy to see this project in here. It was the very first OpenSource project I have ever contributed to.

Related: please consider getting your company to support pgloader[0] so we can get a piece of free software for migrating from Oracle into Postgres.

[0] https://github.com/dimitri/pgloader

We used Org2Pg to move our Oracle database (which we had been on for 15 years) to Postgres recently. Even using Ora2Pg it took us 2 years.

1. Ora2Pg will create a file that has all of your tables, columns, indices, keys etc. in Postgres format. Unfortunately it is great as a first pass, but we found that we had to do a lot of manually tweaking to get it into a really usable state. For instance, it does it's best guess at picking the right data type, but we didn't agree with all of its choices so we had to look through each one and decide which needed changing. Also, the way primary keys are setup on Oracle and Postgres is very different Ora2Pg takes a literal translation of translating how Oracle primary keys are set up instead of using the Postgres way - so we had to fix all of these.

2. It can't really convert functions and triggers. All of these had to be rewritten by hand.

3. It takes a ton of tweaking and experimentation to get Ora2Pg to transfer the data in a decent amount of time. Our first try took nearly 24 hears. We eventually tweaked it down to about an hour.

4. You will still need to rewrite all of your SQL queries into Postgres format. This is the part that took the most time since we have about 5,000 queries and each one had to be checked.

5. The biggest pain is that we couldn't stop development while the database migration was going on. So every time changes were made to the Oracle database, we had to manually make those same changes to the Postgres DDL files.

Sigh. I am so glad that project is behind me. I don't ever want to be involved in something like that again. It was totally worth it (Postgres is a dream to work with compared to Oracle), but the process was a nightmare.

Some notes about converting queries from Oracle to Postgres:

"foo IN (a, b, c)" can often be better done as "foo = ANY(ARRAY[a, b, c])" in Postgres - not least because you can shift to passing in an array of variable size without having to construct a special statement for each.

for sorting on the result of a function, consider using a subselect. "SELECT foo FROM (select fun(z) as foo from blaz) tmp order by 1".

replace nvl with coalesce

replace rownum <= 1 with LIMIT 1

replace listagg with string_agg

replace recursive hierarchy (start with/connect by/prior) with recursive

replace minus with except


replace trunc(sysdate) with CURRENT_DATE

replace trunc(datelastupdated) with DATE(datelastupduted) or datelastupdated::date

replace artificial date sentinels/fenceposts like to_date(’01 Jan 1900’) with '-infinity'::date

remove dual table references

replace decode with case statements

replace unique with distinct

replace to_number with ::integer

replace regexp_like() with ~ or ~*

replace mod with % operator

replace merge into with INSERT ... ON CONFLICT… DO UPDATE/NOTHING

change the default of any table using sys_guid() as a default to gen_random_uuid()

oracle pivot and unpivot do not work in postgres - use unnest

ORDER BY NLSSORT(english, 'NLS_SORT=generic_m') becomes ORDER BY gin(insensitive_query(english) gin_trgm_ops)

replace UNISTR( with U&’

Oracle: uses IS NULL to check for empty string; postgres uses empty string and null are different

If a varchar/text column has a unique index a check needs to be made to make sure empty strings are changed to nulls before adding or updating the column.

PostgreSQL requires a sub-SELECT surrounded by parentheses, and an alias must be provided for it. - SELECT * FROM ( ) A

any functions in the order by clause must be moved to the select statement (e.g. order by lower(column_name))

Any sort of numeric/integer/bigint/etc. inside of a IN statement must not be a string (including 'null' - don't bother trying to use null="", it won't work).

Concatenating a NULL with a NOT NULL will result in a NULL.

Pay attention to any left joins. If a column from a left join is used in a where or select clause it might be null.

For sequences, instead of sequence_name.nextval use nextval('sequence_name')

> "foo IN (a, b, c)" can often be better done as "foo = ANY(ARRAY[a, b, c])" in Postgres - not least because you can shift to passing in an array of variable size without having to construct a special statement for each.

When using Oracle and the values are not static/a literal you should be using

    foo = ANY(SELECT column_value FROM TABLE(?))
with an array as a bind parameter. This saves on hard parses, increases the effectiveness of cached session cursors and any client side prepared statement cache.

Oracle: uses IS NULL to check for empty string

I don't know anything about Oracle, and I realize the semantics of "null" are subtle, but this seems like a peculiar design decision that conflates two different things. With Oracle, how would one represent the state of a text field we don't know anything about? An empty string implies "we know the value and it is <nothing>" versus "it could be anything; we just don't know".

[edit: reworded for clarity]

In Oracle there is no difference between a NULL and an empty string. My understanding is that this design decision was made long long ago before there were any standards or a realization that this was a bad thing and now it is too late to change.

By far and away, the way NULLs are handled in Oracle is the biggest problem with moving to a different database.

I used this just a few weeks ago and was impressed at how comprehensive the tool is as well as the level of detail in the readme.

Another option, migrate your codebase to use dbt (data build tool) [1] which will give you (among other things) the options of deploying your database project with minimal changes to any number of database products using custom adapters (e.g. dbt-oracle [2] dbt-presto [3])

[1]: https://www.getdbt.com/product/

[2]: https://github.com/techindicium/dbt-oracle

[3]: https://github.com/fishtown-analytics/dbt-presto

The oracle adapter is pretty incomplete.


I’ve been looking at writing a script that parses the daily files for the California State Senate and State Assembly to create an API along with exposing a web app that offers a UI to view and search changes so folks interested can look around.



They give raw sql files in there but also a bunch of oracle that I have had no way to read.

This tool might help me see if there are differences in the file types.

I've been using AWS's Schema Conversion Tool to move from SQL Server to Postgresql, and it has been a good experience.

You could put your data in Postgresql and use Oracle XE/ORDS/APEX as a front-end and have the best of both worlds for free.

Ora is short for O'Reilly. http://ora.com/

That's plain wrong. Ora is short for Oracle

Do they own ora dot kahm?

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact