Hacker News new | past | comments | ask | show | jobs | submit login

We've been working on migrating from Oracle to Postgres for a few years now. We are about 2 weeks from being finished. It is not for the faint of heart, but it is totally worth it. The documentation is much much better, performance is equivalent or better, the sql dialect is saner, etc. Other than moving the data itself (ora2pg was invaluable for this), rewriting the queries is what has taken the most amount of time. Some of our tips on differences between oracle and postgres sql:

replace nvl with coalesce

replace rownum <= 1 with LIMIT 1

replace listagg with string_agg

replace recursive hierarchy (start with/connect by/prior) with recursive

replace minus with except

replace SYSDATE with CURRENT_TIMESTAMP

replace trunc(sysdate) with CURRENT_DATE

replace trunc(datelastupdated) with DATE(datelastupduted) or datelastupdated::date

replace artificial date sentinels/fenceposts like to_date(’01 Jan 1900’) with '-infinity'::date

remove dual table references (not needed in postgres)

replace decode with case statements

replace unique with distinct

replace to_number with ::integer

replace mod with % operator

replace merge into with INSERT ... ON CONFLICT… DO UPDATE/NOTHING

change the default of any table using sys_guid() as a default to gen_random_uuid()

oracle pivot and unpivot do not work in postgres - use unnest

ORDER BY NLSSORT(english, 'NLS_SORT=generic_m') becomes ORDER BY gin(insensitive_query(english) gin_trgm_ops)

Oracle: uses IS NULL to check for empty string; in postgres, empty string and null are different

If a varchar/text column has a unique index a check needs to be made to make sure empty strings are changed to nulls before adding or updating the column.

PostgreSQL requires a sub-SELECT surrounded by parentheses, and an alias must be provided for it. - SELECT * FROM ( ) A

any functions in the order by clause must be moved to the select statement (e.g. order by lower(column_name))

Any sort of numeric/integer/bigint/etc. inside of a IN statement must not be a string (including 'null' - don't bother trying to use null="" it won't work). Concatenating a NULL with a NOT NULL will result in a NULL.

Pay attention to any left joins. If a column from a left join is used in a where or select clause it might be null.

For sequences, instead of .nextval use nextval('')




Somebody can build a consulting business from this HN comment.


This is the first time I have wished Hacker News showed upvotes. They have got to be over 100 with that right?


You can model it.

- Get upvote snapshots of all top-level comments' authors in this thread. Take repeated measurements at relatively low time resolution. Also compare this to any other comments they may be making (have a model of the type of comment: follow-ups that are deeply nested probably have a lower upvote velocity)

- Find archive.org snapshots of OP as well as the authors of all other upvoted comment threads. Find the delta from then to now. Some authors will have recent data.

- For a better model, construct historical graphs of all authors using these methods and use it to iteratively adjust.

- Make sure you understand the ranking algorithm and that it is reflected in your model.

Nontrivial, but could be a fun exercise.


And of course you could store the results on Postgresql!!



Checkout compilerworks.com they work on exactly this problem.


@xzor :+1: for the suggestion/recommendation


We also have a solution: https://www.jooq.org/translate/


It's almost like you should set up a Patreon for this comment.


> replace rownum <= 1 with LIMIT 1

rownum is a pseudocolumn numbering entries, sorting the results will give you records in non increment rownum. On such queries rownum <= N will give you the record that was first in the results before sorting.

LIMIT will give you the first record after sorting.

So rownum != LIMIT and if you really want to implement the same logic you would need to use select row_number() over () as rownum in the ported query, put that into a subquery and filter with where rownum <= N in the outer level.

Example:

select * from (select row_number() over () as rownum, * from pg_class order by relname) as x where x.rownum < 5;

select * from pg_class order by relname LIMIT 5;

Will give two very different results.

The first one maps to oracle:

select * from pg_class where rownum < 5 order by relname;

The second one doesn't.


Our Oracle queries are written like:

select * from ( select * from pg_class order by relname ) where rownum <= 1

So select * from pg_class order by relname limit 1

works for us.


> any functions in the order by clause must be moved to the select statement (e.g. order by lower(column_name))

What does this refer to? AFAIK, you can invoke functions just about anywhere including in the ORDER BY clause:

    => CREATE FUNCTION some_func (int) RETURNS int AS $$ SELECT $1 + 1; $$ LANGUAGE SQL;
    CREATE FUNCTION

    => SELECT t.* FROM (SELECT 1 AS a) t ORDER BY some_func(t.a);
    a 
    ---
    1
    (1 row)
Are you referring to not being able to reference column aliases in an ORDER BY clause? (work around is to use a subquery)

    => SELECT 1 AS a ORDER BY some_func(a);
    ERROR:  column "a" does not exist
    LINE 1: SELECT 1 AS a ORDER BY some_func(a);


I would note that "foo IN (a, b, c)" can often be better done as "foo = ANY(ARRAY[a, b, c])" in Postgres - not least because you can shift to passing in an array of variable size without having to construct a special statement for each.

For sorting on the result of a function, consider using a subselect. "SELECT foo FROM (select fun(z) as foo from blaz) tmp order by 1".


Using SQL Server exclusively has given me a Never-Oracle mindset. Some of these differences sound insane. and

> any functions in the order by clause must be moved to the select statement (e.g. order by lower(column_name))

I'd pull the hair outta my head


My top 1 is this one:

> Oracle: uses IS NULL to check for empty string; in postgres, empty string and null are different

Any programmer who doesn’t distinguish NULL and "" has not passed the 2-year experience. Being generous. How did this make it to production?


Do you mean why does Oracle treat NULL and empty string as the same thing? Internally Oracle changes empty string to NULL values. Oracle won't let you insert an empty string.


But functions in the where clause are an unsargable mess :(


Thanks I’ve sent this comment to our DBAs. Very useful.


> replace merge into with INSERT ... ON CONFLICT… DO UPDATE/NOTHING

PostgreSQL supports MERGE statements [1], and a cursory glance doesn't seem to show that Oracle deviates that much from the SQL standard. Is there a particular reason you are avoiding the merge statement or is it simply a readability preference for the non-standard "upsert" ON CONFLICT extension?

(I'm curious because I was a big fan of MERGE statements in a past life. Part of being allowed to use them in that codebase was proving that the standard was pretty well followed across DB vendors [not that we used more than one], as well as that MERGE performance was critical to some very large dataset joins with complex update logic.)

[1] https://www.postgresql.org/message-id/attachment/23520/sql-m...


PostgreSQL does not support the MERGE statement (at least not yet).


IIRC there was a discussion about adding the full ANSI MERGE statement, but in the end they added INSERT ... ON CONFLICT and left it at that:

https://www.postgresql.org/docs/12/sql-insert.html#SQL-ON-CO...


Interesting, I can see where some of my confusion comes from if it was discussed but not completed. It's somewhat disappointing as I do think the ANSI MERGE statement is a good tool.


The MERGE statement is less functional and has a LOT of hidden gotchas, I won’t argue that I can be useful but I don’t disagree when them not providing the footgun.


I won't argue that debugging MERGE statements is easy, and that there aren't pitfalls, but it's a very declarative relational model reduce pattern so calling it "less functional" doesn't seem accurate to me, if you mean functional as in functional programming. (If you mean functional as in the raw number of features supported, I also heavily disagree because I know from past experience there are reductions you can do with MERGE that you cannot do easily if at all with other "upsert" patterns.) Again, I might be biased because in that past life I was doing some wild "big data" (of a sort, of a sort) map/reduce type stuff with MERGE (in T-SQL on Microsoft SQL Server, if that matters for painting a picture).


I should have clarified for it’s intended purpose MERGE is a useful statement, but for the use case handled by INSERT .. ON CONFLICT the ANSI MERGE statement falls flat, as it does not have the same concurrency and atomicity guarantees. It’s routinely a pain point when people try to use it to implement upsert’s in MSSQL.

For other cases there is work being done by 2nd Quadrant to add MERGE into PostgreSQL, but I’m glad the other functionality was added first to avoid a common and faulty use case of MERGE.


Merge in SQLServer is horrible imo. It’s horrifically slow. And causes more problems than it solves. After learning about upsert in PostgreSQL. Ah I wish we had it in sqlserver. The PostgreSQL code is so much cleaner and more efficient.


We've all got our anecdotes, of course, but I every time I replaced a CTE, cursor, or temp table sproc with an easier to read Merge statement performance skyrocketed. It wasn't even a fair challenge because all the CTEs and cursors were written WITH (NO LOCK) because who needs consistency or transactions, but yet Merge was faster with locks and transactions than CTEs avoiding them (and kowtowing to the local "we use NO LOCK in these parts" culture left them nearly "instantaneous").

Obviously, you have to watch your execution plans and make sure everything is indexed accordingly, because the guts of almost every Merge are a JOIN and if your database isn't set up for performant joins then of course you will hate Merge's performance.

And again, certainly anecdotal and a relatively long time ago in this industry's terms that I did any of that. But still something I'm fascinated by.

[ETA: Also, yes Merge is overpowered and underperformant for single row upsert, but as soon as you need bulk upsert, chef's kiss.]


I think full merge is scheduled for the next release.


not really true. postgres supports "insert .. on conflict" which is functionally a subset of sql merge. the feature was introduced years ago and i would not be surprised if eventually sql merge is glommed into pg core.


I wonder: How long before someone adds a patch to PostgreSQL as a 'compat' patch, or extension.


There's a company called enterprisedb that does just that.

While they keep their Oracle compatibility stuff proprietary, they do employ several of the postgresql core developers.


I believe EnterpriseDB sells something like that.


and gets sued by Oracle because APIs :(


Now do MSSQL.. (Gets pen and notepad)


@irrational, you have my sympathy. But next time you hear of anybody wanting to torture themselves with such an endeavor you might want to save their soul by asking them to look at CompilerWorks.com. Full disclosure, I work at/for CompilerWorks and will be biased since they pay the bills. But as an ultimate compliment for the kind of work we do, one of our customers brought xz0r's comment about us to our attention.


Very useful - anyone using oracle should try to use ANSI keywords now to avoid work in the future, always use COALESCE instead of NVL, use ANSI date functions, etc.


I have a hard time believing PostgreSQL performance is better than Oracle, can you give an example? How big is your database and how much did migrating cost you?


Think of how much you can spend on beefy hardware though, if you go with PostgreSQL, versus spending that money on Oracle licensing.


Yep, it's surprising how much performance you can get when you're not paying Oracle millions for licensing.

A lot of people who think Oracle is amazingly fast, think so because who puts a million dollar DB on thousand dollar hardware? Put any major DB on equivalent hardware and you'll get some pretty damn good performance.


This, wish I could upvote this more.


Nobody has any data they can show you, because Oracle's EULA prohibits both running their software and sharing that data, you can only do one of those.

My experience is that Postgres has better performance than Oracle on much smaller machines. If you have very simple data, with very simple joins, and never deviate from the trivial "select columns from table join table using (foreign key)" you may have a different experience, because Oracle is just great on those.


A decade ago Sun Microsystems published a benchmark showing that PostgreSQL was really close to Oracle performance.

https://www.informationweek.com/open-source-postgresql-trail...


have you tried both? the oracle eula prevents disclosing performance stats. pg allows user defined indexes to be added to kernel and integrated with optimizer, creating some pretty sophisticated optimization techniques. we have the text search index easily handling 20mil+ pdfs with the rum index from postgrespro http://github.com/postgrespro/rum.


Have you seen the pricing for Exadata racks? I ran the numbers a few months back and over 5 years, Oracle was 4x as expensive for my workloads. It's also easier to find AWS people than Oracle people.


You can buy a lot of SSD storage for the price of an Oracle licence.


The best part about this comment is that any errors or new best practices will automatically be updated with the entire knowledge of the world.


Wow.. someone could start an Oracle -> Postgres migration consulting business just from your comment.

I hope they name it SADL Consulting.


What about pl/sql to pl/pgsql?


This wasn't too bad. The first one was the worst - figuring out the difference in the syntax. But after that it was fairly simple to convert all the rest to Postgres syntax. It took a few days to convert all of them in our system (about 500).


A lot of this sounds similar to MS SQL Server - the MERGE one sticks out, does Postgres not have MERGE?


Not yet, merge is broken though and postgres has an alternative that actually do handle races.


If you have a DB migration challenge then take a look at www.compilerworks.com


Super useful. Thanks.


How do you develop and debug triggers?


I'm not sure what you mean. Triggers are developed and debugged in Postgres the same as in Oracle.


I am curious, what tools do you use?

No secret, that with oracle you have few full-blown IDEs where you can literally step by step execute triggers, write unit tests for them etc.


Oh, well, honestly none. I just write them by hand. I've written hundreds that way. It probably would be faster to investigate tools, but it seems like there is always something with higher priority to work on.




Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: