
PostgreSQL Accessing MySQL as a Data Source Using mysqsl_fdw - riyazuddin
https://www.percona.com/blog/2018/08/24/postgresql-accessing-mysql-as-a-data-source-using-mysqsl_fdw/
======
teej
This feature is the sort of thing that a clever person will use as a shortcut
to actually writing robust ETL code. I have had nothing but nightmares dealing
with foreign data wrappers in Postgres.

The caveats using this feature -

1/ Strict upper bound on how much data it can pull in.

2/ MySQL migrations need to be run on both Postgres and MySQL

3/ No way to gracefully migrate or version

4/ MySQL’s “loose” typing doesn’t play well with Postgres’s “strict” typing.
This means data can break the fdw.

5/ Pain in the ass to debug.

6/ This is debatable, but I don’t believe that application code (SQL code in
this case) should live in the database.

7/ Postgres and MySQL have different performance characteristics. This can
lead to hard-to-debug performance problems if you are explicitly or implicitly
using a fdw in your query.

8/ If you want to keep a local copy of MySQL data in Postgres (to solve for
#7) you then have to write code to keep it up to date. This defeats the
convenience of the fdw.

9/ At least when I was using fdws, they didn’t have predicate pushdown. This
causes you to do structure queries in a weird way to get filters to work as
you expect.

10/ You have to manage schema type mismatch between MySQL and Postgres. This
isn’t fun or productive work.

My experience is colored by inheriting a disaster of interconnected fdws. But
given that negative bias, I still haven’t seen enough value from fdws to
justify their use outside of small scale corner cutting.

~~~
atomic77
FDWs can be incredibly useful in the case of needing database-level, low-
latency access to a reasonably small amount of data managed on another server,
and you don't want or cannot spend the time to write "proper" ETL code.

------
eMerzh
I've used the mysql FDW for a project at work, it was really helpfull. We've
replaced an old "ETL" using a bunch of hard to maintain scripts to mysql_fdw
and some materialized view.

It was easier for us because it helps write maintainable logic, you can easily
sync multiple databases, multiple servers, without much changes.

A few things to note though:

\- we had issues with 0000-00-00 dates that are valid in mysql but not in
postgres. So we had to write include a convertion in plsql into the mat view.

\- We stumble upon a bug in mysql_fdw, in which the selected row is skipped if
the size of a column exceed a given size. This is reaaaaally hard to catch and
really bad experience as you won't get any error, notice or warning.

\- we had trouble with a few type casting like json and others.

------
kbendyk
I once used FDW to synchronize Microsoft SQL server DB with PostreSQL, after
setting up, It was just a matter of a single UPDATE to sync quite complex data
across systems. It isn’t a perfect way, but given there was no public
interface in MS SQL backed application it was a pretty effective solution.

------
imbusy111
Services should be communicating through public interfaces, not databases.
Otherwise you're confusing implementation details with public interfaces and
change becomes really hard and risky.

I can see this being useful in some legacy system that you have to maintain
that is not changing anymore, but otherwise I don't see a good use for it for
properly designed and new projects.

~~~
jobinau
Suppose we want to run a complex reporting query on tables in Sales database
(running MySQL) and Finance database (running PostgreSQL). What is the good
option without duplicating data from one system to another?

~~~
logophobia
ETL, that way you can also ensure your data gets put in a format that'd make
it easy to report on.

Granted, it's not quite as quick as this, and not suited for "one-off"
reporting, but it's a lot more flexible. Put things in a nice star schema and
you have a good way to do analytics on your data.

------
avivallarapu
This is great in fact. I like the idea how it can be used to avoid data
duplication.

~~~
ramenmeal
I'd rather data duplication than coupling two different database servers

------
rosemaryg
Great article.

------
alexnewman
Couldn’t get this to work for large tuples

~~~
ibrar74
Do you have sample sql script and paste what error you are getting?

~~~
alexnewman
It's been a year let me get one.

------
jobinau
PostgreSQL and MySQL are the most popular database systems. Yes, they can talk
and share data using mysql_fdw

