
How to move from Amazon RDS to a dedicated PostgreSQL server - cinnamonman
http://layer0.authentise.com/how-to-move-from-amazon-rds-to-dedicated-postgresql-server.html
======
aftabh
>To my surprise, PostgreSQL v9.5 on Ubuntu 14.04 does not default to unicode
schema.

PostgreSQL does (if unicode character set was specified when PostgreSQL
database cluster is initialized first time); it's Ubuntu OS which creates
PostgreSQL database cluster with ASCII character set (encoding) after
PostgreSQL's installation.

> Here is the snippet that I had to use: ...

Instead of resorting to those hacks, follow the PostgreSQL documentation[1] to
do it the right way. The simplest way is to initialize your PostgreSQL
database cluster as:

initdb --encoding=UTF8 --pgdata=<database-cluster-dir>

If you want to use an existing database cluster, PostgreSQL documentation
shows how you can do it.

\--

[1] -
[https://www.postgresql.org/docs/9.5/static/multibyte.html](https://www.postgresql.org/docs/9.5/static/multibyte.html)

~~~
dboreham
Using the distro-supplied packages for PG (or any database server) is
typically not the best choice for a production deployment.

~~~
Roritharr
I would like to hear more about that. Our Server Team swears on using CentOS
Packages with Backports for everything for security purposes. This slows us a
bit down, but so far we haven't had any security or audit related issues. How
do you deal with CVEs and these staying audited and secure?

~~~
dboreham
We use the official PG packages from their repo on CentOS. It appears there is
an apt repository for Ubuntu too:
[https://www.postgresql.org/download/linux/ubuntu/](https://www.postgresql.org/download/linux/ubuntu/)

------
jbrantly
Not that it necessarily should, but the article did not really address why
they were moving away from RDS. Was it cost, performance, other issues?

I've been trying to compare RDS for Postgres to other offerings like Compose
or Heroku but have come up surprisingly dry on comparisons.

~~~
eliribble
I'm the decision maker for the company in question - Authentise. It was a
combination of cost and the fact that we need to support an in-datacenter
install option for our customers. RDS is really great if you need all of the
failover and backup options it provides and you don't have the manpower to
invest in it. That's how we got our services started. But, as our expertise
and automation grew it ended up being a modest cost savings to handle our own
backups and redundancies. Primarily, though, we just can't install RDS in
someone else's datacenter on premises.

~~~
whyleyc
Thanks for the background info, very helpful. How has on-prem install gone for
you guys ? There was an article on HN yesterday suggesting it was a royal PITA
(for ops and support), would be interesting to hear about your experience.

Also FYI, your homepage has a 2015 copyright notice on it. No big deal, but
thought you might like to know since it could put off particularly "nit picky"
types of customer :)

~~~
JimDabell
> Also FYI, your homepage has a 2015 copyright notice on it. No big deal, but
> thought you might like to know since it could put off particularly "nit
> picky" types of customer :)

If that's when the work was created, then that's correct. Copyright notices
aren't there to tell you what year it is today. They are there to tell you
when the work was created. If they change it to 2016 when the work was really
created in 2015, then that's an invalid copyright notice and equivalent to no
notice at all.

~~~
_Codemonkeyism
Heard this for the first time, so being really curious (I always have the most
recent year in a range on my sites)

What if the page is dynamically created on the fly? What copyright should I
have? If page contains snippets/work created in different years?

------
craigkerstiens
Overall a nice post. It doesn't quite get into all the options for pg_dump as
there's a number of other flags that can be useful to reduce downtime. It's
also probably worth taking a look at Amazon Migration Service as well (though
I haven't personally used it or heard from others that have yet) -
[https://aws.amazon.com/dms/](https://aws.amazon.com/dms/).

One tool they did miss for continuous archiving is WAL-E, which tends to be
the one most used including by us at Citus Cloud and Heroku Postgres -
[https://github.com/wal-e/wal-e](https://github.com/wal-e/wal-e)

~~~
chrisatumd
We've tested out DMS to migrate from a 3 node Oracle RAC to Oracle RDS, and I
was incredibly impressed. They list a bunch of limitations in their
documentation that you should look at first. It migrated our roughly 1 TB
database in two hours and ten minutes and handled replication flawlessly - not
a single record discrepancy between the two database systems over a week long
test. We were not expecting it to work that well and doubted that it would
work with RAC as the source.

------
room271
It would be interesting to hear the motivation of why they did this. My own
experience is moving the opposite direction (from master-slave self-managed to
RDS master-replica).

I wrote up my experiences and the reasons for the switch here:
[https://www.theguardian.com/info/developer-
blog/2016/feb/04/...](https://www.theguardian.com/info/developer-
blog/2016/feb/04/migrating-postgres-to-rds-without-downtime) .

------
samcheng
For those that have done this, I'd be interested in the performance gain (say,
IOPS at an equivalent hosting + hardware cost). I'd also be interested in RDS
vs. Heroku Postgres.

Do people do this for security / policy purposes, or are they motivated by
cost savings? Does removing all that virtualization buy improved performance?

~~~
vacri
I'm not sure that the article moved to bare iron. The author could have just
set up postgres on a regular ec2 instance.

~~~
samcheng
Does that bring performance or cost savings? What is the motivation and
calculus for making that transition?

~~~
eliribble
It can bring a cost savings, but the savings is very slight if you're
comparing apples-to-apples. For example, if you're using RDS in a multi-AZ
replication set up with periodic S3 backups you have to take into account
running each of the replica instances and the S3 costs. What we found was that
RDS was treated like a black box to our automation (which was a good thing for
a long time) and so we were less attentive to being over-provisioned and
actually testing failover situations than we should have been.

------
throwaway2016a
Amazon DMS (database migration service) actually works really well and allows
you to do this without writing any code. You may need to change your Postgres
config a bit.

DMS isn't just for moving two RDS it can also be used to move off or RDS or
even move from one non-RDS server to another non-RDS server that isn't even on
AWS.

It can also do Postres to MySQL, MySQL to Postgres, etc.

It also does continuous replication so once the copy is done it will keep up
to date (yes, even if it is Postgres -> mySQL).

One gotcha, though, it does not bring over secondary keys so make sure to
recreate them before sending traffic over.

But this article does provide some insight on how to configure Postgres to be
similar to RDS in terms of functionality.

------
qaq
OK a bit of topic, but just curious we are considering rolling Postgres hosted
service that would support much large instances up 50TB much higher IOPS
1,000,000+ lower latency (infiniband vs ethernet SAN) pricing would be in line
with AWS RDS +/\- optional ElasticSearch integration optional change
notification pub/sub Would you guys be interested/think such service is a good
idea? Would appreciate to learn what you guys think?

~~~
Rezo
When it comes to data, trust is key.

Think about it this way: OCZ has on paper SSDs that are simultaneously both
faster and cheaper than, say, Intel. But I wouldn't put a OCZ drive into the
lowest budged ricer gaming PC I could imagine (let's just say that have a bit
of a reputation, to put it lightly), and I should be rightfully fired if I
suggested putting one in a server.

Yes, your service might be massively faster, but... at 10-15% cheaper? No way!
Not until you've been running for years with a great reputation. Now, if you
enter the managed database space and pull a Digital Ocean ($5-$40 pricing vs
comparable EC2 at $80-$400 AND better raw performance on top), then you can
essentially create a new market niche. That's my suggestion to focus on,
because businesses who operate 50TB databases aren't looking to save a few
dollars. But people who use DO, need a database to go with it that is both
faster and cheaper than AWS.

Honestly, performance ranks pretty low on the things I've found people look
for when choosing a managed database service. You can serve a million users
per day and average a whopping 12 IOPS, which might just about push a
Raspberry Pi with sqlite slightly. Got more than a million users? Congrats,
you now officially qualify to throw money at problems, the kind of problems
RDS and Amazon loves to solve by injecting money.

~~~
qaq
You do not have an option to host anything larger then 6TB on RDS. I guess
what we are trying to figure out is if there is a need for service for people
who have needs beyond those RDS is addressing.

~~~
yeukhon
Not only disk space, you are limited to just two nodes (one primary and one
standby). Of course you can create read replica, but that if you are write
intensive, you can end up consuming all of your pools.

~~~
merb
actually you would shard than, which of course is possible with rds aswell.
since you mostly shard at the application level. there are some users that use
extensions but I guess calling the support of aws will get these extension in
your rds instances.

------
koolba
From the article (emphasis mine):

> [pg_dump ...] The main disadvantage of this method is that it will not
> provide high _reliability_.

Pretty sure the author means "availability" not "reliabilitiy". pg_dump is
completely reliable, arguably more so than ANY other backup mechanism as it
creates logical machine independent backups.

Best intro tip regarding pg_dump: use -Fc (custom format)

~~~
room271
Yes, and when you are using pg_restore use the -j flag! (restore in parallel).

------
jayisbatman
tldr - a howto

