

Ask HN: FreeBSD or Illumos for Postgres DWH? - bsg75

Looking to implement a platform with a native ZFS port to run PostgreSQL DBs on. These databases support analytic workloads (OLAP), and initial testing with ZFS On Linux (ZoL) is promising[1], but I have encountered some issues that are driving me to investigate a "built-in" implementation.

Coming from Linux (CentOS), I find FreeBSD to be very friendly. From installation to adding apps via ports or pkg, FreeBSD was a snap to get started with.

Some in the PostgreSQL community favor Illumos derivatives, OmniOS for example. Initial explorations show it is a greater deviation from the familiar, and will require more time investment before we are proficient enough for production.

This said, are there any significant advantages that an Illumos platform has over FreeBSD, when looking at a host that will principally run PostgreSQL databases? Performance? Stability?

The only other workloads will be Python based ETL processes - I have found Pandas and similar run fine on BSD, not yet sure about OmniOS.

[1] I have great hopes for the OpenZFS project, and eventually ZFS On Linux. Other data platforms I use such as Vertica are not supported on BSD, and ZoL brings ZFS to a lot of new platform possibilities.

Edit: It turns out I asked this two years ago (https://news.ycombinator.com/item?id=5665800), but perhaps new perspectives will be added.
======
bcantrill
Disclaimer: I work at Joyent, the company behind SmartOS, an illumos
distribution.

I would say that you've made the right choice with ZFS (natch). I think
FreeBSD, OmniOS and other illumos distributions like SmartOS are all good
choices for Postgres on ZFS, albeit for different reasons -- and it will
depend on things such as comfort level, the need for other technologies like
zones, etc. One additional option to be aware of: we in the illumos community
are very actively working on a Linux personality for illumos[1], allowing
virtualized Linux to be run within an illumos (SmartOS, OmniOS, etc.) zone.
This technology is quite far along (e.g., [2]), and we at Joyent intend to
deploy it into production this quarter. It might not match your timeframe,
but it's something to keep an eye on: it gives you the repo of record for ZFS
but with the same comfort level that you have with Linux -- and features like
DTrace and mdb as gravy.

[1] [http://www.slideshare.net/bcantrill/illumos-lx](http://www.slideshare.net/bcantrill/illumos-lx)

[2] [https://twitter.com/bcantrill/status/555143487482368000](https://twitter.com/bcantrill/status/555143487482368000)

~~~
bsg75
In my case, a Linux personality is not a terribly interesting feature. All of
my existing code and apps run on BSD, and FreeBSD is comfortable enough.

What I am looking for is a solid ZFS implementation, a stable OS, and
performance (IO and CPU). Jails may not be of too much use in my particular
scenario.

Data warehousing means aggregation queries running over large sets of on-disk
data, which is the area where I need best-of-breed support.

~~~
bcantrill
Then I think you want to look very seriously at illumos: the ARC alone is a
huge win, and it remains the repository of record for both ZFS and DTrace
(which, if you care about performance, is important to you).

~~~
bsg75
Any input on which distro? SmartOS is interesting, but seems to be more of a
hypervisor / container platform for multi-tenancy?

Looking for a single server implementation with good package support. OmniOS /
OpenIndiana / ... ?

~~~
bcantrill
OmniOS should be perfect for you: designed for traditional server installs and
with good package support. Hit me (@bcantrill), @postwait or @danmcd up on
Twitter or drop into #illumos on Freenode if you run into turbulence!

~~~
_delirium
> good package support

I'll admit my OmniOS experience is quite minimal, but this was one thing I
found frustrating about it. IMO the SmartOS packaging situation, with ~13,000
binary packages available via pkgsrc/pkgin, is a lot nicer. OmniOS seems to
have <1k packages available via IPS, and be more geared towards the kind of
corporate scenario where you install the base OS and then build/deploy your
own in-house set of application packages. (That's a perfectly legitimate
scenario too, but pretty different from the pkgsrc/ports/apt style.)

~~~
bsg75
Can the Joyent package repo be added to OmniOS?

[http://pkgsrc.joyent.com/installing.html](http://pkgsrc.joyent.com/installing.html)

~~~
_delirium
Ah that's a good point, I hadn't thought to try that. It looks like Joyent
makes sure to build their binary packages so they're self-contained and will
work on any Illumos distribution, which is pretty cool.

------
dman
Out of curiosity - what kind of hardware are you running on?

~~~
bsg75
Production would be Supermicro servers (8-16 CPU, 48/96/128GB RAM), with
either banks of 10K RPM SATA drives, or a mix of SSD volumes and HDDs. All
RAID-10.

Some servers are SSD only (< 8TB) and hold the current period (month /
quarter) for ad hoc queries of actively changing data.

The mixed configs use the SSD volumes for staging and ETL processing
(aggregation for storage into fact tables), as well as frequently queried
current period data.

As data periods age, they are migrated to HDD volumes which hold more data at
a lower price point. My budget does not yet support 20+ TB of all SSD storage.

~~~
dman
Thanks!!

~~~
bsg75
Curiosity due to ... ?

Implementing something similar?

~~~
dman
I work on similar problems on the software side. Trying to set up a home lab to
do some experiments and was curious about what kind of hardware people are using.

