Ask HN: FreeBSD or Illumos for Postgres DWH?
5 points by kermatt on Jan 14, 2015 | 13 comments
Looking to implement a platform with a native ZFS port to run PostgreSQL DBs on. These databases support analytic workloads (OLAP), and initial testing with ZFS On Linux (ZoL) is promising[1], but I have encountered some issues that are driving me to investigate a "built-in" implementation.

Coming from Linux (CentOS), I find FreeBSD to be very friendly. From installation to adding apps via ports or pkg, FreeBSD was a snap to get started with.

Some in the PostgreSQL community favor Illumos derivatives, OmniOS for example. Initial explorations show it is a greater deviation from the familiar, and will require more time investment before we are proficient enough for production.

This said, are there any significant advantages that an Illumos platform has over FreeBSD, when looking at a host that will principally run PostgreSQL databases? Performance? Stability?

The only other workloads will be Python-based ETL processes. I have found that Pandas and similar libraries run fine on BSD, but I am not yet sure about OmniOS.

[1] I have great hopes for the OpenZFS project, and eventually ZFS On Linux. Other data platforms I use, such as Vertica, are not supported on BSD, and ZoL makes ZFS an option on many platforms where it was not one before.

Edit: It turns out I asked this two years ago (https://news.ycombinator.com/item?id=5665800), but perhaps new perspectives will be added.




Disclaimer: I work at Joyent, the company behind SmartOS, an illumos distribution.

I would say that you've made the right choice with ZFS (natch). I think FreeBSD, OmniOS and other illumos distributions like SmartOS are all good choices for Postgres on ZFS, albeit for different reasons -- and it will depend on things such as comfort level, the need for other technologies like zones, etc. One additional option to be aware of: we in the illumos community are very actively working on a Linux personality for illumos[1], allowing virtualized Linux to be run within an illumos (SmartOS, OmniOS, etc.) zone. This technology is quite far along (e.g., [2]), and we at Joyent intend to deploy it into production this quarter. It might not match your timeframe, but it's something to keep an eye on: it gives you the repo of record for ZFS but with the same comfort level that you have with Linux -- and features like DTrace and mdb as gravy.

[1] http://www.slideshare.net/bcantrill/illumos-lx

[2] https://twitter.com/bcantrill/status/555143487482368000


In my case, a Linux personality is not a terribly interesting feature. All of my existing code and apps run on BSD, and FreeBSD is comfortable enough.

What I am looking for is a solid ZFS implementation, a stable OS, and performance (IO and CPU). Jails may not be of too much use in my particular scenario.

Data warehousing means aggregation queries running over large sets of on-disk data, which is the area where I need best-of-breed support.


Then I think you want to look very seriously at illumos: the ARC alone is a huge win, and it remains the repository of record for both ZFS and DTrace (which, if you care about performance, is important to you).


Any input on which distro? SmartOS is interesting, but seems to be more of a hypervisor / container platform for multi-tenancy?

Looking for a single server implementation with good package support. OmniOS / OpenIndiana / ... ?


OmniOS should be perfect for you: designed for traditional server installs and with good package support. Hit me (@bcantrill), @postwait or @danmcd up on twitter or drop into #illumos on Freenode if you run into turbulence!


> good package support

I'll admit my OmniOS experience is quite minimal, but this was one thing I found frustrating about it. IMO the SmartOS packaging situation, with ~13,000 binary packages available via pkgsrc/pkgin, is a lot nicer. OmniOS seems to have <1k packages available via IPS, and be more geared towards the kind of corporate scenario where you install the base OS and then build/deploy your own in-house set of application packages. (That's a perfectly legitimate scenario too, but pretty different from the pkgsrc/ports/apt style.)


Can the Joyent package repo be added to OmniOS?

http://pkgsrc.joyent.com/installing.html


Ah that's a good point, I hadn't thought to try that. It looks like Joyent makes sure to build their binary packages so they're self-contained and will work on any Illumos distribution, which is pretty cool.


Out of curiosity - what kind of hardware are you running on?


Production would be Supermicro servers (8-16 CPU, 48/96/128GB RAM), with either banks of 10K RPM SATA drives, or a mix of SSD volumes and HDDs. All RAID-10.

Some servers are SSD-only (< 8TB) and are used for ad hoc queries against current-period (month / quarter) data that is actively changing.

The mixed configs use the SSD volumes for staging and ETL processing (aggregation for storage into fact tables), as well as frequently queried current period data.

As data periods age, they are migrated to HDD volumes which hold more data at a lower price point. My budget does not yet support 20+ TB of all SSD storage.
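
To illustrate the aging step, here is a minimal sketch that generates the statements for moving old monthly tables to slower storage. It assumes each period lives in its own table and that the HDD volumes are exposed to PostgreSQL as a tablespace. The tablespace name `hdd_bulk`, the `facts_YYYY_MM` naming scheme, and the three-month cutoff are all hypothetical and not taken from the setup described above.

```python
from datetime import date

# Hypothetical names: neither the tablespace nor the table naming scheme
# comes from the setup described in this thread.
HDD_TABLESPACE = "hdd_bulk"
TABLE_PREFIX = "facts"


def months_between(older: date, newer: date) -> int:
    """Whole calendar months from `older` to `newer` (days are ignored)."""
    return (newer.year - older.year) * 12 + (newer.month - older.month)


def migration_statements(periods, today, keep_months=3):
    """Build ALTER TABLE statements for monthly tables at least keep_months old.

    `periods` is an iterable of (year, month) tuples, one per monthly table.
    Nothing is executed here; the caller decides when to run the SQL.
    """
    statements = []
    for year, month in sorted(periods):
        age = months_between(date(year, month, 1), today)
        if age >= keep_months:
            table = f"{TABLE_PREFIX}_{year:04d}_{month:02d}"
            statements.append(
                f"ALTER TABLE {table} SET TABLESPACE {HDD_TABLESPACE};"
            )
    return statements


if __name__ == "__main__":
    periods = [(2014, 9), (2014, 10), (2014, 12), (2015, 1)]
    for stmt in migration_statements(periods, date(2015, 1, 14)):
        print(stmt)
```

`ALTER TABLE ... SET TABLESPACE` rewrites the table's files and holds an exclusive lock while it runs, so a job like this would probably be scheduled outside query hours.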


Thanks!!


Curiosity due to ... ?

Implementing something similar?


I work on similar problems on the software side. I'm trying to set up a home lab to do some experiments and was curious about what kind of hardware people are using.



