

Ask HN: Appropriate platform for a data crunching app - zacstewart

I'm in the process of building a social Bayesian suggestion engine. I've coded out a large portion of it from scratch in PHP following an MVC design pattern, but I'm starting to wonder if this is the right environment for doing some Bayesian inference. Will good ol' MySQL work on a grand scale?<p>Reading about the obstacles Hunch has encountered has made me reconsider PHP, but that's what I know best. I'm fairly proficient in Python, so it wouldn't be such a leap to use Django, but I'd prefer not to if there's not a big performance difference. I've looked at MongoDB and it looks like I could replace MySQL without requiring me to rewrite too much of my code base. Node.js looks pretty exciting, but I know very little about its proper application or whether it's stable enough.<p>I'm up for learning to work in a new environment if there's something that's just _right_ for the job. Any suggestions?
======
unoti
Python is generally faster than PHP, particularly if you're not using APC or
Zend. But the real benefit of using Python is the vast amount of software out
there to do so many things. For example, PyBrain (<http://pybrain.org/>) may
be right up your alley. The book "Programming Collective Intelligence" is all
Python-based, and covers a wide range of algorithms including classifiers and
so on, some of which use SQL backends. I thought the book was awesome,
practical and inspiring, but I'm not a hardcore AI person.
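
To give a feel for what a classifier on a SQL backend can look like, here is a rough sketch. It is not code from the book or from PyBrain. It is a simplified naive Bayes that only scores the features that are present, keeps its counts in SQLite via the standard `sqlite3` module, and uses made-up names (`NaiveBayes`, `feature_counts`, `category_counts`) purely for illustration.

```python
import math
import sqlite3


class NaiveBayes:
    """Simplified naive Bayes whose counts live in SQL tables (hypothetical schema)."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS feature_counts ("
            "feature TEXT, category TEXT, n INTEGER, "
            "PRIMARY KEY (feature, category))"
        )
        conn.execute(
            "CREATE TABLE IF NOT EXISTS category_counts ("
            "category TEXT PRIMARY KEY, n INTEGER)"
        )

    def train(self, features, category):
        # Count one more example for this category.
        self.conn.execute(
            "INSERT OR IGNORE INTO category_counts VALUES (?, 0)", (category,)
        )
        self.conn.execute(
            "UPDATE category_counts SET n = n + 1 WHERE category = ?", (category,)
        )
        # Count each distinct feature once per example.
        for feature in set(features):
            self.conn.execute(
                "INSERT OR IGNORE INTO feature_counts VALUES (?, ?, 0)",
                (feature, category),
            )
            self.conn.execute(
                "UPDATE feature_counts SET n = n + 1 "
                "WHERE feature = ? AND category = ?",
                (feature, category),
            )

    def classify(self, features):
        cats = self.conn.execute(
            "SELECT category, n FROM category_counts"
        ).fetchall()
        total = sum(n for _, n in cats)
        best, best_score = None, float("-inf")
        for category, cat_n in cats:
            # Log prior plus log likelihood of each observed feature.
            score = math.log(cat_n / total)
            for feature in set(features):
                row = self.conn.execute(
                    "SELECT n FROM feature_counts "
                    "WHERE feature = ? AND category = ?",
                    (feature, category),
                ).fetchone()
                count = row[0] if row else 0
                # Add-one smoothing so unseen features do not zero out a category.
                score += math.log((count + 1) / (cat_n + 2))
            if score > best_score:
                best, best_score = category, score
        return best
```

Whether this holds up on a grand scale depends on your data and query patterns, but it shows that the bookkeeping behind this kind of inference is mostly counting, which SQL handles fine.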

MySQL vs. NoSQL: you may find that SQL is totally adequate for what you're
trying to do. You may be able to factor the data access out to a data layer
such that you can switch the underlying storage without too much work.
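
A minimal sketch of that kind of data layer, assuming a made-up ratings model: the class and function names (`SqlStore`, `MemoryStore`, `average_score`) are hypothetical, SQLite stands in for MySQL, and a plain dict stands in for a NoSQL store. The point is only that application code talks to one small interface.

```python
import sqlite3


class SqlStore:
    """Keeps ratings in a SQL table (SQLite here as a stand-in for MySQL)."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS ratings ("
            "user_id TEXT, item_id TEXT, score REAL, "
            "PRIMARY KEY (user_id, item_id))"
        )

    def save_rating(self, user_id, item_id, score):
        # Re-rating the same item overwrites the earlier score.
        self.conn.execute(
            "INSERT OR REPLACE INTO ratings (user_id, item_id, score) "
            "VALUES (?, ?, ?)",
            (user_id, item_id, score),
        )

    def ratings_for(self, user_id):
        rows = self.conn.execute(
            "SELECT item_id, score FROM ratings WHERE user_id = ?", (user_id,)
        ).fetchall()
        return dict(rows)


class MemoryStore:
    """Same interface backed by a dict (stand-in for a document store)."""

    def __init__(self):
        self.data = {}

    def save_rating(self, user_id, item_id, score):
        self.data.setdefault(user_id, {})[item_id] = score

    def ratings_for(self, user_id):
        return dict(self.data.get(user_id, {}))


def average_score(store, user_id):
    """Application logic: depends only on the store interface, not on SQL."""
    scores = list(store.ratings_for(user_id).values())
    return sum(scores) / len(scores) if scores else None
```

Swapping the backend then means passing a different store object to `average_score`, with no query strings in the calling code.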

If your goal is to just get it finished, go with what you know. If your goal
is to grow and/or earn street cred, leave your comfort zone asap!

~~~
zacstewart
I've definitely been trying to abstract all my models out so that I can swap
out a new DAL if needed. I've got a pretty extensive SQL class going, but the
app still has some hard coded SQL queries in it.

I'm not too sure I'd be enticed to use lots of Python modules. I think this
may be a programming character flaw, but I am averse to using pre-built stuff
in my code; otherwise I'd be using CakePHP or CodeIgniter instead of my own
framework right now. I always want to build everything from scratch.

~~~
unoti
If you're committed to the style points that you get from writing good
software, seriously consider starting to do that in a language other than PHP.

But aside from that issue, consider this: You have probably built up quite a
war chest of PHP software for rocking and rolling through problems.
Python's package structure makes it a lot easier to re-use software than it is
in PHP. There are some amazing micro-frameworks out there, that let you use
other people's code, just the parts you need, without any bloat. But even if
you ignore what other people have written, and want to write it all yourself,
Python is a way better place to be. Using WSGI you have all kinds of options
for ultra-fast, lean and mean hosting of code that you write for frameworks.
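
For a sense of how little WSGI asks of you, here is a minimal sketch using only the standard library. The names `app` and `serve` and the port number are arbitrary choices for illustration; the `wsgiref` server is meant for development, and the same callable could be handed to any other WSGI server.

```python
from wsgiref.simple_server import make_server


def app(environ, start_response):
    """A complete WSGI application: a callable taking environ and start_response."""
    path = environ.get("PATH_INFO", "/")
    body = ("You asked for " + path + "\n").encode("utf-8")
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    # The response body is an iterable of byte strings.
    return [body]


def serve(host="127.0.0.1", port=8000):
    """Run the app on the standard library's development server."""
    with make_server(host, port, app) as server:
        server.serve_forever()
```

Calling `serve()` starts listening on the given port. Everything a from-scratch framework needs (routing, templating, sessions) can be layered on top of that one callable.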

But things like Flask may appeal to you: <http://flask.pocoo.org/> It's a
collection of a handful of tiny things that together form an alternative to
something like Django. Considering you've eschewed CakePHP, Flask may interest
you greatly. There are actually a number of micro frameworks like this.
Playing with them, reading their code, or using some of them may be fun for
you.

Edit: "FLASK" not FLASH. Oops.

