

RoR vs. What else? - AshwinRamasamy

We are building an app that would eventually use machine learning techniques to give recommendations to users. We did our MVP in PHP and then ditched it to build a new version on RoR. Now people say that RoR does not scale well (MySQL slows down the experience) and is certainly not a great tech for machine learning. I am not quite technical. Should I just go with RoR (investments already made) or change tech? What's the pragmatic call?
======
rubyrescue
[edit - one additional point which is probably more important than the rest.
if you've already done one rewrite and you're thinking of another, your
problem is you probably don't know how to move the ball down the field in a
business sense so you're picking up the only hammer you have - technology
choice, and repeatedly swinging it. Take a step back, ask yourself if the
problem is fear of the unknown, and attack the true problem - how do you get
users to USE your product. If truly the answer is "my users don't want me to
use Rails", I'd be shocked.]

First, I'd say I don't think you have anything to worry about.

Second, don't go replacing MySQL because you're worried about scale. I have
yet to find a database that really _easily_ scales better. You can trade a
whole lot of development time to use something like Cassandra or Riak, (which
are both great for particular applications). Then you spend a huge, huge
amount of time building queries that were easy in Rails and you actually
didn't solve any end-user-facing problems. You've just made it so that only one
of your developers can actually write queries and nobody can maintain the code.

Third, don't go replacing Rails because you're worried about scale. It's not
the fastest thing but it's just not worth losing sleep over.

Just add the appropriate technology where you need it - I probably wouldn't
build a recommendation engine in Ruby; perhaps that is the only piece that you
build using something else.

Here's what we often do @ Inaka - web and admin in Rails, business "logic" and
stuff that needs to scale (either because we have tons of socket connections
or because we have a lot of data to crunch) in Erlang. (Insert Java or Python
or whatever you need for the backend piece in place of Erlang.)

Then build an internal HTTP API that Rails can talk to. Abstract that in a
model class. Have Rails call into it as needed.
RecommendationEngine.recommend(...) returns JSON with the magical results for
your Rails app to render appropriately.
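
A minimal sketch of that wrapper, written in Python only to keep the example
self-contained (in the Rails app it would be a plain Ruby model class). The
`/recommendations` path, the `user_id` and `limit` parameters, and the
`backend.internal` host are assumptions for illustration, not a real service:

```python
import json
import urllib.parse
import urllib.request


class RecommendationEngine:
    """Thin client for a hypothetical internal recommendation service."""

    def __init__(self, base_url, opener=urllib.request.urlopen, timeout=2.0):
        # `opener` is injectable so the HTTP call can be replaced in tests.
        self.base_url = base_url.rstrip("/")
        self._open = opener
        self.timeout = timeout

    def recommend(self, user_id, limit=5):
        """Ask the backend for recommendations and return the parsed JSON."""
        query = urllib.parse.urlencode({"user_id": user_id, "limit": limit})
        url = f"{self.base_url}/recommendations?{query}"
        with self._open(url, timeout=self.timeout) as response:
            return json.loads(response.read().decode("utf-8"))
```

The web app only ever sees `RecommendationEngine(...).recommend(user_id)`, so
the backend can be rewritten in any language without touching the callers.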

Typically we give our Rails users an "api key" (that they don't know about)
that is used to authenticate calls to the backend service, so we don't even
have to share user authentication schemes between the systems. Then you can
use devise or whatever you want for Rails but don't have to re-implement the
same password hashing algorithms - authenticating those users is just a quick
lookup in the user table.
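
As a rough sketch of the API-key idea, assuming a `users` table with an
`api_key` column (the table layout and function names are hypothetical, and
SQLite stands in for whatever database the app actually uses):

```python
import secrets
import sqlite3


def open_user_db():
    """Create an in-memory database with a minimal users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users ("
        " id INTEGER PRIMARY KEY,"
        " email TEXT NOT NULL,"
        " api_key TEXT NOT NULL UNIQUE)"  # UNIQUE also gives us an index
    )
    return conn


def create_user(conn, email):
    """Insert a user with a random API key the user never needs to see."""
    api_key = secrets.token_hex(16)
    conn.execute(
        "INSERT INTO users (email, api_key) VALUES (?, ?)", (email, api_key)
    )
    return api_key


def authenticate(conn, api_key):
    """Return the user id for a valid key, or None if the key is unknown."""
    row = conn.execute(
        "SELECT id FROM users WHERE api_key = ?", (api_key,)
    ).fetchone()
    return row[0] if row else None
```

The backend never touches passwords here; it only checks that the key sent by
the web app matches a row in the table.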

Sometimes that service API may even be part of a publicly available HTTP API.
For instance, imagine the backend piece exposes part of its API to mobile
devices. Then, some of the methods may be authenticated and some are open to
the world.
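
One way to picture the mixed public/authenticated API is a route table that
records whether each method needs a key. This is only a sketch: the paths,
handlers, and the hard-coded key set are made up for illustration.

```python
# Hypothetical key store; a real service would look keys up in its user table.
VALID_API_KEYS = {"demo-key"}


def status():
    return {"status": "ok"}


def recommendations():
    return {"items": []}


# path -> (requires_api_key, handler)
ROUTES = {
    "/status": (False, status),                   # open to the world
    "/recommendations": (True, recommendations),  # needs a valid key
}


def dispatch(path, api_key=None):
    """Return (http_status, body) for a request to `path`."""
    if path not in ROUTES:
        return 404, {"error": "not found"}
    requires_key, handler = ROUTES[path]
    if requires_key and api_key not in VALID_API_KEYS:
        return 401, {"error": "invalid or missing API key"}
    return 200, handler()
```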

~~~
astrodust
Seconding that.

PHP doesn't scale. Java doesn't scale. Python doesn't scale.

If you wait for something that "scales", you'll never ship anything.

Scaling isn't done automatically, it's something that's applied to a problem.
If MySQL doesn't scale, and it has proven itself to be very capable in a wide
variety of circumstances, then what does?

The more you read about scaling, the more you realize there's no magic bullet,
no magically scalable language or platform or framework. It's all about
careful investigation of the nature of the problems, the bottlenecks, and
developing solutions to address those.

------
gourneau
Your web stack and your machine learning stack don't have to be the same.
<http://scikit-learn.org/> is Python and very popular for machine learning.
Here is an in-depth training video:
<http://pyvideo.org/video/972/tutorial-scikit-learn-machine-learning-python>

~~~
vellum
<http://orange.biolab.si/> is also good. It has a nice GUI.

------
jfried83
Don't believe the people who say RoR doesn't scale. Normally the framework
isn't the bottleneck. The most critical part (imho) is your database design
(e.g. sharding user data, using Solr rather than the DB for searches, etc.).
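
One common way to pick a shard for a user, sketched under the assumption of a
fixed list of shard names (all hypothetical) and a stable hash of the user ID:

```python
import hashlib

# Hypothetical shard names; each would map to a separate database.
SHARDS = ["users_db_0", "users_db_1", "users_db_2", "users_db_3"]


def shard_for(user_id, shards=SHARDS):
    """Map a user ID to one shard, the same one on every call."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]
```

Plain modulo hashing like this moves most users to a different shard whenever
the number of shards changes, which is why larger setups often use a lookup
table or consistent hashing instead.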

------
krob
I don't see a problem in using PHP. Consider how you break down your
infrastructure. Don't build monolithic code-bases which you cannot maintain.
Make small compact services (SOA) which you know you can build out quickly.
RoR is for people who want to spend a lot of time learning the framework. At
this point, it's so large you will probably never even feel like you can
encompass its full size. Kinda like Django. Don't ditch PHP because people
say it doesn't scale. YouPorn.com thought it scaled quite well. They get a
crap load of traffic to boot.

------
AshwinRamasamy
Much appreciate the insights. We are actually less worried about scale now (we
just got users and some revenue, and we are a long way from scale problems).
We did not want to go so far down this path that we carry enough tech debt to
force a re-do. Your answers tell us that it's okay to go the way we are going
and that there are still ways to keep the recommendations (the machine
learning part) separate.

I shall get back here to post a link to our product in about a week when we
launch. Would very much appreciate comments then!

------
timtamboy63
Use a different stack for machine learning, and switch over to Postgres - I'm
fairly sure it'll scale better than MySQL.

~~~
astrodust
MySQL is as capable as Postgres and can be tuned just as fast. The differences
between these two are features, not performance.

~~~
timtamboy63
Hm, didn't know that, thanks!

------
antihero
Replace MySQL with Percona XtraDB, and leverage aggressive "permacaching" for
a start. Then it's more about splitting up your infrastructure and having
mechanisms for dealing with increased load on different parts of it (i.e. high
DB load = spin up more DB machines).

~~~
astrodust
You have no idea what the problem is and you're already advocating a change of
platform? This is not how you scale. This is how you hit up a client for
thousands of hours of "consulting" fees.

This performance problem could be because of missing indexes, grossly
inefficient queries, or a whole host of other elementary problems that can be
fixed with a keystroke.
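
To make the missing-index case concrete, here is a small sketch that asks the
database for its query plan before and after adding an index. It uses SQLite
so it runs anywhere; MySQL has its own `EXPLAIN` with different output. The
`events` table and index name are invented for the example.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text


def compare_plans():
    """Return (plan_without_index, plan_with_index) for the same query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, payload TEXT)"
    )
    conn.executemany(
        "INSERT INTO events (user_id, payload) VALUES (?, ?)",
        [(i % 100, "x") for i in range(1000)],
    )
    sql = "SELECT * FROM events WHERE user_id = ?"
    before = query_plan(conn, sql, (7,))  # full table scan
    conn.execute("CREATE INDEX idx_events_user_id ON events (user_id)")
    after = query_plan(conn, sql, (7,))   # lookup through the new index
    return before, after
```

The first plan reports a scan of the whole table and the second a search using
`idx_events_user_id`; the exact wording varies between SQLite versions.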

~~~
antihero
Not advocating a change of platform at all - Percona is a drop-in replacement
for MySQL, and XtraDB is backwards compatible with InnoDB while getting much
more performance out of multicore systems. Unless your server has one core,
it's a no-brainer.

------
traxtech
Can't you use RoR with a different storage backend like MongoDB? It's hard to
say without more details about the algorithms and persistent data
structures.

------
mvasilkov
The pragmatic call is, wait until you need to scale, if at all.

~~~
davidlumley
This. It's less about scaling a framework than it is about scaling your
particular application.

If you find that MySQL is a bottleneck in particular, Ruby (and by extension
Rails) has a wide range of database adapters and ORMs available.
Active Record itself has numerous adapters, which means changing databases
within the supported set shouldn't be too difficult.

Alternatively, you could use a different ORM such as DataMapper or one
specific to the database you desire. Rails now lets you choose which ORM you
wish to use, although if you're just getting started it might prove an
additional learning curve as most of the documentation uses Active Record.

