

Ask HN: Anyone with scaling skills to consult with us? - inovica

Hi there.<p>We are an established company (telecoms field) that is embarking on an internal 'startup' project.  It will be self-contained, still linked to telecoms but we will have the need to scale, offer redundancy and have fast database access.  Basically we are taking call data for millions of calls and we want to be able to mine that information. Over a 5-year period we expect to have in the region of 2-5 billion records (call information such as number dialled, originating number, length of call, and around 10 other pieces of information).<p>Basically my knowledge is good but not good enough for this level (just being honest). We've chosen Python and Pyramid to do this, but need help with database choice, how to scale apps across servers, etc.  We are looking for someone who has done this before. The company is www.performancetelecom.co.uk and we're based in the UK.<p>To get in touch you will find my personal contact information in my profile.<p>Thanks.  Ade
======
b14ck
Hey,

I'm in a similar situation to you--my company handles millions of calls per
month (we're in the telecom industry in the US), and we use Django / Python /
Asterisk as our primary tools, along with OpenSIPS for routing SIP traffic.

I've been working extensively with this stuff for the past 3 years--I'd be
happy to give you some help, or talk you through some design decisions.

If you want to get in touch, feel free to email me (or get me on gtalk):
rdegges@gmail.com

Would be nice to talk to some other people in a similar industry, not too many
telecom / web dudes around, afaik.

------
djb_hackernews
If you've got cash <http://www.datastax.com/products/enterprise>

Otherwise a Cassandra cluster for quick OLTP that batches to an HBase cluster
for your OLAP.

~~~
gujk
That is a poor choice for a non-expert user of Cassandra and Hadoop. The
DataStax toolset is very rough. Definitely not an "off the shelf" solution
like Cloudera.

------
gujk
With a dataset that small (1 billion rows of 20 columns of small data items
per year), just use a regular RDBMS like Oracle or Postgres. A small cluster
will be plenty of capacity.
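
To put rough numbers on "that small", here is a back-of-envelope sketch in
Python. The row and column counts are the ones quoted above; the bytes per
column and the per-row overhead are assumptions picked for illustration, not
measurements of any particular database, and indexes and compression are
ignored.

```python
# Back-of-envelope sizing for the call-record table discussed above.
# Every per-field byte count here is an assumption for illustration only.

ROWS_PER_YEAR = 1_000_000_000   # figure from the comment: ~1 billion rows/year
COLUMNS = 20                    # figure from the comment: ~20 small columns
AVG_BYTES_PER_COLUMN = 8        # assumption: small ints, timestamps, short numbers
ROW_OVERHEAD_BYTES = 24         # assumption: rough per-row storage overhead
YEARS = 5                       # the original post's 5-year horizon


def estimate_bytes(rows, columns, bytes_per_column, row_overhead):
    """Return raw table size in bytes, ignoring indexes and compression."""
    return rows * (columns * bytes_per_column + row_overhead)


per_year = estimate_bytes(
    ROWS_PER_YEAR, COLUMNS, AVG_BYTES_PER_COLUMN, ROW_OVERHEAD_BYTES
)
total = per_year * YEARS

print(f"per year: {per_year / 1e9:.0f} GB")
print(f"{YEARS} years: {total / 1e12:.2f} TB")
```

Under these assumptions the raw table comes to about 184 GB per year, or a
little under 1 TB over five years. Indexes would add to that, and wider
columns would change it, so treat it as an order-of-magnitude check only.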

------
devs1010
Just curious: why have you opted for Python when your primary goal seems to be
scalability?

~~~
inovica
We have standardised on Python as our language of choice as it's the one that
we have the most skills in and everyone in the team knows it.

