
Having built a couple of scalable systems of my own (a personal project currently running on 5 machines, and two work ones running on 50 machines and 17,000+ (client) / 15 (server) machines respectively, though note those last two are not websites), I can safely say that the easiest option is to go with a cloud provider. :)

It's a pain to make sure you monitor loads and tweak the availability as things grow. Whereas with Amazon (who I would 100% recommend, having recently played with them) it is pretty much all automagical :)

One of the things I used to do when I was a more naive programmer was attempt to reinvent the wheel. Take LiveMeta. I got distracted for at least 2 weeks writing a JavaScript/PHP visitor tracking system (and it was damn good :D). But the thing is, it's not what the site was about; it was an incidental feature (solely for the developers, too!) and in the end Google/Clicky provided enough features.

One cut and paste of the Google Analytics code would have saved me the 2 weeks I should have spent on the other parts of the project.

Same applies here: don't give yourself work. If the project is going to need to scale fast enough to require cloud computing/scalable architecture, pay the bit extra to have Amazon do it for you. 3 years down the line you might decide it is time to invest in your own network, but right now I doubt that should be your focus. :)




If I may ask, what apps have you built that bring all this experience?

So it does cost a little bit more than the regular services. That's what I gathered, but it's also definitely worth it.

Since I have an ASP.NET stack, I would have to go with Windows Azure.


You have to understand that some aspects of the work stuff are covered under NDA :)

However...

The personal project is a social network site (still under development). I built the scalable back end because we secured some funding and were thinking of launching, until other factors caused us to re-evaluate. We built the architecture because it runs an evolved version of a caching server I wrote at university, plus some other parallel applications (my thesis was in these areas) I wrote to load balance the databases :)

On the work front, the smaller cluster is essentially a massive cache cluster (like BigTable) I wrote, based on my previous work, that we use in the office for storing our large fast-access tables. We process and access numerous pre-generated hashes daily (upwards of a billion). The software can scale as we add or remove nodes. If it needs extra storage it can temporarily grab one of the other servers used for other purposes. If this happens a lot it notifies us that we need a new addition: commodity hardware, a 64-bit OS with lots of RAM, and voila :D
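
The comment does not say how keys are spread across the nodes. One common way to let a cache cluster grow or shrink without reshuffling every entry is consistent hashing. Below is a minimal sketch of that technique, not the author's software: the `HashRing` class, the node names and the key names are all hypothetical.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring (illustrative only).

    Keys and nodes are hashed onto the same ring; a key belongs to the
    first node point found clockwise from the key's own position.
    """

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual points per node, evens out the load
        self._points = []         # sorted hash positions on the ring
        self._owner = {}          # hash position -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        # MD5 is used only to spread values evenly, not for security.
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.replicas):
            point = self._hash(f"{node}#{i}")
            if point in self._owner:
                continue  # position already taken (vanishingly unlikely)
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove_node(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def node_for(self, key):
        if not self._points:
            raise LookupError("ring has no nodes")
        # First point clockwise from the key, wrapping around at the end.
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


if __name__ == "__main__":
    ring = HashRing(["cache-a", "cache-b", "cache-c"])  # hypothetical nodes
    print(ring.node_for("hash-123"))
    ring.add_node("cache-d")  # only keys that now fall to cache-d are remapped
    print(ring.node_for("hash-123"))
```

With this layout, adding a node only moves the keys that now fall to the new node, and removing it hands them back, which fits the "add / remove nodes" behaviour described above.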

The second setup is the hash generation system. The server side of this is a distributed storage system (again custom written). It does work closely with the hash database too. I can't really talk about the client side of the above :)

Coding in the scalability was the hardest part of all 3 exercises.

EDIT: and if that sounds cool, well, it is. But it is also a lot of headaches :)



