
India's SMS GupShup Has 3x The Usage Of Twitter And No Downtime - pbnaidu
http://anand.typepad.com/datawocky/2008/06/indias-sms-gupshup-has-3x-the-usage-of-twitter-and-no-downtime.html
======
einarvollset
Such an utter lack of understanding of Twitter's service model I've rarely
seen. Message delivery was never Twitter's problem; dealing with persistence
(I want my messages when I'm good and ready thanks, not when the system wants
to push them to me - READ: LIKE SMS MULTICAST!) is. Christ.

Downvote me all you like, but I'm embarrassed this story made 21 points.

~~~
tx
Such an utter lack of understanding of SMS messaging I've rarely seen. Twitter
is a joke compared to what it takes to do SMS properly. Not only do SMS
messages need to be persistent, they also need to be stored and accurately
retrieved for up to 5 years, which is a government regulation in most
countries. Moreover,
delivery rate needs to be 100%, period. SMS is not a time waster, even
pacemakers send SMS messages from inside of human bodies when their batteries
need to be replaced. You can't afford to drop those.

I know that the "culture" of Silicon Valley is not to criticize, so let me put
it this way: HIRE CAREFULLY for your startup, especially when it comes to
engineering. Even if you're solving a trivial issue.

~~~
einarvollset
Don't get me wrong, I don't think SMS on a truly large scale is trivial. I do,
however, think SMS messaging is fundamentally different to what Twitter is
doing. The former is unicast: I send you a message. It needs to be stored
somewhere (your SMS inbox). Twitter is multicast: I send a message, everyone
who subscribes to me needs to get that message.
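
A rough sketch of that difference, using a toy in-memory store. The class and
method names are hypothetical and are not taken from either service; the only
point is that a unicast send stores one copy while a multicast send stores one
copy per subscriber.

```python
from collections import defaultdict


class Inboxes:
    """Toy in-memory message store; every name here is hypothetical."""

    def __init__(self):
        self.inbox = defaultdict(list)     # user -> list of (sender, text)
        self.followers = defaultdict(set)  # author -> set of subscribers

    def send_unicast(self, sender, recipient, text):
        # SMS-style: one message is stored in exactly one inbox.
        self.inbox[recipient].append((sender, text))
        return 1  # number of copies written

    def follow(self, follower, author):
        self.followers[author].add(follower)

    def send_multicast(self, author, text):
        # Twitter-style: one message is copied to every subscriber's inbox.
        subscribers = self.followers[author]
        for follower in subscribers:
            self.inbox[follower].append((author, text))
        return len(subscribers)  # number of copies written
```

The write cost of `send_multicast` grows with the number of subscribers, which
is where the two workloads diverge.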

Now, misunderstand me correctly; I don't think this is anything new. My former
boss built tools (over 20 years ago) that ran such systems as the NYSE, Swiss
Air-traffic control, etc that did fault-tolerant multicast. And the uptime
there was somewhat better than Twitter.

What led me to post the original comment was this: They are not solving the
same problem, but if they were their architecture would be just as flawed as
Twitter's.

------
avner
It works..because the Indian cellphone companies don't cheat you into paying a
fee for RECEIVING sms or calls.

~~~
plinkplonk
you have to pay for receiving calls/sms in the USA? wow !!!

Fwiw, here in India, in the initial days of cell phone adoption, some phone
cell service providers tried providing plans with "receiving charges", but no
one used those plans so they (mostly) faded away.

I assumed something like that happened in the United States as well.

Another interesting phenomenon here is that (most) cell phones are not locked
to a particular vendor. The cell phone manufacturers (Nokia , Samsung et al)
compete (fiercely) on phones (almost every week a new phone launches) and the
service providers (Airtel, Vodafone et al) compete on service (bazillion
plans with different mixes of features. You can shift a plan to another one in
a few hours by sending a free sms to the service provider)

You get a sim card from the carrier you subscribe to and buy a phone from
wherever you want, insert the card and you are good to go. You can change or
upgrade the phone and/or service provider independently of each other.

~~~
dcurtis
What you are describing is the power of the open market; it's what happens
when you don't have an oligopoly like we have here in the US.

------
abijlani
I think this post made by the Twitter team pretty much sums up their woes.
(http://dev.twitter.com/2008/05/twittering-about-architecture.html)

"Twitter is, fundamentally, a messaging system. Twitter was not architected as
a messaging system, however. For expediency's sake, Twitter was built with
technologies and practices that are more appropriate to a content management
system."

------
st3fan
"""It appears that the biggest difference between Twitter and GupShup is
3-tier versus 2-tier. RoR is fantastic for turning out applications quickly,
but the way Rails works, the out-of-the-box approach leads to a two-tier
architecture (webserver talking directly to database). We all learned back in
the 90's that this is an unscalable model, yet it is the model for most Rails
applications."""

I remember having a discussion with DHH about RoR on IRC a long time ago.
Probably somewhere in 2004, when RoR was still very young and not well known.
Coming from a Java/Spring/J2EE background I asked him about abstracting
database access in a third tier. He said he had never heard of that and did
not know and understand why people were doing that.

~~~
lpgauth
"He said he had never heard of that and did not know and understand why people
were doing that."

Could you explain the advantage of having a middle man between the server and
db? Does the middle man cache the requests? From what I can read up it seems
to be just a logical layer that executes the request depending on some
rules...

~~~
gaius
Let's say you have 10,000 logged-in users on your website, of which 1000 are
concurrently active. That's very difficult to do with an RDBMS, because (most)
RDBMSs do a lot of work to start a session, and maintain an awful lot of
session state (for example, Oracle pre-allocates a chunk of memory private to
each session to do sorting in).

So you put something in the middle, that multiplexes those sessions down into
say 100 sessions on the database, checking connections in and out of a pool as
necessary, queueing requests asynchronously if there are no free connections
in the pool. You avoid the expensive creation/destruction of database
sessions, as you start the sessions when the middle tier starts and keep them,
and you keep the session state the database has to maintain at an optimal
level. Cleverer architectures add effectively another layer between the middle
tier and the database to cache query results (because you the developer can
_know_ what data you can cache like that, but the database can only make a
best guess).
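
A minimal sketch of the pooling part of that idea. `ConnectionPool` and the
`connect` factory are hypothetical names, not any real driver's API, and this
version simply blocks the caller until a connection is free rather than
queueing requests asynchronously.

```python
import queue


class ConnectionPool:
    """Toy pool: opens `size` connections once and reuses them."""

    def __init__(self, connect, size):
        # `connect` is a hypothetical zero-argument factory that opens
        # one database session. It is called only `size` times, up front.
        self._free = queue.Queue()
        for _ in range(size):
            self._free.put(connect())

    def run(self, work, timeout=None):
        # Check a connection out, waiting if none is free.
        conn = self._free.get(timeout=timeout)
        try:
            return work(conn)
        finally:
            # Check the connection back in, even if `work` raised.
            self._free.put(conn)
```

Ten requests pushed through a pool of two, for example, only ever open two
sessions on the database side.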

In "sharding" I suppose the middle tier also has to do some logic to figure
out which "shard" to direct the query to. Note that _this logic must be done_,
whether you do it yourself in your code, or you let Oracle do it for you in
the query optimizer, picking the right partition(s) to actually execute the
SQL on.
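
One way the "do it yourself" version of that routing could look. Hashing the
username is an assumption here (a common choice, not something this thread
specifies), and `shard_for` is a hypothetical helper.

```python
import hashlib


def shard_for(username, shard_count):
    """Map a username to a shard index in a stable, repeatable way."""
    # A cryptographic hash is used only because it is stable across
    # processes and machines, unlike Python's built-in hash() for strings.
    digest = hashlib.sha1(username.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count
```

The middle tier would call `shard_for(username, n)` before each query and send
the SQL to that shard's pool. Changing `shard_count` remaps most users, which
is one of the costs of doing this by hand.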

~~~
st3fan
This is not what is usually meant by a third tier. Personally I take database
connection pooling for granted.

What I mean when I talk about a 'data-tier' is the code that deals with
accessing data _without knowing in whatever datastore it is contained_.

If you properly hide for example Twitter#getRecentMessagesForUserById(userId)
behind a (Java) interface then you can easily change from an implementation
where you do direct database calls to a sharded solution or a cached solution.

Other tiers that use this API will simply work because the interface has not
changed. One day your app could be talking directly to a single MySQL database
and the other day your app could be talking to a 60 node PostgreSQL cluster.
It would never know since the details are hidden.

This is totally against what you see in every Rails book or example app where
the first thing that is done is direct ActiveRecord queries.

Yes it is more work, but it pays off in the long term. It also greatly reduces
local hacks for caching. All that is done in the right place. And
automatically for all users of that tier.
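
A sketch of that data tier in Python rather than Java. `MessageStore`,
`InMemoryStore`, `CachedStore` and `render_timeline` are all hypothetical
names; the in-memory class stands in for a real database, and the cache never
invalidates, which a real one would have to.

```python
from abc import ABC, abstractmethod


class MessageStore(ABC):
    """The interface other tiers depend on."""

    @abstractmethod
    def recent_messages_for_user(self, user_id):
        ...


class InMemoryStore(MessageStore):
    """Stand-in for 'direct database calls'."""

    def __init__(self, rows):
        self._rows = rows  # user_id -> list of message strings

    def recent_messages_for_user(self, user_id):
        return list(self._rows.get(user_id, []))


class CachedStore(MessageStore):
    """Wraps any other store and caches results per user."""

    def __init__(self, backend):
        self._backend = backend
        self._cache = {}
        self.backend_calls = 0  # exposed only to make the effect visible

    def recent_messages_for_user(self, user_id):
        if user_id not in self._cache:
            self.backend_calls += 1
            self._cache[user_id] = self._backend.recent_messages_for_user(user_id)
        return list(self._cache[user_id])


def render_timeline(store, user_id):
    # Caller code: knows the interface, not the datastore behind it.
    return " | ".join(store.recent_messages_for_user(user_id))
```

`render_timeline` works unchanged with either store, so swapping in a sharded
or cached implementation touches one tier only.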

S.

~~~
melvinram
Maybe I'm misunderstanding what you're saying but I'd like to point out that I
can go from MySQL to PostgreSQL in a matter of minutes.

Specifically, I would need to change my database.yml file with my new database
info and run a migration.

Did I miss something?

~~~
gaius
A slight tangent, but I _never_ understood the philosophy of being database-
agnostic. Every database has its own set of features and does things in a
certain way. For example, in some databases cursor operations are expensive
and temporary tables are the way to go. In some, the opposite. Remaining
neutral means using only the most basic SQL and _still_ you might not get the
expected performance...

------
axod
I wonder if those techcrunch Twitter stats are up to date or accurate. 3
million messages a day is really nothing in terms of scaling issues. 34
messages a second.

Although it also depends how many destinations each of those messages has.

~~~
ojbyrne
Sure, if messages were evenly distributed over the day. I think it's the
bursty nature of the traffic that causes the scaling issues. A better estimate
would probably assume that half that daily traffic is within 1 hour - i.e. 1.5
million/(60*60) = 417 messages/second.
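
The two estimates side by side, as a quick check of the arithmetic. The 3
million/day figure is the one quoted above; the "half the traffic in one hour"
split is this comment's assumption, not a measured number.

```python
MESSAGES_PER_DAY = 3_000_000

# Evenly spread over 24 hours.
average_rate = MESSAGES_PER_DAY / (24 * 60 * 60)

# Assumed burst: half of the day's traffic arrives within a single hour.
burst_rate = (MESSAGES_PER_DAY / 2) / (60 * 60)

print(f"average: {average_rate:.1f} messages/second")  # average: 34.7 messages/second
print(f"burst:   {burst_rate:.1f} messages/second")    # burst:   416.7 messages/second
```

Under that assumption the peak is about 12 times the daily average.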

------
KirinDave
Why do people think Twitter's problems are due to a lack of good technical
design?

Twitter's real problem is that they've had a pittance of hardware until very
recently. It's not my place to go into why this is (and I'm sure the version
of the story I know is a bit biased by the teller), but suffice it to say that
a lot of Twitter's problems were, until recently, more business-oriented than
technology-oriented.

------
ismail
@tx claiming that SMS has persistence is wrong. Once the SMS is delivered to
the customer and the buffer fills up on the SMSC, the oldest messages are
cleared. A record is written to a file, which is then transferred to some
other server (that deals with the persistence). So SMS as it stands now is not
persistent, since one app handles the messaging and another handles the
persistence.

------
ashleyw
You've gotta remember, Twitter is in a very sticky situation at the moment!

On one hand, they know their architecture isn't perfect and will be
rebuilding it to cope with more users.

But on the other hand - they are a small team which is trying to keep it up in
its current state, and can't just decide to close Twitter for a few
weeks/months while they work on the new architecture. People would definitely
move to another service, and Twitter would be dead.

------
gaius
I laugh when people talk about "sharding". In what way is storing your users
in different databases by username any better than calling your disk drives by
letter? The major database vendors all tried shared-nothing, and most have
rejected it in favour of single images (using partitioning to arrange the data
in a form suitable for parallel queries)

~~~
st3fan
"""In what way is storing your users in different databases by username any
better than calling your disk drives by letter?"""

Probably the fact that those drives are local while sharded databases are
physically separate (clusters) of independent machines?

Some apps totally suck at this model. Others are perfectly suited. I doubt
Twitter needs complex joins, so it should be easy to partition their data.

S.

------
noor420
Man Indians are getting better at Web Apps too. Good read.

~~~
babul
They spent a lot of time building them for others (outsourcing) and now are
building them for themselves (entrepreneuring/producing).

The same model as the far east with cheap manufacturing of electronics goods
in the previous decades. After a while of outsourced cheap
production/manufacturing, these players learnt about the markets/products and
how to actually design and build themselves and now many of them are major
manufacturers and producers in their own right.

~~~
babul
Plus web apps will soon be big business in India as it allows them to compete
in international markets with stronger currencies (i.e. USD/GBP is greater
than the Indian Rupee) without the associated export/manufacturing costs, and the
benefit of their own homes/area/country.

Hence instead of working for $10k~20k (USD) on average in a
multinational/call-centre, they will form tech/consulting startups. This has
already been demonstrated in the macro-scale with names like
InfoSys/TCS/MindTree/HCL and now will happen in the micro-scale too,
especially after media coverage of people like the Scrabulous founders etc
(who many media sources claim earn about $20k~$30k a month in just facebook
advertising).

Only a matter of time till we see a web superhit from India, imho.

------
xlnt
Twitter has a better name.

~~~
raghus
GupShup is slang in Hindi for chit-chat or gossip

