Hacker News
How I built an app with 500,000 users in 5 days on a $100 server (medium.com)
423 points by kiyanwang on July 21, 2016 | 168 comments

This article left a really bad taste in my mouth. I don't believe GoSnaps == GoChat in terms of complexity, and the constant back-patting and self-congratulation is really distracting. There were a couple of decent takeaways, but largely the whole post revolved around "How smart am I?" and "What great foresight I have".

I really don't approve of the GoChat shaming going on. The author may be 100% correct that GoChat made mistakes by writing code that doesn't scale well, but that doesn't give him a blank check to beat and berate GoChat. In my opinion it reads as a very discouraging post to newer/less experienced programmers, essentially "Don't even bother making something unless you know it can scale to millions of users", which I think is a terrible message to be sending.

The real takeaway is that experience beats inexperience. The GoSnaps author seems to have more experience, while the GoChat author seems not to be a developer and outsourced his service. Without prior experience, it's difficult to have insight into all the things that could possibly go wrong.

I personally don't feel like he was berating GoChat. If anything, I was sad for GoChat to have grown and fallen under its own weight, yet I still respect GoChat for having had the foresight to build and execute. The author even called it a "genius move".

> I personally don't feel like he was berating GoChat

You don't feel that way because he edited his article; in its original form it was VERY uncharitable to GoChat. Now, after the edits, it's only lying by claiming they're similar in performance profile, when in reality, in terms of realtime performance, GoChat does everything GoSnaps does PLUS a lot more work.

And "oh, look, they spend $4,000 a month on servers, and I only spent $100/month on a server. Plus 'cheap' storage from Google, that I'm not going to enumerate for some reason"

A screenshot of Pokemon Go on my iPhone 6S (just the map) is 3.7 MB. If we extrapolate using his 200K images estimate, we're talking about 722.66 GB of images so far: < $20/month for the storage and ~$80/month for the bandwidth (serving them back out; ingress is free). Now, these are early days for GoSnaps, so it's hard to predict whether those users will stay, but by the author's own point he has to keep all of the pictures, so that number will only grow. It is disingenuous to completely leave out a cost of the service which looks like it has already roughly doubled his claimed "only $100 for a server" monthly bill. I'll also point out that they may be compressing/downsizing the hell out of those photos, but I can't tell how much each one weighs without MitM-ing the app, and I don't care that much.
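For what it's worth, the arithmetic above checks out. A quick sketch (the image count and size come from the comment's own estimates; the per-GB price is my own assumption for a typical cloud provider, not a quoted rate):

```python
# Back-of-envelope check of the storage estimate above.
# Assumptions: 200,000 images (the author's estimate) at 3.7 MB each
# (one Pokemon Go screenshot); the $/GB price is illustrative only.
IMAGES = 200_000
MB_PER_IMAGE = 3.7
STORAGE_PER_GB_MONTH = 0.026  # assumed $/GB-month for cloud object storage

total_gb = IMAGES * MB_PER_IMAGE / 1024  # MB -> GB (binary)
storage_cost = total_gb * STORAGE_PER_GB_MONTH

print(f"{total_gb:.2f} GB stored")            # ~722.66 GB
print(f"~${storage_cost:.0f}/month storage")  # under $20/month
```

The bandwidth figure is softer: it scales with how often each image is served back out, so the ~$80/month estimate depends entirely on read traffic.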

For sure - I certainly wouldn't expect it to come close in hosting costs, but when the author is as snarky as he was about how cheap his setup is, he should expect to be called out.

And besides (leaving aside the numerous other 'issues' with the claim, as discussed here), is "My $200/mo stack performs better than a $4000/mo stack" really that much worse a claim than $100?

I agree. The complexity of his photo app is nowhere close to something like GoChat.

Plus the condescending tone, when he is just copy-pasting code from another project, is really annoying to read. Apart from that, the content is interesting though.

> condescending

This is exactly the word I was racking my brain for! I love reading about how to make something more scalable, but there was so much filth I had to wade through to get to that in this article...

The only points I can pick out are: use node, make sure your db queries are on one index, and make sure you turn off features of modules/frameworks you're not utilizing.

A lot of condescension for 3 small points.

I upvoted because I'm just now hearing about the whole GoChat story, and because I find it amusing how quickly Pokemon Go related apps can blow up right now, but yeah, it is evident that this was primarily a self-promotional post otherwise.

Yeah, it's complete bs. I've worked on chat before, and even 100 clients can destroy a core or even two.

Now I would use Pusher, which is extremely expensive but scalable.

100 clients destroying a core or two? Please, use the right tool. See Erlang, Go, Java, even C#.

Years ago, you had to suffer using poll() and select() while dealing with threads or forking. It's so easy these days. A Raspberry Pi should be able to support 1,000 clients with any of the above recommended languages.
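Indeed, the hand-rolled select() loop is a thing of the past. A minimal sketch of how little ceremony concurrent connections take nowadays, here with Python's asyncio (an illustration of the point, not a chat server; the host, port, and client count are arbitrary):

```python
import asyncio

async def handle(reader, writer):
    # Echo each line back; one lightweight task per client,
    # no threads, forking, or manual select() loop needed.
    while data := await reader.readline():
        writer.write(data)
        await writer.drain()
    writer.close()

async def main(n_clients=100):
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    async def client(i):
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        writer.write(f"hello {i}\n".encode())
        await writer.drain()
        reply = await reader.readline()
        writer.close()
        return reply

    # Run many concurrent clients against the single-threaded server.
    replies = await asyncio.gather(*(client(i) for i in range(n_clients)))
    server.close()
    return replies

replies = asyncio.run(main())
print(len(replies), replies[0])
```

Scaling the client count into the thousands is mostly a matter of file-descriptor limits, not code changes.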

Node was designed to hit 100K users/core, and in fact can do closer to 1M users... for a simple chat relay, a single server should be able to handle 1M simultaneous users... Growing beyond that means you need to segment/orchestrate your channels. I haven't used the app, so I can't comment on it specifically... but something written in Go could probably handle a couple million users on a single server via websockets (or raw sockets), with appropriate backing services in place.

The original article is condescending to say the least, that doesn't mean that $4k of ongoing hosting costs shouldn't be able to handle a few million simultaneous users... though, something tells me that Pokemon Go could approach twitter or FB overhead, and related apps like chat could see huge swarms as a side effect.

That kind of problem would kill most first generation apps, as those who've seen twitter grow can attest to. I don't think there are any IRC networks that big at this point.

Yeah, no, that's wrong.

If you have problems with this, there's something really wrong with either your design or your choice of tools.

How'd you go about building this?

In 2008, with PHP and Ajax.

That seems unnecessary. Or did you use (something like) Meteor.js?

The real takeaway from this is that the author uses Hackathon Starter (https://github.com/sahat/hackathon-starter).

I have used it for multiple projects and it gives a huge head start compared to starting from zero. Signing up, logging in, resetting the password, uploading, etc. all seem like easy work, but when you pile them up you can easily spend a week just getting to the point you reach within minutes of cloning the starter repository.

However, the failure of GoChat is not relevant to Pokemon Go. While GoChat might have done something very wrong, comparing 1M users to an app with tens of millions of concurrent users is invalid. Pokemon Go would be a NoGo running on a single Node.js machine without any sort of balancing.

> However the failure of GoChat is not relevant to Pokemon Go.

That part is almost right, but also a little confused.

He built GoSnaps, which is a geo-enabled gallery with: No chat, few messages, few writers, many readers, medium amount of reads, large messages, not time-dependent. (Not even user state.)

Then he contrasts that to the failure of GoChat, which is a geo-enabled chat with: Many-to-many chat, many messages, many writers, many readers, many read operations, small messages, very time-dependent.

The performance profile between these two things is wildly different and the way he's patting himself on the back over what amounts to half knowledge is pretty disgusting.

There is a real feeling of superiority in his post, based on his own assumptions (in terms of the other app's stats). I always think it's best to constructively critique something without trying to promote yourself in the process.

Failure of GoChat? GoChat didn't fail, it just had some hurdles it needed to jump. GoChat is back up 100% baby! (I'm the guy who started GoChat). I reached out to the author of this story and he has since updated his story. I now have a team of 6 dedicated guys helping me full time on this and we're cruising forward.

That is great to hear. Good work! Sorry I can't edit the post anymore, otherwise I'd update that. :)

Edit: Wow, he did update it, and he's only defending his bullshit with more half-knowledge. At this point I'm ready to call him a liar, just over this claim: "My conclusion is that both apps are very similar in terms of scalability complexity."

What if he had just not mentioned GoChat? It's an awesome post.

Really cool. Do you know a list of similar repos for other stacks? C#+JS? Java+JS? ...etc

Try JHipster for a Java (Spring Boot + Angular) stack: http://jhipster.github.io/.


Does anyone know of the same thing for Elixir/Phoenix?

Yes, it's a pain, and the user authentication solutions in particular are very complex. The best I have found is this example, but it's 100% not a starting point, as the first thing you'll have to do is rename everything!


>Yes, it's a pain and particularly the user authentication solutions are very complex. https://github.com/hassox/phoenix_guardian

This is why I don't see the point of moving over to Elixir. Rails has ready-to-go and battle tested "modules" that you can pop in and go.

If I used Elixir now, I would have to wait for the community to build something battle tested, or I could home-roll my own Elixir authentication. The poster above just said, "It's a pain, particularly user authentication."

I maintain that a web framework is probably one of the least interesting things that's enabled by a language like Elixir, a VM like BEAM, and a foundation like OTP... but... for whatever reason web frameworks have become the criteria by which we seem to measure languages these days.

That said. A chat application is really, really well-suited to Elixir.

It's definitely a lot harder to move to Elixir and Phoenix but I've enjoyed learning FP and Elixir a lot; I've done it mostly to challenge myself and it's informed my programming in other languages.

Really? The user authentication in the book Programming Elixir was pretty simple and straightforward—is that not "production ready"? Authorization seems like a simple plug away.

For a second I thought that was a plugin before I realized it was a sample application.

Yes, I think the Phoenix guys would rather you forwarded to an OTP app than build a mechanism for plugins.

I think no one builds their apps this way yet, so maybe it needs more explanation.

I find this thread a little confusing. That sample application is using Ueberauth (https://github.com/ueberauth) which appears to be a rich plugin-based authentication framework with many preexisting plugins. You don't run the sample application, you use the framework.

The point is it's not fantastically quick to drop in, in the example app look at the UserFromAuth helper - it's definitely not simple and contains a load of gotchas I'd struggle to deal with from the ueberauth documentation.

The grandparent was asking about a template/hackable starter kit with batteries included. That setup of ueberauth is the best I've found, sadly.

Or any python-based solution?

https://github.com/pydanny/cookiecutter-django is fairly similar in terms of providing a good jump start for django development.

A quick thanks for that one!

Any flask based ones?

https://github.com/mjhea0/flask-boilerplate might fit the bill, not as comprehensive though.

Very true, and that goes not just for hackathon-starter specifically but for any similar boilerplate project. If you enjoy prototyping and making MVPs, having a boilerplate that you are personally comfortable with and which covers most of the functionality is pretty much a no-brainer.

I've been considering making something like this for myself. I love my current stack: TornadoWeb with a Gremlin Server graph database backend, but I do find myself recreating it from time to time. Maybe I should put the effort in to at least skeleton the thing/define common functionality so that I can quickly prototype.

Ah, the good old "I could build StackOverflow in a weekend" line of thinking - I'm sure we've all been there.

There's a world of difference in building a photo sharing app with XXX,XXX users vs. building a chat app with XXX,XXX users.

When you do anything that involves chat or that level of concurrency, surprises will bite you in the behind, multiple times, even if you desperately try to use as much existing software as possible.

(as anyone who's taken a look at ejabberd, thought it'll play nicely, and then load tested their code will tell you)

Frankly, PHP vs. Rails vs. Node[1] vs .Net vs Java will be the least of your troubles.

[1] I do fear that the author is going to find a nasty surprise or two for themselves regarding Node's performance issues.

Though, using a specialized language like Erlang can make it possible to run WhatsApp on only 50 engineers (in 2015). I agree with the OP that choosing a stack like Rails can kill your Pokemon Go related MVP pretty quickly (I'm a huge Rails fan myself, but would definitely use Node.js for something like that).

> make it possible to run WhatsApp on only 50 engineers

Frankly, I'm sure Erlang is great, but with 50 full time skilled engineers I'm pretty sure you could run WhatsApp on almost anything.

The largest explosion in their user base came when I think they were at something like ~11 engineers, and from what I've heard the largest additions to their engineering team came on the frontend side, not the backend.

Ask Twitter team; they may have a different story.

I'd probably do the first gen core chat in Node, and write everything else to leverage a cloud provider's infrastructure of choice.... Google, Azure and AWS all offer similar services which include bigtable, blob/file storage, some hosted SQL variant, document search engine, and some form of queuing service.

The hard part is routing messages, and looking at some of the messages, it seems to be location-based chat... you could probably use geohashing for sending messages to channels: the hash cell you are in, plus the surrounding 8 cells.

Breaking up the server instance that a given chat connects to based on location hash would probably work out well enough. I'm not specifically familiar enough with the app... But it seems to me that channel growth and routing are the harder things in this at the scale of millions of users.
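The geohash-and-neighbors routing described above can be sketched with a plain integer grid standing in for a real geohash library (the cell size, names, and coordinates here are all my own assumptions for illustration):

```python
# Toy stand-in for geohash-based channel routing: snap lat/lng to a
# fixed grid cell, and fan each message out to the sender's cell plus
# the 8 surrounding cells, so anyone subscribed nearby receives it.
CELL_DEG = 0.01  # assumed cell size (~1 km); a real app would pick a geohash precision

def cell(lat, lng):
    return (int(lat // CELL_DEG), int(lng // CELL_DEG))

def target_channels(lat, lng):
    # The sender's cell and its 8 neighbors: 9 channels total.
    r, c = cell(lat, lng)
    return {(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)}

# A subscriber only listens on its own cell; senders cover the overlap.
alice = cell(37.7749, -122.4194)               # a user in San Francisco
targets = target_channels(37.7755, -122.4200)  # a sender ~100 m away
print(alice in targets)
```

With this scheme, subscribe/unsubscribe only happens when a user crosses a cell boundary, and each message is published to a bounded, predictable set of channels.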

Depending on what kind of app the Rails stack is operating...it would follow the same exact pattern as Chat vs Photo Sharing in nodeJS as in the article. A Rails app running a JSON API server would do pretty well, and basically has the same boilerplate as the node hackathon thingy built in, plus a few extra gems.

Except it will collapse under its own weight too quickly... Rails is a pretty bad core for a chat engine.

Except that it's an order of magnitude slower than some other tech.

First, I would say Pokemon Go has done incredibly well to handle such massive growth so quickly; no doubt they were able to leverage a lot from Ingress, but I could imagine many other companies having days of downtime while trying to scale up so quickly.

I also tend to disagree a bit with the article. For every situation like this where early scalability is important, there are a thousand MVP apps that are prematurely optimized or over-engineered. At the end of the day, the chance of anyone building an app that will get 100,000+ users in a week (and keep those users coming back) is very, very slim.

Still, I think it makes sense to adopt a language and framework that makes this kind of scalability very easy. Nothing can really be called "over engineered" if it only took 5 days to create. Not even prematurely optimized. I think he just built something correctly and didn't take dangerous shortcuts.

100% agree with you. As a freelancer I usually have to decide technology stack and infrastructure and I always go with the cheapest option for my client with the argument that one day he could just scale it up even by rewriting it.

This turns a 3-month project into one delivered in 1 month, and in the end the client usually benefits from that.

> I always go with the cheapest option for my client with the argument that one day he could just scale it up even by rewriting it.

Which in practice of course never happens. Instead, the old code gets patched up until everything fails completely, and only then a rewrite happens :)

Which is a perfectly valid choice in many cases. I plan to drive my car for as long as possible and into the ground before I get another one. As long as I know the eventual replacement cost is coming and budget for it it can be a good choice.

I definitely agree with you in the normal case, but not with PoGo for what it's worth.

Pokemon Go is a bit of a singularity in growth. It's apparent to everyone on the street that (at least for the moment) this thing is huge.

When something is so obviously on fire with growth, over-engineering might not be so bad. Look at any of the popular PoGo apps... they have all dealt with big numbers... the landscape is just a bit insane right now for that game.

I like making side projects, and I definitely agree with you not to over-engineer... but I feel like this is one case where it may not be the end of the world. With PoGo, if you don't balance the two (over-engineering vs MVP), you'll likely end up sad either way.

Yes, driving a nail with a sledgehammer is usually not a very good idea. :-)

>>> Where would I have put my images? In the database: MongoDB. It would require no configuration and almost no code.

Why... would anyone actually do that in anything more than a classroom example for an application like the one described? Amazon S3 and similar services have very decent libraries for pretty much every popular programming language, why would you re-implement that?

>>> MVP and scalability can coexist

I'd replace that with the less catchy but probably more correct 'experienced devs can make more scalable MVPs with little extra cost, if any'. MVP doesn't mean let's just go silly and make the quickest and dirtiest decision imaginable.

It's a matter of experience to recognize potential problems and their respective solutions, and to program accordingly. SQL schema design is a pretty good example: it often makes a big difference in scaling, and with some experience and a few moments of planning you can often design the initial schema to be much more scalable.

I read that and thought "MongoDB users typically store images in the database? Interesting"

The author says in the article that putting images in the database is not what he did.

I think the point the parent is trying to make is that no one would do that anyway, and mentioning it in the article as something that makes your application slow is pointless.

Exactly. Somewhat dramatic, but equivalent of saying 'decided to use postgresql instead of flat files, therefore scalable'.

That's why I was really wondering if there are people who find strong reasons to do that.

But the article mentions he used Google's cloud, and the OP asked "Why would anyone reimplement AWS?"

It sure seemed like the OP skimmed the article, found something out of context, and refuted it here.

Nope, I read it. Maybe just didn't explain my point clearly enough.

He mentioned that he is using Google Cloud instead of storing images in MongoDB and my question was what's the big deal about it? It doesn't seem like something you'd do to make it more scalable, it's something you'd do anyway.

I've seen plenty of people suggest or question which is better, storing files in a database or outside of a database.

Like just about everything, each has trade-offs.

I don't find it immediately obvious that a database would never be used for such a thing, or that reaching for a third party file storage service is the only solution to consider.

OP is just saying that in general even a pretty novice programmer who also strongly believed in the MVP mindset still wouldn't store images directly in the database. I think he was just using S3 as an example since it is the most popular solution for web developers right now.

> If I would have built GoSnaps with a slower programming language or with a big framework, I would have required more servers. If I would have used something like PHP with Symfony, or Python with Django, or Ruby on Rails, I would have been spending my days on fixing slow parts of the app now, or adding servers. Trust me, I’ve done it many times before.

> As said, GoSnaps uses NodeJS as the backend language/platform, which is generally fast and efficient. I use Mongoose as an ORM to make the MongoDB work straightforward as a programmer.

That's an odd part.

I'm not sure there's anything substantial that prevents, say, Python+Flask (I see MongoDB, and I haven't used it from Django, so I'm avoiding that topic) from handling 600rpm on a machine where Node can. From what I got, all it does is process messages and geospatial search queries - essentially merely passing those to the DB (which does the actual work) - so how come the runtime of the byte-juggling middleware layer even matters here?

It sounds to me like the author has drunk the "NodeJS is fast, everything else is slow" koolaid.

Definitely and I hate it. Makes all Node programmers look bad : (

Both Node.js and Flask can barely hit 10k requests per second even when they don't touch the database:


Node is a solution for network-bound problems, NOT CPU-bound problems. Fibonacci is very much in the latter category. Horses for courses.

... if your web app is a fibonacci number generator..

In case it wasn't clear, I am saying 10k req/sec is low...

(compared to go/java which is 5 times as fast)

Well yeah, Node processes run a single thread (of user code). There's no concurrency there. Go and Java will use all available cores of the system.

A better (and likely more realistic) comparison would be to actually hit the database: node is async by nature so would better handle concurrent requests.

And minikomi's point is that a fibonacci number generator is ill suited for nodejs, so it is expected that this webapp doesn't achieve high rps.

If you're at 10k req/sec while processing nothing you can't possibly achieve a higher rate on real workloads. It establishes a hard cap on the best case scenario for a node server that manages to parallelize all IO work.

I am not trying to make any particular point here. In my original comment I was just providing a bit more context on whether or not nodejs/python can comfortably handle 600 requests per second, and if so, how much more "scale" is left in this architecture.

A fibonacci number generator is by definition not "processing nothing". "Hello world" is closer.

In any case, in practice, node.js can process almost as many requests per second as nginx and PHP in common configurations, after you've implemented clustering - which other servers do by default. It's a dozen or so lines of code in node.js.

I expect the same is true with Flask, except that Flask's default configuration is even worse - Flask's built-in HTTP server is for development purposes only, and is similarly single-threaded. It's a WSGI framework, and deployment needs a proper WSGI server, of which there are a number of choices - gunicorn or uwsgi are common.

Using the default configuration is not a particularly good way of benchmarking the limits of a framework, because many of them are set up to be easy to work with by default, not to be fast. If node.js clustered by default, you couldn't use global variables for state and might have to deal with race conditions. Flask doesn't ship with a web server just because there's so many good choices available depending on what you need.
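To make the WSGI point concrete: what gunicorn or uwsgi actually runs is just a callable with this contract (a bare-bones sketch with made-up names; no Flask required, since Flask apps are themselves WSGI callables):

```python
# A minimal WSGI application: any WSGI server (gunicorn, uwsgi, waitress)
# can serve this with multiple workers. The single-threaded limitation
# lives in the dev server, not in the app itself.
def app(environ, start_response):
    body = f"hello from {environ['PATH_INFO']}".encode()
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]

# Exercise the callable directly, the same way a WSGI server would.
captured = {}
def start_response(status, headers):
    captured["status"] = status

result = b"".join(app({"PATH_INFO": "/test"}, start_response))
print(captured["status"], result)
```

Under gunicorn this same callable would be served with something like `gunicorn -w 4 module:app` (module name assumed), which is where the multi-worker concurrency comes from.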

Out of curiosity what would be a healthy RPS range? I've always assumed anything over 1000+ is acceptable.

For this application, where he was getting "hundreds of requests per second" at 500K users, if he suddenly grew another 10x he would've had to start scaling horizontally instead of running on just one machine.

According to my work, anything above 10 rps is fine.

For internal business apps - or even b2b tools - 10 is probably plenty. There's a lot of premature optimization around.

Also, I only skimmed the article, but it seems like the heavy lifting of serving images is done by the Google infrastructure instead of the one server.

Serving images is usually just a wrapper around sendfile() and is rather trivial.

I'd advise you to go read Nginx source if you want to see how "trivial" it is... :)

I was skeptical too, but the guy has 500,000 users on a $100/mo server. It's not even up for debate anymore, Node.js is fast and scalable. You might not like MongoDB, but it also works.

The performance profile matters. 500,000 users on an Instagram-type site create a COMPLETELY different kind of load compared to 500,000 on a geo-enabled realtime many-to-many chat program.

Basically, the Node.js server just converts HTTP requests to MongoDB queries and passes the JSON results back (especially since he has the MongoDB driver return plain JSON results).

I can't see any real work Node.js does here except passing requests around. Let me put it this way: no matter what language or framework he used, spending 90% of his money on MongoDB is the win for this app.

500,000 users doesn't mean anything. What's the requests/second figure?

Now all we need is an entirely new software stack that passes flatbuffers end to end since they are BETTER than json (rolls eyes)

it can still be up for debate

Maybe a slow language doesn't matter too much when the majority of the work is conveying data back and forth from the database to the users

There's a little too much self-congratulatory prose in here. And poor advice ("Use NodeJS because it's fast").

But there is one take-away at least... design your application around your data and how your users will interact with it and performance will generally fall out of that. And it doesn't take much to start that way rather than leaving it as an after-thought.

People might break out the (oft-misquoted) "premature optimization" horse for a little beating, but performance does matter. At least the bounds matter for most applications. You might not need to eke out every cache line, but you can set targets up front, saying "We cannot tolerate more than X ms req-to-res time", and bake that into your design.

It's funny you mention node as poor advice... and while I think it would need to be replaced at scale, I tend to reach for node first, if only because it's easy enough to prototype something that can scale out, and it gets decent per-instance performance. In this case, I'd probably have started with Go for the core chat engine, combined with a plan to shard/distribute chat channels, however you want to break them up.

As for premature optimization: for something like this, they should have had a plan to grow, and enough of a base to handle some early growth. I think falling over once you hit a million or more simultaneous users can happen in a lot of ways, especially if growth happens faster than you can provision servers/funds.

You can write a fast server in Python 3 that can benchmark well against Node [0].

The author tried to sell us on the idea that Python is slow because Django is slow, and that you shouldn't use Ruby because Rails is slow. You should just use Node because it's fast.

Well I don't know about you but I've seen slow Node applications too.

It all comes down to data. If you really want to maintain your performance goals you have to include them in your design. You have to design for your data. Show me your data and I can write the program. Design it well and you can scale up when the time comes with minimal effort.

Besides there are other factors to consider such as familiarity with the tools, correctness of the implementation, etc. Javascript is great and all but it is incredibly easy to make errors that will go unnoticed until it hits production. The law of large numbers won't protect you if you're used to the blanket of obscurity. And so I find you need many more layers of tooling and choose your subset of the language carefully to maximize its use... something that you don't have to do as much with OCaml or Haskell for instance.

Hence, "just use Node because it's fast" is misguided at best. (And I'm not even a Node hater... I maintain a number of Node applications presently.)

[0] http://magic.io/blog/uvloop-blazing-fast-python-networking/

Agreed, the reasoning isn't sound... I was just meaning that node is a perfectly serviceable platform, and better than most by design. I've also seen some hideous node apps... generally because people don't really get how it works, and try to do heavy math in the main process.

Don't optimize prematurely, but avoid designing a system with performance bottlenecks that can't easily be cleared later.

Survivorship bias anyone?

How many times did a project fail because the non-functional aspects (e.g. scalability) were underengineered? How many times did one fail because it couldn't ship on time/budget due to excessive engineering? We do not normally read such stories, because they are totally unexciting taken separately. And one failed case of GoChat does not a worthy stat make.

Ultimately, good job to the guy for nailing a sweet spot between his skills and the market of the application created by those skills. Just do not assume that's everybody's sweet spot.

> But this would have been totally disastrous under any type of serious load. Even if I would have simplified the above query to only include three conditions/sorting operations, it would have been disastrous. Why? Because this is not how a database is supposed to be used. A database should query only on one index at a time, which is impossible with these geospatial queries.

> On the database side, I separate the snaps into a few different collections: all snaps, most liked snaps, newest snaps, newest valid snaps and so forth.

Pardon my ignorance, but don't most databases have some method of handling these issues?

(defining multiple indexes for use, having support for geospatial data, having support for like, subsections of the existing dataset, etc?)

I thought that the main goal was to offload the developer's code's logic onto the performant database, as opposed to offloading the database's logic and caching onto the developer's code? is the former not practical?

Yes, with Postgres (which I am most confident talking about), more precisely using PostGIS, you can do that in a matter of hours, using it for geo-queries and indexing for getting the important stuff (new, trending, etc.). Plus Postgres is supported basically everywhere, in any tech stack. I still don't get one point: why do people totally ignore SQL DBs by default with new products? I know MongoDB, RethinkDB, CouchDB, etc. are really fascinating solutions, but why eliminate SQL by default without even considering it? I am just curious.

That's only needed for the coordination... that said, it would be easy enough to segment channels based on a certain precision of geohash... messages sent target 9 channels, your current and neighboring channels... you subscribe to the channel you are in, and this updates every N seconds.

Channel position/calculation can happen client side, and subscribe/unsubscribe can happen server-side. Though that may leave room for unscrupulous behavior, it could be locked down a bit more by moving sub/unsub server-side.

The issue will be growth/routing/rerouting of channel data... even then, you can get pretty far with RabbitMQ backed socket.io ... you might need to custom create something before hitting 10M simultaneous users, which at current growth rate would be an issue anyway.

I think it's because the only really viable scaling option for Postgres is vertical scaling. Even just setting up any sort of replication with automatic failover is still a pain (multimaster is not yet built in, and master-slave also needs a third-party failover program...).

So, how large would his application have to get before that became a problem?

Replication with automatic failover? I'd go for it immediately, unless you are okay with long downtime and some data loss in case the server goes down.

But if you can live with that, then yes, you're unlikely to have actual scaling problems - at least not for projects like the OP.

Yeah, I don't get it either. Even without geospatial features, it's not too hard to put an index on (lat, lng) and then run a BETWEEN query using the minimum and maximum of the 4 given latitudes and longitudes. Need to also sort by likes/abuse reports? Add those to the composite (compound) index too.

No need to manage separate collections.
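The plain-SQL version described above can be sketched like this: collapse the 4 viewport corners to a bounding box and let one composite index serve the query. Table and column names are invented for illustration.

```javascript
// corners: array of 4 {lat, lng} objects from the client's viewport.
function boundingBox(corners) {
  const lats = corners.map(c => c.lat);
  const lngs = corners.map(c => c.lng);
  return {
    minLat: Math.min(...lats),
    maxLat: Math.max(...lats),
    minLng: Math.min(...lngs),
    maxLng: Math.max(...lngs),
  };
}

// Runs against an index like: CREATE INDEX ON snaps (lat, lng, likes);
const sql = `
  SELECT * FROM snaps
  WHERE lat BETWEEN $1 AND $2
    AND lng BETWEEN $3 AND $4
  ORDER BY likes DESC
  LIMIT 50`;
```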

Don't do that... calculate a geohash, and send to self and neighboring chat channels identified by geohash. Then you don't even need a lookup, only chat routing... auth/isolation/provisioning of connections is a related issue though.

Didn't know about geohash. So simple and obvious in hindsight! Thanks!

The two previous submissions have a few comments scattered between them - here are direct links to those comments:




Despite getting a few votes, neither of those submissions got any real attention first time round - no doubt pure chance that this one has got enough attention to hit the front page.

I read the story a while ago and was waiting for the criticism in the comments. Now one comment [0] already pointed out many of the issues of the article.

What's been mentioned in other comments but not explained in great detail is the database design, so I want to expand that:

The right way (TM) to do databases is to design a solid schema to keep data integrity and then apply indices and caches depending on your application needs. To be honest his application seems super simple to cache top-down, so a few lines inside the nginx config (which seems to scare him for some reason) would probably do. But if you use a real database (also TM) you can go bottom up, too:

1. solid schema with constraints

2. indices depending on your application

3. stored procedures, database views

4. some non-relational cache like MongoDB to cache denormalized data

5. maybe something in memory

6. (application)

7. nginx caching

He started with 4. What he did is not a solid database design to brag about; instead he hardcoded a cache inside his application. If he wants to scale his application vertically or horizontally he will have big problems, because he skipped the step at the beginning that contains the source of truth everything else is built upon. If he starts scaling up and then wants to change his schema, he is basically in hell.
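As a rough illustration of steps 1 and 2 above for an app like GoSnaps (the schema is invented for illustration, not the author's actual design):

```javascript
// Step 1: a schema with constraints, so bad data can't get in.
// Step 2: an index shaped by the one hot query (area lookup + sort by likes).
const migration = `
  CREATE TABLE snaps (
    id            bigserial PRIMARY KEY,
    image_url     text NOT NULL,
    lat           double precision NOT NULL CHECK (lat BETWEEN -90 AND 90),
    lng           double precision NOT NULL CHECK (lng BETWEEN -180 AND 180),
    likes         integer NOT NULL DEFAULT 0,
    abuse_reports integer NOT NULL DEFAULT 0,
    created_at    timestamptz NOT NULL DEFAULT now()
  );
  CREATE INDEX snaps_area_idx ON snaps (lat, lng, likes DESC);
`;
```

With this in place, the denormalized caches (steps 4-7) stay disposable: they can be rebuilt from the schema at any time, instead of being the only copy of the truth.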

What he did is nothing bad. It is exactly "the MVP way". MVP is not about slow or buggy software but about a really small feature set and applying YAGNI. MVP is nothing bad, and he seems to have great success with it! What I am criticising is not how he built his software but what he wrote about it, comparing it to a much harder case and claiming it has something to do with good design.

[0]: https://news.ycombinator.com/item?id=12135748

>If I would have used something [..] Python with Django [..] I would have been spending my days on fixing slow parts of the app

>GoSnaps uses NodeJS as the backend language/platform

Is NodeJS really that much faster than Python in practice -- even with a fast framework (Falcon, pycnic, hug.rest) and PyPy? I know a lot of work has been put into making V8 fast, but I didn't realise it was notably faster than Python.

To me it seems like he's talking about plain CPython, which isn't even in the same class as Node.js/V8. As you mention, the correct comparison would be against PyPy.

You could have built most of the data side entirely statically. First convert the user coordinates to simple mercator XY. Then divide or round that down to some precision and put the resources in a namespaced S3 bucket/path. Then just do a directory listing on resources in that bucket. You could even name them with the full-precision XY coordinate so you could still sort by distance within the bucket.

Let S3 be your database.

You don't need the full precision of a geospatial query or database if you're building a simple app that organizes content by location. Depending on your density, you segment every few hundred or few thousand meters.
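One way to sketch the "let S3 be your database" idea: map a coordinate to a standard web mercator tile at a fixed zoom and use the tile as the storage prefix. The bucket layout and zoom level here are invented for illustration.

```javascript
// Standard web mercator tile math: coordinate -> integer (x, y) at a zoom.
function tileForCoord(lat, lng, zoom) {
  const n = 2 ** zoom;
  const x = Math.floor(((lng + 180) / 360) * n);
  const latRad = (lat * Math.PI) / 180;
  const y = Math.floor(
    ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n
  );
  return { x, y };
}

// One "directory" per tile; an S3 prefix listing on it is the whole query.
function resourcePath(lat, lng, zoom, id) {
  const { x, y } = tileForCoord(lat, lng, zoom);
  return `snaps/${zoom}/${x}/${y}/${id}.json`;
}
```

Everything near a given user then lives under one prefix, so "show snaps around me" becomes a `ListObjects` call with no database at all.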

Thank you... was my thought for channels as well, geohashes of varying precision for chat channels, and for more static resources using something like you suggest.

This reminds me of a quote from Biz Stone, Twitter founder: It takes ten years to become an overnight success.


>GoSnaps grew to 60k users its first day, 160k users on its second day and 500k unique users after 5 days (which is now)

How did you market the app ?

No need to market. In most countries Pokemon Go was already popular, but the app was not officially available in the Play Store/App Store yet. So millions of people searched the stores for Pokemon Go and downloaded chat apps, tutorials and apps like GoSnaps instead.

So your initial user growth was from people who weren't actually able to contribute to the app?

There are APKs available outside of the app store, and here in my office in India almost all of my colleagues are busy capturing Pokemon.

I'm amazed that Google, particularly being involved with Niantic in the app itself IIRC, doesn't police trademarks on major brands - there were a tonne of apps that said "Pokemon Go" in the title. I'm sure Niantic would have paid to not have them listed, dragging down the brand before people could even install the genuine app.

In the UK I looked for the app, saw a load of fakers and avoided them but I could see others [children particularly] installing a lot of those apps just trying to find the right thing (instead I installed Ingress to get an idea of how it might all work).

Seems like this sort of inability to find trademarked apps in the app store makes it look far lower quality overall.

What would have been ideal as a user would be a "get Pokemon Go when available" option that would have allowed Niantic to roll it out to my Android device and manage downloads a bit more progressively.

That's why the number of retained active users is more important than pure downloads. If I've downloaded something similar (wallpaper, fake, etc..) by mistake, the very next step is to uninstall it immediately.

Reading this leaves me doubtful about whether to use MongoDB again:



Do people at big startups use MongoDB in production?

This seems more like a limited-lifetime side project rather than an actual viable startup. I've used MongoDB for situations like this where I just need to stuff some simple data into a datastore and retrieve it later without any particularly complicated queries.

Personally, if I were the author, I'd switch over to PostgreSQL the second I start worrying about more complex queries, though.

I think being schema-less plus excellent performance out of the box gives MongoDB an edge. You also don't want to discard advanced features like built-in map/reduce in JS, chaining of commands, the ability to have nested objects (think tables inside tables), the ability to do atomic modifications to objects, etc.
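The "atomic modifications" point can be sketched like this: one update document bumps a counter and touches a nested array in a single atomic operation, with no read-modify-write race. Collection and field names are invented for illustration.

```javascript
// Build a MongoDB update doc: atomic counter bump + de-duplicated array push.
function likeUpdate(userId) {
  return {
    $inc: { likes: 1 },              // atomic increment
    $addToSet: { likedBy: userId },  // nested array, no duplicates
  };
}

// With the mongodb driver this would run as:
//   await snaps.updateOne({ _id: snapId }, likeUpdate(userId));
```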

Surprised he didn't talk about dedicated/colocated servers. For $100/month he could have had an E3-1231v3, 32GB of RAM, 2x480GB SSDs and unmetered gigabit bandwidth from OVH.

Instead he paid $100 for 4 hyperthreads, 15GB of RAM, a few GB of storage and fast but horrendously expensive bandwidth (assuming he used the n1-standard-4, which matches his description).

If he'd set it up to scale the number of servers with load or something it'd make sense, but this doesn't make any sense at all.

Pff that's nothing. I could build a fake app, in less than 1 hour, for $100 and get about 5 million downloads in 1 day.

Step 1. Find some opensource app code

Step 2. Call it Pokemon Go 2!

Step 3. Upload it to Appstore & link it to dropbox

Step 4. Spend $100 on African "talent" to give fake 5 star reviews & positive comments in app store.

Step 5. Hit F5 repeatedly at Appstore to watch the download counter increase to 5 million in 24 hours.

Step 6. Profit ?!?!

Step 7. Post story in /r/nosleep because too much guilt fooling 5 Million people.

Yeah, if you go to the Play Store and search for Pokemon Go you get hundreds of people making spin-off type things and guides. Reminds me of ambulance chasers and Art Van salesmen. Just knowing that they make tons of money off their overnight app and I make st off an app I spent a lot of time on is discouraging.

What's the dropbox part for?

Why spend $100 on African talent when you can spend $5 on Indian talent?

Do it?

Our company migrated to elixir 2 months ago. We have 2M users per server at $20/month.

I would be interested in more info about what kind of product you are working on in elixir. Is it chat-related or not at all?

2M active users on a $20 server?

Would love to hear more about what you're doing.

I just don't get how he goes on and on about uploading images to cloud storage instead of MongoDB, which he makes sound like a stroke of genius.

Is it just me, or is what he's telling us rudimentary?

I'm fairly sure that was the first thing I learned after the basics of how to connect, create, read, update, delete:

"Don't store images in the database. No, not even then."

In both cases the developers have committed to a pretty big monthly payment for an app that serves hundreds of thousands of users.

$4,000 is a huge amount, and even $100 monthly is a lot to spend out of pocket without a plan to recover that money. Do they have any plan for making money from these sites, or are they purely CV/portfolio pieces?

I'm less interested about the technology and more interested in whether he has a plan for monetizing all those users.

… and earned $0.

Yeah, although building something with 500,000 users gets you a lot of attention, and probably a lot of free marketing for his actual startup.

The idea of splitting data into different collections up front is pretty smart. To generalise it into a broader lesson, I guess you could say it makes sense to make a one-time effort up front to save complexity down the line.

It is true that GoChat doesn't need to be that expensive to maintain, and his analysis is pretty much correct (I've maintained something with a similar amount of traffic and similar dynamics, and it didn't cost me an arm and a leg, far from it. It's amazing how cheaply you can start a company nowadays).

But no need for bashing someone else. These things are a fad so this GoSnaps thing will probably go the same way as GoChat anyway.

Solid read, good basic thinking with regards to scalability via basically prefiltering data except for the one query you need to run at runtime.

It's a bit strange that the author mentions Scala as lean/fast with lots of libraries (along with JS and Go) but Java is too bulky. I'd say modern Java 8 can be used in a pretty lean manner. There's also nice and small web frameworks (Spark etc.).

Here is my take on this:

1. MVP vs. scalability: While building a scalable product right from the MVP stage is generally a good idea, it may not be particularly beneficial or applicable in most scenarios. I mean:

a) how many typical startups happen to scale to 500k or 1M users within days from launch?

b) most founders would be needing an MVP mainly for market validation, as a proof-of-concept and for the purpose of attracting seed/startup funding

c) many founders - especially non-coders - may not have the luxury/resources to have scalability built in to the MVP

2. The original story reconfirms my belief, based on multiple past experiences going back many years, that the database continues to be a (huge) bottleneck for web apps with high traffic volumes, and that maximum possible database optimization (from config tune-up to table structure design/normalization to query optimization) can pay huge dividends in most cases.

There's a lot of flak over the poor comparison to photos and the self-promotion. However, his overall point is still true: just putting a little effort in upfront, with the assumption you will succeed, can prevent these problems. My baseline for evaluating this is "Did they do at least as well as someone who spent 30 seconds on Google?" Short version: doing better wouldn't have required a ton of thinking.

Here's what 30 seconds Googling "highly-scalable chat architecture" gave me:




Note: I'd like to have seen numbers from a field test of the above in the article. Still, it would've gotten someone thinking.



Previous times doing this for web services led me to highscalability.com, with many architectures to imitate and fairly mature software components available. At this point, the common ones should practically have "enter expected metrics here" templates, then click to deploy.

If anyone has more tips about how to get 500,000 users in 5 days, I'm sure we would all like to hear them.

Attach yourself to something viral...

Of course, getting 500,000 users is nowhere near the same thing as keeping 500,000 users.

Actually not bad advice. Don't try to create the waves, ride the waves.

For me, this article is very reassuring.

My server used to be at <0.8% CPU usage. Now that I've installed MongoDB, with almost nobody using my app (< 2 people per hour), my CPU is always at ~1.6% (it doubled because of MongoDB!). At first, I feared that my CPU use would be enormous as soon as I got new users. Now I guess my CPU increase is due to some overhead that will not grow too much with DB size/use (if the author was able to make an app of this scale with MongoDB). I'll also try the lean() mongoose thing.

I'm all for critiques of software and how to improve work, but do we need to rag on the guy who made GoChat? Looking at the project, it was clear it was a single guy or a couple people, working to put out a project for experience.

It's poor form to self-aggrandize and say "move fast, make MVPs, etc", and then write a post pointing out over and over how people messed up, when they were trying to move fast and make an MVP.

Everyone is talking about the technical feat, but the real insight for me is that you should hitch your wagon to a rocket ship.

Pokemon is a rocket ship right now, and any new app has this enormous exposure advantage.

It is also important that it scale well or else you'll squander your advantage.

For what it is worth, I was very impressed by the technical stuff. (It is making me laugh to read about how disappointed others are. I feel like I missed something.)

This is the only real takeaway here, as far as I can see - If you build an app that attempts to associate itself with another extremely popular app, then your app will probably be a little popular too.

This is nothing new or insightful, but the author sure found a way to make himself feel good about it.

I'm going to use Cassandra for part of my application - the bit that might conceivably be unperformant and very difficult to cache - even though it'll take a few extra days now to get working versus using Postgres. I'd rather just do this at the start than have to migrate a write-heavy, core part of the app's functionality while live.

Most cloud providers have a BigTable-style solution already; though not the same as C*, it would let you build the app instead of provisioning/configuring a Cassandra cluster.

Depends on your application of course, but if you're self-indexing anyway, may as well lean on your environment (given a proper backup/exit strategy).

You are spot on here, actually - I'm using AWS, so I might as well go straight to the original, DynamoDB, then.


Why not just go straight to ScyllaDB or RAMCloud :)

> I personally love Erlang and would never use it for an MVP, so all your arguments are invalid.

Could anyone elaborate on the point the author was trying to make here? Is it that Erlang doesn't have many pre-existing libraries (for building an MVP), or that it is not fast enough (or something else)?

In all likelihood it is about the relatively lesser abundance of high-quality, ready-to-use building blocks in the Erlang/Elixir ecosystem vs. NodeJS or RoR. In my opinion, Node and its associated ecosystem will get you to your MVP faster, but Erlang/Elixir will provide a more solid base to build and scale a commercial application of this kind; it's what they've been designed to do.

If I had to guess, I'd say the author (although loving it) is not familiar enough with it to produce a working product fast. I have done it several times and can confidently state that Erlang (and especially Elixir) is very viable for that.

The more interesting question is how did these apps get their users in the first place?

Great job! One thing I noticed is that your app always requests my location (even when the app is not in the foreground). It seems like you would only need my location while I am in the app.

You might get a higher acceptance rate.

Did you use Google App Engine for this with a third-party MongoDB provider/your own server? If it was GAE, what was it like using it? Was there anything you didn't like?

I'd be curious to know what kind of image recognition software the author used to detect relevant images and if it came with a significant performance hit.

I think you could do it very fast. It's just checking for the presence of static elements in the image, and you don't even need to check every pixel. You would only need to decode the JPEG and then check that maybe 30 pixels are the correct color (with a small tolerance). Also, you know that every image is coming from an iOS device, so you can throw out any images that don't match a specific resolution, and you don't need to do any resizing or anything. I'm surprised he didn't talk about resizing images on the device before uploading.

Pokemon Go has many static elements in the UI (a Pokeball front and centre being the most notable) -- I'd guess matching against a static template would be pretty fast. (Not my field, though.)
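The pixel spot-check idea above can be sketched like this: decode the screenshot to raw RGBA (with any image library), then compare a handful of known UI pixel positions against expected colors within a tolerance. The probe positions and colors here are invented, not Pokemon Go's actual UI.

```javascript
// rgba: flat buffer, 4 bytes per pixel, row-major order.
// probes: [{ x, y, r, g, b }] pixels that must roughly match.
function matchesTemplate(rgba, width, probes, tolerance = 30) {
  return probes.every(({ x, y, r, g, b }) => {
    const i = (y * width + x) * 4;
    return (
      Math.abs(rgba[i] - r) <= tolerance &&
      Math.abs(rgba[i + 1] - g) <= tolerance &&
      Math.abs(rgba[i + 2] - b) <= tolerance
    );
  });
}
```

A few dozen probes is O(1) work per image regardless of resolution, which is why this kind of filter adds almost no server load.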

Basically, the author knows that best practices are truisms and require common sense to apply.

Is there any way to simulate high load (Millions of users) for testing?

Aw... it's a $100/mo server, not a $100 server.

His conclusion about Doctrine and other ORMs eating CPU and being the huge bottleneck in the app lines up with my experience using the same. The MVC framework itself (Symfony/Rails in his case) can indeed also be a huge bottleneck, though much less than the ORM yet more than the DB calls themselves. That too has often been my experience.

