
Three Months to Scale NewsBlur - conesus
http://blog.newsblur.com/post/45632737156/three-months-to-scale-newsblur
======
conesus
And I just want to remind the technical crowd here that NewsBlur is 100% open-
source: <http://github.com/samuelclay>

I've been asked quite a few times why I open-source the code. The answer is
simple. Let me use an example from Joel Spolsky.

(From <http://www.joelonsoftware.com/articles/fog0000000052.html>)

    
    
        The dominant spreadsheet, with 100% market share, is Lotus 123. You're the
        product manager for Microsoft Excel. Ask yourself: what are the barriers to
        switching? What keeps users from becoming Excel customers tomorrow?
    
        ... <snip: barriers to entry> ...
    
        That's the barrier to entry. Not how hard it is to switch in: it's how hard it
        might be to switch out.
    
        And this reminded me of Excel's tipping point, which happened around the time of
        Excel 4.0. And the biggest reason was that Excel 4.0 was the first version of
        Excel that could write Lotus spreadsheets transparently.
    
        Yep, you heard me. Write. Not read. It turns out that what was stopping people
        from switching to Excel was that everybody else they worked with was still using
        Lotus 123. They didn't want a product that would create spreadsheets that nobody
        else could read: a classic Chicken and Egg problem. When you're the lone Excel
        fan in a company where everyone else is using 123, even if you love Excel, you
        can't switch until you can participate in the 123 ecology.
    

If you know that in the absolute worst case you can still use the product
even if it's shut down, then by golly, you have even less of a reason not to
switch to it.

~~~
mynegation
I tried to install NewsBlur, but the instructions were not very helpful. I tried
to get as far as I could, editing the fabfile and installing gajillions of new
python modules, but still failed.

Don't get me wrong, it is wonderful and very commendable of you to open source
NewsBlur! Nor do I feel entitled to the source code or free ride (I will
probably end up paying for the hosted solution). Just providing some feedback,
given that you indicated that low barriers to entry are important to you.

I am not a web developer, so that can partially explain my struggle with
fabric, Django and the multitude of webdev and devops modules that you use.

I wish you good luck and I hope that you take this wonderful product even
further.

~~~
conesus
This is something I'll be working on soon enough. There are many, many
dependencies (not only for the web app, but there are three DBs that you have
to have installed as prerequisites - mongo, postgres/mysql, and redis).

The real problem with setting up your own instance of NewsBlur is that you'll
have to do your own feed fetching. This is effectively what you're paying for
when you pay for NewsBlur. Let me break it down.

500 feeds updated once every five minutes is 144,000 feed fetches a day.
Couple that with the original page and icon fetches, you're looking at almost
two feeds every second just to keep your feeds up to date. And the average
feed takes 5 seconds to completely update (feed, page, differences in stories,
updating unread counts). So you have to run 10 processes in parallel, and
that's already beyond the capabilities of most single machines. Then good luck
keeping your DBs running clean and backed up.

Or you could pay $2 / month and never have to worry about setting up a very
big stack. And you get the social community of shared stories on newsblur.com.
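
The feed-fetching arithmetic above can be checked with a rough sketch. The function name and parameters here are made up for illustration, and the model only counts the feed fetches themselves, so it lands slightly under the "10 processes" figure, which also allows for page and icon fetches.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def fetch_load(feeds, interval_minutes, seconds_per_fetch):
    """Return (fetches per day, fetches per second, parallel workers needed)."""
    # Each feed is fetched once per interval, all day long.
    fetches_per_day = feeds * (24 * 60 // interval_minutes)
    fetches_per_second = fetches_per_day / SECONDS_PER_DAY
    # Work in flight = arrival rate * time spent on each item.
    workers = math.ceil(fetches_per_second * seconds_per_fetch)
    return fetches_per_day, fetches_per_second, workers


per_day, per_second, workers = fetch_load(500, 5, 5)
print(per_day, round(per_second, 2), workers)  # 144000 1.67 9
```

So 500 feeds at a five-minute interval is 144,000 fetches a day, about 1.67 per second, and at 5 seconds per update that keeps at least nine workers busy before any page or icon fetches are added.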

~~~
duncan_bayne
Three DBs? I don't suppose you have a blog post or something that explains
your architecture decisions? Not criticising - I'm sure you had good reasons -
but I'm rather curious as to what they were :)

~~~
BoyWizard
Based on the Github page, I would imagine Redis as a cache, MongoDB to store
the 'non-relational data' (quoted from the GH page - I'm guessing the actual
articles and such) and Postgres for the other application stuff - accounts,
preferences, subs, etc.

~~~
ceol
[https://github.com/samuelclay/NewsBlur/blob/master/apps/rss_...](https://github.com/samuelclay/NewsBlur/blob/master/apps/rss_feeds/models.py)

It looks like MongoDB is being used for fetch/push history, feed icon data,
and article data. It's also being used for some other stuff:
[https://github.com/search?l=Python&q=%22mongo.Document%2...](https://github.com/search?l=Python&q=%22mongo.Document%22+repo%3Asamuelclay%2FNewsBlur&ref=advsearch&type=Code)

------
recuter
I'd just like to point out that '$24/yr * high four-figures of paying
customers' with only one employee ensures NewsBlur will be around for a long
time. :)

This is the kind of single founder "life style" startup that VCs would
pooh-pooh, that is Doing It Right, and that many of us aspire to. The other
ones I can think of are Pinboard and Instapaper. I'd rather go this route,
personally. Here's to more like these. Cheers.

~~~
MatthewPhillips
Indeed. If one were wise, you could stalk Google/Yahoo/BigCo properties that
are being neglected, launch a paid competitor and reap the reward when they
shut down.

Pinboard and now Newsblur.

~~~
boyter
That's actually a really good idea.

I wonder what would be next? Most of Yahoo's properties seem to be neglected
at the moment, but what about Google? Google News perhaps?

~~~
MatthewPhillips
Probably Google Alerts. It's rather niche but people who use it absolutely
rely on it. But the VC-funded Mention seems poised to take that one.

~~~
MiguelHudnandez
Google Alerts is a good candidate. Pardon anecdotal evidence, but the people
in my circles who rely on it are executives who watch for their own names, or
brand managers who watch for coverage of their company.

This information is very valuable and I'm sure a good service could charge a
nice premium.

~~~
darkspaten
I've received some deliverables from a marketing team which uses the following
service for just such a purpose (at a nice premium) @
[https://www.meltwater.com/products/meltwater-news/online-med...](https://www.meltwater.com/products/meltwater-news/online-media-monitoring/)

------
rodgerd
> The inevitable file descriptor limits on Linux means that for
every database connection you make, you use up one of the 1,024 file
descriptors that are allocated to your process by default. Changing these
limits is not only non-trivial, but they don’t tend to stick.

You're clearly a talented programmer, but perhaps employee #2 should be a
sysadmin.
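
For reference, a process can inspect and raise its own open-file limit with Python's standard `resource` module (Unix only). This is a minimal sketch, not what NewsBlur does: the helper name `raise_fd_limit` and the target of 4096 are assumptions for illustration.

```python
import resource


def raise_fd_limit(target):
    """Raise this process's soft open-file limit toward `target`.

    Returns the (soft, hard) pair in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unprivileged process may only raise the soft limit up to the hard limit.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft, hard  # already high enough, nothing to do
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    except (ValueError, OSError):
        pass  # the platform refused; keep whatever limit is in effect
    return resource.getrlimit(resource.RLIMIT_NOFILE)


print(raise_fd_limit(4096))
```

A change made this way applies only to the calling process and its children, which may be part of why the limits seem not to stick. Persistent settings live in system configuration instead, for example `/etc/security/limits.conf` on many distributions.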

------
klox
"Luck is what happens when preparation meets opportunity"

Congratulations.

This is why I love SaaS. 6 figures revenue in a couple of days, and I can only
imagine how much more will be made in the next 3 months... and beyond.

But it's not a once-off payday. This is recurring. 7 figure annual income is
yours, Sam. People slave away for decades trying to reach a tiny fraction of
this amount. Put it to good use, brother.

~~~
LargeWu
"6 figures revenue in a couple of days"

Those couple of days took 4 years of preparation.

~~~
klox
Yes. Which is why the first line of my comment literally reads "Luck is what
happens when preparation meets opportunity".

------
sdfjkl
That was surprisingly clueful of PayPal. Usually that part of the story is way
more dramatic and ends up with alternate payment options.

~~~
dangrossman
Only in the stories. The majority of situations where PayPal calls probably
end just like this. I've taken that call myself and didn't have a problem. The
stories we read are almost all about people doing things they shouldn't with
any payment processor; that's why it only gets worse when the processor learns
the details.

~~~
lucaspiller
Agreed, when the company I work for was just one guy, he had the same thing.
They put a hold on funds (he had a lot of declined payments, nothing dodgy,
just users not knowing how to use company credit cards), but once they'd
checked all the paperwork they released it after a few weeks.

------
Maven911
Hi, so I have noticed that you are very forthcoming with data, such as
providing the number of premium users on your webpage, and I hope you continue
this - I really hope you become more successful with this Google Reader debacle
going on.

1) Why are you open about the number of users? (personally I think it's great)
2) How do the real-time stats work? I notice that the number of regular users
was over 20k a few days ago, and now it's 8k. Are these the users who have used
the client/service in the past 24 hours? (and not the actual total number of
users)

~~~
DuskStar
From another comment, those numbers appear to be the new signups in the last
24 hours - and as the free option became much harder to find recently, I would
expect the number of free signups to decrease dramatically.

~~~
Semaphor
> and as the free option became much harder to find recently

That's only temporary though.

------
steve19
Will you be adding a Google Reader compatible API?

(So existing Google Reader apps can quickly switch to NewsBlur)

~~~
nicw
You can do this currently. My feeds were quickly ingested and everything
worked great.

~~~
Semaphor
In case you wonder why you are getting downvoted: He is asking about a
compatible API for 3rd party apps (Reeder, gReader and so on) to be able to
connect to newsblur just by changing the API URL.

~~~
nicw
Ah ha, thanks :) Woke up this morning and was wondering why. Was a late night
post.

------
tempestn
A tad off topic, but I'm hoping the fastest way to get an answer: How does one
move or delete multiple feeds at once in newsblur? There must be a way to do
this, right? Say I import 100 feeds from an OPML file, then realize I actually
want them all in a folder. At the moment the only way I can figure out how to
do that is to click on each feed, click move to folder, click the folder
dropdown box, click the folder I want, click save. Times 100. Or I can save
one click each by deleting them all then re-importing. But there must be a
faster way. (In Google Reader settings, one could filter all folders with a
keyword pattern, and bulk move.)

~~~
Semaphor
Heh, wondered that myself yesterday after I realized my OPML contained old
feeds for some reason. Reimporting feeds will delete your current feeds :)

~~~
tempestn
Hmm.. that would be helpful. Unfortunately it looks like there are some bugs
to iron out. Twice in a row now, I've created a folder, then in the folder
menu gone to add feeds to that folder, imported an OPML file, and... the folder
disappears and the feeds go into the root.

------
qoo
"I had been preparing for a black swan event like this for the last four years
since I began NewsBlur."

And then the black swan event comes.

And then fire everywhere.

~~~
jbackus
Same paragraph:

> I did not expect it to come this soon.

------
mynegation
I wonder what Samuel's attitude towards taking external financing is. Because
NewsBlur is having a hell of a momentum on Google Reader news (pun not
intended).

~~~
recuter
Well now that he has a graph pointing up and to the right he can _certainly_
raise money and collect developers as if they were Pokemon and play games with
the valuation in the hopes that somebody Acquihires Newsblur or buys them for
some opaque strategic reasons.

Maybe Google will buy them! :)

Edit: I kid, but that's actually a possibility in my mind and would be an
ironic outcome. Maybe Facebook or Yahoo or somesuch will see the value of
having the New Reader. Selfishly, I am hoping for something else.

~~~
rdl
IMO the best acquirer would be someone who wanted credibility among high-end
technical users or journalists, not someone who wanted to turn it into a mass-
market thing.

If I were Palantir, I'd probably drop $250k/yr to run the World's Best RSS
Reader for the kind of research analysts who become Palantir customers, and
the kind of geeks who make awesome Palantir employees. There are maybe a
hundred companies you could substitute for Palantir here.

Seriously ironic that both of the likely Reader successors are YC funded
(Feedly and Newsblur)

------
TheTaytay
Sam, congrats!

For a task as parallelizable as fetching feeds, I can't recommend Picloud.com
enough. You can ssh into a box, provision it however you want, save that image
off, and then run arbitrary commands/scripts on instances of that image, on
the CPU type you want, and pay only for the seconds of usage you have. (Their
"s1" core type is $0.04/hour: <http://www.picloud.com/pricing/> ) They also
have a system that lets you mount the same shared "drive" from multiple
instances at the same time, so if you're doing file-based stuff, it's easy.

The thing I like about it is that running another instance of your script on a
new machine (or 2k of them) is trivial. No need to wait to provision a new
VPS. Starting and stopping jobs is fast, so scaling up/down is fast.

(I swear I'm not affiliated with them. I'm using it for a side project at
the moment and have been so excited about it, I want to spread the word.)

------
zheng
It must have been an intense mix of fear and excitement trying to fix anything
and everything that broke, but NewsBlur deserves it IMO. It is a great
product, and I think this post makes it clear that he cares about his users'
experience.

------
johncoltrane
I registered progrss.net a few years ago in the hope of creating a "smart" RSS
reader that could help me (and others, possibly for a fee) save time by
passing my feeds through easy to set up filters. The idea sounded good, the
mockups and the prototypes looked good, it was fast and easy to use, the
UX was distinctly different from all the others… but I grew tired of RSS as a
whole and of working on progrss.net. Now I don't use RSS anymore and my project
was already pretty much stalled when NewsBlur was first announced here.

The day Google announced they retired Reader was "missed opportunity day",
over here.

Anyway, good luck to you.

------
mamcx
Do you have an API? I was building a news reader for iOS when !kaput! Google
Reader (I was in the middle of integrating it). Now I wonder what to do next.
I'd love to support other "small" players like you...

~~~
conesus
You bet I do: <http://www.newsblur.com/api>. I would love to hear about your
ideas. I feature a number of third-party extensions and apps written against
NewsBlur's API on the Goodies page.

------
habosa
Can I possibly buy a month or two at a time? I want to try it out without
buying a whole year. I know that's cheap of me, but I'll probably be trying a
few RSS readers in the near future.

------
niggler
Congratulations on being well-positioned and ready to handle the onslaught
when Google announced that they were axing Reader. You clearly were "where the
puck will be" :)

------
electic
I tried out NewsBlur and sadly, Feedly seems leagues better.

~~~
jbigelow76
I looked at Feedly and Newsblur last week as I look for my eventual Google
Reader replacement.

Feedly turned me off because it has no open web implementation, it runs
through the Chrome web store. Aside from some developer related plugins I
refuse to use a completely superfluous wrapper around a website if I am on my
PC.

I've tried Newsblur but it's just so "busy" (I know where my mouse is on the
vertical axis thank you very much). I may warm to it, but I'll probably end up
with something a little simpler. But I wish all the success in the world to
conesus, a variety of small, customer focused, customer capitalized
independent web service vendors is a good thing for all of us.

~~~
conesus
You should lock that "dorito" that follows your mouse. It allows you to adjust
the currently selected story. It's an indicator line. Maybe I should lock it
by default and allow you to click on it to have it follow your mouse...

~~~
UberMouse
I wasn't aware you could do this. Locking it by default seems like it would
make more sense.

Also while I've got you, two issues I've noticed.

1\. When I click "All Stories" to refresh my feeds it defaults to listing
everything, even though I have it set so I only view one item at a time. As
soon as I click the button/arrow to go to the next unread item it shows the
first item as it should and everything works fine after that.

2\. I seem to run out of items after I've viewed ~20ish and I have to click
"All Stories" again to fetch a new bunch

I'm using the dev site if that matters (Issues happen on both sites though).
So far these are the only things annoying me and otherwise newsblur is pretty
good.

~~~
conesus
1\. Intentional. Otherwise you'd stare at a blank screen until you click a
story.

2\. That's an unfortunate bug, but one that will probably get sussed out when
I work on scaling over the next month.

Dev is largely just a re-skin. They share a backend.

~~~
UberMouse
Why can't there be an option in the case of 1. for it to automatically open
the first unread story when I click "All Stories"? Basically what I am doing
by pressing the first unread button.

I can understand not doing that by default since people may want to see their
unread feeds, but I always just go to the first unread one anyway.

~~~
conesus
In that case, there's a preference waiting for you that you're going to just
love...

~~~
UberMouse
Haha, excellent, thank you for helping me. Also "Text" view seems to be
missing from the Default View preferences, is this intentional?

------
LargeWu
Waiting for the day that NewsBlur becomes so successful that they get acquired
by Google.

------
carterschonwald
Props on having an exciting validating road over the next few months!

Likewise, what are some of the engineering things you'd really like some of
the more dev heavy users to maybe help via pull requests with? :)

(I see that the github issues list is relatively short)

------
momonga
Thanks for providing this service and having such great communication with
users!

------
tehwalrus
well done with keeping this app running! I switched from Reader as soon as I
noticed, and I've been reading my feeds through NewsBlur the last couple of
days with no problems (after an initial setup delay)!

I have a couple of comments on how the UI works, but I'll consider those as
time goes by and send you more specific feedback then (mostly I'm annoyed by
things that work differently to google reader, so I'll hang back and see if
they convince me...)

------
jpwagner
So far I am not a user of ANY reader. Between googlenews, twitter, and HN I
get my daily digest.

I would like to try Newsblur but it seems that to get immediate value I have
to have been some reader user: I can migrate my googlereader account or upload
an OPML file.

What I would like to see are different versions of OOTB "skins". I can sign up
as an "HN-lover" and it seeds my account with 50 blogs I should follow. Then I
can actually evaluate the product.

Also note that free accounts are not actually disabled. If you leave the
session when you hit the paywall and login from the homepage, you now have a
free account.

~~~
conesus
As I mentioned on Twitter, if you want a free account but can't get around the
paywall, just know that the paywall is _extremely_ leaky on porpoise...

~~~
saraid216
> just know that the paywall is extremely leaky on porpoise...

This is my new favorite typo.

~~~
mkr-hn
I don't think that's a typo.

------
Maven911
If I understood the numbers correctly, NewsBlur has gone from 1500
paying/50000 free users to 6500 paying/110000 free users?

------
rdl
I'm interested in your decision process in picking Reliable and then Digital
Ocean.

~~~
conesus
Originally price, now price + API for deploying new instances. If you check
out my fabfile.py script you'll see that I have EC2 set up and ready but I
don't use it because it costs more, runs slower, and is a pain in the butt to
maintain.

~~~
rdl
Oh, I meant vs. spending $20k on hardware, or partnering with someone like
Softlayer/Rackspace as a promotional thing (I suspect either would be willing
to give you a great deal to use as a marketing case)

~~~
jcampbell1
My guess... DO offers hardware in 60 seconds. Also cheaper/faster than EC2.

------
randall
I want to buy! Take my money! But your site errors out after CC deets. :(

------
btbuildem
Congrats! Enjoy the ride..

