

We Just Undid Three Months of Dev work. Here's What We Learned. - acl
http://blog.scoutapp.com/articles/2009/10/06/we-just-undid-three-months-of-dev-work-heres-what-we-learned

======
stingraycharles
I think the real lesson to learn here is to never assume anything. They
assumed performance wouldn't be a problem, so they didn't test for it when
prototyping, and they didn't make those tests part of their development cycle.

These things are tough, especially when you're talking about a feature you
personally appreciate a lot. The article mentions the performance problems
drawing focus away from actively marketing these solutions, which makes me
wonder: if properly marketed, would this have been a killer feature?

The fact that they decided to get rid of it suggests no. However, are they
putting the code in the freezer for a while until they fix these issues and
re-release it? Or is it simply a problem that can only be properly solved by
giants like Google?

~~~
acl
_They assumed performance wouldn't be a problem_

We did pretty extensive performance tests, but not for long enough. We load
tested for hours at a time, but the problems only started to show up after
days of production load and really compounded after that. It's probably a
topic for another post, but the performance of database tables with a lot of
churn (rows being frequently inserted and deleted) really degrades over time.
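
To make the churn pattern concrete, here's a toy reproduction in Python. It uses sqlite3 purely as a stand-in for MySQL, and the table and numbers are invented; the point is the insert/delete pattern, not the engine:

```python
import sqlite3

# Hypothetical high-churn metrics table (sqlite3 as a stand-in for MySQL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (id INTEGER PRIMARY KEY, value REAL)")

for batch in range(100):
    # A constant stream of new datapoints...
    conn.executemany("INSERT INTO metrics (value) VALUES (?)",
                     [(float(i),) for i in range(50)])
    # ...paired with archival deletes of the oldest rows.
    conn.execute("DELETE FROM metrics WHERE id IN "
                 "(SELECT id FROM metrics ORDER BY id LIMIT 40)")

# The live row count stays modest, but the key space (and, in a real
# engine, on-disk fragmentation) keeps growing -- the kind of behavior
# an hours-long load test can miss and days of production load exposes.
count, = conn.execute("SELECT COUNT(*) FROM metrics").fetchone()
max_id, = conn.execute("SELECT MAX(id) FROM metrics").fetchone()
print(count, max_id)  # 1000 5000
```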

~~~
stingraycharles
Guess that's my bad for assuming you thought performance wouldn't be a
problem. :-)

------
megamark16
I wish I could send this to my boss. The functionality I'm working on right
now was requested by a single client and is really only applicable to their
unique billing situation. Why am I spending so much time on something that is
going to have so little payoff, and yet which adds so much additional
complexity to the system? But I've only been here a month and I don't feel
like I can cast a dissenting voice. Maybe I'm just a coward.

~~~
chriskelley
Being vocal about imperfections in the development cycle and suggesting
improvements that matter to the bottom line is not dissent.

It's important to have open communication about this very topic - it improves
the pipeline and keeps margins where they need to be. Be confident and
prepared with data, and make yourself an asset to your company, not just a
keypusher! :)

~~~
cmos
When discussing the merits of this feature, be open-minded: you may not have
all the information they had when deciding to implement it. Perhaps it was
heavily requested by the sales department or by other customers further up the
pipeline. Maybe the client threatened to take their business elsewhere.

It's always less obnoxious to approach potentially illogical situations by
giving someone the benefit of the doubt. So, while you collect your data, also
do some brainstorming on areas where the feature you're adding makes a ton of
sense, and might even open up new sales channels or markets.

Instead of assuming there's been some catastrophic mistake you must remedy,
and instead of assuming that management puts no value on your time, try
assuming they have good reasons that, especially since you've only been there
a month, may not have been fully explained to you. Quite often the problem is
communication, not intent to waste money.

In the end, if all your effort is for seemingly insignificant return, smile
and do an amazing job. Repeat until you grow weary and embattled, then leave
for another job with a different set of problems.

~~~
chriskelley
Absolutely. It's certainly something that needs to be a dialogue, not at all
"I know something you don't, here is why you are messing up."

There are always circumstances affecting decisions that you may not know of.
This is one of the reasons I always encourage artists/developers to be
knowledgeable about their project at a higher level. The more you know about
what's going on in the big picture, the more of an asset you can make
yourself. It's also important for sanity! The OP sounds like they are stewing
daily about disagreeing with the feature -- but if there is in fact a relevant
reason, some of that burnout-causing heartache could have been avoided.

Dialogue and accountability are two huge keys to the management/artist
relationship.

------
mr_luc

    The database operations on the nested data were
    just taking too much processing power.

Given that this is Rails, and given that it's certainly SQL involved here, I
just have to ask (and I know the answer is probably "yes") -- did you try
implementing nested sets?

I ask because my experience has been that nested data is (with the kinds of
nested data I've been handed, anyway) not a performance problem. Selects and
updates of nested data can be as responsive as any range query on flat data,
and when it comes to managing the performance of inserting new nodes, deleting
or moving subtrees, there are a lot of options depending on what you want to
optimize for (like spreading out the range from 1 to the max integer supported
and periodically re-packing; that way, inserting leaves and any kind of
deletion is as fast as with flat data).
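
For anyone unfamiliar with the technique, here's a bare-bones sketch of the
nested set model in Python (the tree and names are made up; in SQL the same
lookup is a single BETWEEN clause):

```python
# Nested set model: each node stores a (lft, rgt) interval assigned by a
# depth-first walk, and X is in Y's subtree iff Y.lft <= X.lft <= Y.rgt.
# Hypothetical tree: root -> (a -> a1), (b -> b1).
nodes = {
    "root": (1, 10),
    "a":    (2, 5),
    "a1":   (3, 4),
    "b":    (6, 9),
    "b1":   (7, 8),
}

def subtree(name):
    """Fetch an entire subtree with one range comparison, no recursion --
    which is why nested-set selects behave like any range query."""
    lft, rgt = nodes[name]
    return sorted(n for n, (l, r) in nodes.items() if lft <= l <= rgt)

print(subtree("a"))     # ['a', 'a1']
print(subtree("root"))  # ['a', 'a1', 'b', 'b1', 'root']
```

The re-packing trick mentioned above just means assigning these intervals
sparsely (say, across 1 to 2**31 - 1 rather than 1 to 2n), so most leaf
inserts and deletions don't force renumbering the rest of the tree.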

I'm curious what your nested data looked like. Sorry to get distracted by a
minor point, but I'm intrigued! When I'm developing, I always _feel_ as though
I should only worry when I start seeing data that has to be a graph and can't
be represented as a tree; as long as it actually _is_ hierarchical, I won't
have to worry about speed too much. But now I'm wondering if that intuition
will bite me.

~~~
acl
_did you try implementing nested sets?_

The big thing we needed to do was a rolling archive into progressively broader
timeframes. As metrics come in, we keep every single datapoint for the first 6
hours. After 6 hours, data gets rolled up into a 5-minute archive. Each
datapoint in the 5-minute archive then contains the avg, min, max, etc. for
all the points that lived within that 5-minute span.

The archiving carries on through progressively broader windows as time goes on
-- a 10-minute archive, 1-hr archive, etc. This progressive aggregation is the
only sane approach to storing the massive amount of data we get. And, it
reflects the need for higher resolution for recent events -- it's rare you
need to see what happened at _one exact minute_ 6 months ago.
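
As a rough sketch of that kind of rollup (the datapoints and field names here
are invented for illustration; the real schema isn't described in the thread):

```python
from collections import defaultdict

WINDOW = 5 * 60  # 5-minute archive window, in seconds

def rollup(points, window=WINDOW):
    """Roll raw (timestamp, value) datapoints into fixed windows, keeping
    avg/min/max plus a count, so broader archives (10-min, 1-hr) can be
    built from these buckets later."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % window].append(value)
    return {
        start: {"avg": sum(vs) / len(vs), "min": min(vs),
                "max": max(vs), "count": len(vs)}
        for start, vs in buckets.items()
    }

raw = [(0, 1.0), (60, 3.0), (290, 2.0), (310, 10.0)]
archived = rollup(raw)
print(archived[0])    # {'avg': 2.0, 'min': 1.0, 'max': 3.0, 'count': 3}
print(archived[300])  # the 310s datapoint lands in the second bucket
```

Note that min and max compose directly when rolling buckets into broader
windows, but the average only composes if you also carry the count; that's
why it's kept in the bucket.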

It was this progressive archiving that bit us, specifically as DB performance
degraded over time with lots of insertions/deletions. Nested sets didn't and
wouldn't help with the aggregation costs or the degradation from churn during
the archiving process.

Hope this helps -- I'm going to try to do a more technical post on this in the
future.

~~~
mr_luc
Huh, that's interesting. I guess I've never worked with really "churny" data
like that before.

Sure, sometimes I've had cleanup/integrity/whatever tasks that run every few
minutes, but the amount of records affected has always been pretty small.

That's an interesting conundrum. See, _this_ is why we're all messing around
with Cassandra et al; sometimes, in SQL, the answer is "don't do that",
because it'd be too hard to tailor the db's behavior to suit your needs.
Although frankly, with a design that deletes and updates a significant
percentage of records in the system on a certain schedule, I can see any
number of storage solutions having trouble.

That's interesting. I'll be thinking about this at "work" today. ;) I have a
bunch of comments, but they're of the half-baked "oh, what about this?"
variety.

------
fizx
Haha, I'm doing the same thing right now. When one feature costs > 10% of dev
time, and you can afford to get rid of it, do it! In my case, I was spending
50%+ of my time for a few weeks on stability issues caused by a feature that
<10% of my users need. Bye!

~~~
smokinn
On the other hand, punting on any hard problems just makes it that much easier
for your competitors.

~~~
zaidf
You mean hard problems _that matter_.

There are a lot of hard problems startups try to solve which no one cares
about. That's the trap to avoid.

Fixing something hard does not inherently make your company valuable in the
marketplace.

~~~
evgen
The problem with this idea is that you don't get to decide which hard
problems matter; your users do, and they won't really know whether the
feature matters until you have made a real effort at providing it. There is a
particularly insidious meme going around (usually from the so-called "lean
startup" crowd) that building a lame/simple version first to see if people
like the feature is how you learn what your users want. The problem with this
is that you never know whether people don't like the feature because the
problem you are solving is not important to them, or because your half-assed
"iteration" has led them to decide that you don't have the chops to solve the
problem and they should look elsewhere. I can't even count the number of times
I have seen one company introduce a poorly implemented version of a feature,
pull the feature or let it languish (presumably "because our metrics show no
one wants it"), and then watch as customers flock to another company that
actually solved the hard problem.

Fixing a hard problem does not automatically make your company more valuable,
but failing to fix a hard problem _will never_ increase the value of your
company.

~~~
zaidf
_your half-assed "iteration" has led them to decide that you don't have the
chops to solve the problem_

While this theory _sounds_ good, it is disproven time and again by the
initial half-baked versions of sites that then go on to take off. Just look
at the _original_ launches of YouTube, Digg, and Facebook.

Also, a HUGE idea from the lean startup approach is to invest very little in
marketing until you have a product users like. You don't need one million
users to tell you a product sucks; often, 50 will do. Now, if you are saying
that 50 users writing off your product will doom it for its lifetime, the
problem isn't the lean approach, it's that your market is too small. The
YouTube guys got a very poor reaction to their initial site.

 _"because our metrics show no one wants it"_

Then they have little idea how to use metrics. Don't blame lean startup ideas
for that.

For example, what a lean startup would do is put up a button that looks as
good as anything your best competitor could put up, then see how many people
click on it. To draw conclusions about demand for that feature, you measure
the action up to the click, not the engagement after the click. If 1,000
people are clicking on the link but only a few are using the feature, chances
are your product sucks. Take that insight and work on your product. That's
just one small example.
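
That separation of demand from execution comes down to two ratios; a tiny
sketch with invented numbers:

```python
# Fake-door style measurement (all numbers invented): demand for a
# feature is the click-through on the button; quality of the current
# implementation is the engagement after the click.
visitors = 20_000
clicked = 1_000
engaged = 30  # users who kept using the feature after clicking

demand = clicked / visitors    # did people want this at all?
execution = engaged / clicked  # did our version of it deliver?

print(f"demand {demand:.1%}, execution {execution:.1%}")
# High demand with low execution means the feature matters but the
# current product sucks -- iterate on the product, not the premise.
```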

 _failing to fix a hard problem will never increase the value of your
company._

If you are saying that you have to solve really hard technical problems to
increase the value of your company, I fully disagree. Just look at the Web
2.0 companies that took off.

Craigslist did not take off because it solved a huge technical problem, yet
it has a lot of value as a company.

~~~
evgen
> While this theory sounds good, it is disproven time and again by initial
> half-baked versions of sites that then go on to take off. Just check the
> original launch of YouTube, Digg, facebook.

It is easy to disprove the theory if you get to cherry-pick your examples.
Would you like me to list the thousands of other companies that had a couple
of poorly implemented features masquerading as a "beta" that were stomped into
dust by others who worked a bit harder to do the job better?

~~~
zaidf
Of course! I'd like to hear about them.

Btw, I don't consider using some of the most popular Web 2.0 properties as
examples to be cherry-picking. I'm curious to see your examples nonetheless.

------
davidmathers
Short version:

 _In our case, the move from flat data to nested data was the killer... We
came up with a sweet way of storing the nested data and abstracting away most
of the complexities of dealing with all kinds of data... However, the load on
our database was far more than we envisioned... MySQL._

------
nestlequ1k
Great: as a paying customer of Scout, I'm being told they're now going to
focus on getting new customers instead of servicing the ones they have.

It might make sense from a business standpoint, but it's probably not
something you want to advertise. :-)

~~~
acl
Actually we're also able to spend a lot more time now on things that _really_
matter to our customers -- things that customers request, like more graphing
options and better support for cloud instances. The cloud functionality is
already available, and graphing is coming up fast. It's a real pleasure to
finally have time to address these things, in addition to the sales and
partnership efforts.

Also, performance is significantly better since we simplified the architecture
([http://blog.scoutapp.com/articles/2009/10/01/simplify-get-
an...](http://blog.scoutapp.com/articles/2009/10/01/simplify-get-an-order-of-
magnitude-speedup)), which benefits customers old and new!

------
amichail
Just a meta comment about these lessons-learned posts: the motivation behind
them is to promote the product in question, so one must take them with a
grain of salt.

Peer reviewed research is more reliable.

~~~
johns
If you want to dismiss a post on that theory, you should remove the links to
your products in your profile.

~~~
amichail
There is no need to dismiss all such posts. Some may be
interesting/educational regardless.

But you do need to keep in mind the primary motivation behind them --
marketing.

~~~
johns
Almost everything submitted here has marketing as an ulterior motive. They're
either marketing a product or themselves. But thanks for the reminder.

------
edw519
This reminds me of one of my biggest internal conflicts: when to optimize.

"Premature optimization" has earned a negative reputation because it has a
tendency to inflate dev schedules unnecessarily.

So I tend to crank out something, anything, just to have something. Once you
can see what you have, it's often a lot easier to modify it than to come up
with it in the first place.

OTOH, I like to think that everything I build is a foundation for the next
thing to be built upon it. I am constantly getting bitten in the ass by some
grossly underperforming building block. If a prototype runs poorly a dozen
times, it's a concern. If the same code runs poorly a million times, it's a
disaster.

It's a constant trade-off: get _something_ running vs. build solid building
blocks. Make a mistake one way and you never release; make a mistake the
other way and you have a time bomb to clean up. Sounds like the OP has a lot
of the same issues.

