I bet you over-engineered your startup (zemanta.com)
152 points by hamax 1958 days ago | 100 comments

Of the three startups I've worked with, two of the three were ridiculously over-engineered monstrosities that were way over time budget. It was clear that the CIO/CTO wanted to do cool fun stuff and not build a marketable product.

The other was cobbled together with completely shit code, was constantly breaking on releases, and was glued together with perl scripts. They're now publicly traded.


A lot of people think following all best practices and offering a unit test sacrifice to Uncle Bob will lead to success

Shipping will put food on the table. Of course, if your system can't stand 5 simultaneous users it won't; still, it's difficult to get that bad (but yeah, some people manage to get to that level).

And some "online services" have very bad code, and still they sell millions. Get rich first, then you can improve your code.

Get rich first, then you can improve your code.

>Get rich first, then you can improve your code.

I agree with you, but the second part never happens.

Sure, but how can you fearlessly refactor the code without having automated tests in the first place for example?

Technical debt accumulates interest.

Version 2.0

Or you start adding unit tests, which may be harder than rewriting

Doing a version 2.0 is easier (but maybe harder) than it looks

Can you name names? Specifically the publicly traded one?

I can, but I won't.

I'm sure you could guess at it if you had to, it's all publicly available information.

Here's my guess: Synacor. Took me only a few minutes of googling.

Can we refrain from punishing people for sharing their insights, especially when they explicitly opt to not share those details?

You're right. I tried to remove it, but I can't edit the comment anymore.

publicly traded != startup (in my opinion)

I inferred that the parent comment meant he/she worked for a startup, that became successful and is now a publicly traded company.

From my own experience, developers, particularly inexperienced developers, when learning something new have an insatiable need to implement their new found knowledge .. no matter if it's a good fit or not for the problem at hand (myself included .. I remember abusing the hell out of recursion for example).

One of the worst examples I've seen is the functional programming paradigm being crowbarred out of PHP .. why.

> One of the worst examples I've seen is the functional programming paradigm being crowbarred out of PHP .. why.

Most likely related to the inner-platform effect: ...the tendency of software architects to create a system so customizable as to become a replica, and often a poor replica, of the software development platform they are using.


One of the classic PHP examples being the Smarty templating system:

Smarty has been criticized for replicating features that PHP does natively very well, which leads to inefficiency:


Hey bill - looks like you've been hellbanned. Something that you said on http://news.ycombinator.com/item?id=4213223 ticked someone off. Maybe talk to info@ycombinator.com and get it sorted out.

PHP still sucks though ;)

Wow, I've seen much worse snark on HN get all the upvotes.

Yep. Evaporation in action... :(

The implementation of higher-order functions in PHP has saved me hundreds if not thousands of lines of code, and that's only in the last six months.

It's not perfect, but it's better than what we had before.

If you're talking about PHP's anonymous functions then yes I agree they have made life a lot easier. I was talking more about projects like Swiftmailer for instance, example:

  // Create a message
  $message = Swift_Message::newInstance('Wonderful Subject')
    ->setFrom(array('john@doe.com' => 'John Doe'))
    ->setTo(array('receiver@domain.org', 'other@domain.org' => 'A name'))
    ->setBody('Here is the message itself');

That's called a 'fluent interface.'

It's useful in some instances, and is a bit different than currying because PHP doesn't really support proper currying.
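To make the distinction concrete, here is a minimal sketch in Python rather than PHP. The Message class and its set_* methods are hypothetical stand-ins, not Swiftmailer's API. The contrast uses functools.partial, which is partial application (a close cousin of currying, not currying proper).

```python
from functools import partial


class Message:
    """Hypothetical mail message, only here to illustrate method chaining."""

    def __init__(self, subject):
        self.subject = subject
        self.sender = None
        self.recipients = []
        self.body = ""

    def set_from(self, sender):
        self.sender = sender
        return self  # returning self is what makes the calls chainable

    def set_to(self, recipients):
        self.recipients = list(recipients)
        return self

    def set_body(self, body):
        self.body = body
        return self


# Fluent interface: each call mutates the same object and hands it back.
message = (
    Message("Wonderful Subject")
    .set_from("john@doe.com")
    .set_to(["receiver@domain.org"])
    .set_body("Here is the message itself")
)


# Partial application, for contrast: fix some arguments of a function
# and get back a new function that takes the rest.
def send(host, port, payload):
    return f"{host}:{port} <- {payload}"


send_local = partial(send, "localhost", 25)
```

The chain is about objects and state; partial application is about building functions out of functions.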

Ah, I thought they were going for functional as that's the response I got when talking to one of their developers about it.

That's probably because their developers didn't know what they were talking about.

I suspect xd doesn't either.

Nice. Are you trying to tell me that that example doesn't appear to try and avoid state?

That example is full of state. There's nothing to prevent you from calling any of those setX methods at a later point in time and changing the object's state. That interface very well could avoid state, but Swiftmailer doesn't seem to be implemented in that way. [1]

The stateless way would be to return an entirely new Swift_Message instance with each setX method. In languages that aren't built with immutability in mind, like PHP and Java, you end up instantiating and throwing away a lot of objects. Sometimes it doesn't matter, but when it does you use a mutable Builder to create the immutable instances.

[1] http://swiftmailer.org/docs/messages.html
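A rough sketch of both ideas, in Python rather than PHP: an immutable message whose with_* methods return new instances, plus a mutable builder that produces one. All class and method names here are made up for illustration and are not Swiftmailer's API.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class ImmutableMessage:
    """Hypothetical immutable message: fields cannot be reassigned."""

    subject: str
    sender: str = ""
    recipients: Tuple[str, ...] = ()
    body: str = ""

    def with_sender(self, sender):
        # Returns a new instance; the original is left untouched.
        return replace(self, sender=sender)

    def with_body(self, body):
        return replace(self, body=body)


class MessageBuilder:
    """Mutable builder: collects fields cheaply, then builds one immutable object."""

    def __init__(self, subject):
        self._fields = {"subject": subject}

    def sender(self, sender):
        self._fields["sender"] = sender
        return self

    def body(self, body):
        self._fields["body"] = body
        return self

    def build(self):
        return ImmutableMessage(**self._fields)


draft = ImmutableMessage("Wonderful Subject")
signed = draft.with_sender("john@doe.com")  # draft.sender is still ""
built = MessageBuilder("Wonderful Subject").sender("john@doe.com").body("Hi").build()
```

Each with_* call allocates a new object, which is the throwaway cost mentioned above; the builder mutates one dictionary and allocates the immutable instance once.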

> That example is full of state.

That's why I said it appears to try and avoid state. Either way, as far as I'm concerned it's an horrific way to write code in an imperative language such as PHP.

I beg to disagree. It's a variation of the builder pattern and I find it quite useful, thank you.

Not to me.

setX is reminiscent of Java to me, and of Smalltalk attribute setters/getters before it.

Both of those languages elevate state to the point of godhood.

It's not about the setters/getters directly, it's the way they are being called by returning references to the parent object in each method call.

Right, and that's an insanely common idiom in Java, where state is A-number-one best friend.

Ah, I see. I'm no Java developer, sorry.

This looks exactly like Smalltalk's cascading; just allows you to not repeat $message loads of times. To me it's a useful feature which I often would like when working with lots of Java libraries.

Many devs tend to over-engineer things whether they are at a startup or not simply because they like building things, that's why they are software developers. I don't think this phenomenon is much more complicated than that.

'Over-engineering' is a term I'm starting to pay a lot more attention to :)

'Over-architected' and 'over-optimized' are the terms I prefer. Whatever term(s) is(are) used, criticism should emphasize 'non-pragmatically-built', instead of emphasizing 'startups don't need any design, engineering, or optimization, whatsoever'.

> over-optimized

That does seem like a better term for it. Or at least an equal term for it :)

Sorry, but I think this is really bad advice, it goes about it from the wrong direction. Saying "monolithic" or "services" like one is good and one is bad, or one is complicated and one is simple, is kind of silly.

For example... which is simpler, writing your own search indexing tool in ruby on rails, or installing solr as a service? MySQL is also a service, for some reason people tend to forget that. Conversely, if your processes aren't yet resource hogs, why not just let them remain general purpose workers? If you are constantly fiddling with multiple services to make any changes to your app, then yes, you have probably made a bad choice somewhere. But a HAproxy/nginx/rails/memcache/mysql/solr stack is already six services, and not really so complicated to work with. When you write your own services, you should aspire to that level of simplicity.

At the end of the day, the shortest path will be wherever it will be. It's your job as a developer to weigh the pros and cons on a case by case basis. The hard part is to test drive everything so that you can change it later, and constantly evaluate what choices each decision you make is removing from the table (painting yourself into a corner if you are not careful).

Another way of putting it: if you are picking your architecture before you begin, based on some kind of generalized principle, you are already over-engineering.

This is a really weird article.

The solution is not to say "Services Suck" or "Monolithic yay!", it's to realise that you should start with a monolithic app built in a good framework, then see how your app's internals are used/accessed as you grow THEN split out to logical well designed services.

Or in other words: Premature optimisation is the root of all evil.

More or less what the article said:

> Start merging services until you come to the bare essentials that have to be discrete.

He's giving an heuristic for finding the right granularity of services - that is what is up for criticism, IIUC.

I totally agree with you, start "monolithic", SQL + a lightweight Python web framework for most simple apps.

> Start merging services until you come to the bare essentials that have to be discrete.

I think GP advised to: start with a monolithic system and then split up the services as your system grows and as you feel the need.

That sounds like more solid advice.

After reading the "What Happens" section of the OP's article I can see that he's made the classical mistake of making many things do one small thing, but they're not independent.

The message queue to notify component X of changes to data in Y is symptomatic of badly designed systems; if system X cares about changes to data in Y, it should be designed so that (at scale) it caches the data for a suitably short time and otherwise reads through to the canonical source.
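A minimal sketch of that read-through idea, assuming a hypothetical fetch callable that reads from the canonical source (system Y). The class name, the TTL parameter and the injectable clock are illustrative choices, not something from the article.

```python
import time


class ReadThroughCache:
    """Serve a value from memory for a short time, otherwise ask the source."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch          # hypothetical call into the canonical source
        self._ttl = ttl_seconds      # how long a cached value is trusted
        self._clock = clock          # injectable so tests need not sleep
        self._entries = {}           # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # still fresh: no call to the source
        value = self._fetch(key)     # missing or stale: read through
        self._entries[key] = (now + self._ttl, value)
        return value
```

Staleness is bounded by the TTL, so X never needs to be told about changes in Y; it just stops trusting its copy after a few seconds.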

This is a common anti-pattern, and I've seen it built by smart teams at epic scale (millions of uniques per day) and it is still un-manageable.

Feature toggles, hard and soft-failing, together with a baked-in assumption that APIs are asynchronous (that is, unreliable) at as many levels as is feasible is a good architectural move. (And does not necessitate an abundance of architecture over feature code)
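As a rough illustration of toggles plus soft and hard failure, here is a small Python sketch. The FeatureToggles class, the "recommendations" feature and the gateway callable are all hypothetical names chosen for the example.

```python
class FeatureToggles:
    """Hypothetical in-memory toggle set; a real one might read config."""

    def __init__(self, enabled):
        self._enabled = set(enabled)

    def is_on(self, name):
        return name in self._enabled


def recommendations(user_id, toggles, fetch_recommendations):
    """Soft-fail: an optional feature degrades to an empty result."""
    if not toggles.is_on("recommendations"):
        return []
    try:
        return fetch_recommendations(user_id)
    except (TimeoutError, ConnectionError):
        # The counterpart was slow or down; the page still renders.
        return []


def charge(amount_cents, gateway):
    """Hard-fail: an essential step lets the error propagate to the caller."""
    return gateway(amount_cents)
```

The toggle doubles as a kill switch: turning "recommendations" off removes the dependency without a deploy.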

Loosely coupled components that expect their counterparts to respond slowly, or not at all are easy to implement and even easier to test. (HTTP, if one wants to use HTTP as the transport medium, learned this, and offers 201 and 202 for CREATED and ACCEPTED).
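A small sketch of a caller that treats "accepted but not done yet" as a normal outcome. The function name and the returned labels are invented for illustration; reading the Location header on a 202 as a status URL is an assumption of this sketch, not something HTTP requires.

```python
from http import HTTPStatus


def interpret_write(status, headers):
    """Classify the response to a write request as a (state, url) pair."""
    location = headers.get("Location")
    if status == HTTPStatus.CREATED:
        # 201: the resource exists now; Location points at it.
        return ("created", location)
    if status == HTTPStatus.ACCEPTED:
        # 202: the request was queued; assume Location is where to check later.
        return ("pending", location)
    if status == HTTPStatus.CONFLICT:
        # 409: someone else got there first; the caller must re-read and retry.
        return ("conflict", None)
    return ("failed", None)
```

Because the caller already has a "pending" branch, a slow or temporarily absent counterpart is just another case to handle rather than an outage.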

In my own projects (and I work mostly on near-realtime billing APIs) we bake this assumption (and others) into every transaction. We try to be RESTful and transmit the state, along with a URL which can be used to get the canonical representation of any resource at any given moment. Objects in all parts of the system are stateful, relying on the handshakes (accepted/created) (404, 406, 409) to avoid race conditions and to make sure our systems can handle downtime of any component (internal or external).

As a result, we have lightning quick tests, we are very confident in the system's ability to perform, and we have read-through caches which respect the transport medium's headers.

I suspect the OP is right, many do over-engineer the startup, but remember many startups appear to have an abundance of developer potential, until they don't. (Usually through bad design, not over engineering, the two ought not to be confused)

Perhaps I am biased by having seen it happening in a large company, completely unscalable architecture, and at a point as many "architects" as they had developers, desperately trying to keep the wheels turning.

You sound like you'd be interested in CQRS.

Scalability is about more than operations/second or any single metric, it's about making a service/system that can survive at scale, both technically and logistically. A server that goes 10x faster than its competitor but cannot work with other servers to go to 100x is not a scalable solution, whether that reason is because of some low level networking limitation or a high level programming or administration limitation.

More semantics, but over-engineering is about more than making things complicated. If a tool doesn't do what it was intended to do, or if you've ignored the proper level of complexity, it's because it was badly engineered, not over-engineered. Good engineering will "over-engineer" things as much as possible within the constraints of the solution.

And if you keep that in mind, then you really can't over-engineer things or make them too scalable.

You're sorta kinda conflating two distinct axes of scalability: vertical and horizontal.

A system which is vertically scalable is one which will run faster (for a given definition of "faster") on chunkier hardware.

A system which is horizontally scalable in theory becomes faster by adding more independent hardware. In this age of spinning up anonymous VPSes by the handful that's an attractive quality.

However, horizontal scalability levies a very heavy architectural tax. Nobody has produced a convincing platform that successfully abstracts away the many, many moving parts and oversight that horizontal scaling requires in the same way that an operating system can abstract away a lot of the complexities of vertical scaling.

So what happens is that you spend less time thinking about the problem domain and more thinking about the solution domain.

And which one does the user care about, again?

My point was not clearly made.

I'm not talking about vertical or horizontal, I would put them both in the "technical" category of scaling. I meant to say that scalability is not something that can simply grow to meet a certain load, it is a concept that your system "can survive at scale" in whatever form that scale takes. It encompasses many things like having more servers, bigger servers, more people working on the servers, more users, longer sessions, more activity per user, supports more features, and so on.

It's a semantic point, but it seemed like the author had a very narrow idea of the terms he was using, and was complaining more about his own definitions than the concepts themselves.

And the user cares about all of them, indirectly. They all matter because failing at any one of them is a reason to use something else, whether it's simply slow performance, or because your high performance system is so brittle you can't evolve, or downtime because your high performance, quickly evolving system requires more admins than you can afford, and so on.

All well and good.

But you can go a long way -- a very long way these days -- with vertical scaling alone. As Stack Overflow have pretty convincingly demonstrated.

Since we're being pedantic, scalability doesn't mean "becomes faster".

Your definitions are generally right if you replace "becomes faster" with "handles more load". Something that scales is something that can handle additional load without slowing down as much as the next thing, or that has a prescriptive method for preventing such slowdowns (like adding more hardware or moving to a bigger server).

You're generally spot on but for that point though.

I should've used "throughput" as the metric of discussion.

You can do it the Microsoft way: build Windows as a GUI for DOS, capture many users, earn a lot of money, then pay the best developers to develop Windows NT and merge it into a new OS. (user experience first, architecture later).

But you can do it the Apple way: make it good from inside out - including architecture, wait loooooooong looooooooong until users recognize all this, then get maaaany users and earn a lot of money. (Hopefully you have survived until then.)

If you're lucky, you can have both: good architecture inside and very good user experience...

But anyway I agree with the author, that many many solutions are over-engineered instead of just simple...

What was good about OS9? :-) OK, it gave Amiga users a good laugh back in the day.

NT was developed by ex-VAX guys from DEC.

I believe you were talking about OS X.

Nope, the poster was comparing Apple developing Macs with MS developing NT. When MS started with NT, Apple was on OS9 and stayed that way for years.

What the hell are you talking about? Windows NT was released in 1993, back when Apple was shipping System 7 and NeXT was shipping NeXTStep 3. Mac OS 9 didn't come out until 1999.

Actually I had Mach -> NeXTSTEP -> OS X in mind... :-)

>Nope, the poster was comparing Apple developing Macs with MS developing NT. When MS started with NT, Apple was on OS9 and stayed that way for years.

Nope, the poster was comparing MS from _one era_ (when they developed NT) to Apple from _another era_ (when they developed OS X).

That is, contrasting those two transitions as different ways of building something.

Oh, so comparing two totally different eras then?

Yes, the comparison is not about some era (what each company did at the same specified time) but about the methodology (how each company handled a specific transition to a different OS).

My biggest grief is that most of the engineering meetings consist of discussions of how to get a piece of data from one system to another.

Instead of working on things that matter to the user.

If you could solve that problem (hint: it is far from being only a technical problem) in the general case, you could be a rich man!

Data always being somewhere else or where it is wanted but not in the right structure to allow practical use, is something we are exposed to a lot by our clients.

And all that matters is the end user experience. Users don't care how it works as long as it works (according to their expectations).

I'm sorry, but that's a platitude.

It's also true.

yeah I should have said "end user experience is also important besides architecture"

Here is my simpler version: function calls are the fastest RPCs. They are faster to run, faster to write, and faster to debug if something goes wrong.

By all means conceive of your application as a bunch of well-engineered little pieces. But for now, write them all as modules that you're making method calls to. Should you some day want to, you're free to replace one by a facade around a remote service. But for now it works and you can move on.
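Here is a minimal sketch of that progression in Python. The Pricing interface, both implementations and the JSON wire format are hypothetical; the transport is passed in as a plain callable so the example stays self-contained.

```python
import json
from typing import Callable, Protocol


class Pricing(Protocol):
    """The module boundary: callers only know this method."""

    def quote(self, sku: str, quantity: int) -> int: ...


class LocalPricing:
    """Today: a plain in-process module, called like any other function."""

    def __init__(self, prices_cents):
        self._prices = dict(prices_cents)

    def quote(self, sku, quantity):
        return self._prices[sku] * quantity


class RemotePricing:
    """Some day: a facade with the same method, backed by a remote service."""

    def __init__(self, transport: Callable[[str], str]):
        self._transport = transport  # hypothetical: sends a request, returns the reply

    def quote(self, sku, quantity):
        request = json.dumps({"sku": sku, "quantity": quantity})
        reply = json.loads(self._transport(request))
        return reply["total_cents"]


def checkout_total(pricing: Pricing, cart):
    # Depends only on the interface, so either implementation can be passed in.
    return sum(pricing.quote(sku, quantity) for sku, quantity in cart)
```

The swap only stays this cheap if everything quote needs travels in its arguments; a real transport also brings latency and partial failures that the local version never has to think about.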

> you're free to replace one by a facade around a remote service.

RPC rarely lives up to this promise. Starting simple is certainly a good approach, but don't fool yourself about how hard it's going to be when you need to deal with latency and partial failure later.

Trust me, I'm fully aware of all of the trouble with getting RPCs to work properly. I still think that it is easier to do that work after you have the original non-rpc version as a reference.

Yes, but making those module boundaries loosely coupled and sharing information only through their interfaces will go a LONG way towards enabling you to remote the call at a later date. The biggest mistakes that prevent remoting a module have to do with requiring context that is not part of the interface of the method call (or having a very chatty back-and-forth, but that is less common.)

This isn't so much the symptom of a distributed architecture as it is one of a badly designed distributed architecture.

Distributed architectures are the only way to go once you reach a certain size both in terms of scale and in terms of team size. You can certainly make do without it (Wikipedia) but you'll have a much more robust product with it (Netflix).

The trick is always using the design appropriate for the current needs. It's good to think ahead, but it mustn't come at the expense of the present.

At the beginning—which is the case for most startups, since few make it to the later stages—it's often a good idea to go with a monolithic codebase based on a lean framework. As you grow, you're going to want to start adding components like a message queue for async work, rethinking your data store for scale, etc. As you grow even further, you're going to want to transition to a distributed architecture. I don't know what comes next… I haven't gotten there yet. But I'm sure as you grow even further, your needs are going to change yet again.

Joke's on you, I don't even have a startup.

Yes. We spent ages before product launch devising a really solid service that nobody wants, and now make a stream of money from a side-shoot of our main product. Oh well.

I'm all for simplifying, but what exactly does he mean by "monolithic architecture" ? Not even sure I get his overall point, the rant seems to go in many different directions.

That phrase "monolithic architecture" makes me think of one-huge-Java-project and that's not exactly "simple" in my mind. Probably not what he meant though..

Generally, I guess separating things is more work and can actually lead to less flexibility when situations arise that you didn't plan for ("maybe bloggers should be able to advertise too"). OTOH, not separating can lead to entanglement, where you can't really change anything because other things depend on it in obvious and subtle ways.

(edit: removed dumb example ;)

I think this is basically the context for this article: https://speakerdeck.com/u/kennethreitz/p/flasky-goodness

This seems a bit of a straw man. "Distributed systems suck because Developer B has to wait for Developer A to add stuff to System A so System B can have a new feature". What you have there is a totally different problem that is unrelated to being distributed or not. Replace the word System with Module and imagine they are in the same codebase: you still have the same problem. This post smacks of "distributed systems are hard and gave us a new set of problems that seem hard, run away!"

Side note: I love the fact that there is a COBOL manual in the top graphic in this article talking about over-engineering a software business.

There's somebody at the CIC in Kendall Square looking for COBOL programmers if you know any...

You are saying COBOL software tends to be over-engineered? Compared to what?

I'm just saying that I love the photo ;)

Some counterpoints and cases, anyone? I'm sure many of us would like to hear them. (I'm not at all qualified, just starting out myself.)

The article says you should merge all your databases into one, to avoid setting up a service-api-notification-message-queue mess.

While sharing a database lets you develop the initial system quickly, you'll have problems later on because you've made no distinction between your interface (which other people code against and you commit to not changing too often) and your internals (which you may want to refactor from time to time).

So either you make schema changes at will - in which case other developers do too, and you're spending all your time fixing that instead of developing new stuff - or you rarely make schema changes and do it with advanced warning and approval, in which case the pace of development slows to a crawl because other people are too busy to support the changes you want to make.

With well defined interfaces, only the interfaces have to evolve at a snail's pace; the internals can change as fast as you like, as long as you do it without breaking your interfaces.

Example: If you run an Amazon-style computerized warehouse and an Amazon-style shopping website, you want to know if an item is in stock. If the website just goes directly to the warehouse's database tables, the warehouse schema can't change without worrying about breaking the website. A nice simple how-many-in-stock web service would be a lot easier to maintain.
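A toy version of that boundary, sketched in Python with an in-memory SQLite database. The bin_contents table, the StockService class and the demo data are all invented for illustration; the point is only that the website calls one method and never sees the schema.

```python
import sqlite3


class StockService:
    """The only thing the website is allowed to call."""

    def __init__(self, connection):
        self._db = connection

    def how_many_in_stock(self, sku):
        # The warehouse is free to reshape its tables as long as this
        # method keeps returning a single number per SKU.
        row = self._db.execute(
            "SELECT COALESCE(SUM(quantity), 0) FROM bin_contents WHERE sku = ?",
            (sku,),
        ).fetchone()
        return row[0]


def make_demo_warehouse():
    """Build a throwaway warehouse database with a hypothetical schema."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE bin_contents (bin TEXT, sku TEXT, quantity INTEGER)")
    db.executemany(
        "INSERT INTO bin_contents VALUES (?, ?, ?)",
        [("A1", "book-1", 3), ("B7", "book-1", 2), ("C2", "mug-9", 1)],
    )
    db.commit()
    return db
```

If the warehouse later splits bin_contents into several tables, only the query inside how_many_in_stock changes; the website code is untouched.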

The thing is, prematurely defining interfaces is the worst kind of premature optimization; interfaces are a lot more permanent than any other part of your code. The most successful companies I've seen are those that define architecture as needed; start with everything sharing a database, then extract parts of that database into services /as you need to to scale up/. At that point you'll have a much better idea of what the use cases for those services are, and can define much better interfaces as a result.

> A nice simple how-many-in-stock web service would be a lot easier to maintain.

... yes, _after_ you are doing millions of transactions per month. But until you've actually built a business, there's no point in setting up a web service.

If there's a web service, there has to be a team to maintain it. This means you need to be big enough to have one team per web service.

Absolutely - I'm sure you've heard of the idea of technical debt [1,2] and sometimes it makes sense to build up technical debt to get a system off the ground, then worry about maintainability, documentation and whatnot in your copious free time later on.

[1] http://www.codinghorror.com/blog/2009/02/paying-down-your-te... [2] http://en.wikipedia.org/wiki/Technical_debt

This. Is what I am currently dealing with. And it sucks...

Steve Yegge's Google+ rant argues that Jeff Bezos forcing Amazon to implement everything as internal services allowed a services platform to emerge, and that platforms lead to success (the “because Bezos is smart” argument is a bit weak, though):

https://plus.google.com/112678702228711889851/posts/eVeouesv... https://news.ycombinator.com/item?id=3101876

https://plus.google.com/110981030061712822816/posts/AaygmbzV... https://news.ycombinator.com/item?id=3138826

Thanks very much for those. Probably the two most interesting industrial/work-related blog posts I have ever read!

It's hard to produce a counterpoint because it's unclear what the OP's point actually is. What does monolithic mean? I can tell you that I am working at a start up (albeit one on the cusp of being just a 'company') on a monolithic application right now, it isn't even that many LOC, and it is absolutely terrible to work with. We are working hard to split it into services because it is absolutely impossible to develop for. Now, that is for a number of reasons, but the point is 'monolithic' doesn't actually solve anything, it means you'll be trading one set of problems for another. IME, wrapping things into services isn't that bad and you can always break the abstraction if you really need to and fix it as you go, whereas with monolithic apps it's harder to realize you've broken an abstraction, for some value of 'monolithic'.

The counterpoint is reliability. If it's ok to have your whole service fall over when any part of it fails, go monolithic. This isn't snark, there genuinely are a lot of cases where this could make sense.

I'm surprised to see only one comment with the word "reliability" in it. I almost laughed when I saw examples from Google and Twitter. The article and the comments, sadly here as well as on the site, betray a shocking unfamiliarity with technical problems that really big systems face, user-facing or not. Making them distributed (which I guess is comparable to these "discrete services" the author mentions) does indeed have its problems, but... well ask Amazon and Google how much they regret making their distributed systems reliable. I bet there's some other path they would rather have traveled in order to simplify their architecture! Snarkiness aside, those are awful, misguided examples, even if the (I think) main point is true, that startups probably don't need to worry about such scaling issues yet.

Since when does monolithic mean unreliable? It's pretty easy to set up replication and have automatic failover.

So the same bug can knock out the primary and secondary for the same reason? Sure. Again, this boils down to what's an acceptable risk.

Agree, but trotting out Google and Twitter's most painful and unattended-to shortcomings, while touchstones, does not support your original thesis.

Furthermore, Google and Twitter aren't startups.

Furthermore read onward for some examples from Zemanta itself, which afaik is a startup.

I could also get into the experience I had when using Google App Engine for my startup a few years ago. Horribly over-engineered architecture, nothing worked, had a whole bunch of trouble and everything went all manner of bad to worse very quickly.

But hey, we had awesome scalability! Until we got 200 users and everything started falling apart because of the overhead of keeping all the different parts of the system communicating.

PS: the "saving is taking too long" example is actually aimed at Buffer not Twitter or Google :)

Isn't a philosophy like 37signals the answer to this? Simplify the software, reduce the number of "features", resist adding and adding and adding to the application?

Simplifying the software will allow you to simplify the architecture, no?

The standstill and dilemma for Developer A and B stem from a failure to plan and the lack of a road map. I, of course, am assuming both devs work internally at the company.

This is bad advice for people who care. Granted, for businessmen it is most important to cash the check, but even a quick and dirty project slapped together will require a rewrite eventually if it takes off. And it most likely will be painful and have all kinds of subtle bugs.

> will require a rewrite eventually if it takes off

The general advice is that once it takes off, you will have the money and can hire the resources to rewrite it.

And if it doesn't take off, which is most frequently the case, you lose less.

Exactly what I needed to read this morning. Thank you.

cogs bad...

That's not over-engineering, just crap design and using the latest toys for the sake of it.

Over-engineering is the Victorian engineers' way of designing: they did not have the in-depth knowledge we have, and so built in massive margins of safety.

Of course that means that some of the pre-WW2 London Underground trains lasted longer than the ones bought in the 70's.
