The other was cobbled together with completely shit code, was constantly breaking on releases, and was glued together with perl scripts. They're now publicly traded.
A lot of people think following all best practices and offering a unit test sacrifice to Uncle Bob will lead to success.
Shipping will put food on the table. Of course, if your system can't handle 5 simultaneous users it won't; still, it's difficult to get that bad. (But yeah, some people manage to get to that level.)
And some "online services" have very bad code, and still they sell millions. Get rich first, then you can improve your code.
I agree with you, but the second part never happens.
Technical debt accumulates interest.
Or you start adding unit tests, which may be harder than rewriting
Doing a version 2.0 is easier (but maybe harder) than it looks
One of the worst examples I've seen is the functional programming paradigm being crowbarred out of PHP... why?
Most likely related to the Inner-platform effect: "...the tendency of software architects to create a system so customizable as to become a replica, and often a poor replica, of the software development platform they are using."
One of the classic PHP examples being the Smarty templating system:
Smarty has been criticized for replicating features that PHP does natively very well, which leads to inefficiency:
PHP still sucks though ;)
It's not perfect, but it's better than what we had before.
// Create a message
$message = Swift_Message::newInstance('Wonderful Subject')
    ->setFrom(array('email@example.com' => 'John Doe'))
    ->setTo(array('firstname.lastname@example.org', 'email@example.com' => 'A name'))
    ->setBody('Here is the message itself');
It's useful in some instances, and is a bit different than currying because PHP doesn't really support proper currying.
The stateless way would be to return an entirely new Swift_Message instance with each setX method. In languages that aren't built with immutability in mind, like PHP and Java, you end up instantiating and throwing away a lot of objects. Sometimes it doesn't matter, but when it does you use a mutable Builder to create the immutable instances.
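The mutable-builder-for-immutable-instances idea described above can be sketched like this (Python here rather than PHP, since the pattern reads the same in any language; all class and method names are made up for illustration):

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Immutable value object: once built, it cannot be modified.
@dataclass(frozen=True)
class Message:
    subject: str
    sender: Dict[str, str]
    recipients: Tuple[str, ...]
    body: str

class MessageBuilder:
    """Mutable builder: each set_x mutates in place and returns self,
    so chaining allocates no throwaway intermediate Message objects."""
    def __init__(self, subject: str):
        self._subject = subject
        self._sender: Dict[str, str] = {}
        self._recipients: list = []
        self._body = ""

    def set_from(self, email: str, name: str):
        self._sender = {email: name}
        return self

    def add_to(self, email: str):
        self._recipients.append(email)
        return self

    def set_body(self, body: str):
        self._body = body
        return self

    def build(self) -> Message:
        # Exactly one immutable Message is allocated, at the very end.
        return Message(self._subject, self._sender,
                       tuple(self._recipients), self._body)

msg = (MessageBuilder("Wonderful Subject")
       .set_from("email@example.com", "John Doe")
       .add_to("firstname.lastname@example.org")
       .set_body("Here is the message itself")
       .build())
```

The fully stateless alternative would have each `set_x` return a new `Message`, which is the instantiate-and-throw-away cost mentioned above.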
That's why I said it appears to try and avoid state. Either way, as far as I'm concerned it's a horrific way to write code in an imperative language such as PHP.
setX is reminiscent of Java to me, and of Smalltalk attribute setters/getters before it.
Both of those languages elevate state to the point of godhood.
That does seem like a better term for it. Or at least an equal term for it :)
For example... which is simpler, writing your own search indexing tool in ruby on rails, or installing solr as a service? MySQL is also service, for some reason people tend to forget that. Conversely, if your processes aren't yet resource hogs, why not just let them remain general purpose workers? If you are constantly fiddling with multiple services to make any changes to your app, then yes, you have probably made a bad choice somewhere. But a HAproxy/nginx/rails/memcache/mysql/solr stack is already six services, and not really so complicated to work with. When you write your own services, you should aspire to that level of simplicity.
At the end of the day, the shortest path will be wherever it will be. It's your job as a developer to weigh the pros and cons on a case-by-case basis. The hard part is to test-drive everything so that you can change it later, and to constantly evaluate which choices each decision you make is removing from the table (painting yourself into a corner if you are not careful).
Another way of putting it: if you are picking your architecture before you begin, based on some kind of generalized principle, you are already over-engineering.
The solution is not to say "services suck" or "monolithic yay!"; it's to realise that you should start with a monolithic app built in a good framework, see how your app's internals are used/accessed as you grow, and THEN split out into logical, well-designed services.
Or in other words: Premature optimisation is the root of all evil.
> Start merging services until you come to the bare essentials that have to be discrete.
He's giving a heuristic for finding the right granularity of services - that is what is up for criticism, IIUC.
I totally agree with you, start "monolithic", SQL + a lightweight Python web framework for most simple apps.
I think GP advised to start with a monolithic system and then split up the services as your system grows and as you feel the need.
That sounds like more solid advice.
The message queue to notify component X of changes to data in Y is symptomatic of badly designed systems; if system X cares about changes to data in Y, it should be designed so that (at scale) it caches the data for a suitably short time and otherwise reads through to the canonical source.
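The read-through-with-a-short-TTL alternative to a change-notification queue can be sketched in a few lines (a minimal illustration, not production code; names are made up):

```python
import time

class ReadThroughCache:
    """System X caches Y's data for a short TTL; on expiry it reads
    through to the canonical source instead of relying on Y to push
    change notifications over a queue."""
    def __init__(self, fetch, ttl_seconds=5.0, clock=time.monotonic):
        self._fetch = fetch          # reads from the canonical source (system Y)
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}             # key -> (value, expires_at)

    def get(self, key):
        value, expires_at = self._cache.get(key, (None, 0.0))
        now = self._clock()
        if now >= expires_at:
            value = self._fetch(key)                  # stale or missing: read through
            self._cache[key] = (value, now + self._ttl)
        return value

# Usage: the cache bounds staleness to ttl_seconds, with no queue to manage.
calls = []
def fetch_from_y(key):
    calls.append(key)
    return {"stock": 42}

cache = ReadThroughCache(fetch_from_y, ttl_seconds=5.0)
first = cache.get("item-1")    # miss: hits the canonical source
second = cache.get("item-1")   # still fresh: served from cache, Y not contacted
```

The staleness bound is explicit (the TTL), whereas with a notification queue it depends on queue health, which is exactly what makes those systems hard to manage.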
This is a common anti-pattern; I've seen it built by smart teams at epic scale (millions of uniques per day) and it is still unmanageable.
Feature toggles, hard- and soft-failing, together with a baked-in assumption that APIs are asynchronous (that is, unreliable) at as many levels as is feasible, make a good architectural move. (And it does not necessitate an abundance of architecture over feature code.)
Loosely coupled components that expect their counterparts to respond slowly, or not at all, are easy to implement and even easier to test. (HTTP learned this, and offers 201 and 202 for CREATED and ACCEPTED, if one wants to use HTTP as the transport medium.)
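The 202-then-201 handshake described here can be sketched as a tiny in-process analogue (a toy model of the pattern, not any particular framework; the class and status strings are made up):

```python
import queue
import threading
import uuid

class AsyncResource:
    """Sketch of the ACCEPTED/CREATED idea: a write is acknowledged
    immediately (HTTP 202 analogue) and completed later (HTTP 201
    analogue), so callers never block on a slow counterpart."""
    def __init__(self):
        self._jobs = queue.Queue()
        self._status = {}
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def submit(self, payload) -> str:
        job_id = str(uuid.uuid4())
        self._status[job_id] = "accepted"        # 202: acknowledged, not done
        self._jobs.put((job_id, payload))
        return job_id                            # caller is free immediately

    def status(self, job_id) -> str:
        return self._status.get(job_id, "not found")   # 404 analogue

    def _run(self):
        while True:
            job_id, payload = self._jobs.get()
            # ... the slow work (billing, persistence, etc.) happens here ...
            self._status[job_id] = "created"     # 201: resource now exists
            self._jobs.task_done()

api = AsyncResource()
jid = api.submit({"amount": 10})
api._jobs.join()   # tests wait for completion; real callers would poll status()
```

Testing is easy precisely because the contract is "accepted now, created eventually": you assert on status transitions rather than on timing.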
In my own projects (and I work mostly on near-realtime billing APIs) we bake this assumption (and others) into every transaction. We try to be RESTful: we transmit the state along with a URL which can be used to get the canonical representation of any resource at any given moment. Objects in all parts of the system are stateful, relying on the handshakes (202 accepted / 201 created; 404, 406, 409) to avoid race conditions and to make sure our systems can handle downtime of any component (internal or external).
As a result, we have lightning quick tests, we are very confident in the system's ability to perform, and we have read-through caches which respect the transport medium's headers.
I suspect the OP is right, many do over-engineer the startup, but remember many startups appear to have an abundance of developer potential, until they don't. (Usually through bad design, not over engineering, the two ought not to be confused)
Perhaps I am biased by having seen it happen in a large company: a completely unscalable architecture, with at one point as many "architects" as developers, desperately trying to keep the wheels turning.
More semantics, but over-engineering is about more than making things complicated. If a tool doesn't do what it was intended to do, or if you've ignored the proper level of complexity, it's because it was badly engineered, not over-engineered. Good engineering will "over-engineer" things as much as possible within the constraints of the solution.
And if you keep that in mind, then you really can't over-engineer things or make them too scalable.
A system which is vertically scalable is one which will run faster (for a given definition of "faster") on chunkier hardware.
A system which is horizontally scalable in theory becomes faster by adding more independent hardware. In this age of spinning up anonymous VPSes by the handful that's an attractive quality.
However, horizontal scalability levies a very heavy architectural tax. Nobody has produced a convincing platform that successfully abstracts away the many, many moving parts and oversight that horizontal scaling requires in the same way that an operating system can abstract away a lot of the complexities of vertical scaling.
So what happens is that you spend less time thinking about the problem domain and more thinking about the solution domain.
And which one does the user care about, again?
I'm not talking about vertical or horizontal, I would put them both in the "technical" category of scaling. I meant to say that scalability is not something that can simply grow to meet a certain load, it is a concept that your system "can survive at scale" in whatever form that scale takes. It encompasses many things like having more servers, bigger servers, more people working on the servers, more users, longer sessions, more activity per user, supports more features, and so on.
It's a semantic point, but it seemed like the author had a very narrow idea of the terms he was using, and was complaining more about his own definitions than the concepts themselves.
And the user cares about all of them, indirectly. They all matter because failing at any one of them is a reason to use something else, whether it's simply slow performance, or because your high-performance system is so brittle you can't evolve it, or downtime because your high-performance, quickly evolving system requires more admins than you can afford, and so on.
But you can go a long way -- a very long way these days -- with vertical scaling alone. As Stack Overflow have pretty convincingly demonstrated.
Your definitions are generally right if you replace "becomes faster" with "handles more load". Something that scales is something that can handle additional load without slowing down as much as the next thing, or that has a prescriptive method for preventing such slowdowns (like adding more hardware or moving to a bigger server).
You're generally spot on but for that point though.
But you can do it the Apple way: make it good from the inside out, including the architecture, wait loooooooong looooooooong until users recognize all this, then get maaaany users and earn a lot of money. (Hopefully you have survived until then.)
If you are lucky, you can have both: good architecture inside and a very good user experience...
But anyway I agree with the author, that many many solutions are over-engineered instead of just simple...
NT was developed by ex-VAX guys from DEC.
Nope, the poster was comparing MS from _one era_ (when they developed NT) to Apple from _another era_ (when they developed OS X).
That is, contrasting those two transitions as different ways of building something.
Instead of working on things that matter to the user.
Data always being somewhere else, or being where it is wanted but not in the right structure for practical use, is something our clients expose us to a lot.
By all means conceive of your application as a bunch of well-engineered little pieces. But for now, write them all as modules that you're making method calls to. Should you some day want to, you're free to replace one by a facade around a remote service. But for now it works and you can move on.
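The modules-now, services-later idea can be sketched concretely (Python for illustration; the inventory domain, class names, and URL are all hypothetical):

```python
# Start as an in-process module: just method calls, nothing remote.
class InventoryModule:
    def __init__(self):
        self._stock = {"widget": 3}

    def in_stock(self, sku: str) -> bool:
        return self._stock.get(sku, 0) > 0

# Later, if ever needed, swap in a facade with the SAME interface that
# talks to a remote service. The URL and client here are made up.
class InventoryServiceFacade:
    def __init__(self, http_get):
        self._http_get = http_get   # e.g. a thin wrapper over an HTTP client

    def in_stock(self, sku: str) -> bool:
        response = self._http_get(f"https://inventory.internal/stock/{sku}")
        return response["count"] > 0

def render_product_page(inventory, sku: str) -> str:
    # Calling code is identical either way: it's just a method call.
    return "Add to cart" if inventory.in_stock(sku) else "Out of stock"

page = render_product_page(InventoryModule(), "widget")
```

The point is that `render_product_page` never learns which implementation it got, so the module-to-service migration touches one constructor call, not every caller. (As the reply below notes, the remote version still has to confront latency and partial failure; the shared interface buys you deferral, not immunity.)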
RPC rarely lives up to this promise. Starting simple is certainly a good approach, but don't fool yourself about how hard it's going to be when you need to deal with latency and partial failure later.
Distributed architectures are the only way to go once you reach a certain size both in terms of scale and in terms of team size. You can certainly make do without it (Wikipedia) but you'll have a much more robust product with it (Netflix).
The trick is always using the design appropriate for the current needs. It's good to think ahead, but it mustn't come at the expense of the present.
At the beginning—which is the case for most startups, since few make it to the later stages—it's often a good idea to go with a monolithic codebase based on a lean framework. As you grow, you're going to want to start adding components like a message queue for async work, rethinking your data store for scale, etc. As you grow even further, you're going to want to transition to a distributed architecture. I don't know what comes next… I haven't gotten there yet. But I'm sure as you grow even further, your needs are going to change yet again.
That phrase "monolithic architecture" makes me think of one-huge-Java-project and that's not exactly "simple" in my mind. Probably not what he meant though..
Generally, I guess separating things is more work and can actually lead to less flexibility when situations arise that you didn't plan for ("maybe bloggers should be able to advertise too"). OTOH, not separating can lead to entanglement, where you can't really change anything because other things depend on it in obvious and subtle ways.
(edit: removed dumb example ;)
While sharing a database lets you develop the initial system quickly, you'll have problems later on because you've made no distinction between your interface (which other people code against and you commit to not changing too often) and your internals (which you may want to refactor from time to time).
So either you make schema changes at will - in which case other developers do too, and you're spending all your time fixing that instead of developing new stuff - or you rarely make schema changes and do it with advanced warning and approval, in which case the pace of development slows to a crawl because other people are too busy to support the changes you want to make.
With well defined interfaces, only the interfaces have to evolve at a snail's pace; the internals can change as fast as you like, as long as you do it without breaking your interfaces.
Example: if you run an Amazon-style computerized warehouse and an Amazon-style shopping website, you want to know if an item is in stock. If the website goes directly to the warehouse's database tables, the warehouse schema can't change without worrying about breaking the website. A nice simple how-many-in-stock web service would be a lot easier to maintain.
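A minimal sketch of that how-many-in-stock boundary (Python for illustration; the "rows", column names, and response shape are invented to show the idea, not any real schema):

```python
# Internal warehouse schema, free to change: columns renamed, tables
# split, quantities reorganized, etc. These rows are made up.
warehouse_rows = [
    {"sku": "B00123", "bin_qty": 4, "reserved": 1},
    {"sku": "B00456", "bin_qty": 0, "reserved": 0},
]

def stock_service(sku: str) -> dict:
    """The stable interface the website codes against:
    {'sku': ..., 'in_stock': n}. When the warehouse schema changes,
    only this function is updated; the website never sees bin_qty
    or reserved."""
    for row in warehouse_rows:
        if row["sku"] == sku:
            available = max(row["bin_qty"] - row["reserved"], 0)
            return {"sku": sku, "in_stock": available}
    return {"sku": sku, "in_stock": 0}

result = stock_service("B00123")
```

All the schema knowledge lives on one side of the boundary, which is exactly the interface/internals split described above: the response shape evolves at a snail's pace while the tables underneath can churn freely.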
... yes, _after_ you are doing millions of transactions per month. But until you've actually built a business, there's no point in setting up a web service.
If there's a web service, there has to be a team to maintain it. This means you need to be big enough to have one team per web service.
Furthermore, Google and Twitter aren't startups.
I could also get into the experience I had when using Google App Engine for my startup a few years ago. Horribly over-engineered architecture, nothing worked, had a whole bunch of trouble and everything went all manner of bad to worse very quickly.
But hey, we had awesome scalability! Until we got 200 users and everything started falling apart because of the overhead of keeping all the different parts of the system communicating.
PS: the "saving is taking too long" example is actually aimed at Buffer not Twitter or Google :)
Simplifying the software will allow you to simplify the architecture, no?
The standstill and dilemma for Developers A and B stems from a failure to plan and have a roadmap. I, of course, am assuming both devs work internally at the company.
The general advice is that once it takes off, you will have the money and can hire the resources to rewrite it.
And if it doesn't take off, which is most frequently the case, you lose less.
Over-engineering is the Victorian engineers' way of designing: they did not have the in-depth knowledge we have, and so built in massive margins of safety.
Of course, that means that some of the pre-WW2 London Underground trains lasted longer than the ones bought in the '70s.