Programs should be Small (mkhadikov.com)
80 points by mairbek 1519 days ago | 72 comments

I love all these articles which are wholly ignorant of how complex software is in reality and how such advice isn't necessarily good.

Sometimes decomposition results in problems at the other end of the scale such as communication performance, data duplication, extremely nested abstractions, messaging complexity, contract and API versioning hell etc.

Getting the sweet spot between monolithic coupled blobs and fragmented latent deathtraps is an art which can't be puked out in a blog post. It takes literally years of experience and some guesswork and testing and thinking.

Ultimately, lots of small programs are just as painful as a single large one if they have to talk to each other or do IO.

I'm pretty much convinced that the big problem today in software is deciding where your APIs should go.

My instincts are similar to the article's author's: a preference for small, discrete pieces of software rather than a giant monolithic application. More Web Services!, if you will.

But you are correct, getting this sweet spot is hard. Truth be told, I am not even sure experience guarantees a successful design first time around.

There is a tendency among developers to want to keep everything nice and clean. For example, app A is responsible for a data set, and any time other apps want to access it they have to talk to A. If they are asking for that data a lot, you might be better off caching it or periodically copying it over to the other parts of the system. I always try to decide whether to segment something by thinking about how many calls it is likely to receive as a web service; more than a couple in short succession and I start having doubts.
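A minimal sketch of the caching idea above, in Python. The `CachedClient` class, the `fetch_from_app_a` function, and the 30-second lifetime are all hypothetical names and values chosen for illustration; a real system would also need to decide how stale the data is allowed to be.

```python
import time
from typing import Callable, Dict, Tuple


class CachedClient:
    """Wraps a slow lookup against 'app A' with a small time-based cache."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> str:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]          # fresh enough: no call to app A
        value = self._fetch(key)   # missing or stale: one call to app A
        self._entries[key] = (now, value)
        return value


calls = []

def fetch_from_app_a(key: str) -> str:
    # Stand-in for a web service call to app A (hypothetical).
    calls.append(key)
    return f"record-for-{key}"

client = CachedClient(fetch_from_app_a, ttl_seconds=30.0)
for _ in range(5):
    client.get("customer-42")
print(len(calls))  # 1
```

Five lookups in short succession cost one call to app A instead of five. The price is that callers may see data up to 30 seconds old.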

Ultimately, what makes the entire system work best for users is the correct thing to do, and sometimes it is very difficult to come up with something which does that and is pleasing to the discerning coder's eye.

I was just about to make the same argument. I completely see why this is tempting, but it quickly makes maintenance into a new layer of hell, and anyone supporting your production environment will hate you. Added to which, knowledge transfer becomes a huge problem, and it takes new developers and production support people a small lifetime to learn all the pieces and their touchpoints.

Adding to that, his stated advantage of not having to limit yourself to one platform also seems opposite of my experience (i.e. keeping all on the same platform is an advantage).

When you have a big system with disjoint parts written in different languages, re-use and refactoring is a pain, and redundancy is almost certain to creep in (and with redundancy often comes inconsistency).

Yes, homogeneous systems are much easier to deal with, although from experience certain systems are a pig to deal with from end to end (anything Microsoft as a rule).

Different languages are just different forms of integration, and the mantra "integration is hell" should be at the forefront of everyone's mind, always.

Depends on the tools you use. I found Apache Thrift and Protobuf to be pretty sophisticated tools for integration between services.

Yet they are still entirely impractical for what we do. There is no one-size-fits-all methodology which results in a homogeneous communication layer. This means that you end up with technology fragmentation and therefore additional complexity.

I totally agree that software development is somewhat non-trivial. I'm not saying that small programs would solve all problems you might face. Would your system be more maintainable? I think yes.

> Getting the sweet spot between monolithic coupled blobs and fragmented latent deathtraps is an art which can't be puked out in a blog post. It takes literally years of experience and some guesswork and testing and thinking.

I agree; in the blog post, proper decomposition is the key if you want to write good systems, and to be honest it is really hard to achieve.

[Edit: deleted pointless bit here.]

If your comment consisted simply of its second and fourth paragraphs it would be better in every way and you would have contributed something of value.

I've read a thousand versions of this blog post over the years. My comment is decidedly abrasive because, to be honest, I'm tired of reading it. There is nothing new to be added to the discussion apart from people blindly falling over the same point, which is ignorant and not very well thought out and is not based on reality.

Apologies if you are personally offended, but my point still stands.

It's disgusting that your comment - a nasty, nearly content-free spit of cynical contrarianism - is currently the most highly-ranked reply to what was actually a congenial and well thought out blog post.

The problem with corporate, big-program development is that it's a premature abstraction.

If the system-of-small-programs doesn't perform, then you're in a state where larger programs might make sense. If the problem is well-understood and the pieces have been built and refined by competent programmers, but it's impossible to go any further without some coupling and integration, then a large program isn't the worst thing in the world. Really, that's what most "optimization" is: the use of about-the-system knowledge to make changes that, while they create couplings that exclude (by which I mean, may cause horrible things to happen, but that's irrelevant) unused cases, improve the performance of the used cases.

For example, with databases, you have requirements that are both technically challenging but also need to work together: concurrency, persistence, performance, transactional integrity. These involve an ability to reason about "the whole world" that can't be achieved with a system-of-small-things approach. That's a case where "bigness" actually imposes complexity reduction. But it has taken some very smart people decades to get this stuff right.

The problem with ad-hoc corporate big-program systems is that the one benefit of largeness-- complexity reduction-- never occurs because there is no conceptual integrity, but only a heterogeneous list of "requirements" that pile on and don't work together. You get the ugliness of "lots of small programs" but the APIs aren't even documented. Instead of reading crappy APIs to work on such systems, programmers have to read crappy code, which is even harder.

Small is the way to start. If you need to make a program large, there are intelligent ways of doing it, but it's best to start small and build enough knowledge so that, when largeness becomes necessary, the problem is actually well-understood.

Most enterprise systems start off with requirements similar to those you think of with a database - a lot of data with high expectations of performance.

For example, the program I work on has to support a million row database that can be sorted and filtered both on the server and client with subsecond response time. The program is incredibly configurable based on data in the system, so many of the features depend on reading data and reacting to it.

The problem with "many small programs" is the cost of communication. I can pass a pointer to a list of 100,000 items to be sorted and filtered in a trivial amount of time. If I have to serialize that list to JSON to pass to a separate program that then has to deserialize that list and perform the function, then reserialize the sorted/filtered list, send it back, re-deserialize.... it'll take longer to do the communication than it does to do the sort.
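The comparison above can be tried with a rough Python sketch. The function names are made up for illustration, and the "separate program" is only simulated by a JSON round trip inside one process, so the measurement leaves out any network or pipe cost. The actual ratio will vary by machine and data, so no timings are claimed here.

```python
import json
import random
import time

items = [random.randint(0, 1_000_000) for _ in range(100_000)]

def sort_in_process(data):
    # Same process: the callee receives a reference to the list, no copy
    # is made to hand the data over.
    return sorted(data)

def sort_via_json(data):
    # Simulates a separate program: serialize, "send", deserialize, sort,
    # then serialize the result and deserialize it on the caller's side.
    request = json.dumps(data)
    received = json.loads(request)
    response = json.dumps(sorted(received))
    return json.loads(response)

def timed(fn, data):
    start = time.perf_counter()
    result = fn(data)
    return result, time.perf_counter() - start

direct, t_direct = timed(sort_in_process, items)
remote, t_remote = timed(sort_via_json, items)
print(f"in-process: {t_direct:.4f}s, via JSON: {t_remote:.4f}s")
```

Both paths return the same sorted list; only the cost of getting the data there and back differs.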

However, that's not to say that the idea of separation of concerns still can't be applied to a large program. And in fact, most enterprise devs do exactly that. That's what all these "services" are in the program. Except that instead of having to serialize data, I can just pass them a pointer.

Just because you can't see all the different programs, doesn't mean they're not there.

I'm going to point to a development case which is outside the typical "database management" case everyone here seems to be thinking about: engineering modeling software like this: http://www.aspentech.com/fullproductlisting/

This company is ostensibly doing the right thing: they have developed a large number of "single purpose" programs. They also have some applications which attempt to integrate some of their technology into single packages. The problems, however, are exactly as you describe. From an end user perspective, having the various programs send data to one another is a crapshoot. Some applications are very tightly integrated while others seem to have been developed in a vacuum. The company has even developed an entire application that tries to fix this by allowing data to be automatically exported to and imported from Excel. End users could try to use the COM interface to get and send data where they want it to go, but we have to remember that the target audience is engineers, not programmers.

> Most enterprise systems start off with requirements similar to those you think of with a database - a lot of data with high expectations of performance.

Not where I work.

And even then, this is no excuse to stick to a zeroth order heuristic, and make big programs every time. Some systems can be cleanly separated in simple components. Failing to see that is a waste.

> Most enterprise systems start off with requirements similar to those you think of with a database - a lot of data with high expectations of performance.

Right, but there's a different process to it.

Databases solved a problem, and the requirements grew organically as people used them to solve harder problems. With product companies or with open-source software, the project owners can say, "We aren't doing that shit".

Enterprise projects accumulate requirements based on who has power within the organization. Each person who has the power to stop the project asks for a hand-out, and "We aren't doing that shit" isn't an option. It's like how businesses that want to operate in corrupt countries need to have a separate "bribe fund" for local officials. Over time, the result is an incoherent mess of requirements that make no sense together.

The requirement list for a typical enterprise project is the bribe trail.

> However, that's not to say that the idea of separation of concerns still can't be applied to a large program. And in fact, most enterprise devs do exactly that. That's what all these "services" are in the program. Except that instead of having to serialize data, I can just pass them a pointer.

Sure, but when you have a multi-developer project without an explicit API, what you end up with is an undocumented and implicit API between people's code. This devolves into the software-as-spec situation where it's not clear what the rules are.

I think it's better to start with the inefficient service-oriented program, get that working, and then optimize with the merged, larger program if needed (and to document the API that has now become an implicit within-program beast).

> The requirement list for a typical enterprise project is the bribe trail.

I think this is purely a stereotype.

The behavior experienced is largely down to the fact that a large body of humans can't come up with a single consistent view of a large set of problems. You need singular control and ownership by someone with technical and business domain expertise. Some of this is politics (particularly from the MBA and psychotic corporate climber faction) but it's at least 80% standard human idiocy and ignorance.

I think from an architecture perspective (I'm an "enterprise architect" [whatever that is] by trade), clean service APIs are a good idea, but not necessarily the distribution model or fully decoupled integration path.

The problem is that software has a tendency to become complex. The proposed solution is to break up the software into smaller programs.

There are certainly advantages to having smaller components. It allows you to rewrite components in a different language should you want to, for example. But there are disadvantages too: smaller components mean dealing with failure at a much finer granularity.

In my opinion, the reason large programs become complicated is that there has been no emphasis on simplicity. Breaking components into smaller pieces forces you to adopt robust interfaces, but there are better ways of creating simpler programs.

My personal approach is to reason about parts of a program in terms of what they mean rather than what they do. I also have a strict rule that says, "don't change the meaning of a component, create a new one". This methodology works for me.

Can you please elaborate on "reason about parts of a program in terms of what they mean rather than what they do"?

Of course.

A good example that many people are familiar with is parsing data. Compare these two approaches. You could write functions that manipulate the input and build up an output or you could create a structure that represents a grammar for the data you are parsing (perhaps by using a parser combinator library).

In the first approach, the only feasible way of reasoning about the program is operationally; when it gets to here, this function is called, causing ... . In the second approach, you can reason about the program by considering the grammar that you created. You don't need to know exactly how the parsing happens in order to understand a grammar. I argue that this is because the grammar has been given a meaning.
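To make the second approach concrete, here is a small hand-rolled sketch in Python rather than a real parser combinator library. The helper names (`char`, `digits`, `seq`, `sep_by`, `between`) are invented for this example. The point is the last definition: `int_list` reads as a grammar, and you can reason about what it accepts without tracing how the helpers run.

```python
from typing import Callable, Optional, Tuple

# A parser takes (text, position) and returns (value, new_position) or None.
Parser = Callable[[str, int], Optional[Tuple[object, int]]]

def char(c: str) -> Parser:
    def parse(text, pos):
        if pos < len(text) and text[pos] == c:
            return c, pos + 1
        return None
    return parse

def digits() -> Parser:
    def parse(text, pos):
        end = pos
        while end < len(text) and text[end].isdigit():
            end += 1
        return (int(text[pos:end]), end) if end > pos else None
    return parse

def seq(*parsers: Parser) -> Parser:
    # Run parsers one after another; fail if any of them fails.
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def sep_by(item: Parser, sep: Parser) -> Parser:
    # Zero or more items separated by sep; returns the list of item values.
    def parse(text, pos):
        first = item(text, pos)
        if first is None:
            return [], pos
        values, pos = [first[0]], first[1]
        while True:
            s = sep(text, pos)
            if s is None:
                return values, pos
            nxt = item(text, s[1])
            if nxt is None:
                return values, pos  # trailing separator is left unconsumed
            values.append(nxt[0])
            pos = nxt[1]
    return parse

def between(open_: Parser, body: Parser, close: Parser) -> Parser:
    inner = seq(open_, body, close)
    def parse(text, pos):
        result = inner(text, pos)
        return None if result is None else (result[0][1], result[1])
    return parse

# The grammar: int_list ::= "[" (number ("," number)*)? "]"
int_list = between(char("["), sep_by(digits(), char(",")), char("]"))

print(int_list("[1,22,333]", 0))  # ([1, 22, 333], 10)
```

Changing what is accepted (say, allowing a different separator) means editing the grammar line, not re-threading control flow through hand-written loops.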

Parsing makes a good toy example, but this same technique of finding ways of giving meaning to components is applicable to software in the real world.

> But there are disadvantages to: smaller components means dealing with failure at a much finer granularity.

I am really curious why you see this as a disadvantage. At my current job, I've had the experience of moving from a relatively small backend system that was broken up into discrete message-passing parts to a larger frontend system that was mostly one monolithic project. The former was by far the easier to debug, despite it being much older, less sophisticated, and having less logging, simply because it was easier to isolate and reproduce problems. The queues also made it very easy for us to bring down, diagnose, or scale individual components as they ran in production.
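A toy version of such a message-passing layout, sketched in Python. Threads and in-memory `queue.Queue` objects stand in for the separate processes and real message queues the comment describes, and the two stages (`int` parsing and squaring) are arbitrary placeholders.

```python
import queue
import threading

STOP = object()  # sentinel telling a worker to shut down

def worker(inbox: queue.Queue, outbox: queue.Queue, transform) -> None:
    # Each component only knows its inbox and outbox, nothing else.
    while True:
        message = inbox.get()
        if message is STOP:
            outbox.put(STOP)  # pass the shutdown signal downstream
            return
        outbox.put(transform(message))

raw, parsed, results = queue.Queue(), queue.Queue(), queue.Queue()

stages = [
    threading.Thread(target=worker, args=(raw, parsed, int)),
    threading.Thread(target=worker, args=(parsed, results, lambda n: n * n)),
]
for t in stages:
    t.start()

for text in ["1", "2", "3"]:
    raw.put(text)
raw.put(STOP)

for t in stages:
    t.join()

output = []
while True:
    item = results.get()
    if item is STOP:
        break
    output.append(item)
print(output)  # [1, 4, 9]
```

Because a stage sees only queues, a problem can be reproduced by putting a captured message straight into that stage's inbox, which is one plausible reading of why the queued system was easier to isolate and debug.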

I couldn't see this as anything but an advantage, and one that's well worth the added complexity once you pass a size threshold.

And sure you can just go cowboy and say "I'll make it simple anyway!," but firstly, interface-enforced simplicity is not the only reason you would go with queues and RPC (other reasons have been mentioned in the article or by myself), and secondly, it is much more difficult than you think to enforce such abstract common disciplines on a large project.

I'm specifically talking about failure of components here rather than bugs. When you have more components that can fail independently, dealing with failure is necessarily more difficult, for the simple reason that there are more ways that you can fail.
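One way to put a number on this, as a back-of-the-envelope Python sketch. It assumes every component must be up for the system to work and that components fail independently. Both are simplifying assumptions, and the 0.999 figure is made up for illustration.

```python
def system_availability(component_availability: float, components: int) -> float:
    """Availability of a chain in which every component must be up.

    Assumes independent failures, which real systems rarely satisfy exactly.
    """
    return component_availability ** components

for n in (1, 5, 20):
    print(n, round(system_availability(0.999, n), 4))
```

Under those assumptions, twenty components at 99.9% each give a system that is up roughly 98% of the time, so the failure handling has to live somewhere.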

> interface-enforced simplicity is not the only reason you would go with queues and RPC

No, certainly not. I shouldn't have implied that that was the case. However, if reliability, maintainability or simplicity is the reason why you are considering breaking up your large components then perhaps there is a better way of achieving your goal.

> it is much more difficult than you think to enforce such abstract common disciplines on a large project.

I'm speaking from my experience working on medium-large programs written exclusively by myself. I don't know that my methodology would scale to a team, but I also don't know that it or something similar to it couldn't.

Modular design at any scale has a natural tension between looser coupling and higher cohesion. If you split up a large code base into many small parts, each part can be simpler and looser coupling between parts may improve maintainability. On the other hand, now you must co-ordinate those parts somehow, and making up for the loss of cohesion introduces a kind of complexity you didn’t have before.

This tension exists at any scale, from a single-developer hobby project up to massive enterprise projects and OSS giants, so I challenge the original premise of this blog post that having a large code base is the root cause of the problem. Going too far in either direction can result in absurdity, whether that’s “enterprise software” levels of boilerplate (too much tight coupling) or DLL hell and typical Linux package management (not enough cohesion).

Microservices are not necessarily bad, but one should also be aware of the drawback of such an approach. If there is a very tight coupling between two modules, you will often find yourself having to keep making changes between 2 different modules. The typical process goes like this:

1. While working on module 1, you realize you need something from module 2

2. Open module 2, add new feature and publish changes

3. Go back to module 1, test new feature and resume work

This process is fine once both modules 1 and 2 have matured but painful to deal with while the APIs are still taking shape. Hence it makes sense to keep a good abstraction between potential components and spin them off as an individual service only when they're stable enough.
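A sketch of what "keep a good abstraction and spin it off later" might look like in Python. `PriceService`, `InProcessPrices`, and `order_total` are hypothetical names. The idea is that module 1 depends only on the interface, so module 2 can live in the same code base while its API is still changing.

```python
from typing import Dict, List, Protocol


class PriceService(Protocol):
    """The boundary module 1 depends on; could later be backed by a remote call."""

    def price_of(self, sku: str) -> int: ...


class InProcessPrices:
    """Module 2, kept in the same process while the API is still taking shape."""

    def __init__(self, table: Dict[str, int]) -> None:
        self._table = table

    def price_of(self, sku: str) -> int:
        return self._table[sku]


def order_total(prices: PriceService, skus: List[str]) -> int:
    # Module 1 code: knows only the PriceService interface.
    return sum(prices.price_of(s) for s in skus)


prices = InProcessPrices({"apple": 30, "pear": 45})
print(order_total(prices, ["apple", "apple", "pear"]))  # 105
```

Once the interface settles, a client class that makes network calls could replace `InProcessPrices` without touching `order_total`.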

This is probably another way of saying Service Oriented Architectures (SOA) work best for the enterprise. They probably already know that we should have all functionality in coarse, self-contained services.

But often the plumbing required in the form of web services becomes really painful to leverage. For instance they require creating complex WSDLs and workarounds to prevent timeouts.

SOA is great; people in the Java EE world use it wrong.

Instead of making small isolated services they do one single gigantic WAR file.

Instead of using the right tool for the job, everything is written in Java.

Instead of having services with implemented business logic they do services that convert one DTO to another.

That sucks...

There is value in reading pages of legacy code. It's very common to watch new hires solve an already solved problem. Too many people are allergic to reading code, it seems.

Solving complex problems in the physical world usually results in complexity in the source code world.

It is always overwhelming to jump into a new gigantic code base. Talk to someone who's been on it a while and they won't have the same drowning outlook.

Certainly perpetuating the illusion that all Java code is "enterprisey" and "monolithic" will get tiresome at some point, right? I sure am tired of reading such views.

That's what I thought before, until I met Grails and Groovy, plus their friend Scala. I'm getting really high productivity in those languages when developing web applications: Grails for web stuff and Scala for non-web. Before them, I used to think that Java was only for enterprise and didn't like using it for my web apps, even though it was the language I was most proficient in.

This seems akin to the saying that "everything should be made as simple as possible, but not simpler." Well, yeah. Simplifying to that point isn't exactly easy. And, worse, the act of simplifying your code to fit this description is something that is usually done after you had it working. In other words, instead of solving another customer problem, many folks spin wheels "solving" their own "problems." Even worse, often the solution is taken to that simpler place that the saying warns against.

The problem I see with lots of software is that you don't have an immediate view of the scale. Suppose you're opening a random file. What do you see? Are those the atoms or the gears? You want to see the gears but there is usually no map to point you to the gears. The software is the map - yeah, right.

What are you supposed to do? Find the int main() and then make the program run in your head?

I can make an analogy with a car - I don't know every single piece of it but I can infer from the context. The scale is evident.

SOA is somewhat different to the "microservice" design it seems this author is proposing. I summarised what I could find on this - ideas from James Lewis at Thoughtworks, Dan North at DRW, and Fred George at Forward - in a recent CITCON session: http://www.citconf.com/wiki/index.php?title=Continuous_rewri...

Isn't this "the Unix way"? I have seen this style of small cohesive programs promoted a lot in Linux/Unix literature, so the advice isn't really breaking any new ground. Look at git for instance. The advice is of course sound, but it goes against the "enterprisy" way of doing things, in part because they tend to be using huge frameworks from the get go.

On the other hand, think back to the famous macro- v. micro-kernel debate. The kernel itself is huge, and the "small cohesive service" philosophy of microkernel advocates ended up never really taking over the world.

That's because kernels perform differently from user software (they have more optimizations available), and a lot of big code needs that extra performance.

That fact still hasn't changed, but user level code is getting more powerful (mainly for virtual machines), and computers are still getting faster. So, it's still too early to declare the race finished.

Anyway, none of that has any relevance to how one should organize user level code.

It used to be. It may still be if you hang out on #suckless. But most of the Linux folks these days love their huge monolithic PulseAudios and systemds.

And why would you say systemd is monolithic? Have you looked at the source code for it? Have you looked at how services are configured with it? I would argue it's far more modular than sysVinit.

A good program is 500 lines or less.

I don't think that's true, even if you look at most open source software. On one extreme, Mediawiki is a pretty complicated piece of software, approximately 900k lines of "core" PHP/JS code plus another 1.5 million lines worth of extensions, many of which are pretty mandatory for basic functionality.

Is it "good" or "bad"? I'd argue that it's both. It's an important piece of software but a bad architecture, and it's grown in complexity over the years. I think you could write a piece of functional wiki software in about 500 lines of code, but complexity and features win in the marketplace, even if it's just for "mindshare".

I type at least 60,000 characters before inserting a line break.

I joke, but I've seen some php before that was at least 600 chars before line breaks, I have no idea how they wrote it like that.

Broken 'enter' key; it's the only way I can imagine while clinging to sanity.

They didn't write it: it happened, by which time it was too late...

Editors that wrap long lines for display.

Wow, it seems rather arbitrary. Can you elaborate?

In 500 lines you can write a self-hosting compiler, a checkers AI, a usable text editor, a compiling-to-DFA regex matcher, Bayes net inference, or a ray tracer, to mention the first examples that come to mind. (Ordinary lines, not golfed.) Most programs aren't so polished, but when yours gets 10 times longer than a Lisp compiler, it's worth asking why.

Yes, it's arbitrary and there are of course exceptions and perhaps domains where it may not apply at all. It's just a recurring observation that I've made (mostly but not only) in dynamic languages.

Pretty much all good code that I've read or written was compartmentalized into units of roughly 500 LOC. A big program may be composed of many such units, but it was almost always a bad sign when a divisible part would exceed the "magic" number.

What comprises a divisible part of course also varies by language; at the least it'd be the LOC-per-file, but usually it'd be a self-contained and separately tested module.

In a moment someone will probably come up with a great piece of software where this doesn't hold true, I'd actually be curious to see it.

And a program is usually made of many units, so it is much larger than 500 LOC.

So, you actually mean functions instead of programs?


I mean an isolated unit that could be ripped out at any time and would be immediately useful on its own.

Anyway, abecedarius (below) has phrased it better than I could.

Those 'isolated units' are commonly called libraries then.

No. They are called "modules".

Yes, I agree with that name, libraries usually start as modules and some extra work is required to convert a module to a library.

Also, a library can be thought of as a collection of modules.

I'm not sure I agree. How about: "there's no such thing as a bad short program".

Depends on the problem. Sometimes a "short program" means omitted error handling, lax input validation, not covering necessary edge cases and generally the sort of thing that leads to bugs and security holes. I prefer the word "concise"- brief but comprehensive.

At least if the program is short then you can work out that this is the case.

Beyond complexity management, breaking a design into many small programs opens it up to a rich set of well-known and proven OS services. Things like hardware memory protection, multi-processor support, queues, mutexes, monitoring, and cross language support. I'll take Unix over some language+standard library _any day_.

And it also opens it up to a rich set of well-known problems: sharing between processes and memory management, lock issues, system-call latencies, not to mention dealing with the monolithic environments that are almost definitely changing between each instance of your program you want to run (this 'environment' includes the shell, userspace services, kernel version and features...). Decomposing a program into multiple programs isn't always a good idea; there is a very broad trade-off here that needs to be evaluated for the needs of every program.

Absolutely. But it's better from a design perspective to start as pure as possible and then make concessions when you find your back against the wall.

Most of the "it's too hard and there's too much to do!" crowd doesn't understand the benefits of working clean.

"Open source that libraries if possible and you’ll get the feedback from the peers."

Does anyone have real experience with this (open sourcing a core piece of infrastructure and finding that others have found it, used it and provided feedback)?

My current theory is that a good program should compile in less than 1 second into an executable of less than 1 megabyte.

My current theory is that this is an arbitrary one. It's really, really, really easy to cross a megabyte with statically linked libraries.

I think the grandparent intended that the size measurement exclude statically linked libraries or assets, debug symbols, and compression technologies like UPX.

It can sometimes be beneficial from a distribution/deployment standpoint to have everything in one self-contained file. But you can't conclude much about the code quality of e.g. a computer game engine based on how many megabytes of graphics, music and sound effects a particular game based on that engine uses.

The rule is not meant to be universal, and I don't say other people should adopt it, but I think it's suitable for the work that I am doing right now.

Constraints like this can really shape a piece of software, for better or for worse. My inspiration is having worked with a really powerful firmware system that had a hardware constraint to fit on a 1MB flash chip, everything included, and was done so well that it looked easy. Give yourself unlimited space and it's much easier to end up with UEFI...

I suspect quite a few programs out there would have turned out better if their authors had picked a semi-arbitrary maximum value for lines of code / bytes of RAM / bytes of disk / etc.

The actual rule I'm using for now is:

- 1 second compile excluding dependencies.
- 1 minute compile including dependencies (excl C compiler).
- 1 MB executable including everything except libc and base OS.

He's talking about adding a lot of "moving parts"...

Yes yes yes yes yes yes yes yes yes. This is absolutely true.

The program-to-programmer relationship deserves to be many-to-one. It's a rewarding way to do things. You solve a problem. You add value. It's Done. You may have to go back to a program later to add features, but you don't end up with massive codeballs.

When the program-to-programmer relationship is inverted and becomes one-to-many, you get the enterprise hell with no feedback cycle, terrible code, and unnecessary complexity. It's not rewarding. Problems are never solved and software is never Done. Requirements are "collected", bundled into an incoherent mess, and delivered to bored, underachieving developers who never get to see their programs actually do anything.

Large problems that require more than one person need to be solved with systems and given the respect that systems deserve. Single-program approaches are a denial of the complexity (that comes whenever people have to work together) and a premature optimization.

I wrote about the political degeneracy that this creates: http://michaelochurch.wordpress.com/2012/04/13/java-shop-pol... . But it's unfair to associate it with one language. It's not that Java is any more evil than C# or C++. Any company that calls itself an X Shop is doomed.

There are cases where large single programs deliver value. For example, most people experience a relational database as a single entity. There are a lot of requirements (performance, persistence, transactional integrity, concurrency) that are technically very difficult to meet and all have to work together. I will also note that it has taken some very smart, very well-compensated, people decades to get that stuff right. The quality of programmers who tend to stick around on corporate big-program projects is not high enough to even attempt it, though.

So why is big-program development winning? There are a couple reasons for that. First, it gives managerial dinosaurs the illusion of control. If programs are Giant Things that can be measured in importance by "headcount", then executives can direct the programming efforts... which they can't do if the programmer's job is to go off and independently solve technical problems they deem to be important. Second, big-program design gives a home to mediocre programmers who wouldn't be able to build something from scratch if their lives depended on it but who, in teams of 50, might be as effective as 0.37 good developers. It's about control and a failed attempt to commoditize programmer talent, but it doesn't actually work.

>So why is big-program development winning? There are a couple reasons for that. First, it gives managerial dinosaurs the illusion of control.

So why do larger programs 'win' in the open source world as well[1]? Pop psychology about management doesn't seem sufficient to explain the phenomenon (although I'm sure it is a good way to sell a '101 habits of highly effective managers' book or get paid to give talks about management). Large systems are large, breaking them up into smaller pieces doesn't change that, but it makes navigating the code base harder (although I assume you don't care about that since judging from your blog posts you don't think tooling is important). It makes your interfaces less malleable (can be good can be bad), and moves a lot of communication to places where the compiler can't warn you about mistakes (again, if you don't care about tooling I guess this doesn't matter… but I would argue that this is bad).

It seems to me like a system of many small separate processes is basically dynamic OOP. Everything is late bound, dynamically typed and async. It's easy to make changes and also easy to break things. You can argue that this is better for certain problems, but I don't think it's universally better, and the community seems pretty divided on the issue too: look at the popularity of Go, statically typed and building concurrency into the language rather than using the OS like in the older C world.

As an aside: surely the web developer community is eventually going to grow tired of talking about how terrible Java is and how $idea_of_the_moment is good because it's 'not java'? As an outsider, the obsession seems extremely unhealthy, and it leads you to bizarre places like arguing against automated refactoring or interactive debugging or static type systems just because those things are associated with Java. I guess to maintain credibility I also need to point out that I don't use and have never used Java…

[1] Firefox/Chromium vs uzbl, gcc/llvm/clang vs pcc, gdb vs printf debugging, sqlite/mysql vs directories and plain text files, perl vs sed/awk/grep shell scripts, emacs/vim vs ed/notepad etc etc.

I think async architectures are worthy but miss something important - the idea of "typed connections" between processes. Actors passing arbitrary messages doesn't make for an efficient "assembly line." Similarly, Unix-style concurrency is tied to a single low-level protocol, which is not rich enough to adequately describe all data. It described "enough" of all data for 1970s-era tasks, but our needs have outgrown that.

When you set out to architect a customized, typed, async architecture, you end up with the "flow-based programming" style which has been captivating me recently. It tends to reduce to two events per component: "start" and "stop." The "integrated program" appears in a tiny top-level definition that sets up the components and connections. Components are relatively small and reminiscent of "pure" algorithmic code. Where synchronized behavior becomes essential, flows can be split into stages of processing and kicked off in a sequence.
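
A rough sketch of that shape in Python with `asyncio`, under loud assumptions: the `Connection` class, the component names and the `None` end-of-stream marker are all invented here and do not come from any particular flow-based framework. Each component starts when the network launches it and stops when its input closes, and the connection types are only annotations that a static checker could verify, not something Python enforces at runtime.

```python
import asyncio
from typing import Generic, List, Optional, TypeVar

T = TypeVar("T")


class Connection(Generic[T]):
    """A typed, bounded channel between two components (hypothetical helper)."""

    def __init__(self, capacity: int = 8) -> None:
        self._queue: asyncio.Queue = asyncio.Queue(maxsize=capacity)

    async def send(self, item: T) -> None:
        await self._queue.put(item)

    async def close(self) -> None:
        # None is used as the end-of-stream marker in this sketch.
        await self._queue.put(None)

    async def receive(self) -> Optional[T]:
        return await self._queue.get()


async def numbers(out: Connection[int], count: int) -> None:
    # Source component: emits 0..count-1, then signals "stop" downstream.
    for n in range(count):
        await out.send(n)
    await out.close()


async def square(inp: Connection[int], out: Connection[int]) -> None:
    # Small, nearly pure component: runs until its input closes.
    while (n := await inp.receive()) is not None:
        await out.send(n * n)
    await out.close()


async def label(inp: Connection[int], out: Connection[str]) -> None:
    # Changes the connection type from int to str.
    while (n := await inp.receive()) is not None:
        await out.send(f"value={n}")
    await out.close()


async def collect(inp: Connection[str]) -> List[str]:
    # Sink component: gathers everything until the stream closes.
    results: List[str] = []
    while (s := await inp.receive()) is not None:
        results.append(s)
    return results


async def network(count: int) -> List[str]:
    # The tiny top-level definition: only components and connections.
    raw: Connection[int] = Connection()
    squared: Connection[int] = Connection()
    labelled: Connection[str] = Connection()
    _, _, _, results = await asyncio.gather(
        numbers(raw, count),
        square(raw, squared),
        label(squared, labelled),
        collect(labelled),
    )
    return results


def run(count: int) -> List[str]:
    return asyncio.run(network(count))
```

With this wiring, `run(4)` returns `["value=0", "value=1", "value=4", "value=9"]`. Wiring `label` directly to `raw` would still type-check (both are `Connection[int]`), but feeding `labelled` into `square` is the sort of mistake a checker could reject before anything runs.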

This particular style has had a lengthy history of reappearing in numerous domains under various guises, and it has demonstrable effectiveness, but it can also feel alien and more "mathematical." The main issue is that it has a lengthy design/prototyping time of weeks to months, and the initial complexity of the system looks high because you need a decent number of components to do anything substantial. This really, really goes against the "move fast and break things" mainstream, even within open source - everyone wants their project to just _instantly_ accelerate from 0 to 100 in terms of progress, and we've put most of our efforts into a toolchain that makes it easier and easier to do that.

The best remedy at present seems to be to embrace asynchrony, embrace static types, and maintain faith in both - i.e. to have a lot of discipline.

I regret that I have only one upvote to give this post.

On the anti-Java bias, I think the issue is that there's more than one Java culture. There's the horrid commodity developer culture, but that's not the language's fault.

I'm actually a pretty big fan of static typing. You don't get static typing's main benefits in C++ or Java, though. You have to use a language like Haskell or OCaml, or the right subset of Scala, to see the major benefits.

Open-source is a bit different because people choose whether they contribute to a project. The quality of code in the active open-source world is leagues above what you find in typical enterprise codeballs, because of survivor bias. No one has the authority to mandate that code be maintained by others, so the messes are cleaned up by people who actually care, not people slogging through it to keep a paycheck coming.

The big-program methodology of the corporate world is the real evil. In FOSS, the major projects are an unusual set: their code quality is at a level just not seen in the for-paycheck commodity-engineer world, and they are large because they succeeded, rather than the reverse. There's a survivor bias at work, because the best projects are the only ones people pay attention to.

The corporate world is screwy because projects become large or small based on political reasons that have nothing to do with code quality. In the FOSS world, code-quality problems related to growth will be self-limiting because no one has the authority to "force" the program to grow.

I should mention that I've never worked in the corporate world, so my reaction is in that context. I can't talk about the corporate side of things since I've never experienced it.

Also, yes, static typing in Java does look pretty cumbersome. Personally I'm hoping that Rust takes off. I've enjoyed playing about with it over the last couple of weeks, although it has made me less happy using the more dynamic languages I normally use to do real work.

What do you mean by a subset of Scala? Is it an absolute prohibition on certain features ("It will be flagged, you must be able to justify why you used this"), like the Google C++ style guide, or maybe the levels in Cay Horstmann's book (which I think Odersky came up with), L1-3 and A1-3, or something else?
