

Scala Considered Harmful For Large Projects? - RiderOfGiraffes
http://kirkwylie.blogspot.com/2011/01/scala-considered-harmful-for-large.html

======
j_baker
I really disagree with the author in terms of metaprogramming. One thing in
particular jumps out at me:

> Ultimately, a programming language that's useful for large-scale programming
> projects needs to have clear, unambiguous grammar and syntax so that any
> developer familiar with the project and the language can instantly figure
> out what's going on.

Wow. There are just so many things wrong with this statement. Let me enumerate
them:

1\. Scala _does_ have a clear, unambiguous grammar. If it didn't there
wouldn't be a Scala compiler. I think what the author means to say is that
languages need to have _less flexible_ grammars (like Java does).

2\. Just because you _can_ use all kinds of wild things to obscure your code
doesn't mean you should. If you prefer the "boring" way of doing something,
then go for it. You just have to understand that just like anything else in
programming, it's about tradeoffs. Does the benefit gained from using a DSL or
operator overloading outweigh the costs? Sometimes it does, sometimes it
doesn't.

3\. The idea that a programmer (even one who's familiar with a project) can
look at any given piece of code and instantly understand what's going on is a
myth. Even if that snippet of code makes perfect sense, you also need to
understand the state the program is in before that piece of code executes.
Again, it's about tradeoffs. You need to choose the method that allows
programmers to understand code more quickly. Sometimes using metaprogramming
is the way to do that.

I will grant the author that there are valid reasons not to choose Scala, but
I don't think he's hit them.

~~~
RodgerTheGreat
Regarding 1, there are languages that do not have a clear, unambiguous (to my
reading, _formalized_) grammar yet possess compilers. Examples include
classical FORTRAN, Forth and (if I'm feeling particularly trollish today) C++.

~~~
j_baker
Fair enough. But in that case, the entire argument is moot. After all neither
Java nor Scala (as far as I know) are formalized. :-)

~~~
chalst
The Scala language report does include an Extended-BNF syntax for the
language, weighing in at 7 pages.

------
fogus

> The problem is that if you have a language designed to make it easy to do
> this, you encourage people to do it.

As the author of the BASIC DSL[1] in question, I must say that it was
decidedly _not_ easy to do. It would appear that Baysick is the number one
reason that Scala hasn't swept the enterprise.

[1]: <http://github.com/fogus/baysick>

------
plinkplonk
In the comments the author says, "(I am) just saying that on balance for a
large project you shouldn't willy-nilly start adding in Scala."

I think I can do better than that.

"I am just saying for a large project you shouldn't willy-nilly do X". You
fill in the X.

The "willy nilly" weasel word takes care of all possibilities and prevents
counter argument. What a dumb blog post (and argument).

------
rst
By his argument, Ruby is _really_ bad for large-scale projects, due to its
"language flexibility". (Lisp is just as bad, and Python not much better.)

All I can say is that this hasn't been my experience. It's not just that Rails
makes good use of Ruby to define and use several internal DSLs (for
validations, associations, etc.), or that many plugins do the same for other
purposes (HAML for markup). The large Rails app I use at work also has a few
special-purpose internal DSLs of its own (for instance, to express
permissioning restrictions), and they've made some things a whole lot easier
to deal with.

These are features that can make a real mess, if abused, but bad programmers
can make a mess in any language. The only real cure for that is to have
programmers who have good enough judgment to avoid mistakes most of the time,
and code review processes solid enough to catch the mistakes they make on
their inevitable off days.

(Besides which, even raw Java isn't entirely free of "undesirable" flexibility
these days. There are lots of setups these days in which annotations are used
to augment or alter core language semantics, with the aid of runtime bytecode
generation. I happen to think that's a good thing for their users. But I
suppose one could disagree.)

~~~
raganwald
_bad programmers can make a mess in any language_

My experience is that inexpressive or hobbling languages encourage good
programmers to make a mess as well! To give an absurd example, I remember
working with a dialect of BASIC that did not support recursion (this would
have been the 1970s). To write a "Towers of Hanoi" solver, I had to greenspun
my own procedure argument stack. While this is a useful learning exercise,
greenspunning your own language features in an inexpressive language
inevitably leads to half-baked implementations.

And that is what I see with modern Java: lots and lots of things greenspun on
top of the language in half-baked ways. There is just as much functionality
code in a complex Java application as there is in a complex Scala application,
but in the Java application the domain-specific code sits on top of a rickety
tower of awkward implementations and work-arounds.
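
For illustration only (this is not code from the comment above), the hand-rolled argument stack it describes can be sketched in Python. The function name `hanoi_iterative` and the peg labels are hypothetical; the point is that each stack entry stands in for the arguments a recursive call would have received.

```python
def hanoi_iterative(n, source="A", target="C", spare="B"):
    """Solve Towers of Hanoi without recursion, using an explicit argument stack."""
    moves = []
    # Each frame holds the arguments a recursive call would have been given.
    stack = [(n, source, target, spare)]
    while stack:
        count, src, dst, tmp = stack.pop()
        if count == 0:
            continue
        if count == 1:
            moves.append((src, dst))
            continue
        # Push in reverse, so frames are popped in the order recursion would run:
        # 1) move count-1 discs src -> tmp
        # 2) move one disc src -> dst
        # 3) move count-1 discs tmp -> dst
        stack.append((count - 1, tmp, dst, src))
        stack.append((1, src, dst, tmp))
        stack.append((count - 1, src, tmp, dst))
    return moves


if __name__ == "__main__":
    for src, dst in hanoi_iterative(3):
        print(f"move disc from {src} to {dst}")
```

Even in this small sketch, the call order has to be encoded by hand in the push order, which is the sort of bookkeeping a language with recursion does for you.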

------
praptak
_"Ultimately, a programming language that's useful for large-scale programming
projects needs to have clear, unambiguous grammar and syntax so that any
developer familiar with the project and the language can instantly figure out
what's going on."_

I have a hard time swallowing the necessity to rely on restricting the
expressive power of the language to prevent people from creating code that is
hard to understand. My current project is on the order of 100 KLOC in Python
(the horror, you can even redefine methods of _single objects_ at runtime!)
and so far none of the problems we've got can be tracked to language abuse.
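
A minimal sketch of the feature mentioned in the parenthesis, for readers who have not seen it. The class `Greeter` and the function `shout` are hypothetical names made up for this example; `types.MethodType` is the standard-library way to bind a function to one instance.

```python
import types


class Greeter:
    def greet(self):
        return "hello"


def shout(self):
    return "HELLO"


a = Greeter()
b = Greeter()

# Rebind greet on one object only; the class and other instances are untouched.
a.greet = types.MethodType(shout, a)

if __name__ == "__main__":
    print(a.greet())
    print(b.greet())
```

Only `a` picks up the new behaviour; `b` and any later `Greeter()` still use the method defined on the class.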

~~~
tomh-
Well, take a look at this sample
(<http://dispatch.databinder.net/Common_Tasks>) and see if you can figure out
what all those operators do just by looking at them :)

That's the kind of abuse possible with languages such as Scala, Haskell or C++
templates.

~~~
fogus
Take a look at this sample
([http://static.springsource.org/spring/docs/2.5.x/api/org/spr...](http://static.springsource.org/spring/docs/2.5.x/api/org/springframework/aop/package-summary.html))
and see if you can figure out how to use these classes just by
looking at them.

That's the kind of abuse possible with languages such as Java.

\---

Seriously. Every new API that you intend to use has a learning curve, some
more significant than others. Every new API is a new "language".

~~~
Stormbringer
It's not really a fair objection. AOP isn't part of Java, it is an ugly hack
that someone bolted onto the outsides of it.

Seriously. You write your Java code, you compile it, and then in the night the
evil AOP pixies sneak in and slip chunks of bytecode into your compiled
classes.

~~~
scott_s
The library tomh linked to is not a part of the Scala language or the standard
library.

------
raganwald
tl;dr:

 _The author has a set of views about the benefits of hobbling programmers on
large projects, which he takes as axiomatic. He espouses Java as a language
suitable for performing said hobbling. He convincingly argues that permitting
programmers to add Scala code to large Java projects undoes the hobbling
provided by the Java language._

I can't argue with his conclusions, but I'll fight his premises to the death.

~~~
Dylanlacey
You put it much better than I could.

I feel the entire article was just linkbait. Scala and Java could be replaced
with .NET 3.5 and .NET 4.0. Or C++ and F#. Or any other language pair. And it
would still be sorta right, maybe, if you assume your programmers are just
average, not overly talented developers, who don't know how to
compartmentalize code.

------
dkarl
I'm learning Scala now and am impressed by its difficulty. Not since I learned
C++ in college have I needed to spend the same amount of time and effort
learning a language, and the arguments given here against Scala are exactly
the arguments you hear against C++.

That's heartening to me. C++ is a great language for people who 1) know the
language, and 2) use it sensibly (which often means humbly.) There's all kinds
of stuff you can do with C++, including metaprogramming magic that gives you
DSL-like capabilities a la Boost Spirit.

I can't imagine using Boost Spirit in production code myself. Maybe there's a
good use for it somewhere -- probably so -- but I really don't care if Boost
Spirit is ever a good idea anywhere. Maybe it's just an abomination; I don't
care! All I care about is that I can make a good judgment about allowing it
into my codebase.

I'm going to approach Scala with the same attitude.

~~~
scott_s
I used Boost Spirit in my dissertation project for the parsing:
[https://github.com/scotts/cellgen/blob/master/src/cellgen_gr...](https://github.com/scotts/cellgen/blob/master/src/cellgen_grammar.cpp#L289)

I would not choose it again. With what I wanted to do, the performance was
poor. Spirit just had to deep-copy the parse tree it was generating way too
much. I liked having all of my source in C++, which meant not having to embed
C++ in ANTLR or yacc. But I'm not sure that was worth it. Perhaps the newer
version fixes that problem, but I doubt I'll ever test it out for sure.

Luckily for me, the performance of the compiler itself was not really a
priority. And it was a research project, so it didn't really matter.

Anyway, I know your point was broader than Spirit specifically, but I thought
you may be interested in my experiences.

~~~
munificent
Is there a reason you didn't just handcode a recursive descent parser?

~~~
scott_s
Yes. It's a waste of time. Why should I hand-code a parser for C when it's a
solved problem? Note that the source code I linked to calls into an already
existing Spirit grammar for C (which I made some modifications to, mainly so I
could get more information back from the parse tree).

Hand coding a parser is generally not a good use of your time. There are many
tools available to generate a parser based on a BNF grammar. I used my project
as an opportunity to learn one of them.

------
protomyth
I have a very tough time figuring out how having a DSL is much different than
having a huge project-specific class library. You are going to have to learn
project-specific stuff, and DSLs flow better for me than APIs.

------
Sandman
The title is misleading. The point that the author is actually trying to make
is that _adding_ Scala to an existing project can be harmful. He does not say
that Scala is inherently harmful for large enterprise projects, in fact, he
explicitly states the exact opposite:

 _Neither Geir nor I consider Scala inherently harmful to a large-scale
programming project._

I would argue here that adding _any_ language to an existing project would
have at least some of the drawbacks that the author mentioned (developers not
familiar with the language, added complexity...).

------
stephenjudkins
The author is correct that, as a language "in the small", Scala offers many
more ways for a programmer to create hard-to-understand complexity. Though
many things have been conceptually simplified (the distinction between
primitives and objects has been abstracted away so that most developers don't
have to worry about it, for example), there is a whole lot more stuff that makes
it easy to hang yourself. (See implicit conversions and parameters,
destructuring and pattern matching, and the power of the type system). Most of
these things can be used to make code more readable, concise, and flexible,
but can also be used to great ill effect.

When it comes to programming "in the large", like the author seems to be
describing, I couldn't disagree more. Here it's Java--the culture and
ecosystem--that tends towards over-abstraction, huge frameworks that obscure
rather than illuminate intent, ridiculous class names that include
"FactoryFactory", and so on.

I would argue that the second phenomenon--the hugely complex frameworks that
have arisen in the Java world--is a direct consequence of the lack of power
offered by Java, the language. Complex frameworks that do dependency injection
through bytecode generation and XML configuration files really aren't
necessary in the Scala world, where 90% of what you're actually trying to do
can be done using compile-time mixins. The factory pattern can be replaced
with anonymous first-class functions, with a net increase in clarity and
conciseness. Most things that make Scala more complicated in the small can
also be used to replace even more complicated, ad-hoc patterns and frameworks
that make big Java projects such a pain to work on.
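
As a rough illustration of the factory point above, here is a sketch in Python rather than Scala, so it shows the general idea and not the Scala idiom itself. `Connection`, `Pool` and the `db://example` URL are hypothetical names invented for this example. Where Java would typically declare a factory interface plus implementing classes, the pool simply accepts a zero-argument function (in Scala the parameter type would be `() => Connection`).

```python
from typing import Callable, List


class Connection:
    def __init__(self, url: str) -> None:
        self.url = url


class Pool:
    # No ConnectionFactory interface: the caller passes any function
    # that returns a fresh Connection each time it is called.
    def __init__(self, make_connection: Callable[[], Connection], size: int) -> None:
        self.connections: List[Connection] = [make_connection() for _ in range(size)]


# The "factory" is just an anonymous function at the call site.
pool = Pool(lambda: Connection("db://example"), size=3)

if __name__ == "__main__":
    print(len(pool.connections))
```

Whether this is clearer than a named factory class is a judgment call; the sketch only shows that the extra type is not required.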

That said, it could be true that introducing Scala into existing codebases
that already have thousands of lines of Spring XML config might simply
introduce additional complexity, without offering a realistic path to removing
all of the nonsense that supports the legacy Java code.

------
rayiner
The author's argument is specious. You need design guidelines in place for any
language feature. You can say the same thing about interfaces in Java - you
see a call x.doFoo() and you have _no idea_ what code gets called! Well yes,
that's why you formalize your module interfaces and don't let people just
randomly create interfaces between modules, etc, etc.

Even in assembly you can define a function _do_foo that actually does "bar."

You need to know how to architect code with meta-programming in mind, no
doubt. But this isn't any harder than architecting code with any other
language feature in mind. Just as you need to clearly specify the semantics of
your interfaces, you need to clearly specify the semantics of DSLs.

Managed properly, DSLs can have a tremendous positive impact on the
code-base.

I've been mucking around in compilers lately, and one thing that never ceases
to amaze me is SBCL. The compiler is about 55K LOC + 15K LOC for each native
back-end. That's about the same size as the C1 compiler in Hotspot, but does a
whole lot more (ie: includes the front-end of the compiler, while C1 gets
low-level bytecode as input, has to deal with type inferencing and lambda lifting
and all sorts of other things to bring a high-level dynamic language down to
machine code, etc, etc). And it does that with algorithms about 15 years older
(it doesn't have the benefit of SSA form which would massively simplify many
of the optimizations SBCL performs).

Macros are a huge part of how SBCL does it. There are 123 macro definitions in
the compiler, and they do a tremendous job in cutting code by making DSLs for
things like specifying code generators, etc. As for readability - because it's
well designed and well-commented, it's surprisingly maintainable and readable
for a code-base that's 25 years old and has been maintained by so many
different people over the years.

------
jrbran
This is really, _Scala (and any of those extra JVM languages) considered a
harmful addition for large Java projects_ which makes a lot of sense. It
surprises me a bit that it needs to be said, and said in such fashion.

~~~
technomancy
The obvious comeback being large Java projects considered harmful.

~~~
Semiapies
Obvious but not worthwhile, as the issue isn't Java, it's multiple languages
on a project.

~~~
Stormbringer
But I think this argument with respect to Java is flawed anyway, or was for a
long time. EJB 3 and JPA are enormously better, but back in the bad old
days of J2EE you had these 'no go areas' anyway. You kind of slapped a couple
of interfaces down, and then hoped that the magic code generator of the App
Server would do the right thing (if configured properly).

How is that any different from having a "if you don't know what you're doing,
don't mess with these bits" area for the Scala stuff to sit in?

J2EE isn't the only example of this. Plenty of frameworks had their 'here be
dragons' sections, the fiddling with which was considered a dark art.

~~~
Semiapies
I'm no fan of the language myself, but you're going overboard to try to dig a
reason to trash Java out of this.

~~~
Stormbringer
??? I'm not trashing Java. Did you read the article? I was making an analogy
between the risk of having a language like Scala, which would mean there was
stuff the Java-only people would not understand (which is what the article
claims), with the black box auto-generated code produced by early J2EE (which
is similarly a no-fly zone).

Believe me, you'll know when I'm trashing Java, because I'll use the g word.

~~~
Semiapies
_"...with the black box auto-generated code produced by early J2EE"_

If this comparison _were_ analogous, your point would make no sense, since it
would boil down to _in the bad old days, we got stuck with code we couldn't
understand and lived with it, so_ now, _code we can't understand isn't a
problem_.

And it's not analogous. We're not talking about black-box generated code
nobody on the project understands, but instead people on the project making
code that other people on the project don't understand. To harp on the one
detail of incomprehensible code ignores _huge_ differences in those
situations. So, it's gratuitous to evoke that.

~~~
Stormbringer
Not to pick on you, but I've worked on plenty of Java projects where there
were sections of the code that some of the programmers either didn't
understand or didn't want to touch with a bargepole because they were 'too
scary'.

Taking the example of J2EE, on a sufficiently large J2EE project (where
'sufficiently large' is 3 or more people) it would be trivially easy for a
Java programmer who didn't know J2EE to work on it - since there's usually a
lot more to it than just the back end persistence bit. (Front end, middle
layer services, business illogic etc)

Or say they are using both an obscure database and Hibernate and I spend a
significant amount of time trying to get them to agree on how date formats
will be read and written (don't laugh, nobody else could figure it out). When
I do get it working, does everybody else on the project magically absorb this
newfound expertise via osmosis or something? Don't be ridiculous! Only I have
the understanding of it, and if the others know what is good for them they
will not fiddle with it.†

Naturally I'll liberally sprinkle it with comments like:

// if you want to retain your SAN points, don't mess with this

In fact, this thing of someone working on something that is really tricky and
then when they do get it working warning off the other programmers from
messing with it is extremely common in larger projects. One of the benefits of
OO code is that you _can_ segment the 'nasty stuff' off from the rest of the
code so that it can be safely ignored. And this doesn't just apply to senior
programmers doing stuff too scary for the junior programmers, it can work the
other way too - as a senior member of the team I'm quite happy to let someone
junior with lots of 'fire in their belly' tackle the ugly ugly task of getting
the config files set up, and if they've done it right I never even have to
look at them (don't be outraged, it's called delegation).

If you have to twiddle your ant script every time you want to compile your
code, you ain't doing it right.

\---

† the corollary of this is: that there is _always_ a fiddler.

But when they break it you're allowed to laugh at them and then tell them to
put it back the way it was when they found it.

~~~
Semiapies
_"I've worked on plenty of Java projects where there were sections of the code
that some of the programmers either didn't understand or didn't want to touch
with a bargepole because they were 'too scary'."_

And this is _undesirable_, is it not? Encapsulation aside, sometimes people
get shuffled around in projects.

It's one thing to say, "OK, junior developers, don't mess with the tricky
stuff in this project." It's another to say, "OK, I've decided to start
writing this part of the project in a language the rest of you _don't actually
know_."

------
pgroves
I know everyone loves their high powered languages, but I'm afraid the author
is right. The more developers, the more conformity you need. I'm doing a
personal project in OCaml, and I absolutely love it. I like to think it keeps
my medium size code base from ever turning into a large code base.

But OCaml has so many features that two programmers can write in such
different styles that it might as well be a different language. A team of
maybe 5 people with a good attitude could probably work on the same code base.
The same is true for lots of new languages like Scala and Clojure and even
Ruby.

Knowing what I know now, if there's a project that's going to have more than
10 people working on it, I'd hesitate to use anything except Java or C#.

~~~
stcredzero
_The more developers, the more conformity you need._

If you need a language to enforce coding standards, you have some big problems
in your group!

------
hackerku
No matter what the language is, I think it is important to understand that
readability of the code should be part of the design. Scala does make it easy
to write hard-to-read code.

~~~
Stormbringer
Presumably it gets easier with practice. But I think you have to write it, and
there are some idioms and shortcuts that people use, and if you're unfamiliar
with the shortcut then it can be hard to figure out.

I had a similar problem with Objective-C. Everyone was saying how easy it
would be for someone who knew Java to pick up, but for the first couple of
years I found the message passing syntax to be incredibly jarring. Note that
I'm not stupid, of course I understand what messages are etc. It wasn't that I
didn't understand the concepts, it was that the syntax of the language kept
tripping my "that looks wrong" buttons.

I think that with programming languages there can be an 'uncanny valley'
effect. Something that looks too close to something else but isn't quite the
same. Java and JavaFX were an example of that for me. I thought they made the
JavaFX too close to Java. Someone who knew Java would keep getting tripped up
on the subtle (and not so subtle) differences - but because it was too close
to Java you couldn't easily tell where the Java ended and the JavaFX started.
You might have a class where halfway down your page of code the rules of
syntax and grammar shift slightly, and then a little bit further on they shift
back...

When I'm mixing two languages I find it easier if they are distinctly
different, it makes the cognitive burden easier. Nobody complains about having
to mix Java and SQL on the back end with HTML and JavaScript on the front end
for instance. There's four languages right there, and the wuss that wrote the
original article thinks two is too scary for most programmers? I know most
programmers are like insects to the giant intellects of those of us at HN, but
even so I think he doesn't give them enough credit.

