
Scaling Scala - amzans
https://www.oreilly.com/ideas/scaling-scala
======
ludicast
What strikes me about the Scala community is how amazingly anti-fragile it is.

Odersky often uses criticism to refine future versions of the language. He is
currently my favorite benevolent dictator.

~~~
happyhippo333
With handling criticism it all comes down to the mental attitude:

Is your project just a tool to get a job done or do you draw some sense of
self-worth and life purpose from it?

The former is a healthy attitude; the latter would make your life miserable.

------
MichaelBurge
Before you start trying to scale with Scala/Spark, consider if my preferred
progression makes more sense:

* < 50 million rows: Can your data fit into a standard RDBMS? Writing all your business logic in SQL might make more sense.

* < 1 billion rows: Can your data fit into a few flat files on disk, with some Python/Perl scripts to manipulate them?

* < 1 trillion rows: Can you use an off-the-shelf solution like Google BigQuery or Amazon Redshift? You can also use this rather than flat files depending on the skillset of your team.

* < 1 trillion rows: Can you engineer smart data structures & algorithms, write some custom C++, tune the compiler flags, inspect the resulting assembly, cache intermediate results on disk, etc. for a small number of core queries?

For anything larger, good luck. You seem to genuinely need a dedicated data
engineering team, and similar tools may very well help you scale up. Even so,
try to keep things simple. A single binary with well-defined inputs &
outputs (even poorly written in C++) is intrinsically simpler than a
distributed network (even if the Scala code itself is simpler and
well-written), just due to deployment considerations.

If you need things to be accessible in real-time from a website, that may also
require a sophisticated solution. Consider if you can queue the query and use
one of the above approaches, however. It's even harder to deploy a real-time
distributed query system with worst-case performance guaranteed to be
queryable from the front-end of your website.

~~~
vikiomega9
I'm averse to writing SQL simply because it's a bit harder to test business
logic with, and things get hard to read very quickly. Also consider the case
where my logs are in JSON for whatever reason: using SQL just makes things
harder, especially when I need to run some sort of transformation.

~~~
tormeh
Also SQL is in clear text, which invites small manual changes outside of
version control.

------
spenrose
Two points: article is 5 months old, and it's fantastic: informed journalism
rather than "one coder's opinion."

------
blackoil
One of the main problems with Scala is that it has a clusterfuck of niche-use
features, which even experienced developers have problems using.

The Scala core library is littered with :: +: and other nonsensical operators.
Things like implicits may have a few good use cases, but are mostly abused and
make code hard to read. Blindly following immutability even in local function
variables and the stack makes code slow and unreadable.

~~~
RBerenguel
Eventually you get to understand these operators (and, after all, they
essentially mean append/concatenate/join in one sense or another... so it's
basically add stuff to other stuff), and as for implicits, they make code much
easier to follow if they are used sparingly and for good reasons. What makes
Scala code (with implicits) hard to follow can be type classes, and
understanding how all the pieces of the language fit together in these cases.
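
To make the "add stuff to other stuff" point concrete, here is a minimal sketch of what the most common of these operators do on standard library collections. The object and value names are made up for illustration; only the operators themselves come from the Scala standard library.

```scala
// Hypothetical demo object; the names are illustrative only.
object ListOperatorsDemo {
  val base: List[Int] = List(1, 2, 3)

  // `::` prepends one element to a List. Operators ending in a colon are
  // right-associative, so this is really a method call on `base`.
  val consed: List[Int] = 0 :: base

  // `+:` prepends one element to any Seq; the colon faces the collection.
  val prepended: Seq[Int] = 0 +: base

  // `:+` appends one element; again the colon faces the collection.
  val appended: Seq[Int] = base :+ 4

  // `++` concatenates two collections; `:::` does the same for two Lists.
  val joined: List[Int] = base ++ List(4, 5)
  val joinedLists: List[Int] = List(-1, 0) ::: base

  def main(args: Array[String]): Unit = {
    println(consed)
    println(prepended)
    println(appended)
    println(joined)
    println(joinedLists)
  }
}
```

The rule of thumb is that the colon always sits on the side of the collection, which is most of what there is to memorize.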

As for immutability making things slow... Well, personally I start writing
code for humans (and immutability makes the code much easier to read for
humans). _If_ speed is an issue then I consider dropping immutability. So far
I have had no issues with Scala. Also, depending on how much speed I need, I
wouldn't even think about using Scala.

~~~
killin_dan
That's what they say about Perl. Keep trying and it'll make sense!

Granted, Scala can be written to read much easier than Perl, but that argument
in particular doesn't carry much weight imo.

~~~
RBerenguel
The thing is, in Scala you can do really crazy and complicated stuff (see
Shapeless, Cats, etc) that in other languages (with exceptions, of course)
would be extremely hard or downright impossible.

In Scala, it is just hard but doable, which implies code that is hard to read
because what you are doing _is_ hard. This does not detract from doing as much
as you can to make it readable, but sometimes it is impossible, or at least
impossible for someone not used to it.

I'm not good enough with Scala to understand some of it (yet, I hope) but this
is something that is bugging me lately. Some things (in general, I'm not
talking just about programming here) are _hard_. No matter how much sugar-
coating, some stuff is hard to understand, complex and not everyone can get to
the bottom of it. It just happens in some programming languages, happens in
most of mathematics, physics and computer science.

------
jakozaur
The main long-term problem of Scala is that nobody is making money on it.
Lightbend (formerly called Typesafe) tried, but hasn't been that successful
and pivoted to other tools (Akka).

Java also stagnated, because money-obsessed Oracle is not earning much money
on it.

~~~
BatFastard
Good point. Seems like a language either needs to be non profit and open
sourced, or profitable and closed source.

Non profit and closed source seems like a formula for doom and destruction.

~~~
hocuspocus
> Non profit and closed source seems like a formula for doom and destruction.

Which is not the situation here.

~~~
BatFastard
It seems very hard to make money off of a language unless you can get support
contracts from corporations, which in a language-rich world is very hard.

~~~
hocuspocus
No company is trying to sell Scala on its own. Several startups emerged around
major open source Scala projects and are doing fine. Lightbend employs core
contributors to the compiler and stdlib. There are also very successful
consulting shops around Scala technologies that are contributing where they
can. The language and its ecosystem haven't seen anything but growth; I'm not
worried.

------
geodel
> (Spark, Mesos, Akka, Cassandra, Kafka), most of which has been written in
> Scala.

Regarding Kafka, I noticed a large amount of Scala API is deprecated and
replaced with Java API in the latest version. There is about twice the Java
code in Kafka as compared to Scala.

Mesos is C++.

Cassandra is Java.

So it seems to me only Spark and Akka are mainly Scala. And in this field I am
expecting Rust to make big inroads with high performance, no JVM-like overhead
for memory safety, and lately an emphasis on async IO with Tokio.

~~~
throwaway91111
I wouldn't expect Rust to take much Scala market share. A big part of the
Scala advantage is being able to take advantage of the massive number of JVM
libraries available.

------
FLGMwt
I'm not quite sure _why_ the Scalaz IRC behavior issue got such a focus here,
but here are the chat logs of one of the incidents:
[https://gist.github.com/betehess/77de5b4b35ae11801936](https://gist.github.com/betehess/77de5b4b35ae11801936)

~~~
killin_dan
Idk, Tony is my programming role model. Fantastic guy, extremely
knowledgeable, explains things in a very approachable way. Love him to death.

He's helped me fix MANY stupid type puzzles, and Scalaz scumbags ran him out
of the entire language.

I mainly do F# and Erlang now :(

Scala has plenty of problems aside from the users. I can't begin to explain
how terrible sbt is to people who haven't suffered its grasp. sbt imo is the
single most critical obstacle preventing widespread adoption. It's awful.

There's too much politics in the scalaverse. The people are too whiny, the
tooling whines even more than the people, and it's a damn shame because it's
really great code at the end of the day. It's the most fascinating
object-functional language I know of, though F# isn't bad at all. Great
tooling too, but poor native support.

~~~
tekacs
[https://github.com/cvogt/cbt](https://github.com/cvogt/cbt) [0] is under very
active, heavy development and seems like a potential beacon of hope to SBT-
haters (including myself).

The CBT author and community seem determined to genuinely make it a completely
viable SBT replacement (it already works pretty well).

Given that during the earlier part of my career starting a new SBT project
would involve 1 - 2 days of wrangling _every time_ (there's always something
simple that needs changing, that's inevitably harder than it need be in SBT),
I can't wait to see CBT mature.

[0]: In their own words: 'fun, fast, intuitive, compositional, statically
checked builds written in Scala'

~~~
cdegroot
It'd be the first build system for any JVM language that would be nice. I
don't know what it is, but they all seem to come with various amounts of
suckage (started with Java when GNU Make was the most advanced option, via
ant, maven, gradle, buildr, sbt, ...). It must be the JVM. I have worked with
Scala for close to a decade now, and my conclusion is that its advantages
(type stuff) are mostly theoretical, and its drawbacks (sbt, type stuff, the
JVM, slow compiler, the mess of features) are mostly practical. I'm happy to
have mostly moved on.

~~~
pimeys
While not on the same level as Cargo, I'd say Leiningen is a very robust and
nice build system for the JVM. Miles ahead of SBT.

~~~
killin_dan
Yeah, but who would want to use Clojure if they had a choice to use Scala?

Lein is what I always wished SBCL had.

------
AheadOfTime295
The community both shrinks and grows. An example of the former:

[https://www.reddit.com/r/programming/comments/6bh8xv/leaving...](https://www.reddit.com/r/programming/comments/6bh8xv/leaving_scala_after_six_years_of_development/)

------
bollockitis
Aren't alternative JVM languages subject to the same potential legal issues
that sparked the Oracle lawsuit? Don't languages like Scala, Groovy, and
Kotlin still make use of many Java APIs?

~~~
ex3ndr
Kotlin does not reimplement the Java API. Instead it has its own "wrappers"
for collections and primitive types. These wrappers are just platform-specific
types that the compiler replaces directly with the native ArrayList and
others. This is more like just using the Java API instead of reimplementing
it. If Oracle sues someone for using their API to build Java apps, it will be
the end for everything.

~~~
sjrd
True, and Scala is exactly the same. It uses Java APIs.
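
As a rough illustration of "uses Java APIs" (a sketch, not a statement about how the compiler is implemented): Scala code can instantiate and call JDK classes directly, and Scala's `String` is simply `java.lang.String`. The object and method names below are hypothetical.

```scala
// Hypothetical demo object; only java.util.ArrayList and java.lang.String
// are real JDK classes, everything else is named for illustration.
object JavaInteropDemo {
  // Build a plain java.util.ArrayList from Scala, calling the Java API
  // directly rather than going through any reimplementation of it.
  def javaListOf(items: String*): java.util.ArrayList[String] = {
    val list = new java.util.ArrayList[String]()
    items.foreach(item => list.add(item))
    list
  }

  // Scala's String is an alias for java.lang.String, not a separate type.
  val sameStringClass: Boolean = classOf[String] == classOf[java.lang.String]

  def main(args: Array[String]): Unit = {
    println(javaListOf("a", "b"))
    println(sameStringClass)
  }
}
```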

~~~
vorg
Apache Groovy is also the same.

~~~
happyhippo333
Why is it that in every discussion about Scala some Groovy fanboy pops out?
Sometimes with a ridiculous claim like "Groovy is more popular", or "Groovy
has better Java interoperability".

~~~
dang
This account's comments have already been violating the HN guidelines. We ban
accounts that do that. Please read the following and post civilly and
substantively, or not at all.

[https://news.ycombinator.com/newsguidelines.html](https://news.ycombinator.com/newsguidelines.html)

[https://news.ycombinator.com/newswelcome.html](https://news.ycombinator.com/newswelcome.html)

