
Scala: Consider syntax with significant indentation - rolodato
https://github.com/lampepfl/dotty/issues/2491
======
FLGMwt
Having just moved to my first Scala team, I would really really not like this
to be in the language. Given 5 ways to do something, a large organization is
going to do something all 5 ways.

Scala is a fun language, but consistency is _not_ one of its ecosystem's
strengths.

Grain of salt: again, I'm new to Scala, but I'd hate to think of it as a
language that can only be good once you're Stockholm'd.

~~~
emsy
There should be a styleguide that is enforced by CI tools, regardless of
language, so that's not an issue.

However, special cases in the syntax increase the complexity and I would
always argue against that.

~~~
FLGMwt
Yeah, and there are certainly style tools in place at some level (formatting,
implicit types, etc). But I don't know if I agree that there should be global
language rules about what defines semantic style across an org with 20+
relatively independent teams.

And that says nothing about third party libs that one can't control. Again, I
am new to the ecosystem, but with languages like C#, it seems that the
comparatively fewer language constructs streamline API patterns.

~~~
emsy
I don't know about how your organization works, but any one development team
with a shared codebase should enforce coding styles. This means indentation
but also how common constructs are used. Especially if someone external might
work on the code. Some standards are just necessary to achieve uniformly
looking code. But some might prevent bugs (for example, mandatory curly braces
after ifs). It's also just a simple tool to prevent friction and create a
sense of shared code ownership.

External libraries are usually black boxes and you don't work on the code so I
don't understand the argument.

------
akkartik
In a previous project I spent a lot of time thinking about significant
indentation
([http://akkartik.name/post/wart](http://akkartik.name/post/wart)), so I'm
glad to see it getting more mind-share. However, the end comments are
absolutely insane. Here's a counter-proposal: if you want optional end
delimiters, _make them not look like comments_. Then the relevant examples in
OP would look like this:

    
    
      def f =
        def g =
          ...
          (long code sequence)
          ...
      end f  // optional
    
      while
        println("z")
        true
      do
        println("x")
        println("y")
      end while  // optional
    
      package p with
        object o with
          class C extends Object
                     with Serializable with
            val x = new C with
                def y = 3
    
            val result =
              if x == x then
                println("yes")
                true
              else
                println("no")
                false
          end C  // optional
        end o  // you guessed it: optional
    

_Edit_ : somebody already brought this up in the comments on Github:
[https://github.com/lampepfl/dotty/issues/2491#issuecomment-3...](https://github.com/lampepfl/dotty/issues/2491#issuecomment-302930123)

------
tekacs
Folks commenting on the willingness to make breaking changes in 'Scala' should
note that this is Dotty, an upcoming very eventual breaking replacement for
Scala (think Python 3 or perhaps even Perl 6 but less radical than the
latter).

For now, to quote the authors, it's a: 'Research platform for new language
concepts and compiler technologies for Scala'

So this is exactly the right place to test this sort of concept.

~~~
smitherfield
Yes, but they're unusually willing to make significant breaking changes in
point releases as well — how many times now (including, IIRC, one or more
ongoing efforts) have they rewritten the collections API? Yes, that's a
standard library change, not a language change, but still.

And this is a _huge_ breaking change: if they make it you won't be able to
convert your Scala 2.1X code to Dotty/Scala 3 by simply fixing all the
compiler errors; you'll either need a foolproof automated conversion tool or
to hand-audit your entire codebase.

~~~
tekacs
Well they last rewrote the collections API in Scala 2.8 (I think), which was
in 2010. If they redo it in 2.13 that'll have been about 8 years. That's
certainly not C++ levels of backwards compatibility, but it's hardly frequent.

They do make some other breaking changes in 2.xx releases (which come out
about every 1.5 - 2 years!), but I wouldn't really call them point releases,
given that the 2.x hasn't changed in over a decade - that's like saying Python
shouldn't make breaking changes in Python 2.6 -> 2.7. They don't generally
make breaking changes in actual point releases (2.XX.yy).

Also, they're building an automated conversion tool
([https://www.scala-lang.org/blog/2016/10/24/scalafix.html](https://www.scala-lang.org/blog/2016/10/24/scalafix.html))
for Dotty. As I said, compare this to
Python 3. The Scala -> Dotty rewriter should be able to be more complete than
2to3 was however, mostly because they're not fixing many ambiguities like 2to3
was with encoding. Their rewriter is also based on a full sophisticated
framework that can parse or unparse multiple versions of Scala, including
Dotty, in one build.

Hopefully being able to rewrite a much, much higher percentage of code and the
backporting of changes into Scala 2.13+ will make Dotty adoption happen faster
(than Python 3) when it comes.

~~~
erlehmann_
What kind of version number scheme does Scala use where breakage is allowed in
point releases?

~~~
raquo
Epoch.major.minor

------
atemerev
I have used Scala as my primary development language since 2010. My first reaction
to this change was "well, this is the end of it" \-- as I can't read or write
anything in Python because of that.

But I looked more carefully and the proposal is more balanced. It looks more
like Haskell than Python -- and Haskell's syntax is one of the best ever
invented, in my opinion (too bad I am not yet smart enough to casually emit
production-grade Haskell code).

So, I'm fine with that.

------
sscarduzio
Lovely thread to read, these guys are so analytical, well mannered and
mutually respectful. Each comment is well thought out, constructive, detailed,
clear and to the point. But most of all they are proofreading before
submitting. I don't do it all the time.

------
eropple
I spent two or three years heavily using Scala. During that time, I more than
once got to the point where I literally couldn't casually read code I'd
written less than a week before.

This seems like yet another great way to make that worse.

------
Animats
1\. If you're going to make indentation significant, implement it in a way
that makes tab/space confusion impossible. Python 2.7 does this. The check for
indent ambiguity between two strings is:

\- Remove the common leading whitespace of both strings.

\- The remaining parts of both strings must be all tabs, all spaces, or empty.

This is the least restrictive rule which catches all indent ambiguities.
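
A minimal sketch of the check in point 1, written in Rust and reading the rule literally: strip the shared prefix, then require each remainder to be all tabs, all spaces, or empty. The function names are made up for illustration, and this is not taken from any particular compiler.

```rust
/// Returns true when the leading whitespace of two lines can be compared
/// without knowing the tab width, per the rule described above.
/// `a` and `b` are assumed to hold only each line's leading whitespace.
fn indents_are_comparable(a: &str, b: &str) -> bool {
    // Work on bytes so slicing can never land inside a multi-byte character.
    let (a, b) = (a.as_bytes(), b.as_bytes());
    // Length of the prefix shared by both indent strings.
    let common = a.iter().zip(b.iter()).take_while(|(x, y)| x == y).count();
    is_uniform(&a[common..]) && is_uniform(&b[common..])
}

/// True for an empty slice, a run of tabs only, or a run of spaces only.
fn is_uniform(rest: &[u8]) -> bool {
    rest.iter().all(|&c| c == b'\t') || rest.iter().all(|&c| c == b' ')
}
```

Read literally, the rule accepts a remainder of tabs on one side against a remainder of spaces on the other. A stricter variant could also require one of the two remainders to be empty.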

2\. Indent-based syntax is great for imperative languages, but not so good for
functional languages with very long expressions. It's not clear how to indent
stuff like "a.b(x).c(y).d.e(z)", where a-e x-z may be long expressions. In
LISP, the all-parenthesis syntax was so simple, and so hard for humans to
parse without help, that indentation was nailed into EMACS and everybody did
it that way. The indentation wasn't significant, but it was standardized.

Here's word wrap in Rust:

    
    
        s.lines()
                .map(|bline| UnicodeSegmentation::graphemes(bline, true)    // yields vec of graphemes (&str)
                    .collect::<Vec<&str>>())
                .map(|line| wordwrapline(&line, maxline, maxword))
                .collect::<Vec<String>>()        
        .join("\n") 
    

Note that the first "collect" is one level deeper in parentheses than the
second one. How would you do that with indentation only?

~~~
steveklabnik
You can get rid of the second map using itertools, which has join for
iterators, incidentally.

I feel like you can get rid of the inner one too but forget the right
combinator.

~~~
Animats
Off topic - If only that worked.

    
    
        use self::itertools::join;
        use self::itertools::Itertools;
        ....
        s.lines()
            .map(|bline| UnicodeSegmentation::graphemes(bline, true)    
                .collect::<Vec<&str>>())
            .join(|line| wordwrapline(&line, maxline, maxword),"\n")
    

error[E0061]: this function takes 1 parameter but 2 parameters were supplied.

Rust is picking the wrong version of "join". There's one in Iter with one
parameter and one in Itertools with two parameters. Haven't figured out how to
get the one from Itertools yet. The obvious syntax, ".Itertools::join(...)"
gets "error: expected `<`, found `join`".

Without either function overloading or member function qualification, how do
you do this?

~~~
steveklabnik

        use self::itertools::join;
    

This is the free function version; leave it out. I haven't tried this myself,
but I'd bet that's what's going on;
[https://docs.rs/itertools/0.5.6/itertools/trait.Itertools.ht...](https://docs.rs/itertools/0.5.6/itertools/trait.Itertools.html#method.join)
is different from
[https://docs.rs/itertools/0.5.6/itertools/fn.join.html](https://docs.rs/itertools/0.5.6/itertools/fn.join.html),
though.
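
On the qualification question raised above: Rust lets you call a trait method by path, `Trait::method(receiver, args)`, which picks one implementation when several traits offer the same method name. A rough sketch, using two hypothetical traits and a hypothetical `Lines` type rather than the real itertools API:

```rust
// Hypothetical wrapper type, so the slice's own `join` does not interfere.
struct Lines(Vec<String>);

// Two hypothetical traits that both provide a method called `join`.
trait JoinPlain {
    fn join(&self) -> String;
}
trait JoinWith {
    fn join(&self, sep: &str) -> String;
}

impl JoinPlain for Lines {
    fn join(&self) -> String {
        self.0.concat()
    }
}
impl JoinWith for Lines {
    fn join(&self, sep: &str) -> String {
        self.0.join(sep)
    }
}

fn join_both(lines: &Lines) -> (String, String) {
    // `lines.join(...)` would be rejected here as ambiguous (E0034), since
    // both traits are in scope. Naming the trait selects the method.
    (JoinPlain::join(lines), JoinWith::join(lines, "\n"))
}
```

In the linked itertools docs, the method form `Itertools::join` takes only the separator, while the free function `itertools::join` takes the iterable and the separator.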

------
virtualwhys
One advantage of significant whitespace is that it may enable a very concise
(G)ADT notation:

    
    
        enum Tree[T]
          Branch(t: Tree[T])
          Leaf(t: T)
    

Compare that with present-day Scala:

    
    
        sealed trait Tree[T]
        case class Branch(t: Tree[T]) extends Tree[T]
        case class Leaf(t: T) extends Tree[T]
    

In general with this proposal the `case` keyword could be implied in _any_
pattern matching block. That alone would be a big win with respect to reducing keyword
noise, something the MLs have enjoyed for decades.

~~~
JoshTriplett
You wouldn't need significant whitespace to do this; any block syntax would
work. The following, for instance, would still drastically reduce the noise:

    
    
        enum Tree[T] {
          Branch(t: Tree[T])
          Leaf(t: T)
        }
    

The main benefit here involves using a block instead of "extends", and using
"enum" instead of the odd use of a class hierarchy.

~~~
rbonvall
Indeed, there is already a proposal for an enum syntax in dotty that looks
almost exactly like that:

    
    
        enum Tree[T] {
          case Branch(t: Tree[T])
          case Leaf(t: T)
        }
    

[https://github.com/lampepfl/dotty/issues/1970](https://github.com/lampepfl/dotty/issues/1970)

------
smitherfield
I have to say I don't know of any other language where the maintainers have
such a cavalier attitude towards making breaking changes. It makes the common
analogy between Scala and C++ a bit ironic.

Not that that's necessarily a _bad_ thing; it's good to have some popular
languages that follow a less conservative approach, if only to get some
real-world experience with different strategies for dealing with the tradeoffs
between keeping a language modern and maintaining backwards compatibility with
legacy code.

------
royjacobs
I thought Scala was supposed to be following a route where the amount of
"stuff" they include is getting reduced?

The language is idiosyncratic and multiparadigm enough as it is, I'd say.

------
partycoder
Reminds me of the difference between JavaScript and CoffeeScript.

CoffeeScript attempts to become more concise by removing delimiters and making
things more implicit. I don't think that actually adds much value.

In a sense, it was the equivalent of trying to simplify traffic by removing
lane delimiters and street signaling. You could somehow imagine they're there,
but it's better if you can see them.

~~~
Rapzid
This is a relatively small change to how blocks are defined syntactically in a
language that already brings quite a bit to the table over Java. You say
reminds, which is fair, but I'd say it's also quite different.

I'm a huge proponent of TypeScript and critic of using CoffeeScript in 2017,
so while you may not (but may!) agree that TypeScript brings significant value
over raw ES6+, I definitely relate to CoffeeScript not bringing enough value
to the table to warrant such a divergence in syntax. I will say though that
pre-ES6+ it really tidied up a few things like classes, this binding, etc.

I've argued for and successfully migrated CoffeeScript projects to TypeScript,
but I'll be the first to admit it is "ugly" compared to CoffeeScript. F# is a
pretty elegant language IMHO, and its success using significant whitespace
is cited early in the post. If we can have all the added VALUE of Scala AND a
tidier, potentially optional syntax then why not?

~~~
partycoder
"Tidy" is very relative. We could go back to the traffic lanes example and say
that streets would be tidier without lines drawn on them. But those lines look
better than crashed cars and dead people. Minimalism is about removing
redundant stuff, those delimiters aren't redundant unless made redundant
through whitespace. I do not think whitespace is the way to go.

------
mamon
The one significant effect of indentation-based syntax is that it makes
copy-pasting code from places like books or Stack Overflow somewhat harder and
more error-prone. I've been hit by this a few times when learning Python, and I
can imagine it would cause some frustration for Scala users too.

~~~
desdiv
Here's my straw-man proposal:

Allow both indentation-based and bracket-based syntax. Have a tool like
scalafmt/gofmt that freely converts between the two on a per-file and
per-project basis.

When you're writing your own code, use the indentation-based syntax. You can
paste in bracket-based code anywhere, hit the auto-format key on your IDE or
run the commandline formatter and everything becomes nice and
indentation-based.

When you're writing books, libraries, example projects, SO posts, use the
bracket-based syntax. That way people who read your book/library
source/example/SO post can freely copy and paste into their own projects.

This has minimal impact on bracket-based syntax diehards; everything they read
and write stays the same. They won't even _see_ the new syntax if they don't
want to.

------
z3t4
I'm working on a text editor that automatically formats the code. It will have
trouble with significant whitespace if the code is valid both with and without
the whitespace, compiles either way, and even runs either way, though possibly
with a hidden bug. I've never worked with a language with significant
whitespace, so I'm wondering: does it create bugs? In my experience, bad
formatting can cause bugs or is at least annoying, e.g. syntax errors like
"missing bracket" where you have no idea where it's missing, which is why my
editor does the indentation automatically and enforces it (you can't change it).

~~~
hibbelig
Well, significant whitespace will play the role of braces, right? You can't
have your editor "fix" the braces, either -- the author has to put them in. In
the same way, you can't have your editor "fix" the indentation -- the author
has to put it in.

But with braces, you can do consistency checks, such as that they are properly
nested. In a similar way, you can do consistency checks with indentation. For
example:

    
    
        line 1
            line 2
          line 3
    

The third line is wrong: the transition from line 1 to line 2 introduced an
indentation step, and the third line matches neither the outer level nor the
inner one.
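
A check like this could be sketched in Rust with a stack of open indentation widths, assuming space-only indentation for simplicity. `first_bad_indent` is a hypothetical helper, not part of any existing tool.

```rust
/// Returns the 1-based number of the first line whose indentation matches
/// no enclosing level, or None when every line is consistent.
/// Assumes indentation uses spaces only.
fn first_bad_indent(source: &str) -> Option<usize> {
    // Stack of indentation widths for the blocks that are currently open.
    let mut levels: Vec<usize> = vec![0];
    for (idx, line) in source.lines().enumerate() {
        if line.trim().is_empty() {
            continue; // blank lines carry no indentation information
        }
        let width = line.len() - line.trim_start_matches(' ').len();
        // Close every block that is indented deeper than this line.
        let mut closed_any = false;
        while *levels.last().unwrap() > width {
            levels.pop();
            closed_any = true;
        }
        if width > *levels.last().unwrap() {
            if closed_any {
                // Dedented, but not back to any level that was open.
                return Some(idx + 1);
            }
            levels.push(width); // a plain indent opens a new block
        }
    }
    None
}
```

With the three-line example above, the function reports line 3: the dedent closes the 4-space level but lands on a width that was never opened.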

~~~
z3t4
I find adding braces much easier than managing white space though. For example
commenting out code, adding branches, removing branches etc.

------
AheadOfTime295
A non-backwards compatible syntax would lower the barrier to all breaking
changes.

Language spec [1], replacing the standard library [2], etc.

[1] Disallow implicit conversions between unrelated types
[https://github.com/lampepfl/dotty/pull/2060](https://github.com/lampepfl/dotty/pull/2060)

[2] Remove parallel collections from scala-library
[https://github.com/scala/scala/pull/5603](https://github.com/scala/scala/pull/5603)

------
richdougherty
This is a big change to make to an existing language. Even if you really,
really, really like indentation-based syntax, I doubt that the benefit of the
new syntax would match the cost of change.

------
tanilama
The benefit being...? Making it more complex?

------
placeybordeaux
I love Scala, but Scala already has some of the more confusing syntactic
decisions, not sure if this is gonna help.

------
sgt101
What does the literature say?

Is it known (as in quantitatively) if significant indentation is a good thing,
or a bad thing?

~~~
smitherfield
I think it's a bit of a bikeshed-type issue.

Subjectively, languages with significant indentation (Python, F#, Nim,
CoffeeScript, Haskell, etc.) are often thought to have nicer-looking syntax.

~~~
sgt101
If I had the patience I would try and measure something about the productivity
of those languages vs. opensource projects (not at all sure what) and control
a lot of things and do lots of stats and then write it all up and then realize
that no one else cares.

------
daly
The key flaw with indentation comes when you try to use it with printing. If
your function extends across a page-break, such as in a book or a printout, it
is not possible to resolve the indentation by eye.

------
mnd999
Just no.

------
exabrial
Sigh. Why are the people with the worst ideas most vocal about it?

If you like significant invisible characters, use Python. No need to introduce
this into other languages.

~~~
alex_duf
except if you want type checking at compilation time

~~~
AlexCoventry
[http://mypy-lang.org/](http://mypy-lang.org/)

