
A Crash Course on ML Modules (2015) - Tomte
https://jozefg.bitbucket.io/posts/2015-01-08-modules.html
======
Drup
This is a good rundown of the basics of ML modules, the syntactic aspects, if
you will. But it doesn't really give you the important insight:

    
    
      Abstraction. is. powerful.
    

If you consider the IntMap module provided in the article, you notice that the
type `t` is not defined. It is kept "abstract". One defining property of ML
modules is that you can indeed keep the definition of types hidden and
_nothing can break abstraction_. The rest of the program can only manipulate
the type `t` through the functions provided by the module.

This allows you, in practice, to encode pretty much any property you would like.
In the case of maps, you can use a BST and never have to worry about users
breaking the invariants: the user can't even see it's a BST! You can provide
any form of validation and be sure that the data you manipulate will stay
validated (as long as the module is correct, of course).
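
To make that concrete, here is a minimal sketch of an integer set backed by a BST. The `IntSet` name and its three operations are made up for illustration (they are not from the article); the point is only that the signature exposes `type t` with no definition.

```ocaml
(* Hypothetical IntSet: the signature hides that t is a binary search tree. *)
module IntSet : sig
  type t                       (* abstract: no definition is exposed *)
  val empty : t
  val add : int -> t -> t
  val mem : int -> t -> bool
end = struct
  type t = Leaf | Node of t * int * t

  let empty = Leaf

  (* Insertion keeps the ordering invariant: left < value < right. *)
  let rec add x = function
    | Leaf -> Node (Leaf, x, Leaf)
    | Node (l, v, r) as n ->
      if x < v then Node (add x l, v, r)
      else if x > v then Node (l, v, add x r)
      else n

  (* Lookup relies on that invariant to follow a single path. *)
  let rec mem x = function
    | Leaf -> false
    | Node (l, v, r) ->
      if x < v then mem x l
      else if x > v then mem x r
      else true
end

let s = IntSet.(empty |> add 3 |> add 1 |> add 2)
```

Outside the module, `Leaf` and `Node` are not in scope, so client code cannot build an unordered tree by hand; every `IntSet.t` has to come from `empty` and `add`.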

Functors allow you to rely on this by giving you a way to abstract away a
whole module. This gives you excellent guarantees: if two modules behave
exactly the same, swapping them is semantics-preserving. Said another way:
if I can prove that sets-as-lists and sets-as-bst are functionally
equivalent, then I can swap one for the other, and the rest of the program
will behave the same!
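
A rough sketch of that swap, assuming a made-up `SET` signature and a made-up `Distinct` functor (neither is from the article). The tree-backed version simply reuses the standard library's `Set.Make`, which needs the `Int` module from OCaml 4.08 or later.

```ocaml
(* Hypothetical signature shared by both implementations. *)
module type SET = sig
  type t
  val empty : t
  val add : int -> t -> t
  val mem : int -> t -> bool
  val cardinal : t -> int
end

(* Sets as unsorted lists without duplicates. *)
module ListSet : SET = struct
  type t = int list
  let empty = []
  let mem = List.mem
  let add x s = if mem x s then s else x :: s
  let cardinal = List.length
end

(* Sets as balanced trees, borrowed from the standard library. *)
module TreeSet : SET = Set.Make (Int)

(* The functor only sees SET, so it cannot depend on the representation. *)
module Distinct (S : SET) = struct
  (* Number of distinct integers in a list. *)
  let count xs =
    S.cardinal (List.fold_left (fun acc x -> S.add x acc) S.empty xs)
end

module ViaList = Distinct (ListSet)
module ViaTree = Distinct (TreeSet)
```

`ViaList.count` and `ViaTree.count` agree on every input as long as the two set modules agree, and nothing inside `Distinct` has to change when one is swapped for the other.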

Refactoring becomes a breeze. Functors (and modules) make all the Javaesque
reflection and dependency injection completely redundant: modules are (in
OCaml) first class objects that you can manipulate and apply as you wish.
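
As a small illustration of the "first class" part, here is a sketch where an implementation is chosen at run time and passed to a function as an ordinary value. `GREETER`, `English`, `French`, `greeter_for` and `welcome` are all hypothetical names.

```ocaml
module type GREETER = sig
  val greet : string -> string
end

module English : GREETER = struct
  let greet name = "Hello, " ^ name
end

module French : GREETER = struct
  let greet name = "Bonjour, " ^ name
end

(* Pack a module into a value; the choice happens at run time. *)
let greeter_for lang : (module GREETER) =
  match lang with
  | "fr" -> (module French : GREETER)
  | _ -> (module English : GREETER)

(* Unpack the module in the argument pattern and use it like any other. *)
let welcome (module G : GREETER) name = G.greet name
```

The "injected" dependency is just the first argument of `welcome`, and the type checker verifies that whatever is passed really implements `GREETER`.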

It also gives you separate compilation! A lot of people say that the OCaml
compiler is blazing fast: this is precisely because OCaml's module system
ensures that each module can be typechecked and compiled incrementally.

It pains me that modules get so little appreciation. Many ML programmers get a
sort of selective blindness: just like fish in the sea, they don't realize the
power of what they're swimming in. Proper modules with actual abstraction are,
in my opinion, the single most essential feature for "large scale"
programming.

~~~
zvrba
It seems to me that you can get the same thing in C++ with classes <> modules
; templated classes <> functors. What am I missing?

~~~
Drup
There is no abstraction in C++, you can always poke around the memory layout
of what you are given, and change your behavior depending on that.

For example, let us say you have a module satisfying this signature:

    
    
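      module M : sig
        type sorted
        val import : int array -> sorted
      end = struct
        type sorted = int array
        (* Array.sort sorts in place and returns unit, so sort a private
           copy: the caller keeps no alias to the array we hand back. *)
        let import a =
          let b = Array.copy a in
          Array.sort compare b;
          b
      end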
    

In OCaml, if you have a value of type `sorted`, you _know_ it's indeed sorted.
In C++, as soon as any code external to the module has a handle on it, you don't
know! It could have modified the array behind your back, since it can look
directly at the definition, or worse, poke in the memory layout.

~~~
rapala
Yes, you can access raw memory and flip bits to modify a private member of an
object (and it might even be defined behavior, not too sure about that though). But I
don't find that a valid argument for the statement that you can't do
abstractions in C++. That's just not something people do.

The C++ version of your example would be, if I understood your code correctly,
to take a std::vector as a constructor parameter, copy it to a private
field and sort it.

~~~
Drup
I don't know all that much about C++'s object system, so I can't give you a
concrete example on how it breaks down.

However, you say that it is "not something people do" ... well maybe not in
C++ (I highly doubt that), but it's very common in many languages. In C, it's
common to look inside structs directly and change things. Javascript libraries
do it all the time: They inspect their arguments, look at the types and change
their behavior depending on it. It's a common programming practice to poke
deep into the data structures and do things. In Java, they made it an art with
reflection and monstrosities such as Spring.

Abstraction is a bit like immutability: Sure, you can try to fake it in
languages that don't have it, but then you are just praying that everyone
plays by your rules. :)

~~~
rapala
My comment "not something people do" was about directly accessing memory to
circumvent the private data abstraction in C++, and I stand by that.

I admit that I kinda pushed you into it, but you are moving the goalposts.
People indeed use public fields in languages like C and Javascript.

In C it's often done for the sake of performance. Hiding data behind a pointer
has a cost.

In Javascript I would say it's laziness above all. Front end programs often
aren't that big nor pinnacles of code quality.

But it is possible to define abstract data types in both languages. ML makes
it a bit easier and sometimes even more performant, but it doesn't "own" the
idea.

Abstractions are quite like immutability. You can enforce both in many
languages, some just give you better tools for it.

~~~
Drup
The goalpost, right from the start, is that by having proper abstraction (the
kind that doesn't break when you blow lightly on it), you get many benefits.

You say that, even when the language doesn't enforce it, people don't break it
... except when they do. It doesn't really matter why, it simply makes
everything else more brittle as a consequence and limits how you can reason about
your code.

You seem to trust that programmers will play by the rules, even if the
compiler doesn't enforce them. We will simply have to agree to disagree. :)

~~~
rapala
> You seem to trust that programmers will play by the rules, even if the
> compiler doesn't enforce them.

I've been trying to say the exact opposite. C, C++, Javascript, all those
languages provide ways to define abstract datatypes that cannot be
circumvented by "normal" code (even Haskell has unsafePerformIO). My latest
argument was that people decide not to use those abstractions not because they
are unavailable, but because it is more ergonomic or performant not to. The
same happens even in ML: not all data is abstracted as an abstract data type.

------
hajile
ML should get more love. If PolyML or MLton had received even a fraction of
the support behind golang, we'd have a language with all the upsides of go
(simple, concurrent, typed, etc) and none of the horrible downsides (not
functional, almost zero abstraction, empty interface, no generics, pointers,
etc).

~~~
elcapitan
The bucklescript and ReasonML ecosystem is quite interesting right now, seems
to attract at least a bit of attention.

~~~
hajile
ReasonML syntax just isn't as good as the ML style syntax. If I understand
correctly, they started with SML then switched to OCaml because there were
more libs. I suppose it was a pragmatic decision, but I believe Successor ML
would have been a much better long-term target, not to mention that going with
PolyML would have given them a concurrent back-end (PolyML has had years of
refinement while OCaml's implementation is still not finished).

~~~
elcapitan
Can you explain what the problem with that syntax is, is there something you
couldn't express in ReasonML? It is meant to bridge the gap for existing js
developers, if I understand it correctly.

Regarding the ecosystem, I think the main goal is compilation towards js and
js interop, so other backends don't really matter that much, from my
understanding. Maybe Ocaml just fit the bill best as a base language
equivalent, regarding the features they wanted to map to and from js.

~~~
hajile
I love JS (which ReasonML tries to emulate), but the traditional ML syntax
(elm, haskell, SML, etc) is simply better for me. The extra parens and curly
braces everywhere don't really add anything useful to the language while
complicating the syntax (well, they help solve some edge cases in Ocaml, but
those cases also don't exist in most other MLs). Another result of using Ocaml
is that operators aren't overloaded (not bad if you have only a couple types,
but quite problematic as the number of primitive types grows). A personal
annoyance is the use of JS promises (non-monadic with auto-flatmap).

Most of this applies to reason as well.
[http://adam.chlipala.net/mlcomp/](http://adam.chlipala.net/mlcomp/) (I'd note
that the goal of Successor ML is to add the most useful Ocaml features into
SML).

As to why they chose Ocaml. More libraries makes it easier to write tooling.
Ocaml has builtin tools to make writing the language much easier. Most
importantly, other Facebook teams were already doing a ton of stuff in Ocaml.
There are probably a bunch of other reasons, but the ReasonML guys know way
more about that than I do.

~~~
elcapitan
Ok, yeah, I guess that's personal preference then. I have only superficial
experience with traditional MLs and am currently experimenting with ReasonML,
and it just feels more intuitive and productive right from the start, and
context switching is easier.

With regard to operator overloading, that's the number one issue I had anytime
I looked for example at haskell, it's just incredibly hard to figure out what
some <=$=> operator or whatever means, and also very hard to google, so for
newcomers it's really bad language UI.

~~~
hajile
Operator overloading has pros and cons. In truth, I'm inclined to agree with
you that operator overloading should be limited in userland, but I'm a bit
incredulous at the choice with primitive data types. There are 12 or so common
primitive numeric types baked into hardware. If OCaml ever expanded to include
all of these, the effect without overloading would be horrendous.
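
For anyone who hasn't seen it, this is roughly what the lack of overloading looks like today with the numeric types OCaml already ships. The `Int64_ops` module is a made-up helper showing the usual workaround of rebinding operators in a local scope.

```ocaml
let i = 1 + 2                      (* int uses + *)
let f = 1.5 +. 2.0                 (* float needs its own operator, +. *)
let l = Int64.add 1L 2L            (* 64-bit ints go through functions *)
let n = Int32.add 1l 2l            (* and so do 32-bit ints *)

(* Hypothetical helper: rebind the usual operators for one type. *)
module Int64_ops = struct
  let ( + ) = Int64.add
  let ( * ) = Int64.mul
end

(* A local open brings those bindings into scope for one expression.
   Redefined operators keep their usual precedence. *)
let area = Int64_ops.(3L * 4L + 1L)
```

Each extra numeric type would need either another operator family or another helper module like this one.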

~~~
elcapitan
Ok yes, agreed, the separate operators for different numeric types are not
exactly very user-friendly either. In the de-facto use case of Reason they
shouldn't be such a big deal though (building webapps).

------
kuwze
I know he said "All the code here should be translatable to OCaml if that’s
more your taste" but I have read multiple examples about the power of ML
modules and the code always seems to be in SML and not OCaml. Why is that?

~~~
hajile
Because Ocaml is easily the ugliest of all the ML languages. I will never
understand why Ocaml became the popular one. SML is much more elegant in my
opinion (and even adding all the good Ocaml stuff to SML still leaves a much
more elegant language).

The further reason is that SML is very popular in Academia (perhaps that is
its downfall). The same holds true for things like lisp where Common Lisp is
undoubtedly the most used, but scheme is usually an example language because
it is more elegant (and with fewer warts).

~~~
xfer
Can you define what you mean by "the ugliest"?

SML is used in academia because the language is very well-defined (both
its semantics and its typing rules) and it covers all the fundamentals of typed
functional programming, so it is preferred for teaching. OCaml is a moving
target, and the language has no formal specification, afaik.

~~~
hajile
The fact that the implementation is the spec is a large part of the issue.
Someone feels like adding a feature, so they just tack it on somewhere until
the language has a million syntactic features littered everywhere. The
experimentation aspect is nice, but to me, the final product is an unwieldy
Frankenstein (though still better than many more popular languages).

[http://adam.chlipala.net/mlcomp/](http://adam.chlipala.net/mlcomp/)

SML is nice enough as is, but the progress with Successor ML to gradually add
the best of Ocaml on top is a very appealing option. As a personal gripe, both
languages need better Unicode support.

------
didibus
Prerequisite is existing knowledge of ML syntax I guess. Cause I couldn't
fully follow.

And can someone tell me how it's different from Java Interfaces or Classes?

~~~
l_dopa
One big difference is that java interfaces can't contain member types. They
describe the methods of one particular type. Imagine you could define an
interface for an entire java package.

Another big one is that module types are structural: you don't have to declare
all the module types implemented by a module when you define the module.
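
A short sketch of both points in OCaml, with invented names (`Celsius`, `SHOWABLE`, `Show_list`). The module is written first with no mention of any signature, and the signature it later satisfies carries a type member of its own.

```ocaml
(* A module written with no reference to any signature. *)
module Celsius = struct
  type t = float
  let of_float x = x
  let to_string c = Printf.sprintf "%.1f C" c
end

(* A signature declared afterwards, possibly by someone else.
   Note that it has a type member, not just functions. *)
module type SHOWABLE = sig
  type t
  val to_string : t -> string
end

(* Celsius matches SHOWABLE by shape alone; its extra items are ignored. *)
module Show_list (S : SHOWABLE) = struct
  let show xs = String.concat ", " (List.map S.to_string xs)
end

module Show_temps = Show_list (Celsius)

let line = Show_temps.show [Celsius.of_float 21.5; Celsius.of_float 19.0]
```

There is no `implements` clause anywhere: the check that `Celsius` fits `SHOWABLE` happens at the point where the functor is applied.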

