
Courting Haskell - honzajavorek
https://honzajavorek.cz/blog/courting-haskell
======
ssivark
Though I’m not much of a Haskell programmer (spent some time learning it and
writing toy programs), I would use a different motto to characterize Haskell:
_Stop prescribing (loose) design patterns; make them tight and refactor them
into libraries, so they need only be written once._

More than anything else, the language aims at providing modularity (and
terseness, on the macro scale) for the programmer (while trying very hard to
compile to something efficient). Even laziness is there to enable more modularity.

The “make design patterns tight” part is what ends up needing abstract
mathematical reasoning (category theory is basically mathematical pattern
reasoning distilled to its essence).

The other consequence of abstracting patterns into libraries is that novice
programmers end up with a lower “writing code” -vs- “reading+thinking” ratio
compared to other languages. And this can be jarring to folks whose attitude
is to learn by writing code (to discover patterns in the process).

~~~
ncmncm
"All patterns are anti-patterns."

A pattern is a common expression form that your chosen language is unable to
capture in a library. As we get better languages, what had been patterns turn
into ordinary library components. Patterns composing those with one another
and with core language features either become more library components, or
challenges for subsequent language design.
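As a concrete sketch of patterns turning into library components (the names `total` and `longest` are made up for illustration): the "loop with an accumulator" shape, rewritten by hand in many languages, is a single library component in Haskell.

```haskell
import Data.List (foldl')

-- The "accumulator loop" pattern as a library component: foldl' walks
-- the list once, threading an accumulator through a combining function.
total :: [Int] -> Int
total = foldl' (+) 0

-- Same component, different combining step: keep the longest word seen.
longest :: [String] -> String
longest = foldl' pick ""
  where pick acc w = if length w > length acc then w else acc
```

The pattern is written once, in the library; only the combining step varies at each use site.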

~~~
mikekchar
I really disagree with this viewpoint. Patterns are just patterns. It doesn't
matter if your language is able to express the pattern in a reusable form or
not. The whole point of a pattern is that it is a common solution to a class
of problems given a set of circumstances. Even if you have a "super awesome
pattern" widget that you can use, you _still_ need to know whether or not that
pattern is appropriate for the problem you have and the circumstances in which
the problem is expressing itself. Beyond that, most programmers will deal
with more than one programming language in their lifetime.
Learning well known design patterns is about understanding the abstractions,
the problems and the circumstances so that when you are faced with a similar
situation in another language you can efficiently see if there are
capabilities that will help you out.

Basically, consider the situation where you say, "Here's how I do X in this
language. How would I do a similar thing in that language?" A design pattern
gives you a name that you can use instead of "X". It also gives you a context
in which you can realise that this pattern is appropriate here, but not
there. When you talk to people you can simply say, "Can I easily
implement a functor here? If not, how can I get similar utility in the same
circumstances? Are there any caveats that are different than the normal ones?"
It seriously speeds up the conversation. It also allows one to think about
programming more generally rather than thinking of it only in terms of a
specific programming language.

Edit: grammar

~~~
ncmncm
You don't use the "sort pattern". You call the sort function, because it
exists.

In C, you would code up, in place, an example of the "hash table" pattern, but
more expressive languages have hash dictionaries in the library that are as
fast as, or faster than, you could afford to code in place.

If your language can't properly express a "maybe" monad, you can cobble
something together to use instead, and mention that in a comment. But it's
only a pattern because you don't have it.
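For instance, the "hash table" case above looks like this in Haskell, where `Data.Map` (from the containers package that ships with GHC) replaces the hand-rolled version; `wordCounts` is a name made up for this sketch.

```haskell
import qualified Data.Map as Map

-- The "hash table" pattern as a library component: no buckets or
-- probing coded in place, just an off-the-shelf ordered map.
wordCounts :: String -> Map.Map String Int
wordCounts = foldr bump Map.empty . words
  where bump w = Map.insertWith (+) w 1
```

Nothing about hashing or collision handling appears at the use site; the pattern has been absorbed into the library.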

~~~
mikekchar
I'm afraid I didn't express myself well. Where I think we're not aligning is
that you are under the impression that I'm using the same definition of
pattern that you are. It was my intent to express that I disagree with your
definition.

It's important to understand that a pattern is a solution to a problem along
with contexts in which it is both appropriate and inappropriate. To document a
pattern you need: a description of a problem, a description of the solution,
contexts in which this solution is appropriate, contexts in which this
solution is inappropriate. If you read the early literature on patterns and
pattern languages (Beck and Cunningham's initial paper, or really any of
Coplien's writings), I hope it will be more clear.

"Sort" and "hash table" are not patterns. They are simply solutions without
problems. They are also not specific enough to discuss various contexts. When
is it inappropriate to sort? That's a meaningless question without a lot more
information. Sorting may indeed be a solution to a problem, but it is not in
itself a pattern.

We could say that the maybe monad implements a particular pattern for
representing optional data. It is, however, not the _only_ method for
representing optional data. There are many others. The point is to be able to
understand which one you should use in which context. That it can be
implemented in a library is fantastic (less code for you to write), but that
was never the point of design patterns (hence the word "design"). The point
was to give you a vocabulary with which to discuss the merits of various
solutions to problems and to pick appropriate ones for your circumstance.

~~~
ncmncm
"Pattern" comes from the Gang of Four book, by way of confusion about an
entirely different concept from architecture.

The book has not aged well. Its vocabulary has turned out to be decreasingly
useful. I go for many months at a stretch without encountering any reason to
mention any of them. The only names that come to mind, at the moment, are the
"visitor" and "pimpl" patterns, only the latter of which I have used in the
past decade, and that because it is imperfectly supported by the library
template std::unique_ptr<>.

That is not for lack of discussion of choices among possible solutions to
problems. Notably, most on
[https://cpppatterns.com/](https://cpppatterns.com/) are just library
components.

~~~
mikekchar
Pattern comes from here:
[http://c2.com/doc/oopsla87.html](http://c2.com/doc/oopsla87.html) It even
says so in the GoF book.

------
andolanra
"[Haskell is], indeed, heavily founded on mathematical theories (category
theory and lambda calculus)."

This is a pretty common reason given for why Haskell is "math-ey", but I'd
argue it's simultaneously a bit misleading and also a bit boring. The part
that's misleading is "Haskell is based on category theory": while a major
feature of programming in Haskell (the monad abstraction) was _inspired by_
category theory, the truth is that moment-to-moment programming in Haskell
doesn't have much to do with category theory unless you want it to. Even
monads require zero knowledge of category theory to understand or use them!
Some people find usefulness in category-theoretic abstractions, but many
others—myself included!—don't at all: I've actually improved the performance
of Haskell code in the past by cutting out category-theoretic abstractions in
favor of simpler code, and by now I steer clear of most Haskell code that
wears category theory on its sleeve.
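A minimal illustration of that point (`safeDiv` and `calc` are made-up names): using the Maybe monad needs no category theory at all; do-notation simply reads as "try each step, stop at the first failure."

```haskell
-- Division that fails explicitly instead of throwing.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- Chain two fallible steps. If either division hits zero,
-- the whole result is Nothing; no theory required to see why.
calc :: Int -> Int -> Int -> Maybe Int
calc a b c = do
  x <- safeDiv a b
  safeDiv x c
```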

The part that's boring is that "Haskell is based on lambda calculus". The
lambda calculus is a mathematical description of a simplified model of
computation with functions and binding… and that's basically it. Almost every
modern programming language uses variable binding as a basic feature, and in
that sense, is "based on lambda calculus". Plenty of other languages (like
JavaScript or Ruby), for example, are deeply influenced by Lisps, which in
turn were directly inspired by the lambda calculus, but I don't think that
lineage makes them particularly math-like. Instead, being "based on lambda calculus"
usually just means that they have variables and closures—something true of
almost every popular language used today!
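In that everyday sense, being "based on lambda calculus" amounts to little more than this (`adder` is an illustrative name), and the same two lines could be written in JavaScript or Ruby:

```haskell
-- A function that returns a function; the lambda closes over n.
adder :: Int -> (Int -> Int)
adder n = \x -> x + n
```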

Now, Haskell-as-practiced can have a fair bit of math-inspired abstraction in
it, which can be daunting to someone new to Haskell. Some of that abstraction
is useful (e.g. monads), some of it isn't, but the fact that it exists is more
about vocal Haskell programmers and bloggers, and much less about Haskell-the-
language being "heavily founded on mathematical theories". And you'll also
find plenty of programmers and bloggers arguing _against_ the more math-heavy
parts of the ecosystem: just in the past month I've seen plenty of Haskell
programmers passing around links like
[https://www.simplehaskell.org/](https://www.simplehaskell.org/) which
advocates for exactly this less-mathy approach to Haskell.

~~~
the_duke
The big problem here is documentation.

Most beginner Haskell resources go into complicated monad explanations very
early on, instead of just showing how to use them like a tool. Intermediate
resources are even worse.

------
alephu5
I've been learning Haskell by building a standard monolithic web
application: Postgres DB, some static content, REST API and firebase
authentication. It's taking longer than it would with a familiar language but
so far I'm really enjoying the experience. I also took a small deviation a
couple of weekends ago to build a simple OSM router and was pleased with the
ease of development and performance.

I'd recommend the book Practical Haskell by Alejandro Serrano Mena to get a
firm introduction to the language and ecosystem of web applications. After
that, take a look at the libraries developed by FP Complete.

~~~
hopia
I'm using the same method as you. What are you using as the SQL and REST API
libs? I'm using Servant and a library called Squeal for Postgres.

~~~
alephu5
I'm combining servant and Yesod into a single WAI app so that I can serve a
web app and provide an API from the same server. For DB access I'm using
persistent.

------
honzajavorek
In the past two months I've been trying to learn the Haskell programming
language. It's vastly different from anything I know, so it has also served
as a way to empathize with complete beginners to coding. This is a diary of
my journey.

------
proc0
" Powerful! It takes just a few lines to implement your own clone of Vim:"

Uh, ok lol.

" If SQL or Python read like an English sentence, then Haskell reads like
math. Feels like math. It is math. "

This is basically why there's a learning curve. Any analogy used as a learning
example will typically fall short because in order to fully grasp the concepts
you just have to learn the math behind it, because that's by design
(denotational semantics).
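One small example of the "reads like math" claim: composition in Haskell is written the way textbooks define it, (f . g) x = f (g x). The names f, g, and h here are illustrative.

```haskell
f, g :: Int -> Int
f x = x + 1
g x = x * 2

-- h = f . g means h x == f (g x), which is exactly the
-- textbook definition of function composition.
h :: Int -> Int
h = f . g
```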

------
cdaringe
To the whole article I say "ditto." Great summary. Also, didn't Guido say he
wanted to eject the functional operators in Python? #hearsay

~~~
sli
Slashdot had an interview with him where he talks about it[0]; he basically
says Python isn't the language for FP and then complains about readability
(YMMV on that one). I don't know if his opinion has changed in the last
seven years, though.

[0]:
[https://developers.slashdot.org/story/13/08/25/2115204/interviews-guido-van-rossum-answers-your-questions](https://developers.slashdot.org/story/13/08/25/2115204/interviews-guido-van-rossum-answers-your-questions)

------
leshow
Just like to point out that you don't need to know category theory to program
Haskell. You don't need to read or watch Bartosz' videos to program (though
they are great fun to watch), that part in the blog is a bit specious.

LYAH isn't a great pedagogical resource. I recommend the Haskell Programming
from First Principles book instead. It's long but rewarding and has frequent
exercises.

------
ggm
At last! Somebody else who is prepared to be honest about their fear of maths
and mathematical notation, and about people who leap from "it's just like
school maths" to referring to the intimidating maths you skipped or failed in
school and university.

It's just like maths: it requires deep brain-structure wiring you may not have.

~~~
Rerarom
I wonder why there is so much fear of math on HN, which is essentially a
STEM community. I would expect it among literary types, but not here.

~~~
Tarq0n
I think many people have a problem with mathematical notation, not maths
itself. Mathematical notation is optimized for terseness and suitability for
use on a whiteboard. Code has made great leaps in terms of readability and
expressiveness compared to that, and is therefore more accessible to many
people even when expressing the same concepts.

For an example see the comments on the map/reduce post from yesterday.

------
axilmar
Too much drama for no reason. Haskell is a programming language; it has some
neat features, and some downsides. If it's the right tool for the job at hand,
use it, otherwise use something else.

------
Rerarom
You keep mistyping Clojure as Closure

------
avindroth
No mention of haskellbook.com is surprising!

~~~
olah_1
I read that book for 6 months straight, followed all the exercises, and quit
when I realized I couldn't write a simple script that did something useful.

But boy did I sure evangelize Haskell throughout the process! _facepalm_

~~~
hopia
I started building straight away from day 1, getting a decent REST API
together is surprisingly easy. You really don't need to understand most of the
advanced type level stuff to stay productive.

Maybe you tried to go for too high an abstraction level straight off the bat?
That can turn out very depressing in Haskell.

~~~
olah_1
Yes, for that reason I recommend Will Kurt's Haskell book instead. It's way
more practical.

------
crimsonalucard
A different way of thinking about abstraction, or just a feeling of elegance
or power, is not a sufficient explanation for why Haskell is so great.

People want definitive and theoretically correct answers, not talking points
for a philosophical debate on programming styles.

Why is typed functional programming measurably better than procedural? Why is
it better than OOP? Definitive answers are in demand, not exploratory
experiences.

~~~
dpatru
> Why is typed functional programming measurably better than procedural? Why
> is it better than OOP? Definitive answers are in demand not exploratory
> experiences.

Functional code tends to be shorter than procedural code. This allows you to
think in bigger steps. For example, to read whitespace-separated numbers from
stdin and print out their sum, in Haskell you could write:

    main = interact $ show . sum . map read . words

This uses very generic Haskell functions to put the input into a list of
strings, read each string as a number, sum the numbers, and show the sum as a
string.

It seems to me that the equivalent procedural code would be much longer.

Haskell's type system keeps track of what's going on and alerts you when you
try to do something that doesn't make sense. For example, if you leave out the
"read" function above, you would be trying to sum strings instead of numbers.
Haskell's type checker would complain at compile time. This enables you to
program at a high level without having to debug run-time errors because of
type mistakes.
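A sketch of the same pipeline as a pure function (the name `sumWords` is made up here), with an explicit Integer annotation so the intermediate types are visible; dropping `read` from the chain would indeed fail at compile time, since `sum` needs numbers, not strings.

```haskell
-- Each stage of the pipeline, on the input "1 2 3":
--   words                  => ["1","2","3"]
--   map read               => [1,2,3]
--   sum                    => 6
--   show                   => "6"
sumWords :: String -> String
sumWords = show . sum . map (read :: String -> Integer) . words
```

Wiring it to stdin is then just `main = interact sumWords`.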

~~~
crimsonalucard
> Functional code tends to be shorter than procedural code

Shorter is one metric that FP "tends" to be better at, but it's not
definitive. Who says procedural programs can't be shorter? And is shorter
necessarily better? Does it come at the cost of readability? Actually,
let's not get into readability, as it's not exactly measurable.

> Haskell's type system keeps track of what's going on and alerts you when
> you try to do something that doesn't make sense.

Algebraic type systems are indeed measurable when you look at correctness,
the number of errors, or the total space of possible programs. The type
checker restricts the code you can compile to what is correct from a typed
perspective, meaning that out of all the possible programs you could write,
Haskell lets you write fewer, in the sense that it stops you from writing
certain incorrect ones.

However, ADTs can be used in procedural or OOP programs as well. See Rust.

What I want to know is specifically about functional programs. In the
functional programming paradigm, what is the quantifiable metric that makes
it definitively better?

~~~
nimih
> In the functional programming paradigm, what is the quantifiable metric
> that makes it definitively better?

It seems extremely unlikely that you'll ever find a satisfactory answer to
this, because any advantage a programming language paradigm gives you is
either going to be in terms of programming language theory (which you rejected
up-thread), or in terms of developer experience and productivity (whatever
_that_ means). However, any rigorous study of those latter categories is
likely going to be seriously confounded by their variability due to things
which are _not_ related to language paradigm, such as organizational concerns,
the language's tooling and ecosystem, the problem domain, the skill and
experience of individual developers, &c &c. None of these things is
straightforward to control for, and I'd be extremely skeptical of any
quantifiable metric someone shows me that purports to show clear wins in real-
world software development based on language paradigm of all things.

~~~
pron
> is either going to be in terms of programming language theory

Programming language theory does not study "advantages;" it's simply not a
question it asks, let alone answers. The theory is concerned with what
properties certain formal systems have. It cannot, nor does it attempt to,
assign those properties a value, just as mathematics does not ask or answer
whether prime numbers are better than composite numbers.

> None of these things is straightforward to control for, and I'd be extremely
> skeptical of any quantifiable metric someone shows me that purports to show
> clear wins in real-world software development based on language paradigm of
> all things.

Maybe, but that doesn't matter. You cannot claim that you're providing a
significant benefit and in the same breath say that it isn't measurable. An
advantage is either big or not measurable; it can't be both. If you say that
the benefit of the language is offset by the bad tooling, then you're not
really providing a big benefit. If and when the tooling catches up, then it's
time to evaluate.

But maybe not. I find it very dubious that significant differences are not
measurable in an environment with such strong selective pressures for two
reasons: 1. it doesn't make sense from a theoretical perspective -- adaptive
traits should be detected in a selective environment, and 2. it doesn't fit
with observed reality. We observe that technologies that truly provide an
adaptive benefit are adopted at a pace commensurate with their relative
adaptability; often practically overnight. The simplest explanation from both
theory and practice to why a technology does not show a high adoption rate is
that its adaptive benefit is small at best.

~~~
crimsonalucard
> Programming language theory does not study "advantages;" it's simply not a
> question it asks, let alone answers. The theory is concerned with what
> properties certain formal systems have. It cannot, nor does it attempt to,
> assign those properties a value, just as mathematics does not ask or answer
> whether prime numbers are better than composite numbers.

I think we can go deeper than this. There are properties of well-designed
programs that can be measured to be numerically higher or lower than those of
poorly designed programs. Under this view, "better" is simply a label for a
number; it is the human who holds the opinion that the higher (or lower)
number indicates a "good design."

The question is what that number is and how you measure it. One number off
the top of my head: lines of text. Another, better number is the number of
functions. Both of these numbers have flaws, so maybe a better one is this:
given a high-level language and a program written in assembly language, what
is the largest number of high-level-language primitives you can use to
recompose an identical program?

(high level language primitives)/(low level language primitives)

As the ratio approaches 1, we achieve maximum flexibility, as the high-level
language is injective to the low-level primitives. As the ratio approaches
zero, we reduce complexity at the cost of flexibility (we reason about fewer
primitives). If the ratio exceeds one, then we are creating excess
primitives.

Maybe a better-designed language/paradigm has primitives that can be used to
drive that ratio back and forth between 0 and 1. A poorly designed language
is one with a ratio of 4.5 or 0.0001.

So something more advanced, but along the lines of this rudimentary and rough
outline, is certainly possible in my mind.

> But maybe not. I find it very dubious that significant differences are not
> measurable in an environment with such strong selective pressures for two
> reasons: 1. it doesn't make sense from a theoretical perspective -- adaptive
> traits should be detected in a selective environment, and 2. it doesn't fit
> with observed reality. We observe that technologies that truly provide an
> adaptive benefit are adopted at a pace commensurate with their relative
> adaptability; often practically overnight. The simplest explanation from both
> theory and practice to why a technology does not show a high adoption rate is
> that its adaptive benefit is small at best.

Yeah, I agree. Additionally, we're not dealing with the real world here, with
billions of variables. This isn't a computer vision problem. Assembly
language, FP, and OOP have a countable number of primitives. They are
amenable to theory and measurement.

