Why calculating is better than scheming (1987) [pdf] (kent.ac.uk)
65 points by tosh on Dec 4, 2016 | 24 comments

I've skimmed this article a few times now, and something I've always had trouble with was this:

“…problems like this are damaging if one is trying to present programming as a discipline subject to mathematical analysis.”

because in general programmers steadfastly refuse to subject their discipline to mathematical analysis, and continue to look for some aesthetic balance between "enough abstraction" and "enough whitespace".

Here is a program to sum a list of numbers:

This definition is even less cumbersome than the Miranda or the Lisp, but it's unclear what the author expects us to conclude from this. And yet I think it's perfectly reasonable to posit that shorter programs are better. Shorter programs have less bugs[1], and run faster[2], and yet for some reason there are a lot of programmers who believe their job is to produce more code. Fuck those guys. Seriously.

[1]: See Code Complete and Software Estimation: Demystifying the Black Art, by Steve McConnell.

[2]: https://stacresearch.com/m3

Then we have the pursuit of "traditional mathematical notation": I don't think it produces better programs or better programmers. Computers do not execute mathematical notation, and mathematicians can't agree on their notation with enough specificity to permit much useful computation[3]. It is clear that notation is important[4], but there is frighteningly little real research in this space. Zero-cost abstractions still take up source code, yet many programmers treat this as an unqualified good, despite Dijkstra's wonderfully succinct observation:

“…the intellectual effort needed to conceive or to understand a program need not grow more than proportional to program length.”

[3]: See Whitehead and Russell, Principia Mathematica

[4]: http://www.jsoftware.com/papers/tot.htm

Lazy is also bad. Computers aren't lazy. Starting off by thinking about them as lazy puts programmers on the path to total abstraction. So much energy has been put into trying to hide concurrency problems[5] that it's no wonder most big-data programs generate so much heat. The real value in Scheme streams is that they're explicitly concurrent, and other languages that have optimised for such constructs[6] have discovered a great deal of utility.

[5]: http://hadoop.apache.org/

[6]: Self = self(), O = spawn(fun() -> Self ! {f, self(), F()} end), receive {f, O, X} -> X end.

It's almost as if we've plucked some ideas, declared them important, and then used them as a lens to study our languages, without first doing the basic work of showing why they are important to the exclusion of everything else.

About the only thing I really like about Miranda (and Haskell, etc) is the pattern matching, which is dope. Seriously good stuff. I don't know how to square that with array languages, but I think it's worth thinking about since they (array languages) seem to have otherwise sorted out program length and notation.

> Shorter programs have less bugs[1], and run faster[2]

I'd heartily disagree with that statement.

Short programs can run faster, but that entirely depends on the implementation of the language/architecture that they're running upon.

Consider the following:

    struct first {
      char msg[400];
      double d;
      int i;
    };

    struct second {
      double d;
      int i;
      char msg[400];
    };
The structs are nearly identical; only the field order differs. But the second uses far less memory than the first.

Secondly, shorter programs can be bug-free, but that depends on the level of abstraction exposed, and on how it translates into a language its users can understand.

There's no shortage of regex bugs in the wild, and it's a fairly concise language.

Perl has often been criticised for having too concise a language, leaving it prone to code rot.

However, all that being said, I think there needs to be a balance between abstraction and mathematical analysis. I would also say there hasn't yet been enough actual research into the effect of syntax; rather, syntax has been a by-product of the abstraction it's presenting, or of the underlying implementation the language relies upon.

We do need abstraction: parsing mathematical notation or esoteric languages like Malbolge is difficult for the human brain, even with experience.

I mean, Brainfuck isn't that far from what a processor actually runs, being registers and jump statements, but...

We don't need a Turing Tarpit.

But neither do we need the insane amounts of boilerplate that Java forces upon you to do even the simplest things that Java is meant to be good at.

I think that in recent years, mathematics has given us the tools to build better syntaxes:

* Better type inference

* Better proofing of type safety

Though I think Scala's syntax is still overly verbose, the DOT calculus is a great step in the right direction: it makes it possible to design a syntax with zero-cost abstractions, type safety, and greater concision.

With GCC, both those structs seem to have a size of 416 bytes. Can you elaborate on what you meant?

32- or 64-bit GCC?

I was referring to Struct Padding[0].

Assuming a 32-bit GCC, the first would be padded with 24 bytes, the second with 16.

This padding doesn't show up under size, but does affect speed in small ways, and available memory (yay for dealing with memory constrained systems!).

[0] http://www.catb.org/esr/structure-packing/

You seem to be misunderstanding how padding is applied. As your link says regarding structure alignment:

> In general, a struct instance will have the alignment of its widest scalar member.

...which is d if we assume sizeof(double) == 8, so the total size will be rounded up to a multiple of 8. Both your structure examples will be exactly the same size:

    0 msg
    400 d
    408 i
    412 [+4 padding]
    416 TOTAL

    0 d
    8 i
    12 msg
    412 [+4 padding]
    416 TOTAL

> On some processors, the addressing of a member is more efficient if its distance from the beginning of the structure is less than 128 bytes.

> In the first example, to address the d and i fields using a pointer to the beginning of the structure, an offset of at least 400 bytes is required.

> Instead, in the second example, containing the same fields in a different order, the offsets to address d and i are of few bytes, and this allows to use more compact instructions.

But isn't each field a multiple of the word size on both 32 and 64-bit platforms?

int will be smaller than double on x86_64 (edit: x86 too, and most other platforms), but since in both structs int comes after the double, that doesn't matter. 400 is a multiple of 8, so it doesn't matter if it's before or after the double either.

> ...and yet for some reason there are a lot of programmers who believe their job is to produce more code. Fuck those guys. Seriously.
I hope this isn't a potshot at languages that tend to be verbose.

Coming in cold on a program someone else has written in a terse programming language that you know well still sucks. You can know the language, but if you don't know the person doing the writing, terse code can be all but opaque.

With some languages you need some a priori information before attempting to read a thing, whether it comes in the form of a Vulcan mind meld with your best friend, or cultural mores and in-group conventions like managerially enforced coding practices.

Sometimes this is less of a problem with verbose languages, which promote habits that indicate intent within the code, although any Turing-complete language can disappear down a rabbit hole if the author willfully tries to confuse the reader.

Consider the expressions:

  var _something = (isNaN(data) ? "string" : -1);
In this first example, who the fuck knows what that asshole intended? But there really are people who actually expect a declaration like that to be useful.

Compare to:

  private static String something = "string";
  private static Integer aNumber = -1;
  private static boolean numericDataType = false;
Okay, possibly also terrible, and apparently redundant at first glance, but maybe there's a mud pit ahead of us, and some redundant tethering will come in handy.

At least I can see indications of intent, and a separation of ideas.

Maybe the second example involved more keyboard activity, but I can understand it at a glance.

The first example, while shorter, is something I might re-read 3 times, question why anyone would do that, and then find myself re-reading it 10 more times while I debug the all-but-guaranteed garbage that comes immediately after that albatross.

> Shorter programs have less bugs[1], and run faster[2]

It's quite easy to write a short, slow, buggy HTTP server. There are a lot of them.

Simple observation would tend to suggest that the fast and reliable HTTP servers tend to have quite a bit of code. Sure, a lot of that is extra features that most people don't use, but it's true even if you cut that out.

> Simple observation would tend to suggest that the fast and reliable HTTP servers tend to have quite a bit of code.

I'm unconvinced.

It may be more likely that people do not need ($$$) fast HTTP servers badly enough to make them. I have built a number of special-purpose HTTP servers (ad serving/RTB) to eke out additional performance, but the 10µs response time I can get with 60 lines of C saturates 1 Gbps, so adding more servers is simply more economical at that point.

Meanwhile, databases are a much better study since there are established industries that demand faster and faster queries, and the fastest databases are around 1000x faster than Hadoop at a fraction of the size (K6 is under 600 lines of C code; KDB's SQL92 is around 30 lines of K code).

I believe a big part of the reason is that when a program gets that small it's easy to see it in its entirety, so you can remove redundancy and bloat you wouldn't otherwise notice. The reduction also shrinks the binary, paying dividends by getting the important bits into L1 (K6's binary is under 100kb, which leaves plenty of room for your scratch variables and your program).

When you say "fuck those guys" is that a gender neutral idiom? Not sure about contemporary English.

In case anyone is unfamiliar with the author Philip Wadler, he went on to be a key figure in the design of Haskell.

In particular the language Miranda he discusses here was a direct predecessor to Haskell.

...as well as Java generics.

This critique seems to be aimed at the kind of Lisp you find in introductory textbooks on Scheme.

Most Lispers like to play around with syntax; doing that in Haskell usually results in write-only code.

Yes, the critique is from an educational perspective. Many, especially in academia, view syntactic abstraction as a method of last resort, so it doesn't feature in the courses. One might then ask: why suffer the syntax of Scheme if macros are never used?

All ancient history now; I hear MIT teaches Python these days.

MIT uses Python now yes, but Scheme's offspring (Racket) is popular at a few universities. I've found Python easier to use though.

That critique had a lot of observations of the form "Lisp is less good because I am not used to it". From the point of view of someone with more Lisp experience, quite a number of those observations would go the other way.

This article needs a (1986) in the title, at least according to the handwriting on the first page.

It says "received". That often happens before a paper is published, but we want the publication year. That was 1987 according to Wadler's CV: http://homepages.inf.ed.ac.uk/wadler/vita.pdf.

> … we want the publication year. That was 1987 according to Wadler's CV ….

Or just to the ACM: http://dl.acm.org/citation.cfm?id=24706 .
