Pen: A programming language for scalable development (github.com/pen-lang)
92 points by raviqqe42 34 days ago | 73 comments


Took just a bit of searching to find some code examples.

These backslashes look so weird

...it's more popular with languages using the "ML" syntax.


It's not the ML syntax, it's almost only a Haskell thing, and while Haskell is a descendant of ML, it's a language on its own.

Really, this is why we can’t make any progress. Because this is such a common response to anything.

Go back to when you learned your first programming language. EVERYTHING was weird. I honestly don’t understand how after that, brains solidify and think anything after their first year of learning is weird.

afaict, they mean "lambda"

I expect that there will be an Emacs/Vi/VSCode/... mode to render them as such.

And I can't find a "hello world" or example project at all...

I love new programming languages and the concepts they bring to the table. But I feel PL designers should think about Dev experience.

Is it just me, or is the "\" in front of the parentheses (func params) just off-putting? I wonder what purpose it serves.

Familiar syntax makes it much easier to adopt and easily convince devs in your team to try it for real.

I agree, but lambdas do not really have a canonical syntax in the same way that braces have become synonymous with delimiters for statement blocks. This particular syntax comes from Haskell, where e.g.

    \x -> f x
denotes an anonymous function that applies `f` to its argument.

But I have seen at least these forms in other languages of varying popularity:

    [x] f x

    |x| f x

    { f(it) }

    { x -> f(x) }

    fn x => f x

    function(x) { return f(x); }

    x -> f(x)

    (x: A) => f(x)
In other words, I think it is really hard to pick a syntax for this construct that every programmer is going to feel familiar with, especially if your language is supposed to cater both to programmers coming from FP and more traditional languages.
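
For comparison, Python spells the same construct with a keyword instead of a symbol. A minimal sketch, where `f` is just a placeholder function made up for illustration:

```python
def f(x):
    # Placeholder for "some function f"; doubling is an arbitrary choice.
    return x * 2

# Anonymous function, the counterpart of Haskell's \x -> f x
apply_f = lambda x: f(x)

# The same thing written as a named function
def apply_f_named(x):
    return f(x)

print(apply_f(21), apply_f_named(21))
```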

Edit: Fixed error in JS example; added Java and Scala.

Examples 3 and 4 are cursed, to my eye. Curly braces denote a block, so why is there some magic syntax inside the block that denotes that this block happens to be a lambda? surely the block is the closure itself. The syntax to say what the block is goes outside.

Imagine if i wrote a conditional like that:

    if x > y { return x; } else { return y; }

    { if x > y; return x; } { else; return y; }
It completely erases the usefulness of {}, and is cursed, cursed I say! And I’m looking at you, rust, swift, ruby, etc.

I wholeheartedly agree. Whether {} denotes a block or a lambda is context-dependent in the language I took these examples from (Kotlin). I think they adopted the syntax from Groovy. It is "useful" for creating DSLs because you can implement constructs like `.forEach` so they look like imperative constructs, e.g.

    list.forEach { x -> ... }
But it becomes harder to read code outside an IDE, especially if the implicit `it` construct is used.

That's the thing that annoys me most about Kotlin (even though aside from that it is my personal 10X language). In a large project that uses DSLs, coroutines and flow, it quickly becomes a curly brace mess and it gets hard to see what code runs where, when and in which context.

Same, I use Kotlin daily, and I would choose it over Java any day as it really solves a lot of its quirks very nicely. There are however some parts of the syntax that keep confusing me on a daily basis, curly braces being one of them. The other is the weird unintuitive asymmetry that is going on between function types and lambdas. For example, a lambda of type

    (Int, Float) -> Boolean
is written as

    { i, f -> ... body ... }
but a lambda of type

    (Pair<Int, Float>) -> Boolean
is written as

    { (i, f) -> ... body ... }
because `(x, y)` is a destructuring pattern that projects the first and second components into `x` and `y`, respectively. The pattern in the lambda has to use tuple syntax exactly when the type doesn't, and vice versa. Gets me on a weekly basis even though I completely understand what's going on, it's just not very ergonomic.
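
The same two-parameters-versus-one-pair distinction shows up outside Kotlin too. A rough Python sketch (the names `two_args` and `one_pair` are made up for illustration; Python 3 has no destructuring in parameter lists, so the pair has to be unpacked in the body):

```python
from typing import Callable, Tuple

# Counterpart of (Int, Float) -> Boolean: two separate parameters.
two_args: Callable[[int, float], bool] = lambda i, f: i < f

# Counterpart of (Pair<Int, Float>) -> Boolean: one parameter that is a pair.
def one_pair(pair: Tuple[int, float]) -> bool:
    i, f = pair  # explicit destructuring of the single argument
    return i < f

print(two_args(1, 2.5))
print(one_pair((1, 2.5)))
```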

No, the idea is that {} denotes executable blocks, which lambdas are. In Kotlin, if the last parameter of a function is a lambda, you can close the parenthesis before the lambda:

  f(5, { square(it) })
can be written as

  f(5) { square(it) }
which makes constructs like

  if(condition) { foo() }
look like

  if(condition, { foo() } )
as in a function that takes a boolean and a lambda and only executes the lambda if the boolean is true.

It's a neat reinterpretation of what {} means.
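
To make the "if as a function" reading concrete, here is a minimal Python sketch. The name `if_` is hypothetical, and a Python lambda stands in for the Kotlin block:

```python
def if_(condition, block):
    """Run `block` only when `condition` is true, like `if(condition) { ... }`."""
    if condition:
        return block()
    return None

calls = []
if_(True, lambda: calls.append("ran"))       # block is executed
if_(False, lambda: calls.append("skipped"))  # block is never called
print(calls)
```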

I’m not super familiar with kotlin, but swift does the same thing you’re describing so i get where it’s useful. My beef is with including metadata about the executable block (parameter names) inside the block. So per your example:

    f(5) { square(it) }
My preference would be to tag the fn signature on the outside of the braces, something like:

     f(5) [(x)->int]{ square(x) }
I’d also accept ||, (), or nothing as the delimiters around the fn signature. Key point is that it’s OUTSIDE the executable block.

I’m not opposed to using some smarts to infer/simplify the expression when possible. I.e. if it’s a closure with inferable parameter and return types the “it” construct could be used (in swift they use $0, $1, $2 etc for unnamed parameters). Just the only thing inside an executable block {} should be code that gets executed - not type information about that block

I can’t believe the canonical anonymous function notation is omitted:

    (lambda (x) (f x))
On the other hand, this is a discussion about syntax, and Lisp is the “syntaxless” language…

How is that canonical? λ x . f x is the original syntax of Church, Lisp came around decades later.

Canonical from a programming languages perspective.

> function(x) { f(x); }

Slight correction (presuming JavaScript): that’d be `function(x) { return f(x); }`.

This could also be R, which allows for `function(x) {f(x)}` and (since version 4.1) `\(x) {f(x)}`.

Thank you, fixed!

for clojure there's

    #(f %)

    (fn [x] (f x))

This is canonical syntax in any functional programming language. This is like a Haskell programmer complaining that in most post-Haskell languages, the keyword class means an OOP type instead of a class of types.

No it's not. It's used mostly by Haskell, but ML/SML/OCaml and Scala use a different syntax.

All true, but perhaps it’s better to use something more familiar to more developers, such as

    x => f x

    x -> f x

Passerine[0] uses the latter syntax.

[0]: https://github.com/vrtbl/passerine

Probably looks like lambda. Haskell has them.

    \x -> x + 1

“λ” is a bit inconvenient to type on most keyboards. And supporting Unicode at the source code level is still surprisingly rare in programming languages.

By that logic we would never get new programming language ideas, because we could only reuse what has already been done.

I’m all for experimenting with languages, but there is something to be said for having a familiar syntax. Using & and | for Boolean logic operators could be seen as a useful improvement, but then how does one express bitwise and and or? Additionally, all numbers are floating point. Would there be any future support for native integers? Handling currency with floating point has long been known to cause rounding errors.
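
A quick illustration of the currency problem, in Python rather than Pen (a sketch only; the amounts are arbitrary). The usual workarounds are decimal arithmetic or integer cents:

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly,
# so their sum is not exactly 0.3.
float_total = 0.1 + 0.2
print(float_total == 0.3)

# Option 1: decimal arithmetic built from string literals.
decimal_total = Decimal("0.10") + Decimal("0.20")
print(decimal_total == Decimal("0.30"))

# Option 2: store amounts as integer cents.
cents_total = 10 + 20
print(cents_total == 30)
```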

Despite that, it’s interesting to see a language put dependency injection at the core of the language. I’d be curious to see how that works out in the long term.

One of the innovations brought by the C language was the use of distinct symbols for the McCarthy logical operators (&& || !) and for the bit string operators (& | ~).

The predecessors of C, i.e. B and BCPL also had both kinds of operators but they used the same symbols for them (in B: & | !) and the context determined which was meant.

However the choice made by C for the new symbols was constrained by the ASCII code, which had no other available symbols.

One decade before the C language, there were 6 symbols in use for the 3 logical or bit string operators, for example IBM PL/I used & | and NOT SIGN, while CPL and IBM APL\360 used LOGICAL AND, LOGICAL OR and TILDE OPERATOR.

The names in capitals are the Unicode names of the symbols.

So there are enough traditional symbols for both the McCarthy logical operators and for the bit string operators, without inventing any new symbols, like C did.

In my opinion, the IBM PL/I set of 3 symbols (& | NOT SIGN) is appropriate for the McCarthy logical operators, while the CPL / IBM APL\360 set of 3 symbols is appropriate for the bit string operators.

The reason is that & and | are more visually distinct so they are preferable in IF or WHILE conditions.

On the other hand, the bit string "and" and "or" are nothing else but "min" and "max" applied to vectors of 1-bit numbers. So the symbols LOGICAL AND and LOGICAL OR are also the right choice as symbols for "min" and "max". Because the LOGICAL AND and the LOGICAL OR symbols are just rotated LESS THAN and GREATER THAN, they are graphically suitable for "min" and "max".
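
The min/max claim is easy to check. A small Python sketch, with helper names `bits` and `from_bits` invented for the example:

```python
def bits(n, width=8):
    """Return the bits of n as a list of 0/1 integers, most significant first."""
    return [(n >> i) & 1 for i in reversed(range(width))]

def from_bits(bs):
    """Inverse of bits(): rebuild the integer from its bit list."""
    value = 0
    for b in bs:
        value = (value << 1) | b
    return value

a, b = 0b11001010, 0b10100110

# Bitwise AND is element-wise min, bitwise OR is element-wise max.
and_via_min = from_bits([min(x, y) for x, y in zip(bits(a), bits(b))])
or_via_max = from_bits([max(x, y) for x, y in zip(bits(a), bits(b))])

print(and_via_min == a & b, or_via_max == a | b)
```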

You know, the reason Bob Bemer put backslash in ASCII was so that you could write logical AND and logical OR in the conventional way: P /\ Q, P \/ Q. With underscores and backspace for overstrike you had XOR, but that went away with the move to CRT terminals in the 01970s†, and of course ¬ isn't in ASCII.


I like the generalization you're implicitly suggesting, to use /\ and \/ for generalized lattice "meet" and "join"; bit strings are more naturally a lattice or a Boolean algebra (in the algebraic sense—a distributive complemented lattice) than they are a vector space. There's no analogue to \/ in a vector space! Other useful lattice algebras include integers as you point out ("min"/"max"), integers ("lcm"/"gcd", which is how recent versions of APL interpret ∧/∨, though this seems a bit 05AB1E-ish to me, since how often do you actually use those operations?), floating-point numbers (again min/max), and sets (intersection/union). All of these happen to be distributive in the sense that A /\ (B \/ C) = (A /\ B) \/ (A /\ C) and A \/ (B /\ C) = (A \/ B) /\ (A \/ C). You can also construct lattices as Cartesian products of other lattices and duals of other lattices; these are distributive iff the underlying atomic lattices are. (It might be worthwhile to think of bit strings as a topological space along the lines of the Sierpinski space, but I'm not sure if that enables anything useful.)
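
For anyone who wants to poke at the distributivity claim, here is a brute-force check in Python over small samples. It is a sanity check on a handful of elements, not a proof, and `is_distributive` is a made-up helper:

```python
from itertools import product
from math import gcd

def lcm(a, b):
    # Least common multiple of two positive integers.
    return a * b // gcd(a, b)

def is_distributive(meet, join, elements):
    """Check both distributive laws over every triple of elements."""
    return all(
        meet(a, join(b, c)) == join(meet(a, b), meet(a, c))
        and join(a, meet(b, c)) == meet(join(a, b), join(a, c))
        for a, b, c in product(elements, repeat=3)
    )

ints = range(1, 13)
sets = [frozenset(s) for s in ([], [1], [2], [1, 2], [2, 3], [1, 2, 3])]

print(is_distributive(min, max, ints))   # integers under min/max
print(is_distributive(gcd, lcm, ints))   # positive integers under gcd/lcm
print(is_distributive(frozenset.intersection, frozenset.union, sets))
```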

If we want /\ and \/ to represent operations on specifically a distributive complemented lattice (a Boolean algebra in the algebraic sense), we unfortunately rule out the min/max and gcd/lcm lattices, as well as finite subsets of an infinite set, because there's no suitable way to define ¬. This seems like a pretty big loss because integer min and max are such common operations, more common than bitwise AND and OR in my experience; infix and looping notation for them makes a lot of algorithms easier to express.

But when we introduce element complementation, there's another operation that becomes natural and is in fact so widely used in programming that Golang has an infix operator for it: abjunction, &^. And abjunction is also an infix operator in SQL (MINUS), as well as a machine instruction in SSE (ANDNPS), ARM (BIC), and Wirth-the-RISC. Because it's falsehood-preserving, it doesn't suffer from the infinity problems that occur in extending ¬ to finite subsets of an infinite set or gcd/lcm, although I'm not sure about min/max. It permits McCarthy-style short-circuit evaluation.
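
In Python terms (which has no dedicated operator for it), abjunction on bit strings is `a & ~b`, and on sets it is plain set difference. A minimal sketch, with `and_not` as a hypothetical name:

```python
def and_not(a, b):
    """Abjunction on bit strings: keep the bits of a that are not set in b."""
    return a & ~b

flags = 0b1111
mask = 0b0101
cleared = and_not(flags, mask)
print(bin(cleared))

# On sets, the same operation is set difference.
difference = {1, 2, 3, 4} - {2, 4}
print(difference)

# Falsehood-preserving: all-false inputs give an all-false result.
print(and_not(0, 0) == 0)
```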

On the other hand, every lattice, whether unbounded, bounded but uncomplemented, or complemented, has a dual lattice; you just interchange meet and join. It's unfortunate that there's no way in most programming languages to use this to derive an efficient min-heap data structure from an efficient max-heap or vice versa.
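
The closest thing I know of in Python is to wrap elements in a type whose ordering is reversed and reuse the standard min-heap. A sketch with hypothetical names (`Dual`, `max_heap_push`, `max_heap_pop`); note that it allocates a wrapper per element, so it is arguably not the efficient derivation being asked for:

```python
import heapq
from functools import total_ordering

@total_ordering
class Dual:
    """Wrap a value so that comparisons are reversed (the dual ordering)."""
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return self.value == other.value

    def __lt__(self, other):
        # Reversed on purpose: "less than" in the dual is "greater than" here.
        return other.value < self.value

def max_heap_push(heap, item):
    heapq.heappush(heap, Dual(item))

def max_heap_pop(heap):
    return heapq.heappop(heap).value

heap = []
for n in [3, 1, 4, 1, 5, 9, 2, 6]:
    max_heap_push(heap, n)

top_three = [max_heap_pop(heap) for _ in range(3)]
print(top_three)
```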


One of the great benefits of syntax in ASCII (or other small character sets without homographs) is that it diminishes Don Norman's "gulfs of evaluation and execution". If you see an operator in ASCII, even an ugly weird one like #+#, you can probably figure out how to type it on a US English keyboard. (And ASCII characters are common enough that even nontechnical users can usually tell you how to type @ on their non-US-English keyboards.) And, if you've typed it, you can usually tell if you've typed it correctly. By contrast, operators like ¬, ÷, λ, ɑ, 𝛼, ⍺, 𝛂, ꭤ, 𝞪, 𝝰, and α have some real usability drawbacks. (All of those last eight are distinct Unicode characters, but two of them are only very subtly different in the font I'm using here.)


† Inexplicably, although the heyday of terminals with hardware character generation only lasted from about 01975 to 01988, overstriking never made a comeback, instead blessing us with the botched abortion that is Unicode combining characters and the fifteen normalization forms.

> Using & and | for Boolean logic operators could be seen as a useful improvement, but then how does one express bitwise and and or?

Depending on your target audience, a language could just not support these at the syntax level. There are all kinds of applications where you never have to do things at the bit level. There's no reason bitwise and/or/not/xor couldn't just be functions. Then you can also use ^ for exponentiation, which depending on the audience may be a more common operation than bitwise xor.

The only way I would take a new programming language seriously is if it doesn’t cater to current conventions. Other than that, there’s no point in making a new one. We don’t need new languages that just repeat the same ideas.

IMO, dependency injection tries to solve problems similar to those traditionally tackled by purely functional programming languages, such as side effect management, reliable unit tests, segregation of application logic and implementation details, and so on.

This is a rough simplification, but in other words, dependency injection is simply a less strict or untyped version of an effect system or purely functional programming.

> It pursues its minimal language design further after removing several features from Go like pointers, mutability,

I don't consider pointers and mutability to be features, but I consider immutability and trying to hide pointers a feature. One that I don't want. Allocating too much/too often has been like 95% of the performance issues i had.

Computers use pointers and data in ram is mutable. How is pretending that's not the case removing a feature? It's the exact opposite. One of the reasons I like go is that I can have some control over memory when necessary. But most of the time I don't need to think about it unless the profiler tells me that's where the bottleneck is.

Small edit to not be so negative: The language looks interesting and I think it's well presented. Just not for me, and that above irked me in particular.

Modern machines are essentially distributed computers, multiple levels of cache per core, RAM allocated per core. The flat shared memory address space is an illusion. With such an architecture, immutability and not exposing pointers is a very reasonable approach for many parallel/concurrent workloads. It enables many optimisations.

An algorithm written for such an abstraction is also much more easily adapted to work across multiple machines (e.g. map-reduce).

Since Pen claims to be inspired by Koka, I would hope that they eventually implement Perceus reference counting [0], which (partially) solves the problem of too many allocations. The idea is that with immutable datastructures you often free a constructor just before allocating the same one again (for example during a map function). But with reference counting we can check if the old constructor is dead and if so use it in-place for the new constructor. Applied carefully, you can build functional programs that use no further memory.
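
To give a feel for the reuse idea, here is a toy Python model. It is only a sketch: the reference count `rc` is maintained by hand, `Cons` and `map_list` are made-up names, and none of this reflects how Koka or Pen actually implement it.

```python
class Cons:
    """Cons cell with a hand-maintained reference count `rc`."""
    allocations = 0  # counts how many cells were ever allocated

    def __init__(self, head, tail):
        self.head, self.tail, self.rc = head, tail, 1
        Cons.allocations += 1

def map_list(cell, f):
    """Map f over a list, consuming the caller's reference to `cell`."""
    if cell is None:
        return None
    if cell.rc == 1:
        # Sole owner: the old cell would be freed anyway, so reuse it in place.
        cell.head = f(cell.head)
        cell.tail = map_list(cell.tail, f)
        return cell
    # Shared: give up our reference, take one on the tail, and allocate.
    cell.rc -= 1
    tail = cell.tail
    if tail is not None:
        tail.rc += 1
    return Cons(f(cell.head), map_list(tail, f))

def to_list(cell):
    """Collect the heads of a cons list into a Python list."""
    out = []
    while cell is not None:
        out.append(cell.head)
        cell = cell.tail
    return out

xs = Cons(1, Cons(2, Cons(3, None)))
before = Cons.allocations
ys = map_list(xs, lambda v: v * 10)
print(to_list(ys), Cons.allocations - before)
```

When the list is uniquely owned, the map allocates nothing new; when another owner still holds it, the old cells are left intact and fresh ones are allocated.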

[0]: https://news.ycombinator.com/item?id=25464354

I didn't think there was a massive memory gap between functional languages and other paradigms? I had always made the assumption that it was because immutable data allows the compiler to make optimizations that it normally can't.

For example, you could pass all values by reference implicitly, because they can't be mutated.

There are probably others. It seems like the paradigm is different enough that the pitfalls are also slightly different.

I’ve never heard of Koka before (one of its stated predecessors), but Koka seems really intriguing. Has anyone used it?

Yes, I am currently writing a master's thesis about it! It is a "research language that [is] currently under heavy development" as the README says so I don't recommend using it for anything other than a research/toy project. But you would be surprised how well it works: Algebraic effects can give you 98% of the power of monads while being much faster and easier to understand. And then Koka uses mimalloc [0], Perceus [1] and tail recursion modulo cons which make Koka programs almost as fast as C++. For example, in the benchmarksgame, Haskell takes 5x as much time on the binarytrees benchmark as the fastest implementation, while Koka is 50% slower (similar to other C/C++ and Rust implementations).

[0]: https://github.com/microsoft/mimalloc [1]: https://news.ycombinator.com/item?id=25464354

> Haskell takes 5x as much time … as the fastest implementation, while Koka is 50% slower (similar to other C/C++ and Rust implementations).

Which is to say Koka takes 2x as much time as the fastest for that task ?

Huh, now I'm curious whether it means 2x or 1.5x. These graphs on the github look promising https://github.com/koka-lang/koka/blob/master/doc/bench-amd3...

That's pretty good considering Koka is a high-level language like Python and JavaScript.

Super cool indeed. I learned MSFT pulled funding for this project. Sadly.

No, that seems wrong. It's currently only Daan working on this, but as you can see the last commit to 'master' was 7 days ago and the last commit to 'dev' yesterday...

So was there MSFT funding? And is there now MSFT funding?

That it gets continued as a hobby project does not mean it was not funded at some point (and is no longer).

But I'm not sure about my statement; no source also (just remember reading it).

CORRECTION: as can be read in the rest of the thread, Koka is being funded by MSFT.

Daan is employed full-time at MS Research and gets to spend this time on Koka (when he is not writing conference papers). There is currently no further funding or deployment of Koka within Microsoft that I know of (unlike mimalloc which seems to be used by Azure and Bing).

Thanks for setting this straight!

I miss a serious examples section.

I have no idea what it looks like from merely reading the reference docs.

The real question: is it really mightier than the SWORD

what I liked about this and would want Go to pick up is error handling. Rust's ? expression is probably one of the best inventions from the last 2 decades. There is more error handling code in my Go project than actual business logic. I see hundreds of error handling proposals being shot down by the Go community, I am not sure if these people write the Go compiler or the ones actually using Go to build projects or is it the case of, "since I had to suffer with error handling, you should too."

What does it take for a language like this to get enough critical mass to be the kind of language you don't get fired for choosing?

> It pursues its minimal language design further after removing several features from Go like pointers, mutability, method syntax, global variables, *circular references*, etc.

So no graphs? Or more complex data structures?

We can still represent graphs expressing references explicitly, for example, using hash maps. Here, I just wanted to point out no circular references "on memory" in the language. It might be interesting for you to take a look at documentation of Rust or Koka and how they represent those data structures!
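
As a rough illustration in Python rather than Pen, a cyclic graph can live entirely in a hash map keyed by node IDs, so no value ever points back at itself in memory. The node names and the `reachable` helper are made up for the example:

```python
from collections import deque

# Nodes are plain string IDs; edges live in a dict, so the cycle
# a -> b -> c -> a exists in the data without any circular object references.
graph = {
    "a": ["b"],
    "b": ["c"],
    "c": ["a", "d"],
    "d": [],
}

def reachable(graph, start):
    """Breadth-first search returning nodes in visit order."""
    seen = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.append(neighbor)
                queue.append(neighbor)
    return seen

print(reachable(graph, "a"))
```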

"System injection" is apparently the major novelty in this language. But it's not described in detail, and I'm not finding it on Google. What is it? Is it important?

From what I just saw on their website, it looks like another way to implement what the IO monad in Haskell gives you, which is to write code that pretends that functions which in reality have side effects are pure. This is achieved in part by passing around some Context (World in Haskell) that encapsulates the "state of the world" on which the side effects act.
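
My rough mental model, sketched in Python. To be clear, this is not Pen's actual API; `Context`, `write_line`, `read_line` and `greet` are all names I made up to show the general shape of passing system functions in explicitly:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass(frozen=True)
class Context:
    """Hypothetical bundle of system functions handed to application code."""
    write_line: Callable[[str], None]
    read_line: Callable[[], str]

def greet(ctx: Context) -> None:
    """Application logic: touches the outside world only through ctx."""
    name = ctx.read_line()
    ctx.write_line(f"Hello, {name}!")

# Production wiring would pass real I/O, e.g.
# Context(write_line=print, read_line=input)

# Test wiring: inject fakes and inspect what was written.
output: List[str] = []
fake = Context(write_line=output.append, read_line=lambda: "Pen")
greet(fake)
print(output)
```

The application function never imports or calls any I/O itself, which is what makes it easy to test with fakes.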

I don’t need to spend any more time with static typed languages that avoid generics. I wrote Go for several years, tried to love the simplicity, then tried to love code generation. Then I gave up. I don’t need to repeat that experience.

I agree. It's not for everyone but there are many different kinds of developers.

If you are and can hire smart people who can use generics without introducing extra tech debt and complexity, I definitely recommend the ones with generics like Rust and Haskell. But the problem is that I'm not smart enough.

I'm in the same boat. I still enjoy writing Go, it's my primary language but adding generics is something I have been looking forward to for a long time. there is so much code that could be made simpler or deduplicated with generics.

As a JS developer, it took me some time to realize the importance of generics, since the problems they solve don't exist in JS by default. Funny how typed language advocates want automatic casting in the end after all.

Generics are the polar opposites of automatic casting. They exist so you don't have to shoehorn one argument type into another.

JS isn't exactly the best environment to understand this. Ideally, use a compiled language that can show you the generated assembly (`gcc -S ...` with C++ for instance). Then you can see first-hand what type-specific implementations of a generic function look like.

Since when are generics automatic casting?

It's not, but it's essentially the same practically speaking: you get type A|B|C and you write code to interact with it in just one way, either by casting or by generics. Weakly typed languages do that for you, but they are looked down on.

I think you're missing the key point of statically typed languages, that the compiler catches errors for you. Generics are valuable because they allow users to write "generic" reusable functions that are still checked by the compiler. Of course dynamically-typed languages can do this easily, but that's not the point.

Blub 101


> Pen is a statically typed functional programming language for application programming with system injection. Its design is heavily inspired by the Go programming language and many functional programming languages like Haskell and Koka.

... Pen aims to be even simpler [than Go] by ... removing several features from Go like pointers, mutability, method syntax, global variables, circular references, etc.

... Pen's approach to that [maintenance] is embracing battle-tested ideas in such field [software engineering], such as Clean Architecture, into the language's principles and ecosystem. One of its most clear incarnations is system injection.

System Injection: A mechanism to inject system functions into pure functions, eg a dynamically typed effect system (https://github.com/pen-lang/pen#system-injection)

dang / raviqqe42: I believe this is a [Show HN].

everyone: reminder of the very good Show HN guidelines regarding expectations from submissions and those offering feedback: https://news.ycombinator.com/showhn.html

I cannot comprehend what would drive someone to make this.

free time haha

Fair enough! Sorry, I didn't intend that to be rude, I just genuinely didn't understand as it seems to be two very conflicting ideologies merged.

I think this project is a great idea! A minimalist run-time and language (like Go) but with an emphasis on functional programming seems like the best of both worlds! However, will the language get cumbersome without advanced FP features? See Elm debates…

Honestly, my main issues are with the lack of generics and the "dynamically typed effect system". The idea of a minimalist runtime and language reminds me of Clean a little bit.
