
Show HN: Whack – A simply-designed compiled programming language - wycliffb
https://github.com/onchere/whack
======
SamReidHughes
I only looked for a few minutes, so first impressions:

1\. Try the declaration syntax

    
    
        x Foo;
    

instead of

    
    
        Foo x;
    

I tried it before, you might like it.

2\. I think the way you're defining the AST types is a crapload of work. You
should have had a bunch of dumb structs, all in one file.

Then you can see everything at once, and you aren't mixing AST representation
with codegen logic. Sometimes that's a better way to do algebraic types in C
or C++.
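
To make point 2 concrete, here is a rough C++17 sketch of the "dumb structs" layout. It is not Whack's actual code: every type and function name below is made up for illustration, and a toy printer stands in for real codegen.

```cpp
#include <memory>
#include <string>
#include <variant>

// "ast.h": plain data only, no codegen logic. All names are hypothetical.
struct Expr;
using ExprPtr = std::unique_ptr<Expr>;

struct IntLit { long value; };
struct Var    { std::string name; };
struct BinOp  { char op; ExprPtr lhs, rhs; };

// The whole expression grammar is visible at a glance in one variant.
struct Expr {
    std::variant<IntLit, Var, BinOp> node;
};

// A separate pass (it could live in another file) walks the structs.
// A printer is used here as a stand-in for code generation.
std::string show(const Expr& e) {
    struct Visitor {
        std::string operator()(const IntLit& n) const { return std::to_string(n.value); }
        std::string operator()(const Var& n) const { return n.name; }
        std::string operator()(const BinOp& n) const {
            return "(" + show(*n.lhs) + " " + n.op + " " + show(*n.rhs) + ")";
        }
    };
    return std::visit(Visitor{}, e.node);
}
```

Adding a new pass means adding a new visitor, and the struct definitions stay untouched.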

3\. I don't know what "type Foo struct {...}" does but you'll save a lot of
work if the type system only has names as types, nominative typing, without
losing usability.

4\. Personally I'd parse straight to the AST type you define and not use the
mpc lib with its own AST implementation. I don't believe in parser combinator
libraries, especially not in C. It's better to copy/paste those loops. Better
than using a parser generator too. But since you have a parser already... not
right now.
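
For point 4, a minimal sketch of what "parse straight to your own AST" could look like: a hand-written recursive-descent parser for a toy grammar of digits, `+` and `*`. The grammar, `Node`, `Parser` and `eval` are all assumptions for illustration; this is not Whack's grammar, and there is no whitespace or error handling.

```cpp
#include <cctype>
#include <memory>
#include <string>

// Hypothetical AST node: a number leaf ('n') or a binary '+' / '*' node.
struct Node {
    char kind;
    long value;
    std::unique_ptr<Node> lhs, rhs;
};

struct Parser {
    std::string src;
    std::size_t pos = 0;

    char peek() const { return pos < src.size() ? src[pos] : '\0'; }

    static std::unique_ptr<Node> make(char kind, long value,
                                      std::unique_ptr<Node> l,
                                      std::unique_ptr<Node> r) {
        return std::unique_ptr<Node>(new Node{kind, value, std::move(l), std::move(r)});
    }

    // number := digit+   (no error handling: a missing digit yields 0)
    std::unique_ptr<Node> number() {
        long v = 0;
        while (std::isdigit(static_cast<unsigned char>(peek())))
            v = v * 10 + (src[pos++] - '0');
        return make('n', v, nullptr, nullptr);
    }

    // term := number ('*' number)*
    std::unique_ptr<Node> term() {
        auto left = number();
        while (peek() == '*') {
            ++pos;
            auto right = number();
            left = make('*', 0, std::move(left), std::move(right));
        }
        return left;
    }

    // expr := term ('+' term)*
    std::unique_ptr<Node> expr() {
        auto left = term();
        while (peek() == '+') {
            ++pos;
            auto right = term();
            left = make('+', 0, std::move(left), std::move(right));
        }
        return left;
    }
};

// Tiny evaluator, only here to check the tree shape.
long eval(const Node& n) {
    if (n.kind == 'n') return n.value;
    long a = eval(*n.lhs), b = eval(*n.rhs);
    return n.kind == '+' ? a + b : a * b;
}
```

Each grammar rule is one function with one loop, which is the "copy/paste those loops" style: repetitive, but there is no intermediate generic tree to convert afterwards.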

Edit: 5. Avoid looking at Zig, Myrddin, etcetera, if you can. There are
obviously paths that any C-like language tends to go down in the 21st century,
and the world would probably be better if you rethought the problems from a
blanket slate.

~~~
exikyut
> _Edit: 5. Avoid looking at Zig, Myrddin, etcetera, if you can. There are
> obviously paths that any C-like language tends to go down in the 21st
> century, and the world would probably be better if you rethought the
> problems from a blanket slate._

I read the points above this one and cannot comment/disagree with them.
However, I am very curious about (5).

\- What would you classify as "etcetera" here? (ie what other languages would
you list)

\- What are the paths in question?

\- With said blanket slate, what other mindset may be useful to keep in mind?

Thanks!

~~~
SamReidHughes
Blanker slate, damn iPhone. More of a blank slate.

I don't know, it's a general problem of balancing the originality you might
get from not looking at other people's work, with what you miss out from not
looking at it.

"Etcetera" includes other low-level languages people've made. Like, I guess Go
might even count. Honestly 5 is kind of stupid. No great reason not to ignore
it. It's the sort of thing to do for 1 month, but not forever.

By "paths" I mean different features that different languages have and how
they do them. You could just copy how these languages try to improve the
ergonomics around error handling, for example, or you could decide how you'd
like to do it. Thinking from first principles it's likely you'll end up
walking into exactly the same decision other languages make, only with a
different choice of operator. But it's possible you'd improve matters.

Other paths are questions like implicit conversions or how do explicit
conversions happen. And what do you name the bitwise negation operator? Can
you do pointer arithmetic? How do you handle pointers to array elements? Do
you have a one-to-one mapping from identifiers to identificands?

~~~
exikyut
> _I don't know, it's a general problem of balancing the originality you
> might get from not looking at other people's work, with what you miss out
> from not looking at it._

Yeah, that's a fun one. The impression I get is that the only way you can
reverse that knowledge bias is to have sufficient knowledge of and experience
with a given field that you can look at all possible approaches objectively.
But that creates a paradox, since you can only gain said amount of knowledge
by studying others' work... :/

So, you're saying I should avoid looking at other languages for a month?
Sorry, not clear on this bit.

As for error handling, that's a tough one. Try/catch is great for worlds
without recursion (as inherent in OOP, or elsewhere), if you ask me. Go's
fancy "returning an array is a first-class idea, I am so awesome" and the
resulting `ok, err` is... I guess you're forced to type that out every single
time and thus forced to think about it, which is good, but it still feels
really inelegant. Erlang's atom-based {ok, value} / {error, Reason} return
value approach seems interesting/nice/cute - but, admittedly, only because I
haven't tried to actually use it (yet) ;)

How would I handle errors myself, pretending I hadn't written the above? Hmm.
(Now all I can do _is_ think of the above. :D) Well, having something like
Error/Fail be a first-class type next to True/False/NULL could be interesting,
then I could do `if (something()) { ... otherthing() ... }` style constructs
but with added enlightenment about failure states, so I could `if
(something()) OK { ... otherthing() ... } else Fail { ... cleanup() ... }` or
similar. (In this case the success/failure state would be being propagated
within the scope of the if block, with appropriate scope analysis to look for
ambiguity.) This is basically just renamed try/catch though, and all I've done
is concretely demonstrate that language design requires investments of more
than 15 minutes, heheh :)
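
For comparison, the "failure as a first-class value" idea can be approximated in C++17 with `std::variant`. This is only a sketch of the general shape, not the proposed `OK { ... } else Fail { ... }` syntax; `Fail`, `Outcome`, `parse_digit` and `describe` are hypothetical names.

```cpp
#include <string>
#include <variant>

// A failure carries a reason instead of being a bare false/NULL.
struct Fail { std::string reason; };

// Either a value of type T or a Fail.
template <typename T>
using Outcome = std::variant<T, Fail>;

Outcome<int> parse_digit(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    return Fail{"not a digit"};
}

std::string describe(char c) {
    auto r = parse_digit(c);
    if (auto* ok = std::get_if<int>(&r)) {
        // "OK" branch: the value is available here.
        return "ok " + std::to_string(*ok);
    } else {
        // "Fail" branch: the reason is available here.
        return "fail: " + std::get<Fail>(r).reason;
    }
}
```

Unlike try/catch, the failure stays an ordinary return value, so the caller has to unpack it before it can use the result.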

That being said, on bitwise negation... my first instinct is to make that a
function. Then I started thinking about in-source dynamic DSL lexing like Perl
6 has, "so the user can set up their own operators", and then I suddenly
realized I was reinventing Forth. Raincheck #2.

Pointer arithmetic... depends on the language in question, and whether it's so
low-level you want unfettered access to memory. I consider this from the
perspective of something like PHP, which offers enough low-level access to be
useful in a lot of situations, but still leaves me high and dry when I least
want it to. The problem of course is whether I want a language that does its
own memory management or not, and that's a question I'm really headscratching
over actually. (I now realize/remember PHP gets away with its relative
simplicity because it's an interpreter, and that this comparison is a bit
wonky. Raincheck... #3?)

Pointers to array elements is a C-ism. I'm 100% sure this can be cleaned up to
be a bit more elegant, even in the context of a low-level language that allows
for memory twiddling.

\--

When I initially read your comment, and before I typed out all the above, for
a bit I really started wondering about the balance problem you opened with.
The fun paradox (if my theorization is even half correct) I mentioned is one
way to look at it, but it _is_ really hard, and I didn't have any good
ideas about a solution.

One idea presented itself as I finished reading, in the form of the question
"...what on earth are identificands?!"

I had no idea what that meant. And this gave me a thought.

I wonder if it could be possible to publish a language-design tutorial, in
the form of a gigantic pile of unanswered questions that do explain enough to
get an understanding, but don't suggest or hint at any one particular solution
to a given problem?

Obviously such a work would involve significant reinvention of a lot of wheels
and a lot of duplication of work. But I wonder if it wouldn't result in a
deeper understanding of the problem domain, and maybe even some newly sparked
ideas.

~~~
SamReidHughes
I made up the word identificand. Like, integrand, subtrahend, identificand.

I mean, don’t take me too literally about the month thing. Obviously your mind
is already poisoned by other languages. But I’d say, try to avoid just doing
what other languages do, and inject some novelty. If only like a chess player
going off-book.

------
fuddle
It would be a good idea to add some examples to the readme.

~~~
wycliffb
Will remember to do so: will add an example for each language feature.

------
wycliffb
Will appreciate comments on the implementation/design choices.

~~~
TheDong
I think you'll get more comments if you include specific code samples.

How would you implement the Fibonacci sequence in whack? How might one
organize a simple game of hangman?

If it's suitable for use as a tcp server/client, how about an "echo" client?

Things like that can really show off the stdlib and the syntax choices.

Relatively few people will read through the code, and even those that do will
likely understand the code better if they start from "this is the syntax or
idea that needs to be implemented" as expressed in example code.

Documentation is also, obviously, more coherent to read than implementation
code, and you don't seem to have any documentation explaining what features
whack has (other than a note that it doesn't have a comprehensive type
system).

~~~
wycliffb
There's a sample file main.w in the snapshot folder, there's also a grammar
file at the source root.

~~~
dan-robertson
In that file I saw the line:

    
    
      r.amazing();
    

But that function is defined as taking a Bool. What is this call supposed to
do?

I’m also curious how you plan to implement match type. How would this work if
you give it e.g. a char **? Will the compiler know what the type is and pick the
right clause? I don’t really see a way to do it by inspecting the data at
runtime so maybe the pointer would have to have runtime type information
attached to it, but you would then need to transform that info when
dereferencing the pointer as this info can’t live in memory next to the
pointed-at objects if you want C compatibility.

I also can’t really tell what match type is for from your example. Are you
intending to have inheritance and then using match type as a kind of ad-hoc
polymorphism (eg is my Animal a Dog or is it a Cat?), or some sort of weird
template-like thing, or something else entirely?

If you allow subtyping then does “func(Dog->Int)” successfully match something
of type “func(Animal->Int)”?

~~~
wycliffb
Will commence working on a doc to explain some design choices, and what some
non-obvious code fragments do; will update here when I commit. There's no
subtyping being done currently. The 'match type' construct matches the type of
an expression at compile time. I should also add that some design choices may
be reviewed before the first release.

~~~
dan-robertson
Surely if match type is compile time and there is no subtyping then it is
basically a no-op type assert. I.e. it’s like writing (e : t) in ML?

Or is it supposed to allow for some kind of ad-hoc polymorphism like:

    
    
      func foo(x) string {
        match type(x) {
          char** : return "an array of strings"
          int : return "a number"
          default : return "not sure"
        }
      }
    

And this gets transformed into something like:

    
    
      func foo(Type x_t, x_t x) string {
        match(x_t) {
          pointer_t(pointer_t(char_t)) : return ...
          ...
        }
      }
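
For comparison, C++17 can express the same compile-time dispatch with `if constexpr`. This is only an analogy for what a compile-time `match type` might lower to, not a description of how Whack implements it; the function name `foo` just mirrors the pseudocode above.

```cpp
#include <string>
#include <type_traits>

// One instantiation per argument type; only the matching branch is compiled in.
template <typename T>
std::string foo(const T&) {
    if constexpr (std::is_same_v<T, char**>) {
        return "an array of strings";
    } else if constexpr (std::is_same_v<T, int>) {
        return "a number";
    } else {
        return "not sure";
    }
}
```

No type information exists at runtime here: the compiler picks the clause while instantiating `foo` for each call site.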

~~~
wycliffb
Interesting suggestion! Will be sure to add that when I can.

------
joshumax
Interesting project! I've written compiler frontends for both GCC and LLVM,
and surprisingly found it easier to write one for GCC, despite LLVM's
reputation for modularity. I'd love to hear the reasons why you chose LLVM for
code generation over something else!

~~~
thechao
In my experience, LLVM has too much code motion for understaffed projects to
seriously consider. When I’m developing small languages my goal is to reduce
overall work or, barring that, keep the work constant & get some other
benefit. While I always reach for C++ first, I’m under no delusion that it’s a
fit language for describing easily portable, easily consumable, and stable
API/ABI. For that work, C is the undisputed grand champion. As such, I
generally just translate to C. With the vector intrinsics provided by Clang
(or GCC), I can still target all the features I need.

~~~
sitkack
I think you would both be well served by targeting WASM or Lua as your next
compilation target.

~~~
thechao
The halcyon days of high level languages like C are far behind me. These days
I target custom ISAs using lovingly hand-crafted machine code. My goal is to
write assemblers =)

------
otabdeveloper2
> Whack currently lacks a comprehensively designed type system.

The type system is 90% of programming language design effort.

This is like releasing a car without a 'comprehensively designed engine'.

~~~
maxnoe
To be fair, this is probably far from being "released"

------
pppaul
would recommend looking into writing a grammar, have that generate your AST,
then do some transformations on the AST to generate code. you will save a lot
of time.

I recently did that for a language that i made, via instaparse. the
flexibility and speed i gained was very big. my language isn't Turing
complete, but it has functions, lookup tables, and some pattern matching.

~~~
wycliffb
> would recommend looking into writing a grammar, have that generate your AST,
> then do some transformations on the AST to generate code. you will save a
> lot of time.

I had this thought, so I used a parser combinator (mpc) to generate the AST
from the grammar and source file, then extract useful elements from the AST
for codegen.

~~~
rurban
With mpc you can support macros, adding better macro definitions at
compile-time, not just primitive cpp-style replacements. This would definitely
be a game changer.

~~~
wycliffb
Will definitely follow up on this. Maybe we could support DSLs while at it?

~~~
rurban
Yes, a bit like perl6 grammars. Just performant.

~~~
lizmat
And immutable? Aka, you could not create a DSL that would allow you to create
other DSL inside of it?

------
Myrth
Amazing job, thank you for sharing!

------
devoply
I wanna see a language that is both dynamic and can be compiled. Runs on a VM
and on bare metal. Something like C++, Java, and Python combined. It can
definitely be done and would be an interesting exercise.

~~~
frutiger
> that is both dynamic and can be compiled

JavaScript is dynamic and compiled (as are many other dynamically-typed JIT
languages). Did you mean dynamically typed and optionally statically typed?

> Runs on a VM and on bare metal

JavaScript also runs on a VM and bare metal (e.g. V8's Ignition interpreter
will interpret JavaScript, but TurboFan will compile it to machine code).

~~~
devoply
Yeah I guess that's true. And I guess you could use TypeScript if you don't
like JavaScript.

------
nsstring96
This is super cool, thanks for sharing! Are there any books or other resources
that you found helpful in learning and implementing Whack? I’m dabbling a
little bit with PLs and would love to hear your opinion.

~~~
wycliffb
You might find Types and Programming Languages by Benjamin C. Pierce to be
particularly interesting. There's a GitHub repo with a list of materials,
can't seem to find it. When I do, will remember to share!

~~~
nsstring96
Thank you! Looks very interesting. As for working with LLVM codegen libraries,
would you recommend anything beyond the official docs? I found that most books
for this sort of thing are based on older APIs from a few versions ago.

~~~
wycliffb
I haven't come across good material on the matter. If you do use C++ I do
recommend you employ the RAII idiom in code generation to help with lexical
scoping for your language.
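
A minimal sketch of that idea, assuming a hand-rolled symbol table rather than any LLVM API: `SymbolTable`, `Scope`, `declare` and `lookup` are hypothetical names, and the `int` slot stands in for whatever the code generator stores per variable.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

class SymbolTable {
public:
    // RAII guard: constructing it opens a lexical scope,
    // destroying it closes that scope again.
    class Scope {
    public:
        explicit Scope(SymbolTable& t) : table_(t) { table_.frames_.emplace_back(); }
        ~Scope() { table_.frames_.pop_back(); }
        Scope(const Scope&) = delete;
        Scope& operator=(const Scope&) = delete;
    private:
        SymbolTable& table_;
    };

    // Declares a name in the innermost scope (requires an open Scope).
    void declare(const std::string& name, int slot) { frames_.back()[name] = slot; }

    // Searches from the innermost scope outwards; returns -1 if not found.
    int lookup(const std::string& name) const {
        for (auto it = frames_.rbegin(); it != frames_.rend(); ++it) {
            auto found = it->find(name);
            if (found != it->end()) return found->second;
        }
        return -1;
    }

private:
    std::vector<std::unordered_map<std::string, int>> frames_;
};
```

When the codegen function for a block creates a `Scope` on its stack, the block's variables disappear on every exit path, including early returns and exceptions.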

------
wycliffb
Just pushed LLVM.dll to snapshot folder.

------
webkike
> Whack currently lacks a comprehensively designed type system.

To me, this statement means: currently whack is not a properly designed
programming language.

Proofs are an important part of programming! This is the most important part.

~~~
chrisseaton
Lots of properly designed and practically useful languages have no
comprehensive type system.

~~~
nerdponx
What constitutes a type system? Do Python or Scheme have type systems?

------
lewisj489
Shwacked

------
xixixao
I know that mentioning another language in a thread about A language is
contentious, but if you haven’t played with Nim, I seriously recommend you
check it out.

