
Clay: A new language for generic programming (based on LLVM) - zaph0d
http://tachyon.in/clay/
======
kssreeram
Hi folks, I'm the author of Clay. There aren't any docs ready yet, but I'll
try to answer whatever questions I can. Also, there's a long thread over at
reddit with many interesting questions.

[http://www.reddit.com/r/programming/comments/ctmxx/the_clay_...](http://www.reddit.com/r/programming/comments/ctmxx/the_clay_programming_language/)

~~~
joe_the_user
I couldn't help imagining how OO "objects" might be implemented in Clay.

Your dispatch code now:

    
    
        record Square {
            side : Double;
        }
    
        record Circle {
            radius : Double;
        }
    
        variant Shape = Square | Circle;
    
        procedure show;
        overload show(x:Square) { println("Square(", x.side, ")"); }
        overload show(x:Circle) { println("Circle(", x.radius, ")"); }
    
    

My "Objectivizing":

    
    
        interface Shape {
            abstract procedure show(self);
        }
    
        record Square : Shape  {
            side : Double;
            show(self) { println("Square(", self.side, ")"); }
        }
    
        record Circle: Shape {
            radius : Double;
            show(self) { println("Circle(", self.radius, ")"); }
        }
        variant Shape = Square | Circle;
    

Here, the "abstract" declaration in the parent would make the children's "show"
automatically become an overload. 'self' is a variable which would be
automatically specialized to the containing record. I think the resulting code
isn't excessively verbose.

------
jaen
At first sight, this looks like a major improvement over C++ and even more
powerful than D with fewer ad-hoc features.

Interesting language features, as far as I can tell:

* Type inference

* First-class functions/closures (interesting compared to C++)

* Overloading is based on compile-time evaluation, with the possibility to have arbitrary values as type parameters, and is similar to predicate dispatch (you can dispatch using arbitrary conditions), with also the possibility to do runtime dispatch.

* Most of the functionality of the language is implemented in libraries, with a small kernel of built-ins.

Library functionality:

* the standard standard library: arrays, vectors, maps, algorithms, sequence abstractions etc.

* tuples, unions, variants

* lazy sequences (streams)

* reference-counting shared pointers (boost::shared_ptr)

* SIMD intrinsics and vectors using SIMD operations

* bindings to the C standard library of Unix and Win32 API

* Green threads, channels

Other:

* C binding generator based on Clang

------
joubert
I like the multiple dispatch feature
([http://bitbucket.org/kssreeram/clay/src/b1df8340f2b4/test/va...](http://bitbucket.org/kssreeram/clay/src/b1df8340f2b4/test/variants/3/main.clay)).

It reminds me of Clojure multimethods (<http://clojure.org/multimethods>). Can
dispatch conditions in Clay be "arbitrary"? (In the example it is on the value
of 'x', but can it be as complex as in Clojure multimethods?)

------
Chickencha
It looks pretty nice. I'm curious about how the code generation works and
whether Clay has the potential to produce the kind of code bloat you sometimes
see with C++ templates. Say I have a function

    
    
      min(a, b) {
          if(a < b)
              return a;
          return b;
      }
    

From what I understand, this function can accept two Int32s, or a Uint8 and a
Float32, or basically any other combination of numeric types. Is a new
function for each type combination generated? I don't think there's any way
around this to some degree, but I'm curious if you've been able to do anything
to ease the pain.

~~~
kssreeram
You can control type specialization to a certain extent with overloading. For
instance, if a procedure takes two arguments, and if both these types can vary
independently, then you can use overloading to ensure that both arguments are
converted to a common type.

    
    
        procedure foo;
        
        [T1,T2]
        overload foo(a:T1, b:T2) {
            // by default, convert both arguments
            // to the type of first argument.
            foo(T1(a), T1(b));
        }
        
        // the following overload specializes for the case
        // when both arguments are of the same type.
        [T]
        overload foo(a:T, b:T) {
            // ... implement the logic here ...
        }
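
For comparison, here is a rough C++ sketch of the same idea, not Clay code and not taken from the Clay sources. The name `smaller` and the choice of a min-like body are assumptions made for illustration. The mixed-type template is a thin forwarder, so the real logic is only instantiated once per type `T` instead of once per `(T1, T2)` pair.

```cpp
#include <cassert>
#include <type_traits>

// Same-type overload: the real logic lives here, so it is instantiated
// once per T rather than once per (T1, T2) combination.
template <typename T>
T smaller(T a, T b) {
    return (b < a) ? b : a;
}

// Mixed-type overload: converts the second argument to the first
// argument's type and forwards, mirroring foo(T1(a), T1(b)) above.
// Overload resolution prefers the (T, T) template when both types match.
template <typename T1, typename T2>
T1 smaller(T1 a, T2 b) {
    return smaller(a, static_cast<T1>(b));
}

// Hypothetical usage, for illustration only.
inline void example_usage() {
    assert(smaller(3, 7) == 3);      // picks the (T, T) overload with T = int
    assert(smaller(3, 2.5) == 2);    // 2.5 is converted to int 2 first
    static_assert(std::is_same<decltype(smaller(3, 2.5)), int>::value,
                  "result takes the first argument's type");
}
```

Note that converting to the first argument's type can lose information (2.5 becomes 2 here), which is the same trade-off the "by default" conversion in the Clay snippet makes.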

------
mark_l_watson
As someone who likes high level languages (mostly Lisp and Ruby, and learning
Haskell for the second time), I am surprised how much I like the language:
list comprehensions, type inferencing for concise code, etc. There is about
zero chance of my using this new language however since my language selection
is about 90% driven by what customers want.

~~~
mark_l_watson
Hello kssreeram, question: your company web page and quillpad product look
interesting; do you use Clay for quillpad development and deployment?

~~~
kssreeram
We are currently using Clay for a few projects within Tachyon for performance
sensitive code. It's simply being used as a better C, and since Clay can
generate light-weight C compatible DLLs, it fits in very well.

Quillpad itself predates Clay and hence doesn't use it.

------
roryokane
It looks like the syntax could use some polishing. Three obvious improvements:
use line breaks for semicolons (and backslash at the end of line for multiline
statements), do not require parentheses around conditionals in if, while, for,
etc. statements, and (slightly more controversially) use significant
indentation instead of curly braces. I think each of these changes would make
the language undebatably more concise and readable than before (except the
last one, which _has_ been debated, but I think it’s definitely an
improvement).

An example from algorithms/introsort of how the syntax would look:

    
    
      overload introSort(first, last)
          if first != last
              introSortLoop(first, last, log2(last-first)*2)
              finalInsertionSort(first, last)
    

instead of

    
    
      overload introSort(first, last) {
          if (first != last) {
              introSortLoop(first, last, log2(last-first)*2);
              finalInsertionSort(first, last);
          }
      }

~~~
alnayyir
This is more than a little vacuous.

~~~
chc
Not really. Python is famous for its readability, and this interest in
reducing "noise" is one of the reasons why. It's definitely something that a
language could benefit from at least riffing off, if not stealing outright.

The Python philosophy is that everything is expressed clearly and concisely.
Parens around conditionals are just restating information that is already
apparent, as are semicolons between lines. Fluent readers actually learn to
ignore these syntactic features unless they're debugging (since these useless
tokens are a breeding ground for bugs).

Ask yourself: How often do you mentally match the opening and closing parens
on a conditional? How often do you rely on semicolons to tell when you're
looking at a new statement as opposed to just looking at the lines of code?

~~~
alnayyir
Yeah, listen. I'm a python programmer as a matter of profession and paycheck.

It's a matter of taste.

Do I prefer significant whitespace ala python/haskell?

Yes.

Does it matter? No.

Does it matter when you're discussing a programming language you've just now
encountered for the first time and is rather new and has many novel things to
contribute to the world?

Fuck no.

Like I said, it was a vacuous thing to say. There are far more important
questions to ask like,

"Are the generics a space-time trade-off similar to C++ templates?"

"Can I use the type system to encapsulate and restrict behavior in powerful
ways, allowing me to create performant but safe code?"

"Can I make a beowulf cluster out of this?"

Any of those questions have more substance than, "hurr whitespace is bettar
why didn't you do tghaaaasdfsdgsg"

Christ-sakes.

~~~
chc
Your entire response focuses on significant whitespace, which I didn't mention
at all in mine.

And no, the time-space tradeoff of the generics is not necessarily a better
question than how readable the language is. _That_ is a matter of taste. I
will spend more time reading the code than I will worrying about the
performance characteristics of generics or creating a Beowulf cluster, so
caring about the common case is not exactly vacuous, even if it isn't the #1
most important thing.

~~~
alnayyir
>Your entire response focuses on significant whitespace, which I didn't
mention at all in mine.

Doesn't matter.

>Beowulf cluster

It was a fatuous comment designed to compare with the original fatuous
comment.

>the time-space tradeoff of the generics is not necessarily a better question
than how readable the language is. That is a matter of taste

No it's not. Taste is preference, whether or not a language is impossible to
deploy with generics in an embedded environment has absolutely nothing to do
with whining about syntax.

    
    
        importance:
            Semantics > syntax
    

I don't think I've seen sophistry and an obsession with the trivial on
hackerne.ws like this in quite some time.

You're complaining about the color of the bikeshed when real discussion and
work is to be done concerning the semantics and structure of the language?

~~~
chc
I think you're unnecessarily trivializing things that you personally care less
about. Who cares about embedded environments? People who work in them. Is that
group dwarfed by the group of "programmers who have to read and debug code"?
Yes it is. So it seems really petty to personally insult me for caring (not
much, but a little) about something that makes code less bug-prone and easier
to read for everybody while you hold up something that affects a vanishingly
small number of people as _what we should be talking about_.

Your complaints seem to reflect an idea that only features and implementations
matter, while user interface is fairly trivial. If you believe that, I think
you might be interested in this:
<http://www.alistapart.com/articles/indefenseofeyecandy/>

~~~
alnayyir
You linked to a web design website in defense of trivial eye candy in a
discussion about computer science.

I'll just leave it at that.

~~~
roryokane
Web design is the art of creating user interfaces for websites – the website
visitors are the users. Language design is the art of creating user interfaces
for programs – programmers are the users. The fields are related by the common
theme of designing things for users. And I don’t see how you can call eye
candy trivial without defending that assertion _right after_ reading an
article that argues it is not trivial.

------
planckscnst
I find it interesting that goto is in the examples. I'm not deeply familiar
with program theory and computability, but I thought goto made the program
impossible both to reason about and to verify its correctness. Why would one
include it in a new programming language? Are there really valid uses of it? I
don't think the factorial example here is a good use - he basically used it as
a while(true) infinite loop. However, I also couldn't tell you with accuracy
why it might be bad here other than the ingrained mentality of goto=evil.

~~~
fhub
goto can be useful for "alarm exits". The linux kernel uses them a lot. See
the bottom of this thread for some pseudo code
<http://kerneltrap.org/node/553/2131>
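
As a rough illustration of the "alarm exit" pattern (not code from the linked thread or from the kernel), here is a C-style sketch that also compiles as C++. The function `dot_of_ramps` and its behaviour are made up for the example. Every failure path jumps to one label, so the cleanup is written exactly once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: fills two scratch buffers and sums their products.
// Returns 0 on success, -1 on failure; all exits share one cleanup path.
int dot_of_ramps(std::size_t n, long* result) {
    int status = -1;     // assume failure until the very end
    long* a = nullptr;   // declared before any goto: C++ forbids jumping
    long* b = nullptr;   // past an initialized declaration
    long sum = 0;

    if (n == 0 || result == nullptr) goto out;

    a = static_cast<long*>(std::malloc(n * sizeof(long)));
    if (a == nullptr) goto out;

    b = static_cast<long*>(std::malloc(n * sizeof(long)));
    if (b == nullptr) goto out;

    for (std::size_t i = 0; i < n; ++i) {
        a[i] = static_cast<long>(i);  // 0, 1, 2, ...
        b[i] = 2;
    }
    for (std::size_t i = 0; i < n; ++i) sum += a[i] * b[i];

    *result = sum;
    status = 0;

out:
    std::free(b);  // free(nullptr) is a no-op, so partial setups are fine
    std::free(a);
    return status;
}

// Hypothetical usage, for illustration only.
inline void example_usage() {
    long r = 0;
    assert(dot_of_ramps(4, &r) == 0 && r == 12);  // (0+1+2+3) * 2
    assert(dot_of_ramps(0, &r) == -1);            // early exit, nothing leaked
}
```

In idiomatic C++ the same guarantee usually comes from destructors (RAII) instead; the goto form is shown only because it is the one the comment is talking about.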

~~~
adamc
Sure, but in the factorial example, Clay uses goto to express iteration:
[http://bitbucket.org/kssreeram/clay/src/b1df8340f2b4/test/fa...](http://bitbucket.org/kssreeram/clay/src/b1df8340f2b4/test/factorial.clay)

------
Raphael_Amiard
Looks like a very interesting language to me. I think it superficially looks
a lot like rpython, with way more options to restrict and specify types on top
of it. I'm very eager to see more documentation, specifically on types and
memory management.

What is the syntax for pointers, for example?

~~~
kssreeram
Hi. Pointers to type T have the type Pointer[T]. The '&' operator is for getting
the address of an lvalue, and the '^' operator is for dereferencing. '^' is a
better choice than C's '*' for dereferencing because I can conveniently use
the same operator for dereferenced field-access too, whereas C had to invent
another operator "->" for that.

    
    
        record Point[T] {
            x : T;
            y : T;
        }
        
        updateViaPointer(ptr) {
            ptr^.x += 1;
            ptr^.y += 2;
        }
        
        test() {
            var p = Point(10, 20);  // type will be inferred as Point[Int]
            updateViaPointer(&p);
        }

------
joe_the_user
Sample code I'd like to see:

* How you'd implement a B-tree object

* How you'd implement the-equivalent-of-a-class (a list of related functions and structures akin to something you'd see in the GTK documentation).

------
jacquesm
That looks like a hodge-podge of C and Pascal to me.

Sorry about the tone of this comment but I fail to see anything that would
make me go 'yes, let's try this'.

Generic programming is something you can do in any language, and most of the
(successful) ones out there are created with that goal in mind.

Can someone more in the know enlighten me as to why 'clay' is special in this
respect?

~~~
kssreeram
> Generic programming is something you can do in any language, ...

That's not true.

Generic Programming is about writing re-usable code that is also very
efficient. On the whole, it requires the following:

\- Static types

\- Overloading

\- Type-parametric functions (templates)

\- Type specialization.

Not all languages have these features.

Generic programming first took off with C++, when it introduced templates. But
a few languages before and after C++ have supported generic programming: Ada,
Haskell, D, etc.

edit: formatting.

~~~
jacquesm
Generic programming is a way of writing code, not a thing that your language
supports or you can't do it. I can write perfectly re-usable C code that is
also very efficient by relying on the pre-processor to customise the code to
the exact types and conditionals required for the situation at hand.

It's a bit like saying you can't write functional code unless you use a
functional language.
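
A minimal sketch of the preprocessor approach described here, written C-style but compiled as C++. The macro name `DEFINE_MAX` and the generated function names are invented for this example. Each use of the macro stamps out a separate function for one type.

```cpp
#include <cassert>

// Stamps out one monomorphic function per type name. The ## operator
// pastes the type name into the function name, so T must be a single
// token (e.g. "int" works, "unsigned long" would not).
#define DEFINE_MAX(T) \
    inline T max_##T(T a, T b) { return (a > b) ? a : b; }

DEFINE_MAX(int)     // generates: int max_int(int a, int b)
DEFINE_MAX(double)  // generates: double max_double(double a, double b)

// Hypothetical usage, for illustration only.
inline void example_usage() {
    assert(max_int(3, 7) == 7);
    assert(max_double(1.5, 0.5) == 1.5);
}
```

The caller has to request each instantiation by hand and pick the right name at every call site, which is the bookkeeping that templates or Clay-style overloads do automatically.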

~~~
Robin_Message
Firstly, using the preprocessor will mean there is code duplication, even if
it is not necessary. Secondly, once you are using the preprocessor to do
generic stuff, you have probably thrown type-safety away.

That's not to say you can't write C in a generic way, but "generic
programming" means something specific to computer scientists and has certain
prerequisites.

And you can't write functional code without a functional language, without
building a functional language on top of your language (which may not even be
possible, e.g. you can do functional-ish stuff in C because of function
pointers. Without them, you'd be stuffed.)

You can write functional code in a "non functional language", if by functional
language you mean "Haskell, ML or Scala". But you need certain features like
function pointers to do it, and in that sense you _do_ need a functional
language. Same for generic programming - you need certain features.

~~~
Raphael_Amiard
Function pointers are very far from sufficient because you can't define new
functions at runtime, since you can't nest functions or define anonymous ones.
That means your functions are not first class citizen of your language and
some common functional programming techniques are impossible to use in a clear
way.

You can't do this, for example:

    
    
        def make_adder(num_1):
            def adder(num_2):
                return num_1 + num_2
            return adder

~~~
Robin_Message
Yes. Generally you then start cheating by passing around a function pointer
and a void* that is the first argument of the function, i.e. doing closure
conversion yourself. You can hide this with some macros and get close to
having a functional language, but it's ugly and tedious (Cfront anyone?) Also,
what I'm describing here is more similar to object orientation than functional
programming. It's worth remembering the following koan though:

The venerable master Qc Na was walking with his student, Anton. Hoping to
prompt the master into a discussion, Anton said "Master, I have heard that
objects are a very good thing - is this true?" Qc Na looked pityingly at his
student and replied, "Foolish pupil - objects are merely a poor man's
closures."

Chastised, Anton took his leave from his master and returned to his cell,
intent on studying closures. He carefully read the entire "Lambda: The
Ultimate..." series of papers and its cousins, and implemented a small Scheme
interpreter with a closure-based object system. He learned much, and looked
forward to informing his master of his progress.

On his next walk with Qc Na, Anton attempted to impress his master by saying
"Master, I have diligently studied the matter, and now understand that objects
are truly a poor man's closures." Qc Na responded by hitting Anton with his
stick, saying "When will you learn? Closures are a poor man's object." At that
moment, Anton became enlightened.

\-- Anton van Straaten
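
To make the function-pointer-plus-`void*` trick from before the koan concrete, here is a hand-rolled version of `make_adder`, written C-style and compiled as C++. All names (`AdderEnv`, `IntClosure`, `call`) are assumptions made for this sketch, and the caller-owned environment is just one possible way to handle lifetime.

```cpp
#include <cassert>

// Environment captured by the "closure": the hand-written stand-in for num_1.
struct AdderEnv {
    int num_1;
};

// A closure is a code pointer plus an opaque pointer to its environment.
struct IntClosure {
    int (*code)(void* env, int arg);
    void* env;
};

// The lifted function body: the environment arrives as an explicit
// first argument instead of being captured implicitly.
inline int adder_code(void* env, int num_2) {
    AdderEnv* e = static_cast<AdderEnv*>(env);
    return e->num_1 + num_2;
}

// The caller supplies storage for the environment and must keep it
// alive for as long as the returned closure is used.
inline IntClosure make_adder(AdderEnv* storage, int num_1) {
    storage->num_1 = num_1;
    return IntClosure{adder_code, storage};
}

// Invoking the closure means passing its own environment back to its code.
inline int call(IntClosure c, int arg) {
    return c.code(c.env, arg);
}

// Hypothetical usage, for illustration only.
inline void example_usage() {
    AdderEnv env5, env10;
    IntClosure add5 = make_adder(&env5, 5);
    IntClosure add10 = make_adder(&env10, 10);
    assert(call(add5, 3) == 8);
    assert(call(add10, 3) == 13);
}
```

The struct pairing data with a function pointer is also roughly what an object with one method looks like, which is the point the koan is making.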

