Hacker News new | comments | show | ask | jobs | submit login
Deca - a systems language based on modern PL principles (code.google.com)
119 points by nponeccop 1968 days ago | hide | past | web | 57 comments | favorite

It's great to see a language which doesn't ignore 20th and 21st century programming language theory. It's tremendously boring waiting for the future to happen.... Anyhow, the type system looks quite interesting.

The choice of LLVM for C interface and performance is also really sensible.

I was surprised to find the language implemented in Scala and Java though. Perhaps there will be some kind of bootstrap, or maybe the ability to access the java libraries is intentional.

Without proper documentation it is a bit hard to evaluate this more fully at this stage. I looked but didn't find many docs yet. Certainly a language to watch.

The language is a compiled language, with the JVM being used (originally) because I wanted a particular parser generator. Deca code can't access Java libraries because Deca code compiles to LLVM bitcode and thence to machine code. decac is a compiler, not an interpreter, so it takes the Deca code in and outputs the LLVM bitcode.

Sorry about the lack of docs. I've been slowly dumping my undergraduate thesis on this into the wiki, but the language has been evolving and I've been starting a career. Oy ;-).

Sorry to nag, but how is outputting to LLVM bitcode not interpreting? You are taking source code and turning it into intermediate form (i.e. LLVM bitcode), no?

An interpreter means a program that runs a program, straight from its source code.

A compiler transforms a program from one format into another. decac does not execute programs, so it's not an interpreter.

LLVM bytecode does not exist after linking - it turns into usual machine code. It is not even a JIT compiler, talk less of an interpreter.

I keep a (small) list of active awesome and interesting alternatives to C. Cyclone has been mentioned already, adding Deca now. the other two I know are:

* ATS http://www.ats-lang.org/#what_is_ats_good_for

* Clay http://claylabs.com/clay/

You may be interested in this (see the section about Nothing and Maru): http://www.vpri.org/pdf/tr2011004_steps11.pdf

A lot of work there has been done in completely replacing the entire current software development stack with something more reasonable (in terms of size/bloat and design). They aim for a full system (frontend user apps to the metal) in under 20,000 lines of code. Check it out. Quite the alternative to C, and everything.

Thanks yeah, I've heard of VPRI and read a few of thier papers and seen some of their videos. Alan Kay is one of my heroes and I really do think that the idea of breaking down problems to domains and working within that is very important going forward.

I'm shamelessly adding my own HNC here :) http://code.google.com/p/inv/ https://github.com/kayuri/HNC/wiki I'm moving to GitHub - so 2 repos. "C++ is an evil language used at early stages of HNC development", so it will go away. See also discussions on reddit (submitted not by me lol) http://www.reddit.com/r/haskell/comments/mldzq/hnspl_a_bette... and here on HN long time ago http://news.ycombinator.com/item?id=2116337 (submitted by me)

See also Vala, Mercury, Felix and Haxe.

Dont you know about Rust? If you do why is it not on the list?

Rust, while interesting and admirable, isn't quite as low-level as C or Deca. For example, one of Deca's explicit goals is to not require a garbage collector or a runtime library. Rust has GC and a light-weight-threading runtime. So it's not quite a replacement for C, though it's certainly a good replacement for much of the application-level programming C and C++ are (unfortunately) widely used for.

Ok, thx.

Rust certently has a runtime, cant argue with that.

I thought Rust allows a pretty tight control over allocation with the pointer hierarchy feature. The GC is designed not to be global and should be pretty easy ignore. They even have diffrent kinds lambdas to allow the programmer if the should be allocated on the heap or the stack.

I have not done alot of low level programmig but I want to learn more about writting Jits and the pretty much need to be rather low level atm there is not really an alternative then writting them in C (witch sucks).

Yes I know of Rust and think it's very interesting. For most system level programming it should be an excellent choice. It's great that the dearth of systems capable languages is something that is becoming less of an issue.

But when i was making the list I was thinking about if the language could possibly be used in an embed context. C is still king there. Rust I believe, is partially garbage collected in a way that is not practically avoidable. Any use of boxed types or abstract datatypes will incur this penalty. So with that criterion rust is not an option, in an embed system continuous execution is most vital. The languages I listed are either not garbage collected or with an ability to turn it off or use an alternative to abstract datatypes and have basically no overhead. The last time I saw Rust mentioned on HN a team member validly stated that such concerns are not the priority at the moment.

Garbage-collected languages can still be fast on microcontrollers. When there isn't a lot of memory it only takes a few milliseconds to run a collection cycle. The problem would be when you need real-time and can't deal with the jitter.

There are real time GC working on mission critical embedded scenarios in real life.

One example is one of the french radar system for missile defense, Normandie.

Can modern garbage collectors deal with almost full heaps? There is an urban legend that GC only shows good performance with 50% of memory free, which is certainly unacceptable for microcontrollers. Also, most garbage collectors require boxing which prevents tight packing of heap data. Are all these problems already solved somewhere? Jitter is the least important issue as there are a lot of works on realtime collectors.

Unfortunately I've no idea how the various GCs deal with the trade-offs; my point is that (memory bandwidth / memory size) in these embedded things is high enough that garbage-collection is fast, contrary to popular wisdom. I was looking at OCAPIC for an example, which has a very simple stop and copy gc which takes 1.5ms to collect. The trade-offs would have some 2× impact on some metric, but not change the feasibility of the thing.

My point is that 2x slowdown is critical for many systems - embedded, gamedev, HPC, systems software. Controlled measurement of slow-down and heap slack caused by GC is beyond capabilities of most embedded devs. Academia provided us with some experimental results saying that in half-empty heaps GC is lightning fast, but there's no (recent) data for 90% full heaps and for very large heaps.

I greped for "core", "memory manager", "thread", "threading", and "cache", in the pdf [1]. Am I missing something? I'll probably get flak for this, but this programmer looks for a modern systems language that directly addresses these concerns.

"The fundamental problems of a systems-programming task or environment are hard limits on computational resources and a lack of safety protections. Systems programs have to deal with hardware-imposed limitations on their CPU time, registers, memory space, storage space on I/O devices, and I/O bandwidth. They also often have to deal with a lack of the safety features normally given to most programming environments: garbage-collection of memory, validation of memory accesses, synchronization primitives, abstracted access to I/O devices, and transparent multitasking. In fact, the point of systems programming is usually to create such abstractions and program such protections."

Let's take Go, as an example. (Or D). Per above definition, neither is a "systems programming" language. I know for a fact [2] that Go team would disagree.

So what is the accepted definition of a "modern" systems programming language?

[1]: http://code.google.com/p/decac/downloads/detail?name=Deca%20...

[2]: http://golang.org/doc/go_faq.html#What_is_the_purpose_of_the...

I've always taken "systems language" to mean something you might write an OS in while the Go team takes it to mean something you'd replace Java with. Whichever way you want to look at it one thing the Go team can't say is that writing an OS is a "non-systems" problem.

Ideally these would be disjoint sets (in my opinion) but consider this: ASM -> C -> C++ -> Java -> ... looks like a continuum if you're writing any kind of Java code, but there's a sharp discontinuity in there if you're writing (say) a boot loader.

> Whichever way you want to look at it one thing the Go team can't say is that writing an OS is a "non-systems" problem.

I do agree with that. (Perhaps it would be helpful to apply the high/low qualifier to systems languages?)

> Ideally these would be disjoint sets (in my opinion) but consider this: ASM -> C -> C++ -> Java -> ... looks like a continuum if you're writing any kind of Java code, but there's a sharp discontinuity in there if you're writing (say) a boot loader.

Food for thought. I am not arguing for needless complexity (in a language) as far as the boot up phase is concerned. As far as I understand it, the loader has its critical, but short lived, life-cycle role to play and then it is out of the way. Arguably, it is distinct from the OS. Can we not continue to write them in ASM/C and load operating systems that are written in a more "modern" language?

Personally I'd sooner map the somewhat fuzzier concept of "applications language" onto that supposed continuum instead. It would put Go in a clearer category, since it seems to compete with Java and Python better than it does with C.

I still don't imagine an OS being written in a non-systems application language, since you've got drivers and performance critical subsystems to write and things like GC to worry about from the apps-only languages.

Have a look at Spin or Native Oberon, just to name two operating systems written in GC enabled languages.

Before C existed, there were already operating systems written in PL/I and ALGOL, which provide better type safety and memory management as C or C++.

Unix and C success has regressed what meant to be a proper systems programming language.

There is even a paper from Adele Goldberg about this phenomena, unfortunately I cannot recall the title now.

One of the examples in the repository is an implementation of malloc.

I tend to think of Go as an attempt to redefine what a systems programming language is (hence the "modern").

I don't think that there can be an established definition of what a modern systems language is, otherwise it would no longer be termed modern. Almost by definition theorists will differ on what the essential characteristics must be.

> One of the examples in the repository is an implementation of malloc.

Right. And (imho) malloc is quite possibly an "old fashioned" way of looking at things:

   function malloc(num_bytes: nat): @byte
A reference to a byte block obtained from specifying the number of bytes! I would like to see the modern memory manger be type aware, have a very rich memory model, and allow for the propagation of application level semantics to OS (so they can cooperate in dealing with the current issue: memory access latencies and concurrency).

> I don't think that there can be an established definition of [a] modern systems language

Key word is "modern" so here is an attempt to box that a bit: There are 2 possible dimensions to the notion of modern:

1 - it addresses new hardware realities

2 - it enables "progress" in dealing with matter rendered difficult by existing paradigms of system programming.

The realities of a modern system surely include: multi-core SMP; strong likelihood to be employed as distributed nodes; virtualization. So "modern" platforms and deployment patterns are not addressed.

Second, Deca, I gather, would delegate dealing with all this to some higher level (user level) framework built on top of the language. I don't see any progress in that critical front, either.

Deca does deal with SMP and concurrency in one way: module-level variables are tagged thread-local in the bitcode by default, at least if they're mutable.

Once the "newsig" branch (it's a rewrite from scratch after the original code had troubles with multi-module compilation) with the new signature system (region, effect, type, mutability) can compile some example code again, I'm merging it back into the main branch.

> I would like to see the modern memory manger be type aware

Because these are system languages, that would happen when hardware is type aware.

It is not necessary. With software replacement for MMU possible (see Singularity OS), hardly any hardware safety support is a requirement nowadays. TAL is the way to go - AFAIK it is possible to make plain x86 assembly dependently typable by attaching proof witnesses to EXE.

Nice to see something actually trying a CLOS-style object system. There are good ideas in there that don't get enough presence in more recent language efforts.

Look at Dylan, stuff started happening again.


I wonder what happened to SELF-like object systems (slots, prototypes and all that).


I mean high-performance implementations. Techniques to (statically) compile prototypes into efficient code were developed, but they are largely irrelevant to Javascript. High-performance modern implementations of JS rely on tracing JIT.

Pretty sure polymorphic inline caches are standard fare in both Javascript and other VMs (JVM, CLR, etc.). What other techniques are you referring to?

Looks interesting. When I read "modern PL principles" I was afraid to read about auto-generated getters and setters or something. :P

Looks like you have never took academia papers seriously :)

I see that "lispy macros" will not be supported, but one of the things that bugs me about C (in my brief experience) is the verbosity and repititiousness. Good generics will help, but I'd still like to see a better macro system than the C preprocessor.

One possible feature that stands out would be macros local to a scope. I actually did this, defined a macro right in the middle of a function to automate some error-handling junk. It felt nasty, but not quite as nasty as copy-pasting or retyping the code, as long as I don't try to re-use the name. It would be nice if my language handled stuff like this.

This is obviously not a huge issue. The language looks awesome.

Macros can be band-aid to overly verbose languages, but plenty of languages are concise without (whereas C is particularly bad). Higher order functions and an expressive type system (features it shares with OCaml, Oz, etc) are where the magic is.

ltu discussion: lambda-the-ultimate.org/deca

Finally a successor to the now frozen bitc

What about Cyclone?

I'm not sure I've even heard the name once. Looks interesting , gradually safer than C, I like that.

Well, BitC is not too popular either. Another interesting experiment is Sing#/Singularity, which is a successful attempt to get IO performance of FreeBSD despite designing whole OS in modified dependently-typed C# with usual JIT and garbage collection. The question whether GC is applicable in memory-constrained environments is still open - I couldn't find a single research in this direction.

Looks awesome, especially the type system.

Check out http://code.google.com/p/decac/source/browse/examples/list.d...

I'd love to see some discussions on possible/practical shortcomings.

The "newsig" branch is current development. That old stuff still has older syntax and semantics.

The type system looks novel, but it doesn't provide controlled effects or safety as the system for Disciple language described in the "Type inference and optimization in impure world" paper. So it doesn't qualify as 'awesome' IMO

The docs are out of date. I read the Disciple papers about a month after finishing the thesis, and immediately began working out how to add "back" (regions were always intended) regions and effects.

Yeah, I have the same story. I have long wanted a "mutable" type system for my HNC project ( https://github.com/kayuri/HNC/wiki, http://code.google.com/p/inv/ ) - it was here on HackerNews about 1 year ago but got little to no interest. I asked on cstheory.stackexchange.com but got no useful answers. And then someone pointed my attention to DDC. I knew about it but I thought it was just an eager revamping of Haskell, just like many other eager haskells (speculative execution etc) out there.

how do you combine first class functions (and so closures) with no garbage collection? don't you end up with possibly multiple references to memory and no way of knowing who owns it?

If you have a function that returns a function, basically you need to rewrite things to lift the referenced state into the outer scope. In pseudo-C:

    (void -> int) incrementer(int initial_value) {
	int count = initial_value;
	return lambda() { return count++; }
might be rewritten as:

    struct hidden_environment_type {
	int count;

    int incrementer_lambda(hidden_environment_type *env) {
	return env->count++;

    void incrementer(hidden_environment_type *hidden_arg, int initial_value) {
	hidden_arg->count = initial_value;
Now the caller will allocate a variable of type hidden_environment_type, and pass its address as the first (artificial) argument.

Note that lambda functions require a different calling convention to C function pointers, since they need to be passed a pointer to their environment in addition to their regular arguments. Introducing lambdas either requires some resolution of this complication, such as specialisation machinery (ie in C++11, templates provide the necessary specialisation). Another option is to use the environment-passing convention for every indirect call, possibly giving a minor performance hit for calling raw function pointers.

Another issue is that the size of an environment type depends on the code of a function involving it, so lambdas don't have uniform size unless some additional indirection is introduced, probably to heap storage (Apple's blocks extension to C takes this approach). This makes it either impossible or slightly more expensive to store lambda functions in data structures, depending on which implementation you choose.

After all this, your ownership/lifetime question boils down to "who is holding onto the instance of hidden_environment_type?", which is a pretty simple question to answer in C-like languages.

If you read the dissertation, it turns out that variables are captured by value. But even if they are captured by reference - solutions are possible such as the one in C++11. The decac doesn't provide more memory safety than C, so there are no problems :) Yet another solution is to inline all higher-order functions so no closures exist at runtime.

Actually, this issue is what spurred a major evolution in the language. I hadn't had a real region or effect system before, and now I do. Why? I needed some way to register what pointers (regions) are getting captured by an existential package (aka: a lexical closure). Since regions are now registered as part of an existential package, the regions captured by a package can be constrained to suit the region of the stack up to which a package is being returned.

In short, closures now remember what pointers they capture, and you won't be allowed to return a closure over a pointer further up the stack than safe. Other than that, closures capture by value, copying the captured variable into an environment structure.

Isn’t the name of the language Decac?

The dissertation says it's Deca. DecaC stands for Deca Compiler.

They should have used more C-like syntax.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | DMCA | Apply to YC | Contact