
C2 Lang design (2014) [pdf] - asaka
http://c2lang.org/docs/c2lang_design.pdf
======
adrusi
I understand _why_ it's attractive to integrate more and more into the
language proper, but I don't like it. Almost all tools which still embrace the
unix way of doing things are written in C, and it's not just because that
community is conservative. C feels like unix, because it's one of the only
languages that abides by that philosophy. C-the-language is only part of the
experience of programming in C. Programming in C includes DSLs and code
generation and makefiles. Even what we consider C-the-language can be
decomposed into preprocessor code and actual C.

That said, C needs to be improved, but I think it should be done via the build
system, adding layers on top of C, in a tasteful, thought-out way. Some ideas
below:

If we're expanding on C by adding build system complexity, Makefiles need to
be improved. Make is a great tool, but it's unityped, like shell scripts. And
it's essentially a preprocessor over shell, which leads to a mess of sigils.
Maybe redesigning make as an embedded language in some lisp would do it.
Additionally we could unify the notion of linking to a library and importing
code into the makefile to allow dependencies to specify build steps. This
could make some of the added build complexity implicit.

Namespaces could probably be implemented as a preprocessor, taking module
declarations and import statements and converting all identifiers into their
prefix-qualified equivalents, emitting warnings when there would be a collision
(like if module "foo" declared "bar_baz" and module "foo_bar" declared "baz").
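
To make the collision case concrete, here is a minimal sketch in C. It assumes a hypothetical mangling rule of module name, underscore, identifier; the function names are made up for illustration and do not come from any existing tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mangling rule: qualified name = module + "_" + identifier. */
void mangle(char *out, size_t cap, const char *module, const char *ident)
{
    assert(out != NULL && cap > 0);
    snprintf(out, cap, "%s_%s", module, ident);
}

/* Returns 1 if two (module, identifier) pairs mangle to the same C symbol,
 * which is the situation the preprocessor would have to warn about. */
int collides(const char *mod_a, const char *id_a,
             const char *mod_b, const char *id_b)
{
    char a[128], b[128];

    mangle(a, sizeof a, mod_a, id_a);
    mangle(b, sizeof b, mod_b, id_b);
    return strcmp(a, b) == 0;
}
```

With this rule, `collides("foo", "bar_baz", "foo_bar", "baz")` is true because both pairs become `foo_bar_baz`. A separator that cannot appear in source identifiers would avoid the problem, but that is a design choice the comment leaves open.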

Rust-style syntax-case macros can be implemented as a preprocessor.

Go-style defer statements could be implemented in a preprocessor, avoiding
the somewhat verbose goto-style error handling.
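
For reference, this is roughly the goto-style cleanup that a defer preprocessor would have to generate. It is a sketch only: `sum_two_buffers` is an invented example, and no particular defer syntax is assumed.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 0 on success, -1 on allocation failure. Each resource acquired so
 * far is released by jumping to the matching label, in reverse order. */
int sum_two_buffers(size_t n, long *result)
{
    int rc = -1;
    int *a, *b;
    size_t i;

    assert(result != NULL);

    a = malloc(n * sizeof *a);
    if (a == NULL)
        goto out;

    b = malloc(n * sizeof *b);
    if (b == NULL)
        goto free_a;            /* a "defer free(a)" would replace this label */

    *result = 0;
    for (i = 0; i < n; i++) {
        a[i] = (int)i;
        b[i] = 2 * (int)i;
        *result += a[i] + b[i];
    }
    rc = 0;

    free(b);
free_a:
    free(a);
out:
    return rc;
}
```

The cleanup code sits far from the allocation it belongs to, which is the verbosity a defer statement is meant to remove.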

As I said, this approach requires a lot of care to avoid adding a huge amount
of complexity to the language.

------
xamuel
Thoughts as I read:

1\. "uninitialized var usage is error": unfortunately impossible without at
least one of the following compromises: Automatically initialize variables
(wastes CPU); False alarms (see Java); Built-in formal proof system; or,
Require compilers to solve the halting problem.

2\. Removed keyword "static": kills one of my favorite tricks, "self-init'ing
functions".

3\. New keyword "as": A good invention in Pythonland. Good call to bring this
in.

4\. New keyword "nil": Redundant with NULL?

5\. Example - Base Types: Uses uint8 in place of char. This obscures intent
and makes code less readable. Compare: int library_fnc(char asterisk errmsg)
versus int library_fnc(uint8 asterisk errmsg). (HN wants to turn my asterisks
into italics...) In the former it's clear errmsg is a string, in the latter
it's not clear (it could be a pointer to a flag).

6\. Example - function types. Doesn't one usually typedef the function
pointer, rather than the function itself? So making that require two lines is
annoying. Aside from that, the author is right that C has confusing function
pointer typedef syntax.

7\. Multi-part array initialization: Encourages unmaintainable code. Depending
on what's in those "..."'s, might require compiler to solve halting problem?

8\. Multi-pass parsing: Trades maintainability for instant gratification.

9\. Symbol accessibility: The author makes "public" (and implicit "private")
modify entire structs rather than individual fields...

10\. Multi-file module: May lead to unmaintainable code

11\. I'm worried about the language arbitrarily defining things like "the
results of building are stored in the 'output' directory". OTOH the recipe.txt
idea could help standardize what amounts to a lot of ad hoc Makefile
programming.

12\. Build process difference: Theoretically could speed up compilation. I'm
worried for social reasons. In module-based languages, we tend to fall into
module hell: one symptom being the infamous 20-page stacktrace (see: Java,
Clojure, etc.) The nature of C's #include incentivizes shallow dependency
trees (a very good thing).

13\. "Language scope": trades portability for convenience

14\. Tooling: This shouldn't be part of the language, it should be separate.
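
Points 2 and 6 refer to C idioms that can be sketched as follows. The first function is one possible reading of "self-init'ing functions" (the comment does not spell the trick out); the typedefs show the two forms contrasted in point 6. All names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Point 2: function-local statics keep their values between calls, so the
 * table is filled once, on first use, without a separate init function. */
unsigned square_of(unsigned n)
{
    static unsigned table[16];
    static int ready = 0;
    unsigned i;

    assert(n < 16);
    if (!ready) {
        for (i = 0; i < 16; i++)
            table[i] = i * i;
        ready = 1;
    }
    return table[n];
}

/* Point 6: typedef of the function type versus the pointer type. */
typedef int binop_fn(int, int);      /* names the function type itself */
typedef int (*binop_ptr)(int, int);  /* names a pointer to such a function */

int add_ints(int a, int b)
{
    return a + b;
}

int apply(binop_ptr op, int a, int b)
{
    assert(op != NULL);
    return op(a, b);
}
```

With the function-type typedef, a pointer is declared as `binop_fn *f = add_ints;`, while `binop_ptr` hides the asterisk inside the type name.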

~~~
jeffreyrogers
I agree with a lot of your thoughts. Here are some areas where I disagree:

> 1\. "uninitialized var usage is error": unfortunately impossible without at
> least one of the following compromises: Automatically initialize variables
> (wastes CPU); False alarms (see Java); Built-in formal proof system; or,
> Require compilers to solve the halting problem.

I don't think this is true. It should be pretty easy to detect if a variable
is initialized or not. I can potentially see how a false alarm would arise,
but I don't think that matters in practice. (All the situations I'm imagining
involve writing bad code)
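
A small illustration of the kind of false alarm under discussion, as a sketch: every path assigns `label`, but a check that does not relate the two conditions to each other cannot prove it. The function name is hypothetical.

```c
#include <assert.h>

/* Returns 1 for positive n, -1 otherwise. */
int classify(int n)
{
    int label;                  /* deliberately not initialized here */

    if (n > 0)
        label = 1;
    if (n <= 0)
        label = -1;

    /* Exactly one branch ran, but a checker that treats the two ifs as
     * independent sees a path on which neither assignment happens. */
    assert(label == 1 || label == -1);
    return label;
}
```

Rewriting the second `if` as an `else` makes the assignment provable, which is the sense in which such warnings tend to point at code that could be written more clearly.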

> 10\. Multi-file module: May lead to unmaintainable code

Go does this already and it is fine.

> 12\. Build process difference: Theoretically could speed up compilation. I'm
> worried for social reasons. In module-based languages, we tend to fall into
> module hell: one symptom being the infamous 20-page stacktrace (see: Java,
> Clojure, etc.) The nature of C's #include incentivizes shallow dependency
> trees (a very good thing).

I can see why this is a concern, but I think it is more a problem with JVM
languages because of the type of programming Java encourages.

> 14\. Tooling: This shouldn't be part of the language, it should be separate.

I thought this way too. Then I used Go and realized the huge benefit tooling
integrated into the language provides. (Go has other problems, but tooling is
not one of them).

~~~
eli_gottlieb
> I don't think this is true. It should be pretty easy to detect if a variable
> is initialized or not. I can potentially see how a false alarm would arise,
> but I don't think that matters in practice. (All the situations I'm imagining
> involve writing bad code)

It's easy to trace the code-paths between a variable's declaration and its
usage, _as long as those don't involve procedure calls_. Then that problem
becomes "static-analysis complete".
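
A sketch of the procedure-call case, with made-up function names: whether `value` is initialized at the point of use depends on what the callee does with the pointer it receives.

```c
#include <assert.h>
#include <stddef.h>

/* Writes *out only when it returns 1 (c is a decimal digit). */
int parse_digit(char c, int *out)
{
    assert(out != NULL);
    if (c < '0' || c > '9')
        return 0;
    *out = c - '0';
    return 1;
}

int digit_or_minus_one(char c)
{
    int value;                      /* not initialized in this function */

    if (!parse_digit(c, &value))    /* initialization happens in the callee */
        return -1;
    return value;                   /* safe only given parse_digit's contract */
}
```

To accept `digit_or_minus_one` without a warning, the compiler has to either look inside `parse_digit` or trust an annotation describing when it writes through `out`.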

------
bluejekyll
Is there an advantage to this over, say, a more modern and safe language like
Rust? It seems to be just reducing the complexity of the language, but doesn't
look like it will reduce memory related bugs.

~~~
rwmj
Since you have to rewrite everything, you might as well switch to another
language (we switched to OCaml). If there was an incremental path or a safe
subset of C or something like that, that would be more interesting.

~~~
qznc
You can use D, which shares a lot of syntax with C, although you cannot
directly reuse C code because there is no preprocessor in D. Some people use D
as a "C development compiler".

The incremental path is the official C standards. C will probably gain modules
for example.

~~~
rwmj
In the project I'm working on right now, there are 277,617 lines of C (using
gcc extensions). The incremental path either compiles all of that directly, or
we rewrite it. There's hardly any middle ground, and actually rewriting it
isn't a realistic option either.

(For comparison the same project has 63,972 lines of OCaml and 31,605 lines of
Perl)

------
jeffreyrogers
This is cool. I've often toyed with a similar idea of creating a language that
improves/fixes the things C messed up. If you aren't worried about safety
(memory bugs can largely be avoided by changing how you do memory allocation,
i.e. switch from individual mallocing to region based memory management) then
C is actually a pretty nice language since it is simple enough to hold the
entire language in your head. Plus it's nice to know how things are actually
laid out in memory. The problems C2 solves are really the main things that
frustrate me about C: header files, lack of a build system, no modules,
spiraling type signatures.
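
For readers unfamiliar with region-based memory management, a minimal arena sketch looks something like this. The names and the fixed 16-byte rounding are assumptions for illustration, not taken from any particular library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ARENA_ALIGN 16u   /* assumed rounding for every allocation offset */

typedef struct {
    unsigned char *base;  /* one big block obtained from malloc */
    size_t cap;           /* total size of the block */
    size_t used;          /* bytes handed out so far */
} Arena;

/* Returns 0 on success, -1 if the backing block could not be allocated. */
int arena_init(Arena *a, size_t cap)
{
    assert(a != NULL);
    a->base = malloc(cap);
    a->cap = cap;
    a->used = 0;
    return a->base != NULL ? 0 : -1;
}

/* Bump-pointer allocation: no per-object free, NULL when the region is full. */
void *arena_alloc(Arena *a, size_t size)
{
    size_t start;

    assert(a != NULL && a->base != NULL);
    start = (a->used + (ARENA_ALIGN - 1)) & ~(size_t)(ARENA_ALIGN - 1);
    if (start > a->cap || size > a->cap - start)
        return NULL;
    a->used = start + size;
    return a->base + start;
}

/* Frees every object in the region at once. */
void arena_release(Arena *a)
{
    assert(a != NULL);
    free(a->base);
    a->base = NULL;
    a->cap = 0;
    a->used = 0;
}
```

Objects allocated from the arena share one lifetime and are freed together by `arena_release`, so there is no individual `free` to forget or to call twice.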

------
yoklov
Looks extremely interesting. I'm skeptical of some of its claims (faster
compilation when incremental compilation is removed sounds unlikely, for
example), but nothing that worrisome.

Anybody have any experience? Is it still basically a toy language?

~~~
aidenn0
The main site makes it look like the language is about a year old.

I have no doubt that they can make the _parsing_ stage faster, as they won't
be parsing the same headers over and over again. This is often a bottleneck in
C++, but much less often in C. (I've seen 10+MB C++ files after
preprocessing).

~~~
buserror
If people used precompiled headers (which have been available for about 20
years) that problem wouldn't be there... However I'm not sure that the parsing
is such a bottleneck these days; I think all the complex constructs are a lot
more of a time sink (templates etc). That and linking! Takes an hour to link
the webkit library with the regular linker :-)

~~~
aidenn0
It's been a while since I tried precompiled headers, but the last time I tried
it, the experience was unpleasant and error-prone.

------
felixangell
Very similar to Ark: www.github.com/ark-lang/ark. Except Ark has no GC, has
tagged enums, ownership is enforced, and a few other smaller differences.

