I think it is worth noting that C is special in that, compared to other programming languages, it is semantically low-level, pretty much on par with assembly, which is why it offers less opportunity for automatic optimization than most other languages do.
Therefore, just like with assembly, there is nothing too "unfortunate" about the language's design (including its "treatment of arrays"), and the fact that C has been, and still remains, highly popular is just the result of healthy competition, IMO.
There are a few little things that I wish were different (for example, I see the arrow symbol '->' as noisy and unnecessary - the simple period '.' would work with pointers just as well; also, the "semicolon cancer"...), but the language seems to be pretty usable the way it is.
"On the other hand, C's treatment of arrays in general (not just strings) has unfortunate implications both for optimization and for future extensions. The prevalence of pointers in C programs, whether those declared explicitly or arising from arrays, means that optimizers must be cautious, and must use careful dataflow techniques to achieve good results. Sophisticated compilers can understand what most pointers can possibly change, but some important usages remain difficult to analyze. For example, functions with pointer arguments derived from arrays are hard to compile into efficient code on vector machines, because it is seldom possible to determine that one argument pointer does not overlap data also referred to by another argument, or accessible externally. More fundamentally, the definition of C so specifically describes the semantics of arrays that changes or extensions treating arrays as more primitive objects, and permitting operations on them as wholes, become hard to fit into the existing language."
Ritchie notes the difficulty with optimization and aliasing. As far as I know, the only portable C89/99 convention for this is to call the appropriate library function, memcpy() or memmove(), when a block copy is what you want.
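A minimal sketch of that convention, assuming two hypothetical helpers (`copy_block` and `shift_right` are illustrative names, not from any library): memcpy() when the caller can promise the blocks are distinct, memmove() when they may overlap.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the caller promises dst and src do not overlap,
   so memcpy() is allowed (overlap would be undefined behavior). */
void copy_block(char *dst, const char *src, size_t n)
{
    memcpy(dst, src, n);
}

/* Hypothetical helper: shift the first n bytes of buf right by k places.
   Source and destination overlap inside buf, so memmove() is required. */
void shift_right(char *buf, size_t n, size_t k)
{
    if (k >= n)
        return;               /* nothing would remain inside the buffer */
    memmove(buf + k, buf, n - k);
}
```

The choice is made by the programmer at each call site; nothing in the types of `dst` and `src` tells the compiler which of the two is safe.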
I love C and programming is mostly a hobby for me. But my language experience does not really extend beyond C and C++, other than reading about D, Lisp, Rust, etc. I also think the STL is the beautiful thing C++ gave us, and having it as a library is better than building these structures into the language. The STL relies heavily on iterators of course.
My question to HN is: do any languages that really emphasize pointers and iterators over arrays and indices have a non-cumbersome way of telling the compiler when no aliasing is expected? As a hobbyist, I am more interested in something like slices from D or parameter type restrictions than trivially obvious global guarantees like "if you mutably borrow it then there is no aliasing". Do D programmers find the slices helpful and easy? Is the syntax and/or semantics such that the programmer can provide the compiler with aliasing info? Are there a small number of type concepts for this, or does it explode with the number of different data structures?
I know there is the restrict keyword, but I have always been too lazy to use it in C, and even in C++ it seems (quadratically?) unlikely that you would be energetic enough to properly declare which pairs of entities cannot alias.
Wow, Ada syntax is not what I'm used to! This is interesting; it seems like a very simple solution to make the problem less widespread.
It is still on you, though, to use the correct version of the equivalents of memcpy() and memmove()?
It has all the modern goodies in it: OOP, interfaces, and contracts, alongside type safety.
Then if you really care about high integrity systems, there is its brother SPARK, which is a hardened version of Ada, the latest revision being from 2014.
> My question to HN is: do any languages that really emphasize pointers and iterators over arrays and indices have a non-cumbersome way of telling the compiler when no aliasing is expected?
Rust has this concept baked into the language: any mutable reference is statically guaranteed to not alias anything else that is accessible. Non-mutable references can alias each other, but I don't think there are any optimizations that could be inhibited by this.
Yeah, this is the gist of my question. Basically, if programming language inc(C) has an overloaded function for both memcpy(restrict...) and memmove(alias...), how does the type information enter the system in a way that allows the compiler to choose the correct procedure?
I'm interested in any language innovations that have a smooth, suave feature for this. Rust changes the default, but in some sense it's not an absolute improvement; where C bluntly assumes memmove(), Rust bluntly applies memcpy().
C99 introduced the restrict keyword by which the programmer promises that the pointer doesn't alias anything (or else the behavior is undefined).
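For illustration, a small sketch of what that promise looks like in a declaration. The function `add_arrays` is a made-up example, not something from the thread or from a library.

```c
#include <stddef.h>

/* Hypothetical example: restrict promises that the elements written
   through out are not also accessed through a or b during the call. */
void add_arrays(size_t n, float *restrict out,
                const float *restrict a, const float *restrict b)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}
```

The compiler does not check the promise. A call such as `add_arrays(n, x, x, y)` compiles without complaint but breaks it, and the behavior is then undefined.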
memmove versus memcpy only solves the problem for block moves, not for other array operations. Think about a Fast Fourier Transform or whatever where the compiler has to suspect that the inputs overlap in dumb ways.
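A toy version of the problem, much simpler than an FFT and with a made-up function name (`scale_by_first`), shows how overlap changes the result of an ordinary loop:

```c
#include <stddef.h>

/* Hypothetical example: multiply every element of src by src[0].
   Without restrict the compiler must assume dst may overlap src, so it
   has to reload src[0] after each store instead of keeping it in a register. */
void scale_by_first(float *dst, const float *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * src[0];
}
```

With distinct arrays, an input of {2, 3, 4} gives {4, 6, 8}. Called in place (`scale_by_first(x, x, 3)`), the first store changes `src[0]` and the result is {4, 12, 16}. Both calls are well-defined here, so the compiler has to preserve the second behavior, and that is exactly what stops it from hoisting the load or vectorizing freely.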
But, if I understand correctly, what HP and SGI did was basically to let Stepanov work on STL while he worked for them, and give some semi-official sanction for the STL releases. They didn't see the STL, decide that they needed to work on it, and set up an independent team to do so.
And, if I understand correctly, Stroustrup saw the STL and said (paraphrased) "Yeah, that's going in."
When Java came out there were a few commercial offerings that were kind of "STL for Java", but with 1.2 Java got its own Collections API and the interest faded away.
Thanks dozzie and pjmlp, I should have credited Alexander Stepanov. He is a great programmer and educator who has impacted my CS education.
What I meant was that C++ was the first widely adopted language that had memory access and sufficient generic features for the STL to be written and easily used.
"Although the first edition of K&R described most of the rules that brought C's type structure to its present form, many programs written in the older, more relaxed style persisted, and so did compilers that tolerated it. To encourage people to pay more attention to the official language rules, to detect legal but suspicious constructions, and to help find interface mismatches undetectable with simple mechanisms for separate compilation, Steve Johnson adapted his pcc compiler to produce lint [Johnson 79b], which scanned a set of files and remarked on dubious constructions."
Yet it took all these years until clang for anyone to start taking static analysis seriously in C, and even now many still ignore it.
> Yet it took all these years until clang for anyone to start taking static analysis seriously in C
In some circles maybe, but some industries have decades of experience applying static analysers to their C code, and companies have been selling static analysers throughout.
Some embedded systems places do this, even ones that are not safety-critical. It's more common as the level of concern rises. Medical instruments, for example, aren't MISRA, but at least some of them run static analyzers on C/C++ code. (All should...)
No, indeed! This obsession with "undefined behavior" and the desire to eliminate it from the language spec is something recent; when I learned C, we all just understood such gaps in the spec to be places where the compiler writer would do whatever was reasonable for the target platform.
> I always assumed C came from a strong typing background
C is notoriously weakly typed. There are lots of implicit conversions that will be applied automatically. And (void *) is your door to freedom from types (this can be a good thing in some cases, of course).
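Two small illustrations of what that means in practice. The function names are made up for this sketch, and the first result assumes the usual case where int and unsigned int have the same width.

```c
/* Hypothetical example: i is implicitly converted to unsigned int before
   the comparison, so (-1, 1u) compares a huge value against 1 and
   the function returns 0. */
int mixed_less_than(int i, unsigned int u)
{
    return i < u;
}

/* Hypothetical example: object pointers convert to void * and back
   without any cast, so the type is erased and restored silently. */
int pointer_roundtrip_ok(void)
{
    int value = 42;
    void *erased = &value;   /* int * -> void *: implicit */
    int *back = erased;      /* void * -> int *: implicit in C (not in C++) */
    return *back == 42;
}
```

Many compilers can warn about the signed/unsigned comparison if asked, but the language itself accepts both functions as written.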
C was designed to be more portable than assembler. In fact, that was one of its raisons d'être. It is not hard to imagine that other programming languages (COBOL, Ada, Common Lisp, Perl) could be more portable than C, but still.