C’s Biggest Mistake (2009) (digitalmars.com)
232 points by todsacerdoti 15 days ago | 374 comments



Author here. I'll be blunt and repeat a prediction I made 3 years ago or so:

C is finished if it doesn't address the buffer overflow problem, and this proposal is a simple, easy, backwards compatible way to do it. It is simply too expensive to deal with buffer overflow bugs anymore.

This one addition will revolutionize C programming like adding function prototypes did.
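
Roughly, the shape of it (illustrative declarations; see the article for the full proposal):

    // Today: the parameter decays to a bare pointer and its length is lost,
    // so the length has to ride along as a separate, unchecked argument.
    void foo(char *a, size_t len);

    // Proposed: the array parameter keeps its dimension, so the compiler
    // can bounds-check indexing and callees can recover the length.
    void foo2(char a[..]);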


"C is finished if it doesn't address the buffer overflow problem"

You should keep making this prediction ... one day you might be right! :)


I suspect C has been steadily losing ground since I made it.


It's been 11 years, and C is running on more hardware than it ever has before, viz. every Android device.

By what measure would C be losing ground?


Is the number of devices that C is running on a good metric? I think the number of developers or number of projects using C makes more sense to track. And then get the market share. If there are 10x more devices today than 10 years ago then I'd sure hope C was running on more devices but anything less than 10x means C is losing ground.


The pyrrhic victory: other than the Linux kernel, everything else is a mix of C++ and Java, with a native compiler written in C++.

And the history of the Linux kernel and Android might come to an end if Zircon ever replaces it; then there will be no more C on Android.


[[citation needed]]


Citations for what?

Android and Fuchsia source code are publicly available, as are their code reviews and the ongoing work to port ART to Fuchsia.

Anyone skilful enough to use C without creating CVEs of their own is surely able to find that information.


> By what measure would C be losing ground?

Seeing Rust or something similar being used instead of C would be a good metric, and/or its adoption in major OS development.


Linux is currently investigating using Rust in drivers (The main issue being lack of compiler support for the more isoteric architectures).


Thank you for the new knowledge that "eso" is pronounced similar to "iso" in some dialects of English, I didn't know that.

However, the word "isoteric" is more correctly spelled (in non-phonetic spelling) as esoteric. The prefix "eso-" means "inside" in Greek, as in "esothermic", or "esophagus". The prefix "iso-" means "equal", as in "isomorphism", "isosceles", "isometric", etc.


Maybe dial back the sarcasm a notch or two?


Many apologies - I was not being sarcastic and I'm sorry that this is how my comment came across. As chongli says I'm a native speaker of Greek and I really didn't know how "eso" is pronounced by native English speakers. I've lived for 15 years in the UK and I'm still surprised to hear how people pronounce the more obscure words in their language (some of which come from Greek).


Oops, sorry, I apologise for the mistake. I have seen too much bad behaviour on the 'net, so naturally I assumed the worst. It's a valuable lesson at the modest cost of a few karma points. (I guess I violated HN guidelines too, there. Good thing I don't have the power to downvote yet. I might have done so, and never discovered my mistake.)


Hey, it's OK, no need to apologise. Sorry I caused you to be downvoted.

You didn't cause the downvote. Really. But in any case, there are more important things in the world than HN karma. And keep up your Greek lessons. That's one language I'd love to learn, if only I had the time. But I understand it is fiendishly difficult for non-native speakers. (Source: Greek to Me: Adventures of the Comma Queen by Mary Norris.)

I’m pretty sure they’re a native speaker of Greek, given their name on GitHub.


He wasn’t being sarcastic.


He, who? :)


Like C++ on Windows, Arduino, ARM mbed, macOS,...


The same is likely true of Fortran, since Fortran code is shipped around with several Python data science libraries and included in R. Does that mean Fortran is a thriving language, or does it just mean Fortran was used a long time ago to write some important libraries that are now hard to get rid of?


Except Fortran 2018 is quite modern: it supports modules, generics, and even OOP, and has first-class support on CUDA alongside C++, whereas C18 has hardly changed since C89 besides some cosmetic stuff, and it is as secure as when it was used to rewrite UNIX in the early 70s.


It could be argued, though, that less usage means fewer stakeholders to convince of the need for specific changes to the language, which helps the language evolve faster. (I have only cursory knowledge of what's happening in C and none about what's happening in Pascal nowadays; I'm just pointing out that being a smaller community might ironically help the language).


The relevant metric would be what fraction of hardware it's running on.


Literally all of it.


By number of new projects using it (without counting legacy code)

Nobody would do a project in C today, with Go/Rust/C++ etc. available, unless it's for a very specific situation


Careful with generalizations like that. You're forgetting the most important reason anyone uses a language:

It's the one they know.

A C programmer isn't automatically going to switch to Rust for new projects that they would use C for, unless their goal is to use Rust.


Lol C programmer since 1994 here... just started a Rust project with the specific goal of learning Rust. The project itself is just to scratch my own itch. I’d honestly probably write it in Python if I didn’t want to see what Rust was all about :)

C is also used a lot in embedded environments where the hardware won't support languages with a larger footprint. Quite common in my line of work.

I started a project this year, and I'm writing it in C.


C doesn't "run on hardware" .. unless you're talking about interpreted C. Of course compiled machine code is running on more hardware, but that's just a truism.

The question is: are people using C to program this hardware more, or are people gravitating towards safer compiled languages (Rust?)? That's a valid question, even if the answer is "no, C's usage is only increasing."


You’re being pedantic to the point of being actively misleading.


"you suspect" is pretty vacuous.

More usefully, if you want people to use D instead, what is stopping them, and what reasons do they give? How can these be mitigated? Because while I like C, I sure would like something better.


C has been "losing ground" not because of random pet peeves of those who never wrote a line of code in C, but because since C's last standard update there have been other programming languages that offer developers something of value, so that the trade-off between using C or an alternative starts to make technical sense.

It also helps that C's standardization proceeds in ways that feel somewhat between sabotage and utter neglect.

Meanwhile, C is still the absolute best binary interop language devised by mankind.


> C has been "losing ground" not because of random pet peeves of those who never wrote a line of code in C

This is not a random pet peeve, and WalterBright is as far as you can get from someone "who never wrote a line of code in C". This is the cause of numerous security bugs in the past and currently, and the reason most C material written in the 70s/80s is unsafe to be used today (mostly due to usage of strlen/etc vs strnlen/etc).
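
A tiny illustration of the strlen/strnlen point (strnlen is POSIX, and the buffer here is deliberately not NUL-terminated):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char buf[4] = { 'a', 'b', 'c', 'd' };  /* no terminating NUL */
        /* strlen(buf) would read past the end of buf: undefined behavior. */
        size_t n = strnlen(buf, sizeof buf);   /* stops at the buffer bound */
        printf("%zu\n", n);                    /* prints 4 */
        return 0;
    }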


Frankly I never would have made the proposal if I didn't love C. I've made proposals to add D features to C++, too.


A question: since your company also makes a C/C++ compiler (and the repo has very :), have you considered adding this addition to it, as an experimental feature, perhaps to demonstrate its usefulness to other developers and standard bodies? (Although, now that I think of it, D itself might serve the same purpose)


I don't see much point in it. I've proposed this change to C in front of several audiences, and it never received any traction. If you want to experiment with it, you can use DasBetterC, i.e. running the D compiler with the `-betterC` switch, which enables programs to be built requiring only the C Standard Library.

Fair warning - once you get accustomed to DasBetterC, you're not likely to want to go back to C :-)


> If you want to experiment with it, you can use DasBetterC, i.e. running the D compiler with the `-betterC`

I've been meaning to experiment with DasBetterC for a while, and I have a C project I've been wanting to migrate to something with proper strings (it's a converter for some binary file formats, but now I want it to import some obscure text formats too). Maybe that's the push I needed :)

After 20 minutes and about 250 out of 2098 lines converted, I must say the error messages are very good and give very nice hints about what to change; I prefer them to Rust's verbose messages.


Great!

DasBetterC's trial-by-fire was when I used it to convert DMD's backend from C to D.

I'm sure you already know this, but the trick to translating is to resist the urge to refactor and fix bugs while you're at it. Convert files one at a time, and after each one, run the test suite.

Only after it's all converted and passing the test suite can refactoring and bug fixing be considered.


I don't get why it hasn't gotten traction. When I read it, it was immediately obvious to me that this would be extremely helpful. I want it yesterday, and so should everyone.


Pro tip: Google the name of the person before responding to them, it can help avoid the taste of foot in your mouth which you are currently experiencing.


> Google the name of the person before responding to them

Is it a rule at HN that you can't take someone else's name? Otherwise, there's no guarantee that you're talking to the "Real" Walter Bright...

... or that you're talking to that Walter Bright, come to think of it.


There can be only one.


But The One doesn't get the username.


I’m new here, so this seems like a valid criticism to me — but judging by the number of downvotes, it may not be. Can someone explain why this comment is incorrect?


Perhaps because so many of us know Walter from his work and his history here on HN? Sometimes you have to just trust that someone is who we all say they are.


Cool, thanks for the explanation.


It's a fair argument, but I also checked: the karma is high, so it looks like a legitimate account in this case.

[flagged]


There's "arguments of authority" and then there's "accusing Walter Bright of having never written a line of code before".


AKA: Conflating authority with expertise


What argument from authority is being made by anyone?

The GP decided, out of the blue, to accuse the author of never having written a line of C code in his life. That's kind of inappropriate in any context, IMO, but just downright laughable when the author is well-known for singlehandedly writing several compilers and a whole new language.


Anyone that has to rely on their name for an argument isn't worth listening to.


You misunderstand what the conversation is that’s occurring. The parent implied the person had never written C.


He never said it explicitly; he was just making a general statement. Not that it matters whether he did or didn't: there are a lot of things wrong with C, and it will most likely eventually disappear, but not for the reasons outlined in this article. That's what he was saying.


Well, he dismissed Bright’s argument as a random pet peeve from people who haven’t written a line of code in C before, so yes, I do think he said it explicitly.


> Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith.

This is one of HN's comment guidelines. If you're not sure that someone is who you think they are, you can just ask, e.g.: "Hey, are you Walter Bright who did X and Y?"


? I’m confused how your comment relates to mine. Did you post on the wrong thread?


Someone once said COBOL would disappear.

C will still be used long after you and I and everyone here have returned to dust.


Try becoming a COBOL developer and see how that works for you. Likening C to COBOL isn't doing it any favors.


What's the implication here? I only know one COBOL developer but they seem to be doing quite well for themselves, making over $400k a year for something like 15 hours of work a week.


COBOL developers commanding a high salary is directly related to it not being a thriving language.


> C will still be used long after you and I and everyone here have returned to dust.

There are also people still riding horses. That does not make it relevant in any way.


> Meanwhile, C is still the absolute best binary interop language devised by mankind.

You're mistaking the “C” ABI for the C language. The so-called C ABI should actually be called the UNIX-derived ABI, as (i) C doesn't define an ABI and (ii) C can perfectly well produce binaries using another ABI (such as e.g. the “Pascal” one, common on the DOS platform).


Maybe people are voting this down because they think it's directed at Walter Bright in particular, but I think there is actually some truth in the harsh comment.

Nothing about Walter Bright in this statement, but some of the harshest criticisms from others I have seen of C are not from expert practitioners in C.

People who are experts and also critics seem to have a more practical, realistic, nuanced critique: one that understands history and the challenges to adoption, and admits that the long history and difficulty of replacing C isn't exactly for no reason.


That's the way I interpreted it, because it's true. A lot of the criticisms are misdirected ones by people who haven't used C except when forced to for a few assignments in school, C++ jockeys who think C is the 30-years-out-of-date version of C that's supported by C++, and people who haven't used it at all for anything real.

I also agree that what the standard committee has been doing for the last 20 years amounts to willful sabotage.


So what are the improvements between C89 and C18 in regard to UB and security, for any ISO C compliant compiler?


Between C89 and C18 is close to 30 years.

What about between C99 and C18? Is there anything you can think of? I think the _s() functions, advertised as security features, are a weak effort. Anything else come to mind?


Nothing really; if anything, VLAs have proven such a mistake that Google led an effort to remove all instances of VLA use from the Linux kernel.

Also the amount of UB descriptions just increased and are now well over 200.

Annex K was badly managed, a weak effort as you say, given that pointer and size were still handled separately, and in the end instead of coming up with a better design with struct based handles, like sds, everything was dropped.

ISO C drafts are freely available; I recommend that everyone who thinks they know C from some book, or has only read the K&R book, actually read them.


> some of the harshest criticisms from others I have seen of C are not from expert practitioners in C.

But were they expert practitioners of C in the past? My experience is that most of the harshest criticisms of C come from former C experts who moved on to other languages because it became clear to them that C would never be fixed - Walter Bright included.


I also have extensive (20 years) experience with the solution I proposed.


Yes I know, and for clarity I appreciate your work and insight, and frequently enjoy your comments here.

My point was that people were mistaking the comment for an attack on you, which I don't think was necessarily intended or needs to be without it being a valid point about a different set of critics.


Lol at “never wrote a line of code in C”. Surely you are not addressing the article’s author?


C code is being replaced by Rust fast. The only limit is how quickly programmers can become good at Rust. It's already happening.


One of the niceties of C is that I can get anything done pretty damn quickly, without anything getting in my way. The syntax is extremely simple, too, vs. Rust. Rust is close to Perl when it comes to syntax; full of symbols. Too implicit for me. I want to look at the code, and I want to understand what the heck is going on, even if it is written by someone else. I usually do, with C. Rust? Not so much, and believe me, I tried. I would not like to call myself an idiot, either. :)

A viable C replacement is Ada, although it is not for people who dislike the "code is documentation" bit.


> One of the niceties of C is that I can get anything done pretty damn quickly, without anything getting in my way.

not saying you're necessarily wrong (c is definitely simpler than rust), but I think most people would write a similar comment for whatever language they feel most comfortable with. I write c++ most days. even though it's the most verbose and complicated language I've ever used, I can still probably get stuff done way faster than in another language I happen to pick up.

if I'm just throwing something together really fast, I do mostly use the old school C functions though. scanf is way nicer than streams.


Agreed - in that I write it in Perl if I need it to just work right now. Perl is still, to me, the most useful programming language for completing a generic program in the smallest amount of time. (Part of that is due to CPAN, and part to the gazillion built-in features of Perl 5)


I agree. I’m an older programmer (almost 61) and I’ve been using Perl for a long time - maybe 20 or 25 years. I reach for it when a job feels like a bit too much for a bash (1) script.

I don’t go out of my way to teach or suggest Perl to my younger colleagues. I don’t know why that is. They don’t usually reach for Python, which I think would probably be their best choice.

Maybe I’m just a curmudgeon ... and by the way, can you please stay off my lawn?

:)


One way Rust is just a different tool for a different job is that it really doesn't optimize for knowing everything that's happening inside someone else's code. It's a great example of the difference between procedural and functional programming, where in Rust you mostly just care what your function args are and what it returns.

Nothing wrong with learning multiple languages, of course. C was my first professional language, and I spend most of my days in rust now. No shortage of rust devs who are big on C too.


Sure. I code in C, Go, OCaml, Ada/SPARK, Factor (Forth-like, Lisp-y), Common Lisp (rarely), and Erlang (moving to Elixir).

But I was responding to "C code is being replaced by Rust fast.".


You might like zig. It's still pre 1.0, but I feel like it really has that "get out of your way" feel of C with a ton of safety. If you write tests, you can get tested memory safety, too.


Zig has a LOT of good stuff going for it, but one of my pet peeves is how arsey the linter gets about formatting. No tabs (they might begrudgingly fix that one), no multiline comment/string support (and before anyone tries to correct me on the strings front, look at that syntax and tell me it isn't an intentional joke), and you must use these specific line endings (that one was actually fixed in master recently, iirc).

The syntax is also currently REALLY unstable. As in: The hello world has changed almost every major version. Hopefully that too will be squashed with 1.0

To be fair though, Zig is probably the least egregious and most flexible of the modern "C killers". I can see it's really trying to innovate in low-level programming. I really like its flexible malloc systems and support for dynamic linking at runtime. Its compile-time code execution is excellent too. The fact that they're actually trying to support obscure platforms like the Z80 is a good indicator that they're staying true to C's "code anywhere" mantra. That's why I'm mostly focusing on linting issues of all things.


Ah. If anything, the whole "rust as better than C for everything" thing is starting to hurt its reputation, regardless of veracity. People get so focused on its use in perf-critical applications they ignore its other strengths. eg I've never replaced C code with it, but we redid our whole PHP backend as a rust app, because it's great for rigidly defined business rules too.


Young programmers seem to prefer learning Rust to C. Generational replacement will take care of making Rust prevalent, no matter what existing programmers think.


Do they? Or are they just told that they should be using Rust? Even putting aside outdated curriculum, Rust is a fairly involved language to teach to a new programmer.


Indeed. What languages are most common in universities anyways? Haskell? Java? C++? OCaml? Which ones are the most common?

Maybe he meant outside of the education system. I think the reason for that would be hype and peer pressure, and the feeling of novelty, with a hint of FOMO. I do not see any languages being pushed/hyped as hard as Rust.


I investigated this in Florida. I checked Florida Tech, Embry-Riddle, and a half dozen state universities. One was Java, two or three were C++ (really C with cout and maybe vector), and all the rest were plain C.

That is a mighty good showing for C.


I can't imagine universities actually caring to teach new programmers Rust... it's an overly complex language that most professors themselves would steer far away from because they know there's more to programming than following trends.

(We learned C++ in university in New York which was basically C with occasional help from C++'s standard library).


Sadly the latter is often how C++ is taught even though it is not at all how modern C++ is used…


My much younger brother is currently enrolled at a major university in comp sci. His coursework is primarily in Java but with certain classes in C and other languages.


My guess is that those four see more use than Rust does, with Java and C++ forming the base of a typical undergraduate curriculum alongside Python and C and Haskell and OCaml showing up in classes where the concepts behind them are typically introduced. (FWIW: my college experience was C++, C, Scala for the required courses.)


I believe we will come back to C eventually. Kind of irrelevant but: I learnt C by writing mods for ioquake3 forks. :) Fun times. I was about 13 years old.


Lol.

For every young programmer learning Rust there are probably 10000 learning C.

C is still the only language you can count on anyone with a programming-related education to have knowledge of. (That doesn't translate into being able to program in C, but still.)


It isn't. You can tell how much a language is used by the inverse of the number of blogposts about it. People who have jobs don't have the time to write about how they would solve problems using that language, because they already are and have better things to do in their free time.


It is. Since 2012 C has dropped from about 4% of Github code to 3%. Meanwhile Rust has climbed from 0 to about 1%. That on its own doesn't prove that people are moving from C to Rust but it's not a risky guess. I did see some data on language transitions a while ago but can't find it now unfortunately (why is browser history search still so shit?).

https://madnight.github.io/githut/#/pull_requests/2020/2


>(why is browser history search still so shit?)

Because Google wants people to search using the Google search engine, and see ads while they're at it. That would happen less often if Chrome were properly capable of searching the history.

Opera did full text history search a decade ago, but that browser doesn't exist anymore.

Also, history is expected not to be tied to the device in the cloud era. What more history data could you scalably sync other than URLs and page titles?


Number of blogposts about Python is a counter-argument.


The number of people employed as python programmers is vastly smaller than the number of people who can write hello world in python, which is what the majority of blog-spam is about.


There's a lot of truth in it, but for a slightly different reason. Almost everyone I work with knows or writes C regularly. They're usually very senior people, and not once have I ever heard them talk about a blog, let alone write one. So there is this large group of C practitioners out there that simply don't know or care about all these other things happening around them. To some degree, it doesn't matter, since there are plenty of jobs doing this.


I love the move to d/rust/zig/nim/... but there are other issues too. Ecosystem of libraries, stabilisation of common patterns (futures and Tokio issues are still out there), platform compatibilities, industry support for moving away from known solutions, and many other issues. Even if we all suddenly knew Rust perfectly tomorrow, there are other issues in the way.


futures is definitely the big one for me. Getting all concurrency fully on async/awaits is amazing, e.g. actix-web awaiting an endpoint calling juniper for graphql, with async resolver methods making async calls against my DB, without needing to spawn a single thread, is lovely. Still doesn't work as smoothly as that, even though on paper it ought to. Getting close though.


> C code is being replaced by Rust fast.

One of the key areas C is used that Rust cannot be used easily is in limited embedded devices. That looks like it'll be the case for at least 10 years and probably much longer than that.


The embedded Rust ecosystem is actually pretty vibrant these days. The number and quality of #![no_std] crates is improving rapidly.

The main limitation is going to be if your MCU is supported by LLVM. If you're targeting ARM, RISC-V, MSP430 or Xtensa you might get further than you'd expect.

There's a hell of a lot of tooling built around C though. So no matter what happens, I don't think we'll ever be rid of it.


Many, if not most, small MCUs can also run C++ these days just as well as C. If you want to write simpler and more robust applications, it's easily achievable.


I used to write a whole bunch of firmware for microcontrollers, and frankly C was doing just fine for me in this area.


> C code is being replaced by Rust fast. The only limit is how quickly programmers can become good at Rust. It's already happening.

I think Rust has been very quickly fading into obscurity. What Rust has brought to the table was nearly the same as what was brought by 100+ other programming languages in attempts to "fix C."


Oh... can you show me those 100+ other languages that have opt-out memory safety, explicit lifetime annotation, a borrow checker and no runtime?


Cyclone, ATS, Checked C, Ada/SPARK.


Footnote: With Ada/SPARK being much more battle-tested and ATS being a much more flexible & complete solution. Though I wouldn't exactly recommend ATS in terms of learning curve.

That’s an extremely specific reading of the claim “fix C” that excludes almost everything but Rust.


I never said those are the only ways to fix C. Try reading the argument again:

> What Rust has brought to the table was nearly the same as what was brought by 100+ other programming languages in attempts to "fix C."

> Oh... can you show me those 100+ other languages that have opt-out memory safety, explicit lifetime annotation, a borrow checker and no runtime?

Maybe I can make it even clearer:

> What Rust has brought to the table was nearly the same as 100+ other languages

> Oh... can you show me those 100+ other languages that have (some unique Rust features)


It's simply impossible for C to be finished or to vanish. It is one of the most tested and rock-solid pillars of the programming world. Developers have already mastered how to handle the issues you mentioned in the article. They are not big enough to make anyone discard C.


I mean it in the sense of starting new development of a major new project with it. Of course, C will be around a very long time, like COBOL and FORTRAN.


How many new embedded projects are picking non-C languages currently? I'm no fan, but C is a long way from dead.


I would wager the percentage of embedded projects that are picking C has decreased in the past 10 years. I have no direct evidence of that, but I think it's a likely guess.


The first thing you do to see if a new processor has bugs is to port a forth interpreter for it.

The first thing you do to see if a new processor is ready for prime time is to port a c compiler for it.

The people who talk about rust replacing c are the type of people who think that GTK is a reasonable C project.


The proposal is just syntactic sugar for a size argument. It doesn't add or solve anything, really.


In my experience with language design, a little bit of syntactic sugar can have transformative results.

C's function prototypes, syntactic sugar added circa 1990, were transformative for C programming.


> In my experience with language design, a little bit of syntactic sugar can have transformative results.

I agree 100%. This also reminds me of this article:

https://nibblestew.blogspot.com/2020/03/its-not-what-program...

HN discussion: https://news.ycombinator.com/item?id=22696229


I totally agree that C shepherds you into pointers. I also think C shepherds you into writing everything from scratch. Most of all I think it's a self-perpetuating cycle of "there's no system for X (e.g. sized arrays, packaging system, classes) so everyone makes their own, and now all other code feels slightly incompatible with all other code."


C could have gained the safety of function prototypes without them. Note that the function bodies and call sites had the type information. It could have been propagated through the compiler and assembler to the linker, which would then check for compatibility.

In some ways it would have worked much better. Header files can easily be wrong. The object files being linked are what really matter.

So, suppose we decided to implement this today, on a GNU toolchain. At the call site, we'd determine the parameter types based on the conventional promotions. This info gets put into the assembly output with an assembler directive. The assembler sees that, then encodes it in an ELF section. It might get a new section called ".calltype" or it is an extension to the symbol table or it involves DWARF. Similar information is produced for the function body. The linker comes along, compares the two, and accepts or rejects as appropriate.
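
For a feel of what such a link-time check would reject, here's the classic mismatch that compiles and links silently today (illustrative code, not from any actual toolchain):

    /* a.c */
    double half(double x) { return x / 2; }

    /* b.c -- hand-written declaration instead of a shared header */
    int half(int x);

    int use(void)
    {
        return half(10);   /* links fine today; behavior is undefined */
    }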


This would require storing type information in a symbol, would it not? Either via mangling or some other method.


Yes. I proposed a method without mangling, but I suppose there isn't any reason why C couldn't use mangled names.

It also isn't a requirement that C++ use mangled names. Other ways of carrying the type information are possible. I like the idea of a reference to DWARF debug info, which C++ is already using to support stack unwinding for exceptions.


> It also isn't a requirement that C++ use mangled names.

Overloading requires the type information to be part of the symbol "name" (ie, whatever is used for symbol lookup and linking) whether that is a mangled string or a more complex data structure.


> In my experience with language design, a little bit of syntactic sugar can have transformative results.

Promises/async functions in JS and C# do absolutely nothing that you couldn't do without them. But they've had a structural effect on the average developer's ability to write scalable code.


I wonder if a better idea (in principle) would be to have some kind of hardware implementation, sort of like a finer-grained memory segmentation.


CHERI is an attempt. DARPA paid to have it for RISC-V and ARMv8. It was originally for MIPS.

https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/

It is fundamentally like the 80286 segments, but with all sorts of usability troubles solved. The 80286 segments were impractical because there were a small number available and because the OS couldn't safely hand over direct control. Every little segment adjustment required calling the OS.


x86 has a BOUND instruction which generates an exception if an index is out of bounds. It didn't make it into x64.


This is likely coming in ARM soon. (And I'm hopeful it's not even soon™, or "it came but nobody used it" as has happened on x86.)


Going forward Android will require hardware memory tagging extensions on ARM devices, as of Android 11.


You bring this up often and I ask you to modify your wording about every time I see it. Android will not require these extensions, no hardware ships with it yet. Android says they will support it.


And I give you the official wording of Google every time.

> Google is committed to supporting MTE throughout the Android software stack. We are working with select Arm System On Chip (SoC) partners to test MTE support and look forward to wider deployment of MTE in the Android software and hardware ecosystem. Based on the current data points, MTE provides tremendous benefits at acceptable performance costs. We are considering MTE as a possible foundational requirement for certain tiers of Android devices.

https://security.googleblog.com/2019/08/adopting-arm-memory-...

> Starting in Android 11, for 64-bit processes, all heap allocations have an implementation defined tag set in the top byte of the pointer on devices with kernel support for ARM Top-byte Ignore (TBI). Any application that modifies this tag is terminated when the tag is checked during deallocation. This is necessary for future hardware with ARM Memory Tagging Extension (MTE) support.

>....

> This will disable the Pointer Tagging feature for your application. Please note that this does not address the underlying code health problem. This escape hatch will disappear in future versions of Android, because issues of this nature will be incompatible with MTE

https://source.android.com/devices/tech/debug/tagged-pointer...

So unless you have other official feedback from Google management, I will keep repeating myself.


I am not sure if I have made myself clear, because I have no issues with the Google documents on this and I believe they are very clear: this feature is optional! Optional optional optional, only on hardware that supports it will Google implement these things because they literally cannot use it otherwise. Your wording has always implied that this is a requirement to run Android 11 and it is not, and that is what I am asking you to change. Like, what’s wrong with being accurate and saying “Google is implementing support for this in Android 11”? “This feature may be used to classify Android devices”?


Well, it could also solve the problem of sizeof(array) not working inside the function.

More specifically, at the moment, it evaluates to the size of the pointer itself, which is useless. On the other hand:

  static void
  foo(int a[..])
  {
    for (size_t i = 0; i < (sizeof a / sizeof(int)); i++)
    {
      // ...
    }
  }
... would be very useful, as it's the same syntax you can already use inside the function where the array is declared, which makes refactoring code into separate functions easier, as you don't have to replace instances of sizeof with your new size_t parameter name.

The only thing I'd like to see is compatibility with the static keyword; so that you can declare it as a sized-array but still indicate a compile-time minimum number of array elements. At the moment, in C99, this does not compile without serious diagnostics which would immediately highlight the problem:

  #include <stdio.h>

  static void
  foo(int a[static 4])
  {
    for (size_t i = 0; i < 4; i++)
      printf("%d\n", a[i]);
  }

  int
  main(void)
  {
    int a[] = { 1, 2, 3 };
    foo(a);     // Passing an array with 3 elements to a function that requires at least 4 elements
    foo(NULL);  // Passing no array to a function that requires an array with at least 4 elements
    return 0;
  }



  demo.c:14:3: warning: array argument is too small; contains 3 elements, callee requires at least 4 [-Warray-bounds]
  demo.c:15:3: warning: null passed to a callee that requires a non-null argument [-Wnonnull]


It is not just for the size argument. The array becomes an abstract data type whose bounds are consulted when the array is indexed. That's the key.

Yes, it's not a lot of effort to manually add a size_t argument. But it is far too tedious and error-prone to expect a programmer to add all the bounds checks. Being able to effortlessly tell the compiler "please check for me" is the huge win.

The second huge win is that the array is type-checked. So if you pass it to another function, the compiler enforces that it must again be passed with the size included. You don't get that by manually adding a size argument.
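
For instance (hypothetical helper names; exactly how the callee reads the length is left open here):

    /* Manual convention: the compiler cannot relate n to a, so a caller
       may pass the wrong size and nothing complains. */
    void zero_fill_manual(char *a, size_t n);

    /* Proposed: forwarding the array keeps its length attached, and the
       callee must also declare it as a checked array. */
    void zero_fill(char a[..]);

    void process(char a[..])
    {
        zero_fill(a);
    }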


It allows automatic bounds checking i.e. I don't need to point out how many bugs that could fix.

If you're worried about performance test it and turn it off.


Bounds checking is a solved problem in C. The challenge is proving that your program _cannot_ go out of bounds. That is very much not a solved problem in C.


I'm curious what you'd want the automatic bounds checking to do?

Just terminating the program won't be much better than OOB memory access in many cases.

Continuing but discarding OOB writes/use a dummy for OOB reads could lead to much worse behavior.

An exception (or setting errno since this is C) would need that exception to be handled somewhere in a sensible way in which case you could just as easily add manual bounds checking.


I may be wrong, but something that you and I recognize as syntactic sugar may not be recognized as such by other, less experienced programmers.

So those programmers might just use the sugared approach and avoid the problem of writing past the end of an array, without ever knowing how tedious and/or difficult debugging such problems can be. They might sort of never even realize that they dodged a bullet simply due to some sugar.

How do you see it?


Agreed. The proposal is also wasteful.

>extern void foo(size_t dim, char *a);

And the like assumes that I have the space to waste a native type on every array. So if I'm using a 10-element array, I need to provision a native 32- or 64-bit value for "10".

In embedded systems this wouldn't happen. At least not in mine; I'm running up on limits all over the place even being careful with bitfields and appropriately sized types.

He's right, of course, that foo(array[]) is converted to a pointer, but that's why I think you should always use the array as a pointer, so YOU know not to rely on its automatic protections.

I get the point, but I just don't see C making this change.


So... don't use array syntax in the function prototype and definition? The proposal doesn't PROHIBIT passing a pointer, it would just offer an option to pass a fat array.


I think you mean fat pointer. And yea, that's nice for people who don't care that their 4-bit array has 64 bits of native type reserved... I think other people would care.

So, I go back to the idea that it seems unlikely this would ever be an official C change.


I think you and I are talking past one another.

Currently, f(a[]) with declaration void f(int a[]) passes a pointer to the first element, with no additional overhead.

Under the proposal, f(a[]) with declaration void f(int a[]) passes a pointer to the first element, with no additional overhead.

Help me understand why "other people would care"? What is the negative impact on someone who would not use the a[..] functionality?


Pascal (the Delphi and FreePascal incarnations) does that just fine with strings and dynamic arrays, and in a way that is compatible with C. Just friggin' steal it and be done.


I hold out hope that people will discover, as I did, that the real problem with the C language is the C library and the diglossia. Maybe that's a different language[1], but it's one you get with just a bit of #define and discipline.

For example: If you had just put the length before the array buffer, you could've saved a stall in almost every use. That's a problem with out-of-order processing that's hard to fix. Maybe your compiler will get sufficiently smart, or maybe CPUs will collude with the memory controller (or something else amazing will happen), but those things are really hard. However we fixed it ourselves; we didn't need anyone to do it for us, because (due to laziness or luck) C gave us enough of the tools we needed to do what we needed to do.
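
A minimal sketch of that layout (sds-style; the names are made up): the length sits immediately before the bytes it describes, so fetching it doesn't touch a separate header somewhere else in memory.

    #include <stddef.h>
    #include <stdlib.h>
    #include <string.h>

    typedef struct {
        size_t len;
        char   data[];              /* flexible array member, C99 */
    } lpbuf;

    /* Returns a pointer to the bytes; the length lives just before them. */
    static char *lpbuf_new(const char *src, size_t len)
    {
        lpbuf *b = malloc(sizeof *b + len);
        if (!b)
            return NULL;
        b->len = len;
        memcpy(b->data, src, len);
        return b->data;
    }

    static size_t lpbuf_len(const char *p)
    {
        return ((const lpbuf *)(p - offsetof(lpbuf, data)))->len;
    }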

I think that's a bigger deal than buffer overflows, as unpopular an opinion as that is.

[1]: https://news.ycombinator.com/item?id=22010895


> C is finished if it doesn't address the buffer overflow problem

Assuming that it is so: when? It seems that C—despite all its shortcomings—remains a very popular language in some problem domains. For some platforms it seems like it's really the only performant HLL option.


The real troubles are undefined behavior and aliasing. Buffer overflows are just a well known gimmick of the language that is more or less controllable with some discipline. Aliasing is hell. You cannot even use a global variable safely!


Isn't much of the undefined behavior in C that people love to complain about intentionally left in the standard for the purpose of optimization? Similarly, bounds checks are necessary in insecure contexts (ie most places) but you probably don't want them slowing down (for example) an MD simulation.

Edit: But to be clear, C really ought to have first class arrays. If you truly don't want bounds checks in a specific scenario for some arcane reason, you could still explicitly pass a raw pointer and index on that. (The same as you would in any sane systems language.)


UB has nothing to do with optimization. It's about working around the differences between all the platforms that a C program might need to be compiled for. UB covers things like the layout of a signed integer (it might be two's complement, or it might not). It's about letting the platform or compiler dictate what the program does in the rare case where the program does something that might result in different behavior on different compilers and platforms.

Note that I'm using "platform" to refer to the CPU instruction set.


Signed integer overflow could have been marked as implementation-defined rather than undefined behavior. That would have meant that compiling a program with overflows on most systems would produce the same results, but compiling it for the occasional rare sign-and-magnitude machine would produce slightly different results. However, they didn't do this. Instead, they said that it's undefined behavior, which means that any program that overflows integers has no guarantees about its behavior at all - it could crash right away, generate the correct result 99 out of 100 times, or the compiler could outright reject the program.

A good example of this is calling functions with the wrong parameter types. UB in C, but practically allowed by every compiler. No machine would care if you do this... until WASM came along and suddenly every function call is checked at module instantiation time for exactly this behavior. This is because all WASM embedders are fundamentally optimizing compilers. And what is the mother of all optimizations? Inlining: the process of copypasting code from a function into wherever it is called. If a function is being called with the wrong arguments, how do you practically do that? You can't.

It is meaningless to talk about UB without also talking about optimizations. If you do not optimize code, then you do not have UB. You have behavior that is defined by something - if not the language spec, then the implementation of that spec, or a particular version of a compiler. There are plenty of systems with undocumented behavior that is nonetheless still defined, deterministic, and accessible. Saying that something is UB goes one step beyond that: it is saying that regardless of your mental model of the underlying machine, the language does not work that way, and the optimizer is free to delete or misinterpret any code that relies on UB.
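
A stock example of that interaction: because signed overflow is UB, the optimizer may assume it never happens.

    int will_not_overflow(int x)
    {
        /* At -O2, GCC and Clang typically fold this to "return 1;", since
           x + 1 > x can only be false if x + 1 overflows, which the
           compiler is allowed to assume never happens. */
        return x + 1 > x;
    }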


> UB has nothing to do with optimization. It's about working around the differences between all the platforms that a C program might need to be compiled for.

That's what it used to mean. But at some point compiler people decided that since UB means literally "anything can happen", they could make optimizers optimize the shit out of the code assuming that UB can't be there.

C code that used to work 20 years ago, because the UB in it resulted in some weird but non-catastrophic behavior, doesn't work at all when compiled with modern compilers.


> UB has nothing to do with optimization.

Other commenters already responded to this, but I thought I'd link an article I came across a while back that gives a concrete and easy to understand example of how UB can be leveraged for optimization by modern compilers. (https://devblogs.microsoft.com/oldnewthing/20140627-00/?p=63...)


> You cannot even use a global variable safely!

Sorry? I’m unsure what you mean here, because there are plenty of ways to use globals in ways I would call “safe”: no undefined behavior, correct output, …


In one C file you can declare a global:

    int x;
and in another:

    int* x;
and you'll be mixing pointers and ints and it won't be detected.


I was not talking about this, but about aliasing a variable in the same translation unit.

    int x = 7;
    void f() { /* do things using x */ }
    void insidious_function(int *p) { *p = 3; }
now, inside f you cannot be sure that x equals 7, even if you never write into it. You may call some functions, that in turn call the insidious function that receives the address of x as a parameter. There's no way to be sure that the value of x is not changed, just by looking at your code.

Isn't that what asan/ubsan is for?

Granted, it's not static analysis, but it should catch most aliasing related errors, no?


No. There has been some effort in that direction, somebody proposed a Clang "type sanitizer" patch, but it wasn't merged.


I'm fully in the camp of C plus powerful analysis tools, plus a high-level language (Python or Scheme).


Since ISO doesn't define what those powerful analysis tools are supposed to be, there are plenty of C compilers that will never get them.


only if your tests exercise that code path


You can't do compile time bounds checking, if that's what you're implying.


Most, but not all. They are excellent tools but not perfect by any means.


Would you care to define "finished" tightly enough that we can objectively evaluate whether or not it happens? And put a time frame on it?


Do you think there are good contenders to replace C? V seems like it could.


V has a bit of a poor track record with its claims.


Walter, I honestly just can't trust your judgement on the future of C or C++, because all I ever see is you pushing D anytime you comment.

This seems like just another gimmick to push D tbh.


Your choices are:

1. the judgement of people in the industry, who will always be biased

2. the judgement of people not in the industry, who don't know what they're talking about

As for me, I still sell a C and C++ compiler https://www.digitalmars.com/shop.html


Did you see that the article is from 2009?


I'm not sure whom this proposal is aimed at exactly.

Any production-quality C code will already use a (pointer + count) combo when passing arrays to a function, which is something that will still be needed under your proposal because the vast majority of arrays are dynamically sized. So unless all arrays in C are given the fat pointer treatment, I don't really see how what you suggest would make much of a difference. That is, if fat pointers are made a first-class language construct, then yes, that can be useful... though I disagree that, if it's not done, it will cause the demise of C.


pointer + size does not really fix anything, as you are relying on the programmer to correctly keep track of the size. I'm not even sure what alternative this improves upon. Even more error-prone null values marking the end? Praying the array will be big enough (looking at you, gets!)?

Unless you have a team of incredibly diligent coders, people are going to read past the end of bare arrays over and over again. One specific mistake I keep seeing is where people misinterpret the meaning of a variable named `size`: is it the number of elements or the size in bytes? Who knows, but it's probably UB either way if you're wrong.


> misinterpret the meaning of a variable named `size`

Quite right. I use, and highly recommend, the convention that `size` is for number of bytes, `length` is for number of elements, and `capacity` for the allocated number of elements.

    assert(length * sizeof(element) == size);
    assert(length <= capacity);
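
In struct form, the same convention might look like this (a sketch with made-up names, not something from the article):

    typedef struct {
        int   *data;
        size_t length;    /* number of elements in use    */
        size_t capacity;  /* number of elements allocated */
    } int_buf;

    /* size in bytes is derived rather than stored separately */
    #define INT_BUF_SIZE(b) ((b)->length * sizeof *(b)->data)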


Would you just wrap the pointer and size in a struct, then only iterate the array via a library of functions that check the size first?

I don't code C full time, but it's what I have always done when needing to use C via FFI to get a speed-up in a dynamic language.


You could do that, if you were using a library that provided/understood that struct. Not sure how common this is; I work with C++ much more than C.

The problem with this approach is that you are still relying on the application programmer to provide the correct size at the beginning and not to mess it up by directly accessing the struct member later. Private/public does not really exist in C, so it is a lot harder to enforce invariants within an object. The library could make the struct layout a private implementation detail (i.e., not fully define the struct in the header provided to the client and take a pointer to the struct as an argument in the API) to at least discourage this. You could combine this approach with a my_array_struct_init function that returns a pointer to an empty array object. This is a common approach taken in C libraries (e.g., libcurl) where the author really doesn't want you messing with their structs.
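
A sketch of that opaque-struct idea (hypothetical names, error handling kept minimal):

    /* int_array.h -- clients never see the layout */
    #include <stdbool.h>
    #include <stddef.h>

    typedef struct int_array int_array;   /* opaque type */

    int_array *int_array_new(size_t len);
    size_t     int_array_len(const int_array *a);
    bool       int_array_get(const int_array *a, size_t i, int *out);
    bool       int_array_set(int_array *a, size_t i, int value);
    void       int_array_free(int_array *a);

    /* int_array.c -- the only file that knows the layout */
    #include <stdlib.h>

    struct int_array {
        size_t len;
        int    data[];
    };

    int_array *int_array_new(size_t len)
    {
        int_array *a = calloc(1, sizeof *a + len * sizeof(int));
        if (a)
            a->len = len;
        return a;
    }

    bool int_array_get(const int_array *a, size_t i, int *out)
    {
        if (a == NULL || i >= a->len)
            return false;                 /* the bounds check lives here */
        *out = a->data[i];
        return true;
    }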


You can, but since C doesn't have templates or operator overloading this becomes awkward and unappealing.


But it makes it so much easier to

- Statically check the code, since the static analysis tool knows for certain which value is the size and can check that you're using it correctly.

- Initialize the size correctly, since you don't have to enter it twice, or more crucially, remember to change it twice (or create a #define in another part of the file, name it, and document it)

You also make an excellent point yourself about the meaning of 'size'. If this was standardized, it would be the same everywhere, minimizing the risk of ambiguity.


"relying on the programmer to correctly keep track of the size"

I don't interpret Walter's suggestion that way. Of course, I might be wrong. Since the compiler must know the size of the array at the time it's declared, my thinking is that the compiler is smart enough to pass the size without the programmer having to even think about it.


Any production-quality C code?

Any or some? I'm not sure if I've seen that in the wild.


> C is finished if it doesn't address the buffer overflow problem, and this proposal is a simple, easy, backwards compatible way to do it.

Is this really "simple, easy, backwards compatible"?

I think Rust is kind of a counterexample to this.

While Rust can throw around slices [] (effectively runtime length), throwing around [u8; 8] and [u8; 9] (compile time length) to the same function gets nasty.

Perhaps all the constexpr work in Rust will make this a lot easier.


Not constexpr; this is solved by const generics (the famous RFC 2000), with the array bound as a constant “generic” parameter/value.

I’ve been pushing for this feature for many years and have been playing with it since it first landed in nightly. It works quite well.


Sorry, I misspoke. But this wasn't meant to be about Rust.

My point was that: if you dump slices/fat pointers into C, how does it help?

Slices/fat pointers and the corresponding checks are a runtime thing, and that's absolutely anathema to a lot of C programmers. If it's not anathema to the C programmer, they probably aren't in C anyway.

So, now you need a way so that compiled slices/fat pointers mean something, and I'm not convinced that doesn't have a lot of ramifications that are being glossed over.


The checks can be at compile time, given how good the data flow analysis is in the compiler. The rest are at runtime, presumably coming with a compiler switch to turn them off.

Most people using D leave the checks on.

Even without the checks, however, I can vouch that implicitly carrying around the length of the array with the array pointer is a vast improvement in the clarity of the written code.


I don't see a future where C survives, not only because of memory corruption bugs (although that's a pretty big one), but also for usability: the lack of package manager, common build system, good documentation, good standard library, etc. is just too much to compete with any modern systems language.


> I don't see a future where C survives

I've been seeing those exact words for decades now, and C is still going strong. Every few years a new language comes along, someone writes something in it that was written in C before, someone might even write a basic OS in it, and after a few years that language is almost forgotten; a new one is here, and again someone is writing something in it. But in the end, we still use C for the things we used it for 10, 20, for some even 30 years ago.


Usage of C in new projects has fallen dramatically in recent decades. It used to be the case that C was considered a general-purpose programming language and applications such as Evolution were written in it. Today big applications in C are increasingly rare, and Rust is only accelerating this trend - nobody wants to have buffer overflows anymore.


Do you have a citation? Genuinely curious.


> lack of package manager, common build system, good documentation.

This is where C is superior to virtually every other language. It has K&R to start with [1], a wealth of examples to progress from there, man pages, autotools, cmake, static and shared libraries.

> good standard library.

It should have hash tables at least, but it isn't bad.

[1] Which is still the best language book ever written (yes, it has some anti patterns, you unlearn them quickly).


Huh? In what way are C's books, documentation or build system superior to those found in other languages? Most languages have plenty of good books written about them. And plenty of code examples online. I can't speak for other languages but I find MDN (Javascript) and the rust docs consistently better than C's man pages. Ruby's documentation is great too.

As for build systems, autotools is a hilarious clown car of a disaster. You write a script (for autoconf) to generate a huge, slow script (configure) to generate a makefile to finally invoke gcc? It is shockingly convoluted. It seems more like a code generation art project than something people should use. CMake papers over it about as well as it can, but I think cmake is (maybe by necessity) more complex than some other entire programming languages. In comparison, in rust "cargo build" will build my project correctly on any platform, any time, with usually no effort on my part beyond writing the names and versions of my dependencies in a file.

And as for package management, C is stuck in the 80s. It limps by, but it doesn’t have a package manager as we know them today. There’s no cargo, gems, npm, etc equivalent. Apt is no solution if you want your software to work on multiple distros (which all have their own ideas about versioning). Let alone writing software that builds on windows, Mac and Linux.

So no, C is not superior to other modern languages in its docs, build system or package manager. It is vastly inferior. I still love it. But we’ve gotten much, much better at making tooling in the last few decades. And sadly that innovation hasn’t been ported back to C.


> in rust “cargo build” will build my project correctly on any platform

I've got a parts drawer full of controllers that says it won't.


You are conflating consistency of package management and library behavior across platforms with platform support.

When it comes to microcontrollers rust is currently at the mercy of LLVM support and vendors.


> It should have hash tables at least

https://man.openbsd.org/ohash_init.3


What of it?

> Those functions are completely non-standard and should be avoided in portable programs.


Sure, it has not been standardized, it is not part of the standard library, so what? Did the world stop? I mean, practically speaking, who cares? Implement it, or find libraries that did. There are plenty. I posted this one because it exists for an OS; OpenBSD, since 1999. Plus, AFAIK ohash is portable enough. It consists of 2 files, and you can compile it with -std=c89. Only the bounded attribute is ignored.

If you want I could have brought up hcreate, hdestroy, and hsearch:

> The functions hcreate(), hsearch(), and hdestroy() are from SVr4, and are described in POSIX.1-2001 and POSIX.1-2008.

Happier?
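For what it's worth, a minimal sketch of how those POSIX functions are used (error checking omitted, and the key/value here are made up for illustration):

  #include <stdio.h>
  #include <search.h>   /* hcreate, hsearch, hdestroy (POSIX) */

  int main(void) {
      hcreate(16);                                   /* table sized for ~16 entries */
      ENTRY e = { .key = "answer", .data = (void *)(long)42 };
      hsearch(e, ENTER);                             /* insert */
      ENTRY q = { .key = "answer" };
      ENTRY *found = hsearch(q, FIND);               /* lookup */
      if (found)
          printf("%s -> %ld\n", found->key, (long)found->data);
      hdestroy();
  }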


I use stb myself, so I have no qualms with that. The point is rather that GP was discussing praise for C’s standard library, and even the most portable single-file include-only dependency remains just that: an external dependency that isn’t part of the C standard library (and no, posix isn’t C).


Does it make much of a difference though? Take the hyped Rust for example. Most useful stuff is in crates, i.e. an external dependency. No one seems to have a problem with that.

Personally I do not mind using libraries typically installed by the Linux distribution's package manager anyways.

If the question is whether or not I think the C standard library could be improved, then yes, I would say it could, but I do not want it to have a hash table and all sorts of stuff like that, because there are lots and lots of ways to implement them, and they might not suit my needs. C is great, because you can build it from the ground up (if you want to) to make it specifically for your use case. It gives you the building blocks. I believe I have a comment regarding this somewhere, that I like C because it does not implement stuff for you that is in some ways "generalized", which is often a bad thing. This is my problem with "it should have hash tables at least". You cannot implement it in such a way that it suits everyone's needs.


Rust not having a good standard library is a huge problem. This increases the risk of a rust codebase due to the high number of third party dependencies.


I only said that "I do not mind using libraries typically installed by the Linux distribution's package manager", which was in respect to C.

As far as Rust goes, yes, I do not like that crates are full of one-liners, and so forth. It shares the same problems that npm has. I ran cargo build on many Rust projects before. No way.


So what are the antipatterns?


> I don't see a future where C survives

Meanwhile C has been running strong since the 70s.

> the lack of package manager

What do you call linux distro's package managers then? I mean, in distributions like Debian you can even download a package's source code with apt-get.


>What do you call linux distro's package managers then?

If you want to count them as package managers, they're by far the worst ones of all the well known languages (with some notable exceptions e.g. guix's and nixos's).

They're not portable between distributions or even different versions of the same distribution (!), since it's non-trivial to install older versions of libraries (or, hell, different versions of the same library at the same time). Not to mention that it's a very manual and tedious process in comparison to all the other language-specific package managers. 'Dependency hell' is a problem virtually limited to distro package managers (and languages like C and C++ that depend on them).

Getting older, unmaintained C programs to run on Linux is an incredibly frustrating experience and I think a perfect demonstration of how the current distro package manager approach is wholly insufficient.


> If you want to count them as package managers, they're by far the worst ones of all the well known languages (with some notable exceptions e.g. guix's and nixos's).

They have the only feature I care about: cross-language dependency management.

Unless you are suggesting reimplementing everything in each language and then making users install ten different XML parsers, SSL implementations, etc., just because of not-implemented-in-my-favorite-language syndrome.


Only on platforms where UNIX is the name of the game.


Those are features that make C flexible on main-stream platforms and also usable for so many other platforms where other languages just don't/won't work.


Those features are not unique to C, they are just cargo culted as such.


I don't see C being in much worse shape than C++ with respect to build system and package manager. It's slow going, but progress seems to be happening there.

Are you saying both are doomed? Or is there some scenario where C++ survives without C?


I think both are, long term (think FORTRAN where it’s not particularly popular but a lot of existing code is maintained and not rewritten).

C++ is actually in a slightly better spot ironically because it’s harder to integrate with. If you have a C program you can pretty easily start replacing parts with Rust. You can’t do the same with C++ which insulates it better in that sense.


Reports of Fortran's death (latest standard 2018) are greatly exaggerated (much like C). It's receded to a niche, but it's still a very important niche (numerical, HPC). Hopefully, the development of a new Fortran front end for LLVM (from PGI/Nvidia?) pans out, as this would fill a gap in LLVM's offerings, and provide more competition for ifort and gfortran.


You’re proving my point. FORTRAN is a niche language. C++ is still mainstream. It will recede but not completely disappear


I don’t see a great future for C++ either


I definitely like the "lack of package manager, common build system". For me, having those is a negative for a language like rust.

You see, my OS already comes with those, and I expect to use them. I have the Debian package system: dpkg, apt, aptitude, and so on. It's a big mess when other software tries to steal that role. I have the traditional build systems and more: make, cmake, autoconf, scons, and so on. If I'm building a large project with multiple languages, I'm going to use one of those tools. If a language wants to fight me on that, I'm not interested in that language.


SQLite alone has a support contract through 2050.

C survives.


C will survive, if just for embedded/systems programming where you need a "portable assembly language" that can run on the simplest CPUs.


That's because of sunk-cost rather than design.

Thanks to LLVM and GCC you can happily write embedded code in a higher level language, but the vendors don't bother supporting it because a lot of embedded coding isn't really what we would call software (no tests etc.)


Toolchains are one side, but garbage collection and big standard libraries are also a big reason. Anything with under a MB of RAM has a choice of several modern languages, but it is still basically just C, C++, Rust, Lua or MicroPython.


D works fine on microcontrollers.

I don't think anyone was going to write their fridge's code in Haskell anyway.


The higher level languages are kind of the problem though. I need things like the ability to know the layout of my structs.


Rust, D, Whatever let you control the layout of your struct.


Even C#, actually!


As much as I dislike C

> There are only two kinds of languages: the ones people complain about and the ones nobody uses.

This unfortunately seems to mostly hold true.


Porque no los dos? There are a few languages that nobody uses and also everybody seems to complain about.


Complaining about language doesn't seem off topic here:

"Porque" means Because. "Por que" means Why.


> mostly


MUMPS.


Unfortunately, until we get rid of UNIX/POSIX clones, C will be kept around.

So not in my lifetime.


Or at least not in Torvalds' lifetime. His views on replacements for C are legendary. It would be interesting to see his comments on this proposal.


Torvalds himself might have a different opinion:

People have been looking at that for years now. I’m convinced it’s going to happen one day. It might not be Rust, but it’s going to happen that we will have different models for writing these kinds of things

https://thenewstack.io/linus-torvalds-on-diversity-longevity...


> the lack of package manager

Just use nix or even apt. Both of them are MUCH better when compared to trash like npm or cargo which do not even check for signatures.

> common build system

Such as make? There is also Ninja/Meson if you prefer.


> I don't see a future where C survives

if C dies then what replaces it?


Perhaps a combination of a language like Zig (a 1:1 replacement for situations where you really do want a lot of manual low-level control) and higher-level languages like Rust eating into more and more of the use cases.


The meat of the proposal:

"a pair consisting of a pointer to the start of the array, and a size_t of the array dimension"

No, that still doesn't fix the ABI. It's syntactic sugar. It is most definitely not passing an array.

Passing an array means exactly that, no more and no less. For example, suppose this is your array:

  double foo[100][100];
The size is 80000 bytes. That is exactly how much data needs to be copied onto the stack, no more and no less.

Getting the array dimensions is secondary. It would be nice to have them work. They could automatically get names. They could get size checks, so a function might declare itself compatible with a certain range of sizes. That's all a bonus, of much lower importance than the actual ability to pass an array.

The inability to pass an array impacts numerous other languages because they use the C ABI. If you can't put those 80000 bytes on the stack in C, then you can't do it in any language. The whole software ecosystem is thus impoverished.


Are you non-jokingly suggesting copying the entire array to and from the stack each time, as an alternative to the OP's proposal (and as a default best-practice)?


Yes. If you don't really want to pass an array then don't do that. The language shouldn't get in the way when somebody wants to pass an array.

Take the address, and pass a pointer, if that is what you want to do.

Maybe I want the callee to be able to modify the array without affecting the caller. Maybe I'm even telling the linker to put that array in ROM, but I want a writable copy in the callee.

Whatever... I have my reasons. The language shouldn't block me.


I don't think you realize how intractably inefficient that would be for all but the smallest cases.


I realize exactly how inefficient it would be. If it hurts, don't do that.

I'm the kind of person who optimizes with assembly, counts cache misses, counts TLB misses, and pays attention to pipeline stalls. I definitely understand the performance implications, and I definitely wouldn't be passing arrays around all the time.

That said, I want the ability. I want the language to let me do what I want, and on rare occasions I want to pass an array. Let me pass an array.


Ok, but you didn't just say this should be possible, you said it should be the default best-practice. Even if it were useful in a handful of cases, this would be a terrible default way of doing things.


I didn't say it should be the default, but yes it should be. It is for structs.

We can have giant structs. I've seen some over a megabyte in size. The default is that the callee gets a copy. (depending on the ABI it could be in the "wrong" stack frame, but it is a distinct copy)

Are we having huge problems with structs being passed by value? I don't think so. Normal people pass pointers, except when they actually want to pass by value. It works fine.

I have frequently seen beginners struggle with C arrays and pointers. Part of the trouble is that you can't pass an array. You can try, but the compiler quietly substitutes different code. It's a source of confusion, generating incorrect mental models of what is going on.
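A small example of the kind of thing that trips beginners up: the parameter below looks like an array but is silently a pointer.

  #include <stdio.h>

  void f(char a[100]) {            /* silently rewritten to: void f(char *a) */
      printf("%zu\n", sizeof a);   /* sizeof(char *), e.g. 8 -- not 100      */
  }

  int main(void) {
      char buf[100];
      printf("%zu\n", sizeof buf); /* 100 */
      f(buf);
  }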


Beginners don't struggle in Java, JavaScript, Python, C#, Ruby, and the dozen (at least) other languages that exclusively pass arrays by reference.

But all of this is way off-topic from the OP: the original point was, "Passing a pointer and length separately is error-prone; there should be a way to easily package the two together and this should be the default pattern for 90% of cases." Then you came in and said "No, instead C should support this totally orthogonal side-case that's a bad idea 90% of the time but has some niche uses." It's not a bad suggestion in itself, necessarily, but it's totally unrelated to the original proposal, much less an alternative to it.


The original claim was that the proposal would be C really passing arrays. In the article it says:

"the inability to pass an array to a function as an array, even if it is declared to be an array. C will silently convert the array to be a pointer, and will rewrite the function declaration so it is semantically a pointer"

...and later, referring to the new syntax:

"an array is passed"

In no way is it so. It has nothing to do with passing arrays. It's passing a fat pointer, which is different.
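To make the distinction concrete, here's a sketch of what such a fat pointer amounts to (the struct name is mine, not the article's):

  #include <stddef.h>

  /* roughly what the proposed "a[..]" parameter lowers to: a pointer
     plus a length -- the array contents themselves are never copied */
  struct char_slice {
      char  *ptr;
      size_t len;
  };

  size_t count_zeros(struct char_slice a) {
      size_t n = 0;
      for (size_t i = 0; i < a.len; i++)
          if (a.ptr[i] == 0)
              n++;
      return n;
  }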


The weird thing is that the compiler obviously knows how to copy an array, because you can pass a copy of a struct that contains an array. I have a vague impression that early versions of C couldn't pass either structs or arrays, only scalars and pointers.
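Wrapping is also the usual workaround today if you actually want a by-value copy; a bare array parameter would decay, but this doesn't:

  struct grid { double a[100][100]; };   /* the same 80000 bytes, wrapped */

  void scribble(struct grid g) {         /* g is a distinct 80000-byte copy */
      g.a[0][0] = 1.0;                   /* the caller's array is untouched */
  }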


You are correct. This is briefly mentioned on https://www.bell-labs.com/usr/dmr/www/chist.html:

> While [the first edition of K&R] foreshadowed the newer approach to structures, only after it was published did the language support assigning them, passing them to and from functions, [...]


> Are we having huge problems with structs being passed by value?

Doing this frequently in any application means your profile will have a lot of memcpy in it.


Yes, but is that happening? I think the answer is no. The mere ability to pass huge structures does not cause programmers to do that.

I believe the same would be true if the language allowed passing arrays. Programmers would not generally pass them around. There is no need for the language to protect us from this by failing to implement the ability to pass arrays.


I think I agree with that position. I seem to have gotten the impression that you were doing this and not seeing a performance impact, which has been wrong in my experience.


I love how so many people here argue with the Walter Bright about technical aspects of C.

I have been a member of many programming languages communities, and every language has its own culture. C was always a language for the arrogant. "The real programmers" that can handle their memory, not afraid to work with pointers and that can get their code right.

I've been there, done that for many years, and became more humble with time. In a way I still love the brutal simplicity and low-level nature of C, but I would use it only if absolutely can't use any other language for technical reasons, and I would be really, really cautious.


> Oh, how they dare argue with WALTER BRIGHT. The hubris!

With all due respect, but if someone says something invalid, the fact that they have authority on a subject does not mean that we should agree.

As far as I understand the article (And I'm not the great Walter Bright, so I may be wrong) - the author states that "void foo(char a[..])" is better syntax than "void foo(size_t s, char a[])" but does not provide any arguments for it. Furthermore, the author initially fails to mention that there has been an attempt to fix the array-to-pointer-decay issue, when discussing "C's Biggest Mistake".

So, yeah, the author may be right that this has been C's biggest mistake. I don't know whether that is true or not; I do not have his experience. It is certainly true that this mistake would rank high among all the mistakes C made. Still, the initial "sleight of hand" move followed by an unsubstantiated argument leads to a post of quality similar to that of a twitter post. Maybe even worse, since, you know, it's posted on a place other than twitter, so we are actually talking about it as if it were something serious.


The biggest mistake to me feels like implicit integer conversions. That's where C feels like it's really out to get you.


on a somewhat related note, I've always wished for something like `explicit` that prevents assigning different typedefs for the same underlying type to each other. like suppose I have two types, WorldVec (vector in worldspace) and ViewVec (vector in view/screenspace). under the hood they are both typedefs for float[3], so I can freely assign them back and forth. but any vector operation that mixes the types would almost always be a bug, since they are in different spaces. would be cool to get this functionality out of the humble typedef.
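e.g. something like this goes through without a peep, even though it is almost certainly a bug:

  typedef float WorldVec[3];
  typedef float ViewVec[3];

  void draw_marker(ViewVec v);   /* expects a view-space vector */

  void example(WorldVec w) {
      draw_marker(w);            /* accepted: both are just float[3] (i.e. float *) here */
  }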


This has always bugged me as well. I've generally solved this by wrapping things in a struct. Type checking will use the (incompatible) wrappers and a modern compiler should optimize them away. To avoid strict aliasing violations when converting between equivalent wrapped types you can use a union and employ a function to hide the verbosity.

I have no idea if this is the "right" way to do things, but it seems to work.
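Roughly like this (the names and the identity "conversion" are just for illustration; a real converter would apply the actual transform):

  typedef struct { float v[3]; } WorldVec;
  typedef struct { float v[3]; } ViewVec;

  /* the union hides the type pun, which is legal in C (not C++) */
  static inline ViewVec world_to_view(WorldVec w) {
      union { WorldVec w; ViewVec s; } u = { .w = w };
      return u.s;
  }

  void draw_marker(ViewVec v);

  void example(WorldVec w) {
      /* draw_marker(w);                error: incompatible struct types */
      draw_marker(world_to_view(w));    /* conversion is now explicit    */
  }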


That's what Microsoft did from some version on in their build tools: all the HANDLEs etc. used to just be typedefs for void *, and now each is a pointer to its own dummy struct (HANDLE__ and friends). Seems to be a good solution.
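Roughly (a sketch from memory of what the headers do for typed handles):

  struct HWND__ { int unused; };  typedef struct HWND__ *HWND;
  struct HDC__  { int unused; };  typedef struct HDC__  *HDC;

  /* passing an HDC where an HWND is expected is now a compile-time
     error, whereas plain void * typedefs would let it through silently */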


That was already opt-in back in the Windows 3.1 days, but few bothered to use it as such.

Back when I was doing pure Windows C for a while, this helped quite a bit,

Basically you had to #include<windowsx.h> and define the STRICT macro.

There were also several utilities that made it much easier to deal with events, dialogs and callbacks.

https://docs.microsoft.com/en-us/windows/win32/api/windowsx/

https://jeffpar.github.io/kbarchive/kb/083/Q83456/

I got to learn it via the "Programmer's introduction to Windows 3.1" book,

https://archive.org/details/programmersintro00myer


You can do that in C++ by wrapping the type in a new class, although it requires some boilerplate depending on which operators you want to support. [0]

You don't have to use all features of C++ - if you prefer a more C like style you can have that.

[0] https://www.boost.org/doc/libs/1_48_0/boost/strong_typedef.h...


C is considered strictly typed but that’s only when compared to the likes of JS, Python, and co. What you’re talking about is incredibly important, and using it in other truly strongly typed languages has opened my eyes to just how much compile-time safety a language can really provide, practically free of cost.


structs


The exact-width integer types proposal[1], implemented in clang, happens to fix this.

1. http://blog.llvm.org/2020/04/the-new-clang-extint-feature-pr...
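If I read the linked post right, the key point is that these types opt out of the usual integer promotions; a rough sketch of the idea (exact rules and diagnostics may differ, and C23's _BitInt has since adjusted them):

  _ExtInt(12) a = 100;
  _ExtInt(20) b = 1000;

  /* a + b;                            rejected: operand widths differ   */
  _ExtInt(20) c = (_ExtInt(20))a + b;  /* widen explicitly, then operate */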


I don't feel it's so bad. You have a specific flag to tell the compiler to show warnings if you have any.
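Presumably the flag meant here is -Wconversion (gcc and clang both have it); e.g.:

  void demo(void) {
      long long big = 1LL << 40;
      int small = big;   /* may change value: warned under -Wconversion, silent otherwise */
      (void)small;
  }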


Why should I need to use a flag to fix something that doesn’t make sense? I can understand implicit up-conversions, but down-conversions too?


I'm not arguing that the original design was good or bad. I'm simply saying that it's as simple as adding a flag to the compiler these days.


you can certainly argue it should have been a default warning from the beginning, but mistakes were made and it's a bit too late to change that. people don't like when old, battle-tested code suddenly starts spewing a new warning everywhere after a compiler update.

in reality, most production build systems at least use -Wall (or their compiler's equivalent) and possibly also have a list of specific warnings turned on/off for different parts of the code. it would be nice to have some saner defaults, but it just doesn't matter that much.


Agree. And they have leaked out to C++ where they have been very hard to fix, and even, to some degree, to Rust.


How have they leaked into Rust? I thought Rust had no implicit conversions?


There are a small number of coercions, but we do not do them around numeric types, it’s true. Not sure what your parent is referring to.


It does, however, have integer overflow in release mode. So if you do write a conversion, you can end up with a value different from the source.


But that's because Rust is designed for zero-cost abstractions in release mode. Overflow checking is not zero cost, so it's only enabled in debug mode, which is the maximum safety possible here while keeping it zero cost at (release) runtime. Without dependent types or something similar I don't know if it would be possible to check bounds.


Yes, there are reasons for it.
