I love coding in C (lord-left.github.io)
346 points by lordleft 27 days ago | 424 comments



Coding in C is like camping. It's fun for a while, but eventually you really miss things like flushing toilets and grocery stores. I like using C, but I get frustrated every time I hit a wall trying to build features from other languages. C is a very WET language.

C is basically it for embedded development though. I've gotten so tired of recompiling and waiting to flash a chip with C, that I've started learning ANTLR and making my own language. The idea is to have a language which runs in a VM written in C, and allows you to easily access code which has to be in C. Sort of like uPython, except it can easily be used on any board with a C compiler with a small amount of setup, and C FFI is a first class citizen. Also coroutines masquerading as first class actors with message passing, since I always end up building that in C anyway.
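
For what it's worth, a minimal sketch of the kind of actor-ish mailbox plumbing described above, in plain C (the names and the fixed queue size are illustrative, not from any particular project):

    /* Minimal actor-style mailbox: a fixed-size ring buffer of messages plus
       a handler per actor; a scheduler just drains each mailbox in turn. */
    #define MBOX_CAP 16

    typedef struct { int type; int payload; } msg;

    typedef struct actor {
        msg  queue[MBOX_CAP];
        int  head, tail;                      /* ring buffer indices */
        void (*handle)(struct actor *self, const msg *m);
    } actor;

    static int actor_send(actor *to, msg m)
    {
        int next = (to->tail + 1) % MBOX_CAP;
        if (next == to->head)
            return -1;                        /* mailbox full: caller decides */
        to->queue[to->tail] = m;
        to->tail = next;
        return 0;
    }

    static void actor_run_once(actor *a)
    {
        while (a->head != a->tail) {          /* drain pending messages */
            msg m = a->queue[a->head];
            a->head = (a->head + 1) % MBOX_CAP;
            a->handle(a, &m);
        }
    }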


There's an old quote, "Greenspun's tenth rule" https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule :

"Any sufficiently complicated C or Fortran program contains an ad-hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp."


An old quote indeed.

In the modern world, any sufficiently complicated C or C++ program embeds a runtime of some higher-level language. More often than not, that language is formally specified, well supported, and fast.

Sometimes it's indeed Lisp, as in AutoCAD, but more popular choices are JavaScript (all web browsers), Lua (many video games and much more; see Wireshark), Python, VBScript or VBA (older Windows software), or the .NET runtime (current Windows software and some cross-platform software as well, like the Unity3D game engine). These days it's rarely a custom language runtime, but that still happens, as with Matlab.


Which of those is actually formally specified? Also, calling Python or JavaScript efficient at all sounds a bit unwarranted. Consider any JavaScript desktop application, its power and memory usage... it's a mess.


> Which of those is actually formally specified?

Many of them are international standards, see ANSI INCITS 226-1994 for LISP, ECMA-262 and ISO/IEC 16262 for JavaScript, ECMA-334 and ISO/IEC 23270:2018 for C#. Lua and Python aren’t standardized, but even so, they both have a comprehensive formal spec.

> calling Python or JavaScript efficient at all sounds a bit unwarranted

Both are much slower at number crunching compared to C or C++ but they aren’t that bad, either. Most JavaScript VMs feature a good JIT compiler. Some Python runtimes have JIT too.

> Consider any JavaScript desktop application, its power and memory usage.

There are good ones, like VSCode. On average they're indeed not great, but I think that's just an unfortunate consequence of the low entry barrier. It's easy for inexperienced people to get started with the technology. And the ecosystem gets misjudged based on the output of these inexperienced developers.


Python is actually scary fast because anybody that uses it for serious numerical work will use numpy.


ie "python is scary fast because anyone who uses it is mostly running C"


Yep. This goes for a lot of environments. It's going to take many, many decades to get rid of C. Probably longer than it has already been around.


Doesn't always mean it's particularly performant, unfortunately. The JSON lib in PHP is still C (part of Zend), but it's very susceptible to malloc failures (one big contiguous request for the whole shebang), and it's generally way faster to serialize arrays into a .php file you load than into JSON if you're storing it locally. I compared what I read there to the serde stuff for Rust and the difference is stark.

Maybe having most of this stuff in C libs with scripts wrapped around them will make it easier to migrate. Keep the same Python but swap the lib from C to some Rust that's still usable as a crate for pure Rust projects.


What you're really saying is that C is fast and Python is slow. I don't think that was ever in debate.


> Consider any JavaScript desktop application, its power and memory usage... it's a mess.

I'm running VSCode right now with many tabs open and extensions running, and it's using less than 100mb of memory. Doesn't seem so bad to me


> using less than 100mb of memory. Doesn't seem so bad to me

You know... any embedded programmer reading that comment is probably laughing hysterically. In most cheap or energy-efficient microcontrollers, more than 4 MB is considered a luxury.


Ok.

There is no virtue in using less memory for the sake of it. If a desktop application is taking 100mb, you can have a lot of those running before modern boxes start to struggle.


Whether 100 MB of usage is unacceptably large mostly depends on the precise features in play:

* 32-bit vs 64-bit application (64-bit will bloat all pointers)

* Methods of loading and editing documents (modern editors attempt a lot of introspection based on syntax highlighting or the project environment)

* Character encoding support (supporting current Unicode rendering would almost immediately take you beyond 4 MB)

Interactive editing is a pretty memory-hungry task compared to a simple viewer or batch processor. There's more reason to keep things cached. If you actually go back to old text editor versions from the days of 4 MB desktops, you'll find yourself missing stuff. Not so much that you can't get by, but enough to give pause and consider taking the hit.


A friend of mine has been developing a small Lisp which starts at 2 KB of RAM: http://www.ulisp.com


I mean sure, but he's talking about desktop applications.


Depends on the exact numbers. For instance, Allwinner R328 chip has 2 ARM cores running up to 1.2GHz, includes 64MB or 128MB RAM, and the price is around $3.


> I'm running VSCode right now with many tabs open and extensions running, and it's using less than 100mb of memory. Doesn't seem so bad to me

Uh... is it? With no tabs open and fewer than 12 extensions, the Code process itself takes 120 megabytes, and there's also 1.2 gigabytes of Electron processes running along with it.


Are you sure you didn’t forget to count all the electron helper processes as well?


100 mb to edit text seems not great...


Have you ever installed enough Vim or Emacs plugins to reach feature parity and then looked at memory usage? VS Code is doing a fine job.


VS Code does a lot more than "edit text" even out of the box with no extensions.


~100,000,000 bytes! That's a lot of "state".


>In modern world, any sufficiently complicated C or C++ program embeds a runtime of some higher level language. More often than not, that language is formally specified, well-supported, and fast.

More often than not? Almost none of the languages mentioned are "formally specified", and most of them are hardly fast either...


> Almost none of the languages mentioned are "formally specified"

They all have written specs. Many of them even have these specs standardized, see ANSI INCITS 226-1994, ISO/IEC 16262 and ISO/IEC 23270:2018.

> most of them are hardly fast either

It’s borderline impossible to be fast compared to C or C++. By that statement I meant 2 things.

(1) They're likely to be much faster than whatever ad-hoc equivalent is doable within a reasonable budget. People have spent tons of resources improving these runtimes and their JITs; it's very expensive to do something comparable.

(2) On modern hardware, their performance is now adequate for many practical uses. This is true even for resource-constrained applications like video games or embedded/mobile software, which were overwhelmingly dominated by C or C++ a couple of decades ago.


the first rule with no spin: "Any portable, formally-specified, bug-free, quick implementation of all of Common Lisp contains at its core a fair amount of nicely indented C."


Sure, sure. "...there is but one God, the Father, from whom all things came ..."


BCPL?


Unlike Lisp, C is still relevant: it's #2 on TIOBE, while Lisp is #27, i.e. nowhere.


> Coding in C is like camping. It's fun for a while, but eventually you really miss things like flushing toilets and grocery stores.

The only issue I have with C is that there are no good reasons for some of those missing features to be missing.

Take, for example, namespaces. Would it be a problem to implement them as, say, implicit prefixes?


That's exactly my complaint. I think it's missing things because if you added them in a way compatible with the C way of doing things then you would permanently cleave C from C++. Since C++ is a runaway train at this point detaching is probably a good idea now.

Me I want range types like Ada. Real array types. I think I want blocks/coroutines.


> Me I want range types like Ada. Real array types. I think I want blocks/coroutines.

aka CDC


Would it be possible to do this without changing how name mangling works?


Actually, yes:

  #include <string.h>
  __prefix__ str; /* in scope: "","str" */
  strlen("Hi!"); /* try "strlen",done,ignore "strstrlen" */
  len("Bye"); /* no "len",try "strlen",done */
  __prefix__(foo_) { void bar(void); } /* "foo_bar" */


There's no name mangling in C, at least not in any ABI I know of.


There's "mangling" where symbols have an underscore put in front of them. But I think the point is that namespaced functions would need to be mangled and thus be difficult to call.


> But I think the point is that namespaced functions would need to be mangled

Namespaces don't need to be mangled at all if they are interpreted as symbol prefixes.

I'd prefer that, say, namespace foo::bar resolved into foo_bar for all symbols (even if that meant risking namespace naming collisions) to not having any support for namespaces in C.

In fact, this approach is already used to implement pseudo-namespaces, so that wouldn't be much of a stretch.
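
To make that concrete, here is a minimal sketch of the prefix convention (the library name foo_ is illustrative); a prefix-based namespace lowering would essentially automate this renaming:

    #include <stddef.h>

    /* Pseudo-namespace by prefix: every public identifier of the library
       carries the foo_ prefix. */
    typedef struct foo_buffer { char *data; size_t len; } foo_buffer;

    int  foo_buffer_init(foo_buffer *b, size_t cap);
    void foo_buffer_free(foo_buffer *b);

    /* "using" the short name locally is just an alias today: */
    #define buffer_init foo_buffer_init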


What's the point? How is typing foo::bar better than typing foo_bar?

It seems like you want the language to be more complicated for no benefit. Why not use C++ at that point?


> What's the point? How is typing foo::bar better than typing foo_bar?

You're missing the whole point of namespaces. The goal is not to replace foo_bar with foo::bar. The whole point is that within a scope you can type bar instead of foo::bar, or bar instead of foo::baz::qux::bar, because you might have multiple identifiers that share a name even though they are expected to be distinct symbols.

https://en.wikipedia.org/wiki/Namespace


> What's the point?

using namespace kind::ofa::long_name;

or

using kind::ofa::long_name::bar;

is the point.


As far as I know, name mangling only exists because linkers don’t have a notion of namespaces. If you fix linkers, then there is no need to mangle.


And also because of function overloading in, say, C++. To link two functions with the same identifier that differ only in signature (e.g. argument types), we need to present them to the linker as two different functions with different identifiers.
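
A hedged illustration: C has no overloading, so the programmer disambiguates by hand, which is exactly the renaming a C++ compiler automates (the Itanium-ABI mangled names in the comments are for illustration only):

    /* In C there is one symbol per function, so "overloads" are spelled out
       by hand: */
    double square_d(double x) { return x * x; }   /* linker symbol: square_d */
    int    square_i(int x)    { return x * x; }   /* linker symbol: square_i */

    /* In C++ both could be named `square`; the compiler then encodes the
       parameter types into the symbol so the linker still sees two distinct
       names, e.g. (Itanium ABI) _Z6squared for square(double) and
       _Z6squarei for square(int). */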


Or something like the std::vector would be nice.


https://github.com/nothings/stb/blob/master/stretchy_buffer.... Although I get that you want an official standard library version, or a built-in language feature.
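
For context, that header works by stashing length and capacity in a hidden header just before the pointer the caller indexes. A minimal sketch of the technique (names and details here are illustrative, not stb's actual macros; error handling omitted):

    #include <stdlib.h>

    /* Growable ("stretchy") array: length and capacity live in a hidden
       header just before the pointer the caller sees, so the caller still
       indexes it like a plain T*. */
    typedef struct { size_t len, cap; } vec_header;

    #define vec_hdr(a)  ((vec_header *)(a) - 1)
    #define vec_len(a)  ((a) ? vec_hdr(a)->len : 0)
    #define vec_free(a) ((a) ? free(vec_hdr(a)) : (void)0)
    #define vec_push(a, v) \
        ((a) = vec_grow((a), sizeof *(a)), (a)[vec_hdr(a)->len++] = (v))

    static void *vec_grow(void *a, size_t elem_size)
    {
        size_t len = a ? vec_hdr(a)->len : 0;
        size_t cap = a ? vec_hdr(a)->cap : 0;
        if (len < cap)
            return a;                          /* still room, no realloc */
        size_t new_cap = cap ? 2 * cap : 8;    /* geometric growth */
        vec_header *h = realloc(a ? (void *)vec_hdr(a) : NULL,
                                sizeof *h + new_cap * elem_size);
        h->len = len;
        h->cap = new_cap;
        return h + 1;                          /* hand back the data part */
    }

    /* Usage: int *xs = NULL; vec_push(xs, 42); ... vec_free(xs); */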


I think GP’s point was that what you want is called “C++”.


Pasted the quote into work chat and a waggish co-worker immediately came back with, "on the other hand, Python is like camping with a toilet, but that toilet may have a dangerous snake in it, but you won't know until you try it"


The seat is in 2.7 but the flush tank requires 3.5.


5 more days until it is officially deprecated


> I've started learning ANTLR and making my own language.

> a language which runs in a VM written in C [...] easily be used on any board with a C compiler with a small amount of setup, and C FFI is a first class citizen. Also coroutines

Use Lua. It has all of that already.


I've looked at LUA, and it's possible I missed a flavor, but I have a long list of things I want:

  -Static types that match C for seamless interop.
  -No GC
  -First class actors 
  -Ahead of time declaration/allocation of actors and messages, with automatic async-like passing on a coroutine/actor waiting for an allocation, and automatic disposal of the actor after prolonged allocation failure.
  -Hot swapping actors from a running system in the field
  -Over the wire debugging and inspection of a running system in the field
AtomVM, an Erlang VM for small devices, is closer than LUA to what I'm looking for. However, I really want to create an environment made for embedded development, not tack a higher-level runtime onto something like an ESP32 and have devices drop every 2 months from a memory leak.


I've had a similar list for embedded. Unfortunately I've yet to fit many of those criteria without relying on GC. Maybe Zim, or Nim. But then you won't get introspection.

Rust, using one of the actor libraries based on async, might suit embedded really well if cross-platform support matures a bit more. Rust cross-compiling requires a C linker & compiler for the target platform, but Rust doesn't respect the standard LD/CC/CXX environment variables for setting them. Though it'd be really interesting to see if anyone can set up Rust to run on an embedded WASM to allow introspection / real co-routines. IMHO, that'd be awesome.

Currently I've settled on a Forth (in Forth style, an implementation I wrote https://github.com/elcritch/forthwith/) to give me an interactive environment without GC and stable timing (important!) with ability to add C function calls readily. Interactivity is big for me in the area I'm currently working in. Forth can easily be extended to have tasks / co-routines too, though I never tried implementing them. And of course, Forth macros are like exercises in puzzle solving.

AtomVM does look interesting, but early stage. The Erlang VM actually matches embedded really well. The actor model in theory lets each process run its own GC, which makes the GC model much simpler than in other dynamic languages. Modern OTP is large and not suited for embedded as you mention, but paring it down to basic actors and processes would fit well on many modern embedded MCUs. Elixir and Nerves are great for SBC/Linux embedded computing and some boards like the Omega2/Licheepi Nano. There's also GRiSP (https://www.grisp.org/) that runs BEAM on top of RTEMS.


I'm iffy on some of the details, but Zig (Ziglang) has async/await, static typing, and a very robust ability to intermingle with C. It's being designed for systems/embedded/games.

https://ziglang.org/


Just program an FPGA. You can get all those things with https://clash-lang.org/ because basically what you want is first-order FRP, and hardware does that natively. (The actor model is junk because the interesting thing is the data/information flow, not the "actors" themselves. FRP puts the focus back where it should be.)

You can implement that for embedded systems too, but as there is no obvious best way to implement it (a Von Neumann machine is so different), you'll constantly get annoyed if you are the type that likes to use the weakest hardware possible, as different implementations have different costs, and so you'll always be second-guessing the abstraction.


The need to support hot reloading is one of the reasons Erlang does not have static typing.


Dart has static typing, and hot reloading in its VM, so it's definitely possible.


I'm looking at hot reloading more for debugging difficult to reproduce bugs in the field than for patching critical systems which can't go down. The goal will be to hot reload logic, but not types. I think that should be doable.


seamless


Yup, the Lua interpreter is written entirely in pure ANSI C. OP should definitely see if it solves his problem before implementing his own language. Unless of course he just wants to learn how to implement a language as a project by itself, in which case, have fun :).


"Also correct array indexes" is not a feature that most people feel they need to explictly specify.

Also bitwise operators, but using functions instead isn't a huge problem (and supposedly 5.3 or so fixes that).


Why not FORTH? It may not have some features out of the box (no type checking, which isn't as bad as it sounds with FORTH), but newer runtimes tend to put those things in and/or they are fairly easy to implement yourself.

https://en.wikipedia.org/wiki/Forth_(programming_language)

also, system level REPL is some hardcore nerdity.


FORTH is neat, but I think you need to be too smart to use it. I made a list of other things I want under the peer post about LUA, some of that applies to FORTH.


> I made a list of other things I want under the peer post about LUA, some of that applies to FORTH.

Incidentally, while Forth comes from the era of ALLCAPS language names, even its creator Chuck Moore calls it Forth: https://web.archive.org/web/20040131054056/http://www.colorf... ; and Lua never was spelled with all caps.


> C is basically it for embedded development though.

This is the entire reason for me. With the way IoT is blowing up, C might actually be gaining market share. I've done my research, and things like TinyGo exist, and you can sort of compile Rust for microcontrollers if you use all of the right crates, but it feels pretty risky to do so.


> I've gotten so tired of recompiling and waiting to flash a chip with C, that I've started learning ANTLR and making my own language. The idea is to have a language which runs in a VM written in C, and allows you to easily access code which has to be in C.

You may want to check out Moddable's XS runtime[1], which is exactly that. It's written in portable C, and "XS in C"[2] makes it easy to interoperate with C code you may write. And even if you decide to roll your own runtime, there's tons to learn from the source code.

[1] https://www.moddable.com/faq.php#what-is-xs [2] https://github.com/Moddable-OpenSource/moddable/blob/public/...


I will definitely check that out. Thanks!

I haven't gotten to building the VM yet, and I'm hoping to just use someone else's and then contribute the tools I need to their ecosystem.


Besides embedded applications, C is important because it has a simple binary interface. Libraries written in C can be used in any other language. This is not true for higher level languages, whether compiled or virtualized.


Only on OSes written in C, because the C ABI is the OS ABI.

On Windows any language able to speak COM can use it, regardless what language was used to create the COM component.

Same applies to mainframes and their language environments.


You can actually call Go from C, but you have to copy values over; otherwise the Go garbage collector may free them while the C side is still using them.


I love writing C when I haven't written C in several years. It feels nostalgic and simple and elegant in my mind, and then I remember all of the issues I run into and how none of them have good, clear solutions. Things like writing cross-platform code, or string manipulation, or how often things I swear I understand actually have bizarre edge cases, things I thought were well-defined behavior are actually undefined, etc.


The one thing C will teach you quite quickly is that trying to be 'smart' gets punished, harshly.


I’m not talking about being clever, I’m talking about writing code that meets some baseline standard for security, correctness, and cross platform functionality. The issue with C is that standard, “unsurprising” code often behaves unexpectedly.


I didn't get that impression when I used it; if it does that, then most likely your code wasn't so standard or your understanding of the language was incorrect.

I personally only got in trouble in C when trying to be too clever.


You'd be an extreme exception, then.

The truth is that decades of projects written in C have clearly demonstrated that people can't write secure programs in C.


People can't write secure programs in any language. But C programs have ways of being insecure that other programs don't.


Virtually no one can write secure C code. This is different than “other languages allow some people to write insecure code occasionally”.


You're missing the point: writing secure code is categorically impossible, no matter what the language. That's very much different from 'other languages allow some people to write insecure code occasionally'. Every piece of code, every system ever built and every piece of hardware ever designed had bugs in it and if that item was supposed to be secure over time it will turn out that it wasn't. So the best you can hope for is that an item is obsoleted before you (or an adversary) detects its flaws and you're going to be fighting a rearguard action.

The whole idea that many eyes or 'safe' languages or some other magic bullet are going to solve this is so far off-base that it makes productive discussion impossible, because everybody starts to focus on the particular tool at hand whereas the real problem is an instance of the class 'architecture', not of the class 'tool'.


I don’t think I’m the one missing the point. You seem to be rebutting points I didn’t make, like other languages (or magic bullets) afford a bug count of zero. You also seem to be ignoring my point, which is that C is in a different ballpark than other languages in its difficulty of implementing correct software.


you did say that other languages are more secure, and the parent post here is asserting the entire platform is insecure, therefore no languages are secure, which is true.

however i disagree with both of you that writing safe C is hard. most of the problems i see relate to incorrect pointer usage or memory management. use stack-allocated containers from well-tested libs and pass references. this removes entire classes of bugs. one thing you can't do with C is be lazy.


I absolutely guarantee that if you throw AFL at any modestly complex C program that you have written that it will find bugs. Guarantee. This isn't about laziness.


safe and bug-free are different things. nobody writes bug-free code, but writing safe code isn't as hard. also i'm not entirely sure how a program proves lack of laziness; if anything it proves laziness


Many of the most expert C programmers have reported an inability to write secure C code. But you’re right that my understanding of the language is incorrect; that’s largely my point. There are so many oddities and exceptions and surprises that the language is very hard to understand correctly. The people who profess to “not have these problems” haven’t written code that has run the gauntlet of security and widespread multiplatform distribution.


do you have any references for problems these alleged expert programmers had when trying to write C?



he didn’t give any actual reason, just threw around some credentials.


I misunderstood your question. I don't know of the specific reasons; presumably they're the same reasons that intermediate C programmers can't reliably write secure C.


I don't know what you mean, exactly.

I do recall that in 1991, I was the only student in my "Introduction to Programming Using C" 101 class (as a Freshman in college) who could understand pointer arithmetic. That class ended the careers of many aspiring Computer Science majors.

I eagerly learned C++ a few years later. I did not understand at the time what a shitpile OOP is. I was totally into OO because it was the "industry trend." Someday we'll fully recover from that entire folly.


FYI, zig is the language that the most impressive embedded programmer I know is currently interested in to use for his embedded work (he does a lot of real time hard deadline things, like audio processing). I don’t know really anything about the language, but his interest in the language is enough to suggest other embedded programmers look into it.


From my subjective perspective, Zig, of all the languages, is what feels most like the modern successor of C.

I think it even does a better job than C++, which shares some of C's syntax.

The problem, I guess, is complexity and productivity. Zig here is a hit, while Rust is as much a miss as C++.

The question Zig was trying to answer is: can we have a language as simple, free and powerful as C, but with a more modern approach?

Rust, in its crusade to be seen as a C++ competitor, was not aiming at simplicity and user ergonomics.

So my feeling is that Zig is much more compelling as a C substitute than any other language out there, and that's probably the reason why your friend likes it more than the others.


This is all very subjective but I, for one, agree with every word you wrote.


You can fancy up your sleeping bag + tent with a huge motorhome/caravan by using C++. It's like you have this house on wheels, but if you just want to take a sleeping bag + tent with you on a trip, you can. Take whatever you need.


But then you have a hundred people following you around, yelling that you're doing C++ wrong.


> Coding in C is like camping. It's fun for a while, but eventually you really miss things like flushing toilets and grocery stores.

Bookmarking that comment. Will totally quote that whenever the need arises.


That sounds cool. Does ANTLR help with anything besides parsing your new language? It seems like the substantial effort is making the VM/runtime that the language runs within.


Haven't finished the ANTLR book yet, but that was the goal. I'd much rather define a grammar than hand write a(nother) lexer/parser. It also has some useful tools for debugging and visualizing.


Forth is tiny, anybody can write their own and it is pretty awesome and interactive.


and can be made Fortran-ish fast: http://soton.mpeforth.com/flag/jfar/vol5/no2/article1.pdf (disclaimer, not a recent paper)


I started learning C by writing a database; not the way to go, since information about databases available online is scarce.


What does it mean for a language to be wet?


I think the parent is using WET as the opposite of DRY: Don't Repeat Yourself (https://en.wikipedia.org/wiki/Don%27t_repeat_yourself).


Wepeat Evewy Time -- Elmer Fudd's programming acronyms


V looks like it fits this bill even better than Lua. It looks like a very useful and well-designed language, but it is in an extremely rough state, so for now it might be one to just keep an eye on.


Coding in C is like constructing a bomb and defusing it at the same time.

It will explode and kill people eventually. That's a sure thing, no matter how good you are.

The only relevant question then is: are you still around, or did you manage to find some safer workplace far away?


> It will explode and kill people eventually

It will? Some of the stuff in this thread is really over the top.


> I like using C, but I get frustrated every time I hit a wall trying to build features from other languages.

Well there's your problem. Instead of thinking about your problem and devising a C solution, you resort to pounding the square peg into the round hole. Your problem isn't C, it's your proficiency at it. What's worse, instead of correcting said ignorance, your solution is to produce more code using a new language. This only adds to the gross pile of trash that is every language/framework that tries to solve programmer ignorance and laziness with what amounts to wishful thinking. This is why software really sucks nowadays. Everyone wants results without reading the manual or putting in real effort. They want languages with built-in hand-holding and magic inference. That's not how any of this works.

I get it, programming and C suck because they're so damn tedious. I suck at C too. But I actively study books and similar problems people already solved using the language. Programming is very hard. The best programmers I know are also very good with math and puzzles (e.g. Ken Thompson was a chess geek). They're puzzle solvers. They don't cheat and try to smash or duct-tape the pieces together. They thrive on studying strategy, meaning reading books, papers, and other people's code. Learn how to solve the puzzles strategically instead of fighting them.


I shouldn't bite... sigh.

When writing interrupt service routines for serial communications on an obscure architecture with multiple heaps (one 16 bit addressable and one 32 bit addressable), I could have written 2000 lines of assembly. Instead, I used a system of assembly macros and wrote about 300 lines of that.

When writing firmware in a C which lacked decent coroutines, I could have just made a giant while loop and put all of my logic in it. Instead, I wrote a quick implementation of coroutines, a scheduler, and basic async/await using the C preprocessor. This also saved thousands of lines of boilerplate code.

The fact that C has no good method of implementing a generic hashmap has nothing to do with my proficiency in the language. My proficiency in C also has nothing to do with my love of sweet, glorious type inference, or my deep felt appreciation for the conveniences modern IDEs and languages provide. Writing software has never been better!

Maybe I'm misinterpreting, but what I'm hearing is: "You don't need to use Vim to get more done, you just need to get better at typing. Not that I know anything about how fast you type." Or... you know, I could just use Vim.


> I shouldn't bite... sigh.

That is not how your original post read to me. It sounded like you came from the dynamic/scripting language world and are trying to hammer those features into C. You sound quite proficient; I'm not trying to shit on you or anything. I'm tired, it's late, and no one cares, but it sounds like you want something like an mbed that runs Python, or FreeRTOS? Why reinvent the wheel if someone already did this? Or is it simply that you're stuck using C on your "weird" platform? In that case, I'd say have a look at Nim; there's an embedded branch. It's a language that compiles to C89, so perhaps it's buildable on your platform. If it's a personal project then have fun.


Sounds like you rolled your own RTOS. Dang. Very nice!


Psh! I wish!

Implementing coroutines in C is pretty old hat: https://www.chiark.greenend.org.uk/~sgtatham/coroutines.html

If you just put together a linked list of structs which represent the routines, the time they last slept, and what they're waiting on, you can just iterate through it and you've got most of the functionality of async/await. I never bothered to implement separate stacks and register restoration, opting to just use global variables for state that needed to be saved between awaits. It would have been neat to do that, but I think it would have turned an afternoon spent saving time into a multi-day project, and left me permanently paranoid about corrupting the stack or registers. Would have been fun though.
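
For the curious, here is a minimal sketch of the switch-based trick from that article; the macro names and the trivial scheduler loop are made up for illustration, and note that locals are not preserved across yields (persistent state lives in the context struct):

    #include <stdio.h>

    /* Protothread-style coroutine: the saved "state" is just the line number
       we last yielded at; re-entering the function jumps back there via the
       switch. */
    typedef struct { int line; int i; } coro;

    #define CORO_BEGIN(c)  switch ((c)->line) { case 0:
    #define CORO_YIELD(c)  do { (c)->line = __LINE__; return 1; \
                                case __LINE__:; } while (0)
    #define CORO_END(c)    } (c)->line = 0; return 0

    static int count_to_three(coro *c)
    {
        CORO_BEGIN(c);
        for (c->i = 1; c->i <= 3; c->i++) {
            printf("tick %d\n", c->i);
            CORO_YIELD(c);              /* hand control back to the scheduler */
        }
        CORO_END(c);
    }

    int main(void)
    {
        coro c = {0};
        while (count_to_three(&c))      /* trivial scheduler: run until done */
            ;
        return 0;
    }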


> That's not how any of this works.

That's actually how _all_ of this works. If we required everyone to handle the complexity of everything, we wouldn't be able to make technological progress. Yes, there's a tension (there's always a tension) between having a deep understanding of an underlying technology, and therefore being able to use it better, and trying to get to a point where you can get away with not understanding. But it's undeniably considered a success story in technology if the technology gets to a point where people can use it to basically its full potential without needing to understand it.


A lot of people have a misconception that C is close to the way hardware works. The reality is that hardware engineers jump through lots of hoops so that C programmers can keep pretending that they're coding against a really fast PDP-11. Not only that, but these contortions directly lead to vulnerabilities like Meltdown and Spectre because memory representation in C doesn't map well to modern hardware. C family of languages is basically holding us back from utilizing current day hardware safely and efficiently. This [1] is an excellent essay on the topic.

[1] https://queue.acm.org/detail.cfm?id=3212479


That's not quite right.

They jump through hoops so that C programs and the programs of all the other languages designed to run in the same execution model run as fast as possible. It's not to indulge C programmers but to support the vast body of existing software.

Hardware engineers need optimization targets just like anyone else, and it's an eminently reasonable one.

Also, it's not like loads of improvements not related to the C execution model haven't been made. Demonstrably, the C model is not holding us back.

Anyway, if you want to move on from the C execution model, great. Now you need to introduce a practical transition plan, which should include such details as how and why we should rewrite all of the existing performance sensitive software designed to work in the old model.


the C execution model has demonstrably held us back because the practical transition plan for migrating existing C codebases would be automated translation (of the C Turing tarpit) into sane functional/formal models


Demonstrably?


Sure. Foreign function interfaces (FFIs, e.g. Python's ctypes) are probably the best thing to point to as glaring examples of how C holds us back. Once we start there, a spidering network of technologies is highlighted, all surrounding C, from dynamically-linked code objects (DSOs and DLLs) to the required usage of '%' characters for printf-style string formatting via templating.

We need to move beyond memory-unsafe paradigms already. Assembly intrinsics aren't great but they're certainly better than C.


Sorry, who requires % formatting?

Never use it, myself, except configuring the date format in my menu bar.


In a way this already happened, with GPU programming, which has a more realistic model of what modern hardware is (explicit memory hierarchy for example).

Because the speed advantage is so huge, people went through the pain of learning this new model and redesigning algorithms to better fit it.


Modern hardware is multiple different things. GPU programming does not have a realistic model of what CPU hardware is like, because despite CPUs having more cores and SIMD, they are also very cache-oriented and branch-prediction-oriented, while GPUs aren't that at all.


And yet the most common language used for GPU programming is a C++ dialect.


And it took 10 years for NVidia to make it match the ISO C++ memory model introduced in C++11, as per NVidia talk at CppCon 2019.


And the C++11 memory model was added so that the language would map to the execution model of the underlying hardware, not vice versa, refuting the parent.


Not really; C++11 initially got its memory model from the .NET and Java memory models, two mostly hardware-agnostic stacks, and then expanded on top of them.


I'd argue that modern GPU pipelines are quite different from the API programming models they support; tile-based renderers especially do lots of funny things while maintaining the illusion that you are running on the abstract OpenGL pipeline.

So even with GPUs you have a similar situation.


I was talking about stuff like CUDA, not OpenGL.


The computer history graveyard is full of architectures that were supposed to be superior by directly supporting the trending programming paradigm of the day.

The current microarchitectures for general-purpose computing, that is, OoO execution, cache-coherent shared memory with a flat memory model, and mostly transparent and coherent caches, seem to be optimal. In fact, far from being designed around C, C and its siblings had to evolve to support the model well (a memory model, explicit SIMD builtins, support for vectorization and offloading, etc.).

It is entirely possible this is only a local optimum, but I have yet to see a plausible model for a better architecture.

It is more likely that, now that silicon is cheap and CPU designers are struggling to find ways to use it, extensions to support higher-level languages might be added: more fine-grained cache control for message passing, extra tags to help GCs, and more stuff than I can think of.


I’m sympathetic to the argument that C isn’t realistically a “high-level assembler” any more, but if there is low-hanging performance fruit available for a language free to dispense with C’s PDP-11-flavored abstractions, I haven’t seen practical examples of it.

If anything, more modern languages (eg Java) have to contort their data representations to build, say, “structs of arrays” that are highly efficient on modern CPUs (and trivial to express in C!)

I’m very curious to hear of any recent languages that are designed expressly with “mechanical sympathy” in mind.
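
To make the structs-of-arrays point above concrete, here is a minimal sketch of the two layouts in C (field names and sizes are arbitrary):

    #include <stddef.h>

    /* Array of structs (AoS): each particle's fields sit together, which is
       convenient but interleaves data the inner loop may not need. */
    struct particle { float x, y, z, mass; };
    struct particle aos[1024];

    /* Struct of arrays (SoA): each field is contiguous, so a loop touching
       only x streams through memory and vectorizes easily. */
    struct particles {
        float x[1024];
        float y[1024];
        float z[1024];
        float mass[1024];
    };
    struct particles soa;

    /* e.g. summing x coordinates reads one dense array in the SoA layout: */
    static float sum_x(const struct particles *p, size_t n)
    {
        float s = 0.0f;
        for (size_t i = 0; i < n; i++)
            s += p->x[i];
        return s;
    }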


The ISPC language by Intel is designed to give enough information to the compiler so as to allow auto SIMD-vectorization of programs [1]. It opts for an explicitly parallel model of programming, unlike C [2]. It also has easier support for SOA and AOSOA [3].

[1] https://ispc.github.io/

[2] https://ispc.github.io/ispc.html#the-ispc-parallel-execution...

[3] https://ispc.github.io/ispc.html#structure-of-array-types


The trouble is both processor instructions and compilers have been optimized for C (well C and FORTRAN). So no matter how good or bad C is, we're mostly "stuck" with its PDP-11 computing model unless someone does a lot more work than just making a new programming language.


Regarding your SoA example, Jai has explicit support for switching between SoA and AoS without having to rewrite everything: https://github.com/BSVino/JaiPrimer/blob/master/JaiPrimer.md...


I believe jblow removed this in more recent versions.


How are structs of arrays more natural in C than in Java?


Thanks! Comments like these, especially backed by good sources, are why I keep coming back to the often pretty messy comment sections of HN :D


I'm pretty certain the CPU engineers at Intel/AMD/etc. aren't pulling out K&R 1978 and telling themselves this is their new reality. All kinds of new safety, virtualization, and ML instructions have been added to CPUs without consent from the "C language". In fact, it's Google, Amazon, Apple, Microsoft, and Facebook who are controlling CPU roadmaps!


What a load of crap

Without cache coherency and ILP, programming in any language would be insane. It's not like programming in x86_64 assembly suddenly opens a world of possibilities because it's "truly low-level". You gain very little extra control over a given platform by switching to assembly over C; that's what we mean by "low-level".

C maps cleanly onto the instruction sets provided by chip manufacturers. It provides the option to the programmer to optimize structures for use in vectorization if they so choose, or to optimize for some other objective like size for a binary wire protocol or limited memory space.

The very nature of having a choice about memory layout of structures and the ability to cleanly link with the platform ABI is what makes C low-level. Obviously the inner-workings of a modern CPU don't map cleanly to the C virtual machine. However, there's no convincing evidence that greater control over cache invalidation is what's holding back performance of those CPUs.


If finer-grained cache control is needed, then it is no big deal to invoke the appropriate compiler intrinsics (if available) or write your own with inline assembly. My preference is to put these sorts of things in a separate .S file.


> It's not like programming in x86_64 assembly suddenly opens a world of possibilities because its "truly low-level"

It certainly opens a world of possibilities regarding the vector units. That's one place where inner loops hand-written in assembly still have an edge. On the other hand, with contemporary CPUs, high quality assembly coding is a highly specialized job skill by itself, so for general purpose engineers, learning it is most likely to yield a bad cost/benefit.


Right. Exact control of data structure layout is also practically essential when implementing a database system. It is needed for correctness as much as anything else.


The instruction sets provided by chip makers (except vector instructions) are tuned to match what compilers for ancient languages want to emit.

The chips move heaven and earth to maintain the fiction that those instructions actually direct what they do.

The number of fundamentally different kinds of cache, and specialized state machines not directly accessible by instructions, in a modern chip would boggle your mind.


Lose the leading sentence, your comment is great otherwise.


In GPU programming (CUDA for example), every time you allocate memory you need to explicitly specify where (general memory, L3, L2, L1, register). Because C doesn't support this, they added language extensions.

So C is too high level for GPU programming, and needed to be extended.


While I also disagree that C is “holding things back”, I don’t think their comment is a “load of crap”. Talk like that is unhelpful and quite frankly non-technical and unprofessional.


The author seems to trivialize how hard it is to write a compiler that generates efficient code. As if designing a simpler language than C (with less general architecture targets) would solve much.


It's not about designing a simpler language than C, but rather one that matches what hardware is actually doing better. The example in the article is how GPU programming works. To make the most out of modern hardware we need to change the way we program to embrace concurrency, parallelism, and eschew shared global state entirely. The way FP works is much closer to hardware than the imperative style.


I have a hard time figuring out how forbidding concurrent tasks from communicating would necessarily make them faster. Also, how is FP a more accurate model of e.g. AMD64 than the imperative style? How would, e.g., the "Clear Global Interrupt Flag" instruction be modeled in FP with no state?


I don't know if it is a more accurate model of AMD64/x86, but it may be a better match for the underlying silicon (gates, logic circuits). I can see function pipelines and composition of these as in some ways analogous to circuit design. After all, the instruction sets in AMD64 are merely an implementation of these. Pure functions and pipelines could model circuits quite well.


Shared state is not a problem with current hardware; mutable state is also not a problem. The only issue is with shared state that is mutated from multiple CPUs, and that's why the single-writer principle is a fundamental rule of multithreaded programming.


Any idea what a low level language that was designed for today's hardware would actually look like?


GPU shaders are a good example.


LLVM IR, but it’s also not designed for humans to write without tool assistance.


Similar reasons are given for Go's design in its comparison to the C language.

e.g.

>Why is there no pointer arithmetic?

>Safety. Without pointer arithmetic it's possible to create a language that can never derive an illegal address that succeeds incorrectly. Compiler and hardware technology have advanced to the point where a loop using array indices can be as efficient as a loop using pointer arithmetic.[1]

[1]https://golang.org/doc/faq#no_pointer_arithmetic


But there is pointer arithmetic in Go, in the "unsafe" package. Like with generics the Go authors want to decide where and when these are used, but they actually made it possible for mere users of Go to use pointers freely, since it's badly needed from time to time, unlike generics which they hide fully.


>Like with generics the Go authors want to decide where and when these are used,

I wish this meme about Go and generics would die already. C allows you to define an array of int and an array of char and have those be two different types. Go allows you to do this with slices and maps as well, because slices and maps are also built-in types in Go. This is not "generics".


Yes, C has (very restricted) generics too. No, Go does not need to repeat all of C's mistakes.


I think it's more useful to think of "generics" as applying to type systems that allow abstraction over types. C and Go crucially don't allow you to define a function that, say, appends an element of type T to an array of T. That's where things start to get complicated. C and Go don't have a restricted version of that [1] - they just don't have it at all.

[1] With the largely irrelevant exception of the C11 _Generic stuff.


Surely it's better to confine it to a package labeled unsafe that one must import to use (and which, by Go's rules, one can't import and then not use), and to make a point of only using it where necessary, paying particular attention to it in testing and code reviews.


Good comment and article, thanks! One thing only: the phrase "The quest for high ILP was the direct cause of Spectre and Meltdown" is slightly exaggerated. One might say ILP was a remote, indirect cause rather than the direct cause. Blaming it on C and ILP is maybe kind of far-fetched.

That isn't to say that the article doesn't have a point though, far from it.


Isn’t this abstraction presented at the ISA level, not by C? It’s not that C isn’t close to the way hardware works, it’s that the ISA interface presented to C compiler writers isn’t close to how instructions are actually executed.


Original thread about that article in 2018: https://news.ycombinator.com/item?id=16967675


Meltdown and Spectre are hardware bugs.


I am sure this "misconception" is what keeps Python programs so fast and efficient compared to ones written in C.


I've been writing a large app in C again after a few years away. Originally wrote the app in C++ then again in Python and now in straight C. I enjoy the faster compile times of C (over C++) and the better type checking of C (over Python).

However, when I'm carefully using C++, I don't have to fiddle around with memory management and still get fast performance and better type checking.

Yesterday I wrote a Python script to generate C code test cases. String manipulation in Python is very easy.

The right tool for the right job, I suppose.


> Yesterday I wrote a Python script to generate C code test cases.

I do this quite a bit. Basically, anytime there is something very boilerplate-y (like HTTP APIs, configuration management, test cases, etc.) I write metadata in JSON that describes what I'm adding, and then at build time a Python script parses the metadata and generates all of the C or C++, which is then compiled. I even have the Python generating comments in the generated C/C++!

I used to use Jinja2 for generating the C/C++ files, but now I just use f-strings in the latest version of vanilla Python3.


This is what Lisp/Scheme-based languages use macros for: to reduce boilerplate code. A lot of programmers hate DSLs, but using Python to generate test code is the same as using a DSL. The difference is, Python is a much bigger and more complicated DSL than a test macro will probably ever be.

There is nothing wrong with generating code in a pragmatic way. It is just strange to see people reaching for workarounds all the time for things that a tiny DSL could probably solve better.


There's another advantage to using a high-level language like Python though: since the "source" is now JSON metadata which is decoupled from the code, I can use that metadata to generate other interesting things. For example, I can use python-docx to generate a Word document (which my company requires for documentation tracking). Or, if the metadata is describing HTTP APIs, I can generate an OpenAPI document which can be fed into Swagger UI[1].

[1] https://swagger.io/tools/swagger-ui/


This is really interesting — nice work! Have you published any tooling as OSS or examples? If you haven’t seen it before, the Model Oriented Programming work by Pieter Hintjens is worth looking at.


Not yet. Most of the stuff I've done is internal to my company unfortunately. But just to give you an idea, if I were adding a new configuration parameter to our IoT device, I would add an object to a JSON array that looks like this:

    {
        "name": "backoff_mode",
        "group": "telemetry",
        "description": "We don't want to clog up networks with tons of failing attempts to send home telemetry. This parameter configures the device to back off attempts according to the specified mode",
        "cpptype": "std::string",
        "enum": ["none", "linear", "exponential"],
        "default_value": "exponential"
    }
At build time, a few things are generated, including a .h file with the following:

    struct telemetry_config {
        ...
        std::string backoff_mode = "exponential"; //Possible values: ["none", "linear", "exponential"]
    };
The boilerplate for getting and setting this parameter is also generated. Thus, just by adding that simple piece of metadata, all the boilerplate and documentation is generated and you as the developer can focus on the actual logic that needs to be implemented regarding this parameter.


> better type checking of C (over Python).

Note that Python's official type checker (mypy) is pretty good (within the limits of type erasure); I started the project I've been leading at work this past year in statically-typed Python, and it's been a great experience.


I tried to like mypy, really did. But it always ended up in some kind of circular import tailspin for me. C++ gets around the problem by allowing separation of interface and implementation, Haskell by allowing circular imports. Maybe I missed something, maybe it's fixed by now, but that was my experience a couple of years ago.


They made a flag, TYPE_CHECKING, to avoid circular imports in type checking. It's ugly but it works.

https://docs.python.org/3/library/typing.html#typing.TYPE_CH...


Also, circular imports were the least of the problems I've run into with mypy. They still can't model a JSON type because it is recursive.


Recursive types were added in Typescript 3.7. Hopefully Python takes a page from their book.


Can you show example? I believe you can use TypedDict to express that.


It’s roughly

    JSON = Union[Dict[str, JSON], List[JSON], str, bool, int, NoneType]
TypedDict isn’t for arbitrary JSON, but it is really cool.


Ah I see, TypedDict won't be useful here.

Looks like there is a work in progress though[1] and people in the ticket provided some workarounds[2][3] for now.

[1] https://github.com/python/mypy/issues/731

[2] https://github.com/python/mypy/issues/731#issuecomment-53990...

[3] https://gist.github.com/catb0t/bd82f7815b7e95b5dd3c3ad294f3c...


I've played with mypy a little, and it's certainly not bad, but it does suffer by comparison to typescript, which just has a lot of really great structural typing features that mypy still lacks.


mypy recently got structural typing. They call it Protocols, which isn't the first thing one would put into Google when looking for this feature. It's not as powerful and terse as TypeScript, unfortunately.


They also added TypedDict recently https://mypy.readthedocs.io/en/latest/more_types.html


Yeah, I didn't mean to imply no features, just far less than TS.


I haven't tried mypy yet but want to get to it. It sounds very interesting! After using Python so much the last several years, I've gotten more curmudgeonly about strong type checking in my programming languages.


Don't wait for a new project, just start using it (it is most beneficial if you specify types for parameters and output for every function). If you use an IDE that understands it, like PyCharm, suddenly the autocomplete will start working right, so will refactoring, and it will start highlighting potential errors.


IME, 'C with classes', or C++ but skipping the more esoteric features (OOP overuse, template metaprogramming), yields a fairly easy-to-understand language for most C folks. I'm a C guy and am most comfortable with it, but I like having the advanced data structures out of the box. Granted, most C codebases of any reasonable complexity have their own support for relevant data structures, but they're a lot cleaner to use via the STL.

Python is a great tool for a certain class of jobs. I saw a company doing systems programming in Python and originally thought it was a great idea, and then I learned of the hidden (or not) dangers of Python.

Yes, you definitely need to pick the right tool.


Here is AzPainter, an obscure image editor for painters with PSD-format support and a modern UI, written in pure C with its own GUI toolkit on top of X11.[0]

A few years ago it was fully rewritten from C++ to C.

[0] https://github.com/Symbian9/azpainter


Thanks for the link! I just installed it on my i5 ThinkPad. I can't test filters and more complex stuff right now, but it loads super fast, like 150 ms uncached, and probably less than 50 ms when cached. I'm curious if there is other software around rewritten in pure C. It would likely be a huge task not worth the effort, but I can't help but imagine the speed and size gains from rewriting, say, Firefox or LibreOffice.


> I'm curious if there is other software around rewritten in pure C.

Can't answer for now, but check these awesome lists:

AWESOME C: https://github.com/Bfgeshka/awesome-c

AWESOME C: https://github.com/kozross/awesome-c | mirror: https://notabug.org/koz.ross/awesome-c

AWESOME C: https://github.com/aleksandar-todorovic/awesome-c

AWESOME C++: https://github.com/fffaraz/awesome-cpp


I think what squarefoot is looking for is lightweight (tiny?) C apps that have limited abstraction layers. For example, Gtk+, which is pure C, is no longer considered lightweight. It was with Gtk+ 1.2 because GTK was just a small wrapper around X11 in the 1.x days. That is what AzPainter is. AzPainter basically depends on xlib, xft/freetype, and some image libraries (libpng, libjpeg).

Modern C apps with a GUI typically build on GTK, but that toolkit has become massive. Modern GTK3 apps are built on top of dbus, pango, atk, and cairo. This is not necessarily a bad thing, though, and libraries like ATK provide accessibility, which is important. When GTK+ was created, the competing toolkit under Unix was Motif, and it had the nickname "bloatif". Now we've reached a point where GTK3 is much, much larger and more "bloated" than a 2019 Motif application.

If we include C++, the Qt and wxWidgets toolkits are also massive. FLTK is still a lightweight option though.


"I think what squarefoot is looking for is lightweight (tiny?) C apps that have limited abstraction layers. For example, Gtk+, which is pure C, is no longer considered lightweight."

Yes, I also meant the UI, which in this one is amazingly fast. I tried a few filters and they seem fast too, but being no graphics expert I have no way to judge. But the UI is incredible; same speed on my home PC, which has a mechanical disk. Making an external library of its GUI primitives and functions could be an interesting project, to be used on small embedded boards.


> Making an external library of its gui primitives and functions could be an interesting project to be used on small embedded boards.

I think AzPainter itself could already be used on small embedded boards ;)

For source code of AzPainter's `mlib` toolkit look here:

- https://github.com/Symbian9/azpainter/tree/master/mlib

Since AzPainter v2.1.3, the `mlib` sources shipped with AzPainter are licensed under GPLv3 terms, but there is the AzPainter theme editor (an older minimal `mlib` demo app) whose sources are licensed under BSD terms:

- http://azsky2.html.xdomain.jp/linux/mthemeeditor.html

Also, there are a few other apps based on the `mlib` toolkit:

- http://azsky2.html.xdomain.jp/linux/azpainterb.html

- http://azsky2.html.xdomain.jp/linux/azcomicv.html

- http://azsky2.html.xdomain.jp/linux/aobook.html


Funny, I do that in Perl, and some people freak out about having code in an unfashionable language generating code in another language.

It's just equivalent to a preprocessor.


In the mid-90s, before mod_perl, I used to generate PHP in a minutely cron job using a Perl script, from a constantly changing database. That came close to the best of both worlds at the time: the flexibility of Perl, fed from a database, and no CGI overhead when serving.


Code generation usually doesn't get you any support from standard tooling. My autocomplete can see through macros, but it has no idea what the C means if I'm working with Perl.


Code generation usually sucks.

A small one-off run under your control is an exception. But people learn to fear the thing, and apply that fear everywhere.


> Code generation usually sucks.

> A small one-off run under your control is an exception. But people learn to fear the thing, and apply that fear everywhere.

Why does it usually suck? Writing code that writes code seems like an optimization that would bolster productivity.


> Why does it usually suck?

One large factor is that when somebody really dislikes the result of a code generator, they most commonly cope by editing the generated code by hand. With other metaprogramming techniques people can't do the same, so they fix the issues.

The end result is that code generators are usually a set of mostly functioning tools that create bad code that is almost, but not quite, impossible to understand, yet that you will be required to change at some point.


Most of the code generation code I’ve seen is itself opaque and generates opaque code. This makes refactoring, extending and bug fixing hard.


I honestly don't like C++ at all as a programmer; C is way better for hacking.

For example: I understand why the standard was written the way it was, but syntactic closures in C++ are way worse than GNU C's.


Don't GNU C's closures/nested functions require an executable stack or something?

I remember GNU GRUB going through and getting rid of all of them.


The implementation uses an executable stack, but they could do it differently if they cared.

Eg, https://news.ycombinator.com/item?id=21635551 with alternating ro+x and rw-x pages holding thunks and fptr+data pairs.


Yes, because that’s where they put trampolines that are generated on the fly. (They don’t need to do this, but that’s how it’s currently implemented.)


I use them. The executable stack bit doesn't bother me since I'm on an ARM Cortex. Would be nice if they fixed that because they are kinda handy once you get used to them.


> The executable stack bit doesn't bother me since I'm on an ARM Cortex.

I'm not familiar enough with ARM - what does ARM Cortex give you here?


It's what it doesn't give me. It doesn't have a memory management unit, so there's no stack execution protection, and so no protection against classic stack smashing attacks.

I saw a talk about weird machines. The upshot of it is that if you can clobber the return address on the stack, you can usually create a weird machine that you can then program[1]. Leads me to believe that if you are putting untrusted data and return addresses on the same stack, you're in a state of sin security-wise.

https://en.wikipedia.org/wiki/Weird_machine

[1] Also saw an exploit on an embedded system where they leveraged the fact that even though the memory was locked, you could set the program counter and read/write registers via JTAG. Using that they were able in short order to recover the device's security keys and re-flash it with their own code.
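
To make the stack-smashing pattern concrete, a minimal sketch of the vulnerable shape (purely illustrative, with made-up names; not an actual exploit):

    #include <stdio.h>

    static void handle_line(void) {
        char buf[16];
        /* No bounds check: a line longer than 15 characters runs past `buf`
           and clobbers whatever sits above it on the stack, including the
           saved return address; that is exactly the foothold a "weird
           machine" needs. */
        scanf("%s", buf);
        printf("got: %s\n", buf);
        /* A bounded read, e.g. scanf("%15s", buf) or fgets(buf, sizeof buf,
           stdin), avoids the overflow. */
    }

    int main(void) {
        handle_line();
        return 0;
    }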


What are syntactic closures in GNU C?


This [1] explains how to create them using nested functions. Near the end is a quick blurb about the implementation, which is probably why C++ doesn't do it that way.

[1] https://gcc.gnu.org/onlinedocs/gcc/Nested-Functions.html
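
To make it concrete, a minimal sketch of a nested function used as a closure (illustrative names of my own, not from the GCC docs; this is a GCC-only extension):

    #include <stdio.h>

    /* An ordinary higher-order function that only accepts a plain
       function pointer, with no separate context argument. */
    static void for_each(const int *xs, int n, void (*fn)(int)) {
        for (int i = 0; i < n; i++)
            fn(xs[i]);
    }

    int main(void) {
        int xs[] = { 1, 2, 3, 4 };
        int sum = 0;

        /* GNU C nested function: it can read and write `sum` in the
           enclosing frame. Taking its address makes GCC emit a small
           trampoline on the stack, which is why the stack has to be
           executable. */
        void add(int x) { sum += x; }

        for_each(xs, 4, add);
        printf("sum = %d\n", sum);   /* prints sum = 10 */
        return 0;
    }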


Oh no, nested functions are a terrible, insecure hack that are far too verbose to be ergonomic closures :(


Of course, that’s why I said it’s good for hacking.

Although I’m not sure I agree about the “ergonomic” thing.


Wouldn't you just use a lambda nowadays?


Not in C you wouldn't.


You can with Clang’s blocks extension: https://en.wikipedia.org/wiki/Blocks_(C_language_extension)
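
A minimal sketch of what that looks like (assumes Clang with -fblocks, plus -lBlocksRuntime on non-Apple platforms; the names are illustrative):

    #include <stdio.h>

    int main(void) {
        int base = 10;

        /* A block literal; `base` is captured by value at this point. */
        int (^add)(int) = ^(int x) { return base + x; };

        printf("%d\n", add(5));   /* prints 15 */
        return 0;
    }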


Sure, but the parent post was saying they didn't like nested function handling in C++.


Nested functions don't exist in C++ (except for lambdas).


Agreed - this feature seems useless in modern C++


I don’t like C++ anymore. I used to love it but now it’s just a bloated mess of a language. I get it, you don’t have to use every part of the language but I really don’t like any of the newer bits and that’s what everyone wants to use.


I can't imagine writing C++ without lambdas and auto


Yes, generic lambdas in C++14 made coding C++ fun. The ranges library in C++20 will do it again.


Have you considered Nim?

Much better type system than C, syntax very similar to Python, excellent compile times, and speed comparable to C++.


How do you generate the test cases? Check the types being fed in and fed out, then generate random input/output scenarios?


It's pretty simple. I'm decoding a structure into bit fields (an 802.11 Information Element). I have a hex dump that I decode into two structures with almost 100 fields (eyeroll at the 802.11 committee). Each field can only be 0, 1, or sometimes up to 15.

I have a list of the structure members. I generate an assert for each member being a certain value. The Python script builds and runs the C code. On failure, I parse the assertion failure, get the actual value, change the C assert string, and rebuild/rerun. This continues until the program succeeds.

I've visually verified the decode is correct once. I want to keep the decoder tested if I change the code again, so I'm generating complete test coverage.
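
Roughly, the generated C ends up looking like the sketch below (made-up field names and values, not the real 802.11 structures):

    #include <assert.h>
    #include <stdint.h>

    /* Fictional stand-in for a tiny slice of an 802.11 Information Element. */
    struct ht_cap {
        unsigned ldpc  : 1;
        unsigned width : 1;
        unsigned smps  : 2;
    };

    static struct ht_cap decode(const uint8_t *buf) {
        struct ht_cap c;
        c.ldpc  = (buf[0] >> 0) & 0x1;
        c.width = (buf[0] >> 1) & 0x1;
        c.smps  = (buf[0] >> 2) & 0x3;
        return c;
    }

    int main(void) {
        const uint8_t hexdump[] = { 0x2d };   /* made-up capture byte */
        struct ht_cap ie = decode(hexdump);

        /* One generated assert per member; on failure the Python script
           parses the message, patches the expected value, and reruns. */
        assert(ie.ldpc  == 1);
        assert(ie.width == 0);
        assert(ie.smps  == 3);
        return 0;
    }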


It sounds like your use case is property-based testing. Have you tried any of the QuickCheck ports to C++? My favorite is RapidCheck (https://github.com/emil-e/rapidcheck). You define generators for your struct fields and write a pure function to check whether the parsed output conforms to your expectations.


Checking that out now! Reminds me a little of the Python library Hypothesis. https://hypothesis.readthedocs.io/en/latest/


C has static typing but Python has strong (safe) typing. Many people would contest the claim that C is better here.


Strong typing is of little practical help when you're trying to read a code base with zero type annotations.

Static typing is a superior approach that makes maintainability and refactoring orders of magnitude easier.


>100x advantage in those areas does sound pretty attractive! Maybe my memories of using C are faded. If you have references, I'd be interested.


I enjoy using C++ from Python through Cython.

This way you can have the speed and memory tightness of C++ where it matters, but do the other stuff, like initial data parsing and munging or overall control and sequencing, in Python, thus avoiding the general clumsiness and unproductiveness of C++.


In August this year I started a new project. In C. Why? Because I'm quite familiar with it and because it allowed me to focus on the problem rather than on the language that I was writing in (the problem is - for me at least - a very hard one). The codebase has steadily grown, at some point it was 7K lines, then I cleaned it up and refactored it back down to about 5500 right now. At some future date it will be - hopefully - finished and I may use it to power applications and/or services. It's super efficient both memory and speed wise compared to what the competition is doing but that's just a side effect. That side effect does allow for some applications that would otherwise be impossible.

The code takes one file and outputs another, it's a classical 'unix filter' and it relies on a couple of very high performance libraries that others provided, battle tested and ready to be plugged in. As the project matures I come across the occasional wart in the language, something that I know I could have done more elegantly in a language with richer constructs (lists, list comprehensions, that sort of thing).

What surprised me most after working on this project for a couple of months, though, is how well the language fits the problem; that's something I did not really think would be the case when I started out. But as time passes and things 'find their place' it is indeed just like the author writes: I - still - love coding in C. Even if I know there are better alternatives out there and I should probably get invested in one of them one of these days, at the 75% mark or so of my career I'm still very happy that I picked C at the start, and never once did I imagine that it would still serve me this well 37 years later.

Most of the other 'hot' languages of the day have been lost in the meantime, and yet C is still here, and likely will still be here for years to come. To me it's like an old knife. Not pretty, known to hurt you if you abuse it but still plenty sharp and well capable of doing the job if you treat it well.


Experience shows that almost every time someone writes a project like this in C, applying modern fuzzing tools (e.g. AFL) will uncover bugs that let a maliciously crafted input file hijack execution and act with the privileges of whatever's running the code. It is a lot of work to remove all such bugs (and not reintroduce any) and you never know when you're done. That is a very severe problem with C that is having devastating effects on society.

It is therefore irresponsible to use C for a new project like this, unless you know your code will never be exposed to malicious input. (Note that if you share the code with anyone else, it's hard to have such confidence.)


> It is therefore irresponsible to use C for a new project like this

So, given all this doom that hangs above my head, what should I have used?


C++ is easy to transition to.

The powerful libraries growing up around it make dropping down to the dangerous level mentioned largely unnecessary, without giving up performance.


Every time you use references in C++ you are "dropping down to the dangerous level".


Passing things by reference is safe. Any other use is mostly just pointless; that's what reference types are for. If you are doing something else with them, it's easy to stop.


Passing parameters by reference is certainly not always safe. You can, for example, pass a reference to a deref'ed unique_ptr into a function call, while the function call does something that clears the unique_ptr.

If you want to avoid objects containing references (or raw pointers), then you have to avoid using lots of standard library stuff --- `string_view`, C++20 `span`, or even STL iterators.


Now you are talking about pointers.


There isn't much difference between references and pointers. That is the problem.


A reference only ever points at one thing, from birth to death, and it always points at something.


No, it is entirely possible to delete the underlying object so a reference is left dangling. See https://github.com/isocpp/CppCoreGuidelines/issues/1038 for example.


You wrote "Even if I know there are better alternatives out there"...so you tell us!

You also wrote: "Not pretty, known to hurt you if you abuse it but still plenty sharp and well capable of doing the job if you treat it well."

To me it seems you know it can be dangerous, but like it too much not to use it.


> Even if I know there are better alternatives out there

I could have chosen Java, Python, Go or C++ instead. Instead I chose the tool I'm most familiar with because that will get me focused on the problem rather than on the tool. It's a disadvantage of getting older: there is less time left to waste.

> To me it seems you know it can be dangerous, but like it too much not to use it.

It is mostly a familiarity thing, not a like or dislike.


I don't know all your requirements so it's hard to say.

It is very likely that Rust would be a good choice, especially if the libraries you used have Rust equivalents.


Isn't Rust right now (1) still a niche language (2) terribly slow to compile (3) a bad fit in the first place since I don't actually know the language and (4) still immature?

As for libraries, the most important ones that I rely on deal with the loading of .wav files, writing of midi files, decoding of mp3s and fftw ( http://www.fftw.org/ ).


1) Define "niche language". Lots of people are using Rust in production. Rust isn't in the top 10 most popular languages, but it is in the top 100, and rising. 2) Compilation speed is a problem for some projects, but not for small projects like yours. 3) Understandable, but writing exploitable code because you don't know a suitable alternative language very well is not a good long-term situation. Maybe you should do something about that. 4) Define "immature". It's not one of the world's most mature languages, but it is definitely mature enough to write production code for many contexts. In many ways C is the language that needs to grow up.

There are Rust libraries for all those things (including Rust bindings for FFTW itself), though I can't speak to their quality. I can confidently say that using Rust libraries in a new Rust project is a lot easier than using C libraries in a new C project, especially if you're not on Linux.


> Lots of people are using Rust in production. Rust isn't in the top 10 most popular languages, but it is in the top 100, and rising.

I've yet to come across anybody outside of the HN crowd that knows about or uses Rust. Not a single company that I looked at in the last year used Rust; that's 41 of them, and I asked every one of them which programming languages are in use.

> Compilation speed is a problem for some projects, but not for small projects like yours

ECT cycle (edit, compile, test) is a pretty big factor during early stage development. Running a test in a second versus running a test in ten seconds would ruin my day. I'm not sure what the current speed of compilation is for Rust but the only time that I looked at it (very early on) it was so slow as to be unusable from my perspective. I'd hope that has substantially improved. Think of what I'm doing as explorative programming, I'm both trying to understand the problem and trying to solve it at once.

> Understandable, but writing exploitable code because you don't know a suitable alternative language very well is not a good long-term situation.

That may be true. At the same time, Rust may not be a good long-term solution either; languages come and languages go, and before I invest a year or so into a new ecosystem I'd like to see that it has staying power. It is interesting that you recommended Rust and not say Java which has pretty good performance, is memory safe and is in production for well over a decade, for this particular use case I would choose that over Rust.

> Define "immature". It's not one of the world's most mature languages, but it is definitely mature enough to write production code for many contexts.

There isn't a month where we don't see an announcement of the next point release of Rust on HN. I don't have the luxury of tracking a moving target next to working a full-time job and having a family, then doing this hobby project besides. If Rust wants to see mainstream adoption for stuff like this because you seriously believe that writing a new project in C is irresponsible (which smacks of FUD, and is one of the reasons why I think Rust is one of the more toxic language communities on the internet, I have never seen a Java proponent use that kind of language), then I hope you agree that getting Rust to some kind of long-term stable form in the very near future is a must. This whole 'the sky is falling' tactic is rather off-putting, at least to me. The same happened to Perl, which was supposed to be - and still is according to some - the best thing since sliced bread.

> There are Rust libraries for all those things (including Rust bindings for FFTW itself), though I can't speak to their quality.

That's good to hear.

> I can confidently say that using Rust libraries in a new Rust project is a lot easier than using C libraries in a new C project

For you, as a Rust user, yes. But you are forgetting that for me that is not at all easier, and that using C libraries in a new project is a lot easier than using Rust libraries for me because I happen to know how that works.

Vantage point is a bit of an issue here: I take it that you are fluent in both Rust and C and that you have decided to hitch your wagon to Rust. I'm fine with that, and no doubt in the long run you will be proven right, but to me it smacks of 'you should do as I do' rather than that it is the best for me.

Learning a new language just for some project adds a very high degree of friction; it would slow me down tremendously and might lead to the project being abandoned rather than progressing at a quite acceptable speed, given my constraints in time.

> especially if you're not on Linux.

I wouldn't dream of developing software on anything else, but I totally respect other people's choices in their platforms.


> I've yet to come across anybody outside of the HN crowd that knows about or uses Rust.

Interesting. High school students and university students that my kids know are using Rust (in New Zealand, not some high-tech mecca).

> Think of what I'm doing as explorative programming, I'm both trying to understand the problem and trying to solve it at once.

OK. I might recommend Julia then.

> It is interesting that you recommended Rust and not say Java which has pretty good performance, is memory safe and is in production for well over a decade, for this particular use case I would choose that over Rust.

Sure, Java sounds like a fine choice. I didn't know much about your requirements when I suggested Rust. People often choose C because of strict performance or deployment constraints, which Java often can't meet, so Rust is usually the safer recommendation.

> There isn't a month where we don't see an announcement of the next point release of Rust on HN.

That's because they release every six weeks. That doesn't mean the language is unstable or you will keep having to make changes to your code. It means there's a steady flow of incremental improvements that preserve backwards compatibility.

> you seriously believe that writing a new project in C is irresponsible (which smacks of FUD, and is one of the reasons why I think Rust is one of the more toxic language communities on the internet, I have never seen a Java proponent use that kind of language)

In hindsight I shouldn't have mentioned Rust at all. I do sincerely believe that propagating C code is irresponsible and the software industry needs to recognize this. It sounds harsh, but I think that's partly because software developers have historically taken too lightly the consequences of their choices (even for hobby projects).

People like me who are zealous about the industry moving away from unsafe code tend to be big Rust fans, because Rust finally makes that possible for the systems/embedded domain where C/C++ were for such a long time the only viable option. I guess we have skewed the Rust community to some extent. Mea culpa.

> it smacks of 'you should do as I do' rather than that it is the best for me.

When we write code and distribute it, we have an effect on the world, and then I think it behoves us to consider more than just "what is the best for me".

For hobby projects where the code is never distributed, these issues are mostly moot ... though it's amazing how often such projects escape sooner or later.


> I do sincerely believe that propagating C code is irresponsible and the software industry needs to recognize this.

I believe that you are sincere in this. At the same time I ask you to recognize that the way you - and many other proponents of 'safe' languages (safe in quotes because computer programming will never be safe) - approach this is combative and therefore ultimately unproductive.

The old proverb says that you will catch more flies with honey than with vinegar, and headbutting with people and calling them irresponsible is not going to get you where you want to be. It will have the exact opposite effect: people will dig in and their resolve will strengthen rather than weaken. This is because people tend to be invested in their tools and their creations. Going for a full-on frontal confrontation about this is counterproductive; that's just human psychology, which in cases like these is as important - if not more important - than having a technological edge or being right.

Zealotry is the last thing you want to take with you in your toolbox if you want to make change.


That's a good point, but it assumes the only audience for this conversation is developers already emotionally attached to C, which I don't think is true.


Check the title.


> Rust finally makes that possible for the systems/embedded domain where C/C++ were for such a long time the only viable option

To be fair, other languages were there first; they just lost due to UNIX uptake across the industry.

Had any of those languages been more successful in gaining adoption across mainstream OSes, a language like Rust wouldn't be required.


Not clear what "other languages" you mean here.

GC is a problem for OS/embedded programming. And before Rust, safety required GC (or an unpalatably restrictive programming model like MISRA). Even if Symbolics or the C# Windows Vista work had been more successful, Rust would still be needed.


Rust or modern C++, although the abstractions on the latter are leaky


I love the (seeming) resurgence of C, Rust, and C++ articles on HN. Not sure if it's robotics, IoT, webdevs longing for simpler and stricter frameworks or what. But I love it.


For me, it's embedded. So I use C, mainly because, as a professional company releasing a real-world product, my only other choice is C++, and I haven't found a "need" for it yet.


The best language is usually the one you know!


Just like the best camera is the one you have.


Can't tell if sarcastic. I'm not a professional photographer, but for me picking up a better camera takes a couple of seconds... :)


I think we also choose our favorite programming languages based on our psychology, not only the technical features.

The ones who like C and C++ are like Toretto from the Fast and Furious franchise choosing a classic muscle car for a race. They are super fast and give you many more options, but because of that, they are not the safest.

A muscle car requires more skill to drive and win with than a modern fast-shiny-and-safe car. But because you know it is more dangerous to drive, it is likely that you will try to improve your skills to make an accident less likely.

That's one of the reasons why I'm still skeptical about jumping on the Rust bandwagon: once you turn into a good muscle car driver, it's hard to choose the safer but more limiting option, but you totally get why someone without experience in faster cars would choose the safer version.

