C development, like most languages, isn't really "taught" at all. You might have some teaching on the structure of the language and a basic getting started course on main(), but that's just the beginning. Everything from there up seems to be self-taught, either from the internet, code examples, or working it out from first principles.
There are a lot more risks in C than more recent languages. Koenig's book "C Traps And Pitfalls" https://www.amazon.co.uk/C-Traps-Pitfalls-Andrew-Koenig/dp/0... was extremely useful to me when I was learning the language (available at my local library!). I don't know if there's a more modern version that takes into account recent changes to the C standard.
I note that a lot of the time in embedded work you don't get to use the latest version of the standard because you're using a weird or obsolete toolchain that doesn't support it. Or architectures like PIC, where your stack is hardware limited to 8 levels and indirect addressing is inefficient.
I think a lot of the "pure C" people are disguised as electronics engineers - doing PCB design and microcontroller programming together.
>>Everything from there up seems to be self-taught, either from the internet, code examples, or working it out from first principles.
But to be fair, can't that be said that nearly everything is self-taught? From Physics, to mathematics, to music performance and theory, to learning to ride a bicycle or drive a car -- we can read but it takes some time to internalize the structure of each discipline and "own" it.
>>in embedded work you don't get to use the latest version
I work in C77, and I'm reminded of how far C (and C++) has come each time I have to #define that fancy new-fangled "boolean" type.
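For anyone who hasn't had the pleasure, the workaround looks something like the sketch below (one common pattern, not my exact code; with C99 or later you'd just #include <stdbool.h>):

    /* Pre-C99 toolchain: no built-in boolean, so define one yourself.
       Some shops use "#define bool int" instead of the enum. */
    typedef enum { false = 0, true = 1 } bool;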
>>I think a lot of the "pure C" people are disguised as electronics engineers...
Agreed, this is exactly right. Most if not all of the pure-C world is due to embedded micros with strict limitations (<1k RAM, etc.).
It looks like the author removed the bit about C77 from their post.
Although ANSI C wasn't a thing until 1989, the language had been around since 1972.
The original K&R book was published in 1978, and if you read it, you'll see C that's noticeably different from C89 or newer. 'C77' was probably just a fun bit of shorthand pjc50 used to describe old, K&R style C. Or at least C that's missing some ANSI features.
>C development, like most languages, isn't really "taught" at all. You might have some teaching on the structure of the language and a basic getting started course on main(), but that's just the beginning. Everything from there up seems to be self-taught, either from the internet, code examples, or working it out from first principles.
If there's one thing my CS program did correctly, it was having a two semester course (sort of, officially they were two separate courses) that covered an intro to C programming but also C in terms of machine architecture, down to learning to disassemble C or assemble it from assembly, etc.
I think it was fairly valuable at the time to learn how and why C is important/forms the lowest layer for 'modern' languages.
Can you recommend any books for the more advance uses you stated?
I’m currently going through MOOCs and most courses seem to focus on the language itself as opposed to why the language is useful.
C is a weird language, in the job market. It's at the core of all our electronic infrastructure, it's important to learn it if you want to be a serious developer, yet nowadays you can hardly find a job that is "pure C".
So, in universities, you have either students who don't see the point, because they'll do Java or PHP or JS at their daily jobs, or students who get a little too excited by it, almost specialize in it, and can hardly find a job that fulfills their excitement.
Disagree, the embedded world is still firmly rooted in C, and this world is larger than you realise.
Consider that every electronic device you have has multiple microcontrollers inside running C code.
I would argue that if you want to get your code into the most people's hands, you have to know C. Just looking around the (engineering) office I'm in at the devices using C:
- My PC.
- The router(s)
- The printer(s)
- the USB hubs.
- the mobiles and tablets
- the lights (yes).
- the A/C.
- The door entry system
- the microwave
- the desk phones.
- the monitors on my desk.
- The dishwasher
- The coffee machine
- The clock
The C world is much larger than most web people realise.
I agree, there is more C code used in everyday life than anything else. But regarding job offers there's a bias, because this code tends to be very stable. The code in your microwave was written 10 years ago and hasn't changed since. OTOH, even a rather stable webapp like HN has changed a lot in the meantime. Maybe the guy who worked on your microwave's code moved on to several other projects and produced code for dozens of objects in your office, while the guy working on your company's website has been working full time on it since its creation.
Web stuff is maintenance heavy, so even a small webapp requires a lot of human work to keep it working in the long run. Add to that the fact that most companies, even small ones, need a website that is more than a single static page, and/or a way to manage their customer/supplier databases, while there are rather few companies producing electronic appliances. Unless you live in the right place, most programming jobs you'll find are in high-level stuff.
C is much bigger than just the “embedded” world/microcontrollers. iOS and Android are still C from the kernel up through the window system/graphics layer. The router software at the core of the Internet is written in C. Vulkan or whatever underlies your game console is in C, as is the GPU driver. (The game itself is probably C++.)
IOKit and all drivers, which are most of the iOS kernel by lines of code, are C++. The windowing system of iOS and modern macOS, Core Animation, is a hybrid of Objective-C and C++. As pointed out by the sibling comment, core Android—everything Google wrote except kernel modules—is basically all C++. Vulkan is a C API (though might have a C++ implementation), but Metal is all Objective-C (with internals written in C++, I think?), and Direct3D is all COM-flavored C++.
> - The router(s)
> - The printer(s)
> - the mobiles and tablets
> ...
All of those do contain some C, but the ratio of it to all the JS/Java/Python/Swift etc. on top of that (because this router has a web UI, a mobile app and so on) is growing.
> The C world is much larger than most web people realise.
Perhaps. But it was a lot larger 10 or 20 years ago, and C was wiped out as "the language" from many domains it used to occupy.
I'm afraid the embedded world now is probably 10 to 20 times the size it was then... the number of devices incorporating electronics and therefore software has skyrocketed.
A lot of low level embedded programming is effectively "pure C". I think the problem is that these aren't usually entry-level jobs for people without experience. It doesn't help that schools prefer to put the emphasis on languages like PHP or JS because they're easier and almost guarantee that people will find a job immediately regardless of their background. It's just the path of least resistance.
But as somebody who mostly writes C for a living and specializes in embedded programming I can assure you that there's no shortage of employment offers.
I didn't mean it in the sense "C developers can't find a job", but in the sense that professional C programming is a niche; low supply, low demand. When you're a freshly graduated student and there are 10 companies within 30 km ready to recruit you for C#/JS stuff, and none for C stuff, you tend to think "there is no job in C", even if there are 2 or 3 companies working in the embedded field that were ready to recruit you, with an even better salary, but 60 km away.
No, you're right, but from my point of view (I've been a professional C coder for about 13 years) it's not so much that there aren't many C jobs out there, it's more that there's an avalanche of "C#/JS stuff". In absolute numbers I would be surprised if there was a significant reduction in the number of C jobs available over the past decade.
There's also the fact that there's comparatively little innovation and hype in the C sphere of influence. C rarely makes the headlines these days while every other week you have a fancy new JS framework.
There is also location to keep in mind. The defense and aerospace industry is concentrated in a few hubs, which should offer a stable supply of C jobs.
Embedded in general and IoT in particular are big these days and even though modern "controllers" are perfectly able to run scripting languages there's still a lot of low level C going on. If you need to write a device driver you're probably not going to do it in Javascript. You can argue that it's a niche but it's a rather large one.
> these aren't usually entry-level jobs for people without experience
This seems true in any software engineering job - but at the same time, open source or personal projects are experience too. I would be very surprised to find a junior developer CV that didn't mention any projects like that.
I certainly had a lot of trouble finding my first real coding job, as someone who's mainly into C. The relative quantity of web jobs is frustratingly high, and while a few of these openly admit willingness to hire people without prior experience in their tech stack, the handful of C jobs were all for seniors with years of experience plus specific experience in things like RTOSes or Linux kernel drivers.
I think your parent was saying that people using the term "C/C++" are often muddy on the considerable differences between these two languages, and that it is a red flag in a job description if the company cannot even manage to properly name the language they use.
If you are pedantic enough that you get irritated by "C/C++", which clearly denotes that they understand they are different languages, and only implies they may disagree with you on how much crossover knowledge there is, you probably won't be happy anywhere on earth except maybe the linux kernel team.
Not parent but I tend to think there are broadly 3 "languages", with 3 very different cultures and mindset:
- C, which is, well, pure, raw C,
- C++, which tends to gravitate toward modern C++ (C++11 and later), and focuses on avoiding C constructs (except when needed, eg for legacy reasons),
- the so-called C/C++, which is C with a small subset of C++98 constructs, mainly classes, the public/private distinction, the string type, references, and not much more
For a purist (either a C purist or a C++ purist), the third category feels like losing most advantages of the original language (simplicity and orthogonality of C, relative safety and expressiveness of C++) and getting drawbacks of both (name mangling and dangling pointers, both at the same time, yay!)
For me, it would not be a red flag, but definitely a yellow one.
Back when I was reading comp.lang.c, people would often wander by to ask for help with their "C/C++" programs, which typically turned out to be C++. Such people were usually surprised when it was pointed out to them that they were in a C newsgroup and that that is a different language from C++.
The confusion was widespread. It may have changed, but I wouldn't really expect that to be the case.
But surely that is a function of the recruiter and not a function of the employer. Most recruiters are clueless about the technology they are recruiting for.
Not in my experience. Recruiters may be clueless, but they will only pass on what was given to them. Companies with well-defined requirements will either make sure they have recruiters who understand what they require or copy it precisely, or don't use recruiters as much and rely on communicating with developers directly. I had a way better experience with those who say "you will be working on a project involving mainly X and maybe some Y" than with ads stating "we are looking for C, Scala, CSS etc.". Some of course say "we have a range of projects and we want polyglots", but if they don't, it's definitely a lack of care and understanding of technology.
> You're looking forward to development tools that have neither refactoring nor auto completion.
On the contrary, I have found C and C++ to have the best refactoring and autocompletion tools of most of the languages I have tried, since usually the compiler is involved at some point and you can make non-trivial guarantees about identifiers based on types.
That's part of the fun of it for me. Getting close to the metal, having to worry about electronics, debugging with an oscilloscope or, if you're lucky, a UART etc...
Regarding development tools that's a bit orthogonal, you can code for your 8bit AVR controller with Visual Studio if you really want to. Personally I don't care much for full blown IDEs and automatic refactoring so I use Emacs with dumb completion and cscope/ctags and I'm good to go but to each their own.
If you like quick prototyping and are used to develop iteratively through heavy use of the debugger I can see how it could be incredibly frustrating though. It's a different mindset and a different approach. Still, as far as I'm concerned I wouldn't trade it for all the JS jobs in the world.
The electronics and the oscilloscope are the fun part, but that doesn't help with software development.
For some microcontrollers and FPGAs I had to work with, there is hardly any API documentation. It could take a whole week just to figure out how to blink an LED. It's frustrating. There is zero help available on the internet because nobody uses that part. It comes with its own toolchain and there isn't the option to use something else, like CLion or Visual Studio.
I'm supposed to make some home automation devices, or an inertial navigation system, or god knows what. It's enough work on its own, I don't have the time or energy to fight broken tools.
By comparison, android development is a wonder. Development tools that work, long form documentation of every single API and guides and examples. Just plug the USB port and the program loads and it shows debugging output and you can debug line by line and use breakpoints.
Having used better development tools, I just have higher standards. I can't tolerate embedded anymore. I hope there are some working tools for some chips from some manufacturers nowadays, but I wouldn't know.
The limitations of low level debugging are often just that: hardware limitations. The scenario of debugging Android applications is really stretching the limits of what I would consider "embedded development"; it's very high level.
ARM does provide debugging tools that can get to a very low level (you can even step through bootloader code if you want) but they generally require specific hardware support and the license is pretty expensive.
I agree that documentation is often terrible though, although obviously that's pretty vendor-dependent. For instance, regarding FPGA development Xilinx tends to have pretty decent docs while Altera (or IntelFPGA as it's now called) is pretty bad at it. Same goes for SoC vendors: TI is pretty good but I've been a lot less lucky with other vendors.
I wouldn't generally consider Android embedded beyond a few use cases as a mobile/handheld device. In my view, it's a wonderful platform to develop for, and that's what every manufacturer should be striving for, or die.
Of course there is a difference in capacity between an 8-bit MCU and a Xilinx SoC, and they may not be able to cover everything the same way, but that doesn't explain why neither has a working example of using an interrupt handler with a push button.
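For reference, here's roughly what such an example looks like on one concrete part; this is a sketch assuming an ATmega328P with a button on INT0/PD2 and an LED on PB5, built with avr-gcc/avr-libc, so register names will differ on other chips (and a real example would also debounce the button):

    #include <avr/io.h>
    #include <avr/interrupt.h>

    /* Button on PD2 (INT0) with internal pull-up, LED on PB5. */
    ISR(INT0_vect)
    {
        PORTB ^= (1 << PORTB5);      /* toggle the LED on each press */
    }

    int main(void)
    {
        DDRB  |=  (1 << DDB5);       /* LED pin as output */
        DDRD  &= ~(1 << DDD2);       /* button pin as input */
        PORTD |=  (1 << PORTD2);     /* enable internal pull-up */

        EICRA |= (1 << ISC01);       /* trigger INT0 on falling edge */
        EIMSK |= (1 << INT0);        /* unmask INT0 */
        sei();                       /* enable interrupts globally */

        for (;;)
            ;                        /* everything happens in the ISR */
    }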
The other factor in getting good documentation and support is how big of a client you are. Unfortunately, the companies that use 1000k pieces of an ARM micro will get better support than those that use 10k pieces.
What do you do now? Sometimes I feel the same way but I'm still in the embedded world. I continually ask if it's worth transitioning into web. What else is there to transition to?
I do something mixing development and infrastructure, in finance.
What there is depends on where you live. In London/NYC, it's either web or finance. In the silicon valley, it's all web. In Washington it's government contracting. Do what needs doing.
I graduated university really excited that I understood pointers, and that I could implement about 10 data structures from the ground up. However when I got my first job at a large dotcom, I got treated as an unqualified beginner because I didn't know Java. 20 years later and I have never had to write so much as a line of code in C, and that makes me kind of sad.
I've been studying CS since 2015 and am graduating next year. Most of our programming classes were in Java, which was deemed outdated by some. But we still had an operating systems and an embedded systems course that used C exclusively.
I think it's important to use the language that best applies to the course material. For example we looked at UNIX in the OS course and it really made sense to use C there. But when we did Algorithms and data structures we used Java for that. Using C really takes away from the point and bogs you down in language specifics in this case.
There are quite a few jobs available which are pure C.
But obviously the industries/domains which require C are not the same as those which require Java/PHP/JS.
In embedded software no-one is going to use Java or PHP, just as in mobile/web app software no-one is going to use C. That being said, Objective-C is a pure superset of C (which C++ isn't).
Honestly, efforts to run Java/Python/Lua/etc in embedded are more about retrofitting developers into that market than anything else, the fact that embedded devices get more and more powerful only enables that.
The US and French militaries are among the entities using Java-based embedded solutions for weapons control systems; they surely know a few things about hard embedded requirements.
If we're talking about Aegis, they replaced UYKs with PCs and the realtime constraints are very soft (hundreds of milliseconds); that's hardly embedded, unless you mean embedded in a ship. And yeah, Java works there way better than the CMS-2 they force fed engineers with before.
In any case, those systems are far from having the physical and technical constraints of the actual system on a missile, plane or satellite; simply put, a 50ms delay to know the speed of a remote object is acceptable, a 50ms delay to know the speed within said object isn't.
To this day, and to the best of my knowledge, 95% of hard realtime logic is C, C++ and Ada or a combination of them, when not custom ASICs and FPGAs for low latency and high throughput tasks.
If we're talking about other systems, I'd like to know about them.
You claimed Aegis is an embedded system. It's not.
Your link specifically calls out general purpose computers, not embedded systems. Java is probably not the worst language to use in that particular application, but it's certainly very close to the bottom of the list.
You can only afford to run Java on an embedded system when you are making limited-run devices, i.e. where the cost of the processor & RAM etc are negligible.
For most embedded devices, this is not true.
When you're paying for every device the code runs on... and that code wastes 80% of your cycles in abstractions and indirections then forces you to have 200% more RAM than the equivalent C code, that changes your mind.
Saving a tiny amount on a processor adds up when the multiplier is (say) 1 million units.
> Garmin devices use Monkey, which is their own Java like language.
"Monkey C" (name apparently chosen only for the pun) didn't strike me as especially Java-like when I used it, and given the memory constraints placing such a low limit on the practical max size of Garmin watch applications, that it was OO at all seemed more a hindrance than an aid. I kept wishing it was just C, and I'm not even that good at C.
The same applies regardless of the processor (size).
Less overhead will allow a cheaper device.
It's worth noting that the Java support on Cortex-Ms (Jazelle) is deprecated. If you're using a Cortex-M class device, chances are you don't want to be running Java on it.
Note that I'm talking in general terms, there are always exceptions to the rule (such as for the rare sandboxed application).
Of course AOT exists; I was just pointing out that Java is not much of a use case in embedded software, to the extent that once-major features are now deprecated.
These seem mostly application level (i.e. high level) software running on embedded systems. I wouldn't call developing on Android "embedded software".
The point being that the choice of language is strongly influenced by the domain the software is developed for.
I graduated 16 years ago and I have always got jobs which were pure C, on state-of-the-art products as well. That's still the way it is, plenty of jobs available.
It was a managed industrial switch from a 2nd-tier manufacturer that I'm not going to name. It was found out because it was breaking the fiber ring on site. My colleague opened it, hooked up a UART, accessed the system and found it was running a Node application for the web UI. On the (self-refreshing) status page it was accessing PHY registers exposed to userspace to read out the stats, and not in a clever way, blocking up the switch.
It's pretty important to note this is about how to teach C, not why. Should schools be teaching C at all? I would say yes, not as an introductory language but as a precursor to studying kernels or databases or compilers. Real-life examples of these are mostly written in C, and studying such systems is IMO a necessary part of a CS curriculum. Therefore, at some point students need to learn C. They should certainly have the option, so we still need to address the question of how it should be taught.
With regard to that question, the OP seems pretty spot on. The identification of certain codebases as good or bad matches my own experience with most of them. The advice to use static analysis and to avoid preprocessor abuse is sound. There are only a few things I'd add.
* You can't really learn C without also learning to use a debugger - probably gdb. That must be part of the lesson plan.
* A more in-depth discussion of good error-handling and memory-management practices would be worthwhile, since C itself doesn't provide these.
* It can be very informative to implement some of the features found in other languages in C. Examples might include a simple form of object orientation, or a map/dict/hash, or a regex parser. Besides their value as a form of practice, this would expose students to some of the issues and decisions involved in making these a built-in part of a higher level language.
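For instance, a minimal string-keyed hash map is small enough to fit in one exercise and already raises the questions (hashing, key ownership, collisions, resizing) that higher-level languages answer for you. A rough sketch, with invented names, a fixed bucket count, and no delete or resize:

    #include <stdlib.h>
    #include <string.h>

    #define NBUCKETS 64

    struct entry {
        char *key;                 /* owned copy of the key */
        void *value;
        struct entry *next;        /* separate chaining */
    };

    struct map {
        struct entry *buckets[NBUCKETS];
    };

    static unsigned long hash(const char *s)   /* djb2-style string hash */
    {
        unsigned long h = 5381;
        while (*s)
            h = h * 33 + (unsigned char)*s++;
        return h;
    }

    struct map *map_new(void)
    {
        return calloc(1, sizeof(struct map));
    }

    /* Insert or overwrite; returns 0 on success, -1 on allocation failure. */
    int map_put(struct map *m, const char *key, void *value)
    {
        struct entry **bucket = &m->buckets[hash(key) % NBUCKETS];
        struct entry *e;
        size_t len;

        for (e = *bucket; e; e = e->next)
            if (strcmp(e->key, key) == 0) {
                e->value = value;          /* existing key: overwrite */
                return 0;
            }

        e = malloc(sizeof *e);
        if (!e)
            return -1;
        len = strlen(key) + 1;
        e->key = malloc(len);
        if (!e->key) {
            free(e);
            return -1;
        }
        memcpy(e->key, key, len);
        e->value = value;
        e->next = *bucket;                 /* push onto the chain */
        *bucket = e;
        return 0;
    }

    void *map_get(const struct map *m, const char *key)
    {
        const struct entry *e;
        for (e = m->buckets[hash(key) % NBUCKETS]; e; e = e->next)
            if (strcmp(e->key, key) == 0)
                return e->value;
        return NULL;
    }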
> You can't really learn C without also learning to use a debugger
News to me.
I learned C in '85 from the K&R book and the project lead for our little startup's Fortran to C translator. No debuggers were involved as far as I can remember.
[p.s. as to "why" teach C. I know many languages at this point, but that initial exposure to C necessitated a deeper understanding of 'the machine', memory management, etc. The core of C is fairly straightforward and imho an excellent pedagogical vehicle for a 'substantial' understanding of "programming".]
OK, I'll amend my statement. You'll waste a lot of time if you try to learn from or work on any substantial C codebase without knowing how to use a debugger. Some people have a lot of time to waste, but that doesn't make it a good idea.
The CS:APP book introduces gdb, valgrind, how the compiler works (the linker, optimizations), what C looks like in assembly, and the pitfalls, and so introduces the SEI CERT C Coding Standard http://csapp.cs.cmu.edu/ ; it also tells the reader to go over K&R to look at how declarations differ.
It's usually used for an introductory systems programming course covering enough that you can poke through a systems library and know what's going on, but actually teaching C development would be a much different course, I would imagine.
> precursor to studying kernels or databases or compilers
While widely used kernels are still written in C, new ones often are not, and the vast majority of new databases aren't either. It's Java (Hadoop, Solr), Go (InfluxDB), Rust (TiKV), among others. That's if you want to be prepared for today. If you look a few years ahead, it's also TLA+, Idris or Coq. Same for compilers: it's increasingly rare to see anyone choosing C for that domain, especially as a language for teaching them.
> Should schools be teaching C at all?
I'd change this question to "should schools have any say in deciding what to teach?". Wouldn't it be better to offer a list of options and let students do the market research and decide on their own?
> It can be very informative to implement some of the features found in other languages in C
That's not how programming works today. You are given those as language primitives and don't think about them, the same way you don't think about how the hardware implements "if" in C.
You may be curious and find some use for this knowledge if you have it, but it's not very useful when you want to "teach C" (and it's also very ephemeral).
I'm pretty sure the Linux kernel, BSD kernels, MySQL, Postgres, gcc, etc. will all be with us for quite a while. A few years ahead things won't be written in TLA+ or Coq because they're not even programming languages. Lamport is very explicit that TLA+ is a specification language, and the central component of Coq (Gallina) is too, so the name-dropping doesn't really make your case more convincing. C should definitely not be the only or even primary language people learn, but it's pretty critical to understanding how things work today and will continue to work for at least the next decade.
> Wouldn't be better to offer a list of options
I do believe I already said something like that. Yep, just checked. I did. The point is that, even if it's only an option, the question of how to teach it is still important.
> That's not how programming works today.
In your opposite-of-humble opinion, which is wrong. In fact, implementing things that are already pretty well known is exactly how programming is taught. Everywhere. In all languages. Students learn about pitfalls, and tradeoffs, and why "this" is easy/common but "that" is hard/rare. If you want to teach C, you have to teach C as it is used and why. A language is far more than just syntax you can learn from a book. Any student who has actually implemented a "basic language feature" in C is better equipped to understand how that implementation of that feature in a particular other language represents only one set of decisions within that space, and that others are possible. I'd say that's pretty important knowledge. Programmers who just settle for "it's magic" and "that's how it is" are never going to be very good programmers.
> I'm pretty sure the Linux kernel, BSD kernels, MySQL, Postgres, gcc, etc. will all be with us for quite a while.
Yes, they will exist and be used somewhere, but when people design a new database or compiler today, C is not the language of choice anymore, so teaching it for that reason is not useful. Many of the things I am using for storing data today have no line of C in them, same for a lot of research papers in that subject.
> implementing things that are already pretty well known is exactly how programming is taught.
I agree with that, just discussing what those things should be. Implementing low-level language features has more to do with hardware architecture than any code above it, and C is not good for representing (modern) hardware. The way it is done today is that you come up with some mental model and instruct the compiler to generate native code for it, and none of that needs to involve C. LLVM IR or some equivalent is more likely to be used if you need some "C-like language".
> Any student who has actually implemented a "basic language feature" in C is better equipped to understand how that implementation of that feature in a particular other language represents only one set of decisions within that space
You don't need C for explaining that, just show two sets of decisions without any code.
> Programmers who just settle for "it's magic" and "that's how it is" are never going to be very good programmers.
It's not magic, but how exactly one feature or another was implemented has changed many times for the languages I am using, because hardware changed or more research was done, and besides gaining some performance I wouldn't notice any difference. Knowing how to use those features is important, practical and useful for many years; implementation details are irrelevant and may become outdated tomorrow (and they literally did in many cases).
C is still used, but nowhere as popular or relevant as it once was. It is not "the language of databases and compilers". If you want to teach it, focus on domains where it likely will be relevant, not the ones where it was relevant 20 years ago.
> when people design a new database or compiler today, so teaching it for that reason is not useful
Those who do not learn the lessons of the past are doomed to repeat it. Studying past and present highly-successful systems and understanding why they made the choices they did is not just useful but essential.
> Implementing low-level language features has more to do with hardware architecture than any code above it,
Untrue for the examples I gave. Most of what you would learn from implementing a map or regex parser is way above the hardware - sorting/searching, state machines, optimizing for different input sizes or patterns, managing mutability and concurrency. Implementing a simple object system is even less hardware-oriented.
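To make that concrete: the well-known Kernighan/Pike matcher (literal characters plus '.', '*', '^', '$') fits in a page of C, and what it teaches is state, recursion and backtracking rather than anything about the machine. A from-memory sketch in that style, not production code:

    static int matchhere(const char *re, const char *text);

    /* matchstar: search for c* followed by re at the beginning of text */
    static int matchstar(int c, const char *re, const char *text)
    {
        do {    /* '*' matches zero or more instances */
            if (matchhere(re, text))
                return 1;
        } while (*text != '\0' && (*text++ == c || c == '.'));
        return 0;
    }

    /* matchhere: search for re at the beginning of text */
    static int matchhere(const char *re, const char *text)
    {
        if (re[0] == '\0')
            return 1;
        if (re[1] == '*')
            return matchstar(re[0], re + 2, text);
        if (re[0] == '$' && re[1] == '\0')
            return *text == '\0';
        if (*text != '\0' && (re[0] == '.' || re[0] == *text))
            return matchhere(re + 1, text + 1);
        return 0;
    }

    /* match: search for re anywhere in text; returns 1 on a match */
    int match(const char *re, const char *text)
    {
        if (re[0] == '^')
            return matchhere(re + 1, text);
        do {    /* must check even an empty string */
            if (matchhere(re, text))
                return 1;
        } while (*text++ != '\0');
        return 0;
    }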
> C is not good for representing (modern) hardware.
Perhaps not, but very few languages are better for that. Of the programming (and non-programming) languages you've mentioned so far, only Rust might be a tiny bit better in that particular regard. Go ahead, try to name a language that better represents hardware. Or don't, since representing the hardware is not the point anyway (see previous paragraph). Apparently some people think that anything below the web browser is hardware.
> You don't need C for explaining that, just show two sets of decisions without any code.
And here I think I see the crux of the problem. Have you ever programmed? If you have, or even if you've been engaged in some other kind of creative effort, you'd know that making something teaches lessons in ways that abstract thought can not. It's true that you don't particularly need C for that, but if you're teaching a class on C for other reasons it's kind of the obvious choice. "Without any code" is an absurd statement.
> besides gaining some performance I wouldn't notice any difference.
You wouldn't. The implementors would. The people who read the implementors' papers and use them as the basis for the next next system would. Even you, as a user, could benefit from understanding whether particular input or concurrency patterns represent best or worst cases for the algorithms that were used. You can't optimize your own code without that knowledge, and you can't have that knowledge without ever having worked on anything similar or even thought about it at any but the most superficial level. Your attitude is effectively "it's magic" even if you don't use the word.
I think the most important point here is that people who haven't even educated themselves shouldn't be commenting on how to educate others. Would you like to Dunning-Kruger us some more?
> Studying past highly-successful systems and understanding why they made the choices they did is not just useful but essential.
Studying currently used and developed systems, for people who will be designing future ones and entering the job market in the near future, is more essential. Databases designed today are mainly distributed, and knowledge of design choices made by Postgres doesn't help there, same as design choices made by whatever existed before PostgreSQL.
> Most of what you would learn from implementing a map or regex parser is way above the hardware
Ok, I agree with that, although I am now testing a hash implementation written using SIMD instructions. My main point, however, was that while you see a map and a regex parser as building blocks that people should know how to build, I am saying that the building blocks today are things built on top of them. This moves the whole thing one layer up, and puts the question of "how does a map work internally" into the same territory, for C developers, as "how is a NAND gate built internally": totally irrelevant. Programming courses were focused on teaching how to write hashmaps and sorting algorithms for decades, but here is the news: we already have them; start teaching how to use them to build higher-level things.
> Go ahead, try to name a language that better represents hardware.
Assembler. LLVM IR. C or C++ with custom extensions. Whole families of languages for graphic cards.
> representing the hardware is not the point anyway
Oh, that's what I think too. That's why I see C as obsolete: if you don't need it to represent hardware (which it doesn't) and rely on compiler doing optimisations for you, you can as well use more convenient language (like Rust. Or Java), which will provide the same end result with a lot of added value. Like actually making use of the hardware easier for example.
> Apparently some people think that anything below the web browser is hardware.
The implementation details of anything below the web browser is indeed mainly dictated by hardware. How the web browsers are designed too: all changed significantly to utilize parallelism and accelerated graphics because that's how hardware works today.
> you'd know that making something teaches lessons in ways that abstract thought can not.
Definitely. But it helps a lot if that "something" will be relevant to your future work.
C largely isn't, outside of relatively shrinking domains.
> You wouldn't. The implementors would.
Yes. But implementors would not look into "how to write C code", but into "how specific hardware works" to make things efficient. People who write compilers today look into either hardware specs, or models exposed by code generators. That's another niche where C is not the king anymore.
> Your attitude is effectively "it's magic" even if you don't use the word.
True. I consider a certain number of layers below what I work on as "magic", in the sense that it makes no sense for me to know how they work in detail, only to have some mental model.
It is useful for me to know that a CPU has registers and cores; it is not useful to know how to produce one. I also know that whatever people work on today will be covered by more layers tomorrow. Of course someone will need to know how to build a CPU or how to write a regex parser, but we have come to a time when we spend 99% of our time just using them, and the need to build or redesign one appears maybe once a decade. A lot of what people teach when they teach C is very useful for those who will be using C professionally, and largely useless for the 99% who will use something else, because the world has moved higher or lower for the things C was used for before.
> Databases designed today are mainly distributed and knowledge of design choices made by Postgres doesn't help there
I've been working on distributed systems for 25+ years, and I still believe that understanding what happens within each node is important. You might be surprised how many of the real world's distributed databases are basically sharding over local databases written in C, and even the newer ones borrow many of the concepts. Would you rather study a mature query planner or an immature one? Both, really, but seeing how the mature one has developed over time is probably more instructive. And then there are kernels, filesystems, compilers, and other things you conveniently dropped from the comparison.
> Assembler. LLVM IR. C or C++ with custom extensions. Whole families of languages for graphic cards.
LLVM IR doesn't count, because there's no software to study that's written directly in it. The same is true of assembler. As for "X with custom extensions" you can't learn those without learning X itself so you haven't really made an argument against learning X. Also, very little software is written using those extensions so, again, there's no pedagogical value.
> you can as well use more convenient language
Missing the point again. The OP wasn't about what language to use for new development. It was about learning what is needed to study existing systems.
> A lot of things the people teach when they teach C is ... largely useless for 99%
Or maybe some of that 99% just don't know what they don't know, and that's the essence of Dunning-Kruger. People develop expertise in their own tiny sub-domain, then use simplified models for the rest ... but those other sub-domains aren't actually any simpler. Abstractions leak. While there is certainly a danger in drilling too deep or getting too wrapped up in details that end up not being relevant (or permanent), any programmer who treats "those other layers" as magic is going to be handicapped relative to a programmer who knows something about them. For example, knowledge I gained about cache-coherency algorithms very early in my career has come back to me again and again in contexts that were surprising to my colleagues who were never exposed to that stuff. They just thought it was magic, but it turned out to be magic that could be re-applied at a different layer for big wins.
> You might be surprised how many of the real world's distributed databases are basically sharding over local databases written in C
I probably would be; most of the ones I hear about or use aren't. When it comes to data storage, I work mainly with Solr. Not only does it not have a line of C, but if you were to study its domain there is no C equivalent; development and research are available in Java or Python, and the nearest native competitors ... are also not written in C.
> Would you rather study a mature query planner or an immature one?
If I were a student, I'd rather study one that will give some idea of what people will be designing when I enter the job market, using the tools they will use.
> seeing how the mature one has developed over time is probably more instructive
Maybe, but it is also less relevant to the needs of the current database market and current research.
> It was about learning what is needed to study existing systems.
Existing systems are written in a variety of languages, and C, while still significant, does not constitute a majority. Recently created systems are mostly written in something else.
> Or maybe some of that 99% just don't know what they don't know, and that's the essence of Dunning-Krueger.
Or maybe those who teach C stay in their bubble and missed the fact that other things not only exist, but also took over market and mind share.
> any programmer who treats "those other layers" as magic is going to be handicapped relative to a programmer who knows something about them
Only if those things have any relevance to their job. If you go 2 or 3 layers below, any relevance is lost and we keep adding those layers.
> For example, knowledge I gained about cache-coherency algorithms very early in my career has come back to me again and again in contexts that were surprising to my colleagues who were never exposed to that stuff. They just thought it was magic, but it turned out to be magic that could be re-applied at a different layer for big wins.
So you provided some cache-coherency implementation and now they can use it without knowing how it is implemented, right? And spent their time on improving their product having the cache coherency as given? As it was magic?
The author's main point is the pedagogy of teaching C, as opposed to the industry relevance of C.
>The other day Neel Krishnaswami mentioned that he’s going to be teaching the C class at Cambridge in the fall, and asked if I had any advice about that topic.
Some googling found a C/C++ class by Neel Krishnaswami[1]. The description says 10 lectures so I'm guessing roughly ~10 hours of instruction. The prerequisites for the class are "none". You can also browse the pdfs of class lectures and past exams to get a feel for what the professor is teaching.
With that in mind, we then consider Regehr's advice:
>My main idea is that we need to teach C in a way that helps students understand why a very large fraction of critical software infrastructure, including almost all operating systems and embedded systems, is written in C, while also acknowledging the disastrously central role that it has played in our ongoing computer security nightmare.
The noble goal of "traps/pitfalls/disasters of C" isn't really possible to weave into an introductory course if a class is currently structured with ~10 hours. You have some hard time constraints there. You'd have to remove some topics ... e.g. maybe remove linkers?, STL?, metaprogramming? -- or -- make the class much longer ... e.g. more hours?, more homework?, more case studies?
An "Introduction to programming using C syntax" is one type of class that requires a baseline of time to each.
An "Introduction to industrial-grade software engineering using C best practices" is another type of class with its own heavy time commitments. Teaching the topics of correct/incorrect uses of C would take even more time than the learning the syntax of C.
Regehr's essay is well-intentioned ("hey, you should teach this very important topic <X-prime> related to topic <X>") -- but he doesn't actually tackle how to squeeze it all into a finite time-constrained class.
I've moved to a style where the lectures are recorded and the lecture hours have been replaced with lab sessions. I've also mostly[1] removed the C++ content and replaced it with more C. Both the videos and the labs should be available for everybody.[2]
This gives more time for tooling (eg, ASan/MSan/UBSan/valgrind), as well as for various low-level topics like memory management (eg, arenas, reference counting and mark-and-sweep) and data-structure layout and cache optimizations (ie, the reasons why people still use C). Also there's a heavy focus on the lack of safety and undefined behaviour -- beyond the lecture devoted to it, this comes up in basically every lecture.
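To give one example of what the memory-management part can look like, the "arena" idea can be shown in a handful of lines; this is only a minimal sketch with invented names (no growth, power-of-two alignment only), not the actual course material:

    #include <stdlib.h>

    struct arena {
        unsigned char *base;
        size_t size;
        size_t used;
    };

    int arena_init(struct arena *a, size_t size)
    {
        a->base = malloc(size);
        a->size = size;
        a->used = 0;
        return a->base ? 0 : -1;
    }

    /* Bump-allocate n bytes, aligned to align (a power of two). */
    void *arena_alloc(struct arena *a, size_t n, size_t align)
    {
        size_t p = (a->used + (align - 1)) & ~(align - 1);
        if (p + n > a->size)
            return NULL;                    /* arena exhausted */
        a->used = p + n;
        return a->base + p;
    }

    /* Free everything at once -- the whole point of an arena. */
    void arena_reset(struct arena *a)   { a->used = 0; }
    void arena_destroy(struct arena *a) { free(a->base); a->base = NULL; }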
[1] There are still 2 lectures on C++. If you think that it is impossible to learn C++ in 2 one-hour lectures, you are correct. The pedagogical theory at work is basically "Dumbo's magic feather": the goal is to give students enough self-confidence that the year after they won't be afraid to consider doing a final year project using C++ (eg, modifying a web browser), which is the point at which they will _actually_ learn C++.
[2] I'm not totally sure how I feel about this. On the one hand, obviously I'm happy anyone can access these resources.
But on the other, it's really easy to say subtly not-quite-right things when lecturing. If it's ephemeral, then this is harmless since the effect of lectures is just to cue students to remember what to look up when they are programming on their own, but if they can re-watch the video then there's a bigger risk of burning in the wrong idea.
I suppose the long-term thing is to pre-write a script for each lecture, maybe even turn it into a book.
Regarding the bit about "subtly-not-quite-right things when lecturing": IMHO, it's too tall an order to ask professors to get everything in their lectures exactly right. It seems more reasonable to take a page from good journalists - get as much right as possible the first time around, acknowledge if you discover something was wrong, and provide corrections with context.
Ultimately, I think it's up to students to remember that everyone can be wrong, even professors, and to trust but verify. :)
They are not completely different. Entire large projects can be written in C that also compile with a C++ compiler, and it is entirely practical, even easy, to maintain them that way.
My TXR language is like that:
./configure cc=g++
make && make tests
The Lua people have done this too, and call this dialect "Clean C", I think.
Compiles cleanly and tests pass. I usually work with it with a C compiler. Before releases I run it through the C++ compiler to see what breaks, and fix it. (Usually very little and often nothing at all.)
Teaching them at the same time? I'm not sure I would take programming newbies and try to teach them how to program in C such that it also means the same thing as C++. That is to say, I would be confident that I could teach this in a straightforward way; however, the problem would be the lack of supporting material. For instance, there is no reference manual for the "Clean C" language; what would we use as a text book?
One approach would be just to use C resources, and cover the C and C++ differences as one chapter of the course: basically, how to port C programs to the C++ dialect.
I'd definitely teach the students how to leverage the stricter C++ type conversions while keeping the code C, including the following trick:
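For example, something along these lines (one illustration of the general idea; the exact trick meant may differ): C++ refuses the implicit conversions from void * to other pointer types and from int to enum that C allows, so code written to pass both compilers picks up extra checking for free.

    #include <stdlib.h>

    enum state { IDLE, RUNNING, DONE };

    int example(size_t n)
    {
        /* The cast is required for this line to compile as C++ at all;
           plain C would accept the assignment from void * without it. */
        int *buf = (int *) malloc(n * sizeof *buf);
        if (buf == NULL)
            return -1;

        enum state s = IDLE;
        /* s = 2;   <- legal C, rejected by C++ (no implicit int->enum),
                       so the C++ pass catches stray integer assignments. */
        s = DONE;
        (void) s;
        free(buf);
        return 0;
    }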
People keep saying this, but the difference is comparable to Perl5/6 or Python2/3 - you can construct programs that are valid in both, and there's a big overlap of concepts and syntax.
In some ways it's easier to interlink C and C++ code than it is to interlink Python2 and Python3 code.
> People keep saying this, but the difference is comparable to Perl5/6 or Python2/3 - you can construct programs that are valid in both, and there's a big overlap of concepts and syntax.
There's more overlap wrt mental models between C#, Java and C++ than there is between C and C++. Being able to compile code in both languages means close to nothing if a developer can't understand how a project is structured and how a component is supposed to work.
I would say that being able to compile code in one language means close to nothing if a developer can't understand how a project is structured and how a component is supposed to work.
Thus, the observation doesn't inform the debate about C and C++ overlap.
This is not true at all. C and C++ are about as similar as C and Objective-C: sure, the language is pretty much a superset (or in Objective-C's case, a strict superset) of C. But most of the time your code looks nothing like C at all.
Good points, but I don't think STL (which is, of course, the standard template library from C++) has anything to do in a course on C. Perhaps you just meant "the standard library"?
In my university, back in the mid-90's teaching C was already seen as outdated.
First year students got Pascal in the first semester and C++ in the second one, followed by Caml Light, Lisp, Smalltalk, Prolog, MIPS and x86 Assembly, PL/SQL, while those taking compiler design classes would also dive into many other ones.
OS design, data structures and distributed computing classes assumed that having learned C++, any student would be able to downgrade themselves into C level on per-case basis.
C only became industry relevant thanks to the spread of UNIX clones, and on platforms like ChromeOS, Windows and Android its relevance is slowly fading away.
Still, it is going to be around for as long as we need to target POSIX platforms, which is why I follow with interest any attempt to make it safer, as I would wish my computing tools not to be the weakest link, if they happen to be written in C for whatever reason.
> In my university, back in the mid-90's teaching C was already seen as outdated.
Given the examples that follow it seems more of an ideological stance that a practical one. C is dead so let's teach Pascal, Lisp, MIPS assembly and Smalltalk?
From a language theory standpoint C is pretty clunky but in terms of getting stuff done it's still one of the most mainstream languages out there.
>having learned C++, any student would be able to downgrade themselves into C level on per-case basis.
That's sort of how I learned C in the first place, so I see where you're coming from, but I don't think that's good general advice. "Don't learn C, learn C++ and figure it out from there" is sort of like saying "Don't learn Italian, learn Latin and figure it out from there". C++ is a beast of a language; if people are interested in C they should learn C.
>C only became industry relevant thanks to the spread of UNIX clones, and on platforms like ChromeOS, Windows and Android its relevance is slowly fading away.
I sort of agree but I'm more of a glass half full type of guy, I think C is less relevant not because it's fading away but because other languages are taking a lot more space nowadays. What used to be considered "scripting languages" are commonly used to build complex applications because we have a lot more RAM and processing power than we used to. Still, kernels and low level code are still commonly written in C or, sometimes, C++ because you need good performance and/or tight control over the application.
C is C and C++ is C++. The analogy breaks down in a pretty major way because Latin has no native speakers, but important work is done in C. They didn't stop speaking Latin because it had problems as a language, they stopped because the cultural apparatus behind Latin collapsed after an exceptionally long period of dominance that Italian speakers never achieved.
I'd like to believe C has a stronger future in the academic world because it is simpler than C++, but also because C is a designed language and C++ is a cobbled-together language, and maybe that counts for something in a field where people call themselves engineers.
Anyone who sees C as a 'downgrade' has badly misunderstood what design means as an engineering concept. It doesn't mean 'does everything', it means that strategic choices were made early in the process after careful consideration.
* Serious hat *
C has a pretty well defended place in the world - it is an excellent choice for manipulating RAM and controlling the size of variables. C++ isn't as good, because its features add a lot of complexity that isn't really desirable when you have a problem that C can handle.
I wouldn't use it for string parsing. Or GUIs except as an afterthought on a project that used C for a good reason.
C's well defended place is unfortunately mostly defended by folklore, nostalgia and a desire for one-upmanship.
There are several good languages available today that can successfully replace C - including at controlling RAM or the size of variables
P.S: there's a book called "The design and evolution of C++" which you should probably read. Where can one learn more about the design of C?
I disagree, C is popular for the same reason that QWERTY is more popular than Dvorak or Colemak, or that x86 is more popular than ARM (talking purely from an ISA perspective). It's not folklore or nostalgia, it's just that it was here first, there's a ton of legacy applications that rely on it, and the cost of the switch is seen as a large, one-time cost vs. the small benefits it would bring over a longer period of time.
I'm a huge fan of Rust and I can only dream of having all C codebases in the wild ported to it some day but I'm still writing C at work because it's faster, easier (vendors provide C libraries and code, not Rust) and it's easier to hire C coders than Rust ones.
>P.S: there's a book called "The design and evolution of C++" which you should probably read. Where can one learn more about the design of C?
I'd be surprised if C was really ever "designed" the way we would design a language these days. Basically it was a fork of earlier languages tweaked to more conveniently model the hardware of the time. It was more about getting the job done at the time than making an innovative language. At least that's what I gathered from what I know about the early days of C and it's also fairly apparent in the way the language itself is designed (the whole pointer/array shenanigans come to mind). C is many things but it's definitely not elegant or even clever.
Except that is yet another folklore, as C wasn't there first in any form or fashion.
There is a history of 10 years of systems programming languages preceding it, proving one does not need to throw security out of the window in systems languages.
Had Bell Labs been allowed to sell UNIX, instead of giving it away for a symbolic price, C would have been a footnote in the history of systems languages.
The parent was talking about "nostalgia" and "replacing C" so I assumed they were talking about more recent languages like Rust, C++, D, Go, Zig and all these newer system languages.
I agree that if we're talking about why C took over in the first place it's a different discussion.
I agree with you regarding existing software - I don't think it's worth discussing too much about, since that software will either live on "forever" or be outright replaced, rather than being rewritten.
I was mainly referring to new projects, where C continues to be selected because of a myriad of reasons, none related to technical uniqueness or superiority.
This is truly infuriating, because we will collectively pay for those poor choices in security vulnerabilities.
Stability and being able to find a developer who will maintain the project 5 years from now are still very pragmatic arguments though. Again, I love Rust but I still don't feel comfortable pushing it for critical components at work, because who knows where it'll be 5 years from now? Is it here to stay? If I leave my job will they find people to take over?
Furthermore even brand new projects will have to interface with existing code to some extent. C bindings are everywhere, Rust not always. And even when they exist they're not always complete and are often quite experimental.
Still, I'm doing my part and I did write a few smaller components of my current project in Rust at work. But it's not surprising that the shift is not happening overnight.
There are no current languages that can displace C, none, zero. If you want a language that _will_ displace C, it needs to be designed, with its features, interactions, pros, cons, etc. mapped out so others can understand it and __why__ choices were made.
C++ is evolved not designed from the start. A properly (IMO) designed language would have a pro, con wiki/ graph map and features would have _all_ interactions mapped. C isn't this either but it doesn't try to paint over bad feature interactions by adding more.
I've already pointed another person to the book "The design and evolution of C++" which I believe puts the not designed argument to rest.
But it is impossible to create a beautiful, "properly designed" programming language that's also backwards compatible with C. I mean look at Objective-C :)
I'm not advocating for C, I'm just describing a language that could displace C while having some good features. C takes features to the extreme by having few and being somewhat simple, a language designed as I stated could have features and still remain simple.
> I'd like to believe C has a stronger future in the academic world because it is simpler than C++
C is simpler than C++ until we need to manage memory or need to handle basic data structures. When that happens, C requires reinventing the wheel over and over again.
Because memory management is an area where C++ over-simplifies and assumes one size fits all. In the C++ worldview, memory allocation is a black box, and no two apps would ever have any desire to use different memory allocation strategies. For many apps, that's fine, but for the type of apps where C shines, memory allocation isn't just a magic black box.
In fact C++ allows for flexible memory allocation strategies per data structure, while C forces one to hand code them all and replacing something like malloc affects the whole executable behavior.
Here, shape = type. Yes, they do. Start thinking about why C++ compilation times are so abysmal: because of this dependency mess. Then realize the same inefficiencies apply to maintainers' heads.
> Because memory management is an area where C++ over-simplifies and assumes one size fits all.
And the fact is that it does. In the rare case that it doesn't, then C++ also accommodates those requirements, if developers are willing to put in the same sort of effort that goes into implementing data structures by hand.
I hardly agree that C was well engineered, given how the number of memory corruption bugs the DoD found in the Multics codebase compares with plain UNIX, and how many we continue to get every single month to this day.
Well engineered tools have quality as their main focus.
C won its place in the world thanks to UNIX adoption, and its commoditization via free beer OSes, nothing else.
My point was that I don't think Smalltalk was ever more popular in the industry than C regardless of the time frame. By the mid 90's lisp machines were pretty long gone, betting on Lisp instead of C if employability is the metric seems rather bold.
MIPS I concede; it remained fairly popular for a long time.
I guess my overall point is that the subjective choices made by this university back then don't really mean much by now. The fact that they taught Pascal instead of C alone proves that they weren't particularly prescient.
>Regarding language comparisons, actually C++ is Italian and C Latin.
I thought about that while writing this bad analogy but I couldn't come up with a better one. I wasn't really talking about the historical relationship between C and C++ but rather about their relative complexities. Learning C++ is vastly more complicated than learning C. On top of that the coding style of a modern C++ application (which generally involves a lot of metaprogramming, OOP and even some functional constructs) is fairly different from a standard C codebase. You don't write C like you write C++ and vice-versa.
And Pascal was a big language back then. Big software like Photoshop was written in it, IIRC. The very first version of GCC, in the late 80s (when it was not even called that yet), was written in a Pascal dialect, too (and compiled both C and Pascal code).
OTOH, C was just standardized a few years before, and linux was just a side project from an unknown student in a small university. And there was a new version of C, with objects, classes, inheritance and all those buzzwords, so why care about the old C anyway?
> ...because we have a lot more RAM and processing power than we used to.
I don't know about you, but I haven't noticed any dramatic increase in performance - if anything - apps are slower and consume ridiculous amounts of RAM. At work I'm using PyCharm (written in Java) and somehow it's "normal" for it to consume 1 GB of memory. Now if every app were free to use memory like that, then I wouldn't be able to run more than 8 apps on my computer.
For all the 'memory safety' and 'exception' features that Java has, in my experience - and that includes the time spent with PyCharm too - Java apps are more unstable. Sometimes the JVM downright crashes, or angrily throws a nasty exception, leaving me perplexed as to what happened and worried whether I should restart the app or not.
I don't see a reason why a program like PyCharm couldn't have been written in C. 'It's too low-level' is the common answer - I don't think so; the real reason is probably along the lines of 'I don't want to deal with manual memory management and pointers, and since there's plenty of languages that do that for me - I'll use that!'. C is bare-bones - and that's a virtue - because the programmer is in charge. Every single aspect of the program is a consequence of his decisions. It also removes the opaque 'middle layer' of cruft, like garbage collector and standard libraries.
C is the best language out there, in my opinion, sitting in the nice spot between assembly and higher-level languages. I wish there was a successor of C, which would have carried its philosophy into the 21st century (and no - I'm not talking about the abomination that is C++).
> “C only became industry relevant thanks to the widespread of UNIX clones”
It has always been deeply important in scientific computing, and even now it’s very valuable to know the C underpinnings of Python & Python scientific computing tools. On the scientific algorithm side I don’t see C diminishing in importance at all, and many critical systems and libraries use it and require young programmers to know it to really use them (zeromq, Postgres C API, CPython, various things written in Cython).
This is often very overstated. For example in government research labs there are often huge, critical pieces of software in both C and FORTRAN. I worked in one of these labs in the 00s and about 80% of my team’s work required running reports and simulations through a huge system written in C between the late 80s and mid 90s. Even generating plots for conference presentations was done in pure C with a special DSL tool that took in a static plot conf file and wrote raw postscript outputs.
For the nuclear industry, this is true as well. A lot of it is because of the original codes were written in Fortran, and thus there is some institutional lock-in even for new stuff. Everybody in the organization already knows Fortran, it's quite fast, and there are good libraries for doing all of the standard number crunching.
However, my unscientific and unsupported observation is that in fields that have emerged after Fortran went out of vogue as a general-purpose language, this same process happens again, but just with a more modern language. SciPy etc. are quite common in fields like genomic analysis, and I suspect that in 20 years, this will still be the case even if almost nobody else is writing new Python code.
It is silly to start teaching in C. You must first learn assembler, and then you understand the godsend that C is. When you write C, you should be thinking about what happens in assembler, and then writing it beautifully in C.
In parallel to this low-level thread of learning, you have to learn other high-level languages, of course.
That's how you end up with students believing dangerous falsehoods like "signed integers are 2's complement".
This is true on x86, ARM, RISC-V, and just about any architecture I can think of, but it is not quite true in C, even when it is implemented on top of a 2's complement architecture. Signed integer overflow is undefined; you have to explicitly convert to unsigned before you rely on any overflow. Only then do you get the implementation-defined behaviour you sought, instead of the usual nasal demons.
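A minimal example of the distinction, assuming a 32-bit int:

    #include <limits.h>
    #include <stdio.h>

    int main(void) {
        int a = INT_MAX;
        /* a + 1 here would be signed overflow: undefined behaviour, and the
           optimizer is allowed to assume it never happens. */
        unsigned int u = (unsigned int)a;
        printf("%u\n", u + 1u);   /* unsigned arithmetic wraps: prints 2147483648 */
        return 0;
    }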
Also, assembly doesn't mean much with out of order cores. Data dependencies are likely more important than the generated assembly.
I wonder how much heartache it would be for the next revision of the C standard to go: Integers behave as if they were two's complement for all overflow behavior.
It's already true in just about every system in the world today, and people with extremely obscure hardware can just hack a workaround into their compiler (or straight up ignore the standard because they're just a handful of people anyway).
I get that back in the day K&R didn't want to impose on the hardware vendors and left the standard vague where they had different ideas, but that doesn't mean you can't evolve the standard as the hardware evolves over the decades.
Why is this getting downvoted? It sounds very reasonable to me. It wouldn't even be the first time C breaks compatibility (I'm thinking of strict aliasing).
It doesn't even break compatibility. It just defines an area that was previously left undefined. It would only break things that were relying on undefined behavior.
It wouldn't break overflow, but it would break the conversion to and from signed negative integers (which if I recall correctly is implementation defined, and behaves differently on 1's complement and sign & magnitude platforms —if there are any left).
And you will have disgusted half of the class from programming by the end of your first assembler lecture.
Something similar happened to me in a general engineering curriculum where our introduction to programming was in C on some old Unix machines that were already very obsolete (as of the end of the 90s), with no IDE or debugger beyond the equivalent of Notepad and printf debugging. I thought it was interesting but too tedious for my taste (I had never heard of IDEs and debuggers).
It's only years later that I went back to programming, from higher level languages, and progressively making my way into more sophisticated / lower level languages.
You don't teach kids math by starting with topology. Most 1st year university students will have never programmed in their life (particularly if all they dealt in their life was smartphones and tablets).
> And you will have disgusted half of the class from programming by the end of your first assembler lecture.
IMHO that's a major part of the problem. Somehow Asm turned into something horrible to avoid, instead of being the crucial understanding that makes everything else fall into place.
I have an engineering textbook from the mid 60s. There is an entire chapter on how computers work and how to program them (in Asm!) to solve numerical problems. The audience of this book was not computer scientists/specialists, but just engineers in general.
Two decades later, computer magazines --- not "developer-oriented", just "user" or perhaps "power user" --- had Asm listings of simple (sub-512-byte binary) and useful utilities for readers to manually enter, use, and modify.
Another two decades later, and programmers today barely understand binary or Asm, despite the concepts being so ridiculously simple. A computer is a machine that executes instructions step-by-step, and nothing evokes that more clearly than an Asm listing. An HLL with all its nested structure doesn't convey the same "it's just a list of instructions" to the beginner.
> I thought it was interesting but too tedious for my taste
"Genius is 1% inspiration and 99% perspiration."
> You don't teach kids math by starting with topology. Most 1st year university students will have never programmed in their life (particularly if all they dealt in their life was smartphones and tablets)
Starting with Asm is like learning arithmetic. Starting with a HLL is like teaching calculus to students who don't know arithmetic by giving them calculators.
I've worked with a lot of both "new-school" and "old-school" developers over the years, and there's a large contrast between them: the former group is heavily reliant on tooling, tends to make many localised changes to "fix" things without considering the whole picture (causing more problems later), and get stuck easily when debugging. The latter tend to adopt a more methodical approach to problem-solving, and while they don't appear to be doing as much interaction with the computer, are overall more productive.
We (CMU) respectfully disagree, and we turn out some damn fine programmers.
Python -> A restricted C subset ; standard ML -> asm
Teaching fun and power first is good for engagement. Teaching how to think and structure is good for later asm experience so people have an existing concept of how to turn problems into solutions in code.
>Something similar happened to me in a general engineering curriculum...
The implicit context is that we are talking about fields that need to understand programming at a low level: Computer Science, Computer engineering, Software Engineering, Electrical engineering.
There is no point in a Chemical Engineer programming in asm. Just throw some python and R at him so he understands the basic concepts.
But for the fields that need to understand computers at a low level, I would expect enthusiasm from the student to understand how it all works.
> Most 1st year university students will have never programmed in their life
What? That's completely atypical in my experience. I would expect that in a university with decent admission competition (I don't know how these things work in the US exactly), students who get in know at least a couple of high-level programming languages, in addition to taking part in programming competitions and general geekiness like brainfuck.
The point of university is to learn, not to already know the course content by the time you get there. Plenty of high-school students may not have taken any computer science subjects during school, but may wish to get a degree in computer science.
Universities with decent admission competition value well-rounded, high-achieving students. Such students are involved in a lot of co-curricular activities, may take many subjects, and have broad interests. It's perfectly reasonable to expect that they simply never had the time to learn multiple programming languages and compete in programming competitions, in addition to whatever else they do.
For kids straight out of high school? Some certainly will. But many are just good students who are going for a general college degree with a major in computer science.
As I'm reminded every time someone mentions Lisp or Forth, it seems that different people prefer very different "intuitive" representations of programming. Bottom-up from assembler is one of those. It suits some people very well and others not at all.
You can make gates out of more than just electronics (mechanical, electromechanical, pneumatic, hydraulic computers all exist), hence the suggestion to start there --- the abstraction doesn't really leak.
More seriously, it is possible to fit quite a lot of this stuff into an undergrad program - my Cambridge degree included logic gates (including drawing silicon layout with coloured pens), computer architecture (ARM assembler), a lab on FPGAs, Java, Standard ML (for type inference and lambda calculus), and had plenty of time left over for higher level topics.
There is one specific reason why assembly language is a good idea to help with C: pointers. Many students have trouble with them. The idea of storing an address in a register is the same as loading a pointer into an automatic variable. Pointer indirection is the same as assembly load or store register indirect.
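A tiny example of that correspondence (the instructions in the comments are a mental model, not actual compiler output):

    void example(void) {
        int x = 42;
        int *p = &x;   /* put the address of x in p   ~ load an address into a register */
        int y = *p;    /* read through the pointer    ~ load register-indirect          */
        *p = y + 1;    /* write through the pointer   ~ store register-indirect         */
    }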
> When you write C, you should be thiking what happens in assembler, and then writing it beautifylly in C.
That is a horrible, horrible idea. It gives people the misrepresentation that C is really "just" a thin veneer over assembly language, which it most certainly has not been for at least 20 years. Moreover, it feeds the cult of C, the idea that C is the only language capable of being close to the hardware and that other languages or attempts to build system-level languages need not apply.
C has lots of flaws and faults in the language, but we're not going to be able to make any progress in fixing the language, or more likely replacing it with a better language, if we continue acting as if it is the only way to get low-level access to the hardware.
C is a simple language that obviously maps to a programmer's idea of the assembly they'd write.
It biases towards lots of fixed-size stack frames and fixed-layout structs. Writing C code really does feel very close to the "bounded turing machine" metaphor that the CPU gives us.
Other languages have their advantages, but C's proximity to assembler, with all the faults included, really is a great pedagogical tool.
> C is a simple language that obviously maps to a programmer's idea of the assembly they'd write.
It's not. That's my point. You don't need undefined behavior to get to that point, just consider what assembly you get from this code:
    unsigned prog_a(unsigned a) { return a * 2; }
    unsigned prog_b(unsigned a) { return a / 3; }
What do you think you're going to get for that code? The "obvious mapping" principle suggests you're going to get a multiply and a division instruction, respectively. On x86, however, what you get is this:
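(Representative gcc -O2 output for x86-64; the exact instructions vary by compiler and version.)

    prog_a:                          # return a * 2
            lea     eax, [rdi+rdi]   # strength-reduced to an address computation
            ret
    prog_b:                          # return a / 3
            mov     eax, edi
            mov     edx, 2863311531  # 0xAAAAAAAB, a "magic" reciprocal of 3
            imul    rax, rdx         # multiply, keep the high bits...
            shr     rax, 33          # ...which is effectively a multiply-high
            ret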
In other words, the first program turned into a load-effective address instruction, and the latter turned into a multiply-hi instruction. Even something as simple as basic arithmetic gets turned into weird instructions, and which ones get this treatment is essentially a game of roulette based on compiler optimizations and particularly heuristics. We're not even bringing undefined behavior into play here, which is usually the major objection to C-obviously-maps-to-assembly; it's falling flat enough without it.
The primary advantage C brings, what people are really touting when they tout its proximity to assembly, isn't that it's close to assembly but that it has a well-known, stable ABI for that mapping. Of course, it turns out that said ABI is actually a bad idea if you want performant modern code on most architectures (the infamous array-of-struct versus struct-of-array problem).
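For reference, this is the layout difference in question (the element count is arbitrary):

    /* Array of structs: the natural C layout, each element's fields interleaved. */
    struct particle { float x, y, z, mass; };
    struct particle aos[1024];

    /* Struct of arrays: each field contiguous in memory, which vectorizes and
       prefetches far better on most modern hardware. */
    struct particles {
        float x[1024], y[1024], z[1024], mass[1024];
    } soa;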
It does, though, just not the way you planned. A litmus test for how close a language is to 'the hardware' is to see how much cruft the compiler frontend generates that the backend has to remove without any positive effect on optimization.
No, the litmus test would be "can I reason about the generated native code produced by this line of C". And you can't do that today. This leaves you two options: either switch to something where you can, going lower than C to be able to control all the details, or stop worrying about it, accept the fact that compilers take care of those details for you, and use a higher-level language.
With optimization turned off, each line of C does map to a fairly obvious sequence of native instructions, including instructions that are actually poking at memory-mapped hardware registers. No hidden constructors/destructors, no operator overloading, no exceptions, no "built in" data types that require thousands of lines of hidden library code mapping to what the hardware actually offers. That doesn't cover things like memory ordering and barriers, but it's still a lot closer than any other language in which software is actually written (so e.g. assembler or LLVM IR don't count).
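A sketch of the kind of line meant here; the register address and bit position are invented for the example:

    #include <stdint.h>

    #define STATUS_REG (*(volatile uint32_t *)0x40021018u)  /* hypothetical MMIO address */

    void enable_thing(void) {
        STATUS_REG |= (1u << 2);   /* maps to roughly: load the word, OR in the bit, store it back */
    }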
I believe Rust has the same property, but only a tiny fraction as much software is written in it. Ditto for Zig and Nim. C++ has all of the paradigm-breakers mentioned above, Go has its own hidden magic mostly related to goroutines, Java doesn't even try, and anything functional doesn't even occupy the same conceptual space.
As a mental exercise, try taking any random piece of generated assembler code and map it back to the corresponding higher-level-language code. I can usually do this pretty easily for C code, though switch/case statements present some challenges and heavy use of macros in any language can defeat the effort. Even calls to compiler-internal functions (e.g. memcpy-like functions for structure assignment) will be easy to interpret. Again, you can probably do the same for Rust/Nim/Zig. Attempting the same for C++ or Go will lead to all sorts of random places other than the original source line, and for just about anything else good luck even finding that generated assembler in the first place.
When you can map easily back and forth between the language and CPU instructions, you're as close to the hardware as you're going to get. This is something few programmers - even in kernels and such - actually need, but C offers it and most other languages don't.
> When you can map easily back and forth between the language and CPU instructions, you're as close to the hardware as you're going to get. This is something few programmers - even in kernels and such - actually need, but C offers it and most other languages don't.
If you think this is true, try building a native assembly to C decompiler. There are a lot of warts that you realize C gives you absolutely no way to describe. For example, C has no concept of unwinding, which both C++ and Rust do. (Admittedly, much of unwinding is handled by library functions, but one piece of the unwinding puzzle is do-something-in-this-function-when-unwinding, which can't be handled by library functions). There's no support for SIMD vector units in C. And getting floating point edge case support is tricky: there's a difference between a * b + c and fma(a, b, c), and that's before you get into trickiness like floating point exceptions, rounding mode, and signaling NaN.
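A concrete case where they differ, assuming IEEE-754 doubles; compile with contraction disabled (e.g. -ffp-contract=off) so the compiler doesn't quietly turn the first expression into an fma anyway:

    /* build with e.g.: cc -std=c99 -ffp-contract=off fma_demo.c -lm */
    #include <math.h>
    #include <stdio.h>

    int main(void) {
        double x = 1.0 + 0x1p-27;     /* 1 + 2^-27, exactly representable */
        double z = -(x * x);          /* negated, already-rounded product */
        printf("%g\n", x * x + z);    /* two roundings: prints 0 */
        printf("%g\n", fma(x, x, z)); /* one rounding: prints ~5.55e-17 (2^-54) */
        return 0;
    }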
The C language only has an easy bijective mapping to the actual CPU assembly if you focus on the easy scalar integer, general-purpose subset of both. Throw in floating point support, control flow [1], and vectors, and the mapping is much more difficult. Trying to support the actual hardware memory model basically kills the ability to map CPU assembly back to C/C++ [2]. The mapping just isn't as good as most people think it is, not if you actually care to start exploring the space of what can be done.
[1] I didn't mention this earlier, but try taking the address of blocks (this is a GNU extension, not standard C) while not doing anything else with them and see how well-preserved they are even at -O0. Spoiler alert: they're not.
[2] Virtually all of the effort to actually make hardware features invented in the past 20 years available in standard programming language is concentrated in the C++ working group, not the C working group.
OK, so how does any of that get easier with any other language? In what other language do SIMD instructions, floating-point edge cases, or fused multiply-add map neatly back into HLL constructs? The unwinding stuff is an even clearer example of other languages - notably those with exceptions - being further from the hardware. I was already aware of all of those things when I said "as close as you can get" and they only reinforce the point. Thanks for the help.
Not having access to SIMD vector units IMO is a feature. Let the compiler figure out if SIMD is useful and have a language that can be expressive enough to allow a compiler to do SIMD if needed. The other 1% use case of SIMD could be a separate extension.
Of course there are many semantically equivalent ways to translate from C to assembly, and the optimizer may choose in surprising ways. But the point is that when you write your functions, there is always an obvious assembly representation that should be in your head.
Of course, the #1 biggest problem of C is that C's semantics are not what people think they are. Most notably, pointers are not integers in the C abstract machine, nor can you really enforce such a mapping without prohibiting nearly all optimizations [1]. And when you teach people to understand how processors work via C, you start to claim that C's semantics are processor semantics, which has not been the case for a very long time.
[1] The underlying issue here is that the model you need for any sane optimization is that if you tell nobody that this box of data exists, then there is there no way for anyone else to refer to it. This basic fact opens up the ability to do things like move variables from stack allocations to registers, eliminate dead stores, etc. It also implies that you need some sort of pointer provenance in the semantic model to track who actually has the ability to refer to something. In practice, a semantic model for C needs to carry pointer provenance through integers as well, but that really means that C's integers also aren't mathematical integers (modulo some power of 2).
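A small illustration of what that buys the optimizer; publish() is a made-up external function whose only job is to let the address escape:

    void publish(int *p);   /* defined elsewhere: the address escapes through it */

    int private_sum(void) {
        int n = 0;                    /* address never taken: n can live in a register, */
        for (int i = 0; i < 10; i++)  /* and the whole loop typically folds to 45       */
            n += i;
        return n;
    }

    int shared_sum(void) {
        int n = 0;
        publish(&n);                  /* someone else may now hold a pointer to n, so   */
        for (int i = 0; i < 10; i++)  /* it needs a real stack slot, a store before the */
            n += i;                   /* call, and a reload afterwards                  */
        return n;
    }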
Pointers are integers on most hardware, if you ignore the complexities of multiple address spaces (i.e., a feature present on virtually anything that's not a general-purpose CPU) or segmentation (i.e., a feature of the most common CPU architecture for desktop and server processors).
In C, however, pointers are not integers. I've not fully built any kind of operational semantics for C, so I can't say what the full list of features you need to track for pointers is for sure, but I can take a stab at how you could describe it. A pointer consists of a tag of the object being stored in the object model (which needs to tell you what the current dynamic type of the object is and its size and alignment in addition to the actual memory contents of the object) as well as an offset of where it points inside that object. Iterating through an array would merely be incrementing (or decrementing, as the case may be) the offset field of the pointer, subject of course to the well-definedness characteristics that you can't exceed one past the end of the object or go before the beginning of the object.
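Concretely, under that model:

    void bounds_demo(void) {
        int a[4];
        int *end = a + 4;  /* one past the end: fine to form and compare, not to dereference */
        int *bad = a + 5;  /* undefined behaviour: arithmetic past one-past-the-end          */
        int *neg = a - 1;  /* undefined behaviour: before the start of the object            */
        (void)end; (void)bad; (void)neg;
    }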
At compile time, the compiler knows the type of a pointer, so it can interpret struct_ptr->field and do the proper arithmetic (field is 20 bytes offset into struct).
But when running, the pointers are just int64s on the stack. Check stdint.h. It's all right there.
Anyone writing C 'gets' roughly what's happening and knows the shape of their structs in memory.
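That offset really is fixed at compile time; the struct below and its 16-byte offset are just an illustration:

    #include <stddef.h>
    #include <stdio.h>

    struct packet {
        char tag[16];
        int  field;      /* the compiler knows exactly where this sits */
    };

    int main(void) {
        struct packet p = {0};
        p.field = 7;     /* compiles to a store at (address of p) + the field's offset */
        printf("%d at offset %zu\n", p.field, offsetof(struct packet, field));  /* 16 on typical ABIs */
        return 0;
    }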
He is committing a basic fallacy: when you discover some benefit of learning something related to X, you want to tell everybody to learn it before they start learning X. It's so common in programming discussions. It is a fallacy because oftentimes you can't appreciate something until you have a need for it or genuine curiosity.
I actually learnt assembler after being able to code in a few high-level languages. Assembler was a sort of "liberation", an exhilarating feeling of understanding: "so, that's it!". Even if you do not spend much time in assembler, being able to write a hello world and a program that prints prime numbers (or some graphical demo) is valuable knowledge in computing.
Programming without knowing anything about assembler is like driving without having ever opened the hood of your car and changed some oil. Of course, if you are a professional driver you will have specialist mechanics who take care of all of it, but it would still be ridiculous if you could not do a few simple things.
> being able to write a hello world and a program that prints prime numbers (or some graphical demo) is valuable knowledge in computing.
You are so wrong. It is not valuable at all. If you are able to do this, then you're just capable of signaling that you have been exposed to assembly and are familiar with the very basics of it. But how is it any useful? If you could find some intricate bug, or optimize a program in a considerable way, based on insight you gained from reading its assembly code, that would have been very valuable indeed.
First you need something like a simple Pascal. Otherwise you've got C that crashes all the time or you're stuck with some environment too opinionated about GC/references and a way of least resistance to put everything into its specific weird structures.
Most people coding in C don't hang out on HN or SO. They are typically electrical engineers working for companies selling hardware. They tend to be less web savvy than the average CS/programmer types, but they're the ones building the guts of the infrastructure that runs the world.
C isn't going anywhere, even in 10 years. People keep floating Rust and the like, but most targets for C have 16KB program space, and about that much in RAM budget. The C standard library itself is opportunistically linked on these platforms.
The dependencies required for Rust, or even C++, are larger than the code space of 90% of the microcontrollers in your car.
Fortunately, you don't have to use C for this any more. Rust has bindings for making extensions for Ruby, Python, Lua and Node, and generally it's very well suited for making zero-cost cleaner APIs on top of C FFI.
Depends on what you are doing. You often need to pull in the other language's runtime while libc is typically already in use. You also have to be careful about ABI details like string representations, concurrency, and threading features.
Node is actually a bit lackluster to write extensions for in C, as it prefers to go through V8's C++ ABI. Same for LLVM: they have a half-baked C interface to what's really a C++ ABI. There's an alright attempt at allowing for Rust bindings in Node: https://github.com/neon-bindings/neon
Because C is so bare-bones, highly generic APIs from higher-level languages can come off as rather lackluster when reduced to C's world view.
> There’s a lot of reading material out there. For the basics, I still recommend that students purchase K&R. People say good things about C Programming: A Modern Approach; I’ve only skimmed it.
This early quote filled me with dread. The article fulfilled that dread. IMHO K&R was a classic book but C has changed far too much for that to be of service to students, who would have to unlearn as much as they learn.
Then the author, a college teacher, offhandedly mentions another outdated book as if to recommend it but admits no firsthand knowledge of it. A serious programming student taking these recommendations has just spent a hundred bucks or so and a great deal of precious study time on material that will derail them from their studies with no benefit at all.
I don't know whether you read the article through, but he later has strong stances against some books with outdated or bad advice. Furthermore, he is well aware of the fact that C (and systems programming, and systems themselves) has changed over the years. In general, John Regehr (a professor, not a college teacher, mind you) is very knowledgeable of C, compilers and especially systems programming languages, memory safety and undefined behaviour. I would say that his words have some serious weight.
I'm self taught (well, related degree) so I missed out on OS and compiler classes that got into C beyond a basic level. What is the best resource out there for someone who is interested in learning C from a systems programming perspective? I would LOVE to dive in and learn everything I can about it, but there is so much contention out there about what to learn/where to learn it from that I get analysis paralysis and end up going nowhere with my interests.
K&R is a bit old, but I think it still serves as a good introduction to C. It can be supplemented with newer material, but it still serves as a pretty solid foundation.
I still have my university-library-xeroxed pages from Expert C Programming from 20 years ago... Seems like I got lucky since it was the second and only other C book I ever read on C, apart from K&R of course...
I wonder what his views are on using an IDE vs. using a plain text editor. Because getting to know the tools (compiler, build system) is important, I'd go for the latter choice.
What does he mean by "probably"? How could anyone in PL and teaching community ignore decades of research and development of teaching materials in Scheme and Standard ML?
Scheme (Racket) should be the first language. Standard ML or Haskell - the second. One will tap into two distinct whole sub-cultures (US and British respectively).
Also this way of teaching C as a specialized implementation language which must be used together with special tools is the right way. It is useless to teach C without or prior to machine architectures and assembler language basics.
C is by no means a general purpose language, even when the industry is using it this way and should be taught only as part of specialization in systems programming.
I kind of want to say it's important for programmers to have some knowledge of C but frankly I'm not sure that's true anymore.
There's only been one case in which I really needed C (because I was programming for a system from the 90s and I could only get C running with the tools I had).
As others have said in this thread, it might be useful for you if you end up in embedded devices or other areas of development, but it won't help you for the bulk of applications written in Java / Python or some other higher-level languages.
The biggest issue C has is that there's no good answer to the second big question.
No matter how you slice and dice C, its highest abstractions are functions, structs and pointers and all of them can fail catastrophically at each and every point of use.
One can't even add two numbers or read from a variable in C without the possibility of triggering undefined behaviour. And there's nothing that will rescue you from that possibility except paying attention, which is something us humans can be pretty bad at.
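Two of the cheapest ways to trip over it, for illustration:

    int add(int a, int b) {
        return a + b;   /* undefined behaviour if the sum overflows int */
    }

    int read_it(void) {
        int x;          /* never initialised, address never taken */
        return x + 1;   /* reading x is undefined behaviour        */
    }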
When using C, one is setting themselves up for failure, which already hints at the answer to the first big question - no. Go, Rust and C++ can do everything C does in a safer way.