The Safe C Library (2009) (drdobbs.com)
94 points by cvwright on April 12, 2017 | 108 comments



I'm saddened by some of the comments here. It's true that C is not safe. It's true you shouldn't be using it for large portions of your professional setting. It's true that these sort of libraries appear to be slapping a bandaid on a broken bone... All of those things are true, but you should learn C. And you should learn it well.

I wonder where we will be in as little as 10 or 20 years, when the C dinosaurs die and universities stop teaching the basics of computers. Young programmers are being pushed away from the harsh realities of C (not to mention anything below C), so who's going to be building the building blocks in the future? Who is going to keep optimising modern languages like Go and Rust? That minority is only going to get smaller...


It is called progress, one Luddite at a time.

C wasn't that relevant before business started to adopt UNIX workstations.

We were happily using other languages and will continue to do so.

Neither Go nor Rust are written in C.


> who's going to be building the building blocks in the future?

There will be tens of thousands of us. The prevailing media narrative will be that there is a crisis, with not enough people available to do the work. This will have less to do with the amount of work and more to do with how much employers would like to pay to have the work done.


The very definition of a programmer has already been significantly expanded (and slightly contracted in the other direction) over decades, and it will continue to change. It is natural that knowledge required in the past is optional or even obsolete today; what's the problem with that?


I can't speak for Rust, but for Go the answer is easy: Go programmers. That is why implementing the full Go stack in Go was such an important step: there is little C left in the toolchain, and consequently C programmers are not needed for most of the development.


> All of those things are true, but you should learn C. And you should learn it well.

No, you shouldn't, unless you're a historian. It's not just a waste of your time, there are enough bad design decisions that using it will actively make you a worse programmer.

> Who is going to keep optimising modern languages like Go and Rust?

Talented Rust programmers. The language is perfectly well suited to self-hosting.


Rust is certainly interesting but it's very new and it still has a way to go to achieve the breadth of hardware support it needs to replace C/C++.

C is a thin layer over hardware. There is a reason it hasn't yet been replaced despite waves of new, safer, more powerful paradigms and languages.

While its hold has (thankfully) been eroded for the higher-levels of abstraction needed in user-facing apps today, C/C++ are still king on low-level hardware because there is basically little to no overhead compared to any other safe language.

If you need to get your code onto an 8/16/24-bit microcontroller, or even most 32-bit ones, there is not a lot of choice.

Maybe Rust will eventually have enough backends, debugging tools and hardware vendor support on all these thousands of platforms to offer a compelling alternative, but it's disingenuous to claim you have to be a historian to study C when it's still one of the most used languages today and virtually all the hardware you own has most, if not all, of its code written in C/C++.

Rust may not be able to replace all of C's niches. It will -and I hope it does- replace C/C++ in many areas of development, but it's premature to call for the death of C when the alternative is still in its infancy.

You should definitely study C, then Rust and use what your place of work will let you use because most of the time, it's not your decision anyway.


> C is a thin layer over hardware. There is a reason it hasn't yet been replaced despite waves of new, safer, more powerful paradigms and languages.

C has already been replaced in what would have been the vast majority of its use cases 20-30 years ago, especially if we're talking about new code. At this point it's mainly alive due to inertia.

> If you need to get your code on a 8/16/24 and even most 32 bits microcontrollers, there is not a lot of choice.

Rust is starting to get some microcontroller support. And in many cases the business case has shifted away from true microcontrollers - there's no longer much or any price advantage over a proper ARM (for example), battery technology is getting better all the time...

> virtually all the hardware you own has most -if not all- of its code written in C/C++.

I don't think that's true these days. My phone is running a certain amount of C in its low-level OS, sure, but that will be dwarfed by the amount of userspace code it's running that's written in Java or similar.


I get what you're saying, but don't be so dismissive. There are a lot of industries that will probably never move off of C (aerospace, for one). Plus, C fluency is pretty much required for serious security work.


> There are a lot of industries that will probably never move off of C (aerospace, for one).

A lot of aerospace is in Ada already isn't it? And the parts that do use C use such specialized subsets of it that general-purpose C knowledge isn't very valuable (e.g. you won't be able to use any of the libraries you're used to if you're working in aerospace).

> Plus, C fluency is pretty much required for serious security work.

Only a very niche subset of security work.


(I used to work in aerospace).

Yeah it's not all C. At my Rolls-Royce we used SCADE and assembly, and we hired a guy from Raytheon in my last couple months that was used to working in Ada. In my experience, the difference tends to be civilian vs. military projects.

But those decisions are almost always based on platform. We used C because the platform we chose shipped with a C SDK, and we had a lot of tooling based around C, not to mention standards and processes (MISRA, FAA certification, etc.). The perspective is that hardware is a stricter constraint than software is, so as a software engineer you just make it work.

But what this means is that C is pretty dominant. Rare is the occasion you'll find dev kits shipped with Ada SDKs.

> And the parts that do use C use such specialized subsets of it that general-purpose C knowledge isn't very valuable (e.g. you won't be able to use any of the libraries you're used to if you're working in aerospace).

Ehh, I wouldn't say that. Depending on the product, you might even be doing such crazy things as parsing JSON/XML, running a PHP interpreter or a JS interpreter, or other non-embedded things, and you'll choose C because of hardware limitations, like running on a 32-bit 400 MHz chip.

Usually the most stringent requirement is "no dynamic allocation", but that doesn't rule anything out, you just need to know your constraints ahead of time.
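
As a rough illustration of what "no dynamic allocation" tends to look like in practice, here is a minimal sketch of a fixed-size pool in static storage. The names (`slot_t`, `slot_acquire`, `slot_release`) and the sizes are made up for the example, not taken from any project mentioned above.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SLOTS 8             /* capacity fixed at compile time (assumed value) */

typedef struct {
    unsigned char payload[64];   /* assumed message size */
    int in_use;
} slot_t;

static slot_t pool[POOL_SLOTS];  /* static storage, zero-initialized; no malloc */

/* Returns the first free slot, or NULL when the pool is exhausted. */
slot_t *slot_acquire(void)
{
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            return &pool[i];
        }
    }
    return NULL;
}

/* Marks a previously acquired slot as free again. */
void slot_release(slot_t *s)
{
    assert(s != NULL && s->in_use);
    s->in_use = 0;
}
```

The worst-case memory use is known at link time, which is the "know your constraints ahead of time" part; the cost is that callers have to handle a NULL return when the pool is full.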

Specifically on one project I worked on, we ran Redis and used MessagePack for IPC between our processors/boards. This was an embedded Linux project with essentially no certification requirements, but our hardware was so low-spec that we needed to be very focused on maintaining efficiency. C was really the only option with existing libraries, sufficient tooling, and the necessary performance profile.

But we were careful. Our codebase was under 20k LOC. We ran unit tests, integration tests, and fuzzing. We ran code through code reviews multiple times. We had a strict coding standard. We followed MISRA (mostly), even though we didn't need to. We wrote reams of requirements and specs, and documented our code meticulously (I spent more time flowcharting my code than writing it in the first place). None of that is gonna go away by switching to a memory safe language, so it's hard to say, "Hey, switch to this language that you're unfamiliar with, even though you'll still have to do all the extra 'be careful' work anyway".


You have completely missed the point. Who are going to be these Rust programmers if no one is learning low level stuff? If all you're doing is teaching young programmers how to program in a specific language, how are they ever going to get that much closer to the machine?


Why and how would learning C rather than Rust make them any closer to the machine? It's not like C corresponds particularly closely to the hardware (e.g. there's no way to get access to the carry flag). Learning assembly might be worthwhile, but learning C isn't.
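
To make the carry-flag point concrete: portable C cannot read the flag, so the carry has to be reconstructed from comparisons. A minimal sketch, with `add_with_carry` being a hypothetical helper name:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Adds a, b and an incoming carry. Stores the wrapped 32-bit sum in *sum
 * and returns the outgoing carry. Unsigned arithmetic wraps by definition
 * in C, so nothing here is undefined. */
bool add_with_carry(uint32_t a, uint32_t b, bool carry_in, uint32_t *sum)
{
    assert(sum != NULL);
    uint32_t partial = a + b;              /* reduced modulo 2^32 */
    bool carry = partial < a;              /* wrapped iff result is below an operand */
    uint32_t total = partial + (uint32_t)carry_in;
    carry = carry || (total < partial);    /* the +1 can wrap as well */
    *sum = total;
    return carry;
}
```

Compilers often recognize this pattern and emit an add-with-carry instruction, but that is an optimization, not something the language promises.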


Because C is less safe. Simple as. If I stop worrying about certain aspects of very low level programming, I will never learn how to implement the solutions that take care of that for other programmers to benefit from.

My goodness, I'm not saying C is better, or that programming needs to be harsh by default, that memory needs all of your attention or that garbage collectors are for peasants... But there are many bonuses to learning C even today. Teach a kid Rust today and he will never really know about memory safety: it's handled, don't worry about it! My concern is that when EVERYONE starts their journey like this, the small group of people who can handle it can only grow smaller.


Rust has unsafe for if you need it. You can also do resource management at high level in any language, and the techniques for that generalize very directly to implementing memory management.

There's no excuse for teaching a language that doesn't have sum types or decent structured literals, a language that twists intelligent people into absurdities like http://www.tedunangst.com/flak/post/string-interfaces. The Dijkstra quote about BASIC applies.


What makes you think you can't do (or teach) low-level stuff in Rust?


>I wonder where we will be in as little as 10, 20 years.

Can't you see the writing on the wall? In a javascript framework, hosted in javascript, all the way down, to some obscure operating system programmed by the ancients that it is best not to fiddle with.


I know C. I write software in C++.

C++ is better than C in pretty much every way imaginable. Imagine, for a moment, that C++ is C but with even more tools. You don't have to use those tools, but the tools are there. You could write pure C and use a C++ compiler and have very few troubles.

Structs? C++ has that.

Structs without any member functions or constructors or destructors or fancy automatic under-the-hood magic? C++ has that too.

Global functions? Don't worry.

Inline assembly? C++ has that.

The biggest trouble I've ever encountered is when ensuring portability of exported symbols. That's typically only needed when interfacing with other languages or libraries. And it's not impossible to handle (just annoying).

I could very well imagine a future where things are built with C++ instead of C. And that future will be good.


The problem is that C++ can also be worse than C in pretty much every way imaginable due to how large and complex the language is.

If you believe otherwise then you have not worked with enough C++ codebases.

C is much simpler than C++ and with C source code what you see is what you get.


"I'm saddened by some of the comments here. It's true that COBOL is not safe. It's true you shouldn't be using it for large portions of your professional setting. It's true that these sort of libraries appear to be slapping a bandaid on a broken bone... All of those things are true, but you should learn COBOL. And you should learn it well.

I wonder where we will be in as little as 10, 20 years. When the COBOL dinosaurs die and universities stop teaching the basics of computers. Young programmers are being pushed away from the harsh realities of COBOL (not to mention anything below COBOL), who's going to be building the building blocks in the future? Who's is going to keep optimising modern languages like Smalltalk and Simula? That minority is only going to get smaller..."

It would sound ridiculous then and it sounds ridiculous now.


The CheckedC compiler from Microsoft (built on LLVM) is probably a more complete solution to the issue of memory unsafety in C. Check it out!

https://github.com/Microsoft/checkedc


Only 8 mentions of Rust per C/C++ related link. I am disappointed!


14 now, including the one in my comment. That's more like it, but I think we can do even better.


C is not a safe language. This sort of thing seems (to me, at least) like a jury-rigged workaround rather than an actual solution.

If you actually care about memory safety—which you should, as the CVE list shows—then why not use an actual memory-safe language?


I don't disagree.

> If you actually care about memory safety ... then why not use an actual memory-safe language?

Many teams are stuck with established codebases that they can't afford to re-write entirely from scratch.

I think the proposal in the article is an interesting point on a continuum of mitigations for C, spanning from the really weak and/or hacky (but cheap/easy) all the way up to safe systems languages like Rust and Ivory. Other points on this continuum include Cyclone [1], SAFECode [2], and the fat pointer approach in Cello [3].

[1] http://cyclone.thelanguage.org/

[2] http://safecode.cs.illinois.edu/index.html

[3] http://libcello.org/learn/a-fat-pointer-library


And don't forget the llvm (and gcc) sanitizers[1] and SaferCPlusPlus[2] (essentially a memory safe subset of C++), which are more modern and have arguably better overall combinations of performance/safety/compatibility/practicality.

Automatic translation (assistance) from native C to SaferCPlusPlus is being worked on. And if you'll permit me to gripe a little here - you might think it's a reasonable undertaking, until you start contemplating code like this:

  int **var1 = (int **)malloc(20);
  var1[2] = (int *)malloc(10);
  int *var2 = var1[2] + 4;
  var2--;
  free(var2 - 3);
  free(var1);
:)

[1] https://github.com/google/sanitizers

[2] shameless plug: https://github.com/duneroadrunner/SaferCPlusPlus


> Many teams are stuck with established codebases that they can't afford to re-write entirely from scratch.

You don't have to. Most alternatives can call compiled C libraries. Rust can.


Converting a monolithic chunk of code into modules then rewriting it in another language takes a lot of effort with not too much to show for it.


Rust can be compiled to be called from C too.


I once wrote a C interpreter for scripting (picoc) which had memory safety. To my surprise I received a lot of complaints from people who wanted the more conventional memory unsafe semantics. I changed it to being memory unsafe and I've never had any requests to change it back.

Perhaps there's a place in this world for both approaches?


Or, more probably, your userbase was full of C developers who can't imagine that it is possible to code outside the C mindset.

Take into account how many can't even consider the possibility that a full OS can be (and has been) made safer, with GC, NOT in C, and NOT EVEN LIKE C, etc.

C, C++ and Unix are the biggest roadblock to the progress of computing. Or more exactly, their users, who prefer to patch along and commit millions of dollars in wasted effort rather than build (or rediscover) better ways.

This is like climate change resistance. Why go solar, if carbon is better and well understood? Instead, push ahead and waste money fixing symptoms instead of causes.

And "causes" is a word that here means C, C++, Unix, etc...


That's an unfair judgement toward the C development community as a whole.

There are legitimate use cases for C. If you're writing low-level, bare-metal programs then there is no other portable programming language that offers better semantics. C programmers often laud that language because they can easily reason what the output of the compiler will be. That is an extremely important and rare property for a portable programming language.

But hey, don't listen to me, listen to an actual experienced C programmer: https://youtu.be/MShbP3OpASA?t=20m45s


The key-word is "portable". C is running on momentum in the embedded world because for many chips you only have a crappy "kinda C, but not standard C" compiler the vendor provides. Ada for example is a nicer language than C, suitable for embedded work, but not as popular and hence less supported. So is Pascal, and some more modern languages that get mentioned often enough on HN that some readers find it disagreeable.


> C programmers often laud that language because they can easily reason what the output of the compiler will be.

They certainly do, and in doing so, conveniently forget all the instances of their compilers surprisingly deleting invalid code as optimization because it inadvertently depended upon undefined behaviour.
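
A small sketch of the kind of thing meant here, assuming the classic signed-overflow case. The broken pattern is shown only in a comment; `add_checked` is a hypothetical name for the version that tests the range before doing the arithmetic.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Broken pattern (do not use): with b > 0, the test below performs the
 * overflowing addition before the comparison. Signed overflow is
 * undefined, so an optimizer may assume it never happens and drop the
 * check entirely.
 *
 *     if (a + b < a) return false;
 */

/* Checks the range first, so the addition itself can never overflow. */
bool add_checked(int a, int b, int *out)
{
    assert(out != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;   /* would overflow; *out is left untouched */
    }
    *out = a + b;
    return true;
}
```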


> There are legitimate use cases for C*

* Or C-like langs...

Of course something must be used to fill the niche C has. But it is clear that there is huge resistance to filling it with something better, to the point that almost anything else is dismissed and the only "valid" answer is to continue with C (or C++, or Unix, or old-school terminals).

You have probably seen

https://www.destroyallsoftware.com/talks/a-whole-new-world

It is not hard to imagine that better tools should have arisen over all these decades. Instead, we (as a community) are stuck with more or less the same things, and maybe even regressing in some areas.


Would Rust fit the bill, though? It's not as mature and doesn't support as many platforms yet, but it is now considered a serious, safer alternative with just as much bare-metal capability.


No.

Because it's not as mature, and doesn't support as many platforms. Those things are very important to me, and there is no steel thread for them.

I imagine that in twenty years, it could be worth talking about rust as an alternative, but rust is only two years "stable", and it's unclear what the language, the tooling and the libraries are going to look like in that time.

Meanwhile, I have software I wrote 20 years ago in C that still runs, correctly, and on the Internet, generating income, and a big part of the reason why is that C was already more than twenty years old at that point.


> I imagine that in twenty years...

But WHY in 20? With this thinking THE SITUATION WILL NEVER IMPROVE!

And I don't mean you, I mean the whole industry.

Obviously stability and compatibility are important, but isn't this almost like the COBOL problem? Nobody tries to move off COBOL because "maybe in the far future a solution will emerge", and then the future gets here and we are decades LATE.

And Rust is not the only viable alternative. Ada, Pascal, Oberon and Modula (to name the family of languages I know best; likely some Lisps and others exist too) were already there decades ago, without breaking too much from the C mindset (or going crazy with monads and stuff like that).

So, since it looks like a systems language without {} has no chance in hell, why not take Oberon and give it C syntax? Then we are not waiting 20 years for it.

But this will fail, for the same reasons that renewable energy will fail against carbon:

The community will not accept the short-term costs, no matter how much things could improve later...

P.S.: And what about a "Rust-like" language that transpiles to C, as Nim does?

Or even better:

C2 (C with improvements and as much baggage eliminated as possible) transpiling to C. Eventually this would allow bootstrapping while still keeping a path for old code bases...


That's really interesting. Is there any documentation around for the memory-safe version? I'd like to study your implementation!


For a new language to become safe, you need to write state-of-the-art linters, static source code analyzers, dynamic code analyzers, code coverage analysis tools, and testing tools for the language.

C has all these. Only C/C++ and Ada have these.

People who say "C is not a safe language" may not be aware, but most of the world's safety-critical code is written in C or ancient C++. Most aerospace, military and medical applications are written either in C/C++ or Ada. Ada would be much safer than C, but it has been abandoned for C/C++ because the choice of language is not that important.

If the culture and tooling to write safe code is not there, changing language is not going to help. It's currently easier to write safe code in C than it is to write safe code in your favorite 'memory safe' language like Rust.

https://en.wikipedia.org/wiki/MISRA_C


This is so true. It seems like most of the arguments against C are made in isolation from the ecosystem that has been built around C. At this point we should be helping new programmers learn how to work with C using all the safety equipment we now have.


Because only around 1% of all C developers care to use such band aids.


They are not band aids.

Analysis tools for C exceed anything you can get from the Rust or Go compilers in scope and features.


Something that is optional doesn't exceed what have been standard safety features in programming languages ever since Algol and its derived languages have existed.

Having external tools to track out-of-bounds errors, use after free, double free, freeing bad pointers, implicit conversions and UB, errors that don't happen in other languages, is a band aid.

Other languages just don't need them thanks to being designed with safety first.


I write in C (mostly) and very few of my bugs are the typical errors C is famous for - 99.9% are logic bugs that can occur in any language. I have lots of great tools to catch the C type bugs - I wish I had a tool to catch my logic bugs. Tests can only find the bugs where the code doesn't do what I expect, not where what I expect is wrong.


Which tells me that you work alone without other devs of various skillsets messing around your beautiful C code.


Well I wouldn't call my C code beautiful, but no I don't work alone. I do only work with devs better than me :)


> Other languages just don't need them thanks to being designed with safety first.

Other languages are not used for writing safety critical code. Language features are not even close being enough for that.


Ada and SPARK are such languages, better improve your knowledge.


I specifically mentioned Ada alongside C/C++ in my first post, the one that started the thread.

No need to defensively boast if you realize that you are wrong.


I don't need to be defensive in any way.

The band-aids called MISRA-C, Frama-C, High Integrity C++ and DOD certifications are a mechanism to force Ada-like safe semantics onto C and C++, while paying lower salaries, instead of hiring Ada devs.

F-35 proves how good they work out in large scale projects.


Exactly. The real problem is many C developers are unaware of these tools. Rather than just saying use programming language x, we should be helping people use C with the right tools.


Not at all.

These tools have existed since 1979, and it hasn't changed anything.

C doesn't deserve a place on any security conscious developer toolbox.


> This sort of thing seems (to me, at least) like a jury-rigged workaround rather than an actual solution.

Really? I had a look at the code and I would agree with the basic idea of eliminating errno and always using the return value.
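
For readers who haven't looked at the code, the general idea can be sketched like this. It is a simplified illustration of "report errors through the return value, never errno", not the library's actual interface; `copy_status` and `copy_string` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { COPY_OK = 0, COPY_BAD_ARG, COPY_TOO_SMALL } copy_status;

/* Copies src into dst, where dst_size is the capacity of dst in bytes
 * including the terminator. Failure is reported through the return
 * value; errno is never touched. On failure with a usable dst, dst is
 * set to the empty string so it is never left unterminated. */
copy_status copy_string(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || dst_size == 0) {
        return COPY_BAD_ARG;
    }
    if (src == NULL) {
        dst[0] = '\0';
        return COPY_BAD_ARG;
    }
    size_t len = strlen(src);
    if (len >= dst_size) {
        dst[0] = '\0';
        return COPY_TOO_SMALL;
    }
    memcpy(dst, src, len + 1);   /* len + 1 copies the terminator too */
    return COPY_OK;
}
```

The caller can't forget to check a global, and the status can't be clobbered by an unrelated call in between.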

> If you actually care about memory safety—which you should, as the CVE list shows—then why not use an actual memory-safe language?

Following the logic here, you seem to be implying that no-one except the irresponsible should use C.


> Following the logic here, you seem to be implying that no-one except the irresponsible should use C.

There are some platforms (such as embedded systems) where C really is the only choice. But if you do have a choice? Then I don't think using C is a good idea.

I'm not sure if "irresponsible" is the right word, though. I think developers underestimate the risks of C's unsafety and/or overestimate their ability to properly deal with it. Hanlon's razor[0] applies.

0. https://en.wikipedia.org/wiki/Hanlon%27s_razor


Given a problem, C may or may not be the correct tool to solve it. Decrying C as so dangerous that it can never be safely used under any circumstances, as you and others on this discussion seem to be doing, is absolutely ridiculous, like trying to ban chefs from ever using sharp knives because they might cut themselves.

Much of the software that we all currently rely on is in C, for example all of the common operating systems are in C, and the reliability of these operating systems shows very clearly that C is nowhere near as inherently dangerous as you claim it to be.


"There are some platforms (such as embedded systems) where C really is the only choice. "

That's never true. You can always use a higher-level, safer language that can compile to C. The result is C code but you never write it & avoid its pitfalls. Ivory language in Haskell, Ada-to-C, and Modula-2-to-C come to mind. Alternatively, you can use something even more primitive in a C subset or native assembly that it compiles to. OcaPIC or JavaCard come to mind. Astrobe Oberon is another for safe, low-level coding but not sure how its compiler works.


Adding an extra compilation step doesn't make debugging easier. The embedded C compilers also aren't known for their stability and standards compliance, so you'd likely have to massage your generated C to avoid bugs in the C compiler.


Those are good points worth considering for a given tool.


No you can write rust now.


Per the Reenix OS implementation:

"This boot-loader, unfortunately, did not support loading any kernel images larger than 4 megabytes. This turned into a problem very quickly as it turns out that rustc is far less adept than gcc at creating succinct output. In fact, I was hitting this problem so early I was barely able to make a “Hello World” before having to stop working on rust code. Fixing this problem required rewriting most of the early boot code, all of which was x86 assembly ..."


Not for all targets.


> Following the logic here, you seem to be implying that no-one except the irresponsible should use C.

This seems highly likely, yes.

C was a good language at the time. It's still probably a very good teaching language for systems programming. But it's a bad language for anyone who wants to write software that needs to be used responsibly. It predates the entire idea that some people on the network might be hostile. It predates the knowledge that buffer overflows might be used for malice. It predates about 40 years of programming language research and security research, some of which has even shown results.

C also predates computers of modern speed; it was written at a time when interpreted languages were mostly unthinkable for serious software. There's no reason that, say, the manpage rendering tools (the most recent thing I had half a mind of exposing to the network) would be written in C today, but it was presumably the best tool for the job at the time.

It is no criticism of C to say that it has served us well but it is a bad choice for new software for clear and objective reasons.


ESPOL and a full OS written in it, Burroughs B5500 nowadays sold as Unisys MCP, predates C by about 10 years, having been released in 1961.

There were already more powerful computers than the PDP-11, with saner languages.


> There's no reason that, say, the manpage rendering tools (the most recent thing I had half a mind of exposing to the network) would be written in C today

Someone called Kristaps Dzonsons wrote a manual page renderer and a cgi/fastcgi library in C quite recently.


In how many "safe" languages can you write a complete shipping application without calling into libraries and/or kernels written in C?


It depends on what platform you are targeting, but as far as safe languages go, it is hard to beat Ada: https://en.wikipedia.org/wiki/Ada_(programming_language)

Although Rust shows a lot of promise, Ada has a proven track record of use in safety critical applications such as avionics and railways.


Is C being in there at some point even relevant? There are plenty of safer languages that can do that. There are also containers, OSes, and embedded runtimes for some. The first business mainframe was even programmed in an early one. It's even still on the market, although less desirable today. And C itself has automated tooling available, like SAFECode, that can make it safer, with one team doing that for FreeBSD.

So you can make the C safe or get rid of it. However much work you want to put in.


You can make the same argument for assembler.


Today? Zero, probably.

Five years from now? Probably a number of them. You've been able to write in Go with no standard library for years. There's a project to avoid it in Rust. There are lots of active kernel development projects. Do you want your software to continue to be written in C with a kernel and standard library in safe languages?

And why does it matter to you to avoid improving your code until it's possible to run on an all-safe stack? It is as if it is 1990, and you are asking, why should I license my code as free software? How much can I run on a free C library and a free kernel?


You don't need a project to do it with Rust, you just need a #![no_std] marker.


You can do so with Rust, and there's a number of them, from toy POCs to much more serious projects.


Lots of them, like any Modula descendent.

C only became relevant outside UNIX around the late 90's.


That doesn't sound right. A ton of DOS software in the late 80s and early 90s was written in C.


Not at all.

MS-DOS was written in Assembly, just like any application where performance mattered.

CRUD applications were mostly written in an xBase dialect, with Clipper being the prominent one.

Turbo Pascal, Turbo C, Turbo C++, Turbo Basic, Quick Basic, Quick C, Quick Pascal.... All of them had plenty of users.

I own MS-DOS system programming books, with examples in Assembly, Pascal, Basic and C for both Microsoft and Borland variants.


Are you thinking of Pascal, maybe? I thought that was the language of choice for DOS application development (and 8086 assembly, of course).


I'm thinking of games, really. The ones I'm most familiar with had certain performance-critical parts written in assembly, with the bulk of the connective code written in C. I suppose it depends on the definition of "written in C".


So not C at all, that was a common practice in Basic and Pascal as well.

Those languages were the "Unity" of the '80s and '90s.

Real games were 100% Assembly.

This only started to change with the adoption of Watcom C, as it could generate relatively good code and use a DOS 32 bit Extender.


That and advanced variants of BASIC such as GWBasic and QBasic.

There is a good reason why Visual Basic is still alive.


> C is not a safe language

More like "most C implementations do not offer memory safety by default". This is unrelated to the language per se, it's like for example calling Scheme unsafe because an implementation could have set-cdr! on '() write on random memory. It can be applied to any language.


Is it C if it's incompatible with any standardized or widely accepted notion of it?


(0) Memory safety alone doesn't fix the downsides of lacking a formally defined semantics. You can turn most undefined behavior into trapped errors for the cheap price of a constant overhead factor, but that won't make writing correct programs any little bit easier. Your programs will just fail in more manageable ways.

(1) Once you have a formally defined semantics, undefined behavior is no big deal, because you have the tools to prove its absence in whatever program you're writing.


Because every platform generally has a C compiler (which is usually GCC, not LLVM/clang).


even my tp link router has gcc!


> C is not a safe language.

In what ways is it unsafe? Thinking about the examples you'll cite, can the programmer introduce checks and safety nets to overcome them? For example, you'll no doubt mention buffer overflows, but can these be protected against?


Yes, it is possible to build checks and safety nets to overcome them. But it isn't easy.

The simplest example here is the one the Safe C Library is about: it's very hard in C to talk about memory buffers that know their own capacity. C wants you to think about pointers, often with implicit lengths (memcpy takes a single size parameter, strcpy takes none, tmpnam takes none, etc.). You can write a library where you pass lengths alongside every pointer. But you've got to keep track of two variables for what should conceptually be a single variable.

Now, one approach is to make a struct buffer {char *ptr; size_t len;}. But then you run into C's problem that it has no encapsulation or operator overloading. You can't index into a buffer; you can call a function that checks the length, or you can just do the tempting thing of ->ptr[i], which you'll get wrong at some point. The compiler won't help you notice that you got it wrong.
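A minimal sketch of that struct buffer idea, assuming hypothetical helper names (buffer_get and buffer_set are mine, not from any library). It shows the checked accessors, and also that nothing in the language stops a caller from bypassing them:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "fat pointer": the pointer and its capacity travel together. */
struct buffer {
    char  *ptr;
    size_t len;
};

/* Checked read: stores the byte in *out and returns true only if i is in bounds. */
bool buffer_get(const struct buffer *b, size_t i, char *out)
{
    assert(b != NULL && out != NULL);   /* passing NULL here is a programming error */
    if (b->ptr == NULL || i >= b->len)
        return false;
    *out = b->ptr[i];
    return true;
}

/* Checked write: returns false instead of writing past the end. */
bool buffer_set(struct buffer *b, size_t i, char value)
{
    assert(b != NULL);
    if (b->ptr == NULL || i >= b->len)
        return false;
    b->ptr[i] = value;
    return true;
}
```

With a 4-byte buffer, buffer_set(&b, 4, 'x') returns false, but b.ptr[4] = 'x' still compiles without complaint, which is exactly the temptation described above.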

And even if you do that, the C standard library is full of the unsafe approach to pointers. If you scrupulously avoid all functions that use unsafe pointers, you've also got to avoid the C standard library as well as almost all third-party libraries. This is not what most people think of when they think of using C.

C++ (and especially C++11 with good discipline) is a very good step forward in the direction of making C more safe, in no small part because C++ has its own standard library and the entire ecosystem tends to use that standard library. Due to the desire to be backwards-compatible with C, C++ retains a lot of unsafety, but it makes safer paradigms possible. If you're not bound to backwards-compatibility, though, there's probably no reason to voluntarily take the burden of dealing with the things that are unsafe in C.


> Now, one approach is to make a struct buffer {char *ptr; size_t len;}.

A "fat" pointer. That's essentially what the Cello project does [1]. The author has pulled together some crazy hacks to make this approach accessible in C, including, apparently, his own version of many standard library routines. Crazy, but very interesting. Discussed on HN yesterday/today: https://news.ycombinator.com/item?id=14091630

[1] http://libcello.org/learn/a-fat-pointer-library


With multiple threads, detecting UB is undecidable (as seen in https://www.ideals.illinois.edu/bitstream/handle/2142/30780/...). If you ignore the weirdest UB and are willing to eat a lot of performance at runtime there is probably a conservative version that works though (there is for sure in the single-threaded case, e.g. see http://robbertkrebbers.nl/research/thesis.pdf).


One reason is that they use a library to manage memory. They also use linters, which means that the C they write is totally different from the C you see in tutorials. Of course, it depends on the company/project.


There are no unsafe languages, only unsafe coders.


I'm going to go ahead and say that if (1) deciding whether code is valid for your language is undecidable, (2) no industrial compiler makes any effort to conservatively accept only valid programs, and (3) unsoundness cannot even be detected at runtime (this is true of C in the multithreaded case) and may result in completely arbitrary behavior, your language is unsafe.


There are only unsafe coders.


There are no unsafe cars, either, only unsafe drivers.

However, there also are plenty of cars that safe drivers will refuse to drive. You don't need crumple zones, safety belts, etc. in a car you will never drive.


> There are no unsafe languages, only unsafe coders.

There are no unsafe nuclear launch buttons, only unsafe nuclear launch button pushers.

Hmm, maybe some buttons should be harder to push...


I enjoy using computers because they make fewer mistakes than humans.

I would like to apply the same principle to the language I program computers in. Make it the computer's job to minimize mistakes - it might not be perfect, but it's definitely better than I am!


I agree with this sentiment. What I disagree with is the sentiment (which I'm not saying you expressed) that there is a silver bullet for it, i.e., if only our test coverage were complete, or our type system were sophisticated enough.


You're right, there's no silver bullet. But it seems like some people have the mindset of "Since everything is bad (to some extent), there's no point in trying to improve."


>There are no unsafe languages, only unsafe coders.

And who develops the languages? Coders.

Ergo, unsafe languages exist.


This is a link to https://sourceforge.net/projects/safeclib/ , an implementation of ISO/IEC TR 24731-1. In brief, TR24731-1 provides:

- a type rsize_t, typedef'd to size_t, but by convention limited to a lower value RSIZE_MAX that is ideally no larger than SIZE_MAX/2, thereby preventing some classes of arithmetic overflows (adding two valid rsize_ts never overflows, but the next function using it will complain at runtime if it exceeds RSIZE_MAX)

- a large number of functions ending in _s, like memcpy_s and tmpnam_s, which use rsize_t as their size type, take lengths with every buffer, and return an errno instead of a direct value
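To make the two bullets concrete, here is a rough sketch in plain C. It is not the safeclib or TR 24731-1 implementation: SKETCH_RSIZE_MAX, sketch_add_sizes and sketch_memcpy_s are hypothetical names, the error values are illustrative, and the overlap check that the real memcpy_s performs is skipped by using memmove.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed limit in the spirit of RSIZE_MAX: no larger than SIZE_MAX / 2. */
#define SKETCH_RSIZE_MAX (SIZE_MAX / 2)

/* Adding two in-range sizes cannot wrap around size_t. */
size_t sketch_add_sizes(size_t a, size_t b)
{
    assert(a <= SKETCH_RSIZE_MAX && b <= SKETCH_RSIZE_MAX);
    return a + b;   /* at most SIZE_MAX - 1; may still exceed the limit */
}

/* Bounded copy in the style of memcpy_s: returns 0 on success, an errno value otherwise. */
int sketch_memcpy_s(void *dest, size_t destmax, const void *src, size_t n)
{
    if (dest == NULL || destmax > SKETCH_RSIZE_MAX)
        return EINVAL;              /* destination unusable: touch nothing */
    if (src == NULL || n > SKETCH_RSIZE_MAX || n > destmax) {
        memset(dest, 0, destmax);   /* fail closed: clear the destination */
        return (src == NULL) ? EINVAL : ERANGE;
    }
    memmove(dest, src, n);
    return 0;
}
```

A size computed as sketch_add_sizes(SKETCH_RSIZE_MAX, SKETCH_RSIZE_MAX) does not wrap, but passing it to sketch_memcpy_s as destmax is rejected at runtime, which is the behavior the first bullet describes.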

Wikipedia says that TR24731-1 "has met severe criticism with some praise.... Despite this, TR 24731-1 has been implemented into Microsoft's C standard library and its compiler issues warnings when using old 'insecure' functions." glibc 2.25, released last month, has provisional support for it.

Wikipedia links to this Austin Group report criticizing TR24731: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1106.txt

> The approach of simply providing a length argument to a number of string functions may cause as many problems as it solves. It provides no assurance that the programmer provide a sensible value, and may end up obfuscating otherwise clear and safe usage of the original function. Functions that allocate memory (using malloc()) would provide safer, clearer, and more robust interfaces. E.g. strdup() instead of strcpy().

and this Stack Overflow thread: https://stackoverflow.com/questions/372980/do-you-use-the-tr...

On a side note, especially now that it's a decade or so after TR24731-1 was released, there are many good languages (some compiled, some interpreted, only one of which rhymes with "overdiscussed") that are suitable for a lot of the things you'd have wanted to use C for a decade ago, and solve the problem in a straightforward way that resolves the objection above: they provide either buffer types that remember their own capacity, string objects capable of allocating as necessary, or both. There are also C libraries that do the same (bstring and GTK+ come to mind), if you're inclined to continue writing C.
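For what the Austin Group's "strdup() instead of strcpy()" suggestion looks like in practice, here is a small sketch. dup_string is a hypothetical stand-in for POSIX strdup(), written out so the allocation is visible; the point is that the callee sizes the destination, so the caller has no length to get wrong.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup(): allocates exactly what the copy needs. */
char *dup_string(const char *src)
{
    assert(src != NULL);            /* like strdup(), NULL input is a caller bug */
    size_t n = strlen(src) + 1;     /* include the terminating NUL */
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, src, n);
    return copy;                    /* NULL on allocation failure; caller frees */
}
```

The trade-off is that the caller now has to check for NULL and remember to free() the result, so the bookkeeping moves rather than disappears.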


I forked it at https://github.com/rurban/safeclib

Besides the criticism mentioned, there are several outstanding issues I'll fix: add sprintf_s, fix strljustify_s from the 2 unmerged merge requests, and fix the memset_s API to be C11 compatible.


Done and merge request for 2.0 sent upstream.


> (using malloc()) would provide safer, clearer, and more robust interfaces. E.g. strdup() instead of strcpy().

malloc() ... Lutz


In this whole thread, among all these talks about memory safety, zero mentions of valgrind.


There is no substitute for discipline. The unskilled always blame their tools. That is all.


Safety first, an opt out for control.


(2009)


The library itself seems to have been updated as recently as November 2016.



