
Curl is C - mhasbini
https://daniel.haxx.se/blog/2017/03/27/curl-is-c/
======
simias
I have no problem with Curl being written in C (I'll take battle-tested C over
experimental Rust) but this point seemed odd to me:

>C is not the primary reason for our past vulnerabilities

>There. The simple fact is that most of our past vulnerabilities happened
because of logical mistakes in the code. Logical mistakes that aren’t really
language bound and they would not be fixed simply by changing language.

So I looked at
[https://curl.haxx.se/docs/security.html](https://curl.haxx.se/docs/security.html)

#61 -> uninitialized random : libcurl's (new) internal function that returns a
good 32bit random value was implemented poorly and overwrote the pointer
instead of writing the value into the buffer the pointer pointed to.
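
As a rough illustration of the bug class described for #61 (this is not curl's actual code; `fill_bytes`, `random32_buggy` and `random32_fixed` are hypothetical names, and the byte source is a fixed pattern rather than real entropy), passing the address of the pointer instead of the pointer itself compiles cleanly once a cast is involved:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical byte source: fills dst with a fixed pattern so the effect
   is easy to observe. A real implementation would use an entropy source. */
static void fill_bytes(unsigned char *dst, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = 0xAB;
}

/* Buggy shape: &out is the address of the local pointer variable, so the
   bytes land in the pointer itself and the caller's buffer stays untouched. */
static void random32_buggy(uint32_t *out)
{
    fill_bytes((unsigned char *)&out, sizeof(uint32_t));
}

/* Fixed shape: write into the buffer the pointer refers to. */
static void random32_fixed(uint32_t *out)
{
    fill_bytes((unsigned char *)out, sizeof(*out));
}
```

The cast silences the type mismatch a C compiler would otherwise report, so the buggy version builds without a diagnostic and the caller is left with whatever its buffer held before the call.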

#60 -> printf floating point buffer overflow

#57 -> cookie injection for other servers : The issue pertains to the function
that loads cookies into memory, which reads the specified file into a fixed-
size buffer in a line-by-line manner using the fgets() function. If an
invocation of fgets() cannot read the whole line into the destination buffer
due to it being too small, it truncates the output

This one is arguably not really a failure of C itself, but I'd argue that Rust
encourages a more robust error handling through its Options and Results when C
tends to abuse "-1" and NULL return types that need careful checking and can't
usually be enforced by the compiler.
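
A minimal sketch of the careful checking that paragraph refers to, assuming a hypothetical helper `read_line` and status enum (neither is from curl): the caller is told when `fgets()` could not fit the whole line, and the remainder of the over-long line is discarded so it is not later parsed as a separate line.

```c
#include <stdio.h>
#include <string.h>

enum line_status { LINE_OK, LINE_TRUNCATED, LINE_EOF };

/* Reads one line into buf (size is assumed to be at least 2).
   Strips the trailing newline on success. If the line does not fit,
   the rest of it is consumed and LINE_TRUNCATED is returned. */
static enum line_status read_line(FILE *fp, char *buf, size_t size)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return LINE_EOF;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return LINE_OK;
    }

    /* No newline was stored: either the input ended here, the newline is
       the very next character, or the line is longer than the buffer. */
    int c = getc(fp);
    if (c == EOF || c == '\n')
        return LINE_OK;

    /* Over-long line: drain it up to and including its newline. */
    while (c != EOF && c != '\n')
        c = getc(fp);
    return LINE_TRUNCATED;
}
```

Nothing in C forces the caller to look at the returned status, which is the contrast with `Option`/`Result` being drawn above.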

#55 -> OOB write via unchecked multiplication

Rust has checked multiplication enabled by default in debug builds, and
regardless of that the OOB wouldn't be possible.
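
For comparison, the equivalent check has to be written by hand in C. A minimal sketch, with hypothetical helper names `mul_size_checked` and `alloc_array` (not curl functions), of refusing an allocation whose size computation would wrap:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Stores a * b in *out and returns true, or returns false (leaving *out
   untouched) if the product does not fit in size_t. */
static bool mul_size_checked(size_t a, size_t b, size_t *out)
{
    if (a != 0 && b > SIZE_MAX / a)
        return false;
    *out = a * b;
    return true;
}

/* Allocates room for count elements of elem_size bytes each, returning
   NULL instead of a too-small buffer when the size would overflow. */
static void *alloc_array(size_t count, size_t elem_size)
{
    size_t total;
    if (!mul_size_checked(count, elem_size, &total))
        return NULL;
    return malloc(total);
}
```

Unsigned overflow in C wraps silently rather than being undefined, so without the explicit division check the allocation succeeds with a small size and later writes run past it.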

#54 -> Double free in curl_maprintf

#53 -> Double free in krb5 code

#52 -> glob parser write/read out of bound

And I'll stop here, so far 7 out of 11 vulnerabilities would probably have
been avoided with a safer language. Looks like the _vast majority_ of these
issues wouldn't have been possible in safe Rust.

~~~
eddieroger
> Of course that leaves a share of problems that could’ve been avoided if we
> used another language. Buffer overflows, double frees and out of boundary
> reads etc, but the bulk of our security problems has not happened due to
> curl being written in C.

He addressed all of those points in the second short paragraph. None of those
are C vulnerabilities, they were mistakes made on the part of the developers,
not the language. That a safer language would have avoided a problem doesn't
mean that, when it happens, it's the language's fault.

~~~
pjmlp
These are surely C vulnerabilities and contradict the statement regarding zero
policy static analysis errors.

#55 -> OOB write via unchecked multiplication

#54 -> Double free in curl_maprintf

#53 -> Double free in krb5 code

#52 -> glob parser write/read out of bound

~~~
mikulas_florek
Double free is not necessarily a C issue; it can also be a program logic
issue: I expect to have an object, but it's already deleted. So it's one of:

1\. object should not exist and the second free is incorrect

2\. object should exist and the first free is incorrect

3\. object existence is uncertain and the second free must somehow check that

Although I can double-free only in unsafe languages, the wrong logic behind it
can be the same in safe languages. It just has different consequences.

~~~
xorblurb
Of course it is a C issue, in the same sense that ALL logic errors are in the
case you describe -- so either C issues do not exist or you did not manage to
find the correct definition: for ex OOB access is caused by faulty program
logic, and the consequences are dramatic in unsafe languages. That is an issue
of the language, despite you being able to compute OOB indexes in any
language. Same thing for double free; the language issue is that the result is
catastrophic, not that you can write for ex faulty logic attempting a
liberation too early, or an extra one. (Because in safe languages, the result
is not _as_ catastrophic as in unsafe languages).

That is the _whole_ concept of safety.

~~~
mikulas_florek
Let's take some safe language, e.g. c#

1\. object should not exist and the second free is incorrect

In C# I can have two variables pointing to the same object, and I null only one
of them. The second should be nulled too, but it's not. That's a logic error. So
in C# I end up with some object that should not be there, but it is. Which is
better, double-free or undestroyed object, depends on the use case.

~~~
pjmlp
The _big_ difference that you are overlooking is that double free leads to
memory corruption, with undefined behavior of program execution.

It can crash right away, in a few seconds, minutes, hours later, or never and
just keep generating corrupt data.

Having a reference that the GC doesn't collect doesn't lead to memory
corruption, just more being used than it should be.

~~~
mikulas_florek
> Having a reference that the GC doesn't collect

Using such an object is/can be as dangerous.

In fact I find double-free safer because it usually crashes (and in my code I
do checks so it almost certainly crashes), while in C# I can happily use such
an object without knowing it. But as I said, it depends on the specific use case.

~~~
stymaar
> In fact I find double-free safer because it usually crashes (and in my code
> I do checks so it almost certainly crashes)

You don't know what an undefined behavior is, do you ? You cannot be sure it
crashes since the compiler is allowed to do anything with the assumption it
doesn't happen. It's absolutely legit for the compiler to remove all the code
you added to check a double-free didn't happen because it is assuming that's
dead code.

See this post[1] from the LLVM blog which explains why you can't expect
anything when you're triggering UB.

[1]: [http://blog.llvm.org/2011/05/what-every-c-programmer-should-...](http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html?m=1)

~~~
mikulas_florek
I know very well what UB is and I bet there is not a single big program which
does not have undefined behaviour. I even rely on UB sometimes, because with
well defined set of compilers and systems, it's in reality well defined.

I was talking in general about "unsafe" languages. I use c++ in my projects
and use custom allocators everywhere, so there is no problem with UB there.
The custom allocators also do the checking against double-free.

~~~
xorblurb
What do you mean by checking against double-free? Either you pay a high
runtime cost, or use unconventional (and somehow impractical in C++) means
(e.g. fancy pointers everywhere with a serial number in them), or you can't
check reliably. Standard allocators just don't check reliably, and thus do not
provide sufficient safety.

Anyway, double-free was only an example. The point is that a language can, or
cannot, provide safety _by itself_. Not _just_ allow you to create your own
enriched subset that is safer than the base language (because you often are
interested in the safety of 3rd party components not written in your dialect
and local idioms of the language).

In the case of C and C++, they are full of UB, and in the _general case_ UB
means you are dead. I find that _extremely_ unfortunate, but this is the
reality I have to deal with, so I don't pretend it does not exist...

~~~
mikulas_florek
> What do you mean by checking against double-free?

I pay small runtime cost for the check by having guard values around every
allocation. At first I wanted to enable it only in debug builds, but I am too
lazy to disable it in release builds, so it's there too. Anyway the overhead
is small and I do not allocate often during runtime.
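
A minimal sketch of what such guard values could look like. The commenter's allocators are not shown in the thread, so everything here is an assumption: a fixed-size pool (`pool_alloc`/`pool_free`), two marker constants, and slots whose memory is never handed back to the system, so inspecting a guard after a free stays well defined.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_SLOTS 8
#define SLOT_BYTES 64
#define GUARD_LIVE 0xA110CA7Eu   /* slot currently handed out */
#define GUARD_FREE 0xDEADBEEFu   /* slot was handed out and then freed */

struct slot {
    uint32_t guard_front;
    _Alignas(max_align_t) unsigned char payload[SLOT_BYTES];
    uint32_t guard_back;
};

/* Static storage is zero-initialised: a guard of 0 means "never used". */
static struct slot pool[POOL_SLOTS];

enum free_result { FREE_OK, FREE_DOUBLE, FREE_CORRUPT, FREE_FOREIGN };

/* Hands out the first slot that is not live, or NULL if the pool is full. */
static void *pool_alloc(void)
{
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        if (pool[i].guard_front != GUARD_LIVE) {
            pool[i].guard_front = GUARD_LIVE;
            pool[i].guard_back = GUARD_LIVE;
            return pool[i].payload;
        }
    }
    return NULL;
}

/* Releases a slot and reports what the guards say about the request. */
static enum free_result pool_free(void *p)
{
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        if (p != (void *)pool[i].payload)
            continue;
        if (pool[i].guard_front == GUARD_FREE && pool[i].guard_back == GUARD_FREE)
            return FREE_DOUBLE;      /* already freed once */
        if (pool[i].guard_front != GUARD_LIVE || pool[i].guard_back != GUARD_LIVE)
            return FREE_CORRUPT;     /* something overwrote a guard */
        pool[i].guard_front = GUARD_FREE;
        pool[i].guard_back = GUARD_FREE;
        return FREE_OK;
    }
    return FREE_FOREIGN;             /* pointer did not come from this pool */
}
```

The detection only holds because the pool owns the memory for the program's lifetime, and only until the slot is reused. An allocator that returns blocks to `malloc`/`free` would be reading freed memory when it checks the guard on a second free, which is itself undefined behaviour.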

> Anyway, double-free was only an example. The point is that a language can,
> or not, provide safety by itself.

I can write safe code in modern C++ (and probably in C) and I can write unsafe
code in e.g. Rust, only difference is which mode is default for the language.
On the other hand I have to be prepared to pay the performance (or other)
price for safe code.

> In the case of C and C++, they are full of UB, and in the general case UB
> means you are dead.

I doubt there is a big C or C++ program without UB, does that mean they are
all dead? I do not think so.

> I find that extremely unfortunate, but this is the reality I have to deal
> with, so I don't pretend it does not exist...

I do not like UB in C++ too, but mostly because it does not make sense on
platforms I use. On the other hand I can understand that the language can not
make such platform-specific assumptions. I can pretend UB does not exist with
some restrictions. UB in reality does not mean that the compiler randomly does
whatever it wants; it does whatever it wants, but consistently. But as I have
said twice, it depends on the use case. Am I writing for SpaceX or some medical
instruments? Probably not a good idea to ignore UB. Am I writing a new
Unreal Engine? Probably not a good idea to worry much about UB, since I would
never finish.

~~~
xorblurb
> UB in reality does not mean that the compiler randomly does whatever it
> wants; it does whatever it wants, but consistently.

There is nothing consistently consistent about UB. The exact same compiler
version can one day transform one particular UB to something, the other day to
something else because you changed an unrelated line of code 10 lines under or
above, and the day after tomorrow if you change your compiler version or even
just any compile option, you get yet another result even when your source code
did not change at all.

EDIT: and I certainly do find _extremely_ unfortunate that compiler authors
are choosing to do that to us poor programmers, and that they mostly dropped
the other saner interpretation expressly allowed by the standard and
practiced by "everybody" 10 years ago; that UB can also be for non portable
but well-defined constructs. But, well, compiler authors did that, so let's
live with it now.

~~~
mikulas_florek
> There is nothing consistently consistent about UB.

Yet, for years I have been memmove-ing objects which should not be memmoved. Or
using unions the way they should not be used.

> and that they mostly dropped the other saner interpretation expressly
> allowed by the standard and practiced by "everybody" 10 years ago

Do you have any example?

> that UB can also be for non portable but well-defined constructs.

Do you mean instead of signed integer overflow being UB it should be defined
as two's complement or something like that?

~~~
xorblurb
> Yet, for years I have been memmove-ing objects which should not be memmoved.
> Or using unions the way they should not be used.

There can be two cases:

A. you rely on additional guarantee of one (or several) of the language
implementation you are using (ex: gcc, clang). Each compiler usually has some.
They are explicitly documented, otherwise they do not exist.

B. you rely on undocumented internal details of your compiler implementation,
that are subject to change at any time, and just have happened to not have
changed for several years.

> Do you have any example?

I'm not sure that compilers did "far" (not just intra-basic-block instruction
scheduling) time-traveling constraint propagation on UB 10 or 15 years ago.
For sure, some of them do now. This means you had better use
-fno-delete-null-pointer-checks and all its friends, because that might very
well save you completely in practice from some UB that is technically there but
not well known by your ordinary programmer colleague, and so likely to appear
in lots of non-trivial code bases.

Simpler example: behavior of signed integer overflow. (Very?) old compilers
simply translated to the most natural thing the target ISA did, so in practice
you got 2s complement behavior in tons of cases and tons of programs started
to rely on that. You just can't rely on that so widely today without special
care.
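
A minimal sketch of that special care, assuming a hypothetical helper `add_int_checked`: the range test happens before the addition, so the overflowing operation is never executed and there is nothing for the optimiser to reason away.

```c
#include <limits.h>
#include <stdbool.h>

/* Stores a + b in *out and returns true, or returns false without
   performing the addition if the sum would not fit in an int. */
static bool add_int_checked(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}
```

The tempting alternative of adding first and then testing whether the result wrapped relies on the very overflow that is undefined. Building with GCC's or Clang's `-fwrapv` option is another route, at the cost of depending on a compiler flag rather than on the language.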

More concerning is the specification of the << and >> operators. On virtually
all platforms they should map to shifting instructions that interpret unsigned
int a << 32 as either 0 or a (and same thing for a >> 32), and so regardless of
the behavior (a<<b) | (a>>(32-b)) should do a ROL op. Unfortunately, mainly
because some processors do one behavior and others do the other one (for a
single shift), the standard specified it as UB. Now in the spirit of the
standard, UB _can_ be the sign of something that is non-portable but perfectly
well-defined. Unfortunately, now that compiler authors have collectively all
"lost" (or voluntarily burned) that memo, and are actively trying to trap other
programmers and kill all their users, either it is already handled as all
other UB in their logic (=> nasal daemons) or it is only an event waiting to
happen...
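
For reference, a rotate can be written so that neither shift count ever reaches the operand width, which keeps it inside defined behaviour. This is a sketch with a hypothetical function name `rotl32`, not something taken from the thread:

```c
#include <stdint.h>

/* Rotates x left by n bits. Both shift counts are masked into 0..31, so
   the n == 0 and n == 32 cases never produce a shift by 32. */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}
```

The masking is exactly the extra ceremony the comment above complains about: the plain `(a<<b) | (a>>(32-b))` form shifts by 32 when `b` is 0.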

Maybe a last example: out-of-bound object access was expected to reach
whatever piece of memory is at the position of the intuitively computed
address, in the classical C age. This is not the case anymore. Out-of-bound
object access now carries the risk of nasal-daemons invocation, regardless of
what you know about your hardware.

Other modern features of compilers also have an impact. People used to assume
all kinds of safe properties at TU boundaries. Those were never specified in
the standard, and they have been thrown out of the window with WPO. It is
likely that some code-bases have "become" incorrect (become so even in
practice, given they always have been in theory under the most risky
interpretations of the standard, which compiler authors are now unfortunately
using).

> Do you mean instead of signed integer overflow being UB it should be defined
> as two's complement or something like that?

Maybe (or at least implementation specified). I could be mistaken, but I do
not expect even 50% of C/C++ programmers to know that signed overflow is UB,
and what it means precisely on modern implementations. I would even be
positively surprised if 20% of them knew about that.

And before anybody throws them at me:

* I'm not buying the performance argument, at least for C, because the original intent of UB certainly was not to be wielded this way, but merely to specify the lowest common denominator of various processors -- it's insanely idiotic to not be able to express a ROL today because of that turn of events and the modern brain-fucked interpretation of compiler authors -- and more importantly because I happen to know how modern processors work, and I do not expect stronger and safer guarantees to notably slow down anything.

* I'm not buying the "specify the language you want yourself or shut up" argument either, for at least two reasons:

  \- I also have an opinion about safety features in other aspects of my life, yet I'm not an expert in those areas (e.g. seat belts). I _am_ an expert in CS/C/C++ programming/System Programming/etc... and I'm a huge user of compilers, in some cases in areas where it can have an impact on people's health. Given that perspective, I think any argument to just specify my own language or write my own compiler would just be plain stupid. I expect people actually doing that for a living (or as a main voluntary contributor, etc..) to use their brain and think of the risks they impose on everybody with their idiotic interpretations, because regardless of whether they want it or I want it or not, C and C++ _will_ continue to be used in critical systems.

  \- The C spec is actually kind of fine, although now that compiler authors have proven they can't be trusted with it, I admit it should be fixed at the source. But had they been more reasonable, the C spec would have continued to be interpreted like in the classical days, and most UB would merely have been implementation defined or "quasi-implementation defined" (in some cases by defining all kinds of details like a typical linear memory map, crashing the program in case of access to an unmapped area, etc...) in the sense you are thinking of (mostly deterministic -- at least way more than it unfortunately is today). The current C spec _does allow_ that, and _my_ argument would be that doing otherwise is unjustified (except if the performance price is unbearably high, but the classical implementations have proven it is not!). So I don't even need to write another less dangerous spec, they should just stop writing dangerous compilers...

------
ameliaquining
I'm kind of torn on this.

On the one hand, Curl is a great piece of software with a better security
record than most, the engineering choices it's made thus far have served it
just fine, and its developers quite reasonably view rewriting it as risky and
unnecessary.

On the other hand, the state of internet security is really terrible, and the
only way it'll ever get fixed is if we somehow get to the point where writing
networking code in a non-memory-safe language is considered professional
malpractice. Because it should be; reliably not introducing memory corruption
bugs without a compiler checking your work is a higher standard than
programmers can realistically be held to, and in networking code such bugs
often have immediate and dramatic security consequences. We need to somehow
create a culture where serious programmers don't try to do this, the same way
serious programmers don't write in BASIC or use tarball backups as version
control. That so much existing high-profile networking software is written in
C makes this a lot harder, because everyone thinks "well all those projects do
it so it must be okay".

~~~
baldfat
> the only way it'll ever get fixed is if we somehow get to the point where
> writing networking code in a non-memory-safe language is considered
> professional malpractice.

So using Linux, Windows, BSD or MacOS servers is malpractice? I think you
might have overstated your case. So are you waiting for a memory safe Hurd
re-write? A memory safe OS of any kind will be decades away even if someone
wanted to start tackling it now.

~~~
pjmlp
Microsoft does research on how to improve security at the OS level, and
sometimes those efforts do end up in Windows.

Latest examples, Windows 10 secure kernel and Device Driver protection.

[https://myignite.microsoft.com/sessions/36925](https://myignite.microsoft.com/sessions/36925)

Or the new Windows USB stack, written in the P language.

[https://github.com/p-org/P](https://github.com/p-org/P)

UNIXes, not so much beyond patching C exploits.

~~~
madphrodite
And the last windows 7 update black screened my pc and left me in limbo till I
was able to restore. What exactly is your point? Windows is not a standard to
be held higher than..well anything.

~~~
dsacco
The parent commenter used Windows 10 as an example and you used Windows...7.
His point, which is that Microsoft actively researches and implements OS-level
security improvements, hasn't been rebutted by your statement about Windows 7.

Windows 7 was released in _2009_. There are substantial proactive improvements
that Microsoft cannot feasibly backport to older OS versions even if they're
still within the support period, and the corporate culture was simply
different when Windows 7 was released.

~~~
petee
Counter-point: Windows 10 updates locked up 2 of my machines - Microsoft has a
tough time implementing _most_ things properly...

------
rwmj
While this doesn't so much apply to libcurl (but see below), there is a third
alternative to "write everything in C" or "write everything in <some other
safer language>". That is: _use a safer language to generate C code_.

End users, even those compiling from source, will still only need a C
compiler. Only developers need to install the safer language (even Curl
developers must install valgrind to run the full tests).

Where can you use generated code?

\- For non-C language bindings (this could apply to the Curl project, but
libcurl is a bit unusual in that it doesn't include other bindings, they are
supplied by third parties).

\- To describe the API and generate header files, function prototypes, and
wrappers.

\- To enforce type checking on API parameters (eg. all the CURL_EASY_...
options could be described in the generator and then that can be turned into
some kind of type checking code).

\- Any other time you want a single source of truth in your codebase.
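
To make the "single source of truth" idea concrete without leaving C, here is a rough approximation using the preprocessor (an X-macro) in place of a real generator. It is not how the libguestfs generator works, and the option names, `struct handle`, and the `set_*` functions are all hypothetical. One table describes each option and its argument type, and the enum, storage, typed setters, and name strings are all derived from it:

```c
/* Single table: option name and the C type of its argument. */
#define OPTION_TABLE(X)          \
    X(TIMEOUT_MS, long)          \
    X(URL,        const char *)  \
    X(VERBOSE,    int)

/* Derived: one enum constant per option, plus a count. */
enum option_id {
#define X(name, type) OPT_##name,
    OPTION_TABLE(X)
#undef X
    OPT_COUNT
};

/* Derived: one correctly typed field per option. */
struct handle {
#define X(name, type) type name;
    OPTION_TABLE(X)
#undef X
};

/* Derived: one setter per option whose parameter has that option's type,
   so passing the wrong kind of argument draws a compiler diagnostic. */
#define X(name, type) \
    static void set_##name(struct handle *h, type value) { h->name = value; }
OPTION_TABLE(X)
#undef X

/* Derived: printable names, indexed by enum option_id. */
static const char *const option_names[] = {
#define X(name, type) #name,
    OPTION_TABLE(X)
#undef X
};
```

An external generator can go much further than this (documentation, bindings for other languages, richer validation), but the shape is the same: edit the table once and every derived artifact stays in step.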

We use a generator (written in OCaml, generating mostly C) successfully in two
projects:
[https://github.com/libguestfs/libguestfs/tree/master/generat...](https://github.com/libguestfs/libguestfs/tree/master/generator)
[https://github.com/libguestfs/hivex/tree/master/generator](https://github.com/libguestfs/hivex/tree/master/generator)

~~~
KuiN
> generate C code.

Programmatically generating C code is not without problems. How can you prove
that the C you're generating is free from problems solved by the safer
language? Cloudbleed came from computer generated C code:
[https://blog.cloudflare.com/incident-report-on-memory-leak-c...](https://blog.cloudflare.com/incident-report-on-memory-leak-caused-by-cloudflare-parser-bug/).

~~~
patrec
No, it didn't.

See quote from the author of Ragel in the comments:

 _There is no mistake in ragel generated code. What happened was that you
turned on EOF actions without appropriate testing. The original author most
certainly never intended for that. He /She would have known it would require
extensive testing. Legacy code needs to be tested heavily after changes. It
should have been left alone.

PLEASE PLEASE PLEASE take some time to ensure the media doesn't print things
like this. It's going to destroy me. You guys have most certainly benefitted
from my hard work over the years. Please don't kill my reputation!_

~~~
deong
Well, the general point still applies. The bug occurred using code that was
written in a safe language and compiled to C. It's just that there are
multiple ways for that to go wrong. The generator _could_ have had a bug --
it's software, so it almost certainly does. Or, as in this case, the user
didn't use it correctly. Either way, the idea that you can write code in a
safe language and compile to C to eliminate the type of bugs that C allows
isn't true.

Are such errors less likely? Possibly so, but they're not categorically
eliminated. It becomes a risk assessment exercise rather than a simple thing
that everyone should do. Note that it also opens the door to Java-style
problems, where once the generator becomes ubiquitous, it becomes the most
valuable target for exploit-hunting because a vulnerability in the generator
gets the keys to _all_ the houses.

~~~
mbel
You are arguing that no language X is safer than writing program manually in Y
when program in X is compiled to Y. Because compiler from X to Y may have
bugs.

Therefore no code written in Rust (X) executed on x86 CPU (Y) is safer than
manually written x86 assembly, because Rust compiler (and LLVM) may have
errors.

And well, we can actually go deeper. There is CPU frontend that is generating
micro code, which may have bugs. There is also CPU backend which is executing
micro code, which also may have bugs. All in all there is no hope in
programming. There might be bugs everywhere so you can never be sure what your
program does.

~~~
deong
That's not what I'm saying. I'm saying "rewrite it in Rust (or whatever)"
isn't some silver bullet that fixes security problems. It's always about
assessing risk -- both risk of security issues as well as risk of upsetting
your users, etc. Basically exactly what the article says.

~~~
mbel
> Either way, the idea that you can write code in a safe language and compile
> to C to eliminate the type of bugs that C allows isn't true.

Is a bit different statement than:

> I'm saying "rewrite it in Rust (or whatever)" isn't some silver bullet that
> fixes security problems.

The first one is wrong, the second one is true.

Using a higher level language rules out some classes of programming errors
which are possible in lower level languages. The fact that compilers have bugs
does little to diminish those gains.

The semantics of Haskell do not allow you to express a program that performs a
double free [0]. Perhaps one of the compilers will compile some Haskell code to
a binary that frees memory twice. However, such a compiler bug is far less
likely than a programmer making this mistake in C. What's more, when this
compiler bug is detected and fixed, the problem can be fixed in all affected
code bases without changing the original source code. Thus the chances of bugs
are lower.

Nobody really argues that Rust (or OCaml, or Haskell, or whatever) is a silver
bullet, i.e. solution to all problems that will miraculously make programmers
produce no bugs at all. Obviously we will have software bugs even with the most
restrictive languages. No amount of formal proofs will save us from
misunderstanding specifications or making typos. And then again we will also
have bugs in the implementation of those high level abstractions.

And for the record I am really annoyed with movement to rewrite everything in
Rust.

[0] Yes, you can call free through FFI with whatever arguments you like, as
many times as you like. But for sake of brevity let's assume this is not how
you write your everyday Haskell.

------
tannhaeuser
Not only is curl based on C, but so are operating systems, IP stacks and
network software, drivers, databases, Unix userland tools, web servers, mail
servers, parts of web browsers and other network clients, language runtimes
and libs of higher-level languages, compilers and almost all other
infrastructure software we use daily.

I know there's a sentiment here on HN against C (as evidenced by bitter
comments whenever a new project dares to choose C) but I wish there'd be a
more constructive approach, acknowledging the issue isn't so much new software
but the large collection of existing (mostly F/OSS) software not going to be
rewritten in eg. Rust or some (lets face it) esoteric/niche FP language. Even
for new projects, the choice of programming language isn't clear at all if you
value integration and maintainability aspects.

~~~
wyldfire
> I know there's a sentiment here on HN against C

I think there's two major against-C groups: those of us who have worked with C
for decades and those who never worked with it. I'll try and speak for those
of us who've used it for decades. The popular high-level languages that have
arrived since ~1995 (Java, Python, JS, C# and friends) are excellent
productivity increases. In general, they sacrifice memory and performance in
favor of robustness and security. For enormous software problem domains, we
just don't need C's complexity or error-proneness.

Until Rust, there's been very close to zero serious competitors for C if I
wanted to write a bootloader, OS, or ISR. Not even C++ could do those (without
being extremely creative on how it's built/used). The ~post-2000 languages
(golang, swift, D etc) can't do that (perhaps D's an exception but it wasn't
an initial goal AFAICT). This is huge, IMO.

We've groaned and grumbled about how hard it is to parse C/C++ code for
decades. This is a big deal for tooling. Because of the language's design,
even if you use something "simple" like libclang to parse your code, you still
have to reproduce the entire build context just to sanely make an AST. All of
those other new languages above probably address this problem but also add all
kinds of other stuff which we can't have for specialized problem domains
(realtime/low-latency requirements, OSs, etc).

> collection of ... software not going to be rewritten in eg. Rust or some
> (lets face it) esoteric/niche FP language

IMO it's not appropriate to lump Rust in with "niche FP language"s. And don't
look now but lots of stuff _is_ being rewritten in Rust. Fundamental this-is-
the-OS-at-its-root stuff: coreutils [1], "libc" [2], kernels [3], browser
engines [4].

[1] [https://github.com/uutils/coreutils](https://github.com/uutils/coreutils)

[2] [https://github.com/japaric/steed](https://github.com/japaric/steed)

[3] [https://github.com/redox-os](https://github.com/redox-os)

[4] [https://github.com/servo/servo](https://github.com/servo/servo)

~~~
Manishearth
Could you expand on why you think Rust is a serious C competitor for
OS/ISR/bootloaders but not C++? This statement intrigued me.

I thought C++ had naked functions and all the things you need to write an OS.

~~~
Rusky
Not wyldfire, and I think that claim is a mischaracterization, but the main
obstacle to using C++ in the kernel is that some of its language features
require runtime support (new/delete, globals/statics with constructors,
exceptions).

You can of course just ignore those when writing kernel code- they get ignored
in application code much of the time! But I suppose at that point it could be
argued that you're just writing C with a C++ compiler?

~~~
kyberias
[https://en.wikipedia.org/wiki/Symbian](https://en.wikipedia.org/wiki/Symbian)

~~~
pjmlp
Add BeOS, OS X drivers (IO Kit) and OS / 400 after they started rewriting the
PL/M code, to that list.

------
unwind
Well put.

Didn't know that curl was stuck back on C89, that's really optimizing for
portability.

If anyone is confused by the "curl sits in the boat" section header, that's
basically a Swedish idiom being translated straight to English. That rarely
works, of course, and I'm sure Daniel knows this. :)

The closest English analog would be "curl doesn't rock the boat", I think the
two expressions are equivalent (if you sit, you don't rock the boat).

~~~
paulddraper
I didn't know "sit in the boat" was a thing, but I liked it.

"Sit in boat" is a positive expression of the stability benefits.

------
devy
The 7th point: "curl sits in the boat"

    
    
        In the curl project we’re deliberately conservative and 
        we stick to old standards, to remain a viable and reliable 
        library for everyone. Right now and for the foreseeable 
        future. Things that worked in curl 15 years ago still work 
        like that today. The same way. Users can rely on curl. We 
        stick around. We don’t knee-jerk react to modern trends. 
        We sit still in the boat. We don’t rock it.
    

I see a lot of inertia in there. While 15 years of consistency is a great
record to maintain, in the era of an ever-changing InfoSec outlook it could
become legacy baggage if the authors resist change. One thing we know for sure
is that humans will make mistakes, no matter how skillful they are. In the
context of writing a fundamental piece of software with an unsafe programming
language, that means we are guaranteed to have memory-safety induced CVE bugs
in curl in the future.

Some of the other points that the author raised are valid too. If there is a
trade-off where we can have a safer piece of fundamental software by almost
eliminating a whole category of memory safety related bugs, with the
downside of less compatibility with legacy systems, more dependencies etc.,
perhaps we should consider it? I believe the trade-off is well worth it in the
long run and the option is ripe for exploring.

~~~
cestith
How is the author resistant to change? He specifically said new code should be
written in a language that meets the priorities for that code. He specifically
said someone has or would write a competitor to curl in Rust or some other
safer language and that a good one will take off. He welcomed that.

What he doesn't welcome is rewriting something that's had those bugs and the
types of logic bugs not related to the language already worked out. There's a
saying about a baby and bathwater.

Not everything is a dichotomy, and you shouldn't be reading the article as if
the author is against newer languages. He specifically says that given a fresh
start with the availability of these languages he might use something besides
C. Carefully weighing options is wise. Throwing away years of actual progress
for the appearance of quick progress is foolish.

~~~
devy
> How is the author resistant to change?

I specifically quoted the section head "curl sits in the boat" and the entire
section ends with "We sit still in the boat. We don’t rock it". Now read it
again, and then tell me if that's welcoming changes or resisting changes.

> He specifically said someone has or would write a competitor to curl in Rust
> or some other safer language and that a good one will take off. He welcomed
> that.

Sure there might already be some alternatives out there. But those are not
curl, they are at most forks.

> He specifically says that given a fresh start with the availability of these
> languages he might use something besides C.

Nope, he used the words "Maybe. Maybe not." "Might" is a stronger word.

~~~
cestith
Having no plans to change curl within itself is not a resistance to change in
general.

Not everything is a dichotomy. One doesn't have to be for all change or
against all change. One can choose which things need to change.

~~~
kelnos
I see "we don't rock the boat" as completely synonymous with "I am resistant
to change".

(Note that I suspect you think that I or the original poster are suggesting
that he's resistant to _all_ change, not just change within curl. I don't
believe he's resistant to all change, but I do believe he's resistant to
change within curl, which is what we're talking about here.)

~~~
cestith
He's not resistant to change in the problem space. He's resistant to very
particular types of change in one very particular codebase, and for very sound
reasons.

He doesn't want to break the ecosystem around curl, which is huge, while
getting back to feature parity and compatibility during a full rewrite.
Something that comes along and replaces curl externally needn't be completely
compatible and therefore is freer to leverage their new, fresh start much more
fully. He welcomes a competitor, which means even a possible complete
replacement. That's not a resistance to change. That's being very judicious
about what one changes and why.

~~~
devy
> He doesn't want to break the ecosystem around curl, which is huge, while
> getting back to feature parity and compatibility during a full rewrite

Very good point! However, a rewrite doesn't have to break the existing
ecosystem, and it can happen on a parallel track, right? (curl has had 1.5k
contributors so far [1], so the community should have enough support to
maintain the existing codebase while developing a new version in a new
programming language, hypothetically speaking.)

In fact, I would argue that the "it works now and has been working for 15
years so we don't rock the boat" attitude is negatively impacting the curl
ecosystem in the long run. I'm a python developer, a good example I can give
you to support my view is the python 2.x to py3k transition (If the "better
unicode support" is analogous to "memory safety bug avoidance").

[1]:
[https://github.com/curl/curl/blob/master/docs/THANKS](https://github.com/curl/curl/blob/master/docs/THANKS)

~~~
cestith
Python 2 to Python 3 was basically world-breaking for many projects. Most
every non-trivial piece of code needed to be modified to work with 3. Many
people maintain older code on 2 to this day even with newer code being written
on 3 by the same people.

Is that what you're wanting for curl?

------
throwaway5752
It's extremely simple. If you think Curl would be better in another language
then port it, release your alternative, and maintain it for a long time.

Even if your language (Rust, Erlang, LISP, Go) is "better", it's still a
minimal part of the equation. A maintainer is what makes the tool. It's hard
work to decide which PRs to accept (and worse yet, reject), to backport fixes
to platforms for which you can't get a reliable contributor, to coordinate
fundraising/donations, to keep up with evolving standards...

Anyway. Thank you, thank you, thank you Daniel Stenberg. Use whatever damn
language you want.

~~~
kazinator
> _Use whatever damn language you want._

On the other hand, if he didn't want his justifications for that choice
examined by the world, he wouldn't have aired them, right?

> _If you think Curl would be better in another language then port it, release
> your alternative, and maintain it for a long time._

That's out there; some languages have URL downloading objects that are not
based on Curl.

E.g. Edi Weitz's Drakma client library for Common Lisp doesn't seem to be
using Curl as far as I can see.

[http://weitz.de/drakma/](http://weitz.de/drakma/)

~~~
throwaway5752
I wouldn't presume to speak for Daniel, but I got the feeling that he just
wanted to publish this to point people to rather than send the same canned
response to inquiries about porting to Rust et al.

Drakma sounds great. Tone is tough to get right online. I don't like people
doing drive-by suggestions like "you should rewrite X in Y". But if people are
really willing to roll up their sleeves, write the tool (in any language) and
keep it going for the long haul, I applaud them. I just have great respect and
empathy for project maintainers, I think some don't appreciate what a huge
PITA it is to BDFL a successful project.

------
derefr
> A library in another language will add that language (and compiler, and
> debugger and whatever dependencies a libcurl written in that language would
> need) as a new dependency to a large amount of projects that are themselves
> written in C or C++ today. Those projects would in many cases downright
> ignore and reject projects written in “an alternative language”.

Why would I be vendoring my own copy of libcurl in my project? Who does? This
is how I (or rather, the FFI bindings my language's runtime uses) consume
libcurl:

        dlopen("libcurl.so")

I rely on a binary libcurl _package_. The binary shared-object file in that
package needed a toolchain to build it, but I don't need said toolchain to
consume it. That would still be true even if the toolchain required for
compiling was C++ or Rust or Go or whatever instead of C, because either the
languages themselves, or the projects, ensure that the shared-object files
they ship export a C-compatible ABI.
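
A minimal sketch of what that consumption looks like, assuming a Linux-style system where the library is installed under the soname `libcurl.so.4` (the soname is an assumption and varies by platform). The helper names `load_symbol` and `print_curl_version` are made up for illustration; only `dlopen`, `dlsym`, `dlclose`, `dlerror` and libcurl's `curl_version` are real.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Signature of libcurl's curl_version(), declared by hand so that no curl
   headers (and no curl build toolchain) are needed here. */
typedef char *(*curl_version_fn)(void);

/* Hypothetical helper: open a shared library and look up one symbol.
   Returns NULL on failure; on success the library handle is stored in
   *out_handle so the caller can dlclose() it later. */
static void *load_symbol(const char *lib, const char *sym, void **out_handle)
{
    void *handle = dlopen(lib, RTLD_NOW);
    if (!handle) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return NULL;
    }
    void *fn = dlsym(handle, sym);
    if (!fn) {
        fprintf(stderr, "dlsym: %s\n", dlerror());
        dlclose(handle);
        return NULL;
    }
    *out_handle = handle;
    return fn;
}

/* Prints the libcurl version string if the library can be loaded.
   Returns 0 on success, -1 if the library or symbol is unavailable. */
static int print_curl_version(void)
{
    void *handle = NULL;
    curl_version_fn version =
        (curl_version_fn)load_symbol("libcurl.so.4", "curl_version", &handle);
    if (!version)
        return -1;
    printf("%s\n", version());
    dlclose(handle);
    return 0;
}
```

Nothing in this caller depends on which language produced the shared object, as long as `curl_version` is exported with the C calling convention. On older glibc versions the program may need to be linked with `-ldl`.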

An example of a project that works the way I'm talking about: LLVM. LLVM is
written in C++, but exports C symbols, and therefore "looks like" C to any FFI
logic that cares about such things. LLVM is a rather heavyweight thing to
compile, but I can use it just fine in my own code without even having a C++
compiler on my machine.

(And an example of a project that _doesn't_ work this way: Qt. Qt has no
C-compatible ABI, so even though it's nominally extremely portable, many
projects can't or won't link Qt. Qt fits the author's argument a lot better
than an alternate-language libcurl would.)

------
Sir_Cmpwn
Agreed 100%. Definitely going to be trotting this article out next time I see
someone blindly arguing for rewriting xyz in Rust.

I particularly like the mention of portability. No other language comes even
_remotely_ close to the portability of C. What other language runs on Linux,
NT, BSD, Minix, Mach, VAX, Solaris, plan9, Hurd, eight dozen other platforms,
freestanding kernels, and nearly every architecture ever made?

~~~
cwyers
I mean, sure, and if you have users running VAX or the Hurd, that matters. But
it turns out that most of us use one of Linux, NT or OS X. And even if you add
BSD and Solaris (and a few other Unixes) you can still find languages without
C's known problems that cover 100% of users. "But embedded." Embedded can
maintain their own software, they do all the time. How long are we going to
insist that end users run software that cannot be secure because of the lowest
common denominator of programming languages?

~~~
Sir_Cmpwn
I think this is a flawed mindset for a number of reasons.

First, I'd rather appeal to every user than most users. That one user I didn't
have to appeal to is going to be a much more faithful and grateful user than
the "normal" ones. Most of my software work is open source (remember this
context is a discussion about curl), and this encourages active collaboration
with users with niche situations. If I choose technologies that make using my
software attainable for these people, odds are they aren't going to stop at
just porting it to their platform.

Limiting your platforms to Linux, OSX, and NT also stifles innovation. These
platforms are all deeply flawed. Their popularity isn't due to having the best
design, but rather to having a _good enough_ design and being entrenched.
They're old platforms, we've learned a lot since they were started. New or
niche platforms bring a lot of value to the table. The BSDs are a great
example, as it's the best suited platform for a wide variety of applications.

All a new platform has to do to be able to run nearly all general purpose
software is port a C compiler. Not even that - they just need a cross
compiler. This is a great thing, IMO.

>Embedded can maintain their own software, they do all the time

This is a pretty silly argument. Most embedded developers don't ship their own
implementation of HTTP, they ship curl!

~~~
cwyers
> Their popularity isn't due to having the best design, but rather to having a
> _good enough_ design and being entrenched. They're old platforms, we've
> learned a lot since they were started.

I think one could say the same thing about C's popularity as a language.

~~~
Sir_Cmpwn
I disagree. I think C is extremely well designed, but there are unfixable
shortcomings to this design. Choosing it is a tradeoff.

~~~
smitherfield
C was well-designed for its time, but "extremely" well-designed is a stretch
given the much better designs that came immediately before (ALGOL 60 and 68,
Pascal, Scheme) or after (Ada, Modula, ML) it. C was optimized to be fast to
implement (and won out for that reason — "worse is better," and because UNIX
was the first usable OS written in a high-level language), not for the best
practices in safety or even performance, even as understood at the time.

~~~
Sir_Cmpwn
I'm not talking about pre-ANSI C. Compare C89 to those languages and, in my
opinion, C comes out on top.

~~~
smitherfield
I'd really disagree. All those languages are both safer and more expressive
(if more verbose in the case of those with Pascal-like syntax) than any
version of C, and, except in the case of ML, Scheme and ALGOL 68 with the
optional garbage collection, there's no reason they couldn't be as fast or
faster than C. Their main fault was simply in being too ahead of their time:
too difficult or impossible to implement well on a PDP-11.

(I deleted the part about FORTRAN 77; seems I was confusing it with F90, which
is the version that first allowed identifiers longer than 6 characters,
dynamic memory allocation and user-defined types).

~~~
Sir_Cmpwn
But they fill a different niche. C is about being close to the hardware and
using minimal abstractions.

~~~
pjmlp
There isn't a single C feature that I wouldn't be able to use in say Modula-2.

------
kazinator
> _The simple fact is that most of our past vulnerabilities happened because
> of logical mistakes in the code. Logical mistakes that aren’t really
> language bound and they would not be fixed simply by changing language._

This statement is laughable nonsense. Shall we go into their bug history and
point out counterexamples left and right? [Edit: user simias has done this;
thanks!]

Every single bug you ever make interacts with the language somehow.

Even if you think some bug is nothing but pure logic, that logic is part of a
program, embedded in the program's design, whose organization is driven by
language.

------
coldtea
> _There. The simple fact is that most of our past vulnerabilities happened
> because of logical mistakes in the code. Logical mistakes that aren’t really
> language bound and they would not be fixed simply by changing language._

That's wrong. A lot of the C mistakes are indeed "logical mistakes in the
code", but most of them would indeed be fixed by changing to a language that
prevents those mistakes in the first place.

------
chousuke
In my view, the problem with C in general is that it's a loaded gun with no
safety or trigger guard. It's trivial to shoot yourself (or someone else) in
the foot, and it requires knowledge, meticulous care and lots of forethought
to avoid getting shot.
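
As one illustration of the kind of care meant here, a sketch of a common defensive idiom against double frees and dangling pointers. The macro `SAFE_FREE` and the `struct conn` type are hypothetical names invented for this example, not taken from curl's source.

```c
#include <stdlib.h>

/* Hypothetical helper: free a pointer and reset it to NULL, so that a
   second cleanup pass becomes a harmless free(NULL) instead of a double
   free, and stale uses fail fast on a NULL pointer. */
#define SAFE_FREE(p) do { free(p); (p) = NULL; } while (0)

/* Made-up connection state that owns two heap-allocated strings. */
struct conn {
    char *host;
    char *user;
};

/* Releases everything the connection owns; safe to call more than once. */
static void conn_reset(struct conn *c)
{
    SAFE_FREE(c->host);
    SAFE_FREE(c->user);
}
```

The compiler enforces none of this. It only helps if every cleanup path actually uses the helper, and it does nothing for other copies of the same pointer held elsewhere.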

I very much agree that rewriting existing, stable software written in C is
likely not worth the trouble in many cases, but I can't accept claims that the
limitations of C aren't the direct cause of tens of thousands of security
vulnerabilities, either.

In Rust, even a less experienced developer can fearlessly perform changes in
complicated code because the language helps make sure your code is correct in
ways that C does not. And you can always turn off the safeties when you need
to.

Experienced developers should feel all the more empowered by simply not having
to always worry about things like accidental concurrent access, use-after-
free, object ownership, null pointers or the myriad other trivial ways to
cause your program to fail that are impossible in safe Rust. You get to worry
about the non-trivial failure modes instead, which is much more productive.

------
jeffdavis
"C is not a new dependency"

To just use a library, rust isn't much of a dependency, either. It's designed
so you don't even need to know that it's not C.

Rust would obviously be a build dependency, but that's lessened somewhat
because it tries to make cross-compilation easy.

(But this point does apply to pretty much any other language. Curl would not
be used as widely if it depended on the Go runtime, for instance.)

------
tombert
While I'm definitely not suggesting we _replace_ Curl with a rewrite in Rust
(since the current Curl has had decades of good testing and auditing done on
it), I am actually very curious how a rewrite in a safer language like Rust,
OCaml, Haskell, or Go would fare in comparison with regard to performance and
whatnot.

If I were ambitious enough, I'd do it myself in Haskell, but I think it'd be
too much work for a simple curiosity.

~~~
fiedzia
Most likely I/O will take more time than whatever code is running, so in that
respect it would make no difference. Memory overhead is the main concern here.
Rust doesn't use a GC, so you'll have full control and there should not be
much difference in that respect. Other languages do, which means a more
sophisticated runtime, less control and more overhead (or writing ugly code to
avoid it). A libcurl written in Go/OCaml/Haskell would require anyone using it
to also include the language's runtime, which is usually rather large.

------
empath75
This seems like a no-brainer for a re-implementation in Rust, but I wouldn't
expect someone to rewrite curl itself in Rust, but rather to write a new
library that does the same things.

~~~
floatboth
> a new library that does the same things

Most languages already have HTTP client libraries. (In particular, Rust has
Hyper. Ruby/Python/Node/Go have HTTP clients built-in in the stdlib, Haskell
has http-client, etc.) Who uses libcurl really? (Spoiler alert… PHP.)

Of course libcurl does FTP and Gopher and all the things, but these aren't
commonly required, most applications just need HTTPS.

~~~
IshKebab
People that write C and often C++ use libcurl. A better library for C/C++
developers would be nice and I believe it could be written in Rust, although
that would be a bit of a pain because then you need to integrate Rust into
your build system.

~~~
LeonidasXIV
Do you? For building C projects I certainly don't have to build libcurl, it
comes packaged and ready to use with my distribution. The same could be the
case with a hypothetical HTTP library written in Rust.

------
geodel
I think the Rust community increasingly behaves like this [1]. They are big on
suggesting better 'ideas' to others instead of implementing them themselves.
So they keep using 'curl' and 'openssl' but tell others to rewrite their
software in Rust.

[1] [http://dilbert.com/strip/1994-12-17](http://dilbert.com/strip/1994-12-17)

~~~
lifthrasiir
I would say _a portion of_ Rust community---that said, it is that portion that
is the most visible, and I think the community well understands what it does
mean.

~~~
sidlls
I'm not sure the community well understands it at all. The usual rejoinder is
something along the lines of "well, I don't see that in the Rust community I'm
in."

~~~
lifthrasiir
Both /r/rust [1] and the official forum [2] are reasonably well moderated, and
I often see less informed members get warned about their use of aggressive or
insulting language, often directed at non-Rust languages. There are surely
other venues with less enforcement (they still commonly observe the Rust Code
of Conduct however), but at least I think the main venues and the
corresponding community try hard not to be offensive.

[1] [https://www.reddit.com/r/rust/](https://www.reddit.com/r/rust/)

[2] [https://users.rust-lang.org/](https://users.rust-lang.org/)

------
coding123
I don't see why this is an issue, whoever is arguing for a change can write
rurl and be done, and see if anyone takes it up in their distributions.

------
skocznymroczny
I don't think C is a bad language, although I think it could use lists and
dictionaries in standard library. std::vector and std::map are the only things
that make me pick C++ in an instant, given the choice.

------
adynatos
While C by itself is not safe, I would argue that no sane development
environment uses C by itself. Over the decades of its production use dozens of
tools have been developed that make it far safer: *grind suite, coverage
tools, sanitizers, static analyzers, code formatters and so on. Those tools
are external, otherwise they would make C slower. Something for something.

------
tete
I think it's a bit weird that C and curl are used. If we look at C and OpenBSD
or so things might look a bit different.

Also one has a hard time comparing curl with another language, simply because
something with curl's properties (take portability for example) doesn't exist.

And no, that isn't in defense of anything, just me thinking that the
measurable points brought up in the discussions don't make sense or exist.

The topic is also a bit broader, as you can easily add in static code
analysis, compiler flags, stuff like W^X, stuff like seccomp, capsicum,
cloudabi, pledge which might not work (well) in other cases.

It's a great philosophical discussion topic and I don't wanna stop anyone,
just hoping people keep that in mind when they participate, so we don't end
up with new dogmas that get thrown around for the next few years without
knowing the context or meaning of the phrases.

Other than that: I really enjoy this discussion. :)

------
tlrobinson
I'm curious, of the bugs that _could_ have been avoided by using a "safe"
language, how many could have been avoided by using a bounds checking
extension such as
[https://www.doc.ic.ac.uk/~phjk/BoundsChecking.html](https://www.doc.ic.ac.uk/~phjk/BoundsChecking.html)
or
[https://github.com/Microsoft/checkedc](https://github.com/Microsoft/checkedc)

Are such extensions popular, and if not, why not? I assume there's always some
performance hit, but that might not be a big deal in an HTTP client, for
example.
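
To make the trade-off concrete, here is a sketch in plain C of the check that such an extension would insert automatically. This is not Checked C syntax; `struct span`, `span_set` and `span_get` are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* A "fat pointer": the buffer travels together with its length. */
struct span {
    unsigned char *data;
    size_t len;
};

/* Writes value at index; refuses (returns false) instead of writing
   outside the buffer. */
static bool span_set(struct span s, size_t index, unsigned char value)
{
    if (index >= s.len)
        return false;
    s.data[index] = value;
    return true;
}

/* Reads the byte at index into *out; returns false if index is out of
   bounds, leaving *out untouched. */
static bool span_get(struct span s, size_t index, unsigned char *out)
{
    if (index >= s.len)
        return false;
    *out = s.data[index];
    return true;
}
```

The per-access cost is one comparison and a branch, which is the performance hit in question. The hand-written version only protects accesses that actually go through these functions.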

~~~
pjmlp
Because they aren't portable.

------
renesd
CPython also has many vulnerabilities in python rather than C.

It's hilarious reading rust marketers talk about how people should use rust,
and yet their software doesn't work as well. It has plenty of bugs.

Then they go on and on about issues which post modern C doesn't have. Guess
what? C has a lot of tooling, and yes, it's been improving over the years too.
CQual++ exists. AFL exists. QuickCheck exists.

Can your rust project from two years ago even compile? Does it have any users
at all?

There's a formally proven C compiler. How's that LLVM swamp going you've built
your castle on?

Rust brought a modern knife to a post modern gun fight -- and lost.

~~~
steveklabnik
> Can your rust project from two years ago even compile?

Post 1.0's release date, which is just short of two years, the vast, vast
majority should, yes. We've had one or two soundness fixes in those times that
would take a trivial amount of updating to do, but that only hit a very small
part of the ecosystem.

------
koja86
Quite recently I happily used libcurl for a C++ project rather than any of
those C++ wrappers found on GitHub. Granted, there is some inelegance when you
adapt C-style error codes to C++ exceptions and have non-idiomatic C++ code
right next to any C lib. Yet libcurl is battle-tested (AKA proven to be rather
bug-free) and has a nice clean API unlike.

IMHO it might eventually make sense to use other language/tech/whatever but
the bar is quite high and it will quite probably take some serious sustained
effort.

------
oldsj
> The plain fact, that also isn’t really about languages but is about plain
> old software engineering: translating or rewriting curl into a new language
> will introduce a lot of bugs. Bugs that we don’t have today.

Don't rewrites, even in the same language usually lead to a better version of
the software? I can't really imagine a seasoned C developer introducing
completely new bugs in a code base they are already very familiar with

~~~
fiedzia
> I can't really imagine a seasoned C developer introducing completely new
> bugs in a code base they are already very familiar with

Get a CVE list for some Linux distribution, this will open your mind. Happens
all the time.

------
gyrgtyn
What is everyone using curl for that it needs to be written in C (or Rust?).

If I think about my usage, it's like get or post something and see what the
returned json looks like. If I need to download something wget usually works
without having to remember -O.

But higher level things like httpie are easier to deal with, sane defaults and
all that. Maybe they use libcurl...

Are there any re-write userland in ${safe-high-level-lang} projects?

------
kodest
Maybe curl could be rewritten in C++ step by step like mpd
([https://musicpd.org](https://musicpd.org)). C++ has RAII for resource
management which can help a lot by itself. In my opinion the most hateful
thing in C is freeing resources on all exit paths.
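
For readers who haven't hit this, a sketch of the usual C workaround: a single `cleanup` label that every exit path jumps to. The function `read_prefix` is a made-up example, not code from curl or mpd.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical example: reads up to max bytes from path into a freshly
   allocated buffer. Returns the number of bytes read, or -1 on error.
   On success the caller owns *out and must free() it. */
static long read_prefix(const char *path, size_t max, char **out)
{
    long result = -1;
    size_t n = 0;
    FILE *fp = NULL;
    char *buf = NULL;

    fp = fopen(path, "rb");
    if (!fp)
        goto cleanup;

    buf = malloc(max);
    if (!buf)
        goto cleanup;

    n = fread(buf, 1, max, fp);
    if (ferror(fp))
        goto cleanup;

    *out = buf;   /* ownership passes to the caller */
    buf = NULL;   /* so the cleanup code below must not free it */
    result = (long)n;

cleanup:
    free(buf);    /* free(NULL) is a no-op */
    if (fp)
        fclose(fp);
    return result;
}
```

With RAII the two release steps at the bottom would live in destructors and run automatically on every return. Here they run only because each error path remembers to `goto cleanup`.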

Although, curl in C++ - the naming would become inappropriate...

------
lettergram
I feel blaming a language for errors is like blaming a gun for killing people.

The fact is, mistakes will happen, but in general if you follow the best
practices you'll be fine. Failing to follow the best practices means you could
be a better programmer. Just because the language gives you an option to do
something, doesn't mean you should.

~~~
myrrlyn
"I don't need a safety toggle on my gun I just totes keep it pointed at the
ground and unloa-- oh god where's my foot"

~~~
trav4225
Still not the gun's fault.

~~~
roca
In some sense, sure.

But in practice, people always make mistakes. Some guns/programming languages
limit the damage of those mistakes more than others. All other things being
equal, those guns/programming languages are better.

~~~
trav4225
Yup, no argument from me there. There's always room for improvement...

------
_of
Rust might be a great language, but it has not completed the test of time yet.
C is 45 years old. Rust appeared 7 years ago.

------
krystiangw
C still has a huge market share. It seems to occur in 5% of all tech job
offers:

[https://jobsquery.it/stats/data.technologies/C](https://jobsquery.it/stats/data.technologies/C)

The stats also show that the average salary for C developers is above the
average for all tech job openings.

------
AstralStorm
Maybe they should attempt writing it using the Isabelle/HOL transpilers to C
from SEL4 project. I don't care if it is C or machine code as long as the
proof of correctness is complete, down to at least C library.

Curl is small enough to make it relatively easy and used widely enough to make
it worthwhile.

------
carapace
This has probably been said, in this thread even, but if curl is insecure (for
some value of "insecure") then its ubiquity and ease of embedding are a
problem rather than a feature. Fuzzy thinking.

------
didip
It's well within reason and the capabilities of the Rust community to write
libcurl and curl CLI replacements.

The community should do it, spend a couple of years stabilizing, and then
spread the word to others.

------
dmitrygr
Software is almost a perfectly open market. If proponents of rust really think
their preferred language is better in every way, they are free to rewrite the
world in rust, and see the adoption numbers they get. After all, if rust is
better in every way, we'd expect the adoption numbers to go up for their rust
OS, with a rust http stack and rust web browser. Right?

Telling others to use their language instead of putting their money where
their mouth is is truly what irks me about the rust community the most.

Want a rust world? Go write it and ship it.

Oh, and you don't get to complain about C until your PC runs more rust than C

Cool?

~~~
arcticbull
I'm not sure why I can't complain about problems I have today -- if nobody
complained about C ever, I doubt Rust, Go or for that matter, basically any
other language project, would have ever gotten started.

It's okay to identify flaws in tools, that's how we make them better. It
doesn't make sense to say '[car on fire] you can't complain about that Honda
Civic until you develop your own better car, sir, and more people are driving
it than not, now please leave the service center -- until then consider the
fire normal'.

------
digi_owl
A refreshing read in what seems to be an ongoing deluge of rewrites, languages
and frameworks.

------
faragon
Another reason: C is beautiful.

~~~
madphrodite
It is. It is an elegant language, at least I've always found it to be.

------
trav4225
It is utterly amazing to me to see so many people's attitudes on this issue.

If I cut myself by hasty use of a knife, is it the fault of the knife maker?
How is that even remotely rational? If you aren't willing (or don't know how)
to use the tool correctly, don't use it.

------
mgrennan
Why does something old (C) have to be bad these days?

~~~
wtetzner
It's not bad because it's old, it's bad because of how unsafe it is. There are
older languages than C that are safer, but they didn't gain the popularity C
did.

------
davexunit
>C is not the primary reason for our past vulnerabilities

Completely false. C is a disaster.

------
madphrodite
This is a great little read and encapsulates the other side of the 'rethink
the way' trend-ism of some HN new lang advocacy. C is fine, C is good. It is
widely understood, it is a systems staple, and it is not dangerous in
knowledgeable hands. Rocking the boat is fashionable.

~~~
macintux
> it is not dangerous in knowledgeable hands

For crying out loud, this is patently false.

C isn't as dangerous if you really know what you're doing and you never make
mistakes. I would bet fewer people know what they're doing than _think_ they
know what they're doing, and the set of people who never make mistakes is
entirely empty.

~~~
madphrodite
Well you know..that's just like your opinion man. Opting out of this site at
this point. A bunch of 6-12 year idiots (recognizably) telling people what
they think they know that they don't know. It's silly recursive.

~~~
macintux
You're right, my reply was unnecessarily harsh. I apologize.

------
fiatjaf
Why are you saying this? Who asked? I always imagined it was written in C.

~~~
grimgrin
Dude, it's in the first sentence.

> Every once in a while someone suggests to me that curl and libcurl would do
> better if rewritten in a “safe language”.

And so he minimally addresses that and some other reasons for sticking with
(and originally choosing) C89.

~~~
fiatjaf
Oops. I read everything after the first subheader.

