Hacker News new | past | comments | ask | show | jobs | submit login
RabbitMQ integer overflow that leads to heap memory corruption (mitre.org)
74 points by DyslexicAtheist 3 days ago | hide | past | web | favorite | 21 comments





Note this is for one of the client libraries - the C library, to be specific - not the server (written predominantly in Erlang, a memory-safe language).

(Furthermore, checking the git blame, it turns out this bug is my fault, nearly a decade ago. Decoding a size_t out of the packet and then adding a small constant (7) to it turns out to be able to overflow size_t ... sigh. Programming in C should be against some kind of Geneva Convention.)

So it's a typical 'oh a size_t is big enough I'm never going to have numbers that big' bug? Not that I blame you, I've written enough of those..

More of a "oh yeah, that's right, C doesn't have integers, just machine words, and it doesn't help you avoid errors when you use machine words to simulate integers" bug. Overflow? You get to keep both pieces.

Mathematics in C is modular arithmetic built as an abstraction on top of fixed-width numbers. One has to keep reminding oneself of this fact. This got etched in my brain after I saw a piece of code that calculated the middle index of an array in a roundabout way

mid = left+ (right-left) /2 instead of mid = (right +left) / 2

Although even this code can cause overflows, it has served as a reminder to be cautious even with simple addition or increment operations.

For this reason the best C code I have seen, an in-memory database with lots of pointer arithmetic, array indexing and bitwise operations, had its own 'safer arithmetic' implemented using macros.


> One has to keep reminding oneself of this fact.

No, one doesn't need to do this, and no matter how hard you try, ths doesn't work in practice since you only need to slip once to introduce a CVE into your program.

What we need is for compilers to start calling `abort` by default on any kind of integer overflow. If you have a piece of C code in which you actually need integers to overflow with some particular semantics (e.g. saturating, modulo 2, etc.), you should actually call some standard library API that's lets you express this in a very explicit way.

I need to use such APIs every now and then from Rust, but 99% of the time, integer overflow is an error, not something I explicitly intended to introduce undefined behavior or wrap with modulo 2, and as a consequence, if that happens to execute silently, my programs which are not programmed to handle this case will eat your laundry.

The suggestion that "you have to keep overflow in mind all the time" is nuts, because most of the time, you have more important things to keep in mind than whether overflowing a singed integer is going to cause UB or whether overflowing an unsigned one is going to wrap around, and manually insert checks everywhere every time you use integers. That's just a completely unreasonable expectation to have about programmers and human beings in general.


Doesn't Rust arithmetic wraparound on release? I read Microsoft's Midori language paniced/abandoned on overflow even in release, and the compiler got quite good at optimizing overflow checks away.

The rules are slightly more complex than that to allow the behavior to change in the future, but they currently two’s compliment wrapping, yes.

> Programming in C should be against some kind of Geneva Convention

Possible and likely in languages other than C. It's just going to hit an out of bounds error when you index the buffer, instead of potentially overwriting an unrelated allocation.


More that it'd either properly promote from a 64-bit size to a big integer size, or trap on overflow. But either way, yes, a trap is kind of the point. And there are definitely other languages almost equally as toxic as C, yeah.

I think in this beating a dead horse you fail to see some issues here.

If you are going to allocate memory based on the size of a network packet's 32 bit quantity, in any language, that means your connection peer can make you allocate a contiguous 4GB. You already have a bug there.

Many languages that readers here consider saviors to big bad C do not make integer overflow fault. It's not as simple a matter as you suggest because there are real tradeoffs. There are also algorithms that rely on wrapping around that are working as intended.

Making it fault all the time also means that trap, if not handled, is a bug that may tear down your process, just as that runtime out of bounds exception represents a serious bug too. So another thing to consider.

I just tire of hearing that switching your language will make it so you never write any bugs again. A crashing bug in C often would be a crashing bug in another language. The difference is that memory safety gives you better failure modes that don't lead to serious problems like remote code execution. You can make that argument reasonably without pretending the new language will make you infallible and the old language is responsible for all evil in the world.


Can you imagine a better world?

Can you imagine a programming language, library, and/or operating system that improves these issues?

Perhaps one where integer overflows were handled safely by default, with wrapping behaviour within easy reach of the programmer for those tricky bit-twiddling algorithms.

Perhaps one where allocating some large blob based on tainted data were a compile-time error until the programmer explicitly asserts that's what they really meant to do.

Perhaps one where process teardown isn't some faceless horror from the deepest depths, but a reasonable and expected way of reacting to an unexpected situation.

You don't have to settle for second-rate tools. Compilers and interpreters are just programs. We can learn from our mistakes.

> I just tire of hearing that switching your language will make it so you never write any bugs again.

Well, you might be reading too much into what I actually wrote. All I said was that writing C programs ought to be a literal crime against humanity, ultimately punishable by hanging from the neck until dead. A moderate and restrained position, if ever I heard one. I'd never go so far as to suggest that switching language would improve things in any way!

> The difference is that memory safety gives you better failure modes that don't lead to serious problems like remote code execution.

... yes! Yes. Exactly. This is exactly my point. Memory safety, proper support for integers, and so forth give you better failure modes. Significantly better failure modes. In the sense of "fail safe".

> You can make that argument reasonably without pretending the new language will make you infallible and the old language is responsible for all evil in the world.

Oh, I'd never say it's responsible for all the evil in the world. Just an eminently avoidable fraction of it.


In a sane language, there'd be no lossy automatic conversion, no arithmetic on incompatible types, no silent incorrect results by default, and no magic type selection for constants.

What should integer arithmetic do by default? Either yield the correct result, or trap/throw. Of course. Obviously.

No, this doesn't apply to rounding division, as there can never be any reasonable expectation that integer division is reversible or exact in the general case.

But there are good reasons to want saturating arithmetic, wrapping arithmetic, or an optional/nullable result, so the language should provide all of these as opt-in (i.e. by using a syntax that clearly expresses the intent).

What should floating point arithmetic do in case the exact result cannot be represented? Return the approximate result, of course. Floating point arithmetic cannot reasonably be expected to be exact.

What should mixed floating point and integer arithmetic, arithmetic on mixed integer types, or mixed floating point precision do? Not compile! What was your intent when you wrote an addition of an int8 and an int32? Did you want an 8-bit addition and thought the int32 was actually an int8? Did you want to promote the int8 to an int32? Did you want to truncate the int32? Just say what you want explicitly.

What should the type of a constant like 255 be? Nothing! It should not exist. If you want to specify a constant, specify the type with it. No automatic choice is reasonable. uint8? Then 255 + 255 would trap (if the language is otherwise sane, or worse, wrap, saturate, whatever). int32? OK, then what about 0x7fffffff? Also int32? Again, that added to itself would trap. So everything is int64 (or whatever the largest int is)? Then 0x7fffffffffffffff would trap. Eventually you'll hit the largest size available and there is nowhere to run.

A burden on the programmer, unintuitive or needlessly complicated? Lolwat? It just cleanly exposes fundamental facts of the hardware that will crop up eventually anyway when you inevitably hit some limit, like in an overflow bug. And then you have to think about exactly those details. What is actually going on? What integer width is the addition performed with? Can it overflow? Is precision lost in an integer-to-float conversion, etc.

Unfortunately, no mainstream language is sane.


> Yes. Exactly. This is exactly my point.

Well actually, you said writing C should be a war crime...


I keep coming back to wanting range types in C ala Ada. And if something screwy happens you get a seg fault.

   type packet_buf_size is range min_packet_buf_size to max_packet_buf_size;

   packet_buf_size = 10222034440;  // this causes a seg fault

I don't know this system well enough to say, but 4GB 'frame' size doesn't automatically imply a 4GB buffer must be allocated; the design might be 'cut-through' and only hold a portion of a frame in RAM. For a store and forward service such as a message queue that's actually a likely design.

In any case I agree with you; this isn't really about C and isn't a reason to condemn C in favor of something else. This is about trusting user input; the size_t taken from the network PDU/packet/whatever is trusted. That's not valid in any language at any time regardless of how memory safe something else might be.


In this case, RabbitMQ doesn't trust user input - the maximum frame size is negotiated as part of the AMQP protocol. The affected code did have a check that the frame wasn't bigger than the agreed limit, but there was a chance to overflow before it got there.

afaict, the issue is in amqp_handle_input()[1]. The fix was

> "add additional input validation to prevent integer overflow when parsing a frame header"

Wouldn't this be a bug in the MQ and not in the C library?

[1] https://github.com/alanxz/rabbitmq-c/commit/fc85be7123050b91...

EDIT: sorry I just realized this is an internal rabbit MQ (C) library, I misunderstood the reference to "C library" thinking it meant the systems g/libc


Nope. It's the C library's problem: who is to say which AMQP implementation is at the other end of the socket?

thanks for clarifying, my bad (I misread the meaning of C library)

I figured this was some counter wrapping around but this is a size field, for what looks like data?

Who is transporting 2GB files over RabbitMQ? I did not think that was a thing reasonable people would do.




Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: