
A fifteen year old TCP bug? - Sec
http://blogmal.42.org/tidbits/tcp-bug.story
======
pilom
This is true hacking news! Discusses a possible new bug in TCP, teaches how
TCP works, has links to useful and relevant books on the subject, AND includes
remarks about how difficult it is for a newbie to actually make changes to
open source software and not get yelled at. I love it!

~~~
mrspeaker
I agree - but it's funny that there's only a handful of comments. You just
can't have an uninformed rant with such technical correctness (the best kind
of correctness)!

------
feintruled
The response to the bug report looks depressingly typical. It rejects the
working fix with a wall of text speculating on numerous other, possibly better
fixes (without deigning to actually choose one). The nirvana fallacy in
action!

~~~
jongraehl
What he's doing seems useful to the project. There's no better time to get it
right. I'm just surprised he's willing to expend so much effort communicating
instead of just fixing the patch.

I've noticed that C programmers tend to use macros for things where (possibly
non-exported) inline functions would make more sense. Why is that? Are they in
the habit of building the OS with all optimizations off? Or are macros being
used as a poor man's generic function?

~~~
cube13
The inline keyword is best thought of as a hint to the compiler, not a
command. The compiler is free to ignore the meatbag telling it to inline
functions if it chooses to.

Macros are substituted in before the compiler, so they are always inlined.

EDIT: Hint, not suggestion.

~~~
Someone
That is true, but the idea that programmers can use macros to force the
compiler to emit optimal code is wrong, too. In the early days of C, that was
(almost) true, but those days are over.

In theory, a compiler could uninline common code blocks, including macro
expansions, into functions to decrease object code size and/or working set
size, thus speeding up the program. (Example: functions f and g, each with
inlined function h, take two cache lines apiece; without inlining, f, g, and h
each fit in a single cache line.)

In practice, using an inline function gives the compiler the opportunity to
weigh different objectives (code size, execution speed, debuggability, etc.)
against each other and do the best thing.

------
runjake
The response to the bug report was by Bruce Evans, who is listed as the "Style
Police-Meister" for FreeBSD. Apparently his job is to enforce standards & code
style. Seems like he was doing his job.

http://www.freebsd.org/doc/en_US.ISO8859-1/articles/committers-guide/people.html

Edit: edited for clarity. Thanks, pinko!

~~~
pinko
I believe the comment above was meant as a response to the "The response to
the bug report looks depressingly typical" comment elsewhere in this thread.

It took me a minute to sort that out ("hmm, why is he referencing Bruce
Evans?"), so I thought I'd mention it for anyone else trying to follow.

------
direxorg
In 2002 we did a custom patch for an energy company that had hundreds of
outdated remote RS232 terminals hooked up via wireless links to a central
station for control and monitoring. The goal was to encrypt the transmitted
messages so they could not be intercepted and tampered with during wireless
transmission. The solution was a Linux box on each side encrypting the
communication with OpenSSL... The problem was that the terminals did not want
to talk to Linux over crossover Ethernet because of... you guessed it... a bug
in TCP. To solve that we had to patch the Linux kernel, and let me tell you,
the code in the 2.4 kernel was very ugly, with extremely funny comments :-)

My colleague has been developing drivers ever since, because "he feels that he
is doing something important rather than boring UI."

But all he does is mostly his own projects and drivers, since updating open
source IS a pain in the neck.

I guess the difficulty of collaborative work is the reason why people do their
own open source projects versus something that has to be supported. What do
you think?

------
rboyd
_The only reply this PR got, was from Bruce Evans who critiqued my use of a
simple (long) cast, which appears to have derailed this PR, sticking it in the
usual never getting fixed limbo where unfortunately most of my PR's appear to
end up._

Looks to me like Bruce gave you some valuable advice. You spent more time
complaining about the handling of your PR and documenting the issue on your
blog than it would have taken you to fix your patch.

------
HenryR
Is Stevens vol. 2 in the public domain now? If not, that's pretty poor form,
linking to a scanned pdf of the book.

~~~
doki_pen
Wouldn't that be fair use? Do you have issue with the content being posted, or
the fact that it's scanned?

~~~
estel
Quoting parts of the book to support some points made in the article would
fall under fair use, but linking to a scan of the entire book would not.

~~~
btilly
Hosting a scan is not fair use, but linking to a copyright violation is not
generally a copyright violation. Though a few courts have ruled that it can be
if done for the specific purpose of knowingly disseminating illegal material.
See <http://www.chillingeffects.org/linking/faq.cgi#QID152> for more.

~~~
estel
That's true, but the original point was 'poor form', which this likely falls
under also.

------
pavel_lishin
> As I had virtually no understanding of the TCP code, I liberally sprinkled
> it with printf()s

And people say it's a stupid way to debug!

~~~
SwellJoe
What people say that?

~~~
pavel_lishin
Pick any "how do you debug?" submission anywhere, and you'll see a lot of
people claiming that using printf, etc, is retarded in the age of good
debuggers.

Maybe they're just a noisy minority.

~~~
SoftwareMaven
Using printf to debug when you could use a good debugger _is_...well, I
wouldn't say stupid, just highly unproductive.

The problem is there are a lot of problems that aren't debugger friendly,
especially if you are new to a particular domain. The kernel, timing-related
problems, remote systems, production systems (you have intelligent logging,
right?), etc., all have extremely valid reasons for using printf debugging.

~~~
kahawe
One thing I miss, or haven't found, in some of the debuggers I have used
(mostly for Java) is a feature to just print out the execution trace and
return values for a certain part of the code, or for the whole thing.

Those prints or printfs are great for getting a quick overview of what's
going on, if you ask me... instead of stepping through the whole thing.

------
barrkel
So much of this is caused by unsigned types. They are evil; avoid them
wherever you can.

~~~
1amzave
Care to elaborate? Unlike signed ones, unsigned integral types at least have
well-defined behavior on shifting and overflow. (I'm speaking in terms of C
specifically here, of course.)

~~~
cpeterso
Signed ints are easier to range check at runtime. Given an unsigned int, it's
difficult to detect an invalid result from combining or comparing signed and
unsigned ints.

Google's C++ Style Guide discourages using unsigned ints to represent
nonnegative numbers (like sizes or counts). It recommends using runtime checks
or assertions instead.

[http://google-
styleguide.googlecode.com/svn/trunk/cppguide.x...](http://google-
styleguide.googlecode.com/svn/trunk/cppguide.xml#Integer_Types)

Unsigned ints make sense for bit twiddling, but you should probably use a
fixed-size uint32_t or uint64_t to ensure the results are consistent across
various architectures.

~~~
bodyfour
The "always use signed" rule is a source of endless debate in C circles. I
personally like almost everything in the Google C++ Style Guide, but this is
one place where I think they got it wrong.

The problem is that the riskiest place for a signed/unsigned mismatch is when
calling an unsigned API with a signed value. Simply deciding to not use
unsigned at all doesn't fix this, because ANSI C and the STL use unsigned
types throughout (e.g. memcpy):

    
    
      if (size <= 10) {
        // Yay, I have plenty of space
        memcpy(buffer, src, size);
      }
    

The code looks fine, but if "size" is an int with the value -1 there's a hard-
to-spot bug. Plenty of security holes have been caused by just this sort of
mistake. If you don't fight against the types that libc uses you don't have
this problem.

There will still be spots where you'll need to compare signed and unsigned
values, but the compiler will warn you about these. You'll have to cast one
side or the other, but that's a GOOD thing: since neither a signed compare nor
an unsigned compare is always what you want, you should be explicit about it.

There are other advantages to using unsigned types. For instance, it gives an
explicit hint to the person reading the code about the range of the value. I
think this makes interfaces clearer. For example, if you see a function
signature of "void foo(const uint8_t *, size_t)" you'll immediately guess that
you're dealing with a memory buffer and its explicit size, without even seeing
the names of the parameters.

Actually, if I had my way "int" would default to being unsigned and you'd have
to specifically request "signed" if that's what you want. I find that I
probably use unsigned types 5x as often as signed ones.

~~~
barrkel
"There are other advantages to using unsigned types. For instance, it gives an
explicit hint to the person reading the code about the range of the value."

 _This is, without doubt, the worst reason for using unsigned types_ , and
it's the primary reason (IMHO) for the flaws in the C API that force you to
use unsigned types unnecessarily. Unsigned types are not a documentation
feature, and they are not merely an advert for an invariant; they are opting
in to a subtly different arithmetic that most people are surprised by. It
would be better to have range-checked types, like Pascal's, than to infect the
program with unsigned arithmetic.

I find that most programs deal with values for their integer types with an
absolute value of under 1000; about the only excuse for using an unsigned
type, IMO, is when you must have access to that highest bit in a defined way
(for safe shifting and bit-twiddling).

~~~
bodyfour
> they are opting in to a subtly different arithmetic that most people are
> surprised by

I think that's a "citation needed" moment there. It's true that _any_ native
integer type will behave strangely if you go outside of its defined range. The
only way to avoid that is to use a language that automatically converts to
bignums behind the scenes (Common Lisp, etc.).

What I don't agree with is that this is something that "most people are
surprised by." If anything, the word "unsigned" is a pretty good hint about
what behavior you'll get.

And even when you play fast-and-loose with the rules, it usually turns out ok:

    
    
       unsigned a, b, c, d;
       a = b + (c - d);
    

even if d > c, this will do the expected thing on any 2's complement
architecture. Now, this will break if a and b were instead "unsigned long
long" (with c and d left 32-bit, the subtraction wraps at 32 bits before
being widened). I think that case is fairly rare -- it's not a mistake I've
seen commonly in real life (especially compared to the dangerous "botched
range-check of a signed value" error).

But you are correct that it's not "merely an advert for an invariant" -- it's
an advert that the compiler actually reads. It gives you better warnings (I've
had plenty of bugs prevented by "comparison of signed and unsigned" warnings).
It also allows the compiler to optimize better in some cases: compare the
output of "foo % 16" with foo as signed and unsigned.

> It would be better to have a range-checked types, like Pascal

Adding runtime checks to arithmetic is the kind of cost that is never going
to be accepted into C. This is no different than saying "C should have garbage
collection" or "C should have RTTI". They're perfectly valid things to want in
a language, but they're anathema to the niche that C holds in the modern
world. With C I want "a + b" to compile down to one instruction -- no
surprises.

And even if you DID do a range check, what do you do if it fails?

1. Throw an exception? Sounds logical... oh wait, this is C; there's no such
thing as an exception.

2. Clamp the value? Now you have behavior that is just as bizarre as an
integer overflow.

3. Crash? Not very friendly.

4. Have a user-definable callback (i.e. like a signal)? What is the chance
that the programmer will be able to make a meaningful recovery, though?

There are, however, some additions to the C99 type system that I think would
be useful -- for example, C++11's strongly typed enums are a good idea.

> I find that most programs deal with values for their integer types with an
> absolute value of under 1000

I find that most programs deal with values greater-than-or-equal-to zero.

~~~
barrkel
_I find that most programs deal with values greater-than-or-equal-to zero._

-1 is very frequently used as a sentinel value. For example, counting backwards through the elements of some container:
    
    
        for (i = count - 1; i >= 0; --i)
            /* body */;
    

_I've had plenty of bugs prevented by "comparison of signed and unsigned"
warnings_

You wouldn't have had these warnings, much less needed to pay attention to
them, if you hadn't had to use unsigned types in the first place.

This conversation is much like those around GC. It's impossible to convince
people labouring under tyranny they've learned to love without them
experiencing a free life first. You just can't communicate it with words.

~~~
bodyfour
> -1 is very frequently used as a sentinel value.

I actually think these sentinels work quite nicely with unsigned.

    
    
       unsigned element_num;
       static const unsigned NO_ELEMENT = (unsigned) -1;
    

Yes, you need a cast, but it's just one place. From then on you can use a nice
constant for your sentinel.

Also, your sentinel is more range-check safe than it would have been if it
were an int (the classic "if (x < sizeof(arr)) arr[x] = 1;" issue again).

> For example, counting backwards through the elements of some container:

Ok, that is a fair point. That type of loop is easy to mess up with an
unsigned type. Worse, gcc will only warn you if you compile with -Wextra,
which not everybody does.

Actually implementing the loop in a safe manner isn't really that hard though,
of course:

    
    
       if (count > 0) {
          unsigned i = count - 1;
          do {
             /* body */
          } while (i-- != 0);
       }
    

Couple extra lines, true. I don't think it loses much clarity.

> You wouldn't have had these warnings, much less needed to pay attention to
> them

You missed my point. I mean warnings caused by "oops, that's not the variable
I meant to compare with there".

It's a similar story with "const". One of the great side-benefits of using
"const" consistently is that suddenly the compiler starts catching more of
your dumb mistakes ("oh, I thought this was called like func(dest, src), but
it's actually func(src, dest)"). The moral is that "more info to the compiler"
translates to "compiler notices a larger percentage of your dumb mistakes".

> It's impossible to convince people labouring under tyranny they've learned
> to love without them experiencing a free life first.

Melodramatic much?

------
ig1
I only had a quick skim through the article (need to be off to the London HN
meetup shortly!), but couldn't this be used to mount a DoS attack, using up
the available sockets on a server?

~~~
pmjordan
Maybe, if you could trick the server (64-bit FreeBSD) into connecting to
sockets open on 32-bit FreeBSD machines. I can't think of any common services
that would be susceptible to this (they would normally be susceptible to being
tricked into opening other kinds of long-standing connections, too, which is
just as good for DoS).

~~~
zwp
> if you could trick the server (64-bit FreeBSD) into connecting to sockets

Proxies, SMTP gateways, FTP servers (active mode), ...

~~~
delinka
"...on 32-bit [systems]" is the operative part of his statement. I'm pretty
sure 'tricking' a client into opening connections to a server is trivial
regardless of TCP bugs.

