
Nweb: a tiny, safe web server (static pages only) - josstin
http://www.ibm.com/developerworks/systems/library/es-nweb/index.html
======
kragen
A few months ago, I wrote httpdito, a tiny web server that serves static pages
only. It's about the same amount of code as nweb, but less functionality, and
I have more confidence in its security:
[http://canonical.org/~kragen/sw/dev3/server.s](http://canonical.org/~kragen/sw/dev3/server.s),
with README at
[http://canonical.org/~kragen/sw/dev3/httpdito-readme](http://canonical.org/~kragen/sw/dev3/httpdito-readme).
It's 296 instructions.

I'm not saying it's secure, but I certainly intended it to be, and it doesn't
suffer from the particular problems tptacek, evmar, kedean, and nknighthb
identify in nweb. I'd like to think I'm not naïve enough to have written
problems like that, but that's probably not true.

(I'm pretty sure that "Try my new secure software!" is something that should
not be followed with "I wrote it in C!" but usually assembly language is not
going to be an improvement. In this case I think it happens to be.)

httpdito was discussed on HN a bit before it was finished; for example, it's
no longer completely trivial to DoS it, although I could do more to protect it
against that.

------
tptacek
Adding to what everyone else has said, this is also "how not" to write socket
code; for instance, the assumption that you can read a whole HTTP request "in
one go" with a single large read call is false.
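
To make the point about partial reads concrete, here is a minimal Python sketch (not nweb's code, and not anything posted in this thread) that keeps calling recv() until the blank line ending the request headers shows up. The names `read_request_head`, `demo`, and `MAX_HEADER_BYTES`, and the 8192-byte cap, are assumptions made up for the illustration.

```python
import socket

HEADER_END = b"\r\n\r\n"
MAX_HEADER_BYTES = 8192  # arbitrary cap chosen for this sketch


def read_request_head(conn, max_bytes=MAX_HEADER_BYTES):
    """Accumulate bytes until the blank line that ends the HTTP headers."""
    buf = bytearray()
    while HEADER_END not in buf:
        if len(buf) >= max_bytes:
            raise ValueError("request head too large")
        chunk = conn.recv(min(4096, max_bytes - len(buf)))
        if not chunk:  # peer closed before finishing the request
            raise ConnectionError("connection closed mid-request")
        buf += chunk
    head, _, rest = bytes(buf).partition(HEADER_END)
    return head, rest


def demo():
    """Deliver one request in two pieces over a local socket pair."""
    client, server = socket.socketpair()
    with client, server:
        client.sendall(b"GET /index.html HTTP/1.0\r\nHost: exa")
        # What a single read would have seen at this point (peek, do not consume).
        partial = server.recv(4096, socket.MSG_PEEK)
        client.sendall(b"mple\r\n\r\n")
        head, rest = read_request_head(server)
    return partial, head, rest
```

In `demo()`, the peeked first read sees only a fragment of the request, while the loop returns the complete head once the second piece has arrived.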

Also, casting function calls to (void) is nonsensical.

You can perhaps forgive the sprintf() call because, AIX. (Believe it or not,
there was a time when snprintf was a portability problem). You can't forgive
the log() function that doesn't explicitly bounds check its argument (though
it's not exploitable in this code).

~~~
cgh
> Also, casting function calls to (void) is nonsensical.

This is an older convention to indicate that the programmer knows the function
returns a value but has chosen to ignore it. It got around the false positives
generated by lint.

This code uses SIGCLD which I don't think is supported by BSD, where it is
called SIGCHLD and is slightly different (someone please correct me if I'm
wrong). If that's the case, then the author's assertion that it "should run
unchanged on AIX, Linux®, or any other UNIX version" is incorrect - it will
only run on System V varieties.

~~~
Pitarou
Confirmed the SIGCLD problem.

I had to change it to SIGCHLD to get it to compile on my Mac.

------
chm
This crowd is always tough to please. There's a description at the top which
says, about the 200 loc http server:

> You can see exactly what it can and can't do.

Thank you Mr. Griffiths. Your example will help extend my understanding of an
http server, even if I don't intend on writing one. I would never read through
the 90 klocs of httpd.

------
pblakeney
My C is a little rusty, but it seems like this web server is definitely not
safe. The very first function in the code has a local stack variable and uses
sprintf() to fill it. That's almost a textbook example of a buffer overflow
vulnerability, if I'm not mistaken. Even if they try and compensate for that
by checking the data length before it's passed to that function, it's still
scary to see someone using sprintf() instead of snprintf() these days. It's
like walking a tightrope without a net.

~~~
nknighthb
It's scary style (and speaking of style, this code is really inconsistent in
its formatting), but from a quick search of the usages of the logger function,
I didn't see any way to overflow the buffer.

* BUFSIZE is 8096.

* logbuffer (the local variable) is BUFSIZEx2

* s1 looks like it's always trusted and an order of magnitude smaller than BUFSIZE.

* The format strings and numbers are nowhere near big enough to make up the difference.

* Where s2 is untrusted data, I _think_ it's always guaranteed to be <=BUFSIZE and zero-terminated.
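
The arithmetic behind that list can be checked in a few lines of Python. This is only a sketch: BUFSIZE and the 2x logbuffer come from the bullets above, while the `S1_MAX` bound (one tenth of BUFSIZE) and the `headroom` helper are assumptions standing in for "an order of magnitude smaller".

```python
BUFSIZE = 8096
LOGBUFFER = 2 * BUFSIZE   # size of the local buffer, per the list above
S1_MAX = BUFSIZE // 10    # assumption: s1 is "an order of magnitude smaller"
S2_MAX = BUFSIZE          # assumption: the "<= BUFSIZE" guess about s2 holds


def headroom(s1_len=S1_MAX, s2_len=S2_MAX):
    """Bytes left over for format text, numbers, and the terminating NUL."""
    return LOGBUFFER - (s1_len + s2_len)
```

Under those assumptions `headroom()` is positive by several thousand bytes. The margin disappears only if s2 can exceed BUFSIZE, which is exactly the guarantee the last bullet is unsure about.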

But there are definitely other possible issues I haven't looked at closely,
and I'm certainly troubled that this mess has shown up on an IBM site as an
example of a "safe" web server.

------
ChuckMcM
I always enjoy articles that reiterate how the simple stuff really is pretty
simple. I like thttpd [1] for that reason: it's a really simple webserver (a
bit more featureful than this one, but not by a lot). Easy to comprehend,
easy to keep all the moving pieces in your head at once.

Folks building embedded stuff have been using this stuff to create their UIs
for like forever it seems, and this kind of web server works pretty well in
that capacity.

[1]
[http://www.acme.com/software/thttpd/](http://www.acme.com/software/thttpd/)

------
evmar
This code is really not good, and certainly not worth learning from.

It appears that if you request a path like "//etc/foobar" with two slashes at
the front it'll allow traversal outside the starting directory, though it's
mitigated by checking file extensions.
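
For comparison, a generic containment check might look like the Python sketch below. It is not how nweb handles paths; `resolve_under_root` is a hypothetical helper, `/srv/www` is a made-up document root, and the code assumes POSIX-style paths.

```python
import os


def resolve_under_root(root, url_path):
    """Return the real filesystem path for url_path, or None if it escapes root."""
    root = os.path.realpath(root)
    # Strip leading slashes: os.path.join() would otherwise treat
    # "//etc/foobar" as absolute and discard the root entirely.
    relative = url_path.lstrip("/")
    # realpath() collapses ".." components and resolves symlinks.
    candidate = os.path.realpath(os.path.join(root, relative))
    if os.path.commonpath([root, candidate]) != root:
        return None
    return candidate
```

With this check, `resolve_under_root("/srv/www", "//etc/foobar")` stays inside the root, and `resolve_under_root("/srv/www", "/../etc/passwd")` returns `None`.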

~~~
JasonFruit
That may be, but the docs are quite nice; people who can write better code
might still gain from considering how they can make their documentation this
instructive.

------
dfc
I am always on the look out for a small, lightweight and secure web server for
impromptu file sharing. Right now I use publicfile from djb.[^1] My only
complaint is that there is no debian package for publicfile so I have to build
my own package. I would love to find an equivalent (ftp not necessary) daemon
that is included in debian. Is anyone aware of something in the debian repos
that I am overlooking?

[^1]: [http://cr.yp.to/publicfile.html](http://cr.yp.to/publicfile.html)

~~~
pmahoney
This is hardly in the category of "lightweight and secure", but for impromptu
stuff, this Ruby+Rack one-liner serves directory listings and static files
from the current directory:

    
    
        ruby -e 'require "rack"; include Rack; \
          Server.start :app => Directory.new(".", \
          Static.new(nil, :urls => ["/"], :root => "."))'

~~~
dylz

        python -m SimpleHTTPServer

~~~
pekk
although that server is very horrible

~~~
hcarvalhoalves
Works fine for some tests, but it's single threaded. If you need concurrency:

    
    
        twistd -no web --path=.
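
On Python 3 the module is `http.server`, and since 3.7 the standard library also includes `ThreadingHTTPServer`, so concurrency does not need a third-party package. A minimal sketch, assuming Python 3.7+; `make_server` and `demo` are names invented here:

```python
import functools
import http.client
import os
import tempfile
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(directory=".", host="127.0.0.1", port=0):
    """Build a static file server that handles each request in its own thread."""
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer((host, port), handler)


def demo():
    """Serve a temporary directory on a free port and fetch one file from it."""
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "hello.txt"), "wb") as f:
            f.write(b"hi\n")
        server = make_server(root)  # port 0 lets the OS pick a free port
        thread = threading.Thread(target=server.serve_forever, daemon=True)
        thread.start()
        try:
            conn = http.client.HTTPConnection(
                "127.0.0.1", server.server_address[1], timeout=5)
            conn.request("GET", "/hello.txt")
            body = conn.getresponse().read()
            conn.close()
            return body
        finally:
            server.shutdown()
            server.server_close()
```

For everyday use, `make_server(".", port=8000).serve_forever()` serves the current directory until interrupted. It binds to localhost only by default, which is a choice made for this sketch.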

~~~
deathanatos
Perhaps, but the use case it usually finds is "I need a webserver, here. Now."

Python is nearly always installed, and that depends on nothing but the
standard library. You can stick an alias in your dotfiles (I have, it's called
"serve-this") and not have to worry about having twisted¹ getting to wherever
your dotfiles get put. (I distribute my dotfiles over git/github, so it's
really easy to move them around. More work to get Twisted.)

¹Or Ruby… or Go…

~~~
hcarvalhoalves
Twisted is pre-installed on OS X and various Linux distributions...

------
Rzor
I want to learn a bit about web servers, and I think that studying the source
code of a functional one may be worth more at first than trying to build
something from scratch. Since I'm seeing so many comments on the security
issues of this particular project, can you guys recommend something more reliable?

Thanks in advance.

Edit: I "know" C and C++ and would like to remain in one of these languages,
if it's not asking too much.

~~~
pjmlp
If you care about security then C and C++ are out of the game, especially if
there is a team of different skill sets involved.

Having said this, have a look at Wt and Poco

[http://www.webtoolkit.eu/wt](http://www.webtoolkit.eu/wt)

[http://pocoproject.org](http://pocoproject.org)

------
nthitz
[http://www.ibm.com/developerworks/systems/library/es-nweb/si...](http://www.ibm.com/developerworks/systems/library/es-nweb/sidefile1.html)
Direct link to the (200 lines of) source code. I can't
speak to the security, but it is a nice little read.

------
theboss
The only thing I can see that makes this safer is that it is extremely small. No high
assurance design, etc. What am I missing here that makes them advertise its
safety?

------
rsync
Does this do SSL? If not, there is no reason to use this instead of the
(excellent) thttpd.

thttpd is a very, very nice tool. It's very handy sometimes to just fire up
thttpd -d /some/dir because you want to look at the contents of the dir in a
web browser but don't want to spin up the whole environment and server, etc.

I put thttpd on a lot of informal servers just to have it around when I need
something like that...

~~~
Pitarou
This is not intended for real world usage. Its purpose is to help others learn
how web servers work.

------
fmela
Just because the code is tiny doesn't mean that it is safe. How do we know
that the code is not vulnerable to e.g. buffer overflow exploit?

~~~
kedean
I find your comment particularly funny, because in the information security
course I took for my masters degree, we had to perform remote buffer overflow
exploits using an older version of this exact software.

------
abimaelmartell
The code is ugly; check out
[https://github.com/cesanta/mongoose](https://github.com/cesanta/mongoose)
(GPL & MIT) or
[https://github.com/sunsetbrew/civetweb](https://github.com/sunsetbrew/civetweb)
(GPL)

------
steventhedev
If I had infinite time and patience, I'd tinker with this to show the
differences in socket code, specifically with the approaches outlined in the
C10k document.

Although it was largely unfinished, the approach outlined in C10m would be
interesting to see implemented here (via the intel user-space driver).

------
inconshreveable
For those of you who are interested in a tiny, safe, static file server that
provides secure, public URLs from any machine (ngrok-style), I have a simple
project called srvdir that will probably be useful to you:
[https://srvdir.net](https://srvdir.net)

~~~
mo
Nice, but the writing suggests that it's secure as in end-to-end encryption,
what people expect from HTTPS, while in fact you tunnel everything in plain
through a central server. You should make this clear on the site.

------
stefs
> if LINUX sleep for one second to ensure the data arrives at the browser

can someone explain that please?

~~~
Pitarou
It's explained further down:

 _After the last byte of the file is sent, the nweb web server web() function
stops for one second. This is to enable the file contents to be sent down the
socket. If it immediately closes the socket, some operating systems do not
wait for the socket to finish sending the data but drops the connection very
abruptly. This would mean that some of the file content would not get to the
browser, and this confuses the browser by waiting forever for the last bit of
the file and often results in a blank web page being displayed._
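
A commonly described alternative to the fixed sleep is a "lingering close": shut down the sending side, drain whatever the client still sends until it closes or a timeout expires, and only then close. The Python sketch below is an illustration under those assumptions, not what nweb does; `lingering_close` and the 2-second default are made up for the example.

```python
import socket


def lingering_close(conn, timeout=2.0):
    """Stop sending, wait for the peer to finish, then release the socket."""
    try:
        conn.shutdown(socket.SHUT_WR)  # no more data from us; queued bytes still go out
        conn.settimeout(timeout)
        while conn.recv(4096):         # discard whatever the client still sends
            pass
    except OSError:                    # includes timeouts and connection resets
        pass
    finally:
        conn.close()
```

The wait ends as soon as the client closes its side, so a well-behaved client costs no fixed delay, and the timeout bounds how long a stalled one can hold the connection.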

~~~
kragen
This text has an error rate of about one factual error per sentence, as you'd
expect from someone who capitalizes "Linux" as if it were an acronym.

~~~
Pitarou
We're definitely in "works on my machine" territory here.

Still it's good for building your confidence as a programmer, no?

------
adultSwim
This is a great example of how to clearly document a project.

The technical criticisms are confirmation of that. How many HN submissions to
open source projects are as well explained?

------
jrochkind1
today i learned: there's still AIX. Huh, really?

~~~
kokey
Yup. Banks still have lots of these, and Solaris is being phased out more
actively than AIX.

------
Codhisattva
Not a prank!

~~~
oracle2025
This is actually the first thing I thought, after having a quick look at this
code, and being aware of today's date ;-)

------
reustle
All you really need is

> python -m SimpleHTTPServer

