
OpenBSD, C, httpd and SQLite – Web App stack - dhruvkar
https://learnbchs.org/index.html
======
bitofhope
I find the site a little tongue-in-cheek, but I genuinely like the ideas
behind it. I honest-to-$DEITY prefer programming in C to any sort of
JavaScript. C is kinda hellish for large, complex projects, which is good. The
world needs fewer large and complex websites. OpenBSD also provides very good
security features that mitigate the inherent security challenges of writing C.
pledge(2) is just a very good idea.

I generally think websites should do their processing in the backend, whenever
possible. Running code to generate the website should not be the user's
problem. This also comes with the advantage of not being constrained to
*Script languages. C is alright, but if you're of the defeatist camp who
thinks writing C safely is impossible, you can adapt the BCHS philosophy just
as well to C++, Rust, Go, D, Python, AWK, Common Lisp…

~~~
rileymat2
C is also hellish for string processing.

From memory management to buffer overflows to understanding the encoding
types.

~~~
mpweiher
Objective-C with a reasonable Foundation is such a great language for this...

It has essentially the predictability and simplicity of C, just with a minimal
amount of dynamic binding to make it comfortable. And NSString variants tend
to take care of string handling and encodings.

~~~
okket
At that level, you can just switch to the clang compiler based Swift language,
which is much nicer/modern. And look, there are already Swift web frameworks
like [https://vapor.codes](https://vapor.codes)

~~~
gkya
I really dislike the Swift/Scala/Rust/Dart/Kotlin etc. style, heavy and
centralised syntax. ObjC is not very beautiful, but it has an elegant way of
combining composable primitives (functions and objects). The same kinda sorta
goes for Perl too, which I think is a nicer alternative to C in this stack.

------
ronreiter
If you want simplicity as your #1 objective then I would recommend Flask and
Python. If speed is your #1 objective then you can try Go.

Maintaining a C webapp is terrible.

You don't have any primitives, you need to manage your own memory which is
extremely dangerous, and the amount of best practices you need to learn so you
won't shoot yourself in the foot and create an exploitable/unstable
application is immense.

~~~
hedora
Python is the opposite of simple.

If you want good performance, then, for non-trivial sites, you cannot afford
to spawn the interpreter on each request, so you have to use some asynchronous
tooling. Similarly, spawning connections is much heavier weight than it should
be, so you end up with backend connection pools.

At work, we used fabric for this, and it blew up in our face at scale, and we
had to rewrite.

In order to get off the ground, you also need to use pip, which causes an
inordinate number of problems vs. the bsd ports tree or .deb files. Managing
dependencies and versions between the host os and pip space still seems to be
an open problem. You can use virtual environments for each script, but then
patching zero days becomes infeasible.

If you want basic type safety, you need to rely on external tooling. In my
experience, third-party Python libraries evolve types and APIs without
warning. This shows up at runtime, so it introduces security holes.

I could go on for hours.

~~~
xtrapolate
> "you cannot afford to spawn the interpeter on each request"

You mention this as a case against using Python, but this is entirely
unrelated to Python as a framework. In other words, had they used PHP or Node
or %s, they'd still have to face the issue of optimizing the request-to-
response path.

This relates much more to the issues that often come with the common practice
of separating your webserver, from your business logic (Flask, in our case).

> "so you have to use some asynchronous tooling"

Solutions such as mod_cgi and wsgi have been the standard for years at this
point. That's, in many ways, a solved problem.

> "In order to get off the ground, you also need to use pip, which causes an
> inordinate number of problems vs. the bsd ports tree or .deb files."

Any specific issues you can expand on? Because this is a rather vague form of
criticism.

Using pip is ridiculously easy. I've been using Python extensively in various
production environments and rarely had to point a finger at pip. Package and
dependency management in Python are solid (that isn't to say things can't
improve).

> "You can use virtual environments for each script, but then patching zero
> days becomes infeasible."

Again, this is an entirely subjective form of criticism. I haven't had any
issues with keeping my virtualenvs up to date.

> "If you want basic type safety, you need to rely on external tooling."

Again, you just seem to favor strongly typed languages and that's fine, but
that's hardly a point against using Python here. Type safety is simply not a
Python thing, much like weak typing is not a C thing.

> "In my experience, third party python libraries evolve types and api’s
> without warning. This shows up at runtime, so it introduces security holes."

Developers will sometimes break APIs. How is this a Python-specific issue?
These things are literally happening everywhere.

~~~
bqe
> Again, this is an entirely subjective form of criticism. I haven't had any
> issues with keeping my virtualenvs up to date.

What tools do you use to keep all of your virtualenvs up to date?

~~~
xtrapolate
I can recommend Snyk[0] (I am not affiliated, have used their products with
clients before). They've got a fully functional free tier, and they have a ton
of relevant features. And yes, there are other, competing products out there
you could use.

[0] [https://snyk.io/docs/snyk-for-python](https://snyk.io/docs/snyk-for-python)

------
notaplumber
Important pieces of the puzzle are Kristaps' kcgi (perhaps kwebapp/ksql).

[https://kristaps.bsd.lv/kcgi/](https://kristaps.bsd.lv/kcgi/)

And also OpenBSD pledge / "sandboxing" on other OSs

[https://man.openbsd.org/pledge](https://man.openbsd.org/pledge)

[https://learnbchs.org/pledge.html](https://learnbchs.org/pledge.html)

[https://kristaps.bsd.lv/kcgi/tutorial6.html](https://kristaps.bsd.lv/kcgi/tutorial6.html)

~~~
krylon
I really hope other systems will adopt pledge. Of all the selective privilege
dropping mechanisms I know of, pledge seems to be the least complicated to
use.

~~~
pjmlp
Mojave has a new security aware runtime for its applications.

Bringing sandboxing to all applications and requiring listing of entitlements
and digital signing.

Microsoft has decided if the apps don't come to the store, the store goes to
the apps, and are merging Win32 with UWP sandboxing concepts via Windows
containers and the new MSIX package format.

Android sandboxing is still the most complicated one.

------
psergeant
> because the open internet is damn inhospitable.

And the answer to this is writing text-processing functions that you expose to
the world, in C... skeptical face

~~~
kuon
We could argue that modern static analysis tools and the compiler itself are
good at catching a lot of errors, and that high level languages have bugs too.
C is a dangerous beast, but with good practices it isn't that dangerous. Of
course you have to be extra careful when dealing with user content, like JSON
payload, but many decoders in other languages are written in C.

Rust is a good solution, but it isn't trivial to learn, and might not suit
all situations. I think it's great to have a C solution like this.

~~~
unrealhoang
I don’t think any non-GC language is trivial to learn, C included. Manual memory
management is hard, and whether you learn it from the system (the C way) or
upfront from the language (the Rust way), it’s still the same number of concepts
you have to digest to write practical software.

~~~
tannhaeuser
That's true, but the original idea of Unix is to have many small program
invocations work together, not to erect a monolithic long running daemon
that does it all in a single address space. Memory management for one-shot
command line apps isn't hard at all and can often get away with static
allocations. Even if you're screwing up your heap, process isolation will take
care of recycling memory when your program terminates.

~~~
sfifs
Many small programs simply don't scale to the modern web, where the overhead
of spinning up processes alone will kill you at any reasonable load. UNIX has
nice ideas, but recognise that they were founded in the 70s environment of a
computer shared between a few dozen users, not today's web server handling
thousands of requests a second.

~~~
bitofhope
What do you mean by modern web? I don't see how modernity implies more load,
at least for fork() bottlenecked programs specifically.

If you're serving thousands of requests per second, you probably need a pretty
beefy server anyway, regardless of the stack you're running. Forking makes
good use of multiprocessor capabilities if nothing else.

~~~
sfifs
Modern language runtimes used in the server space like Go basically multiplex
green threads across processes to minimize context switches between concurrent
execution paths. By leveraging the fact that I/O is much slower than processor
ops and switching green threads on I/O without necessarily always switching OS
process, they easily scale to hundreds of thousands of concurrent requests. I
don't really know about Erlang's model but it also has green threads. There's
a reason why high traffic sites use languages like these.

[https://talks.golang.org/2012/concurrency.slide#1](https://talks.golang.org/2012/concurrency.slide#1)

------
reacharavindh
As much as I align myself philosophically to most things this project stands
for, I wouldn't trust myself to write public facing web applications in C. The
same philosophy could still be held up while replacing the last components
with a less dangerous language like Rust (with the trade-off of developing
complexity, and possibly giving up a bit of control).

OpenBSD as a base operating system, running relayd and httpd to face the
hostile web proxying for efficient services written in Rust would be my ideal
choice. SQLite for most simple data store needs, and Postgres if the data
management gets big enough that it might spill to another machine.

------
jmartrican
Two points. Would be nice if there was an AMI on AWS for this stack.

Second point. We do a lot of RESTful microservices. Which might be good for
this in the sense that with C you would want to keep your code small (cause no
classes and manual mem-management). But what people do not always appreciate
at first about microservices is that you need to rely heavily on libraries
(either custom or external like Spring Security) to handle cross cutting
concerns, else you are writing the same code multiple times. Essentially if
you write a piece of code that you expect will go on more than one
microservice, then it should go into a library. So my concern is how easily
would it be to create custom libraries to deal with cross cutting concerns
like logging, and security in this stack, AND does C ecosystem offer external
libraries for dealing with these common concerns that a modern RESTful
microservice would encounter?

~~~
pjmlp
And most important how sure can one be that those libraries have been properly
sanitized.

------
kazinator
If you want to do web backend in C, this approach is far from the only game in
town. For instance, there is this: [https://kore.io/](https://kore.io/)

I like the idea of a web framework having a C API. We can make bindings to it
in different languages and have a kind of semantic standard across the board,
like is done with numerous other things: numeric libraries, crypto, GUI, ...

------
hedora
I suspect “unsafe” C leads to safer server side code when combined with a few,
simple libraries.

One issue that seems to fly over the head of “safe” high level language
proponents is that you need to know what data types you are manipulating in
order to write safe code.

Also, language-level isolation is much harder to implement correctly than
process-level isolation (and OpenBSD has spent tons of time hardening process-
level isolation).

I’d love to see a security bake off of a few mature applications built with
this and random-language-du-jour.

~~~
okket
> I suspect “unsafe” C leads to safer server side code when combined with a
> few, simple libraries.

As many mentioned this might be true, but only if you don't have to deal with
user input, strings, unicode and anything but extremely simple manual memory
management.

The essence is, C and similar languages like Rust are great for writing an OS
or a browser, but you really, really, really do not want to write complex
server side web apps in those languages.

Sure, you can force your way through and if you have 30+ years of C in your
head, it might even be safe. But it is not reasonable nor advisable for the
general case. IMHO.

~~~
efficax
Rust brigade engage!

Not sure why you're lumping Rust in with C here. Rust has automatic memory
management as part of the language, _without_ a runtime, and it has excellent
libraries for dealing with Strings, unicode, and concurrency and asynchronous
functions. It's a perfect language for writing a complex server side web app.

~~~
okket
Sorry, my ignorance. I've had only a brief look at Rust, it didn't 'click' in
my head. On a second thought, it seems much more viable than this plain C
mockery. Especially with the right support e.g.
[https://rocket.rs](https://rocket.rs)

------
hellofunk
I am seeing SQLite pop up more and more in articles that tout its
effectiveness and simplicity for many use cases over popular alternatives like
PostgreSQL. It has definitely persuaded me to consider it on future projects.

~~~
dom96
Oh yeah. SQLite is incredibly versatile and for most web applications it's
more than enough. The only pain point I have with it is that all data is
stored as a string no matter the table schema types specified, this can lead
to some bugs, but if you have a good library in your language for SQLite then
it's not a problem.

~~~
hellofunk
I am currently working with a data set that is billions of rows and tens of
GB. Will be interesting to see how SQLite handles it, which I plan to try. All
of that in one DB file!

~~~
dagw
The main place you might run into problems is if doing lots of simultaneous
writes. As long as you're mostly dealing with reads then it should work.

------
sbr464
The example says no mysticism but is using at least 4 undeclared variables.
I’m not a C developer so I’m sure I’m completely wrong, but where do pledge,
puts, EXIT_SUCCESS/EXIT_FAILURE come from?

I assume from importing stdlib etc, which probably give you additional
variables to use, that you would need to know about explicitly. One thing I
like about zeit.co’s micro server (js) was the argument about not using
bodyparser and magically having res.body available, but using async/await and
assigning it to a variable. This is popping up more with things like es module
imports, render props in react, etc. They make it clear where variables come
from and allow you to avoid stepping on existing variables. What other 10-100
variables can I step on potentially with global patterns as a new user?

~~~
cpmouter
It's surprising to me that someone could be a programmer without knowing
anything about C.

~~~
scandox
The best programmer in the world may never have even seen a computer for all
we know.

~~~
carapace
Dijkstra famously didn't use one until his friends and colleagues made him get
a mac so they could email him.

------
ohazi
I really don't like OpenBSD's httpd. Last I checked (three months ago?), it
couldn't do something as simple as adding a custom header to a static file.
Why you'd use it over Apache or nginx is beyond me.

~~~
petee
I'd love to know what header you desperately need just for a static file, that
would necessitate a behemoth such as nginx, or apache and all its CVEs?

Simply having all the bells, whistles and free candy, doesn't determine 'toy'
status

~~~
ohazi
X-some-signal-to-tell-aws-that-its-okay-to-use-this-file: true

"Why would anyone need this" was also the typical dismissive response I got
when looking around for help.

~~~
petee
It's definitely a legitimate question - not everybody knows that AWS needs
some silly header, nor why. OpenBSD people will be the first to call out
unnecessary/outlandish requests unless you can back it up with a good reason
outright.

I appreciate the fact that httpd doesn't attempt to be the end-all server for
everybody - it's primarily a lightweight way to run things like the BGPd
looking glass, other tools, a simple website, or something via fastcgi. The
term they use often when denying pull requests is 'Featuritis', which Apache &
Nginx suffer from.

~~~
abiox
what would you recommend for someone needing to add headers?

~~~
stevekemp
If you're using httpd you'd probably pair it with relayd.

With relayd you can add custom headers, as this random example shows:

[https://github.com/reyk/httpd/wiki/Using-relayd-to-add-Cache...](https://github.com/reyk/httpd/wiki/Using-relayd-to-add-Cache-Control-headers-to-httpd-traffic)

However that's probably not great still, because it's a global solution to a
per-file problem.

------
kizer
I may be young and dumb, but C programming seems very simple and
straightforward to me. With abstraction comes lack of knowledge of
implementation in favor of ease of use; but in C you can build your own
abstractions quickly and reach a level close to even JS (opaque types,
function pointers). At least in C you know exactly what is going on.

Can someone briefly summarize the security issues in C? If you manage memory
properly and take a conservative approach to handling input, where is the
risk?

Like I said, I'm young and have only been programming in C for ~3 years.

~~~
pjmlp
In C you think you know what is going on, especially bad when developers mix
the compiler specific behavior with the standard and then try to write
portable code.

\- Decays of arrays into pointers.

\- Decays of enumeration into numeric types

\- Default signedness is implementation defined

\- Implicit conversions

\- Currently ISO C11 lists about 200 UB cases, C2X plans to list even more

\- Strings are a pointer to somewhere in memory that you hope the caller
actually terminated them.

\- While the pre-processor seems basic, compared with real macros, you can
still be very creative with it

\- No way to validate security issues in binary libraries

The LLVM and PVS Studio blogs have quite a few examples of C gotchas.

~~~
earenndil
> While the pre-processor seems basic, compared with real macros, you can
> still be very creative with it

That sounds like a benefit, not a downside. See, for instance,
[http://libcello.org/](http://libcello.org/)

~~~
pjmlp
Try to do maintenance work on a foreign code base and you quickly will
change your mind regarding it being a benefit.

Doing code fixes on a server code which was using clever tricks to convert
between memory handles and the real memory addresses taught me that.

------
usgroup
Lol, about 10 years ago I wrote a CGI framework in C. It was fun and the
minimalist in me loves the lark but for productive work especially working
with other people, opting for a stack with next to no abstraction is crazy.

Put it this way: would you invest in a startup intending to use this stack?
Can this stack ever be the result of pragmatic decision making?

If you can genuinely answer yes to both in your use case , then why not ...

~~~
stevekemp
People generally invest in startups because they believe the sales/customer
numbers. Not because they pay attention to the implementation of every single
component which is used by the company.

That said OkCupid famously used a custom HTTP-server to power their site:

[https://github.com/OkCupid/okws](https://github.com/OkCupid/okws)

------
UncleEntity
> C is a straightforward, non-mustachioed language

Maybe non-mustachioed out of the box but I've done the mustachio thing to
generate code for my ASDL parser backend.

Well...I guess it was a C++ library since it was "header only" but I'm sure I
could have found a C library if part of the project wasn't to learn
boost::spirit.

------
potta_coffee
Please...I'm convinced this "BCHS Stack" is just an elaborate joke.

------
qwerty456127
I can imagine an even more weird one: DOS, ASM, dBase... (;,;)

> C is a straightforward

Indeed, what can be more straightforward than manual memory management and
pointer arithmetic...

------
ghapereira__
> "man pages and "-Wall -Wextra" are your new best friends" These guys are
> more likely friends with "-pedantic"

~~~
bitofhope
And nothing wrong with that. Stricter compiler is a good thing if you don't
need it to be laxer.

My favourite flag in clang is -Weverything. Everything I write in C I compile
with it and fix whatever I reasonably can.

------
blueside
I love this; we are definitely in the territory of Poe's law here.

I strangely really like the css formatting used in the website's source code
(e.g.
[https://github.com/kristapsdz/bchs/blob/master/json.css](https://github.com/kristapsdz/bchs/blob/master/json.css))

~~~
ape4
That formatting is cool. Why would they not want radio buttons to display?
`input[type=radio] { display: none; }`

------
alexhutcheson
Doing anything on the web with a language that doesn’t have a string type
seems like masochism.

~~~
kizer
A character array seems like a natural way to me to represent a string of
characters. But Unicode though...

~~~
fulafel
As C functions can't take arrays as arguments, you don't get very far this
way.

~~~
kizer
...you pass in a pointer to the first character. Arrays in C are simply
contiguous spans of memory.

~~~
fulafel
This then has the problem that the length is unknown, leading to well known
problems.

~~~
petee
Well if you wrote the array, or your program did, then you do know the length.
Not a problem. Would it be nice if an array internally stored that? Sure. Does
it? No. Is that a real problem? No, not really - it's been solved and worked
around for decades

------
yellowapple
If you're looking to develop a website exclusively with the tools that ship
with OpenBSD, Perl would probably be a much saner choice than C, not to
mention one with a long history of use in web development.

------
tannhaeuser
I'm loving the spirit if not the snark, but it's still odd OpenBSD's httpd
(not to be confused with Apache httpd) calls its CGI gateway "slowcgi" (and
kindof derides CGI programming). I mean, it being slow is entirely the fault
of the O/S isn't it ?: And using C gets a whole level more challenging with
long-running processes and async I/O because of manual memory management.
Native CGI programming IMHO is best enjoyed with a reverse cache if you can
help it, such as with Apache's mod_cache.

~~~
notaplumber
It's slowcgi.. you know, as opposed to fastcgi. It's just a name. It was
implemented so that httpd(8) didn't have to support executing CGI programs
itself and could speak only FastCGI.

You don't /have/ to use it, but it's safer.

[http://man.openbsd.org/slowcgi](http://man.openbsd.org/slowcgi)

~~~
tannhaeuser
I was referring to the release announcement back in 2014 or so which I'm
unfortunately not able to find (there was more than a bit snarkiness towards
CGI programming).

I'm aware what FastCGI is ;) Btw. if you're interested in native HTTP I'd be
looking into nghttp2 rather than FastCGI and OBSD's httpd.

Edit: see also [https://ef.gy/fastcgi-is-pointless](https://ef.gy/fastcgi-is-pointless) and
[https://news.ycombinator.com/item?id=9202039](https://news.ycombinator.com/item?id=9202039)
(previous discussion of OBSD's httpd)

~~~
bigato
how can i trust a lib which still uses openssl in this day and age?

------
jokoon
How well does it run on a raspberry pi?

~~~
sgt
It should run just fine, but I haven't tested it myself. OpenBSD has support
for the Raspberry Pi (BCM2837):
[https://www.openbsd.org/arm64.html](https://www.openbsd.org/arm64.html)

------
sfifs
Expose a C program with all the language's subtle undefined states and manual
memory management to the big bad web! What could go wrong...

~~~
petee
Well untrained programmers who have no business writing C apps in the first
place, for starters.

Everyone here is completely glossing over this as if it were suggested that
this is intended for everybody, for every solution. If you're not extremely
confident in your C skills, or have the experience to back it up, _This is NOT
for you!_ If you think you can only do it with another language, then _This is
NOT for you!_ If you think you're gonna write the next Amazon, _This is NOT
for you!_

And let's not forget just how many things are written in C and still _running
the internet_. It's just another tool in your toolbox.

~~~
moosingin3space
Linus Torvalds is "extremely confident in his C skills". Doesn't stop hundreds
of CVEs per year targeted at the kernel.

Sick of the C Apologism Task Force's idea that "if only rockstar programmers
wrote code we wouldn't have any problems", completely ignoring the reality of
software development and the evidence of 50 years of exploits.

~~~
petee
50 years of the same bugs over and over - the language might let you shoot
yourself, but that doesn't absolve the programmer from continuing education,
code review, or testing. Seacord has a whole book on avoiding these; arguably
anyone without it isn't doing all they can to write safe code, and might as
well be considered uneducated. You can write unsafe code in any language if
you don't know what's going on, or aren't a 'rockstar'.

As for Linus, the kernel was written when he was a student, and I doubt he
personally still makes the same low hanging fruit vulnerabilities that are the
big issue.

------
kworker
Any example site with huge traffic, or is it just a "joke" project?

