Hacker News
A second life for very old C programs (tailordev.fr)
77 points by couac on March 24, 2017 | 29 comments

If you have these compiling with a modern toolchain, you may wish to ditch the Python subprocess wrapper and compile them with something like Emscripten (compile to JavaScript) or NestedVM (compile to JVM bytecode). Then you won't be exposing 30-year-old unsafe code directly to the internet... at least it would be sandboxed. If you use Emscripten you could even have the programs run client side.

I have a fork of NestedVM that I've been using with success, but I don't have any simple examples for you.

Here is an example of compiling the Apache Thrift compiler to run on the JVM: https://github.com/bgould/thrift/blob/nestedvm/contrib/neste...

And I have a version of the toolchain precompiled as a Docker image: https://hub.docker.com/r/bcg1/nestedvm/

Very nice, thanks! This is the third time I've read something like this today, so it's worth considering, I guess. Thanks a lot for the links!

How does it compare with the classic approach of providing a JavaScript emulator for the binaries?

This is what the archive.org Software Library does.

MSDOS Games: https://archive.org/details/softwarelibrary_msdos_games

Atari 2600: https://archive.org/details/atari_2600_library


Thanks. I did not know about it to be honest. Very interesting!

There is no comparison, since the idea was not to port the programs to JS or run them in the browser, but rather to be able to compile them (with a common setup) and create an API on top of them :)
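For illustration, the "API on top of a recompiled binary" idea can be sketched as a thin Python subprocess wrapper. The binary name `./legacy_calc` and the stdin/stdout protocol are assumptions made up for this example, not details from the article:

```python
import subprocess

# Sketch: wrap a recompiled legacy binary behind a small Python
# function that an HTTP layer can call. "./legacy_calc" is a
# hypothetical stand-in for one of the old C programs; it is
# assumed to read its parameters on stdin and write results to stdout.
def run_legacy(payload: str, binary: str = "./legacy_calc",
               timeout: float = 5.0) -> str:
    proc = subprocess.run(
        [binary],
        input=payload,          # feed parameters on stdin
        capture_output=True,    # collect stdout (and stderr)
        text=True,              # work with str instead of bytes
        timeout=timeout,        # don't let a hung binary stall the API
    )
    proc.check_returncode()     # surface non-zero exits as exceptions
    return proc.stdout
```

A web framework endpoint would then just map request bodies onto `run_legacy` calls and return the captured stdout as the response.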

If you don't need multiple people to interact with shared data maintained by the same program, and instead everyone's interactions with the program can run independently, then you could still run the JavaScript version of DOSBox on the client side, and extend that version of DOSBox to offer JavaScript APIs to communicate with the virtual machine running inside. For instance, to send or receive characters, or modify files on the virtual disk.

That sounds an awful lot more complicated.

> these software have been written for MS-DOS and therefore require a rather old computer to use them

There are quite a few options to run MS-DOS applications on modern hardware.

If you give that sentence more context, it becomes clearer what the author was hoping to be able to do with the old code. It's not merely "keep this old DOS accounting app working for the local auto mechanic".

> First, these software have been written for MS-DOS and therefore require a rather old computer to use them. This leads to two more issues: not everyone can easily use them, and it is nearly impossible to interface them with other software.

> We have been asked to build a Proof Of Concept (POC) to transform these programs into web services in a week

This person is a museum curator, so there's a good reason (society benefit) for their desire to transform something from the old computing world into something people can appreciate and understand in today's context.

This is not a 'curation' worthy event, in my opinion. A more accurate presentation would be to have the DOS machine 'in the flesh' (i.e. a booting example of the ol' PC). When that's not happening, the next best thing is emulation.

Re-compilation, cleverly disguised as 'auto-configuration' in this case, is pretty much a total fakery.

1987 isn't very old. There is easily code in, oh, Vim which is that old, and gets used daily.

(No double entendre intended here with "gets", by the way.)

Though I'm sure you meant this casually (and yes, Vim has been around quite a while), Vim is not vi.


" Vim (/vɪm/;[3] a contraction of Vi IMproved) is a clone of Bill Joy's vi text editor program for Unix. It was written by Bram Moolenaar based on source for a port of the Stevie editor to the Amiga[4] and first released publicly in 1991. "


" The original code for vi was written by Bill Joy in 1976, as the visual mode for a line editor called ex that Joy had written with Chuck Haley.[2][3] Bill Joy's ex 1.1 was released as part of the first BSD Unix release in March, 1978. It was not until version 2.0 of ex, released as part of Second Berkeley Software Distribution[4] in May, 1979 that the editor was installed under the name "vi" (which took users straight into ex's visual mode), and the name by which it is known today. "

" In 1989 Lynne Jolitz and William Jolitz began porting BSD Unix to run on 386 class processors, but to create a free distribution they needed to avoid any AT&T-contaminated code, including Joy's vi. To fill the void left by removing vi, their 1992 386BSD distribution adopted Elvis as its vi replacement. 386BSD's descendants, FreeBSD and NetBSD followed suit. But at UC Berkeley, Keith Bostic wanted a "bug for bug compatible" replacement for Joy's vi for BSD 4.4 Lite. Using Kirkendall's Elvis (version 1.8) as a starting point, Bostic created nvi, releasing it in Spring of 1994.[28] When FreeBSD and NetBSD resynchronized the 4.4-Lite2 codebase, they too switched over to Bostic's nvi, which they continue to use today.[28] "

No, I meant Vim very specifically, because I know Vim to be based on sources dating to 1987 or older, and I know it to be in daily use.

I used the original Elvis in the early 1990's a bit and am aware that nvi is based on it. I don't know off the top of my head how old Elvis is. I don't know whether Joy's vi code is still in use anywhere, either; obviously it isn't in any of the BSD's that use nvi.

Probably the vi in AT&T derivatives (Solaris, etc) has Bill Joy code in it. Does Solaris receive daily use involving invocations of vi? That is the question ... haha!

Yeah, my first thought was a front-end for pre-K&R versions of C.

Surely you meant "pre-ANSI".

No, K&R (first edition) is pre-ANSI. I mean pre-K&R: things like initializers without ‘=’ and compound operators with ‘=’ first.

How many programs from that era still exist and were not updated over the years? Of those, how many are of the form of "diagnostics program for a 1960s-era IBM printer attached to an OS/360 machine using some custom bit of hardware"? For the latter, running the process in a browser is probably worthless.

What are some examples of this software? I think I'm too young to have seen any, though it sounds interesting.

Hmm... I did not expect that C from 1972 was still in use 15 years later. Interesting!

How big are these programs? Isn't it better to just commit them on GitHub and have volunteers rewrite them nicely in something modern?

This is very cool and something I have been curious about for a long time. How is the performance?

Surprisingly, pretty good. Simple API load tests show response times under 80 ms. The C programs themselves are very fast (because there is not a lot of computation involved).

Older C programs were written before many well-known security attack vectors were discovered, e.g. the first in-the-wild exploitation of a buffer overflow came in 1988.

So be careful: take measures so that people cannot exploit those vulnerabilities, for example by executing arbitrary code on your machines.

Actually it was already acknowledged by the C designers in 1979 that not everyone was writing good C code, hence lint.

"Although the first edition of K&R described most of the rules that brought C's type structure to its present form, many programs written in the older, more relaxed style persisted, and so did compilers that tolerated it. To encourage people to pay more attention to the official language rules, to detect legal but suspicious constructions, and to help find interface mismatches undetectable with simple mechanisms for separate compilation, Steve Johnson adapted his pcc compiler to produce lint [Johnson 79b], which scanned a set of files and remarked on dubious constructions."


While it is true that buffer overflows were known to researchers in 1979, it wasn't until much later that they started being exploited in the wild.

The problem with security was already well known:

"The first principle was security: The principle that every syntactically incorrect program should be rejected by the compiler and that every syntactically correct program should give a result or an error message that was predictable and comprehensible in terms of the source language program itself. Thus no core dumps should ever be necessary. It was logically impossible for any source language program to cause the computer to run wild, either at compile time or at run time. A consequence of this principle is that every occurrence of every subscript of every subscripted variable was on every occasion checked at run time against both the upper and the lower declared bounds of the array. Many years later we asked our customers whether they wished us to provide an option to switch off these checks in the interests of efficiency on production runs. Unanimously, they urged us not to - they already knew how frequently subscript errors occur on production runs where failure to detect them could be disastrous. I note with fear and horror that even in 1980, language designers and users have not learned this lesson. In any respectable branch of engineering, failure to observe such elementary precautions would have long been against the law."

-- The Emperor’s Old Clothes By C.A.R. Hoare Communications of the ACM, 1981

As for exploits in the wild, before the UNIX worm there were the game cheats on home micros.

Buffer overflows were discovered earlier than that, in 1972.


However, regular C programmers in the '70s and '80s were not defending themselves against arbitrary code execution via buffer overflows.

In addition, you can take C programming books from the time and see how functions like memcpy and strcpy were used in unsafe ways.

That is my point. If you are going to reuse old programs, be aware that in earlier decades there wasn't as much awareness about security as there is today.

You are right, thanks for the reminder!

It's a DOS-to-GNU conversion. A 1990s dream come true.

Nice work.
