
This is definitely a cool little tutorial, but I'd strongly caution anyone who's writing string-handling software (and what is a shell but string-handling software: the input and the output are all strings!) not to write it in C. There are higher-level languages with good string support which support fork(), exec(), wait() & friends, and are far less susceptible to string-handling and memory errors.

That said, a shell's remarkably limited scope is actually something C is reasonably suited for.
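To make the fork(), exec(), wait() point concrete, here is a minimal sketch in Python of the core of a shell's command execution. The function name `run_command` is made up for illustration, it assumes a POSIX system, and it ignores everything a real shell needs beyond this (pipes, redirection, job control, signals).

```python
import os
import shlex


def run_command(line):
    """Run one command line in a child process and return its exit status."""
    argv = shlex.split(line)  # a string in, an argv list out
    if not argv:
        return 0

    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the requested program.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)  # conventional "command not found" status

    # Parent: block until the child terminates, then decode its status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return 128 + os.WTERMSIG(status)  # child was killed by a signal
```

The string handling (`shlex.split`) needs no manual memory management, while the process calls map directly onto the C ones.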




I don't think there are that many languages that are good for writing a shell. There seem to be a few shells written in Go, but Go only reluctantly supports fork(), because fork() interacts badly with its threaded runtime. Python is actually closer to what you want than Go, but I think it will end up having problems with signal handling, because the Python interpreter does its own signal handling. The prototype of my shell is written in Python [1].

Garbage collection almost always interacts poorly with signal handling. You can't interrupt a garbage collector at an arbitrary point.

I wanted to bootstrap my shell [1] with Lisp -- and I hacked on femtolisp because Julia is bootstrapped in exactly this manner. And for a while I thought about using an OCaml front end.

But in the end I settled on C++ (I wrote 3K lines of code, which I plan to return to after my prototype is done). C++ lets you do fork(), exec() and handle signals exactly as in C, but it actually has useful string abstractions!

C++ has a lot of problems, but it appears to be the best tool for this job.

[1] https://github.com/oilshell/oil/


I think that a shell is doable using my TXR Lisp as a base.

As far as signal handling goes, it's in the box. See this Rosetta Code example under the task "Find limits of recursion" where we catch a SIGSEGV using a TXR Lisp lambda (running on an alternate stack via sigaltstack):

https://rosettacode.org/wiki/Find_limit_of_recursion#TXR

Handling a SIGINT:

https://rosettacode.org/wiki/Handle_a_signal#TXR

Programming reference:

http://www.nongnu.org/txr/txr-manpage.html#N-010CFD89

You don't have to worry about signal handlers and garbage collection.

If you need to do something in C++, the TXR code base will support you. Though it is C, it all compiles in C++, which I check for regressions every release. (At this time, I won't accept patches which break away from C compatibility and cause C++ to be required, but for experimenting and forking off your own thing, there it is.) Just ./configure cc=g++ and off you go.

TXR also integrates a nicely hacked version of Linenoise that could be used to bootstrap the shell command prompt. I rewrote the multi-line support so that it is excellent. There is visual cut/copy/paste, paren matching, undo, and more. (It's the only part of TXR without proper Unicode support, unfortunately; that's on the TODO list.)


Yes, interesting project, and I think it would work. But femtolisp would work fine too... In the end I decided not to use a Lisp because it wasn't making me more productive. It felt a bit unstructured.

Instead, it looks like I will generate a lot of C++ from Python to control the complexity (and line count).

For example, right now I'm working with Zephyr ASDL, which is basically a DSL for algebraic data types which can bridge Python and C++. You can do this in Lisps too of course, but Python works just as well in this case. Julia is interesting because the lexer and parser are in femtolisp, and that enables Julia macros.
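As a rough illustration of generating C++ from Python, here is a toy sketch. It does not use ASDL or its syntax: the `SCHEMA` dict, the `emit_cpp` function, and the naming scheme of the generated structs are all hypothetical, chosen only to show the shape of the idea (one description of a sum type, C++ declarations printed from it).

```python
# Hypothetical schema: type name -> {variant name -> [(C++ type, field name)]}.
# This is plain Python data, not ASDL syntax.
SCHEMA = {
    "word_part": {
        "Literal": [("std::string", "text")],
        "VarSub": [("std::string", "name")],
    },
}


def emit_cpp(schema):
    """Return C++ source declaring a tag enum and one struct per variant."""
    lines = ["#include <string>", ""]
    for type_name, variants in schema.items():
        tags = ", ".join(variants)
        lines.append(f"enum class {type_name}_tag {{ {tags} }};")
        lines.append("")
        for variant, fields in variants.items():
            lines.append(f"struct {type_name}_{variant} {{")
            lines.append(
                f"  static constexpr {type_name}_tag tag = {type_name}_tag::{variant};"
            )
            for cpp_type, field in fields:
                lines.append(f"  {cpp_type} {field};")
            lines.append("};")
            lines.append("")
    return "\n".join(lines)
```

Calling `print(emit_cpp(SCHEMA))` writes the declarations to stdout, so the C++ line count stays a function of the schema rather than something maintained by hand.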

If I'm reading your page right, Python handles signals the same way -- it receives the signal in C, and then runs a Python handler later on the main interpreter loop. I think you pretty much have to do it that way.

(But Python has some logic about turning certain signals into exceptions on the main thread, which I don't want to bother with.)
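A small sketch of what that looks like from the Python side, assuming CPython on a POSIX system. The function name `demo_deferred_handler` is made up; the handler records whether it ran on the main thread, which is where Python-level handlers are executed once the interpreter gets around to them.

```python
import os
import signal
import threading
import time


def demo_deferred_handler(signum=signal.SIGUSR1):
    """Send ourselves a signal and report how the Python handler ran."""
    seen = []

    def handler(sig, frame):
        # Runs as ordinary Python code between bytecodes, not inside the
        # C-level signal handler, so appending to a list is safe here.
        on_main = threading.current_thread() is threading.main_thread()
        seen.append((sig, on_main))

    previous = signal.signal(signum, handler)  # must be called on the main thread
    try:
        os.kill(os.getpid(), signum)
        # Give the interpreter a chance to notice the pending signal.
        deadline = time.monotonic() + 1.0
        while not seen and time.monotonic() < deadline:
            time.sleep(0.01)
    finally:
        signal.signal(signum, previous)  # restore the old disposition
    return seen
```

The list comes back with one entry for the signal, flagged as having run on the main thread.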

I wrote on some related topics here:

http://www.oilshell.org/blog/2016/12/05.html


TXR has deferred signals as well as async ones. Both Rosetta examples show async signals going off: the lambda is called in the signal handler context. For the SIGSEGV catch, this must be so: the lambda executes on the alternate stack arranged with sigaltstack.

There is a flag which enables or disables async signals. In the case of a CPU exception signal (SEGV, BUS, ILL, FPE, ...), we ignore this flag and just do async. Other signals respect the async flag. The async flag is almost always off; async signals are enabled in various places in the library when blocking system calls are being made. So for instance an async signal won't go off during garbage collection; the internal handler will see that async signaling is disabled, and arrange for a deferred call at the next opportunity (unless it is a CPU exception).
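The policy described above can be modeled in a few lines. This is a toy simulation in Python, not TXR's code and not its API: the class `SignalGate`, its method names, and the use of strings for signal names are all invented for illustration, and no real signals are involved.

```python
from contextlib import contextmanager

# Signal names standing in for CPU exceptions in this model.
CPU_EXCEPTIONS = {"SEGV", "BUS", "ILL", "FPE"}


class SignalGate:
    def __init__(self):
        self.async_enabled = False  # off almost all of the time
        self.pending = []           # signals waiting for a safe point
        self.log = []               # (signal, mode) in the order handlers ran

    @contextmanager
    def async_allowed(self):
        """Enable async delivery, e.g. around a blocking system call."""
        saved = self.async_enabled
        self.async_enabled = True
        try:
            yield
        finally:
            self.async_enabled = saved

    def deliver(self, sig):
        """Called from the (simulated) low-level handler."""
        if sig in CPU_EXCEPTIONS or self.async_enabled:
            self.log.append((sig, "async"))   # run the handler right now
        else:
            self.pending.append(sig)          # e.g. we are in the middle of GC

    def safe_point(self):
        """Run deferred handlers at the next opportunity."""
        while self.pending:
            self.log.append((self.pending.pop(0), "deferred"))
```

Usage: `deliver("INT")` with the flag off only queues the signal until `safe_point()` is called, `deliver("SEGV")` runs immediately regardless of the flag, and `with gate.async_allowed(): ...` marks the region where an ordinary signal is handled on the spot.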



