It depends on the definition. You can do better than this if you define a valid C program as anything that passes through the C compiler and generates an executable. Behold the zero-length program:
$ touch a.c
$ gcc -c a.c
$ ld a.o
ld: warning: cannot find entry symbol _start; defaulting to 0000000000400078
The file is marked as executable, so the shell very reasonably tries to execute it by calling some well-chosen member of the exec() family (http://linux.die.net/man/3/exec).
The exec() function then needs to open and parse the file according to the formats it supports, which of course fails since the file is empty.
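To make that failure concrete, here is a minimal sketch in Python (chosen only for brevity; the thread itself is about C). It assumes a POSIX system where `subprocess` calls execve() in the child process. `try_exec` and `make_empty_executable` are names made up for this example.

```python
import errno
import os
import subprocess
import tempfile


def make_empty_executable():
    """Create a zero-length file with the executable bit set and return its path."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    os.chmod(path, 0o755)
    return path


def try_exec(path):
    """Try to execute `path` directly, with no shell involved.

    Returns ("exited", status) if the program ran, or
    ("exec failed", errno) if the exec call itself was rejected.
    """
    try:
        return ("exited", subprocess.run([path]).returncode)
    except OSError as e:
        return ("exec failed", e.errno)


if __name__ == "__main__":
    path = make_empty_executable()
    try:
        kind, code = try_exec(path)
        # Print the symbolic errno name when there is one.
        print(kind, errno.errorcode.get(code, code))
    finally:
        os.unlink(path)
```

On Linux I would expect this to report ENOEXEC ("Exec format error"), which is the condition a C caller sees as exec() returning -1. If the temporary directory is mounted noexec, the error would be a permission one instead.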
Do you simply mean that you expected the shell to validate this, and not try to execute empty files?
Traditionally, if the kernel cannot execute the file, then it is treated as a shell (/bin/sh) script. (Somewhere along the line, #! got added to specify an interpreter other than the shell.) I read POSIX as requiring this <http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu..., so if zsh claims to be a POSIX-compatible shell, that's probably a bug.
In Seventh Edition UNIX, /bin/true is an empty file; it is a shell script that succeeds at doing nothing.
Some later commercial UNIXes are noted to have /bin/true contain nothing but comments containing a copyright notice for that nothing.
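The traditional fallback can be sketched roughly like this, again in Python for brevity. This is an illustration of the idea, not how any particular shell implements it; `run_like_a_shell` is a hypothetical name, and it assumes /bin/sh exists and that `subprocess` surfaces the kernel's ENOEXEC as an OSError.

```python
import errno
import os
import subprocess
import tempfile


def run_like_a_shell(path):
    """Execute `path`; if the kernel rejects its format, rerun it as a /bin/sh script.

    Returns the exit status of whichever attempt actually ran.
    """
    try:
        return subprocess.run([path]).returncode
    except OSError as e:
        if e.errno != errno.ENOEXEC:
            raise  # e.g. permission denied: not our case, so do not mask it
        # Unrecognized format: hand the file to sh as a script.
        return subprocess.run(["/bin/sh", path]).returncode


if __name__ == "__main__":
    # An empty executable file, like the old /bin/true.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    os.chmod(path, 0o755)
    try:
        print(run_like_a_shell(path))
    finally:
        os.unlink(path)
```

With this fallback an empty executable behaves like the old /bin/true: sh reads no commands and exits successfully.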
The particular version of POSIX you linked to (2004) actually forbids the behavior you describe if you read it strictly. [1] defines a text file as "A file that contains characters organized into one or more lines."
This was altered for 2008[2] to "A file that contains characters organized into zero or more lines."
The 2008 version is actually broken, since it contradicts itself -- a file cannot "contain characters" on zero lines.
> a file cannot "contain characters" on zero lines.
I disagree. To me this doesn't mean that a file "contains at least one character", but that files are containers and their contained values are characters. Like most containers in computer science, the set of contained values can be empty, but it's still meaningful to say that it's a container that "contains characters".
> Do you simply mean that you expected the shell to validate this, and not try to execute empty files?
I understand what happens here and why there is an error message in zsh, but I'm surprised by the fact that bash does not signal the error (exec returns -1, after all).
Bash includes logic to parse ELF[1], so I guess that after exec fails it tries to parse the file and has a special case for empty files.
$ ld a.o
ld: warning: -macosx_version_min not specified, assuming 10.7
Undefined symbols for architecture x86_64:
"start", referenced from:
implicit entry/start for main executable
ld: symbol(s) not found for inferred architecture x86_64
Since ANSI/ISO C, a "translation unit" (whatever is left of a file after preprocessing) has to have at least one declaration; a zero length source file won't cut it.
Originally I thought I'd skip mentioning compiling empty files because, unless you link as a separate step, `gcc` will refuse to link the result. I updated the article with a reference to your comment.
$ touch a.c
$ gcc -c a.c
$ ld a.o
ld: warning: cannot find entry symbol _start; defaulting to 0000000000400078
$ ./a.out
Segmentation fault