I grabbed this when it was originally published, but somehow the file name I have is different from the one in this article. Mine is called "carol.c" and I just compiled and ran it on a modern system. The compiler spat out the following warnings:
gcc -o carol carol.c
carol.c:2:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
2 | main(t,_,a )
| ^~~~
carol.c: In function ‘main’:
carol.c:2:1: warning: type of ‘t’ defaults to ‘int’ [-Wimplicit-int]
carol.c:2:1: warning: type of ‘_’ defaults to ‘int’ [-Wimplicit-int]
I guess I got lucky then. This Ubuntu 23.10 install has gcc 13.2.0-4ubuntu3 installed from the repo. Given that the code is pre-ANSI C, and that it won in the category of "Least likely to compile successfully", it's surprising that there are no other issues.
I wouldn't normally run such a radioactive distro, but this laptop is so new that there is no LTS distro that works on it. (AMD Ryzen 7 PRO 7840U w/ Radeon 780M Graphics)
The issue is calling xmas() in main before it's defined. Compiling with GCC on macOS gives the error:
xmas.c:16:5: error: call to undeclared function 'xmas'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
xmas(2, 2, "");
Moving main() to the bottom compiles and executes with the proper output.
This makes me think about kolmogorov complexity. The program here looks like gibberish but produces the desired output, would there be even shorter programs that don't look like they make sense but produce the same output? How would you search for these programs?
#define p printf
int main(){const char *g[]={"A Partridge in a Pear
Tree.\n","Two Turtle Doves, and","Three French Hens,","Four
Calling Birds,","Five Gold Rings,"," Geese-a-Lay"," Swans-
a-Swimm","t Maids-a-Milk","e Ladies Danc"," Lords-a-Leap","
Pipers Pip","Twelve Drummers Drumm"};
const char *d[{"First","Second","Third","Four","Fif","Six","Seven","Eigh","Nin","Ten","Eleven","Twelf"};for(int i=0,j;i<12;i++){p("On
the %s%s day of Christmas\nMy true love sent to
me\n",d[i],i>2?("th"):"");
for(j=i;j>=0;j--){p("%s%s%s\n",j>4&&j<11?
d[j]:"",g[j],j>4?"ing,":"");}}}
Technically yes, initializing a plain char * from a const char * (from array decay) is a constraint violation (an error the standard requires the compiler to detect and complain about) even in C89. Many compilers will let you off with a warning though.
> However it's always in principle possible for someone to prove that the KC of some particular string is X.
Only for a bounded number of strings. I.e. there is a finite set of string X such that for strings outside of X, you can not prove their KC, even in principle.
This result by Chaitin [1] can be paraphrased as: you cannot prove a 2 kilo theorem with a 1 kilo theory.
I think it's honestly quite hard to know, as it's really (generally speaking, AFAIPK) impossible to directly compute the KC in most cases, only really from the feasibility standpoint we can check that it's slower than some other version/value.
Which is quite exciting, it sets us up nicely for long-running competitions and the like due to the logarithmic-like growth curve (with sometimes some very fun discoveries on the ultra-tail-end of things! <3 :D)
Am currently running a mini-competition with a current prize bounty of $100 (distributed proportionally by % contribution in logspace) for an LLM that can memorize the most digits of Pi by March of next year. Pi is nice as it actually is quite compressible in theory and seeing if a model is able to learn a sufficiently-close-to-the-MDL set of weights that recovers a highly-compressed algorithm from the data would be extremely nice, indeedy!
However, whether this is feasible or not with off-the-shelf models and such is not entirely easy to know, so for now, it's just a digits-memorizing competition, and we shall see where it goes from there!!! <3 :'))))
That page shows the last IOCCC was 2020. But the home page has an update from May 2023: "We do plan to hold a 28th IOCCC." Much like Nethack releases, some things are worth waiting for.
"I remember my last two semesters of university. All the way back to last year!" It's kind of fuzzy being such a long time ago (or maybe all of the beer and other consumables), but if I recall correctly (from last year),...."
I can't tell if you were being serious or just a damn good understated bit of comedy
It compiles (and runs!) fine with just "gcc a.c"; nothing more needed. clang does throw some fatal errors, but there's probably some combination of flags to make it work; I couldn't be bothered to look further.
I've compiled plenty of code from the 80s; I can't recall a single example where it didn't work that wasn't due to OS-specific stuff. Tons of warnings and maybe some special-flags? Sure. But it works.
There's even some modern (well, "modern") code around today that uses K&R style function parameters.
TIL that on macOS, both `gcc` and `clang` are present, but `/usr/bin/gcc` is in fact just a hard link to `/usr/bin/clang`. So I couldn't compile it with gcc (which is clang) on the Mac, but it compiles just fine with gcc on Linux.
It's really astonishing that we can compile 35 years old code with our current tooling. I somehow doubt that we'll be able to compile 35 years old Java/Go/Rust/... code without changes in the futue.
Java is in that age ballpark and code with a similar level of dependencies as this of that vintage works just fine. Java has a solid backward compatibility story. The Java standard runtime also comes with much more than the C library so I’d say it’s better.
That's surprising! But I kind of get why they do that, as so many tools hard-code "gcc" rather than using "cc" or "$CC".
I'm not all that familiar with Java or Rust, but Go is very compatible; it's hard to predict the future, but it's already a third on its way to "35 years of compatibility".
JavaScript is pretty compatible as well; crummy JS from the 90s typically still works today (arguably it's compatible to a fault, with things like arr[-1] not being fixed).
I've also seen it in some FreeBSD code, but that was quite a few years ago and I don't know to what degree that's been cleaned up – not so easy to grep for, but a quick check in git log shows e.g. https://github.com/freebsd/freebsd-src/commit/e5d0d1c5fbbc and some other things, so it seems there's still some K&R around.
Probably others as well, this is what I happen to know from the top of my head.
It does compile, even with `-std=c17` under GCC. With a few warnings about implicit `int`s, of course. Even `-Wall` only adds a single warning (about !0<t not meaning what you could think it means.)
It's only "compressed" if it was made by compress(1) in France. Otherwise it's just sparkling entropy coding.
For reference:
1490 xmas-with-leading-comment.c
913 xmas-without-leading-comment.c
2357 xmas.out
1297 xmas.out.9.Z
1038 xmas.out.10.Z # actually better than with more bits!
1048 xmas.out.11.Z # compression with 11..16 bits have the same size
319 xmas.out.1.gz # compression levels 1..2 has same size
317 xmas.out.3.gz
307 xmas.out.4.gz # compression levels 4..9 have same size
Note that despite looking hard I haven't found a version of `compress` that supports `-H`, which is referenced and decompressable by gzip. I'm not sure how common it was in the wild.
% gcc xmas.c
xmas.c:2:1: warning: return type defaults to 'int' [-Wimplicit-int]
2 | main(t,_,a)
| ^~~~
xmas.c: In function 'main':
xmas.c:2:1: warning: type of 't' defaults to 'int' [-Wimplicit-int]
xmas.c:2:1: warning: type of '_' defaults to 'int' [-Wimplicit-int]
% wc -c <xmas.c
913
% ./a.out | wc -c
2359
% ./a.out | compress | wc -c
1048
zstd -19 compresses the text of the song to 309 bytes.
To be a fair comparison, though, you'd have to write zstd -d in 604 bytes. I suppose to be REALLY fair, though, you have to count the bytes of code in the compiler itself. A convenient enough implementation of compression could index into the C compiler binary to find the bytes it needs. (For example, my GCC contains "first", "second", and "third" in the binary, which a sufficiently clever implementation could make use of. "Exit on the first error occurred.", "Append a second underscore if the name already contains an underscore.", "Warn about suspicious calls to memset where the third argument is constant literal zero and the second is not.", etc. I didn't check but I doubt turtle doves or maids-a-milking come up that often in the description of warning flags.)
And that comment was clearly written in 1988. I fail to see what's so hard to understand about that and why people feel the need to "prove" it's "wrong".
I don't think people are trying to prove it's wrong. Most of the comments are saying that general purpose technology in 2023 doesn't reduce the file size by as much as the manual job in 1988 did.
This particular entry to the IOCCC was for 1988, those two algorithms didn’t come about for a few more years after that, gzip in the early nineties and 7z later that decade. The note is probably correct when comparing against the state of the art at the time.
The program that produces the song, without the introductory comment, is 913 bytes, as presented in the article. Removing whitespaces it uses just 800 bytes and produces the song which is 2359 chars here. The whole C is:
Since the word 'compressed' is in quotes they are probably suggesting that they mean when processed by the 'compress' command as available on UNIX at the time, as opposed to some other compression available at the time.
The program was written in 1988. I ran the text through LZSS which was published in 1982, so was available before 1988. I used a 1989 public domain version by Haruhiko Okumura, which is after 1988, but I don't believe it is optimized to improve upon the compression level of the 1982 algorithm.
It took it from 2357 bytes to 534 bytes, which is smaller than the Xmas.c program which I counted as 917 bytes, but another poster counted 913 bytes.
#include <stdio.h>
#define o putchar
#define p main
#define q(a) return a;
#define r ) {
#define s {
#define t for
#define u if
#define v else
#define w while
char x="a partridge in a pear tree.\ntwo turtle doves\nand three french hens, four calling birds, five gold rings;\nsix geese a-laying, seven swans a-swimming,\neight maids a-milking, nine ladies dancing, ten lords a-leaping,\neleven pipers piping, twelve drummers drumming, ";
char y[]={"first", "second", "third", "fourth", "fifth", "sixth", "seventh", "eigth", "ninth", "tenth", "eleventh", "twelfth"};
int p() s
int i=0, j, k, l;
t(;i<12;i++) s
printf("On the %s day of Christmas my true love gave to me\n", y[i]);
j=0, k=0, l=0;
t(;x[j];j++) s
u(x[j]==' ' && (l==i || (l<i && x[j+1]=='\n'))) s
u(!k)r k=1; o('and '); }
v u(x[j]!='\n')r o(x[j]); }
v r k=0; l++; }
}
o('\n');
}
q(0)
}