
This guide/book from Inria is pretty good: http://icube-icps.unistra.fr/img_auth.php/d/db/ModernC.pdf

edit-001: this was previously discussed here as well. ref: https://news.ycombinator.com/item?id=9018247




I just skimmed through this. Page 3 says: "Rule 0.1.2.1 C is a compiled programming language."

I never understood the phrase "compiled language", either in the context of C or in general. C interpreters exist. I happen to have an official copy of the ISO/IEC 9899:1999 standard. The word "compiler" appears only once, in a footnote. Other languages have both compilers and interpreters (e.g. Common Lisp, OCaml).


C being a "compiled language" refers to a cluster of concepts which goes perhaps something like this:

* The semantics are simple. For instance, local variables vaporize when a block terminates; compiling a lexical scope in C means not even having to know what "closure" means, and thus not having to deal with a whole class of code generation problems, like optimizing trivial closures and non-escaping ones.

* A correct binary call to a C function can be generated if we just parse and analyze a simple prototype declaration. Very little information is needed to generate the calls to functions in a separately compiled file.

* The declaration of any complete data type specifies how it is laid out in memory and accessed.

* In general, anything which would make compiling difficult is either off limits to the programmer entirely, or "undefined behavior". For instance, functions aren't objects and cannot be manipulated at run time in any portable way. There are almost no introspective features whatsoever. A C interpreter can easily have all sorts of introspection features as extensions, but those are kept out of the language because they would interfere with the concept of it being a "compiled language".

* The type system is static and supports type erasure: most C implementations throw away all type info (except as part of "debug info") when translating C programs. Nothing in the standard requires any type info to be retained. Even though anything can be interpreted, static typing with erasure tends to make languages geared toward compiling.
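A small sketch of the prototype and data-layout points above, in C. The names (`struct point`, `point_sum`, `sum_of`, `point_y_offset`) are made up for illustration and everything is kept in one file for brevity; in practice the definition of `point_sum` would sit in a separately compiled file.

```c
#include <stddef.h>   /* offsetof, size_t */

/* Hypothetical type for illustration; its declaration alone fixes the layout. */
struct point {
    int x;
    int y;
};

/* A prototype is all the compiler needs to emit a correct call;
   the definition could live in another translation unit. */
int point_sum(const struct point *p);

/* Byte offset of member y, a compile-time constant derived from the declaration. */
size_t point_y_offset(void)
{
    return offsetof(struct point, y);
}

/* This caller is compiled knowing only the prototype above. */
int sum_of(int x, int y)
{
    struct point p = { x, y };
    return point_sum(&p);
}

/* The definition, placed last to show that the caller did not need it. */
int point_sum(const struct point *p)
{
    return p->x + p->y;
}
```

Nothing here needs type information at run time: the offset of `y` and the calling sequence for `point_sum` are both settled during translation.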


I always read that as the default behaviour for the language. C is usually compiled, but there are exceptions. Perl (<=5) is almost always interpreted, but there are tools to turn Perl code into an executable. I agree it's a somewhat arbitrary distinction given the above, but when I first learned any programming at all, having it explained this way helped me understand how languages can differ from each other.


>Perl (<=5) is almost always interpreted but there are tools to turn Perl code into an executable.

IIRC those tools (of which similar ones exist for Java, Python, Ruby, Node.js, Lua and so on) just package together the sources (precompiled to bytecode) and an instance of the Perl[5] interpreter.


Here's Java getting compiled to actual x86-64 code (mov $0x2a,%edx; add %edx,%eax).

  $ cat FortyTwo.java 
  public class FortyTwo
  {
          public static int fortytwo(int x) {
          	return x + 42;
          }
  }
  
  $ gcj-4.9 -c FortyTwo.java 
  $ objdump -S FortyTwo.o

  FortyTwo.o:     file format elf64-x86-64
   
   
  Disassembly of section .text:

  <snip>
    
  0000000000000022 <_ZN8FortyTwo8fortytwoEJii>:
  {
          public static int fortytwo(int x) {
          	return x + 42;
    22:	55                   	push   %rbp
    23:	48 89 e5             	mov    %rsp,%rbp
    26:	48 83 ec 20          	sub    $0x20,%rsp
    2a:	89 7d ec             	mov    %edi,-0x14(%rbp)
    2d:	b8 00 00 00 00       	mov    $0x0,%eax
    32:	48 89 c7             	mov    %rax,%rdi
    35:	b8 00 00 00 00       	mov    $0x0,%eax
    3a:	e8 00 00 00 00       	callq  3f <_ZN8FortyTwo8fortytwoEJii+0x1d>
    3f:	8b 45 ec             	mov    -0x14(%rbp),%eax
    42:	89 45 fc             	mov    %eax,-0x4(%rbp)
    45:	8b 45 fc             	mov    -0x4(%rbp),%eax
    48:	ba 2a 00 00 00       	mov    $0x2a,%edx
    4d:	01 d0                	add    %edx,%eax
    4f:	c9                   	leaveq 
    50:	c3                   	retq


I also remember tools that simply converted Perl code to C. The argument against them was that the advantages of an executable would be outweighed by the disadvantage of the transpiled C being very inefficient.


"Transpile" is not a real and meaningful word. Can we stop using it please? Btw, if you have relevant citations stating otherwise (Wikipedia isn't it), I am happy to be proven wrong. Thanks!


Why would anyone want to run C in an interpreter though? Seems kinda pointless other than for educational purposes. I guess compiled/interpreted is arbitrary, but on a practical level most languages are one or the other (I'm not going to count JIT as compiled).


If you have a C compiler written in C and you want to bootstrap it from assembly, then you need either a compiler or an interpreter. Writing a simple interpreter in assembly is far easier.
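To give a rough feel for how small a simple interpreter can be, here is a toy sketch: a reverse-Polish calculator written in C rather than assembly. It is not a C interpreter and not anything from an actual bootstrap chain; `eval_rpn` is a hypothetical name, and malformed input is silently skipped rather than reported.

```c
#include <ctype.h>   /* isdigit */

/* Evaluate a reverse-Polish expression of non-negative integers
   with the operators + - *, for example "40 2 +".
   Toy sketch: unknown characters are skipped, operators without two
   operands are ignored, and an empty stack yields 0. */
long eval_rpn(const char *s)
{
    long stack[64];
    int top = 0;

    while (*s) {
        if (isdigit((unsigned char)*s)) {
            /* Read a whole decimal number and push it. */
            long n = 0;
            while (isdigit((unsigned char)*s))
                n = n * 10 + (*s++ - '0');
            if (top < 64)
                stack[top++] = n;
        } else if ((*s == '+' || *s == '-' || *s == '*') && top >= 2) {
            /* Pop two operands, apply the operator, push the result. */
            long b = stack[--top];
            long a = stack[--top];
            stack[top++] = (*s == '+') ? a + b
                         : (*s == '-') ? a - b
                         : a * b;
            s++;
        } else {
            s++;  /* whitespace or anything unrecognised */
        }
    }
    return top > 0 ? stack[top - 1] : 0;
}
```

The whole thing is one loop with a dispatch on the current character, which is roughly the shape that makes a hand-written interpreter tractable compared with a code generator.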



