I never understood this sentiment. C isn't assembly precisely because there isn't necessarily a direct correlation between the instructions the programmer writes and the resulting machine code. To me, this is as contrived as the idea that JavaScript is "Lisp in Java's clothing".
In what possible sense can it be described as an assembler?
It’s relatively easy to mentally compile unoptimised C, or to look at an unoptimised assembly listing and correlate it with the source, but that’s about as far as the metaphor goes. C is a high-level language.
Only when I’m writing very performance-sensitive code do I actually think “here’s a branch, there’s a load, yonder is a divide”. Most of the time it’s “eh, compiler’ll get it”.
Because C is a very low level language, it imposes very few architectural decisions on the layers above it that could cause impedance mismatches.
For example, you wouldn't want to implement a garbage collector in a garbage collected language; C won't get in your way.
This is also why there are loads of C libraries out there: they can be used with bindings from essentially any other language, which is generally not true for libraries implemented in non-C languages (in particular, C++).
> Because C is a very low level language, it imposes very few architectural decisions on the layers above it that could cause impedance mismatches.
Remember that it also enforces the idea of types, scopes, blocks, functions, the heap and the data stack quite strictly. It might be a very low level language compared to something like Java, but in terms of being unassuming of architectural intent, I can't say that I agree that it is even in the same ballpark as any of the assembly languages I've used.
Indeed, it enforces a function calling convention, which can be an issue if you want to compile a high-level language via C, particularly since C does not guarantee tail-call optimization.
Also, there's no fast way to check for arithmetic overflow in portable C, which limits bignum performance.
Due to these limitations, better portable assembly languages like C-- have been designed; these days LLVM bitcode is an option as well (though it's not actually portable, IIRC...).
I think these days the term "portable assembly" misses the mark. A term that better captures the flavor of C with today's CPUs is "programmable RAM".
C is one of the few languages today that still gives you very precise control over how you use memory. That can be a powerful tool, especially given how important effective cache usage is on modern CPUs.
My point is that if you choose to ignore this characteristic of assembly languages, what exactly is an assembly language? C is a comparably high level language, which affords it things like portability and optimizing compilers. Assembly languages are used for minute and immediate control of the exact implementation.
>Assembly languages are used for minute and immediate control of the exact implementation.
I think the joke/humor/analogy is that C compilers are so good, and C itself is so low level, that you don't actually gain much (if anything at all) by dropping down to ASM now (compared to 10-20 years ago).
I'm not sure what you are saying here. Do you mean to say that it makes no difference to the machine how the code is generated, or are you saying that there is a CPU for which there is "little difference" between its machine code and C?
Out of curiosity to the downvoters, has he NOT written C++ programs? Was his paper in IEEE the other year calling for more static code (let the compiler do its work) and abandonment of the dynamic_cast "let's check at runtime!" development model something he pulled from thin air? Or has he actually written code?
Unless I missed it, the lack of benchmarks, or of any mention of how compile speed compares with the previous iteration, is surprising.
I'm not saying it would be necessarily slower than a compiler written in C, if they wrote the more critical parts in assembler, but you would think compile speed would be one of the outstanding discussion points after a major rewrite.
A later slide mentions that they got it back up -- but not by how much.
It wasn't exactly crystal clear from the docs/website, but apparently[1] "master"
is go1.5 -- so with go1.4 on windows, one can:
# from a git bash, already have go1.4 installed
git clone https://github.com/golang/go.git go.git
cd go.git/src
git checkout master
export GOROOT_BOOTSTRAP=$GOROOT
time cmd "/c all.bat"
real 0m29.078s
user 0m0.015s
sys 0m0.030s
# with our fresh go1.5 (see more below):
real 0m33.034s
user 0m0.000s
sys 0m0.000s
So, apparently doing the same job (compiling go1.5 "master") with go1.4 and
go1.5 -- there's a small hit (I ran a couple of runs with each; the numbers
are consistent to ~1sec, so call it 29s for 1.4 and 33s for 1.5).
Note that if you want to do this, in particular the second part, it gets a bit
hairy, as "go.exe" needs to be in your path, and you need to change your
gopath (easy).
On my system, I had everything in $HOME/opt (%HOME%\opt), so all I did was
move opt\go to opt\go1.4, and copy go.git to opt\go. One can confirm the
right version is being run with 'go version' (and 'cmd "/c go version"' from
bash).
[1] After a few useless hits, I found:
https://godoc.org/golang.org/x/mobile/cmd/gomobile -- which mentions what's
needed to actually test go 1.5. If they want testers, they should probably
add something under "install from source" on golang.org along the lines of
"if you want to live on the edge, or, say, test Go in Go, use master".
(I'd be more confident in this test if someone could point me at a past commit where go1.5 works, builds with go1.4 and itself, but is much (10x) slower...)
> Generate machine descriptions from PDFs (or maybe XML).
> Will have a purely machine-generated instruction definition:
> "Read in PDF, write out an assembler configuration".
> Already deployed for the disassemblers.
(my emphasis)
That, coupled with a whole toolchain in a friendly language like Go, makes me excited about how this might be used by other language designers. While "compile to Go" might not be as attractive as "compile to C", it's not half-bad. More importantly, it smells like part of this toolchain should make it quite easy to generate machine code.
Nothing against llvm, rpython/pypy, graal etc - but the more the merrier!
It's the internal graph representation the compiler uses for programs. LLVM uses SSA form as its main low-level representation. The benefit of SSA is that it enables certain analysis and optimization techniques.
It is basically a simplification that makes optimizations easier by eliminating re-assignment of local variables (where possible; for loops and conditionals, merging values with phi nodes may be necessary). It has been a while since I played with this, but it made CSE (common sub-expression elimination) really easy.
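A toy illustration (the SSA value names in the comments are mine, not from any particular compiler): once every value is assigned exactly once, two occurrences of the same sub-expression become recognizably identical values, which is exactly what makes CSE easy.

```go
package main

import "fmt"

// In SSA form each name is assigned exactly once, so the two
// occurrences of a+b below denote the same value:
//
//   v1 = a + b     // t1
//   v2 = a + b     // t2 -- identical to v1, so CSE replaces it
//   v3 = v1 * v1   // the result
//
// No dataflow analysis is needed to see the duplication; the values
// are structurally equal by construction.
func f(a, b int) int {
	t1 := a + b
	t2 := a + b // a compiler in SSA form reuses v1 here
	return t1 * t2
}

func main() {
	fmt.Println(f(3, 4)) // (3+4)*(3+4) = 49
}
```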
Finally, my moment! I submit my own library, which is really meant for two players and scoring...and like much of what I do, was purely pun-driven. Maybe one day I'll add AI:
Have a look at the Go Text Protocol [1] and SGF [2]. If you add these then the strongest existing AIs can already play on your board.
All that said, you can of course write your own bots, but Computer Go isn't as easy as one might think. In fact, it's still considered a much harder problem than Computer Chess.
The usual way to do it is to define a restricted subset of the language that has explicit memory management and is statically typed, and implement GC with that.
I would assume not. GCs are typically invoked by requesting allocation. Since the GC (presumably) uses lower-level OS facilities to allocate memory, it wouldn't need to recursively invoke itself.
I have to hold the screen in landscape to not get the sides of the slides cropped off, and then I still get the address bar covering the headline after each new slide load, unless I tilt the phone to portrait and back for each slide.
Can someone answer this question: it says that Go 1.4 will be needed to compile Go 1.5. Does this mean that Go 1.4 will always be needed, even for Go 1.6 and beyond? Are they essentially locking things to the Go 1.4 "C" code, then updating on top of that?
Right, but if someone surreptitiously deleted all Go compiler binaries from the world, then we'd have to go back to compiling the C source of 1.4...
Of course, that's the nature of bootstrapping! If someone managed to erase all the software from all the computers in the world, we'd have to go back and find an "Old world" Mac and use the on-die Forth compiler to write a C compiler so we could start compiling things again.
Provided people didn't forget. Recreating from memory would take a fraction of the time. I would go for a Lisp interpreter in assembly, then implement a C compiler in that.
Yes. It means that anytime you want to add a new supported architecture, you'll need to bootstrap through Go 1.4.
Otherwise, I'm guessing that for Go 1.6, they'll rely on you having a binary distribution of 1.5 (probably from your distribution, or their website), etc.
>It means that anytime you want to add a new supported architecture, you'll need to bootstrap through Go 1.4.
That's not true at all: (1) you write the backend for the target machine; (2) you recompile the compiler, with the new backend included, on a supported host; (3) you use the new compiler to compile itself for the target using the new backend; (4) you now have a compiler that runs on the target.
The same process is used to port C compilers (or any self-hosting compiler, really).
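With the Go tree itself, the steps above can be sketched roughly as follows (the target GOOS/GOARCH values are illustrative; bootstrap.bash is the script the Go repo provides for producing a cross-compiled toolchain tree):

```shell
cd $GOROOT/src

# (2) rebuild the toolchain on a supported host, new backend included
./make.bash

# (3) use that compiler to cross-compile a toolchain for the target;
#     this produces a go-${GOOS}-${GOARCH}-bootstrap tree
GOOS=linux GOARCH=arm64 ./bootstrap.bash

# (4) copy the bootstrap tree to the target machine and run it there
```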
Sorry for the noob question, but could someone help me understand how this is conceptually possible?
I get that you can write a program which compiles language X to machine code -- e.g. a Python program that reads Go and emits the necessary assembly would be technically possible.
My question is: how is the compiler itself generated? If it's written in C, it's gcc -o compiler, but what is the piece I'm missing when it comes to Go compiling a Go compiler -- or is this still done as "here's the machine code, now compile the Go runtime"?
I suppose my question amounts to "the chicken or the egg?".
On a related note, while Go at least has Go 1.4 available to allow bootstrapping from a C compiler for the near future (see [1] for more detail), many languages have no easy way to do so. For example, Rust's compiler was originally written in OCaml, and Nim's in Object Pascal, but both compilers were migrated to be written in the respective languages themselves, and AFAIK both languages have since evolved alongside their compilers for long enough that bootstrapping all the way from before said migrations to the present would require a massive chain of newer and newer compiler versions - not an undertaking anyone really wants to carry out.
Instead, any Linux distribution etc. that wants to integrate the language into its build system has to start by importing a binary from somewhere else. In general this isn't a problem, because that binary is used only to compile the compiler from source (and then that compiler recompiles itself, usually), so any bugs in it are unlikely to affect the final product. However, the topic tends to come up of Ken Thompson's famous paper [2] describing a hypothetical scenario where a compiler binary is intentionally backdoored to insert a vulnerability when it's compiling some security-critical program, plus a copy of the same backdoor whenever it detects it's compiling itself. In that case, the backdoor could theoretically infect the final result of a bootstrap, despite it being compiled from pristine source; no attack of that nature has ever been detected in the wild, though.
Easy, you cross compile from Go on a different architecture. Go apparently has a good cross compilation story, though I don't have experience with it; honestly IMO the whole perception that cross compiling is an unusual or difficult thing comes from poorly designed Unix build systems. Though I suppose this would make a fully automated bootstrapping system a bit messier. (Do the first build in an emulator?)
Simple, you need a working version of go1.4 to compile go1.5.
The bootstrapping process can therefore take advantage of the fact that go1.4 will continue to build as it does today, and use that to get a working go installation to build go1.5+.
I understand that the standard build/release procedure for compilers which are the primary compiler for the language they are written in is to:
1. Build the new version of the compiler with the previous version of the compiler.
2. Rebuild the new compiler with the compiler from step 1 (i.e. the one that was built with the old compiler).
3. Rebuild the new compiler once more with the compiler from step 2; that's your official binary.
The initial creation of the language requires somebody to write a compiler for it in assembly or in some other language that already has one, but once you get that first one built, all the rest could be in their own language.
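The three stages read naturally as a build script (the compiler names here are hypothetical; the bit-for-bit comparison at the end is the customary sanity check that the compiler reproduces itself):

```shell
# Stage 1: the previous release builds the new sources.
oldcc -o cc.stage1 compiler.src

# Stage 2: the new compiler builds itself.
./cc.stage1 -o cc.stage2 compiler.src

# Stage 3: rebuild once more. Stage 2 and stage 3 were produced from
# the same sources by the same compiler version, so they should be
# bit-identical if the build is deterministic.
./cc.stage2 -o cc.stage3 compiler.src
cmp cc.stage2 cc.stage3 && echo "bootstrap converged"
```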
a curiosity question. is "all but impossible" correct?
shouldn't it be "all but possible" or "all but easy"?
whenever i see the "X is all but Y" phrase, it makes sense to me that X is not Y. it's everything else but/except Y.
am i thinking correctly or is the use in the slides correct?
It's correct, but so is your line of thinking. These things are not impossible; they are all but impossible. "All" in this context includes things like: requires an expert, is very hard to do, and/or would take many years. It is all of those things, but not impossible -- hence "all but impossible".
aaah, that's why i got confused. i thought it was always one phrase, but it's "all but" and "anything but". now i get it.
let's just say this whole confusion is anything but unclear now. it used to be all but impossible to understand what people meant.
I don't know why I've been downvoted. Maybe my question wasn't clear. Since there is no longer a need for a C compiler in the Go toolset, how will the magic import "C" package work?
Start by implementing, in assembly, as small an interpreter for the language as possible.
Meaning just one form of conditionals, just one looping construct, very few basic types, basic IO.
Then use that bare-bones interpreter for the second stage: writing a real compiler that can process the same basic language plus additional constructs.
Another alternative is to add some bytecode format that can then be used equally for interpretation or as input to native code generation.
Niklaus Wirth originally designed P-code with the intent of making Pascal compilers easier to port. He wasn't thinking of using it as a full OS VM, as the folks at UCSD did with their Pascal dialect.
Another trick is to design the bytecode in such a way that it can be translated directly to native code via a macro assembler. It won't generate fast code, but it provides an easy path to a compiler.
Nowadays the assembly step tends to be replaced by another language that can be found on most platforms, so many choose C.
Not always because it is the best language to write compilers in, but because it is ubiquitous or there is some library the authors want to use -- thus perpetuating the myth, among those not so well versed in compiler design, that all languages require C.
It's a sign of maturity to show that a language can bootstrap itself: Prolog has been written in Prolog; Erlang was written in Prolog first, then later rewritten in Erlang; and as others have pointed out, it's pretty common for C/C++, Lisp, and assembly. This of course applies more to compiled languages.
It would be nice to see benchmarking stats to gauge the significance of the performance problems mentioned in the slides. For users, I'd imagine the ease of developing the Go language is less important than its performance.
It is a program representation that a number of modern compiler optimization techniques rely on. It just means there is potential for the compiler to emit better assembly code.
Because the compiler is now written in Go, compiler optimization will affect how fast programs compile.
The compiler may speed up or slow down: it is doing more advanced code analysis, but it is also generating better assembly. I do not know whether this will cancel out.
• Talk 1: Andrew Gerrand on Go
• Talk 2: Rob Pike on Go
• Talk 3: Aaron Schlesinger, Concurrency Conventions in Go
• Talk 4: Steve Francia, Common Mistakes in Go and When to Avoid Them
I've created my own programming language and they always tell me that the best languages are always self-hosted, so I wrote the compiler in my new language too.
They're all C programmers!