Assembly Language: Still Relevant Today (wilsonminesco.com)
135 points by ingve on Nov 6, 2015 | 75 comments

Tangent: I wish more (any?) high-level languages were capable of expressing self-modifying code safely. A lot of values are variable at compile time but constant within a significant block of computation at runtime. If compilers were designed to emit self-modifying code, they could propagate those constants through the block and replace register/register instructions with register/immediate instructions, freeing up registers for other variables. This is the promise of JIT compilers, but in practice the popular JIT-compiled languages have too many other performance issues to be attractive for serious CPU-bound work. Plus the JIT has to do a lot of analysis at runtime. It would be cool if you could mark a block of C code with `#pragma jit`, making the compiler emit a binary that contains some compiler IR and a peephole optimizer, so the code could be made as tight as possible given the conditions at runtime.

Maybe you know that Common Lisp makes it easy to compile and recompile functions at runtime.


In fact, with the package system ASDF, you can recompile (incrementally) an entire package at runtime. This is a single blocking function call; something like:

(asdf:operate 'asdf:compile-op :my-package)

I have a web project where any request in dev mode triggers a system recompilation only if the git revision of the working tree is new.

(my-handler :did-recompile (when (revision-is-new) (recompile) t))

That recompiles the system including MY-HANDLER, and right after RECOMPILE returns, the new version of MY-HANDLER from the working tree will be active.

The way Common Lisp lets me do that as a ten minute hack feels like living in the future... except of course it's pretty old technology!

Edit: I realize this is not really what you are talking about as "self-modifying code." Having full access to the compiler at runtime is a good first step for doing some of what you want, though. And with S-expressions, constructing and manipulating "ASTs" is ridiculously straightforward.

C# (on the Microsoft stack, probably others, and probably other .NET languages too) lets you generate or read in C# code at runtime, compile it, and run it through the JIT as desired. It's nearly trivial to do. The Regex classes, for example, generate code at runtime for compiled regular expressions, though I think they emit IL directly rather than C# source.

All the pieces are there.

.NET provides a couple of different ways to dynamically emit IL, but that's not quite the same as being able to emit your own native machine instructions. I think the challenge is that you'd need to work out a calling convention between the JITted code and the emitted native code, at which point you might as well put the native code in its own library function and compile or assemble it with a native tool.

People have done this if you dig around. The basic idea is that you assemble arbitrary code into a byte array, then use an unsafe cast and P/Invoke to execute it. The calling convention for many platforms has been reverse-engineered, though I don't have a link handy. I've tried it before to see whether it works, and you can do it.

Have you ever seen the JITter use SIMD? Having written some test code for the simplest use case (multiply an array of floats by a float constant, IIRC), I have not. C# is a great language, but not for compute-bound problems.

Yes, I have seen that.

Of course the JITter will not perform the same level of optimizations as the full-blown code generator, but then you can just call that if you like.

The new JIT engine does support SIMD [1]. And C# can generate SIMD code using various libraries (I think MS has one, and I know Mono does).

And various people have shown how to do full-blown x86 assembly code generation and execution from within C#, but it's fragile.

[1] http://blogs.msdn.com/b/dotnet/archive/2014/04/07/the-jit-fi...

Actually, I just tried it:

            float[] floats = new float[1000];
            float factor = 1.01f;
            int index = 0;
            while (index < 1000)
                floats[index++] *= factor;

    00007ff8`80a804a9 488d549010         lea    rdx,[rax+rdx*4+10h]
    00007ff8`80a804ae c4e17a1002         vmovss xmm0,dword ptr [rdx]
    00007ff8`80a804b3 c4e17a590524000000 vmulss xmm0,xmm0,dword ptr [00007ff8`80a804e0]
    00007ff8`80a804bc c4e17a1102         vmovss dword ptr [rdx],xmm0

This is in VS 2013, compiled with DEBUG off, prefer x86 off, and 'any cpu' as the target.

Those are scalar instructions, not SIMD instructions. On x86-64 the FPU stack operations are basically legacy-only, and normal floating-point operations are expected to be done with the SSE/AVX scalar instructions. SIMD proper uses the packed variants of those instructions, letting you multiply, say, 4 floats in a single instruction.

Yes, you just need to fulfil three conditions:

- Use the new RyuJIT

- Use a 64-bit CPU

- Use the vector and matrix classes from System.Numerics

The LINQ expression compiler also lets you generate expressions at runtime and compile them; it's an amazingly powerful feature: https://msdn.microsoft.com/en-us/library/bb397951.aspx

Modern processors don't handle self-modifying code efficiently. https://en.wikipedia.org/wiki/Self-modifying_code#Interactio...

> short sections of self-modifying code execute more slowly

Unless you're optimising for size, SMC'ing little pieces is definitely going to be slower. But e.g. in loops that do a few hundred iterations, saving a cycle or two by getting rid of some opaque predicates quickly adds up and the cache/pipeline-flushing overhead (few hundred cycles at most) becomes insignificant.

After all, as the parent comment says, a JIT is basically SMC'ing huge blocks of code, and it's definitely faster than the equivalent alternative of a bytecode interpreter.

Modern branch predictors handle opaque predicates (assuming you mean branches that don't change). Jump instructions in tight, repetitive loops on superscalar out-of-order processors are basically free.

Probably doesn't count as "high level", but you can embed the LLVM compiler backend into your project and compile and optimize code at runtime.

See the LLVM "Kaleidoscope" tutorials from the LLVM docs. They implement a simple toy compiler with a JIT-based REPL. Basically you use the LLVM framework to build a program and then run it through the compiler pipeline and get a callable function pointer to your newly compiled function(s), optimized for the CPU architecture you're running on.

I've seen people use that for compiling regular expressions to LLVM assembly at run time.

> It would be cool if you could mark a block of C code with `#pragma jit`, making the compiler emit a binary that contains some compiler IR and a peephole optimizer to make the code as tight as possible given the conditions at runtime.

Even if that remotely made any sense, which I don't think it does... C code compiles to the end target of all JITs anyway; you just have some if statements, or use pointers to other executable code that you swap around. Not only do you get executable code as efficient as any JIT's, it's constant with no JIT cost. It's called C programming.

When you have variable types (say, you're deserializing foreign data, always fun), typical C code will end up having a huge switch() block to determine the type (expensive, e.g. due to branch misprediction) and do something with it (trivial). Ahead-of-time compilation cannot optimize away the switch(); a JIT can, by evaluating the data fed to the function at runtime.

What the OP wants is a way to get the advantage of ahead-of-time compilation for the 99% of the code that benefits from it, and the advantages of JIT for the remaining 1% (or even less).

However, I'm not sure how feasible that would be.

Selectively using code generation to implement closures over immutable variables could get you part of the way there while allowing a lot of the work to be done at compile time and avoiding the need for a full JIT.

How much it would help performance in practice I couldn't say.

> If compilers were designed to emit self-modifying code

... then the high-level language isn't expressing self-modifying code; the compiler is doing it to optimize the code while preserving an abstract meaning that is free of self-modification.

I can't think of any language that has this as a feature. But there are some nice libraries for assembling machine code at runtime - there was one I used called "softwire" that seems to have disappeared. I just found this one on github: https://github.com/kobalicek/asmjit

I wrote up a small module to do this in C on x86 platforms. I called it a predicate replacer, but it can replace any integer constant. https://github.com/mgraczyk/fast_check_once

When I was at Microsoft, at least among several of the better developers, when debugging an issue, many would just ignore source code and enter a bunch of "u" (disassemble) commands into windbg. This is IMHO a great habit if you're writing C or C++. In many cases it ensures less distance between you and a bad pointer dereference. Your code, my code, source code, or none - doesn't matter. Just look at the faulting address and see what it expects.

From a certain vantage point, in these languages, if you aren't seeing assembly from time to time, you're not seeing reality.

Consequently, I kind of chuckle when I hear people say they can't debug optimized builds.

Two or three years ago I was doing an informational interview, trying to move from a QA job to a developer job, and I mentioned to the guy that I used assembly code almost every day. He guffawed and said I was full of it, but it was true -- firing up a debugger and stepping through assembly might be pathological on the desktop these days, but when you're investigating a failure on a back-end datacenter machine that's running some interim build that's impossible to find matching symbols or source for, probably over a KVM...

I don't think it's unreasonable to differentiate between reading assembly produced by a compiler and handwriting assembly. I do the former several times a month and the latter almost never.

It's never reasonable to laugh at someone you're interviewing. Too many so-called engineers treat an interviewee just as a captive audience to brag about their own skills.

Did you get a chance to prove it in the interview?

Hell no. I was a dead man walking in that interview. When I was trying to transition from test to dev, half the people I interviewed with didn't even ask me technical questions. I had one interviewer show up half an hour late, spend the next 25 minutes trying to convince me to stay in test, then (literally) give me five minutes to whiteboard a linked-list problem in C++, and his feedback to the hiring manager was that I was lacking coding skills. I eventually got into a dev position, but it was an utterly medieval process. I wish I was kidding, but my take on the whole get-a-dev-job scene right now is that you have to be (a) young, (b) skinny, and (c) the same race as the hiring manager and interviewers (I shit you not). After that comes coding.

An article I wrote a while back, largely about how assembly is not dead:


That project was using a CPU costing less than twenty cents to control the power of a consumer product. Code space mattered a lot, and doing the project entirely in a high-level language would have doubled the part cost. When you're talking quantities in the tens of millions, saving ten or twenty cents on the cost of goods is a Big Deal.

No mention of graphics programming or signal processing? High level syntax just gets in the way when you're trying to pack six pixels into two registers just right. Assembly language is still pretty much the right tool for the job.

Or security, for that matter. Reversing, malware analysis, black-box auditing, &c require, at minimum, the ability to read assembly.

The article is begging the question. I'm unaware of anyone who thinks assembly is not relevant.

There's a ton of people who don't need assembly, including security engineers who work in C with tools like Astree or Frama-C. Then tools such as the CompCert compiler can handle the rest. ;)

Blackbox and reversing, though, certainly need or benefit from assembly, depending on what one is analysing.

There's a big difference between "assembly is not relevant for anyone" and "assembly is not relevant for everyone". I suspect few people hold the former view and a bunch hold the latter, but shortened both sound the same.

This is pretty much it. The last time I touched assembly was to write some string routines for Delphi 5, as the included ones were horribly slow; that was 16 years ago.

Since these days I'm a web developer, I'm at a level in the stack where using assembly would be alien, and frankly I doubt I could write it anyway without some serious study.

Talk to a data scientist.

Or from my experience lately, talk to javascript programmers.

They call JavaScript the "assembly" of the web; yeah, certain subsets of programmers are interesting in their beliefs.

... Said no-one ever ;-)

There are a lot of engineers out there who work on higher abstractions. If the task is to write a MapReduce or Spark job, for sure, most of the time I don't think you need to know what happens with your application at the assembly level.

If students count, the other day I heard some griping about how they had to learn assembly even though "we're never going to use it".

This is pretty sad, but a lot of students tend to be interested only in whatever is applicable to the job market of the present day.

However, even students do need assembly language when they take their computer architecture, operating systems and compilers classes. And some students will probably end up doing compiler or operating-system work professionally at some point in their careers.

Assembly language is still a very valuable skill, although few people will ever write any by hand. But reading assembly code, as well as writing programs that emit assembly code, will always stay relevant.

To be fair, students are mostly learning MIPS32. It's a 'language' great for teaching concepts while preserving the students' sanity. The fact that I have to run an emulator of an effectively dead-for-consumer-use architecture to write labs hurts student satisfaction. You can't even be cheeky and use interesting, assembly-dominated features like SIMD.

The true masochists continue to x86 in the higher level courses.

Some places are starting to use ARM Thumb which is both reasonably sane and actually usable!

MIPS is alive and well. There is a huge world of software outside of websites, you know.

Where is MIPS still used? As far as I can see, it has mostly been replaced by ARM.

I've heard this but never understood why ARM would be used over MIPS. Cost? Power consumption?

What about all the consumer routers that run on MIPS cores?

Or writing compilers, OS drivers, to add two more.

I know a developer who earns well above £1500 per day writing and maintaining x86 assembly embedded in C++ for finance applications.

That is higher than the going rate for Haskell in finance afaik.

So yes, still relevant.

Is the going rate for Haskell in finance especially high?

I used to work with Haskell in finance, but as an employee not contractor. Money was OK, but not higher than for any other language as far as I can tell.

(I wouldn't be surprised if going rates for Cobol and Fortran were higher than for any sane language.)

Huh. Too bad it's finance. Is there anyone in a non-parasitic industry still paying that kind of money for assembly coding? Maybe I should get back into it. I always loved the low-level world, anyway.

I can't comment on £1500 specifically but there's plenty of (IMO) well-paying demand for low-level dev work out there in the tech industry in general. (Whether you consider the tech industry parasitic is of course another matter entirely…; I've never worked in finance and I do alright.) I don't know about predominantly writing assembly code, but that probably exists, too (SIMD…). Personally, I read a lot more (dis-)assembly than I write.

General advice for deeply technical consulting: you need to have a lot of deep practical experience in a few specific areas and you'll generally need to hit the ground running no matter what project you're dropped into and rapidly start generating corresponding value for the client. And expect to be continuously fighting off offers of permanent employment once you start picking up work.

Furthermore, is any non-parasitic industry paying that kind of money for any coding?

Further furthermore, is any non-parasitic industry paying that kind of money for anything?

Heads of co-ops or well-run businesses (eg Costco)?

I have difficulty believing that, beyond the rare exception, those who earn over £1500 a day are anything more than rent seekers whose staying power is more political than based on work and skill.

If securing a $2300/day job through political means is easy why doesn't everybody do it?

Exactly. It certainly takes strength in the skill of politics. While some are naturals, it's an art and science that's anything but simple. The people on top were often groomed for it by experts (parents or whatever) over literally decades. They also get lots of practice during that time.

That's why I tell people who want to be rich or powerful to master people skills, maintain the right image, go to the right schools, get the right connections, and deal/negotiate their way to the top, manager by manager, boardroom by boardroom.

Oh I agree there.

Finance is the "parasitic" industry that powers every big VC and every tech IPO - the reason that founders can pay programmers as much as they do these days.

I can see the guy in Saudi Arabia saying this now: royalty is a "parasitic" class that powers every big oil deal -- the reason that businesses can pay oil workers as much as they do these days.

(I'm not criticizing your conclusion, just your argument)

That's a good point, thanks.

> Too bad it's finance. Is there anyone in a non-parasitic industry

Oh grow up

Still relevant for that developer? Yes, very much so.

For others?

I can heartily recommend this book for beginners http://www.amazon.com/Assembly-Language-Step-Step-Programmin...

For more (in the x86-64 specific context), see also http://tinyurl.com/x86-64-assembly

I'm a bit surprised by the title of the article. The content was a fun and interesting read. When I read the title I thought to myself: "of course assembly is still relevant!"

Which group of people or school of thought would say that assembly isn't relevant? I'm genuinely curious.

The group of people chasing the javascript framework of the week, writing text editors in javascript.

"writing text editors in javascript"

Just 5-10 years ago, that might have been considered a joke. Now I wonder if you're referring to a specific project. And if it's already deployed as a Droplet. ;)


Oh I've seen plenty of JS stuff. That was the most incredible one I knew of. But a friggin text editor, given that all platforms have several good ones? And with fewer bugs & MB than a browser. I mean, come on...

Google basically rewrote MS Office in Javascript. (And Microsoft rewrote the real MS Office in Javascript.)

Text editors are pretty tame beasts by comparison.

Nice little pissing contest they have going on, haha. I wonder if the rewrite (a) resulted in eliminating inefficiencies in MS Office code to make up for the JS overhead, or (b) led them to the old strategy of telling users to buy better hardware.

Which was it?

Assembly language is for understanding the processor. It is still relevant for that.
