An Introduction to ARM64 Assembly on Apple Silicon Macs (github.com/below)
300 points by udev4096 10 months ago | 83 comments



It's a great introduction. I used it a lot when trying to work out a collection of sample introductory assembly programs for Silicon. I only share as it's on topic:

https://github.com/jdshaffer/Apple-Silicon-ASM-Examples


May want to review https://developer.apple.com/documentation/apple-silicon/cpu-... for additional details…


I strongly second the recommendation! Apple's CPU optimization guide is great, and not just for their own CPUs but for anyone interested in ARM64 (ARMv8, AArch64, whatever you call it) in general. It's one of the best-written manuals I've read on this topic (admittedly only a few), with great visualizations, and it should be accessible even to a person with little low-level knowledge.

(The original comment does not mention it, but to be specific, this is the document in question: https://developer.apple.com/download/apple-silicon-cpu-optim...)

Should you want to play with SIMD but are a little intimidated: Swift and C# offer convenient "platform-agnostic" SIMD abstractions, and C# also has NEON/AdvSimd intrinsics in the form of "plain" API calls, e.g. `AdvSimd.AddPairwiseWidening`, for more direct control. (I'm biased on this subject: while I like Swift, using Xcode and the surrounding tooling is sad and less convenient, and the support for Linux/Windows is not there yet.)
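
For readers who want to see what a "platform-agnostic" SIMD abstraction looks like, here is a minimal sketch in C rather than Swift or C#. It assumes a GCC- or Clang-compatible compiler, since it relies on their vector extension (`vector_size`), and it assumes `float` is 4 bytes. The names `float4` and `mul_add_f32` are made up for illustration; this is not the Swift or C# API mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* GCC/Clang vector extension: four floats in one 16-byte vector type.
   The compiler lowers operations on it to NEON, SSE, or scalar code. */
typedef float float4 __attribute__((vector_size(16)));

/* out[i] = a[i] * b[i] + c[i], four lanes at a time.
   n must be a multiple of 4 (hypothetical helper, illustration only). */
static void mul_add_f32(const float *a, const float *b, const float *c,
                        float *out, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        float4 va, vb, vc, vr;
        /* memcpy sidesteps alignment and aliasing assumptions. */
        memcpy(&va, a + i, sizeof va);
        memcpy(&vb, b + i, sizeof vb);
        memcpy(&vc, c + i, sizeof vc);
        vr = va * vb + vc;              /* element-wise on all four lanes */
        memcpy(out + i, &vr, sizeof vr);
    }
}
```

The same source builds for ARM64 and x86-64 without naming a single intrinsic, which is roughly the trade-off being described: less control than calling NEON intrinsics directly, but nothing architecture-specific in the code.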


Have you done anything interesting with what the guide has shared? I see people talking about how it's amazing but... where? in what?


Unfortunately no, not after I read it at least. But I wish it had been there a year or so earlier; I would have preferred it to reading ARM's SIMD&FP documentation. It mostly helped me better understand ARM's strided SIMD loads and stores (scatter/gather) and shuffles, verify previous data from https://dougallj.github.io/applecpu/firestorm.html, and improve my overall mental model. It was also just pleasant to browse through with all the visualizations.
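
As a rough illustration of a strided load, here is a scalar C model of a two-way de-interleaving load, which is what the AArch64 `LD2` instruction does for four 32-bit lanes: even-indexed elements land in one register and odd-indexed elements in the other. That this is the kind of load meant above is an assumption, and `vec4_u32` and `deinterleave2` are hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { LANES = 4 };

/* Hypothetical stand-in for one 128-bit vector register. */
typedef struct {
    uint32_t lane[LANES];
} vec4_u32;

/* Scalar model of a 2-way de-interleaving load: reads 8 consecutive
   values and splits them by stride 2 into two "registers". */
static void deinterleave2(const uint32_t *src, vec4_u32 *even, vec4_u32 *odd)
{
    assert(src != NULL && even != NULL && odd != NULL);
    for (size_t i = 0; i < LANES; i++) {
        even->lane[i] = src[2 * i];      /* elements 0, 2, 4, 6 */
        odd->lane[i]  = src[2 * i + 1];  /* elements 1, 3, 5, 7 */
    }
}
```

A typical use is an array of interleaved (x, y) pairs: after the load, one register holds all the x values and the other all the y values, ready for lane-wise arithmetic.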



Nice table of contents.


While I know Apple isn't exactly stopping this sort of knowledge discovery and distribution, I don't know why they aren't helping it, given that any increased demand for their hardware would just result in more hardware sales.


All this isn’t exactly a secret. ARM maintains and provides extensive documentation and so does Apple. Is there anything specific you think is being hidden or obfuscated in the documentation?


There are a lot of undocumented parts of the Apple CPUs, for instance AMX. All such undocumented features can normally be exploited only by the libraries and applications provided by Apple themselves, but not by the applications and libraries written by other parties, which are disadvantaged.

This is the same mechanism by which Microsoft has eliminated the competition for Microsoft Office, which used undocumented Windows APIs so that the products of any other vendor could not keep up with it, especially after the launch of any new Windows version.

Now one can find some more complete documentation for the Apple CPUs as the result of reverse engineering work done by various people, but after each introduction of a new Apple CPU model the reverse engineering work may need to be done again.

Examples:

https://github.com/name99-org/AArch64-Explore

https://github.com/corsix/amx


Probably because they don't want anyone to depend on AMX, and they want to be free to remove it or change it in the future. On the M4 for example AMX features are accessible thru SME, which is an official ARM extension.


AMX and SME are independent


Do they really not share any execution hardware?


I'm sure the register file and execution units are shared to some extent


Anything capable of fast C interop (so no Go and Java for you, good riddance) is free to use Accelerate. The reason Apple went with AMX first was that SME was not ready at the time, and they did want to have that. Once SME became available, they readily exposed it, as can be seen in M4, using the same hardware blocks underneath.

I'm not here to defend other anti-competitive practices by Apple but as far as just their CPUs go, there are none in that area.


Apple aren't allowed to publicly support unofficial extensions of the ARM ISA.


Apple writes the libraries for you to use AMX. They aren’t giving themselves preferential treatment here.


If that's the case, then why does the GPU portion have to be reverse-engineered for Asahi Linux? Of course I knew about the ARM portion; there are lots of ARM chips licensed by ARM Holdings, so it's not exactly a secret. But the "Apple silicon" chip in its entirety is not completely documented.


Are any competitive GPU architectures any better? I don't think nVidia, AMD, Intel, nor PowerVR openly publish the internals of their graphics products either.


AMD and Intel publish detailed GPU documentation.



The API for programming the GPU is Metal.


Peripherals are not the ISA or CPU architecture: they are usually made by numerous parties.


Apple has designed their own GPUs since they stopped using PowerVR with A11


What does that have to do with ARM64 assembly? The ISA and CPU architecture are orthogonal to all peripherals.

These peripherals are accessed with memory-mapped IO using the same instructions any other program uses.

Documentation about ARM64 assembly shouldn't and doesn't contain specific peripheral access info. ISA docs contain info common to all CPUs implementing the spec.


Apple literally published a 200 page guide on arm64, which I linked above, and they've contributed low level optimizations and tunings to NumPy, Embree and Blender, to name a few projects.

They clearly are helping. Perhaps you've not noticed?


They're not publishing any docs about the GPU portion. See the rest of the thread currently. They give you the Metal API (only designed for Apple OSes), and that's it.

In contrast, AMD and Intel publish GPU documentation.


They only care about sales from consumers they can lock-in and control.


Sure. That's pretty acidic wording, but I think it's fair to say they want more consumer market share and lock-in helps that.

The original post's point was that by being more open they would encourage more software to be built for their platform. That would create more demand for their products from consumers.


wow, I take time out of my day to write a clever pun and you flag comment? You're sure abusive.


Might have been interesting, but adding a Linux dependency looks like a really strange choice. And why does it refer to GCC when the Mac toolchain is clang? This feels all wrong.


The repo is meant to be used in conjunction with the book Programming with 64-Bit ARM Assembly Language: Single Board Computer Development for Raspberry Pi and Mobile Devices. The primary learning material is the book; the repository suggests adaptations for going through exercises using Apple Silicon.


This repo is written to be read alongside an existing book; the book was originally written with Linux in mind. The first chapters of the repo describe what’s different on Mac, particularly how to use the Xcode/Clang toolchain instead of the instructions in the book.


I type gcc to compile things all the time. Apple helpfully made an alias. I know some people will be mad that "it's not the tool I asked for", but personally I'd rather it works for the 99% of cases where it doesn't matter.

  % gcc --version
  Apple clang version 15.0.0 (clang-1500.0.40.1)
  Target: arm64-apple-darwin23.3.0
  Thread model: posix
  InstalledDir:   /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
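
For the minority of cases where it does matter, the compiler can be identified from inside the code with predefined macros. This is a minimal sketch; `compiler_name` and `describe_compiler` are hypothetical helper names. The one non-obvious detail is that Clang also defines `__GNUC__` for compatibility, so `__clang__` has to be checked first.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns which compiler family actually built this file.
   Clang defines __GNUC__ too, so test __clang__ before __GNUC__. */
static const char *compiler_name(void)
{
#if defined(__clang__)
    return "clang";
#elif defined(__GNUC__)
    return "gcc";
#else
    return "unknown";
#endif
}

/* Writes "built with <name>" into buf; returns snprintf's result. */
static int describe_compiler(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "built with %s", compiler_name());
}
```

Compiled through the `gcc` alias shown above, `compiler_name()` would report `clang`, since the alias only changes the command name, not the compiler behind it.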


"All of these registers are yours, except X18. Attempt no landing there."



It was a reference to Arthur C. Clarke's book, "2010: Odyssey Two" https://www.goodreads.com/quotes/992255-all-these-worlds-are...


What happens if you do land there?


Undefined Behavior - anything from "well, that was harmless" to "your computer restarted because of a problem."


There is probably no support for kernel development on macOS, which may be the reason, but I don't see any mention that GCC can be installed on macOS with, for example, brew.


Very interesting. Does anyone know any similar resources for non-apple ARM64?


If you just care about the instruction set, there's a long series by Raymond Chen about the AArch64 instruction set. This is the first post: https://devblogs.microsoft.com/oldnewthing/20220726-00/?p=10...


Honestly no reason the majority of this wouldn't apply to "non Apple" arm64. It's sort of like being worried about AMD x86 vs Intel x86. It mostly doesn't matter, except in some cases where it does, but you don't set out to say "I don't want to learn Intel x86, only AMD x86"

You learn arm64, then you can worry about if you need to deal with implementation specific ISA quirks.


Hopefully this tinkering can be redirected to Framework laptops when they release their Snapdragon X-based models.


This is just “how to write ARM assembly”. There’s not much special that would require the existence of this to help with Snapdragon processors.


I think the ARM platform is super cool and I’m glad Apple adopted it, but at the end of the day if the OS is closed source, I’m not that interested in hacking on the instruction set it runs on.

In fact the only reasons I’m on a Mac rather than Linux are because 1) they make the best consumer laptops by far and 2) I’m stuck in the ecosystem.


> I’m not that interested in hacking on the instruction set it runs on.

This makes zero sense, frankly. Do you also feel this way about an x86 laptop, that x86 isn't worth learning anything about? Because it might run macOS? It's pointless. The main thing tied to the OS is the userspace ABI (e.g. callee/caller saved registers, parameter passing), but this is generally only one important-but-small part of actually using the CPU, or doing low-level optimization, and every major operating system (including macOS!) tends to have quite detailed ABI documentation for all the supported architectures. Actual fundamentals of the ISA, the supported instructions, etc all translate between operating systems cleanly, and for a given microarchitecture the performance characteristics will also broadly translate between operating systems. But I sort of doubt you're talking about low-level Apple Silicon-specific uarch optimizations, because that's not what the original guide is talking about.

Like, if you don't already know how to write ARMv8 assembly, macOS is not what's stopping you. You can take most of this knowledge, which is fundamentally generic, and just as easily apply it to an RPi for example, or the new Snapdragon X Elite, etc. macOS is mostly a non-issue in this regard, it just happens to be the operating system "of choice" for some of the best client-oriented ARM processors you can buy right now.


I thought Apple open sourced parts of it? "Darwin"?

https://github.com/apple-oss-distributions/distribution-macO...


They allow other OSes to be used, and notably Asahi Linux is making great strides there. Is that something you are interested in?


>1) they make the best consumer laptops by far

Very debatable. That's just your opinion.


>>1) they make the best consumer laptops by far

> Very debatable. That's just your opinion.

Sorry, no, I'm not handholding PC manufacturers anymore. When MacBooks were Intel based there were trade-offs: you were buying a laptop that would overheat and underperform for the spec. The keyboards were iffy, to the point where they removed essential keys, etc.

Now, there isn't a better laptop that's an all-rounder for 95% of people. They are fast, cool, do not compromise on display quality or audio quality, afaik they're the only laptop manufacturer giving you the full 40Gb/s out of each and every USB4 port, and they optically seal them (and always have), making USB-Killers ineffective.

They are stupid expensive for upgrades, this is true, however the comparable systems (XPS/Precision, Elitebook, Thinkpad X) are all within spitting distance of the price and still have significant compromises.

PC manufacturers need to do better.


They have nearly zero incentive to do better. Most PC users simply do not care. They are not really changing the world either, just gaming or plopping figures into Excel.


I recently purchased an M3 Max MacBook with a ton of RAM. It was expensive, but I love that I can run various containers, Emacs, Slack, and more, while never swapping a single byte. More importantly, I can have all of these containers running yet have battery that lasts a whole day. I usually charge my laptop once every two days. Most other laptops I've used would last 2-3 hours under the same workload, and produce tons of heat and fan noise when compiling software.

I will concede that some aspects of the laptop are pretty bad. For instance, the keyboard is subpar compared to even the cheapest mechanical keyboards. The trackpad seems overrated. On desktops, I always use an ergonomic trackball mouse, so trackpads and even "regular" mice put my hand in relatively awkward position and I can't keep it stationary. Something that isn't mentioned often is that the mini-LED displays have considerable bloom, especially compared to the previous generation of MacBooks that used IPS displays.

Personally speaking, the pros of this machine vastly outweigh the cons I've just listed.


Mac laptop quality is just so good. Granted, I DID enjoy OS X back when Jobs was still alive and imposed his vision. But the hardware is still, like back then, rock solid. I have yet to find another manufacturer with the same build quality. If you know a product that can compete, please let me know, as I would ultimately want to have a Linux setup.


Not very debatable though. Name the many others that have:

As good a trackpad, as good a battery, as good speakers.

Just to mention a few. Lots of other stupidities like their RAM pricing, OS magick etc, but other companies have rarely come close to the hardware.


It's the trackpad for me; no other laptop is comparable afaik.


it was the trackpad for me, now it's the trackpad and the battery life. nobody else comes close on either.


I wonder what exactly sets the Mac trackpad apart from the rest: is it hardware or software? I wonder if there's a way to improve it on other laptops.


A lot of it is software. When the Chromebook guys were just starting out they wanted a Mac style touchpad, but discovered you couldn't buy one on the open market. They had to do a ton of custom work on drivers to try and get close to Apple's work. There's a lot of subtlety to palm rejection, scaling, haptics, etc.


I think it also helps a lot that the OS (or at least a version of it) was originally designed to work with a one-button mouse.


The two-finger tap for right click works great on the various Apple trackpads. Much better than the same gesture on my expensive Dell Precision, where both the two-finger gesture and the click on the right side of the trackpad are unreliable, which is infuriating. I really don’t think the OS supporting one-button mice explains the difference in reliability.


So is it software in the OS, or is it firmware in the touchpad itself? Software in the OS should be very easy to improve, but it seems like it should be very device-specific. Firmware in the touchpad controller is another matter.


I feel like on other laptops everyone is optimizing for specs/price and specs don't include quality of life features like a good trackpad. Sure, it would be nice to have a good trackpad, but chances are your customers are going to buy a cheaper laptop with a worse trackpad from a competitor.


There are some coming out; the LG Gram, for example, has a haptic touchpad.


Name another company that makes debatably the best consumer laptops?


There is no undebatably best consumer laptop for the simple reason that different users have different requirements and priorities.

And for some, things that Apple do are a no go. Like glued parts, limited Linux support, no OLED screens, no post buy upgradability, overpriced RAM upgrades, limited and finicky multimonitor support for most models.

So clearly it is debatable and depends on who you ask.


Yep, everyone has different requirements and things they'll put up with. Personally, the touchpad on my Dell laptop is absolutely terrible, but it's not hard to get a wireless (or wired) mouse and plug that in, so I can have a reasonably-priced laptop that works well with Linux.


> limited and finicky multimonitor support for most models.

I’ve never had an issue, and I regularly plug in one or two at a time. Even with two that lie and say they are the same monitor with the same serial, it still just works. Although only with M1 and beyond.


Yeah and you can't plug in a third. I'd call that limited


I plug in a third just fine.


You have to get a 3999€ Max for that, right?

I threw them all together in my mind and didn't check the display capabilities of the Max MacBook.


I got a Max in the 14” notebook, yeah, but not 3999€ for me at least.


How about I name a company that sold us a laptop that required 7 motherboard replacements, and after the 8th time it crapped out they told us it would cost $1200 to fix it from then on out. We signed up on a class action lawsuit along with tons of other people having the same exact dead motherboard problem, and we won.

The company was Apple.


Curious, was that laptop running with nvidia?

Because the reason Apple really dislikes nvidia was because nvidia sort of lied about the thermal spec (much like intel does, except intel could downclock); and it caused a lot of GPUs to kill their motherboards: https://blog.greggant.com/posts/2021/10/13/apple-vs-nvidia-w...


That was in 2012, right?

Their hardware has got a whole lot better in the past 12 years.

My guess is that they learned important quality lessons from that class action lawsuit too.


We’ve had our share of Dell lemons, too. Bad batches and problematic models happen, that’s life. If we have to go back 10 years to find an example of widespread problem, it’s not that bad.


Yeah, and now you know why Apple hasn't sourced NVIDIA stuff for years now, too.


And NVIDIA is now worth more than Apple, lol.


Now we are splitting development resources on CPUs because of Apple. Wow, talk about the second worst negative externality to the industry (behind M$). At least we got 20 years of Google FOSS before they started turning.

At least with Apple, it's going to stick. They have marketing mind control that will keep people around rain or shine.


You're blaming Apple for Intel's incompetence at low-power CPUs?


I got into Apple because they have a gigantic consumer ecosystem with unmatched integration, and several class leading products, not the ads.


It's rare for a human to explain their own manipulation.

>several class leading products

See.

I mean, 'class leading', like Nintendo leads, pick something no one cares about and peer pressure people into caring.


I can't believe Apple invented implementation-specific ISA quirks. The nerve!



