Hacker News
Ask HN: Running Hashcat on the M1?
10 points by maCDzP 48 days ago | 11 comments
There’s a lot of talk about how fast the M1 is. I am curious to know if anyone has tried it with Hashcat?

Hashcat is a SIMD-vectorized program.

I'd expect the M1 to be terrible at it, with 128-bit vectors and only 4 big cores.


AMD has 256-bit vectors and 8 or 16 cores. Intel has fewer cores (10 or so) but 512-bit vectors on workstation processors. Consumer Intel chips (laptops or desktops) are usually only 256 bits per core, but that's still double the M1.

Finally, GPUs are 32-wide x 32-bits aka 1024-bit (NVidia or AMD NAVI), or 64-wide x 32-bits aka 2048-bit (AMD GCN), and you can see why GPUs do so well on a massively-parallel program like Hashcat. Those 1024-bit or 2048-bit SIMD units are arranged 4x per Compute Unit / Workgroup Processor (AMD) or 2x per SM (NVidia), and then there are 60, 70, 80, 100+ Compute Units / SMs per GPU (depending on chip-specific details).
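To put rough numbers on the widths above, here is a back-of-the-envelope sketch that multiplies lanes per vector by vector units by cores. The vector widths and core counts come from the comment; the "one vector unit per CPU core" figure and the 80-SM GPU are assumptions picked for illustration, and clock speed, memory, and instruction mix are ignored entirely.

```python
def lanes_32bit(vector_bits, units_per_core, cores):
    """Total 32-bit SIMD lanes across a chip (width / 32 * units * cores)."""
    return (vector_bits // 32) * units_per_core * cores


# Widths and core counts are the ones quoted in the comment.
# ASSUMPTION: one vector unit per CPU core (real cores often have several).
# ASSUMPTION: 80 SMs, picked from the "60, 70, 80, 100+" range.
configs = {
    "M1, 4 big cores, 128-bit": lanes_32bit(128, 1, 4),
    "AMD, 8 cores, 256-bit": lanes_32bit(256, 1, 8),
    "AMD, 16 cores, 256-bit": lanes_32bit(256, 1, 16),
    "Intel workstation, 10 cores, 512-bit": lanes_32bit(512, 1, 10),
    "NVidia GPU, 80 SMs, 2 x 1024-bit per SM": lanes_32bit(1024, 2, 80),
}

for name, lanes in configs.items():
    print(f"{name}: {lanes} lanes")
```

Under those assumptions the GPU ends up with hundreds of times more lanes than the M1's big cores, which is the gap the comment is pointing at. It is a lane count only, not a hash-rate prediction.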

You're comparing the SIMD capabilities of the M1's CPU to its competitors, when modern hashcat has long been 100% OpenCL/CUDA and primarily run on GPUs.

As Apple still implements OpenCL even on the M1's GPU, it's possible to run hashcat on the GPU portion of the M1, and it appears to do reasonably well. It outperforms the low end Quadro GPU in my laptop and appears to outperform at least 9th gen Intel iGPUs by 5-10x as well.

Hmm, I realize it's an iGPU, but...


I'm not very impressed with the 1000 MH/s SHA1 Hashcat result. GPUs have moved forward in recent generations by extraordinary amounts.

Even a low-end NVidia 1660 (which comes in laptop flavors and is a generation old at this point) is pushing 6000+ MH/s on SHA1.
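For a sense of what that gap means in practice, here is a small sketch that turns the two SHA1 rates quoted above into time-to-exhaust for a keyspace. The keyspace (8 characters of lowercase letters and digits) is a hypothetical chosen for illustration, not something from the benchmarks, and "6000+ MH/s" is rounded down to 6000.

```python
def seconds_to_exhaust(keyspace, hashes_per_second):
    """Worst-case time to try every candidate at a constant hash rate."""
    return keyspace / hashes_per_second


# ASSUMPTION: hypothetical keyspace, 8 characters from [a-z0-9].
KEYSPACE = 36 ** 8

# SHA1 rates quoted in the thread, converted from MH/s to H/s.
M1_SHA1 = 1000e6
GTX_1660_SHA1 = 6000e6

m1_minutes = seconds_to_exhaust(KEYSPACE, M1_SHA1) / 60
gtx_minutes = seconds_to_exhaust(KEYSPACE, GTX_1660_SHA1) / 60

print(f"M1:       {m1_minutes:.1f} min")
print(f"GTX 1660: {gtx_minutes:.1f} min")
print(f"ratio:    {m1_minutes / gtx_minutes:.1f}x")
```

The ratio is the same 6x for any keyspace, since only the hash rate differs. What changes is whether the slower number is minutes or days.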

Those are desktop GPUs though, with (I'm guessing) an order of magnitude more power to deal with.

Looking at someone testing laptop GPUs specifically (https://github.com/analsec/hashcatbenchmark), it looks like it's only 3-4x slower than a mobile GTX 1060?

And... also hugely outclassed by even a laptop GTX 1070. Oh well.

I'm still holding to "it's impressive for the size and TDP", even if it's probably not enough to replace "SSH into the workstation at the office to run hashcat" yet.

The 10xx series (Pascal) is 2 generations old, released in 2016 on a far larger TSMC 16nm process.

> I'm still holding to "it's impressive for the size and TDP", even if it's probably not enough to replace "SSH into the workstation at the office to run hashcat" yet.


Jetson Xavier crushes the M1 using only 9 billion transistors. I realize that the M1 has other stuff on it (4 big cores, Neural Engine), but... yeah... the M1 ain't a GPU architecture. It has one, but it's not a "serious" SIMD platform.


M1 has impressive big-core / CPU characteristics though. But Hashcat just ain't where it's promising.

Okay I'll concede the point :P

I guess I'm just thinking it looks "good enough" for a particular use case: running NetNTLMv2 through my usual rules/wordlists on pentests. Intel's IGPs have been good enough to do NTLM in a few minutes for a few years, so being able to crack NetNTLM handshakes locally would be nice.

Nvidia's Tegra stuff is really impressive though. I had a shot at playing with hashcat on a Nintendo Switch a while back, and even though that's several gens old and has very limited VRAM, it did surprisingly well. The newer SoCs must really fly.

I mean... you just don't use 10W solutions to run GPU problems.

Push your wattage up to 40W on the platform, and suddenly you're looking at 10W CPU + 30W GPU, and things start to get interesting. All Apple really needs to do is pair their M1 with an AMD GPU (NAVI 2x is looking decent), and they're set. (Rumor has it that Apple is pissed off at NVidia for some reason, so an Apple x NVidia solution seems unlikely.)

I'm not sure if the M1 has enough PCIe lanes. But hypothetically, a future design would include good I/O capabilities and start to scale upwards.

I’ve asked this but not heard a good response. What is holding Apple back from building its own discrete video cards / eGPUs?

Could it be competitive in this area?

I mean... why doesn't Apple have good aluminum mines to build its own aluminum for all of the chassis it makes?

A company just decides where the scope of their work ends at some point.

I can't vouch for the authenticity of this data, but a Google search shows someone allegedly benchmarking a MacBook Pro: https://v2ex.com/t/729284

By comparison with, say, my GTX 1060 (6GB), it appears to be around 4-5 times slower. Other benchmarks seem to confirm this.

I'm very impressed. My Quadro P600 is around 5 times slower than the M1, all while using more power by itself than the entire system.

Granted, that's a very slow (and a generation old) discrete mobile GPU, but for what it is, I think the M1 makes a fine showing. It greatly outperforms my 8th gen Intel brick that weighs 5 pounds and needs a 110 watt power supply, so I'll probably cave and upgrade.

Not hashcat, which is primarily GPU, but smhasher, which is CPU only.

There it is, as expected, comparable to fast aarch64 phones at 2.5GHz, and faster than 2GHz laptops. NEON is not as fast as AVX2, only comparable to AVX, and has no AVX-512 equivalent. I.e. 2x slower than a Ryzen with 2x higher clocks, but comparable to old desktop CPUs.
