The repository names all look like two terms/words from Dune (harkonnen, mentat, ornithopter, etc.) followed by a number. This would indicate that the account (possibly a GitHub auth/Actions token) has been compromised and then used to create the repository.
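A rough sketch of what matching that naming shape could look like. The word list is just the three terms named above (spelled as in the novels), and the hyphen/underscore separator is my assumption; the real vocabulary and format used by the malware are not known here.

```python
import re

# Hypothetical word list: only the three terms mentioned in the comment.
# The campaign's actual vocabulary is an unknown here.
DUNE_TERMS = ["harkonnen", "mentat", "ornithopter"]

# Assumed shape: term, separator, term, optional separator, digits,
# e.g. "mentat-harkonnen-42". The separator is a guess.
_TERM = "|".join(DUNE_TERMS)
REPO_NAME = re.compile(rf"(?:{_TERM})[-_](?:{_TERM})[-_]?\d+", re.IGNORECASE)


def looks_suspicious(repo_name: str) -> bool:
    """Return True if the name fits the assumed two-terms-plus-number shape."""
    return REPO_NAME.fullmatch(repo_name) is not None
```

A name match alone only flags a repo for a closer look; it says nothing about what the repo contains.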
Why can't GitHub get on the case and just block any repo where the README matches the regex? I thought they'd have learned their lesson the last time it happened.
This malware isn't even trying. Then again it's Microsoft so they're not even trying either.
6 minutes later an HN submission "GitHub blocks your account if you mention X in the README" with a top comment "This is absurd, are they just doing regex matching to check for malware?"
That doesn't really explain why there are a bunch of GitHub repos created as well.
If I remember correctly from Shai-Hulud 2, the attacker exfiltrated creds by posting them in public GitHub repos with a trivially reversible encoding. I believe it was double base64 last time.
I'm assuming the logic there is that every security researcher and company is going to pull and scan those creds for their stuff and their clients' stuff. So the attacker is just 1 of N people downloading it. As opposed to trying to send it to their own machine directly.
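To show how little protection "double b64" gives, here is a minimal sketch using only the standard library. The function names and the sample string are made up for illustration; nothing here is taken from the actual malware.

```python
import base64


def double_b64_encode(data: bytes) -> bytes:
    """Base64-encode twice. This is an encoding, not encryption: no key is involved."""
    return base64.b64encode(base64.b64encode(data))


def double_b64_decode(blob: bytes) -> bytes:
    """Undo double_b64_encode by decoding twice."""
    return base64.b64decode(base64.b64decode(blob))


# Placeholder value, not a real credential.
secret = b"EXAMPLE_TOKEN=not-a-real-credential"
blob = double_b64_encode(secret)
assert double_b64_decode(blob) == secret
```

Anyone who finds the blob can reverse it in two calls, which is the point being made: whoever scrapes the repo gets the creds.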
I think it's more about convenience and bypassing filters: developers are already logged in to GitHub and already have access to create repos and publish code, and firewalls will allow it. Even a fancy HIDS will think the git push is rather normal.
If they have a clue, the attacker still will not download that without using a botnet tunnel or Tor at a minimum.
Note though that these credentials aren't even encrypted using some lightweight ECC to prevent others from capturing them; they're posted in cleartext. Embarrassment might be part of the point.
With HN etiquette in mind, I must make an exception: this is a case where skimming the first parts of the article would help a lot!
The public repo path is just one of four parallel paths, with the goal of getting around any barriers:
The exfiltration component shares its design with the "Mini Shai-Hulud" mechanism from their last campaign, using four parallel channels so stolen data gets out even if individual paths are blocked.
The review is also heavily LLM-inflected, to the point of being distracting.
GPTZero gives it a 100% chance of being AI generated, and I've found that these tools may give false negatives from a well-prompted model, but false positives are rare.
If you are looking to tune your intuition for AI-written text, here's an interesting list of their quirks (ironically provided as a Claude skill for removing those quirks from emitted text):
According to that site, Robert Kennedy's speech on the night Martin Luther King was killed[1] was almost entirely the product of GenAI, as were both of Obama's inaugural addresses[1][2].
By this logic, I'd venture a guess that "AI" was also responsible for some of Shakespeare's most famous lines.
Almost certainly. Someone no-one has ever heard of before driving a hallucinating AI claims to have done what the world's best cryptographers have been unable to do. Just wait a day or two for the first crypto person who notices to pick the claim to pieces.
>Just wait a day or two for the first crypto person who notices to pick the claim to pieces.
We went to cryptographic experts first and published second, after they said it is a very good result and worth publishing. We've given a lot of help for reproducibility: the C and Python programs encode the claims very precisely, and anyone can verify the claims in ten minutes. The bottom line is that you wouldn't have seen this article if cryptographers hadn't seen these results first and liked them.
Edited to clarify, thanks for pointing it out. It wouldn't be responsible for us to only publish when we got to the same stage for SHA-256, since at that point TLS and other certificates would be considered compromised.
They certainly have ambitions – the most recent changelog claims to add "Full PCB design pipeline: schematic capture, routing, DRC, Gerber export, and signal integrity simulation."
It also seems to have a physics engine, a slicer for 3D printing, an embroidery mode, and an entire ecosystem of math crates (https://tang.toys/).
Whether any of that works – or whether it's pure LLM slop – is less clear. I tried to import a trivial STEP file, and it crashed my browser tab [1]. Every commit is co-authored by Claude.
So far, he’s shown incredible productivity (with Claude Code). I integrated his vcad into my toy project here, and it worked on the first try, which is quite impressive for such a young project:
https://github.com/darwin/supex/tree/dev
By definition, you can't interpolate a sample. A sample is a measured value.
What you can do, if and only if you have an exactly repeating signal triggering at the same point within a cycle, is change the delay between the trigger and sample, and repeat. In other words, sample at different times within the same signal (since it's exactly repeating), to build up samples in time, of that waveform, to whatever time resolution you want.
Of course, you're limited to any noise in the trigger, variation in the signal, etc.
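A toy simulation of the stepped-delay approach described above (often called equivalent-time sampling). All numbers and the test waveform are assumptions picked to keep the arithmetic clean; a real scope also has to deal with the trigger jitter and noise just mentioned.

```python
import math

# All values are illustrative assumptions, in integer nanoseconds.
PERIOD_NS = 1000           # the signal repeats exactly every 1000 ns
SAMPLE_INTERVAL_NS = 100   # the ADC can only sample once per 100 ns
DELAY_STEP_NS = 10         # trigger-to-sample delay grows by 10 ns per pass


def signal(t_ns: int) -> float:
    """Stand-in for an exactly repeating waveform (hypothetical test signal)."""
    phase = 2 * math.pi * (t_ns % PERIOD_NS) / PERIOD_NS
    return math.sin(phase) + 0.3 * math.sin(3 * phase)


def equivalent_time_capture() -> list[tuple[int, float]]:
    """Build a fine-grained record from many triggered passes.

    Every point is a real measurement of the signal at its own time
    offset from the trigger; nothing is interpolated.
    """
    points = []
    passes = SAMPLE_INTERVAL_NS // DELAY_STEP_NS   # 10 passes
    per_pass = PERIOD_NS // SAMPLE_INTERVAL_NS     # 10 samples per pass
    for p in range(passes):
        delay = p * DELAY_STEP_NS
        for k in range(per_pass):
            t = delay + k * SAMPLE_INTERVAL_NS     # time relative to trigger
            points.append((t, signal(t)))
    points.sort()                                  # interleave passes by time
    return points
```

With these numbers, a sampler limited to one sample per 100 ns ends up with measured points every 10 ns across the period, at the cost of needing ten repetitions of the signal.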
This is how you can record light moving through your garage [1]!
I understand, but that's my point, it's not interpolated!
The number he's referring to is in units of samples per second. It's not doing interpolation between samples, to achieve a high samples per second, because that's not possible, which is my point. Interpolation results in an imagined value, but samples are measured values.
It would be correct to say that the values between samples are interpolated, but the subject of interpolation isn't applicable for anything mentioned in this comment chain.
Ah you are referring to the 'sps' bit. Ok, but I think the extra sentence is enough clarification of what they mean, even if they're wrong about what the device is doing.
The only time these are interpolating is when they are visualizing; there is no point (hah) in storing interpolated data, since you can generate that whenever you want.
Not the original reply, but I support the correction here. Regardless of how pedantic/nitpicking it seems, I remember getting confused about this a lot when learning digital signal processing, simply because it's really easy to upsample, or to look at an upsampled result and get confused by that.
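For contrast, a minimal sketch of upsampling by linear interpolation, with each output point tagged as measured or not. The function name and the tagging are mine, purely to illustrate the distinction being argued over.

```python
def upsample_linear(samples: list[float], factor: int) -> list[tuple[float, bool]]:
    """Return (value, measured) pairs.

    measured is True for original samples and False for values that were
    computed between them. The computed values add no new information.
    """
    out = []
    for i in range(len(samples) - 1):
        a, b = samples[i], samples[i + 1]
        out.append((a, True))
        for j in range(1, factor):
            out.append((a + (b - a) * j / factor, False))
    if samples:
        out.append((samples[-1], True))
    return out


# Three measured samples become five points; only three are measurements.
result = upsample_linear([0.0, 1.0, 0.0], 2)
```

The output has more points per second than the input, but the sample rate of the measurement has not changed, which is why it is easy to get confused when looking at an upsampled trace.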
The giveaway is that LLMs love bulleted lists with a bolded attention-grabbing phrase to start each line. Copy-pasting directly to HN has stripped the bold formatting and bullets from the list, so the attention-grabbing phrase is fused into the next sentence, e.g. “Potential for abuse Attestation enables blacklisting”
Calling this a "giveaway" is kind of hilarious. LLMs use bulleted lists because humans have always used bulleted lists—in RFCs, design docs, and literally every tech write-up ever. Structure didn't suddenly become artificial in 2023. lol.
https://github.com/search?q=A%20Mini%20Shai-Hulud%20has%20Ap...