
This is a great idea! But it's more about having LLMs give function & variable names than about having an LLM deobfuscate. The (traditional) deobfuscation passes (e.g. unpacking, de-flattening, de-virtualization, etc.) were done by 100% precise, human-written Babel plugins and are totally unrelated to an LLM.

I'd recommend having a "gemm with a twist" [0] example in the README.md instead of having an element-wise example. It's pretty hard to evaluate how helpful this is for AI otherwise.

[0] For example, gemm but the lhs is in fp8 e4m3 and rhs is in bf16 and we want fp32 accumulation, output to bf16 after applying GELU.
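To make that concrete, roughly the reference semantics I have in mind, written as plain NumPy (dtype names via the ml_dtypes package; this is just illustration, not CubeCL code):

    import numpy as np
    import ml_dtypes  # fp8 e4m3 and bf16 dtypes for NumPy

    def gelu(x):
        # tanh approximation of GELU
        return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

    def gemm_with_a_twist(lhs_fp8, rhs_bf16):
        # fp8 e4m3 lhs, bf16 rhs, fp32 accumulation, GELU, then store as bf16
        acc = lhs_fp8.astype(np.float32) @ rhs_bf16.astype(np.float32)
        return gelu(acc).astype(ml_dtypes.bfloat16)

    lhs = np.random.randn(128, 64).astype(ml_dtypes.float8_e4m3fn)
    rhs = np.random.randn(64, 256).astype(ml_dtypes.bfloat16)
    out = gemm_with_a_twist(lhs, rhs)  # (128, 256) bf16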


We don't yet support newer types like fp8 and fp4; that's actually my next project. I'm the only contributor with the hardware to actually use the new types, so it's a bit bottlenecked on a single person right now. But yes, the example is rather simplistic; I should probably work on that once I'm done bringing the feature set up to Blackwell.

Isn't there a CPU-based "emulator" in Nvidia dev tools?

From what I can tell it's not accurate enough to catch a lot of errors in the real world. Maybe an illegal instruction, but not a race condition from a missing sync or a warp divergence on a uniform instruction or other potential issues like that.

Agreed! I was looking through the summation example <https://github.com/tracel-ai/cubecl/blob/main/examples/sum_t...> and it seems like the primary focus is on the more traditional pre-2018 GPU programming, without explicit warp-level operations, asynchrony, atomics, barriers, or countless tensor-core operations.

The project feels very nice and it would be great to have more notes in the README on the excluded functionality to better scope its applicability in more advanced GPGPU scenarios.


We support warp operations, barriers for CUDA, atomics for most backends, and tensor core instructions as well. It's just not well documented in the readme!

Amazing! Would love to try them! If possible, would also ask for a table translating between CubeCL and CUDA terminology. It seems like CUDA Warps are called Planes in CubeCL, and it’s probably not the only difference.

CubeCL is the computation backend for Burn (https://burn.dev/), an ML framework by the same team that does all the tensor magic like autodiff, op fusion, and dynamic graphs.

One of the main authors here: the readme isn't really up to date. We have our own gemm implementation based on CubeCL. It's still moving a lot, but we support tensor cores, use warp operations (Plane operations in CubeCL), and we even added TMA instructions for CUDA.

What if I prefer to have a clone of me doing my coding, and then I throw my clone under the bus and start to (angrily) hyperfocus on exploring and changing every piece to be beautiful? Does this mean I love coding or I hate coding?

It's definitely a personality thing, but that's so much more productive for me than convincing myself to do all the work from scratch once I already have a design.

I guess this means I hate coding, and I only love the dopamine from designing and polishing my work instead of making things work. I'm not sure though, this feels like the opposite of hate coding.


If you create a sufficiently absurd hypothetical, anything is possible.

Or are you calling an LLM a "clone" of you? In that case, it's more, "if you create a flawed enough starting premise, anything is possible".


> flawed enough starting premise

That's where we start to disagree about what the future looks like, then.

It's not there yet, in that the LLM-clone isn't good enough. But amusingly, even a not-nearly-good-enough clone of me has already made me more productive, in that I'm able to deliver more while maintaining the same level of personal satisfaction with my code.


The question of increasing productivity and what that means for us as laborers is another entire can of worms, but that aside, I have never yet found LLM-gen'd code that met my personal standards, and sped up my total code output.

If I want to spend my time refactoring and bugfixing and rewriting and integrating, rather than writing from scratch and bugfixing, I can definitely achieve that by using LLM code, but the overall time has never felt different to me, and in many cases I've thrown out the LLM code after several hours due to either sheer frustration with how it's written, or due to discovering that the structure it's using doesn't work with the rest of the program (see: anything related to threading).


This has been my experience as well, and really leaves me puzzled as to what anyone is gushing about.

Can I just about corral the LLM into producing working output? Yea, sometimes. From a technology perspective, that’s pretty cool!

But is it a productivity boost? Absolutely not. Like not even close. Every time it would have been faster for me to just write the code myself.

I really don’t know how to square the vast gulf between my experiences and many other peoples’.


Meanwhile, in a certain modern OS, unloading a library is broken to the point that people are discouraged from doing so... Try to unload GLib [0] from your process :p

[0] https://docs.gtk.org/glib/


Unloading C libraries is fundamentally fraught with peril. It's incredibly difficult to ensure that no dangling pointers to the library remain when it's unloaded. It's really fun to debug, too. The code responsible for the crash literally is not present in the process at the time of the crash!

`Any` is the correct call.

It could be:

  def f(i=0) -> None:
    if i is None:
      do_something()
    else:
      do_something_else()
Yeah, I know it's retarded. I don't expect high-quality code in a code base missing type annotations like that. Assuming `i` is `int` or `float` just makes incremental adoption of a type checker harder.
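A minimal sketch of what I mean (function names are placeholders): if callers really are expected to pass None, the eventual explicit annotation would look like this, and a stub that had guessed `i: int` would reject perfectly valid calls.

    # hypothetical explicit annotation once the code base adopts typing
    def f(i: int | None = 0) -> None:
        if i is None:
            do_something()
        else:
            do_something_else()

    f(None)  # fine today; flagged if a stub had inferred `i: int`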

No it’s not. The typing system should use the most specific type available, and it’s your responsibility to broaden it if needed. That’s how it works in all statically-typed languages.

> so much from your macbook

At least in the cloud I can actually have hundreds of GiB of RAM. If I want this on my MacBook, it's even more expensive than my cloud bill.


Strangely, I've found the inverse to be true: many backend technologies are actually quite good with memory management and often require as little as a few GiB of RAM, or even less, to serve production traffic. Often a single IDE consumes more RAM than a production Go binary that serves thousands of requests per second, for example.

You can, but if you need it you’re not searching for a product market fit anymore.

There are a lot of examples about SQL in the comments. In the SQL case you want something like:

  def process_template(template: Template) -> tuple[str, tuple]:
    sql_parts = []
    args = []
    for item in template:
      if isinstance(item, str):
        sql_parts.append(item)
      else:
        sql_parts.append("?")
        args.append(process_value(item.value))
    return "".join(sql_parts), tuple(args)
(of course it would be more nuanced, but I hope you get the point)
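For example (assuming Python 3.14 t-strings, and that process_value just transforms the interpolated value), a call would look roughly like:

    user_id = 42
    sql, args = process_template(t"SELECT * FROM users WHERE id = {user_id}")
    # sql  == "SELECT * FROM users WHERE id = ?"
    # args == (process_value(42),)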

Yes that makes sense, thanks.

Also, my comment was about the amount of boilerplate required, but that can be vastly reduced by writing `process_template` in a more functional style instead of the highly-imperative (Golang-like?) style used in the article. Using `interleave_longest` from more_itertools, the first `process_template` example is just:

    def process_template(template: Template) -> str:
        return ''.join(interleave_longest(template.strings, map(process_value, template.values)))
And the second is something like:

    def process_template(template: Template) -> tuple[str, tuple]:
        return (
            ''.join(interleave_longest(template.strings, ['?'] * len(template.values))),
            tuple(map(process_value, template.values))
        )

Google has been doing this since forever for reCAPTCHA. And, to be fair, it seems to be fairly effective for bot detection.

https://github.com/neuroradiology/InsideReCaptcha

> bots seem to be going a very different route

If the "very different route" means running a headless browser, then it's a success for this tech. Because the bot must run a blackbox JS now, and this gives people a whole new street of ways to run bot detection, using the bot's CPU.


Okay... but those bots exist... and in high numbers... By "very different route" I mean "a measure that effectively stops the bots" (or dramatically reduces them). It seems like if they're using a headless browser, then they're still being quite effective in accomplishing their goals.

Google's obfuscating, VM-based anti-bot system (BotGuard) was very effective. Source: I wrote it. We used it to completely wipe out numerous botnets that were abusing Google's products, e.g. posting spam, clickfraud, phishing campaigns. BotGuard is still deployed on basically every Google product, and they later did similar systems for Android and iOS, so I guess it continues to work well.

AFAIK Google was the first to use VM-based obfuscation in JavaScript. Nobody was using this technique at the time for anti-spam, so I was inspired primarily by the work Nate Lawson did on Blu-ray.

What most people didn't realize back then is that if you can force your adversary to run a full-blown web browser, there are numerous tricks to detect that the browser is being automated. When BotGuard was new, most of those tricks were specific to Internet Explorer; none were already known (I had to discover them myself), and I never found any evidence that any of them were rediscovered outside of Google. The original bag of tricks is obsolete now of course, since nobody is using Internet Explorer anymore. I don't know what it does these days.

The VM isn't merely about protecting the tricks, though. That's useful but not the main reason for it. The main reason is to make it easier to generate random encrypted programs for the VM, and thus harder to write a static analysis. If you can't write a static analysis for the program supplied by your adversary you're forced to actually execute it and therefore can't write a "safe" bot. If the program changes in ways that are designed to detect your bot, done well there's no good way to detect this and bring the botnet to a safe halt because you don't know what the program is actually doing at the semantic level. Therefore the generated programs can detect your bot and then report back to the server what it found, triggering delayed IP/account/phone number bans. It's very expensive for abusers to go through these bans but because they have to blindly execute the generated programs they can't easily reduce the risk. Once the profit margin shrinks below the margin from abusing a different website, they leave and you win.
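To illustrate the shape of the idea, a toy sketch (nothing like the real thing; all names made up): the server ships a freshly generated, randomly encoded program, the client-side VM blindly executes it, and only the server knows what result to expect.

    import operator
    import random

    OPS = {0: operator.add, 1: operator.xor, 2: operator.mul}

    def generate_program(n=8):
        # server side: a brand-new random "bytecode" program per page load
        return [(random.randrange(3), random.randrange(1, 1 << 16)) for _ in range(n)]

    def run(program, probe):
        # client side: blindly execute opaque bytecode over some runtime probe;
        # a bot can't know what the program checks for without running it
        acc = probe() & 0xFFFFFFFF
        for opcode, operand in program:
            acc = OPS[opcode](acc, operand) & 0xFFFFFFFF
        return acc  # reported back; the server knows the expected value

Because every visit gets a different program, a bot that wants to stay "safe" needs a static analysis of arbitrary generated programs, which is exactly what the scheme denies it.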


> I've already seen this story play out with low-code. The pitch was the same

That's the point. Unlike low-code or no-code bullshit, this somehow worked. People "magically" want to design solutions, and are able to, now.

> People executing small scripts all over the place sounds to me like the stuff of nightmares.

Me too. However, watching how eagerly people want to (and indeed can!) make progress with these LLM-generated horrors, I believe it's time to give up and start designing secure systems despite the existence of massive, slightly broken scripts.


> People "magically" want to design solutions, and are able to, now.

I agree that LLMs lower the barrier significantly compared to low-code/no-code, but most people are not able to design solutions because they lack the business analysis skills and are not detail-oriented enough to follow through with the specification of requirements. Let's not even talk about the discipline to carry out maintenance on a working project in the face of changing requirements.

Even if we agree that LLMs move a lot of the work up the stack towards business analysis / product ownership / solution design, my experience in Enterprise IT in companies ranging from small to gigantic is that users do not magically become BAs / POs / PMs. There's a reason those are professionalized and specialized roles.

I wouldn't mind being proven wrong, it's not like I feel personally threatened or anything. I feel it's the integrity of the systems I oversee that would be threatened.

> I believe it's time to give up and start designing secure systems

OK, well, I'm not going to bear the responsibility for that; I have enough on my plate as it is. I'm not allowing an arbitrary sales rep to interact with our production Salesforce instance by automated means, period. Even if they have the proper permission levels configured to a tee in Salesforce, I can think of a thousand ways they could badly mess up their own slice of data. Interacting with the local machine: also potentially a supermassive black hole of vulnerabilities. Some of them possibly more serious than data loss, such as the siphoning of data to malicious actors.

If someone can think of secure ways for citizen devs to interact with critical enterprise systems via scripting, then fine. I'll sit here waiting!


> Cursor isn't providing their own models

For use cases demanding the most intelligent model, yes they aren't.

However, there are cases where you just can't use the best models due to latency, for example next-edit prediction and applying diffs [0] generated by the super-intelligent model you decided to use. AFAIK, Cursor does use their own model for these, which is why you can't use Cursor without paying them $20/mo even if you bring your own Anthropic API key. Applying what Claude generated in Copilot is just so painfully slow, to the point that I just don't want to use it.

If you tried Cursor early on, I recommend you update your prior now. Cursor was redesigned about a year ago, and it is a completely different product compared to what they first released 2 years ago.

[0] We may not need a model to apply diffs soon; as the Aider leaderboard shows, recent models have started to be able to generate perfect diffs that actually apply.


(I most recently used Cursor in October before switching to Avante, so I suspect I've experienced the version of the tool you're talking about. I mostly didn't use the autocomplete; I mostly used the chat Q&A sidebar.)

And I pay Cursor only for autocomplete - this explains the difference I guess.

I do sometimes use Composer (or Agent in recent versions), but it's becoming increasingly less useful in my case. Not sure why :(


The redesign was ~5 months ago. If you switched in October, you 100% have not used the current Cursor experience.


