Hacker News | new | past | comments | ask | show | jobs | submit | jwpapi's comments

You know what the G stands for in AGI? General intelligence. You could measure a plane on general versatility in the air and it would lose against a bird. You could also measure it on energy consumption. There are a lot of things you can measure, and a lot of them are pointless, just like a lot of articles on HN.

There are very valid reasons to measure that. You wouldn’t ask a plane to drive you to the neighbor’s or to buy you groceries at the supermarket. It’s not as generally mobile as you are, but it increases your mobility.


This is a very good way to measure AGI. We give humans and AI the same input and measure the results. Kudos to ARC for creating these games.

I really wonder why so many people fight against this. We know that AI is useful, we know that AI is helpful for research, but we want to know whether it has what we vaguely define as intelligence.

I’ve read the airplanes-don’t-flap-their-wings and submarines-don’t-swim comparisons. Yes, but this is not the question. I suggest everyone coming up with these comparisons check their biases, because this is about Artificial General Intelligence.

General is the keyword here; this is what ARC is trying to measure. Whether it’s useful or not isn’t the point. Whether AI turns out to be useful after testing isn’t the point either.

This so far has been the best test.

And I also recommend asking AI specialized questions deep in your job, ones you know the answer to, and seeing how often the solution is wrong. I would guess it’s more likely that we mistake knowledge for intelligence than that intelligence is missing. Probably common amongst humans as well.


AGI’s “general” is the wrong word, I think. Humans aren’t general, we’re jagged: strong in some areas, weak in others, and already surpassed in many domains.

LLMs are way past us at languages, for instance. Calculators surpassed us at arithmetic, etc.


I think the issue is that at Walmart they can upsell more. It’s really going to be tough for ChatGPT to beat that, but arguably it’s better for the customer. So OpenAI could just go for the Amazon model, where merchants make less money but are eventually forced to sell through it anyway.

I have a little ai-commit.sh wired up as "send" in package.json, which describes my changes and commits them. Formatting has been solved by linters already. Neither my approach nor OP's approach is ground-breaking, but I think mine is faster; you can also run `!p send` (p is aliased to pnpm) from inside Claude, no need for it to invoke a skill and create overhead.
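For illustration, a minimal sketch of what such an ai-commit.sh could look like. The exact prompt wording and the use of `claude -p` for a one-shot query are assumptions, not the actual script:

```shell
#!/bin/sh
# ai-commit.sh -- hypothetical sketch of an "ai describes my diff and commits"
# script. The prompt text and the `claude -p` one-shot call are assumptions.

# Build the model prompt from a diff (kept pure so it is easy to test).
build_prompt() {
  printf 'Write a one-line conventional commit message for this diff:\n%s\n' "$1"
}

ai_commit() {
  # --quiet exits 0 when nothing is staged, 1 when there are staged changes.
  if git diff --cached --quiet; then
    echo "nothing staged" >&2
    return 1
  fi
  msg=$(claude -p "$(build_prompt "$(git diff --cached)")")
  git commit -m "$msg"
}

# Only act when invoked directly, so sourcing the file stays side-effect free.
if [ "${AI_COMMIT_RUN:-0}" = 1 ]; then ai_commit; fi
```

Wiring it as `"send": "sh ai-commit.sh"` under `"scripts"` in package.json would give the `p send` invocation described above.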

Like, thinking about it, a PR skill is pretty much an antipattern; even telling the AI to just create a PR is faster.

I think some vibe coders should let AI teach them some CLI tooling.


OP here, I disagree. It's great to have a skill for cases where you have extra steps and want the agent to run some verification before making a PR. It's called making a PR, but it's not _just_ running the gh CLI to make a PR.

It checks if I'm in a worktree, renames branches accordingly, adds a Linear ticket if provided, and generates a proper PR summary.

I'm not optimising for how fast the PR is created; I want it to do the menial steps I used to do.
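The deterministic parts of a flow like this could be sketched in shell. The worktree check and `gh pr create` are real commands; the ticket-prefix convention and the function names are assumptions for illustration:

```shell
#!/bin/sh
# Hypothetical sketch of the scriptable steps: worktree detection, ticket-based
# branch rename, and PR creation via the gh CLI.

# Prefix the branch with a ticket id when one is given (pure, testable).
pr_branch() {
  if [ -n "${2:-}" ]; then printf '%s/%s' "$2" "$1"; else printf '%s' "$1"; fi
}

# In a linked worktree, the private git dir differs from the shared one.
in_worktree() {
  [ "$(git rev-parse --git-dir)" != "$(git rev-parse --git-common-dir)" ]
}

make_pr() {
  branch=$(git rev-parse --abbrev-ref HEAD)
  new=$(pr_branch "$branch" "${1:-}")       # $1: optional Linear ticket id
  if [ "$new" != "$branch" ]; then git branch -m "$new"; fi
  gh pr create --fill                       # title/body filled from commits
}
```

The PR summary step is the one piece an agent still adds value on; `--fill` only copies commit messages.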


I have a CLI script for that as well.

I have a CLI script (wtq) that takes whatever is in my clipboard, creates a new worktree, cds into that worktree, installs dependencies, and then starts a Claude session with the query from my clipboard. Once I'm done I can run `wtf` and it does the finish-up work you described.
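A rough sketch of a wtq-style script, assuming the steps just described; the slugging rule, the `pbpaste` clipboard read, and the pnpm/claude calls are guesses, not the actual script:

```shell
#!/bin/sh
# Hypothetical wtq sketch: clipboard query -> new worktree -> deps -> claude.

# Turn a free-form query into a branch-safe slug (pure, testable).
slug() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9' '-' | cut -c1-40
}

wtq() {
  query=$(pbpaste)                          # clipboard (use `xclip -o` on Linux)
  branch="wt/$(slug "$query")"
  git worktree add "../$branch" -b "$branch"
  cd "../$branch" || return 1
  pnpm install
  claude "$query"                           # session seeded with the query
}
```

A matching `wtf` would commit, push, open the PR, and remove the worktree, i.e. the finish-up steps from the parent comment.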

It’s not about the workflow. A skill doesn’t make sense when you have a deterministic, describable workflow; it’s just slower, because it adds an interpretation and token-consuming step.

You can just tell claude to turn the skill into a bash script and then alias it to whatever you like.

A skill is useful if you have a variety of use cases that need to be interpreted and share a lot of the same utility.


I see what you mean - I have a setup-worktree script that does this, but I use the skill for knowing when to do bits and pieces. I agree that if it were 100% deterministic, a script would be much better.

I just completely shifted my mind on that as well. I used to think I could just AI-code everything, but it only worked because I started with a good codebase that I had built. After a while it was the AI’s codebase, and neither it nor I could really work in it, until I untangled it.

I switched to Fedora as my first full time Linux OS and it’s honestly changed my life.

I can use my computer as a tool to do my craft, and I’m not constantly sucked into AI features, news, or external search results if I don’t want to be.

OS stands for operating system, Microsoft is not that for me.

I wouldn’t know how to ever go back. I really hope I’m not forced to for some reason.


About 30 times the cost.

Honestly it helps a lot to learn new patterns or languages, but once you know it all, I’d rather touch it myself, tbh.

I think that’s a fallacy. As of right now there is a point of no return where the complexity can’t be untangled by the agent itself without breaking more things elsewhere. I’ve seen it before: agents cheat on tests, break lint and type rules.

I was hoping for it to work, but it didn’t for me.

Still trying to figure out how to balance it.


I think it works great in codebases that are good, but it will degrade the quality of the codebase compared to what it was before.

A good codebase depends on the business context; in my case it’s an agile one that can react to discovered business cases. I’ve written great typed helpers that practically give me typed Mongo operators for most cases. It makes all operations really smooth. AI keeps finding creative ways of avoiding my implementations, and over time there are more edge cases, thin wrappers, lint-ignore comments, and other funny exceptions, while I’m losing the guarantees I built...

