Hacker News | jackschultz's comments

I literally did this yesterday and had the same thought. Older computer (8 gigs of RAM) with a crappy Windows install I never used, and I thought, huh, I wonder how well these models can take me through installing Linux, with the goal of Docker deploys of relatively basic things like cron tasks, a personal Postgres, and MinIO that I can use for self-hosted shared data.

Took a couple hours with some things I ran across, but the model had me go through the setup for Debian, how to go through the setup GUI, what to check to make it server only, then it took me through commands to run so it wouldn't stop when I closed the laptop, helped with Tailscale, and got the SSH keys all set up. Heck, it even suggested doing daily dumps of the database, saving them to MinIO, and then removing them after that. It also knows about the limitations of 8 gigs of RAM and how to make sure the Docker settings for the different self-hosted services I want to build don't cause issues.

Give me a month and true strong intention and ability to google and read posts and find the answer on my own and I still don't think I would have gotten to this point with the amount of trust I have in the setup.

I very much agree with this topic about self hosting coming alive because these models can walk you through everything. Same goes for self building. And in the future, when open models are that much better and hardware costs come down (maybe, just guessing of course), we'll also be able to host our own agents on these machines we already have set up. All being able to do it ourselves.
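
For anyone curious what the daily dump, save to MinIO, then remove routine mentioned above might look like, here is a rough sketch. It only builds the commands rather than running them. The database name, the `homelab/backups` MinIO alias and bucket, and the `/tmp` path are all made-up placeholders, and it assumes `pg_dump` and the MinIO client `mc` are on the PATH (if Postgres lives in a container, the first step would need to go through `docker exec` instead).

```python
from datetime import date


def backup_steps(db: str, bucket: str, day: date) -> list[list[str]]:
    """Build the three commands for one daily backup run.

    `db` and `bucket` are hypothetical names chosen for illustration.
    """
    # One dump file per day, e.g. /tmp/appdb-2025-01-02.dump
    dump = f"/tmp/{db}-{day.isoformat()}.dump"
    return [
        # 1. Dump the database in pg_dump's custom (compressed) format.
        ["pg_dump", "--format=custom", f"--file={dump}", db],
        # 2. Copy the dump into the MinIO bucket with the mc client.
        ["mc", "cp", dump, f"{bucket}/"],
        # 3. Remove the local copy once it has been uploaded.
        ["rm", dump],
    ]


if __name__ == "__main__":
    for step in backup_steps("appdb", "homelab/backups", date.today()):
        print(" ".join(step))
```

A cron task could run each step with `subprocess.run(step, check=True)` so that a failed dump or upload stops the run before the local file is removed.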


Reread Story of Your Life again just now, and all it made me want to do is learn Heptapod B and their semagram style of written communication.

Reading “Mathematica - A secret world of intuition and curiosity” as well, and a part stuck out in a section called The Language Trap. The example the author gives is a recipe for making banana bread: if you’re familiar with bananas, it’s obvious that you need to peel them before mashing. But if you haven’t seen a banana, you’d have no clue what to do. Should a recipe say to peel the banana, or can that be left out? Questions like these are clearly coming up more with AI and context, but it’s the same for humans. He ends that section saying most people prefer a video for cooking rather than a recipe.

Another quote from him:

“The language trap is the belief that naming things is enough to make them exist, and we can dispense with the effort of really imagining them.”


Quote that I always like:

There was a man who was afraid of his shadow and disliked his footprints. So he tried to get away from them. He ran, but the faster he ran, the more numerous his footprints became, and his shadow kept up with him without lagging behind. Thinking he was going too slowly, he ran faster and faster, until he collapsed and died of exhaustion. He did not realize that if he had simply stayed in the shade, his shadow would have disappeared, and if he had sat still, there would have been no footprints.

And another one [0]:

My hut lies in the middle of a dense forest; Every year the green ivy grows longer. No news of the affairs of men, Only the occasional song of a woodcutter. The sun shines and I mend my robe; When the moon comes out I read Buddhist poems. I have nothing to report, my friends. If you want to find the meaning, stop chasing after so many things.

[0] https://firstknownwhenlost.blogspot.com/2011/06/stop-chasing...


Infinitely agree with all. I was skeptical, and then tried Opus 4.5 and was blown away. Codex with 5.0 and 5.1 wasn't great, but 5.2 is a big improvement. I can't do code without it because there's no point. On both time and quality, with the right constraints you're going to get better code.

And same thought on procrastination, both from not knowing where to start and from getting stuck in the middle and not knowing where to go. Literally never happens anymore. You have discussions with it to do the planning and weigh different options for implementation, you get to the end with a good design description, and then what's the point of writing the code yourself when, with that design, it's going to write it quickly and match what was agreed?


You can code without it. Maybe you don't want to, but if you're a programmer, you can.

(here I am remembering a time I had no computer and would program data structures in OCaml with pen and paper, then would go to university the next day to try it. Oftentimes it worked the first try)


Sure, but the end of this post [0] is where I'm at. I don't feel the need or want to write the code when I can spend my time doing the other parts that are much more interesting and valuable.

> Emil concluded his article like this:

> JustHTML is about 3,000 lines of Python with 8,500+ tests passing. I couldn’t have written it this quickly without the agent.

> But “quickly” doesn’t mean “without thinking.” I spent a lot of time reviewing code, making design decisions, and steering the agent in the right direction. The agent did the typing; I did the thinking.

> That’s probably the right division of labor.

> I couldn’t agree more. Coding agents replace the part of my job that involves typing the code into a computer. I find what’s left to be a much more valuable use of my time.

[0] https://simonwillison.net/2025/Dec/14/justhtml/


But are those tests relevant? I tried using LLMs to write tests at work and whenever I review them I end up asking it “Ok great, passes the test, but is the test relevant? Does it test anything useful?” And I get an “Oh yeah, you’re right, this test is pointless”


Keep track of test coverage and ask it to delete tests without lowering coverage by more than, let’s say, 0.01 percentage points. If you have a script that gives it only the test coverage, and a file with all tests including line number ranges, it is more or less a dumb task it can work on for hours, without actually reading the files (which would fill context too quickly).
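
To make the idea concrete, here is a toy sketch of that pruning loop. It assumes you already know which lines each test covers and stands in a simple set union for the real "run the suite and read the coverage number" step. The test names, line sets, and the greedy removal order are all illustrative assumptions, not anything a real coverage tool prescribes.

```python
def coverage_pct(tests: dict[str, set[int]], total_lines: int) -> float:
    """Percentage of lines covered by at least one test."""
    covered = set().union(*tests.values())
    return 100.0 * len(covered) / total_lines


def prune_tests(
    tests: dict[str, set[int]], total_lines: int, max_drop: float = 0.01
) -> dict[str, set[int]]:
    """Greedily delete tests while coverage stays within `max_drop`
    percentage points of the starting coverage."""
    baseline = coverage_pct(tests, total_lines)
    kept = dict(tests)
    for name in list(tests):
        # Try the suite without this one test.
        trial = {k: v for k, v in kept.items() if k != name}
        if baseline - coverage_pct(trial, total_lines) <= max_drop:
            kept = trial  # deletion was (nearly) free, so keep it deleted
    return kept


if __name__ == "__main__":
    suite = {"a": {1, 2}, "b": {2}, "c": {3}, "d": {1, 2, 3}}
    print(sorted(prune_tests(suite, total_lines=10)))
```

The drop is measured against the original baseline rather than the previous step, so many small losses can't add up to a big one.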


That does not work as advertised.

If you leave an agent for hours trying to increase coverage by percentage without further guiding instructions you will end up with lots of garbage.

In order to achieve this, you need several distinct loops. One that creates tests (there will be garbage), one that consolidates redundant tests, one that parametrizes repetitive tests, and so on.

Agents create redundant tests for all sorts of reasons. Maybe they're trying to hit a hard-to-reach line and leave several attempts behind. Or maybe they "get creative" and try to guess what is uncovered instead of actually following the coverage report, etc.

Less capable models are actually better at doing this. They're faster, don't "get creative" with weird ideas mid-task, and cost less. Just make them work one test at a time. Spawn, do one test that verifiably increases overall coverage, exit. Once you reach a threshold, start the consolidating loop: pick a redundant pair of tests, consolidate, exit. And so on...

Of course, you can use a powerful model and babysit it as well. A few disambiguating questions and interruptions will guide them well. If you want true unattended though, it's damn hard to get stable results.
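
As a rough illustration of the "one test at a time, only if it verifiably increases coverage" gate described above, here is a toy version. Each candidate test is represented by the set of lines it covers, which is an assumption made for the sketch. In a real setup that information would come from re-running the suite with a coverage tool after each spawned agent exits.

```python
def creation_loop(
    candidates: list[tuple[str, set[int]]],
    total_lines: int,
    threshold_pct: float,
) -> tuple[list[str], set[int]]:
    """Accept candidate tests one by one until coverage hits the threshold.

    A candidate is kept only if it covers at least one new line.
    """
    covered: set[int] = set()
    accepted: list[str] = []
    for name, lines in candidates:
        # Threshold reached: hand over to the consolidation loop.
        if 100.0 * len(covered) / total_lines >= threshold_pct:
            break
        if lines - covered:  # verifiably increases overall coverage
            accepted.append(name)
            covered |= lines
        # Otherwise the attempt is discarded instead of left behind.
    return accepted, covered


if __name__ == "__main__":
    attempts = [("t1", {1, 2}), ("t2", {2}), ("t3", {3, 4}), ("t4", {5})]
    print(creation_loop(attempts, total_lines=5, threshold_pct=80))
```

The point of the gate is that redundant attempts never make it into the suite, so the later consolidation pass has less garbage to clean up.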


If you read my comment, I was describing the consolidation part.


We fixed this at work by instructing it to maximize coverage with minimal tests, which is closer to our coding style.


Those tests were written by people. That's why they were confident that what the LLM implemented was correct.


Meta about how important context is.

People see LLMs and tons of tests written in the same sentence, and think that shows how models love writing pointless tests, rather than realizing that the tests are standard and human-written, and are there to show that the model wrote code that is validated by a currently trusted source.

Shows that writing comments with the right context for the humans who are going to read them is _very_ similar to how we need to interact with LLMs. And if we fail to communicate with humans, clearly we're going to fail with models.


Yeah, we now need to specify who wrote the tests, because it's important information.


Yes

Skill issue... And perhaps the wrong model + harness


It's the semantics of "can", where it is used to suggest feasibility. When I moved and got a new commute, I still "could" bike to work, but it went from 30min to an hour and a half each way. While technically possible, I would have had to sacrifice a lot when losing two hours a day: laundry, cooking dinner, downtime. I always said I "can't really" bike to work, but there is a lot of context lost.


So you can, but don't want to.


yup


"Can" is too overloaded a word even with context provided, ranging from places like "could conceivably be achieved" to "usually possible".

The only hint you can dig out is what limits on feasibility they might have in mind. E.g. "I can fly first class all the time (if I limit the number of flights and spend an unreasonable portion of my wealth on tickets)" is typically a less useful interpretation than "I can fly first class all the time (frequently without concern, because I'm very well off)", but you have to figure out which one they are trying to say (which isn't always easy).


I can't without seriously sacrificing productivity. (I've been coding for 30 years.)


What are you talking about? 5.2 literally just came out.


5.2-codex just came out. You could use codex with regular 5.2 for a week or so.


Another video about this today: https://www.youtube.com/watch?v=4Bg0Q1enwS4

Summary is that for agents to work well they need clear vision into all things, and putting the data behind a GUI or a poorly maintained CLI is a hindrance. Combined with how structured CRUD apps are and how the agents can for sure write good CRUD apps, there's no reason not to have your own. Wins all around: not paying for it, having a better understanding of processes, and letting agents handle workflows.


Same. Never used worktrees before, but mapping worktrees to the tickets I’m assigned for Claude to work on is really great.

Heck, with the AI, I even have it spin up a dev and test db for that worktree in a docker container. Each has their own so they don’t conflict on that front either. And I won’t lie, I didn’t write those scripts. The models did it all and can make adjustments when I find different work patterns that I like.

This is all to the point of me wondering why I never did this for myself in the past, given the number of times I’m working on multiple parts of a codebase and the annoyance of committing, stashing, checking out a different branch, and not being able to switch back quickly when blockers are resolved.
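
Not the actual scripts (the models wrote those), but a minimal sketch of the shape of it, assuming Postgres as the database. The branch naming, the `../wt-<ticket>` path, the container names, the password, and the port scheme are all hypothetical choices for illustration. The function only builds the commands so they can be inspected before anything is run.

```python
def worktree_commands(
    ticket: str, index: int = 0, base_port: int = 5500
) -> list[list[str]]:
    """Commands to create a worktree plus dev/test databases for one ticket.

    `index` is a per-ticket counter used to give each worktree its own
    pair of host ports so the containers never collide.
    """
    branch = f"ticket/{ticket}"
    path = f"../wt-{ticket}"
    # New branch checked out in its own directory next to the main clone.
    cmds = [["git", "worktree", "add", "-b", branch, path]]
    for offset, role in enumerate(("dev", "test")):
        port = base_port + 2 * index + offset
        cmds.append([
            "docker", "run", "-d",
            "--name", f"pg-{ticket}-{role}",
            "-e", "POSTGRES_PASSWORD=postgres",  # throwaway local password
            "-p", f"{port}:5432",
            "postgres:16",
        ])
    return cmds


if __name__ == "__main__":
    for cmd in worktree_commands("ABC-123", index=1):
        print(" ".join(cmd))
```

Each command list can be passed to `subprocess.run(cmd, check=True)`. Tearing down would be the reverse: stop and remove the two containers, then `git worktree remove` the path.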


This is what I'm doing, Opus 4.5 for personal projects and to learn the flow and what's needed. Only thing I'll disagree with is how the work takes similar amount of time because I'm finding it unbelievably faster. It's crazy how with smart planning and documentation that we can do with the agents, getting markdown files etc, they can write the code better and faster than I can as a senior dev. No question.

I've found Opus 4.5 to be a big upgrade compared to any of the other models. Big step up, and the minor issues that were annoying and that I needed to watch out for with Sonnet and GPT5.1 are gone.

It's to the point where I'm on the side of, if the models are offline or I run out of tokens for the 5 hour window or the week (with what I'm paying now), there's kind of no use doing work. I can use other models to do planning or some review, but then wait until I'm back with Opus 4.5 to do the code.

It still absolutely requires review from me and planning before writing the code, and this is why there can be some slop that goes by, but it's the same as if you have a junior and they put in weak PRs. Difference is much quicker planning which the models help with, better implementation with basic conventions compared to juniors, and much easier to tell a model to make changes compared to a human.


> This is what I'm doing, Opus 4.5 for personal projects and to learn the flow and what's needed. Only thing I'll disagree with is how the work takes similar amount of time because I'm finding it unbelievably faster.

I guess it depends on the project type, in some cases like you're saying way faster. I definitely recognize I've shaved weeks off a project, and I get really nuanced and Claude just updates and adjusts.


Go to usda.gov and two recent press releases are

1 - Secretary Rollins Blocks Taxpayer Dollars for Solar Panels on Prime Farmland

2 - Secretary Rollins Prioritizes American Energy on National Forest Land

Both have quotes about putting "America first" to confuse people to make them think this is better for all. We think the USDA is about getting healthy food to people, but really they're about maximizing the money for farmers and people who own the land. Terrible.

[1] - https://www.usda.gov/about-usda/news/press-releases/2025/08/...

[2] - https://www.usda.gov/about-usda/news/press-releases/2025/08/...


Farmers also want solar panels is the thing. It brings their costs down.


I'm in Wisconsin and if I drive on a county road, I see signs near the road that say "Save our S̶o̶l̶a̶r̶ Farms". Maybe some are fine with them, but there seems to be lots of internal pressure to say no, for unfortunate reasons.


They may feel like grassroots campaigns to save farms, but much of it is backed by large corporate interests. Doesn't mean that there aren't legit concerns, but the sponsorship makes me wary.

https://www.npr.org/2023/02/18/1154867064/solar-power-misinf...


Presumably they are still free to purchase solar panels?


The US already banned the import of affordable solar technology from china


And how do they feel about US tariffs on foreign agricultural products?


Is it affordable because they are dumping though? Or is it because of the slave labor? Or is it cheap because they freely pollute the air water and land? Or maybe all of the above? I would personally love to see a full ban on imports from countries unless they are at parity with US environmental/labor/trade laws.


An interesting take. What happens for all the things where US laws and policy are worse (“below standard”) than those of many developed countries, of which there are plenty?


My point isn’t that the US is great at everything (it’s clearly not), it’s that putting restrictions on US businesses while trading with places that don’t have the restrictions is moronic. It puts US businesses at a disadvantage and then still has the negative effect the regulations were trying to prevent in many cases (e.g. air pollution, water pollution, abuse of workers if you care about Chinese people too and don’t want to support their abuse, etc).


I don't get it… isn't it up to the landowner whether they farm corn, soybeans, or solar radiation? The government may provide different incentives for each, but AFAIK, they aren't forcing a choice.


> Farmers also want solar panels is the thing. It brings their costs down.

I am a little curious to know what percentage voted for this.


Depending on the state, not enough to matter. Farmers are not a major voting bloc in most of the US anymore. Farmers are a bit over 1% of the US population. I'm trying to find better sources than listicle type things, but the best I can find is that in the states with the highest percentage of farmers, it's still only 5-6% of their state population.

That can be enough to swing things, but it's not enough to be the deciding bloc that many think they are. A century ago things were much different.


This is very wrong: farmers still have incredibly outsized influence in American politics, mostly to our detriment. We have a number of horrible policies (ethanol subsidies, HFCS in everything, tons of inexplicable restrictions on food stamps, water policy in the west emptying all the aquifers) that are entirely because of lobbying by farmers, and the Farm Bill distributes between 70 and 100 billion per year, much of it well-spent but also with a great deal of graft and patronage because of farming lobbying.


here in California, the word "farmers" can mean very different things


And even that is probably an optimistic number for what most people would consider a "farmer." It always irks me when people blame farmers for the policies of farm counties when the vast majority of voters in those communities have nothing at all to do with farming.


Someone can be more specific and accurate with this, but in the US, population percentages don't vote. Or in other words, some votes are worth more than others. So relying entirely on % of population isn't a great measure.


That's a fair point. By the numbers, about 60% of voters turned out in the last election. Historically, though, we've had lower turnouts. Let's say that nationally only 50% turned out but all farmers voted. That would make their 1+% bloc closer to 2-3% of the votes in an election, which is more significant.
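
Spelling out that back-of-the-envelope math: a group's share of the votes cast is its share of the population times its own turnout, divided by overall turnout. The sketch below just encodes that ratio. The input numbers are the hypothetical ones from the comment, not measured data.

```python
def share_of_votes(
    group_share: float, group_turnout: float, overall_turnout: float
) -> float:
    """Fraction of all votes cast that come from the group.

    All arguments are fractions between 0 and 1. `overall_turnout`
    includes the group's own voters.
    """
    return group_share * group_turnout / overall_turnout


if __name__ == "__main__":
    # 1% of the population, everyone in the group votes, 50% overall turnout.
    print(f"{share_of_votes(0.01, 1.0, 0.50):.1%}")
    # Same group at roughly the last election's 60% overall turnout.
    print(f"{share_of_votes(0.01, 1.0, 0.60):.1%}")
```

So at 50% overall turnout, a fully mobilized 1% group casts 2% of the votes, and it takes a group of about 1.5% of the population to reach 3%.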


That still elides over the electoral college. A vote in Pennsylvania is worth a lot more than a vote in California or Georgia.


Except the Iowa farmer has a disproportionate amount of voting power compared to say someone in NY or CA.


They don't. They have some early power, but even in Iowa there are a lot of non-farmers.



I’m not sure what exactly you mean by that link.

Iowa was listed as 63% urban in the 2020 census. But that doesn’t tell the whole story. An area needs 2000 housing units and/or 5000 people to be counted as urban. If you’ve been through the state, you’ll see lots of tiny little 2000-3000 person towns that have an urban street grid around a couple-block downtown core. These things don’t get counted as urban.

The farmland is too valuable for you to see much of any sprawl except in Des Moines and Iowa City. Even Council Bluffs (the Iowa side of the Omaha metro) has very little for the metro size.



Another wrinkle is funding for Secure Rural Schools under the 1908 25% fund act hasn't been renewed. Counties that have a national forest presence have a federal government offset to compensate for lost logging.

https://krcrtv.com/news/local/trinity-county-urges-congress-...


My brother-in-law curls. Asked him about this and he said that it's been coming for a while, the men's teams at Canadian nationals have a self-imposed ban on them, and for amateurs it doesn't really matter since they're not good enough for it to make a difference. Seems like it means it's not that big an issue and players aren't arguing to keep them.

Now golf, on the other hand, has much bigger equipment issues if people want to see some big-time drama.


I remember back in the day when the brooms were real corn bristle brooms. You had to have forearms like hams to slap those brooms back and forth hard enough to get those stones to move. The side effect was that every now and then a piece of straw would bust off and cause the stone to veer way off course.


I've been curious about what the best way to record breathing rates with wearables would be. My thought was a chest strap with springs to measure tension, with higher tension meaning air in the lungs. But you're talking about a different way. How does the magnet work to get rates? I'd want something that can get rates and volumes from mouth vs nasal and also tell which vent the air is coming into the lungs from. Probably a case of how much intrusion you want vs how intricate and correct the data is.

