Hacker News

The post is just about playing around with the tech for fun. Why does monetization come into it? It feels like saying you don't want to use Python because Astral, the company that makes uv, is operating at a loss. What?


Agents use APIs that I will need to pay for, and generally software dev is a job for me that needs to generate income.

If the APIs I call are not profitable for the provider then they won't be for me either.

This post is a fly.io advertisement


"Agents use APIs that I will need to pay for"

Not if you run them against local models, which are free to download and free to run. The Qwen 3 4B models only need a couple of GB of available RAM and will run happily on CPU rather than GPU. Cost isn't a reason not to explore this stuff.


Google has what I would call a generous free tier, even including Gemini 2.5 Pro (https://ai.google.dev/gemini-api/docs/rate-limits). Just get an API key from AI Studio. It's also very easy to add a switch in your agent so that if you hit a rate limit for one model, you re-request the query with the next model. With Pro/Flash/Flash-Lite and their previews, you've got 2500+ free requests per day.
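That fallback switch is only a few lines. Here's a hedged sketch: the model names are the Gemini 2.5 family mentioned above, but `RateLimited` is a placeholder for whatever 429 error your actual client library raises, and `call_model` is whatever function you use to hit the API.

```python
# Sketch of model fallback on rate limits. RateLimited stands in for the
# real client's 429 exception; call_model is your actual API call.

MODELS = ["gemini-2.5-pro", "gemini-2.5-flash", "gemini-2.5-flash-lite"]

class RateLimited(Exception):
    """Placeholder for the client library's rate-limit error."""

def ask_with_fallback(prompt, call_model, models=MODELS):
    """Try each model in order; on a rate limit, fall through to the next."""
    for model in models:
        try:
            return call_model(model, prompt)
        except RateLimited:
            continue  # this model is tapped out for now; try the next one
    raise RuntimeError("all models rate-limited")
```

With per-model daily quotas, cycling down the list like this lets you exhaust each tier before giving up.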


> Not if you run them against local models, which are free to download and free to run .. run happily on CPU .. Cost isn't a reason not to explore this stuff.

Let's be realistic and not over-promise. Conversational slop and coding factorial will work. But the local experience for coding agents, tool-calling, and reasoning is still very bad until/unless you have a pretty expensive workstation. CPU-only Qwen 4B will be disappointing even to run experiments on. The only useful thing most people can realistically do locally is fuzzy search with simple RAG. Besides factorial, maybe some other stuff that's in the training set, like help with simple shell commands. (Great for people who are new to Unix, but it won't help the veteran dev who is trying to convince themselves AI is real, or figure out how to get it into their workflows.)

Anyway, admitting that AI is still very much in a "pay to play" phase is actually OK. More measured stances, fewer reflexive detractors or boosters.


Sure, you're not going to get anything close to a Claude Code style agent from a local model (unless you shell out $10,000+ for a 512GB Mac Studio or similar).

This post isn't about building Claude Code - it's about hooking up an LLM to one or two tool calls in order to run something like ping. For an educational exercise like that a model like Qwen 4B should still be sufficient.


The expectation that reasonable people have isn't a fully local Claude Code; that's a strawman. But it's also not ping tools or the simple weather agent that tutorials like to use. It's somewhere in between, isn't that obvious? If you're into evangelism, acknowledging this and actually taking a measured stance would help prevent light skeptics from turning into complete AI-deniers. If you mislead people about one thing, they will assume they are being misled about everything.


I don't think I was being misleading here.

https://fly.io/blog/everyone-write-an-agent/ is a tutorial about writing a simple "agent" - aka a thing that uses an LLM to call tools in a loop - that can make a simple tool call. The complaint I was responding to here was that there's no point trying this if you don't want to be hooked on expensive APIs. I think this is one of the areas where the existence of tiny but capable local models is relevant - especially for AI skeptics who refuse to engage with this technology at all if it means spending money with companies they don't like.
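The "LLM calling tools in a loop" shape really is that small. This is an illustrative sketch, not the post's actual code: `call_model` is an assumed interface that takes the message list and returns either `{"tool": name, "args": {...}}` or `{"answer": text}`, and you'd adapt it to whichever hosted or local model you use.

```python
import subprocess

def ping(host):
    # Send a single packet so the demo terminates quickly.
    out = subprocess.run(["ping", "-c", "1", host],
                         capture_output=True, text=True)
    return out.stdout or out.stderr

TOOLS = {"ping": ping}

def agent(prompt, call_model, max_steps=5):
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_steps):
        reply = call_model(messages)
        if "answer" in reply:
            return reply["answer"]
        # Execute the requested tool and feed the result back to the model.
        result = TOOLS[reply["tool"]](**reply["args"])
        messages.append({"role": "tool", "content": result})
    return "gave up after too many steps"
```

Everything interesting (prompt formatting, parsing the model's reply into that dict shape) hides inside `call_model`; the loop itself is the whole "agent".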


I think it is misleading to suggest today that tool-calling for nontrivial stuff really works with local models. It just works in demos because those tools always accept one or two arguments, usually string literals or numbers. In the real world, functions take more complex arguments, many arguments, or a single argument that's an object with multiple attributes, etc. You can begin to work around this by passing function signatures, typing details, and JSON Schemas to set expectations in context, but local models tend to fail at handling this kind of thing long before you ever hit limits in the context window. There's a reason demos always use one string literal like a hostname, or two floats like lat/long. It's normal that passing a dictionary with a few strict requirements might need 300 retries instead of 3 to get a tool call that's syntactically correct with properly passed arguments. `ping --help` for me shows something like 20 options, and for any attempt to map things 1:1 with more args, I think you'd start to see breakdown pretty quickly.
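To make the guardrail described above concrete, here's a hedged sketch: validate the model's proposed arguments against a schema before executing anything, and count a rejection as a retry. The schema shape mimics JSON Schema, but the checker is a deliberately tiny hand-rolled subset, not a real validator.

```python
# Toy argument validation for a ping tool. Real code would use a proper
# JSON Schema library; this checks only required keys and primitive types.

PING_SCHEMA = {
    "type": "object",
    "required": ["host"],
    "properties": {
        "host": {"type": "string"},
        "count": {"type": "integer"},
    },
}

def check_args(args, schema):
    """Return True iff the model's proposed args satisfy the schema subset."""
    if not isinstance(args, dict):
        return False
    for key in schema.get("required", []):
        if key not in args:
            return False
    types = {"string": str, "integer": int, "number": (int, float)}
    for key, val in args.items():
        spec = schema["properties"].get(key)
        if spec is None:
            return False  # hallucinated argument name: a common failure
        if not isinstance(val, types[spec["type"]]):
            return False  # right name, wrong type: another common failure
    return True
```

Even this two-field schema has several distinct ways to fail (missing key, invented key, wrong type), which is exactly why the retry count climbs as real tool signatures grow.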

Zooming in on the details is fun but doesn't change the shape of what I was saying before. No need to muddy the water; very very simple stuff still requires very big local hardware or a SOTA model.


You and I clearly have a different idea of what "very very simple stuff" involves.

Even the small models are very capable of stringing together a short sequence of simple tool calls these days - and if you have 32GB of RAM (e.g. a ~$1500 laptop) you can run models like gpt-oss:20b, which are capable of operating tools like bash in a reasonably useful way.

This wasn't true even six months ago - the local models released in 2025 have almost all had tool calling specially trained into them.


You mean like a demo for simple stuff? Something like hello world type tasks? The small models you mentioned earlier are incapable of doing anything genuinely useful for daily use. The few tasks they can handle are easier and faster to just write yourself with the added assurance that no mistakes will be made.

I’d love to have small local models capable of running tools like current SOTA models, but the reality is that small models are still incapable, and hardly anyone has a machine powerful enough to run the 1 trillion parameter Kimi model.


Yes, I mean a demo for simple stuff. This whole conversation is attached to an article about building the simplest possible tool-in-a-loop agent as a learning exercise for how they work.


> software dev is a job for me that needs to generate income

sir, this is a hackernews


> This post is a <insert-startup-here> advertisement

same thing you said but in a different context... sir, this is a hackernews


Because if you build an agent you'll need to host it in a cloud virtual machine...? I don't follow.


I have an "agent" that posts our family schedule + weather + other relevant stuff to our shared channel.

It costs like 0.000025€ per day to run. Hardly something I need to get "profitable".

I could run it on a local model, but GPT-5 is stupidly good at it so the cost is well worth it.


Practically everything is something you will need to pay for in the end. You probably spent money on an internet connection, electricity, and computing equipment to write this comment. Are you intending to make a profit from commenting here?

You don't need to run something like this against a paid API provider. You could easily rework this to run against a local agent hosted on hardware you own. A number of not-stupid-expensive consumer GPUs can run some smaller models locally at home for not a lot of money. You can even play videogames with those cards after.

Get this: sometimes people write code and tinker with things for fun. Crazy, I know.


The submission is an advertisement for fly.io and OpenAI, both of which are paid services. We are commenting on an ad. The person who wrote it did it for money. Fly.io operates for money; OpenAI charges for their API.

They posted it here expecting to find customers. This is a sales pitch.

At this point why is it an issue to expect a developer to make money on it?

As a dev, if the chain of monetization ends with me, then there is no mainstream adoption whatsoever on the horizon.

I love to tinker but I do it for free not using paid services.

As for tinkering with agents, it's a solution looking for a problem.


Why are you repeatedly stating that the post is an ad as if it is some sort of dunk? Companies have blogs. Tech blogs often produce useful content. It is possible for an ad to both successfully promote the company and be useful to engineers. I find the Fly blog to be particularly well-written and thoughtful; it's taught me a good deal about WireGuard, for instance.


And that sounds fine, but WireGuard is not an overhyped industry promising huge future gains to investors, and to bandwagon-jumping developers who can find problems for this solution.

I have actually built agents in the past, and this is my opinion. If you read the article, the author says they want to hear the reasoning for disliking it, so this is mine: the only way to create a business here is raising money and hoping somebody strikes gold with the shovel I'm paying for.


How would you feel about this post if the exact same content was posted on a developer's personal blog instead?

I ask because it's rare for a post on a corporate blog to also make sense outside of the context of that company, but this one does.


They're mentioning WireGuard because we do in fact do WireGuard, unlike LLM agents, which we do not offer as a service.


You keep saying this, but there is nothing in this post about our service. I didn't use Fly.io at all to write this post. Across the thread, someone had to remind me that I could have.


Sorry, I assumed a service offering virtual machines shares Python code with the intent to get people to run that Python on their infra.


Yes. You've caught on to our devious plan. To do anything I suggested in this post, you'd have to use a computer. By spending compute cycles, you'd be driving scarcity of compute. By the inexorable law of supply and demand, this would drive the price of compute cycles up, allowing us to profit. We would have gotten away with it, if it wasn't for you.


Scooby Doobie Doooo!


No, we are not an LLM provider.


Yeah we have open source models too that we can use, and it’s actually more fun than using cloud providers in my opinion.



