Hierarchies can punish this. Note that the legislative and judicial branches exert their power. The Epstein files got released, if you need proof.
(However, if we are International Systems Realists, some effects are inevitable. I have a feeling even Biden/Harris would be in Iran right now.)
Some got released, and in the way the Executive wanted them to be.
This proves the opposite IMO - while the Legislative is co-opted, the Judicial branch has shown it is quite inadequate at exerting control over, or punishing, the Executive.
It's odd because I no longer really like ChatGPT. For chat-type requests, I prefer Claude, or if it's knowledge-intensive then Gemini 3 Pro (which is better for history, old novels, etc).
But GPT 5.3 Codex is great. Significantly better than Opus, in the TUI coding agent.
I don't know about Opus, but Codex suddenly got a lot better to the point that I prefer it over Sonnet 4.6. Claude takes ages and comes up with half baked solutions. Codex is so fast that I miss waiting. It also writes tests without prompting.
ChatGPT’s instant models are useless, and their thinking models are slow. This makes Claude more pleasant to use, despite them not being SOTA.
But ChatGPT is still SOTA in search and hard problem solving.
GPT-5.2 Pro is the model people are using to solve Erdos problems after all, not Claude or Gemini. The Thinking models are noticeably better if you have difficult problems to work on, which justifies my sub even if I use Claude for everything else. Their Codex models are also much smarter, but also less pleasant to use.
IME ChatGPT is pretty mid at search. Grok, although significantly dumber, is really strong at diligently going through hundreds of search results, and is much more tuned to rely on search results instead of its internal knowledge (which, depending on the case, can be better or worse). It's the only situation where Grok is worth using IMO.
Gemini is really good with many topics. Vastly superior to ChatGPT for agronomy.
You should always use the best model for the job, not just stick to one.
We better get a liberal democratic Iran government out of this.
We better remove and halt nuclear powers for the rest of my life.
I suppose if we get either one, it was successful.
My personal polymarket says we won't get either. Trump and Israel are ruining their reputations. But reputation matters close to 0 in international relations, which is why they don't care.
There's next to no chance that whatever comes out of the end of this will be a "liberal democratic Iran government". Obama started down a route in that direction with the lowered sanctions and the Joint Comprehensive Plan of Action in 2015. Iran having a democratic government doesn't really help the GOP war hawks, so of course they trashed it. The same happened with North Korea in the 90s: the Agreed Framework had some promise before GWB torpedoed it to please his oinking base.
I also think that nuclear powers mean regional stability. Ukraine gave up its nukes in the 90s and we saw what happened there.
> We better get a liberal democratic Iran government out of this.
I doubt it. US intervention seems to have a habit of creating weakened nations for its rivals to benefit from. In Iraq's case: Iran and in Iran's case maybe the Taliban in Afghanistan.
>Time to first token measured with an 8K-token prompt using a 14-billion parameter model with 4-bit quantization
Oh dear 14B and 4-bit quant? There are going to be a lot of embarrassed programmers who need to explain to their engineering managers why their Macbook can't reasonably run LLMs like they said it could. (This already happened at my fortune 20 company lol)
It won't handle serious tasks, but I have Gemma 3 installed on my M2 Mac and it is good for most of my needs, especially data I don't want a corporation getting its hands on.
I run Qwen 3.5 30B MoE and it's reasonable at most tasks I would use a local model for, including summarizing things. For instance, I auto-update all my toolchains in the background when I log in, and when that finishes I use my local model to summarize everything that was updated, plus any errors or issues, on the next prompt render. It's quite nice because everything stays updated, I know what's been updated, and I'm immediately aware of issues. I also use it for a variety of "auto correct" tasks, "give me the command for" requests, summarizing a man page and explaining X, and a bunch of other tasks where I would rather not copy and paste, etc.
Nothing like coding, just like relatively basic stuff. Idk its hard to explain but I use AI so frequently for work that I have a sense for what it is capable of.
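A sketch of what that kind of login-time hook could look like; the log path, the error filter, and the local-model CLI call (llama.cpp's `llama-cli` here) are assumptions, not the commenter's actual setup:

```python
# Summarize a background toolchain-update log with a local model.
# The LOG path and the llama-cli invocation are illustrative assumptions.
import re
import subprocess

LOG = "/tmp/toolchain-update.log"  # hypothetical log written by the updater

def extract_issues(log_text):
    """Pull out error/warning-looking lines to surface immediately."""
    return [line for line in log_text.splitlines()
            if re.search(r"\b(error|warn|fail)", line, re.IGNORECASE)]

def summarize_with_local_model(log_text):
    prompt = "Summarize these toolchain updates and flag any issues:\n" + log_text
    # Any local runner works; llama.cpp's CLI is one option (assumed installed).
    result = subprocess.run(["llama-cli", "-p", prompt],
                            capture_output=True, text=True)
    return result.stdout

# Example (run from a login hook once the updater finishes):
#   text = open(LOG).read()
#   print("Issues:", extract_issues(text))
#   print(summarize_with_local_model(text))
```

The cheap regex pre-filter means hard errors surface even if the model's summary glosses over them.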
Yeah, no it didn't. If you have a fully specced-out M3/M4 MacBook with enough memory, you're running pretty decent models locally already. But no one is using local models anyway.
I run a local model daily. I have it making tickets when certain emails come in, and I made a small tool that I can click to approve ticket creation.
It follows my instructions and has a nice chain-of-thought process trained in.
Local LLMs are starting to become very useful. Not OpenClaw crap.
Technically you can get most MoE models to run locally, because RAM requirements are limited to the active experts' activations (which are on the order of the active param size); everything else can either be mmap'd in (the read-only params) or cheaply swapped out (the KV cache, which grows linearly per generated token and is usually small). But that gives you absolutely terrible performance, because almost everything is bottlenecked by storage transfer bandwidth. So good performance is really a matter of "how much more do you have than just that bare minimum?"
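A back-of-envelope sketch of that bare-minimum math; the model shape (a 30B-total / 3B-active MoE at 4-bit) and the bandwidth figures are illustrative assumptions, not measurements:

```python
# Rough working-set and throughput estimate for an mmap'd MoE model.
# All numbers are illustrative assumptions, not benchmarks.

GB = 1024**3

def moe_working_set(total_params_b, active_params_b, bytes_per_param=0.5):
    """Bytes that must stay resident per token (active experts only)
    vs. the full weight file, at 4-bit quantization (0.5 bytes/param)."""
    active_bytes = active_params_b * 1e9 * bytes_per_param
    total_bytes = total_params_b * 1e9 * bytes_per_param
    return active_bytes, total_bytes

def decode_tokens_per_sec(active_bytes, bandwidth_bytes_s):
    # Decode is bandwidth-bound: each token reads the active weights once.
    return bandwidth_bytes_s / active_bytes

active, total = moe_working_set(30, 3)  # a 30B-A3B-style model
print(f"resident ~{active / GB:.1f} GB of ~{total / GB:.1f} GB of weights")
print(f"~{decode_tokens_per_sec(active, 400 * GB):.0f} tok/s ceiling from RAM (400 GB/s)")
print(f"~{decode_tokens_per_sec(active, 7 * GB):.1f} tok/s ceiling from NVMe (7 GB/s)")
```

The gap between the last two lines is the "bottlenecked by storage" case: once the working set spills out of RAM, every token pays NVMe bandwidth instead.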
Oh sure it is! I’ve helped set up an AI cluster rack with four K2.5s.
With some custom tooling, we built our own local enterprise setup:
- Support ticketing system
- Custom chat support powered by our trained software-support model
- Resolved repository with detailed step-by-step instructions
- User-created reports and queries
- Natural language-driven report generation (my favorite: no more dragging filters into the builder; our (Secret) local model handles it for clients)
- In-application tools (C#/SQL/ASP.NET) to support users directly, since our software runs on-site and offline due to PII
- A cool repair tool: an import/export "support file packet patcher" that lets us push fixes live to all clients or target niche cases
Qwen3 with LoRA fine-tuning is also incredible — we’re already seeing great results training our own models.
There’s a growing group pushing K2.5s to run on consumer PCs (with 32GB RAM + at least 9GB VRAM) — and it’s looking very promising. If this works, we’ll be retooling everything: our apps and in-house programs. Exciting times ahead!
For anyone who has been watching Apple since the iPod commercials: Apple really, really operates in a grey area with the honesty of its marketing.
And not even diehard Apple fanboys deny this.
I genuinely feel bad for people who fall for their marketing thinking they will run LLMs. Oh well, I got scammed on runescape as a child when someone said they could trim my armor... Everyone needs to learn.
Yesterday I ran qwen3.5:27b on an M1 Max with 64 GB of RAM. I have even run Llama 70B, back when llama.cpp came out. These run sufficiently well, if somewhat slowly, but the improvements in the M5 Max should make for a much faster experience.
my mac mini m4 is getting to be a good substitute for claude for a lot of use cases. LM Studio + qwen3.5, tailscale, and an opencode CLI harness. It doesn't do well with super long context or complexity but it has gotten production quality code out for me this week (with some fairly detailed instructions/background).
I don't know that there would be a huge overlap between the people who would fall for this type of marketing and the people who want to run LLMs locally.
There definitely are some who fit into this category, but if they're buying the latest and greatest on a whim then they've likely got money to burn and you probably don't need to feel bad for them.
Reminds me of the saying: "A fool and his money are soon parted".
In retrospect, was there a better place to learn about the cruelty of the world than runescape? Must've got scammed thrice before I lost the youthful light in my eye
Musk is leading the build of the biggest objects we have ever sent to space. It does give him some sort of aura that is hard to dismantle, let's be honest.
He can do and say a lot of shit because he will still be viewed as real-life Iron Man, because in some ways he kind of is.
Somehow Tim Cook's years-long position that the Lightning port was very important to Apple, versus USB-C, fell as flat as a parsec-wide pancake.
(It didn't help that they couldn't point to a single user-facing feature.)
Or that the App Store lock-in is for our safety, when anyone who wanted that particular safety could choose to continue using their store exclusively.
Etc.
He just does not have it. No field. No spiraling eyes. Perhaps he should grow a beard and wave around a tobacco pipe. Works for some.
Your first 2 points make me extra bitter about COVID.
Fewer store hours. Higher prices. Inflation. People in school got a terrible education, and it affected my workforce. (But hey, 1% of people died, as predicted if we did nothing at all...)
It only reinforces the importance of competition over protectionism.
I used to be a walmart fan, but my local store is cheaper now. I didn't bother to look at prices until things were getting silly.
> But hey 1% of people died, as predicted if we did nothing at all
Nope. Compare the death rates of Sweden vs its neighbours in the Nordics (the closest comparisons we have with similar weather/culture/etc.). Or if you don't care about minimising variables, in the US between states that did lockdowns and mask mandates and those that didn't. In every comparable (e.g. excluding rural vs urban) case, there were more deaths in "doing nothing" than implementing the same basic public health axioms that have held true for centuries.
> Inflation
That was also helped by Russia invading Ukraine, which increased global prices of multiple important raw materials. But yes, inflation after a period of deflation/economic contraction/restricted travel and consumption was to be expected.
> People in school got a terrible education and it affected my workforce
It's definitely a bigger issue for them than it is for you. And yeah, it sucks for them. Would have been pretty terrible to tell teachers (who overwhelmingly skew older) they should risk their lives just to keep kids occupied too.
> It only reinforces the importance of competition over protectionism.
The thing too many forget is that if we didn't flatten the curve our entire medical system was going to collapse. It's insane that people don't yet understand this concept and can't even empathize with medical professionals. Yes, we all struggled, but try talking to medical professionals to see how they did.
When something doesn't happen because enough measures were taken, then it wasn't worth it because it didn't happen?
> The thing too many forget is that if we didn't flatten the curve our entire medical system was going to collapse
Yep, if things were going well there wouldn't have been makeshift morgues with refrigerated trucks, sick people having to be moved around to different countries, the military deploying field hospitals, corpses piling in the streets. Those examples are from a variety of countries, which shows how bad the situation was globally.
You had 6 weeks of staying at home, and then quarantines for international travellers after that. In return, you had no COVID-19 at all for several years. Seems a fair trade.
Norway had that too; without lockdown. Curfews would require a change in the constitution and the last time they happened was during WWII which makes them doubly unpopular.
Sweden all-cause mortality was indeed higher if an immediate pre-pandemic year is taken as a base. However, pre-pandemic years in Sweden show a substantial dip in all-cause mortality, something that neighboring countries did not see. It is not that simple.
I think it's just marketing, and the marketing is working. Look how many people bought Minis and ended up just paying for API calls anyway. (Saw it IRL 2x, see it on reddit openclaw daily)
I don't mind it, I own Apple stock. But I'm def not buying into their rebranding of the integrated GPU under the guise of Unified Memory.
> Look how many people bought Minis and ended up just paying for API calls anyway. (Saw it IRL 2x, see it on reddit openclaw daily)
Aren't the OpenClaw enjoyers buying Mac Minis because it's the cheapest thing which runs macOS, the only platform which can programmatically interface with iMessage and other Apple ecosystem stuff? It has nothing to do with the hardware really.
Still, buying a brand new Mac Mini for that purpose seems kind of pointless when a used M1 model would achieve the same thing.
It’s exactly that. They are buying the base model just for that. You are not going to do much local AI with those 16GB of ram anyway, it could be useful for small things but the main purpose of the Mini is being able to interact with the apple apps and services.
16GB should be enough for TTS/Voice models running locally no ? I was thinking about having a home assistant setup like that where the voice is local and the brain is API based
Sure, that’s why I said maybe it’s useful for a few things. But the main reason people were recommending the Mini was for its price (base model) and having access to the Apple services for clawdbot to leverage. Not precisely for local AI.
No one is buying a base model Mac for local LLM. Everyone is forgetting that PC prices have drastically increased due to RAM and SSD costs. Meanwhile, Macs had no such price change... at least for the models that didn't just drop today. Macs are just a good deal at the moment.
Yeah because Mac upgrade prices were already sky high, long before the component shortage. 32GB of DDR5-6000 for a PC rocketed from $100 to $500, while the cost of adding 16GB to a Mac was and still is $400.
I'm kind of curious how Apple's supply contracts actually work, because buying a Mac with a lot of RAM is currently more attractive, relative to a PC, than it usually is. If the deal is "we negotiated a price, and you supply as much RAM as we sell machines for," then the company supplying the RAM is getting soaked, because they're having to supply even more RAM to Apple at a below-market price.
But if the contract was for a specific amount of RAM and then people start coming to Apple more for high RAM machines, they're going to exhaust their contract sooner than usual and run out of cheap memory to buy. Then they have to decide if they want to lower their margins or raise the already-high price up to nosebleed levels.
Apple has accepted a 100% price increase for Samsung's LPDDR5X memory, with DRAM supply commitments secured only through the first half of 2026. Tim Cook acknowledged during the Q1 FY2026 earnings call that storage price increases would significantly impact Q2 gross margins. Apple is evaluating ChangXin Memory Technologies (CXMT) and Yangtze Memory Technologies (YMTC) as new supply sources, attempting to rebuild pricing leverage through supply-chain diversification.
Worse than that, they hold their value, so a used M1 Mini still costs a few hundred bucks, and saving $200-300 by buying a five-generation-older Mini seems like a bad deal in comparison.
Someone came to me excited that they got a "deal" on the newest Intel Mac Mini for hosting OpenClaw: the 8GB model for $300. I kind of regret bursting their bubble by telling them you can walk over to Costco (the nearest one at the time of discussion was walking distance) and pay $550 for one with an M4 and 16GB of RAM.
> Aren't the OpenClaw enjoyers buying Mac Minis because it's the cheapest thing which runs macOS
That's likely only part of the reason. The Mac Mini is now "cheap" because everything else exploded in price; RAM, SSDs, etc. have all gone up massively. Not to mention the Mac Mini is an easy out-of-the-box experience.
It's not cheap, though. Two weeks ago I bought a computer with a similar form factor (GMKtec G10). Worse CPU and GPU but same 16GB memory and a larger SSD for 40% the price of a base mac mini ($239 vs $599). It came with Windows preinstalled, but I immediately wiped that to install linux. Even a used (M-series) mac mini is substantially more expensive. It will cost me about an extra penny per day in electricity costs over a mac mini, but I won't be alive long enough for the mac mini to catch up on that metric.
I considered the mac mini at the time, but the mac mini only makes sense if you need the local processing power or the apple ecosystem integration. It's certainly not cheaper if you just need a small box to make API calls and do minimal local processing.
If you just need "a small box to make API calls and do minimal local processing" you can also just buy an RPi for a fraction of the price of the GMKtec G10.
All 3 serve a different purpose; just because you can buy a slower machine for less doesn't mean the price:performance of the M1 Mac Mini changes.
> you can also just buy an RPi for a fraction of the price of the GMKtec G10.
Sadly not really. The Pi 5 8gb canakit starter set, which feels like a more true price since it's including power supply, MicroSD card, and case, is now $210. The pi5 8gb by itself is $135.
A 16gb pi5 kit, to match just the RAM capacity to say nothing of the difference in storage {size, speed, quality} and networking, is then also an eye watering $300
>Sadly not really. The Pi 5 8gb canakit starter set, which feels like a more true price since it's including power supply, MicroSD card, and case, is now $210. The pi5 8gb by itself is $135.
Bro. The used M1 Minis and Studios are all gone. I was thinking of buying one for local AI before OpenClaw came out, and when I went back to look, the order book was near empty. Swappa is cleared out. eBay is to the point that M1 Studios are selling for at least a thousand more.
This arb you’re talking about doesn’t exist. An m1 studio with 64 gb was $1300 prior to openclaw. You’re not getting that today.
I would have preferred that too since I could Asahi it later. It’s just not cheap any more. The m4 is flat $500 at microcenter.
Why not? The integrated GPUs are quite powerful, and having access to 32+ GB of GPU memory is amazing. There's a reason people buy Macs for local LLM work. Nothing else on the market really beats it right now.
My M4 MacBook Pro for work just came a few weeks ago with 128 GB of RAM. Some simple voice customization started using 90GB. The unified memory value is there.
Jeff Geerling had a video of using 4 Mac Studios each with 512GB RAM connected by Thunderbolt. Each machine is around $10K so this isn't cheap but the performance is impressive.
It’s what a small business might have paid for an onprem web server a couple of decades ago before clouds caught on. I figure if a legal or medical practice saw value in LLMs it wouldn’t be a big deal to shove 50k into a closet
You would still have to do some pretty outstanding volume before that makes sense over choosing the "Enterprise" plan from OpenAI or Anthropic if data retention is the motivation.
Assuming, of course, that your legal team signs off on their assurance not to train on or store your data with said Enterprise plans.
But why? Spending several thousand dollars to run sub-par models when the break-even point could still be years away seems bizarre for any real usecase where your goal is productivity over novelty. Anyone who has used Codex or Opus can attest that the difference between those and a locally available model like Qwen or Codestral is night and day.
To be clear, I totally get the idea of running local LLMs for toy reasons. But in a business context the sell on a stack of Mac Pros seems misguided at best.
I ran the qwen 3.5 35b a3b q4 model locally on a ryzen server with 64k context window and 5-8 tokens a second.
It is the first local model I've tried that could reason properly, similar to Gemini 2.5 or Sonnet 3.5. I gave it some tools to call and asked Claude to order it around (download quotes, print charts, set up a GNOME extension); even Claude was sort of impressed that it could get the job done.
Point is, it is really close. It isn't opus 4.5 yet, but very promising given the size. Local is definitely getting there and even without GPUs.
But you're right, I see no reason to spend right now.
Getting Opus to call something local sounds interesting, since that's more or less what it's doing with Sonnet anyway if you're using Claude Code. How are you getting it to call out to local models? Skills? Or paying the API costs and using Pi?
I just start the llama.cpp server with the gguf, which creates an OpenAI-compatible endpoint.
The session so far is stored in a file like /tmp/s.json, as a messages array. Claude reads that file, appends its response/query, sends it to the API, and reads the response.
I simply wrapped this process in a python script and added tool calling as well. Tools run on the client side. If you have Claude, just paste this in :-)
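A minimal sketch of that loop, assuming llama.cpp's OpenAI-compatible server on its default port; the session path and message format follow the comment above, and everything else (function names, the example prompt) is illustrative:

```python
# Session-file + OpenAI-compatible endpoint loop. llama.cpp's llama-server
# exposes /v1/chat/completions; the session path follows the comment.
import json
import os
import urllib.request

SESSION = "/tmp/s.json"  # messages array shared with the orchestrating agent
ENDPOINT = "http://localhost:8080/v1/chat/completions"

def load_session(path=SESSION):
    """Read the messages array, or start fresh if the file doesn't exist."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return []

def append_message(messages, role, content, path=SESSION):
    """Append one message and persist the whole array back to disk."""
    messages.append({"role": role, "content": content})
    with open(path, "w") as f:
        json.dump(messages, f, indent=2)
    return messages

def ask_local_model(messages):
    # POST the whole conversation; the server answers in OpenAI format.
    body = json.dumps({"messages": messages}).encode()
    req = urllib.request.Request(ENDPOINT, data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Example (requires `llama-server -m model.gguf` running locally):
#   msgs = append_message(load_session(), "user", "Download the quotes.")
#   print(ask_local_model(msgs))
```

Because the session lives in a plain JSON file, any agent that can read and write files can participate in the conversation, which is what lets Claude "order it around".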
Assuming you ran the gamut up from what you could fit on 32 or 64GB previously, how noticeable is the difference between models you can run on that vs. the 512GB you have now?
I've been working my way up from a 3090 system and I've been surprised by how underwhelming even the finetunes are for complex coding tasks, once you've worked with Opus. Does it get better? As in, noticeably and not just "hallucinates a few minutes later than usual"?
I'm not really into AI and LLMs; I personally don't like anything they output. But the people I know who are into it, and into running their own local setups, are buying Studios and Minis for their at-home local LLM setups. I don't know anyone buying other computers and NVIDIA graphics cards for it anymore.
I think people buying those don't realize the requirements to run something as big as Opus; they think those gigabytes of memory on a Mac Studio/Mini are a lot, only to find out that it's "meh" in the context of LLMs. Plus, most buy it as a gateway into the Apple ecosystem for their Claws; iMessage, for example.
> But I'm def not buying into their rebranding of integrated GPU under the guise of Unified Memory.
But it is Unified Memory? Thanks to Intel iGPUs, the term has been tainted for a long time.
I've tried to use a local LLM on an M4 Pro machine and it's quite painful. Not surprised that people into LLMs would pay for tokens instead of trying to force their poor MacBooks to do it.
Local LLM inference is all about memory bandwidth, and an M4 pro only has about the same as a Strix Halo or DGX Spark. That's why the older ultras are popular with the local LLM crowd.
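The rule of thumb behind that: decode speed tops out near memory bandwidth divided by the (active) model bytes read per token. A quick sketch, with approximate public bandwidth specs used purely for illustration:

```python
# tok/s ceiling ≈ bandwidth / bytes read per token (the model weights).
# Bandwidth figures below are approximate public specs, for illustration only.

def decode_tok_s_ceiling(model_gb, bandwidth_gb_s):
    return bandwidth_gb_s / model_gb  # upper bound; ignores KV cache, overhead

model_gb = 8.0  # e.g. a ~14B model at 4-bit is roughly 7-8 GB of weights
for name, bw in [("M4 Pro (~273 GB/s)", 273),
                 ("Strix Halo (~256 GB/s)", 256),
                 ("M2 Ultra (~800 GB/s)", 800)]:
    print(f"{name}: ~{decode_tok_s_ceiling(model_gb, bw):.0f} tok/s ceiling")
```

That roughly 3x bandwidth gap is why the older Ultras punch above their weight for local inference, despite slower cores.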
This would be an absolute game changer for me. I am dictating this text right now on a local model, and I think this is the way to go. I want to have everything local. I'm not opposed to AI or LLMs in general, but I think that sending everything over the pond is a no-go. Even if it were European, I still wouldn't want to send everything to some data center. So I think it would be a good development, and I would even buy an Apple device for the first time since the iPod, just for that.
And while it is stupidly slow, you can run models off hard drive or swap space. You wouldn't do it normally, but it can be done to check an answer from one model against another.
Try a program called TG Pro; it lets you override fan settings. Apple likes to let your Mac burn in an inferno before the fans kick in. It gives me more consistent throughput. I have less RAM than you and I can run some smaller models just fine, with reasonable performance. GPT20b was one.
What models are you using? I've found that the SOTA Claudes outperform even GPT-5.2 so hard on this that it's cheaper to just use Sonnet: the number of output tokens needed to solve a problem is so much lower that TCO is lower. I'm in SF, where home power is 54¢/kWh.
Sonnet is so fast too. GPT-5.2 needs reasoning tuned up to get tool calling reliable, and Qwen3 Coder Next wasn't close. I haven't tried Qwen3.5-A3B, but I'm hearing rave reviews.
If you're successfully using some model, knowing that alone is very helpful to me.
iPhone makes you an easy target. Sorry, Bezos, security through obscurity was a bad idea... but you should have known better.