I have a bit of experience with using a desktop OS on a smartphone. In fact, you can try it very easily yourself! I used TeamViewer.
Of course, it runs on a separate device, but it gives us an idea of what it would be like to have it there natively.
I used this to make some music in FL Studio on my phone. Well, on my laptop, which I had set up in the office. So I controlled it via my phone.
I feel like mobile versions of stuff are just gimped versions of the real thing. I would almost go so far as to say that I am more productive using a desktop interface on a smartphone (than the smartphone equivalent of the same software), despite the ergonomic awkwardness.
From what I heard it was designed to prefer truth over political correctness. I don't use Grok or Twitter though so I cannot comment on whether that aim was achieved (or even seriously attempted).
I will however note that when I asked ChatGPT for an LLM prompt for truthfulness, it added "never use warm or encouraging language."
It would appear that empathy and truth are in conflict — or at least the machine thinks so!
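For reference, the overall shape of what it gave me was roughly this (reconstructed from memory and wired into the OpenAI Python SDK just to make it concrete; the model name and user question are arbitrary):

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # Reconstructed from memory -- the striking instruction is the last one.
    system_prompt = (
        "Prioritize factual accuracy over agreeableness. "
        "State uncertainty explicitly rather than guessing. "
        "Never use warm or encouraging language."
    )

    resp = client.chat.completions.create(
        model="gpt-4o",  # any chat model will do
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": "Honestly, is this a good idea?"},
        ],
    )
    print(resp.choices[0].message.content)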
You can just turn the AI off. I think that's a good idea to do regularly, in the same way it's good to have some time every day without screens and internet in your life.
I did some "trad coding" to see how much I'd atrophied, and I was startled at how difficult and unpleasant it was. I was just stuck and frustrated almost the whole time! But I persisted for 7 hours and was able to solve the problem.
Then I remembered, actually it was always like that! At least when doing something unfamiliar. That's just what programming feels like, but I had stopped being used to it because of the instant gratification of the magic "just fix my problem now" button.
In reality I had spent 7 hours in "learning mode", where the whole point is that you don't understand yet. (I was making progress almost the whole time, but each new situation was also unfamiliar!)
But if I had used AI, it would have eliminated the struggle, and given me the superficial feeling of understanding, like when you skim a textbook and think "yeah I know this part" because you recognize the page. But can you produce it? That's the only question that matters.
I think that's going to become a very important question going forward. Sure, you don't need to produce it right now. But it's mostly not for right now.
Just like you don't "need" to run and lift weights. But what happens if you stop?
Yes, exactly this. I'm using a 10-year-old EliteBook Folio G1. It's about 2 pounds and does what I need it to do. Available with a 4K screen if that's your thing. Doesn't feel sluggish. Given that I'm not gaming, video editing, or running local LLMs (and I think a big chunk of the population is in that camp), I feel like I'm missing out on nearly zero.
(And I'm not trying to say anything is special about the laptop I'm using. I adore the TrackPoint (so much that I brought my own TrackPoint keyboard in to work to use there), so I would gladly trade for an old ThinkPad if what I have didn't already do what I need it to do.)
In the middle of a gaming session, you stop thinking about graphics once they've reached a certain level of fidelity, and that level is far below RTX. Not worth the money, especially today.
Yeah they have more "common sense", though not as much as I'd like. I used to think Opus was big, but after using it a lot, I think it should actually be a lot bigger. The difference from Sonnet to Opus is really noticeable, but the difference from Opus to human (in common sense) is also massive. I expect that as the hardware improves, we'll see 3-10x bigger models become the default.
Small models are making great strides of course, and perhaps we will soon learn to distill common sense ;) but subtlety and nuance appear physically bound to parameter count...
> What is the cost of verifying the generated artifact meets requirements vs. a directly produced artifact? This is mostly a function of the task and the user, but also the generative model.
So this is the fun one for programming.
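One way to make the question concrete: handing work to a model only pays when generating plus verifying costs less than doing it directly. A toy decision rule (all numbers invented):

    # Costs measured in your own hours; the figures below are made up.
    def should_delegate(c_generate: float, c_verify: float, c_direct: float) -> bool:
        # c_generate: time spent prompting and waiting on the model
        # c_verify:   time spent reading and validating the output
        # c_direct:   time to just do the task yourself
        return c_generate + c_verify < c_direct

    print(should_delegate(0.2, 2.5, 1.5))  # False: verification eats the gain
    print(should_delegate(0.2, 0.3, 2.0))  # True: cheap to check, slow to write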
I let AI agents do some programming on my codebases, but then I had to spend more time catching up with their changes.
So first I was bored waiting for them to finish, and then I was confused and frustrated making sense of the result.
Whereas when I ask the AI for small things like "edit this function so it does this instead", and accept the changes manually, my mental model stays synced the whole time. And I can stay active and in flow.
(Also, for such fine-grained tasks, small fast cheap models are actually superior because they allow real-time usage. Even small latency makes a big difference.)
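To sketch what I mean by fine-grained (against the OpenAI Python SDK; the model choice and prompts are placeholders, not a recommendation):

    from openai import OpenAI

    client = OpenAI()

    def edit_function(source: str, instruction: str) -> str:
        """One function in, one function out: small enough to review at a glance."""
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # small and fast matters more than maximally smart here
            messages=[
                {"role": "system",
                 "content": "Rewrite the given function per the instruction. "
                            "Return only code, no prose."},
                {"role": "user", "content": f"{instruction}\n\n{source}"},
            ],
        )
        return resp.choices[0].message.content

    # e.g. edit_function(src, "edit this function so it returns a dict instead of a tuple")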
Yes, the more you let agents loose, the less you are in control and the more time you spend later cleaning up their mess.
It is tempting to let them loose after they've delivered unexpectedly good results for a while, but for me it is not worth it. Manually approve and actually read. (And manually edit CLAUDE.md etc. if necessary.)
This is exactly why I don't like those "swarm" approaches with 8 Claude Codes running in parallel. Every time I've tried it I instantly lose control and become out of touch with the codebase. The output simply arrives too fast and in too great a volume to follow, so I tune out and it becomes a 100% vibe-coded project.
We don't post-train current frontier models to pass the Turing test, but if we did, it wouldn't be much of a challenge for current models IMHO. It's a dead benchmark. It tests the humans, not the machines.
I remember reading that hallucination is still a problem even with perfect context. You could build a theoretically perfect RAG pipeline, give the LLM the exact correct information, and it would still make mistakes surprisingly often.
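As I understood it, "perfect RAG" here means skipping retrieval entirely and pasting the exact ground-truth passage into the prompt, so any remaining errors are pure generation failures. Something like this (illustrative sketch, made-up example fact):

    # "Perfect RAG": no retrieval step, just the exact ground truth in context.
    context = "Employees accrue 1.5 vacation days per month of service."
    question = "How many vacation days does an employee accrue per month?"
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context: {context}\n\nQuestion: {question}"
    )
    # Even with this setup, sampled answers reportedly still come back wrong
    # a surprising fraction of the time -- the failure is in generation,
    # not retrieval.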