I think COVID times ended that habit quickly around my city. I used to lick my finger to open plastic bags at the supermarket, and now I try to find something wet instead - usually the alcohol dispenser.
(I'm sure it was always a bit nasty, but when it became a deadly move, my habits finally changed...)
And that's only if by "uses AI" you mean it has AI inside the hardware, when really it's a network-enabled thin client that calls out to a server wrapping GPT calls.
Maybe because we're all exhausted by these kinds of demos. I'll be interested in AI when I can integrate it with a large codebase and it provides any benefit. I'm happy for you that you can stand up little toy apps that pull down an NPM library and call it, but it's not useful in my professional life.
If you have one of those larger codebases you aren't afraid of Cursor Co. getting a copy of, I would really try out Cursor. The indexing of the medium-sized mono-repo I've been working in is pretty flawless (it contains code for a static web page, a single-page web app, and some Python services, along with deployment configs).
That's not a medium-sized repo; that's a baby you can memorize entirely in your own head. Cursor is also dreadful at anything that isn't JavaScript or Python.
There's absolutely no way I'd share my codebase with some fly-by-nighter ~crypto~ AI company, but the codebase I work on is upwards of 60 million lines of code so I doubt any "AI" solution would come close to being useful.
At 60 Mloc? Sure, for now I'd agree. An AI needs to use other tools to handle that kind of size; it doesn't (practically) work so well if you try to hold the whole thing in context.
And while tool use is being worked on, the results I've seen are at the "that's an interesting tech demo" level, rather than the mind-blowing change of when InstructGPT first demonstrated that a language model could generate any meaningful code at all from natural-language instructions.
I had a pretty cool experience with that the other day. I wrote some production code (the LLM had no idea what was going on), then I measured the coverage and determined a test case that would increase it (again, just using my brain). BUT when I typed "testEmptyString" or whatever, the LLM filled in the rest of the test. Not a massive change to the way I work, but it certainly saved me a bunch of time.
I swear half the people in this thread spent 5 minutes with the first ChatGPT, pre-3.5, wrote it off, and are so convinced of their superiority that they won't spend the time required to even see where it's at now.
Ever seen someone really bad at googling? It's the exact same thing with LLMs (for now). They're not magic crystal balls, and they certainly can't read everyone's minds at the same time. But give them enough context, and they'll surprise you.
Sshhh, we've still got like a year of advantage over the folks that haven't learned about searching the Internet and still have to drive to their local university library...don't squander it!
Engineers are also notorious for having hit and miss soft skills. The interface for LLMs is natural language. I wouldn’t be surprised if much of the variance in usefulness boils down to how effectively you can communicate what you want it to do.
Same. I've felt LLMs for coding have saved me mechanical time, but as of now anything slightly more complex just makes me waste more time figuring stuff out.
Other than the automation aspect, it is a pretty good alternative to in-depth googling.
>BTW, why the disparaging reference to "little toy apps"?
It's an unmaintainable, single-use piece of software (that doesn't even implement the features; it just glues together already-existing code) that any CS student could write in a week. Congrats on getting a really fast CS student, I guess? Not to mention that perfectly viable, better alternatives are available in many places.
It's like me nailing two 2x4s together to make a shelf. Yeah, sure, I made it myself and I didn't need any woodworking knowledge, but let's just hope I don't put grandma's heavy china on it.
As a professional 2x4 nailer and gluer, I assure you that I have a ton more deliverables for my client. The downside is that now I actually have to put some thought into my work; you know, put some engineering work into it.
The upside is that I can produce a shit-ton of one-shot code in record time, so I've got time to face the downside.
I had Claude write me a Bash script for running prompts (and images) against Google Gemini this morning - I really like it for Bash, because I never committed any of the Bash idioms to memory myself. https://til.simonwillison.net/llms/prompt-gemini
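For anyone curious what that kind of script looks like, here is a minimal sketch. The model name and REST endpoint shape are assumptions based on Google's public API docs, and this is not the script from the linked TIL - verify against the current Gemini documentation before relying on it:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: send a text prompt to the Gemini REST API.
# MODEL and the endpoint URL are assumptions; check current Google docs.
set -euo pipefail

MODEL="gemini-1.5-flash"

# Build a minimal JSON request body. Note: no JSON escaping is done,
# so prompts containing quotes or backslashes will break this.
build_payload() {
  printf '{"contents":[{"parts":[{"text":"%s"}]}]}' "$1"
}

# POST the prompt; expects GEMINI_API_KEY in the environment.
run_prompt() {
  curl -s "https://generativelanguage.googleapis.com/v1beta/models/${MODEL}:generateContent?key=${GEMINI_API_KEY}" \
    -H 'Content-Type: application/json' \
    -d "$(build_payload "$1")"
}
```

With an API key exported, `run_prompt "Explain bash arrays in one sentence"` prints the raw JSON response; in practice you'd pipe it through jq to pull out the generated text.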
How would you refer to the example apps in the OP's link? They are almost definitionally toy apps, and definitely little (a handful of pages of code including all of the HTML).
As a programmer (which is the prerequisite to build such tools, even with LLMs), I have a plethora of tools for these tasks; what I choose and how much time I invest in it depends on something like this chart, but with an added dimension: interest.
Take for example the URL extraction. For a single occasion, I'd probably use Vim macros to do it quickly. If it were many pages, I'd write a script. If it were infrequent but recurrent, I'd take the time to write a better script, and I'd only write a web page if the use case were shared with other people or if I wanted a cross-platform solution.
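To illustrate the "quick script" tier: a throwaway URL extractor over saved pages can be this small (the regex is deliberately naive - it ignores trailing punctuation and HTML entities - so tighten it for real use):

```shell
#!/usr/bin/env bash
# Hypothetical throwaway script: extract unique URLs from saved pages.
# The regex is naive on purpose; a serious version needs more care.
extract_urls() {
  grep -Eoh 'https?://[^"<> ]+' "$@" | sort -u
}
```

Called as `extract_urls page1.html page2.html`, it prints each distinct URL on its own line.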
I believe the first question one should ask before building is why. That leads you to find a better UX than shoehorning everything inside a web app.
I 100% agree with the point you are making. The only aspect that obscures it is that paying employers will happily pay to support interests that are, on inspection, a waste of time.
In that aspect, I am hopeful. Maybe if "waste of time" activities are commoditized, "professionals" can instead focus on "what is important," whatever that might be.
I don’t mind if my employer buys a subscription. But my personal motto is that if I have to do something multiple times, it should take less and less time each time. I reuse code heavily (which is why I learned vim, as it makes that fluid). And that’s where LLMs become useless, because they need the entire context to generate something, which means I have to type it all out for them in addition to what I want. The whole thing becomes a drag. Maybe Cursor and the like could help, but the code is only half the story; there are things like protocols, message formats, specs,…
What LLMs promise is endless drag. I try to structure my work to ensure that the final velocity is high.
I copy and paste code examples into LLMs all the time. They're extremely good at figuring out which parts of the context are relevant, so I don't find myself needing to do any editing at all - I find the right example, paste it in and add my prompt at the end.
This app for example - which runs OCR against PDF files entirely in the browser - was assembled by pasting in an example of PDF.js usage and an example of Tesseract.js usage and having it figure out the rest: https://simonwillison.net/2024/Mar/30/ocr-pdfs-images/
While I applaud the result, it's again one of those things that would be a quick script if it were only for my personal usage, because I wouldn't bother making it user-friendly when I'm the single user.
The kind of project I work on is more like this: build an Android app for a quiz game. The quiz takes a list of random questions from a set. Each set is a package that can be installed and upgraded when online. While the app is free, there is an activation code required to download the main packages. The app should work offline except for the activation and package downloads. It also should notify when a new version of a package is ready. etc...
I don't know if LLMs could have helped me at the time (pre-2020), but I doubt it. Not because the code was complex, but mostly because of how cohesive the whole thing had to be, while taking care that the parts weren't tightly coupled and stayed maintainable by a single person. The IDE was a great helper once I had the design and the architecture outlined, mostly because it was deterministic and I already knew what the end result should be.
GPT-3 came out in early 2020; prior to that we had just GPT-2, which was mildly interesting at best but not something that could generate usable code.
The best current LLMs - GPT-4o, Gemini 1.5 Pro, Claude 3.5 Sonnet - are just about at the point now where I'd expect them to get a useful chunk of your Android spec done. Which is pretty wild!
Perhaps people want to read other blogs once in a while. We are at a stage of AI glut. If people pump out so much content in such a short time, no one can read it all (or is interested in reading it).
Apart from chance, HN seems rather time-dependent: those posts seem to be posted early in the day (I assume UTC from the timestamps), when the mostly US-American visitors don't yet seem to be in slacking-off mode.
I bought a motorcycle and the high lasted a few minutes. I felt like I was floating. I had to go for 3 weeks without and then was able to get another mini high but in general not a great purchase.
The main quality I remember Emacs having compared to something like VSCode is its ease of extensibility - but I don't know how to balance the liveness/messiness of a personal Emacs setup with wanting to package some of those bits for third parties. I think there's a tension there...
In my case I "lend" my personal device for work (Git, Slack, Figma, Miro... use one Chrome for work and Chrome Beta for personal). So I suppose there's no software running behind the scenes. Should I still worry in this case?