Adding to what another user said here, when using LLMs for heavy prototyping, I also commit frequently and have a document outlining the opinions contained in the repo.
One thing I've also been doing is keeping a "template" for fullstack applications that's configured out-of-the-box with tech I like, has linters and hooks set up, and is structured in a way that's natural to me. This makes it a lot more likely that the generated code will follow my preferred style. It's also not uncommon that on the first iteration I get some code that's not what I'd like to have in my codebase, and I just say "refactor X in this and this way, see <file> for an example", at which point I mostly get code that I'm happy with.
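To give an idea of what I mean by hooks: a minimal sketch of the pre-commit setup (here assuming husky + lint-staged, which is just one common choice, not necessarily what belongs in your template) is only a couple of lines:

```sh
# .husky/pre-commit: run the configured linters/formatters on staged files only
# (assumes husky + lint-staged; any hook runner works the same way)
npx lint-staged
```

The actual eslint/prettier commands then live in a "lint-staged" section of package.json, so LLM-generated code gets checked before every commit no matter how it was produced.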
Often the AI makes a recommendation that looks good, but has consequences 4 or 5 changes later that you'll need to reckon with. You'll see this especially with web vs. mobile UI behaviours. Cursor isn't so good at reverting changes yet, and Git has saved me a LOT of headaches.
2. Collaborate with o1 to create a PRD (product requirements document) and put it in the same folder as your specs.
This increases the likelihood Cursor's recommended changes comply with your intent, style and color guidelines and overall project direction.
If you're using Cursor Composer it's very easy to jump back at any time in the history using "Revert all files to before this message". I use it all the time and have stopped obsessively committing every change as a result.
This is an interesting use case for LLMs. At first, I thought “why complete an MVP if you don’t have time to iterate it?”.
But MVPs are a good testbed for determining if something is worth iterating.
I build quick prototypes for validation and LLMs have really helped decrease the effort. Here is an example of a site which took me 2 weeks with help from ChatGPT. https://post.withlattice.com/
It was enough to get interior designers to start using it, provide feedback, and validate if it’s worth additional time investment.
My hypothesis is that 2025 will bring forth a lot of indie products which focus on doing one thing well and can be supported by a small or fractional team.
I look forward to this break from VC-funded and big tech consumer products!
Since I started using all flavors of Gemini 2.0 I haven't been back on Cursor.
Gemini Exp 1206 and the Gemini 2.0 Thinking model exceed o1 in my experience, and the best part is that they're free, with an insane context size.
For an agentic coding experience RooCline is quite good, but I actually do most of my work in AI Studio and like to create the folders and files manually.
I almost think agentic code generation is the wrong step forward. Without knowledge transfer/learning involved during the code generation phase, you end up cornering yourself into unfixable/unknowable bugs.
You can see the examples on the linked site are quite simple. These are hardly real-world use cases. Most of us aren't generating things from scratch, but rather trying to gain understanding of systems being worked on by multiple people, and very carefully inspecting and understanding the output from GPT.
This is why a lot of tools like Bolt, Lovable, Cursor, and Windsurf will be toys aimed at people looking to use codegen as a toy rather than a tool. It will serve you better to treat AI codegen as a knowledge transfer/exploration tool; treating it as an autonomous agent isn't wise imho.
I have been using o1 in ChatGPT to make and work on side projects where I don't know the language (Swift), but I started to understand enough that I'm not using o1 as often, and it has really helped me learn Swift, which is a great side effect. There are many gotchas, though; the biggest is that o1 gets visionOS and ARKit confused and the result won't compile.
For what it's worth, good system prompting in Cursor makes an enormous difference; currently I use:
You're the best fullstack engineer and LLM engineer, prefer using Vite + React, TypeScript, Shadcn + Tailwind (unless otherwise specified).
CODE ORGANIZATION:
All code should be grouped by their product feature.
/src/components should only house reusable component library components.
/src/features/ should then have subfolders, one per feature, such as "auth", "inventory", "core", etc. depending on what is relevant to the app.
Each feature folder should contain code/components/hooks/types/constants/contexts relevant to that feature.
COMPONENTS:
Break down React apps into small composable components and hooks.
If there is a components folder (eg. src/components/ui/*), try to use those components whenever possible instead of writing from scratch.
CODE STYLE:
Clear is better than clever. Make code as simple as possible.
Write small functions, and split big components into small ones (modular code is good).
Do not nest state deeply; prefer state data structures that are easy to modify.
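To make the folder rules concrete, a hypothetical layout that follows them might look like this (the feature and file names are just placeholders):

```
src/
  components/
    ui/                # reusable component library pieces only (buttons, dialogs, ...)
  features/
    auth/
      components/
      hooks/
      types.ts
      constants.ts
    inventory/
      components/
      hooks/
      context.tsx
    core/
      components/
      hooks/
```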
I've been experimenting with having two editors open. If I'm strictly editing, I live in Helix which is fast and comfortable and distraction free, but when I could use some AI help I save the file in Helix and reload it in Cursor. When that's done, I save it in Cursor and reload it in Helix and carry on.
It's a lot to manage, but I find it better than trying to move my whole workflow into VSCode when really I only want the AI features, and then only 30% of the time.
I usually select functions or classes and workshop them by having a conversation in the composer tab. I find the autocomplete kind of dizzying to work with so I've disabled it. It's just kind of... flickery.
This is a me problem. For instance, I've also disabled animation in Slack because I can't focus on talking to people over the "noise" of all of their animated GIFs.
Mostly I just stay out of Cursor unless I'm conversing with an AI because I find the interface sluggish and awkward to use without a mouse (not Cursor's fault, that's a VSCodium/electron thing) and full of little bugs that end up entangling me in politics between plugin maintainers and VSCode people.
I like chasing down bugs and helping people fix them, but that ecosystem is just a bit too busy for me to feel like those efforts are likely to come back at me looking like a return on my investment. If I'm going to be finding and helping fix bugs (which is what I do with most of the tools I use), I want that effort to be in a place that's as open and free as possible--and not coupled to anybody's corporate strategy (these days that's wezterm, zellij, nushell, and helix). I have no animosity for Cursor, and only some animosity for Microsoft, but since I can't help but get involved with the development of my tools, I'd rather get involved elsewhere.
So it's really all about managing the scopes within which I'll let myself be distracted, and not at all about AI.
Using it for a side project right now, and since last year it has really become an inseparable part of my work, while in '23 I restricted it to writing emails, docstrings and unit tests. Now I use it for a hell of a lot more. I am working on something cool and I feel kinda dumbfounded that Copilot/Claude etc. know about the various SDKs I plan on using without me having to go through their docs. I am trying as much as possible to keep the code of high quality, but it feels like cheating, and at the same time I know that if I don't use it, I will be left behind.
For example, I wrote my personal blog in Angular+Scully, thought about migrating to Gatsby due to threads here and on Reddit, and it took me one actual weekend worth of coding to get it migrated to Gatsby, though I am ironing out the kinks where I can.
I love this level of prompt/response iteration and I think the most remarkable part is how vanilla, open ended, ambiguous, and simple all the prompts are.
That leads me to believe that the "manager" task itself may be one of the most trivial to automate.
Adding a few tips I have after playing for some time with Cursor.
1. Do commit frequently; I recommend a commit for every big change (a new feature such as a new button added for some function, or a big UI change like a new component that shifts the layout a bit).
2. Test everything (regression testing) after a big change with Cursor; adding new features may affect existing functionality you'd expect to be untouched.
3. If you're adding a complete new feature that involves changes in multiple places (frontend, backend, etc.), use Composer instead of Chat.
4. When working on UI changes (frontend), be prepared for some back and forth. One recommendation is to write the skeleton yourself, then ask Cursor to fill in the missing components.
5. When giving instructions, using bullet points and stating the requirements clearly (write step-by-step instructions) gives better results; see the example after this list.
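For example, a prompt in that style (the feature and names here are purely illustrative) might read:

```
Add an "Export to CSV" button to the inventory page:
- Place the button next to the existing search box
- Clicking it should download the currently filtered rows as inventory.csv
- Reuse the existing Button component instead of writing a new one
- Do not change any other part of the layout
```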
In my .cursorrules I have instructions that after every testable change it should ask me to test it, and if I say it's good it should commit.
So I ask for some feature, it implements, I test and either give feedback or simply say "good" or "that works" in which case it'll produce a commit command I can just press Run on. Makes it very easy to commit constantly.
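Paraphrased, the relevant instructions boil down to something like this (the exact wording doesn't matter much):

```
After every testable change, stop and ask me to test it before doing anything else.
If I reply "good" or "that works", output a git commit command with a short
descriptive message so I can run it; otherwise, act on my feedback first.
```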
Also I use Composer exclusively over Chat and have zero hesitation to hit those restore buttons if anything went down the wrong path.
I just use both because it allows me to have separate cursorrules and windsurfrules files, one for backend and the other for frontend.
They have different failure modes but are largely similar. Cursor seems to be better at indexing and including docs, but that's a qualitative impression rather than a rigorous evaluation.
Both are rather useless in my mixed C++/Rust codebase but work pretty well for traditional web stuff. I’d say at this point that Cursor has an edge but there’s so much development going on that this comment will be outdated within a few weeks.
I think being popular is not the same as having loud voices dropping the name every now and then. And here we are, less than six months later, with a new name to jump on.
Good editors/IDEs tend to stick around for much longer and generally have a clear statement of how they’re different/better. Just my view of course.
I'm not sure if they have fixed it but Windsurf downgraded significantly over the last month. Check out the Codeium subreddit. Many people have given up on it and are switching to cursor. Kind of a shame because Windsurf really did feel like magic when it launched.
You would think with the money being invested into Anthropic they would do better providing access to their model.
I ended up cancelling my Claude subscription because I was constantly getting timed out, and I think I was getting pushed to a quantized or smaller model that sucked when I used it too much.
Also their artifact UI sucks and doesn't update the code most of the time. It had apparently been broken for a while when I searched to see if other people had the issue.
I guess this is as good a place to ask this as any. Is there any good tooling for integrating LLMs into classic Unix developer workflows? So not an IDE plugin, not a new IDE, not copy-pasting code into chatbots, but something that composes with the command line and arbitrary editors.
Basically, what I'd like is to never copy-paste code to/from the LLM but for it to happen automatically when I ask the LLM for a change. And I'd like an interaction model where all the changes are applied directly to the workspace and I can then use whatever tools I like to review them before committing. (But I guess anything that achieves the same effect would be fine as well, e.g. the changes being automatically git-committed, in which case the outcome of a failed code review is a revert, or the changes being provided as a unified diff.)
Thanks, that's definitely directionally correct. (And if it turns out to not be quite what I'm looking for, it's probably enough to find alternatives.)
I think you're looking for simonw's llm Python tool [1]. It allows you to do things like `cat script.py | llm 'describe what this python script does'`.
You can use LLM APIs, local models, multimodal models, etc. It is great!
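A few of the patterns that make it compose well with ordinary Unix workflows (flags may have changed since, so check `llm --help`):

```sh
# pipe a file in and ask about it
cat script.py | llm 'describe what this python script does'

# pick a specific model
cat script.py | llm -m gpt-4o-mini 'suggest improvements to this code'

# put the instructions in a system prompt and pipe the code in as the input
cat script.py | llm -s 'You are a strict Python reviewer. List concrete issues.'

# continue the most recent conversation
llm -c 'now show the fix as a unified diff'
```

Output is plain text on stdout, so it redirects and pipes like any other Unix tool.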
In my experience, it's worth keeping source files as small as possible. Windsurf regularly has problems with files that are larger than 300 lines, especially if they are not JS/TS files.
I have a question about LLM tooling that I would love to hear others' thoughts on.
So I'm currently using LLM-lite tools such as ChatGPT and simple Copilot completions. These tools obviously provide utility. I have yet to play with more advanced tooling or models beyond these. I'm currently experiencing absolutely no pressure at my job to "up my game," and my hobby programming is mostly done in very niche languages (they contribute little to my resume whether or not I finish them).
Given this situation, I've decided to stop using LLM tooling altogether, with the assumptions that 1) the tooling in 2+ years will be completely different anyway, so I'm not losing out by not familiarizing myself with the current "hot" tools, and 2) if I ever need to get up to speed with the latest LLM tooling, it's something one should be able to pick up pretty easily, especially with improved models (people seem to have mastered the current state of affairs in a matter of months, so in the future it should be even easier). I know some people will reply that I'm just wasting time when I could automate stuff, but in reply, I find coding tasks fun (I know, old fashioned) and I feel I otherwise have a good work-life balance.
Do these assumptions hold up? Are there any perspectives I'm missing?
I am in a similar situation. This morning I decided to download Cursor and write a web app from scratch to help with the piano lessons I'm taking. I finished the app by noon.
It greatly reduced the bottleneck of bringing what was in my mind to reality. While it was nothing very complex, it surprised me how much time I could save by just typing literally what I wanted.
I recommend trying it just for fun and to learn something new. Maybe you'll decide to incorporate it into your day-to-day from a quality-of-life rather than a productivity standpoint.
I would give it a try it on my day-job, but I am still freaked out by the privacy aspect of it.
I find LLMs helpful for starting projects, but I don't see how they are helpful for finishing projects. Finishing is about cutting scope and reconsidering the goals based on discovered constraints, which is more analytical work than creative (generative) work.
I prefer VS Code + Sourcegraph's Cody. Cody has access to the entire codebase of a project, but it doesn't "know" or load all the code into memory at once. Instead, it uses Sourcegraph's indexing and search capabilities to retrieve relevant parts of the codebase as needed. This means full project access.
Whenever I try to navigate a new project I ask Cody to give me an overview. Writing and modifying code with it also goes well beyond Cursor's capabilities, because it can "understand" the context.
Yeah after reading this article I'm vastly not impressed with these "projects" unless you think that doing things just to do things is important. Web assembly? Uhhh yeah no thanks. Ohhh json formatter. Not like there are probably 10 of those already. How about rewriting tensorflow in java?
The point is that once you throw a moderately complex problem at it, you'll typically get something that is at least subtly wrong. So now, instead of having fun coding, you're debugging hundreds of lines of "someone else's" code.
Heavily disagree; with guidance, these AI tools can solve problems that otherwise require in-depth expertise. Someone ported the HD version of my game[0] to MacOS using GPT with no prior graphics knowledge.
Exactly. If you just throw things at it, it will fail. You have to break down the complex problems into sets of simple ones (for now). Then it can shine.
Sure, and at no point did the author say "look at how this thing solves complex problems for me". They wrote about how they use these tools to build neat side projects that they otherwise wouldn't have time to build. That's worthwhile.
You must have low standards then. This is something that anyone can do. It's like making a hammer from scratch with iron you dug up from an old mine. I mean how innovative is that really?
Just as a starting point, the majority of developers likely haven't tried running jq using WebAssembly. Heck most developers probably still don't even know what jq is.
You seem to be hung up on "This is something that anyone can do" and "how innovative is that really?" - that's NOT the point of this. This isn't about building innovative things that nobody else could build, it's about being able to churn out small, useful projects more productively thanks to assistance from LLMs.