Can this take vague ideas, do iterative design with me, and break down tasks to then pass off to agents to build?
I was playing with a very similar project recently that was more focused on a high level input ("Build a new whatever dashboard, <more braindump>") and went back and forth with an agent to clarify and refine. Then broke down into Epics/Stories/Tasks, and then handed those off automatically to build.
The workflow then is iterating on those high level requests. Heavily inspired by the dark factory posts that have been making the rounds recently.
At a glance, it seems like this is designed so that I write all the tasks myself? Does it have any sort of coordination layer to manage git, or otherwise keep agents from stepping on each other?
I've been working on a similar project https://github.com/BumpyClock/tasque . It tracks tasks (Epics, tasks, subtasks) with deps between them. So I plan for an hour or so, and then when I walk away from my desk the agents have their tasks queued up to code, and I can come back later and verify.
Edit: minor note, one additional thing that is in the skill that the tool installs is to direct the agent to create follow-up tasks for any bugs or refactor opportunities that it encounters. I find this lets the agent scratch that itch: if they see something, instead of getting sidetracked and doing that thing, they create a follow-up task that I can review later, and they can move on.
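The dependency tracking described above can be sketched roughly like this (a toy model, not tasque's actual schema or API): an agent only picks up a task once everything it depends on is done.

```ruby
# Toy sketch: tasks with dependencies, and the set an agent may start now.
Task = Struct.new(:id, :deps, :done)

# A task is "ready" when it isn't done yet and all of its deps are done.
def ready_tasks(tasks)
  done_ids = tasks.select(&:done).map(&:id)
  tasks.reject(&:done).select { |t| (t.deps - done_ids).empty? }
end

tasks = [
  Task.new(:schema, [],        true),   # already finished
  Task.new(:api,    [:schema], false),  # unblocked
  Task.new(:ui,     [:api],    false),  # still blocked on :api
]

puts ready_tasks(tasks).map(&:id)  # prints: api
```

Follow-up tasks from the agents just get appended to the same list with whatever deps they block on.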
I have been using Superset (https://superset.sh/) and it has worked really well to automate creating & deleting worktrees, with their own terminals, and keeping everything organized. Great for running work in parallel.
It's really just a terminal emulator w/ a bunch of extra helpers to make coding agents work well. Which I really like, since it doesn't try to wrap claude or codex in its own UI or anything tricky.
I agree. I have 'written' a handful of rubocop rules that are hyper-specific to the codebase I work on. I never would have bothered before claude code. Stuff like using our custom logger correctly, or not using Rails.env because we have our own (weird, of course) env system.
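A real cop needs the RuboCop plumbing (AST node matchers, `add_offense`, a spec), but the spirit of one of these codebase-specific rules can be sketched in a few lines of plain Ruby. The `AppEnv` helper name here is made up for illustration:

```ruby
# Toy sketch of a codebase-specific lint rule: flag direct Rails.env use
# so people reach for a (hypothetical) AppEnv helper instead.
# A real RuboCop cop would match AST nodes rather than raw lines.
def rails_env_offenses(source)
  source.each_line.with_index(1).filter_map do |line, lineno|
    "line #{lineno}: use AppEnv instead of Rails.env" if line.include?('Rails.env')
  end
end

snippet = <<~RUBY
  return if Rails.env.test?
  AppEnv.staging? && warm_caches
RUBY

puts rails_env_offenses(snippet)  # prints: line 1: use AppEnv instead of Rails.env
```

The nice part of doing it as a proper cop instead is that `rubocop -a` plus a custom autocorrect can fix old call sites for you.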
so apparently they have custom hardware: absolutely gigantic chips, at the scale of a whole wafer each. Presumably they keep the entire model right on chip, in what is effectively L3 cache or whatever. So the memory bandwidth is absurdly fast, allowing very fast inference.
It's more expensive than getting the same raw compute from a cluster of Nvidia chips, but those clusters can't match this peak per-stream throughput.
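The bandwidth point can be made with back-of-envelope numbers (all figures below are illustrative assumptions, not vendor specs): decoding for a memory-bandwidth-bound model runs at roughly bandwidth divided by model size, since every generated token streams through every weight.

```ruby
# Back-of-envelope decode speed for a bandwidth-bound model:
# tokens/sec ~= memory bandwidth / bytes of weights read per token.
# Numbers are illustrative assumptions, not measured specs.
def toks_per_sec(bandwidth_bytes_per_s, model_bytes)
  bandwidth_bytes_per_s / model_bytes.to_f
end

model = 70e9 * 2   # a 70B-param model at 2 bytes/param
hbm   = 3e12       # ~3 TB/s, a single HBM GPU (illustrative)
sram  = 2e16       # tens of PB/s of on-wafer SRAM (illustrative)

puts toks_per_sec(hbm,  model).round   # ~21 tok/s
puts toks_per_sec(sram, model).round   # ~142857 tok/s
```

Real systems batch requests, shard across chips, and cache activations, so these aren't throughput predictions, just a feel for why on-chip memory changes single-stream speed.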
As far as price for a coder goes, I'm giving a month of the $50 plan a shot. I haven't figured out how to adapt my workflow to the faster speeds yet (I'm also still learning and setting up opencode).
For $50/month, it's a non-starter. I hope they can find a way to use all this excess bandwidth to put out a $10 equivalent to Claude Code instead of a 1000 tok/s party trick I can't use properly.
GLM-4.6 is on par with Sonnet 4.5. Sometimes it is better, sometimes it is worse. Give it a shot. It's the only model that made me (almost) ditch Claude. The only problem is, Claude Code is still the best agentic program in town and search doesn't function without a proper subscription.
Cerebras offers pay-per-token. What are you asking for? Claude Code starts at $100, or $15/mtok. Cerebras is already much cheaper, but you want it to be even cheaper at $10?
Yes this is the output speed. Code just flashes onto the page, it's pretty impressive.
They've claimed repeatedly in their discord that they don't quantize models.
The speed does change how you interact with it, I think. I had this new GLM model hooked up to opencode as the harness with their $50/mo subscription plan. It was seriously fast at answering questions, although there are still big pauses in the workflow when the per-minute request cap is hit.
I got a meaningful refactor done, maybe a touch faster than I would have in claude code + sonnet? But my human interaction with it felt like the slow part.
I find AI is most useful at the ancillary extra stuff. Things that I'd never get to myself. Little scripts of course, but more like "it'd be nice to rename this entire feature / db table / tests to better match the words that the business has started to use to discuss it".
In the past, that much nitpicky detail just wouldn't have gotten done, my time would have been spent on actual features. But what I just described was a 30 minute background thing in claude code. Worked 95%, and needed just one reminder tweak to make it deployable.
The actual work I do is too deep in business knowledge to be AI coded directly, but I do use it to write tests to cover various edge cases, trace current usage of existing code, and so on. I also find AI code reviews really useful to catch 'dumb errors' - nil errors, type mismatches, style mismatch with existing code, and so on. It's in addition to human code reviews, but easy to run on every PR.
Wow, 30 minutes to rename functions and tests? I wonder how much energy and water that LLM wasted for something that any LSP-supporting editor can do in a second.
Settings > Extensions > Git
Blame: Status Bar Item Enabled (check this)
Blame: Status Bar Item Template (use this value)
${authorName} (${authorDate}) ${subject}
There's no secret sauce, all these variables are shown right above the input.
weird - I had never heard of Jellycat as a brand, but last night I went to go buy a second copy of my toddler's favorite lovey (since the first will inevitably get ruined at some point... thinking ahead). And it was Jellycat brand; must have been gifted to us.
* I like how the map moves around. It helps nail down relationships to neighbors
* I don't mind a few extra "Where's canada", even though it's not that useful
* I'd like the pause between answers to be shorter.
* Small countries are impossible to see when zoomed out on first exposure, even when I select the right spot. I find myself knowing the area it's in (e.g., Central America) but not which exact country, so clicking the right place while zoomed out doesn't register as the correct answer.