* build a codegen for Idris2 and a rust RT (a parallel stack "typed" VM)
* a full application in Elm, while asking it to borrow from DT to have it "correct-by-construction", use zippers for some data structures… etc. And it worked!
* Whilst at it, I built Elm but in Idris2, while improving on the rendering part (this is WIP)
* data collators and iterators to handle some ML trainings with pausing features so that I can just Ctrl-C and continue if needed/possible/makes sense.
* etc.
In the end I had to completely rewrite some parts, but I would say 90% of the boring work was done correctly and I only had to focus on the interesting bits.
However, it didn’t do the kind of thorough prep work a painter would do when asked to paint a house. It simply did exactly what I asked: it did the painting and no more.
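To make the "Ctrl-C and continue" item above a bit more concrete, here is a minimal sketch of a resumable iterator. It is not the code from those sessions; the names `ResumableIterator`, `train`, `state_path` and `step` are made up for illustration, and it assumes the dataset is an in-memory list and that remembering an integer position is enough to resume.

```python
import json
import os
import tempfile


class ResumableIterator:
    """Iterate over a list of samples, persisting the position on disk."""

    def __init__(self, samples, state_path):
        self.samples = samples
        self.state_path = state_path
        self.position = 0
        # Pick up where a previous run stopped, if a checkpoint exists.
        if os.path.exists(state_path):
            with open(state_path) as f:
                self.position = json.load(f)["position"]

    def save(self):
        # Write to a temp file and rename it, so an interrupt in the
        # middle of a write cannot leave a corrupt checkpoint behind.
        directory = os.path.dirname(os.path.abspath(self.state_path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump({"position": self.position}, f)
        os.replace(tmp, self.state_path)

    def __iter__(self):
        while self.position < len(self.samples):
            yield self.samples[self.position]
            # Reached only when the consumer asks for the next sample,
            # so an interrupted sample is retried on resume.
            self.position += 1


def train(samples, state_path, step):
    """Run `step` on every remaining sample; save the position on exit."""
    it = ResumableIterator(samples, state_path)
    try:
        for sample in it:
            step(sample)
    except KeyboardInterrupt:
        print("interrupted at position", it.position)
    finally:
        it.save()
    return it.position
```

Calling `train(samples, "state.json", step)` again after a Ctrl-C continues from the sample that was interrupted. A real training loop would also need to checkpoint model and optimizer state, which this sketch leaves out.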
I keep seeing folks who say they’ve built a “full application” using or deeply collaborating with an LLM but, aside from code that is only purported to be LLM-generated, I’ve yet to see any evidence that I can consider non-trivial. Show me the chat sessions that produced these “full applications”.
An application powered by a single source file consisting of only 400 lines of code is, by my definition, trivial. My needs are more complex than that, and I’d expect that the vast majority of folks who are trying to build or maintain production-quality, revenue-generating code have the same or similar needs.
Again, please don’t be offended; what you’re doing is great and I dearly appreciate you sharing your experience! Just be aware that the stuff you’re demonstrating isn’t (hasn’t been, for me at least) capable of producing the kind of complexity I need while using the languages and tooling required in my environment. In other words, while everything of yours I’ve seen has intellectual and perhaps even monetary value, that doesn’t mean your examples or strategies work for all use-cases.
LLMs are restricted by their output limits. Most LLMs can output 4096 tokens - the new Claude 3.5 Sonnet release this week brings that up to 8192.
As such, for one-shot apps like these there's a strict limit to how much you can get done purely through prompting in a single session.
I work on plenty of larger projects with lots of LLM assistance, but for those I'm using LLMs to write individual functions or classes or templates - not for larger chunks of functionality.
> As such, for one-shot apps like these there's a strict limit to how much you can get done purely through prompting in a single session.
That’s an important detail that is (intentionally?) overlooked by the marketing of these tools. With a human collaborator, I don’t have to worry much about keeping collab sessions short—and humans are dramatically better at remembering the context of our previous sessions.
> I work on plenty of larger projects with lots of LLM assistance, but for those I'm using LLMs to write individual functions or classes or templates - not for larger chunks of functionality.
Good to know. For the larger projects where you use the models as an assistant only, do the models “know” about the rest of the project’s code/design through some sort of RAG or do you just ask a model to write a given function and then manually (or through continued prompting in a given session) modify the resulting code to fit correctly within the project?
There are systems that can do RAG for you - GitHub Copilot and Cursor for example - but I mostly just paste exactly what I want the model to know into a prompt.
In my experience, effective LLM usage mostly comes down to carefully designing the contents of the context.
My experience with Copilot (which is admittedly a few months outdated; I never tried Cursor but will soon) shows that it’s really good at inline completion and producing boilerplate for me but pretty bad at understanding or even recognizing the existence of scaffolding and business logic already present in my projects.
> but I mostly just paste exactly what I want the model to know into a prompt.
Does this include the work you do on your larger projects? Do those larger projects fit entirely within the context window? If not, without RAG, how do you effectively prompt a model to recognize or know about the relevant context of larger projects?
For example, say I have a class file that includes dozens of imports from other parts of the project. If I ask the model to add a method that should rely upon other components of the project, how does the model know what’s important without RAG? Do I just enumerate every possible relevant import and include a summary of their purpose? That seems excessively burdensome given the purported capabilities of these models. It also seems unlikely to result in reasonable code unless I explicitly document each callable method’s signature and purpose.
For what it’s worth, I know I’ve been pretty skeptical during our conversations but I really appreciate your feedback and the work you’ve been doing; it’s helping me recognize both the limitations of my own knowledge and the limitations of what I should reasonably expect from the models. Thank you, again.
Yes, I paste stuff in from larger projects all the time.
I'm very selective about what I give them. For example, if I'm working on a Django project I'll paste in just the Django ORM models for the part of the codebase I'm working on - that's enough for it to spit out forms and views and templates, it doesn't need to know about other parts of the codebase.
Another trick I sometimes use is Claude Projects, which allow you to paste up to 200,000 tokens into persistent context for a model. That's enough to fit a LOT of code, so I've occasionally dumped my entire codebase (using my https://github.com/simonw/files-to-prompt/ tool) in there, or selected pieces that are important like the model and URL definitions.
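As a rough sketch of the "paste selected files into the context" approach (this is not how files-to-prompt is implemented; `build_context`, the `---` delimiters, and the 4-characters-per-token estimate are all assumptions made for illustration):

```python
from pathlib import Path


def build_context(paths, max_tokens=200_000, chars_per_token=4):
    """Concatenate files into one prompt string within a rough token budget.

    Returns (prompt, estimated_tokens_used, skipped_paths).
    """
    parts = []
    used = 0
    skipped = []
    for path in map(Path, paths):
        text = path.read_text(encoding="utf-8")
        # Label each file with its path so the model can tell them apart.
        block = f"{path}\n---\n{text}\n---\n"
        # Crude estimate; a real tokenizer would give different numbers.
        cost = len(block) // chars_per_token + 1
        if used + cost > max_tokens:
            skipped.append(str(path))
            continue
        parts.append(block)
        used += cost
    return "\n".join(parts), used, skipped
```

For the Django case this might be called as `build_context(["app/models.py", "app/urls.py"])` (hypothetical paths), with the returned string pasted ahead of the actual request. Files that would exceed the budget are reported in the third return value rather than silently truncated.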
Well, I just asked ChatGPT to answer my "How to print hello world in c++" with a typical stack overflow answer.
Lo and behold, the answer is very polite, explanatory and even lists common mistakes. It even added two very helpful user comments!
I asked it again how this answer would look in 2024 and it just updated the answer to the latest c++ standard!
Then! I asked it what a moderator would say when they chime in. Of course the moderator reminded everyone to stay focused on the question, avoid opinions and back their answers with documentation or standards. In the end the mod thanked everyone for their contributions and for keeping the discussion constructive!
Ah! What a wonderful world ChatGPT is living in! I want to be there too!
The optics of equating a resistance organization on the one hand with a colonial and apartheid state with dysfunctional judicial system and no accountability for any crimes committed by settlers or its military on the other hand by putting them in the same press release is pretty bad for the court.
I'm all for investigating whether any orders to directly target civilians were given to the Palestinian resistance, etc, but that's a pretty far-fetched assumption in my opinion.
On the other side you have what's a pretty clear case of a large scale terror attack against innocent civilians, indiscriminately bombing schools and hospitals.
In addition, why doesn't the ICC look into the conduct of the US and Germany in delivering weapons enabling the genocide?
For comparison, the French resistance was called a terrorist organisation by the Germans, as Algerian FLN was called terrorist organisation by the French… etc. History would be kind of funny if it wasn’t tragic.
And this whole “terrorist” word was jeopardised by Bush. There’s no “terrorism” per se as an emanation of evil.
It’s just an asymmetrical and violent extension of political expression, where dialogue failed to reach a settlement.
Otherwise you’d need to explain the ideological similarities between Al Qaeda and eg. ETA.
These aren't verdicts, so it goes to trial, and arguments will then be put forth.
We have the Nuremberg Code now, where people including propagandists were hanged simply because they should have known better - even though there weren't yet specific laws in place.
It's been fascinating to see how fascism can rise so quickly, hidden by the veil of propaganda, with different countries around the world at different stages of capture - some where turnkey authoritarianism has recently been executed on, like in Canada, with policies and infrastructure put in place that allow top-down control so easily, and with so many succumbing to the fear mongering, in part due to deep-seated programming.
The Nazi Germans were eventually suppressed, and with the internet this next attempt will hopefully be quickly stopped in its tracks globally - however there are arguably $ trillions in the war chest of the bad actors in the global establishment toeing the same line, about which people like Catherine Austin Fitts have been sounding the alarm for years now - who saw behind the scenes the financial markets et al aligning for this. Unfortunately because so many systems are centralized at the moment, it only takes a very small number of people to cause chaos and mass destruction-death - whether that's manufacturing consent for people to believe power outages aren't planned and "out of our control - give us $ trillions to upgrade infrastructure [which actually will mostly go to our friends while we continue to suffocate society financially and extract as much of the value of your labor-productivity as we can manufacture consent to get from you]" etc.
I hope, pray, that RFK in the US will win the next election - and that Pierre in Canada wins, and pray God is ready to cut the shit of these tyrant wannabes with totalitarian wet dreams - and will prevent their assassinations, etc; else the floodgates of hell are seemingly near ready to be unlatched.
I have a new NIBE pump (S735) that replaced my old, also NIBE (Fighter 640P) pump. It's a bit more silent than the older one usually, but sometimes it starts what sounds like a jet engine which is pretty loud. I can hear it from upstairs, far away from the pump, coming from the vents. I am not sure what it's doing when it makes that noise but it made me a bit upset with it, as initially I had hoped for a quieter house, and the pump was pretty expensive!
* build a codegen for Idris2 and a rust RT (a parallel stack "typed" VM)
* a full application in Elm, while asking it to borrow from DT to have it "correct-by-construction", use zippers for some data structures… etc. And it worked!
* Whilst at it, I built Elm but in Idris2, while improving on the rendering part (this is WIP)
* data collators and iterators to handle some ML trainings with pausing features so that I can just Ctrl-C and continue if needed/possible/makes sense.
* etc.
In the end I had to completely rewrite some parts, but I would say 90% of the boring work was done correctly and I only had to focus on the interesting bits.
However, it didn’t do the kind of thorough prep work a painter would do when asked to paint a house. It simply did exactly what I asked: it did the painting and no more.
(Using 4o and o1-preview)