
Yes, exactly this. If I didn't care about price at all, I'd exclusively use this model. It functions more like an actual engineer. I'm in the midst of a DB migration, and e.g. 5.5 continually suggests stuff like "use DB X instead of DB Y for task Z because it's 30% faster", which is a practical impossibility, given we are migrating DBs. Fable jumped in, reduced allocs by literally 46x, found multiple bugs 4.8 and 5.5 created (max file system usage, correctness issues, etc.), and continually suggested awesome improvements unprompted. As in, it would finish a task and then suggest, in a very specific manner, that we tackle some other existing problem I didn't know about... this is the first model that feels like it's coming for my job.



I'm having the same experience. I'm in the process of implementing a new CRDT for realtime collaborative editing. There just aren't a lot of implementations of CRDTs kicking around online for opus or any of the other models to have good design instincts.

Fable is doing - so far - a great job. I just had one big question around how part of it should work. I had a design sketch, but with some big unknowns. I asked Fable to figure it out via reasoning and prototyping, and it did - it even, on its own initiative, wrote a fuzzer for its prototype which explored and verified that its reasoning was correct. It absolutely nailed it. And it found, and fixed, a couple of bugs that I'd missed.

I'm sure its weaknesses will become apparent in time. But, wow, this thing is a beast. It's the first time I'm reading the work of an LLM without spotting obvious weaknesses in its reasoning and code. I'm really impressed.
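
For readers who haven't seen a CRDT fuzzer, here is a minimal sketch of the general idea, not the fuzzer Fable wrote. It assumes a toy last-writer-wins map (the names `LWWMap` and `fuzz_once` are made up for illustration): generate random edits on several replicas, deliver everyone's ops to everyone else in a different random order, and check that all replicas end up in the same state.

```python
import random


class LWWMap:
    """Toy last-writer-wins map (hypothetical, for illustration only).

    Each key keeps the value carrying the highest (clock, replica_id) tag.
    """

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.clock = 0
        self.entries = {}  # key -> (tag, value)

    def local_set(self, key, value):
        # Create a new op with a fresh, globally unique tag and apply it locally.
        self.clock += 1
        op = (key, (self.clock, self.replica_id), value)
        self.apply(op)
        return op

    def apply(self, op):
        # Applying an op is commutative and idempotent: the highest tag wins.
        key, tag, value = op
        self.clock = max(self.clock, tag[0])
        current = self.entries.get(key)
        if current is None or tag > current[0]:
            self.entries[key] = (tag, value)

    def snapshot(self):
        return {key: value for key, (_, value) in self.entries.items()}


def fuzz_once(seed, n_replicas=3, n_ops=20):
    """Run one random scenario; return True if all replicas converge."""
    rng = random.Random(seed)
    replicas = [LWWMap(i) for i in range(n_replicas)]
    ops = []  # (origin replica, op)
    for _ in range(n_ops):
        origin = rng.randrange(n_replicas)
        op = replicas[origin].local_set(rng.choice("abc"), rng.randrange(100))
        ops.append((origin, op))
    # Deliver every remote op to every replica, each in its own random order.
    for replica in replicas:
        pending = [op for origin, op in ops if origin != replica.replica_id]
        rng.shuffle(pending)
        for op in pending:
            replica.apply(op)
    snapshots = [replica.snapshot() for replica in replicas]
    return all(snapshot == snapshots[0] for snapshot in snapshots)


# Seeds that produced diverging replicas; reusing a seed reproduces the failure.
failures = [seed for seed in range(200) if not fuzz_once(seed)]
```

A real fuzzer for a text CRDT would also check the invariants of the data structure itself after every step, not just convergence at the end.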


I was about to ask where you work that you’re implementing new CRDTs and then I noticed your username! Thanks for all that you do!

I work on the live collab at my company, and using AI while coding has recently sort of “clicked” for me. We use an (I’m pretty sure) unheard-of algorithm for collaborative editing, and I’ve had a long-term goal of turning it into an implementation of EG Walker, but our document model is very complex and most out-of-the-box CRDTs don’t quite fit. Maybe Fable will be what gets me over the hump.


Long shot here because I'm not knowledgeable enough about CRDTs but maybe something like DSON would help? I saw a talk about it a while ago and it might be useful.

https://blog.helsing.ai/posts/dson-a-delta-state-crdt-for-re...

https://www.youtube.com/watch?v=4QkLD7JhD_I&pp=ygUJZHNvbiBjc...


Ty, checking this out!

I’d be fascinated to hear more if you’re willing to share. What is special about your document model which makes existing tools like automerge a bad fit?

We have cross-field invariants that merging at the data-structure level can't ensure (in an obvious way, at least); the original designers' worry was that such merges "lose the semantic meaning of a conflict". The main idea behind their approach is that certain parts of the model can have custom "mergers" that are able to run business logic to maintain these invariants.

Worth noting: the decision to eschew CRDTs predates my time here, and I've pushed for a CRDT rewrite quite a bit, since I believe it could be done. The other main concern they had was memory usage, but it seems like EG Walker would solve that. Our system uses a "Commit DAG" (an Event DAG by another name) and does a three-way merge using a common ancestor of the diverged documents, so a lot of the bones of EG Walker are there, and I'm exploring ways in which we could gradually move to it.
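
To make the "custom mergers" idea concrete, here is a rough sketch of a three-way merge with per-part mergers. It is not our actual code: the document shape, the `range` part, the `start <= end` invariant, and the names `three_way_merge` and `merge_range` are all invented for illustration, and it treats a missing key the same as `None`.

```python
def three_way_merge(base, ours, theirs, mergers=None):
    """Merge two diverged dicts against their common ancestor.

    `mergers` maps a key to a function (base, ours, theirs) -> value that
    runs when both sides changed that part of the document.
    """
    mergers = mergers or {}
    merged = {}
    for key in base.keys() | ours.keys() | theirs.keys():
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:          # both sides agree (or neither changed it)
            value = o
        elif o == b:        # only theirs changed it
            value = t
        elif t == b:        # only ours changed it
            value = o
        elif key in mergers:  # true conflict: defer to business logic
            value = mergers[key](b, o, t)
        else:
            raise ValueError(f"unresolved conflict on {key!r}")
        if value is not None:
            merged[key] = value
    return merged


def merge_range(base, ours, theirs):
    """Hypothetical merger for a part with the invariant start <= end."""
    start = ours["start"] if ours["start"] != base["start"] else theirs["start"]
    end = ours["end"] if ours["end"] != base["end"] else theirs["end"]
    if start > end:
        # Assumed business rule: collapse to an empty range rather than
        # keep a field-by-field result that violates the invariant.
        end = start
    return {"start": start, "end": end}


base = {"title": "Draft", "range": {"start": 0, "end": 10}}
ours = {"title": "Draft v2", "range": {"start": 8, "end": 10}}
theirs = {"title": "Draft", "range": {"start": 0, "end": 5}}

merged = three_way_merge(base, ours, theirs, mergers={"range": merge_range})
```

A plain field-by-field merge of `range` would take `start` from one side and `end` from the other and produce a range that ends before it starts. The merger sees both sides and the ancestor at once, so it can apply whatever rule the product actually wants.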


Hello joseph,

I was scanning the comments and saw you mentioned CRDTs. Just wanted to mention that I implemented a CRDT-flavoured sync engine for the product I'm working on a while ago. I think it was with Opus 4.6, if I'm not mistaken (or earlier), so it's not something new to Fable 5, just FYI.


Yeah, you've certainly been able to get Opus to write a CRDT. It just needs a lot of hand-holding to make it correct. Opus always seems pretty bad at coming up with invariants and using them to make a piece of software correct. Without invariants, you end up with lots of hacky workarounds to avoidable problems.

So far at least - and it's been less than a day - Fable seems better at this.

I think I also do my CRDTs differently from others. I've grown to like the pure-oplog approach after making eg-walker. LLMs are much worse at this!


> wrote a fuzzer for its prototype which explored and verified that its reasoning was correct. It absolutely nailed it.

For such a data structure, "nailing it" means a formal proof of correctness. Fuzzing, as useful as it is, is merely throwing dirt at the wall and seeing if anything sticks.


I’ll ask it for a formal proof when I get home and see how it goes.

I’ve read plenty of papers with “formal proofs of correctness” that turned out to have huge flaws. Machine verifiable proofs I trust. But I’ve personally found more bugs with fuzzing than I have via proofs.


In the real world, many of us don't have the time to create formal proofs. But our instinct in testing where edge cases may exist in code that we wrote is a type of refactoring that happens in our brains during the coding process. Hand the coding off to a machine and you have no idea where to start looking for the flaws.

> Hand the coding off to a machine and you have no idea where to start looking for the flaws.

I have found this quickly becomes false. I have learned I cannot review LLM-generated code as if it were written by a trusted senior developer (where I often just take a quick look, see nothing obvious, and hit approve). Once you start reading the code in depth with the goal of understanding it, you quickly see the places where flaws are likely. Sure, I start with no clue where to look, but it doesn't take long to see things.


Yes but it takes much longer to trace them. Because the LLM code almost always gravitates toward data blobs and highly dynamic objects and spaghetti that takes a ton of cognitive load to understand what their failure modes are. Even when it does document them.

> this is the first model that feels like its coming for my job

Damn you must be good, I've been feeling this for around 2 years now


It's been obvious for at least 2 years; anyone who doesn't see the writing on the wall simply hasn't learned how to use these well or has severe exponential blindness.

"But it doesn't do well when writing my undertrained language" - yeah, fine. Yet. Reasonable code in that is probably one RAG + verification scaffold deployment around Mythos or maybe mythos+1. Just like it was for you learning it, because you knew how to _program_.


Yeah, I agree. We're headed into a rougher job market pretty much across the board for white-collar work, hitting junior people worse at this stage. It's up to societies around the world to decide how to deal with this; so far we deal with it by ignoring it, it seems.

The monks got mad too when the printing press was invented because it took their jobs of hoarding knowledge.

AI is just another tool, learn to use it.


And then in a couple years the AI gets better at "using AI" than the bottom 99.999% of knowledge workers, who are now out of work.

We are all doomed! Doomed I say!

Gosh, I must be doing something wrong. I spent 15 minutes (of which a lot was waiting while it was thinking about "backwards rationalising" its decision and "gaslighting"[1]) arguing with it over why it keeps using `node -e "console.log(require('fs').readdirSync('…'))"` instead of `ls -l …`.

Like it did everything:

- this is not a Linux system (true, it was macOS)
- it is not an available command
- the binary is corrupted
- node/js is more precise
- V8 JavaScript is faster than bash (true technically??? But not in this context lol)
- JavaScript is more versatile

I forget what else we went through, but there were a few more things. I indulged it because it was incredulous and funny. The prompts from my side were all questions, never instructions. I assume an instruction would've helped here, but I also don't think Opus ever did this (on the other hand, Opus wrote Python scripts to format/indent instead of just running cargo fmt, so I guess potato potato).



