efskap · 2026-06-05T02:57:30 1780628250

Depends on personality I guess. That would be sooo unsatisfying to me. E.g. not wanting to accept that languages have exceptions "just because" is what got me interested in historical linguistics as a young lad.

pixl97 · 2026-06-05T15:35:00 1780673700

Yep, we can see a lot of people here have had little experience in raising children. Some kids will just naturally say "I accept that", while another, born a year apart and raised in the same household, will be like "f you, I don't do what you tell me". Nurture can moderate these behaviors, but nature is strong.

j45 · 2026-06-05T15:51:17 1780674677

I was more referring to the children that compare themselves to other children, or differences between what they have/are allowed to do and what others can.

It's why I prefaced it with "It can".

Every child is different, but a big factor is whether each parent has dealt with the normal childhood stuff every parent can carry, plus anything extra; latent reactivity can be modelled and passed on.

efskap · 2026-06-05T01:49:57 1780624197

Do one thing and do it well, right?

selcuka · 2026-06-05T05:15:44 1780636544

Define "one". If AWS is a single service, then this tool does 1/100 things.

fuzzylightbulb · 2026-06-05T14:45:27 1780670727

No one would classify AWS as a single service. The name is literally "Amazon Web Services".

efskap · 2026-06-01T20:46:40 1780346800

Reminds me of the SCP-like article / short story about the first executable snapshot of a human brain, with similar horrors of realization.

https://qntm.org/mmacevedo

efskap · 2026-05-29T04:07:25 1780027645

At least for Nix itself, that's pretty much it except via Dutch.

> The name Nix is derived from the Dutch word niks, meaning nothing; build actions do not see anything that has not been explicitly declared as an input

From page 81 of the original paper: https://edolstra.github.io/pubs/nspfssd-lisa2004-final.pdf

microtonal · 2026-05-29T06:14:36 1780035276

Also, I think the founder's username in various places is nixnut. Which to an English-only speaker means someone crazy about Nix (Nix fan). However in Dutch 'niksnut' or 'nietsnut' loosely translates to 'bum'.

yjftsjthsd-h · 2026-05-29T04:37:51 1780029471

That's surprising; nix is Latin for snow, and its logo is a snow flake, so I just assumed it was that.

isityettime · 2026-05-29T13:00:40 1780059640

I don't think the logo choice is a coincidence, either; it's just that the ordering is different.

efskap · 2026-05-26T18:37:56 1779820676

I like https://viznut.fi/unscii/ - meant for ascii art but still works well in a terminal, and still gets unicode updates

co-ent · 2026-05-26T22:49:59 1779835799

The 'fantasy' version reminds me of the Sleipnir font for Dwarf Fortress. Neat! http://dwarffortresswiki.org/index.php/File:Andux_sleipnir_8...

JdeBP · 2026-05-28T18:23:56 1779992636

I have that forked, as well as a fork of funscii. Both have fixes in the main branch. I've added a fair amount of stuff beyond that in a branch of unscii.

* https://github.com/jdebp/unscii/tree/2.1.1f

* https://github.com/jdebp/funscii

smusamashah · 2026-05-28T01:36:08 1779932168

This reminded me of the Wingdings font that I used to play around with in Windows 98

efskap · 2026-05-26T01:50:50 1779760250

Yup, that's the https://en.wikipedia.org/wiki/Default_mode_network

It's the daydreaming/mind-wandering state that occurs when you're not focused on an external task. With all the stimuli of the modern world, I feel like we're being starved of crucial DMN time if we don't engineer conditions like the ones you describe.

aswegs8 · 2026-05-26T09:50:46 1779789046

Quite the interesting but unapproachable topic. Doesn't help that neurology logic on brain-level is dynamic and general rules are hard to extract.

efskap · 2026-05-22T01:53:43 1779414823

Specifically a clanker that itself would rather hallucinate something than say "I don't know".

efskap · 2026-05-22T01:50:18 1779414618

It reminds me of how LLM hallucination is attributed to "I don't know" being underrepresented in training data, and it being a better strategy to guess on evaluations rather than admit not knowing.

Different reward function, but the same behaviour emerges.
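To make the incentive concrete, here is a rough sketch of the arithmetic. The scoring scheme is my own assumption (1 point for a correct answer, 0 for "I don't know", and an optional penalty for a wrong answer); it is not taken from any particular benchmark.

```python
def expected_score(p_correct: float, wrong_penalty: float = 0.0) -> float:
    """Expected score of guessing: +1 if right, -wrong_penalty if wrong."""
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty


# Assumed: abstaining ("I don't know") always earns exactly zero.
ABSTAIN_SCORE = 0.0

# No penalty for wrong answers: even a 10% guess beats abstaining.
guess_no_penalty = expected_score(0.10)

# Penalty of 1 per wrong answer: the same guess is now worse than abstaining.
guess_with_penalty = expected_score(0.10, wrong_penalty=1.0)

print(guess_no_penalty > ABSTAIN_SCORE)    # guessing wins
print(guess_with_penalty < ABSTAIN_SCORE)  # abstaining wins
```

Under this toy scheme, guessing only stops paying off once wrong answers cost something.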

nullc · 2026-05-22T04:45:41 1779425141

We'll see that improve as people move on to synthetic training data, something now possible because we have sufficiently smart LLMs to create enough of it.

The idea is that you generate fake LLM transcripts using your classical training data. E.g. look at some training data, generate Q/A transcripts. Generate random questions, RAG against your whole dataset and look for relevant stuff, and if there is nothing there, train an "I don't know." reply.

A moderately sized LLM operating some tools to access more information behind the scenes, perform tests and correct its own errors can write transcripts simulating a much larger and smarter llm.
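A very rough sketch of that loop, under heavy assumptions: the "retriever" here is a toy word-overlap check standing in for real RAG, the answer is just the retrieved passage rather than something an LLM wrote, and every name (`retrieve`, `make_transcript`, `min_overlap`) is made up for illustration.

```python
import re


def tokenize(text):
    """Lowercase word set; a crude stand-in for real embeddings."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, corpus, min_overlap=2):
    """Toy retrieval: keep passages sharing at least min_overlap words."""
    q = tokenize(question)
    return [doc for doc in corpus if len(q & tokenize(doc)) >= min_overlap]


def make_transcript(question, corpus):
    """Emit a training pair; fall back to "I don't know." when nothing matches."""
    hits = retrieve(question, corpus)
    answer = hits[0] if hits else "I don't know."
    return {"question": question, "answer": answer}


corpus = [
    "Paris is the capital of France.",
    "Nix is a purely functional package manager.",
]
questions = [
    "What is the capital of France?",
    "Who won the 1987 chess open?",
]
transcripts = [make_transcript(q, corpus) for q in questions]
for t in transcripts:
    print(t)
```

The overlap threshold is doing a lot of work here (stopwords alone can produce false hits), so a real pipeline would need a much better relevance check before deciding that nothing is there.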

efskap · 2026-05-13T00:27:43 1778632063

No FFN is blowing my mind. This is pretty much "Attention Is ACTUALLY All You Need". Reminds me of BERT Q&A which would return indices into the input context, but even that had a FFN. Really exciting work.

krackers · 2026-05-13T01:49:05 1778636945

I guess this had always been bugging me. I get why you need activation/non-linearities, but do you really need the FFN in Transformers? People say that without it you can't do "knowledge/fact" lookups, but you still have the Value part of the attention, and if your question is "what is the capital of france" the LLM could presumably extract out "paris" from the value vector during attention computation instead of needing the FFN for that. Deleting the FFN is probably way worse in terms of scaling laws or storing information, but is it an actual architectural dead-end (in the way that deleting the activation layer clearly would be, since it'd collapse everything to a linear function)?

Majromax · 2026-05-13T03:09:57 1778641797

> if your question is "what is the capital of france" the LLM could presumably extract out "paris" from the value vector during attention computation instead of needing the FFN for that.

But how do you get 'Paris' into the value vector in that case? The value vector is just the result of a matrix multiplication, and without a nonlinearity it can't perform a data-dependent transformation. Attention still acts as a nonlinear mixer of previous values, but your new output is still limited to the convex combination of previous values.
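A small sketch of the convex-combination point, assuming a single head with no output projection and made-up scores and value vectors: since the softmax weights are non-negative and sum to 1, every output coordinate stays between the smallest and largest value of that coordinate.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(scores, values):
    """Single-head attention output: softmax(scores)-weighted sum of values."""
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


# Illustrative numbers only.
values = [[0.0, 1.0], [2.0, 3.0], [4.0, -1.0]]
scores = [0.5, 2.0, -1.0]
out = attend(scores, values)

for i, coord in enumerate(out):
    lo = min(v[i] for v in values)
    hi = max(v[i] for v in values)
    print(lo <= coord <= hi)  # output never leaves the range spanned by the values
```

Changing the scores only moves the output around inside that hull; it can't produce a vector outside it.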

krackers · 2026-05-13T04:26:52 1778646412

> But how do you get 'Paris' into the value vector in that case?

Ok wait I think I see what you mean. Although maybe it's not getting paris _into_ the value vector that's hard, but isolating the residual stream to _only_ that instead of things like other capitals.

So as a naive example maybe at the very first layer consuming your tokens: Q{France} would have high inner product with K{capital} and so our residual would now mostly contain V{capital}, which maybe contains embeddings of all the capitals of all countries. You need some way to filter out all the other stuff, but can't do that without a FFN + activation.

Just throwing in a relu by itself won't help since that would still work on all the elements uniformly, you need some way to put weight on "paris" while suppressing the others, i.e. mixing within the residual stream itself.

Although maybe if you really stretch it, somewhere in a deeper layer you could have 1-hot encoded values with a "gain" coefficient so that when you do the residual addition it's something like {<paris>, <tokyo>, <dc>} + 10000*{<1>, <0>, <0>} and then if you softmax that you get something with most of its mass on "Paris". But it seems like this would not be practical, or it's just shifting the issue to how the right 1-hot vector is chosen.
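Roughly what I mean by the gain trick, with made-up logits for the three candidates (the numbers and the `gain` value are purely illustrative, and this says nothing about how the 1-hot vector gets picked):

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


labels = ["paris", "tokyo", "dc"]
logits = [1.2, 1.5, 0.9]    # hypothetical residual contents, tokyo slightly ahead
one_hot = [1.0, 0.0, 0.0]   # selector for "paris"; choosing it is the unsolved part
gain = 10000.0

before = softmax(logits)
boosted = [l + gain * h for l, h in zip(logits, one_hot)]
after = softmax(boosted)

print(labels[before.index(max(before))])  # winner before adding the gain term
print(labels[after.index(max(after))])    # winner after adding the gain term
```

The huge gain swamps whatever else was in the residual, which is exactly why it feels like a hack rather than a lookup.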

efskap · 2026-05-12T21:52:17 1778622737

Makes sense that the agent can refine its search terms/strategy based on discovered context.

But it still has to enumerate synonyms to find things.

I would assume it's very domain dependent, like code or technical docs would have more precise terminology that is better for fixed string search. On the other hand, medical or legal text can have many many ways to say something
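A toy illustration of the difference, with invented documents and a hypothetical `search` helper doing case-insensitive fixed-string matching:

```python
def search(docs, terms):
    """Return docs containing any of the terms as a literal substring."""
    lowered = [t.lower() for t in terms]
    return [d for d in docs if any(t in d.lower() for t in lowered)]


docs = [
    "Patient presented with myocardial infarction.",
    "History of heart attack in 2019.",
    "def parse_config(path): ...",
]

# Medical text: one phrasing misses the other note entirely.
single = search(docs, ["heart attack"])
expanded = search(docs, ["heart attack", "myocardial infarction"])

# Code: the identifier is the identifier, so a single term is enough.
code = search(docs, ["parse_config"])

print(len(single), len(expanded), len(code))
```

The agent only finds the second medical note if it already knows to try the synonym, whereas the code identifier needs no expansion at all.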

soco · 2026-05-13T10:40:59 1778668859

Yup, and good luck finding usable dictionaries. It's a lot of one-time handiwork to build it yourself, for which you need to find the right motivation (and time, and funding)