>It's not an assumption. It's literally based on the results of your own reference:
Human performance generally decreases with level of exposure so I figured you were talking about something else. Guess not.
>I don't think additional conversation is worthwhile because it becomes apparent that it's more dogmatic than reasoned
By all means, end the conversation whenever you wish.
>It goes into further detail about how LLM can have high accuracy while also making simple, unpredictable mistakes.
I'm well aware. So? Weird failure modes are expected. Humans make simple, unpredictable mistakes that don't make any sense without the lens of evolutionary biology. LLMs will have odd failure modes regardless of whether it's the "real deal" or not, either adopted from the data or from the training scheme itself.
>If I didn't know better, I'd assume you could just as well be a chat bot who only reads abstracts and replies in an overconfident manner.
Now you're getting it. Think on that.
>Human performance generally decreases with level of exposure
Are you saying that as humans get more experience, they perform worse? I disagree, but irrespective of that point it’s wild that you can have this many responses while still completely bypassing the entire point I was making.
I don’t think most would dispute that performance increases with experience. The point is how well performance can be maintained when there is little or no exposure, because that implies principled reasoning rather than simple pattern mapping. That is the entire through line behind my comments regarding context-dependent language, novel driving scenarios, etc.
>Think on that
In the context of the above, I don’t think this is nearly as strong of a point as you seem to think it is. There’s nothing novel about a text-based discussion.
1. We anchored this discussion on arithmetic, so I stuck to that. If a child never learns (no exposure) how to do base-16 arithmetic, then a test quizzing on base-16 arithmetic will result in zero performance.
If that child had the basic teaching most children do (little exposure), then a quiz will result in much worse performance than on a base-10 equivalent test. This is very simple. I don't know what else to tell you here.
2. You must understand that a human driver that stops because a kite suddenly comes across the road doesn't do so because of any kite>child>must not hurt reasoning. Your brain doesn't even process information that quickly. The human driver stops (or perhaps he/she doesn't) and then rationalizes a reason for the decision after the fact. Humans are very good at doing this sort of thing. Except that this rationalization might not have anything at all to do with what you believe to be "truth". Just because you think or believe it is so doesn't actually mean it is so. Choices shape preferences just as much as the inverse.
For all anyone knows, and indeed most likely, "child" didn't even enter the equation until well after the fact.
Now if you're asking whether LLMs, as a matter of principle, can infer/grok these sorts of causal relationships between different "objects", then yes, as far as anyone is able to test.
Your first statement seems to contradict your previous one. Did you originally mistype what you meant when you said greater exposure leads to worse outcomes? Because now you’re implying more exposure has the opposite effect.
Regardless, it still misses the point. I’ve never been explicitly exposed to base-72, yet I can reason my way through it. I would argue my performance wouldn’t be any different than base-82. So I can transfer basic principles. What the LLM result you referenced shows is that it is not learning basic principles. It sure seems like you just read the abstract and ran with it.
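To make concrete what I mean by transferring a principle, here's a rough sketch (my own toy illustration, nothing from the paper): the same carry-based addition procedure works in base 10, base 16, or base 72, because the principle is positional notation, not the particular digit set.

```python
# Toy illustration (mine, not from the paper): one carry-based addition
# routine covers every base, because the underlying principle is
# positional notation, not memorized digit facts.
def add_in_base(a_digits, b_digits, base):
    """Add two numbers given as digit lists, most significant digit first."""
    a, b = a_digits[::-1], b_digits[::-1]  # work least significant digit first
    result, carry = [], 0
    for i in range(max(len(a), len(b))):
        total = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) + carry
        result.append(total % base)
        carry = total // base
    if carry:
        result.append(carry)
    return result[::-1]

print(add_in_base([9, 5], [7, 8], 10))    # 95 + 78 = 173       -> [1, 7, 3]
print(add_in_base([15, 10], [1, 6], 16))  # 0xFA + 0x16 = 0x110 -> [1, 1, 0]
print(add_in_base([71, 71], [0, 1], 72))  # largest 2-digit base-72 number + 1 -> [1, 0, 0]
```

If I've internalized that procedure, the specific base is just a parameter; that's the kind of transfer I'm talking about.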
As far as the psychology of decision making, again, I think you're speaking with greater confidence than is warranted. In time-critical examples, I’m inclined to agree, and there are certainly some notable psychologists who would expand it beyond snap judgments. But there are also some notable psychologists who tend to disagree. It’s not a settled science, despite your confidence. But again, that’s getting stuck in the limitations of the example and missing the forest for the trees. The point is not whether decisions are made consciously or subconsciously, but rather how learning can be inferred from previous experience and transferred to novel experiences. Whether this happens consciously or not is beside the point.
And you are further going down what I was explicitly arguing against: confusing image/pattern recognition for contextual reasoning. You can see this in the relatively recent Go issue; any human could see what the issue was because they understand the contextual reasoning of the game but the AI could not and was fooled by a novel strategy. The points I’ve been making have completely flown over your head to the point where you’re shoehorning in a completely different conversation.
>Did you originally mistype what you meant when you said greater exposure leads to worse outcomes? Because now you’re implying more exposure has the opposite effect.
I guess so. I've never meant to imply greater exposure leads to worse outcomes.
>I would argue my performance wouldn’t be any different than base-82.
Even if that were true, and I don't know that I agree, the authors of that paper make no attempt to test in circumstances that might make this true for LLMs as it might for people. So the paper is not evidence of the claim (no basic principles) either way. For example, I reckon your performance on the subsequent base-82 test will be better if taken immediately after than if taken weeks or months later. So surrounding context is important even if you're right.
>What the LLM result you referenced shows is that it is not learning basic principles.
I disagree here and I've explained why.
>You can see this in the relatively recent Go issue; any human could see what the issue was because they understand the contextual reasoning of the game but the AI could not and was fooled by a novel strategy.
KataGo taught itself to play Go by explicitly deprioritizing “losing” strategies. This means it didn't play many amateur strategies because they were lost early in the training. This is hard for a human to understand because humans all generally share a learning curve going from beginner to amateur to expert. So all humans have more experience with “losing” techniques. Basically what I'm saying is, it might be that the training scheme of this AI explicitly deprioritized these specific tactics, leaving it with little understanding of them, which is different from not having any understanding.
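A caricature of the dynamic I mean (to be clear, this is not KataGo's actual algorithm, which is MCTS-based self-play; it's just a toy bandit sketch I'm making up to illustrate the sampling effect): lines that keep losing get explored less and less, so the finished system ends up with very little experience of them.

```python
import random

# Toy bandit sketch (my illustration, NOT KataGo's actual training scheme):
# moves that keep losing get sampled less, so almost all experience piles
# up on the "winning" line and the losing line is barely explored.
true_win_rate = {"strong_line": 0.55, "amateur_line": 0.35}
estimate = {move: 0.5 for move in true_win_rate}  # start with no opinion
visits = {move: 0 for move in true_win_rate}

random.seed(0)
for _ in range(20_000):
    if random.random() < 0.05:                      # occasional exploration
        move = random.choice(list(true_win_rate))
    else:                                           # otherwise pick what currently looks best
        move = max(estimate, key=estimate.get)
    visits[move] += 1
    won = random.random() < true_win_rate[move]
    estimate[move] += (won - estimate[move]) / visits[move]  # running average of results

print(visits)  # nearly all self-play experience lands on strong_line
```

The point is just that heavy deprioritization during training leaves the system with thin coverage of exactly the situations a human amateur knows best.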
This circles back to the point I made earlier. Having failure modes that humans don't have, or don't understand, is not the same as a lack of "true understanding".
We have no clue what "basic principles" actually are at the low level. The less inductive bias we try to shoehorn into models, the better performing they become. Models literally tend to perform worse the more we try to bake "basic principles" in. So the presence of an odd failure mode that we *think* reveals a lack of "basic principles" is not necessarily evidence of that lack.
>The points I’ve been making have completely flown over your head to the point where you’re shoehorning in a completely different conversation.
You're convinced it's just "very good pattern matching", whatever that means. I disagree.
I think the short of it is that it seems to me that you are confusing a system having very good heuristics for having a solid understanding of principles of reality. Heuristics, without an understanding of principles, are what I mean by rote pattern matching. But heuristics break down, particularly in edge cases. Yes, humans also rely heavily on heuristics because we generally seek to use the least effort possible. But we can also mitigate those shortcomings by reasoning about basic principles. This shortcoming is why I think both humans and AI can make seemingly stupid mistakes. The difference is, I don't think you've provided evidence that AI can have a principled understanding while we can show that humans can. Having a principled understanding is important to move from simple "cause-effect" relationships to understanding "why". This is important because the "why" can transfer to many unrelated domains or novel scenarios.
E.g., racism/sexism/...most -'isms' appear to be general heuristics that help us make quick judgements. But we can also correct our decision-making process by reverting to basic principles, like the idea that humans have equal moral worth regardless of skin tone or gender. AI can even mimic these mitigations, but you haven't convinced me that it can fundamentally change away from its training set based on an understanding of basic principles.
As for the Go example, a novice would be able to identify that somebody is drawing a circle around its pieces; your link even states this. But your recharacterization of this as a specific strategy is weird when that strategy causes you to lose the game. It misses the entire meaning of strategy. We see the limitations of AI in its reliance on training data, from autonomous vehicles to healthcare. They range from the serious (cancer detection) to the humorous (Marines fooling robots by hiding in boxes like in Metal Gear). The paper you referenced similarly shows it is reliant on proximity to the training set, rather than actually understanding the underlying principles.
>Did you read the paper? The authors admit it is only narrowly learning and cannot transfer its knowledge to unknown areas. From the article: "we do not expect our language model to generate proteins that belong to a completely different distribution or domain"
Good thing they don't make sweeping declarations or say anything about that meaning narrow learning without transfer. Jumping the shark yet again.
>We find that without prior knowledge, information emerges in the learned representations on fundamental properties of proteins such as secondary structure, contacts, and biological activity. We show the learned representations are useful across benchmarks for remote homology detection, prediction of secondary structure, long-range residue–residue contacts, and mutational effect.
From the protein sequences alone, language models learn underlying properties that transfer to a wide variety of use cases. So yes, they understand proteins in any definition that has any meaning.
Wrong comment to respond to; if you can’t wait to reply, that might indicate it’s time to take a step back.
>Good thing they don't make sweeping declarations or say anything about that meaning narrow learning without transfer.
That's exactly what that previous quote means. Did you read the methodology? They train on a universal training set and then have to tune it using a closely related training set for it to work. In other words, the first step is not good enough to be transferable and needs to be fine-tuned. In that context, the quote implies the fine-tuning pushes the model away from a generalizable one into a narrow model that no longer works outside that specific application. Apropos of this entire discussion, it means it doesn't perform well in novel domains. If it could truly "understand proteins in any definition", it wouldn't need to be retrained for each application. The word you used ('any') literally means "without specification"; the model needs to be specifically tuned to the protein family of interest.
You are quoting an entirely different publication in your response. You should use the paper from which I quoted to refute my statement; otherwise, this is the definition of cherry-picking. Can you explain why the two studies came to different conclusions? It sure seems like you're not reading the work to learn and instead just grasping at straws to be "right." I have zero interest in having a conversation where someone just jumps from one abstract to another to argue rather than adding anything of substance.
>I think the short of it is that it seems to me that you are confusing a system having very good heuristics for having a solid understanding of principles of reality.
Humans don’t have a grasp of the “principles of reality” and as such are incapable of distinguishing “true”, “different” or “heuristic”, assuming such a distinction is even meaningful. Where you are convinced of a “faulty shortcut”, I simply think “different”. There are multiple ways to skin a cat. A plane's flight is as "true" as any bird's. There's no "faulty shortcut" even when it fails in ways a bird will not.
You say humans are "true" and LLMs are not, but you base it on factors that can be probed in humans as well, so to me your argument simply falls apart. This is where our divide stems from.
>I don't think you've provided evidence that AI can have a principled understanding while we can show that humans can.
What would be evidence to you? Let’s leave conjecture and assumptions. What evaluations exist that demonstrate this “principled understanding” in humans? And how would we create an equitable test for LLMs?
>a novice would be able to identify that somebody is drawing a circle around its pieces; your link even states this. But your recharacterization of this as a specific strategy is weird when that strategy causes you to lose the game.
You misunderstand. I did not characterize this as a specific “strategy”. Not only do modern Go systems not learn like humans, but they also don’t learn from human data at all. KataGo didn’t create a heuristic to play like a human because it didn’t even see humans play.
>The paper you referenced similarly shows it is reliant on proximity to the training set, rather than actually understanding the underlying principles.
Even the authors make it clear this isn’t necessarily the conclusion to draw, so it’s odd to see you die on this hill.
The counterfactual version of the syntax task is finding the main subject and verb of something like “Think are the best LMs they.” in verb-obj-subj order instead of “They think LMs are the best.” in subj-verb-obj order; the answer (they, think) is the same either way. LLMs are not being trained on text like the former to any significant degree, if at all, yet the performance is fairly close. So what, it doesn't have “underlying principles of syntax” but still manages that?
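Spelling out what I mean by the answer being the same either way (my own toy rendering of the setup, not the paper's actual data format):

```python
# Toy rendering of the counterfactual syntax task (my own illustration,
# not the paper's data format): the gold main-subject/main-verb pair is
# identical whether the word order is the familiar one or the reordered one.
examples = [
    {"order": "subj-verb-obj", "sentence": "They think LMs are the best.",
     "gold": ("they", "think")},
    {"order": "verb-obj-subj", "sentence": "Think are the best LMs they.",
     "gold": ("they", "think")},
]

for ex in examples:
    print(f'{ex["order"]}: {ex["sentence"]} -> {ex["gold"]}')
```

Only the surface order changes; the structural question being asked of the model does not.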
The problem is that you take a fairly reasonable conclusion from these experiments, i.e. that LLMs can and often do also rely on narrow, non-transferable procedures for task-solving, and proceed to jump the shark from there.
>but you haven't convinced me that it can fundamentally change away from its training set based on an understanding of basic principles.
We see language models create novel, functioning protein structures after training, no folding necessary.
Did you read the paper? The authors admit it is only narrowly learning and cannot transfer its knowledge to unknown areas. From the article:
"we do not expect our language model to generate proteins that belong to a completely different distribution or domain"
So, no, I do not think it displays a fundamental understanding.
>What would be evidence to you?
We've already discussed this ad nauseam. Like all science, there is no definitive answer. However, when the data shows evidence that something like proximity to training data is predictive of performance, it seems more like evidence of learning heuristics and not underlying principles.
Now, I'm open to the idea that humans just have a deeper level of heuristics rather than principled understanding. If that's the case, it's just a difference of degree rather than kind. But I don't think that's a fruitful discussion because it may not be testable/provable, so I would classify it as philosophy more than anything else, and certainly not worthy of the confidence that you're speaking with.