Seems exactly like what you want. We don't think in plain English, we _rationalize_ our thoughts into English (or whatever language comes out) but they must be more fundamental than language because language is acquired.
Essentially, English is one of many possible encodings of an underlying intuitive, possibly non-symbolic representation.
That's debatable. Language shapes thought much more than you might think. You learn concepts from language that you could not have imagined by yourself until you learned/read about them, so those concepts are in effect tightly linked to language.
I can also think in images and internal visualizations. Geometric reasoning is also a thing. Musicians can also hear things in their mind - some can write it down, others can play it directly, and in my case I'm not good enough to get it out of my head!
In all cases though these thoughts are kind of tied to representations from the real world. Sort of like other languages via different senses. So yeah, how abstract can our thoughts actually be?
But the thing you learn is not the word 'purple'. You just use the word as the mental scaffolding to build a concept of purple. The word forms a linkage to a deeper embedding, which is further proven by the fact that it's actually slightly different in each mind that has understanding of the concept.
This embedded concept is what is doing the work, the word was just the seed of the understanding and a method by which to convey that understanding to others.
Language is definitely a significant part of thinking, but when I remember how cold it was outside yesterday to figure out if it was colder than today, I'm not bringing words to mind. I'm bringing up some other non-discrete information that I could never precisely encode into words and then factoring that in with the other non-discrete information I'm currently taking in through my senses. It's only after that processing that I encode it as a lossy "It was colder yesterday" statement.
For example, I can think in formal logic. I've learned to do that, and surely my brain takes a step-by-step approach to it, but I've also internalized some of it and I don't think that my proficiency with English has anything to do with it.
I could have learned the same concepts in any other language, but the end result would be the same.
And surely there are many thoughts that can't be expressed purely with words. For example all that is related to qualia. You can think of a color but you can't describe what you see in your mind's eye with words, not in a way that would let a blind person share the same experience. Or try describing "love" without making a similitude. Is love a thought? Or a feeling? Is there a meaningful difference between the two?
>The hypothesis is in dispute, with many different variations throughout its history.[2] The strong hypothesis of linguistic relativity, now referred to as linguistic determinism, is that language determines thought and that linguistic categories limit and restrict cognitive categories. This was a claim by some earlier linguists pre-World War II;[3] since then it has fallen out of acceptance by contemporary linguists.
eh, probably both. Why does it have to be a fight between two schools of thought? Thought can be cross-modal: some of it can be done in a specific language, some of it can be visual.
(universal grammar people hate this somehow, it's weird)
In section 2 they briefly mention studies such as [1] that point out that the token outputs of a chain of thought aren't always entirely faithful to the responses of the models
I'm not sure it wouldn't be more reliable to let the model run on latents and train a separate latent-reading explainer module, one that has at least some approximation of what we want as an explicit optimization objective.
Assuming it actually is or has the potential to be better than CoT, from what I gathered from the paper the current results are mostly just more efficient token-wise.
I was thinking about safety reasons, but also usability. It seems like a pretty big difference to me if you can't understand the chain of thought. How faithful CoTs are is another question.
Someone could fine-tune a model on pairs of existing proteins and their misfolded prions and then ask the system to come up with new prions for other proteins.
ChatGPT found these 4 companies that will produce proteins for you just based on digital DNA that you send them:
The context here is that prions are misfolded proteins that replicate by causing other proteins to change their configuration into the misfolded form of the prion. Diseases caused by prions include Mad Cow disease, Creutzfeldt-Jakob disease, and Chronic Wasting disease. All prion diseases are incurable and 100% fatal.
Ilya went to university in Israel, and all the founders are Jewish. Many labs have offices outside of the US, like London, due to crazy immigration law in the US.
There are actually a ton of reasons to like London. The engineering talent is close to Bay level for fintech/security systems engineers while costing about 60% as much, the UK offers 186% R&D deductions with cash back instead of carry-forward, it has the best AI researchers in the world, and profit from patents is taxed at only 10% in the UK.
If we say that half of the innovations came from Alphabet/Google, then most of them (transformers, LLMs, TensorFlow) came from Google Research and not DeepMind.
Many companies have offices outside because of talent pools, costs, and other regional advantages. Though I am sure some of it is due to immigration law, I don't believe that is generally the main factor. Plus the same could be said for most other countries.
Part of it may also be a way to mitigate potential regulatory risk. Israel thus far does not have an equivalent to something like SB1047 (the closest they've come is participation in the Council of Europe AI treaty negotiations), and SSI will be well-positioned to lobby against intrusive regulation domestically in Israel.
I feel like local models could be an amazing coding experience because you could disconnect from the internet. Usually I need to open ChatGPT or Google every so often to solve some issue or generate some function, but this also introduces so many distractions. Imagine being able to turn off the internet completely and only have a chat assistant that runs locally. I fear, though, that it is just going to be a bit too slow at generating tokens on CPU not to be annoying.
I don't have a gut feel for how much difference the Mamba arch makes to inference speed, nor how much quantisation is likely to ruin things, but as a rough comparison Mistral-7B at 4 bits per param is very usable on CPU.
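For a rough sense of why a 4-bit 7B model is usable on CPU: single-stream decoding is approximately memory-bandwidth bound, since every generated token requires streaming all the weights through the CPU. A hedged back-of-envelope sketch (all numbers below are illustrative assumptions, not benchmarks):

```python
# Back-of-envelope CPU decoding speed, assuming decoding is purely
# memory-bandwidth bound. All values are rough assumptions.

params = 7e9            # 7B-parameter model
bits_per_param = 4      # 4-bit quantisation
bandwidth = 50e9        # ~50 GB/s, typical dual-channel desktop DDR5

weight_bytes = params * bits_per_param / 8   # ~3.5 GB read per token
tokens_per_s = bandwidth / weight_bytes
print(f"~{tokens_per_s:.0f} tokens/s")       # prints "~14 tokens/s"
```

Real numbers vary with cache behaviour, threading, and the attention KV cache, but this is why quantisation helps so much on CPU: halving the bytes per parameter roughly doubles decode speed.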
The issue with using any local models for code generation comes up with doing so in a professional context: you lose any infrastructure the provider might have for avoiding regurgitation of copyright code, so there's a legal risk there. That might not be a barrier in your context, but in my day-to-day it certainly is.
Current systems are already (in a limited way) helping with alignment: Anthropic is using its AI to label the features found by their sparse autoencoder approach. I think the original idea of labeling neurons with AI came from William Saunders, who also left OpenAI recently.
Resignations lead to more resignations... unless management can get on top of it and remedy it quickly, which rarely happens. I've seen it happen way too many times in 25 years working in tech.
You need to think about OpenAI specifically - Ilya basically attempted a coup last year and failed, stayed in the company for months afterwards, according to rumours had limited contributions to the breakthroughs in research and was assigned to lead the most wishy-washy project of superalignment.
I’m not seeing “the good ones” leaving in this case.
Seriously asking; I've purchased a GitHub Copilot subscription, but I don't know how their AI sales numbers are doing in general. It remains to be seen whether it can be delivered to consumers more cost-efficiently.
Because Tesla is. Unlike the traditional automakers, which have no room for growth and are in perpetual stagnation, Tesla has potential, being partly an automotive and partly a tech company. They could even have their own mobile phones if they wanted to. Or robots and stuff.
What can Mercedes, Porsche, and Audi do aside from continuing to produce the same cars over and over until they are overtaken by somebody else? Hell, both the EU and the USA need tariffs to compete with Chinese automakers.
Not quite. Tesla has a high valuation mostly because traditional auto carries an enormous amount of debt on their balance sheets. I think Tesla is one economic downturn in a high interest rate environment from meeting the same fate. Once an auto company is loaded with debt, they get stuck in a low margin cycle where the little profit they make has to go into new debt for retooling and factories. Tesla is very much still coasting from zero interest rate free VC money times.
I am sorry, but there must be some hidden tech, some completely different approach, for them to speak about AGI like this.
I really, really doubt that transformers will become AGI. Maybe I am wrong, I am no expert in this field, but I would love to understand the reasoning behind this "could arrive this year", because it reminds me of cold fusion :X
edit: maybe the term has changed again. AGI to me means truly understanding, maybe even some kind of consciousness, but not just probability... when I explain something, I have understood it. It's not that I have soaked up so many books that I can just use a probabilistic function to "guess" which word should come next.
Don't worry, this is the "keeping the bridge intact" talk of people leaving a supposedly glorious workplace. I have worked at several places, and when people left (usually the best-paid ones), they posted LinkedIn/Twitter messages to say kudos and suggest that the corresponding business will be at the forefront of its particular niche this year or soon, and that they would always be proud of having been part of it.
Also, when they speak about AGI, it raises their (the person leaving's) marketing value, since others already know they are brilliant for having worked on something cool, and they might also know some secret sauce that could be acquired at lower cost by hiring them immediately[1]. I have seen this kind of talk play out too many times. Last January, one of the senior engineers at my current workplace in aviation left, citing something super secret coming this year or soon, and they were immediately hired by a competitor at generous pay to work on that very topic.
> Also, when they speak about AGI, it raises their(person leaving) marketing value
Why yes, of course Jan Leike just impromptu resigned and Daniel Kokotajlo just gave up 85% of his wealth in order not to sign a resignation NDA to do what you're describing...
While he'll be giving up a lot of wealth, it's unlikely that any meaningful NDA will be applied here. Maybe for products, but definitely not for their research.
There are very few people who can lead in frontier AI research domains - maybe a few dozen worldwide - and there are many active research niches. Applying an NDA to a very senior researcher would be such a massive net-negative for the industry that it'd be a net-negative for the applying organisation too.
I could see some kind of product-based NDA, like "don't discuss the target release dates for the new models", but "stop working on your field of research" isn't going to happen.
Kokotajlo: “To clarify: I did sign something when I joined the company, so I'm still not completely free to speak (still under confidentiality obligations). But I didn't take on any additional obligations when I left.
Unclear how to value the equity I gave up, but it probably would have been about 85% of my family's net worth at least.
Basically I wanted to retain my ability to criticize the company in the future.“
> but "stop working on your field of research" isn't going to happen.
We’re talking about NDAs; obviously non-competes aren’t legal in CA.
> Unclear how to value the equity I gave up, but it probably would have been about 85% of my family's net worth at least.
Percentages are nice, but with money and wealth, absolute numbers are already important enough. You can live a very, very good life even if you are losing 85%, provided the remaining 15% is USD $1M. And not signing that NDA helps you land another richly paying job, plus the freedom to say whatever you feel is important to say.
> truly understanding… when I explain something, I have understood it
When you have that feeling of understanding, it is important to recognize that it is a feeling.
We hope it’s correlated with some kind of ability to reason, but at the end of the day, you can have the ability to reason about things without realising it, and you can feel that you understand something and be wrong.
It’s not clear to me why this feeling would be necessary for superhuman-level general performance. Nor is it clear to me that a feeling of understanding isn’t what being an excellent token predictor feels like from the inside.
If it walks and talks like an AGI, at some point, don’t we have to concede it may be an AGI?
No I’m with you on this. Next token prediction does lead to impressive emergent phenomena. But what makes people people is an internal drive to attend to our needs, and an LLM exists without that.
A real AGI should be something you can drop in to a humanoid robot and it would basically live as an individual, learning from every moment and every day, growing and changing with time.
LLMs can’t even count the number of letters in a sentence.
>LLMs can’t even count the number of letters in a sentence.
It's a consequence of tokenization. They "see" the world through tokens, and tokenization rules depend on the specific middleware you're using. It's like blinding someone and then claiming they aren't intelligent because they can't tell red from green. That's just how they perceive the world, and it says nothing about intelligence.
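A toy sketch of why this happens (the greedy longest-match rule and the subword vocabulary below are made up for illustration; real BPE tokenizers learn their vocabularies from data):

```python
# Toy greedy longest-match tokenizer over a made-up subword vocabulary.
def toy_tokenize(word, vocab):
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:       # take the longest known piece
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])       # unknown: fall back to one character
            i += 1
    return tokens

vocab = {"straw", "berry", "count", "the"}
tokens = toy_tokenize("strawberry", vocab)
print(tokens)       # ['straw', 'berry']
print(len(tokens))  # 2 tokens, even though the word has 10 letters
```

A model that only ever receives the two IDs for "straw" and "berry" has no direct access to the ten underlying characters, so letter-counting has to be inferred indirectly rather than read off the input.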
I counted very quickly, but 78? I learned Arabic in kindergarten; I'm not sure what your point was. There are Arabic spelling bees and an alphabet song, just like English.
The comment you replied to was saying LLMs trained on English can't count letters in English.
LLMs aren't trained in English with the same granularity that you and I are.
So my analogy here stands : OP was trained in "reading human language" with Roman letters as the basis of his understanding, and it would be a significant challenge (fairly unrelated to intelligence level) for OP to be able to parse an Arabic sentence of the same meaning.
Or:
You learned Arabic, great (it's the next language I want to learn so I'm envious!). But from the LLM point of view, should you be considered intelligent if you can count Arabic letters but not Arabic tokens in that sentence?
Yes, it sees tokens. Asking it to count letters is a little bit like asking that of someone who never learned to read/write and only learned language through speech.
From that AGI definition, AGI is probably quite possible and reachable - but also something pointless that there are no good reasons to "use", and many good reasons not to.
This paper and other similar works changed my opinion on that quite a bit. It shows that to perform text prediction, LLMs build complex internal models.
> maybe the term has changed again. AGI to me means truly understanding, maybe even some kind of consciousness, but not just probability... when I explain something, I have understood it.
The term, and indeed each initial, means different things to different people.
To me, even InstructGPT manages to be a "general" AI, so it counts as AGI — much to the confusion and upset of many like you who think the term requires consciousness, and others who want it to be superhuman in quality.
I would also absolutely agree LLMs are not at all human-like. I don't know whether they do or don't need the various missing parts in order to change the world into a jobless (u/dys)topia.
I also don't have any reason to be for or against any claim about consciousness, given that word also has a broad range of definitions to choose between.
I expect at least one more breakthrough architecture on the scale of Transformers before we get all the missing bits from human cognition, even without "consciousness".
Yeah, that's the part I don't understand though - do I understand it? Or do I just think I understand it. How do I know that I am not probabilistic also?
Synthesis is the only thing that comes to mind as a differentiator between me and an LLM.
- A possibility to fact-check the text, for example by the Wolfram math engine or by giving internet access
- Something like an instinct to fight for life (seems dangerous)
- some more subsystems: let's have a look at the brain: there's the amygdala, the cerebellum, the hippocampus, and so on, and there must be some evolutionary need for these parts
AGI can’t be defined as autocomplete with fact checker and instinct to survive, there’s so so so much more hidden in that “subsystems point”. At least if we go by Bostroms definition…
As something of a (biased) expert: yes, it’s a big deal, and yes, this seemingly dumb breakthrough was the last missing piece. It takes a few dozen hours of philosophy to show why your brain is also composed of recursive structures of probabilistic machines, so forget that, it’s not necessary; instead, take a glance at these two links:
> Alan Turing on why we should never ever perform a Turing test
Can you cite specifically what in the paper you're basing that on? I skimmed it as well as the Wikipedia summary but I didn't see anywhere that Turing said that the imitation game should not be played.
I was definitely being a bit facetious for emphasis, but he says a few times that the original question — “Can machines think?” - is meaningless, and the imitation game question is solved in its very posing. As a computer scientist he was of course worried about theoretical limits, and he intended the game in that vein. In that context he sees the answer as trivial: yes, a good enough computer will be able to mimic human behavior.
The essay’s structure is as follows:
1. Propose theoretical question about computer behavior.
2. Describe computers as formal automata.
3. Assert that automata are obviously general enough to satisfy the theoretical question — with good enough programming and enough power.
4. Dismiss objections, of which “humans might be telepathic” was somewhat absurdly the only one left standing.
It’s not a very clearly organized paper IMO, and the fun description of the game leads people to think he’s proposing that. That’s just the premise, and the pressing conclusion he derives from it is simple: spending energy on this question is meaningless, because it’s either intractable or solved depending on your approach (logical and empirical, respectively).
TL;DR: the whole essay revolves around this quote, judge for yourself:
We may now consider the ground to have been cleared and we are ready to proceed to the debate on our question, "Can machines think?" and the variant of it quoted at the end of the last section… ["Are there discrete-state machines which would do in the Imitation Game?"]
It will simplify matters for the reader if I explain first my own beliefs in the matter.
Consider first the more accurate form of the question. I believe that in about fifty years' time it will be possible to programme computers, with a storage capacity of about 10^9, to make them play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning.
The original question, "Can machines think?" I believe to be too meaningless to deserve discussion. Nevertheless I believe that at the end of the century the use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking without expecting to be contradicted.
Relying on specific people was never a good strategy; people will change, but this will be a good test of their crazy governance structure. I think of it like political systems: if a system can't withstand someone fully malicious getting into power, then it's not a good system.
Criminals misusing it? I feel like this is already a dangerous way to use AI, they use an enthusiastic, flirty and attractive female voice on millions of nerds. They openly say this is going to be like the movie Her. Shouldn't we have some societal discussion before we unleash paid AI girlfriends on everybody?
marketing is marketing. look how they marketed cigarettes, cars, all kinds of things that now people feel are perhaps not so good. it's part and parcel of the world that also does so much good. personally, i'd market it differently, but this is why i'm no CEO =).
if we help each other understand these things and how to cope, all will be fine in the end. we will hit some bumps, and yes, there will be discomfort, but that's ok. that's all part of life. life is not about being happy and comfortable all the time, no matter how much we would want that.
some people even want paid AI girlfriends. who are you to tell them they are not allowed to have one?
Agree that artificial intelligence is an outlier. I think it is the technology with the greatest associated risk of all technologies humans have worked on.
It’s unhelpful to the argument when you do this, and it makes our side look like a bunch of smug self entitled assholes.
The reality is that AI is disruptive but we don’t know how disruptive.
The parent post is clearly hyperbole; but let’s push back on what is clearly nonsense (ie. AI being more dangerous than nuclear weapons) in a logical manner hm?
Understanding AI is not the issue here; the issue is that no one knows how disruptive it will eventually be; not me, not you, not them.
People are playing the risk mitigation game; but the point is that if you play it too hard, you end up as a Luddite in a cave with no lights because something might be dangerous about “electricity”.
I disagree. Debating gives legitimacy, especially when one begins to debate a throwaway comment that doesn't even put an argument forward. The right answer is outright dismissal.