DalasNoin's comments | Hacker News

So the models will no longer be thinking in plain English but in some embedding space? That doesn't seem like what you want.


Seems exactly like what you want. We don't think in plain English, we _rationalize_ our thoughts into English (or whatever language comes out) but they must be more fundamental than language because language is acquired.

Essentially, English is one of many possible encodings of an underlying intuitive, possibly non-symbolic representation.


Cognitive scientists called it “mentalese”.


> We don't think in plain English

That's debatable. Language shapes thought much more than you might think: you learn concepts from language that you could not have imagined by yourself until you learned/read about them, so thoughts are in effect closely linked to language.


I can also think in images and internal visualizations. Geometric reasoning is also a thing. Musicians can also hear things in their mind - some can write it down, others can play it directly, and in my case I'm not good enough to get it out of my head!

In all cases though these thoughts are kind of tied to representations from the real world. Sort of like other languages via different senses. So yeah, how abstract can our thoughts actually be?


But the thing you learn is not the word 'purple'. You just use the word as the mental scaffolding to build a concept of purple. The word forms a linkage to a deeper embedding, which is further proven by the fact that it's actually slightly different in each mind that has understanding of the concept.

This embedded concept is what is doing the work, the word was just the seed of the understanding and a method by which to convey that understanding to others.


Language is definitely a significant part of thinking, but when I remember how cold it was outside yesterday to figure out if it was colder than today, I'm not bringing words to mind. I'm bringing up some other non-discrete information that I could never precisely encode into words and then factoring that in with the other non-discrete information I'm currently taking in through my senses. It's only after that processing that I encode it as a lossy "It was colder yesterday" statement.


Fair, but there are many categories of languages.

For example, I can think in formal logic. I've learned to do that, and surely my brain takes a step-by-step approach to it, but I've also internalized some of it and I don't think that my proficiency with English has anything to do with it.

I could have learned the same concepts in any other language, but the end result would be the same.

And surely there are many thoughts that can't be expressed purely with words. For example all that is related to qualia. You can think of a color but you can't describe what you see in your mind's eye with words, not in a way that would let a blind person share the same experience. Or try describing "love" without making a similitude. Is love a thought? Or a feeling? Is there a meaningful difference between the two?


You're basically talking about Sapir-Whorf here:

https://en.wikipedia.org/wiki/Linguistic_relativity

>The hypothesis is in dispute, with many different variations throughout its history.[2] The strong hypothesis of linguistic relativity, now referred to as linguistic determinism, is that language determines thought and that linguistic categories limit and restrict cognitive categories. This was a claim by some earlier linguists pre-World War II;[3] since then it has fallen out of acceptance by contemporary linguists.


Eh, probably both. Why does it have to be a fight between two schools of thought? Thought can be cross-modal: some of it can happen in a specific language and some can be visual.

(Universal grammar people hate this somehow; it's weird.)


If you mean “not what we want” for safety reasons, I think I agree.

If you don’t mean for safety reasons, I’m not sure why.


In section 2 they briefly mention studies such as [1] that point out that the token outputs of a chain of thought aren't always entirely faithful to the responses of the models.

I'm not sure whether it wouldn't be more reliable to let the model run on latents and try to train a separate latent-reading explainer module that has at least some approximation of what we want as an explicit optimization objective.

Assuming it actually is or has the potential to be better than CoT, from what I gathered from the paper the current results are mostly just more efficient token-wise.

[1] https://arxiv.org/abs/2305.04388
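
To make the "latent-reading explainer" idea a bit more concrete, here's a minimal, purely illustrative sketch (module names, shapes, and the training data are all made up, nothing from the paper): a small probe trained on the frozen model's latents to emit explanation tokens, so that faithfulness to the latents becomes the explicit objective.

  import torch
  import torch.nn as nn

  class LatentExplainer(nn.Module):
      """Toy probe that maps frozen latent states to explanation tokens."""
      def __init__(self, d_model=2048, vocab_size=32000, n_layers=2):
          super().__init__()
          layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
          self.body = nn.TransformerEncoder(layer, n_layers)
          self.lm_head = nn.Linear(d_model, vocab_size)

      def forward(self, latents):               # latents: [batch, seq, d_model]
          return self.lm_head(self.body(latents))

  explainer = LatentExplainer()
  opt = torch.optim.AdamW(explainer.parameters(), lr=1e-4)

  def train_step(latents, explanation_ids):
      # latents come from the frozen base model's hidden states;
      # explanation_ids are target rationale tokens from a hypothetical dataset.
      logits = explainer(latents)
      loss = nn.functional.cross_entropy(
          logits.reshape(-1, logits.size(-1)), explanation_ids.reshape(-1))
      opt.zero_grad(); loss.backward(); opt.step()
      return loss.item()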


I was thinking about safety reasons, but also usability. It seems like a pretty big difference to me if you can't understand the chain of thought. How faithful CoTs are is another question.


They have never been thinking. This is important.

Predicting the next word is not intelligence.


Someone could fine-tune a model on pairs of existing proteins and their misfolded prion forms and then ask the system to come up with new prions for other proteins. ChatGPT found these four companies that will produce proteins for you based just on digital DNA that you send them:

- Genewiz (Azenta Life Sciences)

- Thermo Fisher Scientific (GeneArt)

- Tierra Biosciences

- NovoPro Labs


Whelp, time to move to a small island in the middle of the Pacific.


One of the few cases where Mars actually is a decent planet B.


The context here is that prions are misfolded proteins that replicate by causing other proteins to change their configuration into the misfolded form of the prion. Diseases caused by prions include Mad Cow disease, Creutzfeldt-Jakob disease, and Chronic Wasting disease. All prion diseases are incurable and 100% fatal.


Ilya went to university in Israel and all the founders are Jewish. Many labs have offices outside of the US, like London, due to crazy immigration law in the US.


There are actually a ton of reasons to like London. The engineering talent is close to Bay Area level for fintech/security systems engineers while being 60% of the price; the UK offers 186% R&D deductions with cash back instead of carry forward; it has the best AI researchers in the world; and profit from patents is only taxed at 10% in the UK.


If London has the best AI researchers in the world, why are all the top companies (minus Mistral) American?


Demis Hassabis says that half of all innovations that caused the recent AI boom came from DeepMind, which is London based.


His opinion is obviously biased.

If we say that half of the innovations came from Alphabet/Google, then most of them (transformers, LLMs, TensorFlow) came from Google Research and not DeepMind.


People are choosing headquarters for access to capital rather than talent. That should tell you a lot about the current dynamics of the AI boom.


Google DeepMind is based in London.


Many companies have offices outside because of talent pools, costs, and other regional advantages. Though I am sure some of it is due to immigration law, I don't believe that is generally the main factor. Plus the same could be said for most other countries.


Part of it may also be a way to mitigate potential regulatory risk. Israel thus far does not have an equivalent to something like SB1047 (the closest they've come is participation in the Council of Europe AI treaty negotiations), and SSI will be well-positioned to lobby against intrusive regulation domestically in Israel.


Ilya also lived in Israel as a kid from age 5 to 15 so he speaks Hebrew. His family emigrated from Russia. Later they moved to Canada.

Source: Wikipedia.


Two of the founders are Israeli and the other is French, I think (went to University in France).

Israel is a leading AI and software development hub in the world.


> Israel is a leading AI and software development hub...

Yep, and if any place will produce the safest AI ever, its got to be there.


Safest place for AI? Their military has the worst track record, and it's a complete fascist state. Israel is the worst place to fund "safe and humane AI".

Israeli military operations continue to this day, with over 41,000 civilians killed.


Pretty sure they're being sarcastic.


I wasn't aware of his or any of the other founders' backgrounds. I simply thought it was political somehow.

Thanks.


I feel like local models could be an amazing coding experience because you could disconnect from the internet. Usually I need to open ChatGPT or Google every so often to solve some issue or generate some function, but this also introduces so many distractions. Imagine being able to turn off the internet completely and only have a chat assistant that runs locally. I fear, though, that it is just going to be a bit too slow at generating tokens on CPU to not be annoying.


I don't have a gut feel for how much difference the Mamba arch makes to inference speed, nor how much quantisation is likely to ruin things, but as a rough comparison Mistral-7B at 4 bits per param is very usable on CPU.
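
For a concrete sense of what that looks like, here's a rough sketch using llama-cpp-python with a 4-bit GGUF quant; the model filename and thread count are just placeholders for whatever you actually have locally.

  from llama_cpp import Llama

  # Hypothetical local path to a 4-bit quantized GGUF file.
  llm = Llama(
      model_path="./mistral-7b-instruct-v0.2.Q4_K_M.gguf",
      n_ctx=4096,       # context window
      n_threads=8,      # tune to your CPU core count
  )

  out = llm(
      "Write a Python function that parses an ISO 8601 date string.",
      max_tokens=256,
      temperature=0.2,
  )
  print(out["choices"][0]["text"])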

The issue with using local models for code generation comes up in a professional context: you lose any infrastructure the provider might have for avoiding regurgitation of copyrighted code, so there's a legal risk there. That might not be a barrier in your context, but in my day-to-day it certainly is.


Current systems are already (in a limited way) helping with alignment: Anthropic is using its AI to label the sparse features from its sparse autoencoder approach. I think the original idea of labeling neurons with AI came from William Saunders, who also left OpenAI recently.
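
Roughly, the sparse autoencoder shape being referenced looks like this (illustrative sketch only, not Anthropic's actual code; all sizes are made up): an overcomplete ReLU encoder with an L1 sparsity penalty, trained to reconstruct model activations, whose resulting features are then handed to another model to label.

  import torch
  import torch.nn as nn

  class SparseAutoencoder(nn.Module):
      def __init__(self, d_act=2048, d_features=16384):   # made-up sizes
          super().__init__()
          self.encoder = nn.Linear(d_act, d_features)
          self.decoder = nn.Linear(d_features, d_act)

      def forward(self, acts):
          features = torch.relu(self.encoder(acts))   # sparse feature activations
          return self.decoder(features), features

  sae = SparseAutoencoder()
  opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
  l1_coeff = 1e-3   # sparsity strength, arbitrary here

  def train_step(acts):                   # acts: [batch, d_act] model activations
      recon, features = sae(acts)
      loss = ((recon - acts) ** 2).mean() + l1_coeff * features.abs().mean()
      opt.zero_grad(); loss.backward(); opt.step()
      return loss.item()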


In that metaphor, is openai the humans or are actual humans the humans? So is openai about to be destroyed or humanity?


Openai would be the humans here and Ilya would be the dolphin. (In the metaphor, the dolphins leave and here Ilya is leaving)


The dolphins are actually openai


There goes the so-called superalignment team:

Ilya

Jan Leike

William Saunders

Leopold Aschenbrenner

All gone


Resignations lead to more resignations... unless management can get on top of it and remedy it quickly, which rarely happens. I've seen it happen way too many times in 25 years working in tech.


This might not be bad from the perspective of the remaining employees, it might be that the annoying people are leaving the room.


Or that just the aggressive snakes are left.

I have no idea; I'm just saying I've seen that happen in companies.


In my experience, the good ones leave first, followed by those who enjoyed working with them and/or those who are no longer able to get work done.


You need to think about OpenAI specifically - Ilya basically attempted a coup last year and failed, stayed in the company for months afterwards, according to rumours had limited contributions to the breakthroughs in research and was assigned to lead the most wishy-washy project of superalignment.

I’m not seeing “the good ones” leaving in this case.


So Satya Nadella paid $13 billion to have....Sam Altman :-))


Perhaps Altman will fail upwards once again to become CEO of Microsoft


Are you suggesting that the OAI investment was not a good investment for MS?


Have they earned a return on it yet?

Seriously asking; I've purchased a GitHub CoPilot license subscription but I don't know what their sales numbers are doing on AI in general. It's to be seen if it can be made more cost-efficient to deliver to consumers.


Checking MSFT price, seems like the market thinks they made the right move and the shareholders are for sure seeing a return.


The market thinks Tesla is worth more than all other automakers combined, that GameStop is a reasonable investment, and laying off engineers is great!


Until one day it doesn't. It's very fickle.


But the market is always right.


Because Tesla is. Unlike the traditional automakers, which have no room for growth and are in perpetual stagnation, Tesla has potential as a partly automotive, partly tech company. They could even have their own mobile phones if they wanted to. Or robots and stuff.

What can Mercedes, Porsche, and Audi do aside from continuing to produce cars over and over again until they are overtaken by somebody else? Hell, both the EU and the USA need tariffs to compete with Chinese automakers.


Not quite. Tesla has a high valuation mostly because traditional auto carries an enormous amount of debt on their balance sheets. I think Tesla is one economic downturn in a high interest rate environment from meeting the same fate. Once an auto company is loaded with debt, they get stuck in a low margin cycle where the little profit they make has to go into new debt for retooling and factories. Tesla is very much still coasting from zero interest rate free VC money times.


An increased price for a company does indeed reflect the expectation of future profits, but until those profits hit the balance sheet they are unrealized.


What % of stock movements do you attribute to OAI, vs the cash-generation behemoth that is Windows/Office/Azure?


Nah, they paid for the brand.


And, you know, a company with $2B of revenue


That for sure loses money on every prompt...


I guess if they really thought we had something to worry about, they would've stayed just to steer things in the right direction.

Doesn't seem like they felt it was required.

Edit: I'd love to know why the downvotes; it's an opinion, not a political statement. This community is quite off lately.

Is this a highly controversial statement? Are people truly worried about the future and is this just an anxiety-based reaction?


Doesn't the whole Altman sacking thing show that they had no power to do any steering, and in fact Altman steers?


Daniel “Quit OpenAI due to losing confidence that it would behave responsibly around the time of AGI”

“I think AGI will probably be here by 2029, and could indeed arrive this year”

Kokotajlo too.

We are so fucked


I am sorry, but there must be some hidden tech, some completely different approach, for people to speak about AGI like this.

I really, really doubt that transformers will become AGI. Maybe I am wrong, I am no expert in this field, but I would love to understand the reasoning behind this "could arrive this year", because it reminds me of cold fusion :X

edit: maybe the term has changed again. AGI to me means truly understanding, maybe even some kind of consciousness, but not just probability... when I explain something, I have understood it. It's not that I have soaked up so many books that I can just use a probabilistic function to "guess" which word should come next.


Don't worry, this is the "keep the bridge intact" talk of people leaving a supposedly glorious workplace. I have worked at several places, and when people left (usually the best-paid ones), they posted LinkedIn/Twitter messages saying kudos, insisting that the business will be at the forefront of its particular niche this year or soon, and that they would always be proud of having been part of it.

Also, when they speak about AGI, it raises their (person leaving) marketing value: others already assume they are brilliant for having worked on something cool, and they might also know some secret sauce that could be acquired at lower cost by hiring them immediately. I have seen this kind of talk play out too many times. Last January, one of the senior engineers at my current workplace in aviation left while hinting at something super secret coming this year or soon, and was immediately hired by a competitor at generous pay to work on that very topic.


> Also, when they speak about AGI, it raises their (person leaving) marketing value

Why yes, of course Jan Leike just abruptly resigned and Daniel Kokotajlo just gave up 85% of his wealth in order not to sign a resignation NDA to do what you're describing...


While he'll be giving up a lot of wealth, it's unlikely that any meaningful NDA will be applied here. Maybe for products, but definitely not for their research.

There are very few people who can lead in frontier AI research domains - maybe a few dozen worldwide - and there are many active research niches. Applying an NDA to a very senior researcher would be such a massive net negative for the industry that it'd be a net negative for the applying organisation too.

I could see some kind of product-based NDA, like "don't discuss the target release dates for the new models", but "stop working on your field of research" isn't going to happen.


Kokotajlo: “To clarify: I did sign something when I joined the company, so I'm still not completely free to speak (still under confidentiality obligations). But I didn't take on any additional obligations when I left.

Unclear how to value the equity I gave up, but it probably would have been about 85% of my family's net worth at least.

Basically I wanted to retain my ability to criticize the company in the future.“

> but "stop working on your field of research" isn't going to happen.

We’re talking about NDA, obviously no-competes aren’t legal in CA

https://www.lesswrong.com/posts/kovCotfpTFWFXaxwi/?commentId...


> Unclear how to value the equity I gave up, but it probably would have been about 85% of my family's net worth at least.

Percentages are nice, but with money and wealth the absolute numbers are what matter. You can live a very, very good life even if you lose 85%, if the remaining 15% is USD $1M. And not signing that NDA may help you land another richly paying job, plus give you the freedom to say whatever you feel is important to say.


> truly understanding… when I explain something, I have understood it

When you have that feeling of understanding, it is important to recognize that it is a feeling.

We hope it’s correlated with some kind of ability to reason, but at the end of the day, you can have the ability to reason about things without realising it, and you can feel that you understand something and be wrong.

It’s not clear to me why this feeling would be necessary for superhuman-level general performance. Nor is it clear to me that a feeling of understanding isn’t what being an excellent token predictor feels like from the inside.

If it walks and talks like an AGI, at some point, don’t we have to concede it may be an AGI?


I would say understanding usually means the ability to connect the dots and see the implications … not a feeling.


Okay, what if I put it like this: there is understanding (ability to reason about things), and there is knowing that you understand something.

In people, these are correlated, but one does not necessitate the other.


No I’m with you on this. Next token prediction does lead to impressive emergent phenomena. But what makes people people is an internal drive to attend to our needs, and an LLM exists without that.

A real AGI should be something you can drop in to a humanoid robot and it would basically live as an individual, learning from every moment and every day, growing and changing with time.

LLMs can’t even count the number of letters in a sentence.


>LLMs can’t even count the number of letters in a sentence.

It's a consequence of tokenization. They "see" the world through tokens, and tokenization rules depend on the specific middleware you're using. It's like making someone blind and then claiming they are not intelligent because they can't tell red from green. That's just how they perceive the world and tells nothing about intelligence.
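
You can see the effect directly with OpenAI's tiktoken package (assuming it's installed; the exact split depends on which encoding you pick):

  import tiktoken

  enc = tiktoken.get_encoding("cl100k_base")
  tokens = enc.encode("strawberry")
  print(tokens)                                           # a few integer token IDs
  print([enc.decode_single_token_bytes(t) for t in tokens])
  # The model "sees" these multi-character chunks, not individual letters,
  # which is why letter-counting questions are awkward for it.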


But it limits them; they can't be AGI then, because a child who can count could do it :)


You seem generally intelligent. Can you tell how many letters are in the following sentence?

"هذا دليل سريع على أنه حتى البشر الأذكياء لا يمكنهم قراءة ”الرموز“ أو ”الحروف“ من لغة لم يتعلموها."


I counted very quickly, but 78? I learned Arabic in kindergarten; I'm not sure what your point was. There are Arabic spelling bees and an alphabet song, just like in English.

The comment you replied to was saying that LLMs trained on English can't count letters in English.


LLMs aren't trained in English with the same granularity that you and I are.

So my analogy here stands : OP was trained in "reading human language" with Roman letters as the basis of his understanding, and it would be a significant challenge (fairly unrelated to intelligence level) for OP to be able to parse an Arabic sentence of the same meaning.

Or:

You learned Arabic, great (it's the next language I want to learn so I'm envious!). But from the LLM point of view, should you be considered intelligent if you can count Arabic letters but not Arabic tokens in that sentence?


Is this even a fair comparison? Are we asking an LLM to count letters in an alphabet it never saw?


Yes, it sees tokens. Asking it to count letters is a little bit like asking that of someone who never learned to read/write and only learned language through speech.


From that AGI definition, AGI is probably quite possible and reachable - but also something pointless that there are no good reasons to "use", and many good reasons not to.


LLMs could count the number of letters in a sentence if you stopped tokenizing them first.


Tokenization is not the issue - these LLMs can all break a word into letters if you ask them.


This paper and other similar works changed my opinion on that quite a bit. It shows that to perform text prediction, LLMs build complex internal models.

https://news.ycombinator.com/item?id=38893456


> maybe the term has changed again. AGI to me means truly understanding, maybe even some kind of consciousness, but not just probability... when I explain something, I have understood it.

The term, and indeed each initial, means different things to different people.

To me, even InstructGPT manages to be a "general" AI, so it counts as AGI — much to the confusion and upset of many like you who think the term requires consciousness, and others who want it to be superhuman in quality.

I would also absolutely agree LLMs are not at all human-like. I don't know if they do or don't need the various missing parts in order to change the world into a jobless (u/dys)topia.

I also don't have any reason to be for or against any claim about consciousness, given that word also has a broad range of definitions to choose between.

I expect at least one more breakthrough architecture on the scale of Transformers before we get all the missing bits from human cognition, even without "consciousness".

What do you mean by "truly understanding"?


> when I explain something, I have understood it.

Yeah, that's the part I don't understand though - do I understand it? Or do I just think I understand it? How do I know that I am not probabilistic also?

Synthesis is the only thing that comes to mind as a differentiator between me and an LLM.


I think what's missing:

- A way to fact-check the text, for example via the Wolfram math engine or internet access (see the rough sketch after this list)

- Something like an instinct to fight for life (seems dangerous)

- Some more subsystems: if we look at the brain, there's the amygdala, the cerebellum, the hippocampus, and so on, and there must be some evolutionary need for those parts
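
To illustrate the first point, a fact-checking loop could be as simple as re-deriving the model's numeric claims with an external tool before trusting them; here the "tool" is just a tiny safe calculator standing in for something like Wolfram (everything below is a made-up sketch, not a real API):

  import ast
  import operator as op

  OPS = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul, ast.Div: op.truediv}

  def calc(expr):
      # Stand-in "calculator" tool: safely evaluates basic arithmetic only.
      def ev(node):
          if isinstance(node, ast.Constant):
              return node.value
          if isinstance(node, ast.BinOp) and type(node.op) in OPS:
              return OPS[type(node.op)](ev(node.left), ev(node.right))
          raise ValueError("unsupported expression")
      return ev(ast.parse(expr, mode="eval").body)

  def fact_check(claim, expr, claimed_value):
      # Re-derive a numeric claim with the tool instead of trusting the model.
      actual = calc(expr)
      return f"{claim}: {'OK' if actual == claimed_value else 'wrong, got ' + str(actual)}"

  # Hypothetical harness: the LLM emits (claim, expression, value) triples
  # and each one is verified before the answer is shown to the user.
  print(fact_check("17 * 24 = 408", "17 * 24", 408))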


AGI can't be defined as autocomplete with a fact checker and an instinct to survive; there's so, so much more hidden in that “subsystems” point. At least if we go by Bostrom's definition…


As something of a (biased) expert: yes, it's a big deal, and yes, this seemingly dumb breakthrough was the last missing piece. It takes a few dozen hours of philosophy to show why your brain is also composed of recursive structures of probabilistic machines, so forget that, it's not necessary; instead, take a glance at these two links:

1. Alan Turing on why we should never ever perform a Turing test: https://redirect.cs.umbc.edu/courses/471/papers/turing.pdf

2. Marvin Minsky on the “Frame Problem” that led to one or two previous AI winters, and what an intuitive algorithm might look like: https://ojs.aaai.org/aimagazine/index.php/aimagazine/article...


> Alan Turing on why we should never ever perform a Turing test

Can you cite specifically what in the paper you're basing that on? I skimmed it as well as the Wikipedia summary but I didn't see anywhere that Turing said that the imitation game should not be played.


Sorry I missed this, for posterity:

I was definitely being a bit facetious for emphasis, but he says a few times that the original question — “Can machines think?” - is meaningless, and the imitation game question is solved in its very posing. As a computer scientist he was of course worried about theoretical limits, and he intended the game in that vein. In that context he sees the answer as trivial: yes, a good enough computer will be able to mimic human behavior.

The essay’s structure is as follows:

1. Propose theoretical question about computer behavior.

2. Describe computers as formal automata.

3. Assert that automata are obviously general enough to satisfy the theoretical question — with good enough programming and enough power.

4. Dismiss objections, of which “humans might be telepathic” was somewhat absurdly the only one left standing.

It’s not a very clearly organized paper IMO, and the fun description of the game leads people to think he’s proposing that. That’s just the premise, and the pressing conclusion he derives from it is simple: spending energy on this question is meaningless, because it’s either intractable or solved depending on your approach (logical and empirical, respectively).

TL;DR: the whole essay revolves around this quote, judge for yourself:

  We may now consider the ground to have been cleared and we are ready to proceed to the debate on our question, "Can machines think?" and the variant of it quoted at the end of the last section… ["Are there discrete-state machines which would do in the Imitation Game?"]

  It will simplify matters for the reader if I explain first my own beliefs in the matter.

  Consider first the more accurate form of the question. I believe that in about fifty years' time it will be possible to programme computers, with a storage capacity of about 10^9, to make them play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning.

  The original question, "Can machines think?" I believe to be too meaningless to deserve discussion. Nevertheless I believe that at the end of the century the use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking without expecting to be contradicted.


Relying on specific people was never a good strategy; people will change, but this will be a good test of their crazy governance structure. I think of it like political systems: if it can't withstand someone fully malicious getting into power, then it's not a good system.


Does the same apply to Sam Altman as well? The thing felt like a cult when he was forced out and everyone threatened to resign.


Criminals misusing it? I feel like this is already a dangerous way to use AI: they're putting an enthusiastic, flirty, and attractive female voice in front of millions of nerds. They openly say this is going to be like the movie Her. Shouldn't we have some societal discussion before we unleash paid AI girlfriends on everybody?


Marketing is marketing. Look how they marketed cigarettes, cars, all kinds of things that people now feel are perhaps not so good. It's part and parcel of a world that also does so much good. Personally, I'd market it differently, but this is why I'm no CEO =).

If we help each other understand these things and how to cope, all will be fine in the end. We will hit some bumps, and yes, there will be discomfort, but that's OK. That's all part of life. Life is not about being happy and comfortable all the time, no matter how much we would want that.

Some people even want paid AI girlfriends. Who are you to tell them they're not allowed to have them?


Agree that artificial intelligence is an outlier. I think it is the technology with the greatest associated risk of all technologies humans have worked on.


That's because you don't understand it.


Please don’t.

It's unhelpful to the argument when you do this, and it makes our side look like a bunch of smug, self-entitled assholes.

The reality is that AI is disruptive but we don’t know how disruptive.

The parent post is clearly hyperbole; but let's push back on what is clearly nonsense (i.e. AI being more dangerous than nuclear weapons) in a logical manner, hm?

Understanding AI is not the issue here; the issue is that no one knows how disruptive it will eventually be: not me, not you, not them.

People are playing the risk-mitigation game; but the point is that if you play it too hard you end up as a Luddite in a cave with no lights, because something might be dangerous about "electricity".


I disagree. Debating gives legitimacy, especially when one begins to debate a throwaway comment that doesn't even put an argument forward. The right answer is outright dismissal.


nicely put

