This is really awful: the text is completely content-free, but it's got everyone who's not a domain expert hooked.
I didn't know whether ChatGPT's ability to babble convincingly was going to cause trouble or be a funny quirk that we all knew about and worked around, but this thread is really making it look like the pessimists were right about it. The problem is that it gets past many people's filters because its phrasing sends a lot of intelligent/rational/professional signals, as it was engineered to do. Nobody is used to picking smart-sounding text apart word by word to make sure they agree with it, except maybe academics, and that's the vulnerability it reaches us through.
I also think that OpenAI got human nature backwards when they trained it to hedge on everything it said - everybody knows that people who constantly demur are the most reliable! A safe chatbot would sound pushy, like a bad salesman or an ideological agent; like something incapable of self-questioning.
I (OP) am not a physicist, but I am a scientist. I'm not gonna say the theories it produced are solid; it's quite clear to me that they're not. But they are better theories than what I (who knows the fundamentals of physics, for the most part) could come up with. So it's not a subject-matter expert, but it's better than the average Joe. For what it is (and given that I asked it to imagine theories), its output looks impressive to me. And I can attest to its abilities in biology as well, where I'd say it does a better job than most biology professors, in fact. I just didn't publish those chats here.
Maybe it did a better job in biology because I was able to correct it at an expert level (which I was not able to do here). Given the demonstrably (like right here in these comments) myopic, unimaginative nature of physics as a science today, it's possible not a single physicist would entertain this system as a hypothesis-generation machine. I mean, we've discovered everything already, right?
I think you asked it some of the right questions, but even then it was able to slip past your skepticism by being really good at sounding like an intelligent human being that was saving you the trouble of the details. There's no hypothesis in there.
You would have to be an insanely skeptical person, one who would drive anybody nuts to talk to, to approach a ChatGPT session in a field you're not an expert in (or maybe even one you are) and evaluate it correctly... The only normal human perspective that fits what ChatGPT is actually like is the one we take on people we think are terrible, and that is why I say it's awful, although it's just a machine.
Is anyone claiming there's a market niche for hypothesis generation in the natural sciences?
Full disclosure: I have a science PhD and a couple of published papers as a result. It was a long, slow, frustrating grind, but generating hypotheses wasn't even close to being the hard bit.
I too have a PhD and a few papers. If you think generating a good hypothesis wasn't the hard bit, then in my opinion you never did get what science is about. In my opinion. Which is the minority view among today's scientists. It's like people don't even know what they're doing wrong.
> If you think generating a good hypothesis wasn’t the hard bit then in my opinion you never did get what science was about [..]
Is a good hypothesis important? Sure. Is it easy to get that bit wrong? Yes, and lots of people do.
I'm reminded of one of Paul Graham's quotes:
"I also have a theory about why people think this. They overvalue ideas. They think creating a startup is just a matter of implementing some fabulous initial idea. And since a successful startup is worth millions of dollars, a good idea is therefore a million dollar idea [..] startup ideas are not million dollar ideas, and here's an experiment you can try to prove it: just try to sell one. Nothing evolves faster than markets. The fact that there's no market for startup ideas suggests there's no demand. Which means, in the narrow sense of the word, that startup ideas are worthless."[0]
I love PG's insights and I generally agree with his article on ideas. But think about it: is a startup idea a hypothesis, or something more than that? To the market each startup is perhaps a hypothesis, but for you as a founder the idea shouldn't just be a hunch. It can be in the initial stages, but you need to validate it ASAP and iterate on it. Which is the message I took from his writing.
And this doesn’t even touch the question of what’s different between basic science and entrepreneurship.
> The problem is that it gets past many people's filters because its phrasing sends a lot of intelligent/rational/professional signals, as it was engineered to do. Nobody is used to picking smart-sounding text apart word by word to make sure they agree with it, except maybe academics, and that's the vulnerability it reaches us through.
I think you're making a mountain out of a molehill. To me ChatGPT is basically a clever way to interpolate and extrapolate coherent text based on user input and its training set. If the training set is lacking in some areas, it underfits the output.
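To make that concrete, here is a deliberately tiny sketch of what I mean (my own toy, not how GPT works internally): a bigram model "trained" on a few words and then sampled. It interpolates where it has data and dead-ends where it doesn't, which is the underfitting I'm describing.

    import random
    from collections import Counter, defaultdict

    # Toy "language model": count which word follows which, then sample.
    corpus = "the forces emerge from geometry . the layers interact . the forces unify".split()
    nexts = defaultdict(Counter)
    for a, b in zip(corpus, corpus[1:]):
        nexts[a][b] += 1

    random.seed(0)
    word, out = "the", ["the"]
    for _ in range(8):
        choices = nexts[word]
        if not choices:  # no training data for this context: generation dead-ends
            break
        word = random.choices(list(choices), weights=list(choices.values()))[0]
        out.append(word)
    print(" ".join(out))

A real LLM is vastly bigger and its "contexts" are far richer, but the train-then-sample shape is the same.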
I've tested ChatGPT in a couple of engineering fields I'm familiar with, and I expected the service to respond poorly. Even though it returned nonsense on some topics I thought were low-hanging fruit, such as the release year of an international standard, overall its output was very impressive and very entertaining.
Perhaps it's the engineer in me talking, but it's pointless to waste time waxing lyrical about human nature. Tools like ChatGPT might one day be superb expert systems and teaching tools, but like any expert system or learning tool, you need to corroborate the results yourself. Human nature has zero to do with this.
That description of how it functions is true, but it did fool a lot of people, and now we have to explain why. I also note that a lot of people report positive results in their own fields of expertise, which might be part of the explanation. (It might be building trust before failing inexplicably right when it can no longer be checked.)
> That description of how it functions is true, but it did fool a lot of people, and now we have to explain why.
I see your point, and I agree. Nevertheless, these misdirections seem to boil down to a broad temptation to succumb to appeals to authority. People might be falling for ChatGPT misfires just like they fall for fancy-talking bullshit artists, but that's hardly a failing of clever auto text generators.
I'm not a domain expert either, but to me the results seem to be entirely what you would expect from a model trained to pattern-match very well on existing content.
That is, the ideas, strengths, and limitations of the current well-known theories are very well explained and mostly correct. However, the "novel theory" is mostly filler words around some very thin concepts, dressed up to sound like an actual theory with some depth to it, while in reality saying nothing more than that spacetime is granular and there could maybe be some matrices.
Haha that's for now. But humans do this too, attempting to copy the shell around a concept: the way it is expressed.
In American English, this is frequently authority speech mimicking law enforcement, scientists, or lawyers. All while lacking any knowledge of anything but the outer dressings of something.
In this way, the problem is that the AI is just mimicking most of us on most things.
Like that time a science fiction writer started his own religion based on characters from his novel and got a bunch of celebrities financially tied to it
Or when holistic practitioners procedurally generate an unsubstantiated preventative cure
Might as well just turn your brain off now, you won’t need it
I think the future will see academia as the last bastion for human ingenuity, before ultimately ML swallows that up too, or climate change swallows the whole thing up.
Since I got so many downvotes, let me share the specific claims:
“Dynamic Geometric Interactions in Multi-Layered Spacetime"
This hypothesis proposes that spacetime consists of multiple interconnected layers, each with its unique geometry and properties. The fundamental forces and particles emerge from the interactions between these layers and the geometric transformations that take place within and between them.
Multi-Layered Spacetime: The universe is not a single, continuous spacetime fabric but instead is composed of multiple interconnected layers. Each layer has its own geometric structure, which can be influenced by the other layers.
Interactions and Transformations: Particles and forces emerge from the interactions between the layers and the geometric transformations that occur within and between them. These transformations might involve changes in curvature, connectivity, or other geometric properties, leading to the observed behavior of particles and forces at different scales.
Unification: At high energy scales or specific conditions, the interactions and transformations between the layers might become more unified, leading to a single overarching interaction responsible for all fundamental forces.
[And a rough mathematical explanation]
Let's denote the granular spacetime structures as S_ij, where i and j represent the indices for the type of spacetime structure and its configuration, respectively. Suppose there are N types of spacetime structures, each corresponding to a specific force or force pattern. The interaction between particles might be described by an interaction matrix I, where each element I_ij quantifies the strength of the interaction between spacetime structures S_i and S_j. In a high-energy regime, the spacetime structures' patterns might begin to merge, leading to the unification of forces. We can represent this by introducing an energy-dependent matrix U(E), which modifies the interaction matrix I as a function of energy E:
I'(E) = U(E) * I
As the energy E approaches a critical energy level E_c, the matrix U(E) transforms I into a single unified interaction matrix, corresponding to the unified force.
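To get a feel for what this relation could even mean numerically, here is a toy reading of I'(E) = U(E) * I. The structure count N, the values, and especially the form of U(E) are all invented here - the bot gave no functional form - so this only illustrates the shape of the claim, not any physics:

    import numpy as np

    N = 4        # number of hypothetical spacetime-structure types (invented)
    E_c = 1.0    # critical energy scale, arbitrary units

    rng = np.random.default_rng(0)
    I = rng.uniform(0.1, 1.0, size=(N, N))   # low-energy interaction matrix I_ij
    I = (I + I.T) / 2                        # symmetric couplings

    def U(E):
        # Guessed mixing matrix: identity at E = 0, row-averaging as E -> E_c.
        t = min(E / E_c, 1.0)
        return (1 - t) * np.eye(N) + t * np.full((N, N), 1.0 / N)

    for E in (0.0, 0.5, 1.0):
        I_prime = U(E) @ I                   # I'(E) = U(E) * I
        # As E -> E_c, every row of I'(E) converges to the same coupling pattern.
        print(f"E={E}: spread across rows = {np.ptp(I_prime, axis=0).max():.3f}")

At E = E_c the rows of I'(E) collapse to a single shared coupling pattern - one "unified" interaction, in the sketch's loose sense.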
- - -
This comports with my understanding of the Inflaton field, which parametrically resonates (through geometric relationships?) with other fundamental fields during the Big Bang. I’ll pull some references here.
Disagree. The idea that there are different spacetime geometries for different quantum fields is an interesting and probably testable one. I mean, yes, late-night stoner shit, but next level.
Can confirm. This week a physicist friend came by to give it a try. He asked some questions related to his domain. I was fully expecting GPT to bullshit its way through it, from experience with GPT-3. But to my surprise, and to his, it gave quite decent answers.
Not sure about physics, but I am really looking forward to seeing games with NPCs powered by ChatGPT. That would make some game worlds really deep and immersive.
ChatGPT lowers information complexity by smoothing and averaging the information content, so no, you'll likely see the opposite effect.
We already went through this with Stable Diffusion - the content it produces looks very professional, but also somehow exactly the same no matter the subject.
This assumes that you will have only a single model. I'd expect at least one model per race (elves, dwarves...) x attitude (friend, enemy, neutral) x some psychological model (e.g. Jungian personality archetypes). So with ~100 GPT models trained on different sets, you could have a quite diverse army of characters.
ChatGPT also offers the possibility of, e.g., real bargaining at the shops: "How much is that sword...".
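A cheaper sketch of the same idea (one persona prompt per NPC rather than ~100 separately trained models; every name and persona below is invented for illustration):

    # Compose race x attitude x archetype into a per-NPC system prompt.
    RACES = {"dwarf": "gruff, terse, obsessed with craftsmanship"}
    ATTITUDES = {"neutral": "wary of strangers but open to fair trade"}
    ARCHETYPES = {"merchant": "always steers the conversation toward a sale"}

    def npc_system_prompt(name: str, race: str, attitude: str, archetype: str) -> str:
        """Pin a chat model to one character before the player says anything."""
        return (
            f"You are {name}, a {race} {archetype} in a fantasy town. "
            f"Personality: {RACES[race]}. "
            f"Disposition toward the player: {ATTITUDES[attitude]}. "
            f"Quirk: {ARCHETYPES[archetype]}. Stay in character and haggle over prices."
        )

    print(npc_system_prompt("Brunhild", "dwarf", "neutral", "merchant"))

Whether prompting alone gives enough diversity, versus actual fine-tuned variants, is exactly the open question.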
You miss the overarching point. Neural-network AIs work by averaging information and making it smoother, whereas it's the uneven "bumps" that actually make art interesting. (This is why most artists add deliberate imperfections to their work.)
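To illustrate the smoothing metaphor only (this is not a claim about how transformers actually compute), averaging nearby values flattens exactly the kind of sharp "bumps" I mean:

    import numpy as np

    signal = np.array([0, 0, 1, 0, 0, 5, 0, 0, 1, 0, 0], dtype=float)  # spiky "style"
    smoothed = np.convolve(signal, np.ones(3) / 3, mode="same")        # 3-point moving average
    print(signal.max(), round(smoothed.max(), 2))  # 5.0 vs ~1.67: the peak is averaged away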
I don't know how much people listen to Sabine Hossenfelder's opinions on things, but the article mentioned the age-old desire for unification and symmetry, and I have to point out how aestheticist, unscientific, biased, and frankly typical it is for an LLM trained on modern (theoretical physics) conversations to go in search of unification. I know AI isn't doing new science any time soon (or even this decade), but I do think it's probably dumb to bias your AI to go hunting for grand unified theories. What it should be looking for is unanswered questions grounded in first principles, proposing candidate experiments to discover the shape of the data behind those doors, AND ONLY THEN proposing theories.
The modern mechanics revolutions of the 20th century demonstrated that our intuitions were completely mismatched with the truth. Planck himself rejected the notion of curve fitting, which ultimately birthed energy quanta. It's OUR macro universe that is the weird one, a strange corner case of quantum reality in the absurdly large.
I'm not a physicist myself, just interested in it, and this was one of the topics I thought I'd coax the bot into imagining. The theories might not be solid (I wouldn't know myself), but I'm excited about the bot's future as a companion for everyday work in science.
I'm a biologist by training and have already made significant progress on multiple difficult questions I've had trouble with for decades, in the two days I've had GPT-4. Can't wait to see what all we could accomplish with it!
"made significant progress on multiple difficult questions I've had trouble with for decades, in the two days I've had GPT-4" -- that is wildly impressive. Can you share some specific questions? Thanks!
I have had hypotheses regarding epigenetic modulation of mutation rates playing a role in "epigenetically directed evolution"; this is not my primary field, and honestly it's not a topic anyone has properly focused on. I've even corresponded with a couple of experts in related fields but didn't make progress. Chatting with the bot, I discovered scientists who might have actually generated some data on this topic, a bit inadvertently, whom I had never found merely by googling. I've made a list and am going through the literature already. I can't imagine what else I could do with this tool if I could drag and drop PDFs for it to get more context on such arcane topics.
I think this is the best use case for ChatGPT: as a first-level search engine to surface any potentially relevant information. But I would be careful using it on a subject I'm not familiar with, or relying on its content directly. When I query areas I know well, I've found it to be wildly off on many facts; I shudder to think how someone outside the area would interpret its content.
I tried using ChatGPT 3 for light research into the British occupation of Afghanistan and poppy production. I wanted to find potential books/authors covering how far back Afghanistan might have been an exporter of poppy/opium, and it ended up giving me some really bad-quality answers, including stating that one author had written a book called "Opium and the Kung-Fu Master" - which is not a book at all, but an older Chinese action movie decrying the evils of opium addiction...
Interesting, thanks! I see this as falling into the bucket of taking unstructured data/thoughts and mapping them well to structured/articulated data that is already out there. LLMs do excel at this and keep getting better fast (minus the hallucination, but that is easy to cross-check). I would love to see examples of truly novel hypothesis generation that is verifiably plausible and isn't a remix of what came earlier...
I spent 20 minutes being gaslit by it as it tried to convince me that an obviously false logical statement in set theory is true. It wouldn't for a second admit any doubt; it would just say it's sorry I'm confused and that it'll explain it again more simply.
This thing is not good for developing any useful theories; at best it's a creative crutch.
It presented me with some weird facts, and when I asked for a reference, it hallucinated one from 1988. The article name returned 0 results on Google. I went to the journal, volume, issue, and page, and found that it pointed to the middle of another article. When I asked GPT what is on that page in that volume and issue, it replied with the hallucinated article's title again! When asked what is three pages earlier (where the actual article starts), it apologized and said it doesn't actually know...
I tried GPT-4 a couple of hours after the launch, and it tried so hard to sell me the idea that the terms on the two sides of an equation can have different dimensions.
It's not gonna spew completely solid hypotheses, but I can see how these ideas can be amazing seeds for us to pursue further. Importantly, I asked it to think like Einstein. In exchanges I didn't include here, it first just said Einstein would have worked on string theory if he were alive today. I reminded it that Einstein's unique ability was to imaginatively think of completely new ideas. It then spewed these two topics. It's like asking it to write songs in a particular poet's style, but for science. That sounds amazing to me.
Importantly it clearly has some understanding of these theories. At least as much as any regular person would, in my opinion.
It's kind of weird. This technology is designed for the human to give the machine a prompt. What I like about your approach is that you're almost using it to create the inverse: the machine supplying a slightly fuzzy idea that could prompt you to take your research in new and unexpected directions.
I love this so much. Most of the concepts and ideas it refers to are not completely new (don't ask me where I heard about them), but it's interesting to hear them proposed not as a solution to a specific problem, but just based on the fact that they are thematically close to the works of an existing scientist. A little creepy though, like research from beyond the grave (movie rights pending).
Nonetheless interesting, though. I am currently researching physics-informed neural networks for quantum problems, which are guided by differential equations. Maybe if you extended an LLM with an execution engine/numerical model, it'd be able to actually produce differential equations to underpin its hallucinated theories.
From my short testing, that is something ChatGPT (4) is not that good at. Or maybe producing novel differential equations is just generally a hard problem.
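For anyone curious what "guided by differential equations" looks like in practice, here is a minimal physics-informed-network sketch (a generic toy, not my actual research setup): fit a network y(x) so that the residual of dy/dx = -y and the condition y(0) = 1 are both driven to zero.

    import torch

    net = torch.nn.Sequential(
        torch.nn.Linear(1, 32), torch.nn.Tanh(),
        torch.nn.Linear(32, 32), torch.nn.Tanh(),
        torch.nn.Linear(32, 1),
    )
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)

    for step in range(5000):
        x = torch.rand(128, 1, requires_grad=True)    # collocation points in [0, 1]
        y = net(x)
        dy_dx = torch.autograd.grad(y.sum(), x, create_graph=True)[0]
        residual = dy_dx + y                          # enforce dy/dx = -y
        bc = net(torch.zeros(1, 1)) - 1.0             # enforce y(0) = 1
        loss = (residual ** 2).mean() + (bc ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()

    print(net(torch.tensor([[1.0]])).item())          # should approach exp(-1) ~ 0.368

The LLM-plus-execution-engine idea would presumably have the LLM propose the residual term and hand the training loop off to something like this.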
Generative AI needs a way to receive feedback. AI learned to play chess by playing a lot, and the feedback was either winning, losing, or drawing.
If generative AI can repeatedly test physics theories faster than humans, then we may witness progress in physics. AI could generate thousands of theories and conduct experiments successfully, possibly leading to new physics models.
However, I am uncertain whether this will be achievable soon, particularly for theories requiring costly experiments.
A benefit could be that, whereas humans crave recognition for success and so rarely publish their failures, AI may be more willing to take risks and document its failures.
>If generative AI can repeatedly test physics theories faster than humans, then we may witness progress in physics. AI could generate thousands of theories and conduct experiments successfully, possibly leading to new physics models. However, I am uncertain whether this will be achievable soon, particularly for theories requiring costly experiments.
I've long felt that this may be the strongest argument against an AI singularity.
The technical ability to emulate the minds of the world's theoretical physicists and run accelerated simulations of their thought processes may be developed, but the generation of valid new insights in physics might depend strongly on observations and experiments conducted in the physical world, as seems to have been the case historically, and the virtual equivalents of those experiments may prove to be inadequate or impractical to implement.
Steven Pinker made a similar argument in this 2018 discussion with Sam Harris (the remarks begin at 65m03s in this recording [1]; the full context begins at around 50m36s [2]). Harris is concerned about existential risks posed by advances in artificial intelligence, whereas Pinker is less so, in part for this reason. I agree with Harris that there are risks associated with artificial general intelligence, but I agree with Pinker and the parent comment about the dependence of the scientific process on experiment, and that an inability to conduct accelerated experiments in the physical world may undermine the standard argument about the inevitability of an AI singularity.
An AI capable of interfacing with the physical world might develop the ability to conduct accelerated physical experiments, but it would presumably face the same fundamental and contingent limits as human researchers, and the history of human science suggests those limits may impede exponential progress.
I have been using ChatGPT as a Lithuanian language tutor.
It is surprisingly amazing - accurately (verified with a native speaker) conversing in Lithuanian with phrase level translations, explaining pronunciation, explaining clauses and cases, and is able to explain some grammatical concepts (usage of commas, etc) that I haven't found explained in English elsewhere.
I asked it to translate my resume recently; when it saw the words Python and/or Linux, it stopped translating and started opening a fake terminal and doing weird stuff, so YMMV.
My honest opinion is that very few people are clear on what the thing is actually good at, great at, or bad at at this stage. We're still trying to find its real use case.
IMO it's best for things you already know quite a bit about, so at least you don't mess up hard. In other words, it seems like a more advanced ELIZA. Nearly everyone who has had a good time with it has talked to it until they had a breakthrough.
This is actually a much better version of the techno-babble we see all the time in movies and TV. I assume they get it wrong because the story is more important, and tech people don't want to be associated with the BS that comes out of the actors' mouths.
I have done that, but it tends to actually omit some facts as well. This convo is from after I asked it to stop adding disclaimers as you can see. I take it as just a quirk of a colleague and don't mind it at this point lol.
> I asked it to stop adding disclaimers as you can see.
This is the first example of an extended dialogue with GPT-4 that I have read, and the fact that it failed to obey the request to dispense with its repetitive disclaimers was perhaps the most interesting thing to me about it. It seems somehow more fluent to me than GPT-3, as though its verbal IQ has increased a few points, but GPT-3 was already quite articulate; I haven't yet seen any examples of clear new abilities from GPT-4.
The substance of the dialogue struck me as generic and lacking novel insight, though the bar was of course set rather high (essentially 'Describe revolutionary new physics'). I've also been jaded by the past few years of advances in AI; if I had seen this transcript ten years ago I would have been surprised and impressed that an AI could have a conversation about theoretical physics, and could demonstrate an ability to discuss relevant concepts in a reasonable and confident manner.
The ability of large language models to exhibit sophisticated verbal reasoning, albeit not yet reliably so, is their most striking feature to me, and I do think that has great scientific potential; perhaps GPT-4 isn't yet a major advance in that respect, but I imagine an important foundation has been laid. I should say I'm grateful to you for publishing this Ramraj; the transcript and the impressions you and others have shared in this thread have been illuminating.
Anyone who is curious about the application of AI to theoretical physics may be interested in the work of the MIT physicist Max Tegmark and his group, which is still at an early stage. Here are some videos in which he discusses AI and physics, in increasing order of detail:
I know some physics, but not enough in this area to judge whether the results here are accurate. Given the inaccuracy of AI chatbots, that makes this tool not useful to me for learning new concepts, as they might be blatantly wrong.