From my perspective it’s just _confusing_ to work in AI right now. We have some massive models that are doing some really neat stuff, and apparently hundreds of millions of people are using them—but I keep wondering: to do _what_, exactly? I’m not asking what the models can do, I’m asking what people want the models to do every day, all the time.
I’ve been shown some neat pictures people made that they thought were cool. I don’t know that I need this every day.
I’ve seen examples of “write an email to my boss”. It would take me longer to explain to ChatGPT what I want than to write these myself.
I’ve seen “write a snippet of code” demos. But I hardly care about this compared to designing a good API; or designing software that is testable, extensible, maintainable, and follows reasonable design principles.
In fact, no one in my extended sphere of friends and family has asked me anything about chatGPT, midjourney, or any of these other models. The only people I hear about these models from are other tech people.
I can see that these models are significantly better than anything before, but I can’t see yet the “killer app”. (For comparison, I don’t remember anyone in my orbit predicting search or social networking being killer apps for the internet, but we all expected things like TV and retail sales to move online.)
You’re asking “what’s so big about GUIs? Literally nobody has asked to move a pointer around a screen”.
It’s the use cases these things enable that are important.
Today, I wrote a draft product announcement. Only after I was done did I realize I had written it in a really impersonal third person (“users will be able to”). No big deal, but maybe 10-20 minutes of work to make it energetic and second person (“now you can…”).
30 seconds with chatgpt (“Rewrite with more energy, in the second person, using best practices for announcements”).
Six months ago I would never have asked for that. Today it was glorious and let me move on to focus on more important things.
I mean Google revolutionized search. Apple revolutionized personal computing.
OpenAI revolutionized… rewriting things with slightly different wording?
I’ve seen so many breathless people posting “this would have taken me so long to search” and then I type 3 keywords from their massive prompt they crafted and find it instantly on Google. We’re talking 1000x or more faster. I feel like the same is happening in your comment. How often have I thought “damn I wish I wrote this blog post ever so slightly differently” in my life? Maybe a handful of times? And yes I’m including all generalizations of that question.
But certainly fake girlfriends and summarization will be mid size fields. Image generation has some mid size potential. But these will be spread between many companies.
I really think it has uses no doubt, but is it a revolution? Where? It’s not creative in the valuable sense - media, art, fashion, etc all will adopt it marginally but ultimately it will actually only serve to further the desire for genuine human experience, and cohesive creativity that we see it really falls flat at. It saves some marginal time perhaps if you’re ok sounding like a robot.
Taking into account the downsides it looks like a hype bubble right now to me, and a draw in the long run. There’s just a whole lot of tech people trying to cash in on the hype.
Let me repeat that: You can program GPT in English. ENGLISH!
You're complaining about the first nuclear test bomb being impractical and uninteresting. How will this change the world? That huge monstrosity had to be affixed to the top of a test gantry and took years of effort by a veritable army of the best and brightest to make! No way it could change war, or geopolitics, or anything. No way..
This is the day after Trinity. The bomb has gone off. A lot of physicists are very excited, some are terrified, and the military is salivating. The politicians are confused and scared, and the general public doesn't even know yet.
That doesn't mean the world hasn't changed, forever.
> Let me repeat that: You can program GPT in English. ENGLISH!
How?
Let me repeat that: How?
I had a little script that from time to time parses a list of jobs from a specific board, extracts some categories, and inserts them into an SQLite database, plus a frontend that displays them to me in the way I want.
The board has since changed some things which would mean maybe 2 hours of commitment from me to update the script.
How do I program GPT in English (ENGLISH!) to do that for me? What are the steps involved? I've been using ChatGPT and GPT-4 for a while and I can't imagine what the steps are to make this happen without a lot of back and forth. I can't imagine how to program the infrastructure. I can't imagine how the API endpoint is more than a fancy autocomplete. I need help understanding what it means that I can program it in ENGLISH! (I can also program it in my country's language for what it's worth).
> That doesn't mean the world hasn't changed, forever.
> make this happen without a lot of back and forth
Perhaps this is the part you're missing. When I've watched people program with ChatGPT it _is_ a lot of back and forth because an enormous amount of context is able to be stored and back referenced. I.e. one wouldn't say "make me a Flappy Bird clone for iOS", they'd start with:
"Give me the code for a starter SpriteKit project". Then
"Now draw a sprite from bird.png and place it in the center of the screen".
"Now make it so the bird sprite will fall as if it's affected by gravity"
I won't bore anyone with how one might go from that all the way to a simple game, but I'm sure you see the idea. There are obviously _huge_ limitations to this approach and professionals will hit them fast, but the proof is in the pudding: people who can barely code are producing real software through this approach. It's happening.
> Perhaps this is the part you're missing. When I've watched people program with ChatGPT it _is_ a lot of back and forth because an enormous amount of context is able to be stored and back referenced.
I've tried to build a lot of fun stuff with it so far. Haven't been able to properly 'program it in English' for anything non-trivial. Back and forth ended up in loops of not what I wanted. I'm just utterly confused at the difference in experiences I've had with it vs. what some people are preaching.
> There are obviously _huge_ limitations to this approach and professionals will hit them fast, but the proof is in the pudding: people who can barely code are producing real software through this approach. It's happening.
I've had 4 product people I know try to create products using ChatGPT. All 4 of them basically got stuck on the first steps of whatever they were trying to do. "Where do I have to put this code?", "How do I put it online?", "How do I store user data?", "Where do I get a database from?". Basic questions to any professional, but to them it was impossible to overcome the obstacles from code to deployment.
I don't doubt that it's happening and it will become better in the future; I'm just having a hard time trying to grasp where some people are coming from when my experience as a professional, using it, has been mixed.
I've observed this schism between people who can get LLMs to produce useful output and people who are baffled. I think it's a mixture of two things:
expectations: using the LLM to break problems into steps, suggest alternatives, and help them think through the problem. I think these are the people using it to write emails, myself included; having a loop to dial in the letter allows me to write the letter without the activation energy needed to stare at a blank page
empathy: people who've spent enough time interacting with an LLM get to know how to boss it around. I think some people are able to put themselves in the LLM's shoes and imagine how to steer the attention into a particular semantic subspace where the model has enough context to say something useful.
GPT4 writes boilerplate python and javascript servers for me in one shot because I ask for precisely what I want and tell it what tools to use - I think because I have dialed in my expectation for what it's capable of and I learned how to ask in precise language, I get to be productive with GPT4's code output. Here's a transcript: https://poe.com/lookaroundyou/1512927999932108
Let me give you a simple example. I had to deal with a desynced subtitle file recently. I described the exact nature of the desync (in terms like "at point X1 the offset is Y1, and at X2 it is Y2") to GPT-4 and asked it to write me a Python script to fix this. It did require a couple tweaks to run, but when it did, it "just worked".
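For anyone curious what such a script might look like, here is a minimal sketch, not the script GPT-4 actually produced. It assumes the file is in SRT format, that the drift is linear between the two measured points, and that each offset Y is the number of milliseconds to add to the timestamp at X. The function names are made up for illustration.

```python
import re

# SRT timestamps look like 00:01:02,345
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def to_ms(h, m, s, ms):
    """Convert SRT timestamp fields to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def from_ms(total):
    """Format milliseconds as an SRT timestamp, clamping at zero."""
    total = max(0, round(total))
    s, ms = divmod(total, 1000)
    m, s = divmod(s, 60)
    h, m = divmod(m, 60)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def make_corrector(x1, y1, x2, y2):
    """Return a function that shifts a time t (ms) by an offset interpolated
    linearly through (x1, y1) and (x2, y2). Assumes the drift is linear."""
    slope = (y2 - y1) / (x2 - x1)

    def correct(t):
        return t + y1 + slope * (t - x1)

    return correct


def resync(srt_text, x1, y1, x2, y2):
    """Rewrite every timestamp in an SRT document with its corrected time."""
    correct = make_corrector(x1, y1, x2, y2)
    return TIMESTAMP.sub(
        lambda m: from_ms(correct(to_ms(*m.groups()))), srt_text
    )
```

With no offset at the start and a one second lag at the 100 second mark, `resync(text, 0, 0, 100_000, 1000)` stretches every timestamp by 1%.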
Honestly don't think it will be long before gpt can read this comment, then politely ask you for the urls of the job board and your git repo and 2 seconds later you will have a pull request to review
> Auto-GPT is an experimental open-source application showcasing the capabilities of the GPT-4 language model. This program, driven by GPT-4, autonomously develops and manages businesses to increase net worth. As one of the first examples of GPT-4 running fully autonomously, Auto-GPT pushes the boundaries of what is possible with AI.
I have some scrapers built in Scrapy, and from my experimentation with GPT4, I bet you could paste in your scraper code, the html source from the website in question (at least the relevant part), and tell GPT4 to update your scraper and you'd get something that's at least 95% correct within 30 seconds.
You can't program GPT in anything if you can't program.
If your prompt is garbage then the output will be garbage and if you don't know how to program you won't even realize the output was garbage.
It's not the language part of programming language that is hard. It's the programming part because it means you have to have a good understanding of what you want. Just like a human programmer won't read your mind an AI programmer won't read your mind either.
But I can already foresee bosses dismissing employees that raise issues (performance, maintainability, scalability, etc., etc.) by saying "Look, the AI can do it. So if it can do it you can do it too.". I foresee this because I have already seen it.
> You can't program GPT in anything if you can't program.
That's why it makes this so interesting - this type of automation impacts our jobs directly. Of course, I'm not sure who would use this in a corporate codebase without legal concerns.
> Let me repeat that: You can program GPT in English. ENGLISH!
The very existence of "prompt engineering", numerous discussions about how to prompt ChatGPT in order to get the result you want, etc. imply that while it may be in English, it still requires similar care and attention to do properly as a programming language does.
Which makes me wonder what the advantage of using English is. A formal language seems like it would be more productive and accurate.
For one, GPT-4 requires far less prompt engineering and generally interprets intent better.
The advantage of using English (natural language that is), the humans around you tend to speak it. I don't naturally speak powershell. Instead I want a script that searches for particular filenames, under a particular size, between particular dates, in a directory path I specify. I told GPT I wanted that and in a few seconds it dumped out what I needed. It wrote the script in a formal language, which is then interpreted by the machine in an even more formal manner. Let the code deal with accuracy, and let's let language models argue back and forth with humans on intent.
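To make that request concrete, here is roughly what the "formal language" side can look like. This is a sketch in Python rather than the PowerShell GPT actually wrote, and the function name and parameters are my own assumptions about what was asked for.

```python
from datetime import datetime
from fnmatch import fnmatch
from pathlib import Path


def find_files(root, pattern, max_bytes, start, end):
    """Yield files under root whose name matches pattern, whose size is
    below max_bytes, and whose modification time is within [start, end]."""
    for path in Path(root).rglob("*"):
        if not path.is_file() or not fnmatch(path.name, pattern):
            continue
        info = path.stat()
        modified = datetime.fromtimestamp(info.st_mtime)
        if info.st_size < max_bytes and start <= modified <= end:
            yield path
```

A call such as `find_files("/some/dir", "report_*.txt", 1_000_000, datetime(2023, 1, 1), datetime(2023, 3, 31))` covers the whole English sentence; the path and pattern here are placeholders.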
> The advantage of using English (natural language that is), the humans around you tend to speak it.
This is true, but of limited utility. English is so bad at this sort of thing that even native-speaking humans are constantly misunderstanding each other. Especially when it comes to describing things and giving instructions.
That's why we have more formal languages (even ignoring programming languages) for when we need to speak with precision.
That's the other nice thing about ChatGPT - if you tell it something and it misunderstands, you can correct it by saying, "no, actually, what I meant is ...". Which, again, is how people generally do that kind of thing outside of programming. The advantage is that you're still remaining on a much higher level of abstraction.
As far as formal languages... GPT doesn't know Lojban well, presumably because of its very small presence in the training data (and dearth of material in general). But it would be interesting to see how training on that specifically would turn out.
> Which, again, is how people generally do that kind of thing outside of programming.
Yes, and with people, that's insufficient if you really need confidence of understanding.
There's a reason that lawyers speak legalese, doctors speak medicalese, etc. These are highly structured languages to minimize confusion.
Even in less technical interactions, when you need to be sure that you understand what someone else is saying, you are taught to rephrase what they said and tell it back to them for confirmation. And there's still a large margin of error even then.
This is why, whenever I have an important conversation at work, I always send an email to the person telling them what I understood from our exchange. In part to check if I understood correctly, but also so that I have a record of the exchange to cover my ass if things go sideways because we didn't understand each other, but thought we did.
Not the person you replied to, but I see it the same way. GPT is an English (and other natural language) compiler.
Not in the sense that you get a computer program out (though you can), but in the sense that it can automate anything without even needing a programming language, compiler, and domain specific UX.
Low code and no-code tools still require thinking like a programmer. You define what you need to do, then implement, then get results. GPT often lets you go directly from spec to results.
If the goal is programming, GPT is nothing special. If the goal is quickly reasoning over very abstract instructions, it’s amazing.
The trick is seeing the new use cases. It really does come back to the GUI revolution: if you want to list files in a directory, the CLI is just as good, maybe better. But GUI makes photoshop possible.
GPT makes it possible to say “summarize the status emails I sent over the past year, with one section per quarter and three bullet points per section”. And the magic is that is the programming.
> What does that mean to "program GPT"? Do you mean program (software) USING GPT?
A sibling comment already explained the second part of the question, but there is something I find more exciting. You can program GPT, as in you can tell it to change its behavior. The innumerable "jail break" prompts are just programs written in English which modify GPT itself. Like macros in lisp I guess. The first time I truly saw this potential was when someone showed me you don't actually have to change the temperature of chatGPT in code, you can just tell it to give low and high temperature answers in the prompt[1]. That's programming the model itself in english.
The Ford Nucleon was a 1957 concept car that featured a compact nuclear reactor. Look at how well that prediction aged. It's apt that you mention the Trinity test, since 1950s inflated expectations of the applicability of nuclear everything are exactly where we are now.
Perhaps I could interest you in some Radium Water? It's new and trendy and good for your health.
This is a very simple case that doesn’t reflect the complexity of a real project. Like so many attempts before to produce code, using a little effort, it degrades when the complexity level increases even slightly. Once there are more tables, ones that have names which cannot be easily translated from English, it breaks down quickly. These types of tools work ok for brand new projects, but work on existing projects will prove to be exponentially harder or more difficult than it is worth.
Nonetheless, it could prove useful for looking up algorithms, patterns, and generating boilerplate code. However, an important issue is will it generate similar code if queried at a later time? Not likely, which will make it less useful or result in an inconsistent codebase. Maybe you can request a version of the code generation? In-house code generators will generate consistent code, so it will be interesting to see how it is used in real projects.
Note that in this case it isn't even asked to write specific queries for specific tasks - it's just given one high-level task and the schema to work with.
You're right, though, that the effectiveness of this approach depends very much on schema design and things like descriptive table/column names etc (and even then sometimes you have to make it more explicit than a human would need). You really need to design the schema around the capabilities of the model for best results, which makes it that much harder to integrate with legacy stuff. Similarly, not all representations of data work equivalently well - originally, I gave the model direct access to the typed object graph, and it handles that much worse than SQL. So if your legacy software has a data model that is not easy to map to relational, too bad.
On the other hand, GPT-4 is already vastly better at this kind of task than GPT-3.5, so I think we can't assume that this will remain a limitation with larger models.
> You really need to design the schema around the capabilities of the model for best results, which makes it that much harder to integrate with legacy stuff.
This may end up being a feature of some high level frameworks … “compatible with ChatGPT” or “designed to work with xxx LLM”.
> and then I type 3 keywords from their massive prompt they crafted and find it instantly on Google.
Seems I and you have different Googles and you still have the one I had pre 2010.
For over a decade now, Google has been including things I never asked about to the point where it would sometimes be easier to find it using Marginalia.
Some say it is just because internet has changed and there is less ham and more spam, but the last few months I have been using Kagi and it proves it is possible to create a better search experience.
And, if Google works for you, fine. Maybe you search other topics, use other keywords or are in another bucket wrt experiments, but from my perspective Google is now the same as its predecessors.
I actually agree Google has gone downhill. Yet for the 8 or so examples I’ve tested where I saw hyped GPT results, every single one google answers, usually in the top snippet, always in the first result.
For politics, shopping, and some other topics it can be terrible, but I don’t think GPT is good at those either.
I’m actually happy to be proven wrong here. If you have some examples let’s test it out. If it’s a true step function improvement I’d expect it to be easy to source examples.
I think this is a classic case of us overestimating the immediate impact and underestimating the long term impact.
Right now, they are definitely useful time savers, but they need a lot of handholding. Eventually, someone will figure out how to get hundreds of LLMs supervising teams of millions of LLMs to do some really wild stuff that is currently completely impossible.
You could spin up a giant staff the way we do servers now. There has to be a world changing application of that.
Yes, that's called 'ensembling'. There is a lot of work being done on this kind of solution. One way in which it could work is that you can use multiple models that have been fine tuned for various problems and then use the answer that returns the highest confidence.
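A toy sketch of that selection step, under heavy assumptions: each "model" here is just a Python callable returning an answer and a self-reported confidence, and the two stand-in models are hypothetical. Real confidence scores from separately tuned models are not necessarily comparable, so treat this as the shape of the idea only.

```python
def pick_most_confident(models, question):
    """Ask every model and return (name, answer, confidence) for the
    answer with the highest reported confidence.

    `models` maps a name to a callable returning (answer, confidence).
    """
    results = []
    for name, model in models.items():
        answer, confidence = model(question)
        results.append((name, answer, confidence))
    return max(results, key=lambda r: r[2])


# Hypothetical stand-ins for models fine-tuned on different problems.
def math_model(question):
    confident = any(ch.isdigit() for ch in question)
    return ("4", 0.95 if confident else 0.10)


def general_model(question):
    return ("It depends.", 0.50)
```

Calling `pick_most_confident({"math": math_model, "general": general_model}, "What is 2 + 2?")` routes the question to whichever model claims to be surest.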
> We just need to apply what we learned in business school
Please don't. You've already ruined enough industries. Let the MBAs do finance and Wall Street and leave them out of the chain of command in organizations that make things.
Every time you go to the store and find that the store is still in business and there is food on the shelf, it is because someone went to business school and knows how to optimize demand estimation, pricing, and logistics.
Yes, some MBAs fuck things up. Just like some CS grads fuck things up. But advocating against the study of business is just as naive as advocating against the study of computer science just because there are some bad CS grads.
> Every time you go to the store and find that the store is still in business and there is food on the shelf, it is because someone went to business school
Are you contending that businesses were not successful before Wharton started pumping out MBAs?
> But advocating against the study of business is just as naive as advocating against the study of computer science
I didn't say 'don't study business', I said 'stick to finance'. MBAs tend to end up destroying innovation and productivity for short term growth and stats.
Jack Welch showed what a successfully motivated 'business oriented' leader can do to an innovative and productive legacy organization when given complete control over it. The MBAs happen to just do it on a smaller scale.
> advocating against the study of business is just as naive as advocating against the study of computer science just because there are some bad CS grads.
Criticizing garbage MBA programs is not criticizing the study of business. Business schools don't study business. They're a place where people make a lot of money selling theories about business that are useless at best and, in many places, quite harmful. Learning about business by going to business school is like learning to kiss by reading books about kissing.
I would say that just as every person is unique so is every company unique. And just as there is plenty of pseudoscience plaguing psychology so are MBAs full of pseudoscience. Two fields that are far too obsessed with generalising their advice. Which is not to say that there aren't any useful ideas in these fields. But the vitriolic reaction above is warranted.
>Eventually, someone will figure out how to get hundreds of LLMs supervising teams of millions of LLMs to do some really wild stuff that is currently completely impossible.
This is an intuitive direction. In fact, it’s so intuitive that it’s a little bit odd that nobody seems to have made proper progress with LLM swarm computation.
This sounds like that old economics joke that says it's impossible to find $20 on the ground, because if it had been there, someone would have already picked it up.
Context window is a limitation, but have we actually hit the ceiling wrt scaling that? For GPT, you need O(N^2) VRAM to handle larger context sizes, but that is a "I need more hardware" problem ultimately; as I understand, the reason why they don't go higher is because of economic viability of it, not because it couldn't be done in principle. And there are many interesting hardware developments in the pipeline now that the engineers know exactly what kind of compute they can narrowly optimize for.
So, perhaps, there aren't swarms yet just because there are easier ways to scale for now?
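To put rough numbers on the O(N^2) point, here is a back-of-the-envelope sketch. It assumes naive attention that materializes a full N x N score matrix per head; the default head count and the 2 bytes per value are illustrative assumptions, not GPT's actual configuration.

```python
def attention_scores_bytes(context_len, num_heads=1, bytes_per_value=2):
    """Bytes needed to hold one layer's attention score matrices, assuming
    naive attention that stores an N x N matrix per head."""
    return context_len * context_len * num_heads * bytes_per_value


for n in (4096, 8192, 16384, 32768):
    mib = attention_scores_bytes(n) / 2**20
    print(f"context {n:>6}: {mib:>8.0f} MiB per head per layer")
```

Every doubling of the context length quadruples this term, which is why it reads as a hardware and cost problem rather than a hard ceiling.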
Rather large parts of your brain are more generalized, but in particular places we have more specialized areas. Now, you looking at it would consider it all the same brain most likely, but if you're looking at it in systems thinking view, it's a small separate brain with a slightly different task than the rest of the brain.
If 80% of the processors in a cluster are running 'general LLM' and 20% are running 'math LLM' are they the same cluster? Could you host the cluster in a different data center? What if you want to test different math LLM modules out with the general intelligence?
I think I would consider them split when the different modules are interchangeable so there is de facto an interface.
In the case of the brain, while certain functional regions are highly specialized I would not consider them "a small separate brain". Functional regions are not sub-organs.
It’s a hype bubble for hundreds of years and saying that doesn’t make chatgpt worth any less. I have definitely been surprised by this and gotta say I’m expecting AGI a lot faster now. Even if literally all it did was predict what the average internet user would write in a certain context, that’s huge, cuz when you integrate all the little advantages of all the weird things one person knows another doesn’t, the collective knowledge is worth more than the sum of the parts. A tool which can tap into the sum total of human knowledge 24/7 and more rapidly than I can propose more questions for it, mainly I’m just excited to play with larger context size models so I can include more code and get big picture ideas about groups of stuff that are too numerous for my feeble meat brain to reason about. 7-9 things in working memory has always been the thing that would make humans inferior to AI in the long run. Even if it’s not that insanely smart (but realize: intelligence is a probabilistic concept and computers are great at multiplying probabilities precisely) if the thing can fit more stuff in memory than us and type faster than us and it doesn’t get tired or overwhelmed and give up (imagine your capability in a world where you had no tiredness and unlimited self discipline) in time it’s inevitable the transformers put us all to shame, and the more complicated the topic, the bigger of a shaming it’ll be, since the more complicated topics have exponentially more relations to reason about. Who’s gonna trust a human doctor to diagnose their stuff if the human brain holds 9 things and the AI holds thousands?
"Who’s gonna trust a human doctor to diagnose their stuff if the human brain holds 9 things and the AI holds thousands?"
The human brain can hold much more than 9 things and even though AI will be used in medicine broadly very soon, I really want the final diagnosis done by a human.
Once true AGI arrives, I might change my opinion, but that might take a while.
9 things is considered a standard for working memory (kind of like processor registers), for people with ADHD it's even less - 3-5.
Try writing a number from one piece of paper to another. If it's more than 7-9 numbers, you won't do it in one shot, unless you spend extra time memorizing it.
That can be increased quite a bit with practice. But it's also not important. It's just the cache memory -- it isn't the limit of what can be learned and recalled.
It is a limit on what you can reason about without a piece of paper.
I’m proficient at math, but my working memory is around 6, so I cannot add two three digit numbers to each other in my head (unless I see numbers to be added in front of me).
Revolutions do happen but not the way we expect. My anecdotal experience: no one in my team of about 30 people developing SW uses ChatGPT or similar in their day to day. This may change, or not.
AI is being used in medicine already. For example, in diagnostics. Most new diagnostics devices (e.g. CT scan, cardiograms) include AI systems that suggest an interpretation and point towards possible problems that a doctor might occasionally miss.
Granted, currently deployed systems are mostly awful, way behind the state of the art, and therefore mostly useless. Maybe it's because designing medical devices and getting them approved takes so long. Maybe it's because the manufacturers put AI in there for marketing purposes only, while assuming nobody will use the suggestions anyway. In any case, I strongly expect the trend to continue and these systems to become very useful quite soon.
> Who’s gonna trust a human doctor to diagnose their stuff if the human brain holds 9 things and the AI holds thousands?
I will. As another commenter says, the brain isn't limited to 9 things at all. There's no way that I'll trust the diagnosis of a machine that won't understand me.
If a doctor uses AI to help with research, that would be OK. Just so long as the doctor is actually the one doing the diagnosis and prescribing the treatment.
The difference between your search query and theirs is clearly the level of expertise. Chatgpt has a great use case when you get started on a new subject; even with a very clunky description it can point you to the core concepts of any field. Instead of reading 10 papers which are somewhat related but not what you are looking for, you can spend 3 minutes writing clumsy prompts and that's about it :)
Exactly. Does anybody remember, in the early internet days, people laughing at their parents for googling things like "please help with my back pain my doctor sucks"?
To add to this, using ChatGPT feels great in the moment, because it seems to work so well. For example, asking it for an itinerary while traveling gives you something that looks great.
However, once you actually start using it and see that the "ten minute walk" is actually an hour walk, or that a full third of the attractions it has shepherded you to are permanently closed, you realize that building that itinerary yourself from scratch using Google or TripAdvisor would take you less time than manually double checking everything ChatGPT says.
It's also quite surprising that people still think ChatGPT is capable of logic. Even for a complete layperson, all it takes is asking it to draw someone's family tree as an ASCII chart to see that text prediction only goes so far and there's not enough of a relational concept in there to comprise knowledge. There are many examples of asking it to solve famous puzzles with minor variations where it fails spectacularly.
The marketing behind ChatGPT is genius, but there is only so far you can go before the honeymoon is over and people start to really question what you brought to the table. Aside from that, ChatGPT isn't unique in what it can do, and others (including open source) are catching up fast.
That being said, I'd still use it for something like language learning (and other types of learning), where follow up queries (such as why you'd use one word instead of another, or how to rephrase something to be more polite) unlock a significant amount of value. It can also be useful to write trivial code, though I doubt a serious professional would do this (for several reasons, such as privacy and liability). Ultimately, ChatGPT fits squarely under "tool" and not under "intelligence".
It seems that as of right now, the killer app of ChatGPT is the boost in views you get by putting it in the title of your YouTube video.
As for googling, here are some examples of queries you can try and see how it works:
- summary of all the carbon neutral concrete methods, especially ones that can be done in a small industrial workshop as a prototype
- I have allergies in Thailand, mid-february. What may it be related to?
- list all the companies from Japanese stock exchange that have high debt rate
Those are top of my head, but really anything that is either a super-specific niche, or requires merging a few niches together, Google won't help you with.
How do you deal with it straight up lying? My problem with this whole system is, if I’m asking those questions it’s because I don’t understand the field well enough to answer it myself, which means I can’t pick up on if ChatGPT is lying…
Fair, but not completely true. The Thailand example gives detailed reasoning. You can use those building blocks to check. If it says Thailand is a cold country and uses that in its argumentation, it's shaky. You don't have to be an expert climatologist to make this judgement.
It's not just one clean answer and we're done. In my experience it is helpful in breaking the problem down into stuff you can Google.
> In my experience it is helpful in breaking the problem down into stuff you can Google.
Yeah I can see that being useful. I’ve also seen a lot of non-technical people straight up accept whatever comes out of it, so that’s a little worrying. It’s true of Google searches too, of course, but at least a google search gives N results someone can check rather than 1.
With the example questions I provided, it would take many hours to do research on the subject. GPT provided initial answers instantly, and then fact checking was easy.
That’s what we did with gpt-3. With plugins you can have gpt fact-check itself.
Also, if you have a system for dedicated knowledge, you can use embeddings - with embeddings gpt has very little room for hallucinations, and it can provide detailed references.
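A minimal sketch of the retrieval step behind the embeddings approach, assuming the passages have already been turned into vectors by some embedding model. The toy three-dimensional vectors, the `top_k` helper, and the passage ids are all made up for illustration; a real system would use the model's own vectors and probably a vector store.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, corpus, k=2):
    """Return the ids of the k passages most similar to the query.

    corpus maps a passage id to its precomputed embedding vector.
    """
    ranked = sorted(corpus, key=lambda pid: cosine(query_vec, corpus[pid]), reverse=True)
    return ranked[:k]

# Toy vectors standing in for real embeddings (assumption, not model output).
corpus = {
    "handbook#leave-policy": [0.9, 0.1, 0.0],
    "handbook#expenses": [0.1, 0.9, 0.1],
    "handbook#security": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]
hits = top_k(query, corpus, k=2)
```

The retrieved passages and their ids would then be pasted into the prompt, which is what lets the answer point back at its sources.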
> Taking into account the downsides it looks like a hype bubble right now to me, and a draw in the long run. There’s just a whole lot of tech people trying to cash in on the hype.
Techies will realize that they are just giving ideas to O̶p̶e̶n̶AI.com, Microsoft Word, Google Docs and Notion. It is just the same AI bros re-selling their hallucinating snake oil chatbot that are under a new narrative for AI.
There is a reason why the only safe serious use-case of LLMs is summarization of existing text, since everything else it does is untrustworthy and is complete bullshit.
I do wonder where LLM translators would take us, considering that the Japanese version of Bing Image Creator[0] is still proudly displaying complete nonsense…
Every business is fine with some frequency of bullshit output at some level. The question is how often exactly it happens and how much harm the bullshit can cause.
My point was that spam is the perfect use case for this tech.
Of course there are other possible use cases, but spam and fake news content creation are the perfect fit. AI will enable one to easily clone the writing style of any publication and insert whatever bullshit content and keep up with the publishing cycle with almost zero workforce.
Want a flat-earther version of New York Times (The New York Flat Times)? Done. Want a just slightly insidiously fascist version of NPR? Done. Want a pro-Nato version of RussiaToday (WestRussiaToday)? Done.
And we already know people share stuff without checking for veracity and reliability first.
Notion going all-in on the "AI" stuff is annoying/concerning to me. Mostly just that I live and die by a personal Notion wiki to keep my life organized, and if they eventually tank their service by investing too many resources into features that don't take off and I have to find a new tool to offload my brain into, I'm gonna be pissed...
I went with Logseq and for the first time in a number of years (actually since OneNote 2016, the last self hosted version) I am actually happy with my tooling again.
It doesn't cover everything OneNote 2016 did, but it does a lot more in other areas and it is progressing nicely.
The app is already usable, at least on iOS, but for now sync is a bit rough around the edges, i.e., I need to verify it is synced or it will overwrite and I have to fix it using the page history which thankfully exists.
That is just the introduction, showcasing what level of sophistication you get with just Google and Wikipedia as tools
Now imagine TaskRabbit or Fiverr as tools. AI can make things happen in the real world.
These LLMs have limited attention but infinite focus. You can parallelize them, you can have one direct a fleet of other LLMs, and you can have LLMs checking the other models' inputs and outputs for correctness and feeding that information back to the controlling model so that it can improve its prompts to the others as it tries to reach its goal.
And the goal can be far-fetched (manufacture fake artsy trinkets and import them from China to distribute on Etsy) or nefarious (produce subtle propaganda across a multitude of WordPress websites, register accounts on Wikipedia and Reddit, create a sophisticated network of citations).
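The controller-and-checker arrangement described here could be sketched roughly as follows. `worker` and `checker` are hypothetical callables standing in for separate model calls, and the stubs below exist only so the loop runs without any API.

```python
def run_with_checker(task, worker, checker, max_rounds=3):
    """Ask `worker` for an answer, let `checker` critique it, and retry.

    worker(prompt) -> str and checker(task, answer) -> (ok, feedback) are
    hypothetical stand-ins for separate model calls.
    """
    prompt = task
    answer = None
    for _ in range(max_rounds):
        answer = worker(prompt)
        ok, feedback = checker(task, answer)
        if ok:
            return answer, True
        # Fold the checker's feedback into the next prompt for the worker.
        prompt = f"{task}\n\nPrevious attempt: {answer}\nReviewer feedback: {feedback}"
    return answer, False

# Stub "models" for illustration only.
def stub_worker(prompt):
    return "4" if "feedback" in prompt else "5"

def stub_checker(task, answer):
    return (answer == "4", "2 + 2 is not 5; recompute.")

result = run_with_checker("What is 2 + 2?", stub_worker, stub_checker)
```

Capping the number of rounds matters in practice: without it, a worker and checker that never agree would loop forever.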
> OpenAI revolutionized… rewriting things with slightly different wording?
Yes, if you try hard enough, you can try to cast transformational shifts as trifling.
- e.g. “Bardeen, Shockley, and Brattain made a smaller version of the vacuum tube.” (transistors)
- “Scientists discovered that light could carry information, like electrical wires do.” (fiber-optics)
The effects (including the harder to measure cultural shifts) matter more than some uncharitable characterization.
Also, the “it is not X” thinking is the result of present fixation. Such argumentation is, at best, quite narrow. Perhaps applicable in specific defined markets and situations but hardly a good mindset for making sense of how the world is changing. Hence the cliché, “The Stone Age didn’t end because we ran out of stone.”
The psychological undertones in the comment above are probably “people, stop exaggerating”. From one overreaction to another, it seems.
The popularity of ChatGPT revolutionized time. For learning, for many kinds of busywork (it's redefining what is and isn't "busywork"), for planning. And most important: we don't know what we have yet because it's still being built. It's a tool. It's not the "capabilities", it's what people get out of it.
You mention blogging from the standpoint of writing it all yourself, and then using a tool to tweak it. That's not the revolutionary part. It's collaborating with the tool to write the post.
You act like writing itself doesn't take time and energy. It has sped up my grant writing sixfold. Any long-ish form writing that I need to do now happens at warp speed.
If we focus only on this part: this function means most content creators won't need to convert or rework their content by hand to fit another group of customers.
That's very cool, and right now it's a good idea, but I strongly suspect GPT only looks clever and does a good job in isolation. If everyone starts using it product announcements will start looking very similar, and they'll lose a lot of their impact.
This is definitely the case with cover letters for job applications. The ones written by GPT appear to be pretty obvious - my guesses could be wrong, but after seeing most applications not having a cover letter for years to most applications having one over the past few months, I suspect GPT is involved, and there's a distinct 'style' that seems to be showing up.
Using GPT could be the 'bootstrap ui' of product announcements. It looks great on its own, but put it next to a bunch of other companies and they all fail to stand out.
Question for you-given two identical candidates: One who does not submit a cover letter and one who submits a cover letter that was clearly written by chatgpt—which candidate would you rather interview?
On one hand, the duplicity involved in writing a ChatGPT cover letter seems concerning in a new hire. On the other hand, it shows resourcefulness and going above and beyond.
I’m tempted to say I’d prefer the gpt cover letter candidate, simply to talk to them about how they got the idea and how they executed, but I’m curious if you feel the same way.
Right now I'd also have a preference for the GPT candidate because it shows a bit of an interest in new tech. In a year when 50% of candidates submit GPT cover letters I think I'll see it as a sign of laziness or trying to hide poor comms skills. Maybe not though. Time will tell.
I'd probably roundfile the chatGPT one. A cover letter needs to be a personal communication from the candidate, not something machine-generated. The writing style is an important signal.
However, I would have zero issue if they used chatgpt to help compose their CV.
I find it interesting it’s very hard to make it do typos. The type of errors humans make. It can do typos but even those seem weird. If I push it to make errors it either over- or undershoots.
The “correctness” of it is a definite giveaway IMO.
In fairness, lack of typos in a cover letter has traditionally been interpreted as a sign of diligence!
For more general queries its "house style" tends to be really obvious, with all its "however, it's important to note" and "ultimately it depends on"s and the tendency to flesh out a one sentence answer to the specific question with two paragraphs or five bullet points of detail at a slight tangent to it...
It can imitate styles of little kids with their phonetic spelling. (Though this is so common in English, French etc but not Italian or Turkish where the spelling is very regular)
I am sure it can at least do very common mistakes like it's vs its or "would of" if prompted right since there's a huge body of that kind of work. Or maybe a human needs to add the finishing touches to make it look more human. :)
That distinct ChatGPT style is mostly the product of their RLHF, so it's what you get by default if you don't ask for something more specific. But it's fairly easy to tweak the prompt to make it use whatever style you want, including more terse, less apologetic etc. Don't forget that "write about X in the style of Y" was one of the first things that GPT models could reliably do, long before chat.
It seems like the use of ChatGPT is something like "microtasks". Little things a given person could do but would rather not and so is able to delegate to an automatic thing whose output they can verify.
It seems like its potential as of today is increasing, or seeming to increase, the productivity of a segment of white collar workers in the fashion that email and the web did (or might not have). A lot of researchers might not have a need for this and so not understand the appeal of it.
I, too, have been experimenting a bit to see if/how an LLM might help me. This microtask framing jibes with my experience.
One of the best examples so far for me (and it's truly micro) was at grocery store. Friend trying to figure out how big of a rice bag to get and avoid not finishing it before a long trip coming up. She knew she ate a couple of cups dry a week.
"I eat 2 cups dry rice per week. Can I finish a 25lb bag in less than 4 months?" "Yes" (it did show its work).
One shot, perfect response. I know this kind of computational thing is what WolframAlpha was for, but that wasn't nearly as reliable. I know I could figure it out myself, but I'd need to find a reasonable figure for the density of rice and probably do some imperial metric conversions and generally futz around for longer than one would want to stand in front of a pallet of rice bags.
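The arithmetic involved can be sketched in a few lines. The density figure (about 185 g of dry rice per US cup) is an assumption, not something from the exchange above, and the result scales directly with it. With that figure the bag lasts roughly 30 weeks, which is longer than four months, so the work the model showed is worth a second look.

```python
GRAMS_PER_POUND = 453.592
GRAMS_PER_CUP_DRY_RICE = 185.0   # assumed figure for uncooked white rice
WEEKS_PER_MONTH = 52 / 12

def weeks_to_finish(bag_lb, cups_per_week, grams_per_cup=GRAMS_PER_CUP_DRY_RICE):
    """Weeks needed to eat a bag of dry rice at a steady weekly rate."""
    cups_in_bag = bag_lb * GRAMS_PER_POUND / grams_per_cup
    return cups_in_bag / cups_per_week

weeks = weeks_to_finish(bag_lb=25, cups_per_week=2)
months = weeks / WEEKS_PER_MONTH
print(f"{weeks:.1f} weeks, about {months:.1f} months")
```

Passing a different `grams_per_cup` shows how sensitive the answer is to the density you pick.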
If we're characterizing LLMs as a kind of input interface I'd point out that the first GUI was released in 1973 by Xerox and the first commercially successful GUI was released in 1984 with the Mac 128K. It took 11 years for someone to answer this question. Sure things move faster these days, but we're still only a couple of months in.
With blockchains there was also a fundamental technological breakthrough (I'd argue less revolutionary than LLMs). The problem was that everyone jumped the gun and proclaimed the killer app had been discovered too soon: cryptocurrency's incarnations to date have yet to demonstrate much utility apart from being a vehicle for speculation. Nakamoto invented the first distributed blockchain in 2008...
Anecdotally, I've had more writers (screenwriters and copywriters) tell me they're using ChatGPT than programmers. I think people here might underestimate how big a deal it is in "the real world".
Most of the economy involves moving around physical things. Construction, transport, nursing and related health occupations (physiotherapy, home aid, etc.), retail and wholesale logistics. Manufacturing employs another chunk. Services for agriculture, fishing, mining also employ more people than are directly employed in those industries and are mostly to do with machines and equipment. Utilities.
Most of the rest is high-touch. People want interactions with humans for important stuff, not with AI. Remote teaching was an unmitigated disaster for most school students: how will AI teachers do, do you think? Attempts at robot police and security guards haven't gone down very well to date. It'll be a while before there are AI EMTs and firemen.
I guess if you don’t like writing, this is good for you. However, I like writing. Moreover, many people are lousy at it (not that I couldn’t improve). I’m not sure I want something trained on lots of mediocre writing doing editing for me.
LLMs are trained on language. Not mediocre language. This is why models can be fine-tuned in one language and then see the benefits in other languages. How much longer will this fundamental misunderstanding of these models continue, and how often will it be put forth by people who are worried about task X that they enjoy being replaced?
I'm going to guess (assume) you probably haven't worked in a 'real' business. A place where elbow grease still does the majority of the work and where Windows 8 was only just phased out.
The killer app (to me) in case of GPT is GPT itself, not ChatGPT. ChatGPT just allows me to easily test use cases for GPT. There are many interesting use cases for those elbow grease businesses for GPT. For example:
Data entry. There's still a lot of data entry being done from unstructured text. Where specifics like names and addresses need to be extracted from letters and emails and contracts and other stuff. I've worked on these challenges before using different strategies and GPT blows my mind with what it can do just by asking to grab this data and format it as JSON. Is it 100% correct? Nope. You still need people to review the data (depending on the use case), but that already saves tons of work.
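A rough sketch of the "ask for JSON, then review" flow described here. No model API is called: a canned string stands in for the model's reply, and the field names and helper functions are illustrative assumptions rather than anything from the setup described above.

```python
import json

REQUIRED_FIELDS = ("name", "address")  # illustrative; pick what your forms need

def build_prompt(letter_text):
    """Prompt asking the model to return only a JSON object."""
    return (
        "Extract the sender's name and postal address from the letter below. "
        'Reply with JSON only, using the keys "name" and "address".\n\n'
        + letter_text
    )

def parse_reply(reply):
    """Parse the model's reply; flag anything incomplete for human review."""
    try:
        record = json.loads(reply)
    except json.JSONDecodeError:
        return {"record": None, "needs_review": True}
    if not isinstance(record, dict):
        return {"record": None, "needs_review": True}
    missing = [f for f in REQUIRED_FIELDS if not record.get(f)]
    return {"record": record, "needs_review": bool(missing)}

# Canned replies stand in for the model call (no API is used here).
good = parse_reply('{"name": "J. Smith", "address": "1 Main St, Springfield"}')
bad = parse_reply("Sure! The name is J. Smith.")
```

Replies that fail to parse or lack a field are routed to a person, which matches the point that review is still needed but covers far fewer records.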
Categorization. Some companies still get tons of emails that need to be forwarded to specific departments. This is another thing that GPT does surprisingly well out of the box.
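For the routing case, one cautious pattern is to accept the model's label only if it matches a known queue and send everything else to a person. The department names and the `route` helper below are hypothetical examples.

```python
DEPARTMENTS = {"billing", "support", "sales"}  # example labels, not from the source
FALLBACK = "manual-triage"

def route(model_label):
    """Map a model's free-text label to a known department queue.

    Anything outside the allowed set goes to a human instead of being guessed.
    """
    label = model_label.strip().lower().rstrip(".")
    return label if label in DEPARTMENTS else FALLBACK

routes = [route(x) for x in ["Billing", " support.\n", "Legal"]]
```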
And that's just GPT. There are many other legacy business processes that can be automated (partially) by other models that are coming out right now. Even just 'segment anything' that Meta just released is incredibly useful for many use cases I've seen in my daily work.
Killer apps are always a combination of a technology to solve a real world problem. If you don't venture into the real world and only stay part of the tech world, seeing the killer app is very difficult and ends up leading to Juicero-like products.
Right, questioning the value of a chatbot is a good indication someone hasn't had a real job. Totally legit characterization that doesn't make the whole thing sound like a confidence game.
Ironic given your and others' emphasis on my choosing "chatbot" to summarise LLMs as a reason to dismiss my comment, along with the rest of the pedantry. The upstream post dismissed / insulted the person for questioning value, which was what I called out.
If you really had wanted to get into the "HN rules" game, you could at least have cited "don't be snarky"
That wasn't even remotely my goal and I'm disappointed that my choice of words made it seem like it was. I purposefully added quotes to the word 'real' in my comment since any business is a real business and made it clear it was an assumption, not a fact.
It's just that many tech workers often haven't worked outside of tech and therefore are blind to issues outside of the tech world, like manual data entry, because they assume that must all be automated. It's exactly the same the other way around, people in what I called 'real' businesses are blind to what tech can do to improve business processes because they have no clue about what's available and possible.
They weren't questioning the value of the chat bot, they were questioning the value of the models based on only the high profile use cases such as a now (in)famous chat bot. That, to me, showed that they have a narrow view of the problems that this stuff could solve when applied outside of those high profile use cases. So I made an (explicit) assumption that they mostly worked in tech and not outside of tech, which limits the view of the world outside of tech.
I don't think donkeyd was trying to insult the person he was replying to, but he definitely could have worded that better. I think he's using "real job" in a derogatory sense directed at businesses, how most of them are so inefficient and rote that they'd stand to gain from the roteness of something like GPT, and that the person he's replying to, perhaps fortunately, hasn't experienced that type of business.
Data entry is a really interesting one to me. We’ve been replacing a legacy (read: > 1 MLoC with no tests) system on and off for a few years. The original system had a ton of double entry or manual data entry and the human error rate is noticeable. If GPT could have automated this with a similar or reduced error rate then we would have considered it a win.
The real win long term remains killing off manual entry any time it’s possible, but GPT offers a nice patch.
> There's still a lot of data entry being done from unstructured text. Where specifics like names and addresses need to be extracted from letters and emails and contracts and other stuff. I've worked on these challenges before using different strategies and GPT blows my mind with what it can do just by asking to grab this data and format it as JSON.
Is this data confidential or something you are willing to send to anyone? If the former, you probably shouldn't be sending it to an AI company that retains the data for its own purposes.
This is very confidential data, which is why the current implementation is run on-premise and of course I'm definitely not using production data to test GPT capabilities.
Well, in the organization I was doing this, they had similar issues before I started, with external consultants doing stupid stuff. Luckily, my team wasn't as stupid as these people and integrity was also very high. Which was very, very necessary considering the data we were working with.
Latvian is a language, that's spoken by less than 2 million people. It's irrelevant outside of Latvia.
All of our government funded researchers who worked on natural language processing can now throw their work in the trash and resign. ChatGPT is leagues better than anything that they've done. And OpenAI weren't even trying.
Windows wasn't even localised in Latvian properly until very recently. Google translate still spits out ridiculous translations. (Though it's better than before). Most software isn't even available in Latvian. Almost no video games are in Latvian. Only the most popular books and movies are being translated. Interested in something less popular - you better learn other languages.
And now ChatGPT comes out and I can ask it to write C++ functions in Latvian. I don't need to learn English to be a programmer any more. Nothing like that has been done before. It translates stuff way better than google. And it will only get better.
Imagine that there's a book that only 50 people from Latvia are interested in. Human translators aren't going to bother. But ChatGPT can do it easily.
Real researchers build on advances not quit because of them. I'm sure GPT is not as optimised as it could be to process non-English text. There's clearly a lot of work to do. At least this is true for south asian languages and I'm sure is true even for popular Western languages like French or German
Quality machine translation for less common languages would be a service worth paying for. Any links to examples and evaluation of ChatGPT capabilities?
It's a hype bubble. There's a small group of die-hards in the tech-adjacent VC community that talk it up to no end and attack anyone who disagrees (I just saw a guy ask you if you were "virtue signaling" with your post - lol). And there's lots of people exploring how it could potentially be commercially useful - just like with blockchain. As everyone points out, it's got more going for it than blockchain did because it does something more tangible.
But the jury's still out on the long-run commercial potential of pretty good autocomplete or chat-as-search. It's probably more than zero (like blockchain), but it won't "change everything".
I'd guess that 90% of people either agree with your observation or don't notice / care. It's just that these things bring out the vocal defenders, usually a sign people know deep down it's a bit of BS and post out of insecurity.
I've got access to GPT-4 at work. There have been many times that I'll encounter a bug while programming, I'll paste my code to GPT, tell it what my code is currently doing, and tell it what I actually want the code to do. It then explains what I'm doing wrong and what I can do to fix it. I've had success with this over 90% of the time. This saves me a significant amount of time that would have otherwise been spent hunting down solutions on Google, Stack Overflow, GitHub issues, etc.
I don't know what else to say other than that I would not willingly go back to life without GPT. The value speaks for itself to me.
How simple are your bugs? The bugs I usually have to fix at work involve edge cases around complex user interactions that are less programming bugs and more "well we didn't really think about this particular user interaction when developing this feature" - things that are usually simple 1 liner fixes but can take hours to figure out based on back and forth with product to figure out if it's even a bug and tracing where exactly the data/interaction in the code comes from.
Until I can paste in my entire codebase and the entire history of the product development process into GPT I don't see how it can help.
The bugs that it easily fixes, are generally the bugs whose errors i can copy/paste into google and find an immediate answer on stackoverflow
How is this possible? What do you guys do at work? I haven't had success with either GPT-4 (did you build your own API calling tool for it? Do you just paste it in their Playground?) or GitHub Copilot in actually delivering anywhere close to 90% of the time. It usually misses a whole lot of context.
It feels like it would work for perfectly encapsulated small single purpose functions, which of course sounds great but in reality not many projects are structured like this.
I've used it to try and generate some rather small components in React / TypeScript myself, and what it did to arrays of refs with hook calls inside useState hooks initialization function, and the fact that I couldn't get it to fix its issues by doing what people suggest ("just copy paste the error"), or by trying to reason with it, made me not trust it so much. The output code is also pretty low quality in my experience and opinion.
I think you're wrong. I never bought into the blockchain hype, and blockchain added literally nothing to my life other than some speculation right at the beginning. I would pay $200 a month for ChatGPT, today, without thinking twice about it. That alone makes me think it's a much, much bigger deal than you think it is.
Why is this downvoted? It is a good point. My flatmate has a PhD in comp sci, does AI/ML research. He is now paying for ChatGPT "pro" because he is non-native in English and needs a bit of help to improve when writing his papers. He said ChatGPT is so good that he is learning a lot about English by just reading the corrections/improvements. And yes, we have talked/agreed about the scam/sham of blockchain/defi.
it does something more superficially apparent to naive people. decentralization is extremely more important. it will be one of the main ingredients of direct democracy.
I thought about learning Go and Fiber to build the backend of a side project of mine and I did but as it goes for any new language / stack I wasn't feeling confident in it. Then ChatGPT came out and I thought what the hell, let's see what all the fuss is about.
So I asked it to write me a struct for a table with the "id, name, longitude, latitude, news" columns. That worked well, I was surprised it automatically inferred the data types for said columns.
Then I asked it to write endpoints for retrieving a record from that table and it did so perfectly which again I was surprised by. Asked it to add endpoints for adding records and retrieving all records. Again, no bugs, perfect code. At the end I asked it to create a python script to test the API and it did so flawlessly.
Next day I created a docker env with postgres and went to test the code but it didn't work, turns out it wrote it with mysql in mind so went back and told it to rewrite the entire code with postgres in mind and again it did so flawlessly so overall writing this small API endpoint took maybe 30-60 min.
Considering I was a total newbie at Go this probably would have taken me several hours to complete successfully and this code is basically just boilerplate. I don't care to learn it by heart so I can be more productive in the future. Now that I have ChatGPT I basically don't have to, I don't have to write python to speed up my dev time, I can just have ChatGPT write the basic stuff in a highly performant language. It removed the only drawback which was more boilerplate.
How do you know? Either you know enough Go to be able to tell, and thus you'd be able to write this yourself, or you don't and thus you can't really judge.
I mean, it probably is right, but this "I don't know something, but I trust what the chatbot told me" is what worries me about the rise of the LLMs.
Reading and validating code is way easier than figuring out how to write it in the first place. A lot of programming is the same - functions, conditionals, loops, libraries, etc.. The hard part is mostly semantics across languages. Way easier for GPT to do the first pass, then go through it and read up on the parts you don't understand.
It isn't in my experience. Understanding code that you did not write for a problem that you do not fully grasp is a lot of work. It's hard enough when you did write the code and when you do fully grasp the problem, which you'd have to if you were to write it yourself. Typically that's what first draft code is for: to see if you actually understand the problem.
With a brain you can also build it piece by piece, in fact I don't know of any other way of writing a large software system than doing it piece by piece.
That's just not true. I expect software to become an order of magnitude shittier soon because of this attitude. After that I expect software to become unbelievably good, because thanks to AI we will have the ability to prove correctness of software much more cheaply than before, and to design on a higher level than before. After all, you don't want the AI to help you generate boilerplate code, you want the AI to help you avoid boilerplate code.
It is true if you can understand the code it writes. Maybe you are worried junior programmers are gonna churn out code using ChatGPT but to be honest I'd rather trust code from ChatGPT than from a junior, I feel ChatGPT writes better code on average.
I'd argue you only THINK you understand the code. If it generates a 1000 lines, which look like they are doing the right thing, will you be diligent enough to go through every single one of them? This literally can only work for extremely boilerplate code (which is maybe the code most of programmers write), but for most code I write I need a mental model of it, and constructing that model myself is easier than to try to learn it from some code. Of course ChatGPT can work as an inspiration, especially for working with unfamiliar APIs.
Because what the code does is simple enough, connects to a database, runs some queries and returns the results. I know it worked because I tested it and for that kind of code there aren't really edge cases. Even if I can't validate or catch syntax errors because I'm not yet experienced enough I can see the overall structure of the code and what it does and if it runs then there's no syntax errors.
It's like with riddles, someone asks you a riddle you think and you think and you draw a blank but if you are given the answer you can instantly validate it even if you didn't know the answer before hand, same with this.
On one hand, one may not know the exact syntax for a function call but still understand, looking at code, where a function is defined and where it's called. Those are two different knowledge sets: the first doesn't transfer from other languages but the second does.
On the other hand, you can ask GPT to write tests and validate at the outside layer that data is transformed the way you need it to be.
Stop looking for the “killer app”; that metaphor is not useful here. Language models don’t need to have one, two, or a hundred “killer apps”. They are likely going to be highly distributed. They can revolutionize every textual interaction point with any person, group, or organization. For worse and better.
Instead, pay attention to how industries and people redefine themselves. There are going to be winners and losers.
I found ChatGPT helpful in a bunch of diverse situations. I'll go into detail below, but overall I think the most valuable thing it can do is understand questions that take both knowledge (that I don't have and want to acquire) and understanding (that prior AI technologies did not have) and provide meaningful answers. I suppose I find it very useful for my own education - with the understanding that I should use 2 or 3 grains of salt when reading its answers.
For example, I am studying Japanese and often encounter expressions that seem to mean the same thing; I would ask my teacher, but her time is limited. I can instead ask ChatGPT and only bring to my teacher the questions whose ChatGPT answers did not convince me.
Another example: I like understanding why there are certain steps in a recipe. It would be hard to find someone with the knowledge and time to answer the question, never mind I should pay for their time. ChatGPT can explain what I want to know at the level of detail that I want.
I was also able to get a decent understanding of a mathematical question I had no business understanding by recursively asking questions until I was able to link its answers to my own knowledge.
It was also able to answer questions about the Spring framework that I had while reading the documentation itself. In that context, going in rabbit holes severely slows down learning and has the potential to just get me lost.
Be wary with ChatGPT and Japanese. As someone fluent enough and using ChatGPT for inspiration in writing, as well as to find words that either I have "on the tip of my tongue" but fail to remember, or find idiomatic expressions that I may or may not have heard of, ChatGPT can come up with weird hallucinations (Edit: to the point of making up words, I guess mainly because its tokens are at best at the character level ; knowing that, it's amazing it performs as well as it does, honestly).
I always double check with Google searches, but at least ChatGPT gets me somewhere where I can actually search for something useful.
Relatedly, sometimes, even with an initial prompt I've used in the past to make it do what I need to a text in Japanese, and despite everything I type being in Japanese, sometimes it decides to respond entirely in English.
It can also be really bad with context. Because in Japanese, the subject is often omitted because it's known from context, ChatGPT often mixes things up when rewording or summarizing.
GPT-4 is dramatically better with everything except speed. The minutiae most people in this thread are complaining about are almost completely solved by GPT-4.
For example I used it over like 30 minutes to conceptualize, solve, and write some code that draws a graphic for a simple physics problem (0-shot) that I could intuitively understand but had no (physics and math knowledge) tools to calculate properly and it was a great experience.
It let me pick and change which properties I wanted represented on the graph, knew what center of mass means and how to calculate it for a weird object, got something that feels correct to me, and drew it out with various representations where I used color, size, and shape to represent the various distances, weights, clusters, etc.
My non programmer friends have used generative networks to make designs for prints, incredible and generally accurate folk art, full mobile games.
Specific examples don't generalize though. My experience with ChatGPT (GPT-3) is that it is vastly better at dealing with English than Japanese. That is still probably true with GPT-4, but that doesn't mean there hasn't been progress in GPT-4 with Japanese, which a sibling comment says is the case. But it's not because it does better at $task that you can extrapolate to a different one. People have reported GPT-3.5 being better at some things, after all.
I cannot speak for Japanese, but I'd say that GPT-4 is better at Russian.
That the model is better in English is no surprise given that most of its training corpus is in English. In fact, based on the sentence structure of the output when it speaks in Russian, it's clear that what's happening there is some kind of real-time translation from English.
That aside, I have yet to see any task on which GPT-4 wasn't at least as good as, or better than, GPT-3.5. I'd love to experiment with that. Do you recall any specific examples?
ChatGPT is 3.5, and the one mostly used is 3.5-turbo, that's the only free one and it's the default if you pay (though you can pick the legacy 3.5, or 4).
A lot of your examples you can just google search for the answer - and at least google won’t hallucinate; worst case the websites are wrong but GPT’s training data can be wrong too.
You are not missing anything. It's a bubble. This is tech 101 all over again. You gotta hype some new thing, whether it's Web 2.0, Crypto or AI.
LLMs have proven to be great as a gimmick or at making rather okay localized approximations. You can create a good enough image without any design skills, or have auto-completion on steroids. However, there is no proof that this same tech can extrapolate to the next level.
Most startups are putting their eggs in this single basket. I have the feeling that this will be what triggers another AI winter and some VCs will be left holding the bag... but why do I care?
This is the single greatest leap in productivity we’ve had in the last 100 years.
Last week, I used GPT-4 to write my code. Later, I used it to analyze 100+ websites and come up with a personalized pitch for a relevant plugin/product idea - something that would have taken me 100+ hours.
If it makes you feel better (or worse?): I am using it daily and extensively. It is a productivity boost but not something that will propel me to god-level, and the more I use it (Codex in particular) the more I discover its limitations. That's where my conclusion has come from: it is too limited beyond the local scope it has. It is not clear (and there are no indications) that it can make a jump from that local scope.
It seems then that its usefulness depends on the intersection of what you want/need to do, what the model is capable of doing, and what you're capable of getting the model to do.
In some cases the impact might be enormous, and in others perhaps less so. One thing is for sure, the models are getting more capable, and along with that people are investing time/effort/money into improving their capability to leverage what the models can do.
It has plugins now. It can search the web. It's connected to Wolfram. Zapier. It can do things.
Even before that you could hack something together. I told it how to request an image be included in responses - it includes [fluffy unicorn], I parse that out, feed it to another GPT to get a better description, then feed that to DALL-E to get the image to include
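To make that hack concrete, here is a minimal sketch of the parsing step. It assumes the chat model has been told to emit image requests as `[short description]`; `describe` and `render` are hypothetical callables standing in for the second GPT call and the DALL-E call, and the `<image: ...>` output format is made up for illustration.

```python
import re
from typing import Callable

# Matches a bracketed image request such as "[fluffy unicorn]".
PLACEHOLDER = re.compile(r"\[([^\[\]]+)\]")


def expand_image_placeholders(
    reply: str,
    describe: Callable[[str], str],
    render: Callable[[str], str],
) -> str:
    """Replace each [short request] in `reply` with an image reference.

    `describe` and `render` are hypothetical stand-ins for the second
    model call and the image-generation call; they are not real APIs.
    """

    def substitute(match: re.Match) -> str:
        request = match.group(1).strip()
        detailed_prompt = describe(request)  # second model expands the request
        image_ref = render(detailed_prompt)  # image model returns a URL or path
        return f"<image: {image_ref}>"

    return PLACEHOLDER.sub(substitute, reply)
```

Passing the two model calls in as plain functions keeps the parsing testable without any network access; any bracketed text in the reply is treated as an image request, so the prompt has to reserve brackets for that purpose.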
When you use it for brainstorming, hallucination is not that big of a deal - you need to verify ideas anyway. And it's great at "thinking" outside of the box, if you prompt it right.
What it does is it makes you multidisciplinary. You might be a coder who struggles with writing social media posts and blog headlines. GPT fills that shortcoming.
It’s also a massive skill multiplier. It can turn someone with 3-6 months of knowledge in a discipline into the equivalent of someone with 2-3 years (or even more) in the field.
I believe that you are half right. AI is going to radically change the world, but not now, maybe in 10 years. The pace of research is slower than the expectations of investors. This will lead to the perception of a boom and bust and AI winter, but in reality there will be slow steady improvements. It’s just been commercialized too soon.
There is also the other (very high in my opinion) possibility that LLMs will hit a wall in regards to performance and a new kind of model will be required. It's unfortunate that the current VC landscape encourages this single-shot mentality instead of diversification. This will not help in the AI winter, since most people will be invested in the same thing.
Wouldn't it stand to reason that LLMs should be obsolete in about a year due to the singularity event and the rapid speed at which we're moving towards AGI and then onto exponential technology growth due to ASI?
At this point, as an AI researcher, I guess you'd just have to sit back and watch it all unfold; very soon everything you do is obsolete almost immediately.
This suggests that LLMs are the path to take us to AGI. There is no proof for that. That seems to be the current bet, but my thinking (and it is strictly mine) is that we are still technologically under-powered to achieve such a leap. Maybe in 10 years, or in one year, or in a hundred years. However, for that leap (and this is strictly my opinion), we need significant infrastructure leaps. Something like processors that are 100x faster, or your laptop being powered by a 5 GHz, 200-core CPU...
As it stands, we can't even get the Mark Metaverse right. You are trying to convince me that we have the infra. for AGI? Not convinced.
Three observations here. Firstly it has been a really eye opening experience watching the innovation around Stable Diffusion and locally run LLMs and seeing that the unoptimized research code that needed such beefy hardware could actually be optimized to run on consumer hardware given sufficient motivation.
Secondly it wasn't obvious that deep learning was going to work as well as it did if you simply threw enough compute at it. Now that this tech has reached critical mass there is a tonne more money being poured into infra to support it.
Lastly, compute power is increasing as always. Nvidia releasing H100 and also their recent work on computational lithography. Also DeepMind finding new state-of-the-art algorithms for doing matrix multiplication with AlphaTensor. You can kinda already see the positive feedback loop in action.
I dunno... at this point I just wouldn't bet against the trajectory that we're on.
The model architecture has stayed roughly the same since the original AIAYN transformer in 2017. That’s 6 years of nothing fundamental happening.
Now, obviously the models have got hugely better in capabilities since BERT. Everything else has advanced. Tweaking, tuning and scaling have delivered true intelligence, albeit sub-human. But it seems unlikely that transformers are what take us to human-parity AGI and beyond, because the more we optimize these word predictors the more we find their limitations.
The lack of architecture changes over the last 6 years creates a huge amount of “potential energy”. A new model architecture might well push us over the human-parity threshold. It wouldn’t surprise me if I wake up one day to find that transformers are obsolete and Google has trained a human-parity AGI with a new arch.
This could happen tomorrow or in 20 years. Transformers had an easy discovery path from RNNs, to RNNs with attention mechanisms, to Transformers. Architecture X seems to have a much more obscure discovery path.
I also believe we might see something unfold but we still don't know.
It would be wrong, though, not to keep a very close eye on it, or even to embrace it: if it doesn't happen in the next 20 years, you still need to earn money, and with expertise in ML you might be better off.
ChatGPT, or let's say ML, already did, and there is no writing on the wall that it will stop. On the contrary, it shows how much potential there is and how excited a lot of people are.
We already have ML in Office, in Bing, in Google Workspace. There is Midjourney, SD, etc.
You can find ML-generated porn pictures.
We constantly get news about advances in multiple ML fields.
I've been struggling to reconcile my personal experience with what I'm reading - it was so strange reading such dismissive comments by such a knowledgeable community about a new technology that's such an obvious game changer.
GPT-3.5 was easy to dismiss, but GPT-4 is incredible.
Sounds like you're a native English speaker? Translations to English often worked "well enough" with Google. But as soon as you tried the other way around, or from one non-English language to another, the results were often completely incomprehensible.
Btw, the same goes for t2s and s2t (text-to-speech and speech-to-text). It only got slightly usable in the last few years, and therefore wasn't really adopted in many non-English-speaking countries. As you can see, there is a huge market opening up.
Then take voice synthesis into the picture. I think the sheer amount of recent changes, including LLMs, will steamroll a totally new media environment and with it substantial social and economic changes.
For me it's less the capabilities of LLMs than the speed and the inevitability of change. You can choose to ignore it, but then you will soon be outdated. Just like someone trying to work an office job without using digital machines. Maybe you can still do it, but who would hire you?
Language pairing is important. Machine translation between English and German/Dutch has been excellent for years. English and Korean/Japanese: awful for years. On the last couple, did it get much better. I am sure ChatGPT12 will have virtually native-level translation. Maybe it will be integrated into Kindle so that you can buy books in 25 different languages, then get a ChatGPT-translated version with one button click.
DeepL is really good, better than Google Translate. ChatGPT is about the same, but it can also explain the translation and the grammar. It might sometimes be wrong but it’s usually good enough at explaining things to point in the right direction.
Yo, check this out! DeepL is like way better than Google Translate, so you should totally use it. But, GPT-4 is like a game-changer, dude! It's obvi way better than DeepL, and it can do all sorts of stuff like rewording, explaining in detail, and changing up the style. This text was translated from Japanese to gamer-style English by ChatGPT-4, no joke!
I’ll give you a concrete example from a few days ago in my job.
I needed a quick utility window in the Unity editor to see what animations could fire animation events and what those were.
I’m somewhat familiar with the editor API, enough to know what to Google and roughly where to go in the docs. I don’t do it enough though to really learn it beyond that point. So I’d estimate I could spend maybe one and a half hours, counting research, coding something, testing it and then context-switching back to what I was working on.
On a whim I asked ChatGPT (GPT-4) if it could do it for me. Formulating the prompt took a few minutes. I included a short bullet list of what I wanted and told it what Unity version I was on.
In almost an instant, it did it. I copied the code into a new file, added it to my project and it worked.
Time from idea to the first working version was around 10 minutes.
I asked for some minor refinement and then asked how I could extend it. It gave me starting points and taught me something new about Unity. All that slow doc searching, Google searching and forum-trawling was gone.
One worry I have for the long term is: how will we learn or adopt new programming languages in the future, when LLM du-jour knows nothing about Language X? Will we be stuck with what we’ve got because LLMs make us too productive in them?
I'm very worried that machine learning will lead to stagnation in general, in programming and in art. People will accept the machine output and call it a day instead of making something new.
But for a new language, there isn’t a large body of work to train the LLM on what idiomatic code looks like. It really worries me, that we’re going to be stuck at this local maximum
> I’ve seen examples of “write an email to my boss”. It would take me longer to explain to ChatGPT what I want than to write these myself.
You can integrate ChatGPT here to help with the proofreading and editing. If you have a list of points, you can have ChatGPT write an email, then integrate its changes. This is especially useful if English is not your first language. ([append]) Here's a quick example. These emails aren't great, but might be better than what I can come up with myself in 2 minutes. https://pastebin.com/dD22gR4y
> I’ve seen “write a snippet of code” demos. But I hardly care about this compared to designing a good API; or designing software that is testable, extensible, maintainable, and follows reasonable design principles.
It's super helpful when starting in a new space. I needed to write a python data munging script the other day. Using a few ChatGPT queries I found dependencies and understood the basics of using them. I still had to check the docs, but I jumped past the "tutorial" and straight to "API reference".
> In fact, no one in my extended sphere of friends and family has asked me anything about chatGPT, midjourney, or any of these other models. The only people I hear about these models from are other tech people.
Counter-anecdote, I met two copywriters working for media publications (non-tech) who both have had GPT-based services integrated into their workflows by the company management.
I've used this part of ChatGPT before. Incredibly useful for getting the syntax of some library that you're going to use once in your life, then never again.
Had a sysadmin mate do something similar to generate a simple Chrome plugin for internal use at his work.
That alone justifies the price for ChatGPT Plus, IMO.
Agreed. It’s also great for things like “zip up all the files in this folder, upload the zip file to s3, and output a progress bar while uploading”. It’s not difficult in an architectural or algorithmic sense, but it takes time to refresh on the APIs, find a progress bar library, and how do presigned s3 posts work again? And you can make little mistakes, forget error handling, etc. You could easily spend an hour or more on something like this, especially if you haven’t done it before in x language.
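For a sense of how small that task really is, here is a rough standard-library sketch of the zip-and-report-progress part. The actual S3 call is deliberately left out: `send` is a hypothetical callable that would wrap whatever SDK or presigned POST you use, and error handling and retries are omitted.

```python
import zipfile
from pathlib import Path
from typing import Callable


def zip_folder(folder: Path, archive: Path) -> Path:
    """Write every file under `folder` into `archive`, keeping relative paths."""
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(folder.rglob("*")):
            # Skip the archive itself in case it lives inside the folder.
            if path.is_file() and path.resolve() != archive.resolve():
                zf.write(path, path.relative_to(folder).as_posix())
    return archive


def upload_with_progress(
    archive: Path,
    send: Callable[[bytes], None],
    report: Callable[[int, int], None],
    chunk_size: int = 1 << 20,
) -> int:
    """Stream `archive` to `send` in chunks, calling `report(sent, total)`.

    `send` is a hypothetical stand-in for the real upload call.
    """
    total = archive.stat().st_size
    sent = 0
    with archive.open("rb") as fh:
        while chunk := fh.read(chunk_size):
            send(chunk)
            sent += len(chunk)
            report(sent, total)
    return sent


def print_bar(sent: int, total: int, width: int = 30) -> None:
    """Render a one-line text progress bar."""
    filled = width * sent // total if total else width
    print(f"\r[{'#' * filled}{'.' * (width - filled)}] {sent}/{total} bytes", end="")
```

Usage would look like `upload_with_progress(zip_folder(Path("data"), Path("data.zip")), send, print_bar)`, with `send` supplied by you.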
With ChatGPT it takes a few minutes. It can add up quite dramatically when you have a bunch of these kinds of tasks on the todo list. It does feel revolutionary to me as a productivity enhancer.
Recently used it to speed up the time to fix an oauth problem in code I didn't write, knowing nothing about oauth. Replaces stackexchange with much more tailored answers.
Can take hundreds of pages of text and distill it down to an executive summary of any length
Daughter uses it to explain how to solve algebra problems, not just give an answer. Will completely change education.
Marketing using it for all content. Devs hate writing blog posts, it will take some code as an input and write a blog post about what it does. Can use it to come up with questions to seed a podcast
Paralegals are virtually redundant. AI with access to all legal case law makes it much easier
HR using it to write customized offer letters, review resumes, etc.
It isn't just about asking a question and getting an answer. You can keep adding context and the answers keep getting better.
It outrages me that someone would waste my time making me listen to a regurgitated, averaged-out, speech written without any emotional fire or depth at an occasion that is profoundly meaningful.
A speech as a best man should be personal, heartfelt, it should be funny and a little cringy maybe, and it should mean something.
Same with eulogies; this is a moment to celebrate someone's life, to state what that person meant to you, how they affected you, to share something about that person with everyone else who is mourning.
It's like those services that will buy gifts on your behalf for your "loved ones" or at least, the "acquaintances" that you seem to be obliged to provide gifts for. You've outsourced your taste, your chance to buy something quirky, meaningful, useful based on how well you know a person... which shows that you just don't know them at all.
Why is it unfair or unpersonal if chatgpt helps you, guides you and gives you a good starting point?
Guess how many people would Google some example and start with that. What's so wrong with letting ChatGPT generate a shitty first draft already tailored to your situation?
> Why is it unfair or unpersonal if chatgpt helps you, guides you and gives you a good starting point?
As somewhat extreme examples: Rodin didn't swoop in and smooth off "The Kiss" after an apprentice chiselled the basic outline out of a lump of marble; pretty sure Salvador Dali started with blank canvas instead of getting a basic landscape from a $2 Art Shop and added a melting clock and a giraffe to it.
We all have language.
Take your age, subtract maybe 5 years, and that's how much experience you have expressing yourself in your language.
Say you're 20; by that metric you have 15 years' experience communicating. Now, I don't play guitar and I'm not remotely musical, but I'm pretty sure if I did it every day for 15 years I could at least bash out something original, if possibly influenced by things I liked.
I'm not suggesting that people entertain us for half an hour with a heartfelt, witty speech about their relationship with the groom; I'm not suggesting a ten-minute poetic ode to a life well-lived that leaves everyone simultaneously trying not to cry and trying not to laugh, and nodding and saying "yes, that's how they were".
It doesn't need to be long: just one memory or incident. It doesn't need to be Shakespeare: just heartfelt. It doesn't need to win the Academy for Best Actor. It doesn't need to be a 5 paragraph essay (though if you did use that model, there's nothing wrong with it).
It just needs to be you; it just needs to be yours.
(I already know what I'll say at my Father's funeral when the time comes: it'll be about two sentences, just something he said to me once. But I know that's all I need to say).
Maybe I am arrogant, but I want human feeling and human expression at human events. I don't want to be snoring through regurgitated pap.
I started to see a huge issue with knowledge: people don't know what they don't know.
If you have a magic machine, your solution is no longer 'no clue how to do this'; it becomes 'I will ask my trusty expert machine'.
This machine now gives you the next step, the guidance you need, without being judgmental. Or, much simpler: the other person just looks too busy, so you don't feel confident disturbing them.
Some people are just bad at this kind of thing, and sometimes saying nothing is not an option if there is a social expectation that you say something. Which can create a lot of anxiety.
Yep. The best man's speech at my brother's wedding was a disaster zone in every possible way; it was a sad blemish on an otherwise great day. It'd have been far better if he'd had some AI help along the way.
> can take hundreds of pages of text and distill it down to an executive summary of any length
This is the kind of breathless claim that no doubt fuels the skeptics.
None of the context windows are large enough for "hundreds of pages" nor "executive summaries of any length".
I do believe that it's possible to make LLMs do that kind of task with significant engineering effort, having them summarize iteratively, essentially "compressing" parts of the document recursively. But it's not something that you can just give to ChatGPT and have it work.
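A minimal sketch of that recursive "compression" idea, to show where the engineering effort goes. It assumes the context limit can be approximated in characters (real limits are counted in tokens), and `summarize` is a hypothetical callable wrapping whatever model you use; nothing here is an actual ChatGPT feature.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int) -> List[str]:
    """Split text on paragraph boundaries into chunks of at most max_chars.

    A single paragraph longer than max_chars is cut hard.
    """
    chunks: List[str] = []
    current = ""
    for para in text.split("\n\n"):
        while len(para) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(para[:max_chars])
            para = para[max_chars:]
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


def summarize_recursively(
    text: str,
    summarize: Callable[[str], str],
    max_chars: int = 8000,
) -> str:
    """Summarize chunks, then summarize the joined summaries, until it fits.

    `summarize` is a hypothetical stand-in for a model call that accepts
    at most max_chars of input.
    """
    while len(text) > max_chars:
        parts = split_into_chunks(text, max_chars)
        shorter = "\n\n".join(summarize(part) for part in parts)
        if len(shorter) >= len(text):
            raise ValueError("summarize() did not shrink the text")
        text = shorter
    return summarize(text)
```

Each pass loses detail, and chunk boundaries can split an argument in half, which is part of why this is not something you get for free by pasting a document into a chat box.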
So yes, the hype is real, in both ways: there's lots of potential to explore over the next years, but also a lot of the claims you read today are not sufficiently hedged, which makes them look outlandish.
Unrelated to the general question of usefulness, but I'd be careful using it as a mathematics tutor.
Maybe the algebra in this case is sufficiently basic as to be trivial for chatgpt, but I've found that quite often it gets confused by relatively simple mathematics in very baffling ways (One particular example I found amusing is it trying to use the pigeonhole principle with exactly N pigeons and N pigeonholes).
There's also the problem that, being a complete pushover, it's hard (especially for a learner) to pinpoint whether the error is in the source or in the understanding of the reader. A teacher may, upon the request of a student to clarify what seems like a mistake, say "No, this may seem wrong but it's correct for so and so reasons", but chatgpt usually goes "You're right, that was a mistake" and gets itself into messier and messier reasoning.
I don’t think you’re missing anything. I like the “calculator for words” analogy that was posted on HN a few days ago. It doesn’t seem like a revolutionary product, but it does seem like a fundamental innovation which will then unlock many more complex things in subtle ways. Calculators were arguably the predecessor to computers, which are kind of a big deal :)
> Calculators were arguably the predecessor to computers, which are kind of a big deal :)
"A computer is a machine that moves data around and only occasionally performs computation on it." ;-)
Computers aren't great because they can compute numbers faster; they're great because we've managed to encode text, audio, video, geospatial, etc. information as numbers, which allowed us to perform complex text, audio, video, geospatial, etc. operations.
Most of the people asking the question "to do _what_, exactly" seem to have not much of a "knowledge worker" experience. Every non-tech person I showed GPT4 to was after a few tests immediately using it to speed up parts of their daily work.
Most impactful is how it destroys the blank page/getting started barrier, second how easy it is to substantially change/adapt/refocus produced work.
It is like having an incredibly efficient, patient and encyclopedic junior collaborator 24/7 at your disposal. It can't be trusted to fully automate without knowledgeable supervision, but it saves a boatload of time and effort.
> Most of the people asking the question "to do _what_, exactly" seem to have not much of a "knowledge worker" experience.
From personal experience I'd say the opposite: GPT lacks the specialist knowledge to produce useful writing or yield accurate answers in any of the markets I've worked in (I'll grant that less niche markets exist, and that GPT is pretty good at fixing the writing of people that lack English language writing skill) and it seems like people egging GPT as replacing most of those roles are all showcasing hypothetical "generate a website for an imaginary product with minimal brief" kind of situations which GPT excels at because there aren't any real world knowledge worker constraints imposed on their solution. That's definitely not to say it has no use, but lots of less technologically impressive accomplishments like data entry wizards and templates also have use without being considered transformative.
As I already suggested: in my experience, certainty that GPT will be transformational in a field of knowledge work seems to actually be inversely related to experience of that field. A response whose certainty that GPT could be transformational to the stuff I worked on is exceeded only by its ignorance of what any of that stuff was is quite a good demonstration of that point...
(and really, there's nothing particularly special about any of the stuff I've worked on, it's just GPT doesn't have relevant knowledge or a path to acquiring it so doesn't generate remotely adequate responses, struggles even more with novel concepts and would be terrible at real time discussion even if suitable interfaces to it existed and were unobjectionable, and that's before we get started on the privacy implications)
> Most impactful is how it destroys the blank page/getting started barrier, second how easy it is to substantially change/adapt/refocus produced work.
This is the killer feature of GPT for me. I'm very, very good at optimizing and solving problems within specific domains, but terrible at picking a direction with no boundaries. (Pick a theme for a costume party and I'll have the most interesting costume. Throw a Halloween party and I'll show up in jeans and a t-shirt.)
I recently wanted to submit a conference talk, but wasn't sure where to start. I gave ChatGPT a list of possible topics and some general guidance about what I thought was interesting and asked it to suggest topics. I picked one from its list and asked for tweaks, then "discussed" with ChatGPT for a few more rounds until I had a very clear idea of what the talk would be.
I don't feel like that's cheating. I'm still going to create and give the talk myself. But if I had to come up with the topic and abstract on my own with only a blank sheet of paper to start with, I'd never have submitted it.
It's just good enough and approachable. But I agree, for what I do professionally it just doesn't even come close and I'm already quite efficient.
From me and around me:
- marketing asked me to show them the ropes of Midjourney yesterday, boss said "this will be the face of our new product" to one of my hasty "creations".
- mom learns English with ChatGPT because she finished duolingo
- I wrote a PoC demo for a prompt engineering tool and gpt4 demo chat in a day. That would have taken me days without gpt4. (material design, storing in local storage, gobbling up the data specification, save everything on change not with buttons)
- First draft for some diagrams in mermaid js worked well too, or converting from flow diagram to a swimlane diagram.
- All kinds of personal data cleaning: dirty list of emails > ready to paste in mail client, wall of text with broken new line characters > sub-headlines and paragraphs
- virtual assistants (e.g. company chat-/voicebots), needs a bit of tooling but gpt4 is totally ready for it (apart from latency and price)
- 30 minutes to a browser-add-on that marks tweets as "seen" so I can skip them if I scrolled past them before. (userscript to be precise)
- understanding tax regulations %)
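For comparison, the "dirty list of emails" item from the list above is also the kind of thing a few lines of script can do deterministically. This is only a sketch: the pattern is deliberately loose (an assumption about what the messy input looks like, not a full address validator), and lowercasing the whole address is a convenience choice.

```python
import re
from typing import List

# Deliberately loose pattern: good enough for tidying a pasted list,
# not a full RFC 5322 validator.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def clean_email_list(raw: str) -> List[str]:
    """Extract addresses from messy text, lowercase them, drop duplicates."""
    seen = set()
    cleaned = []
    for match in EMAIL.findall(raw):
        address = match.lower()
        if address not in seen:
            seen.add(address)
            cleaned.append(address)
    return cleaned
```

`", ".join(clean_email_list(raw))` then gives a string ready to paste into a mail client. Where the model earns its keep is input too irregular for a pattern like this.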
What I'm waiting for:
- better knowledge ingestion so it can use my notes
- personalization over time
- good dev-ops integration (push and deploy for me too).
- maybe something to have better separation of concerns on code so the messiness matters less, not sure if possible :)
- remote control my screen
- running an LLM locally
- have it build its own plugins for any website or service
- write and explain me more optimised algorithms for certain cryptographic operations
- explain funny mathy bits of papers that I don’t understand
- plan me a few days of activities for a city holiday
So far it’s been great on all accounts!! I was able to get a faster turnaround time in understanding the papers than I would if I were probing a colleague
What gives you confidence its explanations are accurate?
> write and explain me more optimised algorithms for certain cryptographic operations
This domain in particular strikes me as a poor choice for this approach. "Don't roll your own crypto... but definitely don't let a language model roll it for you, either"
Well it gives me a direction to dig in - often papers use inscrutable notation or seemingly magical variables that I’m not sure where they come from.
Will ChatGPT always be right? Probably not - but these are things I can validate better than no info at all!
Re: crypto algorithms, the query in question was implementing exponentiation for arbitrary sized integers. My own implementation was taking until the heat death of the universe to finish for big integers and I didn’t want to just copypasta an impl from elsewhere.
ChatGPT's worked flawlessly and it was able to explain to me in depth certain tricks it used (which I could independently verify from other sources).
Would I ship it to prod? Not without a security audit, but that ought to be the case regardless when rolling your own (or even someone else's) cryptography :)
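The comment doesn't say which tricks were involved. The usual one for this problem is square-and-multiply, which needs a number of multiplications proportional to the bit length of the exponent rather than to its value. A sketch under that assumption (the function names are made up; Python's built-in `pow(base, exponent, modulus)` already does this, and neither function is constant-time, so this is for understanding rather than production crypto):

```python
def power(base: int, exponent: int) -> int:
    """Compute base**exponent with O(log exponent) multiplications."""
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    result = 1
    while exponent:
        if exponent & 1:        # lowest bit set: fold the current square in
            result *= base
        exponent >>= 1
        if exponent:            # skip the final, unused squaring
            base *= base
    return result


def power_mod(base: int, exponent: int, modulus: int) -> int:
    """Compute (base**exponent) % modulus, reducing after every step."""
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    if modulus <= 0:
        raise ValueError("modulus must be positive")
    result = 1 % modulus        # handles modulus == 1
    base %= modulus
    while exponent:
        if exponent & 1:
            result = result * base % modulus
        base = base * base % modulus
        exponent >>= 1
    return result
```

Reducing modulo `modulus` after every multiplication keeps the intermediate values small, which is what turns "heat death of the universe" into milliseconds for large exponents.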
For a serious answer, my biggest reservation with using ChatGPT to explain concepts I know very little about is that ChatGPT often just makes up fake information, or very confidently explains things wrong.
Having to fact-check every thing it says feels a bit exhausting.
A bit like talking to a version of Albert Einstein with dementia or Alzheimer's, who can still say really intelligent things but mixes in subtle bullshit.
> I’m asking what people want the models to do every day, all the time
I know three regular people using ChatGPT, here's how they're using it:
1. Franchise consultant: uses it to research opportunities, has it write business letters to franchisees he wishes to contact. Saves him time and is a better writer than he is.
2. Immigration lawyer: uses it to summarize info and write emails to clients. Saves her a ton of time.
3. School teacher: uses it to write report card and assignment feedback. It doesn't save any time at all, but the output is more elegant than if he wrote it manually.
Spam of all kinds and at automated-industry scale. Imagine blog-spam written in a variety of styles so you can no longer easily identify it as such. Imagine chum-boxes that no longer repeat themselves and are harder to identify as such. Imagine ads masquerading as content, as it already exists, but at scale.
And every time you slip up and click on one, it will learn a little more about you and create chum content better tailored to you.
Generative AI will facilitate a flood of algorithmic spam.
This kind of spam already exists, it just isn't scaled because it still takes time and resources to create and the people creating it are not the brightest, so for now it is easy to identify.
Fully functional in what way? As far as I can tell, ChatGPT is a box that I put sentences into, and I get grammatically correct sentences that contains topics or words loosely statistically correlated to what were in my sentences, that may or may not be correct and often are not. The box has little to no memory. I honestly don't see what's so useful about this box.
The results are approximately as good as asking an intern, but the response is approximately instantaneous, where an intern takes a week to do anything. And similar to the intern, you can get better results with a bit of guidance and iteration.
In short, humans kinda suck. LLMs also kinda suck, but faster than humans.
Agreed, but that's part of my critique. These systems are written by humans and are trained with barely curated data generated by humans. There is the concept of emergence, but I'm not sure how emergence suddenly fixes a terrible foundation full of biases and errors.
> and I get grammatically correct sentences that contains topics or words loosely statistically correlated to what were in my sentences, that may or may not be correct and often are not.
It's correct enough to pass the bar exam, medical exams, it scores 90-93 percentile on the SAT. This is way more complex and efficient than what you make it to be imo.
> It's correct enough to pass the bar exam, medical exams, it scores 90-93 percentile on the SAT.
So is Google Search, and we had that for a long time now. Is a slightly different and more verbose UI really a game changer?
(Let's ignore the fact that Google Search is broken from all the SEO spam and monetization. Especially when we have no evidence that ChatGPT is any more resistant to this than Google was.)
When hooked up to plugins or tools it does feel like a fully functional interface. With access to a browser, it can do basically whatever I want it to.
I hooked GPT-4 up to a shell and asked it to use the GPT-3.5-turbo completions API (not in the dataset), and it successfully did it through trial and error with curl and error messages. This example is of course not something you would actually do regularly, but rather shows that you don't need a lot of context for it to do useful things right now. With a complete explanation of the OpenAI endpoints, it would most likely make the request perfectly on the first try.
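The trial-and-error loop described above can be sketched roughly like this. `propose` is a hypothetical callable wrapping the model: it gets the goal plus every earlier attempt (command, exit code, output) and returns the next command to try. The names and the shape of the history are assumptions for illustration, not how the commenter actually wired it up.

```python
import subprocess
from typing import Callable, List, Optional, Tuple

# One past attempt: (command, exit code, combined output).
Attempt = Tuple[List[str], int, str]


def run_until_success(
    propose: Callable[[str, List[Attempt]], List[str]],
    goal: str,
    max_attempts: int = 5,
    timeout: float = 30.0,
) -> Optional[Attempt]:
    """Ask `propose` for a command, run it, and feed failures back.

    `propose` is a hypothetical stand-in for the model call. Returns the
    first attempt that exits with code 0, or None if none does.
    """
    history: List[Attempt] = []
    for _ in range(max_attempts):
        argv = propose(goal, history)
        try:
            done = subprocess.run(
                argv, capture_output=True, text=True, timeout=timeout
            )
            attempt = (argv, done.returncode, done.stdout + done.stderr)
        except (OSError, subprocess.TimeoutExpired) as exc:
            # A missing binary or a hang is reported back like any other error.
            attempt = (argv, -1, str(exc))
        history.append(attempt)
        if attempt[1] == 0:
            return attempt
    return None
```

The error messages are the whole trick: they are the context the model is missing. Running model-proposed commands unsupervised is risky, so in practice this belongs in a sandbox or behind a confirmation step.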
Yes, this is exactly what I meant. You can really leverage the API in other systems. ChatGPT on its own is cool, but my mouse comment was about NL being a bolt-on UX tool. Which is crazy!
You can tell it what you need to do in terms of data processing, and ask it to write Python code that does that. E.g. ever had to write a convoluted ImageMagick command to do something complicated? GPT will write it for you.
Think of it as a natural language interface to anything that has an API.
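One way to picture "a natural language interface to anything that has an API" is a small dispatcher: the model turns a request into a structured tool call, and ordinary code routes that call to the right API. Below is a minimal sketch under loud assumptions: `fakeModel` is a stand-in for a real LLM call, and the tool names and JSON shape are made up for illustration, not any vendor's actual plugin format.

```typescript
// Hypothetical shape of what the model is asked to reply with.
interface ToolCall {
  tool: string;
  args: Record<string, string>;
}

// Hypothetical tools; in practice these would wrap real APIs or CLIs.
// Here they only build the command string instead of running it.
const tools: Record<string, (args: Record<string, string>) => string> = {
  resizeImage: (a) => `convert ${a.input} -resize ${a.size} ${a.output}`,
  fetchUrl: (a) => `curl -s ${a.url}`,
};

// Stand-in for an LLM call: maps a request to a JSON tool call.
function fakeModel(request: string): string {
  if (request.indexOf("resize") !== -1) {
    return JSON.stringify({
      tool: "resizeImage",
      args: { input: "in.png", size: "50%", output: "out.png" },
    });
  }
  return JSON.stringify({ tool: "unknown", args: {} });
}

// Parse the model's reply and route it to the matching tool.
function dispatch(request: string): string {
  const call = JSON.parse(fakeModel(request)) as ToolCall;
  const tool = tools[call.tool];
  if (!tool) {
    return `error: no tool named "${call.tool}"`;
  }
  return tool(call.args);
}

console.log(dispatch("resize in.png to half size"));
```

The useful property is that the model never touches the API directly; it only names a tool, so an unknown or malformed request ends in an error string rather than an arbitrary command.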
I think the problem is that people working in the field have a much different definition of what "fully-functional" means than users who believe that they're communicating with an intelligent, all-knowing or otherwise infallible being.
I believe it. For text->text workflows it's a game changer. For anything touching a physical system, even the highest-level path planning for robotics, there's a long way to go.
In the long run, it will 'secure' knowledge and data. Nobody will know what's in GPT's databases.
Currently you can command kagi/google/http websites to return information. You can infer what should be in Google's search engine and track when information is deleted.
GPT is not commanded, it predicts with inaccuracy. So anybody who wants to black-hole information behind the scenes and never reveal clues to that fact, can do so.
All failed predictions are covered by the LLM's design; you cannot infer without serious long-term study that something has been removed deliberately. You cannot infer that a valid data entry exists and you failed to retrieve it, because unverifiable bs is the default failure state of LLMs.
High level tech people will invest in this, regardless of what the public values in it. Just like Elon's SpaceX and Tesla got lifted out of pitfalls by gov and VC, so too will the AI guys.
Let me put it this way. Hoard and back up every scrap of online information you care about. Hypothetically, an LLM-fueled replacement for all of 'the free and open web' websites could limit information availability.
A metaphorical example would be leaving out Tiananmen Square. Which is fine when you can just Google it, but with the old freedom of information gone, an LLM has the ability to just bs you and you'd never have a reason to infer it existed in the first place.
It's a Super-Injunction by default, a perfect repository for spies to dump data, a librarian who will answer any question but only answer with the truth, if he likes you.
No more Snowden and Assange leaks, there's no way to chase up a deleted video with a search engine.
Anyway, you get the idea. In the long run, the structuralists are licking their wet lips at the thought of re-establishing a hierarchy of information access. (Probably, I don't know).
> I’ve seen examples of “write an email to my boss”. It would take me longer to explain to ChatGPT what I want than to write these myself.
You’re a better writer than me. Also, English isn’t my first language.
Besides that, I used chatgpt to write a eulogy for my father. I knew roughly what I wanted to say but I couldn’t find a good way to say it. Chatgpt helped me there even in my own language (Croatian) and there is just no way I could’ve made it as poetic as it did.
For me it’s a perfect replacement for Stack Overflow. Except every solution is tailored to my exact code and situation. I’ve even gotten it to walk me through things like installing WSL 2 without using the Microsoft Store after I nuked all Appx packages.
Maybe my favorite was pasting in pages of documentation describing all of the error codes for a library (the docs are the only source of truth) and getting it to output a very good TypeScript enum.
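For readers who haven't seen the pattern, here is roughly what a docs-to-enum result can look like. This is only a sketch: the error codes, names, and descriptions below are invented for illustration and do not come from the library the comment refers to.

```typescript
// Hypothetical error codes, as they might be listed in a library's docs.
enum LibError {
  /** Operation completed successfully. */
  Ok = 0,
  /** Input buffer was null or empty. */
  InvalidInput = 1,
  /** Operation exceeded its time limit. */
  Timeout = 2,
}

// Numeric enums get a reverse mapping, so a raw code can be turned
// back into its name; unknown codes fall through to a placeholder.
function errorName(code: number): string {
  const name: string | undefined = LibError[code];
  return name ?? `Unknown(${code})`;
}

console.log(errorName(LibError.Timeout));
console.log(errorName(99));
```

Keeping the doc text as a comment on each member is the part that saves the most typing, since the docs stay the source of truth and the enum just mirrors them.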
You don't need a full-blown LLM to find answers to questions on StackOverflow. You can do that today with any search engine.
> "I’ve even gotten it to walk me through things like installing WSL 2 without using the Microsoft Store after I nuked all Appx packages."
That's a search engine query you can issue already today, without involving any LLMs. It will cost a fraction of the cost of running this with an LLM, and it'll actually bring you to the "source" of the information (a thread on StackOverflow with the full context - including "wrong answers" which are just as useful) - unlike an LLM.
The problem with SO is asking questions though. It’s a horrible site for that purpose. I’d rather get a slightly wrong answer from ChatGPT than the insane experience of trying to formulate my question just right so it won’t get closed or deleted on SO.
I remember cross referencing with Google to see where ChatGPT was pulling from and it wasn’t immediately obvious. So yes I probably could have found a guide through a search engine with more digging. But clearly the LLM is the superior product.
Sure, you can keep throwing those hyped VC/MBA-style retorts all around, or you can acknowledge the simple reality that when it comes to the task of finding answers to questions on SO, traditional search technology outperforms LLMs on every parameter:
1. Cost: it's much easier to build and serve a traditional search index for questions/answers at scale
2. Speed
3. Quality: with search engines you don't just get one answer (you get a spectrum of answers). You have access to the "source" of those answers, which often include additional data that can help you choose the most appropriate solution to a problem.
In this specific case, the "horse" is the LLM - and the "car" is the traditional search engine with modern indexing technologies. But please, don't let simple textbook facts about technology take you off your trip.
I have a friend that has been using ChatGPT to help him write technical books which collect existing niche information and present it in a way that's more accessible and organized, in Spanish. You'd be amazed by how many small niches there are without robust technical books and well organized resources. And while the information is all "out there", it's often not really easily accessible to beginners.
Text editing - if any part of your job is about writing, you can now just brain dump everything, structure it more or less, throw it into chatgpt, and make it produce a clear and readable article. In my case, the output is better than anything I would ever write.
Ditto all the editing work in newspapers, intranets, etc. The whole field of proofreaders was virtually extinguished overnight.
Marketing agencies - and I spoke to a few - increased their workers' productivity 2-4 times (sic!), virtually overnight. Anything from writing briefs to writing copy.
Programming - most of my programming work is deep algorithms, so not much help here, but for writing boilerplate code with new APIs, or writing in a language that I'm a bit rusty in, chatgpt is better than anything else.
Customer service helplines / chatbots (and the same for intranet) - we don't see it just yet, because it takes a bit more time to build a good system, but there are probably thousands of projects right now worldwide building those for any niche conceivable.
Business intelligence - we used, with success, ChatGPT in our deep tech seedfund, for helping out with initial project ddil.
And, essentially, rubber ducking, but for every single field out there. I just discussed with a psychiatrist how he can use even boilerplate GPT-4 as an additional consultant. You need to be aware of limitations of course, but it is already immensely useful in its current form - and dedicated solutions for medicine are coming very soon.
That's the short-term perspective and the low-hanging fruit. On top of that, you have thousands of projects now that are figuring out how to apply LLMs to specific niches. It was difficult before, because you had to train your own models, and now you can just fine-tune the existing ones, do embeddings, or just plain prompt engineering.
Oh, and also synergy with different AI modalities - we've had a massive growth in voice recognition and generation, visual recognition, and so on. And LLMs are a glue that adds a layer of understanding underneath.
"In fact, no one in my extended sphere of friends and family has asked me anything about chatGPT, midjourney, or any of these other models. The only people I hear about these models from are other tech people."
I have the opposite experience. Everyone I meet outside tech has been exploring ChatGPT, or at least has heard of it and is extremely curious. And in non-tech student circles, a (non STEM) TA I know said students who until a month ago were bad to mediocre (8-12/20 scoring), are suddenly all turning in top 16-18/20 assignments this month. You can argue about what to do given this change, but you can not deny the impact.
I have the same experience. And ChatGPT was the turning point. A few months ago, to my surprise, I wasn't even able to excite my non-technical environment about DALL-E 2, not to mention other ML models that would excite technical people.
I was in a Starbucks yesterday, sitting opposite two girls doing some language study, and one of them was joking that the other should ask ChatGPT about some grammar point. This idea that only tech people know about it seems way off.
We are going to have almost on the fly translation of video and audio. We are going to understand and write code faster. Knowledge management will change so instead of being partly a tool which searches the web we will get human readable responses.
Language models are about to improve language-related things for us. How killer could that be?
Code things I've used it on:
- I've thrown in JSX React components and asked it to transform them to TSX
- I've thrown in TSX components and asked it to write some tests
- I've asked it to make an offline queue so user requests are stored and saved once online again
- I've asked it to rewrite tests from Enzyme to Jest
- I've asked it to add a full screen function to react-native-video, which is annoying.
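To make the offline queue item in the list above concrete, here is a minimal sketch of the idea: hold requests while offline and replay them in order once the connection is back. The `QueuedRequest` shape, the `OfflineQueue` class, and the injected `send` callback are all assumptions for illustration; this is not the code ChatGPT produced, and a real app would also persist the queue to storage.

```typescript
// A queued request; the shape is hypothetical.
interface QueuedRequest {
  url: string;
  body: string;
}

class OfflineQueue {
  private pending: QueuedRequest[] = [];

  // `send` is injected so the sketch stays independent of any HTTP library.
  constructor(private send: (req: QueuedRequest) => void) {}

  // Send immediately when online; otherwise store the request for later.
  submit(req: QueuedRequest, online: boolean): void {
    if (online) {
      this.send(req);
    } else {
      this.pending.push(req);
    }
  }

  // Call when connectivity returns: replay stored requests in order
  // and report how many were sent.
  flush(): number {
    const count = this.pending.length;
    for (const req of this.pending) {
      this.send(req);
    }
    this.pending = [];
    return count;
  }
}

// Usage: two requests made while offline are delivered on flush.
const delivered: string[] = [];
const queue = new OfflineQueue((req) => delivered.push(req.url));
queue.submit({ url: "/like", body: "{}" }, false);
queue.submit({ url: "/comment", body: "{}" }, false);
console.log(queue.flush(), delivered);
```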
Often I'm doing something else in the meantime, like coding, in a meeting or having a beer.
Finally we can be drunk and code at the same time ;).
Joking aside, it's been a huge productivity boost, and if you ask things properly it will write hugely detailed and correct code.
I've also used it to understand other languages/coding I was less familiar with, for instance C and SQL procedures.
The above only works with v4; 3.5 is too inaccurate. But v4 is indeed slow; small things I can do faster myself.
With writing articles I've been disappointed so far; even corrections or styles I ask it to adjust, it puts back in a few questions later.
However, for writing children's stories to help my kid learn how to read, it's really good, in every language. "Write a story for kids of age ... about ... use the following words". It came up with nearly perfect stories my kid loved.
I use ChatGPT at least 10 times a day, as a software developer, and I agree with all your assessments about the use-cases you mention. Here are my ten most recent prompts:
* Asking about how back-propagation works with multiple output nodes
* Examples of successful ICOs
* A deep dive into what it means to calculate the gradient of a function, with me asking lots of clarifying questions
* A deep dive into how electricity pricing works in the UK, digging into the market clearing price
* Looking for a generic "unwrap one layer of type" utility type in TypeScript
* "In JavaScript, I want to format a date as YYYY-MM-DD_HH-MM-SS"
* Seeing if there's a more concise way to get Zod to define an item that's required, but the value can be defined, other than using a union
* Ideas for naming two user fields, one that's a changed user, one that's the user making the changes
* Digging into the implications that Camilla is called the Queen Consort, not just the Queen
* "Excel I want to show the weekday as a single letter"
So, basically, it's my go-to instead of Google + StackOverflow
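For what it's worth, the date-formatting prompt in the list above has a short answer. A minimal sketch in TypeScript follows, assuming local time is wanted and a four-digit year; the helper names `pad2` and `formatTimestamp` are made up here, and this is not necessarily what ChatGPT returned.

```typescript
// Left-pad a number to two digits, e.g. 7 -> "07".
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// Format a Date as YYYY-MM-DD_HH-MM-SS using its local-time fields.
function formatTimestamp(d: Date): string {
  const date = [d.getFullYear(), pad2(d.getMonth() + 1), pad2(d.getDate())].join("-");
  const time = [pad2(d.getHours()), pad2(d.getMinutes()), pad2(d.getSeconds())].join("-");
  return `${date}_${time}`;
}

// Months are zero-based in the Date constructor, so 2 means March.
console.log(formatTimestamp(new Date(2023, 2, 9, 7, 5, 3))); // prints "2023-03-09_07-05-03"
```

The zero-based month is the usual trap, hence the `+ 1` on `getMonth()`.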
I’d be interested to see what it said about differentiating a function (and I’m assuming you mean in the context of programming). I consider that one of the dirtier secrets of gradient-based ML.
Also I've found it very useful for identifying books, TV shows, films, games, etc that I remember some details of (eg, was released around 1994, was about web design) but not all the details of
It's actually quite a lot better than a Google search. Yesterday I gave it the output of SQL DDL and told it to write me a script in Go that reads that from a CSV, then I asked it to generate a sample CSV to verify the script.
How is that different from googling? Lots of articles won't bother to include imports, and the structs they use definitely won't match up to your use-case because their dataset is different. What if my use-case is a bit strange and I need to embed the file instead of reading from the file system, for example? I can ask ChatGPT and it will update the example program, using my exact file name and the variable names for the problem I've described, and the program runs as written!
I don't write much Go (so I don't actually care to commit the hello_world.csv ritual to memory), but I know enough about it to verify that the program doesn't have any glaring issues and make my own tweaks as necessary. Saved so much time for me in this scenario.
It can't replace search, actually. It doesn't know everything a decent search engine can find, and might not even remember the specific php command line parameter you're looking for.
What it can do is reason some pretty tricky interdisciplinary answers. It's about as useful as having an intern that's finished literally every white collar college out there but has no idea what they're doing. It takes some prompting to get useful work out of it but it is possible and it is very effective when you get it right.
The $20 sub has saved me quite a bit of time, the main challenge is remembering to set it to GPT4 for every damn conversation, as the previous models are trash.
I have read a lot of books (sociology, psychology, philosophy, history, religion, etc), and often I'll have an insight that stands in apparent contradiction with some concept I learned in one book but in alignment with a concept I've learned from another.
I use ChatGPT to understand these concepts, their background, and to bridge the gap between them.
Getting to the bottom of what Jean Baudrillard really was saying in "Simulacra and Simulation" and applying that to what Larry Wall meant when he said that Perl is the first postmodern language is really something!
Similarly, what’s the point of Stack Overflow or a wiki when you can just get the answer you want from a book in the library?
What I value about these LLMs is that they are essentially a condensed version of the internet (despite being stupidly large for normal hardware currently).
Usually if I’m building a recommendation or search algorithm I have to use the data from the company I’m working with. This makes it possible for me to encode the entire internet into a model that might be running on a product that only has 100 users.
LLMs will make average and below-average writers / marketers / PR people extremely good.
You no longer can mock "Please do the needful" from $1 / hr employees from India. They will be communicating at the same level as an average American and the smart ones can completely take over large fields.
A single person can wear multiple hats and not get blocked.
This can apply at a global scale of at least 4 billion people.
Just because you lack imagination doesn't mean it isn't good
To be honest, I can't tell if you are genuinely wondering or just (virtue) signaling that you're not using it, but here's a sample of my own usage: https://news.ycombinator.com/item?id=35299121
Not virtue signaling, but I seem to be asking a different question than people are answering. I’m asking what the product opportunity is and people keep telling me examples of tasks that they use it for.
In many cases the examples are one-off and the only product opportunity is the generative model interface itself. Looking broadly over the replies what I’m seeing is “there’s a thing that used to fail the cost/benefit test, but now the cost is so low that I can automate these things”. So part of my problem is (1) the small benefits of these tasks mean the value proposition comes from volume—that probably comes from the generality of the task engine, and (2) there may be some niche product opportunities on top of the model platform, but the primary big winner here is the platform itself. (That’s not necessarily a new insight, but it seems especially true here.)
The terrifying part is how often I hear people in this thread and elsewhere mentioning tasks that are not fault tolerant to the failure modes of these models. (For example, I had a coworker tell me their relative is a doctor using ChatGPT to diagnose patients.) People keep focusing on the risks of AGI killing us all with paperclips, but I’m much more worried about getting run over by some idiot asking ChatGPT to drive their car.
> Not virtue signaling, but I seem to be asking a different question than people are answering. I’m asking what the product opportunity is and people keep telling me examples of tasks that they use it for.
Sorry, I misunderstood. "I am not using AI" has become a sort of badge of honor in certain communities so I was wondering if that's what it was.
> (2) there may be some niche product opportunities on top of the model platform, but the primary big winner here is the platform itself. (That’s not necessarily a new insight, but it seems especially true here.)
I agree with that conclusion. I think the chat interface is the killer product. I treat ChatGPT as an assistant/intern that is really good at some tasks but that can also sometimes make dumb mistakes. It has also replaced a lot of queries I would have previously done on Google or questions I might have asked somewhere (e.g. in a forum, Reddit, Discord, etc.).
Many startups build domain specific UIs on top of it using the API, but whether that will become a sustainable business model remains to be seen[0]. I am reminded of the many "vertical" search engines that were once trying to compete with Google.
[0] Saying this as someone who did something like that: https://eli5.gg
For me, it gives an inkling of a web that existed before, with less friction. Instead of going to a webpage with verbose ad-ridden fluff, I get a more or less frictionless answer, very specific to my question.
Funny enough I don’t use it a lot for programming, maybe just to jog my memory on a topic.
The thing has enabled me to take a major code project in an exciting direction in a short amount of time. Paradoxically, I find it does better with more difficult tasks because the prompts tend to have more detail. The thing can generate semantic Cypher queries for combinations of combinations of combinations of elements if you want. It’s ridiculous at coding already and only gonna get more insanely good at it in the future. Yeah, sometimes it forgets context and you have to guide it a little, but it can still rewrite a whole module in a new style with lots of specific changes in under 2 minutes. Good luck coming close to typing that fast even if you did know exactly what to write. Coding feels pretty revolutionized overnight right now for me, tbh.
I don't have the same experience as you at all. Most of my non-tech friends or family members asked me about chatGPT. You don't find it useful to write emails maybe because you're a native speaker, but it is extremely useful for non-fluent English speakers.
I defined an interface and asked ChatGPT to implement it, in Go, as method receivers on a struct named “repo” against sqlite. The code compiled and I wrapped it in unit tests to ensure it worked. Overall saved me an hour of boilerplate, would recommend.
What is the killer app of the internet, or the smartphone? It's a silly question. It's many little things, some foreseen and some not.
One thing you're missing is that we now have a pretty good solution to any NLP pipeline that in the past you'd have to spend months to get right. You can probably still get better results by supervised training on specific tasks but it's good enough. NLP (as we knew it) is dead. This will take some time to show in the applications we use, as people figure out how to use and integrate it, and costs need to come down, but it will make it trivial to add smart functionality for things you previously needed an in-house ML team for.
Sorry, one other thing, as an actual AI researcher:
These generative models, whether NLP or vision, are cool but are really examples from a very narrow field. Most ML researchers and practitioners are working in completely different areas and would not obviously benefit from the new generative models (which are themselves productized extensions of existing tech trained on more data) so nothing's going to change day-to-day. If you were working on some alternative general purpose generative model, maybe you got scooped. Otherwise it's business as usual.
The "revolution" is happening for tech savvy non-ML people, "tech-bros" colloquially.
I'll ask you another question: What are people searching for on google right now that can't be answered by ChatGPT? I would say 95% of searches can be a simple question & answer of ChatGPT. I've tried it, it works.
Note: When I write "Midjourney" below, I mean any AI/ML-based art generation tools.
Art generation is going to be groundbreaking for the advertising industry. Basically, you can hire summer interns from arts academies or liberal arts schools to type cool stuff into Midjourney to generate amazing art for your ads. You don't need to pay for real artists, whose original artwork, sadly, was used to train the model.
The same can be said for low-end graphic arts. Stuff like: Make me a greeting card for this big event. Today, you need to pay a graphic artist to whip something up in 2-72 hours. That will be replaced by someone with modest English skills working in a call center somewhere in India or the Philippines. They might chat with you for five mins (voice or text chat) to brainstorm ideas. Then, they will "drive" Midjourney and put together a nicely-themed party invite.
On the more advanced side, I do think artists will browse the "best of" portfolios on Midjourney (and others) to get new ideas. They might also use Midjourney to get a head start.
The next logical step beyond Midjourney is to generate the same image, but as a 3D model. I think people (myself included!) really (really, really!) underestimate the cost of creating 3D models for films, adverts, and games. If Midjourney could give you a starting model, you might save hours (or days) of work.
Next-Next: Midjourney can provide basic animations of the same 3D model. Again, you can download in a wide variety of formats, so that you can import and tweak as necessary. Think: Pay to play. Rendered as GIF is cheap as hell, but download advanced CAD format with 50K points in the model might cost 100s of USD. (Remember: You are paying for a SaaS engine, not an expensive, talented artist.) Imagine: "Hey Midjourney, I need a 60 second animation of cute animals, like Animal Crossing, sitting at a table enjoying our new brand of tea called 'It's Great Tea Meet You!'." (Use ChatGPT to generate the first draft of the script.) Writing that just made me think: Ok, now you can add voiceover.
The possibilities for the commercialisation of generating still and moving pictures are nearly endless, and many will be useful for the advertising, film, and gaming industries.
I think the killer app will involve adding more layers of AI to make something that is closer to an agent than a simple prompt completion engine. Maybe a more self-contained system that has an internal cycle of assessment, generating prompts, executing output, and readjusting prompts. These agents could conceivably work on long term, complex projects with minimal supervision or interaction. If a general purpose system can get a foothold and be even 10% as good as a person, it could be enough to fundamentally reshape the workforce.
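The internal cycle described above (generate a prompt, execute, assess, readjust) can be sketched as a plain loop. Everything here is a toy under stated assumptions: `generatePrompt` and `executeAndAssess` are hypothetical stand-ins for real model calls and real execution, and the "success on the third attempt" rule exists only so the loop has something to terminate on.

```typescript
// Result of running one step: the output plus whether the goal is met.
interface StepResult {
  output: string;
  done: boolean;
}

// Stand-in for an LLM: turns the goal plus feedback so far into the next prompt.
function generatePrompt(goal: string, feedback: string[]): string {
  return feedback.length === 0
    ? goal
    : `${goal} (previous attempts: ${feedback.join("; ")})`;
}

// Stand-in for executing the output and assessing it.
// This toy version reports success on the third attempt.
function executeAndAssess(prompt: string, attempt: number): StepResult {
  return { output: `attempt ${attempt}: ${prompt.length} chars`, done: attempt >= 3 };
}

// Generate a prompt, execute it, assess the result, and readjust,
// until the goal is met or the step budget runs out.
function runAgent(goal: string, maxSteps: number): string[] {
  const feedback: string[] = [];
  for (let attempt = 1; attempt <= maxSteps; attempt++) {
    const prompt = generatePrompt(goal, feedback);
    const result = executeAndAssess(prompt, attempt);
    feedback.push(result.output);
    if (result.done) break;
  }
  return feedback;
}

console.log(runAgent("write a changelog", 5));
```

The step budget (`maxSteps`) is the one design choice worth keeping in any real version: without it, a loop that never assesses itself as done runs forever.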
One thing I find it really great for is language learning. It can create examples, explain grammar, contrast to languages you already know etc.
Also I'm not a sys admin but occasionally have to do sys admin things, and I ask for the shell commands I need in English. It is far, far faster than using Stack Overflow etc. (e.g. pulling logs from multiple servers and extracting bits). Also, if I want to use a library, I describe how I want to use it and get a small example entirely tailored to me.
In the content space, there are whole publishing industries that are at risk from GPT-style models.
Would reddit, or HN still be interesting if there were bots that talked in the popular tone of the subreddit or thread that dominated the conversation?
AI's effect will be like pollution to our culture. It'll speed up and help a lot of things and be very helpful generally. But it will create problems in 'human' spaces.
> I’ve been shown some neat pictures people made that they thought were cool. I don’t know that I need this every day.
Add 3D and motion and this becomes the holodeck: you describe a scene and it creates it direct to your VR goggles.
Generative AI that can take in a scene and alter it rather than fully create it becomes real augmented reality from sci-fi: not just a HUD or greenscreened in elements, major transformations.
There is no killer app because ChatGPT is largely useless. Its results might be right, half right, half wrong, or totally wrong. All it's good for is spouting bullshit. It's like talking to a really outgoing, really confident used car salesman.
Imagine a code/programming AI that I can ask to code review my code and find my mistakes, and to write unit tests for my new functions.
I am tempted to write a small program to fix my Internet comments and my bad English expression. I did some tests and I see that ChatGPT finds the places in my comments that can be improved.
Instead of looking at use of an LLM in isolation, consider the value it could have in the future, combined with APIs, other models, speech interface, etc.
Sure, it can help you write an email now, but the real magic could be when these things come together in a symphony of intelligence.
But soon people will just enter one goal, "make me money" and the program will go on a loop, pausing only when making an important decision to get the approval of the owner
Also, the input buffer of LLMs is increasing. Soon we will be able to begin with "write me a full 3D game"
> In fact, no one in my extended sphere of friends and family has asked me anything about chatGPT
A couple of weeks ago they wrote an episode of South Park with ChatGPT (and it was about ChatGPT). It's definitely gone mainstream amongst students who are using it to do homework.
I use it for APIs. Specifically, I use it to help think about how I should architect my APIs, as well as how I should implement them. I can ask ChatGPT to do both things for me. Even if it's somewhat wrong, it still gives me a good enough direction to go on.
“Almost all” makes me question how much experience you have doing those jobs. They exist but work which is truly that rote has been getting automated for half a century so what I’ve found is that what you see now tends to deceptively appear simple while hiding judgement, uncertainty, and inference.
That’s especially true for things which involve liability. If Google builds a system to recommend YouTube videos and ads, it’s a win as long as their error rate is below a certain level. If it’s your insurance company rejecting claims, however, people can die that way and the lawsuits for breaking contracts or legal standards can far exceed the believed savings.
It replaces googling. All the hard work of finding relevant info and piecing it together is done for you. For programmers it's just a faster stackoverflow.
Scrolling back through some discussions with chatgpt and a few things with bing (later ones more likely to be gpt4)
* Explaining what a do/while(false) loop was for as I'd not seen that construct before
* Discussing what DIDs were and how web DIDs worked as the RFC was very detailed in a lot of areas I just didn't care about. A discussion with a pretty knowledgeable person in the area was what I needed, and what I got. It explained what the well-known part was, explained my confusion around resolving to a document and resolving to a resource (where I was mentally stuck).
* Creating a learning plan, diagrammed by mermaid, for progressing in bouldering. Each major step broken down into sub-parts to practice
* Finding https://blockprotocol.org/ given a shaky memory of "some composable ui framework by someone famous" iterating a little with bing. I had failed to find it before manually.
* Explained and created a table of bouldering gradings as the place I go uses a different one to the videos I see
* Discussed project ideas to do with my son, gave me great ideas around electronics that I think are a good fit as well as a few other things. The most useful part here was being able to say "that's too generic, I need proper projects" and "that's too simple for him" and have it update. It then also created some good explanations at different levels about how radios work.
* General discussions about long term impacts of LLMs, potential use cases
* Career advice
* Generating art that we'll be getting commissioned for the house
* NER without any coding
* Generating ember templates, CSS and example filling data for a custom framework given a problem statement (what I'm actually building right now)
* (edit) I just took requests from my kids and made them some colouring in pages with robots, firefighting robots, lego ninjas, owls, frogs and crabs.
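On the do/while(false) item in the list above: the construct is a loop whose body runs exactly once, which makes `break` usable as an early exit from a block. In C it also shows up a lot inside macros. Here is a minimal sketch of the early-exit use, written in TypeScript; the `validate` function and its checks are hypothetical and only there to show the control flow.

```typescript
// Hypothetical validation: each check can bail out early with `break`.
function validate(input: string): string {
  let error = "";
  do {
    if (input.length === 0) {
      error = "empty";
      break; // skips the remaining checks
    }
    if (input.length > 10) {
      error = "too long";
      break;
    }
    // ... further checks would go here
  } while (false); // the condition is false, so the body runs exactly once
  // Shared cleanup or reporting happens here, whichever check failed.
  return error === "" ? "ok" : `invalid: ${error}`;
}

console.log(validate("hello"));
console.log(validate(""));
```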
> In fact, no one in my extended sphere of friends and family has asked me anything about chatGPT, midjourney, or any of these other models. The only people I hear about these models from are other tech people.
I've had family members tell me they've used it to create reports, and used it to create marketing copy, a website and lecture slides for others.
Needed to safely move a bulk carrier on short notice recently:
Q: Describe turning 180 degrees and continuing down stream with the current in a heavily loaded bulk carrier from a stationary position tied to a dock with five knots of downstream current using two 50 ton bollard pull tractor tugs. Be very detailed and specific in describing the actions required to complete this action safely.
A: To safely turn a heavily loaded bulk carrier 180 degrees and continue downstream with the current from a stationary position tied to a dock with five knots of downstream current, using two 50-ton bollard pull tractor tugs, you must follow a series of steps. These steps are designed to ensure a smooth and controlled maneuver while minimizing the risk of collision, grounding, or damage to the vessel, dock, and tugs.
1. Preparation: Prior to beginning the maneuver, ensure that all necessary communication equipment is functional and tested. This includes VHF radios, signal flags, and/or lights. Establish communication protocols and channels with the tugs and any other relevant parties, such as pilots and harbor control. Additionally, check the weather forecast, tidal information, and any local regulations or restrictions that may apply.
2. Positioning of tugs: Position one tug at the bow and the other at the stern of the bulk carrier. The tug at the bow will primarily be responsible for pulling and steering the vessel, while the tug at the stern will act as a brake and assist in maneuvering the vessel as needed. Securely attach the towlines to the designated towing points on the vessel.
3. Cast off lines: Once the tugs are in position and ready, instruct the crew to release all mooring lines from the dock. As the lines are released, the tugs will need to maintain tension on their towlines to keep the vessel in position and prevent it from drifting downstream with the current.
4. Initiate the turn: The tug at the bow will start the turning process by pulling the vessel's bow away from the dock and into the current. At the same time, the tug at the stern will push the vessel's stern towards the dock to create a pivoting motion. Both tugs must continuously adjust their power and positioning to maintain control of the vessel as it turns.
5. Manage the current: During the turn, both tugs must work together to counteract the effects of the five-knot downstream current. The tug at the bow will need to pull harder to overcome the current, while the tug at the stern will need to push and also act as a brake to prevent the vessel from gaining too much speed.
6. Monitor the turn: Throughout the 180-degree turn, the vessel's crew and tugs must maintain constant communication to ensure a smooth and controlled maneuver. Monitor the vessel's position, speed, and proximity to the dock, other vessels, and any navigational hazards. Adjust the tugs' power and positioning as needed to maintain control and avoid collisions or grounding.
7. Complete the turn: As the vessel approaches the completion of the 180-degree turn, the tugs will need to adjust their power and positioning to slow down the vessel and align it with the downstream current. The tug at the bow will ease its pulling force, while the tug at the stern will continue pushing until the vessel is properly aligned.
8. Proceed downstream: Once the vessel is aligned with the current and moving downstream, both tugs will gradually reduce their power to allow the vessel to continue under its own momentum. The tugs will remain close by to provide assistance as needed until the vessel is safely clear of the harbor and any navigational hazards.
9. Release tugs: When the vessel is at a safe distance from the dock and navigational hazards, and under the control of its own propulsion, the towlines can be released, and the tugs can return to their base.
I work on a research team in FAANG. What it really feels like is one company made everyone else obsolete. And we keep going to work, working on NLP models that underperform ChatGPT by a huge margin. Twiddling my thumbs and keeping quiet while no one wants to recognize the elephant in the room.
Also, there is no "working in AI"; a few thousand people are doing real AI at most. The rest of us are calling an API.
This reminds me of back in the mid 2000's, there were a lot of smart people working on search algorithms at different companies. But eventually, you'd talk to someone smart working on Yahoo Search, and they would just be kind of beaten down by the frustration of working on a competing search engine when Google was considered to be by far the best. It got harder for them to recruit, and eventually they just gave up.
So... I don't know where you're working. But don't twiddle your thumbs for too long! It's no fun to be in the last half of people to leave the sinking ship.
Why do you think OpenAI is so far out in front? It's not like there's a lot of secret sauce here - most of this stuff (transformers, etc.) is all out there in papers. And places like Google & Meta must have a lot more computing resources to train on than OpenAI does, so they should be able to train faster. Do you think OpenAI has discovered something they haven't been open about?
> Why do you think OpenAI is so far out in front?
There is a network effect forming around its models. The strengths of its kit speak for themselves. (It also cannot be overstated how making ChatGPT public, something its competitors were too feeble, incompetent and behind the curve to do, dealt OpenAI a massive first-mover advantage.)
But as others note, other models are in the ballpark. Where OpenAI is different is in the ecosystem of marketing literature, contracts, code and e.g. prompt engineers being written and trained with GPT in mind. That introduces a subtle switching cost, and not-so-subtle platform advantage, that–barring a Google-scale bout of incompetence–OpenAI is set to retain for some time.
Doubtful "feeble and incompetent" are fitting considering much of the research behind OpenAI was pioneered and published by those predecessors. More like "unwilling", and perhaps for good reason. Time will tell.
The level of denial people are willing to sink into regarding how good GPT-4 is compared to everything else is truly crazy. Not a single other project is an order of magnitude close to the quantitative and qualitative (actual experiential results, not just benchmarks) results that GPT-4 brings.
I feel that there’s significant insecurity among a lot of coders about GPT-4. A lot of them are ignoring the pace of improvement and highlighting the few off chances where it gets things wrong.
I think there's a lot of people writing boilerplate programs who are going to be freed from these menial tasks (i.e. no more Java enterprise application development, thankfully).
GPT 4 is quite astounding. It might be wrong on occasion, but it will easily point you in the right direction most of the time. It still messes up, but like a twentieth of what 3.5 did. Honestly it is like an incredible rubber ducky for me. Not only can I just talk like I’m talking to a rubber duck but I can get fast, mostly informed, feedback that unblocks me. If I have a bunch of things competing for my attention I can ask gpt about one of them, a hard one, go do something else while it types out its answer, and then come back later and move on with that project.
I'm a recent convert, been experimenting with converting one PL to another.
GPT-3.5 will get the gist of what the code is doing, and then provide what looks like a direct translation but differs in numerous details whilst having a bunch of other problems.
GPT-4 does a correct translation, almost every time.
It kills me that there's a waiting list for the API. I have put together some tools to integrate 3.5 into my workflow and it helps for my current task a lot (for others it's useless). But to really shine it needs to have API access to 4.
I recently finally got access to 4 in the API, it's good. It's much better imo at following the system prompt too. Faster than you see in chatgpt I think, not as fast as 3.5-turbo but definitely less tedious.
My only kind of quantitative answer is that I had 3.5 creating ember templates & data and it would get the templates mostly ok after a couple of iterations of fixing errors, almost never on the first shot (if ever). Often wouldn't quite get it in two, and data structures would often be kinda there but not quite. Required a lot more care with the prompts. 4 gave me working things every time first time (except only where it did things that were fine in a template but not currently supported by the custom framework I'm using them in), and didn't need as much hand holding.
Qualitatively, it's wildly different from gpt-3.5-turbo for discussions. 3.5 feels a little formulaic after a while with some kinds of questions. 4 is much more like talking to an intelligent person. It's not perfect, but I'm flipping between discussing a sporting thing, then medical malpractice, legal issues, technical specifications and it's doing extremely well.
If it's affordable for you, I'd really recommend trying it.
For code 90%+ reduction is easily correct. For text and other content, I can't say, but I would guess it's not that good.
Anything involving reasoning, code, complex logic, GPT-4 is a breakthrough. GPT-3.5 turbo is more than good enough for poetry and the other text generation stuff.
My favorite part about GPT-4 is that if it generates code that is wrong, and you ask it to verify what it just wrote - without telling it whether it's wrong or not, much less pointing out the specific issue - more often than not it will spot the problem and fix it right away.
And yes, it does indeed make an amazing rubber duck for brainstorming.
GPT-4 is leaps ahead, and it's improving with every new release. The latest March 23 release is significantly better than the previous one and does a LOT of heavy lifting for my code at least.
At the very least, it's a massive productivity booster.
I've had decent success with Open Assistant, an open source model. I'd say it's within the order of magnitude of ChatGPT, given the prompts I'm looking at, including reasoning prompts. This, I believe, is due to the overwhelmingly clean data that OA have managed to acquire through human volunteers.
I don't work at Google. I think other FAANGs underinvested in this area as they didn't think it was promising. But I will admit, I am suspicious that Google is incompetent. Probably they can come back given how much money they will be forced to throw at it. But Bard is clearly behind and I don't believe their "abundance of caution" arguments for why Bard can't even code.
Google was built by brilliant people whose mission was to make any information available to anyone, instantly.
Google is run by smart people whose mission is to maximize clicks on ads. If a user finds what they’re looking for quickly, that’s lost revenue.
Google’s profit motives are not aligned with useful AI. The better AI is, the less people need to click through to lots of web pages and ads, the less revenue for Google.
I don’t think they can catch up without a major pivot in business model. It’s very hard to be deeply invested in providing more value if it means reducing your revenue.
FAANG companies have distribution. They can sell anything. Give them a bit of time. They are not companies of a single product, they milk people better than governments.
The impression I get is that they purposely limited Bard in certain ways. It does not seem to be willing to code and makes a lot of excuses when you ask it to (at least as of a couple of weeks ago when I last tried it). Did they put this limitation on it because it's not so good at coding or because they don't want it to be abused in some way? In my experience I had to trick it into coding by telling it to pretend that it had created its own programming language and then implement an algorithm in it. It seemed to think that it had created Python.
> I am suspicious that Google is incompetent.
Google has put a lot of effort and investment into AI. With Bard I get the feeling they're not showing us what they really have - it's like for some reason they're holding back the good stuff, at least that's my suspicion.
I am still wondering if Google is experiencing its Kodak moment.
They have the dominant product that makes them billions and billions of dollars at 'relatively' low cost.
The new dominant product is on its way, but it costs far more to operate and will net them far less money, so... um no one wants to kill the goose that is still laying golden eggs, even though its days are numbered already.
(moment worth photographing): From an Eastman Kodak Company advertising campaign.
(business's failure to foresee): In reference to the Eastman Kodak Company's decline when cameras and film were overtaken by smartphones and digital technologies.
Noun
Kodak moment (plural Kodak moments)
(informal) A sentimental or charming moment worthy of capturing in a photograph.
(informal) The situation in which a business fails to foresee changes within its industry and drops from a market-dominant position to being a minor player or declares bankruptcy.
Kodak Film Commercial - These are the Moments - Baby (1993):
I don't see ads as a big problem with ChatGPT. You could put a side-bar on the right and on the fly recommend products relevant to the on-going conversation.
The cost of computing these ads would be a lot more than today's keyword-based approach, that's certainly a problem. But think of hyper-relevant ads, based on the chat itself. There's a lot of information there, that beats tracking people's behavior online all day.
That depends on ad publishers, right? If they want to sell A, B and C and I am interested in D, then Google's still showing one of A, B or C to me. D doesn't make profit if there is nobody paying for ads.
Google is advertising things we don't need, that's why ad clicks are so abysmal. LLMs won't change that.
I’ve been infuriated with DuckDuckGo on occasion because it refused to exclude certain results.
In fact when you add an exclusion clause it simply boosts those results further instead of removing them.
I’ve been told this is because the underlying search providers refuse to exclude paying customers even when you explicitly don’t want to hear from them.
I could definitely see this happening in LLM answers too and I don’t expect it to be particularly subtle.
100% this -- a Google search requires multiple input queries, refinement, and scrolling a list of possible answers that are really just links to other web sites. The ChatGPT experience is far superior to this, for the average consumer and getting close for the power user. It's a better way to ask the Internet what it knows with a more natural interface that everyone already knows how to use -- real natural language. Less cognitive overhead, no busy search results that require clicking back and forth and (for now) no ads. That last part is key -- ChatGPT is doing Google's job right now, and not even having to run ads. Google doesn't even offer a premium no ads option for search and if they did I doubt enough people would buy it anyway to matter.
If I was Google I'd be worried. Very worried indeed. They either need to dramatically change their entire company within 18 months, or accept they are going to lose a substantial amount of market share -- and once it's gone, it's gone in a first mover, winner takes all environment like what we have right now. Just ask Google themselves what it felt like back in the early 2000's when they completely destroyed the other search engines.
Now it is. Google used to be good too, until ads started looking like search results, and then the first page became entirely ads.
In the future, when you ask ChatGPT to help you write your resume, it will try to upsell you a premium account on LinkedIn. It will withhold its best resume advice for LinkedIn premium users only, after all.
> With Bard I get the feeling they're not showing us what they really have
I highly doubt this. If they had it they would show it because if they don't react swiftly and decisively their brand will be in 'catch up' mode rather than out front where they are used to being.
It's not just about hardware, it's about the software infrastructure to go with it. Other than OpenAI most researchers weren't interested in "merely" scaling things up because that was/is seen as simple engineering, unworthy of the great minds who dream up new algorithms.
I don't know about GPT4, but GPT3.5 I'd bet is pretty traditional and boring. Its power comes from a really good, properly curated dataset (including the RLHF).
GPT3.5 turbo is much more interesting probably, because they seem to have found out how to make it much more efficient (some kind of distillation?).
GPT4, if I had to make a very rough guess: probably flash attention, 100% of the (useful) internet/books for its dataset, and highly optimized hyperparameters.
I'd say with GPT4 they probably reached the limit of how big the dataset can be, because they are already using all the data that exists. Thus for GPT5 they'll have to scale in other ways.
To be fair, if the opposite were true, it might not be wise to admit. Saturating available high quality training data is one of the few ways anyone can see OpenAI slowing down.
1. They would already be using everything they can get
2. They would easily be able to explain what they're not using, without giving away sensitive secrets.
I wonder if we saw the same video - or maybe it is just ChatGPT being "great" in the wild? I see one guy asking another guy simple questions and getting weaselwords for an answer.
Right, that's totally it. I came away thinking the interviewer seemed way sharper than the interviewee which is pretty rare. The sheer throughput and speed of interesting questions was incredible. Too bad that many of the answers were not.
> I'd say with GPT4 they probably reached the limit of how big the dataset can be
I’m curious about this too; not just on the dataset size, but also the model size. My hunch is that the rapid improvements of the underlying model by making it bigger/giving it more data will slow, and there’ll be more focus on shrinking the models/other optimisations.
I don't think we're anywhere close to the limit of sheer hardware scalability on this. Returns are diminishing, but if GPT-4 (with its 8+ k context window) is any indication, even those diminishing returns are still very worthwhile.
If anything, I wonder if the actual limit that'll be hit first will be the global manufacturing capacity for relevant hardware. Check out the stock price of NVDA since last October.
According to financial reports they are building a $225 million supercomputer for AI. What we can probably expect is the same dataset with even more compute run on it.
There is a soft limit due to the computation required; the currently used model architectures are quadratic with respect to context size, so if you want ten times larger context size, that's going to need a hundred times more effort.
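To make that arithmetic concrete, here is a minimal sketch. It assumes cost scales purely with the square of the context length, which ignores the parts of a model that scale linearly, and the function name is made up for illustration.

```python
def relative_attention_cost(old_context: int, new_context: int) -> float:
    """Ratio of attention work after a context change, assuming cost ~ n**2."""
    if old_context <= 0 or new_context <= 0:
        raise ValueError("context sizes must be positive")
    return (new_context / old_context) ** 2


# Ten times the context implies roughly a hundred times the work.
ratio = relative_attention_cost(8_000, 80_000)
print(ratio)
```

Under the same assumption, merely doubling the context already costs about four times as much.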
For a little more than a year I worked in an AI startup doing basically everything other than AI (APIs, webapps, devops...), but from what I've seen there the "secret sauce" to AI success is the training process (dataset, parameters, fine-tuning steps, ...). And OpenAI hasn't been open about theirs since the beginning.
> Do you think OpenAI has discovered something they haven't been open about?
They have not, which makes me curious about which company gp works for because the "F" and "G" in FAANG are publicly known to already have LLMs. Not sure about Amazon, but I'm guessing they do too.
As an outsider, the amazing thing about ML/AI research is that you get a revolutionary discovery of a technique or refinement that changes everything, and a few months later another seminal paper is published[0]. My bet is ChatGPT is not the last word in AI, and OpenAI will not have a monopoly on upcoming discoveries that will improve the state of the art. They will have to contend with the fact that Google, Meta & Amazon own their datacenters and can likely train models for cheaper[1] than what Microsoft is paying itself via their investment in OpenAI.
0. In no particular order: Deep learning, GANs, Transformers, transfer learning, Style Transfer, auto-encoders, BERT, LLMs. Betting the farm on LLMs doesn't sound like a reasonable thing to do - not saying that's what OpenAI is doing, but there are a lot of folk on HN who are treating LLMs as the holy grail.
1. OpenAI may get a discount, but my prediction when they burn through Microsoft, they'll end up being "owned" by Microsoft for all intents and purposes.
Issue is the data moat OAI is building. They'll have hundreds of millions of high quality user interactions with ChatGPT they can use to finetune their models. What will anyone else including Google have?
Google has been collecting user interactions since 2007 via GOOG-411, which was a precursor to the Google Assistant - I suspect Google has billions of user interactions on hand through the latter. Facebook has posts and comments, Amazon has product pages, reviews and product Q&As, and all of them have billions of dollars to draw upon if they choose to buy high-quality data, or spin up / expand teams that create and/or categorize training data.
They also have a deep roster of AI researchers[1] to potentially obsolete LLMs or make fine-tuning work without access to ChatGPT records.
1. I suspect Google alone has more AI researchers than OpenAI has employees
I'm not sure how deep that moat is. As soon as you open up the API, anyone can distil ChatGPT (or at least, some smaller part of it) by fine-tuning another model on its outputs[0].
I'm guessing that this is the #1 fear people inside OpenAI have right now.
[0] For the record, I have zero problem with this.
1. ToS make it hard for a commercial entity to do so. So some third parties would have to collect the data first
2. You won't be able to get the hundreds of millions or more interactions that OAI will have (both due to the cost of the API and because it is not easy to figure out a good way to generate that many queries for a good multiturn conversation). Maybe you can make up for it by querying smartly. We don't know if we can right now.
As people make chat bots with openAI and tie them into existing chat services, organizations that offer these chat services will get their hands on that kind of data too.
A lot of FAANG data folks aren't on the teams that were doing research into this stuff and weren't using the latest fruits of that research.
OpenAI has released a ton more easy-to-use-for-everyone stuff that has really leapfrogged what a lot of "applied" folks everywhere else were trying to build themselves, despite being on-the-face-of-it more "general."
I think it’s the way things usually go. The big players have a business to run so they can’t focus much on innovation. OpenAI’s only purpose right now is to push AI and nothing else. Once they have a real business they will also slow down.
They have been collecting human feedback data for 2 years + probably have a lot of data from Copilot + are training with large context models + have invested a ridiculous amount in curating pretraining data -- the kind of stuff that won't get you a ton of pubs (so you won't see Google researchers having focused on it a lot) but apparently turns out to be super important for a good LLM
All of the neural network architecture for human level thinking and processing, including vision, speech, emotion, abstract thought, balance and fine motor skills, everything was publicly released in April 2003, twenty years ago this month. It's a 700 megabyte tarball and sets up an 80b parameter neural network.
What? Huh? Yes the human genome encodes all human level thought.[1] Clearly it does because the only difference between humans that have abstract thought as well as language capabilities and primates that don't is slightly different DNA.
In other words: those slight differences matter.
To anyone who has used GPT since ChatGPT's public release in November and who pays to use GPT 4 now, it is clear that GPT 4 is a lot smarter than 3 was.
However, to the select few who see an ocean in a drop of water, the November release already showed glimmers of abstract thought; many other people dismiss it as an illusion.
To a select few, it is apparent that OpenAI have found the magic parameters. Everything after that is just fine tuning.
Is it any surprise that without OpenAI releasing their weights, models, or training data, Google can't just come up with its own? Why should they when without turning it into weights and models, the human neural network architecture itself is still unmatched (even by OpenAI) despite being digitized twenty years ago?
No, it's no surprise. OpenAI performed what amounts to a miracle, ten years ahead of schedule, and didn't tell anyone how they did it.
If you work for another company, such as Google, don't be surprised that you are ten years behind. After all, the magic formula had been gathering dust on a CD-ROM for 20 years (human DNA which encodes the human neural network architecture), and nobody made the slightest tangible progress toward it until OpenAI brute forced a solution using $1 billion of Azure GPU's that Microsoft poured into OpenAI in 2019.
Is your team using $1 billion of GPU's for 3 years? If not, don't expect to catch up with OpenAI's November miracle.
p.s. two months after the November miracle, Microsoft closed a $10 billion follow-on investment in OpenAI.
Until we see otherwise, don't we have to assume there's some secret sauce? Bard doesn't match GPT4 and it isn't for a lack of trying. (though perhaps that will change, so far that's the case)
Bard would not trick anyone into thinking it was sentient, yet something they have supposedly did. I just think Google has far more to lose than Bing, so they are being more cautious.
> Bard would not trick anyone into thinking it was sentient, yet something they have supposedly did
Bard is overtly a reduced-resources model compared to the best version of the same technology (which, if true, is probably a boneheadedly bad choice for a public demo when everyone is already wowed by the people who got theirs out first, but easily explains that disparity. Though so does “guy who wanted public attention made stuff up well-calibrated to that goal.”)
> Bard is overtly a reduced-resources model compared to the best version of the same technology
There's a scaling problem. ChatGPT/LLM systems cost far more to run per query than the Google search engine. Google can't afford to make those the first line query reply.
A big business model question is whether Google will insist you be logged in to get to the large language model.
At Google scale, these things are going to have to be a hierarchy. Not everything needs to go to a full LLM system. Most Google queries by volume can be answered from a cache.
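A minimal sketch of that kind of hierarchy, assuming a single exact-match cache tier in front of the expensive model. The class name, the normalization rule and the stand-in `fake_llm` function are all hypothetical; nothing here reflects how Google actually routes queries.

```python
from typing import Callable, Dict


class TieredAnswerer:
    """Serve repeated queries from a cache; call the costly model otherwise."""

    def __init__(self, expensive_model: Callable[[str], str]) -> None:
        self._cache: Dict[str, str] = {}
        self._expensive_model = expensive_model
        self.cache_hits = 0
        self.model_calls = 0

    @staticmethod
    def _normalize(query: str) -> str:
        # Lowercase and collapse whitespace so trivial variants share a key.
        return " ".join(query.lower().split())

    def answer(self, query: str) -> str:
        key = self._normalize(query)
        if key in self._cache:
            self.cache_hits += 1
            return self._cache[key]
        # Cache miss: pay for the model once, then remember the result.
        self.model_calls += 1
        result = self._expensive_model(query)
        self._cache[key] = result
        return result


def fake_llm(query: str) -> str:
    """Hypothetical stand-in for an expensive LLM call."""
    return f"answer to: {query}"


answerer = TieredAnswerer(fake_llm)
answerer.answer("weather in Paris")
answerer.answer("Weather  in  paris")  # same key after normalization
print(answerer.cache_hits, answerer.model_calls)
```

A real system would presumably add more tiers (a cheaper model, conventional search results) and cache expiry, but the cost argument is the same: the expensive path only runs on a miss.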
> Most Google queries by volume can be answered from a cache.
And given how aggressively they limit the number of search results (in spite of listing some ridiculous number of results on page #1) that percentage may well be very large.
Google also has Sundar and Ruth who’d rather focus on how to get another ad on the SERP than kill the golden goose. They’re not going slow, they just don’t have the leadership for the moment.
Maybe it is just time to train and also good training data for prompts which openai has gathered for so long already? E.g. there is a bottleneck on how fast you can train and also gather good data.
Possibly, but wouldn't Google and Meta have access to way more compute resources and data than OpenAI? Google has been touting their TPUs for several years now.
OpenAI has access to Microsoft and Azure. That’s bigger than Meta, roughly on par with Google in terms of capability and higher in terms of market cap.
Google has the compute, from the comparisons I have seen Bard smokes GPT-3.5-Turbo on response times. So my guess is that internal politics prevents them from putting out something better. There would have to be immense pressure from the search division to not make them obsolete.
Without Nadella footing the compute bills, nobody would be talking about OpenAI. He’s brilliant, he let the startup take on huge risk to quietly claim the gains for m$.
If you try Bard or Claude or character.ai they are not far behind GPT4. They might even be on par in terms of raw LLM capabilities. ChatGPT has better marketing and in some cases better UX. A lot of this is self-fulfilling. We think it's far ahead, so it appears to be far ahead.
> If you try Bard or Claude or character.ai they are not far behind GPT4
Bard is way behind ChatGPT with GPT-3.5, much less GPT-4. Haven’t tried the others, though.
OTOH, that’s way behind qualitatively, not in terms of time-of-progress. So I don’t think it is at all an insurmountable lead, as much as it is a big utility gap.
Claude and Character AI are great at holding a conversation but they lack the ability to do anything specialized that really makes these LLMs useful in my day to day life. I ask GPT-4 and ChatGPT questions I would ask on Stack Overflow; I can’t do that with Claude or Character AI. Bard actually seems behind the rest even conversationally.
ChatGPT is cool and novel, but FAANG's requirements for ML/AI go far beyond what ChatGPT provides as a product. ChatGPT is good at answering questions based on an older data set. FAANG typically requires up to date real time inference for huge rapidly changing data sets.
Working on the practical side of ML/AI at FAANG, you will probably be working with some combination of feature stores, training platforms, inference engines, and so on - all attempting to optimize inference and models for specific use cases - largely ranking - which ads to show which customers based on feature store attributes, which shows to show which customers - all these ranking problems exist orthogonal to ChatGPT, which is using relatively stale datasets to answer knowledge based questions.
Productionizing these ranking models, from training to inference, is a huge scaling problem. ChatGPT hasn't really come close to solving it in a general way (and also solves a different class of problems).
Agreed. For my job maintaining real-time models with high business value to be disrupted by a chatbot, an LLM would have to be able to plug into our entire data ecosystem and yield insights in realtime. The backend engineering work required to facilitate this will be immense, and if the answer to that is "an LLM will create a new backend data architecture required to support the front-end prompt systems", then... well, suffice to say I can't see that happening overnight. It will require several major iterative and unpredictable pivots to re-envisage what exactly engineers are doing at the company.
For the time being, I expect LLMs to start creeping their tendrils into various workflows where the underlying engineering work is light but the rate of this will be limited by the slow adaptability of the humans that are not yet completely disposable. The "low hanging fruit" is obvious, but EVPs who are asking "why can't we just replace our whole web experience with a chatbot interface?" may end up causing weird overcorrections among their subordinates.
Isn't this as straightforward as semantic search over an embedded corpus ? Unless i'm missing something, i don't think the backend engineering would take much
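For readers who haven't seen the idea, here is a toy sketch of search over an embedded corpus. It assumes a bag-of-words count vector as the "embedding", which is only a stand-in; real systems use learned embeddings and an approximate nearest-neighbour index. The function names and sample corpus are invented for illustration.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A learned model would go here."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, corpus: List[str], top_k: int = 1) -> List[Tuple[float, str]]:
    """Rank documents by similarity to the query and return the best top_k."""
    q = embed(query)
    scored = [(cosine(q, embed(doc)), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


corpus = [
    "user clicked the checkout button",
    "server latency rose after deploy",
    "ads ranking model retrained nightly",
]
best = search("checkout button clicked", corpus)[0][1]
print(best)
```

The retrieval step itself is short; the open question in this exchange is whether useful embeddings can be produced and kept fresh for fast-moving data.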
I think generating useful embeddings off of a lot of realtime data flows (e.g. user clickstream data) is in fact fairly difficult. Furthermore, if you had such embeddings it's unclear if an LLM would add value to whatever inference you're trying to do. If the LLM is not only being used for inference but to actually retrieve data ("find and summarize the clickstream history of this user") then I would not expect this to be doable in realtime.
ChatGPT is human level intelligence, it’s not just novel and cool, it’s the thing.
Remember, GPT-4 training was finished 6 months ago. Listen to people at OpenAI, their concern is: disruption to the world, UBI, getting people used to superintelligence as part of our world. I think they have quite a few things in the pipeline.
So yes ads optimisation/recommendations still need to be reliable for the time being, but for how long?
GPT-4 is not human level intelligence, nor is it above or below. It’s quite a different kind of intelligence not entirely comparable to humans. That’s probably why we’re moving the AGI goalpost; we visualize AGI as a robot human, but these machines may simply be founded on too different principles to ever receive that honor.
I think it’s mostly different because they crippled the public version for now:
no internet access, everything is done in one pass.
In our mind we get an idea, we inspect it, we try different variations, we simulate how it will be perceived (consciousness). In this way we iterate before putting the information out.
This is not difficult and is getting added on to it externally.
Chat GPT is just to get us used to the idea, it’s the toy version.
I would be interested to know which part you feel is implausible, to me it seems inevitable
You have a language model produce an outline with steps and then recursively set agents to consume and iterate on a task until another language model finds the results satisfy the specification.
This includes interactions with the real world (via instructions executed over an API) and using the success of those interactions for reinforcement learning on the model.
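A rough sketch of the plan / execute / check loop described above, with the three language-model roles passed in as plain functions. Everything here is hypothetical: the `toy_*` functions stand in for model calls, and the reinforcement learning part is left out entirely.

```python
from typing import Callable, List


def run_agent(
    goal: str,
    plan: Callable[[str], List[str]],
    execute: Callable[[str, str], str],
    satisfies: Callable[[str, str], bool],
    max_attempts: int = 3,
) -> List[str]:
    """Plan steps for a goal, then retry each step until a checker accepts it."""
    results: List[str] = []
    for step in plan(goal):
        feedback = ""
        for _ in range(max_attempts):
            output = execute(step, feedback)
            if satisfies(step, output):
                results.append(output)
                break
            # Pass the rejected attempt back so the next try can differ.
            feedback = f"previous attempt rejected: {output}"
        else:
            raise RuntimeError(f"step not satisfied after {max_attempts} attempts: {step}")
    return results


# Toy stand-ins for the planner, worker and checker models.
def toy_plan(goal: str) -> List[str]:
    return [f"{goal}: outline", f"{goal}: draft"]


def toy_execute(step: str, feedback: str) -> str:
    # The toy worker only "improves" (uppercases) once it has feedback.
    return step.upper() if feedback else step


def toy_satisfies(step: str, output: str) -> bool:
    return output.isupper()


print(run_agent("report", toy_plan, toy_execute, toy_satisfies))
```

The attempt cap matters in practice: without it, a checker that never accepts would keep the loop (and the API bill) running indefinitely.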
But I think they are mostly pointless as OpenAI is so far ahead of everyone external it’s not even funny. Most external things built with the API will be obsolete in a few months.
They had GPT4 6 months ago or more!
They have access to the full model without crippling.
They (for sure) have larger, more powerful models that are not cost effective/safe to release to the public.
Now they have a new data flywheel with people asking millions of questions daily.
Put your speculation hat on and listen attentively to the interviews of Sam Altman and Ilya Sutskever.
You will see where their minds go: UBI, safety, world disruption, etc.
I'm quite surprised at how little progress FAANG companies have made in recent years, as I believe much of what's happening now with ChatGPT was predictable. Here's a slide from a deck for a product I was developing in 2017: https://twitter.com/LechMazur/status/1644093407357202434/pho.... Its main function was to serve as a writing assistant.
Scaling up an LM from 2017 would not achieve what GPT-4 does. It's nowhere near that simple. Of course companies saw the potential of natural language interfaces; there have been billions spent on it over the years and a lot of progress was made prior to ChatGPT coming along.
You're making incorrect assumptions. This project wasn't about scaling any published approaches. It was original neural net research that produced excellent results with a new architecture without self-attention, using a new optimizer, new regularization and augmentation ideas, sparsity, but with some NLP feature engineering, etc. Scaling it up to GPT-2 size matched its performance for English (my project was English-only and it was bidirectional unlike GPT so not a perfect comparison), and very likely scaling it up to GPT-3 size would have matched it as well, since GPT-3 wasn't much of an improvement over GPT-2 besides scale. Unclear for GPT-4 since there is very little known about it. Of course, in the meantime, most of these ideas are no longer SOTA and there has been a ton of progress in GPU hardware and frameworks like PyTorch/TF.
What you're saying about companies recognizing the commercial potential is clearly wrong. It's six years later and Siri, Alexa, and Google Home are still nearly as dumb as they were back then. Microsoft is only now working on adding a writing assistant to Word, and that's thanks to OpenAI. Why do you think Google had to have "code red" if they saw the potential? Low-budget startups are also very slow - they should've had their products out when the GPT-3 API was published, not now.
One thing I didn't expect is how well this same approach would work for code. I haven't even tried to do it.
Do you have any publications to back up your claims about your work? They seem more than a bit grandiose. If your ideas are as novel and useful as you say then you should publish them.
And I'm sorry, but you're completely wrong about companies recognizing commercial potential. I worked on Alexa for five years, it is a far harder problem than you think. It is nowhere near as simple as "we just weren't looking at the right NN architecture or optimizer!" You're acting like it was a novel idea to think LMs would be extremely useful if the performance was better (in 2017). I'm just trying to tell you that isn't the case.
No, I have no plans to openly publish any of it. Some of my researcher employees have published their own stuff. I've previously written about how it was a huge commercial mistake for Google and others to openly publish their research, and they should stop. Indeed, now OpenAI has not published a meaningful GPT-4 paper, and DeepMind has also become more cautious. This mistake has cost them billions, and for what? Recruiting? Now they lost many people to OpenAI and wasted time and effort on publishing. Publishing is fine for university researchers or those looking to get hired. I did record some research ideas in an encrypted format on Twitter for posterity: https://twitter.com/LechMazur/status/1534481734598811650.
If any of the FAANG companies recognized the commercial potential and still accomplished so little, they must be entirely incompetent. When this 2017 deck was created, I had 50k LOC (fewer would be needed now using the frameworks and libraries) plus Word and Chrome plugins. The inference was still too slow and not quite feature-complete, and it was just a writing assistant with several other features in early testing, but it seems more than enough for me to know quite well how difficult the task is.
The fact that you think creating a writing assistant plugged into Word is equivalent to building a general purpose, always-on voice assistant tells me all I need to know.
What? We were talking about making a language model. I mentioned the plugins in relation to the question of commercializing. I'm very clear about what my project was and was not doing. I get that you're bitter because Alexa became a joke with how little progress was made and the struggles of Amazon in getting the top talent are well known. How much did it cost Amazon to fail like this? Is the whole team gone? Is that the billions you mentioned?
"It's six years later and Siri, Alexa, and Google Home are still nearly as dumb as they were back then". You can't even keep a coherent discussion and you are delusional about the significance of your work. You shared a slide with nothing but generic pie-in-the-sky use-cases and you act like it gives you some credibility on the subject ("let's make an AI system that can do the work of your non-professional employees!"). And to top it off you act like you've been successful here! Again, you shared nothing but a slide with generic use-cases that a 12 year old could think up. I don't know what you think you proved. Enjoy your imaginary pedestal.
Calling an API doesn't mean no value is captured. There are vastly complex integrations of LLMs as small components in larger systems, with their own programming, memory, task models and so on.
If you think GPT is just about chat, you've misunderstood LLMs.
Folks need to start getting over themselves. It's pretty trivial to get GPT-4 to explain how transformers work and where the bottlenecks are in both performance and learning, and then to start modifying PyTorch code.
It's really not that complicated. Gatekeeping is so over.
Not sure why LLM would make Facebook (ads), Apple (hardware), Amazon (hosting, retail), Netflix (tv) obsolete. It's definitely something Google needs to think about, but there's no reason to think they won't again be the leader soon.
I actually think Apple is in a unique position here again with the hardware/software integration.
Once again, their ability to do computation on device and optimize silicon to do it, is unparalleled.
A huge Achilles heel of current models like GPT-4 is that they can’t be run locally. And there are tons of use cases where we don’t necessarily want to share what we’re doing with OpenAI.
That’s why if Apple wasn’t so behind on the actual models (Siri is still a joke a decade later), they’d be in great shape hardware-wise.
Google has some impressive on device AI software such as Google Translate (translation), Google Photos (object detection, removal, inpainting), and Recorder (multi-speaker speech to text). Most of this is possible without their Tensor chip, but is more efficient with it.
Imagine Walmart launching a ChatGPT interfaced bot for shopping that customers take a liking to. Walmart starts acquiring both new customers as well as high quality data they can use for RLHF for shopping. Eventually Walmart's data moat becomes so big, that Amazon retail cannot catch up and customers start leaving Amazon.
For AWS, if MS starts giving discounts for OAI model usage to regular Azure customers, that's gonna be a strong incentive to switch.
For Apple, a Windows integrated with GPT tech may become a tough beast to beat.
> Imagine Walmart launching a ChatGPT interfaced bot for shopping that customers take a liking to.
I can't imagine that, because it doesn't seem to fit the use case. Especially not to the point of bankruptcy of Amazon, maybe as a small novelty? Can you list some killer features that the chat would bring that would make the existing shopping experience irrelevant? Maybe not everything is a nail to the hammer?
Sure, this is basically a fancy version of the current FAQs, customer Q&As and current recommendations (similar products, recommended products). Would it be so much better in experience that it kills Amazon? I kind of doubt it. Also, we're currently talking about a model that was trained on pre-2021 data, and here we have an inventory in the millions that changes daily, so the tech has to catch up, too.
Asking for your preferences to figure out your product choices or searching for a product for you based on your given requirements is a lot more than just FAQs and Q&As. You can even imagine a fancier version where you describe say what kind of setup you want and it gives you combinations that are nice.
Also this could aggregate information not just on the product page but across multiple pages which is time consuming to do by oneself.
As for the 2021 cutoff -- ChatGPT can now browse the internet, and if Walmart built a bot that interfaces with ChatGPT, I'm sure they would be feeding it the latest info.
I work at a FAANG and our unreleased models are fantastic. Now, there might be panic about how to productize it all, but tech-wise I'm pretty surprised how good they are.
Sounds like the same thing that happened with datacenters? No one has ops or hardware sysadmins, and no one sets up large networks except a few people in those centralized cloud companies and a couple of other niche uses. The website ops job changed.
This sub and blog post are all talking about ChatGPT, which is no doubt amazing and far ahead of the curve. However I would also point to Meta's new vision model:
SAM (Segment Anything): it is so far beyond any other vision model that I actually believe vision will be solved in a few years now. People don’t realize that there was an industry of publishing papers with incremental improvements on small datasets at CVPR that has been completely invalidated by this paper. I’ve seen engineers on the Cruise segmentation team say Meta's new model seems to work better than the in-house models they developed and that they should build on top of it. I’ve worked in Tesla Autopilot before and saw it hit a mannequin because we never had a mannequin in our dataset before (we might have had it in the data, but it was not part of our ontology for the network to predict). One approach to mitigate this was OpenAI’s CLIP, which used the English language as classification labels, but Meta’s SAM is so much better in that it detects objects without needing language to specify them. It just understands scenes and objects at a fundamental level; it can detect anything in a picture if you prompt it right. Honestly it feels a lot like GPT-1, which was also ignored by most. If you prompt it right you can get it to segment anything in an image, but prompting it right requires human input. However I can imagine the third or fourth version, with some RL sprinkled in, just working zero-shot on complete pixel understanding of any image in the world. This was one of the holy grails of computer vision, and we are seeing it solved right in front of our eyes.
There seem to be a lot of breakthroughs being announced right now. My guess is that after ChatGPT, companies like Meta pushed their teams to get their own projects out the door which they'd already been working on for some time. SAM and Zero-1-to-3 (https://news.ycombinator.com/item?id=35242193) are incredibly impressive new projects on the image comprehension side. Then of course there's all the stuff happening with Midjourney/Stable Diffusion. And DeepMind's AlphaCode.
I’d say after ChatGPT’s success every other company is focusing far more on scaling up than coming up with new methods. SAM is basically the method they came up with in 2021, but scaled up to 1 billion masks, which was not something Meta might have been willing to do before ChatGPT.
> I’ve worked in Tesla Autopilot before and saw it hit a mannequin because we never had a mannequin in our dataset before (we might have had it in the data, but it was not part of our ontology for the network to predict).
Incredible. To be clear, your group and the company you work for saw fit to release your beta-quality autopilot onto public roads, as it was hitting human-looking objects in your test labs. Is that what you're admitting here? Did anyone in your group object to this? Were you personally concerned?
Instead of moving fast and breaking things, what if you had not rushed your autopilot out, but waited on technology to improve to the point that your product wouldn't hit a human-looking object it had never seen before?
There’s too much insanity over building new tech these days, and I find a lot of the hate directed at Autopilot stems from hate directed at Elon. Let’s go over the basics for Autopilot:
1. It costs thousands of dollars a year + 15k one time and it’s very easy to get banned from autopilot for life. They have a 3 strike rule + 1000 miles driven on your Tesla with a good safety score before you’re allowed to access autopilot.
2. I’ve seen 2 types of customers use autopilot. One is rich dudes, who buy it just to have all the bells and whistles. They use it less than once a month and honestly it’s a waste of money for them. The other is passionate early adopters; they regularly make YouTube videos, constantly stress test our tech and are huge contributors to the tech itself. I’d say 10-20% of our users are the latter. The group who don’t use autopilot? The regular old Joe, who perhaps would like Autopilot for some practical use case. It’s too expensive and has so many restrictions (like you have to grab the steering wheel once every few minutes or autopilot disengages and then you get banned from it even if you pay) that it doesn’t make sense for him to buy this tech anyway. In essence, this is not tech that is being used by regular people who have a chance of misusing it. Ever since the Uber self driving crash, heads roll if a self driving car crashes, and as an engineer I don’t get any access to Tesla legal, but it’s my understanding that in none of the cases filed against Tesla did they prove Autopilot was active (forget Autopilot being the cause).
So yes, we’re not building new tech that’s killing hundreds, no one has any ethical dilemmas here. We’re building tech that a passionate group of users really want to see succeed and help us do that, and the rest of the users just give us money for some reason even though they don’t really use it or trust it. I would frankly be more torn about working in a place like Waymo where the user has almost no control over the car (they don’t even sit at the steering wheel), and they have to solve the problem in one shot before releasing it to the public while Tesla can keep iterating step by step (with its passionate user base supporting us and showing love all the way).
A cynical person might remark upon the fact that your being a Tesla employee might have some bearing on your position, but I am not such a person.
> In essence, this is not tech that is being used by regular people who have a chance of misusing it.
Now this is true in one sense – people who can't afford a Tesla and then aren't willing to spend an additional $15k on a piece of software, which has (many, many, many) times been described in a highly optimistic to the point of not having a very strict correlation with material reality way by the company's CEO, cannot use the software to drive in a car – and very false in another: _what if someone else's Tesla crashes into me_?
> while Tesla can keep iterating step by step
I could be wrong (this is a genuine statement, please don't take it as a passive aggressive one, it's not intended that way) but doesn't this rely on Tesla first finding a failure, then diagnosing a symptom, writing a fix, etc.? The fact is though that this initial failure might be one of several crashes which have occurred in a Tesla on Autopilot, which isn't great?
PS., I have left Tesla but sure I might be biased since I have friends there and worked there for a while.
> I could be wrong (this is a genuine statement, please don't take it as a passive aggressive one, it's not intended that way) but doesn't this rely on Tesla first finding a failure, then diagnosing a symptom, writing a fix, etc.? The fact is though that this initial failure might be one of several crashes which have occurred in a Tesla on Autopilot, which isn't great?
Failures are generally user disengagements, not crashes. We measure user disengagements, classify them and try to drive the egregious ones to zero. FSD has had one major crash with no injuries, which is being investigated by NHTSA, and a few minor bumps (I went into more detail below).
> _what if someone else's Tesla crashes into me_?
I think that is a very fair point. It happened when an Uber self driving car crashed and killed a pedestrian, which was a major incident in this industry. The problem with DL models is that they are unexplainable and we cannot tell when they will fail (though in the Uber case it was not exactly a DL model failing). Tesla took this risk and has managed fine with no injuries to date. And now the main reason I made this post: the tech keeps getting better. We have this model from Meta that literally segments everything in an image (even ones you take with your phone). It honestly feels like we are leaving the risky DL territory and reaching the "we can't understand how but it just works" territory, where you can rely on deep learning to do what you expect it to do.
Go on Pornhub right now (if you're not at work) and search for sex in tesla. You'll find people driving Teslas on public roads while having sex in the driver's seat. The guy touches the steering wheel every so often to keep the car autopilot activated. The videos have been posted over a span of years. Another one was literally posted yesterday. Combined they have 10s of millions of views.
Is this what you mean when you speak of "passion" in Tesla owners? When will these Tesla owners be banned for life? Would you be okay with these people fucking in their Tesla as it drives around your town in broad daylight? Around you and your family?
> I find the hate towards autopilot to always stem from non-users and bystanders.
There are two other categories:
3: People who have been injured, maimed, or killed in Teslas, and the people who knew and loved them. This applies to me.
4: People who are passionate about robotics, and are disgusted at how Tesla in particular and Elon Musk especially are responsible for eroding public safety in the name of profits and market dominance. This also applies to me.
I appreciate that some people who own Teslas are very "passionate" about the expensive toys they have bought. Toddlers are also just as passionate about their material world. But I really don't care how much they love their cars; what I care about is that people I know are dead and my field is a joke.
> And then we have bystanders like you with no understanding of what’s actually going on, who want to ban everybody from using Autopilot because you think drunk people are using Autopilot or something.
I have a Ph.D. in computer engineering focused on robotics. My dissertation was on dynamic autonomous control. I've built many autonomous vehicles in my time, including cars, forklifts, boats, airplanes, and wheelchairs. I teach graduate students at a top international university. I've worked at and consulted on robotics at top corporations you've heard of. Sorry if you mistook me for a bystander with no understanding of what's going on.
What I want is my community to be safe. What I want is for professionals in my field to take safety seriously, and not release half baked admittedly beta quality software into the wild. That's the craziest part of all of this -- you and Elon and Tesla and all the passionate owners don't even contest that the software and hardware are not ready for the task. We had established protocols for testing autonomous vehicles in public areas in 2007 during the DARPA Urban Challenge. Those protocols were designed to keep people safe, and they did. No one died. Tesla threw those protocols out the window, and guess what, people died. This is not saying that autopilot should be banned forever. It's saying you shouldn't move fast and break things, because sometimes those things are people, and sometimes those things are established safety protocols that are there for a reason.
> (Needless to say Tesla will never hit a mannequin or anything like it ever again as 100s of videos by our passionate users have shown. I’ve also seen it avoid a teddy bear on a roller chair that was in the middle of the road for some reason, something definitely not in our training set)
Is it needless to say? Because in 2016 a man was decapitated due to his AP system failing to sense an obstacle, and then it happened again to a second man in 2019 on a newer model, with the same failure mode.
Why didn't Tesla fix this beta-level bug in 3 years? Is it fixed today? If not, how long until Tesla AP kills another person? If you had tested your hardware and software in a lab more, would those two people still be alive? Would their families still be whole? Have you reflected on this at all?
Edit: I've responded to your original comment, but it seems you've heavily edited it after the fact. Although, your choice of words about "heads rolling" has incensed me to a degree that I cannot continue this discussion civilly, and have already gone too far. I'm not deleting this because I'd rather get it off my chest. I get that you're not responsible for those deaths, but Elon Musk is from my point of view, and also the general "move fast and break things" attitude is as well.
No idea why you had to tout out that you had a PhD when I was pointing out that you were not a user. Well news flash, I did my PhD in EECS with a focus on Perception and Autonomous Vehicle Applications from a top 4 CS school and so did most people in Autopilot (Most are from Stanford, Berkeley or CMU in Tesla AP). In fact I worked with people who were a part of the 2007 DARPA Urban Challenge who are in Autopilot now, and I can assure you they think as deeply about safety as you claim you are. (I did not want to default to an argument from authority, but you made it that.)
I do in fact sleep soundly, knowing I don't call engineers murderers on online forums. The 2019 and 2016 cases you're talking about involved Autopilot, not FSD Beta (in the 2019 crash it was engaged just 10 seconds before the crash). Autopilot is glorified cruise control: it maintains a distance to the car in front of you, does not avoid obstacles, and is not meant to brake unless there are exceptional circumstances, or to do anything really beyond following the lane. Independent testing by both NHTSA and Euro NCAP has given Tesla Autopilot the highest safety rating recorded by any car, so it is the safest cruise control among competitors, but it is a cruise control where crashes happen. It's not FSD.
There has been only 1 case involving FSD Beta under investigation by NHTSA. It occurred in 2021 and is going on in the courts. I do expect Tesla to win that case, but let's see. That case involved damage to the car and no injuries or deaths reported.
Also I edited the comment to make it smaller, I did not see your reply before editing my comment. I can revert it if you wish.
I've calmed down after your "heads will roll" comment so I promise to be civil. To continue. Yes, you should put back your original comment to reduce confusion. I'm fine reading a long comment.
> No idea why you had to tout out that you had a PhD when I was pointing out that you were not a user.
You said:
> And then we have bystanders like you with no understanding of what’s actually going on, who want to ban everybody from using Autopilot because you think drunk people are using Autopilot or something.
Which I took to mean you assumed I was a nonpractitioner who didn't know what I was talking about. I know you work at Tesla, so I know your credentials. Since you seemed so eager to dismiss me as a "bystander", my only point in telling you about my background wasn't to threaten you with it or to assert an unquestionable authority, but to inform you that I have the necessary experience and education to fully understand all the complexities you think are beyond my comprehension. I am not a "bystander", and while I do not own a Tesla (because of course I don't), that doesn't set you up to dismiss my point of view as uninformed. You can communicate to me as a peer, not a "bystander".
> (Most are from Stanford, Berkeley or CMU in Tesla AP)
Absolutely, I've gone to school with some of those people. I've been taught by some of those people. I've also taught some of the people you work with. I don't know who you are, but I know you know better, or at least your colleagues do. Which is why this is so especially painful for me.
> I can assure you they think as deeply about safety as you claim you are.
Do they though? Because... again, you're building a product that allows people to drive around town fucking in their car. I notice you didn't address that at all in your reply. What does your team have to say about that, and when will this be banned? Why did your team release "beta" quality hardware and software onto public streets? Why wasn't the public consulted?
By the tone of your earlier comment about drunk drivers, it seems to imply that you think drunk people driving Teslas is beyond the pale. And yet, what do you say about people fucking in their Teslas? That's happening. Isn't that just as dangerous as drunk driving, if not more so?
It's really easy to say you're thinking deeply about these things, but that seems to be as far as the consideration goes when looking at how your product is being used in reality. You didn't think enough about it that you realized the camera sensors on the AP system would be overwhelmed by a bright white obstruction, and it would cause the AP system to run into it at full speed. You didn't think enough about it to have robust sensing to overcome a single sensor being overwhelmed. Yet you shipped that to the public, and then someone died due to the lack of consideration of that failure mode by your company.
To me, it seems like your company's decisions are based purely on maintaining a competitive edge by being a market leader and aggressively pushing unfinished products onto the general and unsuspecting public. Can you please outline the ethical framework you used to arrive at this decision? Please don't tell me it was "We need this as fast as possible to save as many lives in the future, short term casualties are a necessary evil for the greater good."
> I do in fact sleep soundly, knowing I don't call engineers murderers on online forums.
Thanks for confirming my expectations. I figured as much. I figured you had no problem with the fact that your product decapitated someone, and then your company did literally nothing about it for 3 years, leading to it happening again. For the sake of your sleep, I'm glad you're able to rationalize these decapitations as "it was just a glorified cruise control" as if it's the user's fault. Maybe the first time. But the second one is on your company.
But to be clear I didn't call you personally a murderer; I literally said you're not responsible. But the thing you built is directly responsible. Your team is responsible for releasing these things into communities, which you yourself admit are beta quality. That's something you chose to do. Engineers must be held accountable when the things they engineer hurt people, otherwise they will engineer things that hurt people.
If you haven't realized it yet, this whole feeling I have isn't really about you but your company, so don't take what I'm saying personally, unless in fact you do feel your personal work had contributed to these people's deaths.
Again, I did in fact open Pornhub when you pointed it out, and that’s Tesla Autopilot, not FSD. I do find it beyond the pale that drunk people use FSD, because there are very strict controls for people using FSD. Autopilot is advanced cruise control, apart from regulatory authorities saying it’s safe compared to competitors, Tesla releases quarterly safety reports since 2019 counting the incidents with Autopilot. It is shown to be safer than humans behind the wheel by quite a margin. Critics will point out that Tesla AP users are older and most accidents are caused by teenagers, that AP is used on freeways while accidents are more common on city streets, etc., but in my view that’s still fine. It shows even the AP users are reasonably responsible.
The real risk comes when FSD, which takes complete control over your car, crashes, and it is even worse if it crashes while the user is trying to prevent a crash. I don’t think the latter is possible, because the software cedes control to the user instantaneously. There is a risk of users being too careless with FSD, which is why we put quite some effort into getting rid of such users quickly. As I mentioned there are no deaths with FSD, just one case in 2021 being investigated by NHTSA.
> Autopilot is advanced cruise control, apart from regulatory authorities saying it’s safe compared to competitors, Tesla releases quarterly safety reports since 2019 counting the incidents with Autopilot.
So what did you do to fix the issue? Why did a second person die in the same exact way as the first person after 3 years? Were you working on the fix at all? Or did you do nothing? Just admit it if you did nothing in response to that decapitation.
You say "It's Autopilot, not FSD" as if that absolves Tesla of anything. Your marketing does not change my opinion of your technologies. It doesn't change that FSD and AP both have glaring technological flaws, doesn't change the fact that FSD is beta-quality hardware and software being tested on the general public, something the public did not agree to. It doesn't change the fact that even though they didn't agree to it, you unilaterally decided it was okay to conduct a beta test involving us. That's a huge ethical problem, and the fact you don't even see it as such blows a hole in your insistence that your colleagues take safety seriously.
It doesn't matter. I understand why you think it absolves you; because you feel that the technology is similar enough to others out there, that it's just an incremental step, and so how can Tesla be held responsible when people make a career out of fucking in their car using that technology? How can Tesla be held responsible when multiple people lose their heads due to poor choices in sensor design? Correct me if I'm off base. Why is the AP/FSD distinction so important to you?
How do you not realize it's your entire company's fault there was no other orthogonal sensor to see the tractor trailer? How do you not realize it's your entire company's fault people out there feel safe enough to use your product to watch Harry Potter or fuck while flying down the highway, using what you call "glorified cruise control"? That's what you say it is to me, your peer, but to them you've said it was "AutoPilot (TM)". Why didn't you call it "Glorified Cruise Control" or just "Cruise Control"?
That's not on them, that's on you for unleashing this technology on us. People are always going to watch movies and fuck. That they're doing so in your beta-quality robot menace to society is not their fault. Your company specifically conditioned them to think it was okay to do this in a Tesla.
> It is shown to be safer than humans behind the wheel by quite a margin.
The passive voice is doing a lot of work here.
> There is a risk of users being too careless with FSD, which is why we put quite some effort into getting rid of such users quickly.
Again, this just goes to show how Tesla, in fact, is not concerned about safety and security, but instead is laser focused on market dominance and pushing technology on us as fast as possible. Tesla is a leap before you look, shoot first and ask questions later kind of company. Or as I said, move fast and break things (or in this case, "put quite some effort" to patch them up after the fact).
This is a brand new technology and Tesla is rushing it out to the public as fast as humanly possible, selling it in beta quality before the technology and software is even ready. And you're telling me now that you screen users for bad behavior and ban them after the fact. It just goes to show you're treating this as some grand social experiment you feel you have the right to run on the rest of us.
Yes people are going to watch movies in their cars. They're going to fuck. Yes they will be drunk and asleep. The problem is that you don't seem to care that your customers are using your products to do all these things, and it's over the course of years.
I feel like I’m repeating myself in response to your constant emotional attacks, which try to make me feel guilty for the 2016/2019 cases, so let me try to end it with this: you’re right about one thing. AP is similar to other tech out there; it is cruise control with some extra features. If you think it is ethically wrong to put AP in public hands, then every car since at least 2005 is a moral hazard. If you believe that, fine; I don’t. Cruise control is regulated, Tesla passed them with the highest ratings, and it has released safety reports since 2019, as I already said. As for why the distinction is so important: most of Tesla AP doesn’t even use new tech. It has very little deep learning (just some CV stuff), AP still uses radar like every other car manufacturer out there, and AP has a very simple state space planner, which most cars use now, though most Toyotas use older PID tech.
FSD beta is different; FSD is filled with new DL tech. DL models are black boxes that work surprisingly well but are not interpretable. They can suddenly output something nonsensical (like Bing Chat did; ChatGPT surprisingly hasn’t) and you won’t even know why. There is a risk involved with putting DL-based FSD out there, because you don’t know when it will fail. Tesla took that risk. To date, however, Tesla has had no FSD crashes that involved injuries, 1 crash in which the front of a Tesla was significantly damaged (which is being investigated by NHTSA, as I already said), and several smaller collisions that have caused scratches on Tesla cars (at which point we promptly ban that user for life; you can see YouTube videos of this). Uber’s self-driving car killed a pedestrian (though the paid QA driver should have been paying attention, so it was not really the Uber engineers’ fault). Tesla actually handled the risk of using DL tech pretty well. It was a real risk, we still have no injuries, and the tech keeps getting better. So yes, your tiresome moral attacks don’t affect me, and I prolly won’t respond again if I just have to repeat myself.
> AP is similar to other tech out there, it is cruise control with some extra features
What concerns me is that even if this is technically true (I have no reason to doubt you, so I'll assume it's correct), it is not marketed this way. First the fact that it's called "AutoPilot" rather than something like cruise control or super cruise or lane assist, etc, or whatever other car manufacturers call their systems and second the misleading statements made by Tesla executives about how FSD is "imminent" and will be available soon.
Yea people are going to hate Tesla because of musk or bc they’re always in the news. I’m not saying that there isn’t valid criticism it just seems that they’re held to a totally different standard than other companies. People really enjoy taking a moral stance against them like you’re doing but it really seems like an overreaction.
As far as I can tell, Tesla was one of the first, if not the first, to start marketing its autonomy features as "Auto Pilot" and then subsequently "Full Self Driving (Beta)". Tesla is also unique in trying to implement these technologies using sensors that are not state of the art, a choice that was both cost cutting (lasers are too expensive for consumer cars) and naïve (at the time, and still today, vision technology cannot replace a robust suite of sensors).
Nonetheless they released this beta-technology into the public, without consent. It's caused loss of life, property damage, and on top of it all it's also fraud, because it hasn't even delivered on the promise of full self driving after many years of promises.
I don't think calling them out for this is an overreaction.
I have a robot vacuum cleaner with "AI" computer vision, and you can just tell that it doesn't see the world in the way a living being does. It'll detect and avoid shoes, cables and dog turds, but is completely blind to anything not in its specific library of objects (we have a two year old, it chokes all the time on things like ribbons and toys).
Being able to recognise objects that cause it to choke, even if it doesn't know what they're called in English, is so much more significant to robotics than being able to label things.
It's 2023 and Tesla still can't reliably read traffic lights. Sounds like you know vision has a long way to go but I'd expect in the future it'll be harder to make that case. Maybe that's all that's meant.
You should give them more credit than that, but even if you don't you'd be surprised how bad even companies like Cruise and Waymo can be at this kind of thing.
I feel like the whole ChatGPT bubble has really highlighted what feel like some fundamental shortcomings in the worldview that is represented in tech. That is, there seems to be a winner take all dynamic baked into the tech world. Maybe this arises from the simple fact that tech places a lot of power in the hands of individuals. But there's an emergent downside to this which is that it makes those who were already much more powerful even more so. Because who is best suited to take advantage of all the power but those who were already perched up above everyone else and permitted to pick and choose their opportunities?
I've found it exceptionally hard to stay positive about all of this. It almost feels as though the advent of LLMs has shined light on a fundamental law of the universe that does not work out in the little person's favor. It's like survival of the fittest on steroids. Guys, what the heck are we doing??
At some point, I became disenchanted with tech and working as a software engineer when I realized that tech doesn't ethically fit into our primitive economic organization right now. It's made me feel bad about myself and the career path I've chosen.
Basically, with tech, you can build a product with a very limited number of people and resources and a potentially infinite customer base. And this means that this product can suck up money from the whole world and redirect it into the hands of very few. At its essence, this is the case. With tech, this phenomenon is exacerbated to the extreme compared to other industries where more physical resources and labor are required to scale production and distribution of the product, which means more wealth getting spread.
At its core, tech has the potential to exacerbate wealth inequality in mind-boggling proportions.
I think ultimately tech will result in humans being like the rest of animals - completely unable to participate in the economy. This will only start to happen once AGI is reached though which might only be a decade away or less.
I think you're right, we're going to need to completely rethink our economy to decouple it from the idea of human effort (unfortunately, I can't see how that's going to happen without everything first falling apart).
A single barrel of oil is estimated to replace about 5-10 years of a human's effort in pure energy terms. Our human ability in physical terms was already significantly obsoleted by that.
AI feels like the discovery of cheap oil but for mental effort.
At the same time, oil is getting more and more expensive and energy intensive to drill (and costly to the environment to use).
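For anyone who wants to sanity-check the "5-10 years" figure above, here is a rough back-of-the-envelope sketch. Every number in it is my own assumption, not something from the comment: roughly 1,700 kWh of thermal energy per barrel (about 6.1 GJ), and a laborer sustaining about 100 W of mechanical output for 8 hours a day, 250 days a year. It also ignores engine conversion losses, so treat it as an upper-end illustration rather than a measurement.

```python
# Rough estimate: how many years of human physical labor match one barrel of oil?
# All constants are assumptions chosen for illustration.

BARREL_KWH = 1700.0  # assumed thermal energy per barrel of crude (~6.1 GJ)


def human_years_per_barrel(power_watts=100.0, hours_per_day=8.0, days_per_year=250.0):
    """Return years of human labor with the same raw energy as one barrel."""
    # Energy one person delivers in a working year, in kWh.
    kwh_per_year = power_watts / 1000.0 * hours_per_day * days_per_year
    return BARREL_KWH / kwh_per_year


if __name__ == "__main__":
    print(round(human_years_per_barrel(), 1))
```

With these assumptions the result lands inside the quoted range; a lower sustained output (say 75 W) pushes it past ten years, so the range is sensitive to what you assume about the worker.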
I suspect there will be essentially three different economies: digital, human-experience, and hybrid.
AGI will focus primarily on problems that can be mostly solved within digital worlds where it can cycle incredibly fast: building software, finance, paperwork, etc. It will eventually evolve to design, control, and operate physical systems, but will need humans to intervene and fix systems (hybrid).
The human economy will be built on experiences that are explicitly and intentionally void of computers: restaurants, guided tours, exploration, etc.
Yes I could see that happening, assuming the AIs don’t manage to outwit humans and subvert power and take over.
However, even if humans remain dominant (political power-wise), eventually there will probably be fake humans/cyborgs, like Cylons in Battlestar Galactica, that can do everything humans can but better, and also provide human-level experiences. Impossible to know how long this will take though.
If something makes everything take half as much effort but removes your ability to earn an income (or slashes it more than twice) then it decreases your standard of living despite substantially increasing it for many others.
You are assuming everything else stays static. Maybe if this happens, the ability to do everything in half as much time will increase competition in a lot of areas, bringing prices down and increasing your purchasing power.
That’s not what’s happening in the US. Too much concentration of wealth (not just tech), leading to almost 3rd world country health outcomes, education levels plummeting and unaffordability to live.
"Tech might bring more inequality, but what is the problem with it if it substantially increases everyones standard of living?"
The problem is that ultimately, wealth inequality decreases the standard of living for most people. There are a finite amount of resources. This world we live in is finite. And these resources are being concentrated at the top, in the hands of a few, which leads to less and less of these finite resources being available for more and more people. Although economics is of course complex to some degree, don't lose sight of this very simple mechanism because it is a reality.
"I would argue that is caused by regulatory capture and government interference (e.g. zoning laws) and not technological progress"
If by regulatory capture, you are referring to regulatory corruption influenced by lobbying, then yes, of course that is increasing wealth inequality, and thus decreasing standard of living. Regarding government interference - it's the exact opposite of what you say - proper government interference is what we need. Under current conditions our economic system is not distributing wealth properly. That much is evident and easy to see. And there will be worse living conditions under more stark wealth inequality as opposed to a more even distribution of resources. As mentioned, this world we live in is finite. If these finite resources are syphoned to the hands of very few individuals, this leaves less of these finite resources for the rest of human beings. And as mentioned, tech has contributed to accelerating this, for the reasons mentioned in my above post. So, as mentioned, what we need is government interference. This means taxing heavily the areas that the finite resources I have spoken of are getting concentrated in and redistributing them more evenly.
I do not believe that standard of living should be compared in a vacuum. Since I assume that this is what you are referring to, I am firmly convinced that happiness is relative.
What this means is that improving the standard of living of the average person by 5% while improving it by 50% for a smaller population will hurt more than it helps.
Corporations make profits if their products are worth it to people, true. But in a sufficiently unequal world, what's "worth it" to a rich person matters more in the market than what's "worth it" to a poor person - the rich command more money than the poor. If the money made by corporations accrues only to a few, we get a positive feedback cycle of inequality: corporations cater to the market; the market caters to rich people; and in a highly concentrated market, only a few people get rich off of this. The result is a poor underclass. Society as a whole gets richer, perhaps, but very unevenly. Besides causing political instability, this is also an inefficient allocation of resources - an additional $1000 improves a poor person's life more than it does a rich person's.
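The last point, that an extra $1000 matters more to a poor person than to a rich one, can be made concrete with a toy model. This is only a sketch: the logarithmic utility function is a common textbook assumption, not something the comment above specifies, and the two income figures are made up for illustration.

```python
import math

# Toy model of diminishing marginal utility.
# Assumption: well-being grows with the logarithm of income (textbook choice).


def utility_gain(income, extra=1000.0):
    """Return the increase in log-utility from receiving `extra` dollars."""
    return math.log(income + extra) - math.log(income)


# Hypothetical incomes, chosen only to show the contrast.
poor_gain = utility_gain(20_000.0)
rich_gain = utility_gain(2_000_000.0)

if __name__ == "__main__":
    print(f"poor: {poor_gain:.5f}, rich: {rich_gain:.5f}")
    print(f"ratio: {poor_gain / rich_gain:.1f}x")
```

Under these assumptions the same $1000 is worth close to a hundred times more to the lower earner. A different utility function would change the size of the gap, but not its direction, as long as utility is concave in income.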
This is not so straightforward. I've been using Google for ten years even though it has been getting progressively worse. Inertia is a hell of a thing. Given a lot of choices, people generally choose one or two and stick to them (these choices mostly being determined by things like first-mover advantage, good marketing, etc.).
> there seems to be a winner take all dynamic baked into the tech world.
This has been a pathology in the computer/software industry for a very long time. It's never been actually true except in a couple of special cases, but the industry acts as if it is. That has led to all sorts of bizarre and undesirable things.
> Guys, what the heck are we doing??
I think we're playing with fire and, without extreme caution and careful consideration (which I'm not seeing much of), this could end very, very badly for both the industry and society.
I have always been optimistic about technology and society, but (perhaps like you), my optimism has largely evaporated over the last several weeks. I wish the future didn't look so dark. Perhaps, though, things will look less gloomy with time.
People are perceiving (rightly or wrongly, I'm not debating that in this comment) that this software is powerful enough that it can replace people even in jobs that were always considered safe.
If that perception is accurate, it's really hard to see how this can lead to anywhere that isn't much worse for most people.
If that perception is not accurate, people will still be upset by what looks like an existential threat to them, and if there's economic disruption for any unrelated reason, they will blame AI.
Either way, I'm really struggling to see a good outcome from any of this. I'm not saying ChatGPT is bad, but I think the rollout of it has been done in a way that is incredibly insensitive, reckless and damaging.
> it's really hard to see how this can lead to anywhere that isn't much worse for most people.
It's very easy to see that, actually. Consider the problem statement again: economy is becoming more productive overall because we now have robots that can do some things that previously required people. This means that we can generate more total wealth for the same effort as before. That this translates to people being worse off, somehow, is a problem with the distribution of that generated wealth, not the ability to produce it. Even a sensible implementation of UBI could fix the most immediate problem, and I think we're looking at much bigger shifts in our economic system within the next couple decades.
> That this translates to people being worse off, somehow, is a problem with the distribution of that generated wealth, not the ability to produce it.
Exactly so.
> Even a sensible implementation of UBI could fix the most immediate problem
Which is simply not in the cards. The people who have the wealth are absolutely not going to give it away, because wealth equals power and they want power.
So what is more likely to happen is that the very wealthy will become fewer in number, and much wealthier, and everyone else will become poorer (in absolute, not relative terms) due to the reduction in the amount of jobs available that pay a living wage.
> I think we're looking at much bigger shifts in our economic system within the next couple decades.
Perhaps so -- but you can't ignore the serious harm between now and then.
Many of the people who do have the wealth have been talking about UBI. Not out of the kindness of their heart, but because they understand that a world in which billions of people suddenly become "economically redundant" is also the world in which their mansions will burn, some quite possibly with their owners inside. Now, these guys mostly propose the "bare minimum" kind - effectively, just enough to prevent this scenario. But they can't argue for it without legitimizing the concept in general.
The bigger political problem in the US specifically is just how hostile public opinion is to anything that "stinks" of socialism. There's also the perception that AI is automating away "useless" white collar jobs, and not "real man's work". The right-wing intelligentsia who try to sell UBI to their electorate are having a hard time getting past that, but I don't think it's insurmountable with the right approach, and it's also much less of an issue in most other countries. In any case, I think attitudes will change very quickly once more people outside of tech and art get first-hand experience with job loss, if not for themselves then for someone they know.
The harm between now and then is serious and real, yes. I just don't see any feasible way to prevent it through regulation, not to mention that it would face just as many political hurdles. We have already jumped; let's not waste time trying to flap our arms to see if that works, and try to arrange for a safe landing instead.
That would be nice. I sincerely hope that we can swing that. I just don't see a realistic path for it. And almost everything that I've been hearing the enthusiasts say just makes everything seem more hopeless.
But I'm very much hoping that I'm wrong. At this point, hoping is the only thing I have the power to do.
One thing that I wish more people realized is that if this tech wave does push our societies into this degree of violence - and I believe that this is a distinct possibility - it won't be Altman's mansion that'll burn first. It'll be the houses of software engineers and other "high middle class" occupations in rich suburbs, simply because they are much closer and don't have much security, so they are more accessible. Altman will have the time to take his private jet off to New Zealand or wherever the current bunker is; the rest of us will suffer the immediate consequences. See also: 1917 Russian Revolution.
So, regardless of one's ethics and political philosophy, we have a very strong collective interest as an economic class to prevent this scenario. Most of us don't have any more direct political power than your average voter, but we can throw more money at the politicians. In my experience, this doesn't help one bit with "wedge issues", but this all is so new that nobody has managed to weaponize it across political lines just yet.
Pre-tech, the winner take all dynamic was prevented by physics:
- Customers were only willing to walk so far to go to a bakery
- A store could only stock so many varieties on its shelves
Now the cost of distributing goods across the globe and the incremental cost of creating new goods have shrunk to virtually zero for digital goods. Combined with people's general aversion to the friction of switching providers, this creates a natural "winner take all" effect.
The places where you'll see this NOT happening are areas where those costs are still significant for some reason.
Examples:
- Different ride sharing services are popular in different countries, many of them local ones, since foreign companies had a harder time getting distribution started.
- Legal regulations also have been hindering the spread of payment companies across country-level borders
Without this the gravity is much less powerful. Seems like where to focus if you want to empower people against the winner take all monopoly effect.
Areas where switching costs are low are areas where you have standard APIs and where your data resides locally or is easy to move, or areas that are not data intensive at all.
> That is, there seems to be a winner take all dynamic baked into the tech world.
In the tech world, the cost of R&D/design/code is generally much larger than the ongoing cost of running it. That means that it's extremely expensive to have 1 user, but really cheap to have 1M users.
Thus winner-takes-all makes sense, as the company that is ahead can continue developing/designing at a lower cost per user than a company with no users.
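The arithmetic behind that argument is simple enough to sketch. The model below is my own simplification, not something stated above: one fixed up-front development cost spread across all users, plus a constant marginal cost to serve each user. The dollar amounts are hypothetical.

```python
# Average cost per user under a fixed-cost-plus-marginal-cost model.
# Assumption: development cost is paid once; serving cost is constant per user.


def cost_per_user(fixed_cost, marginal_cost, users):
    """Return the average cost of serving one user."""
    if users <= 0:
        raise ValueError("users must be positive")
    return fixed_cost / users + marginal_cost


if __name__ == "__main__":
    fixed = 10_000_000.0  # hypothetical R&D/design/code budget
    marginal = 1.0        # hypothetical cost to serve one extra user
    for n in (1, 1_000, 1_000_000):
        print(n, cost_per_user(fixed, marginal, n))
```

As the user count grows, the fixed term shrinks toward zero and the average cost approaches the marginal cost, which is why the company that is already ahead can keep spending on development while still being cheaper per user.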
This is literally threatening the winner (Google) taking it all forever. Some of these innovations are winner take all until they aren't anymore. The cycle seems to be 1 to 2 decades. The profits during that cycle are probably fair on average given the risk necessary to make the tech ready for wide-spread usage
> This is literally threatening the winner (Google) taking it all forever
Sure, but look who is over there in the corner next to OpenAI: Microsoft with their $10B stake. The company that was the winner prior to Google being a winner.
You were also twenty years younger, surely youthful naïveté plays a role in how you felt about society. And regardless, your feeling about society doesn't necessarily correspond to how society actually is.
The AI world can be proper dystopian these days. I know someone who accepted a job offer at one of the biggest market research firms in the world. She was making AI models for a high tech company before and was hired to make summarization AIs. Between serving out the (frankly ridiculous) 2 month notice period and joining this new job, GPT-4 was launched and prices came down for 3.5. The original plan for building something internally turned into feeding everything to ChatGPT and then asking it questions. A junior engineer whipped up this system in a couple of weeks. It's all just API calls after all. Now she spends entire days trying to ask the model the right questions so that it can generate the correct reports. Her entire job has been boiled down to talking to an AI. She is working with the most cutting edge technology, yet it's so mundane that everyone is just bored on this project. This could have been an ironically tragic character in a Gibson novel.
This is an outstanding phenomenon, not dystopian. Your friend was going to have to do the incredibly awful and wasteful task of rebuilding an incredibly complicated system.
However, fortunately, a much more optimal solution happened, where somebody built a superior and cheap solution (unfortunately not open yet) and let the whole world leverage it. Now your friend needs to waste very little energy duplicating work, and presumably has free time and energy to think about other problems that aren't solved.
Indeed, a much more optimal way to do things has been found. But that's not what this is about. It's about the human experience. AI is creating value, but the day to day life for a problem solver has become a bit more mundane.
I think what’s really depressing here is just how effective scaling seems to be. It just means that any company that’s not willing to pour hundreds of millions of dollars into their AI programs isn’t serious at all and would probably be better off hiring engineers to figure out how to integrate GPTX into their systems than trying to roll their own. I really think we’re going to see a massive collapse of AI/data science jobs once it becomes clear that no in house model is ever going to be better than the zero shot performance of these mega models.
My understanding is that transformers are now favored over RNNs because they parallelize better.
It's hard to imagine, but I wonder if there are some non-parallelizable machine learning algorithms which might outperform these massive models? It seems improbable, but it's a small hope I've had. The greatest intellects we're aware of (ourselves) do not scale very well, and maybe the same will ultimately apply to AI?
I remember seeing some theoretical analysis that compared computing differences between transformers, LSTMs and RNNs and I think that RNNs are theoretically better (can learn more complex functions). Can't find it now.
At the least, people working on ML models that handle these tasks must be feeling terrible. They know their models will be abandoned sooner or later in favor of something composed on top of an LLM.
I’ve been shown some neat pictures people made that they thought were cool. I don’t know that I need this every day.
I’ve seen examples of “write an email to my boss”. It would take me longer to explain to ChatGPT what I want than to write these myself.
I’ve seen “write a snippet of code” demos. But I hardly care about this compared to designing a good API; or designing software that is testable, extensible, maintainable, and follows reasonable design principles.
In fact, no one in my extended sphere of friends and family has asked me anything about chatGPT, midjourney, or any of these other models. The only people I hear about these models from are other tech people.
I can see that these models are significantly better than anything before, but I can’t see yet the “killer app”. (For comparison, I don’t remember anyone in my orbit predicting search or social networking being killer apps for the internet—but we all expected things like TV and retail sales to book online.)
What am I missing?