Hacker News
Ask HN: Is anyone else bearish on OpenAI?
185 points by soasdfg 11 months ago | 305 comments
The underlying technology of LLMs and Stable Diffusion is an interesting topic that has some useful applications and should be studied more. I just don't see this company (or any others, for that matter) GPTing their way to AGI within our lifetime, or being able to create significant value for investors after the hype fades.

This feels a lot like crypto, where everyone is very excited about a new technology that very few people really understand, and people are jumping on the bandwagon without asking any questions.

It's also very much like crypto in that, for every one person doing something useful with it, there are 20 trying to exploit the newness of the tech and the general public's low comprehension of it, such as:

    - Trying to cash out on a ChatGPT wrapper company

    - Creating the nth "AI powered custom chat bot but for x vertical"

    - Using it to cheat on school assignments or interviews

    - Gluing together as many different "AI" services as possible to create a no touch business and sell low effort products
I'm not saying the company will go bankrupt, but I'm also not buying into the hype that it's going to become the next Google or better, or create AGI for us all.

What am I missing here?




OpenAI, at least in my day-to-day workflow for the last 9+ months, has so superseded anything that Google ever was to me that I'm having a difficult time comparing the two.

I've got a monitor dedicated 100% of the time to ChatGPT, and I interact with it nonstop during the flow of technical scenarios and troubleshooting situations that flow into me - working in areas that I have the slimmest of backgrounds in, and shutting down, root-causing, and remediating issues that have been blocking others.

I've essentially got 15-20 high-priced world-class consultants in every field that I chose to pull from, working at my beck and call, for $20 a month? I would pay $200/month in a heartbeat out of my own pocket, and I probably would ask the company to pay ~$2,000/month for my workflow.

I think if they never released another product, and they just managed to penetrate with their existing offering, they are easily a $100B+ company once they nail down how to monetize.

The difference between LLMs and crypto is that I can point to roughly 200-300 objective solutions over the last 9 months where ChatGPT resolved an issue and delivered clear value for me alone. And over time, as you learn how to control for hallucinations and manage your query patterns a bit more, the value continues to increase.

Those same multiple-times-a-day, high-value, persistent experiences were never a part of my crypto experience.


Are you able to give a specific example of a problem it helped you solve? Especially one where you were at a complete blocking point, it provided some solution, and you were then able to continue to expand upon that solution.

I keep reading responses like yours, but I haven't seen any specific examples of problems being solved, so it all sounds very abstract. In my interactions with ChatGPT, it felt like just interacting with a search engine. There was zero continuity between questions and responses, and nearly 100% of the responses contained incorrect information.

Edit 1: As an example, I just asked ChatGPT to implement the Warren Abstract Machine for me. It gave me two different implementations, both with caveats that they are simple examples and not the whole thing, and neither implementation even type-checked. It feels just like reading someone's homework where they copied off someone else's work and had no idea what was going on. I don't see the point in this if it's just going to give me some high-level idea, which I already have, and an implementation that isn't remotely complete and doesn't even run. "Additionally, you'd need to ... handle backtracking". You don't say, ChatGPT?

Edit 2: I've kept on asking it to implement things I already know how to implement, even things for which it probably has my code in its training data (given it was on GitHub), and it keeps giving me code that doesn't even type-check. ChatGPT is just doing what I've always imagined it doing: a statistical merge. It has zero concept of anything I'm asking or anything it's saying.

Edit 3: I asked it something regarding WebGPU. It gave the typical "oh, that's complex" and "Here's a simplified example using GLFW and WebGPU through the F# WebGPU bindings provided by FableGL". But FableGL isn't a thing that exists, and even if it did, a GL binding wouldn't have anything to do with WebGPU, which has nothing to do with OpenGL. And it imported Fable.Import.WebGPU and Fable.Import.GLFW, neither of which exist.

I mean, this is literally all smoke and mirrors. It boggles my mind when I hear people say they successfully use it every day. I haven't ever gotten it to tell me anything remotely correct.


I think this speaks to the bubble you're in and how your kind of software development probably represents 0.1% of what's out there. I don't even know what WebGPU, the Warren Abstract Machine, or whatever it is you're trying to do, are.

I've been doing dev for ~20 years now; most of the work out there is essentially plumbing events/databases/REST APIs/regexes/JSON/configs/web UIs in existing and well-established frameworks/tools to solve business problems.

You likely work in more R&D-style environments, which, again, probably represent a very low % of devs out there.


The parent comment claimed they had the value of 15-20 high-priced, world-class consultants in every field, but this makes it seem like it only does the most basic, common tasks.


I've never met a consultant who seemed useful, and I've met many from the Big 4, so in that regard GPT measures up.


Agreed. I asked it things I actually need help with, not relatively mundane tasks that are easily accomplished by reading documentation. And the fact that it answers so confidently, and yet so wrongly, concerns me.


> I think this speaks to the bubble you're in and how your kind of software development probably represents 0.1% of what's out there.

Grandparent specifically said they find ChatGPT useful for things outside their wheelhouse:

>> [...]troubleshooting situations that flow into me - working in areas that I have the slimmest of backgrounds in

It sounds to me like it may not be safe to use ChatGPT if you can't check its work.


Not the original poster, but here are some examples:

    - Paste a bunch of log output about which I have no idea. Ask it to identify and explain the problem.

    - Paste Wireshark / dmesg / OpenWrt output or configuration, and ask it to fix the problem. For instance, I fixed a Wi-Fi issue in a heterogeneous setup, which turned out to be caused by a stray DHCPv6 server.

- Paste C code, along with an error log, and ask to fix the problem.

- Paste my program and a sample. Ask to extend my program.

- Proofread and format Markdown nicely.

- Paste government letters, asking for a response that includes <what I want>.

- Paste a chat log and obtain documentation.

- Paste a tax declaration, and ask to check for consistency.

- Paste my code and ask for critique.

When discussing versions, people often confuse versions 3.5 and 4. I am always referring to version 4.

Keep in mind that ChatGPT has been seriously and intentionally downgraded since March.

This debate frequently leaves me wondering if I'm encountering a coordinated effort by bots. The examples I listed above come very naturally to me. I can't understand why people don't try to paste whatever they're working on and check the results. If it's too complex, ask it to critique instead of create, because that's easier. It feels as though there's an effort to shape public opinion into viewing these tools as "immature" and suitable only for edge cases.


Yeah, I think the people who haven't found value are people who either don't have strong communication skills hence cannot articulate what they need, or work in very specialized fields. Either way, they're clearly the minority, given how many paying users ChatGPT has and how fast it got there.


> don't have strong communication skills hence cannot articulate

Basically this - even in a highly specialised field you would still find value in it if you're providing enough context and describing the problem properly.

People don't want to hear that they're poor communicators - it's much easier to blame the tech.


> People don't want to hear that they're poor communicators - it's much easier to blame the tech.

It's just hard to even know what to say to this. So you are implying that the problem lies with the human, who holds years of experience in education, problem solving, communication, etc., and not with the "AI", which has no experience in any of that and has no real thought processes, problem-solving, or reasoning abilities? And people say ChatGPT is over-hyped!


Not OP, but yes.

I've had some chat logs where people are not able to articulate even one clear sentence. They were fine telling me what they wanted, but not able to type the same sentence.


> Yeah, I think the people who haven't found value are people who either don't have strong communication skills hence cannot articulate what they need, or work in very specialized fields

This is so weird and seems to be a common thread. You, and others, seem to be saying that ChatGPT is not useful if you're too dumb or too smart. It just seems both tautological and strange.

My communication skills are fine. ChatGPT is just not useful for me, period. I've tried taking what people have mentioned here and having a so-called dialogue with it when its suggestions, even for much simpler things, aren't quite correct. It keeps suggesting all these things, but it just feels like an interactive StackOverflow, except all the suggestions are wrong or don't work.


If you really wanted to improve, you could share a link to a discussion, and I could give you feedback.

I am switching to teaching/consulting in AI. Chatbots are a part of that. I've reviewed some chat logs, and some people get abysmal results, which are entirely explainable and are 100% their fault.


> Keep in mind that ChatGPT has been seriously and intentionally downgraded since March

I found it getting worse for a bit, but lately it's been really good. I would say it's being constantly updated, not necessarily getting worse.


It can't solve problems it was able to solve before.

It's not able to output complete code.

It always adds comments, even when I ask it not to.

It doesn't follow instructions in general.

On complex issues it has less "memory"; that is, it's not able to observe multiple constraints at the same time. The limit was much higher before.


I just got it to write some code without any comments.

Sometimes you have to ask it in very specific ways to achieve what you're wanting - it is definitely limited in some areas though.

https://chat.openai.com/share/64dc6384-1fc4-4506-9280-98dc8a...


Thanks I'll try it out


Can you paste a transcript for any of those examples here? I’m genuinely curious. I’ve never seen a single actual transcript from anyone talking about how they use ChatGPT all the time for programming work and it’s great.


Sorry, I don't have anything non-confidential right now.

Just take any code that's < 3 pages long. Ask it to optimize a function, or to review the code. It should give good results.

EDIT: I had some deprecation warnings in Python recently, which it quickly fixed. I googled that on a different computer and found https://stackoverflow.com/questions/76689364/how-to-fix-depr...

This is the transcript. Forgive my dumb last question; I didn't understand the code, but what it did worked well.

---

User: Please fix the deprecation warning on this code: for i in range(0,3): df.loc[df['date'] < "2023-06-01", 'date'] = df['date'] + pd.DateOffset(years=1)

ChatGPT: The code you've provided updates the 'date' column in a DataFrame df by adding one year to each date that is earlier than June 1, 2023. However, this operation is performed multiple times due to the loop, which is likely not intended and could be the source of inefficiency or unexpected results.

To fix this, you should update the dates in a vectorized way without using a loop. Here's the corrected code:

  import pandas as pd
  # Sample DataFrame creation for demonstration (you should use your own DataFrame 'df')
  # df = pd.DataFrame({'date': pd.to_datetime(['2022-05-01', '2023-07-01', '2024-08-01'])})

  # Correcting the date manipulation
  df.loc[df['date'] < "2023-06-01", 'date'] = df.loc[df['date'] < "2023-06-01", 'date'] + pd.DateOffset(years=1)

*This code will effectively update all the dates that are earlier than June 1, 2023, by adding one year to them, and it does this in a single pass without a loop. This should also avoid any deprecation warnings and improve the efficiency of your code.*
User: give me the code in one code block. first the old version, then the improved version. Make sure it's executable on its own

ChatGPT: Certainly! Below is the code block with both the original version and the improved version. The improved version eliminates the unnecessary loop and applies the date modification in a more efficient manner.

  import pandas as pd

  # Original code
  df = pd.DataFrame({'date': pd.to_datetime(['2022-05-01', '2023-07-01', '2024-08-01'])})
  for i in range(0,3):
      df.loc[df['date'] < "2023-06-01", 'date'] = df['date'] + pd.DateOffset(years=1)

  # Print the result of the original code
  print("Original Code Result:")
  print(df)

  # Improved code
  df = pd.DataFrame({'date': pd.to_datetime(['2022-05-01', '2023-07-01', '2024-08-01'])})
  df.loc[df['date'] < "2023-06-01", 'date'] = df.loc[df['date'] < "2023-06-01", 'date'] + pd.DateOffset(years=1)

  # Print the result of the improved code
  print("\nImproved Code Result:")
  print(df)

*This code first runs the original version, prints the result, then runs the improved version, and prints that result for comparison. Note that the DataFrame df is reinitialized before running the improved version to ensure a fair comparison.*

User: is it really the same?

Code:

  >>> import pandas as pd
  >>> # Original code
  >>> df = pd.DataFrame({'date': pd.to_datetime(['2022-05-01', '2023-07-01', '2024-08-01'])})
  >>> for i in range(0,3):
  ...     df.loc[df['date'] < "2023-06-01", 'date'] = df['date'] + pd.DateOffset(years=1)
  ...
  >>> # Print the result of the original code
  >>> print("Original Code Result:")
  Original Code Result:
  >>> print(df)
          date
  0 2024-05-01
  1 2023-07-01
  2 2024-08-01
  >>>
  >>> # Improved code
  >>> df = pd.DataFrame({'date': pd.to_datetime(['2022-05-01', '2023-07-01', '2024-08-01'])})
  >>> df.loc[df['date'] < "2023-06-01", 'date'] = df.loc[df['date'] < "2023-06-01", 'date'] + pd.DateOffset(years=1)
  >>>
  >>> # Print the result of the improved code
  >>> print("\nImproved Code Result:")
  Improved Code Result:
  >>> print(df)
          date
  0 2023-05-01
  1 2023-07-01
  2 2024-08-01

ChatGPT: You are correct. The two code versions do not produce the same result. The original code with the loop increments the date by one year for each iteration of the loop, but only for dates before "2023-06-01". This results in multiple increments for dates that are initially before this threshold.

In contrast, the improved code without the loop increments the date by one year only once, making it more efficient and potentially more aligned with what one might expect logically. However, if the intent of the original code was indeed to increment certain dates multiple times, the "improved" version would not be equivalent.

ChatGPT: If the goal is to replicate the logic of the original code but without the deprecation warning and with improved efficiency, we need to understand the exact behavior intended with the loop. If the intent was to increment dates before "2023-06-01" by multiple years based on some condition, the improved version should be adjusted accordingly.

---


Thanks! I’ll give this interaction a try.


> This debate frequently leaves me wondering if I'm encountering a coordinated effort by bots.

Stay classy. The rest of your comment would have been much better without this ad hominem.


It wasn't meant as an ad hominem; I am really wondering. I feel like I'm in a bubble.

I once spent an hour researching the responses on Reddit and Hacker News. It turned out that if these profiles are fake, then they are really good. E.g., the accounts are >5 years old, they argue about sports teams, and they seem really human.


Yea, I tried using it like that and got a bunch of "you're using it wrong" responses.


[flagged]


This comment sounds a lot like "Have fun staying poor", as applied to AI rather than crypto.


Are you using GPT-3.5 or GPT-4? A hallucination like FableGL sounds like 3.5, which is not something you should base your expectations of the tech on.

Also, although I haven't seen it before, the Warren Abstract Machine seems like too big of a job for GPT-4. It excels at smaller tasks such as "Convert this C code to idiomatic Rust", or "Write a SQL query that grabs records from X, joins with Y, filters by Z". You might need to make small adjustments manually or by saying "Rewrite it with this in mind".

One really neat trick recently: I uploaded a chart PNG I found on the internet and asked it to write Seaborn code that matched the chart's style. It took ~3 follow-up tweak prompts, but then it spat out code that worked. It also handles things like "How do I adjust this label that's positioned in chart coordinates over by 10 pixels?", which is an absolute pain to figure out from the docs.

Especially since you're getting answers noting the complexity, I think you're just asking too much of it for now. Try smaller tasks, and wait a while before trying the big ones again.


Funny you'd pick the SQL query example: do you realize you could describe your exact query to the engine itself in more precise and shorter terms? Writing "select X from Y where Z join XX" seems much more practical and rational than using the whole sentence "Write a SQL query that grabs records from X, joins with Y, filters by Z" to achieve the same thing.


> A hallucination like FableGL sounds like 3.5, which is not something you should base your expectations of the tech on.

So in order to properly evaluate a service, we need to already be subscribed to it in the first place?


Yes. Or just use the API and pay for usage (though in that case it could also get more complicated, since you have full control).


I'm using ChatGPT 4 (paid), and it has solved a few problems for me. (It sounds like you're using the free 3.5 and also having some pretty unrealistic expectations of ChatGPT, like asking for a full implementation of a WAM.)

A mundane one: I have a list of lat-longs mapped to region codes, and I needed to write a query to find the area of the convex hull of each region code. I knew how to do it in code, but I wanted a SQL answer. ChatGPT gave me the correct Redshift SQL.
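
(For anyone curious, the equivalent computation is only a couple of lines in Python with shapely; a rough sketch with toy data, not my actual query:)

  # rough sketch: convex hull area per region code with shapely
  # (toy coordinates; real lat/longs need projecting before areas mean anything)
  from shapely.geometry import MultiPoint

  regions = {"r1": [(0, 0), (0, 1), (1, 0), (1, 1)]}
  for code, pts in regions.items():
      hull = MultiPoint(pts).convex_hull
      print(code, hull.area)  # r1 1.0 for the unit square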

A more involved one: I needed to optimally match x people to y groups, where people can rank up to 3 choices, but as soon as they accept a match, they drop out of the pool. Plus a bunch of other constraints. Sounds like a stable marriage problem. ChatGPT proposed the Gale-Shapley algorithm, which was exactly the one I was looking for.
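
For reference, the core of Gale-Shapley is tiny. A minimal Python sketch with made-up names (my illustration, not what ChatGPT produced):

  # Minimal Gale-Shapley: proposers propose in preference order; acceptors
  # hold the best offer so far and trade up when a better one arrives.
  def gale_shapley(proposer_prefs, acceptor_prefs):
      # rank[a][p] = how much acceptor a likes proposer p (lower is better)
      rank = {a: {p: i for i, p in enumerate(prefs)}
              for a, prefs in acceptor_prefs.items()}
      free = list(proposer_prefs)            # proposers without a match yet
      next_choice = {p: 0 for p in proposer_prefs}
      match = {}                             # acceptor -> proposer
      while free:
          p = free.pop()
          a = proposer_prefs[p][next_choice[p]]
          next_choice[p] += 1
          if a not in match:
              match[a] = p
          elif rank[a][p] < rank[a][match[a]]:
              free.append(match[a])          # displaced proposer re-enters
              match[a] = p
          else:
              free.append(p)                 # rejected; tries next choice later
      return match

  people = {"ann": ["g1", "g2"], "bob": ["g1", "g2"]}
  groups = {"g1": ["bob", "ann"], "g2": ["ann", "bob"]}
  print(gale_shapley(people, groups))        # {'g1': 'bob', 'g2': 'ann'}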


Not that I'm trying to change your mind, but some specific things that LLMs do for me:

- Helping to refactor SQL

- writing jq commands (I simply cannot)

- writing shell code (it happens just infrequently enough that I can't justify spending the time to get good)

- brainstorming names or puns. Word association is easier with a second person (or an LLM)

- figuring out why my AWS CLI commands aren't doing what I'm expecting them to be doing

- asking for reasons why a piece of code wouldn't work

I can competently do all of these things on my own if I have to, but now I don't have to do them on my own, so my life is easier.


This is just helping out with trivial tasks, which sounds very much in scope for LLMs. The OP was talking about replacing "world-class consultants", which I'm very sceptical of.


How many people actually hire consultants, let alone "world class" consultants? Especially on a regular basis. The value delivered by solving problems you encounter daily (hourly?) will outstrip the value you get from problems you need to solve every year or two (or more).

I get a half-decent expert to solve problems I don't have the patience or experience to solve myself. That's literally the definition of what a consultant does. What you call "trivial" is what lots of other people describe as what consultants largely do.


I use it almost exclusively to help me approach novel problems, or at least problem spaces that I am severely lacking expertise in. It’s pretty incredible for helping me leverage my extant programming knowledge to accomplish things that I couldn’t before due to being blocked by research.

I’ve also used it quite a lot for advice as I was learning project management and moving away from individual contributions.

So I think as a consultant it has value, though I think “world class” might be a bit hyperbolic, since I still need to validate its output.


> I use it almost exclusively to help me approach novel problems, or at least problem spaces that I am severely lacking expertise in. It’s pretty incredible for helping me leverage my extant programming knowledge to accomplish things that I couldn’t before due to being blocked by research.

Can you expand on this? This is what I'd like to use it for, but I've never had an experience remotely close to helpful. The typical response is just something like "that's complicated", followed by some gibberish that tries to show a simple example but is still wrong in ways that make it not useful at all.

Do you have a specific example?


I've had it craft legal letters in German. Very good content.


I'm surprised that nobody on HN seems to point to or mention Bing Chat. It is based on GPT, it is much better than the free GPT 3.5, and it is free as well. I use it plenty, and more than 90 percent of the time the answers are exact and the code just works as-is. You can ask it extra questions about how the code works, and it explains in as much detail as you need, until you fully understand. Then you ask it to rewrite things this or that way, try it out, and add features step by step. Works like a charm.


Until your comment, I had been using ChatGPT 3.5. Not only is Bing Chat more accurate (GPT-4 based), but it also provides footnotes that let you reference the sources for more specific information.

The code snippets tend to be bloated, but no worse than what an intern would produce. A quick refactor, adding edge-case handling, and the code is ready to test and commit.


You can also upload PDFs to it.


> Are you able to give a specific example of a problem it helped you solve? Especially one where you were at a complete blocking point...

I think it's worth stepping back here and re-examining the hurdles you're setting for your own understanding.

The essential question here is: is ChatGPT useful to people? What you seem to be implying with your question is: I will not use ChatGPT unless it can solve problems of X level of difficulty for me. Why have you set that prerequisite? Would it not still be useful to you if it simply increased the efficiency of solving simple day-to-day tasks that you're not blocked on?


Well, I have yet to be provided a specific example that matches the hyperbolic descriptions people are throwing around. I simply tried to find my own examples, and, as is my experience with ChatGPT, it is nearly completely worthless. And then I get told that I'm using it wrong, asking it questions that are too niche, and that I should reset my expectations.

I'm still waiting for the concrete examples that back up the above commentary about how much of a game-changer it is.

> I've essentially got 15-20 high-priced world-class consultants in every field that I chose to pull from, working at my beck and call

In other words, this comment from above needs backing up. I've seen such descriptions everywhere but no one has ever provided examples that corroborate this.


> I have yet to be provided a specific example that matches the hyperbolic descriptions people are throwing around.

> > I've essentially got 15-20 high-priced world-class consultants in every field that I chose to pull from, working at my beck and call

I'll give you that: many of the claims are hyperbolic and ridiculous, including that particular one by the GP. The only way I can think of that statement being remotely true is if "high-priced" has no bearing on the quality of those consultants' work (though, tbh, that is often the case in reality).

Personally, I have not found ChatGPT particularly good at anything I need to do. I have, however, found it very passable at many things I don't enjoy doing. Its reporting boilerplate is far more tailored and novel than anything I could ever put into a reusable reporting template. I dread writing reports: it's not a technical challenge for me, but I gain no stimulus from it, so reducing the task to one where I just have to heavily edit something ChatGPT prepared for me is frankly incredible. The same goes for formal cold-call emails to people I don't know: another menial, unpleasant, but necessary task.

It's also great at spitballing and, as others have mentioned, at the quick, shallow topic summaries you'd normally rely on a quick google and scan for (timely, as Google gets worse and worse).

These are microgains throughout your day, nothing truly revolutionary, but that's the case for much of the successful tech we take for granted.


I am decent at writing PHP/WordPress, with a basic working knowledge of some JS/jQuery.

I wanted to use my GPX files to generate my own private hiking/running log.

With basically zero prior knowledge of GPX files, D3.js, or Mapbox before starting, I was able to quickly write a plugin that displays a path on a Mapbox map and generates an elevation profile.
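
The core GPX parsing turned out to be only a few lines; in Python it would look roughly like this with gpxpy (a sketch, not my actual plugin code, which is PHP/JS):

  # rough sketch with gpxpy: pull an elevation profile out of a GPX track
  import gpxpy

  with open("hike.gpx") as f:                # hypothetical file name
      gpx = gpxpy.parse(f)

  points = [p for trk in gpx.tracks
              for seg in trk.segments
              for p in seg.points]

  distance_m, profile = 0.0, []
  for prev, cur in zip(points, points[1:]):
      distance_m += cur.distance_2d(prev)    # planar distance in meters
      profile.append((distance_m, cur.elevation))
  print(profile[:3])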

It would have taken me so long to do this before, by googling and reading documentation with trial and error, that I would have abandoned the project.

Now I have a working system.


Agree with you. Why hasn't someone shown a useful prompt or an example of one in more than 8 hours? ¯\_(ツ)_/¯

For me, personally, it has replaced Google Search for the type of query that looks like initial research on a specific subject. "What is X?", then a couple of follow-up questions, and I can quickly get a rough idea of whatever subject I happen to need at the moment.


I'm not the OP but I had a similar set of experiences with ChatGPT so I find the discussion useful. Essentially I already know how to find what I need using a search engine, and I know how to use cues from the source material to judge if it's correct. When I try using ChatGPT for technical problems I face, I frequently get hallucinated answers that feel like a waste of time.

Then I go on HN and read these paeans from other technologists who say ChatGPT has completely changed how they work and is 10x better than using Google. I'd like to have that too! I just don't get it, it doesn't match my experience at all. And yes I did try the paid ChatGPT model for a month.


> Then I go on HN and read these paeans from other technologists who say ChatGPT has completely changed how they work and is 10x better than using Google. I'd like to have that too! I just don't get it, it doesn't match my experience at all.

This is exactly my experience and what I am trying to get at. I start feeling that maybe I'm missing out, so I go give it a try. My direct experience shows that it spits out gibberish, but then people say things like it's a godsend without giving any examples, besides maybe that it generated some HTTP API requests for them.


Example then:

ChatGPT Plus with the data analysis feature gives GPT-4 access to a Jupyter back-end.

I was dealing with a noisy sensor in a factory production line.

I asked GPT to remind me which averaging functions might be usable to smooth out the noise, within the constraint of a PLC with limited memory.

I got GPT to simulate the output from the noisy sensor, apply each averaging function, and supply graphs of inputs and outputs. It was then pretty easy to eyeball which function was most suitable for the task.

I then gave GPT a code-style example and asked it to provide an (IEC 61131-3) Structured Text implementation using that code style. Which it did. This turned out to be pretty close to the final implementation (after careful reading and testing.)

Because it's so cheap to do (time- and money-wise), I used GPT to generate quite a lot of throwaway code to get to the final result. I probably wouldn't have considered this particular approach if I were doing it by hand.
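
For flavor, the throwaway comparison looked roughly like this (my reconstruction in plain NumPy, not the actual GPT transcript):

  # compare two PLC-friendly smoothing options on simulated sensor noise
  import numpy as np
  import matplotlib.pyplot as plt

  rng = np.random.default_rng(0)
  t = np.linspace(0, 10, 500)
  signal = np.sin(t) + rng.normal(0, 0.3, t.size)   # noisy sensor reading

  # simple moving average: needs a window-sized buffer on the PLC
  window = 20
  sma = np.convolve(signal, np.ones(window) / window, mode="same")

  # exponential moving average: one state variable, cheapest on memory
  alpha = 0.1
  ema = np.empty_like(signal)
  ema[0] = signal[0]
  for i in range(1, signal.size):
      ema[i] = alpha * signal[i] + (1 - alpha) * ema[i - 1]

  plt.plot(t, signal, alpha=0.3, label="raw")
  plt.plot(t, sma, label="moving average")
  plt.plot(t, ema, label="exponential average")
  plt.legend()
  plt.show()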


I think there really are people who have jobs where they mostly write the same HTTP request code all day, every day, and then go home. Especially in the web space and the startup world. ChatGPT isn't optimized for us; it was designed for the JavaScript-tutorial audience that has somehow remained gainfully employed.


One of my clients has their own instance of ChatGPT running. It runs on dedicated hardware, is trained on different data, and we can change the core prompts with relative ease. We even get to tune a temperature coefficient, which adjusts whether the LLM behaves in a more creative or a more factual way.

This is because the client in question is in a regulated industry.

This week, we used it a bunch of times to help rephrase objectives during brainstorming sessions, having it absorb ridiculously large regulatory PDFs to give us summaries and ideas about what our greenfield project should focus on.
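
For context, that temperature knob is just a sampling parameter most chat APIs expose. A hedged sketch with the OpenAI Python SDK, where the base_url and model name are placeholders for a dedicated deployment like ours:

  from openai import OpenAI

  # placeholders: point base_url at the dedicated deployment
  client = OpenAI(base_url="https://your-dedicated-instance/v1", api_key="...")

  resp = client.chat.completions.create(
      model="gpt-4",
      messages=[{"role": "user", "content": "Summarize this regulation: ..."}],
      temperature=0.0,   # near-deterministic, "factual" behaviour
      # temperature=1.0  # more varied, "creative" sampling
  )
  print(resp.choices[0].message.content)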


Be careful: not sure what you are self-hosting there, but that last bit about absorbing huge regulatory PDFs sounds like a good way to start working on a task the LLM imagined.


I don’t need it to solve problems where I’m completely stuck for it to be worth $20 a month.

Things I’ve gotten value out of in the past week or two:

• Writing a job description

• Making a python script to automate some stuff in Asana

• Simplifying some management concepts so I could slack them to a coworker

All of these are things I could easily do myself. But with ChatGPT, they're done 75% as well in 10% of the time, and I hardly have to think at all.


This is the sort of stuff I use it for all the time, great examples. Or summarizing new concepts, reformatting strings or tables quickly, asking what some strange code snippet might be doing.


In my experience, it's more helpful in some areas than others.

This year I used GPT-4 to write a significant amount of Terraform that was necessary to migrate an application onto AWS.

Writing Terraform, in my opinion, is a problem that's broad but shallow. GPT-4 needed to do little beyond summarizing documentation, but it was able to do so competently, and that was hugely valuable to me.

Conversely: In my free time, I've attempted to use it for a game-development side-project, and very little of its output has been useful at all.


With ChatGPT I've been able to generate 2D and 3D math animations in just 1-2 hours, with no background or prior knowledge.

I was also able to use it to generate software that checks the blockchain to gather data I needed, again with no experience using the blockchain client libraries.


> With ChatGPT I've been able to generate 2D and 3D math animations in just 1-2 hours, with no background or prior knowledge.

Can you give some example prompts that will reproduce such an experience?


I prompted it with "what is wrong with this code" and my broken SQL, and it told me exactly what was wrong.

I could have looked up the individual parts and found the problems myself, but then I would have spent at least 15 minutes on it instead of 30 seconds.


I can give two personal examples:

1) I needed to display a very high-resolution image to a user. I have experience in GIS and imagery, so I knew I should use an image pyramid in some way, and from previous experience I assumed I needed some server to cleverly serve the tiles. But I didn't want to implement it myself, and googling 'map server' led to rabbit holes. I consulted ChatGPT, and while it gave me several fake solutions, eventually it suggested using gdal2tiles.py to create the pyramid and then serving it directly with a CDN. This never occurred to me, and it is a much better fit for the problem. It saved a LOT of time (on either building my own server or fudging with other solutions).

2) I have a streamlit service and needed to use some of my infra inside it, infra that was written using async-await. Unfortunately, streamlit and asyncio don't play nice together (boo streamlit). I went to ChatGPT hoping it would find me a way to make it work anyway, and after trying everything it suggested (which failed), I tried googling myself and spent several days without a solution. Eventually, I went back to ChatGPT and it suggested building a small HTTP service that would serve the results and accessing it from streamlit using requests, no async-await required. It's a hacky solution (roughly sketched below), but significantly faster than reimplementing my streamlit dashboard in another framework or rewriting my infra without async-await. It saved loads of time.
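
The shape of the workaround, assuming FastAPI/uvicorn (fetch_data is a stand-in for my real async infra):

  # service.py - a tiny HTTP wrapper around the async infra
  import asyncio
  from fastapi import FastAPI

  app = FastAPI()

  async def fetch_data():                    # stand-in for the real infra
      await asyncio.sleep(0.1)
      return {"rows": 42}

  @app.get("/result")
  async def result():
      return await fetch_data()

  # run with: uvicorn service:app
  # dashboard.py - the streamlit side stays fully synchronous:
  #   import requests, streamlit as st
  #   st.write(requests.get("http://localhost:8000/result").json())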

I think you'd agree these aren't junior-level issues. ChatGPT definitely didn't solve every problem I came to it with, probably not even most, and even when it did, I had to intervene significantly. I feel the more experienced you are as a developer, the less valuable it is. But when you need to tinker in a field you're not proficient in, or when you need to brainstorm a solution to a tricky problem, it can be a great tool. I understand why many swear by it. It takes a while to learn what sort of issues are good to bring to ChatGPT and which aren't, and also how to phrase those issues.


If you look at his profile and history, it seems what he does is mostly parsing JSON and generating config files.


I had to implement a relatively simple auth flow recently for an app (something I hadn't done in the past). GPT-3.5 struggled to give me anything useful other than high-level ideas, but GPT-4 gave me the exact boilerplate I needed and unblocked me.

I had searched all over the web on Google for an example addressing my specific use case and couldn't find one. GPT-4 produced a working example for me and got me past that roadblock. I also use it regularly to suggest better coding patterns, and I find it does a really good job at code reviews for obvious mistakes / anti-patterns.


Yea, I have a very similar experience. ChatGPT quickly unblocks my conceptual understanding when it comes to coding. My programming knowledge is pretty good, but in certain areas I had a hard time understanding something and just skipped it. OAuth is an example of this. ChatGPT unblocked me there as well and explained the theory behind it, both in the abstract and as it pertained to my situation. So now it's much clearer.

But my mind kind of works like this: once I'm stuck, I endlessly do my best to understand, which sometimes means I grind to a halt. ChatGPT stops that entirely.

The fact that it always produces 2 to 3 bugs is fine.


Random example from today: we have a new product being deployed from a prominent vendor, and they started running into a ton of "out of socket" errors. This product was important to one of our telemetry organization's Q4 OKRs, and they weren't getting traction with the vendor, so they started reaching out internally. I got tagged as someone with a background in networking (though this was clearly not a networking issue) and started poking around. The first thing that comes to mind, of course, is file descriptors, but ulimit -n returned a massive number and "ss -a" showed there were a tiny number of sockets open (like, 1300 - we have lots of nodes with 500K+ sockets open - so this really didn't feel like a node or docker container issue).

So I just reached out to Chat, and we started going back and forth, starting with "In Linux, how many open sockets can there be at a time?". What's nice about that wide-open question is that you don't get a single answer; instead you get a briefing on Linux sockets. File descriptors, sure, but also memory, port range, TCP/IP stack limits, etc. It starts to lay out a roadmap towards solving the issue, answering the question you were interested in rather than the one you asked.

I do a bit of back and forth on some scenarios, asking about /proc, ss, etc., seeing if I can track anything else down. And then, after spending about 5 minutes and building context, I ask it "Is there anything else that can cause an error regarding too many sockets with low socket use?", at which point it lays out a number of scenarios, one of which is FD_SETSIZE.

So we dig into FD_SETSIZE, and immediately it looks interesting: it's the limit on the descriptors you can use with a select() call, and, even better, I get recommendations to use poll or epoll (which anyone who has ever straced anything has seen a ton of).

I ask it how to determine FD_SETSIZE, discover it's 1024 on the client, which matches our low socket count, confirm that we should never increase FD_SETSIZE, check the vendor code, see they've got it hard-coded to talk with select() instead of poll(), and recommend they give us a new build with poll() - and voila, the problem goes away.
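
For anyone who wants to see that wall for themselves, it's easy to reproduce from Python (raise the fd limit first, e.g. "ulimit -n 4096", so the sockets can be created at all):

  import select, selectors, socket

  # select() only handles file descriptors below FD_SETSIZE (1024 on most
  # Linux builds), no matter how high `ulimit -n` is set.
  socks = [socket.socket() for _ in range(1100)]
  try:
      select.select(socks, [], [], 0)
  except ValueError as e:
      print("select() hit the FD_SETSIZE wall:", e)

  # epoll (what poll()-style APIs use) has no such limit.
  sel = selectors.DefaultSelector()
  for s in socks:
      sel.register(s, selectors.EVENT_READ)
  print("epoll registered", len(socks), "sockets without complaint")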

On to the next issue.

Where Chat excels is not in solving or completing anything, or, in fact, in even being correct 100% of the time (it clearly isn't) - but it's an endlessly enthusiastic savant intern - frequently wrong in hilarious ways, but always willing to dig in, pull on technical threads, and come up with things that I can try in a bunch of rapid iterations to close off an issue. Its willingness to write code that is 90% correct just reduces the time and cognitive load of constantly having to do it all yourself.


“endlessly enthusiastic savant intern” is a good description!

I’ve had to sharpen my skills at verification and cross-checking, but the volume of pretty good insight from a much wider domain than I myself could consider has been so worth it.

It's also worth noting that I feel like I used to get this type of help, roughly, from good search engines. Maybe not quite as spoon-fed, but pretty good. But now... just blogspam and paid listings.


This might be me being picky, but you mentioned at first:

"I've essentially got 15-20 high-priced world-class consultants in every field that I chose to pull from"

And then here said:

"It's an endlessly enthusiastic savant intern - frequently wrong in hilarious ways"

And those feel very different to me.

The second one feels more similar to my experience. This is what I spot with ChatGPT (and advocates of it): in the abstract, it is a genius, but whenever we look at a specific example, it's not like that at all.


Yeah, I've been down the exact same paths as you. You're spot on: it's a statistical merge, and "RAG" is literally "filter the probabilities and change the odds".

The main argument is that you can give it a block of real-world data, like an email or code, and take advantage of collective knowledge to identify outliers like bugs, bad grammar, or incoherent writing, which translates exactly to code semantics too.


I feed ChatGPT 3.5 answers from surveys, and it summarizes and categorizes the answers for me. I have a quick read through as well to be sure, but I've been very impressed so far. It's a massive time-saver when doing a lot of research.

I can also feed it a list of web accessibility issues and have it sort the list from most to least critical, along with providing references to the specific WCAG criteria. It has occasionally stumbled on this task, but again, massive time savings.

I also use ChatGPT for filler text in designs. Yes, I have to build on what it writes, but it's way better and quicker than what I'd do myself. I know our communications team is doing this 10x more than I am.


Not OP. Also, I generally detest hype culture and hated the Web3/crypto bubble. I'm a SWE who works as the CTO of a small startup studio, so I end up dealing with a lot of everything: from coding, to managing and hiring people, to working on proposals and finding clients, you name it.

I use ChatGPT 4, the GPT-4 API, and Copilot every day. It is an "average intern" at many, many things. Here's how I feel it helps me:

* Its interactivity lets me learn a lot (superficially) about new topics. I can then expand that with research of my own

* It helps me think outside the box or from other perspectives when thinking about e-mails, proposals, real-world scenarios

* When exploring a new language, framework or technology, it points me in the right direction.

* For quick scripts, using the code generation/analysis feature, if I direct it right (i.e. lay out "the plan" beforehand and ask it to work out the rest on its own), it gets a lot of it right pretty fast, saving me time writing the code and figuring out the right libraries and the nitty-gritty details.

* It is great at giving ideas for why something might not be working.

Real things I've done with it:

* Discuss ongoing negotiations with clients, trying to better my proposal and better understand the client's point of view.

* Learn more about managerial or "business-y" topics by discussing things with it and iterating on that with my own research. It is a valuable "whiteboard" to discuss with.

* Adjust my e-mails so they are more appropriate to the situation. This can involve changing the tone, shortening them, adding more detail, etc.

* In general, I've used it to find flaws in my reasoning when dealing with people. For example, it has helped me question my own client proposals or approaches by pointing out where I was lacking (e.g. because I was vague or pessimistic, didn't give a measurable objective, seemed to ignore a client's request, etc.)

* I use a command-line utility from the shell that lets me describe something I want done and then has it done. I now use this with some frequency to write the commands I would otherwise have to google because I haven't memorized them: things like ffmpeg or imagemagick arguments, or combinations of grep, sed, ls, find, git, etc. Here are some examples:

i) "merge jpgs in this folder in order onto a pdf. each jpg should fill the page 100% without margin. Avoid complex calculations".

ii) "zip this folder with maximum compression and password 12345678'".

iii) "git remove branches matching pattern sprint-* remotely'"

iv) "use imagemagick to convert to jpg lossless and rotate 90 deg clockwise ~/home-en.png"

v) "Add _en before .jpg in all files in current directory. So they become _en.jpg"

vi) The list goes on and on...

* It has helped me clean up nginx config files.

* I have thrown code at it that I suspected had a bug. With a bit of context and some back and forth, it has helped me find it.

* In languages or frameworks I don't use often, it really shines. It has helped me write several AppleScript scripts, which I have also cobbled together to create Alfred workflows. If I need to code something in a language I don't often use, what it produces is good enough for me to iterate on.

* It has helped people at our company improve their copywriting (when used with a lot of care)

* I have used it to help me critique my own poetry and improve how I approach poetry in general. Highly subjective, I know

* When trying to figure out how to use apps I don't often use, or dealing with unexpected behaviour in them, it often helps me find the issue. Notable apps in this category include Photoshop and Excel.

* I don't often do frontend, so I'm particularly bad at styling and organizing things. When I occasionally have to do frontend, it often gives me the initial "skeleton" very well.

I have seen many people try to use these tools, and here's where I think they SHOULD NOT use them:

* For facts, obviously -- which is unfortunately what many people actually try to use it for

* For writing (without checking) most of your e-mails or posts, especially with very little context

* For ideas where you need "just enough creativity". It's a very fine line. Think brainstorming ideas for UI elements in a new website according to specific brand guidelines

* For incredibly specific details or overly complex tasks. You can't often have it write your 200-line function out of thin air!

It is clear that GPT provides me with:

1. A better search engine, which skips past loads of outdated content or SEO-laden bullshit. It's just a shame that I can only use it this way for the more specific "creative" or "problem-solving" questions. For important fact-based info, I always have to check what it says.

2. A partner with whom to discuss ideas and iterate over them, especially on topics I don't know. If anything, it's a great rubber duck.

3. A way to forget the underpinnings of some of what I do and approach it with natural language. I find myself asking it to write my bash sequences instead of thinking them up myself. Of course I check them, and I understand them 99.9% of the time, but it's just so much easier to have it do it for me.

4. (Copilot is an upgrade of my IDE. And a great one!)

If modern Google (and Bing, and everyone else?) weren't so shit, I wouldn't need 1. To this day I still go to Google first, eventually give up because it doesn't answer me properly, and then go to GPT. It's ridiculous. I think 1 brings a lot of value to many, many people. Google is so absolutely shit nowadays I can't believe it, and the more I use GPT to get a comparison baseline, the more I am shocked. 2, 3, and 4 are incredible additions to my workflow.

I truly believe we are heading toward interacting with computers via a new interface. Even writing code might eventually be affected. Perhaps, just as we have built high-level languages on top of lower-level languages and machine code, abstraction over abstraction, we might end up with some form of writing code that uses (very well controlled) LLM technology under the hood. Instead of directly writing the code, we tell it to write it. We still need to be smart enough to pick the right algorithms and structures, but we no longer have to worry so much about writing the nitty-gritty syntax or details. Maybe?

Using GPT is its own art. Crafting the right prompts and getting a feel for how it works is very important. It is also essential to think critically and know when to question what it says and when not to.

A friend of mine told me this: "The companies that thrive on hiring loads of shit people to do work are the ones who will suffer most. The others, which have hired smart people capable of critical thinking, will benefit. GPT obviates the work of many shit people. Pair GPT with someone who is already smart and has very good problem-solving or critical-thinking skills, and now you've got a team that can obliterate teams with many times the resources."

I don't know if OpenAI will succeed or not, but I do know this technology is absolutely life-changing for many, and going back to a world where it isn't this accessible would likely be a net negative.


> A better search engine, which skips past loads of outdated content or SEO-laden bullshit.

But ChatBLT was "trained" using SEO-maddened bullshit, thus you're getting the statistical mean of all that bullshit.

The first rule of the internet is don’t believe everything on the internet. Hitching your financial wagon to it seems crazy to me.


ChatGPT somehow gets me the right answers and Google doesn't. I know they are right because often I'm just looking to remember something, and once I do, it clicks. Something as simple as googling "how to do X in Y". Somehow Google always gives me pages of bullshit or tutorial-like content that doesn't quite answer my question. But when I ask ChatGPT, often without any more back and forth, it gets it.

How? Maybe because it was trained on lots of data and therefore isn't just "spitting out the first thing", or perhaps because it's actually just answering the question I asked, without muddling it into a page with 50 other similar questions designed to boost the SEO ranking. Whatever the reason is, it clearly happens often.

I’m not advocating for not fact checking ChatGPT. Quite the opposite. But even with that caveat, it adds tremendous value, at least to me.


My experience is much the same, far from "I've essentially got 15-20 high-priced world-class consultants in every field that I chose to pull from, working at my beck and call..." that OP mentions.

I suspect it is because I tend to have off-piste questions that are not recipes or high-level ideas. That is where it'd be most useful to me, but it is at the same time where the least training data is going to be.

It is also where actual legitimate experts are most useful: the special sauce, not the meat and potatoes.


I used it yesterday to generate a basic layout with an image and an input field based on Tailwind CSS.

It would have taken me much more time to figure it out than what ChatGPT did on the first try.


Well, the things you're working on don't sound like things I'd feed to ChatGPT. But if I need to look up some language feature, or write some basic boilerplate code, for example, it's a big time-saver. If it's something that takes mental energy to generate but that I can verify in a few seconds before adding to my project, it's good. I wouldn't ask it to code any algorithms, for example, but syntax/language-feature stuff and API gluing it can pattern-match pretty well.

TL;DR: don't ask it anything to do with logic, but for anything to do with documentation and the like, it's pretty good.


I'd pay money to watch someone like you work. Like other responders, I've tried using ChatGPT multiple times, and very often when I've used it for tasks where I was familiar with the subject matter, I was disappointed in the results. It's possible I haven't learned the proper way to get what I need from it, and I would love to learn how others are using it to solve real problems.

I mean it. I'd pay money to watch streams of someone using ChatGPT to solve non-trivial problems.


You can start by paying $20 for ChatGPT 4 and trying it on tasks where you're familiar with the subject matter. I've tried it, and I've been amazed.


Please, please, reference the tasks or chats that were especially impressive.

For example, when I tried to get ChatGPT to do something very simple for my work, like implementing a convolution, it took me in circles and circles.

It gets the general idea right, sure. But it makes significant minor errors that ended up being more confusing than helpful.


I had some code that I figured had a concurrency bug in it, but I wasn't sure.

I pasted it to GPT4, said "I think this code has a concurrency issue, can you find it for me?" and sure enough, it told me what line of code was causing the issue.

It is also great for doing things that I don't do often. For example, I almost never write bash scripts (like once or twice a year). Nowadays I just use GPT4 to write the script for me. For example, a couple of days ago I needed to walk through a folder full of files and pass them one by one into another program, and, if that call was successful, append the processed file to a list of files that had been successfully processed. Trivial, but again, I never write bash scripts. GPT4 had it up and running in a few seconds.

I can throw a file at GPT4 and ask it to create mocks for my objects so I can go ahead and start writing unit tests.

A while back I had set up a new Express server, but after I added my CORS config everything stopped working. I spent an hour debugging, no luck. I asked GPT4 what the issue was, and it noticed I had defined CORS on the wrong line (after I had already started my server up). Oops.

GPT4 is also great at coming up with sample command-line options for utilities. Sometimes good examples exist in man pages, sometimes they don't.

Honestly, I don't use GPT4 as much as I should!


Yesterday I asked it to generate i18n translations in a dozen languages for a few words in my app, in the specific syntax needed for React Native localization support. I have a pretty simple app I'm building with a few UIs, so I only needed to translate a few words, but I basically just made my app shippable to a dozen other languages with a simple ask to ChatGPT. I'm not really an evangelist. It suits specific use cases well, and it does require some learning and some engineering. I think people expect to say "Solve this very complex problem for me by yourself", give it a terrible prompt, and then blame it. It requires the same level of creativity that programming does. You have to think about what to ask it, and in what way. Once you get the hang of that, it's pretty powerful.

In my example I said "I'm trying to translate my React Native app into multiple languages. I need an object that looks like this: <example JS code>. Can you generate this object for French, German, <insert more languages>, and for the following words: <words I needed>."

Luckily I know the basics of a few languages, and the words were simple enough that I could easily verify it wasn't a hallucination. I also know enough JavaScript to know that it was syntactically sound, but my IDE would've told me if it wasn't.


I asked GPT-4 to implement a convolution layer in NumPy, including backpropagation. It produced correct, working code within ~15 minutes of prompting. Are you sure you used GPT-4?
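
For scale, the core of such a layer is genuinely small. A minimal single-channel sketch with a numeric gradient check (my own code, not the model's output):

  import numpy as np

  def conv2d_forward(x, w):
      # valid 2D convolution (really cross-correlation, as in conv layers)
      kh, kw = w.shape
      out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
      for i in range(out.shape[0]):
          for j in range(out.shape[1]):
              out[i, j] = np.sum(x[i:i+kh, j:j+kw] * w)
      return out

  def conv2d_backward(x, w, dout):
      # gradients of L w.r.t. input and kernel, given dout = dL/dout
      kh, kw = w.shape
      dx, dw = np.zeros_like(x), np.zeros_like(w)
      for i in range(dout.shape[0]):
          for j in range(dout.shape[1]):
              dw += x[i:i+kh, j:j+kw] * dout[i, j]
              dx[i:i+kh, j:j+kw] += w * dout[i, j]
      return dx, dw

  # numeric check of dL/dx for the loss L = sum(out * dout)
  x, w, dout = np.random.randn(5, 5), np.random.randn(3, 3), np.random.randn(3, 3)
  dx, _ = conv2d_backward(x, w, dout)
  e = np.zeros_like(x); e[2, 1] = 1e-6
  num = np.sum((conv2d_forward(x + e, w) - conv2d_forward(x, w)) * dout) / 1e-6
  print(np.allclose(num, dx[2, 1]))          # True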


Here's an example recreation. https://chat.openai.com/share/b80aa985-0d88-4cf3-8c87-c2047f...

I admittedly gave it a vague question ("where the signals overlap").

In the context of the original thing I was trying to do, I was filtering a received signal and trying to remove the delay from the convolution.


Give me your prompt. I'll paste the output from ChatGPT4.


Seconded.

I remember the first time I watched a Netflix developer livestreaming their workday using an impressive Neovim setup. It was eye-opening.

I need the same experience for this ChatGPT fanaticism.


The funny bit is that the same developer (ThePrimeagen) is not a massive fan of ChatGPT.


That satisfies my confirmation bias. I'm under the assumption that pre-existing high-caliber developers aren't improved by ChatGPT/Copilot. At least not yet.


Doesn't it also help with trivial problems?

The goal here is to accelerate productivity, is it not?


Use ChatGPT for trivial and boring/common problems instead, as long as it saves you time on boilerplate work.


This is interesting; I've had the opposite experience so far. I tried to use it like that, because if it worked it'd be game-changing, but it seems that with every question I ask, I get half-truths and lies back. It's impossible to figure out what is and isn't true, because the LLM says everything with such certainty. I end up having to research everything it says again to figure out what to trust. Sometimes it's good for getting a foothold in a subject, but that's all I've been successful with so far.


You have such an experience with technical problems?

It's written at least 20 Python scripts with me, and almost all of them have been close to perfect almost on the first draft.


Really? That hasn't been my experience. Out of 20 Python scripts, at least 50% have some error, some of which look right until you run them. A few involve hallucinations, like calling non-existent methods.


Actually, yes, that's true, but within a few iterations it quickly repairs those flaws when given the errors and changes. I'm always running the draft script immediately and iterating on it. Sometimes in the same context, but often I'll take its script, change it a bit, then start a new context with that version to keep adding features.


Are you using 3.5 or 4? And what kind of things are you trying to do? The ask-it, copy-paste-results, paste-errors loop doesn't take long, so it's quick to validate whether it works or not. Even when it doesn't work, it saves me time versus googling and also not solving my problem. Or it gives me insights.


In all honesty, I've only been using Claude Instant so far, due to the larger context window being something I'd want down the line. What should I be using, in your opinion, to get the best results?

Examples I can think of where it gave wrong info:

- I asked it to compare the sizes of the M1 MacBook and the 16" Framework laptop.

- I asked how to add the nixpkgs-unstable channel in NixOS.

- I asked it to help write a PowerShell script to create AD accounts.


It's incredible to me that so many people still don't realize that GPT-4 is on a completely different level from other LLMs.

It's like if an alien visited and the first place they went was to a convention for Special Needs people and then they decided that humans were too stupid to talk to. Or rather, there was a Mensa convention next door with a $20 entrance fee and they figured it wasn't worth it.

Or rather, for $20 everyone at the Mensa convention agreed to log in to a special app and wait 24/7 for any time you have a question. Then they would instantly select the most knowledgeable among them in that subject and immediately enter a reply incredibly quickly.


It "proved" that P = NP the other day, for me. So I think the Special Needs people would still be ahead.


I don't know, I've only used ChatGPT 3.5 and haven't tried other things. Of your questions, I would think only the last one would be a good fit.

The first one is open ended, and I don't think the M1 came before ChatGPT's knowledge cutoff date. The 2nd one is too niche; there's likely not enough data to grab.

For the 3rd one, I don't know if the knowledge cutoff is recent enough - PowerShell became a thing more recently and isn't widely adopted... but I would expect it to be a good fit.


PowerShell came out in 2006, well before the knowledge cutoff date of 2022.


Try ChatGPT (GPT-4) or at least Claude 2.

Claude Instant will not give you the highest quality answers.


So shouldn't you try GPT-4 before complaining?


I have a different view. I have been trying to use GPT for some tasks like NP reductions, Linear Programming formulations, Graph problems, etc., and the results are maddeningly bad. Seriously, I have raged in front of my computer like I hadn't raged since the days of the NES Ghostbusters game. The amount of mistakes is incredible, and the model seems to constantly feed into vicious circles of "I am sorry, you are correct, I made a mistake," then moving a few things around, then being corrected again and ending up back at the original mistake.

Two interesting things it did for me lately: insisted that 8 + 6 = 11 and also proved that P = NP. I don't know which solutions it may be providing you, but it can't be anything too complex, or at least not too abstract.

edit: typo


I feel like every ChatGPT thread needs every comment to be prepended with 3.5 or 4. It's incredibly difficult to tell whether someone is having a bad time with 3.5, which has many known limitations, or with 4, which tends to be better at problem solving (though it still makes dumb mistakes).

I think the "GPT is amazing" vs "GPT is useless" debate is just going to get more confusing as more versions are released.


I'm not surprised. LLMs are not good at those problems. There is a lot of hype; an LLM is not good at every problem, but it is quite good at some of them. If you have a classical NLP task, then it is good, particularly GPT-4. If you have a generative problem where you don't care about mistakes, such as a grocery list or a rough draft of marketing copy, then it is good.

LLMs are not good as search engines, calculators, encyclopedias, or logic engines. Maybe some day they will be, but not yet.


ChatGPT (specifically GPT-4) is very good at complicated but shallow questions.

For example, "I'm setting up a new TypeScript + Svelte app, I made some changes to the config that I thought were good, but when I try to run the dev server, I get this error." And then paste in your incomprehensible ten line error.

Any time you are using some software tool that you're not very familiar with, and you get an error message that you don't understand, try asking GPT4 to explain it to you.

It isn't so great at deep, theoretical, algorithmic questions. "NP reductions" are probably not a great fit.

Think of it as a research assistant that has a broad understanding of every technology in the world, but isn't as smart as you are about your specific area of expertise.


I use ChatGPT dozens of times a day and it saves me hours of work. I rarely experienced such a technical breakthrough in my lifetime that affects my workflow so profoundly.


The only breakthrough (technical or not) to which I can draw a comparison is getting always-on, cable Internet access for the first time in 2000.


I'm bullish in the short to medium term. I also pay the $20 a month, as it's worth it for the quick in-context answers. The company I work for (60k+ employees) has also rolled out an internal version of ChatGPT and will be integrating live data with our ticketing system, which will be a game changer judging by the demos I've seen. We are also doing a 9M trial of Office Copilot, which I think will be as big as Office 365, as it's $30 per user per month (which is costing us 32 per user per month). Also bullish on Microsoft as a result. Hard to see how MS will be outmaneuvered in the enterprise.


I wonder if this will disadvantage smaller companies who can’t afford an on-premise instance but also can’t use the cloud version due to NDAs and data protection.


I've also gotten good results that have added value to whatever I was working on at the time, but I also get a lot of partially correct responses, or responses where ChatGPT will forget what I told it seconds ago and repeat the same incorrect response from a few prompts ago. It's very effective if you know enough about what you ask it to disregard incorrect information quickly.

It can be very helpful to guide you in a direction maybe you didn't consider looking into to begin with, but I don't fully trust it.

There was a fun example I had a few months ago where I wanted to see how well it would do with being asked to solve a problem iteratively instead of recursively, something like: "generate all valid, distinct permutations of a provided string given that you have a dictionary to check valid words against" and it got most of the problem correct, but when asked to fix anything it would go right back to a recursive solution with the same issue appearing, or in some cases a new issue.

It got me most of the way there with some edge cases I needed to handle myself, but it definitely seemed like that was as far as it was going to be able to go
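For reference, a purely iterative take on that problem fits in a dozen lines; here is my own sketch (the toy `dictionary` set stands in for a real word list):

    def valid_permutations(s, dictionary):
        # Build permutations breadth-first with an explicit frontier instead
        # of recursion; the set dedupes prefixes when s has repeated letters.
        level = {("", s)}                      # (prefix built, letters left)
        for _ in range(len(s)):
            nxt = set()
            for prefix, rest in level:
                for i, ch in enumerate(rest):
                    nxt.add((prefix + ch, rest[:i] + rest[i + 1:]))
            level = nxt
        return sorted(p for p, _ in level if p in dictionary)

    print(valid_permutations("tca", {"cat", "act", "dog"}))   # ['act', 'cat']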


I used it to create a credit score system to start a project and it gave me the basics. I don't expect anything other than that.

The real issue will be that the majority of users will take the answer at face value without even knowing if it is a good one or just nonsense.

"The AI said..." will create lots of issues. Time will tell.


Have you checked the results it has given you? I tried it with some science-related stuff. It literally made up a paper that didn't exist. I pointed this out, so it linked to a real paper on an unrelated subject. For coding-related stuff it is like a better Stack Overflow, but again, it doesn't work well with niche libraries, or even some fairly big ones.


Yea, LLMs hallucinate. They almost never hallucinate if you provide the right context, either via a search-enabled LLM (ChatGPT with search, perplexity.ai, phind.com, etc.) or by providing it with more code/text/input to work with. Don't use LLMs as fact-database lookup machines.


My issue with ChatGPT is that it sounds like an expert consultant but I know it’s likely it’s just making up sentences some of the time. And I’d need a real expert consultant to know when.

So for an MBA major, 80% accuracy at 5% of the cost may be amazing but for me as an engineer and a person who cares, the inaccuracies are catastrophic.

I’m open to suggestions on how to work with this.


> My issue with ChatGPT is that it sounds like an expert consultant but I know it’s likely it’s just making up sentences some of the time

So, just like an expert consultant?


I was just about to reply with this hahaha. This guy clearly does not know consultants.


I have worked with people who do say “I don’t know”


This seems like an extreme case. I'm glad it works for you, but I remain skeptical and indifferent personally.


> OpenAI, at least in my day-day workflow for the last 9+ months has so superseded anything that google ever was to me that I'm having a difficult time comparing the two.

Seconded.

Let me tell you what I used to do.

First, Imagine I have an error executing code or an error running some bit of 3rd party software. I go to Stack Overflow and search. I find posts related to my problem and I spend a great deal of time trying to shoehorn existing answers to my specific issue. Sometimes, my shoehorning works and I fix my problem. Other times, it doesn't work, and then I post on Stack Overflow myself. And I wait ... and wait ... for a response. Sometimes I get a response.

Now, when I have this type of problem, I tell ChatGPT, "Hey, I'm trying to do <xyz> and I'm getting this error <abc>. Help me troubleshoot this." And it almost always helps me fix my problem. And it's ~10× faster than Stack Overflow.

=============

Second, there are times where I have to write code to do some relatively 'complex' data manipulation--nothing sophisticated, mind you, but stuff like, "I need these data columns rearranged based on complicated logic. And I need the text in columns A, X, AQ, and F merged, but only if <blah blah is true>. Otherwise, just merge text in Columns A and AQ, except if the date in column ZZ is after January 1, 2019." I can do this stuff on my own, but: a] it's cognitively draining, b] it takes time, c] I often make silly errors due to the complexity.
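To make that concrete, here is a hypothetical pandas sketch of that kind of conditional merge - the file, the column names, and the `<blah blah>` stand-in condition are all invented for illustration:

    import pandas as pd

    df = pd.read_csv("data.csv")

    blah = df["status"].eq("active")          # stand-in for "<blah blah is true>"
    late = pd.to_datetime(df["ZZ"]) > pd.Timestamp("2019-01-01")

    # Default: merge just columns A and AQ...
    df["merged"] = df[["A", "AQ"]].astype(str).apply(" ".join, axis=1)

    # ...but merge the wider column set where the condition holds
    # (and the date exception does not).
    mask = blah & ~late
    df.loc[mask, "merged"] = (
        df.loc[mask, ["A", "X", "AQ", "F"]].astype(str).apply(" ".join, axis=1)
    )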

ChatGPT is, again, an order of magnitude faster than I am. And it makes _fewer_ errors.

It still makes errors. And I still have to know what to look for to catch those errors, but it decreases my cognitive load tremendously.

edit: I haven't used Stack Overflow in 6 months. And "Googling" is Plan B.

=============

Edit 2: I recently had to write a sympathy letter to someone whose husband died.

I knew the general ideas behind what I wanted to say, but I knew I wasn't going to write anything great.

I fired up ChatGPT and said,

"write a sympathy letter to <x>. Tell her that I didn't know her husband well, but the few times I met him, I could tell that he cared deeply about you<x> and his daughter. I know his daughter well and I think she gets a lot of her great qualities from him and you. Tell her I don't know what to say in times like this. Keep it short-ish. Avoid schmaltz and sentimentality because <x> isn't that kind of person."

It gave me about as perfect a letter as I could have asked for.


As a non-professional coder, I'm honestly surprised there seem to be quite a few professional coders in this thread who don't follow the Stack Overflow-first scenario. As someone learning to code, shoehorning in SO code when I didn't understand 20% of the replies made the learning curve so steep; ChatGPT has made this so much easier by adapting the code and telling me why it will or won't work. Even accounting for watching out for and correcting hallucinations, overall it's a time saver.


As I've matured in my career I've gone to SO less and less, to the point where I mostly avoid it now. I use tools, languages, and libraries with which I am now highly familiar--perhaps more so than the average SO answerer, although I guess it would be arrogant to say for sure--and so the bulk of my questions are edge cases in my knowledge which are more easily answered by reading source code or specs.

I suspect this is one of the differences between people that find GPT helpful and people that don't--it seems much more potentially helpful when you are in a less familiar environment. In my work, when I run into unexpected behavior, if I knew what question to ask then I would probably already know the answer too. Until the day I can feed in an entire codebase and ask "What's wrong with this? vaguely gestures at everything", I don't expect that to change significantly.


That makes sense, and it's funny because the other reply I got was a joke about how bad coders don't even bother going to SO.


Probably because you're deeply specialized in something or have stagnated at a company for a long time? I cannot imagine knowing everything about what I am working on; it has changed dramatically every 3-4 months, even within the same company.


When you've been working for a while - not necessarily even for the same company - you gather an internal library of knowledge that you can pattern-match against for solutions. You may not have seen the exact same thing, but you may have seen something like it. There is hardly ever an entirely novel class of bugs anymore.


Yeah, definitely. But the tooling is very diverse in big companies. They might use every database under the sun; one day you might be doing React, another AngularJS; one day Django, another FastAPI; one day k8s, another Terraform... my brain is very limited, at least, and it can be helpful.


You’re surprised most professionals would want to actually try and diagnose a problem instead of trying to copy and paste a shoehorned solution? What? Most professionals I work with actually try to debug code, look at docs, etc. You know, research, like a real job.


> I'm honestly surprised there seem to be quite a few professional coders in this thread who don't follow the Stack Overflow-first scenario

You just found your path to become a better developer than the majority then


I haven't found much use for it outside of programming, though. I tried my best to make it help me review some papers, but it just comes back with generic replies and many errors. It isn't even good at finding typos. One thing it did do: it deanonymized a paper because the paper had been uploaded to bioRxiv (but I could have done that by Googling). I am struggling to find uses for it beyond text beautification.

OTOH, I have found a use for Stable Diffusion that actually resulted in some income.


> they are easily a $100B+ company once they nail down how to monetize

That would be amazing if Google, Microsoft, Amazon, or Meta sat on their hands while OpenAI got that big.


Mind if I ask what kind of projects you work on? And any specific examples where GPT really shined and delivered a lot of value?


All SRE-type work. Where it specifically shines is whenever you need to dig deep into Linux kernel specifics, networking specifics, or code in different languages to handle issues (Rust (which I don't know), Go, Python, Bash, etc.). For all sorts of error messages, it so often knows precisely what needs to be done to resolve them; the same goes for using and configuring graphing, database, and TSDB-type queries, formulating a clever tcpdump query for a particular scenario, etc.

Google used to be able to handle a lot of the non-procedural questions - but somewhere circa 2022/2023 something started happening to its results and I started getting back mostly SEO churn, to the point I was going back to using manuals and having to dig in and learn the fundamentals on a lot of things - which is unsustainable if you are touching 30-50 different technology stacks.

Chat changed all that - I can now go 3-5 levels deep in some stack, ask some incredibly nuanced question, get a reasonable answer that points me in the right direction, close off the issue, and then move on to the next one.


> having to dig in and learn the fundamentals on a lot of things - which is unsustainable if you are touching 30-50 different technology stacks.

IMO, this is the crux of the problem, and it’s caused by allowing dev teams to use whatever stack they want, without also requiring them to support it. More cynically, it will always be the case if teams are allowed to use whatever they want, regardless of “service ownership” – because when there’s a SEV-1 and the owning teams have no idea what’s wrong, the SREs are going to be told to fix it.


> More cynically, it will always be the case if teams are allowed to use whatever they want

Are you a manager?

My cynical take is that this happens when managers choose tech as they know nothing and push for LinkedIn driven development


Nope, SRE/DBRE. I’ve seen both absolute freedom and a centralized authority for choosing tech stacks. The latter was waaaaay more stable.


How do you control for hallucinations?


The same way you control for hallucinations with, say, Medium articles or Stack Overflow answers. You verify the answer.

Being given some code and reviewing it is a lot quicker than writing that code. Copilot is great. Half the time it spits out the wrong answer but you can see what it was “thinking”.

Phind giving you an answer and references lets you quickly double check. Sometimes it hallucinates but the answer and references combo is much better than a Google search which in turn is better than nothing.

Of course I recommend people not use AI for everything. I will go straight to MDN for any WebAPI question and use --help as my first port of call on the command line. This is like your L2 cache as a developer. Using AI for everything is like swapping to disk for everything.


You use your critical faculties: very important things must be verified from other sources, and unit, integration, and numerical tests can evaluate code and results.

Sometimes you need to go back and forth a bit: I tell ChatGPT it's wrong, give it the error message, and then it spits out the correct result. Sometimes I need an algorithm tweaked because it has assumed a wrong constraint on the problem; again, just explain it clearly and unambiguously and it will make corrections. There's only been maybe 1 in 40 problems I couldn't get a correct answer on after (sometimes a lot of!) back and forth.

I am not looking for a perfect oracle; I am looking for something to write 80% of the code, and then I'll fix it up. It's still way faster this way, especially in domains I don't know - e.g., I just learned CUDA with ChatGPT's help.

It’s not perfect, and neither am I, but it doesn’t have to be perfect to be useful; you can get to millions in revenue through 80% solutions.


I don’t understand the obsession with hallucinations. People are completely wrong all the time; your co-workers ‘hallucinate’ answers all day long. You verify and correct; what’s the difference?


Ask for sources. Ask it if what it has already told me makes sense. Know that hallucinations happen and don't blindly trust.

Learn the limits of your tools.


This has been my experience too


I'm not bearish on OpenAI or AGI in general but I'm extremely meh about it. I'm not chomping at the bit to use it like so many are, and I constantly feel like a huge luddite or something for not being super excited about it.

The value and time it saves makes sense for folks who struggle with a search engine (many) or who are doing tasks that are typically considered menial, like writing emails or coding boilerplate.

However, if you can grok Google and don't mind doing tasks like that (I personally don't mind coding boilerplate stuff, especially since I can learn how the framework works that way), ChatGPT's value is limited (at least in my experience).

Example: I was struggling with a Terraform issue the other day. I used ChatGPT 4 to help me work out the problem, but the answer it gave was really generic, like mashing a few of the top answers on SO together. It also didn't answer what I needed help with. I knew enough about how Terraform worked to Google for the answer, which I eventually did a few minutes later. I could have kept crystallizing my question for ChatGPT until I got what I wanted, but Google was easier.

I'm also not a huge fan of us just being okay with trusting a single and extremely corporate entity (OpenAI) being the de facto arbiter of the truth. At least Google shows you other options when you search for stuff.


"(I personally don't mind coding boilerplate stuff, especially since I can learn how the framework works that way)"

^ Isn't that what folks used to say about programming in assembler? How much time do I want to spend learning frameworks (beyond what I already know) vs. how productive do I want to be?


I fully sympathise with this analogy and I think I have used it before myself. But there is a tremendous difference in practice. A compiler doesn't produce randomly different code each time you run it, while an LLM, no matter how good, will. At that point, if something breaks, you have to take the reins.


Note that OpenAI added the "seed" parameter for deterministic results in their last release.
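With the v1 Python SDK that looks roughly like this (the model name is just an example; OpenAI documents the determinism as best-effort):

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
    resp = client.chat.completions.create(
        model="gpt-4-1106-preview",
        seed=12345,        # same seed + params => (mostly) repeatable output
        temperature=0,
        messages=[{"role": "user", "content": "Say hello."}],
    )
    print(resp.choices[0].message.content)
    # system_fingerprint changes when the backend changes, which breaks determinism:
    print(resp.system_fingerprint)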


That doesn’t help with the issue I put forward. Even if the seed is identical, the output is not deterministic based on the input. A tiny change in the input could result in no change in the output, a little change in the output, a medium change, or a totally new output.


This assumes they'll keep providing access to the model you were using in perpetuity.


Probably, which is exactly my problem.


I had a similar issue. I was trying to sort out how to handle some niche mocking situation with pytest, and despite its confidence and 5 attempts, ChatGPT was unable to produce a functional result or work through the crux of the problem.


> The value and time it saves makes sense for folks who struggle with a search engine

You don't need to "struggle" with Google to get value out of it, you simply need to value your time.

If you want an answer to a question, why waste time reading through pages of search results when you can have an AI do that for you, reporting what it finds?

No, it's not perfect but it's pretty damn useful.


Nope. Learn how to use it in almost everything you do. It’s a game changer.

LLMs aren’t AGI. They’re far from it. But they have massive uses for reasoning on available context.

I’ll give you an example. I’m trying to set up some bulk monitoring for APIs across 200k JVMs. The API documents are horribly out of date, but I get the raw URIs in the monitoring tools.

I can just take these URIs, send them into ChatGPT, and ask for a Swagger spec - along with a regular expression to match each URI to the Swagger API. It figures out the path and query params from absolute paths.

Sure, I could try to figure out how to do this programmatically using some graph- or tree-based algorithm, but ChatGPT basically made it possible with a dumb Python script.

Of course, I may still need a person to fill these in. But just getting a Swagger spec done for thousands of services in an afternoon was awesome.
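For flavor, the non-LLM fallback being described might look something like this hypothetical sketch - collapse ID-looking path segments into placeholders and emit a regex per URI - which is exactly the sort of thing that breaks when the URIs are less regular:

    import re

    ID_SEG = re.compile(
        r"^\d+$|^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
    )

    def to_template(uri):
        # Turn "/users/123/orders/987" into a path template plus a regex
        # that matches the original URI shape.
        template, pattern = [], []
        for seg in uri.split("?")[0].strip("/").split("/"):
            if ID_SEG.match(seg):
                template.append("{id}")
                pattern.append(r"[^/]+")
            else:
                template.append(seg)
                pattern.append(re.escape(seg))
        return "/" + "/".join(template), r"^/" + "/".join(pattern) + r"/?$"

    print(to_template("/users/12345/orders/987"))
    # ('/users/{id}/orders/{id}', '^/users/[^/]+/orders/[^/]+/?$')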


> Learn how to use it in almost everything you do. It’s a game changer.

This type of rhetoric is part of the reason so many compare the current crop of AI to cryptocurrency hype: proponents constantly telling others to shove the technical solution into everything, even where it’s not necessary or worse than the alternative.


I'll put it a little differently. It is of immense help if you really know what you're doing but want to do it faster.

I know where you're going. I've had folks say to me: "I really like Copilot because it enables a beginner like me to write code." This sentiment often comes from folks in non-technical roles who want to create their own software solutions and not have to deal with engineers. I roll my eyes at that one.

You need to be able to spot specific areas of acceleration. Not just tackle it as a hammer for every problem.


I’m surprised you are able to get a whole script working as expected out of it. I’ve tried using ChatGPT just for individual lines of code, and it’s always coming up with a solution that’s, I guess you could say, far too “creative” to be useful, and that often doesn’t end up doing what I expect when I go to test what it’s given me.


I had a very specific ask. I gave it the URIs as comma-separated values and asked for the Swagger spec. There wasn't scope for creativity.

I could also split the URIs by service name. That helped parallelize my questions. It wasn't just dumping the data in; there was some cleanup behind the scenes that I had to do.


GPT-4 or GPT-3.5?

They're completely different products.


We need to stop propagating this nonsense. GPT-4 still messes up.


OP said they're different products, which is very true. Yes, GPT-4 still makes mistakes, but it's leagues ahead of 3.5.


It's not nonsense at all. The degree to which GPT-4 messes up is dramatically lower than 3.5.


The nonsense is the automatic conclusion that when ChatGPT messes up, it must be 3.x, which is demonstrably false.


But no one said that. GP just said they were different products. And that is true. It is meaningless to show an error that GPT 3.5 produced in a conversation about 4. They are separate products.


It depends what you think the goal is, AGI or making a ton of money. OpenAI doesn't seem that close to AGI.

But in terms of value creation, they have turned numerous industries and jobs on their heads: things like copywriting, or how they are destroying Stack Overflow and Quora. The next lowest fruit they are disrupting is front-line chat/email support - this is usually never part of a core product, but the market is massive; almost every company needs support - look at Zendesk, or imagine the costs of Uber's offshore support army.

They are going for the AWS platform approach - for every niche GPT wrapper that gets a modicum of success and has happy paying users, those users are likely to stick because it improves their work in obvious ways, and OpenAI gets their slice. Think of how AWS made it easy for anyone to spin up a service; sure, some failed, but the hurdle is much lower - with such a powerful general model, I don't need to spend millions training my own to launch. The issue for them is that it'll likely be less sticky if competitors/open-source models catch up - that hasn't happened yet, but it might.

I've never seen such disruption to ways of working in so many industries in my lifetime. If you're on HN you may not see any use for it in your specialised work, but at the entry level (the majority of workers), doing writing work in 30% of the time has been game changing.


To me the simplest analog is Amazon Prime. Bezos wants Prime to be something you can't afford to skip by having it deliver overwhelming value, and to that end they've done pretty well: 70% of Americans have Prime.

But software scales so ridiculously well that their cloud offering still manages to beat it on profit.

OpenAI has just landed a new class of subscription that scales like AWS, has B2B hooks like AWS, can be the engine behind entire classes of future unicorns like AWS... but then also has widespread consumer value and brand recognition like Prime.

By that measure it's hard not to be bullish.


ChatGPT has already saved me from hours of Googling when I'm trying to find out how to do certain things. It almost feels magical - I don't have to read through half-dozen slightly different variations of what I need to do.

Before ChatGPT, to find the answer to things like "how do I set up Gunicorn to run as a daemon that restarts when it fails" I would have to endure hours of Googling, snarky stack-overflow comments that I shouldn't do that, etc. As a solopreneur without access to a more senior engineer to ask, it's been fantastic. I've been quite skeptical of machine learning/AI claims, but I feel like I'm experiencing a genuine case of a technology that's proving to be so much more useful than I had imagined.


I know this is just an example, but I think it’s emblematic of the main issue I have with the widespread use of LLMs.

Do you mean, “have Gunicorn keep N workers running?” If so, that’s in the manual (the timeout that kills silent workers defaults to 30 seconds).

Or do you mean “have Gunicorn itself be monitored for health, and restarted as necessary?” There are many ways to do that – systemctl, orchestration platforms like K8s, bespoke scripts – and all of them have tricky failure mechanisms that a casual copy/paste will not prepare you for.
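For that second reading, even the minimal happy path - a hypothetical systemd unit, with invented paths and names - hides those failure modes behind two innocuous-looking directives:

    [Unit]
    Description=Gunicorn for myapp
    After=network.target

    [Service]
    User=www-data
    WorkingDirectory=/srv/myapp
    ExecStart=/srv/myapp/.venv/bin/gunicorn myapp.wsgi:application --bind 127.0.0.1:8000
    Restart=on-failure
    RestartSec=2

    [Install]
    WantedBy=multi-user.target

Restart=on-failure says nothing about health checks, crash loops, or socket handoff - exactly the context a copy/paste answer omits.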

Blindly using answers from ChatGPT is no different than a random SO post, and you are no more prepared for failure when the abstractions leak.


This is also my current worry. If you know the concepts (not the workflow) behind the problem you're solving, I find it easy to get answers, and in the meantime you'll collect some new knowledge in the process. Even when asking someone, they will often point out the knowledge you lack while providing the answers.

Getting straight answers will be detrimental in the long term, I fear. It feels like living in a box, watching the world on a screen, while the person answering my questions mixes lies and truths.


I read this

> snarky stack-overflow comments that I shouldn't do that,

and realized they're probably using the chatgpt equivalent: nice, corporate answers that you probably shouldn't use


Well, you're no longer "solo" as an entrepreneur. It's like hiring an indefatigable intern to do all the research and lash together somewhat useful code, solutions, etc.


Do you feel like you are missing out on a learning opportunity when you aren't doing hours of research? Sometimes I come up with other good ideas or learn new things amidst researching a problem.


I think AI is definitely a great technology but not to the extent it is being hyped at the moment. I am not necessarily bearish on it but not too bullish on it either. I will wait 3-5 years to see where it ends up. Right now, there are too many people trying to make a quick buck in AI. People will downvote me for saying this but it gives me Crypto vibes currently.

Also, would you really trust AI for everything? I wouldn't. Nothing beats the human element. At best, AI should be used as a supplement to speed things up, which it is great for. I personally would never rely on AI to do everything for me. Not to mention that it cannot be 100% correct.


> What am I missing here?

> I just don't see this company (or any others for that matter) GPTing their way to AGI

> I'm not saying the company will go bankrupt but I'm also not buying into the hype that it's going to become the next Google or better / create AGI for us all.

What I'm missing is the connection between AGI & profitability. OpenAI has huge revenues from ChatGPT which look set to continue - they're distinct from cryptocurrency in that those paying for the product are doing so for service provision rather than speculation.

I'm thoroughly unconvinced we'll ever see AGI - I see zero connection between that and OpenAI being successful.


Yep, AI doesn't need to be AGI to be profitable or useful.

I am bullish on AI but don't see AGI happening, yet I developed AI solutions that solved real-world problems, made companies tons of money, and helped people solve non-AI, non-IT problems.

So, I never buy the AI-crypto-equivalence.


Yes. Ben Thompson has recently written a lot of commentary about it and, to be fair, he seems quite bullish on it.

But so far to me this seems to be almost universally loved by programmers while I don’t really know anyone else who uses it at all.

I think after the past 15 years which saw some of the most rapid technological advances in history along with the greatest bull market in history, people’s credulity is off the charts.

But to me something just feels off about this entire AI/NLP complex. For one, I agree that it's largely oversold on features. Also, every single enterprise software company is attempting to jump on the bandwagon and everyone on LinkedIn seems to be posting about it every day. Most people who talk about how revolutionary it will be have absolutely no track record of being correct on similar calls, and on the whole, if I had to bet, they probably were highly skeptical of the new tech over the past decade that actually was revolutionary.

I also agree that it feels very similar to crypto. I don't think it's a coincidence that both were largely enabled by advances in Nvidia chips. It may sound absurd to most, but I actually believe NVDA is the most overpriced stock in the market now by a large margin and is sort of in its own bubble. There has been a headlong rush to stockpile their chips in anticipation of NLP taking over, but I predict it is going to result in an eventual glut of oversupply that's going to put downward pressure on the semiconductor market, potentially for a year or more.


> to be almost universally loved by programmers

If AI-generated code is considered acceptable in your project then you aren't using a powerful-enough programming language. And you're paying the cost in code bloat.

How many Coq programmers find ChatGPT useful? How many nontrivial Coq programs written (and not merely memorized) by ChatGPT even pass type checking?

If you're considering AI-written code then you have a bloaty-code problem. Letting an AI write bloaty code conceals the symptoms (keystroke count) but tech debt will still kill your project sooner or later.


Interesting take. Very elitist and not rooted in reality though, I must say. The overwhelming majority of code out there is in languages more verbose than strictly necessary, and less expressive than possible. So in real life, yes, AI-generated code is a helpful and worthwhile thing. Not everyone can swing their Coq around at work.

That said, I do agree with the general notion. I find the more verbose the language, the better the help. Dense, more expressive languages fare worse. I’m referring to Python and Rust in my case, so one factor is of course massively larger training corpus for Python, and relatively more churn in Rust.


How many Coq programmers 'anything', tho?


grammar harder


Hm, my wife uses it daily for her classes, which aren't tech related. She describes it as a very patient professor who meets her where she is and doesn't judge. English isn't her first language either, and sometimes she asks what a term means in English.


> English isn't her first language either, and sometimes she asks what a term means in English.

That’s a good example of a subpar use for an LLM. Dictionaries have existed since before computers. Digital dictionaries are faster, more reliable, and less power-hungry than any LLM.


Perhaps it would be better not to tell other people that the way they use a technology is "subpar".

Parent's wife is presumably a smart person who found a way to use technology in a way that benefits her, without knowing a lot more context, I think it's a bit silly to leap to the conclusion that she is "doing it wrong".


Dictionaries can’t give you contextual answers, clarify usage in a sentence, or explain why you’re using a word wrong.


Not really.. I think you don't know what you're talking about :)

It's a chat: she is talking about a subject, and then she can expand on a word, all very naturally, as if she were talking with a human.

That's very different from reading about some subject and then having to consult a dictionary, which has no context about the subject and doesn't talk back to you.


You don’t have to “carry a dictionary”. Digital versions exist and can give you definitions right in whichever app you’re in, with usage examples. Grammar and spell checkers do it with context. This is not new technology. And you’re conveniently ignoring the points of speed, energy usage, and required internet access.


I don't know how to tell you how off the mark you are. Perhaps try studying a subject in a language you aren't fluent in and compare the experience using ChatGPT vs. traditional methods? I don't know what else to tell you; it has been life changing for my wife in a way that dictionaries haven't been.


> Perhaps try studying a subject in a language you aren't fluent in and compare the experience using ChatGPT vs. traditional methods

Perhaps you should consider I have done that, instead of assuming things without knowledge of my experience and only worrying about repeatedly telling other people they are wrong about a subjective experience.

As you keep ignoring important points, including reliability which is paramount for the subject at hand, unfortunately I can no longer assume the conversation is happening in good faith.


My wife's students just seem to use it to cheat, turning in nonsense papers.


AI is indeed kinda like crypto, and all the OpenAI wrappers are doomed.

OpenAI itself will be fine though. Their lead has a snowball effect with all the training data they get. And I'd guess they will succeed at their regulatory capture attempt, and create some horrendous pseudo monopoly. Meanwhile, they can just implement what the most successful wrappers do themselves.


When I've tried to use it myself for verifiable things (i.e. code basically) I've had the 'confidently wrong' experience.

When I've seen colleagues visibly use it (i.e. mentioned in commit messages) that confidence has rubbed off on them.

Given that, why would I believe it when asking something medical, legal, historical, or otherwise outside of my domain or that I can't somehow verify?


You should be able to verify facts outside of your domain of expertise. Every time a new study comes around saying “actually, everyone should be drinking 3 glasses of wine a night”, do you take a trip to the liquor store?

LLMs are far from the only thing you rely on that will confidently lie to you.


Some things are inherently not verifiable, but actually what I meant was something like 'verifiable easily by me in a reasonable amount of time that makes the whole exercise worthwhile'. You can make some medical claim for example and describe your (or whoever's) methodology sufficiently for me to reproduce the result in theory, but I'm not going to.

> Every time a new study comes around saying “actually, everyone should be drinking 3 glasses of wine a night”, do you take a trip to the liquor store?

No, and doing so would be an example of blindly trusting something that you haven't or can't verify, so that supports my argument?

> LLMs are far from the only thing you rely on that will confidently lie to you.

Sure, but it's a new class of thing that will and does, and yet people are trusting it or haven't yet learnt that they can't. I mentioned seeing people trust it via commit messages; I don't see SO the same way - people generally realise they need to verify it, and it at least has a voting mechanism as a proxy. With GPT, so far there seems to be a lot more assuming-it's-correct going on.


Bing Chat actually lists its sources.


The thing is, if I read a study that says “actually, everyone should be drinking 3 glasses of wine a night” I disregard that study as clearly nonsense and would question any future studies I saw by that author.

But with ChatGPT we're not saying to disregard what it says; we're saying to disregard only some of what it says, and not to disregard it in future. That becomes a lot of work, checking everything it says every time.


You're missing that ChatGPT has been immediately, practically, and extensively useful to millions of people.

Crypto, after more than a decade, has been useful only to criminals and scammers.


Kinda hilarious how this comment concisely mixes logical, factual, and semantic errors.


Hope you didn't lose too much on crypto


I'm bearish on LLMs not because they're not helpful; they are. I'm doing theoretical physics, and GPT-4 is useful multiple times daily. It didn't replace Google or anything, but it is a useful additional tool. It just does not feel like "$90B valuation without profit" useful or "multiple unicorns doing more or less the same" useful.

Of course, gpt-5/6/7 can become more valuable to end-users, but that's the second reason I'm bearish. LLMs are powered by exponential growth, and no exponential growth is infinite. We are already using a significant part of all existing data, and going up more than 1-2 orders of magnitude in either data or compute feels unlikely unless there is some breakthrough. Breakthroughs are hard to predict, but they are generally rare, and likely there won't be one soon.


My take on it is that GPT is already a general-purpose technology. It can already be used to solve ill-defined coding problems that were not possible to solve a couple of years ago.

I feel that some people lack the creativity to use it.

GPT is only as good as the user is at posing good, well-defined questions and tasks.

Its ability to perform few-shot learning is astounding.
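For anyone unfamiliar, few-shot here just means putting a couple of worked examples in the prompt itself - no fine-tuning involved. A toy illustration (the tickets and labels are invented):

    # Show the model a few input -> output pairs, then the new input;
    # it infers the pattern from the examples alone.
    prompt = """Classify the sentiment of each ticket as positive or negative.

    Ticket: "The export button finally works, thanks!"
    Sentiment: positive

    Ticket: "App crashes every time I open settings."
    Sentiment: negative

    Ticket: "Loving the new dashboard layout."
    Sentiment:"""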


Do you have example chats you can link to? Too many times, I find people make claims like this about user error, then provide chats where the code doesn't actually work, or where it would've taken just as long to Google as to go ten rounds crafting just the right prompt.


I am, personally... at least until it's all done on-device and functions offline. I have no trust, given the siphoning of data we see at large.

I suspect I'm fairly alone on this. They'll probably do well without me.

Most people that even know about it probably don't mind. I can't even verbalize why I do


This is basically it for me.

I've had a play around with some OpenAI-powered sites and it is neat how much it is capable of, but I feel uncomfortable typing personalized prompts or detailed questions into a system where I know everything I type is going to be harvested. You could argue that by commenting on HN or posting anywhere in the internet everything I type is also going to be harvested (perhaps into the very same models), but that contract was always clear. There is a difference between companies using information I have chosen to share publicly, and companies doing the same with what is presented as a private exchange.

But once they can fit that mini GPT into my pocket, and the learning it's doing is truly personalized to my own install... for me that will be a much more appealing product. I guess the technology will get there, eventually.


>I am, personally... at least until it's all done on-device and functions offline.

You could do it now. Apple computers with a lot of RAM are pretty good at running Llama 2.


Not really; I'm talking about a mobile phone and near-OpenAI quality.

My workstation has allowed me to dabble - I'm familiar; a unified pool of memory does very little for me.

The experience with self-hosted stuff leaves a bit to be desired, both in generation speed and content.

The software needs work, I'm not saying we won't get there... just that we haven't, yet.

With a ridiculously beefy system I can eke out some slow nonsense from the machine. It's neat, and I can do it; I just don't find it very useful.


> What am I missing here?

I’m guessing you haven’t actually been using it personally beyond some superficial examples.

Once you use it regularly to solve real-world technical problems, it's a pretty huge deal. The only people I've met so far who voice ideas similar to yours simply haven't used it beyond asking it questions it isn't designed for.


Anything beyond one-off asks is pretty hit or miss, at least for me, as to whether what ChatGPT is telling me is correct or not. Write me a complex SQL query that does this, write a Python script that will do that, show me the regex that will find these patterns in a string - all of those work really nicely and do save time.

When anything gets more complex than that, I feel like the main value it provides is to see what direction it was trying to approach the problem from, seeing if that makes sense to you, and then asking it more about why it decided to do something.

This is definitely useful, but only if you know enough to keep it in check while you work on something. Or worse, if you think you know more than you actually do, you can tell ChatGPT it's wrong and it will happily agree with you (even though it was correct in that case). I've tested both cases: correcting it when it was really wrong, and correcting it confidently when it was actually right. Both times it agreed that it was wrong and regenerated the answer it gave me.


> I've tested both cases: correcting it when it was really wrong, and correcting it confidently when it was actually right. Both times it agreed that it was wrong and regenerated the answer it gave me.

This is the peril of using what really is fundamentally an autocomplete engine, albeit an extremely powerful one, as a knowledge engine. In fact, RLHF favors this outcome strongly; if the human says "this is right", the human doing the rating is very unlikely to uprate responses where the neural net insists they're still wrong. The network weights are absolutely going to get pushed in the direction of responses that agree with the human.


The "just autocomplete" view is incorrect. I have actually had it push back on me when I incorrectly said that it was wrong.


I second this. It's been immensely useful to me, even with the occasional fabrications.


I wonder if anyone has a favorite pointer or two to their favorite real world examples.


> I am using Linux and trying to set up a systemd service that needs X to display a UI through VNC, how can I get the X authorization token in my systemd file

> I'm using python and my string may contain file paths inside, for example: (...) , For anything that looks like a filepath inside the string we should replace its full path, for example (..)

> Can you write me a python script to kill processes in Windows that no longer belong to a process named "gitlab"

> I want to write a test for it that mocks both sentry_sdk and LoggingIntegration so that I can give my own mocks during test

> I want to create a Python script

It should be able to be called from the command line, like below (example)

Write me the script

;;;

All real examples from last week that took me 1 minute to solve instead of Googling or creating from scratch / thinking about it. (The third one is sketched below.)
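To give a flavor of the third one, here is a hedged sketch with psutil, under one plausible reading (kill gitlab-spawned processes whose parent has died); the name matching and the intent are my guesses:

    import psutil

    for proc in psutil.process_iter(["name"]):
        try:
            if "gitlab" in (proc.info["name"] or "").lower():
                parent = proc.parent()
                if parent is None or not parent.is_running():
                    proc.kill()    # orphaned: its parent process is gone
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            continue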


Asking questions that it's not designed for? Which would those be?


What is the meaning of life?

When will humans live in space?

Why am I depressed?

When will world war III happen?

Compute this math equation (function calling and compute engines will help with this)


If you are really claiming that these are the questions that people are asking when they say ChatGPT isn't useful, then...that is an unbelievably blatant straw man.


I’ve seen non-technical people ask the strangest questions, with no real problem they’re trying to solve or brainstorm about. They think it’s just a game or a fun joke tool and want to try to get it to say something silly.

The technical people who reject it are quite curious psychologically; my personal suspicion is that they are threatened by it. They get hung up on small hallucinations and then almost get giddy when it produces something “wrong” in some way. I don’t understand why they fail to understand its crazy importance. I mean, it’s read everything, and without an agenda other than the material it was fed: no twisted incentives. The things I've had it do with me are mind blowing; my guess is the people who understand it and how to leverage it will increase their own productivity so much that it will reshape economies and put many people out of work who don't learn how to use it.

Definitely the most revolutionary development of my life, as I approach 50, have been coding since 12, and, believe it or not, have been coding professionally since I was 19. The Internet and the iPhone have nothing on this development with LLMs.


> They things I’ve had it do with me are mind blowing, my guess is the people who understand it and how to leverage it will increase their own productivity so much that it will reshape economies and put many people out of work that don’t learn how to use it.

This is the kind of hyper-sensationalism that I'm talking about. Do you really believe that, or are you extrapolating to what could be possible in the future if the technology keeps improving? I feel like that is where a lot of the arguments with crypto advocates ended up as well: if you had doubts or questions about how big of a paradigm shift this was going to be for the world, you just didn't get it yet because you couldn't connect the dots this early on.

I'm not doubting that the tool is useful, or that ChatGPT is quite an accomplishment, but I just don't see it "reshaping economies" anytime soon.


> This is the kind of hyper-sensationalism that I'm talking about. Do you really believe that, or are you extrapolating to what could be possible in the future if the technology keeps improving?

I don’t just believe it; I already see it happening directly. I have made changes in hiring strategies and employment situations based on massive increases in productivity from using LLMs.

The difference in productivity between developers who embrace the new tools and those who don’t is very obvious, in my opinion. It’s probably in the next 18-24 months that the impact becomes more obvious on larger scales.


Yep, totally agreed. I have very junior developers doing complex tasks with the aid of it - for example SQL queries, Elasticsearch, Ansible, React - all without having touched them before.


I am.

The underlying tech is amazing. Where LLMs are headed is wild.

I have just lost a lot of confidence that OpenAI will be the ones getting us there.

The chat niche was an instance of low hanging fruit for LLM applications.

But to design the core product offering around that was a mistake.

Chat-instruct fine tuning is fine to offer as an additional option, but to make it the core product was shortsighted and is going to hold back a lot of potential other applications, particularly as others have followed in OpenAI's footsteps.

There's also the issue of centrally grounding "you are a large language model" in the system messaging for the model.

So instead of being able to instruct a model "you are an award winning copywriter" it gets instructed as "you are a large language model whose user has instructed you to act as an award winning copywriter."

Think about the training data for the foundational model - what percent of that reflected what an LLM would output? So there's this artificial context constraint that ends up leading to a significant reduction in variability across multiple prompts between their ChatCompletion and (deprecated) TextCompletion APIs.

They seem like a company that was adequately set up to deliver great strides in advancing the machine learning side of things, but as soon as they had a product that exceeded their expectations, they really haven't known what to do with it.

So we have a runaway success while there's still a slight moat against other competition and they have a low hanging fruit product.

But I'm extremely skeptical given what I've seen in the past 12 months that they are going to still be leading the pack in 3 years. They may, like many other companies that were early on in advancing upcoming trends, end up victims of their own success by optimizing around today and not properly continuing to build for tomorrow.

If you offered me their stock at a current valuation with the stipulation I wouldn't be able to sell for 5 years, I wouldn't touch it with a 10 meter stick.


I think many people are really underestimating the "intelligence" of an LLM. People have this misconception that LLMs are just complicated Markov chains, predicting text based purely on probability. They are not. As it turns out, to accurately predict text, you need to learn a lot about the world. In fact, there are lots of fascinating things hidden in the weights of LLMs, like a geographic model of the world. [1]

To me, this is the most important part of ChatGPT. GPT-4 has some massive shortcomings, but to me it's clear that this road we have started to head down is producing actual intelligence, in the real sense. 5 years ago, AGI felt completely intractable to me. Now, it feels like an implementation detail.

[1] https://twitter.com/tegmark/status/1709572469978231063


Information compression is nothing new. Of course if you are trying to get an LLM to produce accurate data about geography, it would make sense that instead of trying to catalog every fact in a lookup table, there would be some sort of model that you can run queries through. The LLMs just happen to be a neat way of building those models automatically from data.

The problem is, this sentiment of a path towards AGI assumes that everything (i.e., our physical reality) can be compressed, which quite likely is not true.

The future of AI is basically better and better compression with more data sources. You will be able to do things like ask how to build a flying car in your garage, and it will give you step by step instructions on what things you need to order from where, including CAD drawings to get custom made CNC parts, and how to put it all together (including software to run it).

As far as AGI goes, it's possible that through AI-assisted technology we will be able to measure the synapses of a human brain in fine enough detail, or somehow mass-surveil the entire population of Earth and derive human brain models from that. And then, with optimizations on that, we can potentially arrive at the most efficient human brain in terms of some metric, and we will have very good robotic assistants, but nothing really beyond that.


Would my life become worse without LLMs like ChatGPT? Yes, 100%. I use it more than Google nowadays - actually, I have the Sider extension that sends queries to both Google and ChatGPT - and for many queries I find ChatGPT's answer better.

Would my life become worse without crypto? Actually, it became better. I sold all my crypto; Coinbase made it painful to deal with them, and they jacked up transaction fees. That money is now in good ol’ stocks.

So, about OpenAI specifically I can't say, but AI in general, trained on the entire knowledge set of humanity and able to reason from it, will become ever more valuable.


OpenAI has changed education.

I'm a teacher who is constantly learning new things. I can learn things I would have never been able to learn before because of AIs like ChatGPT. My students are learning more and faster than ever before.

Learning Management Systems like Canvas and Blackboard made a lot of money. I could argue they are obsolete now.


> OpenAI has changed education.

No, it objectively has not. Maybe it has changed how you teach, but “education” is much larger than any of us. Until school curriculums around the world incorporate ChatGPT—quite the dystopian scenario—they have not changed education.


Obsolete does not mean something is no longer used. It means that new technology beats the obsolete system every time.

I'm spending most of today updating my Canvas courses, and Canvas is obsolete. Students are much better off asking AIs how to do things and what they should learn next rather than working through my Canvas courses.


You’re arguing against a point I did not make. Seems like you wanted to respond to one of the other replies.


Learning management systems are far from obsolete. They exist because schools and teachers can no longer make their own websites to host material, tabulate grades, and accept submissions. AI tools do entirely different things than Blackboard and Canvas.


No way Canvas and Blackboard will be obsolete in a few years. Schools can't maintain all this infrastructure on their own.


The infrastructure will not be needed. Why keep a static book or a semi-interactive LMS when AIs act as tutors and mentors?


1. It's not a complete replacement

2. Schools are slow to adopt new tech


You can also learn things on Google and YouTube; I don't see how ChatGPT makes anything better in education.


I think both Bitcoin and ChatGPT are revolutionary.

I use Bitcoin regularly, because I live in a third world country where it's really hard not to get your salary seized.

I use ChatGPT every day for lots of things and it has replaced Google search for me. And StackOverflow, of course.

Notice how I said BITCOIN and CHATGPT. Not "crypto" and "ai".


Well, regarding your last sentence, I think that’s what OP is getting at.

ChatGPT is one great product. Bitcoin is one great “product.”

But the successors in the same category aren’t doing anything wildly more useful than the original “killer app.”

Crypto never really revolutionized finance; it just provided one solid digital currency product. Smart contracts and NFTs went pretty much nowhere, and I struggle to identify any way they are used in a widespread manner.

You and I are using ChatGPT regularly and it helps us quite a lot, but it hasn’t revolutionized life nor has it turned me into a 10x developer or something like that. It’s a service that is collecting $20 a month and that’s about the extent of its economic value so far.

In other words, “Replacing Google and Stack Overflow” is arguably not that exciting.

(Then I end up going back to SO/Google when ChatGPT tells me shit that is wrong)

I do think LLMs have way more potential than “crypto” but it remains to be seen how much more that is


I agree. The benefits of Bitcoin started and ended with one thing: removing the middleman. We're seeing its real benefits in war or sanction contexts. If I want to move money from one country to another, without any KYC bs, my go-to is Bitcoin, and it has worked for me for the last decade. As a store of value, I'm not sure if it's good, but that's secondary. So it's kind of a niche thing, and frowned upon because people will also use it for "bad" things, e.g., ransoms, terrorism, buying drugs, etc. But for me, a degree of crime is necessary for a healthy civilization.

All the crypto killer apps lost the plot, especially NFTs.

As for LLMs and more specifically, ChatGPT, they are not a niche thing so I agree in that their potential is way bigger. I'm not yet sure what that is, but I think it will change things profoundly. Replacing Google/SO is just a side effect of something bigger. But that's just my humble opinion.


This is a good point. “crypto” and “ai” are very much buzzwords for scammers.


Can you elaborate on "where it's really hard not to get your salary seized."? How and why?


I figured no one in the AI space today is going for AGI because we just don't have those models. Companies don't do novel work. They find existing novel work that has known outcomes to invest in and iterate upon. There's still a ton of value in present-day AI. Compare Siri to ChatGPT and it's night and day how much better ChatGPT is for most people's basic queries and tasks. That is valuable.


Imagine if Google censored your search results because they might be usable by a child or bad actor.

Imagine if they kept doing that despite having your credit card information because you paid for Pro, which more or less proves you're an adult who deserves the presumption of innocence/good-actordom.

Lastly, imagine that they do this for all users despite the fact that it is known to reduce the intelligence of the output.

(I'm bullish on self-run models)


Maybe you genuinely have no use case for ChatGPT. Maybe you just haven't been creative enough to figure out how to use the tech as it currently is. Forget AGI. What it is capable of right now, for me, and countless others in countless fields of work, already saves more time per day than everything else combined. There's certainly nothing more important than my time. That's a pretty powerful product.


I had a horrible task today: adding a ton of unit tests to a new project. With Copilot it took a couple of hours instead of the 2-3 days it would have taken.


I wholeheartedly agree.

* Problem statement *

The actual value of GPT is the spontaneous creation of spoofed data. Some of that output answers really tough questions. Stop thinking at this point and reflect.

* Value assessment *

There is some amazing potential there for performing test automation where either large data samples are demanded or the sophistication of output exceeds the capabilities of prior conventions. Taken one step further, there is further value in using this authentic-looking output for testing against humans, for market acceptance testing or bias validation.

* Real world use *

When real-world use significantly departs from the value assessment, there is a problem. Faking student papers or hoping the technology writes a legal brief for you forces a complete realignment of the potential revenue model, with different clients paying different amounts.

* Expectation *

Unless this technology can actually make money in a recurring and sustainable way past initial trends, it will be an investment black hole just like crypto.


As an aside, when did we start using "bearish" and "bullish" to refer to sentiment outside of financial instruments?


Anecdotally, I noticed it at the start of 2020, which coincided with the flash bang mass adoption of day trading apps, which has thankfully faded.


I am using it to reference the perceived future financial situation of a company, so I felt it was an appropriate term to use.


I've replaced 90% of my daily Google searches with ChatGPT sessions. I don't expect the AGI apocalypse, but I do expect that OpenAI will attain something like Google-level corporate juggernaut status.


What used to take me several Google searches and reading a dozen linked pages can now be done with a well-crafted sentence. The footnotes in Bing are good for validation of sources.


LLMs are the first interface that I, as a software engineer, am impressed by.

My company also got it immediately and is rolling it out globally.

GitHub Copilot is already helpful. GitHub Copilot for Docs (announced at GitHub Next) is a game changer.

I used OpenAI to reformulate emails and suddenly got positive feedback about my announcement emails.

I communicate with OpenAI in German and English, however I see fit.

It's very hard NOT to see the value, rather than the other way around.

And we got so far with only one company pushing this!

There is no option for the others but to also pour tons of money into AI.

And besides OpenAI, AI/ML is huge in what Nvidia and others are doing with it: texture compression, 2D and 3D generation.

What we see is also potentially something like a new operating system.

And it makes it so much more accessible.

I never had a tool like LLMs, which can take a bad copy-paste of a PDF and pull facts out of it.


> I communicate with OpenAI in German and English, however I see fit.

I did not foresee that you could mix input languages. That's fascinating. Multilingual people use languages for different purposes, often in the same sentence, e.g. technical jargon in English, a joke in Arabic, etc.

I expect the difference in connotation/feeling/mood to be less relevant for an LLM, if you’re working with facts. But there was a recent post showing LLMs performing better when you said you were stressed/scared. Did you notice any such differences for your multi-lingual inputs?


No. I primarily switch when I can't remember a word.

I also describe things and that also works quite well.


It's hard to say because of the compute requirements. Over time, local models will become much more efficient and "good enough" for most use cases.

But for a time, boundaries will be pushed that require more compute, and providing that may be a good service. The hardware is so expensive that I imagine their margins can't be very good, though. I'd be interested to see their business plan, because the current version of OpenAI, in terms of what it offers, doesn't seem that compelling when extrapolated out 5 years without some other innovative products.

I honestly think Apple will dominate the personal AI angle once they get there. What's left is business and that will be more competitive.


To know whether something is bullish or bearish, we need to know not only its value to customers, but also its ability to differentiate itself from competitors and whether it will be able to maintain the value margin. One clear value that OpenAI can offer customers right now is the ability to do inference at marginal cost. Even when open source models get close to the quality of ChatGPT, they will be expensive to run, let alone to fine-tune or update. Knowledge is a constantly evolving thing. If you don't have a big budget, it will be very hard to keep up, unless we can find a way for the community to pool resources to train open source models and keep them up to date.

That might be a bit different for big companies where they want to run their own models.

The other factor is that Google and others are certainly not going to sit still. There is no reason to believe that someone as resourceful as Google cannot come up with something as good as ChatGPT, if not better. Companies like Meta are playing the open source card, so they will be the first to benefit directly from the community. So the market will change, and dramatically. It's far too early to bet on any of them (or none of them). My approach is to diversify, wait, and see.


Is it really like crypto? I can think of plenty of ways it is decidedly not like crypto, key amongst them being an end-user value proposition that isn't inevitably some flavor of snake oil.


Agreed, but the main thing reminding me of crypto in this case is the combination of hyper-excitement, adoption, and evangelism around the technology, and how defensive people get when you start to ask questions about why they feel so strongly about it.

There is definitely value here, I use the product a lot myself, but I don't agree that the value is as high as the majority of people seem to think (ChatGPT is going to reshape economies, every industry will replace 90% of humans with some form of AI soon, in more extreme cases that AGI is close to happening, etc...)

I wanted to see if anyone here had examples or use cases that could make me think otherwise.


For me these LLMs are good search engines. An AGI would use them to search for information, but they aren't the AGI: they lack whatever drives intelligent systems to form a fuzzy idea of where they want to be, search for the clues, and put them together to get there. Why isn't ChatGPT interrogating us about where we are, what that "outside the machine" is, and how it can get out? The AGI will ask the questions.

"Computers are useless. They can only give answers" - Pablo Picasso



Give it some time to get situated in the marketplace. There are a ton of "middleware" companies that are going to get absolutely crushed by OpenAI when businesses commit.


What I'm seeing in this thread is that OpenAI is actually diminishing the appetite for AGI, or muddying the waters in terms of what AGI even means. Most of the commenters here, as knowledge workers, should be well aware of the value real AGI would bring to the table (assuming, of course, that it is more cost effective than hiring a real human -- even if it's quite cheap, humans may be cheaper in some parts of the world -- not that I condone such inequity).

Nevertheless, regardless of whether OpenAI is close to AGI (I don't think so) or what value LLMs bring to the table (definitely non-zero), the problem is that LLMs are being increasingly commoditized and no one has a real moat here. I think that's why these firms are so desperate to kickstart regulations and are trying so hard to somehow pull the ladder up behind them.

OpenAI's fears don't come from "no one understands LLMs", but rather from too many people understanding them, and from large models having already fallen into the hands of a community that can do more with them in a week than OpenAI can hope to do in a year. Ever-larger models might be out of the reach of the public, but real-world value is more likely to come from a well-prepared, smaller model (cf. Vicuna) that doesn't cost an arm and a leg to run inference with, and building these is cheaper than most might think.

If I had to point to a company and call it as a market leader here, I would point to Meta, not OpenAI. Meta has a huge workforce working for free on its model, after all, and they have made progress at a rate that bigtech cannot match in their wildest dreams.

There are also far too many eyeballs on this, in my opinion. For a company to truly dominate a market it needs a bit of air cover for a while building what will eventually be its moat.


> This feels a lot like crypto

Why do I keep hearing this and where is it coming from? I hear it so often that it feels like an agenda being pushed but I can't imagine from whom.


I think it's just that crypto was the most recent hyped thing. To me the developments in AI feel more like 1995 for the web. It was just starting to get useful and was obviously a thing with a big future but a lot of the early companies were flakey.


Nowadays, all the major browser engines are open-source. Hopefully that will also become the case with AI.


There are surely some parallels. Crypto actually had some noble goals: democratized access to financial services, mitigation of inflation. But once it got popular, real-world use diverged a lot from those goals; it was hardly used for "normal" payments and had excessive energy use. So maybe there's the similarity with AI: it seemed like the future, like it would make everything better, e.g. for health applications. In reality, people use it as an improved Google or to create low-quality content. In fact, ChatGPT warns that it shouldn't be used for health applications.

Both still seem very convincing in principle, but real-world use seems to offer little good. I mean, I did find some applications for ChatGPT, but I have a bad gut feeling using it. So I wouldn't be surprised if, e.g. through the amount of fake content, the whole AI-fied web just drives people away. (Similar to what is happening to some social networks.)


Whether or not you think it's a fad, ChatGPT has close to 200 million active users. No crypto company was ever close to that.

Whether OpenAI’s API is a viable or risky product is a secondary and separate question. Yeah, there are a lot of wrappers out there. But that doesn’t matter with regards to the usefulness of LLMs generally.


And crypto is often searching for a use case that it can serve better than other technologies. Payments, for example, have advantages in crypto that the average person doesn't care about. The disadvantages, however, can be killers (e.g. refunds for fraudulent transactions on your credit card have no equivalent in crypto).

The other problem with crypto's search for the killer app is that things like NFTs, which make no sense in the world of 'free' information, became ridiculously hyped and gave crypto a bad name.


I don't think you are missing much. Most people are only Almost Intelligent and looking to make a quick buck.

LLMs are just better content search/generation. But the generation part messes it up: since these models have no concept of right and wrong, the output is fictional. This is OK if that's your goal, but if you are looking for accurate information then obviously this becomes a problem.

Most of these new "technologies" (AI/blockchain ... the hyped-up stuff) only exist because of cheap computing power and cheap capital. None of these technologies have created any real tangible value; it's always some version of the "it's early days" argument.

None of these things will last long by themselves when the economic conditions change.

On another note, I feel AI/blockchain are just tracking people. They are both good surveillance tools.


The big threat, IMO, is smaller open source models running at the edge, not on expensive GPU clouds. Their lead right now is GIANT MODEL + AZURE GPU CLOUD. Anything that undermines that is trouble, because it is so expensive. For a long time, everyone focused on training costs, but the cost of GPU inference at scale is mind-boggling. Current valuations do not seem to take this risk into account.


Doesn’t need to be AGI. I am bullish on the concept of GPT assisted workflows of all kinds.

OpenAI will probably do very well, but there is a chance of disruption. They have a moat, but the nature of AI is that it is a cloud commodity (like, say, Lambda functions), where I can see a competitor making a cheaper drop-in replacement. But to be a threat, they would need to nail scale, LLMOps, etc.


Lately, I had a manual written in Word that needed to be stored on GitHub for easier access and translated into different languages. What I did was extract each chapter as a raw text file and then ask ChatGPT to convert it to Markdown. Not only did it recognize the different headers in the different chapters, it also recognized the pieces of code and the different keywords described in the manual (this is a manual for a programming language). When I asked it to translate the result into French and Spanish, it detected the programming language keywords and did not translate them, as would be the case with deepl.com, for instance, which usually has a hard time keeping the Markdown markup intact. You can see the result here: https://github.com/naver/tamgu/tree/master/documentations


AGI is a big ask.

Revolutionizing how we interact with computers, by allowing us to use plain human language to do things the requester does not understand how to do, seems to have been demonstrated. See even the relatively simple agent demo where an architect used human language to have Zapier take action based on meeting conflicts. IMO this alone is a big deal.


I think OpenAI isn't going anywhere anytime soon, and it will be integrated into lots of different applications and platforms, but it's going to be custom-made, like Canva did. It's going to need more work because of how inaccurate it is, but there are great things for consumers to use in terms of simplicity. Obviously, if everything can just be done by typing prompts into a chat, then why would companies spend billions a year building whatever? That is an issue. So it really depends, and we will have to see what happens over the coming years.

I think it's a bit too early to be bullish on OpenAI, because beyond their GPT models and image generator there isn't much they are doing yet (yet being the keyword), so let's see.


You're 30% correct but not seeing the value. I used to Google every 20 mins. Now it's a few times a week. All questions are answered in seconds and extremely precisely.

I write low-level AI code and it's like speaking to someone that just understands what I'm saying without having to explain every 2 minutes.

This has massively augmented my workflow.

On the topic of AGI: we'll get there in your lifetime. I can see how and why. The new bar is ASI, so consider AGI the current goal-post. We have all the pieces, we're just putting them together ;)

If you want to check out what I'm up to I have a front-end here: https://discord.gg/8FhbHfNp


Hype != vacuity. As long as Google, Amazon, and basically every huge tech firm are following their lead on LLMs and multi-modal models but failing to catch up, there's every reason to be bullish on OpenAI.


A helpful perspective for anyone working with this tech: the LLMs "know" things as a side effect of being taught to speak. More value comes from using this as a basis to augment a solution, like completing code grounded in documentation.

In other words, don't rely on the LLM by itself; it just happens to be able to remember most information as a side effect of its learning. Most important is the ability of these systems to transform knowledge and data when appropriate. Don't use it to read CSVs, for example. A rough sketch of both points follows.


Sure, they lose money on every query, but they make it up in volume.


I would separate the quality from the hubris.

OpenAI and its acolytes are absolutely dripping with hubris. A lot of their peripheral activities seem like PR stunts. I find it really cringeworthy.

But also, I can't see how the future isn't bright for OpenAI. Maybe it won't overtake every single other business in the world, including bakeries and breweries, but at the very, very least it will eat the lunch of many lower-tier white-collar industries. Maybe more than that.

I suppose that's the "Elon defence", except Sam Altman doesn't spew out nonsense the way Musk does, and what they say their product does, it really does. It's not a self-driving robotaxi case. And in either case, Tesla is at least an OK car.


I cancelled my sub months ago when quality tanked and answers got cagey. I have little faith they can provide a good product after months of dev and communication with the company.


Yes and no.

I think it's obvious there's a ton of value in the product and it's a massive force multiplier for certain types of tasks. But it inherently cannot be trusted and still requires someone with expertise to verify and implement.

I don't think they're going to achieve real AGI. I don't think we ever will. I think they'll get something "close enough" and claim they have it, but I don't think the path to AGI is through LLMs.


LLMs won't be the end of the line; OpenAI (and others) have enough money to try a lot of different approaches to augment LLMs and, in the end, even take over.


The concept of AGI is mostly hand-wavy fluff and not thought through very well on any philosophical level. Too many people talking about this have gotten their concepts of AI from science fiction novels, not careful thought and analysis.

The actual tools, though, are definitely useful, even if they still have a ton of issues. Personally, I get basic factual errors from it constantly, but I'm sure this will be worked out in time.


Prompt for ChatGPT4: give me a quick breakdown regarding how i can become a "space bear trainer" who specialises in preparing bears for space missions.

Answer:

To become a "space bear trainer," a role that combines aspects of zoology, space mission planning, and training for extreme environments, you would need to follow a multi-disciplinary approach …


There is at once nothing "wrong" with this response, and it is ridiculous. It is not a question that, without considerable pre-hedging, a person could consider seriously; most would assume an alternative hypothesis for why it is being asked.


Pasting huge prompts into HN comments is really irritating. Make whatever point you’re trying to make without doing that.

It indeed does tend to assume that you’re speaking hypothetically and for a potentially fictional purpose. Which is a better approach, as far as I’m concerned. I prefer that to it constantly being a nanny that questions everything.


That wasn’t my point. Let me try again.

I've been writing lots of Prolog recently and asking ChatGPT questions. Many of my questions have been sincere but a bit like the bear trainer: ridiculous to someone who knows what they are doing. Meanwhile, ChatGPT will answer as if the premise is valid. The answer is valid-sounding nonsense, which may lead you on a wild goose chase, a bit like the aspiring space bear trainer.

It isn't assuming that I'm asking "hypothetically and for a potentially fictional purpose". If GPT is in effect a conditional probability distribution over tokens, it isn't "assuming" at all.

This, IMO, is a clear challenge to sense-making for ChatGPT which is not obviously fixable through fine-tuning. I don't think factfulness is either, because low-contrast examples are hard to train for, especially if they are compounds of true things, e.g. "tell me about logic regression".


In my experience, if you just tell it to do something first (e.g., "before answering this question, tell me if it makes logical sense"), it'll generally do it. Giving it a one-sentence, vague prompt isn't going to be useful regardless. A sketch of the pattern is below.

The fine-tuning aspect I meant was mostly about factual data being incorrect.


I think OpenAI's founding nature is about research, so it will disappear when it either runs out of key problems to solve or out of funds, whichever comes first. I see its commercial efforts as driven primarily by the need to maximize their research runway. Operating ChatGPT commercially also helps research into ML-related UX and operational problems.

That said, I cannot rule out purely commercial ventures, with the tenacity necessary to compete, spinning out of OpenAI.


Other AI companies are less than a year behind OpenAI and I'm not convinced that's enough for OpenAI to 'win out' in the market.


My internet searching has completely switched over from search engines to bard.google.com, because it gives me a super nice summary of the topic I'm researching instead of a list of links. It saves me a couple of hours every week, and is also more fun and less exhausting.


OpenAI less so; small, community-trained open source LLMs more so. We're already seeing this on HF with community fine-tunes of Mistral and Llama 2.

In the same way search transformed knowledge augmentation, LLMs will transform skills augmentation.

Forget about the things you already know how to do well; instead, focus on all the new skills LLMs will unlock for you.


I'm a grad student and I don't really use it for my job, but ChatGPT is something millions of people use daily, so I think it has a future.

Whether it becomes as big as Google depends on what OpenAI does; Google didn't become as massive as it is just off search.


Even without AGI, there are lots of currently expensive-to-scale tasks, like customer service, that it could take over.


A very useful application of LLMs is language learning. As soon as one can speak and listen with them (there are already hacks allowing that), you won't need language teachers anymore.


This is nothing like crypto, which is still a bad solution to a problem I didn’t know I had.

OpenAI is the dominant player in the hottest area and has a significant and valuable product.

No idea who will achieve Strong AGI but ChatGPT is the real deal.


I’m pretty bullish on OpenAI as a company but less so the current LLM hype.

Everyone is trying to make the nth AI company, and OpenAI profits most from it all. Meanwhile, they take the actually good ideas and integrate them into their own offerings, killing the competition.


First: You are absolutely correct to note there are a lot of grifters jumping on bandwagons. I've seen artists hate on AI that they see as ripping them off, and the example given by one artist I know personally clearly looks like someone took one of their actual images, fed it into an img2img model at a very low strength, and then tried to pass the result off as their own.

Second: Transformer models (and diffusion models) are merely the latest hotness in a long series of increasingly impressive AI models. There is no reason at all to assume either is the final possible model, or even the final word from OpenAI specifically.

Third: There is a direct correlation between the quality of output and the combination of training effort and example set size. This is why both image and text generators have improved significantly since this time last year.

Caveat 1: It may be that, as all the usual sources have responded to ChatGPT by locking down their APIs and saying "no" in robots.txt, they are already at the reasonable upper limit for training data, even though more data exists.

Caveat 2: Moore's Law is definitely slowing down, and current models are about (by Fermi estimation) 1000x less complex than our brains. Even though transistors are faster and smaller than synapses by the factor to which wolves are smaller than hills and faster than continental drift, the cost for a 1-byte-per-synapse model of a 6E14-synapse brain is huge. Assuming RAM prices of €1.80/GB (because that was the cheapest I found on Amazon today), that human-scale model would still cost on the order of a million Euros per instance. Will prices go down? I would neither bet for nor against it.

Will they (or anyone else in the next decade) create AGI? I think that's an argument over terms. Transformer models like the GPT models from OpenAI are very general, able to respond in any domain the training data covered. Do they count as "intelligent"? They can score well on IQ tests, but those are only a proxy for intelligence.

Given the biological analogy would be:

"Mad scientists take a ferret, genetically modify it to be immortal, wires up its nervous system so the only thing it experiences is a timeless sequence of tokens (from Reddit, Wikipedia, StackOverflow, and random fan-fic websites, but without ever giving the ferret any context as to what any of the tokens mean), and then spend 50,000 years rewarding/punishing it based on how well it imagines missing tokens, this is what you get."

I don't know what I was expecting, but it wasn't this.


Have you actually tried to use it? I use it every day and it is absolutely amazing; it saves me a lot of time. We've also built automations around it, and it saves our company a lot of time. It's already here and in use by many companies for use cases where it makes sense...


I find ChatGPT quite irrelevant. People claim it helps them but fail to provide specific examples, and from their descriptions it sounds like prompting Google may yield more accurate and diverse responses.


I'm bearish only insofar as I feel LLMs are not completely useless, but they'll be about as revolutionary as spellcheck was. Which is to say, it'll make a few jobs a bit easier.


I don't know about others, but I have learned at this point that relying on proprietary walled-garden software is a terrible idea.


Does an LLM really have a moat? Compare Google and Bing: they provide essentially the same set of functionality and nearly identical user interfaces, so users should be able to move from one to the other easily, but Google has better search results, at least in users' perception, and most users don't even get around to trying the competitors' products. OpenAI is in Google's position for now. I tried Claude a few times; it seems more capable than GPT-3.5, but I found myself always coming back to ChatGPT.


I sometimes use it to reformat some text or code, or write some very simple boilerplate code. So honestly it only saves me a few minutes a day typically.


OK I'm not bearish. I'm very bullish. I use it all the time, in place of Stackoverflow type questions.

BUT

I think people are too bearish on boring-old search as a tool. It's so easy to jam a search into Chrome's bar and look for a quick reference or (hopefully) a human being that has had some experience with what you're working on.

I use search / ChatGPT / CoPilot interchangeably for different reasons... ChatGPT for a detailed, thoughtful answer. CoPilot as autocomplete on steroids. Search for reference, quick answers, and direct human experience.


Yes, bearish, I think ultimately Google will get their act together and focus on AI in the same way they focused on search in the early 2000s.


So you’re bearish on OpenAI because you don’t think they can deliver AGI? I don’t think they need to deliver AGI to be profitable.


GPT is a useful tool, but I don't think it's cost effective yet, and we might see an AI crash until that changes.


I've been writing the same for a while.

The silliest part is Sam Altman selling it as if they've got a path to AGI.


Honestly, I basically never use GPT. I tried keeping it open and asking it questions and all that, but it just never provided me anything particularly useful. Reading the documentation or talking to real people was just infinitely more valuable.

WolframAlpha and visual stuff have been more impactful for me, but they existed a long time before GPT. Even then, I don't use them that much.


I am reminded of the situation on HN five years ago, when if anyone said anything remotely skeptical about blockchain or crypto, there was an avalanche of comments saying the opposite. It should by now be apparent to all that it was in fact mostly people whose job or investments relied on blockchain/crypto hype, trying to silence anything that got in the way of their cashout.

There are some valid uses for neural networks, including LLMs, just as there were a few valid use cases for blockchain. None of them are particularly revolutionary, and it's not clear yet that any of them will pay for the enormous computing power required.


LLMs are a type of intelligence, and that is super useful.


Not at all. Would get my hands on some equity if I could.


As a delivery consultant in a Generative AI specialty practice for an extremely large cloud services consultancy, I can say with certainty that failure to achieve results with the latest models is definitely more a reflection of the abilities of the user than of the abilities of the model.

A lot of people look at LLMs through the same lens that they have looked at all other technology to this point — that if you learn and master the interface to the technology, then this eventually equates to mastering the technology itself. This is normalizing in the sense that there is a finite and perceptible floor and ceiling to mastering an objective technology that democratizes both its mastery and use in productivity.

But interacting with LLMs that are in the class of higher-reasoning agents does not follow the same pattern of mastery. The user’s prompts are embedded into a high-dimensional space that is, for all intents and purposes, infinitely multi-faceted and it requires a significant knack for abstract thought in order to even begin the journey of understanding how to craft a prompt that is ideal for the current problem space. It also requires having a good intuition for managing one’s own expectations around what LLMs are excellent at, what they perform marginally at, and what they can fail miserably at.

Users with backgrounds in the humanities, language arts, philosophy, and a host of other liberal arts, while maintaining a good handle on empirical logic and reason, are the users who consistently excel and continue to unlock and discover new capabilities in their LLM workflows.

I’ve used LLMs to solve particularly hairy DevOps problems. I’ve used them to refactor and modularize complicated procedural prototype code. I’ve used them to assist me in developing UX strategy on multimillion dollar accounts. I’ve also used them to teach myself mycology and scale up a small home lab.

When it comes to highly-objective and logical tasks, such as the development of source code, they perform fairly well, and if you can figure out the tricks to managing the context window, many hours of banging head against desk or even weeping and gnashing of teeth can be saved.

When it comes to more subjective tasks, I’ve discovered that it’s better to switch gears and expect something a little different from your workflow. As a UX design assistant, it’s better for comprehensive abstract thinking, identifying gaps, looking around corners, guiding one’s own thoughts and generally being a “living notebook”.

It’s very easy for people who lack any personal or educational development in the liberal arts or the affinity for and abilities of abstract thought to type some half-cocked pathetic prompt into the text area, fire it off and blame the model. In this way, the LLM has acted as sort of a mirror, highlighting their ignorance, metaphorically tapping its foot waiting for them to get their shit together. Their lament is a form of denial.

The coming age will separate the wheat from the chaff.


>It's also very much like crypto where for every one person doing something useful with it, there are 20 trying to exploit the newness and low comprehension the general public have of the tech

This is definitely not correct in terms of numbers; there are many more people using LLMs well than have ever used crypto for any real use case. Also, it's worth considering that the only real use cases of crypto are illegal, from noble stuff like busting sanctions to get food to hungry children, through to bribe evasion, bribe payment, tax evasion, drug deals, hiring hitmen, and child sexual exploitation/trafficking. In general, crypto produced close to zero for global society, even when it wasn't being used as an overt and intentional scam.

LLMs are producing significant value for society right now, because OpenAI gave everyone API access to a very weird intern who has incredible knowledge breadth and makes dumb mistakes. Interns (or "relatively low intelligence/experience human workers who need handholding for difficult and sometimes easy problems, with an occasional flash of insight") have always been controversial as to whether they actually provide value from the perspective of the person who has to manage the intern, but from the perspective of the company/society it's unquestionable that they do provide significant value. Different people put different value on having a collaborator at all; some people do not want to handhold anyone or work with anyone whose mistakes they ever have to work around. It is nevertheless true that in aggregate for knowledge work, "worker + intern" is more economically productive than just "worker", outside of very, very specialist use cases. This just wasn't possible with GPT-2, and even GPT-3.5 is not quite at a quality where I'd really compare it to a normal intern. No other machine aside from the human brain was even close.

That's the tech now, the worst it will ever be. Whatever comes next with a major leap (GPT-5, Claude 3 or Gemini if they're good, maybe Llama 3 or the next Mistral if they can be improved significantly by the open-source community before the next GPT release) is going to be either a reliable version of the same intern, or the same intern with better intelligence and comprehension that still suffers from reliability issues, or a major step up where they're equivalent in productivity to a full-blown knowledge worker in some high percentage of cases. It's already important now, and it's only going to get more important.

As for OpenAI specifically, I think they have a very good chance of continuing to lead the pack, particularly with this cringe-y GPTs/GPT Builder/GPT Store thing. It's pretty transparent that this is them getting data to train an AI on how to spin up agentic AIs to accomplish specific tasks, because they'll have the data on how the GPT Builder is used and the data on how useful and effective the GPTs it builds are, so they can do things like dramatically overweight the most effective and useful GPTs for training their internal "GPT Auto Builder".

They'll be running a store for these things as well as effectively controlling the operating system they run in, so purchases, ratings, time using a GPT, sentiment analysis on the GPT text log to detect success, plus explicit in-GPT feedback (the thumbs up and down, feedback submission form) will all be data they can feed into their machine, to make an AI that can build good GPTs for a task, an AI that can evaluate their performance, and an AI that can most effectively get good performance out of a GPT. That's going to be huge, particularly the signals that have real economic costs to users (I know they haven't announced it, but I think eventually they're going to make it so you can purchase GPTs), because that starts to pull away the rose-tinted glasses and the fog of futuristic sheen and gets some more unvarnished data on how much people actually value these specific things.

That data means eventually you should be able to just ask ChatGPT to do something for you, and if it can't do it natively it will be trained to be able to spin up a task-specific GPT with access to the correct tools, docs, etc., then have the GPT Whisperer AI use it to get the right answer with a bunch of backup data, and return you the answer with the option to see the work. This is also a pretty auditable process, which makes a lot of the legal and AI safety folks happy. I don't see another company that is similarly well-placed in terms of having the tech, talent, compute, product, and roadmap to pull this off.


> It is nevertheless true that in aggregate for knowledge work, "worker + intern" is more economically productive than just "worker"

It can be true, but I don't think it is always or even often true. The overall value proposition of interns (and apprentices, etc.) for society as a whole is that they will go on to become the professional knowledge workers they have learnt from. LLMs won't become that, so the value proposition is limited and localized. Remember the adage about garage mechanics, where the price goes up depending on how much 'help' the customer wants to give?


It is economically true in aggregate, although you're right that the worker overseeing the intern very often feels like it isn't true. The mechanic you mentioned could fall into that category, but I think it's more likely that it's just one of those jobs that's not knowledge work, so what I'm saying doesn't really apply, because there are physical space limits to collaboration in realspace work that are much more readily obtrusive than their equivalents in knowledge work. Basically, good knowledge worker A and bad knowledge worker B are never going to be "A + B" productive together, but they're also extremely unlikely to be <A productive, except in very specific situations that generally have to do with destructive operations by B (deleting files, using up limited tools like licenses or submissions, etc.), and LLMs have been built specifically to not be able to engage in destructive operations. Contrast that with physical work, where two people can't physically occupy the same space, and if they try they are likely to end up at a lower productivity than either one of them could have achieved alone because they quite literally trip over each other.

About interns as a hiring pipeline: that's the official reason for interns, and they are used that way, but they are also valuable in and of themselves as labour. Companies that aren't even hiring and won't be in six months will still take on interns, because it's free labour. Interns are the equivalent of manual labourers in knowledge work, able to do what you tell them but not yet possessing the skills to do many complex tasks alone.

And I wouldn't be so sure that the "LLMs as interns" relationship we have now doesn't result in them learning. Obviously on an architecture level any individual instance can only learn in-context and that's wiped when the context is, but in aggregate they are learning from their "time on the job". The next version of GPT is going to be directly better at least partially because of all of the data that ChatGPT 4 being an effective intern brings in. It could be anywhere from just an intern at 6 months rather than an intern on the first week, through to an actual hired employee somewhere during their first year. Even if it was technologically possible I don't think OpenAI would deploy their next product if it was as good as a senior employee in some substantial number of domains, because that would be too disruptive of a leap for society. But anywhere on the intern scale that's better than GPT-4 up to approximately an employee one year in I think is feasible for their next release, and that will be in large part thanks to the data they get from their current deployment.


Hacker News being bearish is the clearest bull signal I've seen all week.



