Hmm ... I was hoping these new agents would make my regular tasks a bit easier, but so far it hasn't worked: it 'ran out of loops in the free plan'. I'm not going to pay without seeing even one successful example (I know this one is a pretty tough problem, but I was willing to give it some easier 'research' tasks too). My query was:
"Find out the cheapest per unit Pampers New Born diapers from the ones available on amazon.in, flipkart.com, bigbasket.com, dmart.in, firstcry.com considering all the available card discounts, coupon codes, shipping and other charges etc. "
API access is not coupled with being a ChatGPT+ subscriber, and I believe GPT-4 API access is also not included in that. That means you can use GPT-3.5-turbo via API right now, and GPT-4 through API is available via waitlist.
I'm not yet, because I haven't got confidence in ChatGPT's answers or abilities so far (e.g. the fact that it 'lies' with a straight face when it doesn't know the answer is especially troubling for me). I hope ChatGPT 5 or 6 are much better in this regard, and I will very gladly pay for it then.
I understand it could have just been a 'regular' automated task and not necessarily an LLM task (at least if tried hard enough). But isn't the general promise of these LLMs that we don't have to program a separate automated task for everything? And my query is precisely the kind of research I want the LLMs to carry out for me online. An educated assistant (natural intelligence) will do this task for me pretty easily, and I just want AI to do the same, perhaps much faster and cheaper. Until they can do that they're not impressive enough. :)
Either you do it yourself, or you task someone else with that research. The promise of autonomous agents is that they do the grunt work for you, but even though I'm really excited about the concept, it doesn't work all that well yet.
You know what I hear most? People asking if anyone has had great success doing anything with these AutoGPT agents, and it’s always met with crickets. Either that or people don’t want to give away the secret sauce.
In my opinion, this isn't an end product in itself, but rather a demonstration that the chain of thought can be altered with greater flexibility than ChatGPT and take on human-like actions without reliance on APIs or ChatGPT Plugins.
Does this have use cases now? No, it's far too buggy.
However, with refined and/or default chain of thought logic and better control over the sequence of steps, the concept could do almost anything.
(E.g. Find me some good real estate investment properties in the San Francisco area and send a message to the seller's agent to schedule a meeting on my calendar) <> References the internal data set of my investment criteria from long-term memory and prior responses, browses MLS, finds the broker's email, sends the message, syncs with my calendar.
I've tried AutoGPT agents about 10 times for tasks that seem like they'd be a good fit. By that I mean tasks that need current data to be scraped, but that would otherwise be possible for a person willing to put in some time combining data sources.
An example is "create a table showing the current ratio of rental/sales prices (on a per square foot basis) for residential properties in the 10 most populated counties in Colorado."
I have yet to get a reasonable result from trying this.
You need GPT-4 API access, and hardly anyone has been able to get that. Otherwise it will run on GPT-3, which isn’t able to function in this role. With so little information it’s hard to say how well it might work.
I've had that for months. If you want it quickly, you need a project in which you have already incorporated the OpenAI API. At least that's what I did.
I linked my open source project when applying and got access a few days later.
In my own farting around with AutoGPT (which was quite a lot), what stuck out to me most was that the agents would hardly ever rely on their own "knowledge" for tasks. They would always default to browsing the internet, even for trivial questions, so it would take a good 10 minutes to arrive at a conclusion ChatGPT would've gotten in one shot.
I tried this some weeks ago and couldn't get it to work. I managed to implement the ChatGPT stuff, but everything beyond that, like integrating Google into it, didn't work.
Projects like this try to get around the fundamental flaw in GPTs - namely that they do not have goals, plans, thought processes etc - without actually solving it, e.g. by having the AI write out its "goals" before continuing.
But this is a hacky fix, and will never be reliable enough for consistent use. For that, more actual research is necessary, on how to simulate and model goals and trains of thought and have them interface with the world model provided by an LLM.
I feel like there's an implication here that the research should be in modeling architectures and training sets and other specialized machine learning. But there is research here: in natural language modeling of goals, plans, thoughts, processes, etc.
Obviously we don't know what paths will be most successful. But a path where critical drivers of AI (like goals) are modeled in a transparent and comprehensible manner seems like a very attractive direction to take. I'd much rather be able to read my AI agent's goals, plans, intermediate goals, self-analysis, etc., than have it all captured in a set of completely incomprehensible weights.
The AI would never say hello, but if you say hello to it, it will say hello back. Is that also a hack? Aren’t you just describing everything about LLM behavior generally, not only something specific about goals/tasks? In that case the nature of the thing is less interesting than the results we're able to get from it, and I wouldn't worry about this kind of purity test.
I mean, most people don't have the resources needed to build a model big enough for these types of behaviors to emerge, so third-party add-ons are all we've got until Google/Microsoft/OAI drop something on us.
Part of the issue here is the massive amount of compute needed on top of what we're already spending. ToT looks likely to need 10 to 20x the number of calls to get an answer, which is going to be a problem for deployment en masse when you are compute limited. It's very likely we're going to have to wait for more/faster hardware.
At least as a non-native English speaker, I was a bit confused as to what an agent would do compared to plain ChatGPT. I then tried the examples "Plan a detailed trip to Hawaii" and "Write some code to make a platformer game". I tried the same Hawaii sentence with plain ChatGPT and told it that it can browse the web. Now I think AgentGPT does seem like a good tool on top of ChatGPT. As they say on the Github page, "It will attempt to reach the goal by thinking of tasks to do, executing them, and learning from the results". This was a whole lot more thorough than what plain ChatGPT did with the prompt. I'm just thinking that maybe they should emphasize those points more on the app page itself: the core value it adds and how it does it. Perhaps even explain it in more detail than that one sentence from their Github page does.
Stars are a weird thing. Some people seem to believe they're like likes on social media, or even that they indicate actual use. There are probably sites selling you fake stars, like they sell fake Instagram followers. However, many people, myself included, treat Github stars as just bookmarks, and star things they found interesting or possibly worth coming back to, regardless of the quality or whether they have any use for it.
Interesting. So this, combined with the fact that other comments say this is an incredibly thin wrapper on GPT-4, makes this some kind of investment bait?
I wonder how profitable selling of non-products is with marketing like this.
I was mad about this too, but I'm interested enough that I was going to register. I use a custom domain and so would use something like "agent-gpt@customdomain.com"
Anyway, they don't allow email signups, either. You have to use a social media login. So the pie gets even smaller.
Edit: also the logo is not displaying properly on the sign-in page (Firefox, Linux)
There are a couple of challenges with making this stuff completely open like that right now.
The first is that OpenAI has rate limits. They are especially small for GPT-4, which for anything complicated can be quite a lot better than 3.5.
The other one is that if you do leave it open, then you can be sure that a significant portion of your customers will be from countries that could not pass the OpenAI phone verification. Or just didn't want to identify themselves. For some reason.
Combine that with something that scares some people a lot like autonomous AI or connecting them to servers or the internet, and it feels like you might be on thin ice with OpenAI or some regulatory group. Especially if a bunch of Russian and Chinese users are finding you on some directory or post listed as a way to get around the phone verification.
This is a good line of thinking for making LLMs more powerful. I'm confused about what I'm seeing, though: can I not inspect the internals of what it's doing inside each task? And why does it take so long? Is it doing tons of GPT queries for each task?
Are agents not just a natural extension of ChatGPT as a product with chat being a universal language interface for the plugins which connect to tools and skills?
Yes. In fact, ChatGPT with Plugins behaves like an agent.
The idea behind these AutoGPTs is to give greater structure to the way the model thinks and acts, through a loop of planning, executing, and reflecting. Another goal is to provide the model with enhanced memory, similar to or identical to retrieval.
In short AutoGPT-like applications treat the LLM as a reasoning core around which a scaffolding of higher-order thinking is built.
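To make the plan/execute/reflect loop concrete, here is a minimal sketch of the general idea. It is not how AutoGPT or AgentGPT are actually implemented. The names `Agent`, `fake_llm`, and the `PLAN:`/`EXECUTE:`/`REFLECT:` prompt prefixes are all made up for illustration, and the "LLM" is a stub function, so nothing here calls a real API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Agent:
    # Hypothetical interface: any function that maps a prompt to a text reply.
    llm: Callable[[str], str]
    # Naive "memory": a running log of task results (real tools may use retrieval).
    memory: List[str] = field(default_factory=list)
    # Hard cap on iterations, similar in spirit to a free plan running out of loops.
    max_loops: int = 5

    def run(self, goal: str) -> List[str]:
        # Plan: ask the model for a task list, one task per line.
        plan = self.llm(f"PLAN: {goal}")
        tasks = [t.strip() for t in plan.splitlines() if t.strip()]
        loops = 0
        while tasks and loops < self.max_loops:
            task = tasks.pop(0)
            # Execute: give the model the task plus the last few memory entries.
            context = "\n".join(self.memory[-3:])
            result = self.llm(f"EXECUTE: {task}\nCONTEXT: {context}")
            self.memory.append(f"{task} -> {result}")
            # Reflect: the model may propose follow-up tasks; "DONE" means none.
            reflection = self.llm(f"REFLECT: {goal}\nLAST: {result}")
            if reflection.strip() != "DONE":
                tasks.extend(t.strip() for t in reflection.splitlines() if t.strip())
            loops += 1
        return self.memory


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call, so the sketch runs offline.
    if prompt.startswith("PLAN:"):
        return "list retailers\ncompare unit prices"
    if prompt.startswith("EXECUTE:"):
        return "ok"
    return "DONE"


agent = Agent(llm=fake_llm)
log = agent.run("find the cheapest diapers per unit")
```

Even in this toy version, every task costs two model calls (execute and reflect) on top of the initial planning call, which hints at why these agents feel slow compared to a single ChatGPT answer.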
It’s interesting, as a cognitive architecture. Imagine that your frontal cortex, responsible for planning, was very rigid and farmed out its actual reasoning tasks to your language-learning brain area.