Ask HN: Can Devin genuinely replace the roles of developers?
29 points by sunnysogra 26 days ago | 83 comments
These days, everybody is talking about Devin and its threats to developers. I would like to gather feedback from others to gain a comprehensive perspective on this matter. Do share your thoughts.



My perspective (as a CTO who's hired hundreds of developers in the past 12 years) is that I don't have a use case for Devin, from what I've seen about it.

That comes down to why I hire developers in the first place: To share my responsibilities with people I can trust.

I don't hire them to write code or to close tickets. The act of programming, I consider an exercise that helps them understand the problems we solve and the logic of our solutions. I'm always excited when I have a well specified ticket I can hand to a new hire to learn the ropes. So the kind of thing I can imagine Devin can pull off at some point, that'd actually be detrimental to the kinds of teams I build.

I don't think I represent the majority of why people hire developers though, so I guess tools like that may well have a big impact on the industry. Nobody can predict that though.

Uncertainty sucks, but it's how things are. I find the best way to deal with uncertainty is to become better at adapting to unforeseen circumstances. Programmers have quite a bit of experience with that, for what it's worth.


My take is that AI may replace mediocre task taking developers who just write code. It won't replace problem solvers who use code as a means to solve actual problems.


> My take is that AI may replace mediocre task taking developers who just write code.

Is there such a job? Even the tiniest bit of code I had to write involved learning about the problem and the platform to come up with the solution. There was some I could hand over to ChatGPT, but it always felt like when somebody gives you their vim or zsh config. You always wonder what's going on here, and you might just as well read the documentation.


Yup, essentially what I think. With the caveat that I find the act of programming to be one of the best ways to deeply explore, understand and learn things. So in other words: I see a lot more value in coding than I see in code.

I'm sure it's not the only way, not in the long run anyway.


Yeah, I have the same perspective. If a person likes coding and their problem-solving skills are good, they can definitely survive whatever the circumstances.


> These days, everybody is talking about Devin and its threats to developers.

If by these days you mean the two days after it came out several weeks ago, and by everyone you mean a handful of people on social media, sure.


I think this is an appropriately flippant response. If you're on YouTube or listening to podcasts or browsing Hacker News, it's extremely important to recognize that everything you see is there to engage your specific curiosities, to advertise, or to serve an algorithm that people have made their jobs out of.

People are hard up for content, a lot of their audience doesn't have employment anymore and are scared, and they need to put things out at a regular cadence. It's not (necessarily) a conspiracy that you see the same copy pasted tech thing appear everywhere simultaneously, and because it does, it makes you think everyone in real life knows and cares, but that's far from reality. Even someone who's terminally online and chronically checking these things could have taken a long weekend away from their phone and not have heard about it.

As an aside, no employer really gives a shit about your curiosities, so you need to separate all that chronic consumption from what is efficient and practical to do in a job if you have one, rather than leaning into what someone online thinks is the best way or whatever you think is the future.



Yes, and the astroturfing for Devin is all over the place.

Several subreddits were overtaken by Devin bots for a few days. When the dust cleared everyone realized what had happened.

And here we are again.


I still think the same thing is happening with OpenAI. It's way too hyped vs its actual utility.


I'm a part of my university's competitive programming club, and the discovery of Devin being a fraud really upset a few club members (Many people who worked at Devin.AI were elite-level competitive programmers). It shattered the illusion that being a good competitive programmer would translate into being a good engineer overall. They were the victims of market hype.


> It shattered the illusion that being a good competitive programmer would translate into being a good engineer overall

There was a study by Google years back that showed the exact contrary: https://catonmat.net/programming-competitions-work-performan...


Not exactly - it just means google overvalued that experience in its hiring decisions. https://erikbern.com/2020/01/13/how-to-hire-smarter-than-the... explains the phenomenon ("Berkson's Paradox") well.


I really don't think this is applicable here.


No offense but did you read the article? It also links to this one which is about this exact claim - https://erikbern.com/2015/04/07/norvigs-claim-that-programmi....


Yes I did, I even read other sources about the paradox to make sure everything was clear in my mind.

The whole argument relies on the fact that "being good at programming contests" is a factor that was considered on interviews (and even given too much weight in the decision). There is absolutely no hint of that at all, and having been on the hiring side in FAANGs I can safely say this is not the case.

There is no paradox here because the study is not done on a pool of candidates that were pre-filtered on this specific parameter, only on something that correlates with it (being good at LeetCode, which is what gets you through the interviews).

It is funny that in the second article you linked the author says "My point here is that you can tweak these variables and end up seeing correlations with pretty much any value.", because this is exactly what he does. He manipulates the problem until it turns into a Berkson's paradox.


It's like when a person goes too deep into CP (competitive programming): they just overfit to it and might never come out of that phase where you are given the problem and are supposed to produce the pre-defined solution.

That's not the case in real life, where the main problem is to identify the problem and then find the optimal solution. So get a taste of CP, develop your logic, then get outside of it: build projects and keep moving. The field is vast; don't keep knocking on the same door every day.


I will check it. Thanks for sharing the link.


I have preview access to devin, and I can tell you that it's the real deal. It's making PRs to our codebase daily, and acts as a junior engineer.

It frees us up from doing menial tasks and is a great help for stuff like that.

It's not perfect, but it's a glimpse of the future that definitely needs to be noticed.


Some proof would give your comment weight.


proof or not true.


The way I see AI programming assistants is that they help juniors be a bit more productive, but senior developers can do without the assistant.

I've used Cody and Copilot and it just gets in the way because I know exactly what I need to write and neither really helped me.


Precisely. I think it helps with smaller/mundane tasks (that it has seen in its training), but the tasks that actually require higher-level reasoning and an understanding of the bigger picture are not something we can expect the current LLMs to do.

However, as I was researching, there are a few interesting ideas in this space that might help these LLMs solve more complex problems in the future. Post here if interested: https://kshitij-banerjee.github.io/2024/04/30/can-llms-produ...


These are the cases where even senior developers should use AIs.

When I'm creating a CRUD API I know exactly what I want, I know exactly what it should look like.

Do I want to spend 15-30 minutes typing furiously adding the endpoints? No.

I can just tell Copilot to do it and check its work. I'll be done in 5 minutes doing something more engaging like adding the actual business logic.


I agree with this mostly - but recently a bug was introduced into our app because of a copilot suggestion that wasn't checked thoroughly enough by the engineer (it suggested a property that was similarly named to another property but not the same).

Like you say, it makes the most sense for repetitive or easy tasks.


My usage of Copilot is dramatically higher in strictly typed languages because of things like this. It's almost counter-productive if I have to very carefully analyze every variable name to make sure it's not subtly wrong.

Having a compiler do that validation of AI output helps dramatically so I only have to validate logic and not every freaking character in the function.


This is why I have Copilot write unit tests too :D

Actually it does the boring bit of generating the test data and the basic cases; I'll do a once-over and add more if it's something that warrants it.


> I can just tell Copilot to do it and check its work

Checking other entities' code is not trivial and very error-prone.

I get what you're saying, but I have my doubts whether doing the whole thing manually would really be slower than asking an assistant plus doing an extensive code review.


This is highly repetitive code where the options are pretty much either me copy-pasting a piece of code and changing a bit here and there or having an AI do it.

The latter won't make stupid small mistakes; I will (and have).

And I'm checking like 10 lines at a time, related to code in the context I've got in my head.

I regularly need to review 100x bigger PRs done by humans of varying skill, related to other parts of the project I'm not intimately familiar with.


Okay, but isn't your own code generator a better option in this case? You know, a for loop with some parameters that spits out code?


How can a "for loop" generate me 10 API endpoints in C# that call business logic functions with the parameters received?


You tell me. You are the one who said that code was repetitive to generate. :)

So it turns out, not so repetitive after all then?

I remember devising my own mini DSL when I had to produce 250+ such endpoints and validators. Three days spent on that, then ran the command and I had working code 30 seconds later. Felt like a god.
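
To give an idea of the general shape, here's a throwaway sketch (not my actual DSL; the endpoint names and the _service calls are made up): a spec table plus a loop that prints the C# stubs.

    # Spec-driven generator: the "for loop with some parameters that spits out code".
    ENDPOINTS = [
        ("GetOrder",    "Get",    "orders/{id}", "int id"),
        ("CreateOrder", "Post",   "orders",      "OrderDto dto"),
        ("DeleteOrder", "Delete", "orders/{id}", "int id"),
    ]

    for name, verb, route, params in ENDPOINTS:
        # "int id" -> "id", "OrderDto dto" -> "dto"
        args = ", ".join(p.split()[-1] for p in params.split(","))
        print(f'[Http{verb}("{route}")]')
        print(f"public IActionResult {name}({params}) =>")
        print(f"    Ok(_service.{name}({args}));")
        print()

The real version had a proper spec format and generated the validators too, which is where the three days went.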


Lately I have been using it to write print and logger statements. I type what I want as a sentence in a comment and then it handles all the special syntax characters. Given the error rate I’m not certain it saves time, but it is fun to play with.


I've got "just" Amazon Q at home (not paying Copilot prices for my personal projects) and just typing "log.Printf(" and waiting a second it usually gets what I'm trying to log either very close or exactly right.

It's not like we're breaking new ground in the field of computer science here. The LLMs have been taught with terabytes of code and me writing a Go API glue program is perfectly in their wheelhouse.


For seniors I think it depends on how much breadth you need. I find them very useful to explore/poke around new areas where I don't have domain knowledge. I agree that in areas/problems I've worked on in the past they just slow you down, but as you move into more unknown territory they are kind of nice to have as a sparring partner.


Similar to my usage as well, it's a good start for unfamiliar territory to quickly get up to speed but you can hit its limits quite fast.

I've been toying around with embedded development for some art projects, and it was invaluable to have a kickstart using LLMs: get a glimpse of the knowledge I need to explore and get some quick useful results. But when I got into more complex tasks it just broke down: non-compiling code, missing steps, hallucinations (even references to variables that weren't declared previously), reformatting non-functioning code instead of rewriting it.

As complexity grows the tool simply cannot handle it. As you said, it's a good sparring partner for new territory, but after that you will rely on your own skills to move into intermediate/advanced stuff.


I find the ML completion used in Google codebase very useful. It knows the APIs that I'm going to use better than I do, and it also can infer how exactly I'm going to use them. So in my experience, it does make me more productive.


You should watch this video that shows that some of the Devin demos are fake. https://www.youtube.com/watch?v=tNmgmwEtoWE&feature=youtu.be


I will definitely check it out! Thanks for sharing.


Devin? No, it was a rigged demo verging on fraud. See previous discussion: https://news.ycombinator.com/item?id=40008109

Future hypothetical AI coding assistants that don't exist yet? While I won't say it's philosophically impossible that they'll move beyond extreme autocomplete with security holes, I'll say it's not up to me to disprove someone else's hypothetical. Show me the thing.


My take on LLMs is that they won't scale much beyond what we already see, because this is "just" text prediction on steroids. I'm not an expert by any stretch of the imagination, but that's my opinion, and going down that path, no, I don't see it as the best path to "autonomous development machines", only to powerful autocompletion like we already see today.


I agree. I've been using Copilot for several months now, and the only thing it (almost) consistently helps me with is predicting relatively trivial snippets.

Anecdotally, I've had it mispredict from very simple contexts, such as skipping numbers in series where the pattern should've been extremely obvious.

I've had it sneak in subtle and obvious bugs on a regular basis, to the extent that I don't have much confidence in any code beyond what I can grasp in a single look and be confident is correct. Sorry bros, I'm not on the hype train this time. Feels like crypto all over again.


I switched to Supermaven and in my experience it’s at least 2x Copilot. Might be worth a try.


There are many ways the current LLMs can be scaled along different dimensions, and there is research around each of them, e.g.:

1) Many different AIs with different roles: business analyst, tester, developer. You as the developer are treated as the customer and write a simple prompt, and the business analyst AI turns it into a proper step-by-step prompt for the developer AI, so you don't have to be very good at prompt engineering.

2) Bigger context for the LLM, so you can feed it up-to-date documentation and the full repo.

3) The LLM having access to RAG over web search to get up-to-date information.

4) The LLM having access to a terminal and debugger, so the tester/developer AI can automatically see how the code is executed and the state of variables during execution.

5) Faster and cheaper LLMs, so that you can give it a task before you go to sleep and all those AIs in a loop try to solve it, trying many different options until all tests pass (rough sketch below).
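
To make (5) concrete, a rough sketch of that loop (hypothetical function names, nothing from any real product): generate a patch, apply it, run the tests, feed the failures back in, repeat until green or out of budget.

    # Hypothetical overnight agent loop; generate_patch/apply_patch/run_tests
    # are placeholders standing in for an LLM call, a patch apply, and a test runner.
    def generate_patch(task: str, feedback: str) -> str:
        """Ask the LLM for a candidate patch, given the task and the last failures."""
        return f"patch for {task!r} (informed by: {feedback or 'nothing yet'})"

    def apply_patch(patch: str) -> None:
        """Apply the patch to a scratch checkout of the repo."""

    def run_tests() -> tuple[bool, str]:
        """Run the test suite; return (all_passed, failure_log)."""
        return False, "2 tests failed"

    def solve_overnight(task: str, max_attempts: int = 100) -> bool:
        feedback = ""
        for attempt in range(max_attempts):
            apply_patch(generate_patch(task, feedback))
            passed, feedback = run_tests()
            if passed:
                print(f"solved on attempt {attempt + 1}")
                return True
        return False  # you wake up to a failure log instead of a green build

    solve_overnight("add pagination to the /orders endpoint")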


Yes, but that doesn't change how they fundamentally work. You can replicate things fast, with many variations, almost like brute force but more "thoughtful." Don't get me wrong, this opens many possibilities, and I am even thinking about having a local AI machine for my stuff. However, working with multiple layers of knowledge connections is where I don't see LLMs arriving. Maybe some other technology built on top of them will get there, but LLM evolution itself will just be better data and "patches".


The open question is whether the answer to your question is "no" or "not yet". A bot that takes 1-point Jira tickets and goes off and does them would be something, but that currently doesn't exist. Until it does, we can say it won't exist, that it can't exist using the current approach, and thus that it won't exist within my lifetime. Until it does.


Depending on what you're building, likely sometime in the near future. I'm a senior dev, I'm just using Cursor.sh, and my day is essentially telling it what I need done - and the code gets written much faster than I could have written it myself.

Essentially I now just architect and review. Cursor has good context, so if that gets extended to the way Devin operates I think this could go pretty far.


I'm surprised that, as a senior, you're able to offload your work to Cursor. Maybe it's light years ahead of GPT, but I'm also a senior engineer and get almost no usable code out of AI tools without having to double check and then reimplement its suggestions. I use them to generate ideas for algorithms mainly, and would not (could not) just copy-paste its solution into my/our codebase. It must depend on what you're building.


Being senior can mean completely different things depending on the job. I believe a senior whose work can be offloaded to Cursor at the extent OP claims may be the kind of senior who doesn't really need to deal with that much complexity in the first place.


I think it's highly dependent on the language and framework and whether they lend themselves well to the way Cursor accepts context.


I'm only using ChatGPT 4 and cannot reproduce your good results. I definitely get valuable input from it, but also definitely (almost) nothing usable in that form. Might be I'm still bad at prompting, but if I request the code to do, say, 5 things (like a complex SQL query) it will always forget at least one or another of the points. Or suggest useless things. Or reformat instead of rewrite. So again: helpful, increasing my productivity, but very very very far from being trusted with anything real. Edit: I don't agree with others saying it doesn't help seniors: it does help, because I can recognize its idea and build on it more quickly. If I were a junior I'd be navigating blindly (which is sometimes okay too).


A few thoughts (for context, senior developer, and use chat-gpt every day as an assistant)...

In the short term (5-10 years; I can't see them autonomously producing products), it will need an experienced programmer to interpret and use the output effectively.

An implication of this is, in the short-term, developers become even more valuable. You still need them, and these tools will make the developer significantly more productive.

I was reading Melanie Mitchell's book 'Artificial Intelligence: A guide for thinking humans' recently (which I'd recommend). She has this chapter on computer-vision. And as an example, she shows a photograph of a guy in military clothing, wearing a backpack, in what looks like an airport, and he's embracing a dog. She makes an insightful point, that our interpretation of this photograph relies a lot on living-in-the-world experience (soldier returning from service, being met by his family dog). And the only way for AI to come close to our interpretation of this, is maybe to have it live in the world, which is obviously not such an easy thing to achieve. Maybe there's an analogy there with software development, to develop software for people, there's a lot of real-world interaction and understanding required.

In terms of autonomously producing products, I see these tools as they are now a bit like software wizards, or a website that Wordpress will create for you. You get a 'product' up-and-running very quickly, and it looks initially fantastic. But when you want to refine details of it, this is where you get into trouble. AI has an advantage over old-fashioned wizards, in that you can interact with it after the initial run, and refine it that way. But I'm not sure this is so easy, to have that fine-grained control you have with code. This is where I see the challenge being, to develop tools to talk to it, and refine the product sufficiently.


No, the job will change.

It is a tool for building software. You still need to know software development to use the tool.

You might not need to actually write code in the future - just like very few write Assembly today.

But you still need to know and understand system requirements, systems architectures, integrations, distribution, deployment, maintenance, etc.

Software Engineering is more than just coding.


But that doesn't exclude the possibility that maybe 50% of IT jobs get axed and/or average salaries decrease significantly (i.e. unless you're really pro, forget about 150k+).

If AI speeds up software development by 3x and demand for software doesn't increase by 3x, then I don't see why the above wouldn't apply. And I'm talking about a longer timeframe, like 10 years, since this is more relevant for people who are just starting down the software path in college.


> These days, everybody is talking about Devin and its threats to developers

I think we're in different spaces, because I barely hear anything about it. That said, I think the "LLMs replacing jobs" train was blown out of proportion. I heard so much about it replacing developers, but I've seen it time and time again output code with subtle bugs (I'd argue worse than obvious bugs) and not be able to operate with more than just a little bit of context.

I think we're in a Pareto distribution situation right now. The majority of getting an LLM to write code was pretty quick to do. To get it to do anything a moderate dev can do will take decades.

I've seen it multiple times over the last 15ish months where I'm reviewing code and I spot a subtle bug in an htaccess file or a bash script that doesn't make any sense. The PR comment follow up is then "oh I got it from ChatGPT". I think these tools become assistants to a human developer who can guide them. That use case is already available and seems to be pretty decent for a lot of folks. Full replacement is so far away that I don't have a single thought about it.


It can replace a junior "developer" who only really knows how to copy-paste from stackoverflow. Anything beyond that, no.


more generally, will developers be replaced? the answer is an affirmative yes. when? unclear at the moment. i think a big moment will be when we free ai/automated agents from the shackles of python/ruby/etc—ie programming languages invented and perfected for the human programmer—and allow them free rein to accomplish their tasks.

i contend that given that the universe of computing (ie the capabilities of the processor) is finite, all software is engaged in essentially the same activities (write to memory here, a file over there, etc) such that the difference between any two could be reduced to a matter of interpretation. if so, any automated agent capable of assigning some meaning to these activities should be able to produce a sound program, in whatever computer language, even one very proprietary to the agent.


yeah! I feel the same


Devin is a replacement for a keyboard, not a developer.


The thing that might replace developers won't be Devin; if we are going to make a one-for-all solution, it just won't work.

If we are talking about a specific use case black box to automate the work, then it might be possible up to a certain limit.

Because when we are talking about training neural networks, we are looking for the best numbers. The so-called "emergent abilities" claim, that increasing model size makes it smarter, can be true, but what's the probability of getting most of the parameters to their correct values? There are billions of them.

(Total)Replacement? I highly doubt that.


AI can give suggestions, but it can't actually deploy working software into the real world. For the foreseeable future there is always going to be a human who says "yes, let's spin up that Elasticsearch cluster that costs 50k a month".


I think this is the wrong way to look at it. Forget replacement, if AI just makes developers more productive, companies don't need to hire as many people to do the same work.

Maybe the pool of work for developers will increase proportionally to the productivity increase, so there are no layoffs, but there's no guarantee of that.


Yes, you are essentially restating my point.


Recently, it was discovered that Devin was trained on only certain aspects of a problem, enabling it to predict the output accurately. It appears to be specifically trained to resolve similar issues.

Although AI is advancing rapidly, if a person learns to adapt in such situations by broadening their learning scope, they can navigate through such hype smoothly.


Junior developers at minimum. Team Leads, no. LLMs are already really good at code, they'll obviously get better and better.


It doesn't need to replace a developer completely. If it makes development 10-20% faster, that's on average 10-20% fewer developers that need to be hired.

The current LLMs are not perfect but I recommend anyone to try it out - it really speeds up development, great for creating boilerplate or when trying to use a language you have little knowledge in.


There's a certain amount of hyperbole and bluster, but I believe it will happen sooner or later.


Not Devin, it is vaporware, but easy coding will get automated away. If your value as a developer is 'LoC/sec', then you will be automated away.

Architecting, writing requirements, debugging nasty issues and optimizing tricky problems will remain valuable.


How can you be sure in those areas the AI won't be eventually superior? Of course the first on the firing line is the code monkey but I wouldn't be so certain that's where it ends.


No.

Like all the other AI codebots it's a tool that can potentially optimize a developer's workflow. In the same way that a nail gun optimizes a carpenter's workflow. But sometimes the carpenter might just use a hammer.


It's going to change how interviews are done with developers, no more leet code homework. And it will speed up some dev time, but I don't think it's going to replace junior/senior devs any time soon.


I continue to see AI as a tool for developers rather than a replacement. It enables developers to do their job better, but wouldn’t cut it as a standalone developer.


Like the halting problem, I think LLM tech is a "Halting AI" situation. I wrote about it here last month: https://www.evalapply.org/posts/halting-ai/

> This riff derives from a recent "AI Programmer" story that's making people in my corner of the nerdiverse sit up and talk, at a time when hot new AI happenings have become mundane.

> ...

> It is yet another prompt for me to take the lowkey counterfactual bet against the AI wave, in favour of good old flesh and blood humans, and our chaotic, messy systems.


interestingly enough, you can ask ChatGPT if a function that calculates all the Fibonacci numbers will ever complete, and it'll say no, so it seems we've solved the halting problem, for cases where the problem's been discussed in the training data.
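
for reference, the kind of function meant here, as a minimal sketch: there are infinitely many Fibonacci numbers, so the loop has no exit condition and never completes.

    def all_fibonacci():
        a, b = 0, 1
        while True:      # "all" of them: no exit condition, never halts
            print(a)
            a, b = b, a + b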


Given that Devin is hype-ware until it is widely available for testing (remember, it's been months since it was announced and still no sign of a reasonable beta roll-out), it would be better to frame this in terms of the general class that Devin represents, i.e. AI-powered "full self driving" coders, as opposed to the recent (phenomenal) prompt-driven tools like ChatGPT.

So - do FSD coding assistants pose a threat to developers? Sure, just like any tool built on GPT-3+ class engines does; we are in revolutionary times.

But the revolution here is the engines now available thanks to OpenAI and successors, not the wrappers like Devin and others.

If you're not concerned already, if you're not pondering the future already, then you've missed the point by seeing it only in the likes of Devin.

Personally, I think there is both risk and opportunity - it really depends on what sort of mindset you have as to whether you need to feel threatened.


No


I mean, they only put out a poor demo at the announcement and then vanished without providing any more info. I do think projects LIKE Devin can replace some developers who work in the most popular domains (I'd guess JavaScript and such), but they won't replace devs in complex fields (e.g. embedded programmers).


not only is it complex, there's a dearth of training data for embedded programming compared to the web


In its current form it's rather limited, but I can imagine how a future version could work as a very productive junior.

The real question is, what will be the rate of progress from this point forward?


Buy an ad.




no


I'm building a terminal-based tool that is somewhat similar to Devin[1]. Part of the premise of my tool is that "AI software engineer" is the wrong target for current model capabilities.

That's L5, and it may well be the future, but as it stands today, I believe L3-L4 is a more productive target for a coding agent. Before building full autonomy, we first need the infrastructure to precisely guide the models and iterate efficiently on our interactions with them.

Once that foundation is in place, it will then be possible to build more and more robust layers of autonomy on top. But crucially, the developer will always be able to drop a few layers down and take over the controls.

1 - https://github.com/plandex-ai/plandex



