> As we navigate our academic reserve, it is reasonable to assume that the base rate for intentional deception is very low. But this base rate can lead us astray if we apply it to life off the academic reserve.
That is a bold assumption, given what has been playing out in academia very publicly and messily over the past few years. While it might be plausible that the base rate for outright fraud in academia is still fairly low, the base rate for all forms of deception is likely extremely high, particularly in highly-cited papers with interesting results.
There are just so many ways to tweak your analysis that don't feel like lying, but nudge a p-value the right side of 0.05; so many little issues that might crop up during a study that feel too trivial to mention; so many methodological choices that are perfectly reasonable if you just avoid thinking too hard about what you're doing. All too often, it only takes one little inconsequential-seeming shortcut to get the result you want, but that shortcut makes all the difference in the world.
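To put rough numbers on how little "tweaking" it takes (a toy simulation with made-up parameters, not drawn from any real study): under the null hypothesis every analysis variant hands you a p-value that is uniform on [0, 1], so quietly trying twenty "reasonable" variants and keeping the best one makes a false positive the rule rather than the exception.

```python
import random

# Toy sketch, made-up numbers: simulate p-values under the null (uniform on
# [0, 1]) and count how often at least one of 20 analysis variants dips
# below 0.05 purely by chance.
random.seed(0)
trials = 100_000
variants = 20
hits = sum(
    any(random.random() < 0.05 for _ in range(variants))
    for _ in range(trials)
)
print(f"P(at least one p < 0.05) = {hits / trials:.2f}")  # ~0.64, i.e. 1 - 0.95**20
```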
Find me an academic who can defend every word they've published without once squirming or cringing and I'll show you a zebra.
The conclusion doesn't really mesh with the body of the argument, because you can measure whether AI is useful or not. How can something be a con when it's genuinely useful to me every day? When it writes half the code at our company? Methinks the author's base rate of P(con) is too high.
Author thinking of "con" in too abstract terms, maybe. (Apollo didn't liberate Penn & Teller from their monies, after all?)
Imho, the misdirection is away from the prevalent industry/societal problem of technical debt.
a fellow HN'er has observed that AIDEs are great for getting on top of things but not to the bottom of things (paraphrasing Knuth on emAIl)
Sorry for assuming that what you find useful is the help getting through that pile of make-work foisted upon you by the powers that pay the bills :)
I find AIs great otherwise for swiftly resolving FOMO, but not so much for tending to the ikigAI
Do you have no code quality standards in your company? No code reviews?
I sometimes use AI to generate boilerplate code that I use as a base, but I end up refactoring it so heavily that rarely does anything actually AI-written get committed.
Can't imagine AI code passing code review. The quality is still pretty rancid. It has problems following modern coding standards because it is trained on old data and super-overcomplicated stuff.
I feel AI is mostly useful as a super slow search engine. Now that Google has been completely shitified, yeah, it has its uses, but if we still had the good Google, would there be so much need for ChatGPT? When you could find relevant information in seconds and quickly copy-paste code from Stack Overflow instead of waiting for ChatGPT to generate the same thing?
Our internal models have improved dramatically. I think people are also better at using them now. Nearly every eng I know uses them to write non-boilerplate code as well.
I check in on Copilot every quarter or so to see if it’s gotten smart enough to write one of my simpler PRs for me, and as of 2 weeks ago it’s still not. Makes writing test cases a bit faster, but that was never really a bottleneck.
Copilot has an LSP I can add to whatever editor I need without breaking my entire existing developer workflow and tooling.
Cursor and Aide are both their own editors. Amazon Q might be able to integrate but a cursory search didn't immediately bring up a solution.
Even if the difference between Copilot and Cursor et al. is GPT-3.5 to Sonnet 3.5, it wouldn't be worth the time. Even the places where Copilot feels useful and improves productivity aren't worth it, in my recent experience. The improved coding speed just allows me to more quickly build shitty code that works around architectural issues. This was on a greenfield project, may I add, where people have praised the abilities of AI greatly. The only place I've actually found Copilot to be useful, and where I miss it when I turn it off, is its ability to autocomplete the current line reasonably well.
Even with boilerplate I have found it to make glaring mistakes. One example I had: generating the SQL for a many-to-many table when the PK in one table was a composite key. At first I thought, well, that was useful, it just generated the whole thing. Then I read the code it generated, and it had completely made up fields, when the table it was referencing was 5 lines above in the same file. You cannot even make an argument about context length, since the context is literally on the same screen.
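Roughly the shape it needed to produce (a sketch with made-up table names, not my actual schema): the link table has to carry and reference every column of the composite key, which is exactly the part it invented fields for instead.

```python
import sqlite3

# Hypothetical schema, just to illustrate the failure mode: a many-to-many
# link table must reference *all* columns of a composite primary key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_items (
    order_id INTEGER,
    line_no  INTEGER,
    sku      TEXT,
    PRIMARY KEY (order_id, line_no)                 -- composite key
);
CREATE TABLE tags (
    id   INTEGER PRIMARY KEY,
    name TEXT
);
CREATE TABLE order_item_tags (                      -- the many-to-many table
    order_id INTEGER,
    line_no  INTEGER,
    tag_id   INTEGER REFERENCES tags(id),
    PRIMARY KEY (order_id, line_no, tag_id),
    FOREIGN KEY (order_id, line_no) REFERENCES order_items(order_id, line_no)
);
""")
```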
I've seen much, much worse examples in actual code, where the generated code is close enough that unless you actually properly review it you will miss the mistakes. So now not only do I need to think about the architecture, I need to write half the code and code-review the other half assuming a monkey high on LSD wrote it.
The real threat isn't that AI will steal coding jobs... The threat is that all the good developers will get so frustrated that farming goats might actually be a viable option.
The example of seeing a stripy equine in Colorado and having to decide whether it's a horse, ass, or zebra is a good one and highlights what base rate means.
However, it even more strongly highlights a problem that comes up in probability: missing assumptions. As in, there are several important missing assumptions in the example.
Suppose you're hiking in a country without wild zebras and you see what is clearly a white equine with black stripes.
Which is more likely:
1. It's a zebra.
2. Someone painted stripes on their horse.
3. You saw patterns of shadows on a white horse.
4. You accidentally absorbed hallucinogenic fungus through your skin when you wiped your hand on a tree branch.
...and so on. The longer you think about any unlikely events, the more unlikely possibilities you can come up with.
In fact 2. is something I've personally encountered on a hike before, although it wasn't in Colorado.
How would you know which of these unlikely events had a higher base rate? It seems unknowable. It depends on whether there are some isolated farms nearby where people keep horses, or whether there was a small zoo in the next town that you didn't know about, whether hallucinogenic fungus grows in this area, and so on.
I think it's important to address this in any discussion of probability theory, because hard though the maths is, the complexity of reality is even harder, and it's usually omitted or hand-waved away when teaching this stuff.
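To make the base-rate point concrete (completely made-up numbers, comparing just two of the hypotheses above): the posterior is dominated by priors you can't actually know, not by how clearly you saw the stripes.

```python
# Made-up priors and likelihoods, purely illustrative.
p_zebra          = 1e-6    # prior: a loose zebra out here
p_painted_horse  = 1e-4    # prior: someone painted stripes on a horse
p_stripes_if_zebra   = 0.999
p_stripes_if_painted = 0.95

joint_zebra   = p_zebra * p_stripes_if_zebra
joint_painted = p_painted_horse * p_stripes_if_painted
posterior_zebra = joint_zebra / (joint_zebra + joint_painted)
print(f"P(zebra | stripes) = {posterior_zebra:.3f}")  # ~0.010
```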
Exciting to see an article by Prof. Romer on the front page of HN. Thanks.
As the imitation game is a marker of progress, AI is absolutely, by design, trivially, a con.
I am questioning this sentence:
"instead of zebras versus horses, the types they have to recognize are people who strive to maintain a reputation for honesty versus people who are willing to deceive"
The first pair are separate categories, but members of the second pair overlap to a defining degree, whatever the stakes.
> What if [various "AI" claims are] intended to exhaust our ability to consider other possibilities?
> Could AI be a con?
Oh my, yes. However, I think they want us to believe their claims and send them money, as opposed to using the claims to misdirect from something else. The direct profit motive is sufficiently compelling.
To be more specific, there are people out there creating cons ("investment opportunities") based on misrepresenting LLM-generated story characters as real-world entities.
For example, suppose I make a program trained on all the human stories of detectives and Christmas, and it generates a story of SherlockSantaBot, a fictional character depicted as being extremely clever and also able to know the good/bad quotient of all children in the world. Does that mean I actually invented a real AI with those capabilities? Nope.