
I think they’re called “commercial displays” now (no malware subsidy like smart TVs tho)

Original research author here. It's exciting to find so many thinking about long-term code quality! The 2023 increase in churned & duplicated (aka copy/pasted) code, alongside the reduction in moved code, was certainly beyond what we expected to find.

We hope it leads dev teams, and AI Assistant builders, to adopt measurement & incentives that promote reused code over newly added code. Especially for those poor teams whose managers think LoC should be a component of performance evaluations (around 1 in 3, according to GH research), the current generation of code assistants makes it dangerously easy to hit tab, commit, and seed future tech debt. As Adam Tornhill eloquently put it on Twitter, "the main challenge with AI assisted programming is that it becomes so easy to generate a lot of code that shouldn't have been written in the first place."

That said, our research significance is currently limited in that it does not directly measure what code was AI-authored -- it only charts the correlation between code quality over the last 4 years and the proliferation of AI Assistants. We hope GitHub (or other AI Assistant companies) will consider partnering with us on follow-up research to directly measure code quality differences in code that is "completely AI suggested," "AI suggested with human change," and "written from scratch." We would also like the next iteration of our research to directly measure how bug frequency is changing with AI usage. If anyone has other ideas for what they'd like to see measured, we welcome suggestions! We endeavor to publish a new research paper every ~2 months.
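
For anyone curious what this kind of measurement looks like mechanically, here is a rough Python sketch of estimating "churn" (lines replaced within two weeks of being written) straight from git history. To be clear, this is an illustrative toy, not our production methodology; the repo path, the 14-day window, and the text-only diff handling are all assumptions.

    import re
    import subprocess
    from datetime import datetime, timedelta

    REPO = "."                   # assumption: path to the repository under study
    WINDOW = timedelta(days=14)  # assumption: "churned" = replaced within 2 weeks

    def git(*args):
        return subprocess.check_output(["git", "-C", REPO, *args], text=True)

    def blame_dates(commit, path):
        """Map line number -> author date for `path` as it existed at `commit`."""
        dates, line_no = {}, 0
        for line in git("blame", "--line-porcelain", commit, "--", path).splitlines():
            if re.match(r"^[0-9a-f]{40} ", line):   # one header per blamed line
                line_no += 1
            elif line.startswith("author-time "):
                dates[line_no] = datetime.fromtimestamp(int(line.split()[1]))
        return dates

    def churned_fraction(commit):
        """Share of the lines deleted by `commit` that were younger than WINDOW."""
        when = datetime.fromtimestamp(int(git("show", "-s", "--format=%at", commit)))
        young = old = 0
        dates, old_line = {}, 0
        for line in git("diff", "--unified=0", commit + "^", commit).splitlines():
            if line.startswith("--- a/"):           # start of a changed file
                dates = blame_dates(commit + "^", line[6:])
            elif line.startswith("@@"):             # hunk header: reset old-file line counter
                old_line = int(re.match(r"@@ -(\d+)", line).group(1))
            elif line.startswith("-") and not line.startswith("---"):
                if when - dates.get(old_line, datetime.min) < WINDOW:
                    young += 1
                else:
                    old += 1
                old_line += 1
        return young / (young + old) if (young + old) else 0.0

Running churned_fraction("HEAD") over each commit in a repo and plotting the result by month would give a crude, do-it-yourself version of the churn trend line in the report.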


> We hope it leads dev teams, and AI Assistant builders, to adopt measurement & incentives that promote reused code over newly added code.

imo, this is just replacing one silly measure with another. Code reuse can be powerful within a code base, but I've witnessed it cause chaos when it spans code bases. That is to say, it can be both useful and inappropriate/chaotic, and the result largely depends on judgement.

I'd rather us start grading developers based on the outcomes of software. For instance, their organizational impact compared to their resource footprint, or errors generated by a service that are not derivative of a dependent service/infra. A programmer is responsible for much more than just the code they write; the modern programmer is a purposefully bastardized amalgamation of:

- Quality Engineer / Tester

- Technical Product Manager

- Project Manager

- Programmer

- Performance Engineer

- Infrastructure Engineer

Edit: Not to say anything of your research; I'm glad there are people who care so deeply about code quality. I just think we should be thinking about how to grade a bit differently.


> this is just replacing one silly measure with another

> Not to say anything of your research

The second statement isn't true just because you want it to be true. The first statement renders it untrue.

> I'd rather us start grading developers based on the outcomes of software. For instance, ... errors generated by a service

yeah you should click through and read the whitepaper and not just the summary. The authors talk about similar ideas. For example, from the paper:

> The more Churn becomes commonplace, the greater the risk of mistakes being deployed to production. If the current pattern continues into 2024, more than 7% of all code changes will be reverted within two weeks, double the rate of 2021. Based on this data, we expect to see an increase in Google DORA's "Change Failure Rate" when the “2024 State of Devops” report is released later in the year, contingent on that research using data from AI-assisted developers in 2023.

The authors are describing one measurable signal while openly expressing interest in the topics you're mentioning. The thing is: what's in this paper is a leading indicator, while what you're talking about is a lagging indicator. There's not really a clear hypothesis as to why, for example, increased code churn would reduce the number of production incidents, the mean time to resolution of dealing with incidents, etc.
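
To make that leading-vs-lagging distinction concrete, here is a toy sketch (the names and formulas are my own shorthand, not definitions from the paper or from DORA):

    def two_week_churn_rate(lines_reverted_within_14d: int, lines_changed: int) -> float:
        """Leading indicator: computable from commit history alone, shortly after the change."""
        return lines_reverted_within_14d / lines_changed if lines_changed else 0.0

    def change_failure_rate(failed_deploys: int, total_deploys: int) -> float:
        """Lagging indicator: only knowable once deployment and incident data exist."""
        return failed_deploys / total_deploys if total_deploys else 0.0

The first can be computed the week the code lands; the second only after enough deploys (and failures) have accumulated, which is why the paper treats churn as an early warning rather than a substitute for outcome metrics.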


Thankfully in life no third person gets to dictate what I mean by my own words. There's plenty of good research that comes from studying the silly things; science is filled with things about which even an average person would say "duh" or "don't do that". That doesn't make them meaningless. If you disagree that's cool, but I still mean what I said exactly how I said it.

> yeah you should click through and read the whitepaper and not just the summary. The authors talk about similar ideas

Ah, this whitepaper that's gated behind supplying my business email address?: https://www.gitclear.com/coding_on_copilot_data_shows_ais_do...

I read the article that was linked, which is generally what's expected of me on HN.

> The authors are describing one measurable signal...

I'm aware of the research around this topic, it's something I like reading about and I've read a lot of takes both academic and colloquial. That may be why I put that idea into words.

Maybe, just maybe, in our future interactions you can avoid being so unnecessarily hostile?


> That said, our research significance is currently limited in that it does not directly measure what code was AI-authored -- it only charts the correlation between code quality over the last 4 years and the proliferation of AI Assistants

So, would a more accurate title for this be "New research shows code quality has declined over the last four years"? Did you do anything to control for other possible explanations, like the changing tech economy?


> our research significance is currently limited in that it does not directly measure what code was AI-authored

There is actual AI benchmarking data in the Refactoring vs Refuctoring paper: https://codescene.com/hubfs/whitepapers/Refactoring-vs-Refuc...

That paper benchmarked the performance of the most popular LLMs on refactoring tasks on real-world code. The study found that the AI only delivered functionally correct refactorings in 37% of the cases.

AI-assisted coding is genuinely useful, but we (of course) need to keep skilled humans in the loop and set realistic expectations beyond any marketing hype.
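
One lightweight way to keep that verification loop is sketched below. To be clear, this is a generic illustration, not anything from the paper; the pytest command and patch workflow are assumptions about a hypothetical project:

    # Sketch: refuse an AI-suggested refactoring patch unless the existing test
    # suite still passes; even then, only queue it for human review.
    import subprocess
    import sys

    def run(*cmd) -> bool:
        return subprocess.run(list(cmd)).returncode == 0

    if __name__ == "__main__":
        patch = sys.argv[1]                    # path to the AI-suggested patch file
        if not run("git", "apply", "--check", patch):
            sys.exit("patch does not apply cleanly")
        run("git", "apply", patch)
        if not run("pytest", "-q"):            # assumption: the project uses pytest
            run("git", "checkout", "--", ".")  # discard the suggestion
            sys.exit("refactoring broke the tests; rejected")
        print("tests still pass; queue for human review")

A gate this crude won't catch everything, but it at least turns some silently broken refactorings into explicit rejections before a human spends time on them, which is the spirit of keeping skilled humans (and test suites) in the loop.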


Sure would be nice if Firefox desktop would join the browsers that support PWAs. We build an app that has been PWA-first, but it is unfortunate that this generally requires users to have a Chrome instance running. I would much rather point people to Firefox, and it seems like it would be to Mozilla's advantage to give apps a reason to recommend FF by building a smoother PWA integration than Chrome's.


Mozilla's removal of SSB (single-site browser[0]) support, which is the key missing piece here, is completely mystifying.

I reckon that CEO salary has to come from somewhere though.

[0]: https://www.reddit.com/r/firefox/comments/uwojh7/why_did_fir...


Agreed. On Firefox, if I navigate to the main page I can scroll and look at all the items. If I go to a sub-page, then back to the main page, I can no longer scroll to see all of the "supported" features. :/


I don't understand the point of PWAs on desktop.

I find it much easier to open a website from the address bar than to open an "app" using Spotlight.

I always have a browser window open anyway.


Same reason why you might prefer to use an app over a website in general. I use PWAs a lot on Windows, and for me the single biggest benefit is that they don't clutter my browser tabs, but instead show up as separate icons that I can Alt+Tab to, see in the taskbar along with their notifications, move and resize the window as I see fit, etc. Sure, I could also run them in separate browser windows, but then window management gets much messier.


But I don't prefer using an app over a website in general. For me, the biggest benefit of tabs is that they don't clutter my task bar or my workspaces, and that I can Ctrl+Tab through them.


What does it even mean to "clutter" a taskbar if you exclusively use websites? i.e. what is the point of a taskbar devoid of apps other than the browser?

In-app tabs are supposed to be the second level of grouping, which is why they get their own shortcut. What you're describing is an attempt to flatten that hierarchy, which is a valid approach, but why are you surprised that not everybody shares your enthusiasm for it?


Firefox on Android is a worse experience than Chrome too, at least for this app. The app is harder to install (no prompt), the icon looks worse, and the icon has a Firefox badge on it.

I wonder if that's down to config for the PWA or a Firefox shortcoming. Anyone know?


> and the icon has a Firefox badge on it

This is something enforced by Android. If every app were able to pin icons to the homescreen without such a badge, phishing would be too easy ... add something that looks like a bank app to the homescreen, get people to type in their password ...

It's not fair, since that's not true for Chrome, but there's no obvious solution, other than, I guess, having some sort of "super trustworthy" status for a few other browsers like Firefox.


At least with Nova launcher, you can "edit" a PWA icon once it's on the home screen and un-check the badge to make it look more seamless.


I really wish they would do that.


For the poor few still using Evernote, there are many great alternatives now available:

Notion is the obvious one: https://noteapps.info/note_apps/disambiguate/evernote-vs-not...

Amplenote was heavily inspired by early Evernote: https://noteapps.info/note_apps/disambiguate/evernote-vs-amp...

Reflect is another up and comer: https://noteapps.info/note_apps/disambiguate/evernote-vs-ref...

And of course Obsidian has its many evangelists, though the sharing situation with Obsidian leaves something to be desired.


https://bill.harding.blog - Linux Touchpad project, Rails techniques, occasional entrepreneur musing


I had never specifically considered the distinction between "intelligence" and "consciousness," but after reading this, I'd agree that it may be an important distinction warranting consideration.

My sense after reading the article and the Wikipedia "Hard problem of consciousness" page is that consciousness is an evolved biological phenomenon that makes intelligent entities stateful. Being stateful is very useful from an evolutionary standpoint: firstly, it lets us pick through & prioritize long-term state; secondly, it imbues the will to continue maintaining said state. The second is what makes consciousness a dicey prospect to aspire to w/ AI.

While the author asserts that consciousness and intelligence are separate concepts that can develop separately, I'm less sure. It seems plausible that intelligence/problem solving is always improved by having state, and made more durable by it. That's presumably why evolution brought it along.

But we already have AI agents like Bard that build a history of responses akin to state. If "recognizing consciousness" is what happens when the speaker becomes aware of their state and able to meta-optimize it, then it seems that consciousness wouldn't ever travel too far behind intelligence.


Wow, a much more vitriolic first comment than I would expect from HN. The last six months have not lacked for Musk-hating enthusiasts howling “Twitter is dead.” But when I look at Google Trends, the story it tells is that interest in Twitter is almost identical to where it was 5 years ago. Doesn't seem so dead to me?

Have Musk's changes been net positive or negative? No shortage of internet opinion on that Q. Regardless of one’s personal feelings on it, it seems hard to dispute that Twitter moves faster than before. I consider that no small feat given how much legacy code and bureaucracy the company had at the time of his acquisition.

If he can free more of his time by making this hire, I am looking forward to seeing whether that translates to even faster iteration times. pg and all the others I followed are still there, so there is still plenty of potential to create quality entertainment/learning with less regret than Facebook.


The "gaps that seem obvious"-notion describes exactly how I feel about pull request tooling these days. The status quo for PR review has very obvious-seeming improvements that have not been pursued (ex: de-emphasizing moved code vs. deleted code, making AI predict what comment will be left before dev starts typing, auto-reviewing trivial changes).

If I'm correct that PR review is currently much less efficient than it will be soon, it won't be because I'm smart or this is a new idea. It would just be because our company has spent the last five years building code review tools, right place, right time. Eventually there was enough infrastructure accumulated (and ambient events unfolding, i.e., OpenAI) that it became a small step to pass the edge of what PR review meant circa 2022.
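
As a toy example of the first gap (de-emphasizing moved code), here is roughly how one might tag added lines in a diff that are really just relocations of deleted lines. This is an illustrative sketch, not how any shipping tool does it:

    def tag_moved_lines(diff_text: str, min_len: int = 20):
        """Pair each '+' line in a unified diff with a flag for "probably just moved".

        A '+' line counts as a likely move when its content exactly matches a
        '-' line elsewhere in the same diff; short lines are ignored because
        trivial lines (braces, imports) match by coincidence.
        """
        removed = set()
        added = []
        for line in diff_text.splitlines():
            if line.startswith("-") and not line.startswith("---"):
                removed.add(line[1:].strip())
            elif line.startswith("+") and not line.startswith("+++"):
                added.append(line[1:])
        return [(a, len(a.strip()) >= min_len and a.strip() in removed) for a in added]

A review UI could then gray out hunks where most added lines carry the "moved" flag, so reviewer attention goes to genuinely new logic.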


We were defrauded by Twilio as well: https://bill.harding.blog/2019/08/13/twilios-incentives-to-a...

Maybe a class action possibility here?


As has often been Stripe's way as a company, they are setting the bar for what other companies should strive toward. Has there ever been a more generous severance package posted to HN?

As the owner of a (much, much) smaller company, I'm inspired by how the Collisons run their business, especially under adverse circumstances. Yes, they fucked up in estimating the future market, but they are in good company among CEOs and non-CEOs lately.

