I've worked on some projects that used ML and the like to half-automate things, on the theory that the computer would do most of the work and people would check its output for quality control.
Three problems with this:
* salespeople constantly try to sell the automation as more complete than it is
* product owners try to push us developers into making it more fully automated
* users get lulled into thinking it's more complete than it is (and accept suggestions instead of thinking the issues through as deeply as they would if they had to work from scratch)
Which is a very real component of the whole system at large. (Think along the lines of assemblage/actor-network theory.)
Maybe fixing management is a more pressing issue than working on the task of self-replacement in the name of profit for others. Thinking about it, the implications are interesting.
What is the energy consumption of a human thinking in comparison with the energy requirement of a possible machinic replacement?
To me they feel more like Moloch-style problems; systems that work towards more automation will always have to deal with these problems. You can't manage your way out of users trusting your product too much.
E.g. for classification you can judge "certainty" by the softmax outputs of the classifier, and in the less certain cases refuse to classify and send the item to humans.
And also do random sampling of outputs by humans to verify accuracy over time.
It's just that humans are really expensive and slow though, so it can be hard to maintain.
But if humans have to review everything anyway (like with the EU's AI Act for many applications) then you don't really gain much, even though the humans would likely just do a cursory rubber-stamp review anyway, as anyone who has seen Pull Request reviews can attest.
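A rough sketch of the two ideas above (refuse-and-escalate on low softmax confidence, plus random spot checks of whatever was auto-accepted). The function names, the 0.9 threshold and the 5% audit rate are all made up for illustration, and the top softmax probability is only a crude proxy for certainty unless the classifier is calibrated.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of raw classifier scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(logits, threshold=0.9):
    """Return (label_index, confidence, destination).

    destination is "auto" when the top softmax probability reaches the
    (hypothetical) threshold, otherwise "human".
    """
    probs = softmax(logits)
    confidence = max(probs)
    label = probs.index(confidence)
    destination = "auto" if confidence >= threshold else "human"
    return label, confidence, destination


def pick_for_audit(auto_items, rate=0.05, rng=None):
    """Randomly select a share of auto-accepted items for human spot checks."""
    rng = rng or random.Random()
    return [item for item in auto_items if rng.random() < rate]


if __name__ == "__main__":
    print(route([5.0, 0.0, 0.0]))   # one class clearly dominates
    print(route([1.0, 0.9, 0.8]))   # near-uniform scores
    print(len(pick_for_audit(list(range(1000)), rate=0.05)))
```

The audit rate is the knob that trades human cost against how quickly you notice accuracy drifting, which is exactly where the "expensive and slow" part bites.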
I have the same experience but I am still 5 to 10 times more productive using Claude. I'll have it write a class, have it write tests for the class and give it the output of the tests, from which it usually figures out problems like "oops those methods don't exist". Along the way I am guiding it on the approach and architecture. Sometimes it does get stuck and it needs very specific intervention. You need to be a senior engineer to do this well. I end up with way more tests than I would have the patience to write myself. Importantly, since it now has the context loaded, I can have it write nicely formatted documentation and add bells and whistles like a pretty CLI, with minimal effort. In the end I usually get what I want with better tests, docs and polish in a fraction of the time, especially with Cursor, which makes the iteration process so much faster.
One of the big subtle problems is designing the broader interaction so that the humans in the loop are both capable and motivated to do a proper review of every item that will occur.
LLMs are able to counterfeit a truly impressive number of the indirect signals which humans currently use to make snap judgements and take mental shortcuts, and somehow reviewers need to be shielded from that.
It's how humans lived for all of history before the Internet. Seems healthier to me. If you're not close enough to someone for them to want to share updates with you specifically, or to see them and catch up, why do you need to know every update on their life?
Tbf I'm in a family group WhatsApp chat, which I guess fulfills the "life updates" part for my family. But no public social media, don't see the need.
That's exactly it. You have to bring the "copy" of the left side to overlap with the "original" right side. Then move forward/backward until the two overlapped images come into focus. The shimmering should stick out; it was so shocking when it finally clicked for me.
In theory they try to get people hired for their competence rather than their network. A widely-cited anecdotal example of this reportedly working well is the Rooney Rule: https://www.espn.com/nfl/playoffs06/news/story?id=2750645
This thread also has a lot of anecdotal examples of failure modes of 'diverse slate' rules, though, such as people who have already decided who to hire still interviewing women candidates just to appease the rule, thus wasting everyone's time.
I aim to discuss this topic factually while noting that views vary significantly among San Francisco residents. Some common political positions that would generally be considered far-left in the U.S. context include support for:
- Housing as a human right and strict rent control policies
- Universal basic income and significantly higher taxes on wealthy individuals and corporations
- Complete defunding or abolition of police departments
- Immediate and dramatic action on climate change, including bans on private vehicles