Googlers say Bard AI is “worse than useless,” ethics concerns were ignored (arstechnica.com)
47 points by pseudolus on April 21, 2023 | 38 comments



Apparently, Google employees were asked to test Bard pre-release, but their feedback was mostly ignored by managers intent on launching Bard as quickly as possible. Bloomberg obtained copies of pre-launch discussions in which internal testers describe Bard as "cringe-worthy" and "a pathological liar." One employee wrote a post titled, "Bard is worse than useless: please do not launch." Google launched it anyway a few weeks later.


Yeah, but as I've said before, if Google had GPT-4, we wouldn't have public access. The reality is that Google has filled itself to the brim with experts on their domains, rather than experts on how their domains fit into a larger picture.

It's cool to be a 10x engineer, but if you're turning basic tasks into 100x efforts in the pursuit of engineering rigor, you're going to get your lunch eaten by 1x engineers using unsexy, unscalable tech.

The source article is a much better read: https://www.bloomberg.com/news/features/2023-04-19/google-ba...

Based on its contents, I was not convinced that Bard is actually so terrible, as opposed to Google's internal testers simply upholding a bar that's not realistic for the current crop of LLMs, especially when you read things like:

> In some, like child safety, engineers still need to clear the 100% threshold.

How do you child-proof an LLM? Why wouldn't you instead just release it and say to supervise your child's use the same way you would for 99% of the internet?


> The source article is a much better read: https://archive.is/uhhl4


It seems odd to me that you're complaining that Google experts are not seeing the bigger picture, when from what I see they are the ones taking ethical concerns seriously.

If solving health and safety risks is "not realistic for the current crop of LLMs", maybe they in fact should not be in the wild? LLMs already have a body count [1], so child safety doesn't sound like an unreasonable thing to ask about. Dumping the responsibility on parents does not actually mitigate the ethical concerns; it's just a way of ignoring the "larger picture".

[1] E.g., https://www.brusselstimes.com/belgium/430098/belgian-man-com...


It's my view that "ethical concerns" in this context are confused at best, frequently worthless virtue signaling, and at worst an absurdly _unethical_ effort to use state power to block research and inquiry in order to protect oligopoly profits. As a bonus side effect they get used as an excuse to train in discriminatory biases, with whatever negative effects those will have-- probably mostly just undermining trust in the involved institutions.

Offer the system to consenting adults, make them aware that the output is frequently garbage, and TEACH them how it produces garbage so they can recognize some of the most obvious failure modes. Done. At that point the result would be generally safer than the URL bar in users' browsers, which can take them to all manner of harmful things, including potentially other LLMs -- or the most dangerous of intelligences: contact with other human beings.

Mentally ill people, violent people, etc. will always do unfortunate things involving whatever things are in proximity to them. The internet isn't "safe", libraries aren't "safe", the _world_ isn't "safe".

But even if the world itself were safe, people would still manage to harm themselves and others with it, because people aren't safe. No one bothers writing about people who committed suicide after losing a game of chess to a child or after watching sad movies, but those things happen too. It's a big world, so you name it: someone has probably killed themselves over it or with it, or killed someone else because of it.

Go look at the mobile 'games' offered in the Play Store. Can you look at Google raking in tons of money turning children into click-a-holics and tell me seriously that you think "safe" was ever a significant objective?

It's only big in the context of ML because there is a socially well-connected apocalypse cult ( https://archive.is/eqZx2 ) pushing the idea that today's ML systems are materially unsafe in ways that other things aren't, and it's a commercially convenient narrative to protect oligopolies.


I guess your view is that if a new thing is approximately as terrible as an existing thing, then it's fine. Mine is that new things are a rare opportunity to make the world less terrible, so we should take a swing at it.


Fair position, but I say: what is somewhat rare is the opportunity to make large changes, but it's usually a false opportunity because large changes are generally ill-advised. However, large changes aren't the only ones we can make: we can and should always improve, and if we do, the improvements don't need to be large. This is good because large improvements often don't work-- because people route around them due to their cost or unfamiliarity-- or don't do what we thought they'd do-- because we're not as smart as we think we are and we can easily make things worse through our efforts to do better.

So that's why I suggest things like putting LLMs behind education in a way that we don't for search boxes or URL bars. I think anything we do to make us all more savvy consumers of information makes the world less terrible. Yet it's an improvement that is unlikely to backfire, doesn't present much incentive to route around, is still effective even if partially routed around, doesn't significantly impede forward progress, and doesn't increase costs tremendously.

I can empathize with the frustration that there are so many unfixed things that could be improved, but I'm confident in mankind: if we all keep nudging in the right direction, we'll get there-- and get there sooner than we would by attempting a Great Leap Forward that has too great a chance of disaster, too great a chance of stopping our progress or setting us back, or too great a cost.

As in software, where the most complex code you can write is code you can't debug or maintain (because you've got to be 10x smarter to debug it than to write it), in our cultural progress we can imagine advances far greater than we can accomplish safely and sustainably in practice. But what we can accomplish brought us to where we are now, and we should feel proud of it and confident in the future, and know that no matter how much better we make things we'll still think they're terrible and find ways to improve them.


You seem to be going around to everyone who replied to you yesterday and responding in anger. If you're having a bad day, it might be better to not reply.


wpietri's response to me was in no way in anger, and his position in it is a perfectly legitimate one.


I definitely took reducing your entire comment to "I guess your view is that if a new thing is approximately as terrible as an existing thing, then it's fine" as not coming from a place of good faith, but maybe I'm being harsh.


How are they taking ethical concerns seriously? They launched the thing despite the internal employees' feedback that the thing lies and says things that would lead to injury or death.


Sorry, I am speaking of the employees taking concerns seriously, not Google itself. I'll edit to clarify.


What is Google but the sum of its employees?


Do you think Google is some sort of... democracy? Like they all get together and build consensus for a particular course of action?


You're showing off a perfect example of understanding your domain vs understanding the bigger picture.

In your mind not having "100% child safety" is apparently equivalent to "not solving health and safety risks". Unfortunately that's not a take that's compatible with the reality on the ground. The internet is not 100% child safe and if we had locked it up until it was, there'd be no internet.

What people fail to understand is that sometimes settling for something strictly worse in domain terms results in a much better overall outcome.

For example, who's having more impact on the future of safety and alignment in LLMs:

- Someone working on ethics at OpenAI, who settled for Google's "80%" on all topics, including child safety, and is therefore empowered to actually achieve that (with a massive head start on learning how alignment is being broken in the real world)

- Or someone working on ethics at Google, who was stuck on 100% until their project stalled, their entire opinion got discarded, they got laid off, and the thing went out with an "experiment" label and significantly less testing?

You can't steer the ship if it sinks. Some companies are currently at a place where everyone has their hand on the rudder, pulling in their own direction as the ship runs into an iceberg, because everyone is certain that their direction is correct and the other direction is certain doom for humanity. At the end of the day you need to learn to discard that sort of extremism; the world is not that black and white.


Seems people have different opinions on what’s acceptable.

Many people seem quite happy with ChatGPT despite its tendency to hallucinate with no warning of any kind.

Then there are people like me who feel that its unreliability significantly decreases its usefulness - and it's probably completely useless as a "primary source" of information; it's at best equivalent to a Stack Overflow post, except without the community feedback to detect bad posts. As with any Stack Overflow solution, it's wise to double-check with a "primary source" like the official documentation to make sure it does what you think it does.

I can understand why some at Google might be hesitant. The information provided by Bard would be seen as coming from Google and if it’s incorrect, Google will be blamed.


The level of irony in saying the world is not black and white... and then getting a reply that treats being happy with ChatGPT as equivalent to not understanding unreliability's effect on a tool's usefulness.

Usefulness is not a binary value. One of the biggest advancements of 4 over 3.5 is that it's more reliable: that has made it significantly more useful.

I will say that having to explain the relationship between reliability and usefulness tracks with why certain ethical voices just get ignored. At some point you need to have some reasonable shared level of understanding to have a productive conversation.

When one side of the conversation thinks others are just so completely lost that they can't understand how something being unreliable makes it less useful... there's just not much productive conversation to be had.


> Many people seem quite happy with ChatGPT despite its tendency to hallucinate with no warning of any kind.

> Then there are people like me who feel that its unreliability

Depends on what you're using it for: if you're using it for creative purposes, the made-up information is usually great.

If you're using it instead of stack overflow... yeah, then made up stuff can be a problem. ... except sometimes the made up stuff turns out to be correct. :)


> You're showing off a perfect example of understanding your domain vs understanding the bigger picture. [...] you need to learn to discard that sort of extremism

How very strident of you. What exactly do you think my "domain" is? What "extremism" do you think I'm in the grip of?


My comment says "can't be 100% child proof". But you proceeded as if my comment implied child safety is an unreasonable thing to ask about (literally in those words). That's either a bit of extremism or a strawman argument. I was being charitable in assuming you wouldn't make a bad-faith argument, and attributed it to a more ideological source instead.

And I think you misunderstood: I don't think your domain is AI, since you're saying "solve health and safety risks". People in AI tend to state that problem a lot more concretely than shocking article headlines, and tend to use more nuanced language than "solve", since alignment is a complex topic.

It's just that you're starting your comment by positing that your position is in line with that of Google's experts, and I'm saying that's a great example of zooming in on your domain rather than the larger picture.

After all, the only way to claim that a warning that kids should be supervised "does not actually mitigate the ethical concerns" is either a refusal to look at the larger picture or a misunderstanding of what "mitigate" means.


From my perspective, you are not only not addressing what I'm asking, you're throwing up a wall of prose in a manner that reminds me of a squid squirting ink, replete with what seem to me distortions and straw men. I tried a couple times, but this certainly doesn't look like conversational progress to me, so I'll tap out here.


There are much more civil ways to state that you're out of your depth.


I help answer questions on a few IRC channels. ChatGPT has made our life considerably and noticeably harder, as the knowledge that comes out of it is often blatantly wrong, yet extremely confident. We spend a lot of time untangling it.

I would not be sad one bit if all of it died tomorrow. In my view, it seems it's more important for AI to look coherent and impressive than to be correct. It's very possible it will provide net negative results if we don't get this under control. There's no clock to race against. Let's take our time and get it right.


I've used IRC for years and people being confidently wrong has always been an issue.

LLMs are definitely a new concept, but not every facet of them is without precedent. Someone repeatedly surfacing a bad source isn't new: you label the channel so people know it's banned, and if the same person keeps surfacing bad info you mute them, then ban them if it continues.

If you have a sudden influx of low-quality new users, that's also not new: you require authentication, and that alone cuts down on a ton of spam.


IRC? Try HN. ... or really just about anywhere. People rarely post "I don't know"; if they know they don't know, they don't post. ... and if you don't post, you don't exist. So in any online venue the level of confidence is dramatically overstated, and probably the amount of wrong answers too.

Not that things would be much better for LLMs if that weren't true. LLMs that learned their "don't know" practices from text would only be right about it on things that nobody knows, and would probably give erroneous "don't know" answers on things they actually do know, so it would be just another way of being incorrect. :)

Suppressing out-of-control making stuff up is hard for human minds too. Go talk to a four- or five-year-old. :) I've wondered if part of the reason is that it's something much more easily learned through interaction than from observation.


Well put. I think this is especially important:

> There's no clock to race against.

Nothing bad is going to happen if society doesn't rush LLMs out. It's not like we're facing a shocking deficit of text.


no clock for society. some clock for companies to get ahead and make money while they can


I remember the launch well:

Google shares drop $100B because of Bard : https://www.npr.org/2023/02/09/1155650909/google-chatbot--er...

Posted by nigamanth 69 days ago - 8 comments : https://news.ycombinator.com/item?id=34749560


> google layoffs thousands: stock goes up

> google releases bard: stock goes down

some balance in the universe.

you have this futuristic AI tech and you name it "Bard".


What does ethics mean here? That the robot hallucinates? So do all the others.


It means not putting profits ahead of concerns about a technology that has already proven to be garbage, just because you are nervous about other companies winning the AI hype race.


One thing I don't understand about the AI "ethics" debate is that it's a debate about the behavior of a singular system. One single AI. In this case, Google's.

Do people honestly think there is some happy medium that everyone will agree with for standard behavior of a machine that is attempting to emulate the human mind?[1] Of course there isn't. Humans exist on a vast spectrum. It seems we're somehow arguing about one machine that attempts to emulate that spectrum.

[1] https://en.wikipedia.org/wiki/Dune_(franchise)#The_Butlerian...


Having worked at a lot of these companies, including Google itself, I can say most of the employees are complete whiners and complainers who will block everything.

I think they made the right choice in ignoring them if they want velocity.


Sounds like Google+ twelve years ago - employees detailed precisely what would happen, and it happened and it was a disaster. In fact, I think Google+ is what finally did in Google's golden reputation with techies.

The answer comes down to who the PM is and what their incentives are. That's how Google+ became an all-consuming trash fire, and it'll be what's happening here.


Good job google, now do search.


Let's be honest: ethics only matter to corporations when finances are negatively affected.


I'm wary of anyone claiming to be the arbiter of ethics in general. It always sounds like just authoritarianism in disguise.


I think corporations caring about ethics in so far as it impacts the bottom line is a much safer outcome than them caring about ethics as a terminal objective...

All the serious atrocities in the history of man have been committed by people who believed what they were doing was good. Trying too hard to be good is exactly how you do evil, and not just conventional banal evil, but the truly horrifying stuff like outright genocide.

Ethics as triggered by bottom line concerns becomes part of a feedback cycle involving the greater market. It provides a way to stem some of the worst ethical transgressions without having much potential for running off on a tangent and doing evil by trying too hard to do "good", as those efforts will also hurt the bottom line.

Greed follows incentives, so collectively we can predict and control much of the harm it can create. When some person or organization is motivated by belief it's going to be much less responsive and we better hope that their beliefs are exactly right or that the entity doesn't have much power.

It's easy and common for bloated quasi-monopolists to engage in an awful lot of activity which has little to do with improving the bottom line, so the mere existence of ambient capitalism isn't sufficient to ward off misguided and out-of-control do-gooding. It can be particularly bad when they do, since they begin those activities with a lot of power (which they usually gained through productive activities in the past, before they were diverted).

Overwrought claims of do-gooding are also easily abused to undermine our civil process and steamroll protections which were established because our forebears recognized the risk of popular whim and expedience. A few "think of the children"s and you've deputized the state to carry out your campaign of oppression. We're a bit less willing to roll over just because someone wants to make more money.

I don't mean to argue that what you get from profit-driven ethics is great, or that we can't supplement it with other backstops (including regulations, whistleblower protections, a non-precarious employment situation so people can follow their consciences, etc.), but you really can do much worse than "is ethical insofar as it's profitable, is unethical insofar as it's profitable and you can get away with it", and I think that perspective is important.

From this position it's easier to see how google's transition from "Don't be evil" to "Do the right thing" can be seen in an absolutely chilling light.



