This is just weird Twitter rage. The user seems to think the private information in their tax return will leak just because Gemini tried to summarize it.
To back this up, they reference the story about Gemini chats "leaking" on the public internet. What actually happened there was that people were creating publicly viewable chats via gemini.google.com/share/ links, and some of those got posted on social media and were indexed.
Gemini is "reading" your documents in the same sense that google servers are "reading" your documents in order to show them in your browser. Gemini is not a person and won't remember what it has "read".
If you don't want Google to handle your documents, then don't put your documents in Google Docs.
I get what you’re saying. But at a fundamental level there’s a difference between “using” content to do the thing I’m asking you to do (display it in a browser) and running it through an AI to do processing on that content.
I personally think the two situations are quite different.
I agree that if you don’t want Google to sniff on your content you shouldn’t put it on their servers to begin with.
That said, stating that Gemini won't remember is dubious: given the track record of these companies, I have my doubts that they don't log everything they can get their hands on.
Google Docs runs a lot of algorithms over the data you put in. For instance, it paginates them and shows a page count. That is an algorithm processing your data exactly like Gemini does. There is no option in Google Docs to prevent the pagination algorithm from reading my data and processing it.
Another example: Google Docs indexes the contents of your document. That is, it stores all the words in a big database that you don't see and don't have access to, so that you can search for "tax" in the Google Docs search bar and bring up all documents that contain the word "tax". There is no option in Google Docs to avoid indexing the contents of a document for the purpose of searching for it.
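For what it's worth, that "big database" is the textbook shape of an inverted index: a map from each word to the documents containing it. A minimal sketch (the data and structure here are purely illustrative, not Google's actual implementation):

```python
from collections import defaultdict

def build_index(docs):
    """Toy inverted index: maps each word to the set of document
    IDs whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

docs = {
    "doc1": "2023 tax return summary",
    "doc2": "grocery list",
    "doc3": "tax notes for accountant",
}
index = build_index(docs)
print(sorted(index["tax"]))  # → ['doc1', 'doc3']
```

Searching for "tax" is then just a lookup, without the search engine re-reading any document, which is exactly why the indexing pass has to "read" everything up front.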
When you decide to put your data into Google Docs, you are OK with Google processing your data in several ways (which should hopefully be documented). The fact that you seem so upset that a specific algorithm is processing your data, just because it has the "AI" buzzword attached to it, seems like an overreaction prompted by the general panic we're living in.
I agree Google should be clear (and it is clear) about whether Gemini is being trained on your data, because that is something that can have side effects you have the right to be informed about. But Gemini merely processing your data to provide feature N+1 among the other 2 billion available is really not noteworthy.
> For instance, it paginates them and shows a page count.
Do you think this information Google is gathering can then be used in the future to paginate some other document? Do you think paginating my doc will help their algorithm better paginate documents in the future? I see what you're trying to say, but putting everything in the "algorithm" bucket doesn't help move the whole conversation around AI forward.
> The fact that you seem so upset
Your upset detector is clearly wrong. I don't use Google Docs. I don't care about Google Docs. I'm just adding my 2c to a conversation about the kind of practices Google and co. are using.
Isn't this why we're here on HN? To exchange ideas?
Google is pretty good at separating inference from training. If they wish to train on your data, they just train on your data; running the model on that data to give you information is a totally separate process.
“Google collects your Gemini Apps conversations, related product usage information, info about your location, and your feedback. Google uses this data, consistent with our Privacy Policy, to provide, improve, and develop Google products and services and machine-learning technologies, including Google’s enterprise products such as Google Cloud.”
“To help with quality and improve our products (such as generative machine-learning models that power Gemini Apps), human reviewers read, annotate, and process your Gemini Apps conversations. We take steps to protect your privacy as part of this process. This includes disconnecting your conversations with Gemini Apps from your Google Account before reviewers see or annotate them. Please don’t enter confidential information in your conversations or any data you wouldn’t want a reviewer to see or Google to use to improve our products, services, and machine-learning technologies.” [italics was bold in the original]
You can opt out of that. It's explained right after what you have quoted.
> To stop future conversations from being reviewed or used to improve Google machine-learning technologies, turn off Gemini Apps activity. You can review your prompts or delete your conversations from your Gemini Apps activity at myactivity.google.com/product/gemini.
Right now it’s an AI, but Google has been “reading” your docs since they invented Google Search; Gmail and Google Docs are just extensions. It’s literally their business model: collect all your info to show you relevant ads.
Oh, I know. I'm aware of that. Which is why I don't use a free Gmail account, I don't use Google Docs, and I run as much ad blocking as possible both at the browser and at the network level. And that's also why I'm sure Google is collecting as much data as possible when people use Gemini. Because why wouldn't they? It's their entire business model: collect data and then sell ads based on that.
> But at a fundamental level there’s a difference between “using” content to do the thing I’m asking you to do (display it in a browser) and running it through an AI to do processing on that content.
The thing you're asking it to do is to maintain the state of the document in a service. This way it's able to display the state of the document to one or more clients.
Arguably, running it through AI is an even more privacy-preserving feature, because Google Docs almost certainly writes the data to persistent storage (that is its main feature, and it cannot work without it), whereas the AI just needs to hold the data in RAM.
Just so we're clear, I'm personally not using G Docs, Gemini, or any other AI tool at the moment. My issue, if you can call it that, is more fundamental and relates to intentions.
Using data for something that's fundamentally different from its original use case is, in my view, problematic.
You gave them your consent when you opened the document in Google Docs.
And it is not like Gemini is the first NLP model to be reading your Google Docs. Google has had one of the most advanced spell/grammar checkers reading through your docs for years now.
Title is sensationalized. According to the thread, Gemini ingests a document only after you open the Gemini panel, and it keeps ingesting until the panel is closed.
> OK, more testing and I think I've figured it out (and it's still bad!). It seems that if you've ever clicked the Gemini button for a type of document then it remains open whenever you open another of that type--and therefore automatically ingests and summarizes it. So, e.g...
What is the problem here? Google already has your document, all they are doing with Gemini is running it through a bunch of transformers and spitting back some information. The act of sending input into an LLM does not itself train the LLM on that data (these systems don't "learn" via execution of normal input/output).
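To make the inference/training distinction concrete, here is a toy sketch with a one-parameter "model" (purely illustrative, nothing like a real LLM): the forward pass only reads the weights; the weights change only when an explicit, separate training step is run.

```python
class ToyModel:
    def __init__(self):
        self.weight = 2.0  # learned parameter, fixed once training is done

    def infer(self, x):
        # Inference: read the weights, produce an output.
        # Nothing about the input is written back into the model.
        return self.weight * x

    def train_step(self, x, target, lr=0.1):
        # Training: an explicit, separate operation that updates the weights
        # via a gradient-style correction toward the target.
        error = self.infer(x) - target
        self.weight -= lr * error * x

model = ToyModel()
before = model.weight
model.infer(3.0)               # running your data through the model...
assert model.weight == before  # ...leaves the weights untouched

model.train_step(3.0, 9.0)     # only an explicit training step changes them
assert model.weight != before
```

So whether your document affects the model at all comes down to whether the operator chooses to feed it into a training pipeline, not to the act of inference itself, which is exactly why the consent question below matters.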
Is the concern that Google would consider your use of Gemini on this document as consent to use it for future training?
1. Some people (e.g. some artists) don't like generative AI as a matter of principle, seeing it as soulless, corporate, and entirely trained on stolen data.
2. Many people resent Clippy-style popup features. They appear at the most inconvenient times, and everyone knows they're mostly for the benefit of the product manager with a user count KPI. And the harder they get pushed, the more people resent them.
3. The distinction between things-known-from-training-data, things-known-from-context, and things-known-from-RAG is pretty opaque to most users - and not clearly guaranteed by the documentation. If it's an assistant that can schedule reminders and find things in your e-mails and Google Drive, and it promises "Personal Results" where "your communication requests will be used to improve your experience with Gemini", the distinctions are pretty ambiguous.
4. The LLM industry norm is to play fast and loose with training data.
Unless explicitly promised otherwise, the assumption is always that anything being passed to a LLM API may be retained for various reasons. That’s the expectation that’s been set by the industry.
Agreed, but as I understand it they do make that promise:
"We do not use your Workspace data to train or improve the underlying generative AI and large language models that power Gemini, Search, and other systems outside of Workspace without permission."
Edit: I believe that this "without permission" caveat refers to experimental things like Workspace Labs where you have to give them that permission if you decide to join.
Where can I see the value of the "user has granted permission" boolean relevant to my account, clearly and unambiguously? I want to be sure that I haven't accidentally given permission via an action that I may have not understood.
I'm not aware of any such on/off switch or indicator. You could check whether you can turn off Workspace Labs. If you can't turn it off, then you haven't joined.
If this is the case, then the phrasing "without permission" is meaningless and so is the clause attached to it. We have to assume that Google thinks we've "given permission".
Sure. That's also the assumption when you upload your private information to someone else's computer. Except for the "unless explicitly promised otherwise" part.
Why would that be limited to passing it to an LLM and not just passing it to any server? If Google wanted to use your data for training, they would just... do that? No Gemini needed
You should have control over your information per use case / purpose. The ad networks are also Google's property - would you be fine with AdSense looking at all your private documents? GDPR got that right: you consent to a specific usage, not to a free-for-all.
I don’t use Google to create personal documents precisely because I can’t trust or verify that their apps aren’t doing more than provide storage, versioning, conflict resolution, and an editing UI without my permission.
That said, I think it’s way past time to give up on hoping to trust a cloud / “big tech” provider.
There are now only three models for applications I’m happy to use, and I try to do everything in one of those three where possible:
1) Verifiably end-to-end encrypted. This is the only way I can get comfortable with cloud software.
2) Self hosted applications.
3) Local first applications with “dumb filestore” data syncing via a provider of my choice (likely self hosted).
In all cases I have a strong preference for truly open source (avoiding VC funded open core weirdness etc.) and try to donate more than the equivalent subscription or purchase price for proprietary alternatives to the projects I find that meet my needs.
Possession of a file doesn’t imply that Google was reading that file, and absent a very limited set of criminal actions for which a search warrant might be issued, there has always been a reasonable expectation that Google was not reading your files. Now Google is reading that file, apparently by default.
Not sure what you mean by "reading". Google has always been indexing all personal content to power the search feature (not public Google search). It's a major reason why I use their services.
Let’s take “reading” to mean “learning the semantic meaning of”, then. An index is really just a data structure populated with the words and a link back to the document, with no developed notion of the semantic meaning of those words (or of something like “odds this is the user’s SSN”) associated with the content, and it could easily be held entirely within your user account, accessible only to you. This is much closer to a Google employee opening your document, reading it, and emailing you back a summary as a default action, even if I’ll admit that employee is not generally intelligent. If the technical details make it so the “semantic index” is held only within your user account, is only ever accessible by you, and cannot under any circumstances leak out to Google at large, then okay, I might be comfortable with that, but I’d want to know those details first rather than just having this “feature” enabled for a file type.
"Gemini is your always-on AI assistant across Google Workspace"
So the user buys a product without even reading the product page, then complains about the product? m'kay
NO! Don't put your private documents on someone else's computer. That's easy.
But if you do, at least don't put them on the hard drive of a company that makes money by using your personal data for advertising. It sounds almost childish to have to explain.
It's not clear who you're replying to, but this isn't a conversation with a person. What's being discussed - away from the person who published this - are the truth claims in the post.