Don’t type anything into Gemini, Google’s family of GenAI apps, that is incriminating — or that you wouldn’t want someone else to see.
That’s the PSA (of sorts) today from Google, which in a new support document describes how it collects data from users of the Gemini chatbot apps for web, Android and iOS.
Google notes that human annotators routinely read, flag, and edit conversations with Gemini (albeit conversations “disconnected” from Google Accounts) to improve the service. (It’s not clear whether these annotators are in-house or outsourced, which can matter when it comes to data security; Google isn’t saying.) These conversations are retained for up to three years, along with “relevant data” such as the languages and devices the user used and their location.
Now, Google does give users some control over which Gemini-related data is retained, and how.
Disabling Gemini App Activity in Google’s My Activity dashboard (it’s enabled by default) prevents future Gemini conversations from being saved to a Google Account for review, meaning the three-year retention window won’t apply. Individual prompts and conversations with Gemini, meanwhile, can be deleted from the Gemini App Activity screen.
However, Google says that even when Gemini App Activity is turned off, Gemini conversations will be stored in a Google Account for up to 72 hours to “maintain the safety and security of Gemini Apps and improve Gemini Apps.”
“Do not enter confidential information in your conversations or data that you would not want a reviewer to see or for Google to use to improve its products, services, and machine learning technologies,” Google writes.
To be fair, Google’s GenAI data collection and retention policies aren’t all that different from those of its rivals. OpenAI, for example, stores all chats with ChatGPT for 30 days, regardless of whether ChatGPT’s chat history feature is disabled, unless a user is subscribed to an enterprise-level plan with a custom data retention policy.
However, Google’s policy illustrates the challenges inherent in balancing privacy against developing GenAI models that rely on user data to improve.
Liberal GenAI data retention policies have landed vendors in hot water with regulators in the recent past.
Last summer, the FTC demanded detailed information from OpenAI about how the company vets the data used to train its models, including consumer data, and how that data is protected when accessed by third parties. Abroad, Italy’s data privacy regulator, the Italian Data Protection Authority, said that OpenAI lacked a “legal basis” for the mass collection and storage of personal data to train its GenAI models.
As GenAI tools proliferate, organizations are becoming increasingly wary of privacy risks.
A recent survey from Cisco found that 63% of companies have placed restrictions on what data can be fed into GenAI tools, while 27% have banned GenAI altogether. The same survey revealed that 45% of employees have entered “problematic” data into GenAI tools, including employee information and non-public files about their employer.
OpenAI, Microsoft, Amazon, Google and others offer GenAI products geared toward enterprises that explicitly don’t retain data for any length of time, whether for model training or any other purpose. Consumers, as is often the case, get the short end of the stick.