Hacker News
Google Groups Data To Be Destroyed (artima.com)
193 points by barrkel on Oct 20, 2010 | 84 comments

It's October, right? And Google is notifying anyone who creates a Page or File that the feature is going to go away in February? (And it looks like they started doing so in September?)

Seems like plenty of notice to me. One would assume that as the deadline gets closer, more prominent notices will be given. How many months do you need to move your data?

Google should be held to high standards, both because they are an important part of internet infrastructure and because they've stated they aspire to high standards. Five months' notice meets those high standards as far as I'm concerned.

I disagree. I have a friend whose mom died when she was in college. In the year following, she kept all her journals and poems in emails in her Yahoo mail account. That's a safe place for them, right?

The years went on and she got a Gmail account, but she would log in to Yahoo every year or so to read through her old journals to help process what was going on in her life.

Only one year she tried to log in and her account was gone. Yahoo, she learned, simply deletes your email if you don't log in for a year. And she had gone 14 months.

Personally, I think deleting data like this--especially when the storage requirements are so tiny--is unforgivable. I can think of many, many reasons why someone would put something important on Google Groups and then not log in for six months. There are people away for Peace Corps right now. There are people deployed in Afghanistan. There are people hiking the Appalachian Trail (well maybe it's a little late for that).

Many thousands of people will feel completely violated if Google does this. Google can count on that. For those people, Google will have simply destroyed their most precious data without warning.

I used to think that one of the selling points of Google Apps was that I could leave my data there and trust them to protect it. No more worrying about crashed hard drives. But it's become clear that's not the case.

> That's a safe place for them, right?

Fuck no! Double fuck no!

Do people actually think that? That's beyond ignorant/naive. Data in the hands of a for-profit corporation, safe? Did you read the terms of service / limitation of warranty? The short version is "fuck you and your data". It's not safe from loss at the whims of market/shareholder needs/etc. It's also not safe from being compromised, either by internal employees or external hackers. It's also not protected by most of the Bill of Rights (those mostly apply to government vs. people, not business vs. people). It's easily subpoenaed, if the business doesn't just turn over the data on request.

Safe is encrypted on multiple devices geographically (and preferably jurisdictionally) separated.

The lowest acceptable level that can still be considered safe (a very loose definition of safe) is: a copy in plain/open formats on a device I physically control/hold (my desktop hard drive), and a copy somewhere else: Mom's house, Dropbox, Yahoo mail, etc.
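For anyone who wants to check that the "copy somewhere else" really matches the one on the desktop, here is a minimal sketch. It assumes the copies are reachable as ordinary file paths (a mounted drive, a synced folder); the function names `sha256_of` and `intact_copies` are made up for illustration, not part of any tool mentioned above.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def intact_copies(original: Path, copies: list[Path]) -> list[Path]:
    """Return the copies that exist and have the same SHA-256 as the original."""
    expected = sha256_of(original)
    return [c for c in copies if c.is_file() and sha256_of(c) == expected]
```

If `intact_copies` returns fewer than one path outside the machine holding the original, the "lowest acceptable level" above is not met. This only detects missing or altered copies; it does nothing about encryption or jurisdiction.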

> Do people actually think that? That's beyond ignorant/naive.

Yes, people actually think that! Have you met 90% of your (ok, Google or Yahoo's) customers? Barely anyone tells them to worry about saving their data locally and these companies actively ask you to load your data into them listing safety as a feature. "It's in the cloud!" blah blah

You can call people ignorant/naive when Google/Yahoo/Hotmail et al. display prominent "Your data is going away in 6 months AND HERE'S HOW TO GET IT." Emphasis on the "here's how to get it" otherwise they're suddenly at a loss of how to get their data back.

Most "normal" people don't assume there's a moratorium on data. Data is forever. You put it on a floppy disk, it stays there forever. All you can do is lose the disk. Same thing with flash drives. Same thing with online services.

The difference here is, as you said, safety is pushed even harder with the cloud. There's nothing to lose anymore. The people who are savvy enough to realize that storage can and does die are now being told that people far more competent at it than them are managing backups and such. Why would you think your stuff would go away? Who is telling you that it might? I don't think it's ignorant/naive at all.

I do think that what Google is doing sucks, because I don't think it costs them much/anything when they're handing out 7 GB to anyone with a Gmail account. There are orders of magnitude fewer Groups users than Gmail users, and Groups was a great way of organizing small project teams privately, in a way that Sites isn't (e.g. mailing list, files, pages all in one place).

To us engineers/founders/people with knowledge on how the software business works, it might be obvious, but not for the average Joe. Google (or Yahoo) will be a name brand and they'll hold high expectations of them, especially with Google's "Do No Evil" mantra.

That's "Don't Be Evil," actually. I think that's a pretty important distinction in tone.

Don't worry, I'm sure the NSA has it on a server somewhere along with the rest of our communications.

I'm looking forward to the day when we have a widely used p2p archiving network like OceanStore or Free Haven. Then we can stop worrying about this stuff.

My Master's Thesis covered exactly that. I'm currently working on commercialising the idea, so you can expect a "HN - Review my startup" within a year :)

If you don't mind linking to your thesis I'd love to read it!

Sure, but it's in Swedish. If you still want it I can e-mail it to you.

I can see how a Dropbox-like P2P filesharing system could be beneficial for groups, but how would a P2P network benefit individuals trying to safely and securely archive personal documents?

Encrypt the documents. Put them on a distributed hashtable, using something like Reed-Solomon codes to distribute them over the network, so you're only taking about twice as much total storage as the document itself but you're very resilient to nodes disappearing. Maybe add some kind of resource exchange system so you pay for the storage by participating in the network.

Google the projects I mentioned and you'll find a lot more detail on these sorts of ideas. Both projects intended to build permanent, uncensorable storage at global scale. It's not a totally solved problem yet but there's been a lot of interesting engineering.
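To make the "about twice as much total storage" point concrete, here is a toy erasure code in the Reed-Solomon style. It is a sketch under stated assumptions, not how OceanStore or Free Haven do it: it works over the prime field GF(257) instead of the GF(2^8) that real Reed-Solomon implementations use, so parity symbols can be 256 and are kept as Python ints rather than bytes. The names `encode` and `decode` are made up for illustration, and the encryption and DHT placement steps are left out.

```python
P = 257  # smallest prime above 255, so every byte value is a field element


def _interpolate(points, x):
    """Evaluate at x the unique polynomial through the (xi, yi) points, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data: bytes, k: int, n: int):
    """Return n shards (lists of ints); shards 0..k-1 hold the data itself."""
    padded = data + b"\0" * (-len(data) % k)
    shards = [[] for _ in range(n)]
    for off in range(0, len(padded), k):
        block = padded[off:off + k]
        points = list(enumerate(block))  # byte i is the polynomial's value at x=i
        for x in range(n):
            shards[x].append(block[x] if x < k else _interpolate(points, x))
    return shards


def decode(available: dict, k: int, length: int) -> bytes:
    """Rebuild the data from any k shards; available maps shard index -> shard."""
    chosen = sorted(available.items())[:k]
    if len(chosen) < k:
        raise ValueError("need at least k shards")
    out = bytearray()
    for pos in range(len(chosen[0][1])):
        points = [(x, shard[pos]) for x, shard in chosen]
        out.extend(_interpolate(points, x) for x in range(k))
    return bytes(out[:length])
```

With `k=4, n=8` the shards add up to roughly twice the document, and any four of the eight nodes can disappear without losing it. Plain replication at the same 2x cost only survives the loss of one copy of each piece.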

If the data is important don't entrust it to someone else. It doesn't matter if it is Google Groups, Facebook, or Yahoo mail.

While that is a good prescription for individuals, what the OP is getting at is that trust is a part of a brand, and if you could measure it, deleting data is penny wise and pound foolish for these companies.

Also, don't trust Seagate, Western Digital, Maxtor, or any other hard drive manufacturer.

(the serious point - if it's important, don't have a single point of failure. It doesn't matter if it's hardware or cloudware)

Totally, 100% understand that people make that kind of assumption. It makes sense if you don't look at it closely.

However: free account. Expecting a free anything to store your anything indefinitely is expecting the sky to rain diamonds.

Expecting a warning about such actions, especially for a low activity account, is not unreasonable (though still: free). And > 1 year warning is about as good as you can hope for, and would be utterly wonderful. But companies don't plan to stop things that far ahead and just absorb expected loss for over a year, so it's unlikely to occur, in which case 6 months is very generous, and about a minimum for people's expectations.

"...is expecting the sky to rain diamonds."

Lovely analogy! :)

It does happen, though: http://geology.com/nasa/diamonds-in-space.shtml

...not that anyone should expect it. ;)

Awesome find, I'll have to keep that one handy for other similar analogies :D

> However: free account.

Indeed, you get what you pay for.

> Personally, I think deleting data like this--especially when the storage requirements are so tiny--is unforgivable.

I think the problem is that people haven't adjusted to the new ways of keeping data. They assume their free Yahoo account is theirs, the way their closet is theirs. You know, if you leave a box of pictures in your closet, it'll stay in your closet barring a catastrophe.

Online services are available to serve a certain set of goals and objectives. If those objectives change, your stuff might get thrown out. It's more like asking a hotel to let you keep a box of stuff behind the desk for a while than it is like your own closet. They might keep it for a really long time, but if the hotel changes management or remodels, your stuff could get tossed. Especially if you're not regularly staying at the hotel.

It's not even "you get what you pay for," because paid services shut down too. In this era of cheap storage, there's no excuse to not make multiple backups of key data. Ideally at least one physical copy of important things. Actually print the emails out, put them in a binder, and put it in your closet. Keep one copy at Yahoo, one at another online backup service, one on your PC, and one on an external harddrive. That sounds like a lot, but it's not. Once you get a sensible backup policy and figure out what's important, it doesn't take very long to do backups.

It's a sad story you shared, but not unforgivable. Someone's keeping a box of your stuff behind the desk at a hotel. You can't count on that box to stay there. Now, the problem is people haven't realized that yet. But slowly, it'll become widely known and understood culturally. If it's important enough, keep multiple online and offline backups. Don't count on anyone to hold on to your stuff for you, because if management changes or the place gets remodeled, it might not be there any longer.

"especially when the storage requirements are so tiny"

Hardware is not the only contributor to the cost of keeping data available on a web site. There is the electricity cost, the server administration cost, the cost of keeping up with updates, and the legal cost of dealing with requests to put data on your servers. Finally, there is opportunity cost -- the same team that manages these Groups features could be doing something more useful. That's why Google is not maintaining unsuccessful applications.

There is another reason to forget old data: reducing noise. Newer data is usually more useful than old data, so forgetting old data reduces noise and makes it easier to find more useful information.

I hope your friend managed to get her data back. That makes my blood boil.

Do you believe it's reasonable to expect an account to not be deleted after 14 months of inactivity? That's not a short amount of time.

Obviously, she didn't. It's important to set the expectations clearly up front with the client.

I lost my old Yahoo account too. I went to look at it one day for the nostalgia factor and it was gone. Not a big loss, but it's always fun to look back at old chats and emails and see how much you have changed over time.

Particularly for a free service. It's not reasonable to expect Yahoo to archive everyone's email forever. If you want something with more guarantees, you should probably expect to pay something for it.

OK, you are not getting his point. You don't have to explain it to us, all of us here are aware of it.

But if there are users who don't understand it, then it means that they didn't deliver the message in a proper way. And obviously there are users who didn't get it (I know of several cases of erased email accounts).

I use a Google group for a summer sport. Nobody logs in from August until May.

Would have missed it if not for this notice.

"Beware of the Leopard."

I needed Google to decipher this excellent comment. In case it helps:

"Yes," said Arthur, "yes I did. It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying 'Beware of the Leopard'."

That's the display department.

Hitchhiker's with allusions to Zork. Let's geek...

"But Mr Dent, the plans have been available in the local planning office for the last nine months."

"Oh yes, well as soon as I heard I went straight round to see them, yesterday afternoon. You hadn't exactly gone out of your way to call attention to them, had you? I mean, like actually telling anybody or anything."

"But the plans were on display ..."

"On display? I eventually had to go down to the cellar to find them."

"That's the display department."

"With a flashlight."

"Ah, well the lights had probably gone."

"So had the stairs."

"But look, you found the notice didn't you?"

"Yes," said Arthur, "yes I did. It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying 'Beware of the Leopard'."

I look after a few google groups but don't recall getting any notice on this.

I was just looking at the settings for a group I set up recently and saw that there was an option to allow Google to contact the group admin. And it was off. I think I just do that out of habit on the assumption that it will just mean more noise, but of course it could also mean I'll never hear about upcoming group changes.

Update: I got a notice yesterday.

You only get told if you create a page - what if you only read it?

The notice shows up prominently at the top of any Groups page that would be affected. It's pretty hard to miss.

The author tried a cheap trick to attract an audience by using a shock-value title. But the truth is not as bad as that title suggests: the most important thing about Google Groups, the discussions, is not going anywhere; only the "Files" and "Pages" features are going to be trashed.

Which is a big deal if you use these features. Plus, I run many groups and didn't know they were disabling these features, so they are not telling everyone about it.

The title is true. Some Data from Google Groups is going to be Destroyed. I don't think anyone would have read it to mean that all Google Groups Data will be destroyed. Some Files and Pages will certainly be valuable; it's not like "Google Groups Boring Data to be Destroyed."

> I don't think anyone would have read it to mean that all Google Groups Data will be destroyed.

Actually, that's exactly how I interpreted the headline.

Can we turn down this 'author-is-using-a-cheap-trick-to-attract-audience' volume around here? I don't think Bruce Eckel needs an audience so badly that he would do it on purpose. Granted, the title is not reflective of the content, but give the authors some benefit of the doubt. Let's do a quick Google search on authors before we write stuff like this.

You mean evaluate people based on who they are versus what they say? No thanks.

I interpreted the title to mean Google Groups was going away. I was not aware that there was a separate "data" concept.

That sounds like your lack of familiarity with the software and not the author's showmanship.

If the title read: "Google Groups Shutting Down!" and they meant shutting down the pages and files part, then sure, I'd see your argument. But literally all data in Google Groups will be destroyed. I actually thought the title meant his data was destroyed on his Google Group, much less all of them were being wiped. I think it is pretty big news.

> You mean evaluate people based on who they are versus what they say? No thanks.

I don't know. I usually find that knowing who people are often provides some context to what they are saying.

TFA has "data" in the title. You weren't aware of what it meant and jumped to a conclusion. I don't see how it follows that the author is being sensationalistic.

So you are saying basically, we should believe [some notable author] over our lying eyes?

For those who didn't use the features, here is what it looks like in one example:

  (Discussions 1090 messages)
  (Pages 1 page)
  (Files 1 file)

Google announced: "we have decided to stop supporting the pages and files features." "Starting in November 2010, Groups will no longer allow the creation or editing of files and pages" "In February 2011, we will turn off the pages and files features, and you will no longer be able to access that content."

I read this article this morning and thought, "Oh they're removing some feature I don't even know exists.", but didn't want to comment on it worrying that maybe I really do use it and it is useful.

Then after reading your comment, I went to every Group I am subscribed to through Google Groups and couldn't find a single one that had a significant number of pages or files. The only group that did seem to use the feature is an old users group that has a very outdated iCal calendar and a group hackfest project from 2006 in a tarfile.

Maybe I'm missing the point here, but why should Google be forced to maintain and update a feature of one of their products that from my observation, no one uses? If anyone is a member of a group that does use this feature heavily, what kind of files and pages do you have tied to your group, and couldn't those be simply moved to a sites.google.com address?

If Hacker News hadn't removed the exclamation mark from the end of the title, the shock value could have been even bigger :)

I am wondering why they don't just integrate groups features into Google Apps (where you could use Sites and Docs to do this same thing). It leads me to believe (knowing nothing about the reality) that Groups was built on some different underlying technology stack that makes it more difficult to integrate... I had a JotSpot account that eventually became a Google Sites site, and the main thing that didn't migrate over neatly was the discussion forums.

Also noticed this message on there: "Google Groups will no longer be supporting the Welcome Message feature. Starting January 13, you won't be able to edit your welcome messages, but you will still be able to view and download the existing content." Would it kill them to tell us why?

I'm guessing that the removal of Files, Pages, and Welcome Messages are all for a single reason: spam. Spammers love Google Groups, because it'll host their landing pages (complete with graphics!) on a trusted domain. I'd be willing to bet that someone at Google got fed up playing whack-a-mole with Viagra ads.

I remember when google groups was still called 'dejanews' and I used it just about every day.

Whatever I wanted to know, dejanews seemed to have the answer buried in its pages somewhere.

Then Google took over and promised all kinds of great things would happen to the content. Only it never did. Google Groups was the ugly stepchild in the Google family, first dropped from the Google homepage links, then completely neglected and left to rot.

In a way it is surprising that it took this long for Google to make a decision on this, but it is a total waste of all the content in there.

Are there any attempts to archive this stuff? Is there a way to get the data in bulk?

I've seen firsthand how important old data can be to people, and I suspect this is no different. The reocities.com project has been running for a bit under a year now, and not a day goes by that I don't get notes from people who are insanely happy that their content was not lost.

A bigger point Bruce makes is the issue of confidence in Google. Groups is consumer oriented, but as developers we've seen Google turn on a dime, change its mind, and leave developers who've bought in to their technology, out in the cold. Bruce mentions Wave... no follow through from Google. Anyone who pitched a Wave based service in their own companies or business plan is fired, but the Goog keeps rolling on unperturbed. Microsoft, whatever else you might say, never hung developers out to dry like that.

Who's ready to build a full HTML 5 site now, and ask IE users to install Google Chrome Frame? Ready to bet your company on it? Six months from now, Google could get bored with it. Yawn. Not working out, fellas. But hey, it's open source, so you guys keep working on it, ok?

Is this the start of the cloud computing dystopia that Stallman warned us about?

This is a perfect example of one of the Freedoms we don't have with web applications. Even with Microsoft Office you have the freedom to keep using the software today in the way you did yesterday.

Even open source web applications don't have this freedom... If wordpress.org were to do this, those users would be in the same boat.

I've been working on an open source infrastructure project to fix these issues for a while, but it's still pretty nascent. My "manifesto" on that is here: http://bit.ly/forkolator

And for anyone who actually cares about proper linking: https://docs.google.com/Doc?docid=0AcmB_WI1jRkCZG41c2d4cl80O...

Hahaha, I'm dying from the irony! bit.ly etc. are pure evil, and are going to have a much more negative impact when they're gone than Google groups files.

Personally it's more of a "I want to see what website I'm about to open in my browser" ideology than pithy moral meanings. But sure!

It's not really about morality. When bitly decide the ad revenue and premium plans don't cover the hosting any more, massive chunks of web infrastructure will fail. When was the last time you followed a link to a running website and it failed? A few network issues here and there, yes, but refresh and you're fine. When bitly bite the dust, that link is gone forever and you'll never know where it went. In thirty years I can look at a direct link and know where it goes, and if the endpoint is still live I can go there. Not so with bitly links (and don't even start on the smaller shorteners. They'll probably fold within the next 2 years.)

I think the consequences are going to be more severe for dead links. Today I deal with permanently dead links pretty much every day. With the canonical URI, there is some hope there's a mirror in archive.org or the Google cache. If not, I can search for [parts of] the URI and hope someone mirrored the page, or has quotations from it. bit.ly URIs are a black hole.

At least on TinyURL, you can preview links (prefix the links like: "preview.tinyurl.."). But you are correct about sites going down. If bit.ly or TinyURL goes down, so do the links.

Not to mention Google Docs...

lol. That is pretty funny.

>Even open source web applications don't have this freedom... If wordpress.org were to do this, those users would be in the same boat.

It's pretty straightforward to self-host your own WordPress blog. Does WordPress not offer data dumps? WordPress users would be in a very different boat, because they have all the tools they need to rehost their blog, without building anything from scratch.

Does this mean the end of the DejaNews Usenet archive as well? The transformation of that unbelievably helpful resource into "Google Groups" was a crime.

Geez, but I miss Usenet. Having _one_ forum on which everything related to a topic was posted was unbelievably helpful.

They intend to remove only files and pages, which is something like a primitive wiki for your group. This feature was never widely used anyway and there are much better tools for the job (like Google Sites).

BTW the Usenet still exists, even if it's not that active anymore.

> BTW the Usenet still exists, even if it's not that active anymore.

GGroups has been the cause of most spam on Usenet for years. It's also sad because some of the people who aren't regular spammers think that when they post to a newsgroup via GGroups, they're only posting to that one server, and not on hundreds (if not thousands) of servers. Others think that GGroups and Usenet are the same, but not as many as the "un-spammers".

It's too bad that this is going to happen, but... I guess that's Google for you. I wish that Google would instead try to prevent some of the spam on the Usenet, and not mess about with GGroups. :(

Hmm, I see Google is not making a big fuss about it now, but saying that Google is obfuscating the announcement and that it destroys trust and all that is a false assumption until it actually happens.

I'd expect email notifications and notifications inside the app to let me know the feature is being removed and how to move my data some time before the actual feature removal happens. I can't say Google has burned me until they do actually not inform me and 'destroy' my data. I think it's too early to say "Geez Google you're evil now!", and for the most part, this could be a small "Hey FYI, we're removing those features... we might not even be sure about it so we're not doing a big deal out of it yet."

If this article was written 2 weeks before data was removed I'd understand, but as it is it comes out somewhat sensationalistic.

I had moved several student orgs at my university to private Google Groups because of the pages and files feature. Now that this feature is being killed they will be screwed, since they were using it as the repository for their meeting minutes and other important documents.

They can just download the data and stick it somewhere else. It's inconvenient, but nobody is "screwed."

The author of the article was disappointed that the service was being discontinued, but his main reason for writing seemed to be calling attention to how quietly Google announced that it was destroying data. People won't download the data if they don't know it's going away.

That is clearly not the case for the commenter whom I was addressing. He knows, so if the files he is worried about are deleted, it's due to his inaction. I don't want to see anyone just throw up their hands and accept victim status when there's a better option.

As for other people, I thought Google added a notice to the Groups files page along the lines of "Hey, dude, you might want to download this stuff before it goes bye-bye". I don't use the service, but that's what I heard. Was I misinformed?

EDIT: Yup, it's there. The notice, in bright red at the very top of the page, reads: "Google Groups will no longer be supporting the Pages and Files features. Starting January 13, you won't be able to upload new content, but you will still be able to view and download existing content. See this announcement for more information and other options for storing your content."

I mean not so much my inaction as I am no longer running those groups :-p

Actually, what I was concerned about wasn't the disappearance of the data, but the disappearance of the feature. I chose Groups because those features made it really easy to manage the entire group. I worked hard to set up a process around how the Google groups worked. So the sense in which the organization is "screwed" is that they will have to completely change the way they function internally and find a new centralized portal for managing their stuff.

As far as the warning goes they gave plenty of warning. I am just disappointed the feature disappeared.

Google Docs and Sites are much better applications for managing this kind of content. Docs allows you to collaboratively edit documents in real-time with multiple participants and Sites can be a decent CMS or wiki. Just try both and you won't miss the old features that are being removed from Groups.

The problem is that the users of my Google group aren't necessarily tech savvy. Now they'll have to remember that files are at one url and messages are at another. I'm pretty sure I'll have to repeatedly remind them of this.

Besides, who's to say Google won't pull the plug on those other services in a year? It seems unlikely, but I also thought it was unlikely they'd remove those features from Groups.

The description of the group could simply point them to those URLs. The link would nearly be in the same place as it is now, and many un-tech-savvy people wouldn't even notice anything changed. In fact, I'd bet many would think that the feature got "Upgraded"

That happened to me with my blog when they removed the FTP feature. I didn't want to convert automatically to a blogspot blog because I didn't have the time to deal with the privacy issues. Fortunately, I was able to download the data, and I still have a web page for it that I could parse to convert it to another format, but in the meantime I'm without a blog.

I still haven't decided which format to convert my blog to, but I'm very tempted to convert it to an emacs org-mode format and then decide later.

On that notice page, Google suggests using "Google Sites" for pages and attaching files to pages on the site. If it is straightforward for someone to do it, I find it strange that they don't do it automatically and tell people "we'll be migrating your data to Google Sites" instead of taking it down.

This is a good thing as far as I'm concerned. I don't think society is ready yet for the consequences of an internet that never forgets. I wish more websites would periodically delete content (with adequate warning).

I actually find a lot of good content for debugging purposes in google groups

There's a big red warning at the top of the welcome page (web version). Impossible to miss, although I know many people only use Google Groups via email.

Google Groups is one of the worst Google web apps (from a programming standpoint). You cannot add code snippets easily, the search is horrible and you cannot even format code/text easily like in http://code.google.com.

Have they seen stackoverflow.com?
