Microsoft OneDrive for Business modifies files as it syncs (myce.com)
229 points by ingve on Apr 21, 2014 | 119 comments

SharePoint has always been a piece of garbage. Unfortunately it's a money making piece of garbage, so there you go.

Inside MS, SharePoint is often used to "track" project documents. Start a project and -poof- your most unfavorite PM has created a procrustean bed of document folders, all set for you to lose your documents in because none of the categories match anything in the actual product. Like, having whole separate doc folders for Beta 1 and Beta 2 (there's going to be a second beta, and the docs are going to be cloned into those? Really?)

PMs: "Please add your documentation to these folders."

Devs: "When we do that, we lose control of the documents, we can't get at the history, we can't search them, we can't even find stuff in there, and SharePoint is slow and the permissions are always wrong, and a year after the project ships the SharePoint will be destroyed and we will lose all of the documentation." [All of this is true, especially the bit about not very old project documentation completely going away, OMFG].

PMs: "We don't care."

Devs: [check documents into the source tree anyway, and write a mirroring script to copy the things to SharePoint]

PMs: "Stop that."

Devs: "We don't care."

The right answer is, of course, to fire the damned PMs who serially insist on a crappy excuse for a version control system despite everyone else pushing back and saying that it sucked hard. Only saw that happen a couple of times.

I wonder if SharePoint has a future in Microsoft's strategy re-alignment.

SharePoint (evolved as Office Server) is a beast of software and many awkward engineering & design choices from the 2003 era are still visible to the end user. And I am not even touching the XHTML table-based layout with thousands of CSS rules, Silverlight & ActiveX controls, bad WebDAV support, "SharePoint Groups", low soft & hard limits for file size, file count in folders, etc.

It does of course also have its good points for companies. Like the Office integration, the Office ribbon UI and some document management options that are missing from Windows Explorer thanks to the failed attempt to release WinFS in the Longhorn era.

Microsoft also discontinued InfoPath, the WYSIWYG form designer that is part of Office 2003-2013 and the InfoPath form services hosted on top of SharePoint: http://en.wikipedia.org/wiki/Microsoft_InfoPath

Will Microsoft rewrite (or refactor) SharePoint and its form services from the ground up? Or will SharePoint 2015(?) be a rehashed v2013? And will Windows 9 come with an improved Explorer and NTFS/ReFS with better document management capabilities like those that were planned for WinFS or "Microsoft Semantic Engine"?

I suspect the curse of backwards compatibility might cause a problem - there are so many systems that have dependencies on SharePoint (including a lot of Microsoft's own products). I found this page the other day which talks about the non-trivial process of picking the right API for talking to SharePoint:


The diagram trying to explain how the different APIs relate is a pretty good visualization of the problems that anyone trying to refactor SharePoint would face.

It is widely known that 2013 will be the last version of SharePoint. Everything will be an incremental upgrade from now on with customisations decoupled from SharePoint through 'Apps' working off web services.

At SharePoint Conference 2014 the product team specifically announced that there will be an on-premises SharePoint vNext, released in the summer of 2015, presumably to be named SharePoint 2016.

Apps are being pushed hard for all customizations so as to decouple code from the "core" of SharePoint so as to allow for a better upgrade strategy.

That being said, I do see Yammer as replacing a lot of functionality within SharePoint over the next 2-3 years, but I believe SharePoint as a product (especially as it relates to document management and search) will continue to exist for at least another 4-5 years.

>> Will Microsoft rewrite (or refactor) SharePoint and its form services from the ground up? Or will SharePoint 2015(?) be a rehashed v2013? And will Windows 9 come with an improved Explorer and NTFS/ReFS with better document management capabilities like those that were planned for WinFS or "Microsoft Semantic Engine"?

It would be better to build something on top of Outlook and Onenote. New file systems will only be useful if the metadata survives email which would require email integration anyway.

The first version of Sharepoint was a portal server built on Outlook/Exchange. http://en.wikipedia.org/wiki/Sharepoint#History

Another product, team pages, evolved from the FrontPage server extensions. Both products merged into what is known as Sharepoint 2003.

Most of the functionality is still in the Outlook client. You can also sync Sharepoint sites with Outlook and access your document libraries offline.

You'll be pleased to know that the work item side of Team Foundation appears to be based on SharePoint and they're heavily marketing that at the moment.

Kill me now.

I work in a drawing office at an engineering firm that uses Sharepoint. I wish I could even get anyone to acknowledge that there are such things as versions of files. Our backups are destroyed after 72 hours "to save space", emails are copied manually from Outlook to "emails sent" and "emails received" folders in Sharepoint.

The cluelessness and lack of interest from either the IT dept. or anyone really drives me crazy. I just get shoulder shrugs from colleagues when I tell them magical tales from the distant lands of version control.

There may be a case for pushing hard but I'm leaving in September.

Sorry to reply to myself but I have worked in a couple of places now as a CAD operator rather than a Devops. I am appalled at how terrible the IT depts. have been at getting involved at adding value to the business. They seem quite happy to sit back and wait to be asked to do things the managers have read about somewhere.

At the current place the only time I even see anyone from IT is when something needs fixing. I can see multiple vectors for real process improvement.

And I think this will be a feature of many IT depts. If you want to make a positive contribution and be known for making a difference rather than being regarded as some sort of janitor then you need to get out of your chair and, as Taiichi Ohno says: walk the gemba. The gemba is the shop floor where things happen and the place where making a difference is something your customers are willing to pay for. Don't be an IT anchor, ramping up equipment costs to stay in place.

Ha you haven't seen the piece of shit that is HP/CoCreate WorkManager then. Not joking but sharepoint is Jesus' sandals compared to that turd. SharePoint is also at very least 20 times cheaper because you don't need a full $1m stack HPUX pile and Oracle to run it.

It only exists still due to cash back handers.

To an experienced software developer, any document management system is going to feel like a poor excuse for a source code control system. But many businesses live and breathe by their documents and workflows the way development teams do their source code.

It sounds like most of your complaints are about the way the processes were managed rather than SharePoint itself.

Both. SP sucked, and management's use of it also sucked.

If you think Sharepoint is garbage today, you should have tried it in '08-'09. I was at a 100% Linux site where management insisted on everyone using Sharepoint and scrapping our old wiki and source control systems. The rationale seemed to be entirely because Unix admins cost more. We couldn't even browse Sharepoint from our own machines since it was so broken in non-MS browsers that they had to set up VNC to a Windows box. Most people couldn't install Wine either because IT refused to give local root to anyone. The only way to install your own tools as local root was to violate every security principle by bringing your own box and swearing off any kind of support.

I notice that management in most companies has a bad habit of believing IT will put up with awful tools just because admin staff do.

> a procrustean bed of document folders, all set for you to lose your documents

"I resemble that remark". Shudder.

Such fun, a web implementation of a 1980s folder system, with no possible way of doing an "ls -lr". One gets to click every node in a three or four deep hierarchy to look for a document. Shudder, erase from memory.

To be fair, it doesn't lose your documents, it just makes sure nobody can ever find them again...

When we do that, we lose control of the documents, we can't get at the history, we can't search them, we can't even find stuff in there,

Except SharePoint does store history and you can search it. And no need to create a second Beta1 and Beta2 folder when you can just mark documents as release = Beta1 or release = Beta2.

Except that search was always borked, for various reasons (one really special reason was that search was utterly global, and didn't obey the access rules that were so vigorously enforced elsewhere, exposing details of your Super Secret Project to the rest of the company). So search was turned off, and good luck finding anything. Sheesh.

And getting history out was, to put it mildly, a pain in the rear. Something I could do in five seconds became a nightmare of bad web UI.

Anyway, the devs in my group had a rule of keeping docs in the source depots, and we never lost anything. Poor other groups, we saw them lose important design documents when IT decided that the two year old SP wasn't being used anymore and got recycled. Wow.

Maybe SP does all that whizzy stuff. I just saw it being unutterably stupid, slow and unreliable in practice.

I can relate to your experience.

I have seen SharePoint installations at big corporations and they are horrible.

In one case there was a central SharePoint group that has commissioned the sites. So we had a highly restricted, utterly gutted and badly corporate styled thing that was practically useless.

Add to that that there was a weird mix of public and enterprise editions (which cost a bunch of money and were frowned upon by the org unit).

In this environment you can sit back and watch productivity and common sense being choked to death. Enterprise is a weird place.

Not sure what version of SharePoint you were using, but recent versions most definitely obey access rules - if you don't have access to a document, you don't see it in the search results.

You will find that SharePoint 2013 with FAST has already fixed those problems - search is really powerful and fully customizable.

No, SharePoint does not store history. Get a file above a certain size, and that magically just stops working.

Every time I've seen SharePoint used, everyone described it the same way. It's where documents go to die.

So far at the current gig a SharePoint "upgrade" has meant that all our fun use of the old install is useless. All of a sudden wikis look more useful.

Burn me once and all that. Oh and search, yeah that search is about as useful as altavista was back in the day.

Sharepoint has every possible feature, in the sense it exists on a product information sheet. But every possible feature is implemented in such a terrible way that actually using it is more trouble than just about any alternative, including the "rename file to include version number and then email it around" method.

Will someone please kill sharepoint already?

Last place I worked we were using Sharepoint as a CMS for two large e-commerce sites so that the PM's could make changes when they wanted to.

Let that sink in for a moment.

>Except SharePoint does store history and you can...

It does until it doesn't. Just like "Plays for Sure(TM)," also from MS, worked until it didn't.

> Just like "Plays for Sure(TM)," also from MS, worked until it didn't.

Are you referring to unintended technical defects or DRM?

I think he's referring to when they made sure that "plays for sure" drm license/authentication/whatever servers got shut down so all the music you bought became a bunch of bits in a bucket.

Yes, the shutdown is what I was referring to, as you correctly inferred. Technically it may have all worked but when they pulled the plug on the servers, the value of any previous marketing assurances became clear.

Please feel free to teach that to all of the people forced to use SharePoint, who mostly don't know about that. And SharePoint's obtuse interface does not make that particularly easy to discover.

If it's a tool used for your job I'd suggest they spend 30 minutes to learn it. Would you allow a dev to just start using Git with no training at your company? Would you cut the dev any slack if he complained about Git without bothering to learn how to use it first?

> If it's a tool used for your job I'd suggest they spent 30 minutes to learn it.

By all means. How do they know when they're done learning it? It's by definition difficult to tell the difference between missing functionality and hard-to-discover functionality.

Your comment is witty, but I have absolutely no idea what it has to do with OneDrive modifying files. It's not even about Microsoft or the effects of any of their choices or policies - it's about stupid managers.

> ... I have absolutely no idea what it has to do with OneDrive modifying files

The article is about OneDrive for Business, which is nothing more than a SharePoint Document Library in the user's MySite. And that is probably also the cause of the behaviour that the author is describing: there's probably some workflow or other weird SharePoint feature at work, that was installed or activated unknowingly.

Well, that. Also about how bad SharePoint is, and the kind of people who think that it's okay, when it's not. I wouldn't put my data there because the types of things you run into on SP are endemic and fractally bad.

"I wouldn't put my data there because the types of things you run into on SP are endemic and fractally bad." Yes, there is nothing good about SP. But have you tried version controlling office files? That's easy right, just save as office xml and push. Oh wait, what is all this? User state in the xml file? Currently active spread sheet tab? Currently selected cell? Nightmare!

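As a rough illustration of the view-state problem, here is a minimal Python sketch that drops cursor and tab state from a worksheet's XML before it gets diffed or committed. It assumes the volatile state sits in a `<sheetViews>` element directly under `<worksheet>`, as in the made-up sample; `strip_view_state`, `SAVE_1` and `SAVE_2` are hypothetical names, and a real .xlsx would first have to be unzipped into its parts.

```python
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by worksheet parts inside an .xlsx file.
NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def strip_view_state(sheet_xml: str) -> str:
    """Return worksheet XML with any top-level <sheetViews> element removed.

    <sheetViews> carries UI state such as the selected tab and active cell,
    which can change even when no cell data changed.
    """
    ET.register_namespace("", NS)  # keep the output free of ns0: prefixes
    root = ET.fromstring(sheet_xml)
    for views in root.findall(f"{{{NS}}}sheetViews"):
        root.remove(views)
    return ET.tostring(root, encoding="unicode")


# Two hypothetical saves of the same sheet: only the cursor position differs.
SAVE_1 = (
    f'<worksheet xmlns="{NS}">'
    '<sheetViews><sheetView tabSelected="1" workbookViewId="0">'
    '<selection activeCell="A1" sqref="A1"/></sheetView></sheetViews>'
    '<sheetData><row r="1"><c r="A1"><v>42</v></c></row></sheetData>'
    "</worksheet>"
)
SAVE_2 = SAVE_1.replace('"A1" sqref="A1"', '"C7" sqref="C7"')

if __name__ == "__main__":
    print(SAVE_1 == SAVE_2)                                      # raw saves differ
    print(strip_view_state(SAVE_1) == strip_view_state(SAVE_2))  # stripped saves match
```

With the view state removed, the two saves compare equal, so a diff would only show real cell changes. Other volatile parts of an Office package are not handled here.
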
My previous employer was switching from just using shared folders and manual versioning (which worked, was accessible, and we had great search tools for it) to sharepoint. Despite almost the entire development floor complaining and objecting to management, there was no changing their minds. My guess is that some higher level exec had a great lunch with some MS salesman.

The result was that next week, people were setting up ad-hoc fileservers and documentation systems to avoid using sharepoint, with half-baked sync scripts to placate management.

I am sorry you had a bad experience with your PMs, but what does SharePoint have to do with creating categories that don't match anything in the product, or with deleting databases prematurely? Search in internal databases is a very hard problem because of lack of metadata, and in my experience it is not solved any better in source control systems than it is in SharePoint.

And, I have personally found SkyDrive Pro, or OneDrive for Business, or Groove, or SharePoint, or whatever MS wants to call it, very convenient and reliable compared to Google Drive (for which the desktop app tends to quit every hour and leave things un-synced, and which has far worse web viewers).

Anyone who seriously tries to make a company adopt Sharepoint should be fired on the spot, be it the IT guy, the CIO, the CEO or the chairman of the board. There is no possible excuse for that kind of stupidity.

I've got some experience in the field as a consultant for a similar product called Alfresco, and I can see why it is such a big success. If there is anything that is much better, I haven't seen it yet.

So: what collaboration tool would you suggest for the average corporate drone to use? And no, git is not an acceptable answer.

>> So: what collaboration tool would you suggest for the average corporate drone to use? And no, git is not an acceptable answer.

Email + shared folder (as flat as possible) with rigorously enforced naming conventions. Clever features like search can be added with crawlers.

For me that would suffice - but it's not the answer, I'm afraid. Management/IT wants to:

- have workflows, so documents can be approved among other stuff

- have document lifecycle, so they expire and must be reviewed after time x

- solve the Dropbox problem - how do you collaborate with external entities without compromising security by having users uncontrollably put stuff on Dropbox, because there is no other way?

- have automatic versioning

- be able to checkout/lock a file during editing

These are just a few aspects. You can do most of this by gluing stuff together yourself, but these are the constraints:

- be able to achieve all this with semi-competent IT staff

- seamless integration into the existing Windows infrastructure is a big plus

- be able to blame someone else if something goes wrong

So - what do you propose?

It is interesting how all these problems come down to basic CS concepts (cache invalidation, naming things, locks).

- Versions

"Doc-456 V-023" "Doc-456 V-024"

To maintain an archive files become read only after 24 hours.

- approvals

"Doc-456 V-024 Joe Bloggs ✔"

The problem with the semi-competent IT staff is that they are not immune to Microsoft's BS^H^H PR. They will really believe what the account manager told the CIO over dinner.
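
The naming scheme above ("Doc-456 V-024", an approver name plus a tick, read-only after 24 hours) can be sketched in a few lines of Python. This is only an illustration under assumptions: the exact pattern, the helper names (`parse_name`, `next_version_name`, `freeze_old_files`) and the choice to ignore file extensions are mine, not part of the proposal.

```python
import os
import re
import stat
import time
from typing import Optional

# Assumed pattern for names like "Doc-456 V-024" or "Doc-456 V-024 Joe Bloggs ✔".
NAME_RE = re.compile(r"^Doc-(?P<doc>\d+) V-(?P<ver>\d+)(?: (?P<approver>.+) ✔)?$")


def parse_name(name: str) -> Optional[dict]:
    """Split a conforming name into document number, version and approver."""
    m = NAME_RE.match(name)
    if m is None:
        return None  # name does not follow the convention
    return {"doc": int(m["doc"]), "version": int(m["ver"]), "approver": m["approver"]}


def next_version_name(existing: list, doc: int) -> str:
    """Build the name for the next version of a document."""
    versions = [p["version"] for p in map(parse_name, existing) if p and p["doc"] == doc]
    return f"Doc-{doc} V-{max(versions, default=0) + 1:03d}"


def freeze_old_files(folder: str, max_age_s: float = 24 * 3600, now=None) -> list:
    """Remove write permission from files last modified more than max_age_s ago."""
    now = time.time() if now is None else now
    frozen = []
    for entry in os.scandir(folder):
        if entry.is_file() and now - entry.stat().st_mtime > max_age_s:
            os.chmod(entry.path, stat.S_IREAD | stat.S_IRGRP | stat.S_IROTH)
            frozen.append(entry.name)
    return frozen


if __name__ == "__main__":
    names = ["Doc-456 V-023", "Doc-456 V-024 Joe Bloggs ✔"]
    print(parse_name(names[1]))
    print(next_version_name(names, 456))
```

Run from a scheduled task, `freeze_old_files` would give the archive behaviour described; enforcing the convention itself still depends on people (or a crawler) rejecting names that `parse_name` cannot read.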

I hear good things about Alfresco. I have a lot of experience with Plone, which is also a very good foundation for a corporate intranet. I've deployed it countless times in this scenario, as well as the back-end of public-facing websites.

It took me a while to realise it but yes you are 100% right.

A few years ago I played with SharePoint for a bit and found some good use cases for it. Decided I'd throw some code together that I could wire into it and sell. There was a demand and SharePoint object model was pretty spot on.

4 days later I hadn't even got a bearable working SharePoint installation for doing dev against, had to throw another 4gb of ram at the machine I was working on and wanted to kick the shit out of the thing due to the recursive layers of batshit.

14 days later I had learned python, written the entire thing from scratch using django and had gone live with a client who still uses it today.

I did a lot of development on Plone and Zope. Mind you, Plone is not the easiest CMS platform to work with or to program for (you will regret using it at some point in time, but it will pass), but it used to be a great starting point for an office intranet. I am not sure how well it fits that use case now, but it's one of my preferred CMSs.

Yeah, them stupid people over at Volvo, the Department of Defense, the marines, the army - all of them are clearly idiots! Everybody should just use git repos and use javascript webapps!

So your PM doesn't manage the docs? as in talk to you about what the appropriate output should be and then they collect and handle the organization of it?

There are a few PMs out there like that, but most I have worked with are super excited about jira or sharepoint, love agile (as in "look at all the charts, the customer will be so happy when I show these to him") and ask nothing of the team except to follow their guidelines and everything will be swell. Are all unmindful IT people working with MS and MS-centric customers these days, or do you have them on the other side of the fence too?

Your reply infuriates me. The PM, at a minimum, should be collecting the information needed to credit the appropriate team/ICs, circulate status of dependencies, and communicate upwards. Methodology(agile) and tools (SP/Jira) are supposed to be ways to think and optimize communications...

It really peeves me to see PMs get away with doing spreadsheet management and wear teflon suits...

To be fair, i've been on a team where the engineers passed the buck towards me also, so i had to push towards written follow ups and detailed documents but that was isolated from the entire team..

Framework filled resumes go straight to the trash or get tossed a cluster f* in an interview to see how they structure the problem and build out a path towards a more productive structure.. most have failed because they all respond with some rendition of 'call a meeting with the team and their supervisor'..

You are using a document management system to manage source code? There is your problem with this 'piece of garbage' right there.

No. That would be stupid.

IIRC OneDrive for Business is not your average cloud storage but built on top of SharePoint (unlike the normal OneDrive). Therefore this might be some kind of side effect due to the way SharePoint handles documents and document versioning. It is definitely unexpected behavior but this might be a reason explaining it without bad intent on Microsoft's side.

Microsoft employee here, not speaking for the company just my own perspective. I work in Azure, but not on this stuff specifically.

Sharepoint has its origins in managing collections of MS Office documents more so than HTML and browsers. It knows about certain document types and tries to do intelligent things with them. It's not necessarily the tool you would use for serving raw data over HTTP with arbitrary Content-Types. (Given the complex and varied rules by which browsers interpret content, I'm not actually sure how one could even do that perfectly securely short of enforcing separate second level domain names for each and every tenant.)

As an old-school software engineer, we used to say the biggest part of requirements analysis is setting customer expectations correctly. It seems fair to say that the renaming of Sharepoint to "OneDrive for Business" has surprised some folks where it behaves differently from plain OneDrive or from a raw BLOB store.

"It knows about certain document types and tries to do intelligent things with them."

There's your problem right there ....

I'm not saying this is impossible, but I think it would be an instructive exercise:

Please name three nontrivial, commercial, end-user facing apps or services that know nothing about any file or document types.

Only file synchronization tools (commercial or not) could ever fit that description, any other kind of tool must manipulate some kind of file.

The only problem is that we are talking exactly about a file synchronization tool. Thus the exercise isn't as valuable as it may appear at first.

I found this KB https://support.microsoft.com/kb/2903984 which doesn't mention the term 'file' anywhere (except to refer to the downloaded installer file). It uses the terms 'library' and 'content':

The stand-alone OneDrive for Business (formerly SkyDrive Pro) sync client lets users of Microsoft SharePoint 2013 and Microsoft SharePoint Online in Office 365 sync their personal OneDrive for Business (formerly SkyDrive Pro) document library or any SharePoint 2013 or Office 365 team site library to their local computer. This sync relationship provides access to important content both online and offline. The OneDrive for Business (formerly SkyDrive Pro) client can be installed side-by-side with previous versions of Office (such as Microsoft Office 2010 and Microsoft 2007 Office).

I tried the installer. The installation process is branded all over as being a feature of Office.

Is there something other than the substring 'Drive' that gives the expectation of a fully generic file synchronization tool?

>Is there something other than the substring 'Drive' that gives the expectation of a fully generic file synchronization tool?

I don't think anyone is expecting OneDrive to be "fully generic" (I believe that 'emeraldd was referring to the "tries to do intelligent things" portion of the sentence he quoted).

It just seems that people are unaware of the fact that putting certain kinds of documents into "document libraries" or "team site libraries" involves automatically adding metadata to those documents (from the OP's example, html comments and "xmlns:..." attributes were added to html files).

I think he has a good point. For example, DropBox has lots of special handling for image types. It has a special folder for sharing them, it has auto image upload from the places phones tend to store them, etc.. Google similarly has auto-image upload and enhancement, and they have a ton of special handling for office documents that can interconvert them. So DropBox and Google Drive are both examples of cases where special things are done by knowing about file types. I can't really think of any service that doesn't. Maybe rsync run by cron or Tarsnap, but only us techies ever use that. Even then, most compression schemes can recognize compressed files vs. uncompressed and not recompress them - like ZIP utils storing files vs. compressing them when building an archive.

"OneDrive for Business" is a branding disaster, as it creates a completely rational link in people's minds that it's the same as OneDrive. In reality, it's a completely different product that acts in different ways.


It's amazing how wrong companies can get marketing. Take Apple's AirDrop, for example. I spent 20 minutes one day wondering why my iPhone wasn't connecting to my Mac Mini. Turns out, "AirDrop" is two different things by the same company that are similar but completely incompatible.

Yeah, I hope they improve and fill the gaps in current AirDrop and AirPlay. It has come a long way, but there's still room for improvement.

Namely, AirPlay between OS X devices and iOS devices (including just audio and both audio+video), in either direction. Here's hoping for OS X 10.10.

Just like "surface" vs "surface pro"!

>Just like "surface" vs "surface pro"!

More correctly "Surface RT" vs "Surface Pro". "Surface" was initially Microsoft's interactive table [1], so yeah pretty much a branding disaster if we take a walk down the history lane.

[1] http://en.wikipedia.org/wiki/Microsoft_PixelSense#Microsoft_...

BTW: Surface / PixelSense was the most horrible purchase I / my university ever made. Bought a Samsung SUR40 in January 2012 for about 8000 €. Received it two months later. Another 3 months later Microsoft renamed it to PixelSense and stopped the development. No Windows 8, no Internet Explorer 11 Touch. We use it as a table.

Surface RT is now just Surface.

Actually "Surface RT" was the "simple" version given by the community, if you can believe that. Typically Microsoft, the official names were "Surface with Windows RT" and "Surface with Windows 8 Pro".

I think the idea was to have "one" device (even though there were actually two different devices, with very different hardware) with the name of Surface, but which came in two versions, one with Windows RT, and one with Windows 8 Pro. So they were telling people: "This is Surface with Windows RT...and this is Surface with Windows 8 Pro".

But yeah, a disaster. Their current names of Surface and Surface Pro may be simpler to use now, but it's actually more confusing for consumers, because this current naming implies Surface Pro is basically Surface, but with a few extra features. When in reality, they are very different. I think this confusion was meant on purpose, because they want people to believe that Windows RT is "just like regular Windows - but with fewer features". They are doing their customers a disservice by trying to trick them like this.

The whole RT thing sounds like a joke. Even on wikipedia there's a circular link collection on top of each of those articles, "WinRT, not to be confused with Windows RT, not to be confused with Windows Runtime, ..."

Btw, I just noticed that the old Surface is now known as PixelSense. Makes sense.

Um, no. Pixelsense is technology behind touch, gestures, and similar. Take a look at:


Yes, and something very similar used to be called Microsoft Surface (for example, see http://www.techradar.com/news/internet/web/digital-home/home...)

@pgeorgi - I was specifically responding to -- "Btw, I just noticed that the old Surface is now known as PixelSense. Makes sense."

MSFT (and other companies) have a very bad habit of circular/redundant/superfluous naming conventions.

In this case Pixelsense and the renamed Surface RT never had direct naming overlap.

With no knowledge of either product, that's a fairly sane assumption.

...and, indeed, these XML namespaces (MS Office and data types) and the CustomDocumentProperties tag are already known to be added by SharePoint, and apparently for well-established reasons (though not ones I understand not being a SP user): http://sharepoint.stackexchange.com/questions/30626/why-is-s...

It's absolutely inexcusable to modify .php source code files for this though!

Wonder if it would modify files in a git repository in the same way? Good luck recovering from that!

From my bad experience, Git repositories (and I would go as far as to say any source code) do not belong on SkyDrive/OneDrive. This is OneDrive, not the newfangled Hailstorm by architecture astronauts[0].

I would recommend that everyone keep the working copy of their source code outside of any form of syncing. If you must, create tape archives (or 7z or something) and sync them but never your working copy.

[0] http://www.joelonsoftware.com/items/2008/05/01.html

I've kept most of my Git repos in my OneDrive for the past few years. It's worked great.

As many others have said, OneDrive For Business is really more of a SharePoint + Groove document sharing / collaboration thing for business documents (as the name kind of implies). While it does similar things from a generic corporate user point of view, the mechanics are pretty different. OneDrive For Business works well for the same kinds of use cases that SharePoint does (mostly documents), but I wouldn't put source code in there.

Good to know one shouldn't treat the OneDrive as a... Drive.

OneDrive is a drive as you would expect. OneDrive for Business is really SharePoint. Terrible branding.

git reset? I can't imagine any way in which it would destructively modify the files under objects/

Why not? The article claimed it modified a .php file. So I'm curious to know if it would modify the equivalent file under .git/objects?

Because the equivalent file is either compressed with a custom header or in a totally-incomprehensible-to-it binary pack.

Let me put it this way: I would be scared to store an svn checkout because of how svn stores metadata copies. I wouldn't worry about a git checkout.
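
For anyone wondering what "compressed with a custom header" means, here is a small Python sketch of how Git stores a loose blob: the content is prefixed with "blob <size>" and a NUL byte, hashed, then zlib-compressed. It assumes the default SHA-1 object format; `git_blob` and the sample PHP string are hypothetical names for illustration.

```python
import hashlib
import zlib


def git_blob(content: bytes):
    """Return (object_id, stored_bytes) for a loose Git blob object."""
    # Header is the object type, a space, the byte length, then a NUL byte.
    store = b"blob " + str(len(content)).encode() + b"\0" + content
    # The object id is the SHA-1 of header + content (default object format).
    object_id = hashlib.sha1(store).hexdigest()
    # What lands under .git/objects/ is the zlib-compressed header + content.
    return object_id, zlib.compress(store)


php_source = b"<?php echo 'hello'; ?>\n"
oid, stored = git_blob(php_source)

if __name__ == "__main__":
    print(oid)
    print(stored[:8])  # compressed bytes, not recognisable markup
```

A tool that sniffs content for HTML or PHP markup would be looking at the compressed stream rather than the source text, which is the reason given above for not worrying about the object store. The checked-out working files are still plain text, so they are a separate question.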

To expand on that, OneDrive for Business (a product formerly known as: SkyDrive Pro, Live Mesh, Groove Workspace etc.) shares the OneDrive branding (and some? interface components) but otherwise bears little resemblance to its consumer counterpart.

IIRC the Office Org owns the SharePoint client while OneDrive proper is handled by the Windows Services team.

OD4B has nothing to do with Mesh.

livemesh was the best!

Yeah I can completely believe that this would be an unintended technical side effect, a consequence of trying to mitigate some other kind of difficulty that Microsoft's broad userbase might run into. Kind of reminds me of Excel's propensity to helpfully (and destructively) convert dates and zero-padded-numbers upon import.
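
The zero-padding case is easy to reproduce with any importer that guesses types. A toy Python sketch, not a model of what Excel actually does internally; `coerce` and `import_row` are made-up names:

```python
def coerce(cell: str):
    """Guess a type for a text cell the way an over-helpful importer might."""
    try:
        return int(cell)  # "00123" becomes 123: the padding is gone for good
    except ValueError:
        return cell       # anything that is not an integer stays text


def import_row(cells, helpful=True):
    """Import one row, with or without type guessing."""
    return [coerce(c) if helpful else c for c in cells]


if __name__ == "__main__":
    print(import_row(["00123", "SW1A"]))
    print(import_row(["00123", "SW1A"], helpful=False))
```

Once the value has been turned into a number there is no way to recover the original text, which is what makes this kind of helpfulness destructive.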

It's almost certainly "accidental" and a relic of some indexing or something that Sharepoint is doing to the documents. I'll bet it recognizes XML-ish content (as the article notes, images and plain text are ignored), tosses it in a validator or something similar that "corrects" the file, and saves that file internally. That's not too unusual in the CMS world. The bad part is that the internal version has found its way back out; hopefully in turning SharePoint into "cloud storage" they screwed up and sent the wrong thing. Otherwise, that's rather a mis-feature of SharePoint. If there's anyone here who actually knows it (I only generally do, being on the OSS side of the CMS world) I'd be interested to know.
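
To illustrate the guess above (and it is only a guess about SharePoint), here is a Python sketch showing that merely parsing XML-ish content and writing it back out is not byte-preserving, even when nothing is "added". The `normalize` function stands in for a hypothetical correcting step; the sample markup is made up.

```python
import hashlib
import xml.etree.ElementTree as ET


def sha256(data: bytes) -> str:
    """Hex digest used to compare files byte for byte."""
    return hashlib.sha256(data).hexdigest()


def normalize(markup: bytes) -> bytes:
    """Parse and re-serialize, as a hypothetical 'correcting' pipeline might."""
    return ET.tostring(ET.fromstring(markup))


# Valid input with single quotes and an extra space inside the start tag.
original = b"<page  title='Home'><br/></page>"
round_tripped = normalize(original)

if __name__ == "__main__":
    print(original)
    print(round_tripped)
    print(sha256(original) == sha256(round_tripped))
```

The parsed documents are equivalent, but the quoting and whitespace come back different, so any checksum or diff sees a modified file. If the stored internal version is what gets synced back, the user receives those changed bytes.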

So I'm ascribing this to incompetence and/or bad judgement rather than malice. But either way, still unacceptable.
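
As a rough illustration of the parse-and-re-save theory above (and only that: this is not a claim about what SharePoint actually does internally), here is a small Python sketch. Pushing markup through a parser and serializing it again is rarely byte-preserving, even when no "correction" is intended. The helper name `round_trip` and the sample markup are made up for the example.

```python
import xml.etree.ElementTree as ET


def round_trip(markup: bytes) -> bytes:
    """Parse XML-ish content and serialize it again (hypothetical helper)."""
    return ET.tostring(ET.fromstring(markup))


original = b"<page  title='home'><br></br></page>"
rewritten = round_trip(original)

# Same document as far as a parser is concerned, different bytes on disk:
# quoting, whitespace and empty-element style are all up to the serializer.
print(original)
print(rewritten)
print("byte-identical:", original == rewritten)
```

If a stored "normalized" copy like this is what gets served back to the sync client, the file on disk changes even though nobody edited it.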

Sure, but as a cloud sync product it's a "you had one job" situation. Not so much incompetence as a fundamental failure to achieve that basic requirement.

The problem seems to be that this is not primarily a cloud sync product, despite what the name might suggest.

It's not that OneDrive is modifying files "as it syncs"; it's that it involves SharePoint, which adds certain metadata to documents that it handles, and has probably always done so; here's a Stack Overflow question about it from 2010:


EDIT: I see that 'ppog has already mentioned this


Microsoft is not the only one. I had saved a special PDF with settings to full screen the PDF on open. I uploaded this to my Google Drive to transfer it to another machine and it completely reconverted the file into another type of PDF, which not only corrupted the document but also broke the full screen open setting. This is a really random circumstance, but I was surprised to see GD reconvert the PDF rather than store what I wanted "byte-for-byte".

The full screen bit might have been seen as malware and "sanitized" by their anti-virus.

Google Chrome used to corrupt certain PDF files if I saved them via the built-in PDF viewer. It still might, I haven't touched that button in a while.

I've just spent a couple of weeks "unclouding" everything due to a number of problems like this. I was using OneDrive (the consumer one) with an Office 365 Home sub for doing basic personal finance spreadsheets and it decided to literally destroy the contents as it was uploaded and downloaded from one computer to another. I can't really trust it.

I tried Google Drive before and found it unacceptable that it just leaves links to documents on your local disk. I could imagine that in a network down situation I'd be in the shit.

So I'm here with LibreOffice and local file storage only now and all is good.

I don't buy the supposed advantage of these services any more. I'm just going to lump my ThinkPad around and not worry about where my shit is now (it's with me). I'll keep an offline backup at home and one off site (encrypted).

Try BTSync

Closed source with that sort of application is uncomfortable.

If you want open source, and you're okay with manual sync and conflict resolution, I've had good results in the past using Unison[1] to sync multiple systems.

At this point, though, I've personally settled on a system similar to the GP's — a backed-up laptop. In the rare event I need to access a document on the laptop when I've left it at home, there's always ssh, and, in the even more rare case where I need a document and I've left the laptop elsewhere, there's always ssh + rooting around Time Machine folders on my backup server. Finally, for times when I don't want to carry the laptop and know I'll need access to files, GoodReader[2] on iOS syncs over a variety of file server and cloud storage protocols.

[1] http://www.cis.upenn.edu/~bcpierce/unison/

[2] http://www.goodiware.com/goodreader.html

I really want less software to manage my stuff so I'll pass on that.

Or Tahoe-LAFS, which is free and open source, and much more stable.

If this is true then Microsoft will have a hard time convincing the majority of potential users to switch to OneDrive. Although I can imagine possible use cases, altering data without user interaction is unacceptable. What if you had a git repository stored there? (Yes, I know this belongs somewhere else, but some people might want to do it regardless.) Although Dropbox apparently looks at the data to create previews for the website etc., I have never experienced altered data.

This article is about OneDrive for Business, not OneDrive. OneDrive for Business is for business, as the name suggests. SharePoint may be slow, but it does have features that users in business like, so having OneDrive for Business behave like SharePoint is fine.

So that must mean it's just like the difference between Windows 8 Enterprise and Windows 8 Home, yeah? It's exactly the same thing under the hood, but the business edition just has a few extra businessy bits bolted on top, or possibly some restrictive anti-features removed? Because that's how Microsoft branding works, right? Right?

No, they're almost completely different under the hood.

[edit] On second reading, I see perhaps you were being sarcastic. If so, well trolled, good Sir! :-) I had it coming.

No, thank you for being so gracious and understanding!

Why risk so much user trust in order to tag some files?

Same question here. I think it's all about tracking: if someone steals files from your OneDrive, they can track who stole them... maybe? But as a user of both OneDrive and Google Drive, I can say that the license agreement for OneDrive (the consumer one) doesn't assume ownership of your files, while Google Drive's, I think, says that they own your files and can modify them without your permission. At least Google says it.

Disappointing, but a good reminder to not trust 3rd party services with sensitive data regardless.

    ssh user@rsync.net md5 your/file

... let's not paint us all with the same broad brush ...

You may be great today, but nobody knows what tomorrow holds. Therefore, I would prefer:

    ssh user@rsync.net md5 your/already/encrypted/file
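
For services that don't give you a shell to run a checksum on, the same check can be done by hand: hash the file before upload, pull it back down, and hash it again. A minimal Python sketch follows; the helper names `file_digest` and `unchanged_after_sync` and the example paths are made up, and SHA-256 is just a sensible default, not anything the services above prescribe.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def unchanged_after_sync(original_path, synced_path):
    """True only if the two files are byte-for-byte identical."""
    return file_digest(original_path) == file_digest(synced_path)
```

Usage would be something like `unchanged_after_sync("site/index.php", "downloaded/index.php")`. Any injected metadata, however small, changes the digest.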

Shocking that so many bytes would be added to HTML files. If you were backing up website views, this could add a lot of bytes per page request if you published them without realising. Totally unacceptable.

Seriously raises some questions around the whole product, and makes me wonder if we can ever trust any other third-party with any sort of "sensitive" data.

Luckily there exist plenty of FOSS options that provide a close approximation of this functionality for me, though I know they will likely never be an option for the type of companies that rely heavily on things like OneDrive.

They'll probably blame this one on an "error" again, like they did when it was found out that Skype was MITM-ing https links, or when Bing was censoring stuff from China globally, even in English, and in a couple of other cases I don't remember well right now. That seems to be their boilerplate PR response whenever some big privacy infringement happens and many are outraged about it.

I'm interested if you have a cite for MITMing hyperlinks. The only thing I see sounds attributable to old fashioned cookies and ad networks.

I'm pretty sure this user is referencing the link checking bot news that came out mid last year. Basically, a Skype bot HEADs links placed in messages. It's hard to say why, maybe it checks for 404ed links.

>Bing was censoring stuff from China globally

That's not a privacy infringement, that just sounds like production code went up to the wrong server.

Yeah, but of course the Microsoft happy bunnies will go on defending them like the morons they are.

As much as the "new Microsoft" impresses me, some parts of the company still seem to believe they can get away with anything. :-/

There is no "new Microsoft". The huge ship has shifted course, visibly, and announced the shift, but that doesn't mean it's completed its shift, nor does it mean that everyone in the company is completely aligned yet.

While I'm happy that Microsoft is re-aligning with reality, I still don't quite trust them yet. They have a lot of credibility to rebuild.

Reminds me of a few years ago when I said that I thought it was likely the NSA and GCHQ were able to monitor and then inspect without warrants pretty much anything they wanted on the internet, password or not, and I was ridiculed as a conspiracy theorist. After Snowden's revelations I see absolutely zero reason to trust anything any of these companies say or do. And I would use them all with caution - especially if it's your startup's patent application stored in onedrive.
