> I do all of my work on local C:\ paths and copy files to SharePoint or OneDrive only if forced to collaborate with colleagues.
This is an option I considered, but it's harder for non-power users to deal with the file management. It's very likely some users would copy files instead of moving them and it would quickly devolve into trampling each other's changes.
That Dropbox article was great, so thanks. If I had to summarize my issue now, I'd say it feels like the backup solutions have a poorly implemented sync engine. I'm guessing they simply get a stream of changes from SharePoint though and I wonder if it would even be plausible to think they could misapply those changes.
A backup solution makes it into an even more complex sync configuration:
User <=> OneDrive => Backup
Multi-way synchronisation like that is full of landmines. I read a great article once about how a similar many-way, multi-master sync needs to be implemented. The context was LDAP directory synchronisation, but the concepts are similar.
I can't edit anymore, but want to clarify. Files are missing from my backups, not from OneDrive. The backup software fails to reproduce the data in OneDrive.
You're probably thinking along the lines of pointing a file level backup solution at the OneDrive folder, right? This isn't like that. They're commercial solutions that are configured as an Azure App, so they always work against the online data.
I know what you're saying though. If you took something like Arq Backup and pointed it at your OneDrive folder, I don't know how it would work, but suspect it would fail with files-on-demand because it takes a VSS snapshot to get a stable point-in-time. I've never tested, but assume working off a snapshot doesn't trigger the download for files-on-demand and it would feel broken to anyone who doesn't realize what's going on.
That's a good observation, but unlikely to be related to what I'm seeing here.
> Possibly related: A few weeks ago I was told to use the autosave option of MS word.
I don't think this would be related.
I have no reason to suspect any data loss from OneDrive. It's the backups that are failing for me. If I had to restore from backup right now, it wouldn't match what's stored in OneDrive.
> They designed it to look like a differential backup model
> Eventually, if no more changes were made to an item, Veeam Backup for Microsoft 365 will remove all versions of an item except the latest one.
I added the emphasis. I read that to mean I can expect the current point-in-time (aka right now) to be identical to my live data. I'm only trying to reconcile the most recent set of data, so that retention policy shouldn't make any difference, right?
- there's a folder somewhere whose contents are "synchronized", crying laugh emojis galore, using a "Dogshit" protocol, as it appears in the programs Microsoft OneShit or DropShit or Shit.net or Google Shit.
- you have a machine that has a "copy" of a dogshit-synchronized file system, perhaps using a vendor's implementation of Dogshit, such as Synology Dogshit Manager, a piece of software with which I am intimately acquainted
- that machine "regularly" "backs up" its "copy" via a procedure known confusingly as "dogShit"
- wait a minute, the files aren't right!
I hear you. This is very surprising, but it would occur no matter which file sharing system you use, so long as it's based on Dogshit and/or dogShit.
What is the specific flaw with dogShit? The simplest explanation for everything you observe, given the constant moving by multiple users, is that during an ordinary file system enumeration of the "walk" kind, a folder can be moved in such a way that "walk" never visits it, even though the folder's destination is still within the root directory of the walk. You can easily verify this in Python.
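Here is a minimal sketch of that check, assuming a throwaway temp directory. The folder names `a`, `z`, `target` and the function name `walk_with_concurrent_move` are made up for illustration, and the "other user" is simulated by doing the move inside the loop so the result is repeatable.

```python
import os
import tempfile


def walk_with_concurrent_move(root):
    """Walk root while simulating another user moving z/target into a/."""
    seen = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # force a stable order: 'a' is visited before 'z'
        seen.extend(os.path.join(dirpath, name) for name in filenames)
        if dirpath == os.path.join(root, "a"):
            # 'a' has already been listed by the walk, so anything moved
            # into it from now on will not be visited.
            os.rename(os.path.join(root, "z", "target"),
                      os.path.join(root, "a", "target"))
    return seen


def demo():
    """Return the files seen during the racy walk and during a later walk."""
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "a"))
    os.makedirs(os.path.join(root, "z", "target"))
    with open(os.path.join(root, "z", "target", "file.txt"), "w") as fh:
        fh.write("important")
    during = walk_with_concurrent_move(root)
    after = [os.path.relpath(os.path.join(d, f), root).replace(os.sep, "/")
             for d, _, files in os.walk(root) for f in files]
    return during, after
```

Calling `demo()` should show the first walk reporting no files at all, while a second walk over the same root finds `a/target/file.txt`. Nothing was deleted; the folder just moved from a not-yet-visited spot to an already-visited one.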
Okay, so now that I solved your bug in dogShit, what should you use instead? Snapshotters have already solved this problem. Create a dedicated filesystem for the shared directory on your Synology, then snapshot the whole filesystem. A snapshot can only be "missing" changes made after the moment it was taken; it never traverses the filesystem incorrectly. Then you can back up the filesystem snapshots to S3. In Synology you can achieve this with btrfs and some elbow grease.
But the right answer is to not use dogshit. At the end of the day, the people who are authoring Dogshit file sharing products should be giving you a backup approach, not demanding you do it ad-hoc.
> What the heck... Was I expected to extract the archive and go through all the files one by one (there are thousands of them) to check if every file was properly backed up?
There are a lot of silent pitfalls like that in my experience. For example, if you use folder level encryption with Synology Active Backup for MS365 it can silently mangle the file names due to path length restrictions. You'll end up with files that have "file name too long" as part of the file name.
That's why I'm trying to reconcile every file in this data set. I don't trust anything without being able to personally verify it at this point.
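For anyone wanting to do the same kind of reconciliation, here is a rough sketch. It assumes both the live data and a restored backup have been exported to local folders, which is not how the Azure App based products work internally. `build_manifest` and `reconcile` are hypothetical helper names, not part of any backup product.

```python
import hashlib
import os


def build_manifest(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            with open(full, "rb") as fh:
                manifest[rel] = hashlib.sha256(fh.read()).hexdigest()
    return manifest


def reconcile(live, backup):
    """Compare two manifests and return (missing, extra, changed) path lists.

    missing: in live but not in backup
    extra:   in backup but not in live
    changed: in both, but with different content hashes
    """
    missing = sorted(set(live) - set(backup))
    extra = sorted(set(backup) - set(live))
    changed = sorted(p for p in set(live) & set(backup) if live[p] != backup[p])
    return missing, extra, changed
```

Usage would be something like `reconcile(build_manifest(live_dir), build_manifest(restore_dir))`, where both directory names are placeholders. Hashing whole files in memory is fine for a sketch but would need chunked reads for large files.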
> For example, if you use folder level encryption with Synology Active Backup for MS365 it can silently mangle the file names due to path length restrictions.
Is this Microsoft's fault or Synology's fault? Much of this discussion sounds like an inability to distinguish between first and third party products.
> Much of this discussion sounds like an inability to distinguish between first and third party products.
Yeah. Did I explain it badly or something? I thought I made it clear that I'm having an issue with 3rd party backup solutions and that it's possible they're using some shared API from Microsoft, but most people seem to be focused on OneDrive specific issues.