
iCloud Drive can strip metadata from documents - ingve
https://eclecticlight.co/2018/01/06/icloud-drive-can-strip-metadata-from-your-documents/
======
csydas
The original source for the article is a bit clearer on what's actually
happening, and the blog post for the HN link omits some details:

[https://eclecticlight.co/2018/01/06/icloud-drive-can-
strip-m...](https://eclecticlight.co/2018/01/06/icloud-drive-can-strip-
metadata-from-your-documents/)

I recommend reading the original (and that the submission be changed to the
original source instead), but a more accurate description here is that two
Macs running two different versions of macOS will see the file differently
once it's copied to iCloud. If a High Sierra Mac sends a file with a custom
icon, for example, to iCloud, the High Sierra Mac will see the file as
expected in iCloud. If a Mac on Sierra looks at the same file on the same
iCloud share, it will see the base file without the custom icon. Other xattr
items are removed as well, as documented in the original source.
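
For anyone who wants to check this on their own files, one approach is to snapshot the extended attributes on each Mac and diff the two snapshots. This is only a rough sketch under stated assumptions: `read_xattrs` and `diff_xattrs` are hypothetical helper names, a snapshot is assumed to be a plain dict of attribute name to bytes, and `os.listxattr`/`os.getxattr` are only exposed by Python on Linux, so on macOS the snapshot would have to come from something else (for example the `xattr` command-line tool).

```python
import os


def read_xattrs(path):
    """Return {name: value} for a file's extended attributes.

    Assumption: Python only exposes os.listxattr/os.getxattr on Linux,
    so this helper refuses to run elsewhere instead of guessing.
    """
    if not hasattr(os, "listxattr"):
        raise NotImplementedError("os.listxattr is not available on this platform")
    return {name: os.getxattr(path, name) for name in os.listxattr(path)}


def diff_xattrs(before, after):
    """Compare two xattr snapshots taken before and after a sync."""
    # Names present before the sync but missing afterwards.
    stripped = sorted(set(before) - set(after))
    # Names present in both snapshots whose values differ.
    changed = sorted(
        name for name in before.keys() & after.keys() if before[name] != after[name]
    )
    return {"stripped": stripped, "changed": changed}


# Illustrative snapshots only; "user.example.notes" is a made-up attribute name.
sent = {
    "com.apple.FinderInfo": b"\x00" * 32,
    "com.apple.ResourceFork": b"icon data",
    "user.example.notes": b"v1",
}
seen_on_other_mac = {"user.example.notes": b"v1"}
print(diff_xattrs(sent, seen_on_other_mac))
```

The diff itself is a pure function, so it works the same no matter how each snapshot was collected.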

~~~
sctb
Thanks, we've updated the link from
[https://mjtsai.com/blog/2018/01/08/icloud-drive-can-strip-
me...](https://mjtsai.com/blog/2018/01/08/icloud-drive-can-strip-metadata-
from-your-documents).

------
okket
This title sounds like the actual files get modified by iCloud. This is not
the case. Only external data ('extended attributes') in the filesystem gets
stripped. It is true though that this is called metadata, but so is EXIF in
photos, or OCR text in PDF scans, which definitely do not get stripped.

Also note extended attributes do not survive most file transfers.

~~~
dkonofalski
This is what I'm wondering. Does Skim (the example used) actually store the
annotations inside the original document in a standard fashion, or are they
stored in some kind of additional file or as a standard xattr? If the latter
is the case, then the OP needs to check whether High Sierra is using APFS or
HFS+. High Sierra should have automatically converted the filesystem to APFS,
which has xattr support natively, while Sierra, by default, would still have
used HFS+, which stores xattrs in the B*-tree and not in the standard metadata
for the file. Since iCloud would only sync over filesystem supported
metadata, the xattrs would be dropped by Sierra.

This is like complaining that a file created in Windows 95 called
FileWithAReallyLongName.doc would be truncated when saved in MS-DOS as
FILEWI~1.DOC.

~~~
TazeTSchnitzel
The internal storage of xattrs is an implementation detail. They look the same
to applications, and copying from APFS to iCloud Drive should function
identically to copying from HFS+.

------
PuffinBlue
It's tough not to scoff. In years past it was a relief to see iCloud
Drive/Mobile Me didn't simply strip out an entire directory. It was honestly a
relief when Dropbox came along as at least you felt you could recover with
their 30 day retention.

There must be mid-tier oldies like me who stopped using iCloud Drive due to
those early loss of trust issues?

~~~
wlesieutre
I don't think I lost any data on my iDisk, but the connection wasn't always
reliable. IIRC it was mounted over WebDAV, but may have been AppleShare before
they switched to that. Flash drives were definitely a safer (but more
expensive) option. I can't remember ever losing one while I needed it.

------
userbinator
Although I haven't used this feature, the behaviour with the custom icon and
filesize seems extremely unusual and I'd be quite puzzled if I came across
this without knowing about it first --- imagine opening a text file whose size
is shown as 500KB+, but discovering it has a tiny fraction of that --- I'd
probably be shocked and, if it was a file someone else gave me, wondering
"where's the rest of the data? Is my filesystem corrupt?"

Of course, if it doesn't count xattr sizes, it would be equally confusing in
the other direction. Maybe showing "1372 bytes (and xxxxxx bytes extended
attributes)" would've been better.
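
A quick sketch of what that kind of display string could look like, in Python. The function name `describe_size` is hypothetical, and the two byte counts are assumed to be supplied separately by whatever is listing the file; this is not how Finder reports sizes.

```python
def describe_size(data_bytes, xattr_bytes):
    """Format a file size, listing extended-attribute bytes separately."""
    if xattr_bytes:
        return f"{data_bytes} bytes (and {xattr_bytes} bytes extended attributes)"
    # No extended attributes: show only the size of the file's contents.
    return f"{data_bytes} bytes"


print(describe_size(1372, 512000))
print(describe_size(1372, 0))
```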

But from past experiences, IMHO xattrs and similar "bonus filesystem features"
are not something you should rely on in general for portable data exchange.
The name (or a slightly modified variant thereof) and the associated sequence
of bytes are all that can really be relied upon to reliably survive transfers,
since it's the lowest-common-denominator abstraction.

PDF has annotation features in the file format itself; perhaps Skim should use
those instead --- in which case annotations will also survive and be visible
in a Windows or Android PDF reader.

I'd also consider things like icons to be "display metadata", in the same way
as the size of the icon or its position in the GUI used to display it.

------
dkonofalski
I wonder if the OP and the article linked in one of the other comments looked
at whether the Sierra install was on APFS. High Sierra
automatically converts all drives to APFS during install while Sierra does
not. Since Sierra stores xattrs in a special B*-tree node rather than as
extended filesystem attributes, it's possible that the High Sierra install is
saving them to iCloud Drive properly with all the filesystem attributes and
the Sierra install is dropping them, for security reasons, as they would have
been invalid on an HFS+ filesystem.

That should be viewed as a bug from the user-perspective but, because those
extended attributes literally did not exist when Sierra was released, it's
hard to say that it should be considered a bug for Apple when it's functioning
exactly as intended in terms of the filesystem. It's almost like complaining
when MS-DOS saved files as NAMEOF~1.DOC instead of "Name of Really Long
File.doc" from Windows 95.

~~~
duskwuff
> Since Sierra stores xattrs in a special B*-tree node rather than as extended
> filesystem attributes...

I think you've misunderstood something.

"Extended filesystem attributes" is literally just what xattrs stands for --
"eXtended ATTRibuteS". "A special B-tree node" is one way that a filesystem
can choose to internally store xattrs. They aren't two separate things.

~~~
dkonofalski
I haven't misunderstood it (I don't think). B*-tree is basically a hack to
store extended attributes that weren't initially included in HFS. APFS
includes those extended attributes natively. I'm just wondering if the
difference in how they're stored is the reason for the loss.

~~~
userbinator
I doubt it, since the APIs to access them are the same.

------
Kpourdeilami
Unrelated, but IMO, the clients for iCloud Drive on both iOS and OS X are very
poorly implemented. I used to have a 16 GB iPhone that would magically go
from 8 GB of available storage to 0 despite me not downloading anything onto
it. It turns out that if you have iCloud Drive enabled on iOS, it automatically
downloads the entire contents of your drive to your phone, and in the Settings
app it classifies it under "Other", so there was no way for me to figure
out what was eating up all of my phone's storage.

The Mac client isn't any better: it sometimes uses 100% of CPU to sync a
few files (the bird process on OS X) with iCloud, and there's no way of
stopping the sync or slowing it down.

------
ksec
Slightly Off Topic Question.

Why is Apple continuously pushing its customers towards the cloud, when many
(more than 40%) in the developed world don't even have access to a 10 Mbps+
Internet connection? (And that is just download speed.)

------
nkristoffersen
Not related, but when I adjust chmod for files in iCloud, they revert back
(during next sync cycle I presume). Just something that may or may not be
interesting for folks looking at iCloud.
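
If someone wants to confirm the revert rather than eyeball it, a small check like the following might help. It is a sketch, not a diagnosis of what iCloud does: `mode_bits`, `mode_reverted`, and `demo` are hypothetical names, and the demo runs against a scratch temp file, so it only shows the check itself. For the real test you would call `mode_reverted` on a file inside the synced folder after a sync cycle.

```python
import os
import stat
import tempfile


def mode_bits(path):
    """Return only the permission bits (e.g. 0o644) of a file."""
    return stat.S_IMODE(os.stat(path).st_mode)


def mode_reverted(path, expected_mode):
    """True if the file no longer has the mode that was set earlier."""
    return mode_bits(path) != expected_mode


def demo():
    """Set a mode on a scratch file and check it immediately afterwards."""
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        path = handle.name
    try:
        os.chmod(path, 0o600)
        # Nothing is syncing this file, so the mode is expected to stick.
        return mode_reverted(path, 0o600)
    finally:
        os.remove(path)


print(demo())
```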

------
maxsavin
I wonder if it's a security precaution.

