scrapheap's comments

It's worth noticing that the performance difference between sequential and non-sequential reads will differ significantly between types of devices. It's much more noticeable on a spinning hard disk drive than it is on a solid-state drive.
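
If you want to see this on your own hardware, here is a rough sketch that reads the same blocks of a scratch file in order and then in shuffled order. The block size, file size and function names are my own choices for illustration. Note that a freshly written file will mostly be served from the OS page cache, so for a meaningful device-level comparison you would need a file larger than RAM or some way to bypass the cache.

```python
import os
import random
import tempfile
import time

BLOCK = 4096  # assumed block size, chosen for illustration


def read_blocks(path, offsets):
    """Read one BLOCK at each offset and return the total bytes read."""
    total = 0
    with open(path, "rb", buffering=0) as f:
        for off in offsets:
            f.seek(off)
            total += len(f.read(BLOCK))
    return total


def compare(num_blocks=2048, seed=0):
    """Time in-order versus shuffled reads of the same scratch file."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(BLOCK * num_blocks))
        path = tmp.name
    try:
        sequential = [i * BLOCK for i in range(num_blocks)]
        shuffled = sequential[:]
        random.Random(seed).shuffle(shuffled)
        results = {}
        for label, offsets in (("sequential", sequential), ("random", shuffled)):
            start = time.perf_counter()
            nbytes = read_blocks(path, offsets)
            results[label] = (nbytes, time.perf_counter() - start)
        return results
    finally:
        os.remove(path)


if __name__ == "__main__":
    for label, (nbytes, seconds) in compare().items():
        print(f"{label}: {nbytes} bytes in {seconds:.4f}s")
```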


What do you mean by perfect copies here? Do you mean the file content itself or are you also including the filesystem attributes related to the file in your definition?


A file consists of data and various metadata, e.g. file name, timestamps, access rights, user-defined file attributes.

By default, a file copy should include everything that is contained in the original file. Sometimes the destination file system cannot store all the original metadata, but in such cases a file copying utility must give a warning that some file metadata has been lost, e.g. when copying to a FAT file system or to a tmpfs file system as implemented by older Linux kernels. (Many file copy or archiving utilities fail to warn the user when metadata cannot be preserved.)

Sometimes you may no longer need some of the file metadata, but the user should be the one who chooses to lose some information; it should not be the default behavior, especially when this unexpected behavior is not advertised anywhere in the documentation.

The origin of the problem is that the old UNIX file systems did not support many kinds of modern file metadata, i.e. they did not have access control lists or extended file attributes and the file timestamps had a very low resolution.

When the file systems were modernized (XFS was the first Linux file system supporting such features, then slowly the other file systems were modernized as well), most UNIX utilities were not updated until many years later, and even then the additional features remained disabled by default.

Copying between different computers, as rsync does, creates additional problems, because even if e.g. both Windows and Linux have extended file attributes, access control lists and high-resolution file timestamps, the APIs used for accessing file metadata differ between operating systems, so a utility like rsync must contain code able to handle all such APIs, otherwise it will not be able to preserve all file metadata.


But what you're referring to here are the attributes that the file system stores about the file, not the file itself. By default I wouldn't expect a copy of a file to have identical file system attributes, just an identical content for the file. I would expect some of the file system attributes to be copied, but not all of them.

Take the file owner, for example: if I take a copy of a file, then by default I should be the owner of that copy, as it's my copy of the file and not the original file owner's copy.

An alternative way of looking at it: if I have created a file on my local machine that's owned by root and has the setuid bit set in its file permissions, then there's no way that I should be able to copy that file up to a server with my normal user account and have those attributes still set on the copy.


> But what you're referring to here are the attributes that the file system stores about the file, not the file itself.

Yes. Sometimes you need that additional information too. And if you do, then rsync is your tool. If you only need the data stored in the file, then drag & drop suffices.


"File" means an entry in the file system, and so includes the metadata. It is not only the data.

When you copy a file you will be the owner, because the new copy is your copy. Other attributes, however, like the modification date, will remain the same. It's not as if you wrote the contents of the file anew, especially not on copy-on-write architectures like Apple's APFS.
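
To make the distinction concrete, here is a minimal sketch using Python's standard library, where `shutil.copyfile` copies only the data and `shutil.copy2` also tries to carry over the mode and timestamps. The helper name and the file names are made up for the example. As far as I know, neither call preserves owner, group or ACLs on POSIX, so this only illustrates the "data versus some metadata" split, not a perfect copy.

```python
import os
import shutil
import tempfile
import time


def copy_and_compare(preserve_metadata):
    """Copy a scratch file and report which attributes match afterwards."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "original.txt")
        dst = os.path.join(tmp, "copy.txt")
        with open(src, "w") as f:
            f.write("same content\n")
        # Backdate the source by a day so a fresh timestamp is easy to spot.
        old = time.time() - 86400
        os.utime(src, (old, old))
        os.chmod(src, 0o640)

        if preserve_metadata:
            shutil.copy2(src, dst)     # data plus mode and timestamps
        else:
            shutil.copyfile(src, dst)  # data only

        s, d = os.stat(src), os.stat(dst)
        with open(src, "rb") as a, open(dst, "rb") as b:
            same_data = a.read() == b.read()
        return {
            "same_data": same_data,
            "same_mtime": abs(s.st_mtime - d.st_mtime) < 1,
            "same_mode": (s.st_mode & 0o777) == (d.st_mode & 0o777),
        }


if __name__ == "__main__":
    print("copyfile:", copy_and_compare(preserve_metadata=False))
    print("copy2:   ", copy_and_compare(preserve_metadata=True))
```

In both cases the content is identical; only the copy made with `copy2` keeps the backdated modification time.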


So you also would expect some of the file system attributes to be copied, but not all of them. :D


I expect all of them to be copied except for specifically the owner and group. Created date, modified date, ACLs, extended attributes, eeeverything else.

My expectations are more specific than "not all of them", so please don't misrepresent them.


Out of interest, why wouldn't you expect the created timestamp for a file that you've created by copying another file to be the point in time at which the copy was made? After all, before that moment the file didn't exist, and after that moment it did.


In some contexts you may want the new file creation time, but if I copy a folder of backups, for example, I don't want every file to have its date set to today. I'd lose the ability to filter files by creation date, which is very useful for such a use case. I don't remember ever needing a copy to have its creation date reset.


Most tools that sync files (in contrast to mere copies) need a way to know which files need to be copied and which can be skipped. The expensive way is to compute a checksum, but most sync tools rely on the creation or modified date unless told otherwise.

Now say Alice and Bob have the same copy of file F, Bob modifies it first which gets stored at timestamp T, then Alice modifies her copy at time T+1.

Bob syncs his files on a filer, its timestamp gets reset to now, which is say T+2. Then Alice does the same, but her file does not get copied, since the remote timestamp T+2 is newer than her local timestamp T+1.
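
The scenario can be sketched as a toy simulation. This assumes a simple "upload only if the local file is strictly newer" rule and abstract clock ticks with T = 0; the class and function names are invented for the example and are not how any particular sync tool is implemented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileVersion:
    content: str
    mtime: int  # abstract clock ticks, with T = 0


def should_upload(local: FileVersion, remote: Optional[FileVersion]) -> bool:
    """Newer-wins rule: upload only if the local file is strictly newer."""
    return remote is None or local.mtime > remote.mtime


def sync(local, remote, now, preserve_mtime):
    """Return the remote version after a sync attempt at time `now`."""
    if not should_upload(local, remote):
        return remote  # skipped: the remote copy looks up to date
    stored_mtime = local.mtime if preserve_mtime else now
    return FileVersion(local.content, stored_mtime)


def scenario(preserve_mtime):
    """Bob edits at T, Alice at T+1; Bob syncs at T+2, Alice at T+3."""
    T = 0
    bob = FileVersion("bob's edit", T)
    alice = FileVersion("alice's edit", T + 1)
    remote = sync(bob, None, now=T + 2, preserve_mtime=preserve_mtime)
    remote = sync(alice, remote, now=T + 3, preserve_mtime=preserve_mtime)
    return remote.content


if __name__ == "__main__":
    print("timestamp reset on copy:", scenario(preserve_mtime=False))
    print("timestamp preserved:    ", scenario(preserve_mtime=True))
```

With the timestamp reset on copy, the filer keeps Bob's older edit; with the timestamp preserved, Alice's newer edit wins.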


macOS has "date added" for this, which is the date the file was added to its containing folder. It's not the exact same as the date created that you're talking about, though.

I honestly don't have a strong preference either way on this. I don't use date created except for misbehaving media downloaders that think the file modified date is a good place to put the video publication date. I'm sure there's a flag somewhere that I don't care enough to find.


Why do you expect the ACL to be copied but not the owner? They are different abstractions of the same thing.

As a counterpoint, many daemons or programs (e.g.: sshd, ssh, slurm, munge to name a few) expect their files to have specific users, groups and modes for security and behavioral guarantees, and flat out refuse to run if these requirements are not met.

When installing these things from archives or moving/distributing relevant files to large fleets, I expect the file contents and all metadata incl. datestamps to be carried the way I want, because all of that data is useful for me and the application which uses the file.

If the user doing the copying has no right to copy the file exactly, I either expect a loud warning or an error depending on the situation.


Should the SELinux context of a file always be copied from the source when moving or copying it? Or should it typically inherit the context defined by policy for the destination directory structure?

For example, copying a file from a user's home directory (perhaps user_home_t) into /var/www/html/ usually requires it to get the httpd_sys_content_t context (or similar) to be served by the webserver correctly and securely. Blindly copying the original user_home_t context would likely prevent the webserver from accessing the file.

Doesn't this suggest that some metadata, specifically the SELinux context, often shouldn't be copied verbatim from the source but rather be determined by the destination and the system's security policy?


What if the tool accessing the file is malicious, and can copy the file, but can't change the context of the said file? SELinux shall be strict on its behavior even if it's a detriment to user convenience.

SELinux contexts shall be sticky, and need to be manually (re)set after copying.

This is the default behavior, BTW. SELinux contexts are not (re)set during copy operations in most cases, from my experience. You need to change/fix the context manually.


I think when I cp a file it takes on the context of the directory or whatever the default context for that path is supposed to be, and when I mv, it retains the original context.


That is not what most file copying tools do by default. They usually only do that when you specify it and for good reasons.

When foo copies a file from user bar and puts it in his homedir, the last thing he wants is for it to be owned by the foo user.

Your expectations are irrealistics.


> That is not what most file copying tools do by default.

Yes, and that's OK.

> When foo copy a file from user bar, and put it on his homedir, the last thing h want is for it to be owned by the foo user.

It depends.

> Your expectations are irrealistics (sic).

No, rsync can do this (try -avSHAX) and tar does this by default, and we're talking about rsync here.


> rsync can do this (try -avSHAX)

That is exactly what I am saying: rsync does not do this by default either; you have to tell it to via optional parameters.


The thing is, if you’re knowledgeable enough to use rsync over cp, you already know the relevant flags to do that.


The execute bit is an attribute that the FS stores about the file, and isn't technically part of the file itself.

Strip all the execute attributes out of your *nix system and see what happens.
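
A small sketch of that point, assuming a POSIX system (on Windows the mode bits behave differently). The helper names and the script are made up for the example. Toggling the execute bit changes what the file system reports about the file while the bytes stay exactly the same.

```python
import os
import stat
import tempfile


def owner_can_execute(path):
    """True if the owner execute bit is set in the file's mode."""
    return bool(os.stat(path).st_mode & stat.S_IXUSR)


def read_bytes(path):
    with open(path, "rb") as f:
        return f.read()


def demo_execute_bit():
    """Return (x bit before, x bit after, whether the content is unchanged)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "hello.sh")
        with open(path, "w") as f:
            f.write("#!/bin/sh\necho hello\n")

        os.chmod(path, 0o644)  # no execute bits
        x_before = owner_can_execute(path)
        data_before = read_bytes(path)

        os.chmod(path, 0o755)  # metadata change only
        x_after = owner_can_execute(path)
        data_after = read_bytes(path)

        return x_before, x_after, data_before == data_after


if __name__ == "__main__":
    print(demo_execute_bit())
```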


The cp command does copy the file data but not the metadata. There is a reason we have come up with 2 words to distinguish them.

Rsync only copies the metadata when you specifically ask it to anyway. I haven't had a look at the openrsync man page, but I would assume it is the same in the case of the latter.


Nope.

Openrsync lacks the options of rsync for making exact copies.

Moreover, the OpenBSD file systems are unable to store all metadata that can accompany files in Linux filesystems or Windows filesystems, so that is the likely reason for removing the rsync options.

I also doubt that the developers of a utility for OpenBSD are interested in taking care to preserve file metadata when copying to/from Windows, because the metadata access API is not portable, so a complete "rsync" utility must include specific code paths at least for Windows, for Linux and for FreeBSD. I do not know if the API of macOS is also specific to it, or whether it is compatible with anything else.


It means that if you copy a file from NTFS to ext4, ext4 will magically sprout support for alternate data streams.


And all files from NTFS have +x. :|


I can understand when a company has a policy of never giving feedback, but it's a shame when they can't at least be polite about it. How hard is it to have a standard response saying something along the lines of "Sorry, it is not company policy to provide feedback"?


> How hard is it...

Not very - though the experienced prospects don't need to be told.

And people who are inclined to sympathy and politeness tend not to stick around in a role which requires lots of saying "no", to people who really wanted to hear "yes".

Finally, the sooner you close the door on further communication, the less time you waste with candidates who fall short of "calm and professional" in accepting their "no".


I didn't know there were lithium-ion batteries in AA and AAA sizes. Any good recommendations?


They exist in those sizes, but they are not AA or AAA replacements because the voltages are different. If you want to use rechargeable batteries in devices made for alkaline batteries, Eneloop-branded NiMH batteries would be my preference, though other brands are also an option.


I'm a big fan of both Mermaid and Graphviz - Thanks to GitLab supporting Mermaid we can put relevant project diagrams inline in Markdown docs that live in the same git repo as the rest of the project code.

And if I need to generate a graph programmatically then I instinctively reach for Graphviz, as it's solid and can produce the graphs in so many different file formats that they're easy to include wherever they're needed. Your code is a lot simpler as it doesn't need to handle any of the rendering logic; it just needs to work out which nodes are connected by which edges.
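
As a rough illustration of how little the generating code has to do, here is a sketch that turns a list of edges into DOT text. The function name and the example edges are made up; it only handles plain directed edges and escapes double quotes in node names.

```python
def to_dot(edges, name="G"):
    """Build Graphviz DOT source for a directed graph from (src, dst) pairs."""

    def quote(node):
        # Wrap the node name in double quotes, escaping any embedded quotes.
        return '"' + str(node).replace('"', '\\"') + '"'

    lines = [f"digraph {name} {{"]
    for src, dst in edges:
        lines.append(f"    {quote(src)} -> {quote(dst)};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical dependency edges, purely for illustration.
    edges = [("build", "test"), ("test", "deploy"), ("build", "docs")]
    print(to_dot(edges))
```

Save the output to a file and something like `dot -Tsvg graph.dot -o graph.svg` does the layout and rendering.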


Favorite is hard, but if you use vim and don't know that `.` repeats the last change then you're missing out. It's really useful for those times where you need to make the same change in lots of places in a file, but not everywhere.


> When you move, do you expect to be able to keep using your previous postal addresses? (Perhaps there could be some benefits...)

In some countries you can tell the postal service that you're moving and, for a set period of time, they'll forward the mail addressed to you from your old postal address to your new one. It's very useful for catching all those places where you forgot to update your address when moving.

You can do the same with email by setting up a forwarding rule on your old email address so that it forwards any emails it receives to your new email address.


Yes, that's why I said email and snail mail portability are about the same.

OP's question was about portability like phone numbers where you can keep using your old phone number indefinitely after "moving" to a new provider.


I don't know about your network, but for me that NAT device is sat on my network and so very much my problem :D


The Freescape 3D Engine was in use in the 80's (see https://en.wikipedia.org/wiki/Freescape ).


Good ol' Driller will always be my first 3D game. The novelty of 3D graphics that were more than the plain wireframes of Elite, was such that we put up with the 1 FPS framerate for hours on end.

I wish I could regain some of that wonder back, but all the Raytracing and RTX make it very hard to find.


Yes and no.

If that memory isn't being used and other things need the memory then the OS will very quickly dump it into swap, and as it's never being touched the OS will never need to bring it back in to physical memory. So while it's allocated it doesn't tie up the physical RAM.

