
What happens when you run `cp` on the command line? - ingve
https://blog.safia.rocks/post/173365993220/what-happens-when-you-run-cp-on-the-command
======
developer2
Summary of the article if you omit the trace itself:

>> macOS `cp` (*nix not discussed) copies files (somehow, not discussed).
Then, via macOS-specific syscalls, file attributes/metadata are manipulated
(details superficially discussed).

The core functionality of `cp` - namely, copying a file - is not discussed at
all; only the pasted trace gives any insight. There are naive ways one could
do so, such as opening the source for read and destination for write and
iterating X bytes/pages at a time in user space, versus mapping files to
memory, versus instructing the kernel to perform a direct low-level copy. I
was expecting such an analysis, and perhaps an idea of how things change
depending on whether the source and destination paths exist on the same
filesystem.

Instead, the only talking points refer to macOS-specific extensions.
References are made to system integrity protection (SIP), flistxattr for
listing extended attributes, and MAC policies. Yet all we get are descriptions as to
what those subsystems are, without any explanation as to why `cp` is calling
into them. I would assume it's simply to copy file attributes from sources to
destinations (duh), but whether this is the case is not covered.

~~~
ktpsns
Funny thing is that MAC does not stand for Macintosh but for Mandatory Access
Control, source:
https://www.freebsd.org/doc/en/books/arch-handbook/mac-background.html

------
dveeden2
On UNIX there are two copy commands: cp and install. cp copies over the
destination inode. install creates a new inode. This is important when copying
new .so files while an application is using the old one.

~~~
Yetanfou
What makes you think this to be the case? This is what _copy_ does (cp
source_file destination_file, abbreviated):

    
    
       22674 stat64("destination_file", 0xbfad201c)    = -1 ENOENT (No such file or directory)
       22674 stat64("source_file", {st_mode=S_IFREG|0644, st_size=1418, ...}) = 0
       22674 stat64("destination_file", 0xbfad1e18)    = -1 ENOENT (No such file or directory)
       22674 openat(AT_FDCWD, "source_file", O_RDONLY|O_LARGEFILE) = 3
       22674 fstat64(3, {st_mode=S_IFREG|0644, st_size=1418, ...}) = 0
       22674 openat(AT_FDCWD, "destination_file", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0644) = 4
       22674 fstat64(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
       22674 fadvise64_64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
       22674 mmap2(NULL, 139264, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f94000
       22674 read(3, "file contents go here..."..., 131072) = 1418
       22674 write(4, "file contents go here..."..., 1418) = 1418
       22674 read(3, "", 131072)               = 0
       22674 close(4)                          = 0
       22674 close(3)                          = 0
    

It doesn't do anything specific with any inodes, it just opens a file for
reading ( _openat(AT_FDCWD, "source_file", O_RDONLY|O_LARGEFILE) = 3_) and one
for writing ( _openat(AT_FDCWD, "destination_file",
O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0644) = 4_), copies the content from
source to destination and closes both files.
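
For anyone who wants to poke at this, here is a minimal sketch of the open/read/write/close sequence from the trace, assuming a POSIX system. The function name `copy_like_trace` is made up for illustration, and the 131072-byte buffer just mirrors the read size in the trace. It is not the actual coreutils source.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper, not taken from any cp implementation.
 * Returns 0 on success, -1 on error. */
int copy_like_trace(const char *src_path, const char *dst_path)
{
    enum { BUF_SIZE = 131072 };          /* same read size as in the trace */
    int rc = 0;

    int src = open(src_path, O_RDONLY);
    if (src < 0)
        return -1;

    /* O_EXCL mirrors the trace, where the destination did not exist yet;
     * this open fails with EEXIST if the destination is already there. */
    int dst = open(dst_path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (dst < 0) {
        close(src);
        return -1;
    }

    char *buf = malloc(BUF_SIZE);
    if (buf == NULL) {
        close(dst);
        close(src);
        return -1;
    }

    for (;;) {
        ssize_t n = read(src, buf, BUF_SIZE);
        if (n == 0)
            break;                       /* read() returned 0: end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            rc = -1;
            break;
        }
        /* write() may accept fewer bytes than asked, so loop per chunk. */
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(dst, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                rc = -1;
                break;
            }
            off += w;
        }
        if (rc != 0)
            break;
    }

    free(buf);
    if (close(dst) != 0)
        rc = -1;
    close(src);
    return rc;
}
```

Nothing in it touches inodes directly; which inode the destination ends up with depends entirely on the flags passed to the second open().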

Running a trace on _install_ (install source_file destination_file,
abbreviated) shows _exactly_ the same trace with the following additions:

    
    
       22795 fsetxattr(4, "system.posix_acl_access", "\2\0\0\0\1\0\6\0\377\377\377\377\4\0\0\0\377\377\377\377 \0\0\0\377\377\377\377", 28, 0) = 0
       ...
       22795 chmod("destination_file", 0755)           = 0
    

The only difference between _cp_ and _install_ is that the latter sets access
rights while the former leaves this up to the user (and _umask_ ).
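
A rough sketch of that difference, assuming POSIX: the mode passed to open() is filtered by the umask, while chmod() sets the bits exactly as given. The helper names below are made up for illustration.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: permission bits of path, or (mode_t)-1 on error. */
mode_t perm_bits(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return (mode_t)-1;
    return st.st_mode & 0777;
}

/* Create the file as in the cp trace: the kernel applies the process
 * umask to the requested mode. */
int create_like_cp(const char *path, mode_t requested)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, requested);
    if (fd < 0)
        return -1;
    return close(fd);
}

/* Set the mode explicitly, as the chmod() call in the install trace
 * does; the umask is not consulted here. */
int set_mode_like_install(const char *path, mode_t mode)
{
    return chmod(path, mode);
}
```

With a umask of 027, for example, a file created with a requested mode of 0666 ends up as 0640, and a following chmod to 0755 gives exactly 0755.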

~~~
kevinday
That may be true for some UNIXes, but not for all UNIXes. For example on
FreeBSD, create a process that will stick around for a bit:

    
    
      # cat >sleepy.c
      #include <unistd.h>
      int main(void) {
        sleep(999);
        return 0;
      }
      # cc -o sleepy sleepy.c
      # ./sleepy &
    

Now try to overwrite it:

    
    
      # cp /bin/date sleepy
      cp: sleepy: Text file busy
    

Now try with install:

    
    
      # install /bin/date sleepy
      # ./sleepy
    

Install was able to get around the "Text file busy" error. Why?

    
    
      # touch a
      # stat -f %i a
      686974
      # cp /etc/motd a
      # stat -f %i a
      686974
    

cp here has preserved the inode: it replaced the contents without deleting and
recreating the file. Now let's try install:

    
    
      # stat -f %i a
      686974
      # install /etc/motd a
      # stat -f %i a
      687076
    

The inode changed. Why?

    
    
      46785 install  CALL  unlink(0x7fffffffed52)
      46785 install  NAMI  "a"
      46785 install  RET   unlink 0
    

install unlinks the file before copying the data over. Then it creates a
new file/inode (O_CREAT):

    
    
      46785 install  CALL  openat(AT_FDCWD,0x7fffffffed52,0x602<O_RDWR|O_CREAT|O_TRUNC>,0600<S_IRUSR|S_IWUSR>)
      46785 install  NAMI  "a"
      46785 install  RET   openat 4

