
Maybe: run a command, see what it does to your files without actually doing it - colund
https://github.com/p-e-w/maybe
======
paulasmuth
While this looks like a nice idea on paper, I would not recommend using the
current implementation of 'maybe' on a system that hosts valuable data.

The tool seems to work by intercepting individual "blacklisted" system calls
and then - instead of executing them - returning a nonsense value.

The issue is that this breaks every single POSIX spec and will therefore break
any program that does more than a few trivial IO operations and relies on
those operations to behave as specified.

So it might work for a simple demo case where a small script only does a
single file modification and never checks the results, but for any serious
program (think a database, a complex on-disk format or really anything that
does network IO) this will lead to corruption and undefined behaviour as
system calls will return erroneous success values or invalid file descriptors.

I think to actually make this work one would have to emulate the system calls
and make sure everything stays POSIX compliant. Doing this correctly for calls
like mmap might get tricky though (and won't be possible from within a python
runtime). And even then it isn't obvious how something like network IO would
be handled.
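
To make the failure mode concrete, here is a toy model in Python. It is not maybe's actual code: `StubbedFS` and `save_and_verify` are made-up names, and the "filesystem" is just a dict. It only illustrates what happens to a program that trusts a faked success value and then reads its data back.

```python
class StubbedFS:
    """Toy stand-in for a filesystem whose mutating calls are faked.

    Hypothetical model for illustration only; it does not mirror how any
    real tool intercepts system calls.
    """

    def __init__(self):
        self.files = {}        # what really exists on "disk"
        self.intercepted = []  # what the program tried to do

    def write(self, path, data):
        # Record the attempt, store nothing, and still claim full success.
        self.intercepted.append(("write", path))
        return len(data)

    def read(self, path):
        # Reads are not intercepted, so they see the real (unchanged) state.
        if path not in self.files:
            raise FileNotFoundError(path)
        return self.files[path]


def save_and_verify(fs, path, data):
    """A careful program: check the return value, then read the data back."""
    written = fs.write(path, data)
    if written != len(data):
        return "write failed"
    try:
        return "ok" if fs.read(path) == data else "corrupt"
    except FileNotFoundError:
        return "file vanished after successful write"


fs = StubbedFS()
print(save_and_verify(fs, "state.db", b"hello"))
print(fs.intercepted)
```

The return-value check passes, so the program only notices the problem on the read-back, at which point its view of the world is already inconsistent.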

~~~
DavideNL
Obviously, that's why the GitHub description says:

> That being said, maybe should :warning: NEVER :warning: be used to run
> untrusted code on a system you care about! A process running under maybe can
> still do serious damage to your system because only a handful of syscalls
> are blocked. Currently, maybe is best thought of as an (alpha-quality) "what
> exactly will this command I typed myself do?" tool.

~~~
jacobparker
You've missed Paul's point I think. Even running trusted code is unlikely to
work as desired (and could still have negative consequences due to what isn't
included in the "sandbox") for anything other than trivial programs.

~~~
DavideNL
I understood. Obviously the author knows about these limitations, otherwise he
wouldn't be able to write such a tool... but it can still be useful. It's
alpha, the tool can be used on trivial programs, and you should be aware of its
limitations.

------
burrows
This would be more useful if it used a virtual file system.

What happens if the program writes a bunch of stuff to disk then later on
tries to read it?

~~~
whenov

      #!/bin/bash
    
      touch x
    
      if [ -f "x" ]; then
          rm -rf /
      else
          echo "I'm just an innocent little script."
      fi

~~~
singlow
rm -rf --no-preserve-root

~~~
justinjlynn
Never put salt in your eyes. -- Kids In The Hall

------
lugus35
On Windows you have sandboxie
([http://www.sandboxie.com](http://www.sandboxie.com)), a complete sandbox for
your programs.

~~~
viraptor
I didn't know something like that exists. I knew of win10 app sandboxes, but
this is even more generic. I'll definitely use it whenever possible.

I got so used to this being relatively easy on linux (grsec, firejail,
seccomp, etc.), but never heard of easy ways to apply it on win.

~~~
frik
> relatively easy on linux

Please give me an example how to launch a program in a sandbox incl virtual FS
on Linux - with Sandboxie it takes a single mouse click

~~~
cyphar
Docker would be the closest "one click" thing that exists. You could make it
more secure but it would require more configuration.

------
majke
Nice idea. Shameless plug: this reminds me of a ptrace hack I did a couple of
years back:

fluxcapacitor

[https://github.com/majek/fluxcapacitor#fluxcapacitor](https://github.com/majek/fluxcapacitor#fluxcapacitor)

[https://idea.popcount.org/2013-07-19-how-to-sleep-a-million-years/](https://idea.popcount.org/2013-07-19-how-to-sleep-a-million-years/)

------
dantillberg
I'm imagining someone trying something along these lines:

maybe find ~/ | grep . | parallel rm {}

...and being fairly surprised to find everything deleted. It wasn't find's fault!
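
A small Python sketch of why that happens: the shell splits the line at the pipe before any wrapper runs, so the wrapper only ever receives the first command. The "wrapper" here is a made-up stand-in that prints its own arguments, not maybe itself, and the sketch assumes `sh` and `cat` are available.

```python
import shlex
import subprocess
import sys

# Hypothetical stand-in for a wrapper such as `maybe`: it only reports
# the arguments it was started with.
WRAPPER = [sys.executable, "-c", "import sys; print(sys.argv[1:])"]


def wrapper_sees(command_line):
    """Run `<wrapper> <command_line>` through sh and return the wrapper's output."""
    prefix = " ".join(shlex.quote(part) for part in WRAPPER)
    result = subprocess.run(
        ["sh", "-c", prefix + " " + command_line],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# The wrapper is handed only `find /tmp`; `| cat` is a separate process
# started by the shell, outside the wrapper's control.
print(wrapper_sees("find /tmp | cat"))
```

Everything after the first `|` runs as an ordinary, unwrapped process.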

~~~
ecma
Why would anyone ever invoke something like this and expect maybe to somehow
steal those pipe characters? It's almost meaningless and displays just enough
know-how that the author should know how a shell works.

~~~
geofft
Have you never run something like "sudo echo 0 > /proc/sys/..." and only
realized your mistake after running it? _Ever_?

~~~
repsilat
Is the "right solution" to use `tee`? I saw that once, and it seemed like we
should be able to do better -- as if, had `tee` not been in the standard we
wouldn't have any way to do it...

~~~
yrro
Alternatively: sudo sh -c 'echo 0 > /sys/foo'

------
caioariede
Mbox does something similar, but seems to be more robust/complete:
[https://pdos.csail.mit.edu/archive/mbox/](https://pdos.csail.mit.edu/archive/mbox/)

------
mappu
It doesn't work this way, but my first expectation was that this would take an
LVM2/ZFS snapshot and diff the file trees afterward. Then it'd be easily
ported to Windows (VSS) and wouldn't have the subprocess issues on *BSD, but,
the diff would be slower, unordered, and contain changes made by unrelated
processes.
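
A rough Python sketch of the snapshot-and-diff idea, using plain directory walks and SHA-256 hashes in place of real LVM2/ZFS/VSS snapshots. The function names `snapshot` and `diff_snapshots` are made up for this example. It has the same drawbacks described above: the result is unordered relative to when things happened and includes changes made by any process.

```python
import hashlib
import os


def snapshot(root):
    """Map each regular file under root (relative path) to a SHA-256 digest."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files to keep the sketch simple.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            state[os.path.relpath(path, root)] = digest
    return state


def diff_snapshots(before, after):
    """Compare two snapshots and list added, removed and modified paths."""
    common = set(before) & set(after)
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "modified": sorted(p for p in common if before[p] != after[p]),
    }
```

Usage would be `before = snapshot(d)`, run the command, `after = snapshot(d)`, then `diff_snapshots(before, after)`. A file written several times in between shows up only once, as a single "modified" entry.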

~~~
dfc
Comparing two snapshots would convert multiple modifications into one big
change.

~~~
the8472
combining the ptrace approach for logging with mount namespaces, snapshotting
or overlayfs would probably be a more consistent approach if the program
actually tries to use the files.

just stubbing out the system calls sounds like it'll quickly break down once
the programs try to do something more complicated with the files.

------
zx2c4
Also check out mbox:

[https://pdos.csail.mit.edu/archive/mbox/](https://pdos.csail.mit.edu/archive/mbox/)

It's a bit more complete than this.

~~~
dontdieych
I've used [mbox][0], [firejail][1] for this case.

    
    
        mbox COMMAND
    

runs COMMAND and shows you which files changed. Then you can choose to drop or
commit those files.

    
    
        $ mbox mktemp
        /tmp/tmp.cfJElOQmNK
        Sandbox Root:
        > /tmp/sandbox-30749
        > F: /tmp/sandbox-30749/tmp/tmp.cfJElOQmNK
        F:/tmp/tmp.cfJElOQmNK
        [C]ommit all, [c]ommit, [i]gnore, [d]iff, [l]ist tree, [q]uit ?> q
        $
    
    
        firejail --private=/path/to/home-for-command COMMAND
    

does similar things, but it only sandboxes $HOME. It also does not show what
changed by default.

[0]: [https://github.com/tsgates/mbox](https://github.com/tsgates/mbox)

[1]: [https://firejail.wordpress.com/](https://firejail.wordpress.com/)

------
frik
Ideally, you want an ad hoc sandbox (like Sandboxie on Win32).

Ideally, Docker could offer ad hoc commands to launch a process in a sandbox.
And then you launch a file explorer process in the same sandbox (like
Sandboxie) to inspect it, or run a diff tool that outputs statistics like the
tool in the headline.

~~~
dkarapetyan
libguestfs can be used to do the diffing. Create the initial disk, do stuff in
VM, use libguestfs to do diff. I almost built a hacky omnibus packaging tool
this way. I say almost because it turns out to be just as simple to just
compile stuff from source and install in /opt and then package those files.

~~~
ptman
This is like Sandboxed Execution Environment
[https://github.com/F-Secure/see](https://github.com/F-Secure/see) which was
originally created for malware testing. And yes, we used guestfs.

~~~
dkarapetyan
That's cool. Thanks for the pointer.

------
gravypod
THIS IS AMAZING! I would have never imagined seeing something this useful
being made in my life.

~~~
jldugger
It's actually quite simple. LD_PRELOAD can achieve a similar effect, if the
software is using the typical syscall library.

~~~
gravypod
I know exactly how it was done, I am just amazed that it was done. It's a
great idea that I have never come up with, nor would I be able to.

~~~
emmelaich
Yes, it's a great idea.

A similar idea is used to trace what 'make install' does so that stuff can be
uninstalled later - or made into a package (rpm, dpkg etc)

I forget what that tool is called.

~~~
burrows
[http://asic-linux.com.mx/~izto/checkinstall/](http://asic-linux.com.mx/~izto/checkinstall/)

------
GigabyteCoin
I would have used something like this when I was first learning rsync...

Even though I had read all of the man pages and knew the commands inside and
out, it still seemed incredibly risky and scary to me to run rsync with the
--delete function when I was backing up my main USB drive for the very first
time.

Basically my biggest fear was that I had source and destination mixed up,
which I didn't, but it would have been nice to run a test trial of that
command before doing so.

~~~
SixSigma
Did cp not work on your system?

~~~
cyphar
rsync has integrity checking and is better in quite a few ways.

~~~
SixSigma
I meant to back up before running rsync.

------
rcthompson
I'd wonder if this could be implemented more robustly on ZFS using some
snapshot magic.

------
aug-riedinger
Instead of a control approach, where you have to confirm every little action,
it would be nicer to have undoable behavior: I actually run the script the
first time, but I can easily undo it if it did not work as expected. The
current state of the art is of no use to me.

------
rvalue
I like the work, but is it possible that "maybe" you don't catch something and
it actually happens?

~~~
lucb1e
Yes, that is possible. As it says in the readme file:

> maybe should :warning: NEVER :warning: be used to run untrusted code on a
> system you care about! A process running under maybe can still do serious
> damage to your system because only a handful of syscalls are blocked.
> Currently, maybe is best thought of as an (alpha-quality) "what exactly will
> this command I typed myself do?" tool.

------
sry_not4sale
I use Docker to achieve a similar thing...

1) Boot up a new Ubuntu docker container
2) Run command / script
3) Use `docker diff` to see what changes to the filesystem were made

Obviously it's only useful for some commands, but at least it's safe :-)
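
Roughly, as a Python sketch that shells out to the docker CLI. It assumes docker is installed and an image with `sh` is available; the function names and the default container name `maybe-sandbox` are made up for this example.

```python
import subprocess


def docker_diff_commands(image, command, name="maybe-sandbox"):
    """Build the three docker invocations: run, diff, clean up."""
    return [
        ["docker", "run", "--name", name, image, "sh", "-c", command],
        ["docker", "diff", name],
        ["docker", "rm", name],
    ]


def run_and_diff(image, command, name="maybe-sandbox"):
    """Run command in a fresh container and return the `docker diff` output."""
    run_cmd, diff_cmd, rm_cmd = docker_diff_commands(image, command, name)
    # The command's own exit status is ignored; only the file changes matter.
    subprocess.run(run_cmd, check=False)
    try:
        result = subprocess.run(
            diff_cmd, capture_output=True, text=True, check=True
        )
        return result.stdout
    finally:
        # Always remove the stopped container afterwards.
        subprocess.run(rm_cmd, check=False, capture_output=True)
```

Something like `run_and_diff("ubuntu", "touch /tmp/x")` would then return the list of changed paths. Note that the command sees the container's filesystem, not yours, so this only answers "what does it do on a clean system".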

------
13of40
Kind of like -whatif in PowerShell, if anyone bothered to implement it right.

------
_Marak_
This seems neat, but from a security standpoint I'd much rather see a command
which spawned a new VM with a copy of my current file-system.

I would then want to capture all disk and network i/o that the "maybed"
command generated in the VM.

Even that wouldn't be that secure, because the command would still be able to
send sensitive data out. You could intercept the network i/o, but that would
cause most installers to fail.

------
txutxu
Limited scope... this is local.

I usually implement a --trial option in my shell scripts, which covers
reporting of ALL operations (local and remote).

Anyway, this is a nice discovery and a nice hack.

------
im2w1l
Suggestion: Add a way to distinguish between parameters sent to maybe and
parameters sent to the program. You don't accept any parameters at the moment,
but you may want to in the future, and when you start wanting to do that, you
may not want to break any existing usages by changing behavior. For that
reason I think it would be good to introduce disambiguation now.
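
One conventional way to do this is the getopt-style rule: leading arguments that start with `-` belong to the tool, and the first non-option argument or a literal `--` starts the wrapped command. A minimal Python sketch of that rule follows; `split_args` is a made-up helper, not something maybe provides, and it does not handle tool options that take a separate value.

```python
def split_args(argv):
    """Split argv into (tool_options, wrapped_command).

    Leading arguments starting with "-" are treated as tool options.
    A literal "--" ends the options explicitly, which allows wrapped
    commands whose own name starts with "-".
    """
    options = []
    index = 0
    while index < len(argv):
        arg = argv[index]
        if arg == "--":
            index += 1  # consume the separator itself
            break
        if not arg.startswith("-"):
            break       # first non-option starts the wrapped command
        options.append(arg)
        index += 1
    return options, argv[index:]


print(split_args(["ls", "-l"]))
print(split_args(["--verbose", "--", "rm", "-rf", "build"]))
```

Existing invocations without any tool options keep working unchanged, because the command name is the first non-option argument.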

------
lowglow
This is dope. I also wish there was better formatting/coloring/more info for
dtruss/ptrace in general.

~~~
lowglow
s/ptrace/strace/

------
markbnj
Looks like a very interesting and potentially powerful diagnostic tool. Nice
work!

------
amelius
Sadly, it is not possible to combine this tool with programs that use ptrace
themselves, because of limitations in the implementation of ptrace.

In Linux, a program that is being ptraced is not allowed to ptrace another
program.

~~~
deutronium
That's interesting, I never knew that. I wonder why that use case isn't
possible.

~~~
amelius
You can try it yourself by running "strace" on itself, e.g.:

    
    
      strace -f strace -f ls
    

I guess this isn't possible because the kernel developers only envisioned
ptrace to be useful for debugging (they supposedly didn't think about
sandboxing applications), and implementing a "recursive" version of ptrace is
probably more difficult.

------
bobp127001
A really cool idea. Is this more or less how rootkits work and hide
themselves? A system call is made to list the contents of a directory, and the
rootkit excludes itself from that listing?

------
amelius
Another option is to use a filesystem with rollback capabilities.

~~~
effie
Do you know any examples?

~~~
amelius
For example btrfs or zfs.

------
malka
Wouldn't it be possible to run a command in a sandboxed environment, with
Docker or the like, and then provide a diff?

------
andreygrehov
Would it be safer to copy the entire target to /tmp and do all the
required operations from there?

~~~
lucb1e
If you chroot there, maybe. Not sure I understand your question though.

------
agumonkey
Anybody remember cleansweeper ?

------
tluyben2
Very nice example of the python ptrace library and OS internals.

------
OJFord
This is awesome, I hope some Mac user with the know-how sees this and is
motivated to get python-ptrace on Darwin...

~~~
lunixbochs
You don't actually get PTRACE_SYSCALL on OS X, so you need to play with dtrace
or dynamic recompilation to get this kind of tool.

~~~
OJFord
My understanding is that support for the ktrace back-end is (/was/will be)
upcoming for python-ptrace. Unfortunately I don't have the skills for that,
but what I meant is that I hope someone will; then tools like `maybe` can
presumably use python-ptrace in just the same way as other Unices?

~~~
lunixbochs
ktrace doesn't exist anymore on OS X as far as I know. You need to use dtrace,
which means you need to write dtrace programs. Python's dtrace interface would
probably only let you load a dtrace program.

It might be possible to implement `maybe` as a dtrace program by instrumenting
syscall entry and raising a signal or stopping the program immediately, which
you can recover in your debugger (though I'm not sure if this actually stops
the syscall). That said, I tried it and OS X doesn't seem to allow
"destructive" dtrace actions even as root with SIP disabled:

    
    
        $ sudo dtrace -n 'syscall::open:entry { stop(); }' -c 'cat Makefile'
        dtrace: could not enable tracing: Destructive actions not allowed
        $ sudo dtrace -n 'syscall::open:entry { raise(9); }' -c 'cat Makefile'
        dtrace: could not enable tracing: Destructive actions not allowed
    

Alternately you can use dynamorio, intel pin, qemu, or a quick instruction
scan/patch for SYSCALL to manually break on syscalls.

Either way you almost certainly will not be able to do this with python-ptrace
alone. I filed an issue to write a `maybe` tool using Usercorn [1] (which
supports OS X) with my VFS overlay work, which means writes could still
succeed but be non-destructive.

[1]
[https://github.com/lunixbochs/usercorn/issues/151](https://github.com/lunixbochs/usercorn/issues/151)

------
Mithaldu
I feel like this would only be useful for badly written software.

Anything reliable would check that its operations went through as expected and
bail really early.

~~~
millstone
Reliable software should definitely check return codes, but maybe returns
success. Inspecting the filesystem to verify the result is overdoing it, and
race-y besides.

~~~
emmelaich
It's one of those things that change in tone when expressed in the second or
third person.

When I do tests, they are the unmistakable sign of good careful programming.

When _you_ do tests it's an expression of lack of confidence.

When _they_ do tests it's a sure sign that they really don't know what the
hell they're doing.

:-) in case it's needed

