
Show HN: S3Scanner – Find and dump open S3 buckets - _salmon
https://github.com/sa7mon/S3Scanner
======
_salmon
This is the first project I've created that I've actually put a lot of time
into. If you find bugs or have ideas for improvements, please open an issue or
a PR!

~~~
vitovito
Does it handle the case where objects and pseudo-folders have the same names?

e.g. I can have "testobj" as an object, and also "testobj/child" as an object.

When downloaded to a regular filesystem, that breaks and you lose one object
or the other object depending on which was downloaded first, because a regular
filesystem can't do that.

~~~
_salmon
Hmm, I'm not sure, since I don't know what you mean by pseudo-folders here.

Under the hood, the --dump function calls 'aws s3 sync s3://bucket ./bucket',
so it behaves however aws-cli handles that case.
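
For reference, a minimal sketch of what such a wrapper might look like. This is not the actual S3Scanner code: `build_sync_command` and `dump_bucket` are hypothetical names, and the real tool may pass extra flags.

```python
import subprocess


def build_sync_command(bucket, dest_dir=None):
    """Build the argument list for 'aws s3 sync s3://<bucket> ./<bucket>'."""
    # Assumption: the destination defaults to a directory named after the bucket.
    dest = dest_dir if dest_dir is not None else "./" + bucket
    return ["aws", "s3", "sync", "s3://" + bucket, dest]


def dump_bucket(bucket, dest_dir=None):
    """Run the sync; requires aws-cli on PATH. Raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_sync_command(bucket, dest_dir), check=True)
```

Passing the command as a list rather than a shell string avoids quoting problems with unusual bucket names.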

Admittedly, the --dump function is the area that probably needs the most
attention.

~~~
vitovito
I mean, the case is what I described. If you have an object named 'testobj'
and another object named 'testobj/child', and you try to download both to a
traditional filesystem, one of the downloads will fail, because a traditional
filesystem can't have a file and a directory with the same name (S3 is an
object store with different properties).

If you're just calling s3 sync, it will fail in this case, because s3 sync
does not account for it.

You'd have to get a listing first, parse it, and then figure out interactively
with the user what to do with the conflicting object names.
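
A rough sketch of the parsing step, assuming the full key listing has already been fetched (for example with a paginated list-objects call). `find_key_conflicts` is a hypothetical helper, not part of S3Scanner or aws-cli, and it only flags keys that are both an object and a "directory" prefix of another object.

```python
def find_key_conflicts(keys):
    """Return the sorted keys that are an object AND a '/'-prefix of another key."""
    key_set = set(keys)
    conflicts = set()
    for key in keys:
        parts = key.split("/")
        # Check every proper ancestor path: "a/b/c" -> "a", "a/b".
        for i in range(1, len(parts)):
            ancestor = "/".join(parts[:i])
            if ancestor in key_set:
                # On a regular filesystem this name would have to be
                # both a file and a directory.
                conflicts.add(ancestor)
    return sorted(conflicts)


keys = ["testobj", "testobj/child", "other/file.txt"]
print(find_key_conflicts(keys))
```

What to do with each conflicting key (rename it, skip it, ask the user) is left to the caller.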

