- The tools feel fragmented, not aligned, and often don't compose well
- There are always sharp corners (billing, ops, you name it)
- Products are in various states of usability/maturity; sometimes a product is launched while barely usable.
I've never worked for Amazon, but if what is said about Amazon is true, then the culture is reflected in their products: with siloed teams each owning their own services, it does make sense that those services have these quirks.
That comparison seems unfair... AWS has a habit of launching early and iterating, and they almost never shut stuff down. In this case it's a filesystem implementation of an HTTP API, so the capabilities reflect the main API. Deciding to build and launch reads before writes isn't really a bad decision; it helps those who have the read use case try this out and give feedback immediately.
I think there’s a bunch of gateways available already. The example use case for this seems to be mounting a 10 petabyte “hard drive” with CSVs to a reporting machine.
"Doesn't supports writes in the first release" is a letdown, but the latter part above seems expected. I suppose you could abstract what would have to happen to fake arbitrary seek() and overwriting portions of a file, appending, etc, but it would encourage things that wouldn't work well.
Partial writes would need to be buffered and flush()ed. That's not unlike other file systems, where you don't fully know data is persisted until you flush - or am I missing something?
S3 doesn't support partially updating an object; you would have to completely replace the entire object. While they could do some magic under the hood to make it seem like it works, it's better to just reject these operations and let the application developer decide what's best.
Maybe I'm missing some context here, but can't you generate X presigned URLs for your chunk size, and if you provide the chunk ordering, isn't the assembly done automatically on the S3 side? I thought you could also manually implement the chunk order assignment.
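For reference, this is roughly how the underlying mechanism works with S3 multipart upload via boto3: you declare the part ordering yourself and S3 stitches the parts together on completion. The bucket, key, and file names below are placeholders, and presigned URLs can be generated for the same upload_part calls. Note that this only ever produces a new object; there is still no way to patch a byte range of an existing one.

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "example-bucket", "big-object.csv"  # placeholder names

# Start a multipart upload; S3 returns an UploadId that ties the parts together.
upload_id = s3.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]

parts = []
chunk_size = 8 * 1024 * 1024  # every part except the last must be at least 5 MiB
with open("big-object.csv", "rb") as f:
    part_number = 1
    while chunk := f.read(chunk_size):
        resp = s3.upload_part(
            Bucket=bucket, Key=key, UploadId=upload_id,
            PartNumber=part_number, Body=chunk,
        )
        parts.append({"PartNumber": part_number, "ETag": resp["ETag"]})
        part_number += 1

# The ordering is whatever you declare here; S3 assembles the parts into one object.
s3.complete_multipart_upload(
    Bucket=bucket, Key=key, UploadId=upload_id,
    MultipartUpload={"Parts": parts},
)
```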
I would suspect the team making this tool is not exactly the same team maintaining S3...
So adding "server side operations" would require work from the other team to be added.. which certainly could be done if they decided it was needed.
As a first MVP release this project seems neat for read use cases.. and feedback from the community can help ensure read operations are solid while they are also working on the write side -- which they admitted was lacking in this initial release, but want to improve.
That S3 storage is almost immediately consistent on a planetary scale is already pretty impressive. Supporting partial updates would make that a lot more complicated.
OTOH, you can always engineer around such limitations by storing smaller objects that you can completely ingest and store in a single operation.
For that use case, they have EFS. If you are reading from one or more S3 blobs and gradually building another blob, you can keep that blob on EFS and, when finished, dump it back to S3.
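A minimal sketch of that staging pattern, assuming an EFS (or just local) mount and placeholder bucket/key names:

```python
import boto3

s3 = boto3.client("s3")
staging_path = "/mnt/efs/report.csv"  # assumed EFS (or local) staging location

# Build the output gradually with ordinary POSIX writes and appends...
with open(staging_path, "w") as out:
    for key in ("input/part-1.csv", "input/part-2.csv"):  # placeholder input keys
        body = s3.get_object(Bucket="example-bucket", Key=key)["Body"].read()
        out.write(body.decode("utf-8"))

# ...then dump the finished file back to S3 as a single object.
s3.upload_file(staging_path, "example-bucket", "output/report.csv")
```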
> it doesn’t support writes in this first release, and in the future will only support sequential writes to new objects.
To me it just sounds like Azure Files with fewer features. You can just mount a share on any machine supporting NFS or SMB, or just use the REST interface. [0]
> The open-source client does not emulate operations like directory renames that would require many S3 API calls or POSIX file system features that are not supported in S3 APIs.
At this point I'd probably be fine with using a different API than the filesystem, perhaps one inside whatever backend language I was using.
Indeed, if you're getting that far away from historical file system primitives, then you may as well use an API model that makes sense for the storage backend. Or just contribute directly to rclone and FUSE.
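For instance, a minimal boto3 sketch of treating the bucket as an object store rather than emulating a filesystem - listing keys by prefix and streaming each object directly (the bucket name and prefix are placeholders):

```python
import boto3

s3 = boto3.client("s3")
bucket = "example-bucket"  # placeholder

# Instead of emulating directories, list object keys under a prefix...
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket, Prefix="reports/2023/"):
    for obj in page.get("Contents", []):
        # ...and stream each object directly instead of pretending it's a local file.
        body = s3.get_object(Bucket=bucket, Key=obj["Key"])["Body"]
        for line in body.iter_lines():
            print(line)  # stand-in for whatever the application actually does
```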
Reminds me of tape-based storage like the Sinclair QL or Exatron's "stringy floppy". Or Mitsumi's QuickDisk, a format I saw on the Sharp MZ series which is not a disk, but a spiral track that looks a lot like a loop of tape running past RW heads.
I experimented with similar ideas 11 years ago with the OpenStack counterpart of S3, and basically an object store is not a real filesystem. You can get away with an FTP/SFTP interface (for example), but that's it. Everybody wants cheap object storage, but not the fact that what you deal with is "objects".
FYI, Mountpoint is written in Rust, but using a new framework called CRT (Common Runtime) that's written in C. The perf is a multiple of that of the AWS CLI, since the CLI is in Python. You can actually try using CRT in Python by setting some flag - I forgot what. It doesn't support all the APIs yet, though.
Cool. I was mostly just pointing out the assumption that the official software is somehow the best option is proven incorrect time and time again with AWS.
I like s3fs, sshfs, and similar solutions, but they get wonky in corner cases, like gaps in connectivity.
What I'd really like is a kernel-level file system abstraction which:
(1) Has capabilities, and unsupported operations fail. If I try to move a log file over to a medium like S3, it works. If I try to write a log file directly to S3, incrementally adding data, it fails. I don't want compatibility kludges (e.g. copying a file down, appending a line, and copying it back). There's a rough sketch of what I mean below.
(2) Provides common abstractions (my file manager works on S3 as much as it does locally)
(3) Supports all media.
The current implementation breaks down with NFS and hot-swappable media like USB sticks. Having removable media stuck in an impossible-to-unmount state shouldn't happen. Nor should assorted failures due to connectivity.
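A rough, hypothetical sketch of point (1) in Python, assuming the mount simply refuses unsupported operations instead of emulating them:

```python
import errno
import os

def append_line(path: str, line: str) -> None:
    """Append a line, failing loudly if the backing medium can't support appends.

    Hypothetical illustration only: the point is that an incremental append to an
    S3-backed path should surface a clear error, not silently fall back to a
    download-modify-reupload kludge.
    """
    try:
        fd = os.open(path, os.O_WRONLY | os.O_APPEND)
    except OSError as e:
        if e.errno in (errno.EOPNOTSUPP, errno.EPERM):
            raise RuntimeError(f"{path}: medium does not support appends") from e
        raise
    try:
        os.write(fd, (line + "\n").encode("utf-8"))
    finally:
        os.close(fd)

# Moving a *finished* log file onto the same medium is a whole-object operation,
# so a capability-aware layer could allow that while rejecting the append above.
```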
I've been burnt by trying to use FSx as a mount-S3-as-a-read-only-filesystem solution. Flaky mounts, complicated workarounds for maintenance windows - if this lives up to the promise, I can delete a whole module (and the cognitive overhead of reasoning about its state and behaviour) that we've written to manage a local disk mirroring an S3 bucket.
To be frank, I think S3 developments like this happen because S3 compatibility exists both on the "server" side (Backblaze B2, MinIO, SeaweedFS, etc.) and on the client side, so Amazon is probably not too happy about the competition. It's probably a good idea to stay away even from Amazon's open source, given that there are more independent alternatives that work well and are used in production by big companies.
I love Rust, but I have to agree that installing rustup like that feels like a crime against my machine. For some time I would only install rustup inside an air-tight virtual machine, since I can't really be bothered to read the sh script every single time I download it.
It's 2023, we have package managers, we have packages, we have containers; why are we still shipping software like this? Even worse is the fact that you can generally find Rust in your package manager of choice, but it will often be outdated and you won't be able to choose your versions the same way you would with rustup.
I don't know, I'm not sure I have a solution, but I just wish maintainers would put a little more effort into supporting package managers as the de facto way to set up a toolchain such as Rust's. Even if you want to keep a meta-package manager such as nvm or rustup, at least let me download _them_ from my distro's repositories instead of running a random sh script from the internet.
This specifically isn't an issue of trusting them with my system; it's that a shell script can shit all over a system without a good way to undo it, even if it was well-intentioned.
Package managers are modern technology, they exist because they can track what files were placed where, and can remove them cleanly when given an uninstall command.
Maybe the problem is the rust community seems to live on the bleeding edge and always needs to have a ‘nightly’ build to do anything interesting?
And distros usually don’t bump versions between releases because they just don’t. Run Debian Sid or Fedora Rawhide or <whatever> if that’s what you’re after.
> rust community seems to live on the bleeding edge and always needs to have a ‘nightly’ build to do anything interesting
You have an example of a project that needs nightly to build? I’m sure some exist, but nearly all libraries and projects live on the latest stable release.
That indeed used to be very common in the early days of Rust; however, I believe stable Rust is what most libraries and projects use nowadays. It's been some time since I saw anything that required a minimum Rust version newer than 1.60 (released in April last year).
It's straightforward to set up an apt-get repository that updates nightly and have users use that. For something as widespread as Rust I expect better than curl | sh scripts.
Probably doesn’t matter for Rust but once you start messing with random system packages that other packages depend on it becomes less than straightforward really quick.
One simple version bump can affect hundreds of packages and, if you're not careful, bork an entire install… ask me how I know that one.
—edit—
Also should say that I’m fully in the package manager camp. If I want to install something that’s not in the repos I almost always find or make a package and build it locally because I don’t want random orphaned files strewn around my system folders.
Unless it's just some command-line program, in which case I usually just use it from the source directory and don't even bother having it in my path.
For what it's worth, I had a C# dev who'd never dealt with Rust at all get themselves set up and then compile and run the Rust app by themselves in about 10 minutes flat (our internet is slow), and I definitely think the ease of the setup flow contributed to that hugely, so massive thanks for making it so nice.
Why do you trust scripts from your package manager more than one from the official upstream Rust project? The Rust project also has a good security track record.
I trust them more because I choose who maintains the repos I use. I trust them more because I already have to trust them: they provide almost every single piece of software I run on my machine. In this case I'm on Fedora, which has a good track record for security, stability, and only allowing free software.
That's not to say that I don't trust the Rust project (nowadays I kinda have to), but curl | sh installation is messy and separate from the rest of the system. If they just packaged rustup into an rpm and set up their own repo that I could point dnf at, it would make system maintenance so much easier.
I just want them to use the tools that already exist instead of reinventing the wheel with an esoteric 700-line sh script. This applies to Rust and any other kind of tooling that uses the same installation workflow. I understand the reasoning behind it, but I believe the other options should also be considered and supported.
What's wrong with it, as long as it comes from an https URL with low potential for typosquatting (a short .sh domain would be even more natural, but maybe controversial and a bit expensive for just this)? You don't have to pipe it into sh; you can also redirect it, i.e. "> install.sh", and examine it before running. What's the alternative? A dozen incompatible centralized lang-specific or distro-specific package managers? That might work for devs but not end users, might not work with all licenses, and is extra work for distros + macOS, so chances are the OS you're using or would otherwise be using isn't covered, ...
The whole point of this installation method is ease of use. You can use it for convenience if you prefer it. I don't see the issue with providing more options, not fewer.
Try examining the above script. It's a lot of work. It's not just a bunch of wget's and cp's, there are a bunch of subroutines and conditionals. Too much to look at.
Also as another user pointed out, uninstallation is a problem.
> Try examining the above script. It's a lot of work. It's not just a bunch of wget's and cp's, there are a bunch of subroutines and conditionals. Too much to look at.
This would be a reasonable counter-argument if most people could honestly claim to have inspected the source of >1% of the things they'd installed from apt-get/yum/etc.
"But I trust the maintainers of those repositories to verify correctness" - yes, and people trust the Rust maintainers, too.
I'll grant you that uninstallation is usually a good argument (though not in this particular case).
Ansible provides a module to install OS packages. It works with apt, yum, dnf, pacman, etc. This is just one of the implementation wrappers around a universal way to accomplish the task.
This same mechanism also allows uninstallations and relies on the OS to manage dependencies, which it does.
This one true way necessitates the "OS package manager" being the way for an OS to manage packages.
This exists, and tools exist to eliminate the need to memorize how to do these processes: update, remove, reinstall, rollback, etc. It also centralizes into one config and workflow your very limited number of trusted parties, what their GPG keys are, and how to validate, rescind, and securely handle and use those keys. It handles digest verification, SHA sum validation, etc.
One way to do this exists by as many names as there are distro families, but it's one way nevertheless.
This way, with these keys, stored under the security postures built and refined over decades, exists.
The people complaining that this one way should be used turn out to have a valid point, and your suggestion is important, so it sounds like you're in agreement: let's use the OS package manager when installing OS packages. This way should be foremost, recommended, and most prominent. It's easier and safer. Do you disagree?
Can we please, please, please have someone write the authoritative article on why this is _actually_ a bad idea, so that it can be linked next time this conversation comes up, rather than merely intimating that it's bad?
...Yes, I'm intentionally invoking Cunningham's Law in the hopes that it exists and someone will link it here. Just to do a little due diligence, I searched for some answers, and found:
* https://stackoverflow.com/a/29389868/1040915 - "Because you are giving root access to whatever script you are executing. It can do a wide variety of nasty things." - apt-get and yum require sudo too.
* https://stackoverflow.com/a/34016579/1040915 - "if the script were provided to you over HTTP instead HTTPS..." sure, no arguments there! So don't do that then! ... "if the connection closes mid-stream, there may be executed partial commands, which were not intended to" ok, an actually reasonable answer! (though see the next link)
* https://www.arp242.net/curl-to-sh.html - a pretty comprehensive article in support of `curl | sh`, which points out that it's of equivalent security to `git clone && cd dir && ./make`, and only very slightly less than using package managers (which provide checksums). It also points out you can avoid the "interrupted connection" issue by running within a function
* https://medium.com/@esotericmeans/the-truth-about-curl-and-i... - reiterates that interrupted connection issues are minor, and repeats some server-side exploits that could potentially happen (if you don't trust the server, don't _ever_ install anything from it, via any means!)
Don't get me wrong - given the choice, I'd rather have the audit log, built-in checksum, uninstallability, and other features of a package manager any day. I'm not arguing that `curl | sh` is _better_ than package managers. But I have always been a little baffled why it's portrayed as _so_ much worse (when installing from a trusted source, over HTTPS) as to be anathema and repugnant.
There's nothing wrong with curl | sh other than people realising "damn, this software could be malicious!" Yes, it could be. That doesn't mean an HTTPS link maintained by a project with a good security track record is actually serving something malicious, though.
Releasing? No. But you're already free to consume it via some other means, such as your OS's/distro's package manager of choice. It's not really on tool authors to teach you how to use your (and every other) package manager, never mind ensure it's packaged for all of them.
[0]: https://aws.amazon.com/blogs/storage/the-inside-story-on-mou...