It’s a shame they have committed to never supporting BigQuery and other cloud provider services. We really need something like this at my org, but there’s no compelling reason to move all of analytics off of BQ, nor is there a good reason to add yet another copy layer between the apps and the analytics tools.
We literally have an entire set of teams building a DAL internally, and it's basically a forever project, while existing app engineers just keep doubling down on table federation and other stuff that breaks lineage and ownership models.
I don't really grok how this API, which appears to be very file-oriented, would be adapted to querying a SQL interface like BigQuery. How would that look?
This is really interesting. For a project I was building I figured out that having a sort of "universal access layer" for all types of storage was a requirement, and this looks like it's right up my alley.
Just because the author of this piece seems to be looking at this post, some minor nitpicks regarding spelling/phrasing:
> GCS has native JSON API which more powerful
Missing a verb.
> OpenDAL needs to implement features in zero cost way which means:
"in a zero cost way" is probably better.
> What OpenDAL does?
What does OpenDAL do?
> Free to zero cost
This is a tricky one. Right now it sounds a bit weird because you're turning 'to zero cost' into a verb, suggesting users would have to 'zero cost' things themselves. The problem is that 'free' and 'at zero cost' mean the same thing, so it's hard to use both in the same phrase. 'Free of cost' could work.
Looks great. Very impressed with the vision (although the language could be improved), and the "won't implement" decisions made.
I would provide a bit of challenge on tenet 2, however. Supporting "storage_class" for S3 is a compromise that clearly has to be made, and yet it appears you're not realizing you'll have to make other compromises like it in the future. I would suggest a storage-specific configuration class for each storage backend, and then you won't need to make these arbitrary concessions. The power of OpenDAL will be its standardized data API, not its simplified configuration.
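To make the suggestion above concrete, here is a minimal sketch of what per-backend configuration classes could look like. Every name in it (`CommonConfig`, `S3Config`, `FsConfig`, `describe`, and the individual fields) is hypothetical and chosen for illustration; none of it is OpenDAL's actual configuration API.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CommonConfig:
    """Options every backend understands (hypothetical)."""
    root: str = "/"


@dataclass(frozen=True)
class S3Config(CommonConfig):
    """S3-only options live here, not in the shared config (hypothetical)."""
    bucket: str = ""
    region: str = "us-east-1"
    storage_class: str = "STANDARD"


@dataclass(frozen=True)
class FsConfig(CommonConfig):
    """Local filesystem options (hypothetical)."""
    follow_symlinks: bool = False


def describe(config: CommonConfig) -> str:
    """Render a config as 'TypeName(key=value, ...)' with keys sorted."""
    fields = ", ".join(f"{k}={v!r}" for k, v in sorted(asdict(config).items()))
    return f"{type(config).__name__}({fields})"


s3 = S3Config(root="/data", bucket="analytics", storage_class="GLACIER")
fs = FsConfig(root="/tmp/data")
print(describe(s3))
print(describe(fs))
```

With this shape, a backend-specific knob such as `storage_class` never leaks into backends that have no use for it, while the data API can stay identical across all of them.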
I'm also not convinced that the project should implement OpenDAL Gateway. I cannot see how it will provide anyone any value other than making things more confusing.
The core part of OpenDAL is a Rust crate that provides fs-like APIs over different storage backends, but we are also investigating other interfaces like a CLI. We have an experimental binary named `oli`[1].
You're welcome to start a discussion[2] to share how you use rclone, and we may find that it fits within OpenDAL's scope :D
I was surprised that Rust didn’t have VFS libraries. I created my own async-vfs crate, but I'm now using opendal for a Nextcloud alternative that I have been working on in Rust.
That's my peeve with marketing advice these days. Describing what the product will do for the user in emotional/vague terms carries zero information relevant to evaluating the product and making a use/purchase decision. It's either treating the customer as a generic unsophisticated idiot who can't understand what the product actually does and just needs to be told it'll make them happy, or it's a pure manipulative play. Either way, this is not the style OSS projects should pick up.
To me it means: you will be able to write code that lets you save and retrieve data from various online services that offer data storage, without having to know the particulars of each of those services' APIs.
Is this what it does? If so, I wouldn't think that was useless information; at least it is useful enough for me to decide whether I should read on.
It is accurate but not concise. The explanation should start with the purpose of the project described in about one sentence, then elaborate by adding context that may or may not be known to the reader.
I linked the overview page because the image on it explains the project better than any description I could find.
One of the issues I see is mixing "what" and "why".
The "what": an abstraction for accessing object storage services, implemented as client libraries in multiple languages plus utilities built on top.
The "why", the problem to be solved: there are many services that provide key-value storage with values being possibly very large; it is reasonable to want to write code that works with any of these services but there is no common interface to write this code against.
Caveat, I may very well be misunderstanding the project.
Questions I have:
- how does it compare to object_store?
- what is the philosophy of the project; in what way is it opinionated about the design of object storage services?
I agree it took me some time to understand what it does.
It seems to be a Rust library/crate allowing filesystem-like operations on many storage systems, as long as you can do basic key/value things: from your filesystem to Redis, S3, SQLite, etcd…
They also provide Python and Node.js bindings.
The operations are apparently (not supported by all storage systems):
stat
read
write
create_dir
delete
copy
rename
list
scan
presign
blocking
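As a rough illustration of how a handful of these operations can sit behind one interface, here is a minimal sketch. It is not OpenDAL's API: the `Storage` base class and `MemoryStorage` backend are hypothetical, and `copy` and `rename` are given default implementations built from `read`, `write`, and `delete` to show one way a backend without native support could still offer them.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class Storage(ABC):
    """Hypothetical storage-agnostic interface (illustration only)."""

    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...

    @abstractmethod
    def read(self, path: str) -> bytes: ...

    @abstractmethod
    def delete(self, path: str) -> None: ...

    @abstractmethod
    def list(self, prefix: str) -> List[str]: ...

    def stat(self, path: str) -> int:
        """Return the content length; a real backend would use a cheaper call."""
        return len(self.read(path))

    def copy(self, src: str, dst: str) -> None:
        """Fallback copy for backends with no native copy operation."""
        self.write(dst, self.read(src))

    def rename(self, src: str, dst: str) -> None:
        """Fallback rename: copy, then delete the source."""
        self.copy(src, dst)
        self.delete(src)


class MemoryStorage(Storage):
    """In-memory key/value backend used to exercise the interface."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def write(self, path: str, data: bytes) -> None:
        self._objects[path] = bytes(data)

    def read(self, path: str) -> bytes:
        return self._objects[path]

    def delete(self, path: str) -> None:
        self._objects.pop(path, None)

    def list(self, prefix: str) -> List[str]:
        return sorted(k for k in self._objects if k.startswith(prefix))


store = MemoryStorage()
store.write("logs/a.txt", b"hello")
store.copy("logs/a.txt", "logs/b.txt")
store.rename("logs/b.txt", "archive/b.txt")
print(store.list(""))
```

Any other backend (a local directory, an object store, a key/value database) would only need to supply the four abstract methods to plug into code written against `Storage`.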
But now I have a picture of how you can use it, either as:
1. A drop-in replacement for the S3 SDK (AWS's Rust SDK is ...);
2. A quick way to let your users configure whatever cloud storage they have (freeing you from supporting multiple cloud OSS backends with different SDKs).
A growing number of DB projects use OpenDAL in the second way, like Databend, GreptimeDB, QuestDB, RisingWave, etc.
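The second usage pattern can be sketched roughly as follows: the application is written once against a small storage interface, and the user's configuration decides which backend gets built. This is only an illustration under stated assumptions; `MemoryBackend`, `FsBackend`, `BACKENDS`, `open_storage`, and the `scheme` key are all hypothetical names, not OpenDAL's API.

```python
import os
import tempfile


class MemoryBackend:
    """Keeps objects in a dict (hypothetical backend)."""

    def __init__(self, **_options):
        self._objects = {}

    def write(self, path, data):
        self._objects[path] = data

    def read(self, path):
        return self._objects[path]


class FsBackend:
    """Stores objects as files under a root directory (hypothetical backend)."""

    def __init__(self, root, **_options):
        self._root = root

    def _full(self, path):
        return os.path.join(self._root, path.lstrip("/"))

    def write(self, path, data):
        full = self._full(path)
        os.makedirs(os.path.dirname(full), exist_ok=True)
        with open(full, "wb") as f:
            f.write(data)

    def read(self, path):
        with open(self._full(path), "rb") as f:
            return f.read()


# Registry mapping a user-facing scheme name to a backend class.
BACKENDS = {"memory": MemoryBackend, "fs": FsBackend}


def open_storage(config):
    """Build a backend from a user-supplied config dict."""
    options = dict(config)
    scheme = options.pop("scheme")
    if scheme not in BACKENDS:
        raise ValueError(f"unsupported storage scheme: {scheme!r}")
    return BACKENDS[scheme](**options)


def save_report(storage, body):
    """Application code: identical regardless of the backend behind it."""
    storage.write("reports/latest.txt", body)
    return storage.read("reports/latest.txt")


user_configs = [
    {"scheme": "memory"},
    {"scheme": "fs", "root": tempfile.mkdtemp()},
]
results = [save_report(open_storage(cfg), b"ok") for cfg in user_configs]
print(results)
```

Supporting another storage service then means registering one more backend, with no change to `save_report` or to any other application code.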
Looking at the kind of APIs they are supporting, this appears to be aiming at filesystem-like behavior. If I were building such a thing, I would leave SQL out of scope as well.