
Announcing flyio, an R package to interact with data in the cloud - akashtndn
https://blog.socialcops.com/inside-sc/announcements/flyio-r-package-interact-data-cloud/
======
carljv
The R ecosystem around cloud services and their APIs still seems immature, so
it’s great to see folks working on packages in the space.

I’m not 100% sure if this is providing any new functionality not provided by
existing cloudyr projects or is just wrapping them in a new API. I think
either is fine, but it would help to better understand why you’d want to use
flyio vs., say, aws.s3 or the like.

Also, there are some aspects of the API that make me a little itchy. If I’m
reading the examples correctly, it seems like flyio_set_datasource sets a
global variable and then there are generic functions like list_files that do
different things based on that global state?

That seems risky to me, and a more idiomatic approach to this would be to have
a function that returns a handle object representing a Google Cloud or AWS
service, then have generic functions take that handle and dispatch to
appropriate methods.
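
As a rough sketch of that handle-plus-generic shape (written in Python rather than R, with functools.singledispatch standing in for R's S3 generics; S3Source, GCSSource, and the stubbed list_files methods are hypothetical names for illustration, not flyio's or cloudyr's actual API):

```python
from dataclasses import dataclass
from functools import singledispatch


@dataclass(frozen=True)
class S3Source:
    """Hypothetical handle representing an AWS S3 bucket."""
    bucket: str


@dataclass(frozen=True)
class GCSSource:
    """Hypothetical handle representing a Google Cloud Storage bucket."""
    bucket: str


@singledispatch
def list_files(source, prefix=""):
    """Generic function: the method is chosen by the type of the handle."""
    raise TypeError(f"unsupported data source: {type(source).__name__}")


@list_files.register
def _(source: S3Source, prefix=""):
    # A real method would call the S3 API here; this stub only builds a URI.
    return [f"s3://{source.bucket}/{prefix}"]


@list_files.register
def _(source: GCSSource, prefix=""):
    # Likewise a stub: a real method would call the Cloud Storage API.
    return [f"gs://{source.bucket}/{prefix}"]
```

Two handles can then coexist in one session, e.g. list_files(S3Source("raw"), "2018/") next to list_files(GCSSource("raw"), "2018/"), with no global setting deciding which backend gets used.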

Even then, explicit namespacing (pkg::fn) isn’t widely used in R, and I worry
that really plain function names like list_files or export_file are likely to
get clobbered by other packages using names like that. For packages like readr
that are intended to actually replace large swaths of IO functions, that’s
fine. But I’m not sure it makes sense for a more specialized package like
this.

Despite that, I do appreciate you all creating and open sourcing this. Like I
mentioned, any work on cloud packages is welcome from my perspective!
Interested to see how this develops.

~~~
renthu
Isn't the main USP of flyio the cloud-agnostic part? You can switch between
local, Google, and Amazon storage without changing the code.

------
ggm
A dplyr abstraction for Elasticsearch I tried to use this year turned out to
be orphanware. I'm back to making HTTP requests directly, posting a canned
JSON blob.

If this one works, I'd love some positive feedback signals from users, because
I don't want to build hopes, or code, to an interface spec which turns out to
wither and die.

------
gaius
The name already seems to be used by a Node.js module.

There’s a bigrquery package on CRAN by Hadley Wickham that seems to have
already covered this use case.

~~~
renthu
bigrquery is for a completely different use; it's a wrapper for Google
BigQuery. This seems to be a cloud-agnostic read and write layer using
AWS/GCP. cloudyr packages seem to be the closest.

------
sidpuri
Congratulations, SocialCops team!

