Show HN: Google Drive as a file system (github.com)
374 points by harababurel 7 months ago | 83 comments



Hello everyone!

I have built a tool that allows Linux and macOS users to mount their Google Drive account locally as a virtual file system. The file system supports most typical operations (creating/deleting/moving/renaming files and directories, reading/writing to them).

It can also export special Drive files as OpenOffice documents and has an in-memory cache which improves the speed of navigation and file access. Changes performed on other clients (e.g. on the web or mobile interface) are usually detected shortly and applied locally as well.

I also wrote a paper[0] on it, as I am using the project for my bachelor thesis.

It is still rough around the edges and lacks some functionality, but for the moment it is good enough for my personal use. I am looking forward to hearing your comments and feedback on how to improve it.

[0] https://sergiu.ml/~sergiu/thesis.pdf


As an undergrad thesis? Wow, you are amazing! I'm wondering how it differs from the existing Google Drive FUSE project? [1]

[1]: https://github.com/astrada/google-drive-ocamlfuse


Thank you! I have covered this question in a different comment:

> In short, GCSF tends to be faster in several cases (listing files recursively, reading large files from Drive). The caching strategy it uses also leads to very fast reads (x4-7 improvement compared to google-drive-ocamlfuse) for files that have been cached, at the cost of using more RAM.

> On the other hand, I arrived at the conclusion that google-drive-ocamlfuse provides a better overall experience, as it already has an active community behind it. My goal with GCSF is to more or less close the gap between the two projects and reach the same level of functionality.


I did one as well. It's a no brainer: a semester to do whatever you want as long as you get someone to agree to mark it.


It was required for us, one semester of “independent study” where you must select a research adviser (professor) and produce a final report. I did mine on “TorCoin” and it was an awesome experience, the only time I legitimately enjoyed school.


Definitely not the only time I enjoyed school, but definitely my favourite part.


Awesome project!

I'm curious, how does it compare to Google's own similar product, Backup and Sync[0]?

It seems like the primary difference (other than being FUSE-based) is that, for data where Google has a collaborative interface (Docs, Sheets, etc.), Backup and Sync will place into the folder a "file" which is simply a link to open up the Docs/Sheets/Slides app. Your app, on the other hand, seems to translate this data into OpenOffice format when changes are detected (i.e. .odt for Docs).

At first glance this strategy seems less good for real-time collaboration and less performant, but there may be advantages to it as well.

Do you find that strategy to be practical in most cases? Are there other features that distinguish this from Backup and Sync?

EDIT: Aha, looking into it more, it seems like mounting drive folder with this project doesn't trigger the downloading of any data, and rather lazily loads the data when the file system requests it, which would be a very significant difference. Is that right?

[0] https://www.google.com/drive/download/backup-and-sync/


Thank you for your response! You are correct in saying that GCSF doesn't download any data upfront. It constructs the file tree at mount time using only file metadata and downloads the actual file content only when it encounters a `read` call. This is an advantage if you're running low on local space (the file system essentially adds 15 GB of "free" additional storage).
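The lazy-loading behavior described above can be illustrated with a small, self-contained sketch. This is not GCSF's actual code: `Entry`, `FakeRemote`, and `LazyFs` are hypothetical names, and the remote is an in-memory stand-in rather than the Drive API. It only shows the idea that the tree is built from metadata at mount time and content is fetched (then cached in memory) on the first `read`.

```rust
use std::collections::HashMap;

/// Metadata for one file, known at mount time (simplified, hypothetical).
struct Entry {
    name: String,
    size: u64,
}

/// In-memory stand-in for the remote backend. `downloads` counts content fetches.
struct FakeRemote {
    blobs: HashMap<u64, Vec<u8>>,
    downloads: usize,
}

impl FakeRemote {
    /// Simulates downloading the full content of one file.
    fn download(&mut self, id: u64) -> Option<Vec<u8>> {
        self.downloads += 1;
        self.blobs.get(&id).cloned()
    }
}

struct LazyFs {
    tree: HashMap<u64, Entry>,    // built from metadata only
    cache: HashMap<u64, Vec<u8>>, // content, filled on first read
    remote: FakeRemote,
}

impl LazyFs {
    /// Builds the file tree without fetching any content.
    /// (The fake peeks at blob lengths only to invent metadata.)
    fn mount(remote: FakeRemote) -> Self {
        let tree: HashMap<u64, Entry> = remote
            .blobs
            .iter()
            .map(|(id, data)| {
                let entry = Entry { name: format!("file-{}", id), size: data.len() as u64 };
                (*id, entry)
            })
            .collect();
        LazyFs { tree, cache: HashMap::new(), remote }
    }

    /// Lists names and sizes using metadata alone.
    fn list(&self) -> Vec<(String, u64)> {
        let mut out: Vec<_> = self.tree.values().map(|e| (e.name.clone(), e.size)).collect();
        out.sort();
        out
    }

    /// Downloads the file on the first read, then serves later reads from the cache.
    fn read(&mut self, id: u64, offset: usize, len: usize) -> Option<Vec<u8>> {
        if !self.cache.contains_key(&id) {
            let data = self.remote.download(id)?;
            self.cache.insert(id, data);
        }
        let data = self.cache.get(&id)?;
        let start = offset.min(data.len());
        let end = offset.saturating_add(len).min(data.len());
        Some(data[start..end].to_vec())
    }
}

fn main() {
    let mut blobs = HashMap::new();
    blobs.insert(1, b"hello drive".to_vec());
    let mut fs = LazyFs::mount(FakeRemote { blobs, downloads: 0 });

    println!("listing: {:?}", fs.list());
    println!("downloads after mount: {}", fs.remote.downloads);

    let first = fs.read(1, 0, 5).unwrap();
    let second = fs.read(1, 6, 5).unwrap();
    println!("{} {}", String::from_utf8_lossy(&first), String::from_utf8_lossy(&second));
    println!("downloads after two reads: {}", fs.remote.downloads);
}
```

The trade-off is the one mentioned elsewhere in the thread: cached reads are fast, but the cache lives in RAM.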

Real time collaboration is indeed a shortcoming. I would still use the online interface of docs/sheets/slides for this purpose.

I haven't personally used Backup and Sync, as there is no Linux version of it. From what I gather, it seems that it uploads local files to a new category on Drive instead of the 'My Drive' directory. This can be useful for automatic backup. You simply set it up once and forget about it.

However, GCSF might be a better choice for the additional control it provides. Whereas with Backup and Sync you have to inspect a file manually in order to check whether it was synced or not, GCSF ensures that a pending write operation will only return once the file transfer is effectively complete. For instance, when copying a file to Drive, the execution of the command will take as long as the upload process itself. Once finished, an exit status of 0 will indicate precisely that the upload was successful and the file is certainly on Drive.

I imagine this sort of strategy is a better fit for use cases which require high confidence and predictable behavior.
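As a rough illustration of the "exit status 0 means the copy finished" idea, here is a minimal sketch using only the Rust standard library. `copy_blocking` and `exit_code` are hypothetical helper names, and the file names are made up. The sketch shows only the caller's side: `std::fs::copy` returns after the file system has accepted every write, and whether that also means the upload has finished depends on the file system (the comment above says GCSF behaves this way).

```rust
use std::env;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Copies `src` to `dst` and returns only after all writes have returned.
/// On a file system whose writes block until the upload completes,
/// `Ok` would therefore imply that the data reached the remote side.
fn copy_blocking(src: &Path, dst: &Path) -> io::Result<u64> {
    fs::copy(src, dst)
}

/// Maps the result to a process exit status, as `cp` would: 0 on success.
fn exit_code(result: &io::Result<u64>) -> i32 {
    if result.is_ok() {
        0
    } else {
        1
    }
}

fn main() {
    // Destination directory: pass a mount point as the first argument
    // (for example a hypothetical ~/gcsf); defaults to the temp directory.
    let dst_dir = env::args().nth(1).map(PathBuf::from).unwrap_or_else(env::temp_dir);

    let src = env::temp_dir().join("gcsf_sketch_src.txt");
    fs::write(&src, b"example payload").expect("could not create source file");

    let result = copy_blocking(&src, &dst_dir.join("gcsf_sketch_dst.txt"));
    match &result {
        Ok(n) => println!("copied {} bytes", n),
        Err(e) => eprintln!("copy failed: {}", e),
    }
    std::process::exit(exit_code(&result));
}
```

A script can then branch on the exit status alone instead of inspecting the file afterwards.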


> For instance, when copying a file to Drive, the execution of the command will take as long as the upload process itself. Once finished, an exit status of 0 will indicate precisely that the upload was successful and the file is certainly on Drive.

That is awesome. Very clever!


Isn't this the same as Google File Stream?


This is great Linux support finally! And in rust! Awesome work man!


Thank you!


This is neat! Can I use this for multiple accounts? I've been looking for a MacOS and Linux solution for this exact problem but I'd like to use it for 3-4 Google accounts.

Another question related to implementation. How easy is it to use a language like Rust for some web connection stuff like what is being used here? I've never used the language, but I've always been interested in it.


> This is neat! Can I use this for multiple accounts? I've been looking for a MacOS and Linux solution for this exact problem but I'd like to use it for 3-4 Google accounts.

The current release allows you to mount a single account for each local user. For the moment, you could work around this limitation by creating another user on your machine and running a separate instance of GCSF as the new user. I created an issue [0] and will work on adding support for multiple accounts in a future release.

> How easy is it to use a language like Rust for some web connection stuff like what is being used here? I've never used the language, but I've always been interested in it.

You can make HTTP requests relatively easily with the help of hyper [1]. In the case of this project, I was lucky to have some useful libraries readily available: yup-oauth2 [2], google-drive3 [3]. I would place Rust somewhere below Python in terms of existing tools and support (for instance, Google doesn't provide any official client libraries for Rust, but it does for Python), but making relatively simple applications is completely achievable (and fun as well) in Rust.

[0] https://github.com/harababurel/gcsf/issues/10

[1] https://hyper.rs/

[2] https://github.com/dermesser/yup-oauth2

[3] https://crates.io/crates/google-drive3


This looks very useful! We've been using Syncdocs [0] to sync multiple accounts concurrently to Google Drive.

Syncdocs does end-to-end encryption. Would it be possible to put in a PGP layer that automatically encrypts into your solution?

[0] https://syncdocs.com


This looks more like a contender to rclone, doesn't it?


Guessing you're basically building a FUSE filesystem with a Google Drive backend? I know KeyBase.IO did something similar across platforms. Not sure if that's how you did it but it might interest you to check out FUSE too.


Cool project. Just FYI: footnote 68, which is supposed to have the Github link, has a link to something else instead.


Thank you for pointing this out. The link should now be fixed.


Are multiple accounts supported?


Not at the present moment. I plan to add support for this in the near future.


What about Google Drive File Stream?

https://support.google.com/drive/answer/7329379?hl=en


Well one limitation of Google File Stream is you can only use it with a work or school account not a personal account.


It also does not support Linux afaik!


It also tends to crash a lot with large directories.


No support for Linux.


I use this at work and it’s very good.


Same here. I wish they offered it for personal accounts. I would happily start paying for Drive to use it.


I’m not sure if it’s what you’re looking for, but Backup and Sync is a similar Drive sync utility made by google for personal accounts. The only feature that it doesn’t have is file-level sync settings (afaik B&S only has folder level sync options)


Backup and Sync lacks the coolest feature of File Stream. File Stream allows you to download files when you access them instead of keeping them all on your PC. For someone like me with lots of photos, a smallish SSD, and a fast internet connection it's very convenient.


Saves a huge amount of machine resources, too. Some of the folders we have to share are gigabytes in size.


not sure what you mean here because i pay for Drive (via Business G Suite, single user, unlimited storage) and use Filestream just fine.


Yes, you can get it with a G Suite account, but regular @gmail.com users cannot.


Interesting, I use it at work and have nothing but problems with it. What sort of file count do you have on Drive? I suspect ours is beyond the testing scope of DriveFS and as such performance is hideous and it often freezes up computers for minutes at a time.


Very cool!

I ship a similar product that has included Google Drive integration (and many other back ends) since 2013 - https://www.expandrive.com - happy to answer any questions too!


I remember testing Expandrive out a few years ago. Great to hear you're still going strong and expanding functionality to cloud storage services too.

Haven't revisited the program since we originally tested it and found a major problem for our application, but can Expandrive now handle symlinks over sshfs correctly or does it still silently mangle them into regular files?


When is Linux support coming?


In beta!


Ever get the random disconnects fixed? Haven’t used it since I sent in those log files.


Very cool, I think I will try using this as my standard google drive solution on linux.

One thing I haven't found in your paper is how the software handles conflicts. Suppose I have two or more machines hooked up to the same account, all simultaneously modifying the same file in different ways. What's going to happen? I guess this would be mostly up to the server side and out of your control, but maybe you can point me to a specification on how such issues are handled?


Some unwanted behavior might occur in scenarios like the one you describe. Most probably, the change performed by one client will silently overwrite the other. If there is however a small gap between the operations, the earlier one will have a better chance of being picked up by Drive and detected by the other file system instance. In this lucky case there might be no data loss.

I would set the `sync_interval` configuration parameter to a low value to improve the chances of detecting changes as soon as they appear, but I would also try to make sure that only one client works on a certain file/directory at one time.

This case looks like a good area for future improvement. Thank you for addressing the issue!
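A toy model can make the overwrite scenario concrete. It is a deliberately simplified sketch, not GCSF's or Drive's actual logic: `Remote`, `Client`, and `run` are hypothetical names, uploads blindly replace the remote content (last writer wins), and a "poll" stands in for what a short `sync_interval` makes more likely to happen between two edits.

```rust
/// Remote copy of one file plus a version counter bumped by every upload.
#[derive(Default)]
struct Remote {
    content: String,
    version: u64,
}

/// One mounted client with its local view of the file.
#[derive(Default)]
struct Client {
    local: String,
    seen_version: u64,
}

impl Client {
    /// Polls the remote and adopts its content if it changed since the last poll.
    fn poll(&mut self, remote: &Remote) {
        if remote.version > self.seen_version {
            self.local = remote.content.clone();
            self.seen_version = remote.version;
        }
    }

    /// Local edit: appends text to the client's view.
    fn append(&mut self, text: &str) {
        self.local.push_str(text);
    }

    /// Uploads without checking the remote version: last writer wins.
    fn upload(&mut self, remote: &mut Remote) {
        remote.content = self.local.clone();
        remote.version += 1;
        self.seen_version = remote.version;
    }
}

/// Client A writes first, then client B. If B polls in between, it sees
/// A's change before editing; otherwise B silently overwrites it.
fn run(poll_between_edits: bool) -> String {
    let mut remote = Remote::default();
    let mut a = Client::default();
    let mut b = Client::default();

    a.append("A");
    a.upload(&mut remote);

    if poll_between_edits {
        b.poll(&remote);
    }
    b.append("B");
    b.upload(&mut remote);

    remote.content
}

fn main() {
    println!("no poll between edits: {:?}", run(false)); // prints "B": A's change is lost
    println!("poll between edits:    {:?}", run(true)); // prints "AB": both changes survive
}
```

In this model a shorter polling interval only narrows the window in which an overwrite can happen; it does not remove it, which is why keeping a single writer per file remains the safer habit.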


Congrats on shipping! How does this compare to https://github.com/astrada/google-drive-ocamlfuse

aside from the programming language?


Thank you! I think google-drive-ocamlfuse is an excellent product. It is clearly more mature and has more features than GCSF.

I made a comparison between the two projects in sections 4.2 and 4.3 of my thesis [0]. In short, GCSF tends to be faster in several cases (listing files recursively, reading large files from Drive). The caching strategy it uses also leads to very fast reads (x4-7 improvement compared to google-drive-ocamlfuse) for files that have been cached, at the cost of using more RAM.

On the other hand, I arrived at the conclusion that google-drive-ocamlfuse provides a better overall experience, as it already has an active community behind it. My goal with GCSF is to more or less close the gap between the two projects and reach the same level of functionality.

[0] https://sergiu.ml/~sergiu/thesis.pdf


Any comparisons to rclone's gdrive integration? Seems that rclone probably has the biggest userbase of the "google drive fuse mount" tools.


Ditto. I'm using rclone for this same function. I'm using it for casual file access (not lots of intensive access every day all day) but I'd like to know if there is any benefit to this over rclone, since as mentioned rclone has a pretty big user base.


I haven't personally used rclone so far. I will look into it and see how it compares to GCSF.


Fuse has been around for a while, in fact MacFuse was an implementation that was open sourced by Google, although no longer workable given the advances of the MacOS. There is now OSXFuse which is used by a few commercial applications including Transmit and the Storage Made Easy Cloud Service which uses it to support multiple backends including Google Drive and Google Storage.

On Windows there are Fuse implementations but they are not as rock solid as OSXFuse on Mac. The best commercial Windows FUSE implementation I'm aware of is CallbackFS.


I'm getting an error when compiling (`cargo build`), both on stable and nightly Rust:

    error: failed to run custom build command for `fuse v0.3.1`
    process didn't exit successfully: `/home/bromskloss/code/gcsf/target/debug/build/fuse-46607682b28e6d4c/build-script-build` (exit code: 101)
    --- stderr
    thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Failure { command: "\"pkg-config\" \"--libs\" \"--cflags\" \"fuse >= 2.6.0\"", output: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "Package fuse was not found in the pkg-config search path.\nPerhaps you should add the directory containing `fuse.pc\'\nto the PKG_CONFIG_PATH environment variable\nNo package \'fuse\' found\n" } }', libcore/result.rs:945:5
    note: Run with `RUST_BACKTRACE=1` for a backtrace.
Am I doing it wrong?


See this issue[0]. tl;dr: make sure you have the libfuse-dev package installed.

[0] https://github.com/harababurel/gcsf/issues/9


It's always good to see services of big companies being turned into highly useful but faceless commodities!


There's a similar tool someone made for Standard Notes to mount your notes as a filesystem:

https://github.com/tannercollin/standardnotes-fs


Now that is helpful for me migrating to Standard Notes!


Hey, thanks for the awesome work! I always wanted something like "OneDrive on demand"[0] but for Linux. Maybe in the future you could add an option to keep some files offline. By doing that, Linux would have a real alternative to OneDrive.

[0] https://support.office.com/en-us/article/learn-about-onedriv...


Back in ~2005 someone did this with Gmail.

https://en.wikipedia.org/wiki/GmailFS

It did not turn out very well, though.


I don't get it. Isn't this the same as what is implemented natively in GNOME, or with Backup and Sync on Windows and macOS?


Not sure what you are referring to as implemented natively in GNOME.

One crucial difference is the fact that Backup and Sync picks up local files (which exist physically on the user's machine) and uploads them to a special Drive directory in the background. GCSF does not store anything locally unless you tell it to. It simply creates a virtual directory and reports its content and file tree structure so that it matches whatever exists on Drive.


I think GP is referring to the Gnome Virtual File System (GVfs) [1], which has a number of pluggable backends, one of which being Google Drive. I've used it lightly and it seemed to "just work."

[1] https://wiki.gnome.org/Projects/gvfs


Yeah probably he was talking about that. But for my Mate, GVfs is not an option :(


Stupid question[0]: What do the initials GCSF stand for?

[0] Yes, I know there are no stupid questions, only inquisitive idiots.


The question is not stupid at all, but the answer is :). GCSF stands for "Google Conduce Sistem de Fișiere" -- a (bad) word-by-word Romanian translation of "Google Drive File System".


And I see that you've now added the explanation to the README:

https://github.com/harababurel/gcsf/commit/3d5ba8ffff2a18bf0...



Although this is fairly nice, I would recommend Syncthing over this. It has the benefit of not relying on any third party to store your data, it's all exclusively on your devices, along with some very solid security.


Syncthing is pretty much a completely different thing.

I use Syncthing a bunch, but I don't really see any overlap between it and this tool.


From my understanding, the primary purpose of this is for backup and syncing your Google Drive files between multiple devices, which is very similar in nature to Syncthing. Is there something else that I am missing?


Could you please rewrite it in Ru… Oh, it _is_ written in Rust! :-)


Ain't FUSE a really wonderful thing?

Edit: "fuse" to uppercase.


What about Windows support?


Paid but actively developed with responsive developers: https://mountainduck.io/


Just in case and FYI, a Dokan based driver for Windows:

https://github.com/viciousviper/DokanCloudFS

I guess that using the Dokany to port this new driver to Windows should be relatively easy:

https://dokan-dev.github.io/


Should be achievable using winfsp[0]. I will look into it for a future release.

[0] https://github.com/billziss-gh/winfsp


The drive app will sync a folder in your home directory if you are a busy person like me and need a one click 90% solution.

I once synced my whole home directory but that turned out to be a bad idea because random applications put their junk there.


I use it from GnomeShell 3.28, it works pretty solid


How does this differ from Google's own Filestore?

https://cloud.google.com/filestore/


Filestore seems NFS-based and can't access Google Drive, just its own GCP volumes.


is this drive same as the drive\directory that you get when you install "OneDrive"?


I haven't used OneDrive, but I think it is similar to the Backup and Sync app for Google Drive, which targets a different purpose than this project.


How does this compare to `rclone`?


Seconded, there are widely deployed alternatives, specifically rclone (mount) and plexdrive. The former is, for backup purposes, better used via its CLI, mostly due to how it relays IO errors up and the fact that many applications are too dumb to retry after filesystem errors, or even crash completely.


Does this support Team Drives?


Not at the moment. I would like to add this feature in an upcoming version.


How far away are we from making the 'Download More RAM' meme redundant, I wonder?

Excellent work harababurel :)

Edit: words


Why do you trust Google enough to do this?



