
Or you could, you know, clone the repository into a local working tree.


Have you ever read this HN comment from 2007 before? [0]

> I have a few qualms with this app...

Different paths towards the same outcome should multiply, so we can increase the surface area of the bandages on the different pain points along the way.

[0] https://news.ycombinator.com/item?id=9224


Snarky drive-by comments like this are the worst part of HN.


That's why lobste.rs was created.


Any chance you have an invite and are feeling generous?


That works well for a small repo, or a few repos, but if you want to find all .cc files, on all release branches, across your entire company and check them for some exploit, it is helpful to have a VFS. It also means you could support N SCMs through one API: you just need to write a new VFS backend.
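A stdlib-only sketch of that kind of company-wide scan, assuming the VFS is mounted at some local path (the mount layout and the pattern being searched for are hypothetical, not from any particular tool):

```python
from pathlib import Path

def scan_for_pattern(mount_point, pattern, suffix=".cc"):
    """Grep every *.cc file under a mounted VFS (all repos, all release
    branches the mount exposes) for a suspect pattern, without cloning."""
    hits = []
    for path in Path(mount_point).rglob(f"*{suffix}"):
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue  # a virtual file may fail to materialize; skip it
        if pattern in text:
            hits.append(str(path))
    return sorted(hits)
```

The scanner itself never knows which SCM backs the mount; that's the point of putting the abstraction at the filesystem layer.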


Isn't there already a good way to push computation closer to the data?

GmailFS, pyfilesystem (a Python filesystem abstraction layer), and rclone are neat as well.

https://stackoverflow.com/questions/1960799/how-to-use-git-a... explains the `git push` workflow that git-remote-dropbox enables: https://github.com/anishathalye/git-remote-dropbox


GitHub also has a code search now: https://cs.github.com


Needing to tie into a specific API (like code search) couples you to a specific storage backend (GitHub). If you build your software to operate on a POSIX-y file system, you can support anything that shows up as a file system: a local working tree, an NFS share, or now a remote git repository.
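A minimal stdlib-only illustration of that portability argument: this function mentions no backend at all, so it runs unchanged over a local checkout, an NFS mount, or a FUSE-mounted remote repo (whatever happens to be at `root`):

```python
import os

def tree_size(root):
    """Sum the size of every regular file under root using only
    generic directory traversal; the storage backend is irrelevant."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                pass  # broken symlink, permission error, etc.
    return total
```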


Running the code where the data already is saves network transfer: with data locality, you don't need to download each file before grepping.

The "Matrix multiplication" section of Wikipedia's locality-of-reference article explains how the cache-miss penalty motivates optimizations like loop reordering: https://en.wikipedia.org/wiki/Locality_of_reference#Matrix_m...
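The loop-reordering idea from that article can be sketched in plain Python: swapping the two inner loops makes both operands stream through memory row by row instead of walking `B` column-wise. (In CPython the speedup is muted by object boxing, but the access-pattern difference is exactly the one the article describes.)

```python
def matmul_ijk(A, B, n):
    """Naive order: the inner loop reads B[k][j] down a column of B,
    touching a different row object on every iteration (poor locality)."""
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(n):
                s += A[i][k] * B[k][j]
            C[i][j] = s
    return C

def matmul_ikj(A, B, n):
    """Reordered: the inner loop scans one row of B and one row of C
    sequentially, so accesses are contiguous (good locality)."""
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        row_c = C[i]
        for k in range(n):
            a_ik = A[i][k]
            row_b = B[k]
            for j in range(n):
                row_c[j] += a_ik * row_b[j]
    return C
```

Both orders compute the same product; only the memory-access pattern differs.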


Stop for a moment and consider what the tradeoffs of that could be and why it might not work well in some situations.


- Higher latency

- Less efficient use of bandwidth, as the git protocol is optimised for bulk transfers

- Not resilient against unreliable connectivity

- No support for repositories not hosted on GitHub

Yes, I can think of some.



