Backyourstack: discover and sponsor your open-source dependencies (backyourstack.com)
277 points by vvoyer 6 months ago | 61 comments

> If you want to analyze non-public repositories, sign in with your GitHub account

Do people really expose their or their employer's source code to random third party convenience services?

I do understand the convenience factor here, I just think it's dodgy to encourage developers to be so flippant with privileged access.

It also has the option to upload a package.json which is far less exposure, and can easily be tweaked to omit anything sensitive.

It's still not no-exposure, and it's still getting a developer in the mindset that just sending out files on a whim is okay.

I'm not wholly against this sort of stuff, I'm sure we've used similar links in the past for CI and coverage, but it seems to be the end of the slope, where we're handing out access to our stuff for something so frivolous. This is the same sort of mechanism that got everybody and their dog's copies of Windows XP infected with trojans in the early 2000s. "Sure, I'll install that toolbar, just let me see Britney naked".

This could be a local, auditable script that fetched a static list of projects seeking funding.
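A minimal sketch of what such a local script might look like, assuming the list of projects seeking funding is a static file you have already downloaded and inspected. The names in `SEEKING_FUNDING` are placeholders for illustration, not a real list, and both function names are hypothetical.

```python
import json

# Placeholder stand-in for a static, locally stored list of projects
# seeking funding. These names are illustrative assumptions only.
SEEKING_FUNDING = {"webpack", "babel-core", "mocha"}


def direct_dependencies(package_json_text):
    """Return the set of direct dependency names declared in a package.json."""
    manifest = json.loads(package_json_text)
    names = set()
    for section in ("dependencies", "devDependencies"):
        names.update(manifest.get(section, {}))
    return names


def fundable(package_json_text, seeking=SEEKING_FUNDING):
    """Return the sorted dependencies that also appear in the funding list."""
    return sorted(direct_dependencies(package_json_text) & set(seeking))


example = (
    '{"dependencies": {"webpack": "^4.0.0", "left-pad": "1.3.0"},'
    ' "devDependencies": {"mocha": "^5.2.0"}}'
)
print(fundable(example))  # ['mocha', 'webpack']
```

Nothing here leaves the machine: the manifest is parsed locally and only compared against a list you can read line by line.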

To be fair, I don’t think this is intentionally bad privacy-wise. I think that, uh, a lot of the JavaScript set doesn’t actually know how to write code that doesn’t involve uploading shit to websites. When all you have is a hammer...

That's a bit too much. Shouldn't even be an option.

Interesting project, would be great if this supported more than just JS projects, I'd be very interested to see some of the dependencies that my current business relies upon.

I'm working with a pre-launch company that's building a tool to map companies' stacks (among other things). They've shared some of their research with me and it's insane.

Large software companies have no idea what dependencies, libraries, or even languages are in their stacks. Companies are using multiple versions of the same library, replicated in a bunch of different repos. Different teams in the same company end up re-implementing solutions over and over, sometimes in a whole other framework than the team next to them because nobody knows what anyone else is using. Makes compliance a nightmare.

It's wild out there.

It’s even more wild when you consider that these companies are relying on all of these dependencies to be secure, and to lack intentional backdoors.

Kevin from https://fossa.io/ here.

We have an open source project just for this: https://github.com/fossas/fossa-cli. It currently supports roughly 20+ build systems and languages, and pairs with our web service for license and vulnerability discovery.

Would love your feedback.

You claim this is open-source, however I don't see an easy way to run this myself without relying on your infrastructure and signing up for an account there. Your "analyse locally" option requires an API key and is therefore very much misnamed.

Wait, can you do this as a one-time event without a paid subscription to your service?

They're working on that, contributions for C++ and Ruby have apparently already been received: https://medium.com/open-collective/open-collective-august-20...

Based on the fact that you can also upload composer.json files, I assume it also supports PHP.

You mean you do not already know your software dependencies? A bit careless for running a business, if you ask me.

Consider how many layers of dependencies are in use today. Also, you have no idea who the commenter is or what position he has in the business you assume he is 'running'.

I'd probably be best described as some combination of a technical manager and a businessman. I write some code but there's people I work with who are much better technologists than I am.

A few years back I wrote a cross-platform make replacement because an existing recursive make solution was getting dependencies wrong by building the wrong DAG (http://lcgapp.cern.ch/project/architecture/recursive_make.pd...). The recursive solution was used in the first place because it was fast but it happened to be incorrect (interesting how stuff like Pipenv has really long lock times because it's prioritizing correctness). So I have an awareness of some of the difficulties in this space very directly. Before I did some projects with build systems and packaging there were a lot of things I didn't know I didn't know.

If they're running that business, they are either paying people who can tell them the dependencies (if not, those people are being paid too much), or they are responsible for the code themselves (small co. or early startup), and should already know, or with small effort be able to identify all the dependencies whenever necessary. Anything else would be irresponsible.

If they're not running that business and have enough access to the code to be able to use this service, they're probably a developer who again should be able to know about the dependencies.

Or perhaps they're in a different position, and only want to know about this for curiosity's sake - in which case, I guess this could be useful. But still, you should be able to ask about this in-house.

And as for the "layers" argument - if there are too many layers of dependencies to keep track of, something is very, very wrong with the technology you are using. (And yes, I do consider modern web tech completely insane.)

I am currently helping to run a kitchen at a restaurant in my spare time, and I can tell you that nobody, including myself, can tell you where our onions are grown, only that they come from Costco.

There are grocery stores which specialize in being able to tell the customer exactly where their produce comes from. They've been quite in vogue these past years. :)

Interestingly enough, it's debatable how useful this info is, sort of like how it's debatable how useful it is to know every single dependency of your project :)

>And as for the "layers" argument - if there are too many layers of dependencies to keep track of, something is very, very wrong with the technology you are using. (And yes, I do consider modern web tech completely insane.)

So you were basically being glib about the salaries, since you know a lot of us are in that position? What do you suggest we do?

When I list the dependency tree of our project at work I get ~5500 unique packages (many are different versions of the same ones). Does the fact that I don't know them by heart mean that I'm being paid too much?
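For a sense of how a count like that might be produced, here is a rough sketch that walks a lockfile and groups versions by package name. It assumes the nested `dependencies` layout of npm's `package-lock.json` (lockfileVersion 1); other layouts would need a different walker. The function names and the sample data are made up for illustration.

```python
from collections import defaultdict


def collect_versions(lock):
    """Map each package name to the set of versions found in the lockfile.

    Assumes the nested "dependencies" layout (npm lockfileVersion 1).
    """
    versions = defaultdict(set)
    stack = [lock.get("dependencies", {})]
    while stack:
        deps = stack.pop()
        for name, info in deps.items():
            versions[name].add(info.get("version", "unknown"))
            # Nested entries hold versions that could not be hoisted.
            stack.append(info.get("dependencies", {}))
    return versions


def summarize(lock):
    """Return (name@version count, distinct name count, multi-version names)."""
    versions = collect_versions(lock)
    duplicated = {n: sorted(v) for n, v in versions.items() if len(v) > 1}
    total = sum(len(v) for v in versions.values())
    return total, len(versions), duplicated


# Made-up sample: "b" appears at two different versions.
example_lock = {
    "dependencies": {
        "a": {"version": "1.0.0", "dependencies": {"b": {"version": "2.0.0"}}},
        "b": {"version": "1.5.0"},
    }
}
print(summarize(example_lock))  # (3, 2, {'b': ['1.5.0', '2.0.0']})
```

The gap between the first two numbers is the "different versions of the same ones" part of the count.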

No, if anything, I think anyone working in your field is not being paid enough for working with something as bonkers crazy. :)

But seriously, I haven't said that you have to be able to recite the dependencies when woken up at night, just that you should have an existing internal method of keeping track of them and auditing them, and not rely on some comes-one-day-disappears-the-next web service.

I think this is why a lot of people vendor dependencies and keep mirrors of anything particularly mission critical that they depend on. I've been involved in situations where this has been done for closed source code too via escrow services to make sure of continuity of product even if the other business closes for whatever reason.

Consider something like PyPI and the way in which it hosts Python packages. There are times with Python when you will not know what a package's dependencies are until you actually download it (this, you may notice, is why Pipenv takes so long to create a lock file: it has to download a ton of packages; get out your network traffic tools and have a look sometime). So I'm fairly sure there are times when the reason you don't know the dependencies exactly is that they are, in some sense, unknowable. Now you might argue that such systems should be re-implemented and as a generalized goal taken in isolation I would tend to agree (improving existing code including legacy systems is something we consult on in my company). But alas in existing systems you sometimes have to deal with non-ideal circumstances that were created with non-ideal package management tools (legacy Python and C++ being prime examples of difficult areas to get 100% right).
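As a small illustration of the "only after you download it" point: once a distribution is installed, its declared requirements can be read from local metadata with the standard library's `importlib.metadata` (Python 3.8+). This is only a sketch of the after-the-fact view and does not solve the pre-download problem described above; the helper name `declared_requirements` is made up.

```python
from importlib import metadata


def declared_requirements(dist_name):
    """Return requirement strings declared by an installed distribution.

    Returns an empty list when the distribution is not installed or
    declares no requirements.
    """
    try:
        return metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return []


# The result depends entirely on what is installed in the current
# environment, so no expected output is shown here.
print(declared_requirements("requests"))
```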

Which technology are you using that does not have this problem?

I work on the opposite end from the web stack - firmware development - and even we have layers upon layers of dependencies that are problematic to track.

The point is not that the deps are problematic to track - that is to be expected with any larger project. The point is actually doing the tracking, and being aware when the deps change, new ones get added, etc., and being able to determine what it means for the business.
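One way to picture "being aware when the deps change" is a diff between two snapshots of resolved dependencies, for example taken before and after a lockfile update. A minimal sketch, assuming each snapshot is a plain mapping of package name to version; the function name and sample data are hypothetical.

```python
def diff_dependencies(old, new):
    """Compare two {name: version} snapshots.

    Returns (added, removed, changed), where changed maps a name to
    its (old_version, new_version) pair.
    """
    added = {n: new[n] for n in new.keys() - old.keys()}
    removed = {n: old[n] for n in old.keys() - new.keys()}
    changed = {
        n: (old[n], new[n])
        for n in old.keys() & new.keys()
        if old[n] != new[n]
    }
    return added, removed, changed


# Made-up snapshots for illustration.
before = {"left-pad": "1.2.0", "lodash": "4.17.10", "request": "2.87.0"}
after = {"left-pad": "1.3.0", "lodash": "4.17.10", "axios": "0.18.0"}

added, removed, changed = diff_dependencies(before, after)
print("added:", added)
print("removed:", removed)
print("changed:", changed)
```

Run in CI against the committed lockfile, a report like this gives someone a concrete list to review whenever the dependency set moves.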

So what technology are you using that makes that viable?

Any of today's technologies, if you care about having your bases covered. Many people don't, as I gathered (with great surprise, too) from this subthread.

My initial impression of the product was that it was a way to see which packages you depend on are looking for support in terms of funding/support. On a non-technical level this would be really good information for us to know because it would really help us make better open source investments.

On a technical level we do the best we can to get to know all the package dependencies and the various other things that we have to be aware of such as security and license issues (we are aware that not everyone does this). But honestly even when you decide to take this seriously it is still hard. Because of transitive dependencies and long chains of dependencies in the libraries you use, it is hard to have high confidence that you know everything you depend on. Essentially this requires tooling of some form or another to have any chance. With our in-house code we know what our direct dependencies are and we usually can track down most of what we need fairly quickly because we control our build environments. However when we go consult for clients who have large codebases that we have never seen before it takes a while to track everything down. Sometimes if people have customized build systems this can be really hard to do. We have our own static analysis tools to help with this too.
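To make the transitive-dependency point concrete, here is a small sketch that expands direct dependencies into everything reachable from a project. It assumes the graph is already available as a mapping from package name to its direct dependencies, which in practice is the hard part that tooling has to extract. All names are illustrative.

```python
from collections import deque


def transitive_dependencies(graph, root):
    """Return every package reachable from root, excluding root itself.

    graph maps a package name to a list of its direct dependencies.
    Cycles are tolerated because each package is visited at most once.
    """
    seen = set()
    queue = deque(graph.get(root, []))
    while queue:
        pkg = queue.popleft()
        if pkg in seen:
            continue
        seen.add(pkg)
        queue.extend(graph.get(pkg, []))
    seen.discard(root)  # a cycle may lead back to the root
    return seen


# Made-up graph: two direct dependencies expand to four packages.
graph = {
    "app": ["web", "db"],
    "web": ["http", "log"],
    "db": ["log"],
    "log": [],
}
print(sorted(transitive_dependencies(graph, "app")))  # ['db', 'http', 'log', 'web']
```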

Even without all the online package management services it's still a hard problem; consider the case of the humble makefile (http://lcgapp.cern.ch/project/architecture/recursive_make.pd...). Add in dependencies on remote computers and this gets harder. Take for example Python with its huge number of different ways of installing a package: we have some tooling to check things but it's probably not 100% accurate because of the various ways in which Python packaging is broken, exacerbated by the various ways in which people have worked around these shortcomings in the past. Pipenv has helped with the lock files but not everything is using those. The power of good tools for package analysis is clear and we use whatever we can. I hope you will see that this is actually a hard problem in a business sense, as it costs substantial amounts of time for a business to create tooling for these things and the customer is likely unable to assess the benefits of this directly. We have an obligation and a desire to bring value to our customers and this means we will sometimes have to prioritize 100% coverage of package information lower than other objectives if the client demands it (for example fixing mission critical bugs may be a higher priority). In an ideal world we would love to know every aspect of the stack that we run on but as time goes on the increasing complexity of the systems we use makes this harder.

Can we add javascript to the title? It is a bit misleading without it.

Yup, tested it out on a few python projects I have/use and it has 0 dependencies listed which is very, very wrong.

We effectively don't support Python yet but it should not be far away. It's 100% Open Source and we're looking for contributions. https://github.com/opencollective/backyourstack/issues/34

Understandable that you're adding support for languages and packaging ecosystems as you go, and JS + PHP is as good a place as any to start, but it would be helpful if this were explicitly highlighted. For those who are unfamiliar with the ecosystems in question, package.json and composer.json are just filenames that don't actually tell you anything.

It would be helpful, and the site creator should list it. However, let's not treat this as a fault. They're offering you a free service and can't launch with every issue already resolved before being identified.

Re-reading my comment I can see it reads a bit like that - I totally don't intend any notion that I expect everything to be perfect, I'm an open source developer myself and as you say we don't expect every issue to be resolved instantly!

I was probably overly harsh and judgmental in my comment towards you. Glad we are in agreement!!!

composer.json files are PHP.

The entire software world uses js didn't you know...


I've been wondering whether a for-pay alternative to the open-source ecosystem could be developed.

The problem seems to be that open-source gratis software contributes nearly zero friction to a company building out its tech, so any alternative would have to compete against that near-zero friction. I just don't see each company negotiating separate prices with 100,000 package maintainers to use all of their software on a custom linux distro just for one of their internal servers or whatever. It's a tremendous amount of friction for each company to bear.

If that friction could be eliminated, while keeping a requirement to pay for use of the software, then I think a non-gratis ecosystem could dwarf the gratis software world within two or three years from its launch.

How would that be different from the regular Microsoft Ecosystem?

I can spin up a quick testserver using Windows Server, running an IIS webserver and Microsoft SQL Server, with my software stack written in C#, programmed in Visual Studio. It covers pretty much everything I could need, and involves ordering à la carte instead of negotiations.

Obviously it costs more and unless I spin up a cloud server I can't "just spin up a server" without making sure I have enough licences etc. But AWS/GCP/Azure solve that mostly.

You can do pretty much the same with ASP.NET core + Kestrel, too. Substitute Postgres for SQL Server if you want to stay all open source. But even with SQL Server, you can deploy the whole stack on Linux.

My understanding is that Microsoft does not auto-publish+sell third-party libraries in a package repository. So the difference is basically the distinction made in the old cathedral/bazaar analogies.

They do have nuget for packages, but that doesn't have an integrated payment system. Despite this, there are plenty of paid C# packages you can buy around the web.

How about a crowd-funded model (like Kickstarter)?

Or a model based on Patreon?

A similar tool with a focus on license compliance is fossology: https://www.fossology.org/

Great idea! It's nice to see an easy way to support open-source software.

Regarding funding open-source software: Companies I've worked for have all been OK with purchasing licenses for software that saves development time. They've also been careful to abide by software license terms. I'm surprised that more open-source libraries/frameworks don't require the purchase of a commercial license in order to use them commercially.

I think we need less "awareness" and promotion, just more work on people's parts. It seems these projects are asking for a magic bullet to improve their stacks but find it's missing in their own time and efforts.

I tried putting in my github account on the home page just to see what would happen.

I didn't create an account or sign in, but it created a public profile on your domain using my name without my consent.

Is there any way to remove that?

In that case, it's just reading the public data from GitHub on demand, there is no account, nothing stored.

Ah ok, it looked like maybe it was created and cached and then publicly accessible at backyourstack.com/foo since that's the URL it forwards you to.

You might want to anonymize that URL for accounts who don't sign up just for clarity.

This is a great tool, and I'm going to keep tabs on it for future use. I appreciated the package.json upload for my private repos. Kudos to whoever built it!

« 56 repositories depending on 0 Open Source projects. »

Well, that's not true. Maybe you could indicate which package systems you actually are able to analyze?

You're right, it's common feedback; we even have an issue open for that: https://github.com/opencollective/backyourstack/issues/57 We explain what we support in the FAQ but that's not enough. What languages / package managers are you expecting?

Common ones, I suppose! I have projects involving Java, Python, and Clojure. Of those, I would expect the first two to be covered by a generic dependency analyzer.

(It's fine if it doesn't, but it would be nice to know up front.)

The site offers to analyze your composer.json file, but doesn't seem to identify even popular libs like Monolog and Doctrine

Monolog and Doctrine should be properly detected. After that, we need to match detected dependencies with fundraising strategies (today Open Collective, in the future Patreon and others). Maybe this is why they're not appearing in the "Projects Requiring Funding" section. Feel free to submit an issue on GitHub and we can look at that in detail.

This is a really good idea! I assume support for more languages / package managers is coming?

It would be nice if you could paste a package-lock file for private repos.

You can do that! Check out the section below the button.

This is absolutely amazing!!
