It will mean the death of Maven Central, about which I have mixed feelings. On the one hand, Sonatype deserves enormous thanks for what they have done for the open source world, as does mvnrepository.org. Their central repository has been free and maintained for a long time. Thank you, Sonatype.
On the other hand, it took me three days to release a new version of one of my artifacts the other day. The process for doing a Maven deploy is very complex. It took hours to get my private key to work because the key registries were slow. Then the staging server was slow, and kept timing out. Support was responsive, and said they were dealing with a DDoS attack. On top of that, it takes a while for artifacts to show up in the registry even after they have been uploaded. I'm glad that getting that artifact out wasn't an emergency.
This new GitHub service separates the registry from the artifact storage, which is the right way to do it. The registry should be quick to update because it's only a pointer. The artifact storage will be under my control. Credentials and security should be easier to deal with. I really hope this works out.
Publishing to Maven Central comes with a bunch of requirements (https://central.sonatype.org/pages/requirements.html) that may be seen as a burden to packagers, but are certainly a delight for end-users of those packages.
All packages are GPG signed, come with companion source and javadoc artifacts, and are guaranteed a certain amount of other metadata in the POM. There are "easier" repositories (like Bintray jcenter) but anyone who has used something from there that didn't include sources or proper licensing information soon comes to appreciate why it is Maven Central (and not jcenter) that is the center of the Java ecosystem.
Just compare how well organized and neat the packages in Maven Central are to the mess in JCenter. JCenter is full of inconsistent trash. I can imagine people pushing packages there for testing and then just forgetting about them.
Not everyone and not all of them, of course. But while I was looking around to figure out which repository to choose as my main, JCenter has put me off. I still can't understand how people can easily trade convenience for quality.
Unfortunately the GPG signing is worthless because there's no way of attaching trust to each key. So each package has been signed, but anyone could have issued the keys, so an attacker could easily do the same.
Also, not all artifacts have sources and javadoc. Most do but some certainly don't.
So continuing this way GPG is worthless in general. You can't verify most keys in person, so there is no trust whatsoever. In this case Sonatype is verifying the key for you. They will check that your key belongs to you and that you are in control of your organization. Otherwise the package would not be accepted.
I may be wrong but source and javadoc are requirements. Maybe there are some old packages without it, but new ones should be complete.
Maybe they're historic and now it's a requirement, or maybe it's prereleases or something, but I have certainly seen some in the past. First one I can dig up: https://repo1.maven.org/maven2/com/google/protobuf/protobuf-...
The newer versions certainly do have sources and javadoc, but you can't assume their presence for everything.
> So each package has been signed, but anyone could have issued the keys, so an attacker could easily do the same.
Not true. The GPG signature means the key belongs to an account with access to the group id (namespace, usually a domain), and that sonatype has verified the group id belongs to the original admin account for that group id.
It's not a lot of guarantees, but you cannot just generate a GPG key, sign a package, and publish to maven central.
Anyone can issue a GPG key claiming to be whatever identity they want though. I've uploaded artifacts to Maven Central before and they didn't do any specific verification of the signing key - so if it just matches the domain that is no protection at all.
You could for example opt for TOFU (trust on first use). Then at least you’d be protected against a malicious takeover unless the attacker manages to access the maintainer’s private key. That’s been a pretty common issue in the recent past.
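To make the TOFU idea concrete, here is a minimal sketch, not anything Maven or Sonatype actually ships. It assumes a hypothetical local pin store (a plain dict mapping group id to key fingerprint); the names `check_tofu` and `KeyMismatch` are made up for illustration. The first key seen for a group id is pinned, and any later release signed by a different key is rejected.

```python
class KeyMismatch(Exception):
    """Raised when a group id is signed by a key other than the pinned one."""


def check_tofu(pins: dict, group_id: str, fingerprint: str) -> str:
    """Trust-on-first-use check for a signing key (hypothetical helper).

    `pins` is an assumed local store mapping group id -> pinned fingerprint.
    Returns "pinned" the first time a group id is seen and "trusted" when the
    fingerprint matches the pin; raises KeyMismatch otherwise.
    """
    # Normalize so "AAAA 1111" and "aaaa1111" compare equal.
    fingerprint = fingerprint.replace(" ", "").upper()
    pinned = pins.get(group_id)
    if pinned is None:
        # First use: nothing to compare against, so we trust and remember it.
        pins[group_id] = fingerprint
        return "pinned"
    if pinned != fingerprint:
        raise KeyMismatch(
            f"{group_id}: expected key {pinned}, got {fingerprint}"
        )
    return "trusted"
```

This only protects you after the first download, and it does nothing if the attacker has the maintainer's private key, which is the caveat already mentioned above.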
> It will mean the death of Maven Central, about which I have mixed feelings
I don't see it as such. A key reason that this offering from GitHub (and the corresponding one for GitLab) is useful is that it simplifies the enterprise stack - things that will never get posted to Maven or npm or DockerHub in the first place.
From the announcement:
> Packages in GitHub inherit the permissions of the repository, and you no longer need to manage third party solutions and sync team permissions across systems.
This impacts locally hosted Nexus repositories. The artifacts that I build for my team that currently get pushed to internal systems can now live alongside the source code repository.
From the "What our customers are saying":
> GitHub Package Registry has allowed us to spend more time solving hard problems, and improving patient care. Since it uses the same permissions and security as the rest of GitHub, we spend less time managing multiple accounts, ACLs, and on-premise infrastructure, which leaves us with more time to code what matters!
That is exactly where it is useful.
For maven central, I am pleased to have the governance and management of those systems be part of my deployment chain for third party libraries and I will continue to prefer to pull something from Maven Central rather than somewhere else whenever possible.
I think you've been somewhat unlucky. I've been releasing some code on Maven Central for a while, and while the original setup did take some time and was a bit confusing at times, once all keys were in place deploy works with a couple of short steps. It is true the packages do not appear immediately, but I imagine serving this much content requires heavy caching, so I can understand that, and the wait times aren't that outrageous either.
That said, it will be interesting to see what GitHub's effort comes to. It's always better to have alternatives.
They're literally throwing around money creating new coding tools, languages, buying GitHub, LinkedIn, ... if we were to debate the effectiveness of its spending, there would be a lot to talk about.
2.5 billion, with money from overseas on which they otherwise would be heavily taxed. They bought a game studio with significant growth potential at 20x its annual profits, but maybe even more important, they bought 54 million and growing user accounts on a diversity of platforms, of often very young captive users, easily convertible into Microsoft accounts, a number which was poised to expand rapidly.
Indeed, Microsoft has sold and added 100M more Minecraft licenses and accounts since.
From an account acquisition point of view alone, which fairly often is the main driver of these transactions, the deal was a steal. I'd estimate the value of a fresh user account in a desirable demographic to be around $250 for Microsoft. Even the short term projected revenue would be close to $10.
Microsoft projected the deal to pay for itself in 1 year, and while not having followed the case up close, chances are it did.
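Back-of-the-envelope, using only the figures quoted in this comment: the $250 per account and $10 short-term revenue are the commenter's own estimates, not published numbers, and the variable names are just for illustration.

```python
# Figures quoted in the comment above; the per-account values are estimates.
purchase_price = 2_500_000_000       # USD paid for the studio
accounts = 54_000_000                # user accounts at acquisition
est_value_per_account = 250          # USD, commenter's estimate
est_short_term_revenue = 10          # USD per account, commenter's estimate

# What Microsoft effectively paid per acquired account.
cost_per_account = purchase_price / accounts

# Total value of the account base under the commenter's estimate.
account_base_value = accounts * est_value_per_account

# Short-term projected revenue across all accounts.
short_term_revenue = accounts * est_short_term_revenue

print(f"cost per account:   ${cost_per_account:,.2f}")
print(f"account base value: ${account_base_value:,}")
print(f"short-term revenue: ${short_term_revenue:,}")
```

Under those assumptions the price works out to roughly $46 per account against an estimated $250 of value, which is the sense in which the deal is called a steal.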
Disagree about being the death of Maven Central - they are different beasts.
- Central has a global namespace of artifacts. com.google.guava is the same for everyone. This will probably stay the default of open-source libraries.
- GitHub Package Registry has a per-user maven repository, so a local namespace (https://maven.pkg.github.com/OWNER). This is likely to be used by companies internally.
In order to use GH Registry instead of Central, I would have to add a dozen maven repositories to my settings.xml. I doubt many developers will be up for that.
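To illustrate the "dozen repositories" point, here is a rough sketch that generates one `<repository>` entry per owner, using the https://maven.pkg.github.com/OWNER pattern mentioned above. The owner names and the `github-<owner>` id scheme are made up for the example, and a real settings.xml would also need these entries wrapped in a profile plus matching credentials.

```python
import xml.etree.ElementTree as ET


def github_repositories(owners):
    """Build a <repositories> element with one <repository> per owner."""
    repos = ET.Element("repositories")
    for owner in owners:
        repo = ET.SubElement(repos, "repository")
        # The id scheme here is an assumption, chosen only for readability.
        ET.SubElement(repo, "id").text = f"github-{owner}"
        ET.SubElement(repo, "url").text = f"https://maven.pkg.github.com/{owner}"
    return repos


# Hypothetical owners standing in for the libraries a project depends on.
owners = ["google", "apache", "square"]
xml_text = ET.tostring(github_repositories(owners), encoding="unicode")
print(xml_text)
```

With Central, all of those dependencies resolve from a single repository entry; with per-owner registries, the list grows with every new owner you depend on.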
It's made by JFrog (makers of Artifactory), it's been around for a while, it supports lots of formats including harder ones like apt, and it makes package distribution about as easy as it can be.
I tried it years ago and it didn't offer signed packages at the time. I ended up just using ansible to build my own rpm/deb repos on a server given to us by a University:
Their support is the worst. I anticipate a 4 day turn around if I ever need to contact them. Scary really.
Their Gradle plugin is pretty bad too; kinda ironic given their prominent position in the Android and Java community. But then, Gradle itself is crazy town so it's hard to blame them too much.
Yea same here. There are way too many workflows already setup around Maven Central. People publish to it from Scala/SBT, Gradle, Clojure/Leiningen, Kotlin, etc. It's not going to be going anywhere any time soon.
Exactly. It’s nice to have competition in the Java package space aside from Maven, JFrog Artifactory and Nexus. It will take time to build up network effects for GitHub but if they make a good product I could see it happening eventually. We use Artifactory where I work and the generic aspect of it plus the ecosystem integrations are super nice. I can publish docker images, regular zip files, jar files, Python packages etc and use the same tooling for all of that. GitHub should really push for the one-stop-shop approach here because I feel like that’s going to be their major competitive advantage. That’s where GitLab has been playing too so I wouldn’t be surprised to see a similar product from them in the near future.
Where do you think most of those projects are managing their code? And of those, what percentage are already publishing releases on GH? I'm willing to wager the migration will be much faster than you think.
I tried publishing a side-project to Maven Central for a few hours, only to give up and publish to Bintray in minutes.
I'm willing to admit I was probably doing it wrong, but I'm glad it forced me to look at other options. There are definitely easier methods of package/publishing out there, and GitHub package registry sounds awesome.
What? Maven Central is here to stay. It will be here even after a nuclear war. Jitpack does the same as GitHub Package Repository (or even simpler) and Maven Central is still here. I don't see why this would change anything.
This is pretty interesting. GitHub really is becoming the social network that MS never seemed to be able to create. We already use it as our portfolio of work for potential employers. We collaborate with fellow enthusiasts and maybe even make new friends. We host our websites from it. Abuse it to store binaries, too. And now, alongside source code, we can use it as a CDN of sorts to serve packages, for free. Sounds pretty great. All they need now is a place to get coding questions answered (a la stackoverflow) and along with GitHub jobs it could be really compelling.
Pure speculation, it would not surprise me to wake up someday and see MS has bought Stackoverflow. Given their direction of integrating the entire developer experience, it would make sense. MS is upgrading technical docs across the board, organizing and linking to SO content would make sense.
In light of StackOverflow looking for a new CEO, layoffs in the past year and a half, $68 million in venture capital looking for a return, and Joel Spolsky's connections to Microsoft, this might actually happen.
I've also gotten the impression that StackOverflow's recruiting product isn't doing so well. It seems to be a few hundred dollars a month for a single job posting, but the results for recruiters are apparently mixed.
StackOverflow for Teams seems like a really hard sell too. $5-10/user/mo is pretty steep, especially for a service that needs a significant-size userbase to "work".
We use StackOverflow for Teams with a small team (<15 developers) and it’s been great. While I’m sure our revenue alone won’t make it profitable, I think it’s a product that can work with teams of all sizes. Don’t knock it till you’ve tried it.
You might say StackOverflow careers isn’t doing well, but it is literally the only jobs listing outside of this website that I look at. The ability to get a succinct email within a chosen SALARY RANGE and being able to select remote only is AWESOME.
Pretty sure it was developed entirely on the MS stack. Jeff Atwood had a few posts about it. At the beginning it was literally one Windows Server machine.
If I remember correctly, my team of four was quoted at $13k/yr for the StackOverflow job posting / recruiting solution.
It's probably worth it for companies with greater hiring needs than ours, but LinkedIn (begrudgingly), and ZipRecruiter have provided enough quality candidate-flow for far less money that it doesn't make any sense for our uses.
Truth, I was at SO in Manhattan for a JS meetup a couple times and all the desktop computers were PCs with Dell monitors. Not a single Mac in sight. Have a feeling they weren't running Linux either since Stack Overflow is .NET I believe.
I remember many many years ago listening to the Stack Overflow podcast which was Jeff Atwood and Joel Spolsky talking, in real time, about them creating Stack Overflow.
IIRC it uses ASP.Net MVC or something like that, and might have been the first and/or biggest site using it?
> running linux either since Stack Overflow is .NET I believe
It is a matter of taste and I'm sure these guys (Atwood/Spolsky) love Windows to work on, but with .NET Core (we started porting when the first stable ASP.NET Core came out), we ported/migrated everything from Windows/MS (SQL server, Windows Server, AD, etc) to Linux + MySQL/Postgresql on ASP.NET Core.
I guess it's what you are used to, but everything is faster, smoother, more stable/consistent and easier to manage now. I would never go back.
It is still really unorganized in many ways. The worst thing, I think, is search results still frequently list obsolete MSDN pages higher...and, btw, the new branding is not MSDN, but docs.microsoft.com.
They still have a long ways to go, with the typical problems of a large organization tackling a large, kind of amorphous, project.
docs.microsoft.com PM here - thanks for the feedback! It takes some time to update all our search results across the two major search engines. Given that some pages have less traction than others, the more obscure content sometimes still is indexed as if it's coming from MSDN.
We have moved most of the library to docs, with redirects in place, so hopefully you won't get too many 404s. If you do - feel free to report them here: https://aka.ms/sitefeedback, and we'll address them.
I'm genuinely asking, not meaning to poke if it's Bing - I use DDG but just don't have a feel at all for what's most popular after the obvious one.
Wikipedia lists just 7% market share left for the second and the rest - thinking about it, it's probably one that's popular in China and unheard of elsewhere?
Considering Verizon has stated their Yahoo and AOL properties are worthless[0], it's probably Bing and the search engines that rely on it (including DDG).
Funny thing, you're describing exactly why you're not assuming a global perspective. Forget about a first, the world is fragmented. To be global is precisely to cater to individual locations around the world.
Question: Why is Offline documentation and the Help Viewer in Visual Studio 2017 still horribly broken? I keep it around for when I don’t have internet access but it’s next to useless. Why keep up the pretence? (In comparison, the CHM and DocEx from VS6 through VS2008 work perfectly and are very reliable)
What particular aspect of it is broken? Genuinely asking the question, because I want to make sure we address major issues in the customer experience that you have.
Massive amounts of duplicated content. Broken images and stylesheets. I’ve downloaded all content but pressing F1 on a symbol in VS informs me the content is only available online.
The term "social network" has become too vague in 2019. You will have to append a purpose to each one. ie. Yelp is a social network for food, LinkedIn is a social network for professionals, and GitHub is another for developers.
Each one will serve a niche which is much harder to supplant because there's a common purpose. In contrast, when people think of Facebook, people just associate it as 'the' social network but not one for a special purpose.
> All they need now is a place to get coding questions answered
I think GitHub issues has already started doing that. Personally, I've been finding more help from GitHub issues than Stack Overflow, plus I find myself asking questions or submitting bugs on GH a lot more than asking something on Stack Overflow. In fact, I've not asked anything on SO for years now.
I think YMMV on this, cause I also know of a lot of repos that explicitly close any issues that are support requests because they fill up the issue list so quickly. I think having it separate as in SO is still going to be the move unless there's some big re-organization of how Issues work.
This is quite true but it's boggling why neither GitHub nor its competitors have added a 'questions' tab to public repos where people are explicitly allowed to ask questions, and have them answered by maintainers or other users.
Which is usually a ghost town. Maintainers of small libs usually don't even monitor SO. Issues is the fastest way to get high visibility and get a question answered quickly since newcomers usually read GitHub issues to see if there are any serious things to worry about before using the package.
Issues aren't used as people generally use SO. Questions like "how do X". Most people use issues to report bugs, request features and most prominently contact the authors when things either don't work as expected or the software lacks documentation about how certain things are supposed to work. I as a user and maintainer of OSS packages prefer such questions in issues than on some random site like SO which I won't be monitoring. Issues is a great place to consolidate all the knowledge around a package. Also, most of such questions can be considered as bugs, feature requests or just plain lack of documentation.
sure ...except that a lot of support questions can be completely irrelevant to your project, because the users aren't competent programmers. it's a big time suck, and people happily demand help, and then don't even bother to thank you for the hour of your time you spent solving their problem--which was ultimately due to them not paying attention, or not having basic domain knowledge.
I'm all for giving. I happily write tutorial blog posts, but I don't feel obligated to give more help than I already gave.
I see what you mean but questions like these are not what I was really referring to. I do get them from time to time on my repos and I point them to where they can get help which usually is some mailing list, docs or some other related project. Also, my quality of life as a maintainer has dramatically improved since I stopped caring about maintaining a clean list of GH issues. I'm fine with people opening tons of them and a lot of them being open for a long time. I'll get to them when I can as I don't feel obliged anymore to answer all those questions or implement new features.
If I post code to GitHub, I'm happy for you to use it, and I'm happy to learn about bugs in the code. But what obligates me to facilitate support, or respond to support requests? People should grow up.
Plainly you're not obligated to, but if you want to actually grow the userbase it makes sense. If you really don't want to help with (reasonable) support requests, then an alternative is to make that abundantly clear in your README.
> People should grow up
Not sure what this childish quip adds to your otherwise sensible comment?
I personally will pass on any project/lib that has a bunch of unanswered issues, or a bunch of auto closed stale issues. Unless the project is big enough to have an active SO or Gitter community.
I definitely see some people (ab)using Issues as a way to ask fairly generic coding questions. It might be time they open up another avenue for questions generally.
You jest, but Microsoft seems to have really taken Ballmer's message to heart lately. Ballmer was on-the-nose about needing a strong developer community; he was just terribly misguided about how to actually get one going :)
Or you know, it could just focus on its core competencies and be good (great?) at what it does. They don't need to eat the world to provide a positive impact to it...
MS's core competency has always been developers. IBM called Microsoft for BASIC back in the day because without MS BASIC their computer was DOA to a lot of potential customers.
> MS's core competency has always been developers.
As a developer who still has to work very hard to forgive MS for all the pain IE6 put me through a decade ago, this grates on my ears, even though I understand that it might be true in the abstract.
Classically, MS has been good to developers who agree to be chained to their platform, but has made life extremely difficult for developers who want or need to be platform independent.
Platform vendors and developers will forever have conflicting interests.
>As a developer who still has to work very hard to forgive MS for all the pain IE6 put me through a decade ago, this grates on my ears, even though I understand that it might be true in the abstract.
Yes, ActiveX, Windows, Java etc., and god knows how many awful things they did; I can't remember them all. But years later Bill Gates decided to donate his wealth to good causes. Not only is this not a PR / marketing stunt, he is actually using his time and energy running it. That alone halves whatever hatred I have had.
Ever since they lost the Smartphone OS race ( if you consider they were even part of it ), I don't consider M$ a monopoly or threat any more.
And given the amount of good things they have done since the new CEO took the helm, such as WSL and now WSL2, VS Code, .NET fully open sourced under the MIT license, ditching IE (God that feels good), and DirectX RT, along with lots of research put out, I think it is worth reevaluating that hatred against M$ we once had.
We have no lasting friends, no lasting enemies, only lasting interests.
Back in the 80s, IBM was evil, Apple and Microsoft were good. I have an "I HATE IBM" badge from a very early computer show.
MS was mainly known for its languages and its apps, more than the OS. IBM PCs still came with CP/M or PCDOS (MSDOS).
Then when Windows 3 came out, MS started to act like IBM but on steroids, thinking they owned the "stack" (as it was). OS/2 was the last attempt to extract the "PC compatible" world from the Windows domination.
Then IE6 and ActiveX ensconced MS in the enterprise. What used to be "you won't get fired for buying IBM" became "you won't get fired for buying MS because there's no choice".
The onset of the web and competitors in MS's dominant space (well except for apps, Office still rules the world) and the demise of the Ballmer years (especially the death of their mobile/phone ecology) means that MS is now actually doing what IBM did about 15 years ago when they adopted Linux.
Microsoft started acting like IBM from the moment the DOS licensing was agreed with IBM. Incredibly naive on IBM's part, or perhaps they simply expected to sell so few machines it wouldn't matter. After all the first PC was deliberately crippled to stay away from more expensive IBM kit.
MS were acting like mini IBM throughout the 80s, and before the mid 80s had very much gained a negative reputation globally. It was Microsoft, not IBM, that was the usual target of complaint when the first AT clones were coming out - 84? 85? I think Windows 1 was about the same time. Certainly enough of a reputation to be amazed they were still collaborating with IBM to produce the first OS/2, again around the mid 80s.
A large part of that is that in the late 1970s IBM was under anti-trust investigation. They had to make some changes in behavior to prevent bad results. Much of the PC would have been different/closed if IBM wasn't afraid of what lawyers would do.
When you have that much money it isn't possible to donate it all at once.
Fraud in charities is a real thing. There are a lot of "charities" that do some good work, but primarily exist for the benefit of someone. Often the primary purpose is to hide bribes: the CEO's spouse is a high government official. It is very easy to get mixed up in such a charity and end up not doing well with your money.
The other problem is 90B is too much money for any charity to handle at once. Any charitable program that has a lasting (and thus useful) impact will take time. Even if the charity sets up a trust, there is nothing to stop the CEO from raiding that trust in later years. Several charities started good, but over time have slowly - and legally - morphed into something that is very different from what the founders intended.
Staying in charge of his money is the best way to ensure that it is used well.
I don't think developers should place too much trust in any platform vendor. (Apple and Swift, Apple and Metal, Google and AMP, Google and Dart, Amazon and AWS... the list is endless.) Their interests are fundamentally at odds with ours.
Platform vendors benefit when developers are locked in by network effects. Developers maximize their value when their skills are transferable.
And that goes for Github and its package registry. Concentration of power is problematic.
Except that in the commercial software world, you can't stay independent with "purity".
Supporting multiple cloud providers is still a cost. K8S, Docker, Pulumi, Packer are starting to reduce that interoperability gap, but it's still a friction.
I've moved to .NET on Ubuntu for our current project (I'm an Engineer Manager/Architect, but the team is C#). It's been remarkably smooth and low friction.
So it's a balance, the interests of a platform vendor are of course that you stay on their platform. But currently there's enough competition that it's still a buyers market.
AWS is getting a bit too powerful, Azure is doing a good job of keeping MS shops in the MS world, but GCP is disappointing.
In the abstract, what would best serve developer interests is if platforms are as compatible as possible, especially in their superficial details. That minimizes switching costs, both in terms of what it takes to port real software from one platform to another, and in terms of what a developer must learn to apply what they already know when working with a new platform.
Taken to an ad absurdum conclusion, developers want all platform vendors to coordinate in order to minimize switching costs. Of course there are lots of good reasons such a level of coordination will never be realized. :)
In the meantime, there will always be efforts to create adapters which generalize the interfaces of multiple platforms and put them behind a common wrapper interface. But while such interoperability efforts serve developer interests, they work against the interest of vendors in encouraging platform lock-in.
The dynamic isn't pure in terms of real product offerings because vendors also understand that portability provides value to developers, and so some vendors will provide at least some portability in order to differentiate themselves. But I maintain that the fundamental interest structure is unchanging.
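As a rough illustration of the adapter idea in this comment, here is a minimal sketch. Everything in it is hypothetical: `VendorAClient` and `VendorBClient` are in-memory stand-ins for two incompatible vendor SDKs, not real libraries, and `BlobStore` is just an assumed common interface that application code would depend on.

```python
from typing import Protocol


class BlobStore(Protocol):
    """Common wrapper interface the application codes against."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...


class VendorAClient:
    """Hypothetical vendor SDK with its own method names (in-memory fake)."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def upload_object(self, name: str, body: bytes) -> None:
        self._objects[name] = body

    def download_object(self, name: str) -> bytes:
        return self._objects[name]


class VendorBClient:
    """A second hypothetical SDK with a different, incompatible API."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def write_blob(self, path: str, content: bytes) -> None:
        self._blobs[path] = content

    def read_blob(self, path: str) -> bytes:
        return self._blobs[path]


class VendorAStore:
    """Adapter: maps the common interface onto vendor A's calls."""

    def __init__(self, client: VendorAClient) -> None:
        self._client = client

    def put(self, key: str, data: bytes) -> None:
        self._client.upload_object(key, data)

    def get(self, key: str) -> bytes:
        return self._client.download_object(key)


class VendorBStore:
    """Adapter: maps the common interface onto vendor B's calls."""

    def __init__(self, client: VendorBClient) -> None:
        self._client = client

    def put(self, key: str, data: bytes) -> None:
        self._client.write_blob(key, data)

    def get(self, key: str) -> bytes:
        return self._client.read_blob(key)


def archive_release(store: BlobStore, version: str, artifact: bytes) -> str:
    """Application code: knows only BlobStore, never a vendor SDK."""
    key = f"releases/{version}.jar"
    store.put(key, artifact)
    return key
```

Switching vendors then means swapping one adapter rather than touching `archive_release`, which is exactly the reduction in switching cost that works against lock-in.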
It's a really nice project overall, having a registry that supports many different projects and run by a company that today is good, is always nice.
But we've been here before. We trusted npm and now they are trying to squeeze out a profit, and it ruins it for the users. I'm happy to be proven wrong, but every for-profit company that runs a package registry eventually stagnates, and ends up implementing things that are not for the users, but for their own profits.
I think package management, especially for open source, should not be run by for-profit entities. We need to have something similar to public utilities, where the community funds the registry itself, and the community can own it as well, where the only changes allowed, are changes that are good for the users.
This is not that. npm and docker are already run by for-profit companies, so this move by GitHub just adds another centralized package registry for those. It's not worse, but it's not better either. I'm a bit mad about the RubyGems part though, as RubyGems is a community project, and they are trying to make it not so, making it worse.
It's basically a community funded decentralized package registry, where the community funds it, and is a part of the ownership of the registry, handled via a governance followed by the contributors. All the finances, development and planning is happening in the open, and Open-Registry is committed to never making changes that are for increasing profits, only changes for making the service better for users.
Please, if you have some free minutes, check it out and write down some feedback. We might not be the perfect package registry overnight, but I'm hard at work getting as close as possible, without compromising the user value for it.
First of all, thank you for building something like this. I like the idea of a decentralized, open registry.
That said, the market's moving towards a universal registry for package management, across tech - npm, docker, linux packages, jars etc.
With that perspective, GitLab's initiative (https://about.gitlab.com/direction/package/) is something I'd likely prefer. The software's open-source and deployable, which means the software's fate isn't tied to that of a single company.
It's already ironic enough, that the world's biggest collection of open source projects is managed by a single closed-source software - GitHub.
Yes, I agree with you. Open-Registry isn't tied to being just a JS registry. Open-Registry focuses its energy on unlocking the for-profit registries first though, like npm, docker and packagist, before we'd consider moving on to other already non-profit registries. Currently, there are no plans regarding expanding it, but it wouldn't be very hard and the architecture of the application makes it very easy to expand too.
While GitLab's effort is (in my mind) more well-meant than GitHub's, since it's open source, I don't think having the software open source is enough. The full development, funding and finance has to be open as well, and I don't think GitLab fits that. Basically, we need Open Source Public Utilities for core infrastructure projects like these.
> every for-profit company that runs a package registry, eventually stagnates, and ends up implementing things that are not for the users, but for their own profits.
I actually think GitHub might be different, because they have a pretty solid monetization model already: companies paying per user for private source repositories. This easily extends to companies' private artifact repositories.
GitHub benefits from the network effect of providing free source repositories to open source projects, so this is probably enough incentive to start and keep providing high-quality free artifact repositories to open source projects.
Yeah, today that's so. The problem with for-profit companies is that there is nothing keeping it that way, except the goal of earning a profit.
The moment the outlook of earning a profit changes, the company has to adjust and sometimes that doesn't affect the users. But sometimes it does, and it's those cases Open-Registry is trying to prevent from ever happening.
Let's say the community comes up with a feature that would be great for the GitHub Package Registry to provide, but it would make the earnings from private repositories lower. Since GitHub rely on earning from private repositories, the decision will probably be to not implement that feature, even though it would be good for the Package Registry users.
I think this is somewhat mitigated with Microsoft these days because of their motivations for buying GitHub. For them this endeavour seems to be focused on winning developers' hearts and minds rather than seeing how much profit they can make from it. Clearly they have a business case for this, but it seems to be more geared towards making Azure more and more profitable instead.
I'm not as sure as you are. Sure, as it seems today, Microsoft wants developers to be as happy as possible. But in the end, Microsoft is not running a non-profit. They are running a for-profit company and the motive is simple: earn a profit.
Today, they can afford not earning as much on their new Package Manager, since they earn money elsewhere. But that's no guarantee they will act the same way tomorrow.
We've seen Microsoft go back and forth in developers' minds, and I'm sure we will see more movements back and forth in the future. Right now, things are good though.
I think it's good to be cautiously optimistic. Microsoft has a massive revenue source in Azure, and losing 100 million on Github to make 1 billion in Azure is... a no brainer.
No brainer for who? Feels like the users are the ones losing here.
If a company is running two divisions, one that doesn't make any profit and another where they make a massive profit, which one will they focus on? If shit hits the fan, which one gets cut first?
Having some core infrastructure like a package registry be the losing option in that case does not feel like a no brainer when you're a user choosing a service.
As my original comment is too old to be edited, I just wanted to add some more of my thoughts on the issue, but it became too long to post here, so I ended up with a separate blog post. You can read it here: https://dev.to/victorb/the-everlong-quest-for-the-perfect-pa...
I'm unsure if this is willfully ignorant marketing, or naivety. Profitability is a GOOD thing: profit typically means people getting paid, livelihoods supported, lives built, etc. I hope NPM becomes profitable, to support the awesome team that builds useful tools for millions.
It's noble that you are building a non-profit, neutral registry, but framing the contrarian view as evil, and pitching this as a sacred good vs evil fight is bad.
Maybe your value-add is that the registry works for the good of more than the parent funding org, and that in itself is valuable. However, not-for-profit is scary because you will refuse to go the extra mile for any one customer, even if they pay you money, and only prioritize whatever YOU deem fit and moral.
There's something slightly concerning about ceding responsibility for distributing the world's open-source projects from a family of strong independent repositories to a centralized platform owned by a tech giant.
Yes, but that's not a new concern - to some, GitHub has always represented an anathema to what git was supposed to be and bring. Centralization at a proprietary vendor, instead of open systems interacting. Then locking people in further by network effect and adding centralized products around git. That it's become so popular many people equate GitHub with git adds insult to injury.
I completely understand why this all happened (centralization is just so easy and convenient; federation is hard), and it was probably inevitable in its timeframe, but I also wish it wasn't so. It's not quite what we imagined when we made the leap to dscms in the early aughts.
All the good stuff is still in there, though, and it's still as possible as ever to do different things, so it's not a bleak situation.
I have the opposite view: the success of GitHub and the growth of code being open by default with everything running through git has probably brought more people into the git ecosystem than would have otherwise. I primarily use GitHub, but whenever I need something that I need to run myself I know I can fairly seamlessly switch over to something like GitLab.
For example, if GitHub ever started using a very proprietary application, I would just switch over to using regular git, and I'm guessing many others would too.
I am with you on this one. I use GitHub to share code, and participate in projects. I use my own GitBucket instance for anything purely personal that I don't want to lose, but don't want to make nice or document. At work we use GitLab.
I'm all in on git in a way that I might not have been without GitHub making it so huge. Without GitHub, we'd probably all be using git at home but SVN at the office.
I don't believe you would. It's more probable that you'd install that proprietary application to get the features which standard git lacks at that time.
When Linus introduced git he didn't seem to care at all about decentralizing from a political standpoint, just from a "I can work on this from my laptop without an internet connection" point of view.
That's the thing - git was fundamentally a tool born with an asynchronous workflow in mind: I work on X, Alice works on Y, Bob works on Z, and the eventual merging (which might happen days or weeks later) should be as simple as possible - without worrying about who checked out what. Git was dropped in the "distributed VCS" bucket, but decentralization was a secondary effect of the workflow Linus wanted to achieve.
GitHub then took the server-side bits of git, and effectively built a web-based interface with social features on top. Git itself is still very much a decentralized tool (just add a new remote and off you go), only the social GUI is centralized.
It would be cool if somebody could build "Github over P2P" (I guess with a bit of blockchain, because hype). At that point the entire stack would be fully decentralized.
A git service built on IPFS or something similar would be wicked. It does take quite a bit of engineering (money) to compete with the big boys, however.
rad project allows you to create, checkout, manage, and publish a project, comprised of issues, patch proposals, and a git repo.
Well hot dog.
And it uses LISP for scripting? Nice.
Interesting approach to saving state, too:
One only needs the address of the latest input, the "head", to be able to recover the whole log. The owner of a machine uses an IPNS link to point to the head of the list, and the name of this link is then the name of the machine also.
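To make the quoted idea concrete, here is a minimal sketch of a hash-linked log where knowing only the head address is enough to recover everything. It is not Radicle's actual implementation: the in-memory `store` dict stands in for a content-addressed store such as IPFS, the IPNS pointer is reduced to a plain `head` variable, and all function names are made up for illustration.

```python
import hashlib
import json

# Hypothetical in-memory content-addressed store: address -> serialized entry.
store = {}

def put(entry):
    """Store an entry and return its address (SHA-256 of its canonical JSON)."""
    data = json.dumps(entry, sort_keys=True).encode("utf-8")
    address = hashlib.sha256(data).hexdigest()
    store[address] = data
    return address

def append(head, value):
    """Append a value; the new entry points back at the previous head."""
    return put({"value": value, "prev": head})

def recover(head):
    """Walk back from the head address and return the values oldest-first."""
    values = []
    while head is not None:
        entry = json.loads(store[head])
        values.append(entry["value"])
        head = entry["prev"]
    return list(reversed(values))

# Build a three-entry log; only `head` needs to be published.
head = None
for item in ["create project", "open issue", "propose patch"]:
    head = append(head, item)

print(recover(head))
```

Because each entry embeds the address of its predecessor, publishing a new head is the only mutable step; everything older is immutable and can be fetched from whoever has it.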
Thanks for sharing that. Definitely going to give it a whirl this weekend.
It looks the business but it lacks the single most important feature for popularity: a GUI. GitHub made Git dominant by building a friendly GUI on top of it. Before, it was just another player in a relatively crowded field of CLI DVCSs. Obviously it is not essential to get stuff done, but anything with any ambition of generating network effects definitely needs a GUI.
The other thing that seems to be lacking, from a quick reading of the docs, is a way to generate pull requests (or "patches" in Rad terms) from a branch, and then merge them on another branch. Obviously you can do it manually in git, but GH is definitely a superior experience.
> It looks the business but it lacks the single most important feature for popularity: a GUI.
It's a matter of perspective. My first thought when reading the docs was, "All someone needs to do is slap a GUI on this baby. Good thing the designers made a simple CLI that it could interface with."
They've already done most of the heavy lifting. At this point a GUI is trivial to add. "Terminal-first" doesn't imply "terminal-only". In fact, quite the opposite. I wouldn't be so quick to assume that they don't envision a GUI at some point. Why not contribute to the project and get the ball rolling?
The patch command [0] has a propose subcommand that describes what you're talking about. It generates a patch from a commit (on any branch, I presume). This can be applied however you see fit. And the checkout subcommand even lets you generate branches from patches similar to GH. What seems to be missing?
I agree that the GUI situation is a glass-half-full sort of thing, I'm just saying it needs that as a priority if they want any network effect.
> The patch command [0] has a propose subcommand that describes what you're talking about.
Yeah but in the tutorial it says it will fail to work if you are on a different branch from master - I took it to mean that the patch command can only target the same branch it was generated on. If that's the case, obviously the maintainer can then do the manual merge-and-delete routine; I'm just saying that on github it's a one-click operation.
Your sister comment had another project to share based on IPFS: https://radicle.xyz/
One thing that seems clear is that there needs to be a standard interchangeable format for storing PRs and issues, so that it's not just the point of origin that's decentralized, but the data itself (in case the maintainers vanish).
I'm not really sure how to go about that sort of thing either myself or as a community effort but I'd appreciate any advice from the HN community.
> One thing that seems clear is that there needs to be a standard interchangeable format for storing PRs and issues, so that it's not just the point of origin that's decentralized, but the data itself (in case the maintainers vanish).
I use Artemis for issue tracking, and that uses maildir (a widely supported standard). I compose issues with Emacs message-mode and render them to HTML using mhonarc.
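For anyone wondering what "issues in a maildir" looks like in practice, here is a minimal sketch using Python's standard `mailbox` module. This is not how Artemis itself is implemented; the directory name, the helper functions and the sample issues are all assumptions made for illustration.

```python
import mailbox
import tempfile
from email.message import EmailMessage
from pathlib import Path

def file_issue(maildir_path, subject, body, sender):
    """Store one issue as an email message in a maildir and return its key."""
    box = mailbox.Maildir(maildir_path, create=True)
    msg = EmailMessage()
    msg["From"] = sender
    msg["Subject"] = subject
    msg.set_content(body)
    key = box.add(msg)
    box.flush()
    return key

def list_issues(maildir_path):
    """Return the subjects of all issues in the maildir, sorted by title."""
    box = mailbox.Maildir(maildir_path, create=False)
    return sorted(msg["Subject"] for msg in box)

# Hypothetical issues directory; in a real project it would live in the repo.
issues_dir = str(Path(tempfile.mkdtemp()) / "issues")
file_issue(issues_dir, "Docs typo", "README says 'teh'.", "alice@example.com")
file_issue(issues_dir, "Crash on empty input", "Steps to reproduce: ...", "bob@example.com")

print(list_issues(issues_dir))
```

Since each issue is just a file in a standard format, it can be committed alongside the code and read by any mail client or script, which is the portability the parent comment is asking for.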
> anybody can push to anybody else's git repository. [...]
The nomenclature is to treat the ssb remote as the "true" remote, and work off of a branch called @your-username/master as your own master, emulating a forked repo. This seems to work well: the SSB network thrives off of being a group of kind, respectful folks who don't push to each other's master branch. :)
Uh, thanks but no thanks. I thought we had learnt that the honor system does not scale.
> decentralization was a secondary effect of the workflow Linus wanted to achieve.
Linus didn't want to achieve a change in the workflow of kernel development, he just made a tool which would ease the pain. Linux development had been decentralized since forever.
What I said is that the design of git was driven by the requirements of async workflow (“easing the pain”, in your words) more than decentralization as a philosophical objective. I think we are saying the same thing with different words.
I still wonder if Larry McVoy feels sore that git basically destroyed BitKeeper and became what it did. “It could have been me” and all that...
I just wish a PR and issue tracking system with pluggable credentials had been simultaneously developed and implemented alongside git, so that I could migrate my issue and PR history, or plug them in to multiple hosts.
I prefer Gitlab for a multitude of reasons but since all the action takes place on Github, my Gitlab account just serves as a repo mirror.
Git has always had integration with email, which makes it compatible with a vast amount of existing servers, clients, credential systems, Web UIs, scripting languages, etc.
As an example I think SourceHut is mostly based around email (which it provides a Web UI for) https://sourcehut.org
That is, indeed, a fair point of concern. But in practical terms, I would place Github very high in any ranking of good things that happened to open source.
It's possible many people have forgotten, or are too young to remember, how the ecosystem worked pre-Github. There was sourceforge, which wasn't quite the disaster it is today, but also not very good. But mostly I remember every project using different, often hand-rolled systems. PRs had to be sent in by mail. Every project had their own conventions of where to send patches, what formats to use, what additional information to provide etc.
Just try figuring out how to get a patch into Debian, which is still where most projects were ca. 2005. I won't wait.
I never contributed to OSS pre-Github. These days, I routinely send in a patch for smaller things I encounter a few times per week. Over time, I have also started becoming a more involved contributor to two projects. I doubt this would have happened without the flat learning curve that Github provides.
I wouldn't be surprised if both the number of contributors and total contributions to OSS have soared by a factor like 5x even above just the growth in OSS usage, and Github is the obvious reason for it.
Hypothetically, and, currently, only hypothetically:
If Microsoft is still of the old spirit, then what we see now would be the biggest "Embrace, Extend, Extinguish" coup they have ever pulled off.
It won't happen now, it won't happen tomorrow; it would be too big of an effort for that. But Microsoft is trying hard to win back the hearts of "The Community" and "The Market". Developers in particular have become more savvy about computers: the web has made it possible to live in IT without being "shown" and "taught" by "Big Daddy" type companies, and everybody is much more self-organizing these days, so there is much more competition for MS than there was in the past. So they are trying to get back what they have lost.
* VSCode is very sweet, with lots of bells and whistles, and major software companies write plugins for it (the most active being Microsoft). As a programmer's text editor it sits right at the core of every development.
VSCode, especially, is attractive to people outside of MS Windows (they might use VisualStudio). I am talking about web- and "App" developers. Mostly frontend or mobile.
* By buying Github, they bought the "source of all sources". They won't ever own the code, but as long as they own the popular infrastructure, everybody is playing on their grounds. The next step, in two years or so, may be the need for an MS account to log into Github. They integrate it.
Out of curiosity, what else did MS buy in the last few years that would fit into this pattern?
Hopefully something along these lines will also be added to GitLab.
I share your concerns, but I've also long had the feeling that both NPM and Maven are a security disaster in the making.
Having the dependencies being published from the same place that stores the actual code, gives me a little hope that things will improve from the security and design perspective.
Ideally I would like the social aspects of GitHub (trending/popular repositories, starring projects, notifications, etc.) but with decentralized hosting. Something that would be to GitHub what Mastodon is to Twitter.
While the technical side of the news is interesting, the organisational repercussions worry me. Microsoft (who owns GitHub) is already one of the largest tech companies, and I would not be surprised if this move was intended to weaken NPM and Docker in an attempt to acquire them.
I fear a future where everything one requires to develop "socially" depends on a single super-entity. GitHub and VSCode were the first steps in that direction, and now package management. My guess would be for CI/CD to be next on their list, with more integration of Azure somehow (potentially under the hood).
I'm glad you brought up Docker, but I think this is a move against GitLab, more than it is against NPM or Docker.
Lots of us use GitLab at work because it's such a complete product. Source code, container registry, CI/CD, Issues (via GitLab or Jira), Maven repository, NPM repository, etc. etc.
Microsoft is trying to build out GitHub so that they can more effectively compete for GitLab's corporate customers. Since buying GitHub they've added many of GitLab's key features to GitHub and these are some of the biggest adds so far.
You might be right that this hurts NPM and Docker, but I think it'll hurt GitLab more.
The price of self-hosted GitHub was so high the last time I checked that you could buy the whole Atlassian stack or the highest tier of GitLab instead and still have enough money left for Artifactory.
I guess that's their way of telling your bean counters that you don't want self-hosted and instead want to put everything on their servers (<Jedi mind trick wave>). That way, they can increase lock in.
Microsoft's Azure DevOps already has everything corporate customers could want though - AD integration, CI/CD (even hosted MacOS build agents!), choice of TFVC or Git, task boards, testing stuff...
Microsoft has been in this game for a while with Visual Studio, TFS, and other tools. The same strategy is just now catching up to a larger set of better tools.
IBM I believe tried to do this with their 'Rational' tool line and they're still buying into the game (UrbanCode).
That would make sense as a worst case scenario but I'm not sure the evidence suggests that's the route they're going. If they wanted to acquire a CI/CD product, they would've bought Travis when it was being shopped around for a buyout.
I guess this is the risk of working on a product that could be easily added as a feature to a much more popular product. But, hey, Dropbox is still successful.
It's an alternative registry so it's compatible with yarn and the official npm client. It doesn't seem to rely on any services provided by npm Inc though, so it's a direct competitor.
The npm registry started out as a hobby project that was eventually backed by the company its creator worked for. He then decided to pull out his project into his own startup, which raised some eyebrows because suddenly it looked like there was a lot of hostility between him and the company that previously footed the bill for little more than marketing value. Also it was completely unclear how the startup was supposed to make enough money to be viable.
Additionally during its "nice people matter" phase npm Inc seemed to be more focussed on creating a nice environment for its employees and maintaining its ethical values than creating anything that might generate a profit.
The two most obvious monetisation options were private packages and enterprise self-hosting. But when private packages had become a thing there were already third-party open source clones of the npm registry that offered this feature (first sinopia, now verdaccio).
There's really no way to monetise the registry itself directly because users simply aren't willing to pay for a service they expect to be free (like maven, rubygems, PyPI, etc). It would have been more logical to create a non-profit (or at this point transferring the registry to the OpenJS Foundation being the more obvious choice) instead of a for-profit startup.
npm Inc was doomed from the start. Even after acquihiring ^Lift to build the security audit feature there's simply no significant value in what npm Inc offers for money compared to what's already available for free.
The recent CEO change feels like a desperate move by the stakeholders to avoid becoming the next RethinkDB (which also ultimately failed to come up with a way to make money other than support licensing, i.e. renting out access to their developer time).
This is what people criticised when npm Inc was initially spawned: investor money isn't free money and having investors doesn't mean you can perpetually operate at a loss. Investors want significant return on investment, at least eventually. That means either selling out by being acquired (and likely killed) or becoming massively profitable (or surviving long enough while generating enough "value" to go public).
A package registry is a cost center that in order to be valuable needs to be practically guaranteed to exist forever. Maybe if GitHub manages to kill npm Inc they'll finally admit this and transfer the registry and client to a non-profit like the OpenJS Foundation.
We were contacted by NPM to switch our Enterprise account from self hosted, to hosted by NPM.
There are three problems with this where I work.
1. What if NPM goes bust? What happens to our packages?
2. What if NPM gets hacked? What happens to our packages?
3. The increase in price was HUGE, which was probably the reason for forcing us to migrate to their new cloud hosted option.
Look, it's an open secret at this point that NPM is in trouble: it's fired a bunch of staff, and other staff have quit. The new CEO is all about profit, and it's just the beginning.
Actually, it has never felt natural to me to publish a Node.js package to two web sites, both Github and NPM. Moreover, when Google lands me on NPM's web site I prefer to navigate right away to Github. If this new thing from Github is going to replace NPM so that there's only one place for that matter - I would not mind.
I'm worried about the resiliency of code distribution as we continue the trend of centralizing distribution in a few large companies. GitHub has had service outages in the past, so what happens when not just our repositories but also now packages are not accessible the next time that happens? It would be great if they'd implement it using an open/decentralized protocol such as IPFS, so that even if GitHub went down the content would still be accessible.
The problem is that hosting and bandwidth aren’t free and abuse is a big problem. Managing a distributed petabyte-scale archive which gets updated so frequently is a significant engineering problem even for a single party. Now consider how you’d handle redundancy and routing when you can’t rely on any of the parties involved, and you have enough different objects being accessed to turn away most participants unless you can guarantee that participating won’t blow your ISP’s data caps, interfere with other use, etc.
Abuse is the other huge problem: think about what happens when you’re hosting some BLOBs and the FBI shows up at your door because someone uploaded some kind of contraband and some of it was available from your IP address. How many people are going to set up completely independent hosting accounts to avoid fallout from something like that, which happens so regularly?
The closest thing which comes to mind is the Debian mirror network and that is something of a historical fluke, predating centralized hosting being possible, and scoped to a much smaller set of more trusted participants. That also hits the big problem that even with a fair amount of infrastructure backing it, it’s hard to match the user experience of something like Github or NPM so the most likely case is spending a lot of time in hard problems but not overcoming the basic economics, as seems to be happening to IPFS.
I hear those concerns, but I think there are clear ways to use decentralization to provide real benefit without running afoul of the issues you describe. For example, you can simply cache the packages you/your team are interested in locally, or on a shared local server for your entire office to use - which gives you fast p2p transfers, offline resiliency, and avoids serving any evil BLOBs or running into giant perf issues by trying to mirror and serve the entire registry. That's the way npm-on-ipfs (https://github.com/ipfs-shipyard/npm-on-ipfs) works - more details in this WIP blog entry https://github.com/ipfs/blog/pull/215/files?short_path=90aba... ;)
A local cache helps with performance issues but you still need to get it from somewhere, which means you’re still hoping someone else has dealt with those issues, not to mention the cost of maintaining a high-quality robust local server.
I’ve seen this cycle with Linux distributions, Java, and Python packages (arguably even Git), and several digital preservation systems (I work at a library so this is a popular topic) and each time there either ended up being strong user demand to switch to the performance/stability/consistency of a centralized service, that happening de-facto with one or two big players doing most of the work, or falling apart because the contributed resources were insufficient. Getting the incentives aligned for something like this is really tricky.
Thanks for your insight. I think those are definitely real challenges to a fully decentralized system, but allowing some sort of federation can't hurt. The worst case scenario would be if GitHub is the only one pinning all the packages, which is what we would have now. It'd be nice to at least have the option to mirror in a way that interops with an open protocol so it would still work if GitHub went down. I doubt many mirrors would pin the entirety like GitHub, but I know I would certainly be happy to mirror my own and any open source software I've used.
Doesn't this bifurcate the namespace of literally every packaging system they are supporting, or are they requiring `@author/`-namespaced package names?
In the livestream he pokes around a github repo, sees it's one author, and decides that's what makes it trustworthy? No GPG signing?
The new Actions support (about 50 minutes into the live stream) for auto-publishing from master is pretty sweet. From the very cursory demo, it seems very much like Gitlab's CI pipelines.
I'm a little bit anxious because the pricing has not yet been published. Both GitHub Actions and package registry will be free for public repositories but it is not yet known how much it will cost for private repositories after the beta.
They said they expected the package registry to be included in all paid plans. So it's only going to cost you anything if they decide to raise prices for everything across the board, it seems.
Namespaces in general are a mess. I want domain validated namespaces: github.com/example.com/, docker.io/example.com/, @example.com/, facebook.com/example.com, twitter.com/example.com, etc. Whoever owns the domain owns the validated namespace. I doubt it'll ever happen though, since (IMO) namespace squatting makes services look more popular than they are and some sites (ex: GitLab) already allow usernames with dots in them.
As for registries, I didn't like Docker's URLs when I first started, but now I'm convinced it's a good scheme. I can "own" my (domain) namespace by running my own registry. The implementation could have been a bit better though:
* The daemon should allow the user to force the use of a local mirror / cache as a registry.
* The daemon should pass full URLs to the registry for requests (https://github.com/docker/distribution/issues/1620).
That way something like Sonatype Nexus could be used as a local caching proxy for all Docker images and could automatically request images from (public) upstream repositories without any additional config.
The new TLDs make perfect identities / namespaces and there are plenty to go around.
Or I could download it from a GitHub-user controlled URL, or someone's random website. The name of the package is still "vscode", regardless of what location it was fetched from.
Yes and no. It affects software that's installed via things like Kubernetes pod definitions. You need to "relocate" images to the correct registry in that case.
Even with @author, my github username is someone else on npmjs. So installing “@me/module” will either get my module or the other guy's, depending on sources, it would seem.
Is centralization of open source a good thing for the world or not? This thread seems to be overwhelmingly positive. And in the end we all will be criticizing it if all package repositories end up being handled by a single entity. And the entity that is being applauded here happens to be the most valuable corporation in the world right now. Healthy skepticism seems to be a disappearing attribute in the tech world.
Seriously. I love Github but I don't know how to feel about a megacorp becoming the de facto source for packages in the open source ecosystem. It could be great, but many of us thought that consolidating all of our social activities under the Facebook umbrella was going to be great.
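A small sketch of why that ambiguity arises: which registry a scoped name resolves to depends entirely on client configuration. The mapping below is modelled on npm's per-scope `@scope:registry=` setting in `.npmrc`; the `registry_for` helper and the `@me` scope are hypothetical, and this is not npm's actual resolution code.

```python
DEFAULT_REGISTRY = "https://registry.npmjs.org/"

def registry_for(package, scope_registries, default=DEFAULT_REGISTRY):
    """Pick the registry for a package name based on its @scope, if any."""
    if package.startswith("@") and "/" in package:
        scope = package.split("/", 1)[0]
        return scope_registries.get(scope, default)
    return default

name = "@me/module"

# No scope mapping configured: the name is looked up on the default registry,
# where "@me" may belong to somebody else.
print(registry_for(name, {}))

# Same name, but the scope is mapped to GitHub's registry on this machine.
print(registry_for(name, {"@me": "https://npm.pkg.github.com/"}))
```

The package name alone does not say who you are trusting; the name plus the registry it was fetched from does.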
As usual it will take a disaster for people to realise it was a bad idea. Microsoft tried to destroy Linux in the past. Literally. Linux is what gave us git in the first place, and docker, and so much technology that we love today. Oh how quickly the past is forgotten when convenience is on the table.
This is one of the things I love about Packagist. Technically Composer doesn't care where the source is from, but the official Packagist repository actually just uses Github as the storage and CDN for downloads. You have to link a repo to publish it, and Packagist will only publish source committed to your repo (no build steps, etc). Packagist then uses the zipball downloads for each package for its source.
Downside of this approach is that almost any PHP project requires you to configure Composer with a personal access token for Github due to the amount of API requests causing rate limiting. Folks sometimes end up wondering why Composer needs an API token to download otherwise public code. (https://getcomposer.org/doc/articles/troubleshooting.md#api-...)
Composer/packagist has done many things right: namespaced packages, and downloads straight from VCS to name a few.
I wouldn't consider the Github personal token to be an issue either. It's a one-time setup per device, and my server (which only pulls code) never needed one, because it uses the lock files to download the exact commit/tag, and this significantly reduces the number of API calls made.
If you can just point at the github registry, and run `npm publish`, does that really solve the problem?
NPM's major problem is there's no official link between the package and the repo, any code/branch can be published, and unless I'm missing something, this doesn't really solve that issue.
OTOH GitHub is already in the position to require that only accounts that have 2-factor auth enabled can publish to public repositories. You can already require at the organization level that only users who have 2FA enabled can be members of the org, which is a great feature for orgs that host private code on GitHub.
AFAIK most cases where npm etc. have been compromised are scenarios where maintainer of a popular package re-used a password, and the password became compromised in some unrelated hack. Other attack vectors (compromising access tokens on maintainer's computer, compromising 2FA, compromising a git repo) really are a notch harder.
Even if these hacks are not the fault of npm per se, they make them look bad, and looking bad security-wise is really really something you don't want to happen to you when your whole business model is founded on user trust (public package repo).
There can be a link, if you prefer to write your dependencies down that way in package.json. See Git URLs¹ and GitHub URLs².
There are some challenges, though. If the repository requires a build step to derive a package from it then the author has to provide the proper package.json lifecycle hooks, e.g. a prepare script. Also, there's presently no git/hub-install support for a package nested inside a monorepo.
That does not change with this github registry. It's simply moving the binary storage. But as long as Github is not enforcing reproducible builds, or letting users pull down build environments, the status quo is still the same as today.
Basically because it's all just github, the package author and publisher are intrinsically linked because the package repo is directly associated with the code repo. In NPM, there is not any way to directly ensure the publisher and package author are the same because they're different systems.
Code signing is a different sort of trust issue, in this case if the package file is coming from the same github repo page as the source code, you know it (AFAIK) had to come from someone with write access to the repository.
vs having an npm package named (for example) nodejs, are you sure the npm package is authored by and owned by the same person or people that own the nodejs git repository? How do you verify that?
There are many problems this doesn't solve of course but it does seem like it helps with the one I describe above, the connection between the source and the package.
Unsolved problems of course would include things like 'did someone get unauthorized access to the git repo and put an artifact there' and 'did someone with unauthorized access push code to the repo and then have an artifact built'. Those are tough and real problems but I don't know if that's any different between this and say, npm. Code signing helps with that but you have the same unauthorized access problem if some bad actor gets their signing key instead of repo access.
I think if they required a user or org-namespaced package name, you'd get that. For example, if https://exiftool-vendored.js.org was `@mceachen/exiftool-vendored`, or `@photostructure/exiftool-vendored`, it's explicit, in the package name, who you're trusting.
> ... did someone get unauthorized access ...
If they required publishing to be via 2FA-authenticated users, and (if I can dream), GPG-signed commits, I think you get most of the way there.
Github is starting greenfield here, and it's frustrating they didn't (at least afaict) require these small steps.
When I'm looking at a given package, I'd like:
1. Assurance that the package was published by the author
2. Assurance that the package contents were generated, in an externally repeatable way, from a release tag.
It seems like they could have lifted 1. by requiring 2FA and GPG.
It seems like their new Actions tab could have given us 2. It may, I can't tell from the demo.
And when I update my dependencies, I also want to see the diffs from the version I'm updating from. Github already has nice comparison views for arbitrary commit shas, so this should be doable as well.
Github has traditionally been a Ruby shop and once you are a Ruby shop, you can use Ruby to do anything that you could use Python for, so there is no need to touch Python other than maybe for data science. That means they would have built a lot of expertise in Ruby and comparatively very little with Python. So it's understandable that they are able to add support for Ruby before Python. It must have been easier for them to do and also much easier to "dog-food".
That said, I'm sure Python, Go, Rust and other languages will be supported very soon.
Yes, but in the future Go will most likely use a package server that will act like a proxy, a cache and a way to verify packages. There will be many implementations. Some engineers at Microsoft are building one called Athens. The Go team will release one as well. Github could release yet another one.
Hosting a Python package index is much simpler, just dump the files in a web accessible directory with a directory listing and specify it with `--find-links`.
Less technical and more political. Python has a bureaucracy that can take let’s say “a while” to decide things. If GitHub approached them about this it wouldn’t surprise me if they’re still debating whether to condone it.
There's nothing required from the PSF to create your own repo. I created one myself on S3; all you need to do then is update pip.conf to use it. Pip also supports a primary repo and a backup, so they don't need to make their repo a pass-through to PyPI: pip can basically be configured to look up the custom repo first and, if the package is not found, fall back to PyPI.
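A minimal sketch of the pip.conf setup described above, written as a small Python helper that renders the file. The URLs are placeholders and `render_pip_conf` is a hypothetical name; `index-url` and `extra-index-url` under `[global]` are the real pip options. One caveat worth checking against the pip documentation: pip is generally understood to consider all configured indexes together rather than strictly trying one and then the other, so the "fallback" ordering is an assumption here.

```python
import configparser
import io


def render_pip_conf(primary: str, fallback: str) -> str:
    """Render a pip.conf that lists a custom index plus an extra index.

    `primary` and `fallback` are illustrative parameter names; pip itself
    only knows them as `index-url` and `extra-index-url`.
    """
    # interpolation=None so that a literal '%' in a URL is written unchanged.
    cfg = configparser.ConfigParser(interpolation=None)
    cfg["global"] = {
        "index-url": primary,
        "extra-index-url": fallback,
    }
    out = io.StringIO()
    cfg.write(out)
    return out.getvalue()


if __name__ == "__main__":
    # Placeholder URLs: substitute your own bucket or web server.
    print(render_pip_conf("https://packages.example.com/simple",
                          "https://pypi.org/simple"))
```

The rendered text would go in the per-user or per-virtualenv pip.conf; the same two options can also be passed on the command line as `--index-url` and `--extra-index-url`.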
PyPI has no closer relationship to Python than the other options to their respective language. Similarly to others PyPI also runs out of an http server. There really is nothing tied to Python.
GitHub, at least from what I've seen publicly in the past, uses all the ones they support except maybe Java/Maven; I've never seen anything about them internally using Python.
So, it's not really all that surprising of an initial set of choices for them to make.
Notice how you were able to call out the de facto package manager for those languages, but didn't for Python? I would imagine supporting the various Python package managers in use would be a bit annoying.
pip is the default package manager for Python, and while anaconda knows how to install from PyPI, you just pip install things into anaconda environments. So, the answer is "pip".
There's only one - PyPI used by pip, and you can run it on a basic web server.
The only other one that I've heard of is anaconda, but that's made by a 3rd party (not affiliated with the PSF); it is not just Python but also R, since it targets the scientific community. I also believe it is primarily used by Windows users.
It is packaging for scientific tools that happen to also have python.
Edit: from another comment I see that underneath anaconda apparently uses PyPI for python packages (I didn't know, since I never used it) so it is not even a proper repo, just an abstraction to PyPI (and possibly whatever R is using)
Third party proprietary tools like bintray/artifactory manage to do it without too much trouble. Honestly, there aren't really that many formats to support for Python these days— if you don't mind kicking some legacy to the curb, sdist, bdist, and whl pretty much covers it— any other splintering of the ecosystem is on the tooling side, but all the tools that matter still generate one or more of those three formats as the archive.
Actually bdist is legacy and is not even used anymore. You basically use whl (bdist_wheel), generate packages for specific platforms, and also provide sdist (source) so people that use a platform you forgot about can still use your package. If you're lazy you can just upload sdist.
I don't find it odd at all. It's likely just "languages we use" and "languages that would see enterprise value". Certainly Ruby/Javascript fall into the former, and Java/C# fall into the latter.
Not saying Python doesn't have enterprise value, but we have to consider that this is an MVP, so it makes sense for them to limit to a subset of languages they feel comfortable about.
people may be reading too much into this. notice there is no go either? maybe because we often tend to pip install and go get directly from github repos and releases? so whether they are working on proper integration or not, nothing is being missed here.
Do I want to use Github for this? I kind of like the npm model where they say "don't cache it, we guarantee as much capacity as you want to re-download packages". I use a lot of go modules, and each of our container builds ends up fetching them all. Github rate limits this and you have to either vendor the modules or provide a caching go module proxy (Athens, etc.). Meanwhile, npm just uses Cloudflare which seems happy to serve as many requests as I desire.
In general, I find that caching/vendoring dependencies is the most sane thing to do, but it's not what, say, the Javascript world appears to be doing. Do we want to move towards a service that already rate-limits package fetches when we already have a service that doesn't?
I would be shocked if GitHub rate limited this new package registry. They're just serving tarballs and static content, and it's a new system so they can fully architect it with scale in mind (i.e. a CDN). They rate limit current repository-related content because they have to dynamically generate most of it in response to requests (I assume they have caching here as well, but not static-file-behind-CDN level caching).
This isn't too surprising. Microsoft's DevOps in Azure does the same thing (or did; I haven't looked at it in a few months). There was literally no point in using it until, as you've pointed out, a user could leverage a cache. If I have a multistage build with an SDK that weighs in around 1GB, why would I ever want to use a tool that pulls that down every run?
I think, as many have said, that this is going after GitLab more than anyone else, although I can see a lot of users migrating away from Docker Hub given 1) the latest snafu/breach and 2) why keep my container repo over here and my container build pipeline over there? Doesn't make any sense and Docker Hub doesn't come with the pedigree of CDN baked in. I'm sure the same arguments work for other technologies in this consideration, but... Docker seems to continually be behind the 8-ball on the shifting field. My guess is Microsoft buys them in the next 3 years at a discount anyway. It fits their pattern of getting in front of the modern ecosystem and since Docker has leverage with containerd right now it would be an unsurprising move.
This is big. For a while we have needed a simple, intuitive and centralized artifact storage system for the modern age. I’ve been wanting to build something for ages but never made the time.
I also think that this will have the side effect of exposing a lot of people to package/build/dist tools from other ecosystems, which might help disseminate best practices outside of their walled gardens. Github helped do this with code, helping to put the spotlight on less popular or more cutting-edge languages.
This is going to solve a lot of problems for a lot of people.
This is super cool, but I worry that we've basically let a proprietary closed-source service be the de-facto standard for open source software. That really hampers my enthusiasm here.
This is going to be a huge hit for things like NPM Enterprise and Artifactory. Especially useful for small/medium teams that want to start from the get-go with an easy way to share modules that will scale as they grow.
Maybe for their SaaS but we’ll see how it’s implemented in GitHub Enterprise for on-prem. If it’s anything like LFS you’ll just be expected to keep growing your volume instead of doing something sane like supporting s3 or hell, even separate volumes.
On-prem doesn’t exclusively mean “in your data center” anymore. It’s about security and control, not where it’s hosted. They offer GitHub Enterprise AMIs for a reason.
I'm disappointed it doesn't support Python. There's not a lot of options available for private Python package hosting, it would have been good to have another one.
To host a private Python package repository, I create a simple directory tree where the first level is the package name and the second level is the package (a tarball, zip, or a wheel) and I serve that tree over HTTPS using vanilla Apache or nginx with directory listings enabled. Then I use "bin/pip -i https://packages.example.com ..." to point to that repository. It's very low tech and it turns out that's all I need.
Whenever pip can't find a package due to case or hyphen issues, I look at the access log, find out what pip is trying to retrieve, and rename things or use symlinks to fix it. Also, I manage the directory using git. (One of these days I'll try using git-annex or similar, but for now, a few gigabytes is not even close to being a burden.)
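The case and hyphen mismatches mentioned above usually come down to name normalization: PEP 503 says an index should serve each project under its normalized name (runs of `-`, `_` and `.` collapsed to a single `-`, then lowercased). A minimal sketch of computing where a file belongs in the directory tree, assuming conventional wheel and sdist filenames; `project_name` and `index_path` are hypothetical helpers and the filename parsing is a heuristic, not what pip itself does.

```python
import re
from pathlib import PurePosixPath


def normalize(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def project_name(filename: str) -> str:
    """Guess the project name from a distribution filename (heuristic).

    Wheels escape '-' in the project name, so the first '-'-separated field
    is the name. For sdists, assume the version follows the last '-'.
    """
    if filename.endswith(".whl"):
        return filename.split("-")[0]
    stem = re.sub(r"\.(tar\.gz|zip)$", "", filename)
    return stem.rsplit("-", 1)[0]


def index_path(filename: str) -> str:
    """Relative path under the index root: <normalized-name>/<filename>."""
    return str(PurePosixPath(normalize(project_name(filename))) / filename)


if __name__ == "__main__":
    for f in ["My_Package-1.0.tar.gz", "zope.interface-5.0-py3-none-any.whl"]:
        print(index_path(f))
```

Placing files at the computed path (or symlinking to it) should avoid most of the renaming that otherwise has to be worked out from the access log.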
Yeah, self-hosting a Python package index is so easy that (free) hosting solutions don’t really offer much, which is probably why you don’t see many of those. Paid services do exist (the most recent is PyDist), but you’re really paying more for hosting than the index.
p.s. I believe pip has recently fixed the hyphen problem you mentioned. Sorry for the inconvenience! Please do report any issues if they still exist.
I'm working on a SaaS service for developers, so I've built some API client libraries using openapi-generator [1] (using my OpenAPI specification.) The hardest part (by far) has been signing up for all of these different package manager services and figuring out how to release the libraries. Java and Maven was particularly difficult.
It sounds so nice to be able to release all of my packages on one centralized service. I hope they support PHP and Python soon.
I agree the artifacts should live alongside the code that produced them. But doesn’t Github killing npm, Inc. and Docker, Inc. in one move indicate Github is too powerful and, therefore, a huge liability? We need decentralized solutions, not another monopoly.
Take the JS world for instance, npm is many good&bad things, but one thing that was squeezed in the package.json spec is the ability to install packages from git repositories. And so, github already had a "package registry" for npm, and we publish npm packages to github without needing extra credentials, etc.
Watching the live stream now (https://live-stream.github.com/). Will likely use the Docker support near immediately. Hoping Singularity will be supportable as well.
I love Gitlab, but I think it's good GitHub is taking the threat seriously and that we finally have some real competition in this space and not just one dominant player.
GitLab released integrated packaging back in 2016 - starting with a Docker registry - and adding Maven and NPM in 2018. You can find our plans for adding further packaging capabilities on our public packaging roadmap https://about.gitlab.com/direction/package/
We are also embarking on making package management more secure and auditable for the users of packages with a Dependency Proxy https://about.gitlab.com/direction/package/dependency_proxy/ GitLab users will be able to block and delay packages that are suspect and trace where vulnerable packages were used. This will increase performance, cost efficiency, and the stability of your tests and deployments.
> GitLab released integrated packaging back in 2016 - starting with a Docker registry - and adding Maven and NPM in 2018.
No, the first version with "NPM support" (see my other comment as to why I don't consider it "supported") was gitlab 11.7, end of January 2019. I was really looking forward to this and was following your verdaccio (an open source npm registry) thread closely. Development then made a 180 and chose to re-implement rudimentary support for npm on top of your current package abstraction instead.
Thanks for your feedback on NPM registry support in GitLab. We release minimal viable change (MVC) and then iterate on our product functionality. Here are some of the issues we have related to NPM support:
Hey. You can probably find my name in each of the zendesk comments/issues. Thiago posted a few for me. I have been very vocal about what I feel needs to be done through my sales leads (my client has a EEU subscription).
What if they clash (e.g. a user/package exists both on npmjs.com and GitHub)? Does it go through each of the configured repositories in sequence looking for a match?
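For npm specifically, the client does not search registries in sequence: each scope can be mapped to one registry with an `@scope:registry=` line in .npmrc, and everything else goes to the default `registry`. A minimal sketch of that routing rule, assuming a hypothetical `@example-org` scope; `parse_npmrc` and `registry_for` are illustrative helpers and only handle the two keys shown, not the full .npmrc format.

```python
DEFAULT_REGISTRY = "https://registry.npmjs.org/"


def parse_npmrc(text: str) -> dict:
    """Collect registry settings; the empty-string key holds the default."""
    registries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blanks and comments
        key, sep, value = line.partition("=")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "registry":
            registries[""] = value
        elif key.startswith("@") and key.endswith(":registry"):
            registries[key[: -len(":registry")]] = value
    return registries


def registry_for(package: str, registries: dict) -> str:
    """Pick the registry by the package's scope, else the default one."""
    scope = ""
    if package.startswith("@") and "/" in package:
        scope = package.split("/", 1)[0]
    return registries.get(scope, registries.get("", DEFAULT_REGISTRY))


if __name__ == "__main__":
    # Hypothetical scope; the GitHub registry URL is used as an example.
    npmrc = "@example-org:registry=https://npm.pkg.github.com\n"
    regs = parse_npmrc(npmrc)
    print(registry_for("@example-org/some-lib", regs))
    print(registry_for("left-pad", regs))
```

Under this rule a name clash is resolved by configuration rather than by lookup order: a scoped package is only ever fetched from the registry its scope points at.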
I'm really loving the way in which Gitlab and Github are looking to diversify their value-add offerings and auxiliary services without sacrificing any existing basic git functionality or UX, and without aggressively directly competing on the same feature set.
This makes it less of "gitlab or github?" and allows developers to more easily decide to use just one for each project based on whichever service better focuses on the project's primary long-term goals.
If you find yourself in a 2-way split market on a core offering, I think this strategy by both parties is net beneficial for everyone rather than trying to directly compete on all the same features and offerings.
Everyone, if a programming language you use already has a good package registry (like ruby has rubygems), I would be extremely wary about switching to github. Don't put all your eggs in a large company's basket.
Oddly enough, before gemcutter and the refreshed rubygems.org, GitHub used to serve gems. IMHO, it was the easiest way to publish gems and it had a built-in namespace system where your username was prefixed to the gems (e.g., my Rails fork would be "nirvdrum-rails").
It took a while to clean up the mess when GitHub decided to close down its gem server. Workflows needed to be adapted, dependency lists updated, and so on. I'm certain there are still gems that never made the transition.
GitHub is a different company now than they were a decade ago, so this may be less of a cautionary tale and more of a blip in their history. The JS package management space is interesting in that it's primarily hosted by a private company, in contrast to Ruby's being funded by a non-profit. Betting on GitHub still running its package server in a decade may very well be safer than betting on NPM still being around.
Helping decentralize package managers is Protocol Labs' top priority for IPFS in 2019[1].
Seems very prescient now. Hopefully this gets adopted soon enough, while it's still easy to batch-export stuff out of registries. I really don't want MS owning the most popular editor, git host, "linux desktop", and universal package manager in the world. Edit: oh, and programming language (typescript is eating javascript.)
Is there a way I could let some CI service like Travis CI to ONLY publish the packages to this GitHub Package Registry? ONLY means I don't want to expose the entire GitHub account to Travis CI but allow only publishing to the registry. So if the GitHub key/access-token leaks somehow the possible damage would be limited by registry publishing scope. So something like scoped access tokens.
Yes. They showed in the demo that there will be a new scope for read/publish packages. So you can create a personal access token for Travis with only that scope.
Watch out Nexus and Artifactory, I've been commenting about this for a long time (on reddit). With the advent of Bitbucket Pipelines, GitLab CI and finally GitHub actions, I knew it was only a matter of time before package management was added as well.
This is fantastic, I love the idea of one stop shopping for source control, CI/CD and a package registry.
This will be very neat, just as the GitHub user experience has been so far. Centralization is questionable, but it looks like the community values convenience much more than decentralization and privacy. In any case, it is excellent to have multiple choices besides the other registries. I hope that other services like https://newreleases.io will catch up and support this registry as well. But maybe this would even make them obsolete and everything a bit more centralized.
This is really cool. I'm excited to integrate it with Nim's package manager Nimble [1].
It will be a little strange though, since Nimble packages just need to be tagged in git and then you've got a release. It doesn't seem that GitHub implemented it this way.
This is very interesting. Would this also support hosting artifacts for closed source projects without having to add every user to my Github org?
For example, I am working on a SaaS product that can be optionally self hosted. I want to provide docker images and maven artifacts for the self hosted portion but since they are closed source I don't think they belong on Maven Central or Dockerhub.
From my perspective, it would be awesome if this could (in the future) be used to properly-host Debian and RHEL packages (extending of course to their derivatives, like Ubuntu and CentOS).
I wouldn't expect that to compete with platforms like EPEL, but I think it would be great for easy distribution of programs that aren't in those wider places.
I was really hoping they would take advantage of also housing the code to mediate some of the trust issues we've seen in npm: specifically, being able to prove a binary was generated from this source code. Although I imagine that's tricky because then the build process would need to be run by them and be exposed as well..
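One way to picture "proving a binary was generated from this source" is a reproducible build check: rebuild the artifact from the tagged source and compare digests. A minimal sketch, assuming the package is just a tar archive of files and that the publisher and verifier agree on the archive recipe; `build_archive` and `matches_source` are hypothetical names, and real package formats (npm tarballs, wheels, JARs) need more care than this.

```python
import hashlib
import io
import tarfile
from typing import Dict


def build_archive(files: Dict[str, bytes]) -> bytes:
    """Build a tar archive whose bytes depend only on file names and contents.

    Entries are sorted and all metadata (mtime, owner, mode) is pinned, so two
    builds of the same source give byte-identical output.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for name in sorted(files):
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = 0
            info.uid = info.gid = 0
            info.uname = info.gname = ""
            info.mode = 0o644
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def digest(archive: bytes) -> str:
    """SHA-256 hex digest of the archive bytes."""
    return hashlib.sha256(archive).hexdigest()


def matches_source(published: bytes, source_files: Dict[str, bytes]) -> bool:
    """True if rebuilding from `source_files` reproduces `published` exactly."""
    return digest(published) == digest(build_archive(source_files))


if __name__ == "__main__":
    source = {"index.js": b"module.exports = 1;\n", "package.json": b"{}\n"}
    published = build_archive(source)
    print(matches_source(published, source))
```

The hard part the comment points at remains: anything nondeterministic in the real build (timestamps, toolchain versions, network fetches) breaks the comparison, which is why the build environment would need to be pinned or run by the registry itself.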
It looks really cool. The only fear I have is the impact of mono culture if everyone starts using the same repo and it gets compromised. Having a topology of many different repos would make open source less prone to this kind of risk.
That said, would be nice with a pkgsrc solution!
This is early and really depends on details, but this is a super exciting move and direction for Github. I'm wary of Microsoft ownership as it becomes even more of the home for code than before, but if they keep it true to its roots it could be a real positive.
So that's an interesting "4 dimensional" way to think about your software packaging, eh? When does it make sense to "push" your code to the long-term archive?
A useful intermediate step, but ultimately, a wrong move.
Packages will keep having the essential problem that there's no guarantee whatsoever that the package was derived from the advertised source. Plenty of chance for delivering malicious code.
I'm very curious how this will compare/contrast with Azure Artifacts. Will it interoperate well with Azure Artifacts? Will there be guidance for when to use GitHub Package Repository and when to use Azure Artifacts?
Now all that remains for Github to do is to add a build platform as alternative to Jenkins and a deployment framework. Source code, build platform, artifact storage and deployment - all co-located.
This is extraordinarily generous of github, but I don't think having a hundred maven repositories is a good idea. We have central and a process for putting signed artifacts there.
Does anyone know if this is actually their own CDN with PoPs around the world, or do they really mean Azure (Microsoft) or Fastly, which they were using at one point?
Excited to see someone other than JFrog and Sonatype in this space. Personally, I think the major cloud providers are missing an opportunity by not doing the same.
This question might have been answered already, but who owns the built packages? Source code is released under a license, but I don't know whether compiled packages naturally inherit the same license from the source.
What would GitHub do in the case of a `left-pad` situation?
If/when they add a build service (like Bitbucket Pipelines), they have a golden opportunity to provide a strong guarantee that a package was built from a particular Git commit (i.e. the source code wasn't modified to add malicious code). That would make me feel a lot better about using pre-built packages.
Hah, I had forgotten about that. Now they just have to integrate all the services and provide opt-in public traceability for source -> action(s) -> registry.
This solves the problem of managing private artifact repos in corp-land. If your org pays for GitHub, now you don't have to manage them. The only thing they need now is their own CI, and maybe some improved project management, and GitHub's going to be one gigantic gravy train.
Can somebody explain the technical accomplishment here? What's new about this? Github already hosts source. What do they mean by "package"? Is it just source? I don't get it.
For npm, you might publish your source 1:1 if you have a very vanilla setup, but a typical package in npm contains a built version of the code while the repo contains source, documentation, tooling, etc.
Package versions are also created explicitly and tend to contain many changes, whereas repos are commit/branch based.
With the npm example, you can tell npm to use github's package repo instead of npmjs.com, and install from or publish to that one instead. Basically another npm, but the same command line app.