
Software Heritage Archive - mxschumacher
https://archive.softwareheritage.org/
======
dddddaviddddd
The article describing the archive suggests it be used for research in
computer science, as well as for preserving software culture. It also
envisions commercial uses:

> Software Heritage makes two key contributions to the IT industry that can be
> leveraged in software processes. First, Software Heritage intrinsic
> identifiers can precisely pinpoint specific software versions, independently
> of the original vendor or intermediate distributor. This de facto provides
> the equivalent of “part numbers” for FOSS components that can be referenced
> in quality processes and verified for correctness independently from
> Software Heritage (they are intrinsic, remember?).

> Second, Software Heritage will provide an open provenance knowledge base,
> keeping track of which software component—at various granularities: from
> project releases down to individual source files—has been found where on the
> Internet and when. Such a base can be referenced and augmented with other
> software-related facts, such as license information, and used by software
> build tools and processes to cope with current development challenges

[0] https://hal.archives-ouvertes.fr/hal-01590958/file/ipres-2017-software-heritage.pdf
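
To make "intrinsic" concrete: the identifier is computed from the content itself, so anyone holding the file can recompute it without asking the archive. A rough sketch, assuming the Git-compatible blob hash that Software Heritage uses for file contents and the `swh:1:cnt:` prefix from its later identifier scheme (neither is spelled out in the quote above); `content_id` is a made-up helper name, not part of any Software Heritage API.

```python
import hashlib

def content_id(data: bytes) -> str:
    """Hash `data` the way Git hashes a blob and prefix the result.

    Assumption: the "swh:1:cnt:" prefix and Git-style hashing are not
    stated in the quoted article; they are used here for illustration.
    """
    # Git prepends "blob <length>\0" to the raw bytes before hashing.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    digest = hashlib.sha1(header + data).hexdigest()
    return "swh:1:cnt:" + digest

# The same bytes always give the same identifier, whoever computes it.
print(content_id(b"hello\n"))
```

Two parties who never talk to each other get the same identifier for the same file, which is what lets it serve as a "part number".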

------
gsaga
The page says that this archive has 4,782,131,719 source files and
4,188,748,858 directories. Isn't that weird? These numbers suggest that every
4 directories hold fewer than 5 files on average. I would expect the first
number to be much higher than the second one.
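
The arithmetic behind that estimate, as a quick check using only the two counts quoted from the page:

```python
# Counts as quoted from the archive's front page.
files = 4_782_131_719
directories = 4_188_748_858

# Average number of files per directory, and per group of 4 directories.
ratio = files / directories
print(f"{ratio:.3f} files per directory")
print(f"{4 * ratio:.2f} files per 4 directories")
```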

~~~
ComputerGuru
Perhaps there are long nested paths, e.g. `/usr/local/share/foo/{many files}`
with nothing in the directories before `/foo` besides the child directory?

~~~
randoramax
The effects of com.java.some.class.there.and.here maybe :)
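
A toy illustration of how nested, mostly empty directories inflate the directory count. The paths are invented for the example, and `count_dirs_and_files` is a hypothetical helper. It only shows that the effect exists in a single tree, not that it explains the archive-wide numbers.

```python
from pathlib import PurePosixPath

def count_dirs_and_files(paths):
    """Count distinct directories (excluding the root) and files in `paths`."""
    dirs = set()
    for p in paths:
        # Every ancestor of a file is a directory that has to exist.
        dirs.update(PurePosixPath(p).parents)
    dirs.discard(PurePosixPath("."))  # drop the implicit root
    return len(dirs), len(paths)

# Invented Java-style layout: deep package paths, few files.
paths = [
    "src/main/java/com/example/app/Main.java",
    "src/main/java/com/example/app/util/Strings.java",
    "src/test/java/com/example/app/MainTest.java",
]

n_dirs, n_files = count_dirs_and_files(paths)
print(n_dirs, "directories for", n_files, "files")
```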

