We have a few approaches to the disappearing-data problem.
First, we are working with libraries, universities, and other groups that have large amounts of storage and bandwidth. They'd help host datasets used within their own institutions, as well as other essential datasets.
Second, we've started work on at-home data hosting with Project Svalbard[1]. It's a SETI@home-style idea where people can donate storage at home to help back up "unhealthy" data (data that doesn't have many peers).
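To make the "unhealthy data" part concrete, here's a rough sketch in TypeScript of how a donor node might decide what to mirror. The types, the MIN_PEERS threshold, and pickDatasetsToBackup are all made up for illustration; this isn't the real Svalbard/Dat API.

    // Illustrative only: these types and thresholds are hypothetical,
    // not the actual Svalbard/Dat API.
    interface DatasetHealth {
      key: string;        // dataset identifier (e.g. an archive key)
      peerCount: number;  // how many peers currently hold the data
      sizeBytes: number;  // storage needed to mirror it
    }

    const MIN_PEERS = 3; // below this, treat a dataset as "unhealthy"

    // Rarest-first: back up the datasets with the fewest peers,
    // as far as the donated disk space allows.
    function pickDatasetsToBackup(
      catalog: DatasetHealth[],
      freeBytes: number
    ): DatasetHealth[] {
      const unhealthy = catalog
        .filter((d) => d.peerCount < MIN_PEERS)
        .sort((a, b) => a.peerCount - b.peerCount);

      const chosen: DatasetHealth[] = [];
      let used = 0;
      for (const d of unhealthy) {
        if (used + d.sizeBytes <= freeBytes) {
          chosen.push(d);
          used += d.sizeBytes;
        }
      }
      return chosen;
    }

    // Example: a donor with ~1 GB free mirrors the rarest dataset first.
    const toMirror = pickDatasetsToBackup(
      [
        { key: "dataset-a", peerCount: 1, sizeBytes: 400_000_000 },
        { key: "dataset-b", peerCount: 7, sizeBytes: 100_000_000 },
      ],
      1_000_000_000
    );
    console.log(toMirror.map((d) => d.key)); // ["dataset-a"]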
Finally, for "published" data (such as data on Zenodo or Dataverse), we can use those sites as a permanent HTTP peer. So if the data isn't available from any p2p peers, you can still get it directly from the published source.
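The fallback logic is simple in principle; here's a minimal sketch, again in TypeScript, with a stubbed-out p2p lookup (fetchFromSwarm is hypothetical, not a real API call):

    // Illustrative only: fetchFromSwarm stands in for a real p2p lookup.
    async function fetchFromSwarm(key: string): Promise<Uint8Array | null> {
      // A real implementation would query the swarm for peers holding
      // `key` and download from them; null means no peers had the data.
      return null;
    }

    async function fetchWithHttpFallback(
      key: string,
      publishedUrl: string // e.g. the dataset's Zenodo/Dataverse download URL
    ): Promise<Uint8Array> {
      const fromPeers = await fetchFromSwarm(key);
      if (fromPeers) return fromPeers;

      // No p2p peers available, so fall back to the published source,
      // which acts as a permanent HTTP peer.
      const res = await fetch(publishedUrl);
      if (!res.ok) throw new Error(`HTTP fallback failed: ${res.status}`);
      return new Uint8Array(await res.arrayBuffer());
    }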
As others said, decentralization is an approach but not a solution. It gives you the flexibility to centralize or distribute data as necessary without being tied to a specific service. But we still need to solve the problem!
[1] https://medium.com/@maxogden/project-svalbard-a-metadata-vau...