How would you set it up? The decentralized world doesn't really have a great system for curation at this point (unless you can point to a counterexample!), and so I'm in favor of any sort of playing around with decentralized voting/curation until we find something that seems to be working well.
Start from the objective of "first, do no harm." Voting systems may eventually be gamed to distort results, so eliminate the voting system. Instead, rely on ad-hoc personal networks to disseminate signal about quality papers out-of-band. Don't assume you have to systematize everything.
Voting (as was borne out by many examples, including digg.com and elsewhere) becomes mob rule and a variation on the tyranny of the majority unless there is a novelty algorithm in addition to total votes. If you just go by totals, it will be easily gamed and rendered useless as a metric.
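For illustration, the kind of novelty weighting I mean can be as simple as a time-decayed score, something in the spirit of HN-style ranking. The constants below are made up, not tuned values:

```python
import time

def rank_score(votes: int, submitted_at: float, gravity: float = 1.8) -> float:
    """Total votes discounted by age, so fresh submissions can still surface.
    The constants here are illustrative, not tuned values."""
    age_hours = max(0.0, (time.time() - submitted_at) / 3600.0)
    return votes / ((age_hours + 2) ** gravity)

# A day-old paper with 50 votes vs. a brand-new one with 5 votes:
print(rank_score(50, time.time() - 24 * 3600))  # heavily decayed
print(rank_score(5, time.time()))               # still competitive
```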
Actually, I don't think science is democratic in nature. Yes, in practice we do treat it somewhat that way, since a theory still needs to be widely accepted. But in reality one person can have the correct idea while all others disagree, and that person is still doing it right.
I believe science is a democratic process. If someone has the correct idea but communicates it poorly, so poorly that others in the field disagree, then this person is doing it wrong. (Thinking specifically of https://en.m.wikipedia.org/wiki/Shinichi_Mochizuki )
The participation ought to be democratic in the sense of being open to everyone. But you can't hold a vote and use it to decide who is right. Deep down we know that being right or wrong is independent of the scientific consensus. Mochizuki may be interacting with the scientific community in the wrong way, but that has no bearing on whether his theory is correct.
The consensus itself has some democratic features, but it's weighted by prestige and adherence to the current paradigm. I think Kuhn described its mechanism pretty well. It's far easier to convince people of a wrong result if you follow the established paradigm than to convince people of something right if you go against it. What really saves science from being pure dogma is that there are paradigm shifts, revolutions in which the scientific consensus changes.
All in all, it is a non-trivial problem. At the very least you have to attach some form of reputation system to the verification process. Even with that you will still have the "misunderstood genius" issue, or the "excellent-reputation professor" that everyone trusts without (enough) verification.
But at least there'd be a system for other researchers to record "failed to replicate", giving a channel for critiquing reputable professors that isn't controlled by those same professors (as journals often are).
Scientific consensus is democratic in nature (even though votes are not distributed evenly). The ideal is that through reproducible experiments and application of the scientific method the scientific consensus moves to increasingly accurate models of reality over time. But obviously the speed at which that happens varies, and some right ideas took annoyingly long to get accepted into scientific consensus.
Sure the right answer will eventually prevail but the process is much worse than we like to admit. Many breakthrough advances were outright rejected by contemporary peers when first proposed.
"Fermi first submitted his "tentative" theory of beta decay to the prestigious science journal Nature, which rejected it "because it contained speculations too remote from reality to be of interest to the reader." Nature later admitted the rejection to be one of the great editorial blunders in its history. ... Fermi found the initial rejection of the paper so troubling that he decided to take some time off from theoretical physics, and do only experimental physics" https://en.wikipedia.org/wiki/Fermi%27s_interaction
Using Wikipedia as an example of a seemingly naïve idea that was ultimately proven to work is a pretty bad argument that completely ignores how Wikipedia operates at the moment.
Larry Sanger has made something of a career out of being "the cofounder of Wikipedia who thinks it's getting it all wrong". There's a point at which the latest iteration of his criticism ceases to be a stop-the-presses newsworthy event.
Sanger wrote a great set of essays, largely based on the lecture notes of courses he taught as an academic, that seeded Wikipedia with a load of freely licensed content that kickstarted the whole enterprise. It's quite possible that without this initial burst of momentum, Wikipedia would have failed. For that he has earned and will never lose recognition. But the negative part of his critique of Wikipedia is no more searching than the critique Wikipedia editors perform on themselves without his help, and his series of suggestions for positive alternatives has lost credibility because his ideas never work.
I still pay attention to what Sanger says, but not with a high expectation that what he says will be exceptionally insightful.
In all my experience using Wikipedia, it has been successful at providing facts and accurate references.
I don't mean to attack the speaker here, but that former cofounder of Wikipedia you just cited... isn't he an extremist neo-conservative? Why did he leave Wikipedia in the first place? What are his proposed solutions?
Hey Hugo, do you have an email I could reach you at? I've been thinking/working on these problems for 3 years now and would love to find some smart people to partner with to further develop the ideas. I don't have much to show publicly right now, but https://intpub.org/ (soon to be scipub.app) is the start.
Great question! Search is (I believe) a second-order problem. Once permissionless publishing is solved, extensions can be built onto the core protocol.
In my opinion, indexing and ranking services work much better as ancillary services. The bundling of indexing and ranking with core protocol features is one of my main contentions with the Coinbase-backed "Research Hub" project/company (https://researchhub.org).
If you think about it, journals are important services, but they serve two functions right now:
1. Publication/information & data storage
2. Information indexing and ranking
Really, I think the whole contention about open/closed science is directed in the wrong place. Journals shouldn't be hosting information! They should be indexing and ranking information.
The information should be stored on an open, decentralized substrate where no one needs to go through authentication steps to view it. Then, journals can maintain their closed/open persuasions, offer bundled services such as discovery, and keep their clout without really needing to change their core offering.
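Concretely, the split I'm imagining looks something like this toy index entry (field names are invented, not part of any existing protocol): the journal keeps only curation metadata plus a content hash, and the paper itself lives on the open substrate.

```python
from dataclasses import dataclass, field

@dataclass
class IndexEntry:
    """What a journal would keep: curation metadata plus a pointer to open storage."""
    title: str
    authors: list[str]
    cid: str                 # content hash on the open substrate (e.g. an IPFS CID)
    editorial_score: float   # the journal's own ranking/curation signal
    tags: list[str] = field(default_factory=list)

# The journal never hosts the paper; anyone can fetch it straight from the CID.
entry = IndexEntry(
    title="Example paper",
    authors=["A. Author"],
    cid="bafy...placeholdercid",   # placeholder, not a real hash
    editorial_score=0.92,
    tags=["physics"],
)
print(f"https://ipfs.io/ipfs/{entry.cid}")
```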
Also, the git repo link needs to be updated, and there's nothing there right now really, but https://github.com/scipubapp is the right link!
That's fine. Maybe then, for a better user experience, grey out or remove the IPFS download button for the articles that haven't been uploaded yet.
It's still very slow - retrieval times >30s for files that aren't cached on cloudflare or ipfs.io. Also, those two providers each have multiple periods of downtime every year.
If the files are cached and the services are up, it's plenty fast for static data but dynamic data (IPNS) is still very slow.
We've built a competitor called Skynet that is much faster (less than 200ms for files that aren't in the cache) and scales better. It's currently hosting tens of millions of files across 200+ TB of data.
We really like the vision that IPFS had and we think decentralized data is the future of the Internet. We're proud to have put in the legwork to make it practical.
The blockchain gets us a decentralized marketplace for decentralized storage providers. Anyone can join as a provider and get paid, and the blockchain can act as a decentralized escrow that holds the payment until proof is provided that the storage contract was properly fulfilled.
98% of our technology is off-chain. Only a little tiny sliver (the file contract open and close) is actually posted to the blockchain.
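To sketch that split (a toy model, not Sia's actual contract code; real storage proofs are Merkle proofs over random segments, whereas here the "proof" is just the data itself): only the open and close of a contract touch the chain, with payment escrowed against a data commitment, while storage and proof generation stay off-chain.

```python
import hashlib

class Chain:
    """Stand-in for the blockchain: escrows payment until a proof matches the commitment."""
    def __init__(self):
        self.contracts = {}

    def open_contract(self, cid, renter_payment, host_collateral, data_root):
        # On-chain step 1: lock funds against a commitment to the data.
        self.contracts[cid] = {"escrow": renter_payment + host_collateral,
                               "data_root": data_root}

    def close_contract(self, cid, proof):
        # On-chain step 2: pay the host only if the storage proof checks out.
        c = self.contracts[cid]
        if hashlib.sha256(proof).hexdigest() == c["data_root"]:
            return c["escrow"]   # released to the host
        return 0                 # host forfeits; the renter would be refunded

# Everything else (upload, serving, challenge selection) happens off-chain.
data = b"contents of paper.pdf ..."
chain = Chain()
chain.open_contract("c1", renter_payment=10, host_collateral=5,
                    data_root=hashlib.sha256(data).hexdigest())
print(chain.close_contract("c1", proof=data))  # 15
```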
I'm writing a more direct comparison this week. We just recently (less than 30 days ago) hit full feature parity with IPFS; any webapp or file deployed on IPFS should work natively on Skynet now as well. The link/identifier will be different, but you shouldn't need to change any code.
You can always run a host yourself and pin it to your own host.
The main reason we chose a host-based architecture instead of a pin-based architecture is that we saw on IPFS that having people pin their own data resulted in really poor uptimes and a lot of file rot, and it also substantially reduced scalability and increased fetch times. And after all of those tradeoffs, the vast majority of accessible content on IPFS is hosted via a pinning service anyway.
Makes sense. I've just been looking at creating a distributed YT archive on IPFS, but as you said, load times are absolutely terrible, especially for big files like video. I've been following Sia since 2016-ish, and Skynet looks awesome, just worried about maturity. I will try hosting my own files as you described, thanks!
> You can always run a host yourself and pin it to your own host.
(It sounds like it, but) to clarify, can you do this completely for free, with no cooperation from a third party (e.g., you don't need to pay an existing host to vouch for you)?
I got turned off it because things that I'd pinned, and knew I'd pinned, would eventually become mysteriously inaccessible unless accessed through the node I'd pinned them on, and I couldn't work out why. It's a great idea, but I found it too flaky to satisfy me.
Built this during a hackathon because I felt arXiv was in need of a small facelift. Currently not all articles are uploaded. The repository is here: http://github.com/hugoroussel/xirva
viXra is full of nonsense, but then again so is arXiv (see: the 750 GeV debacle). viXra desperately needed a voting system; if it had one, it would likely have been much more useful and become a viable alternative to arXiv.
arXiv does not accept papers from authors with no institutional affiliation and viXra was the only (and ugly) alternative. There is an opportunity there to fix both sites.
A spurious signal at the LHC (that disappeared in later runs) that spurred a cottage industry of arXiv submissions trying to explain it. Authors, mostly grad students, quickly submitted hundreds of low-quality articles trying to get in on the "discovery". Later articles would cite over 400 previous articles and it became one giant circle jerk. There were even blog posts complaining about the blatant "ambulance chasing".
> jupyter-comment supports a number of commenting services [...]. In helping users decide which commenting and annotation services to include on their pages and commit to maintaining, could we discuss criteria for assessment and current features of services?
> Possible features for comparison:
> * Content author can delete / hide
> * Content author can report / block
> * Comments / annotations are screened by spam-fighting service
> * Content / author can label as e.g. toxic
> * Content author receives notification of new comments
> * Content author can require approval before user-contributed content is publicly-visible
> * Content author may allow comments for a limited amount of time (probably more relevant to BlogPostings)
> * Content author may simultaneously denounce censorship in all its forms while allowing previously-published works to languish
FWIW, archiving repo2docker-compatible git repos with a DOI attached to a git tag is possible with JupyterLite:
> JupyterLite is a JupyterLab distribution that runs entirely in the browser built from the ground-up using JupyterLab components and extensions
With JupyterLite, you can build a static archive of a repo2docker-like environment so that the ScholarlyArticle notebook (or Computer Modern LaTeX CSS), its SoftwareRelease dependencies, and possibly also the Datasets can be run in a browser tab with WASM.
HTML + JS + WASM
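For example, a build along these lines should produce that static archive (the repo URL and tag are placeholders, and the `jupyter lite` flags are the ones I recall from the JupyterLite CLI, so double-check them against the docs):

```python
import subprocess

# Hypothetical repo2docker-compatible repo and the tag its DOI points at.
repo = "https://github.com/example/paper-repo"
tag = "v1.0"

# Check out the archived release, then build a static JupyterLite site from it.
subprocess.run(["git", "clone", "--branch", tag, "--depth", "1", repo, "archive"], check=True)
subprocess.run(["jupyter", "lite", "build", "--contents", "archive", "--output-dir", "site"],
               check=True)
# `site/` is now plain HTML + JS + WASM that any static host (or IPFS) can serve.
```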
Well, as anyone can and will publish, it will be stuffed with junk, porn, trash of all sorts, so my plan was to implement a filter to ignore any data published outside of one’s WOT.
Furthermore, a user searching for “reviewed” data and papers would normally filter for items with enough “endorsement” metadata items signed by known WOT actors.
I haven't figured out a mechanism to prevent “review rings”, although since everything is totally transparent, it should be easy to spot them.
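A bare-bones sketch of the filter I have in mind (names invented, signature verification elided): drop anything authored outside the WOT, and only treat an item as "reviewed" once enough WOT members have signed endorsements for it.

```python
def accept(item: dict, wot: set[str], min_endorsements: int = 2) -> bool:
    """Hide data published outside my web of trust, and require enough
    endorsements signed by WOT members before calling it 'reviewed'."""
    if item["author_key"] not in wot:
        return False
    endorsers = {e["signer_key"] for e in item.get("endorsements", [])
                 if e["signer_key"] in wot}      # signatures assumed already verified
    return len(endorsers) >= min_endorsements

my_wot = {"alice_pub", "bob_pub", "carol_pub"}
paper = {"author_key": "alice_pub",
         "endorsements": [{"signer_key": "bob_pub"}, {"signer_key": "carol_pub"}]}
print(accept(paper, my_wot))  # True
```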
Hmm, my Firefox here on Ubuntu does not trust that CA for some reason. And clicking "Accept the risk and continue" seems to just land me back at the "Warning: Potential Security Risk Ahead" page.
Wow, it looks super nice and feels so easy to navigate and read. I was thinking it would be nice to have an RSS feed for some topics, to read as news.
Great work!
Since it was for an ETHGlobal hackathon, I thought it would be fun to experiment with a feature where we mint an NFT whose metadata points to the IPFS link. You could then do whatever you normally do with an NFT.
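For anyone curious, the metadata is roughly the usual token-metadata JSON shape with a URI pointing at the article's IPFS CID; the values below are placeholders, not the actual hackathon output:

```python
import json

article_cid = "bafy...placeholdercid"   # placeholder; the real CID is the article's hash on IPFS

# The NFT only carries this pointer; the article itself lives on IPFS.
metadata = {
    "name": "xirva article #1",
    "description": "Token whose metadata points at the IPFS copy of the article.",
    "article_uri": f"ipfs://{article_cid}",
}
print(json.dumps(metadata, indent=2))
```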
I wonder how much information you get from arXiv regarding the dependency graph via citations. Could there be a way for me, as I upload my own manuscript, to tip the authors I cited, and conversely to someday have the possibility of generating revenue personally the same way? It would be much nicer, I think, if the fees that are currently paid to journals instead went to the authors who also contributed to my work.
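To make that concrete, here's a toy version of the split I'm imagining, with made-up numbers: the fee that would have gone to a journal instead goes, evenly, to the cited authors.

```python
def split_tip(total_fee: float, cited_authors: list[str]) -> dict[str, float]:
    """Naive even split of a submission fee among the authors I cited."""
    if not cited_authors:
        return {}
    share = total_fee / len(cited_authors)
    return {author: round(share, 2) for author in cited_authors}

# Hypothetical: a $100 fee (what a journal might have charged) split across four citations.
print(split_tip(100.0, ["author_A", "author_B", "author_C", "author_D"]))
# {'author_A': 25.0, 'author_B': 25.0, 'author_C': 25.0, 'author_D': 25.0}
```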
There are for sure interesting ideas related to new forms of scientific funding. The issue I have, and why the project is currently on standby, is how to combat spam/hoax articles.
Oh dear.