I have to admit I am not impressed. My first paper, which I tried to render, does not work properly: references are removed or rendered poorly, figures are misplaced, and tables are incomplete. Given that not all arXiv papers are under a permissive license and you do not have permission to do this, I would much prefer that you at least made sure arxiv-vanity rendered papers do not show up in search results, e.g. by offering a suitable robots.txt, and that you showed a bigger link to the author-endorsed version of the paper.
Edit to clarify: If people want to use or develop a broken sort-of-PDF viewer, that’s fine. However, if someone searches for a paper of mine, I would like them to only find the version where I at least had a chance to see that it renders correctly and is complete. In particular, I do not want to be "responsible" for broken rendering on random third-party websites. This website actually operating illegally does not make me more inclined to support it.
Some papers do not render correctly, for example figures and tables in mine [1][2]. Besides that, some articles may not be posted under a permissive license, so you may want to double-check that you're not running into copyright trouble by modifying or publishing them.
This is fair. If somebody stumbles across this thinking it is how you intended it to be displayed, I can understand you'd be unhappy. We should make it clearer that we're just a conversion tool, not a source.
If you want us to remove your paper and just point at the PDF, we're happy to do so. My email's in my profile if you don't want to post the broken render here!
Thank you for your reply. Ideally I’d prefer for you to respect the license associated with each paper and only re-compile and re-host if the license actually allows you to do that (i.e. CC0, CC-BY, CC-BY-SA and maybe CC-BY-NC-SA, depending on whether you think you act commercially).
I also don’t want to keep tabs on every arXiv rehoster and inform them manually by e-mail every time a new paper goes up.
May I ask why this was not done together with the arXiv itself? I.e. have the infrastructure run there, let authors check the HTML render at the same time as the PDF render and then, if the author thinks they look ok, have them show directly on the abstracts page? This would even avoid all your license problems, as the arXiv already has the corresponding license!
I believe in most countries only a court can decide if a site is illegal or not. Not you. And as far as I know this is true in both France [0] and Germany [1].
Each arXiv paper has a well-defined license, reachable via the "(license)" link in the upper right-hand corner, below the PDF and source downloads. If they only re-hosted and re-compiled papers for which they have a license to do so, I wouldn’t complain at all, but re-hosting and modifying content without a valid license is clearly illegal, no?
> re-hosting and modifying content without a valid license is clearly illegal, no?
No.
Longer version: it's illegal only if a license is required, which is a matter of the copyright law of the jurisdiction relevant to the act. In the US, that question may turn on things like fair use analysis, which can be tricky.
Fair use applies to citations (not of the whole work), parodies and similar creative processes. Similar requirements in Germany include some creative input by the person claiming fair use, which usually should exceed the creative content taken from the original work. Simply re-compiling the LaTeX source is certainly not creative work sufficient for a fair use exception. Checking the other limits of copyright law in e.g. Germany (https://de.wikipedia.org/wiki/Schranken_des_Urheberrechts), nothing remotely applies to this site.
Could you clarify why you think that this site does not require a valid license to re-host and re-compile papers?
> Fair use applies to citations (not of the whole work)
Time-shifting is one of many examples of where copying a whole work was found to be fair use; the idea that fair use applies only to citations is very, very wrong.
Fair use is extremely precedent dependent (and very hard to predict without clear applicable precedent) because the statute law gives only factors to weigh in the analysis.
> Could you clarify why you think that this site does not require a valid license to re-host and re-compile papers?
I didn't state an opinion on that; I said that, because it skips the question of whether a license is required, the blanket statement that rehosting without a valid license is “clearly illegal” is inaccurate and overbroad.