Two more examples for arXiv and one for Figshare. (1) An article in a journal has a flaw. For some reason the journal is slow or unhelpful about publishing a corrected version, so the author posts a revised and perhaps much bigger version on arXiv. Filters (will these be on arXiv?) detect a copyright infringement. (2) I translate an Euler article and post it on arXiv (with all attributions). Or I post an English translation of an obscure German or Russian article by a great mathematician. (3) I post on Figshare (and GitHub?) the meat of a published article (data, procedures, results of experiments, programs) in order to make the research Open Science. I explain the context, but maybe I don't pass the automated filter, which detects a copyright infringement. For all examples, same questions as before.
If something was a copyright violation yesterday, it will still be a copyright violation tomorrow. And vice versa.
If you fear that arXiv will implement filters that will recognize that what you are uploading is a translation of an obscure Prussian research paper and block the upload, I don't think that's going to happen. (But I will be really impressed if it happens!)
Example: an article on arXiv later gets published in a journal. Will the arXiv version still be available to anybody? Only to non-EU people? Will it be removed from arXiv? None of these?
Oh, I replied to another comment. The answer is "no". The fact is that 1/3 (or anyway a significant proportion, which can be referenced) of arXiv articles later appear in journals. So it is a totally credible concern. arXiv has a licensing system under which the author grants it either a perpetual non-exclusive dissemination right or a CC-BY licence.
Depending on the journal and its licensing agreement, they may not even accept manuscripts which have been circulated as preprints. Some of them will allow you to publish the corrections made through the review process. Some of them will even allow you to distribute the final article as published. But this is essentially unrelated to the law being discussed.
Well, in mathematics and physics the usual thing to do is to first post on arXiv and then submit to the journal. ArXiv references are totally allowed. I heard that the situation is different in other fields, but as you say, this is not relevant for this discussion.
Smaller platforms are exempted from the directive (i.e., startups, single-person corporations, etc.), as is anything that doesn't do profit-oriented or active content moderation.
Why would you? The Link Tax and Upload Filter specifically affect service providers that moderate user content to generate profit. If you upload something with the explicit consent of all rightholders, then this is explicitly exempted from the directive.
So? They're not published there illegally. arXiv (AFAICT) is non-commercial, and publishing similar content elsewhere does not retroactively make the existing content illegal.
Sure, they are not published illegally. IANAL, but as I understand it, the process is like this: the author submits to arXiv and either (a) gives arXiv a perpetual NON-EXCLUSIVE licence to distribute (which does not change the fact that the copyright is with the author) or (b) chooses a CC-BY licence. The author then submits the article to Journal and, after acceptance, may transfer the copyright to Journal. This is perfectly legal, and it works like this for a significant part of the arXiv articles. See also the comments here for cases when a revised version of the article from Journal is posted on arXiv. But now, with the new EU Copyright Directive, how will this delicate process interact with the dumb one-size-fits-all automatic filters, which may detect (wrongly) that the article on arXiv infringes the copyright of Journal? What if arXiv receives a bombardment of requests from various Journals? What will they do? They are admirable, but they don't have the capacity to fight this DDoSing from journals. Or maybe my concerns are void; I'd be very happy if this is the case.