It's really a pity that they do this now. Some of their older papers actually contained quite a bit of valuable information: comments, discussions, thoughts, even commented-out sections, figures, and tables. It gave a much better view of how the paper was written over time, or even of how the work itself progressed. Sometimes you also see alternative titles being discussed, which can be quite funny.
> Some of their older papers actually contained quite a bit of valuable information: comments, discussions, thoughts, even commented-out sections, figures, and tables.
Ehh, sometimes you have additional results or insightful remarks that simply don't fit into the page limit. You may want to keep those for yourself and use them for a separate publication rather than give them away.
Also true, but in my experience the arXiv version often contains the entire paper. Indeed, many conferences ask people to submit the full version to arXiv.
Aye, but in this context "full version" usually means "a version with more detailed proofs/results related to the paper's contributions", rather than "a version with additional contributions".
However, it’s pointless or even counterproductive to embed the raw high-resolution data in the paper: it doesn’t show up in the rendered copy but balloons the file size. For a 6.5” (i.e., full-width) figure printed at 300 dpi, you can only show about 1950 points horizontally, and realistically a lot less. Upload the raw traces somewhere and add a link.
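As a rough sketch of that point (assuming the trace is a NumPy array; the 1950-point budget is just 6.5 in × 300 dpi from above, and `decimate_for_print` is a hypothetical helper name, not anyone's actual pipeline):

```python
import numpy as np

def decimate_for_print(trace, max_points=1950):
    """Keep at most max_points samples by striding through the trace.

    1950 ~= 6.5 in * 300 dpi: the horizontal pixel budget of a
    full-width figure on a printed page, as discussed above.
    """
    trace = np.asarray(trace)
    if trace.size <= max_points:
        return trace
    step = int(np.ceil(trace.size / max_points))
    return trace[::step]

# A million-sample trace shrinks to something a printer can resolve,
# and the resulting vector PDF stays small.
raw = np.sin(np.linspace(0, 100, 1_000_000))
small = decimate_for_print(raw)
print(small.size)  # -> 1950
```

Naive striding loses peaks between samples, of course; for spiky data something like a min/max-per-bin reduction preserves the visual envelope better.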
Source: As a grad student, I stupidly turned a simple poster into a multi-gigabyte monstrosity by embedding lots of raw data. The guy at the print shop was not happy when it crashed his large-format printer!
Same! I've accidentally rendered a PDF monstrosity where every data point was represented in full vector graphic glory. It was absolutely enormous and dumb, because you couldn't tell that from the figure.
Generate high quality graphics, with the limitations of print, digital displays, and attention in mind. Then toss your data up on Zenodo and cite its DOI.
"Obfuscating" is the wrong word. "Decimate", "project", and "render" are all better options, depending on what you mean. Punning on "render" is the most fun of the lot, FWIW.
Many researchers learn LaTeX by looking at the idioms used for the papers they really like.
That includes code for Tikz figures.
I hope people will use this tool only to remove the inadvertent disclosure of commented regions and to reduce the file size. But keep the LaTeX source intact otherwise!
You can upload only the PDF to arXiv. Useful when, for some reason (e.g. a client request), you publish in certain engineering conferences that only allow Word submissions...
If arXiv detects that it's a LaTeX-generated PDF, it will reject it. Though it's probably possible to launder the LaTeX-generated PDF through Ghostscript or something to evade detection (I haven't tried...).
To remove comments, one can also run, for example `latexpand --empty-comments --keep-includes --expand-bbl document.bbl document.tex > document-arxiv-v1.tex`. Latexpand should come pre-installed with texlive. Without the `--keep-includes` option, it also flattens the tex files into one.
But I'd consider removing comments by hand and leaving any comments that are potentially insightful.
I wish journals would start accepting Typst[0] files. It is definitely the format of the next decade in my opinion. It's both open source and highly performant.
Sadly existing legacy structures prevent it from gaining the critical mass needed for it to thrive just yet.
Some of those are redundant (arXiv will complain if there are unused files, most commonly an accidentally added .bib file). My `make arxiv` target on papers usually just calls latexpand to cull comments and modifies all image includes so they are not in a subdirectory (then prepares a tar file with the modified source and all figures).
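For the include-flattening step, a minimal sketch (assuming figures are referenced via `\includegraphics{figs/...}`; `flatten_includes` is a hypothetical helper name, not the actual make target):

```python
import re

# Matches \includegraphics with an optional [..] argument,
# capturing the prefix and the path inside the braces.
_INCLUDE_RE = re.compile(r'(\\includegraphics(?:\[[^\]]*\])?\{)([^}]*)\}')

def flatten_includes(tex: str) -> str:
    """Rewrite \\includegraphics{sub/dir/name} to \\includegraphics{name},
    so all figures can sit next to the .tex file in the arXiv tarball."""
    def repl(m):
        path = m.group(2)
        return m.group(1) + path.split('/')[-1] + '}'
    return _INCLUDE_RE.sub(repl, tex)

src = r'\includegraphics[width=\linewidth]{figs/plot.pdf}'
print(flatten_includes(src))  # -> \includegraphics[width=\linewidth]{plot.pdf}
```

A regex pass like this is fragile against exotic LaTeX (e.g. macros that build paths), but for the common case it pairs nicely with copying every figure into the tarball's top level.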
Or, don't put your stuff on the arXiv, but put it on zenodo. You also get a DOI, and you can just publish the PDF, not the source. You can even restrict access to the PDF, and create share links with access to it.
You cannot just publish the PDF, they have checks that make sure that you didn't produce your PDF with LaTeX. There are probably ways to get around that, but why? Just use zenodo instead.
Or just publish on zenodo, without all that fuss. The reasons the ArXiv gives may be good from their point of view, but if you don’t care too much about that but have your own good reasons for not wanting to publish your source, then zenodo is a great and in many respects superior alternative, no questions asked.
Let's assume that every rule has an exception. Then this rule must have an exception as well, i.e., there is a rule with no exception. That contradicts the assumption.
So most definitely, there are some rules with no exception. The ones you are sure about should be among them.
E.g. from https://arxiv.org/abs/1804.09849:
%\title{Sequence-to-Sequence Tricks and Hybrids\\for Improved Neural Machine Translation}
% \title{Mixing and Matching Sequence-to-Sequence Modeling Techniques\\for Improved Neural Machine Translation}
% \title{Analyzing and Optimizing Sequence-to-Sequence Modeling Techniques\\for Improved Neural Machine Translation}
% \title{Frankenmodels for Improved Neural Machine Translation}
% \title{Optimized Architectures and Training Strategies\\for Improved Neural Machine Translation}
% \title{Hybrid Vigor: Combining Traits from Different Architectures Improves Neural Machine Translation}
\title{The Best of Both Worlds: \\Combining Recent Advances in Neural Machine Translation\\ ~}
Also a lot of things in the "Attention Is All You Need" paper: https://arxiv.org/abs/1706.03762v1