Ask HN: Why are we hanging on to anachronistic academic paper formats
29 points by dcreater 8 months ago | 23 comments
Academic papers are dense to read not only because of the technical content but also because of the format/layout: the 2-column format, the distinct table styling, etc. isn't a very user-friendly or approachable look. Is there a good reason why this doesn't change? Is it just that academics don't value aesthetics?



I think it's not so much that academics don't value aesthetics or don't want to change as that the alternatives are NOT _strictly_ better. It's all tradeoffs. And we are used to changing templates to submit to different journals, so it's not that much of an issue.

Extensive or summarised background? -- You don't want extensive background material padding every paper out with several more pages, so once you're familiar with 'the literature' (the key background details), often you can quickly skim and get an idea of the baseline of the paper just from specific citations

Citation formats? -- some citation formats are better than others...I usually publish in journals using IEEE/numeric, which can be denser but requires flicking to the Bibliography...names can make it easier to recall the gist of a paper if it's a famous one like the 'Hinton' or whatnot, but can really pad out content if you have strings of names inline

Papers too short/page limits? -- You don't want pages and pages of content, otherwise the paper is likely contributing _more than one idea_, and so is less focused (and also kind of dilutes the review process). If a paper contributes A and B, but B is weak, do you still let it through if A is great? If the contributions are distinct, you can easily have an A paper and then request some more work for B.

2-column? -- 2-column and dense can make it easier to quickly jump from section to section, or get more information on a single page...so when reading in print format (or digitally with full-page view), you can easily see what was said earlier on a page, whereas 1-column tends to have much larger font so less content-dense.

I don't think we are at optimum, but I don't think ANY new format that's been proposed gives such a noticeable benefit to some of these areas as to overtake. Fully digital with hovers and links and stuff may be useful, but would completely regress when printed out for reading and annotating.


2 column is so helpful in reading the paper imo. It's easier to jump back and see previous parts, and there's no need for page-width equations that use a lot of empty space. And basically the style is enjoyable for me; I really don't see what is not approachable.

Having a different style of paper will not magically make it easier to understand. Focusing too much on the format instead of the content is counterproductive. And having somewhat standardized sections allows an academic with enough experience to jump to the juicy parts.


It's absolutely trash for accessibility. Most PDF readers can reflow text when the file uses only one column, but not with two-column files. I.e., reading two-column articles on cell phones is pretty awful. The real reason most articles are two-column is because "it looks more professional", which is the kind of cargo-culting OP rails against.


I read most of my papers on my phone without any issues, I have absolutely no idea what you are talking about.

With Adobe PDF, you double click on one column and it zooms onto it perfectly; you double click again and it goes back to full page. Navigation is super easy, and the zoom feature fits the column to my screen perfectly.


Then you must not have tried many PDFs, because there are many where the reader can't detect the columns. Even when column detection works, you can't increase the font size beyond what the PDF uses, as the text won't be reflowed. It's of course an even bigger pita for those relying on text-to-speech to read PDFs.


I don't know, something around 1 to 3 per day for the last 7 years maybe. I read and write papers for a living. I never really had any problem. I mostly have issues with epub like formats.

Of course tts is a different beast, but try to read science in general with tts: it doesn't make sense. You need time to process each sentence and put it into context, so even with reflow, tts would be a nightmare for science content.


> I mostly have issues with epub like formats.

Epub is great for content that is mostly text (novels,…). As soon as there are objects that need to keep their relative distance to each other (equations, diagrams), I go for PDF. It's also not easy to get good typesetting for epub files. I believe that typesetting gives you a visual map that provides an additional memory boost. Doing focused reading on a phone is a no-no for me. I use my tablet for PDFs and my e-reader for epubs.


Is the styling of the paper really a barrier to understanding? Like, sure, this _could_ benefit outsiders, but I don't think most technical papers are written with the purpose of being accessible or aesthetic. And that's absolutely not a fault of academia.


Why should academics torture each other with poorly formatted papers?

We don't like reading these newspaper-column style papers any more than lay people do.


You don't like it. I know many academics, and I'm one of them, who very much like 2-column, information-dense papers.


Most humanities papers/journals have a much more readable and approachable single column format. The flipside, however, is that they eschew much of the clearly defined structure of scientific papers.

Scientific papers are formatted this way because they are still, generally, actually published on paper. 2 column layouts, for instance, are actually quite readable on paper. Online articles tend to have single column styles.

There is also very little "fiddling" in academic publishing. Journals/publishers tend to pick a template and styling system and funnel everything into that single style regardless of content. They are usually thinly staffed and want to enforce aesthetic standardization article to article to reduce typesetting workload and to ward off any claims of favoritism.


Format is anachronistic, but other things are even worse.

For instance, a typical review process in Nature or Cell takes around 12 months these days. Lots of waiting, back and forth messages, and waste. If you get rejected, switching from one journal to the other implies massive changes to format and rewriting, as the length constraints could not be more different.

Computer Science has a much more dynamic publication culture.


Probably because everyone needs to make an academic career nowadays. To make an academic career you need to get published in prestigious journals or conferences. These prestigious publishers are rather established and hence still stick to their ancient LaTeX template.

In an ideal world, every publication would be more similar to a Git repository or a website that contains all the code for the statistical evaluations and, of course, the raw data that was collected initially. The code would document why a certain sample was excluded etc. Of course, the results wouldn't be presented in dry tables or bar charts. There would be interactive visualizations that support the understanding of the data and results. All this would be integrated into the text. Of course, such a text would have real hyperlinks to other works without the need to go to a reference section and look up a publication.

If a publication failed to replicate, the original authors or their academic offspring would add a remark that this work is superseded by some other work.

Needless to say, all this has already been proposed in publications, ironically in PDF. Also noteworthy is that Berners-Lee proposed the WWW 35 years ago, which basically implements this.


Forget the layout and such, at the very least academia needs to completely get rid of PDF as a format.

It is completely unreadable on screens of different formats and makes accessibility impossible, narrowing the field of possible readers even further.

We've had HTML since what, the beginning of the WWW? How are we still using PDFs in 2024? Completely absurd.


I'm an academic mathematician; feel free to ask me particular questions.

Two column format is not typical in mathematics; here is an example of one of my papers, with formatting that is highly typical for my field:

https://arxiv.org/pdf/2404.00541.pdf

I'd say that I do value aesthetics; I like this look, for the same reason that I appreciate HN's look and feel. It is simple, functional, and to the point.

Personally I dislike most of the trends in web design. The web feels increasingly designed by marketers who apparently want to measure or guide how I browse their sites -- rather than by someone who just wants to provide me useful information in an attractive and organized fashion.

For example, here is the current website for the University of Wisconsin's mathematics department:

https://math.wisc.edu/

and here is the 2004 version:

https://web.archive.org/web/20040619085511/http://www.math.w...

I prefer the old one.


I agree the format/layout is archaic and not friendly. There’s a reason why no website adopts the same design approach. But I think the reason is not just that aesthetics are not valued - you could conceivably choose any number of other unattractive approaches - why just the one dominant format? I think the reason why one particular approach persists is because it grants academic papers an (undeserved?) appearance of legitimacy. This is almost an unspoken bar for acceptance in the community, due to some mix of momentum, social cohesion, and inward facing norms. Journals and their process of curation, review, and editing all influence this process as well. The centralized gatekeeping of paid journals lends itself to a continuation of the existing practices.

I don’t think this discussion is applicable to just format/layout though. The writing style of academic papers is also anachronistic/archaic. I feel like most are written in an unnecessarily awkward way that impedes understanding. It’s almost an attempt to elevate the “complexity” of the content by choosing difficult writing styles, overusing jargon, and under-explaining. I’ve seen this style of writing previously defended in the same way as the aesthetics, that it is a matter of the content being “technical” or “scientific”. But if you look at any public facing post from companies that write deeply technical blogs or other such things, you’ll see there are definitively better alternatives to communicating deeply technical content.

These norms need to be broken. I think a paper should be given more weight if it is easy to consume. This means free access, nicer designs, simpler language, more transparent explanations.


There is some scientific publishing outside the standard format, sometimes corresponding to a conventional paper. To pick one I liked recently, https://www.cs.utexas.edu/~lqiang/rectflow/html/intro.html is nicer looking than the corresponding arXiv paper and has some animated figures that help visualize the process.


I find recently published academic papers are using a more modern format/layout, and have abandoned the 2 column format. It’s probably just a slow process getting people to adapt to change, and I doubt they go back and change the layout of older papers.


As someone that occasionally reads academic papers (but doesn't write them), the only change I really want is for an explicit publication date.


Do you have a better LaTeX template? Nothing but TeX does the math rendering well, so we're stuck with that.

You'll notice that GPT-4 Technical Report uses single column https://arxiv.org/abs/2303.08774

So some variants exist. They print well. Colours are more annoying to print.


Agree, but it's a format that is so ingrained in academia that it's not likely to change. But with the advent of LLMs that will readily read the paper and summarize it in the tone of voice of your choice, this doesn't seem like an obstacle for people outside academia any more.


We are seeing a transformation where AI papers have a website that has code/paper/video and high level explainer. The paper is now subservient to the rest of the body of work.

The one thing I don't like is that many venues have a 10 page max, so some papers end up getting golfed down to 10 pages and become impossible to read.


The paper is the disaster-proof archive format. Pre-apocalypse, there's no reason for anyone except the journal committee to read the content of a published paper.

Any academic who wants to be read publishes in a more readable version of their research.

I'm not sure how, but sometimes a well-formatted paper gets published. Example:

https://maartenfokkinga.github.io/utwente/mmf91m.pdf



