How to Read a Paper (2016) [pdf] (uwaterloo.ca)
343 points by DecayingOrganic 12 days ago | 40 comments





I know there are a lot of ML researchers and practitioners here, and unfortunately I have only very shallow experience with reading ML papers, even less so with the recent output.

But, I have to ask, how do you get a feel that the content actually looks correct, and not just only quack? The improvements are usually in the 1% range, from old models to new models, and the models are complex. More often than not also lacking code, implementation / experiment procedures, etc.

Basically, I have no idea if the paper is reproducible, if the results are cherry picked from hundreds / thousands of runs, if the paper is just cleverly disguised BS with pumped up numbers to get grants, and so on.

As it is right now, I can only rely on expert assurance from those that peer review these papers - but even then, in the back of my mind, I'm wondering if they've had time to rigorously review a paper. The output of ML / AI papers these days is staggering, and the systems are so complex that I'd be impressed if a single postdoc or researcher had time to reproduce the results.


You will never know the "generative process" of a paper, which makes it nearly impossible to properly evaluate it, even as an expert reviewer. If you are not a reviewer (or working on a competing paper), you are much better off NOT reading the newest papers. Instead, rely on the best proxy metric of all: Time. Wait until the paper has been battle-tested. See if people on social media are talking about it, what critics are saying, wait for reproduction results and open-source implementations, conference acceptances, and citations and comparisons. These give you a much better idea of the validity of the paper than its content can.

When reading the newest ML papers, I found it useful to not judge them, but instead use them as inspiration. Forget about the results. The paper may contain interesting ideas or viewpoints you didn't consider before, and those are probably much more valuable than the result table.


> The improvements are usually in the 1% range, from old models to new models, and the models are complex. More often than not also lacking code, implementation / experiment procedures, etc. [...] no idea if the paper is reproducible, if the results are cherry picked from hundreds / thousands of runs, if the paper is just cleverly disguised BS with pumped up numbers to get grants, and so on.

Personally, I'm bearish about most deep learning papers for this reason.

I'm not driven by a particular task/problem, so when I'm reading ML papers, it is primarily for new insights and ideas. Correspondingly, I prefer to read papers which have new perspectives on the problem (irrespective of whether they achieve SOTA performance). From what I've seen, most of the interesting (to me) ideas come from slightly adjacent fields. I care far more about interesting & elegant ideas, and benchmarks to just sanity-check that the nice idea can also be made to work in practice.

As for the obsession with benchmark numbers, I can only quote Mark Twain: “Most people use statistics like a drunk man uses a lamppost; more for support than illumination.”



> Basically, I have no idea if the paper is reproducible

Worry more about whether the result is generalizable. How sensitive is that incremental improvement to hyperparameter tuning, or to how the data is pre-processed, or to the specific problem domain? These days, the academic literature rarely seems to spend much time dwelling on these subjects, which is, at least in my opinion, sufficient reason for industry practitioners to shy away from the cutting edge.

I am less familiar with how this works out on classifiers, but I can say that this is the elephant in the room with topic modeling. Hyperparameter tuning and data cleaning are much more important than choice of algorithm. Perhaps even more importantly (at least if you're trying to understand different algorithms' relative merits), the method you choose for evaluating quality is critical: One setup will be clearly better if you are focused on the data's syntagmatic qualities, but perform terribly if you instead focus on the paradigmatic. And vice versa. In short, the question, "What algorithm is best?" is malformed and unanswerable.

There's an interesting paper from a while ago where it turned out that the vector space model that performed best when evaluated against the TOEFL synonymy test was good old latent semantic analysis. I find that result to be noteworthy because it's one of the few papers that took a real live test that was designed for evaluating the skills of real live humans, and used it to evaluate a machine learning model. At the same time, that in no way implies that LSA is the best fit for your sentiment analysis pipeline.


I usually do not care about a 1% improvement at all. It can be due to many factors, including overfitting to the training set.

What is important is the ideas in the paper. How do they translate to your context? If an idea is relevant, you can try it out, since it may give more than 1% on your data set :)
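To put a rough number on one of those factors, here is a minimal sketch of how reporting the best of many runs can produce a gain of about a point with no real change to the model. Every number in it is an assumption chosen for illustration (a fixed true accuracy of 90% and seed-to-seed noise with a standard deviation of 0.5 points), and the function names are hypothetical, not taken from any paper.

```python
import random
import statistics


def best_of_n_runs(true_accuracy, noise_std, n_runs, rng):
    """Best score among n_runs noisy evaluations of one unchanged model."""
    return max(rng.gauss(true_accuracy, noise_std) for _ in range(n_runs))


def average_reported_gain(n_runs, trials=2000, true_accuracy=0.90,
                          noise_std=0.005, seed=0):
    """Average apparent gain over the true accuracy when only the best run is reported.

    true_accuracy and noise_std are illustrative assumptions, not measurements.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    gains = [
        best_of_n_runs(true_accuracy, noise_std, n_runs, rng) - true_accuracy
        for _ in range(trials)
    ]
    return statistics.mean(gains)


if __name__ == "__main__":
    for n in (1, 10, 100):
        gain_in_points = average_reported_gain(n) * 100
        print(f"best of {n:>3} runs: apparent gain of {gain_in_points:.2f} points")
```

Under these assumptions the apparent gain grows with the number of runs you pick from, and with 100 runs it lands at a bit over one accuracy point. That is why a paper reporting the mean and spread over several seeds is more informative than one reporting a single best number.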


I think you have to have a deep intuition about a subject area before you can easily determine if a paper is trash or not. Even then, people who are knowledgeable in some area aren't impervious to bad research.

Just read DeepMind / NVIDIA / Facebook (A-tier AI institution) papers or NIPS orals, and go to the results first: are they really better than what you've seen before? Then read the paper.

I know it's a bit elitist, but we all have only limited time in life. Also, these institutions are usually the only ones making actual breakthroughs for money reasons: they brute-force many hyperparameters on their giant clusters, in a context where a single training run sometimes costs 15k USD.

Also most papers which "look interesting" or "edgy" are usually a disappointment.


> the only ones making actual breakthroughs for money reasons

That's also a suggestion that those aren't breakthroughs. They are just someone getting 1% because their corporate sponsor spent $250k more than the other guys.

Look at the ideas, not the results. Is there something new and is it clearly expressed? If you can't answer that in five minutes, move on. Ideas transfer, results don't.

In particular, if an ML paper abstract states a percentage improvement over SOTA and then lists five existing techniques that were combined to get the result, you can just put it directly on the trash pile.


Mm, I get the point, but I disagree that "money" only brings a small percentage. Entirely new possibilities can be uncovered. For example, I was thinking of these papers:

https://www.youtube.com/watch?v=vppFvq2quQ0

https://www.youtube.com/watch?v=XOxxPcy5Gr4

As further evidence, the entire concept of the neural network has existed since at least the 80s. Running it on new hardware (GPUs) is what made it so important. And "new hardware" is always expensive.

In 10 years, maybe every startup will have a massive cluster instead of a 4-GPU PC with today's FLOPS (which were themselves a luxury a few decades ago).


This depends entirely on your goals and reasons for reading papers. Are you a researcher on the lookout for promising new directions to explore? Trying to keep up with the general zeitgeist? Looking to implement the "best" method for a production system? Looking for tricks and tweaks and bells and whistles to incorporate in your existing system? Just starting out with the field and getting familiar with the literature? Each will require a different approach, a different selection of papers and different parts of papers to focus on.

> But, I have to ask, how do you get a feel that the content actually looks correct, and not just only quack.

If you're familiar with the particular subfield, you can spot problematic evaluation methods, how much they follow general best practices, whether they cite all relevant approaches or "curate" their tables by intentionally leaving out methods that outperform theirs, etc. You can "smell" whether something sounds plausible. If you're unfamiliar with the field, start with highly-regarded conferences like CVPR/NeurIPS/ICLR, especially orals.

Authors squeeze their methods to press out that 1% improvement, because it's difficult to get through peer review these days without state-of-the-art numbers. Many reviewers are themselves not very experienced, do not spend much time on each paper and give large weight to quantitative results.

So be aware that the primary target audience of papers is often not really the general reader, but the reviewers.

> Basically, I have no idea if the paper is reproducible, if the results are cherry picked from hundreds / thousands of runs, if the paper is just cleverly disguised BS with pumped up numbers to get grants, and so on.

If they've released their code, it can be a positive sign.

> As it is right now, I can only rely on expert assurance from those that peer review these papers - but even then, in the back of my mind, I'm wondering if they've had time to rigorously review a paper.

Depends on the venue. But don't treat peer review as some sort of verification or confirmation as truth. It's more like a spam filter. It just means that 2-4 PhD students went through it (spending perhaps a few hours on it) and found it to be worth presenting to the community.

Peer review is never about reproduction, in any science. The reviewers for a psychology journal will not recruit their own subjects and redo the experiment, for example.

There's definitely a good amount of trust, gut feelings, paper-gestalt, are-they-one-of-us and similar subjective effects at play when a paper gets accepted and the process is known to be noisy.


Can we apply the steps to reading source code?

Here's my take on adapting it to reading code:

1. Read the README if available, and read the list of source files to get a sense of how the project is modularized. Identify the entry point. Identify the type of program from the main entry point: is it a server, a CLI, or a graphical app?

2. Run a call graph analysis tool if you have one, so you can study the call graph tree starting from the main entry point. Read just the function names and start making notes of how the execution works at various levels, e.g. does it read input then enter an infinite loop, does it wait on network packets, does it use an update/render loop, etc. Also make a note of whether each function is trivial or non-trivial based on a quick glance at the code.

3. Ignore the trivial ones, and read the non-trivial ones in detail. Make note of the algorithm, data structures, and dependencies.
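If you don't have a call graph tool at hand, step 2 can be roughly approximated for a single Python file with the standard `ast` module. This is only a sketch under clear limitations: it sees direct calls to plain names (so `handle(x)` but not `obj.method(x)`), it looks at top-level functions only, and `build_call_graph`, `call_tree`, and the `SAMPLE` program are hypothetical names made up for the illustration.

```python
import ast


def build_call_graph(source):
    """Map each top-level function to the sorted plain names it calls directly."""
    tree = ast.parse(source)
    graph = {}
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            calls = set()
            for child in ast.walk(node):
                # Only direct calls such as foo(...); attribute calls are skipped.
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                    calls.add(child.func.id)
            graph[node.name] = sorted(calls)
    return graph


def call_tree(graph, entry, depth=0, path=()):
    """Return indented lines describing the call tree that starts at entry."""
    lines = ["  " * depth + entry]
    if entry in path:  # recursion guard: do not expand a cycle again
        return lines
    for callee in graph.get(entry, []):
        if callee in graph:  # skip builtins and library functions
            lines.extend(call_tree(graph, callee, depth + 1, path + (entry,)))
    return lines


# A tiny made-up program: read input, then loop forever handling it.
SAMPLE = '''
def read_input():
    return input()

def handle(line):
    print(line.upper())

def main():
    while True:
        handle(read_input())
'''

if __name__ == "__main__":
    graph = build_call_graph(SAMPLE)
    print("\n".join(call_tree(graph, "main")))
```

Reading only the function names in the resulting tree already suggests the shape of the program (an input loop, in the sample), which is the note-taking that step 2 asks for. For a real project a dedicated tool will handle methods, imports, and multiple files far better than this.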


My spouse has a rule for science papers: First, look at the pictures. She figures that people will put a lot of effort into their graphs and diagrams telling a good story.

The usual method followed by a lot of academics is: read the abstract first, then the figures, and if it's still appealing, read the conclusion. If the paper is still appealing at that point, read the whole thing. That way you can cycle through a lot of papers and stay on top of the literature.

This is a good idea. Depending on the field, there really shouldn't be any statements in the text about the findings in the paper that aren't directly backed up by "proof" in a figure. If you're familiar enough with the methodology, just looking at the figures should give you a good idea of what the authors did.

Rats! I only use xkcd style plots.

I think that's likely to get you a lot of readers.

How do I find all the interesting papers?

I like to read about things like ML, scaling, filesystems, databases, algorithms, etc.

I do get a lot of input through HN, friends, YouTube, and blogs, but I'm not getting my papers from direct sources. I don't have anything like Nature lying around either.


Try to find a survey paper or review article [1] on the field you're interested in. They summarize the current state of the art in a field and link to the relevant papers. If the linked papers are behind a paywall, then you can use arXiv, as recommended in another comment, by searching for the title/authors. The field usually wouldn't be as broad as "databases", but you could probably find one on "distributed wide column stores". I think they're usually published by grad students before they pick their thesis topic.

[1] https://en.m.wikipedia.org/wiki/Review_article


You can also go here for almost all papers

https://whereisscihub.now.sh/go


Make a list of papers that you find interesting and find out where they are published. Chances are, the venues that publish things you find interesting publish other work you'll like too.

For instance: USENIX ATC, USENIX Security, OSDI, SOSP, PLDI, ICSE, FSE, NDSS, ASPLOS, and CCS consistently have work I find interesting.


Good idea, tx :)


Yes! That's what I'm reading for 'blog' :)

It depends on the field, but for my fields, most new work is first published as pre-prints for free on arXiv (not peer reviewed yet!).

There are new entries every day. You may want to check it out.
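If you would rather pull the newest entries than browse the site, arXiv has a public query API. Below is a minimal sketch that only builds the request URL for the latest submissions in a few categories; the category codes (`cs.LG`, `cs.DB`, `cs.DS`) are just examples, and `newest_in_categories` is a hypothetical helper name. Fetching and parsing the Atom feed that the API returns is left out.

```python
from urllib.parse import urlencode

# Public arXiv API endpoint; the response is an Atom feed.
ARXIV_API = "http://export.arxiv.org/api/query"


def newest_in_categories(categories, max_results=20):
    """Build a query URL for the most recently submitted papers in the given categories."""
    search = " OR ".join(f"cat:{category}" for category in categories)
    params = {
        "search_query": search,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


if __name__ == "__main__":
    # Example categories only: machine learning, databases, data structures and algorithms.
    print(newest_in_categories(["cs.LG", "cs.DB", "cs.DS"], max_results=10))
```

Opening the printed URL in a browser shows the feed, and any feed reader or Atom parser can take it from there.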


https://paperswithcode.com/

I like that format. A short tidbit of abstract, as well as description tags.


For arXiv: there have been on average 3000 articles per month for CS.

That's about 100 papers you would need to skim every day :|

There should be an arXiv reddit mode :/


« For CS ». I mean, CS is pretty vast. There are subcategories, finer paper classifications, that you can look into. It would still be a high number of papers to look at. You need to learn to skim a title and abstract in ~30 secs and decide if it's worth your time. Also, use Google Scholar alerts.
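Putting the two figures from this subthread together (about 3000 CS articles per month, and ~30 secs per title and abstract) gives a feel for the daily cost of skimming everything. The 30-day month is an assumption, and `daily_skim_load` is a hypothetical helper name.

```python
ARTICLES_PER_MONTH = 3000  # figure quoted in the thread for arXiv CS
DAYS_PER_MONTH = 30        # assumption: a 30-day month
SECONDS_PER_SKIM = 30      # "~30 secs" per title and abstract


def daily_skim_load(articles_per_month, days_per_month, seconds_per_skim):
    """Return (papers per day, minutes of skimming per day)."""
    papers_per_day = articles_per_month / days_per_month
    minutes_per_day = papers_per_day * seconds_per_skim / 60
    return papers_per_day, minutes_per_day


if __name__ == "__main__":
    papers, minutes = daily_skim_load(
        ARTICLES_PER_MONTH, DAYS_PER_MONTH, SECONDS_PER_SKIM
    )
    print(f"{papers:.0f} papers/day, about {minutes:.0f} minutes of skimming")
    # prints: 100 papers/day, about 50 minutes of skimming
```

So all of CS costs close to an hour a day at that pace, and narrowing to a subcategory cuts the time in proportion to the share of papers it covers.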

Is there a text that will explain the difference between a paper, an article, a manuscript, a monograph and all the other words often used to describe different kinds of written scientific material?


A dictionary? ;-)

But seriously, if you want to know the differences look up the etymologies and then keep your eyes open for how each term is used in practice. That's all there is to it.


My dictionary defines manuscript as something written by hand rather than typed.

Not exactly something that illuminates the usage in scientific literature


Doesn't it though? If you look at the etymology, manus (hand) + script (write) directly gives you "hand-written", and then a hand-written document is more likely to be the work of a single author, more likely to refer to an original than a copy, and more likely to be old (before printing, typewriters, or computers). Everything else is just associations that a word picks up over time or in a specific community.

This reminds me of How to Read a Book[1], which is also a great read.

[1] https://en.m.wikipedia.org/wiki/How_to_Read_a_Book


I tried to read this paper using his approach but I didn’t know where to start..

Came here to say the same thing, you beat me to it!

This is like a level 1 joke: there is an easy, established pattern to it, kind of slapstick.

I want to know the content of the paper before reading the paper. The struggle is real...

Read the abstract?

There's a 2016 updated version here: https://blizzard.cs.uwaterloo.ca/keshav/home/Papers/data/07/... Should the link be updated?


