My collection of machine learning paper notes (notion.so)
330 points by yobibyte on April 21, 2021 | 32 comments



I think these notes are great, and Vitaly certainly seems like a great person, judging from his Twitter (I've been following him for a while now). I just want to spell out the obvious: the biggest (and probably the only) beneficiary of such structured notes is the note-maker.

The beginners who come in feeling excited that this will be a great learning resource are probably missing the point. Learning happens when you force yourself to create notes by finding structure in the raw text. Notes are extremely personal, and reading someone else's does not have the same emotional connection.

Am I suggesting you stop reading notes made by others? Absolutely not! I am suggesting you double down on that instead, except _always_ make your own notes if the objective is learning. Use the excellent public notes to build your own mental models of what makes for good notes.


> the biggest (and probably the only) beneficiary of such structured notes is the note-maker.

I would probably agree with the "biggest", but disagree with the "only":

* The readers might use a note as an extended abstract when selecting a paper to read. This is like a short conference talk, which is for advertising the paper and inviting people to the poster session.

* The authors get feedback about their research, and some of them engage in a discussion as well.

Having said that, I agree that taking your own notes is better for you.


>beneficiary of such structured notes is the note-maker.

The same is true of most textbooks as well; most people write textbooks for themselves (and then publish them so as not to have nothing to show for a year of work). I saw that somewhere and it's changed the way I approach reading textbooks (no longer do I take it for granted that one presentation is /the/ presentation).


I don't quite agree with this reduction. A good reference textbook is specifically designed to convey a clean linear story of the otherwise ugly conceptual development of research ideas. Notes are personal. Textbooks are a deliberate transform of those notes meant to convey structure in ideas to the average person in the target audience.

I find it funny that someone would go through the pain of undertaking an endeavor as large as writing a textbook just for themselves. For that, they already have their notes. If you are hinting that writing textbooks (good or bad) has professional consequences, sure. Are they wrong in doing so? I don't see why they shouldn't reap the fruits of good exposition.

Stretching the argument further, you might as well explain almost every action as "people do X for themselves". Kevin Simler explores this theme in detail [1].

[1]: The Elephant in the Brain: Hidden Motives in Everyday Life (https://www.librarything.com/work/19982533/book/195649617)


>A good reference textbook is specifically designed to convey a clean linear story of the otherwise ugly conceptual development of research ideas.

Keyword: good. I said most and I stand by that: most textbooks suck and serve only to order the concepts in a way that makes sense to the author.


It is a gradient. You can have notes where the author went to great lengths with that effort, and textbooks where they didn't.


Very nice. I especially like the structure (What?, Why?, How?, And?), which is shared across every note. Though it has fallen out of popularity, RSS would be useful. Overall the concept kinda reminds me of [the morning paper](https://blog.acolyer.org). I wonder if there are similar attempts for other fields (math, physics, ...).


Thanks! This structure helps a lot when comparing your work to the related literature. I chose Notion because editing (and adding a new note) causes as little pain as possible, since it's hard to keep up a pace of a paper a day. I do not know how to add RSS support here though =( But I'll keep that in mind.


I thought they meant a pile of paper notes you could use for machine learning OCR or something.


I thought it was going to be banknotes generated by ML based on the world's currencies.


Wait for my ICO!


NFTs for paper notes


Hi Vitaly! Nice to see you here on HN. :)

I wonder how long you can keep up doing this. I was once motivated to read a lot as well (although not strictly one paper a day), but as more and more deadlines piled up (paper submissions etc.) and I approached the end of my PhD, I gave up. Now that this is (mostly) over, I want to read more again.

Also, I can recommend keeping a balance between papers close to your own research area (those are a must anyway, if you are serious about it) and papers from further away. If you can manage to adopt techniques from other areas/fields, this usually results in great things.


Hi Albert! Long time no see =)

I'll probably slow down at some point, but I think atm reading stuff gives me more ideas and a general understanding of what I want to work on and what not. There are drawbacks as well, since some papers take a lot of time to read in depth, and a day is def not enough to get a proper understanding.

Re your advice, that's a great point! How do you select papers outside of your comfort zone?


Hm, that's difficult. Automatic speech recognition (ASR) is probably my comfort zone by now.

So already most pure DL papers are out of this zone, but I read many of them anyway, when I find them interesting. Although I tend to find it a bit boring when you just adapt the next great model (e.g. Transformer, or whatever comes next) to ASR, even though most improvements in ASR are just due to that. You know, I'm also interested in all these things like Neural Turing Machines, although I never really got a chance to apply them to anything I work on. But maybe to language modeling. Language modeling is great anyway, as it is conceptually simple, you can directly apply most models to it, and (big) improvements usually carry over directly to WER.

Attention-based encoder-decoder models started in machine translation (MT). And this was anyway something part of our team did (although our team was mostly divided into an ASR and an MT team). And once that came up, it was clear that this should in principle also work for ASR. It was very helpful to get a good baseline from the MT team to work on, and then to reimplement it in my own framework (by importing the model parameters in the end, and dumping the hidden states during beam search, to make sure it was 100% correct). And then take the most recent techniques from MT and adapt them to ASR. Others did that as well, but I had the chance to use some more recent methods, and also things like subword units (BPE), which were not standard in ASR back then. Just adopting this got me some very nice results (and a nice paper in the end). So I try to follow up on MT from time to time to see what I can use for ASR.

Then, out of personal interest, I also follow RL. And there are some ideas you can carry over to ASR (and some already have been), although this is somewhat limited. Min expected WER training (like policy gradient) was developed independently in the ASR field, but it's interesting to see the relations and adopt RL ideas. E.g. actor-critic might be useful (it has already been done, but only to a limited extent so far).
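
For concreteness, here is that connection sketched in generic notation (my own shorthand, not taken from any particular paper): min expected WER training minimizes the expected word-error count of hypotheses sampled from the model, and the gradient of that objective is exactly the REINFORCE / policy-gradient estimator (in ASR the expectation is usually approximated over an N-best list).

```latex
% Min expected WER objective for input x, reference y*, and model p_theta:
\mathcal{L}(\theta) = \mathbb{E}_{y \sim p_\theta(\cdot \mid x)}\left[ \mathrm{W}(y, y^{*}) \right]
% Its gradient takes the policy-gradient (REINFORCE) form:
\nabla_\theta \mathcal{L}(\theta)
  = \mathbb{E}_{y \sim p_\theta(\cdot \mid x)}\left[ \mathrm{W}(y, y^{*}) \, \nabla_\theta \log p_\theta(y \mid x) \right]
```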

Another field, even further away, is computational neuroscience. I have taken a Coursera course on this, and I regularly read papers, although I don't really understand them in depth. But this is something that really interests me. I'm closely following all the work by Randall O'Reilly (https://psychology.ucdavis.edu/people/oreilly). E.g. see his most recent lecture (https://compcogneuro.org/).

This already keeps me quite busy. Although I think all of these areas can really help me advance things (well, maybe ASR, although in principle I would also like to work on more generic A(G)I stuff).

If I had infinite time, I would probably also study some more math, physics and biology...


Thanks for such an extended reply!

It's probably hard to estimate the impact of reading outside of your field, but this definitely sounds like a good idea. A nice bonus is that you get more exposure to how people write and talk about research in different areas, and I find that super useful. I've recently read about the Curry-Howard correspondence (https://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspon...), and it was mind-blowing both in terms of what they talk about and how they talk about it.
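
For anyone curious, here's a toy illustration of the idea (my own minimal example, not from the article): under Curry-Howard, a type reads as a proposition and a program of that type as a proof of it.

```typescript
// Toy Curry-Howard examples: types as propositions, programs as proofs.

// "A implies A": the identity function is a proof.
function refl<A>(a: A): A {
  return a;
}

// "((A -> B) and A) implies B": modus ponens is just function application.
function modusPonens<A, B>(f: (a: A) => B, a: A): B {
  return f(a);
}

// "(A and B) implies (B and A)": model "and" as a pair; the proof swaps it.
function andComm<A, B>(pair: [A, B]): [B, A] {
  return [pair[1], pair[0]];
}
```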

On the negative side, it's often quite hard to understand, simply because the terminology is different.

Re Neural Turing Machines, there's been an interesting resurgence of the field working on algorithmic tasks (check out this amazing survey https://arxiv.org/abs/2102.09544).


I wonder if there's also a diminishing-returns effect here, where early on each paper contributes a lot to your knowledge base, but as time goes on the marginal contribution shrinks.

I haven't done a PhD (yet), but I get the sense that in the early years you spend a lot of time under water trying to swim.


I really like this. I recently started a similar project called the "arXiv wiki". Could you link future paper notes here?

For example: https://arxiv.wiki/abs/2101.06861


Nice! Re linking, a bot that scrapes the links and opens PRs would be a great thing to have =)



This is cool stuff. I am planning to do something like this myself.


Thanks!


Thank you!


notes on machine learning papers


A bit of a tangent, but Notion is performing well for a post on the HN front page, even with equations.


They just shipped a performance-related change: https://www.notion.so/blog/faster-page-load-navigation

It is also mentioned in today's changelog: https://www.notion.so/What-s-New-157765353f2c4705bd45474e5ba...


That engineering blog post is a bit ambiguous as to whether or not SQLite is also used in the browser web app, which I presume is what most people who clicked on the link are using.

So that could mean two things:

- SQLite is being used in-memory, but things are still being flushed to IndexedDB for persistence? That shouldn't help with faster page navigation here.

- SQLite is not being used, so it can't explain the performance increase

I think the answer is more in the second link (changelog):

> - *Your workspace is now more reliable after Apr 16, 2021's scheduled maintenance — we upgraded from a single database instance to a sharded deployment, which means Notion is now capable of serving 3x as much traffic as before*


Do you think the equations are generated on the fly? I thought they were getting cached.


I feel like they're either cached or generated client-side with MathJax. No chance they're being generated every single time.


They're generated client-side using KaTeX[1]. That said, the entire page is generated with JS, meaning that someone visiting the site with JS disabled or using a text web browser will be greeted with a blank page. Nevertheless, no-JS versions seem to be served to bots, since Google caches a plain HTML version of the same page[2].

[1]: https://www.notion.so/Math-equations-b4e9e4e03677413481a4910...

[2]: https://webcache.googleusercontent.com/search?q=cache:afS5a6...


Maybe it was just an oversight then, when inline and block math were rolled out. KaTeX has a `renderToString` function that can be used to server-side render the LaTeX.
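
A minimal sketch of what that could look like (assuming a Node setup with the katex package installed; the formula is just a placeholder):

```typescript
// Server-side render a formula to static HTML with KaTeX.
import katex from "katex";

const html: string = katex.renderToString("e^{i\\pi} + 1 = 0", {
  displayMode: true,   // render as display (block) math
  throwOnError: false, // on parse errors, render the raw source instead of throwing
});

// `html` is static markup; the client only needs the KaTeX stylesheet, no JS.
console.log(html);
```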

On the other hand, Notion doesn't seem built to serve read-only webpages like a static blog or Medium.com: the expectation is that you'll use the editor, so the assumption is that JavaScript is enabled and the editor itself can be used to render a read-only view from JSON, or however they're keeping document state.

Opening dev tools on the website, it looks like they're just using Webpack (like React CRA?); I'm not sure if they're changing the JavaScript bundle per page like with Next.js. It would make sense not to have server-side rendering if you're building both browser and desktop apps, since that means avoiding a separate framework just for the browser.

Another clue is that people who try to use Notion as a CMS for their blogs had to build out a React library to emulate the feel of Notion itself: https://github.com/splitbee/react-notion https://github.com/NotionX/react-notion-x.


They use KaTeX, which is faster than MathJax.



