
Show HN: Top PDFs Posted to Hacker News in 2018 - burtonator
https://getpolarized.io/2019/01/08/top-pdfs-of-2018-hackernews.html
======
burtonator
Hey guys. This list is computed via the data we have at Datastreamer
([http://www.datastreamer.io/](http://www.datastreamer.io/)) ... we basically
index the web and have a petabyte search engine that we license for people
very serious about open data.

I realized that we had Hacker News data for every day, so I computed the top
PDFs for my own use and wanted to share them with you guys.

My side project, Polar ([https://getpolarized.io/](https://getpolarized.io/))
is used for managing PDFs and other reading, so you might want to check that
out too. It's basically a tool for researchers, students, or anyone passionate
about long-term education who reads a great deal of technical and research
material.

PDFs are supported but we also support offline caching of web page content. We
also support Anki sync so you can create flashcards and sync them with Anki so
that you never forget what you've read.

EDIT. Awesome! This landed on the home page in less than 15 minutes and is now
#2.. Super excited you guys found this helpful. Great to contribute back to
such an awesome community!

~~~
temny
One more broken link: #37 Self-Awareness for Introverts [pdf]

Correct link seems to be [http://cliffc.org/blog/wp-
content/uploads/2018/05/2018_AWarO...](http://cliffc.org/blog/wp-
content/uploads/2018/05/2018_AWarOfWords.pdf)

~~~
qwerty456127
And this is one of the papers I feel the most interested in.

~~~
tapland
Here is a link to the original submission:
[https://news.ycombinator.com/item?id=17010199](https://news.ycombinator.com/item?id=17010199)

Which includes a blog post that goes into this without being just a
presentation-aid PDF: [http://cliffc.org/blog/2017/07/30/introverts-emotional-
proce...](http://cliffc.org/blog/2017/07/30/introverts-emotional-processing-
self-esteem-and-salary-negotiations/)

------
azhenley
My dissertation made it on the list! Only #176 but that is more attention than
I ever expected it to get :)

~~~
burtonator
Ha. Nice. Well glad we could get you another link and more love.

------
createdjustnow
Just in case you are thinking of downloading these links:

    import re
    import requests

    from bs4 import BeautifulSoup

    def download_file(download_url, name):
        # create response object
        r = requests.get(download_url, stream = True)

        # download started
        with open("repo" + name, 'wb') as f:
            for chunk in r.iter_content(chunk_size = 1024*1024):
                if chunk:
                    f.write(chunk)

    html = requests.get("https://getpolarized.io/2019/01/08/top-pdfs-of-2018-hackernews.html")
    soup = BeautifulSoup(html.content)
    sAll = soup.findAll("a")

    for href in sAll:
        if(href.has_attr('href')):
            link = href['href']
            if(link.find(".pdf") > 0):
                print(link)
                last_index = link.rindex("/")
                name = link[last_index + 1:]
                print(name)
                try:
                    download_file(link, name)
                except:
                    print("error downloading " + link )

~~~
diminoten

        from __future__ import print_function
    
        import re
        import requests
    
        from bs4 import BeautifulSoup
    
        def download_file(download_url, name):
            r = requests.get(download_url, stream = True)
    
            with open("repo" + name, 'wb') as f:
                for chunk in r.iter_content(chunk_size = 1024*1024):
                    if chunk:
                        f.write(chunk)
    
        html = requests.get("https://getpolarized.io/2019/01/08/top-pdfs-of-2018-hackernews.html")
        soup = BeautifulSoup(html.content, features='html.parser')
        sAll = soup.findAll("a")
    
        for href in sAll:
            if href.has_attr('href'):
                link = href['href']
                if link.find(".pdf") > 0:
                    print(link)
    
                    last_index = link.rindex("/")
                    name = link[last_index + 1:]
                    print(name)
    
                    try:
                        download_file(link, name)
                    except:
                        print("error downloading " + link )
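
The link filtering and file naming steps above can also be sketched offline with only the standard library. This is a rough variant, not the script as posted: the names `PdfLinkParser`, `pdf_links`, and `local_name` and the `example.com` sample markup are made up for illustration, and it swaps the `link.find(".pdf") > 0` check for a stricter test on the end of the URL path.

```python
from html.parser import HTMLParser
from posixpath import basename
from urllib.parse import urljoin, urlsplit


class PdfLinkParser(HTMLParser):
    """Collects absolute URLs of <a href> targets whose path ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL.
        url = urljoin(self.base_url, href)
        # Look only at the path, so query strings do not hide the extension.
        if urlsplit(url).path.lower().endswith(".pdf"):
            self.links.append(url)


def pdf_links(html, base_url):
    """Return every PDF link found in the given HTML text."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.links


def local_name(url):
    """File name taken from the last path segment, without the query string."""
    return basename(urlsplit(url).path)


# Hypothetical sample markup, not the real list page.
sample_html = (
    '<a href="/docs/paper.pdf?x=1">paper</a>'
    '<a href="https://example.com/about.html">about</a>'
    "<a>no href</a>"
)
found = pdf_links(sample_html, "https://example.com/list.html")
names = [local_name(url) for url in found]
```

Each URL in `found` could then be passed to `download_file` together with the matching entry in `names`.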

------
ekphrasis
Very nice!

It's also interesting that it differs from this search[1] performed in HN's
own search system.

[1] [pdf] with filtering on Past Year -> 4,422 hits.

[https://hn.algolia.com/?query=%5Bpdf%5D&sort=byPopularity&pr...](https://hn.algolia.com/?query=%5Bpdf%5D&sort=byPopularity&prefix&page=0&dateRange=pastYear&type=story)

~~~
burtonator
Yea.. there would be more total hits for PDFs found on page 2 or page 3...
which I didn't analyze. 500 I think is good enough. Enough reading already ;)

~~~
ekphrasis
Sure, although the positions of the items in the answer sets at cutoff = 10
differ as well.

~~~
burtonator
Meaning the ranking? Algolia is probably using a different scoring algorithm.
If I were designing it I would probably also factor in the number of comments,
but I think it's fair to use the number of upvotes.
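
A minimal sketch of what factoring in comments could look like. The `comment_weight` parameter and the sample stories are assumptions for illustration only; this is neither how the list was computed nor how Algolia ranks results.

```python
from typing import NamedTuple


class Story(NamedTuple):
    title: str
    points: int
    comments: int


def blended_score(story, comment_weight=0.5):
    """Upvotes plus a weighted comment count; the weight is an assumption."""
    return story.points + comment_weight * story.comments


def rank(stories, comment_weight=0.0):
    """Sort stories from highest to lowest blended score."""
    return sorted(
        stories,
        key=lambda s: blended_score(s, comment_weight),
        reverse=True,
    )


# Illustrative data only, not real Hacker News numbers.
sample = [
    Story("A", points=300, comments=10),
    Story("B", points=280, comments=120),
]

by_upvotes = [s.title for s in rank(sample)]                    # weight 0: upvotes only
by_blend = [s.title for s in rank(sample, comment_weight=0.5)]  # comments count for half
```

With a weight of zero this reduces to ranking by upvotes alone; any nonzero weight lets a heavily discussed story overtake one with slightly more points.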

~~~
ekphrasis
I'm assuming that too. Just an observation. Very interesting work
nonetheless!

------
smartbit
The SRE book was available as PDF till August 25, 2018
[https://news.ycombinator.com/item?id=17614907#17624523](https://news.ycombinator.com/item?id=17614907#17624523).
I tried to find it but it seems not to be available anywhere as a PDF. I could
only find it on Kindle (with limited annotation), at safaribooksonline (with
even fewer annotation capabilities), at [https://landing.google.com/sre/sre-
book/toc/index.html](https://landing.google.com/sre/sre-book/toc/index.html)
or as print-on-demand.

------
app4soft
Something is wrong with the score:

    
    
      22. Software-Defined Radio for Engineers [pdf]
      score: 292 comments
    

This article[0] was posted by me and its score reached _352_:

    
    
      Software-Defined Radio for Engineers [pdf] (analog.com)
      352 points by app4soft 6 months ago | 50 comments
    

[0]
[https://news.ycombinator.com/item?id=17399554](https://news.ycombinator.com/item?id=17399554)

~~~
burtonator
Thanks.. probably the same article but on a different URL.

~~~
app4soft
> probably the same article but on a different URL

What are you talking about? The article URL[0] is the same!

[0]
[https://news.ycombinator.com/item?id=17399554](https://news.ycombinator.com/item?id=17399554)

------
jackfoxy
This link gets me to the free download of the document manager. I'm not seeing
the _list_ on this page.

~~~
dvfjsdhgfv
> This link gets me to the free download of the document manager. I'm not
> seeing the list on this page.

That's the power of "Above the fold" in action.

------
tu7001
Also this link: www.dcs.gla.ac.uk/~trinder/papers/sac-18.pdf seems to be
broken.

------
voiceclonr
Interesting project! Kudos on shipping.

