Show HN: PDFlower – Reflow PDF papers for small-screen reading (pdflower.com)
216 points by chengchang on Jan 4, 2017 | 59 comments



The majority of scientific and technical papers are distributed as PDF files for portable presentation. However, it is difficult to read PDF papers on small-screen devices such as tablets and eBook readers.

So I wrote PDFlower to reflow PDF papers for small-screen reading.

Compared with tools based on plain-text extraction, PDFlower reconstructs the object-level layout (columns, formulas, tables and figures) using some smart heuristics, and then renders it for the new page size.


Great idea and actual pain point. Showing a live or even a video demo will increase sign up significantly IMHO. Also the "get started now" should lead to the sign up page, not sign in page.

I like minimalist design, but this is a little too minimalist. More pictures, how it works, examples, and pricing would be great. Also, there's no clue whether this is an iOS app, Android app, mobile web app, or desktop app.

So congrats on the building and shipping attitude, but to take advantage of the flow of beta users from HN, I'd say a video / demo and more "faq" is going to increase your conversion tenfold.

Well done on the launch in any case, better launch something than wait till it's perfect.


Example pictures are kind of small, hard to see the quality of the result. Also, I'd use this for a Kindle device, not an iPad, so it'd be also good to see how it looks in a Kindle.


- The output is a reflowed PDF, so it keeps 'PDF quality'.

- Technically, any PDF reader or device can display the new version of the paper.

- Kindle is supported.


Does it let you resize fonts?


Seconded. I have bad vision; I would love to use this to resize the fonts in single-column PDFs that I read.

Equivalently, I would love to reduce the column width to make zooming easier.

I prefer two-column format specifically because the columns are typically much narrower (less words per line), so I can zoom them to fill the screen and read much more comfortably than a single column document with 50 words per line.


Well, just a single small example on the homepage is not enough to convince me to sign up and try.


why do we need to create an account to try it out?


Reading the page, it looks like your document gets put into a queue for processing, and then you can download it after. I imagine they wanted to avoid all the hassles of emails, as well as having accounts to prime the system if they decide to enhance it as a for-pay service.

It probably also stops assholes from using the service on PDFlower's dime for their own needs.


There are plenty of services that put my job in a queue and simply update the page via JavaScript to show progress and deliver the end result. The better implementations identify me via session cookie and recognize me when I check back half an hour later to get my results. That's a much nicer experience than creating another account at some random service that I might never visit again.
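
A minimal sketch of that session-cookie approach, in Python with only the standard library. The names (`JobQueue`, `submit`, `finish`, `status`) are hypothetical and say nothing about how PDFlower is actually built; a real service would send the token as a signed, HTTP-only cookie and keep jobs somewhere more durable than process memory.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Job:
    filename: str
    state: str = "queued"           # becomes "done" once a worker finishes
    result: Optional[bytes] = None  # converted file, once available


class JobQueue:
    """In-memory job store keyed by an anonymous session token."""

    def __init__(self) -> None:
        self._jobs: Dict[str, List[Job]] = {}

    def submit(self, filename: str, token: Optional[str] = None) -> str:
        # First visit (or unknown token): mint a random token. The server
        # would hand it back as a cookie so later requests present it again.
        if token is None or token not in self._jobs:
            token = secrets.token_urlsafe(32)
            self._jobs[token] = []
        self._jobs[token].append(Job(filename))
        return token

    def finish(self, token: str, index: int, result: bytes) -> None:
        # Called by the worker when processing completes.
        job = self._jobs[token][index]
        job.state, job.result = "done", result

    def status(self, token: str) -> List[Job]:
        # Unknown or expired tokens simply see no jobs.
        return self._jobs.get(token, [])


queue = JobQueue()
cookie = queue.submit("paper.pdf")          # anonymous upload, no account
queue.finish(cookie, 0, b"%PDF-reflowed")   # worker finishes some time later
print([job.state for job in queue.status(cookie)])  # visitor checks back
```

The only thing tying a visitor to their jobs is the random token, so nothing has to be typed in or confirmed by email.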


PDF is easily the worst file format I have ever dealt with. It's hard to believe that somehow this format became an industry standard.

You are a brave man/woman for dealing with this.


Does anyone know an open-source solution for this? What I read is no one's business and I would never use a web-hosted service for something like this.


It's doable. The original work on reflow was done by Thomas Breuel back in 2003: http://cgi.csc.liv.ac.uk/~wda2003/Papers/Section_II/Paper_5....

One of the problems with doing this in pdf is that for most documents, you need to infer reading order. A decade ago I wrote the code in poppler to do that (yet again, based off papers by Breuel) in order to get multi column select working. At the time I wanted pdfs to be readable on my iRex Iliad... anyhoo, most of the pieces are there in poppler. It can figure out reading order, render piece by piece, and already differentiates between images and text under the hood. Still a lot of work.
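
To make "infer reading order" concrete, here is a toy sketch in Python. It is not poppler's code and not Breuel's algorithm, just a simple heuristic under stated assumptions: text has already been grouped into blocks with bounding boxes, y grows downward, and blocks whose horizontal extents overlap belong to the same column. The `Block` type and `reading_order` function are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    text: str
    x0: float  # left edge
    x1: float  # right edge
    y0: float  # top edge (y grows downward)


def reading_order(blocks: List[Block]) -> List[Block]:
    """Order blocks column by column, top to bottom within each column."""
    columns: List[List[Block]] = []
    for block in sorted(blocks, key=lambda b: b.x0):
        for column in columns:
            # Same column if the block overlaps the column's shared x-range.
            left = max(b.x0 for b in column)
            right = min(b.x1 for b in column)
            if block.x0 < right and block.x1 > left:
                column.append(block)
                break
        else:
            columns.append([block])  # no overlap anywhere: new column
    ordered: List[Block] = []
    for column in sorted(columns, key=lambda c: min(b.x0 for b in c)):
        ordered.extend(sorted(column, key=lambda b: b.y0))
    return ordered


# Blocks listed in arbitrary drawing order, as a PDF might emit them.
page = [
    Block("right-top", 310, 550, 100),
    Block("left-bottom", 50, 290, 300),
    Block("left-top", 50, 290, 100),
    Block("right-bottom", 310, 550, 300),
]
print(" ".join(b.text for b in reading_order(page)))
# left-top left-bottom right-top right-bottom
```

Real papers have full-width titles, figures that span columns, and footnotes, which is where most of the remaining work goes.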


k2pdfopt works ok for me.

(bash function)

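    # Reflow PDFs for a Kindle Paperwhite; pass one or more files as arguments.
    tokindle () {
      # "$@" (not $*) keeps file names with spaces intact
      ~/Software/k2pdfopt "$@" -dev kpw -mode fw -wrap -hy -ws 0.375 -ls-
    }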


There's also koreader, which just uses k2pdfopt on the fly.


koreader looks fabulous! Last time I checked, the Paperwhite could not be jailbroken. But it no longer seems true. Thanks for the link.


Yes! All versions can now be easily jailbroken. The main features (to me) are installing KOReader to view epubs and pdf files, and hacking the screen saver display. The process only takes about an hour, mostly because you are reading, doublechecking, and downloading the right files. This guide[0] is a very good overview. I find mobileread's site to be too wordy and difficult to follow.

[0] http://lifehacker.com/how-to-jailbreak-your-kindle-178386407...


Which version would you buy if you were buying a Kindle today? I only owned the one before the DX... managed to crack the screen in an unfortunate elbow-related incident.


Paperwhite has been really good to me. Both Oasis and Voyage look great as well - they have higher resolution than the Paperwhite. And they have real buttons to move between pages, so if I were buying a Kindle today I'd buy one of them depending on my budget (Oasis is more expensive, but slimmer).


Current (and previous) generation Paperwhites have the same resolution as the Voyage and Oasis. I have a Paperwhite and the screen is fantastic. I'd love hardware buttons for page turning, though.

While the Oasis is expensive, if you add a nice cover to the Voyage you're close in pricing. Either go solidly midline with the Paperwhite or go all out with the Oasis, depending on your budget.


My Paperwhite was collecting dust until my recent discovery of KOReader.

I'm now toying with my own instance of https://github.com/koreader/koreader-sync-server since KOReader also runs on my Android devices.

I'm not 100% sure how(/if) I plan to sync PDFs, versus just downloading from source on each device.

I intend to play with OPDS as well.


Thanks! just tried this out, and it seems to work pretty well.

I'm having some trouble using the margin option -m to strip the vertical sidebar text from an arXiv paper, would you happen to have a solution to that problem that doesn't involve a lot of manual intervention? Using '-m 0.5in' handles the vertical text, but then the body text isn't coming out quite right...


Sorry, not at the moment. I figured out those parameters after tweaking things a bit. You might want to start the conversion with just the name of the file, and use the interactive prompt to try out various parameters.


Here's a tool built on k2pdfopt to streamline pushing papers to an ebook reader: http://dontprint.net/


Any reason they don't distribute the addon through Mozilla? Sorry but it sounds a bit suspicious...


Agreed. Nothing against the described service or its users, but personally when I'm searching for something like this, "upload" comes right before Ctrl-W for me.



k2pdfopt: http://www.willus.com/k2pdfopt/ We use that engine in the OSS KoReader application.


The last time I tried to build k2pdfopt, the source zip was missing a bunch of files. You had to download another zip, which wasn't complete either, and combine the results. And then the build failed because cmake was having a bad hair day. I wanted to like it since the windows exe did seem very handy, but had to give up at that point. (Oh, wait, there was a bit more. I did get it building, but the source assumed fprintf(NULL, "log msg") was a perfectly fine thing to do. That's when I gave up.)


I'm trying right now to build it from source, and I'm astonished that it doesn't come with automake tools.

I'm pretty much used to all sources building with "./configure && make" and having a BUILDING.txt or INSTALL.txt. k2pdfopt doesn't have any of this.


How does it compare to PDFlower?


Closed-source paid Android app that does reflow: ezPDF Reader


I would like a "pricing" and a "privacy policy" link on the front page. Probably won't spend time trying it out due to the omission.

Luckily there are interesting self-hosted / local options mentioned in this thread.


Exactly the same here. I won't go through the [maybe] lengthy process of creating a new account to get a paywall then. Same goes with the privacy policy, I don't want to use it with my own PDFs and lose their rights or with other people's and screw things legally.


Very cool. How hard would it be to go in the opposite direction? When I'm using a desktop or a laptop computer it's most annoying that almost all news sites, blogs, etc. optimize for a portrait orientation. I can only see a relatively small amount of text so I have to scroll constantly, despite the acres of available whitespace to either side. It's really annoying when reading anything complex where you'd want to refer back to previous sentences or paragraphs frequently to make sure you've understood the substance properly. I would love to be able to reflow into 3 or 4 columns though...


Thanks for your comment.

The PDF format ignores some of a paper's structure and simply `draw`s content onto a fixed-layout flat document. For a given character, an e-reader cannot know whether it belongs to a paragraph, a figure caption or a formula. Plus, publishers render the original paper in their own specific style. It is a problem. Some research concentrates on improving the PDF format or on extracting information from scholarly papers, like the articles at DocEng, a computer science conference: http://dl.acm.org/event.cfm?id=RE135. Then I read the spec of the PDF format...
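
As a rough illustration of why this is hard, the Python sketch below starts from what a page roughly gives you, characters with positions and nothing else, and rebuilds text lines by clustering baselines. The `Glyph` tuple, the `glyphs_to_lines` function and the 2-point tolerance are assumptions made up for this example, not PDFlower's actual heuristics.

```python
from typing import List, Tuple

Glyph = Tuple[str, float, float]  # (character, x position, baseline y)


def glyphs_to_lines(glyphs: List[Glyph], tolerance: float = 2.0) -> List[str]:
    """Rebuild text lines from individually positioned characters."""
    lines: List[Tuple[float, List[Glyph]]] = []
    for glyph in sorted(glyphs, key=lambda g: g[2]):
        # Start a new line when the baseline moves by more than `tolerance`
        # from the first glyph of the current line.
        if lines and abs(glyph[2] - lines[-1][0]) <= tolerance:
            lines[-1][1].append(glyph)
        else:
            lines.append((glyph[2], [glyph]))
    # Within a line, read characters left to right.
    return ["".join(ch for ch, _, _ in sorted(line, key=lambda g: g[1]))
            for _, line in lines]


# Characters in arbitrary drawing order; nothing says which line they are on.
drawn = [("i", 10, 100), ("H", 0, 100), ("k", 10, 120.5), ("o", 0, 120)]
print(glyphs_to_lines(drawn))  # ['Hi', 'ok']
```

Even this ignores word spacing, superscripts and formulas, and it still says nothing about whether a line is body text, a caption or part of a table.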

Anyway, that hurts.


That's a feature the Kindle PDF reader should have had built in from the beginning.

Waiting in the queue for my account.


Unfortunately the kindle is optimizing for users buying Amazon content, not for being the best general purpose eReader. This incentive mismatch between Amazon and the end users is probably why we will never see such a feature built in.


If you jailbreak your Kindle and install Koreader (mentioned upstream) you can have high quality PDF reflow, with support for two-column layouts as well as math and displayed equations, on your Kindle today.

Here are some example images:

http://www.epubor.com/how-to-install-and-use-koreader-on-kin...


Hey, thanks for this! I'm traveling today but I'll give this a shot tomorrow. I'd be much happier if I could read scientific papers on my eReader.


Wow, that's quite the (both) useable and useful software - thank you for the link. I'm not normally part of the "x should buy this out" group, but in this case, Amazon really should.


No, they shouldn't - because if it goes against their business interest (having Amazon-bought books look better than manually uploaded content), they'll shut the project down.


The only reason this software is any good is specifically because it isn't controlled by Amazon. This functionality is very deliberately missing from the stock Kindle.


> Q: If it still work when update the firmware version?

> Kindle jailbreak will survive when updated to the last firmware, such as 5.7.3. But you need to reinstall KUAL and Koreader to use it again.

Does firmware auto update on the Kindles? If so, will this break at any time?


It has it, but the feature is a bit hidden. Use your email-to-kindle email address and send a PDF with "convert" as the subject line.


Interesting and I might have a use for this professionally. But there is no indication on the main page about how it works (web service? code library?). And it forced me to login on the second page, so I closed the tab.


I thought I'd give it a try. But... Signup Queue? I had never seen anything like that.

Signup Queue

Hi, <my_email>

* You are on your way to PDFlower.

* 269 people in front of you.

* 0 people behind of you.

* Sharing this referral link may change your position in queue. Have a try. <referral_link>


I've seen that before. It's a pathetic attempt to force viral sharing. I never participate on principle.


This is something I haven't found an ideal solution for yet. There are existing iOS PDF readers with reflow, but each seems to have its own issues.

To echo what others have said, this is an actual pain point and I'm very willing to spend money on a solid solution. I've bought a few different PDF readers just on the chance their reflow would be better than what I've found elsewhere.

So I clicked on your link pretty ready to throw money at you, but I'm not sure what I'm looking at. Is this an iOS app? Android app? It seems like some kind of service, since it wants me to login, but I can't tell.

Consider working on the messaging and then reposting this. There's definitely a market.


Could you please name it?


Adobe Reader has a little-known function similar to this. Go to Display > Zoom > distribute (the last entry). It works reasonably well with simple, clean documents. You can check whether the document is compatible in its property panel.


If you're interested in doing this for sheet music (sort of a similar-ish problem), check out my product soundslice.com. Auto-reflowed sheet music depending on your device width.


I'm also working on this PDF reflow and posted a video demonstration on my twitter. Ping me if you would like to test: https://twitter.com/ldenoue/status/785142066665971713


Cool concept. This should be built into all smartphones and tablets and automatically run when they open PDFs.


I just want something that'll collapse 2 columns into 1.


It would be great if you could provide a guest account to try it out. I closed the window as soon as I saw the signup form.


Ah, reminds me of the Celery tool Flower, pronounced flow-er rather than flour. How is this one supposed to be pronounced?


Just want to comment that if this worked well I would pay for it. Probably up to ~5 or maybe even 10 dollars a month.


I'm working on a similar system. Would you like to test it? https://twitter.com/ldenoue/status/785142066665971713



