Adobe unveils multi-year vision for PDF, introduces Liquid Mode (adobe.com)
132 points by georgecalm 12 months ago | 114 comments

Isn't the whole point of PDFs that they (mostly) don't change? They represent content as laid out on a page.

So, IMO this is neat if it's new tech for reading PDFs and extracting data from them (and maybe leveraging current under-used features to store more machine-readable information), but bad if it's about introducing even more complexity into the PDF documents.

Perhaps 10 years from now we will have responsive PDFs, but I feel sorry for whatever damned soul is going to have to expand hard-coded limits in order to fit the new PDF specification text into a single PDF document.

This is more like a readability mode than a responsive website.

It's a button you click to automatically make a fixed-layout PDF fluid and easier to read on mobile.

You'd still author the document the same way without considering multiple device sizes. When the PDF needs to be reproduced faithfully (printing) it would still appear as the creator intended.

For many, the point of a PDF is that it appears the same way in all contexts. For device-dependent rendering, we have HTML.

It's not crazy to have a device-independent presentational document that also contains enough data to enable some device-dependent rendering of its text and media contents.

I read a boatload of 2 column PDFs on my phone. Liquid Mode has been on PDF Reader on Android for... it feels like a year now? And it's fantastic. Without it I have to constantly pinch and zoom and swipe to the upper right to start a new column, etc.

I hate PDFs, but if I want to read a document I've stored in print fidelity from the 80s on my phone, Liquid is rock solid.

Which android pdf reader has it?

Not OP. Xiaomi MIUI bundles WPS office, which has a "mobile view" for PDFs. It's far from perfect, but it can single-column some multi-column PDFs. Some PDFs, like restaurant PDFs are downright butchered though.

I would guess Adobe's? Or otherwise it's not "Liquid Mode" but a similar feature...

According to the article Adobe reader for Android got this feature recently. I tried it and I am disappointed. Maybe OP was talking about reflow mode that indeed was accessible in many pdf readers, but for technical papers (math) it was useless.

That would be the case if a PDF was solely used for print.

In the digital realm though, a PDF should also be accessible to screenreaders for the blind, people with sight problems (increased fonts, contrast), dyslexia, etc. Those already require alternative representations.

And of course there's nothing bad about having the best of both worlds! A format that is pixel perfect when you want it to be, and adaptable for easier reading when you don't.

> For device-dependent rendering, we have HTML.

It does not sound like the goal is device-dependent rendering, though. This is exactly parallel to Reader Mode, which is a temporary user toggle. The goal of PDF is still to produce documents that mirror print-ready layouts, something that HTML has never really accomplished. Whether you want to consume the documents that way will now be up to you.

> For device-dependent rendering, we have HTML.

Say that to those who create PDF files for things that should be web pages. That behavior is not going to stop anytime soon.

This is kind of funny to me since what is now the iOS Core Graphics framework started out life as a PDF rendering engine.

Sadly no one really publishes books as HTML.

Not sure if this is sarcastic but FYI:


(spoiler, most of them are HTML/CSS)

Most books I saw were epub and pdf. They say epub is some XML, ok. But I don't see how the number of existing HTML based formats is relevant

> They say epub is some XML, ok.

Epub is HTML and CSS bundled in a zip file following a specific directory tree layout and containing a few metadata files.

Epub 3 is HTML5.

Open LibreOffice, create a doc, save it as Epub, and unzip the file. See for yourself.
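For anyone who would rather check this without LibreOffice, here is a minimal Python sketch using only the standard library. It builds a toy EPUB-like archive in memory and lists what is inside. The file names, the file contents, and the helper names (`build_toy_epub`, `list_entries`, `read_entry`) are made up for illustration, and the archive is not a valid EPUB: the package (.opf) file is left out and the container file is a placeholder.

```python
import io
import zipfile
from typing import List

# Hypothetical entries, chosen only for illustration.
TOY_FILES = {
    "META-INF/container.xml": "<container/>",  # placeholder, not a real container file
    "OEBPS/chapter1.xhtml": "<html><body><p>Hello</p></body></html>",
    "OEBPS/style.css": "p { margin: 0; }",
}


def build_toy_epub() -> bytes:
    """Return the bytes of a zip archive laid out roughly like an EPUB."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        # The mimetype entry goes first and is stored uncompressed
        # (a ZipInfo defaults to ZIP_STORED).
        zf.writestr(zipfile.ZipInfo("mimetype"), "application/epub+zip")
        for name, content in TOY_FILES.items():
            zf.writestr(name, content)
    return buf.getvalue()


def list_entries(data: bytes) -> List[str]:
    """List the entry names inside a zip archive given as bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.namelist()


def read_entry(data: bytes, name: str) -> str:
    """Read one entry from the archive as UTF-8 text."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.read(name).decode("utf-8")


if __name__ == "__main__":
    epub = build_toy_epub()
    for entry in list_entries(epub):
        print(entry)
```

To inspect a real book, `zipfile.ZipFile("some-book.epub").namelist()` does the same thing on a file from disk (the path here is hypothetical).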

That's cool, thanks!

Aren't the most popular ebook formats based on HTML?

They do, they just don't end up being called books. I'm a big fan of http://learnyouahaskell.com/

That's awesome stuff there

Lots of books (in major digital bookstores like ibooks) are published as "epub" which is basically html.

>It's a button you click to automatically make a fixed-layout PDF fluid and easier to read on mobile.

But this of course reminds us of the miserable UX we have come to accept from the "mobile web". In the beginning, mobile websites didn't exist, and you just zoomed. Now, it's practically impossible to view the desktop website from a phone at some URLs without jailbreaking your device (if it's iOS). The net effect has been a severe degradation of usability in the worst cases for a small improvement in aesthetics and speed in the best.

I truly hope that PDFs do not meet the same fate, and remain readable on mobile in the correct layout, with no in-document markers that force my iPhone to display it wrong. While that apparently hasn't been proposed here, I don't like even the risk of it being proposed.

The best restaurant website experiences I have are when the whole website is just a pdf of the menu with contact info.

Maybe put a link on the pdf for online ordering or finding locations, but I have no idea why the home page should be anything but a menu.

I like to see pictures of the room/view as well.

About "responsive PDFs": there's already a "reflow" feature in Acrobat Reader. It's limited but kinda works on clean PDFs. I don't know to what extent it's an explicit possibility of the PDF standard.


When your digital document format needs artificial intelligence to understand the content, it may be the wrong container for delivering content. I'll be sad to see content that should be delivered as HTML instead delivered as liquid PDF.

According to this article, nothing will be "delivered" as liquid PDF, because it's a feature of the reader app, not the document.

The point is that there are 30 years worth of existing, static, PDF documents - published as PDF for a potentially valid reason - which are frustrating to read on a mobile device.

I'm not convinced this is even a feature of the reader app. When they click "Liquid Mode", it says "Processing in Document Cloud" which means they're sending all your PDFs to their servers? Make sure you don't click Liquid Mode on any sensitive documents!

The PR post states near its end that the more documents go through Adobe Sensei the better it will get. So yes, they’re mining third party data. It would be interesting to see if this is GDPR compliant (I bet not, since the user is not well informed about the processing).

If the AI is built in to a client side reader, that reader has to be installed on all clients, then kept up to date.

A frictionless solution would be for Adobe to offer the AI as a web service that renders legacy PDF files as HTML.

I would prefer that my device does not phone home to Adobe about every single PDF I decide to look at on my phone. On-device rendering for me, please.

It would be opt-in only, to respect your wishes.

I am old enough to remember that the initial publicity surrounding PDF touted it as the future of the web. A better HTML.

I laugh now as I laughed back then.

Perhaps that was the case in the pre-HTML5 days, when each browser decided for itself what it thought HTML was, and IE was a reference.

A better title would be an ”ambitious multi-year vision for Adobe Reader” as this isn’t changing the PDF format at all: it is simply a new, buzzword-compliant (AI! ML!) content-reflowing UI for the reader app.

I disagree - this is just a change in the reader but it cascades into format changes.

I don't know if you've ever had the fun of writing a website using bootstrap and then having a client complain that the page layout changes (i.e. becomes responsive) when the window is resized. I've hit that a few times with things that need to go through audits/agency approvals and in those cases you can pull out some of the @media tags and call it a day.

Imagine having to make sure that liquid mode won't reflow a document that was signed off on by the FDA because there's a concern that the ISI (important safety instructions) box required to occupy 10% of real estate on that page might be shrunk to occupy 5%.

I agree this announcement is a lot less disruptive than the statement initially makes it look, but it's still going to have knock on effects.

Agreed. It seems that a lot of people aren't getting past the title, so now there are a bunch of misguided comments peppered all over the thread about needing to use AI to read PDFs, which is definitely not what the announcement is saying.

Content reflowing isn't easy for PDF files. Xodo has had this functionality for years and it rarely gets the line breaks right, so there is some use for ML in this. (Or do you happen to know a PDF reader that is better at this?)

Just downloaded the Adobe Acrobat Reader on Android.

It looks like Liquid Mode won't show up in the app if the device doesn't have an Internet connection (kudos for graceful degradation here). Once you connect to the Internet and restart the app, you'll be prompted to log in to an Adobe Document Cloud account to use this feature.

I tested a couple of PDFs and a lot of these files are showing "Liquid Mode isn't yet available for this file.", including: a Loan Estimation doc from LoanDepot, a PGE statement file, a Payment Notification from Nationwide.

It does work very well on https://www.politico.com/f/?id=00000174-abca-d59c-a174-ffde1.... Outline, concatenation of multi page sections, inline footnote reference are all working as expected.

If this works for more types of document that would be awesome.

Is everyone in such a rush to comment that they don't even bother to RTFA? A majority of the comments here are knee-jerk reactions to a headline that was misunderstood...smh.

Liquid mode is simply a tool to make it easier to consume PDF content on mobile devices. This is definitely a good thing especially since it is opt-in (you press a button to engage liquid mode in the reader).

The problem is that PDF has already had a few cycles of adding and then shedding useless and, frankly, dangerous ideas because they keep trying to fit a square peg in a round hole.

Exactly, and this feature could encourage more people to use PDFs in contexts where they should not really be used, out of a false sense of security that their document will somehow magically be reflowed on devices it wasn't authored for.

That may be, but it still has no real relevance to this article unless people just want any excuse to get on their soapbox to rant about Adobe/PDF without even bothering to read the article and what it is talking about.

Yes but, as it stands, this is a change to a PDF reader, not a change to PDF.

Of course, it's Adobe's reader. So the chances that this will lead to even more mucking about with the format are pretty high.

Which hey, for me that's job security. Which is not to say I look forward to it...

But Liquid Mode, now with more machine learning, isn't those changes. It's just reader mode for PDF.

Touch screens already make reading pdfs pretty easy, this seems like a solution in search of a problem to me.

Eh, I don't like 2D scrolling, it is a lot more of a pain than 1D scrolling - so I think it's a valid problem to try and resolve. That said I also think PDFs are just conceptually broken when it comes to mobile devices - you specifically don't want single format files on that device and, often times, you don't need to consume them on mobile.

It seems to work well to me so I don't know what you mean by conceptually broken or single format files. Having a single file that you can zoom in and out on was one of the original multi-touch demos in the TED talk, years before the iphone.

I'm also not sure why they are different from anything else when it comes to 'needing to consume them on mobile'.

I agree, my 12.9" touch screen is the best device for reading PDFs I could imagine. A ReMarkable is at least as good for black and white (I've tried it) but a lot of PDFs are full color.

PDF on a phone is unremitting pain. I'd rather muddle through on a laptop. I'll give this a shot, it might help, and it can't hurt, since it's just a button you can push that says "make this thing legible, maybe".

"Used AI and ML", Christ, we need AI and ML to read a static document now.

> Christ, we need AI and ML to read a static document now

The article doesn't say that. This is like Reader Mode in your browser. It presents the same content in a different way based on an understanding of the document structure. You don't need to use it.

If you want content to be able to be presented in different ways (such as a Reader Mode) that reflect the document structure, there are far better options than PDF. This is Adobe trying to extend their moat. Just Say No.

You may as well be saying that there are better options for producing readable text than HTML, because that's what Reader Mode is for. Yes, but entirely irrelevant. Reader Mode is for the reader to use, not for the writer to use. Liquid Mode is for the reader to use, not for the writer to use. See the parallel?

What the writer wants to present is structured layout based on design. Sometimes that's what the reader wants to see, other times not. That's why websites aren't all flat text and browsers have reader modes, not the other way around.

> You may as well be saying that there are better options for producing readable text than HTML, because that's what Reader Mode is for.

To me it reads like this adds absolutely nothing to that which HTML has been already delivering for years.

HTML is not known for providing print-ready layouts. This is just a different way of viewing documents that are designed to be print-ready. Whether they choose to implement the output of the conversion in HTML or some other method isn't relevant to the announcement.

I'm guessing OCR of scanned images to turn into text.

Come on people, what's the name of this site again? Granted, my first reaction was the same as many commenters here: "Responsive PDFs? Oh the irony!"

But as a reader of essays and textbooks on a small tablet, let me tell you it can be useful. Yeah, it's not elegant. It's a clever technical solution to a real-world problem made of decades of path dependency.

You might call it a hack.

If this had been from some kid in his basement and not from a corporate press release (admittedly pompous, overselling and worrying privacy-wise), you would have cheered.

For 4 years I worked at a publishing/education company and developed a solution for educational interactive ebooks in HTML.

A couple of insights from those years:

PDF is popular because it fits the paper designer mindset and because Adobe InDesign is pretty much the standard in the publishing industry.

HTML is a much better format for the digital age. It's responsive, interactive, etc, but even today, the best way to produce HTML is by using a code editor.

Even if there was a good WYSIWYG tool, editorial designers come from the paper world and have a really hard time understanding the responsive model.

Many times I've fantasized about working on a tool made for designers to produce HTML, but it would be a ton of work and I don't really think there is a market for it. Many ebook formats are actually HTML, but I think the industry is getting by with conversion tools from Word. Most HTML content comes from blogs and journals which already have an established pipeline and don't need a general purpose HTML production solution. Education is the strongest use case, but most education companies are still rooted in paper and switching to interactive education is quite a leap.

Imagine creating a document format so broken you need to build an artificial intelligence product to display it on smaller screens.

It’s not broken, it’s just not designed to do this.

A pdf file is a program that outputs pages you can display or print. There are no more restrictions to the format.

That’s why without some artificial intelligence you can’t edit or reflow the contents.

It's not broken, it's just abused. PDF was created for printing; it's not great for anything smaller than an A4 page. Other formats exist which do that fine. If you use PDF for what it was made for, it's fine.

This sounds like a vision for Adobe Reader, not PDF.

Where are the changes to the PDF format that will help other viewers understand hierarchy and relayout pages without Adobe's ML engine?

Something they didn't really mention was that this requires uploading the document to their cloud, which I'm afraid is a pretty big downside.

It's okay - it'll be hard to find your document among all the harry potter fan-fiction they'll be OCRing.

I do agree it's disappointing that they can't just process it locally though.

I hope it's not just for mobile -- PDFs are good for printing, but not much else. Even viewing explicitly-paged documents (esp. w/two columns) is painful. I'll be happy if I can skip all the zooming and panning I have to do on laptops and tablets as well. The only time I don't have to do this is on a large portrait-mode external monitor.

Liquid Mode doesn't work with PDFs above 200 pages. The whole point was to be able to read books without zooming and scrolling within every page, and now there's this arbitrary limit!

The internet already has this. It's called HTML.

Liquid mode? It looks like a tool to reflow the content in existing PDFs, which are often intentionally not responsive. I don't see how this is similar to HTML.

The AI is required to reconstruct text order in legacy PDF files. Once you have the text, why wouldn't you render it in HTML?

Sadly, without something like CSS Regions (https://caniuse.com/css-regions) the internet has limited rendering capabilities. CSS Regions was led by Adobe but dumped by Google with Blink (https://arstechnica.com/information-technology/2014/01/googl...)

A standard that is too open ;-)

Better have an alternative that is controlled by one company. Shareholders like that. Think social media, ads, apps.

PDFs can do all sorts of voodoo (like you can do with HTML if you hate the user's browser) to make legible content that is pretty illegible to machines - but most documents are produced by tools that have pretty sane outputs that can be reverse parsed to get a pretty nice HTML blob.

PDF is an ISO standard. PDF 2.0 has no proprietary components.

Pretty sure their 'AI reflow' is something that is not (nor will be) in the standard

yes, because it's a tool in Adobe Reader and not part of PDF.

It sounds like it will make viewing papers and such that are PDFs much more convenient, eventually.

But it really makes you question the idea of putting things into PDF format in the first place.

Because at this point it may be that a significant majority of the time PDFs are read on screens.

So in my opinion it might make more sense for academic journals (for example) to standardize on something like reStructuredText (which now supports LaTeX by default). Or maybe Markdown, or a subset of HTML.

Or maybe a standard eReader (Kindle-like) format.

Or just default to a tar.gz with the RST and supporting files in standard folders.

Then if they want to publish a print journal they can automatically format it for printing. If it doesn't look good enough sometimes then let the print journals use AI or manually typeset it (earn their money).

So anyway I wish Chrome and Firefox would get support for my new RST archive format.

Point being that PDF is getting a little obsolete.

EPUB is zipped HTML and CSS.

The problem is that the PDF standard tells each letter where on the page to sit. If you need an AI to dynamically re-render the page, what good is that standard?
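To make the "each letter is told where to sit" point concrete, here is a rough Python sketch of the kind of guesswork a reflow tool has to do once all it has is positioned text. This is a deliberately naive heuristic for a clean two-column page, not Adobe's method and not anything defined by the PDF standard. The `Fragment` type, the function name, and the sample coordinates are all hypothetical; the only PDF fact assumed is that default page coordinates are measured from the bottom-left corner, so a larger `y` is higher on the page.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fragment:
    x: float   # left edge, in points from the left of the page
    y: float   # baseline, in points from the bottom of the page
    text: str


def naive_two_column_order(fragments: List[Fragment], page_width: float) -> List[str]:
    """Guess reading order: whole left column top to bottom, then the right column."""
    middle = page_width / 2
    left = [f for f in fragments if f.x < middle]
    right = [f for f in fragments if f.x >= middle]
    # Larger y is higher on the page, so sort by descending y within each column.
    ordered = sorted(left, key=lambda f: -f.y) + sorted(right, key=lambda f: -f.y)
    return [f.text for f in ordered]


if __name__ == "__main__":
    # Fragments arrive in drawing order, which need not be reading order.
    page = [
        Fragment(320, 700, "right 1"),
        Fragment(72, 700, "left 1"),
        Fragment(72, 680, "left 2"),
        Fragment(320, 680, "right 2"),
    ]
    print(naive_two_column_order(page, page_width=612))
```

A full-width heading, a table, or a figure caption breaks this immediately, which is roughly why people reach for something smarter than a midpoint split.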

PDFs are good for one thing only: Printing.

I wish academia would stop using PDF to distribute online, so their documents would be easier to parse!

Sounds like Adobe is trying to add media queries and a dynamic DOM to PDF.

Perhaps they should consider leaving PDF the fuck alone and reiterating to developers that HTML & CSS are the appropriate technologies for producing documents which must reflow based upon viewport dimensions.

I remember they had something similar 15 years ago. It was called tagged PDFs. It was useful on low resolution devices like PocketPC as the full page view wasn't readable. You had to convert your PDFs before loading them onto the device. Probably went out of favour when the iPhone came along and screen resolution increased. Looks like it’s an upgraded version of this with added ML.

Tagged pdfs are still a thing for accessibility, to make pdfs work better with assistive technology. In practice, this is underutilized, because of the additional effort to use it properly.

I wish they put that AI and ML to use on the OCR functions in Acrobat.


They should start with a better prediction of the zoom level and single/continuous page rendering modes when opening a PDF document.

Notably, this also means a lock-in in traditional layouts, as I doubt that an AI trained on the usual corpus would parse any non-traditional, maybe even dissociated layouts well. And, since there's apparently no way to control the reflow, this means serial presentation, all the way, just to make sure that your content will be presented in a meaningful way.

This is a bit ironic, since the USP of PDF has always been its ability to preserve individual presentations in a portable format, which made it also ideal for sharing and archiving non-traditional presentations of content.

Coming soon: PDF will become Flash!

Please don't even joke about that.

Liquid Mode ?= Reader Mode. Ok. As long as I can still read Old PDFs, and the file size stays small. Why not.

I don’t see any way to improve tables in PDF. Most of the PDFs I deal with are invoices, statements, tabular reports.

As long as we’re on the topic of PDFs...what the heck is up with PDFs always asking if you want to save, even when you only opened them (no changes; using setting ‘Use only certified plug-ins’ & ‘do not show edit warnings’). Ridiculous.

This is a feature brought back from the past. A decade ago, I had reflow mode on my Pocket PC and I used it to read PDFs during my commute.

There’s a lot to consider here, and I’m certainly encouraged by Adobe's long standing focus on consistent PDF view experiences. (At least compared to other formats)

That being said, this feels somewhat like Adobe will turn PDFs into a form of markdown where the parser is a (variable?) machine learning algorithm.

Done perfectly, the results could be great. Done badly and the results would be very painful.

> a form of markdown where the parser is a (variable?) machine learning algorithm

Sounds like a way to extend vendor lock-in, as nobody else will be able to implement the same algorithm with any assurance of interoperability.

Adobe can't write secure or portable software for shit. They only ever want to be the stewards for something so they can make money off of it, often precisely to the detriment of the rest of the world.

No Internet content should EVER be loaded by Adobe software by default. Ever. Their security history has clearly shown us that.

This is a good step forward to improve the consumption of PDFs. https://www.nngroup.com/articles/pdf-unfit-for-human-consump...

Dear god please no. PDFs suck, but they're the only viable medium for scientific publication, and they aren't "broken" for that. Do not fuck them up with your proprietary crap, Adobe.

I actually expect that this will work pretty well, at least for academic articles, because the machine learning engineers working on it will constantly be "dogfooding" it, as machine learning research is published mostly as PDFs.

I hope it will be an open standard (but I don't expect it from Adobe).

Is this something new? I've been using it for a long time.

My plans to replace HTML with PDF move further to fruition.

Liquid Mode sends documents to the Adobe cloud, so use it judiciously!

It is nice though.

TL;DR: The plan is to make PDFs responsive through built-in machine learning for mobile consumption.

I’m very skeptical but somehow intrigued.

Are they mining bitcoin? My phone becomes super hot when I visit the link.

How is this different from 'epub'?

It uses heuristics to convert existing documents into reflowable versions.

Adobe is a great company to invest in.

Objectively true. Adobe makes a lot of cash.

And that's why my portfolio is Adobe, Oracle, Halliburton, Bank of America and Comcast.

Bank of America could be a dangerous one with so much insolvency quickly coming.

Just because a company makes a lot of cash doesn't make it a good investment. It depends on the price, and whether you think it's likely to go up.

Can Apple kill PDF too, please?

I'm especially enraged at the PDF "forms" XFA or whatever that no software on this planet can open except Adobe Reader. And Adobe Reader even on macOS doesn't allow Print to PDF or something else to get a PDF in a "sane" format.

I have to use a fake printer to trick Adobe Reader about this! Incredible!

Fun loophole that usually works--so Adobe Acrobat on the Mac doesn't let you use the default Print to PDF functionality, BUT if instead of selecting PDF you print to Postscript it will work.

Speaking of print-to-PDF, this is tremendously useful functionality, but fails on over half of the web pages out there. Is this due to some intentional action by the publisher, as a copy protection mechanism, or due to faults in the print-to-PDF driver? If due to faults in the driver, whose print-to-PDF driver is the best at implementing this functionality - I've tried several and they all seem to be sub-par.

Another PDF issue: does anyone except Adobe make a good PDF malware remover (one that strips it from inside the PDF file)? PDF malware seems to be rare, but could be a big problem if it ever catches on.

It does not work. Just like print to PDF, trying PostScript tells you:

> Saving a PDF file when printing is not supported. Instead, choose File > Save.

Because Adobe went out of their way to prevent this from working.

My thoughts regarding HTML as an alternative to this:

1. AFAIK, there is no standard way to bundle a webpage containing images into a single file.

2. We have EPUB, and my experience using it has been horrible. Either the format is bad or somehow every single reader I've used sucks, and I've tried many (Foliate and the now defunct Readium Chrome extension are the ones that suck the least, but the experience is still much worse than reading a PDF)

At this point, solving the bundling problem of HTML (and using normal web browsers) seems like a better course of action than trying to use EPUB.

3. Text that always fills the width of the screen sucks if you're using a screen bigger than 10 inches.

Being able to coerce a document into a page whose border is clearly delimited (just like PDF readers do it) without having to resize windows is, in my opinion, an absolute necessity. Something like:

body { border: solid black; max-width: 25em; }

Epub readers allow defining widths (or semi-equivalently: margins) but they always look bad because there isn't a visual boundary.

> AFAIK, there’s no standard way to bundle a webpage containing images into a single file.

There’s mhtml, though I can’t speak to how well supported it is. https://tools.ietf.org/html/rfc2557

You could embed images using data: URLs. There are size limits in some browsers, particularly older ones, but you can manage some pretty high-resolution images under the limits in a modern browser.
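As a rough illustration of the data: URL approach, here is a small Python sketch using only the standard library. The helper names (`to_data_url`, `inline_image_tag`) and the sample bytes are hypothetical, and it makes no attempt to check browser size limits.

```python
import base64
import html


def to_data_url(payload: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data: URL."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def inline_image_tag(payload: bytes, mime_type: str, alt: str = "") -> str:
    """Build an <img> tag whose src carries the image bytes inline."""
    src = to_data_url(payload, mime_type)
    return f'<img src="{src}" alt="{html.escape(alt, quote=True)}">'


if __name__ == "__main__":
    # Placeholder bytes standing in for a real image file's contents.
    fake_image = b"GIF89a"
    print(inline_image_tag(fake_image, "image/gif", alt="menu photo"))
```

In practice you would read the bytes from disk (for example `open("photo.png", "rb").read()`, with a path of your own) and paste the resulting tag into the page, so the HTML file carries its images with it. Base64 makes the payload about a third larger, which is the usual cost of this trick.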
