PDFs are fantastic, and are still wonderful today. OP's problem isn't solved by PDFs, just like it isn't solved by XML or MP3 or ZIP formats.
Simply put, PDF is a format designed to allow me to give you a document where you can be sure (with certain well defined exceptions that I can work around anyway) that you see what I intended you to see. That works brilliantly if I want you to see a letter-sized 3-color document, or an A5 booklet, or a funky 3" x 9" flyer, or whatever other size I want.
This doesn't work with e-book readers, because my e-book reader isn't A4 or 8" x 6" or whatever other format the book is in. Epub, mobi, LRF etc. give the e-book reader the ability to format the text (if it's pure text, and not, as OP complains, pre-formatted with assumptions the e-book reader can't match) for the screen or for my options wrt. zooming and font size.
I say all of this as both an avid PDF user (I write lots of documentation), and an avid e-book user. You use the right tool for the job. If you're complaining about PDFs on your e-book reader, you're doing it wrong.
>Simply put, PDF is a format designed to allow me to give you a document where you can be sure (with certain well defined exceptions that I can work around anyway) that you see what I intended you to see.
Well put. When I buy technical books, I always look for a PDF format. I want a copy of the book that is the same as the printed one and that's exactly the reason why I like PDF. Plus, things like diagrams and graphs are in vector graphics, and not raster as they are in EPUB/MOBI.
I've made the mistake of buying a couple of technical EPUB eBooks that contained graphs, and in all cases it was a complete abomination. The publishers seem to want to produce the smallest files they can, so what used to be vector graphics, become highly compressed and low resolution raster images. Trying to zoom in on a graph on your tablet gives you a resized blurry image.
I don't like this trend where some technical ebook publishers offer a PDF version of the book in addition to an EPUB, and the PDF ends up being a mere conversion from the EPUB version, having all the cons I mentioned above. This happened to me on InformIT when I bought "The Mythical Man-Month", and they ended up giving me a refund.
I came to say something similar. One of the reasons I liked the iRex iLiad e-reader was that the screen was large enough and had enough resolution that one could display a PDF and read it at its nominal 'size'.
What surprises me about this conversation is how sad it makes me. Displaying documents accurately was a solved problem in 1985 at Xerox. They had a lot of folks who were to document display what web designers are to web page display. They would go on and on and on and on and on about the kerning or the way this ligature in this language had to have this sort of thing in order to be 'correct'. They had fights with Imagen whom they considered the "GeoCities" of document display. Don Knuth got so fed up with how poorly computer formatted documents looked that he wrote his own formatter, and part of that included a representation (dvi) which was designed to be a high fidelity source that could render to the 'best possible' form on the device. These guys generally fought over printers because display terminals were 25 lines of 80 characters for the most part (unless you had a Xerox D-machine).
The thing that makes me sad is that we haven't been able to capture any benefit from that war. Everything patented then has expired, whether it was the special 'hinting' Xerox used on fonts or the way they maximized the fidelity of the half toning on two color printers. Sigh.
How about good old HTML + CSS + SVG for a book format? There are a number of good rendering engines for that, I heard.
Also, it's much less complex than PDF.
But publishing a book in HTML would require some genuine effort, as opposed to just exporting the book typeset for paper as a PDF, often not even trimming the margins which eat precious e-reader screen space.
Kindle books have truly terrible formatting and editing in general. Some are riddled with what look like OCR errors; in others, blockquotes and italicized regions start and end in the wrong places. Images are in awkward places. Sometimes just paging left and then paging right causes the page boundary to move.
The main reason, I've heard, is that even if the Kindle format and reader app are reasonable (and I can't vouch for them), it's on publishers to put their books in the right format. They have no idea how to do this, so they outsource it. Some publishers probably don't care about or dislike the Kindle platform.
I find it really, really sad and bizarre to read books with so many errors. Maybe you haven't come across a really bad one yet, but since it varies by publisher (and I'm talking about big, mainstream publishers), just wait, you will.
I'll bite. I could have written this, and have said similar things here and there several times. The vitriol directed against PDF and me for recommending it has been quite surprising.
PDF doesn't reflow and a PDF formatted for a book-sized page will not be usable on your phone. No kidding! So write a script to run your LaTeX source (for example) multiple times for normal and large-type editions for, say, four canonical screen geometries.
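A script like that could be sketched roughly as follows. This is only an illustration under stated assumptions: the screen sizes, the 4mm margin, the file names and the `body.tex` layout are all hypothetical, and it relies on the `geometry` package's `paperwidth`/`paperheight`/`margin` options and the standard `book` class sizes (10pt and 12pt standing in for normal and large type).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edition:
    name: str
    paper_width_mm: float
    paper_height_mm: float
    font_pt: int  # the standard book class accepts 10, 11 or 12


# Hypothetical screen geometries in millimetres (width, height).
SCREENS_MM = {
    "phone": (62, 110),
    "ereader6": (90, 122),
    "tablet7": (94, 151),
    "tablet10": (148, 197),
}
FONT_PT = {"normal": 10, "large": 12}


def editions():
    """One edition per (screen, type size) pair: 4 screens x 2 sizes."""
    return [
        Edition(f"{screen}-{size}", width, height, pt)
        for screen, (width, height) in SCREENS_MM.items()
        for size, pt in FONT_PT.items()
    ]


def wrapper_source(edition, body="body"):
    """LaTeX wrapper that sets the page geometry, then pulls in the shared text."""
    return "\n".join([
        "\\documentclass[%dpt]{book}" % edition.font_pt,
        "\\usepackage[paperwidth=%gmm,paperheight=%gmm,margin=4mm]{geometry}"
        % (edition.paper_width_mm, edition.paper_height_mm),
        "\\begin{document}",
        "\\input{%s}" % body,
        "\\end{document}",
        "",
    ])


def build_command(edition):
    """Command line that would compile the wrapper file for this edition."""
    return ["pdflatex", "-interaction=nonstopmode", "book-%s.tex" % edition.name]
```

To actually build, you would write each `wrapper_source(e)` to `book-<name>.tex` and pass `build_command(e)` to `subprocess.run`, once per edition.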
> Some publishers probably don't care about or dislike the Kindle platform.
This is the real source of the problem. Publishers have every reason to suspect that Amazon will eat their profits and would be stupid to help Amazon accomplish its goal.
Except that a badly-formatted Kindle book reflects more poorly on the publisher than it does on Amazon, since Amazon doesn't create the content. If a publisher really doesn't want to encourage the e-book industry, isn't it much easier for them to not publish an e-book at all?
"The main reason, I've heard, is that even if the Kindle format and reader app are reasonable (and I can't vouch for them), it's on publishers to put their books in the right format. "
This is akin to terrible apps on any App Store. It's up to the App Store guidelines to raise the standards and only allow certain books of quality.
Here is my experience based on building ebook-related products:
Amazon: Will allow almost any book of any kind, any formatting. Their singles publishing program is a little stricter since you're giving an exclusive to them and they plan to use those for marketing.
Apple: Strictest store in the business. Humans review every submission and will kick back your book even if the only problem is that they deem the subject 'not wide enough for a large audience'. If your book is 51% French, 49% English content but submitted in English, it'll get kicked back, and so on.
I don't agree with Apple's position that you can only submit books that are for a large audience - who are they to judge that? So people head to the wild world of Amazon and fight it out at the free or 0.99 level.
Full PDF support is a security nightmare[1], and PDF supports so much more than a simple reading program should need, which makes readers bloated and buggy and drastically increases the surface area for attacks.
True, but there's no delineation provided in the PDF files so as to ascertain which files require which features (via file extension or whatever), and so it might be impossible to tell if your particular reader supports all of the features provided by the PDF file. The better readers will tell you that some features of the current document aren't supported, while others simply won't render correctly at all, or worse, will just crash the application.
You also end up with users trying to open some dynamic form PDF with scripting and whatnot enabled, and of course your portable device balks because it doesn't support it, so now it gets kicked back to the device MFG because the PDF reader is "broken".
HTML/EPUB as the format for eBooks opens you up to every single one of those problems - and more! HTML/EPUB supports forms, scripting (try to find a web page w/o any JS), audio, video, 3D and more.
I've attempted to read technical PDFs on my Kindle. You either need a magnifying glass (if you try to do it a page at a time) or spend the whole time scrolling. Maybe scrolling isn't a big deal on a Kindle Touch, but it's not fun on a standard Kindle.
I don't think this is a question of formats; I think it's a question of "How do we show legible code samples on a 7" screen?".
Switch the screen to landscape mode and do fit-to-width. This way, a typical PDF will only need 3 "next page" presses per page, without you having to scroll sideways.
Sure. My point is it's not a question of Mobi vs. PDF vs. ePub, and probably not Kindle vs. PC vs. tablet. Some of the suck is due to sloppy publishers, but a lot of it is down to the screen size. Getting code onto a 7" monochrome screen so it's comprehensible is a challenge.
The Kindle DX had a 9.7" screen last I checked. It was big and, from what I've heard, had a tendency for the screen to get broken. It was also over $300. Looks like Amazon has dropped it off their list as well. My guess is there wasn't enough demand for them to keep making it.
PDF and ebooks (MOBI, EPUB) optimize for different things.
PDF is for layout -- it is actually more of a graphic format than a text format. Contiguous text may be scattered around in the internal data representation, so long as the location of each character on the page is expressed. This is good for tables, formulas, and other items requiring precise layout.
EBooks are for a smooth reading experience, including wrapping. This will naturally disrupt layout.
I think that with some technical effort, it is possible to build an eBook that flows -- while still allowing precise layout where needed.
I understand there are some formatting issues, but PDFs don't work well on smaller devices. In fact, I find they don't even work well on laptops, since you need to scroll down each time to the bottom of a page. Ebook formats are better for both; they just require some formatting work before being published.
I want to second this. I use GoodReader on my iPad and absolutely love reading PDFs on it. I was going to check out Readdle as well since that's been getting good reviews. The best part is that I have all my technical books categorized on my iPad, which I always have with me, so anytime I have a spare moment or wish I had "that" book with me, I now do!
Try GoodReader. It has a ton of other useful features, but among them is the ability to crop.
Cropping is applied to all pages in the document, so once you use the handles to eliminate excessive margins, you've got nothing but the text. I find it indispensable, to the point where I prefer reading PDFs on the iPad to reading them anywhere else.
GoodReader's UI is a bit of a mess, but that's just a reflection of how many truly useful features it has. And the systematic elimination of margins - including the ability to handle left and right pages differently - makes technical manuals and papers actually readable.
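For anyone wondering what "handling left and right pages differently" amounts to, here is a minimal sketch of the arithmetic. It is not GoodReader's implementation; the names are made up, and it assumes PDF-style coordinates (points, origin at the bottom-left) and that odd-numbered pages are right-hand pages with the binding margin on the left.

```python
from typing import NamedTuple


class Box(NamedTuple):
    left: float
    bottom: float
    right: float
    top: float


class Margins(NamedTuple):
    inner: float   # binding side
    outer: float   # edge side
    top: float
    bottom: float


def crop_box(media: Box, margins: Margins, page_number: int) -> Box:
    """Visible region of a page after trimming its margins.

    Assumption: odd pages are right-hand pages, so their inner (binding)
    margin is on the left; even pages mirror that.
    """
    right_hand = page_number % 2 == 1
    left_trim = margins.inner if right_hand else margins.outer
    right_trim = margins.outer if right_hand else margins.inner
    return Box(
        media.left + left_trim,
        media.bottom + margins.bottom,
        media.right - right_trim,
        media.top - margins.top,
    )


# US Letter page (612 x 792 points) with a wider binding margin.
letter = Box(0, 0, 612, 792)
trim = Margins(inner=90, outer=54, top=72, bottom=72)
odd_page = crop_box(letter, trim, 1)
even_page = crop_box(letter, trim, 2)
```

Both crops come out the same size; only the horizontal offset flips, which is why a single crop applied to every page either clips text or leaves a strip of margin on alternate pages.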
Not having GoodReader was easily one of the worst problems I faced when I switched to Android - there's just nothing like it.
I keep GoodReader around for its crazy feature set (it can even force a PDF to reflow, making it readable on an iPhone) but I find that PDF Expert is a lot more pleasant to use.
Unless I misunderstand, you will have to be double tapping on every page, and it seriously disrupts the work flow as you can't page turn until you zoom out.
ezPDF does have a persistent crop feature, and you can even set different crops for even/odd pages (which I understand is a killer feature of GoodReader). ezPDF's UI gets really weird wrt page turning and scrolling while cropped though.
I just discovered that Mantano Reader has an awesome crop feature for PDF. The UX is really good. It does not have the even/odd thing though. This is my current leader for an ebook reading platform.
This is one of the bigger reasons I got my iPad (Retina). I can read most PDF textbooks/documents in full screen portrait without any zooming or scrolling with ridiculous text clarity. No other device I have allows for this nearly as comfortably.
Epub apparently supports fixed-layout content (or so says Wikipedia). I don't know if I've ever seen it done (although I haven't read many technical epubs).
Fixed-layout is supported via EPUB, but you need a reader that knows how to display the fixed-layout version. 99% of readers only display reflowable content.
What you really want to do is go with reflowable content that is formatted correctly (proper HTML/CSS), and if the content breaks across the page, changing the font size should help rectify.
(spoken from too many years of ebook + tech book experience)
The understatement of the century. In the 21st century, PDF is not suitable for mobile devices.
I've read ebooks (ePUB) on my Nexus 7, but it's only usable for serious reading because I can increase the fonts without having horizontal scrolling (in vertical orientation). Without this, I'd probably need an e-paper based reader, but even that might be unusable for typical A4/letter formatted PDFs.
While I can't speak about the native Amazon files, I know that kindlegen uses HTML under the hood. It outputs (at least the versions I've used) a mobi file, which is just a container (a Palm database rather than a zip like EPUB, as far as I know) with some metadata. Inside, you'll find the HTML. I'm reasonably certain that the native Amazon stuff is just another layer on top of that.
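If you want to check what kind of container a given ebook file really is, looking at the first bytes is usually enough. A rough sketch, assuming zip archives (EPUB is one) begin with the local-file-header signature `PK\x03\x04` and that classic MOBI files are Palm databases with the type/creator pair `BOOKMOBI` at byte offset 60; the function name is made up for illustration.

```python
def sniff_container(data: bytes) -> str:
    """Guess the container format from the leading bytes of a file."""
    if data[:4] == b"PK\x03\x04":
        return "zip"          # e.g. EPUB
    if data[60:68] == b"BOOKMOBI":
        return "palmdb-mobi"  # Palm database header: type "BOOK", creator "MOBI"
    return "unknown"
```

Reading the first 68 bytes of a file (`open(path, "rb").read(68)`) is all this needs. It says nothing about newer Amazon formats, which I haven't looked at.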
I wonder if the problem is specific to whatever algorithm Amazon is using. I've never tried to read a technical book on a Kindle, but on both my Nook and my iPad ePub files work far better than PDFs.
Lines in code samples do have to get wrapped fairly often on the Nook, but it's done in a sensible manner and there's a nice text editor style arrow icon to indicate which lines were wrapped.
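I don't know how the Nook does it internally, but the general idea is easy to sketch: hard-wrap any line that doesn't fit and prefix the continuation with a marker, keeping the original indentation. The function name, the arrow character and the wrap-at-any-character behaviour are my own assumptions; a real renderer would work in pixels and prefer to break at spaces.

```python
def wrap_code_line(line: str, width: int, marker: str = "↪ ") -> list[str]:
    """Split one line of code into rows of at most `width` characters.

    Continuation rows keep the line's indentation and start with `marker`
    so the reader can tell a wrapped line from a real line break.
    """
    if len(line) <= width:
        return [line]
    indent = line[: len(line) - len(line.lstrip())]
    prefix = indent + marker
    step = width - len(prefix)
    if step <= 0:
        raise ValueError("width too small for the indentation plus marker")
    rows = [line[:width]]
    rest = line[width:]
    while rest:
        rows.append(prefix + rest[:step])
        rest = rest[step:]
    return rows
```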
Perhaps the difficulties Tim is talking about are really renderer shortcomings and not the format itself, which I had assumed was a lot more restrictive than HTML.
The problem (mostly) isn't the renderer, it's the content. It does actually need to be a reasonably reflowable html page, as opposed to too many manual line breaks, excessively indented blocks, etc.
I've had some terrible Kindle experiences as well. Textbooks on learning Chinese where there was no way to zoom in on the tiny intricate characters. Textbooks on learning Chinese where the lines of Chinese had been placed incorrectly. Whoever did the conversion basically treated them as images, and image support on Kindle is very poor, and in that one book they placed the "images" under the wrong phrases, making it completely wrong. Amusingly, these are also the books that disabled themselves after I updated my phone too many times, saying I'd exhausted my licensed downloads. So they were simultaneously the worst content conversion I ever encountered and the most useless licensing for a developer who changes and reflashes phones constantly. I imagine technical books are similar, with code being treated as badly as Chinese.
However, most PDFs don't work at all well on ebook readers.
If you make a monochrome PDF with minimal margins, with a sensible font size, for a 6 or 7" screen, it will be attractive and readable on that screen. It can - if you do it right - have better wrapping, widow/orphan control, and fonts than a normal ebook, too. I have an old Sony Reader PRS-505. The screen's the same as a Kindle, but the processor is slow, and paginating its native format takes ages. Nonetheless, it can display PDFs very rapidly. Before I got a Kindle, I used to convert books from text into small PDFs, with attractive embedded fonts, and read them on the Sony that way.
However, if you try to read an A4 PDF, with two columns and massive margins, on a 7" screen, you're going to have a bad time.
The downside to using PDF is that you lose the ability to reflow. But that's the upside too, at least if you have complex content that's not amenable to reflowing; many technical books fit into this category.
As a journalist who spent a large part of my career converting masses of PDFs (not the scanned kind, though of course those were also problematic) into parseable text, I'd say the problem was what seemed like a total black box of document composition. A simple tabular text PDF that was generated through some bespoke software package could result in a stupefying variety of text outputs, whether through third-party services like CometDocs or good ol' xpdf. At least with HTML documents, an XML element is an XML element.
Of course, HTML presents the same stupefying array of possibilities, except in the form of visual output...which is why I guess we needed PDF in the first place.
With all the PDF tech books I read on my Kindle, I either have to read through a microscope, scroll the screen from left to right continuously, or hold the Kindle landscape, meaning the buttons are in the wrong place.
You know that old joke? A patient says to the doctor, "it hurts when I do this", and the doctor replies: "so don't do that". Your 6" Kindle was not meant for reading PDF books, so don't complain when you're using it for something other than what it was made for (reading novels).
I sympathize with the sentiments regarding tech books in Kindle format. I try to buy all my tech books at O'Reilly since they usually will have each title available in multiple formats (PDF included).
There is a fundamental problem with viewports of unknown size: you as author cannot ensure that your information is rendered the way you want it represented. For a novel or other basically sequential text that doesn't matter, but for a technical document with complex representations, it might be problematic. Long or wide tables are a common example: if headers aren't repeated, they quickly become a meaningless bunch of symbols. A graph where you cannot see essential features such as the legend, the axes, all typical points (like extrema), or the pattern described by the whole probably makes it hard to support the argument you are trying to make in your text. And don't get me started on unique and more complex diagrams and infographics.
Zooming in and out just doesn't cut it.
Of course, you could write a different text with different diagrams for different viewport sizes, but besides spending more time writing your documents, it also becomes a referencing hell.
I don't see why the typical shrunken image with a link to a larger size doesn't work? You can only put so much on a small screen, and if zooming + panning doesn't work, and the authors / publishers don't want to make multiple representations for different sized devices (really just one for small and one for large) - what else would work?
Don't get me wrong, I like information rich interactive documents. I just think it is an illusion to think that these documents fit different (sized) devices without (extensive) editing. And there is nothing wrong with that as long as we are aware of that. The problem I currently see is that textbook writers (I work in education) write a textbook the traditional way and then digitize it poorly without any regard for the nature of the devices it is used on nor of modern information processing capabilities.
Or, stated differently, think about how information is represented best on different devices (device groups).
Perhaps I am far too naive on this subject, but I would have thought that it would be quite easy to algorithmically work out how best to present information on each device using different aspect ratios, colour levels and resolutions as variables to change em or % css values of elements, or even conditional css layouts.
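Something like the following is what I have in mind, as a very rough sketch. The thresholds, the function name and the returned keys are all invented for illustration; in practice this would more likely be expressed as CSS media queries than as code.

```python
def choose_layout(width_px: int, height_px: int, is_colour: bool) -> dict:
    """Pick presentation settings from basic device properties.

    All cut-off values here are hypothetical and would need tuning.
    """
    landscape = width_px > height_px
    short_side = min(width_px, height_px)
    return {
        # Two columns only when there is clearly room for them.
        "columns": 2 if landscape and width_px >= 1024 else 1,
        # Bump the base font size on small screens.
        "font_scale_em": 1.0 if short_side >= 600 else 1.25,
        # Fall back to greyscale artwork on monochrome devices.
        "images": "colour" if is_colour else "greyscale",
    }


ereader = choose_layout(600, 800, is_colour=False)
tablet = choose_layout(1280, 800, is_colour=True)
phone = choose_layout(320, 480, is_colour=True)
```

This only rearranges and rescales what is already there; it does not decide whether a given diagram still makes sense at that size.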
With respect to a purely technical translation from one device to the other, you're absolutely right. The problem is in interpretation/representation/communication of information/ideas/knowledge.
For example, I've built an interactive computer simulation connected to a graphing component to enable students to explore rate of change. On a computer screen the simulation and graph are positioned so that there is a direct link between what happens in the graph and in the simulation. It doesn't fit on a smartphone. I can (automatically) change the setup to put the graph after (below) the simulation, but if the student cannot see the graph and the simulation simultaneously, they miss out on the support for learning the concept of change in the original configuration.
Another example: I taught a course on regular expressions, and that included a unit on deterministic finite automatons. For non-trivial examples these automatons aren't comprehensible on small screens. Zoom and pan doesn't work to get a picture of the whole. On the other hand, I can think of a simulation of an automaton that would give a better picture on a small screen of how a particular automaton works by using a simplified track representing the whole and focusing on a local state and input to see the effect of connections in the track. However, this would mean two different approaches to learning automatons that aren't necessarily compatible, especially in an introductory course. Students will have different questions and problems with the different representations. They will construct different ideas about automatons that might make communicating in class about automatons troublesome, as students don't understand each other while talking about the same thing. But it goes further than that: these representations will need different introductions, different exercises, maybe even a differently structured learning trajectory to build a similar understanding of automatons.
Perhaps we need to invent different ways of displaying information when a lack of space is a given.
For example, with your interactive simulation, could the entities change colour or shape to show rate of change?
With your automatons, I would have thought it wise to try to separate the different groups and then step through each group. Understanding each group in order would, for me, be a better way to learn it than nitpicking at different parts of the whole system.
Certainly, it is similar to mobile versions of websites: build the best representation for the device and situation and be aware of the differences and potential effects of these differences. I think that smartphones do offer great opportunities in education, not as the main information gateway, but as an auxiliary tool for students and teachers to interact with the content, each other, and the learning environment.
Design and looks aside, I can read a PDF on my laptop, desktop, phone, and e-reader, on any OS. Why the hell would I want to use a format that only works on one device?
>>>I can read a PDF on my laptop, desktop, phone, and e-reader
When MobiPocket (the basis for Kindle format) and ePub were specced, the world looked like this:
>>>I can read a PDF on my laptop, desktop
Expensive hardware with powerful CPUs. Kindle/ePub were designed for weak CPUs that could basically handle text (K/ePub -- tarted-up text files) well. Back then, these were PDAs.
It's only been really recently -- with iPad 4 -- that monster PDFs, such as those from Google Books (scans of original book/magazine pages), can be read without wanting to toss the hardware against the wall.
That's because Google Books made a decision to use a feature of PDF (JPEG2000 image compression) that is better suited for higher powered devices. They could just as easily have chosen JPEG or Flate if they wanted.
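You can get a rough idea of which compression a given PDF uses by counting the stream filter names in the raw file. A crude sketch, assuming the standard filter names `/JPXDecode` (JPEG 2000), `/DCTDecode` (JPEG) and `/FlateDecode`; it does no real parsing, so it will miss anything tucked inside compressed object streams, and Flate is used for plenty of things besides images.

```python
import re
from collections import Counter

# PDF stream filter names mapped to friendlier labels.
FILTER_LABELS = {
    b"JPXDecode": "JPEG 2000",
    b"DCTDecode": "JPEG",
    b"FlateDecode": "Flate",
}

FILTER_PATTERN = re.compile(rb"/(JPXDecode|DCTDecode|FlateDecode)\b")


def count_filters(pdf_bytes: bytes) -> Counter:
    """Count occurrences of each known filter name in raw PDF bytes."""
    counts = Counter()
    for match in FILTER_PATTERN.finditer(pdf_bytes):
        counts[FILTER_LABELS[match.group(1)]] += 1
    return counts


# Tiny synthetic fragment standing in for a scanned-book PDF.
sample = (
    b"<< /Type /XObject /Subtype /Image /Filter /JPXDecode >>\n"
    b"<< /Type /XObject /Subtype /Image /Filter /JPXDecode >>\n"
    b"<< /Type /XObject /Subtype /Image /Filter /DCTDecode >>\n"
)
sample_counts = count_filters(sample)
```

On a real file you would pass in `open(path, "rb").read()`. A scan that is mostly JPEG 2000 is the kind that tends to bring weaker devices to their knees.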
I feel like a god damned idiot, but what the fuck is a guy to do if he wants to read PDF files on an e-reader? They look like dogshit on Kindles. Thing is, I have a ton of them. Most of them somewhat obscure/can't be found in other formats.
I might be talking nonsense here, but I swear there is a niche market for some kind of affordable (<$100) tablet specifically designed for PDFs.
>>>I might be talking nonsense here, but I swear there is a niche market for some kind of affordable (<$100) tablet specifically designed for PDFs.
What would differentiate the hardware from current tablets to make that price possible?
And then what kind of PDF do you mean? Something squirted from a word processing/page-layout file that permits reflow or a collection of image scans (see Google Books) that are static? It's not that easy.
Less memory. Less research. It only has one job to do, etc. No extra bullshit.
I know I'm being ridiculous. I'm just bummed that I can't read some of these really awesome books with the same leisure that I can a Kindle book or a physical book. First world problems, etc. I have to get over being phenomenologically appalled at the idea of reading a book in the same space I read message boards. There's a weird mental block there.
Less memory? I don't know about that. There are some Google Book PDF scans of early 20th-century magazines that are close to 1GB in size. It's been my experience that even a good tablet doesn't keep more than 3-5 of these scanned PDF pages all rendered and ready to read if you want to switch back and forth between a bunch of pages.
I don't know about any of the more modern Kindles (or equivalent) but my girlfriend's Kindle can't handle PDFs very well either. She reads a lot of scientific journals, which seem to usually be written in columns, and the Kindle gives you the option of either seeing the left column and the left half of the right one, or just the right-hand half of the right.
This is the only reason I don't own an ebook reader yet. All the books I've read over the past few years are technical books. If I'm going to read rather tough material, I at least want the design/formatting to assist me in taking in the content.
Ebooks eliminate any typographical support (good) books can deliver.
> I mean, who gives a shit about some nerds when you’re moving bazillions of copies of books that help teens or moms explore new facets of their sexuality?
Can we have this conversation without resorting to highly gendered stereotypes for trivial-media-that-I'm-not-interested-in examples?
I'm really hoping Knuth doesn't discover how bad his books look in their Kindle editions, he's already had to fix this problem once... http://en.wikipedia.org/wiki/TeX#History