Just in case you don't click through, the bug submitter refers to a color PDF (not a scan, so image compression artifacts are not an issue) that is similar in appearance to a periodic table. That is, it is not the sequence of `123456` that is mistranslated into `114447`, but a sequence of 6 table cells, each containing a single digit.
It's not just the numbers that are misprinted, but the text inside those cells too, which suggests that Edge's PDF engine is re-rendering the original PDF, rather than printing the original PDF as is, which I thought was the entire point of using PDF in the first place.
But maybe this is an edge case? In the sense that Microsoft assumes that given a PDF file, if a user wants to "Print to PDF", the user should just save the PDF file. "Print to PDF" is ostensibly used to convert HTML/DOC into PDF format.
As someone who works in PDFs constantly (due to work in local government), I would say the point of a PDF is to be able to reproduce the result given a file, not to assume the file cannot be changed... you can easily edit and save PDF files using Acrobat, for example.
It is common that when you "print to PDF" you take the output of the printer and serialize that to PDF. I use this feature often on my Mac (which I think many would claim has excellent support for dealing with PDF files) to build a PDF that is stripped of any interactive forms: so as to get an output which is only the PDF "as printed".
The built-in MacOS support for PDF is great. Any app that can print to a printer can print to a PDF file. And the Preview app, despite its name, can edit PDFs to add annotations (text, graphics), add or remove pages, etc. There's also a feature by which you sign your name on a piece of paper, hold it up to the camera, and it will capture it and insert it into the document. It's convenient that these are built in and don't require paying for something bloated like Acrobat Pro.
That's so amazing to hear! It pretty much flew under the company radar--most people didn't find out about it until the first reviews of the Lion beta came out.
I don't think it was very common at Apple for an individual engineer to conceive, implement, and ship a feature like this. There was a general sentiment on the team of "let's do something with signatures," but we knew that very few people had scanners. We thought about touchpad input, but decided against it at the time. (That came much later, in either 10.11 or 10.12)
I was thinking, "well, almost no one has a scanner, but practically everyone using this application has a camera in their Mac." I built the initial prototype in OpenCV and then ported it to Apple's vDSP/Accelerate frameworks.
My favorite detail, which doesn't seem to be present in 10.12, is that you could just click on a horizontal line in a PDF; since I recorded the signature's offset relative to the baseline superimposed on the camera image, it would place the signature exactly on the line, with descenders nicely descending.
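For anyone curious how that baseline trick works out arithmetically, here is a minimal sketch. It is not Preview's code; the function name, the parameters, and the choice to center the signature on the click are all assumptions made for illustration. It assumes PDF-style page coordinates (origin at the bottom-left, y increasing upward) and a baseline recorded as a distance from the top of the captured image.

```python
def place_signature(line_y, click_x, sig_width, sig_height, baseline_from_top):
    """Return (left, bottom, width, height) for the signature box.

    Hypothetical helper: page coordinates have the origin at the
    bottom-left with y increasing upward, and baseline_from_top is the
    recorded distance from the top of the image down to the baseline.
    """
    # The part of the image below the baseline holds the descenders.
    descent = sig_height - baseline_from_top
    # Drop the box by that amount so the baseline sits on the clicked line.
    bottom = line_y - descent
    # Centering horizontally on the click is an assumption of this sketch.
    left = click_x - sig_width / 2
    return (left, bottom, sig_width, sig_height)


print(place_signature(line_y=100, click_x=300, sig_width=120,
                      sig_height=40, baseline_from_top=30))
```

With the baseline 30 units down a 40-unit-tall image, the bottom 10 units hang below the line, which is what lets the descenders descend.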
I've since moved on from macOS/iOS development, but I had a really positive experience at Apple. Met so many amazingly talented people there.
Thanks! It's one of my favorite features in Preview.
I used to hate PDFs before getting a Mac, but the first-class support in Preview, with features like that (along with the ability to tear out, reorder and attach pages) made me change my mind.
To be fair, on a Microsoft Surface you would just use your pen to sign directly on the screen which is much more natural. It doesn't make any sense to me that Steve Jobs was so anti-pen, while also pushing for intuitive interaction.
Perhaps if someone added electromagnetic traction so that the screen provided some push-back to the pen tip, it would seem "natural," but as things are now, my on-screen signature is as good as a chicken scratch.
Providing a pen is an excuse to leave controls that are too small to use with your fingers. So while Windows now has a Touch mode where many controls are larger, many others which are also essential to using the system are tiny and hard or impossible to use by touch. That is what Steve Jobs was trying (successfully) to avoid.
That's how it works on an iPad. Touch/pen interfaces are terrible on the PC/Laptop form factor though. At best you gimp your desktop metaphor so it's functional with touch, at worst you have an unusable touch interface.
I think he was anti the crappy pens and digitising systems available at the time, the requirement to have a pen to use the device at all and the poor interface designs they encouraged.
Thanks! I use that feature almost every day. We try to be a paperless home so being able to sign PDF forms and return them without printing is one of my single favourite features of OSX. We even bought our house using that feature for all parties to sign it around my MacBook.
Thanks. It's a great feature, and one that adds a lot to the overall Mac OS experience. It's too bad that most people have no idea that it exists, since no other OS has it. But for those who know, it's a huge time and paper saver. Excellent work.
I just found out about this feature a couple of days ago, after being a Mac user for 4 years. My first thought was it's incredibly neat - shame I didn't learn about it before.
NeXT used Display PostScript, and I would create widgets by writing PostScript code.
Quartz is like QuickTime plus something like OpenGL shaders plus something like NeXTStep "display PDF". I have no idea if this encouraged PDF integration into the display model.
Rather, I would say that NeXT, and then Apple, had some great IP and cross-licensing of PostScript and PDF display tech, and so they could ship the OS distribution with PDF as the printing model.
That is, one reason Windows might today re-render PDF → XPS → PDF is that they had needed to create display tech like PDF anyway, and so they did, and this was after humans had been playing with HTML for a while... Silverlight was pretty.
That's not what's happening unless you have a printer which directly supports PDF. Most likely what is happening on Windows is either:
PDF → XPS → PCL
PDF → XPS → PostScript
Depending on the printer. Which makes far more sense and is analogous to what OS X does:
PDF → Quartz → PCL
PDF → Quartz → PostScript
There's really no notable difference between these two approaches as both Quartz and XPS are based on the same printing model as PDF. Don't confuse Quartz with PDF just because it offers import/export to PDF - Quartz has its own internal imaging model.
Absolutely right. PDF fixes the problem Word documents have where different versions of Word tend to render the document ever so differently. Usually in a way that seems to mess up all those beautiful page breaks you meticulously planned. It does this by giving every element an absolute position.
Which is why ePub and other digital book formats exist. I'm glad there is a format that prioritizes WYSIWYG over reflow (though the part of the spec where they introduced scripting is a bit dodgy).
Should be able to easily reflow text as long as it's using newline operators (T*, ', "). Might still need some basic heuristics for paragraph breaks. But much better than the alternative of attempting to correlate individual lines of text together based on positioning.
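As a rough illustration of the idea, here is a sketch that treats `T*`, `'` and `"` as line breaks and `Tj` as a continuation of the current line. It assumes an already decompressed content stream with plain literal strings; `TJ` arrays, hex strings, escape decoding and font encodings are all ignored, and the function names are made up for this example.

```python
import re

# Either a literal string followed by a text-showing operator (Tj, ', "),
# or a bare T* (move to the start of the next line).
TOKEN = re.compile(r"""\(((?:[^()\\]|\\.)*)\)\s*(Tj|'|")|(T\*)""")


def extract_lines(content):
    """Split shown text into lines at the PDF next-line operators."""
    lines = [""]
    for match in TOKEN.finditer(content):
        text, op, tstar = match.groups()
        # T*, ' and " all move to the next line; ' and " then show text.
        if tstar or op in ("'", '"'):
            lines.append("")
        if text is not None:
            lines[-1] += text
    return [line for line in lines if line]


def reflow(lines):
    """Join the print lines into one paragraph (no paragraph heuristics)."""
    return " ".join(line.strip() for line in lines)


sample = """BT /F1 12 Tf 14 TL 72 720 Td (Hello wor) Tj (ld, this is) Tj T* (line two) Tj (and three) ' 0 0 (and four) " ET"""
print(extract_lines(sample))
print(reflow(sample and extract_lines(sample)))
```

A PDF that positions every line with `Td` or `Tm` instead gives this approach nothing to work with, which is where the position-based heuristics come back in.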
I use Calibre for my e-books, and it seems to do a pretty good job of reflowing PDFs (and exporting to mobi). YMMV though, especially if headers/footers are badly done.
Off-topic, but you seem like you might know: why does text copied from PDFs sometimes have messed-up spaces? It seems to guess where the spaces should go based on kerning, so with justified text, a widely-spaced line may come out with a space between each letter, while a narrowly-spaced one has no spaces at all.
(Also the thing where it inserts line breaks at the end of every print line is maddening)
That's often caused by the font specified in the PDF not being available on the platform where the PDF viewer is running, so a different font has been used instead.
Hmm. I may have been unclear--the PDF reads fine, but if I copy and paste some text into a text editor, I get the messed-up spaces. It seems as if the PDF doesn't encode text as text but just as a series of characters and locations, leaving spaces unrecorded, so when copy-pasting the reader has to guess from the distance between letters.
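To make that guess concrete, here is a toy version of the heuristic. It is not any particular reader's algorithm: the `Glyph` class, the `layout` helper and the 20%-of-font-size threshold are all assumptions for illustration. Glyphs carry only a character, a left edge and a width, with no space characters stored, and a space is inserted whenever the gap to the previous glyph exceeds the threshold.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    char: str
    x: float      # left edge, in points
    width: float  # advance width, in points


def guess_text(glyphs, font_size=10.0, gap_ratio=0.2):
    """Rebuild a line of text, inserting a space wherever the gap is large."""
    threshold = gap_ratio * font_size
    out = []
    prev_end = None
    for glyph in sorted(glyphs, key=lambda g: g.x):
        if prev_end is not None and glyph.x - prev_end > threshold:
            out.append(" ")
        out.append(glyph.char)
        prev_end = glyph.x + glyph.width
    return "".join(out)


def layout(words, letter_gap, word_gap, width=5.0):
    """Hypothetical helper: position glyphs only, with no space glyphs."""
    glyphs, x = [], 0.0
    for word in words:
        for ch in word:
            glyphs.append(Glyph(ch, x, width))
            x += width + letter_gap
        x += word_gap
    return glyphs


words = ["to", "be"]
print(repr(guess_text(layout(words, letter_gap=0.0, word_gap=3.0))))  # normal line
print(repr(guess_text(layout(words, letter_gap=2.5, word_gap=5.0))))  # stretched by justification
print(repr(guess_text(layout(words, letter_gap=0.0, word_gap=1.5))))  # squeezed by justification
```

The stretched line crosses the threshold between every pair of letters and the squeezed one never crosses it, which matches the two failure modes described above.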
> I use this feature often on my Mac (which I think many would claim has excellent support for dealing with PDF files) to build a PDF that is stripped of any interactive forms: so as to get an output which is only the PDF "as printed".
Interesting; sort of like taking an old Fireworks .fw.png file, and then exporting it to PNG, to get rid of the Fireworks project data. Never thought of using "Print to PDF" this way!
> which suggests that Edge's PDF engine is re-rendering the original PDF, rather than printing the original PDF as is
That's to be expected. The bitmap which Edge has rendered to the screen is not what will be sent to the printer driver. Instead, rich vector graphics will be sent. On Windows, the native print format is XPS, so this is most likely a bug in how Edge converts PDF to XPS for printing.
For simpler use cases Windows' graphics APIs can be used to both render to bitmap and to XPS but when printing something as rich and sophisticated as PDF better results are achieved by directly targeting the native print format, such as PCL, PostScript, or XPS. I suspect that's what the Edge devs have done and why it's producing different results on screen and in print.
To make things even more interesting, the "original" PDF seems to have been generated by Ghostscript 8.15 and PScript5.dll 5.2, that is, it was also "printed to PDF" (from Microsoft Word, I presume).
I have also encountered something similar whilst attempting to print a ticket to a major amusement park in Europe using Edge. The page the tickets were on was secured by a login mechanism, and attempting to print the tickets resulted in a page with an error. I had to save the PDF to the computer and print from there to get the proper output. It definitely seems like Edge re-renders or even re-requests the PDF before printing.
You would probably like the James Mickens video "Life As A Developer: My Code Does Not Work Because I Am A Victim Of Complex Societal Factors That Are Beyond My Control"[0]. He starts talking about the Adobe PDF reader at 19:30.
The PDF goes through different rendering paths for display vs print. It's GDI+ for display, and WPF with an XPS spool file for print. So my guess is whatever does PDF to XPS filtering/conversion is getting something wrong; but then it could be complicated by an additional bug in the print driver, which is why the report says the bug depends on what printer is used for printing.