Shell Scripting a Side-by-Side Printed PDF Comparison
Today I needed to compare two similar PDFs page by page. Doing so on screen is possible but awkward. Instead, I wanted to print them and write on them, without shuffling papers or taking up a lot of desk space. I wanted each printed sheet to show the same page from both PDFs side by side: the first page of one PDF on one half, the first page of the other PDF on the other half, and so forth. Thanks to the wonders of the Linux command line, doing so was not only possible but free and relatively easy.
My script is pretty tailored to my specific situation, so I won't be including it inline. Instead, I've made it an attachment for the curious to download.
Here are the most interesting parts:
- mktemp
- Creating temporary files in a shell script is common. Doing so correctly is less common. If you always use the same filename, two copies of the script can't run at the same time. Even if you put $$ (the process ID) or the current date and time in the filename, the name is still predictable, so another process can create a file or symlink at that path between the moment you pick the name and the moment you write to it. That race condition is a security risk. Happily, mktemp isn't just one of the most secure options, it's also incredibly easy to use.
- pdftk
- As the pdftk Web site explains it, "If PDF is electronic paper, then pdftk is an electronic staple-remover, hole-punch, binder, secret-decoder-ring, and X-Ray-glasses."
- mpage
- Some print drivers and some programs let you print multiple pages on a single sheet of paper. With mpage you can do the same with PostScript documents. Since PostScript is the traditional printing language on Linux, that makes it possible with any document and any printer.
- pdftops & pdf2ps
- Both convert PDF to PostScript. pdftops is part of poppler-utils, which is based on Xpdf. pdf2ps is part of Ghostscript. I've generally had better results with pdftops, although not always. (For a couple of minutes of fun, try to figure out which version of Ghostscript you're using. There are a handful of forks floating around.)
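To make the mktemp point concrete, here is a minimal sketch of the usual pattern. It is not taken from my script; the variable name workdir and the file page.txt are placeholders chosen for illustration.

```shell
#!/bin/sh
# Create a private scratch directory. mktemp chooses an unpredictable
# name and creates the directory in one step, so there is no window in
# which another process can claim the path first.
workdir=$(mktemp -d) || exit 1

# Remove the scratch directory when the script exits, even on failure.
trap 'rm -rf "$workdir"' EXIT

# Inside the private directory, plain fixed filenames are safe to use.
printf 'scratch data\n' > "$workdir/page.txt"
```

Using one temporary directory rather than several temporary files means a single cleanup line covers every intermediate file the script creates.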
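Putting the tools together might look roughly like the sketch below. This is an illustration, not the attached script: the function name side_by_side and its three arguments are hypothetical, and it assumes a pdftk version that supports the shuffle operation, which interleaves pages from its inputs (A1, B1, A2, B2, and so on). After that, pdftops converts the merged PDF to PostScript and mpage -2 places two pages on each sheet.

```shell
#!/bin/sh
# Hypothetical helper: side_by_side FIRST.pdf SECOND.pdf OUTPUT.ps
side_by_side() {
    a=$1 b=$2 out=$3

    # Fail early if either input PDF is missing or unreadable.
    [ -r "$a" ] && [ -r "$b" ] || {
        echo "side_by_side: cannot read input PDFs" >&2
        return 1
    }

    # Fail early if any of the required tools is not installed.
    for tool in pdftk pdftops mpage; do
        command -v "$tool" >/dev/null 2>&1 || {
            echo "side_by_side: missing $tool" >&2
            return 1
        }
    done

    # Private scratch directory for the intermediate files.
    work=$(mktemp -d) || return 1

    # 1. Interleave the pages of the two PDFs.
    # 2. Convert the merged PDF to PostScript.
    # 3. Put two pages on each sheet; mpage writes to standard output.
    pdftk A="$a" B="$b" shuffle A B output "$work/merged.pdf" &&
        pdftops "$work/merged.pdf" "$work/merged.ps" &&
        mpage -2 "$work/merged.ps" > "$out"
    status=$?

    rm -rf "$work"
    return $status
}
```

A call such as `side_by_side old.pdf new.pdf compare.ps` would then leave a PostScript file ready to send to the printer. If pdftops gives poor results on a particular document, swapping in pdf2ps for step 2 is worth a try.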