Print

Garbled text

LibreOffice Writer 7.5 exports PDF 1.6 files with embedded subset fonts, which render as garbled text, e.g. when using "direct print" from a USB dongle on a Brother HL-4040CN laser printer. A workaround is to downgrade to PDF 1.2 like this:

ps2pdf12 infile outfile

Unsupported PDF data

Some PDF files, e.g. those generated by Deutsche Bahn or by Microsoft Word on Windows, contain some as yet unidentified PDF data, which is replaced by an error message, e.g. when using "direct print" from a USB dongle on a Brother HL-4040CN laser printer. A workaround is to refry like this:

pdf2ps infile - | ps2pdf12 - outfile

or using this custom wrapper, where available:

localpdf2ps2pdf infile

If the above fails to work, try replacing pdf2ps with either pdftops or pdftocairo. Beware that their options may differ.

PDF resizing and/or scaling

PDF resizing is changing media size, either together with or independent of content.

PDF scaling (a.k.a. "zooming") is changing content size, relative to media size.

Doing both resizing and scaling at once can be confusing. A suggested approach is to first resize media and content together, then scale content relative to this new target media size.

PDF dimensions

Resizing and/or scaling PDF files is tied to input and output dimensions. Simple measures are standard page formats like A4, A3, or letter, but more reliable are the internal "box" dimensions, measured in "pt" (points, 1/72 inch).

In addition to page sizes, PDF files technically annotate content using several "boxes":

  • MediaBox: full printable area
    • Required
  • CropBox: area displayed in a PDF viewer
    • Optional, defaults to MediaBox
  • BleedBox: area including TrimBox/ArtBox and bleed
    • Optional, defaults to CropBox
    • Should be at least as large as TrimBox/ArtBox, and no larger than MediaBox
  • TrimBox: subset of printable area without bleed, crop marks etc.
    • Optional, defaults to CropBox
  • ArtBox: essential subset of printable area
    • Optional, defaults to CropBox

(Technically a "BoundingBox" is not a PDF hint but the equivalent of ArtBox in EPS and DSC-compliant PostScript.)

E.g. an A4 sized PDF page is technically a page with a /MediaBox of approx. 595 x 842 pt.
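
These point values follow from 1 pt = 1/72 inch and 1 inch = 25.4 mm. A minimal sketch of the conversion, assuming awk is available; the helper name mm_to_pt is hypothetical and not part of any tool mentioned here:

```shell
# Convert millimetres to PDF points (1 pt = 1/72 inch, 1 inch = 25.4 mm).
# mm_to_pt is a hypothetical helper name, used only for illustration.
mm_to_pt() {
  awk -v mm="$1" 'BEGIN { printf "%.2f\n", mm * 72 / 25.4 }'
}

# A4 is 210 mm x 297 mm
mm_to_pt 210   # prints 595.28
mm_to_pt 297   # prints 841.89
```

Rounded to whole points this gives the 595 x 842 pt quoted for A4.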

Extracting PDF box hints

Resolve all boxes (computing from defaults any omitted ones):

pdfinfo -box -l -1 input.pdf
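
Page width and height can be derived from that output by subtracting box corners. A minimal sketch, assuming pdfinfo prints per-page lines of the form "Page N MediaBox: x1 y1 x2 y2" when a page range is given; the helper name mediabox_sizes is hypothetical:

```shell
# Read the output of `pdfinfo -box -l -1 FILE` on stdin and print
# "page width height" (in pt) for each MediaBox line.
# Assumes lines like: "Page    1 MediaBox:     0.00     0.00   595.00   842.00"
# mediabox_sizes is a hypothetical helper name.
mediabox_sizes() {
  awk '$1 == "Page" && $3 == "MediaBox:" {
    printf "%d %.2f %.2f\n", $2, $6 - $4, $7 - $5
  }'
}

# Usage (requires pdfinfo and an existing input.pdf):
#   pdfinfo -box -l -1 input.pdf | mediabox_sizes
```

Replacing "MediaBox:" in the awk pattern with e.g. "CropBox:" should report the size of another box the same way.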

Extract MediaBox:

gs -dQUIET -dNODISPLAY -dNOSAFER -sFileName=input.pdf \
-c "FileName (r) file runpdfbegin 1 1 pdfpagecount {pdfgetpage /MediaBox get {=print ( ) print} forall (\n) print} for quit"

To check whether other boxes exist, replace /MediaBox with e.g. /CropBox in the above command, and check whether the command succeeds.

Computing PDF dimensions

Ghostscript can virtually render each page and compute the area used, useful e.g. for cropping a PDF.

TODO: Add section on cropping, based on https://stackoverflow.com/a/10418720/18619283

Compute BoundingBox for each page individually (not across them all). The option -dLastPage=1 limits this example to the first page; omit it to process every page:

gs -q -dBATCH -dNOPAUSE -sDEVICE=bbox -dLastPage=1 input.pdf 2>&1 | grep %%BoundingBox

Source: https://stackoverflow.com/a/52644056

Resize PDF with crop marks

FIXME: merge this section with each of below specific ones, referencing dimensions and box marks above

Simple resizing to fit target paper size can often be done on-the-fly at the print dialog. Resizing to another PDF file can however be more reliable, and much cheaper when passing files to a third-party printing service.

Some PDF processing tools rasterize content, and scaling is often expressed as either exact width and height or a width/height scaling factor, only rarely as the more intuitive area factor.

Resizing by scaling factor is particularly useful for PDF documents containing bleed and/or crop marks, where the inner part needs to fit a certain size.
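
An area factor translates to a width/height factor by taking its square root, since area grows with the square of the linear dimensions. A minimal sketch, assuming awk is available; the helper name area_to_linear is hypothetical:

```shell
# Convert an area scaling factor to the equivalent width/height factor.
# area_to_linear is a hypothetical helper name, used only for illustration.
area_to_linear() {
  awk -v a="$1" 'BEGIN { printf "%.4f\n", sqrt(a) }'
}

# Halving the area, as when going from A3 to A4
area_to_linear 0.5   # prints 0.7071
```

Halving the area thus means scaling width and height by roughly 0.7071, which agrees to four decimals with the paper-size ratio 21/29.7.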

The examples below process an A3 layout with crop marks placed on oversize A3 (A3+) PDF pages, resizing so that the inner part fits A4.

using plakativ (and mupdf internally)

FIXME: untested!

While the main purpose of plakativ is to not only resize but also slice onto multiple smaller tiles, one of its features is easy scaling by area.

plakativ --factor=0.5 --size=250mmx337mm --output=output.pdf input.pdf

using Ghostscript with FitPage

Ghostscript resizing is done by first defining the target size and then telling Ghostscript to resize content to fit that target with -dFitPage.

Simply setting a target PAPERSIZE would either scale too much or (e.g. with -dUseArtBox) lose bleed and crop marks. Instead we first look up the original width and height with the command pdfinfo and explicitly set those values scaled down by 21/29.7 (the ratio between the widths of A4 and A3 page formats, approx. 0.707).

Example command, resizing from A3+ to A4 (source width 910.24 and height 1258.9 as reported by pdfinfo):

gs -o output.pdf -sDEVICE=pdfwrite -dDEVICEWIDTHPOINTS=643.60333 -dDEVICEHEIGHTPOINTS=890.131 -dFIXEDMEDIA -dFitPage -dCompatibilityLevel=1.4 input.pdf
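
The target dimensions can be recomputed from the source width and height. A minimal sketch, assuming awk is available; the helper name scaled_size_opts is hypothetical, and the 21/29.7 factor is the one described above:

```shell
# Scale source dimensions (in pt) by the A4/A3 ratio 21/29.7 and print
# the matching Ghostscript size options.
# scaled_size_opts is a hypothetical helper name, used only for illustration.
scaled_size_opts() {
  awk -v w="$1" -v h="$2" 'BEGIN {
    f = 21 / 29.7
    printf "-dDEVICEWIDTHPOINTS=%.3f -dDEVICEHEIGHTPOINTS=%.3f\n", w * f, h * f
  }'
}

# Source width and height as reported by pdfinfo
scaled_size_opts 910.24 1258.9
```

For the values quoted above this yields a width of about 643.604 pt and a height of about 890.131 pt. The width differs marginally from the 643.60333 used in the command, presumably because the reported source width was rounded.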

TODO: maybe options -dDEVICEWIDTHPOINTS and -dDEVICEHEIGHTPOINTS can be shortened as -dDEVICEWIDTH and -dDEVICEHEIGHT

using Ghostscript with setpagedevice

Some PDF documents embed /CropBox marks, which are not handled by the -dFitPage option. To resize such documents effectively, a /CropBox mark matching the new size needs to be applied to each page.

Example command, resizing from A4 to printable-area-A4 (i.e. for a dumb printer that always auto-resizes to fit):

gs -o output.pdf -sDEVICE=pdfwrite -dDEVICEWIDTHPOINTS=595 -dDEVICEHEIGHTPOINTS=842 -dFIXEDMEDIA -dCompatibilityLevel=1.4 \
-f input.pdf -c "<</EndPage {0 eq {[/CropBox [0 0 567 802] /PAGE pdfmark true}{false}ifelse}>> setpagedevice"

NB! It is important to list option -f before -c, to ensure that any CropBox marks in the file are overridden by the command.

TODO: maybe options -dDEVICEWIDTHPOINTS and -dDEVICEHEIGHTPOINTS are superfluous

TODO: maybe options -dDEVICEWIDTHPOINTS and -dDEVICEHEIGHTPOINTS can be shortened as -dDEVICEWIDTH and -dDEVICEHEIGHT

Source: https://stackoverflow.com/a/26989410

Terminology

Downgrade

To parse a PDF file and recreate a new PDF file using reduced features is called downgrading.

Refry

To "flatten" a PDF file to PostScript and then recreate it as a new PDF file is called refrying.