Print
Garbled text
LibreOffice Writer 7.5 exports PDF 1.6 files
with embedded subset fonts,
which render as garbled text
e.g. when using "direct print" from a USB dongle
on a Brother HL-4040CN Laser printer.
A workaround is to downgrade to PDF 1.2 like this:
ps2pdf12 infile outfile
Unsupported PDF data
Some PDF generators,
e.g. those used by Deutsche Bahn or Microsoft Word on Windows,
embed some as yet unidentified PDF data,
which is replaced by an error message
e.g. when using "direct print" from a USB dongle
on a Brother HL-4040CN Laser printer.
A workaround is to refry like this:
pdf2ps infile - | ps2pdf12 - outfile
or using this custom wrapper, where available:
localpdf2ps2pdf infile
If the above fails to work,
then try replacing pdf2ps with either pdftops or pdftocairo.
Beware that options may differ.
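The fallback described above can be sketched as a small shell function.
This is only an illustration: refry_pdf is a hypothetical name (it is not the localpdf2ps2pdf wrapper), and it assumes that the converters accept "input output" arguments and that pdftocairo needs -ps to emit PostScript.

```shell
# refry_pdf: hypothetical helper, not the localpdf2ps2pdf wrapper.
# Usage: refry_pdf infile.pdf outfile.pdf
refry_pdf() {
    src=$1
    dst=$2
    tmp=$(mktemp) || return 1
    # Try each PDF-to-PostScript converter in turn until one succeeds.
    for conv in "pdf2ps" "pdftops" "pdftocairo -ps"; do
        # $conv is deliberately unquoted so "pdftocairo -ps" splits
        # into a command name and its option.
        if $conv "$src" "$tmp" && ps2pdf12 "$tmp" "$dst"; then
            rm -f "$tmp"
            return 0
        fi
    done
    rm -f "$tmp"
    return 1
}
```

An intermediate file is used instead of a pipe so that a failing converter is detected in plain POSIX sh.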
PDF resizing and/or scaling
PDF resizing is changing media size,
either together with or independent of content.
PDF scaling (a.k.a. "zooming") is changing content size,
relative to media size.
Both resizing and scaling at once can be confusing.
A suggested approach is to first resize media+content together,
then scale content relative to this new target media size.
PDF dimensions
Resizing and/or scaling PDF files
is tied to input and output dimensions.
Simple measures are standard page formats like A4, A3, or letter,
but more reliable are internal "box" dimensions measured in "pt".
In addition to page sizes,
PDF files technically annotate content using several "boxes":
- MediaBox: full printable area
- CropBox: area displayed in a PDF viewer
  - Optional, defaults to MediaBox
- BleedBox: area including TrimBox/ArtBox and bleed
  - Optional, defaults to CropBox (and thus to MediaBox when CropBox is omitted)
  - Must contain TrimBox/ArtBox, and fit within MediaBox
- TrimBox: subset of printable area without bleed, cropmarks etc.
- ArtBox: essential subset of printable area
(Technically a "BoundingBox" is not a PDF hint
but the equivalent of ArtBox in EPS and DSC-compliant Postscript.)
E.g. an A4 sized PDF page
is technically a page with a /MediaBox of approx. 595 x 842 pt.
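The A4 figures follow from 1 pt being 1/72 inch and 1 inch being 25.4 mm.
A minimal sketch of the conversion, where mm_to_pt is a hypothetical helper name and awk is assumed to be available:

```shell
# mm_to_pt: hypothetical helper converting millimetres to PDF points.
# 1 pt = 1/72 inch, 1 inch = 25.4 mm.
mm_to_pt() {
    LC_ALL=C awk -v mm="$1" 'BEGIN { printf "%.2f\n", mm / 25.4 * 72 }'
}

mm_to_pt 210   # A4 width in pt
mm_to_pt 297   # A4 height in pt
```

The two calls print 595.28 and 841.89, which round to the 595 x 842 pt mentioned above.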
Extracting PDF box hints
Resolve all boxes (computing from defaults any omitted ones):
pdfinfo -box -l -1 input.pdf
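To turn the box corners into a width and height, the pdfinfo output can be filtered.
The sketch below uses a hypothetical helper box_size and assumes each box line has the form "NAME: x0 y0 x1 y1", optionally prefixed with "Page N"; check the actual output of your pdfinfo version.

```shell
# box_size: hypothetical helper. Reads `pdfinfo -box` output on stdin and
# prints "WIDTH HEIGHT" in pt for each line naming the requested box.
box_size() {
    LC_ALL=C awk -v box="${1:-MediaBox}:" '
        { for (i = 1; i <= NF - 4; i++)
              if ($i == box)
                  printf "%.2f %.2f\n", $(i+3) - $(i+1), $(i+4) - $(i+2) }'
}

# Real use would be: pdfinfo -box -l -1 input.pdf | box_size CropBox
# Demo on an illustrative line in the assumed output format:
printf 'Page    1 MediaBox:     0.00     0.00   595.00   842.00\n' | box_size
```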
Extract MediaBox:
gs -dQUIET -dNODISPLAY -dNOSAFER -sFileName=input.pdf \
-c "FileName (r) file runpdfbegin 1 1 pdfpagecount {pdfgetpage /MediaBox get {=print ( ) print} forall (\n) print} for quit"
To check if other boxes exist,
replace /MediaBox
with e.g. /CropBox
in above command,
and test if running the command succeeds.
Computing PDF dimensions
Ghostscript can virtually render each page and compute the area used,
useful e.g. for cropping a PDF.
TODO: Add section on cropping, based on https://stackoverflow.com/a/10418720/18619283
Compute BoundingBox for each page individually (not across them all);
as written, -dLastPage=1 stops after the first page, so drop that option to cover every page:
gs -q -dBATCH -dNOPAUSE -sDEVICE=bbox -dLastPage=1 input.pdf 2>&1 | grep %%BoundingBox
Source: https://stackoverflow.com/a/52644056
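The reported corners can be reduced to the size of the used area, e.g. as input for cropping.
A minimal sketch, where bbox_size is a hypothetical helper and the sample values are made up for illustration:

```shell
# bbox_size: hypothetical helper. Reads "%%BoundingBox: x0 y0 x1 y1" lines
# on stdin and prints "WIDTH HEIGHT" in pt for each of them.
bbox_size() {
    LC_ALL=C awk '$1 == "%%BoundingBox:" && NF == 5 { print $4 - $2, $5 - $3 }'
}

# Real use would append "| bbox_size" to the gs command above.
# Demo on an illustrative line (values are not from a real document):
printf '%%%%BoundingBox: 28 30 567 812\n' | bbox_size
```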
Resize PDF with crop marks
FIXME: merge this section with each of below specific ones, referencing dimensions and box marks above
Simple resizing to fit target paper size
can often be done on-the-fly at the print dialog.
Resizing to another PDF file can however be more reliable,
and much cheaper when passing files to a third-party printing service.
Some PDF processing tools rasterize content,
and scaling is often expressed as either exact width and height
or a width/height scaling factor,
only rarely as the more intuitive area factor.
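The two kinds of factor are related by a square root: halving the area shrinks width and height by about 0.7071 each.
A minimal sketch, where area_to_linear is a hypothetical helper name:

```shell
# area_to_linear: hypothetical helper converting an area factor
# into the equivalent width/height scaling factor.
area_to_linear() {
    LC_ALL=C awk -v a="$1" 'BEGIN { printf "%.4f\n", sqrt(a) }'
}

area_to_linear 0.5   # area factor for A3 to A4
```

This prints 0.7071.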
Resizing by scaling factor is particularly useful
for PDF documents containing bleed and/or crop marks,
where the inner part needs to fit a certain size.
This example processes an A3 layout with crop marks
placed on A3 oversize PDF pages,
resizing to have inner part fit A4.
using plakativ (and mupdf internally)
FIXME: untested!
While the main purpose of plakativ is to not only resize
but also slice onto multiple smaller tiles,
one of its features is easy scaling by area.
plakativ --factor=0.5 --size=250mmx337mm --output=output.pdf input.pdf
using Ghostscript with FitPage
Ghostscript resizing is done by first defining the target size
and then telling it to resize content to fit that target with FitPage.
Simply setting a target PAPERSIZE
would either scale too much
or (e.g. with -dUseArtBox) would lose bleed and crop marks.
Instead we first look up original width and height with the command pdfinfo
and explicitly set those values scaled down by 21/29.7
(the width ratio between A4 and A3 page formats).
Example command,
resizing from A3+ to A4
(source width 910.24 and height 1258.9 as reported by pdfinfo):
gs -o output.pdf -sDEVICE=pdfwrite -dDEVICEWIDTHPOINTS=643.60333 -dDEVICEHEIGHTPOINTS=890.131 -dFIXEDMEDIA -dFitPage -dCompatibilityLevel=1.4 input.pdf
TODO: maybe options -dDEVICEWIDTHPOINTS and -dDEVICEHEIGHTPOINTS
can be shortened as -dDEVICEWIDTH and -dDEVICEHEIGHT
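The scaled target size can be computed rather than worked out by hand.
A minimal sketch, where scale_a3_to_a4 is a hypothetical helper that takes the width and height reported by pdfinfo and applies the 21/29.7 ratio:

```shell
# scale_a3_to_a4: hypothetical helper. Takes source width and height in pt
# and prints the matching Ghostscript size options scaled by 21/29.7.
scale_a3_to_a4() {
    LC_ALL=C awk -v w="$1" -v h="$2" 'BEGIN {
        f = 21 / 29.7
        printf "-dDEVICEWIDTHPOINTS=%.2f -dDEVICEHEIGHTPOINTS=%.2f\n", w * f, h * f
    }'
}

scale_a3_to_a4 910.24 1258.9
```

This prints -dDEVICEWIDTHPOINTS=643.60 -dDEVICEHEIGHTPOINTS=890.13, agreeing with the values in the example command to two decimals.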
using Ghostscript with setpagedevice
Some PDF documents embed /CropBox marks,
which are not handled by the -dFitPage option.
To effectively resize such documents,
each page needs a /CropBox mark applied
that matches the new size.
Example command,
resizing from A4 to printable-area-A4
(i.e. a dumb printer always auto-resizing to fit):
gs -o output.pdf -sDEVICE=pdfwrite -dDEVICEWIDTHPOINTS=595 -dDEVICEHEIGHTPOINTS=842 -dFIXEDMEDIA -dCompatibilityLevel=1.4 \
-f input.pdf -c "<</EndPage {0 eq {[/CropBox [0 0 567 802] /PAGE pdfmark true}{false}ifelse}>> setpagedevice"
NB! It is important to list option -f before -c,
to ensure that any CropBox marks in the file are overridden by the command.
TODO: maybe options -dDEVICEWIDTHPOINTS and -dDEVICEHEIGHTPOINTS
are superfluous
TODO: maybe options -dDEVICEWIDTHPOINTS and -dDEVICEHEIGHTPOINTS
can be shortened as -dDEVICEWIDTH and -dDEVICEHEIGHT
Source: https://stackoverflow.com/a/26989410
Terminology
Downgrade
To parse a PDF file and recreate a new PDF file using reduced features
is called downgrading.
Refry
To "flatten" a PDF file to Postscript and then recreate as new PDF file
is called refrying.