pdf-pagenum: Fix Messy macOS Preview Page Numbers in PDFs from the CLI

A pip-installable CLI tool that auto-centers off-center page number annotations created by macOS Preview, or batch-adds new ones — with smart content avoidance and landscape support.

The Problem

If you've ever used macOS Preview to add page numbers to a PDF (via the text annotation tool), you know the pain: numbers land wherever you drop them, never centered, and manually positioning dozens or hundreds of them is soul-crushing. Especially when the PDF has mixed portrait and landscape pages.

I ran into this preparing a thesis — 200+ pages of final manuscript, page numbers visibly off-center on every single page. Editing each one by hand wasn't an option.

The Solution

pdf-pagenum is a single CLI command that reads a folder of PDFs and centers every page number annotation at the bottom of its page. It works by:

Detecting FreeText annotations that look like page numbers
Measuring body content boundaries on each page
Repositioning the annotation to a clean, centered position below the content — with proper margins
Preserving the original page dimensions (no resizing, ever)

If your PDF has pages with no annotations at all, it can generate new page numbers from scratch in the correct position.

Install

pip install pdf-pagenum

That's it. PyMuPDF and natsort come along as dependencies.

Usage

Fix Mode (default)

Reposition existing page number annotations so they're centered at the bottom:

pdf-pagenum ./scans/ ./output/

This is the mode you'll use 90% of the time — it takes whatever rough page numbers Preview gave you and snaps them to the mathematically correct center.

Add Mode

Generate brand-new page numbers on pages that lack them:

# Number all pages starting from 1
pdf-pagenum ./scans/ ./output/ --add all

# Number pages 3 through 7 only
pdf-pagenum ./scans/ ./output/ --add 3-7

# Number specific pages, starting count from 10
pdf-pagenum ./scans/ ./output/ --add 1,3,5-7 --start 10

Ranges and comma-separated lists can be mixed freely.
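To make the range syntax concrete, here is a minimal sketch of how a spec such as 1,3,5-7 could be expanded into page numbers. The function name parse_page_spec and the choice to silently skip out-of-range pages are assumptions for illustration, not necessarily what pdf-pagenum does internally.

```python
from __future__ import annotations


def parse_page_spec(spec: str, page_count: int) -> list[int]:
    """Expand "all", "3-7" or "1,3,5-7" into sorted 1-based page numbers."""
    if spec.strip().lower() == "all":
        return list(range(1, page_count + 1))

    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas such as "1,,3"
        if "-" in part:
            first, last = (int(x) for x in part.split("-", 1))
        else:
            first = last = int(part)
        if first > last:
            raise ValueError(f"descending range: {part!r}")
        # Assumption: pages outside the document are skipped, not an error.
        pages.update(p for p in range(first, last + 1) if 1 <= p <= page_count)
    return sorted(pages)
```

With a 10-page document, parse_page_spec("1,3,5-7", 10) yields pages 1, 3, 5, 6 and 7; duplicates from overlapping ranges collapse because the pages are collected in a set.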

Start Offset

The --start N flag works in both modes — it shifts the logical page number without touching the files. Handy when you're processing chapter PDFs that don't start at page 1.

How It Works Under the Hood

The pipeline runs per-PDF, processing pages independently for speed:

Content Detection: For each page, PyMuPDF extracts all text blocks and images to find the bounding box of actual body content. This tells us how low the content goes — and where the page number should sit below it.
Annotation Matching: FreeText annotations are tested against patterns like \d+ to identify candidates for repositioning. Non-page-number annotations are left untouched.
Font Measurement: Rather than guessing digit widths, the tool renders each page number string with the actual font to get exact bounding dimensions. This guarantees true horizontal centering regardless of digit composition.
Placement: The annotation is repositioned at (page_center - text_width/2, below_content + margin), with a white background fill so it's legible over images or dark areas.
Save: PDFs are written with compression and garbage collection enabled, keeping output sizes small.
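The matching and placement steps above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the tool's actual code: the helper names, the default margin of 12 points, and the strict digits-only match are all hypothetical. Coordinates are in points with y growing downward, as in PyMuPDF, and text_width is assumed to come from a real font measurement (PyMuPDF's get_text_length is one way to obtain it).

```python
import re

# Assumption: a page number is an annotation whose text is only digits.
PAGE_NUMBER_RE = re.compile(r"\d+")


def looks_like_page_number(text: str) -> bool:
    """True if the annotation text is nothing but digits (ignoring whitespace)."""
    return PAGE_NUMBER_RE.fullmatch(text.strip()) is not None


def centered_rect(page_width, content_bottom, text_width, text_height, margin=12.0):
    """Return (x0, y0, x1, y1) for a box centered horizontally below the content."""
    x0 = page_width / 2 - text_width / 2  # page_center - text_width/2
    y0 = content_bottom + margin          # below_content + margin
    return (x0, y0, x0 + text_width, y0 + text_height)
```

Because the box is derived from the measured width rather than an estimated one, a "1" and a "188" both end up with equal space on either side.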

Design Decisions Worth Mentioning

Page dimensions (A4 or otherwise) are never changed. The tool refuses to resize pages. If content is too tight and a page number overlaps body text, it accepts the overlap rather than distorting the page. This is by design: changing page geometry breaks downstream workflows.
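One way to express the "accept the overlap, never resize" rule is to clamp the number's vertical position to the page instead of growing the page. The sketch below assumes a hypothetical place_vertically helper, illustrative margin values, and y measured downward from the top of the page; the real tool may decide this differently.

```python
def place_vertically(page_height, content_bottom, text_height,
                     margin=12.0, edge_margin=18.0):
    """Return (top, overlaps) for the page number box.

    The preferred spot is just below the body content. If that would push
    the number off the page, it is pulled back up to the lowest allowed
    position, even if that means overlapping the content.
    """
    lowest_top = page_height - edge_margin - text_height
    preferred_top = content_bottom + margin
    top = min(preferred_top, lowest_top)
    overlaps = top < content_bottom  # the box starts inside the body content
    return top, overlaps
```

On an A4 page (842 points tall) whose content ends at 700, the number sits at 712 with no overlap. If the content runs down to 830, the number is clamped to 814 and the overlap is reported rather than fixed by changing the page.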

Landscape pages are handled correctly. Content boundaries are measured against the actual page orientation, so a landscape spread's "bottom" is correctly identified.

Natural sorting of input files. File names like chapter10.pdf sort after chapter9.pdf, not after chapter1.pdf, thanks to natsort. This matters when you're numbering pages sequentially across files and want them in human order.
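For readers curious what natural ordering buys you, here is a stdlib-only sketch of the idea, plus one possible way to carry a running page number across files. The tool itself relies on natsort; natural_key and starting_numbers are hypothetical names, and the cross-file numbering scheme is an assumption for illustration.

```python
from __future__ import annotations

import re


def natural_key(name: str) -> list:
    """Split a name into text and integer chunks so that 9 sorts before 10."""
    return [int(tok) if tok.isdigit() else tok.lower()
            for tok in re.split(r"(\d+)", name)]


def starting_numbers(page_counts: dict[str, int], start: int = 1) -> dict[str, int]:
    """Map each file to the number of its first page, in natural file order."""
    result = {}
    next_number = start
    for name in sorted(page_counts, key=natural_key):
        result[name] = next_number
        next_number += page_counts[name]
    return result
```

Given chapter1.pdf (12 pages), chapter9.pdf (20 pages) and chapter10.pdf (5 pages), the first pages would be numbered 1, 13 and 33. A plain alphabetical sort would have put chapter10.pdf second and shifted every later number.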

Real-World Use Cases

Thesis / dissertation submission: Universities often require centered page numbers. If you annotated them hastily in Preview, this fixes them in seconds.
Project proposals and bid documents: Combine multiple PDF scans, then number them end-to-end with --add all --start 1.
Conference proceedings: Normalize page number positions across papers from different authors.
E-book and print prep: Ensure page numbers are consistently placed before sending to a printer or conversion tool.

Why Not Just Use [Other Tool]?

Preview alone has no "center annotation" or "batch page number" feature.
Adobe Acrobat can do this but costs $20+/month and is overkill for one task.
Manual LaTeX / Word layouts require the source document — pdf-pagenum works on the final PDF, even if it's a scan.

Open Source

MIT licensed. PRs welcome — if you have a feature request (custom fonts for add mode? different placement positions?), open an issue.

pip install pdf-pagenum

GitHub: github.com/zhangxin0611/pdf-pagenum
