# ClawPDF
ClawPDF is a small PDFium WebAssembly wrapper for Node and browsers. It loads PDF input, extracts text, renders pages, and produces PNG fallback images without pulling in native canvas packages, postinstall scripts, or runtime dependencies.
It is built for OpenClaw's fallback PDF path: extract text first, render selected pages only when text is too short, and keep image work inside predictable page, pixel, and dimension budgets.
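The text-first decision described above can be sketched as a small pure function. This is a minimal illustration, not ClawPDF's implementation: `planFallback`, `minTextChars`, and `maxPages` are hypothetical names, the threshold values are arbitrary, and 1-based page numbers are an assumption.

```javascript
// Hypothetical sketch of a text-first fallback plan.
// None of these names come from the ClawPDF API.
function planFallback({ text, pageCount, minTextChars = 200, maxPages = 4 }) {
  // Enough extracted text: skip rendering entirely.
  if (text.trim().length >= minTextChars) {
    return { useText: true, pagesToRender: [] };
  }
  // Text too short: render the leading pages, capped by the page budget.
  const count = Math.min(pageCount, maxPages);
  const pagesToRender = Array.from({ length: count }, (_, i) => i + 1); // assumed 1-based
  return { useText: false, pagesToRender };
}

console.log(planFallback({ text: "x".repeat(500), pageCount: 10 }));
console.log(planFallback({ text: "  ", pageCount: 10 }));
```

The first call keeps the text and renders nothing; the second falls back to rendering pages 1 through 4, because the page budget caps the selection.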
# Install
```sh
npm install clawpdf
```
ESM-only. Node 20+ is supported.
# Quick Example
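```javascript
import { writeFile } from "node:fs/promises";
import { openPdf } from "clawpdf";

// `await using` disposes the document when the enclosing scope exits.
await using pdf = await openPdf("report.pdf");

console.log(pdf.pageCount);
console.log(pdf.text({ maxPages: 5 }));

// Render page 1 at 144 DPI with form widgets and save it as a PNG.
const png = await pdf.page(1).png({ dpi: 144, forms: true });
await writeFile("page-1.png", png);
```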
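As a rough guide to what `dpi: 144` means for output size: PDF page sizes are measured in points (1/72 inch), so pixel dimensions scale by `dpi / 72`. The helper below sketches that arithmetic together with a pixel-budget check. `renderSize` and `maxPixels` are hypothetical names, and ClawPDF's own rounding and budget options may differ.

```javascript
// Hypothetical helper: estimate render dimensions for a page at a given DPI.
// Not part of the ClawPDF API.
function renderSize(widthPt, heightPt, dpi, maxPixels = 16_000_000) {
  const scale = dpi / 72; // PDF user space: 72 points per inch
  const width = Math.round(widthPt * scale);
  const height = Math.round(heightPt * scale);
  return { width, height, withinBudget: width * height <= maxPixels };
}

// US Letter (612 x 792 pt) at 144 DPI
console.log(renderSize(612, 792, 144));
// { width: 1224, height: 1584, withinBudget: true }
```

Doubling the DPI quadruples the pixel count, which is why a pixel budget is easier to reason about than a DPI cap alone.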
For server code, keep one `PdfEngine` alive and reuse it. The top-level `extractPdf(...)` helper also shares a default engine when no `engine` option is provided.
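One way to keep a single engine alive in server code is a lazily initialized, module-level promise. The sketch below shows only the sharing pattern: `sharedEngine` and `createEngine` are hypothetical, with `createEngine` standing in for however a `PdfEngine` is actually constructed, which this section does not specify.

```javascript
// Hypothetical sketch: share one lazily created engine across requests.
// `createEngine` is a stand-in for the real PdfEngine construction.
function sharedEngine(createEngine) {
  let enginePromise = null;
  return function getEngine() {
    // Cache the promise, not the resolved value, so callers that arrive
    // during startup all await the same initialization.
    enginePromise ??= Promise.resolve().then(createEngine);
    return enginePromise;
  };
}

// Usage with a dummy factory that counts how often it runs.
let created = 0;
const getEngine = sharedEngine(async () => ({ id: ++created }));

Promise.all([getEngine(), getEngine()]).then(([a, b]) => {
  console.log(a === b, created); // true 1
});
```

This version caches a failed initialization as well; a production variant would probably clear the cached promise on rejection so a later call can retry.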
# Feature Map
- Loading PDFs covers engines, inputs, lifetimes, and passwords.
- Text Extraction covers page text and selected-page extraction.
- Page Rendering covers DPI, scale, target sizes, backgrounds, and form widgets.
- PNG Output covers page PNGs and standalone encoding.
- Extraction Fallback covers text-first extraction with image fallback.
- Password-Protected PDFs covers user-password handling.
- Browser and Bundlers covers `clawpdf/browser`.
- PDFium Provenance covers the vendored binary and refresh workflow.
- Package Shape covers dependencies and published files.
- Performance records the current comparison snapshot.
- API Reference lists the exported API surface.