Scrub PII from PDFs, images, and DOCX files. Local-only. One command.
pip install scrubfile

No cloud APIs. No data leaves your machine. Zero network calls after model download.
PDF, PNG, JPG, TIFF, BMP, DOCX. One tool handles them all.
Names, SSNs, emails, phones, addresses, credit cards, and 20+ entity types via Presidio + spaCy.
Text removed from PDF content stream. Not a visual overlay — the data is gone.
AI agents (Claude, Cursor) can redact documents directly via the Model Context Protocol (MCP).
Machine-readable output. Python API. CLI with rich output. Built for automation.
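
To make the detect-then-remove idea and the machine-readable report concrete, here is a minimal standard-library sketch. It is not scrubfile's API and it does not use Presidio or spaCy: the regex patterns and the `find_pii` and `redact` helpers are hypothetical stand-ins chosen so the example runs without any model download. Real detectors combine NER models, checksums, and context words, and cover far more entity types.

```python
import json
import re

# Hypothetical patterns for illustration only; not scrubfile's or Presidio's rules.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_pii(text):
    """Return findings as dicts with type, start, and end, sorted by position."""
    findings = []
    for entity_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append(
                {"type": entity_type, "start": match.start(), "end": match.end()}
            )
    return sorted(findings, key=lambda f: f["start"])


def redact(text, findings):
    """Rebuild the text without the matched spans, leaving a type placeholder."""
    pieces = []
    cursor = 0
    for finding in findings:
        if finding["start"] < cursor:
            continue  # skip a span that overlaps one already removed
        pieces.append(text[cursor:finding["start"]])
        pieces.append(f"[{finding['type']}]")
        cursor = finding["end"]
    pieces.append(text[cursor:])
    return "".join(pieces)


sample = "Contact jane@example.com, SSN 123-45-6789."
findings = find_pii(sample)

print(redact(sample, findings))  # Contact [EMAIL], SSN [US_SSN].
print(json.dumps(findings))      # JSON report that a script or agent could consume
```

The original characters are dropped from the output string rather than hidden, which mirrors in miniature the difference between removing text from a PDF content stream and drawing a box over it.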
| | scrubfile | Adobe Acrobat | Google DLP | Presidio |
|---|---|---|---|---|
| Local-only | Yes | Yes | No | Yes |
| Multi-format | PDF, images, DOCX | PDF only | Text/images | Text only |
| CLI | Yes | No | No | No |
| Auto-detect PII | Yes | No | Yes | Yes |
| Agent-ready (MCP) | Yes | No | No | No |
| Free | Yes | $240/yr | Pay per call | Yes |