Strip Identifying Metadata from PDFs Before Publishing Sensitive Documents
Publishing a PDF document — whether a leaked memo, a FOIA response, or a source-provided file — without stripping its metadata can expose the identity of your source, your own organization's internal systems, or the chain of custody of the document. Journalists and investigative researchers who publish raw PDFs have inadvertently burned sources when embedded author fields or revision histories were examined by the subjects of their reporting.
High-profile metadata exposure incidents have affected major newsrooms. One frequently cited case reportedly involved the 2012 identification of a Guatemalan government whistleblower through Microsoft Word metadata embedded in a PDF, where the author field pointed directly to the source. More recently, numerous FOIA-released PDFs from government agencies have contained internal staff names and workstation identifiers that the agencies failed to strip. Investigative reporters who receive and redistribute such documents inherit that exposure.
Deliteful removes standard PDF metadata fields — author, title, subject, keywords, creator, producer — before you publish or share a document. Page content, redactions, and layout are fully preserved. This is a first-line operational security step, not a comprehensive solution: steganography, document fingerprinting, and content-based identification are beyond the scope of metadata removal and require additional OPSEC measures.
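To see what these fields look like inside a file, the following is a minimal Python sketch that scans raw PDF bytes for the six property keys listed above. It is an illustration only, not how Deliteful works internally. The function name `find_property_fields` is hypothetical, and the scan assumes the values are stored as uncompressed literal strings. It will miss hex-encoded strings, encrypted files, and dictionaries packed into compressed object streams, and `/Title` can also match bookmark entries, so treat hits as candidates to review.

```python
import re

# The six standard document property keys named above.
INFO_KEYS = (b"Author", b"Title", b"Subject", b"Keywords", b"Creator", b"Producer")

# Matches `/Key (literal string)`, allowing backslash-escaped characters.
# Assumption: values are uncompressed literal strings. Hex strings, nested
# unescaped parentheses, and compressed object streams are not handled.
_FIELD_RE = re.compile(
    rb"/(" + b"|".join(INFO_KEYS) + rb")\s*\(((?:\\.|[^\\)])*)\)",
    re.DOTALL,
)


def find_property_fields(pdf_bytes: bytes) -> list[tuple[str, str]]:
    """Return (key, raw value) pairs for property fields found in the bytes."""
    return [
        (m.group(1).decode("ascii"), m.group(2).decode("latin-1"))
        for m in _FIELD_RE.finditer(pdf_bytes)
    ]


if __name__ == "__main__":
    # Hypothetical sample standing in for a real file's contents.
    sample = b"1 0 obj\n<< /Author (Jane Doe) /Producer (Example Writer 1.0) >>\nendobj\n"
    for key, value in find_property_fields(sample):
        print(f"{key}: {value}")
```

For a real file, read it with `open(path, "rb").read()` and pass the bytes in. An empty result from this kind of raw scan is not proof that the fields are absent, for the reasons noted above.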
How it works
1. Create a free account
Sign up with Google in about 3 clicks, no credit card required.
2. Upload the document PDF
Drop in the PDF you intend to publish, share, or embed in your reporting.
3. Remove metadata
Deliteful clears standard document property fields and returns a cleaned PDF with content intact.
4. Publish the cleaned file
Use the metadata-free output as the version you embed, link to, or provide to editors.
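Before publishing, it can help to check the cleaned file for names you already know are sensitive, such as a source's name or a workstation username. The sketch below is one hedged way to do that in Python. The function name `find_identifiers` and the sample identifiers are hypothetical. It searches the raw bytes only, so it cannot see inside compressed streams or hex-encoded strings. A hit is a reason to stop and investigate, but no hits does not prove the file is safe.

```python
def find_identifiers(pdf_bytes: bytes, identifiers: list[str]) -> list[str]:
    """Return the identifiers that appear anywhere in the raw bytes.

    Matching ignores case for ASCII letters only. Each identifier is tried
    as UTF-8 and as UTF-16BE, since PDF text strings may use either form.
    """
    haystack = pdf_bytes.lower()  # bytes.lower() folds ASCII letters only
    hits = []
    for identifier in identifiers:
        if not identifier:
            continue  # an empty string would match everything
        forms = (
            identifier.encode("utf-8").lower(),
            identifier.encode("utf-16-be").lower(),
        )
        if any(form in haystack for form in forms):
            hits.append(identifier)
    return hits


if __name__ == "__main__":
    # Hypothetical bytes standing in for a cleaned PDF that still leaks a name.
    cleaned = b"<< /Creator (jdoe-laptop) >>"
    print(find_identifiers(cleaned, ["JDOE-LAPTOP", "Jane Doe"]))
```

Running the same check against both the original and the cleaned file shows whether the step actually removed the strings you were worried about.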
Frequently asked questions
- Can PDF metadata reveal the identity of a confidential source?
- Yes, in some cases. If a source created or edited the document and the Author or Last Modified By field was not cleared, the document properties may contain their name or username. This has led to real-world source identification in documented incidents.
- Does this tool remove all identifying information from a PDF?
- It removes standard document metadata fields only. Content-based identification methods — such as printer tracking dots, document watermarks, unique formatting patterns, or steganographic fingerprinting — are not addressed by metadata removal and require separate OPSEC review.
- Will removing metadata alter any visible redactions in the PDF?
- No. Visible content, including text, images, redaction overlays, and annotations, is not touched. Only the hidden document property fields are cleared.
- Should I strip metadata from FOIA documents before republishing them?
- It is good practice. FOIA-released PDFs routinely contain agency staff names, internal system identifiers, and document management metadata that the releasing agency failed to strip. Removing this before republication reduces inadvertent disclosure of government employee information.
Create your free Deliteful account with Google and remove metadata from sensitive PDFs before your next story goes live.