Make Scanned FOIA Responses and Public Records Searchable

FOIA responses and public records productions arrive as multi-hundred-page scanned PDFs specifically because they're harder to work with that way. Deliteful's PDF OCR → Text tool converts those image PDFs into searchable plain text so journalists can find the needle in the document dump.

Investigative journalists who receive large scanned document productions face the same review problem as legal discovery: too many pages, not enough structure. A 300-page scanned FOIA response may contain only three relevant paragraphs. OCR converts the entire production to searchable text, letting you grep for names, dates, dollar figures, or keywords across the full set in seconds rather than hours.
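As a rough illustration of pulling dollar figures and dates out of converted text, here is a minimal Python sketch. The two patterns and the function name `find_figures_and_dates` are assumptions made for this example, not part of Deliteful; they cover only U.S.-style amounts such as $12,500.00 and numeric dates such as 3/14/2019, so expect to tune them for a real production.

```python
import re

# Illustrative patterns only (assumed for this example); tune for your documents.
DOLLAR_RE = re.compile(r"\$\d+(?:,\d{3})*(?:\.\d{2})?")   # e.g. $12,500.00 or $1250
DATE_RE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")      # e.g. 3/14/2019

def find_figures_and_dates(text):
    """Return (dollar_amounts, dates) found in a block of OCR text."""
    return DOLLAR_RE.findall(text), DATE_RE.findall(text)

sample = "On 3/14/2019 the agency approved payment of $12,500.00 to the vendor."
amounts, dates = find_figures_and_dates(sample)
print(amounts, dates)
```

Written-out dates ("March 14, 2019") and amounts like "$12.5 million" would need additional patterns.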

Deliteful processes up to 50 PDFs per batch and returns one .txt file per document. The text appears in reading order without formatting. Government documents, typed correspondence, and standard printed forms typically produce high OCR accuracy from clean scans. Handwritten annotations, text crowded by rubber stamps or redaction marks, and low-quality agency scanner output produce noisier results that require spot-checking.

How it works

  1. Sign up free with Google

    Create your Deliteful account via Google OAuth in about 3 clicks.

  2. Upload your document production

    Batch upload scanned FOIA PDFs: up to 50 files, 300 MB each, 2 GB per batch.

  3. OCR converts each document

    Deliteful extracts all readable text from image-based pages into plain text files.

  4. Search across the full production

    Download the .txt files and run keyword searches across your entire document set.
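The final step can be sketched in a few lines of Python if you prefer a script to grep. This is a minimal example under stated assumptions: the downloaded .txt files sit together in one folder and are readable as UTF-8, and `search_production` is a hypothetical helper name chosen for this sketch.

```python
from pathlib import Path

def search_production(folder, term):
    """Return (file name, line number, line text) for each case-insensitive hit."""
    needle = term.lower()
    hits = []
    for path in sorted(Path(folder).glob("*.txt")):
        # Assumes UTF-8 output; errors="replace" keeps the search going on odd bytes.
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_no, line in enumerate(text.splitlines(), start=1):
            if needle in line.lower():
                hits.append((path.name, line_no, line.strip()))
    return hits
```

Calling `search_production("foia_txt", "contract")`, where `foia_txt` is a placeholder folder name, returns every matching line along with the file and line number, which makes it easier to trace a hit back to the source document.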

Frequently asked questions

Can I search for specific names or dates across a full FOIA production after OCR?
Yes. Once converted to .txt files, you can use any text search tool — grep, a code editor's find-in-files, or a dedicated document review tool — to locate any term across the entire production.
How does OCR handle redacted sections in government documents?
Black-box redactions are image content, not text — OCR will not extract them. Text surrounding redactions is extracted normally. You'll see gaps or incomplete lines where redactions appear.
What if the scanned documents are very low quality from a government scanner?
Low-quality scans produce lower OCR accuracy. Results are still usable for keyword searching where the surrounding text is clean, but expect some garbled words in degraded sections.
Does Deliteful retain my uploaded documents?
No. Files are processed using temporary storage only and are not retained after your session ends.
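
On the low-quality scan question above: an exact keyword search can miss a name that OCR has garbled by a character or two. One possible workaround, sketched here with Python's standard `difflib` module, is to look for words that closely resemble the search term. The helper name `fuzzy_hits` and the 0.8 similarity cutoff are assumptions for illustration; a lower cutoff catches more garbling but also more false positives.

```python
import difflib
import re

def fuzzy_hits(text, term, cutoff=0.8):
    """Return distinct words in text that closely resemble term (case-insensitive)."""
    # Split the text into lowercase alphanumeric words and de-duplicate them.
    words = sorted(set(re.findall(r"[A-Za-z0-9]+", text.lower())))
    matches = difflib.get_close_matches(term.lower(), words, n=10, cutoff=cutoff)
    return sorted(matches)

sample = "Payment to Hendersan Consulting approved by Henderson's office."
print(fuzzy_hits(sample, "Henderson"))
```

Any near-match found this way still needs checking against the original scanned page.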

Sign up free on Deliteful with Google and turn your next scanned document dump into a fully searchable text archive.