Document Redaction: The Art of Hiding Information (Badly)

TL;DR

  • What's happening: Courts, government agencies, and corporations regularly botch document redactions, leaving sensitive information easily recoverable.
  • Who's affected: Victims whose personal data leaks, defendants whose secrets get exposed, and agencies that end up looking incompetent.
  • The problem: Black boxes over text don't delete anything. Copy-paste reveals "redacted" content in seconds.
  • What to do: If you need to redact documents, use proper redaction tools. If you're researching, know how to check for redaction failures.

In December 2025, the Department of Justice released thousands of pages of Jeffrey Epstein documents. Many had black boxes over sensitive information. The problem? You could copy-paste the "hidden" text into Notepad and read everything. The government spent months preparing these documents and still couldn't figure out how PDFs work.

This isn't a one-off failure. It's a pattern. From Paul Manafort's lawyers accidentally exposing his Russian ties, to TikTok's internal research being revealed through faulty redactions in a Kentucky lawsuit, to the European Commission leaking AstraZeneca vaccine contract details through PDF bookmarks, redaction failures happen constantly.

Here's how redaction actually works, why it keeps failing, and what that means for privacy.

How Modern Redaction Works (When Done Correctly)

Real redaction isn't about covering text. It's about deleting it.

True Redaction vs. Visual Obscuration

There's a crucial difference:

Visual Obscuration (Wrong)

  • Black boxes drawn over text
  • White font on white background
  • Highlight tool with black color
  • Text still exists in the file
  • Can be retrieved via copy-paste

True Redaction (Correct)

  • Text permanently deleted from file
  • Metadata stripped
  • Revision history removed
  • Hidden layers flattened
  • Nothing to recover

When you draw a black rectangle over text in Microsoft Word or use a highlight tool in a PDF reader, the underlying text remains in the document's code. The black box is just a visual layer on top. [1]
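To make the difference concrete, here is a minimal sketch in Python. It models a page as a list of drawing items, which is a simplification and not how any real PDF library represents a page; the TextRun and BlackBox classes and the function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class TextRun:
    x: float
    y: float
    text: str


@dataclass
class BlackBox:
    x: float
    y: float
    width: float
    height: float


def extract_text(page):
    """Mimic copy-paste: collect every text run and ignore drawn shapes."""
    return " ".join(item.text for item in page if isinstance(item, TextRun))


def covers(box, run):
    """True if the text run's origin point falls inside the box."""
    return (box.x <= run.x <= box.x + box.width
            and box.y <= run.y <= box.y + box.height)


def true_redact(page):
    """Delete text runs that sit under a box; keep the boxes as visual markers."""
    boxes = [item for item in page if isinstance(item, BlackBox)]
    return [
        item for item in page
        if not (isinstance(item, TextRun) and any(covers(b, item) for b in boxes))
    ]


page = [
    TextRun(72, 700, "Witness:"),
    TextRun(130, 700, "Jane Doe"),
    BlackBox(128, 695, 60, 14),  # drawn on top of "Jane Doe"
]

print(extract_text(page))               # Witness: Jane Doe
print(extract_text(true_redact(page)))  # Witness:
```

The first call is the failure mode: the box changes what you see, not what the file contains. The second call only comes up empty because the text run itself was deleted.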

What Proper Redaction Tools Do

Specialized redaction tools such as Adobe Acrobat Pro's redaction feature, Redactable, or iDox.ai actually remove the content:

  1. Mark content for redaction: Select text, images, or areas to remove
  2. Apply redaction: The tool permanently deletes the marked content
  3. Sanitize document: Remove metadata, comments, hidden text, and revision history
  4. Flatten layers: Merge all visible layers into a single layer
  5. Verify: Test that content can't be recovered

The result should be a file where the redacted content literally doesn't exist anymore. [2]

Elements That Need Redaction

Documents contain more than visible text. A proper redaction must address:

  • Visible text: The obvious stuff
  • Metadata: Author name, creation date, editing history, software used
  • Comments and annotations: Often contain sensitive discussion
  • Revision history: Track changes can reveal deleted content
  • Hidden layers: PDFs can contain invisible text layers
  • Bookmarks: PDF bookmarks sometimes contain text from document headings
  • OCR text: Scanned documents may have invisible searchable text
  • Embedded objects: Images, charts, attachments
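The same idea applies to Word files. A .docx is a ZIP archive, so a rough audit can simply look for the parts that tend to carry hidden content. The sketch below uses only Python's standard library; the audit_docx name and the list of parts it checks are illustrative choices, not an exhaustive inventory.

```python
import io
import zipfile

# Archive parts that commonly hold content a reader never sees on the page.
HIDDEN_PARTS = {
    "docProps/core.xml": "core properties (author, timestamps)",
    "docProps/app.xml": "application properties",
    "word/comments.xml": "reviewer comments",
}
# Tracked insertions and deletions are stored as elements in the main body.
TRACKED_CHANGE_TAGS = (b"<w:del ", b"<w:ins ")


def audit_docx(source):
    """Return labels for hidden-content locations found in a .docx file."""
    findings = []
    with zipfile.ZipFile(source) as archive:
        names = set(archive.namelist())
        for part, label in HIDDEN_PARTS.items():
            if part in names:
                findings.append(label)
        if "word/document.xml" in names:
            body = archive.read("word/document.xml")
            if any(tag in body for tag in TRACKED_CHANGE_TAGS):
                findings.append("tracked changes")
    return findings


# Build a tiny stand-in archive in memory so the sketch runs without a real file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("docProps/core.xml", "<cp:coreProperties/>")
    archive.writestr("word/comments.xml", "<w:comments/>")
    archive.writestr(
        "word/document.xml",
        '<w:document><w:del w:id="1" w:author="A"/></w:document>',
    )
buffer.seek(0)

print(audit_docx(buffer))
```

Almost every real .docx has a core properties part, so a hit there means "review this", not "this leaked". Comments and tracked changes are the ones that usually deserve a closer look.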

Government and Court Redaction Standards

Federal courts have specific rules about what must be redacted from public filings, and it's less than you might think.

Federal Court Requirements

Under Federal Rule of Civil Procedure 5.2 and Federal Rule of Criminal Procedure 49.1, whoever makes the filing (not the court) must redact:

  • Social Security Numbers: Only last four digits permitted
  • Dates of birth: Only year if necessary
  • Names of minors: Initials only
  • Home addresses: City and state only (criminal cases, under Rule 49.1)
  • Financial account numbers: Last four digits with account type/institution name

That's it. Everything else is potentially public unless a court orders it sealed. [3]
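As a rough illustration of what "last four digits" and "year only" look like in practice, here is a small Python sketch. The regular expressions assume one common U.S. formatting for each item (123-45-6789 and MM/DD/YYYY), and the mask_for_filing name is made up for this example; a real filing still needs a human read-through.

```python
import re

# Assumed formats: dashed SSNs and slash-separated dates with a 4-digit year.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-(\d{4})\b")
DOB_RE = re.compile(r"\b\d{1,2}/\d{1,2}/(\d{4})\b")


def mask_for_filing(text):
    """Keep only the last four SSN digits and the year of any date."""
    text = SSN_RE.sub(r"XXX-XX-\1", text)
    text = DOB_RE.sub(r"\1", text)
    return text


line = "SSN 123-45-6789, born 04/12/1987."
print(mask_for_filing(line))  # SSN XXX-XX-6789, born 1987.
```

Note that the date pattern cannot tell a birth date from any other date, so it would also trim dates you meant to keep.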

FOIA and Government Agencies

The Freedom of Information Act allows agencies to withhold information under nine exemptions. The ones intelligence agencies love most:

  • Exemption 1: Classified national security information
  • Exemption 3: Information protected by other statutes (the CIA and NSA use this constantly for "sources and methods")
  • Exemption 6: Personal privacy
  • Exemption 7: Law enforcement records that would reveal techniques or endanger sources

Agencies must cite which exemption applies to each redaction. When you see "(b)(1)" or "(b)(3)" on a FOIA document, that's the exemption code. [4]
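If you have the text of a released document, tallying those codes is a quick way to see which exemptions an agency leaned on. The sketch below is one way to do it in Python; the tally_exemptions name and the sample string are invented, and sub-paragraph letters such as the "(C)" in "(b)(7)(C)" are deliberately folded into the parent exemption.

```python
import re
from collections import Counter

# Short labels for the four exemptions discussed above.
EXEMPTIONS = {
    "1": "classified national security information",
    "3": "information protected by other statutes",
    "6": "personal privacy",
    "7": "law enforcement records",
}
CODE_RE = re.compile(r"\(b\)\((\d)\)")


def tally_exemptions(page_text):
    """Count how often each (b)(N) exemption code appears in the text."""
    return Counter(CODE_RE.findall(page_text))


sample = "(b)(1) ... (b)(3) ... (b)(3) ... (b)(7)(C) ... (b)(6)"
tally = tally_exemptions(sample)
for number, count in sorted(tally.items()):
    label = EXEMPTIONS.get(number, "other exemption")
    print(f"(b)({number}) x{count}: {label}")
```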

Intelligence Agency Practices

The FBI, CIA, and NSA have developed specific redaction practices:

  • Monospaced fonts: The FBI reportedly uses monospaced fonts to prevent the length of redacted areas from revealing information
  • Block redactions: Rather than redact individual words, agencies often redact entire paragraphs to hide document structure
  • Classification markings: Each paragraph is marked with its classification level (TS for Top Secret, S for Secret, etc.)
  • Exemption coding: Every redaction cites the legal basis for withholding
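Why would font choice matter? With a proportional font, the width of a black bar depends on which letters are underneath, so the bar can work like a fingerprint. The Python sketch below shows the idea with invented glyph widths (3, 6, and 9 units); real fonts have their own metrics, and the names are arbitrary examples.

```python
# Hypothetical glyph widths in arbitrary units, chosen only for illustration.
NARROW = set("iljt")
WIDE = set("mw")


def proportional_width(word):
    """Width of a word when narrow and wide letters take different space."""
    total = 0
    for ch in word.lower():
        if ch in NARROW:
            total += 3
        elif ch in WIDE:
            total += 9
        else:
            total += 6
    return total


def monospaced_width(word):
    """Width of a word when every character takes the same space."""
    return 6 * len(word)


def candidates_for(bar_width, names, width_fn):
    """Names whose rendered width matches the width of the redaction bar."""
    return [name for name in names if width_fn(name) == bar_width]


names = ["William", "Michael", "Jillian"]
print(candidates_for(39, names, proportional_width))  # ['Michael']
print(candidates_for(42, names, monospaced_width))    # ['William', 'Michael', 'Jillian']
```

Monospacing does not hide everything, since the bar still reveals the character count. It just removes the extra signal that comes from letter shapes.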

The CIA has reportedly developed 126 specific reasons to justify withholding information under "sources and methods" protections. [5]

The Spectacular Failures

Despite all these procedures, redaction failures keep happening. Here's why each one matters:

Paul Manafort (2019)

Trump campaign chairman Paul Manafort's lawyers filed court documents with black boxes over sensitive text. Journalists discovered they could copy-paste the "redacted" content. The exposed information revealed Manafort had shared campaign polling data with a Russian intelligence-linked associate, exactly what investigators wanted to know. [6]

The mistake: Drawing black rectangles in a word processor before converting to PDF.

TikTok Internal Documents (2024)

In October 2024, faulty redactions in a Kentucky Attorney General lawsuit exposed 30 pages of internal TikTok research. The documents revealed that TikTok had determined that watching 260 videos was enough to form a habit, that "compulsive usage correlates with a slew of negative mental health effects," and that its screen time tool had "negligible impact." [7]

The mistake: Superficial black boxes that left the underlying text available to copy and paste.

Epstein Files (2025)

When the Department of Justice released Epstein documents under the Epstein Files Transparency Act in December 2025, observers quickly discovered the redactions were cosmetic. Simple copy-paste or basic photo editing revealed hidden text. The DOJ missed the deadline, released incomplete files, and bungled the redactions, then had documents mysteriously disappear from their website. [8]

The mistake: Using superficial black boxes instead of true redaction tools.

European Commission/AstraZeneca (2021)

The European Commission published an AstraZeneca vaccine contract with sensitive pricing information redacted, but forgot about PDF bookmarks. The bookmarks contained the original headings, revealing the supposedly secret pricing terms. [9]

The mistake: Not sanitizing bookmarks and other document metadata.

JFK Assassination Records (2025)

The National Archives released thousands of pages of JFK assassination documents with Social Security numbers and other PII either unredacted or with redactions easily removed, likely due to oversight of hidden OCR text layers. [10]

The mistake: Not addressing invisible text from document scanning.

How to Check for Redaction Failures

If you're researching public documents, here's how to identify improperly redacted content:

The Copy-Paste Test

  1. Open the PDF in any reader
  2. Select the "redacted" area (the black box)
  3. Copy it
  4. Paste into a plain text editor (Notepad, TextEdit)
  5. If text appears, the redaction failed
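The same test can be scripted. The sketch below uses only Python's standard library and deliberately handles just the simplest case: literal strings shown with the Tj operator inside content streams that are either uncompressed or Flate-compressed. Real PDFs also use hex strings, TJ arrays, custom font encodings, and object streams, so treat a clean result from this as inconclusive and reach for a full PDF library or text-extraction tool for serious work. The function name and the sample bytes are made up.

```python
import re
import zlib

STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)
# A literal PDF string followed by the Tj (show text) operator.
TEXT_RE = re.compile(rb"\(((?:\\.|[^\\()])*)\)\s*Tj")


def extract_literal_text(pdf_bytes):
    """Return literal strings drawn with Tj in any readable content stream."""
    found = []
    for match in STREAM_RE.finditer(pdf_bytes):
        data = match.group(1)
        try:
            data = zlib.decompressobj().decompress(data)
        except zlib.error:
            pass  # not Flate-compressed, so scan the raw bytes instead
        for text in TEXT_RE.findall(data):
            found.append(text.decode("latin-1"))
    return found


# A stand-in content stream: draw text, then fill a black rectangle over it.
content = (
    b"BT /F1 12 Tf 72 700 Td (Account 4417) Tj ET\n"
    b"0 0 0 rg 70 695 120 14 re f\n"
)
fragment = (
    b"1 0 obj\n<< /Filter /FlateDecode >>\nstream\n"
    + zlib.compress(content)
    + b"\nendstream\nendobj\n"
)

print(extract_literal_text(fragment))  # ['Account 4417']
```

The black rectangle is right there in the stream, drawn after the text, and it makes no difference to what comes out.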

Metadata Inspection

  1. Right-click the file and check Properties/Get Info
  2. Use a PDF analysis tool to inspect document structure
  3. Look for comments, annotations, revision history
  4. Check bookmarks for revealing text
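For a quick first pass at bookmarks and document properties, you can scan the raw file for a few well-known keys. In PDFs, bookmark (outline) entries and the document information dictionary both store text under keys like /Title, and the information dictionary also uses /Author, /Creator, and /Producer. The Python sketch below only finds values written as plain literal strings; hex-encoded, UTF-16, or compressed entries will be missed. The function name and the sample bytes are invented.

```python
import re

FIELD_RE = re.compile(
    rb"/(Title|Author|Creator|Producer)\s*\(((?:\\.|[^\\()])*)\)"
)


def find_metadata_strings(pdf_bytes):
    """Return (key, value) pairs for literal-string metadata and bookmark titles."""
    return [
        (key.decode("ascii"), value.decode("latin-1"))
        for key, value in FIELD_RE.findall(pdf_bytes)
    ]


sample = (
    b"<< /Title (Section 7: Price per unit) /Parent 5 0 R >>\n"
    b"<< /Author (J. Smith) /Producer (ExamplePDF) >>\n"
)

for key, value in find_metadata_strings(sample):
    print(f"{key}: {value}")
```

If a heading was redacted on the page but still shows up here, the bookmark is leaking it.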

Layer Analysis

PDFs can contain multiple layers. Tools like Adobe Acrobat Pro, QPDF, or pdf-parser can reveal hidden layers that might contain unredacted content.

OCR Text Extraction

For scanned documents, the visible image might be redacted while invisible OCR text remains. Extract the text layer separately to check.

How to Properly Redact Documents

If you need to redact documents for privacy, legal, or business reasons, here's how to do it right:

Step 1: Use Proper Software

  • Adobe Acrobat Pro DC: Has dedicated redaction tools (not just drawing black boxes)
  • Redactable: Cloud-based, designed specifically for legal/government use
  • iDox.ai: AI-powered identification of sensitive content
  • Microsoft Word's Redaction Add-in: Better than nothing

Never use: Highlight tools, drawing tools, font color changes, image editing apps

Step 2: Mark and Apply Redactions

  • Use the software's redaction feature to mark content
  • "Apply redactions" permanently removes the content
  • Marking alone isn't enough: until you apply the redactions, the content is still in the file

Step 3: Sanitize the Document

  • Remove metadata (author, creation date, editing history)
  • Delete comments and annotations
  • Clear revision history
  • Remove hidden text and layers
  • Check and clear bookmarks

Step 4: Verify

  • Try to copy-paste from redacted areas
  • Search for redacted terms
  • Inspect metadata
  • Open in multiple PDF readers
  • Have someone else verify
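The "search for redacted terms" check is easy to automate once you have the text extracted from the finished file by whatever tool you trust. A minimal Python sketch, with a made-up function name and sample text:

```python
import re


def normalize(text):
    """Collapse whitespace and ignore case so line breaks don't hide a match."""
    return re.sub(r"\s+", " ", text).casefold()


def find_leaks(extracted_text, sensitive_terms):
    """Return every sensitive term that still appears in the extracted text."""
    haystack = normalize(extracted_text)
    return [term for term in sensitive_terms if normalize(term) in haystack]


extracted = "Payment sent to  ACME\nHoldings on [REDACTED]."
print(find_leaks(extracted, ["Acme Holdings", "March 3"]))  # ['Acme Holdings']
```

An empty list means only that these exact terms were not found in this extraction. It says nothing about metadata, bookmarks, or misspelled variants, so it complements the other checks and does not replace them.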

The Nuclear Option

For maximum security with text documents:

  1. Replace sensitive text with "[REDACTED]"
  2. Copy entire document to plain text editor (strips all hidden data)
  3. Re-format in new document
  4. Print to new PDF
  5. This removes all hidden layers, metadata, and formatting tricks
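Step 1 can be scripted once the document is plain text. The Python sketch below replaces a list of known terms with a marker; the redact_plain_text name and the sample sentence are invented, and it only catches exact (case-insensitive) matches, so you still need to read the result.

```python
import re


def redact_plain_text(text, terms, marker="[REDACTED]"):
    """Replace every occurrence of each term with the marker."""
    if not terms:
        return text
    # Longest terms first, so "Jane Doe" is replaced before the shorter "Doe".
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in ordered), re.IGNORECASE)
    return pattern.sub(lambda _match: marker, text)


original = "Jane Doe met Doe's lawyer."
print(redact_plain_text(original, ["Doe", "Jane Doe"]))
# [REDACTED] met [REDACTED]'s lawyer.
```

Because the output is a new plain string, nothing from the original file's structure comes along with it.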

Why This Matters

Redaction failures aren't just embarrassing. They have real consequences:

  • Privacy violations: Victims' personal information exposed
  • National security: Classified information leaked
  • Legal consequences: Cases compromised by inadvertent disclosure
  • Corporate espionage: Trade secrets revealed
  • Criminal exposure: Defendants' communications revealed

The Manafort redaction failure advanced a federal investigation. The TikTok redaction failure exposed corporate knowledge of harm to children. The Epstein file failures undermined public trust in government transparency.

The AI Factor

In 2024-2025, AI-powered redaction tools emerged that claim to automatically identify and redact sensitive information. They use pattern recognition to find:

  • Social Security numbers
  • Phone numbers
  • Addresses
  • Names
  • Financial information
  • Medical data

Government agencies are adopting these tools for FOIA processing, claiming they reduce review time by up to 80%. [11]

But AI redaction has its own risks:

  • Over-redaction: AI might hide public information
  • Under-redaction: AI might miss context-dependent PII
  • Pattern failures: Non-standard formats may slip through
  • Audit concerns: Who's responsible when AI makes mistakes?
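Commercial tools are more sophisticated than a single regular expression, but a toy pattern matcher in Python is enough to show how both under-redaction and over-redaction happen. The pattern and sample strings below are invented for illustration.

```python
import re

# Matches only the dashed 3-2-4 digit layout.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_ssns(text):
    """Replace anything shaped like a dashed SSN with a marker."""
    return SSN_RE.sub("[SSN]", text)


samples = [
    "SSN: 123-45-6789",       # standard format: caught
    "SSN: 123 45 6789",       # spaces instead of dashes: missed
    "SSN: 123456789",         # no separators: missed
    "Part no. 555-12-3456",   # not an SSN, but redacted anyway
]

for sample in samples:
    print(f"{sample!r} -> {redact_ssns(sample)!r}")
```

Two of the four inputs leak a real identifier and one hides harmless information, which is why automated output still needs human review.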

Missouri updated its court rules in 2025 specifically to limit redactions to defined confidential information, requiring "good cause" for additional redactions, partly in response to lawyers over-redacting documents using automated tools. [12]

The Bottom Line

Redaction Is Harder Than It Looks

Black boxes aren't magic. They're visual overlays that anyone with Notepad can see through.

If you're redacting documents, use proper tools and verify your work. If you're researching public documents, always check for redaction failures. The information hiding behind those black boxes might be one copy-paste away from exposure.

Courts, government agencies, and corporations keep making the same mistakes. The 2025 Epstein file release proved that even mandatory transparency laws can't fix basic technical incompetence.

References

  1. U.S. Courts - Redacting Personal Information from Electronically Filed Documents
  2. Adobe Acrobat - Removing Sensitive Content from PDFs
  3. Federal Rules of Civil Procedure 5.2 - Privacy Protection
  4. DOJ Office of Information Policy - FOIA Exemptions
  5. MuckRock - CIA's 126 Reasons to Withhold Information
  6. New York Times - Manafort Shared Polling Data With Russian Associate (January 2019)
  7. NPR - TikTok Knows Its App Is Harming Kids, New Internal Documents Show (October 2024)
  8. Popular Information - DOJ Releases Epstein Files with Major Redaction Failures (December 2025)
  9. BBC - EU AstraZeneca Vaccine Contract Details Revealed (January 2021)
  10. National Archives - JFK Assassination Records
  11. Law In Order - AI-Driven Redaction Solutions for Government Document Review
  12. Missouri Courts - Supreme Court Clarifies Redaction Rules (2025)