Certificate Infrastructure Deep Dive — Part 5

Revocation, OCSP, and Why It Often Fails in Practice

In Part 4, we examined PKI governance and trust stores.

Now we examine one of the weakest — and most misunderstood — parts of the ecosystem:

Certificate revocation.

Revocation is supposed to answer this question:

“What if a certificate that was valid yesterday should no longer be trusted today?”

In theory, revocation solves key compromise and mis-issuance. In practice, it is slow, inconsistent, and often bypassed.

1. Why Revocation Exists

Certificates can become untrustworthy before expiry due to:

Private key compromise
CA compromise
Mis-issuance
Domain control loss
Policy violations

Revocation provides an early invalidation mechanism.

But revocation operates under severe constraints:

Internet scale
Latency sensitivity
Privacy concerns
Availability requirements

2. Certificate Revocation Lists (CRLs)

CRLs are the original revocation mechanism.

A CA periodically publishes:

A signed list of revoked certificate serial numbers

Clients download and cache this list.

How CRLs Work

graph TD
    Client[Client]
    CA[Certificate Authority]
    CRL[CRL Distribution Point]

    Client --> CRL
    CA --> CRL

Process:

Client retrieves CRL from URL in certificate.
Verifies CRL signature.
Checks if certificate serial number appears in list.
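
The three steps above can be sketched in a few lines. This is a minimal illustration, not a real X.509 implementation: the `Crl` dataclass, the `check_against_crl` function, and the `signature_ok` callback are hypothetical names, and signature verification is stubbed out. The sketch also adds a freshness check, because a stale CRL cannot answer the question reliably.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class Crl:
    """Simplified stand-in for a parsed CRL, not a real X.509 structure."""
    issuer: str
    this_update: datetime
    next_update: datetime
    revoked_serials: FrozenSet[int]


def check_against_crl(serial: int, issuer: str, crl: Crl,
                      signature_ok: Callable[[Crl], bool],
                      now: datetime) -> str:
    """Return 'revoked', 'good', or 'undetermined' for one certificate."""
    # Step 2: a CRL that fails signature verification tells us nothing.
    if not signature_ok(crl):
        return "undetermined"
    # The CRL must come from the issuer of the certificate being checked.
    if crl.issuer != issuer:
        return "undetermined"
    # A CRL outside its validity window is stale or not yet valid.
    if not (crl.this_update <= now <= crl.next_update):
        return "undetermined"
    # Step 3: look the serial number up in the revoked set.
    return "revoked" if serial in crl.revoked_serials else "good"


def signature_always_ok(_crl: Crl) -> bool:
    # Placeholder: a real client verifies the CA's signature over the CRL here.
    return True


now = datetime(2024, 1, 10, tzinfo=timezone.utc)
crl = Crl(
    issuer="Example CA",
    this_update=now - timedelta(days=1),
    next_update=now + timedelta(days=6),
    revoked_serials=frozenset({0x1A2B, 0x3C4D}),
)

print(check_against_crl(0x1A2B, "Example CA", crl, signature_always_ok, now))
print(check_against_crl(0x9999, "Example CA", crl, signature_always_ok, now))
print(check_against_crl(0x9999, "Example CA", crl, signature_always_ok,
                        now + timedelta(days=30)))
```

The third call shows the stale-cache case: the serial is not listed, but the list is past its validity window, so the sketch refuses to call the certificate good.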

Problems with CRLs

CRLs can grow to megabytes for CAs with many revocations
Download latency impacts handshake
Clients may cache stale lists
Revocation delay depends on CRL update frequency

CRLs do not scale well at internet volume.
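
A back-of-the-envelope calculation makes the size and delay problems concrete. Every number here is an assumption chosen for illustration: roughly 40 bytes per revoked entry, 1 KiB of fixed overhead, and a 24-hour publication interval with a 24-hour client cache. Real CRLs vary with serial length and per-entry extensions.

```python
def estimated_crl_bytes(revoked_count: int,
                        bytes_per_entry: int = 40,
                        overhead_bytes: int = 1024) -> int:
    """Rough CRL size: fixed overhead plus one entry per revoked certificate.

    bytes_per_entry and overhead_bytes are illustrative assumptions.
    """
    return overhead_bytes + revoked_count * bytes_per_entry


def worst_case_delay_hours(publish_interval_hours: float,
                           client_cache_hours: float) -> float:
    """Worst case: the certificate is revoked just after a publication, and
    the client cached the old list just before the next one appeared."""
    return publish_interval_hours + client_cache_hours


for count in (1_000, 100_000, 1_000_000):
    size_mib = estimated_crl_bytes(count) / (1024 * 1024)
    print(f"{count:>9} revoked -> ~{size_mib:.2f} MiB")

print(worst_case_delay_hours(publish_interval_hours=24, client_cache_hours=24), "hours")
```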

3. OCSP — Online Certificate Status Protocol

OCSP was designed to fix CRL inefficiencies.

Instead of downloading a full list, the client asks:

“Is certificate X still valid?”

OCSP Flow

sequenceDiagram
    participant Client
    participant OCSP
    participant CA

    Client->>OCSP: Status request (serial number)
    OCSP->>CA: (Internal validation)
    OCSP-->>Client: Good / Revoked / Unknown

OCSP Response Types

good
revoked
unknown

Responses are signed by the CA or by a delegated OCSP responder.
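
As a rough model of the responder side, the three answers can be expressed as an enum and a lookup. The names `CertStatus` and `responder_lookup` are hypothetical, and the sketch assumes "unknown" is returned when the responder does not recognize the issuer. Signing the response is omitted.

```python
from enum import Enum
from typing import Set, Tuple


class CertStatus(Enum):
    GOOD = "good"
    REVOKED = "revoked"
    UNKNOWN = "unknown"


def responder_lookup(issuer: str, serial: int,
                     known_issuers: Set[str],
                     revoked: Set[Tuple[str, int]]) -> CertStatus:
    """Toy responder: answer for one (issuer, serial) pair."""
    # The responder cannot speak for issuers it does not serve.
    if issuer not in known_issuers:
        return CertStatus.UNKNOWN
    # Revoked certificates are tracked explicitly.
    if (issuer, serial) in revoked:
        return CertStatus.REVOKED
    # "good" here only means "not known to be revoked".
    return CertStatus.GOOD


known = {"Example CA"}
revoked = {("Example CA", 0x1A2B)}

print(responder_lookup("Example CA", 0x1A2B, known, revoked).value)
print(responder_lookup("Example CA", 0x7777, known, revoked).value)
print(responder_lookup("Other CA", 0x7777, known, revoked).value)
```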

4. The Soft-Fail Problem

In theory:

If OCSP responder is unreachable → block connection.

In reality:

Most browsers and TLS clients implement soft-fail.

If OCSP check fails due to:

Timeout
Network error
Responder unavailable

The connection proceeds anyway.

Why?

Because failing closed would:

Break large portions of the internet
Create denial-of-service vectors

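This undermines revocation effectiveness: an attacker who can intercept the connection can usually also block the OCSP request, so the check fails open exactly when it matters.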
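
The difference between the two policies fits in one function. This is a simplified sketch, not any browser's actual logic: `should_connect` and the `fetch_status` callback are hypothetical names, and treating an "unknown" answer like a failed check is an assumption.

```python
from typing import Callable


def should_connect(fetch_status: Callable[[], str], hard_fail: bool) -> bool:
    """Decide whether to proceed, given a way to fetch the OCSP status.

    fetch_status returns 'good', 'revoked', or 'unknown', or raises
    TimeoutError / ConnectionError when the responder cannot be reached.
    """
    try:
        status = fetch_status()
    except (TimeoutError, ConnectionError):
        # Soft-fail proceeds without an answer; hard-fail blocks.
        return not hard_fail
    if status == "revoked":
        return False
    if status == "good":
        return True
    # 'unknown' is treated like a failed check (an assumption of this sketch).
    return not hard_fail


def responder_says_revoked() -> str:
    return "revoked"


def responder_blocked_by_attacker() -> str:
    # An on-path attacker simply drops the OCSP request.
    raise TimeoutError("no response from OCSP responder")


print(should_connect(responder_says_revoked, hard_fail=False))
print(should_connect(responder_blocked_by_attacker, hard_fail=False))
print(should_connect(responder_blocked_by_attacker, hard_fail=True))
```

The second call is the problematic case: the certificate may well be revoked, but the soft-fail client never learns it and connects anyway.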

5. Privacy Concerns

Traditional OCSP leaks browsing behavior.

Client reveals to CA:

Which site it is visiting
When

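The CA can log which clients visit which sites, and because OCSP requests typically travel over plain HTTP, on-path observers can see them too.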

Browsers introduced mitigations:

OCSP stapling
CRLite (Firefox)
CRLSets (Chrome)

6. OCSP Stapling

OCSP stapling shifts responsibility to the server.

Instead of:

Client → OCSP responder

It becomes:

Server → OCSP responder (periodically)
Server → Client (during TLS handshake)

Stapling Flow

sequenceDiagram
    participant Server
    participant OCSP
    participant Client

    Server->>OCSP: Periodic status request
    OCSP-->>Server: Signed OCSP response

    Client->>Server: TLS handshake
    Server-->>Client: Certificate + OCSP staple

Advantages:

Improves privacy
Reduces latency
Avoids client network dependency

Limitations:

Staple may be stale
Server misconfiguration common
Clients still often soft-fail when the staple is missing
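
A server-side staple cache might look roughly like the sketch below. It is a toy model under stated assumptions: `OcspResponse`, `StapleCache`, the 12-hour refresh margin, and the 4-day response validity are all invented for illustration, and real servers differ in how they schedule refreshes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional


@dataclass(frozen=True)
class OcspResponse:
    status: str
    this_update: datetime
    next_update: datetime


class StapleCache:
    """Keeps the most recent OCSP response and refreshes it before expiry."""

    def __init__(self, fetch: Callable[[datetime], OcspResponse],
                 refresh_margin: timedelta = timedelta(hours=12)):
        self._fetch = fetch
        self._margin = refresh_margin
        self._cached: Optional[OcspResponse] = None

    def refresh_if_needed(self, now: datetime) -> None:
        due = (self._cached is None
               or now >= self._cached.next_update - self._margin)
        if not due:
            return
        try:
            self._cached = self._fetch(now)
        except OSError:
            # Responder unreachable: keep the old response, it may still be valid.
            pass

    def staple_for_handshake(self, now: datetime) -> Optional[OcspResponse]:
        cached = self._cached
        if cached is not None and cached.this_update <= now <= cached.next_update:
            return cached
        return None  # nothing usable to staple


responder = {"up": True}


def fetch(now: datetime) -> OcspResponse:
    if not responder["up"]:
        raise ConnectionError("responder unreachable")
    return OcspResponse("good", now, now + timedelta(days=4))


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
cache = StapleCache(fetch)
cache.refresh_if_needed(t0)
fresh = cache.staple_for_handshake(t0 + timedelta(days=1))

responder["up"] = False  # simulate a responder outage
cache.refresh_if_needed(t0 + timedelta(days=3, hours=18))
during_outage = cache.staple_for_handshake(t0 + timedelta(days=3, hours=18))

cache.refresh_if_needed(t0 + timedelta(days=5))
after_expiry = cache.staple_for_handshake(t0 + timedelta(days=5))

print(fresh is not None, during_outage is not None, after_expiry is None)
```

The cached response carries the server through a short outage, but once it expires there is nothing left to staple.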

7. Must-Staple

OCSP Must-Staple is a certificate extension (the TLS Feature extension, RFC 7633) that tells clients to require a stapled OCSP response.

If present:

Client must receive valid OCSP staple
Otherwise handshake fails

In practice:

Rarely deployed
Operationally risky
Turns a prolonged responder outage into a site outage once the cached staple expires
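
The client-side rule can be sketched as follows. `handshake_allowed` is a hypothetical function, and the sketch assumes a client that soft-fails when the certificate does not carry Must-Staple.

```python
from typing import Optional


def handshake_allowed(cert_must_staple: bool, staple_status: Optional[str]) -> bool:
    """staple_status is None when no valid staple was received,
    otherwise 'good', 'revoked', or 'unknown'."""
    if staple_status == "revoked":
        return False
    if staple_status == "good":
        return True
    # No usable staple: Must-Staple turns this into a hard failure.
    # Without it, this sketch assumes the usual soft-fail behavior.
    return not cert_must_staple


print(handshake_allowed(cert_must_staple=True, staple_status="good"))
print(handshake_allowed(cert_must_staple=True, staple_status=None))
print(handshake_allowed(cert_must_staple=False, staple_status=None))
```

The second call is the outage scenario: the server has no fresh staple, so every Must-Staple-aware client refuses to connect.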

8. Why Revocation Often Fails in Practice

1. Soft-Fail Behavior

Clients proceed when checks fail.

2. Performance Constraints

TLS must be fast. Blocking network calls during handshake is undesirable.

3. Availability Trade-Off

Revocation systems must never become a single point of failure.

Security engineers often choose:

Availability over strict revocation enforcement.

4. Attack Timing Window

Revocation only works after:

Compromise detected
CA notified
Revocation processed
Clients updated

This introduces delay.
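
To get a feel for the size of that delay, the stages can simply be added up. The durations below are hypothetical placeholders, not measurements. Substitute your own estimates.

```python
from datetime import timedelta

# Hypothetical durations for each stage; none of these are measured values.
stages = {
    "compromise detected": timedelta(hours=72),
    "CA notified": timedelta(hours=4),
    "revocation processed": timedelta(hours=24),
    "clients updated": timedelta(hours=168),
}

total = sum(stages.values(), timedelta())

for name, duration in stages.items():
    print(f"{name:<22} {duration}")
print(f"{'total window':<22} {total}")
```

Under these assumptions the attacker has more than eleven days, and the largest share comes from waiting for clients to pick up fresh revocation data.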

9. Modern Mitigations

Instead of relying heavily on revocation, modern ecosystems prefer:

Short-Lived Certificates

90-day lifetimes (e.g., Let’s Encrypt)
Rapid automated renewal

Reduces exposure window.
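
The effect is easy to quantify. If revocation is ignored entirely, a stolen key stays useful until the certificate expires. The sketch below compares a few lifetimes. The 398-day and 7-day values are illustrative comparison points, and `max_exposure_days` is a hypothetical helper.

```python
def max_exposure_days(lifetime_days: int, compromised_on_day: int) -> int:
    """Days a stolen key remains usable if no client ever checks revocation."""
    return max(lifetime_days - compromised_on_day, 0)


# Key stolen on day 1 of each certificate's life.
for lifetime in (398, 90, 7):
    print(f"{lifetime:>3}-day certificate -> up to "
          f"{max_exposure_days(lifetime, 1)} days of exposure")
```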

Certificate Transparency Monitoring

Detects mis-issuance quickly.

Rapid Root Removal

Browsers can remove trust entirely via updates.

CRLite (Firefox)

Ships compressed revocation data covering the public Web PKI to the browser, so revocation checks run locally without a network request.
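
The original CRLite design describes a cascade of Bloom filters. The sketch below is a toy version of that idea, not Firefox's implementation: the class and function names, the 10-bits-per-item sizing, and the SHA-256-based hashing are all assumptions made for illustration. The cascade gives exact answers only for certificates in the known universe (revoked plus valid) used to build it.

```python
import hashlib
from typing import Iterator, List, Set


class BloomFilter:
    def __init__(self, size_bits: int, num_hashes: int, salt: int):
        self.size = max(size_bits, 8)
        self.k = num_hashes
        self.salt = salt
        self.bits = bytearray((self.size + 7) // 8)

    def _positions(self, item: str) -> Iterator[int]:
        for i in range(self.k):
            digest = hashlib.sha256(f"{self.salt}:{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


def build_cascade(revoked: Set[str], valid: Set[str]) -> List[BloomFilter]:
    """Level 0 holds revoked items; each later level holds the previous
    level's false positives, alternating between the two sets."""
    levels: List[BloomFilter] = []
    include, exclude = set(revoked), set(valid)
    while include:
        level = BloomFilter(10 * len(include), num_hashes=3, salt=len(levels))
        for item in include:
            level.add(item)
        levels.append(level)
        false_positives = {x for x in exclude if x in level}
        include, exclude = false_positives, include
    return levels


def is_revoked(levels: List[BloomFilter], item: str) -> bool:
    for depth, level in enumerate(levels):
        if item not in level:
            # Missing at an even depth means "not revoked"; odd means "revoked".
            return depth % 2 == 1
    # Matched every level: the parity of the cascade depth decides.
    return len(levels) % 2 == 1


revoked = {f"serial-{i}" for i in range(200)}
valid = {f"serial-{i}" for i in range(200, 5000)}
cascade = build_cascade(revoked, valid)

print("levels:", len(cascade))
print("total bytes:", sum(len(level.bits) for level in cascade))
print(is_revoked(cascade, "serial-5"), is_revoked(cascade, "serial-4000"))
```

Because later levels only store the previous level's false positives, they shrink quickly, which is where the compression comes from.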

10. Enterprise Revocation Realities

In enterprise PKI:

Revocation often more strictly enforced
Internal OCSP responders more reliable
CRL distribution controlled

However:

Large internal CRLs can still cause performance issues
Device fleet management becomes critical

11. The Fundamental Trade-Off

Revocation exists at the intersection of:

Security
Availability
Performance
Privacy

Strict revocation enforcement increases:

Latency
Outage risk
Operational complexity

Relaxed enforcement increases:

Exposure window
Risk during compromise

There is no perfect balance.

12. The Hard Truth

At internet scale:

Revocation rarely stops real-time attacks.

It is primarily useful for:

Post-incident containment
Preventing future exploitation
Signaling ecosystem distrust

Short-lived certificates and rapid automation are more effective.

Revocation is not broken because of cryptography. It is weakened by scale, latency constraints, and the need to keep the internet available.

Certificate Infrastructure Deep Dive

Part 0: Architecture Map (Parts 1–7)
Part 1: The Cryptographic Foundations
Part 2: Inside the TLS Handshake
Part 3: Certificates Explained Properly
Part 4: PKI and Trust Stores
Part 5: Revocation, OCSP, and Why It Fails (current)
Part 6: Attacks and Real-World Failures
Part 7: The Future of PKI