
Certificate transparency: how CT logs work and what they reveal

Copyright: MIT

A certificate authority is a strange thing to build a security system around. Any one of a few hundred organizations the browser trusts can sign a certificate for any domain on the internet, and until about a decade ago, the owner of that domain had no reliable way to find out it had happened. The trust was total and the visibility was zero. That gap is exactly what got exploited in 2011, when a single compromised Dutch CA issued a working certificate for *.google.com and used it to intercept the traffic of hundreds of thousands of people in Iran. Nobody at Google signed that certificate. Nobody at Google could see it until the attack was already running.

Certificate Transparency is the answer to that gap, and the answer is almost embarrassingly simple in outline: write every certificate down in a public, append-only ledger before any browser will trust it, and let anyone read the ledger. The hard part is the cryptography that makes “append-only” mean something a log operator cannot lie about, and the policy machinery that makes browsers actually demand proof of logging. This post walks through both. It covers the DigiNotar motivation, the Merkle-tree structure that RFC 6962 nailed down, the Signed Certificate Timestamp that proves a certificate was logged, how Chrome, Safari and Firefox enforce all of it, the 2025 shift from dynamic logs to static tiles, and the recon side that detection engineers care about: the fact that every certificate you log is also a public announcement of a hostname you own.

2011: the failure that justified the whole thing

The certificate authority model has one structural flaw that no amount of careful operation fixes. Trust is not scoped. When your browser trusts DigiCert or Sectigo or Google Trust Services, it trusts them to sign a certificate for yourbank.com whether or not they have any relationship with your bank. The browser checks that the certificate chains to a trusted root and that the name matches. It does not, and cannot, check whether the domain owner authorized the issuance. So the security of every HTTPS site on the internet is bounded below by the security of the weakest CA in the trust store, and there are hundreds of them.

DigiNotar was the weakest. The Dutch CA was breached in mid-2011, and the attacker used the access to issue fraudulent certificates. The investigation that followed, published as the Black Tulip report, found at least 531 rogue certificates. One of them was a wildcard for Google, and on 28 August 2011 it surfaced in a man-in-the-middle attack against roughly 300,000 users almost entirely inside Iran. A Chrome user in Tehran connecting to Gmail would have seen a perfectly valid lock icon. The certificate chained to a real, trusted root. The only thing wrong with it was that Google never asked for it, and there was no system anywhere that would have told Google it existed.

The cleanup was brutal. DigiNotar was declared bankrupt within weeks. When the major browsers and operating systems pulled the DigiNotar roots, they took down certificates the Dutch government itself relied on, which broke services for clearing customs and other state functions until those could be re-issued under a different CA. A single CA compromise had cascaded into a national infrastructure problem. That cost is what made the case for transparency. If detection of a mis-issued certificate depended on the victim happening to notice the attack in progress, the system was broken; detection had to be automatic, public, and fast.

Three Google engineers, Ben Laurie, Adam Langley and Emilia Kasper, started building the framework that year. The first IETF draft went in under the codename Sunlight in 2012. RFC 6962 was published in June 2013.

The core idea: an append-only public ledger

The design goal CT set itself is narrow and worth stating precisely. CT does not try to stop a CA from mis-issuing a certificate. It cannot; the CA holds a trusted key and can sign whatever it wants. What CT does is make mis-issuance impossible to hide. Every certificate a publicly trusted CA issues must be submitted to one or more public logs, and a browser will refuse to trust a certificate that does not come with proof it was logged. The domain owner, who is watching the logs for their own name, sees the rogue certificate appear and can act. Detection moves from “maybe someone notices the attack” to “the certificate is published the moment it exists.”

That only works if the log itself can be held to account. A log that could quietly delete or alter entries would be no better than a CA you have to trust blindly. So the log is built as an append-only data structure whose history cannot be rewritten without the change being mathematically detectable. The tool for that is a Merkle tree, and the properties it gives you are the heart of CT.

There are three roles in the system. Logs accept certificates and maintain the append-only tree. Monitors watch logs, fetch every new entry, and look for certificates that matter to them; a domain owner runs a monitor, or pays someone to, and gets alerted when a certificate naming their domain shows up. Auditors check that logs are behaving, in particular that a log is genuinely append-only and is not presenting different versions of its history to different people. In practice the auditing role mostly lives inside browsers and monitoring services rather than as standalone software, which is a point the design has wrestled with for years and never fully resolved.

Figure: the CA, the CT log, the monitor, the browser and the domain owner. *The four roles. The CA submits, the log returns an SCT promising inclusion, the browser checks the SCT on every TLS handshake, and monitors feed alerts back to whoever owns the name.*

The Merkle tree, and what “append-only” buys you

A CT log is one ever-growing binary Merkle tree. RFC 6962 uses SHA-256 throughout. The construction has a couple of details that exist purely to prevent a class of attack, and they are worth seeing exactly.

Each certificate becomes a leaf. The hash of a leaf is not just SHA-256 of the certificate data; it is SHA-256(0x00 || leaf_data). Internal nodes hash their two children as SHA-256(0x01 || left || right). Those leading bytes, 0x00 for leaves and 0x01 for interior nodes, are domain separation. Without them, an attacker could take an interior node’s hash and pass it off as a leaf, or vice versa, and construct two different trees with the same root. The one-byte prefix makes the leaf hash function and the node hash function provably distinct, so a hash computed as a leaf can never be mistaken for one computed as a node. The hash of an empty tree is defined as the SHA-256 hash of the empty string. These are small rules, and they are the kind of thing that, if you got them wrong, would quietly destroy the security of the whole structure.
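To make the hashing rules concrete, here is a minimal sketch of them in Python, assuming the recursive split RFC 6962 describes (the left subtree takes the largest power of two strictly smaller than the leaf count). The function names are mine, not from any CT library, and a real log would never recompute the root from scratch like this.

```python
import hashlib

def leaf_hash(leaf_data: bytes) -> bytes:
    # Leaves are prefixed with 0x00 (domain separation).
    return hashlib.sha256(b"\x00" + leaf_data).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    # Interior nodes are prefixed with 0x01.
    return hashlib.sha256(b"\x01" + left + right).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    n = len(leaves)
    if n == 0:
        # Empty tree: SHA-256 of the empty string.
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(leaves[0])
    # k = largest power of two strictly less than n.
    k = 1
    while k * 2 < n:
        k *= 2
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))
```

Swapping two leaves or changing one byte of any leaf gives a different root, which is the property the rest of the design leans on.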

The root of the tree is a single 32-byte hash that commits to every leaf and the exact order they were added. Change any certificate, reorder any two, delete one, and the root changes. The log periodically signs a structure called the Signed Tree Head, which contains the tree size, a timestamp, and that root hash. The STH is the log saying, under its own key, “at this moment my tree has exactly this many entries and this root.” Anyone who has seen an STH has a commitment they can hold the log to later.

Two kinds of proof make the structure useful, and they are what separate a Merkle tree from a plain signed list.

An inclusion proof, which the RFC calls a Merkle audit path, is the short list of sibling hashes you need to recompute the root from a single leaf. If a log claims your certificate is entry number 4,210,118 in a tree of 9 million entries, it can hand you about two dozen hashes, and you can walk those up the tree and arrive at the signed root yourself. You never download the other nine million certificates. The proof size is logarithmic in the tree, so even a tree with billions of entries needs only thirty-odd hashes to prove any one leaf is in it. That logarithmic cost is the entire reason this is practical.
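As a sketch of what walking the hashes up the tree looks like, the following Python builds an audit path and verifies it, following the path definition in RFC 6962 and the verification steps written out in RFC 9162. The helper names and the toy leaves are assumptions for illustration; a real client receives the path from the log rather than computing it.

```python
import hashlib

def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()

def split_point(n: int) -> int:
    # Largest power of two strictly less than n.
    k = 1
    while k * 2 < n:
        k *= 2
    return k

def merkle_root(leaves: list[bytes]) -> bytes:
    if not leaves:
        return hashlib.sha256(b"").digest()
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    k = split_point(len(leaves))
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))

def audit_path(index: int, leaves: list[bytes]) -> list[bytes]:
    # Sibling hashes from the leaf up to (but excluding) the root.
    n = len(leaves)
    if n <= 1:
        return []
    k = split_point(n)
    if index < k:
        return audit_path(index, leaves[:k]) + [merkle_root(leaves[k:])]
    return audit_path(index - k, leaves[k:]) + [merkle_root(leaves[:k])]

def verify_inclusion(leaf_h: bytes, index: int, tree_size: int,
                     path: list[bytes], root: bytes) -> bool:
    if index >= tree_size:
        return False
    fn, sn = index, tree_size - 1
    r = leaf_h
    for p in path:
        if sn == 0:
            return False
        if (fn & 1) or fn == sn:
            # We are a right child: the sibling goes on the left.
            r = node_hash(p, r)
            while fn != 0 and not (fn & 1):
                fn >>= 1
                sn >>= 1
        else:
            r = node_hash(r, p)
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root
```

For an 8-leaf tree, `audit_path(2, leaves)` has three entries, and a verifier holding only the leaf, its index, the tree size and the signed root can check them without seeing any other leaf.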

A consistency proof is the part that enforces append-only. Given two STHs from the same log, an older one over m entries and a newer one over n entries, a consistency proof is the set of nodes that lets you verify the newer tree contains the older tree, unchanged, as a prefix. If the log had quietly edited or removed any of the first m entries, no valid consistency proof between those two roots could exist. So an auditor that holds a sequence of STHs over time, and checks consistency between each consecutive pair, can prove the log only ever grew and never rewrote its past. The log cannot forge its way out of that, because it would have to produce a SHA-256 collision.
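The consistency check can be sketched the same way. The Python below generates a proof per the recursive definition in RFC 6962 and verifies it with the steps given in RFC 9162; the function names and test leaves are illustrative assumptions, and an auditor would normally receive the proof from the log instead of computing it.

```python
import hashlib

def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()

def split_point(n: int) -> int:
    k = 1
    while k * 2 < n:
        k *= 2
    return k

def merkle_root(leaves: list[bytes]) -> bytes:
    if not leaves:
        return hashlib.sha256(b"").digest()
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    k = split_point(len(leaves))
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))

def consistency_proof(m: int, leaves: list[bytes], whole: bool = True) -> list[bytes]:
    # Proof that the first m leaves form a prefix of the full tree.
    n = len(leaves)
    if not 0 < m <= n:
        raise ValueError("need 0 < m <= len(leaves)")
    if m == n:
        return [] if whole else [merkle_root(leaves)]
    k = split_point(n)
    if m <= k:
        return consistency_proof(m, leaves[:k], whole) + [merkle_root(leaves[k:])]
    return consistency_proof(m - k, leaves[k:], False) + [merkle_root(leaves[:k])]

def verify_consistency(first: int, second: int, first_root: bytes,
                       second_root: bytes, proof: list[bytes]) -> bool:
    if not 0 < first <= second:
        return False
    if first == second:
        return not proof and first_root == second_root
    if not proof:
        return False
    path = list(proof)
    if (first & (first - 1)) == 0:
        # Old tree is a complete subtree: its root is the starting node.
        path.insert(0, first_root)
    fn, sn = first - 1, second - 1
    while fn & 1:
        fn >>= 1
        sn >>= 1
    fr = sr = path[0]
    for c in path[1:]:
        if sn == 0:
            return False
        if (fn & 1) or fn == sn:
            fr = node_hash(c, fr)
            sr = node_hash(c, sr)
            while fn != 0 and not (fn & 1):
                fn >>= 1
                sn >>= 1
        else:
            sr = node_hash(sr, c)
        fn >>= 1
        sn >>= 1
    # Both the old and the new root must be reproduced from the same nodes.
    return sn == 0 and fr == first_root and sr == second_root
```

The verifier rebuilds both roots from one shared set of nodes, so a log that altered any of the first m entries cannot satisfy the old root and the new root at once.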

Figure: inclusion proof for leaf d2 in an 8-leaf tree, 3 hashes for 8 leaves, log2(n) for n leaves. *To prove d2 is in the tree, a client needs d2's sibling d3, then h01, then h47, hashing upward to the signed root. The proof grows with the logarithm of the tree, not its size.*

The SCT: a receipt the browser checks

When a CA submits a certificate to a log, the log does not necessarily add it to the tree on the spot. Building the tree and signing a new head is batched. What the log returns immediately is a Signed Certificate Timestamp, the SCT, and the SCT is a signed promise: the log commits, under its own key, to add this certificate to its tree within a bounded time called the Maximum Merge Delay. RFC 6962 leaves the exact MMD for each log to publish; in practice logs run with an MMD of 24 hours. If the log fails to merge within the MMD, it is out of compliance and can be distrusted.

The SCT is small. Its fields are the SCT version, the log id (a hash identifying which log signed it), a timestamp in milliseconds, an optional extensions field, and a signature over the certificate plus that metadata. The browser, on receiving a certificate during a TLS handshake, checks the SCTs that came with it: that each is correctly signed by a log the browser recognizes, that the timestamp is sane, and that there are enough of them from enough distinct logs to satisfy policy. Notice what the browser does not do in the common case. It does not contact the log. It does not download an inclusion proof. It trusts the SCT as a signed promise, because requiring an online inclusion-proof fetch on every connection would be a latency and privacy disaster, the same problem that has dogged OCSP-based revocation checking. The SCT is a bearer receipt, verified offline.
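The field layout is simple enough to parse by hand. This is a rough sketch of decoding one serialized v1 SCT using the TLS-style encoding from RFC 6962 (1-byte version, 32-byte log id, 8-byte millisecond timestamp, length-prefixed extensions, then a 2-byte algorithm pair and a length-prefixed signature). The `Sct` class and `parse_sct` name are hypothetical, and the sketch only splits the fields; it does not verify the signature against a log's public key.

```python
import struct
from dataclasses import dataclass

@dataclass
class Sct:
    version: int        # 0 means v1
    log_id: bytes       # SHA-256 of the log's public key
    timestamp_ms: int   # milliseconds since the Unix epoch
    extensions: bytes
    hash_alg: int       # TLS HashAlgorithm code (4 = SHA-256)
    sig_alg: int        # TLS SignatureAlgorithm code (3 = ECDSA)
    signature: bytes

def parse_sct(buf: bytes) -> Sct:
    header = struct.calcsize(">B32sQH")  # 43 bytes, no padding in big-endian mode
    if len(buf) < header:
        raise ValueError("SCT truncated in header")
    version, log_id, ts, ext_len = struct.unpack_from(">B32sQH", buf, 0)
    off = header
    if len(buf) < off + ext_len + 4:
        raise ValueError("SCT truncated in extensions")
    extensions = buf[off:off + ext_len]
    off += ext_len
    hash_alg, sig_alg, sig_len = struct.unpack_from(">BBH", buf, off)
    off += 4
    if len(buf) < off + sig_len:
        raise ValueError("SCT truncated in signature")
    signature = buf[off:off + sig_len]
    return Sct(version, log_id, ts, extensions, hash_alg, sig_alg, signature)
```

A browser does the equivalent of this for each SCT and then checks the signature with the key of the log named by `log_id`.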

There are three ways an SCT can reach the browser, and which one a site uses is mostly invisible to the user.

The common one is embedding. The CA gets SCTs at issuance time and bakes them into the certificate itself, in an X.509v3 extension with OID 1.3.6.1.4.1.11129.2.4.2. This requires no server configuration at all; the SCTs travel inside the certificate the server already sends. There is a chicken-and-egg problem here, though. The SCT covers the certificate, but the SCT has to be inside the certificate, and you cannot include a signature over a certificate that does not exist yet. CT solves it with a precertificate. The CA builds a precertificate that carries a special critical poison extension, OID 1.3.6.1.4.1.11129.2.4.3, which makes every TLS client refuse to validate it. The CA submits that poisoned precertificate to the logs, collects the SCTs, then issues the real certificate with those SCTs embedded. The log’s signature covers the precertificate’s contents with the poison extension stripped out, and a browser can rebuild that same structure from the final certificate by removing the embedded SCT list, so an SCT issued for the precertificate also vouches for the real one.

The other two delivery paths do not require a precertificate dance, because they hand the SCT to the client at connection time rather than baking it into the certificate. One is a TLS extension, signed_certificate_timestamp, sent during the handshake. The other rides inside an OCSP response via stapling, using OID 1.3.6.1.4.1.11129.2.4.5. Both let a server attach SCTs to a certificate that does not contain them, which matters when the CA did not embed any. In practice embedding won. Most servers do not support the TLS extension or OCSP stapling for this purpose, so CAs embed SCTs as the default, and since around June 2021 essentially all actively used publicly trusted certificates carry their SCTs in the X.509 extension.

Figure: three ways an SCT reaches the browser: embedded in the certificate (X.509v3 extension, no server config), the signed_certificate_timestamp TLS extension (server attaches), and OCSP stapling (rare in practice). *Embedding via the precertificate path dominates because it needs no server changes. The TLS-extension and OCSP-stapling paths exist for certificates that did not embed SCTs at issuance.*

How browsers actually enforce it

A standard that browsers do not enforce is a suggestion. The teeth of CT are in browser policy, and Chrome moved first. Chrome began requiring CT for Extended Validation certificates in 2015, treating the green EV bar as conditional on logging. The bigger move came on 30 April 2018: every certificate issued after that date had to be CT-compliant or Chrome would reject it outright, not downgrade a UI indicator but refuse the connection. There was a forcing event in between. In 2016 and 2017, Symantec’s CA operations were found to have mis-issued a large number of certificates, and the response, eventually a full distrust of the Symantec PKI, leaned on CT logs as the evidence base. Transparency had found a real problem at one of the largest CAs in the world.

Today the policy is quantitative. To be CT-compliant in Chrome, a certificate needs a minimum number of SCTs that depends on its lifetime: two SCTs for certificates valid 180 days or less, three for longer-lived ones, each from a distinct log. At least two of those SCTs must come from logs operated by distinct operators that Chrome recognizes, which is the rule that stops a single log operator from being a single point of failure or capture. The logs themselves have to be in a recognized state, qualified or usable or read-only, at the relevant time. Chrome ships the list of recognized logs as a signed component, updated roughly daily, and it will only keep enforcing CT as long as that list is reasonably fresh; the policy gives it a window of about 70 days before it stops enforcing rather than risk failing connections on a stale list.
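Written out as code, the counting rule looks roughly like this. It is a simplified sketch of the policy exactly as summarized above, not Chrome's implementation: the `SctInfo` record, the state strings and the function names are hypothetical, and the real policy also evaluates log state relative to SCT and check times, which this ignores.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SctInfo:
    log_id: str      # which log signed the SCT
    operator: str    # who runs that log
    log_state: str   # the log's state in the browser's log list

# Assumed spellings for the recognized states named in the text.
ACCEPTED_STATES = {"qualified", "usable", "readonly"}

def required_scts(lifetime_days: int) -> int:
    # Two SCTs for certificates valid 180 days or less, three otherwise.
    return 2 if lifetime_days <= 180 else 3

def is_ct_compliant(lifetime_days: int, scts: list[SctInfo]) -> bool:
    valid = [s for s in scts if s.log_state in ACCEPTED_STATES]
    distinct_logs = {s.log_id for s in valid}
    distinct_operators = {s.operator for s in valid}
    return (len(distinct_logs) >= required_scts(lifetime_days)
            and len(distinct_operators) >= 2)
```

Under this sketch, a 90-day certificate with two SCTs from two logs run by the same operator fails, while the same certificate with SCTs from two different operators passes.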

Early Chrome policy also required that one of the SCTs come from a Google-operated log. That requirement was dropped in 2022, aligning Chrome with Apple’s policy and removing a structural advantage for Google’s own logs. Apple enforces its own CT policy in Safari and across its platforms, with its own count of required SCTs that varies similarly by certificate lifetime and its own list of trusted logs. The two policies are close but not identical, which is why CAs target the union of both: enough SCTs from enough operators to satisfy whichever browser is strictest.

Firefox was the long holdout. Mozilla shipped CT enforcement much later than the others. As of Firefox 135 on desktop, released in early 2025, Firefox requires CT log inclusion for certificates chaining to CAs in Mozilla’s root program, with Firefox for Android following at version 145. So by 2026, all three major browser engines enforce CT, which means in practice that a publicly trusted certificate without valid SCTs is simply unusable on the web. The enforcement is invisible when it works and fatal when it does not.

One header is worth a footnote because it shows up in old configs. Expect-CT was an HTTP response header that let a site opt into CT enforcement and failure reporting before browsers enforced CT universally, with directives max-age, enforce, and report-uri. Once Chrome began enforcing CT for all certificates, the header had nothing left to do. It is deprecated, effectively obsolete since mid-2021, and Chromium has moved to remove it. If you see it in a config today, it is dead weight.

What the logs reveal

Here is the part detection and reconnaissance engineers care about. CT was designed to make certificates public. It succeeded completely. And a certificate is, among other things, a list of hostnames. Every name in the Subject Common Name and, more importantly, every name in the Subject Alternative Name extension is now permanently written into a public, queryable, append-only log the moment a certificate is issued. The mandate that protects you from a rogue CA also publishes a map of your infrastructure.

The practical interface to this is crt.sh, which aggregates the major logs into a searchable database. A query like %.example.com returns every certificate ever logged whose names match, and from the SAN fields you extract subdomains. This is about as clean as passive reconnaissance gets. No packet is ever sent to the target. Nothing is scanned. The attacker reads a public database that the target’s own CA populated on the target’s behalf. Tools like subfinder, amass, certspotter and the purpose-built CTFR automate the query and merge CT results with passive DNS and other sources. For an attacker doing attack-surface mapping, or a defender doing the same mapping to find their own forgotten assets first, CT is usually the highest-yield source in the toolbox.
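A minimal sketch of that query in Python follows. It assumes crt.sh's JSON output mode (`output=json`) and a `name_value` field holding newline-separated names, which is how the service has behaved as far as I know but is not a documented contract; the function names and the sample records in the usage note are made up.

```python
import json
import urllib.request
from urllib.parse import urlencode

def crtsh_url(domain: str) -> str:
    # "%" is the SQL-style wildcard; urlencode turns it into %25.
    return "https://crt.sh/?" + urlencode({"q": f"%.{domain}", "output": "json"})

def fetch_records(domain: str) -> list[dict]:
    # Network call; crt.sh can be slow or rate-limited, hence the timeout.
    with urllib.request.urlopen(crtsh_url(domain), timeout=30) as resp:
        return json.load(resp)

def extract_names(records: list[dict], domain: str) -> list[str]:
    # Collect unique hostnames under `domain` from the name_value fields.
    domain = domain.lower()
    suffix = "." + domain
    names = set()
    for rec in records:
        for name in rec.get("name_value", "").splitlines():
            name = name.strip().lower().rstrip(".")
            if name == domain or name.endswith(suffix):
                names.add(name)
    return sorted(names)
```

Calling `extract_names(fetch_records("example.com"), "example.com")` would return the deduplicated name list; wildcard entries such as `*.example.com` come back as literal strings and need separate handling.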

What it catches is everything that ever got a public certificate. That includes the things people forget. A staging environment stood up for a demo two years ago at staging-old.example.com, given a Let’s Encrypt certificate and never decommissioned, is in the logs forever even after the DNS record is gone and the service is down, because the log is append-only and certificates do not un-log. Internal tools that got a public certificate for convenience, vpn., jenkins., grafana., admin., are all enumerable. The hostname naming convention an organization uses leaks, so an attacker who sees prod-api-01 can reasonably guess prod-api-02 exists. CT did not create these mistakes, but it made them permanently discoverable, and that is a meaningful shift in what an attacker can learn for free before touching the target.

There are limits, and they matter for accuracy. A wildcard certificate for *.example.com logs the single string *.example.com and reveals nothing about which specific subdomains exist behind it; an organization that uses wildcards heavily exposes far less through CT than one issuing a named certificate per host. Hosts that never received a publicly trusted certificate are invisible, so a service behind an internal CA, or one using a self-signed certificate, or one with no TLS at all, does not appear. And the log tells you a name was certified, not that the name resolves today or that anything is listening; CT enumeration produces candidates that still need to be validated against live DNS and ports. So the technique is high-yield and incomplete at the same time, which is the normal shape of passive reconnaissance.
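The validation step can be sketched as a simple triage pass over the candidate names. This assumes a plain DNS lookup is an acceptable first filter; the `triage` function and its bucket names are hypothetical, and a name that resolves still says nothing about whether any port is open.

```python
import socket
from typing import Callable, Iterable

def triage(names: Iterable[str],
           resolve: Callable[[str], str] = socket.gethostbyname) -> dict[str, list[str]]:
    # Split CT candidates into wildcards, names that resolve, and names that do not.
    result: dict[str, list[str]] = {"wildcard": [], "resolves": [], "unresolved": []}
    for name in sorted(set(names)):
        if name.startswith("*."):
            # A wildcard entry is not a hostname and cannot be resolved directly.
            result["wildcard"].append(name)
            continue
        try:
            resolve(name)
        except OSError:
            # socket.gaierror is a subclass of OSError.
            result["unresolved"].append(name)
        else:
            result["resolves"].append(name)
    return result
```

The resolver is injectable so the same function can run against a recorded DNS snapshot instead of live lookups.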

Figure: a crt.sh query for %.example.com and the SAN names it returns (live hosts, a stale staging host, an internal tool, and a wildcard entry), with internal-CA, self-signed and no-TLS hosts marked invisible. *A CT query recovers everything that ever got a public certificate, including stale and internal hosts, but a wildcard hides its children and anything without a public cert never appears.*

This is also why CT shows up in the threat models of teams that fingerprint and score traffic. A defender who knows their full hostname inventory from CT can spot a request to a host that should not be reachable from the outside. An attacker who scraped the same logs knows where the soft targets are before the first probe. The data is symmetric; both sides read the same public ledger. It pairs naturally with the rest of the Web PKI trust-store machinery that decides which certificates are valid in the first place.

2021 to 2026: version 2.0, and the move to static logs

CT has not stood still since 2013. RFC 9162, published in December 2021, is Certificate Transparency version 2.0, authored by Ben Laurie, Eran Messeri and Rob Stradling. It obsoletes RFC 6962 on paper and cleans up several things the original got slightly wrong. Hash and signature algorithms became agile, pinned to IANA registries rather than hardcoded to SHA-256 and a fixed signature scheme. Precertificates became CMS objects rather than X.509 certificates, which sidesteps an awkward conflict with the rule that certificate serial numbers must be unique. A unified TransItem structure replaced the assortment of separate leaf and proof encodings from 6962. The TLS extension was renamed transparency_info. Despite all that, 2.0 adoption was minimal for years; the ecosystem kept running on 6962 because it worked and migrating logs is expensive.

The actual migration that is happening in 2025 and 2026 is a different one, and it is driven by operating cost rather than the 2.0 spec. Running an RFC 6962 log at scale turned out to be brutally expensive. The logs are backed by large relational databases; Let’s Encrypt reported individual shards running between 7 and 10 terabytes, with annual cloud costs approaching seven figures. The dynamic API, where clients fetch individual proofs from live endpoints, forces monitors to slowly crawl the log and is hard to cache. And the 24-hour Maximum Merge Delay is a constant compliance hazard, because a log that stalls during a traffic spike and fails to merge within the window is out of compliance.

The replacement is the Static CT API, sometimes called the tiled-log or Sunlight model, specified at c2sp.org/static-ct-api. Instead of computing proofs on demand, a static log serves its Merkle tree as a set of immutable tiles, each a fixed run of 256 hashes at a given tree height, stored as plain files. A monitor fetches the tiles it needs in parallel and computes any inclusion or consistency proof locally. Because the tiles are static files, they cache trivially behind a CDN or object store, the server does no per-request computation, and the architecture drops the MMD requirement entirely. The cost profile collapses from a seven-figure relational database to cheap static hosting. Filippo Valsorda’s Sunlight is the reference implementation, and Let’s Encrypt has been standing up tiled logs on it.
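To give a feel for the tile arithmetic, here is a small sketch that maps a leaf index to the hash tile covering it. It assumes the path convention from the tiled-log specifications as I understand them (`tile/<level>/<index>`, the index split into 3-digit groups with all but the last prefixed by `x`, and a `.p/<width>` suffix for a partial rightmost tile); check the spec before relying on the exact strings. The function names are mine.

```python
TILE_WIDTH = 256  # hashes per full tile

def encode_index(n: int) -> str:
    # 1234067 -> "x001/x234/067"
    parts = [f"{n % 1000:03d}"]
    n //= 1000
    while n:
        parts.append(f"x{n % 1000:03d}")
        n //= 1000
    return "/".join(reversed(parts))

def hash_tile_path(leaf_index: int, tree_size: int, level: int = 0) -> str:
    # One hash at tile level L covers TILE_WIDTH**L leaves.
    leaves_per_hash = TILE_WIDTH ** level
    hash_index = leaf_index // leaves_per_hash
    total_hashes = tree_size // leaves_per_hash   # complete hashes at this level
    if leaf_index >= tree_size or hash_index >= total_hashes:
        raise ValueError("no complete hash covers this leaf at this level yet")
    tile_index = hash_index // TILE_WIDTH
    full_tiles = total_hashes // TILE_WIDTH
    path = f"tile/{level}/{encode_index(tile_index)}"
    if tile_index >= full_tiles:
        # Rightmost tile is still filling up: request it as a partial tile.
        path += f".p/{total_hashes % TILE_WIDTH}"
    return path
```

Because a full tile never changes once written, a monitor can cache it indefinitely; only the partial tile at the right edge of each level gets superseded as the log grows.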

The timeline is concrete and close. Let’s Encrypt announced in August 2025 that its RFC 6962 logs go read-only on 30 November 2025 and shut down completely on 28 February 2026. The static logs take over. For anyone monitoring CT, the practical effect is that the API you query is changing under you in early 2026: the dynamic get-entries style endpoints of the old logs are going away, and the tile-fetch model is replacing them. The data CT exposes does not change. The plumbing for getting at it does.

What CT settled, and what it did not

Certificate Transparency did the thing it set out to do. The DigiNotar-style attack, a trusted CA quietly issuing a certificate for a domain it has no business signing, is no longer quiet. It cannot be. The certificate has to be logged to be trusted, the log is public and append-only, and the domain owner is watching. The Symantec distrust showed the model working on the largest possible scale: transparency surfaced systematic mis-issuance at a top-tier CA and the evidence held up well enough to end that CA’s place in the trust store. For a system that ships as a few X.509 extensions and a browser policy, that is a large return.

What CT did not do is keep its byproduct contained. The same publicity that catches a rogue certificate publishes a complete, permanent, append-only map of the hostnames every organization has ever certified, and that map is one HTTP query away. There is no opt-out that preserves the security benefit, because the security benefit is the publicity. An organization can use wildcards to blur the picture, but it cannot remove a name once logged, and the append-only property that makes the log trustworthy is the same property that makes a leaked staging hostname permanent. Defenders and attackers read the identical ledger.

The thing that strikes me, looking at the 2025-2026 migration, is how the economics finally forced a redesign that the cryptography never required. The Merkle-tree math in RFC 6962 was sound from the start and barely changes in the tiled model; what broke was the cost of serving proofs from a live database at the scale the entire web now logs at. The move to static tiles is not a security fix. It is an admission that “every certificate on the internet, in a public log, queryable by anyone” is a genuinely large amount of data, large enough that the only sustainable way to publish it is as flat files on a CDN. The transparency won. Paying for it is the open problem.

