Internet-Draft | DNSSEC Key Restore | October 2025 |
Obser & Pels | Expires 15 April 2026 |
This document describes the issues surrounding the handling of DNSSEC private keys in a DNSSEC signer. It presents operational guidance for the case where a DNSSEC private key becomes inoperable.¶
This note is to be removed before publishing as an RFC.¶
Discussion of this document takes place on the Domain Name System Operations Working Group mailing list (dnsop@ietf.org), which is archived at https://mailarchive.ietf.org/arch/browse/dnsop/.¶
Source for this draft and an issue tracker can be found at https://github.com/fobser/draft-fobser-dnsop-dnssec-keyrecovery.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 15 April 2026.¶
Copyright (c) 2025 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
DNSSEC [RFC9364] uses public key cryptography to provide integrity protection of DNS data. From an operational point of view, it is critically important to keep the private key secret under all circumstances.¶
The private key is typically kept secret by using Hardware Security Modules (HSMs). HSMs are designed to perform cryptographic operations such as creating keys and signing messages without disclosing the private key. Alternatively, the DNSSEC signer is an appliance or commodity server hardware, and operational policy stipulates that the private key must not leave the signer.¶
Operationally this is a risk because only a single copy of the key exists. The key could become inoperable at any point due to hardware failure, natural disaster, operator error, or malicious action.¶
It is difficult to create backups of the private key. After all, the system is designed to prevent backups. A compromise is usually reached by using a secret sharing scheme, e.g. [Shamir]. The private key is split into n pieces inside the HSM, which are then distributed to key share holders. In case the private key becomes inoperable, m out of the n key share holders need to come together to restore the private key.¶
A key sharing scheme does not mitigate all risk. When more than n-m key shares become unavailable, a restore cannot be performed, because fewer than m key shares remain. This is particularly challenging in small to medium-sized teams.¶
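The m-out-of-n property can be illustrated with a toy version of the scheme in [Shamir]. The following Python sketch is for illustration only: the prime, the function names, and the integer encoding of the secret are assumptions made here, and an HSM performs the equivalent operations internally with its own parameters.

```python
import secrets

# Illustrative field modulus (a Mersenne prime); an assumption of this sketch.
PRIME = 2**127 - 1


def split(secret: int, n: int, m: int) -> list[tuple[int, int]]:
    """Split secret into n shares, any m of which can restore it."""
    if not (1 <= m <= n and 0 <= secret < PRIME):
        raise ValueError("need 1 <= m <= n and 0 <= secret < PRIME")
    # Random polynomial of degree m-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(m - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def restore(shares: list[tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 over the given shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


def can_restore(n: int, m: int, unavailable: int) -> bool:
    """A restore needs at least m of the n shares to be available."""
    return n - unavailable >= m
```

With n = 5 and m = 3, any three shares passed to restore() yield the original secret, and can_restore(5, 3, 3) is False: losing three shares, i.e. more than n-m, makes a restore impossible.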
Unlike the private key, a DNSSEC signed zone can be considered public data with its integrity protected by signatures. Signed zones can be added to the normal, established backup procedures.¶
The rest of the document describes procedures on how to restore DNSSEC signing functionality with only a backup of the signed zone available.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
This document uses DNS terminology from [RFC9499]. DNSSEC key states and timeline related abbreviations are defined in [RFC7583].¶
The following additional definitions are used within this document.¶
Inoperable private key: The private part of a DNSKEY appearing in the chain of trust of the zone that can no longer be used for signing. Causes include hardware failure, natural disaster, operator error, or malicious action. A compromised key is not an inoperable private key since it can still be used for signing.¶
Operable private key: The opposite of an inoperable private key. A key that can be used for signing.¶
The procedures described in this document pertain to DNSSEC architectures with pre-signed records. Online signing, such as described in [RFC9824], is out of scope since it requires that each server carrying the zone holds a copy of the signing key(s). Thus, the operational challenges differ from those described in the introduction.¶
The root zone is out of scope since the distribution of a new trust anchor takes considerably longer than the RRSIG lifetime [RFC7958].¶
Algorithm rollovers as described in [RFC6781], Section 4.1.4, are out of scope as well. They are already complicated, and trying to recover from an inoperable DNSSEC private key while an algorithm rollover is in progress is unlikely to succeed. If a new algorithm is required, the procedures defined in this document SHOULD be followed to first restore signing with the old algorithm. Once this has been completed, a regular algorithm rollover can be performed.¶
Regular key rollovers are in scope, since they do not pose extra challenges. The procedures described in this document effectively cancel a potentially ongoing key rollover and perform a new one.¶
In case of a catastrophe where the DNSSEC private key becomes inoperable and no functioning backups of the private key are available, it is desirable to recover from this situation with DNS resolution continuing to work for the affected zone(s) while performing DNSSEC key restore operations.¶
This is possible because the moment the DNSSEC private key becomes inoperable, the zone is still correctly signed and served by the authoritative name servers. Signatures typically have a lifetime of many days. That means that the operator has a lot of time to recover from this situation without the zone becoming bogus and no longer validating. Hasty and inappropriate action on the other hand could lead to outages.¶
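How much time remains can be estimated from the signed zone itself: the zone cannot validate longer than its earliest RRSIG expiration. The sketch below assumes the expiration fields have already been extracted from the RRSIG records and are in the YYYYMMDDHHmmSS presentation format; the function name and the sample values are illustrative.

```python
from datetime import datetime, timedelta, timezone


def time_until_bogus(expirations: list[str], now: datetime) -> timedelta:
    """Return the time left until the earliest RRSIG expires.

    expirations: RRSIG expiration fields as YYYYMMDDHHmmSS strings (UTC).
    now: timezone-aware current time.
    """
    parsed = [
        datetime.strptime(e, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
        for e in expirations
    ]
    return min(parsed) - now


# Illustrative values only.
now = datetime(2025, 10, 12, 0, 0, tzinfo=timezone.utc)
remaining = time_until_bogus(["20251026000000", "20251020120000"], now)
```

Here remaining is 8 days and 12 hours. This is an upper bound; the restore timeline has to complete before it runs out.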
While the DNSSEC private key cannot be restored because no functioning backups exist, the function of the zone can be restored.¶
The restore process uses slightly modified key rollover procedures from [RFC7583].¶
During the restore process, the signing software operates on a pre-signed zone. That is, the zone already contains a DNSKEY RRset and RRSIG RRsets. The signing software might try to remove these records because the accompanying private key is no longer present. The operator MUST prevent this, otherwise the zone will become bogus.¶
The signing software MUST NOT remove DNSKEYs until instructed to do so and SHOULD NOT remove old RRSIGs. If a signer implementation does not support keeping the old RRSIG records in place, these records, excluding the RRSIG for the old DNSKEY RRset, MUST be manually added back to the zone before publication.¶
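For signers that drop old signatures, the step of adding them back could be scripted along the following lines. This is a sketch on a simplified record model; the Record type and the function name are hypothetical and do not correspond to any particular signer or DNS library.

```python
from typing import NamedTuple


class Record(NamedTuple):
    owner: str
    rrtype: str        # e.g. "A", "DNSKEY", "RRSIG"
    type_covered: str  # for RRSIG records: the covered type; "" otherwise
    rdata: str


def rrsigs_to_restore(backup: list[Record], signed: list[Record]) -> list[Record]:
    """Old RRSIGs from the backup that are missing from the signer output.

    The RRSIG covering the old DNSKEY RRset is excluded, as required above.
    """
    present = set(signed)
    return [
        r for r in backup
        if r.rrtype == "RRSIG"
        and r.type_covered != "DNSKEY"
        and r not in present
    ]
```

The returned records would be appended to the signer output before the zone is published.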
The exact process depends on which key(s) are inoperable and if the zone is signed with a split KSK / ZSK key pair or a Combined Signing Key (CSK).¶
Since the old ZSK is inoperable, it cannot be used to create new RRSIGs. Therefore the zone cannot be changed and only the Pre-Publication method can be used. See [RFC7583] section 2.1.¶
Section 3.2.1 of [RFC7583] documents the timeline for this method.¶
The following diagram shows the timeline of the restoration. Time increases along the horizontal scale from left to right and the vertical lines indicate events in the process. Significant times and time intervals are marked.¶

           |1|      |2|   |3|      |4|
            |        |     |        |
Key N      - - ----------->|<-Iret->|
            |        |     |        |
Key N+1     |<-Ipub->|<--->|<----- - -
            |        |     |        |
Key N                              Trem
Key N+1    Tpub     Trdy  Tact

                ---- Time ---->
Event 1: The new ZSK is added to the DNSKEY RRset at its publication time (Tpub).¶
The inoperable ZSK and all RRSIGs it created MUST remain in the zone.¶
The new ZSK must be published long enough to guarantee that any cached DNSKEY RRset contains the new ZSK. This interval is the publication interval (Ipub), given by:¶
Ipub = Dprp + TTLkey¶
Dprp is the propagation delay, the time it takes for changes to propagate to all authoritative nameserver instances. TTLkey is the TTL of the DNSKEY RRset.¶
Event 2: The new ZSK can be used when it becomes ready at Trdy.¶
Trdy = Tpub + Ipub¶
At this point the zone can be changed again.¶
Event 3: At some later time (Tact), the zone is signed with the new ZSK. At this point, RRSIGs from the inoperable ZSK can be removed. The inoperable ZSK MUST be retained in the DNSKEY RRset.¶
Event 4: The inoperable ZSK can be removed at Trem, once the retire interval (Iret) has passed.¶
Iret = Dsgn + Dprp + TTLsig¶
Dsgn is the delay needed to ensure that all existing RRsets are signed with the new ZSK, Dprp is the propagation delay and TTLsig is the maximum TTL of all RRSIG records.¶
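The four events can be turned into concrete dates with a few lines of arithmetic. The sketch below is an illustration under assumptions: the function name and sample values are invented here, Tact defaults to the earliest possible moment (Trdy), and Trem = Tact + Iret is read off the diagram above.

```python
from datetime import datetime, timedelta
from typing import Optional


def zsk_restore_timeline(
    tpub: datetime,
    dprp: timedelta,
    ttl_key: timedelta,
    dsgn: timedelta,
    ttl_sig: timedelta,
    tact: Optional[datetime] = None,
) -> dict[str, datetime]:
    """Compute Trdy, Tact and Trem for the Pre-Publication restore."""
    ipub = dprp + ttl_key            # Ipub = Dprp + TTLkey
    trdy = tpub + ipub               # Trdy = Tpub + Ipub
    if tact is None:
        tact = trdy                  # start signing as early as allowed
    if tact < trdy:
        raise ValueError("the new ZSK must not be used before Trdy")
    iret = dsgn + dprp + ttl_sig     # Iret = Dsgn + Dprp + TTLsig
    return {"Tpub": tpub, "Trdy": trdy, "Tact": tact, "Trem": tact + iret}


# Illustrative values only.
timeline = zsk_restore_timeline(
    tpub=datetime(2025, 10, 12, 0, 0),
    dprp=timedelta(hours=1),
    ttl_key=timedelta(hours=24),
    dsgn=timedelta(hours=2),
    ttl_sig=timedelta(hours=24),
)
```

With these values the zone can be changed again 25 hours after Tpub, and the inoperable ZSK can be removed 27 hours after that. Comparing Trem with the earliest RRSIG expiration in the zone shows whether the restore fits into the remaining signature lifetime.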
Theoretically the Double-Signature method could be used as well. In this case records in the zone can only be changed after the retire interval, which is at least as long as the publication interval of the Pre-Publication method. The Double-Signature retire interval is given by:¶
Iret = Dsgn + Dprp + max(TTLkey, TTLsig)¶
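That this retire interval is at least as long as the Pre-Publication publication interval follows directly from the two formulas, assuming Dsgn is non-negative and the same Dprp and TTLkey apply to both methods:

```latex
\begin{align*}
I_{ret}^{\text{Double-Signature}}
  &= D_{sgn} + D_{prp} + \max(TTL_{key}, TTL_{sig}) \\
  &\ge D_{prp} + TTL_{key} \\
  &= I_{pub}^{\text{Pre-Publication}}
\end{align*}
```

The inequality holds because D_sgn >= 0 and max(TTLkey, TTLsig) >= TTLkey.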
Since the old KSK is inoperable, the DNSKEY RRset cannot be changed. Therefore, only the Double-DS method can be used. See [RFC7583] section 2.2.¶
If the ZSK is inoperable as well, it MUST NOT be restored yet.¶
Section 3.3.2 of [RFC7583] documents the timeline for this method.¶
The following diagram shows the timeline of the restoration. The diagram follows the convention described in Section 4.1.¶

           |1|      |2|       |3|  |4|      |5|
            |        |         |    |        |
Key N      - ---------------------->|<-Iret->|
            |        |         |    |        |
Key N+1     |<-Dreg->|<-IpubP->|<-->|<------- -
            |        |         |    |        |
Key N                                       Trem
Key N+1    Tsbm     Tpub      Trdy Tact

                ---- Time ---->
Event 1: A new DS record is added to the DS RRset in the parent zone. This is the submission time (Tsbm).¶
Event 2: After the registration delay, Dreg, the DS record is published in the parent zone. This is the publication time (Tpub).¶
Tpub = Tsbm + Dreg¶
The DS record must be published long enough to guarantee that any cached DS RRset contains the new DS record. This is the parent publication interval (IpubP).¶
IpubP = DprpP + TTLds¶
DprpP is the propagation delay of the parent zone, i.e. the time it takes for changes to propagate to all authoritative servers of the parent zone. TTLds is the TTL of the DS RRset at the parent.¶
Event 3: The new KSK can be used when it becomes ready at Trdy.¶
Trdy = Tpub + IpubP¶
Event 4: At this point, Tact, the new KSK is added to the DNSKEY RRset and used to generate the RRSIG for the DNSKEY RRset. The old, inoperable KSK can be removed. The ZSK MUST remain in the DNSKEY RRset.¶
If the ZSK is inoperable, the ZSK signing function can now be restored using the procedure in the previous section.¶
To ensure that no cache still holds a DNSKEY RRset containing the old KSK, the old DS record MUST remain in the parent zone for the duration of the retire interval (Iret), given by:¶
Iret = DprpC + TTLkey¶
DprpC is the child propagation delay, the time it takes for changes to propagate to all authoritative nameserver instances of the child zone. TTLkey is the TTL of the DNSKEY RRset.¶
Event 5: The old DS record can be removed from the parent zone at Trem.¶
Trem = Tact + Iret¶
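The Double-DS timeline for an inoperable KSK can be computed in the same way. The following sketch is illustrative: the function name and sample values are assumptions, Dreg depends entirely on the parent zone operator, and Tact is taken to be Trdy, the earliest moment the new KSK may be used.

```python
from datetime import datetime, timedelta


def ksk_restore_timeline(
    tsbm: datetime,
    dreg: timedelta,
    dprp_p: timedelta,
    ttl_ds: timedelta,
    dprp_c: timedelta,
    ttl_key: timedelta,
) -> dict[str, datetime]:
    """Compute Tpub, Trdy, Tact and Trem for the Double-DS restore."""
    tpub = tsbm + dreg               # Tpub = Tsbm + Dreg
    ipub_p = dprp_p + ttl_ds         # IpubP = DprpP + TTLds
    trdy = tpub + ipub_p             # Trdy = Tpub + IpubP
    tact = trdy                      # assumption: act as early as allowed
    iret = dprp_c + ttl_key          # Iret = DprpC + TTLkey
    trem = tact + iret               # Trem = Tact + Iret
    return {"Tsbm": tsbm, "Tpub": tpub, "Trdy": trdy, "Tact": tact, "Trem": trem}


# Illustrative values only.
ksk = ksk_restore_timeline(
    tsbm=datetime(2025, 10, 12, 0, 0),
    dreg=timedelta(hours=2),
    dprp_p=timedelta(hours=1),
    ttl_ds=timedelta(hours=24),
    dprp_c=timedelta(hours=1),
    ttl_key=timedelta(hours=1),
)
```

In this example the DNSKEY RRset can be re-signed 27 hours after the DS submission, and the old DS record can be withdrawn two hours later. If the ZSK is inoperable as well, its Pre-Publication timeline can only start at Tact, so both timelines together have to fit into the remaining signature lifetime.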
Since the old CSK is inoperable, the DNSKEY RRset cannot be changed. Therefore, only the Double-DS method can be used. See [RFC7583] section 2.2.¶
Section 3.3.2 of [RFC7583] documents the timeline for this method.¶
Since the CSK is also used to sign the zone, the timing of the Double-DS method needs to be adjusted.¶
The inoperable CSK and all RRSIGs it created MUST remain in the zone.¶
The following diagram shows the timeline of the restoration. The diagram follows the convention described in Section 4.1.¶

           |1|      |2|       |3|  |4|      |5|
            |        |         |    |        |
Key N      - ---------------------->|<-Iret->|
            |        |         |    |        |
Key N+1     |<-Dreg->|<-IpubP->|<-->|<------- -
            |        |         |    |        |
Key N                                       Trem
Key N+1    Tsbm     Tpub      Trdy Tact

                ---- Time ---->
Event 1: A new DS record is added to the DS RRset in the parent zone. This is the submission time (Tsbm).¶
Event 2: After the registration delay, Dreg, the DS record is published in the parent zone. This is the publication time (Tpub).¶
Tpub = Tsbm + Dreg¶
The DS record must be published long enough to guarantee that any cached DS RRset contains the new DS record. This is the parent publication interval (IpubP), given by:¶
IpubP = DprpP + TTLds¶
DprpP is the propagation delay of the parent zone, i.e. the time it takes for changes to propagate to all authoritative servers of the parent zone. TTLds is the TTL of the DS RRset at the parent.¶
Event 3: The new CSK can be used when it becomes ready at Trdy.¶
Trdy = Tpub + IpubP¶
Event 4: At this point, Tact, the new CSK is added to the DNSKEY RRset and used to generate the RRSIG for the DNSKEY RRset. The old, inoperable CSK MUST remain in the DNSKEY RRset. The new CSK can be used to generate the RRSIGs for the rest of the zone. The RRSIGs generated by the inoperable CSK MUST remain in the zone.¶
To ensure that no cache still holds a DNSKEY RRset or signatures that depend on the old CSK, the old DS record MUST remain in the parent zone for the duration of the retire interval (Iret), given by:¶
Iret = Dsgn + DprpC + max(TTLkey, TTLsig)¶
Dsgn is the delay needed to ensure that all existing RRsets are signed with the new CSK. DprpC is the child propagation delay, the time it takes for changes to propagate to all authoritative nameserver instances of the child zone. TTLkey is the TTL of the DNSKEY RRset and TTLsig is the maximum TTL of all RRSIG records.¶
Event 5: The old DS record can be removed from the parent zone at Trem.¶
Trem = Tact + Iret¶
At the same time the old, inoperable CSK and all its signatures can be removed as well.¶
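Compared with the KSK case, only the retire interval changes, because the CSK also signs the zone data. The sketch below contrasts the two intervals; the function names and sample values are illustrative assumptions.

```python
from datetime import datetime, timedelta


def ksk_retire_interval(dprp_c: timedelta, ttl_key: timedelta) -> timedelta:
    """Iret for an inoperable KSK: DprpC + TTLkey."""
    return dprp_c + ttl_key


def csk_retire_interval(
    dsgn: timedelta, dprp_c: timedelta, ttl_key: timedelta, ttl_sig: timedelta
) -> timedelta:
    """Iret for an inoperable CSK: Dsgn + DprpC + max(TTLkey, TTLsig)."""
    return dsgn + dprp_c + max(ttl_key, ttl_sig)


# Illustrative values only.
tact = datetime(2025, 10, 13, 3, 0)
dsgn = timedelta(hours=2)
dprp_c = timedelta(hours=1)
ttl_key = timedelta(hours=1)
ttl_sig = timedelta(hours=24)

iret_ksk = ksk_retire_interval(dprp_c, ttl_key)
iret_csk = csk_retire_interval(dsgn, dprp_c, ttl_key, ttl_sig)
trem_csk = tact + iret_csk           # Trem = Tact + Iret
```

With these values the KSK retire interval is 2 hours, while the CSK retire interval is 27 hours, so the old DS record, the old CSK, and its signatures stay in place until 2025-10-14 06:00.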
All security considerations of [RFC9364] apply to this document.¶
This document has no IANA actions.¶