Internet-Draft DNSSEC Key Restore October 2025
Obser & Pels Expires 15 April 2026
Workgroup:
Domain Name System Operations
Internet-Draft:
draft-fobser-dnsop-dnssec-keyrestore-00
Published:
Intended Status:
Informational
Expires:
15 April 2026
Authors:
F. Obser
RIPE NCC
M. Pels
RIPE NCC

DNSSEC Key Restore

Abstract

This document describes the issues surrounding the handling of DNSSEC private keys in a DNSSEC signer. It presents operational guidance for the case where a DNSSEC private key becomes inoperable.

Discussion Venues

This note is to be removed before publishing as an RFC.

Discussion of this document takes place on the Domain Name System Operations Working Group mailing list (dnsop@ietf.org), which is archived at https://mailarchive.ietf.org/arch/browse/dnsop/.

Source for this draft and an issue tracker can be found at https://github.com/fobser/draft-fobser-dnsop-dnssec-keyrecovery.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 15 April 2026.

1. Introduction

DNSSEC [RFC9364] uses public key cryptography to provide integrity protection of DNS data. From an operational point of view, it is critically important to keep the private key secret under all circumstances.

The private key is typically kept secret by using Hardware Security Modules (HSMs). HSMs are designed to perform cryptographic operations such as creating keys and signing messages without disclosing the private key. Alternatively, the DNSSEC signer is an appliance or commodity server hardware, and operational policy stipulates that the private key must not leave the signer.

Operationally, this is a risk because only a single copy of the key exists. The key could become inoperable at any point due to hardware failure, natural disaster, operator error, or malicious action.

It is difficult to create backups of the private key. After all, the system is designed to prevent backups. A compromise is usually reached by using a secret sharing scheme, e.g. [Shamir]. The private key is split into n pieces inside the HSM, which are then distributed to key share holders. In case the private key becomes inoperable, m out of the n key share holders need to come together to restore the private key.

A key sharing scheme does not mitigate all risk. When more than n-m key shares become unavailable, a restore cannot be performed, because fewer than m key shares remain. This is particularly challenging in small to medium sized teams.
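
The threshold condition above can be illustrated with a short Python sketch. The function name restore_possible and the 3-of-5 parameters are hypothetical and chosen only for illustration; the sketch checks the counting argument and does not implement secret sharing itself.

```python
def restore_possible(n: int, m: int, unavailable: int) -> bool:
    """Return True if at least m of the n key shares are still available."""
    if not 0 < m <= n:
        raise ValueError("need 0 < m <= n")
    if not 0 <= unavailable <= n:
        raise ValueError("unavailable must be between 0 and n")
    # A restore fails as soon as more than n - m shares are unavailable.
    return n - unavailable >= m


# Hypothetical 3-of-5 scheme: two lost shares are tolerable, three are not.
print(restore_possible(n=5, m=3, unavailable=2))  # True
print(restore_possible(n=5, m=3, unavailable=3))  # False
```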

Unlike the private key, a DNSSEC signed zone can be considered public data with its integrity protected by signatures. Signed zones can be added to the normal, established backup procedures.

The rest of the document describes procedures on how to restore DNSSEC signing functionality with only a backup of the signed zone available.

2. Conventions and Definitions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

This document uses DNS terminology from [RFC9499]. DNSSEC key states and timeline related abbreviations are defined in [RFC7583].

The following additional definitions are used within this document.

Inoperable (private key):

The private part of a DNSKEY appearing in the chain of trust of the zone that can no longer be used for signing. Causes include hardware failure, natural disaster, operator error, or malicious action. A compromised key is not an inoperable private key since it can still be used for signing.

Operable (private key):

The opposite of an inoperable private key. A key that can be used for signing.

3. Scope

The procedures described in this document pertain to DNSSEC architectures with pre-signed records. Online signing, such as described in [RFC9824], is out of scope since it requires that each server carrying the zone holds a copy of the signing key(s). Thus, the operational challenges differ from those described in the introduction.

The root zone is out of scope since the distribution of a new trust anchor takes considerably longer than the RRSIG lifetime [RFC7958].

Algorithm Rollovers as described in [RFC6781], section 4.1.4 are out of scope as well. They are already complicated enough and trying to recover from an inoperable DNSSEC private key while an algorithm rollover is being performed is unlikely to be successful. If a new algorithm is required, the procedures defined in this document SHOULD be followed to first restore signing with the old algorithm. Once this has been completed a regular algorithm rollover can be performed.

Regular key rollovers are in scope, since they do not pose extra challenges. The procedures described in this document effectively cancel a potentially ongoing key rollover and perform a new one.

4. DNSSEC Key Restore

In case of a catastrophe where the DNSSEC private key becomes inoperable and no functioning backups of the private key are available, it is desirable to recover from this situation with DNS resolution continuing to work for the affected zone(s) while performing DNSSEC key restore operations.

This is possible because, at the moment the DNSSEC private key becomes inoperable, the zone is still correctly signed and served by the authoritative name servers. Signatures typically have a lifetime of many days, so the operator has time to recover before signatures expire and the zone becomes bogus. Hasty and inappropriate action, on the other hand, could lead to outages.
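
As a rough illustration of the available time, the following Python sketch computes how long it takes until the first signature in a backup of the signed zone expires. It assumes that the RRSIG expiration fields have already been extracted from the zone and use the YYYYMMDDHHmmSS presentation form in UTC. The function names and the sample values are hypothetical.

```python
from datetime import datetime, timezone


def rrsig_time(field: str) -> datetime:
    """Parse an RRSIG expiration field given as YYYYMMDDHHmmSS (UTC)."""
    return datetime.strptime(field, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def time_until_first_expiry(expirations, now):
    """Return the time left until the earliest signature expires."""
    return min(rrsig_time(e) for e in expirations) - now


# Hypothetical expiration fields taken from the RRSIGs of a signed zone.
expirations = ["20251026120000", "20251024000000", "20251101093000"]
now = datetime(2025, 10, 12, 12, 0, 0, tzinfo=timezone.utc)

print(time_until_first_expiry(expirations, now))  # 11 days, 12:00:00
```

The result is an upper bound on the time available for the restore. The intervals of the applicable procedure in the following sections have to fit within it.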

While the DNSSEC private key cannot be restored because no functioning backups exist, the function of the zone can be restored.

The restore process uses slightly modified key rollover procedures from [RFC7583].

During the restore process, the signing software operates on a pre-signed zone. That is, the zone already contains a DNSKEY RRset and RRSIG RRsets. The signing software might try to remove these records because the accompanying private key is no longer present. The operator MUST prevent this, otherwise the zone will become bogus.

The signing software MUST NOT remove DNSKEYs until instructed to do so and SHOULD NOT remove old RRSIGs. If a signer implementation does not support keeping the old RRSIG records in place, these records, excluding the RRSIG for the old DNSKEY RRset, MUST be manually added back to the zone before publication.
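
A pre-publication check for this requirement could look like the following Python sketch. It assumes that the backup of the signed zone and the signer output have been parsed into (owner, type, rdata) tuples by some other means. The function name dropped_records and the sample records are hypothetical, and the rdata strings are shortened placeholders rather than real keys or signatures.

```python
def dropped_records(backup_zone, candidate_zone):
    """List DNSKEY and RRSIG records from the backup missing in the candidate.

    The RRSIG covering the old DNSKEY RRset is skipped, since it is
    replaced when the DNSKEY RRset changes.
    """
    candidate = set(candidate_zone)
    dropped = []
    for rr in backup_zone:
        owner, rtype, rdata = rr
        if rtype not in ("DNSKEY", "RRSIG"):
            continue
        # The first RRSIG rdata field is the type covered.
        if rtype == "RRSIG" and rdata.split()[0] == "DNSKEY":
            continue
        if rr not in candidate:
            dropped.append(rr)
    return dropped


backup = [
    ("example.", "DNSKEY", "257 3 13 <ksk-n>"),
    ("example.", "DNSKEY", "256 3 13 <zsk-n>"),
    ("example.", "RRSIG", "DNSKEY 13 1 3600 <sig-by-ksk-n>"),
    ("www.example.", "A", "192.0.2.1"),
    ("www.example.", "RRSIG", "A 13 2 3600 <sig-by-zsk-n>"),
]
# Signer output in which the RRSIG of the inoperable ZSK was removed.
candidate = [
    ("example.", "DNSKEY", "257 3 13 <ksk-n>"),
    ("example.", "DNSKEY", "256 3 13 <zsk-n>"),
    ("example.", "DNSKEY", "256 3 13 <zsk-n+1>"),
    ("example.", "RRSIG", "DNSKEY 13 1 3600 <new-sig-by-ksk-n>"),
    ("www.example.", "A", "192.0.2.1"),
]

for rr in dropped_records(backup, candidate):
    print("must be added back:", rr)
```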

The exact process depends on which key(s) are inoperable and whether the zone is signed with a split KSK / ZSK key pair or a Combined Signing Key (CSK).

4.1. KSK / ZSK split, KSK operable, ZSK inoperable

Since the old ZSK is inoperable, it cannot be used to create new RRSIGs. Therefore, the zone cannot be changed and only the Pre-Publication method can be used. See Section 2.1 of [RFC7583].

Section 3.2.1 of [RFC7583] documents the timeline for this method.

The following diagram shows the timeline of the restoration. Time increases along the horizontal scale from left to right and the vertical lines indicate events in the process. Significant times and time intervals are marked.

              |1|      |2|   |3|      |4|
               |        |     |        |
Key N         - - ----------->|<-Iret->|
               |        |     |        |
Key N+1        |<-Ipub->|<--->|<----- - -
               |        |     |        |
Key N                                 Trem
Key N+1        Tpub    Trdy  Tact

                  ---- Time ---->

Event 1: The new ZSK is added to the DNSKEY RRset at its publication time (Tpub).

The inoperable ZSK and all RRSIGs it created MUST remain in the zone.

The new ZSK must be published long enough to guarantee that any cached DNSKEY RRset contains the new ZSK. This interval is the publication interval (Ipub), given by

Ipub = Dprp + TTLkey

Dprp is the propagation delay, the time it takes for changes to propagate to all authoritative nameserver instances. TTLkey is the TTL of the DNSKEY RRset.

Event 2: The new ZSK can be used when it becomes ready at Trdy.

Trdy = Tpub + Ipub.

At this point the zone can be changed again.
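
A worked example of the two formulas above, written as a small Python sketch. The delay and TTL values are assumptions chosen for illustration, not recommendations.

```python
from datetime import datetime, timedelta, timezone


def publication_interval(dprp: timedelta, ttl_key: timedelta) -> timedelta:
    """Ipub = Dprp + TTLkey"""
    return dprp + ttl_key


# Assumed values: 10 minutes propagation delay, DNSKEY TTL of 1 hour.
tpub = datetime(2025, 10, 12, 12, 0, tzinfo=timezone.utc)
ipub = publication_interval(timedelta(minutes=10), timedelta(hours=1))
trdy = tpub + ipub  # Trdy = Tpub + Ipub

print(trdy.isoformat())  # 2025-10-12T13:10:00+00:00
```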

Event 3: At some later time, the zone is signed with the new ZSK. At this point RRSIGs from the inoperable ZSK can be removed. The inoperable ZSK MUST be retained in the DNSKEY RRset.

Event 4: The inoperable ZSK can be removed after the retire interval (Iret).

Iret = Dsgn + Dprp + TTLsig

Dsgn is the delay needed to ensure that all existing RRsets are signed with the new ZSK, Dprp is the propagation delay and TTLsig is the maximum TTL of all RRSIG records.
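
The retire interval can be computed in the same way. In the following Python sketch, the removal time is taken as Tact + Iret, which is how the diagram above is read here. All values are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def retire_interval_zsk(dsgn: timedelta, dprp: timedelta, ttl_sig: timedelta) -> timedelta:
    """Iret = Dsgn + Dprp + TTLsig"""
    return dsgn + dprp + ttl_sig


# Assumed values: 1 hour to re-sign, 10 minutes propagation delay,
# maximum RRSIG TTL of 1 day.
tact = datetime(2025, 10, 13, 9, 0, tzinfo=timezone.utc)
iret = retire_interval_zsk(timedelta(hours=1), timedelta(minutes=10), timedelta(days=1))
trem = tact + iret  # earliest removal of the inoperable ZSK

print(iret)              # 1 day, 1:10:00
print(trem.isoformat())  # 2025-10-14T10:10:00+00:00
```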

Theoretically, the Double-Signature method could be used as well. In this case, records in the zone can only be changed after the retire interval. That interval is at least as long as the publication interval of the Pre-Publication method, because Dsgn >= 0 and max(TTLkey, TTLsig) >= TTLkey. The Double-Signature retire interval is given by:

Iret = Dsgn + Dprp + max(TTLkey, TTLsig)

4.2. KSK / ZSK split, KSK inoperable

Since the old KSK is inoperable, a changed DNSKEY RRset cannot be signed with it, so the DNSKEY RRset cannot be changed. Therefore, only the Double-DS method can be used. See Section 2.2 of [RFC7583].

If the ZSK is inoperable as well, it MUST NOT be restored yet.

Section 3.3.2 of [RFC7583] documents the timeline for this method.

The following diagram shows the timeline of the restoration. The diagram follows the convention described in Section 4.1.

            |1|      |2|       |3|  |4|      |5|
             |        |         |    |        |
Key N       - ---------------------->|<-Iret->|
             |        |         |    |        |
Key N+1      |<-Dreg->|<-IpubP->|<-->|<------- -
             |        |         |    |        |
Key N                                        Trem
Key N+1     Tsbm     Tpub      Trdy Tact

                 ---- Time ---->

Event 1: A new DS record is submitted to the parent for addition to the DS RRset. This is the submission time (Tsbm).

Event 2: After the registration delay, Dreg, the DS record is published in the parent zone. This is the publication time (Tpub).

Tpub = Tsbm + Dreg.

The DS record must be published long enough to guarantee that any cached DS RRset contains the new DS record. This is the parent publication interval (IpubP).

IpubP = DprpP + TTLds

DprpP is the propagation delay of the parent zone, i.e. the time it takes for changes to propagate to all authoritative servers of the parent zone. TTLds is the TTL of the DS RRset at the parent.

Event 3: The new KSK can be used when it becomes ready at Trdy.

Trdy = Tpub + IpubP
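
The three formulas for events 1 to 3 can be chained as in the following Python sketch. The registration delay, propagation delay, and DS TTL are assumptions chosen for illustration; real values depend on the parent zone.

```python
from datetime import datetime, timedelta, timezone


def parent_publication_interval(dprp_p: timedelta, ttl_ds: timedelta) -> timedelta:
    """IpubP = DprpP + TTLds"""
    return dprp_p + ttl_ds


# Assumed values: 2 hours registration delay, 30 minutes parent
# propagation delay, DS TTL of 1 day.
tsbm = datetime(2025, 10, 12, 12, 0, tzinfo=timezone.utc)
dreg = timedelta(hours=2)

tpub = tsbm + dreg  # Tpub = Tsbm + Dreg
ipub_p = parent_publication_interval(timedelta(minutes=30), timedelta(days=1))
trdy = tpub + ipub_p  # Trdy = Tpub + IpubP

print(tpub.isoformat())  # 2025-10-12T14:00:00+00:00
print(trdy.isoformat())  # 2025-10-13T14:30:00+00:00
```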

Event 4: At this point, Tact, the new KSK is added to the DNSKEY RRset and used to generate the RRSIG for the DNSKEY RRset. The old, inoperable KSK can be removed. The ZSK MUST remain in the DNSKEY RRset.

If the ZSK is inoperable, the ZSK signing function can now be restored using the procedure in the previous section.

To ensure that no caches hold a DNSKEY RRset with the old KSK, the old DS record MUST remain in the parent zone for the duration of the retire interval (Iret), given by:

Iret = DprpC + TTLkey

DprpC is the child propagation delay, the time it takes for changes to propagate to all authoritative nameserver instances of the child zone. TTLkey is the TTL of the DNSKEY RRset.

Event 5: The old DS record can be removed from the parent zone at Trem.

Trem = Tact + Iret
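
A worked example for the removal time of the old DS record, as a Python sketch with assumed values for the child propagation delay and the DNSKEY TTL:

```python
from datetime import datetime, timedelta, timezone


def retire_interval_ksk(dprp_c: timedelta, ttl_key: timedelta) -> timedelta:
    """Iret = DprpC + TTLkey"""
    return dprp_c + ttl_key


# Assumed values: 10 minutes child propagation delay, DNSKEY TTL of 1 hour.
tact = datetime(2025, 10, 13, 15, 0, tzinfo=timezone.utc)
iret = retire_interval_ksk(timedelta(minutes=10), timedelta(hours=1))
trem = tact + iret  # Trem = Tact + Iret

print(trem.isoformat())  # 2025-10-13T16:10:00+00:00
```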

4.3. CSK inoperable

Since the old CSK is inoperable, a changed DNSKEY RRset cannot be signed with it, so the DNSKEY RRset cannot be changed. Therefore, only the Double-DS method can be used. See Section 2.2 of [RFC7583].

Section 3.3.2 of [RFC7583] documents the timeline for this method.

Since the CSK is also used to sign the zone, the timing of the Double-DS method needs to be adjusted.

The inoperable CSK and all RRSIGs it created MUST remain in the zone.

The following diagram shows the timeline of the restoration. The diagram follows the convention described in Section 4.1.

            |1|      |2|       |3|  |4|      |5|
             |        |         |    |        |
Key N       - ---------------------->|<-Iret->|
             |        |         |    |        |
Key N+1      |<-Dreg->|<-IpubP->|<-->|<------- -
             |        |         |    |        |
Key N                                        Trem
Key N+1     Tsbm     Tpub      Trdy Tact

                 ---- Time ---->

Event 1: A new DS record is submitted to the parent for addition to the DS RRset. This is the submission time (Tsbm).

Event 2: After the registration delay, Dreg, the DS record is published in the parent zone. This is the publication time (Tpub).

Tpub = Tsbm + Dreg.

The DS record must be published long enough to guarantee that any cached DS RRset contains the new DS record. This is the parent publication interval (IpubP) given by

IpubP = DprpP + TTLds

DprpP is the propagation delay of the parent zone, i.e. the time it takes for changes to propagate to all authoritative servers of the parent zone. TTLds is the TTL of the DS RRset at the parent.

Event 3: The new CSK can be used when it becomes ready at Trdy.

Trdy = Tpub + IpubP

Event 4: At this point, Tact, the new CSK is added to the DNSKEY RRset and used to generate the RRSIG for the DNSKEY RRset. The old, inoperable CSK MUST remain in the DNSKEY RRset. The new CSK can be used to generate the RRSIGs for the rest of the zone. The RRSIGs generated by the inoperable CSK MUST remain in the zone.

To ensure that no caches hold a DNSKEY RRset with the old CSK, the old DS record MUST remain in the parent zone for the duration of the retire interval (Iret), given by:

Iret = Dsgn + DprpC + max(TTLkey, TTLsig)

Dsgn is the delay needed to ensure that all existing RRsets are signed with the new CSK. DprpC is the child propagation delay, the time it takes for changes to propagate to all authoritative nameserver instances of the child zone. TTLkey is the TTL of the DNSKEY RRset and TTLsig is the maximum TTL of all RRSIG records.

Event 5: The old DS record can be removed from the parent zone at Trem.

Trem = Tact + Iret

At the same time the old, inoperable CSK and all its signatures can be removed as well.
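
The CSK retire interval differs from the KSK case by the signing delay and the maximum over both TTLs. The following Python sketch works through it with assumed values chosen for illustration:

```python
from datetime import datetime, timedelta, timezone


def retire_interval_csk(dsgn: timedelta, dprp_c: timedelta,
                        ttl_key: timedelta, ttl_sig: timedelta) -> timedelta:
    """Iret = Dsgn + DprpC + max(TTLkey, TTLsig)"""
    return dsgn + dprp_c + max(ttl_key, ttl_sig)


# Assumed values: 1 hour to re-sign, 10 minutes child propagation delay,
# DNSKEY TTL of 1 hour, maximum RRSIG TTL of 1 day.
tact = datetime(2025, 10, 13, 15, 0, tzinfo=timezone.utc)
iret = retire_interval_csk(timedelta(hours=1), timedelta(minutes=10),
                           timedelta(hours=1), timedelta(days=1))
trem = tact + iret  # Trem = Tact + Iret

print(iret)              # 1 day, 1:10:00
print(trem.isoformat())  # 2025-10-14T16:10:00+00:00
```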

5. Security Considerations

All security considerations of [RFC9364] apply to this document.

6. IANA Considerations

This document has no IANA actions.

7. References

7.1. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/rfc/rfc2119>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/rfc/rfc8174>.
[RFC9364]
Hoffman, P., "DNS Security Extensions (DNSSEC)", BCP 237, RFC 9364, DOI 10.17487/RFC9364, <https://www.rfc-editor.org/rfc/rfc9364>.

7.2. Informative References

[RFC6781]
Kolkman, O., Mekking, W., and R. Gieben, "DNSSEC Operational Practices, Version 2", RFC 6781, DOI 10.17487/RFC6781, <https://www.rfc-editor.org/rfc/rfc6781>.
[RFC7583]
Morris, S., Ihren, J., Dickinson, J., and W. Mekking, "DNSSEC Key Rollover Timing Considerations", RFC 7583, DOI 10.17487/RFC7583, <https://www.rfc-editor.org/rfc/rfc7583>.
[RFC7958]
Abley, J., Schlyter, J., Bailey, G., and P. Hoffman, "DNSSEC Trust Anchor Publication for the Root Zone", RFC 7958, DOI 10.17487/RFC7958, <https://www.rfc-editor.org/rfc/rfc7958>.
[RFC9499]
Hoffman, P. and K. Fujiwara, "DNS Terminology", BCP 219, RFC 9499, DOI 10.17487/RFC9499, <https://www.rfc-editor.org/rfc/rfc9499>.
[RFC9824]
Huque, S., Elmerot, C., and O. Gudmundsson, "Compact Denial of Existence in DNSSEC", RFC 9824, DOI 10.17487/RFC9824, <https://www.rfc-editor.org/rfc/rfc9824>.
[Shamir]
Shamir, A., "How to Share a Secret", ACM Press Communications of the ACM, Vol. 22, No. 11, pp. 612-613, DOI 10.1145/359168.359176, <https://doi.org/10.1145/359168.359176>.

Acknowledgments

The document draws heavily from the work in [RFC7583] and we thank the authors for their work.

Authors' Addresses

Florian Obser
RIPE NCC
Martin Pels
RIPE NCC