Internet-Draft | DKIM Hash Algorithm Adaptivity | February 2025 |
Nurpmeso | Expires 7 August 2025 | [Page] |
DKIM (RFC 6376, section 3.7) defines how "data-hash" is generated as input to a "sig-alg" for the purpose of generating a cryptographic signature. Unlike the RSA algorithm (RFC 8017), the only signature algorithm defined for DKIM at the time of its creation, modern signature algorithms such as EdDSA (RFC 8032) include extensive data hashing as part of the signing process. For these algorithms it may make sense not to create a "data-hash", but to use the entire data as input to "sig-alg". This specification allows DKIM signing algorithms to adapt the "data-hash" step, taking advantage of algorithm design and of the reality of digital signature APIs.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 7 August 2025.¶
Copyright (c) 2025 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Section 3.7 of DKIM[RFC6376], "Computing the Message Hashes", specifies how the hashes for IMF[RFC5322] messages have to be computed, and notes that real-life digital signature APIs often combine hashing and signing into a single call that performs both "hash-alg" and "sig-alg" in what appears to be a single operation. This is appearance only: these APIs are interfaces designed for convenience. They allow the message digest algorithm to be specified, they can be called repeatedly until the input data is fully consumed, and they are finalized thereafter to create the signature. Under the hood, the complete message digest calculation and the signature creation are both performed in full.¶
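The repeated-call pattern described above can be illustrated with a message digest alone. The following is a minimal Python sketch using only the standard hashlib module; the function names are hypothetical and no signing is performed. It shows that feeding the input in pieces and finalizing afterwards yields the same digest as hashing the complete input at once, which is what makes such streaming interfaces possible for hash-then-sign algorithms.

```python
import hashlib


def digest_incremental(chunks):
    """Feed the input piece by piece, then finalize (hypothetical helper)."""
    h = hashlib.sha256()
    for chunk in chunks:
        h.update(chunk)      # may be called repeatedly until input is consumed
    return h.digest()        # finalization step


def digest_one_shot(data):
    """Hash the complete input in a single call (hypothetical helper)."""
    return hashlib.sha256(data).digest()


if __name__ == "__main__":
    parts = [b"from:alice@example.com\r\n", b"subject:hello\r\n"]
    # Both paths produce the same 32-byte SHA-256 digest.
    print(digest_incremental(parts) == digest_one_shot(b"".join(parts)))
```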
Modern algorithms such as EdDSA[RFC8032], however, include extensive data hashing via an algorithm-internal message digest as part of digital signature creation. They perform several passes over the entire data, which needs to be available in full: no repeated input data feeding is possible. The algorithms may use the internal message digest multiple times at different steps of result creation, for example incorporating secret key data in later steps. The final digest may be the signature.¶
The above convenience interfaces cannot be used for these algorithms, or only in adapted form, depending on the cryptographic library used. Specification of an additional message digest algorithm is impossible. Therefore the introduction of the DKIM algorithm Ed25519-SHA256[RFC8463] required code path changes, because the DKIM "data-hash" (SHA-256) now needed to be created first, in an extra step, in order to feed the generated data as input to "sig-alg".¶
INFORMATIVE NOTE: EdDSA was adapted to DKIM as Ed25519-SHA256 in 2018, but has not gained much traction in the seven years since its introduction. A survey of DKIM implementations that adopted it revealed enthusiastic code comments along the extra code paths that had to be introduced.¶
Analysis: in its current form DKIM defines the generated "data-hash" as the sole input of "sig-alg". Modern signing algorithms perform one or more digest operations on their input data, which must therefore be available in full for the single invocation of the cryptographic operation; the DKIM "data-hash" must therefore be created in a dedicated step. Also, the DKIM "data-hash" algorithm may be weaker than the one used by the signature algorithm: with the mentioned Ed25519-SHA256, for example, a 32-byte SHA-256 digest is the input that the algorithm hashes internally with SHA-512, which produces a 64-byte output. The conclusion is that the standard currently complicates implementations, fosters data processing redundancy, and potentially weakens security attributes of algorithms by feeding in only data subsets, prefiltered by potentially weak(er) algorithms.¶
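As a small check of the digest sizes involved, the following Python sketch uses only the standard hashlib module. The helper name and the sample input are hypothetical. It hashes a placeholder input with SHA-256 and then hashes that digest with SHA-512. It illustrates digest lengths only; it is not the Ed25519 computation, which mixes key material and other values into its SHA-512 invocations.

```python
import hashlib


def digest_sizes(signed_input: bytes):
    """Return (len of SHA-256 data-hash, len of SHA-512 over that hash).

    Hypothetical helper: shows sizes only, not an EdDSA signature step.
    """
    data_hash = hashlib.sha256(signed_input).digest()   # DKIM "data-hash"
    inner = hashlib.sha512(data_hash).digest()          # wider internal digest
    return len(data_hash), len(inner)


if __name__ == "__main__":
    # However large the signed input is, only 32 bytes reach the second hash.
    print(digest_sizes(b"h-headers, D-SIG, body-hash" * 1000))
```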
The computation described in DKIM[RFC6376] section 3.7 is modified so that the described input to "sig-alg", the "data-hash", can adapt to standardized algorithms as appropriate. If an algorithm chooses adaptation, "hash-alg" is only used to produce the "body-hash", whereas the input formerly used to create the "data-hash" is fed in full into "sig-alg" instead of into "hash-alg". More formally, the new pseudo-code for the signature algorithm is:¶
    body-hash    =  hash-alg (canon-body, l-param)
    data-hash    =  hash-alg (h-headers, D-SIG, body-hash)
    signature    =  sig-alg (d-domain, selector,
                             data-hash / (h-headers, D-SIG, body-hash))¶
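The pseudo-code above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a DKIM implementation: SHA-256 stands in for "hash-alg", the three inputs are simply concatenated (real DKIM places the encoded body hash inside the "bh=" tag of the signature header field), canonicalization is assumed to have happened already, and "sig-alg" is passed in as a caller-supplied callable. All function and parameter names are hypothetical, and `echo_sig_alg` is a stand-in that returns its input so the difference between the two modes is visible; it is not a signature algorithm.

```python
import hashlib
from typing import Callable, Optional

# Hypothetical shape of "sig-alg": (d-domain, selector, input) -> signature.
SigAlg = Callable[[str, str, bytes], bytes]


def dkim_sign(canon_body: bytes, h_headers: bytes, d_sig: bytes,
              d_domain: str, selector: str, sig_alg: SigAlg,
              adaptive: bool, l_param: Optional[int] = None) -> bytes:
    # body-hash = hash-alg (canon-body, l-param)
    body = canon_body if l_param is None else canon_body[:l_param]
    body_hash = hashlib.sha256(body).digest()

    # Simplification: concatenate the inputs of the former "data-hash".
    sig_input = h_headers + d_sig + body_hash

    if not adaptive:
        # Classic path: data-hash = hash-alg (h-headers, D-SIG, body-hash)
        sig_input = hashlib.sha256(sig_input).digest()

    # signature = sig-alg (d-domain, selector, data-hash / full input)
    return sig_alg(d_domain, selector, sig_input)


def echo_sig_alg(d_domain: str, selector: str, data: bytes) -> bytes:
    """Stand-in only: returns the bytes it would have signed."""
    return data


if __name__ == "__main__":
    args = (b"body\r\n", b"from:alice@example.com\r\n", b"dkim-signature:v=1\r\n",
            "example.com", "sel1", echo_sig_alg)
    print(len(dkim_sign(*args, adaptive=False)))  # classic: digest-sized input
    print(len(dkim_sign(*args, adaptive=True)))   # adaptive: full-length input
```

In the adaptive case a signature algorithm with an internal digest sees the complete header data rather than a pre-computed digest of it; only the "body-hash" is still produced by "hash-alg".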
This memo includes no request to IANA.¶
This specification should reduce implementation burden and complexity, aid in hash hardening of affected algorithms to a certain extent, and, depending on the algorithm, data volume, and API optimization efforts, potentially increase processing performance.¶