Separate Transports for IKE and ESP

The Internet Key Exchange protocol version 2 (IKEv2) originally used an unreliable transport (UDP) for its messages. It was later extended to use TCP where UDP is blocked. UDP remains the preferred transport for IKEv2, and TCP is used only if UDP datagrams cannot get through. Originally, IKEv2 peers exchanged only a small amount of data, so a simple retransmission mechanism on top of UDP, with no congestion control, sufficed. The situation changed when post-quantum cryptographic (PQC) algorithms began to be incorporated into IKEv2 using multiple key exchanges. Most post-quantum algorithms require IKE peers to exchange much more data than classical algorithms do, up to tens (or even hundreds) of kilobytes. A few proposals exist for overcoming the 64-kilobyte limit on the size of an IKE payload. When IKE messages grow to tens or even hundreds of kilobytes, using UDP as a transport becomes challenging. IKE fragmentation helps mitigate IP fragmentation issues and ensures that each IKE message fragment fits into a UDP datagram even if the original message does not. However, all IKE fragments are always sent (and retransmitted) simultaneously, so as the number of fragments increases while congestion control remains absent, the simple retransmission mechanism of IKEv2 performs poorly and can cause even more problems for the network. In some cases, a pure PQC key exchange may be required for specific deployments, particularly those governed by regulatory or compliance mandates that require exclusive use of post-quantum cryptography. Examples include high-security environments and sectors governed by stringent cryptographic standards. In this case, a larger amount of data needs to be sent in the IKE_SA_INIT exchange, which makes using UDP problematic.
For PQ KEM algorithms, if TCP is used for IKEv2 and the peers do not require traditional algorithms, then a PQ KEM can be used directly within the IKE_SA_INIT exchange. This approach allows IKEv2 to avoid UDP fragmentation concerns while enabling a purely post-quantum key exchange for deployments requiring exclusive PQC use. Using a reliable transport (e.g., TCP) for IKEv2 could therefore be a solution to the problem. However, the current use of TCP for IKE and ESP implies that ESP SAs are also encapsulated in TCP, which has a negative impact on IPsec performance (see Section 9 of TCP encapsulation of IKE and ESP packets). This specification decouples the IKE and IPsec transports, making it possible to use a reliable transport for IKEv2 while continuing to use an unreliable transport for IPsec. The proposed mechanism enables the use of all parameter sets of a post-quantum key exchange algorithm in IKE_SA_INIT as a quantum-resistant-only key exchange, so deployments requiring a pure post-quantum key exchange can establish keys during the IKE_SA_INIT exchange without concerns about exceeding typical network MTUs. The idea of decoupling the IKE and IPsec transports was originally presented in .

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 when, and only when, they appear in all capitals, as shown here.

If the initiator supports this extension, is configured to use it, and does not know whether the responder supports IKEv2 over TCP, the initiator starts the IKE_SA_INIT exchange over UDP to the responder's port 4500, as per IKEv2. In this case, the initiator includes the SEPARATE_TRANSPORTS notification (<TBA by IANA>) in the IKE_SA_INIT request. This allows the initiator to discover whether the responder supports the use of separate transports for IKE (over TCP) and ESP (over UDP). Using UDP port 4500 ensures that IPsec traffic can traverse NATs and intermediate devices that allow UDP encapsulation. If the responder has this extension enabled and receives the SEPARATE_TRANSPORTS notification in the IKE_SA_INIT request, it MUST respond with the same notification in the IKE_SA_INIT response. Upon receiving the SEPARATE_TRANSPORTS notification in the response, the initiator MUST switch to TCP port 4500 for the subsequent exchanges (IKE_INTERMEDIATE or IKE_AUTH). The responder MUST be prepared to receive these exchanges over TCP.

   IKE_SA_INIT response:
      HDR, SAr1, KEr1, Nr,
      [N(NAT_DETECTION_SOURCE_IP),
       N(NAT_DETECTION_DESTINATION_IP),]
 <--- N(SEPARATE_TRANSPORTS)

   => Initiator switches to TCP:4500 for IKE_INTERMEDIATE /
      IKE_AUTH / subsequent IKEv2 exchanges
   => ESP over UDP or IP depending on the presence of NATs
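The initiator's decision after an IKE_SA_INIT exchange that started over UDP can be sketched as follows. This is a minimal, non-normative illustration: the names `Transport` and `ike_transport_after_init` and their parameters are hypothetical, and the choice to stay on UDP when the responder omits the notification is an assumption, since the text above specifies only the case where the notification is returned.

```python
from enum import Enum


class Transport(Enum):
    UDP = "udp"
    TCP = "tcp"


def ike_transport_after_init(sent_separate_transports: bool,
                             response_has_separate_transports: bool) -> Transport:
    """Pick the transport for IKE_INTERMEDIATE / IKE_AUTH and later
    exchanges when IKE_SA_INIT was exchanged over UDP port 4500."""
    if sent_separate_transports and response_has_separate_transports:
        # Both peers signalled support: the initiator MUST switch the
        # IKE SA to TCP port 4500 for the subsequent exchanges.
        return Transport.TCP
    # Assumption (not stated in the text): without the notification in
    # the response, the IKE SA simply continues over UDP.
    return Transport.UDP
```

For example, `ike_transport_after_init(True, True)` selects TCP, while any other combination leaves the IKE SA on UDP in this sketch.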

Alternatively, the initiator MAY start IKE_SA_INIT over TCP port 4500 directly, as specified in TCP encapsulation of IKE and ESP packets, for example, when large key exchange payloads (with large public keys) are used in IKE_SA_INIT. In this case, the initiator includes the SEPARATE_TRANSPORTS notification in the IKE_SA_INIT request to indicate its preference for separate transports: IKEv2 over TCP and ESP over UDP, provided that UDP is not blocked on the network path. If the responder supports this extension, it includes the SEPARATE_TRANSPORTS notification in the IKE_SA_INIT response. In this case, Child SAs are created as specified in IKEv2, with ESP sent over UDP (or directly over IP) if possible. If both UDP and IP are blocked, ESP is sent over TCP as described in TCP encapsulation of IKE and ESP packets. If the responder does not return the SEPARATE_TRANSPORTS notification in the IKE_SA_INIT response, the initiator MUST treat this as an indication that the responder does not support separate transports. In this case, both IKEv2 and ESP MUST use TCP transport for all subsequent exchanges, as per TCP encapsulation of IKE and ESP packets.

   IKE_SA_INIT response:
      HDR, SAr1, KEr1, Nr,
      [N(NAT_DETECTION_SOURCE_IP),
       N(NAT_DETECTION_DESTINATION_IP),]
 <--- N(SEPARATE_TRANSPORTS)

   => All subsequent IKEv2 messages continue over TCP
   => ESP over UDP or IP if possible, else over TCP
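The resulting choice of ESP transport when IKE_SA_INIT starts over TCP can be summarized in a small decision function. This is only a sketch under stated assumptions: the function name, the string labels, and the boolean inputs `udp_reachable` and `ip_reachable` are hypothetical, and how reachability is actually learned is outside the scope of this illustration.

```python
def esp_transport_for_tcp_start(response_has_separate_transports: bool,
                                nat_detected: bool,
                                udp_reachable: bool,
                                ip_reachable: bool) -> str:
    """Return "ip", "udp" or "tcp" for ESP when IKE_SA_INIT ran over TCP."""
    if not response_has_separate_transports:
        # Responder does not support separate transports:
        # both IKEv2 and ESP stay on TCP.
        return "tcp"
    if nat_detected:
        # Behind a NAT, ESP needs UDP encapsulation; TCP is the fallback.
        return "udp" if udp_reachable else "tcp"
    if ip_reachable:
        return "ip"      # ESP sent directly over IP
    if udp_reachable:
        return "udp"     # direct IP is filtered, UDP encapsulation works
    return "tcp"         # both IP and UDP are blocked
```

The function prefers ESP directly over IP, then UDP encapsulation, and falls back to TCP only when neither is usable, mirroring the order given in the text.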

In both scenarios described above, once the IKEv2 SA switches to TCP transport, either after IKE_SA_INIT or because TCP was used from the beginning, all subsequent IKEv2 exchanges MUST continue to use TCP. The interaction with MOBIKE is described below.

The SEPARATE_TRANSPORTS notification has Protocol ID set to 0 and SPI Size set to 0. This specification does not define any notification data, so the notification is sent with no data. Future specifications may define data for this notification. Peers conforming to this specification MUST ignore any data if present.
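As an illustration, the notification can be encoded and recognized with the standard IKEv2 Notify payload layout (generic payload header, Protocol ID, SPI Size, Notify Message Type). This is a sketch, not a reference implementation: the function names are hypothetical, and the numeric value of `SEPARATE_TRANSPORTS` below is an arbitrary placeholder from the private-use range, because the real value is still to be assigned by IANA.

```python
import struct

# Placeholder only: the real Notify Message Type is TBA by IANA.
SEPARATE_TRANSPORTS = 40999

NOTIFY_HEADER = "!BBHBBH"  # next payload, flags, length, proto ID, SPI size, type
NOTIFY_HEADER_LEN = struct.calcsize(NOTIFY_HEADER)  # 8 octets


def build_separate_transports(next_payload: int = 0) -> bytes:
    """Encode the notification with Protocol ID 0, SPI Size 0 and no data."""
    return struct.pack(NOTIFY_HEADER, next_payload, 0, NOTIFY_HEADER_LEN,
                       0, 0, SEPARATE_TRANSPORTS)


def is_separate_transports(payload: bytes) -> bool:
    """Recognize the notification; any notification data is ignored."""
    if len(payload) < NOTIFY_HEADER_LEN:
        return False
    _next, _flags, length, _proto, spi_size, notify_type = struct.unpack(
        NOTIFY_HEADER, payload[:NOTIFY_HEADER_LEN])
    if length != len(payload) or NOTIFY_HEADER_LEN + spi_size > length:
        return False  # malformed payload
    return notify_type == SEPARATE_TRANSPORTS
```

Because the parser only looks at the header, a future version of the notification that carries data would still be recognized, which matches the requirement to ignore any data that is present.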

Child SAs are created as specified in IKEv2. ESP packets are either sent directly over IP or UDP encapsulated if a NAT is detected. If UDP transport for ESP becomes unavailable (e.g., blocked by a firewall), peers MAY switch ESP to TCP transport as specified in TCP encapsulation of IKE and ESP packets. If ESP is transported over a different protocol than IKE, intermediate devices might apply different filtering rules to the two flows. To detect possible connectivity issues with ESP traffic, the encrypted ESP ping mechanism MAY be used.

When separate transports are used for IKEv2 and ESP, NAT traversal for each transport must be handled independently, as intermediate devices maintain NAT state per transport. NAT detection follows the standard mechanism defined in Section 2.23 of IKEv2. The initiator SHOULD include NAT_DETECTION_SOURCE_IP and NAT_DETECTION_DESTINATION_IP notifications in IKE_SA_INIT, regardless of whether IKE_SA_INIT is sent over UDP or TCP. NAT detection MAY be omitted only if it is known by other means that no NAT is present on the path between the peers. If a NAT is detected, ESP MUST use UDP encapsulation on port 4500. Peers MUST maintain NAT mappings for the ESP path by sending NAT keepalive packets as specified in Section 2.23 of IKEv2, and MUST NOT assume that the TCP connection used for IKEv2 provides any keepalive benefit for the ESP UDP path.
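For readers unfamiliar with the standard IKEv2 NAT detection check, the following sketch shows the comparison a receiver performs. It assumes the usual IKEv2 computation, a SHA-1 digest over the two SPIs, the IP address, and the port; the function and parameter names are hypothetical and the sketch is not tied to any particular implementation.

```python
import hashlib
import ipaddress
import struct
from typing import Iterable


def nat_detection_hash(spi_i: bytes, spi_r: bytes, ip: str, port: int) -> bytes:
    """SHA-1 over SPIi | SPIr | IP address | port (network byte order)."""
    addr = ipaddress.ip_address(ip).packed  # 4 octets for IPv4, 16 for IPv6
    return hashlib.sha1(spi_i + spi_r + addr + struct.pack("!H", port)).digest()


def nat_detected(spi_i: bytes, spi_r: bytes,
                 received_source_hashes: Iterable[bytes],
                 received_destination_hash: bytes,
                 observed_src: tuple, observed_dst: tuple) -> bool:
    """Compare the peer's notifications with what was seen on the wire.

    observed_src / observed_dst are (ip, port) pairs taken from the
    packet that carried the notifications.
    """
    src_ok = nat_detection_hash(spi_i, spi_r, *observed_src) in set(received_source_hashes)
    dst_ok = nat_detection_hash(spi_i, spi_r, *observed_dst) == received_destination_hash
    # Any mismatch means an address or port was rewritten on the path.
    return not (src_ok and dst_ok)
```

When this check reports a NAT, the rule above applies: ESP uses UDP encapsulation on port 4500, with its own keepalives, independent of the TCP connection carrying IKEv2.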

When IKEv2 starts over UDP, the successful exchange of IKE_SA_INIT messages implicitly demonstrates that UDP is reachable, and the NAT detection results can be used to determine whether ESP should be sent directly over IP or UDP encapsulated. No additional verification is needed.

When IKEv2 starts over TCP, there is no implicit evidence that ESP traffic is reachable. After the Child SA is established, the initiator SHOULD verify ESP reachability by sending an ESP Echo Request, unless the initiator has other means to ensure ESP reachability (for example, the presence of incoming ESP traffic). The number of retries and the length of timeouts for ESP Echo Requests are not covered in this specification because they do not affect interoperability. If no ESP Echo Reply is received after the initiator has exhausted its retransmissions, the peers MUST switch ESP to TCP as specified in TCP encapsulation of IKE and ESP packets.

If the initiator uses an ESP Echo Request to verify ESP reachability, the ESP transport to probe is determined by the NAT detection results as follows:

*  If a NAT was detected, the initiator MUST send the ESP Echo Request using UDP encapsulation on port 4500.

*  If no NAT was detected, the initiator MUST send an ESP Echo Request with ESP sent directly over IP. If no ESP Echo Reply is received after a short delay, the initiator MUST also send an ESP Echo Request using UDP encapsulation on port 4500, since some middleboxes do not allow IP traffic without UDP or TCP transport. The initiator MUST use the transport for which an ESP Echo Reply is received first. This approach is analogous to the Happy Eyeballs algorithm, giving preference to ESP sent directly over IP while avoiding excessive delay if it is not reachable.
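The probing rules above can be modelled with a small simulation. This is a sketch only: it does not send packets, the callable `reply_latency` (returning the Echo Reply latency in seconds, or `None` if no reply arrives) is a hypothetical stand-in for real probing, and the 0.25 second `fallback_delay` is an arbitrary assumption, since this specification leaves timers unspecified.

```python
from typing import Callable, Optional


def select_esp_transport(nat_detected: bool,
                         reply_latency: Callable[[str], Optional[float]],
                         fallback_delay: float = 0.25) -> str:
    """Return "ip", "udp" or "tcp" according to the probing rules."""
    if nat_detected:
        # Only UDP encapsulation is probed; no reply means fall back to TCP.
        return "udp" if reply_latency("udp") is not None else "tcp"

    ip_latency = reply_latency("ip")
    if ip_latency is not None and ip_latency <= fallback_delay:
        # Reply arrived before the UDP probe would even have been sent.
        return "ip"

    # The UDP-encapsulated probe starts fallback_delay after the IP probe.
    udp_latency = reply_latency("udp")
    arrivals = []
    if ip_latency is not None:
        arrivals.append((ip_latency, "ip"))
    if udp_latency is not None:
        arrivals.append((fallback_delay + udp_latency, "udp"))
    if not arrivals:
        return "tcp"  # neither probe was answered
    # Use the transport whose Echo Reply arrives first.
    return min(arrivals)[1]
```

For instance, with no NAT, a direct-IP reply after 1.0 s and a UDP reply after 0.1 s, the UDP reply arrives at 0.35 s on the shared timeline and UDP encapsulation is selected.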

MOBIKE allows an IKE SA, along with its Child SAs, to migrate from one IP address to another. Section 7.1 of TCP encapsulation of IKE and ESP packets specifies that when TCP is used as the IKE transport, a peer should attempt to switch back to UDP in the event of an IP address change. This specification updates that requirement: when separate transports are used for IKE and ESP, peers MUST NOT attempt to switch the IKE SA transport from TCP to UDP. However, an ESP SA MAY switch from UDP to TCP if UDP is blocked at the new IP address. Similarly, when ESP is running over TCP and the initiator detects an IP address change, the initiator MUST perform UDP reachability verification on the new path as described above. If ESP reachability is confirmed, the ESP SA switches from TCP to the verified path.
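The effect of an address change on the two transports can be sketched as below. The function name and the `verified_esp_path` parameter are hypothetical; the sketch assumes that reachability verification on the new path has already been run and that a result of `None` means neither direct IP nor UDP encapsulation was confirmed.

```python
from typing import Optional, Tuple


def transports_after_address_change(
        verified_esp_path: Optional[str]) -> Tuple[str, str]:
    """Return (ike_transport, esp_transport) after a MOBIKE address update
    for an IKE SA that negotiated separate transports.

    verified_esp_path is "ip" or "udp" when ESP reachability was confirmed
    on the new path, and None otherwise.
    """
    # The IKE SA never switches back from TCP to UDP.
    ike_transport = "tcp"
    if verified_esp_path in ("ip", "udp"):
        # ESP uses (or returns to) the path that was verified.
        esp_transport = verified_esp_path
    else:
        # UDP and direct IP are unusable at the new address.
        esp_transport = "tcp"
    return ike_transport, esp_transport
```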

IKE session resumption allows peers to quickly re-establish an IKE SA after the connection is broken. Since network conditions may change while the client is inactive, the use of separate transports MUST NOT be stored in the resumption ticket and MUST be renegotiated during session resumption. Since no large public keys are transferred in the IKE_SESSION_RESUME exchange, the initiator, unless it is configured to use TCP only, MUST start the resumption over UDP with destination port 4500, as discussed above. This enables NAT detection for UDP and UDP reachability testing before switching to TCP.
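One way to honor these two rules in an implementation is to exclude the negotiated flag when serializing ticket state and to pick the initial transport from local policy only. The following is a minimal sketch; the `IkeSaState` fields, the function names, and the `tcp_only` policy flag are all hypothetical and do not reflect any real ticket format.

```python
from dataclasses import asdict, dataclass


@dataclass
class IkeSaState:
    peer_id: str                 # hypothetical field
    resumption_secret: bytes     # hypothetical field
    separate_transports: bool    # negotiated for the current IKE SA only


def ticket_state(sa: IkeSaState) -> dict:
    """State eligible for the resumption ticket: the separate-transports
    flag is dropped so that it has to be renegotiated on resumption."""
    state = asdict(sa)
    state.pop("separate_transports")
    return state


def resumption_start_transport(tcp_only: bool) -> str:
    """Transport for the IKE_SESSION_RESUME request."""
    # Start over UDP (destination port 4500) unless policy forces TCP.
    return "tcp" if tcp_only else "udp"
```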

Section 10 of TCP encapsulation of IKE and ESP packets discusses security implications of using TCP as IKE transport.

This document defines a new Notify Message Type in the "IKEv2 Notify Message Status Types" registry:

SEPARATE_TRANSPORTS