This document describes the Distributed Congestion Mitigation (DCM) mechanism
using the Interior Gateway Protocols (IGPs) such as IS-IS [RFC1195],
OSPFv2 [RFC2328], or OSPFv3 [RFC5340].
DCM is a tactical, distributed mechanism, designed to mitigate network congestion
by offloading traffic to an alternate, congestion-free paths. DCM is fully integrated
in IGPs.¶
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF). Note that other groups may also distribute working
documents as Internet-Drafts. The list of current Internet-Drafts is
at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 7 November 2026.¶
Copyright (c) 2026 IETF Trust and the persons identified as the
document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document. Code Components extracted from this
document must include Revised BSD License text as described in
Section 4.e of the Trust Legal Provisions and are provided without
warranty as described in the Revised BSD License.¶
Network capacity planning is a proactive strategy to deal with the network
congestion. Even with the proper capacity planning, network congestion arises
from oversubscription, link or node failures, and from the shifting traffic patterns.
DCM provides a reactive, distributed mechanism to mitigate local congestion by
leveraging the interface utilization monitoring, IGP link administrative groups
[RFC8919], [RFC8920], and Flex-Algo
[RFC9350] path computation and forwarding.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
"SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in BCP 14 [RFC2119][RFC8174] when, and only when, they appear in all capitals, as
shown here.¶
Offloading Flex-Algo, used for traffic offloading.¶
Congestion Affinity:
Administrative group used to exclude congested links
from the OFA topology.¶
High Utilization Affinity:
Administrative group used to signal high utilization
and prevent additional offload traffic.¶
Congestion Threshold:
The utilization level at which a link is marked
with Congestion affinity. Traffic offloading is initiated for the local link.¶
Non-Congestion Threshold:
The utilization level at which the additional
offloading on the link stops. If no traffic is offloaded from the local link, the
Congestion affinity is removed.¶
High Utilization Threshold:
The utilization level at which a link is
marked with High Utilization affinity to stabilize the load.¶
Restore Threshold:
The utilization level at which traffic restoration
is initiated.¶
Low Utilization Threshold:
The utilization level used to remove
High Utilization affinity.¶
DCM provisions a dedicated OFA. OFA's FAD is set to exclude the Congestion
affinity. Any link declared congested MUST be excluded from the OFA topology by
setting the Congestion affinity. This allows the OFA to natively route traffic
around all congested links.¶
DCM can be used to offload Algorithm 0 traffic or traffic of any Flex-Algo,
which is not used as OFA itself. If the DCM is used for Flex-algo traffic, the OFA,
on top of using the Congestion affinity exclude rule, SHOULD inherit the FAD
algorithm-type, constraints, and metric type from the original Flex-algo for
which the DCM is done.¶
DCM utilizes precise congestion monitoring and detection mechanisms for the local
interfaces on the router. Some of the characteristics of such monitoring are:¶
Interface output rates are collected periodically and are statistically
adjusted to avoid oscillations and ensure stability. An Exponential Weighted
Moving Average (EWMA) is an example of such adjustments, used for smoothing
the output rate samples.¶
Trend monitoring may optionally be employed to enhance detection performance.
When an upward trend is detected, a more aggressive weighting may be applied to
react faster to significant changes, thereby preventing or minimizing any traffic
drop. Stabilization windows and trend thresholds can be used to detect and confirm
sustained upward trends in the link utilization.¶
Some noise filtering is required for short-duration utilization spikes.¶
A more granular traffic rate data, like per destination prefix or per flow,
MAY be collected to find out the destinations that are significantly contributing
to the local congestion.¶
Traffic offloading is performed to divert the traffic onto the shortest path
that avoids any congested links. The offloading process adheres to the following
principles:¶
Only traffic from locally congested interfaces are offloaded at each node.¶
Offloading MUST NOT be done for Flex-algo that is used as OFA.¶
Offloading MUST NOT be done if there is no valid path to the destination node
in the OFA topology.¶
Additional offloading MUST NOT be done, if the path to the destination node
in the OFA topology crosses any link that is advertised with the
High Utilization Affinity.¶
The offload path is calculated for each prefix individually.¶
Any destination with the primary path over the locally congested link is
eligible for offloading. Offloading may be subject to user-defined
filtering.¶
A more granular traffic rate data, like per destination prefix or per flow,
MAY be used to decide which traffic is offloaded.¶
Traffic is diverted using the Unequal Cost Multi-Path (UCMP) across the primary
path and the offload path.¶
Offloading and restoring is executed progressively in periodic iterations,
utilizing a jittered delay between each step, to ensure network stability.¶
At each iteration, a small increment of traffic (e.g., 5%) is offloaded or
restored.¶
The specific amount of traffic to be offloaded is calculated based on the
lowest capacity link identified on the path from the offloading node to the
destination node inside the OFA.¶
Offloading starts when the Congestion Threshold is exceeded on the local interface.
Additional offloading ceases when the utilization on the local interface drops below
Non-Congestion Threshold.¶
If the calculated OFA path from the offloading node to the destination node
changes (i.e., any topological modification affecting the end-to-end path) while
traffic for a prefix destined to that node is being actively offloaded,
the offloading for that prefix MUST be terminated. The offloading process
for that prefix MUST then be re-initiated from the initial state, following
the standard iterative process.¶
Restoration of the offloaded traffic to the primary path starts
when the utilization on the local interface drops below the Restore Threshold.
Restoration ceases when the utilization on the local interface exceeds the
Non-congestion threshold, or if all traffic is restored.¶
Offloaded traffic is routed via OFA paths, requiring the switching of the
traffic's algorithm to the OFA. The forwarding behavior is defined as follows:¶
In the MPLS network (intra-area): The top label is replaced by the OFA label
of the destination node.¶
In the MPLS networks (inter-area/inter-domain): The OFA label of the
destination ABR/ASBR is pushed on top of the label stack.¶
In the SRv6 networks: The mechanism requires the push of the OFA node SID
(e.g., uN, END) of the destination node. Alternatively, an adjacency SID
(e.g., uA, END-X) of the penultimate hop of the destination node towards
the destination node itself in the OFA topology may be used, to avoid the presence
of multiple local SIDs at the destination node.¶
Congestion Affinity: Removes the link from the OFA topology. Any offloaded
traffic, local or remote, is removed from the link.¶
High Utilization Affinity: Signals routers in the area to stop sending new offload
traffic to the link without impacting existing offloaded traffic.
Traffic that has already been offloaded over the link can stay there.¶
While the affinity-based signaling described in Section 9
effectively mitigates large-scale oscillations, localized instabilities may still
occur due to the following:¶
Elephant Flows: If UCMP hashing results in a high-bandwidth flow being steered
onto an offload path, it may trigger secondary congestion on that path.¶
Simultaneous Offloading: In a distributed environment, multiple routers may
simultaneously detect congestion and initiate offloading for their locally congested
links, leading to rapid, coordinated shifts in traffic patterns that exceed the network's
convergence stability.¶
To address these scenarios, implementations MUST employ a damping mechanism
for prefix-specific offloading:¶
Path Change Monitoring: Routers SHOULD monitor the frequency of OFA path changes
for each prefix. If the path changes more than N times within a defined interval T,
the router SHOULD suppress further offloading for that prefix.¶
Exponential Backoff: To further enhance stability, an exponential backoff
mechanism MAY be applied, increasing the duration for which offloading is suppressed
each time the threshold is exceeded. This ensures that prefixes causing persistent
path churn are excluded from the DCM process until the network state stabilizes.¶
While this document defines the mandatory protocol behaviors required to ensure
interoperability and network stability, certain aspects of the DCM mechanism are
left to the implementation.¶
Implementations MUST adhere to the following rules to ensure the integrity of
the DCM mechanism:¶
OFA Topology Exclusion: The OFA FAD MUST exclude Congestion Affinity.¶
High Utilization Affinity Signaling: Routers MUST interpret the High Utilization
Affinity as a signal to cease sending new offload traffic to the link, while allowing
existing offloaded traffic to persist to avoid unnecessary churning.¶
Routers participating in the OFA MUST monitor the local links utilization and
advertise the Congestion Affinity when the Congestion Threshold is exceeded.
They MUST stop advertising the Congestion Affinity if the utilization drops below
the Non-Congestion Threshold and there is no traffic that is locally offloaded
from the link.¶
Routers participating in the OFA MUST monitor the local links utilization and
advertise the High Utilization Affinity when the High Utilization Threshold is
exceeded. They MUST stop advertising the High Utilization Affinity if the utilization
drops below the Low Utilization Threshold.¶
Advertising the Congestion and High Utilization Affinities is subject to the
standard throttling used by the implementation when generating the LSP, or LSA
update.¶
The setting of the Congestion and High Utilization affinities SHOULD be performed
more aggressively than the unsetting. Setting of these affinities is a protective
measure against imminent performance degradation; therefore, it SHOULD be prioritized
to prevent congestion. Implementations MAY use immediate (un-smoothed) utilization
samples, or more aggressive statistical adjustments for setting these affinities.
Conversely, unsetting these affinity is a restoration measure to return to optimal
routing. A more cautious approach to unsetting is recommended to ensure the link
has stabilized and usage of the smoothed, statistically adjusted utilization value
is recommended.¶
Algorithm Constraints: Offloading MUST NOT be performed for any Flex-Algo that is
currently designated as an OFA.¶
Implementations have the discretion to define the operational heuristics that
trigger the protocol mechanisms, including:¶
Congestion Detection Logic: The specific statistical methods used for link
utilization adjustment (e.g., EWMA, trend analysis, or noise filtering) are
implementation-dependent.¶
Threshold Tuning: The specific values for Congestion, Non-Congestion,
High Utilization, Restore, and Low Utilization thresholds are operational parameters
that should be tuned based on network-specific capacity, stability and performance
requirements.¶
Offload Granularity: The iterative process, including the size of traffic
increments, is left to the implementer to optimize for network stability.¶
Offload Interval: The offload interval SHOULD be set to a value larger than
the sum of the time required for the nodes in the network to detect high
utilization or congestion, the time required to advertise the link affinities
(including the LSP/LSA throttling), and the time required for those affinities
to propagate across the network, plus some additional time to ensure network
stability.¶
Filtering Policies: While the protocol provides the mechanism for offloading,
the policy governing which prefixes or traffic classes are eligible for offloading
is a local configuration choice.¶
DCM can be deployed incrementally in the network. Legacy routers, that do not
support DCM, MUST not participate in the OFA.¶
DCM does not need to be enabled on all routers in the network. However, it must be
enabled on all routers along the specific path towards the egress node, starting from
the point where DCM is being used. To successfully offload traffic, the offload path
must be contiguous. If any router along the path towards the egress node lacks DCM
enablement, the OFA path may not be available.¶
DCM relies on standard IS-IS, OSPF, and OSPFv3 flooding mechanisms.
Implementations MUST ensure that affinity configuration is consistent between
routers participating in the DCM inside the area and protected against unauthorized
modification, as malicious manipulation could lead to traffic drops or suboptimal
routing.¶
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, , <https://www.rfc-editor.org/info/rfc2119>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, , <https://www.rfc-editor.org/info/rfc8174>.
Atlas, A., Ed. and A. Zinin, Ed., "Basic Specification for IP Fast Reroute: Loop-Free Alternates", RFC 5286, DOI 10.17487/RFC5286, , <https://www.rfc-editor.org/info/rfc5286>.
[RFC9855]
Bashandy, A., Litkowski, S., Filsfils, C., Francois, P., Decraene, B., and D. Voyer, "Topology Independent Fast Reroute Using Segment Routing", RFC 9855, DOI 10.17487/RFC9855, , <https://www.rfc-editor.org/info/rfc9855>.
[RFC8919]
Ginsberg, L., Psenak, P., Previdi, S., Henderickx, W., and J. Drake, "IS-IS Application-Specific Link Attributes", RFC 8919, DOI 10.17487/RFC8919, , <https://www.rfc-editor.org/info/rfc8919>.
[RFC8920]
Psenak, P., Ed., Ginsberg, L., Henderickx, W., Tantsura, J., and J. Drake, "OSPF Application-Specific Link Attributes", RFC 8920, DOI 10.17487/RFC8920, , <https://www.rfc-editor.org/info/rfc8920>.
[RFC9350]
Psenak, P., Ed., Hegde, S., Filsfils, C., Talaulikar, K., and A. Gulko, "IGP Flexible Algorithm", RFC 9350, DOI 10.17487/RFC9350, , <https://www.rfc-editor.org/info/rfc9350>.