| Internet-Draft | Multicast over RIFT | April 2026 |
| Gomez | Expires 19 October 2026 | [Page] |
RIFT (Routing in Fat Trees) is increasingly used as an underlay routing protocol in modern data center fabrics. However, RIFT does not natively define mechanisms for multicast traffic distribution.¶
This document provides operational guidance and best practices for deploying multicast in RIFT-based data center fabrics. It analyzes PIM, EVPN multicast, BIER, and head-end replication, highlighting trade-offs in scalability, efficiency, and operational complexity.¶
This document does not define new protocol mechanisms. It aims to assist network operators in making informed design decisions when deploying multicast services over RIFT-based fabrics.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 19 October 2026.¶
Copyright (c) 2026 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Modern data center fabrics rely on Clos-based topologies to achieve scalability, high bandwidth, and fault tolerance. RIFT [RFC9692] provides efficient unicast routing with topology awareness and Zero Touch Provisioning (ZTP).¶
Multicast traffic remains relevant for telemetry distribution, financial data delivery, streaming, and the handling of broadcast, unknown unicast, and multicast (BUM) traffic in EVPN-VXLAN. However, RIFT does not define native multicast capabilities.¶
Operators must rely on external mechanisms such as PIM [RFC4601], EVPN multicast [RFC7432], BIER [RFC8279], or head-end replication. This document provides guidance for selecting and operating these mechanisms in RIFT-based fabrics.¶
The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, NOT RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals.¶
This document applies to data center fabrics using RIFT [RFC9692] as the underlay routing protocol, specifically Clos and fat-tree topologies as described in [RFC9696]. It is not intended for general-purpose IP networks or WAN environments.¶
Numerical values and thresholds presented in this document are illustrative and may vary by platform. Operators SHOULD validate all thresholds against their platform documentation prior to production deployment.¶
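RIFT provides efficient unicast routing but does not define mechanisms for multicast group membership or tree construction. The existing solutions (PIM, EVPN multicast, BIER, and head-end replication) each involve trade-offs in scalability, efficiency, and operational complexity.¶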
There is no standardized operational guidance for multicast in RIFT-based fabrics. This document addresses this gap.¶
Operators should evaluate deployment models based on scale, platform support, and operational requirements. Table 1 summarizes the trade-offs.¶
| Model | Config | Underlay Mcast | Scale | Convergence |
|---|---|---|---|---|
| RIFT Native | Auto | None | Excellent | Excellent |
| PIM | Medium | Required | Good | 1-5 s |
| EVPN / IR | Low | None | Limited | Good |
| EVPN / Incl. Trees | High | Required | Medium | Moderate |
| BIER | Auto | None | Excellent | Sub-second |
| Head-End Replication | Low | None | Poor | Good |
Work is ongoing [RIFT-MULTICAST] to define native RIFT multicast support. Operators MAY consider this approach once standardized. It is expected to provide automatic tree construction via ZTP and KV-TIEs.¶
PIM MAY be deployed over a RIFT underlay. The Rendezvous Point (RP) SHOULD be placed on spine nodes. The PIM Bootstrap Router (BSR) mechanism SHOULD be used for automatic RP advertisement. A dedicated multicast group range SHOULD be assigned for fabric use. Convergence typically takes 1 to 5 seconds after a failure.¶
EVPN multicast [RFC7432] MAY be used over a RIFT underlay. Ingress Replication is RECOMMENDED for small deployments only (fewer than 100 VXLAN Tunnel End Points, or VTEPs). Inclusive Multicast Trees scale better but require underlay multicast. Operators SHOULD carefully plan the mapping of each VXLAN Network Identifier (VNI) to a multicast group.¶
BIER [RFC8279] MAY be used as a multicast forwarding mechanism. [RFC9624] defines how BIER optimizes EVPN BUM forwarding. BIER is stateless in intermediate nodes and provides sub-second convergence. Operators SHOULD evaluate BIER where platform support is available.¶
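The stateless behavior of intermediate nodes can be illustrated with a simplified sketch of BitString handling. It assumes a single sub-domain and set identifier, and the function names and neighbor masks are hypothetical. The forwarding procedure in [RFC8279] is more detailed than this per-neighbor loop.

```python
def encode_bitstring(bfr_ids):
    """Set bit (id - 1) for every egress router that must receive the packet."""
    bits = 0
    for bfr_id in bfr_ids:
        if bfr_id < 1:
            raise ValueError("BFR-ids start at 1")
        bits |= 1 << (bfr_id - 1)
    return bits


def replicate(bitstring, neighbor_masks):
    """Return {neighbor: BitString} for each copy an intermediate node sends.

    neighbor_masks maps a neighbor name to the bits reachable through it
    (a stand-in for the forwarding bit mask; the values are hypothetical).
    """
    copies = {}
    remaining = bitstring
    for neighbor, mask in neighbor_masks.items():
        subset = remaining & mask
        if subset:
            copies[neighbor] = subset
            # Clear the bits already served so no egress router gets two copies.
            remaining &= ~mask
    return copies
```

The node keeps no per-group or per-source state: the set of receivers travels in the packet, and the only table consulted is the per-neighbor mask derived from unicast routing.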
Head-end replication MAY be used for simplicity. It is RECOMMENDED only for small deployments (fewer than 50 VTEPs) and SHOULD NOT be used in large fabrics.¶
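The scaling limit follows from simple arithmetic: with head-end replication the ingress VTEP sends one unicast copy of each BUM packet to every remote VTEP. The sketch below assumes that all VTEPs participate in the VNI. The function names and the 10 Mbit/s BUM rate in the usage note are illustrative and are not taken from any platform.

```python
def her_copies_per_packet(vtep_count: int) -> int:
    """Unicast copies the ingress VTEP emits for one BUM packet."""
    if vtep_count < 1:
        raise ValueError("vtep_count must be at least 1")
    # One copy per remote VTEP; the ingress VTEP does not send to itself.
    return vtep_count - 1


def her_uplink_load_mbps(vtep_count: int, bum_rate_mbps: float) -> float:
    """Traffic the ingress VTEP must transmit for a given BUM input rate."""
    return her_copies_per_packet(vtep_count) * bum_rate_mbps
```

For example, her_uplink_load_mbps(50, 10.0) gives 490.0 Mbit/s at the ingress VTEP, whereas the same 10 Mbit/s of BUM traffic in a 500-VTEP fabric would require 4990.0 Mbit/s.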
The values presented are illustrative and may vary by platform.¶
A multicast group allocation policy SHOULD be established prior to deployment. Each VNI SHOULD be mapped to a unique multicast group for EVPN BUM traffic. All mappings SHOULD be documented centrally to prevent conflicts.¶
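A central mapping can be generated rather than maintained by hand. The following is a minimal sketch, assuming a dedicated range of 239.1.0.0/16 (an arbitrary choice from the administratively scoped block) and hypothetical function names.

```python
import ipaddress


def allocate_groups(vnis, group_range="239.1.0.0/16"):
    """Map each VNI to a unique group taken from a dedicated range."""
    network = ipaddress.ip_network(group_range)
    if not network.is_multicast:
        raise ValueError(f"{group_range} is not a multicast range")
    unique_vnis = sorted(set(vnis))
    if len(unique_vnis) > network.num_addresses:
        raise ValueError("group range too small for the requested VNIs")
    # Deterministic assignment: the n-th VNI gets the n-th address.
    return {vni: str(network[i]) for i, vni in enumerate(unique_vnis)}


def find_conflicts(mapping):
    """Return the groups that are assigned to more than one VNI."""
    by_group = {}
    for vni, group in mapping.items():
        by_group.setdefault(group, []).append(vni)
    return {g: sorted(v) for g, v in by_group.items() if len(v) > 1}
```

Running find_conflicts against the documented mapping before each change is one way to catch overlapping assignments early.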
A primary RP and at least one backup RP SHOULD be configured on different spine nodes. PIM Anycast-RP [RFC4610] or BSR-based redundancy MAY be used for automatic failover.¶
Operators SHOULD use configuration templates and automation frameworks for consistent configuration. Manual per-node configuration is error-prone and does not scale beyond 100 nodes.¶
Before production deployment, operators SHOULD validate convergence by deliberately failing an RP or spine node and measuring traffic interruption duration.¶
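One way to quantify the interruption is to send sequence-numbered probe packets to a test group at a fixed interval and analyze what a receiver observes. The sketch below assumes that arrival timestamps (in seconds) and sequence numbers have already been collected. The function names are hypothetical.

```python
def longest_interruption(arrival_times):
    """Return the largest gap, in seconds, between consecutive probe arrivals."""
    times = sorted(arrival_times)
    if len(times) < 2:
        raise ValueError("need at least two arrival timestamps")
    return max(later - earlier for earlier, later in zip(times, times[1:]))


def lost_probes(received_seq, first_seq, last_seq):
    """Return the sequence numbers that were sent but never received."""
    expected = set(range(first_seq, last_seq + 1))
    return sorted(expected - set(received_seq))
```

The measured gap includes one probe interval, so the resolution of the result is bounded by how often probes are sent.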
Operators SHOULD monitor multicast traffic using platform telemetry. Table 2 provides recommended alert thresholds. Values are illustrative.¶
| Metric | Alert Threshold | Recommended Action |
|---|---|---|
| Active Multicast Groups | > 80% of HW limit | Add capacity / optimize |
| Active Sources | Unexpected spike | Investigate source |
| RP CPU Usage | > 70% | Add RP / rate-limit |
| BUM Traffic Volume | > 150% of baseline | Check storm / misconfig |
| PIM Join/Prune Rate | Sustained high rate | Check topology stability |
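The numeric thresholds in Table 2 can be checked mechanically. The sketch below covers only the three rows with numeric thresholds. The sample field names, the hardware limit, and the baseline are assumptions that an operator would take from platform telemetry and documentation.

```python
def evaluate_alerts(sample, hw_group_limit, bum_baseline_mbps):
    """Return the Table 2 actions triggered by one telemetry sample.

    sample is a dict with the hypothetical keys "active_groups",
    "rp_cpu_percent", and "bum_mbps".
    """
    alerts = []
    if sample["active_groups"] > 0.80 * hw_group_limit:
        alerts.append("Active Multicast Groups: add capacity / optimize")
    if sample["rp_cpu_percent"] > 70:
        alerts.append("RP CPU Usage: add RP / rate-limit")
    if sample["bum_mbps"] > 1.50 * bum_baseline_mbps:
        alerts.append("BUM Traffic Volume: check storm / misconfig")
    return alerts
```

The "unexpected spike" and "sustained high rate" rows depend on a site-specific baseline over time and are left out of this sketch.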
Multicast convergence depends on underlay convergence time, protocol behavior, and control-plane load. Some packet loss during convergence is expected. Operators SHOULD use fast failure detection on PIM-enabled links.¶
| Metric | Small DC (<100) | Medium DC (100-500) | Large DC (500+) |
|---|---|---|---|
| Max Multicast Groups | ~1,000 | ~5,000 | ~50,000+ |
| Recommended RPs | 1 | 2-3 | 4-6 |
| Convergence Target | < 1 second | 1-5 seconds | 5-10 seconds |
| Recommended BUM Model | HER / PIM | Inclusive Trees | BIER / Dist. RPs |
| HW Threshold Alert | 80% of 4K | 80% of 16K | 80% of 64K+ |
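The sizing table above can be expressed as a lookup for use in planning scripts. This sketch makes two assumptions: the size in the column headers is read as a node count (the table does not state the unit), and the shared boundary value of 500 is assigned to the large profile. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SizingProfile:
    name: str
    max_groups: str
    recommended_rps: str
    convergence_target: str
    bum_model: str


# (exclusive upper bound, profile), in ascending order of size.
_BOUNDED = (
    (100, SizingProfile("Small DC", "~1,000", "1", "< 1 second", "HER / PIM")),
    (500, SizingProfile("Medium DC", "~5,000", "2-3", "1-5 seconds",
                        "Inclusive Trees")),
)
_LARGE = SizingProfile("Large DC", "~50,000+", "4-6", "5-10 seconds",
                       "BIER / Dist. RPs")


def sizing_profile(size: int) -> SizingProfile:
    """Return the illustrative sizing profile for a fabric of the given size."""
    if size < 1:
        raise ValueError("size must be a positive count")
    for upper_bound, profile in _BOUNDED:
        if size < upper_bound:
            return profile
    return _LARGE
```

For instance, sizing_profile(250).recommended_rps returns "2-3". The values remain illustrative and should be validated per platform.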
EVPN ARP suppression SHOULD be enabled on all leaf nodes. DHCP relay SHOULD be configured on leaf nodes. Broadcast rate limiting SHOULD be configured to prevent storm propagation.¶
Leaf nodes SHOULD properly learn and advertise MAC addresses via EVPN Type 2 routes. Unknown unicast rates per VNI SHOULD be monitored. Unknown unicast suppression SHOULD be enabled where supported.¶
IGMP snooping SHOULD be enabled on all leaf nodes. An IGMP querier SHOULD be designated per broadcast domain. For IPv6 segments, MLD snooping and an MLD querier SHOULD be configured.¶
A small RIFT fabric (2-4 spines, 10-50 leaves) prioritizes simplicity. RIFT ZTP SHOULD be deployed. EVPN with Head-End Replication is RECOMMENDED for BUM. If native multicast is required, a single RP on one spine node is sufficient. ARP suppression SHOULD be enabled on all leaves.¶
A medium RIFT fabric (10-30 spines, 100-300 leaves) requires scalability and redundancy. Operators SHOULD deploy 2-3 RPs on different spine nodes with PIM BSR. EVPN inclusive multicast trees are RECOMMENDED over head-end replication. Separate multicast group ranges SHOULD be assigned for infrastructure and application traffic.¶
A large RIFT fabric (50+ spines, 500+ leaves) requires high efficiency. BIER [RFC8279] [RFC9624] is RECOMMENDED where platform support is available. If BIER is unavailable, 4-6 distributed RPs with PIM Anycast-RP [RFC4610] SHOULD be deployed. Regional multicast domains SHOULD limit state propagation.¶
Diagnostic procedures vary by platform. Operators SHOULD consult platform documentation for specific commands and tools.¶
Possible causes: incorrect IGMP/MLD group membership on leaf nodes; EVPN IMET routes not advertised; RPF check failing due to asymmetric routing. Steps: verify group membership state; verify EVPN IMET route advertisement; check RPF path; confirm RIFT unicast convergence is complete.¶
Possible causes: excessive PIM Register messages; (S,G) state approaching platform limits. Steps: examine PIM message rates on the RP; check active multicast state count; consider additional RPs or BIER migration.¶
Possible causes: PIM waiting for RIFT unicast convergence; slow backup RP detection; fast failure detection not configured. Steps: measure RIFT convergence time independently; verify fast failure detection on PIM links; test RP failover in a lab.¶
Possible causes: ARP suppression not enabled; host generating broadcast storm; MAC learning failures. Steps: verify ARP suppression per VNI; monitor per-interface broadcast counters; identify source via MAC address tables; inspect offending host.¶
Multicast deployments can introduce flooding, amplification, and unauthorized group access risks. Operators SHOULD enable RPF checking; use SSM with IGMPv3 where possible; filter 224.0.0.0/24 at fabric boundaries (this range MUST NOT be routed across the fabric); configure per-source rate limits; and apply control-plane policing on RP nodes.¶
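A boundary filter policy can be prototyped before it is translated into platform configuration. The sketch below handles IPv4 groups only and uses hypothetical labels. It treats 232.0.0.0/8 as the SSM range and 224.0.0.0/24 as the link-local range that is to be filtered.

```python
import ipaddress

LINK_LOCAL_MCAST = ipaddress.ip_network("224.0.0.0/24")
SSM_RANGE = ipaddress.ip_network("232.0.0.0/8")


def classify_group(group: str) -> str:
    """Classify an IPv4 destination address for boundary filtering."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        return "not-multicast"
    if addr in LINK_LOCAL_MCAST:
        # Link-local control traffic must not cross the fabric boundary.
        return "drop-at-boundary"
    if addr in SSM_RANGE:
        return "ssm"
    return "asm"
```

A policy generator could use this classification to emit a deny entry for "drop-at-boundary" groups and to flag "asm" groups for review where SSM is preferred.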
This document has no IANA actions.¶
Level 1 (Spine)
+------------+ +------------+
| Spine-1 | | Spine-2 |
+------------+ +------------+
/ | \ / | \
/ | \ / | \
Level 0 (Leaf)
+--------+ +--------+ +--------+ +--------+
| Leaf-1 | | Leaf-2 | | Leaf-3 | | Leaf-4 |
+--------+ +--------+ +--------+ +--------+
| | | | | | | |
VMs VMs VMs VMs VMs VMs VMs VMs
     Source (Leaf-1)
           |
           v
      +----------+
      | Spine-1  | <-- Rendezvous Point (RP)
      +----------+
       /    |    \
      v     v     v
+------+ +------+ +------+
|Leaf-2| |Leaf-3| |Leaf-4|
+------+ +------+ +------+
      Ingress VTEP
       +--------+
       | Leaf-1 |
       +--------+
       /   |   \      (unicast VXLAN copies)
      v    v    v
+------+ +------+ +------+
|Leaf-2| |Leaf-3| |Leaf-4|
|(VTEP)| |(VTEP)| |(VTEP)|
+------+ +------+ +------+
Level 2 (Super-Spine)
+-----------+ +-----------+ +-----------+
| S-Spine-1 | | S-Spine-2 | | S-Spine-3 |
+-----------+ +-----------+ +-----------+
Level 1 (Spine) -- Multiple RP regions
+------+ +------+ +------+ +------+
|Sp-1 | |Sp-2 | |Sp-3 | |Sp-4 | ...
|(RP-A)| |(RP-B)| |(RP-C)| |(RP-D)|
+------+ +------+ +------+ +------+
Level 0 (Leaf)
+--++--+ +--++--+ +--++--+ +--++--+
|L1||L2| |L3||L4| |L5||L6| |L7||L8| ...
+--++--+ +--++--+ +--++--+ +--++--+