Internet-Draft VPN Inter-AS Option BC February 2024
Zhang, et al. Expires 9 August 2024
Workgroup: BESS Working Group
Internet-Draft: draft-zzhang-bess-vpn-option-bc-01
Updates: 9012, 4364 (if approved)
Published: February 2024
Intended Status: Standards Track
Expires: 9 August 2024
Authors:
Z. Zhang
Juniper Networks
K. Kompella
Juniper Networks
B. Decraene
Orange
L. Jalil
Verizon

VPN Inter-AS Option BC

Abstract

RFC 4364 specifies protocol and procedures for BGP/MPLS IP Virtual Private Networks (VPNs), including different options (A/B/C) for Inter-AS support. This document specifies MPLS VPN Inter-AS Option BC, which combines the advantages of Option B and Option C while avoiding their disadvantages. The same concept is applicable to Ethernet Virtual Private Network (EVPN) as well.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 9 August 2024.

1. Introduction

1.1. Option B and Option C Pros and Cons

With Inter-AS Option B, an ASBR re-advertises the VPN routes it receives (referred to as service routes) with the next hop changed to itself and the label in the NLRI changed to a locally allocated (service) label that is bound to the <next hop, label> in the received route, and installs the corresponding label forwarding state. When it receives traffic and the locally allocated label is exposed at the top of the stack, the traffic is forwarded according to the installed forwarding state.
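
The Option B behavior can be sketched as follows. This is a minimal illustration under stated assumptions, not a normative procedure: the names ServiceRoute and OptionBAsbr, the starting label value 1000, and the node and prefix names are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class ServiceRoute:
    prefix: str
    label: int      # service label carried in the NLRI
    next_hop: str


class OptionBAsbr:
    """Hypothetical model of an Option B ASBR (illustration only)."""

    def __init__(self, name, first_label=1000):
        self.name = name
        self._labels = count(first_label)   # assumed local label allocator
        self._local_for = {}                # (next hop, label) -> local label
        self.lfib = {}                      # local label -> (next hop, label)

    def readvertise(self, route):
        binding = (route.next_hop, route.label)
        if binding not in self._local_for:
            local = next(self._labels)
            self._local_for[binding] = local
            self.lfib[local] = binding
        # Next hop becomes this ASBR; the NLRI label becomes the local label.
        return ServiceRoute(route.prefix, self._local_for[binding], self.name)

    def forward(self, top_label):
        # Look up the <next hop, label> bound to the exposed local label.
        return self.lfib[top_label]


asbr = OptionBAsbr("ASBR-X")
out = asbr.readvertise(ServiceRoute("vpn-prefix-1", 100, "PE-A"))
print(out)
print(asbr.forward(out.label))
```

Note that the forwarding table in this sketch grows with the number of distinct <next hop, label> pairs, that is, with the number of service labels.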

With Inter-AS Option C, the ASBRs are not involved in the exchange of service routes. The PEs receive service routes with the next hop unchanged, and label switched paths (LSPs) are established for each next hop. An ingress PE pushes the service label first, and then pushes the label(s) for the LSP to the egress PE. The routers along the path label switch the traffic based only on the label for the LSP to the egress PE.
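
The Option C label handling can be illustrated with a small sketch. The function names and the LSP label values 5001/5002 are assumptions made for the example; the stack is written with the top of stack first.

```python
def ingress_push(service_label, lsp_labels):
    """Return the label stack built by the ingress PE, top of stack first."""
    # The service label is pushed first, so it ends up at the bottom.
    return list(lsp_labels) + [service_label]


def transit_switch(stack, swap_table):
    """Swap only the top (LSP) label; the service label is never examined."""
    return [swap_table[stack[0]]] + stack[1:]


stack = ingress_push(service_label=100, lsp_labels=[5001])
stack = transit_switch(stack, {5001: 5002})   # e.g. at a transit router or ASBR
print(stack)   # [5002, 100]
```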

Because the ASBRs with Option B change the service routes' next hop and allocate local service labels on each hop, each AS's internal information (e.g. the loopback addresses) is hidden from outside the AS. Rich policy control could be applied on the ASBR-ASBR sessions, allowing very flexible and fine-grained route export control. As a result, Option B is very suitable for Inter-Provider scenarios.

On the other hand, Option C scales much better because ASBRs don't need to handle service routes or maintain per-service-label forwarding state. Compared to Option B, it is more suitable for Intra-Provider multi-AS scenarios.
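
The scaling difference can be made concrete with a rough count of ASBR forwarding entries. The sketch below is only an illustration: the figures (50 PEs, 200 per-VRF service labels on each PE) and the function names are assumptions, and real implementations may allocate labels differently.

```python
def option_b_asbr_entries(routes):
    # Option B: one local label per distinct <next hop, service label>.
    return len({(next_hop, label) for (label, _prefix, next_hop) in routes})


def option_c_asbr_entries(routes):
    # Option C: one LSP per egress PE; service labels are invisible to the ASBR.
    return len({next_hop for (_label, _prefix, next_hop) in routes})


# Assumed scenario: 50 PEs, each originating one route for each of 200 VRFs.
routes = [(100 + vrf, f"prefix-{pe}-{vrf}", f"PE{pe}")
          for pe in range(50) for vrf in range(200)]

print(option_b_asbr_entries(routes))   # 10000
print(option_c_asbr_entries(routes))   # 50
```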

1.2. Option BC

Option BC in this document combines the advantages of Option B and Option C. ASBRs re-advertise service routes with optional rich policy control. The service labels are not changed but the LSP towards the originating PE is advertised as part of the service routes - either as an additional label in the NLRI [RFC8277], or as a new type of tunnel in the Tunnel Encapsulation Attribute (TEA) [RFC9012] - without exposing the information about the originating PE.

Consider the following topology:

PE11                         PE21                          PE31
     -----                   -----                   -----
    (AS100) ASBR1 -- ASBR21 (AS200) ASBR22 -- ASBR3 (AS300)
     -----                   -----                   -----
PE12                         PE22                          PE32

PE11 and PE12 originate the following service routes respectively, where each tuple represents <service label in NLRI, service prefix, next hop>.

  • <100, sprfx1, PE11>

  • <100, sprfx2, PE12>

1.2.1. Using Multiple NLRI labels

When ASBR1 re-advertises the routes to ASBR21, they become the following:

  • <201, 100, sprfx1, ASBR1>

  • <301, 100, sprfx2, ASBR1>

The original service label 100 in the NLRI does not change, but a new binding label 201/301 is added respectively, and the next hop changes to ASBR1. Label 201/301 binds to PE11/PE12 respectively and ASBR1 sets up forwarding state so that when it receives a packet with label 201 (or 301), it label switches to PE11 (or PE12).

Similarly, when ASBR21 re-advertises the routes to its peers in AS200, the routes become:

  • <202, 100, sprfx1, ASBR21>

  • <302, 100, sprfx2, ASBR21>

Again, the inner service label does not change and the outer label 201/301 changes to 202/302 respectively. ASBR21 installs forwarding state so that when it receives packets with label 202/302, it swaps the label to 201/301 and then tunnels the packets to ASBR1, the next hop.

This continues. ASBR22 re-advertises to ASBR3 as follows:

  • <203, 100, sprfx1, ASBR22>

  • <303, 100, sprfx2, ASBR22>

and ASBR3 re-advertises to its AS300 peers as follows:

  • <204, 100, sprfx1, ASBR3>

  • <304, 100, sprfx2, ASBR3>

When PE31/PE32 send traffic for sprfx1/sprfx2, the following label stacks are used respectively:

  • <label (stack) to reach ASBR3, 204, 100>

  • <label (stack) to reach ASBR3, 304, 100>

ASBR3 label switches the traffic based on 204/304 as follows:

  • <label (stack) to reach ASBR22, 203, 100>

  • <label (stack) to reach ASBR22, 303, 100>

Eventually the traffic arrives at ASBR1, which label switches the traffic based on 201/301 as follows:

  • <label (stack) to reach PE11, 100>

  • <label (stack) to reach PE12, 100>
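
The walk-through above can be reproduced with a short simulation. This is a sketch under stated assumptions rather than a normative procedure: the names BcRoute and BcAsbr are hypothetical, each ASBR is given a fixed label pool so that the allocated values match the example, and the transport label (stack) towards each next hop is omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BcRoute:
    prefix: str
    service_label: int
    next_hop: str
    binding_label: Optional[int] = None   # None on routes originated by a PE


class BcAsbr:
    """Hypothetical Option BC ASBR using an extra NLRI label (sketch only)."""

    def __init__(self, name, label_pool):
        self.name = name
        self._pool = iter(label_pool)   # labels chosen to match the example
        self._local_for = {}            # (next hop, binding label) -> local label
        self.lfib = {}                  # local label -> (next hop, binding label)

    def readvertise(self, route):
        upstream = (route.next_hop, route.binding_label)
        if upstream not in self._local_for:
            local = next(self._pool)
            self._local_for[upstream] = local
            self.lfib[local] = upstream
        # Service label untouched; only the binding label and next hop change.
        return BcRoute(route.prefix, route.service_label, self.name,
                       self._local_for[upstream])

    def forward(self, stack):
        next_hop, out_label = self.lfib[stack[0]]
        rest = stack[1:]
        # Towards a PE the binding label is removed; otherwise it is swapped.
        return next_hop, (rest if out_label is None else [out_label] + rest)


chain = [BcAsbr("ASBR1", [201, 301]), BcAsbr("ASBR21", [202, 302]),
         BcAsbr("ASBR22", [203, 303]), BcAsbr("ASBR3", [204, 304])]
nodes = {asbr.name: asbr for asbr in chain}

routes = [BcRoute("sprfx1", 100, "PE11"), BcRoute("sprfx2", 100, "PE12")]
for asbr in chain:
    routes = [asbr.readvertise(r) for r in routes]

# PE31 sends traffic for sprfx1, starting with <binding label, service label>.
hop = routes[0].next_hop
stack = [routes[0].binding_label, routes[0].service_label]
path = []
while hop in nodes:
    path.append((hop, stack))
    hop, stack = nodes[hop].forward(stack)
path.append((hop, stack))
for step in path:
    print(step)
```

Each ASBR in this sketch keeps one forwarding entry per upstream <next hop, binding label>, so its state follows the number of originating PEs rather than the number of service labels.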

1.2.2. Using Tunnel Encapsulation Attribute

Alternatively, when ASBR1 re-advertises the routes to ASBR21, they become the following:

  • <100, sprfx1, ASBR1>, TEA composite tunnel <ASBR1, 201>

  • <100, sprfx2, ASBR1>, TEA composite tunnel <ASBR1, 301>

The service label in the NLRI does not change. The next hop changes, but it is not used. A TEA with a "Composite Tunnel" is added, which includes the ASBR1 as the tunnel egress endpoint and a binding label 201/301, which binds to PE11/PE12 accordingly. ASBR1 also sets up forwarding state so that when it receives a packet with label 201 (or 301), it label switches to PE11 (or PE12).

Similarly, when ASBR21 re-advertises the routes to its peers in AS200, the routes become:

  • <100, sprfx1, ASBR21>, TEA composite tunnel <ASBR21, 202>

  • <100, sprfx2, ASBR21>, TEA composite tunnel <ASBR21, 302>

Again, the service label does not change. The Composite Tunnel in the TEA changes to a new one <ASBR21, 202/302> respectively. ASBR21 installs forwarding state so that when it receives packets with label 202/302, it swaps the label to 201/301 and then sends the packets to ASBR1, the tunnel egress endpoint.

This continues. ASBR22 re-advertises to ASBR3 as follows:

  • <100, sprfx1, ASBR22>, TEA composite tunnel <ASBR22, 203>

  • <100, sprfx2, ASBR22>, TEA composite tunnel <ASBR22, 303>

and ASBR3 re-advertises to its AS300 peers as follows:

  • <100, sprfx1, ASBR3>, TEA composite tunnel <ASBR3, 204>

  • <100, sprfx2, ASBR3>, TEA composite tunnel <ASBR3, 304>

When PE31/PE32 send traffic for sprfx1/sprfx2, the following label stacks are used respectively:

  • <label (stack) to reach ASBR3, 204, 100>

  • <label (stack) to reach ASBR3, 304, 100>

ASBR3 label switches the traffic based on 204/304 as follows:

  • <label (stack) to reach ASBR22, 203, 100>

  • <label (stack) to reach ASBR22, 303, 100>

Eventually the traffic arrives at ASBR1, which label switches the traffic based on 201/301 as follows:

  • <label (stack) to reach PE11, 100>

  • <label (stack) to reach PE12, 100>
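
How a receiver might derive the label stack from a route carrying the composite tunnel can be sketched as follows. The data structures here are hypothetical stand-ins for the TEA encoding, which this document does not yet define, and the transport label 7001 is an assumed value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CompositeTunnel:
    egress_endpoint: str
    binding_label: int


@dataclass(frozen=True)
class TeaRoute:
    prefix: str
    service_label: int
    next_hop: str
    composite_tunnel: Optional[CompositeTunnel] = None


def imposition(route, transport_labels):
    """Return (node to tunnel to, label stack with top of stack first)."""
    tunnel = route.composite_tunnel
    if tunnel is None:
        # Plain service route: tunnel to the BGP next hop.
        target, inner = route.next_hop, [route.service_label]
    else:
        # Composite tunnel: the egress endpoint, not the next hop, is used.
        target = tunnel.egress_endpoint
        inner = [tunnel.binding_label, route.service_label]
    return target, list(transport_labels[target]) + inner


route = TeaRoute("sprfx1", 100, "ASBR3", CompositeTunnel("ASBR3", 204))
print(imposition(route, {"ASBR3": [7001]}))   # ('ASBR3', [7001, 204, 100])
```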

With Option C, the signaling of inter-AS LSPs for PEs is done separately and associated with the PE addresses. With Option BC, the signaling of those PE LSPs is done as part of service routes advertisement, without associating with the PE addresses.

1.2.2.1. Incremental Deployment

The Option BC method requires the receiving PEs/ASBRs to handle the composite tunnel in TEA. While it is reasonable to upgrade ASBRs, it may not be feasible to upgrade all PEs at the same time to support Option BC. Therefore, an Option-BC-capable ASBR may have to revert to Option B when re-advertising service routes.

In the above example, for the service routes originated by PE11/PE12:

  • If ASBR21 does not support Option BC, ASBR1 must not use Option BC when re-advertising to ASBR21.

  • If ASBR21 does support Option BC (and receives service routes from ASBR1 with the composite tunnel in TEA), but any one of PE21/PE22/ASBR22 does not support Option BC, ASBR21 must revert to Option B when re-advertising into AS200. It removes the TEA and allocates local labels that bind to the received <service label, composite tunnel>. When it receives traffic with the allocated local service label, the traffic is label switched per the bound <service label, composite tunnel>.

  • If ASBR22 and ASBR3 both support Option BC, ASBR22 can use Option BC when re-advertising the service routes to ASBR3 even though ASBR21 has reverted to Option B. This is as if ASBR21 is a PE that originated the service routes.

  • If PE31/PE32 don't support Option BC, ASBR3 has to revert to Option B again.

Even though there are repeated Option BC <-> Option B conversions on ASBR21/ASBR22/ASBR3 in the above example, ASBR1 and ASBR22 are able to take the scaling advantage of Option BC.
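
The per-hop choice between Option BC and Option B in the example can be sketched as a simple decision function. The function name and the capability values below are assumptions describing one possible deployment state (PE21, PE31 and PE32 not yet upgraded); they are not taken from any implementation.

```python
def readvertise_mode(sender_supports_bc, receivers_support_bc):
    """Pick the mode for one re-advertisement; every receiver must support BC."""
    if sender_supports_bc and all(receivers_support_bc):
        return "BC"
    return "B"


# Assumed capability state for the example topology.
support = {"ASBR1": True, "ASBR21": True, "ASBR22": True, "ASBR3": True,
           "PE21": False, "PE22": True, "PE31": False, "PE32": False}

# (re-advertising ASBR, receivers of that re-advertisement)
hops = [("ASBR1", ["ASBR21"]),
        ("ASBR21", ["PE21", "PE22", "ASBR22"]),
        ("ASBR22", ["ASBR3"]),
        ("ASBR3", ["PE31", "PE32"])]

modes = {sender: readvertise_mode(support[sender],
                                  [support[r] for r in receivers])
         for sender, receivers in hops}
print(modes)
```

With these assumed values, ASBR1 and ASBR22 re-advertise with Option BC while ASBR21 and ASBR3 revert to Option B.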

A PE/ASBR advertises its support of Option BC with a new Capability Code in its BGP Capabilities Optional Parameter [RFC5492]. A Route Reflector (RR) does not need to support Option BC procedures, but it advertises Option BC capability on behalf of its clients. This can be based either on provisioning (e.g. the operator knows all clients support Option BC) or on the RR's dynamic detection of its clients' Option BC capability.
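
One way an RR could decide whether to advertise the capability is sketched below. The function name is hypothetical, and the rule that dynamic detection requires every client to support Option BC is an assumption of this sketch; the Capability Code itself is not yet assigned and is not modeled.

```python
def rr_advertises_bc(client_capabilities, provisioned=None):
    """Decide whether the RR advertises Option BC on behalf of its clients."""
    if provisioned is not None:
        # Operator provisioning takes precedence over detection.
        return provisioned
    # Assumed rule: advertise only if there are clients and all support BC.
    return bool(client_capabilities) and all(client_capabilities.values())


print(rr_advertises_bc({"PE21": True, "PE22": False}))   # False
print(rr_advertises_bc({"PE21": True, "PE22": True}))    # True
print(rr_advertises_bc({}, provisioned=True))            # True
```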

1.2.2.2. Update to RFC 9012

[RFC9012] specifically calls out in its Section 10 "Applicability Restrictions" that use of the Tunnel Encapsulation attribute in an "Inter-AS option b" scenario is not recommended, as quoted below:

"Note that if the Tunnel Encapsulation attribute is attached to a
VPN-IP route [RFC4364], if Inter-AS "option b" (see Section 10 of
[RFC4364]) is being used, and if the Tunnel Egress Endpoint sub-TLV
contains an IP address that is not in the same AS as the router
receiving the route, it is very likely that the embedded label has
been changed.  Therefore, use of the Tunnel Encapsulation attribute
in an "Inter-AS option b" scenario is not recommended."

Notice that the context is "Tunnel Egress Endpoint sub-TLV contains an IP address that is not in the same AS as the router receiving the route" and the embedded service label is not allocated by the Tunnel Egress Endpoint. With Option BC, the binding label in the composite tunnel is allocated by the tunnel egress endpoint in the TEA, and the embedded service label will only be exposed at the PE/ASBR that allocated that embedded service label, so it is safe to use TEA with the "composite tunnel" in Option BC.

Even in the pure Option B case, as long as it is guaranteed that the embedded service label is allocated by the Tunnel Egress Endpoint, it is safe to use TEA with Option B.

1.2.3. Use of Option BC in SRv6/MPLS Service Interworking Option BC

In [I-D.zzhang-spring-service-interworking], when service routes are re-advertised by the interworking node to the MPLS domain from the SRv6 domain with the Option BC method, the next hop maps to an IPv6 prefix on the originating SRv6 PE. That has the following implications:

  • If the MPLS domain is IPv4, then many IPv4 addresses may have to be allocated and mapped to the IPv6 prefixes.

  • Even if the MPLS domain is IPv6, exposing those IPv6 prefixes may not be desired in the inter-provider case.

The Option BC procedures in this document can be used to address the above concerns. Instead of using a next hop that maps to an IPv6 prefix on the originating PE, the interworking node can use its own address as the next hop and use a composite tunnel in the TEA, in which the binding label is bound to the IPv6 prefix on the originating PE.
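
The binding at the interworking node can be sketched as follows. This is an illustration only: the class name, the label pool, the node address 192.0.2.1 and the IPv6 documentation prefixes are assumptions, and the derivation of the service label from the SRv6 side is outside the scope of this sketch.

```python
import ipaddress


class InterworkingNode:
    """Hypothetical SRv6/MPLS interworking node (illustration only)."""

    def __init__(self, address, label_pool):
        self.address = address
        self._pool = iter(label_pool)   # assumed binding label allocator
        self._label_for = {}            # IPv6 prefix -> binding label
        self.bindings = {}              # binding label -> IPv6 prefix

    def readvertise(self, prefix, service_label, srv6_pe_prefix):
        key = ipaddress.IPv6Network(srv6_pe_prefix)
        if key not in self._label_for:
            label = next(self._pool)
            self._label_for[key] = label
            self.bindings[label] = key
        # Own address as next hop; the IPv6 prefix itself is not exposed.
        return {"prefix": prefix,
                "service_label": service_label,
                "next_hop": self.address,
                "composite_tunnel": (self.address, self._label_for[key])}


node = InterworkingNode("192.0.2.1", [401, 402])
r1 = node.readvertise("sprfx1", 100, "2001:db8:0:11::/64")
r2 = node.readvertise("sprfx3", 101, "2001:db8:0:11::/64")
r3 = node.readvertise("sprfx2", 100, "2001:db8:0:12::/64")
print(r1["composite_tunnel"], r2["composite_tunnel"], r3["composite_tunnel"])
```

Routes from the same originating SRv6 PE share one binding label in this sketch, so no IPv4 address needs to be allocated per IPv6 prefix.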

2. Specification

Normative specification will be provided in future revisions.

3. Security Considerations

To be added.

4. Acknowledgement

The authors thank Srihari Singha for his review and suggestions.

5. References

5.1. Normative References

[RFC5492]
Scudder, J. and R. Chandra, "Capabilities Advertisement with BGP-4", RFC 5492, DOI 10.17487/RFC5492, <https://www.rfc-editor.org/info/rfc5492>.
[RFC8277]
Rosen, E., "Using BGP to Bind MPLS Labels to Address Prefixes", RFC 8277, DOI 10.17487/RFC8277, <https://www.rfc-editor.org/info/rfc8277>.
[RFC9012]
Patel, K., Van de Velde, G., Sangli, S., and J. Scudder, "The BGP Tunnel Encapsulation Attribute", RFC 9012, DOI 10.17487/RFC9012, <https://www.rfc-editor.org/info/rfc9012>.

5.2. Informative References

[I-D.zzhang-spring-service-interworking]
Zhang, Z. J., Decraene, B., Zadok, S., Jalil, L., and D. Voyer, "MPLS/SRv6 Service Interworking Option BC", Work in Progress, Internet-Draft, draft-zzhang-spring-service-interworking-02, <https://datatracker.ietf.org/doc/html/draft-zzhang-spring-service-interworking-02>.

Authors' Addresses

Zhaohui (Jeffrey) Zhang
Juniper Networks
Kireeti Kompella
Juniper Networks
Bruno Decraene
Orange
Luay Jalil
Verizon