<?xml version="1.0" encoding="US-ASCII"?>
<!-- This template is for creating an Internet Draft using xml2rfc,
    which is available here: http://xml.resource.org. -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!-- One method to get references from the online citation libraries.
    There has to be one entity for each item to be referenced. 
    An alternate method (rfc include) is described in the references. -->
<!ENTITY RFC2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY RFC2629 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2629.xml">
<!ENTITY RFC3552 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3552.xml">
<!ENTITY I-D.narten-iana-considerations-rfc2434bis SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.narten-iana-considerations-rfc2434bis.xml">
]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<!-- used by XSLT processors -->
<!-- For a complete list and description of processing instructions (PIs), 
    please see http://xml.resource.org/authoring/README.html. -->
<!-- Below are generally applicable Processing Instructions (PIs) that most I-Ds might want to use.
    (Here they are set differently than their defaults in xml2rfc v1.32) -->
<?rfc strict="yes" ?>
<!-- give errors regarding ID-nits and DTD validation -->
<!-- control the table of contents (ToC) -->
<?rfc toc="yes"?>
<!-- generate a ToC -->
<?rfc tocdepth="4"?>
<!-- the number of levels of subsections in ToC. default: 3 -->
<!-- control references -->
<?rfc symrefs="yes"?>
<!-- use symbolic references tags, i.e, [RFC2119] instead of [1] -->
<?rfc sortrefs="yes" ?>
<!-- sort the reference entries alphabetically -->
<!-- control vertical white space 
    (using these PIs as follows is recommended by the RFC Editor) -->
<?rfc compact="yes" ?>
<!-- do not start each main section on a new page -->
<?rfc subcompact="no" ?>
<!-- keep one blank line between list items -->
<!-- end of list of popular I-D processing instructions -->
<rfc category="std" docName="draft-ietf-idr-performance-routing-06"
     ipr="trust200902">
  <front>
    <title abbrev="BGP-PAR">BGP Performance-aware Routing Mechanism</title>

    <author fullname="Xiaohu Xu" initials="X." surname="Xu">
      <organization>China Mobile</organization>

      <address>

        <email>xuxiaohu_ietf@hotmail.com</email>

        <!-- uri and facsimile elements may also be added -->
      </address>
    </author>

    <author fullname="Shraddha Hegde" initials="S." surname="Hegde">
      <organization>HPE</organization>

      <address>
        <postal>
          <street/>

          <city/>

          <region/>

          <code/>

          <country/>
        </postal>

        <phone/>

        <facsimile/>

        <email>shraddha.hegde@hpe.com</email>

        <uri/>
      </address>
    </author>

    <author fullname="Ketan Talaulikar" initials="K." surname="Talaulikar">
      <organization>Individual</organization>

      <address>
        <postal>
          <street/>

          <city/>

          <region/>

          <code/>

          <country>India</country>
        </postal>

        <phone/>

        <facsimile/>

        <email>ketant.ietf@gmail.com</email>

        <uri/>
      </address>
    </author>

    <author fullname="Mohamed Boucadair" initials="M." surname="Boucadair">
      <organization>France Telecom</organization>

      <address>
        <postal>
          <street/>

          <city/>

          <region/>

          <code/>

          <country/>
        </postal>

        <phone/>

        <facsimile/>

        <email>mohamed.boucadair@orange.com</email>

        <uri/>
      </address>
    </author>

    <author fullname="Christian Jacquenet" initials="C." surname="Jacquenet">
      <organization>France Telecom</organization>

      <address>
        <postal>
          <street/>

          <city/>

          <region/>

          <code/>

          <country/>
        </postal>

        <phone/>

        <facsimile/>

        <email>christian.jacquenet@orange.com</email>

        <uri/>
      </address>
    </author>

    <author fullname="Jie Dong" initials="J." surname="Dong">
      <organization>Huawei</organization>

      <address>
        <postal>
          <street/>

          <city/>

          <region/>

          <code/>

          <country/>
        </postal>

        <phone/>

        <facsimile/>

        <email>jie.dong@huawei.com</email>

        <uri/>
      </address>
    </author>

    <!--

-->

    <date day="01" month="March" year="2026"/>

    <abstract>
      <t>The current Border Gateway Protocol (BGP) specification does not
      incorporate network performance metrics, such as network latency, into
      its route selection process. This document describes a performance-aware
      BGP routing mechanism that uses network latency as a criterion for
      route selection. This approach is particularly beneficial for service
      providers with a global presence, enabling them to offer low-latency
      network connectivity as a value-added service to their customers.</t>
    </abstract>

    <note title="Requirements Language">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in <xref
      target="RFC2119">RFC 2119</xref>.</t>
    </note>
  </front>

  <middle>
    <section title="Introduction">
      <t>Cloud and/or network service providers (service providers in short)
      with global reach aim to deliver low-latency network connectivity
      service to their customers as a competitive advantage. Such
      connectivity may traverse more than one Autonomous System (AS) under
      their administration, and these ASes often span multiple continents.
      However, BGP [RFC4271], which is used for path selection across ASes,
      does not use the network latency metric in the route selection process.
      As such, the best route selected based on the existing BGP route
      selection criteria may not be the best from the customer experience
      perspective.</t>

      <t>This document describes a performance-aware BGP routing paradigm in
      which the network latency metric is disseminated via a new TLV of the
      AIGP attribute [RFC7311] and is then used as an input to the route
      selection process. This mechanism is useful for service providers
      with global reach, which usually own more than one AS, to deliver
      low-latency network connectivity service to their customers.</t>

      <t>Furthermore, to ensure backward compatibility with existing BGP
      implementations and maintain the stability of the overall routing
      system, it is expected that the performance-aware routing paradigm could
      coexist with the vanilla routing paradigm. As such, service providers
      could provide low-latency network connectivity service as a value-added
      service while still offering the vanilla routing service to meet
      customers' different requirements.</t>

      <t>For the sake of simplicity, this document considers only one network
      performance metric: the network latency metric. The support of multiple
      network performance metrics is out of scope of this document. In
      addition, this document focuses exclusively on BGP matters. All matters
      unrelated to BGP, such as the mechanisms for measuring network latency,
      are therefore outside the scope of this document.</t>

      <t>The performance-aware BGP routing paradigm has been successfully
      implemented in SONiC and is set to be open-sourced shortly. In addition,
      a variant of this performance-aware BGP routing paradigm has been
      implemented as well (see
      http://www.ist-mescal.org/roadmap/qbgp-demo.avi).</t>

    </section>

    <section anchor="Terminology" title="Terminology">
      <t>This memo makes use of the terms defined in <xref
      target="RFC4271"/>.</t>

      <t>Network latency indicates the amount of time it takes for a packet to
      traverse a given network path [RFC2679]. When a packet is forwarded
      along a path that contains multiple links and routers, the network
      latency is the sum of the transmission latency of each link (i.e.,
      link latency) plus the sum of the internal delay incurred within each
      router (i.e., router latency), which includes queuing latency and
      processing latency. The sum of the link latencies is also known as the
      cumulative link latency. In today's service provider networks, which
      usually span a wide geographical area, the cumulative link latency
      is the major part of the network latency, since the total internal
      latency incurred within high-capacity routers is small compared to
      the cumulative link latency. In other words, the cumulative link
      latency can approximately represent the network latency in such
      networks.</t>

      <t>Furthermore, since the link latency is more stable than the router
      latency, the approximate network latency represented by the cumulative
      link latency is also more stable. Therefore, if there is a way to
      calculate the cumulative link latency of a given network path, it is
      strongly recommended to use that cumulative link latency as an
      approximation of the network latency. Otherwise, the network
      latency would have to be measured frequently by some means (e.g., PING
      or other measurement tools).</t>
    </section>

    <section anchor="Advertising"
             title="Performance-aware Route Advertisement">
      <t>Performance-aware (i.e., latency-aware in the context of this
      document) routes SHOULD be exchanged between BGP peers by means of a
      specific Subsequent Address Family Identifier (SAFI) of TBD (see IANA
      Section) and also be carried as labeled routes as per [RFC3107]. To some
      extent, performance-aware routes can be regarded as specific labeled
      routes that are associated with the network latency metric.</t>

      <t>A BGP speaker SHOULD NOT advertise performance-aware routes to a
      particular BGP peer unless that peer indicates, through BGP capability
      advertisement (see Section 4), that it can process update messages with
      that specific SAFI field.</t>

      <t>Network latency metrics are attached to the performance-aware routes
      via a new TLV of the AIGP attribute, referred to as NETWORK_LATENCY TLV.
      The value of this TLV indicates the network latency in microseconds from
      the BGP speaker identified by the NEXT_HOP path attribute to the address
      identified by the NLRI prefix. The type code of this TLV is TBD (see
      IANA Section), and the value field is 4 octets in length. In abnormal
      cases where the cumulative link latency exceeds the maximum value of
      0xFFFFFFFF, the value field SHOULD be set to 0xFFFFFFFF. Note that the
      NETWORK_LATENCY TLV MUST NOT co-exist with the AIGP TLV within the same
      AIGP attribute.</t>
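
      <t>The TLV layout described above can be illustrated with a minimal,
      non-normative Python sketch. It assumes the generic AIGP TLV format of
      [RFC7311], i.e., a 1-octet Type, a 2-octet Length that counts the whole
      TLV, and then the Value. The type code 0xFE is a placeholder chosen
      purely for illustration, since the real value is TBD, and the function
      names are hypothetical.</t>

      <figure>
        <artwork><![CDATA[
```python
import struct

# Placeholder only: the real type code is TBD (IANA allocation).
TLV_TYPE = 0xFE
MAX_LATENCY_US = 0xFFFFFFFF
# Assumed RFC 7311 layout: Type (1) + Length (2) + Value (4) octets.
TLV_LENGTH = 7


def encode_network_latency(latency_us: int) -> bytes:
    """Encode a latency in microseconds, saturating at 0xFFFFFFFF."""
    if latency_us < 0:
        raise ValueError("latency must be non-negative")
    value = min(latency_us, MAX_LATENCY_US)
    return struct.pack("!BHI", TLV_TYPE, TLV_LENGTH, value)


def decode_network_latency(tlv: bytes) -> int:
    """Return the latency in microseconds carried in the TLV."""
    if len(tlv) != TLV_LENGTH:
        raise ValueError("unexpected TLV size")
    tlv_type, length, value = struct.unpack("!BHI", tlv)
    if tlv_type != TLV_TYPE or length != TLV_LENGTH:
        raise ValueError("not a NETWORK_LATENCY TLV")
    return value
```
]]></artwork>
      </figure>

      <t>In this sketch, a latency that does not fit in 4 octets is clamped
      to 0xFFFFFFFF before encoding, mirroring the saturation rule stated
      above.</t>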

      <t>A BGP speaker SHOULD be configurable to enable or disable the
      origination of performance-aware routes. If enabled, a local network
      latency value for a given to-be-originated performance-aware route MUST
      be configured on the BGP speaker so that it can be carried in the
      NETWORK_LATENCY TLV of that performance-aware route.</t>

      <t>A BGP speaker that is enabled to process the NETWORK_LATENCY TLV but
      was not provisioned with the local network latency value SHOULD set the
      value of the NETWORK_LATENCY TLV to zero when it advertises the
      corresponding route that it originated.</t>

      <t>When distributing a performance-aware route learnt from a BGP peer,
      if this BGP speaker has set itself as the NEXT_HOP of such route, the
      value of the NETWORK_LATENCY TLV SHOULD be increased by adding the
      network latency from itself to the previous NEXT_HOP of such route.
      Otherwise, the NETWORK_LATENCY TLV of such route MUST NOT be
      modified.</t>

      <t>How to obtain the network latency to a given BGP NEXT_HOP is
      outside the scope of this document. However, note that the path
      latency to the NEXT_HOP SHOULD approximately represent the network
      latency of the exact forwarding path towards the NEXT_HOP. For example,
      if a BGP speaker uses a Traffic Engineering (TE) Label Switched Path
      (LSP) or an SR policy route [RFC9256] from itself to the NEXT_HOP, rather
      than the shortest path calculated by the Interior Gateway Protocol
      (IGP), the latency to the NEXT_HOP SHOULD reflect the network latency of
      that TE LSP or SR policy route, rather than the IGP shortest path.
      In cases where the latency to the NEXT_HOP cannot be obtained for any
      reason, that latency SHOULD be set to 0xFFFFFFFF by default.</t>
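
      <t>The re-advertisement rule and the unknown-latency default described
      above can be sketched as follows. This is a non-normative illustration:
      the function and parameter names are hypothetical, and clamping the sum
      at 0xFFFFFFFF is an assumption derived from the size of the 4-octet
      value field.</t>

      <figure>
        <artwork><![CDATA[
```python
from typing import Optional

MAX_LATENCY_US = 0xFFFFFFFF


def latency_to_advertise(received_us: int,
                         to_prev_next_hop_us: Optional[int],
                         next_hop_self: bool) -> int:
    """Compute the NETWORK_LATENCY value for a re-advertised route.

    received_us: value carried in the received TLV.
    to_prev_next_hop_us: local latency to the previous NEXT_HOP,
        or None when it could not be obtained.
    next_hop_self: True if this speaker sets itself as NEXT_HOP.
    """
    if not next_hop_self:
        # NEXT_HOP unchanged: the TLV must not be modified.
        return received_us
    if to_prev_next_hop_us is None:
        # Unknown latency defaults to the maximum value.
        return MAX_LATENCY_US
    # Accumulate, saturating at the 4-octet maximum (assumption).
    return min(received_us + to_prev_next_hop_us, MAX_LATENCY_US)
```
]]></artwork>
      </figure>

      <t>For instance, a route received with a value of 1000 microseconds
      and re-advertised with next-hop-self by a speaker that is 250
      microseconds away from the previous NEXT_HOP would carry 1250
      microseconds.</t>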

      <t>To keep performance-aware routes stable enough, a BGP speaker SHOULD
      use a configurable threshold for network latency fluctuation to avoid
      sending any update which would otherwise be triggered by a minor network
      latency fluctuation below that threshold.</t>
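
      <t>One possible way to apply such a threshold is sketched below. This
      is a non-normative illustration with hypothetical names. Comparing
      against the last advertised value, rather than the last measured one,
      is a design assumption of this sketch so that a slow drift is
      eventually reported.</t>

      <figure>
        <artwork><![CDATA[
```python
def should_send_update(last_advertised_us: int,
                       current_us: int,
                       threshold_us: int) -> bool:
    """Return True when the latency change warrants a BGP update.

    A fluctuation strictly below the configured threshold is
    suppressed; anything at or above it triggers an update.
    """
    return abs(current_us - last_advertised_us) >= threshold_us
```
]]></artwork>
      </figure>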
    </section>

    <section title="Capability Advertisement">
      <t>A BGP speaker that uses multiprotocol extensions to advertise
      performance-aware routes SHOULD use the Capabilities Optional Parameter,
      as defined in [RFC5492], to inform its peers about this capability.</t>

      <t>The MP_EXT Capability Code, as defined in [RFC4760], is used to
      advertise the (AFI, SAFI) pairs available on a particular
      connection.</t>

      <t>A BGP speaker that implements the Performance-aware Routing
      Capability MUST support the BGP labeled route capability by default. In
      other words, a BGP speaker that advertises the Performance-aware Routing
      Capability to a peer using BGP Capabilities advertisement [RFC5492] does
      not have to advertise the BGP labeled route capability to that peer
      explicitly.</t>
    </section>

    <section title="Performance-aware Route Selection">
      <t>Performance-aware route selection only requires the following
      modification to the tie-breaking procedures of the BGP route selection
      decision (phase 2) described in [RFC4271]: the network latency metric
      comparison SHOULD be executed just ahead of the AS-Path Length
      comparison step. Prior to executing the network latency metric
      comparison, the value of the NETWORK_LATENCY TLV SHOULD be increased by
      adding the network latency from the BGP speaker to the NEXT_HOP of that
      route.</t>
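
      <t>The modified tie-breaking step described above can be sketched as
      follows. This is a non-normative illustration that assumes all
      candidate routes are already tied on the steps that precede the
      latency comparison, and it omits the tie-breakers that follow the
      AS-Path Length comparison. The Route type and the function names are
      hypothetical, and clamping the sum at 0xFFFFFFFF is an assumption.</t>

      <figure>
        <artwork><![CDATA[
```python
from dataclasses import dataclass
from typing import List

MAX_LATENCY_US = 0xFFFFFFFF


@dataclass
class Route:
    next_hop: str
    tlv_latency_us: int       # value received in NETWORK_LATENCY
    next_hop_latency_us: int  # local latency to the NEXT_HOP
    as_path_len: int


def effective_latency(route: Route) -> int:
    """Received TLV value plus the local latency to the NEXT_HOP."""
    total = route.tlv_latency_us + route.next_hop_latency_us
    return min(total, MAX_LATENCY_US)


def select_best(candidates: List[Route]) -> Route:
    """Prefer the lowest latency, then the shortest AS path."""
    return min(candidates,
               key=lambda r: (effective_latency(r), r.as_path_len))
```
]]></artwork>
      </figure>

      <t>In this sketch, a route with a lower effective latency wins even
      if its AS path is longer. The AS-Path Length is consulted only when
      the effective latencies are equal.</t>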

      <t>The Loc-RIB of the performance-aware routing paradigm is independent
      of that of the vanilla routing paradigm. Accordingly, the routing table
      of the performance-aware routing paradigm is independent of that of the
      vanilla routing paradigm.</t>

      <t>Whether the performance-aware routing paradigm or the vanilla routing
      paradigm is applied to a given packet is a local policy matter
      that is outside the scope of this document. For example, by leveraging
      the color-based BGP route resolution method, service routes marked
      with a certain color could be resolved over the performance-aware routes
      marked with the same color, which in turn could be resolved over the
      intra-AS routes (e.g., SR policy routes [RFC9256]) marked with the same
      color. Alternatively, by leveraging the CoS-Based Forwarding (CBF)
      capability, which allows routers to have distinct routing and forwarding
      tables for each type of traffic, the selected performance-aware routes
      could be installed in the routing and forwarding tables corresponding to
      high-priority traffic.</t>

      <section title="Deployment Considerations">
        <t>This section is not normative.</t>

        <t>Enabling performance-aware BGP routing at large (i.e., among
        domains that do not belong to the same administrative entity) may be
        conditioned by other administrative settlement considerations that are
        out of the scope of this document. Nevertheless, this document
        neither requires nor precludes activating the proposed route selection
        scheme between domains managed by distinct administrative entities.</t>

        <t>The main deployment case targeted by this specification is where
        involved domains are managed by the same administrative entity.
        Concretely, this performance-aware BGP routing mechanism can
        advantageously be enabled in a multi-domain environment, where all the
        involved domains are operated by the same administrative entity so
        that the processing of low-latency routes can be consistent throughout
        the domains. Besides security considerations that may arise (which are
        further discussed in Section 9), there is indeed a need to
        consistently enforce a performance-aware BGP routing policy within a
        set of domains that belong to the same administrative entity. This is
        motivated by the processing of traffic which is of very different
        nature and may have different QoS requirements. For instance, a BGP
        color extended community could be attached to the performance-aware
        routes so as to associate it with a low-latency Segment Routing (SR)
        policy route towards the BGP NEXT_HOP that is configured with the same
        color. In this way, traffic matching the performance-aware BGP routes
        would be forwarded to the BGP NEXT_HOP via the low-latency SR policy
        routes towards that BGP NEXT_HOP. Alternatively, the combined use of
        BGP performance-aware routing with traffic engineering tools that
        would lead to the computation and establishment of traffic-engineered
        paths between "performance-aware-routing"-enabled BGP peers based upon
        the manipulation of the Unidirectional Link delay sub-TLV [RFC7810]
        [RFC7471] would contribute to guaranteeing the overall consistency of
        the low-latency information within each domain.</t>

        <t>In network environments where route reflectors are deployed but
        next-hop-self is disabled on them, route reflectors usually reflect
        the received routes that are optimal (i.e., lowest latency) from
        their own perspective but may not be optimal from the receivers'
        perspectives. Some existing solutions, as described in [RFC7911],
        [I-D.ietf-idr-bgp-optimal-route-reflection], and [RFC6774], can be
        used to address this issue.</t>
      </section>
    </section>

    <section title="Contributors">
      <figure>
        <artwork><![CDATA[   Ning So
   Reliance
   Email: Ning.So@ril.com


   Yimin Shen
   Juniper
   Email: yshen@juniper.net


   Uma Chunduri
   Huawei
   Email: uma.chunduri@huawei.com


   Hui Ni
   Huawei
   Email: nihui@huawei.com


   Yongbing Fan
   China Telecom
   Email: fanyb@gsta.com


   Luis M. Contreras
   Telefonica I+D
   Email: luismiguel.contrerasmurillo@telefonica.com
]]></artwork>
      </figure>
    </section>

    <section anchor="Acknowledgements" title="Acknowledgements">
      <t>Thanks to Joel Halpern, Alvaro Retana, Jim Uttaro, Robert Raszuk,
      Eric Rosen, Bruno Decraene, Qing Zeng, Jie Dong, Mach Chen, Saikat Ray,
      Wes George, Jeff Haas, John Scudder, Stephane Litkowski and Sriganesh
      Kini for their valuable comments on this document. Special thanks should
      be given to Jim Uttaro and Eric Rosen for their proposal of using a new
      TLV of the AIGP attribute to convey the network latency metric. Thanks
      to Shawn Zhang for proposing the new name of this performance-based BGP
      routing paradigm: Performance-aware Routing, abbreviated as PAR.</t>
    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>A new BGP Capability Code for the Performance-aware Routing
      Capability, a new SAFI specific to the performance-aware routing
      paradigm, and a new type code for the NETWORK_LATENCY TLV of the AIGP
      attribute are required to be allocated by IANA.</t>

      <!---->
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>In addition to the considerations discussed in [RFC4271], the
      following items should be considered as well:</t>

      <t><list style="letters">
          <t>Manipulation of the NETWORK_LATENCY TLV value by an illegitimate
          party may influence the route selection results. Therefore, the
          Performance-aware Routing Capability negotiation between BGP peers
          that belong to different administrative domains MUST be disabled by
          default. Furthermore, a BGP speaker MUST discard all
          performance-aware routes received from a BGP peer for which the
          Performance-aware Routing Capability negotiation has been
          disabled.</t>

          <t>Frequent updates of the NETWORK_LATENCY TLV may have a severe
          impact on the stability of the routing system. Such practice SHOULD
          be avoided by setting a reasonable threshold for network latency
          fluctuation.</t>
        </list></t>

      <!---->
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include='reference.RFC.2119'?>

      <?rfc include='reference.RFC.4271'?>

      <?rfc include='reference.RFC.5492'?>

      <?rfc include='reference.RFC.4760'?>

      <?rfc include='reference.RFC.3107'?>

      <?rfc include='reference.RFC.7311'?>

      <!---->
    </references>

    <references title="Informative References">
      <?rfc include='reference.RFC.2679'?>

      <?rfc include='reference.RFC.3630'?>

      <?rfc include='reference.RFC.5305'?>

      <?rfc include='reference.RFC.6774'?>

      <?rfc include='reference.I-D.ietf-idr-bgp-optimal-route-reflection'?>

      <?rfc include='reference.RFC.9256'?>

      <?rfc include='reference.RFC.7911'?>

      <?rfc include='reference.RFC.7471'?>

      <?rfc include='reference.RFC.7810'?>


      <!---->
    </references>
  </back>
</rfc>
