Path Selection Process - Versatile Routing and Services with BGP: Understanding and Implementing BGP in SR-OS (2014)


Appendix A. Path Selection Process

This appendix explains the BGP path selection process and provides some additional detail about MED comparison based on parameter settings.

Best-Path Selection Algorithm

The BGP path decision process compares paths to the same destination prefix that are held in the Adj-RIB-In and defines a degree of preference for a path (or paths), which in turn is advertised to peers (subject to Adj-RIB-Out policy).

If the Next-Hop attribute of a BGP route is an address that is not reachable (resolvable), the route is not considered as part of the decision process. The process follows these steps:

1. Select the route from the hierarchy of routes learned from different protocols. In SR-OS this is indicated as preference, and the route learned through the protocol with the lowest preference value is considered the best. (Note that IBGP and EBGP both have a preference value of 170.)

2. Select the route with the highest Local Preference (LOCAL_PREF attribute).

3. Select the route with the least number of Autonomous Systems in its AS_PATH attribute (unless as-path-ignore is configured). An AS_SET counts as one AS.

4. Select the route with the lowest ORIGIN attribute (where IGP = 0, EGP = 1, and incomplete = 2).

5. Select the route with the lowest MED value if one of the following applies:

i. Both routes have the MED attribute and were advertised by the same neighbor AS (leftmost AS in the AS_PATH).

ii. The routes were advertised by different neighbor ASs, but always-compare-med is configured without the strict-as option.

iii. One or both routes do not have the MED attribute, but always-compare-med is configured with the zero or infinity option, which indicates the MED value to assume for routes that do not have the attribute.

6. Select the route learned from an EBGP peer over the route learned from an IBGP peer.

7. Select the route with the lowest IGP distance to the BGP Next-Hop of the route (unless ignore-nh-metric is configured). If the BGP Next-Hop is resolved by an LSP (for example, IGP shortcuts or BGP-VPN routes), the cost from the tunnel-table is used.

8. Select the route with the lowest ORIGINATOR_ID, or the route received from the peer with the lowest BGP Identifier (unless ignore-router-id is configured and the routes being compared are EBGP routes).

9. Select the route with the shortest CLUSTER list. An empty cluster list is considered to have a length of 0.

10. Select the route received from the lowest peer IP address.
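The ordering of steps 2 through 10 can be sketched as a pairwise comparison. The following Python is only an illustration under stated assumptions, not SR-OS code: the `Path` class, its field names, and the `prefer` function are hypothetical; step 1 (protocol preference) is left out because it is applied across protocols rather than between BGP paths; the MED step models only the default behavior (condition i); an AS_SET is not modeled (each AS_PATH element counts as one AS); and the as-path-ignore, ignore-nh-metric, and ignore-router-id options are ignored.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address
from typing import Optional, Tuple


@dataclass(frozen=True)
class Path:
    """Hypothetical container for the attributes used in steps 2-10."""
    local_pref: int
    as_path: Tuple[int, ...]      # leftmost element is the neighbor AS
    origin: int                   # 0 = IGP, 1 = EGP, 2 = incomplete
    med: Optional[int]            # None when the attribute is absent
    ebgp: bool                    # True if learned from an EBGP peer
    igp_cost: int                 # IGP distance to the BGP Next-Hop
    router_id: str                # ORIGINATOR_ID if present, else peer BGP Identifier
    cluster_len: int              # length of the cluster list (0 if empty)
    peer_ip: str


def neighbor_as(p: Path) -> Optional[int]:
    return p.as_path[0] if p.as_path else None


def prefer(a: Path, b: Path) -> Path:
    """Return the preferred of two paths; in every pair below, lower wins."""
    med_comparable = (a.med is not None and b.med is not None
                      and neighbor_as(a) == neighbor_as(b))
    keys = [
        (-a.local_pref, -b.local_pref),                  # step 2: highest LOCAL_PREF
        (len(a.as_path), len(b.as_path)),                # step 3: shortest AS_PATH
        (a.origin, b.origin),                            # step 4: lowest ORIGIN
        (a.med, b.med) if med_comparable else (0, 0),    # step 5: lowest MED, else skip
        (not a.ebgp, not b.ebgp),                        # step 6: EBGP over IBGP
        (a.igp_cost, b.igp_cost),                        # step 7: lowest IGP distance
        (int(IPv4Address(a.router_id)),
         int(IPv4Address(b.router_id))),                 # step 8: lowest identifier
        (a.cluster_len, b.cluster_len),                  # step 9: shortest cluster list
        (int(IPv4Address(a.peer_ip)),
         int(IPv4Address(b.peer_ip))),                   # step 10: lowest peer address
    ]
    for key_a, key_b in keys:
        if key_a != key_b:
            return a if key_a < key_b else b
    return a  # identical on every step; keep the first path
```

Because the MED step applies only to some pairs of paths, the comparison is written pairwise rather than as a single sort key.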

Always-Compare-MED

As indicated previously in Step 5, the MED attribute is typically used in the decision process only if both routes have the attribute present and come from the same neighboring AS. There are, however, some exceptions depending on router configuration, notably the use of the always-compare-med command and the strict-as keyword. Table A.1 shows the influence each command/keyword has on the path selection algorithm.


MED values of VPN-IPv4 routes are always compared, even when always-compare-med is disabled (the default). This behavior is historic and allows for sites of the same VPN to belong to different Autonomous Systems. If this behavior is undesirable, you can disable it using the always-compare-med strict-as command.

Table A.1 MED Comparison with always-compare-med

| Command | MED Comparison |
| --- | --- |
| always-compare-med disabled | Only compare the MED of two paths if they come from the same neighbor-AS and both paths have a MED attribute. Otherwise skip the step. |
| always-compare-med | Only compare the MED of two paths (whether or not they are from the same neighbor-AS) if they both have a MED attribute. Otherwise skip the step. |
| always-compare-med zero | Always compare the MED of two paths, even if they are from a different neighbor-AS. If one or both paths do not have a MED attribute, consider the MED to be zero. |
| always-compare-med infinity | Always compare the MED of two paths, even if they are from a different neighbor-AS. If one or both paths do not have a MED attribute, consider the MED to be infinite. |
| always-compare-med strict-as zero | Only compare the MED of two paths if they come from the same neighbor-AS. If one or both paths do not have a MED attribute, consider the MED to be zero. |
| always-compare-med strict-as infinity | Only compare the MED of two paths if they come from the same neighbor-AS. If one or both paths do not have a MED attribute, consider the MED to be infinite. |
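The rows of Table A.1 can be read as a small decision function. The sketch below is a hypothetical Python rendering of the table and not SR-OS code: the function name `med_pair`, the use of the command strings as a `mode` argument, and the use of a floating-point infinity to stand in for an "infinite" MED are all assumptions made for illustration.

```python
from typing import Optional, Tuple, Union

Med = Union[int, float]
INFINITE_MED = float("inf")  # assumption: stand-in for "consider the MED to be infinite"


def med_pair(med_a: Optional[int], med_b: Optional[int],
             same_neighbor_as: bool,
             mode: str = "disabled") -> Optional[Tuple[Med, Med]]:
    """Return the two MED values to compare, or None if the MED step is skipped.

    mode is "disabled" or one of the command strings from Table A.1,
    for example "always-compare-med strict-as zero".
    """
    # Rows that restrict the comparison to paths from the same neighbor-AS.
    same_as_only = mode == "disabled" or "strict-as" in mode
    if same_as_only and not same_neighbor_as:
        return None

    # Value to assume for a missing MED attribute, if the mode defines one.
    if mode.endswith("zero"):
        default: Optional[Med] = 0
    elif mode.endswith("infinity"):
        default = INFINITE_MED
    else:
        default = None

    a = default if med_a is None else med_a
    b = default if med_b is None else med_b
    if a is None or b is None:
        return None  # a MED is missing and no assumed value is configured
    return (a, b)
```

For example, `med_pair(5, None, False, "always-compare-med zero")` yields `(5, 0)`, so the path without a MED attribute would win the MED step, while the same call with `"always-compare-med strict-as zero"` yields `None` because the neighbor ASs differ.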

Deterministic MED

In some environments the outcome of the BGP path selection process can be unpredictable and potentially lead to route oscillation because it depends on the order in which routes are learned. Consider the example shown in Figure A.1 where three external peers are advertising the prefix 172.16.32.0/20 with different AS paths and MED values.

Figure A.1 Deterministic MED


Using router R3 as the calculating router, assume that routes are learned from peers in the order A, then B, then C.

When route A is received, it is the only route to the prefix 172.16.32.0/20, so it is automatically the best route. When route B arrives, it is compared to route A (the current best path). Because the neighbor ASs of routes A and B are different, the always-compare-med configuration option determines whether the MEDs of the two routes are comparable. For the sake of example, assume the always-compare-med option is not enabled, so the MED step is skipped and route A remains the best path because of its lower BGP identifier. When route C arrives, it is compared to route A, and because the neighbor ASs are the same, route C is selected as the new best path because it has the lower MED value.

Output A-1: Best Path with Routes Received A-B-C

*A:R3# show router bgp routes
==================================================================
 BGP Router ID:192.168.0.11     AS:200     Local AS:200
==================================================================
 Legend -
 Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
 Origin codes : i - IGP, e - EGP, ? - incomplete, > - best, b - backup
==================================================================
BGP IPv4 Routes
==================================================================
Flag  Network                              LocalPref   MED
      Nexthop                              Path-Id     Label
      As-Path
------------------------------------------------------------------
u*>i  172.16.32.0/20                       100         2
      192.168.0.22                         33          -
      64509
*i    172.16.32.0/20                       100         5
      192.168.0.13                         31          -
      64509
*i    172.16.32.0/20                       100         10
      192.168.0.21                         32          -
      64510
------------------------------------------------------------------
Routes : 3
==================================================================

Next, consider an example where the routes are received in the order A, then C, then B. When route A is received, it is the only route to the prefix 172.16.32.0/20, so it is automatically the best route. When route C arrives, it is compared to route A (the current best path), and because the neighbor AS is the same, route C is installed as the best path due to its lower MED. When route B arrives, it is compared to route C. Because the neighbor AS is different and always-compare-med is not enabled, the MED is not compared, and route B becomes the best path because it has the lower BGP identifier.

Output A-2: Best Path with Routes Received A-C-B

*A:R3# show router bgp routes
=========================================================================
 BGP Router ID:192.168.0.11     AS:200     Local AS:200
=========================================================================
 Legend -
 Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
 Origin codes : i - IGP, e - EGP, ? - incomplete, > - best, b - backup
=========================================================================
BGP IPv4 Routes
=========================================================================
Flag  Network                              LocalPref   MED
      Nexthop                              Path-Id     Label
      As-Path
-------------------------------------------------------------------------
u*>i  172.16.32.0/20                       100         10
      192.168.0.21                         36          -
      64510
*i    172.16.32.0/20                       100         2
      192.168.0.22                         35          -
      64509
*i    172.16.32.0/20                       100         5
      192.168.0.13                         34          -
      64509
-------------------------------------------------------------------------
Routes : 3
=========================================================================
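The order dependence in the two walkthroughs can be reproduced with a short simulation. This is a simplified Python sketch, not the router's implementation: it models only the MED step (with always-compare-med disabled) and the BGP identifier tie-break, it assumes every earlier step ties, and it assumes each route's BGP identifier equals the Nexthop value shown in the outputs. The names `ROUTES`, `better`, and `best_after` are hypothetical.

```python
from ipaddress import IPv4Address

# Values taken from Outputs A-1 and A-2; router_id is assumed equal to the next hop.
ROUTES = {
    "A": {"neighbor_as": 64509, "med": 5, "router_id": "192.168.0.13"},
    "B": {"neighbor_as": 64510, "med": 10, "router_id": "192.168.0.21"},
    "C": {"neighbor_as": 64509, "med": 2, "router_id": "192.168.0.22"},
}


def better(x: str, y: str) -> str:
    """Pairwise comparison with always-compare-med disabled."""
    rx, ry = ROUTES[x], ROUTES[y]
    # MED is only comparable when both routes come from the same neighbor AS.
    if rx["neighbor_as"] == ry["neighbor_as"] and rx["med"] != ry["med"]:
        return x if rx["med"] < ry["med"] else y
    # Otherwise fall through to the lowest BGP identifier.
    return x if IPv4Address(rx["router_id"]) < IPv4Address(ry["router_id"]) else y


def best_after(arrival_order):
    """Compare each newly arrived route against the current best path only."""
    best = arrival_order[0]
    for name in arrival_order[1:]:
        best = better(best, name)
    return best


print(best_after(["A", "B", "C"]))  # C, as in Output A-1
print(best_after(["A", "C", "B"]))  # B, as in Output A-2
```

The same three routes yield different best paths purely because each arrival is compared against whichever route currently holds the best-path position.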

The Deterministic MED feature overcomes this problem and changes how MED comparisons are done to ensure deterministic best path selections. The main change is to always group received routes by neighbor AS (first AS in the AS_PATH or the local AS if the AS_PATH is empty). Within each group, BGP selects the best path. (The configuration of always-compare-med does not matter for this step.) Finally, BGP compares all the “group-best paths,” and for this step the configuration of always-compare-med is relevant. If one path remains after this final MED comparison step, this is the overall best path. If multiple paths remain, further rules of the decision must be evaluated.

Consider again the preceding example. Routes A and C belong to the same neighbor AS group, and the comparison of these two paths always selects route C as the group-best (lowest MED). With always-compare-med disabled, the MEDs of the group-best paths cannot be compared, so further rules must be evaluated. Route B is ultimately selected over route C as the best path in this example because it has the lower BGP identifier. When deterministic MED is enabled, route B will always be selected as best regardless of the arrival order of routes A, B, and C.
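The grouping behavior can be checked with a minimal Python sketch. It is an illustration under simplifying assumptions rather than the SR-OS algorithm: within a group only the MED is compared (with the BGP identifier as a tie-break), the group-best paths are compared by BGP identifier only because always-compare-med is taken to be disabled, and each route's BGP identifier is assumed to equal its next hop. The names `ROUTES` and `deterministic_best` are hypothetical.

```python
from ipaddress import IPv4Address
from itertools import permutations

# Same three routes as in the example; router_id is assumed equal to the next hop.
ROUTES = {
    "A": {"neighbor_as": 64509, "med": 5, "router_id": "192.168.0.13"},
    "B": {"neighbor_as": 64510, "med": 10, "router_id": "192.168.0.21"},
    "C": {"neighbor_as": 64509, "med": 2, "router_id": "192.168.0.22"},
}


def router_id(name: str) -> IPv4Address:
    return IPv4Address(ROUTES[name]["router_id"])


def deterministic_best(arrival_order):
    # 1. Group the received routes by neighbor AS.
    groups = {}
    for name in arrival_order:
        groups.setdefault(ROUTES[name]["neighbor_as"], []).append(name)

    # 2. Select the group-best path (lowest MED) within each group.
    group_best = [
        min(members, key=lambda n: (ROUTES[n]["med"], router_id(n)))
        for members in groups.values()
    ]

    # 3. Compare the group-best paths. With always-compare-med disabled their
    #    MEDs are not comparable, so the lowest BGP identifier decides.
    return min(group_best, key=router_id)


# Every arrival order of A, B, and C gives the same answer.
results = {deterministic_best(order) for order in permutations("ABC")}
print(results)  # {'B'}
```

Because the grouping is applied to the full set of routes rather than to the current best path alone, the arrival order no longer influences the result.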

The deterministic-med function and always-compare-med are both enabled in the best-path-selection node of the base BGP context. As indicated previously, enabling deterministic MED can be considered best practice to provide deterministic path selection and also avoid potential route oscillation.

Output A-3: Deterministic MED Configuration

router
    bgp
        best-path-selection
            always-compare-med strict-as zero
            deterministic-med
        exit

References and Glossary

References

1. RFC 4271: A Border Gateway Protocol 4 (BGP-4)
2. RFC 5492: Capabilities Advertisement with BGP-4
3. RFC 4760: Multi-Protocol Extensions for BGP-4
4. RFC 4360: BGP Extended Communities Attribute
5. RFC 4364: BGP/MPLS IP-VPNs
6. RFC 2918: Route Refresh Capability for BGP-4
7. RFC 4684: Constrained Route Distribution for BGP/MPLS IP-VPNs
8. RFC 3107: Carrying Label Information in BGP-4
9. draft-ietf-mpls-seamless-mpls: Seamless MPLS Architecture
10. RFC 4724: Graceful Restart Mechanism for BGP
11. RFC 4761: VPLS Using BGP for Auto-Discovery and Signaling
12. draft-ietf-l2vpn-vpls-multihoming: BGP based Multi-Homing in VPLS
13. RFC 6624: Layer-2 VPNs Using BGP for Auto-Discovery and Signaling
14. RFC 6513: Multicast in BGP/MPLS IP-VPNs
15. RFC 6037: Cisco Systems' Solution for Multicast in BGP/MPLS IP-VPNs
16. draft-ietf-grow-ops-reqs-for-bgp-error-handling: Operational Requirements for Enhanced Error Handling Behavior in BGP-4
17. RFC 4798: Connecting IPv6 Islands over IPv4 MPLS Using IPv6 Provider Edge Routers
18. draft-ietf-idr-best-external: Advertisement of the Best External Route in BGP
19. draft-ietf-pwe3-dynamic-ms-pw: Dynamic Placement of Multi-Segment Pseudowires
20. draft-ietf-l2vpn-evpn: BGP/MPLS Based Ethernet VPN
21. RFC 5575: Flow Specification
22. RFC 3704: Ingress Filtering for Multi-Homed Networks
23. RFC 5082: The Generalized TTL Security Mechanism (GTSM)
24. RFC 5331: Upstream Label Assignment and Context-Specific Label Space
25. draft-ietf-idr-bgp-gr-notification: Notification Support for BGP Graceful Restart
26. draft-ietf-idr-ls-distribution: Advertising Link State Information in BGP
27. draft-ietf-sidr-pfx-validate: BGP Prefix Origin Validation
28. RFC 6810: RPKI-Router Protocol
29. draft-ietf-sidr-origin-validation-signaling: BGP Prefix Origin Validation State Extended Community

Glossary

ABR: Area Border Router
AD (or A-D): Auto-Discovery
AF: Assured Forwarding
AFI: Address Family Identifier
AGI: Attachment Group Identifier
AGN: Aggregation Node
ALTO: Application-Layer Traffic Optimization
AN: Access Node
ARP: Address Resolution Protocol
AS: Autonomous System
ASBR: Autonomous System Border Router
ASN: Autonomous System Number
BE: Best Effort
BFD: Bidirectional Forwarding Detection
BGP: Border Gateway Protocol
BNG: Broadband Network Gateway
CE: Customer Edge
CMS: Cloud Management System
CSC: Carrier Supporting Carrier
CSV: Circuit Status Vector
DF: Designated Forwarder
DHCP: Dynamic Host Configuration Protocol
EBGP: Exterior BGP
ECMP: Equal Cost Multi-Path
EF: Expedited Forwarding
EOR: End of RIB (Marker)
ERO: Explicit Route Object
ESI: Ethernet Segment Identifier
EVI: Ethernet VPN Instance
FC: Forwarding Class
FDB: Forwarding Database
FEC: Forwarding Equivalence Class
FIB: Forwarding Information Base
FSM: Finite State Machine
GR: Graceful Restart
GRE: Generic Routing Encapsulation
GUA: Globally Unique Address
I-PMSI: Inclusive PMSI
IBGP: Interior BGP
IGP: Interior Gateway Protocol
IMM: Integrated Media Module
IOM: Input Output Module
KVM: Kernel-Based Virtual Machine
LAG: Link Aggregation Group
LB: Label Base
LDP: Label Distribution Protocol
LSP: Label Switched Path
MDT: Multicast Distribution Tree
MEP: Maintenance Endpoint
MH-ID: Multi-Homed Identifier
MP2MP: MultiPoint to MultiPoint
MRAI: Minimum Route Advertisement Interval
MS-PW: Multi-Segment Pseudowire
MSDP: Multicast Source Discovery Protocol
MTU: Maximum Transmission Unit
MVPN: Multicast VPN
NAT: Network Address Translation
NCP: Network Control Protocol
NHLFE: Next-Hop Label Forwarding Entry
NLRI: Network Layer Reachability Information
NSF: Non-Stop Forwarding
NSH: Next Signalling Hop
NSR: Non-Stop Routing
NVE: Network Virtualization Edge
NVO: Network Virtualization Overlay
ORF: Outbound Route Filtering
ORR: Optimal Route Reflection
P2MP: Point to MultiPoint
PCE: Path Computation Element
PDU: Protocol Data Unit
PE: Provider Edge
PIC: Prefix Independent Convergence
PIM: Protocol Independent Multicast
PLR: Point of Local Repair
PMSI: Provider Multicast Service Instance
PPP: Point-to-Point Protocol
QOS: Quality of Service
QPPB: QOS Policy Propagation Using BGP
RD: Route Distinguisher
RG: Residential Gateway
RIB: Routing Information Base
ROA: Route Origin Authorization
RP: Rendezvous Point
RPF: Reverse Path Forwarding
RPKI: Resource Public Key Infrastructure
RPT: Rendezvous Point Tree
RR: Route-Reflector
RSVP: Resource Reservation Protocol
RT: Route Target
RTBH: Remote Triggered Black-Holing
RTM: Route Table Manager
S-PE: Switching PE
S-PMSI: Selective PMSI
S2L: Source To Leaf
SA: Source Active
SAII: Source Attachment Individual Identifier
SAFI: Subsequent Address Family Identifier
SAP: Service Access Point
SDP: Service Distribution Point
SPT: Shortest Path Tree
SR-OS: Service Router Operating System
SSM: Source-Specific Multicast
STP: Spanning Tree Protocol
T-PE: Terminating PE
TAII: Target Attachment Individual Identifier
TCP: Transmission Control Protocol
TED: Traffic Engineering Database
TLV: Type Length Value
TTL: Time-To-Live
UMH: Upstream Multicast Hop
URPF: Unicast RPF
VA: Virtual Application
VBO: VE Block Offset
VBS: VE Block Size
VE: VPLS Edge
VE ID: VPLS Edge Identifier
VID: VLAN Identifier
VM: Virtual Machine
VPLS: Virtual Private LAN Service
VPRN: Virtual Private Routed Network
VPWS: Virtual Private Wire Service
VRF: VPN Routing and Forwarding
VRP: Validated ROA Payload
VRR: Virtual Route-Reflector
VRRP: Virtual Router Redundancy Protocol
VSI: VPLS Switch Instance
VTEP: VXLAN Tunnel End Point
VXLAN: Virtual eXtensible Local Area Network