BGP Troubleshooting

Problem:
BGP to provider 1 does not work, unlike BGP to provider 2:

Cause: The session state is not Established but only Connect.

Check BGP Provider1:

user@router1> show bgp summary
Groups: 2 Peers: 2 Down peers: 2
Peer                     AS     InPkt     OutPkt   OutQ   Flaps Last Up/Dwn State|#Active/Received/Accepted/Damped...
185.9.110.16          62460         0         0       0       0       32:40 Active
193.159.167.177       3320         0         0       0       0       32:40 Active

Check BGP Provider2:

user@router2> show bgp summary
Groups: 2 Peers: 2 Down peers: 1
Peer                     AS     InPkt     OutPkt   OutQ   Flaps Last Up/Dwn State|#Active/Received/Accepted/Damped...
31.3.80.101         196714     16223     18286       0     21 5d 17:12:17 Establ
WAN.inet.0: 1/1/1/0
185.9.108.16         62460        72         72       0       3     1w2d22h Active
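
Comparing the two summaries by eye works for two peers, but the same check can be scripted. The following is a minimal Python sketch, assuming the column layout shown above (peer address first, state last) and assuming that a peer counts as established when the last column starts with "Establ" or shows route counts such as 1/1/1/0. The function names parse_bgp_summary and down_peers are made up for this example.

```python
import ipaddress

# Sample taken from the router2 output above.
SAMPLE = """\
Groups: 2 Peers: 2 Down peers: 1
Peer                     AS     InPkt     OutPkt   OutQ   Flaps Last Up/Dwn State|#Active/Received/Accepted/Damped...
31.3.80.101         196714     16223     18286       0     21 5d 17:12:17 Establ
WAN.inet.0: 1/1/1/0
185.9.108.16         62460        72         72       0       3     1w2d22h Active
"""


def parse_bgp_summary(text):
    """Return one dict per peer line found in 'show bgp summary' output."""
    peers = []
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 8:
            continue  # header, table or blank line
        try:
            ipaddress.ip_address(tokens[0])
        except ValueError:
            continue  # first column is not a peer address
        state = tokens[-1]
        peers.append({
            "peer": tokens[0],
            "as": int(tokens[1]),
            "flaps": int(tokens[5]),
            # "Last Up/Dwn" may be one token (1w2d22h) or two (5d 17:12:17).
            "last_up_down": " ".join(tokens[6:-1]),
            "state": state,
            # Assumption: "Establ" or route counts mean the session is up.
            "established": state.startswith("Establ") or "/" in state,
        })
    return peers


def down_peers(text):
    """Return only the peers whose session is not established."""
    return [p for p in parse_bgp_summary(text) if not p["established"]]


if __name__ == "__main__":
    for p in down_peers(SAMPLE):
        print(p["peer"], p["state"], p["last_up_down"])
```

Run against the router1 output, this would list both peers as down.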

 

Restart routing

restart routing immediately|gracefully

Check BGP State

Current state of the BGP session:

  • Active—BGP is initiating a transport protocol connection in an attempt to connect to a peer. If the connection is successful, BGP sends an open message.
  • Connect—BGP is waiting for the transport protocol connection to complete.
  • Established—The BGP session has been established, and the peers are exchanging update messages.
  • Idle—Either the BGP license check failed, or this is the first stage of a connection and BGP is waiting for a Start event.
  • OpenConfirm—BGP has acknowledged receipt of an open message from the peer and is waiting to receive a keepalive or notification message.
  • OpenSent—BGP has sent an open message and is waiting to receive an open message from the peer.
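
As a rule of thumb, the states above split into two groups: those where the TCP session is not up yet, and those that can only be reached once it is. The Python sketch below encodes that split. The function names and the hint texts are assumptions made for illustration, and the functions expect the full state names, not the abbreviated "Establ" from the summary output.

```python
# BGP session states as listed above, split by whether the TCP session is up.
TCP_NOT_UP = {"Idle", "Connect", "Active"}
TCP_UP = {"OpenSent", "OpenConfirm", "Established"}


def tcp_session_is_up(state):
    """Return True if the state can only be reached with TCP established."""
    if state in TCP_UP:
        return True
    if state in TCP_NOT_UP:
        return False
    raise ValueError(f"unknown BGP state: {state!r}")


def where_to_look(state):
    """Rough troubleshooting hint for a session in this state."""
    if state == "Established":
        return "session is up"
    if tcp_session_is_up(state):
        return "TCP is up: check the OPEN parameters (peer AS, hold time)"
    return "TCP is not up: check reachability, TCP port 179, authentication key"


if __name__ == "__main__":
    for s in ("Active", "Connect", "OpenSent", "Established"):
        print(s, "->", where_to_look(s))
```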

So we have a problem with the TCP connection:

show interfaces ge-4/0/0
Physical interface: ge-4/0/0, Enabled, Physical link is Up
  Interface index: 160, SNMP ifIndex: 551
  Description: WAN-DTAG-ADVA-LWL-Port-1
  Link-level type: Ethernet, MTU: 1514, Link-mode: Full-duplex, Speed: 1000mbps, BPDU Error: None, MAC-REWRITE Error: None, Loopback: Disabled, Source filtering: Disabled, Flow control: Enabled,
  Auto-negotiation: Enabled, Remote fault: Online
  Device flags   : Present Running
  Interface flags: SNMP-Traps Internal: 0x0
  Link flags     : None
  CoS queues     : 8 supported, 8 maximum usable queues
  Current address: 3c:61:04:8f:ff:64, Hardware address: 3c:61:04:8f:ff:64
  Last flapped   : 2014-04-18 20:07:12 UTC (00:03:31 ago)
  Input rate     : 0 bps (0 pps)
  Output rate    : 0 bps (0 pps)
  Active alarms  : None
  Active defects : None
  Interface transmit statistics: Disabled

  Logical interface ge-4/0/0.0 (Index 84) (SNMP ifIndex 560)
    Flags: SNMP-Traps 0x0 Encapsulation: ENET2
    Input packets : 0
    Output packets: 480
    Security: Zone: Null
    Protocol inet, MTU: 1500
      Flags: Sendbcast-pkt-to-re, Is-Primary
      Addresses, Flags: Is-Preferred Is-Primary
        Destination: 193.159.167.176/30, Local: 193.159.167.178, Broadcast: 193.159.167.179
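
The interface is up and carries 193.159.167.178 in 193.159.167.176/30, yet the input packet counter is 0. Before looking further, it is worth confirming that the configured peer address really lies in the interface subnet. A minimal Python sketch using the standard ipaddress module follows; the helper names same_subnet and usable_hosts are made up for this example.

```python
import ipaddress


def same_subnet(local_with_prefix, peer):
    """Return True if the peer address lies in the local interface subnet."""
    iface = ipaddress.ip_interface(local_with_prefix)
    return ipaddress.ip_address(peer) in iface.network


def usable_hosts(network):
    """Return the usable host addresses of a network as strings."""
    return [str(h) for h in ipaddress.ip_network(network).hosts()]


if __name__ == "__main__":
    # Values taken from the interface output and the BGP summary above.
    print(same_subnet("193.159.167.178/30", "193.159.167.177"))
    print(usable_hosts("193.159.167.176/30"))
```

For a /30 there are exactly two usable addresses, so the peer must be the one that is not the local address.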

 

The BGP neighbor details reveal more:

show bgp neighbor

Peer: 193.159.167.177+179 AS 3320 Local: 193.159.167.178 AS 62460
  Type: External    State: Connect        Flags: <ImportEval>
  Last State: Connect       Last Event: ConnectRetry
  Last Error: None
  Export: [ AS3320_v4_EXPORT ] Import: [ AS3320_v4_IMPORT ]
  Options: <Preference LocalAddress AuthKey PeerAS Refresh>
  Authentication key is configured
  Local Address: 193.159.167.178 Holdtime: 90 Preference: 100
  Number of flaps: 0
  Trace options: open, update
  Trace file: /var/log/WAN-bgp size 0 files 10
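
The fields that matter here are State, Last State, Last Event and whether an authentication key is configured. The following Python sketch pulls them out of the text, assuming the layout shown above. The function name parse_neighbor and the dictionary keys are made up for this example.

```python
import re

# Shortened sample taken from the neighbor output above.
SAMPLE = """\
Peer: 193.159.167.177+179 AS 3320 Local: 193.159.167.178 AS 62460
  Type: External    State: Connect        Flags: <ImportEval>
  Last State: Connect       Last Event: ConnectRetry
  Last Error: None
  Options: <Preference LocalAddress AuthKey PeerAS Refresh>
  Authentication key is configured
"""


def parse_neighbor(text):
    """Extract the session state fields from 'show bgp neighbor' output."""

    def grab(pattern):
        m = re.search(pattern, text, re.MULTILINE)
        return m.group(1) if m else None

    # The peer may be printed as address+port, for example 193.159.167.177+179.
    peer = re.search(
        r"^Peer:\s+([0-9A-Fa-f.:]+)(?:\+(\d+))?\s+AS\s+(\d+)", text, re.MULTILINE
    )
    return {
        "peer": peer.group(1) if peer else None,
        "port": int(peer.group(2)) if peer and peer.group(2) else None,
        "peer_as": int(peer.group(3)) if peer else None,
        # Anchor on "Type:" so that "Last State:" is not matched by mistake.
        "state": grab(r"^\s*Type:\s+\S+\s+State:\s+(\S+)"),
        "last_state": grab(r"Last State:\s+(\S+)"),
        "last_event": grab(r"Last Event:\s+(\S+)"),
        "last_error": grab(r"Last Error:\s+(.+)$"),
        "auth_key": "Authentication key is configured" in text,
    }


if __name__ == "__main__":
    for key, value in parse_neighbor(SAMPLE).items():
        print(f"{key}: {value}")
```

A session that sits in Connect with ConnectRetry as the last event and an authentication key configured is a hint to verify the key on both sides, since a mismatch prevents the TCP session from coming up.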

 

user@router1> show bgp replication
Synchronization master:
  Session state: Down, Since: 28:25
  Flaps: 0
  Protocol state: Idle, Since: 28:25
  Synchronization state: NotStarted
  Number of peers waiting: AckWait: 0, SoWait: 0, Scheduled: 0
  Messages sent: Open 0, Establish 0, Update 0, Error 0, Complete 0
  Messages received: Open 0, Request 0 wildcard 0 targeted, EstablishAck 0, CompleteAck 0

 

ACTIVE and CONNECT are basically TCP states

BGP relies on TCP for transport, so the TCP connection must be established before BGP can proceed; hence these separate states.

I have found many different explanations; here is one I put together. Both states try to accomplish the same thing, but they result from different events.

The CONNECT state is the result of a start event, which can be automatic or triggered by an administrator, for example by configuring BGP on an interface and committing the change. Once the start event occurs, BGP initializes its resources, tries to establish the TCP session, and transitions to the CONNECT state.

In the CONNECT state, BGP waits for the peer's response that completes the TCP three-way handshake.

Among other things, BGP starts the ConnectRetryTimer. If the connection succeeds, BGP sends the OPEN message to the peer and transitions to the OPENSENT state.
If TCP errors cause the connection to fail, BGP closes the connection and goes back to IDLE.
If the connection has not been established by the time the retry timer expires, BGP transitions to the ACTIVE state and tries again to establish the TCP session. If that succeeds, it sends the OPEN message and transitions to OPENSENT…
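
The transitions described above can be written down as a small lookup table. The Python sketch below covers only the transitions named in this explanation, not the complete BGP state machine. The event names (start, tcp_established, tcp_error, connect_retry_expired) are labels made up for this example, and any event not listed simply leaves the state unchanged, which is a simplification.

```python
# (current state, event) -> next state, as described in the text above.
TRANSITIONS = {
    ("Idle", "start"): "Connect",
    ("Connect", "tcp_established"): "OpenSent",
    ("Connect", "tcp_error"): "Idle",
    ("Connect", "connect_retry_expired"): "Active",
    ("Active", "tcp_established"): "OpenSent",
}


def step(state, event):
    """Return the next state; unknown combinations keep the current state."""
    return TRANSITIONS.get((state, event), state)


def run(events, state="Idle"):
    """Apply a sequence of events and return every state visited."""
    path = [state]
    for event in events:
        state = step(state, event)
        path.append(state)
    return path


if __name__ == "__main__":
    # A session that needs a retry before TCP comes up.
    print(run(["start", "connect_retry_expired", "tcp_established"]))
    # A session that fails with a TCP error and falls back to Idle.
    print(run(["start", "tcp_error"]))
```

In this model, a peer that never answers leaves the session without any path to OpenSent, which matches the Active and Connect states seen on router1.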