Channel: CCIE Blog | iPexpert » CCIE Lab

Configure a Highly-Available IPSec VPN tunnel on IOS

It is possible to configure a highly available IPSec VPN tunnel on IOS so that the SA information is replicated between the routers. This ensures that a failover is transparent to users and does not require adjustments or reconfiguration of any remote peers.

Two protocols are used to deploy this feature: HSRP and Stateful Switchover (SSO). HSRP is one of the First Hop Redundancy Protocols that provide redundancy for IP networks, ensuring that user traffic recovers quickly and transparently from failures in network edge devices. The protocol monitors the interfaces so that if either interface goes down, the whole router is deemed to be down and ownership of the IKE and IPSec SAs passes to the standby router (which now transitions to the HSRP active state). SSO allows the active and standby routers to share IKE and IPSec state information so that both routers have enough information to become the active router at any time.

Before we take a look at the configuration, let’s have a few words about our topology. The configuration of the internal network (VLAN 146 below) is outside the scope of this post, but it would normally be configured with a separate HSRP instance that tracks not only the internal but also the external interface. The goal is to make sure that traffic from the internal network that is headed for the tunnel enters the active router (SSO/HSRP active). So a default route pointing to the internal VIP, or RRI, is something you would definitely want to look at.
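
As a minimal sketch of the inside part on R10 (it is not part of the lab configuration in this post): the group number 1, the group name HSRP-IN and the inside virtual address 6.6.146.100 are assumptions made up for illustration, and G0/0 is presumed to hold 6.6.146.10 because that is the address used for SSO later on. Track object 3 is the one defined further down in this post. The reverse-route line enables RRI on the crypto map entry; the resulting route would still have to be made known to the internal network, for example through redistribution.

```
! Hypothetical inside HSRP group on R10 (group number, name and VIP are assumptions)
interface GigabitEthernet0/0
 ip address 6.6.146.10 255.255.255.0
 standby 1 ip 6.6.146.100
 standby 1 preempt
 standby 1 priority 100
 standby 1 name HSRP-IN
 standby 1 track 3 decrement 30
!
! RRI on the existing crypto map entry
crypto map MAP2 10 ipsec-isakmp
 reverse-route
```

With something like this in place, internal hosts would use 6.6.146.100 as their gateway, so their traffic follows whichever router is currently active.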

[Figure: topology diagram (piotr-IPSec-Stateful)]

Our focus will be the “outside” part, that is, where the tunnel is terminated. The session will land on a virtual address (6.6.156.100) that is associated with our HSRP instance. The configuration of R2 is going to be like a regular L2L tunnel; R10 and R11 are where HSRP and SSO will be deployed. Let’s first look at the R2 config :

crypto isakmp policy 10
 encr aes
 authentication pre-share
 group 2

crypto isakmp key cisco address 6.6.156.100
crypto isakmp keepalive 10 3 periodic

crypto ipsec transform-set SET2 esp-aes esp-sha-hmac 
crypto ipsec security-association replay window-size 1024

ip access-list extended HA_VPN
 permit ip 6.6.2.0 0.0.0.255 6.6.146.0 0.0.0.255

crypto map MAP2 10 ipsec-isakmp 
 set peer 6.6.156.100
 set transform-set SET2 
 set pfs group2
 match address HA_VPN

int g0/0
 crypto map MAP2

Again, it is pretty much a regular IKEv1 L2L configuration: we define our Phase I and II policies, authentication credentials and encryption domain, and we use a crypto map to bind these together and associate them with an interface. Two additional things were done: enabling DPD (legacy keepalives are not supported by this feature) and expanding the anti-replay window. DPD is used to detect the liveness of the remote peer, while the anti-replay window was increased to avoid potential problems related to how SSO replicates sequence number updates to the standby SA. By default, this happens every X packets, and this “X” is then explicitly set to a minimal value on R10 and R11 (1000 packets). Also, note that we are using pre-shared keys for authentication; that is another limitation of Stateful IPSec Failover.

All right, what do we have to configure on R10 and R11? The same regular settings plus HSRP and SSO :

R10 :
crypto isakmp policy 10
 encr aes
 authentication pre-share
 group 2

crypto isakmp key cisco address 6.6.25.2       

crypto isakmp keepalive 10 3 periodic

ip access-list extended HA_VPN
 permit ip 6.6.146.0 0.0.0.255 6.6.2.0 0.0.0.255

crypto ipsec transform-set SET2 esp-aes esp-sha-hmac 

crypto map MAP2 10 ipsec-isakmp 
 set peer 6.6.25.2
 set transform-set SET2 
 set pfs group2
 match address HA_VPN

crypto ipsec security-association replay window-size 1024

Now let’s look at that “extra” configuration. Here’s how you can tune anti-replay updates :

crypto map MAP2 redundancy replay-interval in 1000 out 1000

Next, we need to tell the router what addresses and ports will be used by SSO to replicate the state information (timeout settings shown are based on the default values). A similar configuration will be done on R11 but the addresses and ports will be reversed (local will be R11, remote R10) :

ipc zone default
 association 1
  no shutdown
  protocol sctp
   local-port 5000
    local-ip 6.6.146.10
    retransmit-timeout 300 10000
    path-retransmit 5
    assoc-retransmit 5
   remote-port 5000
    remote-ip 6.6.146.11

Now we should build a tracking object. Instead of looking only at G0/1, we will be looking at two interfaces (inside and outside), to ensure that failover takes place no matter which of the two interfaces fails (remember, our Active Router must be active for both networks to avoid traffic black-holing).

track 1 interface GigabitEthernet0/0 line-protocol
track 2 interface GigabitEthernet0/1 line-protocol

track 3 list boolean and
 object 1
 object 2

It is important to keep HSRP priorities the same on both routers. This is needed because the SSO-standby device always reboots to sync its state with the Active box. If you left one router with a higher priority, and this device failed, the other router would self-reboot after the previously active box comes alive again.

interface GigabitEthernet0/1
 ip address 6.6.156.10 255.255.255.0
 standby 2 ip 6.6.156.100
 standby 2 preempt
 standby 2 priority 100
 standby 2 name HSRP
 standby 2 track 3 decrement 30
 crypto map MAP2 redundancy HSRP stateful

Note that the crypto map was applied with the “redundancy stateful” option. Finally we need to activate inter-device SSO communication:

redundancy inter-device
 scheme standby HSRP

A very similar configuration is done on R11. The only changes from R10 config are done to SSO (as explained earlier) :

R11 :
crypto isakmp policy 10
 encr aes
 authentication pre-share
 group 2

crypto isakmp key cisco address 6.6.25.2       
crypto isakmp keepalive 10 3 periodic

ip access-list extended HA_VPN
 permit ip 6.6.146.0 0.0.0.255 6.6.2.0 0.0.0.255

crypto ipsec transform-set SET2 esp-aes esp-sha-hmac 

crypto map MAP2 10 ipsec-isakmp 
 set peer 6.6.25.2
 set transform-set SET2 
 set pfs group2
 match address HA_VPN

crypto ipsec security-association replay window-size 1024
crypto map MAP2 redundancy replay-interval in 1000 out 1000

ipc zone default
 association 1
  no shutdown
  protocol sctp
   local-port 5000
    local-ip 6.6.146.11
    retransmit-timeout 300 10000
    path-retransmit 5
    assoc-retransmit 5
   remote-port 5000
    remote-ip 6.6.146.10

track 1 interface GigabitEthernet0/0 line-protocol
track 2 interface GigabitEthernet0/1 line-protocol
track 3 list boolean and
 object 1
 object 2

interface GigabitEthernet0/1
 ip address 6.6.156.11 255.255.255.0
 standby 2 ip 6.6.156.100
 standby 2 preempt
 standby 2 priority 100
 standby 2 name HSRP
 standby 2 track 3 decrement 30
 crypto map MAP2 redundancy HSRP stateful

redundancy inter-device
 scheme standby HSRP

NOTE : If you are using 15.2(3)T to test this (as we do on our routers), it is definitely advisable to disable the hardware VPN module on both R10 and R11 due to a bug :

no crypto engine onboard 0

Once you have configured the devices, the HSRP standby unit will reload to synchronize SAs with the active unit.

Time to verify our configuration:

R11#sh crypto engine brief 
        crypto engine name:  Virtual Private Network (VPN) Module
        crypto engine type:  hardware
                     State:  Disabled
                  Location:  onboard 0
              Product Name:  Onboard-VPN
                HW Version:  1.0
               Compression:  Yes
                       DES:  Yes
                     3 DES:  Yes
                   AES CBC:  Yes (128,192,256)
                  AES CNTR:  No
     Maximum buffer length:  0000
          Maximum DH index:  0000
          Maximum SA index:  0000
        Maximum Flow index:  3200
      Maximum RSA key size:  0000


        crypto engine name:  Cisco VPN Software Implementation
        crypto engine type:  software
             serial number:  12D23801
       crypto engine state:  installed
     crypto engine in slot:  N/A


R10#sh standby brief
                     P indicates configured to preempt.
                     |
Interface   Grp  Pri P State   Active          Standby         Virtual IP
Gi0/1       2    100 P Active  local         6.6.156.11    6.6.156.100

R10#sh crypto ha
IKE VIP: 6.6.156.100
  stamp: C0 7F F5 AE 96 7E 06 E3 FC 92 F3 60 92 51 49 88 
IPSec VIP: 6.6.156.100

R10#sh redundancy states 
       my state = 13 -ACTIVE
     peer state = 8  -STANDBY HOT 
           Mode = Duplex
        Unit ID = 0

     Maintenance Mode = Disabled
    Manual Swact = enabled
 Communications = Up

   client count = 13
 client_notification_TMR = 60000 milliseconds
           RF debug mask = 0x0  

R11#sh redundancy states 
       my state = 8  -STANDBY HOT 
     peer state = 13 -ACTIVE 
           Mode = Duplex
        Unit ID = 0

     Maintenance Mode = Disabled
    Manual Swact = cannot be initiated from this the standby unit
Communications = Up

   client count = 14
 client_notification_TMR = 60000 milliseconds
           RF debug mask = 0x0

R10#sh redundancy inter-device 
Redundancy inter-device state: RF_INTERDEV_STATE_ACT
  Scheme: Standby
      Groupname: HSRP Group State: Active
  Peer present: RF_INTERDEV_PEER_COMM
  Security: Not configured

R11#sh redundancy inter-device 
Redundancy inter-device state: RF_INTERDEV_STATE_STDBY
  Scheme: Standby
      Groupname: HSRP Group State: Standby
  Peer present: RF_INTERDEV_PEER_COMM
  Security: Not configured

R2#ping 6.6.146.4 source g0/0 rep 5    

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 6.6.146.4, timeout is 2 seconds:
Packet sent with a source address of 6.6.2.2 
.!!!!
Success rate is 80 percent (4/5), round-trip min/avg/max = 28/28/28 ms

R10#sh crypto session detail
Crypto session current status

Code: C - IKE Configuration mode, D - Dead Peer Detection     
K - Keepalives, N - NAT-traversal, T - cTCP encapsulation     
X - IKE Extended Authentication, F - IKE Fragmentation

Interface: GigabitEthernet0/1
Uptime: 01:02:50
Session status: UP-ACTIVE     
Peer: 6.6.25.2 port 500 fvrf: (none) ivrf: (none)
      Phase1_id: 6.6.25.2
      Desc: (none)
  IKEv1 SA: local 6.6.156.100/500 remote 6.6.25.2/500 Active 
          Capabilities:D connid:1002 lifetime:22:57:09
  IPSEC FLOW: permit ip 6.6.146.0/255.255.255.0 6.6.2.0/255.255.255.0 
        Active SAs: 2, origin: crypto map
        Inbound:  #pkts dec'ed 4 drop 0 life (KB/Sec) 4355669/3236
        Outbound: #pkts enc'ed 4 drop 0 life (KB/Sec) 4355669/3236
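
The session output above only shows the active router. To check that the SAs were actually replicated, the standby copies can be listed on R11. This is a sketch of the commands only: they were not run in this lab, their availability may depend on the IOS release, and no output is shown because none was captured in the original post.

```
! On the standby router: SAs replicated from the active unit
R11#show crypto isakmp sa standby
R11#show crypto ipsec sa standby
! On the active router, for comparison
R10#show crypto isakmp sa active
R10#show crypto ipsec sa active
```

If the standby lists are empty while the redundancy states look healthy, replication is not working even though SSO reports STANDBY HOT.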

Let’s now start sending traffic from R2 and disable G0/0 on R10 (this causes R10 to reboot, since it loses the SSO-active status), and see what happens :

R2#ping 6.6.146.4 source g0/0 rep 50000

Type escape sequence to abort.
Sending 50000, 100-byte ICMP Echos to 6.6.146.4, timeout is 2 seconds:
Packet sent with a source address of 6.6.2.2

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!U.
...!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!

Looks like 5 echoes were lost (one unreachable and four timeouts), but then we have connectivity again.
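
For reference, the failure in this test was triggered by disabling the inside interface on R10 while the ping was running. The exact commands were not captured in the original post, so the following is an assumption that a plain shutdown was used:

```
! Assumed way of disabling the inside interface on R10
R10#configure terminal
R10(config)#interface GigabitEthernet0/0
R10(config-if)#shutdown
```

With G0/0 down, track object 3 goes Down, the HSRP priority of R10 drops from 100 to 70, and R11 (priority 100, preempt enabled) takes over the virtual address.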

R11#
*Dec 11 21:44:26.123: %HSRP-5-STATECHANGE: GigabitEthernet0/1 Grp 2 state Standby -> Active
*Dec 11 21:44:26.127: %CRYPTO-5-IPSEC_SA_HA_STATUS: IPSec sa's if any, for vip  6.6.156.100 will change from STANDBY to ACTIVE

R11#sh crypto session detail
Crypto session current status

Code: C - IKE Configuration mode, D - Dead Peer Detection     
K - Keepalives, N - NAT-traversal, T - cTCP encapsulation     
X - IKE Extended Authentication, F - IKE Fragmentation

Interface: GigabitEthernet0/1
Session status: UP-ACTIVE     
Peer: 6.6.25.2 port 500 fvrf: (none) ivrf: (none)
      Desc: (none)
      Phase1_id: (none)
  IKEv1 SA: local 6.6.156.100/500 remote 6.6.25.2/500 Active 
          Capabilities:D connid:1001 lifetime:23:49:39
  IPSEC FLOW: permit ip 6.6.146.0/255.255.255.0 6.6.2.0/255.255.255.0 
        Active SAs: 2, origin: crypto map
        Inbound:  #pkts dec'ed 138 drop 0 life (KB/Sec) 3742576/3083
        Outbound: #pkts enc'ed 135 drop 0 life (KB/Sec) 4203376/3083

Finally, I want to mention that this feature is very buggy, especially in the IOS code that we run on our devices; disabling the hardware VPN module does not appear to solve all of the problems (by the way, don’t try this in production). You may see that SA synchronization works only until the first failure occurs; after that, the SA data does not appear to be mirrored to the standby device, even though the SSO states are shown correctly.

