VNS3 HA Plugin

You can configure two VNS3 Controllers as a highly available pair for automatic connectivity failover. The High Availability Plugin (HA Plugin) allows you to all but eliminate downtime by automatically updating the Cloud Provider’s route table as well as the VNS3 routes in the event that a VNS3 Controller fails. VNS3 Controllers are set up as an HA pair by installing and configuring the HA Plugin on each (one Primary and one Secondary VNS3 Controller). The pair uses a combination of VNS3 and Cloud Provider API calls to synchronize and maintain routing configuration information.

HA Plugin Modes

The HA Plugin has two modes with the following operations depending on configuration parameters:

Primary Mode

  • During normal operation an HA Plugin in primary mode captures the Cloud Provider Route Table settings and stores any routes where the Primary VNS3 controller is the “target”/“next hop ip”.
  • When the Primary recovers after an outage, the HA Plugin looks at the routes stored during normal operation and replaces any of those routes on the Cloud Provider Route Table that are pointing at the Secondary VNS3 Controller.
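
The recovery step can be pictured with a small sketch. It is an illustration only, not the plugin's actual code: route tables are modeled as plain dictionaries mapping a destination CIDR to a target, and the identifiers i-primary and i-secondary are made-up placeholders.

```python
def routes_to_restore(stored_routes, current_routes, primary_id, secondary_id):
    """Return destinations the primary owned that now point at the secondary."""
    restore = []
    for destination, target in stored_routes.items():
        if target != primary_id:
            continue  # only routes captured with the primary as target matter
        if current_routes.get(destination) == secondary_id:
            restore.append(destination)
    return sorted(restore)


# Hypothetical data: routes captured during normal operation ...
stored = {"192.168.0.0/24": "i-primary", "172.16.0.0/16": "i-primary"}
# ... and the cloud route table as seen after the primary comes back.
current = {
    "192.168.0.0/24": "i-secondary",  # moved during the outage
    "172.16.0.0/16": "i-primary",     # already pointing at the primary
    "0.0.0.0/0": "igw-example",       # unrelated route, left alone
}
print(routes_to_restore(stored, current, "i-primary", "i-secondary"))
# ['192.168.0.0/24']
```

Only routes that were stored with the primary as target and currently point at the secondary are selected, which matches the behavior described in the bullet above.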

Secondary Mode

  • During normal operation an HA Plugin in secondary mode:
    • Captures the Cloud Provider Route Table settings and stores any routes where the Primary VNS3 controller is the “target”/“next hop ip”.
    • Captures the Primary VNS3 controller’s VNS3 routes and stashes them every N seconds depending on configuration settings.
    • Sends heartbeats at a configurable interval to determine whether the Primary VNS3 Controller is down/unresponsive. The following heartbeat checks are configurable:
      • Public IP ping
      • Private IP ping
      • Overlay IP ping
      • Private IP VNS3 API call (get/config)
      • Overlay IP VNS3 API call (get/config)
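
As a rough sketch of how several heartbeat checks might be combined, the snippet below treats each check as a function returning True on success. The rule used here (the primary counts as down only when every configured check fails) is an assumption made for illustration; the plugin's real decision logic is not described in this document.

```python
from typing import Callable, Dict


def primary_is_down(checks: Dict[str, Callable[[], bool]]) -> bool:
    """Assumed rule: down only if at least one check exists and all of them fail."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False  # a check that raises is counted as a failure
    return bool(results) and not any(results.values())


# Hypothetical check results standing in for real pings and API calls.
checks = {
    "public_ip_ping": lambda: False,
    "private_ip_ping": lambda: False,
    "private_ip_api_call": lambda: True,  # the API still answers
}
print(primary_is_down(checks))  # False
```

Requiring all checks to fail avoids a failover when only one path (for example the public IP) is unreachable, at the cost of slower detection when checks disagree.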

Supported Connectivity

The HA Plugin is designed for specific highly available hybrid connectivity architectures. If you don’t see your specific use-case listed below, please contact our support team.

  • Active-Active BGP-over-IPsec (dynamic route-based IPsec VPN)
  • Active-Passive IPsec (static route-based IPsec VPN)
  • VNS3 Peering Mesh
  • VNS3 SecLink (federated multicloud network solution)

Architecture / Use-case

Below is an example Hybrid Cloud Connectivity architecture that can leverage the HA Plugin.

HA diagram

  1. Two VNS3 Controllers are running in a Cloud (e.g. AWS, Azure, etc.) and are connected via active-active BGP-over-IPsec VPN.
  2. During normal operations the Primary VNS3 controller is the route to the remote on-premises data center subnet/network.
  3. In the event the Primary fails, the HA Plugin running on the Secondary VNS3 Controller will update the Cloud Route Tables.
  4. The Secondary VNS3 Controller is the route to the remote network until the Primary VNS3 Controller recovers.

Configuration Steps

Step 1: VNS3 Instance/VM Permissions

The VNS3 controller will require certain cloud permissions to be able to update the cloud route tables.

AWS permissions

VNS3 Controllers running in AWS will need an IAM Role assigned (AWS Documentation) with an appropriate IAM Policy to grant permissions to update the AWS VPC Route Tables. Below are some example policies: Allow-All, VPC-ID Limited, Route Table-ID Limited, and Tag Limited.

NOTE: While the ec2:ReplaceRoute action can be narrowed down by specifying conditions (limit action to a specific vpc-id, a set of route-table-ids, or tag key:value combination), the ec2:Describe* actions cannot be limited by a resource ARN and cannot be conditionally controlled.

Allow-All

No edits required. Simply copy/paste into a policy, attach it to a role, and assign the role to the VNS3 controller running the HA Plugin.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeInstances",
                "ec2:DescribeRouteTables",
                "ec2:ReplaceRoute"
            ],
            "Resource": "*"
        }
    ]
}

VPC-ID Limited

Replace REGION, ACCOUNT-ID, and VPC-ID with the relevant details for your deployment.

 {
     "Version": "2012-10-17",
     "Statement": [
         {
             "Sid": "VisualEditor1",
             "Effect": "Allow",
             "Action": [
                "ec2:DescribeRouteTables",
                "ec2:DescribeInstances"
            ],
             "Resource": "*"
         },
         {
             "Sid": "VisualEditor0",
             "Effect": "Allow",
             "Action": "ec2:ReplaceRoute",
             "Resource": "arn:aws:ec2:REGION:ACCOUNT-ID:route-table/*",
             "Condition": {
                 "ArnEquals": {
                     "ec2:Vpc": "arn:aws:ec2:REGION:ACCOUNT-ID:vpc/VPC-ID"
                 }
             }
         }
     ]
 }

Route Table-ID Limited

Replace REGION and ACCOUNT-ID with the relevant details for your deployment. Also replace ROUTE-TABLE-1-ID and ROUTE-TABLE-2-ID with the route table IDs (rtb-ids) associated with the VNS3 controllers in your deployment.

 {
     "Version": "2012-10-17",
     "Statement": [
         {
             "Sid": "VisualEditor1",
             "Effect": "Allow",
             "Action": [
                "ec2:DescribeRouteTables",
                "ec2:DescribeInstances"
            ],
             "Resource": "*"
         },
         {
             "Sid": "VisualEditor0",
             "Effect": "Allow",
             "Action": "ec2:ReplaceRoute",
             "Resource": "arn:aws:ec2:REGION:ACCOUNT-ID:route-table/*",
             "Condition": {
                 "ForAnyValue:StringEquals": {
                     "ec2:RouteTableID": [
                         "ROUTE-TABLE-1-ID",
                         "ROUTE-TABLE-2-ID"
                     ]
                 }
             }
         }
     ]
 }

Tag Limited

Replace REGION and ACCOUNT-ID with the relevant details for your deployment. Also replace TAG-KEY and TAG-VALUE with the tag key and value, respectively, that are applied to the route tables associated with the VNS3 controllers in your deployment.

 {
     "Version": "2012-10-17",
     "Statement": [
         {
             "Sid": "VisualEditor1",
             "Effect": "Allow",
             "Action": [
                "ec2:DescribeRouteTables",
                "ec2:DescribeInstances"
            ],
             "Resource": "*"
         },
         {
             "Sid": "VisualEditor0",
             "Effect": "Allow",
             "Action": "ec2:ReplaceRoute",
             "Resource": "arn:aws:ec2:REGION:ACCOUNT-ID:route-table/*",
             "Condition": {
                 "StringEquals": {
                     "ec2:ResourceTag/TAG-KEY": "TAG-VALUE"
                 }
             }
         }
     ]
 }
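
If you manage several deployments, filling in the placeholders by hand is error-prone. The following is a minimal sketch that generates the Route Table-ID Limited policy from parameters. The function name and the sample region, account ID, and route table IDs are made up for illustration; the policy structure mirrors the example above.

```python
import json


def route_table_limited_policy(region, account_id, route_table_ids):
    """Build the Route Table-ID Limited policy as a Python dict."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor1",
                "Effect": "Allow",
                "Action": ["ec2:DescribeRouteTables", "ec2:DescribeInstances"],
                "Resource": "*",
            },
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": "ec2:ReplaceRoute",
                "Resource": f"arn:aws:ec2:{region}:{account_id}:route-table/*",
                "Condition": {
                    "ForAnyValue:StringEquals": {
                        "ec2:RouteTableID": list(route_table_ids)
                    }
                },
            },
        ],
    }


# Placeholder values; substitute the details of your own deployment.
policy = route_table_limited_policy(
    "us-east-1", "123456789012", ["rtb-0aaaaaaaaaaaaaaaa", "rtb-0bbbbbbbbbbbbbbbb"]
)
print(json.dumps(policy, indent=4))
```

The printed JSON can be pasted into a policy in the same way as the hand-edited examples.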

Azure permissions

Permissions:actions:

  • Microsoft.Compute/virtualMachines/read
  • Microsoft.Network/networkInterfaces/read
  • Microsoft.Network/virtualNetworks/read
  • Microsoft.Network/publicIPAddresses/read
  • Microsoft.Network/routeTables/read
  • Microsoft.Network/routeTables/routes/read
  • Microsoft.Network/routeTables/routes/write

Scope:

  • Resource group for the VNS3 controllers

Step 2: Plugin Network Access

The HA Plugin requires network access so that it can send requests to the environment’s cloud and VNS3 APIs as well as send and receive pings.

2.1: Cloud Network Security Groups

Configure the Cloud Provider Network Security Groups to allow the following between the VNS3 controllers in the HA Pair.

Required Traffic

  • TCP traffic on port 8000 (VNS3 API) via Public IPs
  • TCP traffic on port 8000 (VNS3 API) via Private IPs
  • ICMP traffic via Private IPs

Optional Traffic

  • UDP traffic on port 1195 (VNS3 Peering)

2.2: Cloud DNS

Your HA Plugin requires network access to the cloud DNS server to make API calls. The DNS server is typically located at the base address of the VPC CIDR + 2. So if your VPC is 10.255.0.0/24, your DNS server is at 10.255.0.2. If one of your controllers is running in a subnet that doesn’t contain the DNS server, you will need to add a VNS3 route to the DNS server via the local subnet’s gateway. See the Example Setup section below.
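
The "VPC CIDR + 2" arithmetic can be checked with Python's standard ipaddress module. This is a small sketch; the function names are made up, and the two /25 subnets are hypothetical halves of the 10.255.0.0/24 example VPC.

```python
import ipaddress


def cloud_dns_ip(vpc_cidr):
    """Return the address two above the VPC network base address."""
    return ipaddress.ip_network(vpc_cidr).network_address + 2


def needs_dns_route(subnet_cidr, vpc_cidr):
    """True when the controller's subnet does not contain the DNS server."""
    return cloud_dns_ip(vpc_cidr) not in ipaddress.ip_network(subnet_cidr)


print(cloud_dns_ip("10.255.0.0/24"))                        # 10.255.0.2
print(needs_dns_route("10.255.0.0/25", "10.255.0.0/24"))    # False
print(needs_dns_route("10.255.0.128/25", "10.255.0.0/24"))  # True
```

A controller in the second subnet would need the extra VNS3 route to reach the DNS server.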

2.3: VNS3 Firewall Rules

The following rules need to be added to both the Primary and Secondary VNS3 Controllers to provide the HA Plugin with the appropriate network access. These rules assume the HA Plugin is allocated the Plugin Network IP 198.51.100.2:

POSTROUTING -o eth0 -s 198.51.100.2/32 -j MASQUERADE-ONCE
FORWARD -s 198.51.100.2/32 -j ACCEPT
FORWARD -d 198.51.100.2/32 -j ACCEPT
INPUT -i plugin0 -s 198.51.100.2/32 -j ACCEPT
OUTPUT -o plugin0 -d 198.51.100.2/32 -j ACCEPT
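
If your plugin is allocated a different Plugin Network IP, the same five rules apply with that address substituted. A minimal sketch that renders them (the helper name is made up; the rule text is copied from the rules above):

```python
import ipaddress


def ha_plugin_firewall_rules(plugin_ip):
    """Render the five HA Plugin firewall rules for a given plugin IP."""
    host = f"{ipaddress.ip_address(plugin_ip)}/32"  # raises ValueError on bad input
    return [
        f"POSTROUTING -o eth0 -s {host} -j MASQUERADE-ONCE",
        f"FORWARD -s {host} -j ACCEPT",
        f"FORWARD -d {host} -j ACCEPT",
        f"INPUT -i plugin0 -s {host} -j ACCEPT",
        f"OUTPUT -o plugin0 -d {host} -j ACCEPT",
    ]


print("\n".join(ha_plugin_firewall_rules("198.51.100.2")))
```

Validating the address first keeps a typo from ending up in a firewall rule.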

Step 3: Upload and Allocate HA Plugin

The HA Plugin is available via the Plugin Catalog on all VNS3 version 6 Controllers with network access to the publicly available Cohesive Networks Plugin storage site. Below are download links to the plugin if VNS3 Controllers are running in a deployment that does not have public Internet access.

Once the HA Plugin is uploaded and available, allocate an instance of the HA Plugin.

Step 4: HA Plugin Configuration File

You can configure the HA Plugin via a configuration file or via environment variables. If using environment variables, prepend each variable name with HAENV_. If using the configuration file, edit it via the Plugin Manager or directly on the Plugin at /opt/hacontainer/conf/vars.yml, and use YAML format without the HAENV_ prefix.

Variables

Primary Mode Variables

Primary mode accepts the following variables:

  • mode: primary
  • cloud: aws or azure
  • peer_public_ip: Public IP address of secondary VNS3 controller
  • sleep_time: number of seconds to wait between checks of whether it is the primary and should assume all routes. Default is 15.

Secondary Mode Variables

Secondary mode accepts the following variables:

  • mode: secondary
  • cloud: aws, azure or overlay (if overlay, only updates VNS3 routes)
  • peer_public_ip: Public IP address of primary VNS3 controller
  • my_underlay_ip: primary or secondary IP of secondary VNS3 controller
  • peer_underlay_ip: primary or secondary IP of primary VNS3 controller
  • peer_overlay_ip (optional): overlay IP address of the primary controller
  • my_api_password: secondary controller’s API password
  • peer_api_password: primary controller’s API password
  • sleep_time: number of seconds to wait between checks of whether the primary is down and the secondary should assume all routes. Default is 15.
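
A quick pre-flight check of a configuration can catch a missing variable before the plugin is restarted. The sketch below is not part of the plugin. It assumes that every variable not marked optional above and without a stated default is required, which this document does not state explicitly.

```python
# Assumption: variables without "(optional)" or a default are treated as required.
REQUIRED = {
    "primary": ["mode", "cloud", "peer_public_ip"],
    "secondary": [
        "mode", "cloud", "peer_public_ip", "my_underlay_ip",
        "peer_underlay_ip", "my_api_password", "peer_api_password",
    ],
}
CLOUDS = {"primary": {"aws", "azure"}, "secondary": {"aws", "azure", "overlay"}}
DEFAULT_SLEEP_TIME = 15


def check_config(config):
    """Return a list of human-readable problems; an empty list means OK."""
    mode = config.get("mode")
    if mode not in REQUIRED:
        return ["mode must be 'primary' or 'secondary'"]
    problems = [f"missing {key}" for key in REQUIRED[mode] if not config.get(key)]
    cloud = config.get("cloud")
    if cloud and cloud not in CLOUDS[mode]:
        problems.append(f"cloud '{cloud}' is not valid in {mode} mode")
    return problems


def with_defaults(config):
    """Fill in sleep_time when the configuration leaves it out."""
    merged = {"sleep_time": DEFAULT_SLEEP_TIME}
    merged.update(config)
    return merged


print(check_config({"mode": "secondary", "cloud": "aws"}))
```

For the incomplete secondary configuration in the last line, the function reports each missing variable by name.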

Configuring via Config file

Configuring the HA Plugin via Configuration File can be done by accessing the Plugin directly (via SSH and port forward VNS3 firewall rule) OR via the Plugin Manager available on VNS3 version 6 and later (recommended).

The configuration file is located at /opt/hacontainer/conf/vars.yml and should be in YAML format. Here are example config files for primary and secondary mode:

Primary

mode: primary
cloud: aws
peer_public_ip: 55.55.55.55
sleep_time: 15

Secondary

mode: secondary
cloud: aws
sleep_time: 15
peer_public_ip: 55.55.55.55
my_underlay_ip: 10.0.0.253
peer_underlay_ip: 10.0.1.253
peer_overlay_ip: 100.127.255.253
my_api_password: mysecret
peer_api_password: mysecret
log_level: debug

Configuring via the Environment

You can also configure the HA Plugin via the environment by converting each variable name to uppercase and prepending HAENV_. Here are example configurations for primary and secondary mode via environment variables:

Primary:

HAENV_MODE=primary
HAENV_CLOUD=aws
HAENV_PEER_PUBLIC_IP=55.55.55.55
HAENV_SLEEP_TIME=15

Secondary:

HAENV_MODE=secondary
HAENV_CLOUD=aws
HAENV_SLEEP_TIME=15
HAENV_PEER_PUBLIC_IP=55.55.55.55
HAENV_MY_UNDERLAY_IP=10.0.0.253
HAENV_PEER_UNDERLAY_IP=10.0.1.253
HAENV_PEER_OVERLAY_IP=100.127.255.253
HAENV_MY_API_PASSWORD=mysecret
HAENV_PEER_API_PASSWORD=mysecret
HAENV_LOG_LEVEL=debug

Tip: For more verbose logging, you can set the log level to debug with the variable log_level, or via the environment with HAENV_LOG_LEVEL=debug.
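
The mapping between the two styles is mechanical: uppercase the variable name and add the HAENV_ prefix. A short sketch (the helper name is made up for illustration):

```python
def to_haenv(config):
    """Convert config-file style keys into HAENV_ environment variables."""
    return {f"HAENV_{key.upper()}": str(value) for key, value in config.items()}


primary = {
    "mode": "primary",
    "cloud": "aws",
    "peer_public_ip": "55.55.55.55",
    "sleep_time": 15,
}
for name, value in to_haenv(primary).items():
    print(f"{name}={value}")
```

This prints the same four lines as the Primary environment example above. Values are converted to strings because environment variables carry no types.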

Step 5: Restart the HA Plugin

Once Steps 1-4 are complete, restart the HA Plugin in order for the updated configuration file to be used.

Example Setup

OK, let’s run through a quick example. Let’s say you have a network with the following attributes:

  1. VPC with CIDR 10.0.0.0/24
  2. Two subnets with CIDRs 10.0.0.0/25 and 10.0.0.128/25
  3. Primary VNS3 controller running in 10.0.0.0/25 and Secondary running in 10.0.0.128/25
  4. Primary VNS3 private IP is 10.0.0.13 and Secondary VNS3 private IP is 10.0.0.138
  5. Plugin network on each controller is 198.51.100.0/28
  6. Primary HA plugin is running at IP 198.51.100.3 and Secondary Plugin at IP 198.51.100.10

Cloud Setup

Here’s what we need in the cloud route and network security groups:

  1. Cloud route between subnets 10.0.0.0/25 and 10.0.0.128/25. This is typically accomplished with a single route table rule routing the VPC CIDR to the local network. It is typically there by default in public clouds.
  2. Security group rules that allow TCP port 8000 and UDP port 1195 between controllers. These rules can be locked down to their peer’s private IP address:
     • Source: 10.0.0.13/32 and 10.0.0.138/32, Protocol: TCP, Port: 8000, Description: VNS3 Peer API Traffic
     • Source: 10.0.0.13/32 and 10.0.0.138/32, Protocol: UDP, Port: 1195, Description: VNS3 Peering Traffic

VNS3 Setup

We need to extend the VPC network to your HA plugins. This will require 2 steps:

  1. Adding routes ensuring traffic goes to the local subnet’s gateway
  2. Adding a firewall policy allowing the plugin network access to the VPC network

Adding routes

It is generally good to add lock-down routes to the local subnet gateway for your VNS3 controllers’ IPs. The local gateway is always the base address of the subnet CIDR + 1. So let’s add routes on the Secondary VNS3 in subnet 10.0.0.128/25. They will look like this:

  1. Interface: eth0. CIDR: 10.0.0.138/32. Gateway: 10.0.0.129. Advertised: No. Description: Local Private Self
  2. Interface: eth0. CIDR: 10.0.0.13/32. Gateway: 10.0.0.129. Advertised: No. Description: Local Primary VNS3

These lock down VNS3 controller IPs to the local gateway, ensuring traffic is routed properly. You’ll want to create lock-down routes for each VNS3 controller in your peering mesh. So the architecture diagram at the beginning of this post would require 4 lock-down routes.

Finally, we need to add a lock-down route for the DNS server because the secondary controller is in a subnet different from the DNS server. That route will look like this as the DNS server will be at the VPC CIDR + 2:

  1. Interface: eth0. CIDR: 10.0.0.2/32. Gateway: 10.0.0.129. Advertised: No. Description: Local DNS Server
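
The three routes above can be derived from the subnet CIDR alone. The sketch below uses Python's ipaddress module. The dictionary keys simply mirror the labels used in the route lists above and are not meant to represent VNS3 API field names.

```python
import ipaddress


def subnet_gateway(subnet_cidr):
    """Return the address one above the subnet base address."""
    return ipaddress.ip_network(subnet_cidr).network_address + 1


def lock_down_route(target_ip, subnet_cidr, description):
    """Describe a /32 route to target_ip via the local subnet gateway."""
    return {
        "interface": "eth0",
        "cidr": f"{ipaddress.ip_address(target_ip)}/32",
        "gateway": str(subnet_gateway(subnet_cidr)),
        "advertised": False,
        "description": description,
    }


secondary_subnet = "10.0.0.128/25"
routes = [
    lock_down_route("10.0.0.138", secondary_subnet, "Local Private Self"),
    lock_down_route("10.0.0.13", secondary_subnet, "Local Primary VNS3"),
    lock_down_route("10.0.0.2", secondary_subnet, "Local DNS Server"),
]
for route in routes:
    print(route["cidr"], "via", route["gateway"])
```

Every route on the secondary controller uses 10.0.0.129 as its gateway, the first address after the base of 10.0.0.128/25.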

Adding Firewall Policy

The firewall is pretty simple. Here’s what it would look like on the secondary controller:

MACRO_CUST -o eth0 -s 198.51.100.10/32 -j MASQUERADE
INPUT_CUST -i plugin0 -s 198.51.100.0/28 -j ACCEPT
OUTPUT_CUST -o plugin0 -d 198.51.100.0/28 -j ACCEPT

These rules do the following:

  1. MASQUERADE allows the plugin to send requests out to the cloud network via eth0 (it’s a SNAT rule essentially).
  2. The INPUT and OUTPUT rules accept traffic originating from or traveling to the plugin network.

Automating Your Configuration

And that’s it! But you should only ever have to do this once. This configuration can be totally automated when deploying new network segments.

The API endpoints you would use are the following:

  1. Update your VNS3 firewall allowing plugin network access with POST /firewall/rules
  2. Update your VNS3 route table with POST /routes
  3. Upload the Plugin Image with POST /api/container_system/images
  4. Start the Plugin with POST /api/container_system/containers (can pass environment variables)
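
As a starting point for automation, the sketch below only builds a request object with Python's standard library. It does not send anything. Several things here are assumptions rather than documented behavior: HTTP Basic authentication, a JSON request body, the username value, and the payload field names, which are placeholders. Check the VNS3 API documentation for the exact path prefix and body schema before using it.

```python
import base64
import json
import urllib.request


def build_vns3_request(base_url, path, payload, username, password):
    """Build (but do not send) a JSON POST request with Basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode(),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Hypothetical payload: the field names are placeholders, not the real schema.
request = build_vns3_request(
    "https://10.0.0.138:8000",   # secondary controller, API port 8000
    "/routes",                   # path as listed above
    {"cidr": "10.0.0.2/32", "gateway": "10.0.0.129"},
    "api-user-placeholder",
    "mysecret",
)
print(request.get_method(), request.full_url)
```

Sending the request would be a call to urllib.request.urlopen(request), with TLS verification configured to match how your controller's certificate is set up.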

Have any questions? Contact Cohesive Networks support. We take pride in responding quickly.