VNS3 HA Plugin

The high availability plugin runs on two VNS3 controllers in primary/secondary modes. In the event of a VNS3 controller failure, the secondary controller assumes the routes of the primary controller. It does this by:

  1. Updating cloud route tables to send traffic to the secondary
  2. Updating the secondary VNS3 routes with the primary VNS3 controller’s routes to ensure mesh routing is still available

This assumes the following architecture: you have two VNS3 controllers running in a peered configuration. The diagram below shows four controllers, a primary/secondary pair running on either side of a multi-cloud bridge.

HA Plugin architecture

  1. 4 VNS3 Controllers configured in a Peered Mesh.
  2. Each VNS3 Controller running the HA plugin in the appropriate mode with the appropriate IAM role policy or Azure permissions.
  3. The Azure route table routes the AWS-side CIDRs to the VNS3 primary to traverse the bridge, and likewise in AWS.
  4. A failover event triggers the secondary taking over all cloud and bridge routing.

Image downloads

The Plugin Manager is an alpha feature that allows easier management of your plugins via the VNS3 Console.

Definitions

  • HA Plugin: Plugin image that can run in various modes. This plugin runs availability tests against VNS3 controllers in a primary/secondary setup. In the event of a controller failure, the secondary updates the cloud route table and assumes the routes of the failed VNS3 controller so that cloud network routing and VNS3 mesh routing are still available.
  • Primary Mode: This mode only handles updating routing on the cloud route table. It can run in AWS, Azure, or a virtualized environment such as VMware.
  • Secondary Mode: This mode handles updating the cloud route table and assuming the VNS3 routes from the primary controller when it fails.
  • Failover Event: VNS3 VM instance failure
  • Underlay IP: The primary (or secondary) private IP address of the VNS3 instance as configured
  • Overlay IP: The VNS3 controller IP address on the VNS3 overlay network. This can be obtained from the VNS3 status page as Controller Overlay IP.

Permissions

The VNS3 controller will require certain cloud permissions to be able to update the cloud route tables.

AWS permissions

Actions:

  • ec2:DescribeRouteTables
  • ec2:DescribeInstances
  • ec2:ReplaceRoute

Resource:

  • VNS3 Controllers
  • VPC
  • Route Tables
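
For concreteness, here is a hypothetical boto3 sketch of the kind of route update these actions allow a secondary to perform on failover. All identifiers (route table ID, instance IDs, region) are placeholders, and the plugin's actual implementation may differ:

import boto3

ROUTE_TABLE_ID = "rtb-0123456789abcdef0"  # placeholder route table
PRIMARY_ID = "i-primary-placeholder"      # failed primary VNS3 instance
SECONDARY_ID = "i-secondary-placeholder"  # this controller's instance

ec2 = boto3.client("ec2", region_name="us-east-1")

# ec2:DescribeRouteTables -- find routes currently pointing at the failed primary
table = ec2.describe_route_tables(RouteTableIds=[ROUTE_TABLE_ID])["RouteTables"][0]

for route in table["Routes"]:
    if route.get("InstanceId") == PRIMARY_ID:
        # ec2:ReplaceRoute -- repoint the route at the secondary
        ec2.replace_route(
            RouteTableId=ROUTE_TABLE_ID,
            DestinationCidrBlock=route["DestinationCidrBlock"],
            InstanceId=SECONDARY_ID,
        )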

Azure permissions

Actions:

  • Microsoft.Compute/virtualMachines/read
  • Microsoft.Network/networkInterfaces/read
  • Microsoft.Network/virtualNetworks/read
  • Microsoft.Network/publicIPAddresses/read
  • Microsoft.Network/routeTables/read
  • Microsoft.Network/routeTables/routes/read
  • Microsoft.Network/routeTables/routes/write

Scope:

  • Resource group for the VNS3 controllers
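
As a sketch of what the routeTables/routes/write permission enables, here is a hypothetical example using the azure-identity and azure-mgmt-network SDKs. The subscription ID, resource group, route table, route name, and CIDRs are all placeholders:

from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

SUBSCRIPTION_ID = "00000000-0000-0000-0000-000000000000"
RESOURCE_GROUP = "vns3-controllers"        # the scope granted above
SECONDARY_PRIVATE_IP = "10.0.0.138"

client = NetworkManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Microsoft.Network/routeTables/routes/write -- repoint a route at the secondary
poller = client.routes.begin_create_or_update(
    RESOURCE_GROUP,
    "vns3-route-table",                    # route table name (placeholder)
    "to-aws-cidrs",                        # route name (placeholder)
    {
        "address_prefix": "172.16.0.0/16", # remote CIDR reached via the bridge
        "next_hop_type": "VirtualAppliance",
        "next_hop_ip_address": SECONDARY_PRIVATE_IP,
    },
)
poller.result()  # block until the route update completes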

Configuration

You can configure the HA plugin via a configuration file or via environment variables. If using environment variables, prepend the variables with HAENV_. If using the configuration file at /opt/hacontainer/conf/vars.yml, use YAML format without the HAENV_ prefix.

Variables

Primary Mode Variables

Primary mode accepts the following variables:

  • mode: primary
  • cloud: aws or azure
  • peer_public_ip: Public IP address of secondary VNS3 controller
  • sleep_time: number of seconds to wait between checking whether it is the primary and assuming all routes. Default is 15.

Secondary Mode Variables

Secondary mode accepts the following variables:

  • mode: secondary
  • cloud: aws, azure or overlay (if overlay, only updates VNS3 routes)
  • peer_public_ip: Public IP address of primary VNS3 controller
  • my_underlay_ip: primary or secondary IP of secondary VNS3 controller
  • peer_underlay_ip: primary or secondary IP of primary VNS3 controller
  • peer_overlay_ip: overlay IP address of the primary controller
  • my_api_password: secondary controller’s API password
  • peer_api_password: primary controller’s API password
  • sleep_time: number of seconds to wait between checking whether the primary is down and assuming all routes. Default is 15.

Configuring via Config file

The configuration file is located at /opt/hacontainer/conf/vars.yml and should be in YAML format. Here is an example of a secondary mode config file:

mode: secondary
cloud: aws
sleep_time: 15
peer_public_ip: 54.237.95.153
my_underlay_ip: 10.255.1.238
peer_underlay_ip: 10.255.1.254
peer_overlay_ip: 100.127.255.253
my_api_password: mysecret
peer_api_password: mysecret
log_level: debug

Configuring via the Environment

You can also configure the HA plugin via the environment by capitalizing the variable and prepending with HAENV_. For example, a primary controller’s environment could be configured with the following environment:

HAENV_MODE=primary
HAENV_CLOUD=aws
HAENV_PEER_PUBLIC_IP=55.55.55.55
HAENV_SLEEP_TIME=15
Tip: For more verbose logging, set the log level to debug with the variable log_level, or via the environment with HAENV_LOG_LEVEL=debug.
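
To make the two configuration paths concrete, here is a minimal sketch of a loader that reads vars.yml and lets HAENV_-prefixed environment variables override it. This only illustrates the naming convention; the plugin's actual loader (and its override order) is an assumption here:

import os
import yaml  # pip install pyyaml

CONF_PATH = "/opt/hacontainer/conf/vars.yml"

def load_ha_config():
    config = {}
    if os.path.exists(CONF_PATH):
        with open(CONF_PATH) as f:
            config = yaml.safe_load(f) or {}
    # HAENV_SLEEP_TIME -> sleep_time (assumed to override the file here)
    for key, value in os.environ.items():
        if key.startswith("HAENV_"):
            config[key[len("HAENV_"):].lower()] = value
    return config

print(load_ha_config().get("sleep_time", 15))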

Plugin Network Access

The HA Plugin requires network access such that it can send requests to the environment’s cloud API. Here are the routes and rules required:

  1. The VNS3 controllers must have network access to each other and allow the following traffic:
    • TCP traffic on port 8000 (VNS3 API traffic)
    • UDP traffic on port 1195 (Peering traffic). This will require cloud routes and security group rules; see the example below.
  2. The HA plugins require network access such that they can query the cloud APIs and their peer VNS3 controller’s API. The following rules will extend the network to the plugins. The default plugin network is 198.51.100.0/28.
MACRO_CUST -o eth0 -s [plugin IP address]/32 -j MASQUERADE
INPUT_CUST -i plugin0 -s [plugin network CIDR] -j ACCEPT
OUTPUT_CUST -o plugin0 -d [plugin network CIDR] -j ACCEPT
  3. Finally, your HA Plugin requires network access to the cloud DNS server to make API calls. The DNS server is typically located at an IP address that is the VPC CIDR + 2, so if your VPC is 10.255.0.0/24, your DNS server is at 10.255.0.2 (the sketch below shows this arithmetic). If one of your controllers is running in a subnet that doesn’t contain the DNS server, you will need to add a VNS3 route to the DNS server via the local subnet’s gateway. See the example below.
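
The CIDR arithmetic used here, and in the routing section below where the local gateway is the subnet CIDR + 1, can be double-checked with Python’s standard ipaddress module:

import ipaddress

vpc = ipaddress.ip_network("10.255.0.0/24")
subnet = ipaddress.ip_network("10.255.0.128/25")

print(vpc.network_address + 2)     # 10.255.0.2   -> cloud DNS server
print(subnet.network_address + 1)  # 10.255.0.129 -> local subnet gateway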

Example Network

Let’s run through a quick example. Say you have a network with the following attributes:

  1. VPC with CIDR 10.0.0.0/16
  2. Two subnets with CIDRs 10.0.0.0/25 and 10.0.0.128/25
  3. Primary VNS3 controller running in 10.0.0.0/25 and Secondary running in 10.0.0.128/25
  4. Primary VNS3 private IP is 10.0.0.13 and Secondary VNS3 private IP is 10.0.0.138
  5. Plugin network on each controller is 198.51.100.0/28
  6. Primary HA plugin is running at IP 198.51.100.3 and Secondary Plugin at IP 198.51.100.10

Cloud Setup

Here’s what we need in the cloud route and network security groups:

  1. Cloud route between subnets 10.0.0.0/25 and 10.0.0.128/25. This is typically accomplished with a single route table rule routing the VPC CIDR to the local network. It is typically there by default in public clouds.
  2. Security group rules that allow TCP port 8000 and UDP port 1195 between controllers. These rules can be locked down to their peer’s private IP address (a scripted sketch follows this list):
  • Source: 10.0.0.13/32 and 10.0.0.138/32, Protocol: TCP, Port: 8000, Description: VNS3 Peer API Traffic
  • Source: 10.0.0.13/32 and 10.0.0.138/32, Protocol: UDP, Port: 1195, Description: VNS3 Peering Traffic
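
If you are scripting the AWS side of this setup, the two rules might look like the following with boto3. The security group ID and region are placeholders:

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

for peer_cidr in ("10.0.0.13/32", "10.0.0.138/32"):
    ec2.authorize_security_group_ingress(
        GroupId="sg-0123456789abcdef0",  # placeholder security group
        IpPermissions=[
            {   # VNS3 Peer API traffic
                "IpProtocol": "tcp", "FromPort": 8000, "ToPort": 8000,
                "IpRanges": [{"CidrIp": peer_cidr, "Description": "VNS3 Peer API Traffic"}],
            },
            {   # VNS3 Peering traffic
                "IpProtocol": "udp", "FromPort": 1195, "ToPort": 1195,
                "IpRanges": [{"CidrIp": peer_cidr, "Description": "VNS3 Peering Traffic"}],
            },
        ],
    )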

VNS3 Setup

We need to extend the VPC network to your HA plugins. This will require 2 steps:

  1. Adding routes ensuring traffic goes to the local subnet’s gateway
  2. Adding a firewall policy allowing the plugin network access to the VPC network

Adding routes

It is generally good to add lock-down routes to the local subnet gateway for your VNS3 controllers’ IPs. The local gateway is always the subnet CIDR + 1. So let’s add routes on the secondary VNS3 in subnet 10.0.0.128/25. They will look like this:

  1. Interface: eth0. CIDR: 10.0.0.138/32. Gateway: 10.0.0.129. Advertised: No. Description: Local Private Self
  2. Interface: eth0. CIDR: 10.0.0.13/32. Gateway: 10.0.0.129. Advertised: No. Description: Local Primary VNS3

These lock down VNS3 controller IPs to the local gateway, ensuring traffic is routed properly. You’ll want to create lock-down routes for each VNS3 controller in your peering mesh. So the architecture diagram at the beginning of this post would require 4 lock-down routes.

Finally, we need to add a lock-down route for the DNS server because the secondary controller is in a different subnet from the DNS server. Since the DNS server sits at the VPC CIDR + 2, that route will look like this:

  1. Interface: eth0. CIDR: 10.0.0.2/32. Gateway: 10.0.0.129. Advertised: No. Description: Local DNS Server

Adding Firewall Policy

The firewall is pretty simple. Here’s what it would look like on the secondary controller:

MACRO_CUST -o eth0 -s 198.51.100.10/32 -j MASQUERADE
INPUT_CUST -i plugin0 -s 198.51.100.0/28 -j ACCEPT
OUTPUT_CUST -o plugin0 -d 198.51.100.0/28 -j ACCEPT

These rules do the following:

  1. The MASQUERADE rule allows the plugin to send requests out to the cloud network via eth0 (it is essentially a SNAT rule).
  2. The INPUT and OUTPUT rules accept traffic originating from or traveling to the plugin network.

Automating Your Configuration

And that’s it! You should only ever have to do this once: the configuration can be fully automated when deploying new network segments.

The API endpoints you would use are the following:

  1. Update your VNS3 firewall allowing plugin network access with POST /firewall/rules
  2. Update your VNS3 route table with POST /routes
  3. Upload the Plugin Image with POST /api/container_system/images
  4. Start the Plugin with POST /api/container_system/containers (can pass environment variables)
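
Here is a hedged sketch of that automation using Python’s requests library. The endpoint paths come from the list above; the JSON field names, credentials, and image URL are placeholders rather than a documented schema, so check the VNS3 API reference before relying on them:

import requests

VNS3 = "https://10.0.0.138:8000"
AUTH = ("api", "mysecret")  # VNS3 API user and password (placeholders)

def post(path, payload):
    # verify=False because VNS3 controllers often use self-signed certificates
    r = requests.post(VNS3 + path, json=payload, auth=AUTH, verify=False)
    r.raise_for_status()
    return r.json()

# 1. Allow the plugin network through the firewall (rule string from earlier)
post("/firewall/rules", {"rule": "INPUT_CUST -i plugin0 -s 198.51.100.0/28 -j ACCEPT"})

# 2. Add the DNS lock-down route
post("/routes", {"cidr": "10.0.0.2/32", "interface": "eth0", "gateway": "10.0.0.129"})

# 3. Upload the HA plugin image, then 4. start it with its environment
post("/api/container_system/images", {"name": "vns3-ha", "url": "https://example.com/ha-plugin.tar.gz"})
post("/api/container_system/containers", {
    "image": "vns3-ha",
    "environment": {"HAENV_MODE": "primary", "HAENV_CLOUD": "aws"},
})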

Have any questions? Contact Cohesive Networks support. We take pride in responding quickly.