VNS3 HA Plugin
The high availability (HA) plugin runs on two VNS3 controllers, one in primary mode and one in secondary mode. If the primary controller fails, the secondary controller assumes the primary controller’s routes. It does this by:
- Updating cloud route tables to send traffic to the secondary
- Updating the secondary VNS3 routes with the primary VNS3 controller’s routes to ensure mesh routing is still available
This assumes the following architecture: two VNS3 controllers running in a peered configuration. The diagram shows four controllers, a primary/secondary pair running on either side of a multi-cloud bridge.
- 4 VNS3 Controllers configured in a Peered Mesh.
- Each VNS3 Controller running the HA plugin in the appropriate mode with the appropriate IAM role policy or Azure permissions.
- The Azure route table routes the AWS CIDRs to the primary VNS3 controller so that traffic traverses the bridge, and the AWS route table does the same for the Azure CIDRs.
- A failover event triggers the secondary taking over all cloud and bridge routing.
Image downloads
The Plugin Manager is an alpha feature that allows easier management of your plugins via the VNS3 Console.
Definitions
- HA Plugin: Plugin image that can run in various modes. This plugin runs availability tests against VNS3 controllers in a primary/secondary setup. In the event of a controller failure, the secondary updates the cloud route table and assumes the routes of the failed VNS3 controller so that cloud network routing and VNS3 mesh routing remain available.
- Primary Mode: This mode only handles updating routing on the cloud route table. It can run in AWS, Azure or a virtualized environment such as VMware.
- Secondary Mode: This mode handles updating the cloud route table and assuming the VNS3 routes from the primary controller when it fails.
- Failover Event: VNS3 VM instance failure
- Underlay IP: The primary (or secondary) private IP address of the VNS3 instance as configured
- Overlay IP: The VNS3 controller IP address on the VNS3 overlay network. This can be obtained from the VNS3 status page as Controller Overlay IP.
Permissions
The VNS3 controller will require certain cloud permissions to be able to update the cloud route tables.
AWS permissions
Actions:
- ec2:DescribeRouteTables
- ec2:DescribeInstances
- ec2:ReplaceRoute
Resource:
- VNS3 Controllers
- VPC
- Route Tables
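As a rough illustration, the three actions above could be packaged into an IAM policy document like the one this sketch prints. The route table ARN, account ID and statement IDs are hypothetical placeholders. The split into two statements is an assumption based on the fact that EC2 Describe* actions do not support resource-level permissions, so only ec2:ReplaceRoute is scoped to specific route tables here.

```python
import json

# Hypothetical route table ARN; substitute the ARNs of your own route tables.
ROUTE_TABLE_ARN = "arn:aws:ec2:us-east-1:123456789012:route-table/rtb-0123456789abcdef0"


def build_ha_policy(route_table_arns):
    """Return an IAM policy document covering the three actions listed above."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # EC2 Describe* calls cannot be limited to specific resources,
                # so they are granted on "*".
                "Sid": "DescribeForHa",
                "Effect": "Allow",
                "Action": ["ec2:DescribeRouteTables", "ec2:DescribeInstances"],
                "Resource": "*",
            },
            {
                # Route replacement is limited to the listed route tables.
                "Sid": "ReplaceRouteForHa",
                "Effect": "Allow",
                "Action": ["ec2:ReplaceRoute"],
                "Resource": list(route_table_arns),
            },
        ],
    }


policy = build_ha_policy([ROUTE_TABLE_ARN])
print(json.dumps(policy, indent=2))
```

The printed JSON can be attached to the IAM role used by each VNS3 controller instance.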
Azure permissions
Permissions (actions):
- Microsoft.Compute/virtualMachines/read
- Microsoft.Network/networkInterfaces/read
- Microsoft.Network/virtualNetworks/read
- Microsoft.Network/publicIPAddresses/read
- Microsoft.Network/routeTables/read
- Microsoft.Network/routeTables/routes/read
- Microsoft.Network/routeTables/routes/write
Scope:
- Resource group for the VNS3 controllers
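One possible way to grant these permissions is an Azure custom role scoped to the controllers’ resource group. The sketch below prints a role definition in the JSON shape used for Azure custom roles. The role name, description, subscription ID and resource group name are hypothetical placeholders.

```python
import json

# Hypothetical identifiers; replace with your own subscription and resource group.
SUBSCRIPTION_ID = "00000000-0000-0000-0000-000000000000"
RESOURCE_GROUP = "vns3-controllers-rg"

# The actions listed above.
HA_ACTIONS = [
    "Microsoft.Compute/virtualMachines/read",
    "Microsoft.Network/networkInterfaces/read",
    "Microsoft.Network/virtualNetworks/read",
    "Microsoft.Network/publicIPAddresses/read",
    "Microsoft.Network/routeTables/read",
    "Microsoft.Network/routeTables/routes/read",
    "Microsoft.Network/routeTables/routes/write",
]


def build_ha_role(subscription_id, resource_group):
    """Return a custom role definition scoped to one resource group."""
    scope = f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
    return {
        "Name": "VNS3 HA Plugin",  # hypothetical role name
        "IsCustom": True,
        "Description": "Lets the VNS3 HA plugin read network state and update routes.",
        "Actions": list(HA_ACTIONS),
        "NotActions": [],
        "AssignableScopes": [scope],
    }


role = build_ha_role(SUBSCRIPTION_ID, RESOURCE_GROUP)
print(json.dumps(role, indent=2))
```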
Configuration
You can configure the HA plugin with a configuration file or with environment variables. With environment variables, prepend HAENV_ to each variable name. With the configuration file at /opt/hacontainer/conf/vars.yml, use YAML format without the HAENV_ prefix.
Variables
Primary Mode Variables
Primary mode accepts the following variables:
- mode: primary
- cloud: aws or azure
- peer_public_ip: Public IP address of secondary VNS3 controller
- sleep_time: number of seconds to wait between checks of whether it is the primary and assuming all routes. Default is 15.
Secondary Mode Variables
Secondary mode accepts the following variables:
- mode: secondary
- cloud: aws, azure or overlay (if overlay, only updates VNS3 routes)
- peer_public_ip: Public IP address of primary VNS3 controller
- my_underlay_ip: primary or secondary IP of secondary VNS3 controller
- peer_underlay_ip: primary or secondary IP of primary VNS3 controller
- peer_overlay_ip: overlay IP address of the primary controller
- my_api_password: secondary controller’s API password
- peer_api_password: primary controller’s API password
- sleep_time: number of seconds to wait between checks of whether the primary is down and assuming all routes. Default is 15.
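To make the role of sleep_time concrete, here is a minimal sketch of a secondary-mode polling loop. It is not the plugin’s actual implementation: the three callables (is_primary_up, take_over_cloud_routes, assume_primary_routes) are hypothetical stand-ins for the availability test and the two failover steps, and max_checks and sleep exist only so the loop can be exercised without waiting.

```python
import time


def watch_primary(is_primary_up, take_over_cloud_routes, assume_primary_routes,
                  sleep_time=15, max_checks=None, sleep=time.sleep):
    """Poll the primary and run both failover steps once it stops responding.

    Returns True if failover ran, False if max_checks passed without a failure.
    """
    checks = 0
    while max_checks is None or checks < max_checks:
        if not is_primary_up():
            take_over_cloud_routes()   # point cloud route tables at the secondary
            assume_primary_routes()    # copy the primary's VNS3 routes
            return True
        checks += 1
        sleep(sleep_time)              # wait sleep_time seconds between checks
    return False


# Simulated run: the primary answers twice, then fails.
responses = iter([True, True, False])
events = []
failed_over = watch_primary(
    is_primary_up=lambda: next(responses),
    take_over_cloud_routes=lambda: events.append("cloud"),
    assume_primary_routes=lambda: events.append("vns3"),
    sleep_time=15,
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(failed_over, events)
```

A lower sleep_time detects a failure sooner at the cost of more frequent API calls to the peer controller.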
Configuring via Config file
The configuration file is located at /opt/hacontainer/conf/vars.yml and should be in YAML format. Here is an example of a secondary mode config file:
mode: secondary
cloud: aws
sleep_time: 15
peer_public_ip: 54.237.95.153
my_underlay_ip: 10.255.1.238
peer_underlay_ip: 10.255.1.254
peer_overlay_ip: 100.127.255.253
my_api_password: mysecret
peer_api_password: mysecret
log_level: debug
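Before deploying a config file, it can help to check it against the variable lists above. The following is a minimal sketch that parses flat "key: value" lines (it is not a general YAML parser) and reports missing variables. Treating every listed variable except sleep_time as required is an assumption, as are the helper names.

```python
# Assumption: every variable listed for a mode, except sleep_time, is required.
REQUIRED_KEYS = {
    "primary": {"mode", "cloud", "peer_public_ip"},
    "secondary": {
        "mode", "cloud", "peer_public_ip", "my_underlay_ip", "peer_underlay_ip",
        "peer_overlay_ip", "my_api_password", "peer_api_password",
    },
}
ALLOWED_CLOUDS = {
    "primary": {"aws", "azure"},
    "secondary": {"aws", "azure", "overlay"},
}


def parse_flat_config(text):
    """Parse flat 'key: value' lines into a dict; later keys overwrite earlier ones."""
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'key: value' line: {raw!r}")
        config[key.strip()] = value.strip()
    return config


def check_config(config):
    """Return a list of problems; an empty list means the checks passed."""
    mode = config.get("mode")
    if mode not in REQUIRED_KEYS:
        return [f"mode must be primary or secondary, got {mode!r}"]
    problems = [f"missing variable: {key}" for key in sorted(REQUIRED_KEYS[mode] - config.keys())]
    if "cloud" in config and config["cloud"] not in ALLOWED_CLOUDS[mode]:
        problems.append(f"cloud {config['cloud']!r} is not valid in {mode} mode")
    return problems


sample = """
mode: secondary
cloud: aws
peer_public_ip: 54.237.95.153
my_underlay_ip: 10.255.1.238
peer_underlay_ip: 10.255.1.254
peer_overlay_ip: 100.127.255.253
my_api_password: mysecret
peer_api_password: mysecret
"""
print(check_config(parse_flat_config(sample)))
```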
Configuring via the Environment
You can also configure the HA plugin via the environment by uppercasing each variable name and prepending HAENV_. For example, a primary controller’s environment could be configured with the following variables:
HAENV_MODE=primary
HAENV_CLOUD=aws
HAENV_PEER_PUBLIC_IP=55.55.55.55
HAENV_SLEEP_TIME=15
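The naming rule can be expressed in a few lines. This sketch converts config-file style keys into the HAENV_ form; the helper name to_haenv is hypothetical.

```python
def to_haenv(config):
    """Map vars.yml-style keys to HAENV_-prefixed environment variable names."""
    return {f"HAENV_{key.upper()}": str(value) for key, value in config.items()}


primary = {
    "mode": "primary",
    "cloud": "aws",
    "peer_public_ip": "55.55.55.55",
    "sleep_time": 15,
}
for name, value in to_haenv(primary).items():
    print(f"{name}={value}")
```

Run as is, this prints the same four lines as the primary example above.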
Plugin Network Access
The HA Plugin requires network access so that it can send requests to the environment’s cloud API. Here are the routes and rules required:
- The VNS3 controllers must have network connectivity to each other and allow the following traffic. This requires cloud routes and security group rules; see the example below.
- TCP traffic on port 8000 (VNS3 API traffic)
- UDP traffic on port 1195 (peering traffic)
- The HA plugins require network access so that they can query the cloud APIs and their peer VNS3 controller’s API. The following rules extend the network to the plugins. The default plugin network is 198.51.100.0/28.
MACRO_CUST -o eth0 -s [plugin IP address]/32 -j MASQUERADE
INPUT_CUST -i plugin0 -s [plugin network CIDR] -j ACCEPT
OUTPUT_CUST -o plugin0 -d [plugin network CIDR] -j ACCEPT
- Finally, your HA Plugin requires network access to the cloud DNS server to make API calls. The DNS server is typically located at the base address of the VPC CIDR plus 2, so if your VPC is 10.255.0.0/24, your DNS server is at 10.255.0.2. If one of your controllers is running in a subnet that doesn’t contain the DNS server, you will need to add a VNS3 route to the DNS server via the local subnet’s gateway. See the example below.
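The DNS rule of thumb is easy to check with Python’s ipaddress module. This sketch computes the DNS address from a VPC CIDR and tests whether a controller’s subnet contains it. The two /25 subnets are hypothetical, chosen only to show both outcomes.

```python
import ipaddress


def cloud_dns_ip(vpc_cidr):
    """Return the VPC base address plus 2, where the cloud DNS server typically sits."""
    return ipaddress.ip_network(vpc_cidr).network_address + 2


def needs_dns_route(vpc_cidr, controller_subnet):
    """True when the DNS server lies outside the controller's subnet."""
    return cloud_dns_ip(vpc_cidr) not in ipaddress.ip_network(controller_subnet)


print(cloud_dns_ip("10.255.0.0/24"))                        # 10.255.0.2
print(needs_dns_route("10.255.0.0/24", "10.255.0.0/25"))    # False
print(needs_dns_route("10.255.0.0/24", "10.255.0.128/25"))  # True
```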
Example Network
OK, let’s run through a quick example. Let’s say you have a network with the following attributes:
- VPC with CIDR 10.0.0.0/8
- Two subnets with CIDRs 10.0.0.0/25 and 10.0.0.128/25
- Primary VNS3 controller running in 10.0.0.0/25 and Secondary running in 10.0.0.128/25
- Primary VNS3 private IP is 10.0.0.13 and Secondary VNS3 private IP is 10.0.0.138
- Plugin network on each controller is 198.51.100.0/28
- Primary HA plugin is running at IP 198.51.100.3 and Secondary Plugin at IP 198.51.100.10
Cloud Setup
Here’s what we need in the cloud route and network security groups:
- Cloud route between subnets 10.0.0.0/25 and 10.0.0.128/25. This is typically accomplished with a single route table rule routing the VPC CIDR to the local network. It is typically there by default in public clouds.
- Security group rules that allow TCP port 8000 and UDP port 1195 between controllers. These rules can be locked down to their peer’s private IP address:
- Source: 10.0.0.13/32 and 10.0.0.138/32, Protocol: TCP, Port: 8000, Description: VNS3 Peer API Traffic
- Source: 10.0.0.13/32 and 10.0.0.138/32, Protocol: UDP, Port: 1195, Description: VNS3 Peering Traffic
VNS3 Setup
We need to extend the VPC network to your HA plugins. This will require 2 steps:
- Adding routes ensuring traffic goes to the local subnet’s gateway
- Adding a firewall policy allowing the plugin network access to the VPC network
Adding routes
It is generally good practice to add lock-down routes to the local subnet gateway for your VNS3 controllers’ IPs. The local gateway is always the base address of the subnet CIDR plus 1. So let’s add routes on the secondary VNS3 in subnet 10.0.0.128/25. They will look like this:
- Interface: eth0. CIDR: 10.0.0.138/32. Gateway: 10.0.0.129. Advertised: No. Description: Local Private Self
- Interface: eth0. CIDR: 10.0.0.13/32. Gateway: 10.0.0.129. Advertised: No. Description: Local Primary VNS3
These lock down VNS3 controller IPs to the local gateway, ensuring traffic is routed properly. You’ll want to create lock-down routes for each VNS3 controller in your peering mesh. So the architecture diagram at the beginning of this post would require 4 lock-down routes.
Finally, we need to add a lock-down route for the DNS server because the secondary controller is in a different subnet from the DNS server. The DNS server is at the base address of the VPC CIDR plus 2, so the route will look like this:
- Interface: eth0. CIDR: 10.0.0.2/32. Gateway: 10.0.0.129. Advertised: No. Description: Local DNS Server
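The three routes above follow one pattern, so they can be generated. In this sketch the dictionary keys simply mirror the fields shown in the route listings; they are not a confirmed API payload format, and lockdown_route is a hypothetical helper.

```python
import ipaddress


def lockdown_route(target_ip, local_subnet, description, interface="eth0"):
    """Build a /32 route to target_ip via the local subnet gateway (subnet base + 1)."""
    gateway = ipaddress.ip_network(local_subnet).network_address + 1
    return {
        "interface": interface,
        "cidr": f"{ipaddress.ip_address(target_ip)}/32",
        "gateway": str(gateway),
        "advertised": False,
        "description": description,
    }


# Routes for the secondary controller in subnet 10.0.0.128/25.
secondary_subnet = "10.0.0.128/25"
routes = [
    lockdown_route("10.0.0.138", secondary_subnet, "Local Private Self"),
    lockdown_route("10.0.0.13", secondary_subnet, "Local Primary VNS3"),
    lockdown_route("10.0.0.2", secondary_subnet, "Local DNS Server"),
]
for route in routes:
    print(route)
```

For the primary controller in 10.0.0.0/25 the same helper yields a gateway of 10.0.0.1.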
Adding Firewall Policy
The firewall is pretty simple. Here’s what it would look like on the secondary controller:
MACRO_CUST -o eth0 -s 198.51.100.10/32 -j MASQUERADE
INPUT_CUST -i plugin0 -s 198.51.100.0/28 -j ACCEPT
OUTPUT_CUST -o plugin0 -d 198.51.100.0/28 -j ACCEPT
These rules do the following:
- MASQUERADE allows the plugin to send requests out to the cloud network via eth0 (it’s a SNAT rule essentially).
- The INPUT and OUTPUT rules accept traffic originating from or traveling to the plugin network.
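Since only the plugin IP changes between controllers, the rule set can be templated. This sketch renders the three rules for a given plugin IP and rejects an IP that is not inside the plugin network; the function name is hypothetical.

```python
import ipaddress


def plugin_firewall_rules(plugin_ip, plugin_network="198.51.100.0/28"):
    """Render the three firewall rules for one controller's HA plugin."""
    ip = ipaddress.ip_address(plugin_ip)
    network = ipaddress.ip_network(plugin_network)
    if ip not in network:
        raise ValueError(f"{ip} is not inside plugin network {network}")
    return [
        f"MACRO_CUST -o eth0 -s {ip}/32 -j MASQUERADE",
        f"INPUT_CUST -i plugin0 -s {network} -j ACCEPT",
        f"OUTPUT_CUST -o plugin0 -d {network} -j ACCEPT",
    ]


# Secondary plugin from the example network, then the primary plugin.
for rule in plugin_firewall_rules("198.51.100.10"):
    print(rule)
for rule in plugin_firewall_rules("198.51.100.3"):
    print(rule)
```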
Automating Your Configuration
And that’s it! But you should only ever have to do this once. This configuration can be totally automated when deploying new network segments.
The API endpoints you would use are the following:
- Update your VNS3 firewall allowing plugin network access with POST /firewall/rules
- Update your VNS3 route table with POST /routes
- Upload the Plugin Image with POST /api/container_system/images
- Start the Plugin with POST /api/container_system/containers (can pass environment variables)
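As a starting point for automation, the sketch below builds (but does not send) POST requests for the firewall and route endpoints using only the Python standard library. Several details are assumptions not confirmed by this document: HTTPS on the API port 8000, HTTP basic auth with the user name "api", the /api base path for every endpoint, and the JSON field names in each payload. Check them against the VNS3 API reference before use.

```python
import base64
import json
import urllib.request

API_PORT = 8000  # VNS3 API port named in this document


def build_post(host, api_password, path, payload, user="api"):
    """Build an authenticated JSON POST for a VNS3 controller without sending it.

    Assumptions: HTTPS, basic auth with user "api", an /api base path,
    and JSON request bodies.
    """
    token = base64.b64encode(f"{user}:{api_password}".encode()).decode()
    return urllib.request.Request(
        url=f"https://{host}:{API_PORT}/api{path}",
        data=json.dumps(payload).encode(),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Payload field names below are assumptions, not confirmed by this document.
steps = [
    ("/firewall/rules", {"rule": "INPUT_CUST -i plugin0 -s 198.51.100.0/28 -j ACCEPT"}),
    ("/routes", {"cidr": "10.0.0.2/32", "interface": "eth0", "gateway": "10.0.0.129",
                 "description": "Local DNS Server", "advertise": False}),
]
requests_to_send = [build_post("10.0.0.138", "mysecret", path, body) for path, body in steps]
for request in requests_to_send:
    print(request.get_method(), request.full_url)
```

Each request could then be passed to urllib.request.urlopen once the endpoint details have been verified.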
Have any questions? Contact Cohesive Networks support. We take pride in responding quickly.
Updated on 21 Oct 2020