Multi-Region Peering mesh with Terraform and VNS3

Creating a dynamic routing mesh network involves the following steps:

  1. Build AWS VPCs in the desired regions
  2. Launch VNS3 controllers in your VPCs
  3. Configure the VNS3 controllers
  4. Peer VNS3 controllers with VNS3 API

Steps 1 and 2 are accomplished with Terraform, and steps 3 and 4 with the VNS3 API, which we call here through the Cohesive Networks Python SDK.

Technologies

This tutorial makes use of Terraform and the Cohesive Networks Python SDK to build out the cloud virtual networks and the mesh network with VNS3.

Building your Cloud with Terraform

Launching VNS3 in AWS and Azure is simple with terraform. The terraform code used for this example can be found in our templates repository. The multi-cloud topology code can be found here.

The following files are included:

  • vpc.tf - Builds two AWS VPCs in different regions: us-east-1 and us-west-2
  • vns3.tf - Launches VNS3 in the VPCs created by vpc.tf. We launch one VNS3 controller in each subnet of each VPC, four in total, so we end up with a 4-way peered mesh. The controllers are assigned public IP addresses, and the security group only allows access from the network range defined by the variable access_cidr in variables.tf
  • bridge.tf - Creates a peering connection between the two VPCs, which allows us to build an entirely private mesh network. It also adds the required routes and security group rules. VNS3 peering occurs over UDP ports 1195-1205.
  • variables.tf - Defines the input variables required for this infrastructure
  • outputs.tf - Defines the Terraform outputs for the topology

You can build the infrastructure by running the following:

cd vns3-peering-aws-tf
terraform init
# Create a terraform plan with a timestamp, then apply that same plan
PLAN="build__$(date -u +"%Y-%m-%dT%H-%M-%SZ").tfplan"
terraform plan -out "$PLAN"
terraform apply "$PLAN"

After about 10 minutes you will get your outputs:

Apply complete! Resources: 40 added, 0 changed, 0 destroyed.

Outputs:

vpc1_cidr = 10.0.1.0/24
vpc1_controller_dns_hostnames = [
  "ec2-52-202-1-140.compute-1.amazonaws.com",
  "ec2-18-235-153-11.compute-1.amazonaws.com",
]
vpc1_controller_public_ips = [
  "52.202.1.140",
  "18.235.153.11",
]
vpc1_default_security_group_id = sg-0e93ea1debcb3f42d
vpc1_id = vpc-0eb964047205b5d9a
vpc1_route_table_id = rtb-00074d5903cc7a07c
vpc1_subnet_ids = [
  "subnet-03c21a171a0332daf",
  "subnet-054a1c31b5b6256be",
]
vpc1_subnets = [
  "10.0.1.0/25",
  "10.0.1.128/25",
]
vpc1_vns3_instance_ids = [
  "i-0920f020991b9e4d0",
  "i-07857bd1b9b2c2c29",
]
vpc2_cidr = 10.0.2.0/24
vpc2_controller_dns_hostnames = [
  "ec2-54-185-79-180.us-west-2.compute.amazonaws.com",
  "ec2-52-25-196-234.us-west-2.compute.amazonaws.com",
]
vpc2_controller_public_ips = [
  "54.185.79.180",
  "52.25.196.234",
]
vpc2_default_security_group_id = sg-027d3bf6e6dd6454e
vpc2_id = vpc-0a0c2d4823a307c63
vpc2_route_table_id = rtb-081749c2a1af2399a
vpc2_subnet_ids = [
  "subnet-08435a036c82a9637",
  "subnet-0b926a2e93bec9229",
]
vpc2_subnets = [
  "10.0.2.0/25",
  "10.0.2.128/25",
]
vpc2_vns3_instance_ids = [
  "i-0ebc766ef9f5321da",
  "i-063bd20922c57c2cc",
]
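
The later steps need these addresses in a script, so it helps to read them from Terraform rather than copy them by hand. The following is a minimal sketch, assuming you save the outputs with terraform output -json and that the output names match the ones shown above. The helper name controller_ips is ours for illustration and is not part of the templates repository or the SDK.

```python
import json


def controller_ips(terraform_output_json, key_suffix="_controller_public_ips"):
    """Collect controller IPs from `terraform output -json` text, VPC by VPC."""
    outputs = json.loads(terraform_output_json)
    ips = []
    # Sort the output names so vpc1 comes before vpc2 and the order is stable.
    for name in sorted(outputs):
        if name.endswith(key_suffix):
            # Each output is an object whose "value" field holds the data.
            ips.extend(outputs[name]["value"])
    return ips


# Trimmed-down sample in the shape that `terraform output -json` produces.
sample = json.dumps({
    "vpc1_cidr": {"sensitive": False, "type": "string", "value": "10.0.1.0/24"},
    "vpc1_controller_public_ips": {
        "sensitive": False,
        "type": ["list", "string"],
        "value": ["52.202.1.140", "18.235.153.11"],
    },
    "vpc2_controller_public_ips": {
        "sensitive": False,
        "type": ["list", "string"],
        "value": ["54.185.79.180", "52.25.196.234"],
    },
})

print(controller_ips(sample))
```

In practice you would run terraform output -json > outputs.json and pass the contents of that file to the function instead of the sample string.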

Configuring VNS3 Controllers

Configuring the VNS3 controllers involves generating the topology keyset on controller 1 and then fetching that configuration from the other controllers.

  1. Upload a license to controller 1 with PUT /license
  2. Configure and accept the license parameters with PUT /license/parameters
  3. Generate a keyset with PUT /keyset
  4. On each other controller, fetch the licensed configuration from controller 1

Here’s some code with all of the API calls using the Python SDK:

# root_client talks to controller 1; peer_clients talk to controllers 2-4
root_client.licensing.upload_license(license_file_data)
root_client.licensing.put_set_license_parameters(**license_parameters)
root_client.config.put_keyset(token=shared_token)
# Poll on the keyset until it is available
root_client.config.wait_for_keyset()

# Each peer fetches the configuration from controller 1
for client in peer_clients:
    client.config.put_keyset(source=root_client_ip_address, token=shared_token)

VNS3 takes some time to configure itself during licensing, so the calls above require polling on the responses. The fetch performed by each peer controller in step 4 also triggers background tasks as the controller fetches and applies the configuration, which means polling on the state of the controller as well. We do this a lot for our topologies, so we added helper functions to the Python SDK that idempotently configure your controllers:

from cohesivenet.macros import config, state

# Root client = controller 1
config.setup_controller(
    root_client,
    parameters["topology_name"],
    parameters["license"],
    license_parameters={"default": True},
    keyset_parameters={"token": keyset_token},
    reboot_timeout=240,
    keyset_timeout=240,
)

# peer_clients = List[VNS3Clients] for controllers 2-4
# state.get_primary_private_ip fetches the private IP address of the root client.
config.fetch_keysets(
    peer_clients, state.get_primary_private_ip(root_client), keyset_token
)

This configures controller 1 with the provided parameters if it is not already configured, and config.fetch_keysets then configures controllers 2-4 from it.
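
To give a feel for the kind of polling these helpers take care of, here is a generic wait loop. It is a sketch of the pattern only, not the SDK's implementation: wait_until, its parameters, and the keyset_ready stand-in are hypothetical names used for illustration.

```python
import time


def wait_until(check, timeout=240, interval=5, sleep=time.sleep):
    """Call check() until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = check()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        sleep(interval)


# Stand-in for a real readiness check, such as asking a controller
# whether its keyset has been generated. It succeeds on the third call.
attempts = {"count": 0}


def keyset_ready():
    attempts["count"] += 1
    return attempts["count"] >= 3


wait_until(keyset_ready, timeout=10, interval=0)
print(attempts["count"])  # 3
```

A timeout like the 240 seconds passed to setup_controller above plays the same role as the timeout argument here: it bounds how long the script waits before giving up.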

Creating the Peering Mesh

Creating the peering mesh requires a few steps such that our network is configured properly:

  1. Create explicit routes for VNS3 peer IP addresses
  2. Create the peering map on each VNS3
  3. Create route advertisements for each VNS3’s subnet

Creating explicit VNS3 peer routes

By default, VNS3 will only have a route for its local subnet. Our VPCs are peered together, and each subnet’s cloud route table contains a route that sends traffic for the peer VPC’s subnets to the peering connection. This was created with Terraform in bridge.tf. Now we need to tell each VNS3 controller how to reach its peers. Here’s a table of our VNS3 mesh:

ID  Controller   Subnet
1   10.0.1.36    10.0.1.0/25
2   10.0.1.198   10.0.1.128/25
3   10.0.2.100   10.0.2.0/25
4   10.0.2.193   10.0.2.128/25

So controller 1 needs routes to controllers 2, 3 and 4 at 10.0.1.198, 10.0.2.100 and 10.0.2.193, and likewise for every other controller. You can create a route with the POST /routes call. Here’s an example with the SDK:

vns3_client.routing.post_create_route(
    cidr="10.0.1.198/32",
    description="Peer VNS3 cloud route",
    interface="eth0",
    gateway="10.0.1.1",
    advertise=False
)

How did we get the gateway of 10.0.1.1? This is the default router for controller 1’s subnet, 10.0.1.0/25. Each AWS subnet has a router located at the subnet’s network address plus one.
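
The "network address plus one" rule is easy to compute with Python's standard ipaddress module. This is a small sketch; the function name subnet_gateway is ours and is not part of the SDK.

```python
import ipaddress


def subnet_gateway(subnet_cidr):
    """Return the subnet's network address plus one, where AWS places the router."""
    network = ipaddress.ip_network(subnet_cidr)
    return str(network.network_address + 1)


print(subnet_gateway("10.0.1.0/25"))    # 10.0.1.1
print(subnet_gateway("10.0.1.128/25"))  # 10.0.1.129
print(subnet_gateway("10.0.2.128/25"))  # 10.0.2.129
```

Note that the gateway depends on the subnet of the controller creating the route, not on the peer it is routing to.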

Building the full mesh involves creating 12 routes: 3 for each of the 4 controllers. Having explicit and specific routes makes for a more secure and robust network. We wrote a function in the Python SDK to make this a little easier:

import os
from cohesivenet.macros import routing, connect

controller_password = os.environ.get("VNS3_MASTER_PASSWORD")
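# One subnet per controller, in the same order as hosts_connection_params
ordered_subnets = ["10.0.1.0/25", "10.0.1.128/25", "10.0.2.0/25", "10.0.2.128/25"]
# Connect to all controllers
# hosts_connection_params=List[dict{host=str, password=str}]
# connecting via public IPs returned by terraform outputs
# (assumes the output order matches the subnet order above)
hosts_connection_params = [
  {"host": "52.202.1.140", "password": controller_password},
  {"host": "18.235.153.11", "password": controller_password},
  {"host": "54.185.79.180", "password": controller_password},
  {"host": "52.25.196.234", "password": controller_password},
]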
vns3_clients = connect.get_clients(hosts_connection_params)
routing.create_peer_mesh_local_gw_routes(vns3_clients, ordered_subnets)
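
To make the 12 routes concrete, the sketch below enumerates them from the mesh table using only the standard library. It illustrates what needs to exist on each controller and is not the macro's actual implementation. The MESH list and the peer_routes function are illustrative names, and the eth0 interface is carried over from the single-route example.

```python
import ipaddress

# Controller private IPs and their subnets, taken from the mesh table.
MESH = [
    ("10.0.1.36", "10.0.1.0/25"),
    ("10.0.1.198", "10.0.1.128/25"),
    ("10.0.2.100", "10.0.2.0/25"),
    ("10.0.2.193", "10.0.2.128/25"),
]


def peer_routes(mesh):
    """Build one /32 route per (controller, peer) pair via the local gateway."""
    routes = []
    for controller_ip, subnet in mesh:
        # The local gateway is the controller's own subnet address plus one.
        gateway = str(ipaddress.ip_network(subnet).network_address + 1)
        for peer_ip, _ in mesh:
            if peer_ip == controller_ip:
                continue  # a controller needs no route to itself
            routes.append({
                "controller": controller_ip,
                "cidr": f"{peer_ip}/32",
                "interface": "eth0",
                "gateway": gateway,
                "advertise": False,
            })
    return routes


routes = peer_routes(MESH)
print(len(routes))  # 12
print(routes[0])
```

The first entry is the same route that the single-route example creates on controller 1: 10.0.1.198/32 via 10.0.1.1.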

Creating the peering connection map

VNS3 peering connections are fully encrypted TLS connections using x509 certificates. Creating a mesh network requires configuring your peering “map”. Each VNS3 controller in the mesh has an ID and maintains a map of its peers. We will use the table above, with peer 1 at 10.0.1.36.

We are using the private IP addresses here for the peering connection, as the VPCs’ private networks are peered. We could also use the AWS DNS names, such as ec2-75-101-214-198.compute-1.amazonaws.com, for the peering connection.

Tip: It is a good idea to use AWS DNS names, along with Elastic IPs, when configuring your peering mesh. DNS handles the case where your instance fails and you fail over to a new instance that might have a different primary IP address. Amazon DNS will resolve to the local private IP address.

Each VNS3 controller requires the following 2 API calls:

  1. Set VNS3 peering ID with PUT /peering/self
  2. Set the VNS3 peering map with POST /peering/peers

Setting VNS3 peering for controller 1:

resp = vns3_client.peering.put_self_peering_id(id=1)
peer_map = [
    {"id": 2, "ip": "10.0.1.198"},
    {"id": 3, "ip": "10.0.2.100"},
    {"id": 4, "ip": "10.0.2.193"},
]
for peer in peer_map:
    vns3_client.peering.post_create_peer(
        id=peer["id"], name=peer["ip"]
    )
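
Controllers 2, 3 and 4 follow the same pattern with their own ID and the remaining three peers. Here is a sketch of how each controller's map can be derived from one ordered list of private IPs; build_peer_maps is a hypothetical helper written for this post, not an SDK function.

```python
def build_peer_maps(controller_ips):
    """Map each peer ID (1-based list position) to the list of its peers."""
    peers = [{"id": i, "ip": ip} for i, ip in enumerate(controller_ips, start=1)]
    return {
        me["id"]: [peer for peer in peers if peer["id"] != me["id"]]
        for me in peers
    }


maps = build_peer_maps(["10.0.1.36", "10.0.1.198", "10.0.2.100", "10.0.2.193"])
print(maps[1])  # the same three peers as the peer_map for controller 1
print(maps[4])
```

Each controller would then set its own ID and loop over its entry in maps, as in the example for controller 1.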

We create peering maps a lot, so to make this easier we created an API “macro” that sets the peer IDs and peers your VNS3 clients together:

from cohesivenet.macros import peering, connect

# hosts_connection_params are ordered according to the map above!
#  e.g. hosts_connection_params[0] == 
vns3_clients = connect.get_clients(hosts_connection_params)

# Sets IDs based on order of list. client at vns3_clients[0] gets ID=1.
peering.set_peer_ids(vns3_clients)
# defaults to using primary private IPs for peering mesh
peering.peer_mesh(vns3_clients)

Creating Route Advertisements

Now that all VNS3 controllers are meshed together we can take advantage of the dynamic routing running over the mesh. Each controller can advertise which networks it has connectivity to, informing each network peered into the mesh. In our somewhat contrived topology we have the following subnets for each controller:

  1. VNS3 Peer 1 in subnet 10.0.1.0/25
  2. VNS3 Peer 2 in subnet 10.0.1.128/25
  3. VNS3 Peer 3 in subnet 10.0.2.0/25
  4. VNS3 Peer 4 in subnet 10.0.2.128/25

So each VNS3 controller can create a route advertisement with the POST /routes endpoint, advertising the subnet to the mesh. With the Python SDK, the call looks like this:

vns3_client.routing.post_create_route(
    cidr="10.0.1.0/25",
    description="Local subnet advertisement",
    advertise=True
)

Creating route advertisements is pretty common, so we provide a parallelized “macro” for creating these route advertisements across a mesh:

from cohesivenet.macros import routing

ordered_subnets = [
    "10.0.1.0/25",
    "10.0.1.128/25",
    "10.0.2.0/25",
    "10.0.2.128/25"
]
routing.create_route_advertisements(vns3_clients, ordered_subnets)
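
For a sense of what parallelizing these calls can look like, here is a sketch built on the standard library's thread pool. It is not the macro's implementation. advertise_subnets and the FakeClient stand-in are hypothetical, and the only SDK call assumed is post_create_route with the arguments shown in the example above.

```python
from concurrent.futures import ThreadPoolExecutor


def advertise_subnets(clients, subnets):
    """Advertise subnets[i] from clients[i], running the API calls in parallel."""
    if len(clients) != len(subnets):
        raise ValueError("expected exactly one subnet per client")

    def advertise(pair):
        client, subnet = pair
        return client.routing.post_create_route(
            cidr=subnet,
            description="Local subnet advertisement",
            advertise=True,
        )

    with ThreadPoolExecutor(max_workers=len(clients) or 1) as pool:
        # map() returns results in input order, whatever order the calls finish in.
        return list(pool.map(advertise, zip(clients, subnets)))


# Stand-ins that record calls, so the sketch runs without real controllers.
class FakeRouting:
    def __init__(self):
        self.calls = []

    def post_create_route(self, **kwargs):
        self.calls.append(kwargs)
        return kwargs["cidr"]


class FakeClient:
    def __init__(self):
        self.routing = FakeRouting()


clients = [FakeClient() for _ in range(4)]
subnets = ["10.0.1.0/25", "10.0.1.128/25", "10.0.2.0/25", "10.0.2.128/25"]
results = advertise_subnets(clients, subnets)
print(results)
```

Threads suit this job because each call spends most of its time waiting on a controller's HTTP response rather than using the CPU.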

Putting it all together

To wrap up, we configured a highly available mesh network with dynamic routing in a private cloud environment. Building network topologies with VNS3 can be fully automated and made reproducible with tools like Terraform and the VNS3 API.

If you’d like to see a final, working Python script that builds this peering mesh, you can check it out here, as one of the Python SDK examples.

Any questions on how to automate your network? Email us at support@cohesive.net or open a ticket directly on our support site.