Telegraf Plugin

# Telegraf Plugin Documentation

Telegraf is a plugin-driven agent that collects and sends monitoring data from a VNS3 controller to various destinations.

Input Plugins

Telegraf can leverage over 250 input plugins (https://docs.influxdata.com/telegraf/v1/plugins/#input-plugins) to collect metrics from a system, services, or third-party APIs. Not all plugins are relevant for the VNS3 controller. The Telegraf v1 plugin comes preconfigured to collect VNS3-produced data via SNMP traps and IPsec Tunnel API metrics.

Output Plugins

Similarly, Telegraf can leverage over 60 output plugins (https://docs.influxdata.com/telegraf/v1/plugins/#output-plugins) to send or write monitoring data to various destinations. The choice of output plugin depends on where you want to send your data. Each output plugin has its own set of configuration parameters to ensure compatibility with the target system. Outputs are controlled via the telegraf.conf configuration file.

Example Outputs

  1. HTTP API Output

    • Sends metrics to an HTTP endpoint.
    • Useful for custom APIs that accept monitoring data.
    • Documentation
    • Configuration Example:
      [[outputs.http]]
        url = "https://$MONOCLE_DISCOVERY_URL/v1/metrics"
        username = "api_user"
        password = "api_password"
        timeout = "5s"
        method = "POST"
        data_format = "json"
      
  2. InfluxDB Output

    • Stores time-series data in an InfluxDB instance.
    • Supports local and cloud-based InfluxDB setups.
    • Documentation
    • Configuration Example:
      [[outputs.influxdb_v2]]
        urls = ["https://influxdb-cloud-url"]
        token = "influxdb-token"
        organization = "customer-org"
        bucket = "metrics"
      
  3. Prometheus Output

    • Exposes metrics in Prometheus format for scraping.
    • Ideal for Kubernetes and cloud-native monitoring.
    • Documentation
    • Configuration Example:
      [[outputs.prometheus_client]]
        listen = ":9273"
        metric_version = 2
      
  4. Elasticsearch Output

    • Sends metrics to Elasticsearch for indexing and searching.
    • Works well with Kibana for visualization.
    • Documentation
    • Configuration Example:
      [[outputs.elasticsearch]]
        urls = ["http://elasticsearch-host:9200"]
        index_name = "telegraf-metrics"
        username = "elastic-user"
        password = "elastic-password"
      
  5. Kafka Output

    • Streams data into a Kafka topic for distributed processing.
    • Useful for large-scale event-driven architectures.
    • Documentation
    • Configuration Example:
      [[outputs.kafka]]
        brokers = ["kafka-broker1:9092", "kafka-broker2:9092"]
        topic = "telegraf-metrics"
        data_format = "json"
      
  6. PostgreSQL Output

    • Stores structured data in a relational database.
    • Ideal for advanced SQL-based analysis.
    • Documentation
    • Configuration Example:
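      [[outputs.postgresql]]
        # The libpq-style connection string is set with the "connection" option
        connection = "host=postgres-host user=postgres password=postgres dbname=telegraf sslmode=disable"
        # Store all tags in a single JSONB column
        tags_as_jsonb = true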
      
  7. AWS CloudWatch Output

    • Sends metrics to AWS CloudWatch for monitoring and alerting.
    • Useful for AWS-based infrastructure.
    • Documentation
    • Configuration Example:
      [[outputs.cloudwatch]]
        region = "us-east-1"
        namespace = "TelegrafMetrics"
        access_key = "aws-access-key"
        secret_key = "aws-secret-key"
      
  8. Loki Output

    • Sends logs to Loki for centralized log aggregation.
    • Works well with Grafana for visualization.
    • Documentation
    • Configuration Example:
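      [[outputs.loki]]
        # Base URL of the Loki server; the push path is set separately
        domain = "http://loki-host:3100"
        endpoint = "/loki/api/v1/push"
        # The tenant is sent as the X-Scope-OrgID header; labels are derived from metric tags
        [outputs.loki.http_headers]
          X-Scope-OrgID = "loki-tenant"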
      

Capabilities

1. SNMP Input Plugin (snmp.conf)

The SNMP input plugin in Telegraf is used to collect network and system metrics from SNMP-enabled devices. This configuration allows Telegraf to pull data using SNMP and store the results for processing. The SNMP Input Plugin is controlled by snmp.conf (Telegraf SNMP Plugin Documentation).

NOTE: Any changes made to snmp.conf require a restart of either the Telegraf service or the Telegraf plugin.

  • Device IP: The SNMP input plugin retrieves data from the VNS3 IP provided by an environment variable when the plugin instance is started.
  • SNMP Port: Configured for port 161.
  • hostname: Uses the HOSTNAME environment variable specified when the plugin instance is started.
  • customerPrefix: Uses the TOPOLOGY environment variable specified when the plugin instance is started.
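
As a rough illustration of how the settings above fit together, the sketch below shows what an SNMP input section might look like. It is not the snmp.conf shipped with the plugin: the SNMP version, the community string "public", the two sample fields, and the use of VNS3_IP for the agent address are all assumptions made for the example. Telegraf replaces ${NAME} references with the values of environment variables when it loads the file.

```toml
# Illustrative sketch only; the shipped snmp.conf may differ.
[[inputs.snmp]]
  # Poll the VNS3 controller on the standard SNMP port
  agents = ["udp://${VNS3_IP}:161"]
  version = 2            # assumed SNMP version
  community = "public"   # assumed community string

  # sysDescr.0 from the standard MIB-2 system group
  [[inputs.snmp.field]]
    name = "system_description"
    oid = ".1.3.6.1.2.1.1.1.0"

  # sysUpTime.0, reported in hundredths of a second (ticks)
  [[inputs.snmp.field]]
    name = "system_uptime"
    oid = ".1.3.6.1.2.1.1.3.0"

  # Tags attached to every metric produced by this input
  [inputs.snmp.tags]
    hostname = "${HOSTNAME}"
    customerPrefix = "${TOPOLOGY}"
```

Each additional metric would follow the same pattern: one field block per OID, with the name chosen to match the metric names listed in this section.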

Metrics Collected:

  1. System Metrics

    • system_version - Device system version
    • system_description - Device description
    • system_uptime - System uptime in ticks
  2. CPU Metrics

    • cpu_idle - Percentage of idle CPU time
    • cpu_system - Percentage of CPU time spent in system mode
    • cpu_user - Percentage of CPU time spent in user mode
    • cpu_wait - Percentage of CPU time spent waiting for IO
  3. Load Metrics

    • load_avg_1min - System load average over 1 minute
    • load_avg_5min - System load average over 5 minutes
    • load_avg_15min - System load average over 15 minutes
  4. Memory Metrics

    • memory_available - Available memory in bytes
    • memory_swap_available - Available swap memory
    • memory_total - Total memory in bytes
    • memory_cached - Cached memory in bytes
    • memory_swap_in - Swap memory used (in bytes)
    • memory_swap_out - Swap memory freed (in bytes)
  5. Disk Metrics

    • disk_root_total - Total disk space available on the root partition
    • disk_root_used - Disk space used on the root partition
    • disk_root_allocation_units - Allocation unit size for disk usage calculations
  6. Network Metrics

    • tcp_connections_established - Number of active TCP connections
    • tcp_attempt_fails - Number of failed TCP connection attempts
    • tcp_in_segments - Incoming TCP segments count
    • tcp_out_segments - Outgoing TCP segments count
    • tcp_passive_opens - Number of passive TCP connections opened
    • tcp_retrans_segments - Number of retransmitted TCP segments
    • udp_datagrams_received - Number of received UDP datagrams
    • udp_errors - Count of UDP errors
    • udp_datagrams_sent - Number of sent UDP datagrams

2. IPsec Tunnel Status Plugin (ipsec.conf)

The ipsec.conf configuration enables Telegraf to collect VPN tunnel metrics from an HTTP endpoint (Telegraf HTTP Plugin Documentation). The data is served by the VNS3 controller API and is collected as time-series data.

NOTE: Any changes made to ipsec.conf require a restart of either the Telegraf service or the Telegraf plugin.

Metrics Collected for Each IPsec Tunnel:

  • uptime - VPN tunnel uptime
  • host - VNS3 device name
  • topology - VNS3 device topology name
  • instance - IPsec Endpoint name
  • tunnel_description - Description of the VPN tunnel
  • Active - Whether the tunnel is active
  • Connected - Connection status
  • Enabled - Whether the tunnel is enabled
  • subnets - Subnets associated with the VPN tunnel
  • Phase1 - Phase 1 status of the VPN tunnel
  • Phase2 - Phase 2 status of the VPN tunnel
  • Phase1Lifetime - Lifetime of Phase 1 negotiations
  • Phase2Lifetime - Lifetime of Phase 2 negotiations
  • nat_t - NAT traversal status
  • dpd - Dead peer detection (DPD) status
  • phase1_cipher - Encryption cipher used in Phase 1
  • phase1_prf - Pseudo-random function used in Phase 1
  • phase1_dh_group - Diffie-Hellman group for Phase 1
  • phase2_algo - Algorithm used in Phase 2
  • phase2_hash - Hashing method used in Phase 2
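
For orientation, polling a JSON API with the Telegraf HTTP input generally looks like the sketch below. It is not the ipsec.conf shipped with the plugin: the endpoint URL and port, the header name, the certificate handling, and the choice of which values become tags or string fields are all hypothetical and should be checked against the VNS3 API documentation.

```toml
# Illustrative sketch only; the shipped ipsec.conf may differ.
[[inputs.http]]
  # Hypothetical endpoint; check the VNS3 API documentation for the real path
  urls = ["https://${VNS3_IP}:8000/api/ipsec"]
  method = "GET"
  timeout = "5s"
  # Accept a self-signed controller certificate (assumption)
  insecure_skip_verify = true
  # Parse the response body as JSON
  data_format = "json"
  # Keep selected string values as fields; JSON strings are otherwise dropped
  json_string_fields = ["tunnel_description", "phase1_cipher"]
  # Promote identifying values to tags
  tag_keys = ["instance"]

  # Header name is an assumption; the token comes from the API_TOKEN variable
  [inputs.http.headers]
    api-token = "${API_TOKEN}"
```

The JSON parser keeps numeric and boolean values by default, so string-valued metrics such as cipher names need to be listed explicitly as string fields or tags to survive parsing.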

Getting Started

OPTIONAL - API Token

If the Telegraf plugin will be monitoring IPsec tunnel status, an API Token will need to be created and included as an environment variable when the plugin instance is started.

API tokens can be created from the Access Management page, located under the Controller’s left column menu.

  1. Click “New API Token”
  2. On the resulting modal popup, enter a Name, specify a lifetime of 1 hour, click the “Token lifetime refreshes when used” checkbox, and click Generate.
  3. Copy the token for use during Step 2 - Start Plugin Instance.

Step 1 - Install Plugin

The Telegraf plugin is listed in the Public Catalog available on all VNS3 controllers with public Internet access.

To install the Plugin Image on a VNS3 controller, click the Install button associated with the Telegraf listing.

For controllers without public Internet access, the Telegraf plugin can be uploaded directly to the VNS3 controller OR a private Plugin Image Catalog can be configured. Contact Cohesive Support for more information.

Step 2 - Start Plugin Instance

After a Plugin Image is installed or uploaded, it will be listed on the Plugins tab on the Plugins Dashboard with a Ready status. Perform the following actions to start an instance of the Telegraf Plugin:

  1. Click the Actions menu associated with the Telegraf Plugin.
  2. Click Start Instance
  3. On the resulting modal popup, enter a Name, an optional Description, and an IP address (letting the system auto-assign the address is recommended unless there is a specific requirement).
  4. Click the Advanced dropdown to specify environment variables the Telegraf plugin requires for operation.
  5. Specify the environment variables required for the specific Telegraf use case. These cannot be changed once the Plugin instance is running. If you need to make changes to the Environment Variables, a new Plugin instance will need to be created.
  6. Click Start.

Environment Variables

The following table shows the environment variables used by the Telegraf plugin.

| Variable | Description | Default |
|---|---|---|
| ENABLE_SNMP | Enable/disable SNMP trap collection | true |
| ENABLE_IPSEC | Enable/disable IPsec metric polling | true |
| ENABLE_API | Enable/disable API metric polling | true |
| HOSTNAME | Hostname used in data reporting | (empty) |
| TOPOLOGY | Topology name used in data reporting | (empty) |
| VNS3_IP | Private VNS3 IP for API calls | (empty) |
| API_TOKEN | Token used to authenticate API calls | (empty) |
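
When editing the Telegraf configuration files later, these variables can be referenced with the ${NAME} syntax, which Telegraf substitutes when it loads the configuration. The fragment below is a minimal sketch of that mechanism, not a copy of the shipped telegraf.conf; whether the plugin already sets these options is an assumption to verify in the actual file.

```toml
# Illustrative sketch only.
[agent]
  # Report the name passed in HOSTNAME instead of the container's own hostname
  hostname = "${HOSTNAME}"

[global_tags]
  # Attach the topology name to every metric from every input
  topology = "${TOPOLOGY}"
```

Values placed in global_tags are added to all metrics, which makes it easier to tell controllers apart when several of them report to the same destination.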

Step 3 - VNS3 Firewall

The Telegraf Plugin requires different network access configurations via the VNS3 firewall, depending on its use case. The following are the rulesets for push output plugin configurations, where the Telegraf plugin sends data to endpoints. If you are using an output plugin that is scraped by an external system (e.g. Prometheus), additional VNS3 Firewall and Cloud Network Security Group rules will be needed.

Push Ruleset

# Telegraf Plugin - Allow access out to Internet via VNS3 IP to push monitoring data
POSTROUTING -o eth0 -s <telegraf plugin IP> -j MASQUERADE-ONCE
# Telegraf Plugin - Allow plugin to pull data from VNS3 SNMP
INPUT -p udp -s <telegraf plugin IP> --dport snmp -j ACCEPT
# Telegraf Plugin - Allow VNS3 to respond to plugin requests
OUTPUT -d <telegraf plugin IP> -j ACCEPT
# Telegraf Plugin - Allow traffic from plugin through VNS3
FORWARD -s <telegraf plugin IP> -j ACCEPT
# Telegraf Plugin - Allow traffic to plugin through VNS3
FORWARD -d <telegraf plugin IP> -j ACCEPT

Pull Ruleset

# Telegraf Plugin - Allow pull service to access monitoring data on the plugin - example Prometheus
PREROUTING -p tcp --dport 9273 -d ${vns3_primary_private_ip} -j DNAT --to <telegraf plugin IP>:9273
# Telegraf Plugin - Allow plugin to pull data from VNS3 SNMP
INPUT -p udp -s <telegraf plugin IP> --dport snmp -j ACCEPT
# Telegraf Plugin - Allow VNS3 to respond to plugin requests
OUTPUT -d <telegraf plugin IP> -j ACCEPT
# Telegraf Plugin - Allow traffic from plugin through VNS3
FORWARD -s <telegraf plugin IP> -j ACCEPT
# Telegraf Plugin - Allow traffic to plugin through VNS3
FORWARD -d <telegraf plugin IP> -j ACCEPT
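
Because the PREROUTING rule above forwards port 9273 to the plugin, it can be worth restricting who may scrape the endpoint on the Telegraf side as well. The sketch below uses standard options of the Prometheus client output; the CIDR range and credentials are placeholders, not values from a real deployment.

```toml
# Illustrative sketch only.
[[outputs.prometheus_client]]
  # Must match the port forwarded by the PREROUTING rule
  listen = ":9273"
  metric_version = 2
  path = "/metrics"
  # Hypothetical CIDR of the Prometheus server; other sources are rejected
  ip_range = ["10.0.1.0/24"]
  # Optional HTTP basic authentication (placeholder credentials)
  basic_username = "prometheus"
  basic_password = "change-me"
```

If the listen port is changed here, the port in the PREROUTING rule and any Cloud Network Security Group rule needs to change with it.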

Step 4 - Update Telegraf Configurations

Once the Telegraf plugin instance is running, its configuration can be updated to suit the monitoring needs of the use case. Any updates will require restarting the Telegraf process.

Input Plugins

To modify input plugin settings, adjust the relevant configuration files (e.g., snmp.conf, ipsec.conf) via the Plugin Manager.

Output Plugins

To modify output plugin settings, adjust the relevant configuration sections in the telegraf.conf file via the Plugin Manager.
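
While testing configuration changes, a temporary file output can help confirm that metrics are actually being collected before they are sent anywhere. The following is a minimal sketch using Telegraf's file output; whether standard output ends up in the log shown by the Plugin Manager is an assumption to verify on your controller.

```toml
# Illustrative sketch only; remove once the pipeline is verified.
[[outputs.file]]
  # Write every collected metric to standard output
  files = ["stdout"]
  # Influx line protocol: one line per metric with tags, fields, and timestamp
  data_format = "influx"
```

Telegraf sends each metric to every configured output, so this section can sit alongside the real output without affecting it.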

Step 5 - Restart Telegraf

After making configuration changes, restart the Telegraf service to apply updates. To restart via the VNS3 Controller:

  1. Click the dropdown arrow under the “Process Manager” on the Telegraf Plugin Manager page.
  2. Click Stop.
  3. Click Start.
  4. Check the status of the process by clicking Status.
  5. Confirm there are no errors in the Telegraf log.

Extending the Plugin

Add Additional Outputs

Configure Telegraf or Prometheus to send metrics to other backends, such as Elasticsearch or Kafka.
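
As a sketch of what an additional output might look like in telegraf.conf, the fragment below keeps an existing InfluxDB output and adds a Kafka output that receives only a subset of metrics. The host names and credentials are the placeholders used earlier in this document, and the measurement name pattern "snmp*" is an assumption about how the SNMP metrics are named.

```toml
# Illustrative sketch only.
# Existing output keeps receiving every metric
[[outputs.influxdb_v2]]
  urls = ["https://influxdb-cloud-url"]
  token = "influxdb-token"
  organization = "customer-org"
  bucket = "metrics"

# Additional output that receives only measurements whose name starts with "snmp"
[[outputs.kafka]]
  brokers = ["kafka-broker1:9092"]
  topic = "telegraf-metrics"
  data_format = "json"
  namepass = ["snmp*"]
```

Without a filter such as namepass, each output receives all metrics, so filters are only needed when a backend should see a subset.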

Custom Dashboards

Use Grafana to create dashboards for logs (Loki) and metrics (Prometheus/InfluxDB/Azure Monitor).