Amazon AWS Specific Information

Please note: this section is organized as a quick-start guide. Performing the steps in order will help ensure a successful AWS deployment.

Typical Deployment

In most situations, customers go through two stages of deployment for Heimdall. The first is a single-server test deployment for POC installs and QA:

Here, Heimdall exists only on one server, and both the management server and the proxy exist together. Obviously, this results in a single point of failure, but this allows easy and fast proof of concept installs to occur. Once the idea of using Heimdall is validated, a more robust configuration is deployed:

Here, the proxy nodes are deployed into one or more Availability Zones (to match the application server AZs) with autoscaling, and the central manager exists on its own server. As the proxies operate as an independent data plane, failure of the management server is non-disruptive, so most customers choose to deploy a single management server. If a redundant management server is desired, this can be done as well, at the obvious extra cost. In the single management server model, the management server can sit behind an autoscaling group and keep its configuration files on EFS, or pre-populate the configuration from S3, as desired.

RDS Support

Heimdall provides support for all MySQL, Postgres, and SQL Server RDS types, including Aurora and Aurora for MySQL, across all actively supported versions of those engines.

Creating an IAM Role

To simplify configuration for Amazon Web Services, the system supports reading the RDS and ElastiCache configuration directly. To use this feature, configure an IAM role with the appropriate rights, as shown; the "AWS Configuration" option will then become available in the Wizard. Without the IAM credentials, a manual configuration is still possible.

The permissions are for:

  • AmazonEC2ReadOnlyAccess: To detect the local region the system is running in, and the security groups, in order to validate proxy configurations
  • CloudWatchFullAccess: When using CloudWatch monitoring, to build log groups and to report data (recommended, but not required)
  • AmazonElastiCacheReadOnlyAccess: To populate the available caches
  • AmazonRDSReadOnlyAccess: To populate database configurations and to enhance auto-reconfiguration of clusters on a configuration change or failover

For more information on IAM and IAM best practices, please see Security best practices in IAM.
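As a sketch, the role and its attached managed policies could also be created with the AWS CLI (the role and instance profile names here are hypothetical placeholders):

```shell
# Create an EC2 trust policy so the role can be attached to an instance.
ROLE=heimdall-instance-role
cat > trust.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "Service": "ec2.amazonaws.com" },
    "Action": "sts:AssumeRole"
  }]
}
EOF

aws iam create-role --role-name "$ROLE" \
    --assume-role-policy-document file://trust.json

# Attach the four managed policies listed above.
for POLICY in AmazonEC2ReadOnlyAccess CloudWatchFullAccess \
              AmazonElastiCacheReadOnlyAccess AmazonRDSReadOnlyAccess; do
  aws iam attach-role-policy --role-name "$ROLE" \
      --policy-arn "arn:aws:iam::aws:policy/$POLICY"
done

# An instance profile is what actually binds the role to an EC2 instance.
aws iam create-instance-profile --instance-profile-name "$ROLE"
aws iam add-role-to-instance-profile --instance-profile-name "$ROLE" \
    --role-name "$ROLE"
```

These commands require credentials with IAM write access; the same result can be achieved in the IAM console.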

Configuring an ElastiCache for Redis Instance

If ElastiCache will be used, it is recommended that the instance be set up before configuring Heimdall, as this allows the ElastiCache configuration to be auto-detected. When configuring ElastiCache, if redundancy is desired, ensure it is set to "clustered mode enabled" and select at least a two-node configuration. After configuring either a single or multi-node instance, it is important to configure a parameter group and set the parameter "notify-keyspace-events" to the value "AE". This allows the system to automatically track objects that are added to and removed from the cache, which helps prevent L2 cache misses. In other Redis deployments, this parameter can be set dynamically at runtime, but in ElastiCache it can only be set via the parameter group. Failing to set it will simply reduce the performance of the system when there are cache misses. The configuration should appear as below:
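As a sketch, the parameter group step above could be performed with the AWS CLI as follows (the group name, replication group ID, and engine family are placeholders; match the family to your Redis version):

```shell
# Create a custom parameter group (the default groups cannot be modified).
aws elasticache create-cache-parameter-group \
    --cache-parameter-group-name heimdall-redis-params \
    --cache-parameter-group-family redis7 \
    --description "Keyspace notifications for Heimdall"

# Set notify-keyspace-events to AE as described above.
aws elasticache modify-cache-parameter-group \
    --cache-parameter-group-name heimdall-redis-params \
    --parameter-name-values \
    "ParameterName=notify-keyspace-events,ParameterValue=AE"

# Apply the parameter group to an existing replication group.
aws elasticache modify-replication-group \
    --replication-group-id my-redis-cluster \
    --cache-parameter-group-name heimdall-redis-params \
    --apply-immediately
```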

AWS Marketplace Install

Heimdall can be easily started using the AWS Marketplace. During startup, the image will download the newest release version of Heimdall. If the security groups prevent the download from completing, an older version of Heimdall (from the instance creation time) will be used.

First, in the EC2 console, select Launch Instance, select the AWS Marketplace option on the left, and search for "Heimdall Standard Edition" or "Heimdall Enterprise Edition" (with 24/7 support), then select:

Then, continue with Heimdall:

Select the desired instance type. The marketplace offering supports a variety of appropriate instances, with larger core counts supported in the Enterprise edition. Note on sizing: we generally recommend an initial proxy core count of 1/4 that of the database behind Heimdall, so if the database has 8 cores, we would recommend starting with a 2-core proxy. If the CloudFormation template was used, these cores can be split across multiple proxies, with auto-scaling used to size from there. In general, the c5 or Graviton 2 instances are recommended. For very high cache hit rates (over 90%), the c5n instance type may be appropriate, as the network bandwidth will be saturated before the CPU is. Sizes can be adjusted once testing is complete and the load ratio is better understood.
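The sizing guideline above amounts to a quick calculation (a sketch; the 8-core figure is just the example from the text):

```shell
# Suggested starting proxy capacity = database cores / 4, minimum 1.
db_cores=8
proxy_cores=$(( db_cores / 4 ))
if [ "$proxy_cores" -lt 1 ]; then proxy_cores=1; fi
echo "suggested starting proxy cores: $proxy_cores"
```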

Continue through the screens, ensuring that the security group configuration opens the ports needed for the proxies (update the source subnets to match your local networks; as a best practice, never open ports to the internet in general):

Finally, review and launch:

Next, with the EC2 instance online, bind the IAM role created above to the EC2 instance: right-click on the instance and, under Security, select Modify IAM role. The following screen will allow you to select the role created. This enables autodetection of RDS and ElastiCache resources, and other AWS integrations.
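The same attachment can be done from the AWS CLI (the instance ID and profile name below are placeholders):

```shell
# Attach the instance profile created earlier to the running instance.
aws ec2 associate-iam-instance-profile \
    --instance-id i-0123456789abcdef0 \
    --iam-instance-profile Name=heimdall-instance-role
```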

Once the instance is online and the IAM role is configured, connect to the instance on port 8087; a login help page provides instructions for the initial login. Please configure the instance using the wizard for the best results. Nearly every manual configuration will have a fault, often resulting in support calls.

Advanced Marketplace Install

As Heimdall can run on multiple servers, an instance can accept configuration options as user-data at initialization that control how the instance runs, and whether it should operate as a proxy or server (only). When set to such a mode, the instance will attempt to tune itself for the role in question, e.g. dedicating the instance's memory to that role rather than sharing it between roles. To provide these options, the user-data should be a script that generates a file "/etc/heimdall.conf" with the following options echoed out:

  • hdRole=proxy|server
  • hdHost=hostname of management server
  • hdPort=port of the management server, generally 8087
  • vdbName=exact name of the vdb to service
  • hdUser=login username for the management server, can be admin
  • hdPassword=login password for the management server
  • javaOptions=Any arbitrary options desired to be set

Example user data script:

#!/bin/bash

# Write the configuration that Heimdall reads at startup.
(
echo "hdRole=server"
echo "hdUser=admin"
echo "hdPassword=somepassword"
) > /etc/heimdall.conf

Once initialized, this configuration can be adjusted manually if necessary. Note: if hdRole is set, the instance will automatically allocate 80% of instance memory to the process (server or proxy). This can also be tuned in /etc/heimdall.conf as needed.

These settings will effectively allow auto-scaling groups of proxies to be configured.
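For example, a user-data script for a proxy-only instance joining an existing management server might look like the following sketch (the hostname, vdb name, and credentials are placeholders):

```shell
#!/bin/bash

# Generate /etc/heimdall.conf for a proxy-only auto-scaling node.
(
echo "hdRole=proxy"
echo "hdHost=heimdall-manager.internal"
echo "hdPort=8087"
echo "vdbName=MyVDB"
echo "hdUser=admin"
echo "hdPassword=somepassword"
) > /etc/heimdall.conf
```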

Note: If building an AMI for auto-scaling that may be used by multiple scaling groups with different configurations, it is suggested that heimdall.conf be deleted after initial testing, so that the user-data is re-read on each new initialization to rebuild the configuration at startup.

Please refer to the AWS CloudFormation Template page for more information on using a template to create an autoscaling group, which automates the process of creating proxy-only instances and provides failover resiliency as well.

CloudWatch Metrics and Logs

In each VDB, under logging, an option is available for AWS CloudWatch. If this option is enabled and CloudWatch access is granted in the system's IAM role, the system will start logging a variety of metrics into CloudWatch under the "Heimdall" namespace, and the vdb logs will also be logged into CloudWatch. The metrics include:

  • DB Query Rate
  • DB Query Time
  • Avg Response Time
  • DB Read Percent
  • DB Transaction Percent
  • Cache Hit Percent

Please note that additional charges may be incurred for metrics and logging, in particular if high-volume debug logs are sent to CloudWatch.
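As a sketch, the published metrics can be inspected with the AWS CLI; the exact metric names and any dimensions should be confirmed with list-metrics first, since the names below are taken from the list above and may differ from what is actually published:

```shell
# List everything Heimdall has published under its namespace.
aws cloudwatch list-metrics --namespace Heimdall

# Pull one metric for the last hour in 5-minute buckets (GNU date syntax).
aws cloudwatch get-metric-statistics \
    --namespace Heimdall \
    --metric-name "Cache Hit Percent" \
    --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
    --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    --period 300 --statistics Average
```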

Aurora/RDS Load Balancing Behaviors

In an AWS RDS (including Aurora) environment, Heimdall takes a flexible approach to defining and configuring a cluster. When used with RDS and the proper IAM role is attached to the management server(s), the primary URL hostname defined in the data source (outside of the LB section) is used to identify a particular cluster. The cluster nodes are then probed in order to build the LB configuration. This definition is built at the initial cluster definition (if cluster tracking is enabled), then refreshed every 30 seconds during normal processing, every 1 second when a node is in a "failure" mode, and every time the data source configuration is committed.

When a cluster definition poll is done, the driver specified for the data source is used to create a new set of nodes. The nodes can be defined as "listener" (SQL Server only), "writer" for the writer node, "replica" for a read replica, or "reader", which includes both replicas and writers. The reader and writer URLs are then used to populate the nodes as appropriate based on the actual cluster state.

Please see the driver configuration page for information on each variable.

The goal of the Heimdall configuration is to bypass endpoint DNS resolution when possible, as it can slow down the failover process, while allowing the underlying drivers to automatically detect and fail over between the nodes acceptable for reader vs. writer roles. With the default configurations provided by Heimdall, sub-second failovers can be achieved in most cases, and via the templates, custom configurations can be created to adjust the behavior as desired.

For example, with the Postgres drivers, the default writerUrl template is:

jdbc:postgresql://${readers}/${database}?targetServerType=master

Here, ${readers} is used, as the Postgres driver supports an ordered list, and the writer nodes will be included first in this list. With targetServerType=master included, the driver will connect only to a master in the list, but on a failure it will fail over as quickly as possible to a new writer as soon as one is promoted. This allows the endpoint hostname resolution for Aurora to be bypassed, while providing the best possible availability.
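As a concrete illustration (the hostnames and database name are hypothetical), if a cluster currently has writer w1 and readers r1 and r2, the writerUrl template above would expand to something like:

```
jdbc:postgresql://w1.cluster.internal:5432,r1.cluster.internal:5432,r2.cluster.internal:5432/mydb?targetServerType=master
```

with the writer listed first, so the driver attempts it before any reader in the list.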

And for the reader it is:

jdbc:postgresql://${readers}/${database}?targetServerType=preferSecondary&loadBalanceHosts=true&readOnly=true

Again, in this case, the driver's capability to load balance is being used to create a single pool of read connections, which will avoid using the writer node if possible.

An alternate writer URL could be used:

jdbc:postgresql://${writer}/${database}

This wouldn't leverage the driver's built-in failover capability, but would rely on Heimdall to detect a failure and poll the AWS APIs to find the new writer.

If each reader host should have its own node entry (along with graphs on the dashboard), and the writer nodes should not be used for reading, the following alternate URL could be used:

jdbc:postgresql://${reader}/${database}?readOnly=true

This bypasses the Postgres driver's built-in LB capability, and will create a node for LB in Heimdall for each reader.

All built-in database types include a pre-defined set of reader and writer URLs except for Oracle, due to the complex nature of Oracle URLs in many environments. Older installs that predate this template function will have defaults included when updated to code that supports it.

Notes on behavior:

  • ${writeEndpoint} and ${readEndpoint} can be used for Aurora definitions, and will point to the writer (primary) and reader endpoints as appropriate. This allows use of the AWS managed DNS targets if desired. DNS resolution on these will be cached for five seconds, and due to potential update timings, this could result in a connection intended for a writer to point to a reader instead. This is a side-effect of connection pooling, DNS caching, and the update timing of the DNS entry. This behavior can occur with other applications not using Heimdall as well.
  • ${reader} will also include the nodes expanded from ${writer}, but assigns them a weight of 1, vs. 10 for any pure reader node.
  • ${readers} can be used in the writerUrl definition to expand to the entire list of writers and readers, for drivers that use an ordered list to identify what is a writer and what is a reader. Other variables can only be used in the proper context, i.e. ${writer} for a writerUrl, or ${reader} for a readerUrl.
  • ${listener} is used only for SQL Server; in the writerUrl it resolves to the listener IP of a particular Always On cluster, which maps to the writer instance in the cluster.