
This is a supplemental guide to the AWS Blog post “Using Amazon Aurora Global Database for Low Latency without Application Changes”. It provides details on how to configure the solution suggested there.

Solution Setup

 

First, we will create an Aurora Global Database with a writer and reader in region us-west-2, and additional readers in regions ap-south-1 and ap-southeast-2 (please see Appendix A for a script to build this). Then, with this cluster created, we will create a Heimdall proxy cluster in the same regions.

 

To create the Aurora Global Database in the console instead of using the script, the overall steps are as follows:

  1. Create a single regional cluster with one read/write primary server and optional Multi-AZ reader;
  2. Once initialized, create a global cluster from the regional cluster, and add a second region to it with an additional reader;
  3. Add additional regional clusters to the global cluster, as desired, each with at least one reader, up to the limit of 5 regions.

 

A video walking through these steps can be viewed here.

 

Once set up, your Aurora cluster should look like this:
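The same check can be made from the command line. The following is a minimal sketch, assuming the AWS CLI is installed and configured and that the global cluster uses the identifier `global` from Appendix A. The function name `list_global_members` is a hypothetical helper, not part of any tool.

```shell
# List each member cluster of a global database and whether it is the writer.
# Assumes the identifier "global" and home region us-west-2 from Appendix A.
list_global_members() (
  global_id="${1:-global}"
  region="${2:-us-west-2}"
  aws rds describe-global-clusters \
    --region "$region" \
    --global-cluster-identifier "$global_id" \
    --query 'GlobalClusters[0].GlobalClusterMembers[].[DBClusterArn,IsWriter]' \
    --output text
)
```

Calling `list_global_members global us-west-2` should print one line per regional cluster, with the us-west-2 cluster marked as the writer.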

 

The next step is to create a Heimdall central manager instance. This takes two steps:

  1. Create an IAM role for the instance;
  2. Start the instance from the AWS Marketplace, attaching the proper role to the instance.

These steps are documented in Appendix B.

 

Configuring Heimdall

 

Once the Heimdall Central Manager is online, connect to it via HTTP on port 8087 or HTTPS via port 8443.  Once logged in, use the Configuration Wizard:

Select AWS Detect for the proxy to locate your AWS services (e.g. Amazon RDS, ElastiCache, CloudWatch). If the AWS Detect option is not shown, please refresh the screen:

Next, select the Global Cluster Name (regional clusters will be listed as well):

Select Next and ensure the database host name and other fields are correct.  Note, if the Central Manager is not in the same region as the global writer, some fields may be empty or have the incorrect values.  You can simply override the incorrect fields to continue:

Select Next and ensure the Track Cluster changes option is selected; this enables the auto-discovery and reconfiguration of the cluster on a configuration change:

Select Next. For Caching, select the Redis ElastiCache option if you plan to use caching with ElastiCache, or select Local Only if you do not have ElastiCache set up. Local Only uses the local cache on the Heimdall Proxy (EC2) instance:

Click through the remaining options and, on the Summary page, confirm all the details are correct. Click Next and then Submit. Note: When a global database is configured with the wizard, the wizard sets the options a global database needs, so that it functions properly by default.

Once the Configuration is complete, navigate to the Status Page; you will be able to see all the Aurora Instances shown here:

As part of the Central Manager, a default proxy is initialized, which can be used to validate an application against the proxy. In general, however, it is recommended that the proxy be configured in a redundant manner and, in the case of a global cluster, in multiple regions as well. It is recommended to initialize a proxy in each region that has readers, although proxies can be initialized in additional regions as well.

In order to automate the initialization of proxy clusters, a CloudFormation script is available at https://s3.amazonaws.com/s3.heimdalldata.com/templates/heimdall-proxy-template.yaml.  This can be used in the CloudFormation create stack UI:

On the next page, complete the parameters. Take care to enter all the information correctly:
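Before filling in the form, it can help to list the parameters the template expects. The sketch below assumes the AWS CLI is configured and that the template URL given above is readable from your account. `list_template_parameters` is a hypothetical helper name. The parameter names themselves come from the template and are not reproduced here.

```shell
# Print each template parameter with its default value and description.
TEMPLATE_URL="https://s3.amazonaws.com/s3.heimdalldata.com/templates/heimdall-proxy-template.yaml"

list_template_parameters() (
  aws cloudformation get-template-summary \
    --template-url "${1:-$TEMPLATE_URL}" \
    --query 'Parameters[].[ParameterKey,DefaultValue,Description]' \
    --output text
)
```

Running `list_template_parameters` with no arguments queries the Heimdall template. Pass a different URL as the first argument to inspect another template.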

Note:  If the LB type is set to nlb, then the Route53 configuration will be optional, but will provide a name that can route directly to the database if the proxy cluster fails.

In the Network configuration, make sure the subnets match the proxy availability zones selected.

Here, the alias points to the product alias for the Heimdall Enterprise ARM Edition. If the account is not subscribed to this product, the template fails only after a very long wait. To prevent this, go to the Heimdall Enterprise ARM page and subscribe to the product before creating the stack: https://aws.amazon.com/marketplace/pp/prodview-6cbfhwntxwmmw

The subscription only needs to be done once, and it applies across AWS regions.

The settings below reference the configuration of the Heimdall Central Manager and the vdb configuration. Make sure everything matches the vdb configuration, or the proxies will not be able to initialize properly.

Click through and ensure you acknowledge the creation of IAM resources:

Create the stack and wait until complete:

Repeat the steps for each region that you wish to have proxies operate from, and verify that all the target groups are showing as healthy in all three regions:
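Target health can also be checked from the command line. This is a rough sketch, assuming the AWS CLI is configured. It reports on every target group in each region it is given, not only those created by the Heimdall template, and `check_proxy_targets` is a hypothetical helper name.

```shell
# Print "region target-id state" for every registered target in every
# target group of each region passed as an argument.
check_proxy_targets() (
  for region in "$@"; do
    for tg in $(aws elbv2 describe-target-groups --region "$region" \
                  --query 'TargetGroups[].TargetGroupArn' --output text); do
      aws elbv2 describe-target-health --region "$region" \
        --target-group-arn "$tg" \
        --query 'TargetHealthDescriptions[].[Target.Id,TargetHealth.State]' \
        --output text |
        while read -r id state; do
          echo "$region $id $state"
        done
    done
  done
)
```

For the regions in this guide, the call would be `check_proxy_targets us-west-2 ap-south-1 ap-southeast-2`. Every line should end in `healthy`.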

On the Status Tab, all the instances should be showing in the central manager:

You can now navigate to the Dashboard tab to view metrics for the queries being executed against the Heimdall Proxy:

 

Aggregate and per-proxy statistics can also be viewed on the dashboard:

Appendix A: Script to Create Example Global RDS Cluster

 

aws rds create-db-cluster --region us-west-2 \
 --db-cluster-identifier us-west-2 \
 --engine aurora-postgresql --engine-version 13.3 \
 --vpc-security-group-ids sg-0f60f421ce5a8906c \
 --master-username postgres --master-user-password Heimdalltest1

 

aws rds create-db-instance --region us-west-2 \
 --db-instance-class db.r6g.large --availability-zone us-west-2a \
 --db-cluster-identifier us-west-2 \
 --db-instance-identifier us-west-2a \
 --engine aurora-postgresql --publicly-accessible

 

aws rds create-db-instance --region us-west-2 \
 --db-instance-class db.r6g.large --availability-zone us-west-2b \
 --db-cluster-identifier us-west-2 \
 --db-instance-identifier us-west-2b \
 --engine aurora-postgresql --publicly-accessible

 

aws rds create-global-cluster --region us-west-2 \
 --global-cluster-identifier global \
 --source-db-cluster-identifier \
 arn:aws:rds:us-west-2:272965818115:cluster:us-west-2

 

aws rds create-db-cluster --region ap-south-1 \
 --db-cluster-identifier ap-south-1 \
 --global-cluster-identifier global \
 --engine aurora-postgresql --engine-version 13.3 \
 --vpc-security-group-ids sg-06550a2c7bbed8a96

 

aws rds create-db-instance --region ap-south-1 \
 --db-instance-class db.r6g.large --availability-zone ap-south-1a \
 --db-cluster-identifier ap-south-1 \
 --db-instance-identifier ap-south-1a \
 --engine aurora-postgresql --publicly-accessible

 

aws rds create-db-cluster --region ap-southeast-2 \
 --db-cluster-identifier ap-southeast-2 \
 --global-cluster-identifier global \
 --engine aurora-postgresql --engine-version 13.3 \
 --vpc-security-group-ids sg-0173d13ac20565517

 

aws rds create-db-instance --region ap-southeast-2 \
 --db-instance-class db.r6g.large --availability-zone ap-southeast-2a \
 --db-cluster-identifier ap-southeast-2 \
 --db-instance-identifier ap-southeast-2a \
 --engine aurora-postgresql --publicly-accessible
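
The create commands return before the resources are ready, so a script that runs them back to back may need to pause between steps. The sketch below wraps the AWS CLI waiter for DB instances. It assumes the AWS CLI is configured, and `wait_for_instance` is a hypothetical helper name.

```shell
# Block until the named DB instance reports the "available" status.
# The waiter polls periodically and returns non-zero if it times out.
wait_for_instance() (
  region="$1"
  instance_id="$2"
  aws rds wait db-instance-available \
    --region "$region" \
    --db-instance-identifier "$instance_id"
)
```

For example, `wait_for_instance us-west-2 us-west-2a` could be run before the create-global-cluster step, using the instance identifier from the script above.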

Appendix B:  Installing Heimdall from the AWS Marketplace

 

The first step in creating a Heimdall proxy cluster is to create a management server. To do this, first create an IAM role with the following minimum access privileges (this allows autodetection of the RDS and ElastiCache configurations):

 

AmazonEC2ReadOnlyAccess

AmazonElastiCacheReadOnlyAccess

AmazonRDSReadOnlyAccess
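
For those who prefer the command line, the role can be sketched as follows. This assumes the AWS CLI is configured with permission to manage IAM. The role name `heimdall-manager-role` and the function name are hypothetical and chosen only for illustration. The three policy ARNs are the AWS managed policies listed above.

```shell
# Create an EC2 role with the three read-only managed policies, plus the
# instance profile that is attached to the instance at launch.
create_heimdall_role() (
  role_name="${1:-heimdall-manager-role}"   # hypothetical name
  trust='{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"ec2.amazonaws.com"},"Action":"sts:AssumeRole"}]}'

  aws iam create-role --role-name "$role_name" \
    --assume-role-policy-document "$trust" || exit 1

  for policy in AmazonEC2ReadOnlyAccess AmazonElastiCacheReadOnlyAccess AmazonRDSReadOnlyAccess; do
    aws iam attach-role-policy --role-name "$role_name" \
      --policy-arn "arn:aws:iam::aws:policy/$policy" || exit 1
  done

  aws iam create-instance-profile --instance-profile-name "$role_name" || exit 1
  aws iam add-role-to-instance-profile --instance-profile-name "$role_name" \
    --role-name "$role_name"
)
```

When the role is created in the console instead, the instance profile is created automatically with the same name as the role.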

 

Next, in the EC2 console, launch an EC2 instance, and on the first page, select AWS Marketplace, and search for “Heimdall Proxy Standard Edition”:

Note:  For the proxy clusters we will create later, we will use the Enterprise Edition to enable support.

 

Click through the pages, selecting an instance size. Most customers will want to use a t3.medium for a management server or for a POC.

 

In section 3, configuring instance details, make sure to select the IAM role created earlier, and place the instance in the appropriate subnet and AZ:

Next, on page 6, configure the appropriate security group. You can remove the unnecessary ports and configure more restrictive access as needed. Port 8087 is for HTTP access, and port 8443 is for HTTPS:

Finally, review and launch. Once started, you should be able to access the UI via a browser on the instance’s IP, on HTTP port 8087 or HTTPS port 8443.

The default user is “admin” and the password is the instance ID of the management server.

Solution Overview

 

The solution includes the following components:

 

  1. Aurora Global Database
  2. Heimdall Proxy and Central Manager
  3. AWS Network Load Balancer (NLB): The NLB is used for high-volume traffic to direct DNS queries to the proxies, bypassing NLB traffic costs.
  4. Amazon Route53 routes a single name to the closest available proxy and health checks the proxies.

 

This overview follows the traffic flow of an application accessing the database, from the initial DNS request through to the actual result set being delivered to the application.

 

The first step is the DNS query, when the application first attempts to connect to the database.  The application will connect to a Route53 configured name, such as global-vdb-geonlb.test.heimdalldata.com:

Route53 uses an internal lookup to determine the IP addresses that the latency-routed records point to, which has three potential targets. These names resolve into the us-west-2, ap-south-1 and ap-southeast-2 regions. Route53 then uses the source IP of the DNS query to determine which healthy region is closest, and returns the closest NLB:

In this setup, the failover records are not explicitly used, but they are necessary to “glue” the latency records to the NLB records for proper operation.
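
To see how the answer varies by location, the name can be resolved through several DNS resolvers. This is a small sketch, assuming the `dig` utility is installed. The resolver addresses passed in are whatever resolvers you have access to in each location, and `resolve_from` is a hypothetical helper name.

```shell
# Resolve a name through each resolver given and print the answers on one line.
resolve_from() (
  name="$1"
  shift
  for resolver in "$@"; do
    printf '%s -> ' "$resolver"
    dig +short "$name" "@$resolver" | tr '\n' ' '
    echo
  done
)
```

For example, `resolve_from global-vdb-geonlb.test.heimdalldata.com 8.8.8.8 1.1.1.1` prints the addresses each resolver receives. Resolvers close to different regions would be expected to receive different NLB addresses.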

 

Once the NLB IP address is returned to the application, the application will open a connection to the NLB IP address.  The NLB+target group will then use the health and state of the proxies to pick a proxy to send the connection to:

The proxy authenticates the user’s connection and receives the request.  At this point, what happens will depend on the configuration of the VDB and the data source.  First, if caching is enabled, then the result may be returned from the cache:

Second, if read/write split is enabled, and a query isn’t cached, it can be routed to a connection to a reader node:

When a connection is made to the servers, Heimdall also provides connection pooling.  The state of the pools can be viewed with the command “show pools” through the proxy itself:

# show pools;
       URL or catalog:user        | busy | connecting | idle | wait count | created | borrowed | borrow time(us) | returned | closed | abandoned | err
----------------------------------+------+------------+------+------------+---------+----------+-----------------+----------+--------+-----------+----
 172.31.54.43:5433/               | 26/0 | 0          | 3    | 0          | 29      | 169      | 1186            | 143      | 0      | 0         |
    postgres:odoo                 | 0/0  | 0          | 1/0  | 0          | 1       | 35       | na              | 35       | 0      | 0         | 0
    NA:postgres                   | 0/0  | 0          | 2/0  | 0          | 2       | 108      | na              | 108      | 0      | 0         | 0
    odoo:odoo                     | 26/0 | 0          | 0/0  | 0          | 26      | 26       | na              | 0        | 0      | 0         | 0
 172.31.54.43:5433/?readOnly=true | 10/0 | 0          | 1    | 0          | 11      | 45       | 1566            | 35       | 0      | 0         |
    postgres:odoo                 | 0/0  | 0          | 1/0  | 0          | 1       | 35       | na              | 35       | 0      | 0         | 0
    odoo:odoo                     | 10/0 | 0          | 0/0  | 0          | 10      | 10       | na              | 0        | 0      | 0         | 0
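
The command can also be issued non-interactively. The sketch below assumes the `psql` client is installed and that the proxy accepts PostgreSQL connections on the host and port given. The default port, user and database are assumptions for illustration, and `show_pools` is a hypothetical helper name.

```shell
# Run "show pools;" through the proxy and print the result.
show_pools() (
  proxy_host="$1"
  proxy_port="${2:-5432}"   # assumed default; use your proxy's listen port
  db_user="${3:-postgres}"
  db_name="${4:-postgres}"
  psql -h "$proxy_host" -p "$proxy_port" -U "$db_user" -d "$db_name" \
    -c 'show pools;'
)
```

Pointing `show_pools` at the Route53 name rather than a single proxy would show the pools of whichever proxy the NLB selects for that connection.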

 

Pooling locally is of particular importance to global applications, as connection establishment can take several round trips for authentication and other setup. Performing this locally and using pooled connections reduces this overhead dramatically. With PHP applications that open and close connections per page, the penalty of opening and closing connections alone can make an application unusable.

 

When choosing which data source to pool to, if the load balancing option “Use Response Metrics” is enabled, the per-proxy response metrics used to render the monitor response time graph are also used to determine which server to route traffic to. Note that in this graph, all but one site’s proxies are disabled, because the graph normally shows an aggregated response time view of all sites. This helps illustrate the performance difference of the nodes at each site. The local site’s response time is nearly indistinguishable from 0 in this case, so it would clearly be used for reads:

The selection logic works as follows:

 

  1. Find the server that has recently been responding the fastest;
  2. Find all other servers that are performing reasonably close to the fastest server;
  3. Pick from the servers selected.
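
The three steps can be illustrated with a short sketch. This is not Heimdall's implementation. It assumes that "reasonably close" means within a fixed number of milliseconds of the fastest server (the `slack` argument, an invented parameter) and that the final pick is uniformly random. Both function names are hypothetical.

```shell
# stdin: one "server response_time_ms" pair per line.
# stdout: servers within SLACK ms of the fastest one, in input order.
candidate_servers() (
  slack="${1:-5}"   # assumption: tolerance in milliseconds
  awk -v slack="$slack" '
    NF >= 2 {
      srv[NR] = $1; rt[NR] = $2 + 0
      if (!seen || rt[NR] < best) { best = rt[NR]; seen = 1 }   # step 1
    }
    END {
      for (i = 1; i <= NR; i++)
        if ((i in srv) && rt[i] <= best + slack) print srv[i]   # step 2
    }'
)

# stdin: same as above. stdout: one randomly chosen candidate (step 3).
pick_server() (
  candidates="$(candidate_servers "$@")"
  [ -n "$candidates" ] || exit 1
  count="$(printf '%s\n' "$candidates" | wc -l)"
  index="$(awk -v n="$count" 'BEGIN { srand(); print int(rand() * n) + 1 }')"
  printf '%s\n' "$candidates" | sed -n "${index}p"
)
```

Given readers at 1.2 ms, 1.9 ms and 210 ms with a slack of 5 ms, only the first two are candidates, so the remote reader never receives traffic while the local ones are healthy.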

 

As such, if you have two readers in one AZ, traffic will tend to be load balanced across both, since their response times will be similar. If there is a third reader in a remote region, however, its response time will be significantly slower and it will be eliminated from the list of candidates. This evaluation happens every second, allowing rapid changes based on changing conditions. For example, if an international link goes down, traffic can be rerouted quickly.

 

If we detect that a write has happened inside the safe replication lag window, then instead of reading from a local reader, we will read from the global primary writer.  While this may be slower, it isn’t as slow as waiting for the query to be safe against the local reader.
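
The decision can be sketched as a simple time comparison. This is an illustration only, not Heimdall's implementation. It assumes the proxy knows the time of the last write and a replication lag window, both in milliseconds. The function name and arguments are hypothetical.

```shell
# Decide where a read should go.
#   $1 current time (ms), $2 time of last write (ms), $3 lag window (ms)
route_read() (
  now_ms="$1"
  last_write_ms="$2"
  lag_window_ms="$3"
  if [ $((now_ms - last_write_ms)) -lt "$lag_window_ms" ]; then
    echo writer   # the write may not have replicated to the local reader yet
  else
    echo reader   # the local reader is assumed to have caught up
  fi
)
```

With a 500 ms window, a read issued 200 ms after a write goes to the writer, while a read issued 1000 ms after it goes to the local reader.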

 
