Implementing Amazon Aurora Global Databases Without Application Changes
Applications with a global footprint face several challenges, particularly when the application edge accesses databases directly to build custom pages for end users, as with applications using AWS Lambda@Edge. The two main challenges are availability and performance.
Amazon Aurora Global Database is designed to solve the technical challenges of these global applications by allowing a single Amazon Aurora database cluster to span multiple AWS Regions. It asynchronously replicates your data with typically sub-second lag, which enables fast, low-latency local reads in each Region and provides disaster recovery from Region-wide outages through cross-Region writer failover. These capabilities minimize the Recovery Time Objective (RTO) after a cluster failure, while the sub-second replication lag keeps data loss small, helping you meet an aggressive Recovery Point Objective (RPO).
There are, however, considerations to be aware of when implementing an Aurora Global Database. The most important is that most applications are designed to connect to a single hostname, with the database "just working" with ACID consistency across connections and application nodes. With a global database, you have reader endpoints in each Region, and in the primary Region you have two endpoints: one for writes and one for reads. As currently implemented, it is the job of the application owner to access the proper endpoints optimally and, on a failover, to switch to the new writer endpoint. A resolution for this second point is detailed in Automate Amazon Aurora Global Database endpoint management on the AWS Database Blog, which uses a combination of AWS services to automatically update an Amazon Route 53 CNAME record on failover, so the application's writer endpoint changes automatically. That solution, however, does not address the fact that the application must also be intelligent enough to know when it is safe to use a local reader (or where the closest reader is), or when consistency requirements mean the remote writer should be used for reads as well. Most applications have none of this intelligence, and would therefore require additional development to take advantage of Aurora Global Database.
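To make the endpoint problem concrete, the following is a minimal sketch (not Heimdall's implementation) of the routing logic an application would otherwise have to build itself: send writes to the primary Region's writer endpoint, and send reads to a local reader unless the query needs read-after-write consistency. The endpoint hostnames are hypothetical placeholders.

```python
import re

# Hypothetical endpoints -- substitute your cluster's actual DNS names.
WRITER_ENDPOINT = "global-db.cluster-example.ap-southeast-2.rds.amazonaws.com"
LOCAL_READER_ENDPOINT = "global-db.cluster-ro-example.us-west-2.rds.amazonaws.com"

# Statements that must go to the writer; everything else may use a reader.
_WRITE_PATTERN = re.compile(
    r"^\s*(INSERT|UPDATE|DELETE|CREATE|ALTER|DROP|TRUNCATE|GRANT)\b",
    re.IGNORECASE,
)

def pick_endpoint(sql: str, require_fresh_reads: bool = False) -> str:
    """Return the endpoint a statement should be sent to.

    require_fresh_reads forces reads to the writer when the application
    cannot tolerate cross-Region replication lag for this query.
    """
    if _WRITE_PATTERN.match(sql) or require_fresh_reads:
        return WRITER_ENDPOINT
    return LOCAL_READER_ENDPOINT
```

Every application node would need this logic (plus failover handling), which is exactly the per-application development the Heimdall proxy removes.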
To resolve these challenges, Heimdall Data has introduced a set of functionality which, in conjunction with the features of Amazon Route 53, bridges the implementation gap, allowing the global database nodes to be used optimally and intelligently while also providing database caching at the application edge. As a result, global and edge-based applications can access the global database seamlessly and optimally, without rewriting application behavior.
This post details how the Heimdall Data solution works and provides the steps needed to implement the entire solution.
First, we will create an Aurora Global Database, with the primary (writer) cluster in ap-southeast-2 and secondary (reader) clusters in ap-south-1 and us-west-2. With this cluster created, we will then deploy a Heimdall proxy cluster in the same Regions.
An Aurora Global Database consists of a primary Aurora cluster in one Region and secondary Aurora clusters in one or more additional Regions. Aurora Global Database uses dedicated infrastructure in Aurora's purpose-built storage layer to handle replication across Regions. Using Heimdall's global load balancing feature with Aurora Global Database allows applications residing anywhere in the world to connect to the reader instance closest to the application using latency-based routing.
In traditional database environments, applications connect to specific read/write database instances. The Heimdall proxy is a transparent data access layer that intelligently routes queries to the most optimal data source, resulting in SQL offload and improved response times. The Heimdall proxy leverages Amazon ElastiCache for Redis to cache SQL results and to track SQL queries so they are routed to the appropriate database node for fresh data.
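The core idea behind result caching with write-driven invalidation can be illustrated with a small in-process sketch. This is a toy stand-in for the proxy's Redis-backed cache, not Heimdall's actual code: results are cached per query key and evicted when a write touches one of the tables they came from.

```python
import time

class QueryCache:
    """Toy model of a SQL result cache with table-based invalidation:
    SELECT results are stored under a query key and evicted whenever a
    write statement touches one of the tables the result depends on."""

    def __init__(self, ttl_seconds: float = 30.0):
        self.ttl = ttl_seconds
        self._store = {}      # query key -> (expires_at, rows)
        self._by_table = {}   # table name -> set of query keys to evict

    def get(self, key):
        entry = self._store.get(key)
        if entry and entry[0] > time.monotonic():
            return entry[1]
        return None  # miss or expired

    def put(self, key, rows, tables):
        self._store[key] = (time.monotonic() + self.ttl, rows)
        for table in tables:
            self._by_table.setdefault(table, set()).add(key)

    def invalidate(self, table):
        """Called when a write to `table` passes through the proxy."""
        for key in self._by_table.pop(table, ()):
            self._store.pop(key, None)
```

In the real deployment, ElastiCache for Redis plays the role of `_store`, so the cached results and invalidations are shared across all proxy nodes rather than held per process.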
In this architecture, you can deploy Aurora Global Database clusters in multiple Regions (up to 5 Regions are supported), with at least one instance in each Region to ensure resiliency in case of a Region failure. Heimdall proxy instances are deployed in each of those Regions as well for high availability. (Within a Region, the Heimdall proxy can also be deployed behind a load balancer for multi-Availability Zone resiliency.)
An IAM role is attached to the Heimdall proxy host instance, and the proxy configuration includes the Aurora global cluster ARN. This allows Heimdall to track the available readers and, in case of a failover, which instance and Region holds the writer.
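Conceptually, tracking the topology from the global cluster ARN amounts to inspecting the membership list that `rds.describe_global_clusters()` returns via boto3. The sketch below works on one entry of that response; the sample ARNs in the usage are hypothetical placeholders, and the `GlobalClusterMembers`/`IsWriter` field names are taken from the boto3 RDS API as I understand it.

```python
def find_writer_and_readers(global_cluster: dict):
    """Given one entry from rds.describe_global_clusters()["GlobalClusters"],
    return (writer_cluster_arn, [reader_cluster_arns]).

    After a failover, re-running this against a fresh describe call picks
    up the new writer automatically."""
    writer = None
    readers = []
    for member in global_cluster.get("GlobalClusterMembers", []):
        if member.get("IsWriter"):
            writer = member["DBClusterArn"]
        else:
            readers.append(member["DBClusterArn"])
    return writer, readers
```

The attached IAM role only needs read permissions such as `rds:DescribeGlobalClusters` for this kind of discovery.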
The following diagram shows the Heimdall proxy with an Aurora Global Database deployed in multiple Regions:
- An application connected to the Heimdall proxy in the primary Region routes write connections to the Aurora writer instance and read queries to the reader instance that responds with the least latency. (This may be an in-Region reader or a reader in a secondary Region.)
- Applications connected to the Heimdall proxy in the secondary Region(s) route write connections to the Aurora writer instance in the primary Region and read queries to the reader instance that responds with the least latency. (Again, this may be an in-Region reader or a reader in another secondary Region.) If the reader instances or the Aurora cluster in a secondary Region become unavailable, read connections are routed to the Aurora reader that responds with the least latency.
- Built-in Heimdall proxy features such as automatic caching and query routing (read/write split) remain available to applications in the secondary Regions, and the proxy routes connections automatically.
- With this architecture, up to 5 Aurora Global Database Regions can be configured, with up to 15 reader instances in the primary Region and up to 16 reader instances in each secondary Region. The Heimdall proxy automatically detects configuration changes; if auto scaling of readers is configured in the primary Region, Heimdall dynamically detects the addition or removal of reader instances and routes connections accordingly. Note: auto scaling of replicas is not supported for secondary DB clusters in an Aurora Global Database.
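The latency-based reader selection described above can be sketched as follows. This is an illustration of the idea rather than Heimdall's implementation: time a lightweight probe (for example, a TCP connect or `SELECT 1`) against each candidate reader and pick the fastest. The `probe` callable and endpoint names here are hypothetical.

```python
import time

def measure_latency(endpoint: str, probe) -> float:
    """Time one lightweight probe against an endpoint, in seconds.
    `probe` is any callable that contacts the endpoint, e.g. a TCP
    connect or a SELECT 1 round trip."""
    start = time.monotonic()
    probe(endpoint)
    return time.monotonic() - start

def closest_reader(endpoints, probe) -> str:
    """Pick the reader endpoint that answers the probe fastest."""
    return min(endpoints, key=lambda e: measure_latency(e, probe))
```

A production version would probe periodically in the background and smooth over several samples, so a single slow measurement does not flip routing.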
With an Aurora Global Database, there are two approaches to failover:
- Managed planned failover – To relocate your primary DB cluster to one of the secondary Regions in your Aurora global database, see Managed planned failovers with Amazon Aurora Global Database. With this feature, RPO is 0 (no data loss), because Aurora synchronizes the secondary DB clusters with the primary before making any other changes. The RTO for this automated process is typically less than that of a manual failover.
- Manual unplanned failover – To recover from an unplanned outage, you can manually perform a cross-Region failover to one of the secondaries in your Aurora global database. The RTO for this manual process depends on how quickly you can manually recover an Aurora global database from an unplanned outage. The RPO is typically measured in seconds, but this depends on the Aurora storage replication lag across the network at the time of the failure.
Because the Heimdall proxy automatically detects RDS and Aurora configuration changes based on the ARN of the Aurora global cluster, both managed planned and manual unplanned failovers are supported in this architecture.
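For reference, a managed planned failover is driven by a single API call against the global cluster. The sketch below shows the pure selection step (choosing which secondary cluster to promote from the `describe_global_clusters` membership), with the actual promotion call noted in a comment; the ARNs used in the usage are hypothetical placeholders, and the API names reflect the boto3 RDS client as I understand it.

```python
def pick_failover_target(global_cluster: dict, preferred_region: str):
    """Choose a secondary cluster ARN in preferred_region to promote.

    The promotion itself would then be, for a managed planned failover:
        rds.failover_global_cluster(
            GlobalClusterIdentifier="my-global-cluster",   # hypothetical name
            TargetDbClusterIdentifier=target_arn,
        )
    which Heimdall's cluster tracking picks up once the writer moves."""
    for member in global_cluster.get("GlobalClusterMembers", []):
        arn = member.get("DBClusterArn", "")
        # Cluster ARNs look like arn:aws:rds:<region>:<account>:cluster:<name>
        if not member.get("IsWriter") and arn.split(":")[3] == preferred_region:
            return arn
    return None  # no promotable secondary in that Region
```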
Steps to create the setup described above:
Create an Aurora Global Database, or add a global database to a pre-existing Aurora cluster.
The primary cluster will be created in the ap-southeast-2 Region, with secondary clusters in ap-south-1 and us-west-2.
- On the Amazon RDS console, choose Databases.
- Choose Create database:
- Choose Amazon Aurora, with either Amazon Aurora MySQL compatibility or Amazon Aurora PostgreSQL compatibility.
- Enable the radio button for "Show versions that support the global database feature". This ensures you select an Aurora MySQL or Aurora PostgreSQL version that supports global databases.
Follow the steps to create the Aurora cluster: https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Aurora.CreateInstance.html
Once the primary Region cluster is created:
- On the source cluster, choose Actions, then Add region.
- On the Add a region page, enter a name for your global database identifier.
If you need to add multiple Regions, repeat the Add region step above for each Region needed.
Once the cluster is created, create Heimdall proxy instances as needed. Note that you do not need to create Heimdall proxy instances in every Region where you have created global clusters. You can also create Heimdall proxy instances in Regions where you do not have Aurora global instances; these will simply route connections from the application to the nearest Aurora Global Database instance.
Erik to add steps for:
- Creation of the Heimdall proxy
- Configuration of the proxy
Once the Aurora Global Database and the Heimdall proxy instances are running:
Go to the primary Region Heimdall proxy instance and enter its IP address in a browser to access the console.
Go to the Configure section and select Wizard.
Select AWS Detect.
Select the global cluster name.
Select Next and ensure the database host name and other details are correct.
Select Next and ensure the Track Cluster Changes option is selected.
Select Next. For caching, select the ElastiCache for Redis option if you plan to use caching, or select Local Only if you do not have ElastiCache set up; this will use a local cache on the Heimdall proxy (EC2) instance.
Continue through the remaining options, and on the summary page confirm all the details are correct. Click Next, then Submit.
Once the configuration is complete, navigate to the Status page; all of the Aurora instances are shown here.
Navigate to the Dashboard page to view metrics for queries being executed through the Heimdall proxy.