Application developers must balance time between coding features and improving performance and scalability. Most believe that focusing on functionality and features is the priority, but features and application response times go hand-in-hand. Slow performance can adversely affect revenue.
I am the director of IT operations at Questis, a configurable technology platform for financial intermediaries. We realized our backend infrastructure required additional development to improve performance, provide high availability, and scale rapidly to meet the needs of Questis customers accessing the platform.
Questis allows organizations to deliver effective, white-labeled financial wellness programs to their clients, and we had a tough choice to make—develop the improvements internally or buy a third-party solution. We estimated it would take more than a year to complete the project ourselves, with additional ongoing maintenance each year going forward.
We chose to team up with Heimdall Data, an AWS Partner Network (APN) Advanced Technology Partner with the AWS Data & Analytics Competency. Heimdall Data offers a transparent database proxy, giving developers SQL visibility and control to improve backend performance and scalability.
In this post, I will show how Heimdall Data reduced the development cycle at Questis by months while meeting our dynamic traffic demands during peak load times.
At Questis, our development strategy was first to develop industry-differentiating functionality to get the product to market. Afterwards, we would focus on improving application performance, but in anticipation of large customer acquisitions we had to have a robust Amazon Web Services (AWS) architecture to meet future demand.
We developed our own cloud-native application on AWS and Amazon Aurora PostgreSQL for several reasons:
- Excellent sub-second replication times
- Auto scaling of database storage
Our deployment had three Amazon Elastic Compute Cloud (Amazon EC2) application instances running on Amazon Elastic Container Service (Amazon ECS) and connected to two Amazon Aurora backend servers (one write, one read replica), and an Amazon ElastiCache for Redis cluster.
Figure 1 – Heimdall Data for Questis architecture diagram.
As a database proxy, Heimdall Data provided a number of features that benefited our backend infrastructure for Aurora.
Better Use of Read Replicas
To horizontally scale Amazon Aurora, users must modify the application as Aurora instances are added. Heimdall Data removed the need for code changes by routing queries to the write servers or read replicas. We needed only to create routing rules on the Heimdall Central Console. This allowed write servers to process “expensive” queries, while the replicas serviced the read queries.
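As a rough illustration of the idea only (this is not Heimdall's rule syntax or implementation), a read/write split can be pictured as a regular-expression check on each statement. The patterns and the `route` function below are hypothetical names chosen for this sketch.

```python
import re

# Hypothetical "reader eligible" rule: statements that start with SELECT.
READER_ELIGIBLE = re.compile(r"^\s*select\b", re.IGNORECASE)
# SELECT ... FOR UPDATE takes row locks, so this sketch keeps it on the writer.
LOCKING_READ = re.compile(r"\bfor\s+update\b", re.IGNORECASE)


def route(sql: str) -> str:
    """Return "reader" or "writer" for a single SQL statement."""
    if READER_ELIGIBLE.search(sql) and not LOCKING_READ.search(sql):
        return "reader"
    # Anything that is not clearly a plain read goes to the write server.
    return "writer"
```

Defaulting to the writer for anything the rule does not match is the conservative choice here: a misrouted read costs some capacity, while a misrouted write would fail on a read-only replica.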
Let’s walk through an example of how read/write splits are configured on the Heimdall Central Console. First, you must ensure that a read/write master is configured along with at least one read-only server.
Figure 2 – Data Sources tab in the Heimdall Central Console.
Next, configure a Reader Eligible rule to specify what’s allowed to be routed to the read server.
Figure 3 – Rules tab in the Heimdall Central Console.
Another unique Heimdall feature for read/write splits is replication lag detection. There's a delay between when data is written to Aurora and when other Aurora instances receive the updates. The Heimdall Database Proxy calculates the data replication time, and if a read query falls within the lag time window, Heimdall automatically routes the query to the server with fresh data. For financial transactions, this feature is important for data integrity purposes.
In the example above, the replication lag window was set to 10 seconds. If a table was last written 20 seconds ago, the write falls outside the window and it is safe to use the read replica. If, however, the last write was 5 seconds ago, Heimdall Data routes the query to the write master node to complete the read operation, ensuring that a stale response is not returned.
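The 10-second example can be sketched in a few lines. This is a simplified model under stated assumptions, not Heimdall's implementation: the `LagAwareRouter` class, its method names, and the per-table tracking are all hypothetical, and the clock is injectable so the behavior can be checked without waiting.

```python
import time


class LagAwareRouter:
    """Route reads to the replica only when the table's last write is older
    than the replication lag window (illustrative model only)."""

    def __init__(self, lag_window_s: float, clock=time.monotonic):
        self.lag_window_s = lag_window_s
        self.clock = clock
        self.last_write = {}  # table name -> time of the last observed write

    def record_write(self, table: str) -> None:
        self.last_write[table] = self.clock()

    def route_read(self, table: str) -> str:
        written_at = self.last_write.get(table)
        if written_at is None:
            return "reader"  # no write observed, so the replica is assumed fresh
        age = self.clock() - written_at
        # Inside the window the replica may still be stale: use the writer.
        return "reader" if age > self.lag_window_s else "writer"
```

With a 10-second window, a read issued 5 seconds after a write to the same table goes to the writer, and a read issued 20 seconds after goes to the replica.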
The value of the detected replication lag can be found on the status tab, as shown in Figure 4.
Figure 4 – Status tab in the Heimdall Central Console.
For further protection, Heimdall has a fixed lag window value, which is configured statically. This allows the replication lag to spike in the short term without impacting the freshness of the data being returned. Further control over the read/write split can be applied at the rule level based on matching regular expressions, including bypassing the replication lag logic for particular queries and always using the read server.
When replication lag is not a concern and eventual consistency can be tolerated, you can create a read/write split rule for particular queries that should unconditionally be read from the read server.
Questis chose Amazon ElastiCache for Redis as the SQL results cache to offload traffic. The benefits include improved application response times and better Aurora scalability, as less traffic is processed in the backend.
The Heimdall Database Proxy manages the interface to ElastiCache by determining which queries are cached and automating cache invalidation to ensure fresh data.
Figure 5 – Heimdall Data distributed caching deployment.
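To make the caching-plus-invalidation idea concrete, here is a minimal sketch that uses an in-memory dictionary as a stand-in for ElastiCache. It assumes table-level invalidation (a write to a table evicts every cached result that read from it); the `QueryCache` class and its methods are hypothetical and do not describe how Heimdall decides what to cache or invalidate.

```python
class QueryCache:
    """Tiny SQL result cache with table-level invalidation (illustrative only)."""

    def __init__(self):
        self.results = {}   # SQL text -> cached rows
        self.by_table = {}  # table name -> set of cached SQL texts that read it

    def get(self, sql):
        """Return cached rows, or None on a cache miss."""
        return self.results.get(sql)

    def put(self, sql, tables, rows):
        """Cache rows for a query and remember which tables it read."""
        self.results[sql] = rows
        for table in tables:
            self.by_table.setdefault(table, set()).add(sql)

    def invalidate(self, table):
        """Evict every cached result that depends on the written table."""
        for sql in self.by_table.pop(table, set()):
            self.results.pop(sql, None)
```

A proxy in front of the database is a natural place for this logic because it sees both the reads and the writes, so it can invalidate without any help from the application.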
Our goal was to offload Aurora and bring the data closer to the user, removing network latency and improving application response times. We could have spent a year developing a SQL cache system ourselves, but with Heimdall Data, caching was online for testing in just one day, without any application changes.
For more information on how to configure distributed SQL caching with Heimdall, read this post on the AWS Database Blog: Heimdall Data SQL Caching for Amazon ElastiCache.
Heimdall Data identified SQL performance bottlenecks at every layer: application, network, and database. We used the rules engine, explain plans, and analytics to quickly identify several places for optimization.
The analytics provided by Heimdall Data displayed query response sizes, response times, query counts, and explain plans in a single platform. This level of monitoring and analytics allowed us to identify issues that otherwise only a seasoned database administrator (DBA) would be able to find.
Figure 6 – Heimdall Data SQL analytics.
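The kind of per-query aggregation described above can be sketched as follows. This is only an illustration of the concept: the log format (query pattern, elapsed milliseconds, response bytes) and the `summarize` and `slowest` helpers are assumptions made for this example, not Heimdall's data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class QueryStats:
    count: int = 0
    total_ms: float = 0.0
    total_bytes: int = 0

    @property
    def avg_ms(self) -> float:
        return self.total_ms / self.count if self.count else 0.0


def summarize(log):
    """Aggregate (pattern, elapsed_ms, response_bytes) records per query pattern."""
    stats = defaultdict(QueryStats)
    for pattern, elapsed_ms, response_bytes in log:
        entry = stats[pattern]
        entry.count += 1
        entry.total_ms += elapsed_ms
        entry.total_bytes += response_bytes
    return dict(stats)


def slowest(stats, n=5):
    """Return the n query patterns that consumed the most total time."""
    return sorted(stats, key=lambda p: stats[p].total_ms, reverse=True)[:n]
```

Ranking by total time rather than average time surfaces queries that are individually fast but run so often that they dominate database load.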
The table in Figure 7 defines the metrics shown in the Analytics tab. This information allowed Questis to identify performance issues.
Figure 7 – SQL analytics definition table.
Heimdall Data’s analytics provided our team insights into how Questis could optimize slow SQL queries. This alone made the product very useful for us.
Database caching improved website response times by up to 46 percent, thanks to high cache hit rates from Heimdall auto-caching and query routing (read/write splits). Heimdall Data saved us more than a year of development and maintenance of a data access layer.
Our entire team at Questis was very pleased with the performance gains and visibility delivered from Heimdall Data.
To get started with Heimdall Data, download a free trial from AWS Marketplace. Their AWS wizard walks you through a step-by-step configuration.
Don’t miss this post on the AWS Database Blog: Heimdall Data SQL Caching for Amazon ElastiCache.
This blog is posted on the AWS Partner Network (APN) Blog.