In working with various cache vendors over the years, I’ve come to realize that one of the more important questions isn’t what content you can cache, but what content you should NOT cache. This may seem an odd question: shouldn’t you always cache content that is cacheable? The more you cache, the better, right? Surprisingly, the answer is no. A number of factors come into play (a quick back-of-the-envelope sketch follows the list below), including:
- The memory usage of the cached object;
- The amount of time it takes to query the cache for a hit vs. a miss;
- The cache hit rate for a given query pattern;
- The amount of time it takes to ingest an object;
- How often objects are invalidated.
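To make the trade-off concrete, here is a minimal cost model in Python. The function name and the timings are purely illustrative assumptions, not measurements from any particular system: the expected query latency with a cache is the weighted average of the hit path and the miss path, and caching only pays off when that average beats going straight to the database.

```python
def expected_latency_ms(hit_rate, t_hit, t_miss_overhead, t_db):
    """Expected per-query latency when fronting the database with a cache.

    hit_rate        -- fraction of queries served from the cache (0.0 to 1.0)
    t_hit           -- time to return a cache hit, in ms
    t_miss_overhead -- time wasted probing the cache on a miss, in ms
    t_db            -- time to execute the query against the database, in ms
    """
    return hit_rate * t_hit + (1.0 - hit_rate) * (t_miss_overhead + t_db)


# Hypothetical numbers: a 1 ms remote cache round-trip versus a 2 ms database query.
with_cache = expected_latency_ms(hit_rate=0.4, t_hit=1.0, t_miss_overhead=1.0, t_db=2.0)
without_cache = 2.0
print(f"with cache: {with_cache:.2f} ms, without: {without_cache:.2f} ms")
# -> with cache: 2.20 ms, without: 2.00 ms  (a 40% hit rate makes things worse here)
```

Even with a respectable hit rate, the misses still pay the cache round-trip on top of the database query, which is exactly why “cache everything cacheable” can backfire.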
The upshot is that different cache storage mechanisms lead to different ideal hit rates; not all caches are created equal. For example, with a memcache server, every query against the cache traverses the network at least once. If the hit rate is not high and the cache’s response time is not meaningfully better than querying the database directly, then caching anything but complex queries will actually hurt application performance, even though it reduces load on the database. With a local (in-proxy) cache, however, there is no network overhead, so even content that offers only a marginal hit rate may make sense to cache.
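Following that logic, you can solve for the break-even hit rate at which the cache stops hurting. This is a hedged sketch using made-up timings, not numbers from any real deployment: a remote cache that pays a network round-trip needs a much higher hit rate to break even than an in-process cache whose lookups are nearly free.

```python
def break_even_hit_rate(t_hit, t_miss_overhead, t_db):
    """Minimum hit rate at which expected latency with the cache equals t_db.

    Derived from: h*t_hit + (1-h)*(t_miss_overhead + t_db) = t_db
    """
    return t_miss_overhead / (t_db + t_miss_overhead - t_hit)


# Hypothetical timings, in milliseconds.
remote = break_even_hit_rate(t_hit=1.0, t_miss_overhead=1.0, t_db=2.0)    # network hop both ways
local = break_even_hit_rate(t_hit=0.05, t_miss_overhead=0.05, t_db=2.0)   # in-proxy lookup
print(f"remote cache break-even: {remote:.0%}, local cache break-even: {local:.1%}")
# -> remote cache break-even: 50%, local cache break-even: 2.5%
```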
Here at Heimdall, we recognized these issues and have introduced an exciting feature that is getting great results for our customers: automatic cache tuning. The idea is that a customer uses rules to specify which SQL queries they are comfortable caching (accounting for our built-in invalidation logic). Our logic then continuously analyzes query performance by pattern and by table, and uses this data to determine which query patterns SHOULD actually be cached.
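The sketch below is a simplified illustration of the kind of decision such a tuner has to make for each eligible query pattern; the data structure, thresholds, and timings are my own assumptions, not Heimdall’s actual algorithm. The idea is to track observed database latency, cache latency, hit rate, and invalidation frequency per pattern, and only enable caching where it is expected to help.

```python
from dataclasses import dataclass


@dataclass
class PatternStats:
    """Rolling measurements for one SQL query pattern (all times in ms)."""
    hit_rate: float               # observed cache hit rate for this pattern
    cache_hit_ms: float           # average time to serve a hit from the cache
    cache_probe_ms: float         # average time wasted probing the cache on a miss
    db_ms: float                  # average time to run the query on the database
    invalidations_per_min: float  # how often results for this pattern are evicted


def should_cache(stats: PatternStats, max_invalidations_per_min: float = 60.0) -> bool:
    """Cache a pattern only if it is expected to reduce latency and is not
    invalidated so often that ingest cost outweighs the benefit."""
    if stats.invalidations_per_min > max_invalidations_per_min:
        return False
    expected_with_cache = (stats.hit_rate * stats.cache_hit_ms
                           + (1.0 - stats.hit_rate) * (stats.cache_probe_ms + stats.db_ms))
    return expected_with_cache < stats.db_ms


# Hypothetical pattern: 30% hit rate against a remote cache, fast database query -> don't cache.
print(should_cache(PatternStats(hit_rate=0.3, cache_hit_ms=1.0, cache_probe_ms=1.0,
                                db_ms=2.0, invalidations_per_min=5.0)))   # False
# Same hit rate, but a slow, complex query -> caching now helps.
print(should_cache(PatternStats(hit_rate=0.3, cache_hit_ms=1.0, cache_probe_ms=1.0,
                                db_ms=20.0, invalidations_per_min=5.0)))  # True
```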
The results with customers have been very promising: in nearly every case, the achieved cache hit rate ends up below the maximum possible when using this feature, yet overall application performance goes up. In a case I analyzed today, the maximum cache hit rate was about 60%, but the ideal hit rate was about 50% when using a pure local cache, and dropped to about 25% when using an external cache on another node. The beauty of this system is that it is adaptive: it adjusts what to cache automatically based on network, cache, and database performance. It even accounts for events such as DDoS attacks that change database access patterns. If your database becomes sluggish due to other activity, the system offloads as much work as possible to the cache, then resumes normal database use when performance recovers. This means you can augment your database with a cache, and even if the optimal hit rate is low under normal conditions, the cache provides protection against traffic surges and periodic batch processing that slow the database down.
We believe this technology is unique in the industry: in most cases, a customer has to decide what makes sense to cache and hand-tune cache behavior in the application. With Heimdall, this optimization is a single check-box, “Auto-tune Cache,” in the Virtual Database cache configuration.
Enjoy the time saved not having to program ideal cache policies into the application yourself!