1. ElastiCache overview
Welcome to this section on ElastiCache. ElastiCache is an in-memory caching service from AWS. It’s a fully managed in-memory data store, or caching service, that you use to improve the performance of your read operations. ElastiCache is a remote cache, also called a side cache. That is, it is a separate, dedicated caching instance that runs on a server separate from both the database and the application. ElastiCache provides rapid access to data across distributed nodes. There are two flavors of ElastiCache: ElastiCache for Redis and ElastiCache for Memcached. Both of these are open-source key-value stores. And ElastiCache provides sub-millisecond latency for your applications.
And if you compare Redis to Memcached, Redis supports a lot more features than Memcached does. Redis supports complex data types like sorted sets, as well as snapshots, replication, encryption, transactions, pub/sub messaging, Lua scripting, and geospatial data. But what Redis does not support is a multi-threaded architecture. So if you need a multi-threaded architecture, then Memcached is the way to go. Redis is suitable for complex applications including messaging queues, session caching, leaderboards, et cetera, whereas Memcached is a relatively basic caching service that you can use for simple applications like static website caching. So now let’s look at the different types of database caches that we have.
So caching, as I mentioned, is used to store frequently accessed data, so it improves your read performance. When you use caching, you essentially take the read load off your database and thereby improve the database performance. There are three types of database caches: integrated cache, local cache, and remote cache. When we talk about the integrated cache, it is the cache that is built into your database. And because this cache is a part of the database, it’s typically limited by the available memory and resources on your database instance. For example, we have an integrated cache in Aurora, and this is a managed cache with built-in write-through capabilities.
It’s enabled by default, you don’t have to enable it, and you don’t need any code changes in order to use it. The second type of cache is a local cache. A local cache is something you implement within your application, so it’s used to store data within the application itself. And the third type of cache is the remote cache, which is the one we are interested in right now. This is a dedicated cache: it stores data on dedicated servers, and it’s typically built upon key-value NoSQL data stores, for example Redis and Memcached. Remote caches can support up to a million requests per second per cache node, and they offer sub-millisecond latency. The caching of data and managing its validity, or the TTL (time to live), is managed by your application.
So you have to manage how you want to cache data on the Redis or Memcached clusters, and you have to manage the time to live of the different items that you store in the cache. Now, let’s look at some of the use cases of this remote cache, or ElastiCache in particular. Any app that requires sub-millisecond latency is a typical use case for ElastiCache. You can use ElastiCache for real-time analytics, gaming leaderboards, session stores, chat apps, and messaging apps. You can use it for machine learning and for streaming media apps. You can use it for location-based apps or geospatial apps. And you can also use it for message queues. So these are some of the use cases that we have for ElastiCache. All right, so that’s about it. Let’s continue.
2. Caching strategies
Now let’s look at some of the caching strategies that we can use with such caching solutions. The first one is lazy loading. Lazy loading loads data into your cache only when it’s necessary, so it’s a reactive approach. Only the data that gets queried from the database is cached, so you don’t need a lot of storage space; the cache stays small in size. But this does result in a cache miss penalty. What this means is, when the requested data is not present in the cache, it has to be read from the database, so that round trip to the database is necessary whenever the requested data is not in the cache. And this cache can also contain stale data.
So you must use an appropriate TTL, or time to live, to ensure that your cache does not contain stale data. The second approach is the write-through approach. The way it works is it loads data into the cache as it gets written to the database. So whenever you write anything to the database, it’s also written into the cache. This is a proactive approach. It ensures that your data is always current, so you don’t have any stale data in your cache, but the downside is it results in cache churn. Much of the data that you write to the cache is never read, so this consumes a lot more storage than you actually require. So you should use an appropriate TTL to save space.
And the third approach is a combination of lazy loading and write-through, so you get the benefits of both approaches. Even with this approach, you should ensure that you use an appropriate TTL value for your cached data. Now, let’s look at these caching strategies in a little more detail. We have lazy loading here. So you have your application and your database, and your ElastiCache, or remote cache, is sitting between your application and your database. Whenever your application makes a read request and the data is there in ElastiCache, it will be returned from ElastiCache. That’s called a cache hit. And when the requested data is not found in ElastiCache, we call it a cache miss.
Then this data is read from the database and also written to the cache. So this is how lazy loading works. In contrast, let’s look at how write-through works. You have your application and your database, with ElastiCache sitting in between them. Whenever you make any write request, you write to the database as well as to the cache at the same time. And whenever you make a read request and the data is present in the cache, it will be returned from the cache. If there is a cache miss, the data will be read from the database and also written to the cache. And when you use write-through, the data will expire after the TTL.
So if you write some data to the cache and the TTL expires, then that data will be removed from the cache, and that is when you will experience a cache miss in a write-through cache. All right, now let’s look at the use case of a user session store. This is a typical example of using a caching service like ElastiCache. Here we have a couple of applications and application users, and they are interacting with the ElastiCache service. Whenever a user logs into one of the applications, that application creates a session in ElastiCache. In other words, the application writes the session data into the ElastiCache service.
Now, let’s say the user hits another instance of our application. This application can simply retrieve the session data from ElastiCache, so the user doesn’t have to log in again. The instance retrieves the data, and the user is already logged in. Even when the user connects to a third application, it can retrieve the same session data. So you can use ElastiCache to create a shared session store that is shared between different applications, and users don’t have to log in every time they use a new application in your set of applications. All right, so that’s about it. Let’s continue.
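The lazy loading and write-through flows described in this section can be sketched in a few lines of code. This is a minimal, self-contained model: the `FakeDatabase` and `Cache` classes, the `now` parameter (used to make TTL expiry deterministic), and all the names are hypothetical stand-ins for a real database and a Redis client, not ElastiCache APIs.

```python
import time


class FakeDatabase:
    """Stand-in for a database; counts reads to show how caching reduces load."""

    def __init__(self):
        self.rows = {}
        self.reads = 0

    def get(self, key):
        self.reads += 1          # each call here is one "round trip" to the DB
        return self.rows.get(key)

    def put(self, key, value):
        self.rows[key] = value


class Cache:
    """Stand-in for a remote cache like Redis: key/value entries with a TTL."""

    def __init__(self):
        self.entries = {}        # key -> (value, expires_at)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self.entries.get(key)
        if entry is None or entry[1] <= now:
            return None          # cache miss, or the entry's TTL has expired
        return entry[0]

    def set(self, key, value, ttl, now=None):
        now = time.time() if now is None else now
        self.entries[key] = (value, now + ttl)


def lazy_load_read(db, cache, key, ttl=60, now=None):
    """Lazy loading (cache-aside): populate the cache only on a miss."""
    value = cache.get(key, now)
    if value is None:                        # miss: pay the DB round-trip penalty
        value = db.get(key)
        if value is not None:
            cache.set(key, value, ttl, now)  # cache it for subsequent reads
    return value


def write_through(db, cache, key, value, ttl=60, now=None):
    """Write-through: every write goes to the database AND the cache."""
    db.put(key, value)
    cache.set(key, value, ttl, now)
```

A session store works the same way: log-in does a write-through of the session data, and every application instance does a lazy-loading read keyed by the session ID.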
3. Redis architecture and Multi-AZ auto-failover
Now let’s look at the Redis architecture. There are two modes with Redis: cluster mode disabled and cluster mode enabled. Let’s look at cluster mode disabled first. This is how it looks. Remember that Redis clusters are generally placed in private subnets within your VPC, and they are accessed from an EC2 instance placed in a public subnet within the VPC. This is true for all the Redis modes, whether cluster mode is disabled or enabled. So when we talk about cluster mode disabled, it has a single shard. All the data is stored in a single shard, and a shard consists of a primary node and up to five replicas.
So you can have one primary node, and you can have multiple replicas, up to five of them. When a shard has one or more replicas, we call it a replication group, and you can deploy these replicas in a Multi-AZ setup. So you have one replica in one AZ and another replica in another AZ, and you can have up to five such replicas in a cluster mode disabled cluster. These replicas support auto-failover capabilities: if the primary goes down, the Redis cluster can fail over to one of the replicas. All these instances use a single reader endpoint that auto-updates whenever there are changes to your replica nodes.
So whenever you add or remove any replicas, the reader endpoint will ensure that the connection data is up to date, so you don’t have to make any changes to your application. Now let’s look at the cluster mode enabled architecture. Cluster mode enabled simply means that you can have more than one shard. You can have multiple shards, and the data is distributed across these shards: you have one set of data in one shard, a second set of data in the second shard, and so on. And just like in the cluster mode disabled cluster, each shard has a primary node and up to five replicas.
These replicas support auto-failover capability, just like the cluster mode disabled cluster, and with a cluster mode enabled cluster, you can have up to 90 nodes per cluster. So you can have 90 single-node shards if you create no replicas, or, if you create five replicas per shard, you can have up to 15 shards (15 shards × 6 nodes each = 90 nodes). Either way, you can have at most 90 nodes in total. For high availability, it’s recommended to use at least three shards. And generally it’s a good idea to use Nitro-based node types for higher performance, or in other words, use the M5 or R5 instance classes. Now let’s talk about Multi-AZ with auto-failover.
So whenever there is an outage, or if your primary goes down, the Redis cluster will fail over to a replica node. Say the primary P1 goes down; the Redis cluster then fails over to, let’s say, the first replica. The first replica becomes the primary, and the other replicas remain as replicas. That is how it works. The downtime is typically about three to six minutes, so it’s a minimal downtime. And because Redis replicas use asynchronous replication, you can experience some data loss due to replication lag. Remember that performing a manual reboot does not trigger auto-failover.
But other reboots or failures do result in auto-failover. And if you want to simulate or test a failover, you can do that using the AWS console, or you can also use the CLI or API for this purpose. Now let’s look at what happens during planned maintenance. For auto-failover-enabled clusters, if you have cluster mode enabled, there is no write interruption, so your application continues to perform the way it normally does during planned maintenance. But if you have cluster mode disabled, there will be a brief write interruption of a few seconds. All right, so that’s about it. Let’s continue to the next lecture.
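The auto-failover behavior described above (primary fails, one replica is promoted, the rest stay replicas) can be modeled conceptually. This is only an illustration of the promotion step; the `Shard` class and its names are hypothetical, and in reality ElastiCache itself handles the election, DNS updates, and endpoint propagation.

```python
class Shard:
    """Toy model of a Redis shard: one primary plus up to five replicas."""

    MAX_REPLICAS = 5

    def __init__(self, primary, replicas):
        if len(replicas) > self.MAX_REPLICAS:
            raise ValueError("a shard supports at most five replicas")
        self.primary = primary
        self.replicas = list(replicas)

    def fail_over(self):
        """Promote a replica to primary, as the service would on an outage.

        With no replicas there is nothing to promote, which is why a
        primary-only shard losing its primary means data loss.
        """
        if not self.replicas:
            raise RuntimeError("no replica to promote: primary failure means data loss")
        self.primary = self.replicas.pop(0)  # promoted node leaves the replica set
        return self.primary
```

Note the asymmetry this models: the promotion itself is fast, but because replication is asynchronous, writes that had not yet reached the promoted replica are lost.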
4. Redis backup and restore
Now let’s talk about the backup and restore capabilities in Redis. Redis supports both manual as well as automatic backups. The backups are point-in-time copies of the entire Redis cluster; you cannot back up individual nodes. And you can use backups to warm-start a new cluster, that is, preload the new cluster with the data from the backups, so the cluster doesn’t have to wait until it gathers all the data that your application requests. With Redis, you can back up either from the primary node or from a replica, and for the best performance it’s recommended that you back up from a replica. This ensures that the performance of your primary node is not affected.
And remember that these backups, just like other database backups, are stored in S3. You can also call these backups snapshots. You can export these snapshots to your own S3 buckets in the same region: you already have a snapshot stored in S3, and you can export that snapshot to a bucket that you own using the export option. Once you have copied a snapshot to your own bucket, you can then copy it from that bucket to a bucket in another region, for example. This is how you would create a cross-region snapshot of your Redis cluster. All right, so that’s about backup and restore. Let’s continue to the next lecture.
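As a sketch, the export and cross-region copy described above could look like this with the AWS CLI. The snapshot, bucket, and region names are hypothetical, the exported object’s exact file name may differ from what is shown, and the target bucket must grant ElastiCache the required access beforehand:

```shell
# Export an existing snapshot to an S3 bucket you own
# (the bucket must be in the same region as the snapshot).
aws elasticache copy-snapshot \
    --source-snapshot-name my-redis-snapshot \
    --target-snapshot-name my-redis-snapshot-export \
    --target-bucket my-elasticache-backups

# The export lands as a plain .rdb object in your bucket; copy it to a
# bucket in another region to get the cross-region snapshot copy.
# (Object name below is illustrative.)
aws s3 cp \
    s3://my-elasticache-backups/my-redis-snapshot-export.rdb \
    s3://my-elasticache-backups-eu/ \
    --source-region us-east-1 --region eu-west-1
```

This mirrors the two-step flow from the lecture: export to a bucket you own first, then copy between your own buckets across regions.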
5. Redis scaling and replication
Now let’s talk about Redis scaling. The scaling approach is slightly different for cluster mode disabled clusters and cluster mode enabled clusters. Let’s first talk about the cluster mode disabled cluster. A cluster mode disabled cluster supports two types of scaling: vertical as well as horizontal. Vertical scaling means you scale up or scale down your node type, so you simply increase or decrease the size of your instance, and this is a minimal-downtime operation. When you do horizontal scaling, you add or remove replica nodes. For example, if you have one replica, you can add up to five replicas in total, or you can remove some of the replicas as per your needs.
And remember that if you have Multi-AZ with automatic failover enabled, then you cannot remove the last replica; Multi-AZ requires at least one replica to be in use. All right, so that is scaling for the cluster mode disabled cluster. Now let’s look at scaling for the cluster mode enabled cluster. A cluster mode enabled cluster supports online vertical scaling: you can scale up or scale down your node type, increasing or decreasing the size of your instance. And since this is online scaling, there is no downtime. What online means is that your cluster remains available during the scaling operation. When we talk about horizontal scaling in a cluster mode enabled cluster, it means resharding and shard rebalancing.
So you can use horizontal scaling to add, remove, or rebalance the shards in your cluster. Let’s say you have one shard with five replicas; you can then create another shard with any number of replicas you desire. All shards don’t have to have the same number of replicas, and you can create up to 90 such shards (if the shards have no replicas, since the cluster is limited to 90 nodes in total). Adding and removing these shards is called resharding, whereas redistributing the data across the shards is called rebalancing. Shard rebalancing ensures that data is equally distributed across the different shards. And there are two modes for horizontal scaling: offline and online. Offline means there will be some downtime, whereas online means there is no downtime.
Now, let’s talk about these two modes, online and offline, for horizontal scaling in a little more detail. Here is the comparison of the online mode and the offline mode. As I mentioned, online mode means there is no downtime, whereas offline mode will have some downtime. With the online mode, the cluster remains available during the scaling operation, but with the offline mode, the cluster will not be available. In both modes, you can perform scale-out and scale-in operations as well as rebalancing operations. But scale-up and scale-down operations are only supported in offline mode, so you cannot change the size of your instance if you are using the online mode.
If you want to upgrade the engine version, you have to use the offline mode. If you want to change the number of replica nodes in each shard independently, you also have to use the offline mode; with the online mode, all the shards will have the same number of replicas. And in a similar manner, if you want to specify the keyspaces for your shards independently, you have to use the offline mode. So the online mode only supports scale-out, scale-in, and rebalancing operations, and the benefit is that your cluster remains available during the scaling operation. If you need to change the size of your cluster nodes, upgrade the engine version, manage the replica nodes independently, or manage the keyspaces independently, then you should consider using the offline mode.
Now, let’s talk about Redis replication with the two cluster modes, cluster mode disabled and cluster mode enabled. A cluster mode disabled cluster has a single shard, whereas cluster mode enabled supports up to 90 shards. A cluster mode disabled cluster can have up to five replicas, whereas a cluster mode enabled cluster can have up to five replicas per shard. If you’re not using any replicas, then with a cluster mode disabled cluster, if your primary instance fails, there will be total data loss; but in a cluster mode enabled cluster, if you’re not using any replicas and a primary goes down, only that shard will experience data loss. Multi-AZ is supported in a cluster mode disabled cluster, whereas with a cluster mode enabled cluster, Multi-AZ is required; it’s a mandatory requirement. A cluster mode disabled cluster supports scaling, whereas a cluster mode enabled cluster supports partitioning of data across different shards. And if the load on the primary is read-heavy, you can scale the cluster mode disabled cluster only up to a maximum of five replicas, while the cluster mode enabled cluster is good for write-heavy workloads, because you get additional write endpoints: one write endpoint per shard. So that’s about it. Let’s continue to the next lecture.
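To make "partitioning of data across different shards" concrete: Redis Cluster maps every key to one of 16,384 hash slots using a CRC16 checksum, and each shard owns a range of slots (the "keyspaces" you see per shard in the console). Below is a minimal sketch; `shard_for_slot` is a simplified, hypothetical helper assuming equal slot distribution, not how ElastiCache itself assigns slots, and real Redis additionally honors hash tags in `{...}`, which this omits.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16-CCITT (XModem), the checksum Redis Cluster uses for key slots."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str, total_slots: int = 16384) -> int:
    """Map a key to one of Redis Cluster's 16,384 hash slots."""
    return crc16_xmodem(key.encode()) % total_slots


def shard_for_slot(slot: int, num_shards: int = 3, total_slots: int = 16384) -> int:
    """Find which shard owns a slot, assuming equal contiguous slot ranges.

    Simplified illustration only: remainder slots all land on the last shard.
    """
    slots_per_shard = total_slots // num_shards
    return min(slot // slots_per_shard, num_shards - 1)
```

With three shards and equal distribution, shard 0 owns slots 0-5460, shard 1 owns 5461-10921, and shard 2 owns the rest, which is what the console's "equal distribution" option produces.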
6. Creating a Redis cluster – Hands on
In this lecture, let’s create a Redis cluster. Here I am in the ElastiCache dashboard, so let’s get started. Here we choose Redis. If you want to create a single-node cluster, you leave the cluster mode option unchecked, and if you want to create a cluster mode enabled cluster, you select it. Let’s select it. Now let’s name our cluster, say redis-cluster-1, and we can leave all the other options here at their defaults. For the node type, we can choose the smallest node available, so I’m going to go with cache.t2.micro. We leave the number of shards at three and the replicas per shard at two, and we also want to have a Multi-AZ setup. Then we provide a subnet group, so let’s name the subnet group.
Let’s say redis-subnet-group, and I’ll give it a description as well. Then you should select at least two subnets; I’m going to select one, two, and three. And here you can decide on the slots and keyspaces. Ideally, equal distribution is a good choice. If you want to go with a different one, you can choose a custom distribution and then specify the keyspaces for shards one, two, three, and so on. But I’m going to go with the equal distribution. Then under security group, I’m going to use the default security group. The only thing you should remember is that the security group should allow inbound access on the Redis port, which is 6379. So this is the port your security group should allow inbound access on.
Then, we don’t want encryption. And if you want to seed or warm-start your cluster, you can specify a data file path here; we don’t need that for now. Then you can choose the backup options and maintenance window options here, and if you want to subscribe to alerts, you can specify an SNS topic. All right, so that’s about it. Go ahead and hit create to create the cluster. It’s going to take a while for this cluster to be ready, so I’m going to pause the video here and come back once the cluster is available. All right, now we can see that the cluster is available, so let’s open it. Here we can see that we have three shards and nine nodes in total, three nodes per shard. Let’s click on the cluster name.
And here we can see that we have three shards with three nodes each, and the keyspaces on each shard. So the data is distributed across three shards, and each shard has three nodes. From here you can add or remove shards, and you can add or remove replicas. Each shard can have up to five replicas, and different shards can have different numbers of replicas as well. Now let’s click on one of the shards. Here we have one shard with three different nodes, and they are placed in three different AZs. From here you can add additional nodes, remove nodes, trigger a failover, and so on. Okay, so that’s about it. Let’s continue to the next lecture.
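Once the cluster is available, you would typically connect to it from an EC2 instance in the same VPC, as discussed earlier. A quick sketch with `redis-cli`; the endpoint shown is a placeholder for your cluster’s configuration endpoint, and your security group must allow inbound access on port 6379:

```shell
# -c enables cluster mode so the client follows MOVED redirects between shards.
# Replace the host with your own cluster's configuration endpoint.
redis-cli -c -p 6379 \
    -h redis-cluster-1.xxxxxx.clustercfg.use1.cache.amazonaws.com

# Inside the prompt, keys are transparently routed to the shard that owns them:
#   SET user:1 "Alice"
#   GET user:1
#   CLUSTER KEYSLOT user:1    <- shows which hash slot the key maps to
```

This is a handy way to verify that the key distribution across your three shards matches the keyspaces shown in the console.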