20. [SAA/DVA] Auto Scaling Groups – Scaling Policies Hands On
Okay, so now let’s have a look at automatic scaling for your ASG. As you can see, we have three categories: dynamic scaling policies, predictive scaling policies, and scheduled actions. So let’s start with the simplest. Scheduled actions are for when you want to schedule a scaling action in the future. So you can create one, and then you say what you want the desired capacity, the min capacity, or the max capacity to be. And then, is it once, or is it every week, every hour? Is it based on a specific schedule, and so on? And then you set a start time, and an end time if you want the action to stop at some point.
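To make the scheduled action settings above a bit more concrete, here is a minimal sketch that only builds the parameters such an action needs. The group name, action name, dates, and capacities are made-up values for illustration; the keys follow the shape I would expect boto3’s `put_scheduled_update_group_action` call to take, and nothing is actually sent to AWS.

```python
from datetime import datetime, timezone


def build_scheduled_action(group_name, action_name, min_size, max_size,
                           desired, start_time, end_time=None, recurrence=None):
    """Return keyword arguments shaped for put_scheduled_update_group_action."""
    # The desired capacity only makes sense between the min and the max.
    if not min_size <= desired <= max_size:
        raise ValueError("desired capacity must sit between min and max")
    params = {
        "AutoScalingGroupName": group_name,
        "ScheduledActionName": action_name,
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
        "StartTime": start_time,
    }
    if end_time is not None:
        params["EndTime"] = end_time
    if recurrence is not None:
        params["Recurrence"] = recurrence  # cron expression
    return params


# Hypothetical example: more capacity every Saturday at 08:00 UTC for a promotion.
promo = build_scheduled_action(
    group_name="DemoASG",             # assumed name, not taken from the lecture
    action_name="saturday-promotion",  # assumed name
    min_size=2,
    max_size=10,
    desired=6,
    start_time=datetime(2030, 1, 5, 8, 0, tzinfo=timezone.utc),
    recurrence="0 8 * * 6",
)
print(promo)
# With boto3 installed and credentials configured, this could be sent with:
#   boto3.client("autoscaling").put_scheduled_update_group_action(**promo)
```

Keeping the parameters in a plain dictionary means you can review or test them before any real API call is made.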
Okay, so this is pretty cool. It allows you to scale based on events that you can predict and know in advance, because, for example, you know that you’re going to run a big promotion next Saturday. Okay, next. Predictive scaling policies are driven by machine learning. So you scale based on a forecast, and the forecast is built by looking at the actual usage in the past. It will look at a metric, such as CPU utilization, network in, network out, the Application Load Balancer request count, or a custom metric that you specify.
And then you want, for example, to have a target of 50% CPU utilization per instance. And then you can set up some additional settings. But based on this and the actual CPU utilization for the past week, for example, a forecast is going to be created, and your ASG is going to scale based on that forecast. So it’s not something I can demonstrate to you, because I would need to enable this for a very long time, a week, and then actually have some usage on my application. But at least you see that a predictive scaling policy is quite simple to set up. You just specify the metric you want and the target utilization, and then some machine learning will be applied for the scaling to happen. So the one policy I can demonstrate to you is a dynamic scaling policy. So let’s go ahead and create a dynamic scaling policy.
So we have three options here. We have target tracking scaling, step scaling, and simple scaling. So let’s have a look at simple scaling first. So here we have to specify a name, then a CloudWatch alarm, which is an alarm that can scale capacity whenever it is triggered. So we need to create an alarm beforehand, but we’ll see this later on. And so, in case the alarm is triggered, what happens? Maybe you want to add two capacity units, or maybe you want to add 10% of your group, okay. And then add capacity units in increments of at least two capacity units. Okay, so this is a simple scaling policy. So you can have a scaling policy that scales out by adding instances or that scales in by removing instances. Or you can set the capacity to a specific value as well. Okay? And next, we have step scaling. So this allows you to define multiple steps and, based on the alarm value, take a different action for each step. For example, if the alarm is very, very high in terms of its value, then add ten capacity units. But if it’s high but not too high, add one. This is the idea behind step scaling. But we’ll set up a target tracking scaling policy, because this is going to create the CloudWatch alarms for us.
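Before moving on to target tracking, the step scaling idea described above can be sketched in a few lines. This is only an illustration of the logic, not the service’s implementation: the alarm threshold of 60 and the two steps are assumed values, and the bounds are expressed relative to the threshold, which is how step adjustments are usually written.

```python
# Hypothetical steps: "high but not too high" adds 1, "very, very high" adds 10.
# Bounds are offsets from the alarm threshold (lower inclusive, upper exclusive).
STEPS = [
    {"MetricIntervalLowerBound": 0, "MetricIntervalUpperBound": 20,
     "ScalingAdjustment": 1},
    {"MetricIntervalLowerBound": 20, "ScalingAdjustment": 10},
]


def adjustment_for(metric_value, threshold, steps):
    """Return how many capacity units to add for a given metric value."""
    breach = metric_value - threshold
    if breach < 0:
        return 0  # the alarm is not breached, so nothing to add
    for step in steps:
        lower = step.get("MetricIntervalLowerBound", float("-inf"))
        upper = step.get("MetricIntervalUpperBound", float("inf"))
        if lower <= breach < upper:
            return step["ScalingAdjustment"]
    return 0


THRESHOLD = 60  # assumed alarm threshold, in percent CPU
for cpu in (50, 70, 95):
    print(cpu, "->", adjustment_for(cpu, THRESHOLD, STEPS))
```

A simple scaling policy is the special case where there is a single adjustment, whatever the size of the breach.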
So here is the name of the target tracking policy, and it will track the average CPU utilization against a target value of 40. Okay? And then I will go ahead and create my policy. So now what we’re saying is that, hey, the goal of this ASG is to maintain a CPU utilization of 40. And if you go over, then please add capacity units. So to see this in action, we need to change a few things. So right now the min and the desired are one, which is good, but let’s set the maximum capacity to three or two, whichever you like. The idea is that you want to give it a maximum that is greater than the minimum, so that the capacity can go from one to two and then to three. And so the idea now is that we want the CPU utilization of my Auto Scaling group to be at a target value of 40. So if you have a look right now at the CPU utilization, it’s going to be zero, obviously. So let’s have a look at EC2. It’s going to be close to zero because, well, my EC2 instance is not doing anything. So I’m going to go to my EC2 instance and stress it until the CPU utilization reaches 100%.
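As a side note, the target tracking policy created above can be sketched as API parameters. The group and policy names are assumptions for illustration; the dictionary follows the shape of boto3’s `put_scaling_policy` call with a predefined average CPU metric. The `rough_capacity_estimate` helper is my own simplification of the idea "keep the average near the target" and is not the algorithm the service really uses.

```python
import math


def build_target_tracking_policy(group_name, policy_name, target_cpu):
    """Return keyword arguments shaped for put_scaling_policy."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": policy_name,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": float(target_cpu),
        },
    }


def rough_capacity_estimate(current, avg_cpu, target_cpu, min_size, max_size):
    """Simplified model: scale capacity in proportion to avg_cpu / target_cpu."""
    wanted = math.ceil(current * avg_cpu / target_cpu)
    # The group never goes below its min or above its max.
    return max(min_size, min(max_size, wanted))


policy = build_target_tracking_policy("DemoASG", "DemoTargetTracking", 40)
print(policy)
# One instance at 100% CPU, target 40%, group bounded between 1 and 3 instances.
print(rough_capacity_estimate(1, 100, 40, min_size=1, max_size=3))
```

This also shows why the maximum capacity matters: with a maximum of one, the estimate would be clamped and no scale-out could happen.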
So I’m going to connect to my EC2 instance using EC2 Instance Connect. Then I’ll Google “install stress Amazon Linux 2”, because there are a few commands. And here are the commands. So I will copy the first command in here, and then I will copy the second command to install stress. Here we go. So stress is installed, and then I just run the command stress -c 4. And this is going to make the CPU reach 100% by loading four CPU workers, meaning four virtual CPUs are being used at a time. So this should make my CPU go to 100%. And so what’s going to happen is that in the monitoring of my ASG in here, I want to see the CPU utilization go to something very, very high, okay? Then, in the activities, I’d like to see a scaling action. And so that means that I will go from one instance to two instances. So what I’m going to do is just pause the video until enough metrics have been captured, okay?
And until we can see that the CPU utilization is at a very high value. And then we’ll see how the target tracking policy works. So now I’m in the Activity tab, under Activity history. It says that an alarm has been triggered, and due to the target tracking policy, the capacity went from one instance to two instances. So if I go into instance management, as you can see now, I have two EC2 instances due to the scaling. And if I go into monitoring and look at the EC2-level monitoring, as you can see, the CPU utilization went to a very high value, and therefore the scaling happened. So how did the scaling happen?
So if we go into automatic scaling, as you can see, there is a target tracking policy right here, and what I want to show you is the back end. So, if we go into the CloudWatch service, on the left-hand side I’d like to access the alarms. As you can see, two alarms were created directly by the target tracking policy, okay? One of them is called AlarmHigh, which is to scale out, so to add instances, and one of them is called AlarmLow, which is to scale in, so to have fewer instances. So, if the CPU utilization is greater than 40% for three data points within three minutes, this one will enter the alarm state. And this one is looking to see if the CPU utilization is less than 28 for 15 data points, and then it scales in. Okay? So this is the idea. So this one was in alarm. And so, due to the metric itself going over that limit right here, the alarm got triggered.
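To get a feel for how these two alarms behave, here is a small simulation. It is a simplified model that I am assuming for illustration (every one of the last N data points must breach the threshold); the thresholds and data point counts are the ones mentioned above, and the CPU readings are made up.

```python
def alarm_state(datapoints, threshold, periods, comparison):
    """Return 'ALARM' when the last `periods` datapoints all breach the threshold."""
    if len(datapoints) < periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-periods:]
    if comparison == "greater":
        breaching = all(value > threshold for value in recent)
    elif comparison == "less":
        breaching = all(value < threshold for value in recent)
    else:
        raise ValueError("comparison must be 'greater' or 'less'")
    return "ALARM" if breaching else "OK"


# Made-up one-minute CPU readings: idle, then the stress command kicks in.
under_stress = [1, 2, 100, 100, 100]
print("AlarmHigh:", alarm_state(under_stress, 40, 3, "greater"))  # scale out
print("AlarmLow: ", alarm_state(under_stress, 28, 15, "less"))    # scale in

# After the load stops, 15 quiet minutes are needed before AlarmLow fires.
idle_again = under_stress + [0] * 15
print("AlarmLow: ", alarm_state(idle_again, 28, 15, "less"))
```

The asymmetry is deliberate: three data points to scale out, fifteen to scale in, so the group adds capacity quickly and removes it cautiously.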
And this made a scaling activity happen in my Auto Scaling group, which in turn made a new instance go into service. So now I want to stop this stress command, and I’m not even sure I can stop it from here, so the easiest way is probably to go to my EC2 instances and reboot them. So I’m going to have a look at this one, I believe, and I’m going to reboot it. And I’m also going to reboot that one, just in case. Okay? So this should make my CPU utilization go back to zero, and this will trigger a scale-in action within 15 minutes. So I’m going to pause the video yet again and watch my ASG to see if a scale-in action happens. So I will pause right now. And actually, just because I wasn’t quick enough and the CPU was still high, the desired capacity went from two to three. So as we can see, yet another instance has been added. So this really shows the power of Auto Scaling groups. And now, if I go into instance management,
I have three instances, and now I’ll have to wait a little while for the extra ones to be terminated. So now let’s go back into the activity. And as you can see, more activities have been going on, so some instances have been terminated. Because the low CPU alarm was triggered, the capacity was reduced from three to two, and then again from two to one. And so if you go into your instance management, as you can see, one instance was already terminated and the other is in the terminating phase. And so that means that your target tracking policy is working. And you can see this by going into this alarm right here. As you can see, the CPU utilization went up and then it went down. And then, as soon as it went below this 28% threshold, it went into the alarm state, which means that your ASG will start removing instances. Okay? So it really shows the power of target tracking policies. When you’re done, please make sure to delete this policy, and you’ll be good to go for the cleanup. That’s it. I will see you in the next lecture.
21. ASG for SysOps
Okay? So let’s have a look at a few features you need to know going into the SysOps exam. So the lifecycle hooks are a way for you to hook into the lifecycle of an ASG instance, that is, whenever it is launched or terminated. So by default, as soon as an instance is launched in your ASG, it goes into service right away. So it goes from Pending to InService.
But you can set up a lifecycle hook to change that, so that some additional steps are performed. So after Pending, the instance can go into a Pending:Wait state as part of your lifecycle hook, okay? And in that state, you can define a script to run on the instance as it starts, for example, for some initial setup. So when you’re done with the initial setup of your EC2 instance, you make it go into the Pending:Proceed state, and then after that it will be moved into the InService state. So this lifecycle hook really allows you to perform some kind of custom logic between the Pending and the InService states. Then you can also perform some action before the instance is terminated. So, for example, you want to pause the instance before it’s terminated for troubleshooting. And what this gives you is the opportunity, for example, to take the logs out of your instances. So let’s say that an instance goes from being in service to Terminating.
Then, as part of your lifecycle hook, it can go into the Terminating:Wait state, okay? When it gets there, you can run some scripts, get some logs, get some information out, take an AMI, or take an EBS snapshot, whatever you want, and then let the termination proceed. Following that, the instance will enter the Terminated state. And the use cases for all these lifecycle hooks are really to do cleanup, log extraction, or special health checks before your instance starts and goes into service. And to integrate scripts with these lifecycle hooks, there are EventBridge, SNS, and SQS. So whenever there’s a lifecycle event, a message can be sent to these three destinations. And if it goes to EventBridge, for example, it can invoke a Lambda function for you to perform any kind of scripting you want on top of things. Okay? So very quickly, there are launch configurations and launch templates.
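Before getting into those, here is a quick recap of the lifecycle hook states from above as a small sketch. The state names are the ones the ASG reports; the function itself is just an illustrative helper of mine and not an AWS API.

```python
def lifecycle_path(start, hook_enabled):
    """List the states an instance goes through when launching or terminating."""
    end_states = {"Pending": "InService", "Terminating": "Terminated"}
    if start not in end_states:
        raise ValueError("start must be 'Pending' or 'Terminating'")
    end = end_states[start]
    if not hook_enabled:
        # Default behaviour: straight to the final state.
        return [start, end]
    # With a hook: wait for the custom action, then proceed.
    return [start, f"{start}:Wait", f"{start}:Proceed", end]


print(" -> ".join(lifecycle_path("Pending", hook_enabled=False)))
print(" -> ".join(lifecycle_path("Pending", hook_enabled=True)))
print(" -> ".join(lifecycle_path("Terminating", hook_enabled=True)))
```

While an instance sits in one of the Wait states, your script or Lambda function does its work, and the instance only moves on once the hook is completed or times out.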
So both allow you to specify the AMI, the instance type, a key pair, security groups, and any other parameters to launch your EC2 instance, such as tags and user data, okay? And they’re both used by your ASG to launch instances, but you cannot edit them. So whenever you want to update a launch configuration, you have to create a new one, and for a launch template, you have to create a new version. So the launch configuration must be recreated every single time. Launch configurations are the legacy option in AWS, so they’re not really used anymore. And launch templates are newer, so they can have multiple versions, which is a much cleaner way of evolving your launch templates. They can create parameter subsets, so you can have launch templates based on other launch templates for configuration reuse and inheritance.
A launch template can provision both On-Demand and Spot instances, or a mix, to create an optimised fleet. You cannot do this with a launch configuration. It supports placement groups, capacity reservations, dedicated hosts, and multiple instance types. And you can use the T2 unlimited burst feature. So launch templates are what AWS recommends using going forward. Next, let’s look at SQS with auto scaling. So, how do you scale an Auto Scaling group based on the status of an SQS queue? So, in this example, we have an SQS queue and a bunch of EC2 instances that process messages from it. And what you want to do is scale your ASG based on whether or not you have more messages in your SQS queue. So for this, you can create a CloudWatch metric, for example, on the queue length.
So the metric is the approximate number of messages in the queue (in CloudWatch, ApproximateNumberOfMessagesVisible), and whenever that queue length is too big, that means you have too many messages to process. Then you can create an alarm, and that is a CloudWatch alarm, which would trigger a scaling policy on your ASG. Now, for your ASG, you have multiple health checks. So to make sure you have high availability, you need to have at least two instances in your ASG, okay? And a multi-AZ ASG. And then you can do some health checks, such as the EC2 status checks, to make sure that the underlying software and hardware of your EC2 instances are still functioning. This is enabled by default. But there is also the ELB health check. So this is to make sure that your application, if linked to a target group, will have its health checked by the ELB as well. And if the ELB figures out that your instance is unhealthy, then your ASG will terminate it.
And there’s one last type of health check, which is called a custom health check, which is to send the instance health to the ASG manually or automatically using the CLI or the SDK. So there is an API call for you to set the instance health directly on your ASG. That’s why it’s called the custom health check. So whenever a health check fails, the instance will be deemed unhealthy, and a new instance will be launched after the unhealthy one is terminated. There will not be a reboot of an unhealthy host for you. So if the instance is failing its EC2 status check, then the ASG will not reboot the instance; it will just terminate it and launch a new one. So it’s good to know the CLI command set-instance-health. This is the API call that’s used for the custom health checks. And terminate-instance-in-auto-scaling-group, to terminate instances in the Auto Scaling group, is another one that you should know.
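Here is a minimal sketch of what a custom health check could look like on the scripting side. The disk usage rule and the instance ID are invented for illustration; the dictionary follows the shape of boto3’s `set_instance_health` call, and the sketch only builds the parameters without calling AWS.

```python
def build_set_instance_health(instance_id, healthy, respect_grace_period=True):
    """Return keyword arguments shaped for set_instance_health."""
    return {
        "InstanceId": instance_id,
        "HealthStatus": "Healthy" if healthy else "Unhealthy",
        "ShouldRespectGracePeriod": respect_grace_period,
    }


def disk_check_passes(disk_used_percent, limit=90):
    """Hypothetical custom rule: the instance is healthy while disk usage is below the limit."""
    return disk_used_percent < limit


# Made-up reading for a made-up instance ID.
report = build_set_instance_health(
    "i-0123456789abcdef0",
    healthy=disk_check_passes(disk_used_percent=97),
)
print(report)
# With boto3 installed and credentials configured, this could be sent with:
#   boto3.client("autoscaling").set_instance_health(**report)
```

Once an instance is reported as Unhealthy this way, the ASG treats it like any other failed health check: it terminates the instance and launches a replacement.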
Finally, some common troubleshooting around your ASG. So if you already have some instances running in your ASG but you cannot launch any new ones, it could be, for example, that your Auto Scaling group has reached the limit that you set with the maximum capacity parameter, in which case, to have more instances, you need to increase the maximum capacity so the ASG can scale out more. Another common reason for new instances not being launched is a capacity issue in an Availability Zone: because the ASG cannot find capacity there, the instance cannot be launched. Then, if launching a specific EC2 instance is failing, maybe the security group does not exist, or it might have been deleted in the back end. So have a look for this. Or maybe the key pair does not exist; it might have been deleted as well. And if your ASG has been failing to launch instances for over 24 hours, it will suspend the Auto Scaling processes, which is an automatic suspension that gives you time to debug. Okay, so that’s it for an overview of the things you need to know going into the exam. I hope you liked it, and I will see you in the next lecture.
22. CloudWatch for ASG
So here are some important CloudWatch metrics for your ASG. So you have metrics that are collected every 1 minute, okay? And they’re ASG-level metrics. They’re opt-in, and you have GroupMinSize, GroupMaxSize, and GroupDesiredCapacity, which represent the value of these parameters over time. Then you have GroupInServiceInstances, GroupPendingInstances, GroupStandbyInstances, GroupTerminatingInstances, and GroupTotalInstances. And they represent the number of instances in each state in your ASG. And seeing these metrics is opt-in, so you should enable metric collection at the ASG level to do so. And then you have EC2-level metrics.
Okay? And these are enabled by default. So you get CPU utilization, network in and out, and so on. You get basic monitoring at five-minute granularity, or detailed monitoring at 1-minute granularity. So here, in your ASG, you can enable group metrics collection by clicking the Enable button, and the metrics will start populating here. So, as you can see, no data is available right now because I haven’t done it yet, but these could be very interesting to look at. And then for the EC2 side, as you can see, we get some metrics around CPU utilization. The disk metrics would apply if this instance used an instance store, which it doesn’t, so this is why they’re not showing anything. But Network In and Out are very interesting, as well as the status checks and so on. Okay? So hopefully that was helpful to you. I hope you liked this lecture, and I will see you in the next lecture.
23. Auto Scaling Overview
So the AWS Auto Scaling service is at the heart of auto scaling in AWS. It’s going to be available for all the scalable resources in AWS. So you have Auto Scaling groups, which allow you to launch or terminate EC2 instances. You have Spot Fleet requests, which allow you to launch or terminate instances from the Spot Fleet request itself, as well as automatically replace instances that are interrupted due to price or capacity constraints. You have Amazon ECS, which will use auto scaling to adjust the ECS service desired count. And you have DynamoDB tables or global secondary indexes, where auto scaling adjusts the WCU and RCU.
So that’s write capacity units and read capacity units over time. And Aurora will be using auto scaling as well, for dynamic read replica auto scaling. Okay. And obviously, maybe other services will be added over time to the Auto Scaling service in AWS. So we have scaling plans, and it could be dynamic scaling, which we already know: it creates a target tracking scaling policy. So without dynamic scaling, you have the same capacity over time, but with dynamic scaling, you start adjusting the capacity over time, and so you stabilise the utilization of your service itself. So you can optimise for availability, in which case 40% resource utilization will be the target. You can balance availability and cost, and so this will be about 50% utilization. Or if you want to optimise for cost, then you will have 70% resource utilization.
But obviously, the closer you get to 100%, the less efficient your scaling is going to be, because you’re going to reach a performance bottleneck. And obviously, you can choose your own metric and your own target value if you don’t want to follow the recommendations provided by AWS. And there are other options for dynamic scaling. You can disable scale-in, which means you can only scale out and not in. You can specify the cooldown period as well as the warm-up time for the ASG. And the other side of dynamic scaling is predictive scaling. So we know this already. So with predictive scaling, the idea is that the historical load is analysed for you using a machine-learning algorithm. And then a forecast will be generated, and scheduled actions will automatically be taken based on that forecast.
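To see what the three optimisation strategies mean in practice, here is a small sketch. The target percentages are the ones mentioned above; the load figure and the sizing formula (total load divided by the target utilization per instance) are my own simplification for illustration, not how the service computes capacity.

```python
import math

# Target resource utilization, in percent, for each scaling plan strategy.
STRATEGY_TARGETS = {
    "availability": 40.0,
    "balanced": 50.0,
    "cost": 70.0,
}


def target_for(strategy, custom_target=None):
    """Return the target utilization for a strategy, or the custom value."""
    if strategy == "custom":
        if custom_target is None:
            raise ValueError("a custom strategy needs its own target value")
        return float(custom_target)
    return STRATEGY_TARGETS[strategy]


def instances_needed(total_load, strategy, custom_target=None):
    """Simplified sizing: total_load is the sum of CPU percent over all instances."""
    return math.ceil(total_load / target_for(strategy, custom_target))


TOTAL_LOAD = 210  # made-up load: equivalent to 2.1 fully busy instances
for name in ("availability", "balanced", "cost"):
    print(name, instances_needed(TOTAL_LOAD, name))
```

The lower the target, the more spare room each instance keeps, so optimising for availability costs more instances for the same load.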
24. Auto Scaling Hands On
And so let’s go to the Auto Scaling service in AWS to have a look at it. So as you can see, there’s a dedicated console for auto scaling itself. So we can find scalable resources by CloudFormation stack, by tags, or by Auto Scaling groups. So as you can see, we can select the demo ASG right now, and I click on Next. Then we need to specify a name for the scaling plan, so I’ll call it demo scaling strategy. Then what do we want to optimise my ASG for? So you can see that this is an alternative way of configuring scaling: it can be done directly from the Auto Scaling UI. So you can optimise for availability to stabilise at 40%.
You can optimise for cost, or you can go custom. And in custom, you can choose your own metric as well as a target value. Okay, but let’s say, for example, we want to optimise for availability. On top of it, do we want to enable predictive scaling, yes or no? And dynamic scaling, yes or no? Okay, and then it shows the configuration details. So here it chose CPU utilization with a target value of 40%. And when you’re done, you click on Next, and then you can set things up for your demo ASG. Okay? And so if we want to configure some advanced settings, we can expand all of the settings here and change their values if necessary. The same is true for dynamic scaling and predictive scaling. But I’ll click on Next, and then we would review and create the scaling plan, and we’d be good to go. So I’m not going to do it, because it’s going to create the exact same thing that we had before. But as you can see, from within the Auto Scaling UI, you can configure not only the Auto Scaling group: if we had ECS services or DynamoDB tables, and so on, we could find them here as well. And then we could set up a scaling plan that will appear right here in these scaling plans, for you to have a central place to manage them. Okay, so that’s it for this lecture. I hope you liked it, and I will see you in the next lecture.
25. Section Cleanup
Okay, so just to clean up this section, please take your ASG and delete it. So you type “delete” in this field. Then you can take care of your load balancer: in the load balancer section, select it and then delete your demo load balancer. And finally, you could go ahead and delete your target group as well, if you wanted to, and you’re good to go. That’s it for this lecture. I hope you liked it, and I will see you in the next section.