14. Elastic Load Balancer – Monitoring, Troubleshooting, Logging and Tracing
Okay, so let's have a look at the types of errors that can be generated by your load balancer. A successful request returns a 200, which is fairly standard, but if the client made an error, for example if your web browser made a malformed request, you will receive a 4XX error code. Okay, you don't have to remember all of them; I'll give you a few examples, but only as an illustration. So 400 is a bad request, 401 is unauthorized, 403 is forbidden, 460 means the client closed the connection, and 463 means the X-Forwarded-For header was malformed. Okay? And anything that causes an error on the server side, which could be the load balancer itself or your backend EC2 instances, generates a 5XX error code. So 500 means there is an internal server error. 502 is a bad gateway, and 503 is service unavailable, which is when your EC2 instances are not available to send back a reply to the load balancer; that's an important code to know. 504 is a gateway timeout, and 561 is unauthorized. Okay? So, in terms of the exam, keep in mind that 4XX codes are client-side errors, implying a client problem, whereas 5XX codes indicate a server problem. Okay? This is just for you to be able to determine the kind of metrics to look at on your load balancer.
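The 4XX-versus-5XX rule above can be captured in a tiny helper. This is my own illustration, not anything AWS provides, but it encodes the exam rule: 4XX means client problem, 5XX means server problem.

```python
# A minimal helper (my own illustration) capturing the exam rule above:
# 4XX status codes indicate a client-side problem, 5XX a server-side problem.
def classify_status(code: int) -> str:
    if 200 <= code < 300:
        return "success"
    if 400 <= code < 500:
        return "client error"   # e.g. 400 bad request, 403 forbidden
    if 500 <= code < 600:
        return "server error"   # e.g. 502 bad gateway, 503 service unavailable
    return "other"

print(classify_status(200))  # success
print(classify_status(463))  # client error (malformed X-Forwarded-For)
print(classify_status(503))  # server error (no healthy backend instances)
```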
So talking about metrics on your load balancer: there are metrics pushed directly from the ELB into CloudWatch. It could be BackendConnectionErrors, to monitor whether your EC2 instances can be reached; the UnHealthyHostCount and HealthyHostCount, which are super important and are illustrated right here. So say you have six EC2 instances behind your load balancer, for instance, and two of them are down. In this case, the healthy host count is going to be four and the unhealthy host count is going to be two. Okay, just to illustrate it, but it's important to see it once. Next, there is the number of 2XX successful requests in the backend, followed by the 4XXs, which indicate client error codes, and the 5XXs, which indicate server error codes. Okay, then we have latency information: how fast is it to get a response back to the client? Then the request count, which is the overall request count for your load balancer; the request count per target, which is very interesting, which is how many requests each instance receives on average, which is a good metric to monitor and scale on; and finally the surge queue length. These are all important metrics.
So this is the total number of requests that are pending to be routed to healthy instances. Okay? And this can help, for example, scale out your ASG. The maximum value is 1,024, so you don't want to have a big queue of requests; you want to make sure that, of course, this queue remains close to zero. And the spillover count is the number of requests that were rejected because the surge queue was full. Okay? This is something you never want to be above zero, because if it is above zero, you really need to scale your backend to serve these extra requests, and your clients are losing some requests. Now, if you want to troubleshoot using metrics: a 400 bad request means that the client sent a malformed request. A 503 means, again, that there are no healthy instances available for your load balancer, and so you could have a look at the HealthyHostCount metric in CloudWatch. A 504 is a gateway timeout, so check the keep-alive setting on your EC2 instances.
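The surge-queue reasoning above can be sketched as a small decision function. This is my own illustration with a made-up early-warning threshold, not an AWS-defined rule: any spillover at all means requests are already being dropped, and a growing surge queue is a warning to scale out before that happens.

```python
# A minimal sketch (my own illustration; the 100-request threshold is made up)
# of the troubleshooting rule above: any spillover means requests are being
# dropped, and a growing surge queue is an early warning to scale out.
def backend_needs_scaling(surge_queue_length: int, spillover_count: int) -> bool:
    if spillover_count > 0:
        return True  # requests are already being rejected: scale out now
    # Hypothetical early-warning threshold, well below the 1,024 queue maximum.
    return surge_queue_length > 100

print(backend_needs_scaling(surge_queue_length=0, spillover_count=0))    # False
print(backend_needs_scaling(surge_queue_length=500, spillover_count=0))  # True
print(backend_needs_scaling(surge_queue_length=0, spillover_count=3))    # True
```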
Make sure the keep-alive timeout is greater than the load balancer's idle timeout setting. And overall, set alarms on your load balancer metrics and troubleshoot using the documentation; I recommend you take a look at this link just to understand things better, and you'll be good to go. Next, for your load balancer: access logs. Okay, so these are the access logs from the load balancer, and they can be stored in S3, and they will contain all the requests made to your load balancer. This includes metadata such as the client's IP address, the latencies, the request path, the server response, and a trace ID, and you'll only pay for the S3 storage of these logs; sending them to S3 is free. It's very helpful for compliance reasons and for debugging, and it's helpful to keep the access data even after the ELB or your EC2 instances are terminated, and the access logs are encrypted for additional security. Finally, a custom header called X-Amzn-Trace-Id is added to every HTTP request for request tracing. And here's an example. This will be extremely useful in logs or for distributed tracing platforms to track a single request. But just so you know, the ALB is not yet integrated with X-Ray, so you're not going to see that request tracing appear in X-Ray just yet, but I will update this as soon as it is available.
So here in my ALB, I have the option to look at monitoring, which shows me all the kinds of CloudWatch metrics I just mentioned. It shows you the HTTP 4XXs and 5XXs, the ALB 4XXs and 5XXs, and so on, the target connection errors, all of which are very, very interesting, the active connection counts, and so on. So you can have a look at the metrics right here, and you can view them all in CloudWatch. Okay? And what I want to show you is that if you click on your ALB, you can edit some attributes. You can, for example, enable the access logs and send them to an S3 bucket, and you can create that bucket location directly from this UI as well. Okay, so that's it for this lecture. I hope you liked it, and I will see you in the next lecture.
15. Target Group Attributes
So, let's take a look at all of the attributes you can set on a target group, because they're required for the exam. The first one is the deregistration delay, which is a timeout you specify in seconds, corresponding to the time the load balancer waits before deregistering a target. We've seen this one before. Then there is the slow start, which is expressed in seconds, and I'll go over it in the following slide. There is the routing algorithm; there are going to be three, and they'll be covered in the following slides: round robin, least outstanding requests, and flow hash. Then there are the stickiness settings: whether or not stickiness is enabled; the type, which is application-based or duration-based; and then the cookie name in the case of an application cookie, or, for the load balancer cookie, just the duration in seconds, in terms of the expiration period. So now let's have a look at how slow start works and how the different routing algorithms work.
So a slow start is a way for you to send traffic gradually to an instance, because, by default, whenever your target comes online in your target group, it will receive its full share of requests. So it will start getting many, many requests at a time. If you enable slow start mode, it gives a healthy target a bit of time to warm up before the load balancer sends it a full share of requests.
So the idea is that the load balancer starts at zero and linearly increases the number of requests it sends to the target until the slow start mode is over, and then the target gets its full share. So here's an example: if you don't have slow start mode, then all of a sudden, as soon as your instance is part of your target group, it is going to receive a full share of requests, which may overload the instance directly. But with slow start mode, there's going to be a gradual increase. So at first it will receive one request, then it will receive two requests, then three requests, and so on, until the slow start is over, and then the EC2 instance will be at full capacity. A target exits slow start mode when the duration period elapses or when the target becomes unhealthy. And slow start is disabled by default; if you want to disable it again after enabling it, just set the slow start duration value back to zero.
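The linear ramp described above can be sketched in a few lines. This is my own simplified model, not the actual AWS implementation: a target's share of requests grows linearly from zero to its full share over the configured slow start duration.

```python
# A minimal sketch (not the actual AWS implementation) of how slow start
# linearly ramps up a target's share of requests over the configured duration.
def slow_start_share(seconds_since_registration: int, slow_start_duration: int) -> float:
    """Return the fraction of its full request share a target receives.

    A duration of 0 means slow start is disabled, so the target
    immediately receives its full share (1.0).
    """
    if slow_start_duration == 0:
        return 1.0
    # Linear ramp from 0 to 1 over the slow start window.
    return min(seconds_since_registration / slow_start_duration, 1.0)

# With a 120-second slow start, a target registered 30 seconds ago
# receives about a quarter of its full share.
print(slow_start_share(30, 120))   # 0.25
print(slow_start_share(200, 120))  # 1.0 (slow start is over)
print(slow_start_share(50, 0))     # 1.0 (slow start disabled)
```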
Now let's have a look at the routing algorithms for your requests. There is one called least outstanding requests, which means that the next instance to receive a request is the instance that has the lowest number of pending or unfinished requests. So basically, the instance that is currently least busy is the instance that will receive the next request, which makes sense, right? Because if one instance has fewer requests in flight than another, then it probably has more capacity to receive new requests. It's available for the ALB and the CLB. And so the idea is that the first request will go maybe to the first instance, the second request to the second, and then the third request has to go to the third instance, because the third instance right now doesn't have any requests, so it has the least outstanding requests. This is how it works. And then the fourth may go back to the third instance: if the third instance still has the least outstanding requests, maybe because its request was very, very quick to finish, then the fourth one goes again to the third instance, and so on.
Okay? Now, round robin is a bit different. Round robin means that the targets receive the next request one after the other, regardless of how many outstanding requests there are on each instance. This works for the ALB and the CLB. The idea is that request number one will go to instance number one, request number two to instance number two, request number three back to instance number one, and then request number four will again go through the same cycle, and so on. Then, for the NLB, there is flow hash routing, in which a target is selected based on a hash of the protocol, the source and destination IP addresses, the source and destination ports, and the TCP sequence number. That means each TCP or UDP connection is going to be routed to a single target for the life of the connection, which is sort of equivalent to sticky sessions on the network load balancer.
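The two ALB algorithms can be contrasted in a short simulation. This is my own illustration, not AWS code: round robin simply takes turns, while least outstanding requests always picks the target with the fewest in-flight requests.

```python
# A minimal sketch (my own illustration, not AWS code) contrasting the two
# ALB routing algorithms described above.
from itertools import cycle

# Round robin: targets simply take turns, ignoring how busy each one is.
targets = ["i-1", "i-2"]
rr = cycle(targets)
order = [next(rr) for _ in range(4)]
print(order)  # ['i-1', 'i-2', 'i-1', 'i-2']

# Least outstanding requests: pick the target with the fewest in-flight requests.
outstanding = {"i-1": 4, "i-2": 1, "i-3": 0}

def least_outstanding(counts: dict) -> str:
    target = min(counts, key=counts.get)
    counts[target] += 1  # the chosen target now has one more request in flight
    return target

print(least_outstanding(outstanding))  # 'i-3' (zero in-flight requests)
print(least_outstanding(outstanding))  # 'i-2' (tied at 1 with i-3, listed first)
```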
So straightforward: whenever a user makes a request to an EC2 instance, all the information I just listed, the protocol, IPs, ports, and sequence number, is hashed through the flow hash algorithm, and this hash routes every request from the same connection from the same user to the same EC2 instance, as long as the TCP connection is open. Okay? And this is a feature of your NLB. So if you go into your target group, then Actions, then edit attributes, we're going to see the deregistration delay.
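The flow hash idea can be sketched as a toy function. This is an assumption for illustration only: AWS's actual hash function is internal, but the property it provides is the same, namely that the same 5-tuple always maps to the same target.

```python
# A toy flow-hash sketch (an assumption for illustration; AWS's actual hash
# function is internal). The connection's 5-tuple is hashed, and the hash
# deterministically maps the whole connection to one target.
import hashlib

targets = ["i-a", "i-b", "i-c"]

def pick_target(proto, src_ip, src_port, dst_ip, dst_port):
    five_tuple = f"{proto}|{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode()
    digest = hashlib.sha256(five_tuple).digest()
    # Same 5-tuple -> same index -> same target for the connection's lifetime.
    return targets[int.from_bytes(digest[:4], "big") % len(targets)]

a = pick_target("tcp", "1.2.3.4", 50000, "10.0.0.1", 443)
b = pick_target("tcp", "1.2.3.4", 50000, "10.0.0.1", 443)
print(a == b)  # True: the same connection always lands on the same target
```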
We know this one. The slow start duration is disabled by default, but you can set it, for example, to 30 seconds, and you can enable it with a value of up to 900 seconds if you really want a long slow start duration. But if you have this, as you can see, this attribute cannot be combined with the least outstanding requests algorithm, so we'll disable it. And if you do disable it, then the least outstanding requests algorithm becomes available. Okay? Round robin is the default, but we can also use this one, which sends the request to the EC2 instance with the fewest outstanding requests, so the one that is least busy. So that's pretty cool. And then finally, stickiness can be enabled or disabled; we've seen this before as well. So that's it. You've seen all the attributes for your target group. I hope you liked it, and I will see you in the next lecture.
16. ALB Rules – Deep Dive
So a quick lecture is needed to understand ALB rules as a whole. Rules are set up on your application load balancer, and you can have many rules: rule one, rule two, and so on, and the last rule is called the default rule. Each rule has a specific target, okay? Now the rules are going to be processed in order, okay? The default rule is going to be the last one to be processed, and each rule supports different actions. For example, it can forward to a specific target group, it can redirect to another URL, or it can send back a fixed response. Rules can also have conditions, which determine which rule is going to be matched first.
So it could be a condition on the host header, okay, which is part of the client request. It could be the HTTP request method, to see if the request is a GET, POST, or PUT. It could be the path pattern: is the request directed to myapp1 or myapp2? It could be the source IP: where is the request coming from? Or it could be a specific HTTP header in general, or the query string parameters as part of the request. So this really allows you to create some complex routing to different target groups based on certain conditions in the originating request. Another thing you should know is that a single rule on your ALB can have multiple target groups as a target, and the idea is that you can specify weights for each target group within a single rule. The idea behind this is that you want to be able to evolve your backend services from one target group to another and update them. And so you have version one in one target group and version two in another target group, and you want to evaluate whether or not version two makes sense. This is also called a blue/green deployment.
So thanks to this weighting, you can control how much traffic goes from your ALB to a specific target group as part of the rule. So let's have a look at an example. Let's say that we have our users; they talk to the ALB, and the ALB has historically been wired to talk to target group one, okay, which is your blue group. But now you're going to add target group two, which is a new version of your application, and you will set up weights in your rule: a weight of eight for target group one and a weight of two for target group two. That means that now, for a single rule, the ALB will send 80% of the traffic to your first target group and 20% of the traffic to the second one, which allows you to do some monitoring on your target groups, and to see whether or not they are receiving the traffic correctly, whether or not the metrics are good, and whether or not your new application version is behaving as expected. Okay, so this is a cool feature as well. So that's it for this lecture. I hope you liked it, and I will see you in the next lecture.
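The 8/2 split above can be simulated to see the roughly 80/20 traffic distribution emerge. This is my own illustration with made-up target group names, not how the ALB internally implements weighting.

```python
# A minimal sketch (my own illustration; target group names are made up) of how
# the 8/2 weighting above splits traffic roughly 80% / 20% between two groups.
import random

weights = {"tg-blue-v1": 8, "tg-green-v2": 2}

def route(rng: random.Random) -> str:
    groups = list(weights)
    return rng.choices(groups, weights=[weights[g] for g in groups])[0]

rng = random.Random(42)  # seeded so the simulation is reproducible
hits = {"tg-blue-v1": 0, "tg-green-v2": 0}
for _ in range(10_000):
    hits[route(rng)] += 1

share_blue = hits["tg-blue-v1"] / 10_000
print(round(share_blue, 2))  # close to 0.8
```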
17. [SAA/DVA] Auto Scaling Groups (ASG) Overview
Now we're getting into the concept of auto scaling groups. So in real life, basically, your websites and applications will change and have different loads: the more users you have, the more popular you're going to be, and the more load you're going to have. In the cloud, as we've seen, we can create and get rid of servers very quickly, and so if there's one thing that the auto scaling group does very well, it's scale out, which means adding EC2 instances to match an increase in load, but also scaling in, which means removing EC2 instances to match a decrease in load.
And then finally, we can ensure that the number of EC2 instances only grows by a certain amount or decreases by a certain amount, and this is where we can define a minimum and a maximum number of machines running in an ASG. Finally, a super cool ASG feature is to automatically register new instances with a load balancer. So in the previous lecture, we registered instances manually, but obviously there's always some kind of automation we can do in AWS. What does it look like on a diagram? Well, here is our beautiful auto scaling group; it's a big arrow. So, for example, the minimum size is one, which is the number of instances that will always be running in this auto scaling group. The actual size, or desired capacity, parameter is the number of instances running in your ASG at the present moment.
And then you have the maximum size, which is how many instances can be added to scale out if needed when the load goes up. So that’s very helpful. What you need to know about are the minimum size, desired capacity, and maximum size parameters because they will come up very often. Also note that scaling out means adding instances, and scaling in means removing instances.
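The relationship between the three size parameters can be captured in one line of logic. This is my own illustration of the invariant: the number of running instances always stays between the minimum and maximum sizes.

```python
# A minimal sketch (my own illustration) of the relationship between the three
# ASG size parameters: the capacity is always clamped to [min_size, max_size].
def effective_capacity(desired: int, min_size: int, max_size: int) -> int:
    """The ASG never runs fewer than min_size or more than max_size instances."""
    return max(min_size, min(desired, max_size))

print(effective_capacity(desired=5, min_size=1, max_size=4))  # 4 (capped at max)
print(effective_capacity(desired=0, min_size=1, max_size=4))  # 1 (raised to min)
print(effective_capacity(desired=3, min_size=1, max_size=4))  # 3
```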
So now, how does it look with a load balancer? Well, here is our load balancer, and web traffic goes straight through it, and we have an auto scaling group at the bottom. Basically, the load balancer will know directly how to connect to these ASG instances, so it will direct the traffic to these three instances. But if our auto scaling group scales out and we add two instances, then the load balancer will also register these targets, obviously perform health checks on them, and directly route traffic to them. So load balancers and auto scaling groups really work hand in hand in AWS. Now, ASGs have the following attributes. First, a launch configuration, and we'll be creating one during the hands-on in the next lecture. The launch configuration has an AMI and an instance type, possibly EC2 user data, EBS volumes, security groups, and an SSH key pair.
And, as you can see, this is exactly what we've been doing since the first time we launched an instance manually; obviously, they're very close. You also set the minimum size, the maximum size, and the initial capacity, which is the desired capacity. We can define the network and the subnets in which our ASG will be able to create instances, and we define the load balancer information or target group information, depending on which load balancer we use. Finally, when we create an ASG, as we'll see, we'll be able to define scaling policies: what will trigger a scale out, and what will trigger a scale in? So we are getting to the auto-scaling part of auto scaling, which is the alarms. Basically, it's possible to scale your auto scaling group based on CloudWatch alarms. And we haven't seen yet what CloudWatch is, but as I said, AWS is kind of a spaghetti ball, so don't worry, follow me. A CloudWatch alarm is something that monitors a metric, and when the alarm goes off, so when the metric goes up, you'll say, "Okay, you should scale out, you should add instances," and then when the metric goes back down, or there's another alarm saying it's too low, we can scale in. So basically, the ASG will scale based on the alarms, and the alarms can be anything you want to measure, such as the average CPU, and the metrics are computed as an overall average.
Okay, it doesn't look at the minimum or the maximum; it looks at the average of these metrics across your instances to trigger the alarm. Basically, we can create scale-out policies and scale-in policies, as I said. We'll be seeing these rules, as well as the new auto scaling rules, firsthand. But now you can basically say, "Okay, I want to have a target average CPU usage in my auto scaling group," and basically it will scale in and scale out based on your load to meet that target CPU usage. You can also have a rule based on the number of requests on the ELB per instance, or the average network in, or the average network out. So, whatever you believe is the best scaling policy for your application, you can use these rules, because they are easier to set up and make more sense to reason about. For example, you could say, "Okay, I want 1,000 requests per instance on my ELB." That's easy to reason about. Or, "I want my CPU usage to be at 40% on average."
You can also auto scale based on a custom metric, and we can basically define a custom metric, such as the number of connected users to our application. To do this, we'll compute that custom metric in our application, send it to CloudWatch using the PutMetricData API, and then create a CloudWatch alarm to react to low or high values of that metric, and these alarms will trigger the scaling policies for the ASG. What you should know is that the auto scaling group is not limited to the metrics exposed by AWS; it can scale on any metric you want, including custom metrics. So here's a small brain dump of things to know for your ASG. First of all, you can have scaling policies for your ASG, and they can be anything you want: it could be CPU, network, a custom metric you define, or even based on a schedule. If you know in advance how your visitors are going to visit your website, for example, and you know they're logging in very early at 9:00 a.m., maybe you can be proactive and add more instances before the users arrive, so they have a better experience.
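The custom-metric idea above can be sketched as a pair of alarms driving a scaling decision. This is my own illustration; the metric name and thresholds are made up, and in reality the metric would be pushed to CloudWatch with PutMetricData and the alarms configured there.

```python
# A minimal sketch (my own illustration; the metric and thresholds are made up)
# of how high/low alarms on a custom metric, such as connected users,
# could drive an ASG scaling decision.
def scaling_decision(connected_users: float, high: float = 500, low: float = 100) -> str:
    """Return the action a pair of high/low alarms would trigger."""
    if connected_users > high:
        return "scale_out"   # high alarm fires -> add instances
    if connected_users < low:
        return "scale_in"    # low alarm fires -> remove instances
    return "no_action"       # metric within bounds

print(scaling_decision(800))  # scale_out
print(scaling_decision(50))   # scale_in
print(scaling_decision(250))  # no_action
```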
Also, ASGs can use launch configurations or launch templates, and launch templates are the newer version of launch configurations; they're the recommended option moving forward. And if you want to update an auto scaling group, what you need to do is provide a new version of that launch configuration or launch template, and then your underlying EC2 instances can be replaced over time. If you attach an IAM role to your auto scaling group, that means the IAM role will automatically be assigned to the EC2 instances that it launches, okay? The auto scaling group itself is free to use; the only thing you're going to pay for is the underlying resources being launched, which means your EC2 instances, the attached EBS volumes, and so on. If you have an instance under an ASG, the beautiful thing is that if somehow the instance gets terminated, the ASG will realise that your instance has been terminated and will automatically create a new instance as a replacement. And that is the whole purpose of having an ASG: it's that extra safety it gives you, knowing that your instances will be automatically recreated as new ones in case something goes wrong. And when would an instance be terminated? Well, instances can be terminated, for example, if they're marked unhealthy by a load balancer, and the ASG is aware of it: okay, your load balancer thinks that your instance is unhealthy, so the better thing to do is to terminate that instance and replace it by creating a new one. Okay? So keep in mind that an ASG automatically creates new instances; it doesn't restart or stop your instances, it just terminates them and creates new ones as replacements. So that's it. I hope you liked it, and I will see you in the next lecture.
18. [SAA/DVA] Auto Scaling Groups Hands On
Okay, so let's practice using auto scaling groups. To do so, first of all, please make sure to terminate all your instances, so that you have zero instances running in EC2. Okay, so next we have to create an auto scaling group. Let's go into Auto Scaling Groups on the left-hand side, and we're going to create our first auto scaling group. So, create an auto scaling group, and I'll call this one DemoASG. Then we have to select a launch template, so let's create a launch template right here; this will allow us to specify the options used to create our EC2 instances, and I'll name it my demo template. Okay, so I will scroll down.
We need to select an AMI, so let's select Amazon Linux 2, and I will choose the x86 type of architecture, so the first one in my list. Then the instance type: we want a t2.micro, which is free-tier eligible, and you can find it by just typing it in the search box right here. Then we need to select a key pair; for this, I'll use the same one as in the EC2 tutorial. For network settings, we launch it in my VPC, and for security groups, I will attach the launch-wizard-1 security group to my EC2 instances. Then for storage, I will have a volume of 8 GB, just like before. And then, for the advanced details in here, I will scroll down, and at the very bottom I have user data, and I just need to pass in the user data that I had from before. So paste it, and we're good to go. So let's create this launch template. Now it's been created, so I return to my management console; I can just close this, refresh it, and select my demo template, version one, to create my ASG.
So everything looks good; let me click on next. Now, in terms of the instance purchase options, we have two options: either we adhere to what was defined in the launch template, which is either spot or on-demand instances, or we combine purchase options and instance types. And the cool thing about this second option is that we can have a baseline of on-demand instances and then some surge capacity of spot instances, to be flexible yet cost-optimized. But to keep things simple, I'm going to adhere to the launch template. I will use my VPC, and I will launch in three different availability zones. Okay, let's click on next. Next, we need to select load balancing for our ASG. This is optional; we can have an ASG without a load balancer. But because we want our EC2 instances linked to our load balancer, we need to attach to an existing load balancer. In this case, this is going to be an application load balancer, and I need to select the relevant target group. We only have one left, my first target group over HTTP, and it is linked to the application load balancer DemoALB. Okay, now, for health checks, there are two kinds of health checks that an ASG can do. The first one is EC2-based, to make sure the EC2 instance does not have any software or hardware failures.
So this is enabled by default, and we have to keep it on. But we can also enable an ELB-based health check, and this is because we have specified an ELB here. So, if the instance fails the ELB's health check, it will be marked as unhealthy in the auto scaling group, it will be terminated by the auto scaling group, and a new instance will be launched. This is to ensure that all the instances in your auto scaling group are healthy no matter what. So we'll keep this on, and then I will click on next. For the group size, we have one desired, one minimum, and one maximum; we will change those later on to show you the scaling in the auto scaling group. And for the scaling policies, for the time being, we will choose none.
But in the next lecture we will set up some scaling policies, okay? And we will disable the instance scale-in protection. Now let's click on next. There's no need for notifications for now, no need for tags, and we look good to go. So let me scroll down and create my auto scaling group. Okay, so my auto scaling group is created, and what it's going to do is create one EC2 instance for me. The idea here is that if I click on my auto scaling group, we get some details: we can see the desired, minimum, and maximum capacity, as well as the launch template that was used for this ASG. And then, if I go into Activity, this is where the interesting stuff happens: when the ASG decides to create an EC2 instance, it's going to appear in the activity history. So if I refresh the activity history, I can see that there is a new instance being launched right here, because the auto scaling group had a capacity of zero and we wanted a capacity of one, so it's just increasing the number of EC2 instances in our ASG to match the desired capacity. If I go to the Instance management tab, I can see that one EC2 instance was created by the ASG, and if we go to the instances page right here, we can see that the EC2 instance is currently running and initializing. So it's pretty cool.
Now, the cool thing is that because we set up our ASG to be linked to our target group, if we go to Target Groups on the left-hand side, find my first target group, and go to Targets, we can see my EC2 instance being registered into my ALB. Right now it is shown as unhealthy, because the health checks are failing while the EC2 instance is still bootstrapping, but if, after a bit, they pass, it will be shown as healthy and be registered in my ALB. So let me pause the video to wait for that. Okay, so my instance is now healthy, and if I go to my ALB and refresh, I get the Hello World response. That means that everything is functioning: the instance was created by the ASG, we can see it in the EC2 console, it was registered in the target group, which is linked to our load balancer, and therefore our load balancer is fully working. Now, if the instance never gets healthy, then because we set up our ASG to terminate unhealthy instances, what you're going to see is that the instance is going to be terminated and a new one will be created.
You will see this mostly in the activity history. If you do see this, the reason it is happening is that your EC2 instance is misconfigured: it could be either a security group issue or an EC2 user data script issue, so please check those before asking questions in the Q&A. Okay? These are the common mistakes. So now we have one instance, and it's in service, and what we can do is experience the scaling by editing the auto scaling group size. So we want two desired instances, and so we have to increase the maximum capacity as well. We'll update this, and now what we told the ASG is that we want it to create one more EC2 instance for us. So if you go to the activity history and refresh it, very soon it's going to show a new activity: it will be launching a new EC2 instance, because we have changed the desired capacity from one to two, and the actual number of instances in our ASG was only one.
So now, as you can see, we have a second EC2 instance being created, and after a while it's going to be registered into our target group. So we'll pause the video for this process to happen. Okay, so my instance is now healthy, and if I go to my ALB and refresh, as you can see, I have my two IPs that my ALB is looping over, so everything is working great. One last thing: if we go to Activity, as we can see, the two EC2 instances were launched successfully. Let us now examine the inverse: we're going to scale down our capacity by putting the desired capacity back to one and updating it. And what this will do is say, "Hey, it looks like you have two instances, but now you only need one, so I'm going to pick one of these two instances right here and terminate it." So as you can see here, it says it is terminating one of these instances because of the change in the configuration of the ASG, and it's going to terminate the instance and then deregister it from the target group. Then we'll be back to only one EC2 instance in our ASG. So that's it for this lecture; you've seen the whole power of ASGs. I will see you in the next lecture to talk about automatic scaling.
19. [SAA/DVA] Auto Scaling Groups – Scaling Policies
So now let's talk about auto scaling group scaling policies, and we have two different kinds. We have the dynamic scaling policies first, and within the dynamic scaling policies, we have three kinds. We have target tracking scaling, which is pretty easy: it's the most simple and easy to set up. The idea is that you want to say something like, "I want to keep the average CPU utilisation of my auto scaling group, across all of my EC2 instances, at around 40%." This is for when you want a default baseline and want to make sure capacity is always available. Step scaling is more involved. You set your own CloudWatch alarms, and you define what happens when they go off. So for example, when the CPU goes over 70% for your ASG as a whole, add two units of capacity, and then you would set up a second rule saying, "Hey, in case the CPU utilisation goes below 30% for my ASG as a whole, remove one unit."
But you would have to set up your CloudWatch alarms yourself, as well as the steps themselves: how many units you want to add at a time, and how many units you want to remove at a time. And finally, scheduled actions, which anticipate scaling based on known usage patterns. For example, you're saying, "Hey, I know that there's going to be a big event at 5:00 p.m. on Fridays, because people are going to be done with work and they're going to use my application," so you want to automatically increase the minimum capacity of your ASG to ten at 5:00 p.m. every single Friday. This is a scheduled action, where the scaling is known in advance. And there's a new kind of scaling called predictive scaling. With predictive scaling, a forecast is continually being made by the auto scaling service in AWS, and it will look at the load and schedule scaling ahead of time.
So what will happen is that the historical load is analyzed over time, and then a forecast is created, and based on that forecast, scaling actions are scheduled ahead of time, which is quite a cool way of doing scaling as well. And I think this is the future, because this is machine-learning powered, and it really is a hands-off approach to automatic scaling for your ASG. So, what are some good metrics to scale on? It really depends on what your application is doing and how it's working, but usually there are a few. Number one is CPU utilisation, because every time your instances receive a request, they will usually do some sort of computation, and so they will use some CPU. So if you look at the average CPU utilisation across all your instances and it goes higher, that means your instances are being more utilized, and so it is a good metric to scale on.
The next one is more application-specific: it's the request count per target. Based on your own testing, you know that your EC2 instances run optimally at, say, 1,000 requests per target at any given time, and so maybe this is the target you want to have for your scaling. Here's an example: you have an auto scaling group with three EC2 instances, and your ALB is currently spreading the requests across all of them. The value of the RequestCountPerTarget metric is three right now, because each EC2 instance has three outstanding requests on average. Next, if your application is network-bound, with many uploads and downloads, and you know that the network will be the bottleneck for your EC2 instances, then you might want to scale on the average network in or out, to make sure that if you reach a certain threshold, you'll scale based on that. Or you can scale on any custom metric you send to CloudWatch.
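The per-target arithmetic above can be made concrete. This is my own illustration of the idea behind RequestCountPerTarget, not the exact CloudWatch computation: spread the requests evenly over healthy targets, and size the fleet against the per-target level your testing found optimal.

```python
# A minimal sketch (my own illustration) of the RequestCountPerTarget idea:
# divide requests evenly across healthy targets, and compare against the
# per-target level you found optimal in your own testing.
import math

def request_count_per_target(total_requests: int, healthy_targets: int) -> float:
    return total_requests / healthy_targets

# The example above: 9 requests spread over 3 instances -> 3 per target.
print(request_count_per_target(9, 3))  # 3.0

# With a target of 1,000 requests per instance, 4,500 in-flight requests
# would call for at least 5 instances.
desired = math.ceil(4500 / 1000)
print(desired)  # 5
```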
So you can set up your own metrics that are application-specific, and based on those, you can set up your scaling policies. Now, one last thing you need to know about scaling policies is what's called the scaling cooldown. The idea is that whenever there is a scaling activity, such as adding or removing instances, you enter a cooldown period, which is set to 300 seconds, or five minutes, by default, and during that cooldown period, the ASG will not launch or terminate additional instances. The reasoning behind this is that you allow the metrics to stabilise after your new instances come into effect, so you can see what the new metric values will be. So the idea is that when a scaling action occurs, the question is: is the default cooldown in effect? If yes, then ignore the action. If not, then proceed with the scaling action, which is to launch or terminate instances. So, my recommendation is to use a ready-to-use AMI to reduce the configuration time for your EC2 instances, allowing them to serve requests faster. Okay? If you don't spend time configuring your EC2 instances, they can be in effect right away, and because they can be active much faster, the cooldown period can be decreased, and you can have more dynamic scaling up and down of your ASG. And, of course, you should enable detailed monitoring for your ASG in order to get metrics every 1 minute and ensure your scaling reacts as quickly as possible. So that's it for this lecture. I hope you liked it, and I will see you in the next lecture.
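The cooldown decision flow above, ignore the action if a cooldown is in effect, otherwise proceed, can be sketched as a small state machine. This is my own illustration of the logic, not AWS code.

```python
# A minimal sketch (my own illustration) of the scaling cooldown decision flow
# described above: a scaling action is ignored while a cooldown is in effect.
COOLDOWN_SECONDS = 300  # the default ASG cooldown

class Cooldown:
    def __init__(self):
        self.last_action_at = None

    def try_scale(self, now: float) -> bool:
        """Return True if the scaling action proceeds, False if ignored."""
        if self.last_action_at is not None and now - self.last_action_at < COOLDOWN_SECONDS:
            return False  # cooldown in effect -> ignore the action
        self.last_action_at = now  # action proceeds and starts a new cooldown
        return True

cd = Cooldown()
print(cd.try_scale(0))    # True: first action proceeds
print(cd.try_scale(120))  # False: still inside the 300-second cooldown
print(cd.try_scale(400))  # True: cooldown has elapsed
```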