1. Live Migration
We’ve studied VMs in a fair bit of depth, but there’s still more we need to know about them to understand GCP’s platform completely. In this section, we’ll study some advanced features of VMs that we haven’t seen before, and we’ll also look at images. Images are operating system images that you can use to create boot disks for your instances. The Google Cloud Platform allows you to create a number of different image types that you can use to create new VMs. You’re already intimately familiar with VMs; in this module, we’ll cover some features we haven’t looked at before, such as live migration.
Live migration is an extremely useful feature that VMs on GCP offer, which allows your instance to stay up and running even in the case of a maintenance event. If the source VM is scheduled for maintenance, its data and state are migrated to another instance, which picks up where the first one left off. Google’s Compute Engine also offers you rightsizing recommendations. This is a brand new feature on the Google Cloud Platform. Often when you instantiate a VM, you choose a certain number of vCPUs and a certain amount of memory. Is it really appropriate for your workload? You never know till you run it in production.
Rightsizing recommendations are sizing recommendations for your machine types which you get after your application has been live in production for a while. We’ll also spend some time looking at the various machine types that you can instantiate and how Google bills us for them. Google offers extremely good sustained use and committed use discounts which will help reduce your cloud bill; we’ll see how to take the best advantage of these. And finally, we’ll look at images, which help you instantiate new VMs with the operating system and the applications of your choice. Images with your applications baked in are a very good way to get up and running with a new VM instance configured exactly how you want it to be.
When we studied VMs earlier, we spoke about the three compute choices that you can make when you host your application on the Google Cloud Platform. The first of these is the Compute Engine. This is infrastructure as a service, where you have access to the raw virtual machines and you can install any application or software that you want. The configuration of the virtual machine, and how exactly you want the back end to be set up, is completely in your hands. That is the Compute Engine. The second compute choice that you have is the Google Kubernetes Engine, or the Container Engine.
This provides a little bit of abstraction from raw virtual machines, in that you use the Kubernetes orchestration framework to set up containers. Your application and its dependencies are packaged into groups of one or more containers called pods, and it is these pods that are orchestrated and installed on a Kubernetes cluster. The third compute choice is platform as a service, where you are completely abstracted from the concept of virtual machines. App Engine allows you to just write code and deploy your application; the load balancing, scaling, and actually running this application on VMs are taken care of by the cloud platform. In this section we’ll look at the Compute Engine once again, that is, the raw VMs, and look at some advanced features that we hadn’t considered earlier.
We’ll start off with live migration. Live migration refers to the process of moving a running virtual machine or application between different physical machines without actually disconnecting the client or application. The memory, storage and network connectivity of the virtual machine are transferred from the original machine to the destination machine. The Google Cloud Platform offers live migration services for your virtual machines, which helps you keep your VM instances running even during a hardware or software update. The Compute Engine is capable of live migrating your running instances to another host in the same zone rather than requiring your VMs to be rebooted.
Instances in Google’s data centers often require maintenance or upgrade, which can be a software update or some kind of hardware maintenance. You don’t want your service to be interrupted when this occurs, which is why you would enable live migration. Examples of maintenance events that could cause your host machine to go down are infrastructure maintenance and upgrades, network and power grid maintenance, failed hardware, host and BIOS updates, security changes, and so on. In any of these instances, live migration will ensure that your VM instance continues to serve requests by moving it to another host which is not subject to this maintenance.
When a maintenance event is about to occur, the virtual machine typically gets a notification that it needs to be evicted. All the applications and running state on that machine need to be moved. Compute Engine will choose a new VM, the target VM, in the same zone where the host VM lives. This target VM is not subject to maintenance; it’s ready for the operating system and other applications to be deployed on it, and it’s ready to receive requests. A connection then needs to be established between the host VM that is being evicted and the target VM, which is about to receive all the current state of the host.
This current state comprises the memory, the persistent disks, the applications that are running, network connectivity, and so on, which is why the connection between the host and the target VM needs to be authenticated first. Live migration occurs in three phases. The first of these phases is called the pre-migration brownout. Here, the VM is still executing on the source while most of the state is sent from the source to the target. For example, Google will copy all the guest memory to the target while tracking the pages that have been changed on the source. There is no fixed amount of time spent in the pre-migration brownout; it depends on how much memory has to be copied, the rate at which pages are being changed, and so on.
The next phase is a very brief interval of blackout. This is when neither the source VM nor the target VM is running. Requests sent during the blackout will fail, but this is a very, very short interval of time. The VM enters the blackout stage when sending state during the pre-migration brownout reaches a point of diminishing returns. After this brief blackout, we enter the post-migration brownout phase. Here the VM is finally running on the target machine. The source is still present and up, though, and it might offer support if needed. For example, if the routing tables and other network functionality haven’t yet caught up to the fact that the target VM is where requests have to be sent, the source VM might forward all the packets that it receives to the target VM till networking is updated.
Here is a block diagram, along with a timeline, of how live migration might occur from a host to a target VM. This is a complicated-looking diagram; don’t worry, we’ll look at it in parts and see what exactly is going on. At the very top are blocks which represent the persistent disk attached to the source VM. This persistent disk is referred to as PD in the timeline diagram, and it has to be migrated so that the target VM has all the data it needs to run the applications. The second block refers to cloud networking functionality. The host VM is connected to a certain network; the target VM will be connected to the same network.
Now that we know what those blocks stand for, let’s zoom in and look at the timeline for live migration. The host VM components which have to be migrated, including networking functionality, memory, and local SSD, are represented at the top, and the target VM is at the bottom. The timeline at the very bottom represents the various phases during live migration. The first phase is the source brownout, where the VM is still running on the source host but its state is being transferred over to the target host. The second phase is a very brief blackout, where the host VM has been paused and the target VM isn’t yet up and running. And the third phase is the target brownout, where the VM is now running on the target host.
But the source host is still up and running, ready to provide support when it’s needed. An authenticated connection has been established between the source VM and the target VM, and the first thing to be switched off is the networking of the source VM; it’s no longer connected to the network. Next, all the state within the local SSD, the local persistent disk, is transferred from the source VM to the target VM. At some point along the way, RAM from the source VM is copied over to the target VM, and any data stored in persistent disks is also copied over. Next to be copied is any static VM state, such as configuration parameters.
And then there is a memory post-copy phase, where the VMs ensure that all the memory has been copied over, and finally the target VM is attached to the network. It’s now ready to receive requests. Live migration does not apply to all instances on the Google Cloud Platform. Instances with GPUs cannot be live migrated; typically, an eviction notice is sent to the VM 60 minutes before the maintenance event, and that’s when you can tear down that VM and set up a new one with a GPU. We’ve seen from the timeline diagram that live migration supports local SSDs: their contents will be copied over to the new target host. But preemptible instances cannot be live migrated; the only action that can be taken on a preemptible instance is termination.
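The eligibility rules above can be condensed into a small decision function. This is just a sketch of the lecture’s rules; `maintenance_action` is a hypothetical helper, not a GCP API:

```python
def maintenance_action(has_gpu: bool, preemptible: bool) -> str:
    """Hypothetical helper: what happens to an instance when its host
    needs maintenance, per the rules described above.

    - Preemptible instances cannot be live migrated; they are terminated.
    - GPU instances cannot be live migrated either; they get a 60-minute
      eviction notice so you can tear them down and recreate them.
    - Everything else (including instances with local SSDs) is live
      migrated to another host in the same zone.
    """
    if preemptible:
        return "terminate"
    if has_gpu:
        return "terminate-with-60-min-notice"
    return "live-migrate"

print(maintenance_action(has_gpu=False, preemptible=False))  # live-migrate
print(maintenance_action(has_gpu=True, preemptible=False))   # terminate-with-60-min-notice
```

Note that preemptibility takes precedence: a preemptible instance with a GPU is still simply terminated.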
2. Machine Types and Billing
In this video we’ll study the various kinds of machines that the Google Cloud Platform has to offer and how they are priced. We’ll also see the discounts that GCP offers and how we can take the best advantage of them. The Google Cloud Platform offers a free usage tier which is enough for you to keep a very simple website up and running permanently without having to pay anything. Within this free tier, we get one f1-micro VM instance per month. You can use this instance in any of the US regions except Northern Virginia. An f1-micro machine has 0.2 virtual CPUs and 0.6 GB of memory, and the maximum number of persistent disks that you can attach to it is four.
This free tier also includes 30 GB of standard persistent disk storage per month, absolutely free. Snapshotting is a way for you to create a backup of your persistent disks; the free tier offers 5 GB of snapshot storage per month. Some egress traffic is free as well: 1 GB of egress per month from North America to all other destinations (excluding Australia and China) costs nothing. With these resources, you should be able to run a simple website with low usage absolutely free all year round. Machine types on GCP are divided into two broad categories. The first is the predefined machine types, which are a set of standard machines with a fixed number of vCPUs and associated memory.
These are standard instances which are available across regions. However, if you find that your application workloads are such that a standard instance does not suit you, then you might want to choose a custom machine type, where you can specify the number of vCPUs and the amount of memory that you want for your machine. The billing model for the Google Cloud Platform when you instantiate VMs is per-second billing, but all machine types are charged for a minimum of one minute. After that minute is over, you are only charged for every incremental second that you have the VM running. Amongst the larger cloud providers, the Google Cloud Platform was among the first to introduce per-second billing.
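The one-minute minimum can be expressed as a couple of lines of arithmetic. The hourly rate below is purely illustrative, not an actual GCP price:

```python
HOURLY_RATE = 0.05  # USD per hour; an illustrative number, not a real price

def billed_seconds(runtime_seconds: int) -> int:
    """Per-second billing with a one-minute minimum: anything under 60
    seconds is billed as 60 seconds; beyond that, every second counts."""
    return max(runtime_seconds, 60)

def cost(runtime_seconds: int) -> float:
    """Cost in USD for a single run of the VM."""
    return billed_seconds(runtime_seconds) * HOURLY_RATE / 3600

print(billed_seconds(45))  # 60 -- a 45-second run is billed as a full minute
print(billed_seconds(90))  # 90 -- after the first minute, billing is per second
```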
Per-second billing has since become the standard across AWS, Azure, and other cloud platforms. Let’s consider predefined instances. Here on the screen you can see the list of predefined machine types that are available for you to choose on GCP. This list is specific to Iowa, so if you use the dropdown, you can pick a different region and see what kinds of machines are available there. You can also use the slider to choose whether you want the price for these instances on an hourly or a monthly basis; I’ve set it to hourly here as an example. These prices are inclusive of sustained use discounts. Sustained use discounts are those that come into the picture when you use an instance for longer durations within a month.
So if you use an instance for 25% of the month, you get a certain discount; if you use it for an entire month, you get more of a discount. Each of these standard machines comes with a certain number of vCPUs. If you want a more powerful machine with more processing capacity, you’ll choose a machine with a higher number of virtual CPUs; it goes from one all the way up to 96. These standard machines have a fixed CPU-to-memory ratio of one vCPU to 3.75 GB of memory, so as the number of vCPUs increases, the memory also increases. The price listed is what you’ll pay per hour to instantiate and use these machines as regular instances. If you instantiate these machine types as preemptible instances, you can see that the price is correspondingly lower.
A preemptible instance is only about 20% of the price of a regular instance. Every Google data center might have different kinds of machines on slightly different platforms. This shouldn’t affect your work as such, but certain configurations, such as the n1-standard-96 configuration which is in beta, are available at this point in time only on the Skylake platform. You don’t always have to use predefined machine types. If you find that your workload is such that the ideal machine falls between two predefined types, that is when you choose to use a custom machine type. A custom machine type allows you to specify exactly the number of vCPUs and exactly the amount of memory that your application requires, and in the long run it might prove to be more cost effective.
GCP also offers you the choice of using shared-core machines. If you have an application which does not require a lot of intensive resources, there is no reason for you to instantiate an instance with a lot of vCPUs and a lot of memory; you can simply use a shared-core machine. Shared-core machines are meant for small, non-resource-intensive applications. They may not offer high processing power or high memory, but they do have a capability called bursting. An f1-micro machine type offers bursting capabilities that allow your instances to use additional physical CPUs for very short periods of time.
So you have low processing capability most of the time, but when your application really needs it, when there is a spike in the processing that you need, this machine will allow you to burst and use extra CPUs. On a shared-core machine, this bursting happens automatically; there is no additional configuration that you need to do to enable it. If there is a surge in the number of requests to your shared-core machine and you require additional CPU capacity, the instance will automatically appropriate more CPU resources to accommodate the additional processing needed. A caveat here is that these additional CPU resources cannot be used for a sustained period of time; bursts are not permanent.
They are only possible periodically. It’s possible that your workload is memory intensive rather than processing intensive. In that case, the Google Cloud Platform offers you high-memory machines. These machines offer more memory per vCPU compared with the regular machine types that we saw earlier. To determine whether a high-memory machine is right for you, you have to evaluate your workload and see whether it’s processing heavy or memory heavy. If it’s memory heavy, you might benefit from using these high-memory machines. The regular machine types offer 3.75 GB of memory per vCPU; the high-memory machines offer 6.5 GB of RAM per core.
Here is a table of the high-memory machines offered in Iowa; the rates are hourly, as before. These high-memory machines are prefixed with n1-highmem, and the suffix specifies the number of cores present in the machine. n1-highmem-2 has two virtual CPUs and 13 GB of memory. You can see the memory increases as the vCPUs increase, and there is a lot more memory on offer per vCPU. Analogous to high-memory machines, Google also offers high-CPU machines, which have a lot more processing power per unit of memory. If your workloads are more processing heavy, you might want to consider using a high-CPU machine as opposed to a regular machine. High-CPU machines are useful for processing-intensive tasks.
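The fixed memory-to-vCPU ratios quoted above make the machine sizes easy to check with a little arithmetic (a sketch; the family names are just dictionary keys here):

```python
# GB of memory per vCPU for the two families discussed above.
GB_PER_VCPU = {"n1-standard": 3.75, "n1-highmem": 6.5}

def machine_memory(family: str, vcpus: int) -> float:
    """Memory (in GB) implied by a family's fixed memory-to-vCPU ratio."""
    return GB_PER_VCPU[family] * vcpus

print(machine_memory("n1-highmem", 2))   # 13.0 -- matches n1-highmem-2's 13 GB
print(machine_memory("n1-standard", 4))  # 15.0
```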
Here are the high-CPU machine types that Google has on offer in the Iowa region; the prices are once again hourly. All the machines are prefixed with n1-highcpu, and the 2, 4, 8, 16, et cetera refer to the number of vCPUs. Notice that we get very little memory per processor; these machines are for processing-intensive applications which use relatively little memory. Now, it’s entirely possible that you’ve examined your workload over a couple of days and none of the standard predefined machine types fits it perfectly. This is when you would choose to use a custom machine type. Let’s say you started off with a standard machine type with eight vCPUs and you realize that your utilization is very low.
You’re only using six of your vCPUs. That’s when you set up a custom machine type with six vCPUs and the amount of memory that you need. Using custom machine types can help you save on costs, because you’ll save the cost of running on a machine which is much larger or more powerful than what you need. There is no fixed billing for custom machine types because there are no standard configurations; you set up a configuration as you see fit, but you are billed based on the number of vCPUs that your machine has and the amount of memory that you’ve allocated for your custom machine.
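Because custom machines are billed on vCPUs and memory as separate line items, the hourly cost is a simple sum. The unit rates below are placeholders, not actual GCP prices:

```python
# Placeholder per-hour unit prices (not real GCP rates).
VCPU_RATE = 0.033     # USD per vCPU-hour
MEMORY_RATE = 0.0045  # USD per GB-hour

def custom_machine_hourly_cost(vcpus: int, memory_gb: float) -> float:
    """Hourly cost of a custom machine: vCPUs and memory billed separately."""
    return vcpus * VCPU_RATE + memory_gb * MEMORY_RATE

# The 6-vCPU machine from the example above, with, say, 24 GB of memory:
print(round(custom_machine_hourly_cost(6, 24), 4))  # 0.306
```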
3. Sustained Use and Committed Use Discounts
In addition to the free usage tier that we saw in the last lecture, Google offers two additional kinds of discounts when you use their virtual machine instances. These are called sustained use and committed use discounts. Sustained use discounts come into the picture when you instantiate and use a VM for a longer period of time. If you use a VM for 25% of the month, a sustained use discount kicks in; we’ll see exactly what shortly. As you use it for a longer and longer portion of a month, you get additional discounts. Google Compute Engine also offers the ability to purchase a committed use contract in return for a deeply discounted price for VM usage. These are committed use discounts.
You can purchase a committed use contract by creating something called a commitment with Google. You can make a one-year or a three-year commitment; obviously, the prices for your instance usage are lower when you make a three-year commitment. You commit to the entire usage term, and you are billed for each month regardless of whether or not usage occurred, so that’s something to watch out for. That is all we’ll say about committed use discounts in this particular lecture; we’ll look at sustained use discounts in some more detail. Sustained use discounts are the discounts that you get when you run a VM instance for a significant portion of the billing month.
The billing month typically runs from the first of the month to the last date, the 30th or 31st. Let’s say you run an instance for 25% of the month: you then get a discount for every incremental minute beyond that 25%. Sustained use discounts are applied automatically. At the end of the month, Google will calculate how long your particular instance ran and then apply the corresponding sustained use discount; you don’t have to sign up for it or do anything. Here is a table from Google’s documentation which shows how sustained use discounts apply to your VMs. Notice the usage level, that is, the percentage of the month that your VM has been up and running. If it’s zero to 25%, then you pay 100% of the base rate.
For 25% to 50%, you pay just 80% of the base rate, and for 50% to 75%, 60% of the base rate, for every incremental second that you use your VM. The column at the very right shows an example incremental rate in US dollars per hour for a particular machine type, the n1-standard-1 instance. In addition, Google uses something called inferred instances to give you the best possible discount when you run your VM instances. It’s entirely possible that your VM instance doesn’t run continuously during a month: you’re constantly starting and stopping it; you’ll run it for five days, then shut it down, run it for another five days, and so on. You’ll qualify for sustained use discounts even if your pattern of usage is start-and-stop like this.
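The tier table can be turned into a short calculation. The first three tiers (100%, 80%, 60% of the base rate) come from the table above; the final 40% tier for the last quarter of the month completes the schedule, giving the net 30% discount for a full month of use:

```python
# (upper bound of the usage tier, incremental rate as a fraction of base)
TIERS = [(0.25, 1.00), (0.50, 0.80), (0.75, 0.60), (1.00, 0.40)]

def effective_rate_fraction(usage_fraction: float) -> float:
    """Average fraction of the base rate paid, given the fraction of the
    month an (inferred) instance ran. Each quarter of the month is billed
    at its own incremental rate."""
    paid, prev = 0.0, 0.0
    for upper, rate in TIERS:
        block = min(usage_fraction, upper) - prev
        if block <= 0:
            break
        paid += block * rate
        prev = upper
    return paid / usage_fraction

print(round(effective_rate_fraction(1.0), 2))  # 0.7 -> a 30% discount for a full month
print(round(effective_rate_fraction(0.5), 2))  # 0.9 -> a 10% discount for half a month
```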
Compute Engine gives you the maximum available discount by clubbing instance usage together, which is why stop-and-start usage still qualifies. So let’s say you don’t have a VM running continuously, but a particular kind of VM has run for 25% of the month in total: then you qualify for sustained use discounts. One thing to note here is that you’re not necessarily starting and stopping the same instance. Different instances, as long as they are running the same predefined machine type, can be combined to create inferred instances. Google will then check the usage pattern of your inferred instance and see whether you qualify for sustained use discounts. This might seem confusing at first, but a graphical representation will help you follow what exactly is going on.
Now let’s say that you have started and stopped five different instances during the course of a month, from day one to the end of the month; that is on the x-axis, and you have five different instances. At the end of the month, Google will check and see which of these instances were of the same predefined type. Here, one and three were of the same type, so they are clubbed together. Two and four were of the same type, while five ran for a little longer and was of a slightly different type, so it is considered on its own. So we have three different kinds of instances, and these are the inferred instances. By combining one with three, and two with four, you’ve got longer terms of usage than you would have had each instance been considered separately.
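The clubbing described above can be sketched as a simple aggregation; the machine-type names and hours below are made-up illustrative data, not the lecture’s graph:

```python
from collections import defaultdict

def inferred_usage(runs):
    """Club together the hours of all runs that share the same predefined
    machine type, the way Compute Engine builds inferred instances.
    `runs` is a list of (machine_type, hours_run) pairs."""
    totals = defaultdict(float)
    for machine_type, hours in runs:
        totals[machine_type] += hours
    return dict(totals)

# Five start/stop runs during a ~730-hour month (illustrative numbers):
runs = [
    ("n1-standard-1", 200),  # instance 1
    ("n1-highmem-2", 150),   # instance 2
    ("n1-standard-1", 250),  # instance 3: same type as instance 1
    ("n1-highmem-2", 100),   # instance 4: same type as instance 2
    ("n1-highcpu-4", 130),   # instance 5: a different type, on its own
]
totals = inferred_usage(runs)
print(totals)
# The n1-standard-1 inferred instance ran 450 hours (about 62% of the
# month) and n1-highmem-2 ran 250 hours (about 34%), so both qualify for
# some sustained use discount; n1-highcpu-4 at 130 hours (~18%) does not.
```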
Google will then calculate the sustained use discount on these inferred instances. Google also offers sustained use discounts for custom machines, but they are calculated a little differently. For predefined machine types, inferring instances is very easy: you simply take instances of the same machine type and club their usage together to find the sustained use discount. But in the case of custom machine types, sustained use discounts are calculated by combining memory and CPU usage separately. Google will add up similar blocks of memory together and similar blocks of vCPUs together, trying to combine resources to qualify you for the biggest sustained use discounts possible.
Here are some graphs showing how sustained use discounts work for custom machine types. Let’s take a look at the graph on the left, which shows the actual use of custom machines in your cloud project. There are two custom machines here that you’ve instantiated and used: the first is a custom 4-vCPU machine with 6 GB of memory; the second is a custom 2-vCPU machine with 4 GB of memory. Now let’s see how Google clubs them together to get the best possible discount, starting with the vCPUs. The custom 2-vCPU machine is clubbed together with two vCPUs from the custom 4-vCPU machine.
So we have one 2-vCPU combination that has been used for the entire month. The two remaining vCPUs of the machine depicted in green, the 4-vCPU machine, are considered separately. Similarly for memory: the 6 GB of memory used by the custom machine in green is split into a 4 GB chunk and a 2 GB chunk. The 4 GB chunk is then combined with the custom machine depicted in blue, which has just 4 GB of memory, so this combination ran for the entire month and qualifies you for a sustained use discount. Custom machines do not use inferred instances; instead, the time for which the CPU and the memory ran is inferred separately.
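The resource-by-resource combination in this example can be sketched with a tiny helper (an illustration of the idea, not Google’s actual algorithm):

```python
def combined_blocks(resource_a: float, resource_b: float):
    """Combine the same resource from two custom machines: the overlapping
    amount forms one block that is treated as having run the whole month;
    the remainder is discounted on its own."""
    shared = min(resource_a, resource_b)
    leftover = abs(resource_a - resource_b)
    return shared, leftover

# vCPUs: the 2-vCPU machine plus 2 of the 4-vCPU machine's cores combine,
# leaving 2 vCPUs to be considered separately.
print(combined_blocks(4, 2))  # (2, 2)

# Memory: 4 GB of the green machine's 6 GB combines with the blue
# machine's 4 GB, leaving a 2 GB chunk on its own.
print(combined_blocks(6, 4))  # (4, 2)
```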