Google Professional Data Engineer – Virtual Machines and Images Part 2

July 25, 2023

4. Rightsizing Recommendations

We cover two last bits about VMs before we move on to images. The first of these is rightsizing recommendations. You’ve instantiated your machines, you’ve started your workloads, and you’re serving production traffic. But how do you know that this machine type is the right one for you? How do you know you haven’t provisioned excess capacity? How do you know that you’re not already using the machine to its limit, and that you might do better with a bigger one? Rightsizing recommendations from Google can help you here. This is a fairly new feature offered by Google, which was launched in beta back in July 2016. Compute Engine provides machine type recommendations to help you optimize resource utilization.

Until very recently, rightsizing recommendations were based solely on the CPU and memory statistics that were visible to the Compute Engine virtual machine manager. This was a good enough approach, but it wasn’t really accurate, because it did not take into account the other usage statistics that can be gathered from a VM. The current version of rightsizing recommendations uses system metrics gathered by Stackdriver Monitoring. The additional metrics provided by Stackdriver allow rightsizing recommendations to accurately determine how much memory is allocated by processes versus operating system caches, and how much memory has been freed up. These make the recommendations more accurate and in tune with real-world workloads.

Rightsizing recommendations use the last eight days of data, so if your workload is in production, the recommendations are calculated from eight days of real usage. The algorithm looks at the CPU usage of your machines and determines whether CPU utilization was low or high. If it was too low, it will ask you to use a machine type with fewer vCPUs. If CPU utilization was too high, it will recommend a machine type with more vCPUs. Separately, it also looks at memory usage. If memory usage is very low, it will recommend a machine type with less memory. If memory usage is too high, it will recommend a machine type with more memory.
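The decision logic described above can be sketched roughly as follows. This is only an illustration, not Google's actual algorithm: the 30% and 80% thresholds, the use of a simple average, and the function names are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical thresholds; the lecture does not give the real cut-offs.
LOW_UTILIZATION = 0.30
HIGH_UTILIZATION = 0.80


def recommend(samples, low=LOW_UTILIZATION, high=HIGH_UTILIZATION):
    """Return 'decrease', 'increase', or 'keep' for one resource.

    samples: utilization fractions (0.0 to 1.0) collected over the
    observation window (eight days in the lecture's description).
    """
    if not samples:
        return "keep"  # no data, so no recommendation
    average = mean(samples)
    if average < low:
        return "decrease"
    if average > high:
        return "increase"
    return "keep"


def rightsize(cpu_samples, memory_samples):
    """Evaluate CPU and memory independently of each other."""
    return {
        "vcpus": recommend(cpu_samples),
        "memory": recommend(memory_samples),
    }
```

Treating the two resources separately is what lets a recommendation ask for fewer vCPUs and more memory at the same time.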

5. RAM Disk

Early on in this course, when we first mentioned VMs, we also spoke about the various kinds of persistent disks that can be attached to your VMs. We spoke about the two kinds of persistent disks, that is, a hard disk drive or a solid state drive. We also spoke about local SSDs, which are specific to a particular zone and can be attached only to instances in that zone. There is one other kind of disk worth mentioning, called the RAM disk. As its name suggests, the RAM disk is an in-memory disk. Data is held in memory, but it’s structured to look like a hard disk drive. A RAM disk is created when a machine allocates some of its high-performance memory to use as a disk.

A RAM disk has very low latency and very high performance. So if you have an operation which requires writing out files to disk, and this write has to be extremely fast, that’s when you’ll prefer to use a RAM disk. A RAM disk is typically used when your application expects a file system structure. It can’t simply write data out to memory; it wants a file system. That’s when you’ll make your memory, your RAM, behave like a file system. Unlike persistent disks, this only pretends to be a disk. It’s not actually a disk, so it has none of the storage redundancy or flexibility that persistent disks offer.

A very important point to note here is that the RAM disk shares memory with your applications. The memory that’s allocated to your VM, which your applications use, is portioned out and used as a RAM disk. So make sure that you have enough memory to run your application. Only if you have additional memory should you provision a RAM disk. A RAM disk is ephemeral. The contents written to a RAM disk stay only as long as the VM is up and running. Once you stop or terminate your VM, the RAM disk contents are lost.
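As a rough sketch of the memory trade-off, the helper below refuses to plan a RAM disk unless the application keeps the memory it needs, and otherwise builds the standard Linux tmpfs mount command. The function names, the mount point and the sizes are hypothetical. Actually running the command needs root privileges and an existing mount point, so the sketch only prints it.

```python
def tmpfs_mount_command(mount_point, size_gb):
    """Build the Linux command that mounts a tmpfs RAM disk of size_gb."""
    return ["mount", "-t", "tmpfs", "-o", f"size={size_gb}g", "tmpfs", mount_point]


def plan_ram_disk(total_memory_gb, app_memory_gb, ram_disk_gb,
                  mount_point="/mnt/ram-disk"):
    """Return the mount command only if the application keeps its memory."""
    spare = total_memory_gb - app_memory_gb
    if ram_disk_gb > spare:
        raise ValueError(
            f"RAM disk of {ram_disk_gb} GB exceeds the {spare} GB of spare memory"
        )
    return tmpfs_mount_command(mount_point, ram_disk_gb)


if __name__ == "__main__":
    command = plan_ram_disk(total_memory_gb=16, app_memory_gb=10, ram_disk_gb=4)
    print(" ".join(command))
    # To mount for real (as root), one could pass the list to
    # subprocess.run(command, check=True).
```

Because tmpfs lives in memory, anything written under the mount point disappears when the VM stops, which matches the ephemeral behavior described above.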

6. Images

A virtual machine, or VM, is basically a software implementation of a machine which runs on underlying hardware. An image of a virtual machine is a copy of this VM which may contain an OS, data files, applications and a bunch of other stuff. Images are typically used to create a bunch of identical virtual machines with all your software preloaded. The Google Cloud Platform provides very good support for images. An image in Compute Engine can be thought of as a cloud resource that provides a reference to an immutable disk. This immutable disk is used to boot new VMs with the operating system and software preinstalled.

You can think of an image as some kind of box which holds the operating system, the applications and everything else that you need in order to instantiate and set up a brand new VM. An image is used to create boot disks for VM instances. There are two primary categories of images that Google offers. The first of these is public images. These are images that are stored and distributed by Google and offer standard operating systems with standard software preinstalled. For example, SQL Server might come preinstalled in one image, Red Hat Linux might come installed in another, and so on.

Public images are provided and maintained by Google, open source communities or third-party vendors. All projects on GCP have access to these images and can use them. So you can instantiate a VM in any project, in any organization, using these public images. Some of these public images are free and others might incur a cost. Public images are standard images and contain no customization that is specific to your project. If there is a particular configuration that you want to export as an image, you will create a custom image for it, and this custom image will only be available to your project. You can create a custom image from a combination of a boot disk and other images.

So you might use a public image to instantiate a VM, perform a bunch of configuration, install the application software that you are interested in, and then use this VM instance to create a custom image which will have all your stuff built in. Google hosts a number of public images which are registered and which you can use to create your VM at no additional cost. Linux distributions which are free typically have images which are also free to use, but there are some images for which you will incur an additional cost. These are called premium images. Typically, if you use an image of an operating system that requires a license in the real world, you will incur a cost for that image.

Windows images and SQL Server images will have additional charges. Let’s say you have a custom image that you’ve created as a part of your project. You can import this custom image in order to recreate VM instances from that image. This custom image can be imported at no cost. If the image is your image and you have built it, there is no additional cost for you to import and use it. The only cost associated with a custom image is a storage charge. An image is typically a tarred and gzipped file, and you store it in Cloud Storage so that all projects across your organization can access it. This storage charge is the only cost that you incur when you use a custom image.

The cost will be the same as for any other blob of data that you store on Cloud Storage. We’ll cover managed instance groups a little later in this course. Managed instance groups are basically groups of VM instances which are identical to each other. Managed instance groups are an important part of setting up load balancers. This is because managed instance groups can autoscale based on the number of requests that come in or on the CPU utilization of the group as a whole. The VMs instantiated in a managed instance group come from the same instance template, and instance templates typically point to an image in order to recreate a VM.

An image is a tarred and gzipped file which is made up of a bootloader, the operating system, the file system structure that goes with the OS, any additional software that you want included in the image, and any additional customization or configuration. Here is a block diagram of the steps that you would follow if you wanted to create a VM instance from an image. An image is basically just a bunch of raw bytes stored in a tarred and gzipped file. You can think of an image as containing everything that’s needed to create a prepopulated hard disk. Within the image is a partition table that points to one or more partitions that contain data. If you want this image to be used to boot a new instance, it must contain a master boot record and also a bootable partition.
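To make the boot requirement concrete, here is a small sketch that inspects the first 512-byte sector of a raw disk. It relies on the classic MBR layout (four 16-byte partition entries starting at byte 446, a status byte of 0x80 marking a bootable entry, and the signature bytes 0x55 0xAA at offset 510). The function names are made up for the example, and this is not how Compute Engine itself validates images.

```python
MBR_SIZE = 512
PARTITION_TABLE_OFFSET = 446
PARTITION_ENTRY_SIZE = 16
BOOT_SIGNATURE = b"\x55\xaa"
BOOTABLE_FLAG = 0x80


def inspect_mbr(sector):
    """Report whether a disk's first sector looks like a bootable MBR."""
    if len(sector) < MBR_SIZE:
        raise ValueError("need at least the first 512 bytes of the disk")
    has_signature = sector[510:512] == BOOT_SIGNATURE
    bootable = []
    for index in range(4):
        start = PARTITION_TABLE_OFFSET + index * PARTITION_ENTRY_SIZE
        status = sector[start]              # 0x80 means "active" (bootable)
        partition_type = sector[start + 4]  # 0 means the entry is unused
        if partition_type != 0 and status == BOOTABLE_FLAG:
            bootable.append(index)
    return {"has_signature": has_signature, "bootable_partitions": bootable}


def read_first_sector(path):
    """Read the first 512 bytes of a raw disk file."""
    with open(path, "rb") as disk:
        return disk.read(MBR_SIZE)
```

For a raw disk file on hand, `inspect_mbr(read_first_sector(path))` gives a quick sanity check before going to the trouble of packaging and uploading it.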

For a disk to be imported as a Compute Engine image, the disk’s bytes must be written to a file named disk.raw. When you create an image from a disk, after the complete sequence of bytes from the disk is written to the file, the file is archived using the tar format and then compressed using the gzip format. One major advantage of using an image to recreate your VM instances is the fact that it can be uploaded to Cloud Storage, which is accessible by all projects in your organization. So you can create a VM instance using this image anywhere in your organization. In order to use it as a VM image, you need to register it as an image in Compute Engine. So you create an image and then register it so that it can be used.
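The packaging step can be sketched with Python's standard tarfile module. The function names are illustrative only. The sketch writes a GNU-format, gzip-compressed tar whose single member is named disk.raw. The real import tooling may impose further requirements on the archive, which this sketch does not attempt to check.

```python
import tarfile


def package_disk_image(raw_path, archive_path):
    """Write raw_path into a gzip-compressed tar archive as 'disk.raw'."""
    with tarfile.open(archive_path, "w:gz", format=tarfile.GNU_FORMAT) as archive:
        # arcname forces the member name regardless of the local file name.
        archive.add(raw_path, arcname="disk.raw")
    return archive_path


def archive_members(archive_path):
    """List the member names stored in a .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return archive.getnames()
```

The resulting .tar.gz file is what would then be uploaded to Cloud Storage and registered as an image.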

Once you have an image, you can use it to create replicas of your original disk any time that you want, in any region, for any VM. A very common use case is for these images to be used as boot volumes for Compute Engine instances. We spoke about premium images earlier in this lecture. Premium images are those which incur additional per-second charges when you use them to create VM instances. The charges are the same across the world. They don’t differ from region to region. An easy way to identify whether a particular image is a premium image is to ask yourself whether the software requires a license when you install it on your own computer or laptop. If it does, then it is a premium image on the Google Cloud Platform.

Red Hat Enterprise Linux and Microsoft Windows are examples of premium images. The per-second cost of using such an image is not fixed. It depends on the machine type. If you use a larger and more powerful machine, the per-second cost will be higher. Premium images typically incur charges on a per-second basis, except for SQL Server. SQL Server images are charged per minute. Here is a snapshot of all the public images that are available on the Google Cloud Platform. You can use the web console to view these public images.
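The difference between per-second and per-minute billing comes down to how the runtime is rounded up. The sketch below shows only that arithmetic. The hourly rate is a placeholder parameter, not a real price, and any minimum charge that might apply is ignored.

```python
import math


def billed_seconds(runtime_seconds, increment_seconds=1):
    """Round runtime up to the next whole billing increment."""
    if runtime_seconds <= 0:
        return 0
    return math.ceil(runtime_seconds / increment_seconds) * increment_seconds


def premium_charge(runtime_seconds, hourly_rate, increment_seconds=1):
    """License charge for one instance.

    hourly_rate is hypothetical and would depend on the machine type.
    Use increment_seconds=60 for an image that is billed per minute.
    """
    return billed_seconds(runtime_seconds, increment_seconds) * hourly_rate / 3600
```

For a 90-second run, `billed_seconds(90)` stays at 90 under per-second billing, while `billed_seconds(90, 60)` rounds up to 120 under per-minute billing.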

7. Startup Scripts And Baked Images

Once you’ve created a VM instance, it’s typically not useful to you in its raw state. There’s a bunch of software that you might need to install on this VM, and some dependencies that you need to set up, before you get your application up and running. There are two ways by which you can achieve this using the Google Cloud Platform. The first of these is using startup scripts. Once you’ve created an instance using a public image, you can customize this instance using a startup script. This script will run during the boot of your instance and install all the libraries and dependencies that you need. The startup script is something that you configure on your virtual machine.

So as your VM instance boots, this script will run the commands that deploy your applications. Maybe it installs the dependencies, maybe there are some other configuration parameters that you need to set up. All of these can be set up in the startup script. It’s very important for startup scripts to be idempotent. That is, if you run the script multiple times, the end state is always the same. During the boot of a VM, it’s possible that the startup script is executed multiple times, especially if the VM stops and then reboots or something else happens. The script should be idempotent to avoid inconsistent or partially configured states. However many times you run the script, the end result should be the same.
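A minimal sketch of what idempotent means in practice: each step checks the current state before changing anything, so running it on every boot converges on the same result. The file name, the setting and the function names are hypothetical.

```python
import os


def ensure_directory(path):
    """Create path if it is missing; a second call changes nothing."""
    os.makedirs(path, exist_ok=True)


def ensure_line(path, line):
    """Append line to the file at path only if it is not already present."""
    existing = []
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            existing = handle.read().splitlines()
    if line in existing:
        return False  # already configured, nothing to do
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(line + "\n")
    return True


def configure(root):
    """Hypothetical startup routine that is safe to run on every boot."""
    config_dir = os.path.join(root, "app")
    ensure_directory(config_dir)
    ensure_line(os.path.join(config_dir, "app.conf"), "port=8080")
```

A naive version that always appended the line would leave a duplicate entry after every reboot, which is exactly the kind of drift idempotency prevents.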

Startup Scripts are one way by which you can customize your virtual machine to have your application up and running. A better method is baking. Baking basically sets up a custom image with your application dependencies and everything else that you need installed, and you use this custom image to boot and create your VM instance. An image which contains everything that you need to get up and running within it is called a baked image, and baking is a much more efficient way to provision infrastructure. If you want to create a whole cluster of virtual machines with exactly identical setups, baking is the way to go.

Startup scripts are less reliable than baking. In order to get a baked custom image, you’ll typically start off with a public image that the Google Cloud Platform offers. You will then perform a bunch of customizations and configuration on this public image and save it as a custom image. This is your baked image. Baking has several advantages over startup scripts if you want to provision virtual machines. Let’s take a look at some of the differences and see why baking is better. If you use startup scripts, it takes much longer for the instance to be ready, because first the image holding the operating system has to be used to create the virtual machine.

Then the startup script has to run, which will, step by step, set up all the dependencies that your application needs. On the other hand, if you use a baked image, it is much faster to go from boot to application readiness because everything is included in the image itself. It’s very important for the execution of startup scripts to be idempotent, because startup scripts might fail if there is a problem during VM boot and might have to be rerun. It’s much more reliable to use baked images because, even if an application deployment fails, you can simply roll back to the previous version of the image. This takes us straight to the next point.

Rollback with startup scripts is much harder because you might have to roll back the image that you used to create the VM as well as the application that you deployed. They have to be handled separately. Version management as well as rollbacks are much easier with baked images. You just roll back the image as a whole and switch to a different version. The application will be rolled back as well. Startup scripts also introduce additional dependencies when the VM instances boot. The script needs to install dependencies during application deployment, and the script runs during VM boot. With a baked image, your application software and all the corresponding dependencies are built into the custom image, so there are no external dependencies during application bootstrap.

Version management of dependent software is also much harder when you use startup scripts. Startup scripts might refer to the latest version of their dependencies, and if the latest version changes, you might end up with different VM instances running different versions. When you scale up your number of VMs using custom images, each instance will have identical software and dependency versions installed. This is a huge plus. Once you start using custom images, it’s quite normal to have a bunch of images representing different versions of your software. And once you have a number of images, you need to manage the image lifecycle. So there are various states in which images can exist.

The latest image will of course be the current live version. You might want to mark older image versions as deprecated, obsolete or deleted. Each has its own special meaning. Deprecated means that the image is no longer the current live version, but it can still be launched by users. Obsolete means the image should not be launched by users or automation. We should not allow instances to be created from obsolete images. And deleted means the image has already been deleted or is marked for deletion. It cannot be launched. Here is a block diagram which depicts the image lifecycle. You have a base image. It goes through an image build process to give you a number of custom images.

There are a number of versions of the same custom image, and as an image gets older you’ll change its state. The oldest custom image is v1, and that has been marked deleted. It can no longer be launched by users and should be removed from the registry as soon as possible. Version v2 is also old. It has been marked as obsolete. It cannot be launched by users. Version v3 is relatively new. It’s not the latest version, which is why it has been deprecated, but users can still use it and launch it. And finally, at the top, we have version v4, which is the current live image.
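The four versions and their states can be modeled in a few lines. This is only a sketch of the lifecycle rules as described in the lecture. The enum, the helper names and the state label used for the live image are illustrative and are not part of any Compute Engine API.

```python
from enum import Enum


class ImageState(Enum):
    ACTIVE = "active"          # current live version
    DEPRECATED = "deprecated"  # superseded, but still launchable
    OBSOLETE = "obsolete"      # must not be launched
    DELETED = "deleted"        # removed or marked for removal


LAUNCHABLE = {ImageState.ACTIVE, ImageState.DEPRECATED}


def can_launch(state):
    """Only live and deprecated images may be used to create instances."""
    return state in LAUNCHABLE


def current_version(images):
    """Return the name of the single ACTIVE image, or None."""
    active = [name for name, state in images.items() if state is ImageState.ACTIVE]
    return active[0] if len(active) == 1 else None


# The four versions from the lifecycle diagram.
images = {
    "v1": ImageState.DELETED,
    "v2": ImageState.OBSOLETE,
    "v3": ImageState.DEPRECATED,
    "v4": ImageState.ACTIVE,
}
```

With this table, only v3 and v4 pass `can_launch`, and `current_version(images)` picks out v4.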

As mentioned earlier, one of the most important advantages of using images to create VM instances is the fact that images can be stored on Cloud Storage and used across projects. Here is a block diagram of how you would set up images to be used across multiple projects. A user who has permissions to create an image will do so, upload it to Cloud Storage and register the image. Project A will have its own administrators, and any user within Project A who has the permissions to create an instance can use this image to create a compute instance which lives within Project A. The same is true for Project B. A user who has permissions to create instances there can use the same image. And both of these instances will be identical because they are based on the same image.
