39. Cloud Storage Module Introduction
Yet another area where cloud really shines is data storage. Data storage has been a massive challenge for information technology for decades, and it keeps getting worse. Everything is digitized now, and we wanna store everything in this digital form for sometimes massively long periods of time. So, this data is building up. Can you even imagine the amount of data on YouTube now? You know, I mean, think about it. Me, myself, I have uploaded thousands of videos at this point. I’m one person and I’m not as prolific as peers that I have. So, it’s just amazing, this issue of data storage today. And I think you’re gonna love this module because we’re gonna see where the cloud is extremely flexible and extremely capable in the areas of data storage.
40. Cloud Storage Types
Now, I must admit I’m a little bit uncomfortable with this title, and that’s just because of the fact that we’re gonna talk about storage types. Not necessarily cloud storage types, it’s storage types. Now, of course, these are available to us in the cloud, and that’s why we are discussing them. But, really please keep in mind that we can have object-based storage today in standard non-cloud environments, just like we have block storage and file based storage.
Now, I decided to go ahead and take you on a tour of this storage in an actual public cloud, as a much more interesting way to teach it to you. So, let’s get going on that tour right now.
So, as you can see, we’re in the latest AWS console. And let’s see just how public cloud can accommodate any type of storage need we might have. One of the things that people immediately think of when they think of Amazon Web Services and storage is, without a doubt, S3. One of the reasons that people think of S3 and AWS all the time is because this is true. This is their oldest service. Yeah, this is their first, their oldest cloud service. And S3 stands for the Simple Storage Service. It is object-based storage. So, this is so cool. You create a bucket, okay? So, you create a bucket that has a DNS compliant globally unique name. So, notice I use my last name, a first letter and then last name. That’s pretty unique. But then I had to add – 1234 to it to make it globally unique. And then I can upload into this bucket any object I want.
So, I can come here and grab this object, and I can add it to the bucket and upload it. And the bucket’s size is limitless. It is not file-based storage. It’s not block-based storage. It is just object based storage. And believe it or not, this bucket is just limitless in its size, and it’s limitless in the number of objects that I can toss inside of it. So, object-based storage is very flexible storage and that’s one of the reasons why it is so often celebrated in cloud environments. You have a flexible resource type environment with the cloud and here you have this bucket that is limitless in what it can contain. Now, that’s object-based storage, but what about other types of storage that we might need?
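To make the naming point concrete, here is a minimal sketch of a bucket name check. It covers only a simplified subset of the S3 naming rules (length, allowed characters, no adjacent dots, not shaped like an IP address), and the function name and the sample name `jdoe-1234` are hypothetical. It cannot tell you whether a name is globally unique; only AWS can answer that when you try to create the bucket.

```python
import re

# Simplified subset of the S3 bucket naming rules (an assumption, not the
# full official list): 3-63 characters, lowercase letters, digits, hyphens
# and dots only, starting and ending with a letter or digit.
_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")
_IPV4_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def looks_like_valid_bucket_name(name: str) -> bool:
    """Return True if `name` passes the simplified shape checks above.

    This says nothing about global uniqueness.
    """
    if not _NAME_RE.match(name):
        return False
    if ".." in name:           # adjacent dots are not allowed
        return False
    if _IPV4_RE.match(name):   # names shaped like 192.168.1.1 are rejected
        return False
    return True


print(looks_like_valid_bucket_name("jdoe-1234"))   # hypothetical name
print(looks_like_valid_bucket_name("My_Bucket"))   # uppercase and underscore
```

The first call passes the checks and the second fails them, because uppercase letters and underscores are not DNS compliant.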
Well, one type of storage is file-based storage. And sure enough, file based storage is given to us in AWS, thanks to the Elastic File System, or EFS. So, you can come in here and notice very easily, you can create a nice regional network file system share. And this file system can then be accessed by devices for storing their files. So, this is great for NFS, Network File System, type of arrangements that are popular in Linux environments.
In Microsoft environments, there is the Server Message Block (SMB) protocol that is used for file sharing. And sure enough, AWS addresses that now. And I do believe it is FSx, right here, FSx. It’s got a kind of strange name, but think file system. And notice, we can create one of these FSx file systems, and that will be for the Windows shares. So, let me search again. Yeah, see, there’s this FSx service, and this is where we can create a file system for Windows environments. Notice, this is expanding all the time. So, there are more and more types of operating systems. Notice, NetApp’s ONTAP operating system is now supported. Oh my goodness, this is amazing. So, we can have NetApp communicating with AWS cloud storage now directly for storage needs. Incredible.
Now, another type of storage that we need to know about is, of course, block-based storage. And yes, we have that inside of AWS as well. It’s called the Elastic Block Store, or EBS. And this is Elastic Beanstalk, not where I wanted to go. So, let me type EBS again. And notice how we get to the Elastic Block Store is through the virtual machine service of AWS. So, inside of the EC2 service, which is for the creation of virtual machines, if we look down in here, we’ll see that of course there are these Elastic Block Store storage volumes. And what these are for is creating volumes for the operating systems that run on top of them. So, block storage is what an operating system would look to see and install itself on. And that’s why the Elastic Block Store is in the virtual machine portion of AWS.
Now, keep in mind, I used AWS to teach this topic, but I could have just as easily used Azure, which has file storage available, has block storage available, and has object-based storage available. So, just because we used AWS, it doesn’t mean they’re the only cloud that offers such cloud-based storage. So, we know in today’s landscape, there’s a large variety of storage capabilities, and sure enough the cloud is gonna be responding with all types.
41. Cloud Storage Tiers and IOPS
It really does seem almost unbelievable, doesn’t it, when we think about topic after topic after topic where the answer really can be the cloud. This is not just marketing speak. This is not exaggeration. The cloud can be so incredibly important for organizations and their data storage needs. In this video, we’re gonna talk about specifically how the cloud can help us tier storage, and what this whole concept of IOPS is, and why we hear about it so much in cloud today.
Now remember, one of the reasons why we oftentimes absolutely love a move to the cloud is that, when it comes to that challenge of data storage, the cloud is typically gonna give us the ability to do tiering. And oftentimes, you’ll see cloud vendors or even storage providers talking about hot storage versus warm storage versus cold storage. Now, your popular cloud providers, like AWS, and Google Cloud Platform, and Azure, don’t all use this exact terminology. They’ll often use names built around access frequency, things like frequent access or infrequent access, as ways to describe the tiers, with the frequently accessed tier being what would be the hot cloud-based storage. And we wanna keep in mind that the frequency of access is often what will define the actual tier of storage. And remember this: the actual technology will typically change based on the tier that you’re dealing with. So, what you’ll see a lot of cloud vendors do is they’ll take all of their old but still perfectly functional hard disk drives, and they’ll make those available for archiving your cold storage. So your data is actually on old-school hard disk drives. And then they’ll use SSD technology for the hot and warm tiers. And let’s keep in mind that not all SSDs are created equal. There are some SSDs that are very expensive and very high performance, and then there are more affordable SSDs that are lower performance.
Now, the additional topic that I wanted to make sure you were aware of here is IOPS. What is IOPS all about? And why are we constantly talking about IOPS when it comes to cloud-based storage? Well, this is the input/output operations per second of the media. So, you can just imagine this: you’re gonna have 20,000 input/output operations per second compared to 1,000. And I’m just making these numbers up, but you get the idea. The number of input and output operations that can be done per second is how we measure the speed of storage. And thanks to the cloud, we can tier the various technologies around the IOPS we would actually want. And guess what happens in public cloud as you start saying, ‘I want everything in the hot storage, man’? And don’t say it that way. You’d sound like a nerd. But anyways, let’s say that we’re all excited about the hot storage. Well, guess what’s gonna happen? Our cost of storing data in the cloud is going to be much greater than if we opted for the lower IOPS cold storage. Thanks so much for watching.
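To put the made-up 20,000 versus 1,000 IOPS numbers in perspective, here is a small worked example that turns IOPS into approximate throughput. The 16 KiB I/O size is purely an assumption for illustration, and the function name is hypothetical. Real volumes also have separate throughput caps that this sketch ignores.

```python
def throughput_mib_per_s(iops: int, io_size_kib: int) -> float:
    """Approximate throughput in MiB/s: operations per second times I/O size."""
    return iops * io_size_kib / 1024


# Assumed I/O size of 16 KiB per operation, for illustration only.
fast = throughput_mib_per_s(20_000, 16)
slow = throughput_mib_per_s(1_000, 16)

print(fast)  # 312.5
print(slow)  # 15.625
```

Same I/O size, twenty times the operations per second, twenty times the data moved. That gap is what you are paying for in the hotter tier.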
42. Storage Tiers in AWS
Now, if you’re new to this concept of storage tiers, and many of you are in organizations that might not have the budgets and things or even the variety of the equipment that you would need in order to do storage tiering, let’s take a look at this as it would exist in an example of public cloud. Let’s take a look at storage tiering as we would see it with data storage of various types inside of AWS.
So, here we are inside the brand new, brand new style of console homepage in AWS. Again, I am a huge fan of this. I love how they’re finally giving us our current cost information. And in fact, I can see that, you know, I’m really starting to spend a significant, right, amount of money per month on EC2. So, I can check that out. And there’s the EC2 service right there at my fingertips. So, I can come in here, take a peek. Notice, I have no instances running right now. So, I must have been doing some pretty intense stuff in the past, let’s take a look at what instances are in here, yeah, look, I was doing some big Splunk system testing. Actually, this was it right here, this FMC right here. This is a Firepower Management Center. This is a Cisco firewall, a very big, sophisticated Cisco firewall. And notice, in fact, this is an old version. Watch this, we’re gonna do a little cleanup here. I’m gonna go ahead and delete this thing, cause notice I have an elastic IP address that this thing is consuming, and so that’s a little bit of money, too. Notice all of this is very affordable here. Did you notice we’re talking about like, $7 was my total spend this month? I mean, this is not a lot of money, but it’s always nice to go ahead and clean this stuff up when you can. So, I’m gonna go to, boy, see, isn’t this interesting? So, terminate instance is what we want, but they move things around so much with these interfaces. That is one of the huge challenges with public cloud. As you just saw, I was struggling to find that menu option and sure enough, it’s moved and it was, you know, difficult to find.
So, there we go. Now we’re in the terminated state. So, this had some big storage constraints on it, too. And the underlying storage, the underlying data storage, was just cleaned up, as well. See how I tied this demonstration into our concept of data storage? And no, I’m being totally serious, this is a big deal, so we had some very expensive SSD storage that just got deleted, as well.
Now, in the elastic IP area is where I have an elastic IP address now hanging out, that I can go in and disassociate the elastic IP address with my Amazon Cloud account, and then once that is done, I will release the elastic IP address and goodbye. So, now, I will not be charged for consuming that public IP address. I hope you appreciated this little kind of jaunt into a related area in AWS. But now let’s get to the main point of this video, of course, and that is, let’s go in and look at the object-based storage in AWS and let’s take a look at, you know, how is the intelligent tiering done with something like the famous S3 bucket of AWS?
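The same two-step cleanup of the elastic IP address (disassociate, then release) can be scripted. This is a minimal sketch, assuming you pass in a boto3 EC2 client and already know the association and allocation IDs. The function name is hypothetical; `disassociate_address` and `release_address` are the boto3 EC2 client calls.

```python
def release_elastic_ip(ec2_client, association_id: str, allocation_id: str) -> None:
    """Disassociate an Elastic IP from its instance, then release it."""
    # Step 1: detach the address from the instance or network interface.
    ec2_client.disassociate_address(AssociationId=association_id)
    # Step 2: hand the address back to AWS so it is no longer billed.
    ec2_client.release_address(AllocationId=allocation_id)


# Usage (assumes boto3 is installed and credentials are configured;
# the two ID variables are placeholders you would look up yourself):
#   import boto3
#   release_elastic_ip(boto3.client("ec2"), my_association_id, my_allocation_id)
```

Taking the client as a parameter keeps the function easy to try against a stub before pointing it at a real account.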
Notice, here’s my aseaqueira-12345678, it has to be a unique name, your AWS S3 buckets, it has to be globally unique for all of AWS, so you end up with some kooky bucket names to give yourself that uniqueness. Obviously, when I created the aseaqueira bucket and then deleted it, that name couldn’t be used for a period of time, so that was problematic. You get the idea, uniqueness on the name is a challenge with the bucket, but then once you get past that, you have this bucket in AWS that’s extremely flexible. It can just contain objects and there’s no limit to the size or the number of objects that you can dump in here. And notice they give us a nice little structure. These are actually meta tags about the objects that are going in here, don’t think of this as file-based storage because it is not. This is not file-based storage. And look at this, I have this MP4 hanging out here that really belongs in one of these recording backups. So, I’m sure, let’s see if they allow you to click and drag. Ooh, no, nope, that’s not gonna work because it’s a web-based interface. So, notice now I’d have to figure out, okay, how do I, in the console, move that resource? It’s probably pretty easy. No, no, look at that. It’s not gonna be all that easy, I think, to move that.
But anyways, I digress because what we want to talk about is look at this, that object is in the glacier deep archive. You see, what you can do is, you can come in here and I can select all of the objects in my bucket. All of these folders, if you want to call them that, that’s what they certainly are visualizing it as. I don’t like to call them folders because again, this is not a file system. This is just big object-based storage, but I can select all these, and therefore, all the things inside them, these containers, let’s call them that, inside the bucket, these sub-buckets, all right, inside the main bucket, everything’s selected. And now you can go up to the actions menu and you can edit the storage class. Now, notice that I am not able to do that against all these objects, why? Well, some of them have minimum age requirements, like this glacier deep archive object. It’s throwing off the edit class option because the glacier deep archive, when you put your object in that deep archive storage, you are subscribing for a minimum amount of time it has to stay there. Isn’t that interesting, so yeah, it has to be there for a minimum amount of time. That’s part of your subscription to that particular public cloud service. Very interesting. So, let me go to like this, that I’m using all the time now, my 2022 recordings back up.
And if I select this folder and then I go up to actions, notice I can edit this storage class. And that’s because it is currently in a storage class where I’m not forced into the number of days it has to stay in there. See this min storage duration? So look at this, I can say, ‘All right, right now, I want to go standard,’ which means I’m gonna be frequently accessing this data and therefore it’s gonna be, you know, quickly available to me. Or you could say, ‘I want reduced redundancy.’ I don’t really care if AWS were to have a little struggle getting me this data because it’s only in two places instead of three places, you know, how many copies are they? How many backups are they doing for us? And you could say, ‘I want reduced redundancy, and I’m fine with that.’ Notice you can go into these glacier tiers of storage, this is really cool, where it’ll be pennies, and this is what I’m doing, obviously, with this backup. This is, by the way, an actual backup that you’re seeing that I have for all my videos.
And of course, I have this in the standard type of access right now, and the reason why I put that folder in, oh, look at this, look at this. So, there we go, there are some objects inside that are still set to glacier deep archive, but it’s editing the storage class for everything else. The reason why I don’t have all of this in the deep archive like I normally would, is for just purposes of demonstration, like I’m doing for you now.
So, notice, 141 objects were moved into the frequently accessed storage tier, the standard storage tier, and 13 gig were not edited because those are deep archived and they’re stuck there for a period of time. So, interesting and it’s so great to see how this kind of stuff works.
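What the console does with that edit storage class action can be approximated in code by copying an object onto itself with a new storage class. This is a hedged sketch of one way to do it with the boto3 S3 client, not necessarily what the console does internally. The function name is hypothetical, and the bucket and key in the usage comment are placeholders.

```python
def change_storage_class(s3_client, bucket: str, key: str, storage_class: str) -> None:
    """Rewrite one object in place with a different storage class."""
    s3_client.copy_object(
        Bucket=bucket,
        Key=key,
        # Source and destination are the same object.
        CopySource={"Bucket": bucket, "Key": key},
        StorageClass=storage_class,  # e.g. "STANDARD" or "DEEP_ARCHIVE"
        MetadataDirective="COPY",    # keep the existing metadata
    )


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   change_storage_class(boto3.client("s3"), "my-bucket", "backup/video.mp4", "STANDARD")
```

To cover a whole folder the way the demo does, you would loop this over every key under that prefix.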
Now, by the way, if there was something in the deep archive and you’re like, ‘Okay, I really do need it, I really do need this MP4.’ Well, I can go ahead and initiate a restore, this button right here. And they will start pulling that out of the deep archive for us, and now we could even take that unarchived copy and we could put it into a bucket in S3 for things we wanna work on currently. So, you can get the stuff back, it’s just that you’re gonna be charged additionally when you do the restoration, and also, just keep in mind that it’s gonna take some time. If you’re in the deep archive, it might take eight hours for you to get that object. I took something out of the deep archive recently and I think it took even longer, like 12 hours. So, yeah, it really is a deep archive. That’s why they charge you very-very-very little for all this storage up there in the cloud. It’s cause they cannot retrieve it for you or they’re not going to quickly because of your payment structure, it’s really interesting.
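The initiate restore button has a scripted equivalent in the boto3 S3 client. Here is a minimal sketch, assuming a seven-day restored copy and the "Standard" retrieval tier as defaults. Both defaults and the function name are assumptions made for illustration. The call only starts the restore; the object still takes hours to become available.

```python
def request_restore(s3_client, bucket: str, key: str,
                    days: int = 7, tier: str = "Standard") -> None:
    """Ask S3 to make a temporary copy of an archived object available."""
    s3_client.restore_object(
        Bucket=bucket,
        Key=key,
        RestoreRequest={
            "Days": days,                            # how long the restored copy stays
            "GlacierJobParameters": {"Tier": tier},  # retrieval speed versus cost
        },
    )


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   request_restore(boto3.client("s3"), "my-bucket", "archive/video.mp4")
```

A slower retrieval tier such as "Bulk" trades even more waiting for a lower retrieval charge.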
Now, I want to impress upon you that that was just how tiering would work, for example, inside of AWS S3 object storage, but you have to remember, you’re gonna have easy options like that for all kinds of storage needs. For instance, let’s pretend we want to go in, and in our EC2 environment, we want to launch a new instance, so we’re gonna go through the launch instance wizard, at least part of it here, to get to the storage part. So look at this. Let’s spin up an Amazon Linux 2 AMI with an SSD volume type, woo, nice, fast SSD storage. So we select this thing and notice there’s a free tier eligible type, one virtual CPU, one gig of memory, and it’s got EBS, Elastic Block Store, storage behind it, okay, fine. Now, I’m gonna come down here, I need to move my recording software thing out of the way, and then I’m gonna go next to the details. Let’s just accept all the defaults here, because I want to get to this next step, and that’s the storage step. Look at this.
Under the volume type, right now it is set to General Purpose SSD storage, but if we drop this selection, notice there is a gp3 there, that’s gonna be higher performance. There’s also these io1 and io2 options. So these are specially tuned, Provisioned IOPS SSDs. We’re getting into blazing performance here. And then look at this, there’s Magnetic. How cool is that? You can spend even less money on this virtual machine by saying, ‘You know what, the speed of the disk is totally irrelevant.’ Maybe it is for what you’re using the virtual machine for, so you say, ‘Let’s just go to old-fashioned hard disk drives for this virtual machine.’ So, notice, whether we’re talking about block storage that’s gonna be the storage for one of our virtual machines, or whether we’re talking about, you know, object-based storage in S3, there’s gonna be a convenient interface that’s going to allow you to point and click and take advantage of storage tiering, and that’s why storage tiering and the cloud is so exciting. Thanks so much for watching.
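The volume type drop-down maps to a single field when an instance is launched through the API. Here is a minimal sketch that builds the block device mapping you could pass to the EC2 `run_instances` call. The helper name, the 8 GiB default size, and the `/dev/xvda` root device name are assumptions for illustration, and the descriptions in the table are informal labels.

```python
# Volume type identifiers as the EC2 API spells them, with informal labels.
VOLUME_TYPES = {
    "gp2": "General Purpose SSD",
    "gp3": "General Purpose SSD, newer generation",
    "io1": "Provisioned IOPS SSD",
    "io2": "Provisioned IOPS SSD, newer generation",
    "standard": "Magnetic",
}


def root_volume_mapping(volume_type: str, size_gib: int = 8,
                        iops=None, device_name: str = "/dev/xvda") -> list:
    """Build a BlockDeviceMappings list for one root volume."""
    if volume_type not in VOLUME_TYPES:
        raise ValueError(f"unknown volume type: {volume_type}")
    ebs = {
        "VolumeType": volume_type,
        "VolumeSize": size_gib,
        "DeleteOnTermination": True,  # clean up the volume with the instance
    }
    if iops is not None:
        ebs["Iops"] = iops  # only meaningful for provisioned-IOPS style volumes
    return [{"DeviceName": device_name, "Ebs": ebs}]


cheap = root_volume_mapping("standard")
fast = root_volume_mapping("io2", size_gib=100, iops=20_000)
print(cheap)
print(fast)
```

Switching tiers for a virtual machine comes down to changing one string, which is the same point the console makes with its drop-down.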
43. Other Cloud Storage Topics
I’m sure you’re not surprised that there is more to say about cloud storage, because data storage is a massive challenge for enterprises today. Let’s wrap up our discussion of cloud storage with just a discussion of some other various topics.
One of the remarkable things when you think about cloud storage is the fact that we can have a bucket. Yes, that was my terrible attempt at drawing a bucket. And when I say bucket, we think AWS S3 bucket as an example. The equivalent technology, by the way, in Azure, would be the binary large object, or Blob Storage, service. But so, let’s say we’re in S3 bucket world here, and we are tossing all of this stuff into the bucket, and remember, there’s no limit on the number of objects. There is no limit on the size of the bucket. So there are just unlimited opportunities here. Now, what’s amazing is thanks to the cloud, you can have services go in and do things like check the security needs of that data. For example, if the service runs and goes in and sees that there are credit card numbers that are exposed, or maybe there are social security numbers in that data, then the cloud service can recommend the appropriate action for you to take, and then can automate that action. Maybe it’s even masking out that information, or maybe it’s putting that into a special encrypted location. So, once the data gets into the cloud, the wealth of services that we have to do analysis of that data, the number of those services is just on the increase. Let me give you another example of this that just blew my mind when I first saw it.
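To give a feel for what such a scanning service is looking for, here is a toy sketch that finds card-like numbers in text and masks all but the last four digits. It is not how any particular cloud service works. It uses a simple pattern plus the Luhn checksum, and the function names are hypothetical.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
_CANDIDATE_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def mask_card_numbers(text: str) -> str:
    """Replace likely card numbers with asterisks, keeping the last four digits."""
    def _mask(match):
        raw = match.group(0)
        digits = re.sub(r"\D", "", raw)
        if not luhn_ok(digits):
            return raw      # looks like a long number, but not a card number
        return "*" * (len(digits) - 4) + digits[-4:]
    return _CANDIDATE_RE.sub(_mask, text)


# 4111 1111 1111 1111 is a widely used test number, not a real card.
print(mask_card_numbers("card 4111 1111 1111 1111 on file"))
```

The checksum step is what keeps ordinary long numbers, like order IDs, from being masked by mistake.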
In AWS, there is the ability to create a data lake. And what a data lake is all about is just this concept of all this unstructured or various types of data being thrown into something like an S3 bucket, and then that data being able to be analyzed. And that’s exactly what types of services exist now with S3. We can toss things into that cloud storage, and then we can have the data lake service. I forget what its name is inside of AWS. But this data lake technology will reach out, will grab the data that’s in that S3 bucket, and then will allow, like, structured query language (SQL) style searches to be done against that information.
I mean, this, we are in amazing times now. Before, you had to get them the data all sanitized and all cleaned and all ready for ingestion into a relational database management system, and there was little to no flexibility in that system, and it required all this planning. Now, we’re just dumping stuff into an S3 bucket. It may be .csv files, it may be text files. It doesn’t matter, and the data lake technology is reaching out, and it is just sucking that information in and then allowing us to analyze the data. It is a whole new world thanks to cloud, and I hope this video kind of opened your mind to all of the possibilities that exist now with data storage in the cloud.
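The idea of running SQL-style searches over raw .csv files can be tried on a laptop. This sketch is only a local stand-in for the concept, using Python's built-in `csv` and `sqlite3` modules rather than any AWS service. The sample data, table name, and function name are all made up for illustration.

```python
import csv
import io
import sqlite3

# Made-up sample data standing in for a .csv file dropped into a bucket.
RAW_CSV = """video,minutes,year
intro.mp4,4,2021
storage-types.mp4,9,2022
storage-tiers.mp4,12,2022
"""


def query_csv(raw_csv: str, sql: str) -> list:
    """Load CSV text into an in-memory table and run one SQL query on it."""
    rows = list(csv.DictReader(io.StringIO(raw_csv)))
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE recordings (video TEXT, minutes INTEGER, year INTEGER)"
    )
    conn.executemany(
        "INSERT INTO recordings VALUES (:video, :minutes, :year)", rows
    )
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


totals = query_csv(
    RAW_CSV,
    "SELECT year, SUM(minutes) FROM recordings GROUP BY year ORDER BY year",
)
print(totals)  # [(2021, 4), (2022, 21)]
```

The difference with a cloud data lake is that you skip the loading step yourself: the service reads the files where they sit and lets you query them.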