21. Understanding Content Delivery Networks
Hi everyone, and welcome to the Knowledge Port video series. Today we are going to talk about content delivery networks. In the last lecture, we spoke about the basics of a reverse proxy and how a reverse proxy helps with caching. This is the scenario we had taken: we move the static files, like pictures or JavaScript, to the front-end server, which is NGINX. Now, whenever traffic comes from a mobile device or a browser, it reaches NGINX first, and NGINX can serve all of the static files directly. This is one of the advantages, and it helps not only with latency but also with resource savings. One more advantage here is that the back-end server is not directly reachable by clients, so it helps from a security point of view as well. Now, there is one issue here. Let's assume that there are lots of visitors; say you have a big sale on your website and you get thousands of visitors.
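As a reference, here is a minimal sketch of the kind of NGINX configuration being described. The paths, port, and file extensions are assumptions for illustration, not taken from the lecture.

```bash
# A minimal sketch of the reverse-proxy setup described above (paths, port,
# and file types are illustrative assumptions).
sudo tee /etc/nginx/conf.d/static-cache.conf > /dev/null <<'EOF'
server {
    listen 80;

    # Serve static assets (pictures, JavaScript, CSS) directly from NGINX
    location ~* \.(jpg|jpeg|png|gif|css|js)$ {
        root /var/www/static;
        expires 7d;                     # let browsers cache static files
    }

    # Everything else is proxied to the back-end application server,
    # which is never exposed to clients directly
    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
    }
}
EOF
sudo nginx -t && sudo systemctl reload nginx
```

This is the split the lecture refers to: static files are answered at the proxy, and everything else is forwarded to the hidden back-end.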
Now, the problem here is that this is a single point. All those thousands of users will come here, and all the heavy lifting, like serving the website, DDoS protection, and running a proper security suite, has to happen here. Now, one of the things that you can do is offload all of this. This means that many things, such as a web application firewall, DDoS protection, various caching mechanisms, TLS termination, proxies, and so on, can be moved away from the front-end server and onto a content delivery network. So we put a content delivery network between the users and our front-end server, and whatever heavy lifting has to be done can be done by this content delivery network instead of our servers. Now, one big advantage of having a CDN in between is that the CDN is actually optimised for doing this heavy lifting, so we don't really have to spend time designing our own web application firewall, et cetera.
So now all of these big things can be handled by the content delivery network. And along with that, the static assets that we were putting on the NGINX server can now live at the CDN level. So now a web browser makes its requests to the CDN, and the CDN can serve the traffic directly, without needing to contact the front-end server, as long as the request is for static assets. So a CDN offers numerous benefits, and many of the major websites use a CDN of some kind. Now, if you're wondering what CDN options are available, there are two major CDNs that most small and medium-sized enterprises generally use. One is Cloudflare.
Cloudflare is an amazing CDN specifically designed for small and medium organisations. They also have a free plan available, so if you have a small blog, you can actually use the Cloudflare CDN for your own website. It also provides a lot of things like DDoS protection, content-based caching, et cetera. So if you look, they have a free plan available at $0 per month, and it offers various things like DDoS protection, although the CDN features are limited. If you choose a more advanced plan, such as the Professional plan, you also get an SSL certificate, and the Professional plan comes with a web application firewall as well. So this is one of the CDN providers that are available. The second one is CloudFront. So this is Amazon CloudFront.
And CloudFront supports a lot of features, specifically if you are hosting your content within your AWS network. It supports dynamic content and manual cache invalidation, and the features I would really highlight are geo-targeting, cross-origin resource sharing, and, obviously, caching. It also supports the AWS Web Application Firewall service through its integration with CloudFront. So AWS CloudFront provides assistance with a lot of things. The good thing is that it also has a free tier. If you register yourself under the free tier, you will see that CloudFront supports up to 50 GB of data transfer, which you can use within the free tier. So CloudFront is really nice, and what we will do is learn the basics of content delivery networks. In the upcoming lectures, we'll set up our own CloudFront-based CDN and explore the various features that will help us not only with caching but also with the security aspect. So I hope this has been informative for you, and I'd like to see you in the next lecture.
22. Understanding Edge Locations
Hi everyone, and welcome back to a new topic called edge locations. Edge locations are an extremely important concept in CDNs, so let's go ahead and understand how edge locations help CDN networks. As previously stated, many organisations use a CDN for a variety of reasons. Some use it for content caching, some use the web application firewall feature, some use it to prevent distributed denial-of-service attacks, and some organisations use it for all of these purposes. So the purpose differs, but I would say that most organisations use CDNs mainly for content caching as well as protection against DDoS attacks. One of the most important concepts in a CDN is content caching, and edge locations are part of content caching. So let's understand what that means. Let's assume that this is a world map and that your server is located somewhere around Singapore.
Now, as you have a global website, you can expect users from all over the world to connect to this particular server. The amount of latency depends on where the users are coming from. So assume that this is a user from the US. The amount of time a packet from this user will take to reach the server will be much higher than the amount of time it will take for the user in Australia to connect to the same server. Here's another important concept to consider: "hops." The packet that a user sends will not reach the server directly; it will travel through something called hops, and hops are basically routers. So let's assume that this user from the US is making a request to this particular server. The request might first go to, say, somewhere in Europe, like Germany.
From Germany, it might traverse to some other country, then to another country, and then it might reach this particular server. So there are various intermediary hops that the packet has to traverse before it reaches the server. Let me actually show you how this works. There's a very useful utility called "traceroute," and you can use it to see which hops your packets are taking. So allow me to run a traceroute on KPLabs. Okay, so you can see the initial hop: the packet first reaches this particular server, or I should say router. From here, it goes to the second hop, and from there it goes somewhere in Germany; a bit further down, this is the Germany area. Going further, it reaches Amsterdam; from Amsterdam, a bit further down, it reaches London; and from London, it reaches the Linode servers where the site is hosted. So it takes various intermediary hops to reach the final location.
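If you want to reproduce this yourself, the command looks like the following; I'm assuming kplabs.in as the site's domain here, so substitute whichever host you want to trace.

```bash
# Each line of output is one hop (router) on the path to the destination,
# along with round-trip times; the exact path varies by network and time.
traceroute kplabs.in

# On Windows the equivalent utility is called tracert:
#   tracert kplabs.in
```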
So this is what we were discussing: the packet will not reach this specific server directly. It will go through various countries before it reaches here. Again, the amount of time it takes users to reach a particular server depends on how far they are from it. In this case, the user from Australia will take fewer milliseconds than the user from the United States. So, what exactly is an edge location? What happens is that when you create a CloudFront distribution, CloudFront makes your content available at edge locations. Let's assume that you have an MP3 file over here that you want all the users to download. CloudFront will connect to the server and copy the MP3 file to the edge locations over here. And now, when users make a request, they will not make the request to the origin; they will make the request to the nearest edge location. So this user from the US, instead of making requests to the server in Singapore, will make them directly to the edge location that is present over here. As a result, the latency is drastically reduced. This is a very important point. So this is just an overview of what an edge location means.
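One way to see this in practice: CloudFront steers each user to a nearby edge location through DNS, so resolving a distribution's domain name returns edge IP addresses close to the resolver making the query. The domain below is a placeholder for a real distribution domain.

```bash
# Resolve a CloudFront distribution domain (placeholder shown here).
# Running this from different regions, or via different public resolvers,
# typically returns different edge IP addresses.
dig +short d1234abcdef8.cloudfront.net
```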
Now, there are only three edge locations represented in the diagram, but in reality, there are a lot of edge locations that Amazon provides. You have more than 50 edge locations, which Amazon provides all over the world. Just talking about India, the country has around three edge locations: you have one in Chennai, one in Mumbai, and one in New Delhi. So within one country, there are three edge locations. Your content can be distributed to each of these edge locations, which makes the latency much lower. Now again, when you design your own CloudFront distribution, the set of edge locations that you choose has a significant impact on the amount that you will be charged. So let's go back to the AWS console. If I go to CloudFront, let me just create one distribution.
If I go a bit down, you see the distribution settings. There is a price class setting that says to use all edge locations. This is for the best performance, but with the best performance comes an increase in price. However, keep in mind that Amazon does offer a variety of edge location options to choose from. So let's assume that all of your users are only from the US, Canada, and Europe. Then, instead of using all the edge locations, you can just select the first option, and that will solve the problem for you. Or, if your users are mostly from Asia or a similar region, you can use the corresponding option instead of all the edge locations. That will basically save you some amount of cost. So this is the basic information about edge locations in CloudFront. I hope this has been helpful, and in the next lecture, we'll go ahead and create our own distribution in CloudFront. Thanks for watching.
23. Deploying CloudFront Distribution – Part 01
Hey everyone, and welcome back. In today's video, we will be discussing deploying a CloudFront distribution. Now, in order for us to deploy a CloudFront distribution, there are certain steps involved. The first step is to create a server or some kind of storage location where we can store our website files, the content that CloudFront will deliver. Now, one great thing about CloudFront is that it can integrate with S3, so you don't necessarily need an EC2 instance; we can make use of an S3 bucket for that. Once you have your files in your S3 bucket, the next thing you need to do is create a CloudFront distribution. Once the distribution is created, you can go ahead and load the website from CloudFront to verify that everything is working fine. After that, you can proceed to explore the various features of CloudFront. We can understand these steps with the help of the animation below.
So let's assume that this is the server, or this can be an S3 bucket, and this S3 bucket or server has some kind of static files. These can be images, HTML files, et cetera. What you do now is create a CloudFront distribution. This CloudFront distribution can communicate with the server or the S3 bucket that has the static files, and it can even handle dynamic content. Now, the CloudFront distribution has edge locations, which are present over here, and these edge locations are what cache the information. The first time a user visits your website through the CloudFront distribution, the distribution will request the content from the server and serve it to the user. Once it serves the content, it will also save the content in the edge location. So let's say that this user has requested the image. The first time, CloudFront will fetch and serve the image directly, and along with that, it will store the image at the edge location so it is available for subsequent requests.
So now, the next time, let's say there is another user who also loads the website. This time, the image will be served from the edge location. CloudFront will not send a request to the server for the image; the image will be served from the edge location itself. So this is the high-level overview of the scenario. Let's go ahead and do the first step in this video, which is setting up the storage for our content. So I'm in my AWS management console. The first thing that we need is a location where we can store our images and the HTML file. Now again, you could create an EC2 instance, but that is something you can avoid here; for the demonstration, we'll use a simple S3 bucket. So I'll go to Services, and I'll select S3. Within here, I'll create a new bucket. I'll call it my-demo-cloudfront, and I'll click on Create. Great. So this is our S3 bucket. The next thing is that we need to upload certain contents over here.
So what I've done is prepare two pieces of content. One is a simple index.html, and the second is an image that I'm really fond of. So this is the image. We'll be uploading both of these to our S3 bucket. From my screen, I'll click on "Upload," and from here, I'll upload both of these files. Great, so you have the index.html and you have the shift.jpg. Again, you can use your own custom content as well. To show you what the index.html file is all about, I'll just open up a notepad. This index.html is a simple file that basically contains "Welcome to the Website," and that's about it. All right? And the image is something that we have already seen. You can use your own custom content for your demo. Great. So once your content has been uploaded, let's quickly go to the permissions. In fact, I wanted to show you a few things. AWS has recently released a feature, the public access block settings, which prevents you from making things public. This is quite a new feature.
So I'll just deselect all of them and click on "Save." This is just for our testing purposes. Let me confirm this. Great. So the public access settings have been updated. Now let's go to the properties, and within static website hosting, I'll select the first option, which states "use this bucket to host a website." The index document will be index.html, and I'll click on Save. Great. So now the last thing that you need to do is change the permissions. We'll go to the access control list here, and for public access, we'll allow everyone to read the objects. All right, let's click on "Save." Perfect. So everyone will be able to read the objects. And now, if you notice, let me go back to S3: you see that this bucket is now marked as "Public." This is a really great feature, because from the S3 console I can see which buckets are public in a simple and easy-to-understand way. So now, within the bucket, I'll quickly select both of these objects and make them public. All right, so that's about it. To verify that everything is working correctly, let's click on one of them. I'll copy the object URL, and if you paste it in the browser, you should be able to see "Welcome to the Website" over here.
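As a side reference, the same bucket setup can be sketched with the AWS CLI. I'm assuming the bucket name from this demo, and the HTML here is a minimal stand-in for the file shown in the video.

```bash
# Recreate the demo page locally (a minimal stand-in for the real file).
cat > index.html <<'EOF'
<html><body><h1>Welcome to the Website</h1></body></html>
EOF

aws s3 mb s3://my-demo-cloudfront                                 # create the bucket
aws s3 cp index.html s3://my-demo-cloudfront/ --acl public-read   # upload, readable by everyone
aws s3 cp shift.jpg  s3://my-demo-cloudfront/ --acl public-read
aws s3 website s3://my-demo-cloudfront/ --index-document index.html   # enable static website hosting
```

Note that the public-read ACLs only work because we relaxed the public access settings first, exactly as we just did in the console.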
In a similar way, we'll take the URL of the image that we have. Let me put it in the browser, and you should be able to see the image. Great. Since we have static website hosting enabled for the S3 bucket, let's take that URL as well; this is the website endpoint of the S3 bucket. If you paste this URL, you should be able to see the index.html page. All right, so this is the S3 bucket, which is hosting both our website and our image. Now, coming back to our animation diagram, we have our server. In our case, it is S3, which is hosting the image and the index.html file. Now, instead of users directly accessing our server, what we want is to create a CloudFront distribution that will handle all the requests over here. This is something that we'll create in the upcoming video. So this is the high-level overview. I hope this video has been informative for you, and I look forward to seeing you in the next video.
24. Understanding the Origin Access Identity
Hey everyone, and welcome to the second part of our video series on deploying CloudFront. In today's video, we will be performing the second step, which is creating the CloudFront distribution. Now, I'm in my AWS management console. Let's go to the CloudFront service. I already have a CloudFront distribution available; that was used for a different demo. So let's go ahead and create a new distribution over here.
There are two types of distribution: one is Web, and the second is RTMP. For our demo, we'll be using a web distribution. So we must specify the origin domain name here. The origin domain name is essentially where CloudFront will get the data from. In our case, the origin is essentially our S3 bucket. If you just click over here, it will show you the list of S3 buckets that are available within your account. If you remember, our bucket name was my-demo-cloudfront. So once you have selected the origin domain name, let's go a bit further.
Now, within the price class, you see it says to use all edge locations. This is important, because let's say that you have customers coming from all across the world. In that case, you can make use of all the edge locations; that basically means CloudFront will begin caching content in all of the edge locations around the world. In that case, if a customer is coming from, say, the US region, he might be served all the files from the nearest edge location. But if your customers are not from the United States, and you know they are only from Asia or possibly Africa, there is no need to store your data in every edge location. In such a case, you can select one of the restricted options: "Use Only US, Canada and Europe," or "Use US, Canada, Europe, Asia and Africa." So, depending on the location of the customers who visit your website, you can choose one of them. All right? Let me select the second option here. Now, the next thing is that you can specify the default root object over here. Let's specify index.html, so that any time a user visits the website, index.html is returned. Once you have done that, you can go ahead and create the distribution.
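For reference, here is a sketch of creating the same distribution through the AWS CLI. The JSON below is roughly the minimal shape for an S3 origin; the caller reference and origin ID are placeholder values I've made up, and PriceClass_200 corresponds to the "US, Canada, Europe, Asia and Africa" option we just selected.

```bash
# Minimal DistributionConfig sketch for an S3 origin (IDs are placeholders).
cat > dist-config.json <<'EOF'
{
  "CallerReference": "my-demo-cloudfront-001",
  "Comment": "Demo distribution",
  "Enabled": true,
  "DefaultRootObject": "index.html",
  "PriceClass": "PriceClass_200",
  "Origins": {
    "Quantity": 1,
    "Items": [{
      "Id": "s3-my-demo-cloudfront",
      "DomainName": "my-demo-cloudfront.s3.amazonaws.com",
      "S3OriginConfig": { "OriginAccessIdentity": "" }
    }]
  },
  "DefaultCacheBehavior": {
    "TargetOriginId": "s3-my-demo-cloudfront",
    "ViewerProtocolPolicy": "allow-all",
    "MinTTL": 0,
    "ForwardedValues": {
      "QueryString": false,
      "Cookies": { "Forward": "none" }
    },
    "TrustedSigners": { "Enabled": false, "Quantity": 0 }
  }
}
EOF

aws cloudfront create-distribution --distribution-config file://dist-config.json
```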
All right? So this is the distribution here. Let me just sort the list; as you can see, the distribution origin here is my-demo-cloudfront.s3.amazonaws.com. Currently, the status is "In Progress." It takes a little bit of time for the CloudFront distribution to get created, so I'll pause the video, and once the status changes, we'll resume. All right? So it has been close to ten to fifteen minutes, and our CloudFront distribution status is now "Deployed." So you'll need to take this domain name over here. We'll copy the domain name that is associated with the CloudFront distribution and put it in the browser. And here it is again, the "Welcome to the Website" page. So this is how you can create a CloudFront distribution and associate it with an S3 bucket. Before we conclude this video, I wanted to show you a few more things. In this diagram, we were discussing how, the first time a user visits the CloudFront distribution, the request is sent to the origin, and the image or whatever file is present is returned. Along with that, the CloudFront distribution will also store the static contents within the edge locations over here.
Now, the second time the user visits the same website, the content will be served from the edge location. All right? So let's quickly look at what exactly that might look like. Let me do one thing: I'll copy the CloudFront domain, and let's do a curl on shift.jpg this time, which is an image file. An easier way to do this is to use the -I flag, which prints just the response headers. Now, if you look at the x-cache header, it states "Hit from cloudfront." That means this specific image file has been served from a CloudFront edge location. So this is the basic information on how we can deploy a CloudFront distribution. We also looked at how, when multiple requests are made, the contents are served from the edge location instead of the request going to the origin and fetching the same content multiple times; a quick reference sketch of these commands follows below. So with this, we'll conclude this video. I hope this video has been informative for you, and I look forward to seeing you in the next video.
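Here is a sketch of that header check, plus the manual cache invalidation feature mentioned in lecture 21, in case you change a file and want the edges to drop their cached copy before it expires. The distribution domain and ID are placeholders; substitute your own.

```bash
# Fetch only the response headers for the cached image (placeholder domain).
curl -I https://d1234abcdef8.cloudfront.net/shift.jpg
# Look for:   x-cache: Hit from cloudfront    (served from an edge location)
# The first, uncached request instead shows:  x-cache: Miss from cloudfront

# Force the edges to re-fetch an object from the origin (placeholder ID).
aws cloudfront create-invalidation \
    --distribution-id EXXXXXXXXXXXXX \
    --paths "/shift.jpg"
```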
25. Understanding the Importance of SNI in the TLS Protocol
Hey everyone, and welcome back. In today's lecture, we are going to speak about Server Name Indication, which is also referred to as SNI.
Now, SNI is quite an important topic as far as TLS is concerned, and even in exams you might find a question or two related to it. So let's go ahead and understand more about SNI. Going back to earlier days, let's take a scenario for an HTTPS-based website where you have a client, which you can consider to be a web browser, and you have a server. In the TLS handshake, the first thing that happens is that the client sends a client hello. Whenever a client sends a request for an HTTPS-based website, as we have already looked into, the server replies with its certificate, which is typically signed by a certificate authority. With this certificate, the client browser verifies it against the root CAs that are installed in the browser, and then the other steps follow. So this is a very simple scenario, and we have a happy ending. But after virtualisation came along, things became much more challenging. One of the major challenges is this: what happens if there are multiple websites running on the same server, and each of the websites has its own certificate? If a client sends a client hello to that server, which certificate should the server send, the certificate of website one?
Should it send the certificate of website two? And sometimes, with shared hosting, there are 10 to 20 websites on a single server, so the older approach does not really work. This is the reason the SNI extension came into the TLS protocol. What happens with the SNI extension is that whenever the client sends the request to the server, it specifies the server name. So it specifies whether it wants to connect to website one or website two. Once the web server knows which website the client wants, it will go ahead and retrieve that website's certificate and send it back to the client. And the same goes for website two: if the client specifies that it wants to connect to website two, then website two's certificate will be fetched and sent back to the browser. So let's do one thing: let's look at how exactly this works with our favourite Wireshark. I'll start Wireshark, and we'll be using the WiFi interface because I'm connected over WiFi, and I'll start the packet capture. Now I'll just go to some random website that uses HTTPS. So the HTTPS site got loaded, and I will go a bit down. This was the DNS query, and if you go a bit down, there are some client hello packets. The client hello is one of the first steps in initialising the TLS connection.
So within that client hello packet, let's go to the handshake protocol. As you can see, various extensions are part of the protocol. Let me maximise it so that it becomes much clearer. So this is the client hello packet, and within it there is an extension called server_name. This server_name extension is basically the SNI extension. Within the server name indication extension, you will see the server name, which is youtube.com. This is how it works when the client sends a request. So this is my local IP, and this is the IP of the YouTube server. When my browser made the request to this IP address, it specifically said within the client hello packet that it wanted to access youtube.com, and the server then sends back the youtube.com certificate to my browser, which my browser can verify with a certificate authority. So this is the basic information about the SNI extension. Let's go back to the topic and look at some of the benefits that SNI brings.
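If you don't have Wireshark handy, you can observe the same behaviour with openssl, whose -servername flag sets the SNI extension in the client hello. This is just a quick sketch using the same youtube.com example from the capture.

```bash
# Send a client hello with SNI set to youtube.com and print the subject of
# the certificate the server returns; the certificate the server chooses
# depends on the server name we indicate.
openssl s_client -connect youtube.com:443 -servername youtube.com </dev/null 2>/dev/null \
    | openssl x509 -noout -subject
```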
Now, one important thing to remember is that the SNI extension is supported by modern browsers, but there are many legacy clients that do not support it, and those clients cannot work with multiple SSL-certificate websites sharing a single IP address. For those legacy clients, there can be only one website with an SSL certificate per IP; multiple SSL certificates on the same IP address are simply not supported. So let's look at the benefits. Prior to SNI, a website needed a dedicated IP address in order to have an SSL certificate installed. This is very important: if you did not have a dedicated IP, you could not really have an SSL certificate, and browsers would simply refuse to work with it. However, with SNI, we are finally able to host multiple websites that share a single IP address, with each website having its own SSL certificate. One important thing to remember is that the SNI extension needs to be supported by the browser as well, because it is the browser that sends the client hello with the SNI extension. And last, and this is a very important point, some legacy clients, including Internet Explorer on Windows XP, do not support Server Name Indication, and thus your website may break for those clients if you rely on the SNI approach. So this is the basic information about SNI. Let me actually show you one more important thing. When you go ahead and create a CloudFront distribution, which is a content delivery network, let me just create a sample distribution, and if you go a bit down to the SSL certificate section, let's click on the custom SSL certificate option.
And now, if you go a bit down, you see there are two options over here. The first option is to support only clients that support Server Name Indication, and for this you don't really have to pay anything. There is also a second option, which assigns you a dedicated IP address. You see, CloudFront allocates a dedicated IP address at each CloudFront edge location to serve your content over HTTPS. This specific option is more for supporting legacy clients. So, if you have a website and a significant number of legacy clients visit it, using SNI alone will break all of those clients, and you need a dedicated IP address to make them work. So that's the end of Server Name Indication. I hope this has been informative for you.
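For reference, in the CloudFront API this choice lives in the ViewerCertificate part of the distribution config: "sni-only" is the free option we discussed, while "vip" allocates the dedicated IPs for legacy clients. This is a sketch; the distribution ID and the values shown are placeholders.

```bash
# Inspect which SSL support method a distribution uses (placeholder ID).
aws cloudfront get-distribution-config --id EXXXXXXXXXXXXX \
    --query 'DistributionConfig.ViewerCertificate'
# Expected shape (values illustrative):
# {
#     "ACMCertificateArn": "arn:aws:acm:us-east-1:111122223333:certificate/...",
#     "SSLSupportMethod": "sni-only",
#     "MinimumProtocolVersion": "TLSv1.1_2016"
# }
```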