45. Infrastructure as Code
Hey everyone, and welcome back to the Knowledge Portal video series. In today’s lecture, we will be speaking about infrastructure as code. Now, this specific approach to building infrastructure has gained huge popularity in today’s market, and lots of organisations are now migrating to an infrastructure-as-code-based approach. So let’s go ahead and understand this. Coming back to the basics, there are two ways in which you can actually build your infrastructure. One method is to build the infrastructure by hand: you manually create a server, and everything will be manual. The second is through automation. So when I talk about the manual approach, let’s just see how it works. Let’s assume the requirement is an EC2 instance. So you have a developer’s requirement for an EC2 instance. What do you do? You go to AWS or whatever cloud provider you use, and you do everything manually: you choose an operating system, and after selecting the operating system, you specify whether it requires two or four gigabytes of RAM.
Once you’ve chosen the resources, you’ll need to decide which VPC to launch it in and what firewall settings to use. Everything will be done manually. So, if a developer requires another instance tomorrow, you will log back into the cloud provider console and do everything manually again. That is the manual approach, and it is definitely repetitive and takes a lot of time. The second method is through automation. What happens in automation is that you write a simple script that launches the EC2 instance. Whenever a developer requests an EC2 instance, you just run the script, and the EC2 instance will be created for you. This approach of automation is a very sound one, and it is something that should be implemented, I would say, in most organizations.
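As a rough illustration of such a script, a single AWS CLI call can launch an instance. The AMI ID, key name, and security group here are placeholders, so treat this as a sketch rather than a copy-paste command:

```bash
#!/bin/bash
# Launch one t2.micro instance from a given AMI (placeholder values).
aws ec2 run-instances \
    --image-id ami-0123456789abcdef0 \
    --instance-type t2.micro \
    --count 1 \
    --key-name my-key \
    --security-group-ids sg-0123456789abcdef0
```

So let’s understand this with the example of a single service. In many organizations, the environment is segregated.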
You have a development environment, a staging environment, and a production environment. So for any new service that comes up, the infrastructure has to be built first in the development environment. You will need some EC2 instances, you will need a MySQL database, and you will need an S3 bucket. And there will be some kind of pipeline that has to be created. Now, in the traditional way, you will be doing everything by hand. So in the development environment, you launch everything manually. And once the development environment seems to be working, the developer will ask you to replicate the same setup in staging. So what you do is go to the staging account and again create the entire infrastructure, which would be a similar application in a staging environment. Now, once the testing is done in staging, the developer and the QA team will ask you to move everything to production.
And now what you’ll do is move the entire environment again: you’ll launch instances, you’ll create a MySQL database and an S3 bucket; everything will be manual, and this is just for a single service. So you can understand the effort that is needed from a DevOps team or a solutions architect to launch this infrastructure for a single service. But when you talk about big organizations, they actually have something like 500 to 600 services, and doing things manually simply does not suffice. You cannot scale this approach, and specifically for lazy people like me, doing everything manually is out of the question. So in order to solve this, what you do is write a template that will launch the infrastructure for you. And from that template, you launch the development environment. So the entire template for creating the EC2 instances, the database, and the S3 bucket is written here, and you launch this template in the development environment, and everything will be created for you.
Now, after a week or two, once the developer has told you to replicate this environment in staging, you don’t really have to worry, because you have the template. You go ahead and launch the template, and the same infrastructure will be created. Similarly, after a month, if the QA team asks you to deploy in production, again you don’t have to worry; you can directly launch it from the template. So the only effort you need is in the beginning, when you have to create the template and write the code, and after that you don’t have to worry about it. This is called infrastructure as code. I’ll show you a demo so that it will become much clearer. So let me open up my editor. I have a simple infrastructure-as-code template here, which I have written in Terraform. What it basically does is create an EC2 instance based on a specific AMI; this is the Amazon Linux AMI. Then you have a t2.micro instance, and you have a security group. If you look into the security group, there are two rules present here: one for inbound and one for outbound. Inbound, it is allowing port 22 from this specific IP, and outbound, it is allowing all ports. Perfect.
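To give you an idea of what such a template looks like, here is a minimal sketch along the lines of what I just described; the AMI ID and the source IP are placeholders, so don’t treat this as the exact file from the demo:

```hcl
provider "aws" {
  region = "eu-west-1" # Ireland
}

resource "aws_security_group" "iac_sg" {
  name = "iac-sg"

  # Inbound: allow SSH (port 22) from one specific IP (placeholder).
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["203.0.113.10/32"]
  }

  # Outbound: allow all ports to anywhere.
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_instance" "iac" {
  ami             = "ami-0123456789abcdef0" # placeholder Amazon Linux AMI
  instance_type   = "t2.micro"
  security_groups = [aws_security_group.iac_sg.name]

  tags = {
    Name = "IAC"
  }
}
```

Once a file like this is saved, the workflow is essentially `terraform init`, then `terraform plan` to preview, and `terraform apply` to deploy, which is exactly what we’ll do next.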
So we have written a simple template; this template is based on AWS, and it will be launched in the EU (Ireland) region. Perfect. So this is exactly what we wanted. Let’s do one thing: let’s go ahead and deploy the template. So let’s go to Terraform, and let me just quickly verify that everything seems to be proper, which it is. I have actually copied the entire template onto the EC2 instance I’m working from. And let me show you the region: I’m in the Ireland region, and no instances have been created there.
So we will be creating our first EC2 instance with the help of infrastructure as code. Perfect. Since I have my working code ready, what I’ll do is run a terraform plan. This template is based on Terraform, which again is a great infrastructure-as-code platform. What the plan actually shows you is exactly what will be created in your AWS account. So this seems proper. I’ll now run terraform apply. Terraform apply will go ahead and deploy everything that is written in the code to your cloud provider, in our case AWS. So if you see over here, it is creating a security group first, and once the security group is created, it will launch an EC2 instance and attach the security group to it. It might take a minute for this to be deployed. So you see, it is creating the EC2 instance. In the meantime, let me show you: there are various platforms available for infrastructure as code.
Terraform is one of the very nice ones, which I really like, and it is something that we will be using extensively. We actually have an entire course coming up on Terraform, so stay tuned. Along with that, there are other platforms available, like AWS CloudFormation, that allow you to do similar things. Now, one of the differences between CloudFormation and Terraform is that CloudFormation is vendor-specific: it is only for AWS, and you cannot use it for other cloud providers. Terraform, however, can be used with other cloud providers as well, and this is the reason why it is really great. Perfect. So now, coming back, you will see our Terraform run has completed, and it says that two resources have been created. So let’s do one thing: let me just refresh the EC2 console. And now you see our first EC2 instance, created with the name IAC, which has a security group attached. Let me just open up the security group: in the inbound section you will see the one inbound rule that was created, and in the outbound section you have the one outbound rule as well.
So now, if I want to create the same instance in a different region, the only thing that I have to do is change the region over here, and it will create the same EC2 instance in the other region as well. So this is a very high-level overview of infrastructure as code. Let’s come back to our PowerPoint presentation. There are numerous advantages to running infrastructure as code. One is reusable code. If you need a three-tier architecture, you don’t really have to write the code from scratch, because on the internet there are a lot of people who have already written code for three-tier architectures, based on Terraform or based on CloudFormation. So what you can do is take that code and launch your infrastructure based on it; you don’t have to write the entire thing again. That is one very big advantage. The second advantage is that you can manage infrastructure via source control, so you can commit your code to Git and have proper version history. Third, it enables collaboration: if your organisation has multiple solutions architects, everyone can collaborate and build a comprehensive infrastructure.
46. Understanding CloudFormation
Hey everyone, and welcome back to the Knowledge Portal video series. Now in the earlier lecture, we discussed the basics of infrastructure as code, and we also looked into how we can create an infrastructure based on Terraform.
Now in today’s lecture, we’ll be speaking primarily about AWS CloudFormation, because this is something that will come up in exams. AWS CloudFormation is yet another platform where you can launch infrastructure based on code. So let’s go ahead and understand more about AWS CloudFormation. I have opened AWS CloudFormation from the console. The first thing you need to do now is click on “Create a new stack.” So I’ll just click here, and there are some sample templates available; let me select the LAMP stack template. And one really nice thing that I love about CloudFormation is the designer. If you simply click here, CloudFormation will display the entire design in a nice graphical format. So this is what is there: you have a simple EC2 instance here, with a dependency on a security group. So the security group will be created first, and after the security group is created, the EC2 instance will be launched referencing that security group.
So if I just click on the security group, you will automatically be directed to the security group section of the template, and if I click on the EC2 instance, you will be directed over here. Now, there are a lot of other resource types that you can create from the designer itself, and CloudFormation will automatically generate the code for you. There are two template languages available: JSON is one, and YAML is the other. YAML was introduced recently, and I would say it was eagerly anticipated, because writing in JSON is a real pain. Anyway, let me just close this and start over. I’ll select the LAMP stack and click on “Next.” It will ask me for some settings: I’ll give the stack name as kplabs, let’s enter a database password, the DB user as kpadmin, and the instance type I’ll keep at t2.micro. For the key name, I have already created a key, so I’ll select kplabs. We’ll go next, click on “Next” again, and it will show you the configuration that you selected, so go ahead and click on “Create.” So now what will happen is that CloudFormation will use the template and start creating the resources based on what has been written. And in the Events tab, you can find out exactly what is happening.
So you see, the status is “CREATE_IN_PROGRESS.” Now there are two important resources that will be created: one is the security group, and the other is the EC2 instance. So you will see the security group initially in “CREATE_IN_PROGRESS,” and once the security group is created, it gets the status “CREATE_COMPLETE.” Because of the dependency, the security group is created first, and the EC2 instance is launched only after the security group has been created.
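To make this concrete, here is a minimal sketch of what such a template looks like in YAML; it is a simplified, hypothetical version, not the actual LAMP stack sample, and the AMI ID is a placeholder:

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Description: Minimal sketch - an EC2 instance that depends on a security group.

Parameters:
  KeyName:
    Type: AWS::EC2::KeyPair::KeyName

Resources:
  WebServerSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Allow HTTP and SSH
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          CidrIp: 0.0.0.0/0
        - IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp: 0.0.0.0/0

  WebServerInstance:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: ami-0123456789abcdef0   # placeholder AMI
      InstanceType: t2.micro
      KeyName: !Ref KeyName
      # Referencing the security group creates the implicit dependency:
      # CloudFormation creates the group first, then the instance.
      SecurityGroups:
        - !Ref WebServerSecurityGroup

Outputs:
  WebsiteURL:
    Value: !Sub "http://${WebServerInstance.PublicDnsName}"
```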
So, once these events are completed, let me just show you where things stand. I’ll click on EC2, and now you see there is one instance running, based on t2.micro, and it automatically has a security group as well. And the key name associated with it is kplabs. So all of these things were automatically created with the help of CloudFormation. The entire stack is created; you see “CREATE_COMPLETE.” Now let me show you: the URL is displayed in the Outputs tab. This is the URL of the EC2 instance, and if you click here, you will be presented with the PHP page served from the EC2 instance. So this is the basic idea of what CloudFormation is all about. Now, as a personal choice, I would really encourage you to go with Terraform, because it is never recommended to lock yourself into a specific vendor.
Terraform is really easy to learn. As we have already seen, the template that you create with Terraform is very simple: you specify the AMI ID, you specify the instance type, and you specify the security group. This is very simple to write and to understand. However, when you look at CloudFormation, it sometimes becomes a big pain. This is the template that got created, and this, after all, is a template for creating just a simple security group and an EC2 instance based on the LAMP stack. Again, it is entirely up to you, but in my opinion, Terraform is the best option available right now. So anyway, I would encourage you to try this out: try Terraform, try CloudFormation, and see which one works best for you. And with this, we’ll conclude this lecture related to the first part of CloudFormation. Thanks for watching.
47. Amazon Rekognition
Hey everyone, welcome back. In today’s video, we will be discussing the Amazon Rekognition service. Now, Amazon Rekognition is a deep learning-based visual analysis service, and it basically allows us to integrate very powerful visual analysis features within our application. Building a visual analysis feature from scratch is extremely difficult, so what AWS has done is build it already, and we can make use of the SDK or even the CLI to integrate it with our application. So that’s the only thing that we have on our slide; I really wanted to jump right into the demo to show you how amazing this service is. So I’m in my AWS management console, and if we go to Rekognition, let’s click here. So this is how the Rekognition GUI looks.
So let’s jump directly into the demo so that you can see what this service is all about. Let’s go to object and scene detection. And if you look over here, you have a photo, and this photo has multiple objects: you have a person over here, and you have a car over here. In essence, when you upload a picture, the Rekognition service will tell you in the response which objects are present and with what level of confidence. So you’ve got the automobile. And if you look into the request, this request is basically for the skateboard JPG image. So let’s look at a few more. Let’s look at the facial analysis. This is a picture of a woman, and here you see that it says it looks like a face with a confidence of 99.9% and appears to be female. It displays the age range, for example 20 to 38 years. She’s smiling and appears to be happy. It also detects whether she is wearing glasses; the confidence there is a little lower. And this is how it really looks.
Let’s look at a few more interesting ones. One is celebrity recognition: when you upload a picture, it can tell you that it belongs to a specific celebrity, with a specific match confidence and the celebrity’s name. There are a few more. One is face comparison, where you can compare the face in one picture to the face in another picture to see whether there is a similarity or not. And you also have text in image: let’s say you have an image that contains text; this service can extract the text for you. Great. So let’s try it out. These were some of the demo images that AWS provides, so now I’ll show you some sample images from my phone, and we’ll use those to see how accurately the service determines the objects in the images. So what I’ll do is upload a picture. This was from one of my recent travels. This is basically an ATR aircraft, and the service has determined that this is an airport, it has determined that this is an airplane, and you also have “vehicle,” because it has detected the truck here, it has detected the car, and so on.
So this has been accurate for us. Let’s try one more. This is a random image of a laptop on a desk, and as you can see, it has determined with 99.9% confidence that this is furniture; then you have a table, and then you have a computer. Now, it might not recognise that this is a Mac or some other specific brand, but it will definitely tell you that this is a computer. So let’s try one last image here. This one is unique among the three. This used to be one of my rabbits; we sent him to a relocation centre where he had a lot of space to run around, so he was very happy there. So now, you have a plant here with 92.6% confidence, and it is also saying that there is an animal, and you even have “food.” If you explore further, you will see that it is also capable of recognising a rabbit with a high level of confidence. So this is a fantastic service, and it has been able to accurately identify the objects and even the animals in the photographs. One important part is that if you integrate it with your application, the application can send a request with the photo, and within the response it will give you everything that you see in the GUI, along with the confidence-related parameters.
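As a rough sketch of what that integration can look like with the Python SDK (the file name and region here are placeholders, and I’m assuming AWS credentials are already configured):

```python
import boto3

# Create a Rekognition client (assumes AWS credentials are configured).
client = boto3.client("rekognition", region_name="us-east-1")

# Read a local image (placeholder file name) and send it for label detection.
with open("rabbit.jpg", "rb") as image_file:
    response = client.detect_labels(
        Image={"Bytes": image_file.read()},
        MaxLabels=10,
        MinConfidence=80,
    )

# Print each detected label with its confidence, like the GUI does.
for label in response["Labels"]:
    print(f"{label['Name']}: {label['Confidence']:.1f}%")
```

So that’s about it for the Rekognition service. I hope this has been informative for you, and I look forward to seeing you in the next video.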
48. Overview of AWS ElasticBeanstalk
Hi everyone, and welcome back to the Knowledge Portal video series. Continuing our journey, today we’ll be speaking about Elastic Beanstalk. Now, over the past year, Elastic Beanstalk has gained a good amount of popularity, specifically among startups, because Elastic Beanstalk makes it quite easy to launch a new infrastructure in a point-and-click manner. So it is something similar to an orchestration platform. Let’s do one thing: let’s understand Elastic Beanstalk with the help of a simple use case. Consider a scenario in which you need to deploy a simple Hello World application on an EC2 instance behind an ELB. So, in this very simple use case, there is a need for an EC2 instance behind an ELB, and that EC2 instance will host a straightforward Hello World application. So how would we achieve this use case in the traditional way?
In the traditional manner, we would create an EC2 instance. Once the EC2 instance is created, we have to modify the security group and other things. Then we log into the EC2 instance, we install a web server like Apache or NGINX, and then we install the application dependencies. By application dependencies, I mean that if your Hello World application is based on Java, then you need to install the Java packages; if it is based on Python, then you have to install the Python packages. So these things need to be done. Once you install the web server and the relevant application dependencies, you go ahead and configure your application on the server: you upload your application files, put them in the right directory, and make sure the permissions are correct. Once everything seems to be working fine, you create a new ELB, configure the health check, and once that’s done, you point it to the right EC2 instance. So this is definitely not a very difficult task. However, doing all of these things is difficult for startups or small organizations that lack a dedicated solutions architect or a DevOps team. I’ve seen a lot of startups, and I used to get emails from them, because all they have are developers; they can’t afford a dedicated DevOps person. So their developers create their own infrastructure, and when you create infrastructure in this traditional way, it really becomes challenging for them. And this is the reason why Elastic Beanstalk really helps. What happens in the Beanstalk way is that you create an Elastic Beanstalk environment with the correct platform, and once you do that, you just point and click, and you’ll be able to deploy the entire application. So it is a very, very simple process.
So let’s go ahead and do the practical session, and this part will become much clearer. I’m going to log into my AWS console and open up Beanstalk. Perfect. So this is the Elastic Beanstalk console. You’ll see over here that there are only a few steps you have to do. First, you have to select a platform. The platform is determined by the language in which you wrote your code, such as PHP, Python, Java, and so on. Once you select a platform, you upload your application as a zip file and run it. Okay, only three simple steps. So let’s try it out. I’ll click on “Get started,” and it is asking for my application name; I’ll set the application name as KP Labs. Now I have to select a platform, and there are various platforms available. So what really happens when you select a platform? Let’s assume I select PHP: then, when Elastic Beanstalk creates an EC2 instance, it will automatically install all the relevant PHP-based packages on the server, so you don’t really have to do it on your own. Once you select the platform, you have to upload your code. You can click here, select upload, and just upload your code as a zip file. That is one method; the other is to use a sample application, and we’ll use the sample application in our case. Once you select the sample application, just click on “Create application.” So Elastic Beanstalk will now create an EC2 instance and a security group, upload the sample application to the server, and configure everything for you. You don’t really have to do anything; that’s how simple it is. Naturally, this takes some time.
So the “create environment” call has started to work, and it might take around five to six minutes for the code to get deployed. If you see over here, it is now creating a security group. Once the security group is created, it can go ahead and create the EC2 instance as well, and once the EC2 instance is created, it will install the packages inside it. So now the security group is configured. Next, an Elastic IP is configured so that this can be accessible over the internet, and then the EC2 instance will be launched and the EIP will be associated with it. So let’s do one thing: I’ll pause the video for a while and come back once this process is completed. Okay, so it took around two minutes, and now our Elastic Beanstalk application is ready. What you see over here is that once a Beanstalk application is ready, it gives you a public URL, and if you just open it up, as you see, it will show you your sample application. So we just had to go through three steps, and Elastic Beanstalk actually did everything for us. Now, if you look over here, there are some interesting settings. One is the logs: if you want to see the logs from the server, you can do that. It has undoubtedly created an EC2 instance behind the scenes; let me just show you. You could log into the instance and get the logs, but Beanstalk does things in a much simpler way. So, as you see over here, it has created an environment named kplabs-env; this is basically our Beanstalk environment name.
And if you click on “Request logs,” I’ll say “Full logs,” it fetches the logs from the server and displays them for you. So let’s just wait; let me show you. So you have the URL. It has configured the Apache web server for you, and these are the EB activity logs. Let me just show you: this will actually show you what exactly Elastic Beanstalk has done for you behind the scenes. It is installing packages like awslogs, something related to CloudWatch as well, CloudWatch Logs, and many other things. Anyway, coming back to the topic, let’s go to the PPT and see what exactly Elastic Beanstalk has done for us. First things first, it created a security group. Then it created an Elastic IP address. Once this was created, it launched the EC2 instance for us. Once the EC2 instance was launched, the platform environment inside it was configured: it installed the AWS CLI, CloudWatch plugins, the Apache web server, PHP-related packages, all those things. Once the platform environment is configured, it uploads your application; in my case, it was the sample one, but if you clicked on the upload button and uploaded your zip file, it would upload that, unzip it, and configure your application. And at the end, it gives you a public endpoint where you will find your application running. So in just three simple steps it does everything for us. And this is why developers really love it: they don’t really have to do much of the technical stuff. All they have to do is click three buttons, and the entire environment is created for them.
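For what it’s worth, the same three steps can also be scripted from the terminal with the Elastic Beanstalk CLI; this is just a sketch, and the application and environment names here are placeholders:

```bash
# Initialize a Beanstalk application in the current directory (PHP platform).
eb init -p php kplabs

# Create a new environment; Beanstalk provisions the security group,
# EC2 instance, and the rest behind the scenes, just like the console did.
eb create kplabs-env

# Open the public URL of the deployed application in a browser.
eb open
```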
49. AWS CodeCommit
Hey everyone, and welcome back. In today’s video, we will be discussing the AWS CodeCommit service. Now, AWS CodeCommit is basically a managed source control service offered by AWS, which allows us to host private git repositories. Typically, in an organization, there are two ways in which you can host your git repositories. One is that you do it on premises, so you have options like installing GitLab and various other self-hosted providers on your own infrastructure. Or you can make use of SaaS offerings. Now, a SaaS offering is really great because you don’t really have to worry about high availability, security, et cetera. Since AWS CodeCommit is a SaaS service, it basically means that AWS will manage the entire thing for you, so you don’t have to worry about things going down and other issues. AWS CodeCommit is not a particularly old source control service; there are various other services, like Bitbucket, that offer a similar solution. You have GitHub, which is pretty famous, and Bitbucket is one of my favorites; I really love it, and they also have unlimited free private repositories, so you can try it out. However, since we’ll be focusing on the AWS certification, for us, AWS CodeCommit is the thing that will help us get good scores. So let’s look into how exactly this works.
So I’m in my AWS management console, and if you type “AWS CodeCommit” or just “code commit,” you will be directed to the CodeCommit console. You’ll see it is showing a message saying that it is introducing the new console for AWS CodeCommit. In fact, this is a pretty new console that they have launched, and it is pretty good compared to the older one. This is actually the second time we are recording this video, because the earlier version was based on the older console, and I decided to rerecord it when the new console came out.
So this is how the CodeCommit console looks. Now, typically, in order to create a repository, you need to click here and say “Create repository.” Here you need to give a repository name; let me give it as kplabs-git, and I’ll go ahead and create it. Now the repository is created. If you are doing this with a root account, which I am in my case, you’ll basically get a warning saying that you cannot configure SSH connections for a root account. So, basically, if you want to do cloning over SSH and such, you should sign in using an IAM user. Now, in order for you to start working with the CodeCommit repository, there are basically two prerequisites. One is that you need to have the git client installed, with a minimum supported version. And second, you need to have an AWS CodeCommit managed policy attached to the IAM user. So these are the two prerequisites that you need.
Now, as we already saw, having a git client is the first prerequisite, so let’s look at how we can do that. I’m connected to a Linux box, and basically, if I type git over here, it is saying that there is no such file or directory. So the first thing that you need to do is install the git client. Now, if you’re using an OS like Red Hat Enterprise Linux or Amazon Linux, it’s quite simple. If you quickly run a “yum whatprovides git,” you see this is basically the package that has the git binary. So you can go ahead and install git with “yum -y install git.” If you’re using a Mac, you can do a “brew install git,” and if you’re using Windows, you’ll need to install the exe file there. Great. So once you have the git client installed, you can just verify: if you type “git,” it will give you the basic help documentation. Now our first prerequisite step is complete. The next thing is that we need to create an IAM user that has the CodeCommit-specific policies attached; you control the permissions with IAM policies. So I’m in the IAM console, and what we’ll do is go to Users, create a new user, name it “codecommit” just for our testing purposes, and select programmatic access over here.
Now we’ll move on; you can attach existing policies here, and I’ll just search for the CodeCommit ones. Once you do that, you’ll see there are three managed policies: one is basically for read-only access, the second gives full access to CodeCommit, and the third gives full access but does not allow repository deletion. So, for testing purposes, we’ll take the full access policy and associate it with the IAM user. Perfect. So once you have created the user, you basically get the access key and secret key, and you need to configure your OS with those keys. So let me do a quick aws configure. Great. So my AWS configuration has now been completed. Now the next thing that we have to do is clone this specific repository. If you go back to the repository, you have two options for cloning: one is through HTTPS, and the second is through SSH. If you click on HTTPS, it will give you the link to do a clone, and similarly, if you click on SSH, it will give you that link. So let’s try the HTTPS one.
I’ll copy this, and in order to clone it, you type “git clone” and specify the link to the repository. And now you see that it is asking for a username and password. This username and password are associated with the IAM user that we have created. So let me click on the codecommit user here, and if you go to security credentials and scroll down, there are two options: the first is the SSH key pair for AWS CodeCommit, and the second is the HTTPS git credentials for AWS CodeCommit, which is what we will generate now. I clicked on “Generate” here, and it gave us the username and password. So I’ll copy this username and password, and once you press Enter, it basically says that you have cloned an empty repository. That is fine, because this is a newly created repository. So I’ll do a cd into kplabs-git, because this is our new repository, and let’s quickly create a file: I’ll do a “touch test.txt,” and then a quick echo saying “Hello, this is our first commit,” redirected into test.txt.
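Put together, the terminal session looks roughly like this; the clone URL is specific to my account and region, so treat it as a placeholder:

```bash
# Clone the empty repository over HTTPS (placeholder URL; you will be
# prompted for the git credentials generated for the IAM user).
git clone https://git-codecommit.us-east-1.amazonaws.com/v1/repos/kplabs-git
cd kplabs-git

# Create a file with some content.
echo "Hello, this is our first commit" > test.txt

# Stage, commit, and push it to the master branch.
git status
git add test.txt
git commit -m "adding our first file"
git push origin master
```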
Once you have done that, you can do a git status, and it will say that there is an untracked file. You can go ahead and add it, and then commit it; the commit message would be “adding our first file.” So, once you’ve committed, go ahead and push it. I’ll push it to the master branch; again, it will ask you for the username and password, so let’s paste them. Great. So now the data has been pushed to the master branch. If you go to CodeCommit and simply click over here, you will see that the test.txt file is present. Along with that, you now have additional options in CodeCommit, such as pull requests. You have the option of viewing the commits that were made; you can see the commit message that we had written, that this was the first commit. You also have options for branches and tags, and within settings you have options for notifications, which is pretty helpful: you can integrate it with SNS so that whenever a commit happens, you can get an email or an SMS notification. So this was a very high-level overview of AWS CodeCommit. I hope this video has been useful for you, and I look forward to seeing you in the next video.
50. Business Intelligence and Data Warehouse
Hey everyone, and welcome back to the Knowledge Portal video series. In today’s lecture, we will be speaking primarily about the data warehouse, but in order to understand the need for a data warehouse, we also have to understand business intelligence. So let’s go ahead and understand both of them with a simple use case. Business intelligence basically combines data technology, analytics, and human intelligence to provide insights that result in more successful business outcomes. Now, from this definition alone, it is not very clear what exactly I mean, so let’s look at a use case where we need to understand the users who are visiting a website. So this is basically for business purposes. Let’s take a look at the points that will be relevant in this use case. First, I want to see the number of page views on my courses. Since, as an instructor, I record courses and post them online, what are the things that I will need in order to better understand my audience? First and foremost, I’d like to see the number of page views on my courses, primarily to determine which course is rated higher than the others in terms of page views. The second is to determine the sessions based on the country of origin. This is also very important, because I need to know from which country people are visiting: are they visiting from the US, from Europe, from India? And this actually helps in marketing as well.
So whenever you do Facebook ad marketing, you can target based on countries as well. From the second point, if I know that most of the users who are coming to my course are from, let’s assume, the United States, then whenever I do my Facebook marketing, I would target it at the largest group of users, those of US origin. So that is the second point. The third point is to determine the session based on the device from which the user is accessing the site: whether they are visiting from laptops, mobile devices, or tablets. This is also a very important factor, because if most of my users are visiting from mobile devices, then the content that I put in the course should be compatible with mobile devices as well. It’s not acceptable if a video I record plays well only on laptops; it should play well on all kinds of devices: laptop, tablet, and mobile. So this specific point also helps a lot. The fourth point is where the traffic is coming from. This does not mean the country; this means the referring site. Is the traffic coming from Google, or are people directly opening the website? Is it coming from Bing?
Is it coming from email marketing? Or is it coming from some articles that are posted on Quora or some other forum? This is also very important, because let’s assume most of the traffic is coming from email: that means I have to continue and increase the email presence among all the users as far as marketing is concerned. So this is a very important point. The next critical point is the session duration; it is also important, when someone comes, to see how long they watch the video or stay on the website. And one last important point that also helps is the age and gender of the users who are visiting. So these are some examples of important pointers when a business wants to understand the users who are visiting its website. Now, this has been mostly theory, so let me actually show you what it would look like. I’m in Google Analytics, looking at the data for the AWS Security Specialty video course that we launched six to seven months ago. You can see the number of users has increased by 30%. It shows the sessions, it shows the bounce rate, and it also shows the session duration, meaning how much time the user spent on that page, which here is around six minutes on average.
Now, within the traffic channels, it actually shows you a good amount of information. For example, you’ll notice that some of the traffic is direct, some is from email marketing, and some is from other referrals, which could be some kind of forum, and so on. So, for example, there are certain forums, such as Quora, from which users visit my courses. It also shows you a lot of information like the countries the users are visiting from; most are from the US, followed by India. It tells you about desktop, mobile, or tablet usage, and it actually gives this information on a weekly basis as well. So this is a good amount of information if you consider that this is essentially what a business would require. Basically, Google Analytics is a great platform that gives you a lot of information related to the users who visit your website.
It also shows you some interesting information related to the age, probable gender, and probable interests of the users who are visiting your website. So Google Analytics is a great platform to look into the nitty-gritty details related to the users who are visiting. And this is what business intelligence is all about. Once I know the answers to all of this, I can optimise my marketing in a much more effective way, so that I can advertise my courses to a large audience that might like them and maybe even purchase them. This is how it would typically look in any organization. So this entire part comprises business intelligence. In short, business intelligence is the act of transforming raw data into useful information for the purpose of business analytics. With Google Analytics, it’s not as if I directly get the graphs; there is some kind of raw data that Google Analytics is capturing and then transforming into those nice little graphs. And based on that, a lot of analytics are performed, like how many users are visiting and how many sessions there are.
So this is the analysis that is performed on the data that Google has received; that is the essence of business intelligence. Now, a working business intelligence system is basically based on a data warehouse. BI systems, which are based on data warehouses, extract information from various organisational sources. There can be various sources of information: for example, I have put my course on five websites, and in order to do proper business analytics, I need to receive the data from all five websites so that I can know how many users are visiting in total. So that is the first thing. The second thing is that the data is then transformed, cleaned, and loaded into a data warehouse. Once you receive the information, it is possible that the format of the information from one website differs from that of another, such as Udemy or StackSkills. These are big websites, and the format that Udemy may provide me with and the format that StackSkills may provide me with may be completely different. And if I want to do analytics, I need to have the data in a consistent form.
So whenever I receive data, I transform and clean it according to my requirements, and then I load it into the data warehouse. Then that data, which is stored in the data warehouse, is used to perform analytics and is presented in a nice graphical manner. The key point is that one of the major advantages of a data warehouse is that data from multiple sources is integrated into a single platform, which makes it much easier to visualize the data for analytics. So over here I have various systems: operational systems, log files, ERP, CRM, and flat files. All of these systems send their data through ETL. ETL is where the extraction, transformation, and loading take place. As an example, Udemy and StackSkills may provide different formats of data, so I have to first extract the data from those websites and transform it according to my needs, because I might not need all the data; I just need certain columns or certain rows.
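As a toy sketch of that extract-transform-load idea in Python (the file names, columns, and table here are all hypothetical, purely for illustration):

```python
import csv
import sqlite3

# Extract: read raw exports from two sources with different formats (hypothetical files).
def extract(path):
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

udemy_rows = extract("udemy_export.csv")      # e.g. columns: course, views, country
partner_rows = extract("partner_export.csv")  # e.g. columns: title, page_views, geo

# Transform: normalize both sources into one common shape, keeping only what we need.
def normalize(row, course_key, views_key, country_key):
    return (row[course_key], int(row[views_key]), row[country_key])

records = [normalize(r, "course", "views", "country") for r in udemy_rows]
records += [normalize(r, "title", "page_views", "geo") for r in partner_rows]

# Load: write the cleaned records into a single warehouse table (SQLite stands in here).
conn = sqlite3.connect("warehouse.db")
conn.execute("CREATE TABLE IF NOT EXISTS page_views (course TEXT, views INTEGER, country TEXT)")
conn.executemany("INSERT INTO page_views VALUES (?, ?, ?)", records)
conn.commit()
conn.close()
```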
So, as the sketch illustrates, I transform the data and then load it into a central database called a data warehouse. From this warehouse, I use certain tools, like Tableau, which can query the data warehouse and render the nice graphs that you see over here. So all of this data that you see is historical, day by day, and such data is typically stored in a data warehouse. This is one very important point to remember. Now, some of you might ask: what’s the difference between a database and a data warehouse? At a high level, a data warehouse contains data from multiple systems; those can be operational systems, databases, or flat files as well. Now, you can definitely perform analytics from a database too, but one of the points that we should remember is the difference between a relational database and a data warehouse. A relational database is referred to as OLTP, because it is more concerned with transactional data, whereas a data warehouse is referred to as OLAP, because it is more concerned with analytical data. Look at the last letters: “T” stands for transactional, and “A” stands for analytical. A relational database basically contains the latest data from the website, while a data warehouse contains the historical data.
Most likely, the data warehouse does not contain the most recent information. A relational database is used for running the business: if the database goes down, the entire business goes down in most cases, while the data warehouse is more about analysing the business. The third important point is that relational databases are generally used for both read and write operations, whereas data warehouse operations are mostly reads, because the goal is to read and analyse the data that is there. Last but not least, the number of records typically accessed in a relational database query is limited to tens or maybe hundreds, while a data warehouse query can touch millions of records. In organizations where I used to work, we had a data warehouse, and the queries that the analytics team put in took something like 10 or 12 hours to run, because there were millions of rows involved in the query.
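To make that contrast concrete, here is a hedged sketch against the hypothetical page_views table from before; the first query is the OLTP style, the second is the OLAP style:

```sql
-- OLTP style: touch a handful of rows for one transaction (latest data).
SELECT views
FROM page_views
WHERE course = 'aws-security-specialty';

-- OLAP style: scan and aggregate historical data across millions of rows.
SELECT country, SUM(views) AS total_views
FROM page_views
GROUP BY country
ORDER BY total_views DESC;
```

Now, when we were discussing the diagram, there were two important components: one is the ETL, and the second is the data warehouse.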
Now, there is great software that is generally used for both of them. So let me just show you: when it comes to data warehouses, I have a nice little diagram of the data warehouse products that are generally used. According to the Gartner rankings, the leaders are the most widely used data warehouse products; there are a few established enterprise vendors at the top, and you also have Amazon Redshift. I’ve seen most organizations, particularly startups, use Redshift, because it’s adequate, provided by Amazon, and relatively inexpensive to work with. The enterprise vendors are also quite good, but they are generally aimed at enterprises, which is what I believe, while Redshift is used more by small to medium startups as well as larger organizations. So this is what the data warehouse is all about. Now, there are also tools that perform the ETL-related functionality; these are the top three tools, but there are a lot of paid tools that do the same thing in a much better way.