1. Demo: Azure Monitor and Alerts
In this demonstration, we’re going to take a look at Azure Monitor and Azure Alerts from the Azure portal. In the portal, there are a couple of ways to access monitoring. First of all, we can access it from our favourites by clicking Monitor, which is there by default. If you don’t have it there, simply click All Services, type in Monitor, and that will bring up Azure Monitor so you can click into it. While we’re here, you’ll notice that you can monitor and visualise metrics, query and analyse logs, set up alerts, and take actions, just like we did in the previous lecture. But before we dive into it from this point of view,
it’s also worth noting that if I want to get information from a specific object (perhaps I’ve already got a virtual machine built and I just want to look at its metrics without going to the monitoring view), I can go over to my virtual machines and select my VM. In my case, I’ve got a VM that’s already built here, VM01, and on the left side, if I scroll down, you’ll see Monitoring, as well as Alerts, Metrics, and so on. So there are a couple of places to get to it: directly from Monitor or directly from the object itself. But we will start with Monitor. So we’ll click Monitor, and on the right-hand side, let’s explore metrics first of all, which opens up the Metrics pane.
Now, sometimes it’s a little hard to see in the dark scheme, so you can go to your settings and change it to a lighter scheme inside of Azure, whatever your preference is. Once we’ve got that set up, what I need to do is choose the subscription I want to pull my metrics from. In this demonstration, I only have one subscription, Visual Studio Enterprise One, so that’s where I’m going to pull my metrics from. I need to choose a resource group, so I’m going to choose the one that my virtual machine is in, which is AZDemo. I could select all if I wanted to. Then I choose the resource types I want to pull from as well, so I could narrow this down to the specific services I care about; perhaps I just want to look at virtual machines. In my case, with the subscription I’m looking at, I don’t have much to filter on, so I’m going to select everything there and then select our virtual machine, which is right there at the bottom, VM01. This will pull up the metrics for that specific object.
In my case, I’ve got a few things there: CPU credits consumed and remaining, which are specific to B-series VMs, so there’s nothing you need to worry about right there. Then there are other metrics, such as data disk reads and writes, and further down, network in and out, and queues. There’s a very common one that we want to look at, which is Percentage CPU. So I can choose the CPU percentage, and you can see from the 6% figure that this VM isn’t doing much. You can see along the graph what it did in the past and what it’s doing right now. The blue dotted line is a prediction: I just powered this machine on, so it doesn’t really have history for that area. So I can zoom in a little bit by saying I don’t want to look at the last 24 hours; I can select the last 30 minutes instead. Then we’ll start to see something that makes a little more sense for a virtual machine that’s just been powered on.
It still has very low CPU utilization. I can go back in and click this VM’s Percentage CPU. I can also change the aggregation, so I don’t have to just look at the average; I could look at the min or the max. These are all options available to you, so you can choose how you want to display that information. You can also choose whether you want it in a line chart or, say, a bar chart, so you can look at the CPU over time depending on how you’re troubleshooting. Maybe you’re in a war-room situation; this is where you can get different views of that data and present it differently. Now, perhaps I want to put this on my dashboard. Perhaps I really care about CPU on this VM and it’s something I’m looking at a lot. I can click Pin to Dashboard and either pin to the current dashboard or select another dashboard. I’m just going to pin it to the current one and click Pin to Dashboard. I could also click Dashboard on the left-hand side here, but this is a shortcut. So I’ve clicked Pin to Dashboard, and here is my dashboard. I can make new dashboards if I want, but if I scroll to the right, I can see the widget that is now on my dashboard. And if I want to, I can edit this dashboard as well: I can move this widget around and put it in another section. It’s really just a canvas for you to play with and put whatever monitors you care about on there. To create a new dashboard, I click the plus sign. I can also upload and download dashboards, which is really handy for sharing things between colleagues. But that’s basically the dashboard created there. I’ve got my CPU metric, all very well and good, but what happens if I want to see a little bit more? Let’s go back over to the virtual machine itself and select VM01; I don’t really want to go through Monitor every single time.
So I’ll scroll down on the left-hand side here, and I’ll be able to check metrics directly from there. You can see the source is already filled in, the namespace is already filled in, and I just choose my metric. Going back even further, if I scroll to the top of this area and select Overview for my virtual machine (and this is the same for a lot of other Azure services as well) and then scroll down on the right side, Microsoft has created pre-canned widgets associated with the object that it thinks you might want to look at. So the CPU average, the one we just created manually, is available there. If I want to look at network throughput, that’s available on the right. The same is true for disk operations. And again, if I want one of these on my dashboard, let’s say the network one, I can simply click the Pin option, and that pins it directly to the dashboard for me as well. Now, what if I want to start alerting? Again, we’ll go to the monitoring view, so select Monitor, and this time we’ll click Create Alert. This is a slightly different screen. What we need to do is create rules, first of all. So if I want to create a rule for that virtual machine, perhaps for when the CPU exceeds 90%, I will select that resource.
So this is the resource I want to monitor. First, I filter so that I’m looking at virtual machines, I choose VM01, and I click Done. That is the object I’m concerned with right now. Next is the condition: basically, what signal do I want to alert on? In my case, it’s got some suggestions for us; I want to choose the CPU percentage, and then I can choose to show the history over a longer period of time if I want to. Right now it’s just showing the last 6 hours if I scroll down. So there’s the metric graph, much like we saw when we created the metric chart previously. But here is the alert condition: the condition is ‘greater than’, and the time aggregation offers the same options as before: average, maximum, minimum, and total. So I can average it out and say, okay, any time it goes over 90%. Whenever the CPU is greater than 90% averaged over the last 5 minutes, checked every 1 minute, that’s what I want to take an action on. Now if I scroll further down and look at action groups, this is what I want to do with that alert. Let’s say the alert is fired: who do I want to notify, and what do I want to do? I have to create an action group if I don’t already have one. I could choose an existing action group if I had one, but I’ll make a new one now and call it “Team” as an example.
And I’ll give it a short name, say “EmailIT”. Here I’m basically going to choose the resource group where I want to keep this action group; I can keep it in the default activity log alerts resource group, which it will create for me. Then I need to choose the action. Let’s say we want email as our first action, and we choose the action type. These are really important, as you can see by the number of them here: we’ve got email, SMS, push, and voice.
So if I select this one, this is where I would come in and say, okay, I want to email [email protected], just a fake email I’m putting in there right now. I could choose to text someone instead, et cetera; all of those options are available there, and I’m just going to choose email for now. But I could also go in and do other things. Let’s say I wanted to create a ServiceNow ticket, an incident ticket, because this was a really important alert for me and I want to make sure we document it in our ITIL system. I would choose the ITSM connector in this example, and then I’d use the ITSM connection that I’d created separately as part of, say, a ServiceNow integration. So that’s another option there. In addition, you also have the option of automation runbooks, and these are key; I often see a lot of people starting to use these now. You can choose to run a runbook that you already have, and there are some already created for you: restart, stop, scale up, scale down, and remove VM.
So perhaps on this particular virtual machine I’ve hit the CPU alarm, and I want to change it to a larger instance size with more CPU; I could do that as well. So there are a lot of options here in terms of the action group, not just sending the email. I could have multiple actions: I could email somebody, create a ticket, and restart the VM if I wanted to as well. All of these things require you to decide what you want to do based on your specific situation, but you have many options. In my case, I’m just going to choose the email option and click OK. It’s going to create a new resource group where it will put those activity log alerts, and then it will create the action group for me. Then all I need to do is give the alert rule a name. I’d like to say “CPU > 90%”, but I can’t have those characters in there, so we’ll call it “CPU Over 90”. I also need to go back and select the action group I just created, since I hadn’t selected it yet. Then I add a description for the alert, something like “the CPU went over 90%, help”, or whatever you want to put in the incident description. Obviously, you’d be more professional in an enterprise scenario. Then you choose the severity.
This is how important the alert is: if it’s a Sev 0, it’s normally mission-critical; if it’s a Sev 4, perhaps it’s not that critical for you. Once you’ve got all that set, go ahead and enable the rule upon creation and create the alert rule. This will take a little bit of time. It’s essentially going to continuously monitor that virtual machine, and any time that metric is exceeded, it will fire off that email. You can see it’s actually already been completed for that specific one; others do take a little more time. And with that, that concludes alerting and metrics. The only thing left to cover is diagnostic logging. So if we go into virtual machines again, click VM01, scroll down to Monitoring, and choose Diagnostic Settings. When we look in here, we can see that Azure monitoring collects host-level metrics for CPU, disk, and network by default. But what it doesn’t show us are guest-level metrics. These are the metrics you’d see if you logged into a Windows server and opened Performance Monitor: the application logs and performance counters reported by the Windows operating system itself. The host-level metrics are what Azure sees from the hypervisor; the guest-level metrics are what the operating system itself sees.
So we can choose to enable these guest-level monitoring metrics as well. You’ll pick a storage account; this is probably one that was already created when you created your virtual machine. You may want to centralise all of the guest-level metric logs into one storage account; that’s a decision you will need to make as part of a wider monitoring strategy. But for right now, I’m going to select Enable Guest-Level Monitoring. That updates the settings for that specific VM, allowing it to begin gathering additional data from inside the virtual machine and reporting those logs in as well. Once this piece is complete, you will get additional metrics.
So when you go into the metrics section and scroll through, you will see not just the host metrics but also the guest-level metrics, and you can start to choose whether you want to see what Azure is reporting or what the operating system is reporting. And with that, this concludes the demonstration. This should give you a great base in terms of monitoring. For the exam, we definitely need to know how metrics work, what aggregation options you have (average, min, max, and so on), how you can adjust the time period, and how action groups work. You’ll most likely be asked about action groups: you might be shown a screenshot and need to recognise that it’s going to send an email or go after the ITSM connector. All those things are really important for you. But with that, this concludes the demonstration.
2. Lecture: Log Analytics
To get started, let’s take a look at some of the key features of Log Analytics. The first one is that it plays a central role in monitoring. It’s not necessarily the tool that’s doing all the metrics and performance data gathering; that’s typically done by other systems, maybe Azure PaaS services, maybe your Windows event logs, things like that. But everything is funnelled into Log Analytics, and the central role it plays is bringing all of this data together. The next thing is data sources, and we’ve also got other Log Analytics sources, which we’ll combine when we talk about this. Data sources are things to which Log Analytics can be connected.
So these could be virtual machines that you want to plug in and get data on: syslog events, Windows log events, performance events, et cetera. And there are also other Log Analytics sources, things like Security Center and App Insights, which use their own Log Analytics under the covers as well. So these are two sets of sources that you can use, whether you’re manually saying, “Hey, I want to get data from this machine,” or whether you want to ingest data from other Log Analytics sources native to Azure. Then we’ve got the concept of search queries, which you’ll see a lot of in the demonstration, so again, I encourage you to check that out because it will talk everything through. Once you’ve got all that data in there, we run queries on it to then output the data, and there are a variety of output options available to us, as you’ll see. In fact, if we look at the log search use cases on the left-hand side, we see things like Azure Monitor, virtual machines, and Operations Manager, which is SCOM if you’re using that on premises; that stands for System Center Operations Manager.
You’ve got Application Insights, Azure Security Center, PowerShell, and the Data Collector, all these different sources of data that come in; the incoming data is automatically indexed, and data types and tables are automatically created. Then the data is available for log searching and smart analytics across multiple channels. The channels on the right are for designing and testing queries in the Log Analytics engine, visualising data in the Azure portal, and taking a Log Analytics search query and designing it so that it triggers a logic app, sends an alert, or goes to Power BI, so we’ve got a whole visualisation story there. You can see the role it’s starting to play: it’s the middle piece that brings all of this together. To expand on that, if we look at the architecture on the left-hand side, we have our data sources as well as the concept of solutions.
Solutions are add-ons, including third-party ones, that you can plug into Log Analytics. In fact, it’s kind of a marketplace where you can go in and add things like Network Watcher and other services that are out there. All that data comes in and creates records, you perform your log searches against those records, and then you say, “Okay, I want an alert on that, I want a dashboard for that log search, I want Power BI,” et cetera. If we go a little further and break out the data sources, you’ve got logs and performance there plus those solutions, and you can see that data comes in and creates those records, but they sit in a series of tables. If you look at it again: the Windows event log goes into the Event table, syslog goes into the Syslog table, and agent check-ins might go into Heartbeat if we’re just checking whether something is online or not; alerts, custom logs, and alert rules have tables of their own as well. You can see they’ve got all these tables there, and as you will notice (this will be expanded upon when you go through the demonstration), you’ve got this concept in the query where it says “Event” and then “union Syslog”, so you can query across multiple tables. And again, that will become more important a little bit later on.
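That “Event union Syslog” idea can be written out as a short query. This is a sketch, assuming both Windows and Linux agents are reporting into the workspace; it counts recent log entries per computer across both tables:

```kusto
// Query across two tables with union: Windows event log entries
// (Event) plus Linux syslog entries (Syslog), last hour only.
Event
| union Syslog
| where TimeGenerated > ago(1h)
| summarize Entries = count() by Computer
```
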
But it’s worth noting right now, if we look at those data sources, a summary of the ones that are available to you. You’ve got the concept of custom logs, which are text files on Windows or Linux agents containing log information; they go into a table named LogName_CL. We’ve got Windows event logs, which go into the Event table; Windows performance counters, which go into Perf; Linux performance counters, which also go into Perf; IIS logs, which go into W3CIISLog; and syslog events, which go into Syslog. The last thing we’ll leave you with in this lecture are some search query fundamentals, which I encourage you to write down or commit to memory, as they will help you tremendously when you’re getting started with Log Analytics. To begin with, start with the source table that you’re focused on. So if you’re focused on event logs, start with the Event table, then follow on with a series of operators. Perhaps I’m doing a query where I want events on a specific computer.
That computer would be my first operator, to set the condition. Then perhaps I want a specific event ID on that computer. I use pipes to separate out additional operations, so I’m starting to narrow down my search within that particular Event table. If I then want to join other tables and workspaces, I just need to use the union command that I showed you as an example earlier, and then I can search across multiple tables.
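The fundamentals above can be sketched as a single query: start with the source table, then narrow it down with piped operators. The computer name and event ID here are illustrative assumptions, not values from any real environment:

```kusto
// Start with the source table (Event), then filter step by step.
Event
| where Computer == "WinVM01"      // hypothetical computer name
| where EventID == 7036            // hypothetical event of interest
| where TimeGenerated > ago(24h)   // limit to the last day
```
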
So perhaps I’m looking at errors on IIS servers that are showing a high CPU performance metric. The performance metric will come from the Perf table, and the Application Insights tables will perhaps provide the transaction times for the users accessing my website. But I can combine these using unions and joins, so I can get a query that says: okay, when the transactions hit a certain threshold and the CPU is this high, I’ve got a correlation here that I can work with. And with that, this concludes this lecture, and hopefully it gives you the fundamentals you need. Questions are definitely starting to appear for Log Analytics on the exam, so make sure you know how to query around, especially as you hit the upcoming demonstration.
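The correlation described above might look roughly like the following sketch. Note the assumptions: the AppRequests table and DurationMs column come from workspace-based Application Insights, and the 80% and 5-minute thresholds are arbitrary examples, not recommendations:

```kusto
// Hedged sketch: correlate high CPU (Perf) with slow web requests
// (AppRequests) in matching 5-minute time buckets.
let HighCpu = Perf
    | where CounterName == "% Processor Time"
    | summarize AvgCpu = avg(CounterValue) by bin(TimeGenerated, 5m)
    | where AvgCpu > 80;
AppRequests
| summarize AvgDurationMs = avg(DurationMs) by bin(TimeGenerated, 5m)
| join kind=inner HighCpu on TimeGenerated
| project TimeGenerated, AvgDurationMs, AvgCpu
```
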
3. Demo: Log Analytics
In this demonstration, we’re now going to take a look at Log Analytics in the Azure portal. If you’re unfamiliar, Log Analytics has recently undergone a number of changes. If you looked at this a few months ago, before early 2019, Log Analytics was accessed primarily through something called the OMS portal. Microsoft has since changed all of that, and everything has been migrated and blended into the Azure portal as a whole, which makes it a lot easier to get to. So let’s head over to the Azure portal, and I can show you how to get started there.
Now in the Azure portal, there are a couple of things to be aware of, as you probably saw when you looked at monitoring earlier on. We can go to Monitor, and there is a section here called Logs, and this is essentially where we can query and analyse all the logs coming in using Log Analytics. If I click Logs right away, you’ll see the workspace for Log Analytics open up on the right-hand side as it loads. Now, you might already have a default workspace in there. In my case, I’ve already pointed it at the Log Analytics workspace I’ve created; I’m going to show you how to create one of those in just a second. If you’re unsure, simply click the small settings icon and select the Workspace option. This is where you can change the subscription and workspace that you’re connected to if you so desire. But with that said, let’s go ahead and create a brand new Log Analytics workspace. This is what you’ll connect data sources to.
This is where your logs and data come in, and then you can query them and create graphs and nice charts and everything else from there. So let’s go ahead and do that. We’ll go to Create a Resource, type in Log Analytics, and click Create. The first thing we see says, “Create a new one or link to an existing one.” If we want to, we can link to an existing Log Analytics workspace that we have out there. I don’t have one in this case, so I’m just going to create a new one. If I click Create, I will be prompted for the name. I’m going to call this one “AZDemoTemp”, because I actually already created the workspace we’ll use for the demo, but I want to show you how this gets created, so I’m just making this one temporary. I choose my subscription, because Log Analytics does incur charges, as you’ll see in a second. Then I choose the resource group, or just create a new one; I’ll call it “TempLogAnalyticsRG”. I choose the location where all the data is going to be stored, so I’ll just use East US, and then my pricing tier. Historically, there were a number of pricing tiers, but the new arrangement is basically per GB: you just pay for the storage utilized, a simple pay-as-you-go model.
Now, with that said, just click OK, and that initiates the deployment, which will take maybe a minute or so. While that one’s deploying, let’s go and look at one that I’ve already created. I’m going to scroll up to resource groups, and you’ll see I’ve got a resource group here called AZLogAnalytics. If I click into that one, you’ll see what I’ve got inside: a couple of VMs I’ve already pre-provisioned. I’ve got a Linux VM and a Windows VM, which I’m going to use to show you some of the logs and performance reporting. And if I scroll to the top, this is my workspace: AZDemoWorkspace, a Log Analytics workspace located in East US. If I click on that, it’ll take me to my Log Analytics workspace. On the right, you’ll see Getting Started with Log Analytics. It covers things like connecting a data source, configuring your monitoring solutions, and then maximising your experience by searching and analysing logs and managing alert rules. We’ve already covered alerts and action groups, which you can access from Log Analytics, and we’ll spend some time in this demonstration searching and analysing logs. But before you can search and analyse logs and get anything meaningful, you do have to connect data sources to Log Analytics. At this point, you’ve just created an empty workspace.
There’s no data being fed into Log Analytics yet. So, scrolling down on the left-hand side, the first thing we want is Workspace Data Sources. To begin with, I click on Virtual Machines, and you’ll see here that I’ve got LinuxVM01, not connected, and a Windows VM, WinVM01, that is linked to this workspace. If you had lots of machines, you might connect them all to the same workspace, or you might connect them to other workspaces. The good thing is you can query across workspaces; you just have to specify the other workspace when you get to that point. But the major point being made here is that this is where the logs are being sent to. So this Windows VM has already been configured to send logs to this workspace. Now, how do we configure it? Well, let’s do it for the Linux VM, because that one’s not connected yet. If we click it, you’ll see there’s a simple item at the top that says Connect; notice how it says the VM is not connected to a Log Analytics workspace, with a Not Connected status. If I click Connect, it will go about connecting that VM, and essentially an agent runs on the machine that pushes the data up to Log Analytics.
So we’ll let that one go ahead. The good news is that we already have the Windows one connected, so we can start pulling data from it. Now there are a few other things we need to configure as well. Aside from connecting VMs one by one, you might not want to go in and click every one of these each time; you can automate that in a couple of ways. You could do it using Azure Automation or, as a lot of people do, via an Azure Security Center setting that ensures the VMs send their log data in as well. But with that, let’s jump to the top a little bit now. I’m going to go up to Advanced Settings, because it’s one thing to connect the VM to Log Analytics, but it’s quite another to tell it what data to collect. Under Connected Sources, you can see one Windows computer connected; Linux should still show zero because that connection hasn’t completed yet. The manual steps to configure it are here as well: you can download the agent, log into the machine, install the agent, and then connect it using these keys if you want. Again, that’s how we connect services in. And you can see that LinuxVM01 has completed while we’ve been going through this demo. Now if we go to Data, the next thing we can see here is Windows event logs, Windows performance counters, Linux performance counters, et cetera, and we can choose the logs that we might want to send over.
So perhaps we type in “System” and send the Windows System log across; we press the plus sign here, and now it’s going to send the error, warning, and information entries from that log across as well. We could type “Application” and get the Application log sent across too. Just remember, this is the same as on a Windows server: if you log in and go to your Event Viewer, you’ll see those logs there. It’s the same concept, basically, for performance counters. If I click down to Windows performance counters: remember, we’ve already got the data that Azure can monitor from the hypervisor in our metrics section, but what if we want additional data like logical disk counters, memory, and so on? These are all OS-based metrics that we want to get from inside the OS. We can simply click Add for the selected performance counters (here they are), select the interval at which we want to collect them, and they will be added as well. If we go down to Linux performance counters, we first click the little box at the top, and here we’ve got a whole bunch of Linux counters that we can add as well. And again, we can continue on for other counters and logs that we want to add.
And when we’re done, we simply click Save, and that will adjust those settings for us. And there we go: the configuration saves successfully. If we scroll back over to the left, I’m going to drop all the way down on the left side now. I know we’re jumping around a bit, but it’s just the nature of the menu system here. So, once again, under Workspace Data Sources we have Virtual Machines, which is where I connected those VMs, while Advanced Settings is where I chose the type of data I want to collect from those sources. But there are other sources we can have as well, like storage account logs and the Azure Activity Log. This is a very common one; a lot of people use the activity log. You can see here, here’s my subscription, and if I want to get the activity log for that entire subscription, I can click it and connect that in as well. Now Azure is going to send those activity logs, all the things that happen in that subscription, up to Log Analytics too. As you can see, it’s very powerful, but it can quickly become very large as you connect more and more data sources. You can have custom data sources as well.
So now you’ve got a lot of data going into Log Analytics; that’s step one. And if you are trying to follow along from a demo perspective, make sure to do this: connect a bunch of VMs, maybe run some performance-intensive application on them so you get some data sent in, and wait 30 minutes or so before you proceed with the next things I’m going to show you. So, if we return to AZLogAnalytics and enter our workspace again, we’ll go to the Logs section; I’m going to scroll down and select Logs. Again, I could go to Azure Monitor on the left-hand side, click Monitor, and click Logs from there; I’m ultimately getting to the same place, I just need to make sure I’m looking at the same workspace. There are lots of ways to get to the same information, and that can be a little bit confusing at first, so don’t worry; just make sure you’re in the correct workspace to begin with. So here we are. I’m going to scroll to the right a little bit, and you can see this is the query area where I can ultimately run queries. Microsoft now displays a number of example queries below, for things like heartbeat, performance, and usage, and if you want more information, there are tutorial sections on the right as well.
But I’m going to start you off with some simple things to try, first of all, to get used to the platform. I’m going to zoom out a little bit here, and then let’s go in and type a query. Let’s start with a very basic one: we can type “Perf” if we just want to look at the performance logs, and you’ll see IntelliSense starts to complete that for you. If I just type “Perf” and click Run, you’ll see it gathering the data, and here are all of the different counters: you can see the logical disk on the Linux VM, disk transfers, and so on. Next, I can simply type in “Heartbeat” as an example. The heartbeat tells us whether the agent has checked in. So again, just Heartbeat, nothing else after it; as you type, IntelliSense will fill it in for you if you tab through it. Make sure you don’t have anything else there, just Heartbeat, then click Run, and you can see the data it returns. I can click into one of these records, scroll down, and we can see what machine it came from, WinVM01, and here’s the IP address of that machine. If we scroll further down, we can see the type, Heartbeat.
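Once the performance counters are flowing in, a slightly more useful Perf query summarises the data rather than listing raw rows. This is a sketch assuming the default “% Processor Time” counter was added in Advanced Settings earlier:

```kusto
// Average CPU per machine in 5-minute buckets, newest first.
Perf
| where CounterName == "% Processor Time"
| summarize AvgCpu = avg(CounterValue) by Computer, bin(TimeGenerated, 5m)
| order by TimeGenerated desc
```
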
So that’s our heartbeat data. Well, that’s not super useful just by itself. I mean, if you just want to verify you’re getting data at all, the heartbeat is great, and you can certainly choose to put it in a table or other formatting if you want to. You can also filter by time range at the top, which is probably a little more useful. Let’s say we just want the heartbeats from WinVM1. So I’ll type Heartbeat and pipe to where, and I can tab through that; then Computer, tab through again; then ==, which means “is equivalent to”; and I can put quotes around WinVM1. And let’s just check the spelling, because it is case-sensitive. So, yes, WinVM1 appears to be correct, and click Run. And now I have that data, and it’ll be all of the heartbeats for WinVM1. Okay, so we’ve started to get something out, but what if I say, “Okay, I just want the heartbeats where TimeGenerated was in the last hour”? So again, we can pipe that again, and I can say where TimeGenerated; we can tab through, because it does help you here.
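Put together, the filtered query from this step looks like the following; the machine name WinVM1 is just the one used in this demo, so substitute your own:

```kusto
// Only the heartbeats from one machine; the string comparison is case-sensitive
Heartbeat
| where Computer == "WinVM1"
```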
We can do greater than ago, and then enter 1h for one hour and click Run again. Let’s see what happened; it should be a lowercase h. So we fix that and click Run again. And we can see here are all the heartbeats, and if we look at the timestamps, they’re all within the last hour. Okay, so that’s useful if we’re just trying to troubleshoot an agent checking in, or maybe troubleshoot the service itself. But something that’s very common is events. So let’s look at events. Again, we can type in just Event. And actually, I do this every time I’m trying to create new queries; I just query the table I’m trying to pull from to make sure I’ve actually got data. So, as you can see, there’s only one event actually coming in right now from WinVM1. Well, it’s a new machine, and not too much has happened with it. But I do have an event ID that I can use to show you something a little bit more advanced. So I can say, “Well, what if I just wanted events where EventID equals equals?” So, where EventID == 7036. And if you’re not sure about that event ID, it’s a common one; it just shows that a service has entered a running or stopped state. In this case, it’s the WMI service in Windows itself.
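The two queries from this step look like the following; 7036 is the standard Windows System log event ID for a service entering the running or stopped state:

```kusto
// Heartbeats generated within the last hour
Heartbeat
| where TimeGenerated > ago(1h)

// Windows events for a service entering the running or stopped state
Event
| where EventID == 7036
```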
I can click Run on that, and that will pull up that log again. I’ve only got one right now, but if you had lots and lots of events from Windows being pulled in there, then those would come in for you as well. Okay, so that’s events and heartbeats. Let’s go back to performance, because that gives us a little bit more data to actually play with. So if we go to Perf now, then again, I can do where TimeGenerated is greater than ago 1h, and then do where CounterName equals equals, then in quotes: percent, space, Processor, space, Time; making sure the spaces between the percent and Processor, and between Processor and Time, are correct, and then click Run. And as you can see on the bottom here, all of our counters for processor time have counter values for the instance they apply to. So you can see LinuxVM1 here, and we can tab through here. If we continue, you’ll notice that we have some WinVM and LinuxVM entries, each with a different processor CPU utilization.
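The full performance query from this step, combining the time filter and the counter-name filter (the counter name must match exactly, including the spaces):

```kusto
// CPU utilisation samples from the last hour
Perf
| where TimeGenerated > ago(1h)
| where CounterName == "% Processor Time"
```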
Now that’s not super easy to view in a table, so perhaps we want to render it as a time chart. So what you can do now, which is a little bit more advanced, is pipe this to summarize, and then we can do an average. So we’ll say avg, and here I’ll choose the column that I want to actually aggregate on. So this is CounterValue, which is this column that you see here, which actually holds the value of CPU time. Now I can break it down per computer, so I can say by Computer, and then we can give it a bin size; let’s say we want every 15 minutes, so bin TimeGenerated by 15m. And then we can say, “Okay, we want to render a time chart out of all of this.” We’ve added a few extra things: summarising the average counter value by computer, using a 15-minute interval. So let’s give that a go and run it. Okay, we’ve got a syntax error here, so let’s see where that is. It looks like I just forgot to close a bracket right here. So we’ll put that one in and click Run, and there we go. Here’s our visualization, and as you can see, it’s currently dominated by LinuxVM1, with little data for WinVM1.
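The finished query with the summarize and render steps added looks like this; the 15-minute bin is the interval chosen in the demo:

```kusto
// Average CPU per computer, in 15-minute buckets, drawn as a chart
Perf
| where TimeGenerated > ago(1h)
| where CounterName == "% Processor Time"
| summarize avg(CounterValue) by Computer, bin(TimeGenerated, 15m)
| render timechart
```

Changing the bin from 15m to 1m is all it takes to zoom in, which is what the next step does.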
You can see the little dot down here, and that might just be that it hasn’t got enough data yet. Now we can also zoom in a little bit, so we can change the bin on TimeGenerated to be just 1 minute. And the graph will start to form a little bit better, as you can see, because now we’re breaking it down into 1-minute intervals. And we can see there’s some CPU activity within those thresholds that wasn’t visible when we just looked at the 15-minute bucket averages as a whole. So with that, that’s how you query logs. We’re not going to go through all of the query language just yet; this will definitely get you what you need for the exam. Just know how to read these queries, understand how they work, and then you can basically go from there. Now let’s move on to the alert rules themselves. What you can do is take the query that you have and immediately go in here and click New Alert Rule. So any queries that are out there, anything you find on the web that other people have created around Log Analytics queries, you can basically bring in here, just like we did previously with alerts and action groups.
The only difference is that your source will now be the workspace. You’ll need to define the conditions that trigger. So you can click in here and decide what the signal logic is that you basically want to alert on. So again, right now, if you had a query open, it’s going to take the last one you had and actually put it right there, and then you would put your alert logic below. Now, doing this with a query like a time chart is probably not very useful; it would simply say, “okay, number of results: if there are more results than X, then raise an alert.” But you could do things like count the number of heartbeats received. You would use those queries, and then you would just carry on as normal. We’re not going to revisit this right now, because we already covered action groups. But you would choose your query, your conditions, and then your action groups, just like you would everything else. And that’s one of the beautiful things about Azure: the way you can use the same concept of action groups to decide what to do with the events. And the events can come from different sources, whether that’s something like what we’re looking at now in Log Analytics or some of the metric monitors that we looked at in one of the earlier demonstrations as well.
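As a sketch of the kind of query that makes a useful log alert (rather than a time chart), the following finds machines that have stopped sending heartbeats; the 15-minute threshold is just an illustrative assumption, not something from the demo:

```kusto
// Machines whose most recent heartbeat is older than 15 minutes:
// a common signal for a "VM or agent is down" alert rule
Heartbeat
| summarize LastHeartbeat = max(TimeGenerated) by Computer
| where LastHeartbeat < ago(15m)
```

An alert rule built on this would fire whenever the number of results is greater than zero, then hand off to an action group as usual.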
So we’ll come back out of this now. I’m going to come out of the action groups and rules here and go back to our workspace. Here we are, back in the workspace. And the last thing I want to show you is the solutions area. So, if you’re coming from a previous OMS background, solutions that used to be in the OMS portal have changed. Now they’ve embedded solutions directly; you no longer have a menu inside Log Analytics on the left-hand side to get solutions. You go to the marketplace for them. So if we go up to the top left and click Create a Resource, up comes our marketplace. And if we scroll down on the left-hand side, you’ve got Management Tools. And on the right-hand side, if I expand the featured list here, you’ll start to see all those plugins that previously existed. If you’re familiar with Log Analytics, you’ll know the common ones: Security and Compliance, and patching, which is now available via Update Management. That’s a common one; a lot of people are looking at it now to even replace services that perhaps they had on premises before, like BladeLogic and things like that.
So if I click on Update Management and say I want to add this pack, this comes with a whole bunch of dashboards, a whole bunch of queries, and a whole bunch of additional things that you can utilize. This one actually uses Azure Automation to do a lot of patching and things like that. And you simply click Create. I’m going to zoom back in a little bit here. Now choose the Log Analytics workspace that we want to use; I’ll do the AZ Demo workspace, and then you’ll see there are some additional settings that you sometimes have to configure. In my case, this particular marketplace solution requires that we have an automation account as well. This is Azure Automation, which is covered in a separate lecture. So in our case, I could create an automation account right now, or if I had one already created, I would simply select it. But I don’t need to go and set that one up now; I’ll show you a couple of others. So if we go back to the Management Tools section, one common one is this Security and Compliance one. So we click this one and click Create; it’s a similar concept: select your workspace, select the recommended solutions I might want to add as well, and click Create. That one’s going to go ahead and deploy, and we’ll fast-forward here while that completes. And that’s completed.
So we’ll go to that resource by going to the workspace, then into Solutions, and then selecting, in this case, the antimalware solution. Now again, just to remind you where to go, we can go back to the Log Analytics workspace; sometimes you get a few blades deep, and it’s hard to find. If you’re in the workspace, scroll down, go to Solutions, and there you will see the security and antimalware solutions that are in place. I can click the security one now as well. And you’ll notice some of these dashboards have already been created for us on the right side. You can also essentially obtain the queries behind any of these pre-created little widgets that you like; you can go ahead and get the queries from those as well, just like in Security Center. Now again, if we go back one more time, the last thing I want to show you in this section (I know this is a pretty big demonstration) is the View Designer. So if I click the View Designer, this is a place where I can take queries and start to design my own little widgets based on those queries as well. Now some pre-canned ones will show you how to do it. So I’ll click the donut, a common one, and put that on; this is the overview tile.
So this would be the overview that you would then click into to drill down and get more information. Then there’s the view dashboard, which allows you to add additional items. It could be a list of queries, a number list, information you want to display, stacks of line charts, or performance. Perhaps your favourite application is one that you troubleshoot on a daily basis, and you’d like to create your own dashboard for it; this is where you would essentially do that. If I go back to the overview tile for a second, you’ll see I chose the donut. It has already prefilled some fields for me, because we have Perf logs, Heartbeat, and Azure Activity; those were the data sources that I was connected to previously. And I can give a name to this view. So we’ll call this “next view”, and feel free to add a description. And if you scroll down here, you can see those queries. So here’s the search: it’s summarising the aggregate value, and it’s purely just a count of those particular logs for Perf, Heartbeat, and Azure Activity. But if you scroll further down again, you can see we can modify the colours and things like that as well, and do any data flow verification that we want to, which is not in scope for what we want to do right now. But with that, once I’m done, I can simply click Save.
And now I’ve essentially created my own view. So you can see the “next view” here, and then I can click into there. There won’t be anything there, because we didn’t create anything in that second section. But if I want to, I can continue to create more and more views inside of that major view. Again, I’m drilling down deeper and deeper as I go, but I do have to facilitate that with a query at every single level. And then, if I go back to my overview, I can pin this to my dashboard, so I can say, “OK, I want this on my main dashboard in Azure.” Now that it’s pinned, I can click the dashboard, which takes me to my main section, and I can rearrange this just like any other dashboard widget I’ve got. But now that it’s essentially here, I can put it in any dashboard I want. So, a lot of power, a lot of things to cover in Log Analytics. Key things to remember: know the basics of the query language, know how to read queries, know about the View Designer, and know about the solutions that have ultimately been moved, because that did go through a lot of change. Last but not least, remember the alerts and action groups you created during the previous demonstrations. And with that, that concludes this demonstration.
4. Lecture: Azure Security Center Overview
With any cloud project today, security is obviously top of mind for an organization, and Microsoft has released Azure Security Center as a mechanism to help us define security policy, monitor threats, and just generally make sure we have a good security posture in our Azure environment. And so Azure Security Center came around; it’s now part of the Log Analytics platform and essentially gives us a number of items. In particular, we have centralised policy management: this is where we can ensure compliance with company or regulatory security requirements. We can centrally manage security policies across all of our hybrid cloud workloads. So this works both on-premises and in Azure, as you’ll see in the upcoming demo.
And then we have continuous security assessment, so this can continually monitor all the machines, network, storage, web apps, etc. that we have, and check them for threats and potential security issues. It will also give us actionable recommendations. This is one of the things I like the most, because it’s not just saying, “Hey, you’ve got a vulnerability here.” It actually gives you something you can do to resolve that vulnerability, so this is great for operations teams. It’s also got advanced cloud defenses: this is where you can reduce threats with just-in-time access to management ports and some of the threat intelligence features that you’ll see as well. Prioritized alerts and incidents almost go without saying: we want to focus on the critical alerts that come in, prioritise those, and make sure we can funnel those off, whether we email them or send them to other log systems like Splunk, et cetera.
And finally, one of the other key things is that Microsoft has opened the platform up and integrated it with a lot of other security solutions. You can collect, search, and analyse security data from a variety of sources. So if you decide, “OK, I use Symantec Endpoint Protection,” you can integrate with them. And if you use Qualys for vulnerability assessment, you can integrate with them as well. So you can bring in all these other security products that you might already have. The service itself is available in two pricing tiers. There’s actually a free tier; this is for Azure resources only and includes the security assessment, security recommendations, basic security policy, and connected partner solutions. But again, that’s for Azure resources only. The standard tier, which is actually $15 per node per month, comes with all the features in the free tier. In addition, you get just-in-time VM access, so you can temporarily limit the RDP exposure you have when that port is open, as well as the network threat detection piece and VM threat detection. But there is no better way to show you this than in a demo, so I encourage you to check out the upcoming demonstration, where we’ll give you a full tour of Azure Security Center.
5. Demo: Prevent and Respond to Threats in Azure Security Center
We’re in the Azure Portal. To get to Security Center, simply scroll down on the left-hand side and you’ll see Security Center with the shield icon. Select that, and this will open up Azure Security Center for you. Now, for this demonstration, I’ve already connected to a live lab environment that has lots of VMs running and different vulnerabilities, so I’ll show you some of them. But before we begin looking at the threats, let’s take a look at some of the security policy settings that you need to be aware of. So if we select Security Policy and then select our subscription, you’ll see it’s divided up into these major areas. There’s a new one at the bottom in preview now, but we’re not going to cover it just yet. The main four are data collection, security policy, email notifications, and pricing tier. So, starting with data collection on the right-hand side: this is where we can choose whether to automatically provision our monitoring agent. You can do it manually, but it’s highly recommended that you just turn this on, and it will install the Azure Security Center monitoring agent on every single machine that we have in Azure. If we scroll further down, then we have our default workspace configuration. Azure Security Center shares the same data storage mechanism as Log Analytics, so you can choose to use your Log Analytics workspace or use the workspaces created by Security Center, which is the default option.
And if we scroll down, we can select the events that we want Security Center to collect. So we could simply select All Events if we want absolutely everything; we can go with Common, which is the default; we could drop it to Minimal if we think, “Okay, we don’t need all that extra data”; and we could choose None, but then you’re really only going to get a few basic events from Windows in there. So it’s entirely up to you which level you choose; however, Microsoft would strongly advise you to stay at the Common level or above to ensure you get some good data that Security Center can act on. If we then move over to Security Policy, this is where we can further define what we want Security Center to show recommendations for. So we’ve got our data collection that’s pulling all the data, but then what do we want it to look at inside of that data? And if we look through them, I’ll highlight a few; most of these make a lot of common sense, but we’ve got things like system updates, security configurations, and endpoint protection. All of these, if you hover over the little “i” here, require that data collection is enabled for your virtual machines, because it’s basically pulling all that data from those in order to make these recommendations. Then we’ve got disk encryption, network security groups, web application firewalls, next-generation firewall, and vulnerability assessment, which tells you whether your patches are up-to-date or not. Then storage encryption, and just-in-time network access, which I mentioned in the lecture: it essentially allows us to say, “Okay, I only want RDP open for a certain period of time.” I can turn on JIT, and that allows me to say, “Okay, when someone wants RDP access, they request it, it’s opened up for an hour, and then that network security group is locked back down when they’re done.” So RDP port 3389, for example, isn’t permanently available.
Then there are adaptive application controls, SQL threat detection, and SQL encryption. So you can just toggle these on and off as you see fit to meet your needs. Then we have email notifications, and this is where we want to send the alerts to our security team. So these are your security contact emails, and you can also include a phone number. And you can also choose to send yourself emails about alerts as well. Then we move on to the pricing tier. And this is where you can choose again between the free tier and the standard tier. As I mentioned in the lecture, the free tier is free for all Azure resources. The standard tier is $15 per node per month and gives you just-in-time VM access, adaptive application controls, network threat detection, and VM threat detection, and it also works for your VMs that are potentially on premises. It will go and analyse those, pull the data using an agent, and funnel it all into Azure Security Center so you can assess those machines as well. So that takes care of the configuration settings. Let’s go and look at what Security Center actually shows.
So I’m going to scroll back over on the left-hand side. If we go back to the overview, first of all, you’ll see it’s divided up into these sections: recommendations, any security solutions we’ve got integrated, new alerts and incidents in the last 72 hours, and our events from last week. If we go down a little further, we’ve got our prevention sections, such as compute. Let’s actually go into one of these first and look at prevention. If we use compute as an example, you’ll see it’s divided up into monitoring recommendations. So this is anything to do with agent health. So, for example, if a machine is not reporting a health status, then we know we have an issue with the agent, so we have a monitoring issue that we need to resolve. If we scroll down, though, that’s where it starts to get more interesting. So we’ve got things like endpoint protection issues. When I select one of these, you can see the protection providers installed, installation health, and protection state. And if we click on this one down here, this will actually show us the resources where endpoint protection is not installed. And now I have a few options. I could simply select Install on nine VMs, and it’s going to bring up either the Microsoft built-in antimalware or partners that you’ve integrated into the solution here that you can actually install.
Not all partners are on board with that yet; some only work from a discovery standpoint and don’t allow you to actually install the agent directly from there. You would use a VM extension for that; if you saw the VM extensions module, that would help you identify how to do it. But if I move back a little bit, you can see all of the machines again, and this is a high severity, because none of them have any endpoint protection installed. If we go back to the recommendations and scroll down a little more, you can see things like missing disk encryption as well, and this allows us to add disk encryption like BitLocker, etc., to a VM. If we go back again, there’s also “add a vulnerability assessment solution.” This is another common one, and again, if you choose one of these machines and select Install, you can select an existing solution or create a new one. In this case, Qualys, being a very popular vendor for vulnerability assessment, is available there for you to choose if you wish and integrate into your environment. All right, so let’s head back over to the overview section. So we’ll scroll back in our blades and get to the first page, and the other thing I want to show you is this events section. So I click Events from last week and just take a look at all the events coming into the events dashboard (sometimes it takes a minute or so to populate; it all depends on how much data is available to you), and all those events pop up.
We’ll select one of our workspaces, and we’ll see our events over the last seven days. And pretty interestingly, here are the notable events that will pop up in a second for you. And you can see there that we have 9.2 thousand accounts that have failed to log on. So let’s click into that and see what’s going on there. So now we’re basically searching those logs, and if you’ve looked at the Log Analytics module already, you’ll recognise the SecurityEvent table; the language we’re using on the right here is the Log Analytics query language. But this is really interesting, because we can see that 116,000 of these events, if we look on the left-hand side, are all related to this particular computer, and if we look on the right-hand side, we can see all the different accounts that have attempted to log in. So this is a VM that we’ve just exposed on a public IP address, and we can see attempts to log in as administrator, admin, Michael, a test user called Scanner Test User, etc. So this is the real world, and it emphasises the importance of not exposing VMs on a public IP address, because they will be attacked fairly frequently. But that’s also where we could turn on just-in-time VM access. So let’s go and look at this virtual machine. I’m just going to close this down, go back to the Security Center overview, and if we scroll down, we should see just-in-time VM access.
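A query along these lines is what sits behind that drill-down. This is a sketch, assuming the workspace is collecting the SecurityEvent table (4625 is the Windows security event ID for a failed logon):

```kusto
// Failed logon attempts (Windows security event 4625) over the last 7 days,
// grouped by the account name that was tried and the target machine
SecurityEvent
| where TimeGenerated > ago(7d)
| where EventID == 4625
| summarize FailedLogons = count() by Account, Computer
| sort by FailedLogons desc
```

Sorting by the count puts the brute-forced accounts, like administrator and admin, at the top.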
If we select that one, scroll down again, and go to Recommended, it’ll show us the virtual machines it recommends, and you can see that machine is up there. We select it, we can say Enable JIT on that virtual machine, and then you’ll see ports 3389, 22, 5985, and 5986; these are all the admin ports. We simply click Save, and that’s going to enable just-in-time access on that machine. That means that if we want to RDP into it, there’s now a firewall deny rule in place; basically, we can’t even connect to it right now. We would simply request access to it before we were able to connect via RDP. The last thing I’ll show you in the advanced cloud defence section is the adaptive application controls. This is really about application whitelisting. So what you can do is put your machines out there, have them do their various tasks, and then you will get recommendations. So if I look at this resource group where there are some recommendations, I’ve got a selected virtual machine, and you can see its recommended processes.
It’s actually come in and said, “Okay, the following processes are very frequent on the VMs within this group and are highly recommended for whitelisting rules.” So you can create these whitelists, and then you can later change the policy to enforce them. So right now we’re in audit mode, so it’s just looking at the machines and seeing what they’re doing. But once we know what they’re doing, we can enforce it, and then we can make sure that anything that doesn’t comply with those rules is denied. So this is a way to look at what’s inside the VM, what it’s actually doing, what processes it’s running, and say, “Yes, you can run that process; no, you cannot run that process.” And that’s one of these next-generation security mechanisms that a lot of people are moving towards. And finally, the last thing I’ll show you, if you go back to the home screen, back to Security Center, is that if we scroll all the way down, we also have our security alerts section. So this will identify all the common alerts for you. If you scroll down, you can see that the failed RDP brute-force attack is there.
There’s suspicious incoming RDP network activity. We also have threat intelligence and custom rules, which are currently in preview but are coming pretty quickly. And this will allow you to customise all the rules and decide what you do with them: what exactly do you want to be alerted about? You can also now integrate with Azure Policy, which is also in preview, so you can create your own security policies and incorporate them as well. And just to touch on threat intelligence, which I don’t have completely running in the lab here, you can see a threat intelligence dashboard. I’ll just open one of these, and this will actually show you analytics on a global scale, show you where your threats are, and try to help you identify all the different threats across different countries. So, hopefully, this gives you an idea of how you can not only prevent threats but also respond to them when they occur, and some of the other administrative tools that you can use, such as just-in-time access and adaptive application controls, to further secure the environment. And with that, this concludes the demonstration.