AWS Certified Machine Learning - Specialty: AWS Certified Machine Learning - Specialty (MLS-C01) Certification Video Training Course Outline
Introduction
Data Engineering
Exploratory Data Analysis
Modeling
ML Implementation and Operations
Wrapping Up
Introduction
AWS Certified Machine Learning - Specialty: AWS Certified Machine Learning - Specialty (MLS-C01) Certification Video Training Course Info
Gain in-depth knowledge for passing your exam with the Exam-Labs AWS Certified Machine Learning - Specialty (MLS-C01) certification video training course. The most trusted and reliable name for studying and passing, with VCE files that include Amazon AWS Certified Machine Learning - Specialty practice test questions and answers, a study guide and exam practice test questions. It is unlike any other AWS Certified Machine Learning - Specialty (MLS-C01) video training course for your certification exam.
Data Engineering
4. Amazon S3 Security
S3 security is last but not least. Let's start with S3 encryption for objects. There are four methods for encrypting objects in S3. There is SSE-S3, where we encrypt S3 objects using a key that is handled and managed by AWS. Then SSE-KMS, where we want to use the AWS Key Management Service to manage the encryption keys, so we have more control over the keys. That gives us additional security because now we can choose who has access to this KMS key, and in addition, we get an audit trail for KMS key usage. Then we have SSE-C, for when we want to manage our own encryption keys outside of AWS. And finally, client-side encryption, where we encrypt the data outside of AWS before sending it to AWS. So SSE, by the way, stands for Server-Side Encryption. That means that AWS will do the encryption for us, which is different from client-side encryption, or CSE. From a machine learning perspective, only SSE-S3 and SSE-KMS will most likely be used, so you don't really need to remember what the SSE-C and client-side options are for this exam. So I will just do a deep dive into SSE-S3 and SSE-KMS.

Here's what it looks like. We have an Amazon S3 bucket, and we want to put an object in it. So we're going to send that object to S3, and the object will arrive in S3. For SSE-S3, S3 has its own managed data key, and it will take that key plus the object, perform some encryption, and then add the encrypted object to your bucket. So, as we can see here, this key is managed by S3: we have no idea what it is, we don't manage it, and that's fine. Then there's SSE-KMS, which follows the same pattern. We have our object, and we want to insert it into S3. So we'll send it to S3, and the key used to encrypt that object is generated thanks to the KMS Customer Master Key. And this KMS Customer Master Key is something that we can manage ourselves within AWS. By employing this pattern, we gain more control, possibly more safety, and possibly a greater sense of security. And so AWS will use this customer master key and a data key to encrypt the object and put it into the bucket. So that's it. That's the difference here. As you can see, with SSE-KMS we have control over how the customer-managed master key will be used, whereas in the previous example it was only S3 that had its own data key.

So now, for S3 security, how do we manage access directly into S3? It could be user-based, so we can create an IAM policy, and IAM is Identity and Access Management. That means we specify which APIs a user should be able to use on our buckets. Or it can be resource-based, where we say, "Okay, we want to set an overall bucket policy," and there will be a bucket-wide rule from the S3 console, and it can allow cross-account access and so on and define how users can access objects. We could also have an object ACL, which is an object access control list, which is finer grained, and a bucket ACL, which is less common. So going into the exam, the really important thing is to understand user IAM policies and bucket policies. So let's have a look at bucket policies. They're JSON documents, and you can define the buckets and objects they apply to. You can define a set of APIs to allow or deny, for example "PutObject" or "GetObject". And then the principal is the user or account to apply the policy to. So we can use an S3 bucket policy for multiple things: number one, to grant public access to the buckets.
Number two: to force objects to be encrypted at upload time, or number three, to grant access to another account. They are very common in AWS. Now, there is a way to set default encryption for your buckets instead of using bucket policies. The old way was to set a bucket policy and refuse any request that did not include the right header for encryption. So this is what it looked like before, but nowadays you can just use the new way, which is called "default encryption" in S3. Using default encryption, you just tell S3 to encrypt every single object that is sent to it. You need to know that bucket policies are evaluated before the default encryption.

So if we go to S3 now, go to Overview and look again at our data, we look at the instructor data CSV and click on the Properties. As you can see, there is currently no encryption, but I can click on it and select SSE-S3, which is AES-256, or AWS-KMS. For KMS we need to set up a key to encrypt the data with: we can either use one managed by AWS, create our own, or provide our own custom key ARN. So I'll just choose SSE-S3 in here and click Save. And as we can see now, this instructor data CSV object is fully encrypted. Another way to do it is to click on the bucket, click on Properties, and go to Default Encryption. Here we can set a default encryption for all the objects in our bucket, but only for the new ones that are uploaded. So we'll say okay, we want AES-256; that means all the objects will be encrypted with SSE-S3. Click on Save, and we're good to go. So now I go back to my overview and back to my data set, so I'm going to go back to my objects, delete this one, and upload it all over again. And something we should see now is that the object itself is already encrypted with AES-256. So automatically, thanks to the default encryption, AES-256, we were able to encrypt that object. Another way of doing it would be to use bucket policies. I'm not going to go into the details of bucket policies because you don't need to know them for the exam, but just know that we could set a bucket policy here to provide access to the files in our bucket to other users or AWS services.

So finally, other things for security you need to absolutely remember going into the exam. The first one is networking, with VPC endpoint gateways. When I use S3 right now, I go over the public internet, and all my data goes over the public internet. But if I have private traffic within my VPC, or Virtual Private Cloud, then I want it to remain within this VPC. And so I can create what's called a VPC Endpoint Gateway for S3, and it will allow the traffic to stay within the VPC instead of going over the public internet to reach S3. This makes sure that private services, for example Amazon SageMaker, as Frank will show you, can access S3. So Frank will be reiterating that point, obviously, when we get into SageMaker. But you need to remember this very important fact for the AWS exam: if you want your traffic to S3 to remain within your VPC and not go over the public web, then creating a VPC Endpoint Gateway is the way to go. There's also logging and auditing. You can create S3 access logs, stored in another S3 bucket, to make sure that you can see who has made a request to your data. And all the API calls can be logged into something called CloudTrail.
CloudTrail is a service that allows you to look at all the API calls made within your account. And that could be really helpful in case something goes bad and you need to know who did what and when. Finally, you can have tags on your objects. So, for example, for tag-based security we could add Classification = PHI (personal health information) to our objects. So let me show you how that works. In this case, let's pretend it's PHI data. What I'm going to do is go to the properties of the object: let's click on it again, then Properties, and in here I'm able to add tags. The tag I'm going to add is Classification, and the value is PHI, for personal health information, and then click on Save. Now my object has been tagged with this, and if I had the right bucket policy or the right IAM policy, I could restrict access to this file thanks to this tag, because I've tagged this object as data that is quite sensitive. Okay, well, that's it. Remember, for S3 security again: IAM policies, bucket policies, encryption, VPC endpoints, logging, and tags. And that's all you need to know. And I will see you in the next lecture.
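By the way, if you prefer to script these settings instead of clicking through the console, here is a minimal boto3 sketch of the same steps from this lecture. The bucket name, file name and KMS key alias are just placeholders for illustration, so adapt them to your own account.

import boto3

s3 = boto3.client("s3")
bucket = "my-machine-learning-bucket"   # placeholder bucket name

# SSE-S3: S3 manages the data key for you (AES-256)
with open("instructor-data.csv", "rb") as f:
    s3.put_object(Bucket=bucket, Key="instructor-data.csv",
                  Body=f, ServerSideEncryption="AES256")

# SSE-KMS: encryption uses a KMS customer master key that you control
with open("instructor-data.csv", "rb") as f:
    s3.put_object(Bucket=bucket, Key="instructor-data-kms.csv",
                  Body=f, ServerSideEncryption="aws:kms",
                  SSEKMSKeyId="alias/my-cmk")   # placeholder key alias

# Default encryption: every NEW object uploaded to the bucket gets SSE-S3
s3.put_bucket_encryption(
    Bucket=bucket,
    ServerSideEncryptionConfiguration={
        "Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]
    },
)

# Tag-based security: mark the object as PHI so a bucket or IAM policy can restrict access
s3.put_object_tagging(
    Bucket=bucket, Key="instructor-data.csv",
    Tagging={"TagSet": [{"Key": "Classification", "Value": "PHI"}]},
)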
5. Kinesis Data Streams & Kinesis Data Firehose
Okay, so let's get into Kinesis. Kinesis is an extremely important service going into the machine learning exam, so make sure you pay attention to this section. It's a streaming service, and it's a managed alternative to a technology called Apache Kafka. It's great for gathering application logs, metrics, IoT device information, clickstreams and so on. And anytime you see the words "real time", that usually means Kinesis. It's great because it has integrations with stream processing frameworks such as Spark, NiFi and so on, and the data is automatically replicated to three Availability Zones in AWS. So that means that your data is safe using this real-time service.

There are four services you need to remember within the whole Kinesis umbrella. The first one is Kinesis Data Streams, which is low-latency streaming ingest at scale. Don't worry, we'll see at the end of this section how we can remember which service is helpful depending on which machine learning use case. Then we have Kinesis Data Analytics, to perform real-time analytics on streams using the SQL language, and then Kinesis Data Firehose, which is used to load data into S3, Redshift, Elasticsearch and Splunk. And finally Kinesis Video Streams, which is meant, as the name indicates, to stream video in real time and do some analytics on top of it. Okay, so we'll do a small deep dive on these four services, just to make sure we properly understand them and understand the differences, because this is what the exam will test you on: figuring out the difference between Streams, Analytics, Firehose and Video Streams.

Okay, so from an architectural standpoint, here's what Kinesis looks like. We have Kinesis Data Streams, and they will onboard a lot of data coming from clickstreams, IoT devices, metrics and logs. Then we may want to perform some analytics in real time on it, so we'll use Kinesis Data Analytics as a service, and then finally we'll use Amazon Kinesis Data Firehose to take all the data from these analytics and maybe insert it into Amazon S3 buckets or Amazon Redshift for your data warehousing; we'll see what Redshift is in a summary. Okay, so this is one simple architecture with Kinesis. You will see a lot of them in this hands-on section, but that gives you a general idea of how things work. We onboard real-time data, we analyse it and do some analytics in real time, and then we ship it into stores like S3, which we just saw, or Redshift to perform deeper analytics, reporting and so on.

So let's first look into Kinesis Data Streams. Streams are divided into what's called shards, or partitions, and the shards have to be provisioned in advance, so there is some capacity planning you need to do. Here's what it looks like: we have producers, we have three shards, and we have consumers. The producers produce data to the shards, and we have provisioned three shards in advance. So it's something we have to choose, and the more shards, obviously, the more capacity and the more speed. Then the real-time applications that need to consume this data can hook up to the shards and the stream and read the data in real time. This is Kinesis Data Streams in a nutshell.
So the data retention here is 24 hours by default, and you can go up to seven days. That means there is the ability to replay or reprocess the data, and it also means that multiple consuming applications can consume from the same stream. And that is the whole power of Kinesis: every kind of application can go at its own pace to read data from these shards. Once the data is inserted into Kinesis Data Streams, it cannot be deleted; it's called immutability. And the data records can be up to 1 MB in size, which is fine for streaming use cases, but obviously not fine if you do large petabyte-scale analysis in batch. Okay? So remember that Kinesis Data Streams is just for real-time streaming of records that are small in size.

So for Kinesis Data Streams, what are the limits you need to know? First of all, the producers can send about 1 MB per second or 1,000 messages per second at write time, per shard. And if you go over this, you'll get an exception. The consumers will read at 2 MB per second per shard across all the consumers, okay? Or a maximum of five API calls per second per shard across all the consumers. So what that means is that if you want more and more capacity in your streams, you need to add shards: Kinesis Data Streams only scales if you add shards over time. Now, for the data retention, as I said, it's 24 hours by default, but it can be extended to seven days. So Kinesis Data Streams is great when you want to create real-time streaming applications, okay? And remember that it is something you must provision in advance.

Now, there is another thing called Kinesis Data Firehose, and Firehose this time is a fully managed service; you don't have to administer it. And it's not real time, it's near real time, as we'll see in the hands-on. It allows us to ingest data into Redshift, Amazon S3, Elasticsearch and Splunk. You must remember these four destinations. Redshift is a data warehousing service, S3 we just saw, Elasticsearch is a whole service to index data, and Splunk is an external third-party service. There is automatic scaling, so there is no capacity to create in advance, unlike Data Streams. And it supports many data formats: you can do conversions from CSV or JSON to Parquet or ORC when you sink into S3. You can also do data transformations of any kind you want through AWS Lambda, for example if you wanted to convert a CSV into JSON. It supports compression when the target is Amazon S3, so again, very important to remember: GZIP, ZIP and Snappy. And Data Firehose is awesome because you only pay for the amount of data that is going through it. Okay?
So it is very different from Kinesis Data Streams, and we'll see that in a second. From a diagram perspective, we can have producers producing directly into Kinesis Data Firehose. So any kind of producer, or for example a Kinesis data stream, can produce into Kinesis Data Firehose. Then we can get a Lambda function for doing all the transformations, and finally from Kinesis Data Firehose we can send the data out to Amazon S3, Redshift, Elasticsearch and Splunk. Okay, so it's very simple, but you have to remember this. This is more of a serverless offering. So here we have our source, our delivery stream, and then again, as I said, there is a Lambda function; it's optional, but it's there to perform some data transformation if we need to. We send the output to Amazon S3, and maybe it will even go all the way to Redshift by doing a COPY command. And then the source records can be sent into an Amazon S3 bucket, as can the transformation failures or the delivery failures. So there is a way to recover from failure by putting all the source records, transformation failures and delivery failures into another Amazon S3 bucket. So this is, in a nutshell, what Data Firehose looks like.

So the question you may be having right now is: what is the difference between Kinesis Data Streams and Firehose? For Kinesis Data Streams, the first thing is that we can write custom code for the producer and the consumer, and this allows us to create real-time applications; that's super important. It is going to be real time, so between 70 ms and 200 ms latency. And you must manage scaling yourself: if you want more throughput you need to do something called shard splitting, which means adding shards, and if you want less throughput, you need to do shard merging, which is removing shards. So the idea is that if you have a lot more data coming in one day, you would need to manage scaling yourself, and that is quite painful to do. But there's this cool thing that there's data storage for your stream between one and seven days, and this gives us replay capability, so we can replay a stream based on the data it has retained. We can also have multiple consumer applications, multiple real-time applications, reading from that stream. So all in all, think of Kinesis Data Streams as a way to build real-time applications, with replay and so on.

Firehose, on the other hand, is a delivery service. It's an ingestion service, so remember that word, ingestion. It's fully managed, and you can send data to Amazon S3, Splunk, Redshift and Elasticsearch. It is fully serverless; that means we don't manage anything. And you can also do serverless data transformations using Lambda. Lambda, by the way, is a small compute service by AWS that allows us to run functions in the cloud without provisioning servers. It's going to be near real time; that means that it delivers data with a lowest buffer time of one minute into your targets. Okay, so it's not real time. There's automated scaling, so we don't need to provision capacity in advance. And there's no data storage, so there's no replay capability. So they are very different services: one is for building real-time applications and the other one is for delivering data. And we'll see how we can deliver data with Firehose in the next lecture. So see you in the next lecture.
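To make that difference concrete from the producer side, here is a small boto3 sketch that sends the same record to both services. The stream and delivery stream names are placeholders matching this course's examples.

import json
import boto3

record = json.dumps({"ticker_symbol": "AMZN", "price": 100.5}).encode("utf-8")

# Kinesis Data Streams: you pick a partition key (which decides the shard),
# and you must have provisioned enough shards for your throughput
# (roughly 1 MB/s in and 2 MB/s out per shard).
kinesis = boto3.client("kinesis")
kinesis.put_record(StreamName="my-data-stream",   # placeholder stream name
                   Data=record,
                   PartitionKey="AMZN")

# Kinesis Data Firehose: fully managed ingestion, no shards to size. Firehose
# buffers the records and delivers them near real time to S3, Redshift,
# Elasticsearch or Splunk.
firehose = boto3.client("firehose")
firehose.put_record(DeliveryStreamName="ticker_demo",   # the delivery stream from the labs
                    Record={"Data": record})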
6. Lab 1.1 - Kinesis Data Firehose
Okay, so let's stay in S3, but I'm going to open a new page for AWS, and this time I'm going to look at the Kinesis Data Firehose service. So when we get into Kinesis, click on Get Started, and there are really four options. The first one is to create a Kinesis data stream, but this one I will not do, because it will cost us a lot of money and we have to provision the stream in advance. So I will do a Kinesis Data Firehose, and don't worry, we'll see Analytics and Video Streams later on. For now, I'll click on the Kinesis Firehose delivery stream, because we want to deliver data into the S3 bucket right here. So let's go: we create the Firehose, and the delivery stream name is going to be ticker_demo. This is just some simple data that we will send into S3. ticker_demo, okay. Then we need to choose a source, which is how we want data to arrive in the Firehose delivery stream. We can send data directly using the SDK, using Direct PUT or other sources, or we could choose a Kinesis data stream and say, okay, a stream that I have previously created, that our real-time applications are running on, can also send its data into Firehose to maybe deliver it into S3. But we haven't created a Kinesis data stream; as I said, it would cost us a lot of money. So I will just choose Direct PUT or other sources. And here at the bottom, it tells us how to send source records to Kinesis Data Firehose; we can safely ignore that because that's not the point, and we'll be sending sample data using the console.

So I click on Next. Then, do we want to transform source records with AWS Lambda? Lambda allows us to define a function in the cloud that describes how we want to transform these source records into whatever we want. I could enable it and choose a Lambda function, but I don't have one at the moment, so I'll disable it. But with a Lambda function, you are able to do anything you want; for example, as I said, you could change the data from a CSV format into a JSON format. Then, do we want to convert record formats? We will not convert, but we have the option to enable record format conversion into Parquet or ORC, and for this to work, we actually need to use something called AWS Glue, which we haven't seen yet; we'll see it in the next lecture. So for now, we will not use record format conversion and keep it disabled. That means the source records are not transformed and not converted, therefore they will be exactly the same in the target, which is S3.

So let's go to the target, and we can select a destination. As I said, there are four destinations possible: S3, Redshift, Elasticsearch, or Splunk. We'll choose Amazon S3, and the S3 destination is going to be our machine learning bucket. So here we go. Then for the prefix, I'm going to enter ticker_demo and then remember to add a trailing slash; it's very important to add a trailing slash. So that's ticker_demo with a trailing slash. And then for the error prefix: ticker_demo_error, also with a trailing slash. Okay? So that means the data we have will be delivered under the first prefix, and if we get any errors, they will go under the error prefix. So click on Next. And then for the buffer conditions, set a buffer size of 1 MB and a buffer interval of 60 seconds. This will guarantee that our data is delivered into S3 as soon as possible: either when we reach 1 MB in size or 60 seconds in time.
And this is why Kinesis Firehose is a near-real-time service. It's because we cannot have a buffer interval of less than 60 seconds. So can you see? Data streams are in real time. Kinesis Firehose is for near-real-time delivery to targets. Okay, now, do we want compression and encryption? We can have G, Zip, Snappy, or Zip, but I'll just disable it for now. And do we want encryption? We'll disable it, but we can have KMS encryption directly set using this drop-down menu. So I'll just say "disabled" and then enable for error logging. Then we'll need to write a new imrule that allows the Kinesis Firehose Delivery role to send data to S 3. So it's created, and then I can click on Next, and we're good to go. So I'll click on Create Delivery Stream, and I will wait about a minute until the stream is fully created. So the stream is not created, and I can click on my ticker demo, and I'll click on Test with Demo Data, and I will start sending demo data. Now, this demo data is an adjacent document that looks like this, and it will take about a week. Not a week; I'm tired. It will take about a minute to be delivered into your S-3 buckets in here. So the reason is that if we go and scroll all the way down, then the buffer condition says that we have 60 seconds to wait until we deliver data into the street. So what I will do is wait about a minute before getting back to you. OK, so it's been over a minute now, and if I go into Amazon S3 and refresh now, I can see the ticker demo directory has been created for me. And if I click on it, I see 2019-10-23 and even the hour. And so that means that there is some partitioning that is being done by Kinesis Data Firehose for us. So based on that time, there's some partitioning being done. As we can see, some data was being delivered into our S-3 buckets. So if I download this and open it up now, we can see a lot of Jason's data that is being delivered directly into our bucket. So this is one JSON, and then we have another one, and so on. This is a lot of data that is being delivered, and that's the sample data set, obviously, that is being delivered by Kennedy Data Firehose. So this is perfect. We have the first delivery stream working, and some data is being returned to Amazon S 3. If I refresh now, I see a second file and so on. So this is very positive, and this will allow us to keep on going with our hands full. So make sure you can leave sending them data running, or you can just stop it for now and we'll restart it later on. Okay. I will see you in the next lecture.
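For reference only, the delivery stream we just built in the console could be created with boto3 roughly like this. The IAM role ARN and bucket ARN are placeholders for the ones created during the lab, so this is a sketch rather than something to run as-is.

import boto3

firehose = boto3.client("firehose")

firehose.create_delivery_stream(
    DeliveryStreamName="ticker_demo",
    DeliveryStreamType="DirectPut",   # "Direct PUT or other sources", no source stream
    ExtendedS3DestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/firehose_delivery_role",   # placeholder
        "BucketARN": "arn:aws:s3:::my-machine-learning-bucket",               # placeholder
        "Prefix": "ticker_demo/",                  # note the trailing slash
        "ErrorOutputPrefix": "ticker_demo_error/",
        "BufferingHints": {"SizeInMBs": 1, "IntervalInSeconds": 60},  # 1 MB or 60 s, whichever comes first
        "CompressionFormat": "UNCOMPRESSED",
    },
)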
7. Kinesis Data Analytics
So now that we have onboarded our data into Kinesis, we may want to perform some real-time analytics on it, and for this, we can use Kinesis Data Analytics. Conceptually, Kinesis Data Analytics will take data either from Kinesis Data Streams or Kinesis Data Firehose, run some SQL code on it, and the results will be sent to analytics tools or output destinations. If we want to zoom in a little bit and see how that works, this is what the blown-up version looks like. The input streams can be either a Kinesis data stream or a Kinesis Data Firehose delivery stream, and this will be fed into our Kinesis Analytics application. In here, we're able to set a SQL statement to define how we want to change or modify that stream: perform some aggregation, some counting, some windowing, and so on. Then we are able to join it with some reference data, and that reference table comes straight from an Amazon S3 bucket. So if we had some reference data in an S3 bucket, we could join the input stream to a lookup reference table. This application will do a lot of things for us, and out of it we get an output stream and an error stream. The output stream is obviously the result of what we compute, and the error stream is for when the query goes wrong. This output stream can go to many different places: it can go into Kinesis Data Streams, Kinesis Data Firehose, or Lambda, and then through Kinesis Data Firehose, we can send it all the way to Amazon S3 or Redshift.

So let's look at the use cases for Kinesis Data Analytics. The first is streaming ETL: for example, we can reduce the size of our data set by selecting columns and making simple transformations, all of that on streaming data. We could generate metrics in real time: if we had a live leaderboard for a mobile game, we could use Kinesis Data Analytics. And we could do responsive analytics, for example if we wanted to look for certain criteria and build alerting by filtering the input data set and looking for very odd data points. Okay, so one of its benefits is that we only pay for the resources we consume, but it is not cheap. It is serverless and it scales automatically; that means we don't provision servers in advance, and we just don't need to do any capacity planning for Kinesis Data Analytics. We use IAM permissions to give it access to the streaming sources and destinations, allowing Kinesis Data Analytics to perform the transformations itself. We can either specify a SQL script or use Flink to write the computation, and Flink will give us a lot of power if we need it. We also get schema discovery, and we can even use Lambda for preprocessing. So there are a lot of things we can do with Kinesis Data Analytics, and we'll see this in the hands-on for machine learning.

Now, in Kinesis Data Analytics, there are two machine learning algorithms that you can use directly, and you need to remember them. The first one is called random cut forest, and this is the diagram at the bottom. It is used to detect anomalies on numeric columns in a stream. If you look at the diagram below, most of the data points are in the centre, but four data points in red are outside the centre, and so they look like anomalies. This random cut forest algorithm will detect those anomalies. So for example, if we want to detect an anomalous subway ride during the New York City Marathon, this would be a way to do it with random cut forest. Remember that random cut forest is an algorithm that adapts over time.
So it only uses recent history to compute the model. That means that if your data set changes over time, then this random cut forest model will change over time as well. Now, Hotspots is the second algorithm provided as a SQL function in Kinesis Data Analytics, and as you can see in the diagram, it allows you to locate and return information about relatively dense regions in your data. So, for example, if you wanted to find a collection of overheated servers in a data centre, they would be grouped together based on temperature. So these two algorithms are very, very different: the first one detects anomalies, the second one detects dense areas, and you have to remember that. The first one uses only the recent history, so there's an ever-changing model; the second is less dynamic and allows you to detect locations. So I will see you in the next lecture, where we will see how we can use Kinesis Data Analytics on the Kinesis Data Firehose we have just set up. So, until the next lecture.
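To give you a rough idea of what the random cut forest pattern looks like in practice, here is a hedged sketch. The SQL is kept in a Python string, the way you might paste it into the SQL editor or pass it to the API; the stream and column names follow the console defaults and the demo ticker data, so double-check them against the Kinesis Data Analytics SQL reference and your own discovered schema before relying on it.

import boto3

# Sketch only: anomaly detection with RANDOM_CUT_FOREST on the demo ticker stream.
# "SOURCE_SQL_STREAM_001" and "DESTINATION_SQL_STREAM" are the console default names;
# verify the exact function signature in the Kinesis Data Analytics SQL reference.
anomaly_sql = """
CREATE OR REPLACE STREAM "DESTINATION_SQL_STREAM" (
    "ticker_symbol" VARCHAR(4), "price" REAL, "ANOMALY_SCORE" DOUBLE);

CREATE OR REPLACE PUMP "STREAM_PUMP" AS
  INSERT INTO "DESTINATION_SQL_STREAM"
  SELECT STREAM "ticker_symbol", "price", "ANOMALY_SCORE"
  FROM TABLE(RANDOM_CUT_FOREST(
      CURSOR(SELECT STREAM "ticker_symbol", "price"
             FROM "SOURCE_SQL_STREAM_001")));
"""

# Optionally, the application itself can be created through the API instead of the
# console (a streaming input still has to be attached before it can run).
analytics = boto3.client("kinesisanalytics")
analytics.create_application(
    ApplicationName="ticker-anomaly-demo",   # placeholder application name
    ApplicationCode=anomaly_sql,
)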
8. Lab 1.2 - Kinesis Data Analytics
So I'm going to turn my demo data back on, and I'm going to go into Kinesis Data Analytics to create our first analytics query. I'll create an application, and I'll call it ticker-analytics. Okay, and the runtime is going to be SQL, but you could also use, as I said, Apache Flink. So we'll use SQL and create our application. Okay, the first thing we have to do to get an application going is to connect it to streaming data. That streaming data can be either a Kinesis data stream or a Kinesis Firehose delivery stream. So let's click on Connect streaming data, and the source is going to be a Kinesis Firehose delivery stream: it's the one we have from before, and it's called ticker_demo. Excellent. Then, do we want to have record preprocessing with Lambda? No, but we could change the records and transform them a little bit to make them work with our application. And then finally, the permissions: we're going to create a new role that will allow Kinesis Analytics to perform reads on this Kinesis Firehose delivery stream.

Okay, now for the schema, we click on Discover Schema, and because some data is being sent through here, Kinesis Analytics should be able to discover the schema for our ticker_demo stream. So let's wait a little bit and see what comes up. And the schema discovery was successful; as we can see here, it detected that we have a ticker symbol, a sector, a change, and a price, and all of these have the right data type: a VARCHAR(4), a VARCHAR(16), a real number, and a real number. Okay, so this is excellent; we can look at the raw data as well and see the JSON, and it was possible from this JSON to infer the schema of our data. This is really good, because now we'll be able to write SQL queries against it right away. So we click on Save and Continue, and our streaming data source is done.

Next, we could have some reference data, and as I said, it would be a JSON file or a CSV file stored in Amazon S3, but we don't have one, so we won't connect any reference data. And then we need to actually perform the real-time analytics. So I'm going to my SQL editor, and would you like to start running this application? Yes, I would like to start my application. And so, in here, we are able to write our own SQL to run against the data coming from the Firehose stream we have here. We won't go crazy; we'll just click on "Add SQL from templates" and choose a template, and this one is "Aggregate function in a sliding time window". Okay, so this is a very simple SQL statement, and the graph shows what it will do. It will perform an aggregation: a count over a sliding time window, and that sliding time window is a ten-second window. So it will look at the source stream, perform the aggregation over a ten-second sliding window, and then write the result to a destination stream. So I will click "Add this SQL to the editor", and it's in here. Now the application is running, and we need to save and run the SQL for it to work. So if I click on Save and run SQL, it's going to save the SQL, and as we'll see in a second, this produces an error message. It's saying there's an error in your SQL code, and this is quite annoying. But the issue is that the ticker_symbol column right here is not quoted in the template SQL. Because there are no double quotes around ticker_symbol, things don't work, so you need to add double quotes around the ticker_symbol column.
So you add this double quote everywhere: in here and in here, so three times you add this double quote. And now we click on Save and run the SQL again. This time the SQL should be saved, and then we should be able to run it. So here we go, it's being saved; let's wait a second until it runs. And now the application is running, and as we can see, new results will be added every two to ten seconds. In real time, we get the ticker symbol count for each of these ticker symbols: some are one, some are five, some are four, two, and so on, and this will be updated in real time every two to ten seconds based on the input data. The really cool thing about this is that now we have Kinesis Data Analytics figuring out exactly what the data set looks like and what the result looks like, and from this example we could say, okay, we could have another application that says if the ticker symbol count is over ten, then you need to do something and alert someone. Okay, so this is quite interesting. So I'll just close this, but we can see the real-time analytics in here, and the destination right now is not connected.

So how about we click on Connect to a destination to send the data somewhere? We'll scroll all the way to the top, and the destination is going to be a Kinesis Firehose delivery stream. So we're going to create a new Kinesis Firehose delivery stream; I'll go quick. The name is ticker-analytics-demo, and the source is going to be Direct PUT. Click on Next, and then I will disable any record transformation. The destination is going to be S3, and the S3 bucket is going to be my machine learning bucket. Okay, the prefix is going to be ticker_analytics, and don't forget the slash at the very end, and the error prefix ends with _error, again with a trailing slash. Okay. Click on Next. The buffer size is 1 MB, and the buffer interval is 60 seconds. I'll disable the other settings. Click on Create new IAM role; I just clicked on it and said, okay, use this Firehose delivery role. Click on Allow, and we're good to go. Then click on Next; everything looks nice. Click on Create Delivery Stream, and this delivery stream has been created.

So I can go back into Kinesis Analytics and select my ticker-analytics-demo to send the destination output into this Kinesis Firehose delivery stream. Okay, now, do we want to choose an in-application stream? Yes, we want this one, the one named DESTINATION_SQL_STREAM; this comes straight from the SQL statement, it's how we created our destination stream. And the output format is going to be JSON, though it could just as well be CSV; I'll click on JSON to keep it as JSON. Okay, click Save and Continue. This took about five minutes, but the update has been successful, and now the application status is "running". And that means that if I go back to my Amazon S3 bucket and refresh, yes, now I see my ticker_analytics prefix in here, and it's been delivered by my other Firehose delivery stream. Let's take one of these files, this one for example, and download it. And as we can see now, yes, all the ticker symbols and the ticker symbol counts are in one big JSON file delivered into my bucket. So this is perfect: everything is working as expected. We have one Kinesis Data Firehose sending demo data, and we have Kinesis Analytics analysing that data, doing so over a ten-second sliding window.
And then we have another Kinesis Data Firehose sending that data back into S3. So now we have multiple data sets in S3. When you're ready, just stop sending demo data, since everything works, and now we have enough data to work with in this bucket. So I'll see you in the next lecture.
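For reference, here is roughly what the finished sliding-window query from this lab looks like, kept in a Python string so you can save it alongside your other scripts. The column and stream names come from the console template and the discovered schema, and ticker_symbol is double-quoted everywhere, which was the fix we applied; verify it against your own schema before running it.

# Sketch of the "Aggregate function in a sliding time window" template after the fix
# from the lab: "ticker_symbol" is double-quoted because the discovered schema uses a
# lower-case column name. Stream names are the console defaults.
sliding_window_sql = """
CREATE OR REPLACE STREAM "DESTINATION_SQL_STREAM"
    ("ticker_symbol" VARCHAR(4), ticker_symbol_count INTEGER);

CREATE OR REPLACE PUMP "STREAM_PUMP" AS
  INSERT INTO "DESTINATION_SQL_STREAM"
  SELECT STREAM "ticker_symbol",
         COUNT(*) OVER TEN_SECOND_SLIDING_WINDOW AS ticker_symbol_count
  FROM "SOURCE_SQL_STREAM_001"
  WINDOW TEN_SECOND_SLIDING_WINDOW AS
      (PARTITION BY "ticker_symbol" RANGE INTERVAL '10' SECOND PRECEDING);
"""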
Pay a fraction of the cost to study with the Exam-Labs AWS Certified Machine Learning - Specialty (MLS-C01) certification video training course. Passing the certification exams has never been easier. With the complete self-paced exam prep solution, including the AWS Certified Machine Learning - Specialty (MLS-C01) certification video training course, practice test questions and answers, and a study guide, you have nothing to worry about for your next certification exam.