36. AZ-203/204 – Lab – Azure CosmosDB
Hi, and welcome back. Now in this chapter, let’s look at how we can work with Azure Cosmos DB. Now, before moving on ahead, there are a few concepts that are important to know. So in Azure, the first thing you do is create an Azure Cosmos DB database account. Within that database account, you would then host a number of databases. Within a database, you’d have a container, and within the container, you’d have your items.
Now, the term given for the database itself depends upon the API that you choose for the database account. So, for example, if you choose the SQL, MongoDB, or Gremlin API, then the term given is database. However, if you select Cassandra, the term given is keyspace. When you look at a container for your items, if you’re looking at the SQL API, then the container is known as a collection. If you choose the Table API, then it’s known as a table. And if you choose the Gremlin API, the container is known as a graph. And when looking at items, if you have the SQL API, then each item is known as a document. And if you choose the Table API, each item is known as a row. So let’s go on to Azure and see how we can work with Azure Cosmos DB. So here we are in Azure. Now we can go on to Azure Cosmos DB. Let’s go ahead and add an Azure Cosmos DB account. So first, let me choose the resource group.
Now, we can choose the API. So you could choose, as I mentioned, the Core (SQL) API, the API for MongoDB, Cassandra, Azure Table, or Gremlin. So for one of the accounts, let’s choose the Core (SQL) API and give the account a name. I’ll choose the region. Now, at this point in time, you can enable geo-redundancy. So that is where your data gets copied to multiple regions. You can also have multi-region writes, wherein your writes not only go on to your database in one particular region, but can also go to the data that’s replicated across multiple regions. For the time being, let me disable everything and move on to the next step. If you want, you can configure the Azure Cosmos DB account to be part of a virtual network. For now, I’ll leave it as it is. I won’t add any tags. And let’s go on to the review and creation.
And finally, let’s go on to create. Now let’s go ahead and create one more Azure Cosmos DB account. So I’ll choose the same resource group. This time, let me choose the Azure Table API, just to show you that you can have an Azure Cosmos DB account with a different API. I’ll give it a name and choose the region. Again, I’ll disable geo-redundancy and multi-region writes; again, no virtual network. Let me go on to review and create, and let me go ahead and hit the Create button. So let’s come back once both accounts are in place. Now, one quick thing to note is that when we created our Cosmos DB account, we didn’t have to provide any virtual machine details or any infrastructure information. All of that—the scaling, the storage, the compute, the memory, the I/O operations—everything is managed for you in the background. All you do is concentrate on the data itself. Now, once the account is in place, if you go over to the one where we have provisioned the SQL API, you can go on to the Data Explorer.
Now, please note that you can create a database, a container, and add items using the APIs that are in place. So you could use the APIs from a .NET program or any other supported programming language to go ahead and create your databases, containers, and items. But if you want to see how it works, you can actually do it from the Data Explorer within the Azure Portal itself. So first, we have to go ahead and create a new database. After you’ve specified the database name, you must specify the provisioned throughput. So, basically, 400 is the number of request units you are provisioning for this database. So remember, a request unit is a blend of the CPU, the memory, and the I/O, which is given to the underlying database. The minimum that you have to provision is 400. And over here, you can see how much you would spend hourly and on a daily basis. So let’s go ahead and click “okay.” Within the database itself, you’ll go ahead and create a new container. So let’s say I want to store JSON documents for various customers. So I’ll give the container the ID “customer.”
You must now specify what the partition key is. So Cosmos DB uses the values of the partition key to distribute items across partitions on the physical infrastructure. Now, this helps with the even distribution of the items. It also helps when you query for the items in your underlying container. So please make sure that you choose a partition key that allows for an equal distribution of items across these multiple partitions. Now, for example, let’s say that I choose the customer city as the partition key.
Now, maybe you have 100 customers who are from New York City. So that means that there are a lot of items that have the same value for the partition key. That means they will all be concentrated on a single partition. Hence, this is not a good candidate for a partition key. So what could be a good candidate for the partition key? Well, you could use the customer ID. So remember, customer IDs are unique, and they would allow for an equal distribution, or at least an even distribution, of your items across multiple partitions. Please keep in mind that in Cosmos DB, the JSON documents that you store in the container must have an id attribute by default. So either you could have /id itself as the partition key, or you could decide on some other attribute to be the partition key as well. For now, let me say that /id itself is the partition key, and from the perspective of domain modeling, what’s going to happen is that we are going to be mapping this /id to basically our customer ID.
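The effect of partition key choice on distribution can be sketched with a small simulation. This is not Cosmos DB’s actual partitioning hash (that is internal to the service); MD5 modulo the partition count is just a stand-in, and the customer values are made up:

```python
import hashlib

def partition_for(key_value: str, num_partitions: int) -> int:
    """Map a partition key value to a logical partition via a hash,
    mimicking (not reproducing) Cosmos DB's hash partitioning."""
    digest = hashlib.md5(key_value.encode()).hexdigest()
    return int(digest, 16) % num_partitions

# 100 customers who all live in New York, with city as the partition key:
cities = ["New York"] * 100
city_partitions = {partition_for(c, 10) for c in cities}
print(len(city_partitions))   # 1 -- every item lands on one partition (a hotspot)

# The same 100 customers keyed by unique customer ID instead:
ids = [f"customer-{i}" for i in range(100)]
id_partitions = {partition_for(i, 10) for i in ids}
print(len(id_partitions))     # many partitions are used -- an even spread
```

The same value always hashes to the same partition, so a low-cardinality key like the city concentrates everything in one place, while unique IDs spread the load.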
Now you could go ahead and provision dedicated throughput for this container itself, but I’ll leave everything as it is. Now, if you want to have a unique key under your partitions, you can do that as well. But let me leave everything as it is, since we’re just starting off with the SQL API, and let’s go ahead and create our container. Once the container is in place, you can go onto the items and create a new item. So I can see that automatically it’s asking you to enter a value for the id. So even if you had chosen, let’s say, the customer name as the partition key, you would still need to enter an id value. That’s why, to make it simple, I’m making both the id and the partition key the same. So let me set the id to 1. Let me go ahead and add a customer name and a customer city, and let’s go ahead and click on Save.
Now you can see that there are some properties that have been auto-generated by the system itself. So you have a resource ID, you have the _self property, and you have the _etag, which is used for optimistic concurrency control, which you do from your program. Then you have _ts, which is the timestamp, and you can create a new item. Now, what happens if I add the same item again? So you can see you’re getting an error because in the database itself there is conflict resolution where the id has to be unique, right? So let me change the id, and let me set the customer name as User B. I can put the customer’s city as New York. Let me click on “save.” Let me add a new item. Now, since this is a NoSQL-based database, you can actually add other attributes to a document as well. So you can add the age, and we can click on Save. Now we can go ahead and create a new SQL query. So you can use normal SQL syntax to query the documents. So let’s do a normal SELECT * FROM c and execute the query.
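As a side note on the _etag property, here is a minimal in-memory sketch of how ETag-based optimistic concurrency works from a program. The real service compares the document’s _etag against an If-Match request header and returns HTTP 412 on a mismatch; this DocumentStore class and its data are purely illustrative:

```python
import uuid

class DocumentStore:
    """Tiny in-memory sketch of Cosmos DB's _etag-based optimistic
    concurrency (the real service checks the If-Match header)."""
    def __init__(self):
        self._docs = {}

    def upsert(self, doc_id, body):
        etag = str(uuid.uuid4())          # every write gets a fresh etag
        self._docs[doc_id] = {"body": body, "_etag": etag}
        return etag

    def read(self, doc_id):
        return self._docs[doc_id]

    def replace(self, doc_id, body, if_match):
        current = self._docs[doc_id]
        if current["_etag"] != if_match:  # someone changed it since we read it
            raise RuntimeError("412 Precondition Failed: document was modified")
        return self.upsert(doc_id, body)

store = DocumentStore()
etag1 = store.upsert("1", {"customerName": "User A", "customerCity": "New York"})
# A second writer sneaks in and changes the document...
store.upsert("1", {"customerName": "User A", "customerCity": "Boston"})
try:
    # ...so our update, conditioned on the stale etag, is rejected
    store.replace("1", {"customerName": "User A2"}, if_match=etag1)
except RuntimeError as e:
    print(e)
```

This is the pattern your program follows: read the document, remember its _etag, and send that etag back with the update so concurrent changes are never silently overwritten.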
So you’ve got all the documents in place. As you can see, this document is completely separate, with one extra property. So the documents in the container do not need to conform to any schema, and that’s why it’s known as NoSQL. If you want, you can also add a WHERE clause. Let me click on “execute query.” So you could also filter out your results using the WHERE clause. Right, so this is from the SQL API. If you go over to the Cosmos DB account with the Table API, in the Data Explorer, you can click on New Table. You can provide a table ID, in the same way that you did with the SQL API, along with the throughput. So the throughput concept is the same for all APIs. So it just gives a measure of how to charge the customer when they work with Azure Cosmos DB. I’ll click “Okay,” which will automatically create a database named TablesDB, and within it you have your table. Then you can go on to the entities and add an entity.
As with tables in Azure Storage Accounts, you have a partition key and a row key. So your partition key could be mapped to your customer ID, so you can place a value of 1. Your row key can be mapped to your customer name. And if you want, you can add another property. You can click on “Add Entity.” So now you’re basically dealing with tables of data. So you could use different APIs depending on your requirements. Depending on the data that needs to be stored, you would choose the API accordingly. Right, so this marks the end of this chapter, in which we have looked at an introduction to Azure Cosmos DB.
37. AZ-203/204 – CosmosDB – Partition Key
Hello and welcome back! In this chapter, I want to talk about partition keys because they are important to understand when looking at Azure Cosmos DB. When we looked at our lab, we created an Azure Cosmos DB account and chose the SQL API for the first account. And we saw that we could store JSON-based documents. Now, we mentioned that the id is the partition key.
So each customer will have an ID, and that will form our partition key. So a container is a grouping of your items. So if you look at a traditional SQL database, it’s like a table that has your rows and your columns. But over here, since everything is stored as JSON-based documents, it’s stored in a container. Now, by default, if you don’t set the partition key, it is the id. Every document needs to have an id property, and the container also needs to have a property that you designate as the partition key. So remember, your JSON document has properties like the id, the customer name, and the customer city, and each property has a value. So, when you create a container, you must specify which property you want to use as the partition key.
So which property would be your partition key? Now, the id itself is used to identify an item within a partition. Just keep this in mind before we go on to the next step, where we delve deeper into the partitions themselves. So again, an id is used to identify an item within a partition. The partition key is used to decide which partition the item will reside on. And then the id plus the partition key are used to uniquely identify the item itself. So if your id is separate and, let’s say, the partition key is the customer name, together they will decide how to uniquely identify the item. Otherwise, if your id and your partition key are the same, the id alone uniquely identifies the item. Here, you just have the extra benefit of specifying a different property as the partition key. Now let’s go deeper into the partition key itself to understand how it works.
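The “id plus partition key” identity rule can be shown with a small sketch, where a container is modelled as a dictionary keyed by the (partition key value, id) pair. The customer data here is made up:

```python
# Items are uniquely identified by (partition key value, id).
# Two customers may share a name (the partition key here) as long
# as their ids differ within that logical partition.
container = {}

def upsert(item, partition_key_path="customerName"):
    key = (item[partition_key_path], item["id"])
    container[key] = item

upsert({"id": "1", "customerName": "User A", "customerCity": "New York"})
upsert({"id": "2", "customerName": "User A", "customerCity": "Boston"})   # same name, different id: OK
upsert({"id": "1", "customerName": "User B", "customerCity": "Chicago"})  # same id, different partition: OK

print(len(container))  # 3 distinct items
```

Note that the id “1” appears twice without conflict, because the two items live under different partition key values.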
Now we’ll look at the partitions themselves. So what Cosmos DB does is that it takes your data, shards it, and spreads it across different logical partitions. Because of this segregation of data, you have different partitions in place. Now, these are logical. When it comes to the underlying physical infrastructure, the logical partitions are in turn distributed across different physical partitions. You could have two logical partitions stored on one physical partition, and maybe another logical partition stored on a different physical partition. So instead of having all of your data in just one place, it’s always a good idea to partition your data. And that’s the design pattern Cosmos DB uses for storing its data. Now, how does Cosmos DB decide how to create the logical partitions?
It does this by creating a hash. So it uses a hash function over your partition key. And that’s why it’s so important to carefully decide which property should become your partition key. So it is the hash of that value that then decides on which partition your item will reside. For example, if you choose the customer name as your partition key, you could have items spread across multiple logical partitions depending upon the output of the hash function. But let’s go back to the earlier example, where you have a customer city of New York and 1,000 items with that customer city as the partition key value. The hash output would always be the same, so those thousands, or maybe tens of thousands, of records would all go to one partition itself. So you have all the data in one logical partition, and this is not a good candidate for the partition key. Another important aspect I’d like to discuss is throughput. So remember, when you define your database or container, you define the number of request units. So by default, the minimum is 400.
Now these request units are equally divided among the partitions. So let’s say that you had ten partitions based on the data in the container itself. Each partition would get 40 request units. So, if you had chosen a partition key that puts all or most of your data in one partition, and you try to write to or read from this partition, the request units consumed may exceed 40. So even though you have provisioned a throughput of 400 for the entire container, because all of your data is in one partition and because Cosmos DB splits the throughput across the different partitions, this becomes something known as a “hotspot,” wherein all of the input and output is concentrated on this one partition. As a result, the throughput will be exceeded, and your application will start running into errors simply because the consumption is greater than the throughput that is provisioned for this particular partition.
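The arithmetic behind the hotspot problem is simple enough to work through. The numbers below are illustrative (the real mapping of request units to physical partitions is managed by the service):

```python
# Provisioned throughput is divided evenly across the partitions.
provisioned_rus = 400
partitions = 10
per_partition = provisioned_rus / partitions
print(per_partition)  # 40.0 RU/s available to each partition

# If a bad partition key concentrates 90% of the traffic on one partition:
total_workload = 300                      # RU/s the application consumes overall
hot_partition_load = 0.9 * total_workload # 270 RU/s hitting a single partition
print(hot_partition_load > per_partition) # True: that partition is throttled,
                                          # even though 300 < 400 overall
```

This is why an application can see throttling errors while apparently staying well under its provisioned throughput.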
And that’s why it’s always important to ensure that you choose the right property for your partition key. So when you have a wide range of values for a property, you should choose that as your partition key. So some good candidates for the partition key are the customer ID and the customer name, and not things like the state or the customer city, which do not have a wide range of values. This is from the throughput and storage perspective. Another critical factor is the queries you run against your containers. Now let’s say that you define the partition key as the customer name. And let’s say that you fire the first query wherein you’re trying to fetch the items or the documents from your container where the customer city is equal to New York. So you’re basically trying to find or fetch the items based on the customer’s city.
Now, the customer city is not your partition key. That means Cosmos DB has to go through all of the partitions just to satisfy your query. Remember that by forcing Cosmos DB to perform read operations on all of your partitions, you are unnecessarily increasing the number of request units consumed for your entire container. Also, keep in mind that the more you consume, the higher the cost. So remember, there’s a direct relationship between the number of request units you consume and the cost you incur. Now, if you had to look at the second query, wherein you are querying based on the customer name: since the customer name is the partition key, Cosmos DB will calculate the hash of that value, go to the required partition only, and get that data. So you’re reducing the number of request units and read operations, resulting in a more cost-effective and efficient query. So if you’re going to query frequently on a property, make sure that also becomes a possible candidate for your partition key. Right, so this marks the end of this chapter, in which we looked much deeper into partition keys in Azure Cosmos DB.
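The fan-out difference between the two queries can be simulated. Again, MD5 is only a stand-in for the service’s internal hash, and the documents are the made-up customers from the lab:

```python
import hashlib

def partition_for(value, n=4):
    """Stand-in hash routing a partition key value to one of n partitions."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16) % n

# Hypothetical container partitioned on customerName
docs = [
    {"id": "1", "customerName": "User A", "customerCity": "New York"},
    {"id": "2", "customerName": "User B", "customerCity": "New York"},
    {"id": "3", "customerName": "User C", "customerCity": "Chicago"},
]
partitions = {}
for d in docs:
    partitions.setdefault(partition_for(d["customerName"]), []).append(d)

def query(filter_field, filter_value):
    """Return the matches plus how many partitions had to be scanned."""
    if filter_field == "customerName":           # filtering on the partition key:
        targets = [partition_for(filter_value)]  # the hash points at one partition
    else:                                        # any other filter: fan out to all
        targets = list(partitions)
    matches = [d for p in targets for d in partitions.get(p, [])
               if d[filter_field] == filter_value]
    return matches, len(targets)

_, scanned_city = query("customerCity", "New York")  # cross-partition query
_, scanned_name = query("customerName", "User A")    # single-partition query
print(scanned_name <= scanned_city)                  # True: fewer partitions touched
```

Each extra partition scanned translates into extra request units consumed, which is exactly the cost difference described above.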
38. AZ-203/204 – CosmosDB – Consistency Levels – Part 1
Hi, and welcome back. Now, in this chapter, I want to talk about Cosmos DB consistency levels. Now, why do we have this feature of consistency levels? So let’s try to understand this in much more detail. Now, Cosmos DB provides high availability for your data. So within a particular region itself, it makes multiple copies of the data so that even if one copy gets lost, we have other copies still in place. And remember that you can also have multiregion accounts where your data can be replicated across multiple regions.
So let’s take a simple use-case scenario. Let’s say there’s a document in your database, and one of the properties has an order value of ten. So this has been written to a container in your database. Remember, this value or item is now propagated; as I said, there are multiple copies. So let’s take, for example, three copies that are being made with the use of replica sets. Now, let’s say that a change has been made to the property by a writer. So let’s say the order value has been changed from 10 to 15. So there is a writer; it could be a program or an application that has made this change. Now, say you have another user, a reader, that’s trying to read the order value.
Now, remember that the order value of each replica set was initially ten. Now the value has changed from 10 to 15. As a result, the value must now be propagated across all replica sets. But let’s say that during this time—even though this happens in a fraction of a second, in terms of milliseconds—when you have millions of users and a lot of data, this does count. So let’s say now that the reader is making a request for that item, which has the order value. During the time when the update is taking place, what is the value that’s going to be returned to this reader? Would this reader see a value of ten, or would this reader see a value of 15? Let’s say, for example, that in one of the replica sets, the value has changed to 15, but the others still have ten. What is the value that’s going to be returned to that reader?
This is where you have the concept of consistency levels, which you can assign to your Cosmos DB account. So you have five levels in place: strong, bounded staleness, session, consistent prefix, and eventual. Let’s start with strong and eventual. If you set your Cosmos DB account to the strong consistency level, what’s going to happen is that if the reader makes a request for that item while the item is changing, Cosmos DB will first ensure that all the values have been set to 15, and only then will the value be returned to the reader. So in this case, there is a small latency, because the reader has to wait until all the values are updated before the updated value is sent back. So this is a compromise in terms of speed. But remember, this has no compromise when it comes to consistency. So the reader will always be guaranteed to get the most recent committed version of this particular item.
So your program or application may have the need to always get the most recent version, especially when it comes to critical data. Let’s say that you have patient records, or that you have a hospital that has an inventory system, and in that system, you always have to make sure that you get the most recent data. In such a case, you have to choose the strong consistency level, which guarantees you the latest committed version of the item. So this is from the strong point of view. Now let’s look at eventual consistency. Now, this is the other way around. So the reader might get a value of 10, because although all the values will eventually become 15, there is no guarantee on what value the reader will read. In exchange, there is no compromise in terms of speed or latency. So as soon as the reader makes a request, whatever the value is on the replica it hits, the reader will get that value. So the reader might get the value of ten; it will not necessarily get the most recent version of 15. Maybe later on, after some time, when the reader again makes a request, eventually, because all of the replicas will have the value of 15, the reader will get the value of 15.
So in terms of speed, it’s fast, but in terms of consistency, it’s a problem. And it’s up to you in your application to take this sort of consistency into consideration, because your users might be seeing stale data. If this is not a concern, then you can use the eventual consistency model. Remember, there is an additional cost with strong consistency, because you must ensure that all the values are updated: to make sure all the replicas are consistent, the read is served only after the replication is complete. With eventual, it’s more like: you make a request, you read the data, and eventually it will become consistent. So even in terms of cost, eventual is the most cost-effective option, whereas strong is the least cost-effective option, right? So this is in terms of the strong and eventual consistency levels. Now, let’s move on to the next three.
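The difference between the two levels can be modelled with a toy replica set. This is a deliberately simplified model (the real service uses quorum-based replication), just to show what each kind of read returns while a write is mid-propagation:

```python
class ReplicatedItem:
    """Toy model of one item copied across three replicas."""
    def __init__(self, value):
        self.replicas = [value, value, value]

    def begin_write(self, value):
        # the new value has reached only the first replica so far
        self.replicas[0] = value

    def finish_write(self):
        # replication completes: every replica now agrees
        self.replicas = [self.replicas[0]] * 3

    def strong_read(self):
        # strong: the read waits until all replicas agree,
        # then returns the latest committed value
        self.finish_write()
        return self.replicas[0]

    def eventual_read(self, replica_index):
        # eventual: answers immediately from whichever replica
        # the request happens to land on
        return self.replicas[replica_index]

item = ReplicatedItem(10)
item.begin_write(15)          # propagation is in flight
print(item.eventual_read(2))  # 10 -- a lagging replica answers immediately
print(item.eventual_read(0))  # 15 -- an already-updated replica answers
print(item.strong_read())     # 15 -- the read waited for all replicas first
print(item.eventual_read(2))  # 15 -- eventually, every replica agrees
```

The eventual reads are fast but may return either value; the strong read pays the wait and always returns the fully replicated value.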
39. AZ-203/204 – CosmosDB – Consistency Levels – Part 2
So now, let’s look at the others. Let’s take the case of bounded staleness. So let’s assume that again, we have an item where the order value is equal to five. Now let’s say we have a writer, so we have a user that’s making a change to the value.
So the order value is set to ten. That’s the first change. The next change is that the order value is set to 15, and then to 20. Now we select the bounded staleness level. That means your application doesn’t mind getting a delayed result by, let’s say, some seconds, or maybe by a set of versions. You can say that the application doesn’t mind being three versions, or maybe two seconds, behind. So this kind of saves a little bit on the cost. So over here, let’s say a reader is now making a read. Initially, because you mentioned that only after a particular amount of time should the reader see the most recent version, the reader might see an order value of five. When the reader makes a request after some time, the order value of ten is displayed. Again, after some time, it could see an order value of 15.
So there is a slight delay in getting the most recent version. But again, this is stricter than eventual, because with eventual, you just don’t know when you are actually going to get the most recent version. Here, you’re actually stamping the application with the guarantee that it will only ever be, say, three versions, two versions, or this much time in seconds behind. So you’re just putting a bound on it and saying this is all the delay the application can afford. So that’s the difference between bounded staleness and eventual, because sometimes students get confused between eventual and bounded staleness. Now, next, let’s talk about session. So this is basically pertinent to the session itself. So let’s say you have a writer in one session, again, making those changes. Now, if you have a reader in the same session, they will always get the same value. So it’s consistent with what the writer is writing. But if you have a reader in a different session, there could be a delay in getting the values. So this quite simply depends on the session. And then next, we have consistent prefix.
So with consistent prefix, a reader will see a delay in the writes, but they will always be in order. So the reader will never see an out-of-order situation, right? So over here, you can see the writes in order: 10, 15, and 20. So maybe the reader makes the first request, and it gets the value of ten, even though that’s not the most recent version. Then, when the reader makes another request after the value has gone up to 15, it might still not get that value. So there is a delay over here, but next time, when the reader makes a request, it will get the next version in order. So it will not jump to the value of 20; it’ll get the value of 15. So the reads always follow the order of the writes. Right. Again, this is different from eventual, because with eventual there is no guarantee on the order. Right. So this marks the end of this chapter, in which we have looked at the consistency levels.
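Consistent prefix and bounded staleness can both be pictured as reading a prefix of the ordered write log. A rough sketch, using the order values from the example:

```python
# The writer's changes, in commit order
write_log = [5, 10, 15, 20]

def consistent_prefix_read(lag):
    """Return the newest value visible to a reader who is `lag`
    writes behind: always a prefix of the log, never out of order."""
    visible = write_log[:len(write_log) - lag]
    return visible[-1]

def bounded_staleness_read(lag, k=2):
    """Same idea, but the lag can never exceed K versions."""
    return consistent_prefix_read(min(lag, k))

print(consistent_prefix_read(lag=1))  # 15 -- one write behind, but in order
print(consistent_prefix_read(lag=0))  # 20 -- fully caught up
print(bounded_staleness_read(lag=5))  # 10 -- the lag is capped at K=2 versions
```

Eventual consistency, by contrast, could hand back any of these values in any order, which is exactly the guarantee these two levels add.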
40. AZ-203/204 – CosmosDB – Partition Key and Consistency Levels Recap
So now let’s have a recap of what we have learned so far in terms of the partition keys and the consistency levels—just a quick overview. So first, remember the partition key. Choose a partition key that has a wide range of values. This allows the items to be spread across a wider range of logical partitions. Beware of hotspots: these are when too many requests are made on the same partition. Also, when you query for data, remember that if you want to perform a filtered search, ensure you include the partition key for a more effective search pattern. The cost to read a 1 KB item is normally one request unit, but this also depends on the consistency level that you assign. So remember, there are different consistency levels that are available.
So what are the different consistency levels? First is strong, wherein the client reads the most recent committed version of an item. Then there’s bounded staleness; in this case, the reads may lag behind the writes by up to K versions of an item or a T time interval. Then there’s session, which is more relevant to a single client session. Then you have consistent prefix, which guarantees that no out-of-order writes will be seen. And then you have eventual, wherein there is no ordering guarantee for the reads at all. Just some important points when it comes to Azure Cosmos DB. Let us quickly go on to the Azure Portal; I just want to show you a couple of these aspects. So, here we are in our Azure Cosmos DB account.
Now, if you go on over to the homepage itself, in the monitoring section, you can see the number of requests that are being made on your Azure Cosmos DB account. You can also see your throughput billing. So this gives you a good idea of how much money you’re spending on throughput on your container or database. If you go on to the metrics section here, you can see the average throughput and the average number of requests. And you could view this based on a time range. So you could see that you’re consuming 0.02 request units per second for your East U.S. region.
And this is for the Central US region, where you’re not consuming any request units. If you go on to the throughput, you’ll get a better idea of the throughput itself. You can look at your storage, your availability, and the latency. So remember that Cosmos DB gives you a latency guarantee of under ten milliseconds, and you can see the actual timings are far below that threshold. Next, if you go on to the default consistency, currently it’s marked as Session. But if you want, this is where you can change the consistency for your Cosmos DB account. Right, so this marks the end of this chapter, which covers some additional aspects of Cosmos DB.
41. AZ-203/204 – CosmosDB – Making API calls
Hi and welcome back. Now, in this chapter, we look at a lab on how we can work with data in an Azure Cosmos DB account via the API. Now, obviously, there is support on the Azure platform to work with Cosmos DB data from various programming languages. Now, you could have wrapper classes. You could have packages in place that work around the API itself. And those packages can be used to work with data in an Azure Cosmos DB account. But it’s always important to understand how you can work with the data via the API provided on the Azure platform. So what are we going to do in this particular lab? We’re going to be using a popular tool known as Postman to issue API calls. The first thing that we need to do is to generate something known as an authorization key. So this will allow us to actually authorise ourselves to work with the data in an Azure Cosmos DB account.
That authorization key needs to be generated via a program, so we can’t do it by hand. We need a program to generate the authorization key, and then we’ll see how to perform some of the basic CRUD operations. So let’s go ahead with our lab. So before we go on to the actual API calls, I just want to show you that from Visual Studio, from a .NET program, you could use packages in place to actually work with data in an Azure Cosmos DB account. So, for example, in this program, if I go onto the NuGet package manager, if I go on to manage NuGet packages, and if I go onto the installed packages, I’ve installed this Microsoft Azure DocumentDB package. So this is a wrapper around the underlying REST API. Using the inbuilt classes and methods, I can just quickly go on and work with the data directly in my Cosmos DB account. However, it is essential for any programmer to understand how to make REST API calls to and from your Cosmos DB account. So let’s do that.
So I’ve gone ahead and downloaded the Postman tool. Again, it’s free to download and use. So this is a very popular tool when it comes to issuing REST API calls. Now, here I am in the Microsoft documentation. So this is the Azure Cosmos DB REST API reference. So let’s look at the REST API when it comes to documents themselves. So you could use API calls to create your databases and work with them accordingly. You can use the API calls to work with the collections or containers, and you can use your API calls to work with documents themselves. So let’s go on to the documents section. Now, in the Documents section, again, you could use the API calls to create a document, list documents, or get a document. Let’s go on to the List Documents section. So over here, it’s saying the method to use is the GET method. The request URI is this, wherein you have to replace the db-id with your database ID, and you have your collection or container ID over here. So let’s go on to the Postman tool.
So here I have the Postman tool open. So if I hit a new request, it’s a GET request. Now, as per the documentation, I’ve placed my Azure Cosmos DB account URI, which ends in documents.azure.com. Then I have dbs, the name of my database, which is AppDB, then colls and the name of my container, and then I have docs. So I want to list all the documents that are there in our customer container, in our AppDB database, in our SQL API-based Azure Cosmos DB account. Now, let me just go ahead and click on Send, and let’s see what happens. So currently it’s saying that the request header authorization is missing. So we have to pass a token in the authorization header to ensure that we are authorised to make this call to get the data in our container. Now, let’s go back onto Azure, and let me go on to the keys section.
Now, over here, you can see that you have a primary key, a secondary key, a primary connection string, and a secondary connection string. Now, in a .NET program, you could directly make use of the primary key, the secondary key, or the connection string to make a connection to the database and to the container and then fetch the documents. But when you’re making a pure REST API call, you have to make sure that you formulate the authorization and place that authorization token in the header itself. So for that, I mentioned that you have to create a program. There’s actually already a code snippet available in the Microsoft documentation on generating this authorization token. Now, here is the program. So this is taken from the Microsoft documentation itself, and it can be used to generate that authorization token. This is the method described in the Microsoft documentation that will generate the token for you. Now, this method basically requires certain parameters from you in order to generate the token. So first is the verb being used. Since we want to get the list of documents, the documentation says that you have to issue the GET method. Next, what is the resource type? We are basically working with documents—we want to get a list of documents within our collection or container—so our resource type will be docs.
Next is the resource ID. So we want to work with our customer container. Next is the date of our request. Please make sure that this date is in line with the server date. So my current date is July 18, and the time is approximately 12:54, so I'll set it accordingly. Next is the key. Where do we get the key value from? If you go on to Azure, on to your Azure Cosmos DB account, and on to Keys, you can take either the primary or the secondary key. So in a .NET program, you can use the key directly to work with the Azure Cosmos DB account; here, on the other hand, the key is used in the process of generating the authorization token. Next, the key type is master, and the token version is 1.0. Now, if I go ahead and run the program, I get the authorization token, so I can go ahead and just copy this. Now, if I go on to the Postman tool, here is my request. You have to go on to the Headers section. Remember, you have to add this to the headers of your request. Make sure there is a key of Authorization, and in its value, place the authorization token that was generated from your program.
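The code snippet in the Microsoft documentation is in C#, but the signature algorithm it describes can be sketched in a few lines of Python. This is a minimal illustration, not the exact snippet from the docs; the master key here is a dummy value, and a real one would come from the Keys blade shown above:

```python
import base64
import hashlib
import hmac
import urllib.parse

def generate_master_key_token(verb, resource_type, resource_id, date, master_key):
    """Build a Cosmos DB master-key authorization token, following the
    signature algorithm described in the Microsoft REST API docs."""
    # String to sign: lower-cased verb, resource type, resource link, and
    # lower-cased request date, each terminated by a newline, plus an empty line.
    payload = (f"{verb.lower()}\n{resource_type.lower()}\n{resource_id}\n"
               f"{date.lower()}\n\n")
    # The master key from the portal is base64; decode it to get the HMAC key.
    key = base64.b64decode(master_key)
    signature = base64.b64encode(
        hmac.new(key, payload.encode("utf-8"), hashlib.sha256).digest()
    ).decode("utf-8")
    # The final token is URL-encoded: type=master&ver=1.0&sig=<signature>
    return urllib.parse.quote(f"type=master&ver=1.0&sig={signature}", safe="")

# Dummy key for illustration only; the date must exactly match the x-ms-date
# header you later send with the request (RFC 1123 format).
token = generate_master_key_token(
    verb="GET",
    resource_type="docs",
    resource_id="dbs/AppDB/colls/customer",
    date="Thu, 18 Jul 2019 12:54:00 GMT",
    master_key=base64.b64encode(b"dummy-master-key").decode(),
)
print(token)
```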
Next is the x-ms-version header. This remains the same; it's a static value. And next, you have the date. Whatever date you mentioned in your program, you have to make sure that you have the same date over here, because it has to match what was used to generate the authorization token. Now let me go ahead and click on Send. So now you can see that you're getting all the documents from your container. Now, please note that if there is a time mismatch, you will be notified in the output; make sure you change the time accordingly. If the time changes, add the new time over here, generate the authorization token again, and place the new token in the header. These are all important points. So this is how you issue a request to get documents from a container. Now, likewise, as I said, there are REST API calls in the documentation for inserting documents, deleting documents, updating documents, et cetera. So please make sure that you try out these different REST API calls. What is important is to understand how to generate the authorization token and what goes into an API request. Right? So this marks the end of this chapter.
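Putting the headers together, a sketch of what the Postman request carries might look like the following. The token string is a placeholder standing in for the value generated by the program from the Microsoft documentation, and the API version string shown is one published version; check the REST docs for the current one:

```python
# Placeholder for the URL-encoded authorization token generated earlier.
auth_token = "type%3Dmaster%26ver%3D1.0%26sig%3D..."

headers = {
    "Authorization": auth_token,
    "x-ms-version": "2018-12-31",  # static REST API version value
    # Must be the exact same date used when generating the token.
    "x-ms-date": "Thu, 18 Jul 2019 12:54:00 GMT",
}

# With a library such as 'requests' installed, the call would then be e.g.:
# requests.get(
#     "https://myaccount.documents.azure.com/dbs/AppDB/colls/customer/docs",
#     headers=headers)
print(sorted(headers))
```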
42. AZ-203 – Azure Database Migration Service
Hi, and welcome back. Right, so in this chapter, let's look at the Azure Database Migration Service. Now, this service is used to migrate data from various sources to various types of destinations. You can perform both offline and online migrations. An online migration can limit your application's downtime. So if you want to ensure that during the migration itself you have less downtime for your application, then you can go ahead and choose an online migration.
Now, please note that for an online migration to happen, there are some prerequisites that you have to implement, either on the source or on the destination. Again, this depends upon the source and destination databases that you're plugging into the Database Migration Service. Now, the service has support for migrating data to Azure SQL databases, to SQL Server databases hosted on Azure virtual machines, to Azure Database for PostgreSQL, and it also has support for Cosmos DB, which is very important from an exam perspective. So let's go ahead and look at an example of this. So, here we are in Azure. Now, what I'm going to do first is create an instance of the Azure Database Migration Service. Now, in the background, this Azure Database Migration Service will create Azure virtual machines.
So, on these Azure virtual machines, a service will be hosted that actually performs the data migration. A virtual network will host these virtual machines. So I'm going to issue the following command to go ahead and create an instance of the Azure Database Migration Service. So I'm using the az dms create command. I'm specifying the location, the name of the service, the resource group, and what the SKU is. So I'm going to be using Premium 4 vCores, and I'm specifying what the subnet is, that is, which subnet the virtual machines should be hosted in. As I said, remember that the Database Migration Service internally will be creating virtual machines, and those virtual machines will actually be responsible for performing the data migration. So let me go ahead and execute this command.
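The command described above might look like the following sketch. The resource group, service name, region, and subnet resource ID are all placeholders; substitute your own values (the subnet must be the full resource ID of a subnet in an existing virtual network):

```shell
# Hypothetical names throughout; Premium_4vCores is the SKU mentioned in this demo.
az dms create \
  --location centralus \
  --name myDmsService \
  --resource-group myResourceGroup \
  --sku-name Premium_4vCores \
  --subnet "/subscriptions/<subscription-id>/resourceGroups/myResourceGroup/providers/Microsoft.Network/virtualNetworks/myVNet/subnets/default"
```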
Now, after some time, you will have an instance of the Database Migration Service. So, if I go on to all of my resources, I'll have my DMS service. So this is the Azure Database Migration Service. Now the first thing to do is to create a new migration project. So let's go ahead and do that. Let's give the project a name. Now, we must determine the type of source server. So from where do we want to transfer the data? As you can see, you have multiple options available. I'm going to be transferring data from a MongoDB instance onto a Cosmos DB account. So I'm going to be choosing MongoDB as my source server type, and I'm going to be choosing Cosmos DB as my target server type. Now, you can also choose the type of migration. So you could choose either an offline data migration or an online data migration.
So I'm going to go ahead and create an offline data migration and click on Save. Now we can go ahead and create the project and create an activity. The activity will basically specify what we want to copy from the source to the target. Now, before we actually go on with the activity, let me go ahead and create an Azure Cosmos DB account. So I'll choose Azure Cosmos DB, I'll choose my resource group, and I'll give the account a name. For the API, I will go ahead and choose Azure Cosmos DB for the Mongo API. I'll go with the Central US region. I'm going to mark the version as 3.6. So please make a note of the version of your source MongoDB; this is important. When you're trying to replicate onto Azure Cosmos DB, you have to make sure that the same version exists. So I'm going to choose 3.6. I won't define anything in networks, I'll skip the tags, go to review and create, and then we'll go ahead and create the Azure Cosmos DB account. While this is going on, I've gone ahead and installed MongoDB 3.6 on a virtual machine in Azure. What I've done is make sure that it's listening on the private IP address of that virtual machine on the defined port. I can do that using a configuration file. I've gone ahead and created a database known as AppDB. I've created a collection known as Customer, and I've gone ahead and added two documents to this MongoDB collection. So I want to use the Database Migration Service to move these two documents from the MongoDB instance to Azure Cosmos DB.
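The configuration-file change mentioned above, making mongod listen on the VM's private IP, might look like this fragment of mongod.conf (the IP address and port here are hypothetical placeholders for this lab's VM):

```yaml
# Hypothetical mongod.conf fragment for MongoDB 3.6 on the Azure VM.
net:
  port: 27017                    # default MongoDB port
  bindIp: 127.0.0.1,10.0.0.4     # loopback plus the VM's private IP
```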
Now, once you have your Cosmos DB account in place, we can go on to Data Explorer. You can go ahead and create a new database if required and click OK. So now we have both our source and destination in place. So now when I go back onto the migration wizard, in the source, I'm going to select the connection mode as connection string mode. So over here, I have the public IP address. Remember that I have the MongoDB instance running on an Azure virtual machine, so I'm using the public IP address of the virtual machine and the port number. Let me click on Save. So this will actually go ahead, connect to the source, and see if it can make a successful connection. If it cannot make a successful connection, it will give you an error. Right, so that's done. Now the next thing is to go ahead and choose our target Cosmos DB account.
So it's automatically detected that we have a Cosmos DB account in place, so let me click Save for that as well. Now, in the database settings, I have my source database and the target, so I can go ahead and save that as well. So I've got my Customer collection, so that's going to be copied. Let me go ahead and click on Save, and now we can go ahead and give an activity name, and then we can go ahead and run the migration, so it will go ahead and start the migration. You can click on Refresh to see the status of the migration at any point in time. And you can see it's already complete. So let's go on to our Cosmos DB account and on to Data Explorer. So remember, in AppDB, we didn't create any collections before, but now we can see a Customer collection. If you go on to the documents, you can see that the two documents are part of this collection. So this has gone ahead and automated copying data from an existing MongoDB instance onto Azure Cosmos DB. Right? So this marks the end of this chapter.