1. Keyspaces overview
In this section, let's talk about Amazon Keyspaces, the new database service from AWS. So what exactly is Keyspaces? Keyspaces allows you to run your Cassandra workloads on the AWS cloud. It's a highly scalable, highly available, and fully managed database service. Cassandra is an open source, wide-column NoSQL data store. It's a key-value data store. And Keyspaces is serverless, just like DynamoDB, so you pay for what you use, and it also supports auto scaling. Keyspaces supports thousands of requests per second with virtually unlimited throughput and storage. It's compatible with what's called the CQL API, where CQL stands for Cassandra Query Language.
This is the API language that you use to interact with your Keyspaces database. Security is provided through IAM, VPC, and KMS. Data is encrypted by default, and Keyspaces supports encryption at rest as well as encryption in transit. It also offers continuous backups with PITR (point-in-time recovery), and all the writes are replicated three times across multiple AZs, so this provides for durability and availability. Keyspaces also offers a 99.99% availability SLA within a region, with no scheduled downtime. So this is a great feature. And as always, monitoring is provided through CloudWatch, and all the DDL actions are logged with CloudTrail. Now let's look at the typical use cases of Keyspaces.
You can use it for storing IoT device metadata, you can store user profiles like gaming profiles, you can store time series data like temperature or weather data, and you can store transaction data like you have on ecommerce websites. All right, so these were a couple of use cases, and as mentioned previously, we use CQL with Keyspaces. CQL stands for Cassandra Query Language. You use it for interacting with a Cassandra database and, of course, with Keyspaces as well. You can run these CQL queries using the CQL editor available in the AWS console, or you can use cqlsh, which is the CQL shell. And for programmatic access, you can use a Cassandra client driver. So these are the different ways you can run CQL queries on your Keyspaces database. All right, let's continue.
2. Migrating from Cassandra to Keyspaces
Now, migrating from Cassandra to Keyspaces is fairly easy. It's a two-step process. You export your existing Cassandra cluster data to CSV files, and then you import these CSV files using the cqlsh COPY command. So you have your Cassandra database, you export your data into a CSV file, and then you use the cqlsh COPY command to import the data from the CSV file into your Keyspaces database. The cqlsh COPY command splits your CSV file into smaller chunks and loads that data into Keyspaces in parallel. All right, so this is how migration from Cassandra to Keyspaces works. So that's all about migrating from Cassandra to Keyspaces. Let's continue to the next lecture.
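To make the "split into chunks and load in parallel" idea concrete, here is a minimal Python sketch. It is only an illustration of the concept, not how cqlsh COPY is actually implemented. The sample data, the `split_csv_rows` helper, and the `load_chunk` function are all hypothetical; `load_chunk` just counts rows where a real loader would write them to a table.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor


def split_csv_rows(csv_text, chunk_size):
    """Split CSV text (with a header row) into lists of at most chunk_size data rows."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    chunks, current = [], []
    for row in reader:
        current.append(row)
        if len(current) == chunk_size:
            chunks.append(current)
            current = []
    if current:
        # Keep the final, possibly shorter, chunk.
        chunks.append(current)
    return header, chunks


def load_chunk(chunk):
    """Hypothetical stand-in for writing one chunk of rows to a table."""
    return len(chunk)


# Hypothetical export of a small table: five rows, chunked three at a time.
sample = "id,name\n1,ann\n2,bob\n3,cy\n4,dee\n5,eli\n"
header, chunks = split_csv_rows(sample, chunk_size=3)

# Load the chunks concurrently and count how many rows were handled.
with ThreadPoolExecutor(max_workers=4) as pool:
    loaded = sum(pool.map(load_chunk, chunks))
```

Here the five data rows end up in two chunks, and both chunks are handed to worker threads at the same time.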
3. Read and write consistency in Keyspaces
Now let's talk about read and write consistency in Keyspaces. We have two read consistency modes in Keyspaces: LOCAL_ONE consistency and LOCAL_QUORUM consistency. LOCAL_ONE optimizes for performance and availability, so it returns the first value returned by any storage replica. So it's the fastest. LOCAL_QUORUM optimizes for data correctness. It requires at least two replicas to return a value before it can return that back to the application. Since the data is replicated three times, two replicas is a majority. So these are the two read consistency modes in Keyspaces. Write consistency always uses LOCAL_QUORUM, so that allows for durability. So that's about the read and write consistency modes in Keyspaces.
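The difference between the two read modes can be sketched with a toy model in Python. This is an assumption-laden illustration with three replicas, not a description of how Keyspaces works internally. The `read_value` function and the `(timestamp, value)` response format are hypothetical.

```python
def read_value(replica_responses, consistency):
    """Return a value once enough replicas have answered.

    replica_responses: list of (timestamp, value) tuples, in the order
    the replicas respond (toy model, hypothetical format).
    """
    required = {"LOCAL_ONE": 1, "LOCAL_QUORUM": 2}[consistency]
    if len(replica_responses) < required:
        raise RuntimeError("not enough replicas responded")
    considered = replica_responses[:required]
    # Among the replies considered, the most recently written value wins.
    return max(considered, key=lambda response: response[0])[1]


# The first replica to answer is stale; the other two have the latest write.
responses = [(1, "old"), (2, "new"), (2, "new")]

fast_read = read_value(responses, "LOCAL_ONE")
correct_read = read_value(responses, "LOCAL_QUORUM")
```

In this model the LOCAL_ONE read returns the stale value because it takes the first answer, while the LOCAL_QUORUM read waits for two answers and returns the latest one. Because writes reach at least two of the three replicas, any two replicas include at least one with the latest write.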
4. Keyspaces pricing
Now, let's talk about Keyspaces pricing. Keyspaces pricing is more or less similar to DynamoDB pricing. We have two pricing modes: on-demand mode and provisioned mode. So let's look at the on-demand mode. This uses RRUs and WRUs, which are read request units and write request units, and you pay for the actual reads and writes that your application performs. And just like DynamoDB, you use this with unpredictable application traffic. So when you cannot predict the traffic from your application, you would use the on-demand mode. One RRU corresponds to one 4 KB read with LOCAL_QUORUM consistency. And if you use LOCAL_ONE consistency, then you can perform two 4 KB reads. So a LOCAL_ONE read costs you half as much as a LOCAL_QUORUM read.
And one WRU corresponds to one 1 KB write with LOCAL_QUORUM consistency. Then with provisioned mode we use the capacity unit terminology. So we call these RCUs and WCUs, or read capacity units and write capacity units. And just like DynamoDB, you specify the number of reads and writes per second. This mode allows you to optimize your costs. So if you have predictable application traffic, and if you can forecast your capacity requirements in advance, then you should use the provisioned mode and not the on-demand mode. One RCU here corresponds to one 4 KB read with LOCAL_QUORUM consistency. And it also corresponds to two 4 KB reads with LOCAL_ONE consistency.
So a LOCAL_ONE read costs you half as much as a LOCAL_QUORUM read, and one WCU corresponds to one 1 KB write, with LOCAL_QUORUM consistency, of course. And with both these pricing modes, if your query returns multiple rows, then you are billed based on the aggregate size of the data. So, for example, if your query returns four rows and each row has about 2 KB of data, then it's 8 KB in total. So you will be billed for two RCUs if you use LOCAL_QUORUM consistency, and one RCU if you use LOCAL_ONE consistency. All right? And just like any other AWS service, you will be billed for storage, backup and restore, as well as data transfer. All right? So that's about the Keyspaces pricing. Let's continue.
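The unit arithmetic above can be worked through with a small Python sketch. The function names are hypothetical, and rounding up to the next whole unit is an assumption made for illustration. Check the official pricing documentation for the exact billing rules.

```python
import math

READ_UNIT_BYTES = 4 * 1024   # one read unit covers up to 4 KB at LOCAL_QUORUM
WRITE_UNIT_BYTES = 1024      # one write unit covers up to 1 KB


def read_units(total_bytes, consistency="LOCAL_QUORUM"):
    """Read units for the aggregate size of the data a query returns.

    Assumption: sizes are rounded up to the next 4 KB boundary.
    """
    units = math.ceil(total_bytes / READ_UNIT_BYTES)
    # A LOCAL_ONE read costs half as much as a LOCAL_QUORUM read.
    factor = 0.5 if consistency == "LOCAL_ONE" else 1.0
    return units * factor


def write_units(row_bytes):
    """Write units for one row (assumption: rounded up to the next 1 KB)."""
    return math.ceil(row_bytes / WRITE_UNIT_BYTES)


# The example from the lecture: four rows of about 2 KB each, 8 KB in total.
total = 4 * 2 * 1024
quorum_cost = read_units(total, "LOCAL_QUORUM")
local_one_cost = read_units(total, "LOCAL_ONE")
```

With these assumptions, the 8 KB query costs two read units at LOCAL_QUORUM and one at LOCAL_ONE, matching the numbers in the lecture.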
5. Working with Keyspaces – Hands on
In this demo, let's explore Amazon Keyspaces (for Apache Cassandra). Keyspaces has made it very easy to get started. So let's simply click on the Get Started button here and you'll see what I mean. Here you can use this screen to create a keyspace. You can add a table to your keyspace. Then you can populate sample data into your table and then query your data. So simply clicking through these steps is going to get you started with Keyspaces. Let's go ahead. First we create a keyspace. A keyspace is like a container for your Cassandra tables. All right, let's click on this. And your keyspace has been created. What it actually did is it simply executed this CQL query in the background.
And you can use clients like cqlsh to run this same query. So you can copy the CQL here and run it using cqlsh, for example. Then let's create a table in the keyspace. Click on the Create Table button and the table should be created in a few seconds, I believe. So the table is ready now. And now we can add some data to this table. So here we are inserting one record into our table, and it contains name, email, and age. All right, so let's insert that. That's done. And now we are ready to query our data. So let's go to the CQL editor, and we can simply run the command. And there we go, a record is returned. So this is how we use Keyspaces.
Now, this was a guided example. You can also create your keyspaces and tables from this left side menu. So if you click on Keyspaces, you can create a keyspace. Let's name it My Key Space and create the keyspace. And it's available now. Now we can create a table within this keyspace. Let's go to Tables and let's create a table. So you choose a keyspace here and specify a table name. So let's say Employees, for example. And then you can add some columns here. So we could say Department ID, and you can choose the type. So we might want to have, let's say, an integer. Then you add another column, let's say Employee ID, and it could also be an integer. Then we can say Name.
So this could be text. And one more column we'll add, let's say Age. Okay, let's make it an integer as well. Okay, and then each table in Cassandra or in Keyspaces has a partition key, and it can also have an optional sort key. The sort key is called a clustering column. So let's create the partition key. You can choose multiple columns. So we can have Department ID and Employee ID as the partition key. And we can also add a clustering column. Clustering columns are used for sorting the data within a partition. So let's add a clustering column here. Let's say we want to sort by Age ascending. So that's how you define your table schema. Then you can choose the read/write capacity mode.
You can choose either on-demand or provisioned. If you choose provisioned, then you can specify the read and write capacity units here and also specify the auto scaling configuration. And this is exactly the same as what you have in DynamoDB. For now, I'm going to choose the on-demand mode. Then you can configure point-in-time recovery here. You can add tags if you like. And then this is the query that's going to create this table for us. All right, so let's create the table. And there you go. It's creating. And now we see that this table is ready. And you can go ahead and use the CQL editor to add more data into this table and play around with it. So that's all about this demo on Keyspaces. Let's continue.
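As a rough sketch of what the schema from this demo looks like in CQL, the Python helper below assembles a CREATE TABLE statement from the columns, partition key, and clustering column described above. The `build_create_table` function is hypothetical, and the identifiers `my_keyspace`, `employees`, `department_id`, and so on are assumed spellings of the names used in the demo. The statement the console generates may also include capacity mode and other options that are left out here.

```python
def build_create_table(keyspace, table, columns, partition_key, clustering=()):
    """Build a CQL CREATE TABLE statement.

    columns: dict of column name -> CQL type, in declaration order.
    partition_key: sequence of column names forming the partition key.
    clustering: sequence of (column name, "ASC" or "DESC") pairs.
    """
    column_defs = ", ".join(f"{name} {cql_type}" for name, cql_type in columns.items())
    # A multi-column partition key is wrapped in its own parentheses.
    primary_key = "(" + ", ".join(partition_key) + ")"
    if clustering:
        primary_key += ", " + ", ".join(name for name, _ in clustering)
    statement = (
        f"CREATE TABLE {keyspace}.{table} "
        f"({column_defs}, PRIMARY KEY ({primary_key}))"
    )
    if clustering:
        order = ", ".join(f"{name} {direction}" for name, direction in clustering)
        statement += f" WITH CLUSTERING ORDER BY ({order})"
    return statement + ";"


# Assumed identifiers for the table built in the demo.
create_employees = build_create_table(
    keyspace="my_keyspace",
    table="employees",
    columns={"department_id": "int", "employee_id": "int", "name": "text", "age": "int"},
    partition_key=["department_id", "employee_id"],
    clustering=[("age", "ASC")],
)
```

The resulting string can be pasted into the CQL editor or run through cqlsh. Department ID and Employee ID together form the partition key, and Age sorts the rows within each partition.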