Amazon Keyspaces (for Apache Cassandra) is a fully managed database service that is compatible with Apache Cassandra and designed to handle petabytes of data with high availability, scalability, and performance. By using Amazon Keyspaces, developers can build Cassandra-based applications without the need to worry about managing the underlying infrastructure. This knowledge base will provide an in-depth guide to Amazon Keyspaces, its architecture, setup, use cases, and best practices.
Introduction to Amazon Keyspaces
Amazon Keyspaces is designed to provide a fully managed and serverless version of the Apache Cassandra database. It allows users to work with the familiar Cassandra Query Language (CQL) without having to manage the complexity of provisioning, configuring, and maintaining a Cassandra cluster. With Amazon Keyspaces, you only pay for the resources you use, and the system automatically scales to meet your application’s needs.
Key benefits include:
- Serverless architecture
- No need to provision or manage servers
- Compatibility with Cassandra drivers and CQL
- High availability across multiple AWS Availability Zones
- Fully integrated with AWS services for enhanced security and monitoring
Key Features and Benefits
Serverless Architecture
Amazon Keyspaces operates as a serverless service, meaning you do not need to manage the infrastructure or configure nodes. It scales automatically based on the traffic and data load, ensuring your application can handle large volumes of traffic without manual intervention.
Cassandra Compatible
Amazon Keyspaces is compatible with Apache Cassandra, so you can use existing Cassandra drivers, CQL, and your application code with minimal changes. This makes it easy to migrate from self-managed Cassandra clusters to Amazon Keyspaces.
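As a rough illustration of reusing an existing driver, the sketch below connects with the open-source Python `cassandra-driver` package. It rests on several assumptions that are not stated in this article: the endpoint pattern `cassandra.<region>.amazonaws.com`, TLS port 9142, service-specific credentials generated for an IAM user, and a locally downloaded CA certificate file. Treat it as a starting point rather than a definitive setup.

```python
import ssl

# Assumption: Amazon Keyspaces accepts CQL connections over TLS on port 9142.
KEYSPACES_PORT = 9142


def keyspaces_endpoint(region: str) -> str:
    """Build the (assumed) regional service endpoint, e.g. for 'us-east-1'."""
    return f"cassandra.{region}.amazonaws.com"


def connect(region: str, username: str, password: str, ca_cert_path: str):
    """Open a session using the standard Cassandra Python driver.

    username/password are assumed to be service-specific credentials;
    ca_cert_path is a hypothetical path to the CA certificate you trust.
    """
    # Imported lazily so keyspaces_endpoint() works without the driver installed.
    from cassandra.auth import PlainTextAuthProvider
    from cassandra.cluster import Cluster

    ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ssl_context.load_verify_locations(ca_cert_path)
    # Hostname checking is switched off only to keep this sketch short;
    # the certificate chain is still verified.
    ssl_context.check_hostname = False
    ssl_context.verify_mode = ssl.CERT_REQUIRED

    cluster = Cluster(
        [keyspaces_endpoint(region)],
        port=KEYSPACES_PORT,
        ssl_context=ssl_context,
        auth_provider=PlainTextAuthProvider(username=username, password=password),
    )
    return cluster.connect()
```

Apart from the endpoint, port, and TLS settings, this is the same code an application would use against a self-managed cluster, which is what keeps migration changes small.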
Fully Managed
Amazon Keyspaces handles database provisioning, scaling, patching, and maintenance. This reduces operational overhead and allows developers to focus on building applications instead of managing the database.
Elastic Scalability
The system can scale in response to demand without any manual configuration. This is particularly useful for applications that experience fluctuating traffic patterns or need to handle spikes in usage.
High Availability
Amazon Keyspaces offers high availability with automatic replication across multiple AWS Availability Zones (AZs). This ensures that your application remains operational even in the case of an AZ failure.
Integrated Security
Keyspaces integrates with AWS Identity and Access Management (IAM) to control access to tables and data, and it supports encryption at rest and in transit, ensuring that your data is secure.
Amazon Keyspaces Architecture
Amazon Keyspaces uses a serverless architecture that abstracts the complexities of managing and scaling a distributed database system. The architecture is designed to be fault-tolerant, distributed, and highly available.
Data Replication
Amazon Keyspaces replicates data across multiple AZs within an AWS Region, ensuring fault tolerance and data durability. In case of a failure in one AZ, data can still be accessed from another.
Distributed and Scalable Storage
Keyspaces uses a distributed storage system to store data across multiple nodes in the cloud. This allows it to scale to store petabytes of data while providing low-latency access to applications.
Consistency Model
Keyspaces supports a subset of Cassandra's consistency levels. Writes always use LOCAL_QUORUM; reads can use:
- LOCAL_QUORUM: A majority of the replicas in the local Region must agree on the result of an operation before it is returned.
- LOCAL_ONE (or ONE): Provides lower-latency reads by contacting only a single replica in the local Region, at the cost of possibly returning slightly stale data.
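To make "majority of replicas" concrete, here is a small sketch of the arithmetic. The default of three replicas is an assumption (one copy per Availability Zone is a common description of the service, but it is not stated in this article), and the function names are illustrative only.

```python
def quorum_size(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return replicas // 2 + 1


def replicas_contacted(consistency: str, replicas: int = 3) -> int:
    """How many replicas must respond for the given consistency level.

    replicas=3 is an assumed replication factor, not a documented constant.
    """
    if consistency == "LOCAL_ONE":
        return 1
    if consistency == "LOCAL_QUORUM":
        return quorum_size(replicas)
    raise ValueError(f"unsupported consistency level: {consistency}")


for level in ("LOCAL_ONE", "LOCAL_QUORUM"):
    print(level, replicas_contacted(level))
```

With three replicas, a quorum is two, so a LOCAL_QUORUM operation can still succeed when one replica is unavailable.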
Stateless Compute
Amazon Keyspaces uses stateless compute to execute queries, meaning the service does not retain any state between requests. This allows it to scale up and down automatically in response to traffic without requiring complex configuration.
Setting Up Amazon Keyspaces
Setting up Amazon Keyspaces is straightforward, and you can use the AWS Management Console, AWS CLI, or AWS SDKs to create and manage your Keyspaces resources.
Step-by-Step Setup via the AWS Console
1. Sign in to the AWS Management Console:
   - Go to the Amazon Keyspaces service page.
2. Create a Keyspace:
   - In the Keyspaces dashboard, click on Create Keyspace.
   - Enter a name for your Keyspace (a keyspace groups related tables, much like a database in a relational system).
3. Create a Table:
   - After creating a Keyspace, you can create a table within it.
   - Define the table's primary key (a partition key and optional clustering columns).
   - Configure TTL (Time to Live) if necessary.
4. Configure Capacity Mode:
   - Choose between On Demand or Provisioned capacity mode (discussed in more detail below).
   - Set up read/write capacity units if using Provisioned mode.
5. Review and Create:
   - Review your settings and click Create Table.
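The console steps above have CQL equivalents. The sketch below only builds the statements as strings so it can be read without a live connection. The keyspace, table, and column names are hypothetical, and `SingleRegionStrategy` and `default_time_to_live` are used on the assumption that the service accepts them as in its published CQL reference.

```python
from typing import Optional


def create_keyspace_cql(keyspace: str) -> str:
    """CQL for step 2. 'SingleRegionStrategy' is assumed to be the
    replication class used for single-Region keyspaces."""
    return (
        f"CREATE KEYSPACE {keyspace} "
        "WITH REPLICATION = {'class': 'SingleRegionStrategy'}"
    )


def create_table_cql(keyspace: str, table: str,
                     ttl_seconds: Optional[int] = None) -> str:
    """CQL for step 3: a partition key plus one clustering column.

    The columns (customer_id, order_time, total) are made up for illustration.
    """
    cql = (
        f"CREATE TABLE {keyspace}.{table} ("
        "customer_id text, order_time timestamp, total decimal, "
        "PRIMARY KEY ((customer_id), order_time))"
    )
    if ttl_seconds is not None:
        # Optional default TTL applied to rows in this table.
        cql += f" WITH default_time_to_live = {ttl_seconds}"
    return cql


for statement in (create_keyspace_cql("shop"),
                  create_table_cql("shop", "orders", ttl_seconds=86400)):
    print(statement)
```

Each string could be passed to a driver session's `execute` method once a connection is available.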
Key Differences Between Amazon Keyspaces and Apache Cassandra
Amazon Keyspaces is designed to be compatible with Apache Cassandra but has several differences in how it operates:
| Feature | Apache Cassandra | Amazon Keyspaces |
| --- | --- | --- |
| Management | Self-managed, requires node provisioning, scaling, and maintenance | Fully managed by AWS, serverless |
| Consistency Levels | Various levels (e.g., ONE, QUORUM) | Supports LOCAL_QUORUM and LOCAL_ONE |
| Scaling | Requires manual node scaling | Automatically scales based on demand |
| High Availability | Multi-region replication requires manual setup | Built-in multi-AZ high availability |
| Backups | Requires manual snapshots | Automated backups and point-in-time recovery |
| Pricing | Costs associated with infrastructure and operations | Pay-as-you-go, no infrastructure costs |
Data Models in Amazon Keyspaces
Amazon Keyspaces follows the same wide-column data model as Apache Cassandra, where tables are collections of rows, and each row consists of multiple columns. Each table has a defined schema, but a row does not have to supply a value for every non-key column, which leaves some flexibility in how individual rows are populated.
Primary Key Structure
Every table in Amazon Keyspaces has a primary key that consists of a partition key and optional clustering columns (sometimes described as a sort key):
- Partition Key: Determines how data is distributed across storage partitions.
- Clustering Columns: Optional. They order rows within a partition and allow you to store multiple rows with the same partition key.
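The toy model below illustrates the two roles of the primary key: the partition key decides which group a row lands in, and the second key component (a clustering column, in Cassandra terms) orders rows inside that group. It is an in-memory simplification with made-up rows, not how the service stores data internally.

```python
from collections import defaultdict


def group_by_partition(rows, partition_key, clustering_key):
    """Group rows by partition key, then sort each group by the clustering key."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_key]].append(row)
    for partition_rows in partitions.values():
        partition_rows.sort(key=lambda r: r[clustering_key])
    return dict(partitions)


# Hypothetical order rows: customer_id is the partition key,
# order_time is the clustering column.
rows = [
    {"customer_id": "c1", "order_time": 3, "total": 10},
    {"customer_id": "c2", "order_time": 1, "total": 7},
    {"customer_id": "c1", "order_time": 1, "total": 5},
]

partitions = group_by_partition(rows, "customer_id", "order_time")
for key, partition_rows in partitions.items():
    print(key, [r["order_time"] for r in partition_rows])
```

Queries that name the partition key only need to look at one group, which is why choosing a partition key that spreads traffic evenly matters.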
Collections
Amazon Keyspaces supports Cassandra collections such as sets, maps, and lists. These data types allow developers to store multiple values in a single column.
Static Columns
Static columns are columns whose values are shared by all rows with the same partition key. These are supported by Amazon Keyspaces, providing a means to store information that is constant across a set of rows.
Scaling and Performance
Amazon Keyspaces is designed to scale automatically to meet your application's demands, without requiring manual intervention. Performance is optimized for high-throughput, low-latency workloads.
Capacity Modes
Amazon Keyspaces supports two capacity modes:
- On Demand Mode: Automatically scales to accommodate traffic without requiring you to provision capacity upfront.
- Provisioned Mode: Allows you to specify read and write capacity units (RCUs and WCUs) and scale based on the defined capacity.
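For Provisioned mode, a back-of-the-envelope estimate can help pick starting values. The sketch below assumes the commonly published unit definitions, which are not stated in this article and should be checked against current AWS documentation: one RCU covers one LOCAL_QUORUM read per second (or two LOCAL_ONE reads) of a row up to 4 KB, and one WCU covers one write per second of a row up to 1 KB.

```python
import math

# Assumed unit sizes; verify against current AWS documentation.
READ_UNIT_BYTES = 4 * 1024
WRITE_UNIT_BYTES = 1024


def read_units_per_second(reads_per_second: int, row_bytes: int,
                          consistency: str = "LOCAL_QUORUM") -> int:
    """Estimate RCUs needed for a steady read rate."""
    units_per_read = max(1, math.ceil(row_bytes / READ_UNIT_BYTES))
    total = reads_per_second * units_per_read
    if consistency == "LOCAL_ONE":
        # Assumption: LOCAL_ONE reads consume half the capacity.
        return math.ceil(total / 2)
    return total


def write_units_per_second(writes_per_second: int, row_bytes: int) -> int:
    """Estimate WCUs needed for a steady write rate."""
    units_per_write = max(1, math.ceil(row_bytes / WRITE_UNIT_BYTES))
    return writes_per_second * units_per_write


print(read_units_per_second(100, 6000))               # LOCAL_QUORUM reads
print(read_units_per_second(100, 6000, "LOCAL_ONE"))  # LOCAL_ONE reads
print(write_units_per_second(50, 2500))
```

The estimate ignores per-row storage overhead and traffic spikes, so it is a lower bound rather than a sizing recommendation.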
Latency
Amazon Keyspaces is optimized for low-latency reads and writes, with most queries being completed in milliseconds. It is built to handle thousands of requests per second, even during peak traffic periods.
Security Features
Amazon Keyspaces provides robust security features to protect your data:
Encryption
- Encryption at Rest: All data stored in Amazon Keyspaces is encrypted at rest using AWS Key Management Service (KMS).
- Encryption in Transit: Data is encrypted with TLS while it travels between the client and Amazon Keyspaces.
IAM Integration
Amazon Keyspaces integrates with AWS Identity and Access Management (IAM) to control access to your keyspaces and tables. You can define fine-grained access policies that allow or deny access to specific resources based on user roles.
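As an example of a fine-grained policy, the sketch below builds a read-only IAM policy document for a single table. The `cassandra:Select` action and the ARN layout follow the service's published IAM conventions as best understood here; the Region, account ID, keyspace, and table names are placeholders. Access to the system keyspaces is included on the assumption that drivers need to read them when connecting.

```python
import json


def read_only_table_policy(region: str, account_id: str,
                           keyspace: str, table: str) -> dict:
    """Build an IAM policy document allowing reads on one table."""
    base = f"arn:aws:cassandra:{region}:{account_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["cassandra:Select"],
                "Resource": [
                    f"{base}:/keyspace/{keyspace}/table/{table}",
                    # Assumption: drivers read system tables during connection setup.
                    f"{base}:/keyspace/system*",
                ],
            }
        ],
    }


# Placeholder values for illustration only.
policy = read_only_table_policy("us-east-1", "111122223333", "shop", "orders")
print(json.dumps(policy, indent=2))
```

The resulting JSON can be attached to a user or role; granting write access would need additional actions, which are left out here.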