Databases · SAA-C03 · SAP-C02 · DEA-C01 · DOP-C02

Amazon DocumentDB: The MongoDB-Compatible Document Database Powerhouse

Fully managed, scalable, highly available JSON document database with MongoDB compatibility — without the operational overhead.

Updated 2026-02-21

Overview

Amazon DocumentDB (with MongoDB compatibility) is a fully managed NoSQL document database service designed to store, query, and index JSON data at scale. It implements the Apache 2.0 open-source MongoDB 3.6, 4.0, and 5.0 APIs, allowing existing MongoDB workloads to migrate with minimal code changes. DocumentDB decouples compute from storage using a distributed, fault-tolerant, self-healing storage system that automatically replicates data six ways across three Availability Zones.

Run MongoDB-compatible document workloads at enterprise scale without managing database infrastructure — ideal for content management, catalogs, user profiles, and real-time applications requiring flexible JSON schemas.

Use When

Migrating existing MongoDB workloads to AWS with minimal application code changes (supports MongoDB 3.6, 4.0, 5.0 APIs)
Applications requiring flexible, schema-less JSON document storage with rich query capabilities including nested documents and arrays
High-availability workloads needing automatic multi-AZ replication with up to 15 read replicas and replica lag typically under 100 ms
Regulated industries needing encryption at rest (AES-256), encryption in transit (TLS), VPC isolation, and AWS CloudTrail audit logging
Workloads needing elastic storage that grows automatically in 10 GB increments, up to the 128 TiB cluster maximum, without manual intervention

Avoid When

True graph relationship traversal workloads — use Amazon Neptune instead; DocumentDB has no native graph engine and cannot efficiently traverse deeply connected relationship networks
Workloads requiring full MongoDB feature parity (e.g., $lookup across shards, change streams to non-AWS targets, MongoDB Atlas-specific features) — DocumentDB is API-compatible, not a MongoDB fork, and subtle behavioral differences exist
Simple key-value access patterns at extreme scale with single-digit millisecond latency requirements — use Amazon DynamoDB instead for pure key-value or wide-column access patterns
Relational data with complex multi-table JOINs and ACID transactions across many tables — use Amazon Aurora or RDS for relational workloads
Workloads requiring full-text search as a primary access pattern — use Amazon OpenSearch Service instead; DocumentDB text indexes have limitations

Key Features

MongoDB API Compatibility (3.6, 4.0, 5.0)

Wire protocol compatible, not a MongoDB fork. Not all operators supported.

Automatic Multi-AZ Storage Replication (6 copies, 3 AZs)

Built into the storage layer — no configuration required.

Up to 15 Read Replicas

Low-latency reads; replicas promote to primary on failover.

Automated Backups (1–35 days retention)

Continuous backups to S3; point-in-time restore supported.

Manual Snapshots

Retained indefinitely until explicitly deleted; can be shared cross-account.

Encryption at Rest (AES-256 / AWS KMS)

Must be enabled at cluster creation; supports CMK (Customer Managed Keys).

Encryption in Transit (TLS)

Enabled by default on new clusters; controlled by the tls cluster parameter, so keep it enabled for regulated workloads.

VPC-Only Deployment

No public internet endpoint; security best practice.

IAM Authentication

Supports IAM database authentication for passwordless access.

AWS Secrets Manager Integration

Rotate credentials automatically without application downtime.

Change Streams

Off by default; enable per collection, database, or cluster with the modifyChangeStreams admin command. Enables event-driven architectures.

Global Clusters (Cross-Region)

Up to 5 secondary read-only regions; fast regional failover.

Elastic Clusters (Horizontal Sharding)

Horizontally sharded clusters for workloads that outgrow a single writer; scale by changing shard count or shard capacity. Separate from standard instance-based clusters.

Performance Insights

Visualize database load and identify bottlenecks.

CloudWatch Metrics and Alarms

Integrated monitoring; key metrics: DatabaseConnections, ReadLatency, WriteLatency, CPUUtilization.

CloudTrail Audit Logging

API-level audit trail for compliance and security.

Profiler Logging

Logs slow queries to CloudWatch Logs; configurable threshold.

Auto Scaling for Read Replicas

Automatically adds/removes read replicas based on CPU or connection metrics.

Multi-master (simultaneous writes to multiple nodes): not supported

DocumentDB has a single primary writer. Use Elastic Clusters for horizontal write scaling.

Native MongoDB Transactions (ACID)

Supported in MongoDB 4.0-compatible mode and above.

Aggregation Pipeline

Most aggregation stages supported; some advanced stages may differ from MongoDB.

Text Indexes

Limited compared to dedicated search services; use OpenSearch for advanced full-text search.

Geospatial Indexes

2dsphere indexes supported for location-based queries.

TTL Indexes

Automatically expire documents after a specified time.

Cross-Account Snapshot Sharing

Manual snapshots can be shared with other AWS accounts for migration or DR.

Integration Patterns

Event-Driven Document Processing via Change Streams

high freq

Amazon DocumentDB + AWS Lambda

Enable Change Streams on a DocumentDB collection, then consume the stream with a Lambda function (through a Lambda event source mapping for DocumentDB) or with a polling application on EC2/ECS, and trigger downstream processing such as cache invalidation, notifications, or replication to another service. Note: the Lambda event source mapping requires change streams to be enabled, cluster credentials stored in Secrets Manager, and network access to the cluster's VPC.
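Whichever consumer reads the stream, the processing step usually branches on the change event's operationType field. The sketch below assumes the standard MongoDB change event shape (operationType, documentKey, ns); the action names it returns are hypothetical placeholders, not part of any AWS API.

```python
from typing import Any, Dict


def route_change_event(event: Dict[str, Any]) -> str:
    """Map one change event to a downstream action name (hypothetical names)."""
    op = event.get("operationType")
    doc_id = event.get("documentKey", {}).get("_id")
    if op == "insert":
        return f"notify:{doc_id}"
    if op in ("update", "replace"):
        return f"invalidate-cache:{doc_id}"
    if op == "delete":
        return f"purge:{doc_id}"
    # Anything else (for example a collection drop) is skipped in this sketch.
    return "ignore"


sample = {
    "operationType": "update",
    "documentKey": {"_id": "user-42"},
    "ns": {"db": "app", "coll": "profiles"},
}
print(route_change_event(sample))  # prints "invalidate-cache:user-42"
```

Keeping the routing logic a pure function makes it easy to unit test separately from the code that reads the stream.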

Automated Credential Rotation

high freq

Amazon DocumentDB + AWS Secrets Manager

Store DocumentDB master user credentials in Secrets Manager and configure automatic rotation using the built-in Lambda rotation function. Applications retrieve credentials at runtime — no hardcoded passwords. This is the AWS-recommended security pattern for all managed databases.
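As a minimal sketch of the runtime lookup, the function below turns a secret's JSON string into a connection URI. It assumes the secret contains username, password, host, and port keys, and that the string was already fetched (for example with the Secrets Manager GetSecretValue API, not shown). The host name and credentials in the example are placeholders.

```python
import json
from urllib.parse import quote_plus


def build_docdb_uri(secret_string: str) -> str:
    """Build a MongoDB-style URI from a secret's JSON payload."""
    secret = json.loads(secret_string)
    # Percent-encode credentials so characters like '@' or '/' stay valid in a URI.
    user = quote_plus(secret["username"])
    password = quote_plus(secret["password"])
    host = secret["host"]
    port = secret.get("port", 27017)
    options = "tls=true&replicaSet=rs0&readPreference=secondaryPreferred&retryWrites=false"
    return f"mongodb://{user}:{password}@{host}:{port}/?{options}"


# Placeholder secret payload for illustration only.
example_secret = json.dumps({
    "username": "app_user",
    "password": "p@ss/word",
    "host": "docdb.example.internal",
    "port": 27017,
})
print(build_docdb_uri(example_secret))
```

Because the URI is rebuilt from the secret on each lookup, a rotated password is picked up the next time the application fetches the secret and reconnects.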

Operational Monitoring and Alerting

high freq

Amazon DocumentDB + Amazon CloudWatch

Use CloudWatch metrics (CPUUtilization, DatabaseConnections, ReadLatency, WriteLatency, FreeableMemory) with CloudWatch Alarms to detect performance degradation. Enable DocumentDB Profiler to log slow queries to CloudWatch Logs for query optimization.
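One way to express such an alarm is as a parameter dictionary for CloudWatch's PutMetricAlarm operation. The sketch below only builds that dictionary; the alarm name, threshold, and evaluation window are illustrative assumptions, and the call that submits it (for example through an AWS SDK client) is left out.

```python
from typing import Any, Dict


def cpu_alarm_params(cluster_id: str, threshold_pct: float = 80.0) -> Dict[str, Any]:
    """Return PutMetricAlarm parameters for sustained high CPU on a cluster."""
    return {
        "AlarmName": f"{cluster_id}-high-cpu",  # hypothetical naming scheme
        "Namespace": "AWS/DocDB",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "DBClusterIdentifier", "Value": cluster_id}],
        "Statistic": "Average",
        "Period": 300,            # seconds per datapoint
        "EvaluationPeriods": 3,   # alarm after three consecutive breaches
        "Threshold": threshold_pct,
        "ComparisonOperator": "GreaterThanThreshold",
    }


params = cpu_alarm_params("orders-cluster")
print(params["AlarmName"])  # prints "orders-cluster-high-cpu"
```

The same shape works for DatabaseConnections or FreeableMemory by swapping MetricName and the comparison direction.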

Polyglot Persistence — Document + Graph

high freq

Amazon DocumentDB + Amazon Neptune

Use DocumentDB for flexible JSON document storage (e.g., user profiles, product catalogs) and Neptune for graph relationship traversal (e.g., social connections, fraud detection). A common exam pattern tests whether candidates can distinguish document vs. graph use cases.

Polyglot Persistence — Document vs. Key-Value

high freq

Amazon DocumentDB + Amazon DynamoDB

DocumentDB is preferred when rich query capabilities (aggregation pipelines, nested document queries) are needed. DynamoDB is preferred for pure key-value access at massive scale with single-digit millisecond latency. Exam questions test this distinction heavily.

Real-Time Change Data Capture (CDC) Pipeline

medium freq

Amazon DocumentDB + Amazon Kinesis Data Streams

Use a DocumentDB Change Stream consumer application (running on EC2 or ECS) to publish change events to Kinesis Data Streams, enabling real-time analytics, downstream microservice fan-out, and audit trails at scale.
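The publishing step of such a consumer can be sketched as a function that converts one change event into the parameters of a Kinesis PutRecord call. The stream name is a placeholder and the event shape is assumed to follow the standard change event layout; submitting the record is not shown.

```python
import json
from typing import Any, Dict


def to_kinesis_record(stream_name: str, event: Dict[str, Any]) -> Dict[str, Any]:
    """Build PutRecord parameters for one change event."""
    # Partition by document id so changes to the same document land on one shard.
    doc_id = str(event["documentKey"]["_id"])
    # default=str covers values such as ObjectId or datetime that json cannot encode.
    payload = json.dumps(event, default=str, sort_keys=True).encode("utf-8")
    return {"StreamName": stream_name, "Data": payload, "PartitionKey": doc_id}


record = to_kinesis_record(
    "docdb-changes",  # hypothetical stream name
    {"operationType": "insert", "documentKey": {"_id": 101}, "fullDocument": {"sku": "A1"}},
)
print(record["PartitionKey"])  # prints "101"
```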

Hybrid Relational + Document Architecture

medium freq

Amazon DocumentDB + Amazon RDS

Use RDS (Aurora/PostgreSQL/MySQL) for transactional relational data and DocumentDB for flexible, schema-less document storage in the same application. Common in e-commerce: RDS for orders/inventory, DocumentDB for product catalog with variable attributes.

MongoDB to DocumentDB Migration

medium freq

Amazon DocumentDB + AWS Database Migration Service (DMS)

Use AWS DMS with a MongoDB source endpoint and a DocumentDB target endpoint for online migration with minimal downtime. DMS supports full-load and CDC (ongoing replication) modes. Before migrating, validate compatibility with the Amazon DocumentDB compatibility tool.

Caching Layer for Read-Heavy Workloads

medium freq

Amazon DocumentDB + Amazon ElastiCache

Place ElastiCache (Redis or Memcached) in front of DocumentDB to cache frequently accessed documents, reducing read latency and I/O costs. Cache-aside pattern: check cache first, read from DocumentDB on miss, populate cache.
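The cache-aside flow can be sketched in a few lines. Here an in-memory dictionary stands in for ElastiCache and a plain function stands in for the DocumentDB read; both are assumptions made so the example runs on its own.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CacheAside:
    """Check the cache first, fall back to the database, then populate the cache."""

    def __init__(self, load_from_db: Callable[[str], Optional[dict]], ttl_seconds: float = 60.0):
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._cache: Dict[str, Tuple[float, dict]] = {}  # key -> (expiry, document)

    def get(self, key: str) -> Optional[dict]:
        now = time.monotonic()
        entry = self._cache.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]            # cache hit
        doc = self._load(key)          # cache miss: read from the database
        if doc is not None:
            self._cache[key] = (now + self._ttl, doc)
        return doc


db_reads = []


def fake_db_read(key: str) -> Optional[dict]:
    """Stand-in for a DocumentDB lookup; records each call."""
    db_reads.append(key)
    return {"_id": key, "name": "Widget"}


cache = CacheAside(fake_db_read)
cache.get("sku-1")
cache.get("sku-1")
print(len(db_reads))  # prints 1: the second read was served from the cache
```

The TTL bounds how stale a cached document can get; writes that must be visible immediately would also need to delete or update the cached entry.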

Full-Text Search Augmentation

medium freq

Amazon DocumentDB + Amazon OpenSearch Service

DocumentDB handles document storage and transactional queries; OpenSearch Service handles full-text search, faceted search, and analytics. Sync data from DocumentDB to OpenSearch using Change Streams + Lambda or DMS.

Service Limits & Quotas

| Limit | Value | Note |
| --- | --- | --- |
| Maximum instances per cluster | 16 instances (1 primary + up to 15 read replicas) | Do not confuse this with RDS read replica limits, which vary by engine. Aurora and DocumentDB share the 15 read replica limit. |
| Maximum cluster storage | Up to 128 TiB per cluster | Storage auto-scaling is a key differentiator vs. self-managed MongoDB, where you must manage disk capacity manually. |
| Maximum databases per cluster | No documented hard limit; practical limits apply based on instance size | |
| Maximum document size | 16 MB per document | This is inherited from MongoDB's BSON specification and is not an AWS-imposed limit. Knowing the reason helps remember it. |
| Maximum indexes per collection | 64 indexes per collection | |
| Maximum index key size | 2,048 bytes | |
| Automated backup retention period | 1 to 35 days | The default is 1 day; always set retention explicitly in production environments. |
| Storage replication | 6 copies across 3 AZs | This mirrors Aurora's shared storage architecture, so the same reasoning answers questions about both services. |
| Read replica lag | Typically under 100 ms | |
| Cluster endpoint types | 3 types: Cluster (writer), Reader, Instance-specific | A very common exam trap: applications hardcoded to an instance endpoint fail during failover. The cluster endpoint automatically routes to the current primary. |
| Failover time | Typically under 30 seconds (with a replica) | Failover without a replica is significantly slower. Exam scenarios about RTO requirements should lead you to always provision at least one replica. |
| Supported MongoDB API versions | 3.6, 4.0, 5.0 | MongoDB 6.0+ features are NOT supported. Exam questions about 'latest MongoDB features' or 'MongoDB Atlas' are distractors: DocumentDB is not equivalent to Atlas. |
| Global Clusters (cross-region replication) | Supported; up to 5 secondary regions | Global Clusters are a separate feature from standard multi-AZ replication. Multi-AZ is within one region; Global Clusters span regions. |
| Change Streams | Supported | Change Streams are not enabled by default. Enable them explicitly with the modifyChangeStreams admin command; the change_stream_log_retention_duration cluster parameter sets how long events are kept. |
| Encryption at rest | AES-256, enabled at cluster creation; cannot be enabled post-creation | This is identical behavior to RDS and Aurora. The pattern unencrypted → snapshot → encrypted copy → restore is tested heavily across all database services. |
| VPC requirement | DocumentDB clusters must be deployed inside a VPC | Unlike some AWS services, DocumentDB has NO public endpoint option; it is always VPC-only. This is a security feature, not a limitation. |
| Instance types supported | Memory-optimized (r-series) and burstable (t3, t4g) instances | Exam scenarios describing intermittent performance degradation on DocumentDB often involve t3 instances hitting CPU credit limits under sustained load. |

Pricing Model

Pay-per-use: billed separately for compute (instance hours), storage (GB-month), I/O (per million requests), backup storage, and data transfer.

Compute: Billed per instance-hour for each instance in the cluster (primary + replicas). Prices vary by instance type and region.
Storage: Billed per GB-month for the actual storage consumed — storage auto-grows in 10 GB increments and you pay only for what is used.
I/O: Billed per 1 million I/O requests. High-read/write workloads should monitor I/O costs — this can dominate the bill for I/O-intensive applications. Consider DocumentDB Elastic Clusters which bundle I/O costs differently.
Backup Storage: Automated backup storage equal to cluster storage is free. Additional backup storage beyond cluster size is charged per GB-month.
Data Transfer: Standard AWS data transfer pricing applies. Data transfer within the same AZ is free; cross-AZ and cross-region transfers incur charges.
Global Clusters: Secondary region instances are billed at standard instance rates; cross-region replication data transfer costs apply.
Elastic Clusters: Billed by provisioned shard capacity (vCPU-hours) rather than by instance, a pricing model distinct from standard cluster pricing.
No upfront costs for On-Demand; Reserved Instances (1-year or 3-year terms) provide significant discounts (up to ~60%) for predictable workloads — critical for cost optimization exam questions.
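The billing dimensions above combine into a simple sum. The sketch below shows the arithmetic only: every rate in it is a made-up placeholder, not an AWS price, and data transfer is left out.

```python
def estimate_monthly_cost(
    *,
    instances: int,
    hours: float,
    instance_rate: float,   # per instance-hour (placeholder)
    storage_gb: float,
    storage_rate: float,    # per GB-month (placeholder)
    io_millions: float,
    io_rate: float,         # per million I/O requests (placeholder)
    backup_gb: float,
    backup_rate: float,     # per GB-month beyond the free allowance (placeholder)
) -> float:
    """Sum compute, storage, I/O, and billable backup for one month."""
    compute = instances * hours * instance_rate
    storage = storage_gb * storage_rate
    io = io_millions * io_rate
    # Backup storage up to the size of cluster storage is free.
    billable_backup = max(0.0, backup_gb - storage_gb)
    backup = billable_backup * backup_rate
    return round(compute + storage + io + backup, 2)


total = estimate_monthly_cost(
    instances=3, hours=730, instance_rate=0.30,
    storage_gb=500, storage_rate=0.10,
    io_millions=200, io_rate=0.20,
    backup_gb=650, backup_rate=0.02,
)
print(total)  # prints 750.0 (657 compute + 50 storage + 40 I/O + 3 backup)
```

With these placeholder numbers compute dominates, but raising io_millions by an order of magnitude shows how I/O can overtake it for request-heavy workloads.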

Exam Tips

critical · High Availability, Failover, Connection Management

DocumentDB uses a CLUSTER ENDPOINT (not instance endpoint) for writes. Always route write traffic to the cluster endpoint — it automatically points to the current primary after failover. Applications hardcoded to an instance endpoint will BREAK during failover.
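A small routing helper makes the rule concrete: writes always go to the cluster endpoint, reads may go to the reader endpoint. The host names below are placeholders that only imitate the usual endpoint naming pattern.

```python
from typing import Dict


def pick_endpoint(operation: str, endpoints: Dict[str, str]) -> str:
    """Choose an endpoint by operation type; never return an instance endpoint."""
    if operation == "write":
        return endpoints["cluster"]
    if operation == "read":
        # Fall back to the cluster endpoint if no reader endpoint is configured.
        return endpoints.get("reader", endpoints["cluster"])
    raise ValueError(f"unknown operation: {operation!r}")


# Placeholder host names for illustration only.
endpoints = {
    "cluster": "mycluster.cluster-abc123.us-east-1.docdb.amazonaws.com",
    "reader": "mycluster.cluster-ro-abc123.us-east-1.docdb.amazonaws.com",
}
print(pick_endpoint("write", endpoints))
```

Because the helper never hands out an instance endpoint, a failover that promotes a replica needs no configuration change in the application.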

critical · Security, Encryption, KMS

Encryption at rest CANNOT be enabled on an existing unencrypted DocumentDB cluster. The only path to encrypt an unencrypted cluster is: (1) Create a snapshot, (2) Copy the snapshot with encryption enabled using a KMS CMK, (3) Restore from the encrypted snapshot. This pattern is tested across DocumentDB, RDS, and Aurora.
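The three steps can be sketched as a sequence of calls on a DocumentDB API client. The operation and parameter names follow the DocumentDB CreateDBClusterSnapshot, CopyDBClusterSnapshot, and RestoreDBClusterFromSnapshot APIs; the identifiers are hypothetical, and a recording stand-in replaces the real client so the example runs without AWS access. Real code would also need to wait for each snapshot to become available before the next step.

```python
from typing import Any, Dict, List, Tuple


def encrypt_via_snapshot(client: Any, source_cluster: str, new_cluster: str, kms_key_id: str) -> List[str]:
    """Snapshot, copy with encryption, restore as a new cluster (waits omitted)."""
    plain = f"{source_cluster}-plain"
    encrypted = f"{source_cluster}-encrypted"
    # (1) Snapshot the unencrypted cluster.
    client.create_db_cluster_snapshot(
        DBClusterIdentifier=source_cluster, DBClusterSnapshotIdentifier=plain)
    # (2) Copy the snapshot, supplying a KMS key so the copy is encrypted.
    client.copy_db_cluster_snapshot(
        SourceDBClusterSnapshotIdentifier=plain,
        TargetDBClusterSnapshotIdentifier=encrypted,
        KmsKeyId=kms_key_id)
    # (3) Restore a new cluster from the encrypted copy.
    client.restore_db_cluster_from_snapshot(
        DBClusterIdentifier=new_cluster, SnapshotIdentifier=encrypted, Engine="docdb")
    return [plain, encrypted, new_cluster]


class RecordingClient:
    """Stand-in for an SDK client: records each call instead of contacting AWS."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Dict[str, Any]]] = []

    def __getattr__(self, name: str):
        def call(**kwargs: Any) -> Dict[str, Any]:
            self.calls.append((name, kwargs))
            return {}
        return call


client = RecordingClient()
encrypt_via_snapshot(client, "orders", "orders-encrypted-cluster", "alias/docdb-key")
print([name for name, _ in client.calls])
```

The restored cluster has new endpoints, so connection strings must be updated and instances added before the old cluster is decommissioned.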

critical · MongoDB Compatibility, Migration

DocumentDB is MongoDB-COMPATIBLE, not MongoDB-identical. It implements the MongoDB wire protocol but is NOT a MongoDB fork. Some MongoDB operators, commands, and Atlas-specific features are NOT supported. Exam scenarios mentioning 'full MongoDB compatibility' or 'MongoDB Atlas features' are testing whether you know this distinction.

critical · High Availability, Storage Architecture, Durability

DocumentDB storage is automatically replicated SIX WAYS across THREE Availability Zones — this happens at the storage layer automatically, regardless of how many instances you configure. Even a single-instance cluster has 6-way replicated storage. Instances are the compute layer only.

critical · Change Streams, Lambda, Event-Driven Architecture

Lambda CAN consume DocumentDB Change Streams through an event source mapping, much as it does for DynamoDB Streams, but only after change streams are enabled, with cluster credentials stored in Secrets Manager and network access to the cluster's VPC. A custom polling application on EC2/ECS that publishes to Kinesis Data Streams remains a valid choice when events must fan out to several consumers. Expect exam questions to test these prerequisites.

critical · Service Selection, NoSQL Comparison

When comparing DocumentDB vs. Neptune vs. DynamoDB: DocumentDB = flexible JSON documents with rich queries; Neptune = graph traversal (relationships); DynamoDB = key-value/wide-column at massive scale with predictable single-digit ms latency. Memorize this triangle — it appears in almost every database selection question.

critical · Encryption, KMS, CMK, Snapshots

Encrypting a DocumentDB SNAPSHOT with a CMK does NOT encrypt the live cluster. The snapshot encryption is independent. To have a CMK-encrypted cluster, encryption must be enabled at cluster creation time with the CMK specified — or restore from an encrypted snapshot.

critical

Encryption at rest is IMMUTABLE — set at cluster creation only. To encrypt an unencrypted cluster: snapshot → copy with encryption → restore as new cluster. This is tested constantly.

critical

Lambda supports DocumentDB Change Streams as an event source, but only once change streams are enabled (they are off by default) and cluster credentials are available in Secrets Manager. Use a polling app (EC2/ECS) plus Kinesis when you need fan-out to multiple consumers.

critical

DocumentDB is MongoDB-COMPATIBLE (wire protocol), NOT MongoDB-identical. MongoDB 6.0+ and Atlas features are unsupported. Always use the cluster endpoint (not instance endpoint) for writes — it survives failover automatically.

important · Change Streams, Event-Driven Architecture, Lambda Integration

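Change Streams are NOT enabled by default. Enable them for a collection, a database, or the whole cluster with the modifyChangeStreams admin command; the change_stream_log_retention_duration cluster parameter only controls how long change events are retained. Exam questions about event-driven architectures with DocumentDB always assume Change Streams are explicitly enabled.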
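Per the DocumentDB documentation, change streams are switched on with the modifyChangeStreams admin command. The sketch below only builds that command document. Sending it (for example with a MongoDB driver's admin command call) is not shown, and the database and collection names are placeholders.

```python
def modify_change_streams_cmd(database: str = "", collection: str = "", enable: bool = True) -> dict:
    """Build the admin command that enables or disables change streams.

    Empty strings for database and collection target every database
    and collection in the cluster.
    """
    # The command name must be the first key; dicts preserve insertion order.
    return {
        "modifyChangeStreams": 1,
        "database": database,
        "collection": collection,
        "enable": enable,
    }


cmd = modify_change_streams_cmd("app", "profiles")  # placeholder names
print(cmd)
```

Scoping the command to a single collection keeps change stream overhead limited to the data that consumers actually need.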

important · Disaster Recovery, Global Clusters, RPO/RTO

For CROSS-REGION disaster recovery with DocumentDB, use DocumentDB Global Clusters — NOT cross-region read replicas (which don't exist in standard DocumentDB). Global Clusters provide RPO of seconds and RTO under 1 minute for regional failover.

important · Instance Types, Performance, Production Readiness

DocumentDB t3/t4g instances are BURSTABLE and best suited to development/testing. Production workloads with sustained load should use r-series (r5, r6g) memory-optimized instances. Exam scenarios about unpredictable performance spikes on DocumentDB often involve t-series instances exhausting CPU credits.

important · Elastic Clusters, Serverless, Auto Scaling

DocumentDB Elastic Clusters offer HORIZONTAL SHARDING: reads, writes, and storage scale out across shards without managing individual instances. This is distinct from standard DocumentDB clusters. Exam questions about DocumentDB workloads that outgrow a single writer or the storage limit of a standard cluster should point to Elastic Clusters.

important · Cost Optimization, Reserved Instances, I/O Pricing

For COST OPTIMIZATION on DocumentDB: (1) Use Reserved Instances for predictable workloads (up to ~60% savings), (2) Monitor I/O costs — they can dominate bills for high-throughput workloads, (3) Use ElastiCache to reduce DocumentDB read I/O, (4) Right-size instances (avoid over-provisioned r-series for dev/test).

Common Misconceptions & Traps

Common Mistake

Encrypting a DocumentDB snapshot with a Customer Managed Key (CMK) means the live cluster is now encrypted with that CMK.

Correct

Snapshot encryption is completely independent of cluster encryption. Encrypting a snapshot (or copying a snapshot with encryption) does NOT retroactively encrypt the source cluster. To have an encrypted cluster, you must restore from the encrypted snapshot into a NEW cluster — the original unencrypted cluster remains unencrypted.

This is one of the most dangerous exam traps. Candidates assume that because the snapshot is encrypted, the data is protected. But the live cluster continues operating unencrypted. The correct remediation is always: snapshot → encrypt copy → restore as new cluster → update connection strings → decommission old cluster. This pattern is tested across DocumentDB, RDS, and Aurora.

Common Mistake

DocumentDB is a fork of MongoDB and supports all MongoDB features, including MongoDB Atlas capabilities and the latest MongoDB 6.x/7.x features.

Correct

DocumentDB implements the MongoDB 3.6, 4.0, and 5.0 wire protocols but is NOT a MongoDB fork — it is an entirely different database engine built by AWS that speaks the MongoDB protocol. MongoDB 6.0+ features are not supported. MongoDB Atlas-specific features (Atlas Search, Atlas Data Federation, Atlas Vector Search) are not available. Some MongoDB operators behave differently or are unsupported.

Migration questions frequently test this. If an exam scenario mentions 'the application uses MongoDB 6.0 aggregation features' or 'Atlas Search,' DocumentDB is NOT the right answer. Always check the compatibility matrix. The phrase 'MongoDB-compatible' ≠ 'MongoDB-identical.'

Common Mistake

AWS Lambda cannot consume DocumentDB Change Streams directly, so a custom polling application is always required.

Correct

Lambda supports DocumentDB Change Streams as an event source through an event source mapping, similar to its DynamoDB Streams integration. The prerequisites differ: change streams must be explicitly enabled, the cluster credentials must be stored in Secrets Manager, and Lambda needs network access to the cluster's VPC. A polling application (on EC2/ECS/Fargate) that publishes changes to Kinesis Data Streams is still an option when several consumers need the same events.

Exam questions about real-time event-driven processing with DocumentDB test whether you know these prerequisites. Two workable architectures: DocumentDB Change Streams → Lambda (event source mapping), or DocumentDB Change Streams → polling app (EC2/ECS) → Kinesis/SQS → Lambda for fan-out. Neither works until change streams are enabled.

Common Mistake

A single-instance DocumentDB cluster (no read replicas) has no high availability — data is stored in only one location.

Correct

Even a single-instance DocumentDB cluster automatically replicates data SIX WAYS across THREE Availability Zones at the storage layer. This is built into the DocumentDB storage architecture and requires no configuration. However, a single-instance cluster still has compute-level single-point-of-failure — failover requires creating a new instance (takes minutes). Always add at least one read replica for fast compute-layer failover (under 30 seconds).

Candidates conflate storage-level HA (automatic, always-on) with compute-level HA (requires replicas). Exam questions about DocumentDB HA often test this distinction. The correct answer for fast failover is: add a read replica. The storage is always highly available regardless.
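The compute-level check can be automated against a cluster description. The sketch below assumes the dictionary shape returned by the DescribeDBClusters API (a DBClusterMembers list whose entries carry IsClusterWriter); the sample data is invented.

```python
def has_failover_target(cluster: dict) -> bool:
    """True when the cluster has one writer and at least one replica to promote."""
    members = cluster.get("DBClusterMembers", [])
    writers = [m for m in members if m.get("IsClusterWriter")]
    replicas = [m for m in members if not m.get("IsClusterWriter")]
    return len(writers) == 1 and len(replicas) >= 1


# Invented sample descriptions for illustration.
single_instance = {"DBClusterMembers": [
    {"DBInstanceIdentifier": "orders-1", "IsClusterWriter": True},
]}
with_replica = {"DBClusterMembers": [
    {"DBInstanceIdentifier": "orders-1", "IsClusterWriter": True},
    {"DBInstanceIdentifier": "orders-2", "IsClusterWriter": False},
]}
print(has_failover_target(single_instance))  # prints False
print(has_failover_target(with_replica))     # prints True
```

A False result says nothing about durability, since storage is replicated either way; it only flags that failover would require creating a new instance.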

Common Mistake

You can enable encryption at rest on an existing DocumentDB cluster by modifying the cluster settings, similar to enabling other features post-creation.

Correct

Encryption at rest CANNOT be enabled on an existing unencrypted DocumentDB cluster through modification. It is an immutable setting set at cluster creation. The only way to encrypt an unencrypted cluster is: (1) Take a snapshot of the unencrypted cluster, (2) Copy the snapshot and enable encryption with a KMS key during the copy, (3) Restore a new cluster from the encrypted snapshot, (4) Update application connection strings to the new cluster.

This is tested consistently across all AWS database services (RDS, Aurora, DocumentDB). The 'snapshot → encrypt copy → restore' pattern is the universal answer for enabling encryption on an existing unencrypted database. Candidates who don't know this will select 'modify the cluster' which is incorrect.

Common Mistake

AWS Trusted Advisor can automatically categorize DocumentDB costs by business unit, and Service Control Policies (SCPs) can provide cost chargeback capabilities for DocumentDB resources.

Correct

AWS Trusted Advisor provides cost optimization recommendations (rightsizing, idle resources) but CANNOT categorize costs by business unit. SCPs are permission guardrails — they control WHAT actions are allowed, not cost attribution. For cost attribution by business unit: (1) Apply resource tags to DocumentDB clusters, (2) ACTIVATE those tags as cost allocation tags in the AWS Billing console, (3) Use AWS Cost Explorer to filter/group by those tags. Tags alone are insufficient — they MUST be activated in the Billing console.

This misconception appears directly in the exam question bank. The complete cost attribution workflow is: Tag resource → Activate tag in Billing Console → View in Cost Explorer. Skipping the activation step means tags never appear in billing reports. Trusted Advisor and SCPs are completely unrelated to cost attribution.

Common Mistake

DocumentDB Global Clusters work the same as cross-region read replicas — you can add a read replica in another region directly from the cluster console.

Correct

DocumentDB does NOT support cross-region read replicas in the traditional sense. Cross-region replication is achieved through DocumentDB Global Clusters, which is a distinct feature that creates a separate global cluster with a primary region and up to 5 read-only secondary regions. The secondary regions can be promoted to primary for DR. This is fundamentally different from simply adding a read replica.

Candidates familiar with RDS cross-region read replicas assume the same model applies to DocumentDB. Global Clusters is the correct mechanism and has different setup, pricing, and failover procedures. Exam questions about DocumentDB cross-region DR should always point to Global Clusters.

Memory Tricks

🧠

DocumentDB = 'DOCS in a VAULT across 3 ZONES': Documents stored, 6-way replicated, 3 AZs — always.

🧠

The 3 NoSQL Siblings: D-N-D — DocumentDB (JSON docs), Neptune (graph), DynamoDB (key-value). Pick the right sibling for the job.

🧠

Encryption Rule: 'You can't lock a house that's already built open' — encrypt at creation or snapshot-copy-restore.

🧠

Change Streams are OFF until you switch them ON: remember 'no switch, no stream'. Enable change streams before wiring up a Lambda event source mapping or a polling app.

🧠

Cluster Endpoint = 'The GPS that always finds the primary' — it reroutes automatically after failover. Instance endpoint = 'hardcoded address that breaks when you move.'

CertAI Tutor · SAA-C03, SAP-C02, DEA-C01, DOP-C02 · 2026-02-21
