
Fully managed, scalable, highly available JSON document database with MongoDB compatibility — without the operational overhead.
Amazon DocumentDB (with MongoDB compatibility) is a fully managed, serverless, NoSQL document database service designed to store, query, and index JSON data at scale. It implements the Apache 2.0 open-source MongoDB 3.6, 4.0, and 5.0 APIs, allowing existing MongoDB workloads to migrate with minimal code changes. DocumentDB decouples compute from storage using a distributed, fault-tolerant, self-healing storage system that automatically replicates data six ways across three Availability Zones.
Run MongoDB-compatible document workloads at enterprise scale without managing database infrastructure — ideal for content management, catalogs, user profiles, and real-time applications requiring flexible JSON schemas.
Use When
Avoid When
MongoDB API Compatibility (3.6, 4.0, 5.0)
Wire protocol compatible, not a MongoDB fork. Not all operators supported.
Automatic Multi-AZ Storage Replication (6 copies, 3 AZs)
Built into the storage layer — no configuration required.
Up to 15 Read Replicas
Low-latency reads; replicas promote to primary on failover.
Automated Backups (1–35 days retention)
Continuous backups to S3; point-in-time restore supported.
Manual Snapshots
Retained indefinitely until explicitly deleted; can be shared cross-account.
Encryption at Rest (AES-256 / AWS KMS)
Must be enabled at cluster creation; supports CMK (Customer Managed Keys).
Encryption in Transit (TLS)
Enabled by default on new clusters; controlled by the tls cluster parameter, so it can be turned off and should be left on.
VPC-Only Deployment
No public internet endpoint; security best practice.
IAM Authentication
Supports IAM database authentication for passwordless access.
AWS Secrets Manager Integration
Rotate credentials automatically without application downtime.
Change Streams
Disabled by default; enable per collection or database with the modifyChangeStreams admin command. Enables event-driven architectures.
Global Clusters (Cross-Region)
Up to 5 secondary read-only regions; fast regional failover.
Elastic Clusters (Serverless Sharding)
Horizontally sharded, serverless option for variable workloads — separate from standard clusters.
Performance Insights
Visualize database load and identify bottlenecks.
CloudWatch Metrics and Alarms
Integrated monitoring; key metrics: DatabaseConnections, ReadLatency, WriteLatency, CPUUtilization.
CloudTrail Audit Logging
API-level audit trail for compliance and security.
Profiler Logging
Logs slow queries to CloudWatch Logs; configurable threshold.
Auto Scaling for Read Replicas
Automatically adds/removes read replicas based on CPU or connection metrics.
Multi-master (simultaneous writes to multiple nodes)
Not supported. A standard DocumentDB cluster has a single primary writer. Use Elastic Clusters for horizontal write scaling.
Native MongoDB Transactions (ACID)
Supported in MongoDB 4.0-compatible mode and above.
Aggregation Pipeline
Most aggregation stages supported; some advanced stages may differ from MongoDB.
Text Indexes
Limited compared to dedicated search services; use OpenSearch for advanced full-text search.
Geospatial Indexes
2dsphere indexes supported for location-based queries.
TTL Indexes
Automatically expire documents after a specified time.
Cross-Account Snapshot Sharing
Manual snapshots can be shared with other AWS accounts for migration or DR.
Event-Driven Document Processing via Change Streams
High frequency: Enable Change Streams on a DocumentDB collection, then consume the stream with a Lambda function or a polling application on EC2/ECS to trigger downstream processing, e.g., invalidate a cache, send notifications, or replicate data to another service. Note: Lambda supports DocumentDB change streams through an event source mapping, but unlike DynamoDB Streams the mapping requires change streams to be enabled first, credentials stored in Secrets Manager, and network access to the cluster's VPC.
Automated Credential Rotation
High frequency: Store DocumentDB master user credentials in Secrets Manager and configure automatic rotation using the built-in Lambda rotation function. Applications retrieve credentials at runtime, so no passwords are hardcoded. This is the AWS-recommended security pattern for all managed databases.
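The runtime lookup described above can be sketched as follows. This is a minimal illustration, not the only way to do it: `load_docdb_credentials` is a hypothetical helper, and the `username`, `password`, `host`, and `port` keys are an assumption about how the secret's JSON is laid out. `get_secret_value` and its `SecretString` field are the real Secrets Manager API.

```python
import json


def load_docdb_credentials(secrets_client, secret_id):
    """Fetch DocumentDB credentials at runtime instead of hardcoding them.

    secrets_client is expected to behave like boto3.client("secretsmanager").
    The key names read below are an assumed secret layout.
    """
    response = secrets_client.get_secret_value(SecretId=secret_id)
    # The secret payload arrives as a JSON string.
    secret = json.loads(response["SecretString"])
    return {
        "username": secret["username"],
        "password": secret["password"],
        "host": secret["host"],
        "port": int(secret.get("port", 27017)),
    }
```

Calling the helper on each new connection (rather than once at import time) lets the application pick up a rotated password without a redeploy.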
Operational Monitoring and Alerting
High frequency: Use CloudWatch metrics (CPUUtilization, DatabaseConnections, ReadLatency, WriteLatency, FreeableMemory) with CloudWatch Alarms to detect performance degradation. Enable DocumentDB Profiler to log slow queries to CloudWatch Logs for query optimization.
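As a rough illustration of the alarm side of this pattern, the sketch below builds the arguments for CloudWatch's `put_metric_alarm` call. The helper name, the alarm name format, the 80% threshold, and the five one-minute evaluation periods are all assumptions chosen for the example. The `AWS/DocDB` namespace and the `DBClusterIdentifier` dimension are the ones DocumentDB publishes.

```python
def cpu_alarm_params(cluster_id, threshold_percent=80.0, sns_topic_arn=None):
    """Build keyword arguments for cloudwatch.put_metric_alarm (hypothetical helper)."""
    params = {
        "AlarmName": f"docdb-{cluster_id}-high-cpu",  # naming scheme is illustrative
        "Namespace": "AWS/DocDB",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "DBClusterIdentifier", "Value": cluster_id}],
        "Statistic": "Average",
        "Period": 60,             # seconds per datapoint
        "EvaluationPeriods": 5,   # alarm after 5 consecutive breaching minutes
        "Threshold": float(threshold_percent),
        "ComparisonOperator": "GreaterThanThreshold",
    }
    if sns_topic_arn:
        # Notify an SNS topic when the alarm fires.
        params["AlarmActions"] = [sns_topic_arn]
    return params
```

With boto3 installed, the result could be passed as `boto3.client("cloudwatch").put_metric_alarm(**cpu_alarm_params("my-cluster"))`; the same shape works for DatabaseConnections or FreeableMemory by swapping the metric name and comparison.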
Polyglot Persistence — Document + Graph
High frequency: Use DocumentDB for flexible JSON document storage (e.g., user profiles, product catalogs) and Neptune for graph relationship traversal (e.g., social connections, fraud detection). A common exam pattern tests whether candidates can distinguish document vs. graph use cases.
Polyglot Persistence — Document vs. Key-Value
High frequency: DocumentDB is preferred when rich query capabilities (aggregation pipelines, nested document queries) are needed. DynamoDB is preferred for pure key-value access at massive scale with single-digit millisecond latency. Exam questions test this distinction heavily.
Real-Time Change Data Capture (CDC) Pipeline
Medium frequency: Use a DocumentDB Change Stream consumer application (running on EC2 or ECS) to publish change events to Kinesis Data Streams, enabling real-time analytics, downstream microservice fan-out, and audit trails at scale.
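A minimal sketch of the consumer's forwarding loop is shown below. `forward_changes` is a hypothetical helper: it assumes change events shaped like the documents a MongoDB driver yields from a change stream (`_id` resume token, `operationType`, `documentKey`, optional `fullDocument`) and a client that behaves like `boto3.client("kinesis")`. Retry handling and durable storage of the resume token are left out.

```python
import json


def forward_changes(change_events, kinesis_client, stream_name):
    """Publish each change event to Kinesis and return the last resume token."""
    resume_token = None
    for event in change_events:
        doc_key = event.get("documentKey") or {}
        # Partition by document id so changes to one document stay ordered
        # on a single shard; fall back to the operation type otherwise.
        partition_key = str(doc_key.get("_id", event["operationType"]))
        payload = {
            "operationType": event["operationType"],
            "documentKey": doc_key,
            "fullDocument": event.get("fullDocument"),
        }
        kinesis_client.put_record(
            StreamName=stream_name,
            # default=str keeps non-JSON types (ObjectId, datetime) serializable.
            Data=json.dumps(payload, default=str).encode("utf-8"),
            PartitionKey=partition_key,
        )
        # Persist this token somewhere durable to resume after a restart.
        resume_token = event["_id"]
    return resume_token
```

With pymongo, the events could come from `collection.watch(full_document="updateLookup", resume_after=saved_token)`, which is where the returned token would be fed back in after a restart.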
Hybrid Relational + Document Architecture
Medium frequency: Use RDS (Aurora/PostgreSQL/MySQL) for transactional relational data and DocumentDB for flexible, schema-less document storage in the same application. Common in e-commerce: RDS for orders/inventory, DocumentDB for product catalog with variable attributes.
MongoDB to DocumentDB Migration
Medium frequency: Use AWS DMS with a MongoDB source endpoint and DocumentDB target endpoint for online migration with minimal downtime. DMS supports full-load and CDC (ongoing replication) modes. Validate compatibility using the MongoDB Compatibility Checker before migration.
Caching Layer for Read-Heavy Workloads
Medium frequency: Place ElastiCache (Redis or Memcached) in front of DocumentDB to cache frequently accessed documents, reducing read latency and I/O costs. Cache-aside pattern: check cache first, read from DocumentDB on miss, populate cache.
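The cache-aside flow can be sketched in a few lines. Everything here is illustrative: `InMemoryCache` is a hypothetical stand-in for an ElastiCache client, `load_from_docdb` is any callable that reads one document from DocumentDB, and the 300-second TTL is an arbitrary default.

```python
import time


class InMemoryCache:
    """Hypothetical stand-in for an ElastiCache client (get/set with a TTL)."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._items[key]  # expired entries count as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, time.monotonic() + ttl_seconds)


def get_document(doc_id, cache, load_from_docdb, ttl_seconds=300):
    """Cache-aside read: check the cache, fall back to DocumentDB, then populate."""
    cached = cache.get(doc_id)
    if cached is not None:
        return cached                      # cache hit: no database I/O
    document = load_from_docdb(doc_id)     # cache miss: read from DocumentDB
    if document is not None:
        cache.set(doc_id, document, ttl_seconds)
    return document
```

A short TTL bounds how stale a cached document can be; writes that must be visible immediately would also need to delete or overwrite the cached entry.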
Full-Text Search Augmentation
Medium frequency: DocumentDB handles document storage and transactional queries; OpenSearch Service handles full-text search, faceted search, and analytics. Sync data from DocumentDB to OpenSearch using Change Streams + Lambda or DMS.
DocumentDB uses a CLUSTER ENDPOINT (not instance endpoint) for writes. Always route write traffic to the cluster endpoint — it automatically points to the current primary after failover. Applications hardcoded to an instance endpoint will BREAK during failover.
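To make the endpoint advice concrete, here is a hedged sketch of assembling a connection string against the cluster endpoint. `cluster_uri` is a hypothetical helper and the `global-bundle.pem` file name is an assumption about where the CA bundle is saved; the query options follow the commonly documented DocumentDB connection string (replica set mode, TLS on, retryable writes off).

```python
from urllib.parse import quote_plus, urlencode


def cluster_uri(username, password, cluster_endpoint, port=27017,
                ca_file="global-bundle.pem"):
    """Build a MongoDB URI that targets the cluster endpoint, not an instance."""
    options = {
        "tls": "true",
        "tlsCAFile": ca_file,
        # Connecting as a replica set lets the driver follow the primary
        # after a failover instead of pinning to one host.
        "replicaSet": "rs0",
        "readPreference": "secondaryPreferred",
        "retryWrites": "false",
    }
    # Percent-encode credentials so characters like "@" or "/" stay valid.
    return (
        f"mongodb://{quote_plus(username)}:{quote_plus(password)}"
        f"@{cluster_endpoint}:{port}/?{urlencode(options)}"
    )
```

The resulting string can be handed to a MongoDB driver such as `pymongo.MongoClient`; only the host part needs to change between environments.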
Encryption at rest CANNOT be enabled on an existing unencrypted DocumentDB cluster. The only path to encrypt an unencrypted cluster is: (1) Create a snapshot, (2) Copy the snapshot with encryption enabled using a KMS CMK, (3) Restore from the encrypted snapshot. This pattern is tested across DocumentDB, RDS, and Aurora.
DocumentDB is MongoDB-COMPATIBLE, not MongoDB-identical. It implements the MongoDB wire protocol but is NOT a MongoDB fork. Some MongoDB operators, commands, and Atlas-specific features are NOT supported. Exam scenarios mentioning 'full MongoDB compatibility' or 'MongoDB Atlas features' are testing whether you know this distinction.
DocumentDB storage is automatically replicated SIX WAYS across THREE Availability Zones — this happens at the storage layer automatically, regardless of how many instances you configure. Even a single-instance cluster has 6-way replicated storage. Instances are the compute layer only.
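Lambda CAN consume DocumentDB Change Streams through an event source mapping, but the setup differs from DynamoDB Streams: change streams must be enabled on the collection first, the mapping authenticates with credentials stored in Secrets Manager, and Lambda needs network access to the cluster's VPC. A polling application on EC2/ECS that publishes to Kinesis Data Streams remains a valid alternative. Know these prerequisites for architecture questions.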
When comparing DocumentDB vs. Neptune vs. DynamoDB: DocumentDB = flexible JSON documents with rich queries; Neptune = graph traversal (relationships); DynamoDB = key-value/wide-column at massive scale with predictable single-digit ms latency. Memorize this triangle — it appears in almost every database selection question.
Encrypting a DocumentDB SNAPSHOT with a CMK does NOT encrypt the live cluster. The snapshot encryption is independent. To have a CMK-encrypted cluster, encryption must be enabled at cluster creation time with the CMK specified — or restore from an encrypted snapshot.
Encryption at rest is IMMUTABLE — set at cluster creation only. To encrypt an unencrypted cluster: snapshot → copy with encryption → restore as new cluster. This is tested constantly.
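Lambda supports an event source mapping for DocumentDB Change Streams, but with prerequisites: change streams enabled, credentials in Secrets Manager, and VPC connectivity. A polling intermediary (EC2/ECS app publishing to Kinesis) is the alternative. DynamoDB Streams integrate with Lambda without these extra steps, so don't assume the two are identical.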
DocumentDB is MongoDB-COMPATIBLE (wire protocol), NOT MongoDB-identical. MongoDB 6.0+ and Atlas features are unsupported. Always use the cluster endpoint (not instance endpoint) for writes — it survives failover automatically.
Change Streams are NOT enabled by default. You enable them per collection, database, or cluster with the modifyChangeStreams admin command; the cluster parameter change_stream_log_retention_duration only controls how long change events are retained. Exam questions about event-driven architectures with DocumentDB always assume Change Streams are explicitly enabled.
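In practice, change streams are switched on with the `modifyChangeStreams` admin command, while `change_stream_log_retention_duration` governs retention. The sketch below only builds that command document; `modify_change_streams_command` is a hypothetical helper, and the database and collection names in the usage note are placeholders.

```python
def modify_change_streams_command(database, collection, enable=True):
    """Return the admin command that toggles change streams for one collection.

    Python dicts preserve insertion order, which matters here because the
    command name must be the first key in the document sent to the server.
    """
    return {
        "modifyChangeStreams": 1,
        "database": database,
        "collection": collection,
        "enable": enable,
    }
```

With pymongo, the command could be issued as `MongoClient(uri).admin.command(modify_change_streams_command("shop", "orders"))`, after which `collection.watch()` can open a stream on that collection.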
For CROSS-REGION disaster recovery with DocumentDB, use DocumentDB Global Clusters, NOT cross-region read replicas (which don't exist in standard DocumentDB). Global Clusters typically provide an RPO of seconds and an RTO of under 1 minute for regional failover.
DocumentDB t3/t4g instances are BURSTABLE and best suited to development/testing. Production workloads should generally use r-series (r5, r6g) memory-optimized instances. Exam scenarios about unpredictable performance spikes on DocumentDB often involve t-series instances exhausting CPU credits.
DocumentDB Elastic Clusters offer SERVERLESS HORIZONTAL SHARDING — automatically scale read/write capacity without managing instances. This is distinct from standard DocumentDB clusters. Exam questions about 'unpredictable, highly variable' DocumentDB workloads should point to Elastic Clusters.
For COST OPTIMIZATION on DocumentDB: (1) Use Reserved Instances for predictable workloads (up to ~60% savings), (2) Monitor I/O costs — they can dominate bills for high-throughput workloads, (3) Use ElastiCache to reduce DocumentDB read I/O, (4) Right-size instances (avoid over-provisioned r-series for dev/test).
Common Mistake
Encrypting a DocumentDB snapshot with a Customer Managed Key (CMK) means the live cluster is now encrypted with that CMK.
Correct
Snapshot encryption is completely independent of cluster encryption. Encrypting a snapshot (or copying a snapshot with encryption) does NOT retroactively encrypt the source cluster. To have an encrypted cluster, you must restore from the encrypted snapshot into a NEW cluster — the original unencrypted cluster remains unencrypted.
This is one of the most dangerous exam traps. Candidates assume that because the snapshot is encrypted, the data is protected. But the live cluster continues operating unencrypted. The correct remediation is always: snapshot → encrypt copy → restore as new cluster → update connection strings → decommission old cluster. This pattern is tested across DocumentDB, RDS, and Aurora.
Common Mistake
DocumentDB is a fork of MongoDB and supports all MongoDB features, including MongoDB Atlas capabilities and the latest MongoDB 6.x/7.x features.
Correct
DocumentDB implements the MongoDB 3.6, 4.0, and 5.0 wire protocols but is NOT a MongoDB fork — it is an entirely different database engine built by AWS that speaks the MongoDB protocol. MongoDB 6.0+ features are not supported. MongoDB Atlas-specific features (Atlas Search, Atlas Data Federation, Atlas Vector Search) are not available. Some MongoDB operators behave differently or are unsupported.
Migration questions frequently test this. If an exam scenario mentions 'the application uses MongoDB 6.0 aggregation features' or 'Atlas Search,' DocumentDB is NOT the right answer. Always check the compatibility matrix. The phrase 'MongoDB-compatible' ≠ 'MongoDB-identical.'
Common Mistake
AWS Lambda subscribes to DocumentDB Change Streams exactly as it does to DynamoDB Streams, with no additional setup.
Correct
Lambda does offer an event source mapping for DocumentDB Change Streams, but it has prerequisites that DynamoDB Streams do not: change streams must first be enabled on the collection, the mapping authenticates with credentials stored in Secrets Manager, and Lambda needs network access to the cluster's VPC. Where those constraints don't fit, use an intermediary: a polling application (on EC2/ECS/Fargate) that publishes changes to Kinesis Data Streams or SQS, which Lambda then consumes.
This is a critical architectural trap. Exam questions about real-time event-driven processing with DocumentDB test whether you know the prerequisites. Two valid architectures: DocumentDB Change Streams → Lambda event source mapping (with Secrets Manager and VPC access), or DocumentDB Change Streams → polling app (EC2/ECS) → Kinesis/SQS → Lambda. NOT: DocumentDB → Lambda with change streams left disabled.
Common Mistake
A single-instance DocumentDB cluster (no read replicas) has no high availability — data is stored in only one location.
Correct
Even a single-instance DocumentDB cluster automatically replicates data SIX WAYS across THREE Availability Zones at the storage layer. This is built into the DocumentDB storage architecture and requires no configuration. However, a single-instance cluster still has compute-level single-point-of-failure — failover requires creating a new instance (takes minutes). Always add at least one read replica for fast compute-layer failover (under 30 seconds).
Candidates conflate storage-level HA (automatic, always-on) with compute-level HA (requires replicas). Exam questions about DocumentDB HA often test this distinction. The correct answer for fast failover is: add a read replica. The storage is always highly available regardless.
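Adding a replica is a single API call. The sketch below builds the arguments for the DocumentDB `create_db_instance` operation; the helper name, the `db.r6g.large` default, and the promotion tier value are illustrative assumptions.

```python
def replica_params(cluster_id, replica_id, instance_class="db.r6g.large",
                   availability_zone=None, promotion_tier=1):
    """Build keyword arguments for docdb.create_db_instance (hypothetical helper)."""
    params = {
        "DBInstanceIdentifier": replica_id,
        "DBInstanceClass": instance_class,
        "Engine": "docdb",
        # Joining an existing cluster makes the new instance a replica
        # that shares the cluster's storage volume.
        "DBClusterIdentifier": cluster_id,
        # Lower tiers are preferred as failover targets.
        "PromotionTier": promotion_tier,
    }
    if availability_zone:
        # Placing the replica in a different AZ than the primary
        # protects the compute layer against a zone outage.
        params["AvailabilityZone"] = availability_zone
    return params
```

With boto3, the result could be passed as `boto3.client("docdb").create_db_instance(**replica_params("orders", "orders-replica-1", availability_zone="us-east-1b"))`.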
Common Mistake
You can enable encryption at rest on an existing DocumentDB cluster by modifying the cluster settings, similar to enabling other features post-creation.
Correct
Encryption at rest CANNOT be enabled on an existing unencrypted DocumentDB cluster through modification. It is an immutable setting set at cluster creation. The only way to encrypt an unencrypted cluster is: (1) Take a snapshot of the unencrypted cluster, (2) Copy the snapshot and enable encryption with a KMS key during the copy, (3) Restore a new cluster from the encrypted snapshot, (4) Update application connection strings to the new cluster.
This is tested consistently across all AWS database services (RDS, Aurora, DocumentDB). The 'snapshot → encrypt copy → restore' pattern is the universal answer for enabling encryption on an existing unencrypted database. Candidates who don't know this will select 'modify the cluster' which is incorrect.
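A rough sketch of the snapshot-and-restore path follows. Note the assumption: this version supplies the KMS key at restore time (`RestoreDBClusterFromSnapshot` accepts `KmsKeyId`) and omits the intermediate copy step, since whether an unencrypted DocumentDB snapshot can be encrypted during copy should be checked against the current API reference. Function names, the snapshot naming scheme, and polling intervals are hypothetical; `docdb` is expected to behave like `boto3.client("docdb")`.

```python
import time


def wait_for_snapshot(docdb, snapshot_id, poll_seconds=30, max_polls=120):
    """Poll until the cluster snapshot reports the 'available' status."""
    for _ in range(max_polls):
        snapshots = docdb.describe_db_cluster_snapshots(
            DBClusterSnapshotIdentifier=snapshot_id
        )["DBClusterSnapshots"]
        if snapshots and snapshots[0]["Status"] == "available":
            return
        time.sleep(poll_seconds)
    raise TimeoutError(f"snapshot {snapshot_id} not available in time")


def restore_encrypted(docdb, source_cluster, new_cluster, kms_key_id,
                      poll_seconds=30):
    """Snapshot an unencrypted cluster and restore it as a new encrypted one."""
    snapshot_id = f"{source_cluster}-pre-encryption"  # illustrative name

    # Step 1: snapshot the existing unencrypted cluster.
    docdb.create_db_cluster_snapshot(
        DBClusterIdentifier=source_cluster,
        DBClusterSnapshotIdentifier=snapshot_id,
    )
    wait_for_snapshot(docdb, snapshot_id, poll_seconds)

    # Step 2: restore into a NEW cluster, specifying the KMS key.
    docdb.restore_db_cluster_from_snapshot(
        DBClusterIdentifier=new_cluster,
        SnapshotIdentifier=snapshot_id,
        Engine="docdb",
        KmsKeyId=kms_key_id,
    )
    # The restored cluster has no instances yet; add them with
    # create_db_instance, then repoint connection strings.
    return new_cluster
```

The original cluster is untouched by this flow and stays unencrypted until it is decommissioned.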
Common Mistake
AWS Trusted Advisor can automatically categorize DocumentDB costs by business unit, and Service Control Policies (SCPs) can provide cost chargeback capabilities for DocumentDB resources.
Correct
AWS Trusted Advisor provides cost optimization recommendations (rightsizing, idle resources) but CANNOT categorize costs by business unit. SCPs are permission guardrails — they control WHAT actions are allowed, not cost attribution. For cost attribution by business unit: (1) Apply resource tags to DocumentDB clusters, (2) ACTIVATE those tags as cost allocation tags in the AWS Billing console, (3) Use AWS Cost Explorer to filter/group by those tags. Tags alone are insufficient — they MUST be activated in the Billing console.
This misconception appears directly in the exam question bank. The complete cost attribution workflow is: Tag resource → Activate tag in Billing Console → View in Cost Explorer. Skipping the activation step means tags never appear in billing reports. Trusted Advisor and SCPs are completely unrelated to cost attribution.
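The tag, activate, and report workflow maps onto three API calls, sketched below under stated assumptions. The helper names are hypothetical; `docdb` and `ce` are expected to behave like `boto3.client("docdb")` and `boto3.client("ce")`. A newly activated tag may take some time to show up in cost data, so the report request is kept as a separate step.

```python
def tag_and_activate(docdb, ce, cluster_arn, tag_key, tag_value):
    """Steps 1 and 2: tag the cluster, then activate the tag for billing."""
    docdb.add_tags_to_resource(
        ResourceName=cluster_arn,
        Tags=[{"Key": tag_key, "Value": tag_value}],
    )
    # Without activation the tag never appears in billing reports.
    ce.update_cost_allocation_tags_status(
        CostAllocationTagsStatus=[{"TagKey": tag_key, "Status": "Active"}]
    )


def cost_by_tag_request(tag_key, start_date, end_date):
    """Step 3: build a Cost Explorer get_cost_and_usage request grouped by the tag.

    Dates are ISO strings (YYYY-MM-DD); the end date is exclusive.
    """
    return {
        "TimePeriod": {"Start": start_date, "End": end_date},
        "Granularity": "MONTHLY",
        "Metrics": ["UnblendedCost"],
        "GroupBy": [{"Type": "TAG", "Key": tag_key}],
    }
```

Once the tag is active, `ce.get_cost_and_usage(**cost_by_tag_request("BusinessUnit", "2026-01-01", "2026-02-01"))` would return costs grouped by each value of that tag.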
Common Mistake
DocumentDB Global Clusters work the same as cross-region read replicas — you can add a read replica in another region directly from the cluster console.
Correct
DocumentDB does NOT support cross-region read replicas in the traditional sense. Cross-region replication is achieved through DocumentDB Global Clusters, which is a distinct feature that creates a separate global cluster with a primary region and up to 5 read-only secondary regions. The secondary regions can be promoted to primary for DR. This is fundamentally different from simply adding a read replica.
Candidates familiar with RDS cross-region read replicas assume the same model applies to DocumentDB. Global Clusters is the correct mechanism and has different setup, pricing, and failover procedures. Exam questions about DocumentDB cross-region DR should always point to Global Clusters.
DocumentDB = 'DOCS in a VAULT across 3 ZONES': Documents stored, 6-way replicated, 3 AZs — always.
The 3 NoSQL Siblings: D-N-D — DocumentDB (JSON docs), Neptune (graph), DynamoDB (key-value). Pick the right sibling for the job.
Encryption Rule: 'You can't lock a house that's already built open' — encrypt at creation or snapshot-copy-restore.
Change Streams ≠ DynamoDB Streams: Remember 'SWITCH IT ON before you subscribe': change streams are off by default, and Lambda needs Secrets Manager credentials and VPC access (or a polling app feeding Kinesis) to consume them.
Cluster Endpoint = 'The GPS that always finds the primary' — it reroutes automatically after failover. Instance endpoint = 'hardcoded address that breaks when you move.'
CertAI Tutor · SAA-C03, SAP-C02, DEA-C01, DOP-C02 · 2026-02-21