
Master every AWS migration strategy to ace exam scenarios and real-world projects
Database migration patterns define the strategies used to move data from source to target databases with minimal risk, downtime, and data loss — a core competency tested across AWS Solutions Architect, Database Specialty, and DevOps Engineer exams. AWS provides a rich ecosystem of services (DMS, SCT, Snowball, DataSync, native engine tools) that map to specific migration scenarios based on homogeneous vs. heterogeneous engines, acceptable downtime windows, data volume, and ongoing replication needs. Understanding which pattern to apply — and why — is the difference between passing and failing scenario-based exam questions.
Exam questions present a business scenario (e.g., 'minimal downtime,' 'heterogeneous engines,' 'petabyte-scale,' 'continuous replication') and expect you to select the correct migration pattern and AWS service combination. Knowing the decision tree cold is essential.
Homogeneous Migration (Like-to-Like)
Migrating between the same database engine family (e.g., Oracle on-premises → Oracle on RDS, MySQL → MySQL on Aurora). Because the schema, data types, and stored procedures are natively compatible, no schema conversion is required. AWS DMS alone is sufficient — no AWS Schema Conversion Tool (SCT) needed. Native database tools (mysqldump, pg_dump, Oracle Data Pump) can also be used directly.
When to use: Source and target engines are the same or highly compatible (MySQL → Aurora MySQL, PostgreSQL → Aurora PostgreSQL, SQL Server → RDS SQL Server). Ideal when you want simplicity and speed with low transformation overhead.
Trade-offs: Simplest pattern, but it locks you into the same engine family and misses the opportunity to modernize or reduce licensing costs. Still requires DMS for ongoing replication if you need near-zero downtime.
Heterogeneous Migration (Cross-Engine / Schema Conversion)
Migrating between different database engines (e.g., Oracle → Aurora PostgreSQL, SQL Server → MySQL, Teradata → Redshift). Requires a two-step process: (1) Use AWS Schema Conversion Tool (SCT) to convert the source schema, stored procedures, functions, and views to the target engine's dialect, then (2) Use AWS DMS to migrate the actual data. SCT produces an assessment report highlighting objects that cannot be automatically converted and must be manually remediated.
When to use: Any time source and target engines differ. Common exam scenarios: Oracle/SQL Server → open-source engines (cost reduction), commercial OLTP → Aurora, data warehouse migration to Redshift.
Trade-offs: Higher complexity — SCT conversion action items require developer effort. Stored procedures with engine-specific syntax (PL/SQL → PL/pgSQL) often need manual rewrite. Plan for a longer migration timeline and thorough testing.
Continuous Replication / Minimal Downtime (CDC Pattern)
Uses AWS DMS Change Data Capture (CDC) to continuously replicate ongoing transactions from source to target after the initial full-load phase. The source database remains live and serving production traffic throughout. A cutover window is scheduled only for the final delta sync and DNS/connection string switch — typically minutes rather than hours. DMS reads the source database's transaction logs (binlog for MySQL, redo logs for Oracle, WAL for PostgreSQL) to capture changes.
When to use: Production systems with strict SLA requirements (e.g., 'migration must not exceed 30 minutes downtime'). E-commerce, financial systems, healthcare — any workload where extended downtime is unacceptable.
Trade-offs: Requires that the source database has transaction logging enabled and accessible. Adds replication lag monitoring overhead. The target must stay in sync; large transaction bursts can cause lag spikes. The DMS replication instance must be sized appropriately to handle CDC throughput.
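The lag-monitoring overhead described above usually ends in a simple question: is replication caught up enough to start the cutover? A minimal sketch of that gate, assuming you are feeding it latency samples (in seconds) pulled from DMS's CloudWatch metrics (`CDCLatencySource`/`CDCLatencyTarget`); the threshold and window size here are illustrative, not AWS recommendations.

```python
# Sketch: decide whether the final cutover window can begin, based on
# recent DMS CDC latency samples (seconds). Thresholds are illustrative.

def cutover_ready(latency_samples, max_lag_seconds=10, window=5):
    """Return True when the last `window` latency samples are all at or
    below `max_lag_seconds`, i.e. the target is keeping up with the source."""
    if len(latency_samples) < window:
        return False  # not enough history to judge steady state
    recent = latency_samples[-window:]
    return all(lag <= max_lag_seconds for lag in recent)

# Example: lag spiked during a batch job, then settled before cutover.
samples = [120, 45, 9, 4, 3, 2, 2, 1]
print(cutover_ready(samples))  # True -- last 5 samples are all under 10 s
```

The point of the sketch is the shape of the check, not the numbers: you gate the cutover on sustained low lag, not on a single low reading.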
Large-Scale / Offline Bulk Migration (Snowball + DMS)
For datasets too large to migrate over the network in a reasonable time (typically multi-terabyte to petabyte scale), AWS Snowball Edge devices are used to physically transport the bulk of the data to AWS S3. AWS DMS then loads from S3 into the target database. A parallel CDC stream captures changes that occurred during the physical transport window, and DMS applies those changes after the bulk load completes to bring the target current.
When to use: Available network bandwidth would make a full online migration take weeks or months. Rule of thumb: if migration over your available bandwidth would exceed your acceptable downtime window, consider Snowball. Exam clue words: 'limited bandwidth,' 'petabyte-scale,' 'remote location,' 'data center with poor connectivity.'
Trade-offs: Adds physical logistics time (device shipping). Requires careful coordination between bulk load completion and CDC catch-up. Not suitable for truly real-time requirements during the bulk phase.
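The rule of thumb above can be turned into arithmetic. A minimal sketch of the online-vs-Snowball estimate; the 80% link-utilization factor is an assumption, and real transfers also contend with protocol overhead and other traffic.

```python
# Sketch: estimate online transfer time to decide between a direct DMS
# migration and Snowball Edge + CDC. Utilization factor is an assumption.

def transfer_days(data_terabytes, bandwidth_mbps, utilization=0.8):
    """Days to move `data_terabytes` over a `bandwidth_mbps` link,
    assuming only `utilization` of the link is usable for migration."""
    bits = data_terabytes * 1e12 * 8              # decimal TB -> bits
    seconds = bits / (bandwidth_mbps * 1e6 * utilization)
    return seconds / 86_400

# 500 TB over a 1 Gbps link at 80% utilization:
days = transfer_days(500, 1_000)
print(f"{days:.0f} days")   # roughly 58 days -> ship it on Snowball
```

When the estimate comes out in weeks rather than days, the Snowball Edge bulk-transfer plus DMS CDC delta pattern is the expected answer.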
Database Replatforming (Lift-and-Optimize)
Moving from a self-managed database to a fully managed AWS service (e.g., on-premises MySQL → Amazon Aurora, on-premises PostgreSQL → Amazon RDS for PostgreSQL) to gain managed backups, Multi-AZ HA, read replicas, and automatic patching — without necessarily changing the engine. Combines homogeneous migration techniques with architectural improvements like enabling Multi-AZ, adding read replicas, and integrating with IAM database authentication.
When to use: The organization wants to eliminate database administration overhead while maintaining application compatibility. Common in 'modernize without rewriting the app' scenarios.
Trade-offs: Some engine-specific features may not be available in managed form (e.g., certain Oracle features not supported on RDS). Licensing model changes (BYOL vs. License Included) affect cost.
Data Warehouse Migration (OLAP to Redshift)
Migrating traditional data warehouses (Teradata, Netezza, Greenplum, SQL Server DW, Oracle DW) to Amazon Redshift. Uses AWS SCT to convert the schema and ETL scripts, and DMS or native Redshift tools (COPY command from S3) to load data. The Amazon Redshift Migration Playbook and AWS SCT's data warehouse-specific assessment guide the conversion of proprietary SQL dialects and compression encodings.
When to use: Legacy on-premises data warehouse modernization. Exam clues: 'reduce data warehouse costs,' 'move from Teradata/Netezza to AWS,' 'petabyte-scale analytics,' 'columnar storage.'
Trade-offs: Proprietary DW SQL (BTEQ, FastExport, etc.) requires significant manual conversion effort. Distribution keys and sort keys must be redesigned for Redshift's MPP architecture — a direct port without optimization will underperform.
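The S3 bulk-load step mentioned above uses Redshift's COPY command. A minimal sketch of building that statement; the table, bucket, and IAM role ARN below are placeholders, not real values, and the statement is only constructed here, not executed.

```python
# Sketch: build a Redshift COPY statement for the S3 bulk-load step.
# Table name, S3 URI, and IAM role ARN are hypothetical placeholders.

def redshift_copy(table, s3_uri, iam_role_arn, fmt="PARQUET"):
    """Return a COPY statement loading `table` from files under `s3_uri`."""
    return (
        f"COPY {table} "
        f"FROM '{s3_uri}' "
        f"IAM_ROLE '{iam_role_arn}' "
        f"FORMAT AS {fmt};"
    )

sql = redshift_copy(
    "analytics.sales_fact",
    "s3://example-migration-bucket/sales/",
    "arn:aws:iam::123456789012:role/RedshiftCopyRole",
)
print(sql)
```

COPY parallelizes the load across Redshift's compute nodes, which is why it is preferred over row-by-row inserts for the bulk phase.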
NoSQL / Purpose-Built Database Migration
Migrating relational data to a purpose-built NoSQL service (e.g., RDS MySQL → DynamoDB) or migrating between NoSQL engines. This is a re-architecture pattern, not just a lift-and-shift: it requires a data-model redesign (an access-pattern-driven schema for DynamoDB). AWS DMS supports DynamoDB as a target and can also migrate from MongoDB to Amazon DocumentDB or DynamoDB.
When to use: Application access patterns are better served by key-value or document stores (single-digit millisecond latency at any scale, serverless scaling). Exam scenario: 'application needs to scale to millions of requests per second with consistent latency.'
Trade-offs: Highest transformation effort — requires application code changes and data model redesign. Cannot simply map relational tables to DynamoDB tables without rethinking primary keys, GSIs, and item structure.
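Because DMS will not design the DynamoDB schema for you, the table-mapping file is where you declare your choices. A sketch of an object-mapping rule for a DynamoDB target; the field names follow the DMS object-mapping format as documented, but the schema, table, and key names are hypothetical placeholders, so verify the exact structure against the current DMS documentation.

```python
import json

# Sketch: DMS table-mapping rule for a DynamoDB target. Field names
# follow the DMS object-mapping format; schema/table/key names are
# hypothetical placeholders for an example orders table.

object_mapping = {
    "rules": [{
        "rule-type": "object-mapping",
        "rule-id": "1",
        "rule-name": "orders-to-ddb",
        "rule-action": "map-record-to-record",   # one source row -> one item
        "object-locator": {"schema-name": "sales", "table-name": "orders"},
        "target-table-name": "Orders",
        "mapping-parameters": {
            # You choose the partition key -- DMS will not design it for you.
            "partition-key-name": "order_id",
            "exclude-columns": [],
        },
    }]
}

print(json.dumps(object_mapping, indent=2))
```

The partition-key decision here is exactly the "human/architect responsibility" the trade-off above refers to.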
• STEP 1 — Are source and target engines the SAME family? YES → Homogeneous migration (DMS only, no SCT needed). NO → Heterogeneous migration (SCT first, then DMS).
• STEP 2 — What is the acceptable downtime window? HOURS acceptable → Full-load migration with a scheduled maintenance window. MINUTES or ZERO → CDC pattern with DMS ongoing replication and a minimal cutover window.
• STEP 3 — How large is the dataset relative to available bandwidth? Data can transfer in time → Online migration via DMS over Direct Connect or the internet. Data TOO LARGE (weeks/months to transfer) → Snowball Edge for bulk + DMS CDC for delta.
• STEP 4 — Is the target OLAP or OLTP? OLAP/analytics → Redshift migration path (SCT + DMS, or COPY from S3). OLTP → RDS, Aurora, or DynamoDB depending on access patterns.
• STEP 5 — Does the application need to change its data model? NO → Replatform (lift-and-optimize to managed RDS/Aurora). YES → Re-architect to a purpose-built database (DynamoDB, DocumentDB, ElastiCache) with application refactoring.
QUICK RULE:
• Oracle/SQL Server license cost is a concern → Heterogeneous migration to Aurora PostgreSQL or Aurora MySQL. Need sub-millisecond key-value at scale → DynamoDB. Need managed relational with the same engine → Homogeneous migration to RDS or Aurora.
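The decision tree above can be sketched as a single function. This is a simplified illustration: the inputs collapse each step to a boolean, and the returned tool list and target are shorthand for the fuller discussion in each pattern's section.

```python
# Sketch: the five-step decision tree as a function. Inputs are
# deliberately simplified booleans; output is (tools, target) shorthand.

def pick_pattern(same_engine, min_downtime, bandwidth_ok, olap, new_data_model):
    # STEP 1: same engine family -> DMS only; different -> SCT then DMS.
    tools = ["DMS"] if same_engine else ["SCT", "DMS"]
    # STEP 2: minutes/zero downtime -> add CDC.
    if min_downtime:
        tools.append("CDC (full load + CDC)")
    # STEP 3: data too large for the wire -> Snowball bulk + CDC delta.
    if not bandwidth_ok:
        tools.append("Snowball Edge (bulk) + CDC delta")
    # STEP 4 / STEP 5: pick the target family.
    if olap:
        target = "Redshift"
    elif new_data_model:
        target = "DynamoDB / purpose-built"
    else:
        target = "RDS / Aurora"
    return tools, target

# Oracle -> Aurora PostgreSQL, near-zero downtime, adequate bandwidth:
print(pick_pattern(False, True, True, False, False))
# (['SCT', 'DMS', 'CDC (full load + CDC)'], 'RDS / Aurora')
```

Walking exam scenarios through a mental version of this function is exactly the skill the scenario questions test.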
SCT is ONLY required for heterogeneous migrations (different engine families). If source and target are the same engine, SCT is not needed — DMS alone handles it. Exam questions often include SCT as a distractor in homogeneous scenarios.
CDC (Change Data Capture) is the key to near-zero downtime migrations. DMS reads transaction logs (MySQL binlog, Oracle redo log, PostgreSQL WAL) to replicate ongoing changes. For this to work, the source must have binary logging / supplemental logging enabled BEFORE the migration starts.
When a question mentions 'petabyte-scale' or 'limited network bandwidth' for a database migration, the answer almost always involves AWS Snowball Edge for the bulk transfer combined with DMS for CDC delta sync — not a direct online DMS migration.
DMS replication instances run in a VPC. If migrating from on-premises, you need AWS Direct Connect or VPN connectivity between on-premises and the DMS replication instance's VPC. A DMS instance without network connectivity to the source is a common trap in architecture questions.
Heterogeneous migration = SCT (schema) THEN DMS (data). Homogeneous migration = DMS only. SCT in a homogeneous scenario is always a wrong answer.
Minimal/zero downtime always requires CDC mode in DMS. Full-load only = downtime. Full-load + CDC = minimal downtime cutover window.
Petabyte-scale or bandwidth-constrained migration → Snowball Edge for bulk + DMS CDC for delta. Never choose direct online DMS migration for 'terabytes with limited bandwidth' scenarios.
AWS DMS supports both full-load and CDC simultaneously in a single task (Full load + CDC mode). This is the recommended approach for minimal downtime: DMS does a full load of existing data while simultaneously capturing changes, then applies the buffered changes after full load completes.
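A sketch of how that "Full load + CDC" task looks in code. The parameter names follow the boto3 DMS `create_replication_task` API; the ARNs and identifiers are placeholders, and the dict is only built here, never sent to AWS.

```python
import json

# Sketch: parameters for a DMS "full load + CDC" task, following the
# boto3 dms.create_replication_task API. ARNs are placeholders; no AWS
# call is made in this sketch.

table_mappings = {
    "rules": [{
        "rule-type": "selection",
        "rule-id": "1",
        "rule-name": "include-sales-schema",
        "object-locator": {"schema-name": "sales", "table-name": "%"},
        "rule-action": "include",
    }]
}

task_kwargs = {
    "ReplicationTaskIdentifier": "orders-min-downtime",
    "SourceEndpointArn": "arn:aws:dms:us-east-1:123456789012:endpoint:SRC",
    "TargetEndpointArn": "arn:aws:dms:us-east-1:123456789012:endpoint:TGT",
    "ReplicationInstanceArn": "arn:aws:dms:us-east-1:123456789012:rep:RI",
    # full-load-and-cdc = bulk copy plus change capture in one task
    "MigrationType": "full-load-and-cdc",
    "TableMappings": json.dumps(table_mappings),
}
print(task_kwargs["MigrationType"])
# With credentials: boto3.client("dms").create_replication_task(**task_kwargs)
```

The `MigrationType` value is the piece exams probe: `full-load` alone means downtime, `full-load-and-cdc` is the minimal-downtime mode.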
SCT produces a migration assessment report that categorizes objects as: automatically convertible, requiring minor manual changes, or requiring significant manual effort. Exam questions about 'planning a heterogeneous migration' often reference this report as the first step.
For Oracle to Aurora PostgreSQL migrations, SCT converts PL/SQL to PL/pgSQL but cannot convert 100% of complex packages automatically. The exam may ask about the correct tool for this conversion — the answer is SCT, not DMS.
When migrating to DynamoDB, DMS supports DynamoDB as a target but you must define the attribute mapping and partition key strategy yourself — DMS does not automatically design your DynamoDB schema. The data model redesign is a human/architect responsibility.
The AWS Database Migration Service does NOT migrate database users, roles, or permissions — only schema objects (via SCT) and data (via DMS). IAM and database-level permissions must be recreated manually on the target.
DMS can be used for ongoing replication (not just one-time migration) — it can keep a target database continuously synchronized with the source as a read replica alternative across different engines. This is useful for hybrid architectures during phased migrations.
Common Mistake
AWS DMS handles schema conversion for heterogeneous migrations — you just point it at the source and target.
Correct
DMS handles DATA migration only. Schema conversion (DDL, stored procedures, functions, views) for heterogeneous migrations requires AWS Schema Conversion Tool (SCT) as a separate prerequisite step. DMS without SCT on a heterogeneous migration will fail or produce incorrect results.
This is the #1 misconception on the Database Specialty and Solutions Architect exams. Always remember: SCT = Schema, DMS = Data. Run SCT first, then DMS.
Common Mistake
You can achieve zero downtime by simply running DMS in full-load mode during a maintenance window.
Correct
Full-load mode alone requires the source to be quiesced (no writes) during migration to ensure consistency — this is NOT zero downtime. Zero/minimal downtime requires Full-load + CDC mode, where DMS captures changes during the full load and applies them afterward, allowing the source to remain live.
Exam scenarios will specify 'zero downtime' or 'minimal downtime' as requirements. The correct answer must include CDC, not just full-load.
Common Mistake
Snowball is only for S3 data transfers and cannot be used for database migrations.
Correct
Snowball Edge is a valid and recommended pattern for large-scale database migrations where network bandwidth is insufficient. The workflow is: export database to files → load onto Snowball → ship to AWS → import to S3 → DMS loads from S3 to target DB → CDC catches up the delta.
Exam questions about 'limited bandwidth + large database' are testing knowledge of the Snowball + DMS hybrid pattern. Candidates who dismiss Snowball as 'only for S3' will choose the wrong answer.
Common Mistake
AWS SCT is a cloud service you configure in the AWS Console.
Correct
AWS SCT is a downloadable desktop application (client-side tool) installed on your local machine or a migration server. It is NOT a managed AWS cloud service — it runs locally and connects to both source and target databases to perform conversion.
Architecture diagrams and exam questions sometimes test whether you know SCT runs client-side. This affects network design (SCT needs connectivity to both source and target).
Common Mistake
DMS automatically handles LOB (Large Object) columns the same as regular columns.
Correct
LOB columns (BLOB, CLOB, NCLOB, etc.) require special handling in DMS. DMS has three LOB modes: Limited LOB mode (truncates LOBs above a size threshold — fastest but lossy), Full LOB mode (migrates complete LOBs — slower), and Inline LOB mode (for small LOBs). Choosing the wrong LOB mode causes data truncation or migration failures.
LOB handling is a real operational gotcha and appears in scenario questions about data integrity during migration. Always check LOB configuration when full fidelity is required.
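The three LOB modes map to a handful of fields in the DMS task settings. A sketch of the relevant fragment, assuming the `TargetMetadata` task-settings field names from the DMS documentation (`SupportLobs`, `FullLobMode`, `LimitedSizeLobMode`, `LobMaxSize`, `LobChunkSize`); the 32 KB cap is an illustrative value, not a recommendation.

```python
import json

# Sketch: DMS task-settings fragment selecting Limited LOB mode.
# Field names follow the DMS "TargetMetadata" task settings; the 32 KB
# cap is illustrative -- LOBs larger than LobMaxSize get truncated.

task_settings = {
    "TargetMetadata": {
        "SupportLobs": True,
        "FullLobMode": False,      # True = migrate complete LOBs (slower)
        "LimitedSizeLobMode": True,
        "LobMaxSize": 32,          # KB; anything larger is truncated
        "LobChunkSize": 64,        # KB chunks used in Full LOB mode
    }
}
print(json.dumps(task_settings["TargetMetadata"], indent=2))
```

If full fidelity is required, the safe configuration is Full LOB mode (`FullLobMode: true`), accepting the slower migration.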
Common Mistake
Once DMS migrates data, the source and target are automatically kept in sync indefinitely without any ongoing cost.
Correct
DMS continuous replication (CDC) runs as an ongoing task on a replication instance that incurs hourly compute charges. The replication instance must remain running to maintain sync. Stopping the instance stops replication and risks falling behind on changes.
Cost and operational questions about long-running DMS tasks are tested. Candidates need to understand that ongoing replication = ongoing cost for the replication instance.
SCT = Schema Conversion Tool (S for Schema). DMS = Data Migration Service (D for Data). SCT before DMS — always Schema before Data in heterogeneous migrations.
CDC Decision: 'Can't Delay Cutover' → use Change Data Capture for minimal downtime migrations.
Snowball Rule: 'If it would take WEEKS over the wire, ship it on a SLED (Snowball)' — physical transfer + CDC delta.
Heterogeneous = Two tools (SCT + DMS). Homogeneous = One tool (DMS only). The longer word needs the longer toolchain.
DMS task modes: F = Full load only (downtime required), C = CDC only (target must already have schema + data), FC = Full load + CDC (the minimal downtime hero mode).
Selecting DMS alone for a heterogeneous migration (e.g., Oracle → Aurora PostgreSQL) without SCT. DMS migrates data rows, not schema objects or stored procedures. Heterogeneous migrations ALWAYS require SCT first — this trap appears on every AWS database certification exam.
CertAI Tutor · 2026-02-22