
Pick the right orchestration pattern the first time — every time — on AWS exams and in production
Container orchestration patterns define how you deploy, scale, network, and manage containerized workloads on AWS across services like ECS (Fargate & EC2 launch types), EKS, App Runner, and AWS Batch. Understanding when to use each pattern — and why — is a recurring exam theme across AWS Certified Developer, SysOps, Solutions Architect Associate/Professional, and DevOps Engineer exams. Mastery of these patterns separates candidates who pass from those who guess.
Exams test your ability to select the optimal orchestration pattern given constraints like operational overhead, cost, scaling requirements, compliance, and existing Kubernetes investment — not just which services exist.
ECS on Fargate (Serverless Containers)
AWS manages the underlying compute infrastructure entirely. You define task definitions with CPU/memory, and Fargate provisions, scales, and patches the compute. No cluster node management required. Billing is per vCPU-second and GB-second of memory consumed by running tasks.
When you want to eliminate EC2 fleet management overhead, need rapid scaling without pre-provisioning, have variable or unpredictable workloads, or when your team lacks deep infrastructure expertise. Ideal for microservices, APIs, and event-driven workloads.
Higher per-unit cost compared to reserved EC2 instances at sustained high utilization. No access to GPU instances. Cannot run privileged containers or mount host-level volumes. Cold start latency exists for new task provisioning. Windows containers on Fargate have additional constraints.
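The per-second billing model above can be sketched in a few lines. The rates below are illustrative assumptions for the sake of the arithmetic, not current AWS pricing:

```python
# Sketch of the Fargate billing model: per vCPU-second plus per GB-second
# of memory, from task start to task stop. Rates are ASSUMED, not real.
VCPU_PER_HOUR = 0.04048   # assumed USD per vCPU-hour
GB_PER_HOUR = 0.004445    # assumed USD per GB-hour of memory

def fargate_task_cost(vcpu: float, memory_gb: float, seconds: float) -> float:
    """Cost of one Fargate task of the given size running for `seconds`."""
    hours = seconds / 3600
    return vcpu * hours * VCPU_PER_HOUR + memory_gb * hours * GB_PER_HOUR

# A 0.5 vCPU / 1 GB task running for one hour:
cost = fargate_task_cost(0.5, 1.0, 3600)  # ~ $0.0247 under the assumed rates
```

Note that a stopped task costs nothing: there is no idle-capacity charge, which is the key contrast with the EC2 launch type below.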
ECS on EC2 Launch Type (Self-Managed Compute)
You manage a cluster of EC2 instances registered with ECS. The ECS agent runs on each instance and the scheduler places tasks onto available capacity. You retain full control over instance types, AMIs, storage, and networking. Supports Auto Scaling Groups (ASG) and Capacity Providers.
When you need GPU workloads, require privileged containers, need specific instance types not available in Fargate, have compliance requirements mandating dedicated hosts, or when cost optimization via Reserved/Spot Instances at sustained utilization outweighs operational overhead.
You own patching, scaling the cluster nodes, handling bin-packing inefficiency, and managing ECS agent updates. Requires careful Capacity Provider configuration to avoid stranded capacity or failed task placements.
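A minimal sketch of the Capacity Provider configuration this warns about, shaped like a boto3 `ecs.create_capacity_provider` request. The ASG ARN, provider name, and target capacity are hypothetical placeholders:

```python
# Sketch of an ECS Capacity Provider for the EC2 launch type (shape of the
# boto3 ecs.create_capacity_provider request). Names/ARN are hypothetical.
capacity_provider = {
    "name": "app-ec2-cp",
    "autoScalingGroupProvider": {
        "autoScalingGroupArn": "arn:aws:autoscaling:...:autoScalingGroup/app-asg",
        "managedScaling": {
            "status": "ENABLED",
            # Keep ~10% headroom so task placement rarely waits on a new instance
            # (too high strands capacity; too low fails placements).
            "targetCapacity": 90,
            "minimumScalingStepSize": 1,
            "maximumScalingStepSize": 4,
        },
        # Stop scale-in from terminating instances that still run tasks.
        "managedTerminationProtection": "ENABLED",
    },
}
```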
Amazon EKS (Managed Kubernetes)
AWS manages the Kubernetes control plane (API server, etcd, scheduler). You manage worker nodes (EC2 managed node groups, self-managed nodes, or Fargate profiles). Provides full Kubernetes API compatibility, enabling portability and use of the Kubernetes ecosystem (Helm, service meshes, operators).
When you have existing Kubernetes expertise, need Kubernetes-native tooling (Helm charts, CRDs, operators), require multi-cloud or hybrid portability, run complex microservice architectures needing service mesh (Istio/App Mesh), or your organization mandates Kubernetes as the standard.
Higher operational complexity and cost than ECS/Fargate. EKS control plane has a per-cluster hourly charge. Steeper learning curve. Kubernetes upgrade management still required for worker nodes even though control plane is managed.
EKS on Fargate (Serverless Kubernetes)
Combines EKS Kubernetes API compatibility with Fargate's serverless compute model. Pods matching Fargate profiles run on AWS-managed infrastructure with no node management. Each pod runs in its own isolated micro-VM (Firecracker) for strong security isolation.
When you want Kubernetes API compatibility without managing EC2 worker nodes. Ideal for teams migrating from on-premises Kubernetes who want reduced operational burden, or when workload isolation and security are paramount (each pod = isolated VM).
Does not support DaemonSets, privileged pods, stateful workloads with certain volume types, or GPU instances. Not all Kubernetes features are available. Persistent storage limited to EFS (no EBS for Fargate pods). Higher cost per pod than EC2 node groups at scale.
AWS App Runner (Fully Managed Container Service)
Highest abstraction level — point App Runner at a container image (ECR) or source code repository, and AWS handles everything: build, deploy, load balancing, TLS, auto-scaling (including scale-to-zero), and health checks. No task definitions, no cluster configuration.
When developers want to deploy web applications or APIs with zero infrastructure knowledge. Perfect for startups, internal tools, proof-of-concepts, or teams where developer velocity trumps cost optimization. Automatic scaling including scale-to-zero minimizes cost for low-traffic apps.
Least control of any option. No VPC integration by default (VPC connector available but adds complexity). Limited customization of runtime environment. Not suitable for complex microservice topologies, batch workloads, or stateful applications.
AWS Batch (Batch Container Workloads)
Managed batch computing service that dynamically provisions optimal EC2/Fargate compute resources based on job queue requirements. Supports job dependencies, array jobs, multi-node parallel jobs (MPI), and priority queues. Containers run as jobs, not long-running services.
For scheduled or on-demand batch processing: ETL pipelines, genomics, financial modeling, rendering, ML training jobs. When workloads are finite, parallelizable, and don't require always-on containers. Eliminates need to manage HPC clusters manually.
Not designed for long-running services or real-time workloads. Job startup latency can be significant. Monitoring and debugging batch jobs requires familiarity with CloudWatch Logs and Batch console. Spot interruption handling must be designed into job retry logic.
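An array job with retry logic — the pattern suggested above for parallelizable work on Spot — can be sketched as a boto3 `batch.submit_job` payload. Queue and job definition names are hypothetical:

```python
# Sketch of an AWS Batch array-job submission (shape of the boto3
# batch.submit_job request). Queue/definition names are hypothetical.
array_job = {
    "jobName": "nightly-etl",
    "jobQueue": "etl-queue",        # hypothetical job queue
    "jobDefinition": "etl-job:3",   # hypothetical job definition revision
    # Fan out 100 child jobs; each sees AWS_BATCH_JOB_ARRAY_INDEX 0..99.
    "arrayProperties": {"size": 100},
    # Retries absorb Spot interruptions, per the caveat above.
    "retryStrategy": {"attempts": 3},
}
```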
Blue/Green Deployment Pattern (ECS + CodeDeploy)
Maintain two identical environments (blue = current, green = new version). Traffic shifts from blue to green after validation. AWS CodeDeploy integrates natively with ECS to automate this pattern, supporting canary, linear, and all-at-once traffic shifting via Application Load Balancer listener rules.
When you need zero-downtime deployments, fast rollback capability, and production validation before full traffic shift. The gold standard for production microservice deployments. Required pattern when exam asks about 'minimize downtime during deployment' with containers.
Temporarily doubles infrastructure cost during deployment window. More complex pipeline setup. Rollback is fast but not instantaneous — CodeDeploy must detect failure and initiate rollback.
Sidecar Pattern (Multi-Container Task/Pod)
Deploy a helper container alongside the main application container within the same ECS Task or Kubernetes Pod. The sidecar shares the network namespace (localhost communication) and optionally volumes. Common sidecars: log shippers (Fluent Bit), service mesh proxies (Envoy/AWS App Mesh), secrets injectors, monitoring agents.
When you need to add cross-cutting concerns (logging, tracing, security, service mesh) without modifying application code. In ECS, define multiple containers in one Task Definition. In EKS, define multiple containers in one Pod spec. AWS FireLens uses this pattern for log routing.
Increases task/pod resource consumption. Sidecar failures can impact main container if not isolated properly. Adds complexity to task/pod definitions. In Fargate, all containers in a task share the task's total CPU/memory allocation.
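A minimal sketch of the sidecar pattern as an ECS task definition (shape of the boto3 `ecs.register_task_definition` request), using the FireLens log-routing sidecar mentioned above. The app image URI is a hypothetical placeholder:

```python
# Sketch: app container + Fluent Bit log-router sidecar in ONE Fargate task.
# On Fargate both containers share the task-level cpu/memory allocation.
task_definition = {
    "family": "web-with-logs",
    "requiresCompatibilities": ["FARGATE"],
    "networkMode": "awsvpc",
    "cpu": "512",      # total, shared by BOTH containers
    "memory": "1024",
    "containerDefinitions": [
        {
            "name": "app",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/app:latest",  # hypothetical
            "essential": True,
            # Route the app's stdout/stderr to the sidecar via FireLens.
            "logConfiguration": {"logDriver": "awsfirelens"},
        },
        {
            "name": "log-router",  # the sidecar
            "image": "public.ecr.aws/aws-observability/aws-for-fluent-bit:stable",
            "essential": True,
            "firelensConfiguration": {"type": "fluentbit"},
        },
    ],
}
```

Both containers share the task's network namespace, so the app reaches the sidecar over localhost with no service discovery involved.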
Service Mesh Pattern (AWS App Mesh / EKS Service Mesh)
Inject Envoy proxy sidecars into every service to handle east-west traffic management, mutual TLS (mTLS), circuit breaking, retries, and observability without application code changes. AWS App Mesh provides a managed control plane. In EKS, Istio or App Mesh can be used.
For complex microservice architectures with many service-to-service calls where you need fine-grained traffic control, canary routing between service versions, end-to-end encryption between services, and distributed tracing without code instrumentation.
Significant operational complexity. Each Envoy sidecar consumes CPU/memory. Debugging mesh configuration issues is non-trivial. App Mesh is being succeeded by Amazon VPC Lattice for many use cases — know both for exams.
Event-Driven Container Scaling (ECS + SQS / EventBridge)
Scale ECS tasks or EKS pods based on queue depth or event volume rather than CPU/memory metrics. Use Application Auto Scaling with a custom metric (SQS ApproximateNumberOfMessagesVisible) to drive ECS service scaling. KEDA (Kubernetes Event-Driven Autoscaling) provides the same for EKS.
When workload volume is driven by message queues, streams, or events rather than CPU load. Processing SQS queues, Kinesis streams, or DynamoDB Streams with containers. Prevents queue backup during traffic spikes and scales down to zero during quiet periods.
Requires custom CloudWatch metric publishing or KEDA installation. Scaling lag between metric publication and task launch must be accounted for in queue processing SLA design. Scale-in must be handled carefully to avoid dropping in-flight messages.
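The scaling math behind this pattern is simple: target a fixed backlog per task, the way a target-tracking policy on a "backlog per task" custom metric behaves. A minimal sketch (bounds are illustrative):

```python
import math

# Sketch of queue-driven scaling: choose a desired task count from SQS
# backlog so each task owns at most `msgs_per_task` messages.
def desired_task_count(visible_messages: int, msgs_per_task: int,
                       min_tasks: int = 1, max_tasks: int = 50) -> int:
    """Clamp ceil(backlog / per-task target) between the service bounds."""
    wanted = math.ceil(visible_messages / msgs_per_task)
    return max(min_tasks, min(max_tasks, wanted))

desired_task_count(1200, 100)  # -> 12 tasks for a 1200-message backlog
```

In production the `visible_messages` input would come from the SQS `ApproximateNumberOfMessagesVisible` metric, published to CloudWatch and consumed by an Application Auto Scaling policy; the floor of 1 reflects that plain ECS target tracking does not scale a service to zero.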
• STEP 1 — Do you need Kubernetes API compatibility or have existing K8s investment? YES → EKS (EC2 node groups for full control, Fargate profiles for serverless pods). NO → continue.
• STEP 2 — Is this a batch/finite job workload (not a long-running service)? YES → AWS Batch. NO → continue.
• STEP 3 — Do you want zero infrastructure management and maximum developer simplicity? YES → App Runner (for web apps/APIs) or ECS Fargate (for more control over networking/VPC). NO → continue.
• STEP 4 — Do you need GPU instances, privileged containers, specific instance types, or maximum cost optimization via Reserved/Spot at sustained scale? YES → ECS on EC2 launch type with Capacity Providers. NO → ECS Fargate. |
STEP 5 — Deployment strategy:
• Need zero-downtime with fast rollback? → Blue/Green with CodeDeploy.
• Need gradual traffic shift validation? → Canary via CodeDeploy linear/canary configs.
• Need cross-cutting concerns (logging/tracing/security) without code changes? → Sidecar pattern.
• Need service-to-service traffic control at scale? → Service Mesh (App Mesh or VPC Lattice).
• Need queue-driven scaling? → Event-driven scaling with SQS + Application Auto Scaling.
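The STEP 1-4 service choice above (deployment strategy aside) reduces to a short decision function — a sketch for self-testing, not an official rubric:

```python
def pick_orchestrator(needs_k8s: bool, batch_job: bool,
                      zero_infra: bool, needs_ec2_features: bool) -> str:
    """Walk the compute-selection steps of the decision tree in order."""
    if needs_k8s:                          # STEP 1: K8s API / existing investment
        return "EKS"
    if batch_job:                          # STEP 2: finite, not long-running
        return "AWS Batch"
    if zero_infra:                         # STEP 3: maximum simplicity
        return "App Runner / ECS Fargate"
    if needs_ec2_features:                 # STEP 4: GPU/privileged/Reserved-Spot
        return "ECS on EC2"
    return "ECS Fargate"                   # default
```

Note the ordering matters: a batch workload on Kubernetes still answers YES at STEP 1 and lands on EKS, which matches how exam scenarios weight an explicit Kubernetes requirement above all else.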
ECS Fargate tasks are billed per vCPU-second and GB-memory-second from task start to stop — there is NO charge for idle capacity between tasks. EC2 launch type bills for running instances regardless of task utilization. This cost model difference drives many exam scenario answers.
When an exam scenario mentions 'minimize operational overhead' OR 'no infrastructure management' AND containers are involved, the answer is almost always ECS Fargate or App Runner — never ECS EC2 or self-managed EKS node groups.
EKS control plane costs a per-cluster hourly fee regardless of workload size. For small workloads, this makes EKS more expensive than ECS. Exam questions testing cost optimization for small container workloads should favor ECS Fargate over EKS.
Blue/Green deployments in ECS require AWS CodeDeploy (not CodePipeline alone, not rolling updates). The ALB shifts traffic between two target groups. CodeDeploy supports three traffic shifting modes: Canary (e.g., 10% then 90%), Linear (e.g., 10% every N minutes), and AllAtOnce. Know which mode to recommend for which scenario.
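The difference between the three shifting modes is easiest to see as a timeline of "% of traffic on green". A sketch, with timings modeled on config names like `ECSCanary10Percent5Minutes` (percent/interval values are illustrative):

```python
def traffic_shift_steps(mode: str, percent: int = 10, interval_min: int = 5):
    """Return (minute, % of traffic on green) checkpoints for each mode."""
    if mode == "AllAtOnce":
        return [(0, 100)]
    if mode == "Canary":      # e.g. 10% first, then the remaining 90%
        return [(0, percent), (interval_min, 100)]
    if mode == "Linear":      # e.g. +10% every 5 minutes until 100%
        steps, shifted, minute = [], 0, 0
        while shifted < 100:
            shifted = min(100, shifted + percent)
            steps.append((minute, shifted))
            minute += interval_min
        return steps
    raise ValueError(mode)
```

Canary is the usual recommendation when the scenario wants a small validation slice before full cutover; Linear when it wants a sustained gradual shift; AllAtOnce when speed beats validation.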
ECS Fargate does NOT support DaemonSets, privileged containers, or GPU instances. EKS Fargate profiles also do NOT support DaemonSets or privileged pods. If an exam scenario requires any of these, the answer must involve EC2-based compute (ECS EC2 or EKS managed node groups).
Fargate = no infrastructure management but NO GPU/DaemonSets/privileged containers. Any scenario requiring those must use EC2-based compute.
Blue/Green ECS deployments REQUIRE CodeDeploy + ALB with two target groups — rolling update is the ECS default and is NOT blue/green.
EKS has a per-cluster control plane hourly charge making it more expensive than ECS for small workloads — choose ECS Fargate for cost-optimized small container deployments unless Kubernetes API compatibility is explicitly required.
AWS FireLens is the ECS-native log routing solution using Fluent Bit or Fluentd as a sidecar container in the ECS task definition. It replaces the need for the awslogs driver for advanced log routing to S3, Kinesis, OpenSearch, or third-party tools. Know FireLens as the answer to 'centralized container log routing' questions.
Amazon VPC Lattice is the newer AWS-native service for service-to-service networking that replaces many App Mesh use cases. For exams after 2023, VPC Lattice may appear as the preferred answer for east-west traffic management between ECS/EKS services, especially for cross-VPC service connectivity.
ECS Service Connect (launched 2022) provides service discovery and inter-service communication within an ECS cluster using AWS Cloud Map, without requiring a full service mesh. It is the preferred answer for 'simple service discovery between ECS services' — lighter weight than App Mesh.
For event-driven container scaling, the exam pattern is: SQS queue depth → CloudWatch custom metric → Application Auto Scaling policy → ECS Service desired count adjustment. Know that ECS does NOT natively read SQS depth — you must use Application Auto Scaling with a custom metric or target tracking policy.
App Runner automatically scales active instances to zero during periods of no traffic — idle provisioned instances incur only the memory charge, not the vCPU charge — making it uniquely cost-effective for low-traffic or intermittent workloads. ECS Fargate does NOT scale to zero by default — a service's desired count can be set to zero, but only through explicit configuration or automation.
Common Mistake
ECS and EKS are interchangeable — just pick whichever you prefer and the outcome is the same
Correct
ECS and EKS have fundamentally different operational models, pricing structures, feature sets, and ecosystem tooling. ECS is AWS-proprietary with lower operational overhead and no control plane cost. EKS provides Kubernetes API compatibility (critical for portability and ecosystem tools) but has per-cluster control plane charges and higher complexity. The choice depends on Kubernetes requirements, team skills, and portability needs.
Exam questions frequently hinge on whether Kubernetes compatibility is explicitly required. If not mentioned, ECS is usually the simpler, cheaper answer. If K8s ecosystem, Helm, operators, or multi-cloud portability are mentioned, EKS is correct.
Common Mistake
Fargate is always cheaper than EC2 launch type because you only pay for what you use
Correct
Fargate is cheaper for variable, spiky, or low-utilization workloads. For sustained high-utilization workloads, EC2 Reserved Instances or Savings Plans can be significantly cheaper than Fargate per-second pricing. The crossover point depends on utilization percentage — Fargate becomes more expensive when tasks run near 100% utilization continuously.
Cost optimization questions require knowing WHEN each model wins. The exam may describe a workload running 24/7 at high utilization — the correct answer would be EC2 launch type with Reserved Instances, not Fargate.
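The crossover reasoning can be made concrete with a toy comparison: a month of Fargate (billed only while tasks run) versus a month of a reserved EC2 fleet (billed 24/7). All prices here are illustrative assumptions, not real AWS rates:

```python
# Toy Fargate-vs-Reserved-EC2 monthly comparison. Prices are ASSUMED.
HOURS = 730                      # hours per month
FARGATE_PER_TASK_HOUR = 0.049    # assumed: one 1 vCPU / 2 GB task
EC2_RESERVED_HOUR = 0.030        # assumed: instance fitting ~2 such tasks

def monthly_cost(tasks: int, utilization: float) -> tuple:
    """Return (fargate_cost, ec2_cost) for a month at the given utilization."""
    fargate = tasks * FARGATE_PER_TASK_HOUR * HOURS * utilization
    instances = -(-tasks // 2)   # ceil division: 2 tasks per instance
    ec2 = instances * EC2_RESERVED_HOUR * HOURS   # billed regardless of use
    return fargate, ec2
```

Under these assumed numbers, four tasks at 20% utilization are cheaper on Fargate, while the same four tasks running 24/7 are cheaper on the reserved fleet — exactly the 'crossover point depends on utilization' argument above.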
Common Mistake
Blue/Green deployments automatically happen when you update an ECS service — you just push a new image
Correct
By default, ECS performs a rolling update (replacing tasks gradually). Blue/Green deployment with instant traffic switching and fast rollback requires explicit integration with AWS CodeDeploy and an Application Load Balancer with two target groups. This must be configured at service creation time — it cannot be added to an existing rolling-update service without recreation.
Candidates confuse 'rolling update' (ECS default) with 'blue/green' (requires CodeDeploy). Exam questions about zero-downtime + fast rollback = CodeDeploy Blue/Green. Questions about gradual replacement = rolling update.
Common Mistake
EKS Fargate profiles eliminate all node management concerns including DaemonSets for monitoring and logging
Correct
EKS Fargate does NOT support DaemonSets. This means node-level monitoring agents (like CloudWatch Container Insights agent deployed as DaemonSet) and log collectors cannot run as DaemonSets on Fargate nodes. You must use sidecar containers or alternative approaches (like Fluent Bit sidecar) for logging and monitoring on Fargate pods.
This is a critical architectural constraint. Exam scenarios describing 'centralized logging for EKS Fargate' must use sidecar-based solutions, not DaemonSet-based ones. Getting this wrong leads to architecturally invalid answers.
Common Mistake
AWS App Runner is just a simpler version of ECS Fargate — they're basically the same thing
Correct
App Runner is a fundamentally higher abstraction. App Runner handles source code builds, automatic TLS certificate management, built-in load balancing, and scale-to-zero — none of which ECS Fargate does natively. ECS Fargate still requires you to define task definitions, configure ALBs, manage service scaling policies, and handle TLS via ACM separately. App Runner trades control for simplicity.
When an exam question describes a developer who 'just wants to deploy code without any infrastructure configuration,' App Runner is the answer — not ECS Fargate. Fargate still requires meaningful infrastructure configuration.
Common Mistake
Container image storage in ECR is free — only compute costs money
Correct
Amazon ECR charges for storage (per GB per month) and data transfer out. ECR Public Gallery images have free pulls from within AWS regions. Private ECR repositories incur storage charges for all stored image layers. For exam cost optimization questions involving container images, ECR storage costs are a real consideration, especially with many image versions retained.
Lifecycle policies in ECR are tested as a cost optimization mechanism — they automatically expire old image versions to reduce storage costs. Knowing ECR has storage costs makes lifecycle policy questions make sense.
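A minimal sketch of such a lifecycle policy — the JSON document passed to ECR's `put_lifecycle_policy` — that expires all but the newest images (the retention count of 10 is illustrative):

```python
# Sketch of an ECR lifecycle policy that expires old image versions,
# the storage-cost lever described above. Retention count is illustrative.
lifecycle_policy = {
    "rules": [
        {
            "rulePriority": 1,
            "description": "Keep only the 10 most recent images",
            "selection": {
                "tagStatus": "any",
                "countType": "imageCountMoreThan",
                "countNumber": 10,
            },
            "action": {"type": "expire"},
        }
    ]
}
```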
FAKE-BS for orchestration choice: Fargate=Agility, K8s-required=EKS, Batch-jobs=Batch, Simple-web-app=App-Runner — 'FAKE Batch Simplifies' your decision
For deployment patterns remember 'BLue/Green = Both Live, CodeDeploy Governs' — always needs CodeDeploy + ALB dual target groups
Fargate CANNOT: GPU, DaemonSets, Privileged containers — remember 'GDP is forbidden in Fargate' (GPU, DaemonSets, Privileged)
Sidecar shares: Network namespace (localhost), Volumes (optional), Lifecycle (same task/pod) — 'SNL: Same Network, same Lifecycle'
Confusing ECS rolling updates (the default) with Blue/Green deployments. Exam questions about 'zero-downtime deployment with instant rollback capability' require CodeDeploy Blue/Green — not ECS service rolling updates, not CodePipeline alone. Candidates who don't know this distinction consistently choose wrong answers on deployment scenario questions.
CertAI Tutor · · 2026-02-22