
Run code without servers — pay only for what you use, scale to millions automatically.
AWS Lambda is a fully managed, event-driven, serverless compute service that executes your code in response to triggers without requiring server provisioning or management. It automatically scales from zero to thousands of concurrent executions, handles all infrastructure concerns (patching, capacity, availability), and charges only for the compute time consumed. Lambda is the backbone of modern serverless architectures on AWS, integrating natively with over 200 AWS services.
Execute application logic in response to events (HTTP requests, file uploads, stream records, schedules) without managing any infrastructure, enabling rapid development and cost-efficient scaling.
Synchronous invocation
Caller waits for response. Used by API Gateway, ALB, Cognito, SDK direct calls. No automatic retries on failure.
Asynchronous invocation
Lambda queues the event and returns immediately. Used by S3, SNS, EventBridge. Retries up to 2 times. Supports DLQ and EventBridge Pipes for failure handling.
Event source mapping (poll-based)
Lambda polls SQS, Kinesis Data Streams, DynamoDB Streams, MSK, and self-managed Kafka. Lambda manages the polling at no additional Lambda charge (though SQS API requests made by the pollers are billed as standard SQS requests).
Reserved concurrency
Dedicates a fixed amount of concurrency to a function: that capacity is always available to it (a guaranteed floor) and the function can never exceed it (a throttle ceiling).
Provisioned concurrency
Pre-initializes execution environments to eliminate cold starts. Billed for provisioned environments even when idle. Critical for latency-sensitive applications.
Lambda Layers
Share libraries, custom runtimes, or configuration across functions. Up to 5 layers per function. Total unzipped size (function + layers) must be ≤250 MB. Not supported with container images.
Lambda Extensions
Run alongside your function to integrate monitoring, security, and governance tools (e.g., Datadog, HashiCorp Vault). Delivered as Lambda Layers.
Container image support
Package functions as OCI-compliant container images up to 10 GB. Use AWS base images or custom images implementing the Lambda Runtime API. Layers not supported.
VPC integration
Lambda can run inside a VPC to access RDS, ElastiCache, or other private resources. Uses Hyperplane ENI — modern architecture avoids ENI exhaustion. Adds ~100ms to cold start.
Amazon EFS integration
Mount EFS file systems for shared, persistent storage across concurrent Lambda executions. Requires VPC configuration.
Function URLs
Built-in HTTPS endpoint for a Lambda function without needing API Gateway. Supports IAM auth or no auth. Supports response streaming.
Response streaming
Stream responses progressively; streamed responses have a 20 MB soft limit (versus the 6 MB buffered payload limit). Supported via Function URLs with the Node.js managed runtimes (or custom runtimes). Improves time-to-first-byte for large responses.
Versions and aliases
Publish immutable versions (v1, v2, ...). Aliases are mutable pointers to versions. Use alias routing to split traffic between versions (canary/weighted deployments).
Code signing
Ensures only code signed by approved entities (via AWS Signer) can be deployed. Enforced via code signing configurations.
Dead Letter Queue (DLQ)
Capture failed asynchronous invocations in SQS or SNS. Configure on the function for async failure handling.
Lambda Destinations
Route async invocation results (success OR failure) to SQS, SNS, EventBridge, or another Lambda. More flexible than DLQ — preferred modern pattern.
SnapStart (Java)
Dramatically reduces cold start latency for Java functions by taking a snapshot of the initialized execution environment. Supported for Java 11+ managed runtimes.
Environment variables
Key-value pairs injected at runtime. Encrypted at rest using KMS. Total size limited to 4 KB aggregate.
X-Ray tracing
Enable active tracing to send trace data to AWS X-Ray; Lambda runs the X-Ray daemon for you inside the execution environment. The execution role must include the xray:PutTraceSegments and xray:PutTelemetryRecords permissions (both in the AWSXRayDaemonWriteAccess managed policy).
CloudWatch Logs integration
Lambda automatically sends logs to CloudWatch Logs. Log group: /aws/lambda/<function-name>. Execution role needs logs:CreateLogGroup, logs:CreateLogStream, logs:PutLogEvents.
Durable (long-running) Lambda functions
Stateful, multi-step workflows using Lambda with Step Functions can run for up to one year. Not the Lambda function itself — the orchestration via Step Functions.
Recursive loop detection
Lambda detects recursive invocation loops (e.g., Lambda → SQS/SNS → the same Lambda) and, when recursive loop detection is set to Terminate (the default), stops the loop after about 16 recursive invocations to prevent runaway costs. Set it to Allow only for intentional recursion.
Supported runtimes
Node.js, Python, Java, .NET (C#/PowerShell), Ruby, Go (custom runtime). Custom runtimes via Lambda Runtime API for any language.
Serverless REST/HTTP API Backend
API Gateway routes HTTP/HTTPS requests to Lambda synchronously (RequestResponse). Lambda processes business logic and returns a response. API Gateway handles auth (Cognito, Lambda authorizers), throttling, caching, and SSL. The canonical serverless web API pattern. Watch for the concurrency mismatch: API Gateway throttles at 10,000 req/s but Lambda defaults to 1,000 concurrent executions — use reserved concurrency and request increases proactively.
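A minimal sketch of the handler side of this pattern, assuming API Gateway's Lambda proxy integration (the event and response field names below are the proxy-integration format; the greeting logic is purely illustrative):

```python
import json

def lambda_handler(event, context):
    """Minimal API Gateway proxy-integration handler sketch.

    API Gateway invokes this synchronously (RequestResponse) and
    relays the returned dict as the HTTP response.
    """
    # queryStringParameters is None when the request has no query string.
    name = (event.get("queryStringParameters") or {}).get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello, {name}"}),
    }
```

Because the invocation is synchronous, any exception surfaces to API Gateway as a 502; there are no automatic Lambda-side retries.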
Event-Driven File Processing
S3 sends event notifications (PUT, POST, COPY, DELETE) to Lambda asynchronously. Lambda processes files (image resizing, ETL, virus scanning) and writes results to another S3 bucket or database. Use separate source and destination buckets to avoid recursive trigger loops. Lambda retries up to 2 times on failure — use DLQ/Destinations for error capture.
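The S3 side can be sketched as an event-parsing handler. Note the URL-decoding of the object key, a frequent bug with keys containing spaces. The actual boto3 fetch is omitted so the sketch runs without AWS credentials; the bucket/key shapes are the standard S3 notification format:

```python
from urllib.parse import unquote_plus

def lambda_handler(event, context):
    """Sketch of an asynchronous S3 event-notification handler.

    S3 delivers a Records list; object keys are URL-encoded, so
    decode them before use. Write outputs to a DIFFERENT bucket
    to avoid a recursive trigger loop.
    """
    processed = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])
        # Real code would fetch/transform the object with boto3 here
        # (omitted: requires AWS credentials).
        processed.append((bucket, key))
    return processed
```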
Decoupled Queue Processing (Event Source Mapping)
Lambda polls SQS queues via event source mapping. Lambda scales based on queue depth (up to 1,000 concurrent for standard queues; for FIFO queues, one batch per message group at a time, so concurrency scales with the number of active message groups). Messages are deleted from the queue only after successful processing. On failure, messages return to the queue and eventually go to the SQS DLQ. Batch size and batch window are configurable. Use SQS as a buffer between high-throughput event producers and Lambda to prevent throttling.
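To avoid reprocessing an entire batch when one message fails, the handler can return a partial batch response. A sketch, assuming ReportBatchItemFailures is enabled on the event source mapping (`process` is a hypothetical stand-in for real business logic):

```python
def process(body):
    # Hypothetical business logic: reject empty message bodies.
    if not body:
        raise ValueError("empty message")

def lambda_handler(event, context):
    """SQS handler sketch using partial batch responses.

    Only the failed message IDs are reported; SQS deletes the
    successful ones and redelivers just the failures (eventually
    routing them to the SQS DLQ after maxReceiveCount).
    """
    failures = []
    for record in event["Records"]:
        try:
            process(record["body"])
        except Exception:
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```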
Database Change Data Capture
Lambda processes DynamoDB Streams records via event source mapping. Receives old and new item images. Use for replication, audit logging, cache invalidation, or triggering downstream workflows. Lambda scales to one concurrent execution per shard. Failures block the shard — use bisect-on-error and maximum retry attempts to handle poison pills.
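A stream-consuming handler might look like this sketch. The OldImage/NewImage attribute-value shapes ({"S": ...}, {"N": ...}) are DynamoDB's stream record format; the change-list structure returned here is purely illustrative:

```python
def lambda_handler(event, context):
    """DynamoDB Streams handler sketch (event source mapping).

    Each record carries the event name plus old/new item images
    in DynamoDB attribute-value format. An unhandled exception
    blocks the shard, so keep per-record handling defensive.
    """
    changes = []
    for record in event["Records"]:
        ddb = record["dynamodb"]
        changes.append({
            "event": record["eventName"],   # INSERT | MODIFY | REMOVE
            "old": ddb.get("OldImage"),     # absent on INSERT
            "new": ddb.get("NewImage"),     # absent on REMOVE
        })
    return changes
```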
Real-Time Stream Processing
Lambda reads from Kinesis shards via event source mapping. One concurrent execution per shard. Supports enhanced fan-out for lower latency. Use parallelization factor (1–10) to process multiple batches per shard concurrently. Critical for real-time analytics, IoT data processing, and log aggregation pipelines.
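Kinesis payloads arrive base64-encoded inside the records. A minimal decoding sketch (the UTF-8 assumption is illustrative; payloads can be arbitrary bytes):

```python
import base64

def lambda_handler(event, context):
    """Kinesis event-source-mapping handler sketch.

    Record data is base64-encoded. A failing batch blocks its
    shard, so pair this with bisect-on-error and a max-retry
    setting in production.
    """
    payloads = []
    for record in event["Records"]:
        data = base64.b64decode(record["kinesis"]["data"]).decode("utf-8")
        payloads.append(data)
    return payloads
```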
Fan-Out Async Processing
SNS delivers messages to Lambda asynchronously. Lambda is invoked once per SNS message. Use SNS→Lambda fan-out to trigger multiple Lambda functions in parallel from a single event. SNS does NOT support event source mapping — it pushes directly. Lambda retries 2 times on failure; configure DLQ on Lambda for unprocessed messages.
Scheduled Tasks and Event Routing
EventBridge invokes Lambda on a cron/rate schedule or in response to AWS service events, custom events, or SaaS partner events. Replaces CloudWatch Events (same service, rebranded). Use for automated operations, compliance checks, data pipeline triggers, and cross-account event routing. EventBridge invokes Lambda asynchronously.
Serverless Orchestration for Long-Running Workflows
Step Functions orchestrates Lambda functions for workflows exceeding 15 minutes, requiring human approval, parallel branches, error handling, or retry logic. Each Lambda handles one discrete task. Avoids the anti-pattern of synchronous nested Lambda calls. Express Workflows suit high-volume, short-duration work; Standard Workflows suit long-running, durable workflows (up to 1 year).
Observability Stack
Lambda automatically ships logs to CloudWatch Logs. Enable X-Ray active tracing for distributed tracing across services. Use Lambda Insights (CloudWatch) for enhanced metrics (memory usage, init duration, cold starts). Lambda Extensions can integrate third-party APM tools without modifying function code.
Database Access with Connection Pooling
Lambda functions connecting directly to RDS can exhaust database connection limits under high concurrency (each execution = new connection). Use RDS Proxy as a connection pooler between Lambda and RDS. RDS Proxy maintains a warm pool of connections, dramatically reducing connection overhead. Requires Lambda in the same VPC as RDS.
Serverless Auth Customization
Cognito User Pools invoke Lambda synchronously via triggers (pre-signup, post-confirmation, pre-token generation, custom auth). Lambda customizes auth flows, validates data, enriches JWT tokens, or integrates with external identity providers. Lambda must respond within Cognito's timeout window.
ML Inference Invocation
Lambda invokes SageMaker endpoints for real-time ML inference as part of API or event-driven workflows. Lambda handles pre/post-processing; SageMaker handles the model. For lightweight models, Lambda can host inference directly using container images (up to 10 GB). Lambda is NOT suitable for training — use SageMaker Training Jobs.
Generative AI Application Backend
Lambda invokes Bedrock Foundation Models (Claude, Titan, Llama, etc.) via the Bedrock Runtime API for text generation, summarization, and RAG patterns. Lambda handles prompt construction, Bedrock invocation, and response processing. Combine with Knowledge Bases for RAG and Agents for multi-step AI workflows. Watch for timeout limits with large model responses.
Least-Privilege Execution and Resource Policy
Every Lambda function needs an IAM execution role (what the function can DO — e.g., read S3, write DynamoDB). Resource-based policies control who can INVOKE the function (push model — e.g., allow S3 to invoke). These are two separate IAM constructs. Exam questions frequently test whether you modify the execution role vs. the resource policy.
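The two constructs can be put side by side. The policy-document shapes below are standard IAM JSON expressed as Python dicts; the account IDs, ARNs, and resource names are hypothetical:

```python
# Identity-based policy attached to the EXECUTION ROLE:
# what the function itself is allowed to do.
execution_role_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["dynamodb:PutItem"],
        "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/orders",
    }],
}

# RESOURCE-BASED policy attached to the function:
# who is allowed to invoke it (push model).
resource_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "s3.amazonaws.com"},
        "Action": "lambda:InvokeFunction",
        "Resource": "arn:aws:lambda:us-east-1:123456789012:function:process-upload",
        "Condition": {"ArnLike": {"AWS:SourceArn": "arn:aws:s3:::my-upload-bucket"}},
    }],
}
```

"Function cannot read the bucket" points at the execution role; "S3 cannot trigger the function" points at the resource policy.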
Shared Persistent Storage for Serverless
Mount Amazon EFS access points to Lambda for shared, persistent storage accessible across concurrent executions and other compute services. Use for ML model files, shared datasets, or stateful processing. Requires Lambda in a VPC. EFS is the only shared FILE SYSTEM available across concurrent Lambda executions; for non-file shared state, use DynamoDB or ElastiCache.
Lambda timeout is 900 seconds (15 minutes) — any question describing a workload needing MORE than 15 minutes must use Step Functions + Lambda, Fargate, or EC2. 'Lambda alone' is always wrong for >15-minute tasks.
You CANNOT configure CPU directly on Lambda — CPU scales proportionally with memory. To speed up a CPU-bound function, increase memory allocation. At 1,769 MB = 1 full vCPU. This is the ONLY way to increase compute power.
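The proportional relationship is simple arithmetic. The 1,769 MB figure is AWS's documented 1-vCPU point; the helper function itself is just an illustration:

```python
def vcpus_for_memory(memory_mb):
    """CPU on Lambda scales linearly with allocated memory.

    1,769 MB corresponds to exactly 1 vCPU; below that you get a
    proportional fraction, above it multiple vCPUs.
    """
    return memory_mb / 1769
```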
The default regional concurrency limit is 1,000 concurrent executions shared across ALL functions. API Gateway default throttle is 10,000 req/s. This mismatch causes 429 TooManyRequests errors. Solution: request concurrency limit increase AND implement SQS as a buffer between API Gateway and Lambda.
Async invocations (S3, SNS, EventBridge) retry up to 2 times automatically. Sync invocations (API Gateway, ALB, SDK direct) do NOT retry — the caller must retry. After async retries are exhausted, configure Lambda Destinations or DLQ to capture failed events.
Reserved concurrency serves DUAL purposes: (1) it GUARANTEES that many executions are always available for the function, AND (2) it CAPS the maximum concurrency (acts as a throttle). Setting reserved concurrency to 0 effectively disables the function. Provisioned concurrency eliminates cold starts but costs money even when idle.
Lambda in a VPC requires subnets with sufficient IP addresses and a NAT Gateway (or VPC endpoints) to reach AWS services or the internet. Lambda in a VPC does NOT automatically get internet access — a common architecture mistake. Without NAT, the function can only reach VPC-internal resources.
For Lambda connecting to RDS under high concurrency, always recommend RDS Proxy. Direct RDS connections from Lambda exhaust the database connection pool because each concurrent Lambda execution opens a new connection. RDS Proxy pools and multiplexes connections.
Workload > 15 minutes = Lambda CANNOT do it alone. Answer is always Step Functions + Lambda, Fargate, or EC2. No exceptions.
CPU-bound function is slow? Increase MEMORY — it is the ONLY way to get more CPU on Lambda. CPU is not configurable separately. At 1,769 MB = 1 full vCPU.
Reserved concurrency = guaranteed capacity + throttle ceiling (does NOT eliminate cold starts). Provisioned concurrency = pre-warmed environments (DOES eliminate cold starts, costs money idle). Know which problem each solves.
Lambda Destinations (for async invocations) are MORE FLEXIBLE than DLQ because they can route BOTH successes and failures to SQS, SNS, EventBridge, or another Lambda. DLQ only captures failures. Prefer Destinations for new architectures.
The 250 MB unzipped deployment package limit INCLUDES all Lambda Layers. If your function is 100 MB and you add 3 layers totaling 200 MB, you exceed the limit. Container images (up to 10 GB) solve large dependency problems but cannot use Layers.
Lambda SnapStart (Java) takes a snapshot of the initialized execution environment and restores it on subsequent invocations, dramatically reducing Java cold starts. It does NOT eliminate cold starts entirely — the first initialization still occurs. Available for Java 11+ managed runtimes only.
Environment variables are limited to 4 KB TOTAL aggregate. For larger configuration, use SSM Parameter Store (free for standard parameters) or Secrets Manager (paid, for secrets). Retrieve at runtime in the function initialization code (outside the handler) to cache across warm invocations.
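A common caching sketch for this pattern. The real SSM/Secrets Manager call is injected as a parameter so the example runs without AWS credentials; in real code, `fetch` would wrap boto3's `ssm.get_parameter` or `secretsmanager.get_secret_value`:

```python
import time

# Module scope runs once per execution environment (at cold start),
# so a cache here survives across WARM invocations of that environment.
_config_cache = {}

def get_config(name, ttl_seconds=300, fetch=None):
    """Return a cached parameter value, refetching after ttl_seconds.

    'fetch' is a stand-in for the real SSM / Secrets Manager call.
    """
    now = time.monotonic()
    entry = _config_cache.get(name)
    if entry is None or now - entry[1] > ttl_seconds:
        _config_cache[name] = (fetch(name), now)
    return _config_cache[name][0]
```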
Lambda Aliases enable traffic splitting between two function versions (canary deployments). Combined with CodeDeploy, you can automate gradual traffic shifting (canary, linear, all-at-once) with automatic rollback on CloudWatch alarm triggers. This is the correct answer for 'zero-downtime Lambda deployments'.
/tmp storage is ephemeral and execution-environment-scoped — it persists across WARM invocations of the same execution environment but is NOT shared between different concurrent executions. Use EFS (requires VPC) for shared persistent storage across concurrent executions.
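A per-environment /tmp cache illustrating warm reuse. The path and the `download` callable are illustrative; note this caches within ONE execution environment only and does not share data across concurrent executions:

```python
import os

TMP_CACHE = "/tmp/model-cache.bin"  # illustrative path

def load_cached(download):
    """Fetch a large artifact once per execution environment.

    /tmp persists across WARM invocations of the same environment
    (up to the configured ephemeral storage size), but every
    concurrent environment has its own isolated copy.
    """
    if not os.path.exists(TMP_CACHE):
        with open(TMP_CACHE, "wb") as f:
            f.write(download())  # cold path: fetch exactly once
    with open(TMP_CACHE, "rb") as f:
        return f.read()
```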
For Kinesis and DynamoDB Streams event source mappings, failures BLOCK the shard — Lambda retries the failing batch until it succeeds or expires. Use bisect-on-error (splits batch to isolate bad records), maximum retry attempts, and destination on failure to handle poison pill messages and prevent shard stalls.
Lambda@Edge runs functions at CloudFront edge locations — it has STRICTER limits than standard Lambda: max 5 seconds timeout for viewer requests/responses, 30 seconds for origin requests/responses, max 128 MB memory for viewer functions, 10 GB for origin functions. Do NOT apply standard Lambda limits to Lambda@Edge questions.
Cold starts are affected by: (1) runtime — compiled languages (Java, .NET) are slower to initialize than interpreted (Python, Node.js); (2) memory size — more memory = faster initialization; (3) VPC configuration — adds ~100ms for Hyperplane ENI setup; (4) deployment package size — larger packages = slower cold start. Provisioned concurrency is the definitive solution.
Lambda billing uses 1-millisecond granularity (changed from 100ms in December 2020). Duration is rounded UP to the nearest millisecond. Free tier: 1M requests + 400,000 GB-seconds per month, PERMANENT (not just first year).
Common Mistake
Lambda functions scale instantly and infinitely — there are no scaling limits.
Correct
Lambda scales at a rate of 1,000 new execution environments every 10 seconds per function (burst scaling limit). The default regional concurrency cap is 1,000 concurrent executions (soft limit, can be increased). Sudden traffic spikes can cause throttling (429) until scaling catches up.
Exam questions describe traffic spike scenarios and ask why Lambda returns 429 errors or how to handle burst traffic. The answer involves understanding the scaling ramp-up rate, using SQS as a buffer, and proactively requesting concurrency limit increases. 'Auto-scaling' does not mean 'instant unlimited scaling.'
Common Mistake
You can configure CPU allocation directly on Lambda to speed up compute-intensive functions.
Correct
Lambda does NOT expose direct CPU configuration. CPU is allocated proportionally to memory. At 1,769 MB you get exactly 1 vCPU. To get more CPU, you MUST increase memory allocation. This is the only knob available.
This is one of the most-tested Lambda performance questions. Candidates who come from EC2 backgrounds expect separate CPU/memory configuration. The correct answer to 'how do I speed up a CPU-bound Lambda function?' is always 'increase memory allocation.'
Common Mistake
Lambda functions in a VPC can access the internet and AWS services by default.
Correct
Lambda in a VPC loses internet access by default. To reach the internet, the function must be in a private subnet with a NAT Gateway (or NAT instance) routing to an Internet Gateway. To reach AWS services (S3, DynamoDB, etc.) without internet, use VPC Endpoints (Gateway or Interface endpoints). Without these, the function can ONLY reach resources inside the VPC.
A very common architecture mistake and exam trap. Questions describe a Lambda function in a VPC that can't reach S3 or the internet and ask for the fix. The answer is NAT Gateway (for internet) or VPC Gateway Endpoint (for S3/DynamoDB) — not 'remove from VPC' or 'add an Internet Gateway directly.'
Common Mistake
Reserved concurrency and provisioned concurrency are the same thing — both eliminate cold starts.
Correct
Reserved concurrency guarantees a dedicated amount of concurrency for a function AND caps it at that same number (a throttle ceiling); it does NOT eliminate cold starts. Provisioned concurrency PRE-WARMS execution environments to eliminate cold starts but costs money even when idle. They solve different problems: reserved = capacity guarantee + throttle; provisioned = cold start elimination.
Exam questions frequently present both as options and test whether you know the distinction. If the question asks about eliminating cold starts, the answer is provisioned concurrency. If the question asks about guaranteeing capacity or preventing a function from consuming too much shared concurrency, the answer is reserved concurrency.
Common Mistake
Lambda DLQ (Dead Letter Queue) captures both failed and successful invocations for auditing purposes.
Correct
Lambda DLQ captures ONLY failed asynchronous invocations after all retries are exhausted. It does NOT capture successes. For capturing both successes and failures, use Lambda Destinations, which can route to SQS, SNS, EventBridge, or another Lambda based on success or failure conditions.
Exam questions about audit trails and failure handling test whether you know the DLQ-vs-Destinations distinction. DLQ = failures only, async only. Destinations = success AND failure, async only, more routing flexibility. Prefer Destinations for new architectures.
Common Mistake
Lambda functions can directly share data between concurrent executions using /tmp storage.
Correct
/tmp storage is scoped to a single execution environment. Concurrent Lambda executions run in SEPARATE execution environments, each with their own isolated /tmp. There is NO way for concurrent executions to share /tmp data. Use Amazon EFS (mounted via VPC) for shared persistent storage across concurrent executions.
This misconception leads to broken architectures where developers expect shared state via /tmp. The correct solution for shared storage between concurrent Lambda executions is EFS. For shared ephemeral state, use ElastiCache or DynamoDB.
Common Mistake
Lambda Layers can be used with container image Lambda functions to add shared dependencies.
Correct
Lambda Layers are NOT supported for container image-based Lambda functions. When using container images, ALL dependencies must be included in the container image itself. Layers are only available for .zip deployment packages.
A direct exam trap. Questions may describe a scenario using container images and ask which Lambda features are available. Layers are explicitly not supported. This also affects the 250 MB unzipped limit — container images use the separate 10 GB limit.
Common Mistake
Increasing Lambda memory always increases cost because you are using more resources.
Correct
Increasing memory INCREASES cost per millisecond but can DECREASE total cost if the function completes significantly faster. Since billing is duration × memory (GB-seconds), a function using 256 MB for 1000ms costs the same as 512 MB for 500ms. For CPU-bound functions, doubling memory often more than halves execution time, resulting in lower total cost AND better performance.
Exam questions about cost optimization for Lambda may present 'reduce memory' as the obvious answer. The correct approach is to use AWS Lambda Power Tuning tool to find the optimal memory setting — sometimes HIGHER memory is MORE cost-effective.
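The break-even arithmetic can be checked directly. The per-GB-second price below is illustrative (it varies by region and architecture; check the current AWS price list):

```python
# Illustrative duration price per GB-second; NOT an authoritative figure.
PRICE_PER_GB_SECOND = 0.0000166667

def duration_cost(memory_mb, duration_ms):
    """Duration cost = (GB allocated) x (billed seconds) x price.

    Billing granularity is 1 ms; this sketch assumes duration_ms
    is already the billed (rounded-up) duration.
    """
    return (memory_mb / 1024) * (duration_ms / 1000) * PRICE_PER_GB_SECOND
```

256 MB for 1,000 ms and 512 MB for 500 ms both consume 0.25 GB-seconds, so doubling memory is free whenever it at least halves duration, and a net win for CPU-bound functions that speed up more than 2x.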
Common Mistake
Lambda automatically handles recursive invocation loops — you don't need to design against them.
Correct
While Lambda has recursive loop detection that can throttle recursive calls, this is a safety net, NOT a design principle. Recursive loops (Lambda A triggers Lambda B which triggers Lambda A) are an anti-pattern that can cause runaway costs and must be explicitly avoided in architecture. Use Step Functions for orchestration instead of chained Lambda invocations.
Exam questions about Lambda anti-patterns and cost runaway scenarios test this knowledge. The correct answer is always to redesign to eliminate the loop, not to rely on Lambda's detection mechanism.
STAR for Lambda invocation types: Sync (API Gateway/ALB waits), Trigger-async (S3/SNS/EventBridge fires and forgets), Auto-poll (SQS/Kinesis/DynamoDB Streams — Lambda polls). Remember: API Gateway is SYNC, S3 is ASYNC, SQS is POLL.
COLD START killers in order of effectiveness: Provisioned Concurrency (eliminates), SnapStart for Java (dramatically reduces), Increase Memory (speeds up init), Smaller Package (faster load), Avoid VPC if possible (saves ~100ms).
The '15-minute rule': If a task takes MORE than 15 minutes, Lambda CANNOT do it alone. Answer: Step Functions + Lambda or Fargate.
Memory = CPU on Lambda: 'More RAM = More Compute'. At 1,769 MB = 1 vCPU. Think of it as a linear dial: turn up memory, turn up CPU.
DLQ vs Destinations: 'DLQ = Dead (failures only). Destinations = Both Directions (success + failure).' Prefer Destinations for new designs.
VPC Lambda internet rule: 'VPC = No Internet by Default. Need NAT for net, Need Endpoint for AWS services.' — VPC is a walled garden; you must build the doors.
Reserved vs Provisioned: 'Reserved = Reservation (guaranteed slot + ceiling). Provisioned = Pre-warmed (no cold start, costs money idle).'
CertAI Tutor · DEA-C01, DVA-C02, DOP-C02, SAA-C03, SCS-C02, AIF-C01, SAP-C02, CLF-C02 · 2026-02-21