
End-to-end request tracing across microservices, Lambda, and APIs — see exactly where latency hides
AWS X-Ray is a distributed tracing service that collects data about requests as they travel through your application, helping you analyze and debug production and distributed applications. It provides a visual service map showing the relationships between services, pinpoints performance bottlenecks, and identifies root causes of errors and latency issues. X-Ray works across EC2, ECS, Lambda, Elastic Beanstalk, API Gateway, and more — but requires SDK instrumentation in your application code to generate trace data.
Identify performance bottlenecks, errors, and root causes in distributed and microservices architectures by providing end-to-end request tracing with a visual service map.
Distributed Tracing with Trace IDs
A unique Trace ID is propagated via HTTP headers (X-Amzn-Trace-Id) across all services in a request path
Service Map (Visual Graph)
Auto-generated visual map of all services and their connections, color-coded by health status
Segments and Subsegments
Segments represent work done by a service; subsegments represent downstream calls (DB queries, HTTP calls, etc.)
Annotations (Indexed Key-Value Pairs)
Filterable metadata attached to segments; used in GetTraceSummaries filter expressions
Metadata (Non-Indexed Key-Value Pairs)
Rich contextual data of any type, including objects and lists; stored with the trace but not indexed, so it cannot be searched or filtered
Sampling Rules (Custom)
Define rules by service name, URL, HTTP method, host, and resource ARN to control what gets traced
X-Ray Daemon
A lightweight process that receives UDP traffic from SDKs and forwards to X-Ray API; required on EC2/ECS; built-in on Lambda
X-Ray SDK (Multiple Languages)
Available for Java, Python, Node.js, Ruby, Go, .NET; must be integrated into application code
X-Ray API
PutTraceSegments, PutTelemetryRecords, GetTraceSummaries, GetServiceGraph, GetTraceGraph
Lambda Active Tracing
Enable via Lambda console or SAM/CloudFormation; daemon runs automatically; no separate installation needed
API Gateway Integration
Enable tracing per stage in API Gateway; passes trace header to downstream Lambda or HTTP integrations
Elastic Beanstalk Integration
Enable via .ebextensions or console; daemon is pre-installed on Beanstalk platforms
ECS Integration
Run X-Ray daemon as a sidecar container in the same task definition
X-Ray Groups
Filter expression-based groups; can have CloudWatch alarms on error/fault/throttle rates per group
CloudWatch ServiceLens Integration
ServiceLens in CloudWatch combines X-Ray traces with CloudWatch metrics and logs for unified observability
AWS Distro for OpenTelemetry (ADOT)
AWS-supported distribution of OpenTelemetry that can send traces to X-Ray; recommended for new implementations
Cross-Account Tracing
Traces can span accounts when using ADOT or when services pass the trace header across account boundaries
Encryption
Trace data encrypted at rest using AWS managed keys or customer-managed KMS keys
Filter Expressions
Query traces by annotation values, response time, HTTP status, service name, etc.
Fault vs Error vs Throttle Classification
Faults = 5xx (server-side); Errors = 4xx (client-side); Throttle = 429 (subset of errors)
Serverless Active Tracing
High frequency: Enable Active Tracing on the Lambda function (TracingConfig: Active). The X-Ray daemon runs automatically in the Lambda execution environment, so no sidecar is needed. The SDK instruments the function handler and downstream calls. Lambda records two segments per invocation, one for the Lambda service (AWS::Lambda) and one for the function (AWS::Lambda::Function); the function segment contains Initialization and Invocation subsegments. Critical for debugging cold starts and downstream latency.
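As a minimal sketch of the TracingConfig setting described above, the snippet below builds the request for Lambda's UpdateFunctionConfiguration API. The function name `orders-handler` is hypothetical, and the boto3 call is left as a comment because it assumes boto3 is installed and credentials are configured.

```python
import json


def active_tracing_request(function_name: str) -> dict:
    """Build keyword arguments for Lambda's UpdateFunctionConfiguration call."""
    return {
        "FunctionName": function_name,
        # "Active" turns tracing on; "PassThrough" is the other accepted mode.
        "TracingConfig": {"Mode": "Active"},
    }


request = active_tracing_request("orders-handler")  # hypothetical function name
print(json.dumps(request, indent=2))

# Assumption: boto3 is available and credentials are configured.
#   import boto3
#   boto3.client("lambda").update_function_configuration(**request)
```

In SAM templates the equivalent setting is the `Tracing: Active` property on the function resource.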
API Gateway Stage-Level Tracing
High frequency: Enable X-Ray tracing at the API Gateway stage level. API Gateway creates a segment for each request and passes the X-Amzn-Trace-Id header downstream. IMPORTANT: This only traces the API Gateway portion and what it calls; it does NOT automatically trace all downstream services unless they also instrument with X-Ray SDK. Common exam trap: candidates think enabling API Gateway tracing covers the entire application.
CloudWatch ServiceLens Unified Observability
High frequency: CloudWatch ServiceLens integrates X-Ray service maps with CloudWatch metrics and logs into a single pane of glass. You can navigate from a CloudWatch alarm → ServiceLens service map → individual X-Ray traces. X-Ray Groups can trigger CloudWatch alarms on error/fault rates. This is the primary pattern for operations teams needing unified observability.
Sidecar Daemon Container Pattern
High frequency: Deploy the X-Ray daemon as a sidecar container in the same ECS task definition as your application container. Application containers send UDP traffic to the daemon on port 2000 using the container's local network. The daemon forwards batched segments to the X-Ray API. The task IAM role must include xray:PutTraceSegments and xray:PutTelemetryRecords permissions.
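The sidecar layout can be sketched as the `containerDefinitions` part of an ECS task definition. This is an illustration under assumptions, not a complete task definition: the application image name is hypothetical, the daemon image is assumed to be the public `amazon/aws-xray-daemon` image, and the task is assumed to use `awsvpc` networking so both containers share localhost.

```python
import json


def container_definitions(app_image: str) -> list:
    """Return an app container plus an X-Ray daemon sidecar for one ECS task."""
    app = {
        "name": "app",
        "image": app_image,
        "essential": True,
        # The X-Ray SDKs read this variable to locate the daemon.
        "environment": [
            {"name": "AWS_XRAY_DAEMON_ADDRESS", "value": "127.0.0.1:2000"}
        ],
    }
    daemon = {
        "name": "xray-daemon",
        "image": "amazon/aws-xray-daemon",  # assumed public daemon image
        "essential": False,
        # The daemon listens for segments on UDP port 2000.
        "portMappings": [{"containerPort": 2000, "protocol": "udp"}],
    }
    return [app, daemon]


definitions = container_definitions("example.com/orders:latest")  # hypothetical image
print(json.dumps(definitions, indent=2))
```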
Beanstalk Built-in X-Ray Daemon
High frequency: The X-Ray daemon is pre-installed on Elastic Beanstalk platforms. Enable it via the Beanstalk console (Software configuration) or an .ebextensions config file. The application must still use the X-Ray SDK for instrumentation; enabling the daemon alone does not create traces. Common in exam scenarios asking about the easiest way to add tracing to an existing Beanstalk application.
Complementary Observability (NOT Replacements)
High frequency: X-Ray traces application-level request flows (what your code does). CloudTrail records AWS API calls (who did what to AWS resources). They serve completely different purposes and complement each other. A common exam scenario asks which service to use for application debugging vs. security auditing: X-Ray for app tracing, CloudTrail for API audit.
Event-Driven Tracing Correlation
Medium frequency: When using EventBridge in event-driven architectures, X-Ray can trace the producer service and consumer Lambda functions independently. Trace context must be manually propagated in event payloads if you need end-to-end correlation across EventBridge boundaries. ADOT (OpenTelemetry) provides better support for this pattern.
Compliance and Configuration Audit (Separate Concerns)
Medium frequency: AWS Config tracks resource configuration changes over time. X-Ray traces request flows through applications. They are NOT interchangeable. Exam questions sometimes present Config as an option for application tracing, which it is not. Config is for configuration compliance; X-Ray is for performance/error tracing.
Modern Instrumentation with ADOT
Medium frequency: ADOT is the AWS-supported, vendor-neutral alternative to the X-Ray SDK for instrumentation. It can send traces to X-Ray, Prometheus, and other backends. AWS recommends ADOT for new workloads. An ADOT Lambda Layer is available for Lambda functions. ADOT supports cross-account and cross-region tracing more naturally than the native X-Ray SDK.
X-Ray requires SDK instrumentation in your application code — enabling X-Ray at the service level (API Gateway, Lambda) alone is NOT sufficient to trace your custom application logic. You must import and configure the X-Ray SDK in your code to create custom segments and subsegments.
Know the difference between Annotations and Metadata: Annotations are indexed key-value pairs (max 50 per trace, values up to 1,000 chars) that can be used in filter expressions to search traces. Metadata is non-indexed and cannot be used for searching. If a question asks how to search/filter traces by a custom business attribute, the answer is Annotations.
For ECS, the X-Ray daemon must run as a SIDECAR container in the same task definition. The application container communicates with the daemon over UDP port 2000. The task IAM role (not the instance profile) must have xray:PutTraceSegments and xray:PutTelemetryRecords permissions.
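The two permissions named above can be written as an IAM policy document for the task role. This is a minimal sketch that covers only those two actions; `Resource` is set to `"*"` on the assumption that these write actions are not scoped to individual resources.

```python
import json

# The two write actions the daemon needs, as listed in the text above.
XRAY_WRITE_ACTIONS = ["xray:PutTraceSegments", "xray:PutTelemetryRecords"]


def xray_write_policy() -> dict:
    """Return an IAM policy document granting the daemon's write actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": XRAY_WRITE_ACTIONS,
                "Resource": "*",  # assumption: no resource-level scoping
            }
        ],
    }


policy = xray_write_policy()
print(json.dumps(policy, indent=2))
```

Attach a policy like this to the ECS task role rather than to the container instance profile, so that only the task running the daemon can write trace data.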
Lambda with X-Ray: When Active Tracing is enabled, Lambda automatically runs the X-Ray daemon — you do NOT need to configure or run it yourself. However, you still need the X-Ray SDK in your function code to create custom subsegments and add annotations/metadata to traces.
Sampling is ON by default — X-Ray does NOT trace 100% of requests. The default rule is 1 request/second (reservoir) + 5% of additional requests. For exam questions about cost optimization or reducing overhead, sampling rules are the answer. For compliance requiring 100% tracing, you must configure a custom sampling rule with a fixed rate of 100%.
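The reservoir-plus-rate arithmetic can be worked through with a small helper. This is an illustrative model of the expected number of sampled requests per second, not the X-Ray service's actual sampling code; the function name and parameters are hypothetical.

```python
def expected_traces_per_second(requests_per_second: float,
                               reservoir: int = 1,
                               fixed_rate: float = 0.05) -> float:
    """Expected sampled requests/second under a reservoir + fixed-rate rule."""
    # The reservoir guarantees up to `reservoir` traced requests each second.
    guaranteed = min(requests_per_second, reservoir)
    # Requests beyond the reservoir are sampled at the fixed rate.
    beyond_reservoir = max(requests_per_second - reservoir, 0)
    return guaranteed + fixed_rate * beyond_reservoir


for load in (0.5, 1, 101, 1001):
    print(f"{load} req/s -> {expected_traces_per_second(load):.2f} traces/s expected")
```

With the defaults, a service receiving 101 requests per second records about 6 traces per second: 1 from the reservoir plus 5% of the remaining 100. Setting `fixed_rate=1.0` makes the result equal to the full request rate.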
X-Ray requires SDK instrumentation in application code — enabling the daemon or toggling X-Ray at the service level (API Gateway, Lambda console) is NOT enough to trace custom application logic. You must use the X-Ray SDK or ADOT in your code.
CloudTrail ≠ X-Ray: CloudTrail audits AWS API calls (security/compliance); X-Ray traces application request flows (performance/debugging). They are complementary, never interchangeable. If the question asks about debugging microservices latency or errors, the answer is X-Ray.
Annotations are indexed and searchable (use for filtering traces); Metadata is not indexed (use for rich context). Max 50 annotations per trace. This distinction appears directly in exam questions about finding traces by custom business attributes.
X-Ray Groups allow you to create filtered subsets of traces using filter expressions, and you can configure CloudWatch alarms on the error rate, fault rate, and response time for each group. This is the pattern for alerting on specific service or endpoint degradation.
The X-Amzn-Trace-Id HTTP header carries the Trace ID, Parent ID, and Sampling decision across service boundaries. If your custom HTTP service does not read and forward this header, the trace chain breaks and you get disconnected traces. The X-Ray SDK handles this automatically for supported frameworks.
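For a custom service that is not covered by an SDK integration, reading and forwarding the header might look like the sketch below. The helper names are hypothetical, and the sample header value is illustrative; only the `Root`, `Parent`, and `Sampled` fields described above are handled.

```python
def parse_trace_header(value: str) -> dict:
    """Split an X-Amzn-Trace-Id value into its key=value fields."""
    fields = {}
    for part in value.split(";"):
        key, sep, val = part.strip().partition("=")
        if sep:  # ignore fragments that are not key=value pairs
            fields[key] = val
    return fields


def outgoing_trace_header(incoming: str, current_segment_id: str) -> str:
    """Build the header to send downstream: same Root, this segment as Parent."""
    fields = parse_trace_header(incoming)
    parts = [f"Root={fields['Root']}", f"Parent={current_segment_id}"]
    if "Sampled" in fields:
        # Forward the sampling decision so downstream services honor it.
        parts.append(f"Sampled={fields['Sampled']}")
    return ";".join(parts)


incoming = "Root=1-5759e988-bd862e3fe1be46a994272793;Parent=53995c3f42cd8ad8;Sampled=1"
print(parse_trace_header(incoming))
print(outgoing_trace_header(incoming, "70de5b6f19ff9a0a"))
```

Keeping `Root` unchanged is what links the downstream segments to the same trace; dropping or regenerating it produces the disconnected traces described above.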
Faults vs Errors vs Throttles classification: 5xx responses = Faults (server-side problems); 4xx responses = Errors (client-side problems); 429 Too Many Requests = Throttle (a specific subset of Errors). The X-Ray service map color-codes nodes by these categories. Exam questions may ask which category a specific HTTP status falls into.
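The classification rule can be expressed as a small lookup. This is a sketch of the mapping stated above, with a hypothetical function name; note that a 429 sets both the error and throttle flags because throttles are a subset of errors.

```python
def classify_status(status: int) -> dict:
    """Map an HTTP status code to X-Ray's fault / error / throttle flags."""
    return {
        "fault": 500 <= status <= 599,     # server-side problem
        "error": 400 <= status <= 499,     # client-side problem
        "throttle": status == 429,         # Too Many Requests, also an error
    }


for code in (200, 404, 429, 503):
    flags = [name for name, is_set in classify_status(code).items() if is_set]
    print(code, flags or ["ok"])
```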
For Elastic Beanstalk, you can enable X-Ray via the console (Software configuration → X-Ray daemon) or via .ebextensions. The daemon is pre-installed — you just need to enable it AND instrument your application code with the SDK. This is the fastest path to add tracing to an existing Beanstalk app.
CloudWatch ServiceLens is the unified observability feature that combines X-Ray service maps, CloudWatch metrics, and CloudWatch Logs. When an exam question asks about a single dashboard or console view combining traces, metrics, and logs for a microservices application, the answer is CloudWatch ServiceLens (powered by X-Ray).
X-Ray trace data is retained for exactly 30 days — this is fixed and cannot be changed. If you need longer retention, use the GetTraceSummaries and BatchGetTraces APIs to export data to S3, then analyze with Athena. This is a common architecture question for compliance-driven organizations.
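The export flow can be sketched as two loops: page through GetTraceSummaries to collect trace IDs, then fetch full traces in small batches with BatchGetTraces. The `client` argument is assumed to be a boto3 X-Ray client (or anything with the same two methods); the default batch size of 5 reflects an assumed per-call limit on trace IDs; writing the results to S3 is left out.

```python
def chunks(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def export_traces(client, start, end, batch_size=5):
    """Collect full trace documents for the window [start, end]."""
    trace_ids = []
    kwargs = {"StartTime": start, "EndTime": end}
    while True:
        page = client.get_trace_summaries(**kwargs)
        trace_ids.extend(s["Id"] for s in page.get("TraceSummaries", []))
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token  # request the next page of summaries

    traces = []
    for batch in chunks(trace_ids, batch_size):
        # Simplification: pagination and unprocessed IDs in the
        # BatchGetTraces response are ignored in this sketch.
        response = client.batch_get_traces(TraceIds=batch)
        traces.extend(response.get("Traces", []))
    return traces


# Assumption: boto3 is installed and credentials are configured.
#   import boto3
#   from datetime import datetime, timedelta, timezone
#   end = datetime.now(timezone.utc)
#   traces = export_traces(boto3.client("xray"), end - timedelta(hours=1), end)
```

Running an export like this on a schedule, with windows that cover the full retention period, is one way to keep a long-term copy before traces expire.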
For AI/ML workloads (AIF-C01 relevance): X-Ray is NOT used to monitor model performance, data drift, or inference quality for Bedrock or SageMaker models. Use SageMaker Model Monitor for model quality. X-Ray can trace the application code that calls Bedrock/SageMaker APIs as a downstream service, but it cannot inspect what the model does internally.
Common Mistake
CloudTrail provides application-level request tracing and can replace X-Ray for debugging microservices latency issues.
Correct
CloudTrail records AWS API calls (control-plane actions) for security auditing — who called which AWS API, when, from where. It has NO visibility into application-level request flows, latency between your microservices, or custom business logic. X-Ray is the correct service for distributed application tracing.
This is the #1 misconception on exams. Remember: CloudTrail = WHO did WHAT to AWS (security audit log). X-Ray = HOW a request traveled through YOUR application (performance and debugging). They answer completely different questions and are complementary, not interchangeable.
Common Mistake
Enabling X-Ray tracing on API Gateway automatically traces the entire application flow including all downstream Lambda functions, databases, and external APIs.
Correct
Enabling tracing on API Gateway only creates an X-Ray segment for the API Gateway portion of the request. Each downstream service (Lambda, EC2, ECS) must independently have X-Ray enabled and SDK-instrumented to contribute to the same trace. The trace ID is propagated via the X-Amzn-Trace-Id header, but downstream services must be configured to use it.
Exam questions often describe a multi-tier app and ask why traces are incomplete. The answer is always that downstream services need their own X-Ray instrumentation. API Gateway tracing is not a magic 'trace everything' switch — it's just the entry point.
Common Mistake
The X-Ray daemon or Systems Manager Agent automatically instruments application code without any SDK changes.
Correct
The X-Ray daemon is a network proxy that collects UDP data from the X-Ray SDK and forwards it to the X-Ray service — it does NOT instrument your code. The SSM Agent has absolutely nothing to do with X-Ray. You MUST modify your application code to import and use the X-Ray SDK (or ADOT) to generate trace segments.
Candidates confuse 'installing the daemon' with 'enabling tracing.' The daemon is infrastructure; the SDK is what creates trace data. Without SDK instrumentation, the daemon has nothing to forward. SSM Agent appearing as a distractor answer is a known exam trap.
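To make the division of labor concrete, the sketch below shows roughly what an SDK hands to the daemon: a small JSON header line followed by a segment document, sent as one UDP datagram to port 2000. The segment values are made up for illustration, and `send_to_daemon` is a hypothetical helper that is defined but not called here.

```python
import json
import socket

# Header line the daemon expects ahead of each segment document.
DAEMON_HEADER = b'{"format": "json", "version": 1}\n'


def build_datagram(segment: dict) -> bytes:
    """Serialize a segment document into the daemon's UDP payload."""
    return DAEMON_HEADER + json.dumps(segment).encode("utf-8")


def send_to_daemon(segment: dict, host: str = "127.0.0.1", port: int = 2000) -> None:
    """Send one segment to a locally running daemon (fire-and-forget UDP)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(build_datagram(segment), (host, port))


# Illustrative segment; all values are invented.
example_segment = {
    "name": "orders-service",
    "id": "70de5b6f19ff9a0a",
    "trace_id": "1-5759e988-bd862e3fe1be46a994272793",
    "start_time": 1465510280.0,
    "end_time": 1465510280.25,
}

print(build_datagram(example_segment).decode("utf-8"))
```

The daemon only batches and forwards payloads like this one. The code that decides when a segment starts, what it is named, and what it contains lives in the application via the SDK.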
Common Mistake
X-Ray Annotations and Metadata are both searchable and can both be used in filter expressions to find specific traces.
Correct
ONLY Annotations are indexed and searchable via filter expressions (e.g., annotation.user_id = '12345'). Metadata is stored with the trace but is NOT indexed and CANNOT be used in filter expressions or GetTraceSummaries queries. Use Annotations when you need to search/filter; use Metadata for rich context that doesn't need to be queried.
This distinction appears directly in DVA-C02 and DOP-C02 questions. The limit of 50 annotations per trace forces you to be selective. A common scenario: 'How do you find all traces for a specific customer ID?' — Add customer ID as an Annotation, not Metadata.
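The customer-ID scenario can be sketched as a segment document that puts the searchable value under `annotations` and the bulky context under `metadata`, together with a matching filter expression. The helper names, the `checkout` metadata namespace, and the sample values are hypothetical.

```python
import json
import os
import time


def new_trace_id(now: float) -> str:
    """Trace ID: version 1, 8 hex digits of epoch seconds, 24 random hex digits."""
    return f"1-{int(now):08x}-{os.urandom(12).hex()}"


def build_segment(name: str, customer_id: str, cart: list) -> dict:
    """Build a segment with an indexed annotation and non-indexed metadata."""
    start = time.time()
    return {
        "name": name,
        "id": os.urandom(8).hex(),  # 16 hex digits
        "trace_id": new_trace_id(start),
        "start_time": start,
        "end_time": start + 0.05,   # illustrative 50 ms duration
        # Indexed: usable in filter expressions.
        "annotations": {"customer_id": customer_id},
        # Not indexed: any JSON-serializable context, grouped by namespace.
        "metadata": {"checkout": {"cart": cart}},
    }


def customer_filter(customer_id: str) -> str:
    """Filter expression that finds traces for one customer."""
    return f'annotation.customer_id = "{customer_id}"'


segment = build_segment("checkout-service", "12345", [{"sku": "A1", "qty": 2}])
print(json.dumps(segment, indent=2))
print(customer_filter("12345"))
```

Searching on `metadata.checkout.cart` is not possible; only the `customer_id` annotation can appear in the filter expression.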
Common Mistake
X-Ray automatically traces 100% of all requests, providing complete visibility into every transaction.
Correct
X-Ray uses sampling by default. The default sampling rule records 1 request per second (the reservoir) plus 5% of additional requests beyond the reservoir. This is intentional to reduce overhead and cost. You can customize sampling rules (up to 25 per account) or set a 100% fixed rate if complete tracing is required (e.g., for compliance), but this increases cost and overhead.
Exam questions test whether you understand the cost/completeness tradeoff. If a question asks why some requests don't appear in X-Ray, sampling is almost always the answer. If a question asks how to ensure every request is traced, the answer is a custom sampling rule with fixed_rate=1.0 (100%).
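A trace-everything rule could be sketched as the `SamplingRule` structure passed to the X-Ray CreateSamplingRule API. The rule name, priority, and wildcard matchers below are hypothetical choices, and the boto3 call is left as a comment because it assumes boto3 and credentials are available.

```python
import json


def trace_everything_rule(rule_name: str, priority: int = 1) -> dict:
    """Build a sampling rule that samples 100% of matching requests."""
    return {
        "RuleName": rule_name,
        "Priority": priority,   # lower numbers are evaluated first
        "FixedRate": 1.0,       # 100% of requests beyond the reservoir
        "ReservoirSize": 0,     # no reservoir needed when everything is sampled
        # Wildcards match every service, host, method, and path.
        "ServiceName": "*",
        "ServiceType": "*",
        "Host": "*",
        "HTTPMethod": "*",
        "URLPath": "*",
        "ResourceARN": "*",
        "Version": 1,
    }


rule = trace_everything_rule("compliance-trace-all")  # hypothetical rule name
print(json.dumps(rule, indent=2))

# Assumption: boto3 is installed and credentials are configured.
#   import boto3
#   boto3.client("xray").create_sampling_rule(SamplingRule=rule)
```

Narrowing `ServiceName` or `URLPath` to the endpoints that actually need full tracing keeps the cost increase contained.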
Common Mistake
SageMaker Model Monitor or Amazon CloudWatch can be used to trace and debug Bedrock model invocations the same way X-Ray traces application requests.
Correct
SageMaker Model Monitor monitors data quality, model quality, bias drift, and feature attribution drift for SageMaker-hosted models — not Bedrock. CloudWatch monitors infrastructure metrics. X-Ray can trace the APPLICATION CODE that invokes Bedrock APIs (as a downstream HTTP call), but cannot inspect what happens inside the model. For Bedrock-specific observability, use CloudWatch metrics for Bedrock and application-level X-Ray tracing for the calling service.
This is a specific trap for AIF-C01 and DVA-C02 candidates working with generative AI. The key insight: X-Ray sees Bedrock as an external HTTP endpoint — it can measure latency and success/failure of the API call, but has no visibility into model internals, token usage details, or inference quality.
SAFE = Segments (work done by a service), Annotations (indexed, filterable), Faults (5xx), Errors (4xx) — the four core X-Ray concepts
DAM = Daemon collects, Annotations are indexed, Metadata is not — remember which one you can search
CATS = CloudTrail for API audit, X-Ray (Xray) for Application Tracing in Services — never confuse their purposes
The X-Ray daemon is like a MAILBOX: your SDK drops letters (segments) in it via UDP, and the daemon posts them to AWS. The mailbox doesn't write the letters — your code (SDK) does.
Reservoir + Rate = X-Ray sampling: think of the reservoir as a 'guaranteed slots' bucket (1/sec default) and the rate as a 'lottery' for the rest (5% default)
CertAI Tutor · SAA-C03, DOP-C02, SAP-C02, DVA-C02, DEA-C01, AIF-C01, CLF-C02 · 2026-02-22