
Stop guessing which monitoring tool to use — the definitive decision guide for every AWS certification
Performance metrics vs API audit logs vs distributed traces vs compliance snapshots
| Feature | CloudWatch (metrics, alarms, logs, dashboards) | CloudTrail (who did what, when, where) | X-Ray (distributed tracing across services) | Config (resource compliance and configuration history) |
|---|---|---|---|---|
Core Question Answered This is the #1 differentiator. Memorize the core question each service answers. | Is my system healthy RIGHT NOW? (performance, utilization, errors) | WHO made API calls and WHAT did they do? (audit trail) | WHY is my request slow? WHERE is it failing? (request tracing) | Is my infrastructure COMPLIANT? What did it look like BEFORE? (configuration drift) |
Primary Data Type CloudTrail records = API events. CloudWatch records = performance data. Config records = resource state. | Metrics (time-series numbers), Logs (text events), Alarms, Dashboards | API call records (who, what, when, from where, result) | Traces, Segments, Subsegments, Service Maps, Annotations | Configuration snapshots, Configuration history, Compliance evaluations, Rules |
Scope / Layer | Infrastructure + Application layer (metrics & logs from AWS services and custom apps) | AWS Control Plane layer (management events) + Data Plane (data events, optional) | Application request layer (inter-service latency, errors, throttling per request) | Resource configuration layer (what settings/attributes a resource has or had) |
Granularity / Resolution Only CloudWatch supports sub-minute (1-second) resolution for custom metrics. CloudTrail is NOT real-time by default — ~15 min lag to S3. | Standard: 1-minute minimum. High-resolution custom metrics: 1-second. Detailed monitoring: 1-minute for EC2. | Per API call (event-level). Delivery to S3 within ~15 minutes. CloudTrail Lake: near-real-time. | Per-request trace (millisecond-level latency breakdown per segment/subsegment) | Per configuration change (triggered) or periodic (1hr, 3hr, 6hr, 12hr, 24hr) |
Default Retention CloudTrail 90-day event history is FREE and requires no trail setup. X-Ray traces expire after 30 days — cannot be extended. | Metrics: 15 months (high-res aggregated over time). Logs: configurable (1 day to 10 years, or never expire). Default log group: never expire. | Event history (console): 90 days free, no S3 needed. Trail logs in S3: governed by S3 lifecycle policies. | Traces: 30 days | Configuration history and snapshots: governed by S3 lifecycle. Config rules evaluation history retained. |
Real-Time Capability If the question asks about REAL-TIME alerting, CloudWatch Alarms is the answer. CloudTrail delivery to S3 is never the real-time answer. | Near real-time metrics (within seconds to 1 minute). CloudWatch Logs Insights: query in near real-time. Alarms: standard alarms evaluate on periods of 1 minute or longer; alarms on high-resolution metrics can use 10-second or 30-second periods. | NOT real-time to S3 (~15 min). CloudTrail Lake supports near-real-time querying. CloudWatch Events/EventBridge integration is near real-time. | Near real-time trace data (seconds delay). Service map updates in near real-time. | NOT real-time. Configuration changes trigger evaluations but delivery has latency. Periodic rules run on schedule. |
Alerting / Notifications CloudWatch Alarms is the PRIMARY alerting mechanism. Config alerts on compliance violations. X-Ray has no direct alarms — common trap. | YES — CloudWatch Alarms → SNS, Auto Scaling, EC2 actions, Systems Manager OpsCenter, Lambda. Composite Alarms supported. | Indirectly via CloudWatch Logs metric filters → CloudWatch Alarm → SNS. Or EventBridge rules on CloudTrail events. | No native alerting. Integrate with CloudWatch ServiceLens for anomaly detection. X-Ray Insights for anomaly alerts. | YES — Config Rules → SNS notifications, EventBridge, Systems Manager Automation (auto-remediation) |
Auto-Remediation AWS Config is the PRIMARY service for automated compliance remediation. CloudWatch handles performance-based auto-remediation. | Via Alarms → Lambda, Auto Scaling, EC2 actions (stop/terminate/reboot/recover), Systems Manager | Indirectly via EventBridge → Lambda (e.g., detect unauthorized API call → revoke IAM policy) | None — diagnostic only | YES — Config Conformance Packs + SSM Automation Remediation Actions (e.g., auto-enable S3 versioning) |
Cost Model CloudTrail first management event trail is FREE — huge cost advantage. CloudWatch Logs can get expensive at scale without retention policies. | Free tier: 10 custom metrics, 10 alarms, 5GB log ingestion/month. Paid: per metric, per alarm, per log GB ingested/stored/analyzed, per dashboard. | Management events: first copy FREE per region. Additional trails: $2.00/100,000 events. Data events: $0.10/100,000. CloudTrail Lake: additional charges. | Free tier: 100,000 traces recorded/month, 1,000,000 traces retrieved/month. Paid: $5.00/1M traces recorded, $0.50/1M traces retrieved/scanned. | Per configuration item recorded: $0.003/item. Per active Config rule evaluation: $0.001/evaluation. Conformance packs: $0.0012/evaluation. |
Log Storage Backend | CloudWatch Logs (native). Export to S3 for long-term. Stream to Kinesis Data Firehose → S3/OpenSearch. | S3 (primary). Optional: CloudWatch Logs group. CloudTrail Lake (managed event data store, Athena-compatible). | X-Ray service (managed). Export to S3 via X-Ray API for long-term analysis. | S3 (configuration snapshots and history). DynamoDB (optional). Config console for recent history. |
Query / Analysis CloudTrail Lake = managed, SQL-queryable event data store. Athena on S3 = DIY but flexible. Config Advanced Query = current state only, not history. | CloudWatch Logs Insights (SQL-like query language). Metric Math. Contributor Insights. CloudWatch Anomaly Detection (ML-based). | CloudTrail Lake (SQL queries via Athena-compatible engine). Athena on S3. CloudWatch Logs Insights (if streamed to CWL). | X-Ray console (trace map, timeline view, filter expressions). X-Ray Analytics (compare trace groups). ServiceLens integration. | AWS Config Advanced Query (SQL-based, queries current state across accounts/regions). Config Rules compliance dashboard. |
Multi-Account / Multi-Region Organization Trail in CloudTrail is the EASIEST way to audit all accounts centrally. Config Aggregator is the equivalent for compliance. | Cross-account observability (CloudWatch cross-account, requires sharing). Cross-region dashboards. Unified dashboards via CloudWatch Observability Access Manager. | Organization Trail (single trail covers all accounts in AWS Organization). Multi-region trail supported. CloudTrail Lake supports multi-account. | Groups and sampling rules per account. No native cross-account trace aggregation without custom solution. | AWS Config Aggregator (collects data from multiple accounts/regions). Organization-level aggregator via AWS Organizations. |
Sampling X-Ray sampling is CRITICAL to understand. By default, NOT all traces are captured. For 100% tracing, set the sampling rate to 1.0, but this increases cost. | No sampling: all metrics and logs are recorded (subject to limits/costs) | No sampling: all API calls recorded (management events). Data events can be filtered. | YES. A default sampling rule applies (the first request each second, plus 5% of additional requests). Custom sampling rules supported. | No sampling: all configuration changes recorded for tracked resource types |
Security / Compliance Use Case Security audit = CloudTrail. Compliance enforcement = Config. Performance security = CloudWatch. Application-level tracing = X-Ray. | Monitor security metrics (failed login attempts, error rates). Metric filters on CloudTrail logs for security alerts. | PRIMARY security audit service. Detect unauthorized access, privilege escalation, resource deletion. Required for compliance (PCI-DSS, HIPAA, SOC). | Identify security bottlenecks, unauthorized service calls within application. Not a compliance service. | PRIMARY compliance service. Enforce resource configurations (e.g., no public S3 buckets, MFA required, encryption enabled). CIS benchmarks via Conformance Packs. |
Key Integrations GuardDuty ingests CloudTrail logs AND VPC Flow Logs — it does NOT replace CloudTrail. CloudWatch ServiceLens = CloudWatch + X-Ray unified view. | EC2, Lambda, RDS, ECS, EKS, API Gateway, ALB, S3, Bedrock, SageMaker, GuardDuty (findings → metrics), Systems Manager, SNS, Auto Scaling, EventBridge | S3, CloudWatch Logs, EventBridge, Athena, CloudTrail Lake, SNS, IAM, AWS Organizations, Security Hub, GuardDuty (uses CloudTrail events) | Lambda, API Gateway, EC2 (agent), ECS, Elastic Beanstalk, ALB, SNS, SQS, DynamoDB, S3 (SDK instrumentation), CloudWatch ServiceLens | S3, SNS, Systems Manager (Automation), Security Hub, CloudTrail (for change tracking), AWS Organizations, EventBridge, Lambda (custom rules) |
Agent Required? CloudWatch memory/disk metrics are NOT available by default — you MUST install the CloudWatch Agent. This is a frequent exam trap. | Optional: CloudWatch Agent for OS-level metrics (memory, disk) and custom log collection from EC2. Required for on-premises. | NO agent needed — CloudTrail is a control-plane service that captures API calls automatically. | YES — X-Ray Daemon (EC2, on-premises) OR X-Ray SDK instrumentation in application code. Lambda has built-in support (enable active tracing). | NO agent needed — Config uses AWS APIs to discover and record resource configurations. |
Enabled by Default? CloudTrail event history is ON by default but trails (for S3 delivery, long-term retention) are NOT. Config must be explicitly enabled, and leaving it off is a compliance gap. | Basic monitoring is enabled by default (5-minute metrics for EC2; many other services publish 1-minute metrics at no charge). EC2 detailed monitoring (1-min) must be explicitly enabled. | CloudTrail Event History (90-day, management events) is ON by default in every account. A persistent Trail must be explicitly created. | NOT enabled by default. Must enable active tracing per service (Lambda, API Gateway, etc.) and instrument application code. | NOT enabled by default. Must be explicitly enabled per region. AWS Organizations can enable it organization-wide. |
Supports On-Premises | YES — CloudWatch Agent on on-premises servers. Requires IAM credentials or IAM Role via AWS Systems Manager. | Partially — CloudTrail records API calls made TO AWS from on-premises. Cannot record on-premises OS/app activity. | YES — X-Ray Daemon can run on-premises. SDK instruments on-premises applications. | Partially — AWS Config records AWS resource configs. On-premises resources can be recorded as custom Config items. |
Anomaly Detection / ML CloudTrail Insights detects unusual write API activity (e.g., sudden spike in TerminateInstances calls). GuardDuty uses ML on CloudTrail + VPC Flow Logs + DNS logs. | YES — CloudWatch Anomaly Detection (ML model on metrics). CloudWatch Contributor Insights. CloudWatch Alarms with anomaly detection bands. | Indirectly via GuardDuty (ML on CloudTrail events for threat detection). CloudTrail Insights (unusual API activity detection). | YES — X-Ray Insights (ML-based anomaly detection on trace data, identifies fault/error/throttle anomalies). | NO native ML. Rule-based compliance evaluation only. |
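
The Sampling row above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming the default rule quoted in the table (first request each second, then 5% of the rest) and treating the application as a single stream of requests. The function name and parameters are illustrative and not part of any AWS SDK.

```python
def expected_traces_per_second(requests_per_second: float,
                               reservoir: int = 1,
                               fixed_rate: float = 0.05) -> float:
    """Expected number of traces recorded per second.

    reservoir:  requests per second that are always traced (assumed default: 1)
    fixed_rate: fraction of the remaining requests that are traced (assumed default: 5%)
    """
    if requests_per_second < 0:
        raise ValueError("requests_per_second must be non-negative")
    reserved = min(requests_per_second, reservoir)
    remainder = max(requests_per_second - reservoir, 0)
    return reserved + remainder * fixed_rate


if __name__ == "__main__":
    for rps in (1, 100, 1000):
        traced = expected_traces_per_second(rps)
        print(f"{rps} req/s -> about {traced:.2f} traces/s")
```

Setting `fixed_rate=1.0` models the 100% tracing case mentioned in the table, where every request is recorded.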
Summary
CloudWatch answers 'Is it healthy?' (metrics/logs/alarms), CloudTrail answers 'Who did what?' (API audit), X-Ray answers 'Why is it slow?' (distributed tracing), and Config answers 'Is it compliant?' (configuration state). These four services are complementary, not competing — production environments need all four. The exam will test whether you can identify which service to use for a specific problem statement.
🎯 Decision Tree
IF question involves performance metrics, CPU, memory, latency thresholds, or alarms → CloudWatch.
IF question involves 'who deleted/modified/called', unauthorized access, API audit, or compliance logging → CloudTrail.
IF question involves request tracing, inter-service latency, debugging slow microservices, or service dependency maps → X-Ray.
IF question involves resource compliance, configuration drift, 'what did this resource look like before', or enforcing security standards → Config.
IF question involves security threat detection with ML → GuardDuty (which USES CloudTrail + VPC Flow Logs).
IF question involves real-time event-driven responses to API calls → EventBridge + CloudTrail.
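The decision tree above can be sketched as a small keyword router. This is only a study aid, not a real classifier: the keyword lists are a paraphrase of the tree, the first matching rule wins, and `RULES` and `pick_service` are hypothetical names.

```python
from typing import Optional

# Ordered rules: more specific phrases come first so they win over generic ones.
RULES = [
    ("EventBridge + CloudTrail", ("event-driven",)),
    ("GuardDuty", ("threat detection",)),
    ("X-Ray", ("tracing", "trace", "inter-service latency",
               "slow microservice", "dependency map")),
    ("Config", ("configuration drift", "resource compliance",
                "look like before", "security standards")),
    ("CloudTrail", ("who deleted", "who modified", "who called",
                    "unauthorized access", "api audit", "compliance logging")),
    ("CloudWatch", ("cpu", "memory", "latency threshold",
                    "alarm", "performance metric")),
]


def pick_service(question: str) -> Optional[str]:
    """Return the first service whose keywords appear in the question, else None."""
    text = question.lower()
    for service, keywords in RULES:
        if any(keyword in text for keyword in keywords):
            return service
    return None


if __name__ == "__main__":
    print(pick_service("Who deleted the security group?"))
    print(pick_service("Detect configuration drift on security groups"))
```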
The 4-service mental model: CloudWatch = PERFORMANCE (is it working?), CloudTrail = AUDIT (who did it?), X-Ray = DEBUGGING (why is it slow?), Config = COMPLIANCE (is it configured correctly?). If you can instantly map the exam question's problem statement to one of these four questions, you will never confuse these services again.
CloudWatch does NOT automatically collect memory utilization or disk space from EC2 instances — these require the CloudWatch Agent. The default EC2 metrics (CPU utilization, network in/out, disk read/write operations) are hypervisor-level only. This distinction appears on SAA-C03, DVA-C02, and SysOps exams constantly.
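As an illustration of what the CloudWatch Agent adds, here is a minimal sketch that builds the `metrics` section of an agent configuration file collecting memory and disk usage. The helper name `build_agent_config` and the mount point list are assumptions for the example. A real deployment would also need the agent installed and an IAM role that allows it to publish metrics.

```python
import json
from typing import Dict, List


def build_agent_config(mount_points: List[str]) -> Dict:
    """Build a minimal CloudWatch Agent config for memory and disk metrics."""
    return {
        "metrics": {
            "metrics_collected": {
                # Memory is not visible to the hypervisor, so the agent reports it.
                "mem": {"measurement": ["mem_used_percent"]},
                # Disk space used, per mount point listed in "resources".
                "disk": {
                    "measurement": ["used_percent"],
                    "resources": list(mount_points),
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_agent_config(["/"]), indent=2))
```

The printed JSON is the kind of document the agent reads at startup; metrics it publishes land in the `CWAgent` namespace unless configured otherwise.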
CloudTrail records API calls — NOT application-level logs. If someone asks 'who terminated my EC2 instance?', use CloudTrail. If they ask 'why is my EC2 instance showing high CPU?', use CloudWatch. If they ask 'why is my Lambda function timing out on the 3rd downstream call?', use X-Ray. If they ask 'was my S3 bucket public last Tuesday?', use Config.
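To make the "who terminated my EC2 instance?" case concrete, the sketch below filters CloudTrail-style event records for `TerminateInstances` calls against one instance. It works on records already loaded as dictionaries (for example, parsed from a trail's log files); the sample record, ARN, IP address, and instance ID are made up for illustration.

```python
from typing import Dict, Iterable, List


def who_terminated(records: Iterable[Dict], instance_id: str) -> List[Dict]:
    """Return who/when/where for TerminateInstances calls on instance_id."""
    hits = []
    for record in records:
        if record.get("eventName") != "TerminateInstances":
            continue
        # requestParameters can be null in real records, so guard with "or {}".
        params = record.get("requestParameters") or {}
        items = (params.get("instancesSet") or {}).get("items", [])
        if any(item.get("instanceId") == instance_id for item in items):
            hits.append({
                "when": record.get("eventTime"),
                "who": (record.get("userIdentity") or {}).get("arn"),
                "from": record.get("sourceIPAddress"),
            })
    return hits


# Made-up sample data in the shape of a CloudTrail record.
SAMPLE_RECORDS = [
    {
        "eventTime": "2024-01-02T03:04:05Z",
        "eventName": "TerminateInstances",
        "userIdentity": {"arn": "arn:aws:iam::111122223333:user/example-admin"},
        "sourceIPAddress": "203.0.113.10",
        "requestParameters": {
            "instancesSet": {"items": [{"instanceId": "i-0123456789abcdef0"}]}
        },
    },
    {"eventTime": "2024-01-02T03:05:00Z", "eventName": "DescribeInstances",
     "requestParameters": None},
]

if __name__ == "__main__":
    print(who_terminated(SAMPLE_RECORDS, "i-0123456789abcdef0"))
```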
AWS Config is the ONLY service that tracks historical configuration state of resources. If an exam question asks 'what was the security group configuration 3 weeks ago?' or 'detect configuration drift', the answer is Config — not CloudTrail (which tracks API calls, not resource state snapshots). Config + CloudTrail together give you the complete picture: what changed AND who changed it.
X-Ray requires instrumentation — it is NOT passive like CloudWatch or CloudTrail. You must: (1) Install X-Ray Daemon (EC2/on-prem) OR enable active tracing (Lambda/API Gateway), AND (2) Instrument application code with X-Ray SDK. For Lambda, simply enabling 'Active Tracing' in the Lambda configuration is sufficient for Lambda-level tracing, but downstream calls require SDK instrumentation.
GuardDuty is NOT a replacement for CloudTrail. GuardDuty CONSUMES CloudTrail events (plus VPC Flow Logs and DNS logs) and applies ML to detect threats. It reads these sources through its own independent stream, so it works even if you have not created a trail or enabled VPC Flow Logs; you still need a trail for your own long-term audit record. Security Hub aggregates findings from GuardDuty, Config, Inspector, and others into a single compliance dashboard.
CloudTrail Data Events (S3 object-level: GetObject, PutObject, DeleteObject; Lambda: Invoke) are disabled by default and cost $0.10/100,000 events, which is $1 per million events. On a busy S3 bucket with billions of requests per month, this can cost thousands of dollars per month. Exam questions about auditing S3 object access require enabling Data Events, but be aware of the cost implication.
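The cost implication is easy to check with arithmetic. This is a minimal sketch using only the $0.10 per 100,000 events rate quoted above; the function name is illustrative, and actual bills depend on current pricing and on which events your selectors capture.

```python
DATA_EVENT_PRICE_PER_100K_USD = 0.10  # rate quoted in the text


def data_event_cost_usd(events_per_month: int) -> float:
    """Monthly CloudTrail data event cost in USD at the quoted rate."""
    if events_per_month < 0:
        raise ValueError("events_per_month must be non-negative")
    return events_per_month / 100_000 * DATA_EVENT_PRICE_PER_100K_USD


if __name__ == "__main__":
    for events in (1_000_000, 1_000_000_000, 10_000_000_000):
        print(f"{events:>14,} events -> ${data_event_cost_usd(events):,.2f}")
```

At this rate, one million events cost about a dollar, so it takes request volumes in the billions per month before the charge reaches thousands of dollars.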
For compliance automation: Config Rule detects non-compliance → EventBridge or SNS notification → Lambda or Systems Manager Automation remediates. This is the canonical Config remediation pattern. CloudWatch Alarms → SNS/Lambda is the performance remediation pattern. Know both for DOP-C02 and SCS-C02.
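The middle step of the Config remediation pattern is an EventBridge rule that fires on non-compliance. Below is a minimal sketch of such an event pattern, plus a deliberately simplified matcher so the pattern can be exercised locally. The matcher handles only exact-value lists and nested objects, which is a small subset of what EventBridge supports, and the sample event values are made up.

```python
import json
from typing import Dict

# Event pattern for Config rule compliance changes that end up NON_COMPLIANT.
NON_COMPLIANT_PATTERN = {
    "source": ["aws.config"],
    "detail-type": ["Config Rules Compliance Change"],
    "detail": {"newEvaluationResult": {"complianceType": ["NON_COMPLIANT"]}},
}


def matches(pattern: Dict, event: Dict) -> bool:
    """Simplified matcher: lists are allowed values, dicts recurse into the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Made-up event in the shape Config sends to EventBridge.
SAMPLE_EVENT = {
    "source": "aws.config",
    "detail-type": "Config Rules Compliance Change",
    "detail": {
        "configRuleName": "example-rule",
        "resourceId": "example-bucket",
        "newEvaluationResult": {"complianceType": "NON_COMPLIANT"},
    },
}

if __name__ == "__main__":
    print(json.dumps(NON_COMPLIANT_PATTERN, indent=2))
    print(matches(NON_COMPLIANT_PATTERN, SAMPLE_EVENT))
```

In a real setup the rule's target would be the Lambda function or Systems Manager Automation document that performs the fix.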
CloudWatch Composite Alarms allow you to combine multiple alarms with AND/OR/NOT logic to reduce alarm noise. This is critical for SAP-C02 and DOP-C02 questions about operational excellence. Composite alarms also support action suppression: a designated suppressor alarm can mute the composite alarm's actions, for example during a maintenance window.
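A composite alarm is defined by a rule expression over the states of other alarms. The sketch below builds such an expression as a string; the helper functions and alarm names are hypothetical, and the resulting string is what you would supply as the alarm rule when creating the composite alarm.

```python
def alarm(name: str) -> str:
    """Term that is true when the named alarm is in the ALARM state."""
    if '"' in name:
        raise ValueError("alarm name must not contain double quotes")
    return f'ALARM("{name}")'


def all_of(*terms: str) -> str:
    """Combine terms with AND."""
    return "(" + " AND ".join(terms) + ")"


def any_of(*terms: str) -> str:
    """Combine terms with OR."""
    return "(" + " OR ".join(terms) + ")"


# Page only when CPU is high AND users are affected (latency OR errors).
rule = all_of(
    alarm("example-high-cpu"),
    any_of(alarm("example-high-latency"), alarm("example-5xx-errors")),
)

if __name__ == "__main__":
    print(rule)
```

Requiring a user-facing symptom alongside the resource alarm is one way to cut noise: a CPU spike alone no longer notifies anyone.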
CloudTrail Lake is the modern, managed alternative to Athena-on-S3 for CloudTrail analysis. It stores events in an immutable, managed data store for up to 7 years under the seven-year retention pricing option (the one-year extendable option allows up to 10 years), supports SQL queries natively, and integrates with AWS Organizations. Exam questions about long-term audit log querying without managing Athena/Glue should point to CloudTrail Lake.
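A CloudTrail Lake query is plain SQL with the event data store ID used as the table name. The following sketch builds one such query; `build_lake_query` is a hypothetical helper, the ID in the usage example is a placeholder, and the input checks are a simple guard for this illustration rather than a complete defense against injection.

```python
import re
from datetime import datetime


def build_lake_query(event_data_store_id: str, event_name: str, since: datetime) -> str:
    """Build a SQL query listing who called event_name since a given time."""
    if not re.fullmatch(r"[0-9A-Za-z-]+", event_data_store_id):
        raise ValueError("unexpected characters in event data store ID")
    if not re.fullmatch(r"[A-Za-z0-9]+", event_name):
        raise ValueError("unexpected characters in event name")
    since_text = since.strftime("%Y-%m-%d %H:%M:%S")
    return (
        "SELECT eventTime, userIdentity.arn, sourceIPAddress\n"
        f"FROM {event_data_store_id}\n"
        f"WHERE eventName = '{event_name}'\n"
        f"  AND eventTime > '{since_text}'\n"
        "ORDER BY eventTime DESC"
    )


if __name__ == "__main__":
    # "EXAMPLE-1234" stands in for a real event data store ID.
    print(build_lake_query("EXAMPLE-1234", "DeleteSecurityGroup", datetime(2024, 1, 1)))
```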
The #1 exam trap: Confusing CloudTrail (API audit — WHO did WHAT) with CloudWatch (performance monitoring — IS IT HEALTHY) with Config (compliance — IS IT CONFIGURED CORRECTLY). A question saying 'an administrator deleted a security group — how do you find out who did it?' = CloudTrail. 'CPU is spiking on EC2' = CloudWatch. 'S3 bucket became public — how do you detect and prevent this?' = Config. 'Lambda is slow on the 3rd service call' = X-Ray. The second trap: assuming CloudWatch automatically captures memory/disk metrics from EC2 — it does NOT without the CloudWatch Agent.
CertAI Tutor · SAA-C03, SAP-C02, DVA-C02, DOP-C02, SCS-C02, DEA-C01, AIF-C01, CLF-C02 · 2026-02-22