
Stop guessing which service to use — master the when, why, and how of AWS log analytics in one place
Real-time operational monitoring vs. full-text search & analytics vs. serverless SQL on S3
| Feature | CloudWatch Logs (operational log monitoring, alerts, metrics) | OpenSearch (full-text search, analytics, vector store) | Athena (serverless SQL queries directly on S3) | Exam note |
|---|---|---|---|---|
| Primary Use Case | Collect, monitor, and alert on AWS service and application logs in real time | Full-text search, log analytics dashboards (Kibana/OpenSearch Dashboards), and vector search for RAG | Ad-hoc SQL queries against structured/semi-structured data stored in S3 (CSV, JSON, Parquet, ORC) | CloudWatch Logs = operational monitoring; OpenSearch = search + dashboards + vector embeddings; Athena = SQL analytics on S3 data lakes. |
| Data Storage Model | Managed log groups and log streams (proprietary AWS storage, not directly S3) | Distributed index-based storage inside managed OpenSearch cluster nodes (EBS or UltraWarm/cold tiers backed by S3) | No storage; queries data in place from S3, with the metadata catalog in AWS Glue Data Catalog | Athena does NOT store data; it queries S3 directly. OpenSearch stores indexed copies of your data. CloudWatch Logs stores logs in its own managed store. |
| Query Language | CloudWatch Logs Insights query language (proprietary, purpose-built for log analysis) | OpenSearch Query DSL (JSON-based), SQL plugin, PPL (Piped Processing Language), and k-NN for vector search | Standard ANSI SQL (Presto/Trino engine); also supports Apache Spark for notebook-style analytics | Athena uses standard SQL, which gives the lowest learning curve for SQL-fluent teams. OpenSearch DSL is powerful but proprietary. CloudWatch Logs Insights is fast but limited to log-specific patterns. |
| Serverless / Managed | Fully serverless: no infrastructure to manage, scales automatically | Managed service (you provision instance types and count, or use OpenSearch Serverless for on-demand capacity) | Fully serverless: no clusters, no infrastructure; scales queries in parallel automatically | Both CloudWatch Logs and Athena are fully serverless. OpenSearch requires cluster provisioning unless you use OpenSearch Serverless (newer option). |
| Pricing Model | Pay per GB ingested, per GB stored (after free tier), per GB scanned by Logs Insights queries, and per metric/alarm | Pay per instance-hour (by instance type), EBS storage, UltraWarm/cold storage, and data transfer; OpenSearch Serverless charges per OCU (OpenSearch Compute Unit) hour | Pay per TB of data scanned per query ($5/TB scanned for SQL queries); no charge for DDL queries; Apache Spark charged per DPU-hour | Athena's cost is directly tied to data scanned, so use columnar formats (Parquet, ORC) and partitioning to dramatically reduce costs. OpenSearch has ongoing cluster costs even when idle (unless Serverless). |
| Latency / Query Speed | Near real-time ingestion (<1s); Logs Insights queries run in seconds for recent data | Sub-second search latency on indexed data; near real-time indexing (seconds of delay after ingestion) | Seconds to minutes depending on data volume and query complexity; not designed for sub-second interactive use cases | For sub-second search (e.g., user-facing search bars, real-time dashboards), OpenSearch wins. For operational alerting, CloudWatch Logs. For batch/ad-hoc analytics, Athena. |
| Real-Time Alerting | Native: Metric Filters convert log patterns to CloudWatch Metrics, then CloudWatch Alarms trigger SNS/Lambda/etc. | Alerting plugin supports monitors and triggers; integrates with SNS, Slack, webhooks | Not designed for real-time alerting; query-on-demand only, though queries can be scheduled via EventBridge + Lambda | Of the three, only CloudWatch Logs feeds CloudWatch Alarms natively (through Metric Filters) for automated operational responses. This is a frequent exam scenario. |
| Vector Search / RAG Support | Not supported; no vector/embedding capabilities | Full k-NN vector search support; native integration with Amazon Bedrock as a vector store for RAG applications | Not supported; relational SQL engine only, no vector indexing | CRITICAL for AIF-C01 and SAP-C02: OpenSearch is the AWS-native vector database for RAG with Bedrock. Neither CloudWatch Logs nor Athena supports vector embeddings. |
| Dashboards & Visualization | CloudWatch Dashboards (basic widgets); Logs Insights results can be added to dashboards | OpenSearch Dashboards (formerly Kibana): rich, purpose-built log/metric visualization with pre-built templates | No native dashboards; integrates with Amazon QuickSight for BI visualization | For rich operational dashboards (think ELK-style), OpenSearch Dashboards is the answer. For BI/business reporting, pair Athena + QuickSight. |
| Data Sources / Integrations | Native destination for AWS service logs (EC2 via agent, Lambda auto-integration, ECS, EKS, RDS, API Gateway, VPC Flow Logs, CloudTrail, etc.) | Ingestion via Kinesis Data Firehose, Logstash, OpenSearch Ingestion (managed), or direct API; can subscribe to CloudWatch Logs | Queries S3 directly; also supports Athena Federated Query to query RDS, DynamoDB, Redshift, and other sources via Lambda connectors | CloudWatch Logs is the default destination for AWS service logs. To get logs into OpenSearch, a common pattern is: CloudWatch Logs → Subscription Filter → Kinesis Data Firehose → OpenSearch. |
| Data Retention | Configurable per log group: 1 day to 10 years, or Never Expire; default is Never Expire (costs accumulate) | Data retained as long as the cluster has storage; use Index State Management (ISM) policies to automate deletion/archival to UltraWarm/cold/S3 | Data retention managed entirely by S3 lifecycle policies; Athena itself has no retention concept | Forgetting to set CloudWatch Logs retention policies is a common cost trap. Default 'Never Expire' means indefinite storage charges. |
| Access Control | IAM policies; resource-based policies for cross-account log sharing; CloudWatch Logs data protection policies for PII masking | IAM-based access control + fine-grained access control (FGAC) at index/document/field level; VPC support; encryption at rest and in transit | IAM policies; Athena workgroups for query isolation and cost control; Lake Formation for column/row-level security on Glue catalog tables | For column-level and row-level security on data lake queries, use Athena + AWS Lake Formation. OpenSearch FGAC provides document-level security for search use cases. |
| Schema / Data Structure | Schema-less log streams; Logs Insights auto-discovers JSON fields; structured and unstructured text supported | Schema defined via index mappings (dynamic or explicit); flexible document model (JSON) | Schema-on-read using AWS Glue Data Catalog; supports structured (CSV), semi-structured (JSON), and columnar (Parquet, ORC) formats | Athena uses 'schema-on-read': you define the schema when querying, not when storing. This is a key exam concept distinguishing it from traditional databases. |
| Cost Optimization Strategies | Set retention policies; export old logs to S3 (then query with Athena); use Metric Filters instead of querying raw logs; use data protection to avoid storing sensitive data | Use UltraWarm for infrequently accessed indices (lower cost than hot storage); use cold storage for archival; right-size instance types; use Reserved Instances | Convert data to columnar format (Parquet/ORC), which can reduce costs by 30-90%; use partitioning to limit data scanned; compress data; use workgroup query result reuse | Athena + Parquet + partitioning is the canonical exam answer for cost-efficient ad-hoc analytics on S3. This pattern appears across SAA-C03, DEA-C01, and DOP-C02. |
| Compliance & Audit Log Use Cases | CloudTrail logs can be sent to CloudWatch Logs for real-time alerting on API calls; supports log integrity validation via CloudTrail | Audit logging plugin tracks all cluster operations; useful for compliance dashboards aggregating multiple log sources | Query CloudTrail logs stored in S3 for historical forensic analysis; Athena has pre-built CloudTrail table integration; ideal for SCS-C02 investigations | For SCS-C02: CloudTrail → S3 → Athena is the standard pattern for historical security investigation. CloudTrail → CloudWatch Logs is for real-time alerting on suspicious API activity. |
| Machine Learning / AI Integration | CloudWatch Logs Anomaly Detection (ML-powered); CloudWatch Contributor Insights for pattern analysis | Native ML framework; anomaly detection plugin; k-NN vector search; direct Bedrock integration for semantic search and RAG | Can query SageMaker feature store data; integrates with SageMaker for ML pipelines via S3; no native ML inference | OpenSearch is the AWS vector database of choice for Bedrock RAG architectures. This is heavily tested in AIF-C01. Athena and CloudWatch Logs have no vector search capabilities. |
| Cross-Account / Multi-Region | CloudWatch cross-account observability (OAM) allows sharing logs/metrics across accounts; cross-region log group replication available | Cross-cluster replication (CCR) for multi-region; VPC peering for cross-account access; no native cross-account dashboard federation | Athena can query S3 buckets in other accounts (with proper bucket policies); Glue catalog sharing via Lake Formation for cross-account access | For multi-account observability at scale, CloudWatch OAM (Observability Access Manager) is the AWS-native answer. Common in DOP-C02 and SAP-C02. |
| Typical Exam Scenario Keywords | 'real-time monitoring', 'operational alerts', 'Lambda logs', 'metric filter', 'log group', 'VPC Flow Logs analysis', 'CloudTrail alerting' | 'full-text search', 'Kibana dashboards', 'log analytics platform', 'vector search', 'RAG', 'semantic search', 'ELK replacement', 'OpenSearch Dashboards' | 'ad-hoc queries', 'S3 data lake', 'serverless SQL', 'cost-effective analytics', 'Parquet', 'partitioning', 'CloudTrail forensics', 'pay-per-query' | Train yourself to map these keywords to the correct service instantly; exam questions are often scenario-based with these exact trigger words. |
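
The Data Retention row in the table says CloudWatch Logs retention is configurable from 1 day to 10 years. In practice the retention setting accepts only a fixed set of day counts, so a requested value usually has to be rounded up to the next allowed one. The sketch below shows that rounding. The list of allowed values is reproduced from memory of the `PutRetentionPolicy` API reference and should be treated as an assumption to verify; `nearest_retention_days` is a hypothetical helper, not an AWS function.

```python
import bisect

# Assumption: allowed retentionInDays values, as recalled from the
# PutRetentionPolicy API reference. Verify before relying on them.
ALLOWED_RETENTION_DAYS = [
    1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545,
    731, 1096, 1827, 2192, 2557, 2922, 3288, 3653,
]


def nearest_retention_days(desired_days: int) -> int:
    """Round a desired retention period up to the next allowed value."""
    if desired_days < 1:
        raise ValueError("retention must be at least 1 day")
    index = bisect.bisect_left(ALLOWED_RETENTION_DAYS, desired_days)
    if index == len(ALLOWED_RETENTION_DAYS):
        raise ValueError("longer than 10 years; keep the log group at Never Expire")
    return ALLOWED_RETENTION_DAYS[index]


if __name__ == "__main__":
    # A 45-day requirement rounds up to the 60-day setting.
    print(nearest_retention_days(45))
```

With boto3 installed and credentials configured, the result could be passed as `retentionInDays` to `boto3.client("logs").put_retention_policy(logGroupName=..., retentionInDays=...)`.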
Summary
Use CloudWatch Logs when you need real-time operational monitoring, alerting, and native AWS service log collection — it's the default log destination for the entire AWS ecosystem. Choose OpenSearch when you need full-text search capabilities, rich Kibana-style dashboards, or vector search for AI/RAG applications with Bedrock. Pick Athena when you need cost-effective, serverless SQL analytics on data already stored in S3, especially for historical analysis, data lake queries, or forensic investigations — and always pair it with Parquet format and partitioning for maximum cost efficiency.
🎯 Decision Tree
- Need real-time alerts on log patterns? → CloudWatch Logs (Metric Filters + Alarms).
- Need full-text search or dashboards (ELK-style)? → OpenSearch.
- Need vector search / RAG with Bedrock? → OpenSearch (k-NN).
- Need ad-hoc SQL on S3 data lake? → Athena.
- Need to query CloudTrail logs historically? → Athena (pre-built CloudTrail integration).
- Need to query CloudTrail in real time and alert? → CloudWatch Logs.
- Need BI dashboards on S3 data? → Athena + QuickSight.
- Need to reduce Athena query costs? → Convert to Parquet + add partitions.
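
The decision tree above can be sketched as an ordered lookup. This is only an illustration: the requirement labels (`real_time_alerts`, `vector_search`, and so on) are hypothetical names invented here, and the rule order is an assumption, since the tree itself does not say which need wins when several apply.

```python
from typing import Iterable, Optional

# Ordered rules mirroring the decision tree: the first matching need wins.
# The keys are hypothetical labels, not AWS terminology.
DECISION_RULES = [
    ("real_time_alerts", "CloudWatch Logs (Metric Filters + Alarms)"),
    ("vector_search", "OpenSearch (k-NN)"),
    ("full_text_search", "OpenSearch"),
    ("bi_dashboards_on_s3", "Athena + QuickSight"),
    ("historical_cloudtrail", "Athena"),
    ("adhoc_sql_on_s3", "Athena"),
]


def recommend_service(needs: Iterable[str]) -> Optional[str]:
    """Return the first service whose trigger appears in `needs`, else None."""
    wanted = set(needs)
    for need, service in DECISION_RULES:
        if need in wanted:
            return service
    return None


if __name__ == "__main__":
    print(recommend_service(["adhoc_sql_on_s3"]))
    print(recommend_service(["vector_search", "full_text_search"]))
```

Keeping the rules in a list rather than a dictionary makes the priority order explicit and easy to change.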
CRITICAL — Vector Search = OpenSearch, NEVER Athena or CloudWatch Logs: Any exam scenario involving RAG (Retrieval-Augmented Generation), semantic search, vector embeddings, or Bedrock Knowledge Bases points to OpenSearch as the vector store. Neither Athena (SQL engine) nor CloudWatch Logs (log monitoring) support vector indexing or k-NN search. This appears in AIF-C01, SAP-C02, and SAA-C03.
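
To make the k-NN point concrete, here is a minimal sketch of the two JSON bodies involved in OpenSearch vector search: an index definition with a `knn_vector` field and a `knn` query against it. The field name `embedding` and the three-dimensional example vector are assumptions for illustration; real embeddings from a Bedrock model have far more dimensions. The code only builds the request bodies and does not contact a cluster.

```python
import json


def knn_index_body(dimension: int, field: str = "embedding") -> dict:
    """Index settings and mapping that enable k-NN search on one vector field."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                field: {"type": "knn_vector", "dimension": dimension},
            }
        },
    }


def knn_query_body(vector: list, k: int = 3, field: str = "embedding") -> dict:
    """Search body asking for the k nearest neighbours of `vector`."""
    return {
        "size": k,
        "query": {"knn": {field: {"vector": vector, "k": k}}},
    }


if __name__ == "__main__":
    # Hypothetical 3-dimensional query vector, for illustration only.
    print(json.dumps(knn_query_body([0.1, 0.2, 0.3], k=2), indent=2))
```

In a RAG flow, the query vector would be the embedding of the user's question, and the returned documents would be passed to the model as context.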
CRITICAL — Real-Time Alerting on Logs = CloudWatch Logs, NOT Athena: When a scenario says 'automatically alert when error rate exceeds threshold in application logs' or 'trigger Lambda when a specific pattern appears in logs', the answer is CloudWatch Logs Metric Filter → CloudWatch Alarm → SNS/Lambda. Athena is query-on-demand only and cannot trigger real-time alerts. This is the #1 confusion between these services on DOP-C02 and SCS-C02.
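
The Metric Filter → Alarm → SNS chain described above can be sketched as two sets of request parameters. The helper functions, filter name, metric name, namespace, and threshold below are all hypothetical; the dictionary keys follow the `PutMetricFilter` and `PutMetricAlarm` request shapes as used by boto3. The code only builds the parameters so it runs without AWS access.

```python
def metric_filter_params(log_group: str, pattern: str = "ERROR") -> dict:
    """Keyword arguments for CloudWatch Logs PutMetricFilter."""
    return {
        "logGroupName": log_group,
        "filterName": "app-error-count",            # hypothetical name
        "filterPattern": pattern,                   # matches events containing the term
        "metricTransformations": [
            {
                "metricName": "AppErrorCount",      # hypothetical metric
                "metricNamespace": "Example/App",   # hypothetical namespace
                "metricValue": "1",                 # emit 1 per matching event
                "defaultValue": 0.0,                # emit 0 when nothing matches
            }
        ],
    }


def alarm_params(sns_topic_arn: str, threshold: float = 10.0) -> dict:
    """Keyword arguments for CloudWatch PutMetricAlarm on the metric above."""
    return {
        "AlarmName": "app-error-rate-high",         # hypothetical name
        "Namespace": "Example/App",
        "MetricName": "AppErrorCount",
        "Statistic": "Sum",
        "Period": 60,                               # evaluate per minute
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],            # SNS topic notified on ALARM
    }
```

With boto3 installed and credentials configured, these could be applied as `boto3.client("logs").put_metric_filter(**metric_filter_params("/aws/lambda/demo"))` followed by `boto3.client("cloudwatch").put_metric_alarm(**alarm_params(topic_arn))`.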
CRITICAL — Cost-Efficient S3 Analytics = Athena + Parquet + Partitioning: Any exam question mentioning 'reduce cost of querying S3', 'serverless analytics on data lake', or 'ad-hoc SQL without managing infrastructure' points to Athena. The canonical cost optimization pattern is: convert CSV/JSON to Parquet or ORC (columnar, compressed) AND partition data by date/region/etc. This combination can reduce data scanned (and therefore cost) by 30-90%, because columnar formats let Athena read only the columns a query references and partitions let it skip irrelevant S3 prefixes entirely. Appears in SAA-C03, DEA-C01, and DOP-C02.
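
A quick worked example shows why data scanned drives the bill. The sketch below uses the $5/TB figure quoted in the comparison table. Two things are assumptions here: that a terabyte is counted as 10^12 bytes, and that each query is billed for at least 10 MB. `athena_query_cost` is a hypothetical helper, not part of any AWS SDK.

```python
PRICE_PER_TB_USD = 5.0           # figure quoted in the comparison table
BYTES_PER_TB = 10**12            # assumption: decimal terabytes
MIN_BILLED_BYTES = 10 * 10**6    # assumption: 10 MB minimum billed per query


def athena_query_cost(bytes_scanned: int) -> float:
    """Estimate the cost in USD of one SQL query from the bytes it scans."""
    billed = max(bytes_scanned, MIN_BILLED_BYTES)
    return billed / BYTES_PER_TB * PRICE_PER_TB_USD


if __name__ == "__main__":
    raw_csv = 10**12              # query scans 1 TB of raw CSV
    optimized = raw_csv // 10     # same query after a 90% reduction in data scanned
    print(f"raw CSV:   ${athena_query_cost(raw_csv):.2f}")
    print(f"optimized: ${athena_query_cost(optimized):.2f}")
```

Under these assumptions a query scanning 1 TB costs about $5.00, and the same query after a 90% reduction in data scanned costs about $0.50.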
IMPORTANT — CloudTrail Forensics = Athena; CloudTrail Real-Time Alerting = CloudWatch Logs: This distinction is heavily tested in SCS-C02. 'Investigate who deleted S3 objects last month' → CloudTrail logs in S3 → Athena SQL query. 'Alert immediately when root account is used' → CloudTrail → CloudWatch Logs → Metric Filter → Alarm → SNS. Both patterns use CloudTrail as the source but route to different services based on latency requirements.
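
The forensic half of this pattern ("who deleted S3 objects last month") might look like the query built below. This is a sketch under assumptions: the table name `cloudtrail_logs` is hypothetical, the column names follow the usual Athena CloudTrail table layout, and S3 object deletions only appear if data event logging was enabled on the trail. `deleted_objects_query` is an illustrative helper that returns the SQL text without running it.

```python
from datetime import datetime

# CloudTrail records eventtime as an ISO 8601 UTC string, so string
# comparison on this format orders events chronologically.
ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def deleted_objects_query(table: str, start: datetime, end: datetime) -> str:
    """Build an Athena SQL query listing S3 object deletions in [start, end)."""
    # The table name is interpolated into SQL, so accept only simple identifiers.
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    return (
        "SELECT eventtime, useridentity.arn AS principal,\n"
        "       sourceipaddress, requestparameters\n"
        f"FROM {table}\n"
        "WHERE eventsource = 's3.amazonaws.com'\n"
        "  AND eventname IN ('DeleteObject', 'DeleteObjects')\n"
        f"  AND eventtime >= '{start.strftime(ISO_FORMAT)}'\n"
        f"  AND eventtime < '{end.strftime(ISO_FORMAT)}'\n"
        "ORDER BY eventtime"
    )


if __name__ == "__main__":
    # Hypothetical table name and an example one-month window (UTC).
    print(deleted_objects_query("cloudtrail_logs", datetime(2025, 1, 1), datetime(2025, 2, 1)))
```

If the table is partitioned by date, adding the partition columns to the `WHERE` clause would further limit the data scanned.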
IMPORTANT — OpenSearch Dashboards (not CloudWatch Dashboards) for ELK-Style Analytics: When a question describes 'replacing an on-premises ELK stack', 'Kibana-style dashboards', 'log aggregation with rich visualization', or 'centralized log analytics platform', the answer is OpenSearch Service with OpenSearch Dashboards. CloudWatch Dashboards exist but are basic widgets, not full analytics platforms. This appears in DOP-C02 and SAP-C02 migration scenarios.
IMPORTANT — Athena Federated Query Extends Beyond S3: Athena is not limited to S3. With Athena Federated Query (Lambda-based connectors), you can query RDS, Aurora, DynamoDB, Redshift, and even on-premises databases using SQL — all from one Athena query. This is tested in DEA-C01 and SAA-C03 as a unified analytics layer pattern.
NICE-TO-KNOW — CloudWatch Logs Export to S3 is NOT Real-Time: CreateExportTask runs as an asynchronous batch job, and log data can take up to 12 hours to become available for export. For near-real-time log delivery to S3 (for Athena querying), use CloudWatch Logs Subscription Filter → Kinesis Data Firehose → S3. This architectural distinction appears in DOP-C02 pipeline design questions.
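
The near-real-time path starts with a subscription filter on the log group. The sketch below builds the request parameters for `PutSubscriptionFilter` with a Firehose delivery stream as the destination. The helper name, filter name, and ARNs are hypothetical, and the Firehose stream (with its S3 destination) and the IAM role are assumed to exist already.

```python
def subscription_filter_params(
    log_group: str,
    firehose_arn: str,
    role_arn: str,
    pattern: str = "",
) -> dict:
    """Keyword arguments for CloudWatch Logs PutSubscriptionFilter."""
    if not firehose_arn.startswith("arn:aws:firehose:"):
        raise ValueError("destination must be a Firehose delivery stream ARN")
    return {
        "logGroupName": log_group,
        "filterName": "to-firehose",    # hypothetical name
        "filterPattern": pattern,       # empty pattern forwards every log event
        "destinationArn": firehose_arn,
        "roleArn": role_arn,            # role CloudWatch Logs assumes to write to Firehose
    }
```

With boto3 installed and credentials configured, the result could be passed to `boto3.client("logs").put_subscription_filter(**params)`.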
NICE-TO-KNOW — OpenSearch Cold Storage Requires Migration Before Querying: Unlike UltraWarm (which is directly queryable), OpenSearch cold storage requires you to migrate indices back to UltraWarm before you can query them. If an exam scenario mentions 'lowest cost storage for rarely accessed OpenSearch indices that may occasionally need querying', UltraWarm is safer than cold storage. Cold storage is for true archival.
The #1 exam trap: Choosing Athena or a traditional database (RDS) for vector search / RAG applications. When any scenario involves semantic search, vector embeddings, Bedrock Knowledge Bases, or RAG architecture, the answer is ALWAYS OpenSearch (with k-NN plugin) — not Athena (no vector support), not RDS (inefficient for vectors), not SageMaker alone (inference, not vector storage), and not CloudWatch Logs (log monitoring only). A secondary trap: Assuming CloudWatch Logs can perform ad-hoc historical SQL analytics like Athena — CloudWatch Logs Insights is powerful but proprietary and not SQL; for SQL on historical log data, export to S3 and use Athena.
CertAI Tutor · CLF-C02, DEA-C01, DOP-C02, AIF-C01, SAA-C03, SAP-C02, DVA-C02, SCS-C02 · 2026-02-22