
Apache Kafka ecosystem vs AWS-native streaming — choose the right tool before the exam chooses for you
Kafka compatibility & ecosystem depth vs serverless simplicity & native AWS integration
| Feature (exam tip) | MSK (fully managed Apache Kafka on AWS) | Kinesis (AWS-native real-time data streaming) |
|---|---|---|
| **Underlying Technology.** If the scenario mentions 'Kafka', 'existing Kafka workload', 'Kafka Connect', or 'Kafka Streams' — the answer is MSK, period. | Apache Kafka (open-source, industry standard). You get real Kafka brokers — same APIs, same client libraries, same ecosystem. | Proprietary AWS service. Custom protocol; uses the Kinesis Client Library (KCL) or AWS SDK. Not Kafka-compatible. |
| **Management Model.** MSK Serverless exists but is a distinct offering. Standard MSK still requires broker sizing. Kinesis On-Demand is the simpler serverless path. | Managed but NOT serverless by default. You provision broker instance types and storage. MSK Serverless is available as an option for fully serverless Kafka. | Fully managed. On-Demand mode is truly serverless (no shard management). Provisioned mode requires shard planning. |
| **Throughput — Write.** Kinesis Provisioned is 1 MB/s write per shard — a critical exam number. MSK throughput is bounded by cluster size, not a fixed AWS quota per stream. | Scales with broker instance type and number of brokers. Can reach very high throughput; limited by cluster configuration. No hard AWS-imposed per-stream cap at the Kafka protocol level. | On-Demand: default 4 MB/s write; scales up to 10 GB/s write in us-east-1, us-west-2, eu-west-1; up to 200 MB/s write in other regions. Provisioned: 1 MB/s write per shard. |
| **Throughput — Read.** Kinesis Enhanced Fan-Out gives each registered consumer 2 MB/s per shard via HTTP/2 push. Without it, the 2 MB/s is SHARED. MSK consumer groups each independently read at full speed. | Scales with broker count and consumer group parallelism. Multiple consumer groups each get full throughput — no read fan-out penalty. | On-Demand: default 8 MB/s read; up to 20 GB/s read in us-east-1, us-west-2, eu-west-1; up to 400 MB/s read in other regions. Provisioned: 2 MB/s read per shard (shared across all consumers unless Enhanced Fan-Out is used — then 2 MB/s per consumer per shard). |
| **Number of Streams / Topics.** Kinesis On-Demand has a DEFAULT limit of 50 streams per account — a common exam trap. Provisioned mode has no stream count cap. | Topics are limited by broker storage and configuration, not a hard AWS quota on topic count. Scales to thousands of topics per cluster. | On-Demand: default limit of 50 data streams per account (can be increased via support ticket). Provisioned: no upper quota on number of streams per account. |
| **Shard / Partition Model.** Kinesis Provisioned shard quota varies by region — us-east-1/us-west-2/eu-west-1 get 20,000 by default; other regions get 1,000 or 6,000. This regional difference appears on exams. | Kafka partitions. You define partition count per topic. Partitions can be increased (never decreased). Consumer groups allow independent parallel consumption. | Shards. Provisioned: default 20,000 shards in us-east-1/us-west-2/eu-west-1; 1,000 or 6,000 in other regions. On-Demand: no upper shard limit (auto-scales). |
| **Data Retention.** Kinesis default retention is 24 HOURS — a critical exam fact. MSK can retain data much longer. If the scenario requires weeks or months of replay, MSK is preferred. | Configurable up to your broker storage capacity. Default 7 days, but can be set to unlimited (log.retention.ms). Long-term retention via Tiered Storage (MSK feature). | Default 24 hours. Extended retention up to 7 days (standard). Long-term retention up to 365 days (additional cost). NOT configurable beyond 365 days. |
| **Consumer Groups / Fan-Out.** If you need many independent consumers reading at full speed, MSK consumer groups are more natural and cost-effective than Kinesis Enhanced Fan-Out. | Native Kafka consumer groups. Unlimited independent consumer groups, each maintaining its own offset. True fan-out with no throughput penalty. | Multiple applications can consume the same stream. Without Enhanced Fan-Out, read throughput is shared (2 MB/s per shard total). With Enhanced Fan-Out, each consumer gets a dedicated 2 MB/s per shard. |
| **Ordering Guarantees.** Both services guarantee ordering within their unit (partition/shard). The key differentiator is NOT ordering — it's ecosystem, management, and throughput model. | Per-partition ordering guaranteed. Messages with the same key go to the same partition (consistent hashing). Strong ordering semantics. | Per-shard ordering guaranteed. Records with the same partition key go to the same shard. Same strong ordering semantics within a shard. |
| **Protocol & Client Compatibility.** Migration from on-premises Kafka to AWS? MSK is the ONLY answer — zero code changes to producers/consumers. Kinesis requires a full rewrite. | Native Kafka protocol. ANY Kafka client library works (Java, Python, Go, .NET, etc.). Kafka Connect, Kafka Streams, Schema Registry, ksqlDB all work natively. | Proprietary Kinesis protocol. Must use the AWS SDK, KCL (Kinesis Client Library), or Kinesis Producer Library (KPL). NOT compatible with Kafka clients. |
| **AWS Native Integration.** If the scenario is 'AWS-only stack, no Kafka expertise, need quick integration with Lambda/S3/Redshift' — Kinesis is the answer. MSK requires more operational knowledge. | Good integration but requires more configuration. Lambda triggers available. Glue, EMR, Flink integrations exist. Less 'click and go' than Kinesis. | Deep native AWS integration. Direct Lambda triggers, Firehose delivery to S3/Redshift/OpenSearch, Managed Service for Apache Flink (formerly KDA), CloudWatch metrics out of the box. |
| **Serverless Option.** Both have serverless options now. MSK Serverless = Kafka APIs without broker management. Kinesis On-Demand = auto-scaling shards. Neither requires upfront capacity planning. | MSK Serverless: fully serverless Kafka, auto-scales, pay-per-use. Supports standard Kafka APIs. Some Kafka features are not supported in Serverless mode. | On-Demand mode: fully serverless, auto-scales shards, pay per GB ingested and retrieved. Available since 2021. Default for new streams in many scenarios. |
| **Security Model.** MSK supports multiple auth mechanisms (IAM, SASL/SCRAM, mTLS) — useful when non-AWS clients need to connect. Kinesis is IAM-only, simpler for pure-AWS workloads. | Deploys inside your VPC. Supports TLS encryption in transit, at-rest encryption (EBS). Authentication via IAM, SASL/SCRAM, mutual TLS (mTLS). Fine-grained ACLs via Kafka ACLs or IAM. | IAM-based access control. Server-side encryption at rest (KMS). TLS in transit. VPC endpoints (PrivateLink) supported. Simpler security model — IAM only. |
| **Pricing Model.** MSK costs are dominated by broker instance hours even when idle. Kinesis On-Demand is purely consumption-based. For spiky/unpredictable workloads, Kinesis On-Demand is often cheaper. | Provisioned: pay per broker-hour (instance type) + storage (EBS GB-month) + data transfer. MSK Serverless: pay per cluster-hour + GB ingested + GB retrieved. No free tier. | Provisioned: pay per shard-hour + PUT payload units (per 25 KB). On-Demand: pay per GB ingested + GB retrieved. Extended retention costs extra. No free tier for production use. |
| **Operational Complexity.** Exam scenarios mentioning 'reduce operational overhead', 'no Kafka expertise', or 'fully managed' favor Kinesis. 'Existing Kafka team', 'open-source', or 'ecosystem' favor MSK. | Higher: must choose broker instance types, configure replication factor, manage topics, tune Kafka configs, monitor broker metrics, plan for broker upgrades. | Lower: AWS manages everything. Shard splitting/merging (Provisioned) or fully automatic (On-Demand). Minimal Kafka expertise required. |
| **Message Size Limit.** Both default to 1 MB max message size. Kinesis is a HARD limit. The MSK limit is configurable. For large messages, MSK wins — or use the S3 pointer pattern with either service. | Default 1 MB per message (Kafka default: message.max.bytes). Configurable up to broker limits — can be increased significantly. | Maximum 1 MB per record (hard limit, cannot be increased). |
| **Multi-Region / Disaster Recovery.** For multi-region Kafka replication, MSK Replicator is the managed answer. Kinesis cross-region replication requires custom solutions. | MirrorMaker 2 (MM2) for cross-region replication. MSK Replicator (managed MM2) available. Active-active or active-passive topologies possible. | No built-in cross-region replication. Must implement custom Lambda-based replication or use Firehose to replicate to S3 and re-ingest. More complex DR story. |
| **Ecosystem & Connectors.** CDC (Change Data Capture) with Debezium? MSK + MSK Connect. Need to stream from databases to S3 without code? MSK Connect with the Kafka Connect S3 Sink connector. | Full Kafka ecosystem: Kafka Connect (MSK Connect managed), 200+ connectors (Debezium, JDBC, S3, etc.), Kafka Streams, ksqlDB, Schema Registry. Huge community. | AWS-specific ecosystem: Kinesis Agent, KPL, KCL, Firehose, Managed Service for Apache Flink. Limited to AWS-native tooling. |
| **Use Case Fit.** | Lift-and-shift Kafka workloads, microservices event buses, CDC pipelines, complex stream processing (Kafka Streams/Flink), multi-consumer fan-out at scale, long-retention replay. | Real-time AWS analytics pipelines, log/event ingestion, IoT data streams, quick integration with Lambda/S3/Redshift, teams without Kafka expertise, variable/spiky workloads. |
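Both ordering models in the table rest on deterministic key-to-shard routing. Kinesis is documented to hash the partition key with MD5 into a 128-bit hash key space divided among shards; the sketch below assumes shards split that range evenly (after resharding, real ranges can be uneven), so treat it as an illustration rather than the service's exact routing:

```python
import hashlib

def kinesis_shard_for_key(partition_key: str, shard_count: int) -> int:
    """Map a partition key to a shard index, Kinesis-style:
    MD5 of the key yields a 128-bit hash key; shards are assumed
    to evenly split the hash key range [0, 2**128)."""
    hash_key = int.from_bytes(hashlib.md5(partition_key.encode()).digest(), "big")
    range_per_shard = 2 ** 128 // shard_count
    return min(hash_key // range_per_shard, shard_count - 1)

# The same key always routes to the same shard, which is exactly
# why per-key ordering holds within a shard (or a Kafka partition).
assert kinesis_shard_for_key("device-42", 8) == kinesis_shard_for_key("device-42", 8)
```

Kafka's default partitioner follows the same idea with a different hash (murmur2 mod partition count), which is why the table calls the ordering semantics equivalent.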
Summary
Choose MSK when you need Apache Kafka compatibility, rich ecosystem tooling (Kafka Connect, Streams, Schema Registry), or are migrating existing Kafka workloads — it offers superior flexibility, consumer group semantics, and long-term retention. Choose Kinesis when you want a fully AWS-native, lower-operational-overhead streaming service that integrates seamlessly with Lambda, Firehose, S3, and Redshift — especially for teams without Kafka expertise or for variable/spiky workloads where On-Demand pricing shines. Both are production-grade; the decision is ecosystem vs. simplicity.
🎯 Decision Tree
- 'Kafka' or 'existing Kafka' appears → MSK, always.
- 'No Kafka expertise' or 'fully managed with Lambda/S3' → Kinesis.
- 'CDC with Debezium' → MSK Connect.
- 'Stream to S3/Redshift automatically' → Kinesis + Firehose.
- 'Multiple independent consumer groups at full throughput' → MSK.
- 'Spiky unpredictable load, pay per use, no shards' → Kinesis On-Demand.
- 'Cross-region replication built-in' → MSK Replicator.
- 'Default 24-hour retention is acceptable' → Kinesis.
- 'Need weeks/months of data replay' → MSK.
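The decision tree can be condensed into a small keyword router. The keyword lists below are an illustrative distillation of the rules above, not an official mapping; real exam scenarios need judgment, not substring matching:

```python
def pick_streaming_service(scenario: str) -> str:
    """Illustrative router for MSK-vs-Kinesis scenarios.
    Keyword lists are assumptions distilled from the decision tree."""
    s = scenario.lower()
    if "cross-region replication" in s:
        return "MSK Replicator"
    if "no kafka expertise" in s:
        return "Kinesis"
    if any(k in s for k in ("kafka", "debezium", "ksqldb", "schema registry")):
        return "MSK"  # Kafka ecosystem keyword -> MSK, always
    if any(k in s for k in ("firehose", "deliver to s3", "deliver to redshift")):
        return "Kinesis + Firehose"
    if any(k in s for k in ("spiky", "unpredictable", "pay per use")):
        return "Kinesis On-Demand"
    return "Kinesis"  # default for AWS-native streaming scenarios

assert pick_streaming_service("migrate an existing Kafka cluster to AWS") == "MSK"
```

Note that the 'no Kafka expertise' check must come before the generic 'kafka' check, mirroring how the presence of the word Kafka alone is not enough: it is the *ecosystem requirement* that forces MSK.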
CRITICAL — Kafka keyword = MSK, always: Any exam scenario mentioning 'Apache Kafka', 'existing Kafka cluster', 'Kafka Connect', 'Kafka Streams', 'ksqlDB', 'Schema Registry', or 'lift-and-shift from Kafka' has ONE correct answer: Amazon MSK. Kinesis is NOT Kafka-compatible and requires a full rewrite of producers and consumers.
CRITICAL — Kinesis Provisioned throughput limits are per-shard: Write = 1 MB/s or 1,000 records/s per shard. Read = 2 MB/s per shard (shared without Enhanced Fan-Out). Kinesis On-Demand defaults to 4 MB/s write / 8 MB/s read and auto-scales to 10 GB/s write / 20 GB/s read in us-east-1, us-west-2, eu-west-1 — but only 200 MB/s / 400 MB/s in other regions. These numbers appear directly in exam questions.
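A shard count for Provisioned mode falls straight out of these per-shard numbers: take the maximum of the write-bandwidth, record-rate, and read-bandwidth requirements. A minimal sketch, assuming no Enhanced Fan-Out (so read capacity is the shared 2 MB/s per shard):

```python
import math

def provisioned_shards_needed(write_mb_s: float, records_per_s: float,
                              read_mb_s: float) -> int:
    """Minimum Kinesis Provisioned shard count for a workload, using the
    per-shard limits quoted above: 1 MB/s or 1,000 records/s write,
    2 MB/s read (shared across consumers without Enhanced Fan-Out)."""
    return max(
        math.ceil(write_mb_s / 1.0),      # write bandwidth constraint
        math.ceil(records_per_s / 1000.0),  # record rate constraint
        math.ceil(read_mb_s / 2.0),       # shared read bandwidth constraint
    )

# 10 MB/s in, 12,000 records/s, 18 MB/s out:
# write needs 10 shards, record rate needs 12, read needs 9 -> 12
assert provisioned_shards_needed(10, 12_000, 18) == 12
```

The record-rate term is the one candidates forget: many small records can force more shards than raw bandwidth suggests.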
CRITICAL — Kinesis default retention is 24 hours, NOT 7 days: The most common retention trap. Kinesis defaults to 24 hours. Extended retention goes to 7 days. Long-term retention goes up to 365 days. MSK defaults to 7 days and supports effectively unlimited retention via Tiered Storage. If a scenario requires days or weeks of data replay at low cost, MSK wins.
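The retention tiers above (24 hours default, 7 days extended, 365 days long-term, MSK beyond that) reduce to a simple threshold check. A sketch using only the limits stated in this note:

```python
def streaming_retention_fit(replay_hours: int) -> str:
    """Which retention option covers a required replay window, per the
    limits above: Kinesis default 24 h, extended 7 d (168 h),
    long-term 365 d (8,760 h); beyond that, MSK with Tiered Storage."""
    if replay_hours <= 24:
        return "Kinesis default"
    if replay_hours <= 7 * 24:
        return "Kinesis extended retention"
    if replay_hours <= 365 * 24:
        return "Kinesis long-term retention (extra cost)"
    return "MSK (retention beyond 365 days)"

# A 30-day replay window exceeds extended retention but fits long-term.
assert streaming_retention_fit(30 * 24) == "Kinesis long-term retention (extra cost)"
```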
IMPORTANT — Kinesis On-Demand default stream limit is 50 per account: Provisioned mode has no stream count quota, but On-Demand is limited to 50 streams per account by default. If an architecture requires hundreds of On-Demand streams, a support ticket is needed. This distinction appears in architecture constraint questions.
IMPORTANT — Enhanced Fan-Out changes Kinesis read economics: Without Enhanced Fan-Out, all consumers on a shard SHARE 2 MB/s read throughput. With Enhanced Fan-Out, each registered consumer gets a DEDICATED 2 MB/s per shard via HTTP/2 push (at additional cost). If you have 5 Lambda functions all consuming the same Kinesis stream, without EFO they each get 0.4 MB/s. MSK consumer groups each get full partition throughput independently — no sharing, no extra cost.
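The fan-out arithmetic in this note is worth making explicit, since it decides whether EFO pays for itself. A sketch of per-consumer read throughput on a single shard, using the 2 MB/s figures above:

```python
def per_consumer_read_mb_s(consumers: int, enhanced_fan_out: bool) -> float:
    """Per-consumer read throughput on one Kinesis shard:
    2 MB/s shared across all consumers without Enhanced Fan-Out,
    a dedicated 2 MB/s per registered consumer with it."""
    if enhanced_fan_out:
        return 2.0
    return 2.0 / consumers

# The 5-Lambda example from the note above:
assert per_consumer_read_mb_s(5, enhanced_fan_out=False) == 0.4
assert per_consumer_read_mb_s(5, enhanced_fan_out=True) == 2.0
```

MSK has no equivalent function to write: every consumer group reads each partition at full throughput, so the divisor simply never appears.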
IMPORTANT — MSK is NOT serverless by default: Standard MSK requires broker provisioning (instance types, EBS storage). MSK Serverless is a separate offering. Kinesis On-Demand is the simpler serverless path. Exam scenarios asking for 'minimum operational overhead' with no Kafka requirement → Kinesis On-Demand. Scenarios asking for 'Kafka without managing brokers' → MSK Serverless.
NICE-TO-KNOW — Regional shard quotas differ significantly: Kinesis Provisioned default is 20,000 shards in us-east-1, us-west-2, eu-west-1 — but only 1,000 or 6,000 shards in other regions. For global architectures processing high-volume data in non-primary regions, this limit matters and can be increased via Service Quotas console.
The #1 exam trap: Candidates see a scenario about 'real-time streaming data' and default to Kinesis — but the scenario mentions 'existing Kafka producers', 'Kafka Connect', or 'migrate from on-premises Kafka'. The correct answer is ALWAYS MSK in those cases. Kinesis is NOT Kafka-compatible. Conversely, candidates sometimes recommend MSK for simple AWS-native pipelines (Lambda → stream → S3) where Kinesis + Firehose is far simpler, cheaper, and requires zero Kafka expertise. The decision rule: Kafka ecosystem needed → MSK. Pure AWS-native simplicity → Kinesis.
CertAI Tutor · DEA-C01, DOP-C02, SAA-C03, SAP-C02, SCS-C02, CLF-C02, DVA-C02 · 2026-02-22