
Fully managed Apache Kafka for real-time streaming — without the operational overhead
Amazon MSK (Managed Streaming for Apache Kafka) is a fully managed service that makes it easy to build and run applications that use Apache Kafka to process streaming data. MSK provisions, configures, and maintains Kafka brokers and Apache ZooKeeper nodes, handling infrastructure tasks so you can focus on building streaming applications. It integrates natively with AWS services like Lambda, Kinesis Data Firehose, Glue, and S3, making it the go-to choice when you need a fully compatible, open-source Kafka experience on AWS.
Run Apache Kafka workloads on AWS without managing broker infrastructure, ZooKeeper, or cluster scaling — while retaining 100% Kafka API compatibility for lift-and-shift migrations and new streaming architectures.
Use When
Avoid When
Fully Managed Broker Provisioning
AWS provisions, patches, and maintains Kafka brokers and ZooKeeper/KRaft nodes automatically
MSK Serverless
Automatically scales capacity based on throughput — no broker sizing required; pay per usage
MSK Connect (Kafka Connect managed)
Fully managed Kafka Connect workers for integrating Kafka with external systems (S3, DynamoDB, RDS, etc.)
MSK Tiered Storage
Offloads older log segments to S3 for cost-effective long-term retention without increasing broker storage
Multi-AZ High Availability
Supports 2 or 3 AZ deployments; 3 AZs recommended for production workloads
Encryption at Rest
Supports AWS KMS CMK encryption; default uses AWS-managed keys
Encryption in Transit
TLS encryption between clients and brokers, and between brokers; configurable (TLS only, or TLS + plaintext)
Authentication: mTLS (Mutual TLS)
Client certificate-based authentication using AWS Certificate Manager Private CA
Authentication: SASL/SCRAM
Username/password authentication stored in AWS Secrets Manager
Authentication: IAM Access Control
Native AWS IAM-based authentication and authorization for Kafka clients — no separate credential management
VPC Networking (Private)
MSK clusters run inside your VPC; brokers are not publicly accessible by default
Public Access
Optional public broker endpoints can be enabled for clients outside the VPC (requires TLS)
CloudWatch Metrics Integration
Basic, Enhanced, and Maximum monitoring tiers; Enhanced and Maximum add per-broker and per-topic metrics
Open Monitoring (Prometheus)
Expose JMX and Node Exporter metrics for scraping by Prometheus-compatible tools
Broker Log Delivery
Kafka broker logs can be delivered to CloudWatch Logs, S3, or Kinesis Data Firehose
AWS Lambda as MSK Consumer
Lambda supports MSK as an event source trigger (event source mapping) — serverless Kafka consumer
Cluster Policy (Cross-Account Access)
Resource-based policies allow cross-account access to MSK clusters
KRaft Mode (ZooKeeper-less)
Newer Kafka versions on MSK support KRaft mode, eliminating ZooKeeper dependency
Automatic Minor Version Upgrades
Version upgrades in MSK are manual/controlled — AWS does NOT auto-upgrade Kafka versions without your action
Cross-Region Replication (native)
MSK does not offer native cross-region replication; use MirrorMaker 2 (via MSK Connect or self-managed) for cross-region
Serverless Kafka Consumer (Event Source Mapping)
High freq: Lambda polls MSK topics and invokes functions with batches of Kafka records. Ideal for lightweight, serverless stream processing without managing consumer infrastructure. Lambda handles offset management automatically.
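The polling pattern above can be sketched as a Lambda handler. In an MSK event source mapping, Lambda delivers records grouped by topic-partition under an `aws:kafka` event, with base64-encoded values. The field names below follow that documented event shape; the payload contents are hypothetical:

```python
import base64


def handler(event, context):
    """Minimal Lambda handler for an MSK event source mapping.

    Lambda polls the topic, manages consumer offsets, and invokes this
    function with a batch of records grouped by "topic-partition".
    Record values arrive base64-encoded.
    """
    processed = []
    for topic_partition, records in event["records"].items():
        for record in records:
            # Decode the base64-encoded message value.
            payload = base64.b64decode(record["value"]).decode("utf-8")
            processed.append({
                "topic": record["topic"],
                "partition": record["partition"],
                "offset": record["offset"],
                "payload": payload,
            })
    # Returning normally signals successful processing of the batch.
    return {"batchSize": len(processed), "records": processed}
```

Since Lambda owns offset management, the handler only needs to process the batch; a raised exception causes the batch to be retried.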
Kafka-to-S3/Redshift/OpenSearch Delivery
High freq: MSK Connect (or Lambda) reads from Kafka topics and writes to Firehose, which buffers and delivers data to S3, Redshift, or OpenSearch Service. Common for analytics pipelines.
Managed Kafka Connect Sink
High freq: MSK Connect runs Kafka Connect workers fully managed, using the S3 Sink Connector to continuously write Kafka topic data to S3 for data lake ingestion; no EC2 required.
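As a sketch of what such a sink configuration contains, the dictionary below uses the open-source Confluent S3 sink connector, a plugin commonly uploaded to MSK Connect for this pattern. Bucket, topic, and region values are placeholders, and the exact property set depends on the connector version:

```python
def s3_sink_connector_config(topics, bucket, region):
    """Illustrative Kafka Connect S3 sink configuration, as it might be
    supplied to MSK Connect as connectorConfiguration. Assumes the
    open-source Confluent S3 sink connector is uploaded as a custom
    plugin; all values are hypothetical placeholders."""
    return {
        "connector.class": "io.confluent.connect.s3.S3SinkConnector",
        "topics": topics,
        "s3.bucket.name": bucket,
        "s3.region": region,
        "storage.class": "io.confluent.connect.s3.storage.S3Storage",
        "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
        "flush.size": "1000",  # records buffered before each S3 object write
        "tasks.max": "2",      # parallel Connect tasks
    }
```

MSK Connect runs the workers executing this configuration; you supply only the connector properties and an IAM service execution role.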
Self-Managed Kafka Consumers on Compute
High freq: Kafka consumer applications running on EC2 or EKS (containers) consume from MSK topics. MSK handles broker management; you manage the consumer application layer.
Kafka vs. Kinesis Architecture Decision
High freq: MSK is chosen when Kafka API compatibility is required (migration, open-source tooling). Kinesis is chosen for fully serverless, AWS-native streaming with tighter service integrations and simpler ops.
Managed ETL from Kafka Streams
Medium freq: AWS Glue Streaming ETL jobs consume from MSK topics, transform data using Spark Structured Streaming, and write to S3 or Redshift. No server management required for the ETL layer.
Kafka Tiered Storage / Archival
Medium freq: MSK Tiered Storage automatically offloads older Kafka log segments to S3, enabling long retention periods at lower cost. Consumers can still read historical data transparently.
Real-Time Search Indexing
Medium freq: Kafka Connect (via MSK Connect) uses the OpenSearch Sink Connector to index streaming events into OpenSearch for real-time search and dashboards.
SASL/SCRAM Authentication Credential Management
Medium freq: Kafka client credentials for SASL/SCRAM authentication are stored and rotated in Secrets Manager. MSK retrieves credentials from Secrets Manager automatically; no hardcoded passwords.
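A minimal sketch of preparing such a secret for Secrets Manager. Two real MSK requirements are reflected here: the secret name must begin with the AmazonMSK_ prefix, and the secret must be encrypted with a customer-managed KMS key (the default aws/secretsmanager key is not accepted). Cluster, user, and key names are hypothetical:

```python
import json

# MSK only associates secrets whose names start with this prefix.
MSK_SECRET_PREFIX = "AmazonMSK_"


def build_scram_secret_request(cluster_name, username, password, kms_key_id):
    """Build kwargs for secretsmanager.create_secret() holding SASL/SCRAM
    credentials for an MSK cluster. All argument values are placeholders."""
    return {
        "Name": f"{MSK_SECRET_PREFIX}{cluster_name}_{username}",
        # Must be a customer-managed KMS key, not the AWS-managed default.
        "KmsKeyId": kms_key_id,
        "SecretString": json.dumps({"username": username, "password": password}),
    }


# After creating the secret, associate it with the cluster, e.g.:
#   kafka = boto3.client("kafka")
#   kafka.batch_associate_scram_secret(
#       ClusterArn=cluster_arn, SecretArnList=[secret_arn])
```

Rotation can then be handled by Secrets Manager without touching Kafka client code.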
MSK vs. Kinesis Data Streams decision: If the question mentions 'Apache Kafka', 'Kafka API compatibility', 'lift-and-shift from on-premises Kafka', or 'open-source Kafka tooling' — the answer is MSK. If it says 'serverless', 'no Kafka expertise', or 'AWS-native' — lean toward Kinesis.
MSK clusters live INSIDE your VPC — brokers are not publicly accessible by default. For external client access, you must explicitly enable public access (which requires TLS). Any exam scenario about 'connecting from outside the VPC' requires either VPC peering, VPN, Direct Connect, or enabling public access.
Lambda can consume from MSK as an event source mapping — Lambda polls the Kafka topic, manages offsets, and invokes your function with batches. This is a fully serverless consumer pattern. Know that Lambda supports BOTH MSK and self-managed Kafka (on EC2) as event sources.
MSK offers THREE authentication mechanisms: (1) mTLS — client certificates via ACM Private CA, (2) SASL/SCRAM — username/password stored in Secrets Manager, (3) IAM Access Control — AWS IAM policies. IAM is the most 'AWS-native' and preferred for new workloads. Know which to use in each scenario.
If the exam question mentions 'Apache Kafka', 'Kafka API compatibility', or 'migrating from on-premises Kafka' → the answer is Amazon MSK, not Kinesis. These are not interchangeable.
MSK clusters are VPC-private by default. Any scenario involving external/on-premises clients connecting to MSK requires a network solution (VPN, Direct Connect, VPC peering) or explicitly enabling public broker endpoints with TLS.
Lambda can be an MSK consumer via event source mapping — Lambda polls the topic, manages offsets, and processes batches serverlessly. This is the correct serverless Kafka consumer answer for exam scenarios.
MSK does NOT natively replicate data across AWS Regions. For cross-region disaster recovery or geo-distribution, you must use MirrorMaker 2 (Apache Kafka's built-in replication tool), which can run as an MSK Connect connector or on EC2/EKS.
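As an illustration, a MirrorMaker 2 source-connector configuration looks the same whether it runs on MSK Connect or on self-managed Connect workers. The property names below are real MM2 keys; the aliases, bootstrap servers, and topic regex are placeholders:

```python
def mirror_maker2_config(source_bootstrap, target_bootstrap, topics=".*"):
    """Sketch of a MirrorMaker 2 source-connector configuration for
    cross-region replication, e.g. passed to MSK Connect as
    connectorConfiguration. Aliases and servers are hypothetical."""
    return {
        # MM2's replication engine ships with Apache Kafka itself.
        "connector.class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
        "source.cluster.alias": "primary",   # source (active) region's cluster
        "target.cluster.alias": "dr",        # target (DR) region's cluster
        "source.cluster.bootstrap.servers": source_bootstrap,
        "target.cluster.bootstrap.servers": target_bootstrap,
        "topics": topics,                    # regex of topics to replicate
        "replication.factor": "3",           # RF for replicated topics on target
    }
```

By default MM2 prefixes replicated topic names with the source alias (e.g. primary.orders), which matters when pointing consumers at the DR cluster.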
MSK Serverless vs. MSK Provisioned: Serverless automatically scales and you pay per usage (partition-hours + I/O). Provisioned requires you to choose broker instance types and count, but gives more control and is cost-predictable at high, steady-state throughput. Exam scenarios about 'unpredictable workloads' or 'no capacity planning' point to Serverless.
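The pricing difference can be made concrete with back-of-the-envelope arithmetic. The helpers below compare the two billing models; every rate used is a hypothetical placeholder, not a real AWS price:

```python
def msk_provisioned_monthly(broker_count, broker_hourly, storage_gb,
                            storage_gb_month_rate, hours=730):
    """Rough provisioned-cluster monthly cost: broker instance-hours plus
    EBS storage. ZooKeeper/KRaft controller nodes are NOT billed by MSK.
    All rates are illustrative, not real AWS prices."""
    return broker_count * broker_hourly * hours + storage_gb * storage_gb_month_rate


def msk_serverless_monthly(partitions, partition_hourly, gb_in, gb_out,
                           rate_in, rate_out, hours=730):
    """Rough serverless monthly cost: partition-hours plus data I/O.
    All rates are illustrative, not real AWS prices."""
    return partitions * partition_hourly * hours + gb_in * rate_in + gb_out * rate_out
```

Running both with the same workload assumptions shows why steady high throughput tends to favor Provisioned (fixed broker cost amortizes well) while spiky or low traffic favors Serverless (you stop paying when idle).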
MSK Tiered Storage enables long data retention at low cost by offloading older log segments to S3. Consumers read older data transparently — they don't need to know data is in S3 vs. broker storage. This is the correct answer for 'cost-optimize Kafka storage for long retention' scenarios.
MSK Connect is a fully managed Kafka Connect service — you don't manage the Connect worker infrastructure. It's the right answer when a scenario asks how to move data from Kafka to S3, DynamoDB, or other systems without managing EC2 servers for Kafka Connect.
ZooKeeper nodes in MSK are FREE — AWS manages them and does not charge for ZooKeeper. You are only charged for Kafka broker instances and storage. This is a common cost calculation trap.
For monitoring MSK: Basic monitoring is free (cluster-level metrics). Enhanced monitoring adds broker-level and topic-level metrics (extra CloudWatch cost). Maximum monitoring adds partition-level metrics. Open Monitoring exposes Prometheus endpoints. Know these tiers for observability architecture questions.
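Programmatically, the monitoring tier and broker-log delivery are configured together. The sketch below builds kwargs for boto3's kafka.update_monitoring call (a real API), enabling per-topic, per-broker metrics and CloudWatch Logs delivery; the ARN, cluster version, and log group name are placeholders:

```python
def build_monitoring_update(cluster_arn, current_version, log_group):
    """Build kwargs for boto3 kafka.update_monitoring(): raise the
    monitoring level to per-topic, per-broker metrics and deliver broker
    logs to CloudWatch Logs. Argument values are hypothetical."""
    return {
        "ClusterArn": cluster_arn,
        "CurrentVersion": current_version,  # required optimistic-locking token
        # DEFAULT < PER_BROKER < PER_TOPIC_PER_BROKER < PER_TOPIC_PER_PARTITION
        "EnhancedMonitoring": "PER_TOPIC_PER_BROKER",
        "LoggingInfo": {
            "BrokerLogs": {
                # Firehose and S3 destinations are also supported here.
                "CloudWatchLogs": {"Enabled": True, "LogGroup": log_group},
            }
        },
    }


# Usage sketch: boto3.client("kafka").update_monitoring(**kwargs)
```

Higher levels publish more CloudWatch metrics and therefore cost more; Open Monitoring (Prometheus) is enabled via a separate OpenMonitoring field on the same call.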
Common Mistake
MSK and Amazon Kinesis are interchangeable — both are 'managed streaming services' so you can use either one for any streaming workload
Correct
MSK is Apache Kafka-compatible (open-source API) and best for Kafka migrations or when Kafka ecosystem tools are required. Kinesis Data Streams is AWS-proprietary, fully serverless, and tightly integrated with AWS services. They are NOT drop-in replacements for each other — Kafka clients cannot talk to Kinesis, and Kinesis clients cannot talk to MSK.
This is the #1 MSK exam trap. Questions will describe a workload and ask you to choose between MSK and Kinesis. The deciding factor is almost always: 'Does it need Kafka API compatibility?' If yes → MSK. If it's a new AWS-native workload → Kinesis is often simpler.
Common Mistake
MSK automatically replicates data across AWS Regions for disaster recovery
Correct
MSK does NOT provide native cross-region replication. MSK clusters are regional. For cross-region DR or active-active setups, you must implement MirrorMaker 2 yourself (via MSK Connect or on EC2/EKS). This is your responsibility, not AWS's.
Candidates assume 'managed' means 'globally replicated.' MSK manages broker infrastructure within a region, not cross-region data replication. Any exam scenario about cross-region Kafka DR requires MirrorMaker 2 as the answer.
Common Mistake
MSK clusters are publicly accessible by default — you just need the broker endpoint to connect
Correct
MSK clusters are deployed INSIDE your VPC and are private by default. Brokers have no public endpoints unless you explicitly enable public access. External clients must connect via VPC peering, VPN, Direct Connect, or with public access enabled (which requires TLS).
Candidates confuse MSK with SaaS Kafka services. On AWS, MSK is VPC-bound. Any architecture question about 'connecting an on-premises application to MSK' must include a network connectivity solution (VPN or Direct Connect), not just the broker address.
Common Mistake
You are charged for ZooKeeper nodes in MSK, which significantly increases the cost of running a Kafka cluster
Correct
AWS manages ZooKeeper nodes as part of the MSK service and does NOT charge you for them. You only pay for Kafka broker instances (by instance type and hour) and EBS storage. This makes MSK's total cost of ownership lower than self-managed Kafka on EC2, where you would also provision and pay for ZooKeeper EC2 instances.
This misconception leads to incorrect cost comparisons between MSK and self-managed Kafka. On the exam, ZooKeeper is a hidden cost of self-managed Kafka that MSK eliminates — this is a TCO advantage of MSK.
Common Mistake
MSK Serverless works exactly like MSK Provisioned — just with automatic scaling added on top
Correct
MSK Serverless is a fundamentally different deployment mode with different pricing (per partition-hour + I/O vs. per broker-hour), different quota tracking, different configuration options, and some feature limitations compared to provisioned. It is NOT simply 'auto-scaling MSK Provisioned.'
Candidates assume Serverless is just a scaling feature of Provisioned MSK. For exam scenarios, treat them as distinct offerings. Serverless is ideal for variable/unpredictable workloads; Provisioned is better for steady, high-throughput workloads where you need fine-grained broker control.
Common Mistake
MSK automatically upgrades Kafka versions to keep your cluster current and secure
Correct
MSK does NOT automatically upgrade Kafka versions. Version upgrades must be initiated manually by the cluster owner through the MSK console, CLI, or API. AWS will notify you of deprecated version end-of-life, but the upgrade action is yours to perform.
This is a common operational misconception. In exam scenarios about 'keeping Kafka version current' or 'applying security patches,' the correct answer involves the customer taking action — MSK handles broker patching at the OS/infrastructure level, but Kafka version upgrades are customer-controlled.
MSK = 'My Streaming Kafka' — Managed, Secure (VPC-private by default), Kafka-compatible. If the exam says 'Kafka' → think MSK.
MSK Auth = 'MIS': mTLS (certificates), IAM (AWS-native), SASL/SCRAM (Secrets Manager passwords) — three ways in, pick based on the scenario.
MSK vs. Kinesis: 'K for Kafka = K for Keep your existing code' (MSK). 'Kinesis = AWS-native, no Kafka knowledge needed.'
ZooKeeper = FREE in MSK. Remember: 'Zoo animals are on the house' — AWS pays for the Zoo(Keeper).
Cross-region MSK = MirrorMaker 2. Remember: 'To see your Kafka in another region, hold up a MirrorMaker.'
CertAI Tutor · SAA-C03, SAP-C02, CLF-C02, DEA-C01 · 2026-02-21