
Seamlessly connect on-premises environments to AWS cloud storage — without ripping and replacing your infrastructure
AWS Storage Gateway is a hybrid cloud storage service that gives on-premises applications access to virtually unlimited cloud storage via standard storage protocols (NFS, SMB, iSCSI). It acts as a bridge between your on-premises environment and AWS storage services like S3, EBS, and Glacier — not as a standalone storage service itself. Gateway appliances run on-premises (as a VM, hardware appliance, or EC2 instance) and cache frequently accessed data locally while durably storing data in AWS.
Enable hybrid cloud storage architectures that let on-premises workloads use AWS storage (S3, EBS, Glacier) through familiar protocols (NFS, SMB, iSCSI) without changing existing applications
Use When
Avoid When
File Gateway (NFS/SMB to S3)
Presents S3 buckets as NFS or SMB file shares; files stored as native S3 objects accessible via both protocols and S3 API
Volume Gateway - Cached Mode (iSCSI to S3 + EBS Snapshots)
Primary data in S3, frequently accessed data cached locally; snapshots stored as EBS snapshots
Volume Gateway - Stored Mode (iSCSI + async backup to S3)
Full dataset stored on-premises, asynchronously backed up to AWS as EBS snapshots for DR
Tape Gateway / Virtual Tape Library (VTL)
iSCSI VTL interface compatible with NetBackup, Veeam, Backup Exec; tapes stored in S3 and archived to Glacier/Deep Archive
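The four gateway types above can be summarized as a small lookup table. This is a minimal sketch for study purposes: the `GatewayType` class and the `candidates` helper are hypothetical names introduced here, not part of any AWS SDK, and the table only encodes the protocols and data locations listed above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class GatewayType:
    """One row of the gateway comparison (hypothetical helper type)."""
    name: str
    protocols: tuple[str, ...]
    primary_data_location: str  # "AWS" or "on-premises"
    aws_backing: str


GATEWAY_TYPES = (
    GatewayType("File Gateway", ("NFS", "SMB"), "AWS", "S3 objects"),
    GatewayType("Volume Gateway (Cached)", ("iSCSI",), "AWS",
                "S3, with EBS snapshots"),
    GatewayType("Volume Gateway (Stored)", ("iSCSI",), "on-premises",
                "EBS snapshots (async backup)"),
    GatewayType("Tape Gateway", ("iSCSI VTL",), "AWS",
                "S3, archived to Glacier or Deep Archive"),
)


def candidates(protocol: str, primary_on_premises: bool = False) -> list[str]:
    """Return gateway types that speak `protocol` and keep primary data
    in the requested location."""
    wanted = "on-premises" if primary_on_premises else "AWS"
    return [
        g.name
        for g in GATEWAY_TYPES
        if protocol in g.protocols and g.primary_data_location == wanted
    ]


print(candidates("SMB"))
print(candidates("iSCSI", primary_on_premises=True))
```

The two questions the helper asks (which protocol, and where primary data lives) are the same two questions most scenario matching comes down to.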
Local caching for low-latency access
File Gateway, Cached Volume Gateway, and Tape Gateway cache recently accessed data locally to minimize latency for hot data; Stored Volume Gateway keeps the full dataset on local disk
Bandwidth throttling and scheduling
Configure upload/download bandwidth limits by time of day to avoid saturating WAN links during business hours
Data encryption in transit and at rest
SSL/TLS for data in transit; AWS KMS for encryption at rest in S3, EBS, and Glacier
AWS KMS integration for encryption
Use AWS-managed keys or customer-managed KMS keys for encryption of stored data
Active Directory / SMB authentication
SMB file shares support Active Directory authentication and Windows ACLs; NFS file shares use POSIX-style permissions
S3 Lifecycle policies on File Gateway data
Since files are native S3 objects, you can apply S3 Lifecycle rules to tier data to Glacier independently
CloudWatch monitoring and alerting
Gateway metrics (CacheHitPercent, CacheUsed, ReadBytes, WriteBytes) available in CloudWatch
Hardware appliance option
Physical Storage Gateway appliance available for environments that cannot run VMware/Hyper-V VMs
EC2-hosted gateway (cloud-side)
Deploy gateway as EC2 instance for cloud-based hybrid scenarios or testing
VMware ESXi / Microsoft Hyper-V / KVM support
Gateway VM can run on VMware ESXi, Microsoft Hyper-V, or Linux KVM hypervisors on-premises
AWS Backup integration
Volume Gateway snapshots can be managed through AWS Backup for centralized backup policy management
Cross-region replication of gateway data
Storage Gateway itself does not replicate across regions; use S3 Cross-Region Replication (CRR) on the underlying S3 bucket for File Gateway data
Multi-AZ redundancy (built-in gateway HA)
The gateway VM itself is a single point of failure; HA requires deploying redundant gateway VMs or using the hardware appliance with external HA solutions
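Because File Gateway data lands in S3 as native objects, the Lifecycle feature listed above is configured on the bucket, not on the gateway. The sketch below builds a rule document in the shape S3 expects for a lifecycle configuration. The function name, rule ID, prefix, and 90-day threshold are illustrative assumptions, and the code only constructs the document without calling AWS.

```python
import json


def glacier_transition_rule(rule_id: str, prefix: str, days: int) -> dict:
    """Build an S3 lifecycle configuration that moves objects under
    `prefix` to the GLACIER storage class after `days` days."""
    if days < 0:
        raise ValueError("days must be non-negative")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


# Hypothetical share prefix and retention period, for illustration only.
config = glacier_transition_rule("archive-projects", "share/projects/", 90)
print(json.dumps(config, indent=2))
```

With boto3, a document like this would typically be passed as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration`; check the current SDK documentation before relying on that.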
File Gateway as NFS/SMB front-end for S3
High frequency: On-premises applications write files via NFS or SMB to File Gateway; files are stored as native S3 objects. Applications can then access the same data via S3 API, enabling analytics (Athena, EMR) and AI/ML pipelines on the same data without copying. This is the most common File Gateway pattern.
Tape Gateway replacing physical tape with VTL + Glacier archival
High frequency: Backup software (Veeam, NetBackup, Backup Exec) writes to virtual tapes via iSCSI VTL interface. Active tapes reside in S3; when ejected to the virtual shelf, they are automatically archived to S3 Glacier or Glacier Deep Archive. Eliminates tape handling, offsite shipping, and physical media management.
Volume Gateway stored mode for on-premises DR
High frequency: On-premises servers write to iSCSI volumes (full copy on-prem). Gateway asynchronously creates EBS snapshots in AWS. In a DR event, snapshots can be restored to EBS volumes and attached to EC2 instances in minutes. This is the classic 'on-premises to cloud DR' pattern.
High-throughput hybrid storage over dedicated network
High frequency: Storage Gateway over Direct Connect provides consistent, high-bandwidth, low-latency connectivity between on-premises and AWS storage. Eliminates public internet variability for latency-sensitive workloads. Recommended for large datasets or compliance environments requiring private connectivity.
IAM roles controlling gateway access to S3 and other AWS services
High frequency: Storage Gateway uses IAM roles (not bucket policies alone) to authenticate to S3, KMS, and CloudWatch. The gateway's IAM role must have appropriate permissions to read/write S3 objects, create EBS snapshots, and publish CloudWatch metrics. Bucket policies are secondary and additive.
Anti-pattern: Storage Gateway is NOT a replacement for EFS
High frequency: EFS provides native NFS for EC2 instances within AWS with multi-AZ HA. Storage Gateway provides NFS/SMB for ON-PREMISES applications connecting TO AWS. Do not use Storage Gateway for EC2-to-EC2 file sharing; use EFS instead. This distinction is heavily tested.
Centralized backup policy for Volume Gateway snapshots
Medium frequency: AWS Backup can manage Volume Gateway EBS snapshot lifecycle (retention, cross-region copy, compliance reporting) through a unified backup plan. Eliminates manual snapshot management scripts and provides audit-ready backup compliance reporting.
SMB File Gateway with Active Directory and IAM Identity Center
Medium frequency: SMB file shares can be integrated with on-premises Active Directory for user authentication and access control. IAM Identity Center can extend identity federation for management console access to the gateway configuration. Enables enterprise-grade access control for hybrid file shares.
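The IAM role pattern above can be made concrete with two policy documents: a trust policy that lets the Storage Gateway service assume the role, and a permissions policy for the bucket behind a File Gateway share. This is a minimal sketch that assumes a hypothetical bucket name and grants only a small set of S3 read/write actions; real deployments usually need additional S3 actions, plus KMS permissions when customer-managed keys are used.

```python
from __future__ import annotations

import json


def gateway_role_documents(bucket_name: str) -> tuple[dict, dict]:
    """Return (trust_policy, permissions_policy) for a File Gateway role."""
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "storagegateway.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }
    permissions_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {   # Bucket-level action: resource is the bucket ARN itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
            {   # Object-level actions: resource is every key in the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }
    return trust_policy, permissions_policy


# "example-hybrid-share" is a placeholder bucket name.
trust, permissions = gateway_role_documents("example-hybrid-share")
print(json.dumps(permissions, indent=2))
```

Splitting bucket-level and object-level statements matters because `s3:ListBucket` applies to the bucket ARN while `s3:GetObject` and `s3:PutObject` apply to object ARNs.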
Storage Gateway is a HYBRID CONNECTOR, not a standalone storage service. It always requires an underlying AWS storage service (S3, EBS snapshots, Glacier) to actually store data. When you see 'on-premises applications need cloud storage without changing protocols,' think Storage Gateway.
Know all four gateway types and their protocols cold: File Gateway (NFS/SMB → S3), Volume Gateway Cached (iSCSI → S3 + EBS snapshots, primary in cloud), Volume Gateway Stored (iSCSI → EBS snapshots, primary on-premises), Tape Gateway (iSCSI VTL → S3/Glacier). The exam tests your ability to match the right type to a scenario.
Cached vs. Stored Volume Gateway is a critical distinction: CACHED = primary data lives in S3 (cloud-first, larger volume limit 32 TB), local disk is just a cache. STORED = primary data lives on-premises (full local copy, smaller volume limit 16 TB), cloud is just async backup. Memory trick: 'STORED = stored locally; CACHED = cloud stores it, you cache it.'
File Gateway files are stored as NATIVE S3 OBJECTS. This means you can apply S3 Lifecycle policies, enable S3 Intelligent-Tiering, use S3 Replication, and query data with Athena — all without going through the gateway. This dual-access pattern (NFS/SMB from on-prem + S3 API from cloud) is a major exam scenario.
Storage Gateway is a HYBRID CONNECTOR — it translates on-premises protocols (NFS, SMB, iSCSI) to AWS storage (S3, EBS, Glacier). It is NOT standalone storage. Data durability comes from the underlying AWS service, not the gateway appliance.
CACHED Volume = primary data in S3 (cloud-first, 32 TB max/vol) | STORED Volume = primary data on-premises (local-first, 16 TB max/vol, async backup to EBS snapshots). Match the mode to WHERE primary data should live.
File Gateway files are native S3 objects — you can bypass the gateway and access them directly via S3 API, apply Lifecycle policies, use Athena, or enable CRR. This dual-access pattern (NFS/SMB + S3 API) is a major differentiator tested on SAA-C03 and SAP-C02.
For Tape Gateway, archived tapes go to S3 Glacier or S3 Glacier Deep Archive — retrieval is NOT instant. This mirrors physical tape behavior intentionally. If a question asks about instant tape retrieval, the answer is to keep the tape in the VTL (S3), not archive it. Deep Archive = cheapest, slowest (12-48 hr retrieval).
Storage Gateway does NOT provide HA out of the box. The gateway VM is a single point of failure. For HA, you need to deploy multiple gateway VMs or use the hardware appliance with external clustering. Exam scenarios about 'highly available on-premises file access to S3' may require two File Gateway VMs behind a load balancer or DNS failover.
Direct Connect + Storage Gateway is the recommended pattern for large-scale, consistent, or compliance-driven hybrid storage. If a question mentions 'consistent network performance' or 'private connectivity' for on-premises to cloud storage, combine Direct Connect with Storage Gateway — not just one or the other.
IAM roles (not just S3 bucket policies) control what Storage Gateway can do in your AWS account. The gateway must have an IAM role with permissions to s3:PutObject, s3:GetObject, s3:ListBucket, plus KMS permissions if using customer-managed keys. S3 bucket policies alone are insufficient — IAM role on the gateway is required.
For CLF-C02: Remember Storage Gateway as 'the service that connects on-premises to AWS storage.' For SAA-C03: Know which gateway type solves which scenario. For SAP-C02/DOP-C02: Know the HA patterns, monitoring (CloudWatch metrics like CacheHitPercent), and how to integrate with AWS Backup, Direct Connect, and IAM for enterprise architectures.
Bandwidth throttling in Storage Gateway is a built-in feature (not requiring additional services). You can schedule throttle windows to limit upload/download bandwidth during business hours. This is tested in cost optimization and operational scenarios — no need for third-party tools or Traffic Shaping appliances.
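The scheduled throttling described in the last tip amounts to a set of time windows, each with a rate limit. The sketch below models that idea in plain Python. `ThrottleWindow` and `upload_limit_at` are hypothetical names, the day numbering follows Python's `datetime.weekday()`, and the structure is not the service's actual API shape.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, time
from typing import Iterable, Optional


@dataclass(frozen=True)
class ThrottleWindow:
    """A weekly recurring window with an upload cap (hypothetical model)."""
    days: frozenset[int]          # Monday=0 .. Sunday=6, as in datetime.weekday()
    start: time                   # inclusive
    end: time                     # exclusive
    upload_bits_per_sec: int


def upload_limit_at(moment: datetime,
                    windows: Iterable[ThrottleWindow]) -> Optional[int]:
    """Return the upload cap in effect at `moment`, or None if unthrottled."""
    for window in windows:
        in_day = moment.weekday() in window.days
        in_hours = window.start <= moment.time() < window.end
        if in_day and in_hours:
            return window.upload_bits_per_sec
    return None


# Assumed policy: cap uploads at 50 Mbit/s on weekdays, 08:00 to 18:00.
BUSINESS_HOURS = ThrottleWindow(
    days=frozenset(range(5)),
    start=time(8, 0),
    end=time(18, 0),
    upload_bits_per_sec=50_000_000,
)

print(upload_limit_at(datetime(2024, 1, 8, 10, 30), [BUSINESS_HOURS]))  # a Monday morning
print(upload_limit_at(datetime(2024, 1, 8, 22, 0), [BUSINESS_HOURS]))   # Monday night
```

Outside every window the function returns `None`, meaning no cap, which matches the goal of throttling only during business hours and letting backups use the full link overnight.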
Common Mistake
Storage Gateway IS the storage — data lives in the gateway appliance
Correct
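Storage Gateway is a PROTOCOL TRANSLATOR and cache. Durable copies of your data live in AWS storage services: S3 (File and Tape Gateway), S3 + EBS snapshots (Volume Gateway). In File, Tape, and Cached Volume modes the gateway appliance only holds a local cache of recently accessed data; Stored Volume mode is the exception, keeping the full dataset on local disk and backing it up asynchronously as EBS snapshots. If the gateway VM dies, data already uploaded is safe in AWS: deploy a new gateway and reconnect.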
This is the #1 misconception and appears in CLF-C02 and SAA-C03 questions. Candidates who think the gateway IS the storage will choose wrong answers about data durability, backup, and disaster recovery. Remember: gateway = bridge, not bucket.
Common Mistake
S3 bucket policies are sufficient to control what Storage Gateway can access in S3
Correct
Storage Gateway authenticates to AWS using an IAM ROLE assigned to the gateway, not via bucket policies alone. The IAM role must grant the gateway permissions to interact with S3, KMS, and CloudWatch. Bucket policies can ADD restrictions but cannot GRANT access without the IAM role. Relying solely on bucket policies will result in access denied errors.
This appears in SAA-C03 and SAP-C02 security questions. Candidates familiar with S3 public access patterns assume bucket policies control everything. For Storage Gateway, IAM role = primary auth mechanism. Think of it like an EC2 instance role — the instance (gateway) needs a role to call AWS APIs.
Common Mistake
Storage Gateway can replace Amazon EFS for EC2 instances that need shared file storage
Correct
Storage Gateway is designed for ON-PREMISES applications connecting to AWS storage. EC2 instances that need shared file storage should use Amazon EFS (NFS) or Amazon FSx. Using Storage Gateway for EC2-to-EC2 file sharing adds unnecessary latency (on-prem → gateway → S3 → gateway → EC2) and cost. EFS is purpose-built for cloud-native shared file storage.
This confusion appears in SAA-C03 storage selection questions. The trap is that both File Gateway and EFS offer NFS — but EFS is for cloud workloads, File Gateway is for on-premises workloads. Key differentiator: 'Where does the application run?' On-premises = Storage Gateway. In AWS = EFS.
Common Mistake
Tape Gateway provides instant tape retrieval because tapes are 'virtual' and stored in AWS
Correct
Tape Gateway deliberately mirrors physical tape behavior. Active tapes in the VTL are stored in S3 (fast access), but archived tapes (ejected to virtual shelf) go to S3 Glacier or S3 Glacier Deep Archive. Glacier retrieval takes minutes to hours; Deep Archive takes 12-48 hours. 'Virtual' does NOT mean instant — archival tiers have real retrieval delays.
This trips up candidates who assume 'cloud = instant.' The Tape Gateway is designed to be a drop-in replacement for physical tape, including the retrieval delay for archived tapes. Exam questions about 'immediately available tape data' require keeping tapes in the VTL (S3-backed), not archiving them.
Common Mistake
Volume Gateway Stored mode is better than Cached mode because you have a full local copy
Correct
Neither mode is universally 'better' — they serve different use cases. STORED mode is right when you need full local performance and low latency for ALL data, and cloud backup is secondary (DR use case). CACHED mode is right when your dataset exceeds local storage capacity and you only need fast access to your working set (hot data). Stored mode requires enough local disk for your ENTIRE dataset; Cached mode only requires cache for frequently accessed data.
SAA-C03 and SAP-C02 questions present scenarios where candidates must choose the right mode. The key decision factors: total dataset size vs. local disk capacity, and whether primary storage should be on-premises or in AWS. If the dataset exceeds local capacity → Cached. If full local copy is required for performance → Stored.
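The decision factors above (dataset size versus local capacity, and whether a full local copy is required) can be written as a short rule. This is a sketch of the reasoning only; `choose_volume_mode` is a hypothetical helper, and it ignores per-volume size limits and cost.

```python
def choose_volume_mode(dataset_tb: float,
                       local_capacity_tb: float,
                       need_full_local_copy: bool) -> str:
    """Pick 'stored' or 'cached' using the two factors from the text."""
    fits_locally = dataset_tb <= local_capacity_tb
    if need_full_local_copy:
        if not fits_locally:
            # Stored mode needs local disk for the ENTIRE dataset.
            raise ValueError(
                "full local copy requested but dataset exceeds local capacity"
            )
        return "stored"
    # Only the working set needs to be local, so primary data can live in S3.
    return "cached"


print(choose_volume_mode(dataset_tb=100, local_capacity_tb=20,
                         need_full_local_copy=False))
print(choose_volume_mode(dataset_tb=10, local_capacity_tb=20,
                         need_full_local_copy=True))
```

The error branch captures the one combination no mode can satisfy: requiring a full local copy of a dataset that does not fit on local disk.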
Common Mistake
You need to use the AWS Management Console to access data stored through Storage Gateway
Correct
End users access data through standard storage protocols: NFS or SMB for File Gateway, iSCSI for Volume and Tape Gateway. The AWS Management Console is only used by administrators to configure and monitor the gateway — not by end users to access files. Applications and users never interact with the console for data access.
This misconception appears in CLF-C02 questions about access management. The console is for administration, not data access. This also relates to the broader misconception that the console is the IAM/access management service (it is not: IAM is the access management service, and the console is just a UI).
FCST = Four Gateway Types: File (NFS/SMB→S3), Cached Volume (iSCSI, cloud-primary), Stored Volume (iSCSI, on-prem-primary), Tape (VTL→Glacier). 'Friendly Cats Sleep Together.'
STORED = 'Stored Locally, Backed to cloud' | CACHED = 'Cloud Actually Caches Hot Entries Daily' — Stored keeps full copy local; Cached keeps working set local.
File Gateway files = Native S3 Objects = 'What you write through the gateway, you can read through S3' — dual access is the superpower.
Tape Gateway archival = 'Virtual tape, REAL wait' — archived tapes in Glacier have real retrieval delays just like physical tapes.
Gateway = Bridge, not Bucket — the gateway is the connector, AWS storage services are the actual storage. Kill the gateway VM, your data survives in S3/EBS.
CertAI Tutor · CLF-C02, SAA-C03, SAP-C02, DOP-C02 · 2026-02-21