
Fully managed access to leading foundation models — build, customize, and deploy generative AI without managing infrastructure
Amazon Bedrock is a fully managed service that provides API-based access to high-performing foundation models (FMs) from Amazon and leading AI companies such as Anthropic, Meta, Mistral, Cohere, and Stability AI. It enables developers to build and scale generative AI applications using features like Agents, Knowledge Bases, Guardrails, and fine-tuning — all without provisioning or managing any ML infrastructure. Bedrock integrates natively with AWS security services to ensure enterprise-grade data privacy, with your data never used to train the underlying models.
Enable enterprises to rapidly build production-grade generative AI applications using best-in-class foundation models via a single, secure, serverless API — with built-in customization, retrieval-augmented generation (RAG), orchestration, and responsible AI controls
On-Demand Inference (pay-per-token)
Default mode; no commitment; throughput not guaranteed under load
Provisioned Throughput
Guaranteed throughput; required for fine-tuned models; 1-month or 6-month commitment
Batch Inference
Process large datasets asynchronously at lower cost; results stored in S3
Knowledge Bases (RAG)
Managed RAG pipeline with automatic embedding, vector storage, and retrieval
Bedrock Agents
LLM-driven autonomous orchestration using ReAct framework; integrates with Lambda, APIs, Knowledge Bases
Guardrails
Content filtering, topic denial, PII redaction, grounding checks — applied bidirectionally
Fine-Tuning (Supervised)
Supported for select models only; requires labeled training data in S3; the resulting custom model requires Provisioned Throughput for inference
Continued Pre-training
Unlabeled domain data; adapts model knowledge base without task-specific labels
Model Evaluation
Automatic metrics and human review workflows; compare multiple FMs side-by-side
Watermark Detection (Amazon Titan Image)
Invisible watermarks embedded in Titan-generated images; detectable via API
Cross-Region Inference
Routes requests across Regions for higher throughput and resilience; uses inference profiles
VPC Endpoints (PrivateLink)
Private connectivity without traversing public internet; required for strict compliance workloads
AWS PrivateLink support
Interface VPC endpoints available for Bedrock runtime and management APIs
Encryption at Rest (Customer Managed Keys)
KMS CMKs supported for fine-tuned model artifacts and Knowledge Base data
CloudTrail API Logging
ALL Bedrock API calls logged to CloudTrail — critical for compliance and auditing
CloudWatch Metrics and Logs
Invocation metrics, latency, token counts; model invocation logging to S3 or CloudWatch Logs
IAM Resource-Based Policies
Fine-grained access control per model ARN; supports cross-account model sharing via resource policies
Streaming Responses
InvokeModelWithResponseStream API for token-by-token streaming; improves perceived latency
Converse API
Unified multi-turn conversation API that works across all supported models with a consistent interface
Prompt Management
Store, version, and share prompt templates across teams via Bedrock Prompt Management
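The Converse API and streaming entries above can be sketched with boto3. This is a minimal sketch: the model ID is an illustrative assumption, and a real call needs AWS credentials, so the payload builder is kept SDK-free and the client is passed in.

```python
def build_converse_request(system_text, history, user_text):
    """Assemble a Converse API payload: a system prompt plus prior
    user/assistant turns, ending with the new user message."""
    messages = list(history) + [{"role": "user", "content": [{"text": user_text}]}]
    return {
        "system": [{"text": system_text}],
        "messages": messages,
        "inferenceConfig": {"maxTokens": 512, "temperature": 0.2},
    }

def ask(client, model_id, question, history=()):
    """Send one turn via the model-agnostic Converse API and return the
    assistant's text. `client` is a boto3 'bedrock-runtime' client, e.g.
    boto3.client('bedrock-runtime'); model_id like
    'anthropic.claude-3-haiku-20240307-v1:0' is an assumption."""
    req = build_converse_request("You are a concise assistant.", list(history), question)
    resp = client.converse(modelId=model_id, **req)
    return resp["output"]["message"]["content"][0]["text"]
```

For token-by-token streaming, `client.converse_stream` accepts the same arguments and yields event chunks instead of a single response — that is what improves perceived latency.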
Managed RAG Pipeline via Knowledge Bases
(high freq) Store source documents (PDF, DOCX, TXT, HTML) in S3 → Bedrock Knowledge Base automatically chunks, embeds, and indexes into OpenSearch Serverless (or another supported vector store) → at query time, Bedrock retrieves relevant chunks and augments the FM prompt with grounded context. Greatly reduces hallucination for enterprise knowledge queries.
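The managed RAG query step above maps to a single `retrieve_and_generate` call on the `bedrock-agent-runtime` client. A minimal sketch — the knowledge base ID and model ARN are placeholders you supply, and the pure request builder is separated so it can be checked without credentials:

```python
def build_rag_request(query, kb_id, model_arn):
    """Build a RetrieveAndGenerate request: Bedrock retrieves relevant
    chunks from the Knowledge Base and grounds the FM answer in them."""
    return {
        "input": {"text": query},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn,
            },
        },
    }

def query_knowledge_base(query, kb_id, model_arn, region="us-east-1"):
    """Call the managed RAG endpoint (requires AWS credentials)."""
    import boto3  # imported lazily so the builder above stays SDK-free
    client = boto3.client("bedrock-agent-runtime", region_name=region)
    resp = client.retrieve_and_generate(**build_rag_request(query, kb_id, model_arn))
    return resp["output"]["text"]
```

Note the division of labor this makes concrete: S3 is only the ingestion source; at query time the call hits the vector store behind the Knowledge Base, never S3.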
Serverless GenAI API Backend
(high freq) API Gateway receives user requests → Lambda invokes Bedrock InvokeModel or Converse API → response returned to client. Lambda handles prompt construction, session state management, and post-processing. Fully serverless, scales to zero.
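The Lambda piece of that pattern can be sketched as an API Gateway proxy-integration handler. The model ID is an assumption, and the Bedrock client is injectable so the handler can be exercised without AWS access:

```python
import json

def handler(event, context, bedrock_client=None):
    """Minimal Lambda handler for an API Gateway proxy integration:
    parse the user prompt from the request body, call Bedrock Converse,
    return a JSON response. bedrock_client is injectable for testing."""
    if bedrock_client is None:
        import boto3  # created lazily; Lambda provides credentials at runtime
        bedrock_client = boto3.client("bedrock-runtime")
    body = json.loads(event.get("body") or "{}")
    prompt = body.get("prompt", "")
    resp = bedrock_client.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",  # assumed model ID
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    answer = resp["output"]["message"]["content"][0]["text"]
    return {"statusCode": 200, "body": json.dumps({"answer": answer})}
```

In production you would add session state (e.g. prior turns loaded from DynamoDB) to the `messages` list before the call.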
Compliance Auditing and Observability
(high freq) CloudTrail captures every Bedrock API call (who called which model, when, with what parameters) for compliance auditing. CloudWatch collects invocation metrics (latency, token counts, errors) and model invocation logs (full prompt/response) for operational monitoring. These serve DIFFERENT purposes — CloudTrail for audit, CloudWatch for operations.
Autonomous AI Agent with Action Groups
(high freq) Bedrock Agent uses FM reasoning to determine which action to take → calls Action Group (Lambda function) → Lambda queries DynamoDB or external APIs → returns results to agent → agent synthesizes final response. Enables multi-step task automation without hardcoded workflows.
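The Action Group Lambda in that flow receives the agent's chosen API path and must reply in the response envelope the agent expects. A sketch — the `/orders/status` path and the canned result are hypothetical, and the envelope fields follow the documented action-group shape (`messageVersion`, `response`, `responseBody`), so verify against the current Agents docs:

```python
import json

def lambda_handler(event, context):
    """Action Group handler sketch: the agent passes apiPath/httpMethod;
    dispatch on the path, then wrap the result in the response envelope
    the agent parses to continue its ReAct loop."""
    api_path = event.get("apiPath", "")
    if api_path == "/orders/status":       # hypothetical path from the API schema
        result = {"status": "shipped"}     # a real handler would query DynamoDB here
    else:
        result = {"error": f"unknown path {api_path}"}
    return {
        "messageVersion": "1.0",
        "response": {
            "actionGroup": event.get("actionGroup"),
            "apiPath": api_path,
            "httpMethod": event.get("httpMethod"),
            "httpStatusCode": 200,
            "responseBody": {"application/json": {"body": json.dumps(result)}},
        },
    }
```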
Pre/Post Processing NLP Pipeline
(high freq) Use Amazon Comprehend for deterministic NLP tasks (entity extraction, sentiment analysis, language detection, PII identification) as pre-processing before sending to Bedrock FM, or as post-processing to validate/structure FM outputs. Comprehend is cheaper and faster for tasks it handles natively.
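A concrete pre-processing step: detect PII with Comprehend and redact it before the text ever reaches a Bedrock FM. The redaction helper is pure (it consumes Comprehend's `Type`/`BeginOffset`/`EndOffset` entity shape), and the AWS call is isolated in its own function:

```python
def redact_pii(text, entities):
    """Replace each detected PII span with its [TYPE] label.
    Process spans right-to-left so earlier offsets stay valid
    as the string shrinks or grows."""
    for ent in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        text = text[:ent["BeginOffset"]] + f"[{ent['Type']}]" + text[ent["EndOffset"]:]
    return text

def detect_and_redact(text, region="us-east-1"):
    """Run Comprehend PII detection, then redact — the cleaned text
    is what you pass to the Bedrock FM (requires AWS credentials)."""
    import boto3
    comprehend = boto3.client("comprehend", region_name=region)
    resp = comprehend.detect_pii_entities(Text=text, LanguageCode="en")
    return redact_pii(text, resp["Entities"])
```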
Document Intelligence Pipeline
(high freq) Textract extracts structured text, tables, and forms from scanned PDFs/images stored in S3 → extracted text passed to Bedrock FM for summarization, Q&A, or classification. Textract handles OCR/layout understanding; Bedrock handles language reasoning.
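The Textract-to-Bedrock handoff can be sketched as: OCR the document, keep only LINE blocks, join them into prompt-ready text. The synchronous `detect_document_text` call shown suits single-page documents; multi-page PDFs use the asynchronous Start/Get APIs instead. Bucket and key are placeholders you supply:

```python
def lines_from_textract(response):
    """Pull plain text out of a Textract DetectDocumentText response:
    keep only LINE blocks, in the order Textract returns them."""
    return [b["Text"] for b in response.get("Blocks", []) if b["BlockType"] == "LINE"]

def extract_text_for_bedrock(bucket, key, region="us-east-1"):
    """OCR a document stored in S3 with Textract, then join the lines
    into one string ready to embed in a Bedrock prompt."""
    import boto3
    textract = boto3.client("textract", region_name=region)
    resp = textract.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    return "\n".join(lines_from_textract(resp))
```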
Custom Model Training + Bedrock Deployment
(medium freq) Train custom models or perform advanced fine-tuning in SageMaker (full control over training infrastructure) → import resulting model artifacts into Bedrock for managed serverless inference. Combines SageMaker's training flexibility with Bedrock's operational simplicity.
Secure Third-Party API Key Management for Agents
(medium freq) Bedrock Agents that call external APIs (e.g., Salesforce, weather APIs) need credentials. Store API keys in Secrets Manager with automatic rotation → Lambda Action Group retrieves secrets at runtime. NEVER hardcode credentials in agent instructions or Lambda code.
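The runtime retrieval step is one `get_secret_value` call. A sketch — the JSON secret shape with an `api_key` field is an assumption (your secret may be a bare string or use different keys), and the client is injectable for testing:

```python
import json

def get_api_key(secret_id, client=None):
    """Fetch a third-party API key from Secrets Manager at runtime,
    never from hardcoded Lambda code or agent instructions."""
    if client is None:
        import boto3  # Lambda's execution role supplies credentials
        client = boto3.client("secretsmanager")
    resp = client.get_secret_value(SecretId=secret_id)
    return json.loads(resp["SecretString"])["api_key"]  # assumed JSON secret shape
```

Because the key is fetched on each (cold) invocation rather than baked in, Secrets Manager's automatic rotation takes effect without redeploying the Lambda.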
Enterprise Search + Generative AI
(medium freq) Amazon Kendra indexes enterprise content (SharePoint, Confluence, S3) with ML-powered search → Bedrock FM generates natural language answers grounded in Kendra search results. Kendra handles structured enterprise search; Bedrock handles natural language generation.
Deterministic Orchestration of GenAI Workflows
(medium freq) Use Step Functions for complex, multi-stage workflows requiring deterministic branching, error handling, and retry logic — with individual steps invoking Bedrock for specific FM tasks. Contrast with Bedrock Agents (LLM-driven, non-deterministic). Step Functions = predictable; Agents = autonomous.
CloudTrail logs ALL Bedrock API calls for AUDITING (who called what model, when) — CloudWatch Logs captures model INVOCATION CONTENT (prompts and responses) for operational monitoring. These serve completely different purposes. Exam questions about compliance auditing → CloudTrail. Questions about debugging FM outputs → CloudWatch/model invocation logging.
Fine-tuned models in Bedrock CANNOT be invoked via on-demand pricing — they REQUIRE Provisioned Throughput. If an exam scenario describes deploying a fine-tuned model to production, the answer must include Provisioned Throughput, not on-demand inference.
Bedrock Guardrails perform CONTENT FILTERING (blocking harmful, toxic, or off-topic content) — they do NOT detect statistical bias, fairness issues, or demographic disparities in model outputs. Bias detection requires model evaluation frameworks, external fairness tooling, or human review workflows.
In RAG architecture with Bedrock Knowledge Bases: S3 is the DATA SOURCE (where documents are stored), NOT the vector store. The vector store (where embeddings live) is OpenSearch Serverless, Aurora pgvector, Redis, Pinecone, or MongoDB Atlas. Exam questions frequently test this distinction.
AWS-managed KMS keys for Bedrock CANNOT be shared across AWS accounts. For cross-account model access or shared encryption, you must use Customer Managed Keys (CMKs) with appropriate key policies. This is a common trap in enterprise architecture questions.
CloudTrail = compliance audit trail for WHO called WHAT model WHEN. CloudWatch = operational monitoring of HOW models perform. Never swap these — compliance questions → CloudTrail, operational debugging → CloudWatch.
Fine-tuned models REQUIRE Provisioned Throughput — on-demand inference is impossible for custom model variants. Any exam scenario about deploying a fine-tuned model must include Provisioned Throughput in the solution.
Guardrails = content safety filtering (harmful content, PII, off-topic). NOT bias detection. Bias requires model evaluation or human review. Content policy enforcement ≠ algorithmic fairness.
Bedrock Agents use a ReAct (Reasoning + Acting) framework — the FM reasons about what to do, acts by calling tools/APIs, observes results, and iterates. This is fundamentally different from Step Functions (deterministic state machine). Choose Agents when you need autonomous, adaptive task completion; Step Functions when you need predictable, auditable workflows.
Few-shot prompting (providing 2-5 examples in the prompt) significantly outperforms zero-shot prompting for complex analytical, classification, or structured output tasks. Exam scenarios asking how to improve FM accuracy without fine-tuning → answer is few-shot or chain-of-thought prompting, NOT zero-shot.
Secrets Manager (NOT Parameter Store) is the correct answer for storing API keys used by Bedrock Agents when AUTOMATIC ROTATION is required. Parameter Store supports rotation only with custom Lambda rotation functions and is not designed for third-party API credential rotation. Exam questions with 'automatic rotation' → Secrets Manager.
Bedrock's data privacy guarantee: your prompts and completions are NEVER used to train or improve Amazon's or any third-party's base models. This is a key differentiator for enterprise adoption and frequently appears in exam questions about data governance and privacy.
Use Amazon Comprehend for DETERMINISTIC NLP tasks (entity recognition, sentiment, PII detection, language detection) and Bedrock FMs for GENERATIVE or REASONING tasks. Comprehend is cheaper, faster, and more consistent for structured NLP. Exam questions testing cost optimization or task appropriateness often hinge on this distinction.
Bedrock supports BATCH INFERENCE for large-scale asynchronous workloads at lower cost — input data from S3, results written back to S3. Choose batch inference when latency is not critical and you're processing thousands of records. On-demand = real-time; Batch = async + cheaper.
The Converse API provides a UNIFIED interface for multi-turn conversations across ALL supported Bedrock models — you don't need model-specific prompt formatting. This simplifies code when switching between models and is the recommended approach for chatbot/assistant applications.
Amazon Titan Image Generator embeds INVISIBLE WATERMARKS in generated images by default — these watermarks can be detected via the Bedrock DetectGeneratedContent API. This is a responsible AI feature for provenance tracking and is tested in the Responsible AI domain of AIF-C01.
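The batch inference point above hinges on the input format: a JSONL file in S3 where each line pairs a `recordId` with a model-native `modelInput` body. A sketch of building those lines — the Anthropic messages body shown is one model family's format and other models differ, so treat the field names as assumptions to verify:

```python
import json

def build_batch_records(prompts, max_tokens=256):
    """Turn a list of prompts into Bedrock batch-inference JSONL lines.
    Each line carries a recordId (echoed back in the S3 results so you
    can join outputs to inputs) plus the model-native request body."""
    lines = []
    for i, prompt in enumerate(prompts):
        record = {
            "recordId": f"rec-{i:06d}",
            "modelInput": {  # Anthropic-style body; other models differ
                "anthropic_version": "bedrock-2023-05-31",
                "max_tokens": max_tokens,
                "messages": [
                    {"role": "user", "content": [{"type": "text", "text": prompt}]}
                ],
            },
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)
```

You would upload this file to S3 and start the job via the Bedrock control-plane API; results land back in S3 as JSONL keyed by the same `recordId`.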
Common Mistake
CloudWatch Logs is the right service to audit WHO called which Bedrock model and WHEN for compliance reporting
Correct
AWS CloudTrail is the authoritative audit trail for ALL Bedrock API calls — it records the caller identity (IAM principal), source IP, timestamp, model ID, and API action. CloudWatch Logs captures model invocation CONTENT (prompts and responses) when model invocation logging is enabled, but this is for operational debugging, not compliance auditing.
This is the #1 most common trap in Bedrock exam questions. Remember: CloudTrail = WHO did WHAT (audit log) | CloudWatch = WHAT happened operationally (metrics + content logs). Compliance questions → CloudTrail. Debugging FM quality → CloudWatch.
Common Mistake
Bedrock Guardrails detect and prevent AI bias, ensuring fair and unbiased model outputs
Correct
Guardrails perform CONTENT FILTERING — blocking harmful content categories (hate speech, violence, sexual content), denying off-topic subjects, redacting PII, and detecting prompt injection attacks. They do NOT evaluate statistical bias, demographic fairness, or model accuracy disparities. Bias detection requires model evaluation, human review (A2I), or external fairness frameworks.
Content safety ≠ Algorithmic fairness. Guardrails are a runtime safety net for content policy enforcement. Bias is a model quality and training data problem that requires evaluation methodologies, not runtime filters.
Common Mistake
Zero-shot prompting is sufficient for complex analytical tasks like multi-step reasoning, structured data extraction, or nuanced classification
Correct
Zero-shot prompting (no examples) works for simple, well-defined tasks but consistently underperforms for complex analytical work. Few-shot prompting (2-5 examples in context), chain-of-thought prompting (asking the model to reason step-by-step), or fine-tuning are required for complex tasks. Exam scenarios asking how to improve FM accuracy → few-shot or CoT first, fine-tuning as last resort.
The exam tests understanding of the prompt engineering hierarchy: Zero-shot → Few-shot → Chain-of-Thought → Fine-tuning. Each step increases complexity and cost but improves performance for harder tasks.
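The few-shot step of that ladder is just deliberate prompt assembly. A minimal sketch — the `Input:`/`Output:` labels are a common convention, not a Bedrock requirement, and the resulting string is what you would place in a Converse user message:

```python
def few_shot_prompt(instruction, examples, query):
    """Assemble a few-shot prompt: the task instruction, 2-5 worked
    (input, output) examples, then the new input left open for the
    model to complete."""
    parts = [instruction, ""]
    for example_in, example_out in examples:
        parts += [f"Input: {example_in}", f"Output: {example_out}", ""]
    parts += [f"Input: {query}", "Output:"]
    return "\n".join(parts)
```

Adding "Think step by step before answering" to the instruction converts this into the chain-of-thought variant, the next rung of the ladder.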
Common Mistake
AWS-managed KMS keys can be shared across AWS accounts to encrypt Bedrock fine-tuned model artifacts
Correct
AWS-managed KMS keys are account-specific and CANNOT be shared across accounts or have their key policies modified. For cross-account encryption or sharing encrypted Bedrock resources (fine-tuned model artifacts, Knowledge Base data), you MUST use Customer Managed Keys (CMKs) with a key policy that grants cross-account access.
AWS-managed keys = AWS controls, account-scoped, no sharing. CMKs = you control, shareable via key policy. Any exam question mentioning 'cross-account' + 'encryption' → CMK is always the answer.
Common Mistake
S3 is the vector store for Bedrock Knowledge Bases — documents are stored and searched directly in S3
Correct
S3 is the DATA SOURCE for Knowledge Bases (where your raw documents live). After ingestion, Bedrock chunks the documents, generates embeddings using an embedding model, and stores the vector embeddings in a VECTOR DATABASE (Amazon OpenSearch Serverless, Aurora pgvector, Redis Enterprise Cloud, Pinecone, or MongoDB Atlas). S3 is never queried at retrieval time — the vector store is.
RAG architecture has two distinct storage layers: S3 (raw documents, write-once) and vector store (embeddings, queried at runtime). Confusing these leads to wrong answers about where to optimize retrieval performance.
Common Mistake
Fine-tuned Bedrock models can be invoked using standard on-demand pricing like base models
Correct
Fine-tuned (custom) models in Bedrock REQUIRE Provisioned Throughput to invoke — you must purchase at least 1 Model Unit with a 1-month or 6-month commitment before you can call your fine-tuned model. On-demand inference is only available for base foundation models in the Bedrock model catalog.
This trips up candidates who assume fine-tuning is just a training step after which the model works like any other. The Provisioned Throughput requirement has significant cost implications and is a critical architectural consideration.
Common Mistake
AWS Parameter Store is equivalent to Secrets Manager for storing API keys used by Bedrock Agents when rotation is needed
Correct
AWS Secrets Manager is the correct service when AUTOMATIC ROTATION of credentials is required. Secrets Manager has built-in rotation support for AWS services and supports custom Lambda rotation functions for third-party APIs. Parameter Store SecureString supports encryption but does NOT have native automatic rotation — rotation requires custom implementation. Exam keyword 'automatic rotation' → always Secrets Manager.
Both services store secrets securely, but the rotation requirement is the differentiator. Parameter Store is cheaper for static secrets; Secrets Manager is required for automatically rotating credentials used by Bedrock Agent Action Groups calling external APIs.
Common Mistake
Bedrock Agents and AWS Step Functions are interchangeable for orchestrating multi-step AI workflows
Correct
Bedrock Agents use LLM-driven, non-deterministic orchestration (ReAct framework) — the FM autonomously decides which tools to call and in what order based on reasoning. Step Functions is a deterministic state machine where YOU define every state, transition, and branch. Use Agents when tasks are open-ended and require adaptive reasoning; use Step Functions when workflows are predictable, auditable, and require guaranteed execution order.
Agents = autonomous + adaptive (LLM decides). Step Functions = deterministic + auditable (developer decides). Mixing these up leads to wrong architecture answers. A compliance-heavy workflow requiring audit trails → Step Functions. An open-ended customer service task → Bedrock Agents.
GRAFFITI for Bedrock features: Guardrails, RAG (Knowledge Bases), Agents, Fine-tuning, Foundation models, Invocation logging, Titan models, Inference (on-demand/provisioned)
CloudTrail = 'Trail of breadcrumbs WHO walked WHERE' (audit) | CloudWatch = 'Watch WHAT is happening now' (operations) — never swap these for Bedrock compliance questions
RAG storage layers: 'S3 = Source documents (raw)' | 'Vector DB = Vectorized embeddings (searchable)' — S3 IN, Vector DB OUT at query time
Fine-tune → Must Provision: 'You FINE-tune your PROVISION-al athlete before the big game' — fine-tuned models always need Provisioned Throughput
Prompt engineering ladder (bottom to top, increasing power): Zero-shot → Few-shot → Chain-of-Thought → Fine-tuning → Continued Pre-training
Secrets Manager vs Parameter Store: 'ROTATE with Secrets Manager, STORE static with Parameter Store' — rotation keyword → Secrets Manager, always
CertAI Tutor · AIF-C01 · 2026-02-22