At the beginning of 2026, AWS has several related yet distinct components that make up its agentic and LLM abstractions.
- Bedrock is the model layer that enables access to large language models.
- Agents for Bedrock is the managed application layer. In other words, AWS runs the agents for you based on your requirements.
- Bedrock AgentCore is an infrastructure layer that enables AWS to run agents you develop using third-party frameworks such as CrewAI and LangGraph.
Apart from these three services, AWS also has Strands, an open source Python library for building agents outside of the Bedrock service, which can then be deployed on other AWS services such as ECS and Lambda.
It can be confusing because all three agentic-based services have the term “Bedrock” in their names, but in this article, I’ll focus on the standard Bedrock service and show how and why you would use it.
As a service, Bedrock has only been available on AWS since early 2023. That should give you a clue as to why it was introduced. Amazon could clearly see the rise of Large Language Models and their impact on IT architecture and the systems development process. That’s AWS’s meat and potatoes, and they were keen that nobody was going to eat their lunch.
And although AWS has developed a few LLMs of its own, it realised that to stay competitive, it would need to make the very top models, such as those from Anthropic, available to users. And that’s where Bedrock steps in. As they said in their own blurb on their website,
… Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon via a single API, along with a broad set of capabilities you need to build generative AI applications, simplifying development while maintaining privacy and security.
How do I access Bedrock?
Ok, so that’s the theory behind the why of Bedrock, but how do we get access to it and actually use it? Not surprisingly, the first thing you need is an AWS account. I’m going to assume you already have this, but if not, click the following link to set one up.
https://aws.amazon.com/account
Usefully, after you register for a new AWS account, a good number of the services you use will fall under the so-called “free tier” at AWS, which means your costs should be minimal for one year following your account creation – assuming you don’t go crazy and start firing up huge compute servers and the like.
There are three main ways to use AWS services.
- Via the console. If you’re a beginner, this will probably be your preferred route, as it’s the easiest way to get started.
- Via an API. If you’re handy at coding, you can access all of AWS’s services programmatically. For Python programmers, AWS provides the boto3 library, and there are similar libraries for other languages such as JavaScript (see the short sketch after this list).
- Via the command line interface (CLI). The CLI is an additional tool you can download from AWS that lets you interact with AWS services straight from your terminal.
Note that to use the latter two methods, you need login credentials set up on your local system (we’ll cover this shortly).
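To give a flavour of the API route, here’s a minimal boto3 sketch, assuming your credentials and default region are already configured (we set those up below). It’s the Python equivalent of a CLI command we’ll use later to list the foundation models Bedrock offers.
import boto3

# Assumes local AWS credentials and a default region are already configured
bedrock = boto3.client("bedrock")

# Print the IDs of the first few foundation models available in this region
for model in bedrock.list_foundation_models()["modelSummaries"][:5]:
    print(model["modelId"])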
What can I do with Bedrock?
The short answer is that you can do most of the things you can with regular chat models from OpenAI, Anthropic, Google, and so on. Underlying Bedrock are a number of foundation models you can use, such as:
- Kimi K2 Thinking. A deep reasoning model.
- Claude Opus 4.5. To many people, this is the top LLM available to date.
- GPT-OSS. OpenAI’s open-source LLM.
And many, many others besides. For a full list, check out the following link.
https://aws.amazon.com/bedrock/model-choice
How do I use Bedrock?
To use Bedrock, we will use a mix of the AWS CLI and the Python API provided by the boto3 library. Make sure you have the following set up as prerequisites:
- An AWS account.
- The AWS CLI has been downloaded and installed on your system.
- An Identity and Access Management (IAM) user is set up with appropriate permissions and access keys. You can do this via the AWS console.
- Configured your user credentials via the AWS CLI, as shown below. Three pieces of information need to be supplied, all of which you’ll have obtained in the previous step. You’ll be prompted to enter them:
$ aws configure
AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Default region name [None]: us-west-2
Default output format [None]:
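To sanity-check that your credentials are working, you can ask AWS who it thinks you are; this should return your account ID and user ARN.
$ aws sts get-caller-identity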
Giving Bedrock access to a model
Back in the day (a few months ago!), you had to use the AWS management console to request access to particular models from Bedrock, but now access is automatically granted when you invoke a model for the first time.
Note that for Anthropic models, first-time users may need to submit use case details before they can access the model. Also note that access to top models from Anthropic and other providers will incur costs, so please monitor your billing regularly and remove any model access you no longer need.
However, we still need to know the model name we want to use. To get a list of all Bedrock-compatible models, we can use the following AWS CLI command.
aws bedrock list-foundation-models
This will return a JSON result set listing various properties of each model, like this.
{
"modelSummaries": [
{
"modelArn": "arn:aws:bedrock:us-east-2::foundation-model/nvidia.nemotron-nano-12b-v2",
"modelId": "nvidia.nemotron-nano-12b-v2",
"modelName": "NVIDIA Nemotron Nano 12B v2 VL BF16",
"providerName": "NVIDIA",
"inputModalities": [
"TEXT",
"IMAGE"
],
"outputModalities": [
"TEXT"
],
"responseStreamingSupported": true,
"customizationsSupported": [],
"inferenceTypesSupported": [
"ON_DEMAND"
],
"modelLifecycle": {
"status": "ACTIVE"
}
},
{
"modelArn": "arn:aws:bedrock:us-east-2::foundation-model/anthropic.claude-sonnet-4-20250514-v1:0",
...
...
...
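The full listing can be lengthy. If you only want the model IDs, a JMESPath --query filter (a standard AWS CLI option) trims the output down, for example:
aws bedrock list-foundation-models --query "modelSummaries[].modelId" --output text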
Choose the model you need and note its modelId from the JSON output, as we’ll need it in our Python code later. An important caveat: you’ll often see the following in a model description,
...
...
"inferenceTypesSupported": [
"INFERENCE_PROFILE"
]
...
...
This is reserved for models that:
- Are large or in high demand
- Require reserved or managed capacity
- Need explicit cost and throughput controls
For these models, we can’t just reference the modelId in our code. Instead, we need to reference an inference profile. An inference profile is a Bedrock resource that’s bound to one or more foundation models and one or more regions.
There are two ways to obtain an inference profile. The first is to create one yourself; these are called Application Profiles. The second is to use one of AWS’s Supported Profiles. This is the easier option, as it’s pre-built for you, and you just need the relevant Profile ID associated with the inference profile to use in your code.
If you want to take the route of creating your own Application Profile, check out the appropriate AWS documentation, but I’m going to use a Supported Profile in my example code.
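You can also list the inference profiles available in your region straight from the CLI, assuming a reasonably recent CLI version; the output should include each profile’s ID and the models it routes to.
aws bedrock list-inference-profiles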
For a list of Supported Profiles in AWS, check out the link below:
For my first code example, I want to use Claude’s Sonnet 3.5 V2 model, so I looked up its entry in that list and saw the following description.

I took note of the profile ID (us.anthropic.claude-3-5-sonnet-20241022-v2:0) and one of the valid source regions (us-east-1).
For my other two code examples, I’ll use OpenAI’s open-source LLM for text output and AWS’s Titan Image Generator for images. Neither of these models requires an inference profile, so you can reference each by its regular modelId in your code.
NB: Whichever model(s) you choose, make sure your AWS region is set to the correct value for each.
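There are a couple of common ways to pin the region; a small sketch of both (the region shown is just an example):
import boto3

# Option 1: rely on your shell environment or AWS config file
#   export AWS_DEFAULT_REGION=us-east-1
# Option 2: set it explicitly, per client, in code
brt = boto3.client("bedrock-runtime", region_name="us-east-1")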
Setting Up a Development Environment
As we’ll be doing some coding, it’s best to isolate our environment so we don’t interfere with any other projects. So let’s do that now. I’m using Windows and the UV package manager for this, but use whichever tool you’re most comfortable with. My code will run in a Jupyter notebook.
uv init bedrock_demo --python 3.13
cd bedrock_demo
uv add boto3 jupyter
# To run the notebook, type this in
uv run jupyter notebook
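As a quick, optional check that boto3 is available inside the new environment:
uv run python -c "import boto3; print(boto3.__version__)"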
Using Bedrock from Python
Let’s see Bedrock in action with a few examples. The first will be simple, and we’ll gradually increase the complexity as we go.
Example 1: A simple question and answer using an inference profile
This example uses the Claude Sonnet 3.5 V2 model we talked about earlier. As mentioned, to invoke this model, we use a profile ID associated with its inference profile.
import json
import boto3
brt = boto3.client("bedrock-runtime", region_name="us-east-1")
profile_id = "us.anthropic.claude-3-5-sonnet-20241022-v2:0"
body = json.dumps({
"anthropic_version": "bedrock-2023-05-31",
"max_tokens": 200,
"temperature": 0.2,
"messages": [
{
"role": "user",
"content": [
{"type": "text", "text": "What is the capital of France?"}
]
}
]
})
resp = brt.invoke_model(
modelId=profile_id,
body=body,
accept="application/json",
contentType="application/json"
)
data = json.loads(resp["body"].read())
# Claude responses come back as a "content" array, not OpenAI "choices"
print(data["content"][0]["text"])
#
# Output
#
The capital of France is Paris.
Note that invoking this model (and others like it) creates an implied subscription between you and the AWS Marketplace. This is not a recurring charge; it only costs you when the model is actually used, but it’s best to monitor it to avoid unexpected bills. You should receive an email outlining the subscription agreement, with a link to manage and/or cancel any model subscriptions you have set up.
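As an aside, Bedrock also offers the Converse API, which gives you a broadly provider-agnostic request and response shape, so you don’t have to hand-craft model-specific JSON bodies. Here’s a minimal sketch of the same question using the same profile ID:
import boto3

brt = boto3.client("bedrock-runtime", region_name="us-east-1")

# Converse API: provider-agnostic request/response shape
resp = brt.converse(
    modelId="us.anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[{"role": "user", "content": [{"text": "What is the capital of France?"}]}],
    inferenceConfig={"maxTokens": 200, "temperature": 0.2},
)
print(resp["output"]["message"]["content"][0]["text"])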
Example 2: Create an image
This is a simple image-generation example using AWS’s own Titan model. The model is not associated with an inference profile, so we can just reference it using its modelId.
import json
import base64
import boto3
brt_img = boto3.client("bedrock-runtime", region_name="us-east-1")
model_id_img = "amazon.titan-image-generator-v2:0"
prompt = "A hippo riding a bike."
body = json.dumps({
"taskType": "TEXT_IMAGE",
"textToImageParams": {
"text": prompt
},
"imageGenerationConfig": {
"numberOfImages": 1,
"height": 1024,
"width": 1024,
"cfgScale": 7.0,
"seed": 0
}
})
resp = brt_img.invoke_model(
modelId=model_id_img,
body=body,
accept="application/json",
contentType="application/json"
)
data = json.loads(resp["body"].read())
# Titan returns base64-encoded images in the "images" array
img_b64 = data["images"][0]
img_bytes = base64.b64decode(img_b64)
out_path = "titan_output.png"
with open(out_path, "wb") as f:
f.write(img_bytes)
print("Saved:", out_path)
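Since the code is running in a Jupyter notebook, you can display the saved image inline:
from IPython.display import Image

# Render the generated PNG directly in the notebook
Image(filename=out_path)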
On my system, the output image looked like this.

Example 3: A technical support triage assistant using OpenAI’s OSS model
This is a more complex and useful example. Here, we set up an assistant that will take problems reported to it by non-technical users and output additional questions you might want the user to answer, as well as the most likely causes of the issue and what further steps to take. Like our previous example, this model is not associated with an inference profile.
import json
import re
import boto3
from pydantic import BaseModel, Field
from typing import List, Literal, Optional
# ----------------------------
# Bedrock setup
# ----------------------------
REGION = "us-east-2"
MODEL_ID = "openai.gpt-oss-120b-1:0"
brt = boto3.client("bedrock-runtime", region_name=REGION)
# ----------------------------
# Output schema
# ----------------------------
Severity = Literal["low", "medium", "high"]
Category = Literal["account", "billing", "device", "network", "software", "security", "other"]
class TriageResponse(BaseModel):
category: Category
severity: Severity
summary: str = Field(description="One-sentence restatement of the problem.")
likely_causes: List[str] = Field(description="Top plausible causes, concise.")
clarifying_questions: List[str] = Field(description="Ask only what is needed to proceed.")
safe_next_steps: List[str] = Field(description="Step-by-step actions safe for a non-technical user.")
stop_and_escalate_if: List[str] = Field(description="Clear red flags that require a professional/helpdesk.")
recommended_escalation_target: Optional[str] = Field(
default=None,
description="If severity is high, who to contact (e.g., IT admin, bank, ISP)."
)
# ----------------------------
# Helpers
# ----------------------------
def invoke_chat(messages, max_tokens=800, temperature=0.2) -> dict:
body = json.dumps({
"messages": messages,
"max_tokens": max_tokens,
"temperature": temperature
})
resp = brt.invoke_model(
modelId=MODEL_ID,
body=body,
accept="application/json",
contentType="application/json"
)
return json.loads(resp["body"].read())
def extract_content(data: dict) -> str:
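    # GPT-OSS on Bedrock returns an OpenAI-style "choices" array (contrast with Claude's "content" array in Example 1)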
return data["choices"][0]["message"]["content"]
def extract_json_object(text: str) -> dict:
"""
Extract the first JSON object from model output.
    Handles common cases like Markdown code fences or extra text.
"""
    # Strip any Markdown code fences (e.g. ```json ... ```) before locating the JSON object
    text = re.sub(r"```(?:json)?", "", text).strip()
start = text.find("{")
if start == -1:
raise ValueError("No JSON object found.")
depth = 0
for i in range(start, len(text)):
if text[i] == "{":
depth += 1
elif text[i] == "}":
depth -= 1
if depth == 0:
return json.loads(text[start:i+1])
raise ValueError("Unbalanced JSON braces; could not parse.")
# ----------------------------
# The useful function
# ----------------------------
def triage_issue(user_problem: str) -> TriageResponse:
messages = [
{
"role": "system",
"content": (
"You are a careful technical support triage assistant for non-technical users. "
"You must be conservative and safety-first. "
"Return ONLY valid JSON matching the given schema. No extra text."
)
},
{
"role": "user",
"content": f"""
User problem:
{user_problem}
Return JSON that matches this schema:
{TriageResponse.model_json_schema()}
""".strip()
}
]
raw = invoke_chat(messages)
text = extract_content(raw)
parsed = extract_json_object(text)
return TriageResponse.model_validate(parsed)
# ----------------------------
# Example
# ----------------------------
if __name__ == "__main__":
problem = "My laptop is connected to Wi-Fi but websites won't load, and Zoom keeps saying unstable connection."
result = triage_issue(problem)
print(result.model_dump_json(indent=2))
Here is the output.
{
  "category": "network",
"severity": "medium",
"summary": "Laptop shows Wi‑Fi connection but cannot load websites and Zoom
reports an unstable connection.",
"likely_causes": [
"Router or modem malfunction",
"DNS resolution failure",
"Local Wi‑Fi interference or weak signal",
"IP address conflict on the network",
"Firewall or security software blocking traffic",
"ISP outage or throttling"
],
"clarifying_questions": [
"Are other devices on the same Wi‑Fi network able to access the internet?",
"Did the problem start after any recent changes (e.g., new software, OS update, VPN installation)?",
"Have you tried moving closer to the router or using a wired Ethernet connection?",
"Do you see any error codes or messages in the browser or Zoom besides \"unstable connection\"?"
],
"safe_next_steps": [
"Restart the router and modem by unplugging them for 30 seconds, then power them back on.",
"On the laptop, forget the Wi‑Fi network, then reconnect and re-enter the password.",
"Run the built‑in Windows network troubleshooter (Settings → Network & Internet → Status → Network troubleshooter).",
"Disable any VPN or proxy temporarily and test the connection again.",
"Open a command prompt and run `ipconfig /release` followed by `ipconfig /renew`.",
"Flush the DNS cache with `ipconfig /flushdns`.",
"Try accessing a simple website (e.g., http://example.com) and note if it loads.",
"If possible, connect the laptop to the router via Ethernet to see if the issue persists."
],
"stop_and_escalate_if": [
"The laptop still cannot reach any website after completing all steps.",
"Other devices on the same network also cannot access the internet.",
"You receive error messages indicating hardware failure (e.g., Wi‑Fi adapter not found).",
"The router repeatedly restarts or shows error lights.",
"Zoom continues to report a poor or unstable connection despite a working internet test."
],
"recommended_escalation_target": "IT admin"
}
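If the model returns JSON that doesn’t match the schema, model_validate raises a pydantic ValidationError (and extract_json_object raises ValueError if it can’t find a JSON object at all). In practice, you might wrap the call in a simple retry; here’s a minimal, hypothetical sketch:
from pydantic import ValidationError

def triage_with_retry(problem: str, attempts: int = 2) -> TriageResponse:
    # Hypothetical helper: retry if the model's output fails parsing or schema validation
    for attempt in range(1, attempts + 1):
        try:
            return triage_issue(problem)
        except (ValueError, ValidationError) as err:
            if attempt == attempts:
                raise
            print(f"Attempt {attempt} failed ({err}); retrying...")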
Summary
This article introduced AWS Bedrock, AWS’s managed gateway to foundation large language models, explaining why it exists, how it fits into the broader AWS AI stack, and how to use it in practice. We covered model discovery, region and credential setup, and the key distinction between on-demand models and those that require inference profiles – a common source of confusion for developers.
Through practical Python examples, we demonstrated text and image generation using both standard on-demand models and those that require an inference profile.
At its core, Bedrock reflects AWS’s long-standing philosophy: abstract infrastructure complexity without removing control. Rather than pushing a single “best” model, Bedrock treats foundation models as managed infrastructure components – swappable, governable, and region-aware. This suggests a future where Bedrock evolves less as a chat interface and more as a model orchestration layer, tightly integrated with IAM, networking, cost controls, and agent frameworks.
Over time, we might expect Bedrock to move further toward standardised inference contracts (subscriptions) and clearer separation between experimentation and production capacity. And with their Agent and AgentCore services, we are already seeing deeper integration of agentic workflows with Bedrock, positioning models not as products in themselves but as durable building blocks within AWS systems.
For the avoidance of doubt, apart from being a sometime user of their services, I have no connection or affiliation with Amazon Web Services.