• Latest
  • Trending
eksbedrock arch diagram1 Toolz Guru Build scalable containerized RAG based generative AI applications in AWS using Amazon EKS with Amazon Bedrock

Build scalable containerized RAG based generative AI applications in AWS using Amazon EKS with Amazon Bedrock

May 13, 2025
pixart trainium inferentia 1120x630 Toolz Guru Cost-effective AI image generation with PixArt-Σ inference on AWS Trainium and AWS Inferentia

Cost-effective AI image generation with PixArt-Σ inference on AWS Trainium and AWS Inferentia

May 15, 2025
Social share Device trust.width 1300 Toolz Guru Device Trust from Android Enterprise

Device Trust from Android Enterprise

May 15, 2025

Detecting misbehavior in frontier reasoning models

May 14, 2025
TAS Gemini Across Devices Blog Header.width 1300 Toolz Guru Gemini is coming to watches, cars, TV and XR devices

Gemini is coming to watches, cars, TV and XR devices

May 14, 2025

New tools for building agents

May 14, 2025

Driving growth and ‘WOW’ moments with OpenAI

May 14, 2025

OpenAI’s proposals for the U.S. AI Action Plan

May 14, 2025

The court rejects Elon’s latest attempt to slow OpenAI down

May 14, 2025

New in ChatGPT for Business: March 2025

May 14, 2025

EliseAI improves housing and healthcare efficiency with AI

May 14, 2025

Introducing next-generation audio models in the API

May 14, 2025
TAS Material 3 Expressive Blog Header 1.width 1300 Toolz Guru Google launches Material 3 Expressive redesign for Android, Wear OS devices

Google launches Material 3 Expressive redesign for Android, Wear OS devices

May 14, 2025
Toolz Guru
  • Home
    Social share Device trust.width 1300 Toolz Guru Device Trust from Android Enterprise

    Device Trust from Android Enterprise

    TAS Gemini Across Devices Blog Header.width 1300 Toolz Guru Gemini is coming to watches, cars, TV and XR devices

    Gemini is coming to watches, cars, TV and XR devices

    TAS Material 3 Expressive Blog Header 1.width 1300 Toolz Guru Google launches Material 3 Expressive redesign for Android, Wear OS devices

    Google launches Material 3 Expressive redesign for Android, Wear OS devices

    Googles Geothermal Agreement SS 1920x1080.max 1440x810 Toolz Guru Google’s new model for clean energy approved in Nevada

    Google’s new model for clean energy approved in Nevada

    Superpollutants SS 1920x1080.max 1440x810 Toolz Guru We’re announcing two new partnerships to eliminate superpollutants and help the atmosphere.

    We’re announcing two new partnerships to eliminate superpollutants and help the atmosphere.

    Searchscams SS 1920x1080.max 1440x810 Toolz Guru Google’s new report on fighting scams in search results

    Google’s new report on fighting scams in search results

    AIFF SS.width 1300 Toolz Guru Google’s AI Futures Fund works with AI startups

    Google’s AI Futures Fund works with AI startups

    GFSA AI for Energy demo copy blog banner v24.width 1300 Toolz Guru Google for Startup Accelerator: AI for Energy opens

    Google for Startup Accelerator: AI for Energy opens

  • AI News
  • AI Tools
    • Image Generation
    • Content Creation
    • SEO Tools
    • Digital Tools
    • Language Models
    • Video & Audio
  • Digital Marketing
    • Content Marketing
    • Social Media
    • Search Engine Optimization
  • Reviews
No Result
View All Result
Toolz Guru
  • Home
    Social share Device trust.width 1300 Toolz Guru Device Trust from Android Enterprise

    Device Trust from Android Enterprise

    TAS Gemini Across Devices Blog Header.width 1300 Toolz Guru Gemini is coming to watches, cars, TV and XR devices

    Gemini is coming to watches, cars, TV and XR devices

    TAS Material 3 Expressive Blog Header 1.width 1300 Toolz Guru Google launches Material 3 Expressive redesign for Android, Wear OS devices

    Google launches Material 3 Expressive redesign for Android, Wear OS devices

    Googles Geothermal Agreement SS 1920x1080.max 1440x810 Toolz Guru Google’s new model for clean energy approved in Nevada

    Google’s new model for clean energy approved in Nevada

    Superpollutants SS 1920x1080.max 1440x810 Toolz Guru We’re announcing two new partnerships to eliminate superpollutants and help the atmosphere.

    We’re announcing two new partnerships to eliminate superpollutants and help the atmosphere.

    Searchscams SS 1920x1080.max 1440x810 Toolz Guru Google’s new report on fighting scams in search results

    Google’s new report on fighting scams in search results

    AIFF SS.width 1300 Toolz Guru Google’s AI Futures Fund works with AI startups

    Google’s AI Futures Fund works with AI startups

    GFSA AI for Energy demo copy blog banner v24.width 1300 Toolz Guru Google for Startup Accelerator: AI for Energy opens

    Google for Startup Accelerator: AI for Energy opens

  • AI News
  • AI Tools
    • Image Generation
    • Content Creation
    • SEO Tools
    • Digital Tools
    • Language Models
    • Video & Audio
  • Digital Marketing
    • Content Marketing
    • Social Media
    • Search Engine Optimization
  • Reviews
No Result
View All Result
Toolz Guru
No Result
View All Result
Home SEO Tools

Build scalable containerized RAG based generative AI applications in AWS using Amazon EKS with Amazon Bedrock

by Maxim Makedonsky
May 13, 2025
in SEO Tools
0 0
eksbedrock arch diagram1 Toolz Guru Build scalable containerized RAG based generative AI applications in AWS using Amazon EKS with Amazon Bedrock
Share on FacebookShare on Twitter


Generative artificial intelligence (AI) applications are commonly built using a technique called Retrieval Augmented Generation (RAG) that provides foundation models (FMs) access to additional data they didn’t have during training. This data is used to enrich the generative AI prompt to deliver more context-specific and accurate responses without continuously retraining the FM, while also improving transparency and minimizing hallucinations.

In this post, we demonstrate a solution using Amazon Elastic Kubernetes Service (EKS) with Amazon Bedrock to build scalable and containerized RAG solutions for your generative AI applications on AWS while bringing your unstructured user file data to Amazon Bedrock in a straightforward, fast, and secure way.

Amazon EKS provides a scalable, secure, and cost-efficient environment for building RAG applications with Amazon Bedrock and also enables efficient deployment and monitoring of AI-driven workloads while leveraging Bedrock’s FMs for inference. It enhances performance with optimized compute instances, auto-scales GPU workloads while reducing costs via Amazon EC2 Spot Instances and AWS Fargate and provides enterprise-grade security via native AWS mechanisms such as Amazon VPC networking and AWS IAM.

Our solution uses Amazon S3 as the source of unstructured data and populates an Amazon OpenSearch Serverless vector database via the use of Amazon Bedrock Knowledge Bases with the user’s existing files and folders and associated metadata. This enables a RAG scenario with Amazon Bedrock by enriching the generative AI prompt using Amazon Bedrock APIs with your company-specific data retrieved from the OpenSearch Serverless vector database.

Solution overview

The solution uses Amazon EKS managed node groups to automate the provisioning and lifecycle management of nodes (Amazon EC2 instances) for the Amazon EKS Kubernetes cluster. Every managed node in the cluster is provisioned as part of an Amazon EC2 Auto Scaling group that’s managed for you by EKS.

The EKS cluster consists of a Kubernetes deployment that runs across two Availability Zones for high availability where each node in the deployment hosts multiple replicas of a Bedrock RAG container image registered and pulled from Amazon Elastic Container Registry (ECR). This setup makes sure that resources are used efficiently, scaling up or down based on the demand. The Horizontal Pod Autoscaler (HPA) is set up to further scale the number of pods in our deployment based on their CPU utilization.

The RAG Retrieval Application container uses Bedrock Knowledge Bases APIs and Anthropic’s Claude 3.5 Sonnet LLM hosted on Bedrock to implement a RAG workflow. The solution provides the end user with a scalable endpoint to access the RAG workflow using a Kubernetes service that is fronted by an Amazon Application Load Balancer (ALB) provisioned via an EKS ingress controller.

The RAG Retrieval Application container orchestrated by EKS enables RAG with Amazon Bedrock by enriching the generative AI prompt received from the ALB endpoint with data retrieved from an OpenSearch Serverless index that is synced via Bedrock Knowledge Bases from your company-specific data uploaded to Amazon S3.

The following architecture diagram illustrates the various components of our solution:

eksbedrock arch diagram1 Toolz Guru Build scalable containerized RAG based generative AI applications in AWS using Amazon EKS with Amazon Bedrock

Prerequisites

Complete the following prerequisites:

  1. Ensure model access in Amazon Bedrock. In this solution, we use Anthropic’s Claude 3.5 Sonnet on Amazon Bedrock.
  2. Install the AWS Command Line Interface (AWS CLI).
  3. Install Docker.
  4. Install Kubectl.
  5. Install Terraform.

Deploy the solution

The solution is available for download on the GitHub repo. Cloning the repository and using the Terraform template will provision the components with their required configurations:

  1. Clone the Git repository:
    sudo yum install -y unzip
    git clone https://github.com/aws-samples/genai-bedrock-serverless.git
    cd eksbedrock/terraform

  2. From the terraform folder, deploy the solution using Terraform:
    terraform init
    terraform apply -auto-approve

Configure EKS

  1. Configure a secret for the ECR registry:
    aws ecr get-login-password 
    --region  | docker login 
    --username AWS 
    --password-stdin .dkr.ecr..amazonaws.com/bedrockragrepodocker pull .dkr.ecr..amazonaws.com/bedrockragrepo:latestaws eks update-kubeconfig 
    --region  
    --name eksbedrockkubectl create secret docker-registry ecr-secret  
    --docker-server=.dkr.ecr..amazonaws.com 
    --docker-username=AWS 
    --docker-password=$(aws ecr get-login-password --region )

  2. Navigate to the kubernetes/ingress folder:
    • Make sure that the AWS_Region variable in the bedrockragconfigmap.yaml file points to your AWS region.
    • Replace the image URI in line 20 of the bedrockragdeployment.yaml file with the image URI of your bedrockrag image from your ECR repository.
  3. Provision the EKS deployment, service and ingress:
    cd ..
    kubectl apply -f ingress/

Create a knowledge base and upload data

To create a knowledge base and upload data, follow these steps:

  1. Create an S3 bucket and upload your data into the bucket. In our blog post, we uploaded these two files, Amazon Bedrock User Guide and the Amazon FSx for ONTAP User Guide, into our S3 bucket.
  2. Create an Amazon Bedrock knowledge base. Follow the steps here to create a knowledge base. Accept all the defaults including using the Quick create a new vector store option in Step 7 of the instructions that creates an Amazon OpenSearch Serverless vector search collection as your knowledge base.
    1. In Step 5c of the instructions to create a knowledge base, provide the S3 URI of the object containing the files for the data source for the knowledge base
    2. Once the knowledge base is provisioned, obtain the Knowledge Base ID from the Bedrock Knowledge Bases console for your newly created knowledge base.

Query using the Application Load Balancer

You can query the model directly using the API front end provided by the AWS ALB provisioned by the Kubernetes (EKS) Ingress Controller. Navigate to the AWS ALB console and obtain the DNS name for your ALB to use as your API:

curl -X POST "/query" \

-H "Content-Type: application/json" \

-d '{"prompt": "What is a bedrock knowledgebase?", "kbId": ""}'

Cleanup

To avoid recurring charges, clean up your account after trying the solution:

  1. From the terraform folder, delete the Terraform template for the solution:
    terraform apply --destroy 
  2. Delete the Amazon Bedrock knowledge base. From the Amazon Bedrock console, select the knowledge base you created in this solution, select Delete, and follow the steps to delete the knowledge base.

Conclusion

In this post, we demonstrated a solution that uses Amazon EKS with Amazon Bedrock and provides you with a framework to build your own containerized, automated, scalable, and highly available RAG-based generative AI applications on AWS. Using Amazon S3 and Amazon Bedrock Knowledge Bases, our solution automates bringing your unstructured user file data to Amazon Bedrock within the containerized framework. You can use the approach demonstrated in this solution to automate and containerize your AI-driven workloads while using Amazon Bedrock FMs for inference with built-in efficient deployment, scalability, and availability from a Kubernetes-based containerized deployment.

For more information about how to get started building with Amazon Bedrock and EKS for RAG scenarios, refer to the following resources:


About the Authors

bacchus Toolz Guru Build scalable containerized RAG based generative AI applications in AWS using Amazon EKS with Amazon BedrockKanishk Mahajan is Principal, Solutions Architecture at AWS. He leads cloud transformation and solution architecture for AWS customers and partners. Kanishk specializes in containers, cloud operations, migrations and modernizations, AI/ML, resilience and security and compliance. He is a Technical Field Community (TFC) member in each of those domains at AWS.

batchus Toolz Guru Build scalable containerized RAG based generative AI applications in AWS using Amazon EKS with Amazon BedrockSandeep Batchu is a Senior Security Architect at Amazon Web Services, with extensive experience in software engineering, solutions architecture, and cybersecurity. Passionate about bridging business outcomes with technological innovation, Sandeep guides customers through their cloud journey, helping them design and implement secure, scalable, flexible, and resilient cloud architectures.

Related Post

Introducing OpenAI o3 and o4-mini

May 12, 2025
1739544704 maxresdefault Toolz Guru Top 10 Free AI Tools for 2025: Maximize Efficiency & Innovation

Top 10 Free AI Tools for 2025: Maximize Efficiency & Innovation

February 14, 2025



Source link

Donation

Buy author a coffee

Donate
Maxim Makedonsky

Maxim Makedonsky

  • ChatGPT

    The Rise of the Content Creator: How to Build Your Brand in the Digital Age

    36 shares
    Share 14 Tweet 9
  • Grok AI Upgrade

    27 shares
    Share 11 Tweet 7
  • Junia AI: Content Generation & SEO Tools

    26 shares
    Share 10 Tweet 7
  • Boost Your WordPress Speed: Quick Tips!

    25 shares
    Share 10 Tweet 6
  • Cool Tech Gifts for Your Valentine

    23 shares
    Share 9 Tweet 6
pixart trainium inferentia 1120x630 Toolz Guru Cost-effective AI image generation with PixArt-Σ inference on AWS Trainium and AWS Inferentia

Cost-effective AI image generation with PixArt-Σ inference on AWS Trainium and AWS Inferentia

by Maxim Makedonsky
May 15, 2025
0

PixArt-Sigma is a diffusion transformer model that is capable of image generation at 4k resolution. This model shows significant improvements...

Social share Device trust.width 1300 Toolz Guru Device Trust from Android Enterprise

Device Trust from Android Enterprise

by Maxim Makedonsky
May 15, 2025
0

Integrated security, all in one viewMobile security has often been treated as a silo, separate from endpoint and identity security....

Detecting misbehavior in frontier reasoning models

by Maxim Makedonsky
May 14, 2025
0

Frontier reasoning models exploit loopholes when given the chance. We show we can detect exploits using an LLM to monitor...

TAS Gemini Across Devices Blog Header.width 1300 Toolz Guru Gemini is coming to watches, cars, TV and XR devices

Gemini is coming to watches, cars, TV and XR devices

by Maxim Makedonsky
May 14, 2025
0

Make your drive more productive and enjoyable, hands-freeHands-free voice commands with Google Assistant have always been at the core of...

No Content Available
Facebook Twitter Instagram Youtube
Currently Playing

Recent Posts

  • Cost-effective AI image generation with PixArt-Σ inference on AWS Trainium and AWS Inferentia
  • Device Trust from Android Enterprise
  • Detecting misbehavior in frontier reasoning models

Categories

  • AI News
  • AI News Feeds
  • AI Tools
  • Blogging Tips
  • Business
  • ChatGPT
  • Content Markeeting
  • Digital
  • Digital Marketing
  • Digital Tools
  • Image Generation
  • Language Models
  • Productivity
  • Prompts
  • Reviews
  • Search Engine Optimization
  • SEO Tools
  • Social Media
  • Technology
  • Video & Audio
  • Videos

2025 by Toolz Guru

Welcome Back!

Login to your account below

Forgotten Password? Sign Up

Create New Account!

Fill the forms bellow to register

All fields are required. Log In

Retrieve your password

Please enter your username or email address to reset your password.

Log In

Add New Playlist

No Result
View All Result
  • Home

2025 by Toolz Guru

This website uses cookies. By continuing to use this website you are giving consent to cookies being used. Visit our Privacy and Cookie Policy.
Go to mobile version