
In this final video (part 3) of the series, we shift focus to model inference on the same HyperPod EKS cluster. Discover how the HyperPod Inference Operator simplifies deploying over 400 open-weight foundation models with one-click deployment from JumpStart, S3, or FSx for Lustre, built-in autoscaling using CloudWatch and Prometheus metrics, and deep observability through Grafana dashboards. See how training and inference workloads can coexist efficiently on a shared cluster using priority-based scheduling via Task Governance.
AI on SageMaker HyperPod – our new website and GitHub repo containing Slurm and EKS reference architectures, training and inference code samples, tips and tricks, and setup guides based on 2+ years and way too many cluster deployments to count. Linked here: https://go.aws/4s7T3NL
Learn more:
Amazon SageMaker HyperPod Documentation: https://go.aws/4rvrHRO
GitHub code for the presentation demo: https://go.aws/3MZ7lRY
ML Frameworks team repository: https://go.aws/46ZayaV
Subscribe to AWS: https://go.aws/subscribe
Create a free AWS account: https://go.aws/signup
Try AWS for free: https://go.aws/free
Connect with an expert: https://go.aws/contact
Explore more: https://go.aws/more
Next steps:
Explore on AWS in Analyst Research: https://go.aws/reports
Discover, deploy, and manage software that runs on AWS: https://go.aws/marketplace
Join the AWS Partner Network: https://go.aws/partners
Learn more on how Amazon builds and operates software: https://go.aws/library
Do you have technical AWS questions?
Ask the community of experts on AWS re:Post: https://go.aws/3lPaoPb
Why AWS?
Amazon Web Services is the world’s most comprehensive and broadly adopted cloud, enabling customers to build anything they can imagine. We offer the greatest choice of innovative cloud capabilities and expertise, on the most extensive global infrastructure with industry-leading security, reliability, and performance.
#AWS #AmazonWebServices #CloudComputing #SageMakerHyperPod
#GenAI