Introduction to Distributed ML Workloads with Ray on Kubernetes – Abdel Sghiouar, Google Cloud
The rapidly evolving landscape of Machine Learning and Large Language Models demands efficient scalable ways to run distributed workloads to train, fine-tune and serve models. Ray is an Open Source framework that simplifies distributed machine learning, and Kubernetes streamlines deployment. In this introductory talk, we’ll uncover how to combine Ray and Kubernetes for your ML projects. You will learn about: – Basic Ray concepts (actors, tasks) and their relevance to ML – Setting up a simple Ray cluster within Kubernetes – Running your first distributed ML training job