
Scaling AI with Google Cloud’s TPUs


Dive deep into Google Cloud’s Tensor Processing Units (TPUs) and discover how to scale your AI workloads efficiently. This video breaks down the specialized TPU architecture, including Matrix Multiply Units (MXUs), High Bandwidth Memory (HBM), and SparseCores, designed for high-performance AI training and inference. Explore the TPU cloud architecture, from individual chips to massive pods and multislice configurations, showcasing how Google builds scalable, purpose-built infrastructure to meet the demands of advanced AI.
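The pod and multislice idea above can be pictured as sharding one big matrix multiply across many chips. The following is a toy NumPy sketch of that pattern (not actual TPU code; the chip count, shapes, and function name are invented for illustration):

```python
import numpy as np

def sharded_matmul(a, b, num_chips=4):
    """Toy stand-in for data-parallel execution: split `a` row-wise
    across `num_chips` workers, multiply each shard by `b`, then
    concatenate the partial results."""
    shards = np.array_split(a, num_chips, axis=0)
    partials = [s @ b for s in shards]        # each "chip" computes its slice
    return np.concatenate(partials, axis=0)   # gather results back together

a = np.random.default_rng(0).standard_normal((8, 16))
b = np.random.default_rng(1).standard_normal((16, 4))
# The sharded result matches a single-device matmul.
assert np.allclose(sharded_matmul(a, b), a @ b)
```

On real hardware, frameworks such as JAX or PyTorch/XLA handle this sharding and gathering automatically, with the inter-chip interconnect carrying the communication between chips.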

Chapters:
0:00 – Introduction
0:26 – TPUs explained
1:11 – High Bandwidth Memory (HBM) for fast data access
1:50 – SparseCores for sparse datasets
2:18 – Inter-chip Interconnect (ICI) and resiliency
3:44 – Multislice: Scaling beyond the pod
4:06 – Evolution of TPU versions
4:40 – Frameworks: PyTorch with XLA, vLLM, and JAX
5:19 – Conclusion: Building scalable AI

Resources:
Managed Lustre product overview → https://goo.gle/48a2bdw
Optimize AI and ML workloads with Cloud Storage FUSE → http://goo.gle/ra-gcs-fuse

Subscribe to Google Cloud Tech → https://goo.gle/GoogleCloudTech

#GoogleCloud #AIFrameworks #TPU #MXU #HBM

Speakers: Don McCasland
Products Mentioned: AI Infrastructure, Tensor Processing Units, PyTorch, SparseCores

Date: November 13, 2025