How are enterprises re-imagining AI for real-world impact? Chris Wright, Red Hat CTO and SVP Global Engineering, sits down with Brian Stevens, Red Hat SVP and AI CTO, to discuss the journey towards production-quality AI inference at scale. They explore the critical role of open source projects like vLLM, the evolution from CPU to GPU optimization, and the parallels between today’s AI challenges and the early days of enterprise Linux.
00:00:37 – Brian Stevens on returning to Red Hat & parallels with early Linux
00:02:00 – The path from cloud to AI & the impact of ChatGPT
00:03:58 – Pivoting to GPUs & the rise of vLLM for generative AI
00:05:48 – From CPU sparsification to GPU model compression
00:08:00 – Optimizing for modern GPUs with vLLM
00:11:38 – An "AI Operating System"? Integrating vLLM with Kubernetes
00:15:31 – vLLM: A common platform for diverse AI hardware & models
00:17:41 – The importance of distributed KV cache for scalable inference
00:22:53 – Inference-time scaling, reasoning, and platform demands
00:25:10 – Ecosystem & Community: The key to AI’s future
Learn More:
Red Hat AI Solutions: https://www.redhat.com/en/products/ai
vLLM Project: https://docs.vllm.ai/
vLLM GitHub: https://github.com/vllm-project/vllm
Follow us:
Chris Wright (LinkedIn): https://www.linkedin.com/in/chris-wright-b733851/
Brian Stevens (LinkedIn): https://www.linkedin.com/in/brianmarkstevens/
What is Technically Speaking?
Technically Speaking taps into emerging technology trends with insights from leading experts across the globe and Red Hat CTO Chris Wright. The series blends deep-dive discussions, tech updates, and creative short-form content, solidifying Red Hat’s role as a pioneer in technology innovation and open source thought leadership.
Want to participate? Leave us a comment if there’s a topic or a guest you’d like to see featured.
Watch More Technically Speaking:
YouTube playlist: https://www.youtube.com/playlist?list=PLbMP1JcGBmSGfI0Rl4s6PpycLF4rZcfW8
Show Page: https://www.redhat.com/en/technically-speaking
Subscribe to Red Hat’s YouTube channel: https://www.youtube.com/redhat/?sub_confirmation=1
#RedHat #TechnicallySpeaking #AIInference #vLLM #EnterpriseAI #OpenSource #BrianStevens #PracticalAI #llmd