Tutorial: Serve Gemma on GKE with TGI → https://goo.gle/4fFKt2Q
Learn more about Text Generation Inference (TGI) from Hugging Face → https://goo.gle/4e7qusz
Hugging Face Deep Learning Containers for Google Cloud → https://goo.gle/3BPaYUM
Text Generation Inference (TGI) is a toolkit for deploying and serving Large Language Models (LLMs). TGI enables high-performance text generation for the most popular open LLMs. Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Watch along as Googlers Wietse Venema and Mofi Rahman demonstrate how to deploy the 27-billion-parameter Gemma 2 model on Google Kubernetes Engine (GKE) using Hugging Face TGI.
Watch more Google Cloud: Building with Hugging Face → https://goo.gle/BuildWithHuggingFace
Subscribe to Google Cloud Tech → https://goo.gle/GoogleCloudTech
#GoogleCloud #HuggingFace
Speakers: Wietse Venema, Mofi Rahman
Products Mentioned: Gemma, Hugging Face Deep Learning Containers, Google Kubernetes Engine