Best practices for loading models in Cloud Run → https://goo.gle/4jxBzWX
Episode showing two-tier architecture → https://goo.gle/4jb4y36
Request GPU quota for your project → https://goo.gle/3Rfx9rJ
Simplify hosting the DeepSeek AI model with Cloud Run GPUs. Join Googlers Martin Omander and Lisa Shen as they demonstrate how to deploy and manage Large Language Models (LLMs) on Google Cloud with only three commands. Watch along to discover how Cloud Run and the Ollama command-line tool let developers run AI applications quickly, with on-demand resource allocation and scaling.
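
For reference, a deployment like the one shown in the episode boils down to a few gcloud and Ollama commands. The sketch below is an approximation, not the exact commands from the video: the service name (ollama-deepseek), region, resource sizes, and model tag (deepseek-r1) are assumptions, and the GPU flags may require the beta gcloud track depending on your gcloud version.

# 1) Deploy the public Ollama container to Cloud Run with an NVIDIA L4 GPU
#    (service name, region, and resource sizes are illustrative).
gcloud run deploy ollama-deepseek \
  --image ollama/ollama \
  --region us-central1 \
  --gpu 1 --gpu-type nvidia-l4 \
  --cpu 8 --memory 32Gi \
  --no-cpu-throttling \
  --max-instances 1

# 2) Open an authenticated local proxy to the private service on Ollama's default port.
gcloud run services proxy ollama-deepseek --region us-central1 --port 11434

# 3) Pull and chat with the model through the proxy (model tag is an assumption).
OLLAMA_HOST=http://localhost:11434 ollama run deepseek-r1

Keeping the service private and reaching it through gcloud run services proxy means only authenticated callers can use the GPU-backed endpoint, and Cloud Run scales the instance to zero when it is idle.
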
Watch more Serverless Expeditions → https://goo.gle/ServerlessExpeditions
Subscribe to Google Cloud Tech → https://goo.gle/GoogleCloudTech
#ServerlessExpeditions #GoogleCloud
Speakers: Martin Omander, Lisa Shen
Products Mentioned: Cloud Run, AI Infrastructure