Explore the world of Large Language Models (LLMs) and Small Language Models (SLMs) with insights from NVIDIA's Jay! In this episode, we break down when to use each type of model, discuss the crucial role of guardrails, and dive into optimization techniques like quantization. Learn about the latest hardware (NVIDIA's GeForce RTX 50 Series) and open-source frameworks (NeMo Framework, TensorRT-LLM) empowering developers to build cutting-edge AI applications locally.
Chapters:
0:00 – Introduction: Real Terms with AI & NVIDIA
0:31 – Adding AI – What is the Use Case?
1:20 – Starting with the use case
1:41 – Open Source tools
2:06 – NVIDIA’s NeMo Framework & Hugging Face
2:20 – Small Language Models (SLMs) explained
3:35 – Advantages of SLMs: local runnability & latency
4:18 – When NOT to Use an SLM: accuracy considerations
5:33 – What are guardrails?
6:50 – Trade-offs & the key factors
7:10 – Optimization basics
7:30 – Good Open Source Tools
8:58 – New NVIDIA GeForce RTX 50 Series hardware and its impact
10:08 – AI-powered coding assistance
Subscribe to Google Cloud Tech → https://goo.gle/GoogleCloudTech
#GoogleCloud #GenerativeAI
Speakers: Aja Hammerly, Jason Davenport, Jay Rodge
Products Mentioned: AI Infrastructure, Gemini