Join us for our next vLLM Office Hours on August 28, 2025, at 2:00 PM ET! These bi-weekly sessions are your chance to stay up to date on the latest in the vLLM ecosystem, ask questions, and hear directly from contributors and power users.
This week’s special topic: LLM Compressor Release Update
We’ll start with our bi-weekly vLLM project update from Michael Goin. After that, join Brian Dellabetta and Kyle Sayers from Red Hat’s Model Optimization Team as they walk through what’s new in LLM Compressor, including:
–Transform-based modifiers, a new approach to reducing quantization loss without needing a calibration dataset or significant compute time (see the sketch after this list).
–Non-uniform quantization, giving users more flexibility and control over the precision of each layer of the model.
–Support for block quantization, as found in DeepSeek v3.
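To give a feel for what transform-based quantization looks like in practice, here is a minimal sketch of a oneshot run that pairs a random Hadamard transform with data-free 4-bit weight quantization. It is modeled on the transform examples in the llm-compressor repository; the specific modifier names and arguments shown (QuIPModifier, transform_type, scheme) are assumptions that may differ in the released API, and the model ID is just a placeholder.

```python
# Minimal sketch: transform-based quantization with llm-compressor.
# Assumes the QuIPModifier / QuantizationModifier APIs from recent
# llm-compressor examples; names and arguments may differ by release.
from transformers import AutoModelForCausalLM

from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier
from llmcompressor.modifiers.transform import QuIPModifier

MODEL_ID = "meta-llama/Llama-3.2-1B-Instruct"  # placeholder; swap in your model

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")

# Rotate weights with a random Hadamard transform to smooth out outliers,
# then apply 4-bit weight-only quantization. Because the scheme is data-free,
# no calibration dataset is passed to oneshot.
recipe = [
    QuIPModifier(targets="Linear", transform_type="random-hadamard"),
    QuantizationModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"]),
]

oneshot(model=model, recipe=recipe)
model.save_pretrained(f"{MODEL_ID.split('/')[-1]}-quip-w4a16")
```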
Join us to find out how these features work, how to run them in LLM Compressor, and the next steps for benchmarking transform performance against AWQ, GPTQ, and round-to-nearest on a wide range of models.
Want to join our discussion live on Google Meet? Get a calendar invite by filling out this form: https://red.ht/office-hours