BitsAndBytes
BitsAndBytes support is provided by the out-of-tree vllm-bnb-plugin.
Install the plugin before using BitsAndBytes quantization.
The plugin registers the bitsandbytes quantization method and bitsandbytes load format through vLLM's general plugin system, so existing usage stays the same after installation.
The plugin supports both in-flight 4-bit quantization of unquantized checkpoints and loading of pre-quantized 4-bit and 8-bit checkpoints. Refer to the plugin README for the current installation matrix and examples.
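The two modes differ only in the arguments passed to vLLM. The sketch below is illustrative and not the plugin's documented API: `bnb_engine_kwargs` and `load_bnb_model` are hypothetical helpers, and it assumes that in-flight quantization is requested with `quantization="bitsandbytes"`, while a pre-quantized checkpoint is detected from its own quantization config. Some vLLM versions may also need `load_format="bitsandbytes"`, so check the plugin README for your version.

```python
def bnb_engine_kwargs(model: str, prequantized: bool = False) -> dict:
    """Build keyword arguments for vllm.LLM (hypothetical helper).

    model:        Hugging Face model ID or local path.
    prequantized: True if the checkpoint already stores bitsandbytes weights.
    """
    kwargs = {"model": model}
    if not prequantized:
        # In-flight mode: quantize full-precision weights to 4-bit while loading.
        kwargs["quantization"] = "bitsandbytes"
    # Pre-quantized mode: assumed to be picked up from the checkpoint config,
    # so no quantization argument is added.
    return kwargs


def load_bnb_model(model: str, prequantized: bool = False):
    """Create a vllm.LLM instance; needs vLLM and the plugin installed."""
    from vllm import LLM  # imported lazily so the helper above works without vLLM

    return LLM(**bnb_engine_kwargs(model, prequantized))


# Usage (model IDs are placeholders, not recommendations):
#   llm = load_bnb_model("org/full-precision-model")
#   llm = load_bnb_model("org/model-bnb-4bit", prequantized=True)
```

Keeping the argument construction separate from the `LLM` call makes the difference between the two modes easy to inspect without a GPU.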