FP8 quantization slashes the memory footprint of 70B-class open-weight models, maintaining accuracy within 0.
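
To make the memory claim concrete, here is a back-of-the-envelope sketch of weight storage alone. It assumes exactly 70 billion parameters and a 16-bit baseline (FP16 or BF16), neither of which is stated above. The helper name `weight_memory_gb` is hypothetical, and the estimate ignores the KV cache, activations, and the small overhead of per-tensor or per-channel scaling factors.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: a "70B-class" model has roughly 70 billion parameters.
params = 70e9

fp16_gb = weight_memory_gb(params, 16)  # assumed 16-bit baseline
fp8_gb = weight_memory_gb(params, 8)    # FP8 stores one byte per weight

print(f"16-bit weights: {fp16_gb:.0f} GB")
print(f"FP8 weights:    {fp8_gb:.0f} GB")
print(f"Reduction:      {1 - fp8_gb / fp16_gb:.0%}")
```

Under these assumptions the weights shrink from about 140 GB to about 70 GB. Real deployments will differ somewhat depending on which layers are kept in higher precision.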