Qwen3.6 4bit DWQ is now up on MLX. It uses a custom quantization scheme (4-bit MLP, 8-bit everything else) plus DWQ for additional gains. It gets 0.0225 KL divergence against the base model and matches it on PPL, versus 0.0819 KL for a naive 4-bit quant, while adding only 0.25 BPW! https://huggingface.co/mlx-community/Qwen3.6-35B-A3B-4bit-DWQ
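For anyone curious what the mixed-precision part looks like in practice, here is a minimal sketch using mlx_lm's `convert()` with a `quant_predicate`. The base repo name, the `.mlp.` path matching, and the group size are assumptions for illustration, and this omits the DWQ step that the published repo applies on top.

```python
from mlx_lm import convert

def mlp_4bit_rest_8bit(path: str, module, config):
    # Skip modules that can't be quantized (norms, etc.).
    if not hasattr(module, "to_quantized"):
        return False
    # 4-bit for MLP projections, 8-bit for everything else
    # (attention, embeddings, lm_head, ...).
    bits = 4 if ".mlp." in path else 8
    return {"bits": bits, "group_size": 64}

convert(
    hf_path="Qwen/Qwen3.6-35B-A3B",               # assumed base repo name
    mlx_path="qwen3.6-35b-a3b-4bit-mlp-8bit-rest",
    quantize=True,
    quant_predicate=mlp_4bit_rest_8bit,
)
```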
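And a toy sketch of what a KL check against the base model looks like: score the same tokens with both models and average the per-token KL of the next-token distributions. The base repo name and the sample text are placeholders, and this is not the exact evaluation behind the 0.0225 figure (which is measured over a proper corpus); loading the bf16 base also needs a machine with plenty of unified memory.

```python
import mlx.core as mx
from mlx_lm import load

base_model, tokenizer = load("Qwen/Qwen3.6-35B-A3B")            # assumed base repo
quant_model, _ = load("mlx-community/Qwen3.6-35B-A3B-4bit-DWQ")

text = "Quantization trades a little accuracy for a lot of memory."
tokens = mx.array([tokenizer.encode(text)])

def log_probs(model, tokens):
    logits = model(tokens)                                      # (1, seq, vocab)
    return logits - mx.logsumexp(logits, axis=-1, keepdims=True)

lp_base = log_probs(base_model, tokens)
lp_quant = log_probs(quant_model, tokens)

# KL(base || quant), averaged over token positions
kl = (mx.exp(lp_base) * (lp_base - lp_quant)).sum(axis=-1).mean()
print(f"mean per-token KL: {kl.item():.4f}")
```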