[Feature] Wanda++ for LM head pruning #2517

@namgyu-youn

Background

The current Wanda implementation (torchao/sparsity/wanda.py) scores each weight by its magnitude multiplied by the norm of the corresponding input activation: |weight| * ||activation|| (arXiv, fig1)
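For reference, here is a minimal sketch of that criterion for a single linear layer. It is not the code in torchao/sparsity/wanda.py: the function names `wanda_scores` and `prune_per_output` are hypothetical, and comparing scores within each output row is an assumption based on the Wanda paper.

```python
import torch


def wanda_scores(weight: torch.Tensor, activations: torch.Tensor) -> torch.Tensor:
    """Wanda importance score |W_ij| * ||X_j||_2 for a linear layer.

    weight:      (out_features, in_features)
    activations: (num_tokens, in_features) calibration inputs to the layer
    """
    act_norm = activations.norm(p=2, dim=0)  # per-input-channel L2 norm
    return weight.abs() * act_norm           # broadcasts over output rows


def prune_per_output(weight: torch.Tensor, scores: torch.Tensor,
                     sparsity: float = 0.5) -> torch.Tensor:
    """Zero the lowest-scoring fraction of weights within each output row."""
    num_prune = int(weight.shape[1] * sparsity)
    # Column indices of the smallest scores in each row.
    _, idx = torch.topk(scores, num_prune, dim=1, largest=False)
    mask = torch.ones_like(scores, dtype=torch.bool)
    rows = torch.arange(weight.shape[0]).unsqueeze(1)
    mask[rows, idx] = False
    return weight * mask
```

Note that a weight with large magnitude can still be pruned if its input channel carries little activation, which is the point of the criterion.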


Feature : Wanda++

Wanda++ (arXiv) was published recently. Its main differences from Wanda are:

  • Regional Gradient Score (RGS): Adds block-level (regional) gradients to the pruning score instead of relying on weight magnitude and activation norm alone
  • Regional Optimization (RO): Block-level weight fine-tuning after pruning (applied iteratively)
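To make the RGS idea concrete, here is a rough sketch of how such a score might be computed for one linear layer inside a block. It reflects my reading of the paper, not a verified reimplementation: the score form `(alpha * G + ||X||_2) * |W|`, the regional loss (L2 norm of the block output), the RMS aggregation over samples, and the default `alpha` are all assumptions, and `regional_gradient_scores` is a hypothetical name.

```python
import torch
import torch.nn as nn


def regional_gradient_scores(block: nn.Module, linear: nn.Linear, samples,
                             act_norm: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    """RGS-style score (alpha * G + ||X||_2) * |W| for one linear layer.

    block:    module whose output defines the regional loss ||block(x)||_2
    linear:   nn.Linear inside `block` whose weights are being scored
    samples:  iterable of calibration inputs to `block`
    act_norm: (in_features,) L2 norm of the inputs to `linear`, as in Wanda
    """
    grad_sq = torch.zeros_like(linear.weight)
    n = 0
    for x in samples:
        block.zero_grad()
        loss = block(x).norm(p=2)  # regional loss: L2 norm of the block output
        loss.backward()            # backprop only through this block
        grad_sq += linear.weight.grad.detach() ** 2
        n += 1
    grad_mag = (grad_sq / n).sqrt()  # RMS gradient magnitude over samples (assumed)
    return (alpha * grad_mag + act_norm) * linear.weight.detach().abs()
```

With `alpha=0` this reduces to the plain Wanda score, so RGS could plausibly be added as an optional extension of the existing scoring path without the RO step.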

How about extending Wanda to Wanda++? I am not sure I can handle the full RO algorithm, but Table 1 shows significant improvements even with RGS alone.

