Fix dwq: check for actual safetensors in target_dir #1173

Merged
angeloskath merged 2 commits into ml-explore:main from micuentadecasa:lc/fix-dwq-target-dir-empty-check
Apr 21, 2026

@micuentadecasa
Contributor

Bug Description

mlx_lm.dwq silently skips target computation when --target-dir exists but is empty.
When the directory exists but contains no .safetensors files, the model is not loaded (model = None)
and target computation is skipped; mx.load() then crashes at runtime with:

RuntimeError: [load_safetensors] Failed to open file .../train/0000000000.safetensors                                     

This happens when a user creates the target directory ahead of time, or when a previous run
was interrupted before writing targets.

Closes #1159

Root Cause

In mlx_lm/quant/dwq.py line 317, the has_targets check uses target_dir.exists():

has_targets = target_dir.exists()                                                                                         

This returns True for any existing directory, regardless of whether it contains target files.
The subsequent code then skips both model loading (lines 334-335: model = None) and target
computation (line 338), and tries to load nonexistent files at line 356.
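
The root cause can be reproduced without mlx_lm at all. The following is a minimal sketch, assuming only the standard library; `old_has_targets`, `demo_empty_dir`, and the `targets` directory name are hypothetical names used for illustration, not identifiers from dwq.py.

```python
import tempfile
from pathlib import Path


def old_has_targets(target_dir: Path) -> bool:
    # The pre-fix expression quoted above: true for any existing path.
    return target_dir.exists()


def demo_empty_dir() -> tuple[bool, list[Path]]:
    with tempfile.TemporaryDirectory() as tmp:
        target_dir = Path(tmp) / "targets"  # hypothetical location
        target_dir.mkdir()  # created ahead of time, nothing written yet
        # Globbing a missing train/ subdirectory simply yields no paths.
        train_files = sorted((target_dir / "train").glob("*.safetensors"))
        return old_has_targets(target_dir), train_files


has_targets, train_files = demo_empty_dir()
print(has_targets, train_files)  # True []
```

The old check reports targets as present even though there is no file for the later load step to open.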

Fix

  • Replace target_dir.exists() with a check that .safetensors files actually exist
    in both the train/ and valid/ subdirectories
  • Before: has_targets = target_dir.exists()
  • After:
    has_targets = (                                                                                                         
        target_dir.is_dir()                                                                                                 
        and any((target_dir / "train").glob("*.safetensors"))                                                               
        and any((target_dir / "valid").glob("*.safetensors"))                                                               
    )                                                                                                                       
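
To see how the new expression behaves as files appear, here is a small sketch that wraps it in a helper and runs it against a temporary directory. The helper name `has_cached_targets` is hypothetical; the file name reuses the one from the error message above, and the files are empty placeholders created with `touch()`, since the check only looks at names.

```python
import tempfile
from pathlib import Path


def has_cached_targets(target_dir: Path) -> bool:
    # Hypothetical wrapper around the expression from the fix.
    # any() stops at the first match, so the directories are not fully listed.
    return (
        target_dir.is_dir()
        and any((target_dir / "train").glob("*.safetensors"))
        and any((target_dir / "valid").glob("*.safetensors"))
    )


with tempfile.TemporaryDirectory() as tmp:
    target_dir = Path(tmp)
    empty = has_cached_targets(target_dir)  # directory exists, no files

    for split in ("train", "valid"):
        (target_dir / split).mkdir()
    (target_dir / "train" / "0000000000.safetensors").touch()
    train_only = has_cached_targets(target_dir)  # one split still missing

    (target_dir / "valid" / "0000000000.safetensors").touch()
    both = has_cached_targets(target_dir)  # both splits have a file

print(empty, train_only, both)  # False False True
```

Note that the check only requires at least one file per split. It does not verify that a set of targets is complete.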

Validation

6 new unit tests in tests/test_dwq.py:

  • Empty existing dir → has_targets = False (was True)
  • train/ only → has_targets = False (was True)
  • valid/ only → has_targets = False (was True)
  • Both splits present → has_targets = True (unchanged)
  • Nonexistent dir → has_targets = False (unchanged)
  • Non-safetensors files → has_targets = False (was True)

$ python -m unittest tests.test_dwq -v                                                                                    
test_empty_dir_not_treated_as_has_targets ... ok                                                                          
test_dir_with_train_only_not_treated_as_has_targets ... ok                                                                
test_dir_with_valid_only_not_treated_as_has_targets ... ok                                                                
test_dir_with_both_splits_treated_as_has_targets ... ok                                                                   
test_nonexistent_dir_not_treated_as_has_targets ... ok                                                                    
test_dir_with_non_safetensors_files_not_treated_as_has_targets ... ok                                                     
                                                                                                                          
Ran 6 tests in 0.004s OK                                                                                                  
                                                                                                                          
---                                                                                                                       
                                                                                                                          
**Branch name:** `lc/fix-dwq-target-dir-empty-check`                                                                      
                                                                                                                          
**Files to include in the PR:**                                                                                           
- `mlx_lm/quant/dwq.py` (the fix)                                                                                         
- `tests/test_dwq.py` (new, 6 unit tests)                                                                                 
- `.gitignore` (add `test_temp/`)     
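
The six cases listed under Validation can be sketched as a table-driven unittest. This is not the content of tests/test_dwq.py, which is not shown in this PR. It is an illustrative reconstruction that exercises a local copy of the fixed expression rather than importing anything from mlx_lm, and the file names in `CASES` are made up.

```python
import tempfile
import unittest
from pathlib import Path


def has_targets(target_dir: Path) -> bool:
    # Local copy of the fixed expression; not imported from mlx_lm.
    return (
        target_dir.is_dir()
        and any((target_dir / "train").glob("*.safetensors"))
        and any((target_dir / "valid").glob("*.safetensors"))
    )


# (case name, relative files to create, expected result)
CASES = [
    ("empty dir", [], False),
    ("train only", ["train/0.safetensors"], False),
    ("valid only", ["valid/0.safetensors"], False),
    ("both splits", ["train/0.safetensors", "valid/0.safetensors"], True),
    ("non-safetensors files", ["train/notes.txt", "valid/notes.txt"], False),
]


class HasTargetsTest(unittest.TestCase):
    def test_existing_dir_cases(self):
        for name, files, expected in CASES:
            with self.subTest(name), tempfile.TemporaryDirectory() as tmp:
                root = Path(tmp)
                for rel in files:
                    path = root / rel
                    path.parent.mkdir(parents=True, exist_ok=True)
                    path.touch()  # empty placeholder file
                self.assertEqual(has_targets(root), expected)

    def test_nonexistent_dir(self):
        with tempfile.TemporaryDirectory() as tmp:
            self.assertFalse(has_targets(Path(tmp) / "missing"))


if __name__ == "__main__":
    unittest.main(argv=["has_targets_sketch"], exit=False)
```

Because the sketch tests a standalone copy of the expression, it documents the intended behavior but would not catch a regression in dwq.py itself.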

Member

@angeloskath angeloskath left a comment

Looks good, thanks!

I removed the test as it wasn't really testing the dwq code.

@angeloskath angeloskath merged commit f39cb8e into ml-explore:main Apr 21, 2026
2 checks passed