
RuntimeError: Expected to mark a variable ready only once when training with PEFT #4782

@qgallouedec

Description


Reproduction

from datasets import load_dataset
from trl import DPOTrainer
from peft import LoraConfig

if __name__ == "__main__":
    # Small preference dataset used for TRL's internal tests
    dataset = load_dataset("trl-internal-testing/zen", "standard_preference", split="train")

    # DPO training with a default LoRA adapter attached via peft_config
    trainer = DPOTrainer(
        model="Qwen/Qwen2.5-0.5B",
        train_dataset=dataset,
        peft_config=LoraConfig(),
    )
    trainer.train()
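Note: the traceback below comes from a two-process distributed run (see the [rank0]/[rank1] prefixes). The exact launch command is not shown and is an assumption, but something along the lines of accelerate launch --num_processes 2 bug_dpo_peft.py matches the setup.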

Running the script outputs:

/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/torch/utils/checkpoint.py:85: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn(
/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/torch/utils/checkpoint.py:85: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn(
[rank1]: Traceback (most recent call last):
[rank1]:   File "/fsx/qgallouedec/trl/bug_dpo_peft.py", line 13, in <module>
[rank1]:     trainer.train()
[rank1]:   File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/transformers/trainer.py", line 2325, in train
[rank1]:     return inner_training_loop(
[rank1]:            ^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/transformers/trainer.py", line 2674, in _inner_training_loop
[rank1]:     tr_loss_step = self.training_step(model, inputs, num_items_in_batch)
[rank1]:                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/transformers/trainer.py", line 4071, in training_step
[rank1]:     self.accelerator.backward(loss, **kwargs)
[rank1]:   File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/accelerate/accelerator.py", line 2852, in backward
[rank1]:     loss.backward(**kwargs)
[rank1]:   File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/torch/_tensor.py", line 647, in backward
[rank1]:     torch.autograd.backward(
[rank1]:   File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/torch/autograd/__init__.py", line 354, in backward
[rank1]:     _engine_run_backward(
[rank1]:   File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/torch/autograd/graph.py", line 829, in _engine_run_backward
[rank1]:     return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/torch/autograd/function.py", line 311, in apply
[rank1]:     return user_fn(self, *args)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/torch/utils/checkpoint.py", line 319, in backward
[rank1]:     torch.autograd.backward(outputs_with_grad, args_with_grad)
[rank1]:   File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/torch/autograd/__init__.py", line 354, in backward
[rank1]:     _engine_run_backward(
[rank1]:   File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/torch/autograd/graph.py", line 829, in _engine_run_backward
[rank1]:     return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: RuntimeError: Expected to mark a variable ready only once. This error is caused by one of the following reasons: 1) Use of a module parameter outside the `forward` function. Please make sure model parameters are not shared across multiple concurrent forward-backward passes. or try to use _set_static_graph() as a workaround if this module graph does not change during training loop.2) Reused parameters in multiple reentrant backward passes. For example, if you use multiple `checkpoint` functions to wrap the same part of your model, it would result in the same set of parameters been used by different reentrant backward passes multiple times, and hence marking a variable ready multiple times. DDP does not support such use cases in default. You can try to use _set_static_graph() as a workaround if your module graph does not change over iterations.
[rank1]: Parameter at index 95 with name base_model.model.model.layers.23.self_attn.v_proj.lora_B.default.weight has been marked as ready twice. This means that multiple autograd engine  hooks have fired for this particular parameter during this iteration.
[rank0]: Traceback (most recent call last):
[rank0]:   File "/fsx/qgallouedec/trl/bug_dpo_peft.py", line 13, in <module>
[rank0]:     trainer.train()
[rank0]:   File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/transformers/trainer.py", line 2325, in train
[rank0]:     return inner_training_loop(
[rank0]:            ^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/transformers/trainer.py", line 2674, in _inner_training_loop
[rank0]:     tr_loss_step = self.training_step(model, inputs, num_items_in_batch)
[rank0]:                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/transformers/trainer.py", line 4071, in training_step
[rank0]:     self.accelerator.backward(loss, **kwargs)
[rank0]:   File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/accelerate/accelerator.py", line 2852, in backward
[rank0]:     loss.backward(**kwargs)
[rank0]:   File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/torch/_tensor.py", line 647, in backward
[rank0]:     torch.autograd.backward(
[rank0]:   File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/torch/autograd/__init__.py", line 354, in backward
[rank0]:     _engine_run_backward(
[rank0]:   File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/torch/autograd/graph.py", line 829, in _engine_run_backward
[rank0]:     return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/torch/autograd/function.py", line 311, in apply
[rank0]:     return user_fn(self, *args)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/torch/utils/checkpoint.py", line 319, in backward
[rank0]:     torch.autograd.backward(outputs_with_grad, args_with_grad)
[rank0]:   File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/torch/autograd/__init__.py", line 354, in backward
[rank0]:     _engine_run_backward(
[rank0]:   File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.12/site-packages/torch/autograd/graph.py", line 829, in _engine_run_backward
[rank0]:     return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: RuntimeError: Expected to mark a variable ready only once. This error is caused by one of the following reasons: 1) Use of a module parameter outside the `forward` function. Please make sure model parameters are not shared across multiple concurrent forward-backward passes. or try to use _set_static_graph() as a workaround if this module graph does not change during training loop.2) Reused parameters in multiple reentrant backward passes. For example, if you use multiple `checkpoint` functions to wrap the same part of your model, it would result in the same set of parameters been used by different reentrant backward passes multiple times, and hence marking a variable ready multiple times. DDP does not support such use cases in default. You can try to use _set_static_graph() as a workaround if your module graph does not change over iterations.
[rank0]: Parameter at index 95 with name base_model.model.model.layers.23.self_attn.v_proj.lora_B.default.weight has been marked as ready twice. This means that multiple autograd engine  hooks have fired for this particular parameter during this iteration.
[rank0]:[W107 14:35:53.391642689 ProcessGroupNCCL.cpp:1538] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
  0%|          | 0/6 [00:02<?, ?it/s] 
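As a possible workaround, and only as an assumption based on the error text (which points at reused parameters in reentrant backward passes from activation checkpointing under DDP), switching to the non-reentrant gradient-checkpointing implementation sometimes avoids the duplicate autograd hooks. Below is a minimal sketch, assuming DPOConfig forwards gradient_checkpointing_kwargs the same way transformers.TrainingArguments does; it is not a verified fix for this issue, and the output_dir value is a placeholder.

from datasets import load_dataset
from trl import DPOConfig, DPOTrainer
from peft import LoraConfig

if __name__ == "__main__":
    dataset = load_dataset("trl-internal-testing/zen", "standard_preference", split="train")

    # Hypothetical workaround: keep gradient checkpointing enabled but use the
    # non-reentrant implementation, which the RuntimeError above suggests may
    # avoid marking the same LoRA parameter ready twice under DDP.
    training_args = DPOConfig(
        output_dir="dpo-qwen2.5-0.5b-lora",  # placeholder output path
        gradient_checkpointing=True,
        gradient_checkpointing_kwargs={"use_reentrant": False},
    )

    trainer = DPOTrainer(
        model="Qwen/Qwen2.5-0.5B",
        args=training_args,
        train_dataset=dataset,
        peft_config=LoraConfig(),
    )
    trainer.train()

The error message also mentions _set_static_graph() as a DDP-level workaround, but that requires reaching into the wrapped DistributedDataParallel module, so it is not shown here.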

System Info

  • Platform: Linux-5.15.0-1048-aws-x86_64-with-glibc2.31
  • Python version: 3.12.12
  • TRL version: 0.27.0.dev0+db868c5
  • PyTorch version: 2.8.0
  • accelerator(s): NVIDIA H100 80GB HBM3, NVIDIA H100 80GB HBM3
  • Transformers version: 4.57.3
  • Accelerate version: 1.12.0
  • Accelerate config: not found
  • Datasets version: 4.4.2
  • HF Hub version: 0.36.0
  • bitsandbytes version: 0.49.0
  • DeepSpeed version: 0.18.3
  • Liger-Kernel version: 0.6.4
  • LLM-Blender version: 0.0.2
  • OpenAI version: 2.8.1
  • PEFT version: 0.18.0
  • vLLM version: 0.10.2

Checklist

  • I have checked that my issue isn't already filed (see open issues)
  • I have included my system information
  • Any code provided is minimal, complete, and reproducible (more on MREs)
  • Any code provided is properly formatted in code blocks, (no screenshot, more on code blocks)
  • Any traceback provided is complete
