[Bug]: PyTorch 2.7 compile results in bad state_dict keys and policy fails to load

### 🐛 Bug

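A model whose policy was wrapped with `th.compile` saves without error but cannot be loaded again: the saved `state_dict` keys carry an `_orig_mod.` prefix, so `PPO.load` fails with missing and unexpected key errors.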

Issue linked to:
https://github.com/DLR-RM/stable-baselines3/issues/1438

### To Reproduce

```python
import gymnasium as gym
import torch as th
from stable_baselines3 import PPO

env = gym.make("Pendulum-v1")

# Baseline: save and load an uncompiled model
model = PPO("MlpPolicy", env, verbose=1)
model.save("ppo_pendulum_uncompiled")
del model  # drop the in-memory model so the load below starts from disk

# Success
model = PPO.load("ppo_pendulum_uncompiled")
del model

# Same steps, but with the policy wrapped by torch.compile before saving
model = PPO("MlpPolicy", env, verbose=1)
model.policy = th.compile(model.policy)  # Compile the model
model.save("ppo_pendulum_compiled")
del model

# Fail: saved keys are prefixed with "_orig_mod."
model = PPO.load("ppo_pendulum_compiled")
del model
```
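A possible workaround on the saving side, as a minimal sketch: it assumes the wrapper returned by `th.compile` exposes the original module as its `_orig_mod` attribute, which is what the `_orig_mod.` key prefix suggests. The helper name `unwrap_compiled` is hypothetical and not part of Stable-Baselines3 or PyTorch, and I have not verified it against every SB3 algorithm.

```python
def unwrap_compiled(module):
    """Return the wrapped original module if `module` looks like a
    torch.compile wrapper (assumed to expose `_orig_mod`), else `module`."""
    return getattr(module, "_orig_mod", module)


# Intended use (assumption, mirrors the reproduction script):
#   model.policy = th.compile(model.policy)
#   ...train...
#   model.policy = unwrap_compiled(model.policy)  # restore plain policy
#   model.save("ppo_pendulum_compiled")           # keys saved without prefix
```

Unwrapping right before `model.save(...)` should keep the saved keys identical to those of an uncompiled policy, so the file stays loadable by code that never calls `th.compile`.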


### Relevant log output / Error message

```shell
Error(s) in loading state_dict for ActorCriticPolicy:
	Missing key(s) in state_dict: "log_std", "mlp_extractor.policy_net.0.weight", "mlp_extractor.policy_net.0.bias", "mlp_extractor.policy_net.2.weight", "mlp_extractor.policy_net.2.bias", "mlp_extractor.value_net.0.weight", "mlp_extractor.value_net.0.bias", "mlp_extractor.value_net.2.weight", "mlp_extractor.value_net.2.bias", "action_net.weight", "action_net.bias", "value_net.weight", "value_net.bias". 
	Unexpected key(s) in state_dict: "_orig_mod.log_std", "_orig_mod.mlp_extractor.policy_net.0.weight", "_orig_mod.mlp_extractor.policy_net.0.bias", "_orig_mod.mlp_extractor.policy_net.2.weight", "_orig_mod.mlp_extractor.policy_net.2.bias", "_orig_mod.mlp_extractor.value_net.0.weight", "_orig_mod.mlp_extractor.value_net.0.bias", "_orig_mod.mlp_extractor.value_net.2.weight", "_orig_mod.mlp_extractor.value_net.2.bias", "_orig_mod.action_net.weight", "_orig_mod.action_net.bias", "_orig_mod.value_net.weight", "_orig_mod.value_net.bias".
```
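Comparing the two key lists above, every unexpected key is a missing key with `_orig_mod.` prepended. For a checkpoint that was already saved this way, the keys could in principle be renamed before loading. The sketch below only shows the renaming step on a plain dictionary; `strip_compile_prefix` is a hypothetical helper, and how to feed the result back into `PPO.load` is left open.

```python
COMPILE_PREFIX = "_orig_mod."


def strip_compile_prefix(state_dict, prefix=COMPILE_PREFIX):
    """Return a new dict with `prefix` removed from the start of each key.

    Keys that do not start with `prefix` are copied unchanged.
    """
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Example with two key names taken from the error message (values are dummies)
saved = {"_orig_mod.log_std": 0, "_orig_mod.action_net.weight": 1}
cleaned = strip_compile_prefix(saved)
print(sorted(cleaned))  # ['action_net.weight', 'log_std']
```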

### System Info

- OS: Windows-11-10.0.26100-SP0 10.0.26100
- Python: 3.13.2
- Stable-Baselines3: 2.6.0
- PyTorch: 2.7.0+cu128
- GPU Enabled: True
- Numpy: 2.2.5
- Cloudpickle: 3.1.1
- Gymnasium: 1.1.1
- OpenAI Gym: 0.26.2

### Checklist

- [x] My issue does not relate to a custom gym environment. (Use the custom gym env template instead)
- [x] I have checked that there is no similar [issue](https://github.com/DLR-RM/stable-baselines3/issues) in the repo
- [x] I have read the [documentation](https://stable-baselines3.readthedocs.io/en/master/)
- [x] I have provided a [minimal and working](https://github.com/DLR-RM/stable-baselines3/issues/982#issuecomment-1197044014) example to reproduce the bug
- [x] I've used the [markdown code blocks](https://help.github.com/en/articles/creating-and-highlighting-code-blocks) for both code and stack traces.
