[Bug]: PyTorch 2.7 compile results in bad state_dict keys and policy fails to load #2137

@Raa23

🐛 Bug

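A model whose policy has been wrapped with th.compile saves without error, but loading it with PPO.load fails: every key in the saved state_dict carries an "_orig_mod." prefix, so none of them match the keys that ActorCriticPolicy expects.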

Related issue: #1438

To Reproduce

import gymnasium as gym  # SB3 2.x targets Gymnasium rather than the legacy gym package
import torch as th
from stable_baselines3 import PPO

env = gym.make("Pendulum-v1")

# Baseline: save and load an uncompiled model
model = PPO("MlpPolicy", env, verbose=1)
model.save("ppo_pendulum_uncompiled")
del model  # drop the in-memory model so the load below starts from the file

# Success
model = PPO.load("ppo_pendulum_uncompiled")
del model

# Same steps, but with the policy wrapped by torch.compile
model = PPO("MlpPolicy", env, verbose=1)
model.policy = th.compile(model.policy)
model.save("ppo_pendulum_compiled")
del model

# Fail: raises the state_dict key error shown below
model = PPO.load("ppo_pendulum_compiled")
del model
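
A possible workaround on the saving side, sketched under the assumption that the object returned by th.compile exposes the wrapped module as an `_orig_mod` attribute (which the `_orig_mod.` prefix in the error message suggests). `unwrap_compiled` is a hypothetical helper, not part of Stable-Baselines3 or PyTorch:

```python
def unwrap_compiled(module):
    """Return the wrapped module if `module` looks like a compiled wrapper.

    Assumption: a compiled wrapper stores the original module as `_orig_mod`.
    Objects without that attribute are returned unchanged.
    """
    return getattr(module, "_orig_mod", module)


# Intended use (not executed here), just before saving:
#   model.policy = unwrap_compiled(model.policy)
#   model.save(path)
```

If the assumption holds, the saved state_dict would contain the plain key names again; the policy would need to be re-compiled after loading.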

Relevant log output / Error message

Error(s) in loading state_dict for ActorCriticPolicy:
	Missing key(s) in state_dict: "log_std", "mlp_extractor.policy_net.0.weight", "mlp_extractor.policy_net.0.bias", "mlp_extractor.policy_net.2.weight", "mlp_extractor.policy_net.2.bias", "mlp_extractor.value_net.0.weight", "mlp_extractor.value_net.0.bias", "mlp_extractor.value_net.2.weight", "mlp_extractor.value_net.2.bias", "action_net.weight", "action_net.bias", "value_net.weight", "value_net.bias". 
	Unexpected key(s) in state_dict: "_orig_mod.log_std", "_orig_mod.mlp_extractor.policy_net.0.weight", "_orig_mod.mlp_extractor.policy_net.0.bias", "_orig_mod.mlp_extractor.policy_net.2.weight", "_orig_mod.mlp_extractor.policy_net.2.bias", "_orig_mod.mlp_extractor.value_net.0.weight", "_orig_mod.mlp_extractor.value_net.0.bias", "_orig_mod.mlp_extractor.value_net.2.weight", "_orig_mod.mlp_extractor.value_net.2.bias", "_orig_mod.action_net.weight", "_orig_mod.action_net.bias", "_orig_mod.value_net.weight", "_orig_mod.value_net.bias".
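
The two key lists above differ only by the `_orig_mod.` prefix, so an already saved state_dict could in principle be repaired by renaming its keys. The sketch below works on a plain dict with placeholder values; `strip_compile_prefix` is a hypothetical helper, and how to extract the state_dict from the saved archive is not shown:

```python
PREFIX = "_orig_mod."


def strip_compile_prefix(state_dict, prefix=PREFIX):
    """Return a new dict with `prefix` removed from the start of every key that has it."""
    return {key.removeprefix(prefix): value for key, value in state_dict.items()}


# Placeholder values stand in for tensors.
saved = {
    "_orig_mod.log_std": 0.0,
    "_orig_mod.action_net.weight": 1.0,
    "value_net.bias": 2.0,  # keys without the prefix are left unchanged
}
cleaned = strip_compile_prefix(saved)
print(sorted(cleaned))  # ['action_net.weight', 'log_std', 'value_net.bias']
```

The cleaned mapping would then have the key names that an uncompiled ActorCriticPolicy expects when loading a state_dict. `str.removeprefix` requires Python 3.9 or later.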

System Info

  • OS: Windows-11-10.0.26100-SP0 10.0.26100
  • Python: 3.13.2
  • Stable-Baselines3: 2.6.0
  • PyTorch: 2.7.0+cu128
  • GPU Enabled: True
  • Numpy: 2.2.5
  • Cloudpickle: 3.1.1
  • Gymnasium: 1.1.1
  • OpenAI Gym: 0.26.2

Checklist

  • My issue does not relate to a custom gym environment. (Use the custom gym env template instead)
  • I have checked that there is no similar issue in the repo
  • I have read the documentation
  • I have provided a minimal and working example to reproduce the bug
  • I've used the markdown code blocks for both code and stack traces.

Labels: bug, documentation, help wanted
