Update hyper params and set seeds #3384

splion-360 · 2025-06-04T23:49:45Z

Fixes #3080

Description

This PR updates the hyperparameters for the CartPole-v1 environment in the DQN tutorial so that training better matches the results shown in the reference image (only the tutorial file is modified).
A fixed seed has been added to ensure reproducibility of training behavior and evaluation outcomes.

Checklist

The issue being fixed is referenced in the description (see above "Fixes [Improve] - Pytorch Reinforcement DQN Tutorial #3080")
Only one issue is addressed in this pull request
Labels from the issue that this PR is fixing are added to this pull request
No unnecessary issues are included in this pull request.

pytorch-bot · 2025-06-04T23:49:49Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/3384

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 7f52575 with merge base 06f9c4b:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

sekyondaMeta · 2025-06-05T14:39:02Z

@vmoens Mind taking a look at these changes?

vmoens

LGTM
Before approving, do you have a learning curve to share before / after these changes?

splion-360 · 2025-06-05T18:01:35Z

Hello @vmoens, I have attached the learning curves for the CartPole-v1 environment in the DQN tutorial. The plots were generated after confirming the behavior over 2 complete runs with the seeds fixed for reproducibility (as mentioned in issue #3080).

Before fix

After fix

malfet · 2025-06-05T20:37:38Z

intermediate_source/reinforcement_q_learning.py

@@ -91,6 +91,16 @@
    "cpu"
 )

+# set the seeds for reproducibility


Just FYI, we're already doing this in the CI; I'm not sure it's helpful to do something like that for all users...
Maybe add a paragraph saying to uncomment those lines if you want fixed output every time.

vmoens · Jun 6, 2025

I think it's good practice to have it as part of the script, and I'd keep it here.
It helps when you run it locally; RL is usually very seed-dependent.
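For readers following this thread, here is a minimal sketch of what seeding the script's random number generators might look like. It is not the code from this PR: the diff excerpt above only shows the comment line, so the `SEED` value and the `set_seeds` helper are hypothetical names chosen for illustration. NumPy and PyTorch are seeded only if they are installed, so the sketch also runs in a bare Python environment.

```python
import importlib.util
import random

SEED = 42  # hypothetical value; the seed used in the PR is not shown in this excerpt


def set_seeds(seed: int) -> None:
    """Seed every RNG the script may draw from (illustrative helper)."""
    # Python's built-in RNG, often used for epsilon-greedy action
    # selection and for sampling from a replay buffer.
    random.seed(seed)
    # NumPy and PyTorch are optional here; seed them only when present.
    if importlib.util.find_spec("numpy") is not None:
        import numpy as np
        np.random.seed(seed)
    if importlib.util.find_spec("torch") is not None:
        import torch
        torch.manual_seed(seed)  # seeds the generators on all devices


# Re-seeding with the same value reproduces the same random draws.
set_seeds(SEED)
first_draws = [random.random() for _ in range(3)]
set_seeds(SEED)
second_draws = [random.random() for _ in range(3)]
print(first_draws == second_draws)
```

With a Gymnasium environment, the environment's own randomness is seeded separately, typically via `env.reset(seed=SEED)` and `env.action_space.seed(SEED)`. Even with all seeds fixed, results can still differ across hardware or library versions, so this narrows run-to-run variance rather than guaranteeing identical curves.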

Update hyper params and set seeds

58794c7

facebook-github-bot added the cla signed label Jun 4, 2025

github-actions bot added the docathon-h1-2025, medium, and Reinforcement Learning labels Jun 4, 2025

Merge branch 'main' into splion-360/rl-dqn-fixes

0c65d4e

vmoens reviewed Jun 5, 2025

Updated hyper params

430f634

svekars requested a review from malfet June 5, 2025 19:27

malfet reviewed Jun 5, 2025

splion-360 and others added 2 commits June 5, 2025 19:25

Added commented paragraph

8677557

Merge branch 'main' into splion-360/rl-dqn-fixes

7f52575
