Closed
Labels
RTFM (Answer is the documentation), duplicate (This issue or pull request already exists), question (Further information is requested)
Description
❓ Question
Hi, first off great work on making Stable-Baselines3 an excellent resource for deep reinforcement learning practitioners.
I noticed that your DQN implementation features a target Q-network, which resembles DeepMind's paper "Deep Reinforcement Learning with Double Q-learning". Meanwhile, Neural Fitted Q Iteration, by Riedmiller, computes the target using the current estimate of the Q-function. I am looking for clarification on whether your DQN is truly a Double DQN. I hope to use this information to accurately implement prioritized experience replay on top of your DQN implementation.
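For reference, here is a minimal NumPy sketch (not SB3 code) of the distinction the question hinges on: vanilla DQN uses the target network to both select and evaluate the greedy next action, while Double DQN selects the action with the online network and evaluates it with the target network. All array values below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical Q-values over 3 actions for a batch of 4 next states,
# from the online network and its delayed copy (the target network).
q_online_next = rng.normal(size=(4, 3))
q_target_next = rng.normal(size=(4, 3))
rewards = np.ones(4)
gamma = 0.99

# Vanilla DQN (Mnih et al. 2015): the target network both selects
# and evaluates the greedy next action.
dqn_targets = rewards + gamma * q_target_next.max(axis=1)

# Double DQN (van Hasselt et al. 2016): the online network selects
# the greedy action, the target network evaluates it.
greedy_actions = q_online_next.argmax(axis=1)
ddqn_targets = rewards + gamma * q_target_next[np.arange(4), greedy_actions]
```

Because the target network's max over actions upper-bounds its value at any particular action, the Double DQN target is never larger than the vanilla DQN target for the same networks, which is exactly the overestimation-reduction argument of the Double Q-learning paper.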
Thanks,
Oliver
Checklist
- I have checked that there is no similar issue in the repo
- I have read the documentation
- If code there is, it is minimal and working
- If code there is, it is formatted using the markdown code blocks for both code and stack traces.