A collection of Reinforcement Learning (RL) methods I have implemented in JAX/Flax, Flux, and PyTorch, with particular effort put into readability and reproducibility.
- Python >= 3.8
- jax
$ git clone https://github.com/BeeGass/Agents.git
$ cd Agents/agents-jax
$ python main.py
- PyTorch >= 1.10
$ cd Agents/agents-pytorch
$ python main.py
- TODO
- TODO
$ cd Agents/agents-flux
$ # TBA
Config File Template

TBA

Weights And Biases Integration

TBA
| Model | NumPy/Vanilla | Jax/Flax | Flux | Config | Paper |
|---|---|---|---|---|---|
| Policy Evaluation | ☑ | ☐ | ☐ | ☐ | DS595-RL-Projects |
| Policy Improvement | ☑ | ☐ | ☐ | ☐ | DS595-RL-Projects |
| Policy Iteration | ☑ | ☐ | ☐ | ☐ | DS595-RL-Projects |
| Value Iteration | ☑ | ☐ | ☐ | ☐ | DS595-RL-Projects |
| On-policy first visit Monte-Carlo prediction | ☑ | ☐ | ☐ | ☐ | DS595-RL-Projects |
| On-policy first visit Monte-Carlo control | ☑ | ☐ | ☐ | ☐ | DS595-RL-Projects |
| Sarsa (on-policy TD control) | ☑ | ☐ | ☐ | ☐ | DS595-RL-Projects |
| Q-learning (off-policy TD control) | ☑ | ☐ | ☐ | ☐ | DS595-RL-Projects |
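The tabular methods above all share the same flavor of update rule; as one illustration, here is a minimal NumPy sketch of the Q-learning (off-policy TD control) update. The toy state/action sizes and hyperparameter values are hypothetical, not taken from this repo's implementations:

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One Q-learning update on a tabular Q array of shape (n_states, n_actions)."""
    td_target = r + gamma * np.max(Q[s_next])   # bootstrap with the greedy next-state value
    Q[s, a] += alpha * (td_target - Q[s, a])    # move Q(s, a) toward the TD target
    return Q

# Toy example: 3 states, 2 actions, a single observed transition
Q = np.zeros((3, 2))
Q = q_learning_update(Q, s=0, a=1, r=1.0, s_next=2)
```

Because the update bootstraps with `max` over the next state's values rather than the action the behavior policy actually took, it is off-policy, unlike Sarsa above.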
| Model | PyTorch | Jax/Flax | Flux | Config | Paper |
|---|---|---|---|---|---|
| DQN | ☐ | ☐ | ☐ | ☐ | Link |
| DDPG | ☐ | ☐ | ☐ | ☐ | Link |
| DRQN | ☐ | ☐ | ☐ | ☐ | Link |
| Dueling-DQN | ☐ | ☐ | ☐ | ☐ | Link |
| Double-DQN | ☐ | ☐ | ☐ | ☐ | Link |
| PER | ☐ | ☐ | ☐ | ☐ | Link |
| Rainbow | ☐ | ☐ | ☐ | ☐ | Link |
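As a rough sketch of the core computation shared by the value-based agents above, here is the DQN TD target with a separate target network, written in NumPy. The function name and batch layout are hypothetical, for illustration only:

```python
import numpy as np

def dqn_targets(q_target_next, rewards, dones, gamma=0.99):
    """TD targets for DQN: r + gamma * max_a' Q_target(s', a'), zeroed at terminal states."""
    max_next = q_target_next.max(axis=1)              # greedy value from the target network
    return rewards + gamma * (1.0 - dones) * max_next

# Batch of 2 transitions over 3 actions
q_next = np.array([[1.0, 2.0, 0.5],
                   [0.0, 0.0, 0.0]])
targets = dqn_targets(q_next,
                      rewards=np.array([1.0, 1.0]),
                      dones=np.array([0.0, 1.0]))
```

Double-DQN differs only in how `max_next` is formed (action chosen by the online network, evaluated by the target network), and Dueling-DQN changes the network head rather than this target.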
| Model | PyTorch | Jax/Flax | Flux | Config | Paper |
|---|---|---|---|---|---|
| PPO | ☐ | ☐ | ☐ | ☐ | Link |
| TRPO | ☐ | ☐ | ☐ | ☐ | Link |
| SAC | ☐ | ☐ | ☐ | ☐ | Link |
| A2C | ☐ | ☐ | ☐ | ☐ | Link |
| A3C | ☐ | ☐ | ☐ | ☐ | Link |
| TD3 | ☐ | ☐ | ☐ | ☐ | Link |
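Among the policy-gradient methods above, PPO's clipped surrogate objective is compact enough to sketch in a few lines of NumPy. This is a hedged illustration of the standard formula, not code from this repo; the function name and `eps` default are assumptions:

```python
import numpy as np

def ppo_clip_loss(logp_new, logp_old, advantages, eps=0.2):
    """PPO clipped surrogate: -mean(min(r * A, clip(r, 1-eps, 1+eps) * A))."""
    ratio = np.exp(logp_new - logp_old)               # importance ratio pi_new / pi_old
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps)    # keep the update near the old policy
    return -np.mean(np.minimum(ratio * advantages, clipped * advantages))

adv = np.array([1.0, -1.0])
loss = ppo_clip_loss(np.log([1.5, 0.5]), np.log([1.0, 1.0]), adv)
```

The `min` with the clipped ratio is what removes the incentive to push the policy far from the one that collected the data, which is the same trust-region idea TRPO enforces with a hard KL constraint.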
| Model | PyTorch | Jax/Flax | Flux | Config | Paper |
|---|---|---|---|---|---|
| World Models | ☐ | ☐ | ☐ | ☐ | Link |
| Dream to Control | ☐ | ☐ | ☐ | ☐ | Link |
| Dream to Control v2 | ☐ | ☐ | ☐ | ☐ | Link |
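The model-based agents above (World Models, Dreamer) plan by rolling a learned latent dynamics model forward in imagination. As a toy sketch of that idea only, here is a linear latent transition with random (stand-in, not learned) parameters; everything here is hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)) * 0.1   # stand-in for a learned latent transition matrix
B = rng.normal(size=(4, 2)) * 0.1   # stand-in for a learned action embedding

def imagine(z, actions):
    """Roll a latent state forward through the (here: linear) dynamics model."""
    traj = [z]
    for a in actions:
        z = np.tanh(A @ z + B @ a)  # one imagined step, squashed to keep latents bounded
        traj.append(z)
    return np.stack(traj)

traj = imagine(np.zeros(4), [np.ones(2)] * 3)  # 3 imagined steps from a zero latent
```

In the actual papers the transition is a learned recurrent/stochastic model and the imagined trajectories are scored by a learned reward head; this sketch only shows the rollout loop.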
@software{Gass_Agents_2021,
author = {Gass, B.A.},
doi = {10.5281/zenodo.1234},
month = {12},
title = {{Agents}},
url = {https://github.com/BeeGass/Agents},
version = {1.0.0},
year = {2021}
}