Commit bcfa95a

chore: Simplify html and base it on @brainjs/rl
1 parent 9ddecfc commit bcfa95a

37 files changed: +171 -2311 lines changed

README.md

Lines changed: 4 additions & 39 deletions
@@ -1,43 +1,8 @@
-# REINFORCEjs
+# @brainjs/REINFORCEjs Worlds

-**REINFORCEjs** is a Reinforcement Learning library that implements several common RL algorithms, all with web demos. In particular, the library currently includes:
+A typescript fork of the world demonstrations from https://github.com/karpathy/reinforcejs that uses https://github.com/brainjs/reinforcejs.
+An effort to show off the original demonstrations, and a place for more.

-- **Dynamic Programming** methods
-- (Tabular) **Temporal Difference Learning** (SARSA/Q-Learning)
-- **Deep Q-Learning** for Q-Learning with function approximation with Neural Networks
-- **Stochastic/Deterministic Policy Gradients** and Actor Critic architectures for dealing with continuous action spaces. (*very alpha, likely buggy or at the very least finicky and inconsistent*)
-
-See the [main webpage](http://cs.stanford.edu/people/karpathy/reinforcejs) for many more details, documentation and demos.
-
-# Code Sketch
-
-The library exports two global variables: `R`, and `RL`. The former contains various kinds of utilities for building expression graphs (e.g. LSTMs) and performing automatic backpropagation, and is a fork of my other project [recurrentjs](https://github.com/karpathy/recurrentjs). The `RL` object contains the current implementations:
-
-- `RL.DPAgent` for finite state/action spaces with environment dynamics
-- `RL.TDAgent` for finite state/action spaces
-- `RL.DQNAgent` for continuous state features but discrete actions
-
-A typical usage might look something like:
-
--```javascript
-// create an environment object
-var env = {};
-env.getNumStates = function() { return 8; }
-env.getMaxNumActions = function() { return 4; }
-
-// create the DQN agent
-var spec = { alpha: 0.01 } // see full options on DQN page
-agent = new RL.DQNAgent(env, spec);
-
-setInterval(function(){ // start the learning loop
-var action = agent.act(s); // s is an array of length 8
-//... execute action in environment and get the reward
-agent.learn(reward); // the agent improves its Q,policy,model, etc. reward is a float
-}, 0);
--```
-
-The full documentation and demos are on the [main webpage](http://cs.stanford.edu/people/karpathy/reinforcejs).
-
-# License
+Thank you [Andrej karpathy](https://github.com/karpathy) for your fine work.

 MIT.
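
The usage snippet removed from the README leaves `s` and `reward` undefined. The following is a minimal TypeScript sketch of the same act/learn loop. It assumes only the `act(state)` and `learn(reward)` calls shown in that snippet. `CyclingAgent`, the placeholder state, and the reward rule are hypothetical stand-ins for illustration and are not part of `@brainjs/rl` or the original library.

```typescript
// Shape of the agent calls used in the removed README snippet.
interface Agent {
  act(state: number[]): number;
  learn(reward: number): void;
}

// Hypothetical stand-in agent (NOT the library's DQNAgent): it cycles through
// the available actions and only accumulates the reward it receives.
class CyclingAgent implements Agent {
  private next = 0;
  totalReward = 0;

  constructor(private readonly numActions: number) {}

  act(_state: number[]): number {
    const action = this.next;
    this.next = (this.next + 1) % this.numActions;
    return action;
  }

  learn(reward: number): void {
    this.totalReward += reward;
  }
}

// Environment description with the same two accessors as the README snippet.
const env = {
  getNumStates: (): number => 8,
  getMaxNumActions: (): number => 4,
};

const agent = new CyclingAgent(env.getMaxNumActions());
const actions: number[] = [];

for (let step = 0; step < 8; step++) {
  // Placeholder state: an array of length 8, as the README comment describes.
  const s: number[] = Array.from({ length: env.getNumStates() }, () => 0);
  const action = agent.act(s);
  // Made-up reward rule for illustration: only action 0 is rewarded.
  const reward = action === 0 ? 1 : 0;
  agent.learn(reward);
  actions.push(action);
}
```

A real agent would update its value estimates inside `learn`. The stub only shows the order of calls: one `act` per step, followed by one `learn` with the reward for that action.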

gridworld_dp/agents/world-agent.ts

Lines changed: 3 additions & 4 deletions
@@ -1,5 +1,4 @@
-import { zeros } from "../../lib/zeros";
-import { DPAgent, IDPAgentOptions } from "../../lib/agents/dp-agent";
+import { DPAgent, IDPAgentOptions } from "@brainjs/rl";

 export interface IWorldAgentOpt {
 gh: number;
@@ -28,8 +27,8 @@ export class WorldAgent extends DPAgent {
 this.gs = opt.gs;

 // specify some rewards
-const Rarr = this.Rarr = zeros(this.gs);
-const T = this.T = zeros(this.gs);
+const Rarr = this.Rarr = new Float64Array(this.gs);
+const T = this.T = new Float64Array(this.gs);
 Rarr[55] = 1;

 Rarr[54] = -1;
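
Replacing `zeros(this.gs)` with `new Float64Array(this.gs)` relies on typed arrays being zero-filled on construction. The sketch below illustrates that equivalence. It assumes the removed `zeros` helper returned a zero-filled numeric array of the requested length, since its implementation is not shown in this diff. The grid size of 100 is also a hypothetical value chosen so that indices 54 and 55 are in range.

```typescript
// Hypothetical stand-in for the removed "../../lib/zeros" helper.
// Its real implementation is not part of this diff.
function zeros(n: number): Float64Array {
  const arr = new Float64Array(n);
  for (let i = 0; i < n; i++) {
    arr[i] = 0;
  }
  return arr;
}

const gs = 100; // assumed number of grid cells, not taken from the source

// New form from the diff: a Float64Array starts with every element at 0.
const Rarr = new Float64Array(gs);
Rarr[55] = 1;
Rarr[54] = -1;

// Old form, using the stand-in helper.
const viaHelper = zeros(gs);
viaHelper[55] = 1;
viaHelper[54] = -1;

// Element-wise comparison of the two reward arrays.
const same =
  Rarr.length === viaHelper.length &&
  Rarr.every((value, index) => value === viaHelper[index]);
```

If the original helper returned a plain `number[]` instead, the values would still match, but array methods such as `push` would no longer be available on the typed array.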

gridworld_dp/grid-world-dp.ts

Lines changed: 0 additions & 4 deletions
@@ -1,8 +1,4 @@
 // Gridworld
-import { zeros } from "../lib/zeros";
-
-
-
 export class GridWorldDP {

 }

gridworld_dp/index.html

Lines changed: 10 additions & 69 deletions
@@ -3,36 +3,10 @@
 <head>
 <meta charset="utf-8">
 <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
-<title>REINFORCEjs: Gridworld with Dynamic Programming</title>
+<title>@brainjs/rl: Gridworld with Dynamic Programming Demo</title>
 <meta name="description" content="">
 <meta name="author" content="">
 <meta name="viewport" content="width=device-width, initial-scale=1.0">
-
-<!-- jquery and jqueryui -->
-<!-- <script src="https://code.jquery.com/jquery-2.1.3.min.js"></script>-->
-<!-- <link href="external/jquery-ui.min.css" rel="stylesheet">-->
-<!-- <script src="external/jquery-ui.min.js"></script>-->
-
-<!-- bootstrap -->
-<!-- <script src="http://maxcdn.bootstrapcdn.com/bootstrap/3.3.4/js/bootstrap.min.js"></script>-->
-<!-- <link href="http://maxcdn.bootstrapcdn.com/bootstrap/3.3.4/css/bootstrap.min.css" rel="stylesheet">-->
-
-<!-- d3js -->
-<!-- <script type="text/javascript" src="external/d3.min.js"></script>-->
-
-<!-- markdown -->
-<!-- <script type="text/javascript" src="external/marked.js"></script>-->
-<!-- <script type="text/javascript" src="external/highlight.pack.js"></script>-->
-<!-- <link rel="stylesheet" href="external/highlight_default.css">-->
-<!-- <script>hljs.highlightAll();</script>-->
-
-<!-- mathjax: nvm now loading dynamically
-<script type="text/javascript" src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
--->
-
-<!-- rljs -->
-<!-- <script type="text/javascript" src="lib/rl.js"></script>-->
-
 <!-- GA -->
 <!-- <script>-->
 <!-- (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){-->
@@ -42,40 +16,7 @@
 <!-- ga('create', 'UA-3698471-24', 'auto');-->
 <!-- ga('send', 'pageview');-->
 <!-- </script>-->
-<style>
-#wrap {
-width:800px;
-margin-left: auto;
-margin-right: auto;
-}
-body {
-font-family: Arial, "Helvetica Neue", Helvetica, sans-serif;
-}
-#draw {
-margin-left: 100px;
-}
-#exp {
-margin-top: 20px;
-font-size: 16px;
-}
-svg {
-cursor: pointer;
-}
-h2 {
-text-align: center;
-font-size: 30px;
-}
-#rewardui {
-font-weight: bold;
-font-size: 16px;
-}
-</style>
-
-<!-- <script type="application/javascript">-->
-
-<!-- -->
-<!-- </script>-->
-<script src="../reinforce-browser.js"></script>
+<script src="../gridworld_dp.js"></script>
 </head>
 <body>

@@ -87,15 +28,15 @@
 <div id="mynav" style="border-bottom:1px solid #999; padding-bottom: 10px; margin-bottom:50px;">
 <div>
 <img src="../loop.svg" style="width:50px;height:50px;float:left;">
-<h1 style="font-size:50px;">REINFORCE<span style="color:#058;">js</span></h1>
+<h1 style="font-size:50px;">@brainjs/rl</h1>
 </div>
-<ul class="nav nav-pills">
-<li role="presentation"><a href="index.html">About</a></li>
-<li role="presentation" class="active"><a href="../gridworld_dp/index.html">GridWorld: DP</a></li>
-<li role="presentation"><a href="../gridworld_td/index.html">GridWorld: TD</a></li>
-<li role="presentation"><a href="../puckworld/index.html">PuckWorld: DQN</a></li>
-<li role="presentation"><a href="../waterworld/index.html">WaterWorld: DQN</a></li>
-</ul>
+<nav class="nav nav-pills nav-fill">
+<a class="nav-item nav-link" href="../index.html">About</a>
+<a class="nav-item nav-link active" href="../gridworld_dp/index.html">GridWorld: DP</a>
+<a class="nav-item nav-link" href="../gridworld_td/index.html">GridWorld: TD</a>
+<a class="nav-item nav-link" href="../puckworld/index.html">PuckWorld: DQN</a>
+<a class="nav-item nav-link" href="../waterworld/index.html">WaterWorld: DQN</a>
+</nav>
 </div>

 <h2>GridWorld: Dynamic Programming Demo</h2>

gridworld_dp/index.ts

Lines changed: 1 addition & 0 deletions
@@ -12,6 +12,7 @@ import { select } from "d3";
 import { WorldAgent } from "./agents/world-agent";
 import * as d3 from "d3";
 import {highlightTex} from "../highlight-tex";
+import "../style.css";

 type D3Line = d3.Selection<SVGLineElement, unknown, HTMLElement, null>;
 type D3Rect = d3.Selection<SVGRectElement, unknown, HTMLElement, null>;

gridworld_td/agents/world-agent.ts

Lines changed: 3 additions & 4 deletions
@@ -1,5 +1,4 @@
-import { IQAgentOptions, QAgent } from "../../lib/agents/q-agent";
-import { zeros } from "../../lib/zeros";
+import { IQAgentOptions, QAgent } from "@brainjs/rl";

 export interface IWorldAgentState {
 ns: number;
@@ -28,8 +27,8 @@ export class WorldAgent extends QAgent {
 this.gs = opt.gs;

 // specify some rewards
-const Rarr = zeros(this.gs);
-const T = zeros(this.gs);
+const Rarr = new Float64Array(this.gs);
+const T = new Float64Array(this.gs);
 Rarr[55] = 1;

 Rarr[54] = -1;
