[Question] Single action episode #507
Unanswered
riccardobussola asked this question in Q&A
Hi everyone,
A brief introduction: my RL task consists of learning the optimal parameters of a trajectory (e.g. a spline or a Bezier curve) in Cartesian space that a quadruped has to follow for a non-constant time t_follow.
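Concretely, the action could be something like the control points of a cubic Bezier curve; here is a minimal evaluation sketch in plain NumPy (the degree and shapes are just an example I made up, nothing Orbit-specific):

```python
import numpy as np
from math import comb

def bezier(control_points: np.ndarray, s: float) -> np.ndarray:
    """Evaluate a Bezier curve at phase s in [0, 1].
    control_points: (n+1, 3) Cartesian control points (degree-n curve)."""
    n = len(control_points) - 1
    return sum(
        comb(n, i) * s**i * (1.0 - s) ** (n - i) * p  # Bernstein basis weights
        for i, p in enumerate(control_points)
    )

# e.g. a cubic curve: the 12 numbers below would be the single 'action'
cps = np.array([[0, 0, 0], [0.3, 0.5, 0.1], [0.7, 0.5, 0.1], [1, 1, 0]], dtype=float)
point = bezier(cps, 0.5)  # Cartesian target at mid-phase
```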
I'm trying to implement this using Orbit's RLTaskEnv, but I'm struggling with a few things.
My episode queries the policy for an action a single time (once the parameters are obtained, I only need to compute the trajectory). For the rest of the simulation, the robot has to follow that trajectory for an amount of time that varies from episode to episode, so a task-space IK controller has to run in the background.
The episode therefore corresponds to a single policy step: the reward (and a possible NN update) is computed at termination, once I can verify the robot's final position/orientation.
Is there a way to implement this with Orbit without rewriting the logic of the provided RLTaskEnv?
Changing only the decimation factor is not enough, since the episode duration varies with t_follow.
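To make the pattern concrete, here is a minimal sketch of the behaviour I'm after, written as a plain gymnasium.Env rather than against Orbit's API (the point-mass tracking dynamics, the goal, and all names here are placeholders I invented to stand in for the quadruped plus the task-space IK controller):

```python
import gymnasium as gym
import numpy as np
from math import comb

class SingleStepTrajectoryEnv(gym.Env):
    """One policy step per episode: the action encodes all trajectory
    parameters; step() then simulates the whole follow phase internally
    and terminates immediately."""

    def __init__(self, dt: float = 0.01, goal=(1.0, 1.0, 0.0)):
        self.dt = dt
        self.goal = np.asarray(goal, dtype=np.float64)
        # Action: 4 Bezier control points in 3D, flattened to 12 numbers.
        self.action_space = gym.spaces.Box(-2.0, 2.0, shape=(12,), dtype=np.float32)
        # Observation: the tracker's current Cartesian position.
        self.observation_space = gym.spaces.Box(-np.inf, np.inf, shape=(3,), dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.pos = np.zeros(3)
        # t_follow is resampled every reset, so episode duration varies.
        self.t_follow = float(self.np_random.uniform(1.0, 3.0))
        return self.pos.astype(np.float32), {}

    @staticmethod
    def _bezier(cps, s):
        # Same Bernstein evaluation as the helper above.
        n = len(cps) - 1
        return sum(comb(n, i) * s**i * (1.0 - s) ** (n - i) * p
                   for i, p in enumerate(cps))

    def step(self, action):
        cps = np.asarray(action, dtype=np.float64).reshape(4, 3)
        # Roll the whole follow phase forward inside a single step() call;
        # the toy first-order dynamics stand in for quadruped + IK tracking.
        t = 0.0
        while t < self.t_follow:
            target = self._bezier(cps, min(t / self.t_follow, 1.0))
            self.pos += 2.0 * (target - self.pos) * self.dt
            t += self.dt
        # Reward is judged only at termination, from the final position.
        reward = -float(np.linalg.norm(self.pos - self.goal))
        return self.pos.astype(np.float32), reward, True, False, {}
```

With this structure, every step() consumes exactly one action and returns terminated=True, so the variable t_follow lives inside the env rather than in the decimation factor.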
Many thanks for considering my request.