qgym.templates.environment module
Generic abstract base class for RL environments.
All environments should inherit from Environment.
- class qgym.templates.environment.Environment[source]
Bases: Env[ObservationT, ActionT]
RL Environment containing the current state of the problem.
Each subclass should set at least the following attributes:
- action_space: Space[Any]
The action space of this environment.
- observation_space: Space[Any]
The observation space of this environment.
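For illustration, a minimal sketch of a concrete subclass is shown below. The class name CoinFlipEnv, its toy state, and the use of gymnasium's Discrete space are assumptions made for this example only; real qgym environments manage state and rewards through the library's own template classes, which the sketch omits.

```python
from gymnasium.spaces import Discrete

from qgym.templates.environment import Environment


class CoinFlipEnv(Environment):
    """Hypothetical toy environment: observe the outcome of a coin flip."""

    def __init__(self) -> None:
        # Every subclass must set at least these two attributes.
        self.action_space = Discrete(2)       # guess heads (0) or tails (1)
        self.observation_space = Discrete(2)  # last observed flip

    def reset(self, *, seed=None, options=None):
        # Draw a fresh flip with the environment's random number generator.
        # (Seed and options handling is omitted for brevity.)
        flip = int(self.rng.integers(2))
        return flip, {}
```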
- abstract reset(*, seed=None, options=None)[source]
Reset the environment and load a new random initial state.
To be used after an episode is finished. Optionally, one can provide additional options to configure the reset.
- Parameters:
seed (int | None) – Seed for the random number generator.
options (Mapping[str, Any] | None) – Mapping with additional options to configure the reset.
- Return type:
tuple[ObservationT, dict[Any, Any]]
- Returns:
Initial observation and a dictionary containing debugging information.
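A typical call, assuming env is an instance of a concrete Environment subclass:

```python
# Start a new episode; the seed is optional and shown here for reproducibility.
observation, info = env.reset(seed=42)
```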
- property rewarder: Rewarder
Return the rewarder that is set for this environment.
Used to compute rewards after each step.
- property rng: Generator
Return the random number generator of this environment.
If none is set yet, this will generate a new one using numpy.random.default_rng.
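The generator can be used directly for any sampling an environment needs; a small sketch:

```python
rng = env.rng             # created lazily via numpy.random.default_rng if unset
value = rng.integers(10)  # draw a random integer in [0, 10)
```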
- step(action)[source]
Update the state based on the input action.
Return observation, reward, done-indicator, truncated-indicator and debugging info based on the updated state.
- Parameters:
action (ActionT) – Action to be performed.
- Return type:
tuple[ObservationT, float, bool, bool, dict[Any, Any]]
- Returns:
A tuple containing five entries:
The updated state;
Reward for the given action;
Boolean value stating whether the new state is a final state (i.e., if we are done);
Boolean value stating whether the episode is truncated;
Additional (debugging) information.
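Putting reset and step together, a standard interaction loop might look as follows; the random policy via action_space.sample() is a placeholder, not part of qgym:

```python
observation, info = env.reset(seed=123)
done = truncated = False
while not (done or truncated):
    action = env.action_space.sample()  # placeholder policy: act at random
    observation, reward, done, truncated, info = env.step(action)
```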