qgym.templates.environment module

Generic abstract base class for RL environments.

All environments should inherit from Environment.

class qgym.templates.environment.Environment[source]

Bases: Env[ObservationT, ActionT]

RL Environment containing the current state of the problem.

Each subclass should set at least the following attributes:

action_space: Space[Any]

The action space of this environment.

close()[source]

Close the screen used for rendering.

Return type:

None

observation_space: Space[Any]

The observation space of this environment.

render()[source]

Render the current state using pygame.

Return type:

None | ndarray[Any, dtype[int32]]

Returns:

Result of rendering.

abstract reset(*, seed=None, options=None)[source]

Reset the environment and load a new random initial state.

To be used after an episode is finished. Optionally, one can provide additional options to configure the reset.

Parameters:
  • seed (int | None) – Seed for the random number generator; it should only be provided (optionally) on the first reset call, i.e., before any learning is done.

  • options (Mapping[str, Any] | None) – Dictionary containing keyword-argument pairs to configure the reset.

Return type:

tuple[TypeVar(ObservationT), dict[str, Any]]

Returns:

Initial observation and a dictionary containing debugging information.
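
For illustration, a typical reset pattern could look like the sketch below. MyQgymEnv is a hypothetical placeholder for any concrete subclass of Environment, and the option key is made up for the example:

    # MyQgymEnv stands in for any concrete subclass of Environment (hypothetical name).
    env = MyQgymEnv()

    # Provide a seed only on the first reset, i.e., before any learning is done.
    observation, info = env.reset(seed=42)

    # Subsequent resets, e.g. after an episode has finished, omit the seed.
    observation, info = env.reset()

    # If the subclass supports reset options, pass them as a mapping
    # ("example_option" is an illustrative key, not a qgym option).
    observation, info = env.reset(options={"example_option": True})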

property rewarder: Rewarder

Return the rewarder that is set for this environment.

Used to compute rewards after each step.

property rng: Generator

Return the random number generator of this environment.

If none is set yet, this will generate a new one using numpy.random.default_rng.
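
The lazy initialisation described above follows a common pattern, sketched here for illustration (this is not qgym's actual source):

    from typing import Optional

    import numpy as np
    from numpy.random import Generator


    class _RngSketch:
        """Illustrative stand-in for the rng property's lazy behaviour."""

        _rng: Optional[Generator] = None

        @property
        def rng(self) -> Generator:
            # Create a new generator on first access if none has been set yet,
            # mirroring the numpy.random.default_rng fallback described above.
            if self._rng is None:
                self._rng = np.random.default_rng()
            return self._rng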

step(action)[source]

Update the state based on the input action.

Return the observation, reward, done-indicator, truncated-indicator, and debugging info based on the updated state.

Parameters:

action (TypeVar(ActionT)) – Action to be performed.

Return type:

tuple[TypeVar(ObservationT), float, bool, bool, dict[Any, Any]]

Returns:

A tuple containing five entries:

  1. The observation of the updated state;

  2. Reward for the given action;

  3. Boolean value stating whether the new state is a final state (i.e., if we are done);

  4. Boolean value stating whether the episode is truncated;

  5. Additional (debugging) information.
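
Putting reset and step together, a typical interaction loop could look like the following sketch. MyQgymEnv is the same hypothetical placeholder as above; only the attributes and signatures documented on this page are assumed:

    # MyQgymEnv stands in for any concrete subclass of Environment (hypothetical name).
    env = MyQgymEnv()
    observation, info = env.reset(seed=42)

    terminated = truncated = False
    while not (terminated or truncated):
        # Sample a random action from the documented action_space attribute.
        action = env.action_space.sample()

        # step returns the five-tuple documented above:
        # observation, reward, terminated, truncated, and debugging info.
        observation, reward, terminated, truncated, info = env.step(action)

        # Optionally render the current state; qgym renders using pygame,
        # so this may require a display to be available.
        env.render()

    env.close()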