qgym.envs.scheduling package

Module containing the environment, rewarders, visualizer and other utils for the scheduling problem of OpenQL.

class qgym.envs.scheduling.BasicRewarder(illegal_action_penalty=-5.0, update_cycle_penalty=-1.0, schedule_gate_bonus=0.0)[source]

Bases: Rewarder

Basic rewarder for the Scheduling environment.

__init__(illegal_action_penalty=-5.0, update_cycle_penalty=-1.0, schedule_gate_bonus=0.0)[source]

Initialize the reward range and set the rewards and penalties.

Parameters:

illegal_action_penalty (float) – Penalty for performing an illegal action. An action is illegal if action[0] is not in state["legal_actions"]. This value should be negative (although this is not required) and defaults to -5.
update_cycle_penalty (float) – Penalty given for incrementing a cycle. Since the Scheduling environment wants to create the shortest schedules, incrementing the cycle should be penalized. This value should be negative (although this is not required) and defaults to -1.
schedule_gate_bonus (float) – Reward gained for successfully scheduling a gate. This value should be positive (although this is not required) and defaults to 0.

compute_reward(*, old_state, action, new_state)[source]

Compute a reward based on the new state and the given action. Specifically, the reward depends on the ‘legal_actions’ array.

Parameters:

old_state (SchedulingState) – State of the Scheduling environment before the current action.
action (ndarray[Any, dtype[int32]]) – Action that has just been taken.
new_state (SchedulingState) – Updated state of the Scheduling environment.

Return type:

float

Returns:

The reward for this action. If the action is illegal, then the reward is illegal_action_penalty. If the action is legal, and increments the cycle, then the reward is update_cycle_penalty. Otherwise, the reward is schedule_gate_bonus.
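The branching described above can be sketched as a stand-alone function. This is an illustration only, not qgym's implementation: the name basic_reward is hypothetical, legal_actions is assumed to be a Boolean mask indexed by gate number, a nonzero action[1] is assumed to mean "increment the cycle", and the legality of action[0] is checked first, following the wording of the parameter description.

```python
from typing import Sequence


def basic_reward(
    action: Sequence[int],
    legal_actions: Sequence[bool],
    illegal_action_penalty: float = -5.0,
    update_cycle_penalty: float = -1.0,
    schedule_gate_bonus: float = 0.0,
) -> float:
    """Hypothetical stand-in mirroring the documented reward branches."""
    gate_index, increment_cycle = action[0], action[1]
    # Assumption: legal_actions is a mask, True where the gate may be scheduled.
    if not legal_actions[gate_index]:
        return illegal_action_penalty
    # Assumption: a nonzero second entry means "increment the cycle".
    if increment_cycle != 0:
        return update_cycle_penalty
    return schedule_gate_bonus


mask = [True, False, True]
illegal = basic_reward([1, 0], mask)      # gate 1 may not be scheduled
cycle_step = basic_reward([0, 1], mask)   # legal, increments the cycle
gate_step = basic_reward([2, 0], mask)    # legal, schedules gate 2
```

With the default values, the only nonzero rewards are penalties, so a higher return corresponds to fewer illegal actions and fewer cycle increments.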

class qgym.envs.scheduling.CommutationRulebook(default_rules=True)[source]

Bases: object

Commutation rulebook used in the Scheduling environment.

__init__(default_rules=True)[source]

Init of the CommutationRulebook.

Parameters:: default_rules (bool) – If True, default rules are used. Default rules dictate that gates with disjoint qubits commute and that gates that are exactly the same commute. If False, then no rules will be initialized.

__repr__()[source]

Create a string representation of the CommutationRulebook.

Return type:: str

add_rule(rule)[source]

Add a new commutation rule to the rulebook.

Parameters:: rule (Callable[[Gate, Gate], bool]) – Rule to add to the rulebook. A rule is a Callable which takes as input two gates and returns a Boolean value that should be True if two gates commute and False otherwise.
Return type:: None
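As a rough illustration of what a rule looks like, the sketch below writes the two default rules described for default_rules as plain callables. It does not import qgym: the Gate class here is a hypothetical stand-in whose field names (name, q1, q2) are assumptions, single-qubit gates are assumed to have q1 equal to q2, and the helper commutes assumes that two gates commute as soon as any one rule returns True.

```python
from typing import Callable, List, NamedTuple


class Gate(NamedTuple):
    """Hypothetical stand-in for a gate; the field names are assumed."""

    name: str
    q1: int
    q2: int


Rule = Callable[[Gate, Gate], bool]


def disjoint_qubits(gate1: Gate, gate2: Gate) -> bool:
    """Gates acting on disjoint qubits commute."""
    return {gate1.q1, gate1.q2}.isdisjoint({gate2.q1, gate2.q2})


def same_gate(gate1: Gate, gate2: Gate) -> bool:
    """Gates that are exactly the same commute."""
    return gate1 == gate2


def commutes(rules: List[Rule], gate1: Gate, gate2: Gate) -> bool:
    # Assumption: a single matching rule is enough for the gates to commute.
    return any(rule(gate1, gate2) for rule in rules)


rules: List[Rule] = [disjoint_qubits, same_gate]
x0 = Gate("x", 0, 0)
x2 = Gate("x", 2, 2)
cnot01 = Gate("cnot", 0, 1)
```

Any callable with the same two-gate signature and a Boolean result could be passed to add_rule in the same way.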

commutes(gate1, gate2)[source]

Check if gate1 and gate2 commute according to the rules in the rulebook.

Parameters:

gate1 (Gate) – Gate to check the commutation.
gate2 (Gate) – Gate to check gate1 against.

Return type:

bool

Returns:

Boolean value indicating whether gate1 commutes with gate2.

make_blocking_matrix(circuit)[source]

Make a square array of shape (len(circuit), len(circuit)), with dependencies based on the given commutation rules.

Parameters:: circuit (list[Gate]) – Circuit to check dependencies for.
Return type:: ndarray[Any, dtype[int32]]
Returns:: Dependencies matrix of the circuit based on the rules and scheduling from right to left.

class qgym.envs.scheduling.EpisodeRewarder(illegal_action_penalty=-5.0, update_cycle_penalty=-1.0)[source]

Bases: Rewarder

Rewarder for the Scheduling environment, which only gives a reward at the end of the episode or when an illegal action is taken.

__init__(illegal_action_penalty=-5.0, update_cycle_penalty=-1.0)[source]

Initialize the reward range and set the rewards and penalties.

Parameters:

illegal_action_penalty (float) – Penalty for performing an illegal action. An action is illegal if action[0] is not in state["legal_actions"]. This value should be negative (although this is not required) and defaults to -5.
update_cycle_penalty (float) – Penalty given for incrementing a cycle. Since the Scheduling environment wants to create the shortest schedules, incrementing the cycle should be penalized. This value should be negative (although this is not required) and defaults to -1.

compute_reward(*, old_state, action, new_state)[source]

Compute a reward, based on the new state, and the given action.

Parameters:

old_state (SchedulingState) – State of the Scheduling environment before the current action.
action (ndarray[Any, dtype[int32]]) – Action that has just been taken.
new_state (SchedulingState) – Updated state of the Scheduling environment.

Return type:

float

Returns:

The reward for this action. If the action is illegal, then the reward is illegal_action_penalty. If the action is legal, but the episode is not yet done, then the reward is 0. Otherwise, the reward is update_cycle_penalty multiplied by the current cycle.
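The three cases in the return description can be sketched as a small stand-alone function. The name episode_reward and its Boolean arguments are hypothetical; the sketch takes legality, the done flag and the current cycle as inputs instead of reading them from SchedulingState objects.

```python
def episode_reward(
    is_legal: bool,
    is_done: bool,
    cycle: int,
    illegal_action_penalty: float = -5.0,
    update_cycle_penalty: float = -1.0,
) -> float:
    """Hypothetical stand-in mirroring the documented end-of-episode reward."""
    if not is_legal:
        return illegal_action_penalty
    if not is_done:
        # Legal intermediate steps carry no reward.
        return 0.0
    # Final step: the penalty scales with the number of cycles used.
    return update_cycle_penalty * cycle


mid_episode = episode_reward(is_legal=True, is_done=False, cycle=3)
final_step = episode_reward(is_legal=True, is_done=True, cycle=7)
```

Under this scheme a schedule that finishes in fewer cycles receives a less negative final reward.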

class qgym.envs.scheduling.MachineProperties(n_qubits)[source]

Bases: object

MachineProperties is a class to conveniently set up machine properties for the Scheduling environment.

__init__(n_qubits)[source]

Init of the MachineProperties class.

Parameters:: n_qubits (int) – Number of qubits of the machine.

__repr__()[source]

Make a string representation without endline characters.

Return type:: str

__str__()[source]

Make a string representation of the machine properties.

Return type:: str

add_gates(gates)[source]

Add gates to the machine properties that should be supported.

Parameters:: gates (Mapping[str, int]) – Mapping of gates that the machine can perform as keys, and the number of machine cycles (time) as values.
Return type:: MachineProperties
Returns:: The MachineProperties with the added gates.

add_not_in_same_cycle(gates)[source]

Add gates that should not start in the same cycle.

Parameters:: gates (Iterable[tuple[str, str]]) – Iterable of tuples of gate names that should not start in the same cycle.
Return type:: MachineProperties
Returns:: The MachineProperties with an updated not_in_same_cycle property. The not_in_same_cycle property is updated according to the input gates.

add_same_start(gates)[source]

Add gates that should start in the same cycle, or wait till the previous gate is done.

Parameters:: gates (Iterable[str]) – Iterable of gate names that should start in the same cycle.
Return type:: MachineProperties
Returns:: The MachineProperties with the same start gates.
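Because add_gates, add_not_in_same_cycle and add_same_start each return a MachineProperties, the calls can be chained. The sketch below shows that pattern with a hypothetical stand-in class, MachinePropertiesSketch, rather than the real one. The gate names and cycle counts are illustrative values, and storing each not_in_same_cycle pair in both directions is an assumption.

```python
from typing import Dict, Iterable, Mapping, Set, Tuple


class MachinePropertiesSketch:
    """Hypothetical stand-in that mimics the documented chaining behaviour."""

    def __init__(self, n_qubits: int) -> None:
        self.n_qubits = n_qubits
        self.gates: Dict[str, int] = {}
        self.same_start: Set[str] = set()
        self.not_in_same_cycle: Dict[str, Set[str]] = {}

    def add_gates(self, gates: Mapping[str, int]) -> "MachinePropertiesSketch":
        # Keys are gate names, values are durations in machine cycles.
        self.gates.update(gates)
        return self

    def add_same_start(self, gates: Iterable[str]) -> "MachinePropertiesSketch":
        self.same_start.update(gates)
        return self

    def add_not_in_same_cycle(
        self, gates: Iterable[Tuple[str, str]]
    ) -> "MachinePropertiesSketch":
        for gate1, gate2 in gates:
            # Assumption: the restriction is symmetric, so store both directions.
            self.not_in_same_cycle.setdefault(gate1, set()).add(gate2)
            self.not_in_same_cycle.setdefault(gate2, set()).add(gate1)
        return self


props = (
    MachinePropertiesSketch(n_qubits=2)
    .add_gates({"x": 2, "y": 2, "cnot": 4, "measure": 10})
    .add_same_start(["measure"])
    .add_not_in_same_cycle([("x", "y")])
)
```

Returning the object from each method keeps the whole machine description in one expression.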

encode()[source]

Encode the gates in the machine properties to integer values.

Return type:: GateEncoder
Returns:: The GateEncoder used to encode the gates. This GateEncoder can be used to decode the gates or encode quantum circuits containing the same gate names as in this MachineProperties object.

classmethod from_file(filename)[source]

Load MachineProperties from a JSON file. Not implemented.

Return type:: MachineProperties

classmethod from_mapping(machine_properties)[source]

Initialize the MachineProperties class from a Mapping containing valid machine properties.

Parameters:: machine_properties (Mapping[str, Any]) – Mapping containing valid machine properties.
Return type:: MachineProperties
Returns:: Initialized MachineProperties object with the properties described in the machine_properties Mapping.

property gates: dict[str, int] | dict[int, int]: Return a dict with the gate names the machine can perform as keys, and the number of machine cycles (time) as values.

property n_gates: int: Return the number of supported gates.

property n_qubits: int: Return the number of qubits of the machine.

property not_in_same_cycle: dict[str, set[str]] | dict[int, set[int]]: Gates that cannot start in the same cycle.

property same_start: set[str] | set[int]: Set of gate names that should start in the same cycle, or wait till the previous gate is done.

class qgym.envs.scheduling.Scheduling(machine_properties, *, max_gates=200, dependency_depth=1, circuit_generator=None, rulebook=None, rewarder=None, render_mode=None)[source]

Bases: Environment[Dict[str, ndarray[Any, dtype[int32]] | ndarray[Any, dtype[int8]]], ndarray[Any, dtype[int32]]]

RL environment for the scheduling problem.

__init__(machine_properties, *, max_gates=200, dependency_depth=1, circuit_generator=None, rulebook=None, rewarder=None, render_mode=None)[source]

Initialize the action space, observation space, and initial states for the scheduling environment.

Parameters:

machine_properties (Mapping[str, Any] | str | MachineProperties) – A MachineProperties object, a Mapping containing machine properties or a string with a filename for a file containing the machine properties.
max_gates (int) – Maximum number of gates allowed in a circuit. Defaults to 200.
dependency_depth (int) – Number of dependencies given in the observation. Determines the shape of the dependencies observation, which has the shape (dependency_depth, max_gates). Defaults to 1.
circuit_generator (CircuitGenerator | None) – Generator class for generating circuits for training.
rulebook (CommutationRulebook | None) – CommutationRulebook describing the commutation rules. If None (default) is given, a default CommutationRulebook will be used. (See CommutationRulebook for more info on the default rules.)
rewarder (Rewarder | None) – Rewarder to use for the environment. If None (default), then a default BasicRewarder is used.
render_mode (str | None) – If "human" open a pygame screen visualizing the step. If "rgb_array", return an RGB array encoding of the rendered frame on each render call.

get_circuit(mode='human')[source]

Return the quantum circuit of this episode.

Parameters:: mode (str) – Choose from "human" or "encoded". Defaults to "human".
Raises:: ValueError – If an unsupported mode is provided.
Return type:: list[Gate]
Returns:: Human or encoded quantum circuit.

reset(*, seed=None, options=None)[source]

Reset the state, action space and load a new (random) initial state.

To be used after an episode is finished.

Parameters:

seed (int | None) – Seed for the random number generator, should only be provided (optionally) on the first reset call, i.e., before any learning is done.
return_info – Whether to receive debugging info.
options (Mapping[str, Any] | None) – Mapping with keyword arguments with additional options for the reset. Keywords can be found in the description of SchedulingState.reset.
_kwargs – Additional options to configure the reset.

Return type:

tuple[dict[str, ndarray[Any, dtype[int32]] | ndarray[Any, dtype[int8]]], dict[str, Any]]

Returns:

Initial observation and debugging info.

class qgym.envs.scheduling.SchedulingState(*, machine_properties, max_gates, dependency_depth, circuit_generator, rulebook)[source]

Bases: State[Dict[str, ndarray[Any, dtype[int32]] | ndarray[Any, dtype[int8]]], ndarray[Any, dtype[int32]]]

The SchedulingState class.

__init__(*, machine_properties, max_gates, dependency_depth, circuit_generator, rulebook)[source]

Init of the SchedulingState class.

Parameters:

machine_properties (MachineProperties) – A MachineProperties object.
max_gates (int) – Maximum number of gates allowed in a circuit.
dependency_depth (int) – Number of dependencies given in the observation. Determines the shape of the dependencies observation, which has the shape (dependency_depth, max_gates).
circuit_generator (CircuitGenerator) – Generator class for generating circuits for training.
rulebook (CommutationRulebook) – CommutationRulebook describing the commutation rules.

busy: Amount of cycles that a qubit is still busy (zero if available). Used internally for the hardware limitations.

circuit_info: CircuitInfo` dataclass containing the encoded circuit and attributes used to update the state.

create_observation_space()[source]

Create the corresponding observation space.

Return type:

Dict

Returns:

Observation space in the form of a Dict space containing:

MultiBinary space representing the legal actions. The value at index \(i\) determines whether gate number \(i\) can be scheduled.
MultiDiscrete space representing the integer encoded gate names.
MultiDiscrete space representing the interaction of each gate (q1 and q2).
MultiDiscrete space representing the first \(n\) gates that must be scheduled before this gate.

cycle: Current ‘machine’ cycle.

gates: Dictionary with gate names as keys and GateInfo dataclasses as values.

is_done()[source]

Determine if the state is done or not.

Return type:: bool
Returns:: Boolean value stating whether we are in a final state.

machine_properties: MachineProperties class containing machine properties and limitations.

obtain_info()[source]

Obtain additional information.

Return type:: dict[str, int | ndarray[Any, dtype[int32]]]
Returns:: Optional debugging info for the current state.

obtain_observation()[source]

Obtain an observation based on the current state.

Return type:: dict[str, ndarray[Any, dtype[int32]] | ndarray[Any, dtype[int8]]]
Returns:: Observation based on the current state.

reset(*, seed=None, circuit=None, **_kwargs)[source]

Reset the state and load a new (random) initial state.

To be used after an episode is finished.

Parameters:

seed (int | None) – Seed for the random number generator, should only be provided (optionally) on the first reset call, i.e., before any learning is done.
circuit (list[Gate] | None) – Optional circuit for the next episode, given as a list in which each entry is a Gate. When a circuit is given, no random circuit will be generated.
_kwargs (Any) – Additional options to configure the reset.

Return type:

SchedulingState

Returns:

Self.

steps_done: int: Number of steps done since the last reset.

update_state(action)[source]

Update the state of this environment using the given action.

Parameters:: action (ndarray[Any, dtype[int32]]) – First entry determines a gate to schedule, the second entry increases the cycle if nonzero.
Return type:: SchedulingState
Returns:: Self.
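The two-entry action described above can be illustrated with a small stand-alone function. This sketch is not the SchedulingState logic: apply_action and the scheduled list are hypothetical, legality checks are left out, and it is an assumption that a nonzero second entry takes precedence so that no gate is scheduled on that step.

```python
from typing import List, Sequence, Tuple


def apply_action(
    cycle: int, scheduled: List[Tuple[int, int]], action: Sequence[int]
) -> int:
    """Apply a (gate_index, increment_cycle) action and return the new cycle."""
    gate_index, increment_cycle = action[0], action[1]
    if increment_cycle != 0:
        # Assumption: the gate index is ignored when the cycle is incremented.
        return cycle + 1
    # Record which gate was scheduled and in which cycle.
    scheduled.append((gate_index, cycle))
    return cycle


scheduled: List[Tuple[int, int]] = []
cycle = 0
for action in ([2, 0], [0, 1], [1, 0]):
    cycle = apply_action(cycle, scheduled, action)
```

In this run gate 2 is placed in cycle 0, the cycle is then advanced, and gate 1 is placed in cycle 1.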

utils: SchedulingUtils dataclass with a random circuit generator, commutation rulebook and a gate encoder.