qgym.envs.initial_mapping package
Module containing the environment, rewarders and visualizer for the initial mapping problem of OpenQL.
- class qgym.envs.initial_mapping.BasicRewarder(illegal_action_penalty=-100, reward_per_edge=5, penalty_per_edge=-1)[source]
Bases: Rewarder
Basic rewarder for the InitialMapping environment.
- __init__(illegal_action_penalty=-100, reward_per_edge=5, penalty_per_edge=-1)[source]
Initialize the reward range and set the rewards and penalties.
- Parameters:
  - illegal_action_penalty (float) – Penalty for performing an illegal action. An action is illegal if the action contains a virtual or physical qubit that has already been mapped. This value should be negative (but is not required) and defaults to -100.
  - reward_per_edge (float) – Reward gained per ‘good’ edge in the interaction graph. An edge is ‘good’ if the mapped edge overlaps with an edge of the connection graph. This value should be positive (but is not required) and defaults to 5.
  - penalty_per_edge (float) – Penalty given per ‘bad’ edge in the interaction graph. An edge is ‘bad’ if the edge is mapped and is not ‘good’. This value should be negative (but is not required) and defaults to -1.
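The constructor signature above can be used directly; a minimal sketch with illustrative (non-default) values:

```python
from qgym.envs.initial_mapping import BasicRewarder

# Illustrative values; the defaults are -100, 5 and -1.
rewarder = BasicRewarder(
    illegal_action_penalty=-50,
    reward_per_edge=10,
    penalty_per_edge=-2,
)
```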
- compute_reward(*, old_state, action, new_state)[source]
Compute a reward based on the new state and the given action. Specifically, the connection graph, interaction graphs and mapping are used.
- Parameters:
  - old_state (InitialMappingState) – State of the InitialMapping before the current action.
  - action (ndarray[Any, dtype[int32]]) – Action that has just been taken.
  - new_state (InitialMappingState) – Updated state of the InitialMapping.
- Return type:
  float
- Returns:
  The reward for this action. If the action is illegal, then the reward is illegal_action_penalty. If the action is legal, then the reward is the total number of ‘good’ edges times reward_per_edge plus the total number of ‘bad’ edges times penalty_per_edge.
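The rule above can be sketched in plain Python. This is an illustrative reconstruction of the described formula, not the library implementation; the edge counts and the illegal flag are hypothetical inputs:

```python
# Sketch of the BasicRewarder rule (illustrative, not the library code).
def basic_reward(n_good: int, n_bad: int, *, illegal: bool,
                 illegal_action_penalty: float = -100,
                 reward_per_edge: float = 5,
                 penalty_per_edge: float = -1) -> float:
    if illegal:
        return illegal_action_penalty
    return n_good * reward_per_edge + n_bad * penalty_per_edge
```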
- class qgym.envs.initial_mapping.EpisodeRewarder(illegal_action_penalty=-100, reward_per_edge=5, penalty_per_edge=-1)[source]
Bases: BasicRewarder
Rewarder for the InitialMapping environment, which only gives a reward at the end of the episode or when an illegal action is taken.
- compute_reward(*, old_state, action, new_state)[source]
Compute a reward based on the new state and the given action. Specifically, the connection graph, interaction graphs and mapping are used.
- Parameters:
  - old_state (InitialMappingState) – State of the InitialMapping before the current action.
  - action (ndarray[Any, dtype[int32]]) – Action that has just been taken.
  - new_state (InitialMappingState) – Updated state of the InitialMapping.
- Return type:
  float
- Returns:
  The reward for this action. If the action is illegal, then the reward is illegal_action_penalty. If the action is legal, but the mapping is not yet finished, then the reward is 0. If the action is legal and the mapping is finished, then the reward is the number of ‘good’ edges times reward_per_edge plus the number of ‘bad’ edges times penalty_per_edge.
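The episode-level rule can be sketched in the same style as the BasicRewarder sketch above; again an illustrative reconstruction with hypothetical inputs, not the library code:

```python
# Sketch of the EpisodeRewarder rule (illustrative, not the library code).
def episode_reward(n_good: int, n_bad: int, *, illegal: bool, done: bool,
                   illegal_action_penalty: float = -100,
                   reward_per_edge: float = 5,
                   penalty_per_edge: float = -1) -> float:
    if illegal:
        return illegal_action_penalty
    if not done:            # mapping not yet finished
        return 0.0
    return n_good * reward_per_edge + n_bad * penalty_per_edge
```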
- class qgym.envs.initial_mapping.InitialMapping(connection_graph, graph_generator=None, *, rewarder=None, render_mode=None)[source]
Bases: Environment[Dict[str, ndarray[Any, dtype[int32]]], ndarray[Any, dtype[int32]]]
RL environment for the initial mapping problem of OpenQL.
- __init__(connection_graph, graph_generator=None, *, rewarder=None, render_mode=None)[source]
Initialize the action space, observation space, and initial states. Furthermore, the connection graph and the edge probability for the random interaction graph of each episode are defined.
The supported render modes of this environment are "human" and "rgb_array".
- Parameters:
  - connection_graph (Union[Graph, Buffer, _SupportsArray[dtype[Any]], _NestedSequence[_SupportsArray[dtype[Any]]], bool, int, float, complex, str, bytes, _NestedSequence[Union[bool, int, float, complex, str, bytes]], list[int], tuple[int, ...]]) – Graph representation of the QPU topology. Each node represents a physical qubit and each edge represents a connection in the QPU topology. See parse_connection_graph() for supported formats.
  - graph_generator (GraphGenerator | None) – Graph generator for generating interaction graphs. This generator is used to generate a new interaction graph when InitialMapping.reset() is called without an interaction graph. If None is provided, a new BasicGraphGenerator with the same number of nodes as the connection graph will be made.
  - rewarder (Rewarder | None) – Rewarder to use for the environment. Must inherit from qgym.templates.Rewarder. If None (default), then BasicRewarder is used.
  - render_mode (str | None) – If "human", open a pygame screen visualizing the step. If "rgb_array", return an RGB array encoding of the rendered frame on each render call.
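A minimal construction sketch based on the signature above; the 4-qubit ring topology is only an illustration, and any format accepted by parse_connection_graph() (for example an adjacency matrix) would work equally well:

```python
import networkx as nx
from qgym.envs.initial_mapping import BasicRewarder, InitialMapping

# Illustrative 4-qubit ring topology for the QPU connection graph.
connection_graph = nx.cycle_graph(4)

# rewarder is optional; BasicRewarder is used when it is omitted.
env = InitialMapping(connection_graph, rewarder=BasicRewarder())

observation, info = env.reset(seed=42)  # samples a random interaction graph
action = env.action_space.sample()      # a random (not necessarily legal) action
```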
- action_space: Space[Any]
The action space of this environment.
- metadata: dict[str, Any]
Additional metadata of this environment.
- observation_space: Space[Any]
The observation space of this environment.
- reset(*, seed=None, options=None)[source]
Reset the state and set a new interaction graph.
To be used after an episode is finished.
- Parameters:
  - seed (int | None) – Seed for the random number generator. Should only be provided (optionally) on the first reset call, i.e., before any learning is done.
  - options (Mapping[str, Any] | None) – Mapping with keyword arguments with additional options for the reset. Keywords can be found in the description of InitialMappingState.reset().
- Return type:
  tuple[dict[str, ndarray[Any, dtype[int32]]], dict[str, Any]]
- Returns:
Initial observation and debugging info.
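A hedged sketch of resetting with a fixed interaction graph, reusing the env from the earlier sketch; the "interaction_graph" keyword is assumed to be forwarded to InitialMappingState.reset() through the options mapping:

```python
import networkx as nx

# Fix the interaction graph for the next episode instead of sampling one.
interaction_graph = nx.path_graph(4)
observation, info = env.reset(
    seed=0,
    options={"interaction_graph": interaction_graph},
)
```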
- class qgym.envs.initial_mapping.InitialMappingState(connection_graph, graph_generator)[source]
Bases: State[Dict[str, ndarray[Any, dtype[int32]]], ndarray[Any, dtype[int32]]]
The InitialMappingState class.
- __init__(connection_graph, graph_generator)[source]
Init of the InitialMappingState class.
- Parameters:
  - connection_graph (Graph) – networkx Graph representation of the QPU topology. Each node represents a physical qubit and each edge represents a connection in the QPU topology.
  - graph_generator (GraphGenerator) – Graph generator for generating interaction graphs. This generator is used to generate a new interaction graph when InitialMappingState.reset() is called without an interaction graph.
- create_observation_space()[source]
Create the corresponding observation space.
- Return type:
  Dict
- Returns:
  Observation space in the form of a Dict space containing the following values if the connection graph has no fidelity information:
  - A MultiDiscrete space representing the mapping.
  - A MultiBinary space representing the interaction matrix.
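The exact key names of the Dict space are not specified above, so the safest way to see the concrete structure is to print the observation space of an environment (such as env from the earlier sketch):

```python
# Print the Dict observation space to see its sub-spaces and key names.
print(env.observation_space)
```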
- graphs
Dictionary containing the graph and matrix representations of both the interaction graph and the connection graph.
- is_done()[source]
Determine if the state is done or not.
- Return type:
  bool
- Returns:
Boolean value stating whether we are in a final state.
- is_truncated()[source]
Determine if the episode should be truncated or not.
- Return type:
  bool
- Returns:
Boolean value stating whether the episode should be truncated. The episode is truncated if the number of steps in the current episode is more than 10 times the number of nodes in the connection graph.
- mapped_qubits: dict[str, set[int]]
Dictionary with two sets containing mapped physical and logical qubits.
- mapping
Array of which the index represents a physical qubit, and the value a virtual qubit. A value of n_nodes + 1 represents the case when nothing is mapped to the physical qubit yet.
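A small illustration of reading such an array, assuming a 4-node connection graph; the concrete values are hypothetical:

```python
import numpy as np

n_nodes = 4
mapping = np.array([2, 5, 0, 5])  # 5 == n_nodes + 1 marks an unmapped physical qubit
for physical, virtual in enumerate(mapping):
    status = "unmapped" if virtual == n_nodes + 1 else f"virtual qubit {virtual}"
    print(f"physical qubit {physical}: {status}")
```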
- mapping_dict: dict[int, int]
Dictionary that maps logical qubits (keys) to physical qubits (values).
- reset(*, seed=None, interaction_graph=None, **_kwargs)[source]
Reset the state and set a new interaction graph.
To be used after an episode is finished.
- Parameters:
  - seed (int | None) – Seed for the random number generator. Should only be provided (optionally) on the first reset call, i.e., before any learning is done.
  - interaction_graph (Graph | None) – Interaction graph to be used for the next iteration. If None, a random interaction graph will be created.
  - _kwargs (Any) – Additional options to configure the reset.
- Return type:
  InitialMappingState
- Returns:
(self) New initial state.
- steps_done: int
Number of steps done since the last reset.
- class qgym.envs.initial_mapping.SingleStepRewarder(illegal_action_penalty=-100, reward_per_edge=5, penalty_per_edge=-1)[source]
Bases: BasicRewarder
Rewarder for the InitialMapping environment, which gives a reward based on the improvement in the current step.
- compute_reward(*, old_state, action, new_state)[source]
Compute a reward based on the new state and the given action. Specifically, the connection graph, interaction graphs and mapping are used.
- Parameters:
  - old_state (InitialMappingState) – State of the InitialMapping before the current action.
  - action (ndarray[Any, dtype[int32]]) – Action that has just been taken.
  - new_state (InitialMappingState) – Updated state of the InitialMapping.
- Return type:
  float
- Returns:
  The reward for this action. If the action is illegal, then the reward is illegal_action_penalty. If the action is legal, then the reward is the number of ‘good’ edges times reward_per_edge plus the number of ‘bad’ edges times penalty_per_edge created by this action.
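Reading "edges created by this action" as the difference between the mapped-edge counts before and after the step, the rule can be sketched as follows; this interpretation and the helper below are assumptions, not the library implementation:

```python
# Sketch of the SingleStepRewarder idea (illustrative, not the library code).
def single_step_reward(n_good_old: int, n_bad_old: int,
                       n_good_new: int, n_bad_new: int, *,
                       illegal: bool = False,
                       illegal_action_penalty: float = -100,
                       reward_per_edge: float = 5,
                       penalty_per_edge: float = -1) -> float:
    if illegal:
        return illegal_action_penalty
    return ((n_good_new - n_good_old) * reward_per_edge
            + (n_bad_new - n_bad_old) * penalty_per_edge)
```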
- qgym.envs.initial_mapping.initial_mapping module
- qgym.envs.initial_mapping.initial_mapping_rewarders module
- qgym.envs.initial_mapping.initial_mapping_state module
  - InitialMappingState
    - InitialMappingState.__init__()
    - InitialMappingState.create_observation_space()
    - InitialMappingState.graphs
    - InitialMappingState.is_done()
    - InitialMappingState.is_truncated()
    - InitialMappingState.mapped_qubits
    - InitialMappingState.mapping
    - InitialMappingState.mapping_dict
    - InitialMappingState.n_nodes
    - InitialMappingState.obtain_info()
    - InitialMappingState.obtain_observation()
    - InitialMappingState.reset()
    - InitialMappingState.steps_done
    - InitialMappingState.update_state()
- qgym.envs.initial_mapping.initial_mapping_visualiser module