qgym.envs.routing.routing module

This module contains an environment for training an RL agent on the routing problem of OpenQL. The goal of routing is to make a quantum circuit executable by bringing into connection the physical qubits whose logical qubits interact in the circuit. The problem arises when the initial mapping leaves mismatches between the interaction graph and the QPU topology. The quantum circuit is represented as an interaction graph, where each node represents a qubit and each edge represents an interaction between two qubits as defined by the circuit (see the example below). The QPU structure is called the connection graph. In the connection graph, each node represents a physical qubit and each edge represents a connection between two qubits in the QPU.

          QUANTUM CIRCUIT                        INTERACTION GRAPH
       ┌───┐               ┌───┐
|q3>───┤ R ├───┬───────────┤ M ╞══                 q0 ────── q1
       └───┘   │           └───┘                            ╱
       ┌───┐ ┌─┴─┐         ┌───┐                           ╱
|q2>───┤ R ├─┤ X ├───┬─────┤ M ╞══                        ╱
       └───┘ └───┘   │     └───┘                         ╱
       ┌───┐       ┌─┴─┐   ┌───┐                        ╱
|q1>───┤ R ├───┬───┤ X ├───┤ M ╞══                     ╱
       └───┘   │   └───┘   └───┘                      ╱
       ┌───┐ ┌─┴─┐         ┌───┐                     ╱
|q0>───┤ R ├─┤ X ├─────────┤ M ╞══                q2 ─────── q3
       └───┘ └───┘         └───┘
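
As a rough illustration of the definitions above, the following sketch derives the interaction graph of the example circuit from its list of two-qubit gates and compares it with a connection graph. The helper name interaction_edges and the star-shaped topology are assumptions made for this example; they are not part of qgym.

```python
def interaction_edges(circuit):
    """Return the undirected edges of the interaction graph as sorted tuples."""
    return {tuple(sorted(gate)) for gate in circuit}


# Two-qubit gates of the example circuit, written as (control, target).
circuit = [(3, 2), (2, 1), (1, 0)]

# Hypothetical QPU topology: a star with physical qubit 0 in the center.
connection_edges = {(0, 1), (0, 2), (0, 3)}

interaction = interaction_edges(circuit)

# With the identity mapping (logical qubit i on physical qubit i), these
# interactions have no matching connection and would need routing.
mismatches = sorted(interaction - connection_edges)

print(sorted(interaction))  # [(0, 1), (1, 2), (2, 3)]
print(mismatches)           # [(1, 2), (2, 3)]
```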

A SWAP-gate changes the mapping from logical qubits to physical qubits at a certain point in the circuit and can thereby resolve mismatches left by the initial mapping. The goal is to place SWAP-gates in the quantum circuit to fix the mismatches, using as few SWAP-gates as possible. More advanced setups can also take other factors into account, such as the fidelity of connections in the QPU.
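
A minimal sketch of how a single SWAP can resolve a mismatch, assuming the mapping is stored as a list whose index is the physical qubit and whose value is the logical qubit. The functions apply_swap and is_executable and the star topology are hypothetical helpers for this illustration, not qgym functions.

```python
def apply_swap(mapping, edge):
    """Return a new mapping with the logical qubits on both ends of `edge` exchanged."""
    p1, p2 = edge
    new_mapping = list(mapping)
    new_mapping[p1], new_mapping[p2] = new_mapping[p2], new_mapping[p1]
    return new_mapping


def is_executable(gate, mapping, connection_edges):
    """Check whether the physical qubits hosting the gate's logical qubits are connected."""
    p1, p2 = mapping.index(gate[0]), mapping.index(gate[1])
    return tuple(sorted((p1, p2))) in connection_edges


connection_edges = {(0, 1), (0, 2), (0, 3)}  # hypothetical star topology
mapping = [0, 1, 2, 3]                       # mapping[physical] = logical
gate = (2, 1)                                # acts on logical qubits 2 and 1

before = is_executable(gate, mapping, connection_edges)
mapping = apply_swap(mapping, (0, 1))        # SWAP on physical connection (0, 1)
after = is_executable(gate, mapping, connection_edges)

print(before, after, mapping)  # False True [1, 0, 2, 3]
```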

State Space:

The state space is described by a RoutingState with the following attributes:

steps_done: Number of steps done since the last reset.
num_nodes: Number of physical qubits.
connection_graph: A networkx representation of the connection graph.
edges: List of edges of the connection graph used for decoding actions.
mapping: Array in which the index represents a physical qubit and the value represents the virtual (logical) qubit mapped to it. This is updated after each swap.
interaction_generator: Generator for interaction circuits.
interaction_circuit: An array of 2-tuples of integers, where every tuple represents an unspecified gate acting on the two qubits labeled by the integers in the tuple.
position: The position in the original interaction circuit.
max_observation_reach: Caps the maximum number of gates the agent can see ahead when making an observation.
observe_legal_surpasses: If True, a list called boolean_flags will be added to the observation space. The list boolean_flags has length observation_reach and contains Boolean values indicating whether the gates ahead can be executed.
observe_connection_graph: If True, the connection_graph will be incorporated in the observation_space.
swap_gates_inserted: A list of 3-tuples of integers, to register which gates to insert and where. Every tuple (g, q1, q2) represents the insertion of a SWAP-gate acting on logical qubits q1 and q2 before gate g in the interaction_circuit.
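
To make the (g, q1, q2) convention of swap_gates_inserted concrete, here is a small sketch that merges such tuples into an interaction circuit. The function insert_swaps, the "swap"/"gate" labels, and the sample data are made up for this illustration and do not claim to mirror how qgym assembles the routed circuit.

```python
def insert_swaps(interaction_circuit, swap_gates_inserted):
    """Return a gate list in which each SWAP (g, q1, q2) precedes gate number g."""
    routed = []
    for g, (a, b) in enumerate(interaction_circuit):
        # Emit every SWAP registered for this position, in insertion order.
        for position, q1, q2 in swap_gates_inserted:
            if position == g:
                routed.append(("swap", q1, q2))
        routed.append(("gate", a, b))
    return routed


interaction_circuit = [(3, 2), (2, 1), (1, 0)]
swap_gates_inserted = [(1, 0, 1)]  # one SWAP on qubits 0 and 1 before gate 1

routed = insert_swaps(interaction_circuit, swap_gates_inserted)
print(routed)
# [('gate', 3, 2), ('swap', 0, 1), ('gate', 2, 1), ('gate', 1, 0)]
```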

Observation Space:

The observation space is a Dict with 2-4 entries:

interaction_gates_ahead: Array with Boolean values for the upcoming connection gates in the quantum circuit.
mapping: The current state of the mapping.
(Optional) connection_graph: Adjacency matrix of the connection graph.
(Optional) is_legal_surpass_booleans: Array with boolean values stating whether a connection gate can be surpassed with the current mapping.
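
The legality flags can be thought of along the following lines. This is only a sketch under stated assumptions: the function legal_surpass_flags is hypothetical, the mapping is taken to be indexed by physical qubit, and padding the result with False up to the observation reach is a guess rather than documented qgym behavior.

```python
def legal_surpass_flags(interaction_circuit, position, mapping,
                        connection_edges, max_observation_reach):
    """Flag, for each visible upcoming gate, whether it runs under the current mapping."""
    flags = []
    window = interaction_circuit[position:position + max_observation_reach]
    for q1, q2 in window:
        # Physical qubits that currently host the two logical qubits.
        p1, p2 = mapping.index(q1), mapping.index(q2)
        flags.append(tuple(sorted((p1, p2))) in connection_edges)
    # Assumption: pad with False so the result always has the same length.
    flags += [False] * (max_observation_reach - len(flags))
    return flags


connection_edges = {(0, 1), (0, 2), (0, 3)}  # hypothetical star topology
interaction_circuit = [(3, 2), (2, 1), (1, 0)]
mapping = [0, 1, 2, 3]                       # mapping[physical] = logical

flags = legal_surpass_flags(interaction_circuit, 0, mapping, connection_edges, 5)
print(flags)  # [False, False, True, False, False]
```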

Action Space:

A valid action is an integer in the range [0, n_connections] (inclusive). The values 0 to n_connections-1 each represent adding a SWAP gate on the corresponding connection. The value n_connections indicates that the agent wants to surpass the current gate and move to the next gate.

Illegal actions will not be executed. An action is considered illegal when the agent wants to surpass a gate that cannot be executed with the current mapping.
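
The action encoding described above could be decoded roughly as follows. The functions decode_action and is_legal are hypothetical helpers written for this sketch, and the edge list and mapping are sample data. The real environment decodes actions internally using the edges attribute of its state.

```python
def decode_action(action, edges):
    """Translate an integer action into ("swap", edge) or ("surpass", None)."""
    n_connections = len(edges)
    if not 0 <= action <= n_connections:
        raise ValueError(f"action must lie in [0, {n_connections}], got {action}")
    if action == n_connections:
        return ("surpass", None)
    return ("swap", edges[action])


def is_legal(action, edges, current_gate, mapping):
    """A surpass is legal only if the current gate is executable; swaps are treated as legal."""
    kind, _ = decode_action(action, edges)
    if kind == "swap":
        return True
    p1, p2 = mapping.index(current_gate[0]), mapping.index(current_gate[1])
    return (p1, p2) in edges or (p2, p1) in edges


edges = [(0, 1), (0, 2), (0, 3)]  # hypothetical star topology
mapping = [0, 1, 2, 3]            # mapping[physical] = logical
current_gate = (2, 1)

print(decode_action(0, edges))                   # ('swap', (0, 1))
print(decode_action(3, edges))                   # ('surpass', None)
print(is_legal(3, edges, current_gate, mapping))  # False
```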


class qgym.envs.routing.routing.Routing(connection_graph, interaction_generator=None, max_observation_reach=5, observe_legal_surpasses=True, observe_connection_graph=False, *, rewarder=None, render_mode=None)[source]

Bases: Environment[Dict[str, ndarray[Any, dtype[int32]]], int]

RL environment for the routing problem of OpenQL.

__init__(connection_graph, interaction_generator=None, max_observation_reach=5, observe_legal_surpasses=True, observe_connection_graph=False, *, rewarder=None, render_mode=None)[source]

Initialize the action space, observation space, and initial states.

The supported render modes of this environment are "human" and "rgb_array".

Parameters:

connection_graph (Union[Graph, ArrayLike, list[int], list[Iterable[int]], tuple[int, ...], tuple[Iterable[int], ...]]) – Graph representation of the QPU topology. Each node represents a physical qubit and each edge represents a connection in the QPU topology. See parse_connection_graph() for supported formats.
interaction_generator (InteractionGenerator | None) – Interaction generator for generating interaction circuits. This generator is used to generate a new interaction circuit when Routing.reset() is called without an interaction circuit.
max_observation_reach (int) – Sets a cap on the maximum number of gates the agent can see ahead when making an observation. When larger than max_interaction_gates, the agent will always see all gates ahead in an observation.
observe_legal_surpasses (bool) – If True, a Boolean array of length observation_reach, indicating whether the gates ahead can be executed, will be added to the observation_space.
observe_connection_graph (bool) – If True, the connection_graph will be incorporated in the observation_space. A reason to set it to False is that the QPU topology rarely changes for a given machine, so an agent is typically trained for a single QPU topology, which it can learn implicitly from the rewards and/or from the legality Booleans if observe_legal_surpasses is enabled. Default is False.
rewarder (Rewarder | None) – Rewarder to use for the environment. Must inherit from Rewarder. If None (default), then BasicRewarder is used.
render_mode (str | None) – If "human", open a pygame screen visualizing the step. If "rgb_array", return an RGB array encoding of the rendered frame on each render call.

reset(*, seed=None, options=None)[source]

Reset the state and set/create a new interaction circuit.

To be used after an episode is finished.

Parameters:

seed (int | None) – Seed for the random number generator. It should only be provided (optionally) on the first reset call, i.e., before any learning is done.
options (Mapping[str, Any] | None) – Mapping with keyword arguments with additional options for the reset. Keywords can be found in the description of RoutingState.reset().

Return type:

tuple[dict[str, ndarray[Any, dtype[int32]]], dict[str, Any]]

Returns:

Initial observation and debugging info.