Optimal and robust control of quantum systems using reinforcement learning approaches

Access & Terms of Use
embargoed access
Embargoed until 2025-09-12
Copyright: SHINDI, Omar
Abstract
Optimal and robust quantum control methods are essential for developing quantum technology. This thesis proposes and examines the implementation of reinforcement learning algorithms for three quantum control tasks. First, a modified tabular Q-learning (TQL) algorithm is proposed for optimal quantum state preparation. This algorithm is compared with the standard TQL method and with other methods, such as the stochastic gradient descent and Krotov algorithms, for quantum state preparation in a two-qubit system. The results indicate that the modified TQL algorithm outperforms the standard TQL method in generating high-fidelity control protocols that guide the quantum state closer to the target state. Moreover, the modified TQL algorithm remains stable in discovering high-fidelity control protocols regardless of changes in the length of the control protocol. The modifications to standard TQL, including a modified action selection procedure, a delayed n-step reward function, and a dynamic ε-greedy method, improve stability and, in some cases, enhance performance in discovering globally optimal solutions. Subsequently, a modified deep Q-learning (DQL) method is proposed for optimal quantum state preparation under constraints such as limited control resources and fixed pulse duration. The modified DQL algorithm outperforms standard DQL in discovering high-fidelity control protocols and converges better to a more effective control policy. Additionally, the improved experience replay memory, delayed n-step reward function, and modified action selection method boost the exploration-exploitation ability of the DQL agent in discovering high-fidelity solutions for longer protocols. For optimal quantum gate design, this thesis introduces a modified dueling DQL method. This method demonstrates superior performance in constructing high-fidelity controls that mimic target gates and in discovering globally optimal or near-globally optimal control protocols.
Furthermore, the modified dueling DQL method converges more rapidly to a better control policy than the standard dueling DQL method. The second part of this thesis focuses on robust quantum gate design, introducing a modified dueling DQL method for the design of single-qubit gates. The proposed method outperforms standard dueling DQL in discovering robust, high-fidelity control protocols for single-qubit gates. However, robust gate design for multi-qubit systems poses more significant challenges than for single-qubit systems. To address this, this thesis employs Trust Region Policy Optimization (TRPO), an on-policy reinforcement learning method, for the design of robust gates for two-qubit and three-qubit systems. Additionally, this thesis proposes an improved Krotov method for robust gate design. The effectiveness of these methods is demonstrated through numerical examples of robust gate design for CNOT and Toffoli gates. Both TRPO and the improved Krotov method successfully construct robust, high-fidelity protocols capable of executing CNOT gates within a specified uncertainty range. For the Toffoli gate, TRPO constructs a robust control protocol applicable across varying parameters, while the improved Krotov method succeeds only with a longer control protocol. Increasing the number of control protocols increases the complexity of the problem and thus the challenge for the improved Krotov method. However, the gradient of the Hamiltonian with respect to the control pulse, used in the update procedure of the improved Krotov method, makes it suitable for longer control protocols. In contrast, TRPO shows stable performance in discovering robust control protocols regardless of the increase in the complexity of the control problem, whether from increasing the number of control protocols or from extending the length of the control. Third, this thesis explores model-free quantum gate design and calibration.
Designing a quantum gate is hard when a model of the quantum system is not available, owing to the challenges of characterizing the quantum system mathematically and of accounting for all relevant factors in the mathematical model. A modified reinforcement learning (RL) framework based on the DQL procedure is proposed for model-free quantum gate design and calibration. This framework relies only on measurements at the end of the evolution process to identify the optimal control strategy, without requiring access to a model of the quantum system. The efficacy of the proposed approach is established numerically, demonstrating its application to model-free quantum gate design and calibration using off-policy RL algorithms. In summary, this thesis presents innovative RL methods for optimal and robust quantum control, contributing to the development of more resilient and efficient quantum systems.
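For readers unfamiliar with the baseline, the following is a minimal sketch of standard tabular Q-learning with a decaying ε-greedy policy and a reward given only at the end of the protocol, applied to state preparation. It is not the modified algorithm of the thesis. Everything specific here is an assumption chosen for illustration: the single-qubit Hamiltonian H = σz + u·σx, the two bang-bang amplitudes, the protocol length, the segment duration, and all function and variable names.

```python
import numpy as np

# Pauli matrices for the illustrative single-qubit Hamiltonian H = SZ + u * SX.
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

ACTIONS = (-1.0, 1.0)  # hypothetical bang-bang control amplitudes u
N_STEPS = 6            # hypothetical number of pulse segments
DT = 0.4               # hypothetical duration of each segment


def step_unitary(u, dt=DT):
    """Propagator exp(-i H dt) for H = SZ + u * SX, via eigendecomposition."""
    vals, vecs = np.linalg.eigh(SZ + u * SX)
    return vecs @ np.diag(np.exp(-1j * vals * dt)) @ vecs.conj().T


UNITARIES = [step_unitary(u) for u in ACTIONS]


def fidelity(protocol, psi0, target):
    """State fidelity |<target|psi(T)>|^2 after applying a sequence of action indices."""
    psi = psi0
    for a in protocol:
        psi = UNITARIES[a] @ psi
    return abs(np.vdot(target, psi)) ** 2


def train(psi0, target, episodes=2000, alpha=0.1, gamma=1.0, seed=0):
    """Standard tabular Q-learning; the RL state is the action history so far."""
    rng = np.random.default_rng(seed)
    q = {}  # action-history tuple -> array of Q-values, one per action
    best_protocol, best_f = None, -1.0
    for ep in range(episodes):
        # Epsilon decays linearly from 1.0 and is floored at 0.05.
        eps = max(0.05, 1.0 - ep / (0.8 * episodes))
        state = ()
        for t in range(N_STEPS):
            qs = q.setdefault(state, np.zeros(len(ACTIONS)))
            if rng.random() < eps:
                a = int(rng.integers(len(ACTIONS)))  # explore
            else:
                a = int(np.argmax(qs))               # exploit
            nxt = state + (a,)
            done = t == N_STEPS - 1
            # Delayed reward: the fidelity is only observed at the final step.
            reward = fidelity(nxt, psi0, target) if done else 0.0
            future = 0.0 if done else np.max(q.setdefault(nxt, np.zeros(len(ACTIONS))))
            qs[a] += alpha * (reward + gamma * future - qs[a])
            state = nxt
        f = fidelity(state, psi0, target)
        if f > best_f:
            best_protocol, best_f = state, f
    return best_protocol, best_f


if __name__ == "__main__":
    psi0 = np.array([1, 0], dtype=complex)    # start in |0>
    target = np.array([0, 1], dtype=complex)  # aim for |1>
    protocol, f = train(psi0, target)
    print("best protocol:", [ACTIONS[a] for a in protocol], "fidelity:", f)
```

Indexing the Q-table by the full action history keeps the sketch simple but grows exponentially with the protocol length, which is one reason function approximation such as DQL becomes attractive for longer protocols.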
Publication Year
2023
Resource Type
Thesis
Degree Type
PhD Doctorate
UNSW Faculty