Beyond Exponentially Fast Mixing in Average-Reward Reinforcement Learning via Multi-Level Monte Carlo Actor-Critic [arXiv] W. A. Suttle*, A. S. Bedi*, B. Patel, B. Sadler, A. Koppel, D. Manocha. [preprint].
Information-Directed Policy Search in Sparse-Reward Settings via the Occupancy Information Ratio [to appear] W. A. Suttle, A. Koppel, J. Liu. 57th Annual Conference on Information Sciences and Systems (CISS), 2023.
Semi-Supervised Data-Generation for Offline Reinforcement Learning via the Occupancy Information Ratio W. A. Suttle, G. Warnell, A. Koppel, J. Liu. 5th Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM), 2022. Extended abstract.
Occupancy Information Ratio: Infinite-Horizon, Information-Directed, Parameterized Policy Search [arXiv] W. A. Suttle, A. Koppel, J. Liu. [preprint].
Policy Gradient for Ratio Optimization: A Case Study [IEEE] W. A. Suttle, A. Koppel, J. Liu. 56th Annual Conference on Information Sciences and Systems (CISS), 2022.
Reinforcement Learning for Cost-Aware Markov Decision Processes [PMLR] W. A. Suttle, K. Zhang, Z. Yang, D. Kraemer, J. Liu. 38th International Conference on Machine Learning (ICML), 2021.
A Multi-Agent Off-Policy Actor-Critic Algorithm for Distributed Reinforcement Learning [arXiv] W. A. Suttle, Z. Yang, K. Zhang, Z. Wang, T. Başar, J. Liu. 21st World Congress of the International Federation of Automatic Control (IFAC), 2020.
Reinforcement Learning Based Distributed Control of Dissipative Networked Systems [arXiv] K. C. Kosaraju, S. Sivaranjani, W. A. Suttle, V. Gupta, J. Liu. IEEE Transactions on Control of Networked Systems, 2021.
A Convergence Result for Regularized Actor-Critic Methods W. A. Suttle, Z. Yang, K. Zhang, J. Liu. NeurIPS 2019 Optimization Foundations of Reinforcement Learning Workshop. Workshop paper.