Publications

(2022). Online Statistical Inference for Matrix Contextual Bandit. In AOS.

(0001). Off-Policy Evaluation For Low-Rank Tensor Markov Decision Processes.