Some Reward–penalty Rules for the Multi-Armed Bandit Problem Which Are Asymptotically Optimal
Advances in Applied Probability - United Kingdom
doi 10.1017/s0001867800021121
Date
March 1, 1983
Authors
Publisher
Cambridge University Press (CUP)