Reinforcement learning strategies for clinical trials in nonsmall cell lung cancer.

Title: Reinforcement learning strategies for clinical trials in nonsmall cell lung cancer.
Publication Type: Journal Article
Year of Publication: 2011
Authors: Zhao, Yufan, Donglin Zeng, Mark A. Socinski, and Michael R. Kosorok
Journal: Biometrics
Volume: 67
Issue: 4
Pagination: 1422-33
Date Published: 2011 Dec
ISSN: 1541-0420
Keywords: Antineoplastic Agents; Artificial Intelligence; Carcinoma, Non-Small-Cell Lung; Clinical Trials as Topic; Data Interpretation, Statistical; Drug Therapy, Computer-Assisted; Humans; Lung Neoplasms; Outcome Assessment, Health Care; Prognosis; Reinforcement, Psychology; Treatment Outcome
Abstract

Typical regimens for advanced metastatic stage IIIB/IV nonsmall cell lung cancer (NSCLC) consist of multiple lines of treatment. We present an adaptive reinforcement learning approach to discover optimal individualized treatment regimens from a specially designed clinical trial (a "clinical reinforcement trial") of an experimental treatment for patients with advanced NSCLC who have not been treated previously with systemic therapy. In addition to the complexity of the problem of selecting optimal compounds for first- and second-line treatments based on prognostic factors, another primary goal is to determine the optimal time to initiate second-line therapy, either immediately or delayed after induction therapy, yielding the longest overall survival time. A reinforcement learning method called Q-learning is utilized, which involves learning an optimal regimen from patient data generated from the clinical reinforcement trial. Approximating the Q-function with time-indexed parameters can be achieved by using a modification of support vector regression that can utilize censored data. Within this framework, a simulation study shows that the procedure can extract optimal regimens for two lines of treatment directly from clinical data without prior knowledge of the treatment effect mechanism. In addition, we demonstrate that the design reliably selects the best initial time for second-line therapy while taking into account the heterogeneity of NSCLC across patients.
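
The backward-induction idea behind Q-learning with two decision points, as described in the abstract, can be illustrated with a minimal sketch. This is not the authors' method: it swaps the paper's censored support vector regression for simple per-cell averages (a tabular Q-function) and ignores censoring entirely. The function names, state and action labels, and all numbers in the toy data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def fit_tabular_q(states, actions, targets):
    """Estimate Q(s, a) as the mean target within each (state, action) cell."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for s, a, y in zip(states, actions, targets):
        totals[(s, a)] += y
        counts[(s, a)] += 1
    return {cell: totals[cell] / counts[cell] for cell in totals}


def greedy(q, state):
    """Return (best_action, best_value) among the actions observed in this state."""
    candidates = {a: v for (s, a), v in q.items() if s == state}
    best = max(candidates, key=candidates.get)
    return best, candidates[best]


def two_stage_q_learning(records):
    """Backward induction over two decision points.

    Each record is (s1, a1, r1, s2, a2, r2): the state, action, and reward
    at stage 1, followed by the state, action, and reward at stage 2.
    """
    s1, a1, r1, s2, a2, r2 = zip(*records)
    # Stage 2: fit Q2 directly to the final-stage reward.
    q2 = fit_tabular_q(s2, a2, r2)
    # Stage 1: the target is the immediate reward plus the value of acting
    # optimally at stage 2, not the reward of the action actually taken there.
    targets = [r + greedy(q2, s)[1] for r, s in zip(r1, s2)]
    q1 = fit_tabular_q(s1, a1, targets)
    return q1, q2


# Toy data with invented numbers (rewards loosely read as months of survival).
records = [
    ("good", "A", 6.0, "stable", "immediate", 4.0),
    ("good", "A", 6.0, "stable", "delayed", 8.0),
    ("good", "B", 5.0, "stable", "immediate", 4.0),
    ("good", "B", 5.0, "stable", "delayed", 8.0),
    ("poor", "A", 2.0, "progressing", "immediate", 5.0),
    ("poor", "A", 2.0, "progressing", "delayed", 1.0),
    ("poor", "B", 4.0, "progressing", "immediate", 5.0),
    ("poor", "B", 4.0, "progressing", "delayed", 1.0),
]

q1, q2 = two_stage_q_learning(records)
first_line_choice = {s: greedy(q1, s)[0] for s in ("good", "poor")}
second_line_timing = {s: greedy(q2, s)[0] for s in ("stable", "progressing")}
```

In this toy setting the estimated regimen is read off by calling `greedy` on `q1` for the first-line choice and on `q2` for the second-line timing. A realistic analysis would replace the cell averages with a regression model over continuous prognostic factors and would need to account for censored survival times.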

DOI: 10.1111/j.1541-0420.2011.01572.x
Alternate Journal: Biometrics
Original Publication: Reinforcement learning strategies for clinical trials in nonsmall cell lung cancer.
PubMed ID: 21385164
PubMed Central ID: PMC3138840
Grant List:
R29 CA075142 / CA / NCI NIH HHS / United States
P01 CA142538-01 / CA / NCI NIH HHS / United States
R01 CA075142-10 / CA / NCI NIH HHS / United States
CA075142 / CA / NCI NIH HHS / United States
R01 CA075142 / CA / NCI NIH HHS / United States
P01 CA142538 / CA / NCI NIH HHS / United States
CA142538 / CA / NCI NIH HHS / United States
R01 CA075142-11 / CA / NCI NIH HHS / United States
P30 ES010126 / ES / NIEHS NIH HHS / United States