Chun Li, PhD, collaborates with Vanderbilt colleagues on novel statistical methods

Cleveland, Ohio – A woman diagnosed with HIV works with a care team that considers many variables in tailoring a treatment plan that tracks whether the therapeutics being used are keeping the virus in check. The team may consider the estimated length of time since infection, her body mass index (BMI), which can fluctuate over time, her blood glucose range at different intervals, and changes in viral load as treatment progresses. An informed, comprehensive plan increases her chances of living with a manageable chronic condition.

How do researchers whose work shapes clinical treatment plans determine which variables, such as BMI and glucose level, are relevant when they review data from thousands of patients or study participants with distinct biomedical profiles? How do they screen out variables that appear to correlate with certain outcomes but whose association may be only coincidental?

“Big data holds so much promise, but mining and ranking information about biomedical variables to determine what correlations are relevant has been done with statistical methods that are not always adequate to the challenge,” says Chun Li, PhD, Associate Professor with the Department of Population and Quantitative Health Sciences at Case Western Reserve University’s School of Medicine and a member of the Cleveland Institute for Computational Biology. “Our goal is to advance the field by defining more appropriate calculations that capture the complexity of the work we do.”

Dr. Li and colleagues at Vanderbilt University Medical Center were recently awarded renewed funding from the NIH to continue developing and testing new statistical methods. The methods are being applied to HIV research but can also be applied to many other complex conditions.

The work draws on tens of thousands of de-identified records from people with and without HIV. Part of the work focuses on developing new statistical models that clarify which correlations among variables are relevant. Another part involves developing methods for dividing and recombining data, an approach needed when many computer nodes run complex calculations simultaneously on huge data sets.
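For readers curious about what dividing and recombining data can look like in practice, here is a minimal Python sketch. It is an illustration only and not the research team's method: it assumes the quantity of interest is an ordinary Pearson correlation between two variables, and every name in it (`Summary`, `summarize`, `combine`, `correlation`, `divide_and_recombine`) is hypothetical. Each chunk of data is reduced to a few running totals, as a separate computer node might do, and the totals are then merged to give the same answer as a single pass over all the data.

```python
import math
from dataclasses import dataclass
from functools import reduce


@dataclass
class Summary:
    """Running totals that are sufficient to compute a Pearson correlation."""
    n: int = 0
    sx: float = 0.0   # sum of x
    sy: float = 0.0   # sum of y
    sxx: float = 0.0  # sum of x squared
    syy: float = 0.0  # sum of y squared
    sxy: float = 0.0  # sum of x times y


def summarize(chunk):
    """Reduce one chunk of (x, y) pairs to its totals (the 'divide' step)."""
    s = Summary()
    for x, y in chunk:
        s.n += 1
        s.sx += x
        s.sy += y
        s.sxx += x * x
        s.syy += y * y
        s.sxy += x * y
    return s


def combine(a, b):
    """Merge the totals from two chunks (the 'recombine' step)."""
    return Summary(a.n + b.n, a.sx + b.sx, a.sy + b.sy,
                   a.sxx + b.sxx, a.syy + b.syy, a.sxy + b.sxy)


def correlation(s):
    """Pearson correlation computed from merged totals."""
    cov = s.sxy - s.sx * s.sy / s.n
    var_x = s.sxx - s.sx ** 2 / s.n
    var_y = s.syy - s.sy ** 2 / s.n
    return cov / math.sqrt(var_x * var_y)


def divide_and_recombine(pairs, n_chunks):
    """Split the data, summarize each chunk independently, then merge."""
    chunks = [pairs[i::n_chunks] for i in range(n_chunks)]
    # In a real cluster, each call to summarize() could run on its own node.
    summaries = [summarize(chunk) for chunk in chunks]
    return correlation(reduce(combine, summaries, Summary()))
```

Because only the small `Summary` objects need to travel between nodes, the full data set never has to sit on one machine. More complex statistical models generally do not reduce to simple totals like these, which is one reason dedicated methods for this setting are an area of research.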

“The earliest researchers who developed the field of biomedical statistics created methods that have been applied for decades,” said Bryan E. Shepherd, PhD, Professor and Vice Chair for Research, Department of Biostatistics at Vanderbilt University Medical Center. “We are building from that foundational work to develop new approaches to address complex analytical problems arising from today’s large biomedical datasets that we ultimately hope will improve the lives of individuals and populations.”

# # #

NIH Grant Number: 2R01AI09324-07
About CWRU School of Medicine, Department of Population and Quantitative Health Sciences
About the Cleveland Institute for Computational Biology
About Vanderbilt University Medical Center, Department of Biostatistics