UD computer scientist uses text mining to advance personalized cancer treatment

Doctors have many weapons in the fight against cancer, but choosing the right one can be a challenge. Soon they will have a new resource to inform them about treatment options.

A group of researchers is building a portal of genetic information to help doctors detect, diagnose and treat cancer optimally in each of their patients. Vijay Shanker, a professor of computer and information sciences at the University of Delaware, is part of the team, which recently received a grant of more than $1.2 million. Shanker will receive $215,737 for his portion of the project.

Cancerous tumors develop when genes go rogue. Sometimes the culprit is a mutation, a fault in a DNA sequence. Sometimes the problem lies in the level of gene expression, which leads to the creation of cancerous tissue. Researchers all over the world have linked mutations and expression patterns of hundreds of biomarkers to cancer incidence, treatment, and prognosis.

This information is held in a variety of databases, including those resulting from the Early Detection Research Network, The Cancer Genome Atlas, and the International Cancer Genome Consortium.

The findings have never been collected systematically in one place—but that’s about to change.

“We are trying to build on that work further by integrating large-scale experimental results with different kinds of information found in the research literature,” said Shanker.

The goal: Someday, oncologists will pull up this data on a tablet and use it to guide treatment decisions.

“For example, this information could be used to predict how mutations might affect drug response, an important aspect of personalized medicine,” said Shanker.

This information will also be helpful for clinical researchers developing new medicines.

Shanker is collaborating with researchers at George Washington University, the Swiss Institute of Bioinformatics, and NASA’s Jet Propulsion Laboratory.

While this project involves many components, Shanker is in charge of text mining: culling important information from the research literature. Shanker has developed programs to determine which text is relevant, parse that text, and then simplify it and organize it in a useful way. For this project, that means using his software to scour millions of research papers, pull out information about relevant biomarkers and outcomes, and organize it.
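For readers curious about what such a pipeline looks like in practice, the steps described above (filter for relevant text, parse it, then organize the results) can be illustrated with a toy Python sketch. This is not Shanker's software. The gene names GENEA and GENEB, the outcome vocabulary, the sample sentences, and the function names are all invented placeholders, and a real system would rely on curated lexicons and far more sophisticated language processing than keyword matching.

```python
import re
from collections import defaultdict

# Hypothetical vocabularies for illustration only; not real biomarkers.
BIOMARKERS = {"GENEA", "GENEB"}
OUTCOME_TERMS = {"resistance", "response", "prognosis", "survival"}


def tokenize(sentence):
    """Split a sentence into alphanumeric tokens."""
    return re.findall(r"[A-Za-z0-9]+", sentence)


def is_relevant(sentence):
    """Step 1: keep sentences naming both a biomarker and an outcome term."""
    words = tokenize(sentence)
    has_marker = any(w in BIOMARKERS for w in words)
    has_outcome = any(w.lower() in OUTCOME_TERMS for w in words)
    return has_marker and has_outcome


def extract(sentence):
    """Step 2: reduce a sentence to (biomarker, outcome) pairs."""
    words = tokenize(sentence)
    markers = [w for w in words if w in BIOMARKERS]
    outcomes = [w.lower() for w in words if w.lower() in OUTCOME_TERMS]
    return [(m, o) for m in markers for o in outcomes]


def mine(sentences):
    """Step 3: organize the extracted pairs into an index keyed by biomarker."""
    index = defaultdict(set)
    for sentence in sentences:
        if is_relevant(sentence):
            for marker, outcome in extract(sentence):
                index[marker].add(outcome)
    return {marker: sorted(found) for marker, found in index.items()}


# Invented placeholder sentences, not findings from any paper.
SAMPLE = [
    "Mutations in GENEA were associated with improved response to therapy.",
    "The weather was pleasant during the conference.",
    "GENEB mutations predicted resistance and poorer survival.",
]

if __name__ == "__main__":
    print(mine(SAMPLE))
```

The sketch pairs every biomarker in a sentence with every outcome term in the same sentence, which is a crude stand-in for the parsing and simplification the article describes; it says nothing about whether the link is positive, negative, or merely mentioned.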

Shanker has been doing text mining and natural language processing for nearly three decades. Over the past few years, projects like this one, and another major NIH initiative Shanker is involved in, have taken him in a new direction—health-focused research.

“I’ve worked mostly on theoretical projects,” he said. “Recent research has been more driven by application.”

This work could eventually save lives, and that’s something Shanker considers often.

“I have really become fascinated by this area and the idea that maybe my research can have an impact on people’s health,” he said.