Machine learning
Computer science research aims to overcome challenges of data-driven discovery.
New discoveries are being fueled by the ability of researchers to access, use and share massive amounts of data. The challenge is getting large, disparate data sets to, in effect, talk to one another.
That is the goal of Vasant Honavar, professor of computer science, who has been tackling this problem for several years.
The National Science Foundation in August awarded Honavar and his research team a $450,000 grant to support additional work on the project. It's the latest in a series of grants secured by his team to advance "machine learning," the subfield of computer science that seeks to develop computational models of learning.
The funds will go toward the development of algorithms and software to support "collaborative, integrative knowledge acquisition" from a variety of data sources. The project will involve several graduate and undergraduate students and collaborators.
"Our research is aimed at overcoming some of the challenges in data-driven scientific discovery," said Honavar, director of ISU's Center for Computational Intelligence, Learning and Discovery.
Researchers and others have access to a vast number of useful information sources. But the sheer size of those sources and the differences among them in access, structure, level of detail and "data semantics" pose challenges.
Honavar noted that these obstacles hinder the ability of researchers to integrate and analyze "dissimilar forms of data, and to collaborate across organizational and disciplinary boundaries."
Honavar began by working with biologists who needed better ways to integrate and analyze data from multiple large databases such as GenBank (for genome sequences) and the Protein Data Bank (for protein structures).
"Over the past several decades, biologists have been accumulating detailed knowledge of the building blocks of biological systems," Honavar said. "Yet, biological systems are more than simply a collection of molecules or cells or organs. We need to understand how the parts work together to form dynamic functional units.
"Like the proverbial blind men trying to describe an elephant, we have been limited by their instruments of observation and our ability to synthesize information across disciplines and across scales."
He added that other science disciplines, engineering and even social sciences are facing similar challenges.
"You want to be able to leverage all of that data," Honavar said.
Fundamental advances in many disciplines are increasingly dependent on robust approaches to building and testing predictive models from disparate data sources.
Prototype software will be created, deployed and tested on the Internet. The software will be an open source product (meaning the code is available to others for their application or improvement).
"It's something we like to do," Honavar said about the open source software. "Potentially there is a larger community beyond us who can benefit from the software and participate in further developing it."

Vasant Honavar
Around LAS
October 29 to November 11, 2007