Floating on the data

Technology Review reports on a recent conference that aimed to spread data mining techniques. The point of departure is the growth of data from electronic sensor networks in industry and from online social media: "The New Big Data".

People have been working with graphs of data for hundreds of years, but the graphs now being plotted from social networks or sensor networks are of an unprecedented scale, Apte says. "These are gigantic graphs," he says. "You're talking about millions of nodes and tens of millions of links."

Dealing with graphs of that size and scope, and applying modern analytic tools to them, calls for better algorithms and other innovations.

I'm dealing here with genetic data networks, which are rapidly becoming denser, and we're beginning to apply these kinds of network methods to understand them. Once you pass beyond the analysis of a single locus and spread the data across the whole genome, it becomes necessary to go beyond a single tree and to understand the relationships (and commonalities) among the genealogical networks that connect people with each other. In some ways, this shares more with epidemiological modeling than with traditional genetics.