Annual Report 2008.pdf - SAMSI


Harvard University

patrick@seas.harvard.edu

“Sampling Random Networks to Discover Structure: Models and Methods”

Benjamin P. Olding and Patrick J. Wolfe (joint work with Omar Abdala and David Parkes)

Statistics and Information Sciences Laboratory, Harvard University

While the study of random graphs has yielded many classical results, today's vast network data sets often preclude such direct analysis techniques. In this talk we describe alternative models and methods to discover structure in random network topology via sampling. Though traditional models have well-established asymptotic properties and admit straightforward simulation techniques, they often fail to describe properties of observed random networks, such as constraints on node linkages. In such scenarios, exact simulation in a way that respects the underlying measure rapidly becomes difficult, and fewer large-sample limits are known. We show that stochastic computation can be used to effectively sample from such spaces and hence estimate functionals of interest. As an example, we sample from a null model designed for the problem of "community detection" in large graphs, a task closely related to spectral clustering. We give estimates of the power of this test for several possible models of topological structure.
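To make the idea of sampling under linkage constraints concrete, here is a minimal sketch. The abstract does not say which constraint or which sampler the authors use, so everything here is an assumption: the constraint is taken to be a fixed degree sequence on a simple graph, and the sampler is the standard degree-preserving double edge swap chain. The names `edge_swap_sample` and `degrees` are hypothetical.

```python
import random
from collections import Counter


def degrees(edges):
    """Count how many edges touch each node."""
    counts = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1
    return counts


def edge_swap_sample(edges, n_steps, seed=0):
    """Run a double edge swap chain on a simple undirected graph.

    Assumed constraint (not from the abstract): every node keeps its degree,
    and no self-loops or repeated edges are allowed.
    """
    rng = random.Random(seed)
    edges = [tuple(sorted(e)) for e in edges]
    edge_set = set(edges)
    for _ in range(n_steps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:
            c, d = d, c  # choose one of the two possible rewirings
        new1 = tuple(sorted((a, d)))
        new2 = tuple(sorted((c, b)))
        # Reject proposals that would create a self-loop or a repeated edge;
        # a rejected proposal leaves the graph unchanged for this step.
        if a == d or c == b or new1 in edge_set or new2 in edge_set:
            continue
        edge_set -= {edges[i], edges[j]}
        edge_set |= {new1, new2}
        edges[i], edges[j] = new1, new2
    return edges


# Usage: rewire an 8-node ring while keeping every node at degree 2.
ring = [(k, (k + 1) % 8) for k in range(8)]
sample = edge_swap_sample(ring, n_steps=200)
```

Repeating such draws and evaluating a statistic on each one gives a Monte Carlo estimate of that statistic's distribution under the assumed null model, which is the general pattern the abstract describes.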

Jun Yang<br />

Duke University<br />

junyang@cs.duke.edu<br />

“Data-Driven Processing in Sensor Networks”

Wireless sensor networks enable data collection from the physical environment on unprecedented scales. In this talk, I will describe some data processing problems that arise in building an environmental sensing network in Duke Forest, in collaboration with ecologists and statisticians. Because of severe resource constraints on battery-powered sensor nodes, it is infeasible to collect and report all raw readings for centralized processing. An effective approach is model-driven data acquisition, which avoids acquiring readings that can be accurately predicted from known spatio-temporal models of data. We argue for an alternative, data-driven approach, which exploits models in optimizing push-based reporting, but does not depend on the quality of models for correctness. A particularly thorny issue with push-based reporting is transmission failures, which are common in sensor networks, and make failed reports indistinguishable from intentionally suppressed ones. The cost of implementing reliable transmissions is prohibitively high. We show how to inject application-level redundancy in data reporting to enable efficient, effective, and principled resolution of uncertainty in the missing data.
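The push-based reporting idea can be illustrated with a small sketch. It is not the speaker's system: it assumes the simplest possible shared model (the next reading equals the last reported value), a hypothetical error bound `eps`, and perfectly reliable delivery, so it leaves out the redundancy scheme for lost reports that the abstract mentions. The function names `suppress` and `reconstruct` are hypothetical.

```python
def suppress(readings, eps):
    """Node side: push a report only when the shared model would be off by more than eps."""
    reports = []
    last = None  # last value the base station is known to hold
    for i, x in enumerate(readings):
        if last is None or abs(x - last) > eps:
            reports.append((i, x))
            last = x
    return reports


def reconstruct(reports, n):
    """Base-station side: hold the last reported value until a new report arrives."""
    by_index = dict(reports)
    estimate = []
    current = None
    for i in range(n):
        if i in by_index:
            current = by_index[i]
        estimate.append(current)
    return estimate


# Usage: six readings, of which only three need to be transmitted.
readings = [20.0, 20.1, 20.2, 21.0, 21.1, 19.0]
reports = suppress(readings, eps=0.5)
estimate = reconstruct(reports, len(readings))
```

In this sketch a poor model only increases the number of reports; the reconstructed values still stay within `eps` of the true readings, which is one way to read the claim that correctness does not depend on model quality. If a report were lost, the base station could not tell the loss apart from a suppression, which is the problem the abstract goes on to address.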

Bin Yu<br />

University of California, Berkeley<br />

binyu@stat.berkeley.edu
