On unbiased sampling for unstructured peer-to-peer networks
In IMC '06: Proceedings of the 6th ACM SIGCOMM conference on Internet measurement (2006), pp. 27-40.
Abstract
This paper addresses the difficult problem of selecting representative samples of peer properties (e.g., degree, link bandwidth, number of files shared) in unstructured peer-to-peer systems. Due to the large size and dynamic nature of these systems, measuring the quantities of interest on every peer is often prohibitively expensive, while sampling provides a natural means for estimating system-wide behavior efficiently. However, commonly-used sampling techniques for measuring peer-to-peer systems tend to introduce considerable bias for two reasons. First, the dynamic nature of peers can bias results towards short-lived peers, much as naively sampling flows in a router can lead to bias towards short-lived flows. Second, the heterogeneous nature of the overlay topology can lead to bias towards high-degree peers.

We present a detailed examination of the ways that the behavior of peer-to-peer systems can introduce bias and suggest the Metropolized Random Walk with Backtracking (MRWB) as a viable and promising technique for collecting nearly unbiased samples. We conduct an extensive simulation study to demonstrate that the proposed technique works well for a wide variety of common peer-to-peer network conditions. Using the Gnutella network, we empirically show that our implementation of the MRWB technique yields more accurate samples than relying on commonly-used sampling techniques. Furthermore, we provide insights into the causes of the observed differences. The tool we have developed, ion-sampler, selects peer addresses uniformly at random using the MRWB technique. These addresses may then be used as input to another measurement tool to collect data on a particular property.
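The degree bias mentioned in the abstract, and the general idea of correcting it with a Metropolized walk, can be illustrated with a minimal sketch. This is not the paper's ion-sampler implementation: it assumes a static, fully known toy graph, omits the backtracking step that deals with departing peers, and uses the standard Metropolis-Hastings acceptance rule for a uniform target distribution. The function name `metropolized_walk`, the toy overlay, and all parameters are hypothetical.

```python
import random
from collections import Counter


def metropolized_walk(graph, start, steps, rng):
    """Run a Metropolis-Hastings walk for `steps` hops and return the final node.

    Assumes `graph` is an undirected adjacency list (dict of node -> neighbor list).
    """
    current = start
    for _ in range(steps):
        # Propose a neighbor chosen uniformly at random.
        candidate = rng.choice(graph[current])
        # Accept with probability min(1, deg(current) / deg(candidate)).
        # This makes the transition probabilities symmetric, so the walk's
        # stationary distribution is uniform instead of degree-proportional.
        if rng.random() < len(graph[current]) / len(graph[candidate]):
            current = candidate
    return current


# Hypothetical toy overlay: node 0 is a hub (degree 5), nodes 1-5 form a ring
# and each also links to the hub (degree 3).
graph = {
    0: [1, 2, 3, 4, 5],
    1: [0, 2, 5],
    2: [0, 1, 3],
    3: [0, 2, 4],
    4: [0, 3, 5],
    5: [0, 4, 1],
}

rng = random.Random(2006)  # fixed seed so the run is repeatable
samples = [metropolized_walk(graph, start=0, steps=50, rng=rng) for _ in range(6000)]
counts = Counter(samples)

for node in sorted(graph):
    print(node, round(counts[node] / len(samples), 3))
```

On this toy graph a plain random walk would end at the hub about 5/20 of the time, since its stationary distribution is proportional to degree, whereas the Metropolized walk should select each of the six nodes with frequency close to 1/6.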