10.07.2015 Views

A Square Root Topologys To Find Unstructured Peer-To ... - ijcsmr


International Journal of Computer Science and Management Research, Vol. 2, Issue 3, March 2013, ISSN 2278-733X

Performance analysis shows that, in P2P networks, the search path length is $O(N^{c_2})$ (where $0 < c_2 < 1$) if any peer i on the path can find another peer j, closer to the destination peer d than i, to receive and forward the query toward d. In addition to this performance analysis, our theoretical analysis is confirmed in simulations. We compare our proposal with two representative distributed algorithms. With our similarity-aware search protocol, overlay networks that reflect the similarity of participating peers can considerably decrease query traffic compared with a search protocol based on blind flooding.

2. Our Proposal: The Square-Root Topology

Consider a peer-to-peer network with N peers. Each peer k in the network has degree $d_k$. The total degree in the network is D, where

$$D = \sum_{k=1}^{N} d_k.$$

Equivalently, the total number of connections in the network is D/2. We define the square-root topology as a topology where the degree of each peer is proportional to the square root of the popularity of the peer's content. If we define $g_k$ as the proportion of searches submitted to the system that are satisfied by content at peer k, then a square-root topology has $d_k \propto \sqrt{g_k}$ for all k.

Consider a user who submits a search s that is satisfied by content at a particular peer k. Until the search is processed by the network, we do not know which peer k is. How many hops will the search message take before it arrives at k? The expected length of the random walk depends on the degree of k:

Lemma 1. If the network is connected and non-bipartite, then the expected number of hops for search s to reach peer k is $D/d_k$.

To see this, we model the random walk as a Markov chain, where the probability of transition from state i to state j depends only on i and j, and not on any other history about the process.
The states of the Markov chain are the peers in the system, with $1 \le i, j \le N$. Associated with a Markov chain is a transition matrix T that gives the probability that a transition occurs from a state i to another state j. Here, this transition probability is the probability that a search message that is at peer i is next sent to peer j. With random walks, the transition probability from peer i to peer j is $1/d_i$ if i and j are neighbors, and zero otherwise. The result of Lemma 1 depends only on the node degrees, and not on the structure: the expected length of a walk does not depend on which peers are connected to which other peers. This property follows from the fact that the Markov chain converges to the same stationary distribution regardless of which vertices are connected.

This model assumes that peers forward search messages to a randomly chosen neighbor, even if the search message has just come from that neighbor or has already visited that neighbor. This assumption simplifies the Markov chain analysis. Previous work on random walks has noted that avoiding previously visited peers can improve the efficiency of walks, and we examine this possibility in the simulation results in the next section.

Using the transition matrix, we can calculate the probability that a search message is at a given peer at a given point in time. First, we define an N-element vector $V_0$, called the initial distribution vector; the kth entry in $V_0$ represents the probability that a random walk search starts at peer k. The entries of $V_0$ sum to 1. Given T and $V_0$, we can calculate $V_1$, where the kth entry represents the probability of the search being at peer k after one hop, as $V_1 = T V_0$. In general, the vector $V_m$, representing the probabilities that a search is at a given peer after m hops, is recursively defined as $V_m = T V_{m-1}$. Under the conditions of the lemma, $V_m$ converges to a stationary distribution vector $V_s$, representing the probability that a random walk search visits a given peer at a particular point in time. It can be shown that the kth entry of $V_s$ is $d_k/D$.
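The convergence of $V_m$ to $d_k/D$ can be checked numerically. The following is a minimal sketch, assuming a small hand-made graph of five peers (the graph, the function names `step` and `stationary`, and the hop count of 500 are illustrative choices, not taken from the paper). Each hop moves the probability mass at peer i to its neighbors with probability $1/d_i$, which is the recursion $V_m = T V_{m-1}$ written out edge by edge.

```python
# Hypothetical example graph (not from the paper): 5 peers, connected and
# non-bipartite because it contains the triangle 0-1-2.
NEIGHBORS = {
    0: [1, 2, 3],
    1: [0, 2],
    2: [0, 1, 4],
    3: [0, 4],
    4: [2, 3],
}


def step(v, neighbors):
    """Advance the distribution by one hop of the simple random walk."""
    nxt = [0.0] * len(v)
    for i, nbrs in neighbors.items():
        share = v[i] / len(nbrs)  # transition probability 1/d_i per neighbor
        for j in nbrs:
            nxt[j] += share
    return nxt


def stationary(neighbors, start=0, hops=500):
    """Iterate V_m = T V_{m-1} from a walk that starts at peer `start`."""
    v = [0.0] * len(neighbors)
    v[start] = 1.0  # initial distribution V_0
    for _ in range(hops):
        v = step(v, neighbors)
    return v


degrees = [len(NEIGHBORS[k]) for k in sorted(NEIGHBORS)]
total_degree = sum(degrees)  # D
v_s = stationary(NEIGHBORS)

for k, d in enumerate(degrees):
    print(f"peer {k}: V_s = {v_s[k]:.4f}   d_k/D = {d / total_degree:.4f}")
```

Changing `start`, or rewiring the graph while keeping the same degrees, should leave the two printed columns in agreement, which is the structure-independence property described above.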
In the steady state, the probability that a search message is at a given peer k is $d_k/D$.

We can therefore treat search routing as a series of experiments, each choosing a random peer k from the population of N peers with probability $d_k/D$. A successful experiment occurs when a search chooses a peer with matching content. The number of experiments before the search message successfully reaches a particular peer k is a geometric random variable with expected value

$$\frac{1}{d_k/D} = \frac{D}{d_k}.$$

This is the result stated by Lemma 1.

A given search thus requires $D/d_k$ expected hops to reach peer k. We assume that a search will be satisfied by a single peer. We define $g_k$ to be the probability that peer k is the goal peer, with $g_k \ge 0$ and $\sum_{k=1}^{N} g_k = 1$. The $g_k$ vary from peer to peer. The proportion of searches seeking peer k is $g_k$, and the expected number of hops that will be taken by searches seeking peer k is $D/d_k$, so the expected number of hops taken by searches is:

$$H = \sum_{k=1}^{N} g_k \cdot \frac{D}{d_k} \qquad (1)$$

It turns out that H is minimized when the degree of a peer is proportional to the square root of the popularity of the documents at that peer. This is the square-root topology.

Theorem 1. H is minimized when

$$d_k = \frac{D \sqrt{g_k}}{\sum_{i=1}^{N} \sqrt{g_i}} \qquad (2)$$

D. Raman et al., p. 1732, www.ijcsmr.org

Proof:


We use the method of Lagrange multipliers to minimize equation (1). Recall the constraint that all degrees $d_k$ sum to D; the constraint for our optimization problem is

$$f = \left(\sum_{k=1}^{N} d_k\right) - D = 0.$$

We must find a Lagrange multiplier $\lambda$ that satisfies $\nabla H = \lambda \nabla f$ (where $\nabla$ is the gradient operator). First, treating the $g_k$ values as constants,

$$\nabla H = \sum_{k=1}^{N} \left(-D \cdot g_k \cdot d_k^{-2}\right) \hat{u}_k \qquad (3)$$

where $\hat{u}_k$ is a unit vector. Next,

$$\lambda \nabla f = \lambda \sum_{k=1}^{N} \hat{u}_k = \sum_{k=1}^{N} \lambda \hat{u}_k \qquad (4)$$

Because $\nabla H = \lambda \nabla f$, we can set each term in the summation of equation (3) equal to the corresponding term of the summation of equation (4), so that

$$-D \cdot g_k \cdot d_k^{-2} \cdot \hat{u}_k = \lambda \hat{u}_k.$$

Solving for $d_k$ gives

$$d_k = \frac{\sqrt{D} \cdot \sqrt{g_k}}{\sqrt{-\lambda}} \qquad (5)$$

Now we will eliminate $\lambda$, the Lagrange multiplier. Substituting equation (5) into f gives

$$\sum_{k=1}^{N} \frac{\sqrt{D} \cdot \sqrt{g_k}}{\sqrt{-\lambda}} = D \qquad (6)$$

and solving gives

$$\frac{1}{\sqrt{-\lambda}} = \frac{D}{\sqrt{D} \sum_{k=1}^{N} \sqrt{g_k}} \qquad (7)$$

If we change the dummy variable of the summation in equation (7) from k to i, and substitute back into equation (5), we get equation (2).

Theorem 1 shows that the square-root topology is the optimal topology over a large network. The total degree D does not impact performance: substituting equation (2) into equation (1) eliminates D, leaving $H = \left(\sum_{k=1}^{N} \sqrt{g_k}\right)^2$. Any value of D that ensures the network is connected is therefore sufficient. The result also does not depend on which peers are connected to which other peers, because of the properties of the stationary distribution of Markov chains. Peer degrees must be integer values. Therefore, the optimal peer degrees must be calculated by rounding the value calculated in equation (2).

3. Experimental Results On The Square-Root Topology

Our analysis of the square-root topology is based on an idealized model of searches and content. Real peer-to-peer systems are less idealized; for example, searches may match content at multiple peers. In this section we present simulation results to measure the performance of a square-root topology.
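Before turning to the simulations, the degree assignment of equation (2) and the hop count of equation (1) can be made concrete with a small sketch. The popularity values `g`, the two total degrees, and the function names are hypothetical choices for illustration; clamping each rounded degree to at least 1 is an assumption loosely mirroring a minimum of one link per peer, not a rule stated in the derivation.

```python
import math


def sqrt_degrees(popularity, total_degree):
    """Equation (2): d_k = D * sqrt(g_k) / sum_i sqrt(g_i)."""
    norm = sum(math.sqrt(g_i) for g_i in popularity)
    return [total_degree * math.sqrt(g_k) / norm for g_k in popularity]


def expected_hops(popularity, degrees):
    """Equation (1): H = sum_k g_k * D / d_k, where D is the sum of the degrees."""
    total = sum(degrees)
    return sum(g_k * total / d_k for g_k, d_k in zip(popularity, degrees))


# Hypothetical popularity values g_k; they sum to 1.
g = [0.5, 0.25, 0.125, 0.125]

for D in (16, 64):
    ideal = sqrt_degrees(g, D)
    # Assumption: round to integers and keep at least one link per peer.
    rounded = [max(1, round(d)) for d in ideal]
    print(
        f"D={D}: ideal={[f'{d:.2f}' for d in ideal]} rounded={rounded} "
        f"H_ideal={expected_hops(g, ideal):.4f} "
        f"H_rounded={expected_hops(g, rounded):.4f}"
    )
```

Under these assumptions the ideal H is the same for both values of D, while rounding can only increase H, since equation (2) is the minimizer.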
We use simulation because we wish to examine the performance of large networks, and it is difficult to deploy that many live peers for research on the Internet. Our first metric is the total number of messages sent under each search method. Searches terminate when enough results are found, where enough is defined as a user-specified goal number of results G.

The results show:
• Random walks perform best on the square-root topology, requiring up to 45 percent fewer messages than in a power-law topology. The square-root topology also results in up to 50 percent less search latency than power-law networks, even when multiple random walks are started in parallel.
• The square-root topology is the best topology when replication is used, and the combination of square-root topology and replication provides higher efficiency than either technique alone.
• Other search techniques based on random walks, such as biased high-degree, biased towards most results or fewest result hop neighbors, and random walks with state keeping, performed best on the square-root topology, decreasing the number of messages sent by as much as 52 percent compared to a power-law topology.
• The square-root topology performed better than other topology structures, including a constant-degree network and a topology with peer degrees directly proportional to peer popularity.
• In super-peer networks, the square-root topology was the best topology for connecting the super-peers.

We first describe our experimental setup, and then present our results.

3.1 Experimental setup

Our results were obtained using a discrete-event peer-to-peer simulator. This simulator models individual peers, documents and queries, as well as the topology of the peer-to-peer overlay. Searches are sent to individual peers, and then walk around the network according to the routing algorithm.

Parameter                 Value
Number of peers           20,000
Documents                 631,320
Queries submitted         100,000
Goal number of results    10
Average links per peer    4
Minimum links per peer    1
Table 1. Experimental parameters.

Simulations used networks with 20,000 peers. Simulation parameters are listed in Table 1.

Because the square-root topology is based on the popularity of documents stored at different peers, it is important to accurately model the number of queries that match each document, and the peers at which each document is stored. It is difficult to gather complete query, document and location data for tens of thousands of real peers. Therefore, we used a previously described content model, which is based on a trace of real queries and documents, and more accurately describes real systems than simple uniform or Zipfian distributions. We downloaded text web pages from 1,000 real web sites, and


evaluated keyword queries against the web pages. Then we generated 20,000 synthetic queries matching 631,320 synthetic documents, stored at 20,000 peers, such that the statistical properties of our synthetic content model matched those of the real trace. The resulting content model allowed us to simulate a network of 20,000 peers. In this simulation, we submitted random queries chosen from the set of 20,000 to produce a total of 100,000 query submissions. Elsewhere, we describe the details of this method of generating synthetic documents and queries, and provide experimental evidence that the content model, though synthetic, produces highly accurate simulation results. The synthetic model retains an accurate distribution of the popularity of peer content, which is critical for the construction of the square-root topology.

3.2 Random walks

We conducted an experiment to examine the performance of random walk searches in different topologies. The queries matched documents stored at different peers, and had a goal of G = 10 results. We compared three different topologies.
• A square-root topology, generated by assigning a degree to each peer based on equation (2), and then creating links between randomly chosen pairs of peers based on the assigned degrees.
• A low-skew power-law topology, generated using the PLOD algorithm. In this network, α = 0.58.
• A high-skew power-law topology, generated using the PLOD algorithm, with α = 0.74.

Random walks in the square-root topology require 8,940 messages per search, 26 percent fewer than random walks in the low-skew power-law topology and 45 percent fewer than random walks in the high-skew power-law topology. In the power-law topologies, searches tend toward high-degree peers, even if the walk is truly random and not explicitly directed to high-degree peers.
Unless these high-degree peers also have the most popular content, the result is that searches have a low probability of going to the peer with matching content, and the number of hops, and thus messages, increases. If the power-law distribution is more skewed, then the probability that searches will congregate at the wrong peers is higher, and the total number of messages necessary to get to the right peers increases.

Even though random walks perform best in the square-root topology, a large number of messages still need to be sent. Nevertheless, the result is a significant improvement over traditional Gnutella-style search: flooding in a high-skew power-law network, with a TTL of five in order to find at least ten results on average, requires 17,700 messages per search. The above results are for simple, unoptimized random walks. Adding optimizations such as proactive replication or neighbor indexing reduces the cost of a random walk search, and results for these techniques show that the square-root topology is still best.

Another issue with random walks is that the search latency is high, as queries may have to walk many hops before finding content. To deal with this, Lv et al. propose creating multiple, parallel random walks for each search. Since the network processes these walks in parallel, the result is reduced search latency. We ran experiments where we created 2, 10, 15, 20, 30, and 100 parallel random walks for each search, and measured search latency.

Walks    Square-root    Power-law low-skew    Power-law high-skew
1        8930           12090                 16350
2        4500           6210                  8970
5        1800           2490                  3740
10       904            1250                  1880
20       454            630                   947
100      96             130                   194
Table 2. Parallel random walks: search latency (ticks).

These results are shown in Table 2. The square-root topology provided the lowest search latency, regardless of the number of parallel walks that were generated. The improvement for the square-root topology was consistently about 27 percent compared to the low-skew power-law topology, and about 50 percent compared to the high-skew power-law topology.
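The relative improvements can be recomputed directly from the latencies in Table 2. This is a small arithmetic sketch; the dictionary layout and the helper name `reduction` are illustrative, and the numbers are copied from the table as printed.

```python
# Search latency in ticks, copied from Table 2:
# walks -> (square-root, low-skew power-law, high-skew power-law)
TABLE_2 = {
    1: (8930, 12090, 16350),
    2: (4500, 6210, 8970),
    5: (1800, 2490, 3740),
    10: (904, 1250, 1880),
    20: (454, 630, 947),
    100: (96, 130, 194),
}


def reduction(ours, baseline):
    """Percentage by which `ours` is lower than `baseline`."""
    return 100.0 * (baseline - ours) / baseline


for walks, (sqrt_ticks, low_skew, high_skew) in TABLE_2.items():
    print(
        f"{walks:>3} walks: "
        f"{reduction(sqrt_ticks, low_skew):.1f}% lower than low-skew, "
        f"{reduction(sqrt_ticks, high_skew):.1f}% lower than high-skew"
    )
```

Printing one line per row lets each entry be compared with the percentages quoted in the text, rather than relying on a single averaged figure.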
Even when searches are walking in parallel, the square-root topology helps those search walks quickly arrive at the peers with the right content.

3.3 Proactive Replication

The square-root topology is complementary to square-root replication. When it is feasible to proactively replicate content, square-root replication specifies that the number of copies made of content should be proportional to the square root of the popularity of the content. The square-root topology can be used whether or not proactive replication is used, but the combination of the two techniques can provide significant performance benefits. We conducted an experiment where we replicated content according to square-root replication. Then we connected peers in the square-root, high-skew power-law, and low-skew power-law topologies, and measured the performance of random walk searches. Again, G = 10. As expected, proactive


replication provided better performance than no replication. Proactive replication performs best with the square-root topology, requiring only 2,830 messages per search, 42 percent fewer than in the low-skew power-law network and 56 percent fewer than in the high-skew power-law network. Replication makes more copies of the documents that a search will match, while the square-root topology makes it easier for the search to get to the peers where the documents are stored. The combination of the two techniques provides more efficiency than either technique alone. For example, the square-root topology with proactive replication required 68 percent fewer messages than the square-root topology without replication.

3.4 Other search walk techniques

We examined the performance of other walk-based techniques on different topologies. We compared three other techniques based on random walks:
• Biased high degree: messages are preferentially forwarded to neighbors that have the highest degree.
• Most results: messages are forwarded preferentially to neighbors that have returned the most results for the past 10 queries.
• Fewest result hops: messages are forwarded preferentially to neighbors whose returned results for the past 10 queries traveled the fewest average hops.

In each case, ties are broken randomly. For the biased high degree technique, we examined both neighbor indexing and no neighbor indexing. Although prior work describes several ways to route searches in addition to most results and fewest result hops, these two techniques are the strongest: fewest result hops requires the least bandwidth, while most results has the best chance of finding the requested number of matching documents. Again, the square-root topology is best. The largest improvement is seen with the biased high degree technique, where the improvement on going from the high-skew power-law topology to the square-root topology is 52 percent.
Large improvements are also achieved with the fewest result hops and most results techniques. The smallest improvement observed was for the biased high degree technique with neighbor indexing. Even there, the square-root topology offers a 16 percent decrease in messages compared to the low-skew power-law topology. The square-root topology thus provides the best performance, even with the extremely efficient biased high degree / neighbor indexing combination. Moreover, the square-root topology can be used even when neighbor indexing is not feasible.

The combination of square-root topology, square-root replication and biased high degree walking with neighbor indexing provides even better performance. The results indicate that this approach is extremely efficient, requiring only 248 messages per search on average. The square-root topology is better than the power-law topology when square-root replication and neighbor indexing are used. Using the three techniques together results in a searching mechanism that contacts less than 2 percent of the system's peers on average while still finding sufficient results.

The results so far assume state-keeping, where peers keep state about where the search has been, so that peers can avoid forwarding searches to neighbors that the search has already visited. The results demonstrate that the square-root topology is better than power-law topologies, whether or not state keeping is used.

3.5 Other Topologies

We also tested the square-root topology in comparison to several other network structures. We compared against two simple structures.
• Constant-degree topology: every peer has the same number of neighbors. In our simulations, each peer had five neighbors.
• Proportional topology: every peer has a degree proportional to its popularity $g_k$.

Our results show that the square-root topology is best, requiring 10 percent fewer messages than the constant-degree network, and 7 percent fewer messages than the proportional topology.
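The same three degree assignments can also be compared under the idealized model of Section 2, using equation (1). The sketch below assumes a Zipf-like popularity distribution over 1,000 peers, which is a hypothetical stand-in and not the trace-based content model used in the simulations, so its numbers are not expected to match the 10 and 7 percent figures above. Because equation (1) depends only on relative degrees, each assignment is given up to a constant factor.

```python
import math


def expected_hops(popularity, degrees):
    """Equation (1): H = sum_k g_k * D / d_k, where D is the sum of the degrees."""
    total = sum(degrees)
    return sum(g_k * total / d_k for g_k, d_k in zip(popularity, degrees))


# Assumption: Zipf-like popularity over N peers (illustrative only).
N = 1000
weights = [1.0 / (rank + 1) for rank in range(N)]
weight_sum = sum(weights)
g = [w / weight_sum for w in weights]

# Relative degrees for each topology; H is unchanged by scaling all degrees.
topologies = {
    "constant-degree": [1.0] * N,
    "proportional": list(g),
    "square-root": [math.sqrt(g_k) for g_k in g],
}

hops = {name: expected_hops(g, degs) for name, degs in topologies.items()}
for name, h in hops.items():
    print(f"{name:>16}: H = {h:.1f}")
```

In this single-goal-peer model the constant-degree and proportional assignments both give H = N for any popularity distribution, while the square-root assignment gives a smaller value. The simulated gaps are narrower, which is consistent with real searches matching content at multiple peers.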
Although the improvement is smaller than when comparing the square-root topology to power-law topologies, these results again demonstrate that the square-root topology is best. The cost of maintaining the square-root topology is low, as we discuss in Section 4, requiring only easily obtainable local information. It clearly makes sense to use the square-root topology instead of constant-degree or proportional topologies.

A widely used topology in many systems is the super-peer topology. In this topology, a fraction of the peers serve as super-peers, aggregating content information from several leaf peers. Then, searches only need to be sent to super-peers. The super-peers are connected using a normal unstructured topology. We ran simulations using a standard super-peer topology, in which searches are flooded to super-peers. We compared this standard topology to a super-peer topology that used the square-root topology and random walks between super-peers. The results indicate a significant improvement using our techniques: the square-root super-peer network required 54 percent fewer messages than a standard super-peer network.

4 Conclusions


We have presented the square-root topology, and shown that implementing a protocol that causes the network to converge to the square-root topology, rather than a power-law topology, can provide significant performance improvements for peer-to-peer searches. In the square-root topology, the degree of each peer is proportional to the square root of the popularity of the content at the peer. Our analysis shows that the square-root topology is optimal in the number of hops required for simple random walk searches. We also presented simulation results which demonstrate that the square-root topology is better than power-law topologies for other peer-to-peer search techniques. We presented an algorithm for constructing the square-root topology using purely local information. Each peer estimates its ideal degree by tracking how many queries match its content, and then adds or drops connections to achieve its estimated ideal degree. Results from simulations and our prototype show that this locally adaptive algorithm quickly converges to a globally efficient square-root topology. Our results show that the combination of an optimized topology and efficient search mechanisms provides high performance in unstructured peer-to-peer networks.

D. Raman, Associate Professor in Vardhaman College of Engineering (Autonomous), affiliated to AICTE, Jawaharlal Nehru Technological University, Hyderabad.

Yadma Srinivas Reddy, pursuing a Master's degree in Vardhaman College of Engineering, affiliated to AICTE, Jawaharlal Nehru Technological University, Hyderabad.
