Evolutionary Games for Cooperative P2P Video Streaming

Proceedings of 2010 IEEE 17th International Conference on Image Processing, September 26-29, 2010, Hong Kong

Yan Chen, Beibei Wang, W. Sabrina Lin, Yongle Wu and K. J. Ray Liu
Dept. ECE, University of Maryland, College Park.
E-mail: {yan, bebewang, wylin, wuyl, and kjrliu}@umd.edu.

ABSTRACT

The widespread use of P2P video streaming systems has introduced a large number of unnecessary traverse links, leading to substantial network inefficiency. To address this problem and achieve better streaming performance, we propose to enable cooperation among group peers, which are geographically neighboring peers with large intra-group upload and download bandwidths. Considering the peers’ selfish nature, we formulate the cooperative streaming problem as an evolutionary game and derive the evolutionarily stable strategy (ESS) for every peer. Moreover, we propose a simple and distributed learning algorithm for the peers to converge to the ESSs. Compared to the traditional non-cooperative P2P schemes, the proposed cooperative scheme achieves much better performance in terms of social welfare and probability of real-time streaming.

Index Terms: P2P, cooperative streaming, evolutionary, game theory, replicator dynamics, distributed learning.

1. INTRODUCTION

With the rapid development of signal processing and networking technologies, video-over-IP applications have become increasingly popular and have attracted millions of users over the Internet. One solution to video streaming over the Internet is the Peer-to-Peer (P2P) service model, where a peer not only acts as a client to download data from the network, but also acts as a server to upload data for the other peers in the network. The upload bandwidth of the peers dramatically reduces the workload placed on the server, which makes large-scale video streaming possible.

While P2P video streaming systems have achieved promising results, they have several drawbacks.
First, there is a large number of unnecessary traverse links within a provider’s network. As observed in [1], each P2P bit on the Verizon network traverses 1000 miles and takes 5.5 metro-hops on average. Second, there is a huge amount of cross-Internet Service Provider (ISP) traffic. The studies in [2] showed that 50%-90% of the existing local pieces in active peers are downloaded externally. Third, most current P2P systems assume that all peers are willing to contribute their resources. However, this assumption may not hold, since P2P systems are self-organizing networks and the peers are selfish by nature [3].

In the literature, many approaches have been proposed to overcome these drawbacks. Karagiannis et al. [2] proposed locality-aware P2P schemes to reduce the unnecessary traverse links within and across ISPs and thus reduce the download time. Xie et al. [1] proposed a P4P architecture that allows cooperative traffic control between applications and network providers. To stimulate selfish peers to contribute their resources, payment mechanisms [4] and reputation schemes [5] have been proposed. Game-theoretical incentive mechanisms are also investigated in [3].

Most of the existing schemes treat every peer as an independent individual. However, in reality, every peer can have a large number of geographically neighboring peers with large intra-group upload and download bandwidths, e.g., the peers in the same lab, building, or campus. In this paper, we call those geographically neighboring peers with large intra-group upload and download bandwidths group peers. To reduce the unnecessary traverse links and improve network efficiency, instead of considering each peer’s strategy independently, we investigate possible cooperation among the group peers and answer the question of “how a group of selfish peers should cooperate with each other to achieve better streaming performance”.

The rest of this paper is organized as follows.
In Section 2, we describe the system model and the utility function. Then, we show in detail how to select agents in Section 3. In Section 4, we propose a distributed learning algorithm for ESS. Finally, we show the simulation results in Section 5 and draw conclusions in Section 6.

2. THE SYSTEM MODEL AND UTILITY FUNCTION

2.1. System Model

As shown in Figure 1, there is a set of group peers^1 (three in this example) who want to view a real-time video stream simultaneously. We assume that the upload and download bandwidths within the group are large, and that the peers in the same group cooperate with each other and choose $k$ representative peers, called agents, to download video streams from the agents in other groups. Then, the agents distribute the video streams to the other peers within the group. In order to achieve good streaming performance through cooperation, two questions need to be addressed: given a group of peers, how many agents should be chosen, and which peers should be chosen as agents?

Fig. 1. In this example, there are three group peers who want to view a real-time video stream simultaneously. The peers within a group cooperate with each other and choose $k$ agents to download video streams from the agents in other groups, where the agents are marked in purple. Then, the agents distribute the video streams to the other peers within the group. Here, the solid line stands for the inter-group connection while the dashed line stands for the intra-group connection. In this paper, we address the following two questions: given a group of peers, how many agents should be chosen and which peers should be chosen as agents?

^1 How to group the peers is itself an interesting problem. However, in this paper, we assume that the peers have already been grouped and mainly focus on how the group peers cooperate with each other to achieve better streaming performance.

978-1-4244-7994-8/10/$26.00 ©2010 IEEE

2.2. Utility Function

In a P2P network, a peer not only acts as a client to download video data from the other peers but also acts as a server to upload video data for the other peers. Therefore, while a peer can benefit from downloading data from the other peers, he/she also incurs a cost in uploading data for the other peers, where the cost can be the resources spent on uploading data, e.g., bandwidth and buffer size.

Given the group peers $u_1, u_2, \ldots, u_N$, we assume that $k$ agents are chosen to download multimedia data from the peers outside the group. Suppose that the download rates of the $k$ agents are $r_1, r_2, \ldots, r_k$; then the total download rate of the group peers is given by $y_k = \sum_{i=1}^{k} r_i$.

Since the agents randomly and independently select peers outside the group for downloading data, the download rates $r_i$ are random variables. According to [6], the Cumulative Distribution Function (CDF) of a peer’s download bandwidth can be modelled as a linear function, which means that the Probability Density Function (PDF) of a peer’s download bandwidth can be modelled as a uniform distribution, i.e., the $r_i$ are uniformly distributed.

Obviously, if the total download rate $y_k$ is not smaller than the source rate $r$, then the group peers can have real-time streaming, and all the group peers obtain a certain gain $G$. Otherwise, there will be some delay, and in this case we assume the gain is zero. Therefore, given the total download rate $y_k$ and the source rate $r$, if peer $u_i$ chooses to be an agent, then the utility function of $u_i$ is given by

$$U_{A,i}(k) = P(y_k \ge r)\,G - C_i, \quad \forall k \in [1, N], \tag{1}$$

where $C_i$ is the cost of $u_i$ when he/she serves as an agent, and $P(y_k \ge r)$ is the probability of achieving real-time streaming.

Since the upload and download bandwidths within the group are large, the cost of uploading data to the other peers within the group is negligible. In such a case, if peer $u_i$ chooses not to be an agent, then there is no cost for $u_i$ and the utility function becomes

$$U_{N,i}(k) = \begin{cases} P(y_k \ge r)\,G, & \text{if } k \in [1, N-1]; \\ 0, & \text{if } k = 0. \end{cases} \tag{2}$$

3. AGENTS SELECTION

In this section, we discuss how to select agents within a homogeneous group, where the cost of serving as an agent is assumed to be the same for all peers.^2

3.1. Evolutionary Cooperative Streaming Game

Since peers are selfish by nature, they are not as cooperative as a system designer/controller desires. To provide a robust equilibrium strategy for the selfish peers, we adopt the concept of Evolutionarily Stable Strategy (ESS) [7].

Since all peers are selfish, they will cheat if cheating can improve their payoffs, which means that all peers are uncertain of other peers’ actions and utilities. In such a case, to improve their utilities, peers will try different strategies in every play and learn from the strategic interactions using the methodology of understanding-by-building. During the process, the percentage of peers using a certain pure strategy may change. Such a population evolution can be modelled by replicator dynamics.
Specifically, let $x_a$ stand for the probability of a peer using pure strategy $a \in \mathcal{A}$, where $\mathcal{A} = \{A, N\}$ is the set of pure strategies including being an agent ($A$) and not being an agent ($N$). By replicator dynamics, the evolution dynamics of $x_a$ are given by the following differential equation

$$\dot{x}_a = \eta\,[\bar{U}(a, x_{-a}) - \bar{U}(x_a)]\,x_a, \tag{3}$$

where $\bar{U}(a, x_{-a})$ is the average payoff of the peers using pure strategy $a$, $x_{-a}$ is the set of peers who use pure strategies other than $a$, $\bar{U}(x_a)$ is the average payoff of all peers, and $\eta$ is a positive scale factor.

From (3), we can see that if adopting pure strategy $a$ can lead to a higher payoff than the average level, the probability of a peer using $a$ will grow, and the growth rate $\dot{x}_a / x_a$ is proportional to the difference between the average payoff of using strategy $a$ (i.e., $\bar{U}(a, x_{-a})$) and the average payoff of all peers (i.e., $\bar{U}(x_a)$).

^2 Due to the page limitation, we will not show the analysis for the heterogeneous scenario in this paper. However, we would like to point out that the proposed distributed learning algorithm in Section 4 is applicable to both the homogeneous and heterogeneous scenarios.
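The behavior described by (3) can be illustrated with a minimal numerical sketch. The function name `replicator_step`, the constant toy payoffs, and the values of `eta` and the initial probabilities are assumptions made for illustration only; they are not taken from the paper. The sketch applies one forward-Euler step of (3) to every pure strategy at once.

```python
def replicator_step(x, payoffs, eta):
    """One Euler step of (3): x_a += eta * (U_a - U_avg) * x_a for every strategy a."""
    # Average payoff of the whole population under the current strategy mix.
    u_avg = sum(p * u for p, u in zip(x, payoffs))
    return [p + eta * (u - u_avg) * p for p, u in zip(x, payoffs)]


# Toy, assumed payoffs: being an agent ("A") earns 1.0, not being one ("N") earns 0.5.
x = [0.2, 0.8]  # initial probabilities of strategies A and N
for _ in range(200):
    x = replicator_step(x, payoffs=[1.0, 0.5], eta=0.1)
```

With constant payoffs the better-paid strategy simply takes over, and the probabilities keep summing to one because the payoff differences are weighted by the current probabilities. When the payoffs depend on the population state, as they do in the streaming game, the same update can instead settle at an interior rest point.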

3.2. Analysis of the Cooperative Streaming Game

According to (1) and (2), the average payoff of a peer if he/she chooses to be an agent can be computed by

$$\bar{U}_A(x) = \sum_{i=0}^{N-1} \binom{N-1}{i} x^i (1-x)^{N-1-i} \left[ P(y_{i+1} \ge r)\,G - C \right], \tag{4}$$

where $x$ is the probability of a peer being an agent, and $\binom{N-1}{i} x^i (1-x)^{N-1-i}$ is the probability that there are $i$ agents out of the $N-1$ other peers.

Similarly, the average payoff of a peer if he/she chooses not to be an agent is given by

$$\bar{U}_N(x) = \sum_{i=1}^{N-1} \binom{N-1}{i} x^i (1-x)^{N-1-i} P(y_i \ge r)\,G. \tag{5}$$

According to (4) and (5), the average payoff of a peer is

$$\bar{U}(x) = x\,\bar{U}_A(x) + (1-x)\,\bar{U}_N(x). \tag{6}$$

Substituting (6) back into (3), we have

$$\dot{x} = \eta\,x(1-x)\,[\bar{U}_A(x) - \bar{U}_N(x)]. \tag{7}$$

At equilibrium $x^\star$, no player will deviate from the optimal strategy, which means $\dot{x}^\star = 0$, and we obtain $x^\star = 0$, $x^\star = 1$, or the solutions to $\bar{U}_A(x) = \bar{U}_N(x)$. However, since $\dot{x}^\star = 0$ is only a necessary condition for $x^\star$ to be an ESS, we examine the sufficient condition for each ESS candidate and draw the following conclusions.

• $x^\star = 0$ is an ESS only when $P(y_1 \ge r)\,G - C \le 0$.
• $x^\star = 1$ is an ESS only when $P(y_N \ge r)\,G - P(y_{N-1} \ge r)\,G \ge C$.
• Let $x^\star$ be the solution to $\bar{U}_A(x) = \bar{U}_N(x)$, with $x^\star \in (0, 1)$. Then, $x^\star$ is an ESS.

4. A DISTRIBUTED LEARNING ALGORITHM FOR ESS

From the previous section, we can see that the ESS can be found by solving the replicator dynamics equations. However, solving the replicator dynamics equations requires the exchange of private information and strategies adopted by other peers. In this section, we present a distributed learning algorithm that can gradually converge to the ESS without information exchange.

We first discretize the replicator dynamics equation as

$$x_i(t+1) = x_i(t) + \eta \left[ \bar{U}_i(A, x_{-i}(t)) - \bar{U}_i(x_i(t)) \right] x_i(t), \tag{8}$$

where $t$ is the slot index and $x_i(t)$ is the probability of $u_i$ being an agent during slot $t$.
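To make the update (8) concrete, the following sketch iterates it for a homogeneous group, where every peer shares the same probability $x$ and the two averages in (8) reduce to the closed forms (4) and (6). It is an illustration under stated assumptions, not the distributed algorithm itself: it uses exact expected payoffs rather than learned estimates, it assumes each agent's download rate is uniform on a hypothetical interval `[lo, hi]` (the source only states that the rates are uniformly distributed), and all function names and parameter values are made up for the example.

```python
from math import comb, factorial, floor


def irwin_hall_cdf(s, n):
    """CDF at s of the sum of n independent Uniform(0, 1) variables."""
    if s <= 0:
        return 0.0
    if s >= n:
        return 1.0
    total = sum((-1) ** j * comb(n, j) * (s - j) ** n for j in range(floor(s) + 1))
    return total / factorial(n)


def p_realtime(k, r, lo, hi):
    """P(y_k >= r) when each of the k agent rates is Uniform(lo, hi) (assumed bounds)."""
    if k == 0:
        return 0.0  # no agent, no stream: matches the k = 0 case of (2)
    return 1.0 - irwin_hall_cdf((r - k * lo) / (hi - lo), k)


def payoff_agent(x, n, r, gain, cost, lo, hi):
    """Average payoff (4) of a peer that chooses to be an agent."""
    return sum(
        comb(n - 1, i) * x**i * (1 - x) ** (n - 1 - i)
        * (p_realtime(i + 1, r, lo, hi) * gain - cost)
        for i in range(n)
    )


def payoff_free_rider(x, n, r, gain, lo, hi):
    """Average payoff (5) of a peer that chooses not to be an agent."""
    return sum(
        comb(n - 1, i) * x**i * (1 - x) ** (n - 1 - i)
        * p_realtime(i, r, lo, hi) * gain
        for i in range(1, n)
    )


def iterate_update(x0, n, r, gain, cost, lo, hi, eta=0.5, slots=500):
    """Apply update (8) with exact expected payoffs for a homogeneous group."""
    x = x0
    for _ in range(slots):
        u_a = payoff_agent(x, n, r, gain, cost, lo, hi)
        u_n = payoff_free_rider(x, n, r, gain, lo, hi)
        u_avg = x * u_a + (1 - x) * u_n  # average payoff (6)
        x = x + eta * (u_a - u_avg) * x
    return x


# Hypothetical parameters, not taken from the paper's simulations.
PARAMS = dict(n=5, r=500.0, gain=1.0, cost=0.1, lo=100.0, hi=800.0)
x_star = iterate_update(0.5, **PARAMS)
```

For these assumed parameters the iteration settles at an interior point where the payoffs (4) and (5) coincide, which is the third ESS candidate of Section 3.2. The alternating sum in `irwin_hall_cdf` is only numerically reliable for small groups; a larger `n` would call for a different way of evaluating $P(y_k \ge r)$.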
Here, we assume that each slot can be further divided into $M$ subslots and each peer can choose to be an agent or not at the beginning of each subslot.

Fig. 2. The social welfare comparison. (Axes: social welfare versus source rate in kb/s; curves: Non-Coop and ESS-D.)

From (8), we can see that in order to update $x_i(t+1)$, we need to first compute $\bar{U}_i(A, x_{-i}(t))$ and $\bar{U}_i(x_i(t))$. Let us define an indicator function $\mathbb{1}_i(t, q)$ as

$$\mathbb{1}_i(t, q) = \begin{cases} 1, & \text{if } u_i \text{ is an agent at subslot } q \text{ in slot } t, \\ 0, & \text{else}, \end{cases} \tag{9}$$

where $q$ is the subslot index.

The immediate utility of $u_i$ at subslot $q$ in slot $t$ can be computed by

$$U_i(t, q) = \begin{cases} G - C_i, & \text{if } u_i \text{ is an agent and } r_t \ge r, \\ -C_i, & \text{if } u_i \text{ is an agent and } r_t \ldots \end{cases}$$

Fig. 3. Behavior dynamic of a homogeneous group of peers. (Axes: probability of being an agent versus time interval; curves: N=10, N=15, N=20.)

Fig. 4. The probability of real-time streaming comparison. (Axes: probability of real-time streaming versus source rate in kb/s; curves: Non-Coop and ESS-D with N=2, N=3, N=4, N=5.)

... we denote the proposed ESS-based approach as ESS-D. We compare the proposed method with the traditional non-cooperative P2P method, which is denoted as Non-Coop.

In the first simulation, we show the social welfare comparison, where we assume that there are 20 homogeneous peers and the cost $C$ is 0.1. As shown in Fig. 2, ESS-D performs much better than the Non-Coop method, since the social welfare of Non-Coop decreases linearly with the source rate. With cooperation and adaptive selection of the proper number of agents, the proposed method can preserve a high social welfare even at a large source rate.

In the second simulation, we evaluate the convergence property of ESS-D. In Fig. 3, we show the replicator dynamics of the cooperative streaming game with homogeneous peers and $r = 500$. We can see that, starting from a high initial value, all peers gradually reduce their probabilities of being an agent, since being a free-rider more often can bring a higher payoff. However, since too low a probability of being an agent increases the chance of having no peer be an agent, the probability of being an agent finally converges to a certain value, which is determined by the number of peers.

In the third simulation, we compare the probability of real-time streaming between Non-Coop and ESS-D. The simulation results are shown in Fig. 4. We can see that with cooperation, the probability of real-time streaming can be significantly improved, especially in the high source rate region.
We also find that in the high source rate region, the probability of real-time streaming increases as $N$ increases. Note that to give more insight into the proposed algorithms, we assume that there is no buffering effect in this paper. However, the analysis and conclusions can be extended to the case when the buffering effect is considered.

6. CONCLUSION

In this paper, we propose a cooperative streaming scheme to address the network inefficiency problem encountered by the traditional non-cooperative P2P schemes. We answer the question of “how a group of selfish peers with large intra-group upload and download bandwidths cooperate with each other to achieve better streaming performance” by formulating the problem as an evolutionary game and deriving the ESS for every peer. We further propose a distributed learning algorithm for each peer to converge to the ESS by learning from his/her own past payoff history. The simulation results show that, compared with the traditional non-cooperative P2P schemes, the proposed algorithm achieves much better social welfare and a higher probability of real-time streaming.

7. REFERENCES

[1] Haiyong Xie, Y. Richard Yang, Arvind Krishnamurthy, Yanbin Liu, and Avi Silberschatz, “P4p: Provider portal for applications,” in ACM SIGCOMM, 2008, pp. 351–362.
[2] T. Karagiannis, P. Rodriguez, and K. Papagiannaki, “Should internet service providers fear peer-assisted content distribution?,” in Proceedings of the Internet Measurement Conference, 2005.
[3] Min Xiao and Debao Xiao, “Understanding peer behavior and designing incentive mechanism in peer-to-peer networks: An analytical model based on game theory,” in Proc. of International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP), 2007.
[4] V. Vishnumurthy, S. Chandrakumar, and E. G. Sirer, “Karma: A secure economic framework for peer-to-peer resource sharing,” in Proc. of the 2003 Workshop on Economics of Peer-to-Peer Systems, 2003.
[5] S. Marti and H. Garcia-Molina, “Limited reputation sharing in p2p systems,” in Proc. of the 5th ACM conference on Electronic commerce, 2004.
[6] Cheng Huang, Jin Li, and Keith W. Ross, “Can internet video-on-demand be profitable?,” in ACM SIGCOMM, 2007, pp. 133–144.
[7] B. Wang, K. J. R. Liu, and T. C. Clancy, “Evolutionary game framework for behavior dynamics in cooperative spectrum sensing,” in Proc. IEEE Globecom, 2008.