At present, related applications for P2P networks are widespread day by day but still lack the effective incentive mechanism to enhance the whole usability of the system. The research indicates that 70% users of the P2P system do not share any resource, only simply riding above other users that share resource. Because of few user sharing information or provides the service, approximately 50% document inquiries of the network response from 1% of resource sharing nodes.
The non-cooperation question universally exists in P2P networks and node non-cooperation has seriously influenced the overall effectiveness of P2P networks. While hitchhike in P2P network has become more and serious, it has a negative influence on toughness, the usability and the life cycle of the network. The design and application of reasonable suppressing mechanism for hitchhike is currently an important direction for P2P network researches. In addition, there are also malicious node behaviors such as White Washing, Sybil Attack and Collusion. In order to guarantee that the P2P network is highly effective, safe and reliable, it is necessary to use the suitable measure to suppress the serious hitchhike.
Presently, many domestic and foreign scholars propose different theories for incentive mechanism, in view of massive existence of hitchhike and malicious node phenomenon. In summary, it can be approximately divided into the following kinds of incentive mechanisms: hypothesized payment incentive mechanism, direct reciprocal revenue incentive mechanism, incentive mechanism based on prestige and selfless node incentive mechanism.
Through the release of hypothesized money, hypothesized pays (Yang
and Molina, 2003; Courcoubetis and Weber, 2006;
Figueiredo et al., 2005) use the accounting architecture
to track each kind of transaction and use the hypothesized payment means to
charge the node that demands service. When node gains document resources, reduces
the hypothesized money. When node contributes resources, increases the hypothesized
goods, depending upon the hypothesized goods to drive the node that makes more
contribution. The following questions exist in payment mechanism: limited extendibility,
the need to maintain a security, highly effective authentication organization
that is difficult to realize in the P2P network and the not high exchange pattern
In incentive mechanism based on the direct reciprocal revenue (Moscibroda
et al., 2006; Feldman et al., 2004),
the node maintains the behavior record of other nodes and carries on the decision-making
using these information. The direct reciprocal revenue mechanism is suitable
for the long-term interactive scene between the nodes, being advantageous for
the nodes to have more opportunities to carry on the interactive reciprocal
revenue. The direct reciprocal revenue is similarly suitable for P2P multi-broadcast
flow applications and multi-broadcast tree periodically reconstructs to help
the direct repayment or retaliates, usually using the strategy of one for one.
The massive researches promote the cooperation through the confidence building
or the honorary system and help the node high in prestige or the trust value
obtain a better service. Essentially, this kind of system constructs redundant
games between the autonomous node and trust system. The majority of P2P network's
incentive mechanisms are based on the discrimination service mechanism of prestige
system (Ma et al., 2006, 2004a,
b; Yan et al., 2007;
Kaune et al., 2008). Its essential thought is
to provide nodes that participate in system the DiffServ such as a quicker downloading
speed or higher service priority and so on and urge the node cooperate with
each other and reasonably use network resource. The prestige incentive mechanism
has the following insufficiencies: the lack of the drive for the appraisal,
false appraises and status transformation.
Some scholars has obtained some achievement (Typpi, 2009;
Buragohain et al., 2003; Sasabe
et al., 2009; Su and Dhaliwal, 2010; Wang
et al., 2010) in the research direction of P2P incentive mechanism
based on the game theory. The essence of P2P incentive mechanism based on the
game theory is a unified consideration for the node rationality and the game
of nodes and to restrict the tendency of non-cooperation between nodes through
the design of behavior strategy. The node judges the cooperation degree of other
nodes based on the prestige system and according to the above, decides the game
behavior. After game, the rational node compares the effectiveness caused by
different strategies and further chooses the strategy that can bring a bigger
effectiveness. These systems grantee the cooperation node be able to obtain
a bigger effectiveness that non-cooperation one through the mechanism design,
thus cause the overall system maintain the cooperation.
This study is on the contrast of the maturely developing trust models and discovers that many models are only based on computation of the node prestige value which do not carry on the incentive and penalty measure to the node of good and evil behavior. But the existing models generally adopt different methods to solve the problem which appeared but can not take a more prompt measure in advance according to the future state of the network. Based on the constructed game framework between nodes, this study constructs the game mechanism that has the forecast function, the design mechanism and the node game flow, promotes from the detail to overall, the node and the node, the node and the network, form more positive relations.
PREDICTION OF TRANSITION FOR NODE STATES
Combined with the interactive situation of the nodes in the network, this study induces the node state type into the following several kinds.
Trust state (Nc): When the node at this state interacts with
any node, the cooperation strategy is positively adopted, less betrayal or fault.
But when the majority of nodes in the network have the lucky psychology, this
kind of state does not exclude the possibility of choosing betrayal actions.
Free state (Nr): The node at this state that randomly selects the cooperation or non-cooperation strategy is easily driven by various factors of the network, thus shifts to other states. Here, this study supposes the probability for freely choosing cooperation strategy is p (0<p<1) and the probability to choose non-cooperation strategy is 1-p.
Betrayal state (Nd): When the node at this state interacts with any node, it always hopes to realize its own income maximized through the choice of strategy and usually adopts non-cooperation strategy. But if the network exists mechanism to manage the restraints, few nodes will shift to other states.
Based on the forecast technique of Markov chains, the mechanism will carry on the statistics to node states every t time. Suppose in the t time, three kinds of node states in the network are E, E ∈ (En, n = 1, 2, 3), initial distribution for node states is π (0) = (π1 (0), π2 (0), π3 (0)).
According to the shift situation of node states, a transition matrix of probability for node states is constructed. Suppose the sum of nodes at the state Ei is mi and the sum of nodes shift from Ei to Ej is mij. Then the transition probability can be obtained as Pij (Pij = mij/mi) for nodes shift from Ei to Ej. From this, the transition matrix P for these three node states can be obtained as follow:
According to the transition matrix of probability for node states, the forecast mechanism can obtain the shift situation for state nodes in network at a random time in the future. In other words, suppose at the time n, the transition matrix P for these node states is P[n] = P (0). Pn and by π = π. P, the stable state distributed vector for three kinds of state nodes (trust, free, betrayal) in the network can be obtained as π = [π1, π2, π3]. According to the stable state distributed vector for state nodes, this study combines with the status of network running and obtains the state equation as follows:
P2P GAME MODEL BASED ON INCENTIVE AND PENALTY POLICIES
Based on the game rule, the incentive drive is carried on in this stage to
all the network nodes, fundamentally promoting the cooperative nodes positively
cooperate with other nodes, also making the betrayal nodes as well as the random
nodes have the possibility to shift to the trust state.
|| Game model for nodes adding incentive factors
P2P game model based on incentive policy
The establishment of incentive game model: According to the primitive
game model for nodes, the P2P node behavior for the trust problem is induced
and incentive factors are proposed and introduced into the game model, as is
shown in Table 1. The incentive factors aim at the nodes in
the network that choose the trust behavior. When the node simultaneously chooses
the trust behavior with its interactive object, they will obtain the extra revenue
A/2. And when the node adopts the trust behavior but actually receives the malicious
betrayal, this node will obtain bigger revenue A. By performing incentive policies
to the cooperation nodes, the loss of these nodes being betrayed can be reduced
and their cooperation behavior is also encouraged. Moreover, the incentive factors
change along with the change of the proportion that the cooperation nodes hold
in the network. When the tendency of network running is well, the cooperation
nodes hold a bigger proportion and then the cooperation nodes obtain more revenue
s. Similarly, based on to the hypothesis of the incentive factors, it encourages
more nodes join the set of cooperation nodes. Microscopically, it drives the
nodes to carry on the cooperation and interaction while macroscopically, it
realizes the utilization of network resources, the cooperation nodes hold an
expanding proportion and finally the running of network is stable.
Discussions on the node game model of incentive factors: Because the random nodes occupy a relatively bigger type in the network and the incentive policy model is not the pure policy game model, where the choice of policies for the node is more stochastic, therefore, Nash balancing solution must be performed to the model.
Suppose the probability for node 1 to choose trust behavior is x and the probability of node 2 to choose cooperation with node 1 is y, their revenue function is summarized as follows:
Carry on the solution to the optimization problem of node 1 and node 2, then:
Carry on the calculus to this optimization to get its extreme value, its first-order differential is:
In order to m*1, m*2 become a Nash equilibrium, the parameters of node gaming must accord with:
Suppose the node chooses the cooperation policy, the expected revenue function of x = 1 is:
If the node betrays its interactive object, the expected revenue function of x = 0 is:
Suppose the best choice for the node is m1, whether the node chooses
the cooperation or the betrayal, it will not affect the final gaming result,
v1 (cooperation, m2) = v1
Based on the equations above, when the probability for node 2 to choose the cooperation behavior is y< (2A-2V)/A, node 1 chooses the cooperation behavior. By the contrary, if y> (2A-2V)/A, node 1 chooses to betray node 2. If y = (2A-2V)/A, node 1 chooses the cooperation or non-cooperation at random.
Discussions on incentive factors: In order to enhance the enthusiasm of the network node to choose the cooperation behavior and promote the majority of network nodes have the tacit understanding to carry on the cooperation as the goal, this study proposes the incentive factor, in order to encourage more nodes carry on the benign cooperation and let the nodes rationally realize that only when the cooperation nodes in the network become more and more, they will obtain the revenue more and more greatly.
This study proposes the formula of incentive factors as follows:
Suppose at the time t, there are N nodes in the P2P network environment to carry on the resource correspondence and n nodes to choose cooperation. According to:
it obtains V≤A≤2V. Only when the proportion of network nodes to hold in the network nodes becomes bigger and bigger, n/N is bigger and A is bigger.
P2P game model based on penalty policy: The execution of penalty policy is based on the flow of game rule and distributed situation of real-time node states. When it forecasts the possibility of future state of the network being worse, it will take the prompt penalty measures according to the game rule.
Revenue matrix for nodes of different types: Table 2 summarizes the revenue s for nodes of different states after N gaming.
Suppose at the time t, the probability of interacting with the node with cooperation strategy is P1, then:
Suppose at the time t, the probability of interacting with the node with non-cooperation strategy is P2, then:
|| Revenue matrix of nodes
|| Trust game model with addition of penalty factors
Combined with Table 2, the total revenue T (N∈), T (Nr), T (Nd) for nodes Nc, Nr, Nd are shown as follows:
The establishment of penalty game model: After microscopically analyzing the influence of the penalty policy on the revenues of three kinds of network nodes, this study constructs penalty game model for network nodes macroscopically, as indicated in Table 3.
When the node adopts the betrayal behavior to its interactive object of choosing trust strategy, it will receive the penalty. Suppose the cost is W which is the penalty factor premised in this study and the precondition of which is W>V and at the same time, the betrayed node will receive the revenue W for compensation. After adding the penalty factor, Nash point of equilibrium for the revenue matrix is changed and at this time, the point of equilibrium for this game is (cooperation, cooperation).
Discussions on penalty factors: The penalty factor will be discussed as below and the goal for this study to propose that the penalty factor is to enhance the enthusiasm of the node to choose cooperation strategy, attack lucky psychology of betrayal node, guide the node that does not cooperate with the node after interaction many times favor in the choice of cooperation and simultaneously guarantees the stability for the entire P2P network.
This study gives the formula of penalty factor W as follows:
Suppose the previous time for the node to choose betrayal behavior is ta and D (k) is the number of times for the node in the network to choose non-cooperation with the interactive object adopting the trust strategy. When D (k) = 0, then W = V. From the time t0 when the node first enters the P2P network, if the node maintains the choice of cooperation with other nodes, the penalty strategy will not be adopted. At the time t, the node has the lucky psychology first time to betray the interactive object with cooperation many times and at the next time node will receive the penalty W. It can be seen from above that, the more network trust nodes are, the less node chooses betrayal behavior, D (k) will be smaller and the farther away from the previous betrayal time is, 1/(t-ta) will be smaller, then the penalty factor will be smaller.
Forecast game mechanism: Based on the introduction of forecast function to the mechanism and he detailed design of the policy set, the forecast mechanism for game rule will be proposed and established in the P2P network which is advantageous for the control and the restraint directly to the node behavior. The flow of forecast game mechanism is shown as Fig. 1.
After the game mechanism carries on the penalty restraint to the network, it needs to compare the forecasting result at the preceding time and the running status of the network in the future by using the penalty strategy. If it has a good tendency, it proves that the penalty policy is effective for this network, otherwise, it needs to further adjust the parameters for the penalty policy or uses a more precise forecast style which performs to control in advance.
This study uses the game simulation tool Gambit to carry on the simulation to the node gaming behavior each time. First the simulation is performed to the P2P trust-difficult model before improvement, then the node behavior tendency is induced, the behavior psychology of the node is grasped and the simulation contrast is performed to the model after improvement. This study contrasts from the most direct Gambit graph simulation and verifies whether the confirmation model have the effective restraint to the node behavior and to have the improvement to the penalty of the betrayal behaviors.
Simulation on the incentive stage: In this stage, this study utilizes
gambit to carry on the simulation and the analysis to the network node game
model and according to whether the cooperation node is rewarded, makes the comparison,
gives related parameter the real value, analyze intuitively the influence of
the incentive factors each time on the node gaming result as well as policy
choice of the nodes and even the tendency of node cooperation.
|| The flow of forecast game mechanism
Take the node gaming-difficult model as the reference, this study first establishes the cooperation revenue U = 9, the node correspondence consumption A = V = 4. Because at the incentive stage, node gaming and the difficult model are different, the incentive model is no longer the pure policy game model but is the game model of mixed policies. The game curve after the simulation no longer has only one kind of Nash equilibrium, the tendency of node gaming is no longer single, the mixed policy will receive the adjustment of each parameter, the Nash equilibrium also changes along with its change. Based on the change of the model type, the incentive factor is established as the same with the network service consumption, namely A = V = 4, with node gaming simulation curve as Fig. 2 and it gradually increases the incentive strength and sets the factor in the reasonable scope proposed by this study, making the incentive factor A = 6, with node gaming simulation curve as Fig. 3. Continuing to increase the incentive strength, making the incentive factor A = 2V = 8, Fig. 4 is obtained. Until the value of the incentive factor increases to A = U = 9, simultaneous choice of the cooperation by the node obtains two time cooperation revenue, as expressed in Fig. 5.
Contrasting with the node gaming curves above at four stages, when the reward
and the network consumption are the same, the choice of the node for the tendency
of trust comparably pure policy gaming the model has certain improvement.
|| Gaming curve of P2P nodes when A = V = 4
|| Gaming curve of P2P nodes when A = 6
|| Gaming curve of P2P nodes when A = 8
Under the premise of mixed policy choice, the node chooses different policies,
the curve presents the different branch but the overall cooperation choice has
the gathering tendency. When the value of incentive factor increases to between
the consumption and the revenue, under mixed policy gaming, the tendency for
the node to synchronically choose to trust each other has more effective gathering
than Fig. 2.
|| Gaming curve of P2P nodes when A = U = 9
When the incentive factor is two times as consumption, the tendency of the
node cooperation do not follow the increase of the incentive strength to obtain
a more obvious promotion. Compared with Fig. 4, the possibility
for the node to synchronically choose to trust each other instead reduces, the
straight appropriate cooperation node obtains two time of the network revenue,
the simulation result returns to the pure policy gaming sole curve but the difference
is according to the choice of curve expansion direction the behavior it is no
longer positively to the cooperation tendency development. Through the adjustment
of the incentive factor, in each region the incentive model is analyzed, obtaining
the following result. When the value of the incentive factor is between the
correspondence consumption and the network revenue, regarding the node cooperation
drive is most effective and if the value of the incentive factor closes up willfully
to one side, the tendency of node cooperation has the possibility to be weakened.
Simulation at the penalty stage: At the penalty stage, according to the gaming behaviors of two nodes this study first carries on the tracing analysis in view of the network, combines with the node interactive situation and the network actual movement environment, carries on the simulation using Gambit to the node gaming behavior. And according to whether the management mechanism executes or not, this study compares the tendencies of the node behavior choice, discusses the parameters above to give the real value, analyze the penalty factor intuitively each time the influence of the node gaming result as well as the node policy choice.
Suppose the cooperation behavior revenue U = 9, the network service loss V
= 4, adjust the penalty parameter, do simulation multiple times, contrast and
adopt the penalty policies under the different strength, the tendency of node
gaming behavior choice and confirms the suitable establishment of the penalty
factor. First not changing other factors in the situation, based on the proposal
of the penalty factor scope V<W<U in this study, combined with the node
||Trust gaming-difficult model curve that simulation curve does
not adopt any policies
||Gaming model after the execution of the penalty policy
|| Trust gaming-difficult model curve that when W=11
Suppose the penalty factor W = 6, Fig. 6 shows the trust
gaming-difficult model curve that does not adopt any policies and Fig.
7 shows the gaming model simulation curve after the execution of the penalty
|| P2P node gaming does not adopt any policies
||Trust gaming-difficult model curve that when W = 9
When the penalty factor is bigger than the revenue, that is W = 11>U, the
result is shown in Fig. 8 and 9. When the
penalty factor is the same with the revenue, that is W = U = 9, the result is
shown in Fig. 10 and 11. When the penalty
factor is the same with the communication consumption, that is W = V = 4, the
result is shown in Fig. 12 and 13.When
the penalty factor is less than the communication consumption, that is W = 2<V,
the result is shown in Fig. 14 and 15.
This study confirms the rationality of the penalty factor through the adjustment of penalty strength, also pursue to find most appropriate penalty way to drive node cooperation gaming but is not the vicious hitchhike. Through five groups of contrast simulation curves above, this study discovers that the penalty factors in different zones of value have the different influence to the node gaming tendency. First the penalty factor is set in the reasonable scope which proposed for this study, contrast node gaming-difficult curves, the node gambling cooperation tendency in Fig. 7 is bigger than Fig. 6 obviously.
|| P2P node gaming does not adopt any policies
|| P2P gaming that does not adopt any policies
|| P2P node gaming when W = 4
Contrast Fig. 9 and 11, the penalty factor
is hypothesis to be equal to the network revenue or bigger, the node gaming
has not been improved, instead has the worsened tendency, from this, the oversized
penalty strength has not played the effective role regarding the node drive,
otherwise cause node rebel.
Contrast Fig. 13 with Fig. 15, when the
penalty factor and the network consumption are impartial presented in the gaming
exceptional case but penalty strength reduce slightly once more, regarding the
node gambling drive is effective.
|| P2P gaming that does not adopt any policies
|| P2P node gaming when W = 4
But compared with Fig. 7, the most effective gaming drive
is still when the value region of the penalty factor is between the correspondence
consumption and the network income, therefore may confirm the rationality of
the penalty factor hypothesis. It may also prove that, only when the penalty
factor is smaller than the network revenue, regarding node gaming, the penalty
policy is effective and when V<W<U, the penalty policy is most reasonable.
Influence of penalty policy on network state: This stage first gains each period real-time node data in simulated environment PeerSim, differentiates three kinds of node percentage in statistics network, based on Markov forecast, carry on the computation to the node distribution probability before and after execution management rule, simulates and forecasts n shift situations in the future to the node. The simulation parameter is as Table 4.
|| Parameters of predictive simulation
Based on the real-time data, node state probability distribution at time t1 is π1 = [0.80 0.11 0], three kinds of node state transfer matrix is:
Through the analysis of transfer matrix P1, it may see that before the execution of gaming rule, the probability of the shift from free state to the betrayal state that is easily influenced by other factors is bigger than the probability positive state shift. Therefore, carry on the predict to all node states that do not obtain the restraint of the penalty mechanism, in the future 20 time shift situation, as is shown in Fig. 16. As is shown in Fig. 16, it may see the shift tendency of the future three kinds of nodes obviously through the diagram of curves and because the network has not obtained the surveillance restraint of the penalty mechanism, each kind of node has the lucky psychological influence, more likely chooses the betrayal behavior but this betrayal behavior has not been effectively under penalty control which will enable the network environment to obtain a worsening step in the future.
Because of obtaining the future node shift tendency in advance, this study at the time t2, punishes the mechanism execution according to the gaming rule in the network, at the right moment carries on the penalty to and the restraint on node malignant behavior and in the next time section, carries on the statistics to the node shift condition, obtaining node transfer matrix P2 joining the management mechanism:
Contrasting Fig. 17, it may obviously observe, the penalty
mechanism has positively effects to the node shift situation, so long as the
mechanism adopts the prompt behavior restraint, the node shift tendency will
have the transformation slightly and will further cause the entire network node
future behavior tendency to have a very big change.
|| Distribution tendency for future at t1
||Distribution tendency for future node state at t2
after the execution of forecast mechanism
And this study confirms the existence of the management mechanism regarding
the node betrayal behavior and even has the positive effects on the stability
of the entire network. It regards the node trust and the cooperation choice
had a very big auxiliary function. This study also gives that the choice change
for the behavior of single node has the influence on the network, therefore
the existence of gaming rule is essential, moreover it needs the forecast to
track and predict synchronous network node state, in order to prevent the network
to have the possibility of paralysis.
According to the network running state and the node interactive manner as well
as the behavior choice, the study first classifies the network nodes, carries
on the introduction and the analysis of forecast function for the mechanism.
This study then adds incentive and penalty mechanism into the basis game framework
for the P2P nodes, carries on the construction and the analysis of two different
kinds of game model. Present study finally constructs the forecast mechanism,
proposes the gaming rules for the reasonable mechanism and adopts the corresponding
policy with combination of real-time network state.
This study was supported by Natural Science Foundation of Hubei Province of China (2010CDA011), Foundation of Wuhan Twilight Plan Project (201050231084) and Educational Commission of Hubei Province of China (D20111409). We would like to thank the reviewers and editors for their detailed and valuable comments.