Content Delivery Networks (CDNs) are currently used mainly for web and streaming media content distribution, and related research focuses on replica placement policies, content routing algorithms, load balancing and request redirection. Gadde et al. (2001) study in depth the integration of CDNs with the web, and their results have been widely used. Letting users access copies of web content on edge servers improves efficiency (Liu and Wei, 2013). Fei et al. (2003) improve replica placement strategies and propose a dynamic distribution mechanism and a load balancing algorithm (Buyya et al., 2009; Ramesh and Perros, 2000; Suri et al., 2011). Shaikh et al. (2001) describe a request redirection mechanism based on intelligent DNS that directs user requests to the nearest edge servers (Franks et al., 2009; Woodside et al., 2012; Bacigalupo et al., 2012). Fei and Yang (2005) propose a fragmentation-based technology for distributing large files; its characteristic is that the copy stored on an edge server is not a complete file but fragments of the file divided according to a strategy. Cahill and Sreenan (2007) describe a video CDN system that distributes high-quality TV content using CDN technology. Day et al. (2003) study Content Distribution Internetworking (CDI), which offers interoperability among multiple independent CDN networks (Soror et al., 2013).
On the other hand, P2P technology has been widely used in file-sharing networks in recent years and has been the subject of a great deal of research. By structure, P2P networks can be divided into centralized, decentralized and hybrid types. In the centralized structure, an index server provides resource location information for all peer nodes. A typical application is BitTorrent, as analyzed by Johan Pouwelse; its structure is simple and location queries are fast, but its scalability is limited by the capacity of the index server (Aboulnaga et al., 2012). In the completely decentralized structure, there is no server and resource location is realized through collaboration among the peers. A typical method is Chord, proposed by Stoica et al. (2001). In the hybrid structure, some peer nodes are selected as super-nodes, which provide location information for the other local peers.
P2P is an end-to-end technology in which nodes access network resources directly; the participating nodes are both providers and recipients. No relay server is needed to pass resources along, because the nodes connect to each other directly. P2P systems have rapidly become one of the most important application systems on the internet, and internet traffic is now driven largely by P2P applications. With the continued growth of the internet and the gradual increase of P2P traffic, issues of scalability, flexibility and quality of service have become increasingly evident; P2P systems must therefore provide services such as routing, finding and accessing resources (Li et al., 2011).
By deploying multiple distributed replica servers at the network edge, a CDN optimizes the distribution of content to internet end users. CDNs emerged in the late 1990s and have since undergone about ten years of development. The value orientation of traditional CDN services has changed over time. Initially, CDNs focused on reducing response time to improve the experience perceived by the end user. Now, content providers can share the CDN infrastructure for their content distribution services; by improving the effectiveness of the system, a CDN can meet the content providers' service capacity needs during network traffic peaks, thereby reducing the content providers' investment in web infrastructure. In addition, some recent trends indicate that the CDN pattern is beginning to shift toward a utility computing model, but the construction cost of a CDN is high and its scalability is poor (Armbrust et al., 2009).
Research on combining CDN and P2P is still at a preliminary stage. The literature proposes a P2P-based Multimedia CDN (PM-CDN), which introduces P2P technology into the CDN backbone network so that edge servers can exchange copies, improving the efficiency of copy distribution. Yan et al. (2010) combine CDN and P2P technologies for streaming media distribution; however, their work focuses mainly on peer incentives in the P2P network. Khosroshahy et al. (2013) briefly introduce the routing algorithm of PCDN but give no description of the architecture and effectiveness of combining CDN and P2P (Bhaskar and Laurie Jr., 2009; Hordijk and Loeve, 2000). Qiu and Sang (2008) introduce P2P technology into the CDI network to achieve data sharing across multiple CDN networks (Omari et al., 2007). CDN and P2P have significant complementary strengths. This study combines CDN and P2P technologies, proposes a Hybrid Content Distribution Network (HCDN) and, by establishing a model, compares and analyzes the performance of HCDN, CDN and P2P.
The main contributions of this study are as follows:
- For the network distribution of large-scale digital content, the complementary strengths of CDN and P2P networks are used. This study proposes a Hybrid Content Distribution Network (HCDN) model, which deploys a CDN system on the backbone network and builds regional P2P networks on the access network. Users can access data through the CDN and P2P networks simultaneously. The HCDN content network model and its internal content distribution processes are elaborated, including content routing, replica placement and data download; a performance model of the hybrid network based on the fluid model is given
- To further validate the correctness and validity of the presented HCDN model, simulation experiments are conducted to compare HCDN with CDN, P2P and HP2P networks, analyzing the changes in the number of nodes, download time, service capacity and transmission overhead. The simulation results show that the proposed HCDN model can improve download speed, reduce server load and reduce transmission traffic on the backbone network. It overcomes the high deployment cost of CDN networks and also avoids the low performance of P2P networks when nodes are scarce
Network integration of traditional P2P and CDN technology
CDN as the center and P2P autonomous regions as the edge: The management mechanism and service capacity of CDN are introduced into P2P networks, forming a structure with CDN as the center and P2P autonomous regions as the edge, that is, the PCDN technology shown in Fig. 1. This structure is currently used in R and D projects related to IPTV. Owing to the introduction of P2P technology, it greatly reduces bandwidth overhead compared with an IPTV system in C/S mode. This structure improves the controllability of content and enhances the stability of P2P, but it merely uses the merits of both technologies and does not effectively make the advantages and disadvantages of the two technologies complement each other.
Establishing a P2P network between the edge servers: The storage devices of the CDN are organized in a P2P manner; the directory services and multi-point transmission capability of P2P are used to exchange content between CDN storage devices, enhancing the content distribution ability of the CDN (Fig. 2). This structure reduces the pressure that content distribution places on the central server. Here the P2P nodes are servers, that is, the exchange is server to server. Because the edge servers are far apart and their network environments differ considerably, transmission between servers inevitably has some bottlenecks.
The existing structures perform quite differently in terms of scalability, content, copyright, effectiveness of user management, QoS, traffic ordering, client deployment and so on, and their complementary performance reflects the specific characteristics of the P2P system and the CDN system. That is to say, the structure in Fig. 1 is an improved P2P system and the structure in Fig. 2 is an improved CDN system. Although both systems combine advantages of P2P and CDN, neither fully integrates CDN technology with P2P technology. The network architecture of HCDN is shown in Fig. 3.
|Fig. 1:||Structure with CDN as the center and the P2P autonomous region as the edge|
|Fig. 2:||Structure of establishing P2P network between the edge servers|
|Fig. 3:||Network architecture structure of HCDN|
Hybrid Content Distribution Network (HCDN) model: A Hybrid Content Distribution Network (HCDN) is proposed. For comparison, the structure of a traditional CDN network is shown in Fig. 4a. Content is strategically distributed from the source server to the edge servers, and user nodes obtain data from the edge servers. Compared with the traditional C/S structure, CDN networks can reduce data latency, increase the transmission rate and reduce the source server load.
The centralized P2P network structure can be expressed as the index model shown in Fig. 4b; user nodes obtain information about other peers holding the same file by querying the index server, and the file data is then exchanged between the user nodes.
|Fig. 4:||Network models, (a) Traditional CDN network, (b) Centralized P2P network and (c) HCDN network|
Since the index server only maintains the index information and is not involved in the transmission of file data, its load is reduced and the network has good scalability.
The network structure of the Hybrid Content Distribution Network (HCDN) can be abstracted into the two-level hierarchical model shown in Fig. 4c. The content distribution process is divided into two stages: the CDN level and the P2P level. The CDN-level distribution system is deployed in the backbone network and distributes content to the edge of the backbone network. P2P technology is used in the transfer of copies: an edge server can simultaneously get copy data from the source server and from other edge servers. A centralized P2P autonomous network is built in the access network, where user nodes can exchange file contents. Therefore, user nodes can obtain data both from the edge servers of the CDN network and from other peer nodes of the P2P network. Compared with the traditional CDN network, HCDN reduces server load and improves scalability, and thus reduces the number of edge servers and saves deployment cost. Compared with a pure P2P network, HCDN provides a better QoS guarantee: when few peer nodes share the same file, user nodes can download it from the edge server through the CDN network. In addition, the regionalized P2P network in HCDN significantly reduces the transmission load on the backbone network.
It is worth mentioning that HCDN is an overlay network formed by the servers and the user nodes. The edge server is a logical entity, which may consist of a cluster or multiple physical servers. The P2P network in HCDN uses the centralized structure because: (1) Queries against a centralized index are fast, which reduces the response time of user requests, (2) It is simple to implement and control and (3) The index server can be integrated with the edge server or deployed separately.
Performance analysis model based on the fluid model: Owing to the randomness of the user nodes, modeling and analyzing a P2P network is a tough problem. Qiu and Srikant (2004) give a generalized analysis model based on queuing theory (Gaeta et al., 2004); these studies analyze the BitTorrent protocol and a P2P caching system, respectively, and verify the effectiveness of the fluid model.
|Fig. 5:||Flow model of HCDN network, (a) Download bandwidth is the bottleneck and (b) Download bandwidth is not the bottleneck|
|Table 1:||Description table of symbol definition|
This section builds a performance analysis model of the hybrid network based on the fluid model and compares it with the traditional CDN and P2P networks.
Symbols used in the performance analysis are defined in Table 1. The user nodes in HCDN are divided into two categories: download nodes and seed nodes. Download nodes are nodes that have not yet downloaded the entire file and are still downloading; seed nodes are nodes that hold a complete file but remain in the P2P network to offer upload service to other nodes.
Model description: It is assumed that each user node has the same upload bandwidth u1 and the same download bandwidth b, and that requests arrive according to a Poisson process with parameter λ. Download nodes can abort before becoming seed nodes, and the time until abort follows the exponential distribution with mean 1/α; seed nodes leave the system after a period of time, and the time until leaving follows the exponential distribution with mean 1/β. The factor η (0≤η≤1) represents the upload efficiency of the download nodes, namely the utilization of their upload bandwidth; η = 0 means that the download nodes do not upload data and η = 1 means that the download nodes upload at their maximum bandwidth. Without loss of generality, the file size is set as fi = 0.
In the HCDN network, if the download bandwidth is the bottleneck, the overall upload speed of the system is u1; if the download bandwidth is not the bottleneck, the overall upload speed of the system is u1(x(t) + ηy(t)) + μi, in which u1(x(t) + ηy(t)) comes from the P2P network and μi comes from the edge server of the CDN network.
Therefore, the flow model of the HCDN network is as shown in Fig. 5. The changes in the number of seed nodes and download nodes in the system can be expressed as follows:
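As a numerical illustration of this kind of fluid model, the following Python sketch integrates a pair of rate equations with the forward Euler method. It is a sketch under stated assumptions rather than the exact model of this study: it assumes that the service rate is the smaller of the total download bandwidth b·y(t) and the total upload speed u(x(t) + ηy(t)) + u_s, that download nodes arrive at rate λ and abort at rate α, and that seed nodes leave at rate β. The names simulate_fluid, u (peer upload bandwidth) and u_s (edge server upload rate) are hypothetical.

```python
def simulate_fluid(lam, alpha, beta, b, u, eta, u_s, t_end=2000.0, dt=0.1):
    """Forward-Euler integration of an ASSUMED HCDN fluid model.

    x: number of seed nodes, y: number of download nodes.
    lam: arrival rate, alpha: abort rate of download nodes,
    beta: leaving rate of seed nodes, b: download bandwidth per node,
    u: upload bandwidth per node, eta: upload efficiency of download nodes,
    u_s: upload rate of the edge server (hypothetical symbol).
    Returns the pair (x, y) at time t_end.
    """
    x, y = 0.0, 0.0
    for _ in range(int(t_end / dt)):
        # Assumed service rate: limited either by the downloaders'
        # bandwidth or by the combined P2P and edge server upload speed.
        service = min(b * y, u * (x + eta * y) + u_s)
        dy = lam - alpha * y - service   # arrivals - aborts - completions
        dx = service - beta * x          # completions - seed departures
        y = max(0.0, y + dt * dy)
        x = max(0.0, x + dt * dx)
    return x, y


if __name__ == "__main__":
    x_end, y_end = simulate_fluid(lam=1.0, alpha=0.01, beta=0.05,
                                  b=0.02, u=0.1, eta=1.0, u_s=0.5)
    print(x_end, y_end)
```

In the download-limited regime, this assumed model settles at y = λ/(α + b) and x = b·y/β, which gives a quick sanity check on the integration.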
Number of seed nodes and download nodes in the steady state: In the steady state:
According to Eq. 1, it can be obtained:
wherein, and are the equilibrium values of x(t) and y(t). Solving the Eq. 4 to get:
wherein, means that in HCDN download bandwidth is the bottleneck:
Similarly, for the traditional pure P2P network, let ui = 1; using the flow model, the expressions for the number of seed nodes and download nodes in the steady state can be derived:
Average download time: Using Little's law, the average download time of the user nodes in the steady state is obtained as:
wherein, D is the average download time; is average ratio of download nodes after downloading; is the average number of download nodes turning into seed nodes.
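The Little's law argument can be illustrated with a short Python sketch. It follows the usual fluid-model reasoning and is an assumption, not the expression of this study: with a steady-state number of download nodes y_bar, arrival rate λ and abort rate α, the nodes that finish arrive at rate λ − α·y_bar and make up the fraction (λ − α·y_bar)/λ of the download nodes. The function name and argument names are hypothetical.

```python
def average_download_time(y_bar, lam, alpha):
    """Average download time via Little's law (L = rate * time).

    y_bar: steady-state number of download nodes (assumed given),
    lam: request arrival rate, alpha: abort rate of download nodes.
    """
    if lam <= 0:
        raise ValueError("arrival rate must be positive")
    completion_rate = lam - alpha * y_bar        # nodes that finish per unit time
    if completion_rate <= 0:
        raise ValueError("no node completes its download in this regime")
    finishing_fraction = completion_rate / lam   # share of nodes that will finish
    finishing_population = finishing_fraction * y_bar
    # Little's law: population = throughput * time in system.
    return finishing_population / completion_rate


if __name__ == "__main__":
    print(average_download_time(y_bar=20.0, lam=1.0, alpha=0.01))
```

Under these assumptions the expression simplifies to y_bar/λ, so the download time depends on the abort rate only through the steady-state number of download nodes.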
Thus, for HCDN network, according to Eq. 7-9 it can be obtained as follows:
Similarly, for the P2P network, according to Eq. 7-9 it can be obtained as follows:
For the traditional CDN network, the system conforms to a typical M/M/1 queuing model, and its arrival rate and service rate are as follows:
Thus, the probability that the length of the queue is i can be obtained:
Therefore, the average download time of the CDN network can be expressed as follows:
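For reference, the textbook M/M/1 results can be evaluated with the Python sketch below. It uses the standard formulas (queue-length probability (1 − ρ)ρ^i and mean time in system 1/(service rate − arrival rate), with ρ the ratio of arrival rate to service rate); how the arrival and service rates map onto the symbols of this study is left as an assumption, and the function names are hypothetical.

```python
def mm1_queue_length_probability(arrival_rate, service_rate, i):
    """Probability that an M/M/1 system holds exactly i requests."""
    rho = arrival_rate / service_rate
    if not 0 <= rho < 1:
        raise ValueError("M/M/1 is stable only when arrival_rate < service_rate")
    return (1 - rho) * rho ** i


def mm1_mean_time_in_system(arrival_rate, service_rate):
    """Mean time a request spends in an M/M/1 system (waiting plus service)."""
    if not 0 <= arrival_rate < service_rate:
        raise ValueError("M/M/1 is stable only when arrival_rate < service_rate")
    return 1.0 / (service_rate - arrival_rate)


if __name__ == "__main__":
    print(mm1_queue_length_probability(0.5, 1.0, 2))
    print(mm1_mean_time_in_system(0.5, 1.0))
```

The mean time grows without bound as the arrival rate approaches the service rate, which is the behaviour expected of a CDN whose edge server is close to saturation.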
System service capacity: System service capacity refers to the maximum upload rate provided by the system. In CDN, only the edge server offers upload service; in P2P, the seed nodes and download nodes offer upload service; in HCDN, the upload service comes from both the CDN-level and the P2P-level networks. Therefore, the following expression can be obtained:
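The three cases described in words above can be sketched in Python as follows. The sketch assumes that the P2P contribution is u(x + ηy), as in the model description, and that the edge server contributes a fixed rate u_s; the function name and the symbols u and u_s are hypothetical.

```python
def service_capacity(network, x, y, u, eta, u_s):
    """Maximum upload rate offered by the system (assumed form).

    network: "CDN", "P2P" or "HCDN",
    x: number of seed nodes, y: number of download nodes,
    u: upload bandwidth per user node, eta: upload efficiency of download nodes,
    u_s: upload rate of the edge server (hypothetical symbol).
    """
    p2p_part = u * (x + eta * y)   # seeds upload fully, downloaders partially
    if network == "CDN":
        return u_s                 # only the edge server uploads
    if network == "P2P":
        return p2p_part            # only peers upload
    if network == "HCDN":
        return p2p_part + u_s      # both levels upload
    raise ValueError("unknown network type: " + network)


if __name__ == "__main__":
    for name in ("CDN", "P2P", "HCDN"):
        print(name, service_capacity(name, x=5, y=10, u=0.01, eta=0.5, u_s=0.06))
```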
Edge server load: In CDN network, the load of edge server depends on the number of nodes downloading which is shown as follows:
In the HCDN network, the request load of the user nodes is b•y(t) and the service capability of the P2P-level network is u1(x(t) + ηy(t)); assuming that downloading from the P2P network takes priority over downloading from the CDN network, the load of the edge server can be expressed as follows:
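A minimal Python sketch of this P2P-first rule is given below. It assumes that the edge server carries only the demand b·y that the P2P supply u(x + ηy) cannot cover and that its load is capped at its own upload rate u_s; the cap, the function name and the symbols u and u_s are assumptions made for illustration. Setting u = 0 gives the plain CDN case.

```python
def edge_server_load(y, b, u_s, x=0.0, u=0.0, eta=0.0):
    """Edge server load when peers are served by the P2P level first (assumed form).

    y: number of download nodes, b: download bandwidth per node,
    u_s: upload rate of the edge server (hypothetical symbol),
    x: number of seed nodes, u: upload bandwidth per user node,
    eta: upload efficiency of download nodes.
    """
    demand = b * y                              # total request load
    p2p_supply = u * (x + eta * y)              # what the peers can serve
    residual = max(0.0, demand - p2p_supply)    # left over for the edge server
    return min(residual, u_s)                   # server cannot exceed its capacity


if __name__ == "__main__":
    print(edge_server_load(y=10, b=0.005, u_s=0.06))                         # CDN case
    print(edge_server_load(y=10, b=0.005, u_s=0.06, x=5, u=0.001, eta=1.0))  # HCDN case
```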
Network transmission overhead: In the CDN network, the user nodes download content from the edge server, so its network transmission overhead is as follows:
In the P2P network, user nodes exchange data, so the total cost of network transmission is:
In the HCDN network, part of the content is downloaded from the P2P network and the rest comes from the CDN edge server. If downloading from the P2P network takes priority over downloading from the CDN network, then according to Eq. 19 the following is obtained:
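The split between the two sources can be illustrated with the Python sketch below. It assumes that the overhead is the traffic from each source weighted by a per-unit transmission cost, with the P2P level served first and no cap on the edge server; cost_peer and cost_edge are hypothetical parameters, and their correspondence to the symbols of this study is an assumption.

```python
def hcdn_transmission_overhead(y, b, x, u, eta, cost_peer, cost_edge):
    """Transmission overhead of an HCDN-like network (assumed form).

    y: number of download nodes, b: download bandwidth per node,
    x: number of seed nodes, u: upload bandwidth per user node,
    eta: upload efficiency of download nodes,
    cost_peer: per-unit cost of peer-to-peer traffic (hypothetical),
    cost_edge: per-unit cost of edge-server-to-peer traffic (hypothetical).
    """
    demand = b * y
    from_p2p = min(demand, u * (x + eta * y))   # P2P level is used first
    from_edge = demand - from_p2p               # remainder comes from the edge server
    return from_p2p * cost_peer + from_edge * cost_edge


if __name__ == "__main__":
    print(hcdn_transmission_overhead(y=10, b=0.005, x=5, u=0.001, eta=1.0,
                                     cost_peer=3, cost_edge=5))
```

When the peer-to-peer cost is lower than the edge server cost, every unit of traffic moved to the P2P level lowers the total, which is the effect the regionalized P2P network is meant to exploit.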
|Fig. 6:||Network topology of simulation|
Experimental environment setup: Theoretical analysis and numerical simulation are combined for the performance evaluation and comparison. Based on the formulas derived in the previous section, MATLAB is used to compare the performance characteristics of HCDN with those of the other schemes (CDN, P2P and HP2P). The HCDN network simulation is carried out with the General Peer-to-Peer Simulator (GPS), a generic Java-based P2P simulation framework built on a discrete-event simulation mechanism.
In the simulation, this study focuses primarily on the trend in the number of nodes; the effectiveness of the performance model is verified by comparison with theory. The simulation scenario is as follows. Each user node P and each edge server S is connected with a forwarding node T; the source server R is connected with the four forwarding nodes; the topology is shown in Fig. 6. The round-trip delay between P and T is 6 msec, that between S and T is 17 msec and that between R and T is 26 msec. The bandwidth of the edge server S and the source server R is u1; the download and upload bandwidths of each user node are b and u1, respectively. User nodes join the network according to a Poisson process with parameter λ, and the times until download nodes and seed nodes exit follow exponential distributions with parameters α and β, respectively. The congestion control strategy of the transport layer is ignored in the experiments, that is, each node can use its maximum transmission bandwidth.
The following data are used as the basic parameters of the theoretical analysis and numerical simulation: ui = 0.0016, u1 = 0.05, b = 0.005, η = 0.002, α = 0.001, η = 1, u = 1, ti = 3, tp = 5, tr = 7. The influence of each factor on performance is analyzed by changing the parameter values.
RESULTS AND DISCUSSION
Results analysis: In the performance evaluation, the hybrid P2P network (Hybrid Peer-to-Peer, HP2P) is used as the comparison scheme; in the HP2P scenario, user nodes can download data via the P2P network and the source server.
Trends in the number of nodes: The given parameters basically satisfy the condition that the download bandwidth of the download nodes is the bottleneck; Fig. 7a shows how the numbers of download nodes and seed nodes in the P2P-level network of HCDN change over time.
|Fig. 7:||Tendency of the number of nodes, (a) Download bandwidth is the bottleneck and (b) Download bandwidth is not the bottleneck|
To test the case where the download bandwidth is not the bottleneck, let β = 0.005. Figure 7b shows the corresponding trend. As can be seen, the trend includes two phases: the exponential growth phase and the steady phase. It can also be seen that the experimental results are very close to the theoretical values, which verifies the effectiveness of the HCDN performance model based on the fluid model. The subsequent performance evaluation and comparison are based on numerical analysis of the performance model in the previous section.
Factors of the number of nodes: To analyze the impact of the exit rate of the seed nodes on the number of nodes in the steady state, the value of α varies between 0.004~0.012 and Fig. 8 shows the corresponding numerical results. As α increases, the number of seed nodes decreases and the number of download nodes increases.
To analyze the impact of the upload bandwidth of the user nodes on the number of nodes in the steady state, the value of ui varies between 0.004~0.012. Figure 9 shows the corresponding numerical results. As ui increases, the number of seed nodes increases, while the number of download nodes decreases; when ui<b, the size of ui has a significant influence on the number of nodes in the steady state.
To analyze the impact of the edge server upload bandwidth on the number of nodes in the steady state, the value of ui varies between 0~0.5. Figure 10 shows the corresponding results of the numerical analysis.
|Fig. 8:||Impact of the exit rate of seed nodes to the number of nodes|
|Fig. 9:||Impact of the upload bandwidth (ui) of user node to the number of nodes|
It can be seen that as the service capability of the edge server increases, the number of seed nodes in the steady state increases, while the number of download nodes decreases.
Average download time: Let β = 0.001 (i.e., download bandwidth is the bottleneck). To analyze the impact of the request arrival rate on the average download time T, the value of λ varies between 0.027~0.029 (close to the capability of the edge server, ui = 0.04); Fig. 11 shows the corresponding curve. As can be seen from Fig. 11, in the P2P and HP2P networks the average download time increases dramatically as the request arrival rate increases, while in the HCDN network the average download time does not change with the request arrival rate, which verifies the high scalability of the HCDN sharing mechanism among the three network structures.
|Fig. 10:||Impact of the edge server capability (ui) to the number of nodes|
|Fig. 11:||Average download time (when the download bandwidth is the bottleneck)|
To analyze the case where the download bandwidth is not the bottleneck, let α = 0.006 and let the value of λ vary between 0~1; Fig. 12 shows how the average download time changes with the request arrival rate. As can be seen from Fig. 12, HCDN and HP2P have smaller download times than pure P2P. As the request rate increases, the average download time of HCDN and HP2P approaches that of P2P, because with the continuous increase of user nodes the server's share of the total system capability becomes smaller. Meanwhile, the download time of HCDN is smaller than that of HP2P, which verifies the effect of introducing the edge servers of the CDN network.
System service capacity: As shown in Fig. 13, the service capability of HCDN has obvious advantages. The service capabilities of the P2P, HP2P and HCDN networks increase with the number of user nodes, while the service capability of the CDN network is a fixed value, namely the service rate of the edge server.
|Fig. 12:||Average download time (unlimited to download bandwidth)|
|Fig. 13:||System service capacities|
When the number of user nodes is large, the P2P network is superior to CDN; when the number of user nodes is small, CDN is superior to the P2P network. Meanwhile, the service capability of the HCDN network is superior to that of P2P.
Loads of the servers: To analyze the impact of the download nodes on the server load, let ui = 0.06 and the number of seed nodes be 5. Figure 14 shows the corresponding result curve. As can be seen from Fig. 14, as the number of nodes increases, the edge server load grows significantly faster in the P2P and HCDN networks than in the HP2P network. When the number of nodes is large (e.g., y>52), the edge servers work at full capacity.
|Fig. 14:||Edge server load|
|Fig. 15:||Network transmission overhead in CDN, P2P, HP2P and HCDN|
Network transport overhead: Let ui = 0.06 and the number of seed nodes be 15; Fig. 15 shows the experimental results for network transport overhead. In HCDN, ti < tp, because the P2P zone is controlled and the distance between peer nodes is typically less than the distance between a peer and the edge server. As Fig. 15 shows, the network transmission overhead of HP2P and CDN lies between that of HCDN and P2P; when the number of nodes is small, it is close to that of the P2P network and when the number of nodes is large, it is close to that of the CDN network. Since ti < tp, the network transmission overhead of HP2P is larger than that of HCDN.
Through the simulation, the correctness and validity of the presented HCDN model are validated and the download time, service capacity, server load and transmission overhead are analyzed. The simulation results show that the proposed HCDN model can improve download speed, reduce server load and reduce transmission traffic on the backbone network, as shown in Table 2.
|Table 2:||Comparison of difference in HCDN, P2P and HP2P networks|
|Table 3:||Comparison of HCDN model, CDN and P2P networks (comparison of HCDN model and previous studies)|
HCDN deploys the CDN system in the backbone network and builds regionalized P2P networks in the access network, so that users can access data through the CDN and P2P networks simultaneously. Experimental results show that, compared with the traditional CDN network, the proposed model reduces the edge server load and saves deployment cost; compared with the P2P network, it enhances the QoS guarantee and reduces the backbone traffic; compared with HP2P, it has a greater download rate and reduces the network transmission overhead, as shown in Table 3.
Based on CDN and P2P technologies, this study proposes a Hybrid Content Distribution Network (HCDN), which comprehensively utilizes the complementary advantages of CDN and P2P networks, and gives a detailed exposition of the key processes of the HCDN network. Compared with the other networks, the HCDN network has three advantages: (1) It reduces the edge server load and thus saves deployment cost, (2) It enhances the QoS guarantee and reduces the backbone traffic and (3) It has greater download speed and reduces the network transmission overhead.