ABSTRACT
Network on Chips (NoCs) overcome major drawbacks of bus based System on Chip (SoCs) like large delay, scalability, power etc. NoC adopts the principle of computer network into SoCs. Several factors such as Topology, Router and Routing algorithm have strong impact on effectiveness of NoCs in different dimensions. This study analysis presents different routing algorithms for NoCs and a new algorithm for multicast messaging based on mesh topology proposed to improve the overall NoC performance. The efficiency of new algorithm is verified by comparing its performance with Path Based Shortest Path (PBSP) algorithm in terms of energy, throughput and latency. These algorithms were developed and evaluated using C/C++ and Network Simulator, respectively for 2D Mesh NoCs. Comparison results show that the proposed algorithm has consumed less energy with high throughput while providing low latency.
PDF Abstract XML References Citation
How to cite this article
DOI: 10.3923/ajsr.2013.754.762
URL: https://scialert.net/abstract/?doi=ajsr.2013.754.762
INTRODUCTION
As technology scale, the number of processors integrated on a System on Chip (SoC) is increasing sharply. The continuously increasing number of processors, traditional SoC interconnection became the bottleneck of on chip communication. SoCs also encounter some other challenges like grouping system complexity, negative effect of technology scaling on global interconnect, need to construct flexible multiuse design and platforms. With the large numbers of research on SoC design challenges, Network on Chips (NoCs) has been acknowledged as a solution to these problems. NoC applies the network concepts to SoC architecture design and simultaneously solve the communication bottleneck and also other challenges as we discussed in earlier (Benini and De Micheli, 2002; Dally and Towles, 2001). Recently, NoC architectures are investigated in the concept of network topology, router design and routing algorithms that offers the performance, cost, throughput and power consumptions. In addition to these parameters, some of the routing algorithms are used to find shortest path from source node to destination nodes based on routing techniques such as unicast or multicast (Boppana and Chalasani, 1993; Kowalik and Collier, 2003). Figure 1 shows nxm NoC topology that consists of Router Node, Router Interface, Core Element, Core Interface and Physical Link (Benini and De Micheli, 2006).
Past years different routing techniques had developed for various NoC topologies to eliminate bottleneck, find shortest path and provide Quality of Service. In addition to provide shortest path, another important encountered problem is deadlock.
![]() | |
Fig. 1: | Generic nxm network on chip architecture |
This problem is eliminated by odd-even turn model and multiple path technique while increase communication efficiency (Kumar et al., 2002; Chiu, 2000; Carara and Moraes, 2008). Shortest path can also be achieved by balancing traffic among network with low latency and high throughput (Lin et al., 2008). Application specific routing algorithm is used to increase the communication adaptivity and performance while giving shortest path. This algorithm enables specific communication links only based on application and others remain silent (Palesi et al., 2009). Other algorithms such as unicast based, tree-based and path based also proposed to achieve shortest path in multicast messaging (McKinley et al., 1994; Malumbres et al., 1996). This study proposed a new algorithm has been developed and implemented on partitioned NoC takes the advantages of ad-hoc on demand distance vector routing.
NETWORK TOPOLOGIES AND SEGMENTATION
Similarities of NoCs with computer network, possible topologies in NoCs are Tree, Fat Tree, H-Tree, 2D mesh, 3D Mesh, C-Mesh, Ring, Torus, Mesh of Tree and etc. Due to the advantages of Mesh topologies like the addressing of cores quite simple during communication, layout efficacy, predictable result and good electrical properties, we are selecting nxm 2D mesh topology with 36 node counts for this study.
For providing better performance of NoCs, one of the such methods is partitioning of network architectures into several subsets based on Physical concepts (for static and dynamic communications) and Logical concepts (for data and instruction transfers) between the subsets which contains the destination nodes (Lin and Nil, 1993; Mohapatra and Varavithya, 1996). Two dimensional mesh based network looks like matrix, therefore network architecture partitioned with the help of node value (node N = nxm). Each node has its own integer coordinate pair (x, y), where, 0 ≤ x ≤ n and 0 ≤ y ≤ m. Network segmentation is described in Fig. 2. Network is partitioned in to four segment based on node label L(x, y) value of destination nodes; where label L(x, y) = yxn+x. Assume (x0, y0) is source node (Violet) and D is destination set. Initially, entire network is partitioned into two sets (D->{DH, DL}) by comparing node label values and each set are divided into two subsets (DH->{DHL (Red), DHR (Pink)}, DL->{DLL (Blue), DLR (Green)}). All orange colour nodes can be identified as destination nodes.
![]() | |
Fig. 2(a-c): | (a) 6x6 NoC structure, (b) Network partitioning and (c) NoC with 1 source and 9 destinations |
Routing methodologies: Routing technique takes the advantages of deciding a path from source to destinations in a particular topology. After deciding network topology scheme, selection of routing algorithm is the next step. Routing algorithms realize the network topology and offer better communication among all the nodes. A good routing technique balances a load across the network channels to provide high throughput. The routing algorithms are classified into deterministic, oblivious and adaptive algorithms in terms of how they select between the set of possible paths from source to destinations. A well designed routing algorithm also keeps path lengths as short as possible, reducing the number of hops and overall message latency, these types of routing algorithms are known as shortest path routing algorithms. These routing techniques can be applied for both unicast and multicast NoC communications (Duato et al., 2003; Liao and Lin, 2008).
More number of routing algorithms had proposed to provide shortest path by many researchers. Some of the earliest algorithms are Dual Path, Multi Path, Column Path, Congestion Minimized Multipath, Distributed Congestion Minimized Multipath, Maxflow and Path Based (PB) shortest path. Based on analysis of all these algorithms, shortest path between source and destinations is achieved with some trade-offs between performance parameters (Al-Dubai and Romdhani, 2006; Al-Mahadeen and Omari, 2004; Xin et al., 2009; Dally and Towels, 2004).
PROPOSED ALGORITHM
Proposed algorithm is developed based on Floyd Wershal Shortest path algorithm for multicast messaging in 2D mesh NoC. The main feature of this algorithm is to provide the lengths between all pairs of nodes and also it is possible to provide a method to reconstruct the actual path between nodes. This algorithm also, utilizes the network partitioning concept which is similar to that of PBSP algorithm but here the shortest path is calculated based on the adjacent hop. The distance is calculated by incorporating three important issues. First, calculate the distance between the nodes to find the neighbor node. Second, choose the next hop neighbor which has the maximum distance from the source, towards the destination. Third, repeat the steps one and two until we find destination. The pseudo code for proposed algorithm is presented below as Algorithm 1. Message routing is to be done by using Ad-hoc On-Demand Distance Vector scheme (AODV). This scheme is called as reactive routing because it determines the routes when path is needed and it is suitable for both unicast and multicast messaging.
Algorithm 1: Pseudo code of proposed algorithm![]() |
![]() | |
Fig. 3: | Message format |
Message format: The message format for transmission is described as shown in Fig. 3. It consists of a segment header and a data section in which header section contains ten mandatory fields and optional extension field and data section contains payload data for carried applications. The length of the data section can be calculated by subtracting the combined length of the header and the encapsulating IP segment header from the total IP segment length. And flag register contains 8 bit, CWR (Congestion Window Reduced), ECE (ECN-Echo Indicates), ACK (Acknowledgment), PSH (Push), RST (Reset), SYN (Synchronize Sequence numbers), FIN (No address).
RESULTS AND DISCUSSION
These two algorithms for multicast messaging are developed by C/C++ based on standard libraries and evaluated with Network simulator on 2D Mesh NoC, running under Fedora Linux OS. The simulator determines the shortest path from source node {L (2, 3) = 20} to all required 9 destinations {L (2, 0) = 2; L (4, 0) = 4; L (0, 6) = 6; L (2, 1) = 8; L (4, 1) = 10; L (0, 4) = 24; L (5, 4) = 29; L (3, 5) = 33; L (5, 5) = 35}. In addition to shortest path, simulator also used to calculate throughput, energy and latency with the inputs includes data size, network topology, routing algorithm and operating frequency.
Ad-hoc on demand Distance Vector Routing (AODV) used to route the message packets from source to destinations. For our simulation purpose, initially we have fixed 100 J for both the algorithms. Simulation results of throughput and energy with simulation time (sec) of Proposed Algorithm are described in Fig. 4 and 5, respectively with 9 destinations. Simulated output showed that, proposed routing algorithm transmitted 250 packets while PBSP routing algorithm transmitted 160 packets. From the energy graph, PBSP was utilizing up to 95.00 J but proposed algorithm was taking up to 97.20 J from the fixed energy to send the packets as we mentioned above.
![]() | |
Fig. 4: | Energy analysis of proposed algorithm |
![]() | |
Fig. 5: | Throughput graph of proposed algorithm |
![]() | |
Fig. 6: | Throughput of PBSP and proposed algorithm with 9 destinations |
For transmission over nine destinations, proposed algorithm consumed only 2.80 J but PBSP algorithm consumed 5.0 J. Figure 6 and 7 describe the comparison results of proposed algorithm and PBSP algorithm. In addition to energy and throughput, average latency of the proposed one also calculated for 9 destinations and then compared with PBSP algorithm which is plotted in Fig. 8. To assess the performance of proposed algorithm, it is then compared with other multicast routing algorithms like Dual Path (DP), Multi Path (MP), Column Path (CP), Path Based (PB), Congestion Minimized Multipath (CMM), Distributed-CMM (DCMM), Maxflow Algorithm (MA). In dual path multicast messaging algorithm, the network is partitioned in to two subnet based on comparison of source and destination nodes label.
![]() | |
Fig. 7: | Remaining energy of PBSP and proposed with 9 destinations |
![]() | |
Fig. 8: | Average latency of PBSP and proposed with 9 destinations |
First subnet is assumed as higher subnet DH if destination nodes label are higher than source node label, second subnet is assumed as lower subnet DL if destination nodes label are lower than source node label. Multi Path algorithm proposed to reduce message path length. In which each subnet (DH and DL) is further divided in to two subnets by comparing x co-ordinate values of source and destination nodes. Column Path multicast algorithm reduced path length by grouping the columns which contain destination nodes in the network and making multiple copies of message. Therefore multiple copies of messages used to achieve desired lowest path. In Path Based algorithm, entire network is segmented in to four groups based on comparison of node label values and routing is performed using odd-even technique.
All the above algorithms are used to deliver the data packets to multiple destinations from single source with reduced path length. But CMM and DCMM are focusing to minimize the traffic congestion by applying polynomial algorithm and distributed flow algorithm, respectively. Performance of CMM, DCMM and MA are verified under 6x6 mesh network without segmentation. The simulation works show that average preparation time to complete multicast messaging in DP, MP, CP, PB, CMM, DCMM, MA and PA are 35, 46, 85, 46, 64, 36, 58 and 39 cycles, respectively. Therefore, proposed algorithm performed data transmission with 39 clock cycles but DCMM takes 36 clock cycles to transmit the data to designated destinations. The percentage of peak energy consumption of DP, MP, CP, PB, CMM, DCMM, MA and PA are 27.5, 6, 35.5, 4.8, 17.3, 6.5, 37.1 and 2.5, respectively. So that, PA consumed very less energy compared to all other algorithm because of network segmentation and consideration of all pairs of nodes in network.
![]() | |
Fig. 9: | Comparison results of all previous results with proposed algorithm |
The comparison result of all these algorithms is shown in Fig. 9. Due to formal distribution method, DCMM has transmitted more number of data packets with fixed time duration 30 seconds. Hence this study shows that all multicast algorithms for network on chip are the trade-off between throughput, energy and latency.
CONCLUSION AND FUTURE WORK
In this study, a new algorithm for 2D mesh Network on Chips proposed. To improve the performance of NoCs, this proposed algorithm used segmentation of network architecture, adjacent node calculation and Ad-hoc On-demand Distance Vector routing for messaging through network. A Network Simulator was used to calculate shortest path, throughput, energy and average latency of proposed algorithm as well PBSP algorithm. To understand the performance and efficiency of proposed algorithm, it has compared with PBSP algorithm. The comparison result showed that the proposed algorithm had high throughput and it consumed less energy for data transmission among the networks. The proposed algorithm also had less average communication delay. The future work will be extended to carry out new algorithms for multicast messaging with different network architectures, destinations and loads to offer better performance of NoC. In addition to introduce new algorithms, different network routers are also to be analyzed and implemented.
ACKNOWLEDGMENTS
We would like to thank all the anonymous reviewers for their valuable suggestions. And also we extend our sincere thanks to VLSI Design centre, Department of Electronics and Communication Engineering, PSG College of Technology, India for providing necessary facilities.
REFERENCES
- Benini, L. and G. De Micheli, 2002. Networks on chips: A new SoC paradigm. IEEE Comput., 35: 70-78.
CrossRef - Kowalik, K. and M. Collier, 2003. Should QoS routing algorithms prefer shortest paths?. Proceedings of the IEEE International Conference on Communications, Volume 1, May 11-15, 2003, IEEE Press, USA., pp: 213-217.
CrossRef - Boppana, R.V. and S. Chalasani, 1993. A comparison of adaptive wormhole routing algorithms. Proceedings of the 20th Annual International Symposium on Computer Architecture, May 16-19, 1993, San Diego, CA., USA., pp: 351-360.
CrossRef - Kumar, S., A. Jantsch, M. Millberg, J. Oberg and J.P. Soininen et al., 2002. A network on chip architecture and design methodology. Proceedings of the IEEE Computer Society Annual Symposium on VLSI, April 26-26, 2002, Pittsburgh, PA., USA., pp: 105-112.
CrossRef - Lin, X. and L.M. Nil, 1993. Multicast communication in multicomputer networks. IEEE Trans. Parallel Distrib. Syst., 4: 1105-1117.
CrossRef - Al-Dubai A. and I. Romdhani, 2006. A performance study of path based multicast communication algorithms. Proceedings of the International Symposium on Parallel Computing in Electrical Engineering, September 13-17, 2006, Białystok, Poland, pp: 245-250.
CrossRef - Al-Mahadeen, B. and M. Omari, 2004. Adaptive wormhole routing in mesh-hypercube network. J. Applied Sci., 4: 568-574.
CrossRefDirect Link - Mohapatra, P. and V. Varavithya, 1996. A hardware multicast routing algorithm for two-dimensional meshes. Proceedings of the 8th IEEE Symposium on Parallel and Distributed Processing, October 23-26, 1996, New Orleans, LA, USA, pp: 198-205.
CrossRef - Liao, H.C. and C.J. Lin, 2008. A position-based connectionless routing algorithm for MANET and WiMAX under high mobility and various node densities. Inform. Technol. J., 7: 458-465.
CrossRefDirect Link - Chiu, G.M., 2000. The odd-even turn model for adaptive routing. IEEE Trans. Parallel Distributed Syst., 11: 729-738.
CrossRefDirect Link - Lin, S.Y., C.H. Huang, C.H. Chao, K.H. Huang and A.Y. Wu, 2008. Traffic-balanced routing algorithm for irregular mesh-based on-chip networks. IEEE Trans. Comput., 57: 1156-1168.
CrossRef - Carara, E.A. and F.G. Moraes, 2008. Deadlock-free multicast routing algorithm for wormhole-switched mesh networks-on-chip. Proceedings of the Symposium on VLSI, ISVLSI, IEEE Computer Society Annual, Montpellier, April 7-9, 2008, pp: 341-346.
CrossRef - Palesi, M., R. Holsmark, S. Kumar and V. Catania, 2009. Application specific routing algorithms for networks on chip. Parallel Distributed Syst. IEEE Trans., 20: 316-330.
CrossRef - McKinley, P.K, H. Xu, A.H. Esfahanian and L.M. Ni, 1994. Unicast-based multicast communication in wormhole-routed networks. Parallel Distributed Syst. IEEE Trans., 5: 1252-1265.
CrossRef - Malumbres, M.P., J. Duato and J. Torrellas, 1996. An efficient implementation of tree-based multicast routing for distributed shared-memory multiprocessors. Proceedings of the 18th IEEE Symposium on, Parallel and Distributed Processing, October 23-26, 1996, New Orleans, LA., pp: 186-189.
CrossRef - Xin, G., Z. Jun and Z. Tao, 2009. A distributed multipath routing algorithm to minimize congestion. Proceedings of the IEEE/AIAA 28th Digital Avionics Systems Conference, October 23-29, 2009, Orlando, FL, USA., pp: 7.B.2-1-7.B.2-8.
CrossRef