

# Reconfigurable NOC: A Data Approximation Framework for Network on Chip Architectures

T. Thangam<sup>1</sup>, R. Sangeetha<sup>2</sup>

<sup>1</sup>Associate Professor, Department of Electronics and Communication, PSNACET, Dindigul, India <sup>2</sup>ME Student, Department of Electronics and Communication, PSNACET, Dindigul, India

Abstract: The trend of unsustainable power consumption and large memory bandwidth demands in massively parallel multi-core systems, with the advent of the big data era, has brought about alternate computation models utilizing heterogeneity, specialization, processor-in-memory and approximation. Approximate computing is a promising methodology for low power IC design which returns a possibly approximate result rather than an absolutely accurate result. It suits error-tolerant applications such as audio and video processing, machine learning, multimedia and big data analytics, which allow inaccurate outputs within an acceptable variance. Utilizing relaxed accuracy for high throughput in network-on-chip, which has quickly become the accepted method for connecting a large number of on-chip components, has not yet been explored. We propose an accuracy configurable adder based reconfigurable NOC, a data approximation framework with an online data error control mechanism for high performance NOCs. This work facilitates approximate matching of data to reduce the transmission of approximately similar data in the NOC, and proposes a new reconfigurable approximate carry look-ahead adder. The proposed method has been evaluated using Xilinx ISE 12.1.

*Keywords*: approximate computing, accuracy configurable adder, network-on-chip

#### 1. Introduction

Approximate computing is a promising methodology for low power IC design which returns a possibly inaccurate result instead of a guaranteed accurate result. This computation technique can be used for many applications, such as audio, video, haptic processing and machine learning, where an approximate result is sufficient for its purpose. Such error-tolerant applications are found in abundance in emerging technologies. Approximate computing is for the most part focused on arithmetic circuits, and several approximate adder designs have been developed.

Network-on-Chip is a communication subsystem on an integrated circuit, typically between IP cores in a system on chip. The Network-on-Chip technology applies networking theory and methods to on-chip communication. It is a reliable and scalable communication paradigm deemed an alternative to the classic bus system in modern systems-on-chip. The concept of reconfigurable Network-on-Chip introduces self-adaptive mechanisms in Network-on-Chips. This self-adaptive mechanism is more efficient and flexible, and is also applicable for self-optimizing, self-healing and self-protecting. The design goals for Network-on-Chips can be described as platform based design, separation between communication and computing resources, and reduction of area and energy.

Approximation can reduce power consumption by using inaccurate circuits. In the ACA adder the accuracy of results is configurable during runtime; on account of its configurability, the ACA adder can work in both approximate mode and exact mode. A more generalized version of ACA is the Generic Accuracy Configurable Adder [2]. In [1], a new accuracy configurable adder circuit, Accurus, was proposed, which is similar to ACA in producing the first approximate result but subsequently improves the accuracy over pipeline stages starting from the Most Significant Bit (MSB) first. Due to this, a large improvement in accuracy is obtained within the first few stages of the error correction pipeline [1]. In [3], an accuracy configurable adder circuit called the Gracefully-degrading adder (GDA) was proposed, which allows the sub-adder widths, and hence the carry chains considered, to be reconfigured during runtime. It has no correction unit [3]. In [4], a Simple Accuracy Reconfigurable Adder was proposed; it is a carry-prediction based accuracy configurable adder design with less area compared to the CLA. Compared to the gracefully-degrading adder, the simple accuracy reconfigurable adder incurs 50% less power-delay product and can achieve the same peak signal-to-noise ratio (PSNR). A delay-adaptive reconfiguration technique further improves the accuracy-power-delay trade-off [4]. In [5], dynamic programming based lifetime-aware routing algorithms were proposed to optimize the lifetime reliability of Network-on-Chip routers. In [6], a reconfigurable router was used, in which the buffer slots are dynamically allocated to increase router efficiency in a Network-on-Chip, even under rather different communication loads.

In this paper, we propose a Reconfigurable Approximate Carry Look-Ahead Adder. It has two operating modes, exact and approximate. The structure of the adder is based on the exact carry look-ahead adder, and the exact add operation does not require an external correction unit. Compared with other kinds of adder, this CLA-based adder has significantly smaller delay, area and power in approximate mode.

#### 2. Comparison of different adders

#### A. Accuracy configurable adder

This section considers the accuracy configurable adder (ACA), which performs the arithmetic operation using overlapping sub-adders to reduce errors from carry truncation, as shown in Fig. 1. N and K are the two design parameters of the ACA adder: N is the length of the addends and each sub-adder is 2K bits wide. Fig. 1 uses N = 20 and K = 4. The total number of sub-adders is calculated using the equation below.

$$M = (N/K) - 1 \tag{1}$$

Fig. 1 has four sub-adders, S0 to S3, and each sub-adder is 8 bits wide. When a carry chain is longer than K bits, a sub-adder misses the carry [1], so its output is wrong; to correct the error, the output must be incremented by 1. Error correction starts from the least significant sub-adder and proceeds to the most significant sub-adder. The drawback of the ACA adder is that the early corrections in the pipeline stages are small in magnitude, because they are performed from the least significant bit.
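The overlapping sub-adder scheme can be sketched behaviourally in Python. This is a minimal illustrative model rather than the circuit of Fig. 1: the function name `aca_add` and the decision to drop the final carry-out are assumptions, while N = 20 and K = 4 follow the example above.

```python
def aca_add(a: int, b: int, n: int = 20, k: int = 4) -> int:
    """Behavioural sketch of an ACA-style approximate addition (assumed model).

    Each sub-adder is 2K bits wide and overlaps its neighbour by K bits.
    The final carry-out is ignored, so the result is an N-bit value.
    """
    m = n // k - 1                 # Eq. (1): number of sub-adders
    window = (1 << (2 * k)) - 1    # mask selecting a 2K-bit operand slice
    upper = (1 << k) - 1           # mask selecting K sum bits

    # Sub-adder S0 contributes all 2K of its sum bits.
    result = ((a & window) + (b & window)) & window
    for i in range(1, m):
        # Sub-adder Si sees operand bits [i*K, i*K + 2K) with carry-in 0.
        s = ((a >> (i * k)) & window) + ((b >> (i * k)) & window)
        # Only its upper K sum bits reach the result.
        result |= ((s >> k) & upper) << ((i + 1) * k)
    return result


exact = (0x000FF + 0x00001) & 0xFFFFF   # the carry ripples across 8 bit positions
approx = aca_add(0x000FF, 0x00001)      # S1 never sees the carry generated at bit 0
print(hex(exact), hex(approx))          # prints 0x100 0x0
```

In this example the carry chain is longer than K = 4 bits, so sub-adder S1 misses the carry and its output would need the +1 correction described above. An addition without long carry chains, such as `aca_add(0x12345, 0x11111)`, is computed exactly.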



Fig. 1. Accuracy configurable adder

#### B. Generic accuracy configurable adder

This section discusses the Generic Accuracy Configurable Adder (GeAr), a generalized version of the ACA adder [2]. This adder has three parameters, R, P and N: R is the number of redundant bits, P is the number of previous bits used for carry prediction, and N is the length of the operands. The number of sub-adders K is calculated using the equation below, where L = R + P is the sub-adder length. For example, N = 12, R = 4 and P = 4 give L = 8 and K = 2, the configuration used in Section 4.

$$K = ((N-L)/R) + 1 \tag{2}$$



Fig. 2 shows the architecture and configurations of the generic accuracy configurable adder for given N, R and P. In the example, two sub-adders are used and each sub-adder is 8 bits long. In this configuration all the sum bits of sub-adder 1 contribute to the final sum. In sub-adder 2 the lower 4 bits are used to improve the carry prediction, and the remaining sum bits contribute to the final sum. For the example of Fig. 2, an additional sub-adder unit is needed to increase P. This configuration achieves low delay, low area and average accuracy.

The main drawback of the generic accuracy configurable adder is similar to that of the ACA adder: very little accuracy is gained in the early stages of error correction.

#### C. Gracefully-degrading accuracy configurable adder

This section discusses the gracefully-degrading accuracy configurable adder (GDA) design. Fig. 3 shows the accuracy configurable adder based pipeline structure. This adder consists of four stages, and all the stages must be performed if the most significant bits of the result are to be accurate [3]. Consider an N-bit GDA adder with two N-bit addends A and B, in which adder units are used to obtain the segmented addends. With $A_i$ and $B_i$ denoting the i<sup>th</sup> bits of A and B, the propagate $P_i$, generate $G_i$ and carry $C_{i+1}$ signals are as follows:

$$\begin{array}{l} P_i = A_i \oplus B_i \\ G_i = A_i \cdot B_i \\ C_{i+1} = G_i + P_i \cdot C_i \end{array} \tag{3}$$
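The recurrence in Eq. (3) can be traced bit by bit with a short Python sketch. The helper name `carry_chain` and the initial carry $C_0 = 0$ are assumptions made for illustration; the three assignments mirror the three lines of Eq. (3).

```python
def carry_chain(a: int, b: int, n: int):
    """Return per-bit propagate, generate and carry lists for two n-bit addends."""
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]   # P_i = A_i xor B_i
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]   # G_i = A_i and B_i
    c = [0] * (n + 1)                                   # C_0 = 0 (assumed carry-in)
    for i in range(n):
        c[i + 1] = g[i] | (p[i] & c[i])                 # C_{i+1} = G_i + P_i C_i
    return p, g, c


p, g, c = carry_chain(0b1011, 0b0110, 4)
print(c)   # prints [0, 0, 1, 1, 1]
```

For 11 + 6 the carry is generated at bit 1 and then propagated through bits 2 and 3, which is the kind of carry chain that a segmented adder truncates when a sub-adder boundary falls inside it.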

Multiplexers connect the adder units, and each carry-in is selected from either the less significant adder unit or the carry-in prediction component. The problem of reconfiguring the GDA adder into a certain mode is therefore equivalent to the problem of combining adder units into certain sub-adders with low accuracy loss. Carry-in prediction is critical to the accuracy of approximate adders, and the accuracy configuration mechanism reconfigures the number of carry prediction bits. An accurate carry-in prediction is complex to implement, and various carry-in prediction schemes have been proposed in the literature. The GDA adder has some drawbacks: it has no error detection and correction scheme, it cannot be pipelined, and high accuracy comes at the cost of greater delay.



Fig. 3. Gracefully-degrading accuracy configurable adder

#### D. Simple accuracy reconfigurable adder

This section discusses the simple accuracy reconfigurable adder design. It is a carry-prediction based accuracy configurable adder design with less area compared to the CLA. Compared to the gracefully-degrading adder, the simple accuracy reconfigurable adder incurs 50% less power-delay product and can achieve the same peak signal-to-noise ratio (PSNR); a delay-adaptive reconfiguration technique also improves the accuracy-power-delay tradeoff [4]. The two carry options are realized using multiplexers in Fig. 4, and the multiplexer output is denoted by $c_i^{\text{A}}$. $c_i^{\text{A}}$ can be configured to either accurate mode or approximate mode. The advantages of the simple accuracy reconfigurable adder are confirmed by its application in multiplication and discrete cosine transform computation.



Fig. 4. Simple accuracy-reconfigurable adder design

| Parameter | Definition                        |
|-----------|-----------------------------------|
| N         | Length (bit-width) of the addends |
| K         | Each sub-adder is 2K bits wide    |
| P         | Number of previous bits           |
| R         | Number of redundant bits          |
| Ai and Bi | i<sup>th</sup> bit of A and B     |
| Pi        | Propagate bit                     |
| Gi        | Generate bit                      |
| Ci        | Accurate carry-out bit            |
Table 1 Definition of parameters

#### 3. Error correction and detection



Fig. 5. Error correction and detection circuit

In the Accuracy Configurable Adder and the Gracefully-Degrading accuracy configurable adder, most of the error correction stages must be enabled before the accuracy of the most significant bits of the result improves. Even in a pipelined implementation, the performance is poorer than that of an accurate carry look-ahead adder. The purpose of the correction stage is to bring the inaccurate result close to the accurate result. When error correction starts from the least significant bit, each early correction changes the output only slightly, so accuracy improves inefficiently. To reduce this problem, a new error correction method, the Accurus technique for accuracy configurable adders, is introduced. In this technique the correction starts from the most significant bit position, so the accuracy enhancement proceeds from the most significant sub-adder to the least significant sub-adder. The area, power and delay are almost the same for this correction technique.

Fig. 5 shows the error detection and correction circuit for a sub-adder, which forms the basis of error detection and correction in the ACA adder. The AND gate structure is used to detect the error and the incrementor unit is used to correct it. Two implementation schemes are possible for this adder: enhancing one sub-adder at a time, and parallel enhancement. The first scheme uses almost the same delay, power and area as the ACA adder, but with this scheme the Accurus technique does not give an accurate result. The second scheme is very useful for addition applications. The earlier technique is iterated X times to get a 100% accurate result. For a Generic Accuracy Configurable Adder with two sub-adders, the addition requires 1 cycle without error correction and 2 cycles with error correction. In the error correction stage, the lower inputs of the multiplexers are used to generate the accurate results.
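The detect-then-increment idea can be illustrated for a single sub-adder with a small Python model. This is a sketch under stated assumptions, not the circuit of Fig. 5: it assumes that the AND structure combines the propagate signals of the lower K bits with the carry arriving from the less significant bits, and the name `detect_and_correct` and its arguments are hypothetical.

```python
def detect_and_correct(x: int, y: int, carry_in: int, k: int = 4):
    """Model one 2K-bit sub-adder with AND-based detection and a +1 correction.

    x, y     : the 2K-bit operand slices seen by the sub-adder
    carry_in : true carry arriving from the less significant bits
    Returns (error_flag, uncorrected_upper, corrected_upper).
    """
    mask = (1 << k) - 1
    upper = ((x + y) >> k) & mask               # speculative upper K sum bits (carry-in 0)
    all_propagate = ((x ^ y) & mask) == mask    # AND of the lower K propagate bits
    error = carry_in == 1 and all_propagate     # the incoming carry was missed
    corrected = (upper + 1) & mask if error else upper   # incrementor
    return error, upper, corrected


print(detect_and_correct(0x0F, 0x00, carry_in=1))   # prints (True, 0, 1)
```

When at least one of the lower K bit positions does not propagate, the incoming carry cannot reach the upper half of the sub-adder, so the speculative result is already correct and the incrementor is bypassed.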

#### 4. Error probability

This section gives a general framework for determining the probability of error, that is, the probability of the output being incorrect. The bit-widths of the sub-adder and the adder are 2K and N, respectively. The total number of sub-adders is given by Eq. (4).

$$M = (N/K) - 1 \tag{4}$$

The output of a sub-adder depends on its carry-in, as discussed earlier. The probability that the final result is in error is given by

$$P_{error} = P\left(\bigcup_{i=1}^{M} A_i\right) = \sum_{i=1}^{M} P(A_i) - \sum_{i=1}^{M-1} \sum_{j=i+1}^{M} P(A_i \cap A_j) + \sum_{i=1}^{M-2} \sum_{j=i+1}^{M-1} \sum_{k=j+1}^{M} P(A_i \cap A_j \cap A_k) - \cdots \tag{5}$$

where $A_i$ denotes the event that the i<sup>th</sup> sub-adder produces an error.

The probability that two addends of $i \cdot k$ bits produce a carry out is given by

$$P(E2) = 2^{ik} (2^{ik} + 1)/2^{2ik+1} \tag{6}$$

For the Accuracy Configurable Adder with N = 20 and K = 4, the probability of error is 0.9387 (93.87%) for the conventional method and 0.9398 (93.98%) for the Accurus method.

For the Generic Accuracy Configurable Adder with N = 12, R = 4, P = 4 and K = 2, the probability of error is 2.9297%.
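The 2.9297% figure for the Generic Accuracy Configurable Adder can be reproduced by direct counting. The sketch below is an assumed behavioural model (the function `gear_add` is hypothetical and takes the sub-adder length as L = R + P), and it assumes uniformly distributed operands. With two sub-adders, whether the carry is missed depends only on operand bits 0 to 7, so the enumeration covers those bits only.

```python
from fractions import Fraction


def gear_add(a: int, b: int, n: int = 12, r: int = 4, p: int = 4) -> int:
    """Behavioural sketch of a GeAr-style adder (assumed model, carry-out kept)."""
    l = r + p                        # sub-adder length L = R + P
    k = (n - l) // r + 1             # number of sub-adders, K = ((N - L) / R) + 1
    window = (1 << l) - 1
    s = (a & window) + (b & window)
    result = s & window              # the first sub-adder contributes all L bits
    for j in range(1, k):
        # Sub-adder j sees operand bits [j*R, j*R + L) with carry-in 0.
        s = ((a >> (j * r)) & window) + ((b >> (j * r)) & window)
        # Its lower P bits only predict the carry; the upper R bits are kept.
        result |= ((s >> p) & ((1 << r) - 1)) << (j * r + p)
    return result | ((s >> l) << n)  # carry-out of the last sub-adder


# Enumerate operand bits 0..7 (upper bits held at 0) and count wrong sums.
errors = sum(gear_add(a, b) != a + b for a in range(256) for b in range(256))
p_error = Fraction(errors, 256 * 256)
print(errors, p_error, float(p_error))   # prints 1920 15/512 0.029296875
```

The count factors as 120 operand pairs whose lowest four bits generate a carry times 16 pairs whose next four bits all propagate it, giving 15/512, or about 2.9297%.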

#### 5. Proposed work

Network-on-Chip is a communication subsystem on an integrated circuit, typically between IP cores in a system on chip. The Network-on-Chip technology applies networking theory and methods to on-chip communication. It is a reliable and scalable communication paradigm deemed an alternative to the classic bus system in modern systems-on-chip.

| Methods | Probability of error | Probability of error by simulation |
|---------|----------------------|------------------------------------|
| GeAr    | 2.9297%              | 2.9480%                            |
| Accurus | 96.98%               | 94.13%                             |

Table 2 Comparison for probability of error

- *Router:* A router is a networking device that forwards data packets between computer networks. Routers perform the traffic directing functions on the Internet. A router has considerably more capabilities than other network devices, such as a switch or hub, which are only able to perform basic network functions.
- *IP Core:* An IP core or IP block is a reusable unit of logic, cell or integrated circuit layout design that is the intellectual property of one party. IP cores can be used as building blocks within application-specific integrated circuit designs.
- *Network Interface:* A network interface is generally a network interface card (NIC). A network interface is the point of interconnection between a computer and a private or public network [7].



We propose RDNoCs, a reconfigurable compressed-data approximation framework for Network-on-Chips that alleviates the impact of heavy data communication stress by leveraging the error tolerance of applications. To reduce the transmission of approximately similar data in the Network-on-Chip, a new configurable accuracy tuning adder is proposed: the Reconfigurable Approximate Carry Look-Ahead Adder. It has two operating modes, exact and approximate. The structure of the adder is based on the exact carry look-ahead adder, and the exact add operation does not require an external correction unit. Compared with other kinds of adder, this CLA-based adder has significantly smaller delay, area and power in approximate mode. The sum and carry of the exact carry look-ahead adder are given by

$$\begin{array}{l} S_i = P_i \bigoplus C_i \\ C_{i+1} = G_i + P_i C_i \end{array}$$

Where  $C_i$  is the input carry and  $P_i$  and  $G_i$  are the propagate  $(A_i \bigoplus B_i)$  and generate  $(A_i B_i)$  signals of the i<sup>th</sup> stage.



Fig. 7. Proposed reconfigurable approximate carry look-ahead adder

Fig. 7 shows the proposed Reconfigurable Approximate Carry Look-Ahead Adder structure. The left part of the structure is the approximate part and the right part is the augmenting part; both parts are used to calculate the carry output. Based on this segmentation, the proposed reconfigurable approximate adder realizes two operating modes, exact and approximate. Only one multiplexer is added compared to the exact CLA.
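One possible reading of this segmentation is sketched below in Python. It is an assumption-laden illustration, not the circuit of Fig. 7: it assumes that the approximate part evaluates only the look-ahead terms of the nearest `w` stages, that the augmenting part supplies the remaining terms, and that the carry-in is 0. The window size `w` and the function name `rap_cla_add` are hypothetical and do not come from the paper.

```python
def rap_cla_add(a: int, b: int, n: int = 16, w: int = 4, exact: bool = True) -> int:
    """Sketch of a two-mode carry look-ahead addition (assumed model).

    w is a hypothetical window size: the number of nearest stages whose
    generate terms form the approximate part of each carry.
    """
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]   # propagate signals
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]   # generate signals
    total = 0
    for i in range(n + 1):
        approx_part = 0    # look-ahead terms from the nearest w stages
        augment_part = 0   # remaining, longer-range look-ahead terms
        run = 1            # product of propagate bits between stage j and stage i
        for j in range(i - 1, -1, -1):
            term = g[j] & run
            if i - 1 - j < w:
                approx_part |= term
            else:
                augment_part |= term
            run &= p[j]
        # The multiplexer: exact mode uses both parts, approximate mode only one.
        carry = (approx_part | augment_part) if exact else approx_part
        if i < n:
            total |= (p[i] ^ carry) << i    # S_i = P_i xor C_i
        else:
            total |= carry << n             # final carry-out
    return total


print(hex(rap_cla_add(0x00FF, 0x0001, exact=True)))    # prints 0x100
print(hex(rap_cla_add(0x00FF, 0x0001, exact=False)))   # prints 0xe0
```

In exact mode the model reproduces ordinary addition. In approximate mode a carry generated more than `w` stages below a bit is ignored, which shortens the longest look-ahead term at the cost of an occasional wrong sum such as the one shown.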

#### 6. Results and discussion

In this section, the design parameters of the proposed adder (Reconfigurable approximate carry look-ahead adder), with and without the error reduction unit, are studied. The Generic Accuracy Configurable Adder provides more design possibilities than the gracefully-degrading accuracy configurable adder. Some configurable adders, such as ACA [1] and GeAr [2], implement error correction with pipelining, so obtaining the complete result sometimes takes multiple clock cycles. The results include both approximate and exact operating modes. The delay and area results of the 16-bit approximate adder are reported in Table 3, which compares the reconfigurable approximate adder with the existing adder. The reconfigurable approximate carry look-ahead adder has lower delay compared to GeAr.

| Parameter  | Existing  | Proposed with A0 | Proposed with A1 |
|------------|-----------|------------------|------------------|
| Slices     | 12        | 9                | 11               |
| LUTs       | 22        | 17               | 22               |
| Path delay | 11.090 ns | 12.427 ns        | 9.082 ns         |

Table 3 Comparison table
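The path delays in Table 3 can be expressed as relative changes with respect to the existing adder. The short Python calculation below uses only the three delay values from the table; the helper `relative_change` is introduced here for illustration.

```python
# Path delays in ns, copied from Table 3.
existing, proposed_a0, proposed_a1 = 11.090, 12.427, 9.082


def relative_change(new: float, old: float) -> float:
    """Percentage change of `new` relative to `old` (negative means faster)."""
    return 100.0 * (new - old) / old


print(f"A0 vs existing: {relative_change(proposed_a0, existing):+.1f}%")   # +12.1%
print(f"A1 vs existing: {relative_change(proposed_a1, existing):+.1f}%")   # -18.1%
```

On these figures the configuration labelled A1 is about 18% faster than the existing adder, while the configuration labelled A0 is about 12% slower.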

#### 7. Xilinx synthesis report

The proposed adder circuit has been simulated and the synthesis report obtained using Xilinx ISE 12.1i. The parameters obtained for the existing and proposed systems on a Spartan-3 FPGA are listed below.



| Logic utilization      | Used | Available | Utilization |
|------------------------|------|-----------|-------------|
| Number of Slices       | 12   | 960       | 1%          |
| Number of 4 input LUTs | 22   | 1920      | 1%          |
| Number of bonded IOBs  | 26   | 66        | 39%         |

Path delay: 11.090 ns

Table 4 Design summary of existing

| Logic utilization      | Used | Available | Utilization |
|------------------------|------|-----------|-------------|
| Number of Slices       | 12   | 960       | 0%          |
| Number of 4 input LUTs | 22   | 1920      | 0%          |
| Number of bonded IOBs  | 26   | 66        | 39%         |

Path delay: 12.427 ns

Table 5 Design summary of proposed with approx. 0

| Logic utilization      | Used | Available | Utilization |
|------------------------|------|-----------|-------------|
| Number of Slices       | 11   | 960       | 1%          |
| Number of 4 input LUTs | 21   | 1920      | 1%          |
| Number of bonded IOBs  | 26   | 66        | 39%         |

Path delay: 9.082 ns

Table 6 Design summary of proposed with approx. 1

Fig. 8. Performance results

Fig. 8 shows that there is a considerable reduction in time and area, based on the implementation results obtained on a Spartan-3 FPGA. The proposed adder significantly reduces area consumption when compared to the existing system.

#### 8. Conclusion

This work facilitates approximate matching of data to reduce the transmission of approximately similar data in the Network-on-Chip. A new reconfigurable approximate carry look-ahead adder is proposed. The adder is capable of switching between approximate and exact operating modes. To assess the efficiency of the proposed structure, its design parameters were compared with those of some previously proposed reconfigurable approximate adders. The parameters evaluated included delay, area and power.

#### References

- [1] Vinamra Benara, Suresh Purini, "Accurus: A Fast Convergence Technique for Accuracy Configurable Approximate Adder Circuits," in Proc. IEEE Computer Society Annual Symposium on VLSI, July 2016, pp. 577-582.
- [2] Mohammed Shafique, Waqas Ahmad, Rehan Hafiz, Jorg Henkel, "Low latency generic accuracy configurable adder," in Proc. Design Autom. Conf. (DAC), 2015, pp. 1-6.
- [3] Rong Ye, Ting Wang, Feng Yuan, Rakesh Kumar, Qiang Xu, "On Reconfigurable-Oriented approximate adder design and its application," in Proc. Int. Conf. Computer-Aided Design (ICCAD), 2013, pp. 48-54.
- [4] Wenbin Xu, Sachin S. Sapatnekar, Jiang Hu, "A Simple Yet Efficient Accuracy-Configurable Adder Design," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2018.
- [5] Liang Wang, Xiaohang Wang, Terrence Mak, "Adaptive Routing Algorithms for Lifetime Reliability Optimization in Network-on-Chip," IEEE Journal of Latex Class Files, 2012.
- [6] Débora Matos, Caroline Concatto, Márcio Kreutz, Fernanda Kastensmidt, Luigi Carro, Altamiro Susin, Y. Yorozu, M. Hirano, "Reconfigurable Routers for Low power and High Performance," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2011.
- [7] Sarzamin Khan, Sheraz Anjum, Usman Ali Gulzar, Frank Sill Torres, "Comparative analysis of network-on-chip simulation tools," IET Computers & Digital Techniques, 2018, Vol. 12, Issue 1, pp. 30-38.