research article

Highly-Efficient Number-Crunching-Performance SoC Macros for Smart Biosensors

Upasna Vishnoi*

Electrical Engineering and Computer Systems, RWTH Aachen University, Germany

*Corresponding author: Upasna Vishnoi, Electrical Engineering and Computer Systems, RWTH Aachen University, Germany. Tel: +4916509468949; Email: vishnoiupasna@gmail.com

Received Date: 24 January, 2018; Accepted Date: 14 March, 2018; Published Date: 20 March, 2018

Citation: Vishnoi U (2018) Highly-Efficient Number-Crunching-Performance SoC Macros for Smart Biosensors. Biosens Bioelectron Open Acc: BBOA-115. DOI: 10.29011/BBOA-115.100015

Keywords: Coordinate Rotation Digital Computer; Design-Space Exploration; Early Cost Estimation; Pareto Optimization; QR-Decomposition; Smart Biosensors

1. Research

Depending on the specifications, a huge design space featuring up to thousands of possible implementations is available from the QRD-architecture template published in [1]. These QR-decomposition (QRD) methods can be used efficiently in smart biosensors. To support design-space exploration, the parameterized cost model as well as routines for pruning (e.g. according to maximum latency), Pareto optimization, etc. are implemented in a MATLAB-based optimization environment [2]. The execution time for a whole design-space exploration (one set of specification parameters) is on the order of only a few minutes. Cost breakdown tables and figures can be generated automatically to give detailed insight into the cost contributions of the individual building blocks, in order to identify bottlenecks and to guide optimization. Constraints can be set on the maximum clock frequency, e.g. to avoid unreasonably high clock frequencies due to O-fold Coordinate Rotation Digital Computer (CORDIC) multiplexing, or to restrict the choice to selected clock frequencies available on an SoC. Arbitrarily high throughput rates can be achieved at almost unchanged ATE complexity and latency by multiplexing parallel QRD blocks [1]. Therefore, the area and energy optimization for less challenging throughput specifications is an especially valuable capability of this optimization environment [1]. Finally, the environment can be applied in early cost estimation to support system conception and design.
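The pruning and Pareto-optimization steps described above can be illustrated with a minimal sketch. It is written in Python rather than in the MATLAB environment of [2], and it is not the author's implementation: the `Impl` record, the function names, and all cost figures are hypothetical and serve only to show the principle of latency pruning followed by extraction of the Pareto-optimal front in the area-time (AT) plane.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Impl:
    """One candidate implementation; all cost figures are made-up examples."""
    name: str
    area: float     # silicon area A (arbitrary units)
    period: float   # time per QRD T (arbitrary units)
    latency: float  # latency (arbitrary units)


def prune_by_latency(impls, max_latency):
    """Keep only implementations that meet the maximum-latency constraint."""
    return [i for i in impls if i.latency <= max_latency]


def dominates(a, b):
    """a dominates b if it is no worse in A and T and strictly better in one."""
    return (a.area <= b.area and a.period <= b.period
            and (a.area < b.area or a.period < b.period))


def pareto_front(impls):
    """Return all implementations that no other implementation dominates."""
    return [c for c in impls if not any(dominates(o, c) for o in impls)]


# Hypothetical design space (assumption, not data from the article).
design_space = [
    Impl("a", area=1.0, period=8.0, latency=10.0),
    Impl("b", area=2.0, period=4.0, latency=12.0),
    Impl("c", area=4.0, period=2.0, latency=15.0),
    Impl("d", area=3.0, period=5.0, latency=11.0),  # dominated by "b"
    Impl("e", area=0.8, period=9.0, latency=40.0),  # violates the latency limit
]

feasible = prune_by_latency(design_space, max_latency=20.0)
front = pareto_front(feasible)
print([i.name for i in front])  # prints ['a', 'b', 'c']
```

The pairwise dominance check is quadratic in the number of candidates, which is unproblematic for design spaces of a few thousand implementations as discussed here.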

To give an idea of the capabilities of the optimization environment, exemplary results are given in Figure 1. Figure 1a shows an example of AT- and ATE-design spaces for a complex-valued integer full ([R] and [Q]) QRD of matrix size N=12, iteration count M=16 and word length W=16. The whole design space with 1,440 possible implementations is pruned in Figure 1b. The execution time for this example is only 1 minute 42 seconds.

Figure 2 shows the AT as well as the ATE (insets) design spaces for different QRD specifications (applying extra delay stages in the PEs). The technology used is 40-nm CMOS. For delay and output slopes, features of the SS technology corner and the slow application (temperature) corner were used, while for energy and power, features derived in the FF technology corner and the fast application corner were applied. Even though no fabricated silicon die would feature this combination of cost figures, it is still the adequate worst-case approach to ensure that the specified figures are met. The back-biasing experiments were conducted assuming a reverse back-biasing voltage of VBS=-0.5 V in order to reduce leakage power in the FF corner, and assuming a forward back-biasing voltage of VBS=-0.5 V in order to reduce critical-path delay in the SS corner. The supply voltage is 0.8 V; word length and CORDIC iteration count are specified as W=M=16. Energy figures are given for the case that no clock or power gating is applied.

Figures 2a, 2b and 2c show the variation with matrix size from N=14 to N=18 for a real-valued, integer-data-format QRD. The corresponding total numbers of possible implementations are 960, 1,200, and 1,440. The AT-design spaces feature hyperbola-like Pareto-optimal fronts, offering trade-offs between throughput and silicon area. The ATE-design spaces are pruned for implementations featuring an ATE not smaller than one tenth of the optimum ATE. Filled markers depict carry-ripple and unfilled markers depict carry-select adder-based implementations. Latency-optimal implementations are depicted by red-filled circles. Squares mark latencies up to two times, and diamonds up to five times, the minimum latency. For this word length, the ATE-optimal implementations in the lower left corner are solely carry-ripple adder based.
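The latency-based marker scheme of Figure 2 can be expressed as a small classification rule. The following Python sketch is an illustration only: the function name, the example latencies, and the "unmarked" label for latencies above five times the minimum are assumptions, since the text does not state how such implementations are drawn.

```python
def latency_marker(latency, min_latency):
    """Map a latency to the marker shape used for it (sketch of the legend)."""
    ratio = latency / min_latency
    if ratio <= 1.0:
        return "circle"    # latency-optimal implementation
    if ratio <= 2.0:
        return "square"    # up to two times the minimum latency
    if ratio <= 5.0:
        return "diamond"   # up to five times the minimum latency
    return "unmarked"      # assumption: not specified in the text


# Hypothetical latencies in arbitrary units (not data from the article).
latencies = [10.0, 15.0, 20.0, 35.0, 50.0, 60.0]
markers = [latency_marker(l, min(latencies)) for l in latencies]
print(markers)
```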

Figures 2d-2f show the results for a matrix size of N=16 for complex-valued integer, real-valued floating-point and complex-valued floating-point QRD. For the floating-point data format the mantissa word length is Wmantissa=16 and the exponent word length is Wexp=6. As can be seen from comparing the ATE-optimal implementations, the overhead for the floating-point data format extensions is rather small (on the order of 26 % for AT and 14 % for E). In contrast, the overhead for the extensions for complex-valued matrix processing is quite high: both area and period costs are increased by a factor of more than two, resulting in a 4.9 times larger AT complexity. Energy is also increased by a factor of more than 2.3.
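As a rough plausibility check of the quoted overheads, the combined ATE factor can be estimated under the assumption that ATE is the product of the AT complexity and the energy E. The Python sketch below uses only the approximate factors quoted in the text ("on the order of", "more than"), so the results are indicative and are not figures reported by the author; the function name is hypothetical.

```python
def ate_factor(at_factor, e_factor):
    """Combined ATE overhead, assuming ATE = (A * T) * E."""
    return at_factor * e_factor


# Floating-point extension: about +26 % in AT and +14 % in E.
float_ate = ate_factor(1.26, 1.14)

# Complex-valued extension: AT larger by 4.9, E larger by more than 2.3,
# so this value is a lower bound under the stated assumption.
complex_ate = ate_factor(4.9, 2.3)

print(f"floating-point ATE factor: about {float_ate:.2f}")
print(f"complex-valued ATE factor: at least about {complex_ate:.2f}")
```

Under this assumption the floating-point extension costs roughly a factor of 1.4 in ATE, whereas the complex-valued extension costs more than a factor of 11.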

2. Acknowledgement

This research work was done at the Chair of Electrical Engineering and Computer Systems, RWTH Aachen University, Germany under the able guidance of Professor Dr-Ing. Tobias G. Noll.


Figures 1(a, b): Examples of AT- and ATE-design spaces for complex-valued integer full ([R] and [Q]) QRD of matrix size N=12.



Figures 2(a-f): AT- and ATE-design spaces for a) - c) real-valued integer QRD with matrix size a) N=14, b) N=16, c) N=18, and d) complex-valued integer N=14, e) real-valued float N=14, f) complex-valued float N=14; all figures for ([R] and [Q]), 40-nm CMOS worst case, integer word length W=16 bit, floating-point word lengths Wmantissa=16 bit, Wexp=6 bit, CORDIC iteration count M=16.


1. Vishnoi U, Meixner M, Noll TG (2016) A Family of Modular QRD-Accelerator Architectures and Circuits Cross-Layer Optimized for High Area- and Energy-Efficiency. The Journal of Signal Processing Systems 83: 329-356.

2. Vishnoi U, Noll TG (2013) A Family of Modular Area- and Energy-Efficient QRD-Accelerator Architectures. IEEE International Symposium on System-on-Chip (SoC Tampere). Finland.

3. Vishnoi U, Noll TG (2013) Cross-Layer Optimization of QRD Accelerators. IEEE Proceedings of 39th European Solid-State Circuits Conference (ESSCIRC). Bucharest. Pg No: 263-266.

4. Cavallaro JR, Elster AC (1991) A CORDIC Processor Array for the SVD of a Complex Matrix. Elsevier Science Publishers. Pg No: 227-239.

5. Vishnoi U, Noll TG (2012) Area- and energy-efficient CORDIC Accelerators in Deep Sub-micron CMOS Technologies. Advances in Radio Science 10: 207-213.

6. Cavallaro JR, Elster AC (1991) A CORDIC Processor Array for the SVD of a Complex Matrix. SVD and Signal Processing II - Algorithms, Analysis and Applications, Elsevier Publishers. Pg No: 227-239.

7. Kung SY (1987) VLSI Array Processors. Upper Saddle River, New Jersey, USA.

8. Ercegovac MD, Lang T (2003) Digital Arithmetic (1st Edition).

9. Senning Ch, Staudacher A, Burg A (2010) Systolic-Array based regularized QR-Decomposition for IEEE 802.11n Compliant Soft-MMSE Detection. International Conference on Microelectronics ICM. Pg No: 391.

10. Salmela P, Burian A, Sorokin H, Takala J (2008) Complex-Valued QR Decomposition Implementation for MIMO Receivers. Proceedings in IEEE ICASSP. Pg No: 1433-1436.

11. Patel D, Shabany M, Gulak GP (2009) A Low-Complexity High-Speed QR Decomposition Implementation for MIMO Receivers. IEEE.

12. Vishnoi U, Meixner M, Noll TG (2012) An Approach for Quantitative Optimization of Highly Efficient Dedicated CORDIC Macros as SoC Building Blocks. SOCC International Conference. Pg No: 242-247.


© by the Authors & Gavin Publishers. This is an Open Access Journal Article Published Under Attribution-Share Alike CC BY-SA: Creative Commons Attribution-Share Alike 4.0 International License. With this license, readers can share, distribute, download, even commercially, as long as the original source is properly cited.

Biosensors and Bioelectronics Open Access
