I. Introduction: Scope of Work

To achieve high-speed multiplication, the trend is to resort to parallel multipliers. However, the performance of a multiplier is determined by the number of partial products to be added. In this paper we identify and optimize the critical path that determines the maximum delay in parallel multipliers. A brief look at standard architectures such as the Array Multiplier, the Wallace Tree Multiplier and the Baugh Wooley Multiplier is undertaken, specifically with respect to estimating and minimizing the power dissipation that arises as manufacturing technology shrinks all the way to the sub-micron level. Power consumption can be reduced through suitable methods such as gate-level optimization, increasing gate oxide thickness in non-critical paths, etc.

II. Standard High-Speed Multipliers

A. Array Multipliers

Fig. 1 illustrates the Array Multiplier. With its regular structure, this multiplier is based on the standard add and shift operations. Each partial product is generated from the multiplicand and one bit of the multiplier at a time, so the number of partial products depends on the number of multiplier bits. The subsequent addition is carried out by a high-speed carry-save algorithm, and the final product is obtained using any fast adder. To accommodate positive and negative values of the multiplier and multiplicand, sign bit extension has to be done at the appropriate stage. Sign bit extension increases the capacitive load, which raises power consumption and reduces the speed of operation.

B. Wallace Tree Multiplier

Fig. 2 illustrates a Wallace Tree Multiplier. In this architecture, all the bits of all partial products in each column are added together by a set of counters in parallel, without propagating the carries. Another set of counters reduces this new matrix, and the process repeats until a two-row matrix is generated.
A fast adder at the end produces the final result.
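
The partial-product generation of Section A and the carry-free reduction of Section B can be illustrated with a small behavioural sketch. It is not a gate-level model of either figure: the function names are hypothetical, operands are assumed to be unsigned, and each 3:2 counter stage is applied to whole rows at once (which is equivalent to placing one counter in every column). Python's built-in addition stands in for the final fast adder.

```python
def partial_products(a, b, n):
    """Unsigned partial-product rows: row i is the multiplicand a,
    gated by bit i of the multiplier b and shifted left by i."""
    return [(a << i) if (b >> i) & 1 else 0 for i in range(n)]


def carry_save_add(x, y, z):
    """One layer of 3:2 counters applied to every column at once.
    Returns (sum_row, carry_row) with x + y + z == sum_row + carry_row;
    no carry propagates along the row."""
    sum_row = x ^ y ^ z
    carry_row = ((x & y) | (x & z) | (y & z)) << 1
    return sum_row, carry_row


def wallace_reduce(rows):
    """Reduce the rows three at a time until at most two rows remain."""
    rows = list(rows)
    while len(rows) > 2:
        reduced = []
        full_groups = len(rows) // 3
        for g in range(full_groups):
            s, c = carry_save_add(*rows[3 * g:3 * g + 3])
            reduced.extend([s, c])
        # Rows that do not fill a group of three pass to the next stage.
        reduced.extend(rows[3 * full_groups:])
        rows = reduced
    return rows


def multiply(a, b, n=8):
    """Multiply two unsigned n-bit values via carry-save reduction."""
    rows = wallace_reduce(partial_products(a, b, n))
    # The ordinary addition below stands in for the final fast adder.
    return sum(rows)
```

For example, `multiply(13, 11, n=4)` returns 143. An 8-bit multiplication starts from eight rows and passes through stages of 6, 4 and 3 rows before reaching the two-row matrix, which illustrates why the reduction depth grows slowly with operand width.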
C. Baugh Wooley Multiplier

This multiplier was developed as a direct multiplier for 2’s complement numbers. In a direct 2’s complement multiplication, each partial product is a signed number that needs sign extension to the width of the final product to form the correct sum. The Baugh Wooley approach instead suggests an efficient method of adding extra entries to the partial product matrix so that negatively weighted bits are avoided.
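
As a hedged illustration of how extra entries can remove negatively weighted bits, the sketch below uses the commonly taught "modified" Baugh Wooley form, which may differ in detail from the matrix the authors have in mind. It is assumed here that both operands are n-bit 2’s complement values; partial-product bits that involve exactly one sign bit are complemented, and two constant 1s are added at bit positions n and 2n-1. The function name and parameters are hypothetical.

```python
def baugh_wooley(a, b, n=8):
    """Multiply two n-bit 2's complement integers using a modified
    Baugh Wooley partial-product matrix (all entries positively weighted)."""
    mask = (1 << n) - 1
    ua, ub = a & mask, b & mask  # raw n-bit patterns of the operands

    total = 0
    for i in range(n):
        for j in range(n):
            p = ((ua >> i) & 1) & ((ub >> j) & 1)
            # A term with exactly one sign bit would carry negative weight;
            # complement it instead (assumption: modified Baugh Wooley form).
            if (i == n - 1) != (j == n - 1):
                p ^= 1
            total += p << (i + j)

    # Extra constant entries that compensate for the complemented bits.
    total += (1 << n) + (1 << (2 * n - 1))

    # Keep 2n bits and reinterpret the result as a signed value.
    total &= (1 << (2 * n)) - 1
    if total >> (2 * n - 1):
        total -= 1 << (2 * n)
    return total
```

For instance, `baugh_wooley(-3, 5, n=4)` returns -15. No row in this matrix needs sign extension, which is the property the approach is intended to provide.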
D. By combining the Wallace Tree structure with transmission gates, Flavio Carbognani proposed (Fig. 4) a new architecture with an improved energy efficiency of 2.7 µW/MHz. The transmission gates act as a low-pass filter that suppresses glitches and reduces the overall capacitance, thus increasing the speed.