FPGA Based Ultra Fast Signed Multiplier
Rajkumar Chinthala and Indra Gupta
Abstract
This paper presents the hardware implementation of a parallel signed multiplier and an ultra fast signed multiplier on an FPGA using Verilog. The performance of the two multipliers is compared. The ultra fast signed multiplier is faster and uses fewer resources of the target device than the parallel signed multiplier. Both multipliers are modelled and simulated in the ISE Foundation design tool.
1. Introduction
Multipliers, adders, multiplexers and memories are basic building blocks that play an important role in digital signal processing and in almost all other applications. With advances in technology, multipliers are designed for high speed, low power consumption and compact VLSI implementation. If the multiplier, which is the primary element, is faster, then the whole process will be faster. Multiplication can be implemented in many ways, but industry generally prefers a multiplier circuit that is fast, compact and consumes less power [1]. Many implementations of multiplier circuits are described in the literature [2-4]. Array multipliers and multipliers based on the modified Booth's algorithm have been popular among the different types of multipliers. In array multipliers, partial products are generated by AND gate cells and are added to give the product, whereas in multipliers based on the modified Booth's algorithm, three-bit strings of the multiplier are scanned and appropriate operations are carried out on the multiplicand [5].

In this paper, a hardware implementation of an ultra fast multiplier is proposed and compared with the conventional parallel signed multiplier in terms of speed and device utilization. The comparison is based on the results obtained by synthesizing the multiplier architectures for a Xilinx FPGA device. The results show that the ultra fast signed multiplier is implemented with significant savings in hardware resources and responds faster than the parallel signed multiplier.
2. Parallel signed multiplier
The common multiplication method is add and shift. The same technique is used in the parallel signed array multiplier. As the number of partial products to be added grows, the speed of the multiplier falls and the number of slices used on the Field Programmable Gate Array (FPGA) rises. The number of partial products to be added depends on the operand size [6].
Figure 1 Parallel signed multiplier

Consider the 4-bit parallel signed multiplier shown in Figure 1. In this module the multiplicand and the multiplier are 4 bits each, so the product is at most 8 bits. A one-bit sign signal indicates whether the multiplication is signed or unsigned. Both positive and negative operands are sign extended whenever needed. When the multiplicand and/or the multiplier is negative, it is represented in signed 2's complement; each 1 added in front by the sign extension requires an addition of the multiplicand to the partial product. However, only enough bits need to be considered to guarantee the 8 bits required in the result. A one-bit ready signal is added to the module to indicate the busy status of the circuit. The model is described at the behavioral level of abstraction in Verilog. After design entry in the ISE (Integrated Software Environment) design tool, the module appears as shown in Figure 2.
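The sign-extension scheme described above can be illustrated with a small behavioral model. The following Python sketch is not the authors' Verilog; the function name shift_add_multiply, the argument names, and the convention that a true `signed` flag selects signed multiplication are assumptions made for illustration. It sign extends both operands to the 8-bit result width, then performs add and shift while keeping only the low 8 bits.

```python
def shift_add_multiply(multiplicand, multiplier, signed, width=4):
    """Hypothetical behavioral model of an add-and-shift multiplier.

    Returns the 2*width-bit product as an unsigned bit pattern
    (two's complement when `signed` is true).
    """
    out_bits = 2 * width
    out_mask = (1 << out_bits) - 1
    in_mask = (1 << width) - 1

    def extend(value):
        value &= in_mask
        # If the operand is negative (MSB set), fill the upper bits with 1s.
        if signed and (value >> (width - 1)) & 1:
            value |= out_mask & ~in_mask
        return value

    a = extend(multiplicand)
    b = extend(multiplier)

    product = 0
    for i in range(out_bits):
        # Each 1 bit of the (extended) multiplier adds a shifted multiplicand.
        if (b >> i) & 1:
            product = (product + (a << i)) & out_mask
    return product


# 7 * (-3) = -21, which is 0xEB as an 8-bit two's complement pattern.
print(hex(shift_add_multiply(0x7, 0xD, signed=True)))   # 0xeb
# The same bit patterns read as unsigned: 7 * 13 = 91 = 0x5B.
print(hex(shift_add_multiply(0x7, 0xD, signed=False)))  # 0x5b
```

Truncating every partial sum to 8 bits is what makes the extra 1s introduced by sign extension harmless: any carries beyond bit 7 are simply discarded.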
Figure 2 RTL schematic diagram of the parallel signed multiplier

2.1. Simulation results
Simulation verifies the functionality of the circuit that is to be implemented on the FPGA kit [4]. The above circuit is simulated with ISIM (the ISE simulator) provided by the ISE (Integrated Software Environment) Foundation tool, version 10.1. Applying the inputs multiplier = 0; multiplicand = 0; sign = 0; clk = 0; #100; multiplier = 4'hb; multiplicand = 4'he; sign = 1'b0; clk = 1'b1; in the stimulus file yields the results shown in Figure 3.
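As a cross-check on the stimulus above, the arithmetic product one would expect for the operand patterns 4'hb and 4'he can be worked out by hand or with a few lines of Python. The sketch below is only an arithmetic reference, not a reproduction of the simulator output; the helper names and the assumption that sign = 1 selects signed (2's complement) interpretation are illustrative, since the polarity of the sign signal is not stated in the text.

```python
def to_signed(value, width=4):
    """Interpret a width-bit pattern as a two's complement integer."""
    value &= (1 << width) - 1
    if (value >> (width - 1)) & 1:
        return value - (1 << width)
    return value


def expected_product(multiplicand, multiplier, sign, width=4):
    """Reference product as a 2*width-bit pattern (assumes sign=1 means signed)."""
    in_mask = (1 << width) - 1
    out_mask = (1 << (2 * width)) - 1
    if sign:
        result = to_signed(multiplicand, width) * to_signed(multiplier, width)
    else:
        result = (multiplicand & in_mask) * (multiplier & in_mask)
    return result & out_mask


# Stimulus from the text: multiplier = 4'hb, multiplicand = 4'he.
print(hex(expected_product(0xE, 0xB, sign=0)))  # 0x9a  (14 * 11 = 154)
print(hex(expected_product(0xE, 0xB, sign=1)))  # 0xa   ((-2) * (-5) = 10)
```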
Figure 3 Simulation results of the parallel signed multiplier

2.2 Synthesis
The target device for the synthesis of the above circuit is the Xilinx SPARTAN-3E FPGA board. There are two reports, namely the timing report and the device utilization report, which describe the synthesis of the circuit; they are given in Table 1 and Table 2 respectively. These reports are generated by the ISE 10.1 (Integrated Software Environment) design tool.

Table 1 Timing report of parallel signed multiplier
Minimum period                              6.271ns
Minimum input arrival time before clock     3.739ns
Maximum output required time after clock    6.059ns

Table 2 Device utilization report of parallel signed multiplier
Number of devices     Utilization        %Utilization
Slices                40 out of 4656     0
Slice Flip Flops      34 out of 9312     0
4 input LUTs          74 out of 9312     0
IOs                   19
Bonded IOBs           19 out of 232      8
GCLKs                 1 out of 24        4
Number of BRAMs       0 out of 20        0
3. Ultra fast signed multiplier
In this multiplier, a 256×8 ROM is used as a look-up table to store the result, i.e., the product of the two 4-bit operands, multiplier and multiplicand. A 4-bit multiplier is considered here, although it can be of any size. After fetching the operands, the circuit concatenates them to form the address of the ROM location containing the desired result, so the product is just one clock away.
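The look-up scheme can be sketched as a behavioral model in Python. This is an illustration under stated assumptions rather than the authors' Verilog: the function names build_rom and lookup are hypothetical, the multiplicand is assumed to occupy the upper half of the address and the multiplier the lower half (the text does not give the concatenation order), and the ROM contents are assumed to be 2's complement products.

```python
def to_signed(value, width):
    """Interpret a width-bit pattern as a two's complement integer."""
    value &= (1 << width) - 1
    if (value >> (width - 1)) & 1:
        return value - (1 << width)
    return value


def build_rom(width=4, signed=True):
    """Precompute every product; the ROM has 2**(2*width) words of 2*width bits."""
    in_mask = (1 << width) - 1
    out_mask = (1 << (2 * width)) - 1
    rom = []
    for address in range(1 << (2 * width)):
        a = address >> width      # assumed: multiplicand in the upper address bits
        b = address & in_mask     # assumed: multiplier in the lower address bits
        if signed:
            a, b = to_signed(a, width), to_signed(b, width)
        rom.append((a * b) & out_mask)
    return rom


def lookup(rom, multiplicand, multiplier, width=4):
    """Concatenate the operands into an address and read the stored product."""
    in_mask = (1 << width) - 1
    address = ((multiplicand & in_mask) << width) | (multiplier & in_mask)
    return rom[address]


rom = build_rom(width=4, signed=True)
print(len(rom))                     # 256 words, matching the 256x8 ROM
print(hex(lookup(rom, 0xE, 0xB)))   # 0xa, i.e. (-2) * (-5) = 10
print(len(build_rom(width=5)))      # 1024 words for 5-bit operands
```

Because the address is the concatenation of both operands, widening each operand by one bit adds two address bits, so the number of stored words quadruples.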
Figure 4 Ultra fast signed multiplier

The ultra fast signed multiplier provides a very fast result and occupies less area of the target device than the parallel signed multiplier. It uses a LUT (look-up table) to make the operation fast, since retrieving data from memory is the only operation it has to perform. The module of the 4-bit ultra fast signed multiplier is shown in Figure 4; the multiplicand and the multiplier are 4 bits each, and the resulting product is 8 bits. The only disadvantage of this LUT based multiplier is the memory requirement, which increases with the size of the multiplier and multiplicand. After design entry, the module obtained is shown in Figure 5.

Figure 5 RTL view of ultra fast signed multiplier

3.1. Simulation results
Applying the same inputs as for the parallel signed multiplier in the stimulus file of the ultra fast multiplier gives the simulation results shown in Figure 6.

Figure 6 Simulation results of ultra fast signed multiplier.

3.2 Synthesis
Synthesis of the ultra fast multiplier is carried out for the same target FPGA device. The synthesis reports are given in Table 3 and Table 4.
Table 3 Device utilization report of ufm
Number of devices     Utilization
Slices                0 out of 4656
Slice Flip Flops      0 out of 9312
4 input LUTs          0 out of 9312
IOs
Bonded IOBs           17 out of 2
GCLKs                 1 out of 24
Number of BRAMs       1 out of 20

Table 4 Timing report of ufm
Minimum period                              0
Minimum input arrival time before clock     0
Maximum output required time after clock    0

4. Conclusion
The two multipliers, namely the parallel signed multiplier and the ultra fast signed multiplier, are designed and implemented on the SPARTAN 3-E FPGA kit. From the timing reports of the two multipliers presented in Table 5, it is clear that the ultra fast signed multiplier is 43 times faster than the parallel multiplier.

Table 5 Comparison of parallel multiplier and ufm.
                                        Parallel signed multiplier    Ultra fast multiplier
Maximum time required to get output     6.203ns                       0.144ns

The ultra fast signed multiplier takes only one clock pulse, i.e., 0.144ns, to give the result, whereas the parallel multiplier takes 6.059ns after the clock pulse. Many digital signal processing applications involve tasks that need repeated multiplication; if the conventional multipliers are replaced with ultra fast multipliers, the speed of the whole circuit can be improved. The device utilization reports of the two multipliers presented in Table 2 and Table 3 show that the ultra fast signed multiplier uses fewer resources on the FPGA than the parallel multiplier. The only disadvantage of this kind of multiplier implementation is the memory requirement, which increases with the size of the multiplier and multiplicand. When the operand size increases by one bit, the memory requirement increases by 4 times.

5. References
[1] S. Shah, A. J. Al-Khabb, D. Al-Khabb, “Comparison of 32-bit Multipliers for Various Performance Measures”, IEEE International Conference on Microelectronics, Tehran, Oct. 31-Nov. 2, 2000, pp. 75-80.
[2] Michael A. Soderstrand, “CSD Multipliers for FPGA DSP Applications”, IEEE International Conference on Circuits and Systems, 2003, pp. V-469-V-472.
[3] Xiaohui Yang, Zibin Dai, Xuerong Yu, Jinhai Su, “A Design of General Multiplier in GF(2^8) and FPGA Implementation”, IEEE International Symposium on Pervasive Computing and Applications, 2006, pp. 503-507.
[4] S. Shah, A. J. Al-Khabb, “Comparison of 32-bit Multipliers for Various Performance Measures”, IEEE International Conference on Microelectronics, 2000, pp. 75-80.
[5] Sunder S. Kidambi, Fayez El-Guibaly, and Andreas Antoniou, “Area-Efficient Multipliers for Digital Signal Processing Applications”, IEEE International Conference on Circuits and Systems, 1996.
[6] Rizwan Mudassir, H. El-Razouk and Z. Abid, “New Designs of Signed Multipliers”, IEEE International Conference on Circuits and Systems, 2005, pp. 259-262.

Rajkumar Chinthala is pursuing an M.Tech in System Engineering and Operations Research in the Electrical Engineering Department of the Indian Institute of Technology Roorkee, India. Previously he was with Bharath Sanchar Nigam Limited as a Telecom Technical Assistant for 5 years. His areas of interest include VLSI design, Process Control Applications and Online Control Applications.

Indra Gupta is working as an Associate Professor in the Electrical Engineering Department of the Indian Institute of Technology Roorkee, India. She completed her Ph.D. at I.I.T. Roorkee in 1996. She has published many papers in reputed journals. Her areas of interest include Power Systems, Simulation, Process Control Applications, Microprocessor Applications, ANN, Online Control Applications and VLSI design.