Basic FPGA Architecture

© 2005 Xilinx, Inc. All Rights Reserved

Objectives

After completing this module, you will be able to:
• Identify the basic architectural resources of the Virtex™-II FPGA
• List the differences between the Virtex-II, Virtex-II Pro, Spartan™-3, and Spartan-3E devices
• List the new and enhanced features of the Virtex-4 device family

For Academic Use Only

Outline

• Overview
• Slice Resources
• I/O Resources
• Memory and Clocking
• Spartan-3, Spartan-3E, and Virtex-II Pro Features
• Virtex-4 Features
• Summary
• Appendix

Overview

• All Xilinx FPGAs contain the same basic resources
  – Slices (grouped into CLBs)
    • Contain combinatorial logic and register resources
  – IOBs
    • Interface between the FPGA and the outside world
  – Programmable interconnect
  – Other resources
    • Memory
    • Multipliers
    • Global clock buffers
    • Boundary scan logic

Virtex-II Architecture

• The Virtex™-II architecture operates at a core voltage of 1.5V

[Figure: Virtex-II device layout, with labels for I/O Blocks (IOBs), Configurable Logic Blocks (CLBs), Block SelectRAM™ resources, dedicated multipliers, programmable interconnect, and clock management (DCMs, BUFGMUXes)]


Slices and CLBs

• Each Virtex-II CLB contains four slices
  – Local routing provides feedback between slices in the same CLB, and it provides routing to neighboring CLBs
  – A switch matrix provides access to general routing resources

[Figure: CLB containing slices S0 through S3, a switch matrix, local routing, two BUFTs, the SHIFT path, and two CIN/COUT carry paths]

Simplified Slice Structure

• Each slice has four outputs
  – Two registered outputs, two non-registered outputs
  – Two BUFTs associated with each CLB, accessible by all 16 CLB outputs
• Carry logic runs vertically, up only
  – Two independent carry chains per CLB

[Figure: Slice 0 with two LUTs, two carry blocks, and two registers, each with D, CE, PRE, CLR, and Q pins]

Detailed Slice Structure

• The next few slides discuss the slice features
  – LUTs
  – MUXF5, MUXF6, MUXF7, MUXF8 (only the F5 and F6 MUX are shown in this diagram)
  – Carry Logic
  – MULT_ANDs
  – Sequential Elements

Look-Up Tables

• Combinatorial logic is stored in Look-Up Tables (LUTs)
  – Also called Function Generators (FGs)
  – Capacity is limited by the number of inputs, not by the complexity
• Delay through the LUT is constant

[Figure: a block of combinatorial logic with inputs A, B, C, D and output Z, stored as the truth table below]

A B C D | Z
0 0 0 0 | 0
0 0 0 1 | 0
0 0 1 0 | 0
0 0 1 1 | 1
0 1 0 0 | 1
0 1 0 1 | 1
. . .
1 1 0 0 | 0
1 1 0 1 | 0
1 1 1 0 | 0
1 1 1 1 | 1
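
The LUT behavior described above can be sketched in software as a 16-entry table indexed by the four inputs. This is an illustrative Python model only, not vendor code; the helper name `make_lut4` and the two example functions are assumptions introduced here.

```python
def make_lut4(func):
    """Precompute the 16-entry truth table of a 4-input Boolean function."""
    table = [func((i >> 3) & 1, (i >> 2) & 1, (i >> 1) & 1, i & 1) & 1
             for i in range(16)]

    def lut(a, b, c, d):
        # Evaluation is one table read, however complex the function is.
        return table[(a << 3) | (b << 2) | (c << 1) | d]

    return lut


# Hypothetical example functions: each one fits in a single LUT.
parity = make_lut4(lambda a, b, c, d: a ^ b ^ c ^ d)
and4 = make_lut4(lambda a, b, c, d: a & b & c & d)
```

Both functions cost the same table and the same lookup, which is the sense in which capacity depends on the number of inputs rather than on complexity.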

Connecting Look-Up Tables

• MUXF5 combines LUTs in each slice
• MUXF6 combines slices S0 and S1
• MUXF6 combines slices S2 and S3
• MUXF7 combines the two MUXF6 outputs
• MUXF8 combines the two MUXF7 outputs (from the CLB above or below)

[Figure: CLB with slices S0 through S3, each containing an F5 MUX, combined through the F6, F7, and F8 MUXes]
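
One way to see why a 2:1 multiplexer widens a function by one input: split the 5-input function on its fifth input, store each half in a 4-input LUT, and let the fifth input select between them. The Python sketch below illustrates that idea for the MUXF5 case; the names `lut4_table`, `make_lut5`, and `majority5` are hypothetical and chosen for this example.

```python
def lut4_table(func):
    """16-entry truth table of a 4-input Boolean function."""
    return [func((i >> 3) & 1, (i >> 2) & 1, (i >> 1) & 1, i & 1) & 1
            for i in range(16)]


def make_lut5(func5):
    """Build a 5-input function from two 4-input LUTs and one 2:1 mux."""
    lut_e0 = lut4_table(lambda a, b, c, d: func5(a, b, c, d, 0))
    lut_e1 = lut4_table(lambda a, b, c, d: func5(a, b, c, d, 1))

    def evaluate(a, b, c, d, e):
        idx = (a << 3) | (b << 2) | (c << 1) | d
        # The mux select is the fifth input (the role MUXF5 plays).
        return lut_e1[idx] if e else lut_e0[idx]

    return evaluate


# Hypothetical example: 5-input majority vote.
majority5 = make_lut5(lambda a, b, c, d, e: int(a + b + c + d + e >= 3))
```

Repeating the same split with MUXF6, MUXF7, and MUXF8 adds one more input at each level.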

Fast Carry Logic

• Simple, fast, and complete arithmetic logic
  – Dedicated XOR gate for single-level sum completion
  – Uses dedicated routing resources
  – All synthesis tools can infer carry logic

[Figure: CLB with slices S0 through S3 and two carry chains, labeled First Carry Chain and Second Carry Chain, each with its own CIN and COUT. Callouts: "COUT to S0 of the next CLB" and "To CIN of S2 of the next CLB"]
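
As a rough illustration of how a carry chain adds, each bit can be modeled as a LUT that computes the propagate signal, a carry multiplexer, and the dedicated XOR that completes the sum. The Python sketch below is a behavioral model under that assumption, not a netlist; `carry_chain_add` is a name made up for this example.

```python
def carry_chain_add(a, b, width, cin=0):
    """Add two unsigned integers bit by bit along a carry chain."""
    total = 0
    carry = cin
    for i in range(width):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        p = ai ^ bi                   # LUT output: propagate
        total |= (p ^ carry) << i     # dedicated XOR: sum bit
        carry = carry if p else ai    # carry mux: pass carry-in, or generate
    return total, carry
```

For example, `carry_chain_add(200, 100, 8)` returns the 8-bit sum 44 with a carry-out of 1, which is 300 split across nine bits.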

MULT_AND Gate

• Highly efficient multiply and add implementation
  – Earlier FPGA architectures require two LUTs per bit to perform the multiplication and addition
  – The MULT_AND gate enables an area reduction by performing the multiply and the add in one LUT per bit

[Figure: a LUT with inputs A and B driving the S input of CY_MUX and one input of CY_XOR; MULT_AND computes A x B for the DI input of CY_MUX; CI and CO are the carry in and carry out]
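
The saving can be sketched for one row of a shift-and-add multiplier, where each bit computes `acc + (a AND b_bit)`. In this model the LUT produces the XOR of the partial product and the accumulator bit, while a separate AND gate supplies the partial product to the carry multiplexer. It is a simplified behavioral sketch; the function name `mult_add_row` and the row-by-row framing are assumptions, not taken from the slides.

```python
def mult_add_row(acc, a, b_bit, width):
    """Return acc + (a if b_bit else 0), computed bit by bit."""
    total, carry = 0, 0
    for i in range(width):
        ai = (a >> i) & 1
        acc_i = (acc >> i) & 1
        pp = ai & b_bit               # MULT_AND: partial product, feeds the carry mux
        s = pp ^ acc_i                # LUT: (a_i AND b) XOR acc_i
        total |= (s ^ carry) << i     # CY_XOR: sum bit
        carry = carry if s else pp    # CY_MUX: propagate or generate
    return total | (carry << width)   # carry-out becomes the top bit
```

Without the dedicated AND gate, the partial product would need its own LUT before the adder LUT, which is the two-LUTs-per-bit cost mentioned above.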

Flexible Sequential Elements

• Either flip-flops or latches
• Two in each slice; eight in each CLB
• Inputs come from LUTs or from an independent CLB input
• Separate set and reset controls
  – Can be synchronous or asynchronous
• All controls are shared within a slice
  – Control signals can be inverted locally within a slice

[Figure: register primitives FDRSE_1 (D, S, CE, R, Q), FDCPE (D, PRE, CE, CLR, Q), and LDCPE (D, PRE, CE, G, CLR, Q)]

Shift Register LUT (SRL16CE)

• Dynamically addressable serial shift registers
  – Maximum delay of 16 clock cycles per LUT (128 per CLB)
  – Cascadable to other LUTs or CLBs for longer shift registers
    • Dedicated connection from Q15 to D input of the next SRL16CE
  – Shift register length can be changed asynchronously by toggling address A

[Figure: a LUT drawn as a chain of D/CE registers clocked by CLK, with address inputs A[3:0] selecting the Q output and Q15 as the cascade output]
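
A behavioral sketch of the addressable shift register may help here. The Python class below models 16 stages with a clock enable, an address-selected output, and the Q15 cascade tap. It starts from all zeros and ignores configuration-time initialization; the class and method names are illustrative assumptions.

```python
class SRL16CE:
    """Behavioral model of a 16-stage addressable shift register."""

    def __init__(self):
        self.bits = [0] * 16          # bits[0] holds the newest sample

    def clock(self, d, ce=1):
        """Rising clock edge: shift in d when CE is high."""
        if ce:
            self.bits = [d & 1] + self.bits[:15]

    def q(self, addr):
        """Addressed output: the input delayed by addr + 1 clock cycles."""
        return self.bits[addr & 0xF]

    @property
    def q15(self):
        """Cascade output for chaining into the next shift register."""
        return self.bits[15]
```

Reading `q(addr)` involves no clock, which is why changing the address changes the effective length asynchronously.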

Shift Register LUT Example

• The SRL can be used to create a No Operation (NOP)
  – This example uses 64 LUTs (8 CLBs) to replace 576 flip-flops (72 CLBs) and associated routing and delays

[Figure: two 64-bit paths, statically balanced at 12 cycles each. One path: Operation A (4 cycles), then Operation B (8 cycles). Other path: Operation C (3 cycles), then Operation D NOP (9 cycles)]
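
The resource counts in this example can be checked with a few lines of arithmetic. The sketch below assumes the figure shows a 9-cycle NOP on a 64-bit path, which is how the extracted labels read, and uses the per-CLB counts given elsewhere in this module (eight LUTs and eight registers per CLB). The variable names are made up for illustration.

```python
BUS_WIDTH = 64        # bits on the balanced path (assumed from the figure)
NOP_DELAY = 9         # cycles of delay to insert (assumed from the figure)
SRL_DEPTH = 16        # maximum delay per LUT
LUTS_PER_CLB = 8      # two per slice, four slices
FFS_PER_CLB = 8       # two per slice, four slices

# Each bit needs ceil(NOP_DELAY / SRL_DEPTH) shift-register LUTs.
luts_per_bit = -(-NOP_DELAY // SRL_DEPTH)
srl_luts = BUS_WIDTH * luts_per_bit
srl_clbs = srl_luts // LUTS_PER_CLB

# A flip-flop pipeline needs one register per bit per cycle of delay.
flip_flops = BUS_WIDTH * NOP_DELAY
ff_clbs = flip_flops // FFS_PER_CLB

print(srl_luts, srl_clbs, flip_flops, ff_clbs)
```

The result matches the slide: 64 LUTs in 8 CLBs versus 576 flip-flops in 72 CLBs.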


IOB Element

• Input path
  – Two DDR registers
• Output path
  – Two DDR registers
  – Two 3-state enable DDR registers
• Separate clocks and clock enables for I and O
• Set and reset signals are shared

[Figure: IOB with input registers clocked by ICK1 and ICK2, and output and 3-state register pairs clocked by OCK1 and OCK2, each pair feeding a DDR MUX, connected to the PAD]

SelectIO Standard

• Allows direct connections to external signals of varied voltages and thresholds
  – Optimizes the speed/noise tradeoff
  – Saves having to place interface components onto your board
• Differential signaling standards
  – LVDS, BLVDS, ULVDS
  – LDT
  – LVPECL
• Single-ended I/O standards
  – LVTTL, LVCMOS (3.3V, 2.5V, 1.8V, and 1.5V)
  – PCI-X at 133 MHz, PCI (3.3V at 33 MHz and 66 MHz)
  – GTL, GTLP
  – and more!

Digitally Controlled Impedance (DCI)

• DCI provides
  – Output drivers that match the impedance of the traces
  – On-chip termination for receivers and transmitters
• DCI advantages
  – Improves signal integrity by eliminating stub reflections
  – Reduces board routing complexity and component count by eliminating external resistors
  – Eliminates the effects of temperature, voltage, and process variations by using an internal feedback circuit


Other Virtex-II Features

• Distributed RAM and block RAM
  – Distributed RAM uses the CLB resources (1 LUT = 16 RAM bits)
  – Block RAM is a dedicated resource on the device (18-kb blocks)
• Dedicated 18 x 18 multipliers next to block RAMs
• Clock management resources
  – Sixteen dedicated global clock multiplexers
  – Digital Clock Managers (DCMs)

Distributed SelectRAM Resources

• Uses a LUT in a slice as memory
• Synchronous write
• Asynchronous read
  – Accompanying flip-flops can be used to create synchronous read
• RAM and ROM are initialized during configuration
  – Data can be written to RAM after configuration
• Emulated dual-port RAM
  – One read/write port
  – One read-only port

[Figure: slice LUTs used as RAM16X1S (D, WE, WCLK, A0 to A3, O), RAM32X1S (D, WE, WCLK, A0 to A4, O), and RAM16X1D (D, WE, WCLK, A0 to A3, SPO, DPRA0 to DPRA3, DPO)]
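
The combination of synchronous write and asynchronous read on a dual-port distributed RAM can be sketched as follows. This Python class is a behavioral illustration built around the RAM16X1D pin names; the method names and the all-zero starting contents are assumptions made for the example.

```python
class RAM16X1D:
    """Behavioral model of a 16 x 1 dual-port distributed RAM."""

    def __init__(self):
        self.mem = [0] * 16

    def write_clock(self, d, we, a):
        """Rising WCLK edge: write d at address a when WE is high."""
        if we:
            self.mem[a & 0xF] = d & 1

    def spo(self, a):
        """Asynchronous read on the read/write port (address A)."""
        return self.mem[a & 0xF]

    def dpo(self, dpra):
        """Asynchronous read on the read-only port (address DPRA)."""
        return self.mem[dpra & 0xF]
```

Only `write_clock` depends on a clock edge; both read methods return the stored value immediately, and registering their results would give the synchronous read mentioned above.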

Block SelectRAM Resources

• Up to 3.5 Mb of RAM in 18-kb blocks
  – Synchronous read and write
• True dual-port memory
  – Each port has synchronous read and write capability
  – Different clocks for each port
• Supports initial values
• Synchronous reset on output latches
• Supports parity bits

[Figure: 18-kb block SelectRAM memory. Port A: DIA, DIPA, ADDRA, WEA, ENA, SSRA, CLKA, DOA, DOPA. Port B: DIB, DIPB, ADDRB, WEB, ENB, SSRB, CLKB, DOB, DOPB]

Dedicated Multiplier Blocks

• 18-bit twos complement signed operation
• Optimized to implement Multiply and Accumulate functions
• Multipliers are physically located next to block SelectRAM™ memory

[Figure: 18 x 18 multiplier with inputs Data_A (18 bits) and Data_B (18 bits) and Output (36 bits). Operand sizes shown: 4 x 4 signed, 8 x 8 signed, 12 x 12 signed, 18 x 18 signed]
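
To make the signed arithmetic concrete, the sketch below multiplies two 18-bit twos complement values and returns the 36-bit result as a raw bit pattern. It is a plain Python illustration of the number format only; the function names `to_signed` and `mult18x18` are hypothetical.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a twos complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mult18x18(a_raw, b_raw):
    """Multiply two 18-bit signed operands; return the 36-bit raw product."""
    product = to_signed(a_raw, 18) * to_signed(b_raw, 18)
    return product & ((1 << 36) - 1)
```

For example, `mult18x18(3, 0x3FFFF)` multiplies 3 by -1, and `to_signed` on the result with `bits=36` recovers -3. The largest magnitude product, -131072 times -131072, is 2**34 and still fits in the 36-bit signed output.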

Global Clock Routing Resources

• Sixteen dedicated global clock multiplexers
  – Eight on the top-center of the die, eight on the bottom-center
  – Driven by a clock input pad, a DCM, or local routing
• Global clock multiplexers provide the following:
  – Traditional clock buffer (BUFG) function
  – Global clock enable capability (BUFGCE)
  – Glitch-free switching between clock signals (BUFGMUX)
• Up to eight clock nets can be used in each clock region of the device
  – Each device contains four or more clock regions

Digital Clock Manager (DCM)

• Up to twelve DCMs per device
  – Located on the top and bottom edges of the die
  – Driven by clock input pads
• DCMs provide the following:
  – Delay-Locked Loop (DLL)
  – Digital Frequency Synthesizer (DFS)
  – Digital Phase Shifter (DPS)
• Up to four outputs of each DCM can drive onto global clock buffers
  – All DCM outputs can drive general routing


Spartan-3 versus Virtex-II

• Lower cost
• Smaller process = lower core voltage
  – 0.09 micron versus 0.15 micron
  – Vccint = 1.2V versus 1.5V
• Different I/O standard support
  – New standards: 1.2V LVCMOS, 1.8V HSTL, and SSTL
  – Default is LVCMOS, versus LVTTL
• More I/O pins per package
• Only one-half of the slices support RAM or SRL16s (SLICEM)
• Fewer block RAMs and multiplier blocks
  – Same size and functionality
• Eight global clock multiplexers
• Two or four DCM blocks
• No internal 3-state buffers

SLICEM and SLICEL

• Each Spartan™-3 CLB contains four slices
  – Similar to the Virtex™-II
• Slices are grouped in pairs
  – Left-hand SLICEM (Memory)
    • LUTs can be configured as memory or SRL16
  – Right-hand SLICEL (Logic)
    • LUT can be used as logic only

[Figure: CLB with the left-hand SLICEM pair (slices X0Y0 and X0Y1) and the right-hand SLICEL pair (slices X1Y0 and X1Y1), a switch matrix, fast connects, SHIFTIN and SHIFTOUT, and CIN/COUT carry paths]

Spartan-3E Features

• More gates per I/O than Spartan-3
• Removed some I/O standards
  – Higher-drive LVCMOS
  – GTL, GTLP
  – SSTL2_II
  – HSTL_II_18, HSTL_I, HSTL_III
  – LVDS_EXT, ULVDS
• DDR Cascade
  – Internal data is presented on a single clock edge
• 16 BUFGMUXes on left and right sides
  – Drive half the chip only
  – In addition to eight global clocks
• Pipelined multipliers
• Additional configuration modes
  – SPI, BPI
  – Multi-Boot mode

Virtex-II Pro Features

• 0.13 micron process
• Up to 24 RocketIO™ Multi-Gigabit Transceiver (MGT) blocks
  – Serializer and deserializer (SERDES)
  – Fibre Channel, Gigabit Ethernet, XAUI, Infiniband compliant transceivers, and others
  – 8-, 16-, and 32-bit selectable FPGA interface
  – 8B/10B encoder and decoder
• PowerPC™ RISC processor blocks
  – Thirty-two 32-bit General Purpose Registers (GPRs)
  – Low power consumption: 0.9mW/MHz
  – IBM CoreConnect bus architecture support


Virtex-4 Features

• New features
  – Dedicated DSP blocks
  – Phase-matched clock dividers (PMCD)
  – SERDES built into the Virtex™-4 SelectIO™ standard
  – Dynamic reconfiguration port (DRP)
• Enhanced features
  – Block RAM can be configured as a FIFO
  – Advanced clocking networks, including regional clock buffers and source-synchronous support
  – 11.1 Gbps RocketIO™ Multi-Gigabit Transceiver (MGT) blocks
  – Enhanced PowerPC™ processor blocks


Review Questions

• List the primary slice features
• List the three ways a LUT can be configured

Answers

• List the primary slice features
  – Look-up tables and function generators (two per slice, eight per CLB)
  – Registers (two per slice, eight per CLB)
  – Dedicated multiplexers (MUXF5, MUXF6, MUXF7, MUXF8)
  – Carry logic
  – MULT_AND gate
• List the three ways a LUT can be configured
  – Combinatorial logic
  – Shift register (SRL16CE)
  – Distributed memory

Summary

• Slices contain LUTs, registers, and carry logic
  – LUTs are connected with dedicated multiplexers and carry logic
  – LUTs can be configured as shift registers or memory
• IOBs contain DDR registers
• SelectIO™ standards and DCI enable direct connection to multiple I/O standards while reducing component count
• Virtex™-II memory resources include the following:
  – Distributed SelectRAM™ resources and distributed SelectROM (uses CLB LUTs)
  – 18-kb block SelectRAM resources

Summary

• The Virtex™-II devices contain dedicated 18x18 multipliers next to each block SelectRAM™ resource
• Digital clock managers provide the following:
  – Delay-Locked Loop (DLL)
  – Digital Frequency Synthesizer (DFS)
  – Digital Phase Shifter (DPS)

Where Can I Learn More?

• User Guides
  – www.xilinx.com → Documentation → User Guides
• Application Notes
  – www.xilinx.com → Documentation → Application Notes
• Education resources
  – Designing with the Virtex-4 Family course
  – Spartan-3E Architecture free Recorded e-Learning


Double Data Rate Registers

• DDR registers can be clocked
  – By Clock and NOT(Clock) if the duty cycle is 50/50
  – By the CLK0 and CLK180 outputs of a DCM
• If D1 = “1” and D2 = “0”, the output is a copy of Clock
  – Use this technique to generate a clock output that is synchronized to DDR output data

[Figure: FDDR output path. D1 and D2 feed two registers clocked by OCK1 and OCK2; a DDR MUX selects between them and drives the OBUF and the PAD]
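
The clock-forwarding trick can be illustrated with a very simplified half-cycle model: during the High half of each clock cycle the pad shows the D1 register, and during the Low half it shows the D2 register. The Python sketch below ignores clock-to-output delay and setup timing, and `ddr_output` is a name invented for this example.

```python
def ddr_output(d1, d2, cycles):
    """Return the pad value for each half cycle: clock High, then clock Low."""
    pad = []
    for n in range(cycles):
        pad.append(d1[n])   # Clock High: mux selects the OCK1 register
        pad.append(d2[n])   # Clock Low: mux selects the OCK2 register
    return pad
```

With D1 tied to 1 and D2 tied to 0, the returned sequence alternates 1, 0, 1, 0, which is the clock waveform itself. Because it leaves the device through the same register-and-mux path as the data, the forwarded clock sees the same output delay as the DDR data.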

Dual-Port Block RAM Configurations

• Configurations available on each port

Configuration | Depth | Data Bits | Parity Bits
16k x 1       | 16 kb | 1         | 0
8k x 2        | 8 kb  | 2         | 0
4k x 4        | 4 kb  | 4         | 0
2k x 9        | 2 kb  | 8         | 1
1k x 18       | 1 kb  | 16        | 2
512 x 36      | 512   | 32        | 4

• Independent configurations on ports A and B
  – Supports data-width conversion, including parity bits

[Figure: data-width conversion, with an 8-bit IN on Port A (8 bits) and a 32-bit OUT on Port B (32 bits)]
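
A quick check of the table: multiplying depth by total width shows that the configurations with parity use the full 18 kb, while the three narrowest ones use 16 kb. The Python sketch below reads the Depth column as a number of addresses, which is an interpretation; the names `CONFIGS`, `total_bits`, and `address_bits` are illustrative.

```python
# (depth in addresses, data bits, parity bits) for each row of the table
CONFIGS = [
    (16384, 1, 0),
    (8192, 2, 0),
    (4096, 4, 0),
    (2048, 8, 1),
    (1024, 16, 2),
    (512, 32, 4),
]


def total_bits(depth, data_bits, parity_bits):
    """Storage used by one port configuration."""
    return depth * (data_bits + parity_bits)


def address_bits(depth):
    """Address width for a power-of-two depth."""
    return depth.bit_length() - 1


totals = [total_bits(*cfg) for cfg in CONFIGS]
addr_widths = [address_bits(cfg[0]) for cfg in CONFIGS]

# Width conversion in the figure: 8-bit writes on port A, 32-bit reads on port B.
writes_per_read = 32 // 8
```

In the width-conversion figure, four 8-bit writes on port A fill one 32-bit word read on port B.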

Clock Buffer Configurations

• Clock buffer (BUFG)
  – Low-skew clock distribution
• Clock enable buffer (BUFGCE)
  – Holds the clock output Low when Clock Enable (CE) is inactive
  – CE can be active-High or active-Low
  – Changes in CE are only recognized when the clock input is Low to avoid glitches and short clock pulses

[Figure: BUFG symbol (I, O) and BUFGCE symbol (I, CE, O)]

Clock Buffer Configurations

• Clock multiplexer (BUFGMUX)
  – Switches from one clock to another, glitch-free
  – After a change on S, the BUFGMUX waits for the currently selected clock input to go Low
  – The output is held Low until the newly selected clock goes Low, then switches

[Figure: BUFGMUX symbol (I0, I1, S, O) and a timing diagram of S, I0, I1, and O marking the "Wait for low" and "Switch" points]
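
The BUFGMUX switching sequence described above can be sketched as a small state machine over sampled waveforms. This is a simplified, sample-based Python illustration of the three steps (wait for the old clock to go Low, hold the output Low, switch when the new clock is Low), not a timing-accurate model; the function name and the state names are assumptions.

```python
def bufgmux(i0, i1, s):
    """Simulate glitch-free clock switching on equal-rate 0/1 sample lists."""
    current = s[0]            # index of the clock currently driving the output
    target = current
    state = "run"             # run -> wait_old_low -> wait_new_low -> run
    out = []
    for a, b, sel in zip(i0, i1, s):
        clocks = (a, b)
        if state == "run" and sel != current:
            target, state = sel, "wait_old_low"
        if state == "wait_old_low" and clocks[current] == 0:
            state = "wait_new_low"      # old clock is Low: output held Low
        if state == "wait_new_low" and clocks[target] == 0:
            current, state = target, "run"
        out.append(0 if state == "wait_new_low" else clocks[current])
    return out


i0 = [1, 1, 0, 0] * 4                   # faster clock, 16 samples
i1 = ([1, 1, 1, 0, 0, 0] * 3)[:16]      # slower clock, 16 samples
s = [0] * 5 + [1] * 11                  # select changes while I0 is High
out = bufgmux(i0, i1, s)
```

In this run the select changes while I0 is High, so the current High pulse completes at full width, the output then stays Low, and the first pulse from I1 also appears at full width. No shortened pulse reaches the output.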
