Advanced Xilinx FPGA Design with ISE

Course Agenda

© 2002 Xilinx, Inc. All Rights Reserved

Agenda

Section 1: Optimize Your Design for Xilinx Architecture
  – CORE Generator System
    • Lab: CORE Generator System Flow

Section 2: Achieving Timing Closure
  – Timing Closure with Timing Analyzer
  – Global Timing Constraints
    • Lab: Global Timing Constraints
  – Advanced Timing Constraints
    • Lab: Achieving Timing Closure with Advanced Constraints

Section 3: Improve Your Timing
  – Floorplanner
    • Lab: Floorplanner

Section 4: Reduce Implementation Time
  – Incremental Design Techniques
    • Lab: IDT Flow
  – Modular Design Techniques
    • Lab: MDT Flow

Agenda

Section 5: Reduce Debug Time
  – FPGA Editor: Viewing and Editing a Routed Design
    • Lab: FPGA Editor

Section 6: On-Chip Verification and Debugging
  – ChipScope Pro
    • Demo

Section 7: Course Summary

Optional Topics
  – Power Estimation with XPower
  – Advanced Implementation Options
  – Embedded Solutions with PowerPC/MicroBlaze and Embedded Development Kit (EDK)
  – XtremeDSP Solutions with System Generator

Objectives

After completing this course, you will be able to:
• Describe Virtex™-II advanced architectural features and how they can be used to improve performance
• Create and integrate cores into your design flow using the CORE Generator™ System
• Describe the different ISE options available and how they can be used to improve performance
• Describe a flow for obtaining timing closure with Advanced Timing Constraints
• Use Floorplanner to improve timing
• Reduce implementation time with Incremental Design Techniques and Modular Design Techniques
• Reduce debugging time with FPGA Editor
• Perform on-chip verification with ChipScope Pro

Prerequisites

• Basic knowledge of:
  – Virtex™-II architecture features
  – The Xilinx implementation software flow and implementation options
  – Reading timing reports
  – Basic FPGA design techniques
  – Global timing constraints and the Constraints Editor
• Basic HDL knowledge (VHDL or Verilog)
• Basic digital design knowledge


© 2003 Xilinx, Inc. All Rights Reserved

Optimize Your Design for Xilinx Architecture

CORE Generator System


Objectives

After completing this module, you will be able to:
• Describe the differences between LogiCORE and AllianceCORE solutions
• List two benefits of using cores in your designs
• Create customized cores by using the CORE Generator GUI
• Instantiate cores into your schematic or HDL design
• Run behavioral simulation on a design containing cores

Outline

• Introduction
• Using the CORE Generator System
• CORE Generator Design Flows
• Summary

What are Cores?

• A core is a ready-made function that you can instantiate into your design as a “black box”
• Cores can range in complexity
  – Simple arithmetic operators, such as adders, accumulators, and multipliers
  – System-level building blocks, including filters, transforms, and memories
  – Specialized functions, such as bus interfaces, controllers, and microprocessors
• Some cores can be customized

Benefits of Using Cores

• Save design time
  – Cores are created by expert designers who have in-depth knowledge of Xilinx FPGA architecture
  – Guaranteed functionality saves time during simulation
• Increase design performance
  – Cores that contain mapping and placement information have predictable performance that is constant over device size and utilization
  – The data sheet for each core provides performance expectations
    • Use timing constraints to achieve maximum performance

Types of Cores

• LogiCORE
• AllianceCORE
• The CORE Generator GUI lists the type of each core

LogiCORE Solutions

• Typically customizable
• Fully tested, documented, and supported by Xilinx
• Many are pre-placed for predictable timing
• Many are unlicensed and provided for free with the Xilinx software
  – More complex LogiCORE products are licensed
• Support VHDL and Verilog flows with several EDA tools
• Schematic flow support for Foundation, Mentor, and Innoveda for most cores

AllianceCORE Solutions

• Point-solution cores
  – Typically not customizable (some HDL versions are customizable)
• Sold and supported by Xilinx AllianceCORE partners
  – Partners may be contacted directly to provide customized cores
• All cores optimized for Xilinx; some are pre-placed
• Typically supplied as an EDIF netlist
• Support VHDL and Verilog flows, some schematic

Sample Functions

• LogiCORE solutions
  – DSP functions
    • Time skew buffers, FIR filters, correlators
  – Math functions
    • Accumulators, adders, multipliers, integrators, square root
  – Memories
    • Pipelined delay elements, single and dual-port RAM
    • Synchronous FIFOs
  – PCI master and slave interfaces, PCI bridge
• AllianceCORE solutions
  – Peripherals
    • DMA controllers
    • Programmable interrupt controllers
    • UARTs
  – Communications and networking
    • ATM
    • Reed-Solomon encoders / decoders
    • T1 framers
  – Standard bus interfaces
    • PCMCIA, USB


What is the CORE Generator System?

• Graphical User Interface (GUI) that allows central access to the cores themselves, plus:
  – Data sheets
  – Customizable parameters (available for some cores)
• Interfaces with design entry tools
  – Creates graphical symbols for schematic-based designs
  – Creates instantiation templates for HDL-based designs
• Web access from the Help menu
  – The IP Center contains new cores to download and install
    • You always have access to the latest cores
  – Direct access to http://support.xilinx.com

Invoking the CORE Generator System

• From the Project Navigator, select Project → New Source
• Select IP (CoreGen & Architecture Wizard) and enter a filename
• Click Next, then select the type of core

Xilinx CORE Generator System GUI

• Cores can be organized by function, vendor, or device family
• Core type, version, device support, vendor, and status

Selecting a Core

• Double-click folders to browse the catalog of cores
• Double-click a core to open its information window
  – Or select a core, and click the Customize or Data Sheet icons in the toolbar

Core Customize Window

• Core Overview tab provides version information and a brief functional description
• Parameters tab allows you to customize the core
• Web Links tab provides direct access to related Web pages
• Contact tab provides information about the vendor
• Data sheet access

CORE Data Sheets

• Performance expectations (not shown)
• Features
• Functionality
• Pinout
• Resource utilization


Schematic Design Flow

• Generate a core
  – Use the Edit → Project Options to select a schematic symbol instead of HDL templates
  – Creates an EDIF file and schematic symbol
• Instantiate symbol onto your schematic
  – Treated as a “black box” - no underlying schematic
• Proceed with normal schematic flow

Flow diagram steps: Generate Core (.xco, .EDN & symbol), Instantiate, Simulate, Implement

HDL Design Flow

• Compile library for behavioral simulation (one time only)
  – compxlib.exe, XilinxCoreLib
• Core generation and integration
  – Generate Core (.xco, .EDN, .VHD, .VHO, .V, .VEO), Instantiate, Simulate, Implement

HDL Design Flow: Compile Simulation Library

• Before your first behavioral simulation, you must run compxlib.exe to compile the XilinxCoreLib simulation library
  – Located in $XILINX\bin\
  – Supports ModelSim, Cadence NC-Verilog, VCS, Speedwave, and Scirocco
• If you download new or updated cores, additional simulation models will be automatically extracted during installation

HDL Design Flow: Core Generation and Integration

• Generate or purchase a core
  – Netlist file (EDN)
  – Instantiation template files (VHO or VEO)
  – Behavioral simulation wrapper files (VHD or V)
• Instantiate the core into your HDL source
  – Cut and paste from the templates provided in the VEO or VHO file
• Design is ready for synthesis and implementation
• Use the wrapper files for behavioral simulation
  – ISE automatically uses wrapper files when cores are present in the design
  – VHDL: Analyze the wrapper file for each core before analyzing the file that instantiates the core


Skills Check


Review Questions

• What is the main difference between the LogiCORE and the AllianceCORE products?
• What is the purpose of compxlib.exe?
• What is the difference between the VHO/VEO files and the VHD/V files that are created by the CORE Generator™ system?

Answers

• What is the main difference between the LogiCORE and the AllianceCORE products?
  – LogiCORE products are sold and supported by Xilinx
  – AllianceCORE products are sold and supported by AllianceCORE partners
• What is the purpose of compxlib.exe?
  – Makes it easy to compile the XilinxCoreLib library before your first behavioral simulation
• What is the difference between the VHO/VEO files and the VHD/V files that are created by the CORE Generator™ system?
  – VHO/VEO files contain instantiation templates
  – VHD/V files are wrappers for behavioral simulation that reference the XilinxCoreLib library

Summary

• A core is a ready-made function that you can “drop” into your design
• LogiCORE products are sold and supported by Xilinx
• AllianceCORE products are sold and supported by AllianceCORE partners
• Using cores can save design time and provide increased performance
• Cores can be used in schematic or HDL design flows

Where Can I Learn More?

• Xilinx IP Center http://www.xilinx.com/ipcenter
  – Software updates
  – Download new cores as they are released
• Tech Tips on http://support.xilinx.com
• Software manuals: CORE Generator Guide


CORE Generator System Lab

Objectives

After completing this lab, you will be able to:
• Create a core using the Xilinx CORE Generator™ system
• Instantiate a core into an HDL design
• Perform behavioral simulation on a design that contains a core

Lab Design: Correlate and Accumulate

Channel FIFO Block

Lab Overview

• Generate a dual-port block RAM core
• Replace an instantiated library primitive with the core
• Perform behavioral simulation on the design
  – Testbench file provided

General Flow

Step 1: Review the design
Step 2: Generate the core
Step 3: Instantiate block RAM core into Verilog or VHDL source
Step 4: Perform behavioral simulation


Achieving Timing Closure

Timing Closure with Timing Analyzer


Objectives

After completing this module, you will be able to:
• Interpret a timing report and determine the cause of timing errors
• Use the Timing Analyzer report options to create customized timing reports

Outline

• Timing Reports
• Interpreting Timing Reports
• Report Options
• Summary

Timing Reports

• Timing reports enable you to determine how and why constraints were not met
  – Reports contain detailed descriptions of paths that fail their constraints
• The Project Navigator can create timing reports at two points in the design flow
  – Post-Map Static Timing Report
  – Post-Place & Route Static Timing Report
• The Timing Analyzer is a utility for creating and reading timing reports

Using the Timing Analyzer

• Create and open a report in the Timing Analyzer by double-clicking on Post-Place & Route Static Timing Report
• Open a plain text version of the report
• Start the Timing Analyzer to create custom reports by double-clicking on Analyze Post-Place & Route Static Timing (Timing Analyzer)

Timing Analyzer GUI

• Hierarchical browser
  – Quickly navigate to specific report sections
• Current position in the report
  – Identifies the portion of the report that is displayed in the text window
• Report text
  – Links to the Timing Improvement Wizard, Interactive Data Sheet, and Floorplanner, highlighted in blue

Cross Probing

• Shows the placement of logic in a delay path
• To enable cross probing, use the command: View → Floorplanner for Cross probing
• Click highlighted text
  – The corresponding logic is selected in the Floorplanner

Timing Report Structure

• Timing Constraints
  – Number of paths covered and number of paths that failed for each constraint
  – Detailed descriptions of the longest paths
• Data Sheet Report
  – Setup, hold, and clock-to-out times for each I/O pin
• Timing Summary
  – Number of errors, Timing Score
• Timing Analyzer Settings
  – Allows you to easily duplicate the report

Report Example

• Constraint summary
  – Number of paths covered
  – Number of timing errors
  – Length of critical path
• Detailed path description
  – Delay types are described in the data sheet
  – Worst-case conditions assumed, unless pro-rated
• Total delay
  – OFFSET paths have two parts
  – Logic/routing breakdown


Estimating Design Performance

• Performance estimates are available before implementation is complete
• Synthesis report
  – Logic delays are accurate
  – Routing delays are estimated based on fanout
  – Reported performance generally accurate to within 20 percent
• Post-Map Static Timing Report
  – Logic delays are accurate
  – Routing delays are estimated based on the fastest possible routing resources
  – Use the 60/40 rule to get a more realistic performance estimate

60/40 Rule

• A rule-of-thumb to determine whether timing constraints are reasonable
• Open the Post-Map Static Timing Report
• Look at the percentage of the timing constraint that is used up by logic delays
  – Under 60 percent: Good chance that the design will meet timing
  – 60 to 80 percent: Design may meet timing if advanced options are used
  – Over 80 percent: Design will probably not meet timing (go back to improve synthesis results)
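
The 60/40 rule can be written down as a small check. The following is a minimal Python sketch, not part of the Xilinx tools: the function name `assess_60_40` and its arguments are hypothetical, and the handling of the exact 60 and 80 percent boundaries is an assumption, since the rule of thumb does not define them.

```python
def assess_60_40(logic_delay_ns: float, constraint_ns: float) -> str:
    """Classify a path by the share of its timing constraint used by logic delay.

    logic_delay_ns: logic delay read from the Post-Map Static Timing Report.
    constraint_ns:  the timing constraint on the same path (for example a PERIOD).
    """
    if constraint_ns <= 0:
        raise ValueError("constraint_ns must be positive")

    logic_percent = 100.0 * logic_delay_ns / constraint_ns

    if logic_percent < 60.0:
        return "good chance of meeting timing"
    if logic_percent <= 80.0:  # boundary handling is an assumption
        return "may meet timing with advanced options"
    return "unlikely to meet timing; improve synthesis results"


if __name__ == "__main__":
    # Hypothetical numbers: 4 ns of logic delay against a 10 ns constraint.
    print(assess_60_40(4.0, 10.0))
```

A path with 4 ns of logic delay under a 10 ns constraint uses 40 percent of its budget, which falls in the first band.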

Analyzing Post-Place & Route Timing

• There are many factors that contribute to timing errors, including:
  – Neglecting synchronous design rules or using incorrect HDL coding style
  – Poor synthesis results (too many logic levels in the path)
  – Inaccurate or incomplete timing constraints
  – Poor logic mapping or placement
• Each root cause has a different solution
  – Rewrite HDL code
  – Add timing constraints
  – Resynthesize or re-implement with different software options
• Correct interpretation of timing reports can reveal the most likely cause
  – And therefore, the most likely solution

Example: Poor Placement

Data Path: source to dest
Location          Delay type        Delay(ns)   Logical Resource(s)
---------------   ---------------   ---------   -------------------
U29.IQ1           Tiockiq               0.382   source
SLICE_X0Y65.F2    net (fanout=7)        1.921   net_1
SLICE_X0Y65.X     Tilo                  0.291   lut_1
SLICE_X15Y1.G2    net (fanout=1)        2.359   net_2
SLICE_X15Y1.Y     Tilo                  0.291   lut_2
SLICE_X15Y1.F2    net (fanout=1)        0.008   net_3
SLICE_X15Y1.X     Tilo                  0.291   lut_3
SLICE_X15Y2.DY    net (fanout=1)        0.108   net_4
SLICE_X15Y2.CLK   Tdyck                 0.001   dest
---------------   ---------------   ---------   -------------------
Total                                 5.652ns   (1.256ns logic, 4.396ns route)
                                                (22.2% logic, 77.8% route)

• net_2 has a long delay, even though fanout is low
• Location column reveals that bad placement is the cause
  – Go to Edit → Preferences in the Timing Analyzer to show this column
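
The logic/route split at the bottom of a detailed path description can be reproduced from the individual delays. The following is a minimal Python sketch, not part of the Xilinx tools: the `PATH` list transcribes the delays from the Poor Placement example, and the function name `summarize_path` is hypothetical.

```python
# Delays transcribed from the Poor Placement example:
# (logical resource, delay type, delay in ns). "net" marks a routing delay.
PATH = [
    ("source", "Tiockiq", 0.382),
    ("net_1", "net", 1.921),
    ("lut_1", "Tilo", 0.291),
    ("net_2", "net", 2.359),
    ("lut_2", "Tilo", 0.291),
    ("net_3", "net", 0.008),
    ("lut_3", "Tilo", 0.291),
    ("net_4", "net", 0.108),
    ("dest", "Tdyck", 0.001),
]


def summarize_path(path):
    """Return the logic/route breakdown of a path and the name of its slowest net."""
    route = sum(delay for _, kind, delay in path if kind == "net")
    logic = sum(delay for _, kind, delay in path if kind != "net")
    total = logic + route
    # The net with the largest routing delay is the first place to look.
    slowest_net = max((p for p in path if p[1] == "net"), key=lambda p: p[2])
    return {
        "logic_ns": round(logic, 3),
        "route_ns": round(route, 3),
        "total_ns": round(total, 3),
        "logic_percent": round(100.0 * logic / total, 1),
        "slowest_net": slowest_net[0],
    }


if __name__ == "__main__":
    print(summarize_path(PATH))
```

For this path the sketch reports 1.256 ns of logic and 4.396 ns of routing, matching the report totals, and it singles out net_2 as the slowest net.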

Poor Placement: Solutions

• Timing-driven Map, if the placement is caused by packing unrelated logic together
  – Cross-probe to the Floorplanner to see what has been packed together
  – Timing-driven Map is covered in the Advanced Implementation Options module
• PAR extra effort or MPPR options
  – Covered in the Advanced Implementation Options module
• Floorplanning or RLOC constraints, if you have the skill
  – Covered in the Advanced FPGA Implementation course

Example: High Fanout

Data Path: source to dest
Delay type        Delay(ns)   Logical Resource(s)
---------------   ---------   -------------------
Tcko                  0.382   source
net (fanout=87)       4.921   net_1
Tilo                  0.291   lut_1
net (fanout=1)        0.080   net_2
Tilo                  0.291   lut_2
net (fanout=2)        0.523   net_3
Tilo                  0.291   lut_3
net (fanout=1)        0.108   net_4
Tdyck                 0.001   dest
---------------   ---------   -------------------
Total               6.888ns   (1.256ns logic, 5.632ns route)
                              (18.2% logic, 81.8% route)

• net_1 has a long delay and high fanout

High Fanout: Solutions

• Most likely solution is to duplicate the source of the high-fanout net
  – In this example, the net is the output of a flip-flop, so the solution is to duplicate the flip-flop
  – If the net is driven by combinatorial logic, it may be more difficult to locate the source of the net in the HDL code
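
The reasoning behind the Poor Placement and High Fanout examples can be sketched as a simple triage of a single net. This is an illustrative Python sketch only: the function name `diagnose_net` and both thresholds (`fanout_limit`, `delay_limit_ns`) are assumptions chosen to separate the two examples, not values taken from the Xilinx tools or data sheets.

```python
def diagnose_net(fanout: int, delay_ns: float,
                 fanout_limit: int = 20, delay_limit_ns: float = 2.0) -> str:
    """Suggest a likely cause for a slow net.

    Both limits are illustrative assumptions; tune them for your device and design.
    """
    if delay_ns < delay_limit_ns:
        return "ok"
    if fanout >= fanout_limit:
        # Long delay with many loads: the fanout itself is the likely cause.
        return "high fanout: consider duplicating the source"
    # Long delay with few loads: the loads are probably placed far from the source.
    return "low fanout but long delay: check placement"


if __name__ == "__main__":
    print(diagnose_net(fanout=87, delay_ns=4.921))  # net_1 in the High Fanout example
    print(diagnose_net(fanout=1, delay_ns=2.359))   # net_2 in the Poor Placement example
```

The delay limit is checked first so that a net with many loads but an acceptable delay is not flagged.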

Example: Too Many Logic Levels

Data Path: source to dest
Delay type        Delay(ns)   Logical Resource(s)
---------------   ---------   -------------------
Tcko                  0.314   source
net (fanout=7)        1.221   net_1
Tilo                  0.291   lut_1
net (fanout=1)        0.180   net_2
Tilo                  0.291   lut_2
net (fanout=1)        0.423   net_3
Tilo                  0.291   lut_3
net (fanout=1)        0.123   net_4
Tilo                  0.291   lut_4
net (fanout=1)        0.610   net_5
Tilo                  0.291   lut_5
net (fanout=1)        0.533   net_6
Tilo                  0.291   lut_6
net (fanout=1)        0.408   net_7
Tdyck                 0.001   dest
---------------   ---------   -------------------
Total               5.559ns   (2.129ns logic, 3.430ns route)
                              (38.3% logic, 61.7% route)
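
One quick way to spot this kind of path is to count the LUT delays it contains. The following is a minimal Python sketch, assuming that each `Tilo` entry in the detailed path description corresponds to one LUT level; the function name `count_logic_levels` is hypothetical.

```python
# Delay types transcribed in order from the Too Many Logic Levels example.
DELAY_TYPES = [
    "Tcko", "net",
    "Tilo", "net",
    "Tilo", "net",
    "Tilo", "net",
    "Tilo", "net",
    "Tilo", "net",
    "Tilo", "net",
    "Tdyck",
]


def count_logic_levels(delay_types):
    """Count LUT levels on a path, assuming one Tilo entry per LUT."""
    return sum(1 for delay_type in delay_types if delay_type == "Tilo")


if __name__ == "__main__":
    print(count_logic_levels(DELAY_TYPES))
```

This path passes through six LUTs between flip-flops, compared with three in the Poor Placement and High Fanout examples.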

Too Many Logic Levels: Solutions

• The implementation tools cannot do much to improve performance
• The netlist must be altered to reduce the amount of logic between flip-flops
• Possible solutions:
  – Check whether the path is a multi-cycle path
    • If it is, add a multi-cycle path constraint
  – Use the retiming option during synthesis to distribute logic more evenly between flip-flops
  – Confirm that good coding techniques were used to build this logic (no nested IF or CASE statements)
  – Add a pipeline stage

Example: I/O Timing

Clock Path: clk to source_ff
Delay type         Delay(ns)   Logical Resource(s)
----------------   ---------   -------------------
Tiopi                  0.669   clk
                               clk_BUFGP/IBUFG
net (fanout=1)         0.019   clk_BUFGP/IBUFG
Tgi0o                  0.802   clk_BUFGP/BUFG.GCLKMUX
                               clk_BUFGP/BUFG
net (fanout=226)       0.307   clk_BUFGP
----------------   ---------   -------------------
Total                1.797ns   (1.471ns logic, 0.326ns route)
                               (81.9% logic, 18.1% route)

Data Path: source_ff to dest_pad
Delay type         Delay(ns)   Logical Resource(s)
----------------   ---------   -------------------
Tcko                   0.314   source_ff
net (fanout=2)         1.234   net_1
Tilo                   0.291   lut_1
net (fanout=1)         1.693   net_2
Tioop                  4.133   dest_pad_OBUF
                               dest_pad
----------------   ---------   -------------------
Total                7.665ns   (4.738ns logic, 2.927ns route)
                               (61.8% logic, 38.2% route)
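
An OFFSET path has two parts, so the clock-to-out time seen at the pad is the clock path delay plus the data path delay. The following is a minimal Python sketch of that sum using the two totals from the I/O Timing example; the function names and the requirement values in the usage lines are hypothetical.

```python
def clock_to_out_ns(clock_path_ns: float, data_path_ns: float) -> float:
    """Clock-to-out delay: clock pad to flip-flop, plus flip-flop to output pad."""
    return round(clock_path_ns + data_path_ns, 3)


def meets_offset_out(clock_path_ns: float, data_path_ns: float,
                     requirement_ns: float) -> bool:
    """Check the summed delay against an OFFSET OUT requirement (in ns)."""
    return clock_to_out_ns(clock_path_ns, data_path_ns) <= requirement_ns


if __name__ == "__main__":
    # Totals from the I/O Timing example: 1.797 ns clock path, 7.665 ns data path.
    print(clock_to_out_ns(1.797, 7.665))
    # Hypothetical requirements of 10 ns and 9 ns.
    print(meets_offset_out(1.797, 7.665, 10.0))
    print(meets_offset_out(1.797, 7.665, 9.0))
```

The sum shows why removing clock distribution delay and shortening the output path both help: either term can be reduced to bring the total under the requirement.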

I/O Timing: Solutions

• Use a DCM to remove clock distribution delay
• Register all top-level inputs and outputs
  – IOB flip-flops have the best timing
• Increase the slew rate or drive strength on outputs
  – Only available for LVCMOS and LVTTL I/O standards


Creating Custom Reports

• The Post-Map and Post-Place & Route Static Timing Reports are usually sufficient timing analysis tools
• Custom reports can be created with the Timing Analyzer to:
  – *Show detailed path descriptions for more timing paths
  – *Analyze specific paths that may be unconstrained
  – Analyze designs that contain no constraints
  – Change the constraints or design parameters to perform “what-if” analysis

Types of Timing Reports

• Analyze Against Timing Constraints
  – Compares design performance with timing constraints
  – Most commonly used report format
    • Used for Post-Map and Post-Place & Route Static Timing Reports if design contains constraints
• Analyze Against Auto Generated Design Constraints
  – Determines the longest paths in each clock domain
  – Use with designs that have no constraints defined
    • Used for Post-Map and Post-Place & Route Static Timing Reports if design contains no constraints
• Analyze Against User Specified Paths by Defining Endpoints
  – Custom report for selecting sources and destinations
• Analyze Against User Specified Paths by Defining Clock and I/O Timing
  – Allows you to define PERIOD and OFFSET constraints on-the-fly
  – Use with designs that have no constraints defined

Timing Constraints Tab

• After selecting a Timing Analyzer report, you can select from various report options
• Report Failing Paths: Lists only the paths failing to meet your specified timing constraints
• Report Unconstrained Paths: Allows you to list some or all of the unconstrained paths in your design
• You can also select which constraints you want reported

Options Tab

• Speed grade
  – Generate new timing information without re-implementing
• Constraint Details
  – Specify the number of detailed paths reported per constraint
  – Report details of hold violations
• Timing report contents
• Prorating
  – Specify your own worst-case environment

Filter Paths by Net Tab

• Restrict which paths are reported by selecting specific nets
• Each net is assigned to be included by default
• Net Filter values:
  – Exclude paths containing this net
  – Include Only paths containing this net
  – Default

Path Tracing Tab

• Restrict which paths are reported by selecting path end points or path types
• In this example, if you had an OFFSET OUT constraint associated with this design, the report would only include paths associated with the TENSOUT0 and TENSOUT1 pins


Review Questions

• To which resources is the timing report linked?
• List the possible causes of timing errors

Answers

• To which resources is the timing report linked?
  – Timing Improvement Wizard on the Web
  – Interactive Data Sheet on the Web
  – Floorplanner for cross-probing
• List the possible causes of timing errors
  – Neglecting synchronous design rules or using incorrect HDL coding style
  – Poor synthesis results (too many levels of logic)
  – Inaccurate or incomplete timing constraints
  – Poor logic mapping or placement

Summary

• Timing reports enable you to determine how and why constraints were not met
• Use the synthesis report and Post-Map Static Timing Report to estimate performance before running place & route
• The detailed path description offers clues to the cause of timing failures
• Cross probe to the Floorplanner to view the placement of logic in a timing path
• The Timing Analyzer can generate various types of reports for specific circumstances

Where Can I Learn More?

• Online Help
  – Help → Timing Analyzer Help Contents
  – Help button available in the Options GUIs
• Timing Improvement Wizard: http://support.xilinx.com → Problem Solvers
  – Decision-tree process guides you to the suggested next step to achieve timing closure


Achieving Timing Closure

Timing Groups and OFFSET Constraints


Objectives

After completing this module, you will be able to:
• Use the Constraints Editor to create groups of path endpoints
• Use the Constraints Editor to create path-specific OFFSET constraints

Outline

• Introduction
• Creating Groups
• OFFSET Constraints
• Summary

Path-Specific Timing Constraints

• Using global timing constraints (PERIOD, OFFSET, and PAD-TO-PAD) will constrain your entire design
• Using only global constraints often leads to over-constrained designs
  – Constraints are too tight
  – Increases compile time and can prevent timing objectives from being met
  – Review performance estimates provided by your synthesis tool or the Post-Map Static Timing Report
• Path-specific constraints override the global constraints on specified paths
  – This allows you to loosen the timing requirements on specific paths

More About Path-Specific Timing Constraints

• Areas of your design that may benefit from path-specific constraints
  – Multi-cycle paths
  – Paths that cross between clock domains
  – Bidirectional buses
  – I/O timing
• Path-specific timing constraints should be used to define your performance objectives and should not be indiscriminately placed

Global Constraint Review
• Using the global PERIOD, OFFSET IN, and OFFSET OUT constraints will constrain all of these paths
• This makes it easy to control the overall performance of your design
[Figure: example circuit with inputs ADATA and CDATA, flip-flops FLOP1 to FLOP5 clocked by CLK through a BUFG, BUS [7..0], and outputs OUT1 and OUT2]

Path-Specific Constraint Example
• A path-specific constraint can optimize as little as one path
• This gives you greater control over your design’s performance and gives the implementation tools the greatest flexibility in meeting your performance and utilization needs
[Figure: example circuit with inputs ADATA and CDATA, flip-flops FLOP1 to FLOP5 clocked by CLK through a BUFG, BUS [7..0], and outputs OUT1 and OUT2]

The Advanced Tab of the Constraints Editor
• Creating path-specific constraints requires two steps
  – Step 1: Create groups of path end points
  – Step 2: Communicate the timing objective between the groups
• The constraints we discuss in this module can all be entered from the Advanced tab of the Constraints Editor


Creating Groups of Endpoints
• Path-specific timing constraints will only be effective if path end points can be easily grouped together
  – Otherwise, constraining a large design would be time consuming and painstaking
• The Constraints Editor makes this easy by allowing you to define groups of path end points (pads, flip-flops, latches, and RAMs)
• Specific delay paths can then be constrained with advanced timing constraints

Creating Groups of Endpoints
• With the Constraints Editor, grouping path end points is made easy with the following options:
  – Group by nets
  – Group by instance name
  – Group by hierarchy
  – Group by output net name
  – Timing THRU Points option
  – Group by clock edge

Grouping by Nets or Output Net Name
• Step 1: Enter a group name
• Step 2: Select the type of net to search for
  – Optional filter string
• Matching nets appear in the Available list
• Step 3: Select nets and click Add
  – Nets appear in the Time Name Targets list

Grouping by Nets versus Output Net Name
• Grouping by net “NET_A” will create a group containing FLOP2 only
  – Group contains flip-flops that are driven by the selected net
• Grouping by output net name “NET_A” will create a group containing FLOP1 only
  – Group contains the flip-flop that sources the selected net
[Figure: FLOP1 output Q drives NET_A, which feeds the D input of FLOP2]
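The difference between the two grouping options can be sketched in a few lines of Python. This is only an illustrative model, not how the Constraints Editor is implemented: the `Net` class and both function names are hypothetical, and only NET_A, FLOP1, and FLOP2 come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Net:
    """Hypothetical netlist model: one driver and any number of loads."""
    name: str
    driver: str                                 # element that sources the net
    loads: list = field(default_factory=list)   # elements driven by the net


def group_by_net(net):
    """Group by nets: the elements that are driven by the selected net."""
    return list(net.loads)


def group_by_output_net(net):
    """Group by output net name: the element that sources the selected net."""
    return [net.driver]


# The example from the slide: FLOP1 drives NET_A, which feeds FLOP2.
net_a = Net("NET_A", driver="FLOP1", loads=["FLOP2"])
print(group_by_net(net_a))         # ['FLOP2']
print(group_by_output_net(net_a))  # ['FLOP1']
```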

Grouping by Instance Name or Hierarchy
• Steps are the same
• Design Element Types are different
  – Instance Name: FFs, pads, latches, RAMs
  – Hierarchy: User levels, Xilinx-created levels

Grouping by Clock Edge
• Step 1: Enter a group name
• Step 2: Select a previously defined group
  – Optional filter to help find the group
• Step 3: Select clock edge

Timing THRU Points
• Allows you to optimize paths through specific nets and 3-state buffers
• In this example, a group of nets was named TEOUTS. A constraint can now be referenced such that only the delay paths through the TEOUTS nets will be optimized
[Figure: registers, including MYCTR, connected through nets labeled TPTHRU = TEOUTS]

Timing THRU Points
• Group nets or 3-state buffers
• Use these groups to identify specific paths to be constrained
  – Step 1: Enter TPTHRU Name
  – Step 2: Select Element Type
  – Step 3: Select Nets or 3-state Buffers and click on Add

Managing Groups
• Groups that you have defined are written into the UCF file
  – INST <element_name> TNM = ; OR
  – NET TNM_NET = ; OR
  – TIMEGRP = <elements>;
• To add items to an existing group, click one of the grouping buttons and use the same Time Name
  – Not allowed when grouping by output net name
• To delete a group, right-click on the line in the Constraints window and select Delete Constraint
  – Or delete the line with a text editor
• You cannot remove items from a group with the Constraints Editor
  – Edit the UCF file with a text editor


Review of Global OFFSET Constraints
• Use the Pad-to-Setup and Clock-to-Pad columns to specify OFFSETs for all I/O paths on each clock domain
• Easiest way to constrain most I/O paths
  – However, this may lead to an over-constrained design

Pin-Specific OFFSET Constraints
• Use the Pad-to-Setup and Clock-to-Pad columns to specify OFFSETs for each I/O pin
• Use this type of constraint when only a few I/O pins need different timing

Creating Groups of Pads
• Groups of I/O pads can be made in the Ports tab
  – Use Shift-click or CTRL-click to select multiple pads
  – Enter a group name and click the Create Group button
• Click the Pad to Setup or Clock to Pad button to define group OFFSETs
  – Or use the Advanced tab

Creating Group OFFSET Constraints
• OFFSET IN/OUT constraints can also be entered in the Advanced tab
• The Pad-to-Setup and Clock-to-Pad options allow you to enter OFFSET IN/OUT constraints on specific groups of pads

Group OFFSET Constraints
• Select a group of pads
• Enter timing requirement
• Select a clock domain
• Optional: Select a group of synchronous elements

Source Synchronous OFFSET Constraints
• For source synchronous inputs, you can specify the width of the valid data window

OFFSET Constraints with Two-Phase Clocks
• OFFSET constraints define the relationship between the data and the initial clock edge at the pins of the FPGA
• Initial clock edge is defined in the global PERIOD constraint using the HIGH or LOW keyword
  – HIGH: Initial edge rising (default)
  – LOW: Initial edge falling
• If all I/O are clocked on a single edge, use the HIGH/LOW keywords in the PERIOD constraint to define which edge is used
• If both clock edges are used, create two OFFSET constraints
  – One for each clock edge
  – This includes cases where DDR flip-flops are used

OFFSET IN Using Both Clock Edges
[Figure: timing diagram of clk (10 ns period), data_rising, and data_falling, with time markers at t = -3, 0, 2, and 5 ns]
• Input data is valid 3 ns before rising and falling edge
  – PERIOD constraint is 10 ns, initial edge rising, 50-percent duty cycle
• Create groups of flip-flops for each clock edge
• For inputs clocked on a rising edge, OFFSET = IN 3 ns BEFORE clk;
• For inputs clocked on a falling edge, OFFSET = IN -2 ns BEFORE clk;
  – 2 ns after initial (rising) edge = 3 ns before falling edge

OFFSET OUT Using Both Clock Edges
[Figure: timing diagram of clk (10 ns period), data_rising, and data_falling, with time markers at t = 0, 3, and 8 ns]
• Output data must be valid 3 ns after rising and falling edge
  – PERIOD constraint is 10 ns, initial edge rising, 50-percent duty cycle
• Create groups of flip-flops for each clock edge
• For outputs clocked on a rising edge, OFFSET = OUT 3 ns AFTER clk;
• For outputs clocked on a falling edge, OFFSET = OUT 8 ns AFTER clk;
  – 8 ns after initial (rising) edge = 3 ns after falling edge
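The arithmetic behind the falling-edge OFFSET values can be checked with a small Python sketch. It assumes the PERIOD constraint uses the HIGH keyword (initial edge rising), so the falling edge sits at period × duty cycle after the initial edge. The function names are hypothetical and are not part of any Xilinx tool.

```python
def offset_in_before(valid_before_edge_ns, period_ns, capture_edge="rising", duty=0.5):
    """OFFSET IN value, in ns BEFORE the initial (rising) clock edge.

    Assumes the initial edge is rising; the falling edge occurs at
    period_ns * duty after it.
    """
    edge_time = 0.0 if capture_edge == "rising" else period_ns * duty
    return valid_before_edge_ns - edge_time


def offset_out_after(valid_after_edge_ns, period_ns, launch_edge="rising", duty=0.5):
    """OFFSET OUT value, in ns AFTER the initial (rising) clock edge."""
    edge_time = 0.0 if launch_edge == "rising" else period_ns * duty
    return valid_after_edge_ns + edge_time


# Values from the slides: 10 ns period, 50-percent duty cycle.
print(offset_in_before(3, 10, "rising"))    # 3.0
print(offset_in_before(3, 10, "falling"))   # -2.0
print(offset_out_after(3, 10, "rising"))    # 3.0
print(offset_out_after(3, 10, "falling"))   # 8.0
```

A negative OFFSET IN value simply means the data arrives after the initial edge, which is expected for inputs captured on the falling edge.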


Review Questions
• How do path-specific timing constraints improve your design’s performance?
• How would you constrain this design to get an internal clock frequency of 100 MHz?
• The input will be valid at least 3 ns before the rising edge of CLK. The output must be valid 4 ns after the falling edge of CLK. Write the appropriate OFFSET constraints
[Figure: example circuit with input IN, output OUT, clock CLK, resets RESET_A and RESET_B, and several flip-flops]

Answers
• How do path-specific timing constraints improve your design’s performance?
  – They give the implementation tools more flexibility to meet all of your timing objectives
• How would you constrain this design to get a maximum internal clock frequency of 100 MHz?
  – Enter a global PERIOD constraint of 10 ns on the CLK signal
• Write the appropriate OFFSET constraints.
  – Assuming that the PERIOD constraint uses the HIGH keyword and 50-percent duty cycle:
    • OFFSET = IN 3 ns BEFORE CLK;
    • OFFSET = OUT 9 ns AFTER CLK;

Summary
• Path-specific constraints are used to override global constraints
  – Keeps your design from becoming over-constrained
  – Allows the software to make intelligent trade-offs to meet all of your performance goals
• Creating path-specific constraints is a two-step process
  – Create groups of path endpoints
  – Communicate the timing objective between the groups
• Path-specific OFFSET constraints can be entered on either the Ports tab or the Advanced tab
• When using both clock edges for I/O, write separate OFFSET constraints for each clock edge

Where Can I Learn More?
• Timing Presentation on the Web: http://support.xilinx.com → Tech Tips → Timing & Constraints
• Constraints Guide: http://support.xilinx.com → Software Documentation
  – Documentation may also be installed on your local machine


Review of Global Timing Constraints Lab

Objectives
After completing this lab, you will be able to:
• Enter global timing constraints in the Constraints Editor
• Read reports to determine whether constraints were met

Lab Design: Correlate and Accumulate

Lab Overview
• Enter global timing constraints for five clock domains
• Implement the design using the default software options
• Review reports

General Flow
• Step 1: Enter Global Timing Constraints
• Step 2: Implement and analyze timing


Achieving Timing Closure

Objectives
After completing this module, you will be able to:
• Constrain paths that cross between clock domains by using the Constraints Editor
• Describe how constraints are prioritized
• Constrain multi-cycle paths by using the Constraints Editor
• Set path exception constraints
• Set I/O specific constraints

Outline
• Inter-Clock Domain Constraints
• Multi-cycle Paths
• False Paths
• Miscellaneous Constraints
• Summary

Constraining Between Rising and Falling Clock Edges
• The PERIOD constraint automatically accounts for two-phase clocks
  – Includes adjustments for non-50-percent duty-cycle clocks
• Example: A PERIOD constraint of 10 ns on CLK will apply a 5-ns constraint between these two flip-flops
• No path-specific constraints are required for this case
[Figure: two flip-flops clocked by CLK, driving OUT]
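The 5-ns figure in the example follows from the clock's high and low times. The sketch below shows that arithmetic, assuming the clock starts with its high phase (the HIGH keyword). The function name and parameters are hypothetical, and the real timing tools also account for effects such as skew that are not modeled here.

```python
def edge_to_edge_requirement(period_ns, launch="rising", capture="falling",
                             high_fraction=0.5):
    """Time available between a launch edge and the next capture edge.

    Assumes the clock is high for high_fraction of the period, starting
    at the rising edge.
    """
    high_ns = period_ns * high_fraction
    low_ns = period_ns - high_ns
    if launch == capture:
        return period_ns          # same edge: a full period is available
    # rising -> falling spans the high time; falling -> rising the low time
    return high_ns if launch == "rising" else low_ns


print(edge_to_edge_requirement(10))                       # 5.0
print(edge_to_edge_requirement(10, "rising", "rising"))   # 10
```

With a non-50-percent duty cycle the two half-cycle budgets differ, which is why the slide notes that the PERIOD constraint includes duty-cycle adjustments.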

Constraining Between Related Clock Domains
• Create a PERIOD constraint for one clock
  – Define all related clocks in terms of this PERIOD constraint
• The implementation tools will use the relationships to determine how to cross between clock domains
• DCM with multiple outputs:
  – Define a PERIOD constraint on the input to the DCM
  – The implementation tools will “push” the constraint onto each output
  – All constraints will be defined relative to the original PERIOD constraint
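As a rough illustration of what defining clocks "relative to the original PERIOD constraint" means, the sketch below derives output periods from one input period. The tools do this automatically for a DCM; the function name, the 10 ns input period, and the divide-by-2 setting are assumptions made for the example.

```python
def derived_period_ns(input_period_ns, multiply=1, divide=1):
    """Period of a derived clock.

    Output frequency = input frequency * multiply / divide,
    so the period scales by divide / multiply.
    """
    return input_period_ns * divide / multiply


input_period = 10.0  # assumed PERIOD constraint on the DCM input, in ns

# Hypothetical DCM configuration: CLK0 at 1x, CLK2X at 2x, CLKDV divided by 2.
outputs = {
    "CLK0": derived_period_ns(input_period),
    "CLK2X": derived_period_ns(input_period, multiply=2),
    "CLKDV": derived_period_ns(input_period, divide=2),
}
for name, period in outputs.items():
    print(f"{name}: {period} ns")
```

Because every derived period is expressed in terms of the one input constraint, the relationship between any two outputs is known, and paths between them can be analyzed without extra constraints.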

Constraining Between Unrelated Clock Domains
• In this example, the delay path between the two clock domains is NOT covered by either of the PERIOD constraints
  – This is the default behavior
• You must add a constraint to cover paths when crossing between related clock domains
  – Example: Same frequency, but CLK_B is phase shifted
• You must add a synchronization circuit when crossing between unrelated clock domains
[Figure: registers in the CLK_A domain (PERIOD CLK_A) and the CLK_B domain (PERIOD CLK_B), with output OUT1]

Constraining Between Unrelated Clock Domains
• To constrain the paths between the two clock domains (highlighted in gray)
  – Define groups of registers CLK_A and CLK_B with the Group by Nets option
    • Automatically done if you have specified a PERIOD constraint for both clock domains
  – Place a Slow/Fast Path Exception between the two groups of registers
[Figure: registers in the CLK_A and CLK_B domains, with a 5 ns constraint marked on the path between them; output OUT1]

Constraining Between Unrelated Clock Domains
• Step 1: Create the groups by using the Group by Nets option
  – Group by clock net
  – Skip this step if PERIOD constraints are defined
• Step 2: Create the constraint by clicking the Slow/Fast Path Exceptions button

Constraining Between Unrelated Clock Domains
• Enter a name for this constraint
  – Must begin with “TS”
• Select the groups that define the constraint
• Specify the value of the constraint
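The three inputs above (a name beginning with "TS", two groups, and a value) map onto a FROM:TO TIMESPEC line in the UCF file. The Python helper below is a hypothetical sketch that assembles such a line. The Constraints Editor writes the real entry, and its exact formatting may differ. The group names CLK_A and CLK_B and the 5 ns value are taken from the earlier example; the constraint name is invented.

```python
def from_to_timespec(name, source_group, dest_group, value_ns):
    """Build a FROM:TO TIMESPEC line for a UCF file (illustrative helper)."""
    if not name.startswith("TS"):
        raise ValueError("TIMESPEC names must begin with 'TS'")
    return (f'TIMESPEC "{name}" = FROM "{source_group}" '
            f'TO "{dest_group}" {value_ns} ns;')


line = from_to_timespec("TS_clka_to_clkb", "CLK_A", "CLK_B", 5)
print(line)  # TIMESPEC "TS_clka_to_clkb" = FROM "CLK_A" TO "CLK_B" 5 ns;
```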


Multi-cycle Path Constraints
• Multi-cycle paths occur when registers are not updated on consecutive clock cycles
  – Always at least one clock cycle between updates
  – Typically, the registers are controlled by a clock enable
• A prescaled counter is one example
  – Registers in COUT14 are updated every 4 clock cycles
  – Paths between these registers are multi-cycle paths
[Figure: prescaled counter. Block PRE2 (Q0, Q1) is clocked by CLK at 200 MHz; its TC output drives the CE input of block COUT14 (Q2 to Q15), which updates at 50 MHz]

Creating Multi-cycle Path Constraints
• Step 1: Create a global PERIOD constraint (not shown)
• Step 2: Create groups by using the Group by Nets option
  – Group by enable net
• Step 3: Click the Multicycle Paths button

Creating Multi-cycle Path Constraints
• Enter a TIMESPEC name
• Select the groups that were previously defined
• Define the constraint relative to the PERIOD constraint
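Defining the constraint relative to the PERIOD constraint amounts to multiplying the clock period by the number of cycles between register updates. A minimal sketch of that arithmetic, using the prescaled-counter numbers from this module (200 MHz clock, registers enabled every 4 cycles); the function name is hypothetical:

```python
def multicycle_requirement_ns(clock_mhz, cycles_between_updates):
    """Path requirement for registers that update once every N clock cycles."""
    period_ns = 1000.0 / clock_mhz
    return period_ns * cycles_between_updates


print(multicycle_requirement_ns(200, 1))  # 5.0  (ordinary single-cycle path)
print(multicycle_requirement_ns(200, 4))  # 20.0 (paths between COUT14 registers)
```

Expressing the value as a multiple of the PERIOD constraint, rather than as a fixed number, means the multi-cycle requirement tracks any later change to the clock period.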


False Paths
• The False Paths options will prevent constraints from being applied to specific paths
  – Define False Paths to reduce the number of constrained paths in your design

Defining False Paths
• Use the False Paths (FROM:TO:TIG) button to define false paths between groups of path endpoints
  – TIG = Timing IGnore
  – Prevents any constraints from being applied to the paths
• Paths through specific nets or 3-state buffers can be defined with the THRU points option
• What is wrong with this example?

Defining False Paths by Nets
• The False Paths by Nets option allows you to ignore timing constraints on a specific net
  – Any delay path containing the RESET net will not be constrained
• The Ignored TIMESPECs option allows specific constraints to be ignored
  – TS_P2P constraint will be ignored on paths containing the RESET net
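The effect of a false path by net can be pictured as a filter over the design's delay paths: any path that passes through an ignored net drops out of the constrained set. The sketch below is a conceptual model only. The path records, path names, and net names other than RESET are hypothetical, and the real tools work on the full netlist rather than on a list like this.

```python
def constrained_paths(paths, ignored_nets):
    """Return the paths that contain none of the ignored nets."""
    ignored = set(ignored_nets)
    return [p for p in paths if not ignored & set(p["nets"])]


# Hypothetical delay paths, each listed with the nets it passes through.
paths = [
    {"name": "reg_a -> reg_b", "nets": ["DATA0", "DATA1"]},
    {"name": "rst_reg -> reg_b", "nets": ["RESET"]},
    {"name": "reg_c -> reg_d", "nets": ["DATA2", "RESET"]},
]

remaining = constrained_paths(paths, ignored_nets=["RESET"])
print([p["name"] for p in remaining])  # ['reg_a -> reg_b']
```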


Miscellaneous Tab
• Assign individual registers to IOBs
• Mark asynchronous registers
  – Prevents “X” propagation during simulation
• Select nets to be routed on the Low Skew Resources
  – Use for high-fanout control signals
• Assign timing group elements to area groups for floorplanning
• FEEDBACK constraint for DCMs
• Define initial values for storage elements

Prorating Constraints
• Prorating allows the tools to use the most accurate information
  – The implementation tools use the worst-case operating temperature and voltage for your chosen device package (85 °C for Commercial, 100 °C for Industrial)
• Specify your own worst-case conditions
  – This will prorate the device delay characteristics to accurately reflect your worst-case system conditions

Timing Constraint Priority
(listed from highest to lowest priority)
• False Paths
  – Must be allowed to override any timing constraint
• FROM THRU TO
• FROM TO
• Pin-Specific OFFSETs
• Group OFFSETs
  – Groups of pads or registers
• Global PERIOD and OFFSETs
  – Lowest priority constraints
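The priority order can be read as a simple lookup: when several constraint types cover the same path, the one nearest the top of the list wins. The sketch below encodes that rule. The type labels are hypothetical shorthand for the slide's categories, and ties between constraints of the same type are not modeled.

```python
# Highest priority first, following the order on the slide.
PRIORITY_ORDER = [
    "FALSE_PATH",
    "FROM_THRU_TO",
    "FROM_TO",
    "PIN_OFFSET",
    "GROUP_OFFSET",
    "GLOBAL",
]


def winning_constraint(covering_types):
    """Pick the highest-priority constraint type among those covering a path."""
    return min(covering_types, key=PRIORITY_ORDER.index)


print(winning_constraint(["GLOBAL", "FROM_TO"]))                # FROM_TO
print(winning_constraint(["FROM_TO", "FALSE_PATH", "GLOBAL"]))  # FALSE_PATH
```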

Timing Constraint Interaction
• Whenever a path is covered by more than one constraint, the tools must choose which constraint to use for timing analysis
• If the constraints are of different types, the highest priority constraint is applied
• If the constraints are of the same type (Example: FROM TO), the decision is more complex
  – Can be dictated with the PRIORITY keyword in the UCF file
• To see where your constraints overlap, generate a Timing Specification Interaction (TSI) file
  – Under Properties for Post-Place & Route Static Timing Report, type in a filename
  – In the Timing Analyzer, select Analyze → Constraints Interaction


Skills Check

Review Question Background Information
• Prescaled 16-bit counter is created in two blocks
  – Q0 and Q1 in block PRE2 toggle at 200 MHz
  – Q[15:2] toggle every fourth clock edge (50 MHz)
  – The design is fully synchronous because all registers share the same clock
    • However, COUT14 registers are disabled 3/4 of the time so they do not have to meet a 200-MHz PERIOD constraint
[Figure: prescaled counter. Block PRE2 (Q0, Q1) is clocked by CLK at 200 MHz; its TC output drives the CE input of block COUT14 (Q2 to Q15), which updates at 50 MHz]

Review Questions
• What constraints need to be placed on this design to assure it will meet the performance objectives?
• How would you enter these constraints through the Constraints Editor?
• How do multi-cycle path constraints improve your design’s performance?
[Figure: prescaled counter with blocks PRE2 (Q0, Q1; 200 MHz) and COUT14 (Q2 to Q15; 50 MHz)]

Answers
• What type of constraints need to be placed on this design to assure it will meet the performance objectives?
  – Global PERIOD constraint of 5 ns (or 200 MHz)
  – Multi-cycle path constraint of 5 x 4 = 20 ns (or 200 / 4 = 50 MHz)
• How would you enter these constraints through the Constraints Editor?
  – PERIOD constraint: Use the Global tab
  – Multi-cycle path constraint:
    • Group the flip-flops in COUT14 by clock enable net (group name: MSB)
    • Constrain from MSB to MSB
• How do multi-cycle path constraints improve your design’s performance?
  – They allow the implementation tools to place some logic farther apart and use slower routing resources

Review Questions
• If a PERIOD constraint were placed on this design, what delay paths would be constrained?
• If the goal is to optimize the input and output times without constraining the paths between registers, what constraints are needed?
  – Assume that a global PERIOD constraint is already defined
[Figure: Status Register and Control Register on BIDIR_BUS(7:0) and BIDIR_PAD(7:0), with Control_Enable and Status_Enable]

Answers
• If a PERIOD constraint were placed on this design, what delay paths would be constrained?
  – Paths between the control registers and the status registers would be constrained
[Figure: Status Registers and Control Registers on BIDIR_BUS(7:0) and BIDIR_PAD(7:0), with Control_Enable and Status_Enable]

Answers
• If the goal is to optimize the input and output times without constraining the paths between registers, what constraints are needed?
  – Enter OFFSET constraints on the Global tab
  – Define False Paths By Nets
    • Select the BIDIR_BUS[7:0] nets
    • Select the global PERIOD constraint to be ignored
[Figure: Status Registers and Control Registers on BIDIR_BUS(7:0) and BIDIR_PAD(7:0), with Control_Enable and Status_Enable]

Summary
• Use a Slow/Fast Path Exception to constrain paths that cross between clock domains
• Identifying multi-cycle and false paths allows the implementation tools to make appropriate tradeoffs
  – These paths will use slower routing resources, which frees up fast routing for critical signals
• Prorating your operating conditions gives the tools the most accurate picture of your design environment
• In general, more-specific constraints have a higher priority than less-specific constraints

Where Can I Learn More?
• Timing Presentation on the Web: http://support.xilinx.com → Tech Tips → Timing & Constraints
• Constraints Guide: http://support.xilinx.com → Software Documentation
  – Documentation may also be installed on your local machine


Achieving Timing Closure with Advance Constraints Lab

Objectives
After completing this lab, you will be able to:
• Enter path-specific and I/O timing constraints by using the Constraints Editor
• Take steps to achieve timing closure

Lab Design: Correlate and Accumulate

Lab Overview
• Add path-specific timing constraints based on design knowledge
  – You will add some constraints
  – Other constraints will be copied from a file

General Flow
• Step 1: Create a group OFFSET constraint
• Step 2: Create multi-cycle path constraints
• Step 3: Create path exception constraints
• Step 4: Create I/O constraints
• Step 5: Implement and analyze timing


Improve Your Timing FloorPlanner

Objectives
After completing this module, you will be able to:
• Identify the Floorplanner windows
• Specify the Floorplanner flow
• Describe how to use area constraints
• Identify how PACE is used to specify area constraints
• Identify optimal pin layout

Outline
• Introduction
• Floorplanning Procedures
• Area Constraints & I/O Layout
• PACE
• Summary
• Appendix:
  – RPM Core
  – Overcoming MAP/PAR Limitations
  – Pseudo Guide File with Floorplanner
  – Additional I/O Considerations

What is the Floorplanner?
• Graphical tool used to display/edit design layout
  – Easy to review the results of implementation
[Figure: close-up of Virtex-II die]

When to Floorplan
• Use the Floorplanner to:
  – Increase productivity/design performance
  – View the layout of your implemented design
  – Partition design sub-systems into general areas on the die (area constraints/layout)
  – Make minor placement modifications
  – Create RPMs (Relationally Placed Macros)
• Use the Floorplanner carefully
  – Poor floorplanning can decrease design performance
  – The implementation tools cannot disregard a poor floorplan

Floorplanner Prerequisites
• Do not perform significant floorplanning unless you are very familiar with:
  – The design
  – The target device architecture
  – Xilinx software
• Without sufficient knowledge, it is suggested you try the following first
  – Use timing constraints
  – Increase Place & Route Effort Level
  – Specify → Perform Timing-Based Packing (Map) and extra-effort level (PAR)
  – Pipeline or redesign logic in critical paths
  – Use re-entrant routing or MPPR

Floorplanning Advantage
• Given sufficient knowledge…
  – In large and/or high performance designs, floorplanning/layout is an effective precursor to implementation
    • Provides guidance to the implementation tools on the layout of the design
    • Can help to reduce run time
    • Can help to increase performance
• Floorplanning is required for Incremental Design Techniques and Modular Design Techniques

Floorplanning Flow
[Figure: floorplanning flow through NGDBUILD, MAP, and PAR, with the Floorplanner alongside; files shown: edn, ngc, ucf, ngd, fnf, ncd, pcf, ncf]

Floorplanner Versus PACE
• PACE:
  – Easiest tool for specifying pin placement constraints and area constraints
• Floorplanner:
  – More advanced tool with placement capabilities beyond that of PACE
    • Create groups of logic
    • Constrain logic to a specific location (hard location constraints - hard LOC)
    • Constrain logic from a current placement (hard LOC)
    • View and edit placed design
    • Perform packing of logic resources
    • Used for cross-probing with Timing Analyzer
  – PACE is much better for specifying pin constraints
  – Specifying area constraints is similar to PACE

Main Floorplanner Windows
• Placement: Shows the current design layout from the implementation tools
• Design Hierarchy: Displays color-coded hierarchical blocks. Traverse hierarchy to view any component in the design
• Design Nets: Lists all the nets in the design
• Floorplan (in back): Shows current placement constraints and design edits

Viewing the Device
• Click View → Options, or click the Toggle Resources button to display device resources
  – Function Generators and RAM
  – Flip-flops and latches
  – Three-state buffers
  – I/O pads and global buffers
• Row and column numbers are displayed for easy reference

Locating Logic and Nets
• Use the Edit → Find command
• Filters help you narrow your search
  – Logic type (flip-flops, I/O pins, nets, etc.)
  – Status (floorplanned, not floorplanned, selected, etc.)
  – Connections (driving selected logic, sourcing selected logic, etc.)

Viewing Connectivity
• View connectivity by components
  – Click Edit → Preferences → Ratsnest Tab. Check Display nets connected to selected logic
  – Select a component or group of logic in the Hierarchy window
• Connections are shown in the Placement and Floorplan windows

Package View
• To view the package pins, click View menu → Package Pins
• You can view the bottom view or the top view
• PACE provides a more complete package view for more beneficial pin placement
  – All dual-purpose and special pins are identified

Timing Analyzer Cross-Probing
• The Timing Analyzer and Floorplanner can be used together to cross-probe paths
  1. Click on a path in Timing Analyzer
  2. The path appears in Floorplanner

Outline
• Introduction
• Floorplanning Procedures
• Area Constraints & I/O Layout
• PACE
• Summary
• Appendix:
  – RPM Core
  – Overcoming MAP/PAR Limitations
  – Pseudo Guide File with Floorplanner
  – Additional I/O Considerations

Floorplanning Procedures
• Creating groups of logic
• Constraining logic to a specific location
• Constraining from the current placement
• Locking I/O pins
• Area constraints (covered in the next section)

Creating Groups of Logic
• When the design is loaded, logic is automatically grouped according to the design hierarchy
• Create your own groups of logic in two ways:
  – Select the logic and use the command Hierarchy → Group
  – Use the command Hierarchy → Group By to select and group logic

Changing Group Colors
• Use color to identify parts of your design easily
• Select a group of logic
• Use the command Edit → Colors
• Choose a new color
• Click Apply

Constraining Logic to a Specific Location
• Select the method in which you would like the logic to be distributed
  – Distribute One at a Time drops each component individually
  – Up, Down, Left, or Right quickly shapes the logic into a row or column
• Pick up the logic by clicking the icon in the Hierarchy window
• Move the cursor in the Floorplan window
• Click to place the logic
  – Valid locations are highlighted

Moving Logic
• To move logic that is already placed in the Floorplan window
  – Select the logic that you want to move (in the Hierarchy, Placement, or Floorplan window)
  – Click the logic to pick it up
  – Move the cursor to a new location
  – Click to place the logic
• To remove logic from the Floorplan window
  – Select the logic you want to remove
  – Press or move the logic back into the Hierarchy window

Block RAM Placement
• The placement algorithm for block RAM does not always result in an optimal placement with its source and load
• Consider placing most (if not all) of your block RAMs
  – Block RAM placement can be very critical to the timing of your design
• Hand-place them to use the flow of the device wisely
  – Horizontal data flow, carry chain runs up
• Discussed further in the Area Constraints section
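A hand-placed block RAM typically ends up in the UCF as a LOC constraint on the instance. The fragment below is a minimal sketch only: the instance name `u_fifo/ram0` and the site coordinates are hypothetical placeholders, not taken from the course design.

```ucf
# Hypothetical instance name and block RAM site, for illustration only
INST "u_fifo/ram0" LOC = RAMB16_X1Y5;
```

Choosing a site in the column nearest the logic that sources and loads the RAM is the intent; the right coordinates depend on the device and the design.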

Constraining From the Current Placement
• Use these commands when you want to make minor layout changes
• To constrain selected logic in the Placement window:
  – Select the logic that you want to constrain from the Placement window
  – Use the command Floorplan → Constrain from Placement
  – The layout for the selected logic is copied from the Placement window into the Floorplan window
  – Make changes to the logic placement in the Floorplan window
• To copy the entire Placement window, use the command Floorplan → Replace All with Placement

Locking I/O Pins
• For high-speed, complicated, and large I/O designs, Xilinx suggests you manually lock I/O
  – Use PACE (recommended)
  – Use the Constraints Editor (Ports tab)
  – Lock the pins based on the pinout from the implementation tools:
    • From ISE Project Navigator: expand Implement Design, expand Place & Route, double-click Back-annotate Pin Locations
    • From Floorplanner:
      – Use the Edit → Find command to select all I/O pads
      – Use the Floorplan → Constrain from Placement command
      – Make adjustments, if needed, and save
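Whichever tool is used, locked pins are recorded in the UCF as LOC constraints on the top-level nets. A minimal sketch, assuming hypothetical net names and pin numbers (substitute the back-annotated pinout of the actual design):

```ucf
# Hypothetical net names and package pins, for illustration only
NET "clk"        LOC = "AB12";
NET "data_in<0>" LOC = "C4";
NET "data_in<1>" LOC = "C5";
```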


I/O Location Constraints
• For high-speed designs, complex designs, and designs with a large number of I/O pins, Xilinx recommends manual placement of I/O
  – Guides the internal data flow
  – The implementation tools can place logic and pins, but this does not always result in an optimal placement
• Poor pin placement can reduce the chances of your design meeting your performance objectives
• Making good pin assignments requires detailed knowledge of the design functionality and the Xilinx architecture
• Pin assignments must also comply with the silicon's capabilities
  – Assignments must follow the I/O banking rules and the pre-grouping of the differential I/O pins
  – Clock pin assignments affect clock region access and shared input pairs
  – Take advantage of internal data flow

Pin Constraints
• Clocks should be constrained to dedicated clock pins
  – Or clock pin pairs for differential clocks
  – Keep in mind the global clock buffer limitations
    • Eight global clocks or eight clocks total into each clock region
    • Rules previously described
• Use dual-purpose pins last
  – For example, configuration and DCI pins
  – This will help to reduce contention during board power-up or when the FPGA is reconfigured on demand
  – PACE or the Xilinx Constraints Editor can be used to prohibit configuration pins
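Prohibiting a pin is expressed in the UCF with a CONFIG PROHIBIT constraint. The sketch below assumes two hypothetical pin names standing in for dual-purpose configuration pins; the real pin names come from the device's package pinout.

```ucf
# Hypothetical pin names; keep the tools from assigning user I/O to these sites
CONFIG PROHIBIT = A3;
CONFIG PROHIBIT = B3;
```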

Internal Logic Layout
• Horizontal data flow with vertical bus alignment
  – Carry logic runs vertically
  – Bidirectional data bus longlines run horizontally
    • 3-state enable lines run vertically
• Control lines (CE, resets, etc.) are generally driven on vertical long lines
[Figure: datapath from Bit 7 to Bit 0, with blocks A+B, C+D, E+F and registers C_REG, E_REG, G_REG]

Layout for Smaller FPGAs
General guidelines for chips with 100K system gates or less:
• I/O for control signals on the top or bottom
  – Signals are routed vertically
• I/O for data buses on the left or right
• Internal layout favors horizontal data flow
  – Align area blocks to flow horizontally
  – Allow enough room for carry chains
  – Place block RAMs appropriately to align with arithmetic logic
[Figure: die outline labeled Control Signals, Data Buses, and Data Flow]

Layout for Larger FPGAs
For chips with 500K system gates or more:
• Use the same guidelines as you would use for smaller chips
• In addition, consider the following:
  – Group control signals and data buses near related internal logic
  – High-fanout signals may be placed near the middle of the chip, for easy access to horizontal long lines
[Figure: die outline labeled Control Signals and Data Flow]

Data Bus Layout
• Arithmetic functions with more than five bits typically utilize carry logic
• Carry chains require specific vertical orientation
  – Affects both internal and I/O layout
[Figure: vertical carry chain labeled MSB and LSB]

Interleaved Bus Layout
• Arithmetic functions involving two or more buses will benefit from interleaved pin constraints
  – For example:
    • C <= A + B; or C <= A * B;
[Figure: interleaved pin order B(3) A(3) B(2) A(2) B(1) A(1) B(0) A(0)]
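The interleaved ordering above can be sketched as UCF pin constraints. The pin names here are hypothetical and are simply assumed to be physically adjacent along one edge of the package; check the package drawing before reusing the pattern.

```ucf
# Hypothetical, physically adjacent pins; bit 3 (MSB) first, bit 0 (LSB) last
NET "B<3>" LOC = "D1";
NET "A<3>" LOC = "D2";
NET "B<2>" LOC = "E1";
NET "A<2>" LOC = "E2";
NET "B<1>" LOC = "F1";
NET "A<1>" LOC = "F2";
NET "B<0>" LOC = "G1";
NET "A<0>" LOC = "G2";
```

Pairing A(n) with B(n) keeps both operands of each result bit close to the slice that computes it.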

Area Constraints (a.k.a. Layout, a.k.a. AREA_GROUPs)
• Easiest and most effective application of floorplanning
• Preferred method of floorplanning for synthesis users and large designs
  – Individual component names change often during synthesis, but hierarchical block names remain constant
    • For this to be effective, you must retain hierarchy during synthesis
• Each sub-section of a large design can be constrained to an area
• Area constraints allow you to provide guidance while still giving the implementation tools freedom
• This is the primary floorplanning methodology for use with incremental design techniques

Area Constraints
1. Select the area group you want to constrain
2. Click the Assign Area button
3. Click and drag to define the area constraint
  – Floorplanner estimates the required area and will not allow you to select an area that is too small

Area Constraint Compression
• When assigning area constraints, it may be helpful to apply a compression factor for mapping
  – This is equivalent to the global -c option applied in MAP, but it only applies to an individual AREA_GROUP
• Right-click the AREA_GROUP and select Edit Constraints
  – Constraint: Compression
  – Value: <%>
    • <%> represents the percentage of logic resources in that area constraint available for packing
• Click OK

Area Constraints in UCF
• Floorplanner (and PACE) area constraints create AREA_GROUP and RANGE constraints in the UCF
  – AREA_GROUP constraints bind instances into a group
  – RANGE constraints assign an AREA_GROUP to an area on the die
  – Syntax:
    • INST <instance_name> AREA_GROUP = AG_<instance_name>;
    • AREA_GROUP AG_<instance_name> RANGE = SLICE_XnYm:SLICE_XnYm;
    • AREA_GROUP AG_<instance_name> COMPRESSION = <percent>; # if applied
  – For example:
    • INST data_control_inst AREA_GROUP = AG_data_control_inst;
    • AREA_GROUP AG_data_control_inst RANGE = SLICE_X0Y47:SLICE_X27Y32 ;
    • AREA_GROUP AG_data_control_inst COMPRESSION = 90;

RANGE Constraints
• RANGE constraints are written for slices, block RAM, multipliers, and 3-state buffers
  – A separate RANGE constraint is written for each of these logic types
    • Slices:
      AREA_GROUP "AG_data_control_inst" RANGE = SLICE_X0Y47:SLICE_X27Y32 ;
    • Block RAMs:
      AREA_GROUP "AG_data_control_inst" RANGE = RAMB16_X1Y18:RAMB16_X3Y15 ;
    • Block multipliers:
      AREA_GROUP "AG_data_control_inst" RANGE = MULT18X18_X1Y18:MULT18X18_X3Y15 ;
    • Three-state buffers:
      AREA_GROUP "AG_data_control_inst" RANGE = TBUF_X0Y15:TBUF_X10Y9 ;

Multiplier and Block RAM RANGE Constraints
"There's only room for 4 block RAMs... but I need room for 8! What do I do?"
• In this situation, you are only interested in constraining the slices and 3-state buffers
  – Hand-place the block RAMs and block multipliers

Multiplier and Block RAM RANGE Constraints
• Comment out the block RAM and multiplier RANGE constraints in the UCF
• Hand-place the block RAMs and multipliers for optimal placement and timing
• This provides the best overall approach to floorplanning
  – Control placement of block RAMs and multipliers through hand placement
  – Control placement of slice logic and 3-state buffers through area constraints
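Applied to the AG_data_control_inst example used earlier, this approach might look like the fragment below. It is a sketch under assumptions: the RANGE lines reuse the coordinates shown earlier, while the instance names `ram0` and `mult0` and their sites are hypothetical.

```ucf
# Slice and 3-state buffer ranges stay active
AREA_GROUP "AG_data_control_inst" RANGE = SLICE_X0Y47:SLICE_X27Y32 ;
AREA_GROUP "AG_data_control_inst" RANGE = TBUF_X0Y15:TBUF_X10Y9 ;

# Block RAM and multiplier ranges are commented out
# AREA_GROUP "AG_data_control_inst" RANGE = RAMB16_X1Y18:RAMB16_X3Y15 ;
# AREA_GROUP "AG_data_control_inst" RANGE = MULT18X18_X1Y18:MULT18X18_X3Y15 ;

# Hand placement instead (hypothetical instance names and sites)
INST "data_control_inst/ram0"  LOC = RAMB16_X1Y18;
INST "data_control_inst/mult0" LOC = MULT18X18_X1Y18;
```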


Pinout and Area Constraints Editor (PACE)
• PACE is a tool used to create pinout and area constraints
• Constraints are written to the UCF file
• The flow, look, and use are very similar to those of Floorplanner
  – It does not have all the capabilities of Floorplanner

Pinout and Area Constraints Editor (PACE)
• Design Hierarchy: Displays color-coded hierarchical blocks
• Package Pins: Allows pin LOC specifications
• Package Pins Legend
• Design Object List: Displays elements contained in the group that is selected in the Design Hierarchy window
• Device Architecture: Allows area constraint specification

Pin Constraints
• Pin constraints are most easily made using PACE
  – Drag and drop I/O to the Package Pins window
  – Specify I/O options in the Design Object List window

PACE Features
• To prohibit pin sites or slice sites, use the Prohibit icon in the toolbar
• To allow the use of prohibited sites, use the Allow icon
• Package migration: When a design may move to a different package, PACE writes out prohibit constraints on incompatible pins so that you can avoid reassigning I/Os
  – IOB → Make Pin Compatible With...

Area Constraints
1. Select the area group that you want to constrain
2. Click the Assign Area button
3. Click and drag to define the area constraint
  – PACE estimates the required area and will not allow you to select an area that is too small

Area Constraint Compression
• When assigning area constraints, it may be helpful to apply a compression factor for mapping
  – This is equivalent to the global -c option applied in MAP, but it only applies to an individual AREA_GROUP
• Click Areas menu → Edit Constraints, then enter Constraint: Compression, Value: <%>
  – <%> represents the percentage of logic resources in that area constraint available for packing
• Press Enter, then click OK


Review Questions
• Will the implementation tools override any manual edits that decrease performance of your design?
• What is Floorplanner beneficial for?
• What is the easiest and most beneficial application of floorplanning?
• After which implementation steps can you perform floorplanning?
• After creating a new floorplan, which phases of implementation need to be run again?
• Describe an optimal pin layout for Virtex-II/Spartan-3 devices

Review Questions
• Specify the windows of the Floorplanner and explain how they are used

Answers
• Will the implementation tools override any manual edits that decrease performance of your design?
  – No

Answers
• What is Floorplanner beneficial for?
  – In large and/or high-performance designs
    • Provides guidance to the implementation tools on the layout of the design
    • Can help to reduce run time
    • Can help to increase performance
  – Floorplanning is required for IDT and MDT
  – Increase productivity/design performance
  – View the layout of your implemented design
  – Partition design sub-systems into general areas on the die (area constraints/layout)
  – Make minor placement modifications
  – Create RPMs (Relationally Placed Macros)

Answers
• What is the easiest and most beneficial application of floorplanning?
  – Area Constraints (a.k.a. Layout, a.k.a. AREA_GROUPs)
• After which implementation steps can you perform floorplanning?
  – Translate
  – MAP
  – Place & Route
• After creating a new floorplan, which phases of implementation need to be run again?
  – Translate, MAP, Place & Route, Configuration

Answers
• Specify the windows of the Floorplanner and explain how they are used
  – Design Hierarchy: Displays color-coded hierarchical blocks. Traverse the hierarchy to view any component in the design
  – Design Nets: Lists all nets in the design
  – Floorplan (in back): Shows the current placement constraints and design edits
  – Placement: Shows the current design layout from the implementation tools

Answers
• Describe an optimal pin layout for Virtex-II/Spartan-3 devices
  – Place data to flow horizontally across the device
  – Place data, addresses, etc. for proper alignment with the carry chain
    • LSBs below MSBs, flowing up

Summary
• Take care when editing design placement because the implementation tools cannot change a poor floorplan
• Floorplanning can be an effective method of providing guidance to the implementation tools
• Area constraints give the implementation tools guidance but provide the most flexibility to meet your goals
• I/O pin layout guidelines:
  – Data buses on the left and right of the die
  – Control signals on the top and bottom
  – MSB of buses at the top, LSB at the bottom

Where Can I Learn More?
• Virtex-II, Virtex-II Pro, and Spartan-3 User Guides and Data Book
  – Details on device architecture
  – http://support.xilinx.com → Documentation
• Online Software Manuals → Constraints Guide → Constraint Entry → Floorplanner
• Floorplanner → Online Help


Reusable RPM Core
• Floorplanner can be used to create cores that retain relative placement and, therefore, performance and timing repeatability
  – It allows you to transform a design (or design block) into a Relationally Placed Macro (RPM) core that can be reused via instantiation in any design

Reusable RPM Core Steps
Step 1: Define the logical function of your core in HDL
Step 2: Synthesize your design to create an EDIF netlist, such as my_core.edf
  – Disable I/O insertion and clock buffer insertion
Step 3: Process the EDIF netlist using NGDBuild to produce an NGD file, such as my_core.ngd
Step 4: Load your design (NGD or NCD) into the Floorplanner, and shape the relative placement of the design
Step 5: Save your design using the Write RPM to NCF command in the File menu

Relationally Placed Macro
• What is a Relationally Placed Macro (RPM)?
  – An RPM uses relative location constraints to constrain logic relationally to other logic
  – This helps to keep associated logic close together to reduce routing delays
• RPMs can be created on BELs, COMPs, and hierarchy
  – Before COMPs can be relationally placed, BELs must be relationally placed
  – Before hierarchy can be relationally placed, COMPs must be relationally placed
• You must be aware that too many RPM blocks can significantly hinder the placement tools
  – They can, in fact, prohibit the tools from finding a placement
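Relative location constraints can be sketched with RLOC and U_SET constraints. This is an illustration only, assuming Virtex-II style XmYn coordinates; the instance names and the set name `my_core_set` are hypothetical.

```ucf
# Hypothetical instances; U_SET places both flip-flops in one RPM set,
# and RLOC gives each a position relative to the set origin
INST "my_core/ff0" U_SET = my_core_set;
INST "my_core/ff0" RLOC  = X0Y0;
INST "my_core/ff1" U_SET = my_core_set;
INST "my_core/ff1" RLOC  = X0Y1;
```

The placer is free to move the set as a whole, but ff1 stays one slice row above ff0.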


MAP Limitations
• RPM sourcing locked flip-flops
• Constrained by packing problems

RPM Sourcing Locked Flip-Flops
• Relationally Placed Macros are components that have been assigned a specific shape with RLOC constraints
  – Synthesized arithmetic components larger than six bits may be RPMs
  – All RPMs in your design are listed in Section 7 of the MAP report
• In this example, an 8-bit incrementer/decrementer is driving 8 flip-flops to create a counter
[Figure: 8-Bit Inc-Dec driving 8 Flip-flops]

RPM Sourcing Locked Flip-Flops
• When the flip-flops are physically constrained with LOC or RLOC, MAP is unable to pack the Inc-Dec and flip-flops together
  – Creates extra delay in the datapath
[Figure: 8-Bit Inc-Dec and 8 Flip-flops placed separately]

RPM Sourcing Locked Flip-Flops: Workarounds
• Remove the constraints from the flip-flops
  – Now MAP can pull the flip-flops into the same slice with the RPM
• Use the Floorplanner to place both components into the same slice
[Figure: 8-Bit Inc-Dec and 8 registers]

Design Considerations
• When locking down flip-flops, make sure you know what is sourcing them
• If the flip-flops are sourced by an RPM, you should lock down the RPM instead of the flip-flops
• If the flip-flops are sourced by non-RPM logic, then it is up to you to determine whether you want to lock down the logic
  – In many cases, it should NOT be necessary to lock down the non-RPM logic. MAP will pack this logic into the appropriate slice

Constrained By Packing
• Flip-flop A is constrained to Slice_R1C1 because it drives an output on the NW corner of the die
• Flip-flop B is unconstrained and drives an output on the SE corner of the die
• MAP is allowed to pack unconstrained logic together with constrained logic
• Result: Bad placement for flip-flop B
[Figure: before and after MAP; flip-flops A and B end up packed together in Slice R1C1]

Constrained by Packing: Workarounds
• Avoid partially filled slices (Recommended)
  – Constrain all the components (LUTs, FFs) of a locked slice
  – You can use the Floorplanner to fill up the slice
• Set the XIL_MAP_LOC_CLOSED environment variable to TRUE
  – Tells MAP not to pack unconstrained logic into slices with constrained logic
  – Can cause an increase in resource utilization if many slices are only partially filled
• Use the Floorplanner to lock flip-flop B in the example
• Use the map -timing implementation option
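Locking flip-flop B as well as flip-flop A could be written in the UCF roughly as follows. This is a sketch with hypothetical instance names and Virtex-II style slice coordinates (the slide itself uses the older row/column naming); the coordinates merely stand in for opposite corners of the die.

```ucf
# Hypothetical instance names and slice sites, for illustration only
INST "ff_a" LOC = SLICE_X0Y95;   # near the output that A drives
INST "ff_b" LOC = SLICE_X47Y0;   # near the output that B drives
```

With both flip-flops constrained, MAP no longer has an unconstrained B to pack into A's slice.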

Interleaving Logic
• Interleaving spreads out the resources associated with a bus
  – Other logic can be interspersed with the bus
• MAP and PAR will not automatically interleave logic
  – You must manually interleave logic with RLOCs or the Floorplanner

Interleaving Example: Related Buses
• A and B input pins are interleaved
  – Skew between bits of interrelated buses is minimized
  – Nets do not cross each other
[Figure: pins B1, A1, B0, A0 feeding LUT and FF pairs]

VHDL:
  process (clk)
  begin
    if (clk'event and clk = '1') then
      Q <= (A AND B);
    end if;
  end process;

Verilog:
  always @(posedge CLK)
    OUT <= A & B;

Interleaving Example: Dual-Port RAM
• Dual-port RAM produces 1 bit of output per slice
• Arithmetic functions and registers produce 2 bits per slice
• Placing the RAM in a tall column can cause routing problems and skew
• Horizontally interleave the dual-port RAMs to line up with other functions
[Figure: Dual-Port RAM bits arranged in two columns (6 7, 4 5, 2 3, 0 1) beside Arithmetic Function bit pairs 6-7, 4-5, 2-3, 0-1]

Working with Patterns
• The Capture Pattern and Impose Pattern commands allow you to create a template pattern of placed logic that can be used later when placing similar logic
  – Multiple instances of the same hierarchical block
  – Components along a datapath
• Procedure:
  – Select a template group of logic, and click the Capture Pattern button
  – Select a new group of logic to impose the pattern upon, and click the Impose Pattern button
    • The new group of logic must have the same number and type of components as the template group
  – Place the new logic in the Floorplan window


Guide Introduction
• The Floorplanner can be used to create a pseudo guide file to guide future implementations
• What is a guide file?
  – A guide file is a file from a previous implementation that is used to guide the placement and/or routing of future implementations
  – Guiding of logic is based on reusing named elements (FFs, latches, RAMs, ROMs, LUTs, BUFTs, etc.) from a previous implementation in a new one
• What is the purpose of using a guide file?
  – To preserve previous good results
  – To reduce run time

Skills Check


Guide Questions
• What are the prerequisites for using a Floorplanner guide file?
• When would it be appropriate to use a Floorplanner guide file?
• Which elements would be appropriate to floorplan (i.e., which elements are least likely to change)? Why?

Guide Answers
• What are the prerequisites for using a Floorplanner guide file?
  – Preservation of hierarchy
• When would it be appropriate to use a Floorplanner guide file?
  – Small changes to the overall design (generally, less than ten percent)
• Which elements would be appropriate to floorplan (i.e., which elements are least likely to change)? Why?
  – Synchronous elements: FFs, latches, RAMs
  – Guide files are effective if names remain the same. Names of synchronous elements usually do not change from one implementation to the next as long as the hierarchy does not change. LUT names (because they are usually machine generated) change almost entirely

Floorplanned Guide File
Step One: Select synchronous elements
• From the Floorplanner window, click Edit → Find
• Enter * (star wildcard) in the Name text box
• Choose Flip-Flops (or memory) in the Type drop-down box
• Click Find
• Click Select Found
• Click Close

Floorplanned Guide File
Step Two: Constrain selected logic
• From the Floorplanner window, click Floorplan → Constrain from Placement
• Save and exit Floorplanner

Expected Guide Results
• Keep in mind with this flow that you are preserving only the placement of the synchronous elements
  – In addition to not placing any LUT logic, you have not retained the routing
  – You are only guiding the placement of LUT logic based on the location of the synchronous elements
• You can expect the results to be similar
  – You should not expect the results to match entirely. Why?
    • Because of the changes that were made to the design


Device Migration Considerations
• Virtex-II and Virtex-II Pro pinouts were created to allow for migration to a larger or smaller device
• For a larger device, the number of reference voltage pins will increase (Vcco and Vref)
  – Those pins will need to be connected on the PCB, as if you are planning for the larger device
    • Vref pins connected to reference supply voltage
      – That is, not used as user I/O
      – Listed as No Connect (NC) for smaller devices
    • Vcco pins connected to reference supply voltage

Package Migration Considerations
• Specific packages allow migration to a larger or smaller package
  – The FG456 and FG676 packages are pinout compatible
  – The FF896 and FF1152 packages are pinout compatible
• Pin definitions remain nearly identical
  – Some Vref pins in larger packages are user I/O in smaller packages
  – LVDS pairs are different
  – Some user I/O are in different banks

Planning Board Layout
• Due to the board-level layout of devices, some internal layout tips may not be obeyed
  – For example, for the board shown here, interface signals between the two FPGAs may not allow for the optimal internal layout
• Attempt to take into account the optimal internal layout when performing board-level layout

I/O Layout: Reducing Ground Bounce
• Simultaneously Switching Outputs (SSO) are the main cause of ground bounce
• Review the Ground Bounce Recommendation Charts in the Data Book
  – This chart describes the maximum number of SSO pins per power and ground pair for Virtex-II devices
  – Do not exceed these ratings if possible
  – If you do, you may need to modify your design, your pin assignments, or your system to reduce the risk of suffering from ground bounce
    • Some possible solutions are described in the notes
• PACE will help you review SSO via the command Tools → SSO Analysis


Incremental and Modular Design Techniques

Objectives
After completing this module, you will be able to:
• Describe incremental design
• Outline the incremental design flow
• Describe modular design
• Outline the modular design flow
• Identify incremental and modular design prerequisites
• Describe how a guide file is used

Outline
• Incremental Design Introduction
• Modular Design Introduction
• Incremental Design Flow
• Modular Design Flow
• IDT/MDT Prerequisites and Synthesis
• Guide Files
• Summary

What is Incremental Design?
• Incremental design is used to make iterative design changes to one sub-block and preserve the placement and routing of unchanged blocks
[Figure: design hierarchy. Top-Level Block (Top) contains Sub-Blocks A, B, and C; Block A contains A1 and A2; Block B contains B1 and B2]

Why Use Incremental Design?
• The first expectation is that the recompiled design must meet timing
  – Assuming the original design met timing
• The second expectation is that the recompile (synthesis, place & route) is faster than a full compile
  – Allowing you to increase the number of bug fixes per day
• The basic concept is to maintain "good" results
  – In synthesis, you resynthesize only the changed blocks
  – Then the implementation (place & route) tools run placement and routing on only the changed portions of your design
    • Maintaining the placement and routing of unchanged blocks

How?

  • Resynthesize only the changed block (Block A2)
  • Use the previous implementation information, the MAP and PAR NCD files, as guide files
      – The guide file will match all previous routing and placement information for the unchanged blocks and redo placement and routing for the changed block (Block A2)

[Figure: design hierarchy (Top; Blocks A, B, C; sub-blocks A1, A2, B1, B2)]

IDT Case Study 1

  • Case 1: Packet processing design
      – Device: XC2V500 -4 FG456
      – Utilization:
          • 92% of slices
          • 67% of FFs
      – Performance objectives:
          • 25 MHz clock
      – Incremental design techniques results:
          • Implementation time before IDT implemented: ~1.5 hours
          • Implementation time after IDT implemented (after each incremental design change): ~20 minutes

IDT Case Study 2

  • Case 2: Routing algorithm
      – Device: XC2V3000 -4 FF1152
      – Utilization:
          • 85% of slices
          • 59% of FFs
      – Performance objectives:
          • 112 MHz clock
      – Incremental design techniques results:
          • Implementation time before IDT implemented: ~8 hours
          • Implementation time after IDT implemented (after each incremental design change): 2.5 hours
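
The two case studies can be compared with a little arithmetic. The sketch below takes the implementation times quoted above (the "~" figures are approximate, so the ratios are too) and computes the speedup and the time saved per design change. The dictionary layout and function name are illustrative only.

```python
# Implementation times from the two IDT case studies, in minutes.
# Values marked "~" on the slides are approximate.
case_studies = {
    "Case 1 (XC2V500, packet processing)": (1.5 * 60, 20),
    "Case 2 (XC2V3000, routing algorithm)": (8 * 60, 2.5 * 60),
}


def speedup(full_minutes, incremental_minutes):
    """Return how many times faster the incremental run is than a full run."""
    return full_minutes / incremental_minutes


for name, (full, incremental) in case_studies.items():
    saved = full - incremental
    print(f"{name}: {speedup(full, incremental):.1f}x faster, "
          f"{saved:.0f} minutes saved per change")
```

In both cases the incremental run takes roughly a quarter to a third of the full implementation time, which is what allows more bug fixes per day.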


What is Modular Design? (MDT)

  • Modular design is used for a system on a chip, where individual teams fully implement their design block(s) independently of other blocks

[Figure: design hierarchy (Top; Blocks A, B, C; sub-blocks A1, A2, B1, B2) with the blocks divided among Team A, Team B, and Team C]

Why Use Modular Design?

  • Team-based design
      – Designers want to:
          • Implement the design in a parallel, piece-wise fashion: divide and conquer
          • Optimize each block separately, tuning each toward its unique design goals
              – Focus on smaller blocks
          • Achieve perfectly repeatable, predictable results on unchanged modules: lock down and forget
              – If this is your only goal, incremental design techniques are likely to provide a more beneficial solution than modular design
          • Save time

A Consistent Spectrum of Flows: Divide and Conquer Methodologies

Area Groups
  • Simplest
  • Changes still “uncontrolled”

Incremental Design
  • Developed for quick runtimes for small design changes
  • Simple flow requirements

Modular Design
  • True bottom-up flow: modules implemented separately
  • Most complex
  • Module TSPECs are required

How?

  • Create a layout that identifies where to place each design block on the die
  • Each block is fully implemented independently of the others
  • All blocks are combined using their individual implementation information
      – Via guided implementation files (NCD)

[Figure: design hierarchy (Top; Blocks A, B, C; sub-blocks A1, A2, B1, B2) next to a chip layout with separate regions for Block A, Block B, and Block C]

Modular Design Tools

  • The modular design tools are available as part of all ISE toolsets starting with 5.2
  • Currently, modular design tools can only be used via the command line

Skills Check

Review

  • Describe how Incremental Design Techniques differ from Modular Design Techniques

Answer

  • Describe how Incremental Design Techniques differ from Modular Design Techniques
      – Incremental Design Techniques: used for rapid bug fixes where changes are limited to a single design block
      – Modular Design Techniques: used for team-based designs where individual design blocks are implemented independently by individual teams


Incremental Design Flow

Incremental Design: Step 1

1. Partition boundaries to match the (incremental) blocks that you would like to preserve
      – The synthesis tool must not be allowed to optimize across these boundaries
      – XST, LeonardoSpectrum (LS), Synplify pre-7.2: individual netlists for each incremental block
      – Synplify using MultiPoint synthesis: one netlist for the entire design
      – Typically, you would create between two and eight blocks

Incremental Design: Step 1

1. In this example, we have circled the sub-blocks to which we want to apply changes incrementally
      – Each circled block indicates an incremental block
          • Generally results in an individual netlist
          • Each incremental block will require floorplanning

[Figure: design hierarchy (Top; Blocks A, B, C; sub-blocks A1, A2, B1, B2) with the incremental blocks circled]

Incremental Design: Step 2

2. Synthesize the design
      – Synthesize each incremental design block
          • Generally, write out individual netlists (EDIF, NGC) for each incremental block
          • For incremental block netlists, you need to disable I/O insertion (I/O pads and buffers)
              – Apply synthesis attribute: do not insert clock buffers

Incremental Design: Step 3

3. Use the Floorplanner (or PACE) to lay out area constraints for each of the preserved blocks
      – Use non-overlapping area constraints
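
The non-overlap requirement in Step 3 is easy to check before running implementation. The following is a minimal sketch, assuming each area constraint can be modeled as an inclusive rectangle of slice coordinates. The group names and ranges are hypothetical, and the script does not read a UCF file.

```python
from itertools import combinations

# Each area constraint is modeled as an inclusive rectangle of slice
# coordinates: (x_min, y_min, x_max, y_max). Names and ranges are hypothetical.
area_groups = {
    "AG_blockA1": (0, 0, 15, 31),
    "AG_blockA2": (16, 0, 31, 31),
    "AG_blockB": (24, 16, 47, 47),  # deliberately overlaps AG_blockA2
}


def overlaps(a, b):
    """Return True if two inclusive rectangles share at least one slice."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1


def find_overlaps(groups):
    """Return every pair of area group names whose rectangles overlap."""
    return [
        (name1, name2)
        for (name1, rect1), (name2, rect2) in combinations(groups.items(), 2)
        if overlaps(rect1, rect2)
    ]


print(find_overlaps(area_groups))
```

An empty list means the floorplan satisfies the non-overlapping rule under this simplified rectangle model.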

Incremental Design: Step 4

4. Implement the design until timing objectives are met
      – Using the floorplan information contained in the UCF
      – This may require several implementation iterations with a high effort level
          • Additionally, it may require the use of Multi-Pass Place and Route (MPPR)
      – Once this step is complete, we want to maintain these “good” results
          • Utilize incremental design techniques

Incremental Design: Step 5

5. As changes are required, make changes incrementally: one module at a time
      – Resynthesize only the incremental block that changed

Incremental Design: Step 6

6. Incremental implementation
      – Reimplement using guide files (NCD)
          • As a guide, apply both the previous MAP NCD file and the PAR NCD file
          • All previous blocks that did not change will retain their placement and routing
          • Once all the unchanged logic is placed and routed, the new logic will be placed and routed


Modular Flow

Phase 1: Initial Budgeting Goals

  • Three goals:
      – Position global logic in the top-level design outside of the other modules
      – Size and position each module in the target device using the Floorplanner to create area groups
      – Position the I/O ports of each module in such a way as to guide the implementation tools on the flow of the signals from one module to another

Phases 2 to 3: Implementation

  • Phase 2:
      – Each module is implemented separately
          • This allows each team/designer to individually implement their portion of the design until timing goals are met
  • Phase 3:
      – Files from the first two phases are merged together
      – During this phase, the implementation tools will primarily be routing the nets between the individual blocks

Required Directory Structure

  • Under your “work” directory you will create the following structure:
      – /
          • Will contain the top-level NGO file and UCF, which are used for all active module implementations
      – /, /…
          • One directory for each active module
          • This will contain the active module implementation information
      – /
          • Stores “guide” information from each active module implementation
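
The layout described above (one top-level directory, one directory per active module, and one directory for published guide information) can be created with a few lines of script. This is only a sketch: the directory names `top`, `module_a`, `module_b`, and `pims` are hypothetical placeholders, because the slide does not name them, and the demo builds the tree in a temporary directory.

```python
import tempfile
from pathlib import Path


def create_modular_workspace(work_dir, modules, top_name="top", pim_name="pims"):
    """Create the top-level, per-module, and guide directories under work_dir.

    All directory names are hypothetical placeholders chosen for this sketch.
    Returns the list of directories in the order top, modules..., guide.
    """
    work = Path(work_dir)
    directories = [work / top_name]               # top-level NGO file and UCF
    directories += [work / m for m in modules]    # one per active module
    directories.append(work / pim_name)           # published "guide" information
    for directory in directories:
        directory.mkdir(parents=True, exist_ok=True)
    return directories


# Demo in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as work:
    for directory in create_modular_workspace(work, ["module_a", "module_b"]):
        print(directory.relative_to(work))
```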

Phase 1A: Create NGD

  • Create an NGD file of the top-level design without any module implementation information:

      ngdbuild -modular initial <design_name>

      – Second level of hierarchy is all black boxes
      – Module files must not be included in the top-level directory
      – The NGDBUILD output file <design_name>.ngo is used during subsequent Modular Design steps, while <design_name>.ngd is not

Phase 1B: Top-Level Timing Constraints

  • Add top-level timing constraints:

      constraints_editor <design_name>.ngd

      – At this point, clocks may not drive any loads; hand-edit the constraints in the UCF to specify clocks

Phase 1C: Floorplan

  • Floorplan:

      floorplanner <design_name>.ngd

      – Select File → Read Constraints to read in the top-level UCF
      – All top-level logic must be floorplanned:
          • Such as: top-level I/O ports, global buffers, 3-state buffers, flip-flops, and lookup tables
      – Floorplan area constraints to assign placement of logic
          • Use autofloorplanning for placement of “pseudo logic”
              – Select Floorplan → Distribute Options, and make sure Autofloorplan as needed is selected. When you assign an area constraint for a module, the Floorplanner positions the pseudo logic automatically

Phase 2A: Setup Active Module
Active Module Implementation

  • Setup Active Module
      – Synthesize each module and its sub-blocks in separate module directories
      – Copy the top-level UCF to the module directory and rename it to <module_name>.ucf
          • This allows you to maintain common top-level constraints but also add module-level specific constraints

Phase 2A: Setup Active Module (Continued)
Active Module Implementation

  • Setup Active Module
      – Run NGDBUILD on the active module(s):

          ngdbuild -uc <module_name>.ucf -modular module -active <module_name> /<design_name>.ngo

          • /<design_name>: top-level NGO file used as an input file for the Active Module Implementation phase
      – If necessary, create module-level constraints
          • If changes are made to the UCF, rerun the modular NGDBUILD command

Phase 2B: Implement and Publish Active Module
Active Module Implementation

  • Implement and Publish Active Module
      – Run MAP, PAR, and TRACE until timing is met:

          map <design_name>.ngd
          par -w <design_name>.ncd <design_name>_routed.ncd
          trce <design_name>_routed.ncd

          • No modular-specific commands
          • Apply map and par commands as necessary
      – Publish Implemented Module (PIMCREATE) to a central location:

          pimcreate pim_directory_path -ncd <design_name>_routed.ncd

          • This command will publish the module implementation files to the pim_directory_path

Phase 3: Final Assembly

  • To incorporate all the logic for each module into the top-level design, run NGDBUILD as follows:

      ngdbuild -modular assemble -pimpath pim_directory_path <design_name>.ngo

  • Run MAP, PAR, and TRACE:

      map <design_name>.ngd
      par -w <design_name>.ncd <design_name>_routed.ncd
      trce <design_name>_routed.ncd

      – No modular-specific commands
      – Apply map and par commands as necessary
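
Because the modular flow is command-line only, it is convenient to generate the command sequence per phase from a script. The sketch below only builds and prints the command strings shown on the slides; it does not run any tool. The design name, module names, and directory arguments are hypothetical, and `top_dir` stands in for the path prefix that is elided before `/<design_name>.ngo` in the Phase 2A command.

```python
def initial_budgeting_commands(design):
    """Phase 1A: build the top-level design with black-box modules."""
    return [f"ngdbuild -modular initial {design}"]


def active_module_commands(design, module, top_dir, pim_dir):
    """Phase 2: implement one active module, then publish it."""
    return [
        f"ngdbuild -uc {module}.ucf -modular module -active {module} "
        f"{top_dir}/{design}.ngo",
        f"map {design}.ngd",
        f"par -w {design}.ncd {design}_routed.ncd",
        f"trce {design}_routed.ncd",
        f"pimcreate {pim_dir} -ncd {design}_routed.ncd",
    ]


def final_assembly_commands(design, pim_dir):
    """Phase 3: assemble the published modules into the top-level design."""
    return [
        f"ngdbuild -modular assemble -pimpath {pim_dir} {design}.ngo",
        f"map {design}.ngd",
        f"par -w {design}.ncd {design}_routed.ncd",
        f"trce {design}_routed.ncd",
    ]


# Hypothetical design with two modules; paths are placeholders.
design, top_dir, pim_dir = "my_design", "../top", "../pims"
for command in initial_budgeting_commands(design):
    print(command)
for module in ["module_a", "module_b"]:
    for command in active_module_commands(design, module, top_dir, pim_dir):
        print(command)
for command in final_assembly_commands(design, pim_dir):
    print(command)
```

Keeping the three phases as separate functions mirrors the flow: Phase 2 is repeated once per module, in that module's own directory, while Phases 1 and 3 run once in the top-level directory.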


IDT/MDT Prerequisites

  • Basic guidelines
      – Well-partitioned hierarchical design
      – Well-defined data flow
      – Well-defined port interaction
      – Registered leaf-level hierarchical boundaries
  • MDT-specific guidelines
  • IDT-specific guideline

Partitioning

  • The design must have an effective functional and physical partition into proper modules
      – The design should be broken into functional hierarchical blocks that can be placed onto the chip with a well-defined physical flow
      – Separate structural levels and RTL levels of the code
          • Do not mix the two types of HDL in a single entity/module
          • Structural: instantiations of lower-level blocks
          • RTL: leaf-level code where functionality is created
  • This is not new to designing, but it does require that a top-down design approach be used with some up-front budgeting

Hierarchy Preservation

  • Synthesis tool should preserve hierarchy
      – Primarily an IDT requirement, but also useful for MDT
  • Why?
      – If the hierarchy is flattened, any change to any single piece of code will require the entire design to be resynthesized
      – This will cause a ripple effect; any small change will effectively change the entire netlist
      – Also, to effectively use incremental design, the hierarchy must be in place

Hierarchy Preservation

  • Exemplar, Synplicity, Synopsys, and XST synthesis tools all have the capability to preserve the hierarchy
  • Synopsys:
      – Set checkbox to Preserve Hierarchy in Implementation
  • Synplicity:
      – Set syn_netlist_hierarchy = TRUE on top-level entity/module
      – Set syn_hier hard for each level of hierarchy
  • Exemplar:
      – Optimize tab → Hierarchy box → Preserve button
      – Set checkbox to Optimize single level of hierarchy
  • XST:
      – Properties → Check Keep Hierarchy

Data-Flow

  • What is well-defined data-flow?
  • A large portion of the blocks in the design should not include global logic
  • Rather, the blocks should have a data-flow that passes from block A1 to block B2 with limited interaction of multiple blocks
      – This will make floorplanning easier

[Figure: data flow among Block A1, Block A2, and Block B2]

Port Interaction

  • What is meant by well-defined port interaction?
  • A small number of ports on each module
      – Limit large amounts of interaction so that each module is a self-contained module
  • This allows each module to be more optimally implemented and placed

Registered Boundaries

  • Registering the outputs of each leaf-level boundary eliminates the need for synthesis tools to optimize across boundaries
      – This is also a fundamental aspect of synchronous design methodology
  • Enables synthesis of individual modules without resynthesizing the whole design
  • Provides consistent clock-budgeting between modules in a team-based design environment

Modular Design Guidelines

  • MDT-specific guidelines:
      – All lower-level modules must be declared
      – Modular design supports only two levels of hierarchy
          • The top level and its sub-blocks
          • IDT can have multiple levels
      – You cannot make multiple instantiations of the same block
          • This requires copying and renaming the entity/module for each unique instance
          • IDT allows this, assuming the synthesis tools support it (XST does not)
      – IOB registers must be inferred in the top-level code
      – Three-state logic inference should occur inside of one level
          • The 3-state logic should not span several different hierarchical blocks
      – Three-state signals that are outputs of a block must be declared as inout

IDT Specific Guidelines

  • For incremental design to be effective, the full design netlist must change very little for each iteration
      – This requires the synthesis tool to limit changes to a single incremental design block
          • You will generally generate a netlist file for each incremental block
  • Select the top-level EDIF/NGC in the Xilinx implementation tools, and the implementation tools will find the remaining files in the same directory
      – Or you can provide the search directory information (-sd) if each netlist is in a separate directory/folder

Incremental Synthesis: LeonardoSpectrum

  • To create incremental design blocks, create a separate LS project for each block that you want to preserve
      – In sub-blocks, disable pad insertion and clock buffer insertion
      – A separate netlist will be written for each block
      – During implementation, specify the Translate Macro Search Path directory for each netlist
          • For more than one directory, add -sd (search directory) for each netlist, for example: -sd ../ -sd ../

Incremental Synthesis: Synplify

  • For all Synplify versions 7.1 and previous:
      – To create incremental design blocks, create a separate Synplify project for each block that you want to preserve
          • In sub-blocks, disable pad insertion and clock buffer insertion
          • A separate netlist will be written for each block
          • During implementation, specify -sd (search directory) for each netlist
  • For Synplify Pro 7.2+:
      – Utilize Synplify’s MultiPoint Synthesis Design Flow
          • Allows you to specify incremental design blocks in a single Synplify project
              – Results in a single netlist
                  » Only changed incremental blocks will change in the netlist

Incremental Synthesis: XST

  • For XST, apply an incremental_synthesis attribute to each module/entity
      – This will be applied on a top-down basis
      – Use the resynthesize attribute if you have changed that module/entity
      – VHDL example:

          attribute incremental_synthesis: string;
          attribute incremental_synthesis of <entity_name>: entity is "yes";

      – Verilog example:

          // synthesis attribute incremental_synthesis of <module_name> is yes;
          // synthesis attribute resynthesize of <module_name> is yes;

      – XCF example:

          MODEL <entity_name> incremental_synthesis = yes;
          MODEL <module_name> incremental_synthesis = yes;
          MODEL <module_name> resynthesize = yes;


Guide File

  • What is a guide file?
      – It is a file from a previous implementation that is used to guide a more recent implementation
      – The map.ncd file contains mapping (packing) information for slices, IOBs, etc.
      – The <par>.ncd file contains placement and routing information
  • What happens during a guided implementation?
      1. The implementation tools try to match names of internal resources between the guide file and the current netlist
      2. When the tools find a match, they place that logic into the same location as it was located in the guide file
      3. Routing is also maintained when matching signals between logic are found
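
The name-matching idea behind a guided implementation can be illustrated with a toy model. This is a conceptual sketch only, not how MAP or PAR are implemented: the guide is modeled as a dictionary from component name to location, and all component names and site strings are hypothetical.

```python
def guided_placement(guide, current_names):
    """Split the current netlist into guided and unguided components.

    guide: dict mapping component name -> location in the previous run.
    current_names: component names in the new netlist.
    Returns (guided, unmatched): guided components keep their old location,
    unmatched components must be placed and routed from scratch.
    """
    guided = {name: guide[name] for name in current_names if name in guide}
    unmatched = [name for name in current_names if name not in guide]
    return guided, unmatched


# Hypothetical previous implementation and new netlist.
previous_run = {
    "blockA1/counter_reg": "SLICE_X2Y3",
    "blockA2/fsm_state": "SLICE_X10Y7",
    "blockB1/fifo_ptr": "SLICE_X20Y4",
}
new_netlist = ["blockA1/counter_reg", "blockA2/fsm_state_fixed", "blockB1/fifo_ptr"]

guided, unmatched = guided_placement(previous_run, new_netlist)
print("kept in place:", guided)
print("needs new placement:", unmatched)
```

The model also shows why stable names matter: a component whose name changes between synthesis runs cannot be matched and loses its guided placement.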

Checkerboard Guide File

  • If you do NOT use incremental or modular design techniques, guide will result in a “checkerboard” guide file
      – A checkerboard of changes
          • Black represents unchanged slices in the design
          • Yellow represents the changed slices in the design
  • A checkerboard guide file is one in which many small changes have occurred to the placed and routed design
  • This results in less flexibility for PAR to place and route the changed components

Incremental Design Guide File

  • When you use incremental design techniques, all design changes are limited to one area on the die
      – Due to floorplanning
  • Incremental guiding will not use guide information for any incremental block that changes (AREA_GROUP)
      – This gives PAR more flexibility in re-placing and routing that block
  • PAR is not hindered by a checkerboard guide file

Guide File Requirements

  • Limited changes to the netlist
      – For the checkerboard approach, the netlist should change less than ten percent
      – For an incremental design approach, only a single incremental block should change
  • To satisfy this requirement in an HDL-based design, there is a single recommendation:
      – Recompile only the input HDL files where the changes are made
          • This can be accomplished by adhering to the rules in the previous sections

Guide File Use

  • Guide is used during the mapping, placement, and routing phases of implementation
  • In Project Navigator, right-click Implement Design → Properties → Place & Route Properties tab
      – Use the Guide Design File, and browse to the location of the previous implementation NCD file that you want to use as the guide file
          • <design_name>.ncd: Placed & routed NCD file that contains mapping, placement, and routing information
          • <design_name>_map.ncd: Mapped NCD file that contains only mapping information
              – Only the packing of the logic into CLBs and IOBs will remain the same as the guide file. No placement & routing information will be retained

Guide Mode

  • For incremental designs, use Incremental guide mode
  • For modular designs, Incremental guide mode is automatically used
      – Exact:
          • The placement and routing is maintained regardless of constraints for all matching components
      – Leverage:
          • The tools will use all matching components as the INITIAL starting point for placement and routing
          • They will then make changes to adhere to constraints and optimize netlist changes
      – Incremental:
          • Within a floorplanned block, if any component does not match the previous implementation, none of the components within that block are guided
              – Otherwise it is the same as Exact guide mode
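
The all-or-nothing rule of Incremental guide mode can be expressed in a few lines. This is a conceptual sketch under simplifying assumptions, not the tools' actual algorithm: each floorplanned block is modeled as a set of component names, a component "matches" when its name exists in the guide for that block, and all block and component names are hypothetical.

```python
def incremental_guide(guide, current):
    """Apply the per-block rule of Incremental guide mode.

    guide:   dict block -> dict of component name -> previous location.
    current: dict block -> list of component names in the new netlist.
    A block is guided (all previous locations kept, as in Exact mode) only if
    every one of its components matches the previous implementation;
    otherwise the whole block is placed and routed again.
    """
    guided, reimplement = {}, []
    for block, names in current.items():
        previous = guide.get(block, {})
        if all(name in previous for name in names):
            guided[block] = {name: previous[name] for name in names}
        else:
            reimplement.append(block)
    return guided, reimplement


# Hypothetical design: only Block A2 changed between runs.
previous_run = {
    "blockA1": {"cnt": "SLICE_X2Y3", "cmp": "SLICE_X2Y4"},
    "blockA2": {"fsm": "SLICE_X10Y7", "dec": "SLICE_X10Y8"},
}
new_netlist = {
    "blockA1": ["cnt", "cmp"],
    "blockA2": ["fsm", "dec_v2"],  # one renamed component is enough
}

guided, reimplement = incremental_guide(previous_run, new_netlist)
print("guided blocks:", sorted(guided))
print("blocks to re-place and re-route:", reimplement)
```

Dropping the guide for the whole changed block is what avoids the checkerboard effect: PAR gets the full area of that block to work with instead of a scattering of free slices.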


Review Questions

  • Describe each of the incremental design flow steps
  • Describe each of the modular design flow steps
  • What are the basic prerequisites for incremental and modular design?
  • What is a guide file?

Answers

  • Describe each of the incremental design flow steps
  • Describe each of the modular design flow steps
  • What are the basic prerequisites for incremental and modular design?
      – Well-partitioned hierarchical design
      – Well-defined data-flow
      – Well-defined port interaction
      – Registered leaf-level hierarchical boundaries
  • What is a guide file?
      – An NCD file from a previous implementation that is used to guide the packing, placement, and routing of a new (partially changed) netlist

Summary

  • Incremental Design Techniques and Modular Design Techniques require adherence to synchronous design methods
  • Incremental Design Techniques are used to reduce compile time and maintain previous “good” results for unchanged blocks
  • Modular Design Techniques is a team-based design flow and toolset, which allows separate teams to independently implement individual design blocks

Where Can I Learn More?

  • Online Software Manuals:
      – Xilinx Synthesis Technology (XST) User Guide
      – Development System Reference Guide → Incremental Design
      – Development System Reference Guide → Modular Design
      – Development System Reference Guide → Map → Guided Map
      – Development System Reference Guide → PAR → Guided PAR
  • Synplify → Help menu → Online Documents
  • LeonardoSpectrum → Help menu → Open Manuals Bookcase


Agenda

Section 5 : Reduce Debug Time
  – FPGA Editor: Viewing and Editing a Routed Design
      • Lab : FPGA Editor

Section 6 : On-Chip Verification and Debugging
  – ChipScope Pro
  – Demo

Section 7 : Course Summary

Optional Topics
  – Power Estimation with XPower
  – Advanced Implementation Options
  – Embedded Solutions with PowerPC/MicroBlaze and Embedded Development Kit (EDK)
  – DSP Solutions with System Generator

Reduce Debug Time FPGA Editor: Viewing and Editing a Routed Design


Objectives

After completing this module, you will be able to:
  • Use the FPGA Editor to view device resources
  • Connect the internal nets of an FPGA to output pins (Insert Probes)
  • Determine the specific resources used by your design
  • Make minor changes to your design without re-implementing

Outline

  • FPGA Editor Basics
  • Viewing Device Resources and Constrained Paths
  • Adding a Probe
  • Making Minor Changes
  • Summary
  • Appendix: Creating a Macro

What Does the FPGA Editor Do?

  • The FPGA Editor is a graphical application
      – Displays device resources
      – Precise layout of the chosen device
  • The FPGA Editor is commonly used to:
      – View device resources
      – Make minor modifications
          • Done late in the design cycle
          • Does not require reimplementation of the design
      – Insert probes
          • Used for in-circuit testing

When to Use the FPGA Editor

  • Use the FPGA Editor to:
      – View the design’s layout
      – Drive a signal to an output pin for testing (inserting a probe)
      – Add logic or special architectural features to your design without having to recompile the design
  • Do not use the FPGA Editor to:
      – Floorplan
      – Carelessly control the place and route

What the FPGA Editor Cannot Do

  • The FPGA Editor cannot:
      – Add additional logic from a second netlist
          • Because translation (NGDBuild) is completed
          • Additional logic would need to be hand-placed and routed
      – Make modifications to design files
          • HDL and netlist files will not reflect modifications

Design Flow Diagram

  • Xilinx implementation flow
      – Entry points for FPGA Editor
  • Before implementation (Post-MAP)
      – Placing and routing critical components
  • After implementation (Post-PAR)
      – Making minor changes

[Figure: implementation flow: MAP → NCD & PCF → PAR → NCD → BITGEN → BIT, with the FPGA Editor entry points marked]

Remember to document the changes to your design, because your netlist will not reflect the changes made by the FPGA Editor!

FPGA Editor

[Figure: FPGA Editor main window with the Menu Bar, Push Button Panel, Array window, List window, World window, and History window labeled]

Navigating

  • Zoom
  • Array window resources
  • Use the World window to keep track of your location on the die when you are zoomed in

List Window

  • Easiest way to select objects in your design
  • Displays:
      – Components
      – Nets
      – Paths
      – Layers
      – Constraints
      – Macros
  • Name Filter search feature
      – Limit the number of elements shown
      – Use wildcards (* and ?)
  • Ability to highlight components
      – Choose from 15 different colors


Viewing the Contents of a Slice or IOB • • •

Select a slice or IOB Click the editblock button View LUT configuration – – – –



LUT RAM ROM SRL

View the LUT equations –

FPGA Editor - 14

Click the Show/Hide Attributes button

© 2003 Xilinx, Inc. All Rights Reserved

Viewing Constrained Paths
• View constraints in the List window
  – Select Constraints in the pulldown menu
• Perform a timing analysis
  – Tools → Trace → Setup and Run
• Trace window
  – Generates a Timing Analysis report
  – Select the Type of Report
  – Click Apply

Viewing Constrained Paths
• Trace Summary window
  – Select the constraint to report on
  – Click Details
• The Trace Errors window
  – Lists the slack on each delay path
  – Most-critical path is listed last
  – Select a delay path to be displayed
  – Click Hilite
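The slack ordering described for the Trace Errors window can be sketched as follows. This assumes the usual definition, slack = requirement minus path delay, so a negative slack is a timing violation; the path names and delays are made-up illustrative values, and `slack_report` is a hypothetical helper rather than anything Trace exposes.

```python
def slack_report(requirement_ns, path_delays_ns):
    """Return (path, slack) pairs ordered so the most critical path is last.

    Assumption: slack = requirement - delay, so the smallest (most negative)
    slack marks the most critical path.
    """
    rows = [(name, requirement_ns - delay) for name, delay in path_delays_ns.items()]
    rows.sort(key=lambda row: row[1], reverse=True)  # largest slack first
    return rows

# Hypothetical delays, in ns, for three paths under a 6 ns period constraint.
delays = {"path_a": 4.0, "path_b": 6.5, "path_c": 5.5}

for name, slack in slack_report(6.0, delays):
    status = "meets timing" if slack >= 0 else "FAILS"
    print(f"{name}: slack {slack:+.1f} ns ({status})")
```

Reading the list from the bottom up therefore visits the paths in order of urgency.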

Calculating Skew
• Determine net delays
  – History window shows:
    • Net destination
    • Associated delay
• Click the “attrib” button
  – Located on the Push Button Panel
  – Select Pins tab
• Determine skew
  – (Longest Delay) - (Shortest Delay)
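The skew formula above is simple enough to check by hand, but a short sketch makes the bookkeeping explicit. The pin names and delay values below are hypothetical stand-ins for what the History window or the Pins tab would list for one net.

```python
def net_skew(pin_delays_ns):
    """Skew of a net: (longest delay) - (shortest delay) across its destinations."""
    delays = list(pin_delays_ns.values())
    if not delays:
        raise ValueError("net has no destination pins")
    return max(delays) - min(delays)

# Hypothetical per-destination delays (ns) read off for a single clock net.
clk_delays = {"ff1/CLK": 0.5, "ff2/CLK": 0.75, "ff3/CLK": 1.0}

print(f"skew = {net_skew(clk_delays)} ns")
```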

Viewing Multiple Windows
• Multiple Array windows can be viewed by using the command:
  – Window → New → Array Window
• List, Array, or World window can be selected
• Useful for viewing different areas of interest at the same time
• View and edit the sources and destinations of routes


Adding a Probe: Probes GUI
• Ties an internal signal to an output pin
• Probes are managed in the Probes GUI
  – Click the “probes” button on the Push Button Panel
  – Tools → Probes
• Probes can be added, deleted, edited, or highlighted

Adding a Probe: Probes GUI
• Click the Add button
  – Opens the Define Probe window
• Select desired probes to Delete, Edit, or Hilite
• After a Probe has been added:
  – Click Bitgen to create a new bitfile
  – Click Download to open the iMPACT programmer
  – Document the change

Defining a Probe
• Enter a Pin Name
• Select Net to be probed
  – Filter feature to limit net options
• Method
  – Automatic routing
    • Selects the shortest route
    • Possible long wait times
  – Manual routing
    • Specific pins can be selected
    • Selects the shortest route if multiple pins are selected
• Click OK


Adding Components
• Adding a Component
  – Select the resource (slice, IOB, etc.) from the Array window
  – Click the add button
  – Complete the Component Properties box
• All resources can be added

Adding Component Pins
• Adding component pins
  – Select pin
  – Select “add” in push button panel
  – Complete Properties box
    • Pin name

Modifying LUTs
• Modifying the equations
  – Click the Show/Hide Attributes button
  – Complete the Component Properties box
    • * (AND), + (OR), ~ (NOT), @ (XOR)
• Select Apply changes
  – Tool performs a Design Rule Check (DRC)
• Click the Save Changes and Close Window button
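To see what an equation written with these operators means for the LUT contents, the sketch below evaluates it over every input combination. Several things here are assumptions rather than documented tool behavior: the input names A1..A4, the choice of the first input as the least significant index bit, and operator precedence, which follows Python's rules after translation (NOT binds tightest, then AND, then XOR, then OR), so parenthesize anything ambiguous.

```python
import re

def lut_truth_table(equation, inputs=("A1", "A2", "A3", "A4")):
    """Evaluate a LUT equation for every input combination (inputs[0] is the LSB)."""
    # Map the slide's operators onto Python's bitwise operators.
    expr = equation.replace("*", "&").replace("+", "|").replace("@", "^")
    # Allow only identifiers, digits, operators, parentheses, and whitespace.
    if not re.fullmatch(r"[\w&|^~()\s]+", expr):
        raise ValueError(f"unsupported character in equation: {equation!r}")
    table = []
    for index in range(2 ** len(inputs)):
        values = {name: (index >> bit) & 1 for bit, name in enumerate(inputs)}
        # "& 1" keeps only bit 0, so ~x behaves as a one-bit NOT.
        table.append(eval(expr, {"__builtins__": {}}, values) & 1)
    return table

def lut_init(equation, inputs=("A1", "A2", "A3", "A4")):
    """Pack the truth table into an integer, bit i holding the output for index i."""
    table = lut_truth_table(equation, inputs)
    return sum(bit << index for index, bit in enumerate(table))

print(hex(lut_init("A1*A2*A3*A4")))  # 4-input AND: only index 15 is true -> 0x8000
print(lut_truth_table("A1@A2", inputs=("A1", "A2")))
```

Comparing the table before and after an edit is a cheap way to confirm that a hand-modified equation does what was intended before saving the change.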

Modifying Other Slice Resources
• Add resources
  – Select properties of resource
  – Click on pin to route signal paths
  – This will automatically route the signals inside the slice
• Click the Apply button
  – Performs a Design Rule Check
• Click the Save Changes and Close Window button

Routing Signals
• Routing setup: Tools → Route → Auto Route Setup…
  – Auto Route Design: Options used to auto route the entire design (default values)
    • Timespec Driven: Auto routes signals to meet timing constraints (generally best results)
    • Allow Pin Swap: Allows pin swapping during auto routing, which enables better use of resources
  – Auto Route Selection: Options used for a selected route (pins, nets, components)
    • Delay Driven: Auto routes the selected item as fast as possible (generally best results)
    • Resource Driven: Minimizes use of resources (wires and pips) during auto route (default)

Rerouting Signals: Auto Routing
• Rerouting a signal:
  – Select the net to reroute in the Array or List window
  – Click unroute in the pushbutton panel
  – Specify routing options (Auto Route Setup)
  – Click autoroute in the pushbutton panel

Manual Routing Signals I
• Select site pins
  – Click the site pin of a resource
  – Hold down the Shift key
  – Click another site pin
• Route the net
  – Click the “route” button
• Automatically chooses the shortest route between site pins

Manual Routing Signals II
• Select object to route
  – Click the site pin of a resource
  – Hold down the Shift key
  – Click net
  – Click subsequent nets
• Route the net
  – Click the “route” button
• Routes one net segment

Manual Routing Signals III
• Select object to route
  – Click previously routed segment
  – Hold down the Shift key
  – Click site pin
• Route the net
  – Click the “route” button
• Automatically chooses the shortest route from segment to site pins

Adding an External IOB
• Adding an IOB
  – Select IOB
    • Make certain the IOB is bonded; unbonded IOBs have an X in the IOB box
  – Select “add” in pushbutton panel
  – Edit Properties, click OK
• Use the editblock command to edit resources

IOB Resources
• Input/Output registers, output 3-state, I/O standard, drive strength, and slew rate control can be viewed and modified


Review Questions
• List some of the common uses for the FPGA Editor
• When should the FPGA Editor not be used?
• What are the benefits of inserting a probe?
• If any modifications were made using the FPGA Editor, it is important to __________ any changes. Why?

Answers
• List some of the common uses for the FPGA Editor
  – View device resources
  – Make minor modifications
  – Insert probes
  – Generate a new bitstream
• When should the FPGA Editor not be used?
  – FPGA Editor should not be used to floorplan a design or control the place and route
• What are the benefits of inserting a probe?
  – The probes capability makes it possible to route a signal to an output pin for testing, and generate a new bitstream for the design without re-implementing
• If any modifications were made using the FPGA Editor, it is important to document any changes. Why?
  – It is necessary to document your changes because the netlist will not reflect the changes made by the FPGA Editor

Summary
• The FPGA Editor provides you with a tremendous amount of design control
• Most customers use this tool for understanding the device utilization or adding test probes
• Careful use of this tool is important because indiscriminate movement of logic can severely reduce the likelihood of getting good design performance and utilization
• The FPGA Editor allows you to make minor changes to a design without re-implementing your design
• Document any changes to your design because your netlist will not reflect the changes made by the FPGA Editor

Where Can I Learn More?
• FPGA Editor Help
  – http://support.xilinx.com → Software Manuals
  – Help → Help Topics
• Tech Tips
  – http://support.xilinx.com → Tech Tips → Floorplanner & FPGA Editor


Creating a New Macro
• Use the command: File → New
  – Choose the Macro radio button
  – Give the Macro a file name
    • Example will be a wide AND gate named WIDEAND, with macro file name wideand.nmc
• Add the Macro resources
  – Select necessary components
  – Use the add command
• Program the detailed use of each component
  – Use the editblock command
• Manually route any internal signals

Creating External Pins
• External pins define the ports on the macro
  – The macro cannot be used if there are undefined external pins
• Add external pins to the macro
  – Select a site pin
  – Edit → Add Macro External Pin
• External Pin Properties GUI
  – “External Name” is the port name referenced in the top level (in1)
  – Choose Type of Pin
• Save the Macro
• Instantiate the component WIDEAND in your design

Instantiation of WIDEAND Macro

VHDL

    -- Declaration of Macros
    component WIDEAND
      port (
        in1, in2, in3, in4 : in  std_logic;
        XOUT               : out std_logic);
    end component;

    -- Instantiation of Macros (port_map_name => signal_name)
    U1 : WIDEAND
      port map (in1 => W, in2 => X, in3 => Y, in4 => Z, XOUT => PROD);

Verilog

    // Declaration of Macros
    module WIDEAND (in1, in2, in3, in4, XOUT);
      input  in1;
      input  in2;
      input  in3;
      input  in4;
      output XOUT;
    endmodule

    // Instantiation of Macros
    WIDEAND U1 (.in1(W), .in2(X), .in3(Y), .in4(Z), .XOUT(PROD));

Setting a Reference Comp
• Shape of the macro will be maintained
  – In an RPM
• Use the command: Edit → Set Macro Reference Comp
  – Maintains the relative placement of the slices in your macro
• Macros do not maintain the routing selected

Viewing a Macro Placement
• After implementing a design with an instantiated Macro
  – Select All Macros from the List window
  – Choose desired Macro to view
  – View Macro placement in the Array window

Converting an Existing Design to a Macro
• File → Save As, then select Macro radio button
• Perform necessary edits
  – Delete any logic/routing, etc.
  – Any hanging nets will generate an error
• A macro should not have any IOBs
• Routing used in a macro cannot be locked down
  – It will be considered a “rat’s nest”
• Add external pins to the macro and select a Reference Comp
• When finished with edits, save the macro


FPGA Editor Lab

© 2003 Xilinx, Inc. All Rights Reserved

Objectives
After completing this lab, you will be able to:
• Analyze the contents of a slice
• Add a probe
• Change I/O locations and contents
• View a constrained path and take steps to improve the performance of a net

Lab Design: Correlate and Accumulate


General Flow
• Step 1: Implement the Design and Launch FPGA Editor
• Step 2: Analyze Slice Contents
• Step 3: Add a Probe
• Step 4: Modify IOB Properties
• Step 5: Move an IOB’s Location
• Step 6: Analyze Timing Results
• Step 7: View a Path
• Step 8: Reroute a Net

Xilinx On-Chip Debug Solutions

Xilinx Confidential


Shorten Debug and Verification: Unleash the Power of Xilinx FPGAs
• Flexible On-Chip Debug
  – Place ChipScope Pro cores anywhere in your design to gain point access for debug
  – Debug occurs at or near system clock speeds
• Integration with Xilinx ISE Tools
  – Define, add and remove cores at any point in the design phase
  – Change ChipScope Pro core inputs without recompiling the entire design
• It’s Never Too Late in an FPGA
  – Hardware problems can be fixed during development AND after product deployment

Enabling System Level Verification and Debug
• Virtual Input Output (VIO)
  – Stimulate and monitor logic in real time using virtual inputs and outputs
  – Detect activity asynchronously or synchronously on rising or falling clock edges
  – Define and drive pulse trains to simulate input signal activity
  – Use ChipScope Pro VIO cores to add virtual DIP switches and LEDs to any design
• Integrated Bus Analysis (IBA)
  – Access CoreConnect bus structures, the interface between the PowerPC and processor peripherals
  – Use ChipScope Pro IBA cores to monitor and analyze PLB and OPB bus transactions
[Figure: system block diagram showing Viterbi, OPB GPIO, OPB Bus, Arbiter, Bridge, PLB Bus, MAC, and ATM Utopia L2 blocks]

Enabling Logic Verification and Debug
• Integrated Logic Analysis (ILA)
  – Point access to every node and signal within a Xilinx FPGA
  – Supports system clock rates up to 255 MHz
• No I/O pins required for debug
  – Access via the JTAG port
[Figure: ChipScope Pro connected over JTAG to a Virtex-II Pro XC2VP20 FF1152, with probe points feeding an ILA core backed by block RAM]

ChipScope Pro with Agilent Trace Core: Combining On-Chip Debug with Deep External Sample Storage
• Deep External Sample Depth
  – Debug complex system designs that require deeper sample storage
  – Up to 2M samples without using any on-chip BlockRAM
  – Access via the high-speed Trace Port
• On-Chip Access to Every Signal and Node in the FPGA design
  – Access any signal and node on-chip
[Figure: Virtex-II Pro XC2VP20 FF1152 with probe points feeding ILA and ATC cores; ChipScope Pro connects over JTAG, and the FPGA Trace Port Analyzer connects over the Trace port and LAN]

Xilinx ChipScope Pro: Enabling Complete FPGA Debug Solutions
[Figure: chart of system complexity versus debug and verification requirements, positioning the traditional logic analyzer solution, ChipScope Pro ILA, ChipScope Pro ILA with ATC, and ChipScope Pro IBA/ILA with ATC]

Complete Solutions Available Today
• ChipScope Pro On-Chip Debug Solution
  – Free 30-day evaluation
  – $695 full version
  – www.xilinx.com/chipscopepro
• Agilent FPGA Trace Port Analyzer
  – $7,295, available now
  – www.agilent.com/find/FPGA


© 2002 Xilinx, Inc. All Rights Reserved

Course Summary

Advanced FPGA Design With Xilinx

Summary
• CORE Generator™ cores can be used to take full advantage of the Xilinx FPGA architecture
• Timing reports are used to identify critical paths and analyze the cause of timing failures
• Multi-cycle, false path, and critical path timing constraints can be easily specified via the Advanced tab in the Xilinx Constraints Editor
• Advanced implementation options, like timing-driven packing and extra effort, can help increase performance
• Floorplanner can be used to improve timing for the design
• FPGA Editor can be used to make minor changes to a routed design to reduce debugging time
• You can improve implementation time with Incremental Design Techniques and Modular Design Techniques
• ChipScope Pro provides on-chip verification

Power Estimation

© 2003 Xilinx, Inc. All Rights Reserved

Objectives
After completing this module, you will be able to:
• List the three phases of the design cycle where power calculations can be performed
• Estimate power consumption by using the Xilinx Power Estimator worksheet
• Estimate power consumption by using the XPower utility

Outline
• Introduction
• Power Estimator Worksheet
• Using XPower Software
• Summary

Power Consumption Overview
• As devices get larger and faster, power consumption goes up
• First-generation FPGAs had:
  – Lower performance
  – Lower power requirements
  – No package power concerns
• Today's FPGAs have:
  – Much higher performance
  – Higher power requirements
  – Package power limit concerns exist
[Figure: Real World Design Power Consumption, plotted against Performance (MHz) for low-density and high-density devices, with the package power limit marked as PMAX]

Power Consumption Concerns
• High-speed and high-density designs require more power, leading to higher junction temperatures
• Package thermal limits exist
  – 125 °C for plastic
  – 150 °C for ceramic
• Power directly limits:
  – System performance
  – Design density
  – Package options
  – Device reliability

Estimating Power Consumption
• Estimating power consumption is a complex calculation
  – Power consumption of an FPGA is almost exclusively dynamic
  – Power consumption is design dependent and is affected by:
    • Output loading
    • System performance (switching frequency)
    • Design density (number of interconnects)
    • Design activity (percent of interconnects switching)
    • Logic block and interconnect structure
    • Supply voltage

Estimating Power Consumption
• Power calculations can be performed at three distinct phases of the design cycle
  – Concept phase: A rough estimate of power can be calculated based on estimates of logic capacity and activity rates
    • Use the Xilinx Power Estimator worksheets on the Web
  – Design phase: Power can be calculated more accurately based on detailed information about how the design is implemented in the FPGA
    • Use XPower™ software
  – System integration phase: Power is calculated in a lab environment
    • Use actual instrumentation
• Accurate power calculation at an early stage in the design cycle will result in fewer problems later

Activity Rates
• Accurate activity rates (a.k.a. toggle rates) are required for meaningful power calculations
• Clocks and input signals have an absolute frequency
• Synchronous logic nets use a percentage activity rate
  – 100 percent indicates that a net is expected to change state on every clock cycle
  – Allows you to adjust the primary clock frequency and see the effect on power consumption
  – Can be set globally to an average activity rate, or on groups of nets or individual nets
• Logic elements also use a percentage activity rate
  – Based on the activity rate of output signals of the logic element
  – Logic elements have capacitance
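The factors listed so far (capacitance, supply voltage, switching frequency, activity) combine in the standard CMOS dynamic-power relation P = C · V² · f. The sketch below is not XPower's own algorithm, which the course does not describe; it is the textbook formula, with `f` taken as the effective switching frequency. The optional `activity` factor assumes power scales linearly with activity rate, and the sample values are the `rd_clk_dll` row of the XPower Detailed Report shown later in this module (231 pF, 166.7 MHz, on the 1.5 V Vccint rail).

```python
def dynamic_power_mw(c_pf, v_supply, f_mhz, activity=1.0):
    """Textbook dynamic power, P = activity * C * V^2 * f, returned in mW.

    Assumption: f_mhz is the effective switching frequency and 'activity'
    scales it linearly; how a tool maps a percentage activity rate onto
    that frequency is not stated in this course.
    """
    watts = activity * (c_pf * 1e-12) * v_supply ** 2 * (f_mhz * 1e6)
    return watts * 1e3

def dynamic_current_ma(c_pf, v_supply, f_mhz, activity=1.0):
    """Average supply current implied by the same model, I = P / V, in mA."""
    return dynamic_power_mw(c_pf, v_supply, f_mhz, activity) / v_supply

# rd_clk_dll from the detailed report: the report lists 58.0 mA and 86.9 mW.
print(f"{dynamic_power_mw(231, 1.5, 166.7):.1f} mW")
print(f"{dynamic_current_ma(231, 1.5, 166.7):.1f} mA")
```

The result lands within about half a percent of the reported figures, which is consistent with the capacitance column being rounded. Because frequency enters linearly, halving the clock halves this term, which is why the worksheet lets you adjust the primary clock frequency and watch the effect.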


Xilinx Power Estimator
• Excel worksheets with power estimation formulas built in
  – Enter design data in white boxes
  – Power estimates are shown in red boxes
• Worksheet sections:
  – Summary (device totals)
  – Quiescent power
  – CLB logic
  – Block RAMs
  – Block multipliers
  – DCMs
  – I/O pins

Power Estimator Worksheet: Summary and Quiescent

Power Estimator Worksheet: CLB, RAM, and Multipliers

Power Estimator Worksheet: DCM and I/O


What is the XPower Software?
• A utility for estimating the power consumption and junction temperature of FPGA and CPLD devices
• Reads an implemented design (NCD file) and timing constraint data
• You supply activity rates:
  – Clock frequencies
  – Activity rates for nets, logic elements, and output pins
  – Capacitive loading on output pins
  – Power supply data and ambient temperature
  – Detailed design activity data from simulation (VCD file)
• XPower calculates total average power consumption and generates a report

Running XPower
• Expand Implement Design → Place & Route
• Double-click Analyze Power to launch XPower in interactive mode
• Use the Generate Power Data process to create reports using VCD files or TCL scripts

XPower GUI


XPower Reports
• ASCII (design.pwr) or HTML (design.html) formats
  – Select format in the Edit → Preferences dialog
• Summary of power consumption
• Detailed breakdown of power usage (optional)
• Thermal data
  – Junction temperature
  – Theta J-A for chosen package

XPower Summary Report

Power summary:                            I(mA)    P(mW)
----------------------------------------------------------------
Total estimated power consumption:                  807
  Vccint 1.50V:                            160      240
  Vccaux 3.30V:                            100      330
  Vcco33 3.30V:                             72      237
(Additional breakdowns deleted)

Thermal summary:
----------------------------------------------------------------
Estimated junction temperature:            46C
  250 LFM                                  43C
  ...
Ambient temp:                              25C
Case temp:                                 43C
Theta J-A range:                           26 - 26C/W

Decoupling Network Summary:               Cap Range (uF)       #
----------------------------------------------------------------
Capacitor Recommendations:
Total for Vccint :                                             8
                                          470.0  - 1000.0 :    1
                                          0.0470 - 0.2200 :    1
                                          0.0100 - 0.0470 :    2
                                          0.0010 - 0.0047 :    4
(Other power supplies deleted)
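The figures in the summary report can be cross-checked with two textbook relations: rail power is P = V · I, and junction temperature is T_J = T_A + θ_JA · P. The sketch below applies them to the numbers printed above; it is a sanity check under those standard formulas, not a description of how XPower computes its results, and the function names are illustrative.

```python
def rail_power_mw(volts, current_ma):
    """Power drawn from one supply rail, P = V * I, in mW."""
    return volts * current_ma

def junction_temp_c(ambient_c, theta_ja_c_per_w, total_power_mw):
    """Junction temperature, T_J = T_A + theta_JA * P, with P converted to W."""
    return ambient_c + theta_ja_c_per_w * (total_power_mw / 1000.0)

# (voltage, current in mA) for each rail, as listed in the summary report.
rails = {"Vccint": (1.5, 160), "Vccaux": (3.3, 100), "Vcco33": (3.3, 72)}

for name, (volts, current_ma) in rails.items():
    print(f"{name}: {rail_power_mw(volts, current_ma):.1f} mW")

# Report values: 807 mW total, 25C ambient, theta J-A of 26 C/W.
print(f"T_J = {junction_temp_c(25, 26, 807):.1f} C")
```

The rails come out at 240, 330, and 237.6 mW against the reported 240, 330, and 237, and the junction temperature comes out just under 46 °C against the reported 46C. Comparing the result with the package limit (125 °C for plastic, 150 °C for ceramic) shows how much thermal margin the design has in still air.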

XPower Detailed Report

Power details:
-------------------------------------------------------------------------------
Clocks:                         Loads  Loading(fF)  C(pF)  F(MHz)  I(mA)  P(mW)
-------------------------------------------------------------------------------
rd_clk_inst/IBUFG
  Logic:
    rd_clk_inst/CLKDLL                                 40   166.7   10.0   15.0
    rd_clk_inst/BUFG                                    6   166.7    1.5    2.3
  Nets:
    rd_clk_dll                    364                 231   166.7   58.0   86.9
    rd_clk_inst/CLK0                1                   0   166.7    0.0   ~0.0
    rd_clk_inst/IBUFG               1                   0   166.7    0.0   ~0.0
. . .
-------------------------------------------------------------------------------
Outputs:                        Loads  Loading(fF)  C(pF)  F(MHz)  I(mA)  P(mW)
-------------------------------------------------------------------------------
Vcco33
  final_data_obuf[0]                      35000        13     5.0    0.8    2.6
. . .
-------------------------------------------------------------------------------
Logic:                          Loads  Loading(fF)  C(pF)  F(MHz)  I(mA)  P(mW)
-------------------------------------------------------------------------------
cha_fifo_inst/.../FIFO_BRAM.A                          32     6.0    0.3    0.4
. . .


Review Questions
• Power estimations are typically made during which three phases of the design cycle?
• What methods can be used to enter activity rates into the XPower tool?

Answers
• Power estimations are typically made during which three phases of the design cycle?
  – Concept phase: Rough estimate based on estimated logic capacity and activity rates
  – Design phase: More accurate estimate based on information about how the design is implemented in the FPGA
  – System integration phase: Actual power usage is measured in a lab environment
• What methods can be used to enter activity rates into the XPower tool?
  – Load a VCD file
  – Manually enter activity rates

Summary
• Power calculations can be performed at three distinct phases of the design cycle:
  – Concept phase: (Xilinx Power Estimator Worksheet)
  – Design phase: (XPower software)
  – System integration phase: (lab measurements)
• Accurate power calculation at an early stage in the design cycle will result in fewer problems later
• XPower software is a utility for estimating the power consumption and the junction temperature of FPGA and CPLD devices
• XPower software uses activity rates to calculate total average power consumption

Where Can I Learn More?
• Software manuals: Development System Reference Guide
  – XPower chapter
• Online Help from XPower GUI
• Application Notes:
  – XAPP158: “Powering Xilinx FPGAs”
  – XAPP415: “Packaging Thermal Management”
• Power Estimators (Excel spreadsheets):
  – http://www.xilinx.com/ise/power_tools/spreadsheet_pt.htm

Advanced Implementation Options

© 2003 Xilinx, Inc. All Rights Reserved

Objectives
After completing this module, you will be able to:
• Increase design performance by using timing-driven packing and advanced place & route options

Outline
• Introduction
• Timing-Driven Packing
• Advanced Place & Route Options
• Summary

Introduction
• Xilinx recommends using the default options and global timing constraints the first time you implement a design
• If your design does not meet timing goals, follow the recommended flow presented earlier
  – Early in the design cycle, look at ways of changing your HDL code
    • Confirm that good coding styles were used
    • Try synthesis options, such as retiming or adding pipeline stages, to reduce logic levels
    • If it is early in the design cycle, you do not want to run a full implementation every time a change is made; this will be time consuming and frustrating
  – Increase the Place & Route effort level
  – Apply path-specific timing constraints for synthesis and implementation

When to Use Advanced Options
• If timing is still not met, consider using advanced Map or Place & Route (PAR) options
  – Map: Perform Timing-Driven Packing
    • Uses timing constraints to pack critical paths
  – PAR: Extra Effort
  – PAR: Multi-Pass Place & Route (MPPR)
  – PAR: Re-entrant Routing
• These options will increase the software run time
  – This module discusses the expected tradeoffs and benefits of each option


Timing-Driven Packing
• Timing constraints are used to optimize which pieces of logic are packed into each slice
  – Normal (standard) packing is performed
  – PAR is run through the placement phase
  – Timing analysis analyzes the amount of slack in constrained paths
  – If necessary, packing changes are made to allow better placement
• The output of Map contains mapping and placement information
  – Post-Map Static Timing Report will contain more realistic net delays

Example
• Originally, the FFs were packed together into a slice
• After placement and timing analysis, the FFs are packed into different slices to allow independent movement
[Figure: Standard Pack places FF1 and FF2 in one slice; Timing Driven Pack places FF1 and FF2 in separate slices]

Turning on Timing-Driven Packing
• Set the Property Display Level to Advanced
• Right-click Implement or Map Process, and select Properties
  – Check the Perform Timing-Driven Packing box

Tradeoffs
• Typical performance improvement: five to eight percent
• Has the greatest effect when unrelated packing has occurred
  – Look in the Map Report, Design Summary section
    • Number of slices containing unrelated logic
  – If no unrelated packing has occurred, performance improvement will be minimal
• Run time for the Map process always increases
  – Up to 200 percent


PAR Extra Effort
• Only available when Highest effort level is selected
• Two settings: Normal (1), Continue on Impossible (2)
• Typical performance improvement: four percent
• Run time for the Place & Route process always increases
  – Potential 200-percent increase or more

Setting Extra Effort
• Set the Property Display Level to Advanced
• Right-click Implement or Place & Route Process and select Properties
  – Set Place & Route Effort Level (Overall) to High
  – Set Extra Effort

Multi-Pass Place & Route (MPPR)
• Runs the Place & Route process multiple times
• Uses a different placement seed (cost table) for each pass
  – One hundred cost tables available
  – One or more cost tables may meet your timing objectives
• Keeps the best results and discards the rest
[Figure: Mapped Design → Place → Route → Placed & Routed Results #1, #2, and #3]

Running MPPR
• Expand the Implement Design and Place & Route Processes
• Right-click Multi-Pass Place & Route and select Properties
  – Set properties, then click OK
• Double-click Multi-Pass Place & Route to run MPPR

MPPR Results
• MPPR Report (under the Multi-Pass Place & Route process)
  – Lists all of the cost tables that were run, from best to worst
  – Saved results are marked
• Saved results are placed in the specified directory
  – Each cost table has its own subdirectory
• The best result is also copied back into the main project directory
  – Use this result to continue with re-entrant routing if timing has not been met
  – To try a different result, manually copy the files out of the subdirectory

Re-entrant Routing

• Runs the Place & Route process on a design that has already been placed and routed
  – Skips the placement phase and goes directly to the router

Diagram: Mapped Design, Place, Route, Placed & Routed Design; the Placed & Routed Design is fed back into Route to produce the Re-entrant Routed Design.

When to Use Re-entrant Routing

• To complete the routing of a design that was not 100-percent routed
• To improve timing on the best results from MPPR
• In general, use this option when routing was initially limited by selecting a low Overall or Router Effort Level

Running Re-entrant Routing

• Right-click Implement or Place & Route process, and select Properties
  – Set Place & Route Effort Level to High
  – Set Place and Route Mode to Reentrant Route

Outline

• Introduction
• Timing-Driven Packing
• Advanced Place & Route Options
• Summary

Skills Check


Review Questions

• Under what conditions will timing-driven packing have the most impact on design performance?
• What is the tradeoff when using PAR with the Extra Effort option?

Answers

• Under what conditions will timing-driven packing have the most impact on design performance?
  – When unrelated logic is packed together into the same slice
• What is the tradeoff when using PAR with the Extra Effort option?
  – PAR run time can increase by a factor of two or more

Summary

• Implement your design with global timing constraints and default software options first
  – Then verify that your constraints are reasonable
  – Try increasing the Place & Route Effort Level
  – Consider applying path-specific constraints
• If unrelated logic is packed together, try timing-driven packing
• Otherwise, try PAR with the Extra Effort option
• If these methods do not work, then try MPPR
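The escalation order in the summary above can be written down as a small decision helper. This is a sketch of the slide's ordering only; the function name `next_step` and the step strings are labels invented here, not tool options.

```python
def next_step(tried, unrelated_logic_packed):
    """Return the next option to try, following the order on the Summary slide.

    `tried` is a set of step labels already attempted without meeting timing.
    The labels are hypothetical names used only in this sketch.
    """
    ordered = [
        "global constraints + default options",
        "verify constraints are reasonable",
        "increase Place & Route Effort Level",
        "apply path-specific constraints",
        # The slide picks one of these two based on how logic was packed.
        "timing-driven packing" if unrelated_logic_packed else "PAR Extra Effort",
        "MPPR",
    ]
    for step in ordered:
        if step not in tried:
            return step
    return None  # the slide lists nothing beyond MPPR


if __name__ == "__main__":
    done = set()
    while (step := next_step(done, unrelated_logic_packed=False)) is not None:
        print("try:", step)
        done.add(step)
```

The ordering puts the cheapest options first: constraint checks cost no run time, while Extra Effort and MPPR multiply it.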

Where Can I Learn More?

• Online Software Manuals: Development System Reference Guide
  – MAP and PAR chapters

Xilinx Embedded Solutions


Programmable Platform Design
ISE Platform Studio IDE – Your Desktop SoC Factory

• Work the way you want: GUI, command lines, or both!
  – GUI
  – Bash (Unix-style) shell with Tcl applications and text input
• Software: your code, apps, OS…
  – Optimized SW libs and drivers: libc.a, Xilinx Micro-Kernel
  – VxWorks BSP & kernel
  – Linux BSP & kernel
• Your custom computing platform (blocks shown in the diagram)
  – PPC 405, 4KB IOCM BRAM, 8KB DOCM BRAM, JTAG
  – PLB arbiter, BRAM controller, 32KB BRAM, DDR controller, 128MB DDR SDRAM
  – OPB/PLB bridge, OPB arbiter, 2KB BRAM
  – OPB 10/100 E-Net (Ethernet), OPB PCI (32/33 PCI), OPB GPIO (LEDs & buttons), 16450 UART and 16550 UART (serial ports)
• Your custom HW: parameterizable HW IP

PowerPC-based Embedded Design

• Full system customization to meet performance, functionality, and cost goals
• IBM CoreConnect™ on-chip bus standard: PLB, OPB, and DCR
• Dedicated Hard IP (diagram blocks): PowerPC 405 Core, RocketIO, ISOCM BRAM (instruction), DSOCM BRAM (data), DCR Bus
• Flexible Soft IP (diagram blocks)
  – Processor Local Bus (PLB) with arbiter: Hi-Speed Peripheral, GB E-Net, e.g. Memory Controller
  – Bus Bridge
  – On-Chip Peripheral Bus (OPB) with arbiter: UART, GPIO, On-Chip Peripheral
• Off-Chip Memory: ZBT SSRAM, DDR SDRAM, SDRAM

MicroBlaze-based Embedded Design

• Flexible Soft IP (diagram blocks)
  – MicroBlaze 32-Bit RISC Core
  – Local Memory Bus with BRAM
  – I-Cache BRAM and D-Cache BRAM (configurable sizes)
  – Custom Functions, LocalLink™ FIFO Channels 0, 1 … 32
  – On-Chip Peripheral Bus (OPB) with arbiter: UART, 10/100 E-Net, On-Chip Peripheral
  – Bus Bridge
  – Processor Local Bus (PLB) with arbiter: Hi-Speed Peripheral, GB E-Net, e.g. Memory Controller, Custom Functions
• Off-Chip Memory: FLASH/SRAM
• Possible in Virtex-II Pro: Dedicated Hard IP PowerPC 405 Core (instruction and data sides) on the PLB

Embedded Development Tool Flow

• Standard Embedded SW Development Flow
  – C Code -> Compiler/Linker -> (Simulator) -> Object Code
  – CPU code in on-chip memory? Data2BlockRAM merges it into the bitstream
  – CPU code in off-chip memory? Download to Board & FPGA
  – Debugger
• Standard FPGA HW Development Flow
  – VHDL/Verilog -> Synthesizer -> Simulator -> Place & Route -> Bitstream -> Download to FPGA

3rd Party Support

• OSes and tools (logos on the slide): VxWorks 5.4/5.5, Nucleus, Neutrino, ThreadX, Linux, NetBSD, µC/OS-II, eCos, µCLinux
• More being added over time, stay tuned…

Xilinx Programmable System Design Solution Provides:

• Complete design solution for processing system HW
• Complete design solution for processing system SW
• Generation of bootable HW & SW design ensures a correct HW/SW interface
• Complete set of optimized peripheral firmware and programming libraries
• Optimized set of bus infrastructure and peripheral cores
• GUI and Tcl-based, text-driven applications enable you to work the way you want
• Common design environment for PowerPC™, MicroBlaze™ and mixed processor designs

Embedded Solutions Demo

Generated System

• IBM CoreConnect™ on-chip bus standard: PLB, OPB & DCR
• Dedicated Hard IP (diagram blocks): PowerPC 405 Core (300MHz), instruction and data sides, DCR Bus
• Flexible Soft IP (diagram blocks)
  – Processor Local Bus (100MHz) with arbiter
  – Bus Bridge
  – On-Chip Peripheral Bus (100MHz) with arbiter
  – INTC, 16KB BRAM, two Memory Controllers, 10/100 Enet MAC, UARTLite, DCM
• Off-Chip Memory: 32MB SDRAM
• 100MHz Ref Clk In

System Block Diagram

Diagram blocks: PPC, JTAG_PPC, Proc_Sys_Reset, DCM, PLB Bus, two PLB BRAM Cntlr blocks each with a PLB BRAM, PLB2OPB, OPB Bus, INTC, UART, Timer, LCD, MY IP, and three GPIO blocks (PSB, LEDs, SWs).

Additional Slides


Xilinx Microprocessor Debug (XMD)

• Plumbing and synchronization between host-side application and:
  – Other host-side applications
  – Actual targets
• Use of sockets and TCP/IP enables remote and local connections
• Tcl interface enables:
  – Command line control of debugging using Tcl capabilities
  – Complex verification and analysis scripting
• Simultaneous, multiple targets and applications

Diagram: ppc-eabi-gdb, mb-gdb, and other host applications connect to XMD using the gdb remote protocol over TCP/IP sockets; a Tcl/Terminal interface also drives XMD. XMD connects to the MB/PPC cycle-accurate ISSes, to the PPC405 debug port (RISCWatch protocol over JTAG), to the MB debug port (JTAG), and to XMDStub (XMD protocol over serial).
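The slide names the gdb remote protocol as the link between the debuggers and XMD. As a rough illustration of what travels over that socket, the sketch below frames and checks packets in the general gdb remote serial protocol style (`$payload#checksum`, where the checksum is the modulo-256 sum of the payload bytes as two hex digits). It is not XMD-specific: the helper names `rsp_frame` and `rsp_parse` are invented here, and escaping and acknowledgements are deliberately left out.

```python
def rsp_frame(payload: str) -> str:
    """Wrap a payload as '$payload#xx' with a modulo-256 checksum."""
    checksum = sum(payload.encode("ascii")) % 256
    return f"${payload}#{checksum:02x}"


def rsp_parse(packet: str) -> str:
    """Validate a '$payload#xx' packet and return the payload."""
    if len(packet) < 4 or packet[0] != "$" or packet[-3] != "#":
        raise ValueError("not a framed packet")
    payload, received = packet[1:-3], packet[-2:].lower()
    expected = f"{sum(payload.encode('ascii')) % 256:02x}"
    if received != expected:
        raise ValueError("checksum mismatch")
    return payload


if __name__ == "__main__":
    packet = rsp_frame("g")       # 'g' asks the target for its registers
    print(packet, "->", rsp_parse(packet))
```

A debugger and a server that both speak this framing can sit on different machines, which is the point the slide makes about sockets enabling remote and local connections.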

Hardware-Software Integration

• Enables automatic generation of a stand-alone or RTOS BSP that matches the custom hardware platform
• Hand-creating drivers that match and are optimized for the processor, bus structure, and peripherals is a time-consuming task that involves HW and SW developer interaction
• The EDK provides these as a push-button process – Xilinx has already done the work (source is provided as well)
• L0 & L1 drivers are included for all HW IP peripherals

Driver stack, top to bottom:
• RTOS or Application
• L2 Driver: RTOS Adaptation Level
  – Memory management; threads, ITC; RTOS-specific calls; etc.
• L1 Driver: High Level Driver
  – Non-blocking; ISR included; maintains state; etc.
• L0 Driver: Low Level or Lightweight Driver
  – #defines for registers; polled mode; blocking; etc.
• Custom HW Platform

RTOS Support Levels

• Basic: Step 1, Kernel support
  – Kernel scheduler matched to tool set
  – Debugger support
• Required: Step 2, Board Support
  – Specific board bootable image, Serial
• Desirable (vendor, roll your own, services, etc.): Step 3, FileSystem, Network Stack
  – Ethernet driver integration
• As Needed (vendor, roll your own, services, etc.): Step 4, Additional Driver Support
  – I2C, GPIO, …

Complete, automatic BSP generation (Steps 1-4) for VxWorks 5.4/5.5 on PowerPC and XMK on PowerPC & MicroBlaze

Xilinx MicroKernel (XMK)

• Lightweight kernel and libraries targeted to PowerPC and MicroBlaze
  – Ships with the Embedded Development Kit – source included
  – No royalties or extra charge for use
  – Components:
    • Kernel – Process management, schedulers, IPC facilities, etc.
    • Net – Lightweight TCP/IP network stack
    • File System / Memory File System – Standard block open/close/read/write/chdir – Memory implementation included

          .text    .data    .bss     Total
Kernel    9,160    143      5,936    15,239
Net       11,220   403      3,669    15,292
File      2,452    216      4        2,672

(Total size is ALL services; on PowerPC an additional 8K vector section is required. Real systems could be smaller if they use fewer services.)

Platform Studio
The Programmable Systems IDE

• Integrated HW & SW system development tools
• Screenshot callouts: System Details View, System Diagram View, Source Code Editor

Base System Builder

• A wizard that builds a default system with just a few mouse clicks
  – Includes all connections and generates the memory map
  – Ready to be downloaded to a target board and have SW written for it
• Data-driven extensibility
  – New Xilinx Board Description (XBD) file (based on MHS)
  – Defines all chip-level interconnect to board resources
  – By default, limits peripheral selection to what is actually on the target board

Base System Builder Wizard

1. Select Target Board
2. Select MB or PPC
3. Configure CPU
4. Configure IP
5. Auto-Generated Memory Map
6. Base HW Done!

Note: Support for a generic board is in an upcoming release. For now, you can target any system, as the UCF file is the only board-specific item.

Import Peripheral Wizard

• A wizard that takes already existing IP and formats it for inclusion in EDK
  – Core will show up in the Add/Edit Cores dialog
  – Need to close the project and reopen it to see the change
• Creates local project pcores directory structure
• Generates the pao and mpd files
• Copies all necessary files into pcores

Wizard steps shown: 1. Start, 2. Bus Type, 3. Identify source type, 4. Done!

Supported EDK IP

bram_block_v1_00_a, dcm_module_v1_00_a (New), dcr_intc_v1_00_b, dcr_v29_v1_00_a, dsbram_if_cntlr_v1_00_a, dsbram_if_cntlr_v2_00_a (New), dsocm_v10_v1_00_a (New), fit_timer_v1_00_a, fsl_v20_v1_00_b, isbram_if_cntlr_v1_00_a, isbram_if_cntlr_v2_00_a (New), isocm_v10_v1_00_a, jtagppc_cntlr_v1_00_a, jtagppc_cntlr_v1_00_b (New), lmb_bram_if_cntlr_v1_00_b, lmb_v10_v1_00_a, microblaze_v2_00_a, mii_to_rmii_v1_00_a, opb2dcr_bridge_v1_00_a, opb2plb_bridge_v1_00_c, opb_atmc_v2_00_a, opb_bram_if_cntlr_v1_00_a, opb_bram_if_cntlr_v2_00_a, opb_central_dma_v1_00_a (New)

opb_ddr_v1_00_b, opb_emc_v1_10_a, opb_emc_v1_10_b, opb_ethernet_v1_00_m (New), opb_ethernetlite_v1_00_a, opb_gpio_v1_00_a, opb_gpio_v2_00_a (New), opb_hdlc_v1_00_b (New), opb_iic_v1_01_a, opb_intc_v1_00_b, opb_intc_v1_00_c, opb_jtag_uart_v1_00_b, opb_mdm_v1_00_b, opb_mdm_v1_00_c, opb_memcon_v1_00_a, opb_opb_lite_v1_00_a, opb_pci_v1_00_b, opb_sdram_v1_00_c, opb_spi_v1_00_b, opb_sysace_v1_00_a, opb_timebase_wdt_v1_00_a, opb_timer_v1_00_b, opb_uart16550_v1_00_c, opb_uartlite_v1_00_b

opb_v20_v1_10_b, plb2opb_bridge_v1_00_b, plb_atmc_v1_00_a, plb_bram_if_cntlr_v1_00_a, plb_ddr_v1_00_b, plb_ddr_v1_00_c, plb_emc_v1_10_b, plb_ethernet_v1_00_a (New), plb_gemac_v1_00_b (New), plb_rapidio_lvds_v1_00_a, plb_sdram_v1_00_c, plb_uart16550_v1_00_c, plb_v34_v1_01_a, ppc405_v1_00_a, ppc405_v2_00_a (New), proc_sys_reset_v1_00_a, util_bus_split_v1_00_a (New), util_flipflop_v1_00_a (New), util_reduced_logic_v1_00_a (New), util_vector_logic_v1_00_a (New)

Processor IP Included as VHDL Source

• CoreConnect™ bus infrastructure IP
  – PLB2OPB Bridge
  – PLB & OPB Arbiter
  – PLB & OPB IPIF Interface
• Memory controllers
  – OPB Block Memory, OPB SDRAM, OPB DDR, OPB EMC, OPB ZBT
  – PLB Block Memory, PLB SDRAM, PLB DDR
• Standard peripherals
  – OPB Timer / Counter, OPB Watchdog Timer
  – OPB Uart-Lite, OPB JTAG Uart, OPB GPIO, OPB SPI
  – OPB Interrupt Controller
• Additional IP
  – OPB System ACE controller

Xilinx Developed, Delivered, and Supported

Processor IP Included as Evaluation

– OPB Uart-16450 & OPB Uart-16550
– OPB HDLC
– OPB IIC
– OPB Ethernet 10-100 MAC & Ethernet-Lite 10-100 MAC
– OPB ATM Master Utopia Level 2
– OPB ATM Slave Utopia Level 2
– OPB PCI 32 Bridge
– OPB ATM Master Utopia Level 3
– OPB ATM Slave Utopia Level 3
– PLB ATM Master Utopia Level 2
– PLB ATM Slave Utopia Level 2
– PLB GMAC Ethernet
– PLB RapidIO (installs into EDK)

Xilinx Developed, Delivered, and Supported

IP Evaluation in the EDK
For LogiCORE IP that Xilinx Licenses ($$)

• Evaluation IP in the EDK
  – To evaluate $$ IP, you must buy the EDK
  – OPB 10/100 Ethernet MAC, OPB2PCI 32/33 Bridge, IIC, UART 16450/550, ATM Utopia 2 & 3
• EDK 6.1 install includes evaluation IP
• Evaluation core can be treated in every way like its licensed counterpart
  – Evaluation core can be processed through MAP, PAR and through bitstream generation
  – Evaluation core will function in hardware for 6-8 hours
    • Re-configure device to reset clock

Xilinx Developed, Delivered, and Supported

Xilinx IP Evaluation


EDK – Supported IP

The Xilinx Embedded Development Kit (EDK) includes an extensive list of IP to support designs using the IBM PowerPC hard processor core and the Xilinx MicroBlaze soft processor core.

Wind River XE (Xilinx Edition)

• DIAB XE C/C++ compiler
  – Benchmark-leading PowerPC compiler
  – Ideal for applications that demand minimal code size and maximum program performance
  – Integrated with XPS
• SingleStep XE software debugger
  – Debugger for embedded PowerPC designs
  – Provides complete control over all PPC405 settings, such as initial register values
• visionPROBE II XE hardware debugger
  – Download and run time control for debugging PowerPC designs (parallel port)
  – ProbeServer enables any SingleStep XE debugger host (Solaris or PC) to connect to a networked lab PC

High-Performance DSP in an FPGA Made Easy and Affordable

Xilinx Confidential

Agenda

• Why should I use FPGAs for DSP?
• Which FPGAs for DSP?
  – Virtex™-II Series, Spartan™-3
• What DSP algorithms are available?
• Which Design Tools should I use?
  – Software
  – Hardware
• What’s next?


FPGAs Mean Parallelism
Reason 1: FPGAs handle high computational workloads

256-Tap FIR Filter Example
• Conventional DSP device (Von Neumann architecture)
  – Data In -> Reg -> single MAC unit -> Data Out
  – 256 loops needed to process samples
• FPGA
  – Data In -> Reg0, Reg1, Reg2 … Reg255, each multiplied by its coefficient C0, C1, C2 … C255, then summed -> Data Out
  – All 256 MAC operations in 1 clock cycle
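The contrast on the slide above can be modelled in software. The sketch below is a behavioural illustration only: the function names are invented here, the "cycles" value is a simple count that ignores pipelining and adder-tree latency, and a Python list comprehension merely stands in for 256 hardware multipliers working at once.

```python
def fir_serial(window, coeffs):
    """One shared MAC unit: one multiply-accumulate per loop iteration."""
    acc = 0
    loops = 0
    for sample, coeff in zip(window, coeffs):
        acc += sample * coeff
        loops += 1
    return acc, loops


def fir_parallel(window, coeffs):
    """One multiplier per tap: all products formed together, then summed."""
    products = [sample * coeff for sample, coeff in zip(window, coeffs)]
    return sum(products), 1  # modelled as a single clock cycle


if __name__ == "__main__":
    taps = 256
    window = list(range(taps))  # the 256 most recent samples (made-up data)
    coeffs = [1] * taps         # made-up coefficients
    print("serial:  ", fir_serial(window, coeffs))
    print("parallel:", fir_parallel(window, coeffs))
```

Both functions compute the same output sample; what differs is how many steps it takes, which is the slide's argument for parallel hardware.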

FPGAs are Ideal for Multi-channel DSP Designs

Diagram: four separate LPFs (ch1 to ch4), each running on 20MHz samples, are replaced by one multi-channel LPF running on 80MHz samples.

• FPGAs are also ideally suited for multi-channel DSP designs
  – Many low sample rate channels can be multiplexed (e.g. TDM) and processed in the FPGA, at a high rate
  – Interpolation (using zeros) can also drive sample rates higher
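A small behavioural sketch of the two ideas on this slide, time-division multiplexing and zero insertion, is given below. All function names and the two-tap coefficient set are assumptions made for illustration; the only figures taken from the slide are the four channels at 20 MHz sharing one 80 MHz filter.

```python
def tdm_interleave(channels):
    """Merge equal-length channels sample by sample: ch1, ch2, ch3, ch4, ch1, ..."""
    return [sample for frame in zip(*channels) for sample in frame]


def shared_filter(stream, n_channels, coeffs):
    """One filter datapath reused for every channel, one delay line per channel."""
    delay = [[0] * len(coeffs) for _ in range(n_channels)]
    out = []
    for i, x in enumerate(stream):
        ch = i % n_channels                 # which channel owns this time slot
        delay[ch] = [x] + delay[ch][:-1]    # shift the new sample into its own history
        out.append(sum(c * d for c, d in zip(coeffs, delay[ch])))
    return out


def zero_stuff(samples, factor):
    """Raise the sample rate by inserting factor-1 zeros after each sample."""
    out = []
    for s in samples:
        out.append(s)
        out.extend([0] * (factor - 1))
    return out


if __name__ == "__main__":
    channels = [[1, 2], [10, 20], [100, 200], [1000, 2000]]
    stream = tdm_interleave(channels)       # 4 channels x 20 MHz = 80 MHz stream
    print(stream)
    print(shared_filter(stream, 4, [1, 1]))
    print(zero_stuff([1, 2], 4))
```

The shared filter only needs one set of arithmetic; the per-channel cost is the extra delay-line storage.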

Why FPGAs for DSP? (2)
Reason 2: Tremendous Flexibility

• Q = (A x B) + (C x D) + (E x F) + (G x H) can be implemented in parallel
• Diagram: inputs A to H feed four multipliers, whose products are summed by an adder tree to give Q
• But is this the only way in the FPGA?

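Customize Architectures to Suit Your Ideal Algorithms

• Parallel: four multipliers and an adder tree (optimized for speed)
• Semi-Parallel: two multipliers, adders, and a register (DQ) that accumulates partial sums
• Serial: one multiplier, one adder, and a register (DQ) used as an accumulator (optimized for cost)

FPGAs allow Area (cost) / Performance tradeoffs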
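The three architectures can be compared with a short behavioural model of Q = (A x B) + (C x D) + (E x F) + (G x H). This is a sketch under simplifying assumptions: the function names are invented here, and the cycle count is just the number of accumulation steps, ignoring pipeline latency.

```python
def sum_of_products(pairs, multipliers):
    """Compute sum(a*b) using `multipliers` multiplier units per clock cycle.

    Returns (result, multipliers_used, cycles). Cycles ignore pipeline latency.
    """
    acc = 0      # models the accumulator register (DQ on the slide)
    cycles = 0
    for i in range(0, len(pairs), multipliers):
        acc += sum(a * b for a, b in pairs[i:i + multipliers])
        cycles += 1
    return acc, multipliers, cycles


def parallel(pairs):
    return sum_of_products(pairs, len(pairs))   # one multiplier per product


def semi_parallel(pairs):
    return sum_of_products(pairs, 2)            # two multipliers, reused


def serial(pairs):
    return sum_of_products(pairs, 1)            # one multiplier, reused


if __name__ == "__main__":
    pairs = [(1, 2), (3, 4), (5, 6), (7, 8)]    # (A,B), (C,D), (E,F), (G,H)
    for arch in (parallel, semi_parallel, serial):
        print(arch.__name__, arch(pairs))
```

All three return the same Q; the multiplier count stands for area (cost) and the cycle count for performance, which is the tradeoff the slide describes.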

Why FPGAs for DSP? (3)
Reason 3: Lower System Cost through Integration

Diagram (before): a Quad TRx DSP card built from discrete parts: AFE with A/Ds and D/As, DDCs, DUCs, DSP processors, MACs, control logic, SDRAM, PowerPC, an FPGA, an ASSP, SSTL3 translators, and hundreds of termination resistors.

Diagram (after): a Quad Network TRx card in which a single FPGA holds the MACs, DUCs, DDCs, logic, PowerPC processors, and control, alongside A/Ds, D/As, SDRAM, CORBA, PL4, and 3.125 Gbps links.


FPGAs as Signal Processors

• Industry’s most used FPGAs for DSP
• Highest performance DSP
  – Up to 556 hard embedded 18x18 multipliers
  – Up to 10MB of dual port block RAM
  – Up to 4 PPC 405 processors for control
  – 3.125 Gbps serial transceivers
• Ideal for multi-channel designs

• Industry’s lowest cost FPGA
  – 90nm process
  – 300mm wafers
• High performance DSP
  – Up to 104 hard embedded 18x18 multipliers
  – Up to 2MB of dual port block RAM
  – 32-bit MicroBlaze™ µP

Flexible Architecture Allows LUTs, RAM or SRL16

Diagram: a slice contains four LUTs (F0 to F3), each followed by a register with D and CE inputs. Each LUT can be configured as:
• Gates, for example an AND4 whose 16-entry truth table holds 0 for addresses 0000 to 1110 and 1 for address 1111
• A 16x1 RAM (RAM16x1S)
• A 16-bit shift register (SRL16E) with DIN, CE, CLK, address inputs A0 to A3, stages Q0 to Q15, and DOUT
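To make the shift-register mode concrete, here is a minimal behavioural model of an SRL16E-style element: 16 stages, a clock enable, and a 4-bit address that selects which stage drives the output. The class and method names are invented for this sketch, and details such as initial contents are assumptions (all stages start at 0 here).

```python
class SRL16E:
    """Behavioural sketch of a 16-stage addressable shift register."""

    def __init__(self):
        self.stages = [0] * 16          # stages[0] is Q0, stages[15] is Q15

    def clock(self, din, ce=1):
        """Rising clock edge: shift only when the clock enable is high."""
        if ce:
            self.stages = [din & 1] + self.stages[:-1]

    def dout(self, addr):
        """Output tap selected by the 4-bit address A3..A0."""
        return self.stages[addr & 0xF]


if __name__ == "__main__":
    srl = SRL16E()
    srl.clock(1)                        # shift in a single 1
    for _ in range(3):
        srl.clock(0)                    # then three 0s
    # The 1 has moved to Q3, so address 3 gives a delay of 4 clocks.
    print([srl.dout(a) for a in range(6)])
```

Selecting address N gives a delay of N+1 clocks, so one LUT can replace a chain of up to 16 flip-flops, which is useful for the tap delay lines in DSP filters.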

Highest DSP Performance Using Virtex-II Pro

• FFT
  – 1024-point complex FFT (16-bit data & phase factors)
    • 5.2 µs execution time (fclk=197 MHz, 2VP20-7, 2,764 logic slices)
  – FFTs with execution times in the 1 to 2 µs range have also been demonstrated (see www.dilloneng.com or www.altrabroadband.com)
• FEC
  – Viterbi decoder at OC3 data rates: 203 MHz
  – Interleaver/de-interleaver @ fclk > 200 MHz
  – RS decoding at OC192 rates (10 Gbps)
    • 16 parallel RS decoders in a single XC2V3000-4
    • OC768 (40 Gbps) can also be achieved
  – TPC Decoder at 155MHz (802.16, 802.16a)
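The FFT and RS figures above can be sanity-checked with a little arithmetic. The numbers below come from the slide; the interpretation (about one clock per point, and an even split across the 16 decoders) is an inference made here, not something the slide states.

```python
F_CLK_HZ = 197e6       # FFT clock frequency from the slide
EXEC_TIME_S = 5.2e-6   # FFT execution time from the slide
POINTS = 1024          # transform length from the slide

cycles = EXEC_TIME_S * F_CLK_HZ    # clock cycles spent on one transform
cycles_per_point = cycles / POINTS

RS_TOTAL_GBPS = 10     # OC192 rate from the slide
RS_DECODERS = 16       # parallel RS decoders from the slide
rs_gbps_per_decoder = RS_TOTAL_GBPS / RS_DECODERS  # assumes an even split

if __name__ == "__main__":
    print(f"FFT: {cycles:.1f} cycles, {cycles_per_point:.3f} cycles per point")
    print(f"RS:  {rs_gbps_per_decoder} Gbps per decoder")
```

The cycle count comes out very close to the transform length, which is consistent with a core that accepts roughly one complex sample per clock.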

Driving Down the Cost of High-Performance DSP

Chart: Unit Price ($400 to $2,000) versus year (1999 to 2004), with three milestones marked:
• First FPGA with dedicated DSP features
• First FPGA with DSP features + µprocessors
• Industry’s first: 90nm process, 300mm wafer, DSP features

Estimates based on a 1M Gate part (75 GMACs/s performance today)

New Low Price-points Enable New DSP Applications

• 276 GMACs/s performance for $100* (3S4000)
• Data Path: Rich DSP Fabric
  – Up to 104 embedded multipliers, SRL16 logic, 1.8Mb of memory
  – 60+ DSP algorithms built as IP Cores
• Control Path
  – MicroBlaze processor for control path
    • 68 DMIPs at 85 MHz

Enables Customer-Premise Equipment, Consumer, Automotive, Industrial Control Type Applications

* 8-bit MAC; pricing projections for 250K-unit volumes in 2004

High-end Car Multimedia System

Diagram blocks:
• Processing: 32-bit Embedded µP, 32-bit µP, 8-bit µC, Processor I/F
• Security: M6 Cipher, SHA-1, DTCP
• Media: MPEG-4 H/W Acceleration, DCT/IDCT for MPEG-4, Motion Compensation, MDCT for MP3
• Display: VGA Controller, LVDS Tx, Front Display, Rear-seat Displays
• Memory and storage: DDR Memory Controller, SDRAM, Flash Controller, MemoryStick, MMC/SD, CF+
• Connectivity: PCI Controller, 1394, USB 2.0, MOST, Multi-Ch CAN Controller, PHY
• Peripherals: UART, I2C, SPI, PWM


DSP Algorithms for Xilinx FPGAs

• Math Functions
  – Add, Subtract, Multiply, CORDIC etc.
• Common DSP Functions
  – FFTs (64pt-16Kpt): configurable datapath, phase vectors, transform length
  – Correlators, Modulation, Demodulation, DDS etc.
• Filters
  – MAC, DA, CIC etc.
• FEC
  – RS, Viterbi, TPC, (De)Interleavers, TCC, AWGN etc.
• Video/Imaging
  – DCT/IDCT, JPEG, JPEG2000, Color space conversion
• Cable Modem
  – DOCSIS ITU-T J.83 Modulator

Xilinx DSP Algorithms for an OFDM Receiver & Demodulator

• Receiver blocks shown: Data in from channel, Digital Down Conversion, Interpolator, Frame Synchronization, Frequency Synchronization, Sample Clock Synchronization, Remove Cyclic Prefix, FFT, Channel Estimation, Channel Equalizer & Detector, FEC
• Xilinx building blocks called out:
  – Embedded Multipliers
  – MAC-based FIR Filter Core
  – DDS Core
  – CORDIC Core
  – FFT Core
  – FEC Cores: Reed-Solomon Decoder, Viterbi Decoder, De-interleaver, Turbo Product Code (TPC) Decoder

Advanced Algorithms from Xilinx (DOCSIS ITU-T J.83 Modulator)

• Baseband (Bb) Processor blocks: Data In, MPEG Framer, 8b->7b & FIFO, RS (128, 122), Interleaver with optional External Memory, Randomizer, Frame Controls, TCM, producing I and Q
• Modulator blocks: 2 ch. RRC (α = 0.18 or α = 0.12) on I and Q, output to Cable
• Prog. Controls: Interleaver Select, QAM Select, Level Select

Xilinx DSP Algorithms in a 3G Base Station

Diagram: Interface to Base Station Controller, Back Plane Drivers (LVDS & Gigabit IO), I/Q Base Band Processing, Digital Up/Down Conversion, DAC and ADC, Quadrature Modulation (QPSK), Digital Quadrature Demod, IF Mixer & BP Filter, RF, Multi Carrier Power Amp. (MCPA), Rx Linear Amplifier, Low Power Antenna Combiners, Rx/Tx link to the Mobile Station. Downlink is the forward link; uplink is the reverse link.

• Downlink Symbol Rate Processing
  – CRC coding
  – FEC coding
  – Rate matching
  – 1st Interleaving
  – TrCH radio frame generation
  – 2nd Interleaving
• Downlink Chip Rate Processing
  – Power scaling
  – Dedicated phys channel generation
  – Phys channel combining
  – LFSR code generation
  – Spreading/scrambling
• Digital Up Conversion
  – Interpolation LPF
  – Pulse Shaping (RRC)
  – Digital Pre-distortion
• Uplink Chip Rate Processing
  – Searcher (for multipaths)
  – LFSR code generator
  – Channel estimation
  – Tracking rake receiver
  – Multipath combiner (max ratio)
• Uplink Symbol Rate Processing
  – 2nd De-interleaving
  – Radio frame deconstruction
  – Rate recovery
  – 1st De-interleaving
  – FEC decoding
  – CRC decoding
• Digital Down Conversion
  – Decimation LPF
  – Matched Filtering

Xilinx cores called out:
• Downlink FEC Cores: CDMA2000 TCC Encoder, Convolutional Encoder, Interleaver
• Uplink FEC Cores: CDMA2000 TCC Decoder, Viterbi Decoder, De-interleaver
• Up conversion: MAC FIR for interpolation & pulse shaping, DDS for pulse shaping
• Down conversion: MAC FIR for decimation & matched filtering, DDS for matched filtering

Advanced Algorithms from Xilinx: CDMA2000/3GPP2 TCC

• Designed for use in 3G wireless base stations
• Implements
  – 3GPP2/1xEVDO/CDMA2000 specification
  – Full 3GPP2 Interleaver
  – MAX*, MAX or MAX SCALE algorithms
• Dynamically selectable number of iterations (1-16)
• Usable on Virtex-II Pro, Virtex-II and Spartan-3 FPGAs


4 DSP Design Flows Give You Total Flexibility

• FPGA Designers Writing HDL
  – HDL: Manual Entry of VHDL or Verilog
  – Synthesis: Leonardo Spectrum, Synplify Pro or XST
  – Implementation: ISE Foundation
• System Engineers using MATLAB & Simulink
  – System Modeling: MATLAB, Simulink & SysGen for DSP
  – HDL: Automatically Generated (SysGen)
  – Synthesis: Leonardo Spectrum, Synplify Pro or XST
  – Implementation: ISE Foundation
• System Engineers using MATLAB only
  – System Modeling: MATLAB & AccelFPGA
  – HDL: RTL Automatically Generated
  – Synthesis: Leonardo Spectrum, Synplify Pro or XST
  – Implementation: ISE Foundation
• System Engineers Using CoWare
  – System Modeling: SPW
  – HDL: Automatically Generated
  – Synthesis: XST
  – Implementation: ISE Foundation

The flows are shown in order of increasing cost.

Push-button Performance using Xilinx System Generator for DSP

From Simulink to FPGA bitstream, at the push of a button

Unrivaled Capabilities on System Generator for DSP

• System Generator for DSP v3.1 allows systems designers to target external simulation engines
  – Hardware in the loop
  – HDL co-simulation
• Hardware in the loop co-simulation
  – Allows the user to simulate part of the system on actual hardware. This means acceleration & verification
• HDL co-simulation using ModelSim
  – Allows the user to automatically invoke ModelSim and simulate Verilog/VHDL directly from Simulink

Documented Reference Designs to Accelerate Your Learning

• 16-QAM receiver, including LMS based equalizer and carrier recovery loop
• A/D and delta-sigma D/A conversion
• Concatenated FEC codec for DVB
• Custom FIR filter reference library
• Digital down converter for GSM
• 2D discrete wavelet transform (DWT) filter
• 2D filtering using a 5x5 operator
• Color space conversion
• CORDIC reference design
• Polyphase MAC-based FIR
• Streaming FFT/IFFT
• BER Tester using AWGN model

Partition Tasks Between Processor and DSP Fabric

• PowerPC with Application-Specific Hardware Acceleration
• Diagram: the C++ code stack (control tasks, FIR filter, control tasks) runs on the PowerPC processor with OCM RAM; the FIR filter is moved into a FIR engine (fabric/multipliers) with taps 0, 1, 2, 3 … n feeding a chain of adders
• The Virtex-II Pro Advantage: a processing-time chart compares Traditional processing with Extreme Processing™, where the FIR filter portion takes less time

Use System Generator to Develop DSP Peripherals for OPB

• XAPP264 shows designers how to use Simulink & System Generator to create DSP peripherals that can connect to the OPB
  – PowerPC
  – MicroBlaze
• This enables 32-bit control for XtremeDSP using:
  – Virtex™-II, Virtex-II Pro™
  – Spartan-3
• Still need EDK to program the processor

Application Note XAPP264: Using SysGen to Create CoreConnect™ Peripherals

Diagram: a Host PC (post-processing and filter design in Matlab) connects through UART Lite to a PowerPC/MicroBlaze system whose OPB carries a Reloadable DA FIR Peripheral.

Embedded Design Flow

• SW Development Flow
  1. Specify Software Architecture (SW Configuration, MSS)
  2. Automatic Software BSP/Library Generation (LibGen)
  3. Software Compilation, producing the Executable
  – Executable in on-chip memory? Data2BRAM
  – Executable in off-chip memory? Download to Board
  – RTOS/eOS, GDB / XMD
• HW Development Flow
  1. Specify Processor, Bus & Peripherals (HW Configuration, MHS)
  2. Automatic Hardware Platform Generation (PlatGen)
  3. Xilinx Implementation Flow (Xflow / ProjNav), producing the Bitstream
  – Download to FPGA
• Example system shown: MicroBlaze/PPC, Arbiter, UART, GPIO
• Legend: Module in EDK

XtremeDSP Hardware Development Kit

• Developed with Nallatech
• Includes:
  – Motherboard populated with a Dime-II module
  – Nallatech FUSE Software
  – XtremeDSP Software Evaluation CD kit
    • MATLAB/Simulink
    • Xilinx System Generator for DSP
    • Xilinx ISE
    • Synplify Pro™ (Synplicity) & FPGA Advantage™ (Mentor)
  – User manual and example designs
  – $1995

XtremeDSP Development Kit Specifications

• Specifications
  – Virtex-II user FPGA: XC2V3000
  – 2 ADC channels
    • AD6644 A/Ds (14-bits up to 65 MSPS)
  – 2 DAC channels
    • AD9772A D/As (14-bits up to 160 MSPS)
  – Support for external clock + on board oscillator + programmable clock
  – One bank of ZBT-SSRAM (133MHz, 256 kbyte)
  – PCI & USB interface
• Supports Hardware in the Loop/HDL co-simulation with Xilinx System Generator for DSP
• Multiple Dime-II daughter boards available from Nallatech

What if the XtremeDSP Kit Doesn’t Meet Your Needs?

• 40+ DSP development boards available today using Xilinx FPGAs (www.xilinx.com/dsp)
  – General Purpose
  – Software Defined Radio
  – Video/Imaging
• Many vendors support hardware-in-the-loop/HDL co-simulation with Xilinx System Generator for DSP

Agenda

• Why should I use FPGAs for DSP?
• Which FPGAs for DSP?
  – Virtex-II Series, Spartan-3
• What DSP Algorithms are available?
• Which Design Tools should I use?
  – Software
  – Hardware
• What DSP Services & Technical Support will I get from Xilinx for DSP?
• What’s Next?

The World’s Fastest Programmable DSP Solution

• FPGAs with DSP Functions
  – Highest Performance
  – Shortest Design Time
  – Lowest Cost (90nm)
• 60+ Advanced DSP Cores
  – Comprehensive Library
  – Fast Turnaround
  – Exceptional Performance
• 60+ DSP Development Boards
• Dedicated Field Specialists
  – 50+ Field DSP Experts
  – Processor, I/O Experts too
• DSP Design Services, Training & Hotline
  – Xilinx Design Services, Education and Support
• Familiar Design Flows
• Systems Expertise

What’s Next?

• Evaluate Xilinx System Generator for DSP
  – Free 60 day eval CD kit or web download
• Try Xilinx IP cores in your next DSP design for free
  – Free eval on most pay cores
• Sign up for a Xilinx DSP Design Flow class
  – Implementation techniques or Design Flow

Visit DSP Central at www.xilinx.com/dsp
