Computer System And Network

COMPUTER SYSTEM & NETWORK

(COMP 23)

ENCODED BY: DONDON LEDAMA
PREPARED BY: MARL T. GONZALEZ, SCHOOL ADMINISTRATOR

LESSON I: Introduction


Computer System Components

Recent advances in microelectronic technology have made computers an integral part of our society. Each step in our everyday lives may be influenced by computer technology: we awake to a digital alarm clock's beaming of preselected music at the right time, drive to work in a digital-processor-controlled automobile, work in an extensively automated office, shop for computer-coded grocery items, and return to rest in the computer-regulated heating and cooling environment of our homes. It may not be necessary to understand the detailed operating principles of a jet plane or an automobile in order to use and enjoy the benefits of these technical marvels. But a fair understanding of the operating principles, capabilities, and limitations of digital computers is necessary if we are to use them in an efficient manner. This book is designed to give such an understanding of the operating principles of digital computers. This chapter will begin by describing the organization of a general-purpose digital computer system and then will briefly trace the evolution of computers.

The diagram shows a general view of how desktop and workstation computers are organized. Different systems have different details, but in general all computers consist of components (processor, memory, controllers, video) connected together with a bus. Physically, a bus consists of many parallel wires, usually printed (in copper) on the main circuit board of the computer. Data signals, clock signals, and control signals are sent on the bus back and forth between components. A particular type of bus follows a carefully written standard that describes the signals that are carried on the wires and what the signals mean. The PCI standard (for example) describes the PCI bus used on most current PCs.

The processor continuously executes the machine cycle, executing machine instructions one by one. Most instructions are for an arithmetical, a logical, or a control operation. A machine operation often involves access to main storage or involves an I/O controller. If so, the machine operation puts data and control signals on the bus and (may) wait for data and control signals to return. Some machine operations take place entirely inside the processor (the bus is not involved). These operations are very fast.

Input/Output Controllers

The way in which devices connected to a bus cooperate is another part of a bus standard. Input/output controllers receive input and output requests from the central processor, and then send device-specific control signals to the device they control. They also manage the data flow to and from the device. This frees the central processor from involvement with the details of controlling each device. I/O controllers are needed only for those I/O devices that are part of the system. Often the I/O controllers are part of the electronics on the main circuit board (the motherboard) of the computer. Sometimes an uncommon device requires its own controller, which must be plugged into a connector (an expansion slot) on the motherboard.

Main Memory

In practice, data and instructions are often placed in different sections of memory, but this is a matter of software organization, not a hardware requirement. Also, most computers have special sections of memory that permanently hold programs (firmware stored in ROM), and other sections that are permanently used for special purposes. Main memory (also called main storage or just memory) holds the bit patterns of machine instructions and the bit patterns of data. Memory chips and the electronics that control them are concerned only with saving bit patterns and returning them when requested. No distinction is made between bit patterns that are intended as instructions and bit patterns that are intended as data. The amount of memory on a system is often described in terms of:

Kilobyte: 2^10 = 1024 bytes
Megabyte: 2^20 = 1024 kilobytes
Gigabyte: 2^30 = 1024 megabytes
Terabyte: 2^40 = 1024 gigabytes
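
The power-of-two units above are easy to check with a few lines of code. The short C program below is only an illustrative sketch (it is not part of the original text); it prints each unit in terms of the next smaller one.

    /* A minimal C sketch (not part of the original text) that prints the
       power-of-two memory units listed above. */
    #include <stdio.h>

    int main(void) {
        unsigned long long kilobyte = 1ULL << 10;   /* 2^10 bytes */
        unsigned long long megabyte = 1ULL << 20;   /* 2^20 bytes */
        unsigned long long gigabyte = 1ULL << 30;   /* 2^30 bytes */
        unsigned long long terabyte = 1ULL << 40;   /* 2^40 bytes */

        printf("Kilobyte = %llu bytes\n", kilobyte);
        printf("Megabyte = %llu kilobytes\n", megabyte / kilobyte);
        printf("Gigabyte = %llu megabytes\n", gigabyte / megabyte);
        printf("Terabyte = %llu gigabytes\n", terabyte / gigabyte);
        return 0;
    }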


These days (Winter 2005) the amount of main memory in a new desktop computer ranges from 256 megabytes to 1 gigabyte. Hard disks and other secondary storage devices are tens or hundreds of gigabytes. Backup storage comes in sizes as large as several terabytes.

Addresses

Each byte of main storage has an address. Most modern processors use 32-bit addresses, so there are 2^32 possible addresses. Think of main storage as if it were an array: byte mainStorage[0x00000000 ... 0xFFFFFFFF]; A main storage address is an index into memory. A 32-bit address is the address of a single byte. Thirty-two wires of the bus contain an address (there are many more bus wires for timing and control). Sometimes people talk about addresses like 0x2000, which looks like a pattern of just 16 bits. But this is just an abbreviation for the full 32-bit address. The actual address is 0x00002000. The first MIPS processors (designed in 1985) used 32-bit addresses. From 1991 to the present, top-end MIPS processors use 64-bit addresses. The MIPS32 chip is a modern chip designed for embedded applications. It uses 32-bit addresses, since embedded applications often don't need 64 bits. Recent processor chips from AMD and Intel have 64-bit addresses, although 32-bit versions are still available. The assembly language of this course is for the MIPS32 chip, so we will use 32-bit addresses. The assembly language of the 64-bit MIPS chips is similar.
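
As a rough illustration of "main storage as an array indexed by an address", here is a small C sketch. It is a toy under stated assumptions: the array is far smaller than a real 2^32-byte address space, and the names (main_storage, STORAGE_SIZE) are made up for the example.

    /* An illustrative C sketch of main storage as an array of bytes indexed
       by an address. The array here is tiny; a real 32-bit address space has
       2^32 cells. */
    #include <stdint.h>
    #include <stdio.h>

    #define STORAGE_SIZE 0x10000u            /* toy size, not 2^32 */
    static uint8_t main_storage[STORAGE_SIZE];

    int main(void) {
        uint32_t address = 0x2000;           /* shorthand for 0x00002000 */
        main_storage[address] = 0x41;        /* store one byte at that address */
        printf("byte at 0x%08X = 0x%02X\n",
               (unsigned)address, main_storage[address]);
        return 0;
    }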

The MIPS has an address space of 2^32 bytes. A gigabyte is 2^30 bytes, so the MIPS has 4 gigabytes of address space. Ideally, all of these memory locations would be implemented using memory chips (usually called RAM). RAM costs about $200 per gigabyte. Installing the maximum amount of memory as RAM would cost about $800. This might be more than you want to spend. Hard disk storage costs much less per gigabyte. Hard disks cost about $50 per gigabyte (winter, 2005).

On modern computers, the full address space is present no matter how much RAM has been installed. This is done by keeping some parts of the full address space on disk and some parts in RAM. The RAM, the hard disk, some special electronics, and the operating system work together to provide the full 32-bit address space. To a user or an applications programmer it looks as if all 2^32 bytes of main memory are present. This method of providing the full address space by using a combination of RAM memory and the hard disk is called virtual memory. The word virtual means "appearing to exist, but not really there." Some computer geeks have a virtual social life.

Cache Memory

Disk access is slow compared to RAM access. Potentially, using a combination of real memory and disk memory to implement the address space could greatly slow down program execution. However, with clever electronics and a good operating system, virtual memory is only slightly slower than physical memory. Computer systems also have cache memory. Cache memory is very fast RAM that is inside (or close to) the processor. It duplicates sections of main storage that are heavily used by the currently running programs. The processor does not have to use the system bus to get or store data in cache memory. Access to cache memory is much faster than to normal main memory. Like virtual memory, cache memory is invisible to most programs. It is an electronic detail below the level of abstraction provided by assembly language. Hardware keeps cache up to date and in synch with main storage. Your programs are unaware that there is cache memory and virtual memory. They just see "main memory". Application programs don't contain instructions that say "store this in cache memory", or say "get this from virtual memory". They only refer to the contents of main memory at a particular address. The hardware makes sure that the program gets or stores the correct byte, no matter where it really is.

Contents of Memory

The memory system merely stores bit patterns. That some of these patterns represent integers, that some represent characters, and that some represent instructions (and so on) is of no concern to the electronics. How these patterns are used depends on the programs that use them. A word processor program, for example, is written to process patterns that represent characters. A spreadsheet program processes patterns that represent numbers. Of course, most programs process several types of data, and must keep track of how each is used. Often programs keep the various uses of memory in separate sections, but that is a programming convention, not a requirement of electronics. Any byte in main storage can contain any 8-bit pattern. No byte of main storage can contain anything but an 8-bit pattern. There is nothing in the memory system of a computer that says what a pattern represents.

Computer System Organization

Before we look at the C language, let us look at the overall organization of computing systems. Figure 1.1 shows a block diagram of a typical computer system. Notice it is divided into two major sections: hardware and software.
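
The point made above about bit patterns having no built-in meaning can be illustrated with a short C sketch (not from the text). It stores one 4-byte pattern and then reads it back both as characters and as an integer; the printed integer value assumes a little-endian machine with 32-bit ints.

    /* A small sketch showing that memory holds only bit patterns: the same
       four bytes are read back both as characters and as an integer. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    int main(void) {
        uint8_t bytes[4] = { 'A', 'B', 'C', 'D' };   /* one 4-byte bit pattern */
        uint32_t as_int;
        memcpy(&as_int, bytes, sizeof as_int);       /* reinterpret, no conversion */

        printf("as characters: %c%c%c%c\n", bytes[0], bytes[1], bytes[2], bytes[3]);
        printf("as an integer: 0x%08X\n", (unsigned)as_int);  /* 0x44434241 on little-endian */
        return 0;
    }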


Computer Hardware

The physical machine, consisting of electronic circuits, is called the hardware. It consists of several major units: the Central Processing Unit (CPU), Main Memory, Secondary Memory and Peripherals. The CPU is the major component of a computer; the ``electronic brain'' of the machine. It consists of the electronic circuits needed to perform operations on the data. Main Memory is where programs that are currently being executed as well as their data are stored. The CPU fetches program instructions in sequence, together with the required data, from Main Memory and then performs the operation specified by the instruction. Information may be both read from and written to any location in Main Memory, so the devices used to implement this block are called random access memory chips (RAM). The contents of Main Memory (often simply called memory) are both temporary (the programs and data reside there only when they are needed) and volatile (the contents are lost when power to the machine is turned off). The Secondary Memory provides more long-term and stable storage for both programs and data. In modern computing systems this Secondary Memory is most often implemented using rotating magnetic storage devices, more commonly called disks (though magnetic tape may also be used); therefore, Secondary Memory is often referred to as the disk. The physical devices making up Secondary Memory, the disk drives, are also known as mass storage devices because relatively large amounts of data and many programs may be stored on them. The disk drives making up Secondary Memory are one form of Input/Output (I/O) device since they provide a means for information to be brought into (input) and taken out of (output) the CPU and its memory. Other forms of I/O devices which transfer information between humans and the computer are represented by the Peripherals box in Figure 1.1. These Peripherals include devices such as terminals -- a keyboard (and optional mouse) for input and a video screen for output -- high-speed printers, and possibly floppy disk drives and tape drives for permanent, removable storage of data and programs. Other I/O devices may include high-speed optical scanners, plotters, multi-user and graphics terminals, networking hardware, etc. In general, these devices provide the physical interface between the computer and its environment by allowing humans or even other machines to communicate with the computer.

Computer Software -- The Operating System

Hardware is called ``hard'' because, once it is built, it is relatively difficult to change. However, the hardware of a computer system, by itself, is useless. It must be given directions as to what to do, i.e. a program. These programs are called software; ``soft'' because it is relatively easy to change both the instructions in a particular program as well as which program is being executed by the hardware at any given time. When a computer system is purchased, the hardware comes with a certain amount of software which facilitates the use of the system. Other software to run on the system may be purchased and/or written by the user. Some major vendors of computer systems include: IBM, DEC, HP, AT&T, Sun, Compaq, and Apple. The remaining blocks in Figure 1.1 are typical software layers provided on most computing systems. This software may be thought of as having a hierarchical, layered structure, where each layer uses the facilities of the layers below it. The four major blocks shown in the figure are the Operating System, Utilities, User Programs and Applications. The primary responsibility of the Operating System (OS) is to ``manage'' the ``resources'' provided by the hardware. Such management includes assigning areas of memory to different programs which are to be run, assigning one particular program to run on the CPU at a time, and controlling the peripheral devices. When a program is called upon to be executed (its operations performed), it must be loaded, i.e. moved from disk to an assigned area of memory. The OS may then direct the CPU to begin fetching instructions from this area. Other typical responsibilities of the OS include Secondary Storage management (assignment of space on the disk), a piece of software called the file system, and Security (protecting the programs and data of one user from the activities of other users that may be on the same system).


Many mainframe machines normally use proprietary operating systems, such as VM and CMS (IBM) and VAX VMS and TOPS 20 (DEC). More recently, there is a move towards a standardized operating system, and most workstations and desktops typically use UNIX (AT&T and other versions). A widely used operating system for IBM PC and compatible personal computers is DOS (Microsoft). Apple Macintosh machines are distinguished by an easy-to-use proprietary operating system with graphical icons.

Utility Programs

The layer above the OS is labeled Utilities and consists of several programs which are primarily responsible for the logical interface with the user, i.e. the ``view'' the user has when interacting with the computer. (Sometimes this layer and the OS layer below are considered together as the operating system). Typical utilities include such programs as shells, text editors, compilers, and (sometimes) the file system. A shell is a program which serves as the primary interface between the user and the operating system. The shell is a ``command interpreter'', i.e. it prompts the user to enter commands for tasks which the user wants done, reads and interprets what the user enters, and directs the OS to perform the requested task. Such commands may call for the execution of another utility (such as a text editor or compiler) or a user program or application, the manipulation of the file system, or some system operation such as logging in or out. There are many variations on the types of shells available, from relatively simple command line interpreters (DOS) or more powerful command line interpreters (the Bourne Shell, sh, or C Shell, csh, in the Unix environment), to more complex, but easy to use, graphical user interfaces (the Macintosh or Windows). You should become familiar with the particular shell(s) available on the computer you are using, as it will be your primary means of access to the facilities of the machine. A text editor (as opposed to a word processor) is a program for entering programs and data and storing them in the computer. This information is organized as a unit called a file, similar to a file in an office filing cabinet, only in this case it is stored on the disk. (Word processors are more complex than text editors in that they may automatically format the text, and are more properly considered applications than utilities). There are many text editors available (for example vi and emacs on Unix systems) and you should familiarize yourself with those available on your system. As was mentioned earlier, in today's computing environment, most programming is done in high level languages (HLL) such as C. However, the computer hardware cannot understand these languages directly. Instead, the CPU executes programs coded in a lower level language called the machine language. A utility called a compiler is a program which translates the HLL program into a form understandable to the hardware. Again, there are many variations in the compilers provided (for different languages, for example) as well as in the facilities provided with the compilers (some may have built-in text editors or debugging features). Your system manuals describe the features available on your system. Finally, another important utility (or task of the operating system) is to manage the file system for users. A file system is a collection of files in which a user keeps programs, data, text material, graphical images, etc. The file system provides a means for the user to organize files, giving them names and gathering them into directories (or folders), and to manage their file storage. Typical operations which may be done with files include creating new files, destroying, renaming, and copying files.

COMPUTER EVOLUTION 500 B.C. The 2/5 Abacus is invented by the Chinese. Find out more about the Abacus at "Abacus: The Art of Calculating with Beads" by Luis Fernandes.

1 A.D. The Antikythera Device, a mechanism that mimicked the actual movements of the sun, moon, and planets, past, present, and future. This technology was then lost for millennia.


1623

Wilhelm Schickard builds the first "automatic calculator", the "Calculating Clock" which was used for computing astronomical tables.

Wilhelm Schickard (born 1592 in Herrenberg - died 1635 in Tübingen) built the first automatic calculator in 1623. Contemporaries called this machine the Calculating Clock. It precedes the less versatile Pascaline of Blaise Pascal and the calculator of Gottfried Leibniz by twenty years. Schickard's letters to Johannes Kepler show how to use the machine for calculating astronomical tables. The machine could add and subtract six-digit numbers, and indicated an overflow of this capacity by ringing a bell; to aid more complex calculations, a set of Napier's bones were mounted on it. The designs were lost until the twentieth century; a working replica was finally constructed in 1960. Schickard's machine, however, was not programmable. The first design of a programmable computer came roughly 200 years later (Charles Babbage). And the first working program-controlled machine was completed more than 300 years later (Konrad Zuse's Z3, 1941). The Schickard crater on the moon is named after Schickard.

1642
Blaise Pascal, a French religious philosopher and mathematician, builds the first practical mechanical calculating machine in history, thereby etching his name to be resurrected later for the name of a, now arcane, programming language.

1830 The "Analytical Engine" is designed by Charles Babbage.

1850
The Japanese refine the Abacus into the 1/5 design, with one bead on the top deck and five on the bottom deck.

1890
The U.S. Census Bureau adopts the Hollerith Punch Card, Tabulating Machine and Sorter to compile results of the 1890 census, reducing an almost 10-year process to 2 ½ years and saving the government a whopping $5 million. Inventor Herman Hollerith, a Census Bureau statistician, forms the Tabulating Machine Company in 1896. The TMC eventually evolved into IBM.

1930
The Abacus is again changed, to the 1/4 design.

1939
The first semi-electronic digital computing device is constructed by John Atanasoff. The "Mark I" Automatic Sequence Controlled Calculator, the first fully automatic calculator, is begun at Harvard by mathematician Howard Aiken. Its designed purpose was to generate ballistic tables for Navy artillery.

1941
German inventor Konrad Zuse produces the Z3 for use in aircraft and missile design, but the German government misses the boat and does not support him. There is some debate as to whether the Mark I or the Z3 came first.

1943
English mathematician Alan Turing (bio by Andrew Hodges) begins operation of his secret computer for the British military. It was used by cryptographers to break secret German military codes. It was the first vacuum tube computer, but its existence was not made public until decades later.

1946


ENIAC (Electronic Numerical Integrator and Calculator), the first credited all-electronic computer, is completed at the University of Pennsylvania. It used thousands of vacuum tubes.

1951
Seymour Cray gets his Master's degree in Applied Mathematics, soon after joins Engineering Research Associates, and starts working on the 1100 series computers for what ended up being Univac. Remington's Univac I (Universal Automatic Computer), using a Teletype keyboard and printer for user interaction, becomes the first commercially available computer. It could handle both numerical and alphabetic data.

1957
Bill Norris and friends start Control Data Corporation (CDC), bring Seymour Cray on board, and begin building Large Scale Scientific Computers.

1958
The first "integrated circuit" is designed by American Jack Kilby. It included resistors, capacitors and transistors on a single wafer chip.

1960
Digital Equipment delivers the PDP-1, an interactive computer with CRT and keyboard. Its big screen inspires MIT students to write the world's first computer game.

1963
Sketchpad, the first WYSIWYG interactive drawing tool, is published by Ivan Sutherland as his MIT doctoral thesis.

1965
Sutherland demonstrates the first VR head-mounted 3-D display. Ted Nelson coins the terms hypertext and hypermedia in a paper at the Association for Computing Machinery's 20th national conference.


1968
Doug Engelbart demonstrates the first mouse.

1970
First four nodes are established on Arpanet, precursor of the Internet and World Wide Web.

1971
IBM introduces the 3270 mainframe terminal; its character-based interface becomes the standard for business applications. The first "microprocessor" is produced by American engineer Marcian E. Hoff.

1972
First GUI appears as part of Xerox Parc's Smalltalk programming environment. Seymour Cray incorporates Cray Research.

1974
Xerox PARC researchers create the Alto, the first computer to use the WIMP interface. The Altair 8800 microcomputer, based on Intel's 8080 processor, appears; its interface uses toggle switches and LEDs.

1975
Bill Gates and Paul Allen create and license the first microcomputer version of Basic, for the Altair; it loads via a paper tape.


1977
Tandy (Radio Shack) produces the first practical personal computer, using a cassette tape drive for programs and storage. Apple ships the Apple II, with integrated keyboard, 16-color graphics, and a command-line disk operating system.

1978
At Apple Computer, Steve Jobs proposes a "next generation" business machine with a graphical user interface. It becomes the Lisa project. Dan Bricklin and Bob Frankston's VisiCalc, with its text-based spreadsheet interface, becomes the personal computer's first killer app; it runs on the Apple II.

1981
IBM releases the PC with a 4.77 MHz processor, MS-DOS, a command-line interface, and monochrome block graphics.

1984
Apple ships the Macintosh, the first mass-market computer with a monochrome desktop GUI, plug and play, and a suite of GUI productivity applications.

1985
Microsoft ships Windows 1.0, its first graphical environment.

1990 Microsoft announces Windows 3.0; adds 3-D look and feel, Program Manager and File Manager.

1992


Apple announces the Newton PDA with a pen-based user interface.

1993
Early Web browsers: the ECP Web browser for Macintosh is released. NCSA releases Marc Andreessen's Mosaic Web browser for X Window.

1995
Microsoft introduces Bob, the industry's first "Social User Interface", featuring animated "assistants." Bob bombs. Watch "Remembering the Bob" at Tech TV. Microsoft ships Windows 95, regarded by many as the release that offers features comparable with Apple's Mac. It's the fastest-selling operating system ever shipped.

1997
Microsoft Active Desktop integrates the Web with Windows. Netscape Communicator and Constellation combine Web and desktop GUI. Microsoft invests $150,000,000 in Apple Computer.

1998 Windows 98 released. A good portion of the world still using the abacus, maybe 2 people using the TRS-80.

Defining the Terms Architecture, Design, and Implementation

Introduction

Over the past 10 years many practitioners and researchers have sought to define software architecture. At the SEI, we use the following definition: The software architecture of a program or computing system is the structure or structures of the system, which comprise software elements, the externally visible properties of those elements, and the relationships among them. However, we are interested not only in understanding the term "software architecture" but in clarifying the difference between architecture and other related terms such as "design" and "implementation." The lack of a clear distinction among these terms is the cause of much muddy thinking, imprecise communication, and wasted, overlapping effort. For example, "architecture" is often used as a mere synonym for "design" (sometimes preceded with the adjective "high-level"). And many people use the term "architectural patterns" as a synonym for "design patterns."

Confusion also stems from the use of the same specification language for both architectural and design specifications. For example, UML is often used as an architectural description language. In fact, UML has become the industry de facto standard for describing architectures, although it was specifically designed to manifest detailed design decisions (and this is still its most common use). This merely contributes to the confusion, since a designer using UML has no way (within UML) of distinguishing architectural information from other types of information. Confusion also exists with respect to the artifacts of design and implementation. UML class diagrams, for instance, are a prototypical artifact of the design phase. Nonetheless, class diagrams may accumulate enough detail to allow code generation of very detailed programs, an approach that is promoted by CASE tools such as Rational Rose and System Architect. Using the same specification language further blurs the distinction between artifacts of the design (class diagrams) and artifacts of the implementation (source code). Having a unified specification language is, in many ways, a good thing. But a user of this unified language is given little help in knowing if a proposed change is "architectural" or not.

Why are we interested in such distinctions? Naturally, a well-defined language improves our understanding of the subject matter. With time, terms that are used interchangeably lose their meaning, resulting inevitably in ambiguous descriptions given by developers, and significant effort is wasted in discussions of the form "by design I mean…and by architecture I mean…" Seeking to separate architectural design from other design activities, definers of software architecture in the past have stressed the following:

1. "Architecture is concerned with the selection of architectural elements, their interaction, and the constraints on those elements and their interactions…Design is concerned with the modularization and detailed interfaces of the design elements, their algorithms and procedures, and the data types needed to support the architecture and to satisfy the requirements."
2. Software architecture is "concerned with issues...beyond the algorithms and data structures of the computation."
3. "Architecture…is specifically not about…details of implementations (e.g., algorithms and data structures.)…Architectural design involves a richer collection of abstractions than is typically provided by OOD" (object-oriented design).


In suggesting typical “architectures” and “architectural styles,” existing definitions consist of examples and offer anecdotes rather than providing clear and unambiguous notions. In practice, the terms “architecture,” “design,” and “implementation” appear to connote varying degrees of abstraction in the continuum between complete details (“implementation”), few details (“design”), and the highest form of abstraction (“architecture”). But the amount of detail alone is insufficient to characterize the differences, because architecture and design documents often contain detail that is not explicit in the implementation (e.g., design constraints, standards, performance goals). Thus, we would expect a distinction between these terms to be qualitative and not merely quantitative. The ontology that we provide below can serve as a reference point for these discussions. The Intension/Locality Thesis To elucidate the relationship between architecture, design, and implementation, we distinguish at least two separate interpretations for abstraction in our context: 1. Intensional (vs. extensional) design specifications are “abstract” in the sense that they can be formally characterized by the use of logic variables that range over an unbounded domain. For example, a layered architectural pattern does not restrict the architect to a specific number of layers; it applies equally well to 2 layers or 12 layers. 2. Non-local (vs. local) specifications are “abstract” in the sense that they apply to all parts of the system (as opposed to being limited to some part thereof). Both of these interpretations contribute to the distinction among architecture, design, and implementation, summarized as the “intension/locality thesis”: 1. Architectural specifications are intensional and non-local 2. Design specifications are intensional but local 3. Implementation specifications are both extensional and local

Table 1 summarizes these distinctions.

Table 1. The Intension/Locality Thesis

Architecture:    Intensional    Non-local
Design:          Intensional    Local
Implementation:  Extensional    Local


Implications

What are the implications of such definitions? They give us a firm basis for determining what is architectural (and hence crucial for the achievement of a system's quality attribute requirements) and what is not. Consider the concept of a strictly layered architecture (an architecture in which each layer is allowed to use only the layer immediately below it). How do we know that the architectural style "layered" is really architectural? To answer that we need to answer whether this style is intensional and whether it is local or non-local. First of all, are there an unbounded number of implementations that qualify as layered? Clearly there are. Secondly, is the layered style local or non-local? To answer that, we need only consider a violation of the style, where a layer depends on a layer above it, or several layers below it. Since this would be a violation wherever it occurred, the notion of a layered architecture must be non-local. What about a design pattern, such as the factory pattern? This is intensional, because there may be an unbounded number of realizations of a factory design pattern within a system. But is it local or non-local? One may use a design pattern in some corner of the system and not use it (or even violate it) in a different portion of the same system. So design patterns are local. Similarly, it is simple to show that the term "implementation" refers only to artifacts that are extensional and local.

Conclusions Since the inception of architecture as a distinct field of study, there has been much confusion about what the term “architecture” means. Similarly, the distinction between architecture and other forms of design artifacts has never been clear. The intension/locality thesis provides a foundation for determining the meaning of the terms architecture, design, and implementation that accords not only with intuition but also with best industrial practices. A more formal and complete treatment of this topic can be found in our paper, “Architecture, Design, Implementation.” But what are the consequences of precisely knowing the differences among these terms? Is this an exercise in definition for definition’s sake? We think not. Among others, these distinctions facilitate 1. determining what constitutes a uniform program (e.g., a collection of modules that satisfy the same architectural specifications) 2. determining what information goes into architecture documents and what goes into design documents


3. determining what to examine and what not to examine in an architectural evaluation or a design walkthrough 4. understanding the distinction between local and non-local rules (i.e., between the design rules that are enforced throughout a project versus those that are of a more limited domain, because the architectural rules define the fabric of the system and how it will meet its quality attribute requirements, and the violation of architectural rules typically has more far-reaching consequences than the violation of a local rule) Furthermore, in the industrial practice of software architecture, many statements that are said to be “architectural” are in fact local (e.g., both tasks A and B execute on the same node, or task A controls B). Instead, a truly architectural statement would be, for instance, for each pair of tasks A,B that satisfy some property X, A and B will execute on the same node and the property Control(A,B) holds. More generally, for each specification we should be able to determine whether it is a design statement, describing a purely local phenomenon (and hence of secondary interest in architectural documentation, discussion, or analysis), or whether it is an instance of an underlying, more general rule. This is a powerful piece of information.

How do you understand the difference between architecture and design?

· I'd say that architecture is a view of software that's at a higher level than design, i.e. more abstract and less connected with the actual implementation. The architecture gives structure to the design elements, while the design elements give structure to the implemented code.

· The software architecture of a program or computing system is the structure or structures of the system, which comprise software elements, the externally visible properties of those elements, and the relationships among them. Design -- the process of defining the architecture, components, interfaces, and other characteristics of a system or component. So, design is a process of producing an instance of a software architecture. Software architecture is a domain of knowledge about abstract models and organization. Software architecture is not a low-level design.

· I would add that architecture is design, but not all design is architecture.


LESSON II: Combinational Logic

Introduction

Digital electronics is classified into combinational logic and sequential logic. Combinational logic output depends only on the input levels, whereas sequential logic output depends on stored levels and also on the input levels.

The memory elements are devices capable of storing binary info. The binary info stored in the memory elements at any given time defines the state of the sequential circuit. The input and the present state of the memory element determine the output. The memory element's next state is also a function of the external inputs and the present state. A sequential circuit is specified by a time sequence of inputs, outputs, and internal states.


There are two types of sequential circuits. Their classification depends on the timing of their signals:

· Synchronous sequential circuits
· Asynchronous sequential circuits

Asynchronous sequential circuit This is a system whose outputs depend upon the order in which its input variables change and can be affected at any instant of time. Gate-type asynchronous systems are basically combinational circuits with feedback paths. Because of the feedback among logic gates, the system may, at times, become unstable. Consequently they are not often used.

Synchronous sequential circuits This type of system uses storage elements called flip-flops that are employed to change their binary value only at discrete instants of time. Synchronous sequential circuits use logic gates and flip-flop storage devices. Sequential circuits have a clock signal as one of their inputs. All state transitions in such circuits occur only when the clock value is either 0 or 1 or happen at the rising or falling edges of the clock depending on the type of memory elements used in the circuit. Synchronization is achieved by a timing device called a clock pulse generator. Clock pulses are distributed throughout the system in such a way that the flip-flops are affected only with the arrival of the synchronization pulse. Synchronous sequential circuits that use clock pulses in the inputs are called clocked-sequential circuits. They are stable and their timing can easily be broken down into independent discrete steps, each of which is considered separately.


A clock signal is a periodic square wave that indefinitely switches from 0 to 1 and from 1 to 0 at fixed intervals. Clock cycle time (or clock period) is the time interval between two consecutive rising or falling edges of the clock. Clock frequency = 1 / clock cycle time (measured in cycles per second, or Hz). Example: for a clock cycle time of 10 ns, the clock frequency is 100 MHz.

Concept of Sequential Logic

A sequential circuit, as seen on the last page, is combinational logic with some feedback to maintain its current value, like a memory cell. To understand the basics, let's consider the basic feedback logic circuit below, which is a simple NOT gate whose output is connected to its input. The effect is that the output oscillates between HIGH and LOW (i.e. 1 and 0). The oscillation frequency depends on gate delay and wire delay. Assuming a wire delay of 0 and a gate delay of 10 ns, the oscillation frequency would be 1 / (on time + off time = 20 ns) = 50 MHz. The basic idea of having the feedback is to store or hold the value, but in the above circuit the output keeps toggling. We can overcome this problem with the circuit below, which is basically a cascade of two inverters, so that the feedback is in phase and toggling is avoided. The equivalent circuit is the same as having a buffer with its output connected to its input.
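
The frequency arithmetic above (f = 1 / T, giving 100 MHz for the 10 ns clock and 50 MHz for the 20 ns inverter loop) can be checked with a tiny C sketch; the numbers are the example values from the text, not measurements.

    /* A quick check of the example arithmetic: f = 1 / T for the clock, and
       1 / (2 * gate delay) for the single-inverter feedback loop. */
    #include <stdio.h>

    int main(void) {
        double clock_period_s = 10e-9;                       /* 10 ns cycle time */
        double gate_delay_s   = 10e-9;                       /* assumed NOT-gate delay */

        double clock_freq_hz = 1.0 / clock_period_s;         /* 100 MHz */
        double osc_freq_hz   = 1.0 / (2.0 * gate_delay_s);   /* 50 MHz  */

        printf("clock frequency      = %.0f MHz\n", clock_freq_hz / 1e6);
        printf("oscillator frequency = %.0f MHz\n", osc_freq_hz / 1e6);
        return 0;
    }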

But there is a problem here too: each gate output value is stable, but what will it be? Or in other words, the buffer output cannot be known. There is no way to tell. If we could know or set the value, we would have a simple 1-bit storage/memory element.


The circuit below is the same as the inverters connected back to back with provision to set the state of each gate (NOR gate with both inputs shorted is like an inverter). I am not going to explain the operation, as it is clear from the truth table. S is called set and R is called Reset.

S  R  Q  Q+
0  0  0  0
0  0  1  1
0  1  X  0
1  0  X  1
1  1  X  0

There still seems to be some problem with the above configuration: we cannot control when the input should be sampled; in other words, there is no enable signal to control when the input is sampled. Normally input enable signals can be of two types:

· Level sensitive (LATCH)
· Edge sensitive (Flip-Flop)

Level Sensitive: The circuit below is a modification of the above one to have a level sensitive enable input. Enable, when LOW, masks the inputs S and R. When HIGH, it presents S and R to the sequential logic input (the two NOR gates of the above circuit). Thus Enable, when HIGH, transfers the inputs S and R to the sequential cell transparently, so this kind of sequential circuit is called a transparent latch. The memory element we get is an RS latch with active-high Enable.


Edge Sensitive: The circuit below is a cascade of two level sensitive memory elements, with a phase shift in the enable input between the first memory element and the second memory element. The first RS latch (i.e. the first memory element) will be enabled when the CLK input is HIGH and the second RS latch will be enabled when CLK is LOW. The net effect is that the input RS is moved to Q and Q' when CLK changes state from HIGH to LOW; this HIGH-to-LOW transition is called the falling edge. So the edge sensitive element we get is called a negative-edge RS flip-flop.

Now that we know the sequential circuit basics, let's look at each of them in detail in accordance with what is taught in colleges. You are always welcome to suggest if this can be written better in any way.

Latches and Flip-Flops

There are two types of sequential circuits:

· Asynchronous circuits
· Synchronous circuits

As seen in the last section, latches and flip-flops are one and the same with a slight variation: latches have a level sensitive control signal input and flip-flops have an edge sensitive control signal input. Flip-flops and latches which use these control signals are called synchronous circuits; if they don't use clock inputs, they are called asynchronous circuits.


RS Latch

The RS latch has two inputs, S and R. S is called set and R is called reset. The S input is used to produce HIGH on Q (i.e. store binary 1 in the flip-flop). The R input is used to produce LOW on Q (i.e. store binary 0 in the flip-flop). Q' is Q's complementary output, so it always holds the opposite value of Q. The output of the S-R latch depends on the current as well as the previous inputs or state, and its state (value stored) can change as soon as its inputs change. The circuit and the truth table of the RS latch are shown below. (This circuit is as we saw on the last page, but arranged to look beautiful :-)).

S  R  Q  Q+
0  0  0  0
0  0  1  1
0  1  X  0
1  0  X  1
1  1  X  0

The operation has to be analyzed with the 4 input combinations together with the 2 possible previous states.

· When S = 0 and R = 0: If we assume Q = 1 and Q' = 0 as the initial condition, then output Q after the input is applied would be Q = (R + Q')' = 1 and Q' = (S + Q)' = 0. Assuming Q = 0 and Q' = 1 as the initial condition, then output Q after the input is applied would be Q = (R + Q')' = 0 and Q' = (S + Q)' = 1. So it is clear that when both S and R inputs are LOW, the output is retained as before the application of inputs (i.e. there is no state change).

· When S = 1 and R = 0: If we assume Q = 1 and Q' = 0 as the initial condition, then output Q after the input is applied would be Q = (R + Q')' = 1 and Q' = (S + Q)' = 0. Assuming Q = 0 and Q' = 1 as the initial condition, then output Q after the input is applied would be Q = (R + Q')' = 1 and Q' = (S + Q)' = 0. So in simple words, when S is HIGH and R is LOW, output Q is HIGH.

· When S = 0 and R = 1: If we assume Q = 1 and Q' = 0 as the initial condition, then output Q after the input is applied would be Q = (R + Q')' = 0 and Q' = (S + Q)' = 1. Assuming Q = 0 and Q' = 1 as the initial condition, then output Q after the input is applied would be Q = (R + Q')' = 0 and Q' = (S + Q)' = 1. So in simple words, when S is LOW and R is HIGH, output Q is LOW.

· When S = 1 and R = 1: No matter what state Q and Q' are in, applying 1 at the input of a NOR gate always results in 0 at its output, which results in both Q and Q' being set to LOW (i.e. Q = Q'). LOW on both outputs is basically wrong, so this case is invalid.
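
The case analysis above can be checked with a small C simulation of the cross-coupled NOR equations Q = (R + Q')' and Q' = (S + Q)'. This is only a sketch written for this tutorial; the function name and the fixed iteration count are arbitrary choices, and the loop simply re-evaluates the two gates until the outputs settle.

    /* A sketch of the NOR-latch equations; not the text's code. */
    #include <stdio.h>

    static void rs_latch(int S, int R, int *Q, int *Qn) {
        for (int i = 0; i < 4; i++) {      /* a few passes are enough to settle */
            *Q  = !(R | *Qn);              /* Q  = (R + Q')' */
            *Qn = !(S | *Q);               /* Q' = (S + Q)'  */
        }
    }

    int main(void) {
        int Q = 0, Qn = 1;                               /* assumed initial state */
        rs_latch(1, 0, &Q, &Qn); printf("S=1 R=0 -> Q=%d\n", Q);  /* set   */
        rs_latch(0, 0, &Q, &Qn); printf("S=0 R=0 -> Q=%d\n", Q);  /* hold  */
        rs_latch(0, 1, &Q, &Qn); printf("S=0 R=1 -> Q=%d\n", Q);  /* reset */
        return 0;
    }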

The waveform below shows the operation of the NOR-gate-based RS latch.

It is possible to construct the RS latch using NAND gates (of course, as seen in the Logic Gates section). The only difference is that NAND is the dual form of the NOR gate (did I say that in the Logic Gates section?). So in this case the R = 0 and S = 0 case becomes the invalid case. The circuit and truth table of the RS latch using NAND gates are shown below.


S  R  Q  Q+
1  1  0  0
1  1  1  1
0  1  X  0
1  0  X  1
0  0  X  1

If you look closely, there is no control signal (i.e. no clock and no enable), so these kinds of latches or flip-flops are called asynchronous logic elements. Since all the sequential circuits are built around the RS latch, we will concentrate on synchronous circuits and not on asynchronous circuits.

RS Latch with Clock

We have seen this circuit earlier with two possible input configurations: one with a level sensitive input and one with an edge sensitive input. The circuit below shows the level sensitive RS latch. The control signal "Enable" E is used to gate the inputs S and R to the RS latch. When Enable E is HIGH, both AND gates act as buffers and thus R and S appear at the RS latch inputs, and it functions like a normal RS latch. When Enable E is LOW, it drives LOW to both inputs of the RS latch. As we saw on the previous page, when both inputs of a NOR latch are low, the values are retained (i.e. the output does not change).


Setup and Hold Time

For synchronous flip-flops, we have special requirements for the inputs with respect to the clock signal input. They are:

· Setup Time: Minimum time period during which data must be stable before the clock makes a valid transition. For example, for a posedge triggered flip-flop with a setup time of 2 ns, the input data (i.e. R and S in the case of an RS flip-flop) should be stable for at least 2 ns before the clock makes its transition from 0 to 1.

· Hold Time: Minimum time period during which data must be stable after the clock has made a valid transition. For example, for a posedge triggered flip-flop with a hold time of 1 ns, the input data (i.e. R and S in the case of an RS flip-flop) should be stable for at least 1 ns after the clock has made its transition from 0 to 1.

If the data makes a transition within the setup window or within the hold window, then the flip-flop output is not predictable, and the flip-flop enters what is known as a metastable state. In this state the flip-flop output oscillates between 0 and 1. It takes some time for the flip-flop to settle down. The whole process is called metastability. You could refer to the tidbits section for more information on this topic. The waveform below shows input S (R is not shown), CLK, and output Q (Q' is not shown) for an SR posedge flip-flop.
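
As a rough illustration of the setup/hold rule (not code from the text), the C sketch below compares the time of the last data change before a clock edge, and the first change after it, against example setup and hold times; all the numbers are hypothetical.

    /* Hypothetical timing check: data must be stable t_setup before the
       clock edge and t_hold after it. */
    #include <stdio.h>

    int main(void) {
        double t_setup_ns     = 2.0;    /* example setup time */
        double t_hold_ns      = 1.0;    /* example hold time  */
        double clk_edge_ns    = 100.0;  /* time of the 0 -> 1 clock transition */
        double last_change_ns = 99.5;   /* last data change before the edge   */
        double next_change_ns = 103.0;  /* first data change after the edge   */

        int setup_ok = (clk_edge_ns - last_change_ns) >= t_setup_ns;
        int hold_ok  = (next_change_ns - clk_edge_ns) >= t_hold_ns;

        if (setup_ok && hold_ok)
            printf("setup and hold met: output is predictable\n");
        else
            printf("timing violated: flip-flop may go metastable\n");
        return 0;
    }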


D Latch

The RS latch seen earlier contains an ambiguous state; to eliminate this condition we can ensure that S and R are never equal. This is done by connecting S and R together through an inverter. Thus we have the D latch: the same as the RS latch, with the only difference that there is only one input instead of two (R and S). This input is called D, or the Data input. The D latch is called a transparent D latch for the reasons explained earlier. Delay flip-flop or delay latch is another name used. Below is the truth table and circuit of the D latch. In real-world designs (ASIC/FPGA designs) only D latches/flip-flops are used.

D  Q  Q+
1  X  1
0  X  0
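
The truth table above amounts to a one-line rule: while Enable is HIGH the latch is transparent and Q follows D, otherwise Q is held. A minimal C sketch of that behavior (illustrative only, names made up) is:

    /* A one-line model of a transparent D latch. */
    #include <stdio.h>

    static int d_latch(int enable, int D, int Q) {
        return enable ? D : Q;    /* follows D when enabled, holds otherwise */
    }

    int main(void) {
        int Q = 0;
        Q = d_latch(1, 1, Q); printf("E=1 D=1 -> Q=%d\n", Q);  /* follows D */
        Q = d_latch(0, 0, Q); printf("E=0 D=0 -> Q=%d\n", Q);  /* holds 1   */
        return 0;
    }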


Below is the D latch waveform, which is similar to the RS latch one, but with R removed.

JK Latch

The ambiguous state output in the RS latch was eliminated in the D latch by joining the inputs with an inverter. But the D latch has a single input. The JK latch is similar to the RS latch in that it has 2 inputs, J and K, as shown in the figure below. The ambiguous state has been eliminated here: when both inputs are high, the output toggles. The only difference we see here is the output feedback to the inputs, which is not there in the RS latch.

J  K  Q
1  1  0
1  1  1
1  0  1
0  1  0

T Latch

When the two inputs of the JK latch are shorted, a T latch is formed. It is called a T latch because, when the input is held HIGH, the output toggles.

T  Q  Q+
1  0  1
1  1  0
0  1  1
0  0  0

JK Master-Slave Flip-Flop

All the sequential circuits that we have seen in the last few pages have a problem (all level sensitive sequential circuits have this problem). Before the enable input changes state from HIGH to LOW (assuming HIGH is the ON state and LOW is the OFF state), if the inputs change, then another state transition occurs for the same enable pulse. This sort of multiple-transition problem is called racing. If we make the sequential element sensitive to edges, instead of levels, we can overcome this problem, as the input is evaluated only on enable/clock edges.


In the figure above there are two latches; the first latch on the left is called the master latch and the one on the right is called the slave latch. The master latch is positively clocked and the slave latch is negatively clocked.
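
A small C sketch may help show how the two oppositely clocked latches give edge-triggered behavior. For simplicity it models a D-type master-slave rather than the JK version in the figure (an assumption made here, not the text's circuit): the master follows the input while CLK is HIGH, the slave copies the master while CLK is LOW, so the visible output Q changes only on the falling edge.

    /* A sketch of the master-slave idea, using a D-type pair for simplicity. */
    #include <stdio.h>

    int main(void) {
        int master = 0, slave = 0;
        int D_seq[]   = { 1, 1, 0, 0, 1, 1 };
        int CLK_seq[] = { 1, 0, 1, 0, 1, 0 };

        for (int t = 0; t < 6; t++) {
            if (CLK_seq[t]) master = D_seq[t];  /* master follows D while CLK = 1 */
            else            slave  = master;    /* slave copies master while CLK = 0 */
            printf("t=%d CLK=%d D=%d Q=%d\n", t, CLK_seq[t], D_seq[t], slave);
        }
        return 0;
    }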

Sequential Circuits Design

We saw in the combinational circuits section how to design a combinational circuit from the given problem. We convert the problem into a truth table, then draw the K-map for the truth table, and then finally draw the gate-level circuit for the problem. Similarly we have a flow for sequential circuit design. The steps are given below.

· Draw the state diagram.
· Draw the state table (excitation table) for each output.
· Draw the K-map for each output.
· Draw the circuit.


Looks like the sequential circuit design flow is very much the same as for combinational circuits.

State Diagram

The state diagram is constructed using all the states of the sequential circuit in question. It builds up the relationship between the various states and also shows how inputs affect the states. To ease the following of the tutorial, let's consider designing a 2-bit up counter (a binary counter is one which counts a binary sequence) using the T flip-flop. Below is the state diagram of the 2-bit binary counter.

State Table

The state table is the same as the excitation table of a flip-flop, i.e. what inputs need to be applied to get the required output. In other words, this table gives the inputs required to produce the specific outputs.

Q1  Q0  Q1+  Q0+  T1  T0
0   0   0    1    0   1
0   1   1    0    1   1
1   0   1    1    0   1
1   1   0    0    1   1

K-map

The K-map is the same as the combinational circuits K-map. The only difference: we draw the K-map for the inputs, i.e. T1 and T0 in the above table. From the table we deduce that we don't need to draw a K-map for T0, as it is high for all the state combinations. But for T1 we need to draw the K-map as shown below, using SOP.

Circuit

There is nothing special in drawing the circuit; it is the same as any circuit drawing from a K-map output. Below is the circuit of the 2-bit up counter using T flip-flops.
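
To tie the table, K-map, and circuit together, here is a short C simulation of the resulting counter (a sketch written for this lesson, not taken from the text). It applies the excitation found above, T0 = 1 and T1 = Q0, at each clock and prints the state sequence 00, 01, 10, 11, 00, and so on.

    /* A short simulation of the finished 2-bit up counter. */
    #include <stdio.h>

    int main(void) {
        int Q1 = 0, Q0 = 0;                 /* start in state 00 */
        for (int clk = 0; clk < 8; clk++) {
            printf("state = %d%d\n", Q1, Q0);
            int T1 = Q0;                    /* from the K-map: toggle Q1 when Q0 = 1 */
            int T0 = 1;                     /* from the state table: always toggle Q0 */
            if (T1) Q1 = !Q1;               /* T flip-flop: toggle when T = 1 */
            if (T0) Q0 = !Q0;
        }
        return 0;
    }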


OPERATING MANUAL FOR LOGIC BASIC SERIES ELECTRONIC INDICATORS

Choice of Three Power Sources

1. Batteries
A set of two Manganese Dioxide Lithium batteries will operate this electronic indicator for approximately 250 hours of normal usage. Because milliampere-hour ratings vary widely with manufacturers, normal usage time is very hard to predict. The lithium battery used in this indicator is an IEC standard, type CR2450. The indicators are shipped with the batteries not installed, and they should not be installed until battery operation is desired.
NOTE: This indicator has an "AUTO-OFF" feature to conserve battery life. After 10 minutes of "no activity" (no key presses or spindle movement), the gage will turn itself off. This feature may be disabled if continuous operation is desired; see the "AUTO-OFF On/Off" instructions in this book.

Installing Batteries


Using a narrow screwdriver, gently pry under the tab on the left side of the plastic bezel and slide out the battery tray as you turn the indicator face side down. Insert two batteries, "+" side up, into the tray cavities, then slide the tray back into its bezel slot, taking care that the batteries stay in proper position.

2. AC Adapter
AC adapters (providing 9 VDC at 30 mA maximum to the indicator from a 115 or 230 VAC, 50/60 Hz line source) may be purchased from CDI. Although other 9V AC adapters with a 3/32" (2.5 mm) mini-plug (center +) may be used, CDI adapters are recommended because they include current limiting to prevent damage from line fluctuations.
For 115 V (USA) operation - Order CDI Part #G11-0012
For 230 V (Europe) operation - Order CDI Part #G11-0014
First insert the mini-plug into the socket on the lower left side of the bezel (see drawing on page 2), then plug the adapter into a wall outlet. After turning the indicator "ON", disable the "AUTO-OFF" feature; see "AUTO-OFF On/Off".

3. Data I/O Connector
Power also may be provided through the data I/O connector, for special applications where the indicator is integrated with another fixture or piece of equipment. A ripple-free 5 VDC (4.9 to 5.7 V) regulated voltage source is required. CDI Cable #G13-0034 or a custom variation of another CDI data cable must be used. Contact CDI for full information.


Button Functions

NOTE: Most functions are active on release of button(s).

Key - Function Controlled
OFF - Press & Release: Turns indicator off.
ON/CLR - Press & Release: Turns indicator on, clears/resets indicator. With HOLD off: clears display to "0". With MAX HOLD on: clears display to spindle position, leaves HOLD on. Press & Hold (for longer than 5 seconds): Enter/Exit display and key test mode.
HOLD - Press & Release: Turns hold function on/off and cancels last selection.
2ND - Press & Hold (for more than 2 seconds until 2ND is displayed): Enables 2ND and 3RD functions such as TR REV (Travel Reverse), IN/MM and AUTO OFF.
CHNG - Used with 2ND key to activate selectable resolution.



Display-Operating Prompts & Conditions

Operating Instructions


To Turn AUTO OFF On/Off
- Press and hold "2ND" until 2ND appears at bottom of display, then release.
- Press and release "OFF" within 3 seconds.
NOTE: An hourglass appears at left side of display if 'AUTO OFF' is active.

To Clear Display to zero
- Press and release "ON/CLR".

To Verify DATA I/O FORMAT
To view the current output format:
- Press and release "2ND" until the 2ND appears in display, then "ON/CLR" and "2ND" in sequence.
Format information is displayed for about 3 seconds, then the indicator automatically returns to normal operation. Format information is displayed as:
RS232 = rS232
MTI compatible = SEr
CDI mux BCD = Cdi
Bypass = bP

To Use HOLD
To select type of HOLD - Freeze, Minimum or Maximum:
- Press and hold "HOLD" until cursor moves under desired type of hold (FRZ, MIN or MAX), then release.
To turn HOLD On/Off:
- Press and release "HOLD".
MAX HOLD - Holds and displays highest reading.
MIN HOLD - Holds and displays lowest reading.
FREEZE HOLD - Freezes display when "HOLD" button is pressed.
NOTE: Pressing the CLR button resets indicator to spindle position.

To Change INCH/MILLIMETER
To change from one to the other:
- Press and hold "MOVE/2ND" until 2ND appears at bottom of display, then release.
- Press and release "TOL" within 3 seconds.
NOTE: MM or IN will appear at bottom of display.


To Turn INDICATOR ON
- Press "ON/CLR" and release when 'clr' appears on display.
To Turn INDICATOR OFF
- Press and release "OFF".

To Reset to DEFAULT
A total reset clears all user settings and returns to factory-set defaults.
1. Press and hold "2ND" until 2ND appears at bottom of display, then release.
2. Press and release "ON/CLR" within 3 seconds.
3. Press and release "CHNG" within 3 seconds.
NOTE: Cannot be done if Lock feature is on.

To Change RESOLUTION
- Press and hold "2ND" until 2ND appears at bottom of display, then release.
- Press and release "ON/CLR" within 3 seconds.
- Press and release "HOLD" within 3 seconds.
- Use the "CHNG" key to step through available resolution selections:
  1 = .00005" (.001mm)
  2 = .0001" (.002mm)
  3 = .00025" (.005mm)
  4 = .0005" (.01mm)
  5 = .001" (.02mm)
- Press and release "CHNG" and "2ND" simultaneously to save.
NOTE: Only resolutions coarser than the indicator's resolution as purchased are available.

To Enter TEST MODE
- Press and hold "ON/CLR" (for more than 5 seconds) to enter 'display and key' test mode.

To Exit TEST MODE


- Press and hold "ON/CLR" (for more than 5 seconds) to exit 'display and key' test mode.

To Change TRAVEL DIRECTION
- Press and hold "2ND" until 2ND appears at bottom of display, then release.
- Press and release "HOLD" within 3 seconds.
NOTE: The arrow in the upper right corner will show the positive direction of spindle travel.
NOTE: Most functions are active on release of key(s).

Internal Memory
"LOGIC" Series indicators and remote displays include internal non-volatile memory to store all factory default and user settings. When the indicator is turned on, user settings and preset numbers will be the same as when the indicator was turned off. NOTE: Many of the user settings are stored when the indicator is turned 'Off' by using the "OFF" key, or when the indicator turns itself off (AUTO OFF). However, if the indicator is turned off by removing power (by disconnecting the AC adapter or cutting power through the Data I/O connector), some or all of the user settings and/or changes may be lost!

Operating Precautions
1. Do not use the bottom of the spindle stroke as a base of measurement reference, as it is protected with a rubber shock absorber to prevent shock to the internal mechanism. The spindle should be offset .005"-.010" (.12-.25 mm) from the bottom of travel.
2. Use of CDI type MS-10 or similar sturdy stands or fixtures for indicator mounting, where the base plate and indicator are mounted to a common post, is highly recommended for accurate and repeatable readings. The indicator must be mounted with the spindle perpendicular to the reference or base plate. If the indicator is stem-mounted, protect the indicator from attempted rotation, and from being struck or bumped, to prevent stem/case mechanical alignment damage. Do not over-tighten the mounting mechanism, and use clamp mounting rather than set screws if at all possible, to prevent damage to the stem.
3. The bezel face can be rotated from its normal horizontal position for convenient viewing. Rotation is limited to 270 degrees and attempts to force it past its internal stop may damage the indicator.
4. Frequently clean the spindle to prevent sluggish or sticky movement. Dry wiping with a lint-free cloth usually will suffice, but isopropyl alcohol may be used to remove gummy
deposits. Do not apply any type of lubricant to the spindle. Spindle dust caps and spindle boots are available for operation in dirty or abrasive environments.
1" Spindle dust cap - Order CDI Part #A21-0131
1" Spindle boot - Order CDI Part #CD170-1
Use a soft cloth dampened with a mild detergent to clean the bezel and front face of the indicator. Do not use aromatic solvents as they may cause damage.
5. Extremely high electrical transients - from nearby arc welders, SCR motor/lighting controls, radio transmitters, etc. - may cause malfunctions of the indicator's internal circuitry or 'ERROR 1' indications, even though the electronic design was created to minimize such problems. If at all possible, do not operate the indicator in plant areas subject to these transients. Turning the indicator 'OFF' for a few seconds, then back 'ON' from time to time, may eliminate any problems. Also, use of an isolated AC line (for AC adapter operated indicators and AC powered remote displays), or an AC line filter - plus solid grounding of stands and fixtures - is recommended in these conditions.

Additional Display-Operating Prompts & Conditions
FLASHING DIGIT or +/- sign - Digit or sign affected by the "CHNG" key when setting or changing preset numbers.
FLASHING READING, with HIGH or LOW displayed - Reading is out of tolerance, to the high or low side.
ERROR 1 - Spindle speed too fast, high electrical noise, etc.
ERROR 2 - Counter overflow, i.e. counter number (spindle + preset number) out of counter range.
ERROR 3 - Improper tolerance combination, i.e. both "HIGH" and "LOW" set to '0' or the same number, or "LOW" set to a higher number than "HIGH". Occurs only when 'TOL' is on.
ERROR 4 - Display overflow, i.e. number too large to be properly displayed. Moving the spindle to an acceptable range eliminates the error message.

Data Output
'LOGIC' Series indicators and remote displays provide users with multiple data output formats. The cable attached to the indicator when it is turned on determines the output format in use. Cables for each format can be purchased from CDI. These cables also provide remote control of 'ON/CLR' and 'HOLD' functions, plus +5V regulated power input. For special applications, an ERROR FLAG output and/or custom cables also can be provided; contact CDI for information. CAUTION: Use of cables other than those provided or approved by CDI can cause irreparable damage to the indicator or data output port, and such damage is not covered by the CDI Limited Warranty.


Standard RS232 Format - Communications protocol is 1200 baud, no parity, 8 data bits, and 1 stop bit. RS232 can be read by any IBM PC-compatible computer, RS232 serial printer or other device, provided the device can be set to this protocol. A DB25 pin adapter may be necessary for non-standard devices. "WINDOWS" terminal and other communications software, "WEDGE" software, etc., may be used with this format.
Cables Required:
CDI #G03-0018 - For IBM compatible PC (CDI indicator to DB25F)
CDI #G03-0021 - For CDI serial printer types G19-0001/G19-0002 & G19-0003 (CDI indicator to DB25M)

MITUTOYO Compatible Format - Use with MITUTOYO compatible printers, collection devices, etc.
Cable Required: CDI #G03-0019 - CDI indicator to MTI 10 pin

CDI (Multiplexed BCD) Format - Furnished with pigtails on one end.
Cable Required: CDI #G13-0034 - Also may be used for remote control of 'ON/CLR' or 'HOLD' functions, or external power (+5V regulated) input. (CDI indicator to pigtail wires.)

BYPASS FORMAT - Permits the indicator to be used as a probe for the CDI remote display: bypasses 'raw' unprocessed signals from the detector system directly to the data output connector. In this operation mode, power for the indicator is supplied by the remote display.
Cable Required: CDI #G13-0022 - CDI indicator to 6-pin DIN
IMPORTANT - Indicator and remote display must be of the same base resolution. If the two are of different base resolutions, you will experience compatibility problems.

Limited Warranty
"PLUS SERIES" INDICATORS ARE WARRANTED FOR A PERIOD OF ONE YEAR AGAINST DEFECTIVE MATERIALS OR WORKMANSHIP. THIS WARRANTY DOES NOT APPLY TO PRODUCTS THAT ARE MISHANDLED, MISUSED, ETCHED, STAMPED, OR OTHERWISE MARKED OR DAMAGED, NOR DOES IT APPLY TO DAMAGE OR ERRONEOUS OPERATION CAUSED BY USER TAMPERING OR ATTEMPTS TO MODIFY THE INDICATOR. UNITS FOUND TO BE DEFECTIVE WITHIN THE WARRANTY PERIOD WILL BE REPAIRED OR REPLACED FREE OF CHARGE AT THE OPTION OF CDI. A NOMINAL CHARGE WILL BE MADE FOR NON-WARRANTY REPAIRS, PROVIDED THE UNIT IS NOT DAMAGED BEYOND REPAIR.

Boolean algebra


For a basic introduction to sets, Boolean operations, Venn diagrams, truth tables, and Boolean applications, see Boolean logic. For an alternative perspective see Boolean algebras canonically defined.

In abstract algebra, a Boolean algebra is an algebraic structure (a collection of elements and operations on them obeying defining axioms) that captures essential properties of both set operations and logic operations. Specifically, it deals with the set operations of intersection, union and complement, and the logic operations of AND, OR, NOT. For example, the logical assertion that a statement a and its negation ¬a cannot both be true, a ∧ ¬a = 0, parallels the set-theory assertion that a subset A and its complement A^C have empty intersection, A ∩ A^C = ∅.

Because truth values can be represented as binary numbers or as voltage levels in logic circuits, the parallel extends to these as well. Thus the theory of Boolean algebras has many practical applications in electrical engineering and computer science, as well as in mathematical logic. A Boolean algebra is also called a Boolean lattice. The connection to lattices (special partially ordered sets) is suggested by the parallel between set inclusion, A ⊆ B, and ordering, a ≤ b. Consider the lattice of all subsets of {x,y,z}, ordered by set inclusion. This Boolean lattice is a partially ordered set in which, say, {x} ≤ {x,y}. Any two lattice elements, say p = {x,y} and q = {y,z}, have a least upper bound, here {x,y,z}, and a greatest lower bound, here {y}. Suggestively, the least upper bound (or join or supremum) is denoted by the same symbol as logical OR, p∨q; and the greatest lower bound (or meet or infimum) is denoted by the same symbol as logical AND, p∧q.
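As a small hedged illustration of this lattice (Python is used here only as a convenience; it is not part of the original text), the join and meet of the two subsets above can be computed directly with set union and intersection:

p = {"x", "y"}
q = {"y", "z"}
join = p | q          # least upper bound: union, here {'x', 'y', 'z'}
meet = p & q          # greatest lower bound: intersection, here {'y'}
print(sorted(join), sorted(meet))
print({"x"} <= {"x", "y"})   # set inclusion plays the role of the ordering <=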


The lattice interpretation helps in generalizing to Heyting algebras, which are Boolean algebras freed from the restriction that either a statement or its negation must be true. Heyting algebras correspond to intuitionistic (constructivist) logic just as Boolean algebras correspond to classical logic.

Formal definition
A Boolean algebra is a set A, supplied with two binary operations ∧ (called AND) and ∨ (called OR), a unary operation ¬ (called NOT) and two distinct elements 0 (called zero) and 1 (called one), such that, for all elements a, b and c of the set A, the following axioms hold:

associativity: a ∨ (b ∨ c) = (a ∨ b) ∨ c and a ∧ (b ∧ c) = (a ∧ b) ∧ c
commutativity: a ∨ b = b ∨ a and a ∧ b = b ∧ a
absorption: a ∨ (a ∧ b) = a and a ∧ (a ∨ b) = a
distributivity: a ∨ (b ∧ c) = (a ∨ b) ∧ (a ∨ c) and a ∧ (b ∨ c) = (a ∧ b) ∨ (a ∧ c)
complements: a ∨ ¬a = 1 and a ∧ ¬a = 0

The first three pairs of axioms above (associativity, commutativity and absorption) mean that (A, ∧, ∨) is a lattice. If A is a lattice and one of the above distributivity laws holds, then the second distributivity law can be proven. Thus, a Boolean algebra can also be equivalently defined as a distributive complemented lattice. From these axioms, one can show that the smallest element 0, the largest element 1, and the complement ¬a of any element a are uniquely determined. For all a and b in A, the following identities also follow:

idempotency: a ∨ a = a and a ∧ a = a
boundedness: a ∨ 0 = a, a ∧ 1 = a, a ∨ 1 = 1 and a ∧ 0 = 0
0 and 1 are complements: ¬0 = 1 and ¬1 = 0
De Morgan's laws: ¬(a ∨ b) = ¬a ∧ ¬b and ¬(a ∧ b) = ¬a ∨ ¬b
involution: ¬¬a = a

Examples


·

The simplest Boolean algebra has only two elements, 0 and 1, and is defined by the rules:

   ∧ | 0  1        ∨ | 0  1        a | ¬a
   --+-----        --+-----        --+---
   0 | 0  0        0 | 0  1        0 |  1
   1 | 0  1        1 | 1  1        1 |  0

·

It has applications in logic, interpreting 0 as false, 1 as true, ∧ as and, ∨ as or, and ¬ as not. Expressions involving variables and the Boolean operations represent statement forms, and two such expressions can be shown to be equal using the above axioms if and only if the corresponding statement forms are logically equivalent.

·

The two-element Boolean algebra is also used for circuit design in electrical engineering; here 0 and 1 represent the two different states of one bit in a digital circuit, typically high and low voltage. Circuits are described by expressions containing variables, and two such expressions are equal for all values of the variables if and only if the corresponding circuits have the same input-output behavior. Furthermore, every possible input-output behavior can be modeled by a suitable Boolean expression.

·

The two-element Boolean algebra is also important in the general theory of Boolean algebras, because an equation involving several variables is generally true in all Boolean algebras if and only if it is true in the two-element Boolean algebra (which can always be checked by a trivial brute force algorithm; a short sketch of such a check appears after this list of examples). This can for example be used to show that the following laws (consensus theorems) are generally valid in all Boolean algebras:

(a ∨ b) ∧ (¬a ∨ c) ∧ (b ∨ c) ≡ (a ∨ b) ∧ (¬a ∨ c)

·

(a ∧ b) ∨ (¬a ∧ c) ∨ (b ∧ c) ≡ (a ∧ b) ∨ (¬a ∧ c)

·

Starting with the propositional calculus with κ sentence symbols, form the Lindenbaum algebra (that is, the set of sentences in the propositional calculus modulo tautology). This construction yields a Boolean algebra. It is in fact the free Boolean algebra on κ generators. A truth assignment in propositional calculus is then a Boolean algebra homomorphism from this algebra to {0,1}.

·

The power set (set of all subsets) of any given nonempty set S forms a Boolean algebra with the two operations ∨ := ∪ (union) and ∧ := ∩ (intersection). The smallest element 0 is the empty set and the largest element 1 is the set S itself.

·

The set of all subsets of S that are either finite or cofinite is a Boolean algebra.


·

For any natural number n, the set of all positive divisors of n forms a distributive lattice if we write a ≤ b for a | b. This lattice is a Boolean algebra if and only if n is square-free. The smallest element 0 of this Boolean algebra is the natural number 1; the largest element 1 of this Boolean algebra is the natural number n.

·

Other examples of Boolean algebras arise from topological spaces: if X is a topological space, then the collection of all subsets of X which are both open and closed forms a Boolean algebra with the operations ∨ := ∪ (union) and ∧ := ∩ (intersection).

·

If R is an arbitrary ring and we define the set of central idempotents by A = { e ∈ R : e^2 = e, ex = xe, ∀x ∈ R } then the set A becomes a Boolean algebra with the operations e ∨ f := e + f − ef and e ∧ f := ef.

·

Certain Lindenbaum–Tarski algebras.
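As mentioned above, an identity can be checked by brute force over the two-element Boolean algebra {0, 1}. The short Python sketch below (an illustration added here, using Python's bitwise operators to stand in for ∧, ∨ and ¬) verifies both consensus theorems listed earlier over all eight assignments:

from itertools import product

def NOT(a):     return 1 - a
def AND(a, b):  return a & b
def OR(a, b):   return a | b

for a, b, c in product((0, 1), repeat=3):
    # (a ∨ b) ∧ (¬a ∨ c) ∧ (b ∨ c)  ≡  (a ∨ b) ∧ (¬a ∨ c)
    assert AND(AND(OR(a, b), OR(NOT(a), c)), OR(b, c)) == AND(OR(a, b), OR(NOT(a), c))
    # (a ∧ b) ∨ (¬a ∧ c) ∨ (b ∧ c)  ≡  (a ∧ b) ∨ (¬a ∧ c)
    assert OR(OR(AND(a, b), AND(NOT(a), c)), AND(b, c)) == OR(AND(a, b), AND(NOT(a), c))
print("Both consensus theorems hold for all 8 assignments")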

Order theoretic properties

Boolean lattice of subsets

Like any lattice, a Boolean algebra (A, ∧, ∨) gives rise to a partially ordered set (A, ≤) by defining a ≤ b precisely when a = a ∧ b (which is also equivalent to b = a ∨ b).

In fact one can also define a Boolean algebra to be a distributive lattice (A, ≤) (considered as a partially ordered set) with least element 0 and greatest element 1, within which every element x has a complement ¬x such that

x ∧ ¬x = 0 and x ∨ ¬x = 1.


Here ∧ and ∨ are used to denote the infimum (meet) and supremum (join) of two elements. Again, if complements in the above sense exist, then they are uniquely determined. The algebraic and the order theoretic perspective can usually be used interchangeably, and both are of great use to import results and concepts from both universal algebra and order theory. In many practical examples an ordering relation, conjunction, disjunction, and negation are all naturally available, so that it is straightforward to exploit this relationship.

Principle of duality
One can also apply general insights from duality in order theory to Boolean algebras. In particular, the order dual of every Boolean algebra, or, equivalently, the algebra obtained by exchanging ∧ and ∨, is also a Boolean algebra. In general, any law valid for Boolean algebras can be transformed into another valid, dual law by exchanging 0 with 1, ∧ with ∨, and ≤ with ≥.

Other notation
The operators of Boolean algebra may be represented in various ways. Often they are simply written as AND, OR and NOT. In describing circuits, NAND (NOT AND), NOR (NOT OR) and XOR (exclusive OR) may also be used. Mathematicians, engineers, and programmers often use + for OR and · for AND (since in some ways those operations are analogous to addition and multiplication in other algebraic structures, and this notation makes it very easy to get sum-of-products form for people who are familiar with ordinary algebra) and represent NOT by a line drawn above the expression being negated. Sometimes the symbol ~ or ! is used for NOT. Here we use another common notation, with ∧ ("meet") for AND, ∨ ("join") for OR, and ¬ for NOT.
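The duality principle stated above can also be checked mechanically over {0, 1}. The sketch below (an added illustration, not part of the source text) verifies the absorption law a ∨ (a ∧ b) = a together with its dual a ∧ (a ∨ b) = a, obtained by exchanging ∨ and ∧:

from itertools import product

for a, b in product((0, 1), repeat=2):
    assert (a | (a & b)) == a    # absorption law
    assert (a & (a | b)) == a    # its dual: swap OR and AND
print("The absorption law and its dual both hold")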

Homomorphisms and isomorphisms
A homomorphism between the Boolean algebras A and B is a function f : A → B such that for all a, b in A:

f(a ∨ b) = f(a) ∨ f(b)
f(a ∧ b) = f(a) ∧ f(b)
f(0) = 0
f(1) = 1

It then follows that f(¬a) = ¬f(a) for all a in A as well. The class of all Boolean algebras, together with this notion of morphism, forms a category. An isomorphism from A to B is a homomorphism from A to B which is bijective. The inverse of an isomorphism is also an isomorphism, and we call the two Boolean algebras A and B isomorphic. From the standpoint of Boolean algebra theory, they cannot be distinguished; they differ only in the notation of their elements.

Boolean rings, ideals and filters
Every Boolean algebra (A, ∧, ∨) gives rise to a ring (A, +, *) by defining a + b = (a ∧ ¬b) ∨ (b ∧ ¬a) (this operation is called "symmetric difference" in the case of sets and XOR in the case of logic) and a * b = a ∧ b. The zero element of this ring coincides with the 0 of the Boolean algebra; the multiplicative identity element of the ring is the 1 of the Boolean algebra. This ring has the property that a * a = a for all a in A; rings with this property are called Boolean rings.

Conversely, if a Boolean ring A is given, we can turn it into a Boolean algebra by defining x ∨ y = x + y + xy and x ∧ y = xy. Since these two constructions are inverses of each other, we can say that every Boolean ring arises from a Boolean algebra, and vice versa. Furthermore, a map f : A → B is a homomorphism of Boolean algebras if and only if it is a homomorphism of Boolean rings. The categories of Boolean rings and Boolean algebras are equivalent.

An ideal of the Boolean algebra A is a subset I such that for all x, y in I we have x ∨ y in I and for all a in A we have a ∧ x in I. This notion of ideal coincides with the notion of ring ideal in the Boolean ring A. An ideal I of A is called prime if I ≠ A and if a ∧ b in I always implies a in I or b in I. An ideal I of A is called maximal if I ≠ A and if the only ideal properly containing I is A itself. These notions coincide with the ring theoretic ones of prime ideal and maximal ideal in the Boolean ring A.

The dual of an ideal is a filter. A filter of the Boolean algebra A is a subset p such that for all x, y in p we have x ∧ y in p and for all a in A, if a ∨ x = a then a in p.

Representing Boolean algebras
It can be shown that every finite Boolean algebra is isomorphic to the Boolean algebra of all subsets of a finite set. Therefore, the number of elements of every finite Boolean algebra is a power of two. Stone's celebrated representation theorem for Boolean algebras states that every Boolean algebra A is isomorphic to the Boolean algebra of all closed-open sets in some (compact totally disconnected Hausdorff) topological space.

Axiomatics for Boolean algebras
Let the unary functional symbol n be read as 'complement'. In 1933, the American mathematician Edward Vermilye Huntington (1874–1952) set out the following elegant axiomatization for Boolean algebra:
1. Commutativity: x + y = y + x.


2. Associativity: (x + y) + z = x + (y + z).
3. Huntington equation: n(n(x) + y) + n(n(x) + n(y)) = x.

Herbert Robbins immediately asked: If the Huntington equation is replaced with its dual, to wit:

4. Robbins equation: n(n(x + y) + n(x + n(y))) = x,

do (1), (2), and (4) form a basis for Boolean algebra? Calling (1), (2), and (4) a Robbins algebra, the question then becomes: Is every Robbins algebra a Boolean algebra? This question remained open for decades, and became a favorite question of Alfred Tarski and his students. In 1996, William McCune at Argonne National Laboratory, building on earlier work by Larry Wos, Steve Winker, and Bob Veroff, answered Robbins's question in the affirmative: Every Robbins algebra is a Boolean algebra. Crucial to McCune's proof was the automated reasoning program EQP he designed. For a simplification of McCune's proof, see Dahn (1998).

Boolean algebra <mathematics, logic> (After the logician George Boole)
1. Commonly, and especially in computer science and digital electronics, this term is used to mean two-valued logic.
2. This is in stark contrast with the definition used by pure mathematicians who in the 1960s introduced "Boolean-valued models" into logic precisely because a "Boolean-valued model" is an interpretation of a theory that allows more than two possible truth values!

Strangely, a Boolean algebra (in the mathematical sense) is not strictly an algebra, but is in fact a lattice. A Boolean algebra is sometimes defined as a "complemented distributive lattice". Boole's work which inspired the mathematical definition concerned algebras of sets, involving the operations of intersection, union and complement on sets. Such algebras obey the following identities, where the operators ^, V, - and the constants 1 and 0 can be thought of either as set intersection, union, complement, universal, empty; or as two-valued logic AND, OR, NOT, TRUE, FALSE; or any other conforming system.

a ^ b = b ^ a                      a V b = b V a                      (commutative laws)
(a ^ b) ^ c = a ^ (b ^ c)          (a V b) V c = a V (b V c)          (associative laws)
a ^ (b V c) = (a ^ b) V (a ^ c)    a V (b ^ c) = (a V b) ^ (a V c)    (distributive laws)
a ^ a = a                          a V a = a                          (idempotence laws)


--a = a
-(a ^ b) = (-a) V (-b)             -(a V b) = (-a) ^ (-b)             (de Morgan's laws)
a ^ -a = 0                         a V -a = 1
a ^ 1 = a                          a V 0 = a
a ^ 0 = 0                          a V 1 = 1
-1 = 0                             -0 = 1

There are several common alternative notations for the "-" or logical complement operator. If a and b are elements of a Boolean algebra, we define a <= b to mean that a ^ b = a, or equivalently a V b = b. Thus, for example, if ^, V and - denote set intersection, union and complement then <= is the inclusive subset relation. The relation <= is a partial ordering, though it is not necessarily a linear ordering since some Boolean algebras contain incomparable values. Note that these laws only refer explicitly to the two distinguished constants 1 and 0 (sometimes written as LaTeX \top and \bot), and in two-valued logic there are no others, but according to the more general mathematical definition, in some systems variables a, b and c may take on other values as well.
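Relating this back to the Boolean-ring construction described earlier (a + b = (a ∧ ¬b) ∨ (b ∧ ¬a) and a * b = a ∧ b), the hedged Python sketch below (an illustration, not from the source text) uses subsets of a small set, where ring addition works out to symmetric difference and every element satisfies a * a = a:

S = frozenset({1, 2, 3, 4})

def ring_add(a, b):
    # (a ∧ ¬b) ∨ (b ∧ ¬a): the symmetric difference of the two subsets
    return (a & (S - b)) | (b & (S - a))

def ring_mul(a, b):
    # a ∧ b: intersection
    return a & b

a = frozenset({1, 2})
b = frozenset({2, 3})
print(sorted(ring_add(a, b)))        # [1, 3]
print(ring_mul(a, a) == a)           # True: the defining Boolean-ring property a * a = a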

History
The term "Boolean algebra" honors George Boole (1815–1864), a self-educated English mathematician. The algebraic system of logic he formulated in his 1854 monograph The Laws of Thought differs from that described above in some important respects. For example, conjunction and disjunction in Boole were not a dual pair of operations. Boolean algebra emerged in the 1860s, in papers written by William Jevons and Charles Peirce. To the 1890 Vorlesungen of Ernst Schröder we owe the first systematic presentation of Boolean algebra and distributive lattices. The first extensive treatment of Boolean algebra in English is A. N. Whitehead's 1898 Universal Algebra. Boolean algebra as an axiomatic algebraic structure in the modern axiomatic sense begins with a 1904 paper by Edward Vermilye Huntington. Boolean algebra came of age as serious mathematics with the work of Marshall Stone in the 1930s, and with Garrett Birkhoff's 1940 Lattice Theory. In the 1960s, Paul Cohen, Dana Scott, and others found deep new results in mathematical logic and axiomatic set theory using offshoots of Boolean algebra, namely forcing and Boolean-valued models.

Building series-parallel resistor circuits
Once again, when building battery/resistor circuits, the student or hobbyist is faced with several different modes of construction. Perhaps the most popular is the solderless breadboard: a platform for constructing temporary circuits by plugging components and
wires into a grid of interconnected points. A breadboard appears to be nothing but a plastic frame with hundreds of small holes in it. Underneath each hole, though, is a spring clip which connects to other spring clips beneath other holes. The connection pattern between holes is simple and uniform:

Suppose we wanted to construct the following series-parallel combination circuit on a breadboard:


The recommended way to do so on a breadboard would be to arrange the resistors in approximately the same pattern as seen in the schematic, for ease of relation to the schematic. If 24 volts is required and we only have 6-volt batteries available, four may be connected in series to achieve the same effect:

This is by no means the only way to connect these four resistors together to form the circuit shown in the schematic. Consider this alternative layout:

If greater permanence is desired without resorting to soldering or wire-wrapping, one could choose to construct this circuit on a terminal strip (also called a barrier strip, or terminal block). In this method, components and wires are secured by mechanical tension underneath screws or heavy clips attached to small metal bars. The metal bars, in turn, are mounted on a nonconducting body to keep them electrically isolated from each other.


Building a circuit with components secured to a terminal strip isn't as easy as plugging components into a breadboard, principally because the components cannot be physically arranged to resemble the schematic layout. Instead, the builder must understand how to "bend" the schematic's representation into the real-world layout of the strip. Consider one example of how the same four-resistor circuit could be built on a terminal strip:

Another terminal strip layout, simpler to understand and relate to the schematic, involves anchoring parallel resistors (R1//R2 and R3//R4) to the same two terminal points on the strip like this:


Building more complex circuits on a terminal strip involves the same spatial-reasoning skills, but of course requires greater care and planning. Take for instance this complex circuit, represented in schematic form:

The terminal strip used in the prior example barely has enough terminals to mount all seven resistors required for this circuit! It will be a challenge to determine all the necessary wire connections between resistors, but with patience it can be done. First, begin by installing and labeling all resistors on the strip. The original schematic diagram will be shown next to the terminal strip circuit for reference: Next, begin connecting components together wire by wire as shown in the schematic. Over-draw connecting lines in the schematic to indicate completion in the real circuit. Watch this sequence of illustrations as each individual wire is identified in the schematic, then added to the real circuit:


Although there are minor variations possible with this terminal strip circuit, the choice of connections shown in this example sequence is both electrically accurate (electrically identical to the schematic diagram) and carries the additional benefit of not burdening any one screw terminal on the strip with more than two wire ends, a good practice in any terminal strip circuit. An example of a "variant" wire connection might be the very last wire added (step 11), which I placed between the left terminal of R2 and the left terminal of R3. This last wire completed the parallel connection between R2 and R3 in the circuit. However, I could have placed this wire instead between the left terminal of R2 and the right terminal of R1, since the right terminal of R1 is already connected to the left terminal of R3 (having been placed there in step 9) and so is electrically common with that one point. Doing this, though, would have resulted in three wires secured to the right terminal of R1 instead of two, which is a faux pas in terminal strip etiquette. Would the circuit have worked this way? Certainly! It's just that more than two wires secured at a single terminal makes for a "messy" connection: one that is aesthetically unpleasing and may place undue stress on the screw terminal.

Integrated Circuits (Chips)
Integrated circuits are usually called ICs or chips. They are complex circuits which have been etched onto tiny chips of semiconductor (silicon). The chip is packaged in a plastic holder with pins spaced on a 0.1" (2.54mm) grid which will fit the holes on
stripboard and breadboards. Very fine wires inside the package link the chip to the pins.

Pin numbers
The pins are numbered anti-clockwise around the IC (chip) starting near the notch or dot. The diagram shows the numbering for 8-pin and 14-pin ICs, but the principle is the same for all sizes.

Chip holders (DIL sockets)
ICs (chips) are easily damaged by heat when soldering and their short pins cannot be protected with a heat sink. Instead we use a chip holder, strictly called a DIL socket (DIL = Dual In-Line), which can be safely soldered onto the circuit board. The chip is pushed into the holder when all soldering is complete. Chip holders are only needed when soldering, so they are not used on breadboards. Commercially produced circuit boards often have chips soldered directly to the board without a chip holder; usually this is done by a machine which is able to work very quickly. Please don't attempt to do this yourself because you are likely to destroy the chip and it will be difficult to remove without damage by de-soldering.

Removing a chip from its holder
If you need to remove a chip it can be gently prised out of the holder with a small flat-blade screwdriver. Carefully lever up each end by inserting the screwdriver blade between the chip and its holder and gently twisting the screwdriver. Take care to start lifting at both ends before you attempt to remove the chip, otherwise you will bend and possibly break the pins.

Static precautions

Antistatic bags for ICs


Many ICs are static sensitive and can be damaged when you touch them because your body may have become charged with static electricity, from your clothes for example. Static sensitive ICs will be supplied in antistatic packaging with a warning label and they should be left in this packaging until you are ready to use them. It is usually adequate to earth your hands by touching a metal water pipe or window frame before handling the IC, but for the more sensitive (and expensive!) ICs special equipment is available, including earthed wrist straps and earthed work surfaces. You can make an earthed work surface with a sheet of aluminum kitchen foil, using a crocodile clip to connect the foil to a metal water pipe or window frame with a 10k resistor in series.

Datasheets
Datasheets are available for most ICs giving detailed information about their ratings and functions. In some cases example circuits are shown. The large amount of information with symbols and abbreviations can make datasheets seem overwhelming to a beginner, but they are worth reading as you become more confident because they contain a great deal of useful information for more experienced users designing and testing circuits.

Sinking and sourcing current
Chip outputs are often said to 'sink' or 'source' current. The terms refer to the direction of the current at the chip's output. If the chip is sinking current it is flowing into the output. This means that a device connected between the positive supply (+Vs) and the chip output will be switched on when the output is low (0V). If the chip is sourcing current it is flowing out of the output. This means that a device connected between the chip output and the negative supply (0V) will be switched on when the output is high (+Vs).

It is possible to connect two devices to a chip output so that one is on when the output is low and the other is on when the output is high. This arrangement is used in the Level Crossing project to make the red LEDs flash alternately.

The maximum sinking and sourcing currents for a chip output are usually the same but there are some exceptions, for example 74LS TTL logic chips can sink up to 16mA but only source 2mA.


Using diodes to combine outputs
The outputs of chips (ICs) must never be directly connected together. However, diodes can be used to combine two or more digital (high/low) outputs from a chip such as a counter. This can be a useful way of producing simple logic functions without using logic gates! The diagram shows two ways of combining outputs using diodes. The diodes must be capable of passing the output current. 1N4148 signal diodes are suitable for low current devices such as LEDs.

For example, the outputs Q0 - Q9 of a 4017 1-of-10 counter go high in turn. Using diodes to combine the 2nd (Q1) and 4th (Q3) outputs as shown in the bottom diagram will make the LED flash twice followed by a longer gap. The diodes are performing the function of an OR gate.

The 555 and 556 Timers
The 8-pin 555 timer chip is used in many projects; a popular version is the NE555. Most circuits will just specify '555 timer IC' and the NE555 is suitable for these. The 555 output (pin 3) can sink and source up to 200mA. This is more than most chips and it is sufficient to supply LEDs, relay coils and low current lamps. To switch larger currents you can connect a transistor. The 556 is a dual version of the 555 housed in a 14-pin package. The two timers (A and B) share the same power supply pins. Low power versions of the 555 are made, such as the ICM7555, but these should only be used when specified (to increase battery life) because their maximum output current of about 20mA (with 9V supply) is too low for many standard 555 circuits. The ICM7555 has the same pin arrangement as a standard 555.

Logic ICs (chips)
Logic ICs process digital signals and there are many devices, including logic gates, flip-flops, shift registers, counters and display drivers. They can be split into two groups according to their pin arrangements: the 4000 series and the 74 series, which consists of various families such as the 74HC, 74HCT and 74LS.


For most new projects the 74HC family is the best choice. The older 4000 series is the only family which works with a supply voltage of more than 6V. The 74LS and 74HCT families require a 5V supply so they are not convenient for battery operation. The table below summarizes the important properties of the most popular logic families:

Technology
  4000 series: CMOS.
  74HC: High-speed CMOS.
  74HCT: High-speed CMOS, TTL compatible.
  74LS: TTL, Low-power Schottky.

Power supply
  4000 series: 3 to 15V.
  74HC: 2 to 6V.
  74HCT: 5V ±0.5V.
  74LS: 5V ±0.25V.

Inputs
  4000 series and 74HC: Very high impedance. Unused inputs must be connected to +Vs or 0V. Inputs cannot be reliably driven by 74LS outputs unless a 'pull-up' resistor is used (see below).
  74HCT: Very high impedance. Unused inputs must be connected to +Vs or 0V. Compatible with 74LS (TTL) outputs.
  74LS: 'Float' high to logic 1 if unconnected. 1mA must be drawn out to hold them at logic 0.

Outputs
  4000 series: Can sink and source about 5mA (10mA with 9V supply), enough to light an LED. To switch larger currents use a transistor.
  74HC and 74HCT: Can sink and source about 20mA, enough to light an LED. To switch larger currents use a transistor.
  74LS: Can sink up to 16mA (enough to light an LED), but source only about 2mA. To switch larger currents use a transistor.

Fan-out
  4000 series: One output can drive up to 50 CMOS, 74HC or 74HCT inputs, but only one 74LS input.
  74HC and 74HCT: One output can drive up to 50 CMOS, 74HC or 74HCT inputs, but only 10 74LS inputs.
  74LS: One output can drive up to 10 74LS inputs or 50 74HCT inputs.

Maximum frequency
  4000 series: about 1MHz.
  74HC: about 25MHz.
  74HCT: about 25MHz.
  74LS: about 35MHz.

Power consumption of the IC itself
  4000 series: a few µW.
  74HC: a few µW.
  74HCT: a few µW.
  74LS: a few mW.

Mixing Logic Families


It is best to build a circuit using just one logic family, but if necessary the different families may be mixed providing the power supply is suitable for all of them. For example mixing 4000 and 74HC requires the power supply to be in the range 3 to 6V. A circuit which includes 74LS or 74HCT ICs must have a 5V supply. A 74LS output cannot reliably drive a 4000 or 74HC input unless a 'pull-up' resistor of 2.2k is connected between the +5V supply and the input to correct the slightly different voltage ranges used for logic 0.

Driving 4000 or 74HC inputs from a 74LS output.

Note that a 4000 series output can drive only one 74LS input.

4000 Series CMOS
This family of logic ICs is numbered from 4000 onwards, and from 4500 onwards. They have a B at the end of the number (e.g. 4001B) which refers to an improved design introduced some years ago. Most of them are in 14-pin or 16-pin packages. They use CMOS circuitry, which means they use very little power and can tolerate a wide range of power supply voltages (3 to 15V), making them ideal for battery powered projects. CMOS is pronounced 'see-moss' and stands for Complementary Metal Oxide Semiconductor.

However, the CMOS circuitry also means that they are static sensitive. Touching a pin while charged with static electricity (from your clothes for example) may damage the IC. In fact most ICs in regular use are quite tolerant and earthing your hands by touching a metal water pipe or window frame before handling them will be adequate. ICs should be left in their protective packaging until you are ready to use them. For the more sensitive (and expensive!) ICs special equipment is available, including earthed wrist straps and earthed work surfaces.

74 Series: 74LS, 74HC and 74HCT
There are several families of logic ICs numbered from 74xx00 onwards with letters (xx) in the middle of the number to indicate the type of circuitry, e.g. 74LS00 and 74HC00. The original family (now obsolete) had no letters, e.g. 7400.

The 74LS (Low-power Schottky) family (like the original) uses TTL (Transistor-Transistor Logic) circuitry, which is fast but requires more power than later families. The 74HC family has High-speed CMOS circuitry, combining the speed of TTL with the very low power consumption of the 4000 series. They are CMOS ICs with the
same pin arrangements as the older 74LS family. Note that 74HC inputs cannot be reliably driven by 74LS outputs because the voltage ranges used for logic 0 are not quite compatible; use 74HCT instead.

The 74HCT family is a special version of 74HC with 74LS TTL-compatible inputs, so 74HCT can be safely mixed with 74LS in the same system. In fact 74HCT can be used as low-power direct replacements for the older 74LS ICs in most circuits. The minor disadvantage of 74HCT is a lower immunity to noise, but this is unlikely to be a problem in most situations. Beware that the 74 series is often still called the 'TTL series' even though the latest ICs do not use TTL!

The CMOS circuitry used in the 74HC and 74HCT series ICs means that they are static sensitive. Touching a pin while charged with static electricity (from your clothes for example) may damage the IC. In fact most ICs in regular use are quite tolerant and earthing your hands by touching a metal water pipe or window frame before handling them will be adequate. ICs should be left in their protective packaging until you are ready to use them.

PIC microcontrollers
A PIC is a Programmable Integrated Circuit microcontroller, a 'computer-on-a-chip'. They have a processor and memory to run a program responding to inputs and controlling outputs, so they can easily achieve complex functions which would require several conventional ICs. Programming a PIC microcontroller may seem daunting to a beginner, but there are a number of systems designed to make this easy.

The PICAXE system is an excellent example because it uses a standard computer to program (and re-program) the PICs; no specialist equipment is required other than a low-cost download lead. Programs can be written in a simple version of BASIC or using a flowchart. The PICAXE programming software and extensive documentation is available to download free of charge, making the system ideal for education and users at home. If you think PICs are not for you because you have never written a computer program, please look at the PICAXE system! It is very easy to get started using a few simple BASIC commands and there are a number of projects available as kits which are ideal for beginners.

Static Timing Analysis is a method of computing the expected timing of a digital circuit without requiring simulation.


High-performance integrated circuits have traditionally been characterized by the clock frequency at which they operate. Gauging the ability of a circuit to operate at the specified speed requires an ability to measure, during the design process, its delay at numerous steps. Moreover, delay calculation must be incorporated into the inner loop of timing optimizers at various phases of design, such as logic synthesis, layout (placement and routing), and in-place optimizations performed late in the design cycle. While such timing measurements can theoretically be performed using a rigorous circuit simulation, such an approach is liable to be too slow to be practical. Static timing analysis plays a vital role in facilitating the fast and reasonably accurate measurement of circuit timing. The speedup comes from the use of simplified delay models and from the fact that its ability to consider the effects of logical interactions between signals is limited. Nevertheless, it has become a mainstay of design over the last few decades; one of the earliest descriptions of a static timing approach was published in the 1970s.

Purpose
In a synchronous digital system, data is supposed to move in lockstep, advancing one stage on each tick of the clock signal. This is enforced by synchronizing elements such as flip-flops or latches, which copy their input to their output when instructed to do so by the clock. To first order, only two kinds of timing errors are possible in such a system:

A hold time violation, when a signal arrives too early, and advances one clock cycle before it should

·

A setup time violation, when a signal arrives too late, and misses the time when it should advance.

The time when a signal arrives can vary for many reasons: the input data may vary, the circuit may perform different operations, the temperature and voltage may change, and there are manufacturing differences in the exact construction of each part. The main goal of static timing analysis is to verify that despite these possible variations, all signals will arrive neither too early nor too late, and hence proper circuit operation can be assured. Also, since STA is capable of verifying every path, apart from helping locate setup and hold time violations, it can detect other serious problems like glitches, slow paths and clock skew.

Definitions

The critical path is defined as the path between an input and an output with the maximum delay. Once the circuit timing has been computed by one of the techniques below, the critical path can easily be found by using a traceback method.


·

The arrival time of a signal is the time elapsed for a signal to arrive at a certain point. The reference, or time 0.0, is often taken as the arrival time of a clock signal. To calculate the arrival time, delay calculation of all the components of the path is required. Arrival times, and indeed almost all times in timing analysis, are normally kept as a pair of values: the earliest possible time at which a signal can change, and the latest.

·

Another useful concept is required time. This is the latest time at which a signal can arrive without making the clock cycle longer than desired. The computation of the required time proceeds as follows. At each primary output, the required times for rise/fall are set according to the specifications provided to the circuit. Next, a backward topological traversal is carried out, processing each gate when the required times at all of its fanouts are known.

·

The slack associated with each connection is the difference between the required time and the arrival time. A positive slack s at a node implies that the arrival time at that node may be increased by s without affecting the overall delay of the circuit. Conversely, negative slack implies that a path is too slow, and the path must be sped up (or the reference signal delayed) if the whole circuit is to work at the desired speed.
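To make these definitions concrete, here is a minimal Python sketch (an illustration only, not a production STA tool; the gate names, delays and 5.0 time-unit cycle budget are invented for the example). It runs a forward pass for arrival times and a backward pass for required times on a small combinational graph given in topological order, then derives the slack of every node:

gates = {                 # gate: (list of fan-in gates, gate delay)
    "in1": ([], 0.0),
    "in2": ([], 0.0),
    "g1":  (["in1", "in2"], 2.0),
    "g2":  (["g1", "in2"], 1.5),
    "out": (["g2"], 0.5),
}

# Forward pass: latest arrival time at each gate output
arrival = {}
for g, (fanin, delay) in gates.items():
    arrival[g] = delay + max((arrival[f] for f in fanin), default=0.0)

# Backward pass: required times, starting from the assumed cycle-time budget
required = {g: float("inf") for g in gates}
required["out"] = 5.0
for g in reversed(list(gates)):
    fanin, delay = gates[g]
    for f in fanin:
        required[f] = min(required[f], required[g] - delay)

slack = {g: required[g] - arrival[g] for g in gates}
print(slack)   # a negative slack would flag a path that is too slow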

Corners and STA
Quite often, designers will want to qualify their design across many conditions. Behavior of an electronic circuit is often dependent on various factors in its environment like temperature or local voltage variations. In such a case either STA needs to be performed for more than one such set of conditions, or STA must be prepared to work with a range of possible delays for each component, as opposed to a single value. If the design works at each extreme condition, then under the assumption of monotonic behavior, the design is also qualified for all intermediate points.

The use of corners in static timing analysis has several limitations. It may be overly optimistic, since it assumes perfect tracking: if one gate is fast, all gates are assumed fast, or if the voltage is low for one gate, it's also low for all others. Corners may also be overly pessimistic, for the worst case corner may seldom occur. In an IC, for example, it may not be rare to have one metal layer at the thin or thick end of its allowed range, but it would be very rare for all 10 layers to be at the same limit, since they are manufactured independently. Statistical STA, which replaces delays with distributions, and tracking with correlation, is a more sophisticated approach to the same problem.

The most prominent techniques for STA
In static timing analysis, the word static alludes to the fact that this timing analysis is carried out in an input-independent manner, and purports to find the worst-case delay of the circuit over all possible input combinations. The computational efficiency (linear in
the number of edges in the graph) of such an approach has resulted in its widespread use, even though it has some limitations. A method that is commonly referred to as PERT is popularly used in STA. In fact, PERT is a misnomer, and the so-called PERT method discussed in most of the literature on timing analysis refers to the CPM (Critical Path Method) that is widely used in project management. While the CPM-based methods are the dominant ones in use today, other methods for traversing circuit graphs, such as depth-first search, have been used by various timing analyzers.

Interface Timing Analysis
Many of the common problems in chip designing are related to interface timing between different components of the design. These can arise because of many factors, including incomplete simulation models, lack of test cases to properly verify interface timing, requirements for synchronization, incorrect interface specifications, and lack of designer understanding of a component supplied as a 'black box'. There are specialized CAD tools designed explicitly to analyze interface timing, just as there are specific CAD tools to verify that an implementation of an interface conforms to the functional specification (using techniques such as model checking).

Statistical static timing analysis
Statistical STA (SSTA) is a procedure that is becoming increasingly necessary to handle the complexities of process and environmental variations in integrated circuits. See Statistical Analysis and Design of Integrated Circuits for a much more in-depth discussion of this topic.

LESSON III
Synchronous Sequential Circuits
Flip-flops are synchronous bistable devices. The term synchronous means the output changes state only when the clock input is triggered. That is, changes in the output occur in synchronization with the clock.

A flip-flop is a kind of multivibrator. There are three types of multivibrators:

1. Monostable multivibrator (also called a one-shot) has only one stable state. It produces a single pulse in response to a triggering input.

2. Bistable multivibrator exhibits two stable states. It is able to retain the two SET and RESET states indefinitely. It is commonly used as a basic building block for counters, registers and memories.


3. Astable multivibrator has no stable state at all. It is used primarily as an oscillator to generate periodic pulse waveforms for timing purposes.

The three basic categories of bistable elements are emphasized: the edge-triggered flip-flop, the pulse-triggered (master-slave) flip-flop, and the data lock-out flip-flop. Their operating characteristics and basic applications will also be discussed.

Edge-Triggered Flip-flops
An edge-triggered flip-flop changes states either at the positive edge (rising edge) or at the negative edge (falling edge) of the clock pulse on the control input. The three basic types are introduced here: S-R, J-K and D. Positive edge-triggered (without bubble at Clock input): S-R, J-K, and D. Negative edge-triggered (with bubble at Clock input): S-R, J-K, and D.

The S-R, J-K and D inputs are called synchronous inputs because data on these inputs are transferred to the flip-flop's output only on the triggering edge of the clock pulse. On the other hand, the direct set (SET) and clear (CLR) inputs are called asynchronous inputs, as they are inputs that affect the state of the flip-flop independent of the clock. For the synchronous operations to work properly, these asynchronous inputs must both be kept LOW.

Edge-triggered S-R flip-flop
The basic operation is illustrated below, along with the truth table for this type of flip-flop. The operation and truth table for a negative edge-triggered flip-flop are the same as those for a positive edge-triggered flip-flop except that the falling edge of the clock pulse is the triggering edge.

As S = 1, R = 0. Flip-flop SETS on the rising clock edge.


Note that the S and R inputs can be changed at any time when the clock input is LOW or HIGH (except for a very short interval around the triggering transition of the clock) without affecting the output. This is illustrated in the timing diagram below:

Edge-triggered J-K flip-flop
The J-K flip-flop works very similarly to the S-R flip-flop. The only difference is that this flip-flop has NO invalid state. The outputs toggle (change to the opposite state) when both J and K inputs are HIGH. The truth table is shown below.

Edge-triggered D flip-flop
The operation of a D flip-flop is much simpler. It has only one input in addition to the clock. It is very useful when a single data bit (0 or 1) is to be stored. If there is a HIGH on the D input when a clock pulse is applied, the flip-flop SETs and stores a 1. If there is a LOW on the D input when a clock pulse is applied, the flip-flop RESETs and stores a 0. The truth table below summarizes the operations of the positive edge-triggered D flip-flop. As before, the negative edge-triggered flip-flop works the same except that the falling edge of the clock pulse is the triggering edge.
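The edge-triggered behaviour described above can be summarized in a small behavioural sketch (Python used purely for illustration; this is not a manufacturer's device model). The stored bit changes only on a 0-to-1 transition of the clock, with D copying its input and J-K setting, resetting, or toggling:

class DFlipFlop:
    def __init__(self):
        self.q = 0
        self.prev_clk = 0
    def tick(self, clk, d):
        if self.prev_clk == 0 and clk == 1:   # rising edge
            self.q = d                        # store the D input
        self.prev_clk = clk
        return self.q

class JKFlipFlop:
    def __init__(self):
        self.q = 0
        self.prev_clk = 0
    def tick(self, clk, j, k):
        if self.prev_clk == 0 and clk == 1:   # rising edge
            if j and not k:
                self.q = 1                    # SET
            elif k and not j:
                self.q = 0                    # RESET
            elif j and k:
                self.q ^= 1                   # toggle
        self.prev_clk = clk
        return self.q

jk = JKFlipFlop()
print([jk.tick(clk, 1, 1) for clk in (0, 1, 0, 1, 0, 1)])   # toggles on each rising edge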


Pulse-Triggered (Master-Slave) Flip-flops
The term pulse-triggered means that data are entered into the flip-flop on the rising edge of the clock pulse, but the output does not reflect the input state until the falling edge of the clock pulse. Because this kind of flip-flop is sensitive to any change of the input levels while the clock pulse is still HIGH, the inputs must be set up prior to the clock pulse's rising edge and must not be changed before the falling edge. Otherwise, ambiguous results will occur.

The three basic types of pulse-triggered flip-flops are S-R, J-K and D. Their logic symbols are shown below. Notice that they do not have the dynamic input indicator at the clock input but have postponed output symbols at the outputs.

The truth tables for the above pulse-triggered flip-flops are all the same as those for the edge-triggered flip-flops, except for the way they are clocked. These flip-flops are also called Master-Slave flip-flops simply because their internal construction is divided into two sections. The slave section is basically the same as the master section except that it is clocked on the inverted clock pulse and is controlled by the outputs of the master section rather than by the external inputs. The logic diagram for a basic master-slave S-R flip-flop is shown below.


Data Lock-Out Flip-flops The data lock-out flip-flop is similar to the pulse-triggered (master-slave) flip-flop except it has a dynamic clock input. The dynamic clock disables (locks out) the data inputs after the rising edge of the clock pulse. Therefore, the inputs do not need to be held constant while the clock pulse is HIGH. The master section of this flip-flop is like an edge-triggered device. The slave section becomes a pulse-triggered device to produce a postponed output on the falling edge of the clock pulse. The logic symbols of S-R, J-K and D data lock-out flip-flops are shown below. Notice they all have the dynamic input indicator as well as the postponed output symbol.

Again, the above data lock-out flip-flops have the same truth tables as those for the edge-triggered flip-flops, except for the way they are clocked.

Operating Characteristics

The operating characteristics mentioned here apply to all flip-flops regardless of the particular form of the circuit. They are typically found in data sheets for integrated circuits. They specify the performance, operating requirements, and operating limitations of the circuit.

Propagation Delay Time - is the interval of time required after an input signal has been applied for the resulting output change to occur.
Set-Up Time - is the minimum interval required for the logic levels to be maintained constantly on the inputs (J and K, or S and R, or D) prior to the triggering edge of the clock pulse in order for the levels to be reliably clocked into the flip-flop.
Hold Time - is the minimum interval required for the logic levels to remain on the inputs after the triggering edge of the clock pulse in order for the levels to be reliably clocked into the flip-flop.
Maximum Clock Frequency - is the highest rate at which a flip-flop can be reliably triggered.
Power Dissipation - is the total power consumption of the device.
Pulse Widths - are the minimum pulse widths specified by the manufacturer for the Clock, SET and CLEAR inputs.
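To make these data-sheet quantities concrete, the short calculation below uses invented, illustrative figures (not values from any real data sheet) to estimate the highest reliable clock rate of a simple register-to-register path, and to check the hold-time requirement. It is a sketch of the usual first-order timing budget, not a substitute for a manufacturer's specification.

# Hypothetical data-sheet values, in nanoseconds (for illustration only).
t_p    = 15.0   # propagation delay (clock to output) of the source flip-flop
t_su   = 5.0    # set-up time of the destination flip-flop
t_h    = 2.0    # hold time of the destination flip-flop
t_comb = 8.0    # delay of the combinational logic between the two flip-flops

# A common first-order estimate of the shortest usable clock period:
t_clk_min = t_p + t_comb + t_su
f_max = 1.0 / (t_clk_min * 1e-9)                 # convert nanoseconds to seconds
print("minimum clock period: %.1f ns" % t_clk_min)
print("maximum clock frequency: %.1f MHz" % (f_max / 1e6))   # about 35.7 MHz here

# Hold check: the new data must not reach the destination before t_h has elapsed.
if t_p + t_comb >= t_h:
    print("hold time requirement satisfied")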

Frequency Division

When a pulse waveform is applied to the clock input of a J-K flip-flop that is connected to toggle, the Q output is a square wave with half the frequency of the clock input. If more flip-flops are connected together as shown in the figure below, further division of the clock frequency can be achieved.


The Q output of the second flip-flop is one-fourth the frequency of the original clock input. This is because the frequency of the clock is divided by 2 by the first flip-flop, then divided by 2 again by the second flip-flop. If more flip-flops are connected this way, the frequency division would be 2 to the power n, where n is the number of flip-flops.
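This divide-by-2-to-the-power-n behaviour can be confirmed with a small simulation. The Python sketch below is our own illustration (the function name and structure are assumptions); each toggle flip-flop is modelled simply as a bit that inverts on a rising edge of its own clock input, and the count of output cycles at each stage shows the input frequency divided by 2, 4 and 8.

def divide_by_2n(n_stages, n_clock_pulses):
    """Chain of toggle flip-flops; each stage toggles on the rising edge
    of the previous stage's output (stage 0 is clocked by the input)."""
    q = [0] * n_stages
    cycles = [0] * n_stages         # completed output cycles (rising edges) per stage
    prev = [0] * n_stages           # previous level of each stage's clock input
    for _ in range(n_clock_pulses):
        for level in (1, 0):        # one full input clock pulse: rise, then fall
            signal = level
            for i in range(n_stages):
                rising = (prev[i] == 0 and signal == 1)
                prev[i] = signal
                if rising:
                    q[i] ^= 1       # toggle on the rising edge
                    if q[i] == 1:
                        cycles[i] += 1
                signal = q[i]       # this stage's Q clocks the next stage
    return cycles

print(divide_by_2n(3, 16))   # 16 input pulses -> [8, 4, 2] output cycles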

Parallel Data Storage

In digital systems, data are normally stored in groups of bits that represent numbers, codes, or other information. So, it is common to take several bits of data on parallel lines and store them simultaneously in a group of flip-flops. This operation is illustrated in the figure below.

Each of the three parallel data lines is connected to the D input of a flip-flop. Since all the clock inputs are connected to the same clock, the data on the D inputs are stored simultaneously by the flip-flops on the positive edge of the clock. Registers, groups of flip-flops used for data storage, will be explained in more detail in a later chapter.
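In software terms, the circuit behaves like the small sketch below (illustrative Python of our own; the class name and width are assumptions): because every flip-flop shares the same clock, the three data bits are captured at the same instant.

class Register:
    """A group of D flip-flops sharing one clock: a simple parallel register."""
    def __init__(self, width):
        self.q = [0] * width

    def clock_edge(self, data_bits):
        # On the common positive clock edge, every flip-flop stores its own D input.
        self.q = list(data_bits)
        return self.q

reg = Register(3)
print(reg.clock_edge([1, 0, 1]))   # all three bits stored at once -> [1, 0, 1]
print(reg.q)                       # the data remain stored until the next clock edge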

Counting


Another very important application of flip-flops is in digital counters, which are covered in detail in the next chapter. A counter that counts from 0 to 3 is illustrated in the timing diagram on the right. The two-bit binary sequence repeats every four clock pulses. When it counts to 3, it recycles back to 0 to begin the sequence again.

Flip-flop (electronics) In digital circuits, the flip-flop, latch, or bistable multivibrator is an electronic circuit which has two stable states and thereby is capable of serving as one bit of memory. A flip-flop is controlled by one or two control signals and/or a gate or clock signal. The output often includes the complement as well as the normal output. As flip-flops are implemented electronically, they naturally also require power and ground connections. Flip-flops can be either simple or clocked. Simple flip-flops consist of two cross-coupled inverting elements – transistors, or NAND, or NOR-gates – perhaps augmented by some enable/disable (gating) mechanism. Clocked devices are specially designed for synchronous (time-discrete) systems and therefore ignores its inputs except at the transition of a dedicated clock signal (known as clocking, pulsing, or strobing). This causes the flip-flop to either change or retain its output signal based upon the values of the input signals at the transition. Some flip-flops change output on the rising edge of the clock, other on the falling edge. Clocked flip-flops are typically implemented as master-slave devices* where two basic flip-flops (plus some additional logic) collaborates to make it insensitive to spikes and noise between the short clock transitions; they nevertheless also often include asynchronous clear or set inputs which may be used to change the current output independent of the clock. Flip-flops can be further divided into types that have found common applicability in both asynchronous and clocked sequential systems: the SR ("set-reset"), D ("data"), T ("toggle"), and JK types are the common ones; all of which may be synthetisized from (most) other types by a few logic gates. The behavior of a particular type can be


described by what is termed the characteristic equation, which derives the "next" (i.e., after the next clock pulse) output, Qnext, in terms of the input signal(s) and/or the current output, Q.

The first electronic flip-flop was invented in 1919 by William Eccles and F. W. Jordan [1]. It was initially called the Eccles-Jordan trigger circuit and consisted of two active elements (radio tubes). The name flip-flop was later derived from the sound produced on a speaker connected to one of the back-coupled amplifier outputs during the trigger process within the circuit.

* Early master-slave devices actually remained (half) open between the first and second edge of a clocking pulse; today most flip-flops are designed so they may be clocked by a single edge, as this gives large benefits regarding noise immunity without any significant downsides.

Set-Reset flip-flops (SR flip-flops)

See SR latch.

Toggle flip-flops (T flip-flops)

A circuit symbol for a T-type flip-flop, where > is the clock input, T is the toggle input and Q is the stored data output.

If the T input is high, the T flip-flop changes state ("toggles") whenever the clock input is strobed. If the T input is low, the flip-flop holds the previous value. This behavior is described by the characteristic equation Qnext = T ⊕ Q (or, without benefit of the XOR operator, the equivalent Qnext = T*Q' + T'*Q) and can be described in a truth table:

T   Q   Qnext   Comment
0   0   0       hold state
0   1   1       hold state
1   0   1       toggle
1   1   0       toggle
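The characteristic equation can be checked directly against this truth table. The short Python loop below (our own illustrative sketch, not part of the original lesson) evaluates Qnext = T ⊕ Q for every input combination, alongside the XOR-free form T*Q' + T'*Q.

# Verify the T flip-flop characteristic equation Qnext = T xor Q
for t in (0, 1):
    for q in (0, 1):
        q_next = t ^ q                                    # XOR form
        q_next_no_xor = (t and not q) or (not t and q)    # T*Q' + T'*Q
        print(t, q, q_next, int(q_next_no_xor))
# Rows with T = 0 hold the state; rows with T = 1 toggle it, matching the table.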

A toggle flip-flop composed of a single RS flip-flop becomes an oscillator when it is clocked. To achieve toggling, the clock pulse must have exactly the length of half a cycle. While such a pulse generator can be built, a toggle flip-flop composed of two RS flip-flops is the easy solution. Thus the toggle flip-flop divides the clock frequency by 2; i.e., if the clock frequency is 4 MHz, the output frequency obtained from the flip-flop will be 2 MHz. This 'divide by' feature has application in various types of digital counters.

JK flip-flop

JK flip-flop timing diagram The JK flip-flop augments the behavior of the SR flip-flop by interpreting the S = R = 1 condition as a "flip" command. Specifically, the combination J = 1, K = 0 is a command to set the flip-flop; the combination J = 0, K = 1 is a command to reset the flip-flop; and the combination J = K = 1 is a command to toggle the flip-flop, i.e., change its output to the logical complement of its current value. Setting J = K = 0 does NOT result in a D flip-flop, but rather, will hold the current state. To synthesize a D flip-flop, simply set K equal to the complement of J. The JK flip-flop is therefore a universal flip-flop, because it can be configured to work as an SR flip-flop, a D flip-flop or a T flip-flop.

A circuit symbol for a JK flip-flop, where > is the clock input, J and K are data inputs, Q is the stored data output, and Q' is the inverse of Q.

The characteristic equation of the JK flip-flop is:

Qnext = J*Q' + K'*Q


and the corresponding truth table is:

J   K   Qnext   Comment
0   0   Q       hold state
0   1   0       reset
1   0   1       set
1   1   Q'      toggle

The origin of the name for the JK flip-flop is detailed by P. L. Lindley, a JPL engineer, in a letter to EDN, an electronics design magazine. The letter is dated June 13, 1968, and was published in the August edition of the newsletter. In the letter, Mr. Lindley explains that he heard the story of the JK flip-flop from Dr. Eldred Nelson, who is responsible for coining the term while working at Hughes Aircraft. Flip-flops in use at Hughes at the time were all of the type that came to be known as J-K. In designing a logical system, Dr. Nelson assigned letters to flip-flop inputs as follows: #1: A & B, #2: C & D, #3: E & F, #4: G & H, #5: J & K. Given the size of the system that he was working on, Dr. Nelson realized that he was going to run out of letters, so he decided to use J and K as the set and reset input of each flip-flop in his system (using subscripts or some such to distinguish the flip-flops), since J and K were "nice, innocuous letters." Dr. Montgomery Phister, Jr., an engineer under Dr. Nelson at Hughes, in his book "Logical Design of Digital Computers" (Wiley,1958) picked up the idea that J and K were the set and reset input for a "Hughes type" of flip-flop, which he then termed "J-K flip-flops." He also defined R-S, T, D, and R-S-T flip-flops, and showed how one could use Boolean Algebra to specify their interconnections so as to carry out complex functions.


D flip-flop

The D flip-flop can be interpreted as a primitive delay line or zero-order hold, since the data is posted at the output one clock cycle after it arrives at the input. It is called a delay flip-flop since the output takes the value present on the data input. The characteristic equation of the D flip-flop is Qnext = D, and the corresponding truth table is:

D   Q   >        Qnext
0   X   Rising   0
1   X   Rising   1

These flip-flops are very useful, as they form the basis for shift registers, which are an essential part of many electronic devices. The advantage of this circuit over the D-type latch is that it "captures" the signal at the moment the clock goes high, and subsequent changes of the data line do not matter, even if the signal line has not yet gone low again.

Master-slave D flip-flop

A master-slave D flip-flop is created by connecting two gated D latches in series and inverting the enable input to one of them. It is called master-slave because the second latch in the series only changes in response to a change in the first (master) latch.

A master slave D flip flop. It responds on the negative edge of the enable input (usually a clock). For a positive-edge triggered master-slave D flip-flop, when the clock signal is low (logical 0) the “enable” seen by the first or “master” D latch (the inverted clock signal) is high (logical 1). This allows the “master” latch to store the input value when the clock


signal transitions from low to high. As the clock signal goes high (0 to 1) the inverted “enable” of the first latch goes low (1 to 0) and the value seen at the input to the master latch is “locked”. Nearly simultaneously, the twice inverted “enable” of the second or “slave” D latch transitions from low to high (0 to 1) with the clock signal. This allows the signal captured at the rising edge of the clock by the now “locked” master latch to pass through the “slave” latch. When the clock signal returns to low (1 to 0), the output of the “slave” latch is "locked", and the value seen at the last rising edge of the clock is held while the “master” latch begins to accept new values in preparation for the next rising clock edge.
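The two-latch structure just described can be modelled with a few lines of Python (an illustrative sketch of our own, not a gate-level netlist; the class names are assumptions). The master latch is transparent while the clock is LOW and the slave while it is HIGH, so the value present at D just before the rising edge is what appears at Q.

class GatedDLatch:
    """Level-sensitive D latch: transparent while enable is 1, holds while 0."""
    def __init__(self):
        self.q = 0
    def update(self, enable, d):
        if enable:
            self.q = d
        return self.q

class MasterSlaveD:
    """Positive edge-triggered behaviour built from two gated D latches.
    The master's enable is the inverted clock; the slave's is the clock itself."""
    def __init__(self):
        self.master = GatedDLatch()
        self.slave = GatedDLatch()
    def update(self, clk, d):
        m = self.master.update(1 - clk, d)   # master follows D while the clock is LOW
        return self.slave.update(clk, m)     # slave passes the captured value while the clock is HIGH

ff = MasterSlaveD()
for clk, d in [(0, 1), (1, 1), (1, 0), (0, 0), (1, 0)]:
    print(clk, d, ff.update(clk, d))
# Q picks up the value D held just before each rising clock edge; changes of D
# while the clock stays HIGH do not affect the output.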

An implementation of a master-slave D flip-flop that is triggered on the positive edge of the clock. By removing the left-most inverter in the above circuit, a D-type flip flop that strobes on the falling edge of a clock signal can be obtained. This has a truth table like this:

D   Q   >         Qnext
0   X   Falling   0
1   X   Falling   1

Most D-type flip-flops in ICs have the capability to be set and reset, much like an SR flip-flop. Usually, the illegal S = R = 1 condition is resolved in D-type flip-flops:

Inputs           Outputs
S   R   D   >    Q   Q'
0   1   X   X    0   1
1   0   X   X    1   0
1   1   X   X    1   1

By setting S = R = 0, the flip-flop can be used as described above.

Edge-triggered D flip-flop

A more efficient way to make a D flip-flop is not as easy to understand, but it works the same way. While the master-slave D flip-flop is also triggered on the edge of a clock, its components are each triggered by clock levels. The "edge-triggered D flip-flop" does not have the master-slave properties.

A positive-edge-triggered D flip-flop.


Uses

· A single flip-flop can be used to store one bit, or binary digit, of data.

· Static RAM, which is the primary type of memory used in registers to store numbers in computers and in many caches, is built out of flip-flops.

· Any one of the flip-flop types can be used to build any of the others. The data contained in several such flip-flops may represent the state of a sequencer, the value of a counter, an ASCII character in a computer's memory or any other piece of information.

· One use is to build finite state machines from electronic logic. The flip-flops remember the machine's previous state, and digital logic uses that state to calculate the next state.

· The T flip-flop is useful for constructing various types of counters. Repeated signals to the clock input will cause the flip-flop to change state once per high-to-low transition of the clock input, if its T input is "1". The output from one flip-flop can be fed to the clock input of a second and so on. The final output of the circuit, considered as the array of outputs of all the individual flip-flops, is a count, in binary, of the number of cycles of the first clock input, up to a maximum of 2^n - 1, where n is the number of flip-flops used. See: Counters

· One of the problems with such a counter (called a ripple counter) is that the output is briefly invalid as the changes ripple through the logic. There are two solutions to this problem. The first is to sample the output only when it is known to be valid. The second, more widely used, is to use a different type of circuit called a synchronous counter. This uses more complex logic to ensure that the outputs of the counter all change at the same, predictable time. See: Counters

· Frequency division: a chain of T flip-flops as described above will also function to divide an input in frequency by 2^n, where n is the number of flip-flops used between the input and the output.

Timing and metastability A flip-flop in combination with a Schmitt trigger can be used for the implementation of an arbiter in asynchronous circuits. Clocked flip-flops are prone to a problem called metastability, which happens when a data or control input is changing at the instant of the clock pulse. The result is that the output may behave unpredictably, taking many times longer than normal to settle to its correct state, or even oscillating several times before settling. Theoretically it can take


infinite time to settle down. In a computer system this can cause corruption of data or a program crash. In many cases, metastability in flip-flops can be avoided by ensuring that the data and control inputs are held constant for specified periods before and after the clock pulse, called the setup time (tsu) and the hold time (th) respectively. These times are specified in the data sheet for the device, and are typically between a few nanoseconds and a few hundred picoseconds for modern devices. Unfortunately, it is not always possible to meet the setup and hold criteria, because the flip-flop may be connected to a real-time signal that could change at any time, outside the control of the designer. In this case, the best the designer can do is to reduce the probability of error to a certain level, depending on the required reliability of the circuit. One technique for suppressing metastability is to connect two or more flip-flops in a chain, so that the output of each one feeds the data input of the next, and all devices share a common clock. With this method, the probability of a metastable event can be reduced to a negligible value, but never to zero. The probability of metastability gets closer and closer to zero as the number of flip-flops connected in series is increased. So-called metastable-hardened flip-flops are available, which work by reducing the setup and hold times as much as possible, but even these cannot eliminate the problem entirely. This is because metastability is more than simply a matter of circuit design. When the transitions in the clock and the data are close together in time, the flip-flop is forced to decide which event happened first. However fast we make the device, there is always the possibility that the input events will be so close together that it cannot detect which one happened first. It is therefore logically impossible to build a perfectly metastable-proof flip-flop. Another important timing value for a flip-flop is the clock-to-output delay (common symbol in data sheets: tCO) or propagation delay (tP), which is the time the flip-flop takes to change its output after the clock edge. The time for a high-to-low transition (tPHL) is sometimes different from the time for a low-to-high transition (tPLH). When connecting flip-flops in a chain, it is important to ensure that the tCO of the first flip-flop is longer than the hold time (tH) of the second flip-flop, otherwise the second flip-flop will not receive the data reliably. The relationship between tCO and tH is normally guaranteed if both flip-flops are of the same type. Analysis of Sequential Circuits The behavior of a sequential circuit is determined from the inputs, the outputs and the states of its flip-flops. Both the output and the next state are a function of the inputs and the present state.


The suggested analysis procedure of a sequential circuit is set out in Figure 6 below.

We start with the logic schematic from which we can derive excitation equations for each flip-flop input. Then, to obtain next-state equations, we insert the excitation equations into the characteristic equations. The output equations can be derived from the schematic, and once we have our output and next-state equations, we can generate the next-state and output tables as well as state diagrams. When we reach this stage, we use either the table or the state diagram to develop a timing diagram which can be verified through simulation.

Figure 6. Analysis procedure of sequential circuits.

Example 1.1. Modulo-4 counter

Derive the state table and state diagram for the sequential circuit shown in Figure 7.

Figure 7. Logic schematic of a sequential circuit.

SOLUTION:

STEP 1: First we derive the Boolean expressions for the inputs of each flip-flop in the schematic, in terms of the external input Cnt and the flip-flop outputs Q1 and Q0. Since there are two D flip-flops in this example, we derive two expressions, for D0 and D1:

D0 = Cnt ⊕ Q0 = Cnt'*Q0 + Cnt*Q0'
D1 = Cnt'*Q1 + Cnt*Q1'*Q0 + Cnt*Q1*Q0'

These Boolean expressions are called excitation equations since they represent the inputs to the flip-flops of the sequential circuit in the next clock cycle.

STEP 2: Derive the next-state equations by converting the excitation equations into flip-flop characteristic equations. In the case of D flip-flops, Q(next) = D, so the next-state equations equal the excitation equations:

Q0(next) = D0 = Cnt'*Q0 + Cnt*Q0'
Q1(next) = D1 = Cnt'*Q1 + Cnt*Q1'*Q0 + Cnt*Q1*Q0'

STEP 3: Now convert these next-state equations into tabular form, called the next-state table.

Present State     Next State
Q1Q0              Cnt = 0     Cnt = 1
0 0               0 0         0 1
0 1               0 1         1 0
1 0               1 0         1 1
1 1               1 1         0 0

Each row corresponds to a state of the sequential circuit and each column represents one set of input values. Since we have two flip-flops, the number of possible states is four - that is, Q1Q0 can be equal to 00, 01, 10, or 11. These are the present states, as shown in the table. For the next-state part of the table, each entry defines the value of the sequential circuit in the next clock cycle after the rising edge of the Clk. Since this value depends on the present state and the value of the input signals, the next-state table will contain one column for each assignment of binary values to the input signals. In this example, since there is only one input signal, Cnt, the next-state table shown has only two columns, corresponding to Cnt = 0 and Cnt = 1. Note that each entry in the next-state table indicates the values of the flip-flops in the next state, given the present-state value in the row header and the input values in the column header. Each of these next-state values has been computed from the next-state equations in STEP 2.

STEP 4: The state diagram is generated directly from the next-state table, as shown in Figure 8.


Figure 8. State diagram

Each arc is labelled with the values of the input signals that cause the transition from the present state (the source of the arc) to the next state (the destination of the arc). In general, the number of states in a next-state table or a state diagram will equal 2^m, where m is the number of flip-flops. Similarly, the number of arcs will equal 2^m x 2^k, where k is the number of binary input signals. Therefore, in the state diagram, there must be four states and eight transitions. Following these transition arcs, we can see that as long as Cnt = 1, the sequential circuit goes through the states in the following sequence: 0, 1, 2, 3, 0, 1, 2, .... On the other hand, when Cnt = 0, the circuit stays in its present state until Cnt changes to 1, at which point counting continues. Since this sequence is characteristic of modulo-4 counting, we can conclude that the sequential circuit in Figure 7 is a modulo-4 counter with one control signal, Cnt, which enables counting when Cnt = 1 and disables it when Cnt = 0.
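The next-state equations derived in STEP 2 can also be exercised directly to confirm this modulo-4 behaviour. The Python sketch below is our own illustration (it simply evaluates the Boolean equations once per clock cycle); it reproduces the sequence 0, 1, 2, 3, 0, ... while Cnt = 1 and holds the state while Cnt = 0.

def next_state(q1, q0, cnt):
    """Next-state equations of the circuit in Figure 7 (D flip-flops, so Q(next) = D)."""
    q0_next = (not cnt and q0) or (cnt and not q0)                                   # Cnt'*Q0 + Cnt*Q0'
    q1_next = (not cnt and q1) or (cnt and not q1 and q0) or (cnt and q1 and not q0) # Cnt'*Q1 + Cnt*Q1'*Q0 + Cnt*Q1*Q0'
    return int(q1_next), int(q0_next)

q1, q0 = 0, 0
for cycle, cnt in enumerate([1, 1, 1, 1, 1, 0, 0, 1]):
    q1, q0 = next_state(q1, q0, cnt)
    print("cycle", cycle, "Cnt =", cnt, "->", 2 * q1 + q0)
# Prints 1, 2, 3, 0, 1, then holds at 1 while Cnt = 0, then resumes counting.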

Below, we show a timing diagram, representing four clock cycles, which enables us to observe the behavior of the counter in greater detail.


Figure 9. Timing Diagram

In this timing diagram we have assumed that Cnt is asserted in clock cycle 0 at t0 and is deasserted in clock cycle 3 at time t4. We have also assumed that the counter is in state Q1Q0 = 00 in clock cycle 0. Note that on the clock's rising edge, at t1, the counter will go to state Q1Q0 = 01 with a slight propagation delay; in cycle 2, after t2, to Q1Q0 = 10; and in cycle 3, after t3, to Q1Q0 = 11. Since Cnt becomes 0 at t4, we know that the counter will stay in state Q1Q0 = 11 in the next clock cycle.

In Example 1.1 we demonstrated the analysis of a sequential circuit that has no outputs by developing a next-state table and state diagram which describes only the states and the transitions from one state to the next. In the next example we complicate our analysis by adding output signals, which means that we have to upgrade the next-state table and the state diagram to identify the value of output signals in each state.

Example 1.2

Derive the next-state and output tables and the state diagram for the sequential circuit shown in Figure 10.


Figure 10. Logic schematic of a sequential circuit.

SOLUTION: The input combinational logic in Figure 10 is the same as in Example 1.1, so the excitation and next-state equations are the same as in Example 1.1.

Excitation equations:

D0 = Cnt ⊕ Q0 = Cnt'*Q0 + Cnt*Q0'
D1 = Cnt'*Q1 + Cnt*Q1'*Q0 + Cnt*Q1*Q0'

Next-state equations:

Q0(next) = D0 = Cnt'*Q0 + Cnt*Q0'
Q1(next) = D1 = Cnt'*Q1 + Cnt*Q1'*Q0 + Cnt*Q1*Q0'

In addition, however, we have computed the output equation.

Output equation:

Y = Q1Q0

As this equation shows, the output Y will equal 1 when the counter is in state Q1Q0 = 11, and it will stay 1 as long as the counter stays in that state.

Next-state and output table:

Present State     Next State              Output
Q1Q0              Cnt = 0     Cnt = 1     Y
00                00          01          0
01                01          10          0
10                10          11          0
11                11          00          1

State diagram:

Figure 11. State diagram of the sequential circuit in Figure 10.

Timing diagram:


Figure 12. Timing diagram of the sequential circuit in Figure 10.

Note that the counter will reach the state Q1Q0 = 11 only in the third clock cycle, so the output Y will equal 1 after Q0 changes to 1. Since counting is disabled in the third clock cycle, the counter will stay in the state Q1Q0 = 11 and Y will stay asserted in all succeeding clock cycles until counting is enabled again.

Design of Sequential Circuits

The design of a synchronous sequential circuit starts from a set of specifications and culminates in a logic diagram or a list of Boolean functions from which a logic diagram can be obtained. In contrast to combinational logic, which is fully specified by a truth table, a sequential circuit requires a state table for its specification. The first step in the design of sequential circuits is to obtain a state table or an equivalent representation, such as a state diagram. A synchronous sequential circuit is made up of flip-flops and combinational gates. The design of the circuit consists of choosing the flip-flops and then finding the combinational structure which, together with the flip-flops, produces a circuit that fulfils the required specifications. The number of flip-flops is determined from the number of states needed in the circuit. The recommended steps for the design of sequential circuits are set out below.


Design of Sequential Circuits

This example is taken from M. M. Mano, Digital Design, Prentice Hall, 1984, p. 235.

Example 1.3

We wish to design a synchronous sequential circuit whose state diagram is shown in Figure 13. The type of flip-flop to be used is J-K.

Figure 13. State diagram

From the state diagram, we can generate the state table shown in Table 9. Note that there is no output section for this circuit. Two flip-flops are needed to represent the four states; they are designated Q0Q1. The input variable is labeled x.

Present State     Next State
Q0 Q1             x = 0       x = 1
0 0               0 0         0 1
0 1               1 0         0 1
1 0               1 0         1 1
1 1               1 1         0 0

Table 9. State table.

We shall now derive the excitation table and the combinational structure. The table is now arranged in a different form, shown in Table 11, where the present state and input variables are arranged in the form of a truth table. Recall the excitation table for the JK flip-flop, shown in Table 10.

Table 10. Excitation table for the JK flip-flop

Output Transitions     Flip-flop inputs
Q → Q(next)            J K
0 → 0                  0 X
0 → 1                  1 X
1 → 0                  X 1
1 → 1                  X 0

Table 11. Excitation table of the circuit

Present State     Next State     Input     Flip-flop Inputs
Q0 Q1             Q0 Q1          x         J0 K0     J1 K1
0 0               0 0            0         0  X      0  X
0 0               0 1            1         0  X      1  X
0 1               1 0            0         1  X      X  1
0 1               0 1            1         0  X      X  0
1 0               1 0            0         X  0      0  X
1 0               1 1            1         X  0      1  X
1 1               1 1            0         X  0      X  0
1 1               0 0            1         X  1      X  1

In the first row of Table 11, we have a transition for flip-flop Q0 from 0 in the present state to 0 in the next state. In Table 10 we find that a transition of states from 0 to 0 requires that input J = 0 and input K = X. So 0 and X are copied in the first row under J0 and K0 respectively. Since the first row also shows a transition for the flip-flop Q1 from 0 in the present state to 0 in the next state, 0 and X are copied in the first row under J1 and K1. This process is continued for each row of the table and for each flip-flop, with the input conditions as specified in Table 10.

The simplified Boolean functions for the combinational circuit can now be derived. The input variables are Q0, Q1, and x; the outputs are the variables J0, K0, J1 and K1. The information from the truth table is plotted on the Karnaugh maps shown in Figure 14.

Figure 14. Karnaugh maps

The flip-flop input functions are derived:

J0 = Q1*x'        K0 = Q1*x
J1 = x            K1 = Q0'*x' + Q0*x = Q0 ¤ x

Note: the symbol ¤ is exclusive-NOR. The logic diagram is drawn in Figure 15.

Figure 15. Logic diagram of the sequential circuit.
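As a cross-check of the derived input functions (our own illustrative Python, not part of Mano's text), we can drive them through the JK characteristic equation Qnext = J*Q' + K'*Q and confirm that they reproduce the state table of Table 9.

def jk_next(q, j, k):
    """JK flip-flop characteristic equation: Qnext = J*Q' + K'*Q."""
    return int((j and not q) or (not k and q))

def circuit_next(q0, q1, x):
    # Flip-flop input functions derived from the Karnaugh maps above.
    j0, k0 = int(q1 and not x), int(q1 and x)   # J0 = Q1*x',  K0 = Q1*x
    j1, k1 = x, int(q0 == x)                    # J1 = x,      K1 = Q0 xnor x
    return jk_next(q0, j0, k0), jk_next(q1, j1, k1)

for q0 in (0, 1):
    for q1 in (0, 1):
        for x in (0, 1):
            print((q0, q1), "x =", x, "->", circuit_next(q0, q1, x))
# Each printed row should match the corresponding entry of the state table (Table 9).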

Example 1.4

Design a sequential circuit whose state table is specified in Table 12, using D flip-flops.

Table 12. State table of a sequential circuit.


Present State     Next State              Output
Q0 Q1             x = 0       x = 1       x = 0     x = 1
0 0               0 0         0 1         0         0
0 1               0 0         1 0         0         0
1 0               1 1         1 0         0         0
1 1               0 0         0 1         0         1

Table 13. Excitation table for a D flip-flop

Output Transitions     Flip-flop inputs
Q → Q(next)            D
0 → 0                  0
0 → 1                  1
1 → 0                  0
1 → 1                  1

Next step is to derive the excitation table for the design circuit, which is shown in Table 14. The output of the circuit is labeled Z.

Present State     Next State     Input     Flip-flop Inputs     Output
Q0 Q1             Q0 Q1          x         D0 D1                Z
0 0               0 0            0         0  0                 0
0 0               0 1            1         0  1                 0
0 1               0 0            0         0  0                 0
0 1               1 0            1         1  0                 0
1 0               1 1            0         1  1                 0
1 0               1 0            1         1  0                 0
1 1               0 0            0         0  0                 0
1 1               0 1            1         0  1                 1

Table 14. Excitation table

Now plot the flip-flop inputs and the output function on the Karnaugh maps to derive the Boolean expressions, as shown in Figure 16.


Figure 16. Karnaugh maps

The simplified Boolean expressions are:

D0 = Q0*Q1' + Q0'*Q1*x
D1 = Q0'*Q1'*x + Q0*Q1*x + Q0*Q1'*x'
Z = Q0*Q1*x

Finally, draw the logic diagram.
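The same kind of cross-check used in Example 1.3 works here as well (again an illustrative Python sketch of our own). Because Q(next) = D for a D flip-flop, feeding each present state and input through the simplified expressions should reproduce the next states and output of Table 12.

def example_1_4(q0, q1, x):
    """Evaluate the simplified expressions; for D flip-flops, Q(next) = D."""
    d0 = int((q0 and not q1) or (not q0 and q1 and x))
    d1 = int((not q0 and not q1 and x) or (q0 and q1 and x) or (q0 and not q1 and not x))
    z  = int(q0 and q1 and x)
    return (d0, d1), z

for q0 in (0, 1):
    for q1 in (0, 1):
        for x in (0, 1):
            print((q0, q1), "x =", x, "->", example_1_4(q0, q1, x))
# For instance, present state Q0Q1 = 11 with x = 1 should give next state 01 and Z = 1.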

Figure 17. Logic diagram of the sequential circuit.

Register Transfer Language (RTL) is a term used in computer science. It is an intermediate representation used by the GCC compiler. RTL is used to represent the code being generated, in a form closer to assembly language than to the high-level languages which GCC compiles. RTL is generated from the GCC Abstract Syntax Tree representation, transformed by various passes in the GCC 'middle-end', and then converted to assembly language. GCC currently uses the RTL form to do a part of its optimization work. RTL is usually written in a form which looks like a Lisp S-expression:

(set:SI (reg:SI 140) (plus:SI (reg:SI 138) (reg:SI 139)))

This "side-effect expression" says "add the contents of register 138 to the contents of register 139 and store the result in register 140." The RTL generated for a program is different when GCC generates code for different processors. However, the meaning of the RTL is more-or-less independent of the target: it would usually be possible to read and understand a piece of RTL without knowing what processor it was generated for. Similarly, the meaning of the RTL doesn't usually depend on the original high-level language of the program.

LESSON IV: Memory and Storage

Types of memory

Many types of memory devices are available for use in modern computer systems. As an embedded software engineer, you must be aware of the differences between them and understand how to use each type effectively. In our discussion, we will approach these devices from the software developer's perspective. Keep in mind that the development of these devices took several decades and that their underlying hardware differs significantly. The names of the memory types frequently reflect the historical nature of the development process and are often more confusing than insightful. Figure 1 classifies the memory devices we'll discuss as RAM, ROM, or a hybrid of the two.

Figure 1. Common memory types in embedded systems

Types of RAM

The RAM family includes two important memory devices: static RAM (SRAM) and dynamic RAM (DRAM). The primary difference between them is the lifetime of the data


they store. SRAM retains its contents as long as electrical power is applied to the chip. If the power is turned off or lost temporarily, its contents will be lost forever. DRAM, on the other hand, has an extremely short data lifetime-typically about four milliseconds. This is true even when power is applied constantly. In short, SRAM has all the properties of the memory you think of when you hear the word RAM. Compared to that, DRAM seems kind of useless. By itself, it is. However, a simple piece of hardware called a DRAM controller can be used to make DRAM behave more like SRAM. The job of the DRAM controller is to periodically refresh the data stored in the DRAM. By refreshing the data before it expires, the contents of memory can be kept alive for as long as they are needed. So DRAM is as useful as SRAM after all. When deciding which type of RAM to use, a system designer must consider access time and cost. SRAM devices offer extremely fast access times (approximately four times faster than DRAM) but are much more expensive to produce. Generally, SRAM is used only where access speed is extremely important. A lower cost-per-byte makes DRAM attractive whenever large amounts of RAM are required. Many embedded systems include both types: a small block of SRAM (a few kilobytes) along a critical data path and a much larger block of DRAM (perhaps even Megabytes) for everything else. Types of ROM Memories in the ROM family are distinguished by the methods used to write new data to them (usually called programming), and the number of times they can be rewritten. This classification reflects the evolution of ROM devices from hardwired to programmable to erasable-and-programmable. A common feature of all these devices is their ability to retain data and programs forever, even during a power failure. The very first ROMs were hardwired devices that contained a preprogrammed set of data or instructions. The contents of the ROM had to be specified before chip production, so the actual data could be used to arrange the transistors inside the chip. Hardwired memories are still used, though they are now called masked ROMs to distinguish them from other types of ROM. The primary advantage of a masked ROM is its low production cost. Unfortunately, the cost is low only when large quantities of the same ROM are required. One step up from the masked ROM is the PROM (programmable ROM), which is purchased in an unprogrammed state. If you were to look at the contents of an unprogrammed PROM, you would see that the data is made up entirely of 1's. The process of writing your data to the PROM involves a special piece of equipment called a device programmer. The device programmer writes data to the device one word at a time by applying an electrical charge to the input pins of the chip. Once a PROM has been programmed in this way, its contents can never be changed. If the code or data stored in the PROM must be changed, the current device must be discarded. As a result, PROMs are also known as one-time programmable (OTP) devices.


An EPROM (erasable-and-programmable ROM) is programmed in exactly the same manner as a PROM. However, EPROMs can be erased and reprogrammed repeatedly. To erase an EPROM, you simply expose the device to a strong source of ultraviolet light. (A window in the top of the device allows the light to reach the silicon.) By doing this, you essentially reset the entire chip to its initial--unprogrammed--state. Though more expensive than PROMs, their ability to be reprogrammed makes EPROMs an essential part of the software development and testing process. Hybrids As memory technology has matured in recent years, the line between RAM and ROM has blurred. Now, several types of memory combine features of both. These devices do not belong to either group and can be collectively referred to as hybrid memory devices. Hybrid memories can be read and written as desired, like RAM, but maintain their contents without electrical power, just like ROM. Two of the hybrid devices, EEPROM and flash, are descendants of ROM devices. These are typically used to store code. The third hybrid, NVRAM, is a modified version of SRAM. NVRAM usually holds persistent data. EEPROMs are electrically-erasable-and-programmable. Internally, they are similar to EPROMs, but the erase operation is accomplished electrically, rather than by exposure to ultraviolet light. Any byte within an EEPROM may be erased and rewritten. Once written, the new data will remain in the device forever--or at least until it is electrically erased. The primary tradeoff for this improved functionality is higher cost, though write cycles are also significantly longer than writes to a RAM. So you wouldn't want to use an EEPROM for your main system memory. Flash memory combines the best features of the memory devices described thus far. Flash memory devices are high density, low cost, nonvolatile, fast (to read, but not to write), and electrically reprogrammable. These advantages are overwhelming and, as a direct result, the use of flash memory has increased dramatically in embedded systems. From a software viewpoint, flash and EEPROM technologies are very similar. The major difference is that flash devices can only be erased one sector at a time, not byte-by-byte. Typical sector sizes are in the range 256 bytes to 16KB. Despite this disadvantage, flash is much more popular than EEPROM and is rapidly displacing many of the ROM devices as well. The third member of the hybrid memory class is NVRAM (non-volatile RAM). No volatility is also a characteristic of the ROM and hybrid memories discussed previously. However, an NVRAM is physically very different from those devices. An NVRAM is usually just an SRAM with a battery backup. When the power is turned on, the NVRAM operates just like any other SRAM. When the power is turned off, the NVRAM draws just enough power from the battery to retain its data. NVRAM is fairly common in embedded systems. However, it is expensive--even more expensive than SRAM, because of the


battery--so its applications are typically limited to the storage of a few hundred bytes of system-critical information that can't be stored in any better way. Table 1 summarizes the features of each type of memory discussed here, but keep in mind that different memory types serve different purposes. Each memory type has its strengths and weaknesses. Side-by-side comparisons are not always effective.

Type         Volatile?   Writeable?                       Erase Size    Max Erase Cycles              Cost (per Byte)              Speed
SRAM         Yes         Yes                              Byte          Unlimited                     Expensive                    Fast
DRAM         Yes         Yes                              Byte          Unlimited                     Moderate                     Moderate
Masked ROM   No          No                               n/a           n/a                           Inexpensive                  Fast
PROM         No          Once, with a device programmer   n/a           n/a                           Moderate                     Fast
EPROM        No          Yes, with a device programmer    Entire chip   Limited (consult datasheet)   Moderate                     Fast
EEPROM       No          Yes                              Byte          Limited (consult datasheet)   Expensive                    Fast to read, slow to erase/write
Flash        No          Yes                              Sector        Limited (consult datasheet)   Moderate                     Fast to read, slow to erase/write
NVRAM        No          Yes                              Byte          Unlimited                     Expensive (SRAM + battery)   Fast

Table 1. Characteristics of the various memory types
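Because flash can only be erased a sector at a time (as noted above), updating even a single byte involves reading, erasing and rewriting the whole sector, whereas an EEPROM-style device can rewrite the byte directly. The toy Python model below is purely illustrative - the sector size, capacity, class and method names are our own assumptions, not a real flash driver API - but it shows the mechanism.

SECTOR_SIZE = 256   # assumed sector size in bytes (real parts range from 256 B to 16 KB)
ERASED = 0xFF       # erased flash cells read back as all 1s

class ToyFlash:
    """Very small model of flash-style behaviour: programming can only clear bits,
    and erasing works on whole sectors."""
    def __init__(self, size):
        self.mem = bytearray([ERASED] * size)

    def erase_sector(self, addr):
        start = (addr // SECTOR_SIZE) * SECTOR_SIZE
        self.mem[start:start + SECTOR_SIZE] = bytes([ERASED] * SECTOR_SIZE)

    def program(self, addr, value):
        # Programming can only change 1 bits to 0 bits.
        self.mem[addr] &= value

    def rewrite_byte(self, addr, value):
        """Change one byte: read, erase and rewrite the whole sector."""
        start = (addr // SECTOR_SIZE) * SECTOR_SIZE
        sector = bytes(self.mem[start:start + SECTOR_SIZE])   # 1. read the sector
        self.erase_sector(addr)                                # 2. erase it
        for i, old in enumerate(sector):                       # 3. write it back with the change
            self.program(start + i, value if start + i == addr else old)

flash = ToyFlash(1024)
flash.program(10, 0x55)
flash.rewrite_byte(10, 0xAA)   # changing 0x55 to 0xAA needs an erase first (0-to-1 transitions)
print(hex(flash.mem[10]))      # 0xaa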

Computer storage, computer memory, and often casually memory refer to computer components, devices and recording media that retain data for some interval of time. Computer storage provides one of the core functions of the modern computer, that of information retention. It is one of the fundamental components of all modern computers, and coupled with a central processing unit (CPU), implements the basic Von Neumann computer model used since the 1940s.

In contemporary usage, memory usually refers to a form of solid state storage known as random access memory (RAM) and sometimes other forms of fast but temporary storage. Similarly, storage more commonly refers to mass storage - optical discs, forms of magnetic storage like hard disks, and other types of storage which are slower than RAM, but of a more permanent nature. These contemporary distinctions are helpful, because they are also fundamental to the architecture of computers in general. As well, they reflect an important and significant technical difference between memory and mass storage devices, which has been blurred by the historical usage of the terms "main storage" (and sometimes "primary storage") for random access memory, and "secondary storage" for mass storage devices. This is explained in the following sections, in which the traditional "storage" terms are used as sub-headings for convenience.

Purposes of storage

The fundamental components of a general-purpose computer are the arithmetic and logic unit, control circuitry, storage space, and input/output devices. If storage was removed, the device we had would be a simple digital signal processing device (e.g. calculator, media player) instead of a computer.

The ability to store instructions that form a computer program, and the information that the instructions manipulate, is what makes stored program architecture computers versatile.

A digital computer represents information using the binary numeral system. Text, numbers, pictures, audio, and nearly any other form of information can be converted into a string of bits, or binary digits, each of which has a value of 1 or 0. The most common unit of storage is the byte, equal to 8 bits. A piece of information can be manipulated by any computer whose storage space is large enough to accommodate the corresponding data, or the binary representation of the piece of information. For example, a computer with a storage space of eight million bits, or one megabyte, could be used to edit a small novel.

Various forms of storage, based on various natural phenomena, have been invented. So far, no practical universal storage medium exists, and all forms of storage have some drawbacks. Therefore a computer system usually contains several kinds of storage, each with an individual purpose, as shown in the diagram, which divides the various forms of storage according to their distance from the central processing unit and also notes common technology and capacity found in home computers of 2005.

Primary storage

Primary storage is directly connected to the central processing unit of the computer. It must be present for the CPU to function correctly, just as in a biological analogy the lungs must be present (for oxygen storage) for the heart to function (to pump and oxygenate the blood). As shown in the diagram, primary storage typically consists of three kinds of storage:

· Processor registers are internal to the central processing unit. Registers contain information that the arithmetic and logic unit needs to carry out the current instruction. They are technically the fastest of all forms of computer storage, being switching transistors integrated on the CPU's silicon chip, and functioning as electronic "flip-flops".

· Cache memory is a special type of internal memory used by many central processing units to increase their performance or "throughput". Some of the information in the main memory is duplicated in the cache memory, which is slightly slower but of much greater capacity than the processor registers, and faster but much smaller than main memory. Multi-level cache memory is also commonly used - "primary cache" being smallest, fastest and closest to the processing device; "secondary cache" being larger and slower, but still faster and much smaller than main memory.

· Main memory contains the programs that are currently being run and the data the programs are operating on. In modern computers, the main memory is the electronic solid-state random access memory. It is directly connected to the CPU via a "memory bus" (shown in the diagram) and a "data bus". The arithmetic and logic unit can very quickly transfer information between a processor register and locations in main storage, also known as "memory addresses". The memory bus is also called an address bus or front side bus, and both buses are high-speed digital "superhighways". Access methods and speed are two of the fundamental technical differences between memory and mass storage devices. (Note that all memory sizes and storage capacities shown in the diagram will inevitably be exceeded with advances in technology over time.)

Secondary and off-line storage Secondary storage requires the computer to use its input/output channels to access the information, and is used for long-term storage of persistent information. However most computer operating systems also use secondary storage devices as virtual memory - to artificially increase the apparent amount of main memory in the computer. Secondary storage is also known as "mass storage", as shown in the diagram above. Secondary or mass storage is typically of much greater capacity than primary storage (main memory), but it is also much slower. In modern computers, hard disks are usually used for mass storage. The time taken to access a given byte of information stored on a hard disk is typically a few thousandths of a second, or milliseconds. By contrast, the time taken to access a given byte of information stored in random access memory is measured in thousand-millionths of a second, or nanoseconds. This illustrates the very significant speed difference which distinguishes solid-state memory from rotating magnetic storage devices: hard disks are typically about a million times slower than memory. Rotating optical storage devices (such as CD and DVD drives) are typically even slower than hard disks, although their access speeds are likely to improve with advances in technology. Therefore, the use of virtual memory, which is millions of times slower than "real" memory, significantly degrades the performance of any computer. Virtual memory is implemented by many operating systems using terms like swap file or "cache file". The main historical advantage of virtual memory was that it was much less expensive than real memory. That advantage is less relevant today, yet surprisingly most operating systems continue to implement it, despite the significant performance penalties.


Off-line storage is a system where the storage medium can be easily removed from the storage device. Off-line storage is used for data transfer and archival purposes. In modern computers, compact discs, DVDs, memory cards, flash memory devices including "USB drives", floppy disks, Zip disks and magnetic tapes are commonly used for off-line mass storage purposes. "Hot-pluggable" USB hard disks are also available. Off-line storage devices used in the past include punched cards, microforms, and removable Winchester disk drums.

Tertiary and database storage

Tertiary storage is a system where a robotic arm will "mount" (connect) or "dismount" off-line mass storage media according to the computer operating system's demands. Tertiary storage is used in the realms of enterprise storage and scientific computing on large computer systems and business computer networks, and is something a typical personal computer user never sees firsthand.

Database storage is a system where information in computers is stored in large databases, data banks, data warehouses, or data vaults. It involves packing and storing large numbers of storage devices throughout a series of shelves in a room, usually an office, all linked together. The information in database storage systems can be accessed by a supercomputer, mainframe computer, or personal computer. Databases, data banks, and data warehouses, etc., can only be accessed by authorized users.

Network storage

Network storage is any type of computer storage that involves accessing information over a computer network. Network storage arguably makes it possible to centralize information management in an organization and to reduce the duplication of information. Network storage includes:

· Network-attached storage is secondary or tertiary storage attached to a computer which another computer can access over a local-area network, a private wide-area network, or in the case of online file storage, over the Internet.

· Network computers are computers that do not contain internal secondary storage devices. Instead, documents and other data are stored on network-attached storage.

Confusingly, these terms are sometimes used differently. Primary storage can be used to refer to local random-access disk storage, which should properly be called secondary storage. If this type of storage is called primary storage, then the term secondary storage would refer to offline, sequential-access storage like tape media.

Characteristics of storage


The division into primary, secondary, tertiary and off-line storage is based on memory hierarchy, or distance from the central processing unit. There are also other ways to characterize the various types of storage.

Volatility of information

Volatile memory requires constant power to maintain the stored information. Volatile memory is typically used only for primary storage. (Primary storage is not necessarily volatile, even though today's most cost-effective primary storage technologies are. Non-volatile technologies have been widely used for primary storage in the past and may again be in the future.)

Non-volatile memory will retain the stored information even if it is not constantly supplied with electric power. It is suitable for long-term storage of information, and is therefore used for secondary, tertiary, and off-line storage.

Dynamic memory is volatile memory which also requires that stored information be periodically refreshed, or read and rewritten without modifications.

Ability to access non-contiguous information

Random access means that any location in storage can be accessed at any moment in the same, usually small, amount of time. This makes random access memory well suited for primary storage.

Sequential access means that accessing a piece of information will take a varying amount of time, depending on which piece of information was accessed last. The device may need to seek (e.g. to position the read/write head correctly), or cycle (e.g. to wait for the correct location in a revolving medium to appear below the read/write head).

Ability to change information

· Read/write storage, or mutable storage, allows information to be overwritten at any time. A computer without some amount of read/write storage for primary storage purposes would be useless for many tasks. Modern computers typically use read/write storage also for secondary storage.

· Read only storage retains the information stored at the time of manufacture, and write once storage (WORM) allows the information to be written only once at some point after manufacture. These are called immutable storage. Immutable storage is used for tertiary and off-line storage. Examples include CD-R.

· Slow write, fast read storage is read/write storage which allows information to be overwritten multiple times, but with the write operation being much slower than the read operation. Examples include CD-RW.


Addressability of information

· In location-addressable storage, each individually accessible unit of information in storage is selected with its numerical memory address. In modern computers, location-addressable storage is usually limited to primary storage, accessed internally by computer programs, since location-addressability is very efficient, but burdensome for humans.

· In file system storage, information is divided into files of variable length, and a particular file is selected with human-readable directory and file names. The underlying device is still location-addressable, but the operating system of a computer provides the file system abstraction to make the operation more understandable. In modern computers, secondary, tertiary and off-line storage use file systems.

· In content-addressable storage, each individually accessible unit of information is selected with a hash value, or a short identifier with no pertinence to the memory address the information is stored on. Content-addressable storage can be implemented using software (a computer program) or hardware (a computer device), with hardware being the faster but more expensive option.

Capacity and performance

· Storage capacity is the total amount of stored information that a storage device or medium can hold. It is expressed as a quantity of bits or bytes (e.g. 10.4 megabytes).

· Storage density refers to the compactness of stored information. It is the storage capacity of a medium divided by a unit of length, area or volume (e.g. 1.2 megabytes per square centimeter).

· Latency is the time it takes to access a particular location in storage. The relevant unit of measurement is typically nanoseconds for primary storage, milliseconds for secondary storage, and seconds for tertiary storage. It may make sense to separate read latency and write latency, and in the case of sequential access storage, minimum, maximum and average latency.

· Throughput is the rate at which information can be read from or written to the storage. In computer storage, throughput is usually expressed in terms of megabytes per second or MB/s, though bit rate may also be used. As with latency, read rate and write rate may need to be differentiated. A short worked example combining latency and throughput follows this list.
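To see how latency and throughput combine in practice, the small calculation below uses assumed, round numbers (not measurements of any particular device) to estimate the time needed to read a file: roughly one access latency plus the file size divided by the sustained throughput.

def transfer_time_seconds(size_megabytes, latency_ms, throughput_mb_per_s):
    """Rough estimate: one initial access latency plus the streaming time."""
    return latency_ms / 1000.0 + size_megabytes / throughput_mb_per_s

# Assumed example figures: a hard disk with ~10 ms latency and 50 MB/s throughput
# versus RAM-like storage with ~100 ns latency and 1000 MB/s throughput.
print("disk:", transfer_time_seconds(100, 10.0, 50.0), "s")      # about 2.01 s for 100 MB
print("ram :", transfer_time_seconds(100, 0.0001, 1000.0), "s")  # about 0.1 s for 100 MB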


Technologies, devices and media

Magnetic storage

Magnetic storage uses different patterns of magnetization on a magnetically coated surface to store information. Magnetic storage is non-volatile. The information is accessed using one or more read/write heads. Since the read/write head only covers a part of the surface, magnetic storage is sequential access and must seek, cycle or both. In modern computers, the magnetic surface will take these forms:

· Magnetic disk
  · Floppy disk, used for off-line storage
  · Hard disk, used for secondary storage
· Magnetic tape, used for tertiary and off-line storage

In early computers, magnetic storage was also used for primary storage, in the form of magnetic drum memory, core memory, core rope memory, thin film memory, twistor memory or bubble memory. Also, unlike today, magnetic tape was often used for secondary storage.

Semiconductor storage

Semiconductor memory uses semiconductor-based integrated circuits to store information. A semiconductor memory chip may contain millions of tiny transistors or capacitors. Both volatile and non-volatile forms of semiconductor memory exist. In modern computers, primary storage almost exclusively consists of dynamic volatile semiconductor memory, or dynamic random access memory. Since the turn of the century, a type of non-volatile semiconductor memory known as flash memory has steadily gained share as off-line storage for home computers. Non-volatile semiconductor memory is also used for secondary storage in various advanced electronic devices and specialized computers.

Optical disc storage

Optical disc storage uses tiny pits etched on the surface of a circular disc to store information, and reads this information by illuminating the surface with a laser diode and observing the reflection. Optical disc storage is non-volatile and sequential access. The following forms are currently in common use:

· CD, CD-ROM, DVD: Read only storage, used for mass distribution of digital information (music, video, computer programs)
· CD-R, DVD-R, DVD+R: Write once storage, used for tertiary and off-line storage
· CD-RW, DVD-RW, DVD+RW, DVD-RAM: Slow write, fast read storage, used for tertiary and off-line storage
· Blu-ray
· HD DVD

The following forms have also been proposed:

· HVD
· Phase-change Dual

Magneto-optical disc storage

Magneto-optical disc storage is optical disc storage where the magnetic state of a ferromagnetic surface stores the information. The information is read optically and written by combining magnetic and optical methods. Magneto-optical disc storage is non-volatile, sequential access, slow write, fast read storage used for tertiary and off-line storage.

Ultra Density Optical disc storage

An Ultra Density Optical disc or UDO is a 5.25" ISO cartridge optical disc encased in a dust-proof caddy which can store up to 30 GB of data. Utilizing a design based on the magneto-optical disc, but using phase-change technology combined with a blue-violet laser, a UDO disc can store substantially more data than a magneto-optical disc or MO, because of the shorter wavelength (405 nm) of the blue-violet laser employed. MO media use a 650-nm-wavelength red laser. Because its beam is narrower when burning to a disc than that of the red laser used for MO, a blue-violet laser allows more information to be stored digitally in the same amount of space. Current generations of UDO store up to 30 GB, with 60 GB and 120 GB versions of UDO in development and expected to arrive sometime in 2007 and beyond, though up to 500 GB has been speculated as a possibility for UDO. [1]

Optical jukebox storage

Optical jukebox storage is a robotic storage device that uses an optical disc drive and can automatically load and unload optical discs to provide terabytes of near-line information. The devices are often called optical disk libraries, robotic drives, or autochangers. Jukebox devices may have up to 1,000 slots for disks, and usually have a picking device that traverses the slots and drives. The arrangement of the slots and picking devices affects performance, depending on the space between a disk and the picking device. Seek times and transfer rates vary depending upon the optical technology.

Jukeboxes are used in high-capacity archive storage environments such as imaging, medical, and video. Hierarchical storage management (HSM) is a strategy that moves little-used or unused files from fast magnetic storage to optical jukebox devices in a process called migration; if the files are needed again, they are migrated back to magnetic disk, as the sketch below suggests.
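
The migration step can be pictured with a hedged Python sketch; the directory names and the 90-day threshold are invented for this example, and a real HSM product would also leave stubs behind and recall files transparently when they are reopened.

import os, shutil, time

FAST_DIR, ARCHIVE_DIR, AGE_DAYS = "/fast", "/archive", 90   # hypothetical locations

def migrate_old_files():
    # Move files that have not been accessed for AGE_DAYS from fast storage
    # to the archive tier (the "migration" described above).
    cutoff = time.time() - AGE_DAYS * 86400
    for name in os.listdir(FAST_DIR):
        path = os.path.join(FAST_DIR, name)
        if os.path.isfile(path) and os.path.getatime(path) < cutoff:
            shutil.move(path, os.path.join(ARCHIVE_DIR, name))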

Other early methods

Paper tape and punch cards have been used to store information for automatic processing since the 1890s, long before general-purpose computers existed. Information was recorded by punching holes into the paper or cardboard medium, and was read by electrically (or, later, optically) sensing whether a particular location on the medium was solid or contained a hole.

The Williams tube used a cathode ray tube, and the Selectron tube used a large vacuum tube, to store information. These primary storage devices were short-lived in the market, since the Williams tube was unreliable and the Selectron tube was expensive. Delay line memory used sound waves in a substance such as mercury to store information. Delay line memory was dynamic volatile, cycle sequential read/write storage, and was used for primary storage.

Other proposed methods

Phase-change memory uses different mechanical phases of a phase-change material to store information, and reads the information by observing the varying electric resistance of the material. Phase-change memory would be non-volatile, random access read/write storage, and might be used for primary, secondary and off-line storage.

Holographic storage stores information optically inside crystals or photopolymers. Holographic storage can utilize the whole volume of the storage medium, unlike optical disc storage, which is limited to a small number of surface layers. Holographic storage would be non-volatile, sequential access, and either write once or read/write storage. It might be used for secondary and off-line storage.

Molecular memory stores information in polymers that can store electric charge. Molecular memory might be especially suited for primary storage.

LESSON V: A Simple Computer: Hardware Design


A computer program is a collection of instructions that describe a task, or set of tasks, to be carried out by a computer. The term computer program may refer to source code, written in a programming language, or to the executable form of this code. Computer programs are also known as software, application programs, system software or simply programs.

The source code of most computer programs consists of a list of instructions that explicitly implement an algorithm (known as an imperative programming style); in another form (known as declarative programming), the characteristics of the required information are specified, and the method used to obtain the results, if any, is left to the platform. Computer programs are often written by people known as computer programmers, but may also be generated by other programs.

Terminology

Commercial computer programs aimed at end-users are commonly referred to as application software by the computer industry, as these programs are focused on the functionality of what the computer is being used for (its application), as opposed to being focused on system-level functionality (as the Windows operating system software is, for example). In practice, colloquially, both application software and system software may correctly be referred to as programs, as may the more esoteric firmware (software firmly built into an embedded system).

Programs that execute on the hardware are a set of instructions in a format understandable by the instruction set of the computer's main processor, which cause specific operations to be carried out or a simple computation like addition to be performed. Computers execute millions of such instructions per second, and that is the program: a sequence of instructions strung together such that, when executed, they do something useful, usually repeatably and reliably.

For differences in the usage of the spellings program and programme, see American and British English spelling differences.

Program execution

A computer program is loaded into memory (usually by the operating system) and then executed ("run"), instruction by instruction, until termination, either with success or through a software or hardware error. Before a computer can execute any sort of program (including the operating system, itself a program), the computer hardware must be initialized. This initialization is done in modern PCs by a piece of software stored on programmable memory chips installed by


the manufacturer, called the BIOS. The BIOS will attempt to initialize the boot sequence, making the computer ready for higher-level program execution.

Programs vs. data

The executable form of a program (that is, usually object code) is often treated as being different from the data the program operates on. In some cases this distinction is blurred, with programs creating or modifying data which is subsequently executed as part of the same program (this is a common occurrence for programs written in Lisp); see self-modifying code.

Programming

Main article: Computer programming

A program is likely to contain a variety of data structures and a variety of different algorithms to operate on them. Creating a computer program is the iterative process of writing new source code or modifying existing source code, followed by testing, analyzing and refining this code. A person who practices this skill is referred to as a computer programmer or software developer. The sometimes lengthy process of computer programming is now referred to as "software development" or software engineering, the latter becoming more popular due to the increasing maturity of the discipline. (See Debate over who is a software engineer.)

Two other modern approaches are team programming, where each member of the group has an equal say in the development process except for one person who guides the group through discrepancies (these groups tend to be around 10 people to keep them manageable), and "peer programming" or pair programming. See Process and methodology for the different aspects of modern day computer programming.

Trivia

The world's shortest useful program is usually agreed to be the utility cont/rerun used on the old operating system CP/M. It was 2 bytes long (JMP 100), jumping to the start position of the program that had previously been run and so restarting the program in memory, without loading it from the much slower disks of the 1980s.

According to the International Obfuscated C Code Contest, the world's smallest "program" consisted of a file containing zero bytes, which when run output zero bytes to the screen (also making it the world's smallest self-replicating program). This


"program" was qualified as such only due to a flaw in the language of the contest rules, which were soon after modified to require the program to be greater than zero bytes. Ada Lovelace wrote a set of notes specifying in complete detail a method for calculating Bernoulli numbers with the Analytical Engine described by Charles Babbage. This is recognized as the world's first computer program and she is recognized as the world's first computer programmer by historians. In computer science, a data structure is a way of storing data in a computer so that it can be used efficiently. Often a carefully chosen data structure will allow a more efficient algorithm to be used. The choice of the data structure often begins from the choice of an abstract data structure. A well-designed data structure allows a variety of critical operations to be performed, using as few resources, both execution time and memory space, as possible. Data structures are implemented using the data types, references and operations on them provided by a programming language. Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to certain tasks. For example, B-trees are particularly well-suited for implementation of databases, while routing tables rely on networks of machines to function.

A binary tree, a simple type of branching linked data structure.

In the design of many types of programs, the choice of data structures is a primary design consideration, as experience in building large systems has shown that the difficulty of implementation and the quality and performance of the final result depend heavily on choosing the best data structure. After the data structures are chosen, the algorithms to be used often become relatively obvious. Sometimes things work in the opposite direction: data structures are chosen because certain key tasks have algorithms that work best with particular data structures. In either case, the choice of appropriate data structures is crucial.


This insight has given rise to many formalized design methods and programming languages in which data structures, rather than algorithms, are the key organizing factor. Most languages feature some sort of module system, allowing data structures to be safely reused in different applications by hiding their verified implementation details behind controlled interfaces. Object-oriented programming languages such as C++ and Java in particular use classes for this purpose. Since data structures are so crucial to professional programs, many of them enjoy extensive support in the standard libraries of modern programming languages and environments, such as C++'s Standard Template Library, the Java API, and the Microsoft .NET Framework.

The fundamental building blocks of most data structures are arrays, records, discriminated unions, and references. For example, the nullable reference, a reference which can be null, is a combination of references and discriminated unions, and the simplest linked data structure, the linked list, is built from records and nullable references (a minimal sketch appears after the list of common data structures below).

There is some debate about whether data structures represent implementations or interfaces. How they are seen may be a matter of perspective. A data structure can be viewed as an interface between two functions, or as an implementation of methods to access storage that is organized according to the associated data type.

Common data structures

Main article: List of data structures

· stacks
· queues
· linked lists
· trees
· graphs
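
As an illustration of the earlier point that the simplest linked data structure, the linked list, is built from records and nullable references, here is a minimal Python sketch; the class and method names are invented for this example.

class Node:
    # One record of the list: a value plus a nullable reference to the next node.
    def __init__(self, value, next=None):
        self.value = value
        self.next = next              # None plays the role of the null reference

class LinkedList:
    def __init__(self):
        self.head = None

    def push_front(self, value):
        self.head = Node(value, self.head)   # the new record points at the old head

    def __iter__(self):
        node = self.head
        while node is not None:              # follow references until the null reference
            yield node.value
            node = node.next

lst = LinkedList()
for v in (3, 2, 1):
    lst.push_front(v)
print(list(lst))                             # [1, 2, 3]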

The arithmetic logic unit (ALU) is a digital circuit that performs arithmetic operations (like addition, subtraction, etc.) and logic operations (like an exclusive OR) between two numbers. The ALU is a fundamental building block of the central processing unit of a computer.


Many types of electronic circuits need to perform some type of arithmetic operation, so even the circuit inside a digital watch has a tiny ALU that keeps adding 1 to the current time, keeps checking whether it should beep the timer, and so on. By far the most complex electronic circuits are those built inside the chips of modern microprocessors like the Pentium. Therefore, these processors have inside them a powerful and very complex ALU. In fact, a modern microprocessor (or mainframe) may have multiple cores, each core with multiple execution units, each with multiple ALUs. Many other circuits contain ALUs: GPUs like the ones in NVidia and ATI graphics cards, FPUs like the old 80387 co-processor, and digital signal processors like the ones found in Sound Blaster sound cards, CD players and high-definition TVs. All of these have several powerful and complex ALUs inside.

A typical schematic symbol for an ALU: A and B are operands; R is the output; F is the input from the control unit; D is an output status.

History: Von Neumann's proposal

Mathematician John von Neumann proposed the ALU concept in 1945, when he wrote a report on the foundations for a new computer called the EDVAC (Electronic Discrete Variable Automatic Computer). Later, in 1946, he worked with his colleagues in designing a computer for the Princeton Institute for Advanced Study (IAS). The IAS computer became the prototype for many later computers. In the proposal, von Neumann outlined what he believed would be needed in his machine, including an ALU. Von Neumann stated that an ALU is a necessity for a computer because it is guaranteed that a computer will have to compute basic mathematical operations, including addition,


subtraction, multiplication, and division.[1] He therefore believed it was "reasonable that [the computer] should contain specialized organs for these operations."[2]

Numerical Systems

An ALU must process numbers using the same format as the rest of the digital circuit. For modern processors, that almost always is the two's complement binary number representation. Early computers used a wide variety of number systems, including one's complement, sign-magnitude format, and even true decimal systems, with ten tubes per digit. ALUs for each of these numeric systems had different designs, and that influenced the current preference for two's complement, as this is the representation that makes it easier for the ALUs to calculate additions and subtractions.

Practical overview

A simple 2-bit ALU that does AND, OR, XOR, and addition.

Most of the computer's actions are performed by the ALU. The ALU gets data from processor registers. This data is processed and the results of this operation are stored in ALU output registers. Other mechanisms move data between these registers and memory.[3] A control unit controls the ALU, by setting circuits that tell the ALU what operations to perform.

Simple Operations

Most ALUs can perform the following operations:

· Integer arithmetic operations (addition, subtraction, and sometimes multiplication and division, though this is more expensive)
· Bitwise logic operations (AND, NOT, OR, XOR)
· Bit-shifting operations (shifting or rotating a word by a specified number of bits to the left or right, with or without sign extension). Shifts can be interpreted as multiplications by 2 and divisions by 2. (A toy sketch of these operations follows this list.)
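
A toy Python sketch of these simple operations is shown below; the operation names, the 8-bit word width and the returned flags are assumptions made for illustration, not a description of any real ALU.

def alu(op, a, b, width=8):
    # Operates on unsigned `width`-bit integers; returns (result, carry_flag, zero_flag).
    mask = (1 << width) - 1
    if op == "ADD":
        full = a + b
    elif op == "SUB":
        full = a + ((~b) & mask) + 1         # subtraction via two's complement
    elif op == "AND":
        full = a & b
    elif op == "OR":
        full = a | b
    elif op == "XOR":
        full = a ^ b
    elif op == "SHL":
        full = a << 1                        # shift left: multiply by 2
    elif op == "SHR":
        full = a >> 1                        # shift right: divide by 2
    else:
        raise ValueError(op)
    result = full & mask
    carry = (full >> width) & 1              # the bit that fell off the top
    return result, carry, result == 0

print(alu("ADD", 200, 100))   # (44, 1, False): 300 wraps around in 8 bits, carry set
print(alu("SHL", 3, 0))       # (6, 0, False): shifting left doubles the value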


Complex Operations

An engineer can design an ALU to calculate any operation, however complicated; the problem is that the more complex the operation, the more expensive the ALU is, the more space it uses in the processor, and the more power it dissipates. Therefore, engineers always calculate a compromise, providing for the processor (or other circuit) an ALU powerful enough to make the processor fast, but not so complex as to become prohibitive. Imagine that you need to calculate, say, the square root of a number; the digital engineer will examine the following options to implement this operation:

1. Design a very complex ALU that calculates the square root of any number in a single step. This is called calculation in a single clock.
2. Design a complex ALU that calculates the square root through several steps. This is called iterative calculation, and usually relies on control from a complex control unit with built-in microcode.
3. Design a simple ALU in the processor, and sell a separate specialized and costly processor that the customer can install just beside this one, which implements one of the options above. This is called a co-processor.
4. Emulate the existence of the co-processor: whenever a program attempts to perform the square root calculation, make the processor check whether there is a co-processor present and use it if there is one; if there isn't one, interrupt the processing of the program and invoke the operating system to perform the square root calculation through some software algorithm. This is called software emulation.
5. Tell the programmers that there is no co-processor and there is no emulation, so they will have to write their own algorithms to calculate square roots in software. This is performed by software libraries.

The options above run from the fastest and most expensive to the slowest and least expensive. Therefore, while even the simplest computer can calculate the most complicated formula, the simplest computers will usually take a long time doing so, because several of the steps for calculating the formula will involve options #3, #4 and #5 above. Powerful processors like the Pentium IV and AMD64 implement option #1 above for most of the complex operations and the slower option #2 for the extremely complex operations. That is made possible by the ability to build very complex ALUs in these processors. A sketch in the spirit of option #2 appears below.
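
As a rough illustration of option 2 (iterative calculation), here is a hedged Python sketch of an integer square root computed in several steps by Newton's method; in a real processor the same idea would live in microcode or, for options 4 and 5, in a software library.

def isqrt_iterative(n: int) -> int:
    # Integer square root by repeated refinement (floor of the real square root).
    if n < 0:
        raise ValueError("square root of a negative number")
    if n == 0:
        return 0
    x = n
    while True:
        y = (x + n // x) // 2     # one refinement step
        if y >= x:                # no further improvement: converged
            return x
        x = y

print(isqrt_iterative(144))       # 12
print(isqrt_iterative(150))       # 12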


Inputs and outputs

The inputs to the ALU are the data to be operated on (called operands) and a code from the control unit indicating which operation to perform. Its output is the result of the computation. In many designs the ALU also takes or generates as inputs or outputs a set of condition codes from or to a status register. These codes are used to indicate cases such as carry-in or carry-out, overflow, divide-by-zero, etc.[4]

ALUs vs. FPUs

A floating point unit also performs arithmetic operations between two values, but it does so for numbers in floating point representation, which is much more complicated than the two's complement representation used in a typical ALU. In order to do these calculations, an FPU has several complex circuits built in, including some internal ALUs. Usually engineers call an ALU the circuit that performs arithmetic operations in integer formats (like two's complement and BCD), while the circuits that calculate on more complex formats like floating point, complex numbers, etc. usually receive a more illustrious name.

In computing, input/output, or I/O, is the collection of interfaces that different functional units (sub-systems) of an information processing system use to communicate with each other, or the signals (information) sent through those interfaces. Inputs are the signals received by the unit, and outputs are the signals sent from it. The term can also be used as part of an action; to "do I/O" is to perform an input or output operation.

I/O devices are used by a person (or other system) to communicate with a computer. For instance, keyboards and mice are considered input devices of a computer, and monitors and printers are considered output devices. Typical devices for communication between computers serve as both input and output, such as modems and network cards.

It is important to notice that the designation of a device as either input or output changes when the perspective changes. Mice and keyboards take as input physical movement that the human user outputs and convert it into signals that a computer can understand. The output from these devices is treated as input by the computer. Similarly, printers and monitors take as input signals that a computer outputs. They then convert these signals into representations that human users can see or read. (For a human user, the process of reading or seeing these representations is receiving input.)

In computer architecture, the combination of the CPU and main memory (i.e. memory that the CPU can read and write to directly, with individual instructions) is considered


the heart of a computer, and any movement of information from or to that combination, for example to or from a disk drive, is considered I/O. The CPU and its supporting circuitry provide I/O methods that are used in low-level computer programming in the implementation of device drivers.

Higher-level operating system and programming facilities employ separate, more abstract I/O concepts and primitives. For example, an operating system provides application programs with the concept of files. The C programming language defines functions that allow programs to perform I/O through streams, such as reading data from them and writing data into them. An alternative to special primitive functions is the I/O monad, which permits programs to just describe I/O, with the actions carried out outside the program. This is notable because the I/O functions would introduce side effects to any programming language, but this way purely functional programming remains practical.

The control unit is the part of a CPU or other device that directs its operation. The outputs of the unit control the activity of the rest of the device. A control unit can be thought of as a finite state machine. At one time control units for CPUs were ad-hoc logic, and they were difficult to design. Now they are often implemented as a microprogram that is stored in a control store. Words of the microprogram are selected by a microsequencer, and the bits from those words directly control the different parts of the device, including the registers, arithmetic and logic units, instruction registers, buses, and off-chip input/output. In modern computers, each of these subsystems may have its own subsidiary controller, with the control unit acting as a supervisor. (See also CPU design and computer architecture.)

Types of control units

All types of control units generate electronic control signals that control other parts of a CPU. Control units are usually one of these types:

1. Microcoded control units. In a microcoded control unit, a program reads signals and generates control signals. The program itself is executed by a very simple computer, a relatively simple digital circuit called a microsequencer.
2. Hardware control units. In a hardware control unit, a digital circuit generates the control signals directly.

The system console, root console or simply console is the text entry and display device for system administration messages, particularly those from the BIOS or boot loader, the kernel, the init system and the system logger.


On traditional minicomputers, the console was a serial console, an RS-232 serial link to a terminal such as a DEC VT100. This terminal was usually kept in a secured room since it could be used for certain privileged functions such as halting the system or selecting which media to boot from. Large midrange systems, e.g. those from Sun Microsystems, Hewlett-Packard and IBM, still use serial consoles. In larger installations, the console ports are attached to multiplexers or network-connected multiport serial servers that let an operator connect a terminal to any of the attached servers.

On PCs, the computer's attached keyboard and monitor have the equivalent function. Since the monitor cable carries video signals, it cannot be extended very far. Often, installations with many servers therefore use keyboard/video multiplexers (KVM switches) and possibly video amplifiers to centralize console access. In recent years, KVM/IP devices have become available that allow a remote computer to view the video output and send keyboard input via any TCP/IP network, and therefore the Internet.

Some PC BIOSes, especially in servers, also support serial consoles, giving access to the BIOS through a serial port so that the simpler and cheaper serial console infrastructure can be used. Even where BIOS support is lacking, some operating systems, e.g. FreeBSD and Linux, can be configured for serial console operation either during boot up or after startup.

It is usually possible to log in from the console. Depending on configuration, the operating system may treat a login session from the console as being more trustworthy than a login session from other sources.

Routers and managed switches (as well as other networking and telecoms equipment) may also have console ports; in particular, Cisco Systems routers and switches that use Cisco IOS are normally configured via their console ports.


Knoppix system console showing the boot process

A microprogram implements a CPU instruction set. Just as a single high-level language statement is compiled to a series of machine instructions (load, store, shift, etc.), in a CPU using microcode each machine instruction is in turn implemented by a series of microinstructions, sometimes called a microprogram. Microprograms are often referred to as microcode.

The elements composing a microprogram exist on a lower conceptual level than the more familiar assembler instructions. Each element is differentiated by the "micro" prefix to avoid confusion: microprogram, microcode, microinstruction, microassembler, etc.

Microprograms are carefully designed and optimized for the fastest possible execution, since a slow microprogram would yield a slow machine instruction, which would in turn cause all programs using that instruction to be slow. The microprogrammer must have extensive low-level hardware knowledge of the computer circuitry, as the microcode controls this. The microcode is written by the CPU engineer during the design phase.


On most computers using microcode, the microcode doesn't reside in the main system memory, but exists in a special high-speed memory called the control store. This memory might be read-only memory, or it might be read-write memory, in which case the microcode would be loaded into the control store from some other storage medium as part of the initialization of the CPU. If the microcode is held in read-write memory, it can be altered to correct bugs in the instruction set, or to implement new machine instructions. Microcode can also allow one computer microarchitecture to emulate another, usually more complex, architecture.

Microprograms consist of series of microinstructions. These microinstructions control the CPU at a very fundamental level. For example, a single typical microinstruction might specify the following operations (a small encoding sketch follows this list):

· Connect Register 1 to the "A" side of the ALU
· Connect Register 7 to the "B" side of the ALU
· Set the ALU to perform two's-complement addition
· Set the ALU's carry input to zero
· Store the result value in Register 8
· Update the "condition codes" with the ALU status flags ("Negative", "Zero", "Overflow", and "Carry")
· Microjump to MicroPC nnn for the next microinstruction
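
To suggest how so many control signals might be packed into one wide word, here is a hedged Python sketch; the field names, widths and encodings are invented for this example and do not describe any real machine.

# Illustrative field layout for one horizontal-style microinstruction.
FIELDS = [             # (name, width in bits), most significant field first
    ("src_a",    4),   # register on the "A" side of the ALU
    ("src_b",    4),   # register on the "B" side of the ALU
    ("dest",     4),   # destination register
    ("alu_op",   4),   # operation the ALU performs
    ("carry_in", 1),
    ("jump",     2),   # type of microjump
    ("next_upc", 8),   # address of the next microinstruction
]

def encode(values):
    # Pack named field values into a single wide control word.
    word = 0
    for name, width in FIELDS:
        v = values.get(name, 0)
        assert 0 <= v < (1 << width), f"{name} out of range"
        word = (word << width) | v
    return word

# "R1 + R7 -> R8, carry in 0, then microjump to microaddress 0x2A"
uinstr = encode({"src_a": 1, "src_b": 7, "dest": 8, "alu_op": 0b0001,
                 "carry_in": 0, "jump": 1, "next_upc": 0x2A})
print(f"{uinstr:07x}")    # one 27-bit control word, shown in hexadecimal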

To simultaneously control all of these features, the microinstruction is often very wide, for example, 56 bits or more.

The reason for microprogramming

Microcode was originally developed as a simpler method of developing the control logic for a computer. Initially, CPU instruction sets were "hard-wired": each machine instruction (add, shift, move) was implemented directly with circuitry. This provided fast performance, but as instruction sets grew more complex, hard-wired instruction sets became more difficult to design and debug. Microcode alleviated that problem by allowing CPU design engineers to write a microprogram to implement a machine instruction rather than design circuitry for it. Even late in the design process, microcode could easily be changed, whereas hard-wired instructions could not. This greatly facilitated CPU design and led to more complex instruction sets.

Another advantage of microcode was the implementation of more complex machine instructions. In the 1960s through the late 1970s, much programming was done in


assembly language, a symbolic equivalent of machine instructions. The more abstract and higher-level the machine instruction, the greater the programmer productivity. The ultimate extension of this was the "directly executable high-level language" design, in which each statement of a high-level language such as PL/I would be entirely and directly executed by microcode, without compilation. The IBM Future Systems project and the Data General Fountainhead Processor were examples of this.

Microprogramming also helped alleviate the memory bandwidth problem. During the 1970s, CPU speeds grew more quickly than memory speeds. Numerous acceleration techniques such as memory block transfer, memory pre-fetch and multi-level caches helped reduce this. However, high-level machine instructions (made possible by microcode) helped further. Fewer, more complex machine instructions require less memory bandwidth. For example, complete operations on character strings could be done as a single machine instruction, thus avoiding multiple instruction fetches. Architectures using this approach included the IBM System/360 and the Digital Equipment Corporation VAX, the instruction sets of which were implemented by complex microprograms. The approach of using increasingly complex microcode-implemented instruction sets was later called CISC.

Other benefits

A processor's microprograms operate on a more primitive, totally different and much more hardware-oriented architecture than the assembly instructions visible to normal programmers. In coordination with the hardware, the microcode implements the programmer-visible architecture. The underlying hardware need not have a fixed relationship to the visible architecture. This makes it possible to implement a given instruction set architecture on a wide variety of underlying hardware micro-architectures.

Doing so is important if binary program compatibility is a priority; that way, previously existing programs can run on totally new hardware without requiring revision and recompilation. However, there may be a performance penalty for this approach. The tradeoffs between application backward compatibility and CPU performance are hotly debated by CPU design engineers.

The IBM System/360 has a 32-bit architecture with 16 general-purpose registers, but most of the System/360 implementations actually used hardware implementing a much simpler underlying microarchitecture. In this way, microprogramming enabled IBM to design many System/360 models with substantially different hardware, spanning a wide range of cost and performance, while making them all architecturally compatible. This dramatically reduced the amount of unique system software that had to be written for each model.


A similar approach was used by Digital Equipment Corporation in their VAX family of computers. Initially, a 32-bit TTL processor in conjunction with supporting microcode implemented the programmer-visible architecture. Later VAX versions used different microarchitectures, yet the programmer-visible architecture didn't change.

Microprogramming also reduced the cost of field changes to correct defects (bugs) in the processor; a bug could often be fixed by replacing a portion of the microprogram rather than by changes being made to hardware logic and wiring.

History

In 1947, the design of the MIT Whirlwind introduced the concept of a control store as a way to simplify computer design and move beyond ad hoc methods. The control store was a two-dimensional lattice: one dimension accepted "control time pulses" from the CPU's internal clock, and the other connected to control signals on gates and other circuits. A "pulse distributor" would take the pulses generated by the CPU clock and break them up into eight separate time pulses, each of which would activate a different row of the lattice. When a row was activated, it would activate the control signals connected to it.

Described another way, the signals transmitted by the control store are played much like a player piano roll. That is, they are controlled by a sequence of very wide words constructed of bits, and they are "played" sequentially. In a control store, however, the "song" is short and repeated continuously.

In 1951, Maurice Wilkes enhanced this concept by adding conditional execution, a concept akin to a conditional in computer software. His initial implementation consisted of a pair of matrices: the first one generated signals in the manner of the Whirlwind control store, while the second matrix selected which row of signals (the microprogram instruction word, as it were) to invoke on the next cycle. Conditionals were implemented by providing a way that a single line in the control store could choose from alternatives in the second matrix. This made the control signals conditional on the detected internal signal. Wilkes coined the term microprogramming to describe this feature and distinguish it from a simple control store.

Examples of microprogrammed systems

· The Burroughs B1700 included bit-addressable main memory and used microprogramming to allow support for different programming languages.
· The Digital Equipment Corporation PDP-11 processors, with the exception of the PDP-11/20, were microprogrammed (Siewiorek, Bell, and Newell 1982).


· Most models of the IBM System/360 series were microprogrammed:
  · The Model 25 was unique among System/360 models in using the top 16k bytes of core storage to hold the control storage for the microprogram. The 2025 used a 16-bit microarchitecture with seven control words (or microinstructions).
  · The Model 30, the slowest model in the line, used an 8-bit microarchitecture with only a few hardware registers; everything that the programmer saw was emulated by the microprogram.
  · The Model 40 used 56-bit control words. The 2040 box implements both the System/360 main processor and the multiplex channel (the I/O processor).
  · The Model 50 had two internal data paths which operated in parallel: a 32-bit data path used for arithmetic operations, and an 8-bit data path used in some logical operations. The control store used 90-bit microinstructions.
  · The Model 85 had separate instruction fetch (I-unit) and execution (E-unit) units to provide high performance. The I-unit is hardware controlled; the E-unit is microprogrammed with 108-bit control words.

Implementation

Each microinstruction in a microprogram provides the bits which control the functional elements that internally comprise a CPU. The advantage over a hard-wired CPU is that internal CPU control becomes a specialized form of a computer program. Microcode thus transforms a complex electronic design challenge (the control of a CPU) into a less-complex programming challenge. To take advantage of this, computers were divided into several parts:

A microsequencer picks the next word of the control store. A sequencer is mostly a counter, but usually also has some way to jump to a different part of the control store depending on some data, usually data from the instruction register and always some part of the control store. The simplest sequencer is just a register loaded from a few bits of the control store.

A register set is a fast memory containing the data of the central processing unit. It may include the program counter, stack pointer, and other numbers that are not easily accessible to the application programmer. Often the register set is triple-ported; that is, two registers can be read and a third written at the same time.

An arithmetic and logic unit performs calculations, usually addition, logical negation, a right shift, and logical AND. It often performs other functions as well.


There may also be a memory address register and a memory data register, used to access the main computer storage. Together, these elements form an "execution unit." Most modern CPUs have several execution units. Even simple computers usually have one unit to read and write memory, and another to execute user code. These elements could often be bought together as a single chip. This chip came in a fixed width which would form a 'slice' through the execution unit. These were known as 'bit slice' chips. The AMD Am2900 is the best known example of a bit slice processor. The parts of the execution units, and the execution units themselves, are interconnected by a bundle of wires called a bus.

Programmers develop microprograms. The basic tools are software: a microassembler allows a programmer to define the table of bits symbolically, and a simulator program executes the bits in the same way as the electronics (hopefully), allowing much more freedom to debug the microprogram. After the microprogram is finalized, and extensively tested, it is sometimes used as the input to a computer program that constructs logic to produce the same data. This program is similar to those used to optimize a programmable logic array. No known computer program can produce optimal logic, but even pretty good logic can vastly reduce the number of transistors from the number required for a ROM control store. This reduces the cost and power used by a CPU.

Microcode can be characterized as horizontal or vertical. This refers primarily to whether each microinstruction directly controls CPU elements (horizontal microcode), or requires subsequent decoding by combinational logic before doing so (vertical microcode). Consequently, each horizontal microinstruction is wider (contains more bits) and occupies more storage space than a vertical microinstruction.

Horizontal microcode

A typical horizontal microprogram control word has a field, a range of bits, to control each piece of electronics in the CPU. For example, one simple arrangement might be:

| register source A | register source B | destination register | arithmetic and logic unit operation | type of jump | jump address |

For this type of micro machine to implement a jump instruction with the address following the jump op-code, the micro assembly would look something like:

# Any line starting with a number-sign is a comment
# This is just a label, the ordinary way assemblers symbolically represent a
# memory address.
Instruction JUMP:


# To prepare for the next instruction, the instruction-decode microcode has already
# moved the program counter to the memory address register. This instruction fetches
# the target address of the jump instruction from the memory word following the
# jump opcode, by copying from the memory data register to the memory address register.
# This gives the memory system two clock ticks to fetch the next
# instruction to the memory data register for use by the instruction decode.
# The sequencer instruction "next" means just add 1 to the control word address.
MDR, NONE, MAR, COPY, NEXT, NONE
# This places the address of the next instruction into the PC.
# This gives the memory system a clock tick to finish the fetch started on the
# previous microinstruction.
# The sequencer instruction is to jump to the start of the instruction decode.
MAR, 1, PC, ADD, JMP, Instruction Decode
# The instruction decode is not shown, because it's usually a mess, very particular
# to the exact processor being emulated. Even this example is simplified.
# Many CPUs have several ways to calculate the address, rather than just fetching
# it from the word following the op-code. Therefore, rather than just one
# jump instruction, those CPUs have a family of related jump instructions.

Horizontal microcode is microcode that sets all the bits of the CPU's controls on each tick of the clock that drives the sequencer. Note how many of the bits in horizontal microcode contain fields that do nothing.

Vertical microcode

In vertical microcode, each microinstruction is encoded -- that is, the bit fields may pass through intermediate combinatory logic which in turn generates the actual control signals for internal CPU elements (ALU, registers, etc.). By contrast, with horizontal microcode the bit fields themselves directly produce the control signals. Consequently, vertical microcode requires smaller instruction lengths and less storage, but requires more time to decode, resulting in a slower CPU clock.

Some vertical microcode is just the assembly language of a simple conventional computer that is emulating a more complex computer. This technique was popular in the time of the PDP-8. Another form of vertical microcode has two fields:

| field select | field value |


The "field select" selects which part of the CPU will be controlled by this word of the control store. The "field value" actually controls that part of the CPU. With this type of microcode, a designer explicitly chooses to make a slower CPU to save money by reducing the unused bits in the control store; however, the reduced complexity may increase the CPU's clock frequency, which lessens the effect of an increased number of cycles per instruction. As transistors became cheaper, horizontal microcode came to dominate the design of CPUs using microcode, with vertical microcode no longer being used. Writable control stores A few computers were built using "writable microcode" -- rather than storing the microcode in ROM or hard-wired logic, the microcode was stored in a RAM called a Writable Control Store or WCS. Many of these machines were experimental laboratory prototypes, but there were also commercial machines that used writable microcode, such as early Xerox workstations, the DEC VAX 8800 ("Nautilus") family, the Symbolic L- and G-machines, and a number of IBM System/370 implementations. Many more machines offered user-programmable writeable control stores as an option (including the HP 2100 and DEC PDP-11/60 minicomputers). WCS offered several advantages including the ease of patching the microprogram and, for certain hardware generations, faster access than ROMs could provide. User-programmable WCS allowed the user to optimize the machine for specific purposes. A CPU that uses microcode generally takes several clock cycles to execute a single instruction, one clock cycle for each step in the microprogram for that instruction. Some CISC processors include instructions that can take a very long time to execute. Such variations in instruction length interfere with pipelining and interrupt latency. Microcode versus VLIW and RISC The design trend toward heavily microcoded processors with complex instructions began in the early 1960s and continued until roughly the mid-1980s. At that point the RISC design philosophy started becoming more prominent. This included the points: ·

· Analysis shows complex instructions are rarely used, hence the machine resources devoted to them are largely wasted.
· Programming has largely moved away from the assembly level, so it's no longer worthwhile to provide complex instructions for productivity reasons.
· The machine resources devoted to rarely-used complex instructions are better used for expediting the performance of simpler, commonly-used instructions.
· Complex microcoded instructions requiring many, varying clock cycles are difficult to pipeline for increased performance.
· Simpler instruction sets allow direct execution by hardware, avoiding the performance penalty of microcoded execution.

Many RISC and VLIW processors are designed to execute every instruction (as long as it is in the cache) in a single cycle. This is very similar to the way CPUs with microcode execute one microinstruction per cycle. VLIW processors have instructions that behave like very wide horizontal microcode, although typically VLIW instructions do not have as fine-grained control over hardware as microcode. RISC processors can have instructions that look like narrow vertical microcode. Modern implementations of CISC instruction sets such as the x86 instruction set implement the simpler instructions in hardware rather than microcode, using microcode only to implement the more complex instructions.

LESSON VI: Input/Output

The input-output model of economics uses a matrix representation of a nation's (or a region's) economy to predict the effect of changes in one industry on others, and by consumers, government, and foreign suppliers on the economy. Wassily Leontief (1906-1999) is credited with the development of this analysis. Francois Quesnay developed a cruder version of this technique called the Tableau économique. Leontief won the Bank of Sweden Prize in Economic Sciences in Memory of Alfred Nobel for his development of this model. The analytical apparatus is strictly empiricist, which reduces bias in the analysis. For this reason, Leontief seems to have been just about the only economist who was equally honored by communist and capitalist economists.

Input-output analysis considers inter-industry relations in an economy, depicting how the output of one industry goes to another industry where it serves as an input, and thereby makes one industry dependent on another both as customer of output and as supplier of inputs. An input-output model is a specific formulation of input-output analysis.

Each column of the input-output matrix reports the monetary value of an industry's inputs and each row represents the value of an industry's outputs. Suppose there are three industries. Column 1 reports the value of inputs to Industry 1 from Industries 1, 2, and 3; Columns 2 and 3 do the same for those industries. Row 1 reports the value of outputs from Industry 1 to Industries 1, 2, and 3; Rows 2 and 3 do the same for the other industries. While the input-output matrix reports only the intermediate goods and services that are exchanged among industries, row vectors on the bottom record the disposition of


finished goods and services to consumers, government, and foreign buyers. Similarly, column vectors on the right record non-industrial inputs like labor and purchases from foreign suppliers.

In addition to studying the structure of national economies, input-output economics has been used to study regional economies within a nation, and as a tool for national economic planning. The mathematics of input-output economics is straightforward, but the data requirements are enormous because the expenditures and revenues of each branch of economic activity have to be represented. The tool has languished because not all countries collect the required data, data quality varies, and the data collection and preparation process has lags that make timely analysis difficult. Typically, input-output tables are compiled retrospectively as a "snapshot" cross-section of the economy, once every few years.

Usefulness

An input-output model is widely used in economic forecasting to predict flows between sectors. It is also used in local urban economics. Irving Hock at the Chicago Area Transportation Study did detailed forecasting by industry sectors using input-output techniques. At the time, Hock's work was quite an undertaking; the only other work that had been done at the urban level was for Stockholm, and it was not widely known. Input-output was one of the few techniques developed at the CATS not adopted in later studies. Later studies used economic base analysis techniques.

Input-output Analysis versus Consistency Analysis

Despite the clear ability of the input-output model to depict and analyze the dependence of one industry or sector on another, Leontief and others never managed to introduce the full spectrum of dependency relations in a market economy. In 2003, Mohammad Gani, a pupil of Leontief, introduced consistency analysis in his book 'Foundations of Economic Science', which formally looks exactly like the input-output table but explores the dependency relations in terms of payments and intermediation relations. Consistency analysis explores the consistency of plans of buyers and sellers by decomposing the input-output table into four separate matrices, each for a different kind of means of payment. It integrates micro and macroeconomics in one model and deals with money in a fully ideology-free manner. It deals with the circulation of money vis-à-vis the movement of goods. In a technical sense, input-output analysis can be seen as a special case of consistency analysis without money and without entrepreneurship and transaction cost.


Key Ideas

The inimitable book by Leontief himself remains the best exposition of input-output analysis; see the bibliography. Input-output concepts are simple. Consider the production of the ith sector. We may isolate (1) the quantity of that production that goes to final demand, c_i, (2) the total output, x_i, and (3) the flows x_ij from that industry to other industries. We may write a transactions tableau.

Table: Transactions in a Three Sector Economy

Economic Activities    Inputs to Agriculture    Inputs to Manufacturing    Inputs to Transport    Final Demand    Total Output
Agriculture                      5                        15                        2                  68              90
Manufacturing                   10                        20                       10                  40              80
Transportation                  10                        15                        5                   0              30
Labor                           25                        30                        5                   0              60


Note that in the example given we have no input flows from the industries to 'Labor’. We know very little about production functions because all we have are numbers representing transactions in a particular instance (single points on the production functions):

The neoclassical production function is an explicit function

Q = f(K, L),

where Q = quantity, K = capital, L = labor, and the partial derivatives ∂Q/∂K and ∂Q/∂L are the demand schedules for the input factors. Leontief, the innovator of input-output analysis, uses a special production function which depends linearly on the total output variables x_j. Using technical coefficients a_ij, we may manipulate our transactions information into what is known as an input-output table:

x_ij = a_ij x_j.

Now

x_i = x_i1 + x_i2 + ... + x_in + c_i

gives

x_i = a_i1 x_1 + a_i2 x_2 + ... + a_in x_n + c_i.

Rewriting finally yields

x_i - a_i1 x_1 - a_i2 x_2 - ... - a_in x_n = c_i.

Introducing matrix notation, we can see how a solution may be obtained. Let

x = (x_1, ..., x_n)',   c = (c_1, ..., c_n)',   I (the unit matrix),   A = [a_ij]

denote the total output vector, the final demand vector, the unit matrix and the input-output matrix, respectively. Then:

(I − A) x = c,   so that   x = (I − A)^(-1) c,

provided (I − A) is a regular matrix which can thus be inverted.

There are many interesting aspects of the Leontief system, and there is an extensive literature. There is the Hawkins-Simon condition on producibility. There has been interest in disaggregation to clustered inter-industry flows, and in the study of constellations of industries. A great deal of empirical work has been done to identify coefficients, and data have been published for the national economy as well as for regions. This has been a healthy, exciting area for work by economists because the Leontief system can be extended to a model of general equilibrium; it offers a method of decomposing work done at a macro level.

Transportation is implicit in the notion of inter-industry flows. It is explicitly recognized when transportation is identified as an industry – how much is purchased from transportation in order to produce. But this is not very satisfactory because transportation requirements differ depending on industry locations and capacity constraints on regional production. Also, the receiver of goods generally pays freight cost, and often transportation data are lost because transportation costs are treated as part of the cost of the goods.

Walter Isard and his student, Leon Moses, were quick to see the spatial economy and transportation implications of input-output, and began work in this area in the 1950s, developing a concept of interregional input-output. Take a one-region-versus-the-world case. We wish to know something about interregional commodity flows, so we introduce a column into the table headed "Exports" and a row labeled "Imports".

Table: Adding Export and Import Transactions

Economic Activities    1    2    …    …    Z    Exports    Final Demand    Total Outputs
1
2
…
…
Z
Imports

A more satisfactory way to proceed would be to tie regions together at the industry level. That is, we identify both within-region inter-industry transactions and among-region inter-industry transactions. A not-so-small problem here is that the table gets very large very quickly.

Input-output, as we have discussed it, is conceptually very simple. Its extension to an overall model of equilibrium in the national economy is also relatively simple and attractive. But there is a downside. One who wishes to do work with input-output systems must deal skillfully with industry classification, data estimation, and inverting very large, ill-conditioned matrices. Two additional difficulties are of interest in transportation work. There is the question of substituting one input for another, and there is the question of the stability of coefficients as production increases or decreases. These are intertwined questions. They have to do with the nature of regional production functions.

Forecasting and/or Analysis Using Input-Output

This discussion focuses on the use of input-output techniques in transportation; there is a vast literature on the technique as such.

Table: Interregional Transactions

                            North               East                West
Economic Activities         Ag  Mfg  ...  ...   Ag  Mfg  ...  ...   Ag  Mfg  ...  ...   Exports   Total Outputs
North:  Ag, Mfg, ..., ...
East:   Ag, Mfg, ..., ...
West:   Ag, Mfg, ..., ...

As we saw with the economic base study, urban transportation planning studies are demand-driven. The question we want to answer is, "What transportation need results from some economic development: what is the feedback from development to transportation?" For that question, input-output is helpful. That is the question Hock posed. There is an increase in the final demand vector, changed inter-industry relations result, and there is an impact on transportation requirements.

Rappoport et al. (1979) started with consumption projections. These drove solutions of a national I-O model for projections of GNP and transportation requirements as per the transportation vector in the I-O matrix. Submodels were then used to investigate modal split and energy consumption in the transportation sector.

Another question asked is: what is the impact of transportation construction activity on an area? One of the first studies made of the impact of the interstate highway system used the national I/O model to forecast impacts measured in increased steel production, cement, employment, etc.

Table: Input-Output Model for Hypothetical Economy
Total requirements from regional industries per dollar of output delivered to final demand

                                  Purchasing Industry
Selling Industry    Agriculture   Transport   Manufacturer   Services
Agriculture             1.14         0.22         0.13          0.12
Transportation          0.19         1.10         0.16          0.07
Manufacturing           0.16         0.16         1.16          0.06
Services                0.08         0.05         0.08          1.09
Total                   1.57         1.53         1.53          1.34

The Maritime Administration (MARAD) has produced the Port Impact Kit for a number of years. This software illustrates the use of I/O models. Simply written, it makes the technique widely available. It shows how to calculate direct effects from the initial round of spending that is worked out from the vessel/cargo combinations. The direct expenditures are entered into the I/O table, and indirect effects are calculated. These are the activities derived from inter-industry relations – purchases of supplies, labor, and so on. An I/O table is supplied to aid that calculation. Then, using the


I/O table, induced effects are calculated. These are effects from household purchases of goods and services made possible by the wages generated from the direct and indirect effects. The Corps of Engineers has a similar capability that has been used to examine the impacts of construction or base closings. The US Department of Commerce Bureau of Economic Analysis (BEA) (1997) discusses how to use its state-level I/O models (RIMS II). The ready availability of BEA and MARAD-like tables and calculation tools suggests that we will see more and more feedback impact analysis. The information is meaningful for many purposes.

Feed-forward calculations seem to be much more interesting for planning. The question is, "If an investment is made in transportation, what will be its development effects?" An investment in transportation might lower transport costs, increase quality of service, or a mixture of these. What would be the effect on trade flows, output, earnings, and so on? The first problem we know of worked on from this point of view was in Japan in the 1950s. The situation was the building of a bridge to connect two islands, and the core question was the mixing of the two island economies. A first consideration is the impact of changed transportation attributes, say, lower cost, on industry location, and/or on agricultural or other resource-based extractive activity, and/or on markets. A spatial price equilibrium model (linear programming) is the tool of choice for that. Input-output then permits tracing changed inter-industry relations, impacts on wages, and the like. Britton Harris (1974) uses that analysis strategy. He begins with industry location forecasting equations; treats equilibrium of locations, markets, and prices; and pays much attention to transport costs. An interesting thing about this and other models is that input-output considerations are no more than an accounting add-on; they hardly enter Harris' study. The interesting problems are the location and flow problems.

I/O devices

This topic discusses the different types of I/O devices used on your managed system, and how the I/O devices are added to logical partitions. I/O devices allow your managed system to gather, store, and transmit data. I/O devices are found in the server unit itself and in expansion units and towers that are attached to the server. I/O devices can be embedded into the unit, or they can be installed into physical slots. Not all types of I/O devices are supported for all operating systems or on all server models. For example, I/O processors (IOPs) are supported only on i5/OS® logical partitions. Also, Switch Network Interface (SNI) adapters are supported only on certain server models, and are not supported for i5/OS logical partitions.

I/O pools for i5/OS logical partitions


This information discusses how I/O pools must be used to switch I/O adapters (IOAs) between i5/OS® logical partitions that support switchable independent auxiliary storage pools (IASPs). An I/O pool is a group of I/O adapters that form an IASP. Other names for IASPs include I/O failover pool and switchable independent disk pool. The IASP can be switched from a failed server to a backup server within the same cluster without the active intervention of the HMC. The I/O adapters within the IASP can be used by only one logical partition at a time, but any of the other logical partitions in the group can take over and use the I/O adapters within the IASP. The current owning partition must power off the adapters before another partition can take ownership. IASPs are not suitable for sharing I/O devices between different logical partitions. If you want to share an I/O device between different logical partitions, use the HMC to move the I/O device dynamically between the logical partitions.

IOPs for i5/OS logical partitions

This information discusses the purpose of IOPs and how you can switch IOPs and IOAs dynamically between i5/OS® logical partitions. i5/OS logical partitions require an I/O processor (IOP) attached to the system I/O bus, together with one or more I/O adapters (IOAs). The IOP processes instructions from the server and works with the IOAs to control the I/O devices. The combined-function IOP (CFIOP) can connect to a variety of different IOAs. For instance, a CFIOP could support disk units, a console, and communications hardware.

Note: A server with i5/OS logical partitions must have the correct IOP feature codes for the load source disk unit and alternate restart devices. Without the correct hardware, the logical partitions will not function correctly.

A logical partition controls all devices connected to an IOP. You cannot switch one I/O device to another logical partition without moving the ownership of the IOP. Any resources (IOAs and devices) that are attached to the IOP cannot be in use when you move an IOP from one logical partition to another.

IOAs for i5/OS logical partitions

This information discusses some of the types of IOAs that are used to control devices in i5/OS® logical partitions and the placement rules that you must follow when installing these devices in your servers and expansion units.

Load source for i5/OS logical partitions

This topic discusses the purpose of a load source for i5/OS® logical partitions and the placement rules that you must follow when installing the load source.


Each i5/OS logical partition must have one disk unit designated as the load source. The server uses the load source to start the logical partition. The server always identifies this disk unit as unit number 1. You must follow placement rules when placing a load source disk unit in your managed system. Before adding a load source to your managed system or moving a load source within your managed system, validate the revised system hardware configuration with the System Planning Tool (SPT), back up the data on the disks attached to the IOA, and move the hardware according to the SPT output.

Alternate restart device and removable media devices for i5/OS logical partitions

This topic discusses the purpose of tape and optical devices in i5/OS® logical partitions and the placement rules that you must follow when installing these devices. A removable media device reads and writes to media (tape, CD-ROM, or DVD). Every i5/OS logical partition must have either a tape or an optical device (CD-ROM or DVD) available to use. The server uses the tape or optical devices as the alternate restart device and alternate installation device. The media in the device is what the system uses to start from when you perform a D-mode initial program load (IPL). The alternate restart device loads the Licensed Internal Code contained on the removable media instead of the code on the load source disk unit. It can also be used to install the system. Depending on your hardware setup, you might decide that your logical partitions will share these devices. If you decide to share these devices, remember that only one logical partition can use the device at any time. To switch devices between logical partitions, you must move the IOP controlling the shared device to the desired logical partition.

Disk units for i5/OS logical partitions

Disk units store data for i5/OS™ logical partitions. You can configure disk units into auxiliary storage pools (ASPs). The server can use and reuse this data at any time. This method of storing data is more permanent than memory (RAM); however, you can still erase any data on a disk unit. Disk units can be configured into auxiliary storage pools (ASPs) on any logical partition. All of the disk units you assign to an ASP must be from the same logical partition. You cannot create a cross-partition ASP.

In computing, an interrupt is an asynchronous signal from hardware indicating the need for attention or a synchronous event in software indicating the need for a change in execution. A hardware interrupt causes the processor to save its state of execution via a context switch, and begin execution of an interrupt handler. Software interrupts


are usually implemented as instructions in the instruction set, which cause a context switch to an interrupt handler similarly to a hardware interrupt. Interrupts are a commonly used technique for computer multitasking, especially in real-time computing. Such a system is said to be interrupt-driven. An act of interrupting is referred to as an interrupt request ("IRQ").

Overview

Hardware interrupts were introduced as a way to avoid wasting the processor's valuable time in polling loops, waiting for external events. Interrupts may be implemented in hardware as a distinct system with control lines, or they may be integrated into the memory subsystem. If implemented in hardware, a Programmable Interrupt Controller (PIC) or Advanced Programmable Interrupt Controller (APIC) is connected to both the interrupting device and to the processor's interrupt pin. If implemented as part of the memory controller, interrupts are mapped into the system's memory address space.

Interrupts can be categorized into the following types: software interrupt, maskable interrupt, non-maskable interrupt (NMI), interprocessor interrupt (IPI), and spurious interrupt.

· A software interrupt is an interrupt generated within a processor by executing an instruction. Examples of software interrupts are system calls.

An interrupt that leaves the machine in a well-defined state is called a precise interrupt. Such an interrupt has four properties:
- The PC (program counter) is saved in a known place.
- All instructions before the one pointed to by the PC have fully executed.
- No instruction beyond the one pointed to by the PC has been executed (or, if such instructions have been started, any changes they make to registers or memory must be undone before the interrupt happens).
- The execution state of the instruction pointed to by the PC is known.
An interrupt that does not meet these requirements is called an imprecise interrupt.

· A maskable interrupt is essentially a hardware interrupt which may be ignored by setting a bit in an interrupt mask register's (IMR) bit-mask.

· Likewise, a non-maskable interrupt is a hardware interrupt which typically does not have a bit-mask associated with it, so it cannot be ignored.

· An interprocessor interrupt is a special type of interrupt which is generated by one processor to interrupt another processor in a multiprocessor system.

· A spurious interrupt is a hardware interrupt which is generated by system errors, such as electrical noise on one of the PIC's interrupt lines.
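To make the polling-versus-interrupt contrast above concrete, here is a minimal, hypothetical embedded-style sketch in C. The register names, the uart_rx_isr hook, and the enter_low_power_wait() call are illustrative assumptions, not part of any particular platform's API:

    #include <stdint.h>

    /* Stand-ins for memory-mapped device registers; real addresses are platform-specific. */
    #define UART_STATUS (*(volatile uint8_t *)0x4000u)
    #define UART_DATA   (*(volatile uint8_t *)0x4004u)
    #define RX_READY    0x01u

    extern void handle_byte(uint8_t b);
    extern void enter_low_power_wait(void);

    /* Polling: the CPU spins, burning cycles until the device has data. */
    void read_byte_by_polling(void)
    {
        while ((UART_STATUS & RX_READY) == 0) {
            /* busy-wait: no useful work gets done here */
        }
        handle_byte(UART_DATA);
    }

    /* Interrupt-driven: the device raises an IRQ when data arrives, so the
     * handler runs only then; the main loop is free to sleep or do other work. */
    volatile int byte_ready = 0;
    volatile uint8_t latest_byte;

    void uart_rx_isr(void)            /* installed as the UART's interrupt handler */
    {
        latest_byte = UART_DATA;      /* read the data register for the new byte */
        byte_ready = 1;
    }

    void main_loop(void)
    {
        for (;;) {
            if (byte_ready) {
                byte_ready = 0;
                handle_byte(latest_byte);
            } else {
                enter_low_power_wait();   /* e.g., a wait-for-interrupt instruction */
            }
        }
    }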

Processors typically have an internal interrupt mask which allows software to ignore all external hardware interrupts while it is set. This mask may offer faster access than accessing an IMR in a PIC, or disabling interrupts in the device itself. In some cases, such as the x86 architecture, disabling and enabling interrupts on the processor itself acts as a memory barrier, in which case it may actually be slower. The phenomenon where the overall system performance is severely hindered by excessive amounts of processing time spent handling interrupts is called an interrupt storm or livelock.

Level-triggered

A level-triggered interrupt is a class of interrupts where the presence of an unserviced interrupt is indicated by a high level (1), or low level (0), of the interrupt request line. A device wishing to signal an interrupt drives the line to its active level, and then holds it at that level until serviced. It ceases asserting the line when the CPU commands it to or otherwise handles the condition that caused it to signal the interrupt. Typically, the processor samples the interrupt input at predefined times during each bus cycle, such as state T2 for the Z80 microprocessor. If the interrupt is not active when the processor samples it, the CPU does not see it. One possible use for this type of interrupt is to minimize spurious signals from a noisy interrupt line: a spurious pulse will often be so short that it is not noticed.

Multiple devices may share a level-triggered interrupt line if they are designed to. The interrupt line must have a pull-down or pull-up resistor so that when not actively driven it settles to its inactive state. Devices actively assert the line to indicate an outstanding interrupt, but let the line float (do not actively drive it) when not signaling an interrupt. The line is then in its asserted state when any (one or more than one) of the sharing devices is signaling an outstanding interrupt.

This class of interrupts is favored by some because of a convenient behavior when the line is shared. Upon detecting assertion of the interrupt line, the CPU must search through the devices sharing it until one requiring service is detected. After servicing this one, the CPU may recheck the interrupt line status to determine whether any other devices also need service. If the line is now deasserted, the CPU avoids the need to check all the remaining devices on the line. Where some devices interrupt much more than others, or where some devices are particularly expensive to check for interrupt status, a careful ordering of device checks brings some efficiency gain.
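The "search through the devices sharing the line" behaviour can be sketched as a dispatch loop in C. Everything named here (the handler table, irq_line_asserted(), and the per-device pending()/service() hooks) is hypothetical scaffolding for illustration rather than a real driver API:

    #include <stdbool.h>
    #include <stddef.h>

    #define MAX_SHARED 8

    struct shared_irq_device {
        bool (*pending)(void *ctx);   /* asks the device: did you raise this interrupt? */
        void (*service)(void *ctx);   /* handle and acknowledge the device's request */
        void *ctx;
    };

    /* Devices registered on one shared, level-triggered line; cheapest or most
     * frequent interrupters first, per the ordering hint in the text. */
    static struct shared_irq_device devices[MAX_SHARED];
    static size_t device_count;

    extern bool irq_line_asserted(void);  /* reads the (hypothetical) line status */

    void shared_level_irq_handler(void)
    {
        /* While any device still holds the line at its active level, keep scanning.
         * Note: if a device the CPU cannot service holds the line, this loop never
         * exits - exactly the sharing hazard described in the next paragraph. */
        while (irq_line_asserted()) {
            for (size_t i = 0; i < device_count; i++) {
                if (devices[i].pending(devices[i].ctx)) {
                    devices[i].service(devices[i].ctx);
                    /* If the line has already deasserted, skip the remaining checks. */
                    if (!irq_line_asserted())
                        return;
                }
            }
        }
    }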


There are also serious problems with sharing level-triggered interrupts. As long as any device on the line has an outstanding request for service the line remains asserted, so it is not possible to detect a change in the status of any other device. Deferring servicing a low-priority device is not an option, because this would prevent detection of service requests from higher-priority devices. If there is a device on the line that the CPU does not know how to service, then any interrupt from that device permanently blocks all interrupts from the other devices. The original PCI standard mandated shareable level-triggered interrupts. The rationale for this was the efficiency gain discussed above. (Newer versions of PCI allow, and PCI Express requires, the use of message-signaled interrupts.)

Edge-triggered

An edge-triggered interrupt is a class of interrupts that are signaled by a level transition on the interrupt line, either a falling edge (1 to 0) or (usually) a rising edge (0 to 1). A device wishing to signal an interrupt drives a pulse onto the line and then returns the line to its quiescent state. If the pulse is too short to detect by polled I/O then special hardware may be required to detect the edge.

Multiple devices may share an edge-triggered interrupt line if they are designed to. The interrupt line must have a pull-down or pull-up resistor so that when not actively driven it settles to one particular state. Devices signal an interrupt by briefly driving the line to its non-default state, and let the line float (do not actively drive it) when not signaling an interrupt. The line then carries all the pulses generated by all the devices. However, interrupt pulses from different devices may merge if they occur close in time. To avoid losing interrupts the CPU must trigger on the trailing edge of the pulse (e.g., the rising edge if the line is pulled up and driven low). After detecting an interrupt the CPU must check all the devices for service requirements.

Edge-triggered interrupts do not suffer the problems that level-triggered interrupts have with sharing. Service of a low-priority device can be postponed arbitrarily, and interrupts will continue to be received from the high-priority devices that are being serviced. If there is a device that the CPU does not know how to service, it may cause a spurious interrupt, or even periodic spurious interrupts, but it does not interfere with the interrupt signaling of the other devices. The elderly ISA bus uses edge-triggered interrupts, but does not mandate that devices be able to share them. The parallel port also uses edge-triggered interrupts. Many older devices assume that they have exclusive use of their interrupt line, making it electrically unsafe to share them. However, ISA motherboards include pull-up resistors on the IRQ lines, so well-behaved devices share ISA interrupts just fine.

Hybrid


Some systems use a hybrid of level-triggered and edge-triggered signaling. The hardware not only looks for an edge, but it also verifies that the interrupt signal stays active for a certain period of time. A common hybrid interrupt is the NMI (non-maskable interrupt) input. Because NMIs generally signal major, or even catastrophic, system events, a good implementation of this signal tries to ensure that the interrupt is valid by verifying that it remains active for a period of time. This two-step approach helps to eliminate false interrupts from affecting the system.

Message-signalled

A message-signalled interrupt does not use a physical interrupt line. Instead, a device signals its request for service by sending a short message over some communications medium, typically a computer bus. The message might be of a type reserved for interrupts, or it might be of some pre-existing type such as a memory write. Message-signalled interrupts behave very much like edge-triggered interrupts, in that the interrupt is a momentary signal rather than a continuous condition. Interrupt-handling software treats the two in much the same manner. Typically, multiple pending message-signalled interrupts with the same message (the same virtual interrupt line) are allowed to merge, just as closely spaced edge-triggered interrupts can merge.

Message-signalled interrupt vectors can be shared, to the extent that the underlying communication medium can be shared. No additional effort is required. Because the identity of the interrupt is indicated by a pattern of data bits, not requiring a separate physical conductor, many more distinct interrupts can be efficiently handled. This reduces the need for sharing. Interrupt messages can also be passed over a serial bus, not requiring any additional lines. PCI Express, a serial computer bus, uses message-signalled interrupts exclusively.

Difficulty with sharing interrupt lines

Multiple devices sharing an interrupt line (of any triggering style) all act as spurious interrupt sources with respect to each other. With many devices on one line the workload in servicing interrupts grows as the square of the number of devices. It is therefore preferred to spread devices evenly across the available interrupt lines. Shortage of interrupt lines is a problem in older system designs where the interrupt lines are distinct physical conductors. Message-signalled interrupts, where the interrupt line is virtual, are favoured in new system architectures (such as PCI Express) and relieve this problem to a considerable extent.

Some devices with a badly designed programming interface provide no way to determine whether they have requested service. They may lock up or otherwise


misbehave if serviced when they do not want it. Such devices cannot tolerate spurious interrupts, and so also cannot tolerate sharing an interrupt line. ISA cards, due to often cheap design and construction, are notorious for this problem. Such devices are becoming much rarer, as hardware logic becomes cheaper and new system architectures mandate shareable interrupts.

Typical uses

Typical interrupt uses include the following: system timers, disk I/O, power-off signals, and traps. Other interrupts exist to transfer data bytes using UARTs or Ethernet; sense key-presses; control motors; or anything else the equipment must do. A classic system timer interrupt interrupts periodically from a counter or the power line. The interrupt handler counts the interrupts to keep time. The timer interrupt may also be used by the OS's task scheduler to reschedule the priorities of running processes. Counters are popular, but some older computers used the power-line frequency instead, because power companies in most Western countries control the power-line frequency with an atomic clock. A disk interrupt signals the completion of a data transfer from or to the disk peripheral; a process waiting to read or write a file starts up again. A power-off interrupt predicts or requests a loss of power, allowing the computer equipment to perform an orderly shutdown. Interrupts are also used in type-ahead features for buffering events like keystrokes.

Direct memory access (DMA) is a feature of modern computers that allows certain hardware subsystems within the computer to access system memory for reading and/or writing independently of the central processing unit. Many hardware systems use DMA, including disk drive controllers, graphics cards, network cards, and sound cards. Computers that have DMA channels can transfer data to and from devices with much less CPU overhead than computers without a DMA channel. Without DMA, using programmed input/output (PIO) mode, the CPU typically has to be occupied for the entire time it is performing a transfer. With DMA, the CPU initiates the transfer, does other operations while the transfer is in progress, and receives an interrupt from the DMA controller once the operation is done. This is especially useful in real-time computing applications where not stalling behind concurrent operations is critical.

Principle

DMA is an essential feature of all modern computers, as it allows devices to transfer data without subjecting the CPU to a heavy overhead. Otherwise, the CPU would have to copy each piece of data from the source to the destination. This is typically slower


than copying normal blocks of memory, since access to I/O devices over a peripheral bus is generally slower than normal system RAM. During this time the CPU would be unavailable for any other tasks involving CPU bus access, although it could continue doing any work which did not require bus access.

A DMA transfer essentially copies a block of memory from one device to another. While the CPU initiates the transfer, it does not execute it. For so-called "third party" DMA, as is normally used with the ISA bus, the transfer is performed by a DMA controller which is typically part of the motherboard chipset. More advanced bus designs such as PCI typically use bus mastering DMA, where the device takes control of the bus and performs the transfer itself.

A typical usage of DMA is copying a block of memory from system RAM to or from a buffer on the device. Such an operation does not stall the processor, which as a result can be scheduled to perform other tasks. DMA transfers are essential to high-performance embedded systems. DMA is also essential in providing so-called zero-copy implementations of peripheral device drivers as well as functionalities such as network packet routing, audio playback and streaming video.

DMA engines

In addition to hardware interaction, DMA can also be used to offload expensive memory operations, such as large copies or scatter-gather operations, from the CPU to a dedicated DMA engine. While normal memory copies are typically too small to be worthwhile to offload on today's desktop computers, they are frequently offloaded on embedded devices due to more limited resources.[1] Newer Intel Xeon processors also include a DMA engine technology called I/OAT, meant to improve network performance on high-throughput network interfaces, such as gigabit Ethernet, in particular.[2] However, benchmarks with this approach on Linux indicate no more than a 10% improvement in CPU utilization.[3]

Examples

ISA

For example, a PC's ISA DMA controller has 8 DMA channels, of which 7 are available for use by the PC's CPU. Each DMA channel has associated with it a 16-bit address register and a 16-bit count register. To initiate a data transfer the device driver sets up the DMA channel's address and count registers together with the direction of the data transfer, read or write. It then instructs the DMA hardware to begin the transfer. When the transfer is complete, the device interrupts the CPU.
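The setup sequence just described (program the address and count, set the direction, start the transfer, then wait for the completion interrupt) can be sketched in C against a hypothetical memory-mapped DMA controller. The register addresses, bit names, and the dma_done_isr hook below are invented for illustration and do not correspond to the real ISA 8237 programming model:

    #include <stdint.h>

    /* Hypothetical memory-mapped DMA controller registers. */
    #define DMA_ADDR        (*(volatile uint32_t *)0x5000u)  /* physical buffer address */
    #define DMA_COUNT       (*(volatile uint32_t *)0x5004u)  /* number of bytes to move */
    #define DMA_CTRL        (*(volatile uint32_t *)0x5008u)  /* control/start register  */
    #define DMA_CTRL_READ   0x1u   /* device -> memory */
    #define DMA_CTRL_START  0x2u

    static volatile int dma_complete;

    /* Installed as the controller's completion interrupt handler. */
    void dma_done_isr(void)
    {
        dma_complete = 1;
    }

    void start_dma_read(uint32_t phys_addr, uint32_t nbytes)
    {
        dma_complete = 0;
        DMA_ADDR  = phys_addr;                        /* where the data should land */
        DMA_COUNT = nbytes;                           /* how much to transfer        */
        DMA_CTRL  = DMA_CTRL_READ | DMA_CTRL_START;   /* kick off the transfer       */
        /* The CPU is now free to do other work; dma_complete is set by the ISR. */
    }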


"Scatter-gather" DMA allows the transfer of data to and from multiple memory areas in a single DMA transaction. It is equivalent to the chaining together of multiple simple DMA requests. Again, the motivation is to off-load multiple input/output interrupt and data copy tasks from the CPU. DRQ stands for DMA request; DACK for DMA acknowledge. These symbols are generally seen on hardware schematics of computer systems with DMA functionality. They represent electronic signaling lines between the CPU and DMA controller.

The CoreConnect™ Bus Architecture

Recent advances in silicon densities now allow for the integration of numerous functions onto a single silicon chip. With this increased density, peripherals formerly attached to the processor at the card level are integrated onto the same die as the processor. As a result, chip designers must now address issues traditionally handled by the system designer. In particular, the on-chip buses used in such system-on-a-chip designs must be sufficiently flexible and robust in order to support a wide variety of embedded system needs.

The IBM Blue Logic™ cores program provides the framework to efficiently realize complex system-on-a-chip (SOC) designs. Typically, a SOC contains numerous functional blocks representing a very large number of logic gates. Designs such as these are best realized through a macro-based approach. Macro-based design provides numerous benefits during logic entry and verification, but the ability to reuse intellectual property is often the most significant. From generic serial ports to complex memory controllers and processor cores, each SOC generally requires the use of common macros. Many single-chip solutions used in applications today are designed as custom chips, each with its own internal architecture. Logical units within such a chip are often difficult to extract and re-use in different applications. As a result, the same function is often redesigned from one application to another.

Reuse is promoted by ensuring macro interconnectivity, which is accomplished by using common buses for inter-macro communications. To that end, the IBM CoreConnect architecture provides three buses for interconnecting cores, library macros, and custom logic:
· Processor Local Bus (PLB)
· On-Chip Peripheral Bus (OPB)
· Device Control Register (DCR) Bus

Figure 1 illustrates how the CoreConnect architecture can be used to interconnect macros in a PowerPC 440 based SOC. High-performance, high-bandwidth blocks such as the PowerPC 440 CPU core, PCI-X Bridge and PC133/DDR133 SDRAM Controller reside on the PLB, while the OPB hosts lower data rate peripherals. The daisy-chained DCR bus provides a relatively low-speed data path for passing configuration and status information between the PowerPC 440 CPU core and other on-chip macros.


The CoreConnect architecture shares many similarities with the Advanced Microcontroller Bus Architecture (AMBA™) from ARM Ltd. As shown in Table 1, the recently announced AMBA 2.0 includes the specification of many high-performance features that have been available in the CoreConnect architecture for over three years. Both architectures support data bus widths of 32 bits and higher, utilize separate read and write data paths and


allow multiple masters. CoreConnect and AMBA 2.0 now both provide high-performance features including pipelining, split transactions and burst transfers. Many custom designs utilizing the high-performance features of the CoreConnect architecture are available in the marketplace today. Open specifications for the CoreConnect architecture are available on the IBM Microelectronics web site. In addition, IBM offers a no-fee, royalty-free CoreConnect architectural license. Licensees receive the PLB arbiter, OPB arbiter and PLB/OPB Bridge designs along with bus model toolkits and bus functional compilers for the PLB, OPB and DCR buses. In the future, IBM intends to include compliance test suites for each of the three buses.

Processor Local Bus

The PLB and OPB buses provide the primary means of data flow among macro elements. Because these two buses have different structures and control signals, individual macros are designed to interface to either the PLB or the OPB. Usually the PLB interconnects high-bandwidth devices such as processor cores, external memory interfaces and DMA controllers. The PLB addresses the high-performance, low-latency and design flexibility issues needed in a highly integrated SOC through:
· Decoupled address, read data, and write data buses with split transaction capability
· Concurrent read and write transfers yielding a maximum bus utilization of two data transfers per clock
· Address pipelining that reduces bus latency by overlapping a new write request with an ongoing write transfer and up to three read requests with an ongoing read transfer
· Ability to overlap the bus request/grant protocol with an ongoing transfer


In addition to providing a high-bandwidth data path, the PLB offers designers flexibility through the following features:
· Support for both multiple masters and slaves
· Four priority levels for master requests, allowing PLB implementations with various arbitration schemes
· Deadlock avoidance through slave-forced PLB rearbitration
· Master-driven atomic operations through a bus arbitration locking mechanism
· Byte-enable capability, supporting unaligned transfers
· A sequential burst protocol allowing byte, half-word, word and double-word burst transfers
· Support for 16-, 32- and 64-byte line data transfers
· Read word address capability, allowing slaves to return line data either sequentially or target word first
· DMA support for buffered, fly-by, peripheral-to-memory, memory-to-peripheral, and memory-to-memory transfers
· Guarded or unguarded memory transfers, allowing slaves to individually enable or disable prefetching of instructions or data
· Slave error reporting
· Architecture extendable to 256-bit data buses
· Fully synchronous operation

The PLB specification describes the system architecture along with a detailed description of the signals and transactions. PLB-based custom logic systems require the use of a PLB macro to interconnect the various master and slave macros. Figure 2 illustrates the connection of multiple masters and slaves through the PLB macro. Each PLB master is attached to the PLB macro via separate address, read data and write data buses and a plurality of transfer qualifier signals. PLB slaves are attached to the PLB macro via shared, but decoupled, address, read data and write data buses along with transfer control and status signals for each data bus. The PLB architecture supports up to 16 master devices. Specific PLB macro implementations, however, may support fewer masters. The PLB architecture also supports any number of slave devices. The number of masters and slaves attached to a PLB macro directly affects the maximum attainable PLB bus clock rate. This is because


larger systems tend to have increased bus wire load and a longer delay in arbitrating among multiple masters and slaves.

The PLB macro consists of a bus arbitration control unit and the control logic required to manage the address and data flow through the PLB. The separate address and data buses from the masters allow simultaneous transfer requests. The PLB macro arbitrates among these requests and directs the address, data and control signals from the granted master to the slave bus. The slave response is then routed from the slave bus back to the appropriate master.

PLB Bus Transactions

PLB transactions consist of multiphase address and data tenures. Depending on the level of bus activity and the capabilities of the PLB slaves, these tenures may be one or more PLB bus cycles in duration. In addition, address pipelining and separate read and write data buses yield increased bus throughput by way of concurrent tenures.

Address tenures have three phases: request, transfer and address acknowledge. A PLB transaction begins when a master drives its address and transfer qualifier signals and requests ownership of the bus during the request phase of the address tenure. Once the PLB arbiter grants bus ownership, the master's address and transfer qualifiers are presented to the slave devices during the transfer phase. The address cycle terminates when a slave latches the master's address and transfer qualifiers during the address acknowledge phase.

Figure 3 illustrates two-deep read and write address pipelining along with concurrent read and write data tenures. Master A and Master B represent the state of each master's address and transfer qualifiers. The PLB arbitrates between these requests and passes the selected master's request to the PLB slave address bus. The trace labeled Address Phase shows the state of the PLB slave address bus during each PLB clock. As shown in Figure 3, the PLB specification supports implementations where these three phases can require only a single PLB clock cycle. This occurs when the requesting master is immediately granted access to the slave bus and the slave acknowledges the address in the same cycle. If a master issues a request that cannot be immediately forwarded to the slave bus, the request phase lasts one or more cycles.


Each data beat in the data tenure has two phases: transfer and acknowledge. During the transfer phase the master drives the write data bus for a write transfer or samples the read data bus for a read transfer. As shown in Figure 3, the first (or only) data beat of a write transfer coincides with the address transfer phase. Data acknowledge cycles are required during the data acknowledge phase for each data beat in a data cycle. In the case of a single-beat transfer, the data acknowledge signals also indicate the end of the data transfer. For line or burst transfers, the data acknowledge signals apply to each individual beat and indicate the end of the data cycle only after the final beat. The highest data throughput occurs when data is transferred between master and slave in a single PLB clock cycle. In this case the data transfer and data acknowledge phases are coincident. During multi-cycle accesses there is a wait-state either before or between the data transfer and data acknowledge phases. The PLB address, read data, and write data buses are decoupled from one another, allowing for address cycles to be overlapped with read or write data cycles, and for read data cycles to be overlapped with write data cycles. The PLB split bus transaction capability allows the address and data buses to have different masters at the same time. Additionally, a second master may request ownership of the PLB, via address pipelining, in parallel with the data cycle of another master's bus transfer. This is shown in Figure 3. Overlapped read and write data transfers and split-bus transactions allow the PLB to operate at a very high bandwidth by fully utilizing the read and write data buses. Allowing PLB devices to move data using long burst transfers can further enhance bus throughput. However, to control the maximum latency in a particular application, master latency timers are required. All masters able to issue burst operations must contain a latency timer that increments at the PLB clock rate and a latency count register. The latency count register is an example of a configuration register that is accessed via the DCR bus. During a burst operation, the latency timer begins counting after an address acknowledge


is received from a slave. When the latency timer exceeds the value programmed into the latency count register, the master can either immediately terminate its burst, continue until another master requests the bus, or continue until another master requests the bus with a higher priority.

PLB Cross-Bar Switch

In some PLB-based systems multiple masters may cause the aggregate data bandwidth to exceed that which can be satisfied with a single PLB. With such a system it may be possible to place the high data rate masters and their target slaves on separate PLB buses. An example is a multiprocessor system using separate memory controllers. A macro known as the PLB Cross-Bar Switch (CBS) can be utilized to allow communication between masters on one PLB and slaves on the other. As shown in Figure 4, the CBS is placed between the PLB arbiters and their slave buses. When a master begins a transaction, the CBS uses the associated address to select the appropriate slave bus. The CBS supports simultaneous data transfers on both PLB buses along with a prioritization scheme to handle multiple requests to a common slave port. In addition, a high priority request can interrupt a lower priority transaction.

On-Chip Peripheral Bus

The On-Chip Peripheral Bus (OPB) is a secondary bus architected to alleviate system performance bottlenecks by reducing capacitive loading on the PLB. Peripherals suitable for attachment to the OPB include serial ports, parallel ports, UARTs, GPIO, timers and other low-bandwidth devices. As part of the IBM Blue Logic cores program, all OPB core peripherals directly attach to the OPB. This common design point accelerates the design cycle time by allowing system designers to easily integrate complex peripherals into an ASIC.

The OPB provides the following features:
· A fully synchronous protocol with separate 32-bit address and data buses
· Dynamic bus sizing to support byte, half-word and word transfers
· Byte and half-word duplication for byte and half-word transfers
· A sequential address (burst) protocol
· Support for multiple OPB bus masters
· Bus parking for reduced-latency transfers

OPB Bridge

PLB masters gain access to the peripherals on the OPB bus through the OPB bridge macro. The OPB bridge acts as a slave device on the PLB and a master on the OPB. It supports word (32-bit), half-word (16-bit) and byte read and write transfers on the 32-bit OPB data bus, supports bursts, and has the capability to perform target-word-first line read accesses. The OPB bridge performs dynamic bus sizing, allowing devices with different data widths to efficiently communicate. When the OPB bridge master performs an operation wider than the selected OPB slave, the bridge splits the operation into two or more smaller transfers.

OPB Implementation

The OPB supports multiple masters and slaves by implementing the address and data buses as a distributed multiplexer. This type of structure is suitable for the less data-intensive OPB bus and allows adding peripherals to a custom core logic design without changing the I/O on either the OPB arbiter or existing peripherals. Figure 5 shows one method of structuring the OPB address and data buses. Observe that both masters and slaves provide enable control signals for their outbound buses. By requiring that each macro provide this signal, the associated bus combining logic can be strategically

Channels

(1) A high-speed metal or optical fiber subsystem that provides a path between the computer and the control units of the peripheral devices. Used in mainframes and high-end servers, each channel is an independent unit that transfers data concurrently with other channels and the CPU. For example, in a 32-channel computer, 32 streams of data are transferred simultaneously. In contrast, the PCI bus in a desktop computer is a shared channel between all devices plugged into it.

(2) The physical connecting medium in a network, which could be twisted wire pairs, coaxial cable or optical fiber between clients, servers and other devices.

(3) A subchannel within a communications channel. Multiple channels are transmitted via different carrier frequencies or by interleaving bits and bytes. This usage of the term can refer to both wired and wireless transmission. See FDM and TDM.

(4) The Internet counterpart to a TV or radio channel. Information on a particular subject is transmitted to the user's computer from a Webcast site via the browser or push client. See Webcast, push client and push technology.

(5) The distributor/dealer sales channel. Vendors that sell in the channel rely on the sales ability of their dealers and the customer relationships they have built up over the years. Such vendors may also compete with the channel by selling direct to the customer via catalogs and the Web.

Channel controller (also known as an I/O processor)

A channel controller is a simple CPU used to handle the task of moving data to and from the memory of a computer. Depending on the sophistication of the design, channel controllers can also be referred to as peripheral processors, I/O processors, I/O controllers or DMA controllers. Most input/output tasks can be fairly complex and require logic to be applied to the data to convert formats, among other similar duties. In these situations the computer's CPU would normally be asked to handle the logic, but because the I/O devices are very slow, the CPU would end up spending a huge amount of time (in computer terms) sitting idle waiting for the data from the device.

A channel controller avoids this problem by using a low-cost CPU with enough logic and memory onboard to handle these sorts of tasks. Channel controllers are typically not powerful or flexible enough to be used on their own, and are actually a form of co-processor. The CPU sends small programs to the controller to handle an I/O job, which the channel controller can then complete without any help from the CPU. When the job is complete, or there is an error, the channel controller communicates with the CPU using a selection of interrupts. Since the channel controller has direct access to the main memory of the computer, channel controllers are also often referred to as DMA controllers (where DMA means direct memory access), but that term is somewhat looser in definition and is often applied to non-programmable devices as well.

The first use of channel controllers was in the famed CDC 6600 supercomputer, which used 10 dedicated computers, referred to as peripheral processors (PPs), for this role. The PPs were quite powerful, basically a cut-down version of CDC's first computer, the CDC 1604. Since the 1960s channel controllers have been a standard part of almost all mainframe designs, and a primary reason why anyone buys one. CDC's PPs are at one end of the spectrum of power; most mainframe systems tasked the CPU with more and the channel controllers with less of the overall I/O task.


Channel controllers have also been made as small as single-chip designs with multiple channels on them, used in the NeXT computers for instance. However, with the rapid speed increases in computers today, combined with operating systems that don't "block" when waiting for data, the channel controller has become somewhat redundant and is not commonly found on smaller machines. Channel controllers can be said to be making a comeback in the form of "bus mastering" peripheral devices, such as SCSI adaptors and network cards. The rationale for these devices is the same as for the original channel controllers, namely off-loading interrupts and context switching from the main CPU.

A serial number is a unique number that is one of a series assigned for identification and that varies from its successor or predecessor by a fixed discrete integer value. Common usage has expanded the term to refer to any unique alphanumeric identifier for one of a large set of objects, notably in data processing and allied fields in computer science. Not every numerical identifier is a serial number; identifying numbers which are not serial numbers are sometimes called nominal numbers. Sequence numbers are almost always non-negative, and typically start at zero or one.

Applications of serial numbering

Serial numbers are valuable in quality control, as once a defect is found in the production of a particular batch of product, the serial number will quickly identify which units are affected. Serial numbers are also used as a deterrent against theft and counterfeit products, in that serial numbers can be recorded, and stolen or otherwise irregular goods can be identified. Many computer programs come with serial numbers, often called "CD keys," and the installers often require the user to enter a valid serial number to continue. These numbers are verified using a certain algorithm to avoid usage of counterfeit keys. Serial numbers also help track down counterfeit currency, because in some countries each banknote has a unique serial number.

The ISSN or International Standard Serial Number seen on magazines and other periodicals, an equivalent to the ISBN applied to books, is serially assigned but takes its name from the library science use of serial to mean a periodical. Certificates and Certificate Authorities (CA) are necessary for widespread use of cryptography; these depend on applying mathematically rigorous serial numbers and serial number arithmetic. The term "serial number" is also used in military formations as an alternative to the expression "service number".

Estimating population size from serial numbers


If there are items whose serial numbers are part of a sequence of consecutive numbers, and you take a random sample of n items' serial numbers, you can then estimate the population of items "in the wild" using a maximum likelihood method derived using Bayesian reasoning.

Serial number arithmetic

Serial numbers are often used in network protocols. However, most sequence numbers in computer protocols are limited to a fixed number of bits, and will wrap around after sufficiently many numbers have been allocated. Thus, recently allocated serial numbers may duplicate very old serial numbers, but not other recently allocated serial numbers. To avoid ambiguity with these non-unique numbers, RFC 1982, "Serial Number Arithmetic", defines special rules for calculations involving these kinds of serial numbers. Lollipop sequence number spaces are a more recent and sophisticated scheme for dealing with finite-sized sequence numbers in protocols.
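Both ideas above can be made concrete with a short C sketch. The population estimate uses the standard frequentist formula m(1 + 1/n) − 1 for the largest observed serial m out of n samples (the Bayesian maximum-likelihood treatment mentioned in the text gives a fuller answer; this formula is shown only for illustration), and the comparison helper uses the common signed-difference trick for wrap-around sequence numbers, which is in the spirit of, but not a verbatim implementation of, RFC 1982:

    #include <stdint.h>
    #include <stdbool.h>

    /* Estimate how many items exist when serial numbers run 1..N consecutively,
     * given the largest serial observed (max_observed) in a sample of n items. */
    double estimate_population(uint32_t max_observed, unsigned n)
    {
        return (double)max_observed * (1.0 + 1.0 / (double)n) - 1.0;
    }

    /* "Is a earlier than b?" for 32-bit sequence numbers that wrap around.
     * The signed interpretation of the difference works as long as the two
     * numbers are less than half the number space (2^31) apart. */
    bool seq_before(uint32_t a, uint32_t b)
    {
        return (int32_t)(a - b) < 0;
    }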

An information processor or information processing system, as its name suggests, is a system (be it electrical, mechanical or biological) which takes information (a sequence of enumerated states) in one form and processes (transforms) it into another form, e.g. to statistics, by an algorithmic process.

An information processing system is made up of four basic parts, or sub-systems:
· input
· processor
· storage
· output

An object may be considered an information processor if it receives information from another object and in some manner changes the information before transmitting it. This broadly defined term can be used to describe every change which occurs in the


universe. As an example, a falling rock could be considered an information processor due to the following observable facts: First, information in the form of gravitational force from the earth serves as input to the system we call a rock. At a particular instant the rock is a specific distance from the surface of the earth traveling at a specific speed. Both the current distance and speed properties are also forms of information which for that instant only may be considered "stored" in the rock. In the next instant, the distance of the rock from the earth has changed due to its motion under the influence of the earth's gravity. Any time the properties of an object change a process has occurred meaning that a processor of some kind is at work. In addition, the rock's new position and increased speed is observed by us as it falls. These changing properties of the rock are its "output." It could be argued that in this example both the rock and the earth are the information processing system being observed since both objects are changing the properties of each other over time. If information is not being processed no change would occur at all.

Lesson VII: Arithmetic/Logic Unit Enhancement

Arithmetic logic units (ALUs) perform arithmetic and logic operations on binary data inputs. In some processors, the ALU is divided into two units: an arithmetic unit (AU) and a logic unit (LU). In processors with multiple arithmetic units, one AU may be used for fixed-point operations while another is used for floating-point operations. In some personal computers (PCs), floating-point operations are performed by a special floating-point AU that is located on a separate chip called a numeric coprocessor.

Typically, arithmetic logic units have direct input and output access to the processor controller, main memory and input/output (I/O) devices. Inputs and outputs flow along an electronic path called a bus. Each input consists of a machine instruction word that contains an operation code, one or more operands, and sometimes a format code. The operation code determines the operations to perform and the operands to use. When combined with a format code, it also indicates whether the operation is fixed-point or floating-point. ALU outputs are placed in a storage register. Generally, arithmetic logic units include storage points for input operands, operands that are being added, the accumulated result, and shifted results. A simple conceptual model of this opcode-driven behavior is sketched below.
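The sketch is a purely conceptual software analogue of an opcode selecting the operation applied to two operands; the operation names and 8-bit width are illustrative assumptions, not a model of any particular processor's instruction format:

    #include <stdint.h>

    /* Hypothetical operation codes for a toy 8-bit ALU model. */
    enum alu_op { ALU_ADD, ALU_SUB, ALU_AND, ALU_OR, ALU_XOR, ALU_SHL };

    /* The "instruction" supplies the opcode and the two operands; the result
     * would be latched into a storage register in a real datapath. */
    uint8_t alu_execute(enum alu_op op, uint8_t a, uint8_t b)
    {
        switch (op) {
        case ALU_ADD: return (uint8_t)(a + b);
        case ALU_SUB: return (uint8_t)(a - b);
        case ALU_AND: return a & b;
        case ALU_OR:  return a | b;
        case ALU_XOR: return a ^ b;
        case ALU_SHL: return (uint8_t)(a << (b & 7));  /* shift amount kept in range */
        }
        return 0;
    }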


Arithmetic logic units vary in terms of number of bits, supply voltage, operating current, propagation delay, power dissipation, and operating temperature. The number of bits equals the width of the two input words on which the ALU performs arithmetic and logical operations. Common configurations include 2-bit, 4-bit, 8-bit, 16-bit, 32-bit and 64-bit ALUs. Supply voltages range from −5 V to 5 V and include intermediate voltages such as −4.5 V, −3.3 V, −3 V, 1.2 V, 1.5 V, 1.8 V, 2.5 V, 3 V, 3.3 V, and 3.6 V. The operating current is the minimum current needed for active operation. The propagation delay is the time interval between the application of an input signal and the occurrence of the corresponding output. Power dissipation, the total power consumption of the device, is generally expressed in watts (W) or milliwatts (mW). Operating temperature is specified as the full required temperature range.

Selecting arithmetic logic units requires an analysis of logic families. Transistor-transistor logic (TTL) and related technologies such as Fairchild advanced Schottky TTL (FAST) use transistors as digital switches. By contrast, emitter-coupled logic (ECL) uses transistors to steer current through gates that compute logical functions. Another logic family, complementary metal-oxide semiconductor (CMOS), uses a combination of P-type and N-type metal-oxide-semiconductor field-effect transistors (MOSFETs) to implement logic gates and other digital circuits. Bipolar CMOS (BiCMOS) is a silicon-germanium technology that combines the high speed of bipolar TTL with the low power consumption of CMOS. Other logic families for arithmetic logic units include cross-bar switch technology (CBT), gallium arsenide (GaAs), integrated injection logic (I2L) and silicon on sapphire (SOS). Gunning transceiver logic (GTL) and Gunning transceiver logic plus (GTLP) are also available.

Arithmetic logic units are available in a variety of integrated circuit (IC) package types and with different numbers of pins. Basic IC package types for ALUs include ball grid array (BGA), quad flat package (QFP), single in-line package (SIP), and dual in-line package (DIP). Many packaging variants are available. For example, BGA variants include plastic ball grid array (PBGA) and tape ball grid array (TBGA). QFP variants include low-profile quad flat package (LQFP) and thin quad flat package (TQFP). DIPs are available in either ceramic (CDIP) or plastic (PDIP). Other IC package types include small outline package (SOP), thin small outline package (TSOP), and shrink small outline package (SSOP).

Decimal Arithmetic

The 80x86 CPUs use the binary numbering system for their native internal representation. The binary numbering system is, by far, the most common numbering system in use in computer systems today. In days long since past, however, there were computer systems that were based on the decimal (base 10) numbering system rather than the binary numbering system. Consequently, their arithmetic system was decimal based rather than binary. Such computer systems were very popular in systems targeted for business/commercial applications. Although systems designers have discovered that binary arithmetic is almost always better than decimal arithmetic for general calculations, the myth still persists that decimal arithmetic is better for money calculations than binary arithmetic. Therefore, many software systems still specify the use of decimal arithmetic in their calculations (not to mention that there is lots of legacy code out there whose


algorithms are only stable if they use decimal arithmetic). Therefore, despite the fact that decimal arithmetic is generally inferior to binary arithmetic, the need for decimal arithmetic still persists. Of course, the 80x86 is not a decimal computer; therefore we have to play tricks in order to represent decimal numbers using the native binary format. The most common technique, even employed by most so-called decimal computers, is to use the binary coded decimal, or BCD representation. The BCD representation (see "Nibbles" on page 56) uses four bits to represent the 10 possible decimal digits. The binary value of those four bits is equal to the corresponding decimal value in the range 0..9. Of course, with four bits we can actually represent 16 different values. The BCD format ignores the remaining six bit combinations.

Table 1: Binary Coded Decimal (BCD) Representation

    BCD Representation    Decimal Equivalent
    0000                  0
    0001                  1
    0010                  2
    0011                  3
    0100                  4
    0101                  5
    0110                  6
    0111                  7
    1000                  8
    1001                  9
    1010                  Illegal
    1011                  Illegal
    1100                  Illegal
    1101                  Illegal
    1110                  Illegal
    1111                  Illegal


Since each BCD digit requires four bits, we can represent a two-digit BCD value with a single byte. This means that we can represent the decimal values in the range 0..99 using a single byte (versus 0..255 if we treat the value as an unsigned binary number). Clearly it takes a bit more memory to represent the same value in BCD as it does to represent the same value in binary. For example, with a 32-bit value you can represent BCD values in the range 0..99,999,999 (eight significant digits) but you can represent values in the range 0..4,294,967,295 (better than nine significant digits) using the binary representation. Not only does the BCD format waste memory on a binary computer (since it uses more bits to represent a given integer value), but decimal arithmetic is slower. For these reasons, you should avoid the use of decimal arithmetic unless it is absolutely mandated for a given application. Binary coded decimal representation does offer one big advantage over binary representation: it is fairly trivial to convert between the string representation of a decimal number and the BCD representation. This feature is particularly beneficial when working with fractional values since fixed and floating point binary representations cannot exactly represent many commonly used values between zero and one (e.g., 1/10). Therefore, BCD operations can be efficient when reading from a BCD device, doing a simple arithmetic operation (e.g., a single addition) and then writing the BCD value to some other device.
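To make the packed representation concrete, here is a small C sketch (added for illustration; the function names are ours, not from the original text) that packs a value in the range 0..99 into one BCD byte and unpacks it again.

#include <stdio.h>
#include <stdint.h>

/* Pack a binary value 0..99 into one byte holding two BCD digits. */
static uint8_t to_packed_bcd(uint8_t value)
{
    return (uint8_t)(((value / 10) << 4) | (value % 10));
}

/* Unpack a two-digit BCD byte back into an ordinary binary value. */
static uint8_t from_packed_bcd(uint8_t bcd)
{
    return (uint8_t)(((bcd >> 4) * 10) + (bcd & 0x0F));
}

int main(void)
{
    uint8_t bcd = to_packed_bcd(99);
    printf("99 as packed BCD: 0x%02X\n", bcd);        /* prints 0x99 */
    printf("back to binary:   %u\n", from_packed_bcd(bcd));
    return 0;
}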

Literal BCD Constants

HLA does not provide, nor do you need, a special literal BCD constant. Since BCD is just a special form of hexadecimal notation that does not allow the values $A..$F, you can easily create BCD constants using HLA's hexadecimal notation. Of course, you must take care not to include the symbols 'A'..'F' in a BCD constant since they are illegal BCD values. As an example, consider the following MOV instruction that copies the BCD value '99' into the AL register:

mov( $99, al );

The important thing to keep in mind is that you must not use HLA literal decimal constants for BCD values. That is, "mov( 95, al );" does not load the BCD representation for ninety-five into the AL register. Instead, it loads $5F into AL and that's an illegal BCD value. Any computations you attempt with illegal BCD values will produce garbage results. Always remember that, even though it seems counter-intuitive, you use hexadecimal literal constants to represent literal BCD values.

How Pipelining Works

Pipelining, a standard feature in RISC processors, is much like an assembly line. Because


the processor works on different steps of the instruction at the same time, more instructions can be executed in a shorter period of time. A useful method of demonstrating this is the laundry analogy. Let's say that there are four loads of dirty laundry that need to be washed, dried, and folded. We could put the first load in the washer for 30 minutes, dry it for 40 minutes, and then take 20 minutes to fold the clothes. Then pick up the second load and wash, dry, and fold, and repeat for the third and fourth loads. Supposing we started at 6 PM and worked as efficiently as possible, we would still be doing laundry until midnight.

However, a smarter approach to the problem would be to put the second load of dirty laundry into the washer after the first was already clean and whirling happily in the dryer. Then, while the first load was being folded, the second load would dry, and a third load could be added to the pipeline of laundry. Using this method, the laundry would be finished by 9:30.
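The arithmetic behind the analogy can be checked with a short C sketch (added here for illustration, not part of the original text): with wash = 30, dry = 40, and fold = 20 minutes, four loads done one after another take 6 hours, while overlapping the steps finishes in 3.5 hours, i.e., by 9:30.

#include <stdio.h>

#define LOADS 4

int main(void)
{
    const int wash = 30, dry = 40, fold = 20;        /* minutes per stage */

    /* Sequential: every load finishes completely before the next one starts. */
    int sequential = LOADS * (wash + dry + fold);

    /* Overlapped: each machine is reused as soon as it becomes free. */
    int washer_free = 0, dryer_free = 0, finished = 0;
    for (int i = 0; i < LOADS; i++) {
        int wash_done = washer_free + wash;
        int dry_done  = (wash_done > dryer_free ? wash_done : dryer_free) + dry;
        washer_free   = wash_done;
        dryer_free    = dry_done;
        finished      = dry_done + fold;             /* folding never has to wait here */
    }

    printf("sequential: %d minutes\n", sequential);  /* 360 minutes = 6 hours   */
    printf("pipelined:  %d minutes\n", finished);    /* 210 minutes = 3.5 hours */
    return 0;
}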


RISC Pipelines A RISC processor pipeline operates in much the same way, although the stages in the pipeline are different. While different processors have different numbers of steps, they are basically variations of these five, used in the MIPS R3000 processor: 1. fetch instructions from memory 2. read registers and decode the instruction 3. execute the instruction or calculate an address 4. access an operand in data memory 5. write the result into a register If you glance back at the diagram of the laundry pipeline, you'll notice that although the washer finishes in half an hour, the dryer takes an extra ten minutes, and thus the wet clothes must wait ten minutes for the dryer to free up. Thus, the length of the pipeline is dependent on the length of the longest step. Because RISC instructions are simpler than those used in pre-RISC processors (now called CISC, or Complex Instruction Set Computer), they are more conducive to pipelining. While CISC instructions varied in length, RISC instructions are all the same length and can be fetched in a single operation. Ideally, each of the stages in a RISC processor pipeline should take 1 clock cycle so that the processor finishes an instruction each clock cycle and averages one cycle per instruction (CPI).
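As a rough model of the same idea (an added illustration, not a claim from the original text), an ideal k-stage pipeline needs k cycles to fill and then retires one instruction per cycle, so N instructions take about N + k - 1 cycles instead of N * k.

#include <stdio.h>

/* Ideal pipeline model: k cycles to fill, then one instruction per cycle. */
static long pipelined_cycles(long n, long k)   { return n + k - 1; }
static long unpipelined_cycles(long n, long k) { return n * k; }

int main(void)
{
    long n = 1000, k = 5;    /* e.g. the five MIPS R3000 stages */
    printf("no pipeline: %ld cycles\n", unpipelined_cycles(n, k));  /* 5000 */
    printf("pipelined:   %ld cycles\n", pipelined_cycles(n, k));    /* 1004 */
    printf("CPI approx:  %.3f\n", (double)pipelined_cycles(n, k) / n);
    return 0;
}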


Pipeline Problems

In practice, however, RISC processors operate at more than one cycle per instruction. The processor might occasionally stall as a result of data dependencies and branch instructions.

A data dependency occurs when an instruction depends on the results of a previous instruction. A particular instruction might need data in a register which has not yet been stored, since that is the job of a preceding instruction which has not yet reached that step in the pipeline. For example:

add $r3, $r2, $r1
add $r5, $r4, $r3
(more instructions that are independent of the first two)

In this example, the first instruction tells the processor to add the contents of registers r1 and r2 and store the result in register r3. The second instructs it to add r3 and r4 and store the sum in r5. We place this set of instructions in a pipeline. When the second instruction is in the second stage, the processor will be attempting to read r3 and r4 from the registers. Remember, though, that the first instruction is just one step ahead of the second, so the contents of r1 and r2 are being added, but the result has not yet been written into register r3. The second instruction therefore cannot read from register r3 because it hasn't been written yet and must wait until the data it needs is stored. Consequently, the pipeline is stalled and a number of empty instructions (known as bubbles) go into the pipeline. Data dependency affects long pipelines more than shorter ones since it takes a longer period of time for an instruction to reach the final register-writing stage of a long pipeline.

MIPS' solution to this problem is code reordering. If, as in the example above, the following instructions have nothing to do with the first two, the code could be rearranged so that those instructions are executed in between the two dependent instructions and the pipeline could flow efficiently. The task of code reordering is generally left to the compiler, which recognizes data dependencies and attempts to minimize performance stalls.

Branch instructions are those that tell the processor to make a decision about what the next instruction to be executed should be based on the results of another instruction. Branch instructions can be troublesome in a pipeline if a branch is conditional on the results of an instruction which has not yet finished its path through the pipeline. For example:


Loop: add $r3, $r2, $r1
      sub $r6, $r5, $r4
      beq $r3, $r6, Loop

The example above instructs the processor to add r1 and r2 and put the result in r3, then subtract r4 from r5, storing the difference in r6. In the third instruction, beq stands for branch if equal. If the contents of r3 and r6 are equal, the processor should execute the instruction labeled "Loop." Otherwise, it should continue to the next instruction. In this example, the processor cannot make a decision about which branch to take because neither the value of r3 nor the value of r6 has been written into the registers yet.

The processor could stall, but a more sophisticated method of dealing with branch instructions is branch prediction. The processor makes a guess about which path to take - if the guess is wrong, anything written into the registers must be cleared, and the pipeline must be started again with the correct instruction. Some methods of branch prediction depend on stereotypical behavior. Branches pointing backward are taken about 90% of the time, since backward-pointing branches are often found at the bottom of loops. On the other hand, branches pointing forward are only taken approximately 50% of the time. Thus, it would be logical for processors to always follow the branch when it points backward, but not when it points forward. Other methods of branch prediction are less static: processors that use dynamic prediction keep a history for each branch and use it to predict future branches. These processors are correct in their predictions 90% of the time.

Still other processors forgo the entire branch prediction ordeal. The RISC System/6000 fetches and starts decoding instructions from both sides of the branch. When it determines which branch should be followed, it then sends the correct instructions down the pipeline to be executed.
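The text mentions dynamic predictors that keep a history for each branch. One common scheme (shown here as an added sketch, not necessarily the exact mechanism the text has in mind) is a table of 2-bit saturating counters indexed by the branch address.

#include <stdint.h>
#include <stdbool.h>

#define TABLE_SIZE 1024                 /* number of 2-bit counters */

static uint8_t counters[TABLE_SIZE];    /* 0,1 = predict not taken; 2,3 = predict taken */

/* Predict using the counter selected by the low bits of the branch address. */
bool predict(uint32_t branch_addr)
{
    return counters[branch_addr % TABLE_SIZE] >= 2;
}

/* After the branch resolves, nudge the counter toward the actual outcome. */
void update(uint32_t branch_addr, bool taken)
{
    uint8_t *c = &counters[branch_addr % TABLE_SIZE];
    if (taken  && *c < 3) (*c)++;
    if (!taken && *c > 0) (*c)--;
}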


Pipelining Developments

In order to make processors even faster, various methods of optimizing pipelines have been devised. Superpipelining refers to dividing the pipeline into more steps. The more pipe stages there are, the faster the pipeline is because each stage is then shorter. Ideally, a pipeline with five stages should be five times faster than a non-pipelined processor (or rather, a pipeline with one stage). The instructions are executed at the speed at which each stage is completed, and each stage takes one fifth of the amount of time that the non-pipelined instruction takes. Thus, a processor with an 8-step pipeline (the MIPS R4000) will be even faster than its 5-step counterpart. The MIPS R4000 chops its pipeline into more pieces by dividing some steps into two. Instruction fetching, for example, is now done in two stages rather than one. The stages are as shown:

1. Instruction Fetch (First Half)
2. Instruction Fetch (Second Half)
3. Register Fetch
4. Instruction Execute
5. Data Cache Access (First Half)
6. Data Cache Access (Second Half)
7. Tag Check
8. Write Back

Superscalar pipelining involves multiple pipelines in parallel. Internal components of the processor are replicated so it can launch multiple instructions in some or all of its pipeline stages. The RISC System/6000 has a forked pipeline with different paths for floating-point and integer instructions. If there is a mixture of both types in a program, the processor can keep both forks running simultaneously. Both types of instructions share two initial stages (Instruction Fetch and Instruction Dispatch) before they fork. Often, however, superscalar pipelining refers to multiple copies of all pipeline stages (In terms of laundry, this would mean four washers, four dryers, and four people who fold clothes). Many of today's machines attempt to find two to six instructions that it can execute in every pipeline stage. If some of the instructions are dependent, however, only the first instruction or instructions are issued. Dynamic pipelines have the capability to schedule around stalls. A dynamic pipeline is divided into three units: the instruction fetch and decode unit, five to ten execute or functional units, and a commit unit. Each execute unit has reservation stations, which act as buffers and hold the operands and operations.


While the functional units have the freedom to execute out of order, the instruction fetch/decode and commit units must operate in-order to maintain simple pipeline behavior. When the instruction is executed and the result is calculated, the commit unit decides when it is safe to store the result. If a stall occurs, the processor can schedule other instructions to be executed until the stall is resolved. This, coupled with the efficiency of multiple units executing instructions simultaneously, makes a dynamic pipeline an attractive alternative.

Lesson VIII: Processor and System Structures

Types of Computer System

The line between classes of computer systems can be extremely vague. A powerful entry-level system can double as a low-end business system, or a gaming system can be identical to a low-end workstation. In fact, some equipment manufacturers may refer to their business systems as workstations. Some components on a computer in any class can be installed on all systems. For example, a manufacturer may use the same RAM on the entry-level system and the gaming system. You will want to pay particular attention to a system's CPU and video. Sometimes it is the amount of hard disk space or the addition of a better


video adapter that moves a system from one class to another. Please keep in mind that the tables below show the minimum configurations.

Entry Level Systems

Entry level systems are the most common systems for home and general use. These systems are often powerful enough to run standard office software, watch DVD movies, surf the internet, send and receive email, as well as run home and small office accounting software. As of the date of this article, the minimum configuration for an entry level computer system is as follows:

Computer Processor (CPU): Intel Pentium 4 or Celeron running at 2 Gigahertz (GHz) or better, or AMD Athlon, Duron or Sempron running at 1.5 GHz or better
System memory (RAM): 256 megabytes (MB) of DDR RAM
Hard Disk Storage: 40 Gigabytes (GB)
Optical Storage: CDRW/DVD
Monitor: 17 inch CRT display
USB Ports: 2.0 standard, at least 4 ports
Video: At least 32 MB - often uses system memory
Audio: Should be included, along with speakers
Network Adapter: Should be included (for hi-speed Internet)

Additionally, the system may include some additional ports such as keyboard and mouse ports, a printer port, serial port(s), a game port and, optionally, a dial-up modem.

Business Class Systems

Computer Processor (CPU): Intel Pentium 4 or Celeron running at 2 Gigahertz (GHz) or better, or AMD Athlon, Duron or Sempron running at 1.5 GHz or better
System memory (RAM): 256 megabytes (MB) of Error Correcting Code (ECC) DDR RAM
Hard Disk Storage: EIDE 120 Gigabytes (GB) - SCSI or SATA with RAID 1 preferred
Optical Storage: CDRW/DVD
Monitor: 17 inch CRT display
USB Ports: 2.0 standard, at least 4 ports
Video: At least 32 MB RAM - often uses system memory
Audio: Should be included, along with speakers
Network Adapter: Should be included (for hi-speed Internet)


Gaming Systems

Computer Processor (CPU): Intel Pentium 4 running at 3 Gigahertz (GHz) or better, or AMD Athlon or AMD64 running at 2.2 GHz or better
System memory (RAM): 1 Gigabyte (GB) of DDR RAM
Hard Disk Storage: 80 Gigabytes (GB)
Optical Storage: CDRW/DVD
Monitor: 17 inch LCD display
USB Ports: 2.0 standard, at least 6 ports
Video: At least 128 MB DDR RAM video adapter with Graphics Processing Unit (GPU) and heat sink
Audio: 5.1 Dolby
Network Adapter: Should be included (for hi-speed Internet)

Workstations and Servers

Workstations and servers are usually built and configured to specifications.

Computer Processor(s) (CPUs): Intel Pentium 4, Intel Xeon, AMD64, AMD64FX, AMD Opteron. System may support multiple processors.
System memory (RAM): 512 megabytes (MB) to 4 Gigabytes (GB) of DDR RAM
Hard Disk Storage: 80 Gigabytes (GB) to 2 terabytes (TB)
Optical Storage: Task specific
Monitor: 17 inch CRT display
USB Ports: 2.0 standard, at least 6 ports
Video: Task specific
Audio: Optional - task specific
Network Adapter: High end network adapter

In mathematics, an operand is one of the inputs (arguments) of an operator. For instance, in 3+6=9 '+' is the operator and '3' and '6' are the operands.


The number of operands of an operator is called its arity. Based on arity, operators are classified as unary, binary, ternary, etc. In computer programming languages, the definitions of operator and operand are almost the same as in mathematics. Additionally, in assembly language, an operand is a value (an argument) on which the instruction, named by its mnemonic, operates. The operand may be a processor register, a memory address, a literal constant, or a label. A simple example (in the PC architecture) is

MOV DS, AX

where the value in register operand 'AX' is to be moved into register 'DS'. Depending on the instruction, there may be zero, one, two, or more operands. Retrieved from "http://en.wikipedia.org/wiki/Operand"

The ISA Level: Data Types, Instruction Formats and Addressing

Introduction
• At the ISA level, a variety of different data types are used to represent data.
• A key issue is whether or not there is hardware support for a particular data type.
• Hardware support means that one or more instructions expect data in a particular format, and the user is not free to pick a different format.
• Another issue is precision – what if we wanted to total the transactions on Bill Gates' deposit account?
• Using 32-bit arithmetic would not work here because the numbers involved are larger than 2^32 (about 4 billion).
• We could use two 32-bit integers to represent each number, giving 64 bits in all.
• However, if the machine does not support this kind of double precision number, all arithmetic on them will have to be done in software, without a hardware-defined representation.
• Today, we will look at which data types are supported by the hardware, and thus for which ones specific formats are required.
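As a sketch of what doing this in software means (added here for illustration, not from the original text), a 64-bit sum can be built from two 32-bit words by propagating the carry by hand:

#include <stdio.h>
#include <stdint.h>

/* A 64-bit value held as two 32-bit words, for a machine with only 32-bit arithmetic. */
typedef struct { uint32_t lo, hi; } wide64;

static wide64 wide_add(wide64 a, wide64 b)
{
    wide64 r;
    r.lo = a.lo + b.lo;
    /* If the low word wrapped around, carry one into the high word. */
    r.hi = a.hi + b.hi + (r.lo < a.lo ? 1 : 0);
    return r;
}

int main(void)
{
    wide64 a = { 0xFFFFFFFFu, 0x00000001u };   /* 0x1FFFFFFFF */
    wide64 b = { 0x00000001u, 0x00000000u };   /* 1           */
    wide64 s = wide_add(a, b);
    printf("sum = 0x%08X%08X\n", s.hi, s.lo);  /* 0x0000000200000000 */
    return 0;
}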


Numeric Data Types
• Data types can be divided into two categories: numeric and nonnumeric.
• Chief among the numeric data types are the integers, which come in many lengths, typically 8, 16, 32, and 64 bits.
• Most modern computers store integers in two's complement binary notation.
• Some computers support unsigned integers as well as signed integers.
• For an unsigned integer, there is no sign bit and all the bits contain data – thus the range of a 32-bit word is 0 to 2^32 − 1, inclusive.
• In contrast, a two's complement signed 32-bit integer can only handle numbers up to 2^31 − 1, but it can also handle negative numbers.
• For numbers that cannot be expressed as an integer, floating-point numbers are used.
• They have lengths of 32, 64, or sometimes 128 bits.
• Most computers have instructions for doing floating-point arithmetic.
• Many computers have separate registers for holding integer operands and for holding floating-point operands.

Nonnumeric Data Types
• Modern computers are often used for nonnumerical applications, such as word processing or database management.
• Thus, characters are clearly important here although not every computer provides hardware support for them.
• The most common character codes are ASCII and UNICODE.
• These support 7-bit characters and 16-bit characters, respectively.
• It is not uncommon for the ISA level to have special instructions that are intended for handling character strings.
• The instructions can perform copy, search, edit and other functions on the strings.
• Boolean values are also important.
• Two values: TRUE or FALSE.
• In theory, a single bit can represent a Boolean, with 0 as false and 1 as true (or vice versa).
• In practice, a byte or word is used per Boolean value because individual bits in a byte do not have their own addresses and thus are hard to access.
• A common system uses the convention that 0 means false and everything else means true.
• Our last data type is the pointer, which is just a machine address.
• We have already seen pointers.
• When we discussed stacks we came across pointers SP and LV.
• Accessing a variable at a fixed distance from a pointer, which is the way ILOAD works, is extremely common on all machines.
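These ranges, and the two's complement rule that -x equals ~x + 1, can be checked directly with a short C snippet (added here for illustration):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t umax = 0xFFFFFFFFu;                  /* 2^32 - 1 = 4294967295 */
    int32_t  smax = 0x7FFFFFFF;                   /* 2^31 - 1 = 2147483647 */

    int32_t x = 5;
    int32_t neg = (int32_t)(~(uint32_t)x + 1u);   /* two's complement negation */

    printf("unsigned 32-bit max: %u\n", umax);
    printf("signed   32-bit max: %d\n", smax);
    printf("-x via ~x + 1:       %d\n", neg);     /* prints -5 */
    return 0;
}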


Instruction Formats
• An instruction consists of an opcode, usually along with some additional information such as where operands come from, and where results go to.
• The general subject of specifying where the operands are (i.e., their addresses) is called addressing.
• Instructions always have an opcode to tell what the instruction does.
• There can be zero, one, two, or three addresses present.

Instruction Formats • On some machines, all instructions have the same length; on others there may be many different lengths. • Instructions may be shorter than, the same length as, or longer than the word length. • Having all the instructions be the same length is simpler and makes decoding easier but often wastes space, since all instructions then have to be as long as the longest one.


Addressing • Instructions generally have one, two or three operands. • The operands are addressed using one of the following modes: – Immediate – Direct – Register – Indexed – Other mode


• Some machines have a large number of complex addressing modes.
• We will consider a few addressing modes here.
• The simplest way for an instruction to specify an operand is for the address part of the instruction actually to contain the operand itself rather than an address or other information describing where the operand is.
• Such an operand is called an immediate operand because it is automatically fetched from memory at the same time the instruction itself is fetched.
• Example: MOV R1,4
• Advantage – no extra memory reference to fetch the operand.
• Disadvantage – only a constant can be supplied this way.

Direct Addressing
• A method for specifying an operand in memory is just to give its full address.
• This mode is called direct addressing.
• Like immediate addressing, direct addressing is restricted in its use: the instruction will always access exactly the same memory location.
• So while the value can change, the location cannot.
• Thus direct addressing can only be used to access global variables whose address is known at compile time.

Register Addressing
• Register addressing is conceptually the same as direct addressing but specifies a register instead of a memory location.
• Because registers are so important (due to fast access and short addresses) this addressing mode is the most common one on most computers.
• Many compilers go to great lengths to determine which variables will be accessed most often (for example, a loop index) and put these variables in registers.
• This addressing mode is known simply as register mode.

Register Indirect Addressing
• In this mode, the operand being specified comes from memory or goes to memory, but its address is not hardwired into the instruction, as in direct addressing.
• Instead, the address is contained in a register.
• When an address is used in this manner, it is called a pointer.
• A big advantage of register indirect addressing is that it can reference memory without paying the price of having a full memory address in the instruction.
• Consider a program which steps through the elements of a 1024-element one-dimensional


integer array to compute the sum of the elements in register R1.
• We will use register indirect addressing through R2 to access the elements of the array.
• Here is the assembly program:

Example of Register Indirect Addressing

        MOV R1,#0          ; accumulate the sum in R1, initially 0
        MOV R2,#A          ; R2 = address of the array A
        MOV R3,#A+4096     ; R3 = address of the first word beyond A
LOOP:   ADD R1,(R2)        ; register indirect through R2 to get operand
        ADD R2,#4          ; increment R2 by one word (4 bytes)
        CMP R2,R3          ; are we done yet?
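For comparison, here is the same summation written in C (an added illustration, not part of the original text); the pointer p plays the role of R2, and stepping it by one element corresponds to ADD R2,#4.

#include <stdint.h>

/* Sum a 1024-element array the way the assembly loop above does:
 * walk a pointer (a "register holding an address") across the array. */
int32_t sum_array(const int32_t a[1024])
{
    int32_t sum = 0;                       /* R1 */
    const int32_t *p   = a;                /* R2 = address of A        */
    const int32_t *end = a + 1024;         /* R3 = first word beyond A */
    while (p != end) {
        sum += *p;                         /* register indirect load   */
        p++;                               /* advance by one word      */
    }
    return sum;
}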

Indexed Addressing
• It is frequently useful to be able to reference memory words at a known offset from a register.
• Addressing memory by giving a register (explicit or implicit) plus a constant offset is called indexed addressing.
• Example: consider the following calculation:
• We have two one-dimensional arrays of 1024 words each, A and B, and we wish to compute A[i] AND B[i] for all the pairs and then OR these 1024 Boolean products together to see if there is at least one nonzero pair in the set.
• Here is the assembly program.
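The assembly listing referred to in the last bullet does not appear in this copy of the text. As a stand-in, here is a small C sketch (added for illustration; not the original program) of the same calculation: AND each pair A[i] and B[i], OR the results together, and check whether anything was nonzero.

#include <stdint.h>
#include <stdbool.h>

/* Is there at least one index i for which A[i] AND B[i] is nonzero? */
bool any_nonzero_pair(const uint32_t A[1024], const uint32_t B[1024])
{
    uint32_t accum = 0;
    for (int i = 0; i < 1024; i++) {
        accum |= A[i] & B[i];   /* indexed access: base address plus offset i */
    }
    return accum != 0;
}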

Based-Indexed Addressing

• Some machines have an addressing mode in which the memory address is computed by adding up two registers plus an (optional) offset. • Sometimes this mode is called based-indexed addressing.


• One of the registers is the base and the other is the index.
• Such a mode would have been useful in our example here.
• Outside the loop we could have put the address of A in R5 and the address of B in R6.
• Then we could have replaced the instruction at LOOP and its successor with

LOOP:   MOV R4,(R2+R5)
        AND R4,(R2+R6)

An instruction set is (a list of) all instructions, and all their variations, that a processor can execute. Instructions include:
· arithmetic such as add and subtract
· logic instructions such as and, or, and not
· data instructions such as move, input, output, load, and store

An instruction set, or instruction set architecture (ISA), is the part of the computer architecture related to programming, including the native data types, instructions, registers, addressing modes, memory architecture, interrupt and exception handling, and external I/O. An ISA includes a specification of the set of opcodes (machine language), the native commands implemented by a particular CPU design. Instruction set architecture is distinguished from the microarchitecture, which is the set of processor design techniques used to implement the instruction set. Computers with different microarchitectures can share a common instruction set. For example, the Intel Pentium and the AMD Athlon implement nearly identical versions of the x86 instruction set, but have radically different internal designs. This concept can be extended to unique ISAs like TIMI (Technology-Independent Machine Interface) present in the IBM System/38 and IBM AS/400. TIMI is an ISA that is implemented as low-level software and functionally resembles what is now referred to as a virtual machine. It was designed to increase the longevity of the platform and applications written for it, allowing the entire platform to be moved to very different hardware without having to modify any software except that which comprises TIMI itself. This allowed IBM to move the AS/400 platform from an older CISC architecture to the newer POWER architecture without having to rewrite any parts of the OS or software associated with it. Instruction set design When designing microarchitectures, engineers use Register Transfer Language (RTL) to define the operation of each instruction of an ISA. Historically there have been 4 ways to store that description inside the CPU:


· All early computer designers, and some of the simpler later RISC computer designers, hard-wired the instruction set.
· Many CPU designers compiled the instruction set to a microcode ROM inside the CPU (such as the Western Digital MCP-1600).
· Some CPU designers compiled the instruction set to a writable RAM or FLASH inside the CPU (such as the Rekursiv processor and the Imsys Cjip)[1], or a FPGA (reconfigurable computing).

An ISA can also be emulated in software by an interpreter. Due to the additional translation needed for the emulation, this is usually slower than directly running programs on the hardware implementing that ISA. Today, it is common practice for vendors of new ISAs or microarchitectures to make software emulators available to software developers before the hardware implementation is ready.
Some instruction set designers reserve one or more opcodes for some kind of software interrupt. For example, MOS Technology 6502 uses 0x00 (all zeroes), Zilog Z80 uses 0xFF (all ones),[1] and Motorola 68000 has instructions 0xA000 through 0xAFFF.
Fast virtual machines are much easier to implement if an instruction set meets the Popek and Goldberg virtualization requirements. On systems with multiple processors, non-blocking synchronization algorithms are much easier to implement if the instruction set includes support for something like "fetch-and-increment" or "load linked/store conditional (LL/SC)" or "atomic compare and swap".

Code density

In early computers, program memory was expensive and limited, and minimizing the size of a program in memory was important. Thus the code density -- the combined size of the instructions needed for a particular task -- was an important characteristic of an instruction set. Instruction sets with high code density employ powerful instructions that can implicitly perform several functions at once.


operation, such as an "add" of two registers or the "load" of a memory location into a register.
Minimal instruction set computers (MISC) are a form of stack machine, where there are few separate instructions (16-64), so that multiple instructions can be fit into a single machine word. These types of cores often take little silicon to implement, so they can be easily realized in an FPGA or in a multi-core form. Code density is similar to RISC; the increased instruction density is offset by requiring more of the primitive instructions to do a task.
Instruction sets may be categorized by the number of operands in their most complex instructions. (In the examples that follow, a, b, and c refer to memory addresses, and reg1 and so on refer to machine registers.)
· 0-operand ("zero address machines") -- these are also called stack machines, and all operations take place using the top one or two positions on the stack. Adding two numbers here can be done with four instructions: push a, push b, add, pop c;
· 1-operand -- this model was common in early computers, and each instruction performs its operation using a single operand and places its result in a single accumulator register: load a, add b, store c;
· 2-operand -- most RISC machines fall into this category, though many CISC machines fall here as well. For a RISC machine (requiring explicit memory loads), the instructions would be: load a,reg1, load b,reg2, add reg1,reg2, store reg2;
· 3-operand -- some CISC machines, and a few RISC machines, fall into this category. The above example here might be performed in a single instruction in a machine with memory operands: add a,b,c, or more typically (most machines permit a maximum of two memory operations even in three-operand instructions): move a,reg1, add reg1,b,c. In three-operand RISC machines, all three operands are typically registers, so explicit load/store instructions are needed. An instruction set with 32 registers requires 15 bits to encode three register operands, so this scheme is typically limited to instruction sets with 32-bit instructions or longer;
· more operands -- some CISC machines permit a variety of addressing modes that allow more than 3 register-based operands for memory accesses.

There has been research into executable compression as a mechanism for improving code density. The mathematics of Kolmogorov complexity describes the challenges and limits of this.

Machine language


Machine language is built up from discrete statements or instructions. Depending on the processing architecture, a given instruction may specify:
· Particular registers for arithmetic, addressing, or control functions
· Particular memory locations or offsets
· Particular addressing modes used to interpret the operands

More complex operations are built up by combining these simple instructions, which (in a von Neumann machine) are executed sequentially, or as otherwise directed by control flow instructions. Some operations available in most instruction sets include:
· moving
  · set a register (a temporary "scratchpad" location in the CPU itself) to a fixed constant value
  · move data from a memory location to a register, or vice versa. This is done to obtain the data to perform a computation on it later, or to store the result of a computation.
  · read and write data from hardware devices
· computing
  · add, subtract, multiply, or divide the values of two registers, placing the result in a register
  · perform bitwise operations, taking the conjunction/disjunction (and/or) of corresponding bits in a pair of registers, or the negation (not) of each bit in a register
  · compare two values in registers (for example, to see if one is less, or if they are equal)
· affecting program flow
  · jump to another location in the program and execute instructions there
  · jump to another location if a certain condition holds
  · jump to another location, but save the location of the next instruction as a point to return to (a call)


Some computers include "complex" instructions in their instruction set. A single "complex" instruction does something that may take many instructions on other computers. Such instructions are typified by instructions that take multiple steps, control multiple functional units, or otherwise appear on a larger scale than the bulk of simple instructions implemented by the given processor. Some examples of "complex" instructions include:
· saving many registers on the stack at once
· moving large blocks of memory
· complex and/or floating-point arithmetic (sine, cosine, square root, etc.)
· performing an atomic test-and-set instruction
· instructions that combine ALU with an operand from memory rather than a register

A complex instruction type that has become particularly popular recently is the SIMD (Single-Instruction Stream, Multiple-Data Stream) operation, or vector instruction, an operation that performs the same arithmetic operation on multiple pieces of data at the same time. SIMD instructions can manipulate large vectors and matrices in minimal time, and they allow easy parallelization of algorithms commonly involved in sound, image, and video processing. Various SIMD implementations have been brought to market under trade names such as MMX, 3DNow! and AltiVec.

The design of instruction sets is a complex issue. There have been two stages in the history of the microprocessor. The first used CISC (complex instruction set computer) designs, in which many instructions were implemented. In the 1970s, places like IBM did research and found that many of those instructions could be eliminated. The result was the RISC (reduced instruction set computer) architecture, which uses a smaller set of instructions. A simpler instruction set may offer the potential for higher speeds, reduced processor size, and reduced power consumption; a more complex one may optimize common operations, improve memory/cache efficiency, or simplify programming.

List of ISAs

This list is far from comprehensive as old architectures are abandoned and new ones invented on a continual basis. There are many commercially available microprocessors and microcontrollers implementing ISAs in all shapes and sizes. Customised ISAs are also quite common in some applications, e.g. ARC International, application-specific integrated circuit, FPGA, and reconfigurable computing. Also see history of computing hardware.


ISAs commonly implemented in hardware
· Alpha AXP (DEC Alpha)
· AMD64 (also known as EM64T)
· ARM (Acorn RISC Machine) (Advanced RISC Machine, now ARM Ltd)
· IA-64 (Itanium)
· MIPS
· Motorola 68k
· PA-RISC (HP Precision Architecture)
· IBM 700/7000 series
· IBM POWER
· PowerPC
· SPARC
· SuperH
· System/360
· Tricore (Infineon)
· Transputer (STMicroelectronics)
· UNIVAC 1100/2200 series
· VAX (Digital Equipment Corporation)
· x86 (also known as IA-32) (Pentium, Athlon)
· EISC (AE32K)

ISAs commonly implemented in software with hardware incarnations
· p-Code (UCSD p-System Version III on Western Digital Pascal MicroEngine)
· Java virtual machine (ARM Jazelle, PicoJava, JOP)
· FORTH


ISAs never implemented in hardware
· ALGOL object code
· SECD machine, a virtual machine used for some functional programming languages
· MMIX, a teaching machine used in Donald Knuth's The Art of Computer Programming
· Z-machine, a virtual machine used for Infocom's text adventure games

Categories of ISA
· application-specific integrated circuit (ASIC) fully custom ISA
· CISC
· digital signal processor
· graphics processing unit
· MISC
· reconfigurable computing
· RISC
· vector processor
· VLIW
· orthogonal instruction set

Examples of commercially available ISA
· central processing unit
· microcontroller
· microprocessor

Processor Register

In computer architecture, a processor register is a small amount of very fast computer memory used to speed the execution of computer programs by providing quick access to commonly used values—typically, the values being calculated at a given point in time. Most, but not all, modern computer architectures operate on the principle


of moving data from main memory into registers, operating on them, then moving the result back into main memory—a so-called load-store architecture. Processor registers are the top of the memory hierarchy, and provide the fastest way for the system to access data. The term is often used to refer only to the group of registers that can be directly indexed for input or output of an instruction, as defined by the instruction set. More properly, these are called the "architectural registers". For instance, the x86 instruction set defines a set of eight 32-bit registers, but a CPU that implements the x86 instruction set will contain many more registers than just these eight. Putting frequently used variables into registers is critical to the program's performance. This action, namely register allocation, is usually done by a compiler in the code generation phase.

Categories of registers

Registers are normally measured by the number of bits they can hold, for example, an "8-bit register" or a "32-bit register". Registers are now usually implemented as a register file, but they have also been implemented using individual flip-flops, high speed core memory, thin film memory, and other ways in various machines. There are several classes of registers according to the content:

· Data registers are used to store integer numbers (see also Floating Point Registers, below). In some older and simple current CPUs, a special data register is the accumulator, used implicitly for many operations.
· Address registers hold memory addresses and are used to access memory. In some CPUs, a special address register is an index register, although often these hold numbers used to modify addresses rather than holding addresses.
· Conditional registers hold truth values often used to determine whether some instruction should or should not be executed.
· General purpose registers (GPRs) can store both data and addresses, i.e., they are combined Data/Address registers.
· Floating point registers (FPRs) are used to store floating point numbers in many architectures.
· Constant registers hold read-only values (e.g., zero, one, pi, ...).
· Vector registers hold data for vector processing done by SIMD instructions (Single Instruction, Multiple Data).
· Special purpose registers hold program state; they usually include the program counter (aka instruction pointer), stack pointer, and status register (aka processor status word).
· Instruction registers store the instruction currently being executed.
· Index registers are used for modifying operand addresses during the run of a program.
· In some architectures, model-specific registers (also called machine-specific registers) store data and settings related to the processor itself. Because their meanings are attached to the design of a specific processor, they cannot be expected to remain standard between processor generations.
· Registers related to fetching information from random access memory, a collection of storage registers located on separate chips from the CPU (unlike most of the above, these are generally not architectural registers):
  · Memory buffer register
  · Memory data register
  · Memory address register
  · Memory Type Range Registers

Hardware registers are similar, but occur outside CPUs.

Some examples

The table below shows the number of registers of several mainstream processors:

    Processors         Integer registers    Double FP registers
    Pentium 4          8                    8
    Athlon MP          8                    8
    Opteron 240        16                   16
    Itanium 2          128                  128
    UltraSPARC IIIi    32                   32
    Power 3            32                   32

Addressing modes, a concept from computer science, are an aspect of the instruction set architecture in most central processing unit (CPU) designs. The various addressing modes that are defined in a given instruction set architecture define how machine language instructions in that architecture identify the operand (or operands) of each instruction. An addressing mode specifies how to calculate the effective memory address of an operand by using information held in registers and/or constants contained within a machine instruction or elsewhere. In computer programming, addressing modes are primarily of interest to compiler writers and to those who write code directly in assembly language. Caveats Note that there is no generally accepted way of naming the various addressing modes. In particular, different authors and/or computer manufacturers may give different names to the same addressing mode, or the same names to different addressing modes. Furthermore, an addressing mode which, in one given architecture, is treated as a single addressing mode may represent functionality that, in another architecture, is covered by two or more addressing modes. For example, some complex instruction set computer (CISC) computer architectures, such as the Digital Equipment Corporation (DEC) VAX, treat registers and literal/immediate constants as just another addressing mode. Others, such as the IBM System/390 and most reduced instruction set computer (RISC) designs, encode this information within the instruction code. Thus, the latter machines have three distinct instruction codes for copying one register to another, copying a literal constant into a register, and copying the contents of a memory location into a register, while the VAX has only a single "MOV" instruction. The addressing modes listed below are divided into code addressing and data addressing. Most computer architectures maintain this distinction, but there are, or have been, some architectures which allow (almost) all addressing modes to be used in any context. The instructions shown below are purely representative in order to illustrate the addressing modes, and do not necessarily apply to any particular computer.


Useful side effect Some computers have a Load effective address instruction. This performs a calculation of the effective operand address, but instead of acting on that memory location, it loads the address that would have been accessed into a register. This can be useful when passing the address of an array element to a subroutine. It may also be a slightly sneaky way of doing more calculation than normal in one instruction; for example, use with the addressing mode 'base+index+offset' allows one to add two registers and a constant together in one instruction. How many address modes? Different computer architectures vary greatly as to the number of addressing modes they provide. At the cost of a few extra instructions, and perhaps an extra register, it is normally possible to use the simpler addressing modes instead of the more complicated modes. It has proven much easier to design pipelined CPUs if the only addressing modes available are simple ones. Most RISC machines have only about five simple addressing modes, while CISC machines such as the DEC VAX supermini have over a dozen addressing modes, some of which are quite complicated. The IBM System/360 mainframe had only three addressing modes; a few more have been added for the System/390. When there are only a few addressing modes, the particular addressing mode required is usually encoded within the instruction code (e.g. IBM System/390, most RISC). But when there are lots of addressing modes, a specific field is often set aside in the instruction to specify the addressing mode. The DEC VAX allowed multiple memory operands for almost all instructions and so reserved the first few bits of each operand specifier to indicate the addressing mode for that particular operand. Even on a computer with many addressing modes, measurements of actual programs indicate that the simple addressing modes listed below account for some 90% or more of all addressing modes used. Since most such measurements are based on code generated from high-level languages by compilers, this may reflect to some extent the limitations of the compilers being used. Simple addressing modes for code Absolute +----+------------------------------+ |jump| address | +----+------------------------------+ Effective address = address as given in instruction


Program relative

+------+-----+-----+----------------+
|jumpEQ| reg1| reg2|     offset     |    jump relative if reg1=reg2
+------+-----+-----+----------------+

Effective address = offset plus address of next instruction.

The offset is usually signed, in the range -32768 to +32767. This is particularly useful in connection with conditional jumps, because you usually only want to jump to some nearby instruction (in a high-level language most if or while statements are reasonably short). Measurements of actual programs suggest that an 8 or 10 bit offset is large enough for some 90% of conditional jumps. Another advantage of program-relative addressing is that the code may be position-independent, i.e. it can be loaded anywhere in memory without the need to adjust any addresses.

Register indirect

+-------+-----+
|jumpVia| reg |
+-------+-----+

Effective address = contents of specified register.

The effect is to transfer control to the instruction whose address is in the specified register. Such an instruction is often used for returning from a subroutine call, since the actual call would usually have placed the return address in a register.

Register

+------+-----+-----+-----+
| mul  | reg1| reg2| reg3|    reg1 := reg2 * reg3;
+------+-----+-----+-----+

This 'addressing mode' does not have an effective address, and is not considered to be an addressing mode on some computers. In this example, all the operands are in registers, and the result is placed in a register.

Base plus offset, and variations

+------+-----+-----+----------------+
| load | reg | base|     offset     |
+------+-----+-----+----------------+

Effective address = offset plus contents of specified base register.
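To show what "Effective address = offset plus contents of specified base register" means in practice, here is a small C sketch (added for illustration; the memory and register arrays are stand-ins, not any particular machine) that computes the effective address and performs the load:

#include <stdint.h>

#define MEM_SIZE 65536

static uint8_t  memory[MEM_SIZE];   /* simplified main memory   */
static uint32_t regs[16];           /* simplified register file */

/* Base plus offset: EA = contents of base register + signed offset.
 * With offset == 0 this degenerates to plain register indirect addressing. */
static uint32_t ea_base_plus_offset(int base_reg, int16_t offset)
{
    return regs[base_reg] + (int32_t)offset;
}

/* A "load" using that addressing mode: reg := mem[EA] (one byte, for brevity). */
static void load_byte(int reg, int base_reg, int16_t offset)
{
    uint32_t ea = ea_base_plus_offset(base_reg, offset) % MEM_SIZE;
    regs[reg] = memory[ea];
}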


The offset is usually a signed 16-bit value (though the 80386 is famous for expanding it to 32-bit, though x64 didn't). If the offset is zero, this becomes an example of register indirect addressing; the effective address is just that in the base register. On many RISC machines, register 0 is fixed with value 0. If register 0 is used as the base register, this becomes an example of absolute addressing. However, only a small portion of memory can be accessed (the first 32 Kbytes and possibly the last 32 Kbytes) The 16-bit offset may seem very small in relation to the size of current computer memories (which is why the 80386 expanded it to 32-bit. x64 didn't expand it, however.) (it could be worse: IBM System/360 mainframes only have a positive 12-bit offset 0 to 4095). However, the principle of locality of reference applies: over a short time span most of the data items you wish to access are fairly close to each other. Example 1: Within a subroutine you will mainly be interested in the parameters and the local variables, which will rarely exceed 64 Kbytes, for which one base register suffices. If this routine is a class method in an object-oriented language, you will need a second base register pointing at the attributes for the current object (this or self in some high level languages). Example 2: If the base register contains the address of a record or structure, the offset can be used to select a field from that record (most records/structures are less than 32 Kbytes in size). Simple addressing modes for data Immediate/literal +------+-----+-----+----------------+ | add | reg1| reg2| constant | reg1 := reg2 + constant; +------+-----+-----+----------------+ This 'addressing mode' does not have an effective address, and is not considered to be an addressing mode on some computers. The constant might be signed or unsigned. Instead of using an operand from memory, the value of the operand is held within the instruction itself. On the DEC VAX machine, the literal operand sizes could be 6, 8, 16, or 32 bits long.


Other addressing modes for code and/or data Absolute/Direct +------+-----+--------------------------------------+ | load | reg | address | +------+-----+--------------------------------------+ Effective address = address as given in instruction. This requires space in an instruction for quite a large address. It is often available on CISC machines which have variable length instructions. Some RISC machines have a special Load Upper Literal instruction which places a 16-bit constant in the top half of a register. An OR literal instruction can be used to insert a 16-bit constant in the lower half of that register, so that a full 32-bit address can then be used via the register-indirect addressing mode, which itself is provided as 'base-plus-offset' with an offset of 0. Indexed absolute +------+-----+-----+--------------------------------+ | load | reg |index| 32-bit address | +------+-----+-----+--------------------------------+ Effective address = address plus contents of specified index register. This also requires space in an instruction for quite a large address. The address could be the start of an array or vector, and the index could select the particular array element required. The index register may need to have been scaled to allow for the size of each array element. Note that this is more or less the same as base-plus-offset addressing mode, except that the offset in this case is large enough to address any memory location. Base plus index +------+-----+-----+-----+ | load | reg | base|index| +------+-----+-----+-----+ Effective address = contents of specified base register plus contents of specified index register. The base register could contain the start address of an array or vector, and the index could select the particular array element required. The index register may need to have been scaled to allow for the size of each array element. This could be used for accessing elements of an array passed as a parameter.


Base plus index plus offset +------+-----+-----+-----+----------------+ | load | reg | base|index| 16-bit offset | +------+-----+-----+-----+----------------+ Effective address = offset plus contents of specified base register plus contents of specified index register. The base register could contain the start address of an array or vector of records, the index could select the particular record required, and the offset could select a field within that record. The index register may need to have been scaled to allow for the size of each record. Scaled +------+-----+-----+-----+ | load | reg | base|index| +------+-----+-----+-----+ Effective address = contents of specified base register plus scaled contents of specified index register. The base register could contain the start address of an array or vector, and the index could contain the number of the particular array element required. This addressing mode dynamically scales the value in the index register to allow for the size of each array element, e.g. if the array elements are double precision floating-point numbers occupying 8 bytes each then the value in the index register is multiplied by 8 before being used in the effective address calculation. The scale factor is normally restricted to being a power of two so that shifting rather than multiplication can be used (shifting is usually faster than multiplication). Register indirect +------+-----+-----+ | load | reg | base| +------+-----+-----+ Effective address = contents of base register. A few computers have this as a distinct addressing mode. Many computers just use base plus offset with an offset value of 0.
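The scaling described above can be written out in C (an added sketch): the index is shifted by log2 of the element size, rather than multiplied, before being added to the base address.

#include <stdint.h>

/* Scaled indexed addressing: EA = base + (index << log2(element size)).
 * For 8-byte elements (e.g. double precision floats) the shift amount is 3. */
static uint32_t ea_scaled(uint32_t base, uint32_t index, unsigned log2_size)
{
    return base + (index << log2_size);
}

/* Example: element 7 of an array of 8-byte values starting at 0x1000
 * lives at 0x1000 + 7*8 = 0x1038. */
static uint32_t example(void)
{
    return ea_scaled(0x1000u, 7u, 3u);
}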


Register autoincrement indirect +------+-----+-----+ | load | reg | base| +------+-----+-----+ Effective address = contents of base register. After determining the effective address, the value in the base register is incremented by the size of the data item that is to be accessed. Within a loop, this addressing mode can be used to step through all the elements of an array or vector. A stack can be implemented by using this in conjunction with the next addressing mode (autodecrement). In high-level languages it is often thought to be a good idea that functions which return a result should not have side effects (lack of side effects makes program understanding and validation much easier). This instruction has a side effect in that the base register is altered. If the subsequent memory access causes a page fault then restarting the instruction becomes much more problematical. This side-effects business proved to be something of a nightmare for VAX implementors, since instructions could have up to 6 operands, each of which could cause side-effects on registers and each of which could each cause 2 page faults (if operands happened to straddle a page boundary). Of course the instruction itself could be over 50 bytes long and might straddle a page boundary as well! Autodecrement register indirect +------+-----+-----+ | load | reg | base| +------+-----+-----+ Before determining the effective address, the value in the base register is decremented by the size of the data item which is to be accessed. Effective address = new contents of base register. Within a loop, this addressing mode can be used to step backwards through all the elements of an array or vector. A stack can be implemented by using this in conjunction with the previous addressing mode (autoincrement). See also the discussion on side-effects under the autoincrement addressing mode. Memory indirect Any of the addressing modes mentioned in this article could have an extra bit to indicate indirect addressing, i.e. the address calculated by using some addressing mode


Memory indirect

Any of the addressing modes mentioned in this article could have an extra bit to indicate indirect addressing, i.e. the address calculated by using some addressing mode is the address of a location (often 32 bits or a complete word) which contains the actual effective address.

Indirect addressing may be used for code and/or data. It can make implementation of pointers or references very much easier, and can also make it easier to call subroutines which are not otherwise addressable. There is a performance penalty due to the extra memory access involved. Some early minicomputers (e.g. DEC PDP8, Data General Nova) had only a few registers and only a limited addressing range (8 bits). Hence the use of memory indirect addressing was almost the only way of referring to any significant amount of memory.

PC-based addressing

The x86-64 architecture supports RIP-based addressing, which uses the 64-bit program counter (instruction pointer) RIP as a base register. This allows for position-independent code.

Obsolete addressing modes

The addressing modes listed here were used in the 1950-1980 time frame, but most are no longer available on current computers. This list is by no means complete; there have been lots of other interesting/peculiar addressing modes used from time to time, e.g. absolute plus logical OR of 2 or 3 index registers.

Multi-level memory indirect

If the word size is larger than the address size, then the word referenced for memory-indirect addressing could itself have an indirect flag set to indicate another memory indirect cycle. Care is needed to ensure that a chain of indirect addresses does not refer to itself; if it did, you could get an infinite loop while trying to resolve an address. The DEC PDP-10 computer with 18-bit addresses and 36-bit words allowed multi-level indirect addressing with the possibility of using an index register at each stage as well.

Memory-mapped registers

On some computers the registers were regarded as occupying the first 8 or 16 words of memory (e.g. ICL 1900, DEC PDP-10). This meant that there was no need for a separate 'Add register to register' instruction - you could just use the 'Add memory to register' instruction.


In the case of early models of the PDP-10, which did not have any cache memory, you could actually load a tight inner loop into the first few words of memory (the fast registers in fact), and have it run much faster than if it was in magnetic core memory. Later models of the DEC PDP-11 series mapped the registers onto addresses in the input/output area, but this was primarily intended to allow remote diagnostics. Confusingly, the 16-bit registers were mapped onto consecutive 8-bit byte addresses.

Memory indirect, auto inc/dec

On some early minicomputers (e.g. DEC PDP8, Data General Nova), there were typically 16 special memory locations. When accessed via memory indirect addressing, 8 would automatically increment after use and 8 would automatically decrement after use. This made it very easy to step through memory in loops without using any registers.

Zero page

In the MOS Technology 6502 the first 256 bytes of memory could be accessed very rapidly. The reason was that the 6502 was lacking in registers which were not special function registers. To use zero page access an 8-bit address would be used, saving one clock cycle as compared with using a 16-bit address. An Operating System would use much of zero page, so it was not as useful as it might have seemed.

Scaled index with bounds checking

This is similar to scaled index addressing, except that the instruction has two extra operands (typically constants), and the hardware would check that the index value was between these bounds. Another variation uses vector descriptors to hold the bounds; this makes it easy to implement dynamically allocated arrays and still have full bounds checking.

Register indirect to byte within word

The DEC PDP-10 computer used 36-bit words. It had a special addressing mode which allowed memory to be treated as a sequence of bytes (bytes could be any size from 1 bit to 36 bits). A 1-word sequence descriptor held the current word address within the sequence, a bit position within a word, and the size of each byte. Instructions existed to load and store bytes via this descriptor, and to increment the descriptor to point at the next byte (bytes were not split across word boundaries). Much DEC software used five 7-bit bytes per word (plain ASCII characters), with 1 bit unused per word. Implementations of C had to use four 9-bit bytes per word, since C assumes that you can access every bit of memory by accessing consecutive bytes.
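The multi-level memory indirect mode described above raises the danger of an address chain that loops back on itself. The following sketch is one way such a chain could be resolved safely; memory is modelled as a small array, and the flag bit, mask and depth limit are made up for the example.

#include <stdio.h>

#define MEM_WORDS  16
#define INDIRECT   0x8000u   /* assumed flag bit: "follow this address again" */
#define ADDR_MASK  0x7FFFu   /* assumed address field within each word        */
#define MAX_DEPTH  8         /* guard against a circular chain                */

static unsigned memory[MEM_WORDS];

/* Follow indirect words until one without the indirect flag is found,
   or give up after MAX_DEPTH steps. Returns the final effective address,
   or -1 if the chain is too long (possibly circular). */
static int resolve(unsigned addr) {
    for (int depth = 0; depth < MAX_DEPTH; depth++) {
        unsigned word = memory[addr % MEM_WORDS];   /* keep the toy address in range */
        if (!(word & INDIRECT))
            return (int)(word & ADDR_MASK);         /* this word holds the effective address */
        addr = word & ADDR_MASK;                    /* follow one more level of indirection  */
    }
    return -1;                                      /* chain too deep - possibly circular */
}

int main(void) {
    memory[0] = INDIRECT | 3;   /* location 0 points indirectly at location 3 */
    memory[3] = 9;              /* location 3 holds the final address, 9      */
    printf("effective address = %d\n", resolve(0));
    return 0;
}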


Lesson IX: Memory System Enhancement

Characteristics of Main Memory

Main memory is as vital as the processor chip to a computer system. Fast systems have both a fast processor and a large memory. Here is a list of some characteristics of computer memory. Some characteristics are true for both kinds of memory; others are true for just one. The table below summarizes the characteristics of the two types of computer memory.

Characteristic                                                  True for        True for
                                                                Main Memory     Secondary Memory
-------------------------------------------------------------------------------------------------
Closely connected to the processor.                                  X
Holds the programs and data the processor is actively using.        X
Used for long term storage.                                                           X
Interacts with processor millions of times per second.              X
Contents is easily changed.                                          X                X
Relatively low capacity.                                             X
Relatively huge capacity.                                                             X
Fast access.                                                         X
Slow access.                                                                          X
Connected to main memory.                                                             X
Holds programs and data.                                             X                X
Organized into files.                                                                 X

Bit


In both main and secondary memory, information is stored as patterns of bits. Recall from chapter two what a bit is: a bit is a single "on"/"off" value. Only these two values are possible. The two values may go by different names, such as "true"/"false", or "1"/"0". There are many ways in which a bit can be implemented. For example, a bit could be implemented as:

· A mechanical electrical switch (like a light switch.)
· Voltage on a wire.
· A single transistor (used in main memory).
· A tiny part of the surface of a magnetic disk.
· A tiny part of the surface of a magnetic tape.
· A hole punched in a card.
· A tiny part of the light-reflecting surface of a CD.
· Part of a radio signal.
· Many, many more ways.

So the particular implementation of bits is different in main memory and secondary memory, but logically, both types of memory store bits.

Copied Information

Information stored in binary form does not change when it is copied from one medium (storage method) to another. And an unlimited number of such copies can be made (remember the advantages of binary.) This is a very powerful combination. You may be so accustomed to this that it seems commonplace. But when you (say) download an image from the Internet, the data has been copied many dozens of times, using a variety of storage and transmission methods. It is likely, for example, that the data starts out on magnetic disk and is then copied to main storage of the web site's computer (involving a voltage signal in between.) From main storage it is copied (again with a voltage signal in between) to a network interface card, which temporarily holds it in many transistors. From there it is sent as an electrical signal down a cable. Along the route to your computer, there may be dozens of computers that transform data from an electrical signal, into main memory transistor form, and then back to an electrical signal on another cable.


Your data may even be transformed into a radio signal, sent to a satellite (with its own computers), and sent back to earth as another radio signal. Eventually the data ends up as data in your video card (transistors), which transforms it into the image you see on your monitor.

Byte

Name         Number of Bytes           Power of 2
--------------------------------------------------
byte         1                         2^0
kilobyte     1024                      2^10
megabyte     1,048,576                 2^20
gigabyte     1,073,741,824             2^30
terabyte     1,099,511,627,776         2^40

One bit of information is so little that usually computer memory is organized into groups of eight bits. Each eight bit group is called a byte. When more than eight bits are required for some data, a whole number of bytes are used. One byte is about enough memory to hold a single character. Often very much more than eight bits are required for data, and thousands, millions, or even billions of bytes are needed. These amounts have names, as seen in the table. If you expect computers to be your career, it would be a good idea to become very familiar with this table. (Except that the only number you should remember from the middle column is that a kilobyte is 1024 bytes.) Often a kilobyte is called a "K" and a megabyte is called a "Meg."

Bytes, not Bits

The previous table listed the number of bytes, not bits. So one K of memory is 1024 bytes, or 1024*8 == 8,192 bits. Usually one is not particularly interested in the exact number of bits. It will be very useful in your future career to be sure you know how to multiply powers of two:

2^M * 2^N = 2^(M+N)

In the above, "*" means "multiplication." For example:

2^6 * 2^10 = 2^16

Exercise: Locations in a digital image are specified by a row number and a column number (both of them integers). A particular digital image is 1024 rows by 1024 columns, and each location holds one byte. How many megabytes are in that image?
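One way to check the arithmetic of the exercise above is to let a short program do it. The image holds 2^10 * 2^10 = 2^20 bytes, which is exactly one megabyte:

#include <stdio.h>

int main(void) {
    /* 2^M * 2^N = 2^(M+N): here 2^10 * 2^10 = 2^20 */
    unsigned long rows    = 1024;            /* 2^10 */
    unsigned long columns = 1024;            /* 2^10 */
    unsigned long bytes   = rows * columns;  /* one byte per location */

    printf("image size = %lu bytes\n", bytes);            /* 1048576            */
    printf("           = %lu megabyte(s)\n", bytes >> 20);/* 2^20 bytes = 1 MB  */
    return 0;
}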



Picture of Main Memory

Main memory consists of a very long list of bytes. In most modern computers, each byte has an address that is used to locate it. The picture shows a small part of main memory.

Each box in this picture represents a single byte. Each byte has an address. In this picture the addresses are the integers to the left of the boxes: 0, 1, 2, 3, 4, ... and so on. The addresses for most computer memory start at 0 and go up in sequence until each byte has an address.

Each byte contains a pattern of eight bits. When the computer's power is on, every byte contains some pattern or other, even those bytes not being used for anything. (Remember the nature of binary: when a binary device is working it is either "on" or "off", never in between.) The address of a byte is not part of its contents. When the processor needs to access the byte at a particular address, the electronics of the computer "knows how" to find that byte in memory.

Contents of Main Memory


Main memory (as all computer memory) stores bit patterns. That is, each memory location consists of eight bits, and each bit is either "0" or "1". For example, the picture shows the first few bytes of memory.

The only thing that can be stored at one memory location is eight bits, each with a value of "0" or "1". The bits at a memory location are called the contents of that location.

Sometimes people will say that each memory location holds an eight bit binary number. This is OK, as long as you remember that the "number" might be used to represent a character, or anything else. Remember that what a particular pattern represents depends on its context (i.e., how a program is using it.) You cannot look at an arbitrary bit pattern (such as those in the picture) and say what it represents.

Programs and Memory

The processor has written a byte of data at location 7. The old contents of that location are lost. Main memory now looks like the picture. When a program is running, it has a section of memory for the data it is using. Locations in that section can be changed as many times as the program needs. For example, if a program is adding up a list of numbers, the sum will be kept in main memory (probably using several bytes.) As new numbers are added to the sum, it will change and main memory will have to be changed, too. Other sections of main memory might not change at all while a program is running. For example, the instructions that make up a program do not (usually) change as a program is running. The instructions of a running program are located in main memory, so those locations will not change. When you write a program in Java (or most other languages) you do not need to keep track of memory locations and their contents. Part of the purpose of a programming language is to do these things automatically.


When a program is not running, its instructions and data are kept in secondary storage (usually the computer system's hard disk.)

Hard Disks

The hard disk of a computer system records bytes on a magnetic surface much like the surface of audio tape. The recording (writing) and reading of the data is done with a read/write head similar to that used with audio tape.

The picture shows one disk and one read/write head at the end of a movable arm. The arm moves in and out along a radius of the disk. Since the disk is rotating, it will record data in a circular track on the disk. Later on, to read the data, the head must be moved to the right position, then it must wait until the rotating disk brings the data into position. Just as with audio tape, data can be read without changing it. When new data is recorded, it replaces any data that was previously recorded at that location. Unlike audio tape, the read/write head does not actually touch the disk but skims just a little bit above it.

Usually the component called the "hard disk" of a computer system contains many individual disks and read/write heads like the above. The disks are coated with magnetic material on both sides (so each disk gets two read/write heads) and the disks are all attached to one spindle. All the disks and heads are sealed into a dust-free metal can. Since the operation of a hard disk involves mechanical motion (which is much slower than electronic processes), reading and writing data is much slower than with main memory.

Files

Hard disks (and other secondary memory devices) are used for long-term storage of large blocks of information, such as programs and data sets. Usually disk memory is organized into files. A file is a collection of information that has been given a name and is stored in secondary memory. The information can be a program or can be data. The form of the information in a file is the same as with any digital information---it consists of bits, usually grouped into eight bit bytes. Files are frequently quite large; their size is measured in kilobytes or megabytes.


If you have never worked with files on a computer before you should study the documentation that came with your operating system, or look at a book such as Windows NT for Dummies (or whatever is appropriate for your computer.) One of the jobs of a computer's operating system is to keep track of file names and where they are on its hard disk. For example, in DOS the user can ask to run the program DOOM like this:

C:\> DOOM.EXE

The "C:\>" is a prompt; the user typed in "DOOM.EXE". The operating system now has to find the file called DOOM.EXE somewhere on its hard disk. The program will be copied into main storage and will start running. As the program runs it asks for information stored as additional files on the hard disk, which the operating system has to find and copy into main memory. When a program calculates new data that it wants to keep, it usually saves the data in a file in secondary storage. If the file does not already exist, the program will ask the operating system to create it.

Files and the Operating System

Usually all collections of data outside of main storage are organized into files. The job of keeping all this information straight is the job of the operating system. If the computer system is part of a network, keeping straight all the files on all the computers can be quite a task, and is the collective job of all the operating systems involved. Application programs (including programs that you might write) do not directly read, write, create, or delete files. Since the operating system has to keep track of everything, all other programs ask it to do file manipulation tasks. For example, say that a program has just calculated a set of numbers and needs to save them. The following might be how it does this:

1. Program: asks the operating system to create a file with the name RESULTS.DAT.
2. Operating System: gets the request; finds an unused section of the disk and creates an empty file. The program is told when this has been completed.
3. Program: asks the operating system to save the numbers in the file.
4. Operating System: gets the numbers from the program's main memory and writes them to the file. The program is told when this has been completed.
5. Program: continues on with whatever it is doing.

So when an application program is running, it is constantly asking the operating system to perform file manipulation tasks (and other tasks) and waiting for them to be completed. If a program asks the operating system to do something that will damage the file system, the operating system will refuse to do it.
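The five-step exchange above looks like this in miniature in C. The standard library calls below in turn ask the operating system to create the file and write the data; the file name RESULTS.DAT comes from the example, and the numbers are made up.

#include <stdio.h>

int main(void) {
    double results[3] = { 3.14, 2.72, 1.62 };   /* newly calculated data (made-up values) */

    /* Steps 1-2: ask the operating system (via the C library) to create the file. */
    FILE *fp = fopen("RESULTS.DAT", "wb");
    if (fp == NULL) {
        perror("could not create RESULTS.DAT"); /* the operating system refused the request */
        return 1;
    }

    /* Steps 3-4: ask the operating system to copy the numbers from main memory
       into the file. The program waits until the call returns.                 */
    fwrite(results, sizeof(double), 3, fp);

    /* Step 5: continue with whatever else the program is doing. */
    fclose(fp);
    return 0;
}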


Modern programs are written so that they have alternatives when a request is refused. Older programs were not written this way, and do not run well on modern computers. In modern computer systems, only the operating system can directly do anything with disk files. How does this:

1. affect the security of the system?

· The security is increased because programs that try to do dangerous or stupid things to files can't. They have to ask the operating system, which will only do safe and sensible things.

2. affect computer games?

· Older computer games did their file manipulation themselves without asking the operating system (old DOS was stupid and slow and many application programs ignored it.) Those games won't run on modern computers.

3. affect the ease in creating programs?

· Program creation is easier because the work of dealing with files is done by the operating system.

Types of Files

As far as the hard disk is concerned, all files are the same. At the electronic level, there is no difference between a file containing a program and a file containing data. All files are named collections of bytes. Of course, what the files are used for is different. The operating system can take a program file, copy it into main memory, and start it running. The operating system can take a data file, and supply its information to a running program when it asks. Often the last part of a file's name (the extension) shows what the file is expected to be used for. For example, in "mydata.txt" the ".txt" means that the file is expected to be used as a collection of text, that is, characters. With "Netscape.exe" the ".exe" means that the file is an "executable," that is, a program that is ready to run. With "program1.java" the ".java" means that the file is a source program in the language java (there will be more about source programs later on in these notes.) To the hard disk, each of these files is the same sort of thing: a collection of bytes.

Physical Address Extension

In computing, Physical Address Extension (PAE) refers to a feature of x86 processors that allows for up to 64 gigabytes of physical memory to be used in 32-bit systems, given appropriate operating system support.


PAE is provided by Intel Pentium Pro and above CPUs (including all later Pentium-series processors except the 400 MHz bus versions of the Pentium M), as well as by some compatible processors such as those from AMD. The CPUID flag PAE is assigned for the purpose of identifying CPUs with this capability. The processor hardware is augmented with additional address lines used to select the additional memory, and 36-bit page tables, but regular application software continues to use instructions with 32-bit addresses and a flat memory model limited to 4 gigabytes. The operating system uses PAE to map this 32-bit address space onto the 64 gigabytes of total memory, and the map can be, and usually is, different for each process. In this way the extra memory is useful even though regular applications cannot access it all simultaneously. For application software which needs access to more than 4 gigabytes of memory, some special mechanism may be provided by the operating system in addition to the regular PAE support. On Microsoft Windows this mechanism is called Address Windowing Extensions (AWE), while on Unix systems a variety of tricks are used, such as using mmap() to map regions of a file into and out of the address space as needed, none having been blessed as a standard.

Page table structures

In traditional 32-bit protected mode, x86 processors use a two-level page translation scheme, where the register CR3 points to a single 4K-long page directory, which is divided into 1024 4-byte entries that point to 4K-long page tables, similarly consisting of 1024 4-byte entries pointing to 4K-long pages. Enabling PAE (by setting bit 5, PAE, of the system control register CR4) causes major changes to this scheme. By default, the size of each page remains 4K. Each entry in the page table and page directory is extended to 64 bits (8 bytes) rather than 32 to allow for additional address bits; the table size does not change, however, so each table now has only 512 entries. Because this allows only a quarter as many entries as the original scheme, an extra level of hierarchy must be added, so CR3 now points to the Page Directory Pointer Table, a short table which contains pointers to 4 page directories. Additionally, the entries in the page directory have an additional flag, named 'PS' (for Page Size). If this bit (bit 7) is set to 1, the page directory entry does not point to a page table, but to a single large page (2MB in length).
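Putting those field widths together, a 32-bit linear address under PAE is split into a 2-bit Page Directory Pointer Table index, a 9-bit page directory index, a 9-bit page table index, and a 12-bit offset within the 4K page. The sketch below only demonstrates that bit-slicing on an arbitrary example address; it does not touch real page tables.

#include <stdio.h>

/* Split a 32-bit linear address the way PAE-enabled paging does:
   2-bit PDPT index | 9-bit directory index | 9-bit table index | 12-bit offset */
int main(void) {
    unsigned int linear = 0xC0ABCDEF;                   /* arbitrary example address */

    unsigned int offset   =  linear        & 0xFFF;     /* bits 0-11  */
    unsigned int table_ix = (linear >> 12) & 0x1FF;     /* bits 12-20 */
    unsigned int dir_ix   = (linear >> 21) & 0x1FF;     /* bits 21-29 */
    unsigned int pdpt_ix  = (linear >> 30) & 0x3;       /* bits 30-31 */

    printf("PDPT entry %u, directory entry %u, table entry %u, offset 0x%03X\n",
           pdpt_ix, dir_ix, table_ix, offset);
    return 0;
}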


Lesson X: Control Unit Enhancement

A speed enhancement technique for CMOS circuits is disclosed. In the series of logic stages, nodes in the signal path of a pulse are set by preceding logic stages, then reset by feedback from subsequent logic stages. This eliminates the capacitive burden of resetting any given node from the input signal to allow substantially all of the input signal to be employed in setting the nodes to an active state rather than wasting part of the signal in turning off the reset path. The technique is illustrated as applied to RAM circuits.

Claims

I claim: 1. A circuit comprising: a plurality of cascaded stages, each stage being capable of being placed in one of a set state or a reset state, wherein the set state of a particular stage is established by using a majority of charge supplied from an immediately preceding stage, and the reset state of the particular stage is established by using a minority of charge supplied directly from a subsequent stage. 2. A circuit as in claim 1 wherein each stage comprises at least one PMOS transistor and at least one NMOS transistor. 3. A circuit as in claim 2 wherein each stage further comprises an additional transistor. 4. A circuit as in claim 3 wherein the additional transistor is coupled to the subsequent stage to enable reset of the particular stage. 5. A circuit as in claim 2 wherein the PMOS and NMOS transistors are interconnected to form an inverter. 6. A circuit as in claim 5 wherein the PMOS and NMOS transistors include gates commonly connected.


7. A circuit as in claim 1 wherein the circuit is coupled between a first and a second potential source, and a selected number of the stages each comprises: an input node; an output node; a first transistor connected to the input node, the output node, and to one of the first and second potential sources of connecting the output node to one of the first and second potential sources in response to a first type signal on the input node; and a second transistor connected to the output node, to the other of the first and second potential sources, and to the following stage, for connecting the output node to the other of the first and second potential sources in delayed response to the earlier first type signal on the input node. 8. A circuit as in claim 7 wherein for each of the selected number of stages: the first transistor has a gate connected to the input node, a source connected to the second potential source, and a drain connected to the output node; and the second transistor has a gate connected to an output node of the following stage, a drain connected to the output node, and a source connected to the first potential source. 9. A circuit as in claim 8 wherein for each of the selected number of stages: the first transistor comprises an NMOS transistor; and the second transistor comprises a PMOS transistor. 10. A circuit as in claim 9 wherein for each of the selected number of stages, each stage further comprises: a third transistor connected to the output node, to the other of the first and second potential sources and to the input node, for connecting the output node to the other of the first and second potential in response to a second type signal on the input node. 11. A circuit as in claim 10 wherein for each of the selected number of stages, each stage further comprises: the third transistor has a source connected to the other of the first and second potential sources, a drain connected to the output node, and a gate connected to the input node. 12. A circuit as in claim 1 wherein the input node of the particular stage is connected to the output node of the stage immediately preceding the particular stage. 13. A circuit as in claim 12 wherein the subsequent stage is an even number of stages after the particular stage. 14. A CMOS circuit comprising: a plurality of serially-connected stages, each stage including an NMOS and a PMOS transistor; odd-numbered stages including an NMOS transistor which is more than one-half the size of the PMOS transistor in that stage; and even-numbered stages including a PMOS transistor which is more than twice the size of the NMOS transistor in that stage.


15. A CMOS circuit as in claim 14 wherein: each stage includes an input node and an output node, the input node of a stage being connected to the output node of a preceding stage; wherein the input node of an odd-numbered stage is connected to control the PMOS transistor, the input node of an even-numbered stage is connected to control the NMOS transistor; and wherein the NMOS transistor in an odd-numbered stage is connected to be controlled by the output node of a subsequent stage, and the PMOS transistor in an even-numbered stage is controlled by the output node of a subsequent stage. 16. A circuit as in claim 15 wherein: the stages are capable of being placed in a first logic stage or a second logic state; the NMOS transistor in the odd-numbered stages is coupled to a subsequent stage and controls the first logic state of the odd-numbered stages; and the PMOS transistor in the even-numbered stages is coupled to a subsequent stage and controls the first logic state of the even-numbered stages. 17. A circuit as in claim 16 wherein: the NMOS transistor in the even-numbered stages is coupled to a immediately preceding stage and controls the second logic state of the even-numbered stages; the PMOS transistor in the odd-numbered stages is coupled to an immediately preceding stage and controls the second logic stage of the odd-numbered stages. 18. A circuit as in claim 14 coupled between a lower potential and an upper potential wherein each stage comprises: an input node; an output node; a PMOS transistor having a gate connected to the input node, a source connected to the upper potential, and a drain connected to the output node; and an NMOS transistor having a gate connected to the input node, a source connected to the lower potential, and a drain connected to the output node. 19. A circuit comprising: a first stage; a plurality of cascaded stages; a last stage; each cascaded stage including set means and reset means; the set means for each particular one of the cascaded stages being coupled to and driven by a previous stage, and the reset means for each cascaded stage being coupled to and driven by a subsequent stage, wherein virtually all of the power available during the switching of the previous stage is available for driving the set means for the particular cascaded stage, thereby increasing the switching speed of the set means of the particular cascaded stage; and wherein a minor portion of the power available during the switching of the subsequent stage is used for driving the reset means of the particular cascaded stage, thereby accomplishing the reset of the particular cascaded stage without significantly altering the switching speed of the subsequent stage. 20. A logic circuit comprising: a first node for receiving an input signal having energy; a second node for supplying an output signal; a plurality of cascaded stages each having a control input node, a reset input node, and an output node, the control input node of a first stage of the plurality being connected to the first node, the output node of a last


stage of the plurality being connected to the second node; each said stage being capable of assuming a set state and a reset state, the set state for each stage being controlled by a signal on its control input node which for all stages except the first stage is coupled to the output node of an earlier stage, whereby most of the energy available from the input signal for the first cascaded stage is used for setting that cascaded stage and most of the energy available from each subsequent stage is used for setting the next stage thereafter; and the reset state for each stage being controlled by a signal on the reset input node supplied from an output node of a subsequent stage, whereby energy to reset each particular cascaded stage comes from a subsequent stage. 21. A logic circuit as in claim 20 wherein the set state and the reset state for each cascaded stage are controlled by logic switches whose conduction depends upon the state of the control input node for that stage. 22. A circuit as in claim 21 wherein the logic switches for each cascaded stage connected between the output node of that stage and a most positive potential source comprises PMOS transistors. 23. A circuit as in claim 21 wherein the logic switches for each cascaded stage connected between the output node for that stage and a most negative potential source comprise NMOS transistors. 24. A logic circuit as in claim 20 wherein the subsequent stage is an even number of stages following the particular stage. 25. The circuit as in claim 24 wherein the even number is four. 26. A circuit for providing control signals to other circuits comprising: a plurality of serially-connected stages, each capable of being placed in a set state or a reset state; wherein a majority of charge to switch a stage to a set state comes from a prior stage and a majority of charge to switch a stage to a reset state comes directly from a later stage. 27. A method of increasing the speed of operation of a CMOS circuit having multiple serially-connected stages comprising: providing a pulse having charge at an input node to a selected stage; using a majority of the charge of the pulse to place the selected stage in an active state; propagating the active state of the selected stage to later stages to thereby also place them in an active state; and using an output signal from one of the later stages connected directly to the selected stage to place the selected stage in a reset state to await arrival of another pulse.


The Hard-Wired Control Unit

Figure 2 is a block diagram showing the internal organization of a hard-wired control unit for our simple computer. Input to the controller consists of the 4-bit opcode of the instruction currently contained in the Instruction Register and the negative flag from the accumulator. The controller's output is a set of 16 control signals that go out to the various registers and to the memory of the computer, in addition to a HLT signal that is activated whenever the leading bit of the op-code is one. The controller is composed of the following functional units: a ring counter, an instruction decoder, and a control matrix.

Figure 2. A Block diagram of the Basic Computer's Hard-wired Control unit

The ring counter provides a sequence of six consecutive active signals that cycle continuously. Synchronized by the system clock, the ring counter first activates its T0 line, then its T1 line, and so forth. After T5 is active, the sequence begins again with T0. Figure 3 shows how the ring counter might be organized internally.

Figure 3. The Internal Organization of the Ring Counter


The instruction decoder takes its four-bit input from the op-code field of the instruction register and activates one and only one of its 8 output lines. Each line corresponds to one of the instructions in the computer's instruction set. Figure 4 shows the internal organization of this decoder.

Figure 4. The Internal Organization of the Hard-wired Instruction Decoder

The most important part of the hard-wired controller is the control matrix. It receives input from the ring counter and the instruction decoder and provides the proper sequence of control signals. Figure 5 is a diagram of how the control matrix for our simple machine might be wired.

Figure 5. The Internal Organization of the Hard-wired Control Matrix


IP = T2
W  = T5*STA
LP = T3*JMP + T3*NF*JN
LD = T4*STA
LA = T5*LDA + T4*ADD + T4*SUB
EA = T4*STA + T3*MBA
EP = T0
S  = T3*SUB
A  = T3*ADD
LI = T2
LM = T0 + T3*LDA + T3*STA
ED = T2 + T5*LDA
R  = T1 + T4*LDA
EU = T4*ADD + T4*SUB
EI = T3*LDA + T3*STA + T3*JMP + T3*NF*JN
LB = T3*MBA

To understand how this diagram was obtained, we must look carefully at the machine's instruction set (Table 1).

Table 1. An Instruction Set For The Basic Computer

Instruction          Op-Code   Execution         Register               Ring    Active Control
Mnemonic                       Action            Transfers              Pulse   Signals
------------------------------------------------------------------------------------------------
LDA                  1         ACC <-- (RAM)     1. MAR <-- IR            3     EI, LM
(Load ACC)                                       2. MDR <-- RAM(MAR)      4     R
                                                 3. ACC <-- MDR           5     ED, LA

STA                  2         (RAM) <-- ACC     1. MAR <-- IR            3     EI, LM
(Store ACC)                                      2. MDR <-- ACC           4     EA, LD
                                                 3. RAM(MAR) <-- MDR      5     W

ADD                  3         ACC <-- ACC + B   1. ALU <-- ACC + B       3     A
(Add B to ACC)                                   2. ACC <-- ALU           4     EU, LA

SUB                  4         ACC <-- ACC - B   1. ALU <-- ACC - B       3     S
(Sub. B from ACC)                                2. ACC <-- ALU           4     EU, LA

MBA                  5         B <-- ACC         1. B <-- ACC             3     EA, LB
(Move ACC to B)

JMP                  6         PC <-- RAM        1. PC <-- IR             3     EI, LP
(Jump to Address)

JN                   7         PC <-- RAM        1. PC <-- IR             3     NF: EI, LP
(Jump if Negative)             if negative          if NF set
                               flag is set

HLT                  8-15      Stop clock

"Fetch"                        IR <-- Next       1. MAR <-- PC            0     EP, LM
                               Instruction       2. MDR <-- RAM(MAR)      1     R
                                                 3. IR <-- MDR            2     ED, LI, IP

Table 2 shows which control signals must be active at each ring counter pulse for each of the instructions in the computer's instruction set (and for the instruction fetch operation). The table was prepared by simply writing down the instructions in the left-hand column. (In the circuit these will be the output lines from the decoder). The various control signals are placed horizontally along the top of the table. Entries into the table consist of the moments (ring counter pulses T0, T1, T2, T3, T4, or T5) at which each control signal must be active in order to have the instruction executed. This table is prepared very easily by reading off the information for each instruction given in Table 1.


For example, the Fetch operation has the EP and LM control signals active at ring count 0, the R signal active at ring count 1, and ED, LI, and IP active at ring count 2. Therefore the first row (Fetch) of Table 2 has T0 entered below EP and LM, T1 below R, and T2 below IP, ED, and LI.

Table 2. A Matrix of Times at which Each Control Signal Must Be Active in Order to Execute the Hard-wired Basic Computer's Instructions

Instruction   IP    LP     EP    LM    R     W     LD    ED    LI    EI     LA    EA    A     S     EU    LB
--------------------------------------------------------------------------------------------------------------
"Fetch"       T2           T0    T0    T1                T2    T2
LDA                              T3    T4                T5          T3     T5
STA                              T3          T5    T4                T3           T4
MBA                                                                                T3                      T3
ADD                                                                   T4          T3          T4
SUB                                                                   T4                T3    T4
JMP                 T3                                                T3
JN                  T3*NF                                             T3*NF

Once Table 2 has been prepared, the logic required for each control signal is easily obtained. For each signal, an AND operation is performed between any active ring counter (Ti) signals that were entered into the signal's column and the corresponding instruction contained in the far left-hand column. If a column has more than one entry, the outputs of the ANDs are ORed together to produce the final control signal. For example, the LM column has the following entries: T0 (Fetch), T3 associated with the LDA instruction, and T3 associated with the STA instruction. Therefore, the logic for this signal is:

LM = T0 + T3*LDA + T3*STA

This means that control signal LM will be activated whenever any of the following conditions is satisfied: (1) ring pulse T0 (first step of an instruction fetch) is active, or (2) an LDA instruction is in the IR and the ring counter is issuing pulse 3, or (3) an STA instruction is in the IR and the ring counter is issuing pulse 3. The entries in the JN (Jump Negative) row of this table require some further explanation. The LP and EI signals are active during T3 for this instruction if and only if the accumulator's negative flag has been set. Therefore the entries that appear above these signals for the JN instruction are T3*NF, meaning that the state of the negative flag must be ANDed in for the LP and EI control signals. Figure 6 gives the logical equations required for each of the control signals used on our machine. These equations have been read from Table 2, as explained above. The circuit diagram of the control matrix (Figure 5) is constructed directly from these equations.
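One way to make the AND/OR construction concrete is to model the control matrix as a small piece of combinational logic in software. The sketch below is a simplified model, not part of the original design: it treats the ring counter outputs and the decoder lines as booleans and produces just the LM, EI and EU signals, exactly as the equations above describe (the function and struct names are made up).

#include <stdbool.h>
#include <stdio.h>

/* A software model of part of the hard-wired control matrix.
   T[i] is true while ring counter pulse Ti is active; LDA, STA, ADD, SUB,
   JMP and JN are the decoder outputs; NF is the accumulator's negative flag. */
struct signals { bool LM, EI, EU; };

static struct signals control_matrix(const bool T[6],
                                     bool LDA, bool STA, bool ADD, bool SUB,
                                     bool JMP, bool JN, bool NF) {
    struct signals s;
    s.LM = T[0] || (T[3] && LDA) || (T[3] && STA);          /* LM = T0 + T3*LDA + T3*STA              */
    s.EI = (T[3] && LDA) || (T[3] && STA) || (T[3] && JMP)
         || (T[3] && NF && JN);                             /* EI = T3*LDA + T3*STA + T3*JMP + T3*NF*JN */
    s.EU = (T[4] && ADD) || (T[4] && SUB);                  /* EU = T4*ADD + T4*SUB                    */
    return s;
}

int main(void) {
    bool T[6] = { false, false, false, true, false, false };   /* ring pulse T3 active */
    struct signals s = control_matrix(T, true, false, false, false, false, false, false);
    printf("LDA at T3: LM=%d EI=%d EU=%d\n", s.LM, s.EI, s.EU); /* expect LM=1 EI=1 EU=0 */
    return 0;
}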


It should be noticed that the HLT line from the instruction decoder does not enter the control matrix. Instead, this signal goes directly to circuitry (not shown) that will stop the clock and thus terminate execution.

Figure 6. The logical equations required for each of the hardwired control signals on the basic computer. The machine's control matrix is designed from these equations.


What is RISC?

RISC, or Reduced Instruction Set Computer, is a type of microprocessor architecture that utilizes a small, highly-optimized set of instructions, rather than the more specialized set of instructions often found in other types of architectures.

History

The first RISC projects came from IBM, Stanford, and UC-Berkeley in the late 70s and early 80s. The IBM 801, Stanford MIPS, and Berkeley RISC 1 and 2 were all designed with a similar philosophy which has become known as RISC. Certain design features have been characteristic of most RISC processors:


· one cycle execution time: RISC processors have a CPI (clock per instruction) of one cycle. This is due to the optimization of each instruction on the CPU and a technique called pipelining;

· pipelining: a technique that allows for simultaneous execution of parts, or stages, of instructions to more efficiently process instructions;

· large number of registers: the RISC design philosophy generally incorporates a larger number of registers to prevent large amounts of interaction with memory.

Overview

We will first examine MIPS in detail as an example of an early RISC architecture to better understand the features and design of RISC architectures. We will then study pipelining to see the performance benefits of such a technique. Then we will look at the advantages and disadvantages of such a RISC-based architecture as compared to CISC architectures. Finally, we will discuss some of the recent developments and future directions of RISC processor technology in particular, and processor technology as a whole in general.

History of MIPS

The MIPS processor was developed as part of a VLSI research program at Stanford University in the early 80s. Professor John Hennessy, now the University's President, started the development of MIPS with a brainstorming class for graduate students. The readings and idea sessions helped launch the development of the processor which became one of the first RISC processors, with IBM and Berkeley developing processors at around the same time.

MIPS Architecture

The Stanford research group had a strong background in compilers, which led them to develop a processor whose architecture would represent the lowering of the compiler to the hardware level, as opposed to the raising of hardware to the software level, which had been a long running design philosophy in the hardware industry. Thus, the MIPS processor implemented a smaller, simpler instruction set. Each of the instructions included in the chip design ran in a single clock cycle. The processor used a technique called pipelining to more efficiently process instructions. MIPS used 32 registers, each 32 bits wide (a bit pattern of this size is referred to as a word).


Instruction Set

The MIPS instruction set consists of about 111 total instructions, each represented in 32 bits. An example of a MIPS instruction is below:

add $r12, $r7, $r8

Above is the assembly representation of a MIPS addition instruction. The instruction tells the processor to compute the sum of the values in registers 7 and 8 and store the result in register 12. The dollar signs are used to indicate an operation on a register. The binary form of the instruction is divided into 6 fields. The processor identifies the type of instruction by the binary digits in the first and last fields. In this case, the processor recognizes that this instruction is an addition from the zero in its first field and the 20 in its last field. The two source registers are given in the second and third fields, the desired result location is given in the fourth field, and the fifth field holds the shift amount, something that is not used in an addition operation. (The full encoding of this example is worked out in the sketch after the list below.) The instruction set consists of a variety of basic instructions, including:

· 21 arithmetic instructions (+, -, *, /, %)
· 8 logic instructions (&, |, ~)
· 8 bit manipulation instructions
· 12 comparison instructions (>, <, =, >=, <=, ¬)
· 25 branch/jump instructions
· 15 load instructions
· 10 store instructions
· 8 move instructions
· 4 miscellaneous instructions
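As a concrete illustration of the six fields, the sketch below packs the add $r12, $r7, $r8 example by hand from a 6-bit op-code, two 5-bit source registers, a 5-bit destination register, a 5-bit shift amount, and a 6-bit function code; the 0x20 function code is the "20 in its last field" mentioned above. (The helper name is made up; this is a demonstration, not an assembler.)

#include <stdio.h>

/* Pack the six fields of a MIPS R-format instruction into one 32-bit word. */
static unsigned int mips_r_format(unsigned op, unsigned rs, unsigned rt,
                                  unsigned rd, unsigned shamt, unsigned funct) {
    return (op << 26) | (rs << 21) | (rt << 16)
         | (rd << 11) | (shamt << 6) | funct;
}

int main(void) {
    /* add $r12, $r7, $r8 : op = 0, rs = 7, rt = 8, rd = 12, shamt = 0, funct = 0x20 */
    unsigned int word = mips_r_format(0, 7, 8, 12, 0, 0x20);
    printf("add $r12, $r7, $r8 encodes as 0x%08X\n", word);   /* 0x00E86020 */
    return 0;
}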

MIPS Today


MIPS Computer Systems, Inc. was founded in 1984 upon the Stanford research from which the first MIPS chip resulted. The company was purchased by Silicon Graphics, Inc. in 1992, and was spun off as MIPS Technologies, Inc. in 1998. Today, MIPS powers many consumer electronics and other devices.

How Pipelining Works

Pipelining, a standard feature in RISC processors, is much like an assembly line. Because the processor works on different steps of the instruction at the same time, more instructions can be executed in a shorter period of time. A useful method of demonstrating this is the laundry analogy. Let's say that there are four loads of dirty laundry that need to be washed, dried, and folded. We could put the first load in the washer for 30 minutes, dry it for 40 minutes, and then take 20 minutes to fold the clothes. Then pick up the second load and wash, dry, and fold, and repeat for the third and fourth loads. Supposing we started at 6 PM and worked as efficiently as possible, we would still be doing laundry until midnight.

However, a smarter approach to the problem would be to put the second load of dirty laundry into the washer after the first was already clean and whirling happily in the dryer. Then, while the first load was being folded, the second load would dry, and a third load could be added to the pipeline of laundry. Using this method, the laundry would be finished by 9:30.


RISC Pipelines

A RISC processor pipeline operates in much the same way, although the stages in the pipeline are different. While different processors have different numbers of steps, they are basically variations of these five, used in the MIPS R3000 processor:

1. fetch instructions from memory
2. read registers and decode the instruction
3. execute the instruction or calculate an address
4. access an operand in data memory
5. write the result into a register

If you glance back at the diagram of the laundry pipeline, you'll notice that although the washer finishes in half an hour, the dryer takes an extra ten minutes, and thus the wet clothes must wait ten minutes for the dryer to free up. Thus, the length of the pipeline is dependent on the length of the longest step. Because RISC instructions are simpler than those used in pre-RISC processors (now called CISC, or Complex Instruction Set Computer), they are more conducive to pipelining. While CISC instructions varied in length, RISC instructions are all the same length and can be fetched in a single operation. Ideally, each of the stages in a RISC processor pipeline should take 1 clock cycle so that the processor finishes an instruction each clock cycle and averages one cycle per instruction (CPI).
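Under that ideal one-cycle-per-stage assumption the arithmetic is simple: a pipeline with S stages finishes N instructions in about S + (N - 1) cycles, while running the instructions one at a time takes S * N cycles. The sketch below works the numbers for a 5-stage pipeline (the instruction count is arbitrary):

#include <stdio.h>

int main(void) {
    unsigned long stages = 5;          /* one clock cycle per stage (ideal case) */
    unsigned long n      = 1000;       /* number of instructions to execute      */

    unsigned long without = stages * n;         /* each instruction runs start-to-finish alone  */
    unsigned long with    = stages + (n - 1);   /* fill the pipe once, then one result per cycle */

    printf("no pipelining : %lu cycles\n", without);   /* 5000 */
    printf("pipelined     : %lu cycles\n", with);      /* 1004 */
    printf("speedup       : %.2fx\n", (double)without / with);
    return 0;
}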


Pipeline Problems

In practice, however, RISC processors operate at more than one cycle per instruction. The processor might occasionally stall as a result of data dependencies and branch instructions.

A data dependency occurs when an instruction depends on the results of a previous instruction. A particular instruction might need data in a register which has not yet been stored, since that is the job of a preceding instruction which has not yet reached that step in the pipeline. For example:

add $r3, $r2, $r1
add $r5, $r4, $r3
(more instructions that are independent of the first two)

In this example, the first instruction tells the processor to add the contents of registers r1 and r2 and store the result in register r3. The second instructs it to add r3 and r4 and store the sum in r5. We place this set of instructions in a pipeline. When the second instruction is in the second stage, the processor will be attempting to read r3 and r4 from the registers. Remember, though, that the first instruction is just one step ahead of the second, so the contents of r1 and r2 are being added, but the result has not yet been written into register r3. The second instruction therefore cannot read from the register r3 because it hasn't been written yet and must wait until the data it needs is stored. Consequently, the pipeline is stalled and a number of empty instructions (known as bubbles) go into the pipeline. Data dependency affects long pipelines more than shorter ones since it takes a longer period of time for an instruction to reach the final register-writing stage of a long pipeline.

MIPS' solution to this problem is code reordering. If, as in the example above, the following instructions have nothing to do with the first two, the code could be rearranged so that those instructions are executed in between the two dependent instructions and the pipeline could flow efficiently. The task of code reordering is generally left to the compiler, which recognizes data dependencies and attempts to minimize performance stalls.

Branch instructions are those that tell the processor to make a decision about what the next instruction to be executed should be, based on the results of another instruction.


Branch instructions can be troublesome in a pipeline if a branch is conditional on the results of an instruction which has not yet finished its path through the pipeline. For example:

Loop: add $r3, $r2, $r1
      sub $r6, $r5, $r4
      beq $r3, $r6, Loop

The example above instructs the processor to add r1 and r2 and put the result in r3, then subtract r4 from r5, storing the difference in r6. In the third instruction, beq stands for branch if equal. If the contents of r3 and r6 are equal, the processor should execute the instruction labeled "Loop." Otherwise, it should continue to the next instruction. In this example, the processor cannot make a decision about which branch to take because neither the value of r3 nor r6 has been written into the registers yet.

The processor could stall, but a more sophisticated method of dealing with branch instructions is branch prediction. The processor makes a guess about which path to take - if the guess is wrong, anything written into the registers must be cleared, and the pipeline must be started again with the correct instruction. Some methods of branch prediction depend on stereotypical behavior. Branches pointing backward are taken about 90% of the time since backward-pointing branches are often found at the bottom of loops. On the other hand, branches pointing forward are only taken approximately 50% of the time. Thus, it would be logical for processors to always follow the branch when it points backward, but not when it points forward. Other methods of branch prediction are less static: processors that use dynamic prediction keep a history for each branch and use it to predict future branches. These processors are correct in their predictions 90% of the time. Still other processors forgo the entire branch prediction ordeal. The RISC System/6000 fetches and starts decoding instructions from both sides of the branch. When it determines which branch should be followed, it then sends the correct instructions down the pipeline to be executed.
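One simple form of the history-based dynamic prediction described above is a table of 2-bit saturating counters, sketched below; the table size and the branch address are made up for the example, and real predictors are considerably more elaborate.

#include <stdio.h>
#include <stdbool.h>

#define TABLE_SIZE 256   /* assumed size of the branch-history table */

/* 2-bit saturating counters: 0,1 = predict not taken; 2,3 = predict taken. */
static unsigned char history[TABLE_SIZE];

static bool predict(unsigned int branch_addr) {
    return history[branch_addr % TABLE_SIZE] >= 2;
}

static void update(unsigned int branch_addr, bool taken) {
    unsigned char *c = &history[branch_addr % TABLE_SIZE];
    if (taken  && *c < 3) (*c)++;
    if (!taken && *c > 0) (*c)--;
}

int main(void) {
    unsigned int loop_branch = 0x0040ABCD;   /* hypothetical branch address */
    int correct = 0;

    /* A backward loop branch that is taken 9 times and then falls through once. */
    for (int i = 0; i < 10; i++) {
        bool taken = (i < 9);
        if (predict(loop_branch) == taken) correct++;
        update(loop_branch, taken);
    }
    printf("correct predictions: %d out of 10\n", correct);
    return 0;
}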


Pipelining Developments

In order to make processors even faster, various methods of optimizing pipelines have been devised. Super pipelining refers to dividing the pipeline into more steps. The more pipe stages there are, the faster the pipeline is because each stage is then shorter. Ideally, a pipeline with five stages should be five times faster than a non-pipelined processor (or rather, a pipeline with one stage). The instructions are executed at the speed at which each stage is completed, and each stage takes one fifth of the amount of time that the non-pipelined instruction takes. Thus, a processor with an 8-step pipeline (the MIPS R4000) will be even faster than its 5-step counterpart. The MIPS R4000 chops its pipeline into more pieces by dividing some steps into two. Instruction fetching, for example, is now done in two stages rather than one. The stages are as shown:

1. Instruction Fetch (First Half)
2. Instruction Fetch (Second Half)
3. Register Fetch
4. Instruction Execute
5. Data Cache Access (First Half)
6. Data Cache Access (Second Half)
7. Tag Check
8. Write Back

Superscalar pipelining involves multiple pipelines in parallel. Internal components of the processor are replicated so it can launch multiple instructions in some or all of its pipeline stages. The RISC System/6000 has a forked pipeline with different paths for floating-point and integer instructions. If there is a mixture of both types in a program, the processor can keep both forks running simultaneously. Both types of instructions share two initial stages (Instruction Fetch and Instruction Dispatch) before they fork. Often, however, superscalar pipelining refers to multiple copies of all pipeline stages (in terms of laundry, this would mean four washers, four dryers, and four people who fold clothes). Many of today's machines attempt to find two to six instructions that they can execute in every pipeline stage. If some of the instructions are dependent, however, only the first instruction or instructions are issued.

Dynamic pipelines have the capability to schedule around stalls. A dynamic pipeline is divided into three units: the instruction fetch and decode unit, five to ten execute or functional units, and a commit unit. Each execute unit has reservation stations, which act as buffers and hold the operands and operations.


While the functional units have the freedom to execute out of order, the instruction fetch/decode and commit units must operate in-order to maintain simple pipeline behavior. When the instruction is executed and the result is calculated, the commit unit decides when it is safe to store the result. If a stall occurs, the processor can schedule other instructions to be executed until the stall is resolved. This, coupled with the efficiency of multiple units executing instructions simultaneously, makes a dynamic pipeline an attractive alternative.

The simplest way to examine the advantages and disadvantages of RISC architecture is by contrasting it with its predecessor: CISC (Complex Instruction Set Computer) architecture.

Multiplying Two Numbers in Memory

The diagram represents the storage scheme for a generic computer. The main memory is divided into locations numbered from (row) 1: (column) 1 to (row) 6: (column) 4. The execution unit is responsible for carrying out all computations. However, the execution unit can only operate on data that has been loaded into one of the six registers (A, B, C, D, E, or F). Let's say we want to find the product of two numbers - one stored in location 2:3 and another stored in location 5:2 - and then store the product back in the location 2:3.

The CISC Approach


A complex instruction set computer (CISC) is a microprocessor instruction set architecture (ISA) in which each instruction can execute several low-level operations, such as a load from memory, an arithmetic operation, and a memory store, all in a single instruction. The term was coined in contrast to reduced instruction set computer (RISC). The primary goal of CISC architecture is to complete a task in as few lines of assembly as possible. This is achieved by building processor hardware that is capable of understanding and executing a series of operations. For this particular task, a CISC processor would come prepared with a specific instruction (we'll call it "MULT"). When executed, this instruction loads the two values into separate registers, multiplies the operands in the execution unit, and then stores the product in the appropriate register. Thus, the entire task of multiplying two numbers can be completed with one instruction: MULT 2:3, 5:2 MULT is what is known as a "complex instruction." It operates directly on the computer's memory banks and does not require the programmer to explicitly call any loading or storing functions. It closely resembles a command in a higher level language. For instance, if we let "a" represent the value of 2:3 and "b" represent the value of 5:2, then this command is identical to the C statement "a = a * b." One of the primary advantages of this system is that the compiler has to do very little work to translate a high-level language statement into assembly. Because the length of the code is relatively short, very little RAM is required to store instructions. The emphasis is put on building complex instructions directly into the hardware.

The RISC Approach
RISC processors only use simple instructions that can be executed within one clock cycle. Thus, the "MULT" command described above could be divided into three separate commands: "LOAD," which moves data from the memory bank to a register, "PROD," which finds the product of two operands located within the registers, and "STORE," which moves data from a register to the memory banks. In order to perform the exact series of steps described in the CISC approach, a programmer would need to code four lines of assembly:


LOAD A, 2:3
LOAD B, 5:2
PROD A, B
STORE 2:3, A

At first, this may seem like a much less efficient way of completing the operation. Because there are more lines of code, more RAM is needed to store the assembly level instructions. The compiler must also perform more work to convert a high-level language statement into code of this form.

CISC                                                  RISC
Emphasis on hardware                                  Emphasis on software
Includes multi-clock complex instructions             Single-clock, reduced instruction only
Memory-to-memory: "LOAD" and "STORE"                  Register to register: "LOAD" and "STORE"
incorporated in instructions                          are independent instructions
Small code sizes, high cycles per second              Low cycles per second, large code sizes
Transistors used for storing complex instructions     Spends more transistors on memory registers

However, the RISC strategy also brings some very important advantages. Because each instruction requires only one clock cycle to execute, the entire program will execute in approximately the same amount of time as the multi-cycle "MULT" command. These RISC "reduced instructions" require fewer transistors of hardware space than the complex instructions, leaving more room for general purpose registers. Because all of the instructions execute in a uniform amount of time (i.e. one clock), pipelining is possible. Separating the "LOAD" and "STORE" instructions actually reduces the amount of work that the computer must perform. After a CISC-style "MULT" command is executed, the processor automatically erases the registers. If one of the operands needs to be used for another computation, the processor must re-load the data from the memory bank into a register. In RISC, the operand will remain in the register until another value is loaded in its place.
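As a rough illustration only, the two approaches can be sketched in Python; the (row, column) grid and register names come from the example above, while the function names (mult, load, prod, store) are invented for this sketch. The CISC-style mult hides the load/multiply/store sequence inside one operation, while the RISC-style caller issues each step explicitly.

    # Minimal model of the generic machine from the example above (assumed layout):
    # a 6x4 main memory addressed as (row, column) and six registers A-F.
    memory = {(r, c): 0 for r in range(1, 7) for c in range(1, 5)}
    registers = {name: 0 for name in "ABCDEF"}

    memory[(2, 3)] = 6      # example operand values (arbitrary)
    memory[(5, 2)] = 7

    # CISC-style: one "complex instruction" that loads, multiplies, and stores.
    def mult(dest, src):
        registers["A"] = memory[dest]
        registers["B"] = memory[src]
        registers["A"] = registers["A"] * registers["B"]
        memory[dest] = registers["A"]

    mult((2, 3), (5, 2))            # MULT 2:3, 5:2

    # RISC-style: the same work expressed as separate LOAD / PROD / STORE steps.
    memory[(2, 3)] = 6              # reset the operand for the second run
    def load(reg, addr):  registers[reg] = memory[addr]
    def prod(dst, src):   registers[dst] = registers[dst] * registers[src]
    def store(addr, reg): memory[addr] = registers[reg]

    load("A", (2, 3))               # LOAD A, 2:3
    load("B", (5, 2))               # LOAD B, 5:2
    prod("A", "B")                  # PROD A, B
    store((2, 3), "A")              # STORE 2:3, A
    print(memory[(2, 3)])           # 42 in both cases

Note that after the RISC sequence the operands are still sitting in registers A and B, which is exactly the reuse advantage described above.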


The Performance Equation
The following equation is commonly used for expressing a computer's performance ability:
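In its usual form, the equation expresses total execution time as the product of three factors:

    time / program = (instructions / program) × (cycles / instruction) × (time / cycle)

As a purely illustrative calculation with assumed cycle counts: if the CISC "MULT" above took five cycles while each RISC instruction took one, the CISC program would spend 1 × 5 = 5 cycles and the RISC program 4 × 1 = 4 cycles on the same multiplication, even before pipelining is taken into account.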

The CISC approach attempts to minimize the number of instructions per program, sacrificing the number of cycles per instruction. RISC does the opposite, reducing the cycles per instruction at the cost of the number of instructions per program.

RISC Roadblocks
Despite the advantages of RISC-based processing, RISC chips took over a decade to gain a foothold in the commercial world. This was largely due to a lack of software support. Although Apple's Power Macintosh line featured RISC-based chips and Windows NT was RISC compatible, Windows 3.1 and Windows 95 were designed with CISC processors in mind. Many companies were unwilling to take a chance with the emerging RISC technology. Without commercial interest, processor developers were unable to manufacture RISC chips in large enough volumes to make their price competitive. Another major setback was the presence of Intel. Although their CISC chips were becoming increasingly unwieldy and difficult to develop, Intel had the resources to plow through development and produce powerful processors. Although RISC chips might surpass Intel's efforts in specific areas, the differences were not great enough to persuade buyers to change technologies.

The Overall RISC Advantage
Today, the Intel x86 is arguably the only chip which retains CISC architecture. This is primarily due to advancements in other areas of computer technology. The price of RAM has decreased dramatically. In 1977, 1MB of DRAM cost about $5,000. By 1994, the same amount of memory cost only $6 (when adjusted for inflation). Compiler technology has also become more sophisticated, so that the RISC use of RAM and emphasis on software has become ideal.


CISC and RISC Convergence
State of the art processor technology has changed significantly since RISC chips were first introduced in the early '80s. Because a number of advancements (including the ones described on this page) are used by both RISC and CISC processors, the lines between the two architectures have begun to blur. In fact, the two architectures almost seem to have adopted the strategies of the other. Because processor speeds have increased, CISC chips are now able to execute more than one instruction within a single clock. This also allows CISC chips to make use of pipelining. With other technological improvements, it is now possible to fit many more transistors on a single chip. This gives RISC processors enough space to incorporate more complicated, CISC-like commands. RISC chips also make use of more complicated hardware, making use of extra function units for superscalar execution. All of these factors have led some groups to argue that we are now in a "post-RISC" era, in which the two styles have become so similar that distinguishing between them is no longer relevant. However, it should be noted that RISC chips still retain some important traits. RISC chips strictly utilize uniform, single-cycle instructions. They also retain the register-to-register, load/store architecture. And despite their extended instruction sets, RISC chips still have a large number of general purpose registers.

Simultaneous Multi-Threading
Simultaneous Multi-Threading (SMT) allows multiple threads to be executed at the exact same time. Threads are series of tasks which are executed alternately by the processor. Normal thread execution requires threads to be switched on and off the processor, as a single thread dominates the processor for a moment of time. This allows some tasks that involve waiting (for disk accesses, or network usage) to execute more efficiently. SMT allows threads to execute at the same time by pulling instructions into the pipeline from different threads. This way, multiple threads advance in their processes and no one thread dominates the processor at any given time.

Value Prediction
Value prediction is the prediction of the value that a particular load instruction will produce. Load values are generally not random, and approximately half of the load instructions in a program will fetch the same value as they did in a previous execution. Thus, predicting that the load value will be the same as it was last time speeds up the processor since it allows the computer to continue without having to wait for the load memory access. As loads tend to be one of the slowest and most frequently executed instructions, this improvement makes a significant difference in processor speed.
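One simple form of value prediction is a last-value predictor, sketched below as an illustrative model rather than a description of any particular processor: a table keyed by the load instruction's address remembers the last value that load produced and predicts that it will repeat.

    # Sketch of a last-value predictor: predict that a load will return
    # the same value it returned the previous time it executed.
    class LastValuePredictor:
        def __init__(self):
            self.table = {}          # load PC -> last observed value

        def predict(self, pc):
            # Returns (hit, value); no prediction the first time a load is seen.
            return (pc in self.table), self.table.get(pc)

        def update(self, pc, actual):
            self.table[pc] = actual

    predictor = LastValuePredictor()
    loads = [(0x40, 7), (0x40, 7), (0x40, 9), (0x40, 9)]   # (PC, value actually loaded)
    correct = 0
    for pc, actual in loads:
        hit, guess = predictor.predict(pc)
        if hit and guess == actual:
            correct += 1             # the pipeline could have continued without waiting
        predictor.update(pc, actual)
    print(f"{correct} of {len(loads)} loads predicted correctly")   # 2 of 4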


Example system configurations
Connection Point Services (CPS) lends itself to various configurations, according to your needs. A few examples follow.

Dedicated single server
Example uses:
• Testing
• Small company
• Small Internet service provider

You can maintain both Phone Book Service (PBS) and Phone Book Administrator (PBA) on a single computer running an operating system in the Windows Server 2003 family. Even though PBA posts to the same server on which it resides, you must use the same procedures for setting permissions and posting phone books as you would with any other configuration.

Dedicated Phone Book Service server with a Phone Book Administrator client
Example uses:
• Medium to large corporations
• When ownership and responsibilities for phone book administration and server maintenance are split between groups

In this configuration, PBS and PBA are installed on separate computers. PBA could be installed on a server or on a workstation running Windows XP Professional. The following illustration shows this configuration.


Dedicated Phone Book Service server with remote administration of Phone Book Administrator client

Example use: When the primary computer running Phone Book Administrator is not physically accessible to the administrator, you can use this dual-mode system. You can configure PBA to run on a primary (dedicated) computer and on a remote workstation at the same time. The following illustration shows this configuration.

All data files reside on the primary computer, never on the remote workstation. The remote workstation accesses the data files on the primary computer.

Multiple servers with firewall
Example uses:
• Highly secured environment
• Very large Internet service providers
• Phone book replication among multiple Internet service providers

You can install PBA on a primary computer and on multiple remote workstations. PBS is installed on a staging server and on multiple host servers residing in a less secure environment outside a firewall. The following illustration shows this configuration.


The remote workstations access phone book data on the primary PBA computer. Phone book updates are posted to the staging server. Using a content replication method, phone book updates are then copied from the staging server through the firewall to the host servers.

Lesson XI: Advanced Architectures

Classes of Architecture:

Figure 1. Layered class type architecture.

I originally used the term "class type" because I first started with this approach using object-oriented (OO) technology, although since then I have used it for component-based architectures, service oriented architectures (SOAs), and combinations thereof. Throughout this article I still refer to classes within the layers, although there is absolutely nothing stopping you from using non-OO technology to implement the layers. The five layers are summarized in Table 1, as are the skills required to successfully work on them (coding is applicable to all layers so it's not listed).

Table 1. The 5 Layers (layer, description, and required skillset).

Interface
Description: This layer wraps access to the logic of your system. There are two categories of interface class: user interface (UI) classes that provide people access to your system, and system interface (SI) classes that provide access to external systems to your system. Java Server Pages (JSPs) and graphical user interface (GUI) screens implemented via the Swing class library are commonly used to implement UI classes within Java. Web services and CORBA wrapper classes are good options for implementing SI classes.
Skillset: For user interfaces: user interface design skills, usability skills, and the ability to work closely with stakeholders. For system interfaces: API design skills and legacy analysis skills.

Domain
Description: This layer implements the concepts pertinent to your business domain such as Student or Seminar, focusing on the data aspects of the business objects, plus behaviors specific to individual objects. Enterprise Java Bean (EJB) entity classes are a common approach to implementing domain classes within Java.
Skillset: Analysis skills to identify domain classes; design skills to determine how to implement the domain classes; domain modeling skills, in particular UML class modeling.

Process
Description: The process layer implements business logic that involves collaborating with several domain classes or even other process classes.
Skillset: Analysis skills to identify process classes and process logic; design skills to determine how to implement the process classes; modeling skills, in particular activity modeling, flow charting, and sequence diagramming.

Persistence
Description: Persistence layers encapsulate the capability to store, retrieve, and delete objects/data permanently without revealing details of the underlying storage technology. They often implement a mapping between your object schema and your database schema, and there are various strategies available to you.
Skillset: Object/relational (O/R) mapping; architectural skills so you can choose the right database encapsulation strategy; modeling skills, in particular class modeling and physical data modeling.

System
Description: System classes provide operating-system-specific functionality for your applications, isolating your software from the operating system (OS) by wrapping OS-specific features, increasing the portability of your application.
Skillset: Analysis skills to identify what needs to be built; architectural and design skills to determine how to implement the classes; modeling skills, in particular class modeling, sequence diagramming, and state modeling.
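As a minimal sketch of how these layers might be wired together (the class and method names are invented for illustration, not taken from any framework), note that each class talks only to the layer beneath it, which is exactly the one-way message flow discussed below.

    # Illustrative layering: interface -> domain -> persistence, never the reverse.
    class StudentPersistence:                      # persistence layer
        def __init__(self):
            self._rows = {}                        # stands in for a real database
        def save(self, student_id, name):
            self._rows[student_id] = name
        def find(self, student_id):
            return self._rows.get(student_id)

    class Student:                                 # domain layer
        def __init__(self, persistence):
            self._persistence = persistence
        def enroll(self, student_id, name):
            # Domain/business logic would live here; storage details stay hidden below.
            self._persistence.save(student_id, name)
        def name_of(self, student_id):
            return self._persistence.find(student_id)

    class StudentScreen:                           # interface (UI) layer
        def __init__(self, domain):
            self._domain = domain
        def handle_enroll_click(self, student_id, name):
            self._domain.enroll(student_id, name)
            return f"Enrolled {self._domain.name_of(student_id)}"

    screen = StudentScreen(Student(StudentPersistence()))
    print(screen.handle_enroll_click(42, "Alice"))   # Enrolled Alice

Because StudentScreen never touches StudentPersistence directly, the storage strategy can change without affecting the user interface, which is the portability benefit the layering is meant to deliver.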

Collaboration within a layer is allowed. For example, UI objects can send messages to other UI objects and business/domain objects can send messages to other business/domain objects. Collaboration can also occur between layers connected by arrows. As you see in Figure 1, interface classes may send messages to domain classes but not to persistence classes. Domain classes may send messages to persistence classes, but not to interface classes. By restricting the flow of messages to only one direction, you dramatically increase the portability of your system by reducing the coupling between classes. For example, the domain classes don't rely on the user interface of the system, implying that you can change the interface without affecting the underlying business logic.

All types of classes may interact with system classes. This is because your system layer implements fundamental software features such as inter-process communication (IPC), a service that classes use to collaborate with classes on other computers, and audit logging, which classes use to record critical actions taken by the software. For example, if your user interface classes are running on a personal computer (PC) and your domain classes are running on an EJB application server on another machine, then your interface classes will send messages to the domain classes via the IPC service in the system layer. This service is often implemented via the use of middleware.

It's critical to understand that this isn't the only way to layer an application, but instead that it is a very common one. The important thing is that you identify the layers that are pertinent to your environment and then act accordingly.

Dataflow Architecture
Dataflow architecture is a computer architecture that directly contrasts the traditional von Neumann architecture or control flow architecture. Dataflow architectures do not have a program counter, or (at least conceptually) the executability and execution of instructions is solely determined based on the availability of input arguments to the instructions. Although no commercially successful computer hardware has used a dataflow architecture, it is very relevant in many software architectures today, including database engine designs and parallel computing frameworks.

Software architecture
Dataflow is a software architecture based on the idea that changing the value of a variable should automatically force recalculation of the values of other variables.

A data flow diagram (DFD) is a graphical representation of the "flow" of data through an information system. A data flow diagram can also be used for the visualization of data processing (structured design). It is common practice for a designer to draw a context-level DFD first which shows the interaction between the system and outside entities. This context-level DFD is then "exploded" to show more detail of the system being modeled. Data flow diagrams were first proposed by Larry Constantine, the original developer of structured design, based on Martin and Estrin's "data flow graph" model of computation. Data flow diagrams (DFDs) are one of the three essential perspectives of SSADM. The sponsor of a project and the end users will need to be briefed and consulted throughout all stages of a system's evolution. With a dataflow diagram, users are able to visualize how the system will operate, what the system will accomplish and how the system will be implemented. Old system dataflow diagrams can be drawn up and compared with the new system's dataflow diagrams to draw comparisons to implement a more efficient system. Dataflow diagrams can be used to provide the end user with a physical idea of where the data they input ultimately has an effect upon the structure of the whole system, from order to dispatch to restock. How any system is developed can be determined through a dataflow diagram.

Components
A data flow diagram illustrates the processes, data stores, and external entities in a business or other system and the connecting data flows.

Data flow diagram example

The four components of a data flow diagram (DFD) are:


Data flow diagram notation

External Entities/Terminators are outside of the system being modeled. Terminators represent where information comes from and where it goes. In designing a system, we have no idea about what these terminators do or how they do it.

Processes modify the inputs in the process of generating the outputs.

Data Stores represent a place in the process where data comes to rest. A DFD does not say anything about the relative timing of the processes, so a data store might be a place to accumulate data over a year for the annual accounting process.

Data Flows are how data moves between terminators, processes, and data stores (those that cross the system boundary are known as IO or Input Output Descriptions).

Every page in a DFD should contain fewer than 10 components. If a process has more than 10 components, then one or more components (typically a process) should be combined into one and another DFD be generated that describes that component in more detail. Each component should be numbered, as should each subcomponent, and so on. So, for example, a top-level DFD would have components 1, 2, 3, 4, 5; the subcomponent DFD of component 3 would have components 3.1, 3.2, 3.3, and 3.4; and the sub-subcomponent DFD of component 3.2 would have components 3.2.1, 3.2.2, and 3.2.3.

Data store
A data store is a repository for data. Data stores can be manual, digital, or temporary.


Duplication
External entities and data stores can be duplicated in the system for more clarity, while processes cannot. External entities that have been replicated are marked by an asterisk (*) in the lower left part of the oval that represents that entity. Data stores have a double line on the left side of their box.

Developing a DFD

Top-Down Approach
1. The system designer makes a context level DFD, which shows the interaction (data flows) between the system (represented by one process) and the system environment (represented by terminators).
2. The system is decomposed in the lower level DFD (Zero) into a set of processes, data stores, and the data flows between these processes and data stores.
3. Each process is then decomposed into an even lower level diagram containing its sub processes.
4. This approach then continues on the subsequent sub processes, until a necessary and sufficient level of detail is reached, which is called the primitive process (aka chewable in one bite).

Event Partitioning Approach
Construct a detail DFD as follows:
1. The list of all events is made.
2. For each event a process is constructed.
3. Each process is linked (with incoming data flows) directly with other processes or via data stores, so that it has enough information to respond to a given event.
4. The reaction of each process to a given event is modeled by an outgoing data flow.

DFD tools
· Concept Draw - Windows and MacOS X data flow diagramming tool
· Dia - open source diagramming tool with DFD support
· Microsoft Visio - Windows diagramming tool which includes very basic DFD support (images only, does not record data flows)

Dataflow programming languages embody these principles, with spreadsheets perhaps the most widespread embodiment of dataflow. For example, in a spreadsheet you can specify a cell formula which depends on other cells; then when any of those cells is updated the first cell's value is automatically recalculated. It's possible for one change to initiate a whole sequence of changes, if one cell depends on another cell which depends on yet another cell, and so on. The dataflow technique is not restricted to recalculating numeric values, as done in spreadsheets. For example, dataflow can be used to redraw a picture in response to mouse movements, or to make a robot turn in response to a change in light level. One benefit of dataflow is that it can reduce the amount of coupling-related code in a program. For example, without dataflow, if a variable X depends on a variable Y, then whenever Y is changed X must be explicitly recalculated. This means that Y is coupled to X. Since X is also coupled to Y (because X's value depends on Y's value), the program ends up with a cyclic dependency between the two variables. Most good programmers will get rid of this cycle by using an observer pattern, but only at the cost of introducing a non-trivial amount of code. Dataflow improves this situation by making the recalculation of X automatic, thereby eliminating the coupling from Y to X. Dataflow makes implicit a significant amount of code that otherwise would have had to be tediously explicit. Dataflow is also sometimes referred to as reactive programming. There have been a few programming languages created specifically to support dataflow. In particular, many (if not most) visual programming languages have been based on the idea of dataflow. A good example of a Java-based framework is Pervasive DataRush.

Diagrams
The term dataflow may also be used to refer to the flow of data within a system, and is the name normally given to the arrows in a data flow diagram that represent the flow of data between external entities, processes, and data stores.

Concurrency
A dataflow network is a network of concurrently executing processes or automata that can communicate by sending data over channels (see message passing).
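A toy version of such a network can be sketched with ordinary queues standing in for the channels; the node names and the doubling/summing behaviour below are arbitrary illustrations. Each node fires only when data is available on its input channel, which is the essence of dataflow execution.

    # Tiny dataflow network: producer -> doubler -> consumer, connected by channels.
    import threading
    from queue import Queue

    chan_a, chan_b = Queue(), Queue()          # the channels between the nodes

    def producer():
        for value in (1, 2, 3):
            chan_a.put(value)                  # send data tokens downstream
        chan_a.put(None)                       # end-of-stream marker

    def doubler():
        while (token := chan_a.get()) is not None:   # fires only when input arrives
            chan_b.put(token * 2)
        chan_b.put(None)

    def consumer():
        total = 0
        while (token := chan_b.get()) is not None:
            total += token
        print(total)                           # prints 12

    # Each node runs concurrently, like the processes of a dataflow network.
    threads = [threading.Thread(target=f) for f in (producer, doubler, consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()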


Kahn process networks, named after one of the pioneers of dataflow networks, are a particularly important class of such networks. In a Kahn process network the processes are determinate. This implies that they satisfy the so-called Kahn's principle, which, roughly speaking, states that each determinate process computes a continuous function from input streams to output streams, and that a network of determinate processes is itself determinate, thus computing a continuous function. This implies that the behaviour of such networks can be described by a set of recursive equations, which can be solved using fixed point theory. The concept of dataflow networks is closely related to another model of concurrency known as the Actor model.

Hardware architecture
Hardware architectures for dataflow were a major topic in computer architecture research in the 1970s and early 1980s. Jack Dennis of MIT pioneered the field of static dataflow architectures while the Manchester Dataflow Machine and MIT Tagged Token architecture were major projects in dynamic dataflow. A compiler analyzes a computer program for the data dependencies between operations. It does this in order to better optimize the instruction sequences. Normally, the compiled output has the results of these optimizations, but the dependency information itself is not recorded within the compiled binary code. A compiled program for a dataflow machine would keep this dependency information. A dataflow compiler would record these dependencies by creating unique tags for each dependency instead of using variable names. By giving each dependency a unique tag, it exposes any possibility of parallel execution of non-dependent instructions. Each instruction, along with its tagged operands, would be stored in the compiled binary code. The compiled program would be loaded into a content-addressable memory (CAM) of the dataflow computer. When all of the tagged operands of an instruction became available, that is, previously calculated, the instruction was marked as available for execution by an execution unit. This was known as activating or firing the instruction. Once the instruction was completed by the execution unit, its output data would be broadcast (with its tag) to the CAM memory. Any other instructions that were dependent on this particular datum (identified by its tag value) would be updated. In this way, subsequent instructions would be activated. Instructions would be activated in data order, that is, when all of the required data operands were available. This order can be different from the sequential order envisioned by the human programmer, the programmed order. The instructions along with their required data would be transported as packets to the execution units. These packets are often known as instruction tokens. Similarly, data results are transported back to the CAM as data tokens. The packetization of instructions and results allowed for parallel execution of activated instructions on a large scale. Connection networks would deliver the activated instruction tokens to the execution units and return data tokens to the instruction CAM memory. In contrast to the conventional von Neumann architecture, data tokens are not permanently stored in memory; rather, they are transient messages that only exist when in transit to the instruction storage. Earlier designs that only used instruction addresses as data dependency tags were called static dataflow machines. These machines could not allow instructions from multiple loop iterations (or multiple calls to the same routine) to be issued simultaneously as the simple tags could not differentiate between the different loop iterations (or each invocation of the routine). Later designs called dynamic dataflow machines used more complex tags to allow greater parallelism from these cases. The research, however, never overcame the problems related to:

· efficiently broadcasting data tokens in a massively parallel system
· efficiently dispatching instruction tokens in a massively parallel system
· building CAMs large enough to hold all of the dependencies of a real program
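To make the activation ("firing") mechanism described above concrete, here is a small sketch; it is an illustration of the idea only, not of any real machine. Each instruction lists the tags of the operands it is waiting for, and it becomes ready to execute as soon as every one of those tags has had a value broadcast for it.

    # Sketch of dataflow firing: an instruction activates as soon as every
    # operand tag it is waiting for has had a value broadcast.
    instructions = {
        "i1": {"op": lambda a, b: a + b, "needs": ["t_x", "t_y"], "out": "t_sum"},
        "i2": {"op": lambda a, b: a * b, "needs": ["t_sum", "t_z"], "out": "t_res"},
    }
    arrived = {}                                   # tag -> value (stand-in for the CAM)

    def broadcast(tag, value):
        arrived[tag] = value
        for name, ins in list(instructions.items()):
            if name in instructions and all(t in arrived for t in ins["needs"]):
                result = ins["op"](*(arrived[t] for t in ins["needs"]))
                del instructions[name]             # this instruction has fired
                broadcast(ins["out"], result)      # its result may activate others

    # Tokens can arrive in any order; execution follows data availability.
    broadcast("t_z", 10)
    broadcast("t_y", 4)
    broadcast("t_x", 3)
    print(arrived["t_res"])                        # (3 + 4) * 10 = 70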

Instructions and their data dependencies proved to be too fine-grained to be effectively distributed in a large network. That is, the time for the instructions and tagged results to travel through a large connection network was longer than the time to actually do the computations. Out-of-order execution is the conceptual descendant of dataflow computation and has become the dominant computing paradigm since the 1990s. It is a form of restricted dataflow. This paradigm introduced the idea of an execution window. The execution window follows the programmed sequential order of the program; however, within the window, instructions are allowed to be completed in data dependency order. This is accomplished by the computer hardware dynamically tagging the data dependencies within the window. The logical complexity of dynamically keeping track of the data dependencies restricts OoO CPUs to a small number of execution units (2-6) and execution window sizes in the range of 32 to 200 instructions, much smaller than envisioned for full dataflow machines.


A computer network is two or more computers connected together using a telecommunication system for the purpose of communicating and sharing resources.

Blue RJ-45 patchcord of the type commonly used to connect network devices.

Experts in the field of networking debate whether two computers that are connected together using some form of communications medium constitute a network. Therefore, some sources will state that a network requires three connected computers. A computer connected to a non-computing device (e.g., networked to a printer via an Ethernet link) may also represent a computer network, although this article does not currently address this configuration. For example, [1] states that "the term network describes two or more connected computers" while [2] states that a computer network is "A network of data processing nodes that are interconnected for the purpose of data communication", the term "network" being defined in the same document as "An interconnection of three or more communicating entities" (author's emphasis). This article uses the definition which requires two or more computers to be connected together to form a network. The same basic functions are generally present in this case as with larger numbers of connected computers.

Basic Computer Network Building Blocks

Computers
Many of the components of an average network are individual computers, which are generally either workstations (including personal computers) or servers.

Types of Workstations
There are many types of workstations that may be incorporated into a particular network, some of which have high-end displays, multiple CPUs, large amounts of RAM, large amounts of hard drive storage space, or other enhancements required for special data processing tasks, graphics, or other resource intensive applications. (See also network computer.)

Types of Servers
The following lists some common types of servers and their purpose.

File Server


Stores various types of files and distributes them to other clients on the network.
Print Server
Controls and manages one or more printers and accepts print jobs from other network clients, spooling the print jobs, and performing most or all of the other functions that a workstation would perform to accomplish a printing task if the printer were connected directly to the workstation's printer port.
Mail Server
Stores, sends, receives, routes, and performs other email related operations for other clients on the network.
Fax Server
Stores, sends, receives, routes, and performs other functions necessary for the proper transmission, reception, and distribution of faxes.
Telephony Server
Performs telephony related functions such as answering calls automatically, performing the functions of an interactive voice response system, storing and serving voice mail, routing calls between the Public Switched Telephone Network (PSTN) and the network or the Internet (e.g., voice over IP (VoIP) gateway), etc.
Proxy Server
Performs some type of function on behalf of other clients on the network to increase the performance of certain operations (e.g., prefetching and caching documents or other data that is requested very frequently) or as a security precaution to isolate network clients from external threats.
Remote Access Server (RAS)
Monitors modem lines or other network communications channels for requests to connect to the network from a remote location, answers the incoming telephone call or acknowledges the network request, and performs the necessary security checks and other procedures necessary to log a user onto the network.
Application Server
Performs the data processing or business logic portion of a client application, accepting instructions for operations to perform from a workstation and serving the results back to the workstation, while the workstation performs the user interface or GUI portion of the processing (i.e., the presentation logic) that is required for the application to work properly.
Web Server
Stores HTML documents, images, text files, scripts, and other Web related data (collectively known as content), and distributes this content to other clients on the network on request.
Backup Server
Has network backup software installed and has large amounts of hard drive storage or other forms of storage (tape, etc.) available to it to be used for the purpose of ensuring that data loss does not occur in the network.
Printers


Many printers are capable of acting as part of a computer network without any other device, such as a print server, to act as an intermediary between the printer and the device that is requesting a print job to be completed.
Dumb Terminals
Many networks use dumb terminals instead of workstations either for data entry and display purposes or in some cases where the application runs entirely on the server.
Other Devices
There are many other types of devices that may be used to build a network, many of which require an understanding of more advanced computer networking concepts before they are able to be easily understood (e.g., hubs, routers, bridges, switches, hardware firewalls, etc.). On home and mobile networks, connecting consumer electronics devices such as video game consoles is becoming increasingly common.

Building a Computer Network

A Simple Network
A simple computer network may be constructed from two computers by adding a network adapter (Network Interface Controller (NIC)) to each computer and then connecting them together with a special cable called a crossover cable. This type of network is useful for transferring information between two computers that are not normally connected to each other by a permanent network connection or for basic home networking applications. Alternatively, a network between two computers can be established without dedicated extra hardware by using a standard connection such as the RS-232 serial port on both computers, connecting them to each other via a special cross linked null modem cable.

Practical Networks
Practical networks generally consist of more than two interconnected computers and generally require special devices in addition to the Network Interface Controller that each computer needs to be equipped with. Examples of some of these special devices are listed above under Basic Computer Network Building Blocks / Other Devices.

Types of Networks:
Below is a list of the most common types of computer networks.


Local Area Network (LAN): A network that is limited to a relatively small spatial area such as a room, a single building, a ship, or an aircraft. Local area networks are sometimes called a single location network.
Note: For administrative purposes, large LANs are generally divided into smaller logical segments called workgroups. A workgroup is a group of computers that share a common set of resources within a LAN.

Campus Area Network (CAN): A network that connects two or more LANs but that is limited to a specific (possibly private) geographical area such as a college campus, industrial complex, or a military base.
Note: A CAN is generally limited to an area that is smaller than a Metropolitan Area Network.

Metropolitan Area Network (MAN): A network that connects two or more LANs or CANs together but does not extend beyond the boundaries of the immediate town, city, or metropolitan area. Multiple routers, switches & hubs are connected to create a MAN.

Wide Area Network (WAN): A network that covers a broad geographical area (i.e., any network whose communications links cross metropolitan, regional, or national boundaries) or, less formally, a network that uses routers and public communications links.

Types of WANs:
Centralized: A centralized WAN consists of a central computer that is connected to dumb terminals and / or other types of terminal devices.
Distributed: A distributed WAN consists of two or more computers in different locations and may also include connections to dumb terminals and other types of terminal devices.

Internetwork: Two or more networks or network segments connected using devices that operate at layer 3 (the 'network' layer) of the OSI Basic Reference Model, such as a router.
Note: Any interconnection among or between public, private, commercial, industrial, or governmental networks may also be defined as an internetwork.

Internet, The: A specific internetwork, consisting of a worldwide interconnection of governmental, academic, public, and private networks based upon the Advanced Research Projects Agency Network (ARPANET) developed by ARPA of the U.S. Department of Defense, also home to the World Wide Web (WWW) and referred to as the 'Internet' with a capital 'I' to distinguish it from other generic internetworks.


Synonyms for the 'Internet' also include the 'Web' or, in a more comical sense, the 'Interweb'.

Intranet: A network or internetwork that is limited in scope to a single organization or entity or, also, a network or internetwork that is limited in scope to a single organization or entity and which uses the TCP/IP protocol suite, HTTP, FTP, and other network protocols and software commonly used on the Internet.
Note: Intranets may also be categorized as a LAN, CAN, MAN, WAN, or other type of network.

Extranet: A network or internetwork that is limited in scope to a single organization or entity but which also has limited connections to the networks of one or more other usually, but not necessarily, trusted organizations or entities (e.g., a company's customers may be provided access to some part of its intranet, thus creating an extranet, while at the same time the customers may not be considered 'trusted' from a security standpoint).
Note: Technically, an extranet may also be categorized as a CAN, MAN, WAN, or other type of network, although, by definition, an extranet cannot consist of a single LAN, because an extranet must have at least one connection with an outside network.

Intranets and extranets may or may not have connections to the Internet. If connected to the Internet, the intranet or extranet is normally protected from being accessed from the Internet without proper authorization. The Internet itself is not considered to be a part of the intranet or extranet, although the Internet may serve as a portal for access to portions of an extranet.

Classification of computer networks

By network layer
Computer networks may be classified according to the network layer at which they operate according to some basic reference models that are considered to be standards in the industry, such as the seven layer OSI reference model and the five layer TCP/IP model.

By scale
Computer networks may be classified according to the scale or extent of reach of the network, for example as a Personal area network (PAN), Local area network (LAN), Wireless local area network (WLAN), Campus area network (CAN), Metropolitan area network (MAN), or Wide area network (WAN).


By connection method
Computer networks may be classified according to the technology that is used to connect the individual devices in the network, such as HomePNA, Power line communication, Ethernet, or WiFi.

By functional relationship
Computer networks may be classified according to the functional relationships which exist between the elements of the network, for example Active Networking, Client-server and Peer-to-peer (workgroup) architectures.

By network topology
Computer networks may be classified according to the network topology upon which the network is based, such as Bus network, Star network, Ring network, Mesh network, Star-bus network, Tree or Hierarchical topology network, etc. Topology can be arranged in a geometric arrangement.

By services provided
Computer networks may be classified according to the services which they provide, such as Storage area networks, Server farms, Process control networks, Value-added networks, SOHO networks, Wireless community networks, XML appliances, etc.

By protocol


Computer networks may be classified according to the communications protocol that is being used on the network. See the articles on List of network protocol stacks and List of network protocols for more information.

Sample Networks


