THE CRAY-1
COMPUTER SYSTEM
The Cray Research, Inc. CRAY-1 Computer System is a large-scale, general-purpose digital computer featuring vector as well as scalar processing, a 12.5 nanosecond clock period, and a 50 nanosecond memory cycle time. The CRAY-1 is capable of executing over 80 million floating point operations per second. Even higher rates are possible with programs that take advantage of the vector features of the computer.
The CRAY-1 is an effective host processor for local computer networks and time-sharing networks. Cray Research provides hardware and software interfaces that link the CRAY-1 with other manufacturers' computer systems. This approach allows the CRAY-1 to be added to existing computing facilities.

The CRAY-1 is particularly adapted to the needs of the scientific community and is especially useful in solving problems requiring the analysis and prediction of the behavior of physical phenomena through computer simulation. The fields of weather forecasting, aircraft design, nuclear research, geophysical research, and seismic analysis involve this process. For example, the movements of global air masses for weather forecasting, air flows over wing and airframe surfaces for aircraft design, and the movements of particles for nuclear research all lend themselves to such simulations. In each scientific field, the equations are known but the solutions require extensive computations involving large quantities of data. The quality of a solution depends heavily on the number of data points that can be considered and the number of computations that can be performed. The CRAY-1 provides substantial increases with respect to both the number of data points and computations so that researchers can apply the CRAY-1 to problems not feasibly solvable in the past.
CONFIGURATION
The basic configuration of the CRAY-1 consists of the central processor unit (CPU), power and cooling equipment, one or more minicomputer consoles, and a mass storage (disk) subsystem. The CPU holds the computation, memory, and I/O sections of the computer. A minicomputer serves either as a maintenance control unit or a job entry station.

INPUT/OUTPUT
Input/output is via twenty-four I/O channels, twelve of which are input and twelve output. Any number of channels may be active at a given time. For a 16-bit channel, transfer rates of 160 million bits per second can be achieved. Higher rates are possible but in practice, the maximum transfer rate is limited by the speed of peripheral devices.

Publication Number 2240008 B
Copyright © 1976, 1977 Cray Research, Inc.
[Figure: BASIC COMPUTER SYSTEM. Blocks shown: memory section (0.25M, 0.5M, or 1M words), MCU, mass storage subsystem, and front-end computers.]

Input/output and the CPU share memory access via a single port. I/O priority is sufficient to maintain the required transfer rates for the peripheral devices.
MEMORY
The CRAY-1 memory is constructed of 1024-bit LSI chips. Up to 1,048,576 (generally referred to as one million) 72-bit words are arranged in 16 banks. Up to 524,288 words can be arranged in 8 banks. A word consists of 64 data bits and 8 check bits. The bank cycle time, that is, the time required to remove or insert an element of data in memory, is 50 nanoseconds. This short cycle time provides an extremely efficient random-access memory. There is no inherent memory degradation for 16-bank memories with less than one million words of memory.

A single-error-correction, double-error-detection (SECDED) network, using the eight check bits in each memory word, assures that data written into memory can be returned to the computation section with consistent precision.
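To make the bank arithmetic in the memory description concrete, the Python sketch below relates the 50 nanosecond bank cycle to the 12.5 nanosecond clock period and counts how long a strided run of references would take. It is only a rough model: it assumes that consecutive word addresses fall in consecutive banks and that at most one reference is issued per clock period, neither of which is stated in this section, and the function names are illustrative.

```python
CLOCK_NS = 12.5       # clock period, from the text
BANK_CYCLE_NS = 50    # bank cycle time, from the text
BANK_BUSY = int(BANK_CYCLE_NS / CLOCK_NS)  # clock periods a bank stays busy


def clocks_for_stride(n_refs, stride, banks=16):
    """Clock periods needed to issue n_refs references `stride` words apart.

    Assumption: word address a lives in bank a % banks, and a reference
    waits until its bank has finished its previous cycle.
    """
    free_at = [0] * banks          # clock at which each bank is next free
    clock = 0
    for i in range(n_refs):
        bank = (i * stride) % banks
        clock = max(clock, free_at[bank])   # stall while the bank is busy
        free_at[bank] = clock + BANK_BUSY
        clock += 1                          # at most one reference per clock
    return clock


def peak_words_per_second():
    """One word per clock period when no bank conflicts occur."""
    return 1e9 / CLOCK_NS
```

Under these assumptions a stride of 1 never stalls, while a stride of 16 returns to the same bank every time and runs at one reference per bank cycle. The peak of 80 million words per second times 64 data bits gives about 5.1 x 10^9 bits per second.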
Integrated circuits used in the CRAY-1 computer.
Printed circuit board module with integrated circuits installed.
CRAY-1 computer shown with installed banks of printed circuit board modules.
Section of the CRAY-1 computer showing installation of printed circuit board modules.
Data station (equipment not manufactured by CRI) used in the operation of the CRAY-1.
FACTS AND FIGURES

COMPUTATION SECTION
Instruction size: 16 or 32 bits
Repertoire size: 128 instruction codes
Clock period: 12.5 nsec
Instruction stack/buffers: 64 words (4096 bits)
Functional units: twelve (3 integer add, 1 integer multiply, 2 shift, 2 logical, 1 floating add, 1 floating multiply, 1 reciprocal approx., 1 population count)
Programmable registers: [illegible]
Max. vector result rate: 12.5 nsec/unit

The computation section as illustrated on page 4 is composed of instruction buffers, registers, and functional units which operate together to execute sequences of instructions.

Internal character representation in the CRAY-1 is in ASCII with each 64-bit word able to accommodate eight characters.

Numeric representation is either in two's complement form (24-bit or 64-bit) or in 64-bit floating point form using a signed magnitude binary coefficient and a biased exponent. Exponent overflow or underflow occurs if the exponent is greater than 57777 (octal) or less than 20000 (octal). Either of these conditions causes an interrupt except where the interrupt has been inhibited.
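The numeric formats described in the computation section can be illustrated with a short Python sketch. The exponent limits come from the text. The field widths (1 sign bit, 15 exponent bits, 48 coefficient bits), the bias of 40000 octal, and the treatment of the coefficient as a fraction are assumptions here, since this brochure gives only the overall 64-bit layout. The function names are hypothetical.

```python
import math

EXP_BIAS = 0o40000   # assumed exponent bias; not given in the text
EXP_MIN = 0o20000    # exponents below this underflow (from the text)
EXP_MAX = 0o57777    # exponents above this overflow (from the text)


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def decode_float(word):
    """Decode a word assumed to be sign(1) | exponent(15) | coefficient(48)."""
    sign = (word >> 63) & 1
    exponent = (word >> 48) & 0x7FFF
    coefficient = word & ((1 << 48) - 1)
    if coefficient == 0:
        return 0.0   # simplification: a zero coefficient is treated as zero
    if not EXP_MIN <= exponent <= EXP_MAX:
        raise OverflowError("exponent out of range: %o" % exponent)
    # Assumed: binary point lies to the left of the 48-bit coefficient.
    magnitude = math.ldexp(coefficient, exponent - EXP_BIAS - 48)
    return -magnitude if sign else magnitude
```

Python floats cover a much narrower exponent range than the limits quoted above, so `math.ldexp` itself raises `OverflowError` for very large in-range exponents; the sketch is meant to show the field layout, not to reproduce the full range.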
FLOATING POINT COMPUTATION RATES (results per second)
Addition: 80 x 10^6/sec
Multiplication: 80 x 10^6/sec
Division: 25 x 10^6/sec

MEMORY SECTION
Technology: bipolar semiconductor
Word length: 72 bits (64 data, 8 SECDED)
Address space: 4M words
Data path width (bits): 64 (1 word)
Cycle time: 50 nsec
Size: 262,144 words or 524,288 words or 1,048,576 words
Organization/interleave: 16 banks (8 banks optional)
Maximum bandwidth: 80 x 10^6 words/sec (5.1 x 10^9 bits/sec)
Error checking: SECDED

[Figure: DATA FORMATS. Two's complement integer (24 bits), with sign. Two's complement integer (64 bits), with sign. Signed magnitude floating point (64 bits): sign, exponent, and coefficient, with the binary point marked.]

PHYSICAL CHARACTERISTICS / ELECTRONIC TECHNOLOGY
Size of CPU cabinet: 9 ft diameter base, 4.5 ft diameter center, 6.5 ft height
Weight of mainframe: 5.25 tons
Cooling: Freon
Plug-in modules: 1662
Module types: 113
PC boards: 5 layer
Circuitry (equivalent no. of transistors): 2.5M
Logic: ECL, 1 nsec
High-density logic: SSI

Instruction set
The CRAY-1 executes 128 operation codes as either 16-bit (one-parcel) or 32-bit (two-parcel) instructions. Operation codes provide for both scalar and vector processing.

In general, an instruction that references registers occupies one parcel; an instruction that references memory occupies two parcels. All of the arithmetic and logical instructions reference registers.
Floating point instructions provide for addition, subtraction, multiplication, and reciprocal approximation. The reciprocal approximation instruction allows for the computation of a floating point divide operation using a multiple instruction sequence.
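The multiple instruction sequence for division can be sketched in Python as a reciprocal approximation followed by one correction step. This is one plausible sequence, not a statement of the exact hardware algorithm: the instruction list in this brochure includes a "2 - floating product" operation, which is what the correction step below models, while `reciprocal_approx` is a stand-in whose 30-bit accuracy is an assumption.

```python
import math


def reciprocal_approx(x, bits=30):
    """Stand-in for the hardware approximation: 1/x kept to about `bits` bits."""
    m, e = math.frexp(1.0 / x)          # 1/x = m * 2**e with 0.5 <= |m| < 1
    scale = 1 << bits
    return math.ldexp(math.floor(m * scale) / scale, e)


def divide(a, b):
    """Compute a / b from an approximate reciprocal and one correction."""
    r = reciprocal_approx(b)       # reciprocal approximation of b
    correction = 2.0 - b * r       # "2 - floating product" step
    return (a * r) * correction    # two floating multiplies
```

The correction roughly squares the relative error of the approximation, which is why a single step is enough to bring a 30-bit estimate close to full precision in this model.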
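[Figure: COMPUTATION SECTION, showing the instruction buffers and execution path.]

[Figure: INSTRUCTION FORMATS. Fields are labeled g, h, i, j, k, and m. Formats shown: 16 bits, arithmetic and logical (operation, result register, two operand registers); 16 bits, shift and mask (operation, operand and result register, shift or mask count); 16 bits, constant; 32 bits, constant (operation code, result register, constant); 32 bits, branch (operation code, parcel address). A word address field is also labeled.]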
Integer or fixed point operations are provided for as follows: integer addition, integer subtraction, and integer multiplication. An integer multiply operation produces a 24-bit result; additions and subtractions produce either 24-bit or 64-bit results. No integer divide instruction is provided. The operation can be accomplished through a software algorithm using floating point hardware.

The instruction set includes Boolean operations for OR, AND, and exclusive OR and for a mask-controlled merge operation. Shift operations allow the manipulation of 64- or 128-bit operands to produce a 64-bit result. Similar 64-bit arithmetic capability is provided for both scalar and vector processing.

Full indexing capability allows the programmer to index throughout memory in either scalar or vector modes of processing. This allows matrix operations in vector mode to be performed on rows, on columns, or on the diagonal.

Instructions for scalar population and leading zero counts return bit counts based on S register contents to an A register. Corresponding instructions for vector operations are available as a special option.
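As an illustration of the bit-oriented operations just described, the following Python sketch models the population count, the leading zero count, and a 128-bit double shift that yields a 64-bit result. The function names are illustrative and the code models only the arithmetic, not the register transfers.

```python
MASK64 = (1 << 64) - 1


def population_count(s):
    """Number of 1 bits in a 64-bit word."""
    return bin(s & MASK64).count("1")


def leading_zero_count(s):
    """Number of 0 bits to the left of the first 1 bit (64 for a zero word)."""
    return 64 - (s & MASK64).bit_length()


def double_shift_left(si, sj, count):
    """Shift the 128-bit value (si, sj) left and keep the upper 64 bits."""
    combined = ((si & MASK64) << 64) | (sj & MASK64)
    return ((combined << count) >> 64) & MASK64
```

For example, `double_shift_left(1, 1 << 63, 1)` moves the top bit of the second word into the bottom of the first and returns 3.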
Addressing
Instructions that reference data do so on a word basis. Instructions that alter the sequence of instructions being executed, that is, the branch instructions, reference parcels of words. In this case, the lower two bits of an address identify the location of an instruction parcel in a word.
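The parcel addressing rule above can be sketched in a few lines of Python. The split of a parcel address into a word address and a two-bit parcel number follows the text; the choice that parcel 0 is the leftmost 16 bits of the word is an assumption made for this illustration.

```python
PARCELS_PER_WORD = 4   # a 64-bit word holds four 16-bit parcels


def split_parcel_address(parcel_address):
    """Return (word address, parcel number within the word)."""
    return parcel_address >> 2, parcel_address & 0b11


def extract_parcel(word, parcel_in_word):
    """Pick one 16-bit parcel out of a 64-bit word.

    Assumption: parcel 0 is the leftmost (most significant) 16 bits.
    """
    shift = 16 * (PARCELS_PER_WORD - 1 - parcel_in_word)
    return (word >> shift) & 0xFFFF
```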
Instruction buffers
All instructions are executed from four instruction buffers, each consisting of 64 16-bit registers. Associated with each instruction buffer is a base address register that is used to determine if the current instruction resides in a buffer. Since the four instruction buffers are large, substantial program segments can reside in them. Forward and backward branching within the buffers is possible and the program segments may be noncontiguous.

When the current instruction does not reside in a buffer, one of the instruction buffers is filled from memory. Four memory words are transferred per clock period. The buffer that is filled is the one least recently filled, that is, the buffers are filled in rotation. To allow the current instruction to issue as soon as possible, the memory word containing the current instruction is among the first four transferred.

A parcel counter register (P) points to the next parcel to exit from the buffers. Prior to issue, instruction parcels may be held in the next instruction parcel (NIP), lower instruction parcel (LIP), and current instruction parcel (CIP) registers.

Operating registers
The CRAY-1 has five sets of registers, three primary and two intermediate. Primary registers can be accessed directly by functional units. Intermediate registers are not accessible by functional units but act as buffers between primary registers and memory.
The figure on page 4 represents the CRAY-1 registers and functional units. The 64 address and 64 scalar intermediate registers can be filled by block transfers from memory. Their purpose is to reduce memory references made by the scalar and address registers.
The functional units are all fully segmented. This means that a new set of operands for unrelated computation may enter a functional unit each clock period even though the functional unit time may be more than one clock period. This segmentation is made possible by capturing and holding the information arriving at the unit or moving within the unit at the end of every clock period.
The eight address registers are each 24 bits and can be used to count loops, provide shift counts, and act as index registers in addition to their main use for memory references.
The twelve functional units can be arbitrarily assigned to four groups: address, scalar, vector, and floating point. Each of the first three groups acts in conjunction with one of the three primary register types to support address, scalar, and vector modes of processing. The fourth group, floating point, can support either scalar or vector operations and will accept operands from or deliver results to scalar or vector registers accordingly.
The eight 64-bit scalar registers, in addition to contributing operands and receiving results for scalar operations, can provide one operand for vector operations. Each of the eight vector (V) registers is actually a set of 64 64-bit registers, called elements. The number of vector operations to be performed (that is, the vector length) determines how many of the elements of a register are used to supply operands in a vector set or receive results of the vector operation. The hardware accommodates vectors with lengths up to 64; longer vectors are handled by the software dividing the vector into 64-element segments and a remainder.
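The way software divides a long vector into 64-element segments and a remainder can be sketched as follows. This is a minimal Python illustration with hypothetical function names. It processes full segments first and the remainder last, which is one possible order; the text does not say which order the software uses.

```python
MAX_VL = 64   # longest vector the hardware handles in one operation


def segments(n):
    """Yield (start, length) pairs covering n elements, at most 64 at a time."""
    start = 0
    while start < n:
        length = min(MAX_VL, n - start)
        yield start, length
        start += length


def vector_add(a, b):
    """Add two equal-length lists one segment at a time."""
    result = []
    for start, vl in segments(len(a)):
        stop = start + vl
        result.extend(x + y for x, y in zip(a[start:stop], b[start:stop]))
    return result
```

A 150-element vector, for instance, is handled as two segments of 64 and a remainder of 22.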
Associated with the vector registers are a 7-bit vector length register and a 64-bit vector mask register. The vector length register, as its name implies, determines the number of operations performed by a vector instruction. Each bit of the vector mask register corresponds to an element of a V register. The mask is used with vector merge and test instructions to allow operations to be performed on individual vector elements.
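The interaction of the vector mask with merge and test instructions can be sketched in Python. The merge rule (take the first operand where the mask bit is 1, the second where it is 0) and the test rule (set the mask bit where an element is zero) follow the instruction descriptions in this brochure. The assumption that mask bit 0 is the leftmost bit of the 64-bit register is made for illustration only.

```python
def mask_where_zero(vj, vl):
    """Build a vector mask with a 1 bit for every element of vj equal to 0.

    Assumption: element i corresponds to bit i counted from the left.
    """
    vm = 0
    for i in range(vl):
        if vj[i] == 0:
            vm |= 1 << (63 - i)
    return vm


def vector_merge(vj, vk, vm, vl):
    """Element i comes from vj where mask bit i is 1, otherwise from vk."""
    out = []
    for i in range(vl):
        bit = (vm >> (63 - i)) & 1
        out.append(vj[i] if bit else vk[i])
    return out
```

Together the two steps replace a per-element conditional: the test builds the mask once, and the merge then selects between two operand vectors without branching.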
Supporting registers
In addition to the operating registers, the CPU contains a variety of auxiliary and control registers. For example, there is a channel address (CA) register and a channel limit (CL) register for each I/O channel.

FUNCTIONAL UNITS
Functional unit: unit time (clock periods)
Address integer add: 2
Address multiply: 6
Scalar integer add: 3
Scalar logical: 1
Scalar shift: 2 or 3
Scalar leading zero/pop count: 4 or 3
Vector integer add: 3
Vector logical: 2
Vector shift: 4
Floating point add: 6
Floating point multiply: 7
Floating point reciprocal: 14
Memory field protection
Each object program has a designated field of memory. Field limits are defined by a base address register and a limit address register. Any attempt to reference instructions or data beyond these limits results in a range error.
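The field protection check can be sketched as a bounds test against the base and limit registers. In this Python illustration, program addresses are taken to be relative to the base address and the limit is treated as exclusive; both details are assumptions, and the names are hypothetical.

```python
class RangeError(Exception):
    """Raised when a reference falls outside the program's memory field."""


def absolute_address(relative, base, limit):
    """Relocate a program address and check it against the field limits.

    Assumptions: addresses are relative to `base`, and `limit` is the
    first address beyond the field.
    """
    absolute = base + relative
    if not base <= absolute < limit:
        raise RangeError("address %d outside field" % absolute)
    return absolute
```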
Functional units
Instructions other than simple transmits or control operations are performed by hardware organizations known as functional units. Each of the twelve units in the CRAY-1 executes an algorithm or a portion of the instruction set. Units are independent. A number of functional units can be in operation at one time.
The technique employed in the CRAY-1 to switch execution from one program to another is termed the exchange mechanism. A 16-word block of program parameters is maintained for each program. When another program is to begin execution, an operation known as an exchange sequence is initiated. This sequence causes the program parameters for the next program to be executed to be exchanged with the information in the operating registers to be saved. The operating register contents are thus saved for the terminating program and entered with data for the new program.
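The exchange sequence amounts to a swap between a 16-word block in memory and the register contents. The Python sketch below models that swap only; representing the operating registers as a flat 16-word list is a simplification, and the parameter name `xa` is borrowed from the exchange address register mentioned in the exchange package legend.

```python
PACKAGE_WORDS = 16   # size of the program parameter block, from the text


def exchange(memory, xa, registers):
    """Swap the 16-word package at memory[xa] with the register image.

    `registers` stands in for the operating register contents (a
    simplification of the real package layout). Returns the register
    image for the program being started.
    """
    if len(registers) != PACKAGE_WORDS:
        raise ValueError("register image must be 16 words")
    incoming = memory[xa:xa + PACKAGE_WORDS]
    memory[xa:xa + PACKAGE_WORDS] = registers   # save the outgoing program
    return incoming                              # load the incoming program
```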
A functional unit receives operands from registers and delivers the result to a register when the function has been performed. The units operate essentially in three-address mode with source and destination addressing limited to register designators.
All functional units perform their algorithms in a fixed amount of time. No delays are possible once the operands have been delivered to the unit. The amount of time required from delivery of the operands to the unit to the completion of the calculation is termed the "functional unit time" and is measured in 12.5 nsec clock periods.
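Because the units are segmented and each has a fixed functional unit time, the time for a vector operation can be estimated with simple arithmetic. The Python sketch below uses the unit times from the functional units table and a simplified model in which the first result appears after the unit time and one further result follows every clock period. It ignores register access and memory overhead, so the figures are estimates rather than measured rates.

```python
CLOCK_NS = 12.5   # clock period, from the text

# Unit times in clock periods, from the functional units table.
UNIT_TIME = {
    "floating add": 6,
    "floating multiply": 7,
    "reciprocal approximation": 14,
}


def vector_clocks(unit, vl):
    """Clock periods until the last of vl results leaves the unit."""
    return UNIT_TIME[unit] + vl - 1


def results_per_second(unit, vl):
    """Average result rate for one vector operation of length vl."""
    return vl / (vector_clocks(unit, vl) * CLOCK_NS * 1e-9)
```

In this model a 64-element floating add takes 69 clock periods, or roughly 74 million results per second, which approaches the quoted 80 million as the startup cost is spread over more elements.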
Exchange sequences may be initiated automatically upon occurrence of an interrupt condition or may be voluntarily initiated by the user or by the operating system through normal and error exit instructions.

[Figure: EXCHANGE PACKAGE. Legend entries recoverable from the figure:
Registers: S, syndrome bits; RAB, read address for error (where B is bank); P, program address; BA, base address; LA, limit address; XA, exchange address; VL, vector length.
M, modes (bit position from left of word): 36, interrupt on correctable memory error; 37, interrupt on floating point; 38, interrupt on uncorrectable memory error; 39, monitor mode.
F, flags (bit position from left of word): 31, console interrupt; 32, RTC interrupt; 33, floating point error; 34, operand range; 35, program range; 36, memory error; 37, I/O interrupt; 38, error exit; 39, normal exit.
E, error type (bits 0, 1): 10, uncorrectable memory; 01, correctable memory.
R, read mode (bits 10, 11): 00, scalar; 01, I/O; 10, vector; 11, fetch.]

Clock period counter
Programs can be precisely timed with a real-time clock period counter that increments once each 12.5 nanoseconds.

Programmable real-time clock
As a special option, the CRAY-1 may be equipped with a programmable real-time clock that has a frequency of 10 MHz and an increment of 100 nanoseconds.

SOFTWARE
The CRAY-1, as with any other computer system, requires three types of software: an operating system, language/utility systems, and applications programs.

The Operating System
The CRAY Operating System is a group of memory or disk resident programs that manages the resources, supervises the running of jobs, and performs input/output operations.

The operating system is activated through a system dead start operation performed from the MCU. A job usually consists of compilation of a program written in some language such as FORTRAN, loading and execution of the program generated by the compiler, and processing of all output data generated by the job.

The operating system is purposely straightforward and uncomplicated to keep system overhead low. The operating system is multiprogrammed so that a large number of jobs can be in some stage of processing concurrently. Jobs are submitted by front-end (host) computers or at local or remote job entry stations. A job waits on mass storage until the operating system determines that the resources it requires are available. At this time, the system begins processing the job by examining each of the control statements that accompany the job deck and are used to inform the operating system of the tasks to be performed by the job. The control statements are read, interpreted, and acted on sequentially. Output from a job is placed on system mass storage. At job completion, this output may be transferred back to the front-end computer or station of job origin for processing such as printing or transfer to magnetic tape.

If a job is waiting for resources, operator action, data transfer, or some other action, the operating system may move the job onto mass storage (roll it out) so that some other job can be executed. Job flow is depicted on page 8.

Features of the operating system include the following:
- Resource management
- Remote or local job entry
- Multiprogramming of jobs
- Management of data sets that survive from one dead start to another (permanent data sets)
- Generation of a chronological history of each job called a log file. It lists each control statement and any errors encountered.
- Communication with station operators
- Staging of data sets between system mass storage and peripheral devices such as magnetic tape units

FORTRAN compiler
Developed in parallel with the CRAY-1 computer system is a powerful FORTRAN compiler designed to take advantage of the vector capability of the computer. The compiler itself determines the need for vectorizing and generates code accordingly, removing the burden of such considerations from the programmer. A significant effort is being made to develop optimizing routines that examine FORTRAN code to see if it can be vectorized. The compiler adheres closely to the ANSI 1966 standard.

Assembler
The Cray Research, Inc. CAL assembler provides users with a means of expressing symbolically all hardware functions of the CPU. Augmenting the instruction repertoire is a set of versatile pseudo instructions that provide users with options for generating macro instructions, organizing programs, and so on. CAL enables the user to tailor programs to the architecture of the CRAY-1. The operating system as well as most other software provided by Cray Research, Inc. is coded in CAL assembly language.
Source program maintenance
An UPDATE program provides a means of maintaining language programs as card images on mass storage or magnetic tape. The user converts the source language programs to a data set called a library in which each of the original cards is assigned a number. Later, when the user wishes to modify a card, it is referenced by its number and is accordingly deleted, replaced, or used as an insertion point for additional cards. After being corrected, the edited program can be submitted to the FORTRAN compiler or CAL assembler for processing. Corrections can be temporary for the purpose of testing new code or can be permanent.

[Figure: DATA FLOW THROUGH SYSTEM. Blocks shown: CRAY-1 computer system, mass storage subsystem (disk units), front-end processor, maintenance control unit with MCU console, and user-supplied supporting equipment such as a printer/plotter.]

Front-end systems
A variety of computer systems produced by other manufacturers may serve as front-end systems to the CRAY-1. A minimum of one such system is expected to be present. A front-end system may serve as a job entry station for submitting jobs to the CRAY-1 and processing output from jobs or as a data concentrator for multiplexing several remote stations or terminals. In addition, a front-end system may provide operator functions by passing commands and messages between the CRAY-1 and the operator at the front-end system.

A front-end system operates under control of its own operating system in a mode asynchronous to the CRAY-1 computer system. The MCU, when not used for monitor purposes, can serve in the capacity of a front-end computer system.

Applications programs
Applications programs are specialized programs usually written in a source language such as FORTRAN that solve particular user problems. These programs are generally written by customers. Cray Research, Inc. will provide software specialists on a contractual basis to assist customers in the preparation and development of specific applications programs.

MAINTENANCE CONTROL UNIT (MCU)
A 16-bit minicomputer system serves as a maintenance control unit. The MCU performs system initialization and basic recovery for the operating system. Included in the MCU system is a software package that enables the minicomputer to monitor CRAY-1 performance during production hours.

EXTERNAL INTERFACE
The CRAY-1 may be interfaced to front-end host systems through special controllers that compensate for differences in channel widths, machine word size, electrical logic levels, and control protocols. The interface is a Cray Research, Inc. product implemented in logic compatible with the host system.

SYSTEM MASS STORAGE
System mass storage consists of two or more Cray Research, Inc. DCU-2 Disk Controllers and multiple DD-19 Disk Storage Units. The disk controller is a Cray Research, Inc. product and is implemented in ECL logic similar to that used in the mainframe. Each controller may have four DD-19 disk storage units attached to it. Operational characteristics of the DD-19 units are summarized in the accompanying table.

ARCHITECTURE

Construction
The CRAY-1 with 16 memory banks is modularly constructed of 1662 modules held by 24 chassis. Each module contains two 6 in. by 8 in. printed circuit boards on which are mounted a maximum of 144 integrated circuit packages per board. Emitter coupled logic (ECL) is used throughout. Four basic chip types are used: a high-speed 5/4 NAND gate, a slow-speed 5/4 NAND gate, a 16x1 register chip, and a 1024x1 memory chip.

Appearance
The aesthetics of the machine have not been neglected. The CPU is attractively housed in a cylindrical cabinet. The chassis are arranged two per each of the twelve wedge-shaped columns. At the base are the twelve power supplies. The power supply cabinets, which extend outward from the base, are vinyl padded to provide seating for computer personnel. The compact mainframe occupies a mere 70 sq. ft. of floor space.

Cooling
The speed of the CPU is derived largely by keeping wire lengths extremely short in the mainframe. This, in turn, necessitates a dense concentration of components with an accompanying problem of heat dissipation. The Freon cooling system used in the CRAY-1 employs the latest in refrigeration technology to maintain a column temperature of about 68° in the unit.

CHARACTERISTICS OF DD-19 DISK STORAGE UNIT
Tracks per surface: [illegible]
Sectors per track: 18
Bits per sector: 32,768
Number of head groups: 10
Recording surfaces per drive: [illegible]
[label partly illegible] time: 15-80 [units illegible]
Data transfer rate (average bits per sec.): 35.4 x 10^6
Total bits that can be streamed to a unit (disk cylinder capacity): 5.9 x 10^6
MAINTENANCE SERVICES
Cray Research, Inc. provides resident maintenance engineers on a contractual basis.
CRAY-1 INSTRUCTION SET

CRAY-1     CAL         DESCRIPTION
000xxx     ERR         Error exit
†000ijk    ERR         Error exit
0010jk     CA,Aj       Set the channel (Aj) current address to (Ak) and begin the I/O sequence
                       Set the channel (Aj) limit address to (Ak)
                       Clear channel (Aj) interrupt flag
                       Enter XA register with (Aj)
                       Enter real-time clock register with (Sj)
0020xk     VL          Transmit (Ak) to VL register
†0020x0    VL          Transmit 1 to VL register
0021xx     EFI         Enable interrupt on floating point error
0022xx     DFI         Disable interrupt on floating point error
003xjx     VM          Transmit (Sj) to VM register
                       Clear VM register
                       Normal exit
†004ijk    EX          Normal exit
005xjk     J           Jump to (Bjk)
006ijkm    J           Jump to exp
007ijkm    R           Return jump to exp; set B00 to P
010ijkm    JAZ         Branch to exp if (A0) = 0
011ijkm    JAN         Branch to exp if (A0) ≠ 0
012ijkm    JAP         Branch to exp if (A0) positive
013ijkm    JAM         Branch to exp if (A0) negative
014ijkm    JSZ         Branch to exp if (S0) = 0
015ijkm    JSN         Branch to exp if (S0) ≠ 0
016ijkm    JSP         Branch to exp if (S0) positive
017ijkm    JSM         Branch to exp if (S0) negative
020ijkm    Ai          Transmit exp = jkm to Ai
021ijkm    Ai          Transmit exp = 1's complement of jkm to Ai
022ijk     Ai          Transmit exp = jk to Ai
023ijx     Ai Sj       Transmit (Sj) to Ai
024ijk     Ai Bjk      Transmit (Bjk) to Ai
025ijk     Bjk Ai      Transmit (Ai) to Bjk
026ijx     Ai PSj      Population count of (Sj) to Ai
027ijx     Ai ZSj      Leading zero count of (Sj) to Ai
030ijk     Ai Aj+Ak    Integer sum of (Aj) and (Ak) to Ai
                       Transmit (Ak) to Ai
                       Integer sum of (Aj) and 1 to Ai
                       Integer difference of (Aj) less (Ak) to Ai
                       Transmit -1 to Ai
                       Transmit the negative of (Ak) to Ai
                       Integer difference of (Aj) less 1 to Ai
                       Integer product of (Aj) and (Ak) to Ai
           CI          Channel number to Ai (j=0)
           CA,Aj       Address of channel (Aj) to Ai (j≠0; k=0)
           CE,Aj       Error flag of channel (Aj) to Ai (j≠0; k=1)
           ,A0         Read (Ai) words to B register jk from (A0)
                       Read (Ai) words to B register jk from (A0)
                       Store (Ai) words at B register jk to (A0)
CRAY-1     CAL         DESCRIPTION
                       Floating sum of (Sj) and (Sk) to Si
                       Normalize (Sk) to Si
                       Floating difference of (Sj) and (Sk) to Si
                       Transmit normalized negative of (Sk) to Si
                       Floating product of (Sj) and (Sk) to Si
                       Half precision rounded floating product of (Sj) and (Sk) to Si
                       Full precision rounded floating product of (Sj) and (Sk) to Si
                       2 - floating product of (Sj) and (Sk) to Si
                       Floating reciprocal approximation of (Sj) to Si
                       Transmit (Ak) to Si with no sign extension
                       Transmit (Ak) to Si with sign extension
                       Transmit (Ak) to Si as unnormalized floating point number
071i3x                 Transmit constant 0.75*2**48 to Si
071i4x                 Transmit constant 0.5 to Si
071i5x                 Transmit constant 1.0 to Si
071i6x                 Transmit constant 2.0 to Si
071i7x                 Transmit constant 4.0 to Si
072ixx                 Transmit (RTC) to Si
073ixx                 Transmit (VM) to Si
074ijk                 Transmit (Tjk) to Si
075ijk                 Transmit (Si) to Tjk
076ijk                 Transmit (Vj, element (Ak)) to Si
077ijk                 Transmit (Sj) to Vi element (Ak)
†077i0k                Clear Vi element (Ak)
10hijkm                Read from ((Ah) + exp) to Ai (A0=0)
†100ijkm               Read from (exp) to Ai
†100ijkm               Read from (exp) to Ai
†10hi000               Read from (Ah) to Ai
11hijkm    exp,Ah      Store (Ai) to (Ah) + exp (A0=0)
†110ijkm               Store (Ai) to exp
†110ijkm               Store (Ai) to exp
†11hi000               Store (Ai) to (Ah)
12hijkm                Read from ((Ah) + exp) to Si (A0=0)
†120ijkm               Read from exp to Si
†120ijkm               Read from exp to Si
†12hi000               Read from (Ah) to Si
13hijkm                Store (Si) to (Ah) + exp (A0=0)
†130ijkm               Store (Si) to exp
†130ijkm               Store (Si) to exp
†13hi000               Store (Si) to (Ah)
140ijk                 Logical products of (Sj) and (Vk) to Vi
†140i00                Clear Vi
141ijk                 Logical products of (Vj) and (Vk) to Vi
142ijk                 Logical sums of (Sj) and (Vk) to Vi
†142i0k                Transmit (Vk) to Vi
143ijk                 Logical sums of (Vj) and (Vk) to Vi
CAL         DESCRIPTION
Tjk,Ai      Read (Ai) words to T register jk from (A0)
Tjk,Ai      Store (Ai) words at T register jk to (A0)
            Store (Ai) words at T register jk to (A0)
exp         Transmit jkm to Si
            Transmit exp = 1's complement of jkm to Si
<exp        Form 1's mask exp = 64-jk bits in Si from the right
1           Enter 1 into Si
-1          Enter -1 into Si
>exp        Form 1's mask exp = jk bits in Si from the left
0           Clear Si
Sj&Sk       Logical product of (Sj) and (Sk) to Si
Sj&SB       Sign bit of (Sj) to Si
SB&Sj       Sign bit of (Sj) to Si (j≠0)
#Sk&Sj      Logical product of (Sj) and 1's complement of (Sk) to Si
#SB&Sj      (Sj) with sign bit cleared to Si
Sj\Sk       Logical difference of (Sj) and (Sk) to Si
Sj\SB       Toggle sign bit of Sj, then enter into Si
SB\Sj       Toggle sign bit of Sj, then enter into Si (j≠0)
#Sj\Sk      Logical equivalence of (Sk) and (Sj) to Si
#Sk         Transmit 1's complement of (Sk) to Si
#Sj\SB      Logical equivalence of (Sj) and sign bit to Si
#SB\Sj      Logical equivalence of (Sj) and sign bit to Si (j≠0)
#SB         Enter 1's complement of sign bit into Si
Sj!Si&Sk    Logical product of (Si) and (Sk) complement ORed with logical product of (Sj) and (Sk) to Si
Sj!Si&SB    Scalar merge of (Si) and sign bit of (Sj) to Si
Sj!Sk       Logical sum of (Sj) and (Sk) to Si
Sk          Transmit (Sk) to Si
Sj!SB       Logical sum of (Sj) and sign bit to Si
SB!Sj       Logical sum of (Sj) and sign bit to Si (j≠0)
SB          Enter sign bit into Si
Si<exp      Shift (Si) left exp = jk places to S0
Si>exp      Shift (Si) right exp = 64-jk places to S0
Si<exp      Shift (Si) left exp = jk places
Si>exp      Shift (Si) right exp = 64-jk places
Si,Sj<Ak    Shift (Si and Sj) left (Ak) places to Si

CRAY-1 codes legible in this part of the table: †057i0k, 060ijk, 061ijk, †061i0k

CAL         DESCRIPTION
Sj!Vk&VM    Transmit (Sj) if VM bit = 1; (Vk) if VM bit = 0 to Vi
†#VM&Vk     Vector merge of (Vk) and 0 to Vi
Vj!Vk&VM    Transmit (Vj) if VM bit = 1; (Vk) if VM bit = 0 to Vi
Vj>Ak       Shift (Vj) right (Ak) places to Vi
Vj>1        Shift (Vj) right one place to Vi
Vj,Vj>Ak    Double shift (Vj) right (Ak) places to Vi
Vj,Vj>1     Double shift (Vj) right one place to Vi
Sj+Vk       Integer sums of (Sj) and (Vk) to Vi
Vj+Vk       Integer sums of (Vj) and (Vk) to Vi
Sj-Vk       Integer differences of (Sj) and (Vk) to Vi
-Vk         Transmit negative of (Vk) to Vi
Vj-Vk       Integer differences of (Vj) and (Vk) to Vi
            Floating products of (Sj) and (Vk) to Vi
            Floating products of (Vj) and (Vk) to Vi
Sj*HVk      Half precision rounded floating products of (Sj) and (Vk) to Vi
Vj*HVk      Half precision rounded floating products of (Vj) and (Vk) to Vi
Sj*RVk      Rounded floating products of (Sj) and (Vk) to Vi
            Rounded floating products of (Vj) and (Vk) to Vi
            2 - floating products of (Sj) and (Vk) to Vi
            2 - floating products of (Vj) and (Vk) to Vi
            Floating sums of (Sj) and (Vk) to Vi
            Normalize (Vk) to Vi
            Floating sums of (Vj) and (Vk) to Vi
            Floating differences of (Sj) and (Vk) to Vi
            Transmit normalized negatives of (Vk) to Vi
            Floating differences of (Vj) and (Vk) to Vi
            Floating reciprocal approximations of (Vj) to Vi
Vj,Z        VM=1 where (Vj) = 0
Vj,N        VM=1 where (Vj) ≠ 0
Vj,P        VM=1 where (Vj) positive
Vj,M        VM=1 where (Vj) negative
,A0,Ak      Read (VL) words to Vi from (A0) incremented by (Ak)
,A0,1       Read (VL) words to Vi from (A0) incremented by 1
Vj          Store (VL) words from Vj to (A0) incremented by (Ak)
Vj          Store (VL) words from Vj to (A0) incremented by 1
Cray Research, Inc.
Corporate Addresses

General Offices

Corporate Headquarters
1440 Northland Drive
Mendota Heights, Minnesota 55120

Manufacturing
Industrial Park
Chippewa Falls, Wisconsin 54279

Sales Offices

Domestic

Eastern Regional Sales
10750 Columbia Pike, Suite 602
Silver Spring, Maryland 20901

Central Regional Sales
1440 Northland Drive
Mendota Heights, Minnesota 55120

Mountain Regional Sales
75 Manhattan Drive, Suite 3
Boulder, Colorado 80303

Houston District (Petroleum)
3121 Buffalo Speedway, Suite 400
Houston, Texas 77098

Western Regional Sales
101 Continental Boulevard, Suite 456
El Segundo, California 90245

Seattle District
536A Medical and Dental Building
Everett, Washington 98201

International

Cray Research (U.K.) Limited
James Glaisher House
Grenville Place
Bracknell, England