Control Unit, 2-Bus Implementation of the SRC Data Path

<< Logic Design for the Uni-bus SRC, Control Signals Generation in SRC

3-bus implementation for the SRC, Machine Exceptions, Reset >>

Advanced Computer Architecture-CS501

Advanced Computer Architecture

Lecture No. 16

Reading Material

Vincent P. Heuring & Harry F. Jordan

Chapter 4

Computer Systems Design and Architecture

4.2.2, 4.6.1

Summary

�

Control Signals Generation in SRC (continued...)

�

The Control Unit

�

2-Bus Implementation of the SRC Data Path

This section of lecture 16 is a continuation of the previous lecture.

Control signals for the store instruction

st ra, c2(rb)

The store time step operations are similar to the load instruction, with the exception of

steps T6 and T7. However, one can easily interpret these now. These are outlined in the

given table.

Control signals for the branch and branch link instructions

Branch instructions can be either be simple branches or link-and-then-branch type. The

syntax for the branch instructions is

brzr rb, rc

This is the branch and zero instruction we looked at earlier. The control signals for this

instruction are:

As usual, the first three steps are for the instruction fetch phase. Next, the following

control signals are issued:

Page 185

Last Modified: 01-Nov-06

Advanced Computer Architecture-CS501

LCON to enable the CON circuitry to operate, and instruct it to check for the appropriate

condition (whether it is branch if zero, or branch if not equal to zero, etc.)

RCE to allow the register rc value to be read.

R2BUS allows the bus to read from the selected register.

At step T4:

RBE to allow the register rb value to be read. rb value is the branch target address.

R2BUS allows the bus to read from the selected register.

LPC (if CON=1): this control signal is issued conditionally, i.e. only if CON is 1, to

enable the write for the program counter. CON is set to 1 only if the specified condition is

met. In this way, if the condition is met, the program counter is set to the branch address.

Branch and link instructions

The branch and link instruction is similar to the branch instruction, with an additional

step, T4. Step T4 of the simple conditional branch instruction becomes the step T5 in this

case.

The syntax of the instruction `branch and link if zero' is

brlzr ra, rb, rc

Table that lists the RTL and control signals for the store instruction of the SRC is given:

The circuitry that enables the condition checking for the conditional branches in the SRC

is illustrated in the following figure:

Page 186

Last Modified: 01-Nov-06

Advanced Computer Architecture-CS501

Control signals for the shift right instruction

The given table illustrates the RTL and the control signals for the shift right `shr'

instruction. This is implemented by applying the five bits of n (nb4, nb3, nb2, nb1, nb0)

to the select inputs of the barrel shifter and activating the control signal SHR as explained

in an earlier lecture.

Page 187

Last Modified: 01-Nov-06

Advanced Computer Architecture-CS501

Generating the Test Condition N=0

The Control Unit

The control unit is responsible for generating control signals as well as the timing signals.

Hence the control unit is responsible for the synchronization of internal as well as

external events. By means of the control signals, the control unit instructs the data path

what to do in every clock cycle during the execution of instructions.

Control Unit Design

Since the control unit performs quite complex tasks, its design must be done very

carefully. Most errors in processor design are in the Control Unit design phase. There are

primarily two approaches to design a control unit.

1. Hardwired approach

2. Micro programming

Hardwired approach is relatively faster, however, the final circuit is quite complex. The

micro-programmed implementation is usually slow, but it is much more flexible.

"Finite-state machine" concepts are usually used to represent the CU. Every state

corresponds to one "clock cycle" i.e., 1 state per clock. In other words each timing step

could be considered as just 1 state and therefore from one timing step to other timing

step, the state would change. Now, if we consider the control unit as a black box, then

there would be four sets of inputs to the control unit. These are as follows:

1. The output of timing step generator (There are 8 disjoint timing steps in our

example T0-T7).

2. Op-code (op-code is first given to the decoder and the output of the decoder is

given to the control unit).

3. Data path generated signals, like the "CON" control signal,

4. Signals from external events, like "Interrupt" generated by the Interrupt generator.

The complexity of the control is a function of the

� Number of states

� Number of inputs to the CU

� Number of the outputs generated by the CU

Page 188

Last Modified: 01-Nov-06

Advanced Computer Architecture-CS501

Hardwired Implementation of the Control Unit

The accompanying block diagram shows the inputs to the control unit. The output control

signals generated from control unit to the various parts of the processor are also shown in

the figure.

Example Control Unit for the FALCON-A

The following figure shows how the operation code (op-code) field of the Instruction

This is an example for the FALCON-A processor where the instruction is 16-bit long.

Similar concepts will apply to the SRC, in which case the instruction word is 32 bits and

IR <31...27> contains the op-code. Similar concepts will apply to the SRC, in which case

Page 189

Last Modified: 01-Nov-06

Advanced Computer Architecture-CS501

the instruction word is 32 bits and IR<31..27> contains the opcode. The most significant

5 bits represent the op-code. These 5-bits from the IR are fed to a 5-to-32 decoder. These

32 outputs are numbered from 0-to-31 and named as op0, op1 up to op31. Only one of

these 32 outputs will be active at a given time .The active output will correspond to

instruction executing on the processor.

To design a control unit, the next step is to write the Boolean Equations. For this we need

to browse through the structural descriptions to see which particular control signals occur

in different timing steps. So, for each instruction we have one such table defining

structural RTL and the control signals generated at each timing step. After browsing we

need to check that which control signal is activated under which condition. Finally we

need to write the expression in the form of a logical expression as the logical combination

of "AND" and "OR" of different control signals. The given table shows Boolean

Equations for some example control signals.

For example, PCout would be active in every T0 timing step. Then in timing interval T3

the output of the PC would be activated if the op-code is 20 or 22 which represent jump

and sub-routine call. In step T4 if the op-code is 16, 17, 18 or 19, again we need PCout

activated and these 4 instructions correspond to the conditional jumps. We can say that in

other words in step T1, PCout is always activated "OR" in T3 it is activated if the

instruction is either jump or sub-routine call "OR" in T4 if there is one of the conditional

jumps. We can write an equation for it as

PCout=T0+T3.(OP20+OP22)+T4.(OP16+OP17+OP18+OP19)

In the form of logic circuit the implementation is shown in the figure. We can see that we

"OR" the op-ode 20 and 22 and "AND" it with T3, then "OR" all the op16 up to op19

and "AND" it with T4, then T0 and the "AND" outputs of T3 and T4 are "OR" together

to obtain the PCout.

Page 190

Last Modified: 01-Nov-06

Advanced Computer Architecture-CS501

In the same way the logic circuit for LPC control signal is as shown and the equation

would be :

LPC=T1+T5.OP20+T6.CON.(OP16+OP17+OP18+OP19)

We can formulate Boolean equations and draw logic circuits for other control signals in

the same way.

Effect of using "real" Gates

We have assumed so far that the gates are ideal and that there is no propagation delay. In

designing the control unit, the propagation delays for the gates can not be neglected. In

particular, if different gates are cascaded, the output of one gate forms the input of other.

The propagation delays would add up. This, in turn would place an upper limit on the

Page 191

Last Modified: 01-Nov-06

Advanced Computer Architecture-CS501

frequency of the clock which controls the generation of the timing intervals T0, T1... T7.

So, we can not arbitrarily increase the frequency of this clock. As an example consider

the transfer of the contents of a register R1 to a register R2. The minimum time required

to perform this transfer is given by

tmin = tg + tbp + tcomb + t1

The details are explained in the text with reference to Fig 4.10. Thus, the maximum clock

frequency based on this transfer will be 1/tmin. Students are encouraged to study example

4.1 of the text.

2-Bus Implementation of the SRC Data Path

In the previous sections, we studied the uni-bus implementation of the data path in the

SRC. Now we present a 2-bus implementation of the data path in the SRC. We observe

from this figure that there is a bus provided for data that is to be written to a component.

This bus is named the `in' bus. Another bus is provided for reading out the values from

these components. It is called the `out' bus.

Structural RTL for the `sub' instruction using the 2-bus data path implementation

Next, we look at the structural RTL as well as the control signals that are issued in

sequence for instruction execution in a 2-bus implementation of the data path. The given

table illustrates the Register Transfer Language representation of the operations for

carrying out instruction fetch, and execution for the sub instruction.

Page 192

Last Modified: 01-Nov-06

Advanced Computer Architecture-CS501

The first three steps belong to the instruction fetch phase; the instruction to be executed is

fetched into the Instruction Register and the PC value is incremented to point to the next-

in-line instruction. At step T3, the register R[rb] value is written to register A. At the time

step T4, the subtracted result from the ALSU is assigned to the destination register R[ra].

Notice that we did not need to store the result in a temporary register due to the

availability of two buses in place of one. At the end of this sequence, the timing step

generator is initialized to T0.

Control signals for the fetch operation

The control signals for the instruction fetch phase are shown in the table. A brief

explanation is given below:

At time step T0, the following control signals are issued:

� PCout: This will enable read of the Program Counter, and so its value will be

transferred onto the `out' bus

� LMAR: To enable the load for MAR

� C=B: This instruction is used to copy the value on the `out' bus to the `in' bus, so

it can be loaded into the Memory Address Register. We can observe in the data-

path implementation figure given earlier that, at any time, the value on the `out'

bus makes up the operand B for the ALSU. The result C of ALSU is connected to

the "in" bus, and therefore, the contents transfer from one bus to the other can

take place.

Page 193

Last Modified: 01-Nov-06

Advanced Computer Architecture-CS501

At time step T1:

� PCout: Again, this will enable read of the Program Counter, and so its value will

be transferred onto the CPU internal `out' bus

� INC4: To instruct the ALSU to perform the increment-by-four operation.

� LPC: This control signal will enable write of the Program Counter, thus the new,

incremented value can be written into the PC if it is made available on the "in"

bus. Note that the ALSU is assumed to include an INC4 function.

� MRead: To enable memory word read.

� MARout: To supply the address of memory word to be accessed by allowing the

contents of the MAR (memory address register) to be written onto the CPU

external (address) bus.

� LMBR: The memory word is stored in the register MBR (memory buffer

At time step T2:

� MBRout: The contents of the Memory Buffer Register are read out onto the

`out' bus, by means of applying this signal, as it enables the read for the MBR.

� C=B: Once again, this signal is used to copy the value from the `out' bus to the

`in' bus, so it can be loaded into the Memory Address Register.

� LIR: This instruction will enable the write of the Instruction Register. Hence the

instruction that is on the `in' bus is loaded into this register.

At time step T3, the execution may begin, and the control signals issued at this stage

depend on the actual instruction encountered. The control signals issued for the

instruction fetch phase are the same for all the instructions.

Note that, we assume the memory to be fast enough to respond during a given time slot.

If that is not true, wait states have to be inserted. Also keep in mind that the control

signals during each time slot are activated simultaneously, while those for successive

time slots are activated in sequence. If a particular control signal is not shown, its value is

zero.

Page 194

Last Modified: 01-Nov-06

Table of Contents: