ZeePedia buy college essays online

Advance Computer Architecture

<<< Previous Performance of I/O Subsystems, Server Utilization, Asynchronous I/O and operating system Next >>>
Advanced Computer Architecture-CS501
Advanced Computer Architecture
Lecture No. 42
Reading Material
Patterson, D.A. and Hennessy, J.L.
Chapter 8
Computer Architecture -A Quantitative Approach
Performance of I/O Subsystems
Loss System
Single Server Model
Little's Law
Server Utilization
Poisson distribution
Benchmarks programs
Asynchronous I/O and operating system
Consider a producer-server model. A buffer (or queue) is present between them. Tasks
are being received and when one task is finished (i.e. served) then the second task is
taken up by the server. Now latency and the response time depend upon how many tasks
are present in the queue and how quickly they are served. If there is no task, ahead in the
queue the latency would be low and response time would be shorter.
Through put depends upon the average number of calls and the service time taken by a
particular server.
Performance of I/O Subsystems
There are three methods to measure I/O subsystem performance:
 Straight away calculations using execution time
 Queuing Theory
Loss System
Loss system is a simple system having no buffer so it does not have any provision for the
queuing. In a loss system, provision is time in term of how many switches we do need,
then provide some redundancy how many individuals I/O controllers we do need, then
how many CPUs are there. It is also called dimension of a loss system.
Delay System
Page 367
Last Modified: 01-Nov-06
Advanced Computer Architecture-CS501
This system provides additional facilities. If we find some call party busy, we can have
provision of call waiting. If we have more than one call waiting, then once we finish the
first call, we may receive the second call.
Single Server Model
Consider a black box. Suppose it represents an I/O controller. At the input, we have
arrival of different tasks. As one task is done, we have a departure at the output. So in the
black box, we have a server. Now if we expand and open-up the black box, we could see
that incoming calls are coming into the buffer and the output of the buffer is connected to
the server. This is an example of "single server model".
Little's Law
For a system with multiple independent requests for I/O service and input rate equal to
output rate, we use Little's law to find the mean number of tasks in the system and Time
sys such that
Mean number of tasks = Arrival Rate x Mean Response time
Timesys = Timeq + Times
Times = Average time to serve task
Timeq = Average time per task in the queue
Timesys = Aver time /task
Arrival Rate = λ = Average number of arriving tasks
Lengths = Average number of task in service
Lengthq = Average length of queue
Lengthsys= Lengthq +Lengths
Server Utilization
Server Utilization = Arrival Rate x Timeq
Server utilization is also called traffic intensity and its value must be between 0 and 1.
Server utilization depends upon two parameters:
1. Arrival Rate
2. Average time required to serve each task
So, we can say that it depends on the I/O bandwidth and arrival rate of calls into the
Example 1
Page 368
Last Modified: 01-Nov-06
Advanced Computer Architecture-CS501
Suppose an I/O system with a single disk gets (on average) 100 I/O requests/second.
Assume that average time for a disk to service an I/O request is 5ms. What is the
utilization of the I/O system?
Time for an I/O request = 5ms
Server utilization = 100 x 0.005
= 0.5
Poisson distribution
In order to calculate the response time of an I/O system, we make the following
1. Arrival is random
2. System is memory less. It means that incoming calls are not correlated.
For characterize random events, according to above two assumptions, we use Poisson
Probability (k)= (e-k x ak ) /k!
a= Rate of events x Elapsed time
= Arrival rate x t
= -----------------------------------
(Arithmetic mean time) 2
Average Residual Service Time = x weighted mean time x (1+C2 )
Example 2
For the system of previous example having server utilization of 0.5, what is the mean
number of I/O requests in the queue?
(Server utilization) 2
Lengthq = ---------------------------
(1- Server utilization)
Lengthq = (0.5) 2 / (1-0.5)= 0.5
Assumptions about Queuing Model
Poisson distribution is assumed
The system is in equilibrium
The length of the queue is infinity
The system has only one server
Page 369
Last Modified: 01-Nov-06
Advanced Computer Architecture-CS501
5. The server will start the next task after finishing the previous one.
Example 3
Suppose a processor sends 10 disks I/O per second, these requests are exponentially
distributed, and the average service time of an older disk is 10ms. Answer the following
What is the number of requests in the queue?
What is the average time a spent in the queue?
What is the average response time for a disk request?
Average number of arriving tasks/second = 20
Average disk time = 10ms = 0.01sec
Sever utilization = 20 x 0.01=0.2
Timeq = 10ms x 0.2/(1-0.2) = 2.5ms
Average response time = 2.5+10=22.5ms
M/M/m model of queuing theory
A system which has multiple servers is called M/M/m model.
The following formulas are used for M/M/m model:
Arrival Rate x Times
Utilization = -----------------------------
Lengths = Arrival Rate x Timeq
(Times x (Ptasks>= Ns))
Timeq = ----------------------------------
Ns x (1- utilization)
Ns x utilization
Probtasks>= Ns = -------------------------- x Prob0tasks
Ns! x (1-utilization)
Suppose instead of a new, faster disk, we add a second slow disk, and duplicate the data
so that read can be serviced by either disk. Let's assume that the requests are all reads.
Recalculate the answers to the earlier questions, this time using an M/M/m queue.
The average utilization of the two disks is given as;
Page 370
Last Modified: 01-Nov-06
Advanced Computer Architecture-CS501
Arrival rate x Times
Server utilization = ----------------------------
= (20 x 0.01) / 2
= 0.1
(2 x utilization) 2
(2 x utilization ) n
= [ 1 + ------------------------- + --------------------------] -1
2! x (1- utilization)
(2x 0.1) 2
= [ 1 + ---------------- + (2 x 0.1)] -1
2! x (1- 0.1)
= (1 + .022 + 0.2 ) -1
= 1.222-1
(2 x utilization)  2
Probtasks>= Ns = ------------------------- x Prob0tasks
2! x (1- utilization)
(2x 0.1) 2
---------------- x 1.222-1
2! x (1- 0.1)
= 0.018
Probtasks>= Ns
Timeq = Times x ----------------------------
Ns x (1- utilization)
= 0.01 x 0.018 / ( 2 x 0.9)
= 0.1msec
Average response time = 10msec + 0.1msec
= 10.01msec
Benchmarks programs
In order to measure the performance of real systems and to collect the values of
parameters needed for prediction, Benchmark programs are used.
Types of Benchmark programs
Two types of benchmark programs are used:
Page 371
Last Modified: 01-Nov-06
Advanced Computer Architecture-CS501
Asynchronous I/O and operating system
In order to improve the I/O performance, parallelism is used.
For this, two approaches are available:
 Synchronous I/O
 Asynchronous I/O
Synchronous I/O
In this approach, operating system requests data and switches to another process. Until
the desire data arrived. Then the operating system switches back to the requesting
Asynchronous I/O
This model is of the process to continue after making a request and it is not blocked until
it tries to read requested data.
Bus versus switches
Consider a LAN, using bus topology. If we replace the bus with a switch, the speed of the
data transfer will be improved to a great extent.
Page 372
Last Modified: 01-Nov-06
Table of Contents:
  1. Computer Architecture, Organization and Design
  2. Foundations of Computer Architecture, RISC and CISC
  3. Measures of Performance SRC Features and Instruction Formats
  4. ISA, Instruction Formats, Coding and Hand Assembly
  5. Reverse Assembly, SRC in the form of RTL
  6. RTL to Describe the SRC, Register Transfer using Digital Logic Circuits
  7. Thinking Process for ISA Design
  8. Introduction to the ISA of the FALCON-A and Examples
  9. Behavioral Register Transfer Language for FALCON-A, The EAGLE
  10. The FALCON-E, Instruction Set Architecture Comparison
  11. CISC microprocessor:The Motorola MC68000, RISC Architecture:The SPARC
  12. Design Process, Uni-Bus implementation for the SRC, Structural RTL for the SRC instructions
  13. Structural RTL Description of the SRC and FALCON-A
  14. External FALCON-A CPU Interface
  15. Logic Design for the Uni-bus SRC, Control Signals Generation in SRC
  16. Control Unit, 2-Bus Implementation of the SRC Data Path
  17. 3-bus implementation for the SRC, Machine Exceptions, Reset
  18. SRC Exception Processing Mechanism, Pipelining, Pipeline Design
  19. Adapting SRC instructions for Pipelined, Control Signals
  20. SRC, RTL, Data Dependence Distance, Forwarding, Compiler Solution to Hazards
  21. Data Forwarding Hardware, Superscalar, VLIW Architecture
  22. Microprogramming, General Microcoded Controller, Horizontal and Vertical Schemes
  23. I/O Subsystems, Components, Memory Mapped vs Isolated, Serial and Parallel Transfers
  24. Designing Parallel Input Output Ports, SAD, NUXI, Address Decoder , Delay Interval
  25. Designing a Parallel Input Port, Memory Mapped Input Output Ports, wrap around, Data Bus Multiplexing
  26. Programmed Input Output for FALCON-A and SRC
  27. Programmed Input Output Driver for SRC, Input Output
  28. Comparison of Interrupt driven Input Output and Polling
  29. Preparing source files for FALSIM, FALCON-A assembly language techniques
  30. Nested Interrupts, Interrupt Mask, DMA
  31. Direct Memory Access - DMA
  32. Semiconductor Memory vs Hard Disk, Mechanical Delays and Flash Memory
  33. Hard Drive Technologies
  34. Arithmetic Logic Shift Unit - ALSU, Radix Conversion, Fixed Point Numbers
  35. Overflow, Implementations of the adder, Unsigned and Signed Multiplication
  36. NxN Crossbar Design for Barrel Rotator, IEEE Floating-Point, Addition, Subtraction, Multiplication, Division
  37. CPU to Memory Interface, Static RAM, One two Dimensional Memory Cells, Matrix and Tree Decoders
  38. Memory Modules, Read Only Memory, ROM, Cache
  39. Cache Organization and Functions, Cache Controller Logic, Cache Strategies
  40. Virtual Memory Organization
  41. DRAM, Pipelining, Pre-charging and Parallelism, Hit Rate and Miss Rate, Access Time, Cache
  42. Performance of I/O Subsystems, Server Utilization, Asynchronous I/O and operating system
  43. Difference between distributed computing and computer networks
  44. Physical Media, Shared Medium, Switched Medium, Network Topologies, Seven-layer OSI Model