Unit VIII: Reduced Instruction Set Computer (RISC) - Computer Architecture - BCA Notes (Pokhara University)

Monday, June 15, 2020

Unit VIII: Reduced Instruction Set Computer (RISC) - Computer Architecture

Introduction to CISC and RISC Architecture:

Hardware designers use numerous technologies and tools to implement a desired architecture. An architecture may rely more on hardware or more on software to do its work, and in practice both are used in the proportion the application requires. As far as processor hardware is concerned, there are two approaches to implementing the architecture: RISC and CISC.

CISC (Complex Instruction Set Computer) can carry out multi-step operations or complex addressing modes within a single instruction. It is a CPU design in which one instruction performs many low-level operations, for example loading from memory, an arithmetic operation, and a memory store.

RISC (Reduced Instruction Set Computer) is a CPU design strategy based on the insight that a simplified instruction set gives higher performance when combined with a microprocessor architecture that can execute each instruction in only a few clock cycles.

CISC Architecture:

The CISC approach attempts to minimize the number of instructions per program, sacrificing the number of cycles per instruction. Computers based on the CISC architecture were designed to decrease memory cost: large programs need more storage, and large memories were expensive. To address this, the number of instructions per program is reduced by embedding several operations in a single instruction, thereby making the instructions more complex.

Characteristics of CISC Architecture:

  1. The instruction-decoding logic is complex.
  2. A single instruction may have to support multiple addressing modes.
  3. Little chip space is needed for general-purpose registers, because instructions operate directly on memory.
  4. Various CISC designs set aside special registers for the stack pointer, interrupt handling, and so on.
  5. MUL is an example of a “complex instruction”: it operates directly on memory and does not require the programmer to call any loading or storing functions explicitly.
  6. In CISC, MUL loads the two values from memory into separate registers, multiplies them, and stores the product back in memory.
  7. CISC uses the minimum possible number of instructions by implementing operations in hardware.
  8. The Instruction Set Architecture (ISA) is the medium of communication between the programmer and the hardware. Operations such as executing, copying, deleting, or editing data are the user-level commands the microprocessor carries out, and they are expressed through the instruction set architecture.
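
As a rough illustration of characteristics 5 and 6, the toy Python model below contrasts a memory-to-memory MUL with the equivalent sequence on a load/store machine. The memory addresses, the register names, the PROD mnemonic, and the helper functions `cisc_mul` and `risc_mul` are all hypothetical; no real instruction set is modelled.

```python
# Toy model: memory maps address -> value, registers map name -> value.
# All names are illustrative assumptions, not a real instruction set.

def cisc_mul(memory, addr_a, addr_b):
    """One complex instruction: load both operands, multiply, store the product."""
    reg_a = memory[addr_a]            # implicit load of the first operand
    reg_b = memory[addr_b]            # implicit load of the second operand
    memory[addr_a] = reg_a * reg_b    # implicit store of the product
    return 1                          # instructions written by the programmer


def risc_mul(memory, registers, addr_a, addr_b):
    """The same work expressed as four simple instructions."""
    program = [
        ("LOAD", "A", addr_a),
        ("LOAD", "B", addr_b),
        ("PROD", "A", "B"),
        ("STORE", addr_a, "A"),
    ]
    for op, x, y in program:
        if op == "LOAD":
            registers[x] = memory[y]
        elif op == "PROD":
            registers[x] = registers[x] * registers[y]
        elif op == "STORE":
            memory[x] = registers[y]
    return len(program)


mem_cisc = {100: 6, 104: 7}
mem_risc = dict(mem_cisc)
regs = {}
cisc_count = cisc_mul(mem_cisc, 100, 104)
risc_count = risc_mul(mem_risc, regs, 100, 104)
print(cisc_count, risc_count, mem_cisc[100], mem_risc[100])
```

Both versions leave the same product in memory; the CISC version needs one instruction and the load/store version needs four, but in the latter the operands remain in registers afterwards and can be reused without another memory access.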

Advantages of CISC Architecture:

  1. Microprogramming is as easy as assembly language to implement, and less expensive than hard-wiring a control unit.
  2. The ease of micro-coding new instructions allowed designers to make CISC machines upwardly compatible: a new computer could run the same programs as earlier computers because it contained a superset of their instructions.
  3. As each instruction became more capable, fewer instructions were needed to implement a given task.

Disadvantages of CISC Architecture:

  1. Different instructions take different amounts of clock time, which slows down the overall performance of the machine.
  2. Only about 20% of the available instructions are used in a typical program; many specialized instructions are rarely used at all.
  3. Most CISC instructions set the condition codes as a side effect. Setting them takes time, and because a subsequent instruction may change the condition code bits, the programmer or compiler has to examine them before that happens.

RISC Architecture:

RISC (Reduced Instruction Set Computer) is a type of microprocessor architecture that uses a small, highly optimized set of instructions. Because of its power efficiency it is widely used in portable devices, for example the Apple iPod and the Nintendo DS. Where CISC minimizes the number of instructions per program, RISC does the opposite: it reduces the cycles per instruction at the cost of more instructions per program. Pipelining is one of the characteristic features of RISC. It is performed by overlapping the execution of several instructions in a pipeline fashion, which gives RISC a performance advantage over CISC.
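
The trade-off between instructions per program and cycles per instruction can be made concrete with the usual performance equation: CPU time = instructions per program * cycles per instruction * time per cycle. The sketch below only shows the arithmetic; the instruction counts, CPI values, and cycle time are invented for illustration and are not measurements of any processor.

```python
def cpu_time(instruction_count, cycles_per_instruction, cycle_time_ns):
    """Execution time in nanoseconds: instructions x CPI x cycle time."""
    return instruction_count * cycles_per_instruction * cycle_time_ns


# Hypothetical program: the CISC version has fewer but slower instructions,
# the RISC version has more instructions that each take one cycle.
cisc_ns = cpu_time(instruction_count=1000, cycles_per_instruction=4, cycle_time_ns=10)
risc_ns = cpu_time(instruction_count=1500, cycles_per_instruction=1, cycle_time_ns=10)
print(cisc_ns, risc_ns)
```

Which design comes out ahead depends entirely on the three factors; with different assumed numbers the comparison can go the other way.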

Characteristics of RISC Architecture:

  1. Simple instructions are used in RISC architecture.
  2. RISC supports a few simple data types and synthesizes complex data types from them.
  3. RISC uses simple addressing modes and fixed-length instructions, which suit pipelining.
  4. RISC permits any register to be used in any context.
  5. Most instructions execute in one cycle.
  6. Separating the “LOAD” and “STORE” instructions reduces the amount of work the computer must perform, because an operand stays in its register until it is overwritten and can be reused without another load.
  7. RISC contains a large number of registers in order to reduce the number of interactions with memory.
  8. In RISC, pipelining is easy because all instructions execute in a uniform interval of time, i.e. one clock cycle.
  9. In RISC, more RAM is required to store the assembly-level instructions.
  10. The reduced instruction set needs fewer transistors.
  11. Many RISC designs use the Harvard memory model, i.e. separate instruction and data memories.
  12. A compiler converts each high-level language statement into a sequence of these simple instructions.

Advantages of RISC Architecture:

  1. RISC (Reduced Instruction Set Computing) architecture has a simple set of instructions, so high-level language compilers can produce more efficient code.
  2. Its simplicity leaves designers free to decide how to use the space on the microprocessor.
  3. Many RISC processors use registers for passing arguments and holding local variables.
  4. Functions typically use only a few parameters, which fit in registers, and fixed-length instructions are easy to pipeline.
  5. The speed of operation can be maximized and the execution time minimized.
  6. Only a few instruction formats, a small number of instructions, and a few addressing modes are needed.

Disadvantages of RISC Architecture:

  1. The performance of a RISC processor depends largely on the programmer or compiler, because the quality of the compiler plays a vital role when code written for CISC is converted to RISC code.
  2. Converting CISC code to RISC code increases its size, which is termed code expansion. The extent of this expansion depends on the compiler and on the machine’s instruction set.
  3. RISC processors need very fast memory systems to keep them fed with instructions, so they typically have large first-level caches on the chip itself.

RISC v/s CISC:

| CISC | RISC |
|---|---|
| CISC stands for Complex Instruction Set Computer. | RISC stands for Reduced Instruction Set Computer. |
| It has a microprogrammed control unit. | It has a hard-wired control unit. |
| The instruction set has many different instructions that can be used for complex operations. | The instruction set is reduced, and most of the instructions are very primitive. |
| Performance is optimized with emphasis on hardware. | Performance is optimized with emphasis on software. |
| Only a single register set is present. | Multiple register sets are present. |
| Processors are mostly lightly pipelined or not pipelined. | Processors are highly pipelined. |
| Execution time per instruction is high. | Execution time per instruction is low. |
| Code expansion is not a problem. | Code expansion may create a problem. |
| The decoding of instructions is complex. | The decoding of instructions is simple. |
| It requires external memory for calculations. | It doesn't require external memory for calculations. |
| Examples of CISC processors are the System/360, VAX, and AMD and Intel x86 CPUs. | Common RISC microprocessors are Alpha, ARC, ARM, AVR, PA-RISC, and SPARC. |
| Instructions can take several clock cycles. | Single cycle for each instruction. |
| More efficient use of RAM than RISC. | Heavy use of RAM (can cause bottlenecks if RAM is limited). |
| Complex and variable-length instructions. | Simple, standardized instructions. |
| A large number of instructions. | A small number of fixed-length instructions. |
| Compound addressing modes. | Limited addressing modes. |
| Important applications are security systems and home automation. | Important applications are smartphones and PDAs. |
| Varying formats (16-64 bits for each instruction). | Fixed (32-bit) format. |
| Unified cache for instructions and data. | Separate data and instruction caches. |


Pipelining:

Pipelining is the process of feeding instructions to the processor through a pipeline. It allows instructions to be stored and executed in an orderly process. It is also known as pipeline processing.

Pipelining is a technique where multiple instructions are overlapped during execution. The pipeline is divided into stages and these stages are connected with one another to form a pipe-like structure. Instructions enter from one end and exit from another end. Pipelining increases the overall instruction throughput.

In the pipeline system, each segment consists of an input register followed by a combinational circuit. The register is used to hold data and a combinational circuit performs operations on it. The output of the combinational circuit is applied to the input register of the next segment.

The pipeline system is like the modern day assembly line set up in factories. For example in a car manufacturing industry, huge assembly lines are set up and at each point, there are robotic arms to perform a certain task, and then the car moves on ahead to the next arm.
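
The overlap described above can be visualized with a space-time diagram. The following Python sketch assumes an idealized pipeline in which every segment takes exactly one clock cycle and nothing stalls; the function name `space_time_diagram` and the choice of 4 segments and 6 tasks are illustrative only.

```python
def space_time_diagram(num_segments, num_tasks):
    """Return one row per segment listing the task it holds in each clock cycle."""
    total_cycles = num_segments + num_tasks - 1
    rows = []
    for segment in range(num_segments):
        row = []
        for cycle in range(total_cycles):
            task = cycle - segment   # task i reaches segment s in cycle i + s
            row.append(f"T{task + 1}" if 0 <= task < num_tasks else "--")
        rows.append(row)
    return rows


diagram = space_time_diagram(num_segments=4, num_tasks=6)
for number, row in enumerate(diagram, start=1):
    print(f"Segment {number}: " + " ".join(row))
```

Each row is shifted one cycle to the right of the row above it, so once the pipeline is full one task completes in every clock cycle.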

Types of Pipeline:

1. Arithmetic Pipeline:

Arithmetic pipelines are found in most computers. They are used for floating-point operations, multiplication of fixed-point numbers, etc. For example, the inputs to a floating-point adder pipeline are:

X = A*2^a
Y = B*2^b

Here A and B are mantissas (the significant digits of the floating-point numbers), while a and b are exponents.

Floating-point addition and subtraction are done in four parts:

  1. Compare the exponents.
  2. Align the mantissas.
  3. Add or subtract the mantissas.
  4. Normalize and produce the result.

Registers are used for storing the intermediate results between the above operations.
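
The four parts can be sketched in Python as follows. This is a minimal model that assumes mantissas are ordinary Python floats normalized to the range 0.5 <= |mantissa| < 1 and that subtraction is expressed by passing a negative mantissa; the function name `fp_add` and that normalization convention are assumptions made for the example.

```python
def fp_add(A, a, B, b):
    """Add X = A * 2**a and Y = B * 2**b in four steps; return (mantissa, exponent)."""
    # Part 1: compare the exponents.
    shift = a - b
    exponent = max(a, b)
    # Part 2: align the mantissa of the operand with the smaller exponent.
    if shift > 0:
        B = B / 2 ** shift
    elif shift < 0:
        A = A / 2 ** (-shift)
    # Part 3: add the mantissas (a negative mantissa gives subtraction).
    mantissa = A + B
    # Part 4: normalize so that 0.5 <= |mantissa| < 1; a zero result is left alone.
    while mantissa != 0 and abs(mantissa) >= 1:
        mantissa /= 2
        exponent += 1
    while mantissa != 0 and abs(mantissa) < 0.5:
        mantissa *= 2
        exponent -= 1
    return mantissa, exponent


# X = 0.75 * 2**3 = 6 and Y = 0.5 * 2**1 = 1
result = fp_add(0.75, 3, 0.5, 1)
print(result, result[0] * 2 ** result[1])
```

In a hardware pipeline each commented part would be a separate segment, with the values passed between parts held in the intermediate registers.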

2. Instruction Pipelining:

First-generation RISC processors achieve an execution speed that approaches one instruction per system clock cycle. To improve on this, two classes of processors evolved to execute multiple instructions per clock cycle: super-scalar and super-pipelined architectures.

A super-scalar architecture replicates each of the pipeline stages so that two or more instructions can be at the same stage of the pipeline at the same time. A super-pipelined architecture, in contrast, makes use of more and finer-grained pipeline stages.

All instructions follow the five pipeline stages:

  1. Instruction fetch

  2. Source operand fetch from register file

  3. ALU operation or data operand address generation

  4. Data memory reference

  5. Write back into register file

The limitation of the super-scalar approach is that dependencies between instructions in different pipelines can slow down the system. With super-pipelining, there is overhead associated with transferring instructions from one stage to the next.

Instruction Pipelining Numerical:

Instruction Pipelining Numerical Example:

Assume that a pipeline has k = 8 segments and executes n = 120 tasks in sequence. Let the time taken to process a sub-operation in each segment be 20 seconds. Calculate the speedup ratio of the pipeline.

Solution,
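
Taking the usual definition of speedup as non-pipelined time divided by pipelined time, and assuming that a non-pipelined unit needs k * tp per task while the pipeline needs (k + n - 1) * tp for n tasks, the arithmetic can be checked with a short Python sketch. The function name `pipeline_speedup` is illustrative.

```python
def pipeline_speedup(k, n, tp):
    """Return (non-pipelined time, pipelined time, speedup ratio)."""
    nonpipelined = n * k * tp         # every task passes through all k segments alone
    pipelined = (k + n - 1) * tp      # k cycles to fill, then one task per cycle
    return nonpipelined, pipelined, nonpipelined / pipelined


t_seq, t_pipe, speedup = pipeline_speedup(k=8, n=120, tp=20)
print(t_seq, t_pipe, round(speedup, 2))
```

Under these assumptions S = (120 * 8 * 20) / ((8 + 120 - 1) * 20) = 19200 / 2540, which is approximately 7.56. The ratio does not depend on the time unit, because tp cancels.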

Advantages of Pipelining:

  1. The cycle time of the processor is reduced.
  2. It increases the throughput of the system.
  3. It makes the system reliable.

Disadvantages of Pipelining:

  1. The design of the pipelined processor is complex and costly to manufacture.
  2. The latency of each individual instruction increases.

RISC Pipelining:

Pipelining, a standard feature in RISC processors, is much like an assembly line. Because the processor works on different steps of several instructions at the same time, more instructions can be executed in a shorter period of time.

Although different processors have different numbers of steps, they are basically variations of these five, used in the MIPS R3000 processor:

  1. Fetch instructions from memory
  2. Read registers and decode the instruction
  3. Execute the instruction or calculate an address
  4. Access an operand in data memory
  5. Write the result into a register

The clock period of the pipeline is dictated by the length of its longest step. Because RISC instructions are simpler than those used in pre-RISC processors (now called CISC, or Complex Instruction Set Computer), they are more conducive to pipelining. While CISC instructions vary in length, RISC instructions are all the same length and can be fetched in a single operation. Ideally, each of the stages in a RISC processor pipeline should take one clock cycle, so that the processor finishes an instruction each clock cycle and averages one cycle per instruction (CPI).

Two Phases of Execution for Register-Based Instructions:

  1. I: Instruction fetch
  2. E: Execute
     a. ALU operation with register input and output

Three Phases of Execution for Load and Store:

  1. I: Instruction fetch
  2. E: Execute
     a. Calculate memory address
  3. D: Memory
     a. Register-to-memory or memory-to-register operation

Conflicts or Hazards in Instruction Pipelining and their Solutions:

Pipeline processors have several problems associated with controlling smooth, efficient execution of instructions on the pipeline. These problems are generally called hazards, and include the following three types:

1. Structural Hazards:

This situation arises mainly when two instructions require a given hardware resource at the same time and hence for one of the instructions the pipeline needs to be stalled.

The most common case is when memory is accessed at the same time by two instructions. One instruction may need to access memory for a data operand while another instruction is being fetched. If instructions and data reside in the same memory, the two instructions cannot proceed together, and one of them must be stalled until the other has finished its memory access. In general, sufficient hardware resources are needed to avoid structural hazards.

Solution:

  1. Delay the second access by one clock cycle.

  2. Provide separate memory for instructions and data.

2. Control Hazards:

The instruction fetch unit of the CPU is responsible for providing a stream of instructions to the execution unit. The fetch unit normally fetches instructions from consecutive memory locations, and they are executed in that order.

A problem arises when one of the instructions is a branch to some other memory location. All the instructions already fetched into the pipeline from the consecutive locations are then invalid and need to be removed (also called flushing the pipeline). This induces a stall until new instructions are fetched from the memory address specified in the branch instruction.

The time lost as a result is called the branch penalty. Dedicated hardware is often incorporated in the fetch unit to identify branch instructions and compute branch addresses as early as possible, reducing the resulting delay.

Solution:

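  1. Predict: assume an outcome and continue fetching; undo the speculative work if the prediction is wrong.

  2. Delayed branch: always execute the instruction that immediately follows the branch, so that the compiler can place useful work in that slot.

  3. Stall: stop loading instructions until the branch outcome is available.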
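
The cost of the branch penalty under the "stall" and "predict" strategies can be estimated with a simple average: effective CPI = base CPI + branch fraction * misprediction rate * penalty cycles. The Python sketch below shows the arithmetic only; the branch fraction, the penalty, and the misprediction rate are hypothetical values chosen for illustration.

```python
def effective_cpi(base_cpi, branch_fraction, penalty_cycles, misprediction_rate=1.0):
    """Average cycles per instruction once branch stalls are included.

    misprediction_rate=1.0 models the 'stall' strategy, where every branch
    pays the full penalty.
    """
    return base_cpi + branch_fraction * misprediction_rate * penalty_cycles


# Hypothetical workload: 20% branches, 2-cycle penalty.
stall_cpi = effective_cpi(1.0, branch_fraction=0.2, penalty_cycles=2)
predict_cpi = effective_cpi(1.0, branch_fraction=0.2, penalty_cycles=2,
                            misprediction_rate=0.1)
print(round(stall_cpi, 2), round(predict_cpi, 2))
```

With these assumed numbers, stalling on every branch raises the average from 1 to about 1.4 cycles per instruction, while a predictor that is wrong one time in ten keeps it near 1.04.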

3. Data Hazards:

A data hazard is any condition in which either the source or the destination operands of an instruction are not available at the time expected in the pipeline. As a result, some operation has to be delayed and the pipeline stalls. This happens whenever there are two instructions, one of which depends on data produced by the other, for example:

A=3+A
B=A*4

In the above sequence, the second instruction needs the value of ‘A’ computed in the first instruction. Thus the second instruction is said to depend on the first.

If the execution is done in a pipelined processor, it is highly likely that the interleaving of these two instructions can lead to incorrect results due to data dependency between the instructions. Thus the pipeline needs to be stalled as and when necessary to avoid errors.

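Solution:

Stall the later instruction until the earlier one has produced or consumed the operand. Which accesses must be kept in order depends on the type of dependency:

  1. Read after write (RAW, true dependency): the later instruction reads a value that the earlier one writes.

  2. Write after read (WAR, anti-dependency): the later instruction overwrites a value that the earlier one still has to read.

  3. Write after write (WAW, output dependency): both instructions write the same location, so the writes must complete in program order.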
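
The three cases can be detected mechanically by comparing the registers each instruction reads and writes. The Python sketch below assumes every instruction is described as a (destination, sources) pair; that representation and the function name `classify_hazards` are made up for the example.

```python
def classify_hazards(first, second):
    """List the data hazards the later instruction `second` has on `first`.

    Each instruction is a (destination, sources) pair, e.g. ("A", ["A"]).
    """
    first_dest, first_sources = first
    second_dest, second_sources = second
    hazards = []
    if first_dest in second_sources:
        hazards.append("RAW")    # second reads what first writes
    if second_dest in first_sources:
        hazards.append("WAR")    # second writes what first reads
    if second_dest == first_dest:
        hazards.append("WAW")    # both write the same location
    return hazards


# A = 3 + A  followed by  B = A * 4
example = classify_hazards(("A", ["A"]), ("B", ["A"]))
print(example)
```

For the sequence A = 3 + A, B = A * 4 the only hazard reported is RAW, which matches the true dependency of the second instruction on the first.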

Register Window:

In computer engineering, register windows are a feature in some instruction set architectures to improve the performance of procedure calls, a very common operation. Register windows were one of the main features of the Berkeley RISC design, which would later be commercialized as the AMD Am29000, Intel i960, Sun Microsystems SPARC, and Intel Itanium.

Most CPU designs include a small amount of very high-speed memory known as registers. Registers are used by the CPU in order to hold temporary values while working on long strings of instructions. Considerable performance can be added to a design with more registers. However, since the registers are a visible piece of the CPU's instruction set, the number cannot typically be changed after the design has been released.

While registers are almost a universal solution to performance, they do have a drawback. Different parts of a computer program all use their own temporary values and therefore compete for the use of the registers. Since a good understanding of the nature of program flow at runtime is very difficult, there is no easy way for the developer to know in advance how many registers they should use, and how many to leave aside for other parts of the program. In general, these sorts of considerations are ignored, and the developers, or more likely the compilers they use, attempt to use all the registers visible to them. In the case of processors with very few registers to begin with, this is also the only reasonable course of action.

Register windows aim to solve this issue. Since every part of a program wants registers for its own use, several sets of registers are provided for the different parts of the program. If all of these registers were visible, there would simply be more registers to compete over, so they have to be made invisible.

Rendering the registers invisible can be implemented efficiently, because the CPU recognizes the movement from one part of the program to another during a procedure call. A call begins with one of a small number of instructions (the prologue) and ends with one of a similarly small set (the epilogue). In the Berkeley design, these instructions cause a new set of registers to be "swapped in" at the call, and mark that set as "dead" (or "reusable") when the call ends.
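
The swapping of register sets can be modelled with a window pointer into a larger physical register file. The Python sketch below is a simplified model in the style of the overlapping windows used by Berkeley RISC and SPARC, where the caller's output registers are the callee's input registers; the class name `RegisterWindows`, the window sizes, and the lack of spilling to memory are all simplifying assumptions.

```python
class RegisterWindows:
    """Toy windowed register file: each call slides the visible window forward."""

    def __init__(self, num_windows=4, ins=2, local_regs=2):
        self.step = ins + local_regs             # registers a call advances by
        self.visible = ins + local_regs + ins    # ins + locals + outs
        self.num_windows = num_windows
        self.physical = [0] * (num_windows * self.step + ins)
        self.cwp = 0                             # current window pointer

    def _index(self, reg):
        if not 0 <= reg < self.visible:
            raise IndexError("register is not visible in the current window")
        return self.cwp * self.step + reg

    def read(self, reg):
        return self.physical[self._index(reg)]

    def write(self, reg, value):
        self.physical[self._index(reg)] = value

    def call(self):
        if self.cwp + 1 >= self.num_windows:
            raise OverflowError("window overflow: real hardware would spill to memory")
        self.cwp += 1                            # swap in a fresh set of registers

    def ret(self):
        if self.cwp == 0:
            raise OverflowError("window underflow")
        self.cwp -= 1                            # callee's set becomes reusable


windows = RegisterWindows()
windows.write(2, 7)         # caller's first local register
windows.write(4, 99)        # caller's first out register: an argument
windows.call()
argument = windows.read(0)  # callee sees the argument as its first in register
windows.write(2, 123)       # callee's own local does not touch the caller's
windows.ret()
caller_local = windows.read(2)
print(argument, caller_local)
```

The call neither saves nor restores anything in memory: the argument is passed through the overlapping registers, and the caller's local value is untouched when control returns.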

Register Renaming:

Register renaming is a form of pipelining that deals with data dependencies between instructions by renaming their register operands. An assembly language programmer or a compiler specifies these operands using architectural registers, i.e. the registers that are explicit in the instruction set architecture. Renaming replaces the architectural register names with, in effect, a new value name for each instruction's destination operand. This eliminates the name dependencies (output dependencies and anti-dependencies) between instructions and automatically recognizes true dependencies.

The recognition of true data dependencies between instructions permits a more flexible life cycle for instructions. By maintaining a status bit for each value indicating whether or not it has been computed yet, it allows the execution phase of two instruction operations to be performed out of order when there are no true data dependencies between them. This is called out-of-order execution.
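
The renaming step itself can be sketched in a few lines of Python. The example assumes each instruction is a (destination, sources) pair and invents physical names of the form p0, p1, ...; the function name `rename_registers` and the sample program are hypothetical, and the status bits and out-of-order scheduling mentioned above are not modelled.

```python
from itertools import count


def rename_registers(instructions):
    """Give every destination a fresh name; sources use the latest mapping."""
    fresh = count()      # supplies new physical register numbers
    mapping = {}         # architectural name -> current physical name
    renamed = []
    for dest, sources in instructions:
        # Rename sources first, so an instruction like r1 = r1 + 1 reads the old value.
        new_sources = [mapping.get(src, src) for src in sources]
        mapping[dest] = f"p{next(fresh)}"
        renamed.append((mapping[dest], new_sources))
    return renamed


program = [
    ("r1", ["r2", "r3"]),   # r1 = r2 op r3
    ("r4", ["r1"]),         # true dependency on r1
    ("r1", ["r5"]),         # reuses r1: WAW with the first, WAR with the second
    ("r6", ["r1"]),         # true dependency on the new r1
]
renamed = rename_registers(program)
for instruction in renamed:
    print(instruction)
```

After renaming, the third instruction writes p2 instead of reusing the first instruction's destination, so only the two true dependencies (p0 into the second instruction, p2 into the fourth) remain.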
