ECE668 Quiz5(Includes calculation)

School: Florida Polytechnic University
Course: 5741
Subject: Computer Science
Date: Feb 20, 2024
Type: docx
Pages: 18
Uploaded by abonyamin1

Question 1

Consider a 7-stage pipeline running a program that contains 7% unconditional branches and 8% conditional branches. Assume that the penalty for resolving an unconditional branch is 3 cycles and that the penalty for a conditional branch is 1 cycle more than for an unconditional branch. Calculate the pipeline speedup, ignoring all other hazards. (Round to two decimal places.)

Answer: 1.53 (incorrect)

General answer comments:
Speedup = Pipeline_Depth / (1 + Σ Branch_Frequency * Branch_Penalty)
        = 7 / (1 + 0.07 * 3 + 0.08 * 4) = 7 / 1.53

The correct answer is: 4.58
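
The speedup formula in the answer comments can be checked with a few lines of Python. This is only a sketch: the function name `pipeline_speedup` and the list-of-pairs argument are illustrative choices, not something the quiz defines.

```python
def pipeline_speedup(depth, branch_mix):
    """Speedup of a pipeline of the given depth when only branch stalls matter.

    branch_mix is a list of (frequency, penalty_in_cycles) pairs, one per
    branch class (an assumed layout chosen for this sketch).
    """
    stall_cycles_per_instruction = sum(freq * penalty for freq, penalty in branch_mix)
    return depth / (1 + stall_cycles_per_instruction)


unconditional_penalty = 3
conditional_penalty = unconditional_penalty + 1  # "1 cycle more" per the question

speedup = pipeline_speedup(7, [(0.07, unconditional_penalty), (0.08, conditional_penalty)])
print(round(speedup, 2))  # 4.58
```

The submitted value 1.53 is the denominator of this expression, not the speedup itself.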

Question 2: Correct, 10.00 points out of 10.00

Suppose we have a program containing 11% branch instructions, 70% of which are taken and the rest are not taken. Calculate the CPI if the hardware always predicts not taken. Consider only branch penalties and ignore all other hazards. Assume a correctly predicted branch takes 1 cycle, while an incorrectly predicted branch takes 2 cycles to complete. Assume all instructions that are not branches take only 1 cycle. (Round to two decimal places.)

Answer: 1.07 (marked correct)

CPI = F_NotBranch * 1 + F_Branch * (F_Br_NotTaken * 1 + (1 - F_Br_NotTaken) * 2)

The correct answer is: 1.08
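
As a rough check of the CPI formula for the always-predict-not-taken case, the sketch below plugs in the numbers from the question. The function and parameter names are assumptions made for illustration.

```python
def cpi_always_predict_not_taken(f_branch, f_taken, correct_cycles=1, mispredict_cycles=2):
    """CPI when every branch is predicted not taken.

    Not-taken branches are predicted correctly; taken branches are mispredicted.
    Non-branch instructions are assumed to take 1 cycle, as in the question.
    """
    f_not_taken = 1 - f_taken
    cycles_per_branch = f_not_taken * correct_cycles + f_taken * mispredict_cycles
    return (1 - f_branch) * 1 + f_branch * cycles_per_branch


cpi = cpi_always_predict_not_taken(f_branch=0.11, f_taken=0.70)
print(round(cpi, 2))  # 1.08
```

The unrounded value is 1.077, which explains why both 1.07 and 1.08 appear in the feedback.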

Question 3: Incorrect, 0.00 points out of 10.00

Consider a program that executes 22% branch instructions. 5% of them are unconditional branches, 6% are untaken conditional branches and the rest are taken conditional branches. In the case of the MIPS R4000 (refer to Lesson 8 - Slide 13), what will be the CPI in the 'Branch Not Taken' case? Ignore all other types of hazards and assume the ideal CPI = 1. Assume that the target address calculation is done in the decode stage while the branch condition evaluation is done in the execute (ALU) stage. (Round to two decimal places.)

Answer: 1.61 (incorrect)

In the Branch Not Taken case, the penalties for branches are as follows:

Penalty for unconditional branches = 2 cycles
Penalty for taken branches = 3 cycles
Penalty for not-taken branches = 0 cycles (as the scheme always predicts not taken)

CPI_NotTaken = 1 + F_Br_Uncond * 2 + F_Br_Cond_NotTaken * 0 + F_Br_Cond_Taken * 3

The answer key treats the percentages as fractions of all instructions, so taken conditional branches account for 22% - 5% - 6% = 11%, and CPI_NotTaken = 1 + 0.05 * 2 + 0.06 * 0 + 0.11 * 3 = 1.43.

The correct answer is: 1.43
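
The sketch below evaluates the Branch Not Taken formula with the penalties listed in the feedback. The function name and argument order are assumptions for illustration. It also evaluates the other possible reading of the question (percentages as shares of the branch instructions), which reproduces the submitted 1.61.

```python
def cpi_branch_not_taken(f_branch, f_uncond, f_cond_not_taken, ideal_cpi=1.0):
    """CPI under predict-not-taken, with all frequencies as fractions of ALL instructions.

    Penalties (cycles) are the ones quoted in the quiz feedback:
    unconditional = 2, conditional not taken = 0, conditional taken = 3.
    """
    f_cond_taken = f_branch - f_uncond - f_cond_not_taken
    return ideal_cpi + f_uncond * 2 + f_cond_not_taken * 0 + f_cond_taken * 3


# Reading used by the answer key: 5% and 6% of all instructions.
print(round(cpi_branch_not_taken(0.22, 0.05, 0.06), 2))  # 1.43

# Alternative reading: 5% and 6% of the branch instructions only.
print(round(cpi_branch_not_taken(0.22, 0.22 * 0.05, 0.22 * 0.06), 2))  # 1.61
```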

Question 4: Correct, 10.00 points out of 10.00

In order to avoid any stalls due to branches, how many delay slots will we need to fill in the modified MIPS 5-stage pipeline (refer to Lesson 8 - Slide 8)?

Your answer is correct. In the modified MIPS pipeline, branches are resolved after the second stage, so one delay slot is enough.

The correct answer is: 1

Question 5: Incorrect, 0.00 points out of 10.00

In order to avoid any stalls due to branches, how many delay slots will we need to fill in the MIPS R4000 8-stage pipeline (refer to Lesson 8 - Slide 13)?

Select one:
a. 1
b. 2 (selected, incorrect)
c. 3
d. 4

Your answer is incorrect. In the MIPS R4000 8-stage pipeline, branches are resolved, at the latest, at the end of the 4th clock cycle, so three delay slots are needed to avoid any stalls.

The correct answer is: 3
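
The feedback for Questions 4 and 5 follows one rule: every instruction fetched after the branch but before it resolves needs a delay slot. A minimal sketch of that rule, assuming one instruction is fetched per cycle (the function name is hypothetical):

```python
def delay_slots_needed(resolve_stage: int) -> int:
    """Number of delay slots needed to hide a branch completely.

    resolve_stage is the 1-based pipeline stage at whose end the branch
    outcome and target are both known. Assumes one fetch per cycle.
    """
    if resolve_stage < 1:
        raise ValueError("resolve_stage must be at least 1")
    return resolve_stage - 1


print(delay_slots_needed(2))  # modified MIPS 5-stage pipeline: 1
print(delay_slots_needed(4))  # MIPS R4000 8-stage pipeline: 3
```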

Question 6: Incorrect, 0.00 points out of 10.00

Consider a program that executes 22% branch instructions. 3% of those are unconditional branches, 7% are untaken conditional branches and the rest are taken conditional branches. In the case of the MIPS R4000 (refer to Lesson 8 - Slide 13), what will be the CPI in the 'Branch Taken' case? Ignore all other types of hazards and assume the ideal CPI = 1. Assume that the target address calculation is done in the decode stage while the branch condition evaluation is done in the execute (ALU) stage. (Round to two decimal places.)

In the Branch Taken case, the penalties for branches in the MIPS R4000 are as follows:

Penalty for unconditional branches = 2 cycles
Penalty for taken branches = 2 cycles
Penalty for not-taken branches = 3 cycles

CPI_Taken = 1 + F_Br_Uncond * 2 + F_Br_Cond_NotTaken * 3 + F_Br_Cond_Taken * 2

With the percentages read as fractions of all instructions, taken conditional branches account for 22% - 3% - 7% = 12%, and CPI_Taken = 1 + 0.03 * 2 + 0.07 * 3 + 0.12 * 2 = 1.51.

The correct answer is: 1.51
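
To compare the two static schemes on the same instruction mix, the penalties quoted in the feedback for Questions 3 and 6 can be put in a small lookup table. The dictionary keys and the function name below are assumptions made for this sketch.

```python
# Branch penalties in cycles for the MIPS R4000, as quoted in the quiz feedback.
PENALTY = {
    "predict_not_taken": {"uncond": 2, "cond_not_taken": 0, "cond_taken": 3},
    "predict_taken": {"uncond": 2, "cond_not_taken": 3, "cond_taken": 2},
}


def branch_cpi(mix, scheme, ideal_cpi=1.0):
    """CPI = ideal CPI + sum of (frequency * penalty) over the branch classes.

    mix maps each branch class to its fraction of ALL instructions.
    """
    penalties = PENALTY[scheme]
    return ideal_cpi + sum(freq * penalties[kind] for kind, freq in mix.items())


# Question 6 mix: 22% branches, 3% unconditional, 7% untaken conditional.
mix = {"uncond": 0.03, "cond_not_taken": 0.07}
mix["cond_taken"] = 0.22 - sum(mix.values())

print(round(branch_cpi(mix, "predict_taken"), 2))      # 1.51
print(round(branch_cpi(mix, "predict_not_taken"), 2))  # 1.42
```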

Question 7: Correct, 10.00 points out of 10.00

Suppose a particular program contains 23% branches. One branch delay slot needs to be usefully filled to avoid the branch penalty. Assuming that the compiler can fill 98% of the delay slots and that 82% of the instructions executed in the branch delay slots are useful, calculate the CPI. Assume that the ideal CPI is 1 and ignore all other hazards. The penalty incurred if a slot is not filled with a useful instruction is 1 cycle. (Round to two decimal places.)

Answer: 1.05 (correct)

Prob. of useful instruction in the delay slot = F_SlotsFilled * F_InstUseful
Prob. of not-useful instruction in the delay slot = penalty per branch = 1 - F_SlotsFilled * F_InstUseful
CPI = 1 + F_Br * Prob. of not-useful instruction in the delay slot * 1

The correct answer is: 1.05
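
The delayed-branch CPI above can be evaluated directly. The following is a sketch under the question's assumptions (one delay slot, 1-cycle penalty when the slot does no useful work); the function and parameter names are illustrative.

```python
def cpi_delayed_branch(f_branch, f_slots_filled, f_inst_useful, penalty=1, ideal_cpi=1.0):
    """CPI with a single branch delay slot.

    A slot does useful work only if the compiler filled it AND the
    instruction placed there turns out to be useful.
    """
    p_useful = f_slots_filled * f_inst_useful
    p_not_useful = 1 - p_useful
    return ideal_cpi + f_branch * p_not_useful * penalty


cpi = cpi_delayed_branch(f_branch=0.23, f_slots_filled=0.98, f_inst_useful=0.82)
print(round(cpi, 2))  # 1.05
```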

Question 8: Incorrect, 0.00 points out of 20.00

A MIPS-like pipelined processor has the following five stages: IF, ID, EX, MEM and WB. When executing conditional branches, the condition is checked in the ID stage while the branch target address is calculated in the EX stage. Assume that the processor supports data forwarding and delayed branches, for which you will need to decide on the number of delay slots. You are given the following program segment:

S1:          SUB R1, R2, R3     /* R1 is the destination */
S2:          ADD R4, R3, R5     /* R4 is the destination */
S3..S(n-1):  Instructions not modifying R1 or R4
Sn:          BZ  R4, 100(R1)    /* Branch to 100 + (R1) if R4 = 0 */

How many delay slots should you include in the design to optimize the performance of the above program segment, assuming that n > 5? What would be the total number of cycles required to execute this program segment after the delay slots are usefully filled? Can we reduce the number of cycles for n = 3 by doing some rescheduling?

The answers in the order of the questions are:

Your answer is incorrect.

Two delay slots would optimize the performance. S(n-2) and S(n-1) can be moved to the delay slots, yielding 5 + (n - 1) = n + 4 cycles to execute all n instructions if n > 5. Yes, we can reduce the number of cycles for n = 3 by rescheduling (how?).

The correct answer is: 2, n+4, Yes
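
The cycle count in the feedback is the usual pipeline fill time plus one cycle per remaining instruction. A minimal sketch, assuming the delay slots are usefully filled so that no stall cycles remain (the function name and the `stall_cycles` parameter are hypothetical):

```python
def total_cycles(n_instructions: int, pipeline_depth: int = 5, stall_cycles: int = 0) -> int:
    """Cycles to run n instructions through a pipeline of the given depth.

    The first instruction needs pipeline_depth cycles; each later one
    completes one cycle after its predecessor unless stalls are added.
    """
    if n_instructions < 1:
        raise ValueError("need at least one instruction")
    return pipeline_depth + (n_instructions - 1) + stall_cycles


for n in (6, 10):
    print(n, total_cycles(n))  # n + 4 cycles in each case
```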

Question 9: Correct, 10.00 points out of 10.00

A processor has separate instruction and data caches, each requiring two cycles for any operation. It includes a single 2-cycle execution unit responsible for executing all ALU and LOAD instructions. As a result, the instruction pipeline has the following eight stages: IF1, IF2, ID, EX1, EX2, MEM1, MEM2, WB. In the absence of hazards the pipeline has a CPI of 1. The processor has no hardware support for dynamic scheduling. The instruction stream executed by the processor consists of 25% Load instructions and 40% ALU instructions. The frequencies of RAW data dependencies between these two instructions and the instructions following them are:

Instruction i                                              LOAD    ALU
RAW in instruction i+1                                     30%     20%
RAW in instruction i+2, but not in instruction i+1         12%     10%
RAW in instruction i+3, but not in instruction i+1 nor i+2 10%     5%
RAW in instruction i+4, but not in earlier instructions    5%      1%

Calculate the contribution to the CPI of the processor due to RAW data hazards, assuming that DATA FORWARDING IS SUPPORTED. Note that the Register File can execute a write followed by a read in the same cycle. (Round your answer to three decimal places.)

                  Load    Stalls w/o   Stalls with   ALU     Stalls w/o   Stalls with
Instruction       freq.   forwarding   forwarding    freq.   forwarding   forwarding
RAW in i+1        30%     4            3             20%     4            1
RAW in i+2 only   12%     3            2             10%     3            0
RAW in i+3 only   10%     2            1             5%      2            0
RAW in i+4 only   5%      1            0             1%      1            0

Contribution to CPI = 0.25 * [0.30 * 3 + 0.12 * 2 + 0.10 * 1] + 0.40 * [0.20 * 1] = 0.390

The correct answer is: 0.39
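
The stall counts with forwarding can be derived from the stage positions instead of being read from the table. The sketch below assumes that a forwarded result becomes available at the end of the producing stage (MEM2 for loads, EX2 for ALU operations) and is needed at the start of EX1 in the dependent instruction. That timing model is an assumption of this sketch, chosen because it reproduces the feedback's numbers; all names are illustrative.

```python
STAGES = ["IF1", "IF2", "ID", "EX1", "EX2", "MEM1", "MEM2", "WB"]


def stalls_with_forwarding(produce_stage, distance, consume_stage="EX1"):
    """Stall cycles for a dependent instruction `distance` slots after the producer.

    Assumed model: the value is forwarded at the END of produce_stage and
    needed at the START of consume_stage.
    """
    produced = STAGES.index(produce_stage)
    consumed = STAGES.index(consume_stage)
    return max(0, produced + 1 - consumed - distance)


def raw_cpi_contribution(classes):
    """Sum over instruction classes of freq * sum(P(RAW at distance d) * stalls(d))."""
    total = 0.0
    for freq, produce_stage, raw_by_distance in classes:
        for distance, probability in enumerate(raw_by_distance, start=1):
            total += freq * probability * stalls_with_forwarding(produce_stage, distance)
    return total


classes = [
    (0.25, "MEM2", [0.30, 0.12, 0.10, 0.05]),  # Load: data ready after MEM2
    (0.40, "EX2", [0.20, 0.10, 0.05, 0.01]),   # ALU: result ready after EX2
]
print(round(raw_cpi_contribution(classes), 3))  # 0.39
```

Under this model loads stall 3, 2, 1 and 0 cycles for distances 1 to 4, while ALU results stall only the immediately following instruction, by 1 cycle.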