hw4_cse490-590-su2021_sol

pdf

School

SUNY Buffalo State College

Course

590LR

Subject

Computer Science

Date

Jan 9, 2024

Type

pdf

Pages

Uploaded by SuperHumanWombatMaster946

CSE 490/590 Summer 2021 Homework 4

1. For the code sequence shown below:

    loop: l.d    $f12, 0($f5)
          add.d  $f6, $f6, $f12
          daddui $f5, $f5, -8
          bne    $f5, $f9, loop   // $f9 holds the address of the last value to be operated on

a) Show loop unrolling so that there are four copies of the loop body. Assume $f5 - $f9 (that is, the size of the array) is initially a multiple of 32, which means that the number of loop iterations is a multiple of 4. Eliminate any obviously redundant computations and do not reuse any of the registers.

    loop: l.d    $f12, 0($f5)
          add.d  $f7, $f7, $f12
          l.d    $f13, -8($f5)
          add.d  $f8, $f8, $f13
          l.d    $f14, -16($f5)
          add.d  $f10, $f10, $f14
          l.d    $f15, -24($f5)
          add.d  $f11, $f11, $f15
          daddui $f5, $f5, -32
          bne    $f5, $f9, loop
          add.d  $f16, $f7, $f8
          add.d  $f17, $f10, $f11
          add.d  $f18, $f16, $f17

or

    loop: l.d    $f12, 0($f5)
          add.d  $f6, $f6, $f12
          l.d    $f13, -8($f5)
          add.d  $f6, $f6, $f13
          l.d    $f14, -16($f5)
          add.d  $f6, $f6, $f14
          l.d    $f15, -24($f5)
          add.d  $f6, $f6, $f15
          daddui $f5, $f5, -32
          bne    $f5, $f9, loop

b) Compute the number of cycles needed for 4 iterations.

    1.  l.d    $f12, 0($f5)
    2.  stall
    3.  add.d  $f7, $f7, $f12

    4.  l.d    $f13, -8($f5)
    5.  stall
    6.  add.d  $f8, $f8, $f13
    7.  l.d    $f14, -16($f5)
    8.  stall
    9.  add.d  $f10, $f10, $f14
    10. l.d    $f15, -24($f5)
    11. stall
    12. add.d  $f11, $f11, $f15
    13. daddui $f5, $f5, -32
    14. stall
    15. bne    $f5, $f9, loop
    16. add.d  $f16, $f7, $f8
    17. add.d  $f17, $f10, $f11
    18. stall
    19. stall
    20. stall
    21. add.d  $f18, $f16, $f17

or

    1.  l.d    $f12, 0($f5)
    2.  stall
    3.  add.d  $f6, $f6, $f12
    4.  stall
    5.  l.d    $f13, -8($f5)
    6.  stall
    7.  add.d  $f6, $f6, $f13
    8.  stall
    9.  l.d    $f14, -16($f5)
    10. stall
    11. add.d  $f6, $f6, $f14
    12. stall
    13. l.d    $f15, -24($f5)
    14. stall
    15. add.d  $f6, $f6, $f15
    16. daddui $f5, $f5, -32
    17. stall
    18. bne    $f5, $f9, loop

2. For the code sequence shown below:

    L.D   F0,0(R1)
    ADD.D F4,F0,F2
    S.D   F4,0(R1)

    L.D   F0,-8(R1)
    ADD.D F4,F0,F2
    S.D   F4,-8(R1)

Rename the registers as needed and schedule the sequence to minimize the stalls.

Renaming

    L.D   F0,0(R1)
    stall
    ADD.D F4,F0,F2
    stall
    stall
    S.D   F4,0(R1)
    L.D   F5,-8(R1)
    stall
    ADD.D F6,F5,F2
    stall
    stall
    S.D   F6,-8(R1)

Scheduling

    L.D   F0,0(R1)
    L.D   F5,-8(R1)
    ADD.D F4,F0,F2
    ADD.D F6,F5,F2
    stall
    S.D   F4,0(R1)
    S.D   F6,-8(R1)

3. For the given code sequence below, executed on a 2-issue processor:

    I1: LW  r2, 0(r1)
    I2: LW  r3, 4(r1)
    I3: LW  r4, 8(r1)
    I4: LW  r4, 12(r1)
    I5: ADD r6, r4, r5
    I6: ADD r7, r2, r3
    I7: ADD r8, r7, r6
    I8: LW  r9, 4(r8)

a) Draw a pipeline diagram [consider data forwarding]. You can also follow the datapath design in lecture.
b) Calculate IPC.
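The stall counts in the answer to question 2 can be checked with a small issue-cycle model. The following is a minimal Python sketch, not part of the original solution. It assumes a single-issue, in-order pipeline with 1 stall between a load and a dependent FP add and 2 stalls between an FP add and a dependent store, which is the pattern the stalls in question 2 imply. The names `STALLS`, `count_cycles`, `renamed`, and `scheduled` are illustrative.

```python
# Assumed stall cycles between a producer and a dependent consumer.
# Inferred from the stall pattern in question 2, not stated in the homework.
STALLS = {
    ("load", "fp"): 1,    # L.D result consumed by ADD.D
    ("fp", "store"): 2,   # ADD.D result consumed by S.D
}


def count_cycles(program):
    """Return the cycle in which the last instruction issues.

    program: list of (kind, dest, sources) tuples, issued in order,
    at most one instruction per cycle.
    """
    produced = {}  # register -> (producer kind, issue cycle)
    cycle = 0
    for kind, dest, sources in program:
        earliest = cycle + 1
        for reg in sources:
            if reg in produced:
                prod_kind, prod_cycle = produced[reg]
                gap = STALLS.get((prod_kind, kind), 0)
                earliest = max(earliest, prod_cycle + gap + 1)
        cycle = earliest
        if dest is not None:
            produced[dest] = (kind, cycle)
    return cycle


# Question 2 after renaming, in original program order.
renamed = [
    ("load", "F0", ["R1"]),
    ("fp", "F4", ["F0", "F2"]),
    ("store", None, ["F4", "R1"]),
    ("load", "F5", ["R1"]),
    ("fp", "F6", ["F5", "F2"]),
    ("store", None, ["F6", "R1"]),
]

# Question 2 after scheduling: loads first, then adds, then stores.
scheduled = [
    ("load", "F0", ["R1"]),
    ("load", "F5", ["R1"]),
    ("fp", "F4", ["F0", "F2"]),
    ("fp", "F6", ["F5", "F2"]),
    ("store", None, ["F4", "R1"]),
    ("store", None, ["F6", "R1"]),
]

print(count_cycles(renamed), count_cycles(scheduled))
```

Under these assumptions the renamed sequence needs 12 cycles and the scheduled one 7, matching the listings in question 2 (6 instructions plus 6 stalls, and 6 instructions plus 1 stall).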


IPC = 8/10

4. Consider the following code sequence.

    I1: lw  $s4, 0($s1)
    I2: or  $s2, $s4, $s1
    I3: and $s6, $s5, $s3

Highlight the hazard and discuss how an out-of-order processor will help when lw $s4, 0($s1) encounters a cache miss.

The hazard is the read-after-write dependence on $s4 between I1 and I2. I3 depends on neither instruction, so it can execute during the miss and wait at the write-back stage until the data is loaded into $s4 by I1 and eventually forwarded to I2.

5. Show loop unrolling so that there are four copies of the loop body for the MIPS code

    Loop: L.D    F0,0(R1)
          ADD.D  F4,F0,F2
          S.D    F4,0(R1)
          DADDUI R1,R1,#-8
          BNE    R1,R2,Loop

Assume R1 - R2 (that is, the size of the array) is initially a multiple of 32, which means that the number of loop iterations is a multiple of 4. Eliminate any obviously redundant computations and do not reuse any of the registers.

Here is the result after merging the DADDUI instructions and dropping the unnecessary BNE operations that are duplicated during unrolling. Note that R2 must now be set so that 32(R2) is the starting address of the last four elements.

    Loop: L.D    F0,0(R1)
          ADD.D  F4,F0,F2
          S.D    F4,0(R1)        // drop DADDUI & BNE
          L.D    F6,-8(R1)
          ADD.D  F8,F6,F2
          S.D    F8,-8(R1)       // drop DADDUI & BNE
          L.D    F10,-16(R1)
          ADD.D  F12,F10,F2
          S.D    F12,-16(R1)     // drop DADDUI & BNE
          L.D    F14,-24(R1)
          ADD.D  F16,F14,F2
          S.D    F16,-24(R1)
          DADDUI R1,R1,#-32
          BNE    R1,R2,Loop

Observations

We have eliminated three branches and three decrements of R1. The addresses on the loads and stores have been compensated to allow the DADDUI instructions on R1 to be merged. This optimization may seem trivial, but it is not; it requires symbolic substitution and simplification.
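The compensated offsets in question 5 follow mechanically from the copy index and the original decrement. The following is a minimal Python sketch, assuming the fresh register pairs are supplied by hand and the base register, scalar register, and label are fixed as in the question. The helper name `unroll` and its parameters are hypothetical, not something the homework defines.

```python
def unroll(reg_pairs, step=-8):
    """Emit an unrolled L.D / ADD.D / S.D body with one merged DADDUI.

    reg_pairs: one (load_register, result_register) pair per loop copy.
    step: the pointer decrement of the original, non-unrolled loop.
    """
    lines = []
    for k, (load_reg, result_reg) in enumerate(reg_pairs):
        # Copy k would have seen R1 after k decrements, so fold k * step
        # into the offset instead of updating R1 between copies.
        offset = k * step
        lines.append(f"L.D {load_reg},{offset}(R1)")
        lines.append(f"ADD.D {result_reg},{load_reg},F2")
        lines.append(f"S.D {result_reg},{offset}(R1)")
    # A single decrement covers all copies.
    lines.append(f"DADDUI R1,R1,#{len(reg_pairs) * step}")
    lines.append("BNE R1,R2,Loop")
    return lines


pairs = [("F0", "F4"), ("F6", "F8"), ("F10", "F12"), ("F14", "F16")]
unrolled = unroll(pairs)
print("\n".join(unrolled))
```

With the four register pairs used in question 5, this produces the same 14 instructions: offsets 0, -8, -16, and -24, followed by one DADDUI of -32 and one BNE.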

Symbolic substitution and simplification will rearrange expressions so as to allow constants to be collapsed, allowing an expression such as ((i + 1) + 1) to be rewritten as (i + (1 + 1)) and then simplified to (i + 2).

Without scheduling, every operation in the unrolled loop is followed by a dependent operation and thus will cause a stall. This loop will run in 27 clock cycles: 14 instruction issue cycles plus 13 stalls (1 stall after each L.D, 2 after each ADD.D, and 1 after the DADDUI).

6. Show the unrolled loop in question 5 after it has been scheduled for the pipeline with the latencies from the figure below.

    Loop: L.D    F0,0(R1)
          L.D    F6,-8(R1)
          L.D    F10,-16(R1)
          L.D    F14,-24(R1)
          ADD.D  F4,F0,F2
          ADD.D  F8,F6,F2
          ADD.D  F12,F10,F2
          ADD.D  F16,F14,F2
          S.D    F4,0(R1)
          S.D    F8,-8(R1)
          DADDUI R1,R1,#-32
          S.D    F12,16(R1)
          S.D    F16,8(R1)
          BNE    R1,R2,Loop

Observation

The execution time of the unrolled loop has dropped to a total of 14 clock cycles (no stalls). Four iterations take 14 cycles, so each iteration takes 3.5 clock cycles.

7. For the following instructions, under in-order execution and out-of-order execution (stages IF, ID, E, W), when should each instruction finish? For example, if I1 reaches W at t4, your answer should be I1 = t4.

Assumptions:
i. We only consider four stages: IF, ID, E, W. There is no Mem stage.
ii. We only have one adder and one multiplier.
iii. Number of cycles: ADD: 1 cycle, MUL: 4 cycles.
iv. Data forwarding is considered.

INSTRUCTIONS:

    I1: MUL R3 ← R1, R2
    I2: ADD R5 ← R3, R4
    I3: ADD R7 ← R2, R6
    I4: ADD R10 ← R8, R9
    I5: MUL R11 ← R7, R10
    I6: ADD R5 ← R5, R11

INSTRUCTIONS:

    I1: MUL R3 ← R1, R2
    I2: ADD R5 ← R3, R4
    I3: ADD R7 ← R2, R6
    I4: ADD R10 ← R8, R9
    I5: MUL R11 ← R7, R10
    I6: ADD R5 ← R5, R11

8. In the following instruction sequence, find the hazards. Rename the registers to eliminate the anti and output dependences.

    div.s  r1, r2, r3
    mult.s r4, r1, r5
    add.s  r1, r3, r6
    sub.s  r3, r1, r4
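One way to work through question 8 is to classify each dependence mechanically and then rename. The following is a minimal Python sketch, not the original solution. It assumes every instruction has one destination and two sources, and it gives every destination a fresh name, which is more renaming than the minimum needed. The names `find_hazards`, `rename`, and the registers `t0` to `t3` are hypothetical.

```python
def find_hazards(program):
    """Return (type, earlier_index, later_index, register) tuples.

    program: list of (op, dest, sources) in program order.
    RAW is reported against the most recent writer only.
    """
    hazards = []
    last_write = {}  # register -> index of its most recent writer
    readers = {}     # register -> readers since its most recent write
    for j, (_, dest, sources) in enumerate(program):
        for reg in sorted(set(sources)):
            if reg in last_write:
                hazards.append(("RAW", last_write[reg], j, reg))
        for i in readers.get(dest, []):
            hazards.append(("WAR", i, j, dest))   # anti dependence
        if dest in last_write:
            hazards.append(("WAW", last_write[dest], j, dest))  # output
        for reg in set(sources):
            readers.setdefault(reg, []).append(j)
        last_write[dest] = j
        readers[dest] = []
    return hazards


def rename(program):
    """Give every destination a fresh register (t0, t1, ...)."""
    current = {}  # architectural register -> latest fresh name
    renamed = []
    for n, (op, dest, sources) in enumerate(program):
        new_sources = [current.get(reg, reg) for reg in sources]
        current[dest] = f"t{n}"
        renamed.append((op, current[dest], new_sources))
    return renamed


program = [
    ("div.s", "r1", ["r2", "r3"]),
    ("mult.s", "r4", ["r1", "r5"]),
    ("add.s", "r1", ["r3", "r6"]),
    ("sub.s", "r3", ["r1", "r4"]),
]

before = find_hazards(program)
after = find_hazards(rename(program))
for hazard in before:
    print(hazard)
```

On the sequence above, the sketch reports three RAW hazards (r1 from div.s to mult.s, r1 from add.s to sub.s, r4 from mult.s to sub.s), one WAW hazard on r1 (div.s and add.s), and three WAR hazards (r1 between mult.s and add.s, r3 between div.s and sub.s, r3 between add.s and sub.s). After renaming, only the three RAW dependences remain. Any later code that reads r1 or r3 would need to read the renamed registers instead.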
