ENSC 254 Homework Assignment 8

Important Logistics:

Some general grading logistics for homework assignments have been posted on our course website: https://canvas.sfu.ca/courses/76769/pages/homework-logistics

Homework is individual work; each student will be graded individually.

Homework 8 is worth 2.5% of the final mark. It has 73 points in total. If you get X points out of 73, your score is scaled as X/73 * 2.5% of the final mark.

Electronic submission: You can type your answers in Microsoft Word, or write them on paper and take a picture. For your final submission, please convert all your answers into one PDF file and submit it electronically.

Submission deadline: Saturday, Jul 29th, 2023, 11:59:59 pm. For every 10 minutes late, you will lose 10% of the points; at 100 minutes late, you will receive zero for this homework.

1. [73 points] <Lecture 18> Given the following RISC-V assembly program:

          addi x5, x0, 0
          addi x6, x0, 16
    loop:
          slli x7, x5, 2
          add  x7, x10, x7
          lw   x28, 4(x7)
          mul  x28, x28, x6
          sw   x28, 0(x7)
          addi x5, x5, 1
          blt  x5, x6, loop

1) [13 points] Read the above assembly program and convert it to a C program. Assume x5 holds variable i, x10 holds the starting address of array A, and all other registers hold temporary variables. All data are 32-bit integers.

   a) [9 points] Write a brief comment for each instruction to explain what it does.

   b) [4 points] Convert the assembly program into a C++ program. One line of C++ code is for the for-loop structure, which counts for 2 points. Another line of C++ code is for the loop body, which counts for 2 points.

2) [14 points] Assume a RISC-V CPU with the classical five-stage pipeline and static two-issue packets, as taught in Lecture 18 (slides 9 to 19). That is, in each cycle you can statically issue and execute up to two instructions concurrently: one ALU/branch instruction and one load/store instruction. Assume that the branch condition and target address are determined in the instruction Decode stage and that there is no branch prediction, i.e., a branch has a one-cycle delay. An unused instruction slot is padded with a nop instruction. Schedule these instructions to execute on the CPU by filling out the following table (from cycle n+2 to n+8). You may change the order in which the instructions execute, but you cannot change an instruction itself or the semantics of the program. Each table cell counts for 1 point.

   ALU/branch            Load/store    Cycle
   addi x5, x0, 0        nop           n
   addi x6, x0, 16       nop           n+1
   loop:                               n+2
                                       n+3
                                       n+4
                                       n+5
                                       n+6
                                       n+7
                                       n+8

3) [2 points] Given your scheduling in question 2), how many cycles in total does it take to finish executing the entire RISC-V program? Assume the first instruction packet takes 5 cycles to finish and each remaining instruction packet takes 1 cycle to finish. Note: detailed computation steps must be included; otherwise, you lose 1 point.

4) [20 points] Unroll the loop 4 times: update the RISC-V assembly program below by filling in instructions and comments on lines 9 to 18. Make sure you apply the register renaming technique to each unrolled loop iteration to avoid false dependencies. The extra registers you can use are x29 to x31; other registers are not allowed. Each instruction counts for 1 point and each comment counts for 1 point.

    1.        addi x5, x0, 0
    2.        addi x6, x0, 16
    3.  loop:
    4.        slli x7, x5, 2
    5.        add  x7, x10, x7
    6.        lw   x28, 4(x7)
    7.        mul  x28, x28, x6
    8.        sw   x28, 0(x7)
    9.
    10.
    11.
    12.
    13.
    14.
    15.
    16.
    17.
    18.
    19.       blt  x5, x6, loop    # if (i < size) go to loop
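
As a source-level analogy for the unrolling and register renaming described in question 4), the sketch below unrolls a different, simpler loop by 4 in C++. It is not the assembly answer. The function name scale_unrolled_by_4, its parameters, and the in-place scaling loop are illustrative assumptions that do not appear in the assignment; the four temporaries t0 to t3 stand in for renamed registers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of the assignment): multiplies n elements
// in place by `factor`, handling four elements per loop iteration.
// Assumption: n is a multiple of 4, so no cleanup loop is needed.
void scale_unrolled_by_4(std::int32_t* data, std::size_t n, std::int32_t factor) {
    assert(n % 4 == 0);  // assumed: trip count divisible by the unroll factor
    for (std::size_t i = 0; i < n; i += 4) {
        // Each copy of the body gets its own temporary. Reusing a single
        // temporary would create false (name) dependencies between copies,
        // which is what register renaming removes at the assembly level.
        const std::int32_t t0 = data[i]     * factor;
        const std::int32_t t1 = data[i + 1] * factor;
        const std::int32_t t2 = data[i + 2] * factor;
        const std::int32_t t3 = data[i + 3] * factor;
        data[i]     = t0;
        data[i + 1] = t1;
        data[i + 2] = t2;
        data[i + 3] = t3;
    }
}
```

Because the four copies of the body no longer share a temporary, a scheduler is free to interleave their loads, multiplies, and stores, and the loop counter is updated once per four elements instead of once per element.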

5) [22 points] Similar to question 2), schedule the unrolled version of the RISC-V assembly program from question 4) on the RISC-V CPU with the classical five-stage pipeline and static two-issue packets. Fill in the following table starting with cycle n+2; you can add rows to the table as needed.

   ALU/branch            Load/store    Cycle
   addi x5, x0, 0        nop           n
   addi x6, x0, 16       nop           n+1
   loop:                               n+2
                                       n+3

6) [2 points] Given your scheduling in question 5), how many cycles in total does it take to finish executing the entire RISC-V program? Assume the first instruction packet takes 5 cycles to finish and each remaining instruction packet takes 1 cycle to finish. Note: detailed computation steps must be included; otherwise, you lose 1 point.
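
The counting rule stated in questions 3) and 6) (first packet takes 5 cycles, every later packet takes 1) can be written as a small formula. The sketch below is only one way to organize that arithmetic; the function names program_packets and total_cycles and their parameters are hypothetical, and the packet counts you pass in must come from your own schedule.

```cpp
#include <cassert>

// Hypothetical helper: number of issue packets executed by a program that
// runs `setup_packets` packets once, then a loop body of
// `packets_per_iteration` packets for `iterations` iterations.
// Padded nop slots belong to a packet, so they are already included.
long program_packets(long setup_packets, long packets_per_iteration, long iterations) {
    assert(setup_packets >= 0 && packets_per_iteration >= 0 && iterations >= 0);
    return setup_packets + packets_per_iteration * iterations;
}

// Hypothetical helper: total cycles under the assignment's stated assumption
// that the first packet takes 5 cycles and each remaining packet takes 1.
long total_cycles(long packets) {
    assert(packets >= 1);
    return 5 + (packets - 1);
}
```

For example, with placeholder values of 2 setup packets and a 3-packet loop body run 4 times, program_packets(2, 3, 4) gives 14 packets and total_cycles(14) gives 5 + 13 = 18 cycles. These numbers are illustrative only and do not correspond to any particular schedule for this program.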