Q. What is the baseline performance (in cycles, per loop iteration) of the code sequence in if no new instruction’s execution could be initiated until the previous instruction’s execution had completed? Ignore front-end fetch and decode. Assume for now that execution does not stall for lack of the next instruction, but only one instruction/cycle can be issued. Assume the branch is taken, and that there is a one-cycle branch delay slot. Latencies beyond single cycle: Latencies beyond single cycle: Memory LD +6 Memory SD +4 Integer ADD, SUB +1 Branches +2 fadd.d +4 fmul.d +6 fdiv.d +10 Execution code Loop : fld f2, 0(Rx) I0: fdiv.d f8, f2, f0 I1: fmul.d f2, f6, f2 I2: fld f4, 0(Ry) I3: fadd.d f4, f0, f4 I4: fadd.d f10, f8, f4 I5: fsd f10, 0(Ry) I6: addi Rx, Ry, 8 I7: addi Ry, Ry, 8 I8: sub x20, x4, Rx I9: bnz x20, Loop Q. Think about what latency numbers really mean—they indicate the number of cycles a given function requires to produce its output. If the overall pipeline stalls for the latency cycles of each functional unit, then you are at least guaranteed that any pair of back-to-back instructions (a “producer” followed by a “consumer”) will execute correctly. But not all instruction pairs have a producer/consumer relationship. Sometimes two adjacent instructions have nothing to do with each other. How many cycles would the loop body in the code sequence require if the pipeline detected true data dependences and only stalled on those, rather than blindly stalling everything just because one functional unit is busy? Show the code with inserted where necessary to accommodate stated latencies. (Hint: an instruction with latency +2 requires two cycles to be inserted into the code sequence.) Think of it this way: a one-cycle instruction has latency 1 + 0, meaning zero extra wait states. So, latency 1 + 1 implies one stall cycle; latency 1 +N has N extra stall cycles. Q. Consider a multiple-issue design. Suppose you have two execution pipelines, each capable of beginning execution of one instruction per cycle, and enough fetch/decode bandwidth in the front end so that it will not stall your execution. Assume results can be immediately forwarded from one execution unit to another, or to itself. Further assume that the only reason an execution pipeline would stall is to observe a true data dependency. Now how many cycles does the loop require? Q.   Reorder the instructions to improve performance of the code in above values Assume the two-pipe machine in and that the out-oforder completion issues of Exercise 3.4 have been dealt with successfully. Just worry about observing true data dependences and functional unit latencies for now. How many cycles does your reordered code take?

Database System Concepts
7th Edition
ISBN:9780078022159
Author:Abraham Silberschatz Professor, Henry F. Korth, S. Sudarshan
Publisher:Abraham Silberschatz Professor, Henry F. Korth, S. Sudarshan
Chapter1: Introduction
Section: Chapter Questions
Problem 1PE
icon
Related questions
Topic Video
Question

Q. What is the baseline performance (in cycles, per loop iteration) of the code sequence in if no new instruction’s execution could be initiated until the previous instruction’s execution had completed? Ignore front-end fetch and decode. Assume for now that execution does not stall for lack of the next instruction, but only one instruction/cycle can be issued. Assume the branch is taken, and that there is a one-cycle branch delay slot. Latencies beyond single cycle:

Latencies beyond single cycle:
Memory LD +6
Memory SD +4
Integer ADD, SUB +1
Branches +2
fadd.d +4
fmul.d +6
fdiv.d +10

Execution code
Loop : fld f2, 0(Rx)
I0: fdiv.d f8, f2, f0
I1: fmul.d f2, f6, f2
I2: fld f4, 0(Ry)
I3: fadd.d f4, f0, f4
I4: fadd.d f10, f8, f4
I5: fsd f10, 0(Ry)
I6: addi Rx, Ry, 8
I7: addi Ry, Ry, 8
I8: sub x20, x4, Rx
I9: bnz x20, Loop

Q. Think about what latency numbers really mean—they indicate the number of cycles a given function requires to produce its output. If the overall pipeline stalls for the latency cycles of each functional unit, then you are at least guaranteed that any pair of back-to-back instructions (a “producer” followed by a “consumer”) will execute correctly. But not all instruction pairs have a producer/consumer relationship. Sometimes two adjacent instructions have nothing to do with each other. How many cycles would the loop body in the code sequence require if the pipeline detected true data dependences and only stalled on those, rather than blindly stalling everything just because one functional unit is busy? Show the code with inserted where necessary to accommodate stated latencies. (Hint: an instruction with latency +2 requires two cycles to be inserted into the code sequence.) Think of it this way: a one-cycle instruction has latency 1 + 0, meaning zero extra wait states. So, latency 1 + 1 implies one stall cycle; latency 1 +N has N extra stall cycles.

Q. Consider a multiple-issue design. Suppose you have two execution pipelines, each capable of beginning execution of one instruction per cycle, and enough fetch/decode bandwidth in the front end so that it will not stall your execution. Assume results can be immediately forwarded from one execution unit to another, or to itself. Further assume that the only reason an execution pipeline would stall is to observe a true data dependency. Now how many cycles does the loop require?

Q.   Reorder the instructions to improve performance of the code in above values Assume the two-pipe machine in and that the out-oforder completion issues of Exercise 3.4 have been dealt with successfully. Just worry about observing true data dependences and functional unit latencies for now. How many cycles does your reordered code take?

Expert Solution
trending now

Trending now

This is a popular solution!

steps

Step by step

Solved in 3 steps

Blurred answer
Knowledge Booster
Instruction Format
Learn more about
Need a deep-dive on the concept behind this application? Look no further. Learn more about this topic, computer-science and related others by exploring similar questions and additional content below.
Similar questions
Recommended textbooks for you
Database System Concepts
Database System Concepts
Computer Science
ISBN:
9780078022159
Author:
Abraham Silberschatz Professor, Henry F. Korth, S. Sudarshan
Publisher:
McGraw-Hill Education
Starting Out with Python (4th Edition)
Starting Out with Python (4th Edition)
Computer Science
ISBN:
9780134444321
Author:
Tony Gaddis
Publisher:
PEARSON
Digital Fundamentals (11th Edition)
Digital Fundamentals (11th Edition)
Computer Science
ISBN:
9780132737968
Author:
Thomas L. Floyd
Publisher:
PEARSON
C How to Program (8th Edition)
C How to Program (8th Edition)
Computer Science
ISBN:
9780133976892
Author:
Paul J. Deitel, Harvey Deitel
Publisher:
PEARSON
Database Systems: Design, Implementation, & Manag…
Database Systems: Design, Implementation, & Manag…
Computer Science
ISBN:
9781337627900
Author:
Carlos Coronel, Steven Morris
Publisher:
Cengage Learning
Programmable Logic Controllers
Programmable Logic Controllers
Computer Science
ISBN:
9780073373843
Author:
Frank D. Petruzella
Publisher:
McGraw-Hill Education