Please written by computer source   Question 1:   Suppose we wish to write a procedure that computes the inner product of two vectors u and v. An abstract version of the function has a CPE of 14–18 with x86- 64 for different types of integer and floating-point data. By doing the same sort of transformations we did to transform the abstract program combine1 into the more efficient combine4, we get the following code:     Our measurements show that this function has CPEs of 1.50 for integer data and 3.00 for floating-point data. For data type double, the x86-64 assembly code for the inner loop is as follows:   Assume that the functional units have the characteristics listed in Figure 5.12.   **See last page for figures   A. Diagram how this instruction sequence would be decoded into operations and show how the data dependencies between them would create a critical path of operations, in the style of textbook Figures 5.13 and 5.14.         vmovsd vmovsd vmulsd vaddsd           | | | |           V V V V   Get udata(i) Load vdata(i) Multiply Add to sum                      |                                  V |                    5. Increment i |                                    V                                   6. Compare limit     B. For data type double, what lower bound on the CPE is determined by the critical path?   C. Assuming similar instruction sequences for the integer code as well, what lower bound on the CPE is determined by the critical path for integer data?   D. Explain how the floating-point versions can have CPEs of 3.00, even though the multiplication operation requires 5 clock cycles.   The processor can issue one multiplication per cycle if there are no data dependencies between the multiplications. Processors also have multiple functional units for performing floating-point operations, which can further increase the parallelism and reduce the latency of the critical path.   Question 2:   Write a version of the inner product procedure described in Question 1 that uses 6 × 6 loop unrolling. Our measurements for this function with x86-64 give a CPE of 1.06 for integer data and 1.01 for floating-point data.   What factor limits the performance to a CPE of 1.00?   Question 3:   Write a version of the inner product procedure described in Question 1 that uses 6 × 1a loop unrolling to enable greater parallelism. Our measurements for this function give a CPE of 1.10 for integer data and 1.05 for floating-point data.

Database System Concepts
7th Edition
ISBN:9780078022159
Author:Abraham Silberschatz Professor, Henry F. Korth, S. Sudarshan
Publisher:Abraham Silberschatz Professor, Henry F. Korth, S. Sudarshan
Chapter1: Introduction
Section: Chapter Questions
Problem 1PE
icon
Related questions
Question

Please written by computer source

 

Question 1:

 

Suppose we wish to write a procedure that computes the inner product of two vectors u and v. An abstract version of the function has a CPE of 14–18 with x86- 64 for different types of integer and floating-point data. By doing the same sort of transformations we did to transform the abstract program combine1 into the more efficient combine4, we get the following code:

 

 

Our measurements show that this function has CPEs of 1.50 for integer data and 3.00 for floating-point data. For data type double, the x86-64 assembly code for the inner loop is as follows:

 

Assume that the functional units have the characteristics listed in Figure 5.12.

 

**See last page for figures

 

A. Diagram how this instruction sequence would be decoded into operations and show how the data dependencies between them would create a critical path of operations, in the style of textbook Figures 5.13 and 5.14.

 

      vmovsd vmovsd vmulsd vaddsd

 

        | | | |

 

        V V V V

 

Get udata(i) Load vdata(i) Multiply Add to sum

 

                   |            

 

                   V |

 

                 5. Increment i |

 

                                 V

 

                                6. Compare limit

 

 

B. For data type double, what lower bound on the CPE is determined by the critical path?

 

C. Assuming similar instruction sequences for the integer code as well, what lower bound on the CPE is determined by the critical path for integer data?

 

D. Explain how the floating-point versions can have CPEs of 3.00, even though the multiplication operation requires 5 clock cycles.

 

The processor can issue one multiplication per cycle if there are no data dependencies between the multiplications. Processors also have multiple functional units for performing floating-point operations, which can further increase the parallelism and reduce the latency of the critical path.

 

Question 2:

 

Write a version of the inner product procedure described in Question 1 that uses 6 × 6 loop unrolling. Our measurements for this function with x86-64 give a CPE of 1.06 for integer data and 1.01 for floating-point data.

 

What factor limits the performance to a CPE of 1.00?

 

Question 3:

 

Write a version of the inner product procedure described in Question 1 that uses 6 × 1a loop unrolling to enable greater parallelism. Our measurements for this function give a CPE of 1.10 for integer data and 1.05 for floating-point data.

Expert Solution
trending now

Trending now

This is a popular solution!

steps

Step by step

Solved in 4 steps

Blurred answer
Knowledge Booster
Functions
Learn more about
Need a deep-dive on the concept behind this application? Look no further. Learn more about this topic, computer-science and related others by exploring similar questions and additional content below.
Similar questions
Recommended textbooks for you
Database System Concepts
Database System Concepts
Computer Science
ISBN:
9780078022159
Author:
Abraham Silberschatz Professor, Henry F. Korth, S. Sudarshan
Publisher:
McGraw-Hill Education
Starting Out with Python (4th Edition)
Starting Out with Python (4th Edition)
Computer Science
ISBN:
9780134444321
Author:
Tony Gaddis
Publisher:
PEARSON
Digital Fundamentals (11th Edition)
Digital Fundamentals (11th Edition)
Computer Science
ISBN:
9780132737968
Author:
Thomas L. Floyd
Publisher:
PEARSON
C How to Program (8th Edition)
C How to Program (8th Edition)
Computer Science
ISBN:
9780133976892
Author:
Paul J. Deitel, Harvey Deitel
Publisher:
PEARSON
Database Systems: Design, Implementation, & Manag…
Database Systems: Design, Implementation, & Manag…
Computer Science
ISBN:
9781337627900
Author:
Carlos Coronel, Steven Morris
Publisher:
Cengage Learning
Programmable Logic Controllers
Programmable Logic Controllers
Computer Science
ISBN:
9780073373843
Author:
Frank D. Petruzella
Publisher:
McGraw-Hill Education