RadhikaRaghuwanshi_HW6

pdf

School

George Washington University *

*We aren’t endorsed by this school

Course

6461

Subject

Industrial Engineering

Date

Dec 6, 2023

Type

pdf

Pages

Uploaded by KidMaskWaterBuffalo16

5.21 Assume that we have a function for an application of the form F (i, p), which gives the fraction of time that exactly i processors are usable given that a total of p processors are available. This means that Assume that when i processors are in use, the applications run i times faster. a. Rewrite Amdahl’s Law so that it gives the speedup as a function of p for some application. According to Amdahl’s law: Speedup = Execution time old/ Execution time new = T/t Execution time that uses i processers is given by F(i. p) This will give t = T * b. An application A runs on single processor for a time T seconds. Different portions of its running time can improve if a larger number of processors is used. Figure 5.40 provides the details. How much speedup will A achieve when on 8 processors? Solution: 8 processors: T(0.2/1+ 0.2/2+0.1/4+0.5/6+0.45/8) = 0.2 + 0.1 + 0.025 + 0.008+ 0.056 = 0.389 * T c. Repeat for 32 processors and an infinite number of processors. 32 processors: T(0.2/1+ 0.2/2+0.1/4+0.5/6+0.15/8+0.3/16) = 0.2+ 0.1+0.025+0.008+0.01875 = 0.37 *T Infinite Processors: T(0.2/1+ 0.2/2+0.1/4+0.5/6+0.15/8+0.2/16+0.1/128) = 0.36 * T 5.22 In this exercise, we examine the effect of the interconnection network topology on the CPI of programs running on a 64-processor distributed-memory multiprocessor. The processor clock rate is 2.0 GHz, and the base CPI of an application with all references hitting in the cache is 0.75. Assume that 0.2% of the instructions involve a remote communication reference. The cost of a remote communication reference is (100 + 10 h) ns, h being the number of communication network hops that a remote reference has to make to the remote processor memory and back. Assume all communication links are bidirectional. a. Calculate the worst-case remote communication cost when the 64 processors are arranged as a ring, as an 8x8 processor grid, or as a hypercube (hint: longest communication path on a 2n hypercube has n links).

Arranged as ring: Largest number of communication hops =32 Communication Cost = (100 + 10*32) ns = 420 ns. Arranged as 8 x 8 processor grid: Largest number of communication hops = 14 Communication Cost = (100 + 10*14) ns = 240 ns. Arranged as hypercube: Largest number of hops = 6 Communication Cost = (100 + 10*6)ns = 160 ns. b. Compare the base CPI of the application with no remote communication to the CPI achieved with each of the three topologies in part (a). Given Base CPI = 0.75 Arranged as ring: Worst Case CPI = 0.75 + 0.2/100 * (420) = 1.34 cycles/inst Arranged as 8 x 8 processor grid: Worst Case CPI = 0.75 + 0.2/100 * (240) = 0.98 cycles/inst Arranged as hypercube: Worst Case CPI = 0.75 + 0.2/100 * (160) = 0.82 cycles/inst The average CPI can be obtained by replacing largest number of communication hops in the calculation by the average number of communication hops.

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

RadhikaRaghuwanshi_HW6

Related Documents