Homework1-CPI (Prudhviraj Sheela)

docx

School

Oklahoma State University *

*We aren’t endorsed by this school

Course

1214

Subject

Computer Science

Date

Feb 20, 2024

Type

docx

Pages

Uploaded by LieutenantLemurMaster966

Name: Prudhviraj Sheela OSU CWID: A20228857 CS-5113 Computer Organization and Architecture Homework - (Chapter-1) Cycles Per Instruction (CPI) 1.15(13) Assume that we make an enhancement to a computer that improves some mode of execution by a factor of 10. Enhanced mode is used 50% of the time, measured as a percentage of the execution time when the enhanced mode is in use . Recall that Amdahl’s law depends on the fraction of the original, unenhanced execution time that could make use of enhanced mode. Thus, we cannot directly use this 50% measurement to compute speedup with Amdahl’s law. a) What is the speedup we have obtained from fast mode? Answer) Speedup obtained from fast mode is calculated as follows,  We are given the fraction of execution time that the enhancement component is used after the enhancement has taken place. So, we call the F(denhanced) to show that we can reverse the process to the original fraction that can be enhanced.  So we initially apply the Amdhal’s law in the forward direction to get an expression for the ratio of new to old execution times.  Later, we apply the Amdhal’s law in backwards time to obtain the expression for the ratio of old to latest execution times. Execution_time(new) / Execution_time(old) = (1 – F(enhanced)) + (F(enhanced) * (1/speedup(enhanced))) = (1 – F(enhanced)) + (F(enhanced)*(1/10)) = (1 – F(enhanced)) + (0.1 * F(enhanced)) = 1 – (0.9 * F(enhanced)) –-> (1) Execution_time(old) / Execution_time(new) = (1 – F(denhanced)) + (F(denhanced) * (1/speedup(denhanced))) = (1 - 0.5) + (0.5 * (1/(1/10))) = 0.5 + (0.5/0.1) = 0.5 + 5 = 5.5 .˙. Speedup obtained from fast mode = Execution_time(old) / Execution_time(new) = 5.5 b) What percentage of the original execution time has been converted to fast mode? Answer) From above (1), we can see that  Execution_time(new) / Execution_time(old) = 1 – (0.9 * F(enhanced)) 1/5.5 = 1 – (0.9 * F(enhanced)) (5.5) * (1 - (0.9 * F(enhanced))) = 1 5.5 – (4.95 * F(enhanced)) = 1 4.95 * F(enhanced) = 5.5 – 1  4.95 * F(enhanced) = 4.5  F(enhanced) = (4.5 / 4.95) F(enhanced) = 0.91 .˙. The percentage of the original execution time has been converted to fast mode = 0.91 = 91%  We can cross check the speedup of the fast mode by substituting the value of F(enhanced) in the above eq’n (1) Speedup = 1 / (Execution_time(new) / Execution_time(old)) = 1 / (1 – F(enhanced)) + (0.1 * F(enhanced)) = 1 / (1 – 0.91) + (0.1 * 0.91) = 1 / (0.09 + 0.091) = 1 / (0.181) = 5.52 = 5.5(approx) .˙. Speedup obtained from fast mode = 5.5 (This satisfies the above (a) question) 1.16(14) When making changes to optimize part of a processor, it is often the case that speeding up one type of instruction comes at the cost of slowing down something else. For example, if we put in a complicated fast floating- point unit, that takes space, and something might have to be moved farther away from the middle to accommodate it, adding an extra cycle in delay to reach that unit. The basic Amdahl’s law equation does not take into account this trade-off. a) If the new fast floating-point unit speeds up floating-point operations by, on average, 2×, and floating-point operations take 20% of the original program’s execution time, what is the overall speedup (ignoring the penalty to

any other instructions)? Answer) Overall Speedup based on the given data = 1 / (1 – F(enhanced)) + (F(enhanced) / Speedup(F(enhanced))) = 1 / ((1 – 0.2) + (0.2/2)) = 1 / (0.8 + 0.1) = 1 / 0.9 = 1.111 .˙. Overall Speedup = 1.11 b) Now assume that speeding up the floating-point unit slowed down data cache accesses, resulting in a 1.5× slowdown (or 2/3 speedup). Data cache accesses consume 10% of the execution time. What is the overall speedup now? Answer) Overall Speed based on the given data = 1 / [(1 – F old (enhanced) – F new (denhanced)) + (F old (enhanced) / Speedup(F old (enhanced))) + (F new (denhanced) / Speedup(F new (denhanced)))]  Considering the data given in (a) the overall speed now changes as follows, Overall Speedup = 1 / ((1 - 0.2 - 0.1) + (0.2/2) + ((0.1) / (2/3)) = 1 / (0.7 + 0.1 + ((0.1) * (3/2)) = 1 / (0.8 + 0.15) = 1 / (0.95) = 1.052 .˙. Overall Speedup = 1.05 c) After implementing the new floating-point operations, what percentage of execution time is spent on floating- point operations? What percentage is spent on data cache accesses? Answer) i) Percentage of execution time spent on floating point operations = 0.1/0.95 = 0.105 = 10.5% ii) Percentage spent on data cache accesses = 0.15/0.95 = 0.158 = 15.8% 1.17() Your company has just bought a new Intel Core i5 dual- core processor, and you have been tasked with optimizing your software for this processor. You will run two applications on this dual core, but the resource requirements are not equal. The first application requires 80% of the resources, and the other only 20% of the resources. Assume that when you parallelize a portion of the program, the speedup for that portion is 2. a) Given that 40% of the first application is parallelizable, how much speedup would you achieve with that application if run in isolation? Answer) Speedup if run in isolation = (Execution time without E / Execution time with E) = 1 / ((1 – F) + (F * (1/speedup for parallelized portion))) = 1 / (1 – 0.4) + (0.4 * (1/2)) = 1 / (0.6 + 0.2) = 1 / 0.8 = 1.25 .˙. Speedup for application if run in isolation = 1.25 b) Given that 99% of the second application is parallelizable, how much speedup would this application observe if run in isolation? Answer) Speedup if run in isolation = (Execution time without E / Execution time with E) = 1 / ((1 – F) + (F * (1/speedup for parallelized portion))) = 1 / (1 – 0.99) + (0.99 * (1/2)) = 1 / (0.01 + 0.495) = 1 / 0.505 = 1.9801 .˙. Speedup for application if run in isolation = 1.98 c) Given that 40% of the first application is parallelizable, how much overall system speedup would you observe if you parallelized it? Answer) Overall system speedup if parallelized by 40% = 1 / F(second_application) + (F(first_application) * ((1 – F(enhanced)) + (F(parallel) / Speedup(parallel)))) = 1 / (0.2 + (0.8 * (0.6 + (0.4/2)))) = 1 / (0.2 + 0.48 + 0.16) = 1 / 0.84 = 1.1904 .˙. Overall system speedup if parallelized by 40% = 1.19 d) Given that 99% of the second application is parallelizable, how much overall system speedup would you observe if you parallelized it? Answer) Overall system speedup if parallelized by 99%

= 1 / F(first_application) + (F(second_application) * ((1 – F(enhanced)) + (F(parallel) / Speedup(parallel)))) = 1 / (0.8 + (0.2 * (0.01 + (0.99/2)))) = 1 / (0.8 + 0.002 + 0.099) = 1 / 0.901 = 1.109 .˙. Overall system speedup if parallelized by 99% = 1.11 1.18(16) When parallelizing an application, the ideal speedup is speeding up by the number of processors. This is limited by two things: percent- age of the application that can be parallelized and the cost of communication. Amdahl’s law takes into account the former but not the latter. a) What is the speedup with N processors if 80% of the application is parallelizable, ignoring the cost of communication? Answer) Speedup with N processors = (Execution time without E / Execution time with E) = 1 / ((1 – F) + (F * (1 / speedup for parallelized portion))) = 1 / ((1 – 0.8) + (0.8 * (1 / N))) = 1 / (0.2 + (0.8/N)) .˙. Speedup with N processors = 1 / (0.2 + (0.8/N)) b) What is the speedup with 8 processors if, for every processor added, the communication overhead is 0.5% of the original execution time? Answer) Speedup with 8 processors = (Execution time without E / Execution time with E) = 1 / ((1 – F) + (F * (1 / speedup for parallelized portion))) = 1 / ((1 - 0.8) + (8 * 0.005) + (0.8/8)) = 1 / (0.2 + 0.04 + 0.1) = 1 / 0.34 = 2.941 .˙. Speedup with 8 processors = 2.94 c) What is the speedup with 8 processors if, for every time the number of processors is doubled, the communication overhead is increased by 0.5% of the original execution time? Answer) Speedup with 8 process, for every time the number of processors is doubled is calculated as, Speedup = (Execution time without E / Execution time with E) = 1 / ((1 – F) + (F * (1 / speedup for parallelized portion))) = 1 / ((1 - 0.8) + (3 * 0.005) + (0.8/8)) = 1 / (0.2 + 0.015 + 0.1) = 1 / 0.315 = 3.174 .˙. Speedup with 8 process, for every time the number of processors is doubled = 3.17 d) What is the speedup with N processors if, for every time the number of processors is doubled, the communication overhead is increased by 0.5% of the original execution time? Answer) Speedup with N processes, for every time the number of processes is doubled is calculated as, Speedup = (Execution time without E / Execution time with E) = 1 / ((1 – F) + (F * (1 / speedup for parallelized portion))) = 1 / ((1-0.8) + (logN * 0.005) + (0.8/N)) = 1 / (0.2 + (logN * 0.005) + (0.8/N)) .˙. Speedup with N process, for every time the number of processors is doubled = 1 / (0.2 + (logN * 0.005) + (0.8/N)) e) Write the general equation that solves this question: What is the number of processors with the highest speedup in an application in which P % of the original execution time is parallelizable, and, for every time the number of processors is doubled, the communication is increased by 0.5% of the original execution time? Answer) The general equation that solves this question is, d / (d*N) * (Speedup) = 0 d / (d*N) * (Execution time without E / Execution time with E) = 0

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

d / (d*N) * (1 / ((1 – F(enhnacement)) + (F(enhancement) * (1 / speedup for parallelized portion)))) = 0 .˙. d / (d*N) * (1 / ((1-P) + (logN * 0.005) + (P/N)) = 0 (Final General Equation obtained)

Homework1-CPI (Prudhviraj Sheela)

Related Documents