26.1-10 For what number of processors do the two versions of the chess program run equally fast, assuming that TP = T1/P + T∞? A chess lesson To illustrate the power of work/span analysis, this section closes with a true story that occurred during the development of one of the first world-class parallel chess-playing programs [106] many years ago. The timings below have been simplified for exposition. The chess program was developed and tested on a 32-processor computer, but it was designed to run on a supercomputer with 512 processors. Since the supercomputer availability was limited and expensive, the developers ran benchmarks on the small computer and extrapolated performance to the large computer. At one point, the developers incorporated an optimization into the program that reduced its running time on an important benchmark on the small machine from T32 = 65 seconds to seconds. Yet, the developers used the work and span performance measures to conclude that the optimized version, which was faster on 32 processors, would actually be slower than the original version on the 512 processors of the large machine. As a result, they abandoned the “optimization.” Here is their work/span analysis. The original version of the program had work T1 = 2048 seconds and span T∞= 1 second. Let’s treat inequality (26.4) on page 760 as the equation TP = T1/P + T∞, which we can use as an approximation to the running time on P processors. Then indeed we have T32 = 2048/32 + 1 = 65. With the optimization, the work becomes T′1 = 1024 seconds, and the span becomes T′∞ = 8 seconds. Our approximation gives T′32 = 1024/32 + 8 = 40. The relative speeds of the two versions switch when we estimate their running times on 512 processors, however. The first version has a running time of T512 = 2048/512+1 = 5 seconds, and the second version runs in seconds. The optimization that speeds up the program on 32 processors makes the program run for twice as long on 512 processors! The optimized version’s span of 8, which is not the dominant term in the running time on 32 processors, becomes the dominant term on 512 processors, nullifying the advantage from using more processors. The optimization does not scale up. The moral of the story is that work/span analysis, and measurements of work and span, can be superior to measured running times alone in extrapolating an algorithm’s scalability.

26.1-10 For what number of processors do the two versions of the chess program run equally fast, assuming that TP = T1/P + T∞? A chess lesson To illustrate the power of work/span analysis, this section closes with a true story that occurred during the development of one of the first world-class parallel chess-playing programs [106] many years ago. The timings below have been simplified for exposition. The chess program was developed and tested on a 32-processor computer, but it was designed to run on a supercomputer with 512 processors. Since the supercomputer availability was limited and expensive, the developers ran benchmarks on the small computer and extrapolated performance to the large computer. At one point, the developers incorporated an optimization into the program that reduced its running time on an important benchmark on the small machine from T32 = 65 seconds to seconds. Yet, the developers used the work and span performance measures to conclude that the optimized version, which was faster on 32 processors, would actually be slower than the original version on the 512 processors of the large machine. As a result, they abandoned the “optimization.” Here is their work/span analysis. The original version of the program had work T1 = 2048 seconds and span T∞= 1 second. Let’s treat inequality (26.4) on page 760 as the equation TP = T1/P + T∞, which we can use as an approximation to the running time on P processors. Then indeed we have T32 = 2048/32 + 1 = 65. With the optimization, the work becomes T′1 = 1024 seconds, and the span becomes T′∞ = 8 seconds. Our approximation gives T′32 = 1024/32 + 8 = 40. The relative speeds of the two versions switch when we estimate their running times on 512 processors, however. The first version has a running time of T512 = 2048/512+1 = 5 seconds, and the second version runs in seconds. The optimization that speeds up the program on 32 processors makes the program run for twice as long on 512 processors! The optimized version’s span of 8, which is not the dominant term in the running time on 32 processors, becomes the dominant term on 512 processors, nullifying the advantage from using more processors. The optimization does not scale up. The moral of the story is that work/span analysis, and measurements of work and span, can be superior to measured running times alone in extrapolating an algorithm’s scalability.

Database System Concepts

7th Edition

ISBN:9780078022159

Author:Abraham Silberschatz Professor, Henry F. Korth, S. Sudarshan

Publisher:Abraham Silberschatz Professor, Henry F. Korth, S. Sudarshan

Chapter1: Introduction

Section: Chapter Questions

Problem 1PE

See similar textbooks

Related questions

Concept explainers

Question

26.1-10
For what number of processors do the two versions of the chess
program run equally fast, assuming that TP = T1/P + T∞?

A chess lesson
To illustrate the power of work/span analysis, this section closes with a
true story that occurred during the development of one of the first
world-class parallel chess-playing programs [106] many years ago. The
timings below have been simplified for exposition.
The chess program was developed and tested on a 32-processor
computer, but it was designed to run on a supercomputer with 512
processors. Since the supercomputer availability was limited and
expensive, the developers ran benchmarks on the small computer and
extrapolated performance to the large computer.
At one point, the developers incorporated an optimization into the
program that reduced its running time on an important benchmark on
the small machine from T32 = 65 seconds to seconds. Yet, the
developers used the work and span performance measures to conclude
that the optimized version, which was faster on 32 processors, would
actually be slower than the original version on the 512 processors of the
large machine. As a result, they abandoned the “optimization.”
Here is their work/span analysis. The original version of the program
had work T1 = 2048 seconds and span T∞= 1 second. Let’s treat
inequality (26.4) on page 760 as the equation TP = T1/P + T∞, which
we can use as an approximation to the running time on P processors.
Then indeed we have T32 = 2048/32 + 1 = 65. With the optimization,
the work becomes T′1 = 1024 seconds, and the span becomes T′∞ = 8
seconds. Our approximation gives T′32 = 1024/32 + 8 = 40.
The relative speeds of the two versions switch when we estimate their
running times on 512 processors, however. The first version has a
running time of T512 = 2048/512+1 = 5 seconds, and the second version
runs in seconds. The optimization that speeds up
the program on 32 processors makes the program run for twice as long
on 512 processors! The optimized version’s span of 8, which is not the
dominant term in the running time on 32 processors, becomes the
dominant term on 512 processors, nullifying the advantage from using
more processors. The optimization does not scale up.
The moral of the story is that work/span analysis, and measurements
of work and span, can be superior to measured running times alone in
extrapolating an algorithm’s scalability.

• Note: For exercise 26.1-10, when it refers to two versions of a chess program, it is referring to the case described on pages 768-769 in a section titled "A chess lesson." You should read that entire section to understand the scenario, but it is the
4th paragraph (the paragraph that begins at the bottom of page 768 and concludes at the top of page 769) that has all of the data you need to answer the question. Specifically, it wants you to determine the value of P such that the chess
program with the optimization and the chess program without the optimization would expect to have the same T_P, using the equation indicated in the problem. That paragraph gives you T_1 and T_inf for both versions, so you need to set up
an equation where on the left you have the equation for T_P from the question using T_1 and T_inf for the first version, and on the right you do the same for the second version, and then solve the equation for P.

Expert Solution

Step by step

Solved in 3 steps with 1 images

SEE SOLUTION Check out a sample Q&A here

Knowledge Booster

Learn more about

Need a deep-dive on the concept behind this application? Look no further. Learn more about this topic, computer-science and related others by exploring similar questions and additional content below.