RadhikaRaghuwanshi_HW7 (pdf)

School: George Washington University
Course: 6461
Subject: Computer Science
Date: Dec 6, 2023
Pages: 4

6.14 MapReduce enables large amounts of parallelism by having data-independent tasks run on multiple nodes, often using commodity hardware; however, there are limits to the level of parallelism. For example, for redundancy MapReduce will write data blocks to multiple nodes, consuming disk and, potentially, network bandwidth. Assume a total dataset size of 300 GB, a network bandwidth of 1 Gb/s, a 10 s/GB map rate, and a 20 s/GB reduce rate. Also assume that 30% of the data must be read from remote nodes, and each output file is written to two other nodes for redundancy. Use Figure 6.6 for all other parameters.

a. Assume that all nodes are in the same rack. What is the expected runtime with 5 nodes? 10 nodes? 100 nodes? 1000 nodes? Discuss the bottlenecks at each node size.

Given:
Total dataset size = 300 GB
Network bandwidth = 1 Gb/s
Map rate = 10 s/GB
Reduce rate = 20 s/GB
30% of the data must be read from remote nodes
Each output file is written to two other nodes for redundancy

5 nodes:
With five nodes, each node processes 300 GB / 5 = 60 GB.
60 GB x 0.3 = 18 GB must be read remotely; thus 42 GB is read from local disk.
Remote data access time = 18,000 MB / (100 MB/s) = 180 s
Local data access time = 42,000 MB / (200 MB/s) = 210 s
Map time = 60 GB x 10 s/GB = 600 s
Reduce time = 60 GB x 20 s/GB = 1200 s
Total expected runtime = (180 + 210) x 2 + 600 + 1200 = 2580 s
The primary bottleneck is the reduce phase; the total data-transfer time is the secondary bottleneck.

10 nodes:
With ten nodes, each node processes 300 GB / 10 = 30 GB.
30 GB x 0.3 = 9 GB must be read remotely.
Thus, 21 GB is read from local disk.
Remote data access time = 9,000 MB / (100 MB/s) = 90 s
Local data access time = 21,000 MB / (200 MB/s) = 105 s
Map time = 30 GB x 10 s/GB = 300 s
Reduce time = 30 GB x 20 s/GB = 600 s
Total expected runtime = (90 + 105) x 2 + 300 + 600 = 1290 s

100 nodes:
With 100 nodes, each node processes 300 GB / 100 = 3 GB.
3 GB x 0.3 = 0.9 GB must be read remotely; thus 2.1 GB is read from local disk.
Remote data access time = 900 MB / (100 MB/s) = 9 s
Local data access time = 2,100 MB / (200 MB/s) = 10.5 s
Map time = 3 GB x 10 s/GB = 30 s
Reduce time = 3 GB x 20 s/GB = 60 s
Total expected runtime = (9 + 10.5) x 2 + 30 + 60 = 129 s

1000 nodes:
With 1000 nodes, each node processes 300 GB / 1000 = 0.3 GB.
0.3 GB x 0.3 = 0.09 GB must be read remotely; thus 0.21 GB is read from local disk.
Remote data access time = 90 MB / (100 MB/s) = 0.9 s
Local data access time = 210 MB / (200 MB/s) = 1.05 s
Map time = 0.3 GB x 10 s/GB = 3 s
Reduce time = 0.3 GB x 20 s/GB = 6 s
Total expected runtime = (0.9 + 1.05) x 2 + 3 + 6 = 12.9 s

The bottlenecks remain identical across the different node counts: all communication stays local within the rack and the problem is divided evenly, so every component of the runtime scales down by the same factor.

b. Assume that there are 40 nodes per rack and that any remote read/write has an equal chance of going to any node. What is the expected runtime at 100 nodes? 1000 nodes?

100 nodes:
With 40 nodes per rack, 100 nodes occupy 3 racks; assume an even distribution of nodes (approximately 33 per rack).
Each node processes 300 GB / 100 = 3 GB, so 900 MB must be read remotely, and a remote access has a 2/3 chance of going to another rack.
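The single-rack arithmetic in part (a) can be checked with a short model. This is only a sketch: the 200 MB/s local-disk and 100 MB/s intra-rack bandwidths are the Figure 6.6 values assumed in the calculations above, and the function name is illustrative.

```python
# Sketch of the part (a) single-rack runtime model.
# Assumed bandwidths (from Figure 6.6): 200 MB/s local disk,
# 100 MB/s intra-rack network.
DATASET_GB = 300
REMOTE_FRAC = 0.3   # fraction of each node's input read over the network
MAP_RATE = 10       # s/GB
REDUCE_RATE = 20    # s/GB
LOCAL_BW = 200      # MB/s, local disk
RACK_BW = 100       # MB/s, intra-rack network

def runtime(nodes):
    per_node_gb = DATASET_GB / nodes
    remote_gb = per_node_gb * REMOTE_FRAC
    local_gb = per_node_gb - remote_gb
    t_remote = remote_gb * 1000 / RACK_BW   # remote reads, seconds
    t_local = local_gb * 1000 / LOCAL_BW    # local reads, seconds
    # The x2 factor mirrors the derivation above: each output block
    # is also written to two other nodes for redundancy.
    return (t_remote + t_local) * 2 + per_node_gb * (MAP_RATE + REDUCE_RATE)

for n in (5, 10, 100, 1000):
    print(n, round(runtime(n), 2))
```

This reproduces the 2580 s, 1290 s, 129 s, and 12.9 s figures derived above.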
Thus 600 MB must be read from another rack at the inter-rack bandwidth of 10 MB/s, yielding 600 MB / (10 MB/s) = 60 s. The remaining 300 MB read from within the rack takes 300 MB / (100 MB/s) = 3 s. The map takes 30 s and the reduce takes 60 s.
Total expected runtime = (60 + 3) x 2 + 30 + 60 = 216 s
Data transfer from machines outside the rack becomes the bottleneck.

1000 nodes:
With 40 nodes per rack, 1000 nodes occupy 25 racks.
Each node processes 300 GB / 1000 = 0.3 GB = 300 MB, so 90 MB must be read remotely, and a remote access has a 24/25 chance of going to another rack.
Thus 86.4 MB must be read from another rack at the inter-rack bandwidth of 10 MB/s, yielding 8.64 s; the 3.6 MB read from within the rack takes 3.6 MB / (100 MB/s) = 0.036 s. The map takes 3 s and the reduce takes 6 s.
Total expected runtime = (8.64 + 0.036) x 2 + 3 + 6 = 26.352 s
Data transfer from machines outside the rack is again the bottleneck.

c. An important consideration is minimizing data movement as much as possible. Given the significant slowdown of going from local to rack to array accesses, software must be strongly optimized to maximize locality. Assume that there are 40 nodes per rack and that 1000 nodes are used in the MapReduce job. What is the runtime if remote accesses are within the same rack 20% of the time? 50% of the time? 80% of the time?

20%:
Of the 90 MB that must be read remotely, 72 MB comes from another rack and 18 MB from within the node's rack.
Inter-rack reads: 72 MB / (10 MB/s) = 7.2 s; intra-rack reads: 18 MB / (100 MB/s) = 0.18 s
Total expected runtime = (7.2 + 0.18) x 2 + 3 + 6 = 23.76 s

50%:
Of the 90 MB that must be read remotely, 45 MB comes from another rack and 45 MB from within the node's rack.
Inter-rack reads: 45 MB / (10 MB/s) = 4.5 s; intra-rack reads: 45 MB / (100 MB/s) = 0.45 s
Total expected runtime = (4.5 + 0.45) x 2 + 3 + 6 = 18.9 s

80%:
Of the 90 MB that must be read remotely, 18 MB comes from another rack and 72 MB from within the node's rack.
Inter-rack reads: 18 MB / (10 MB/s) = 1.8 s; intra-rack reads: 72 MB / (100 MB/s) = 0.72 s
Total expected runtime = (1.8 + 0.72) x 2 + 3 + 6 = 14.04 s
Now the reduce phase is the bottleneck.

d.
Given the simple MapReduce program in Section 6.2, discuss some possible optimizations to maximize the locality of the workload.

In the MapReduce program, the map phase produces a list entry for each word and the reduce phase sums the instances of each word. One option to maximize locality is, prior to the reduce phase, to sum up identical words locally and emit each word with its count, instead of emitting the word once for every occurrence. If a larger amount of data must be transferred, such as several lines of text surrounding the word (for example, when creating web-search snippets), then during the map phase the system could calculate how many bytes must be transferred and try to optimize the placement of the items to be reduced so that data transfers stay within racks.
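The local pre-aggregation idea can be pictured with a hypothetical word-count map task that carries an in-mapper combiner; the function names and data layout here are illustrative assumptions, not the book's code.

```python
from collections import Counter

def map_with_combiner(lines):
    """Hypothetical map task for word count. Instead of emitting one
    ('word', 1) pair per occurrence, identical words are summed locally
    and emitted once as ('word', count), shrinking the data that must
    be shuffled across the network to the reducers."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return sorted(counts.items())

def reduce_counts(emitted_pairs):
    """Reduce task: sum the partial counts produced by every map task."""
    totals = Counter()
    for word, count in emitted_pairs:
        totals[word] += count
    return dict(totals)

# One map task's slice of the input: 5 word occurrences collapse to 3 pairs.
pairs = map_with_combiner(["the cat sat", "the cat"])
print(pairs)                  # [('cat', 2), ('sat', 1), ('the', 2)]
print(reduce_counts(pairs))   # {'cat': 2, 'sat': 1, 'the': 2}
```

The combiner shrinks shuffle traffic exactly as described above: duplicate words cross the network once, as a single count, rather than once per occurrence.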