Week 13-2 CA Final Exam review P2

CIS 655 Computer Architecture / CSE 661 Advanced Computer Architecture
Week 13 Final Exam Review 1
Priyantha Kumarawadu
Associate Teaching Professor
Department of Electrical Engineering and Computer Science
Syracuse University

Final Exam
Date/Time: Tuesday, May 9, 2023, from 12:45 pm – 2:45 pm
Venue: Grant Auditorium
Format: Closed book; 20 multiple-choice questions and 10 short-answer questions

What will be included?
  Loop Level Parallelism - ≈ 10%
  Software Pipelining - ≈ 20%
  Data Level Parallelism - ≈ 25%
    Vector Architecture
    SIMD Extensions
    GPU
  Thread Level Parallelism - ≈ 20%
  Domain Specific Architectures - ≈ 10%
  Warehouse Scale Computers - ≈ 5%
  Data Centers - ≈ 10%

Q5.
5.1 What is meant by “loop independence”?
5.2 Why is loop independence important?
5.3 Give three examples of loop dependence.
5.4 What is the main difference between a loop-independent dependence and a loop-carried dependence?

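As a review aid for Q5, here is a minimal sketch (not from the slides) that contrasts the two kinds of dependence. The function names and the test of running iterations in forward versus reverse order are illustrative assumptions: a loop whose iterations can run in any order has no loop-carried dependence, while a loop that reads a value written by an earlier iteration does.

```python
def loop_independent_dep(a, order):
    # S1 writes t[i]; S2 reads t[i] in the SAME iteration.
    # This is a loop-independent dependence: it constrains the order of
    # S1 and S2 inside one iteration, but not the order of iterations.
    t = [0] * len(a)
    out = [0] * len(a)
    for i in order:
        t[i] = a[i] + 1      # S1
        out[i] = t[i] * 2    # S2 depends on S1 within iteration i
    return out


def loop_carried_dep(a, order):
    # Iteration i reads out[i - 1], which iteration i - 1 wrote.
    # This is a loop-carried (read-after-write) dependence.
    out = list(a)
    for i in order:
        if i > 0:
            out[i] = out[i - 1] + a[i]
    return out


a = [1, 2, 3, 4]
forward = range(len(a))
backward = range(len(a) - 1, -1, -1)

# Iteration order does not matter when nothing is carried between iterations.
order_free = loop_independent_dep(a, forward) == loop_independent_dep(a, backward)

# Iteration order matters when a value is carried from one iteration to the next.
order_sensitive = loop_carried_dep(a, forward) != loop_carried_dep(a, backward)
```

In this sketch the first loop could be split across lanes or threads, while the second (a running sum) cannot be parallelized without restructuring it.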
Q6. What is the major advantage of software pipelining over loop unrolling?

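To make the idea behind Q6 concrete, the following sketch (an illustration, not taken from the slides) software-pipelines the textbook-style loop x[i] = x[i] + s by hand. The three stages (load, add, store) and the names `loaded` and `added` are assumptions chosen for the example. Each pass through the kernel loop stores the result for iteration i, adds for iteration i + 1, and loads for iteration i + 2, so work from three iterations overlaps without replicating the loop body as unrolling would.

```python
def plain_loop(x, s):
    # Reference version: load, add, and store all happen in one iteration.
    out = list(x)
    for i in range(len(out)):
        out[i] = out[i] + s
    return out


def software_pipelined(x, s):
    # Hand-pipelined version of the same loop; assumes len(x) >= 2.
    out = list(x)
    n = len(out)

    # Prologue: start the first two iterations.
    loaded = out[0]          # load for iteration 0
    added = loaded + s       # add for iteration 0
    loaded = out[1]          # load for iteration 1

    # Kernel: one stage from each of three different iterations.
    for i in range(n - 2):
        out[i] = added           # store for iteration i
        added = loaded + s       # add for iteration i + 1
        loaded = out[i + 2]      # load for iteration i + 2

    # Epilogue: drain the last two iterations.
    out[n - 2] = added
    added = loaded + s
    out[n - 1] = added
    return out


result = software_pipelined([1, 2, 3, 4], 10)
```

The kernel body here is the same size as the original loop body, which is one commonly cited contrast with unrolling, where the body is replicated and code size grows with the unroll factor.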
Q7. Consider the following code segment:

    vld     v1, x1
    vld     v2, x2
    vld     v3, x3
    vmul    v2, v2, v3
    vmul.vs v1, v1, f0    // multiply vector by scalar
    vadd    v1, v1, v2
    vst     v1, x1

a) Show how the instructions can be grouped into convoys.
b) Can the code be rearranged to reduce the number of convoys? If not, explain why.
c) Compute the approximate execution time for the code. Assume the vectors have 64 elements and each operation takes 1 cycle per element.

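One way to approach Q7 is the textbook chime model, sketched below. The slide does not state the hardware, so the following are assumptions: one load/store unit, one multiplier, one adder, chaining allowed (so dependent instructions may share a convoy), in-order greedy packing, and start-up time ignored. The `UNIT` table and `pack_convoys` helper are hypothetical names for this illustration; the answer expected in the course may rest on different assumptions.

```python
# Assumed mapping from each instruction to the functional unit it occupies.
UNIT = {
    "vld": "load/store",
    "vst": "load/store",
    "vmul": "multiply",
    "vmul.vs": "multiply",
    "vadd": "add",
}


def pack_convoys(ops):
    # Greedy in-order packing: start a new convoy whenever the next
    # instruction needs a unit already used in the current convoy
    # (a structural hazard). Data dependences are ignored because
    # chaining is assumed.
    convoys, current, busy = [], [], set()
    for op in ops:
        unit = UNIT[op]
        if unit in busy:
            convoys.append(current)
            current, busy = [], set()
        current.append(op)
        busy.add(unit)
    if current:
        convoys.append(current)
    return convoys


ops = ["vld", "vld", "vld", "vmul", "vmul.vs", "vadd", "vst"]
convoys = pack_convoys(ops)

vector_length = 64                      # elements per vector, from the question
chimes = len(convoys)                   # one chime per convoy
approx_cycles = chimes * vector_length  # start-up time ignored
```

Under these assumptions the packing gives four convoys, so roughly 4 × 64 = 256 cycles. The sequence contains four memory operations (three loads and one store), so with a single load/store unit no reordering can produce fewer than four convoys in this model.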
Q8. Consider the RISC-V code given below. How many clock cycles are required to run the entire code if the loop iterates 60 times? Ignore the start-up time. Assume 16 lanes and that there is only one stall, of 2 cycles, arising between the fmult.d and the fsd.

    Loop: fld     f1, 0(x1)
          addiw   x1, x1, 8
          fmult.d f2, f1, f0
          bne     x1, x2, Loop
          fsd     f2, -8(x1)

Q9. Training a deep learning model turned out to be more compute-intensive. What architecture solves the problem?
A. Server CPUs which can process more than two threads per core.
B. Significantly large level 4 flash cache.
C. Multi-banked cache which supplies more data to the processor(s).
D. Tensor Processing Unit

Q10. Which of the following correctly describes the relationship between warps, thread blocks, and CUDA cores?
A. A warp is divided into a number of thread blocks, and each thread block executes on a single CUDA core.
B. A thread block may be divided into a number of warps, and each warp may execute on a single CUDA core.
C. A thread block is assigned to a warp, and each thread in the warp is executed on a separate CUDA core.
D. Each warp is divided into a number of blocks, and each block is executed on a single CUDA core.

Q11. Which of the following is not a form of parallelism supported by CUDA?
A. Vector parallelism - Floating point computations are executed in parallel on wide vector units
B. Thread level task parallelism - Different threads execute different tasks
C. Block and grid level parallelism - Different blocks or grids execute different tasks
D. Data parallelism - Different threads and blocks process different parts of data in

Q12. Which of the following will likely cause a compiler to not vectorize a loop?
A) A read-after-write dependency
B) A write-after-read dependency
C) A call to the math function sin()
D) Use of arrays whose dimensions are not multiples of the vector width

Questions?