typedef struct {
    short data[4];
} MatrixElement;

void copy_matrix(MatrixElement m1[], MatrixElement m2[], int ROWS, int COLS) {
    int i, j, k;
    for (i = 0; i < ROWS; i++) {
        for (j = 0; j < COLS; j++) {
            for (k = 0; k < 4; k++) {
                m1[i*COLS+j].data[k] = m2[i*COLS+j].data[k];
            }
        }
    }
}

void copy_matrix_transpose(MatrixElement m1[], MatrixElement m2[], int ROWS, int COLS) {
    int i, j, k;
    for (i = 0; i < ROWS; i++) {
        for (j = 0; j < COLS; j++) {
            for (k = 0; k < 4; k++) {
                m1[i*COLS+j].data[k] = m2[j*ROWS+i].data[k];
            }
        }
    }
}

You can assume the following conditions:
- The matrix m1 is allocated at memory address 0, and matrix m2 immediately follows it.
- Indices i, j, and k are kept in registers.
- ROWS and COLS are constants.
- The cache is initially empty before the function call.
- The cache is write-back (i.e., only writes back to memory when a line is evicted) and write-allocate (i.e., it always allocates a line for the write).
- The cache uses a least-recently-used replacement policy.
- sizeof(short) == 2.
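
The layout assumptions above determine every byte address the loops touch. The following is a minimal sketch, assuming `sizeof(MatrixElement)` is 8 bytes (four 2-byte shorts, no padding). The helper names `m2_base_addr`, `elem_addr`, and `block_of` are hypothetical and not part of the original problem.

```c
#include <assert.h>

#define SHORT_BYTES 2                  /* given: sizeof(short) == 2 */
#define ELEM_BYTES  (4 * SHORT_BYTES)  /* assumed: the struct has no padding */

/* m1 starts at address 0, so m2 starts right after m1's rows*cols elements. */
long m2_base_addr(int rows, int cols) {
    return (long)rows * cols * ELEM_BYTES;
}

/* Byte address of m[index].data[k] for a matrix that starts at base. */
long elem_addr(long base, long index, int k) {
    assert(k >= 0 && k < 4);
    return base + index * ELEM_BYTES + (long)k * SHORT_BYTES;
}

/* Number of the cache block that contains addr, for a given block size. */
long block_of(long addr, int block_bytes) {
    assert(block_bytes > 0);
    return addr / block_bytes;
}
```

For example, `m2_base_addr(4, 8)` is 256 and `m2_base_addr(3, 8)` is 192, so the starting address of m2 (and therefore how it lines up against m1 in a cache) depends on ROWS and COLS.
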
Given a direct-mapped cache of size 128 bytes with a 16-byte block size, answer the following:
A. What is the cache miss rate of copy_matrix if ROWS = 4 and COLS = 8? Miss rate = _____________ %
B. What is the cache miss rate of copy_matrix if ROWS = 3 and COLS = 8? Miss rate = _____________ %
Considering a 2-way set-associative cache of the same size and block size, answer the following:
C. What is the cache miss rate of copy_matrix if ROWS = 4 and COLS = 8? Miss rate = _____________ %
D. What is the cache miss rate of copy_matrix_transpose if ROWS = 8 and COLS = 8?
Miss rate for accessing m1 = _____________ %
Miss rate for accessing m2 = _____________ %
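
One way to check answers to parts A through D is to replay the access pattern against a small cache model. The sketch below is not an official solution. It assumes that each inner-loop iteration reads `m2[...].data[k]` first and then writes `m1[...].data[k]`, that `sizeof(MatrixElement)` is 8 bytes, and that a write miss counts as a miss because the cache is write-allocate. The names `Stats`, `Cache`, `cache_access`, and `simulate` are hypothetical.

```c
#include <assert.h>

#define CACHE_BYTES 128
#define BLOCK_BYTES 16
#define NUM_LINES   (CACHE_BYTES / BLOCK_BYTES)
#define ELEM_BYTES  8   /* assumed sizeof(MatrixElement): 4 shorts of 2 bytes */

typedef struct {
    long m1_accesses, m1_misses;
    long m2_accesses, m2_misses;
} Stats;

typedef struct {
    int  ways, sets;
    long block[NUM_LINES];     /* block number held by each line, -1 if empty */
    long last_use[NUM_LINES];  /* timestamp of the most recent access, 0 if empty */
    long clock;
} Cache;

/* Returns 1 on a miss and 0 on a hit; on a miss the LRU line of the set is replaced. */
static int cache_access(Cache *c, long addr) {
    long blk = addr / BLOCK_BYTES;
    int base = (int)(blk % c->sets) * c->ways;
    int victim = base;
    c->clock++;
    for (int w = 0; w < c->ways; w++) {
        int line = base + w;
        if (c->block[line] == blk) {
            c->last_use[line] = c->clock;
            return 0;
        }
        if (c->last_use[line] < c->last_use[victim]) victim = line;
    }
    c->block[victim] = blk;
    c->last_use[victim] = c->clock;
    return 1;
}

/* ways = 1 models the direct-mapped cache; transpose != 0 models copy_matrix_transpose. */
Stats simulate(int rows, int cols, int ways, int transpose) {
    assert(ways > 0 && NUM_LINES % ways == 0);
    Cache c;
    Stats s = {0, 0, 0, 0};
    long m2_base = (long)rows * cols * ELEM_BYTES;  /* m2 immediately follows m1 */
    c.ways = ways;
    c.sets = NUM_LINES / ways;
    c.clock = 0;
    for (int n = 0; n < NUM_LINES; n++) {
        c.block[n] = -1;
        c.last_use[n] = 0;
    }
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            long dst = (long)i * cols + j;
            long src = transpose ? (long)j * rows + i : dst;
            for (int k = 0; k < 4; k++) {
                s.m2_accesses++;  /* read of m2[src].data[k] */
                s.m2_misses += cache_access(&c, m2_base + src * ELEM_BYTES + k * 2);
                s.m1_accesses++;  /* write of m1[dst].data[k] */
                s.m1_misses += cache_access(&c, dst * ELEM_BYTES + k * 2);
            }
        }
    }
    return s;
}
```

Calling `simulate(4, 8, 1, 0)`, `simulate(3, 8, 1, 0)`, `simulate(4, 8, 2, 0)`, and `simulate(8, 8, 2, 1)` corresponds to parts A, B, C, and D; the miss rate is misses divided by accesses. Under the assumptions above, the model gives 100% for A (m1 and m2 blocks collide in every set), 12.5% for B and C, and for D 12.5% on m1 and 25% on m2.
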
