Practice Midterm 2 - With Answers

University of Pennsylvania, CIS 5500 (Computer Science)
Feb 20, 2024 · 15 pages · uploaded by momoshen17
CIS 5500: Database and Information Systems
Practice Midterm 2

Question 1 (20 points, General)

For each of the following questions, select the (one) correct answer.

1. (2 points) If every transaction is well-formed and uses two-phase locking, then any legal interleaving of such transactions is:
   A) serial
   B) free of conflicts
   C) serializable
   D) deadlock-free

2. (2 points) Suppose that T1 has a higher priority than T2, and that T1 holds a shared lock on A when T2 requests a shared lock on A. Which of the following would happen?
   A) Under WOUND-WAIT: T2 will wait
   B) Under WAIT-DIE: T2 will die
   C) If deadlock detection is used: an edge will be added from T2 to T1 in the WAIT-FOR graph.
   D) T2 will be granted the lock under any deadlock detection/prevention strategy

3. (2 points) Which of the following statements are true about isolation levels?
   A) Using transactions with higher isolation levels (SERIALIZABLE being highest) decreases undesirable phenomena and the amount of parallelism.
   B) A read-write transaction cannot use isolation level READ COMMITTED.
   C) A transaction at isolation level REPEATABLE READ ensures that rerunning the same SQL query twice in the transaction gives the same result.
   D) In isolation level READ UNCOMMITTED, locks on updated data are held until the end of the transaction.
4. (2 points) Suppose you have a relation R over which an Alternative 1 index (called I1) has been set up over attribute A and an Alternative 2 index (called I2) is set up over attribute B. Which of the following is true?
   A) The data records are sorted over B.
   B) The data entries in I1 are the data records.
   C) The data entries in I1 contain a pointer to the data records.
   D) I2 is a sparse index.

5. (2 points) Which of the following statements is true about B+ trees?
   A) The number of page I/O's to find data entry k* is log_F(N), where F is the fanout and N is the number of leaf nodes.
   B) A B+ tree index on a composite search key <sid, year> is useful for an SQL query whose selection condition is (year == 'SOPHOMORE').
   C) Every search key value occurring in an index entry page (non-leaf node) must occur in some data entry page (leaf node).
   D) The smallest key value on every data entry page (leaf node) appears somewhere on an index page.

6. (2 points) Which of the following is NOT true about NoSQL solutions?
   A) There are multiple types of NoSQL solutions.
   B) The schema of data is flexible.
   C) They scale by partitioning data and distributing computation over a cluster of machines.
   D) The replication of data helps updates run faster.

7. (2 points) Which of the following statements is NOT true about relational algebra?
   A) It is better suited for query optimization than SQL.
   B) For all relations R, S, T, the result of R ⋈ (S ⋈ T) is the same as (R ⋈ S) ⋈ T, where ⋈ denotes natural join.
   C) Projection and selection commute with each other.
   D) Pushing projections down can help reduce the size of intermediate query results, i.e. the number of bytes required to store the results.
8. (2 points) Suppose that a relation R fits on N = 100 pages, and that there are B = 4 buffers of memory being used. Which of the following is true about sorting R using External Merge Sort?
   A) At the end of pass 0, each page of R is sorted internally but not relative to any other page.
   B) At the end of pass 0 there are 34 sorted runs of 3 pages.
   C) After pass 0, 4 sorted runs are merged during each pass.
   D) R can be sorted in 3 passes after pass 0 (4 passes total).

9. (2 points) What is the most dominant cost in query processing?
   A) The time to read/write a page from disk into buffered memory.
   B) The time to operate on data in buffered memory, e.g. to calculate the join between tuples in memory.
   C) The time it takes the page to be located on disk.

10. (2 points) Which of the following statements is NOT true about EXPLAIN?
   A) It is a profiling tool that shows how an SQL query will be executed.
   B) The number of rows and time indicated at each operation are accurate since the query is executed and the numbers are recorded.
   C) The number of rows and time indicated at each operation are estimates that may be quite different from those obtained when the query is executed.
   D) It can help the user to identify redundant operations in an SQL query.
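The pass count asked about in question 8 can be checked mechanically. Below is a minimal sketch (not part of the exam) of the standard External Merge Sort arithmetic: pass 0 produces ceil(N/B) sorted runs of B pages each, and every subsequent pass does a (B-1)-way merge.

```python
import math

def merge_sort_passes(n_pages: int, n_buffers: int) -> int:
    """Total passes for External Merge Sort over n_pages using n_buffers."""
    runs = math.ceil(n_pages / n_buffers)         # sorted runs after pass 0
    passes = 1                                    # count pass 0 itself
    while runs > 1:
        runs = math.ceil(runs / (n_buffers - 1))  # (B-1)-way merge per pass
        passes += 1
    return passes

print(merge_sort_passes(100, 4))  # -> 4: pass 0 plus 3 merge passes (25 -> 9 -> 3 -> 1)
```

With N = 100 and B = 4, pass 0 yields 25 sorted runs of 4 pages (so options A and B are false), each merge pass combines B-1 = 3 runs (so C is false), and 3 merge passes finish the sort, which is why D is the answer.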
Question 2 (8 points, Serializability)

Consider the following schedule S of three transactions, ignoring for now the "?" in the last row:

T1 T2 T3 READ(B) WRITE(A) WRITE(B) WRITE(A) READ(A) WRITE(B) WRITE(A) ?

1. (2 points) Is S serial? Prove your answer.

It is not serial because the actions of transaction T2 are not consecutive.

2. (4 points) Prove that S is serializable, and give an equivalent serial schedule. (Recall that the "?" is to be ignored.)

The precedence graph of S has vertices T1, T2, T3 and edges T3 -> T2, T2 -> T1, and T3 -> T1. It has no cycles, so S is serializable. An equivalent serial schedule is T3 -> T2 -> T1 (a topological ordering of the vertices of the precedence graph).

3. (2 points) Now consider the "?" in the last line of S. Give an action that replaces the "?" (i.e. a READ or WRITE on some data item by T3) and makes the resulting schedule NOT serializable.

Add READ(B) or WRITE(B). This adds an edge T2 -> T3 to the precedence graph, creating a cycle.
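The precedence-graph test used in this answer can be sketched in code. This is an illustrative sketch, not part of the exam: each step is a (transaction, action, item) triple, the small example schedule is made up, and an edge Ti -> Tj is added whenever an action of Ti conflicts with a later action of Tj (same item, at least one write).

```python
from graphlib import TopologicalSorter, CycleError

def precedence_edges(schedule):
    """Edges Ti -> Tj: Ti's action conflicts with a later action of Tj."""
    edges = set()
    for i, (ti, ai, xi) in enumerate(schedule):
        for tj, aj, xj in schedule[i + 1:]:
            if ti != tj and xi == xj and 'W' in (ai, aj):
                edges.add((ti, tj))
    return edges

def serial_order(schedule):
    """An equivalent serial order, or None if the graph has a cycle."""
    graph = {}  # TopologicalSorter wants node -> set of predecessors
    for u, v in precedence_edges(schedule):
        graph.setdefault(v, set()).add(u)
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError:
        return None  # cycle => not conflict-serializable

s = [('T3', 'W', 'A'), ('T2', 'R', 'A'), ('T2', 'W', 'B'), ('T1', 'R', 'B')]
print(serial_order(s))  # -> ['T3', 'T2', 'T1']
```

Replacing the "?" with a conflicting action, as in part 3, corresponds to adding a back edge that makes `serial_order` return None.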
Question 3 (12 points, Locking)

Now consider the following schedule, in which time is indicated in the first column for clarity:

Time  T1         T2         T3
1     READ1(A)
2                           READ3(A)
3                           WRITE3(A)
4                READ2(B)
5                WRITE2(B)
6                WRITE2(A)
7     WRITE1(B)

Suppose all transactions are well-formed, use strict locking, and attempt to acquire each lock in the highest mode needed just before the first action on the data item (e.g. T1 requests an XLOCK(B) at time 7, just before the WRITE(B)). Assume that deadlock prevention is NOT being used (e.g. neither wound-wait nor wait-die).

1. (3 points) When is the earliest point in the schedule at which a deadlock will be detected by a deadlock detection-and-resolution algorithm? Explain your answer using the wait-for graph.

The earliest that a deadlock will be detected is at time 7 (WRITE1(B)), when T1 requests an XLOCK on B, which is held by T2. The wait-for graph has vertices T1, T2, T3 and edges T1 -> T2, T3 -> T1, and T2 -> T1. It has a cycle involving vertices T1 and T2 (edges T1 -> T2 and T2 -> T1). The presence of the cycle enables the algorithm to detect that a deadlock involving T1 and T2 has occurred, and one of the two transactions would be aborted.

2. (3 points) Suppose now that WAIT-DIE deadlock prevention was being used, and assume that a transaction's start time determines its priority (so T1 is highest and T2 is lowest). When is the earliest point in the given schedule that a transaction would be aborted, and which transaction would be aborted?
Transaction T3 is the first to be aborted, at time 2 (READ3(A)), when T3 requests an XLOCK on A, which is held by T1; under WAIT-DIE the younger T3 dies.

3. (6 points) Recall the isolation levels Read Committed (RC) and Repeatable Read (RR), and how they are implemented using locking. We call RR a "higher" isolation level than RC. Can the given schedule be realized using these isolation levels for each transaction (T1, T2, T3)? If so, give the highest isolation levels for each transaction and show the resulting schedule with lock requests and releases. If not, briefly explain why.

The schedule can be realized. The highest isolation levels are RC for T1, RR for T2, and RR for T3. The resulting schedule, with lock requests and releases:

1: SLOCK1(A), READ1(A), REL1(A)
2: XLOCK3(A), READ3(A)
3: WRITE3(A), REL3(A)
4: XLOCK2(B), READ2(B)
5: WRITE2(B)
6: XLOCK2(A), WRITE2(A), REL2(A, B)
7: XLOCK1(B), WRITE1(B), REL1(B)

(T1 releases its read lock on A immediately because it runs at RC, which holds read locks only for the duration of the read.)
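The WAIT-DIE decision in part 2 follows a simple rule that can be sketched as code. This is an illustrative sketch, with start times read off the schedule (T1 starts at time 1, T3 at time 2, T2 at time 4), so T1 is oldest and highest priority.

```python
# WAIT-DIE: an older (higher-priority) requester waits for the lock holder;
# a younger requester is aborted ("dies") and later restarts.
start_time = {'T1': 1, 'T3': 2, 'T2': 4}  # smaller start time = older = higher priority

def wait_die(requester: str, holder: str) -> str:
    if start_time[requester] < start_time[holder]:
        return 'wait'  # older transaction is allowed to wait
    return 'die'       # younger requester is aborted

print(wait_die('T3', 'T1'))  # -> 'die': T3 is younger than T1, matching the answer
```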
Question 4 (10 points, B+ Trees)

1. (2 points) Is the following B+ tree valid? Justify your answer.

Invalid. The data entry 16 is less than the index entry 17 but is in the subtree to the right of it. This violates the requirement that p_i points to the subtree of data entries whose keys are >= k_i.

2. (2 points) Is the following B+ tree valid? Justify your answer.

Valid. Each node is at least half full, and the requirement that p_i points to the subtree of data entries whose keys are >= k_i and < k_{i+1} is met. The fact that there is an index entry with key 9 but no corresponding data entry does not make the tree invalid, as the data entry could have existed and been deleted without modifying the index entry.

3. (3 points) Consider the following B+ tree, in which the leaf pages (data entry pages) must contain a minimum of two data entries, and non-leaf pages (index entry pages) must contain a minimum of one key (and two pointers).
Show the tree after deleting 26*.

The tree decreases in height. Leaves L1 and L2 remain the same, leaf L3 disappears, and L4 becomes (27, 29, 30). Index nodes I2 and I3 disappear, and the root (I1) now contains (13, 27) with pointers to L1, L2, and L4.

4. (3 points) Consider again the same initial B+ tree. Show the tree after inserting 6* and 7*.

Add a new leaf between L1 and L2 (call it L1.5), modify L1 to contain (5, 6), and modify L1.5 to hold (7, 8). Modify I2 to hold (7, 13), where the pointer associated with 7 points to L1.5.
Question 5 (20 points, Query Processing)

Formula 1 (F1) is coming to Las Vegas this year. The world champion, Vettel, has decided to create a list of all drivers who are to race in F1, their personal information, and their winning stats throughout their career, to see his odds of winning again. His database contains tables with the following schema:

Driver(DriverId, Name, Height, Weight, Age, Nationality)
Wins(TrackName, CountryName, DriverId)
  FOREIGN KEY (DriverId) REFERENCES Driver(DriverId)

The Driver table contains personal information about all drivers who are to race in Las Vegas in the coming months. The primary key is DriverId (indicated in bold). The Wins table documents the wins by the drivers. TrackName refers to the name of the track they won on, and CountryName refers to the country the track was in.

1. (5 points) Write a relational algebra query which returns the names of all British drivers who have won on some track in the USA. The query must be in a format where the selections and projections are pushed down as far as possible to the base relations.

Π_{Name}((Π_{DriverId, Name} σ_{Nationality='British'} Driver) ⋈_{DriverId} (Π_{DriverId} σ_{CountryName='USA'} Wins))

2. (5 points) Consider the following query:

SELECT DriverId, Name
FROM Driver
WHERE Age > 20 AND Age <= 30
ORDER BY Name

Assume that Driver has 30,000 tuples and that 10 tuples can fit per page, so the relation fits on 3,000 pages. It is sorted on Name, over which there is an Alternative 1 index. Describe 1) how the query is evaluated, and 2) what the cost is.

1. Since the index does not match the query, Driver is scanned. When a match is found (Age > 20 and Age <= 30), the DriverId and Name fields are returned. Since the relation is ordered on Name, the result comes back ordered by Name. No duplicate removal is required. Therefore only one pass over Driver is necessary.
2. Cost: 3,000 I/Os.

Note: For the following questions, you are not allowed to use a calculator. If you cannot do the math by hand, you may provide unevaluated formulas.

3. (5 points) For the following, you can assume:
- Driver fits on 3,000 pages and is sorted on Name
- Wins fits on 100 pages and is not sorted
- There are 7 buffers

Vettel decides that he wants to create a joined version of Driver and Wins so that he can have a single table over which to perform data analysis:

Driver ⋈_{DriverId} Wins

Give the 1) best and 2) worst cost of doing this using a Block Nested Loops join.

With 7 buffers: 5 buffers for the outer block, 1 buffer for the inner, and 1 buffer for the output.

Worst (Driver as outer): 3,000 + (3,000/5) * 100 = 3,000 + 600 * 100 = 63,000 I/Os
Best (Wins as outer): 100 + (100/5) * 3,000 = 100 + 20 * 3,000 = 60,100 I/Os

4. (5 points) Recall:

Driver ⋈_{DriverId} Wins
- Driver fits on 3,000 pages and is sorted on Name
- Wins fits on 100 pages and is not sorted

How many buffers would be needed to implement the join using Hash Join? What is the cost using this minimum number of buffers?

The minimum number of buffers B for performing Hash Join satisfies 100 < (B-1)(B-2), so B = 12:
B = 12: 100 < 11 * 10 = 110
B = 11: 100 > 10 * 9 = 90

Cost: read each relation and create a partitioned version of each; then, using one scan over each partition, compute the join: 3(3,000 + 100) = 9,300 I/Os.
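The arithmetic in parts 3 and 4 can be double-checked with a short script. This is a sketch assuming the usual textbook cost formulas: Block Nested Loops with B buffers reads the outer relation once plus the inner relation once per outer block of B-2 pages, and (Grace) Hash Join needs (B-1)(B-2) to exceed the page count of the smaller relation.

```python
import math

def bnl_cost(outer: int, inner: int, buffers: int) -> int:
    """Block Nested Loops I/O cost: scan outer once, inner once per block."""
    block = buffers - 2  # 1 buffer reserved for inner, 1 for output
    return outer + math.ceil(outer / block) * inner

def hash_join_min_buffers(smaller: int) -> int:
    """Smallest B such that the build partitions fit: (B-1)(B-2) > pages."""
    b = 3
    while (b - 1) * (b - 2) < smaller:
        b += 1
    return b

print(bnl_cost(3000, 100, 7))      # -> 63000 I/Os with Driver as outer (worst)
print(bnl_cost(100, 3000, 7))      # -> 60100 I/Os with Wins as outer (best)
print(hash_join_min_buffers(100))  # -> 12 buffers
print(3 * (3000 + 100))            # -> 9300 I/Os for the hash join itself
```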
Question 6 (15 points, MongoDB)

Consider the following MongoDB collection of the NY Times bestseller list.

books = [
  { "_id": 1, "title": "Fourth Wing",
    "author": {"first": "Rebecca", "last": "Yarros"},
    "month": 5, "year": 2023,
    "publisher": "Entangled: Red Tower Books", "rating": 1,
    "formats": [{"type": "Kindle", "price": 14.99},
                {"type": "Audible", "credit": 1},
                {"type": "Hardcover", "price": 19.52},
                {"type": "Paperback", "price": 28.55}]
  },
  { "_id": 2, "title": "Too Late",
    "author": {"first": "Colleen", "last": "Hoover"},
    "month": 6, "year": 2023,
    "publisher": "Grand Central Publishing", "rating": 2,
    "formats": [{"type": "Kindle", "price": 12.99},
                {"type": "Paperback", "price": 12.94}]
  },
  { "_id": 4, "title": "It Ends with Us",
    "author": {"first": "Colleen", "last": "Hoover"},
    "month": 8, "year": 2016,
    "publisher": "Atria", "rating": 4,
    "formats": [{"type": "Kindle", "price": 11.99},
                {"type": "Hardcover", "price": 11.82}]
  },
  { "_id": 5, "title": "It Starts with Us",
    "author": {"first": "Colleen", "last": "Hoover"},
    "month": 10, "year": 2022,
    "publisher": "Atria", "rating": 5,
    "formats": [{"type": "Kindle", "price": 13.99},
                {"type": "Paperback", "price": 10.49}]
  }
]

1. (4 points) State in English what the following query does, and show what it returns using the given instance.

db.books.aggregate([
  { $match: {year: {$gte: 2020}}},
  { $group: { _id: "$author", count: { $sum: 1 } } },
  { $match: {count: {$gte: 2}}}
])

For each author who has published two or more books since 2020 (inclusive), print their name and the total number of books they have published since 2020.

[ { _id: { first: "Colleen", last: "Hoover" }, count: 2 } ]

2. (4 points) Write a query that returns the title of books that were published in 2023 and are available both in Kindle and hardcover (potentially among other formats). The output on the given instance would be:

[{title: "Fourth Wing"}]

db.books.find(
  {year: 2023, "formats.type": {$all: ["Kindle", "Hardcover"]}},
  {_id: 0, title: 1}
)

And potentially others…

3. (7 points) Write an aggregation query that, for each publisher, returns the average price of Kindle books that they publish. The expected output on the given instance is:

[
  { _id: "Entangled: Red Tower Books", avg_price: 14.99 },
  { _id: "Grand Central Publishing", avg_price: 12.99 },
  { _id: "Atria", avg_price: 12.99 }
]

db.books.aggregate([
  { $unwind: "$formats" },
  { $match: { "formats.type": "Kindle" } },
  { $group: { _id: "$publisher", avg_price: {$avg: "$formats.price"}}}
])

Question 7 (15 points, Neo4j)

Consider the following graph database, a picture of which is shown below:

CREATE (:User{name:"Alice"})-[:LAST_POST]->
  (:Post{topic:"science"})-[:PREV_POST]->(:Post{topic:"fitness"})
CREATE (:User{name:"Bob"})-[:LAST_POST]->(:Post{topic:"art"})
CREATE (:User{name:"Charles"})-[:LAST_POST]->
  (:Post{topic:"science"})-[:PREV_POST]->
  (:Post{topic:"art"})-[:PREV_POST]->(:Post{topic:"health"})

MATCH (p1:User{name:"Bob"}), (p2:User{name:"Alice"})
CREATE (p1)-[:KNOWS]->(p2)

MATCH (p1:User{name:"Alice"}), (p2:User{name:"Charles"})
CREATE (p1)-[:KNOWS]->(p2)

MATCH (p1:User{name:"Bob"}), (p2:User{name:"Charles"})
CREATE (p1)-[:KNOWS]->(p2)
Write Cypher queries for each of the following.

1. (5 points) Return the name of all people who know Charles and whose last post was about "science".

MATCH (p:User)-[:LAST_POST]->(q1:Post{topic:"science"}),
      (p)-[:KNOWS]->(:User{name:"Charles"})
RETURN p.name

2. (5 points) Return the name of all people whose last post was about science and some previous post was about health.

MATCH (p:User)-[:LAST_POST]->(q1:Post{topic:"science"})-[:PREV_POST*]->(q2:Post{topic:"health"})
RETURN p.name

3. (5 points) Return a list of topics of all last posts. Eliminate duplicates.

MATCH (p:User)-[:LAST_POST]->(q1:Post)
RETURN COLLECT(DISTINCT q1.topic)