Practice Midterm 1 - With Answers

pdf

School

University of Pennsylvania *

*We aren’t endorsed by this school

Course

550

Subject

Computer Science

Date

Feb 20, 2024

Type

pdf

Pages

13

Uploaded by momoshen17

Report
CIS 5500: Database and Information Systems Practice Midterm 1 Part 1. (16 points) Multiple choice Select the best (one) answer for each of the following questions. 1. (2 points) SQL is a declarative language because A. The programmer says what data they want rather than how to get the data. B. The programmer must be aware of indices and the physical layout in order to write good queries. C. It can be used to describe what the data is. 2. (2 points) Consider relations R(A, B) and S(C, D), and a foreign key constraint R(B) references S(C). In both R and S, all non-key attributes can contain NULL. Which of the following instances of R and S are legal, i.e. do not violate these key and foreign key constraints? A. R = { (a1, c1) } and S = { (NULL, b2), (c1, b1) }. B. R = { (a1, c1), (a2, NULL) } and S = { (c1, b1) }. C. R = { (a1, c1) } and S = { (c2, NULL) }. A violates non-NULL key. B is fine. A foreign key may be NULL if allowed by schema definition, and the foreign key constraint is met. C violates the foreign key constraint
3. (2 points) Consider the following instance of R(A, B, C) A B C 30 10 1 25 10 2 10 5 3 Suppose a materialized view T is defined as follows: CREATE MATERIALIZED VIEW T AS SELECT A, B FROM R WHERE A>20 AND B>=ALL (SELECT B FROM R) Which of the following operations requires a modification to T? A. Inserting (A: 15, B: 20, C: 5) into R B. Inserting (A: 15, B: 4, C: 5) into R C. Deleting (A: 10, B: 5, C: 3) from R A: Note that T={(30, 10), (25, 10)}. Inserting (A: 15, B: 20) into R changes the max value of B to 20, and therefore all tuples in T must be deleted.
4 . (2 points) Suppose you have the following relation T. A B C 1 NULL NULL 2 NULL 1 3 2 2 What is the result of this query? SELECT B FROM T WHERE B == C A. {(B: NULL), (B: 1), (B: 2)} B. {(B: NULL), (B: 2)} C. {(B: 2)} A comparison between NULL and anything (including another NULL) is UNKNOWN, hence for the first two tuples the query evaluates to UNKNOWN whereas it evaluates to TRUE for the third tuple. Tuples are returned only if the WHERE clause evaluates to TRUE. 5. (2 points) Consider relations R(A) and S(A) where R.A and S.A are keys. When do the following two queries not return the same result (ignoring the order of rows in the result)? SELECT * FROM R JOIN S ON R.A = S.A SELECT * FROM R LEFT OUTER JOIN S ON R.A = S.A A. When the set of tuples in R is the same as the set of tuples in S. B. When the set of tuples in R is a strict superset of the set of tuples in S. C. When the set of tuples in R is a strict subset of the set of tuples in S. D. When R is empty. If all tuples in R are also in S, then no tuple in R will fail to join with some tuple in S. Therefore A, C and D will always have the same result in the JOIN as in the LEFT OUTER JOIN. However, if some tuple in R is not in S (which is the case in B) then there will be more tuples in the LEFT OUTER JOIN than in the JOIN.
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
6. (2 points) You are given F={ A→ B, AD→ C, B→ C, D→ A, D→ C}. Which of the following is the minimal cover for F? A. { A→ B, B→ C, D→ AC} B. { A→ B, A→ C, B→ C, D→ A, D→ C}} C. { A→ B, B→ C, D→ A} 7. (2 points) What is the difference between a materialized view and a temporary table? A. A temporary table requires special permissions to create whereas a materialized view can be created by anyone B. A temporary table is evaluated once and may become stale, whereas a materialized view is updated as the underlying tables are updated. C. A temporary table exists only for the duration of a query, whereas a materialized view is kept until it is dropped. A is false: Both temporary tables and views require permissions to create. C is false since a temporary table exists for the duration of a session. B is true. 8. (2 points) Suppose R has n rows and S has m rows. Let T be the join of R and S on R.A = S.A, and let T have k rows. Which of the following is always true? A. 0 <= k <= m*n B. k = m C. k >= min(m, n) K may be zero when there are no values in common between R and S. Therefore B and C are false. A is true, because if A=a for all tuples in R and S then the join becomes the cross product.
Part 2. (30 points, SQL) Write SQL queries for the questions below using the following schema: Book(BookID, Title, Genre) Person(ID, Name) BookAuthors(BookID, AuthorID) BookID references Book.BookID AuthorID references Person.ID Reviews(BookID, ReviewerID, Rating) BookID references Book.BookID ReviewerID references Person.ID 1. [4 points] Print the ID of every book that has not been reviewed. Schema: (BookID) SELECT BookID FROM Book WHERE BookID NOT IN (SELECT BookID FROM Reviews) 2. [5 points] Print the ID and name of every person who is both an author and a reviewer. SELECT p.ID, p.Name FROM BookAuthors ba JOIN Reviews r ON ba.AuthorID=r.ReviewerID JOIN Person p ON p.ID=ba.AuthorID Or lots of other ways, e.g. SELECT * FROM Person WHERE EXISTS (SELECT * FROM BookAuthors WHERE AuthorID=ID) AND EXISTS (SELECT * FROM Reviews WHERE ReviewerID=ID) 3. [5 points] Print the BookID, Title and Genre of each book that has more than one author. Schema: (BookID, Title, Genre) SELECT b.BookID, Title, Genre FROM Book b JOIN BookAuthors ba ON b.BookID=ba.BookID GROUP BY BookID, Title, Genre HAVING COUNT(*) >1
4. [5 points] Print the ID, Title and average rating of each book. If a book has not been reviewed, it should be included with an average rating of 0. Order the result by the average rating. Schema: (BookID, Title, AvgRating) This is incorrect: SELECT b.BookID, Title, AVG(Rating) AS AvgRating FROM Book b LEFT JOIN Reviews r ON b.BookID=r.BookID GROUP BY BookID, Title ORDER BY AvgRating This will return NULL instead of 0. Correct answers: SELECT b.BookID, Title, CASE WHEN avg(Rating) IS NULL THEN 0 ELSE avg(AvgRating) END FROM Book b LEFT JOIN Reviews r ON b.BookID=r.BookID GROUP BY BookID, Title ORDER BY AvgRating Or SELECT b.BookID, Title, AVG(Rating) AS AvgRating FROM Book b JOIN Reviews r ON b.BookID=r.BookID GROUP BY BookID, Title ORDER BY AvgRating UNION SELECT b.BookID, Title, 0 as AvgRating WHERE b.BookID NOT IN (SELECT BookID from Reviews) 5. [5 points] Print the ID of all reviewers who have given a rating of 5 to every book they have reviewed. Schema: (ReviewerID) SELECT ReviewerID FROM Review WHERE ReviewerID NOT IN (SELECT ReviewerID FROM Review WHERE Rating != 5) Alternate Solution 1: SELECT r.ReviewerID FROM Reviews r WHERE NOT EXISTS (SELECT * FROM Reviews r2 WHERE r.ReviewerID = r2.ReviewerID AND r2.Rating < 5)
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
6. [6 points] Print the most popular genre(s), i.e. the genre(s) that have the most books. Schema: (Genre) WITH GenreCounts AS (SELECT COUNT(*) AS Num FROM Book GROUP BY Genre) SELECT Genre FROM Book GROUP BY Genre HAVING COUNT(*)= (SELECT MAX(Num) FROM GenreCounts) Alternate Solution 1: WITH GenreCounts AS (SELECT COUNT(*) AS Num FROM Book GROUP BY Genre) SELECT Genre FROM Book GROUP BY Genre HAVING COUNT(*)>= ALL (SELECT Num FROM GenreCounts)
Part 3. (15 points, ER) 1. (10 points) State whether or not (True or False) each of the following scenarios is possible under the ER diagram given above. Briefly justify your answer in each case. A. (2 points) A student named John is assigned to the cafeteria with id C1, and a student named John is assigned to the cafeteria with id C3. True. Two students with different ids can both be named John; each one can be assigned to a different cafeteria B. (2 points) The student with id 4 is a mentor to the student with id 4. True. There is no constraint which says a student cannot be a mentor of themself. C. (2 points) The student with id 1 is a mentor to the student with id 7 and a mentor to the student with id 8.
True. The cardinality constraint for MentorOf is 0..*, so a student can mentor multiple students. D. (2 points) The student with id 4 is the mentee of the student with id 1 and a mentee of the student with id 2. False. A student can be a mentee of at most one student. E. (2 points) The student with id 5 is a mentee of the student with id 1 and a mentor of the student with id 2. True. A student can be both a mentor and a mentee. 2. (5 points) Describe in English how you would modify the ER diagram to capture the following: A dormitory has an address and is identified by a name. A student lives in exactly one room in a dormitory, where the room is a number. Your answer should describe any additional entity sets, relationship sets, attributes, and cardinality constraints necessary. Add an entity set Dormitory(name, address) and a relationship LivesIn between Student and Dormitory. The relationship LivesIn has an attribute called room. The cardinality constraint from Student to LivesIn is 1..1, and the cardinality constraint from Dormitory to LivesIn is 0..*.
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Part 4. (15 points, ER to relational translation) The following ER diagram models course scheduling on a university campus. You should assume that the ISA hierarchy has both overlap and covering constraints: every hall is either an Active Learning hall or a Lecture hall, but not both. 1. (2 points) What is the significance of the double lines around Hall and LocatedIn? They indicate that Hall is a weak-entity set, defined by the relationships LocatedIn with Building. 2. (3 points) What are the attributes of an Active Learning Hall? What are the attributes of a Lecture Hall? Both inherit the attributes of Hall, and have none in addition: (BName, name, capacity, floor) 3. (10 points) Translate the ER diagram to a relational schema, using the syntax R(A,B,C) rather than SQL DDL. For each table state the primary key, all foreign keys, and indicate any (non-key) attributes that cannot be null. E.g. if R has primary key (A, B) and C is a foreign key referencing S(C) which cannot be null you should write: R(A, B, C) C NOT NULL (C) references S(C)
For full credit, avoid creating unnecessary tables. Building(name, address) Hall(building, name, floor, capacity, type) type in (“Active Learning Hall”, “Lecture Hall”) type NOT NULL (building) references Building.BName Course (CID, time, building, name) (building, name) NOT NULL (building, name) references Hall(building, BName)
Part 5. (24 points, Relational Design Theory) Consider a relation R(ABCD) with functional dependencies F = { BC -> A, D -> C, A -> BD }. 1. [2 points] Give values for the variables x, y, z and u in the instance of R below to respect all dependencies in F. Use values a1, a2, … for x (in column A), values b1, b2, … for y (in column B), values c1, c2, … for z (in column C), and values d1, d2, … for u (in column D). A B C D a1 b1 c1 d1 a2 y z d1 x b1 c2 u x: a3 (or anything that is not a1, a2) y: b2 (or anything that is not b1) z: c1 u: d2 (or anything that is not d1) 2. [4 points] Prove that the keys of R are A, BC and BD. Using the attribute closure algorithm, we can see that: A+= {A, B, D, C}, hence A is a key. There are no other single attribute keys since: B+={B}, C+={C} and D+={D}. BC+= {B, C, A, D}, hence BC is a key. It is minimal since neither B nor C are keys. , hence BD is a key. It is minimal since neither B nor D are keys. CD is not a key since CD+={C, D}. AB, AC and AD are not minimal, hence are not keys. Any set of 3 attributes will be a superset of a key. Or they can give proofs using Armstrong’s Axioms rather than using the attribute closure. 3. [3 points] Is R in 3NF? Explain why or why not. Yes. Since for each of the four dependencies in F, either the lhs is a superkey or the rhs is a prime attribute (in the case of D -> C).
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
4. [3 points] Show that R is not in BCNF. We have F = { BC -> A, D -> C, A -> BD } and the keys are A, BC, BD. To be in BCNF, the lhs of every dependency in F must be a superkey. However, we have a troublesome dependency: D->C. Hence, R is not in BCNF. 5. [3 points] Is the decomposition R1(AC) and R2(BCD) lossless? No. We have R1 \cap R2 = {C} and R1 - R2 = {A} and R2 - R1 = {BD}. Neither C -> A nor C -> BD is implied by F. 6. [7 points] Decompose R into a schema that is in BCNF. Show your work at each step of the decomposition. List the projected dependencies and keys for the final decomposed schema. Since D->C is the only troublesome dependency, we have: R1(CD) and R2(ABD) with F1 = { D -> C } and F2 = { A -> BD, BD -> A } respectively. (Note: we should deduct points if students forget to list BD -> A in F2. R1 has key C and R2 has keys A and BD. So, both are in BCNF since the lhs of every dependency is a superkey of the corresponding relation. 7. [2 points] Is your final decomposed schema above dependency-preserving? Explain why or why not. We fail to preserve dependency BC -> A since BC -> A is not implied by (F1 U F2).