Ramakrishnan and Gehrke Solutions-185-199

pdf

School

University of California, Berkeley *

*We aren’t endorsed by this school

Course

C200

Subject

Computer Science

Date

Dec 6, 2023

Type

pdf

Pages

15

Uploaded by UltraSnowBear26

Report
180 Chapter 18 4. WAL Protocol: Whenever a change is made to a database object, the change is first recorded in the log and the log is written to stable storage before the change is written to disk. 5. If a steal policy is in effect, the changes made to an object in the buffer pool by a transaction can be written to disk before the transaction commits. This might be because some other transaction might ”steal” the buffer page presently occupied by an uncommitted transaction. A no-force policy is in effect if, when a transaction commits, we need not ensure that all the changes it has made to objects in the buffer pool are immediately forced to disk. Exercise 18.2 Briefly answer the following questions: 1. What are the properties required of LSNs? 2. What are the fields in an update log record? Explain the use of each field. 3. What are redoable log records? 4. What are the differences between update log records and CLRs? Answer 18.2 Answer omitted. Exercise 18.3 Briefly answer the following questions: 1. What are the roles of the Analysis, Redo, and Undo phases in ARIES? 2. Consider the execution shown in Figure 18.1. (a) What is done during Analysis? (Be precise about the points at which Analysis begins and ends and describe the contents of any tables constructed in this phase.) (b) What is done during Redo? (Be precise about the points at which Redo begins and ends.) (c) What is done during Undo? (Be precise about the points at which Undo begins and ends.) Answer 18.3 The answer to each question is given below. 1. The Analysis phase starts with the most recent begin checkpoint record and pro- ceeds forward in the log until the last log record. It determines (a) The point in the log at which to start the Redo pass
Crash Recovery 181 10 20 30 40 50 60 00 end_checkpoint begin_checkpoint LOG LSN update: T1 writes P5 update: T2 writes P3 update: T3 writes P3 CRASH, RESTART T2 end T2 commit T1 abort 70 Figure 18.1 Execution with a Crash (b) The dirty pages in the buffer pool at the time of the crash. (c) Transactions that were active at the time of the crash which need to be undone. The Redo phase follows Analysis and redoes all changes to any page that might have been dirty at the time of the crash. The Undo phase follows Redo and undoes the changes of all transactions that were active at the time of the crash. 2. (a) For this example, we will assume that the Dirty Page Table and Transaction Table were empty before the start of the log. Analysis determines that the last begin checkpoint was at LSN 00 and starts at the corresponding end checkpoint (LSN 10). We will denote Transaction Table records as (transID, lastLSN) and Dirty Page Table records as (pageID, recLSN) sets. Then Analysis phase runs until LSN 70, and does the following: LSN 20 Adds (T1, 20) to TT and (P5, 20) to DPT LSN 30 Adds (T2, 30) to TT and (P3, 30) to DPT LSN 40 Changes status of T2 to ”C” from ”U” LSN 50 Deletes entry for T2 from Transaction Table LSN 60 Adds (T3, 60) to TT. Does not change P3 entry in DPT LSN 70 Changes (T1, 20) to (T1, 70) The final Transaction Table has two entries: (T1, 70), and (T3, 60). The final Dirty Page Table has two entries: (P5, 20), and (P3, 30). (b) Redo Phase: Redo starts at LSN 20 (smallest recLSN in DPT).
182 Chapter 18 10 20 30 40 50 60 LSN LOG 00 update: T1 writes P1 update: T3 writes P3 70 update: T1 writes P2 update: T2 writes P3 update: T2 writes P5 update: T2 writes P5 T2 abort T3 commit Figure 18.2 Aborting a Transaction LSN 20 Changes to P5 are redone. LSN 30 P3 is retrieved and its pageLSN is checked. If the page had been written to disk before the crash (i.e. if pageLSN > = 30), nothing is re-done otherwise the changes are re-done. LSN 40,50 No action LSN 60 Changes to P3 are redone LSN 70 No action (c) Undo Phase: Undo starts at LSN 70 (highest lastLSN in TT). The Loser Set consists of LSNs 70 and 60. LSN 70: Adds LSN 20 to the Loser Set. Loser Set = (60, 20). LSN 60: Undoes the change on P3 and adds a CLR indicating this Undo. Loser Set = (20). LSN 20: Undoes the change on P5 and adds a CLR indicating this Undo. Exercise 18.4 Consider the execution shown in Figure 18.2. 1. Extend the figure to show prevLSN and undonextLSN values. 2. Describe the actions taken to rollback transaction T 2. 3. Show the log after T 2 is rolled back, including all prevLSN and undonextLSN values in log records. Answer 18.4 Answer omitted. Exercise 18.5 Consider the execution shown in Figure 18.3. In addition, the system
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Crash Recovery 183 10 20 30 40 50 60 LSN LOG 00 70 CRASH, RESTART T3 abort update: T1 writes P5 T2 end update: T3 writes P2 T2 commit update: T3 writes P3 update: T2 writes P2 update: T1 writes P1 begin_checkpoint end_checkpoint 80 90 Figure 18.3 Execution with Multiple Crashes crashes during recovery after writing two log records to stable storage and again after writing another two log records. 1. What is the value of the LSN stored in the master log record? 2. What is done during Analysis? 3. What is done during Redo? 4. What is done during Undo? 5. Show the log when recovery is complete, including all non-null prevLSN and un- donextLSN values in log records. Answer 18.5 The answer to each question is given below. 1. LSN 00 is stored in the master log record as it is the LSN of the begin checkpoint record. 2. During analysis the following happens:
184 Chapter 18 LSN 20 Add (T1,20) to TT and (P1,20) to DPT LSN 30 Add (T2,30) to TT and (P2,30) to DPT LSN 40 Add (T3,40) to TT and (P3,40) to DPT LSN 50 Change status of T2 to C LSN 60 Change (T3,40) to (T3,60) LSN 70 Remove T2 from TT LSN 80 Change (T1,20) to (T1,70) and add (P5,70) to DPT LSN 90 No action At the end of analysis, the transaction table contains the following entries: (T1,80), and (T3,60). The Dirty Page Table has the following entries: (P1,20), (P2,30), (P3,40), and (P5,80). 3. Redo starts from LSN20 (minimum recLSN in DPT). LSN 20 Check whether P1 has pageLSN more than 10 or not. Since it is a committed transaction, we probably need not redo this update. LSN 30 Redo the change in P2 LSN 40 Redo the change in P3 LSN 50 No action LSN 60 Redo the changes on P2 LSN 70 No action LSN 80 Redo the changes on P5 LSN 90 No action 4. ToUndo consists of (80, 60). LSN 80 Undo the changes in P5. Append a CLR: Undo T1 LSN 80, set undonextLSN = 20. Add 20 to ToUndo. ToUndo consists of (60, 20). LSN 60 Undo the changes on P2. Append a CLR: Undo T3 LSN 60, set undonextLSN = 40. Add 40 to ToUndo. ToUndo consists of (40, 20). LSN 40 Undo the changes on P3. Append a CLR: Undo T3 LSN 40, T3 end ToUndo consists of (20). LSN 20 Undo the changes on P1. Append a CLR: Undo T1 LSN 20, T1 end
Crash Recovery 185 5. The log looks like the following after recovery: LSN 00 begin checkpoint LSN 10 end checkpoint LSN 20 update: T1 writes P1 LSN 30 update: T2 writes P2 LSN 40 update: T3 writes P3 LSN 50 T2 commit prevLSN = 30 LSN 60 update: T3 writes P2 prevLSN = 40 LSN 70 T2 end prevLSN = 50 LSN 80 update: T1 writes P5 prevLSN = 20 LSN 90 T3 abort prevLSN = 60 LSN 100 CLR: Undo T1 LSN 80 undonextLSN= 20 LSN 110 CLR: Undo T3 LSN 60 undonextLSN= 40 LSN 120,125 CLR: Undo T3 LSN 40 T3 end. LSN 130,135 CLR: Undo T1 LSN 20 T1 end. Exercise 18.6 Briefly answer the following questions: 1. How is checkpointing done in ARIES? 2. Checkpointing can also be done as follows: Quiesce the system so that only check- pointing activity can be in progress, write out copies of all dirty pages, and include the dirty page table and transaction table in the checkpoint record. What are the pros and cons of this approach versus the checkpointing approach of ARIES? 3. What happens if a second begin checkpoint record is encountered during the Anal- ysis phase? 4. Can a second end checkpoint record be encountered during the Analysis phase? 5. Why is the use of CLRs important for the use of undo actions that are not the physical inverse of the original update? 6. Give an example that illustrates how the paradigm of repeating history and the use of CLRs allow ARIES to support locks of finer granularity than a page. Answer 18.6 Answer omitted. Exercise 18.7 Briefly answer the following questions: 1. If the system fails repeatedly during recovery, what is the maximum number of log records that can be written (as a function of the number of update and other log records written before the crash) before restart completes successfully?
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
186 Chapter 18 2. What is the oldest log record we need to retain? 3. If a bounded amount of stable storage is used for the log, how can we always ensure enough stable storage to hold all log records written during restart? Answer 18.7 The answer to each question is given below. 1. Let us take the case where each log record is an update record of an uncommitted transaction and each record belongs to a different transaction. This means there are n records of n different transactions, each of which has to be undone. During recovery, we have to add a CLR record of the undone action and an end transaction record after each CLR. Thus, we can write a maximum of 2 n log records before restart completes. 2. The oldest begin checkpoint referenced in any fuzzy dump or master log record. 3. One needs to ensure that there is enough to hold twice as many records as the current number of log records. If necessary, do a fuzzy dump to free up some log records whenever the number of log records goes above one third of the available space. Exercise 18.8 Consider the three conditions under which a redo is unnecessary (Sec- tion 18.6.2). 1. Why is it cheaper to test the first two conditions? 2. Describe an execution that illustrates the use of the first condition. 3. Describe an execution that illustrates the use of the second condition. Answer 18.8 Answer omitted. Exercise 18.9 The description in Section 18.6.1 of the Analysis phase made the sim- plifying assumption that no log records appeared between the begin checkpoint and end checkpoint records for the most recent complete checkpoint. The following ques- tions explore how such records should be handled. 1. Explain why log records could be written between the begin checkpoint and end checkpoint records. 2. Describe how the Analysis phase could be modified to handle such records. 3. Consider the execution shown in Figure 18.4. Show the contents of the end checkpoint record. 4. Illustrate your modified Analysis phase on the execution shown in Figure 18.4.
Crash Recovery 187 10 20 30 40 LSN LOG 00 begin_checkpoint update: T1 writes P1 update: T2 writes P2 50 60 70 80 T1 commit CRASH, RESTART T3 commit end_checkpoint update: T3 writes P3 T2 abort T1 end Figure 18.4 Log Records between Checkpoint Records Answer 18.9 The answer to each question is given below. 1. In ARIES, first a begin checkpoint record is written and then, after some time, an end checkpoint record is written. While the end checkpoint record is being constructed, the DBMS continues executing transactions and writing other log records. So, we could have log records between the begin checkpoint and the end checkpoint records. The only guarantee we have is that the transaction table and the dirty page table are accurate as of the time of the begin checkpoint record. 2. The Analysis phase begins by examining the most recent begin checkpoint log record and then searches for the next end checkpoint record. Then the Dirty Page Table and the Transaction Table are initialized to the copies of those struc- tures in the end checkpoint. Our new Analysis phase remains the same untill here. In the old algorithm, Analysis scans the log in the forward direction until it reaches the end of the log. In the modified algorithm, Analysis goes back to the begin checkpoint and scans the log in the forward direction. 3. The end checkpoint record contains the transaction table and the dirty page table as of the time of the begin checkpoint (LSN 00 in this case). Since we are assuming these tables to be empty before LSN 00, the end checkpoint record will indicate an empty transaction table and an empty dirty page table. 4. Instead of starting from LSN 80, Analysis goes back to LSN 10 and executes as follows:
188 Chapter 18 LSN 10 Add (T1,10) to TT and (P1, 10) to DPT LSN 20 Change status of T1 from U to C. LSN 30 Add (T2,30) to TT and (P2, 30) to DPT LSN 40 Remove (T1,10) from TT LSN 50 No action LSN 60 Add (T3,60) to TT and (P3, 60) to DPT LSN 70 No action LSN 80 Change status of T3 from U to C. Exercise 18.10 Answer the following questions briefly: 1. Explain how media recovery is handled in ARIES. 2. What are the pros and cons of using fuzzy dumps for media recovery? 3. What are the similarities and differences between checkpoints and fuzzy dumps? 4. Contrast ARIES with other WAL-based recovery schemes. 5. Contrast ARIES with shadow-page-based recovery. Answer 18.10 Answer omitted.
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
19 SCHEMA REFINEMENT AND NORMAL FORMS Exercise 19.1 Briefly answer the following questions: 1. Define the term functional dependency . 2. Why are some functional dependencies called trivial ? 3. Give a set of FDs for the relation schema R(A,B,C,D) with primary key AB under which R is in 1NF but not in 2NF. 4. Give a set of FDs for the relation schema R(A,B,C,D) with primary key AB under which R is in 2NF but not in 3NF. 5. Consider the relation schema R(A,B,C) , which has the FD B C . If A is a can- didate key for R , is it possible for R to be in BCNF? If so, under what conditions? If not, explain why not. 6. Suppose we have a relation schema R(A,B,C) representing a relationship between two entity sets with keys A and B , respectively, and suppose that R has (among others) the FDs A B and B A . Explain what such a pair of dependencies means (i.e., what they imply about the relationship that the relation models). Answer 19.1 1. Let R be a relational schema and let X and Y be two subsets of the set of all attributes of R. We say Y is functionally dependent on X, written X Y, if the Y-values are determined by the X-values. More precisely, for any two tuples r 1 and r 2 in (any instance of) R π X ( r 1 ) = π X ( r 2 ) π Y ( r 1 ) = π Y ( r 2 ) 2. Some functional dependencies are considered trivial because they contain super- fluous attributes that do not need to be listed. Consider the FD: A AB . By reflexivity, A always implies A , so that the A on the right hand side is not neces- sary and can be dropped. The proper form, without the trivial dependency would then be A B . 189
190 Chapter 19 3. Consider the set of FD: AB CD and B C . AB is obviously a key for this relation since AB CD implies AB ABCD . It is a primary key since there are no smaller subsets of keys that hold over R(A,B,C,D) . The FD: B C violates 2NF since: C B is false; that is, it is not a trivial FD B is not a superkey C is not part of some key for R B is a proper subset of the key AB (transitive dependency) 4. Consider the set of FD: AB CD and C D . AB is obviously a key for this relation since AB CD implies AB ABCD . It is a primary key since there are no smaller subsets of keys that hold over R(A,B,C,D) . The FD: C D violates 3NF but not 2NF since: D C is false; that is, it is not a trivial FD C is not a superkey D is not part of some key for R 5. The only way R could be in BCNF is if B includes a key, i.e. B is a key for R. 6. It means that the relationship is one to one. That is, each A entity corresponds to at most one B entity and vice-versa. (In addition, we have the dependency AB C , from the semantics of a relationship set.) Exercise 19.2 Consider a relation R with five attributes ABCDE . You are given the following dependencies: A B , BC E , and ED A . 1. List all keys for R . 2. Is R in 3NF? 3. Is R in BCNF? Answer 19.2 Answer omitted. Exercise 19.3 Consider the relation shown in Figure 19.1. 1. List all the functional dependencies that this relation instance satisfies. 2. Assume that the value of attribute Z of the last record in the relation is changed from z 3 to z 2 . Now list all the functional dependencies that this relation instance satisfies.
Schema Refinement and Normal Forms 191 X Y Z x 1 y 1 z 1 x 1 y 1 z 2 x 2 y 1 z 1 x 2 y 1 z 3 Figure 19.1 Relation for Exercise 19.3. Answer 19.3 1. The following functional dependencies hold over R : Z Y , X Y , and XZ Y 2. Same as part 1. Functional dependency set is unchanged. Exercise 19.4 Assume that you are given a relation with attributes ABCD . 1. Assume that no record has NULL values. Write an SQL query that checks whether the functional dependency A B holds. 2. Assume again that no record has NULL values. Write an SQL assertion that enforces the functional dependency A B . 3. Let us now assume that records could have NULL values. Repeat the previous two questions under this assumption. Answer 19.4 Answer omitted. Exercise 19.5 Consider the following collection of relations and dependencies. As- sume that each relation is obtained through decomposition from a relation with at- tributes ABCDEFGHI and that all the known dependencies over relation ABCDEFGHI are listed for each question. (The questions are independent of each other, obviously, since the given dependencies over ABCDEFGHI are different.) For each (sub)relation: (a) State the strongest normal form that the relation is in. (b) If it is not in BCNF, decompose it into a collection of BCNF relations. 1. R1(A,C,B,D,E) , A B , C D 2. R2(A,B,F), AC E , B F 3. R3(A,D), D G, G H 4. R4(D,C,H,G) , A I, I A
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
192 Chapter 19 5. R5(A,I,C,E) Answer 19.5 1. 1NF. BCNF decomposition: AB, CD, ACE. 2. 1NF. BCNF decomposition: AB, BF 3. BCNF. 4. BCNF. 5. BCNF. Exercise 19.6 Suppose that we have the following three tuples in a legal instance of a relation schema S with three attributes ABC (listed in order): (1,2,3), (4,2,3), and (5,3,3). 1. Which of the following dependencies can you infer does not hold over schema S ? (a) A B , (b) BC A , (c) B C 2. Can you identify any dependencies that hold over S ? Answer 19.6 Answer omitted. Exercise 19.7 Suppose you are given a relation R with four attributes ABCD . For each of the following sets of FDs, assuming those are the only dependencies that hold for R , do the following: (a) Identify the candidate key(s) for R . (b) Identify the best normal form that R satisfies (1NF, 2NF, 3NF, or BCNF). (c) If R is not in BCNF, decompose it into a set of BCNF relations that preserve the dependencies. 1. C D, C A, B C 2. B C, D A 3. ABC D, D A 4. A B, BC D, A C 5. AB C, AB D, C A, D B Answer 19.7 1. (a) Candidate keys: B (b) R is in 2NF but not 3NF.
Schema Refinement and Normal Forms 193 (c) C D and C A both cause violations of BCNF. One way to obtain a (lossless) join preserving decomposition is to decompose R into AC , BC , and CD . 2. (a) Candidate keys: BD (b) R is in 1NF but not 2NF. (c) Both B C and D A cause BCNF violations. The decomposition: AD , BC , BD (obtained by first decomposing to AD , BCD ) is BCNF and lossless and join-preserving. 3. (a) Candidate keys: ABC , BCD (b) R is in 3NF but not BCNF. (c) ABCD is not in BCNF since D A and D is not a key. However if we split up R as AD , BCD we cannot preserve the dependency ABC D . So there is no BCNF decomposition. 4. (a) Candidate keys: A (b) R is in 2NF but not 3NF (because of the FD: BC D ). (c) BC D violates BCNF since BC does not contain a key. So we split up R as in: BCD , ABC . 5. (a) Candidate keys: AB , BC , CD , AD (b) R is in 3NF but not BCNF (because of the FD: C A). (c) C A and D B both cause violations. So decompose into: AC , BCD but this does not preserve AB C and AB D , and BCD is still not BCNF because D B . So we need to decompose further into: AC , BD , CD . However, when we attempt to revive the lost functioanl dependencies by adding ABC and ABD , we that these relations are not in BCNF form. Therefore, there is no BCNF decomposition. Exercise 19.8 Consider the attribute set R = ABCDEGH and the FD set F = { AB C, AC B , AD E , B D , BC A , E G } . 1. For each of the following attribute sets, do the following: (i) Compute the set of dependencies that hold over the set and write down a minimal cover. (ii) Name the strongest normal form that is not violated by the relation containing these attributes. (iii) Decompose it into a collection of BCNF relations if it is not in BCNF. (a) ABC , (b) ABCD , (c) ABCEG , (d) DCEGH , (e) ACEH 2. Which of the following decompositions of R = ABCDEG , with the same set of dependencies F , is (a) dependency-preserving? (b) lossless-join?
194 Chapter 19 (a) { AB , BC , ABDE, EG } (b) { ABC, ACDE, ADG } Answer 19.8 Answer omitted. Exercise 19.9 Let R be decomposed into R 1 , R 2 , . . . , R n . Let F be a set of FDs on R . 1. Define what it means for F to be preserved in the set of decomposed relations. 2. Describe a polynomial-time algorithm to test dependency-preservation. 3. Projecting the FDs stated over a set of attributes X onto a subset of attributes Y requires that we consider the closure of the FDs. Give an example where considering the closure is important in testing dependency-preservation, that is, considering just the given FDs gives incorrect results. Answer 19.9 1. Let F i denote the projection of F on R i . F is preserved if the closure of the (union of) the F i ’s equals F (note that F is always a superset of this closure.) 2. We shall describe an algorithm for testing dependency preservation which is poly- nomial in the cardinality of F. For each dependency X Y F check if it is in F as follows: start with the set S (of attributes in) X. For each relation R i , compute the closure of S R i relative to F and project this closure to the attributes of R i . If this results in additional attributes, add them to S. Do this repeatedly until there is no change to S . 3. There is an example in the text in Section 19.5.2. Exercise 19.10 Suppose you are given a relation R(A,B,C,D) . For each of the fol- lowing sets of FDs, assuming they are the only dependencies that hold for R , do the following: (a) Identify the candidate key(s) for R . (b) State whether or not the pro- posed decomposition of R into smaller relations is a good decomposition and briefly explain why or why not. 1. B C, D A ; decompose into BC and AD . 2. AB C, C A, C D ; decompose into ACD and BC . 3. A BC , C AD ; decompose into ABC and AD . 4. A B , B C , C D ; decompose into AB and ACD . 5. A B , B C , C D ; decompose into AB , AD and CD .
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help