ilovepdf_merged-7

pdf

School

University of the Fraser Valley *

*We aren’t endorsed by this school

Course

13

Subject

Computer Science

Date

Oct 30, 2023

Type

pdf

Pages

238

Uploaded by kakger18

Report
SOEN 363 - Data Systems for Software Engineers Lecture 12: DBMS Internals- Part II Essam Mansour
Today… Last Session: DBMS Internals- Part I Buffer Management Files and Access Methods (file organizations) Today’s Session: DBMS Internals- Part II Tree-based indexes: ISAM and B+ trees
DBMS Layers Query Optimization and Execution Relational Operators Files and Access Methods Buffer Management Disk Space Management DB Queries Transaction Manager Lock Manager Recovery Manager
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Outline Why Indexing? Storing Data Records and Index Types Indexed Static Access Method (ISAM) Trees B+ Trees
Motivation Consider a file of student records sorted by GPA How can we answer a range selection ( E.g., “Find all students with a GPA higher than 3.0” )? What about doing a binary search followed by a scan ? Yes, but… What if the file becomes “very” large? Cost is proportional to the number of pages fetched Hence, may become very slow! Page 1 Page 2 Page N Page 3 Data File
Motivation What about creating an index file (with one entry per page) and do binary search there? But, what if the index file becomes also “very” large? Page 1 Page 2 Page N Data File Index File K 1 P 1 K 2 P 2 K N P N Index Entry = <first key on the page, pointer to the page>
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Motivation Repeat recursively! Non-leaf Pages Pages Leaf Each tree page is a disk block and all data records reside ( if chosen to be part of the index ) in ONLY leaf pages How else data records can be stored?
Outline Why Indexing? Storing Data Records and Index Types Indexed Static Access Method (ISAM) Trees B+ Trees
Where to Store Data Records? In general, 3 alternatives for “data records” (each referred to as K* ) can be pursued: Alternative (1): K* is an actual data record with key k Alternative (2): K* is a < k , rid> pair, where rid is the record id of a data record with search key k Alternative (3): K* is a < k , rid-list> pair, where rid-list is a list of rids of data records with search key k
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Where to Store Data Records? In general, 3 alternatives for “data records” (each referred to as K* ) can be pursued: Alternative (1): K* is an actual data record with key k Alternative (2): K* is a < k , rid> pair, where rid is the record id of a data record with search key k Alternative (3): K* is a < k , rid-list> pair, where rid-list is a list of rids of data records with search key k Alternative (1) : Leaf pages contain the actual data (i.e., the data records) Alternative (2) : Leaf pages contain the <key, rid> pairs and actual data records are stored in a separate file Alternative (3) : Leaf pages contain the <key, rid-list> pairs and actual data records are stored in a separate file The choice among these alternatives is orthogonal to the indexing technique
Clustered vs. Un-clustered Indexes Indexes can be either clustered or un-clustered Clustered Indexes: When the ordering of data records is the same as (or close to) the ordering of entries in some index Un-clustered Indexes: When the ordering of data records differs from the ordering of entries in some index
Clustered vs. Un-clustered Indexes Is an index that uses Alternative (1) clustered or un-clustered? Clustered Is an index that uses Alternative (2) or (3) clustered or un-clustered? Clustered “only” if data records are sorted on the search key field In practice: A clustered index is an index that uses Alternative (1) Indexes that use Alternatives (2) or (3) are un-clustered
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Clustered and Nonclustered Indexes
Clustered and Nonclustered Indexes CREATE INDEX myIndex ON SalesData (ProductID, OrderQty)
Clustered and Nonclustered Indexes CREATE INDEX myIndex ON SalesData (ProductID, OrderQty)
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Outline Why Indexing? Storing Data Records and Index Types Indexed Static Access Method (ISAM) Trees B+ Trees
ISAM Trees Indexed Sequential Access Method (ISAM) trees are static E.g., 2 Entries Per Page 10* 15* 20* 27* 33* 37* 40* 46* 51* 55* 63* 97* 20 33 51 63 40 Root Non-Leaf Pages Leaf Pages
ISAM Trees: Page Overflows What if there are a lot of insertions after creating the tree? Non-leaf Pages Pages Leaf Overflow page Primary pages
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
ISAM: Searching for Entries Search begins at root, and key comparisons direct it to a leaf Search for 27* 10* 15* 20* 27* 33* 37* 40* 46* 51* 55* 63* 97* 20 33 51 63 40 Root
ISAM: Inserting Entries The appropriate page is determined as for a search, and the entry is inserted (with overflow pages added if necessary) Insert 23* 10* 15* 20* 27* 33* 37* 40* 46* 51* 55* 63* 97* 20 33 51 63 40 Root 23*
ISAM: Inserting Entries The appropriate page is determined as for a search, and the entry is inserted (with overflow pages added if necessary) Insert 48* 10* 15* 20* 27* 33* 37* 40* 46* 51* 55* 63* 97* 20 33 51 63 40 Root 23* 48*
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
ISAM: Inserting Entries The appropriate page is determined as for a search, and the entry is inserted (with overflow pages added if necessary) Insert 41* 10* 15* 20* 27* 33* 37* 40* 46* 51* 55* 63* 97* 20 33 51 63 40 Root 23* 48* 41*
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
ISAM: Inserting Entries The appropriate page is determined as for a search, and the entry is inserted (with overflow pages added if necessary) Insert 42* 10* 15* 20* 27* 33* 37* 40* 46* 51* 55* 63* 97* 20 33 51 63 40 Root 23* 48* 41* 42* Chains of overflow pages can easily develop!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
ISAM: Deleting Entries The appropriate page is determined as for a search, and the entry is deleted ( with ONLY overflow pages removed when becoming empty ) Delete 42* 10* 15* 20* 27* 33* 37* 40* 46* 51* 55* 63* 97* 20 33 51 63 40 Root 23* 48* 41* 42*
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
ISAM: Deleting Entries The appropriate page is determined as for a search, and the entry is deleted ( with ONLY overflow pages removed when becoming empty ) Delete 42* 10* 15* 20* 27* 33* 37* 40* 46* 51* 55* 63* 97* 20 33 51 63 40 Root 23* 48* 41*
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
ISAM: Deleting Entries The appropriate page is determined as for a search, and the entry is deleted ( with ONLY overflow pages removed when becoming empty ) Delete 42* 10* 15* 20* 27* 33* 37* 40* 46* 51* 55* 63* 97* 20 33 51 63 40 Root 23* 48* 41*
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
ISAM: Deleting Entries The appropriate page is determined as for a search, and the entry is deleted ( with ONLY overflow pages removed when becoming empty ) Delete 51* 10* 15* 20* 27* 33* 37* 40* 46* 51* 55* 63* 97* 20 33 51 63 40 Root 23* 48* 41* Note that 51 still appears in an index entry, but not in the leaf!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
ISAM: Deleting Entries The appropriate page is determined as for a search, and the entry is deleted ( with ONLY overflow pages removed when becoming empty ) Delete 55* 10* 15* 20* 27* 33* 37* 40* 46* 55* 63* 97* 20 33 51 63 40 Root 23* 48* 41* Primary pages are NOT removed, even if they become empty!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
ISAM: Some Issues Once an ISAM file is created, insertions and deletions affect only the contents of leaf pages (i.e., ISAM is a static structure !) Since index-level pages are never modified, there is no need to lock them during insertions/deletions Critical for concurrency! Long overflow chains can develop easily The tree can be initially set so that ~20% of each page is free If the data distribution and size are relatively static, ISAM might be a good choice to pursue!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Next Lecture Why Indexing? Storing Data Records in Indexes and Index Types Indexed Static Access Method (ISAM) Trees B+ Trees
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
SOEN 363 - Data Systems for Software Engineers Lecture 13: DBMS Internals- Part 3 Essam Mansour
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Today… Last Session: DBMS Internals- Part 2 Tree-based indexes: ISAM Today s Session: DBMS Internals- Part II Tree-based indexes: B+ trees
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Dynamic Trees ISAM indices are static Long overflow chains can develop as the file grows, leading to poor performance This calls for more flexible, dynamic indices that adjust gracefully to insertions and deletions No need to allocate the leaf pages sequentially as in ISAM Among the most successful dynamic index schemes is the B+ tree
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree Properties Each node in a B+ tree of order d (this is a measure of the capacity of a tree) : Has at most 2d keys Has at least d keys (except the root, which may have just 1 key) All leaves are on the same level Has exactly n-1 keys if the number of pointers is n k 1 k 2 k n p 1 p n+1 Points to a sub-tree in which all keys are less than k 1 Points to a sub-tree in which all keys are greater than or equal k n Points to a sub-tree in which all keys are greater than or equal k 1 and less than to k 2
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Searching for Entries Search begins at root, and key comparisons direct it to a leaf (as in ISAM) Example 1: Search for entry 5* Root 17 24 30 2* 3* 5* 7* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39* 13
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Searching for Entries Search begins at root, and key comparisons direct it to a leaf (as in ISAM) Example 2: Search for entry 15* Root 17 24 30 2* 3* 5* 7* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39* 13 15* is not found!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Trees: Inserting Entries Find correct leaf L Put data entry onto L If L has enough space, done ! Else, split L into L and a new node L 2 Re-partition entries evenly , copying up the middle key Parent node may overflow Push up middle key (splits grow trees; a root split increases the height of the tree)
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Insertions Insert entry 8* Root 17 24 30 2* 3* 5* 7* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39* 13 Leaf is full ; hence, split!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Insertions Insert entry 8* Root 17 24 30 2* 3* 5* 7* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39* 13 2* 3* 5* 7* 8* 5 The middle key (i.e., 5 ) is “copied up” and continues to appear in the leaf
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Insertions Insert entry 8* Root 17 24 30 2* 3* 5* 7* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39* 13 2* 3* 5* 7* 8* 5 17 24 30 13 5 Parent is full ; hence, split! > 2d keys and 2d + 1 pointers
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Insertions Insert entry 8* Root 17 24 30 2* 3* 5* 7* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39* 13 2* 3* 5* 7* 8* 5 5 24 30 17 13 The middle key (i.e., 17 ) is pushed up
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Insertions Insert entry 8* Root 17 24 30 2* 3* 5* 7* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39* 13 2* 3* 5* 7* 8* 5 24 30 17 13
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Insertions Insert entry 8* 2* 3* Root 17 24 30 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39* 13 5 7* 5* 8* Splitting the root lead to an increase of height by 1! What about re-distributing entries instead of splitting nodes? FINAL TREE!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Insertions Insert entry 8* Root 17 24 30 2* 3* 5* 7* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39* 13 Leaf is full ; hence, check the sibling Poor Sibling
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Insertions Insert entry 8* Root 17 24 30 2* 3* 5* 7* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39* 13 Do it through the parent
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Insertions Insert entry 8* Root 17 24 30 2* 3* 5* 7* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39* Do it through the parent 8* “Copy up” the new low key value! 13 8 But, when to redistribute and when to split ?
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Splitting vs. Redistributing Leaf Nodes Previous and next-neighbor pointers must be updated upon insertions ( if splitting is to be pursued ) Hence, checking whether redistribution is possible does not increase I/O Therefore, if a sibling can spare an entry, re-distribute Non-Leaf Nodes Checking whether redistribution is possible usually increases I/O Splitting non-leaf nodes typically pays off!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Trees: Deleting Entries Start at root, find leaf L where entry belongs Remove the entry If L is at least half-full, done! If L underflows Try to re-distribute (i.e., borrow from a rich sibling and copy up its lowest key ) If re-distribution fails, merge L and a poor sibling Update parent And possibly merge, recursively
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Deletions Delete 19* 2* 3* Root 17 24 30 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39* 13 5 7* 5* 8* Removing 19* does not cause an underflow
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Deletions Delete 19* 2* 3* Root 17 24 30 14* 16* 24* 27* 29* 33* 34* 38* 39* 13 5 7* 5* 8* 20* 22* FINAL TREE!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Deletions Delete 20* 2* 3* Root 17 24 30 14* 16* 24* 27* 29* 33* 34* 38* 39* 13 5 7* 5* 8* Deleting 20* causes an underflow; hence, check a sibling for redistribution 20* 22*
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Deletions Delete 20* 2* 3* Root 17 24 30 14* 16* 24* 27* 29* 33* 34* 38* 39* 13 5 7* 5* 8* The sibling is ‘rich’ (i.e., can lend an entry); hence, remove 20* and redistribute! 20* 22*
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Deletions Delete 20* 2* 3* Root 17 24 30 14* 16* 24* 27* 29* 33* 34* 38* 39* 13 5 7* 5* 8* 22* “Copy up” 27* , the lowest value in the leaf from which we borrowed 24* Is it done?
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Deletions Delete 20* 2* 3* Root 17 30 14* 16* 24* 27* 29* 33* 34* 38* 39* 13 5 7* 5* 8* Copy up 27* , the lowest value in the leaf from which we borrowed 24* 22* 27
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Deletions Delete 20* 2* 3* Root 17 30 14* 16* 24* 27* 29* 33* 34* 38* 39* 13 5 7* 5* 8* 22* 27 FINAL TREE!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Deletions Delete 24* 2* 3* Root 17 30 14* 16* 24* 27* 29* 33* 34* 38* 39* 13 5 7* 5* 8* The affected leaf will contain only 1 entry and the sibling cannot lend any entry (i.e., redistribution is not applicable); hence, merge ! 22* 27
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Deletions Delete 24* 2* 3* Root 17 30 14* 16* 24* 27* 29* 33* 34* 38* 39* 13 5 7* 5* 8* 22* 27 30 22* 27* 29* 33* 34* 38* 39* Merge “Toss” 27 because the page that it was pointing to does not exist anymore!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Deletions Delete 24* 2* 3* Root 17 14* 16* 13 5 7* 5* 8* 30 22* 27* 29* 33* 34* 38* 39* No, but almost there… Is it done?
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Deletions Delete 24* 2* 3* Root 17 14* 16* 13 5 7* 5* 8* 30 22* 27* 29* 33* 34* 38* 39* This entails an underflow ; hence, we must either redistribute or merge!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Deletions Delete 24* 2* 3* Root 17 14* 16* 13 5 7* 5* 8* 30 22* 27* 29* 33* 34* 38* 39* The sibling is poor (i.e., redistribution is not applicable) ; hence, merge!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Deletions Delete 24* 2* 3* Root 17 14* 16* 13 5 7* 5* 8* 30 22* 27* 29* 33* 34* 38* 39* Root 13 5 Lacks a pointer for 30! 30
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Deletions Delete 24* 2* 3* Root 17 14* 16* 13 5 7* 5* 8* 30 22* 27* 29* 33* 34* 38* 39* Root 13 5 Lacks a key value to create a complete index entry! 30
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Deletions Delete 24* 2* 3* Root 17 14* 16* 13 5 7* 5* 8* 30 22* 27* 29* 33* 34* 38* 39* Root 13 5 “Pull down” 17! 30 17
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Deletions Delete 24* 2* 3* 7* 14* 16* 22* 27* 29* 33* 34* 38* 39* 5* 8* Root 30 13 5 17 FINAL TREE!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Deletions Delete 24* Root 13 5 17 20 22 30 14* 16* 17* 18* 20* 33* 34* 38* 39* 22* 27* 29* 21* 7* 5* 8* 3* 2* Assume (instead ) the above tree during deleting 24* Now we can re-distribute ( instead of merging ) keys! 24* was originally here
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
B+ Tree: Examples of Deletions Delete 24* 14* 16* 33* 34* 38* 39* 22* 27* 29* 17* 18* 20* 21* 7* 5* 8* 2* 3* Root 13 5 17 30 20 22 DONE! It suffices to re-distribute only 20; 17 was redistributed for illustration.
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Next Class Query Optimization and Execution Relational Operators Files and Access Methods Buffer Management Disk Space Management DB Queries Transaction Manager Lock Manager Recovery Manager Continue (Hash-Based Indexing)
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
SOEN 363 - Data Systems for Software Engineers Lecture 14: DBMS Internals- Part 4 Essam Mansour
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Today… Last Session: DBMS Internals- Part 3 Tree-based indexes: ISAM and B+ trees Today’s Session: DBMS Internals- Part 4 Tree-based (B+ tree- cont’d) and Hash-based indexes
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
DBMS Layers Query Optimization and Execution Relational Operators Files and Access Methods Buffer Management Disk Space Management DB Queries Transaction Manager Lock Manager Recovery Manager Continue…
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Outline A Primer on Hash-Based Indexing Static Hashing Extendible Hashing
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Hash-Based Indexing What indexing technique can we use to support range searches (e.g., “Find s_name where gpa >= 3.0)? Tree-Based Indexing What about equality selections (e.g., “Find s_name where sid = 102”? Tree-Based Indexing Hash-Based Indexing ( cannot support range searches! ) Hash-based indexing, however, proves to be very useful in implementing relational operators (e.g., joins)
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Outline A Primer on Hash-Based Indexing Static Hashing Extendible Hashing
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Static Hashing A hash structure (or table or file) is a generalization of the simpler notion of an ordinary array In an array, an arbitrary position can be examined in O(1) A hash function h is used to map keys into a range of bucket numbers h (key) mod N h key Primary bucket pages Overflow pages 2 0 N-1 With Static Hashing , allocated sequentially and never de-allocated With Static Hashing , allocated ( as needed ) when corresponding buckets become full
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Static Hashing Data entries can be any of the three alternatives ( A (1) , A (2) or A (3) - see previous lecture) Data entries can be sorted in buckets to speed up searches The hash function h is used to identify the bucket to which a given key belongs and subsequently insert , delete or locate a respective data record A hash function of the form h (key) = ( a * key + b ) works well in practice A search ideally requires 1 disk I/O, while an insertion or a deletion necessitates 2 disk I/Os
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Static Hashing: Some Issues Similar to ISAM, the number of buckets is fixed! Cannot deal with insertions and deletions gracefully Long overflow chains can develop easily and degrade performance! Pages can be initially kept only 80% full Dynamic hashing techniques can be used to fix the problem Extendible Hashing (EH) Linear Hashing (LH)
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Outline A Primer on Hash-Based Indexing Static Hashing Extendible Hashing
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Directory of Pointers How else ( as opposed to overflow pages ) can we add a data record to a full bucket in a static hash file? Reorganize the table (e.g., by doubling the number of buckets and redistributing the entries across the new set of buckets) But, reading and writing all pages is expensive! In contrast, we can use a directory of pointers to buckets Buckets number can be doubled by doubling just the directory and splitting only the bucket that overflowed The trick lies on how the hash function can be adjusted!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Extendible Hashing Extendible Hashing uses a directory of pointers to buckets The result of applying a hash function h is treated as a binary number and the last d bits are interpreted as an offset into the directory d is referred to as the global depth of the hash file and is kept as part of the header of the file 00 01 10 11 DIRECTORY Bucket A Bucket B Bucket C Bucket D DATA PAGES 10* 1* 21* 4* 12* 32* 16* 15* 7* 19* 5* 2 GLOBAL DEPTH
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Extendible Hashing: Searching for Entries To search for a data entry, apply a hash function h to the key and take the last d bits of its binary representation to get the bucket number Example: search for 5* 00 01 10 11 DIRECTORY Bucket A Bucket B Bucket C Bucket D DATA PAGES 10* 1* 21* 4* 12* 32* 16* 15* 7* 19* 5* 5 = 1 01 2
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Extendible Hashing: Inserting Entries An entry can be inserted as follows: Find the appropriate bucket ( as in search ) Split the bucket if full and redistribute contents (including the new entry to be inserted) across the old bucket and its “split image” Double the directory if necessary Insert the given entry
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Extendible Hashing: Inserting Entries Find the appropriate bucket (as in search), split the bucket if full, double the directory if necessary and insert the given entry Example: insert 13* 13* 00 01 10 11 DIRECTORY Bucket A Bucket B Bucket C Bucket D 10* 1* 21* 4* 12* 32* 16* 15* 7* 19* 5* 13 = 11 01 2
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Extendible Hashing: Inserting Entries Find the appropriate bucket (as in search), split the bucket if full, double the directory if necessary and insert the given entry Example: insert 20* 13* 00 01 10 11 DIRECTORY Bucket A Bucket B Bucket C Bucket D 10* 1* 21* 4* 12* 32* 16* 15* 7* 19* 5* 20 = 101 00 FULL, hence, split and redistribute! 2
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Extendible Hashing: Inserting Entries Find the appropriate bucket (as in search), split the bucket if full, double the directory if necessary and insert the given entry Example: insert 20* 13* 00 01 10 11 DIRECTORY Bucket A Bucket B Bucket C Bucket D 10* 1* 21* 32* 16* 15* 7* 19* 5* 20 = 101 00 20* Bucket A2 ( `split image' of Bucket A) 4* 12* 2 Is this enough?
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Extendible Hashing: Inserting Entries Find the appropriate bucket (as in search), split the bucket if full, double the directory if necessary and insert the given entry Example: insert 20* 13* 00 01 10 11 DIRECTORY Bucket A Bucket B Bucket C Bucket D 10* 1* 21* 32* 16* 15* 7* 19* 5* 20* Bucket A2 ( `split image' of Bucket A) 4* 12* 2 Double the directory and increase the global depth 20 = 101 00
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Extendible Hashing: Inserting Entries Find the appropriate bucket (as in search), split the bucket if full, double the directory if necessary and insert the given entry Example: insert 20* 19* 0 00 001 010 011 1 00 101 110 111 3 DIRECTORY Bucket A Bucket B Bucket C Bucket D Bucket A2 ( `split image' of Bucket A) 32* 1* 5* 21*13* 16* 10* 15* 7* 4* 20* 12* GLOBAL DEPTH These two bits indicate a data entry that belongs to one of these two buckets The third bit distinguishes between these two buckets! But, is it necessary always to double the directory?
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Extendible Hashing: Inserting Entries Find the appropriate bucket (as in search), split the bucket if full, double the directory if necessary and insert the given entry Example: insert 9* 9 = 1 001 19* 000 001 010 011 100 101 110 111 3 DIRECTORY Bucket A Bucket B Bucket C Bucket D Bucket A2 ( `split image' of Bucket A) 32* 1* 5* 21*13* 16* 10* 15* 7* 4* 20* 12* GLOBAL DEPTH FULL, hence, split!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Extendible Hashing: Inserting Entries Find the appropriate bucket (as in search), split the bucket if full, double the directory if necessary and insert the given entry Example: insert 9* 19* 000 001 010 011 100 101 110 111 3 DIRECTORY Bucket A Bucket B Bucket C Bucket D Bucket A2 ( `split image‘ of A) 32* 1* 16* 10* 15* 7* 4* 20* 12* GLOBAL DEPTH Bucket B2 ( `split image‘ of B) 5* 21*13* 9* Almost there… 9 = 1 001
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Extendible Hashing: Inserting Entries Find the appropriate bucket (as in search), split the bucket if full, double the directory if necessary and insert the given entry Example: insert 9* 19* 000 001 010 011 100 101 110 111 3 DIRECTORY Bucket A Bucket B Bucket C Bucket D Bucket A2 ( `split image‘ of A) 32* 1* 16* 10* 15* 7* 4* 20* 12* GLOBAL DEPTH Bucket B2 ( `split image‘ of A) 5* 21*13* 9* There was no need to double the directory! When NOT to double the directory? 9 = 1 001
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Extendible Hashing: Inserting Entries Find the appropriate bucket (as in search), split the bucket if full, double the directory if necessary and insert the given entry Example: insert 9* 19* 000 001 010 011 100 101 110 111 3 DIRECTORY Bucket A Bucket B Bucket C Bucket D Bucket A2 ( `split image of A) 32* 1* 16* 10* 15* 7* 4* 20* 12* GLOBAL DEPTH Bucket B2 ( `split image‘ of B) 5* 21*13* 9* If a bucket whose local depth equals to the global depth is split, the directory must be doubled 3 2 2 3 3 LOCAL DEPTH 3 9 = 1 001
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Extendible Hashing: Inserting Entries Example: insert 9* 9 = 1 001 19* 000 001 010 011 100 101 110 111 3 DIRECTORY Bucket A Bucket B Bucket C Bucket D Bucket A2 (`split image' of Bucket A) 32* 1* 5* 21*13* 16* 10* 15* 7* 4* 20* 12* GLOBAL DEPTH FULL, hence, split! 2 2 2 3 3 Repeat… Because the local depth (i.e., 2) is less than the global depth (i.e., 3), NO need to double the directory LOCAL DEPTH
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Extendible Hashing: Inserting Entries Example: insert 9* 19* 000 001 010 011 100 101 110 111 3 DIRECTORY Bucket A Bucket B Bucket C Bucket D Bucket A2 (`split image‘ of A) 32* 1* 16* 10* 15* 7* 4* 20* 12* GLOBAL DEPTH Bucket B2 (`split image‘ of B) 5* 21*13* 9* 3 2 2 3 3 LOCAL DEPTH 3 9 = 1 001 Repeat
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Extendible Hashing: Inserting Entries Example: insert 9* 19* 000 001 010 011 100 101 110 111 3 DIRECTORY Bucket A Bucket B Bucket C Bucket D Bucket A2 (`split image‘ of A) 32* 1* 16* 10* 15* 7* 4* 20* 12* GLOBAL DEPTH Bucket B2 (`split image‘ of B) 5* 21*13* 9* 3 2 2 3 3 LOCAL DEPTH 3 FINAL STATE! 9 = 1 001 Repeat…
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Extendible Hashing: Inserting Entries Example: insert 20* Because the local depth and the global depth are both 2, we should double the directory! 20 = 101 00 13* 00 01 10 11 2 2 2 2 2 LOCAL DEPTH GLOBAL DEPTH DIRECTORY Bucket A Bucket B Bucket C Bucket D DATA PAGES 10* 1* 21* 4* 12* 32* 16* 15* 7* 19* 5* FULL, hence, split! Repeat
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Extendible Hashing: Inserting Entries Example: insert 20* Is this enough? 20* 00 01 10 11 2 2 2 2 LOCAL DEPTH 2 2 DIRECTORY GLOBAL DEPTH Bucket A Bucket B Bucket C Bucket D Bucket A2 (`split image' of Bucket A) 1* 5* 21*13* 32* 16* 10* 15* 7* 19* 4* 12* Repeat… 20 = 101 00
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Extendible Hashing: Inserting Entries Example: insert 20* 19* 2 2 2 000 001 010 011 100 101 110 111 3 2 2 DIRECTORY Bucket A Bucket B Bucket C Bucket D Bucket A2 (`split image' of Bucket A) 32* 1* 5* 21*13* 16* 10* 15* 7* 4* 20* 12* LOCAL DEPTH GLOBAL DEPTH Repeat… Is this enough?
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Extendible Hashing: Inserting Entries Example: insert 20* 19* 2 2 2 000 001 010 011 100 101 110 111 3 3 3 DIRECTORY Bucket A Bucket B Bucket C Bucket D Bucket A2 (`split image' of Bucket A) 32* 1* 5* 21*13* 16* 10* 15* 7* 4* 20* 12* LOCAL DEPTH GLOBAL DEPTH Repeat… FINAL STATE!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Next Class Query Optimization and Execution Relational Operators Files and Access Methods Buffer Management Disk Space Management DB Queries Transaction Manager Lock Manager Recovery Manager Hash-based Indexes ( Cont’d ) and External Sorting
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
SOEN 363 - Data Systems for Software Engineers Lecture 15: Query Optimization Essam Mansour
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Today Last Session: DBMS Internals Today s Session: Query Optimization
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
DBMS Layers Query Optimization and Execution Relational Operators Files and Access Methods Buffer Management Disk Space Management DB Queries Transaction Manager Lock Manager Recovery Manager
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Outline A Brief Primer on Query Optimization Evaluating Query Plans Relational Algebra Equivalences Estimating Plan Costs Enumerating Plans
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Cost-Based Query Sub-System Query Parser Query Optimizer Plan Generator Plan Cost Estimator Query Plan Evaluator Catalog Manager Usually there is a heuristics-based rewriting step before the cost-based steps. Schema Statistics select name from STUDENT, TAKES where c- id=‘415’ and STUDENT.ssn=TAKES.ssn Queries
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Query Optimization Steps Step 1 : Queries are parsed into internal forms (e.g., parse trees) Step 2 : Internal forms are transformed into canonical forms (syntactic query optimization) Step 3 : A subset of alternative plans are enumerated Step 4 : Costs for alternative plans are estimated Step 5 : The query evaluation plan with the least estimated cost is picked
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Required Information to Evaluate Queries To estimate the costs of query plans, the query optimizer examines the system catalog and retrieves: Information about the types and lengths of fields Statistics about the referenced relations Access paths (indexes) available for relations In particular, the Schema and Statistics components in the Catalog Manager are inspected to find a good enough query evaluation plan
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Cost-Based Query Sub-System Query Parser Query Optimizer Plan Generator Plan Cost Estimator Query Plan Evaluator Catalog Manager Usually there is a heuristics-based rewriting step before the cost-based steps. Schema Statistics Queries select name from STUDENT, TAKES where c-id= 415 and STUDENT.ssn=TAKES.ssn
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Catalog Manager: The Schema What kind of information do we store at the Schema? Information about tables (e.g., table names and integrity constraints) and attributes (e.g., attribute names and types) Information about indices (e.g., index structures) Information about users Where do we store such information? In tables, hence, can be queried like any other tables For example: Attribute_Cat (attr_name: string , rel_name: string ; type: string ; position: integer )
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Catalog Manager: Statistics What would you store at the Statistics component? NTuples(R) : # records for table R NPages(R) : # pages for R NKeys(I) : # distinct key values for index I INPages(I) : # pages for index I IHeight(I) : # levels for I ILow(I), IHigh(I) : range of values for I ... Such statistics are important for estimating plan costs and result sizes ( to be discussed shortly! )
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
SQL Blocks SQL queries are optimized by decomposing them into a collection of smaller units, called blocks A block is an SQL query with: No nesting Exactly 1 SELECT and 1 FROM clauses At most 1 WHERE, 1 GROUP BY and 1 HAVING clauses A typical relational query optimizer concentrates on optimizing a single block at a time
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Translating SQL Queries Into Relational Algebra Trees select name from STUDENT, TAKES where c-id= 415 and STUDENT.ssn=TAKES.ssn STUDENT TAKES s p An SQL block can be thought of as an algebra expression containing: A cross-product of all relations in the FROM clause Selections in the WHERE clause Projections in the SELECT clause Remaining operators can be carried out on the result of such SQL block
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Translating SQL Queries Into Relational Algebra Trees ( Cont’d ) STUDENT TAKES s p STUDENT TAKES s p Canonical form Still the same result! How can this be guaranteed?
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Translating SQL Queries Into Relational Algebra Trees ( Cont’d ) STUDENT TAKES s p STUDENT TAKES s p Canonical form OBSERVATION: try to perform selections and projections early!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Translating SQL Queries Into Relational Algebra Trees ( Cont d ) STUDENT TAKES s p Index; seq scan Hash join; merge join; nested loops; How to evaluate a query plan (as opposed to evaluating an operator)?
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Outline A Brief Primer on Query Optimization Evaluating Query Plans Relational Algebra Equivalences Estimating Plan Costs Enumerating Plans
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
SOEN 363 - Data Systems for Software Engineers Lecture 16: Query Optimization Essam Mansour
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Today… Last Session: Query Optimization Today’s Session: Query Optimization
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
DBMS Layers Query Optimization and Execution Relational Operators Files and Access Methods Buffer Management Disk Space Management DB Queries Transaction Manager Lock Manager Recovery Manager
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Outline A Brief Primer on Query Optimization Evaluating Query Plans Relational Algebra Equivalences Estimating Plan Costs Enumerating Plans
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Query Evaluation Plans A query evaluation plan (or simply a plan ) consists of an extended relational algebra tree (or simply a tree) A plan tree consists of annotations at each node indicating: The access methods to use for each relation The implementation method to use for each operator Consider the following SQL query Q : SELECT S.sname FROM Reserves R, Sailors S WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5 What is the corresponding RA of Q ?
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Query Evaluation Plans (Cont’d) Q can be expressed in relational algebra as follows: ) (Re 5 100 ( Sailors sid sid serves rating bid sname Reserves Sailors sid=sid bid=100 rating > 5 sname A RA Tree: Reserves Sailors sid=sid bid=100 rating > 5 sname (Simple Nested Loops) (On-the-fly) (On-the-fly) An Extended RA Tree: (File Scan) (File Scan)
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Pipelining vs. Materializing When a query is composed of several operators, the result of one operator can sometimes be pipelined to another operator Reserves Sailors sid=sid bid=100 rating > 5 sname (Simple Nested Loops) (On-the-fly) (On-the-fly) (File Scan) (File Scan) Pipeline the output of the join into the selection and projection that follow Applied on-the-fly
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Pipelining vs. Materializing When a query is composed of several operators, the result of one operator can sometimes be pipelined to another operator Reserves Sailors sid=sid bid=100 rating > 5 sname (Simple Nested Loops) (On-the-fly) (On-the-fly) (File Scan) (File Scan) Pipeline the output of the join into the selection and projection that follow Applied on-the-fly In contrast, a temporary table can be materialized to hold the intermediate result of the join and read back by the selection operation! Pipelining can significantly save I/O cost!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
The I/O Cost of the Q Plan What is the I/O cost of the following evaluation plan? Reserves Sailors sid=sid bid=100 rating > 5 sname (Simple Nested Loops) (On-the-fly) (On-the-fly) (File Scan) (File Scan) The cost of the join is 1000 + 1000 * 500 = 501,000 I/Os (assuming page-oriented Simple NL join) The selection and projection are done on-the-fly; hence, do not incur additional I/Os
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Pushing Selections How can we reduce the cost of a join? By reducing the sizes of the input relations! Reserves Sailors sid=sid bid=100 rating > 5 sname Involves bid in Reserves; hence, “push” ahead of the join! Involves rating in Sailors; hence, push ahead of the join!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Pushing Selections How can we reduce the cost of a join? By reducing the sizes of the input relations! Reserves Sailors sid=sid bid=100 sname (On-the-fly) rating > 5 (Scan; write to temp T1) (Scan; write to temp T2) (Sort-Merge Join) Reserves Sailors sid=sid bid=100 rating > 5 sname (Simple Nested Loops) (On-the-fly) (On-the-fly) (File Scan) (File Scan)
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
The I/O Cost of the New Q Plan What is the I/O cost of the following evaluation plan? Reserves Sailors sid=sid bid=100 sname (On-the-fly) rating > 5 (Scan; write to temp T1) (Scan; write to temp T2) (Sort-Merge Join) Cost of Scanning Reserves = 1000 I/Os Cost of Writing T1 = 10 * I/Os ( later ) Cost of Scanning Sailors = 500 I/Os Cost of Writing T2 = 250 * I/Os ( later ) * Assuming 100 boats and uniform distribution of reservations across boats. * Assuming 10 ratings and uniform distribution over ratings.
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
The I/O Costs of the Two Q Plans Total Cost = 501, 000 I/Os Reserves Sailors sid=sid bid=100 sname (On-the-fly) rating > 5 (Scan; write to temp T1) (Scan; write to temp T2) (Sort-Merge Join) Reserves Sailors sid=sid bid=100 rating > 5 sname (Simple Nested Loops) (On-the-fly) (On-the-fly) (File Scan) (File Scan) Total Cost = 4060 I/Os
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Pushing Projections How can we reduce the cost of a join? By reducing the sizes of the input relations! Consider (again) the following plan: Reserves Sailors sid=sid bid=100 sname rating > 5 (Scan; write to temp T1) (Scan; write to temp T2) What are the attributes required from T1 and T2? Sid from T1 Sid and sname from T2 Hence, as we scan Reserves and Sailors we can also remove unwanted columns (i.e., “Push” the projections ahead of the join)!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Pushing Projections How can we reduce the cost of a join? By reducing the sizes of the input relations! Consider (again) the following plan: Reserves Sailors sid=sid bid=100 sname rating > 5 (Scan; write to temp T1) (Scan; write to temp T2) The cost after applying this heuristic can become 2000 I/Os (as opposed to 4060 I/Os with only pushing the selection)! “Push” ahead the join
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
What if indexes are available on Reserves and Sailors? Using Indexes Reserves Sailors sid=sid bid=100 sname (On-the-fly) rating > 5 (Use hash index; do not write result to temp) (Index Nested Loops, with pipelining ) (On-the-fly) (Hash index on sid) (Clustered hash index on bid) With clustered index on bid of Reserves, we get 100,000/100 = 1000 tuples (assuming 100 boats and uniform distribution of reservations across boats) Since the index is clustered, the 1000 tuples appear consecutively within the same bucket; thus # of pages = 1000/100 = 10 pages
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Using Indexes What if indexes are available on Reserves and Sailors? Reserves Sailors sid=sid bid=100 sname (On-the-fly) rating > 5 (Use hash index; do not write result to temp) (Index Nested Loops, with pipelining ) (On-the-fly) (Hash index on sid) (Clustered hash index on bid) For each selected Reserves tuple, we can retrieve matching Sailors tuples using the hash index on the sid field Selected Reserves tuples need not be materialized and the join result can be pipelined! For each tuple in the join result, we apply rating > 5 and the projection of sname on-the-fly
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Using Indexes What if indexes are available on Reserves and Sailors? Reserves Sailors sid=sid bid=100 sname (On-the-fly) rating > 5 (Use hash index; do not write result to temp) (Index Nested Loops, with pipelining ) (On-the-fly) (Hash index on sid) (Clustered hash index on bid) Is it necessary to project out unwanted columns? NO , since selection results are NOT materialized
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Using Indexes What if indexes are available on Reserves and Sailors? Reserves Sailors sid=sid bid=100 sname (On-the-fly) rating > 5 (Use hash index; do not write result to temp) (Index Nested Loops, with pipelining ) (On-the-fly) (Hash index on sid) (Clustered hash index on bid) Does the hash index on sid need to be clustered? NO , since there is at most 1 matching Sailors tuple per a Reserves tuple! Why?
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Using Indexes What if indexes are available on Reserves and Sailors? Reserves Sailors sid=sid bid=100 sname (On-the-fly) rating > 5 (Use hash index; do not write result to temp) (Index Nested Loops, with pipelining ) (On-the-fly) (Hash index on sid) (Clustered hash index on bid) Cost = 1.2 I/Os (if A(1)) or 2.2 (if A(2))
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Using Indexes What if indexes are available on Reserves and Sailors? Reserves Sailors sid=sid bid=100 sname (On-the-fly) rating > 5 (Use hash index; do not write result to temp) (Index Nested Loops, with pipelining ) (On-the-fly) (Hash index on sid) (Clustered hash index on bid) Why not pushing this selection ahead of the join? It would require a scan on Sailors!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
What is the I/O cost of the following evaluation plan? The I/O Cost of the New Q Plan Reserves Sailors sid=sid bid=100 sname (On-the-fly) rating > 5 (Use hash index; do not write result to temp) (Index Nested Loops, with pipelining ) (On-the-fly) (Hash index on sid) (Clustered hash index on bid) 10 I/Os Cost = 1.2 I/Os for 1000 Reserves tuples; hence, 1200 I/Os Total Cost = 10 + 1200 = 1210 I/Os
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Comparing I/O Costs: Recap Total Cost = 501, 000 I/Os Reserves Sailors sid=sid bid=100 sname (On-the-fly) rating > 5 (Scan; write to temp T1) (Scan; write to temp T2) (Sort-Merge Join) Reserves Sailors sid=sid bid=100 rating > 5 sname (Simple Nested Loops) (On-the-fly) (On-the-fly) (File Scan) (File Scan) Total Cost = 4060 I/Os Reserves Sailors sid=sid bid=100 sname (On-the-fly) rating > 5 (Hash index) (Index Nested Loops, with pipelining ) (On-the-fly) (Hash index on sid) Total Cost = 1210 I/Os
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
But, How Can we Ensure Correctness? Canonical form Still the same result! How can this be guaranteed? Reserves Sailors sid=sid bid=100 rating > 5 sname Reserves Sailors sid=sid bid=100 sname rating > 5
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Outline A Brief Primer on Query Optimization Evaluating Query Plans Relational Algebra Equivalences Estimating Plan Costs Enumerating Plans
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
SOEN 363 - Data Systems for Software Engineers Lecture 17: Query Optimization Essam Mansour
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Today… Last Session: Query Optimization Today s Session: Query Optimization
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
DBMS Layers Query Optimization and Execution Relational Operators Files and Access Methods Buffer Management Disk Space Management DB Queries Transaction Manager Lock Manager Recovery Manager
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Outline A Brief Primer on Query Optimization Evaluating Query Plans Relational Algebra Equivalences Estimating Plan Costs Enumerating Plans
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Relational Algebra Equivalences A relational query optimizer uses relational algebra equivalences to identify many equivalent expressions for a given query Two relational algebra expressions over the same set of input relations are said to be equivalent if they produce the same result on all relations’ instances Relational algebra equivalences allow us to: Push selections and projections ahead of joins Combine selections and cross-products into joins Choose different join orders
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
RA Equivalences: Selections Two important equivalences involve selections: 1. Cascading of Selections: 2. Commutation of Selections: c cn c cn R R 1 1 ... ... c c c c R R 1 2 2 1 Allows us to combine several selections into one selection OR : Allows us to replace a selection with several smaller selections Allows us to test selection conditions in either order
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
RA Equivalences: Projections One important equivalence involves projections: Cascading of Projections: This says that successively eliminating columns from a relation is equivalent to simply eliminating all but the columns retained by the final projection! R R an a a ... 1 1
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
RA Equivalences: Cross-Products and Joins Two important equivalences involve cross-products and joins: 1. Commutative Operations: This allows us to choose which relation to be the inner and which to be the outer! (R × S) (S × R) (R S) (S R)  
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
RA Equivalences: Cross-Products and Joins Two important equivalences involve cross-products and joins: 2. Associative Operations: This says that regardless of the order in which the relations are considered, the final result is the same! R × (S × T) (R × S) × T  R (S T) (R S) T    R (S T) (T R) S     It follows: This order-independence is fundamental to how a query optimizer generates alternative query evaluation plans
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
RA Equivalences: Selections, Projections, Cross Products and Joins Selections with Projections: Selections with Cross-Products: This says we can commute a selection with a projection if the selection involves only attributes retained by the projection! )) ( ( )) ( ( R R a c c a R S c ) ( S R c This says we can combine a selection with a cross-product to form a join ( as per the definition of a join )!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
RA Equivalences: Selections, Projections, Cross Products and Joins Selections with Cross-Products and with Joins: S R c S R c ) ( ) ( This says we can commute a selection with a cross-product or a join if the selection condition involves only attributes of one of the arguments to the cross-product or join! S R c S R c ) ( ) ( Caveat : The attributes mentioned in c must appear only in R and NOT in S
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
RA Equivalences: Selections, Projections, Cross Products and Joins Selections with Cross-Products and with Joins ( Cont’d ): ) ( 3 2 1 ) ( S R c c c S R c This says we can push part of the selection condition c ahead of the cross-product! ))) ( 3 ( 2 ( 1 S R c c c )) ( 3 ) ( 2 ( 1 S c R c c This applies to joins as well!
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
RA Equivalences: Selections, Projections, Cross Products and Joins Projections with Cross-Products and with Joins: ) ( 2 ) ( 1 ) ( S a R a S R a Intuitively, we need to retain only those attributes of R and S that are either mentioned in the join condition c or included in the set of attributes a retained by the projection ) ( 2 ) ( 1 ) ( S a c R a S c R a )) ( 2 ) ( 1 ( ) ( S a c R a a S c R a
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
How to Estimate the Cost of Plans? Now that correctness is ensured, how can the DBMS estimate the costs of various plans? Canonical form Reserves Sailors sid=sid bid=100 rating > 5 sname Reserves Sailors sid=sid bid=100 sname rating > 5
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Next Class Query Optimization and Execution Relational Operators Files and Access Methods Buffer Management Disk Space Management DB Queries Transaction Manager Lock Manager Recovery Manager Continue…
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
NoSQL Database Management Systems Data Systems for Software Engineers Essam Mansour Essam Mansour (Concordia) SOEN 363 1 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Outline and Learning Outcomes - What is NoSQL ? - The Evolution of NoSQL - Scalability and CAP theorem - BASE principle - Advantages and Disadvantages - Classification of NoSQL Databases - Indexing Structure for NoSQL databases - Cloud Spanner Essam Mansour (Concordia) SOEN 363 2 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Outline - What is NoSQL ? - The Evolution of NoSQL - Scalability and CAP theorem - BASE principle - Advantages and Disadvantages - Classification of NoSQL Databases - Indexing Structure for NoSQL databases - Cloud Spanner Essam Mansour (Concordia) SOEN 363 3 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Essam Mansour (Concordia) SOEN 363 4 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Why NoSQL? Essam Mansour (Concordia) SOEN 363 5 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Why NoSQL? For decades, relational database management systems (MySQL, Post- greSQL, SQL Server, Oracle) have been considered as the one-size-fits- all solution for providing data persistence and its retrieval for decades. The ever increasing need for scalability and new application require- ments have created new challenges for traditional RDBMS. At some stage, there has been some dissatisfaction with this one-size- fits-all approach in deploying the data storage tier for large scale online web services. Essam Mansour (Concordia) SOEN 363 6 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
What is NoSQL? An umbrella term for all database management systems that don’t follow the popular and well established RDBMS principles and often relate to large data sets accessed and manipulated on a Web scale 1 . NoSQL = N ot O NLY SQL NoSQL implies that more than one storage mechanism could be used based on the needs. 1 Tiwari, S. (2011). Professional NoSQL. John Wiley & Sons. Essam Mansour (Concordia) SOEN 363 7 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
What is NoSQL? NoSQL is not a single product or even a single technology. It represents a class of products and a collection of diverse, and sometimes related, concepts about data storage and manipulation. NoSQL database systems represent a new generation of low-cost, high performance database software which is increasingly gaining more and more popularity. Essam Mansour (Concordia) SOEN 363 8 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
How it is different? Don’t use SQL as a query language. Essam Mansour (Concordia) SOEN 363 9 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
How it is different? No Fixed Schema Essam Mansour (Concordia) SOEN 363 10 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
How it is different? No Join Operations Expensive Operation for combination of records from two or more tables Requires fixed Schema and Strong Consistency Essam Mansour (Concordia) SOEN 363 11 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
How it is different? Allows Distributed, Fault-tolerant Architecture NoSQL systems allow you to store your database on multiple nodes and maintain high-speed performance. Allows Linear Scalability When you add more nodes, you get a consistent increase in performance. Essam Mansour (Concordia) SOEN 363 12 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Outline - What is NoSQL ? - The Evolution of NoSQL - Scalability and CAP theorem - BASE principle - Advantages and Disadvantages - Classification of NoSQL Databases - Indexing Structure for NoSQL databases - Cloud Spanner Essam Mansour (Concordia) SOEN 363 13 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Essam Mansour (Concordia) SOEN 363 14 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Brief History MultiValue Databases at TRW in 1965 DBM - AT&T in 1979 Carlo Strozzi used ”NoSQL” Term to name a lightweight open source relational database with a different standard SQL inter- face in 1998 Graph Database - Neo4j in 2000 Distributed Key/Value Store Google BigTable in 2004 Document Database - CouchDB started in 2005 Cloud Key/Value Data Store - Amazon Dynamo Document Database - MongoDB in 2007 Facebook Cassandra Project in 2008 Project Voldemort by LinkedIn in 2008 Term NoSQL has been defined and reintroduced in 2009 Essam Mansour (Concordia) SOEN 363 15 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
NoSQL Evolution World Wide Webs (Web 2.0) Companies started to face scalability issues with the huge growing data and infrastructure. Many companies came up with their own solutions to these problems with new technologies like Google: BigTable 2 Apache: Cassandra 3 Amazon: DynamoDB 4 2 https://cloud.google.com/bigtable/ 3 http://cassandra.apache.org/ 4 https://aws.amazon.com/dynamodb/ Essam Mansour (Concordia) SOEN 363 16 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
NoSQL Evolution The number of NoSQL Database Management Systems increased every year with main goals on performance , reliability , consistency , and enhancing search and read performance 5 . The success of the first NoSQL DBMS initiated the development of more similar open-source and proprietary database systems. Currently, there is a huge variety of cloud database systems available Commercial : Amazon Simple DB, Amazon DynamoDB, Microsoft Azure Table Storage. Open Source : HBase, Cassandra, Voldemort, CouchDB, MongoDB 5 Sakr, S., Liu, A., Batista, D. M., Alomari, M. (2011). A survey of large scale data management approaches in cloud environments . IEEE Communications Surveys & Tutorials. Essam Mansour (Concordia) SOEN 363 17 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Where is NoSQL now? BigTable Voldemort Cassandra HBase HBase Essam Mansour (Concordia) SOEN 363 18 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Outline - What is NoSQL ? - The Evolution of NoSQL - Scalability and CAP theorem - BASE principle - Advantages and Disadvantages - Classification of NoSQL Databases - Indexing Structure for NoSQL databases - Cloud Spanner Essam Mansour (Concordia) SOEN 363 19 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Limitations of the distributed databases are: Consistency Every node needs to always see the same data value at any given instance Essam Mansour (Concordia) SOEN 363 20 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Limitations of the distributed databases are: Availability The system should continue to operate even if nodes in a cluster crash or some of the hardware/software parts are down at any time. Essam Mansour (Concordia) SOEN 363 21 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Limitations of the distributed databases are: Partition Tolerance The systems continues to operate in the presence of network partitions. Essam Mansour (Concordia) SOEN 363 22 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
CAP Theorem: Any distributed database with shared data, can have at most two of the three desirable properties underlineConsistency , underlineAvail- ability , and underlinePartition Tolerance Essam Mansour (Concordia) SOEN 363 23 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Availability + Partition Tolerance = Not Consistent Essam Mansour (Concordia) SOEN 363 24 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Consistency + Partition Tolerance = Not Available, waiting... Essam Mansour (Concordia) SOEN 363 25 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Availability + Consistency = Not Partitioned Essam Mansour (Concordia) SOEN 363 26 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Outline - What is NoSQL ? - The Evolution of NoSQL - Scalability and CAP theorem - BASE principle - Advantages and Disadvantages - Classification of NoSQL Databases - Indexing Structure for NoSQL databases - Cloud Spanner Essam Mansour (Concordia) SOEN 363 27 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Reliable Database Transactions Control Transaction control is important with respect to performance and con- sistency in distributed computing environments. Two Transaction Control Models are usually used which are: ACID : used in RDBMS BASE : used in many NoSQL Systems The difference between these models is in the amount of effort required by application developers and the location (tier) of the transactional controls. Essam Mansour (Concordia) SOEN 363 28 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Example You have two bank accounts ”Checking” and ”Savings”. You want to transfer 1000$ from ”Savings” to ”Checkings” using the transfer form on the bank website. Essam Mansour (Concordia) SOEN 363 29 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
RDBMS Control Using ACID Atomicity : The transaction must happen as an all or nothing. Consistency : You shouldn’t have a report showing the withdrawal from Savings account without addition to checkings. (ie. Database should block all reports during atomic operations) Isolation : each part of the transaction occurs without knowledge of other parts. Durability : Once the transaction is done, it should be permanent. (If database crashes, all the done transactions must be restored from the last backup points). Essam Mansour (Concordia) SOEN 363 30 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
RDBMS Control Using ACID The main focus of ACID systems is the integrity and consistency of data above any other considerations. Losing the availability by temporarily blocking mechanisms is the only way to ensure the reliability and accuracy of the information given by the system. ACID system is said to be pessimistic as they must consider all possible failure nodes in a computing environment. Essam Mansour (Concordia) SOEN 363 31 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Example You are buying some items from a website that uses the ”Shopping Cart” and ”CheckOut” constructs. The issue here is that if you have used the ACID transaction control system, you may prevent some customers from taking an order and block them for a while so you may lose this customer. Essam Mansour (Concordia) SOEN 363 32 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
NoSQL Control Using BASE Basic Availability : It means that the system can be temporarily in- consistent so that transactions are manageable and all information and services are ”Basically Available”. Soft-State : It means that some inaccuracies are allowed and data may change during usage to reduce the amount of used resources for the sake of availability. Eventual Consistency : It means that at certain point when all the service logic is executed, the system will turn into a consistent state 6 . 6 Vogels, Werner. Eventually consistent . Communications of the ACM (2009) Essam Mansour (Concordia) SOEN 363 33 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
NoSQL Control Using BASE The main focus of BASE systems is the availability of services above any other considerations. We should allow new data to be stored all time even at the risk of being out of sync for a short period of time 7 . BASE systems tend to be simpler and faster than ACID as they don’t require handling of many events by locking and unlocking resources like in case of ACID systems BASE system is said to be optimistic as it is assumed that all systems will eventually catch up and become consistent. 7 Pritchett, Dan. BASE: An ACID alternative . ACM Queue (2008 Essam Mansour (Concordia) SOEN 363 34 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Reliable Database Transactions Control Essam Mansour (Concordia) SOEN 363 35 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Outline - What is NoSQL ? - The Evolution of NoSQL - Scalability and CAP theorem - BASE principle - Advantages and Disadvantages - Classification of NoSQL Databases - Indexing Structure for NoSQL databases - Cloud Spanner Essam Mansour (Concordia) SOEN 363 36 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Advantages of NoSQL Databases: Simple and Flexible Structure = Schema Free Based on Key-Value Pairs Some Store types include Column Store, Document Store, Object Store, XML Store, Graph Store, etc. It can allow storage of serialized objects into the database. Open Source NoSQL databases don’t need expensive licensing fees and can run on inexpensive hardware. Expansion is always cheaper and easier than relational databases. It depends on Horizontal scaling by distributing the load over more nodes. On the contrary, Relational databases scale vertically by replac- ing main host with a more powerful one. Essam Mansour (Concordia) SOEN 363 37 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Disadvantages of NoSQL Databases: They don’t support the reliability features of relational databases and ACID transaction control systems (Atomicity, Consistency, Isola- tion, Durability). They trade consistency for the sake of availability, performance and scalability. Each type of NoSQL databases always has a limited number of ap- plications. Incompatibility with the SQL queries and the need of a manual/propriet querying language which adds more complexity. They don’t support Joins/ Group By/ Order By/ ACID transac- tions. Essam Mansour (Concordia) SOEN 363 38 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
SQL Vs NoSQL Essam Mansour (Concordia) SOEN 363 39 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Essam Mansour (Concordia) SOEN 363 40 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Essam Mansour (Concordia) SOEN 363 41 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Outline - What is NoSQL ? - The Evolution of NoSQL - Scalability and CAP theorem - BASE principle - Advantages and Disadvantages - Classification of NoSQL Databases - Indexing Structure for NoSQL databases - Cloud Spanner Essam Mansour (Concordia) SOEN 363 42 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Classification of NoSQL Databases Essam Mansour (Concordia) SOEN 363 43 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Classification of NoSQL Databases Essam Mansour (Concordia) SOEN 363 44 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Classification of NoSQL Databases Essam Mansour (Concordia) SOEN 363 45 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Querying Data in NoSQL Syntax Varies (No standard interface or language) No set-based query language terms of constraints. Procedural Program Languages such as Java, C, etc. Application specifies retrieval path. No query Optimizer Essam Mansour (Concordia) SOEN 363 46 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
A directed Graph structure is used to represent the data. The graph includes a set of nodes, each represents an object . Some pairs of objects are connected by links (edges) representing relations between objects. It is used typically in social networking applications. It allows developers to focus on relations between objects rather than the objects themselves. They are powerful for graph queries like (shortest path, connected components, etc.) Example: Neo4J 8 8 https://neo4j.com/ Essam Mansour (Concordia) SOEN 363 47 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Example: Essam Mansour (Concordia) SOEN 363 48 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Neo4j uses Cypher as a declarative graph query language It allows expressive and efficient querying and updating of a property graph. Cypher is a relatively simple but still very powerful language in which very complicated database queries can easily be expressed through Cypher. Essam Mansour (Concordia) SOEN 363 49 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
CREATE The command CREATE will create a new node/edge CREATE (a:Studentname:’Tarun’) - > (NODE) CREATE (a:Studentname:’Bob’)- > (NODE) CREATE (b:Cityname:’Tartu’, located In:’Estonia’)- > (NODE) CREATE (c:Coursename:’Data Management’)- > (NODE) CREATE (b) < -[:Live In]-(a)-[:Studysemester:’spring’]- > (c) - > (EDGE) Essam Mansour (Concordia) SOEN 363 50 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
READ The MATCH command will retrieve all the nodes with specific relation MATCH (a)-[b]- > (x) RETURN a,b,x Essam Mansour (Concordia) SOEN 363 51 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
UPDATE The MERGE command will merge some nodes together after matching. The SET command will update/insert some data to a node/edge after matching. MATCH (n:Cityname:’Tartu’) SET n.population:’100k’ MATCH (n:Studentname:’Tarun’), (m:Studentname:’Bob’) MERGE (n)-[:Friendship]- > (m), (m)-[:Friendship]- > (n) Essam Mansour (Concordia) SOEN 363 52 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
DELETE The DELETE command will delete an edge/node after making a Match MATCH e = (n:Studentname:’Bob’)-[r]-(n:Studentname:’Tarun’) DELETE e MATCH (n:Studentname:’Bob’) DELETE n Essam Mansour (Concordia) SOEN 363 53 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
CREATE Very similar to SQL commands The INSERT command will perform an insert operation into a table. INSERT INTO students (idNo, name, city) VALUES (87787, Tarun, Tartu); Essam Mansour (Concordia) SOEN 363 54 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
READ The SELECT command will retrieve data from a given table. These keywords help in forming the SELECT queries: WHERE : This keyword will specify the location where data is to be selected. FROM : This keyword will specify the table to select from. Examples: SELECT * FROM students SELECT city, name FROM students SELECT name FROM students WHERE city = ’tartu’ Essam Mansour (Concordia) SOEN 363 55 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
UPDATE The UPDATE command will update values of a certain key in the table. These keywords help in forming the update queries: Where : This keyword will specify the location where data is to be updated. Set : This keyword will specify the updated value. Must : This keyword includes the columns composing the primary key. Example: UPDATE students SET city = Tallinn WHERE idNo = 87787 Essam Mansour (Concordia) SOEN 363 56 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
DELETE The DELETE command will delete an entry from a given table. These keywords help in forming the DELETE queries: WHERE : This keyword will specify the location where data is to be deleted. FROM : This keyword will specify the table to delete from. Example: DELETE FROM students WHERE city = ’tartu’ Essam Mansour (Concordia) SOEN 363 57 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
A hybrid of RDBMS and Key-Value stores. Data is stored in columns as opposed to being stored in rows as in Relational database management systems. It consists of one or more column families that group certain columns in the database and each key is used to identify and point to one family. Essam Mansour (Concordia) SOEN 363 58 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Each column includes tuples of names and values comma sepa- rated. Fast Data Aggregation Rows that correspond to a single column are stored as a single disk entry which leads to faster access during read/write operations. Example: HBase 9 9 https://hbase.apache.org/ Essam Mansour (Concordia) SOEN 363 59 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Example: Country Product Sales US Alpha 3000 US Beta 1250 ES Alpha 700 UK Alpha 450 Essam Mansour (Concordia) SOEN 363 60 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Essam Mansour (Concordia) SOEN 363 61 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
CREATE ”Create” will create new table. Eg: create ’ < table name > ’, < column family name > ’, ’ < column family name > ’, ... ”Put” will insert data into a table. Example: create ’student’, ’personal data’, ’courses’ Put ’student’,’87787’, ’personal data:name’,’Tarun’ Put ’student’,’87787’, ’personal data:city’,’Tartu’ Put ’student’,’87787’, ’courses:spring’,’Data Management’ Essam Mansour (Concordia) SOEN 363 62 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
READ The ”get” command will retrieve data from a given table based on some parameters The ”scan” command will retrieve all data from a given table get ’students’, ’87787’ get ’students’, ’87787’,COLUMN= > ’personal data:city’ scan ’students’ Essam Mansour (Concordia) SOEN 363 63 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
UPDATE The ”put” command will insert/overwrite/ update entries in a table. put ’students’, ’87787’, ’personal data:city’, ’tallinn’ Essam Mansour (Concordia) SOEN 363 64 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
DELETE The ”delete” command will delete an entry document from table. delete ’students’, ’87787’, ’courses:spring’ Essam Mansour (Concordia) SOEN 363 65 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
It is similar to key-value stores in that they are schema-less and have the same advantages/disadvantages. The value of each key stored in the database is a ”document” with many possible different encodings like XML, JSON, BSON (Binary Encoded JSON). Documents themselves can contain many different key-value pairs, key- array pairs, or nested documents. Documents can be indexed which leads to outperformance of traditional file systems. Example: MongoDB 10 10 https://www.mongodb.com/ Essam Mansour (Concordia) SOEN 363 66 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Example: Essam Mansour (Concordia) SOEN 363 67 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
CREATE The command ”db.collection.insert()” will perform an insert operation into a collection of a document. Essam Mansour (Concordia) SOEN 363 68 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
READ The find() command will retrieve all the documents of the given collection. db.collection name.find() If a document is to be retrieved based on some criteria. Then, we should pass some parameters in the find() command db.students.find(”idNo”:”87787”) Essam Mansour (Concordia) SOEN 363 69 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
UPDATE If a document is to be retrieved based on some criteria. Then, we should pass some parameters in the update()command, and set the new attributes using $set The update() command will retrieve all the documents of the given collection. db.collection name.update() db.student.update( ”idNo”: ”87787” ,$set: ”name”:”Tarun”) Essam Mansour (Concordia) SOEN 363 70 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
DELETE db.collection name.remove() The remove() command will delete an entry document from a given collection. db.students.remove(”idNo”:”87787”) Essam Mansour (Concordia) SOEN 363 71 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
NoSQL Databases 11 11 http://nosql-database.org/ Essam Mansour (Concordia) SOEN 363 72 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Outline - What is NoSQL ? - The Evolution of NoSQL - Scalability and CAP theorem - BASE principle - Advantages and Disadvantages - Classification of NoSQL Databases - Indexing Structure for NoSQL databases - Cloud Spanner Essam Mansour (Concordia) SOEN 363 73 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Indexing Structures for NoSQL Databases The process of associating a key with the location of a corresponding data record in a Database management system is called indexing . The three of the most common methods used in the process of indexing in NoSQL databases are: B-Tree Indexing T-Tree Indexing O2-Tree Indexing Essam Mansour (Concordia) SOEN 363 74 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
One Size Does Not Fit All There has been a big debate between the proponents of the NoSQL and RDBMS camps which is centered around the right choice for im- plementing online transaction processing systems. RDBMS proponents think that the NoSQL camp has not spent suffi- cient time to understand the foundation of the transaction model, The NoSQL camp argues that the domain specific optimization oppor- tunities of NoSQL systems gives more flexibility to the developers. However, they admit that making optimization decision requires a lot of experience and can be very error-prone and dangerous if not done by experts. Essam Mansour (Concordia) SOEN 363 75 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Outline - What is NoSQL ? - The Evolution of NoSQL - Scalability and CAP theorem - BASE principle - Advantages and Disadvantages - Classification of NoSQL Databases - Indexing Structure for NoSQL databases - Cloud Spanner Essam Mansour (Concordia) SOEN 363 76 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Cloud Spanner 13 Spanner is Google’s globally distributed NewSQL database 12 . Cloud Spanner is the only enterprise-grade, globally-distributed , and strongly consistent database service built for the cloud specifically to combine the benefits of relational database structure with non- relational horizontal scale. This combination delivers high-performance transactions and strong consistency across rows, regions, and continents with an industry- leading 99.999% availability SLA, no planned downtime, and enterprise- grade security. Cloud Spanner revolutionizes database administration and management and makes application development more efficient. 12 Corbett, James C., et al. Spanner: Google’s globally distributed database . ACM Transactions on Computer Systems (TOCS), 2013 13 https://cloud.google.com/spanner/ Essam Mansour (Concordia) SOEN 363 77 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Cloud Spanner 14 14 https://cloud.google.com/spanner/ Essam Mansour (Concordia) SOEN 363 78 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
NewSQL NewSQL is a combination of SQL and NoSQL 15 . It is a class of modern RDBMS that seeks to provide: The same scalable performance of NoSQL systems for online transaction processing (read-write) workloads. Maintaining the ACID guarantees of a traditional database system. NewSQL systems are relational databases designed to provide: ACID compliance, real-time OLTP (Online Transaction Processing). Conventional SQL-based OLAP (Online Analytic Processing) in Big Data Environments. 15 Grolinger, Katarina, et al. Data management in cloud environments: NoSQL and NewSQL data stores. Journal of Cloud Computing: advances, systems and applications, 2013 Essam Mansour (Concordia) SOEN 363 79 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
NewSQL These Systems break through conventional RDBMS performance lim- its by employing NoSQL style features such as column-oriented data storage and distributed architectures, or by employing technologies like in-memory processing, symmetric multiprocessing (SMP), or massively parallel processing (MPP) Essam Mansour (Concordia) SOEN 363 80 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
NewSQL Essam Mansour (Concordia) SOEN 363 81 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Current Market Status RDBMS RDBMS Data Warehouse 1985-1995 1995-2010 2010-Now Data Warehouse RDBMS NoSQL Essam Mansour (Concordia) SOEN 363 82 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
What’s next? Essam Mansour (Concordia) SOEN 363 83 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
The End Essam Mansour (Concordia) SOEN 363 84 / 84
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help

Browse Popular Homework Q&A

Q: When crickets chirp 126 times a minute, it is about 67 degrees Fahrenheit. When they chirp 175 times…
Q: (a) Show whether the operator of the kinetic energy operator K and the momentum operator Px for a…
Q: Use the properties of logarithms to express the following logarithms in t and z. a. log, 5xy? b. log
Q: The vector OP shown in the figure has a length of 6 cm. Two sets of perpendicular axes, x-y and…
Q: 1. Evaluate the limit: 5 x lim x-5 (x-5)2 2. Evaluate the limit: TC 1000 lim x-7- (x-7)2 x-8+ (8-…
Q: Equilibrium solution . Stable equilibrium solution (a) Find the general solution to the differential…
Q: A plane is flying at a constant altitude of 1 mile and a speed of 400 mph toward an observer on the…
Q: 2. Solve for the following rates, treating all variables as functions of t (a) Solve for dr dt (b)…
Q: The current population of a small town is 1714 people. It is believed that town's population is…
Q: A manufacturer estimates that its product can be produced at a total cost of C(x) = 50,000 + 100x +…
Q: Evaluate the following definite integral. Show all intermediate steps. ∫12   [ 2xe^(x^2+1) +…
Q: Let f(x)=(x^4-162x^2)/4. Use the definition of a derivative or the derivative rules to find f'(x)=…
Q: Circle which of the phylogenetic trees is/are different from the one at the top:
Q: A two-evaporator compression refrigeration system as shown in Fig. P11–120E uses refrigerant-134a as…
Q: Let O -4 Compute the indicated matrix. (If this is not possible, enter DNE in any single blank). A +…
Q: A data set lists earthquake depths. The summary statistics are n = 400, x = 5.92 km, s=4.67 km. Use…
Q: Use the properties of logarithms to express the following logarithms in t and z. a. log, 5xy? b. log
Q: General intelligence can be measured from tasks that only assess how much someone can memorize True…
Q: Yn+1 Yn-1 + 34. In-1 +4Jn + Jn+1], is proposed. By utilizing the model problem, y' = Xy, (a)…
Q: Using arithmetic shifting as shown in 8 bits storage using one's and two's complement, perform the…
Q: 3 -1 Let u = 14| and A = -5 7 Is u in the plane in R³ spanned by the columns of A? Why or why not?…
Q: Find the area of the shaded region. f(x) = 3x + 2x2 - x°, g(x) = 0 The area is (Type an integer or a…