Final_exam

pdf

School

Northeastern University *

*We aren’t endorsed by this school

Course

6700

Subject

Industrial Engineering

Date

Feb 20, 2024

Type

pdf

Pages

5

Uploaded by DeaconMorningLlama40

Report
IE 7374 Data Management for Analytics Final Examination (160 minutes) Points Problem 1 (page 2) 15 Problem 2 (page 2) 10 Problem 3 (page 3) 20 Problem 4 (page 3-4) 25 Problem 5 (pages 4-5) 15 Problem 6 (pages 5) 15 Total 100 Instructions: e This exam is a closed-book, closed-notes exam. Books and notes are NOT allowed. A two-page (two sheets of standard A4 paper, both sides) cheat sheet is allowed. e Electronics of any kind (smartphones, laptop computers, tablets, calculators, etc.) is NOT allowed. Write your answer neatly, clearly, and legibly with good organization. e Keep your exam sheet and answers to yourself. Peeking, copying, and sharing are strictly prohibited. e Write your name on the exam sheet. e Trips to restrooms must be short and unaccompanied by smartphones, laptop computers, and tablets. e Upon completion, please sign the following honor code, organize your answer and the signed honor code into a SINGLE PDF file, and upload the PDF file to blackboard. Please Sign the Honor Code: On my honor, I pledge to uphold the values of honesty, integrity, and respect that are expected of me as a Northeastern student. Student Name: Signature:
IE 7374 Data Management for Analytics Problem 1 [15 points]. Considering the following EER model PICKUP-DATE-TIME @ e @ @ OPERATED BY ACC-LOCATION = ACCIDENT | 1.1 1..M CUSTOMER e |2 points] Identify derived attribute type if there is any e |3 points] Identify relationship attribute types if there is any e |5 points] Identify existence dependent entity types if there is any e |5 points] Identify EER specializations if there 1s any and specify the disjointness constraints and completeness constraints whenever necessary Problem 2 [10 points]. The following ER model features a ternary relationship “BOOKING”. Model this ternary relationship with binary relationship types since ternary relationship is not supported in the designated DBMS, and draw the updated ER model (relationships and appropriate cardinalities must be specified clearly) BOOKING [ TOURIST ]O"N & 0"Nr HOTEL ] R > TRAVEL AGENCY
IE 7374 Data Management for Analytics Problem 3 [20 points]. Normalize the following unnormalized relations (underline primary keys, italic foreign key) to 3NF e [10 points] Create necessary relations to bring it to 3NF with primary key and foreign keys clearly specified for each relation. Note that it is unnecessary to fill in the tables with data. OrderiD Orcer Customer Customer Customer ProcuctD Product Product Product Ordered Date D Name Address Descripon ~ Finish ~ StandardPrice Quarsty 1008 102412010 2 Vae Plano, TX 7 Dinng Naturd 80000 2 Fumiture Table Ash 5 Wiriter's Chery 325.00 2 Desk 4 Entertinment Naturd 650,00 1 Canter Maple 1007 101252010 6 Fumture Boulder, 1" 4-Dr Oak 500.00 B Gdley CO Dresser 4 Entertarment Naturd 650.00 3 Canter Maple e [10 points] Create necessary relations to bring it to 3NF with primary key and foreign keys clearly specified for each relation. Note that it is unnecessary to fill in the tables with data. (StudentName is the primary key in the following). StudentName StudentPhone Student-Dept Student-DeptName AdvisorName AdvisorPhone Advisor-Dept Term Tolstoy 593-3824 21 English Caplice 253-3233 ESD Fall Thoreau 644-2343 21 English Caplice 253-3233 ESD Fall James 534-2534 21 English Lapide 253-1111 ESD Fall Woolf 643-5436 18 Mathematics Toomre 253-6322 Mathematics Spring Shakespeare 634-6344 8 Physics Smith 253-8453 Physics Spring Pushkin 534-9832 7 Biology Griffith 253-9833 Bio Engr Spring Problem 4 [25 points]. There are six tables describing a company, describing employees, departments, buildings, which department(s) an employee works in (and a percentage of the time for each), department managers (possibly more than one per department), and in which building an employee works (an employee may have more than one office). The primary key of each table is the attribute(s) in capitals. Other attributes are not necessarily unique. Samples of the tuples for each table are shown below: EMP 100,000 tuples IN_DEPT - 110,000 tuples EID | EName | Salary | Start Date | End_Date EID DID Percent_Time 001 Jane | $124,000 3/1/93 null 001 101 100 002 Jim $32,000 2/29/96 null 002 102 100 003 John $99,000 12/12/98 null 003 101 60 004 Joe $55,000 2/2/92 null 003 102 40 005 | Jenny $51,000 5/5/95 null 004 103 100 EID values range from 1 to 100,000 005 103 100 IN_BUILDING 110.000 tuples EID BID BUILDING - 2,000 tuples 001 201 BID BName Address 002 201 201 ATC 1600 Ampitheatre 003 202 202 cccC 500 Crittenden 003 203 203 MFB 123 Shoreline 004 202 BID values range from 1 to 2,000 005 203
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
IE 7374 Data Management for Analytics DEPT 1,000 tuples MANAGES_DEPT - 800 tuples DID DName Annual_Budget EID DID 101 Research $1,001,000 003 101 102 Development $500,000 003 102 103 Sales $2,000,000 001 103 104 HR $250,000 DID values range from 1 to 1000 a. [5S points] Describe what the following query retrieves SELECT D.DName FROM DEPT D, IN_DEPT I, EMP E WHERE D.DID = I.DID AND E.EID = I.EID AND E.Salary >= ALL (SELECT Salary FROM EMP) b. more than 10 employees work c. [S points] Write a query to retrieve the names (DName) of the departments where no employees work [S points] Write a query to retrieve building ID (BID) and building name (BName) of the buildings where d. [5 points] Write a query to retrieve the employee ID (EID) and names (EName) who work in the departments with the top 2 highest annual budget (Annual_Budget) e. |[S points] Write a query to retrieve department ID (DID), department name (DName), total number of employees working in the department, average salary of those employees who work in the department, and annual budget per employee for the department. Restrict to those departments where there is at least one employee Problem S [15 points]. The metadata structure of a social graph shown below is for a book-reading club, modeling genres, books, and readers: Reader FRIEND_OF Some examples of the relations are created as follows using Cypher queries: Create (Bart:Reader {name: ‘Bart Baesens’, age:32}) Create (Seppe:Reader {name: ‘Seppe vanden Broucke’, age:30}) Create(Fantasy:Genre {name: ‘fantasy’}) Create(Education:Genre {name: ‘education’}) Creat(b@1:Book {title: ‘My First Book’})
IE 7374 Data Management for Analytics Creat(b@2:Book {title: ‘A Thriller Unleashed’}) Create (b01)-[ :IS_GENRE] ->(Education), (b02)-[ :IS_GENRE] ->(Thriller), Create (Bart) -[ :FRIEND_OF]->(Seppe), ..(Seppe) - [ :FRIEND OF]->(Wilfried), Ereat (Bart) -[ :LIKES] ->(b@1), ..(Seppe) -[ :LIKES]->(be2).. Write Cypher queries to answer the following questions: e [4 points] Who are Seppe’s friends that liked education books (Seppe’s name: Seppe vanden Broucke)? e |5 points] Can you recommend fantasy books that Bart’s friends liked and Bart has not liked (Bart’s name: Bart Baesens)? e |6 points] Can you get a list of all pairs of friends that like at least 2 humor books in common? Problem 6 [15 points]. A collection, “people”, is created in a MongoDB and sample of the items are shown below: { "_id" : ObjectId("5ef372a476cd570d3ed1f923"), "name" : "Seppe", "restaurant_id" : 23, "rating" : 5 } { " _id" : ObjectId("5ef372a476cd570d3ed1f924"), "name" : "Seppe", "restaurant_id" : 89, "rating" : 6 } { "_id" : ObjectId("5ef372a476cd570d3ed1f925"), "name" : "Seppe", "restaurant_id" : 182, "rating" : 6 } { " _id" : ObjectId("5ef372a476cd570d3ed1f92a"), "name" : "Wilfried", "restaurant_id" : 153, "rating" : 4 } { " _id" : ObjectId("5ef372a476cd570d3ed1f92b"), "name" : "Wilfried", "restaurant_id" : 23, "rating" : 4 } { "_id" : ObjectId("5ef372a476cd570d3ed1f92c"), "name" : "Wilfried", "restaurant_id" : 94, "rating" : 4 } e [10 points] Write a query using a map-reduce pipeline to get the largest (maximum) rating for each restaurant id. e |5 points] Write a query to get the largest (maximum) rating for each restaurant_id without using the map-reduce pipeline