TakeHomeExam-1_CPRE_414

pdf

School

Iowa State University

Course

414

Subject

Computer Science

Date

Dec 6, 2023

Type

pdf

Pages

7

Uploaded by SargentGoatMaster933

MANI KANTA LOKESH MEGHANA KORADA

PART-1

1. Describe briefly the concept of non-strict hierarchies. What is a typical problem in such settings?

Non-strict: a hierarchy is non-strict when a dimension has many-to-many relationships between a child level and its parent level.

Non-covering: a hierarchy is non-covering when a lower-level member can exist without a matching member in the higher level to roll up to. Such hierarchies are also referred to as ragged hierarchies and are often discussed together with unbalanced hierarchies.

In a non-strict (or partial) hierarchy, elements can belong to more than one category or level at the same time, and there are no rigid, exclusive links between categories. In other words, a member of a non-strict hierarchy may have more than one parent.

An organisational analogy illustrates the risks: misunderstanding and confusion if it is unclear to whom to report, a reduced sense of accountability because workers may have multiple supervisors, and the danger of power disputes in the absence of a clear framework.

A typical problem in non-strict hierarchies is uncertainty about where an item sits in the hierarchy, which makes it difficult to organize, categorize, and find information or objects. In a data warehouse the same ambiguity shows up during roll-up: a measure attached to a child with several parents is counted once per parent, so aggregated totals can be double-counted. Consider, for instance, a non-strict hierarchy for online retail goods in which a "smartphone" might fall under more than one category, such as "Electronics," "Mobile Devices," or "Communication." The item cannot be assigned to a single, clear category, which complicates user navigation and search.

Non-strict hierarchies appear in many real-world situations, including file system organization, e-commerce product classification, and news website article classification. Additional tactics such as tagging, keywords, and search algorithms may be used to ease these difficulties and improve organization and item retrieval.
https://www.nibusinessinfo.co.uk/content/flat-organisational-structure#:~:text=The%20common%20disadvantages%20of%20a,absence%20of%20a%20formal%20system
https://cyberleninka.org/article/n/1023245.pdf
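The roll-up problem in a non-strict hierarchy can be illustrated with a minimal Python sketch. The product names, sales amounts, and the even-split weighting are all assumptions made up for illustration; they do not come from the exam. The sketch compares a naive roll-up, which counts a child once per parent, with a roll-up that distributes the measure across parents.

```python
from collections import defaultdict

# Hypothetical data: one sales amount per product, and a many-to-many
# mapping from product to category (a non-strict hierarchy).
sales = {"smartphone": 100.0, "laptop": 200.0}
categories = {
    "smartphone": ["Electronics", "Mobile Devices"],
    "laptop": ["Electronics"],
}

def naive_rollup(sales, categories):
    """Add the full amount to every parent category (double counts)."""
    totals = defaultdict(float)
    for product, amount in sales.items():
        for category in categories[product]:
            totals[category] += amount
    return dict(totals)

def weighted_rollup(sales, categories):
    """Split each amount evenly across its parents (an assumed weighting)."""
    totals = defaultdict(float)
    for product, amount in sales.items():
        share = amount / len(categories[product])
        for category in categories[product]:
            totals[category] += share
    return dict(totals)

naive = naive_rollup(sales, categories)
weighted = weighted_rollup(sales, categories)

print("true total:    ", sum(sales.values()))
print("naive total:   ", sum(naive.values()))
print("weighted total:", sum(weighted.values()))
```

The naive grand total exceeds the true total because the smartphone is counted under both of its parents, while the weighted variant preserves the total. Even splitting is only one possible choice; a real design would pick weights that reflect the data.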
2. Explain briefly the constellation logical schema of DW.

A fact constellation schema describes a logical structure of a data warehouse or data mart. It can be designed with a collection of de-normalized fact tables together with shared, conformed dimension tables. It is a sophisticated design in which summarizing information is difficult. It supports online analytical processing (OLAP) techniques such as drill-down, roll-up, and pivoting, and historical data is processed in both summary and detailed form.

OLAP is applied to data marts or warehouses. Its main goal is to facilitate the ad hoc querying required to support decision support systems (DSS). An OLAP application is based on the multidimensional view of data. OLAP is not a data structure or schema; rather, it is an operational view. Because OLAP systems are complicated, a multidimensional representation of the data is necessary.

In a constellation schema:

Fact tables: As in a star schema, the fact tables sit in the middle and contain numerical, quantitative data or measurements, such as sales revenue or quantities sold. Each fact table is connected to multiple dimension tables.

Dimension tables: As in a star schema, dimension tables store the descriptive attributes that give the measures in the fact tables context. These attributes support data grouping, filtering, and analysis. Compared with star schema dimension tables, constellation schema dimension tables may be more normalized.

Constellation of dimension tables: In contrast to the star schema, a dimension table in a constellation schema can be linked directly to several fact tables, so common dimensions are shared among the fact tables. This flexibility allows more intricate data relationships and analyses.

As in a snowflake schema, dimension tables in a constellation schema can be further normalized by dividing them into hierarchies or sub-dimensions, which results in a more complex structure. This helps cut down on data redundancy and can be useful for managing big, intricate data warehouses.

https://www.javatpoint.com/data-warehouse-what-is-fact-constellation-schema#:~:text=Fact%20Constellation%20Schema%20describes%20a,is%20difficult%20to%20summarize%20information
https://www.geeksforgeeks.org/fact-constellation-in-data-warehouse-modelling/
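The defining feature of a constellation, several fact tables sharing dimension tables, can be sketched with Python's built-in sqlite3 module. All table names, column names, and rows below (DimProduct, DimTime, FactSales, FactShipping) are hypothetical and are not taken from Fig. 1; this is only one way such a schema might look.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Shared (conformed) dimensions, hypothetical names
CREATE TABLE DimProduct (ProductKey INTEGER PRIMARY KEY, ProductName TEXT);
CREATE TABLE DimTime    (TimeKey    INTEGER PRIMARY KEY, Year INTEGER);

-- Two fact tables that reference the same dimension tables
CREATE TABLE FactSales (
    ProductKey INTEGER REFERENCES DimProduct(ProductKey),
    TimeKey    INTEGER REFERENCES DimTime(TimeKey),
    Revenue    REAL
);
CREATE TABLE FactShipping (
    ProductKey   INTEGER REFERENCES DimProduct(ProductKey),
    TimeKey      INTEGER REFERENCES DimTime(TimeKey),
    UnitsShipped INTEGER
);

INSERT INTO DimProduct VALUES (1, 'Laptop'), (2, 'Phone');
INSERT INTO DimTime    VALUES (1, 2022), (2, 2023);
INSERT INTO FactSales    VALUES (1, 1, 1000.0), (1, 2, 1500.0), (2, 2, 700.0);
INSERT INTO FactShipping VALUES (1, 2, 3), (2, 1, 5), (2, 2, 2);
""")

# Aggregate each fact table separately per product, so that rows from one
# fact table do not multiply rows from the other.
rows = conn.execute("""
SELECT p.ProductName,
       (SELECT SUM(s.Revenue)      FROM FactSales s
         WHERE s.ProductKey = p.ProductKey) AS TotalRevenue,
       (SELECT SUM(h.UnitsShipped) FROM FactShipping h
         WHERE h.ProductKey = p.ProductKey) AS TotalUnits
FROM DimProduct p
ORDER BY p.ProductName
""").fetchall()

for row in rows:
    print(row)
```

Because both fact tables use the same ProductKey, measures from different business processes can be compared along the shared dimension without duplicating the dimension data.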
3. Assume that you have a generalized hierarchy in which the split-paths are again merging at some level. What happens to the key from that level when MultiDim is translated into a logical (say, snowflake) model?

When a generalized hierarchy whose split paths merge at some level is translated into a logical model such as a snowflake schema, the following points apply:

Merging levels: In a generalized hierarchy, multiple paths may converge at a certain level. In a snowflake schema, that level is represented by a single dimension table, which reduces redundancy and maintains data consistency.

Primary keys: The table for the merging level has one primary key that uniquely identifies each of its members. Because several paths lead into this level, that key is exported as a foreign key to the table of each child level that joins it, one per path. The key should be chosen carefully to ensure data integrity and consistency.

Snowflaking: If the hierarchy is complex and has many attributes or sub-levels, the dimension tables can be normalized by breaking them down into sub-dimensions or hierarchies, which helps with data management and reduces redundancy.

Relationships: In a snowflake schema, relationships between dimension tables and fact tables are typically one-to-many. The relationships between tables must reflect the merged paths of the hierarchy correctly.

Additional tables: Depending on the complexity of the hierarchy and the attributes needed, additional dimension tables or sub-dimensions may be created to capture the unique characteristics of each path within the merged hierarchy.
https://www.integrate.io/blog/snowflake-schemas-vs-star-schemas-what-are-they-and-how-are-they-different/
https://www.sciencedirect.com/topics/computer-science/generalization-hierarchy
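A snowflake-style translation of a generalized hierarchy with merging paths can be sketched with sqlite3 as follows. The levels used here (Customer splitting into a Profession path and a Sector path that merge again at Branch) and every column name are assumptions chosen for illustration, not the hierarchy from the exam. The point shown is that the key of the merging level appears as a foreign key in the table of each path that leads into it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Merging level: one table, one primary key
CREATE TABLE Branch (
    BranchKey  INTEGER PRIMARY KEY,
    BranchName TEXT NOT NULL
);
-- Each alternative path carries the merging level's key as a foreign key
CREATE TABLE Profession (
    ProfessionKey  INTEGER PRIMARY KEY,
    ProfessionName TEXT NOT NULL,
    BranchKey      INTEGER NOT NULL REFERENCES Branch(BranchKey)
);
CREATE TABLE Sector (
    SectorKey  INTEGER PRIMARY KEY,
    SectorName TEXT NOT NULL,
    BranchKey  INTEGER NOT NULL REFERENCES Branch(BranchKey)
);
-- Splitting level: one nullable foreign key per path plus a
-- discriminating attribute that says which path a member follows
CREATE TABLE Customer (
    CustomerKey   INTEGER PRIMARY KEY,
    CustomerName  TEXT NOT NULL,
    CustomerType  TEXT NOT NULL CHECK (CustomerType IN ('Person', 'Company')),
    ProfessionKey INTEGER REFERENCES Profession(ProfessionKey),
    SectorKey     INTEGER REFERENCES Sector(SectorKey),
    CHECK ((CustomerType = 'Person'  AND ProfessionKey IS NOT NULL AND SectorKey IS NULL)
        OR (CustomerType = 'Company' AND SectorKey IS NOT NULL AND ProfessionKey IS NULL))
);

INSERT INTO Branch     VALUES (1, 'Health');
INSERT INTO Profession VALUES (1, 'Nurse', 1);
INSERT INTO Sector     VALUES (1, 'Hospitals', 1);
INSERT INTO Customer   VALUES (1, 'Alice', 'Person', 1, NULL);
INSERT INTO Customer   VALUES (2, 'CarePlus', 'Company', NULL, 1);
""")

# Roll every customer up to the merging level, whichever path it follows.
rows = conn.execute("""
SELECT c.CustomerName, b.BranchName
FROM Customer c
LEFT JOIN Profession p ON p.ProfessionKey = c.ProfessionKey
LEFT JOIN Sector     s ON s.SectorKey     = c.SectorKey
JOIN Branch b ON b.BranchKey = COALESCE(p.BranchKey, s.BranchKey)
ORDER BY c.CustomerKey
""").fetchall()

for row in rows:
    print(row)
```

Both customers reach the same Branch row even though they travel through different intermediate tables, which is why the merging level needs only one table and one key.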
4. Explain briefly the concept of centrality in graphs.

Centrality is a concept in graph theory that measures the relative importance or prominence of a node (vertex) within a network or graph. It helps identify which nodes play key roles in connecting and influencing other nodes. Different centrality measures provide different perspectives on node importance. Some common measures are:

Degree centrality: The most basic measure, based on the number of edges (connections) a node has. Nodes with higher degree centrality are considered more central because they are directly connected to more nodes in the network.

Betweenness centrality: Measures how often a node lies on the shortest paths connecting other pairs of nodes. Nodes with high betweenness centrality act as bridges or intermediaries and can regulate the flow of information and resources.

Closeness centrality: Measures a node's proximity to every other node in the network, based on the lengths of the shortest paths. Nodes with high closeness centrality are regarded as central in terms of accessibility because they can reach other nodes in few steps.

Eigenvector centrality: Determines a node's importance from its connections to other important nodes. It takes into account the quality of connections as well as their quantity. Nodes with high eigenvector centrality are connected to other nodes that are themselves highly central.

PageRank: A centrality metric first created by Google to rank web pages in search engine results. It models the probability that a random web surfer (random walker) arrives at a particular page. Nodes with high PageRank are regarded as authoritative and influential.
https://www.turing.com/kb/graph-centrality-measures#:~:text=Centrality%20is%20a%20crucial%20concept,other%20nodes%20in%20a%20graph
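Two of the measures above, degree and closeness centrality, can be computed with a short pure-Python sketch. The five-node undirected graph is a hypothetical example, and the normalizations used (dividing degree by n - 1, and closeness as (n - 1) divided by the sum of shortest-path distances) are common conventions assumed here rather than anything prescribed by the exam.

```python
from collections import deque

def build_graph(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph

def degree_centrality(graph):
    """Number of neighbors divided by the maximum possible (n - 1)."""
    n = len(graph)
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}

def closeness_centrality(graph):
    """(n - 1) divided by the sum of BFS distances to all other nodes."""
    n = len(graph)
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in graph[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        # Nodes that cannot reach the whole graph get a score of 0.0 here.
        scores[source] = (n - 1) / total if len(dist) == n and total > 0 else 0.0
    return scores

# Hypothetical graph: A is a hub, E hangs off D.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("D", "E")]
graph = build_graph(edges)
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)

for node in sorted(graph):
    print(node, round(degree[node], 3), round(closeness[node], 3))
```

In this graph node A ranks highest on both measures, while D has a lower degree than A but still a relatively high closeness because it is the only route to E.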
Part 2: The next 3 questions are actually variations of your homework (part 2). Please refer to Fig. 1 below (the 4th problem in this group has its own figure, Fig. 2).

1. Draw the constellation schema for the logical model of the DW in Fig. 1.

[Figure: constellation schema]
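2. Write a SQL statement for the following query: For each department, the total number of professors involved in research projects during the calendar year 2021.

SELECT d.DepartmentID, COUNT(DISTINCT p.ProfessorId) AS NumProfessors
FROM Department d, Professor p, Research r, Time t
WHERE p.DepartmentID = d.DepartmentID
  AND r.ProfessorId = p.ProfessorId
  AND r.TimeId = t.TimeId
  AND t.Year = 2021
GROUP BY d.DepartmentID

The join columns (Professor.DepartmentID, Research.ProfessorId, Research.TimeId, Time.TimeId) are assumed to match the keys of the logical model for Fig. 1. DISTINCT ensures that a professor involved in several projects is counted once per department.

3. For the conceptual model (i.e., use the cube shown in Fig. 1 as-is), write the MDX statements for the following queries:

Q1: For each department and each funding agency, the total number of Person Months in the projects that started after 2015.

WITH MEMBER [Measures].[PersonMonthsAfter2015] AS
  SUM(FILTER([StartDate].[Calendar].[Year].MEMBERS,
             CInt([StartDate].[Calendar].CurrentMember.Name) > 2015),
      [Measures].[TotalPersonMonths])
SELECT [FundingAgency].[AgencyName].MEMBERS ON COLUMNS,
       [Department].[DepartmentName].MEMBERS ON ROWS
FROM [UniversityDW]
WHERE [Measures].[PersonMonthsAfter2015]

This assumes that in Fig. 1 the project start date is a Time dimension named StartDate with a Calendar hierarchy, that the Year member names are the year numbers, and that TotalPersonMonths is a measure of the cube.

4.

Relation   Support   Confidence
A => A     83.33%    100%
A => C     50%       60%
A => D     50%       60%
C => A     50%       75%
C => C     66.67%    100%
D => A     50%       75%
D => D     66.67%    100%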
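Support and confidence values like those in the table for question 4 follow from two simple ratios: support(X => Y) is the fraction of transactions containing both X and Y, and confidence(X => Y) is that support divided by the support of X. The sketch below applies these definitions in Python. The six transactions are hypothetical: the original data for Fig. 2 is not reproduced here, so they were constructed only so that the resulting figures agree with the table.

```python
# Hypothetical transactions (NOT the data from Fig. 2), chosen so that the
# computed support and confidence agree with the table in the answer.
transactions = [
    {"A", "C", "D"},
    {"A", "C", "D"},
    {"A", "C"},
    {"A", "D"},
    {"A"},
    {"C", "D"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent) / support(antecedent)."""
    both = support(set(antecedent) | set(consequent), transactions)
    return both / support(antecedent, transactions)

rules = [("A", "C"), ("A", "D"), ("C", "A"), ("D", "A")]
for lhs, rhs in rules:
    s = support({lhs, rhs}, transactions)
    c = confidence({lhs}, {rhs}, transactions)
    print(f"{lhs} => {rhs}: support {s:.2%}, confidence {c:.2%}")
```

Note that support is symmetric (A => C and C => A share the same support) while confidence is not, because it is normalized by the antecedent's own frequency.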
Part 3: For the last portion, refer to Figure 3 below. Write the Cypher syntax for the following queries:

1. Retrieve the names of the customers who have purchased chocolate.

MATCH (i:Item {item_name: "Chocolate"})<-[:Item]-(o:Ordered)<-[:User]-(c:Customer)
RETURN c.name;

2. Retrieve the email addresses and the expiration date of the credit cards of customers who have purchased beer.

MATCH (i:Item {item_name: "Beer"})<-[:Item]-(o:Ordered)<-[:User]-(cc:credit_card)<-[:User]-(e:Email)
RETURN cc.expiration, e.`e-address`;

Both queries assume that Ordered is a node label in Figure 3. Property maps use a colon rather than an equals sign, and the property name e-address must be quoted with backticks because it contains a hyphen.