Class Material

pdf

School

California State University, East Bay *

*We aren’t endorsed by this school

Course

620

Subject

Information Systems

Date

Oct 30, 2023

Type

pdf

Pages

19

Uploaded by AmbassadorBeaver2154

Report
Data Warehousing Name_______________________________ Provide brief answers to the following questions. 1. ABC Inc. is a retailer. Their sales for the month of January in 2018 included 10000 sales transactions, each sales transaction consisting of 2 order line items on average. ABC Inc’s data warehouse includes a sales star schema with the SalesFact table having a granularity of 1 row per order line item. The retailer had only one active promotion during this period and it was newly introduced in January 2018, 50% off coupon for Cheese. There were 300 order line items where the coupon was used. When their data warehouse is updated with new data for January 2018, how many new rows will be added to the (1) SalesFact table (2) Promotion Dimension table and (3) Date Dimension to capture January 2018 sales? The SalesFact table has a granularity of 1 row per order line item, and there were 10000 sales transactions with 2 order line items each, so the SalesFact table would have 20,000 rows for January 2018 sales. There was only one active promotion during January 2018, and it was newly introduced during that period, so the Promotion Dimension table would need only 1 new row to capture this promotion. January 2018 is a single month, so the Date Dimension table would need only 1 new row to capture this month. 2. A dimensional model for an auction process consists of a FactAuction table, DimDate, DimSeller, DimBuyer, DimProduct. What would be the primary key, foreign key(s) and attributes in the FactAuction table? The only measure of interest is the Auction Amount. DimDate DimProduct DateKey (PK) ProductKey (PK) DimBuyer BuyerKey (PK) DimSeller SellerKey (PK) FactAuction The primary key of the FactAuction table would be a composite key consisting of foreign keys from the DimDate, DimSeller, DimBuyer, and DimProduct tables. Specifically: DateKey (FK) SellerKey (FK) BuyerKey (FK) ProductKey (FK) The FactAuction table would also have the following attribute(s): The primary key of the FactAuction table would be a composite key consisting of foreign keys from the DimDate, DimSeller, DimBuyer, and DimProduct tables. Specifically:
Data Warehousing Name_______________________________ DateKey (FK) SellerKey (FK) BuyerKey (FK) ProductKey (FK) The FactAuction table would also have the following attribute(s): Auction Amount (the only measure of interest) Auction Amount (the only measure of interest) 3. Given below are three transactions (orderline items). How would the data be represented in a data warehouse? Please illustrate by entering the transactions below into the Fact and Dimension tables given below. 1. John Doe (CustomerNo 24) purchased “Chlorox Wipes” (ProductNo 2) for $2 (Sales Amount) on April 23, 2016 2. John Doe (CustomerNo 24) purchased “Dasani Water 1QT” (ProductNo 3) for $2 (Sales Amount) on April 23, 2016 3. Jane Smith (CustomerNo 26) purchased “ Dasani Water 1QT (ProductNo 3) for $2 Mixing multiple granularities in a single fact table is not recommended. Give two reason why. Explain with an example. 4. The data can be represented in the data warehouse as follows: 5. DimDate DateKey Day Month Year 20160423 23 04 2016 20160424 24 04 2016
Data Warehousing Name_______________________________ 6. DimProduct ProductKey ProductNo ProductName 2 Chlorox Wipes Wipes 3 Dasani Water 1QT Water 7. DimCustomer CustomerKey CustomerNo CustomerName 24 24 John Doe 26 26 Jane Smith 8. FactSales
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Data Warehousing Name_______________________________ DateKey ProductKey CustomerKey SalesAmount 20160423 2 24 2 20160423 3 24 2 20160424 3 26 2 9. Note that in the FactSales table, there is a row for each order line item. So, the first two transactions have two rows each in the fact table, while the third transaction has only one row. The primary key in each dimension table is the key attribute (e.g., DateKey in DimDate). The foreign key in the fact table is the key attribute in each dimension table (e.g., DateKey, ProductKey, and CustomerKey). The fact table also includes the measure of interest (i.e., SalesAmount).
Data Warehousing Name_______________________________ 4. Find two most critical errors and propose a solution to correct the errors for the following Product Dimension Table. Attribute Description ProductNo Business Key and Primary Key ProductName Name of Product ProductCategory Category of Product ProductInventory Current inventory (Quantity on hand) of product StartDate Start date of row effective period EndDate End date of row effective period Two critical errors in the Product Dimension Table are: Missing attribute: There is no attribute for the unit price of the product. This information is critical for sales analysis and forecasting. Solution: Add an attribute named "UnitPrice" to the Product Dimension Table. Overlapping time periods: The StartDate and EndDate attributes define the effective period of a row, but there is a possibility of overlapping time periods. For example, if a row for a product is updated with a new inventory value, but the EndDate of the previous row has not been updated, then there will be two rows with overlapping time periods. Solution: Implement a Type 2 Slowly Changing Dimension (SCD) approach to track changes to the Product Inventory attribute. This involves creating a new row in the Product Dimension Table for each change, with a new StartDate and EndDate that ensures no overlapping time periods. The previous row will have its EndDate updated to reflect the effective end date of the change. Two critical errors in the Product Dimension Table are: The ProductInventory attribute is not a descriptive attribute but a metric. Metrics are numerical measures that can be aggregated or used for calculations, while descriptive attributes provide context and help to filter or group data. Therefore, ProductInventory should be moved to the Fact table where it belongs. Solution: Remove the ProductInventory attribute from the Product Dimension Table and add it as a measure to the appropriate Fact table(s) where it is relevant.
Data Warehousing Name_______________________________ The StartDate and EndDate attributes are used to track changes in the product dimension over time, but they do not allow for continuous tracking of changes that may occur within the same day. This means that any updates to a product during the day will not be captured by this design. Solution: Instead of using StartDate and EndDate attributes, a more appropriate approach would be to use a timestamp to record changes to the Product dimension table. This will allow for continuous tracking of changes in the Product dimension at a granular level, including changes that occur within the same day. Additionally, the ProductNo attribute should be marked as a Business Key and Primary Key to ensure that it uniquely identifies each product in the dimension table. Use the Pharmacy Data Warehouse Dimensional Model and related information provided below to answer Q6 and Q7 Description of select fields PharmaSalesFact.POSTransactionNo: A unique number generated for each transaction by the point of sales register at the pharmacy. PharmaSalesFact.RegularUnitPrice: The price per unit of product PharmaSalesFact.ExtendedSalesDollar: The netprice of the product multiplied by quantity sold. PharmaSalesFact.ProfitMargin: The profit margin for the specific transaction line item calculated based on calculated based on the ExtendedSalesDollarAmount and ExtendedGrossProfit DoctorDim.DoctorID: A unique sequential integer created for each record inserted into the DoctorDim table DoctorDim.LicenseNo: The medical practice license number of the Physician issued by the Medical Board. The number is unique to each Physician licensed to practice. 5. For each of the attributes below, specify if it is one or more of Additive, Non-Additive, Semi- Additive, Degenerate Dimension, Surrogate Key, Business or Natural Key . Attribute Attribute Type(s)
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Data Warehousing Name_______________________________ PharmaSalesFact.POSTransactionNo PharmaSalesFact.RegularUnitPrice PharmaSalesFact.ExtendedSalesDollar PharmaSalesFact.ProfitMargin DoctorDim.DoctorID DoctorDim.LicenseNo 6. Consider the Pharmacy Data Warehouse Dimensional Model. The company now wants to also track, total number of unique products(drugs) purchased in a transaction and customer wait time for each visit to the pharmacy. The wait time is measured as the difference between the time at which the customer checks in at the pharmacy with their prescription and the check- out time at which all products in the prescription are dispensed. There can be multiple products prescribed within one prescription. Option 1 is adding the number of products and wait time measure to the existing Fact table Option 2 is to create a new fact table Which option would you recommend and why? A real estate brokerage agency helps individuals buy and sell homes. A customer approaches the agency to either buy or sell a home and an agent works with the customer to help with the process. A home sale transaction includes information on home buyer, home seller, buyer’s agent, sellers agent, home property, list date, sold date, list price, sold price, buyer agent commission, seller agent commission. A real estate agent can acts as a buyer’s agent in some transactions and as a sellers agent in others. The agency is also interested in selling quicker (reducing days between list date and sold date). 7. Assuming a grain of one row per home sale transaction, identify all the relevant dimensions. 7. Relevant dimensions for a home sale transaction could be: Home Buyer Home Seller Buyer’s Agent Seller’s Agent
Data Warehousing Name_______________________________ Home Property Date (with attributes such as Day, Month, Year) Location (with attributes such as City, State, Zip Code) Agency (with attributes such as Agency Name, Agent Name) 8. Relevant measures that can be included in the fact table for a home sale transaction could be: List Price Sold Price Buyer Agent Commission Seller Agent Commission Days between List Date and Sold Date Profit Margin (Sold Price - List Price) Home Property Details (such as Square Footage, Number of Bedrooms/Bathrooms, Lot Size, Year Built) 8. Assuming a grain of one row per home sale transaction, identify all the relevant measures that can be included in the fact table. 9. A 4 bedroom home that was sold in 2014 is back on the market for sale again. However, a new bedroom has been added to the home since the last sale and it now has 5 bedrooms. The data warehouse includes information about the home and corresponding sales transaction in 2014. Specify whether bedroom information is best included in a fact table or a dimension table. Also describe how the data warehouse should be updated to reflect that the specific home now has 5 bedrooms. Bedroom information is best included in a dimension table as it is a characteristic of the home and not a measure. To reflect the fact that the home now has 5 bedrooms, the dimension table for the home property should be updated. Specifically, the row for the specific home should be updated to reflect the new bedroom count. If there is a surrogate key for the home property dimension, it should remain the same to maintain consistency with any existing fact tables. If there is no surrogate key and the primary key is a natural key, the primary key may need to be updated to reflect the change in bedroom count. Any related fact tables should also be updated to reflect the new bedroom count for the specific home.
Data Warehousing Name_______________________________ The bedroom information is best included in the dimension table. In this case, the dimension table would be the Home Property Dimension which would include attributes such as home address, number of bedrooms, square footage, lot size, etc. To update the data warehouse to reflect that the specific home now has 5 bedrooms, the corresponding record in the Home Property Dimension table needs to be updated. The number of bedrooms attribute for that specific home would be changed from 4 to 5. Additionally, if there are any existing fact tables that reference the Home Property Dimension table, such as the Home Sale Transaction Fact table, those tables would also need to be updated to reflect the new bedroom count for the specific home. 4. Mixing multiple granularities in a single fact table is not recommended. Give two reason why. Explain with an example. Mixing multiple granularities in a single fact table is not recommended for the following reasons: 1. Data inconsistencies: Different levels of granularity can result in data inconsistencies when aggregating facts at different levels. For example, consider a fact table that combines sales data at the individual customer level and at the state level. If we want to calculate the total sales for a particular state, we can sum up the sales for each customer in that state. However, if we then want to calculate the total sales for a particular city within that state, we can't simply sum up the sales for each customer in that city, since some customers may have purchased from multiple cities within the state. 2. Query performance: Combining multiple levels of granularity in a single fact table can negatively impact query performance, since it can result in large, unwieldy tables that are difficult for the database engine to handle. For example, a fact table that combines sales data at the individual customer level and at the state level would be much larger than a fact table that only includes sales data at the state level. 4. Mixing multiple granularities in a single fact table is not recommended. Give two reason why. Explain with an example. Mixing multiple granularities in a single fact table is not recommended for the following reasons: 1. Data inconsistencies: Different levels of granularity can result in data inconsistencies when aggregating facts at different levels. For example, consider a fact table that combines sales data at the individual customer level and at the state level. If we want to calculate the total sales for a particular state, we can sum up the sales for each customer in that state. However, if we then want to calculate the total sales for a particular city within that state, we can't simply sum up the sales for each customer in that city, since some customers may have purchased from multiple cities within the state. 2. Query performance: Combining multiple levels of granularity in a single fact table can negatively impact query performance, since it can result in large, unwieldy tables that are difficult for the database engine to handle. For example, a fact table that combines sales data at the individual customer level and at the state level would be much larger than a fact table that only includes sales data at the state level
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Data Warehousing Name_______________________________ Design a star schema using the 4 step dimensional modeling process for the following business processes. In your presentation (a) State the business process (b)Granularity in the form of 1 row per..(c) Dimensions, and (d) Facts 1. Restaurant Orders (Group 1 & 2) 2. Hotel Bookings (Group 3 & 4) 3. Movie Theater Ticket Sales (Group 5 & 6) 4. Fuel Sales at Gas Station (Group 7 & 8) 5. Stock Trading (at Brokerage) (Group 9 & 10) 1. Restaurant Orders: (a) Business process: Restaurant orders (b) Granularity: 1 row per order (c) Dimensions: Time: Date, Hour, Minute Location: Restaurant location Customer: Customer ID, Name, Gender, Age Order: Order ID, Order date, Order time, Order type (dine-in, takeout, delivery), Payment method Menu item: Item ID, Item name, Item category, Item price (d) Facts: Sales: Total sales amount (in dollars), quantity sold Discount: Total discount given (in dollars) Cost: Total cost of food (in dollars) 2. Hotel Bookings: (a) Business process: Hotel bookings (b) Granularity: 1 row per booking (c) Dimensions: Time: Check-in date, Check-out date, Check-in hour, Check-out hour Location: Hotel location Customer: Customer ID, Name, Gender, Age, Address, Phone number Room: Room number, Room type, Room rate (d) Facts: Revenue: Total revenue generated (in dollars) Occupancy: Number of rooms occupied Length of stay: Average length of stay (in days) 3. Movie Theater Ticket Sales: (a) Business process: Movie theater ticket sales (b) Granularity: 1 row per ticket sold (c) Dimensions: Time: Date, Hour, Minute Location: Theater location Movie: Movie ID, Movie name, Movie genre, Movie rating Customer: Customer ID, Name, Gender, Age Ticket: Ticket ID, Ticket type (adult, child, senior), Ticket price (d) Facts: Revenue: Total revenue generated (in dollars) Attendance: Number of tickets sold Concession sales: Total amount of concession sales (in dollars)
Data Warehousing Name_______________________________ 4. Fuel Sales at Gas Station: (a) Business process: Fuel sales at gas station (b) Granularity: 1 row per transaction (c) Dimensions: Time: Date, Hour, Minute Location: Gas station location Customer: Customer ID, Name, Gender, Age Fuel: Fuel type (regular, premium, diesel), Fuel price Pump: Pump ID, Pump location (d) Facts: Revenue: Total revenue generated (in dollars) Volume: Volume of fuel sold (in gallons) Profit: Total profit made from fuel sales (in dollars) 5. Stock Trading (at Brokerage): (a) Business process: Stock trading at brokerage (b) Granularity: 1 row per trade (c) Dimensions: Time: Trade date, Trade time Brokerage account: Account number, Account type Security: Security ID, Security name, Security type (stock, bond, option) Market: Market ID, Market name Trade: Trade ID, Trade type (buy, sell), Trade price (d) Facts: Revenue: Total revenue generated (in dollars) Commission: Total commission charged (in dollars) Gain/Loss: Total gain or loss from trades (in dollars) What are some of the key attributes (about top 10) you would include in a Date Dimension table for a Retail company? Why? A Date Dimension table is a critical component of a data warehouse for any retail company. It contains a comprehensive list of dates and their associated attributes, which allows for easy and efficient analysis of sales data over time. Some of the key attributes that should be included in a Date Dimension table for a retail company are: 1. Date Key: A unique identifier for each date in the dimension. 2. Date: The actual date represented by the Date Key. 3. Day of Week: The day of the week, e.g., Monday, Tuesday, etc. 4. Week Number: The number of the week in the year. 5. Month: The name of the month. 6. Quarter: The quarter of the year. 7. Year: The year. 8. Holiday Indicator: A binary flag indicating whether the day is a holiday or not. 9. Season: The season of the year, e.g., winter, spring, summer, or fall. 10. Fiscal Year: The fiscal year of the company, which may differ from the calendar year.
Data Warehousing Name_______________________________ 11. Weekday/Weekend/Seasom These attributes are important because they allow for the analysis of sales data across various time periods, such as weeks, months, quarters, and years. They also provide information about special events or holidays that may affect sales, as well as seasonal trends in consumer behavior. The inclusion of a fiscal year attribute is especially important for financial reporting and budgeting purposes. Overall, a comprehensive Date Dimension table can provide valuable insights into the performance of a retail company over time and help inform strategic decision-making. A Retail store uses four types of promotions Ads, Price Reduction, displays, and coupons. Create promotion dimension table(s) that can capture the following promotions. a. 20% price reduction on all Diary products. (Jan 1 Jan 7, 2021) b. Dairy Products advertised in Weekly Flyer Ad (Jan 1 Jan 7, 2021) c. Superbowl Radio Ad on snacks and beer selection (Feb 5 and Feb 7, 2021) d. Superbowl snacks front aisle display (Feb 5 Feb 7, 2021 ) To capture the four types of promotions described, we can create a single Promotion Dimension table with the following attributes: 1. Promotion Key: A unique identifier for each promotion record. 2. Promotion Type: A categorical attribute indicating the type of promotion. This can take on the values "Ad," "Price Reduction," "Display," or "Coupon." 3. Promotion Name: A descriptive name for the promotion. 4. Start Date: The date when the promotion begins. 5. End Date: The date when the promotion ends. 6. Product Category: The category of products that the promotion applies to. This can take on values such as "Dairy," "Snacks," or "Beer." 7. Product Subcategory: A more specific category for the product, such as "Cheese" for the Dairy category. 8. Discount Percentage: For Price Reduction promotions, this attribute indicates the percentage discount applied to the product price. 9. Ad Campaign Name: For Ad promotions, this attribute indicates the name of the campaign where the promotion was advertised. 10. Display Location: For Display promotions, this attribute indicates the physical location where the display was set up, such as "Front Aisle" for the Superbowl snacks display.
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Data Warehousing Name_______________________________ A Retail Store currently has a star schema with the following tables. The store recently introduced a loyalty program (e.g. like the Safeway card) and has begun to capture customer information and award points for purchases. How would you extend the schema so the new information can be added to the schema. To extend the star schema to include the new information from the loyalty program, you will need to add a new dimension table and a fact table to the existing schema. The new dimension table will be a customer dimension table, which will store information about the customers such as their name, address, email, phone number, loyalty program ID, etc. This table will be linked to the existing fact table (sales fact table) through a foreign key (customer ID). The new fact table will be a loyalty points fact table, which will store information about the points awarded to customers for their purchases. This table will contain the following columns: customer ID, date of purchase, total amount spent, and the number of points awarded for the purchase. This table will be linked to the customer dimension table through the customer ID foreign key. The extended star schema will look like: Sales fact table (existing) product key store key date key sales amount Customer dimension table (new) customer key customer name address email phone number loyalty program ID
Data Warehousing Name_______________________________ Loyalty points fact table (new) customer key date key total amount spent points awarded With this new schema, you can now analyze the loyalty program's impact on sales and understand which customers are the most loyal and valuable to the store. To extend the star schema to include the new loyalty program information, we need to add one or more tables that will store the customer information and their associated loyalty points. Here's an example of how you could extend the schema: Create a new dimension table called "Customer" that contains customer information such as name, address, phone number, email address, and any other relevant information. Add a new column to the "Fact_Sales" table called "Customer_Key". This column will be a foreign key that references the "Customer" dimension table. Create a new fact table called "Fact_Loyalty_Points" that will store information about the loyalty program. This table will contain the following columns: Customer_Key: A foreign key that references the "Customer" dimension table. Date: The date that the loyalty points were awarded. Points: The number of points awarded to the customer. Connect the "Fact_Loyalty_Points" table to the existing star schema by adding a foreign key reference to the "Customer" dimension table. With these changes, you can now analyze the sales data by customer and also track the effectiveness of the loyalty program by analyzing the loyalty points data. For example, you could use the schema to answer questions like: Which customers have earned the most loyalty points? What is the average number of loyalty points earned per customer?
Data Warehousing Name_______________________________ Are customers who participate in the loyalty program more likely to make repeat purchases than those who don't? E1. UsedCarInc is a used car dealer. When a customer arrivers with a car to sell, a dealer agent inspects the car and makes an offer to purchase the car. After purchasing, the car is cleaned and is made available for sale to other customers who come to buy used cars. Any dealer agent can be involved in either the purchase or sale process. Develop a dimensional model to support their main car purchase and sale process. A car purchase by dealer transaction includes information on the seller, dealer agent (employee), car, purchase price, and date. A car sale by dealer includes information on buyer, car, dealer agent (employee), sale price, and date. The company wants to increase profit (sale price minus purchase price) and have higher profit margins, and lower the number of days between car purchase by dealer and car sold by dealer (increase inventory turnover). 1. What is the business process of interest here? 2. What are the different levels of granularity present? 3. What are the dimensions for each grain identified in #2? 4. What are the facts for each grain in #2 ? 5. Specify if VIN Number and Mileage are included in the model and the most appropriate table where they can be included 6. Would an accumulating fact table apply in this case? 7. Are there any role playing dimensions in this case? 1. The business process of interest is the car purchase and sale process at UsedCarInc, where dealer agents inspect cars and make offers to purchase them, and then sell the purchased cars to customers. 2. There are two levels of granularity present in this process: Car purchase by dealer: This grain captures the transaction-level details for each car purchased by the dealer, including information on the seller, dealer agent, car, purchase price, and date. Car sale by dealer: This grain captures the transaction-level details for each car sold by the dealer, including information on the buyer, car, dealer agent, sale price, and date. 3. The dimensions for each grain identified in #2 are as follows: Car purchase by dealer: Seller dimension: Contains information on the seller, such as name, address, phone number, and any other relevant information. Dealer agent dimension: Contains information on the dealer agent, such as name, job title, and any other relevant information.
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Data Warehousing Name_______________________________ Car dimension: Contains information on the car, such as make, model, year, color, and any other relevant information. Date dimension: Contains information on the date of the transaction, such as day, month, year, and any other relevant information. Car sale by dealer: Buyer dimension: Contains information on the buyer, such as name, address, phone number, and any other relevant information. Dealer agent dimension: Contains information on the dealer agent, such as name, job title, and any other relevant information. Car dimension: Contains information on the car, such as make, model, year, color, and any other relevant information. Date dimension: Contains information on the date of the transaction, such as day, month, year, and any other relevant information. 4. The facts for each grain in #2 are as follows: Car purchase by dealer: Purchase price: The amount paid by the dealer to purchase the car from the seller. Car sale by dealer: Sale price: The amount paid by the buyer to purchase the car from the dealer. 5. VIN Number and Mileage could be included in the Car dimension table, as they are both attributes of the car that are relevant to both the purchase and sale processes. 6. An accumulating fact table could be used in this case to track the inventory turnover of the purchased cars. This would involve capturing the inventory count (i.e., the number of cars in inventory) at the end of each day, and then subtracting the inventory count from the previous day to determine the number of cars sold. 7. Yes, the Dealer agent dimension could be a role playing dimension, as a dealer agent can be involved in both the purchase and sale processes. This means that the same dimension table can be used to represent the
Data Warehousing Name_______________________________ dealer agent in both the Car purchase by dealer and Car sale by dealer fact tables. 1. The business process of interest is the car purchase and sale process of the used car dealer. 2. There are two different levels of granularity present: Car purchase by dealer Car sale by dealer 3. The dimensions for each grain identified in #2 are: Car purchase by dealer: Seller Dealer agent (employee) Car Date Car sale by dealer: Buyer Car Dealer agent (employee) Date 4. The facts for each grain in #2 are: Car purchase by dealer: Purchase price Car sale by dealer: Sale price 5. VIN Number and Mileage could be included in the Car dimension table, as they are attributes specific to the car being purchased and sold. 6. Yes, an accumulating fact table could apply in this case, as we want to track the inventory turnover over time. An accumulating fact table can track changes in inventory over time, such as the purchase and sale of cars. 7. Yes, there are role-playing dimensions in this case. The "Dealer agent" dimension is a role-playing dimension, as it can appear in both the Car purchase and Car sale fact tables, but it represents different roles in each case.
Data Warehousing Name_______________________________ SalesEmployeeDimention Table Field Description SalesEmployeeSSN This is the employee id number and is the same as employee social security number. Since the employee SSN is unique to each employee, it does not change, and can be represented in integer format, the employee SSN is used in the transactional database as employee id number, and is retained in the dimension table as dimension table primary key. SalesEmployeeName This is the First Name, Last Name of the employee. Over course of employment, there are cases where either employee’s first name or last name or both have changed. SalesEmployeePosition This is the position title of the sales employee and can take different values such as sales associate, senior sales associate, account executive etc. Over course of employment as a sales person, an employee may hold different position titles over time. SalesDistrict The sales territory is organized into a hierarchy of regions and districts. The district and region to which a sales employee is assigned may change over time. Each sales district belongs to a region. However, the sales territory can be re-organized and the assignments of districts to regions can also vary over time. SalesRegion A sales region consists of sales districts. The assignment of sales districts to sales regions can change when there are reorganizations. Such reorganizations are infrequent. In-Class Discussion Questions 1. What are the pros/cons of modeling the changes to employee dimension as type 1, 2 or 3 SCD. What is the best way? Can it be implemented using the current dimension design? What changes would need to be made to the dimension table design? 2. Assume that the organization has decided to model all the changes to employee dimension as type 2 SCD. At the next quarterly update of the data warehouse, the ETL analyst informs you that EmployeeNo 123456789 has had both a change in SalesDistrict and change in SalesEmployeePosition. Should these changes be added to the dimension table as one new row that captures both changes, or two new rows with one row for Sales Position Change and one row for district change? 3. The SalesEmployeeDimension table has been normalized into two different dimension tables as follows SalesEmployeeDim(SalesEmployeeSSN, SalesEmployeeName, SalesEmployeePosition, SalesDistrictID) SalesDistrict (SalesDistrictID, SalesDistrict, SalesRegion) . How would the SCD implementation selected in Q1 above vary for the denormalized and normalized versions of the dimension tables? 1. The choice of SCD type for modeling changes to the employee dimension depends on the specific business requirements and needs of the organization. Type 1 SCD involves updating the dimension record with the new information
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Data Warehousing Name_______________________________ and overwriting the old information. This method is useful when the history of the dimension changes is not important or can be easily reconstructed. Type 2 SCD involves creating a new record for each change and keeping the old record intact. This method is useful when it is important to track the history of changes to the dimension over time. Type 3 SCD involves creating a new attribute to track a specific change to the dimension while keeping the old attribute intact. This method is useful when only certain changes need to be tracked over time. If the organization decides to model all changes to the employee dimension as type 2 SCD, the design of the dimension table would need to be changed to include additional columns to track the start and end dates of each record. This would enable the tracking of changes over time and facilitate the querying of historical data. 2. In a type 2 SCD implementation, it is best to add two new rows for the changes to the SalesEmployeePosition and SalesDistrict, as each change represents a separate event and needs to be tracked separately. This would ensure that the historical information is retained and can be queried as needed. 3. In a denormalized version of the SalesEmployeeDimension table, the SCD implementation would be the same as in Q1, as the table still contains all the attributes of the employee dimension. In a normalized version, such as the one given in the question, the SCD implementation would need to be modified to include start and end dates for each record in both the SalesEmployeeDim and SalesDistrict tables. This would enable the tracking of changes over time and facilitate the querying of historical data.

Browse Popular Homework Q&A

Q: impany considering the acquisition of a machine that costs $543,000. The machine is expected to have…
Q: hich of the following statements about flux are true? The unit of flux is Vm²/C When turning a…
Q: Identify gene 1( bl, pr, vg) Identity gene 2 ( bl, pr, vg) Identify gene 3 ( bl, pr, vg)
Q: Evaluate the integral. integral   x 4  + 3x 2 +1      /  x 5 +5 x 3 +5x  dx
Q: In the resonance structure for the carbonyl group the charges on the atoms are
Q: Assume that FrameBody does not have excess capacity and therefore would lose sales if the frames…
Q: 5. Use the balanced equation for the conversion of salicylic acid (C₂H603) and acetic acid (C₂H4O₂)…
Q: etermine if the following compounds are soluble or insoluble in water: sodium acetate barium…
Q: ELODEA IN THE AQUARIUM WATER FROM CONTAINER Draw what the Elodea cells looked like when placed in…
Q: The __________ is the point defining the geometric center of an object.     center of mass…
Q: Describe the end behavior of the given function. 2 P(x)=√11x + 5x² - 3x + 3 What is the end…
Q: Testing: H0:p=0.67 H1:p<0.67 Your sample consists of 138 subjects, with 92 successes. Calculate…
Q: Is the retailer Target focusing on utilitarian or hedonic consumer benefits (or both)? Does the…
Q: 12. Dinosaur fossils are too old to be reliably dated using carbon-14. (See Exercise 11.) Suppose we…
Q: Consider the following mechanism for the formation of tert-butanol: (CH3) CBr(aq) (CH3), C+ (aq)…
Q: Linkage Mapping Using a Trihybrid Testcross in Fruit Flies Remember the black, purple, and vestigial…
Q: Use the method of Lagrange multipliers to minimize the function subject to the given constraint.…
Q: me value of k for which f(x) Sx² - kx if x ≤- 3-kx² if x>- [B][C] - [D] -3 [E] -1
Q: Given the following non-regular pentagon find the measure of the unknown angle. 139° x = 58° 155 134…
Q: You consume music (M) and concert tickets (C). Your utility function is U(M, C) = M1/4C3/4. The…
Q: Which "problematic" property of the covariance does the correlation "correct?" 1)Lack of…
Q: Assume that adults have IQ scores that are normally distributed with a mean of 97.8 and a standard…