PS3_3_FA23_CSGY6083B

pdf

School

New York University *

*We aren’t endorsed by this school

Course

6083

Subject

Computer Science

Date

Jan 9, 2024

Type

pdf

Pages

14

Uploaded by LieutenantElectron12238

Report
ASSIGNMENT: 3 [100 Points, 5% weight to the Final grades] Please submit your assignment on NYU Brightspace in a single PDF document attachment. Please mention Student ID, Name, Course, Section Number, and date of submission on first page of your submission. Please use SQL file (DDL and DML code) for writing SQLs in Oracle. You may create equivalent DDL and DML code for MySQL and write SQLs in MySQL. All the tables should be created with your initial as a prefix. DO NOT change data in tables. You can write SQL in either Oracle or MySQL. Problem A : 50 points [each question has 10 points] The New York City Subway is a rapid transit system in the New York City boroughs of Manhattan, Brooklyn, Queens, and the Bronx. It is owned by the government of New York City and leased to the New York City Transit Authority, an affiliate agency of the state-run Metropolitan Transportation Authority (MTA). Opened on October 27, 1904, the New York City Subway is one of the world's oldest public transit systems, one of the most used, and the one with the most stations, with 472 stations in operation and 36 train lines. SAMTA (Subway Administration of Metropolitan Transportation Authority) is the NYU affiliated data analytics startup company. SAMTA has undertaken a database project to normalize subway data systems. For relational schema model in figure, write SQL statements to answer following questions. You don’t need to create a relational model. CS-GY 6083 - B, Fall 2023 Principles of Database Systems
i. List all the details of the station which has the highest number of facilities in Elevator. SQL QUERY - WITH BoroughFacilityCounts AS ( SELECT s.station_id, s.station_name, s.station_address, s.borough, s.neighborhood, - f.facility_name, COUNT(sf.facility_id) AS num_facilities, RANK() OVER (ORDER BY COUNT(sf.facility_id) DESC) AS facility_rank FROM ss_station s JOIN ss_station_facility sf ON s.station_id = sf.station_id JOIN ss_facility f ON sf.facility_id = f.facility_id WHERE f.facility_name = 'Elevator' GROUP BY s.station_id, s.station_name, s.station_address, s.borough, s.neighborhood, f.facility_name ) SELECT * FROM BoroughFacilityCounts WHERE facility_rank = 1; RESULT -
ii. List details of each train line and their highest run time. Your result should have TRAIN_ID, LINE_NAME, ARRIVVAL, DEPARTURE, SQL QUERY WITH TrainLineRuntimes AS ( SELECT ts.train_id, tl.line_name, TO_CHAR(ts.arrival, 'MM/DD/YYYY HH24:MI:SS') AS arrival, TO_CHAR(ts.departure, 'MM/DD/YYYY HH24:MI:SS') AS departure, ts.departure - ts.arrival AS runtime, RANK() OVER (PARTITION BY ts.train_id ORDER BY ts.departure - ts.arrival DESC) AS runtime_rank FROM ss_trainlines_station ts JOIN ss_trainlines tl ON ts.train_id = tl.train_id ) SELECT train_id, line_name, arrival, departure FROM TrainLineRuntimes WHERE runtime_rank = 1; RESULT -
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
iii. Find the top two boroughs in terms of numbers of station facilities. Your result should have the name of the borough and number of facilities. SQL QUERY - WITH BoroughFacilityCounts AS ( SELECT s.borough, COUNT(sf.facility_id) AS num_facilities, RANK() OVER (ORDER BY COUNT(sf.facility_id) DESC) AS borough_rank FROM ss_station s JOIN ss_station_facility sf ON s.station_id = sf.station_id GROUP BY s.borough ) SELECT borough, num_facilities FROM BoroughFacilityCounts
WHERE borough_rank <= 2; RESULT- iv. List station name, borough, neighborhood, facility name, and number of facilities. Arrange the result in descending order of number of facilities. SQL QUERY - SELECT ss_station.station_name, ss_station.borough, ss_station.neighborhood, ss_facility.facility_name, ss_station_facility.number_of_facility FROM ss_station_facility INNER JOIN ss_station ON ss_station_facility.station_id = ss_station.station_id INNER JOIN ss_facility ON ss_station_facility.facility_id = ss_facility.facility_id ORDER BY ss_station_facility.number_of_facility DESC; RESULT-
v. List the train line that has maximum number of stations. SQL QUERY - WITH LineStationCounts AS (
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
SELECT tl.line_name, COUNT(ts.station_id) AS num_stations, RANK() OVER (ORDER BY COUNT(ts.station_id) DESC) AS line_rank FROM ss_trainlines tl JOIN ss_trainlines_station ts ON tl.train_id = ts.train_id GROUP BY tl.line_name ) SELECT line_name, num_stations FROM LineStationCounts WHERE line_rank = 1; RESULT- Submit: For each of questions submit a) SQL b) SQL result. All SQL and corresponding results must be visible clearly on screenshots. Problem 2: 20 points [each question has 10 points]
part1 SELECT TRUNC(schedule) AS "dates", COUNT(DISTINCT flight_id) AS "flight_count" FROM flight GROUP BY TRUNC(schedule) ORDER BY TRUNC(schedule); part 2 SELECT schedule AS "dates", SUM(capacity)/1000||'K' AS "total_capacity" FROM (SELECT DISTINCT(f.flight_id), TRUNC(schedule) AS schedule, capacity FROM flight f JOIN flight_size fs ON fs.flight_id = f.flight_id ) GROUP BY schedule ORDER BY schedule; Submission: Create tables as above with your initial as prefix such as AP_FLIGHT, AP_SIZE. Populate the same data as shown in pictures. For Q1 and Q2 write the SQLs that produce the desired results as Output 1 and Output 2. Submit screenshots of your SQL queries and their respective results. You can use Oracle or MySQL relational database. All SQL and corresponding results must be visible clearly on screenshots.
Problem 3: 30 points For a given relational model below, please find attached file containing DDL and DMLs. You may create equivalent code for MySQL. Create tables and insert data by replacing SK with your own initial. You can do this assignment either in Oracle or MySQL. You don’t need to draw logical/relational models . I. For this relational model of a furniture company, create a read-only database view that represents the following dataset. Customer ID, Customer Name (both First and Last name), Order_Id, order date, each product in order with description, quantity, Unit_Price, Total price of each product, and Finish shade. Sort the dataset in order of total order amount. Give appropriate column names in view. Restrict dataset to represent only those orders which have total value over $1000. Once view is created submit View code and then retrieve result of view using SQL query. ANS:
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
CREATE VIEW orde_view AS SELECT o.cust_id, c.fname || ' ' || c.lname AS customer_name, o.order_id, o.order_date, p.prod_id, p.descr AS product_description, f.shade AS finish_shade, op.quant AS quantity, p.unit_price, (op.quant * p.unit_price) AS total_price, order_totals.total_order_amount FROM sk_order o JOIN sk_cust c ON o.cust_id = c.cust_id JOIN sk_ord_prod op ON o.order_id = op.order_id JOIN sk_prod p ON op.prod_id = p.prod_id JOIN sk_finish f ON op.finish_id = f.finish_id JOIN ( SELECT o.order_id, SUM(op.quant * p.unit_price) AS total_order_amount FROM sk_order o JOIN sk_ord_prod op ON o.order_id = op.order_id JOIN sk_prod p ON op.prod_id = p.prod_id GROUP BY o.order_id HAVING SUM(op.quant * p.unit_price) > 1000 ) order_totals ON o.order_id = order_totals.order_id; SELECT * FROM ORDE_VIEW;
II. Find top 3 products and their finish shade in terms of total quantities sold between 01/01/2018 and 12/31/2021. Your result dataset should have Product_Id, Product Description, shade, and total quantity sold. ANS: WITH ProductQuantityRank AS ( SELECT p.prod_id, p.descr AS product_description, f.shade, SUM(op.quant) AS total_quantity_sold, RANK() OVER (ORDER BY SUM(op.quant) DESC) AS quantity_rank FROM sk_prod p JOIN sk_ord_prod op ON p.prod_id = op.prod_id JOIN sk_finish f ON op.finish_id = f.finish_id
JOIN sk_order o ON op.order_id = o.order_id WHERE o.order_date BETWEEN TO_DATE('01-JAN-2018', 'DD-MON-YYYY') AND TO_DATE('31-DEC-2021', 'DD-MON-YYYY') GROUP BY p.prod_id, p.descr, f.shade ) SELECT prod_id, product_description, shade, total_quantity_sold FROM ProductQuantityRank WHERE quantity_rank <= 3;
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
III. Find products and their finish shade that have not been sold in during the months of Oct, Nov, and Dec in 2020 ANS: SELECT p.prod_id, p.descr AS product_description, f.shade FROM sk_prod p CROSS JOIN sk_finish f WHERE NOT EXISTS ( SELECT 1 FROM sk_ord_prod op JOIN sk_order o ON op.order_id = o.order_id WHERE p.prod_id = op.prod_id
AND f.finish_id = op.finish_id AND EXTRACT(MONTH FROM o.order_date) IN (10, 11, 12) AND EXTRACT(YEAR FROM o.order_date) = 2020 ); And many more Submission: For each question write an SQL query and submit both SQL and a clearly visible screenshot of corresponding result underneath each question. Make appropriate use of column alias and built in functions in your SQL queries.