Review Test Submission: 6346-Midterm Exam – BUAN

School: University of Texas, Dallas
Course: 6346
Subject: Computer Science
Date: Jan 9, 2024
Type: pdf
Pages: 15
Review Test Submission: 6346-Midterm Exam (BUAN 6346.5U2 - Big Data - Su20)

User: Monik Chintha
Course: (MERGED) BUAN 6346.5U2 - MIS 6346.5U2 - Su20
Test: 6346-Midterm Exam
Started: 7/9/20 6:00 PM
Submitted: 7/9/20 7:10 PM
Due Date: 7/9/20 7:30 PM
Status: Completed
Attempt Score: 100 out of 100 points
Time Elapsed: 1 hour, 10 minutes out of 1 hour and 30 minutes
Results Displayed: All Answers, Submitted Answers, Correct Answers, Incorrectly Answered Questions

Question 1 (2 out of 2 points)
Impala and Hive are tools for performing SQL queries on data in HDFS.
Selected Answer: True
Answers: True / False

Question 2 (2 out of 2 points)
YARN manages resources in a Hadoop cluster and schedules jobs.
Selected Answer: True
Answers: True / False

Question 3 (4 out of 4 points)
Which of the following is part of the built-in Flume channels?
Selected Answer: All the above
Answers:
  Memory
  File
  JDBC
  All the above

Question 4 (2 out of 2 points)
The code below belongs to the Pig component, one of the Hadoop ecosystem tools. What is the main feature or advantage of Pig programming?
Selected Answer: High-level language, easy to program: trivial to achieve parallel execution of simple and parallel data analysis tasks
Answers:
  High-level language, easy to program: trivial to achieve parallel execution of simple and parallel data analysis tasks
  Low-level programming language, which makes Pig programs very close to machine language
  Pig is considered a data ingestion and ETL tool
  None of the above
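The Pig code referenced in Question 4 is not reproduced in this extract, so the following is only a minimal sketch of the kind of Pig Latin script the answer is describing; the input path, field layout, and script name are assumptions, not the exam's statement:

  $ cat > wordcount.pig <<'EOF'
  -- Load text lines from HDFS (path is hypothetical)
  lines  = LOAD '/loudacre/sample_logs' AS (line:chararray);
  words  = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
  grp    = GROUP words BY word;
  counts = FOREACH grp GENERATE group AS word, COUNT(words) AS n;
  -- Pig compiles these high-level statements into parallel jobs on the cluster
  STORE counts INTO '/loudacre/word_counts';
  EOF
  $ pig wordcount.pig

The contrast the selected answer is pointing at is that these few declarative lines replace a hand-written parallel MapReduce program.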
Question 5 (2 out of 2 points)
Impala generates jobs that run on the Hadoop cluster data processing engine – it runs MapReduce jobs on Hadoop based on HiveQL statements.
Selected Answer: False
Answers: True / False

Question 6 (4 out of 4 points)
Which of the following is not a feature of HDFS?
Selected Answer: MapReduce works internally in memory and it is not tied to mappers
Answers:
  HDFS provides high-throughput access to application data.
  HDFS is used to store large datasets
  MapReduce works internally in memory and it is not tied to mappers
  HDFS runs on large clusters of commodity machines
  HDFS is a distributed, scalable, and portable filesystem written in Java for the Hadoop framework

Question 7 (4 out of 4 points)
The table "job" was created by the statement below and it has 20K records. When dropping the table, the data could be retrieved and recovered from:
Selected Answer: Data not found anywhere in the cluster
Answers:
  /user/hive/warehouse
  /hdfs/hive/warehouse
  /loudecre/hive/warehouse
  Data not found anywhere in the cluster

Question 8 (2 out of 2 points)
Hadoop is a framework for distributed storage and processing.
Selected Answer: True
Answers: True / False

Question 9 (4 out of 4 points)
Which of the following is not part of the core Hadoop cluster?
Selected Answer: Microsoft SQL
Answers:
  HDFS
  YARN
  MapReduce
  Microsoft SQL

Question 10 (4 out of 4 points)
What does the Sqoop statement below do?
Selected Answer: All the above
Answers:
  Creates the properties of table ‘employee’ in the Hive Metastore
  Imports the employee data from MySQL to the Hive default directory in HDFS
  Creates a table that is accessible in both Hive and Impala
  All the above

Question 11 (4 out of 4 points)
Which of the following is part of the Hadoop ecosystem?
Selected Answer: All the above
Answers:
  Apache Sqoop
  Apache Oozie
  Cloudera Search - Solr
  All the above

Question 12 (4 out of 4 points)
Which of the following is a built-in Flume source?
Selected Answer: All the above
Answers:
  Syslog – captures messages from the UNIX syslog daemon over the network
  Netcat – captures any data written to a socket on an arbitrary TCP port
  Spooldir – extracts events from files appearing in a specified (local) directory
  HTTP Source – receives events from HTTP requests
  Exec – executes a UNIX program and reads events from standard output
  All the above
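The Sqoop statement behind Question 10 is not shown in this extract, but a command with the behavior the answers describe would look roughly like the sketch below; the connection string, credentials, and database name are assumptions for illustration, while the table name 'employee' comes from the question:

  $ sqoop import \
      --connect jdbc:mysql://localhost/loudacre \
      --username training --password training \
      --table employee \
      --hive-import
  # --hive-import registers the 'employee' table in the Hive Metastore and
  # lands the imported rows under the Hive warehouse directory in HDFS, so
  # the table is queryable from Hive (and from Impala once its metadata is
  # refreshed), which is why "All the above" fits.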
Question 13 (4 out of 4 points)
The table "adclicks" was created by the statement below and it has 20K records. When dropping the table, the data could be located at:
Selected Answer: /loudacre/ad_data
Answers:
  /user/hive/warehouse/ad_data
  /loudacre/ad_data
  /loudecre/hive/warehouse/ad_data
  Data not found anywhere in the cluster

Question 14 (4 out of 4 points)
Which of the following is NOT part of Flume sinks?
Selected Answer: Memory
Answers:
  Null
  Logger
  HDFS
  Memory
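Questions 7 and 13 turn on managed versus external Hive tables: the answers imply "job" was a managed table (its files are deleted on DROP) while "adclicks" was an external table pointing at /loudacre/ad_data (its files survive DROP). A hedged sketch of the two DDL shapes, with assumed column lists, since the exam's actual statements are not shown:

  $ hive -e "
    -- Managed table: data lives under /user/hive/warehouse/job;
    -- DROP TABLE removes both the metadata and the data files.
    CREATE TABLE job (id INT, title STRING);

    -- External table: Hive only tracks metadata; DROP TABLE leaves
    -- the files under /loudacre/ad_data untouched.
    CREATE EXTERNAL TABLE adclicks (ad_id INT, clicked_at STRING)
      LOCATION '/loudacre/ad_data';
  "

This is why dropping "job" leaves nothing to recover, while dropping "adclicks" still leaves the data files under /loudacre/ad_data.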
Question 15 (4 out of 4 points)
When creating the table below in Hive, the default data location is:
Selected Answer: /user/hive/warehouse
Answers:
  /user/hive/warehouse
  /hdfs/hive/warehouse
  /loudecre/hive/warehouse
  /loudacre/job

Question 16 (4 out of 4 points)
Which of the following is NOT part of the Flume component architecture?
Selected Answer: Collector
Answers:
  Source
  Sink
  Collector
  Channel

Question 17 (2 out of 2 points)
Hive executes queries directly on the Hadoop cluster – it uses a very fast specialized SQL engine, and it skips MapReduce.
Selected Answer: False
Answers: True / False
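Questions 3, 12, 14, and 16 all describe the same source → channel → sink pipeline inside a single Flume agent. A minimal, assumed configuration ties those pieces together; the agent name, port, and HDFS path are made up for illustration:

  $ cat > netcat-agent.conf <<'EOF'
  # One agent with one source, one channel, one sink
  a1.sources  = r1
  a1.channels = c1
  a1.sinks    = k1

  # Netcat source: captures data written to a TCP port
  a1.sources.r1.type = netcat
  a1.sources.r1.bind = 0.0.0.0
  a1.sources.r1.port = 44444

  # Memory channel buffers events between source and sink
  a1.channels.c1.type = memory

  # HDFS sink writes the buffered events into HDFS
  a1.sinks.k1.type = hdfs
  a1.sinks.k1.hdfs.path = /loudacre/flume_events

  # Wire the pieces together
  a1.sources.r1.channels = c1
  a1.sinks.k1.channel = c1
  EOF
  $ flume-ng agent --name a1 --conf-file netcat-agent.conf

Note there is no "collector" component in this model, which is why it is the odd one out in Question 16.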
Question 18 (4 out of 4 points)
The statement below is used to display Hive table properties:
Selected Answer: Describe formatted
Answers:
  Describe formatted
  Invalidate metadata
  Describe Hive table
  Validate metadata

Question 19 (4 out of 4 points)
Which of the following components is NOT considered a data ingest tool for the Hadoop cluster?
Selected Answer: HDFS
Answers:
  HDFS
  Apache Sqoop
  Apache Flume
  Apache Pig

Question 20 (2 out of 2 points)
What does the command line below do?
>>$ hdfs dfs -put /usr/training/calllogs /loudacre/
Assumption: /loudacre resides in HDFS; /user/training/calllogs resides in the local Linux file system.
Selected Answer: Moves the directory "calllogs" entirely to the HDFS location under /loudacre
Answers:
  Moves the directory "calllogs" entirely to the HDFS location under /loudacre
  Moves the "/loudacre" directory from HDFS to the local Linux file system /usr/training/calllogs
  Moves all files that reside under the "calllogs" directory to HDFS:/loudacre without moving the directory "calllogs"
  None of the above
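Questions 20 and 26 both hinge on the direction of the copy for hdfs dfs -put and hdfs dfs -get. A small assumed session, using the paths from the two questions, makes the direction explicit:

  # Local -> HDFS: -put copies the local "calllogs" directory up into /loudacre
  $ hdfs dfs -put /usr/training/calllogs /loudacre/
  $ hdfs dfs -ls /loudacre/calllogs

  # HDFS -> local: -get copies the HDFS "kb" directory down to the local filesystem
  $ hdfs dfs -get /loudacre/kb /usr/training/sqoop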
Question 21 (4 out of 4 points)
A common scenario in log collection is a large number of log-producing clients sending data to a few consumer agents that are attached to the storage subsystem; for example, logs collected from hundreds of web servers forwarded to a dozen agents that write to the HDFS cluster.
Question: In the Flume design below, if Agent#4 is down, the aggregated data that is being sourced from the three agents (Agent1 + Agent2 + Agent3) will sink up in:
Selected Answer: The system is "partially halted", as Agent1, Agent2, and Agent3 will continue working as expected. The data that was collected by each agent will be accumulated in each agent's Avro sink area and will not move forward as long as Agent#4 is down.
Answers:
  The aggregated data will reach the (HDFS) storage that is defined in Agent4's configuration
  The aggregated data will go nowhere and the entire system will be halted
  The system is "partially halted", as Agent1, Agent2, and Agent3 will continue working as expected. The data that was collected by each agent will be accumulated in each agent's Avro sink area and will not move forward as long as Agent#4 is down.
  None of the above

Question 22 (4 out of 4 points)
Which Hive statement is used to load the HDFS accounts data below into the managed Hive "accounts" table?
Selected Answer: LOAD DATA INPATH ‘/loudacre/accounts’ INTO TABLE accounts
Answers:
  LOAD DATA INPATH ‘/loudacre/accounts’ INTO TABLE accounts
  LOAD DATA INPATH TABLE ‘accounts’ into ‘loudacre/accounts’
  LOAD DATA INPATH -mv ‘/loudacre/accounts/’ INTO TABLE accounts
  LOAD DATA INPATH -put ‘/loudacre/accounts/’ INTO TABLE accounts
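For Question 22, here is a short sketch of the full sequence; only the LOAD statement comes from the exam, and the CREATE TABLE column list is an assumption:

  $ hive -e "
    CREATE TABLE accounts (acct_num INT, acct_name STRING);  -- columns assumed
    -- LOAD DATA INPATH moves (rather than copies) the existing HDFS files
    -- from /loudacre/accounts into the managed table's warehouse directory,
    -- /user/hive/warehouse/accounts
    LOAD DATA INPATH '/loudacre/accounts' INTO TABLE accounts;
  "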
Question 23 (4 out of 4 points)
Which of the following is NOT TRUE for the Hadoop Distributed File System (HDFS)?
Selected Answer: Data is distributed to a limited number of nodes
Answers:
  HDFS is the storage layer for Hadoop
  Provides inexpensive, reliable storage for massive amounts of data on industry-standard hardware
  Data is distributed when stored
  Data is distributed to a limited number of nodes

Question 24 (2 out of 2 points)
Flume agents can receive data from many sources, including other agents.
Selected Answer: True
Answers: True / False

Question 25 (4 out of 4 points)
The main goal of the Sqoop statement below is:
Selected Answer: Extract the entire data from MySQL: "account" table and store it under HDFS:/loudecra/accounts
Answers:
  Import the existing accounts data from HDFS and load it into MySQL: "account" table.
  Merge both data from MySQL and HDFS and store it under HDFS:/loudacre/accounts
  Extract all new incremental accounts from MySQL: "accounts" table and store them under HDFS:/loudacre/accounts
  Extract the entire data from MySQL: "account" table and store it under HDFS:/loudecra/accounts

Question 26 (2 out of 2 points)
What does the command line below do?
>>$ hdfs dfs -get /loudacre/kb /usr/training/sqoop
Assumption: /loudacre/kb resides in HDFS; /user/training/sqoop resides in the local Linux filesystem.
Selected Answer: Moves the KB directory and its contents from HDFS to the local Linux filesystem /usr/training/sqoop
Answers:
  Gets the sqoop directory and its contents and moves it under HDFS:/loudacre/kb
  Moves the KB directory and its contents from HDFS to the local Linux filesystem /usr/training/sqoop
  Moves all files that reside under the "KB" directory to /usr/training/sqoop without moving the directory "KB" itself.
  None of the above

Question 27 (4 out of 4 points)
The main goal of the Sqoop statement below is:
Selected Answer: Sqoop import the incremental accounts from MySQL: "account" table that have acct_num values greater than 129764.
Answers:
  Sqoop import all accounts with acct_num greater than 129764 from HDFS and load them back into MySQL: "account" table.
  Merge both data from MySQL and HDFS and store it under HDFS:/loudacre/accounts.
  Sqoop import the incremental accounts from MySQL: "account" table that have acct_num values greater than 129764.
  Sqoop import the incremental accounts from MySQL: "account" table that have acct_num values less than 129764.
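Question 27's statement is not reproduced here, but an incremental import with the behavior the selected answer describes would look roughly like the sketch below; the connection string and credentials are assumptions, while the check column and last value come from the question:

  $ sqoop import \
      --connect jdbc:mysql://localhost/loudacre \
      --username training --password training \
      --table accounts \
      --target-dir /loudacre/accounts \
      --incremental append \
      --check-column acct_num \
      --last-value 129764
  # Only rows with acct_num greater than 129764 are imported and appended
  # to the data already stored under /loudacre/accounts.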
Question 28 (4 out of 4 points)
What is the expectation of the statement below?
Selected Answer: The statement will import the data into the HDFS default location "/usr/hive/warehouse/" and will create a new Hive table named "device" as a managed table type with all data populated.
Answers:
  The device data will reside under HDFS:/loudacre/device
  The statement will import the data into the HDFS default location "/usr/hive/warehouse/" and will create a new Hive table named "device" as a managed table type with all data populated.
  The statement will import the data into the HDFS default location "/usr/hive/warehouse/" and will create a new Hive table named "device" as an external table type with all data populated.
  The statement will import the device data into the HDFS default location "/usr/hive/warehouse/" and will create a new Hive table named "device" as an external table without data (empty table).

Question 29 (4 out of 4 points)
To access a new Hive table from the Impala shell, the action statement below is required.
Selected Answer: Run ‘Invalidate metadata’ from the Impala shell
Answers:
  Run ‘Describe formatted’ from the Hive shell
  Run ‘Invalidate metadata’ from the Impala shell
  Run ‘Describe Hive table’ from Hive
  Run ‘Validate metadata’ from the Impala shell
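A minimal sketch of the workflow behind Question 29, with an assumed table name and columns: create the table through Hive, then tell Impala to reload the Metastore metadata before querying it.

  # Create the table through Hive (this updates the shared Metastore)
  $ hive -e "CREATE TABLE device (device_id INT, model STRING)"

  # Impala caches Metastore metadata, so refresh it before querying the new table
  $ impala-shell -q "INVALIDATE METADATA device"
  $ impala-shell -q "SELECT COUNT(*) FROM device"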
Question 30 (2 out of 2 points)
One of the main features of the Hadoop cluster is that Hive and Impala use the Metastore to determine data format and location. In case of a Metastore node failure, the Hadoop cluster has the capability to serve query and data-retrieval requests from HDFS successfully without the need of the Metastore node.
Selected Answer: False
Answers: True / False

Thursday, July 9, 2020 7:11:00 PM CDT