Advanced Database: please solve this using Hadoop.

Assume that you have the following relations; each relation represents a dataset of
text files stored on HDFS.
1. ratings ( UserID, MovieID, Rating ) // where Rating is the rating (from 1 to 5)
given by the user to the corresponding MovieID
2. users ( UserID, Gender, Age )
3. movies ( MovieID, Title, Genres ) // where Genres is the classification of the
movie, such as comedy, children, action, …
Suppose you have been given the task of finding the average rating for each movie in
the form (MovieID, Title, avg_rating). Computing the average rating must consider
the following:
1. only children and comedy movies
2. only rating values that are above 2
3. only ratings from users whose age is above 25
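One way to read the task is as a single relational algebra expression: three selections, two natural joins, and a grouped average. The expression below is a sketch of that reading, not a statement of the intended answer. It assumes that "children and comedy movies" means a movie tagged with either genre, and it uses the informal notation Genres ∋ g for "the Genres field contains g".

```latex
\gamma_{\text{MovieID},\,\text{Title};\ \mathrm{AVG}(\text{Rating}) \rightarrow \text{avg\_rating}}
\Bigl(
  \sigma_{\text{Rating} > 2}(\text{ratings})
  \;\bowtie\;
  \sigma_{\text{Age} > 25}(\text{users})
  \;\bowtie\;
  \sigma_{\text{Genres} \ni \text{children} \,\lor\, \text{Genres} \ni \text{comedy}}(\text{movies})
\Bigr)
```

Pushing each selection below the joins, as written here, keeps the joined data as small as possible, which is the usual starting point for an efficient MapReduce plan.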
What to submit:
• First, briefly describe how to implement the above task as MapReduce jobs in an
efficient way.
• Specify how many jobs you need.
• State the purpose of each MapReduce job (what it does).
• Describe what the Map and Reduce functions do in each MapReduce job. Write this in
pseudocode similar to what we did in the lectures on relational algebra in
MapReduce.
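The task above can be sketched as a single MapReduce job, simulated here in plain Python rather than written against the Hadoop API. This is one possible design under stated assumptions, not the expected answer: it assumes the users and movies files are small enough to be loaded into memory by every mapper (a map-side join, for example via Hadoop's distributed cache), that Genres is a `|`-separated string, and that "children and comedy" means either genre. All function names (`load_side_tables`, `map_rating`, `reduce_movie`, `run_job`) are hypothetical. If the side tables do not fit in memory, a chain of reduce-side join jobs would be needed instead.

```python
from collections import defaultdict

# Assumption: a movie qualifies if it carries at least one of these genres.
WANTED_GENRES = {"children", "comedy"}


def load_side_tables(users, movies):
    """Setup step each mapper would run once: filter the two small relations."""
    # Selection on users: keep only IDs of users older than 25.
    eligible_users = {uid for uid, _gender, age in users if age > 25}
    # Selection on movies: keep MovieID -> Title for children/comedy movies.
    eligible_movies = {}
    for mid, title, genres in movies:
        tags = {g.strip().lower() for g in genres.split("|")}  # assumed format
        if tags & WANTED_GENRES:
            eligible_movies[mid] = title
    return eligible_users, eligible_movies


def map_rating(record, eligible_users, eligible_movies):
    """Map: filter one rating and join it with both side tables."""
    uid, mid, rating = record
    if rating > 2 and uid in eligible_users and mid in eligible_movies:
        # Emit (sum, count) pairs so a combiner could pre-aggregate safely.
        yield (mid, eligible_movies[mid]), (rating, 1)


def reduce_movie(key, values):
    """Reduce: add up partial sums and counts, then compute the average."""
    mid, title = key
    total = sum(s for s, _ in values)
    count = sum(c for _, c in values)
    return mid, title, total / count


def run_job(ratings, users, movies):
    """Simulate map, shuffle (group by key), and reduce on in-memory data."""
    eligible_users, eligible_movies = load_side_tables(users, movies)
    groups = defaultdict(list)
    for record in ratings:
        for key, value in map_rating(record, eligible_users, eligible_movies):
            groups[key].append(value)
    return [reduce_movie(key, vals) for key, vals in sorted(groups.items())]
```

Emitting (sum, count) pairs instead of raw ratings is a deliberate choice: an average of averages is not correct in general, while sums and counts can be added in any order, so the same logic can serve as a combiner.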