
Question

Explain the code below step by step. Also explain what the two join statements are intended to achieve.

 

emp = [(1,"Smith",-1,"2018","10","M",3000),
    (2,"Rose",1,"2010","20","M",4000),
    (3,"Williams",1,"2010","10","M",1000),
    (4,"Jones",2,"2005","10","F",2000),
  ]
empColumns = ["emp_id","name","superior_emp_id","year_joined",
       "emp_dept_id","gender","salary"]

empDF = spark.createDataFrame(data=emp, schema=empColumns)
empDF.printSchema()
empDF.show(truncate=False)

dept = [("Finance",10),
    ("Marketing",20),
    ("Sales",30),
    ("IT",40)
  ]
deptColumns = ["dept_name","dept_id"]
deptDF = spark.createDataFrame(data=dept, schema=deptColumns)
deptDF.printSchema()
deptDF.show(truncate=False)

# Join statement 1
empDF.join(deptDF, empDF.emp_dept_id == deptDF.dept_id, "outer") \
    .show(truncate=False)

# Join statement 2
empDF.join(deptDF, empDF.emp_dept_id == deptDF.dept_id, "right") \
    .show(truncate=False)

Expert Solution
Step 1:

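from pyspark.sql import SparkSession

# Assumption: no `spark` session exists yet. The pyspark shell and most
# notebooks already provide one, in which case these two lines can be skipped.
spark = SparkSession.builder.appName("emp-dept-example").getOrCreate()

# Employee rows. Note that emp_dept_id is stored as a string ("10"),
# while dept_id below is an integer (10).
emp = [(1,"Smith",-1,"2018","10","M",3000),
    (2,"Rose",1,"2010","20","M",4000),
    (3,"Williams",1,"2010","10","M",1000),
    (4,"Jones",2,"2005","10","F",2000),
  ]
empColumns = ["emp_id","name","superior_emp_id","year_joined",
       "emp_dept_id","gender","salary"]

# Build the employee DataFrame; column types are inferred from the data.
empDF = spark.createDataFrame(data=emp, schema=empColumns)
empDF.printSchema()          # print column names and inferred types
empDF.show(truncate=False)   # print all rows without truncating values

# Department rows: (dept_name, dept_id)
dept = [("Finance",10),
    ("Marketing",20),
    ("Sales",30),
    ("IT",40)
  ]
deptColumns = ["dept_name","dept_id"]

# Build the department DataFrame and print it the same way.
deptDF = spark.createDataFrame(data=dept, schema=deptColumns)
deptDF.printSchema()
deptDF.show(truncate=False)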

 

This code builds the emp and dept DataFrames from Python lists of tuples, then prints the schema and the rows of each DataFrame to the console.
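The two join statements in the question can be illustrated without a Spark cluster. In PySpark, "outer" requests a full outer join (keep unmatched rows from both sides) and "right" requests a right outer join (keep every row of the right-hand DataFrame, here deptDF). The block below is only a pure-Python sketch of those semantics on the same data, not how Spark executes a join. The helper names (`matches`, `right_join`, `full_outer_join`) and the extra employee "Brown" are hypothetical, and the `int(...)` conversion stands in for the assumption that the string "10" compares equal to the integer 10 in the join condition.

```python
# Pure-Python sketch of the join semantics (not Spark itself).
emp = [
    (1, "Smith", -1, "2018", "10", "M", 3000),
    (2, "Rose", 1, "2010", "20", "M", 4000),
    (3, "Williams", 1, "2010", "10", "M", 1000),
    (4, "Jones", 2, "2005", "10", "F", 2000),
]
dept = [("Finance", 10), ("Marketing", 20), ("Sales", 30), ("IT", 40)]

EMP_DEPT_ID = 4            # position of emp_dept_id in an emp tuple
DEPT_ID = 1                # position of dept_id in a dept tuple
EMP_NULLS = (None,) * 7    # stands in for nulls on the employee side
DEPT_NULLS = (None,) * 2   # stands in for nulls on the department side


def matches(e, d):
    # Assumption: the string "10" is treated as equal to the integer 10.
    return int(e[EMP_DEPT_ID]) == d[DEPT_ID]


def right_join(left, right):
    """Keep every right row; pad the left side with None when nothing matches."""
    rows = []
    for d in right:
        hits = [e for e in left if matches(e, d)]
        rows.extend(e + d for e in hits)
        if not hits:
            rows.append(EMP_NULLS + d)
    return rows


def full_outer_join(left, right):
    """A right join plus any left rows that matched nothing."""
    rows = right_join(left, right)
    for e in left:
        if not any(matches(e, d) for d in right):
            rows.append(e + DEPT_NULLS)
    return rows


right_rows = right_join(emp, dept)
outer_rows = full_outer_join(emp, dept)
for row in outer_rows:
    print(row)

# Hypothetical extra employee whose department (50) is not in dept.
extra_emp = emp + [(5, "Brown", 2, "2010", "50", "F", 1500)]
extra_right = right_join(extra_emp, dept)
extra_outer = full_outer_join(extra_emp, dept)
```

With the data from the question, every employee belongs to an existing department, so both joins yield the same six rows: four matched employees plus Sales and IT with empty employee columns. The results differ only when an employee has no matching department, as with the hypothetical "Brown" row, which the full outer join keeps and the right join drops.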
