Regex, APIs, BeautifulSoup: python

Question

Regex, APIs, BeautifulSoup: python

import requests, re
from pprint import pprint
from bs4 import BeautifulSoup

Complete the missing bodies of the functions below:

def companies(website):
    """
    Question 6
    - Access the table at the provided website:
      'https://en.wikipedia.org/wiki/Companies_listed_on_the_New_York_Stock_Exchange_(C)'
    - Parse through it and retrieve the names of all companies in the site that
        ~ Are based in the US
        ~ Have an acronym anywhere in their name
        ~ (Let us define 'acronym' as any two or more consecutive capital letters)
    Args:
        string (website)
    Returns:
        list (list of company names)
    >>> web1 = 'https://en.wikipedia.org/wiki/Companies_listed_on_the_New_York_Stock_Exchange_(C)'
    >>> web2 = 'https://en.wikipedia.org/wiki/Companies_listed_on_the_New_York_Stock_Exchange_(T)'
    >>> companies(web1)
    ['CACI',
     'CAI International, Inc.',
     'CARBO Ceramics Inc.',
     ...
     'CYS Investments, Inc.']
    >>> len(companies(web1))
    27
    >>> len(companies(web2))
    23
    """
    pass
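
The acronym rule in the docstring ("any two or more consecutive capital letters") maps onto a short regular expression. The following is a minimal sketch of that check only; the helper name has_acronym and the constant ACRONYM are illustrative and do not come from the assignment.

```python
import re

# Two or more consecutive capital letters, anywhere in the string.
ACRONYM = re.compile(r"[A-Z]{2,}")


def has_acronym(name):
    """Return True if `name` contains a run of at least two capital letters."""
    return ACRONYM.search(name) is not None


print(has_acronym("CARBO Ceramics Inc."))  # True  ("CARBO")
print(has_acronym("Caterpillar Inc."))     # False (capitals are never adjacent)
```

re.search is used rather than re.match because the acronym may appear anywhere in the name, not only at the start.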
 

test code:

# web1 = 'https://en.wikipedia.org/wiki/Companies_listed_on_the_New_York_Stock_Exchange_(C)'
# web2 = 'https://en.wikipedia.org/wiki/Companies_listed_on_the_New_York_Stock_Exchange_(T)'
# pprint(companies(web1))
# pprint(companies(web2))
Expert Solution
Step 1

# def companies(website):
#     # Access the table at the provided website:
#     # 'https://en.wikipedia.org/wiki/Companies_listed_on_the_New_York_Stock_Exchange_(C)'
#     # Parse through it and retrieve the names of all companies in the site that
#     # ~ Are based in the US
#     # ~ Have an acronym anywhere in their name
#     # ~ (Let us define 'acronym' as any two or more consecutive capital letters)
#     # Args:
#     # string (website)
#     # Returns:
#     # list (list of company names)
#     # >>> web1 = 'https://en.wikipedia.org/wiki/Companies_listed_on_the_New_York_Stock_Exchange_(C)'
#     # >>> web2 = 'https://en.wikipedia.org/wiki/Companies_listed_on_the_New_York_Stock_Exchange_(T)'
#     # >>> companies(web1)
#     #
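
The posted solution stops at the restated docstring, so what follows is only one possible sketch of the function body, not the expert's answer. It rests on assumptions that are not stated in the question: that each data row of the table has at least three <td> cells in the order name, symbol, country; that US companies are labelled with one of the strings in US_LABELS; and that the site accepts the placeholder User-Agent string. The helper parse_companies and the constants are hypothetical names introduced here. The sketch has not been checked against the counts (27 and 23) shown in the docstring, and the live pages may have changed since the question was written.

```python
import re

import requests
from bs4 import BeautifulSoup

# Two or more consecutive capital letters, as defined in the docstring.
ACRONYM = re.compile(r"[A-Z]{2,}")

# Assumption: spellings the country column might use for the United States.
US_LABELS = {"US", "USA", "United States"}

# Assumption: a placeholder client string; some sites reject the default one.
HEADERS = {"User-Agent": "companies-exercise/0.1 (educational sketch)"}


def parse_companies(html):
    """Return names of US-based companies whose name contains an acronym."""
    soup = BeautifulSoup(html, "html.parser")
    names = []
    for row in soup.find_all("tr"):
        cells = row.find_all("td")
        # Assumption: data rows look like name | symbol | country.
        # Header rows use <th> cells, so they have no <td> and are skipped.
        if len(cells) < 3:
            continue
        name = cells[0].get_text().strip()
        country = cells[2].get_text().strip()
        if country in US_LABELS and ACRONYM.search(name):
            names.append(name)
    return names


def companies(website):
    """Download `website` and filter the rows of its company table."""
    response = requests.get(website, headers=HEADERS, timeout=10)
    response.raise_for_status()  # fail loudly on HTTP errors
    return parse_companies(response.text)
```

Keeping the parsing in its own function means the filter can be tried on a small hand-written HTML string without network access; with a connection, the test code from the question (pprint(companies(web1))) would call it on the real page. If the real table uses a different column order or country labels, only the two assumptions marked in the comments need to change.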
