School: Northeastern University
Course: 6700
Subject: Industrial Engineering
Date: Dec 6, 2023
Residential Apartment Analysis
IE6600 – Computation and Visualization for Analytics
Final Report
Group Number 14
Reha Patel (002969991)
Shubhankar Radhakrishna (002781473)
Sireesh Veeramachaneni (002734184)
Suman Rajshekhar Algood (002772175)
Part 1: Introduction and Research Questions
An important part of moving to a new city is finding the right place to live. Many
individuals center their home searches on the amenities offered, the number of bedrooms
or bathrooms, or even what floor the home is on. However, most people use price as a
starting point for finding a new home. Because many members of the team recently
moved to a new city and expect to move again as they settle in, this analysis will give them
insight into what they can find within their budgets. Specifically, by city, how does the
presence of bedrooms, bathrooms, and amenities drive the price of apartments? Also, how does
the city being analyzed impact the price of apartments and the presence of specific amenities?
This investigation aims to answer these questions by analyzing apartment data
from Equity Residential, one of the most popular real estate trusts with properties in
major U.S. cities like Boston, New York, and Washington D.C. Throughout, the analysis
provides high-level insights by comparing price with features across all cities, as well as a
city-by-city comparison for individuals who may be choosing between two cities.
In this project, we used Python and Excel to clean the datasets that we obtained from
Kaggle. We then used the cleaned data to build Tableau visualizations. We designed our
dashboards to be easy to understand, so that any person with minimal knowledge of
apartments in a new city can use them to gain insight into which apartment will be a
suitable choice based on their personal requirements and preferences.
Part 2: Summary of Results
After analyzing the data scraped from Equity Residential, we concluded that there is, in
fact, a relationship between an apartment’s features and amenities and its price. This
conclusion held regardless of which city was being analyzed. Although price
ranges for apartments varied by city, all cities showed the same basic trends,
such as the price of apartments increasing as the square footage increased. An additional,
unexpected conclusion was that while pricing trends were generally consistent,
the average prices themselves varied by zip code, perhaps due to outside factors such as
proximity to public transportation, parks, etc.
We used Tableau to bring out insights from the city-level data, highlighting the
apartments available in each city and the amenities on offer for each of them. The
various amenities and other features like price, square footage, and location serve as the
major selection criteria for renting an apartment.
The data was cleaned and processed entirely with Python’s Pandas library and then
connected to Tableau. The project consists of two dashboards. The first dashboard is a
general analysis of the apartments in a user-selected city. The second dashboard compares
individual cities and their respective amenities against another city, using
the same dataset to bring out different insights that help the end user decide
when moving to a new city.
Part 3: Data Sources
The main portion of data used in this analysis was sourced from Kaggle, and it features
apartment data scraped from Equity Residential websites. This dataset includes columns such as
the price of an apartment, what floor it is on, and what amenities it features (such as an office,
fireplace, etc.).
Alongside this, an additional dataset of US zip codes from 2013 government data,
featuring the longitude and latitude of each zip code, was joined to the apartment dataset.
The apartment dataset preparation involved:
● checking and converting data types,
● checking for and removing null values,
● dropping unnecessary columns such as URL and date record,
● feature extraction.
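The preparation steps listed above could be sketched in pandas roughly as follows. This is a minimal illustration, not the team's actual cleaning file: the column names (`Price`, `Beds`, `URL`, `Date`) and the sample rows are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical sample rows; the real Kaggle schema may use different column names.
raw = pd.DataFrame({
    "Price": ["2500", "3100", None, "4200"],
    "Beds": [1, 2, 1, 3],
    "URL": ["u1", "u2", "u3", "u4"],
    "Date": ["2021-08-01"] * 4,
})

def clean_listings(df: pd.DataFrame) -> pd.DataFrame:
    """Drop unneeded columns, remove rows with nulls, and convert data types."""
    # Drop columns that are not needed for visualization (assumed names).
    out = df.drop(columns=["URL", "Date"], errors="ignore")
    # Remove any row that still contains a null value.
    out = out.dropna().copy()
    # Convert the price from text to a numeric type.
    out["Price"] = pd.to_numeric(out["Price"])
    return out.reset_index(drop=True)

cleaned = clean_listings(raw)
print(cleaned)
```

Dropping the unneeded columns before calling `dropna()` avoids discarding a listing only because a column that was going to be removed anyway happened to be empty.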
The dataset recorded which direction the windows in each apartment
faced, so a new column called Sunset Exposure was created and marked as 1 if either
Eastern or Western Exposure was 1. Finally, the address for each listing was one of the original
columns, so the zip code was extracted from that column, and then the longitude and latitude
dataset was joined on the zip code column. This would allow for visualization based on
longitude and latitude in the future stages of the project.
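The feature extraction and join described above might look like the following sketch. The column names, the sample addresses, the coordinate values, and the assumption that every address ends in a 5-digit zip code are all hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical listings; column names and addresses are illustrative only.
listings = pd.DataFrame({
    "Address": [
        "100 Example St, Boston, MA 02114",
        "200 Sample Ave, New York, NY 10001",
    ],
    "Eastern_Exposure": [1, 0],
    "Western_Exposure": [0, 0],
})

# Hypothetical zip code lookup table with illustrative coordinates.
zips = pd.DataFrame({
    "Zip": ["02114", "10001"],
    "Latitude": [42.36, 40.75],
    "Longitude": [-71.07, -73.99],
})

# Sunset Exposure is 1 when either the eastern or the western flag is 1.
listings["Sunset_Exposure"] = (
    (listings["Eastern_Exposure"] == 1) | (listings["Western_Exposure"] == 1)
).astype(int)

# Assumes the address ends with a 5-digit zip code; kept as text so that
# leading zeros (common in Massachusetts zip codes) are preserved.
listings["Zip"] = listings["Address"].str.extract(r"(\d{5})\s*$", expand=False)

# A left join keeps every listing even if its zip code has no coordinates.
merged = listings.merge(zips, on="Zip", how="left")
print(merged[["Zip", "Sunset_Exposure", "Latitude", "Longitude"]])
```

Storing the zip code as a string rather than an integer matters for the join: an integer column would turn `02114` into `2114` and the Boston rows would no longer match the lookup table.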
Part 4: Results and Methods
The data we obtained from Kaggle contained a large number of null values and
redundant columns, which had to be removed before building the visualizations and dashboards.
For the initial EDA, we chose to use Python to remove null values, drop unnecessary columns
like date recorded and unit ID, and finally export the data frame for easy import into Tableau,
where we created a calculated field for what percentage of all apartments had each amenity. In
fact, the entire file containing our data cleaning and feature extraction can be found here.
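The amenity percentage was computed as a Tableau calculated field. As a rough pandas equivalent, and assuming each amenity is stored as a 0/1 flag column (the column names below are hypothetical), the same figure could be obtained like this:

```python
import pandas as pd

# Hypothetical cleaned data: one row per apartment, amenities as 0/1 flags.
apartments = pd.DataFrame({
    "Fireplace": [1, 0, 0, 0],
    "Office": [1, 1, 0, 1],
})

amenity_cols = ["Fireplace", "Office"]

# The mean of a 0/1 column is the fraction of rows equal to 1,
# so multiplying by 100 gives the percentage of apartments with the amenity.
amenity_share = apartments[amenity_cols].mean() * 100
print(amenity_share)
```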
To keep the analysis easy to view and understand for the end user,
we decided to build two dashboards that would serve as a tool in an apartment hunt. The first
dashboard includes general, easy-to-follow visualizations of apartments by price and
square footage. The second dashboard involves a direct comparison between any two cities of the
user’s choice and has the option to filter on the various amenities on offer. Our idea is that
users will be able to use the first dashboard to get a high-level overview and then use the
second dashboard to perform a more in-depth analysis of what each city can readily offer them.
Before we could begin on our dashboard analysis, we created some preliminary graphs. For
example, we wanted to understand if there was an overall trend in price, square footage, and
the number of bedrooms. The crux of an apartment hunt can be the price and number of bedrooms,
so we wanted to explore this relationship even though we did not deem it necessary for the
dashboard. Above is a graph depicting this, and we can see that as the square footage
increases, not only does the price increase, but so does the number of bedrooms.
An additional visualization we were interested in seeing was the impact of floor on
the prices. In fact, we found that as the floor of an apartment in New York City increased,
so did the price. We created a scatter plot, shown to the right, to depict this.
Finally, after drawing some very high level insights from the graphs above, we were able to
create the following dashboards:
Dashboard 1
As previously stated, the first dashboard was created to provide the user with
high-level insights on each of the cities individually as well as a comparative analysis
across all cities. When the (All) filter is selected, users can compare the quartiles of
apartment prices across cities and see how the move-in month affects
the square footage they would be able to find and at what price. Because the dashboard also
shows a distribution of price, users can see at a glance what ranges they would be looking at.
For example, a user who wants to look specifically at New York City would know that they
could possibly find apartments in the $2000-$2500 range, but the greatest number of apartments
they would be able to find would be around $3500-$4500.
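The dashboard itself was built in Tableau, but the two views described above, price quartiles per city and a price distribution for one city, can be approximated in pandas. The sketch below uses made-up prices and assumed column names (`City`, `Price`); it does not reproduce the report's actual figures.

```python
import pandas as pd

# Hypothetical prices; not the values from the Equity Residential dataset.
prices = pd.DataFrame({
    "City": ["Boston"] * 4 + ["New York City"] * 4,
    "Price": [2400, 2800, 3200, 3600, 2200, 3600, 4000, 4400],
})

# Quartiles of price per city: one row per city, one column per quantile.
quartiles = (
    prices.groupby("City")["Price"]
    .quantile([0.25, 0.5, 0.75])
    .unstack()
)
print(quartiles)

# Price distribution for one city in $500-wide bins, e.g. [2000, 2500).
nyc = prices[prices["City"] == "New York City"]
bins = list(range(2000, 5001, 500))
nyc_bins = pd.cut(nyc["Price"], bins=bins, right=False).value_counts(sort=False)
print(nyc_bins)
```

The bin with the highest count corresponds to the "greatest number of apartments" range that a user would read off the dashboard's histogram.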
Dashboard 2
Once the user has determined which two cities have apartments within a reasonable price,
they can navigate to the second dashboard, which gives them more detailed
insights into each city: for example, which areas in each city have the most vacancies and
what the average number of bedrooms and bathrooms is. If a user is hoping to improve their
odds of living in a 2-bedroom apartment, they may want to consider a city like San Diego, where
the average number of bedrooms is 1.673, over a city like New York, where the average number
of bedrooms is 1.093. Finally, for some individuals the amenities offered by an apartment would
be a deal-breaker. By going further through the dashboard, they would be able to see
what percentage of apartments in each city have each of the amenities featured.
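A two-city comparison of this kind could be approximated outside Tableau with a grouped summary. The following is only a sketch: the column names and sample rows are hypothetical, and the averages it prints are not the 1.673 and 1.093 figures reported above.

```python
import pandas as pd

# Hypothetical units; values are illustrative, not taken from the dataset.
units = pd.DataFrame({
    "City": ["San Diego", "San Diego", "New York City", "New York City"],
    "Beds": [2, 1, 1, 1],
    "Baths": [2, 1, 1, 1],
    "Fireplace": [0, 1, 0, 0],
})

def compare_cities(df: pd.DataFrame, city_a: str, city_b: str) -> pd.DataFrame:
    """Summarize average bedrooms, bathrooms, and one amenity share for two cities."""
    subset = df[df["City"].isin([city_a, city_b])]
    return subset.groupby("City").agg(
        avg_beds=("Beds", "mean"),
        avg_baths=("Baths", "mean"),
        # Share of apartments with a fireplace, as a percentage.
        pct_fireplace=("Fireplace", lambda s: s.mean() * 100),
    )

summary = compare_cities(units, "San Diego", "New York City")
print(summary)
```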
Part 5: Limitations and Future Work
As the analysis began, we noticed that there was only data for 10 major cities in the
United States, missing cities such as Chicago, Atlanta, and Miami. While 10 major cities were
sufficient for an initial analysis, it would be beneficial to include additional cities in future
analysis so that the different regions of the United States are fairly represented
(Pacific Northwest, Midwest, South, etc.). In future expansions of this project, it would be
interesting to apply the web scraping techniques used to gather the original data to collect data
for additional cities.
An additional limitation is that the web scraping may
not have been entirely accurate. Because we relied on data scraped by another individual,
we could not verify its accuracy, and it was also not the most up to date. According to
Kaggle, the data was scraped in 2021, so we believe the prices and availability of apartments
may have been impacted by the pandemic. To overcome this limitation, future work on this
project could include re-scraping the data from the Equity Residential website for more
up-to-date listings and prices.
Finally, it would be interesting to create a third dashboard with the possibility of
performing predictive analysis. For example, if an individual wants to live in an apartment in
New York City with a city skyline view and on the 4th floor, what is an approximate price range
that the individual would be looking at? Additionally, what are the odds of even finding such an
accommodation? Such an enhancement of our project would be able to increase its usability
amongst individuals that are comparing cities as they look to move apartments.