Problem 21: Final exam, Fall 2020: The legacy of "redlining"
Version 1.2 (Added clarification on regression part)
This problem builds on your knowledge of the Python data stack to analyze data that contains geographic information. It has 6 exercises, numbered 0 to 5. There are 13 available points. However, to earn 100%, the threshold is just 10 points. (Therefore, once you hit 10 points, you can stop. There is no extra credit for exceeding this threshold.)
Each exercise builds logically on the previous one, but you may solve them in any order. That is, if you can't solve an exercise, you can still move on and
try the next one.
However, if you see a code cell introduced by the phrase, "Sample result for ...", please run it.
Some demo cells in the notebook
may depend on these precomputed results.
The point values of individual exercises are as follows:
Exercise 0: 2 points
Exercise 1: 3 points
Exercise 2: 2 points
Exercise 3: 2 points
Exercise 4: 2 points
Exercise 5: 2 points
Pro-tips.
- All test cells use randomly generated inputs. Therefore, try your best to write solutions that do not assume too much. To help you debug, when a test cell does fail, it will often tell you exactly what inputs it was using and what output it expected, compared to yours.
- If you need a complex SQL query, remember that you can define one using a triple-quoted (multiline) string (https://docs.python.org/3.7/tutorial/introduction.html#strings).
- If your program's behavior seems strange, try resetting the kernel and rerunning everything.
- If you mess up this notebook or just want to start from scratch, save copies of all your partial responses and use Actions → Reset Assignment to get a fresh, original copy of this notebook. (Resetting will wipe out any answers you've written so far, so be sure to stash those somewhere safe if you intend to keep or reuse them!)
- If you generate excessive output (e.g., from an ill-placed print statement) that causes the notebook to load slowly or not at all, use Actions → Clear Notebook Output to get a clean copy. The clean copy will retain your code but remove any generated output. However, it will also rename the notebook to clean.xxx.ipynb. Since the autograder expects a notebook file with the original name, you'll need to rename the clean notebook accordingly.
Good luck!
Background
During the economic Great Depression of the 1930s, the United States government began "rating" neighborhoods, on a letter-grade scale of "A" ("good")
to "D" ("bad"). The purpose was to use such grades to determine which neighborhoods would qualify for new investments, in the form of residential and
business loans.
But these grades also reflected racial and ethnic bias toward the residents of their neighborhoods. Nearly 100 years later, the effects have taken the form
of environmental and economic disparities.
In this notebook, you will get an idea of how such an analysis can come together using publicly available data and the basic computational data processing
techniques that appeared in this course. (And after you finish the exam, we hope you will try the optional exercise at the end and refer to the "epilogue" for
related reading.)
Goal and workflow.
Your goal is to see if there is a relationship between the rating a neighborhood received in the 1930s and two attributes we can
observe today: the average temperature of a neighborhood and the average home price.
Temperature tells you something about the local environment. Areas with more parks, trees, and green space tend to experience more moderate
temperatures.
The average home price tells you something about the wealth or economic well-being of the neighborhood's residents.
Your workflow will consist of the following steps:
1. You'll start with neighborhood rating data, which was collected from public records as part of a University of Richmond study on redlining policies (https://dsl.richmond.edu/panorama/redlining).
2. You'll then combine these data with satellite images, which give information about climate. These data come from the US Geological Survey (https://usgs.gov/).
3. Lastly, you'll merge these data with home prices from the real estate website, Zillow (https://zillow.com/).
Note: The analysis you will perform is correlational, but the deeper research that inspired this problem tries to control for a variety of factors
and suggests causal effects.
Part 0: Setup
At a minimum, you will need the following modules in this problem. They include a new one we did not cover called geopandas. While it may be new to you, if you have mastered pandas, then you know almost everything you need to use geopandas. Anything else you need will be given to you as part of this problem, so don't be intimidated!
In [1]:
import sys
print(f"* Python version:
{sys.version}
")
# Standard packages you know and love:
import pandas as pd
import numpy as np
import scipy as sp
import matplotlib.pyplot as plt
import geopandas
print("* geopandas version:", geopandas.__version__)
Run the next code cell, which will load some tools needed by the test cells.
In [2]:
### BEGIN HIDDEN TESTS
%load_ext autoreload
%autoreload 2
### END HIDDEN TESTS

from testing_tools import data_fn, load_geopandas, load_df, load_pickle

from testing_tools import f_ex0__sample_result
from testing_tools import f_ex1__sample_result
from testing_tools import f_ex2__sample_result
from testing_tools import f_ex3__sample_result
from testing_tools import f_ex4__sample_result
from testing_tools import f_ex5__sample_result
Part 1: Neighborhood ratings
The neighborhood rating data is stored in a special extension of a pandas DataFrame called a GeoDataFrame. Let's load the data into a variable named neighborhood_ratings and have a peek at the first few rows:
* Python version: 3.7.5 (default, Dec 18 2019, 06:24:58)
[GCC 5.5.0 20171010]
* geopandas version: 0.6.2
In [3]:
neighborhood_ratings = load_geopandas('fullDownload.geojson')
print(type(neighborhood_ratings))
neighborhood_ratings.head()
Each row is a neighborhood. Its location is given by name, city, and a two-letter state abbreviation code (the name, city, and state columns, respectively). The rating assigned to a neighborhood is a letter, 'A', 'B', 'C', or 'D', given by the holc_grade column.
In addition, there is a special column called geometry. It contains a geographic outline of the boundaries of this neighborhood. Let's take a look at row 4 (the last row shown above):
In [4]:
g4_example = neighborhood_ratings.loc[4, 'geometry']
print("* Type of `g4_example`:", type(g4_example))
print("
\n
* Contents of `g4_example`:", g4_example)
print("
\n
* A quick visual preview:")
display(g4_example)
The output indicates that this boundary is stored as a special object type called a MultiPolygon. It is usually a single connected polygon, but may also be the union of multiple such polygons.
The coordinates of the multipolygon's corners are floating-point values, and correspond to longitude and latitude values (https://www.latlong.net/). But for this notebook, the exact format won't be important. Simply treat the shapes as being specified in some way via a collection of two-dimensional coordinates measured in arbitrary units.
Lastly, observe that calling display() on a MultiPolygon renders a small picture of it.
Opening geopandas data file, './resource/asnlib/publicdata/fullDownload.geojson' ...
<class 'geopandas.geodataframe.GeoDataFrame'>
Out[3]:
   state  city        name                                                holc_id  holc_grade  area_description_data                               geometry
0  AL     Birmingham  Mountain Brook Estates and Country Club Garden...   A1       A           {'5': 'Both sales and rental prices in 1929 we...   MULTIPOLYGON (((-86.75678 33.49754, -86.75692 ...
1  AL     Birmingham  Redmont Park, Rockridge Park, Warwick Manor, a...   A2       A           {'5': 'Both sales and rental prices in 1929 we...   MULTIPOLYGON (((-86.75867 33.50933, -86.76093 ...
2  AL     Birmingham  Colonial Hills, Pine Crest (outside city limits)    A3       A           {'5': 'Generally speaking, houses are not buil...   MULTIPOLYGON (((-86.75678 33.49754, -86.75196 ...
3  AL     Birmingham  Grove Park, Hollywood, Mayfair, and Edgewood s...   B1       B           {'5': 'Both sales and rental prices in 1929 we...   MULTIPOLYGON (((-86.80111 33.48071, -86.80099 ...
4  AL     Birmingham  Best section of Woodlawn Highlands                   B10      B           {'5': 'Both sales and rental prices in 1929 we...   MULTIPOLYGON (((-86.74923 33.53332, -86.74916 ...
* Type of `g4_example`: <class 'shapely.geometry.multipolygon.MultiPolygon'>
* Contents of `g4_example`: MULTIPOLYGON (((-86.749227 33.533325, -86.749156 33.530809, -86.75388599999999 33.529075, -86.754373 33.529382, -86.754729 33.529769, -86.754729 33.530294, -86.75604800000001 33.531225, -86.75539499999999 33.532008, -86.754456 33.532335, -86.753196 33.531483, -86.749714 33.533295, -86.749227 33.533325)))
* A quick visual preview:
Exercise 0: Filtering ratings (2 points)
Complete the function,
def
filter_ratings(ratings, city_st, targets=None):
...
so that it filters ratings data by its city and state name, along with a set of targeted letter grades. In particular, the inputs are:
- ratings: A geopandas GeoDataFrame similar to the neighborhood_ratings example above.
- city_st: The name of a city and two-letter state abbreviation as a string, e.g., city_st = 'Atlanta, GA' to request only rows for Atlanta, Georgia.
- targets: A Python set containing zero or more ratings, e.g., targets = {'A', 'C'} to request only rows having either an 'A' grade or a 'C' grade.
The function should return a copy of the input GeoDataFrame that has the same columns as ratings but only rows that match both the desired city_st value and any one of the target ratings.
For example, suppose ratings is the following:

   city         state  holc_grade  holc_id  (... other cols not shown ...)  geometry
0  Chattanooga  TN     C           C4       ...                             MULTIPOLYGON(...)
1  Augusta      GA     C           C5       ...                             MULTIPOLYGON(...)
2  Chattanooga  TN     B           B7       ...                             MULTIPOLYGON(...)
3  Chattanooga  TN     A           A1       ...                             MULTIPOLYGON(...)
4  Augusta      GA     B           B4       ...                             MULTIPOLYGON(...)
5  Augusta      GA     D           D11      ...                             MULTIPOLYGON(...)
6  Augusta      GA     B           B1       ...                             MULTIPOLYGON(...)
7  Chattanooga  TN     D           D8       ...                             MULTIPOLYGON(...)
8  Chattanooga  TN     C           C7       ...                             MULTIPOLYGON(...)
Then filter_ratings(ratings, 'Chattanooga, TN', {'A', 'C'}) would return:

   city         state  holc_grade  holc_id  (... other cols not shown ...)  geometry
0  Chattanooga  TN     C           C4       ...                             MULTIPOLYGON(...)
3  Chattanooga  TN     A           A1       ...                             MULTIPOLYGON(...)
8  Chattanooga  TN     C           C7       ...                             MULTIPOLYGON(...)
All of these rows match 'Chattanooga, TN' and have a holc_grade value of either 'A' or 'C'. Other columns, such as holc_id and any columns not shown, would be returned as-is from the original input.
Note 0: We will test your function on a randomly generated data frame. The input is guaranteed to have the columns 'city', 'state', 'holc_grade', and 'geometry'. However, it may have other columns with arbitrary names; your function should ensure these pass through unchanged, including the types.
Note 1: Observe that targets may be None, which is the default value if unspecified by the caller. In this case, you should not filter by rating, but only by city_st. The targets variable may be the empty set, in which case your function should return an empty GeoDataFrame.
Note 2: You may return the rows in any order. We will use a function similar to tibbles_are_equivalent from Notebook 7 to determine if your output matches what we expect.
In [5]:
def filter_ratings(ratings, city_st, targets=None):
    assert isinstance(ratings, geopandas.GeoDataFrame)
    assert isinstance(targets, set) or (targets is None)
    assert {'city', 'state', 'holc_grade', 'geometry'} <= set(ratings.columns)
    ### BEGIN SOLUTION
    matches_city_st = (ratings['city'] + ', ' + ratings['state']) == city_st
    matches_targets = ratings['holc_grade'].isin(targets or set()) | (targets is None)
    return ratings[matches_city_st & matches_targets]
    ### END SOLUTION
In [6]:
# Demo cell
ex0_demo_result = filter_ratings(neighborhood_ratings, 'Atlanta, GA', targets={'A', 'C'})
print(type(ex0_demo_result), len(ex0_demo_result))
# Result: `<class 'geopandas.geodataframe.GeoDataFrame'> 51`
ex0_demo_result.sample(5)
In [7]:
# Test cell: f_ex0__filter_ratings (2 points)

### BEGIN HIDDEN TESTS
def f_ex0__gen_soln(grade=None, fn_base="atl", fn_ext="geojson", overwrite=False):
    from testing_tools import file_exists, load_geopandas, save_geopandas
    if grade is None:
        fn = f"{fn_base}.{fn_ext}"
        targets = None
    else:
        fn = f"{fn_base}-{grade}.{fn_ext}"
        targets = {grade}
    if file_exists(fn) and not overwrite:
        gdf = load_geopandas(fn)
    else:  # not file_exists(fn) or overwrite
        gdf = filter_ratings(neighborhood_ratings, 'Atlanta, GA', targets=targets)
        save_geopandas(gdf, fn, overwrite=overwrite)
    return gdf

for g_ex0 in [None, 'A', 'B', 'C', 'D']:
    f_ex0__gen_soln(grade=g_ex0, overwrite=False)
### END HIDDEN TESTS

from testing_tools import f_ex0__check

print("Testing...")
for trial in range(125):
    f_ex0__check(filter_ratings)

filter_ratings__passed = True
print("\n(Passed!)")
<class 'geopandas.geodataframe.GeoDataFrame'> 51
Out[6]:
      state  city     name                                                holc_id  holc_grade  area_description_data                               geometry
1352  GA     Atlanta  Section north of Fourteenth Street between Pie...   C6       C           {'0': 'Atlanta, Georgia', '5': 'Property if ac...   MULTIPOLYGON (((-84.38755 33.79744, -84.38753 ...
1330  GA     Atlanta  Glenwood Ave. to Georgia R.R., between Morelan...   C24      C           {'0': 'Atlanta, Georgia', '5': 'Property if ac...   MULTIPOLYGON (((-84.33963 33.73995, -84.34898 ...
1328  GA     Atlanta  East Lake (All in DeKalb County, but portion i...   C22      C           {'0': '', '5': 'Property if acquired in this s...   MULTIPOLYGON (((-84.30035 33.75901, -84.30035 ...
1348  GA     Atlanta  Newer portion of Hapeville                           C40      C           {'0': 'Atlanta, Georgia', '5': 'Property if ac...   MULTIPOLYGON (((-84.41621 33.66343, -84.41614 ...
1326  GA     Atlanta  Oakhurst and older portion of Decatur (in DeKa...   C20      C           {'0': 'Atlanta, Georgia', '5': 'Property, if a...   MULTIPOLYGON (((-84.28502 33.77148, -84.28360 ...
Opening geopandas data file, './resource/asnlib/publicdata/atl.geojson' ...
Opening geopandas data file, './resource/asnlib/publicdata/atl-A.geojson' ...
Opening geopandas data file, './resource/asnlib/publicdata/atl-B.geojson' ...
Opening geopandas data file, './resource/asnlib/publicdata/atl-C.geojson' ...
Opening geopandas data file, './resource/asnlib/publicdata/atl-D.geojson' ...
Testing...
(Passed!)
Sample result of filter_ratings (Exercise 0) for Atlanta. If you had a working solution to Exercise 0, then in principle you could use it to visualize these neighborhoods, color-coded by grade, as the following cell does for 'Atlanta, GA'.
Run this cell even if you did not complete Exercise 0.
In [8]:
f_ex0__sample_result();
# The black "star" is Georgia Tech!
Bounding boxes
Recall that a geopandas dataframe includes a 'geometry' column, which defines the geographic shape of each neighborhood using special multipolygon objects. To simplify some geometric calculations, a useful operation is to determine a multipolygon's bounding box, which is the smallest rectangle that encloses it.
Getting a bounding box is easy! For example, recall the neighborhood in row 4 of the neighborhood_ratings geopandas dataframe:
In [9]:
g4_example = neighborhood_ratings.loc[4, 'geometry']
print("* Type of `g4_example`:", type(g4_example))
print("
\n
* Contents of `g4_example`:", g4_example)
print("
\n
* A quick visual preview:")
display(g4_example)
The bounding box is given to you by the multipolygon's .bounds attribute. This attribute is a Python 4-tuple (tuple with four components) that encodes both the lower-left corner and the upper-right corner of the shape. Here is what that tuple looks like for the previous example:
In [10]:
print("* Recall: `g4_example` ==", g4_example)
print("
\n
* ==> `g4_example.bounds` ==", g4_example.bounds)
Opening geopandas data file, './resource/asnlib/publicdata/atl.geojson' ...
* Type of `g4_example`: <class 'shapely.geometry.multipolygon.MultiPolygon'>
* Contents of `g4_example`: MULTIPOLYGON (((-86.749227 33.533325, -86.749156 33.530809, -86.75388599999999 33.529075, -86.754373 33.529382, -86.754729 33.529769, -86.754729 33.530294, -86.75604800000001 33.531225, -86.75539499999999 33.532008, -86.754456 33.532335, -86.753196 33.531483, -86.749714 33.533295, -86.749227 33.533325)))
* A quick visual preview:
* Recall: `g4_example` == MULTIPOLYGON (((-86.749227 33.533325, -86.749156 33.530809, -86.75388599999999 33.529075, -86.754373 33.529382, -86.754729 33.529769, -86.754729 33.530294, -86.75604800000001 33.531225, -86.75539499999999 33.532008, -86.754456 33.532335, -86.753196 33.531483, -86.749714 33.533295, -86.749227 33.533325)))
* ==> `g4_example.bounds` == (-86.756048, 33.529075, -86.749156, 33.533325)
The first two elements of the tuple are the smallest possible x-value and the smallest possible y-value among all points of the multipolygon. The last two
elements are the largest x-value and y-value.
If it's helpful, here is a plot that superimposes the bounding box on g4_example:
In [11]:
# Draw the multipolygon as a solid gray line:
from testing_tools import
plot_multipolygon, plot_bounding_box
plot_multipolygon(g4_example, color='gray')
# Add the bounding box as a dashed black line:
plot_bounding_box(g4_example.bounds, color='black', linestyle='--')
Exercise 1: Bounding box of all neighborhoods (3 points)
Complete the function, get_bounds(gdf), below, so that it returns the coordinates of a single bounding box for all neighborhoods in a given dataframe.
For example, suppose gdf_ex1_demo holds rows 3 and 4 of the neighborhood_ratings dataframe:
In [12]:
gdf_ex1_demo = neighborhood_ratings.loc[[3, 4]]
gdf_ex1_demo
This dataframe has these bounds for each of the two rows:
In [13]:
print(gdf_ex1_demo.loc[3, 'geometry'].bounds)
print(gdf_ex1_demo.loc[4, 'geometry'].bounds)
Therefore, the bounding box for gdf_ex1_demo is the smallest rectangle that covers both neighborhoods, or (-86.815458, 33.464794, -86.749156, 33.533325). The next code cell illustrates the result.
Out[12]:
   state  city        name                                                holc_id  holc_grade  area_description_data                               geometry
3  AL     Birmingham  Grove Park, Hollywood, Mayfair, and Edgewood s...   B1       B           {'5': 'Both sales and rental prices in 1929 we...   MULTIPOLYGON (((-86.80111 33.48071, -86.80099 ...
4  AL     Birmingham  Best section of Woodlawn Highlands                   B10      B           {'5': 'Both sales and rental prices in 1929 we...   MULTIPOLYGON (((-86.74923 33.53332, -86.74916 ...
(-86.815458, 33.464794, -86.767064, 33.483678)
(-86.756048, 33.529075, -86.749156, 33.533325)
In [14]:
plot_multipolygon(gdf_ex1_demo.loc[3, 'geometry'], color='blue')
plot_bounding_box(gdf_ex1_demo.loc[3, 'geometry'].bounds, color='blue', linestyle=':')
plot_multipolygon(gdf_ex1_demo.loc[4, 'geometry'], color='gray')
plot_bounding_box(gdf_ex1_demo.loc[4, 'geometry'].bounds, color='gray', linestyle=':')
gdf_ex1_demo_bounding_box = (-86.815458, 33.464794, -86.749156, 33.533325)
plot_bounding_box(gdf_ex1_demo_bounding_box, color='black', linestyle='--')
The plot shows two multipolygons, along with the bounding box around each one as dotted lines. Your function should return a single bounding box for all
multipolygons, which we show as the dashed black line that encloses both.
Note 0: The test cell will use randomly generated input data frames. Per the example above, your solution should only depend on the presence of a column named 'geometry', and should return a correct result no matter what other columns are present in the input.
Note 1: We've provided a partial solution that handles the corner-case of an empty input dataframe, so your solution can focus on dataframes having at least one row.
In [15]:
def get_bounds(gdf):
    assert isinstance(gdf, geopandas.GeoDataFrame)
    if len(gdf) == 0:
        return None
    assert len(gdf) >= 1
    ### BEGIN SOLUTION
    return (gdf['geometry'].apply(lambda x: x.bounds[0]).min(),
            gdf['geometry'].apply(lambda x: x.bounds[1]).min(),
            gdf['geometry'].apply(lambda x: x.bounds[2]).max(),
            gdf['geometry'].apply(lambda x: x.bounds[3]).max())
    ### END SOLUTION
In [16]:
# Demo cell
your_gdf_ex1_demo_bounding_box = get_bounds(gdf_ex1_demo)
print("Your result on the demo dataframe:", your_gdf_ex1_demo_bounding_box)
print("Expected result:", gdf_ex1_demo_bounding_box)

assert all([np.isclose(a, b) for a, b in zip(your_gdf_ex1_demo_bounding_box,
                                             gdf_ex1_demo_bounding_box)]), \
    "*** Your result does not match our example! ***"

print("Great -- so far, your result matches our expected result.")
Your result on the demo dataframe: (-86.815458, 33.464794, -86.749156, 33.533325)
Expected result: (-86.815458, 33.464794, -86.749156, 33.533325)
Great -- so far, your result matches our expected result.
In [17]:
# Test cell: f_ex1__get_bounds (3 points)

### BEGIN HIDDEN TESTS
def f_ex1__gen_soln(fn_base="atl-bb", fn_ext="pickle", overwrite=False):
    from testing_tools import file_exists, load_pickle, save_pickle
    fn = f"{fn_base}.{fn_ext}"
    if file_exists(fn) and not overwrite:
        bounds = load_pickle(fn)
    else:
        gdf = f_ex0__gen_soln()
        bounds = get_bounds(gdf)
        save_pickle(bounds, fn)
    return bounds

f_ex1__gen_soln(overwrite=False)
### END HIDDEN TESTS

from testing_tools import f_ex1__check

print("Testing...")
for trial in range(250):
    f_ex1__check(get_bounds)

print("\n(Passed!)")
Sample result of get_bounds (Exercise 1) for Atlanta. If your function was working, then you could calculate the bounding box for Atlanta, which would be the following.
Run this cell even if you did not complete Exercise 1.
In [18]:
_, _, f_ex1__atl_bounds = f_ex1__sample_result();
print(f"Bounding box for Atlanta: {f_ex1__atl_bounds}")
Part 2: Temperature analysis
We have downloaded satellite images that cover some of the cities in the neighborhood_ratings dataset. Each pixel of an image is the estimated temperature at the earth's surface. The images we downloaded were taken by the satellite on a summer day.
Here is an example of a satellite image that includes the Atlanta, Georgia neighborhoods used in earlier examples. The code cell below loads this image, draws it, and superimposes the Atlanta bounding box. The image is stored in the variable sat_demo. The geopandas dataframe for Atlanta is stored in gdf_sat_demo, and its bounding box in bounds_sat_demo.
Opening pickle from './resource/asnlib/publicdata/atl-bb.pickle' ...
Testing...
(Passed!)
Opening geopandas data file, './resource/asnlib/publicdata/atl.geojson' ...
Opening pickle from './resource/asnlib/publicdata/atl-bb.pickle' ...
Bounding box for Atlanta: (-84.457945, 33.637042, -84.254692, 33.869701)
In [19]:
from testing_tools import load_satellite_image, plot_satellite_image

# Load a satellite image that includes the Atlanta area
sat_demo = load_satellite_image('LC08_CU_024013_20190808_20190822_C01_V01_ST--EPSG_4326.tif')
fig = plt.figure()
plot_satellite_image(sat_demo, ax=fig.gca())

# Add the bounding box for Atlanta
_, gdf_sat_demo, bounds_sat_demo = f_ex1__sample_result(do_plot=False);
plot_bounding_box(bounds_sat_demo, color='black', linestyle='dashed')
Masked images: merging the satellite and neighborhood data.
A really cool feature of a geopandas dataframe is that you can "intersect" its polygons with an image!
We wrote a function called mask_image_by_geodf(img, gdf) that does this merging for you. It takes as input a satellite image, img, and a geopandas dataframe, gdf. It then clips the image to the bounding box of gdf and masks the pixels: pixels falling within the multipolygon regions of gdf retain their original value, while everything outside those regions gets a special "undefined" value.
Here is an example. First, let's call mask_image_by_geodf to generate the Numpy array, stored as sat_demo_masked:
In [20]:
def mask_image_by_geodf(img, gdf):
    from json import loads
    from rasterio.mask import mask
    gdf_json = loads(gdf.to_json())
    gdf_coords = [f['geometry'] for f in gdf_json['features']]
    out_img, _ = mask(img, shapes=gdf_coords, crop=True)
    return out_img[0]

sat_demo_masked = mask_image_by_geodf(sat_demo, gdf_sat_demo)
print(sat_demo_masked.shape)
sat_demo_masked
The output shows the clipped result has a shape of 798 x 698 pixels, and the values are 16-bit integers (dtype=int16). The first thing you might notice is a bunch of values equal to -9999. That is the special value indicating that the given pixel falls outside of any neighborhood polygon.
Any other integer is the estimated surface temperature in degrees Kelvin (https://en.wikipedia.org/wiki/Kelvin#2019_redefinition) multiplied by 10.
For instance, suppose a pixel has the value 3167, as embedded in the sample output above. That is 3167 / 10 = 316.7 degrees Kelvin, which in degrees Celsius would be 316.7 - 273.15 = 43.55 degrees Celsius. (That, in turn, is approximately (316.7 - 273.15) * 9/5 + 32 = 110.39 degrees Fahrenheit.)
In our analysis, we'd like to inspect the average temperatures of the neighborhoods, ignoring the -9999 values.
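To make the unit conversion concrete, here is a tiny sketch of that arithmetic for a single raw pixel value (plain Python, nothing exam-specific):

raw = 3167                        # raw pixel value: degrees Kelvin times 10
kelvin = raw / 10                 # 316.7 K
celsius = kelvin - 273.15         # 43.55 degrees Celsius
fahrenheit = celsius * 9/5 + 32   # roughly 110.39 degrees Fahrenheit
print(celsius, fahrenheit)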
Opening satellite image, './resource/asnlib/publicdata/LC08_CU_024013_20190808_20190822_C01_V01_ST--
EPSG_4326.tif' ...
Opening geopandas data file, './resource/asnlib/publicdata/atl.geojson' ...
Opening pickle from './resource/asnlib/publicdata/atl-bb.pickle' ...
(798, 698)
Out[20]:
array([[-9999, -9999, -9999, ..., -9999, -9999, -9999],
[-9999, -9999, -9999, ..., -9999, -9999, -9999],
[-9999, -9999, -9999, ..., -9999, -9999, -9999],
...,
[-9999, -9999,
3167, ..., -9999, -9999, -9999],
[-9999, -9999, -9999, ..., -9999, -9999, -9999],
[-9999, -9999, -9999, ..., -9999, -9999, -9999]], dtype=int16)
If it's helpful, here is a picture of that Numpy array. The dark regions correspond to the -9999 values that fall outside the neighborhoods of gdf_sat_demo; the bright ones indicate the presence of valid temperatures. If they appear to have the same color or shade, it's because the -9999 values make other "real" temperatures look nearly the same.
In [21]:
plt.imshow(sat_demo_masked)
Exercise 2: Cleaning masked images (2 points)
To help our analysis, your next task is to clean a masked image, converting its values to degrees Celsius.
In particular, let masked_array be any Numpy array holding int16 values, where the value -9999 represents masked or missing values, and any other integer is a temperature in degrees Kelvin times 10. You should complete the function, masked_to_degC(masked_array), so that it returns a new Numpy array having the same shape as masked_array, but with the following properties:
- The new array should hold floating-point values, not integers. That is, the new Numpy array should have dtype=float.
- Every -9999 value should be converted into a not-a-number (NaN) value.
- Any other integer value should be converted to degrees Celsius.
For instance, suppose masked_array is the following 2-D Numpy array:

[[-9999  2950 -9999]
 [-9999  3167  2014]
 [-9999  3075  3222]
 [ 2801 -9999  2416]]
Then the output array should have the following values:

[[   nan  21.85    nan]
 [   nan  43.55 -71.75]
 [   nan  34.35  49.05]
 [  6.95    nan -31.55]]
Note 0: The simplest way to use a NaN value is through the predefined constant, np.nan (https://numpy.org/doc/stable/user/misc.html).
Note 1: There are three demo cells. Two of them show plots in addition to input/output pairs, in case you work better with visual representations. In the plots, any NaN entries will appear as blanks (white space).
Note 2: Your function must work for an input array of any dimension greater than or equal to 1. That is, it could be a 1-D array, a 2-D array (e.g., like true images), or 3-D or higher. Solutions that only work on 2-D arrays will only get half credit (one point instead of two).
In [22]:
# Note:
print(np.nan)  # a single NaN value
Out[21]:
<matplotlib.image.AxesImage at 0x7f17e6ac85d0>
nan
In [23]:
def masked_to_degC(masked_array):
    assert isinstance(masked_array, np.ndarray)
    assert masked_array.ndim >= 1
    assert np.issubdtype(masked_array.dtype, np.integer)
    ### BEGIN SOLUTION
    new_array = masked_array.astype(float)
    new_array[new_array == -9999] = np.nan
    new_array *= 0.1
    new_array -= 273.15
    return new_array
    ### END SOLUTION
In [24]:
# Demo cell 0:
img_ex2_demo = np.array([[-9999,  2950, -9999],
                         [-9999,  3167,  2014],
                         [-9999,  3075,  3222],
                         [ 2801, -9999,  2416]], dtype=np.int16)
img_ex2_demo_clean = masked_to_degC(img_ex2_demo)
print(img_ex2_demo)
print()
print(img_ex2_demo_clean)

plt.imshow(img_ex2_demo_clean)
plt.colorbar();
In [25]:
# Demo cell 1: Try a 1-D array. Expected output: array([-260.85, nan, -227.55, -194.25, nan])
masked_to_degC(np.array([123, -9999, 456, 789, -9999], dtype=np.int16))
[[-9999  2950 -9999]
 [-9999  3167  2014]
 [-9999  3075  3222]
 [ 2801 -9999  2416]]

[[   nan  21.85    nan]
 [   nan  43.55 -71.75]
 [   nan  34.35  49.05]
 [  6.95    nan -31.55]]
Out[25]:
array([-260.85,     nan, -227.55, -194.25,     nan])
In [26]:
# Demo cell 2: Apply to the example satellite image
sat_demo_clean_ex2 = masked_to_degC(sat_demo_masked)
print(sat_demo_clean_ex2)
plt.imshow(sat_demo_clean_ex2);
In [27]:
# Test cell 0: f_ex2__masked_to_degC_2d (1 point)

### BEGIN HIDDEN TESTS
def f_ex2__gen_soln(fn_base="atl-masked-cleaned", fn_ext="pickle", overwrite=False):
    from testing_tools import file_exists, load_pickle, save_pickle
    fn = f"{fn_base}.{fn_ext}"
    if file_exists(fn) and not overwrite:
        img_clean = load_pickle(fn)
    else:  # not file_exists(fn) or overwrite
        gdf = f_ex0__gen_soln()
        img = load_satellite_image('LC08_CU_024013_20190808_20190822_C01_V01_ST--EPSG_4326.tif')
        img_masked = mask_image_by_geodf(img, gdf)
        img_clean = masked_to_degC(img_masked)
        save_pickle(img_clean, fn)
    return img_clean

f_ex2__gen_soln(overwrite=False)
### END HIDDEN TESTS

from testing_tools import f_ex2__check

print("Testing...")
for trial in range(250):
    f_ex2__check(masked_to_degC, ndim=2)

masked_to_degC__passed_2d = True
print("\n(Passed the 2-D case!)")
In [28]:
# Test cell 1: f_ex2__masked_to_degC_nd (1 point)
from testing_tools import f_ex2__check

print("Testing...")
for trial in range(250):
    f_ex2__check(masked_to_degC, ndim=None)

print("\n(Passed the any-D case!)")
[[  nan   nan   nan ...   nan   nan   nan]
 [  nan   nan   nan ...   nan   nan   nan]
 [  nan   nan   nan ...   nan   nan   nan]
 ...
 [  nan   nan 43.55 ...   nan   nan   nan]
 [  nan   nan   nan ...   nan   nan   nan]
 [  nan   nan   nan ...   nan   nan   nan]]
Opening pickle from './resource/asnlib/publicdata/atl-masked-cleaned.pickle' ...
Testing...
(Passed the 2-D case!)
Testing...
(Passed the any-D case!)
Sample result of masked_to_degC (Exercise 2) on the Atlanta data. A correct implementation of masked_to_degC would, when applied to the Atlanta data, produce a masked image resembling what follows.
Run this cell even if you did not complete Exercise 2.
In [29]:
sat_demo_clean = f_ex2__sample_result();
Exercise 3: Average temperature (2 points)
Suppose you are given masked_array, a Numpy array of masked floating-point temperatures like that produced by masked_to_degC in Exercise 2. That is, it has floating-point temperature values except at "masked" entries, which are marked by NaN values. Complete the function mean_temperature(masked_array) so that it returns the mean temperature value over all pixels, ignoring any NaNs.
For example, suppose masked_array equals the Numpy array,

[[   nan  21.85    nan]
 [   nan  43.55 -71.75]
 [   nan  34.35  49.05]
 [  6.95    nan -31.55]]

where the values are in degrees Celsius. Then mean_temperature(masked_array) would equal (21.85+43.55-71.75+34.35+49.05+6.95-31.55)/7, which is approximately 7.49 degrees Celsius.
Note 0: Your approach should work for an input array of any dimension. You'll get partial credit (1 point) if it works for 2-D input arrays, and full credit (2 points) if it works for arrays of all dimensions.
Note 1: If all input values are NaN values, then your function should return NaN.
In [30]:
def mean_temperature(masked_array):
    assert isinstance(masked_array, np.ndarray)
    assert np.issubdtype(masked_array.dtype, np.floating)
    ### BEGIN SOLUTION
    return np.nanmean(masked_array)
    ### END SOLUTION
In [31]:
# Demo cell 0:
img_ex3_demo_clean = np.array([[np.nan,  21.85, np.nan],
                               [np.nan,  43.55, -71.75],
                               [np.nan,  34.35,  49.05],
                               [  6.95, np.nan, -31.55]])
mean_temperature(img_ex3_demo_clean)  # Expected result: ~ 7.49
Opening pickle from './resource/asnlib/publicdata/atl-masked-cleaned.pickle' ...
Out[31]:
7.492857142857143
In [32]:
# Demo cell 1: Check the 1-D case, as an example (expected output is roughly -227.55)
mean_temperature(np.array([-260.85, np.nan, -227.55, -194.25, np.nan]))
In [33]:
# Demo cell 2: Mean temperature in Atlanta (a.k.a., "Hotlanta!")
mean_temperature(sat_demo_clean)
In [34]:
# Test cell 0: f_ex3__mean_temperature_2d (1 point)

### BEGIN HIDDEN TESTS
def f_ex3__gen_soln(grade=None, fn_base="atl-temp", fn_ext="pickle", overwrite=False):
    from testing_tools import file_exists, load_pickle, save_pickle
    if grade is None:
        fn = f"{fn_base}.{fn_ext}"
    else:
        fn = f"{fn_base}-{grade}.{fn_ext}"
    if file_exists(fn) and not overwrite:
        temperature = load_pickle(fn)
    else:  # not file_exists(fn) or overwrite
        gdf = f_ex0__gen_soln(grade=grade)
        img = load_satellite_image('LC08_CU_024013_20190808_20190822_C01_V01_ST--EPSG_4326.tif')
        img_masked = mask_image_by_geodf(img, gdf)
        img_clean = masked_to_degC(img_masked)
        temperature = mean_temperature(img_clean)
        save_pickle(temperature, fn)
    return temperature

for g_ex2 in [None, 'A', 'B', 'C', 'D']:
    f_ex3__gen_soln(grade=g_ex2, overwrite=False)
### END HIDDEN TESTS

from testing_tools import f_ex3__check

print("Testing...")
for trial in range(250):
    f_ex3__check(mean_temperature, ndim=2)

mean_temperature__passed_2d = True
print("\n(Passed the 2-D case!)")
In [35]:
# Test cell 1: f_ex3__mean_temperature_nd (1 point)
from testing_tools import f_ex3__check

print("Testing...")
for trial in range(250):
    f_ex3__check(mean_temperature, ndim=None)

print("\n(Passed the N-D case!)")
Out[32]:
-227.55000000000004
Out[33]:
39.24372126540902
Opening pickle from './resource/asnlib/publicdata/atl-temp.pickle' ...
Opening pickle from './resource/asnlib/publicdata/atl-temp-A.pickle' ...
Opening pickle from './resource/asnlib/publicdata/atl-temp-B.pickle' ...
Opening pickle from './resource/asnlib/publicdata/atl-temp-C.pickle' ...
Opening pickle from './resource/asnlib/publicdata/atl-temp-D.pickle' ...
Testing...
(Passed the 2-D case!)
/usr/lib/python3.7/site-packages/ipykernel_launcher.py:5: RuntimeWarning: Mean of empty slice
"""
Testing...
(Passed the N-D case!)
/usr/lib/python3.7/site-packages/ipykernel_launcher.py:5: RuntimeWarning: Mean of empty slice
"""
Sample result of mean_temperature (Exercise 3) for Atlanta. If all of your code were working up until now, you could analyze the average temperature in each type of neighborhood by rating. You would see the result below. It shows that there is an observable difference in temperature based on the rating of the neighborhood -- a difference of 5 to 6 degrees Celsius is about 10 degrees Fahrenheit.
Run this cell even if you did not complete Exercise 3.
In [36]:
f_ex3__sample_result();
Part 3: Real estate data
The last piece of data we'll incorporate is real estate data. Here is the raw data:
In [37]:
home_prices = load_df("Zip_zhvi_uc_sfrcondo_tier_0.33_0.67_sm_sa_mon.csv")  # From Zillow
print("\nColumns:\n", home_prices.columns, "\n")
home_prices.head(3)
This dataframe has a lot of information, but here are the elements you need:
- Each row gives historical average home price estimates for different areas of the United States. The areas are uniquely identified by their 5-digit zip code, stored as integers in the 'RegionName' column. Zip codes are areas that are different from the neighborhoods you'd been considering previously.
- The city and two-letter state abbreviations are given by the 'City' and 'State' columns. Their values match the city and state abbreviations you've seen in the other data.
- The home price estimates appear in the columns given by numeric dates, in the string format 'yyyy-mm-dd'.
Average temperatures in Atlanta during some summer day:
* Overall: ~ 39.2 degrees Celsius (~ 102.6 deg F)
* In 1930s 'A'-rated neighborhoods: ~ 35.3 degrees Celsius (~ 95.6 deg F)
* In 1930s 'B'-rated neighborhoods: ~ 37.6 degrees Celsius (~ 99.7 deg F)
* In 1930s 'C'-rated neighborhoods: ~ 39.3 degrees Celsius (~ 102.8 deg F)
* In 1930s 'D'-rated neighborhoods: ~ 41.5 degrees Celsius (~ 106.7 deg F)

Reading a regular pandas dataframe from './resource/asnlib/publicdata/Zip_zhvi_uc_sfrcondo_tier_0.33_0.67_sm_sa_mon.csv' ...
Columns:
Index(['RegionID', 'SizeRank', 'RegionName', 'RegionType', 'StateName',
'State', 'City', 'Metro', 'CountyName', '1996-01-31',
...
'2020-01-31', '2020-02-29', '2020-03-31', '2020-04-30', '2020-05-31',
'2020-06-30', '2020-07-31', '2020-08-31', '2020-09-30', '2020-10-31'],
dtype='object', length=307)
Out[37]:
   RegionID  SizeRank  RegionName  RegionType  StateName  State  City      Metro                        CountyName       1996-01-31  ...  2020-01-31
0  61639     0         10025       Zip         NY         NY     New York  New York-Newark-Jersey City  New York County  223469.0    ...  115249
1  84654     1         60657       Zip         IL         IL     Chicago   Chicago-Naperville-Elgin     Cook County      205864.0    ...  476938
2  61637     2         10023       Zip         NY         NY     New York  New York-Newark-Jersey City  New York County  227596.0    ...  110572

3 rows × 307 columns
Exercise 4: Cleaning the dataframe (2 points)
Given a regular pandas dataframe df formatted like home_prices above, complete the function clean_zip_prices(df) so that it returns a new dataframe containing the following columns:
- 'ZipCode': The 5-digit zip code, taken from the 'RegionName' column and stored as integers.
- 'City': The city name, taken directly from 'City'.
- 'State': The two-letter state abbreviation, taken directly from 'State'.
- 'Price': The home price, taken as the latest (most recent) date column and stored as floating-point values. In home_prices, the latest or most recent date is '2020-10-31'; therefore, the 'Price' column of the output would contain the values from this column.
For example, suppose df is the following:

   RegionID  SizeRank  RegionName  RegionType  StateName  State  City             Metro                     CountyName          1996-01-31  2020-09-30  2020-10-31
0  98046     6533      95212       Zip         CA         CA     Stockton         Stockton-Lodi             San Joaquin County  nan         424606      430334
1  68147     16308     24445       Zip         VA         VA     Hot Springs      nan                       Bath County         nan         138424      138496
2  84364     3748      60110       Zip         IL         IL     Carpentersville  Chicago-Naperville-Elgin  Kane County         138980      178311      179852
Then your function would return:

   ZipCode  City             State  Price
0  95212    Stockton         CA     430334.0
1  24445    Hot Springs      VA     138496.0
2  60110    Carpentersville  IL     179852.0
Note 0: We will test your code on randomly generated input dataframes. Therefore, your solution should only depend on the existence of the columns 'RegionName', 'City', 'State', and at least one column whose name is formatted as a date-string (yyyy-mm-dd). Any other columns may have different names from what is shown above and, in any case, are immaterial to your solution.
Note 1: A helpful function for searching for column names matching a given pattern is df.filter (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.filter.html).
Note 2: Row ordering does not matter, since we will use a tibbles_are_equivalent-type function to check for dataframe equivalence.
In [38]:
def clean_zip_prices(df):
    assert isinstance(df, pd.DataFrame)
    ### BEGIN SOLUTION
    last_date = sorted(df.filter(regex=r'\d{4}-\d{2}-\d{2}', axis=1).columns)[-1]
    df_new = df[['RegionName', 'City', 'State', last_date]]
    df_new = df_new.rename(columns={'RegionName': 'ZipCode', last_date: 'Price'})
    df_new['ZipCode'] = df_new['ZipCode'].astype(int)
    df_new['Price'] = df_new['Price'].astype(float)
    return df_new
    ### END SOLUTION
In [39]:
# Demo cell
clean_zip_prices(home_prices)
In [40]:
# Test cell: f_ex4__clean_zip_prices (2 points)

### BEGIN HIDDEN TESTS
def f_ex4__gen_soln(fn_base="zip-prices", fn_ext="pickle", overwrite=False):
    from testing_tools import file_exists, load_df, load_pickle, save_pickle
    fn = f"{fn_base}.{fn_ext}"
    if file_exists(fn) and not overwrite:
        df_clean = load_pickle(fn)
    else:  # not file_exists(fn) or overwrite
        df = load_df("Zip_zhvi_uc_sfrcondo_tier_0.33_0.67_sm_sa_mon.csv")  # From Zillow
        df_clean = clean_zip_prices(df)
        save_pickle(df_clean, fn)
    return df_clean

f_ex4__gen_soln(overwrite=False)
### END HIDDEN TESTS

from testing_tools import f_ex4__check

print("Testing...")
for trial in range(250):
    f_ex4__check(clean_zip_prices)

print("\n(Passed!)")
Sample result of clean_zip_prices (Exercise 4). A successful implementation of Exercise 4 would produce a cleaned dataframe for home_prices as shown below.
Run this cell even if you did not complete Exercise 4.
Out[39]:
       ZipCode  City              State  Price
0      10025    New York          NY     1073416.0
1      60657    Chicago           IL     492585.0
2      10023    New York          NY     1152889.0
3      77494    Katy              TX     347871.0
4      60614    Chicago           IL     629989.0
...    ...      ...               ...    ...
30225  47865    Carlisle          IN     44241.0
30226  20052    Washington        DC     1343080.0
30227  801      Charlotte Amalie  UT     30100.0
30228  820      Choudrant         LA     191183.0
30229  822      Choudrant         LA     190667.0

30230 rows × 4 columns
Opening pickle from './resource/asnlib/publicdata/zip-prices.pickle' ...
Testing...
(Passed!)
In [41]:
home_prices_clean = f_ex4__sample_result()
home_prices_clean.head()
Zip code boundaries
To merge the home prices with the neighborhood rating information, we need the geographic boundaries of the zip codes. The following code loads a
geopandas dataframe with this information:
In [42]:
zip_geo = load_pickle('tl_2017_us_zcta510.pickle')
zip_geo.head(3)
This dataframe has just two columns: the zip code, stored as a string in the column named 'GEOID10', and 'geometry', which holds the shape of the zip code's area. Being stored in a geopandas dataframe, each zip code's boundary can be visualized easily and a bounding box computed, as the code cell below demonstrates.
Note 0: Zip codes in this dataframe are stored as strings, rather than integers as in the pricing dataframe.
Note 1: The sample zip code visualized by the following code cell is a bit unusual in that it consists of three spatially disconnected regions. However, that won't matter. Just note that each zip code is associated with some shape, just like the neighborhoods of the 1930s ratings data.
In [43]:
plot_multipolygon(zip_geo.loc[2, 'geometry'], color='blue')
plot_bounding_box(zip_geo.loc[2, 'geometry'].bounds, color='black', linestyle='dashed')
Opening pickle from './resource/asnlib/publicdata/zip-prices.pickle' ...
Out[41]:
   ZipCode  City      State  Price
0  10025    New York  NY     1073416.0
1  60657    Chicago   IL     492585.0
2  10023    New York  NY     1152889.0
3  77494    Katy      TX     347871.0
4  60614    Chicago   IL     629989.0
Opening pickle from './resource/asnlib/publicdata/tl_2017_us_zcta510.pickle' ...
Out[42]:
   GEOID10  geometry
0  43451    POLYGON ((-83.70873 41.32733, -83.70815 41.327...
1  43452    POLYGON ((-83.08698 41.53780, -83.08256 41.537...
2  43456    MULTIPOLYGON (((-82.83558 41.71082, -82.83515 ...
Exercise 5 (last one!): Merging price and geographic boundaries (2 points)
Complete the function, merge_prices_with_geo(prices_clean, zip_gdf), so that it merges price information stored in prices_clean with geographic boundaries stored in zip_gdf.
- The prices_clean object is a pandas dataframe that will have four columns, 'ZipCode', 'City', 'State', and 'Price', as would be produced by clean_zip_prices (Exercise 4).
- The zip_gdf input is a geopandas dataframe with two columns, 'GEOID10' and 'geometry'.
Your function should return a new geopandas dataframe with five columns: 'ZipCode', 'City', 'State', 'Price', and 'geometry'.
Note 0: Recall that the 'ZipCode' column of prices_clean stores values as integers, whereas the 'GEOID10' column of zip_gdf stores values as strings. In your final result, store the 'ZipCode' column using integer values.
Note 1: We are only interested in zip codes with both price information and known geographic boundaries. That is, if a zip code is missing in either prices_clean or zip_gdf, you should ignore and omit it from your output.
Note 2: If df is a pandas dataframe, you can convert it to a geopandas one simply by calling geopandas.GeoDataFrame(df).
In [44]:
def merge_prices_with_geo(prices_clean, zip_gdf):
    assert isinstance(prices_clean, pd.DataFrame)
    assert isinstance(zip_gdf, geopandas.GeoDataFrame)
    ### BEGIN SOLUTION
    zip_gdf_int = zip_gdf.copy()
    zip_gdf_int['ZipCode'] = zip_gdf_int['GEOID10'].astype(int)
    prices_gdf = geopandas.GeoDataFrame(prices_clean)
    return prices_gdf.merge(zip_gdf_int[['ZipCode', 'geometry']], on='ZipCode')
    ### END SOLUTION
In [45]:
# Demo cell
merge_prices_with_geo(home_prices_clean, zip_geo).head(3)
In [46]:
merge_prices_with_geo(home_prices_clean, zip_geo).head(3)
Out[45]:
   ZipCode  City      State  Price      geometry
0  10025    New York  NY     1073416.0  POLYGON ((-73.97701 40.79281, -73.97695 40.792...
1  60657    Chicago   IL     492585.0   POLYGON ((-87.67850 41.94504, -87.67802 41.945...
2  10023    New York  NY     1152889.0  POLYGON ((-73.99015 40.77231, -73.98992 40.773...

Out[46]:
   ZipCode  City      State  Price      geometry
0  10025    New York  NY     1073416.0  POLYGON ((-73.97701 40.79281, -73.97695 40.792...
1  60657    Chicago   IL     492585.0   POLYGON ((-87.67850 41.94504, -87.67802 41.945...
2  10023    New York  NY     1152889.0  POLYGON ((-73.99015 40.77231, -73.98992 40.773...
In [47]:
# Test cell: f_ex5__merge_prices_with_geo (2 points)

### BEGIN HIDDEN TESTS
def f_ex5__gen_soln(fn_base="prices-geo", fn_ext="pickle", overwrite=False):
    from testing_tools import file_exists, load_df, load_pickle, save_pickle
    fn = f"{fn_base}.{fn_ext}"
    if file_exists(fn) and not overwrite:
        result = load_pickle(fn)
    else:  # not file_exists(fn) or overwrite
        prices = f_ex4__sample_result()
        geo = load_pickle('tl_2017_us_zcta510.pickle')
        prices_geo = merge_prices_with_geo(prices, geo)
        neighborhood_ratings = load_geopandas('fullDownload.geojson')
        result = geopandas.overlay(neighborhood_ratings,
                                   prices_geo[['ZipCode', 'Price', 'geometry']],
                                   how='intersection')
        save_pickle(result, fn)
    return result

f_ex5__gen_soln(overwrite=False)
### END HIDDEN TESTS

from testing_tools import f_ex5__check

print("Testing...")
for trial in range(250):
    f_ex5__check(merge_prices_with_geo)

print("\n(Passed!)")
Part 4: Fin! (Epilogue and optional wrap-up)
There are no additional required exercises — you’ve reached the end of the final exam and, therefore, of the class! Don’t forget to restart and run all cells
again to make sure it’s all working when run in sequence; and make sure your work passes the submission process. Good luck!
The code cells below provide a bit more supplementary information and analysis. If you've finished early and want to try an interesting analysis, give the
optional "Exercise 6," below, a shot!
Sample result of merge_prices_with_geo (Exercise 5).
One incredibly cool feature of geopandas is that it can do spatial (geographic) queries for you. For instance, let's merge the neighborhood rating data with the housing price data. The geopandas merging routines will account for how the geographic zones in one dataframe intersect with the other.
Visually, imagine laying the two "geographies" on top of one another, as illustrated below. (The shading corresponds with house prices by zip code regions, and the hollow polygons correspond to neighborhoods.)
If a neighborhood overlaps with two zip codes, the merge can create two rows in the output, one for each (neighborhood, zip code) combination. That allows you to run subsequent queries, like examining the relationship between rating and price.
We have carried out this merge for you. Run the cell below to load that precomputed result into a geopandas dataframe named neighborhood_prices (as opposed to the original, neighborhood_ratings).
Opening pickle from './resource/asnlib/publicdata/prices-geo.pickle' ...
Testing...
(Passed!)
In [48]:
neighborhood_prices = f_ex5__sample_result()
neighborhood_prices.head()
If we consider just the Atlanta area, here is how today's average house price varies by that original 1930s neighborhood rating.
This code cell requires a working solution to Exercise 0.
In [49]:
from testing_tools import f_ex5b__sample_result

if 'filter_ratings__passed' in globals() and filter_ratings__passed:
    f_ex5b__sample_result(neighborhood_prices, 'Atlanta, GA', filter_ratings)  # Try other cities!
else:
    print("This code cell was not run because it needs a working version of `filter_ratings` from Exercise 0.")
OPTIONAL Exercise 6: Put it all together (no points)
We've provided satellite images for not just Atlanta, but all the cities defined in the dictionary below. Use all of your code from earlier exercises to repeat
the Atlanta analysis for all other neighborhoods.
In particular, construct a table that shows, for each city, the percent differences in mean temperature and house price among the A/B/C/D-rated
neighborhoods. Then try running a multiple regression (Notebook 12) to see how the 1930s rating predicts them.
Hint: To conduct the regression analysis, you'll want to "dummy-code" the ratings variable, since it is categorical (A, B, C, and D values). See this explanation (https://stats.idre.ucla.edu/spss/faq/coding-systems-for-categorical-variables-in-regression-analysis-2/#DUMMYCODING) if you aren't familiar with this practice. When you construct the data matrix, pandas's pd.get_dummies() (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.get_dummies.html) function is a good tool.
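For instance, here is a small, self-contained sketch of dummy-coding with pd.get_dummies; the toy ratings column below is made up purely for illustration:

ratings_demo = pd.DataFrame({'ratings': ['A', 'C', 'D', 'B', 'C']})

# One 0/1 indicator column per category: ratings_A, ratings_B, ratings_C, ratings_D
dummies_demo = pd.get_dummies(ratings_demo['ratings'], prefix='ratings')
print(dummies_demo)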
Opening pickle from './resource/asnlib/publicdata/prices-geo.pickle' ...
Out[48]:
   state  city        name                                                holc_id  holc_grade  area_description_data                               ZipCode  Price     geometry
0  AL     Birmingham  Mountain Brook Estates and Country Club Garden...   A1       A           {'5': 'Both sales and rental prices in 1929 we...   35223    610462.0  MULTIPOLYGON (((-86.76062 33.49298, -86.76202 ...
1  AL     Birmingham  Redmont Park, Rockridge Park, Warwick Manor, a...   A2       A           {'5': 'Both sales and rental prices in 1929 we...   35223    610462.0  MULTIPOLYGON (((-86.77425 33.49538, -86.77430 ...
2  AL     Birmingham  Colonial Hills, Pine Crest (outside city limits)    A3       A           {'5': 'Generally speaking, houses are not buil...   35223    610462.0  POLYGON ((-86.75454 33.48883, -86.76227 33.488...
3  AL     Birmingham  Grove Park, Hollywood, Mayfair, and Edgewood s...   B1       B           {'5': 'Both sales and rental prices in 1929 we...   35223    610462.0  MULTIPOLYGON (((-86.77070 33.47585, -86.76970 ...
4  AL     Birmingham  First Addition to South Highlands                    B3       B           {'5': 'Both sales and rental prices in 1929 we...   35223    610462.0  POLYGON ((-86.77843 33.49367, -86.77847 33.490...
Average house price in Atlanta, GA:
* Overall: ~ $398,127.40
* In 1930s 'A'-rated neighborhoods: ~ $589,818.12
* In 1930s 'B'-rated neighborhoods: ~ $445,985.37
* In 1930s 'C'-rated neighborhoods: ~ $393,349.46
* In 1930s 'D'-rated neighborhoods: ~ $317,403.05
In [50]:
# Cities and their available satellite images
satellite_image_data = {
    'Birmingham, AL': 'LC08_CU_022014_20200614_20200628_C01_V01_ST--EPSG_4326.tif'
    , 'Los Angeles, CA': 'LC08_CU_003012_20200703_20200709_C01_V01_ST--EPSG_4326.tif'
    , 'Denver, CO': 'LC08_CU_012009_20200812_20200824_C01_V01_ST--EPSG_4326.tif'  # incomplete
    , 'New Haven, CT': 'LC08_CU_029006_20200613_20200627_C01_V01_ST--EPSG_4326.tif'  # incomplete
    , 'Jacksonville, FL': 'LC08_CU_026016_20160709_20190430_C01_V01_ST--EPSG_4326.tif'  # mostly complete
    , 'Atlanta, GA': 'LC08_CU_024013_20190808_20190822_C01_V01_ST--EPSG_4326.tif'
    , 'Chicago, IL': 'LC08_CU_021007_20160624_20181205_C01_V01_ST--EPSG_4326.tif'  # incomplete (multi-tile)
    , 'Indianapolis, IN': 'LC08_CU_022009_20200824_20200907_C01_V01_ST--EPSG_4326.tif'
    , 'Louisville, KY': 'LC08_CU_023010_20200817_20200825_C01_V01_ST--EPSG_4326.tif'  # incomplete (multi-tile)
    , 'New Orleans, LA': 'LC08_CU_020016_20200612_20200627_C01_V01_ST--EPSG_4326.tif'
    , 'Boston, MA': 'LC08_CU_030006_20180719_20190614_C01_V01_ST--EPSG_4326.tif'  # incomplete (multi-tile)
    , 'Baltimore, MD': 'LC08_CU_028008_20190812_20190822_C01_V01_ST--EPSG_4326.tif'  # mostly complete
    , 'Detroit, MI': 'LC08_CU_024007_20190714_20190723_C01_V01_ST--EPSG_4326.tif'  # mostly complete
    , 'Minneapolis, MN': 'LC08_CU_018005_20190613_20190621_C01_V01_ST--EPSG_4326.tif'
    , 'St.Louis, MO': 'LC08_CU_020010_20180723_20190614_C01_V01_ST--EPSG_4326.tif'
    , 'Charlotte, NC': 'LC08_CU_026012_20180823_20190614_C01_V01_ST--EPSG_4326.tif'
    , 'Bergen Co., NJ': 'LC08_CU_029007_20190830_20190919_C01_V01_ST--EPSG_4326.tif'
    , 'Manhattan, NY': 'LC08_CU_029007_20190830_20190919_C01_V01_ST--EPSG_4326.tif'  # same tile as Bergen Co., NJ!
    , 'Brooklyn, NY': 'LC08_CU_029007_20190830_20190919_C01_V01_ST--EPSG_4326.tif'  # same tile as Bergen Co., NJ!
    , 'Columbus, OH': 'LC08_CU_024009_20190824_20190908_C01_V01_ST--EPSG_4326.tif'  # partial
    , 'Portland, OR': 'LE07_CU_003003_20190813_20190910_C01_V01_ST--EPSG_4326.tif'
    , 'Philadelphia, PA': 'LC08_CU_028008_20190720_20190803_C01_V01_ST--EPSG_4326.tif'
    , 'Nashville, TN': 'LC08_CU_022012_20190603_20190621_C01_V01_ST--EPSG_4326.tif'
    , 'Dallas, TX': 'LC08_CU_016014_20190816_20190906_C01_V01_ST--EPSG_4326.tif'
    , 'Richmond, VA': 'LC08_CU_027010_20190727_20190803_C01_V01_ST--EPSG_4326.tif'
    , 'Seattle, WA': 'LC08_CU_003002_20190828_20190905_C01_V01_ST--EPSG_4326.tif'  # partial
    , 'Milwaukee Co., WI': 'LC08_CU_021007_20180801_20190614_C01_V01_ST--EPSG_4326.tif'
    , 'Charleston, WV': 'LC08_CU_025010_20190817_20190905_C01_V01_ST--EPSG_4326.tif'
}
In [51]:
# Develop your analysis here!

### BEGIN SOLUTION
# Merges neighborhood ratings, satellite image with temperatures, and Zillow data
# for a given city and (optionally) target rating. Returns the mean temperature
# and house price.
def merge_city(neighborhood_ratings, satimg, neighborhood_prices, city_st, targets=None):
    ratings = filter_ratings(neighborhood_ratings, city_st, targets)
    satimg_clean = mask_image_by_geodf(satimg, ratings)
    degC = mean_temperature(masked_to_degC(satimg_clean))
    prices = filter_ratings(neighborhood_prices, city_st, targets)
    price = prices['Price'].mean()
    return degC, price

# Merges data for every city having ratings, a temperature satellite image, and prices
def merge_all_data(ratings, images, prices):
    all_rows = []
    for city_st, satimg_filename in images.items():
        print(f"Processing {city_st} [image={satimg_filename}] ...")
        satimg = load_satellite_image(satimg_filename, verbose=False)
        degC_overall, price_overall = merge_city(ratings, satimg, prices, city_st)
        for rating in ['A', 'B', 'C', 'D']:
            degC_r, price_r = merge_city(ratings, satimg, prices, city_st, targets={rating})
            delta_degC = round(degC_r - degC_overall, 1)
            price_percent = round((price_r - price_overall) / price_overall * 100.0, 1)
            all_rows.append((city_st, rating, delta_degC, price_percent))
    return pd.DataFrame(all_rows, columns=['city_st', 'ratings', 'delta_degC', '%price'])

# Construct a data matrix from a table with a single categorical predictor
def build_matrix_dataframe(df, continuous=[], categorical=[], standardize=False, add_bias_term=False):
    df_data = df[continuous] if continuous else pd.DataFrame()
    if standardize:
        for col in continuous:
            mu = df[col].mean(skipna=True)
            df_data[col] = (df[col] - mu) / mu
    for col in categorical:
        df_cat_col = pd.get_dummies(df[col], prefix=col)
        df_data = pd.concat([df_data, df_cat_col], axis=1)
    if add_bias_term:
        df_data['__ones__'] = np.ones(len(df))
    return df_data

# ========== Analysis begins here ==========

# First, verify that dependent functions work:
if not ('filter_ratings__passed' in globals() and filter_ratings__passed
        and 'mean_temperature__passed_2d' in globals() and mean_temperature__passed_2d
        and 'masked_to_degC__passed_2d' in globals() and masked_to_degC__passed_2d):
    print("This code cell was not run because it needs working solutions from earlier exercises.")
    assert False, "*** Stopping execution here. ***"

summary_tibble = merge_all_data(neighborhood_ratings, satellite_image_data, neighborhood_prices)
summary_df = summary_tibble.pivot(index='city_st', values=['delta_degC', '%price'], columns=['ratings']).reset_index()
display(summary_df)

from numpy.linalg import lstsq
print(f"Regressing on neighborhood ratings {tuple(summary_tibble['ratings'].unique())}:")
X = build_matrix_dataframe(summary_tibble, categorical=['ratings']).values
for response in ['delta_degC', '%price']:
    y = summary_tibble[response]
    theta, _, _, _ = lstsq(X, y, rcond=None)
    print(f"* Response '{response}' has these weights: {theta.T}")
### END SOLUTION
Processing Birmingham, AL [image=LC08_CU_022014_20200614_20200628_C01_V01_ST--EPSG_4326.tif] ...
Processing Los Angeles, CA [image=LC08_CU_003012_20200703_20200709_C01_V01_ST--EPSG_4326.tif] ...
Processing Denver, CO [image=LC08_CU_012009_20200812_20200824_C01_V01_ST--EPSG_4326.tif] ...
Processing New Haven, CT [image=LC08_CU_029006_20200613_20200627_C01_V01_ST--EPSG_4326.tif] ...
Processing Jacksonville, FL [image=LC08_CU_026016_20160709_20190430_C01_V01_ST--EPSG_4326.tif] ...
Processing Atlanta, GA [image=LC08_CU_024013_20190808_20190822_C01_V01_ST--EPSG_4326.tif] ...
Processing Chicago, IL [image=LC08_CU_021007_20160624_20181205_C01_V01_ST--EPSG_4326.tif] ...
Processing Indianapolis, IN [image=LC08_CU_022009_20200824_20200907_C01_V01_ST--EPSG_4326.tif] ...
Processing Louisville, KY [image=LC08_CU_023010_20200817_20200825_C01_V01_ST--EPSG_4326.tif] ...
Processing New Orleans, LA [image=LC08_CU_020016_20200612_20200627_C01_V01_ST--EPSG_4326.tif] ...
Processing Boston, MA [image=LC08_CU_030006_20180719_20190614_C01_V01_ST--EPSG_4326.tif] ...
Processing Baltimore, MD [image=LC08_CU_028008_20190812_20190822_C01_V01_ST--EPSG_4326.tif] ...
Processing Detroit, MI [image=LC08_CU_024007_20190714_20190723_C01_V01_ST--EPSG_4326.tif] ...
Processing Minneapolis, MN [image=LC08_CU_018005_20190613_20190621_C01_V01_ST--EPSG_4326.tif] ...
Processing St.Louis, MO [image=LC08_CU_020010_20180723_20190614_C01_V01_ST--EPSG_4326.tif] ...
Processing Charlotte, NC [image=LC08_CU_026012_20180823_20190614_C01_V01_ST--EPSG_4326.tif] ...
Processing Bergen Co., NJ [image=LC08_CU_029007_20190830_20190919_C01_V01_ST--EPSG_4326.tif] ...
Processing Manhattan, NY [image=LC08_CU_029007_20190830_20190919_C01_V01_ST--EPSG_4326.tif] ...
Processing Brooklyn, NY [image=LC08_CU_029007_20190830_20190919_C01_V01_ST--EPSG_4326.tif] ...
Processing Columbus, OH [image=LC08_CU_024009_20190824_20190908_C01_V01_ST--EPSG_4326.tif] ...
Processing Portland, OR [image=LE07_CU_003003_20190813_20190910_C01_V01_ST--EPSG_4326.tif] ...
Processing Philadelphia, PA [image=LC08_CU_028008_20190720_20190803_C01_V01_ST--EPSG_4326.tif] ...
Processing Nashville, TN [image=LC08_CU_022012_20190603_20190621_C01_V01_ST--EPSG_4326.tif] ...
Processing Dallas, TX [image=LC08_CU_016014_20190816_20190906_C01_V01_ST--EPSG_4326.tif] ...
Processing Richmond, VA [image=LC08_CU_027010_20190727_20190803_C01_V01_ST--EPSG_4326.tif] ...
Processing Seattle, WA [image=LC08_CU_003002_20190828_20190905_C01_V01_ST--EPSG_4326.tif] ...
Processing Milwaukee Co., WI [image=LC08_CU_021007_20180801_20190614_C01_V01_ST--EPSG_4326.tif] ...
Processing Charleston, WV [image=LC08_CU_025010_20190817_20190905_C01_V01_ST--EPSG_4326.tif] ...
|    | city_st           | delta_degC (A) | (B)  | (C)  | (D)  | %price (A) | (B)   | (C)   | (D)   |
|----|-------------------|----------------|------|------|------|------------|-------|-------|-------|
| 0  | Atlanta, GA       | -3.9           | -1.6 | 0.1  | 2.3  | 48.1       | 12.0  | -1.2  | -20.3 |
| 1  | Baltimore, MD     | -2.7           | -1.2 | 1.2  | 2.8  | 11.1       | 10.6  | -4.7  | -14.8 |
| 2  | Bergen Co., NJ    | -2.9           | -1.0 | 0.2  | 2.3  | 22.4       | 3.3   | -1.1  | -10.3 |
| 3  | Birmingham, AL    | -3.6           | 0.1  | 1.6  | -0.2 | 166.6      | 52.9  | -27.6 | -35.2 |
| 4  | Boston, MA        | -5.4           | -0.9 | -0.2 | 2.3  | 17.9       | 3.4   | -5.6  | 6.1   |
| 5  | Brooklyn, NY      | -0.0           | -0.2 | -0.2 | 0.3  | -32.8      | -4.9  | -2.0  | 5.5   |
| 6  | Charleston, WV    | -2.8           | 0.6  | 1.1  | -1.0 | 21.5       | -3.6  | -5.0  | 3.8   |
| 7  | Charlotte, NC     | -2.4           | -0.6 | 0.5  | 1.1  | 38.5       | 27.2  | -17.6 | -7.4  |
| 8  | Chicago, IL       | -5.3           | -1.9 | 0.8  | 0.8  | 103.7      | 23.7  | -11.5 | -24.1 |
| 9  | Columbus, OH      | -1.6           | -0.1 | 0.3  | 0.8  | 27.5       | 5.5   | -4.0  | -10.9 |
| 10 | Dallas, TX        | -3.0           | -1.1 | 1.0  | 1.9  | 57.8       | -3.8  | -20.1 | -20.3 |
| 11 | Denver, CO        | 11.6           | 1.5  | -2.8 | 0.2  | 12.2       | 10.4  | -0.5  | -11.8 |
| 12 | Detroit, MI       | -3.0           | -0.8 | 0.0  | 1.0  | 59.1       | 14.9  | -1.4  | -19.9 |
| 13 | Indianapolis, IN  | -1.7           | -0.5 | -0.3 | 1.2  | 15.5       | -1.4  | -3.2  | 1.1   |
| 14 | Jacksonville, FL  | -3.7           | -0.6 | 0.1  | 0.9  | 32.5       | 7.7   | -7.8  | -10.0 |
| 15 | Los Angeles, CA   | -5.2           | -0.8 | 1.2  | 1.1  | 40.5       | 7.9   | -6.9  | -22.2 |
| 16 | Louisville, KY    | -2.5           | -0.4 | 0.3  | 1.9  | 46.7       | 10.0  | -12.7 | -15.3 |
| 17 | Manhattan, NY     | -1.0           | -1.0 | 0.0  | 0.6  | 1.0        | -12.3 | -12.9 | 7.5   |
| 18 | Milwaukee Co., WI | -2.9           | 0.4  | -0.2 | 0.7  | 29.8       | 12.0  | -5.4  | -10.3 |
| 19 | Minneapolis, MN   | -2.8           | -0.7 | 1.0  | 2.0  | 4.5        | 6.0   | -4.7  | -6.0  |
| 20 | Nashville, TN     | -2.2           | -0.3 | -0.2 | 1.3  | 20.7       | 5.3   | 0.5   | -11.2 |
| 21 | New Haven, CT     | -3.2           | -1.5 | 0.8  | 1.8  | 8.2        | -1.5  | -4.9  | 5.9   |
| 22 | New Orleans, LA   | -1.9           | -0.7 | 0.2  | 0.2  | 16.6       | 13.7  | 2.0   | -10.7 |
| 23 | Philadelphia, PA  | -4.4           | -1.2 | 1.6  | 2.9  | 34.9       | 4.9   | -15.0 | -10.5 |
| 24 | Portland, OR      | -5.8           | 0.0  | 0.7  | 2.1  | 13.2       | 1.7   | -7.8  | 7.6   |
| 25 | Richmond, VA      | -3.0           | -0.3 | 0.2  | 1.6  | 18.9       | 11.8  | -5.6  | -20.7 |
| 26 | Seattle, WA       | -3.2           | -0.6 | 1.0  | -0.2 | 8.1        | 3.0   | -1.0  | -8.9  |
| 27 | St.Louis, MO      | -2.0           | 0.2  | 0.1  | 1.2  | 40.6       | -11.2 | -12.6 | -14.2 |
Epilogue.
The analysis in this notebook is inspired by a New York Times article (https://www.nytimes.com/interactive/2019/08/09/climate/city-heat-islands.html) about the disproportionate effects of climate on different racial and socioeconomic groups. What you've done in this notebook just scratches the surface of that analysis, available in this paper (https://www.mdpi.com/2225-1154/8/1/12/htm), but we hope you can appreciate how remarkable it is that, with just a semester's worth of experience, this kind of data analysis is well within your grasp!
Indeed, although we ended up cutting it out of the problem, you could easily imagine applying any number of the analyses from Notebooks 12-15, starting with simple regression models that relate ratings to temperature and home prices, as in the sketch below.
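As a concrete illustration, here is a minimal sketch of one such follow-up, assuming the statsmodels package is available and that the summary_tibble DataFrame produced by the solution cell above (with columns 'ratings', 'delta_degC', and '%price') is still in scope:

import statsmodels.formula.api as smf

# Treat the letter grade as a categorical predictor of the temperature deviation
# and of the relative price. Q('%price') quotes the column name, which contains '%'.
temp_fit  = smf.ols("delta_degC ~ C(ratings)", data=summary_tibble).fit()
price_fit = smf.ols("Q('%price') ~ C(ratings)", data=summary_tibble).fit()

print(temp_fit.params)   # intercept = mean for rating 'A'; other terms are offsets from it
print(price_fit.params)

Unlike the lstsq fit in the solution cell, which uses one indicator column per rating and no intercept, this parameterization reports an intercept (the 'A' mean) plus offsets for 'B', 'C', and 'D'; the two describe the same group means in different forms.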
Data sources for this notebook:
- Redlining data: The Mapping Inequality website (https://cse6040.gatech.edu/sp23/img/dsl.richmond.edu/panorama/redlining)
- Satellite data: The US Geological Survey (USGS) Earth Explorer (https://earthexplorer.usgs.gov/) (we used the "provisional land surface temperature" images)
- Real estate data: Zillow Home Price Forecast Data for Researchers (https://www.zillow.com/research/data)
- Geographic boundaries for zip codes: this blog post (https://n8henrie.com/uploads/2017/11/plotting-us-census-data-with-python-and-geopandas.html), which derives this information from US Census Data (see the post for details)
Regressing on neighborhood ratings ('A', 'B', 'C', 'D')):
* Response 'delta_degC' has these weights: [-2.51785714 -0.54285714  0.36071429  1.21071429]
* Response '%price' has these weights: [31.61785714  7.47142857 -7.15357143 -9.91071429]
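Because the data matrix here is a one-hot encoding of the ratings with no intercept column, each least-squares weight is simply the mean response within that rating group: on average, 'A'-rated neighborhoods run about 2.5 degrees C cooler and about 32% pricier than their city overall, while 'D'-rated neighborhoods run about 1.2 degrees C warmer and about 10% cheaper. A quick sanity check, assuming summary_tibble is still in scope, is a sketch like:

# Per-rating means should reproduce the lstsq weights above (up to floating-point rounding).
print(summary_tibble.groupby('ratings')['delta_degC'].mean())
print(summary_tibble.groupby('ratings')['%price'].mean())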