JNB Lab: Housing Equity Initiative

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import folium # map rendering library
from folium.features import DivIcon #used to add popup info to a map
import shapely
from shapely.geometry import Point, Polygon
import shapefile # install pyshp first if you get a ModuleNotFoundError
import sklearn
from sklearn.cluster import KMeans

4.10. JNB Lab: Housing Equity Initiative

4.10.1. Introduction

In Fall 2021, the Housing Equity Initiative (HEI) was launched by Sunshine Gospel Ministries in partnership with The BLOCK Movement, both non-profit organizations deeply rooted in the South Side of Chicago. Housing inequity between White and Black people living in the United States is a nationwide problem. In 2020, the National Association of Realtors reported that over 70% of Whites live in their own homes, compared with less than 45% of Blacks. Based on Federal Reserve data for 2019, real estate wealth per capita is over $140,000 for Whites but under $60,000 for Blacks.

The Great Migration of those escaping the Jim Crow South brought a large influx of Blacks into Chicago in the first half of the 20th century. In its aftermath, the South and West Sides saw decades of housing-related injustices, including redlining, blockbusting, restrictive covenants, and slum landlording [Amdat 2021]. Many communities affected by structural racism still carry the burden of inequitable housing. For example, in the Woodlawn community area, which has roughly 9,000 housing units, almost 80% of units are renter-occupied, and in 2014, 36% of those renter households spent more than 50% of their income on rent. (For Woodlawn, the 2014 area median income (AMI) was roughly $25,000.)

HEI is focused on affordable home ownership in the Woodlawn area, the active construction site of the $500+ million Obama Presidential Center. HEI seeks to overcome a long history of structural barriers to home ownership by providing three forms of support for vulnerable families:

Tax-Sale Benevolence Program (TSBP) to help existing homeowners avoid foreclosure due to tax sales;

Downpayment Assistance Program (DAP) to assist responsible families with down payment on a home purchase; and

Affordable Revitalization Program (ARP) to revitalize empty lots and rehab properties with well-constructed, owner-occupied homes.

HEI’s goal is not simply to increase home ownership for neighborhood residents who have experienced housing inequity, but to maximize positive, rather than negative, home ownership experiences.

4.10.2. Park district facilities in Ward 20

Several factors can make a difference in whether a young person growing up in a disadvantaged urban community will flourish. One such community asset is the set of park district facilities. Here we will create a map of these resources for HEI’s neighborhood.

Import Ward 20 map.

img = Image.open("ward20.png")

Define a function to convert (lat,lon) to an (x,y) location on the Ward 20 map.

#Function to convert (lat,lon) to (x,y) location on the Ward 20 map
def coord(lat,lon):
    minlon=-87.672079
    maxlon=-87.586493
    xmin=47
    xmax=806
    x=xmin+(lon-minlon)*(xmax-xmin)/(maxlon-minlon)
    minlat=41.778798
    maxlat=41.801243
    ymax=189
    ymin=389
    y=ymin+(lat-minlat)*(ymax-ymin)/(maxlat-minlat)
    return x,y
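As a quick check of the linear interpolation in coord, the corners of the latitude/longitude box should land on the corners of the pixel box. The sketch below repeats the function so that it runs on its own; the midpoint test is added only for illustration.

```python
def coord(lat, lon):
    # Same linear map as above: longitude -> x pixel, latitude -> y pixel
    minlon, maxlon = -87.672079, -87.586493
    xmin, xmax = 47, 806
    x = xmin + (lon - minlon) * (xmax - xmin) / (maxlon - minlon)
    minlat, maxlat = 41.778798, 41.801243
    ymin, ymax = 389, 189  # image y grows downward, so the southern edge has the larger y
    y = ymin + (lat - minlat) * (ymax - ymin) / (maxlat - minlat)
    return x, y

sw = coord(41.778798, -87.672079)  # southwest corner of the box
ne = coord(41.801243, -87.586493)  # northeast corner of the box
mid = coord((41.778798 + 41.801243) / 2, (-87.672079 + -87.586493) / 2)

print(sw, ne, mid)
```

Because ymin is larger than ymax, increasing latitude moves a point up the image, which is the orientation a north-up map needs.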

Import a shape file with the Ward 20 boundary.

sf = shapefile.Reader("ward20.shp")
shapes=sf.shapes()
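Each shape record stores its boundary as a list of (x, y) vertices, here (longitude, latitude), which shapely can turn into a polygon for containment tests. The following is a minimal sketch in which a hypothetical square stands in for the ward boundary; the coordinates are made up for illustration.

```python
from shapely.geometry import Point, Polygon

# Hypothetical square boundary, vertices given as (longitude, latitude)
boundary = Polygon([(-87.65, 41.78), (-87.60, 41.78), (-87.60, 41.80), (-87.65, 41.80)])

inside = Point(-87.62, 41.79)   # (longitude, latitude), same order as the vertices
outside = Point(-87.70, 41.79)  # west of the square

print(inside.within(boundary), outside.within(boundary))
```

Keeping the (longitude, latitude) order consistent between the points and the polygon vertices matters; swapping them would silently place every point outside the boundary.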

Read in the Chicago Park District data from the Chicago Data Portal.

parks=pd.read_json("https://data.cityofchicago.org/resource/eix4-gf83.json?$limit=400000")
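The `$limit` query parameter in the URL sets the maximum number of rows the portal returns in one request. As a small sketch, the same URL can be assembled from its parts with the standard library, which makes the parameter easier to change; the variable names are illustrative only.

```python
from urllib.parse import urlencode

base = "https://data.cityofchicago.org/resource/eix4-gf83.json"
params = {"$limit": 400000}  # upper bound on the number of rows returned

# safe="$" keeps the dollar sign from being percent-encoded
url = base + "?" + urlencode(params, safe="$")
print(url)
```

The resulting string can be passed to pd.read_json in place of the hard-coded URL.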

parks.head(1)

	the_geom	objectid_1	park	park_no	facility_n	facility_t	x_coord	y_coord	gisobjid
0	{'type': 'Point', 'coordinates': [-87.63769762...	1	HAMILTON (ALEXANDER)	9	CULTURAL CENTER	SPECIAL	-87.637698	41.762999	2494

p=parks[["facility_n","facility_t","x_coord","y_coord"]]
p.columns=["type","loc","longitude","latitude"]
p.head(3)

	type	loc	longitude	latitude
0	CULTURAL CENTER	SPECIAL	-87.637698	41.762999
1	GYMNASIUM	INDOOR	-87.637929	41.762817
2	BASEBALL JR/SOFTBALL	OUTDOOR	-87.636914	41.760849

p["type"].value_counts()

type
BASKETBALL BACKBOARD         771
BASEBALL JR/SOFTBALL         537
PLAYGROUND                   517
TENNIS COURT                 507
BASKETBALL COURT             323
                            ... 
GOLF COURSE MINIATURE          1
SPORT ROLLER COIRT             1
NATURE PLAY AREA               1
ALFRED CALDWELL LILY POND      1
ART TURF - REGULATION          1
Name: count, Length: 75, dtype: int64
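Several facility types describe the same sport (for example, backboards and courts for basketball), so it can help to select a group of types at once. The following is a minimal sketch on a toy table; the rows are made up and only mimic the column names used above.

```python
import pandas as pd

# Toy stand-in for the facilities table (values are made up for illustration)
toy = pd.DataFrame({
    "type": ["BASKETBALL COURT", "PLAYGROUND", "BASKETBALL BACKBOARD", "TENNIS COURT"],
    "loc": ["OUTDOOR", "OUTDOOR", "OUTDOOR", "OUTDOOR"],
})

basketball_types = ["BASKETBALL BACKBOARD", "BASKETBALL COURT"]
basketball = toy[toy["type"].isin(basketball_types)]  # keep rows whose type is in the list
print(basketball["type"].value_counts())
```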

Map the Park District data.

plt.figure(figsize=(20,16))
plt.imshow(img,alpha=.1)
df=p
ward=Polygon(shapes[3].points) #boundary polygon, built once instead of on every iteration

#mark every facility inside the boundary with "o"; basketball is "1", baseball is "2"
for i in df.index:
    x,y=coord(df.loc[i,"latitude"],df.loc[i,"longitude"])
    point=Point(df.loc[i,"longitude"],df.loc[i,"latitude"])
    if point.within(ward):
        plt.text(x,y,"o",color='black',size=5,ha='center',va='bottom')
        if df.loc[i,"type"]=="BASKETBALL BACKBOARD":
            plt.text(x,y,"1",color='black',size=5,ha='center',va='bottom')
        if df.loc[i,"type"]=="BASKETBALL COURT":
            plt.text(x,y,"1",color='black',size=12,ha='center',va='bottom')
        if df.loc[i,"type"]=="BASEBALL JR/SOFTBALL":
            plt.text(x,y,"2",color='black',size=10,ha='center',va='bottom')
        if df.loc[i,"type"]=="BASEBALL SR":
            plt.text(x,y,"2",color='black',size=12,ha='center',va='bottom')

#legend, drawn once rather than once per facility
plt.text(75,250,"o=all CP facilities",color='black',size=10,ha='left',va='bottom')
plt.text(75,275,"1=basketball",color='black',size=15,ha='left',va='bottom')
plt.text(75,300,"2=baseball",color='black',size=15,ha='left',va='bottom')
plt.savefig("recreation.png")
plt.show()

[Figure: Ward 20 map of Chicago Park District facilities; basketball facilities are marked 1 and baseball facilities 2.]

Problem 1

Add playgrounds to the map in red.

4.10.3. Tax Year 2019 Owner-Occupied Tax Sale Data for Ward 20

Import tax-sale data.

df=pd.read_excel("Ward20residentialparcels.xlsx")
df2=pd.read_excel("HEIcandidateparcels.xlsx")

Create map of Ward 20 residential parcels and HEI candidate parcels.

Chicago_map = folium.Map(location=[41.78453, -87.62859], zoom_start=13)
#Ward 20 residential parcels: unlabeled light gray dots
for i in np.arange(0,169,1):
    p=[df.loc[i,"latitude"],df.loc[i,"longitude"]]
    Chicago_map.add_child(folium.CircleMarker(p, radius=1,color='lightgray'))
#HEI candidate parcels: black dots labeled with the total tax due
for i in np.arange(0,92,1):
    p2=[df2.loc[i,"latitude"],df2.loc[i,"longitude"]]
    folium.Marker(p2,icon=DivIcon(
        icon_size=(100,0),
        icon_anchor=(0,8),
        html='<div style="font-size: 6pt; color : black">'+'   '+str(df2.loc[i,"Total Tax Due"])+ '</div>',
            )).add_to(Chicago_map)
    Chicago_map.add_child(folium.CircleMarker(p2, radius=1,color='black'))
Chicago_map.save("HEItaxsaleyear19maprev.html")
Chicago_map


Exercise

Make a histogram showing tax sale amounts for Ward 20 residential parcels and HEI candidate parcels.

4.10.4. Low Income Tract Clustering

Read in census tract data.

rawdf=pd.read_csv("tract_covariates.csv")

rawdf.columns

Index(['tract', 'county', 'state', 'hhinc_mean2000', 'mean_commutetime2000',
       'frac_coll_plus2010', 'frac_coll_plus2000', 'foreign_share2010',
       'med_hhinc2016', 'med_hhinc1990', 'popdensity2000', 'poor_share2010',
       'poor_share2000', 'poor_share1990', 'share_black2010', 'share_hisp2010',
       'share_asian2010', 'share_black2000', 'share_white2000',
       'share_hisp2000', 'share_asian2000', 'gsmn_math_g3_2013',
       'rent_twobed2015', 'singleparent_share2010', 'singleparent_share1990',
       'singleparent_share2000', 'traveltime15_2010', 'emp2000',
       'mail_return_rate2010', 'ln_wage_growth_hs_grad', 'jobs_total_5mi_2015',
       'jobs_highpay_5mi_2015', 'nonwhite_share2010', 'popdensity2010', 'cz',
       'czname', 'ann_avg_job_growth_2004_2013', 'job_density_2013'],
      dtype='object')

rawdf.shape

(74123, 38)

Filter data to Cook County, Illinois tracts (state FIPS code 17, county code 31) with median 2016 household income below $30,000.

IL=rawdf[rawdf['state']== 17]
IL.shape

(3128, 38)

cook=IL[IL['county']==31]
cook.shape

(1319, 38)

low_inc=cook[cook['med_hhinc2016']<30000]
low_inc.shape

(202, 38)
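The three filtering steps above can also be written as a single boolean mask, which avoids the intermediate tables. The following is a sketch on toy rows that reuse the column names of the tract file; the values are made up for illustration.

```python
import pandas as pd

# Toy rows using the same column names as the tract file (values are made up)
toy = pd.DataFrame({
    "state": [17, 17, 17, 18],
    "county": [31, 31, 43, 31],
    "med_hhinc2016": [25000.0, 52000.0, 28000.0, 21000.0],
})

# Each comparison needs its own parentheses because & binds tighter than == and <
mask = (toy["state"] == 17) & (toy["county"] == 31) & (toy["med_hhinc2016"] < 30000)
low_inc_toy = toy[mask]
print(low_inc_toy.shape)
```

Filtering step by step, as the lab does, has the advantage of showing how many tracts survive each condition.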

Prepare columns used for separation.

df=low_inc[['tract','emp2000','frac_coll_plus2010','job_density_2013','mean_commutetime2000','med_hhinc2016','popdensity2010','rent_twobed2015','singleparent_share2010']].copy() #copy so that new columns can be added without a SettingWithCopyWarning
df.columns

Index(['tract', 'emp2000', 'frac_coll_plus2010', 'job_density_2013',
       'mean_commutetime2000', 'med_hhinc2016', 'popdensity2010',
       'rent_twobed2015', 'singleparent_share2010'],
      dtype='object')

df.columns=['Tract','emp','college','jobdensity','commute','hhincome','popdensity','rent','singleparent']
df.head(1)

	Tract	emp	college	jobdensity	commute	hhincome	popdensity	rent	singleparent
21107	10100	0.560484	0.349921	2530.6123	41.525024	29861.0	33020.406	1153.0	0.543056

df.loc[:, 'work'] = df.loc[:, 'jobdensity'] / df.loc[:, 'popdensity']
df.loc[:, 'room'] = df.loc[:, 'rent'] / df.loc[:, 'hhincome']

Drop tracts with missing values, then normalize each variable to the range [0, 1].

df.shape

(202, 11)

df=df.dropna()
df.shape

(167, 11)

#normalize values: min-max scale each variable to the range [0,1]
#each pair is (source column, normalized column)
scaled=[("work","work"),("room","room"),("singleparent","singleparent"),
        ("commute","commute"),("college","education"),("hhincome","income")]
for src,dst in scaled:
    m=df[src].min()
    M=df[src].max()
    df.loc[:,dst]=(df.loc[:,src]-m)/(M-m)

tracts=df[["Tract","work","room","education","income","singleparent","commute"]]
tracts.shape

(167, 7)

tracts=tracts.dropna()
tracts.shape

(167, 7)
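The scaling used above is min-max normalization: each value x becomes (x - min) / (max - min), so the smallest value in a column maps to 0 and the largest to 1. The following is a minimal sketch on made-up rents; the helper name minmax is hypothetical and not part of the lab code.

```python
import pandas as pd

def minmax(series):
    # Rescale a numeric Series so its minimum maps to 0 and its maximum to 1
    lo, hi = series.min(), series.max()
    return (series - lo) / (hi - lo)

toy = pd.DataFrame({"rent": [800.0, 1000.0, 1200.0]})  # made-up values
toy["rent_scaled"] = minmax(toy["rent"])
print(toy)
```

Putting all variables on the same 0 to 1 scale keeps a variable with large raw values, such as income in dollars, from dominating the distances that k-means computes. Note that the formula is undefined for a column whose values are all equal, since max - min is then zero.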

Use k-means to separate the census tracts into two clusters.

cols=["work","room","education","income","singleparent","commute"]
tractcluster=tracts[cols]

# Fit the k means model
k_means = KMeans(init="k-means++", n_clusters=2, n_init=2)
k_means.fit(tractcluster)
#Get Labels
k_means_labels = k_means.labels_
k_means_labels

array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0,
       0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0,
       0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1,
       0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1,
       0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0,
       1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1], dtype=int32)
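The label numbers themselves carry no meaning: k-means starts from randomly chosen centers, so a rerun may swap which group is called 0 and which is called 1. The sketch below uses made-up points and adds random_state, which is not in the lab code above, to make repeated runs reproducible.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two made-up, well-separated groups of points in the unit square
X = np.array([[0.1, 0.1], [0.15, 0.2], [0.2, 0.1],
              [0.8, 0.9], [0.9, 0.8], [0.85, 0.85]])

# random_state is an addition here so that repeated runs give the same labels
model = KMeans(init="k-means++", n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(X)

print(labels)
print(model.cluster_centers_)  # one row of coordinates per cluster center
```

Inspecting cluster_centers_ is one way to tell which label corresponds to which kind of group, regardless of the numbering.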

tracts["CLASS"]=k_means_labels
tracts=tracts.reset_index(drop=True)
tracts.head(2)

	Tract	work	room	education	income	singleparent	commute	CLASS
0	10100	0.012277	0.228804	0.423649	0.994194	0.491649	0.554853	1
1	10202	0.043318	0.222477	0.280312	0.973262	0.617686	0.401514	1

C0=tracts[tracts["CLASS"]==0]
C0.head(1)

	Tract	work	room	education	income	singleparent	commute	CLASS
11	231200	0.012669	0.1637	0.129681	0.843523	0.733	0.707592	0

C1=tracts[tracts["CLASS"]==1]
C1.head(1)

	Tract	work	room	education	income	singleparent	commute	CLASS
0	10100	0.012277	0.228804	0.423649	0.994194	0.491649	0.554853	1

C0.shape

(96, 8)

C1.shape

(71, 8)
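A table of per-cluster means gives a compact numerical view of how the two groups differ. The following is a sketch on made-up normalized values with a CLASS column like the one added above.

```python
import pandas as pd

# Made-up normalized values with a CLASS column like the one added above
toy = pd.DataFrame({
    "income":       [0.9, 0.8, 0.3, 0.2],
    "singleparent": [0.4, 0.5, 0.8, 0.9],
    "CLASS":        [1, 1, 0, 0],
})

summary = toy.groupby("CLASS").mean()  # one row per cluster, one column per variable
print(summary)
```

Means hide the spread within each cluster, so they complement rather than replace a plot of the full distributions.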

Make a boxplot comparison of the two groups.

numplots=len(cols)
#one row per variable: cluster C0 on the left, cluster C1 on the right
fig, axes = plt.subplots(numplots,2,figsize=[8,40])
for i in np.arange(0,numplots,1):
    #same y-range in both columns so the boxes are directly comparable
    axes[i,0].set_ylim((0,1.1))
    axes[i,1].set_ylim((0,1.1))
    axes[i,0].boxplot(C0[cols[i]],whis=3,labels=['C0'+cols[i]],widths=.5)
    axes[i,1].boxplot(C1[cols[i]],whis=3,labels=['C1'+cols[i]],widths=.5)
fig.savefig('6IndR1.png') #Save our figure to a file
plt.show()

[Figure: side-by-side boxplots of the six normalized variables for clusters C0 (left) and C1 (right).]

df2.shape

(92, 34)

Exercise

Continue the same process one further step starting with the sub-cluster (C0 or C1) that exhibits greater hardship.