JNB Lab: Homicides in Chicago

5.10. JNB Lab: Homicides in Chicago

Acknowledgements

This lab is the outcome of a 2019-2021 collaboration between Paul Isihara (Professor of Mathematics, Wheaton College (IL)) and a team at the Flourishing Community Initiative (FCI) of Sunshine Gospels Ministries (SGM), which included Arnold Sojourner (FCI Director), Joel Hamernick (SGM Director), and Evan Trowbridge (FCI Data Analyst). Dr. Paul Campbell (Editor, UMAP Journal) served as a statistical consultant.

Parts of this material were published previously in an article in the UMAP Journal titled Data Detectives: Homicides in Chicago. The UMAP Journal 42(1) (2021) 27-33. Copyright 2021 by COMAP, Inc. All rights reserved. Author: Paul Isihara. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice. Abstracting with credit is permitted, but copyrights for components of this work owned by others than COMAP must be honored. To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior permission from COMAP.

5.10.1. Historical Context

Chicago has suffered over 8,000 homicides in the past 20 years, or more than 1 homicide per day on average. A number of historical factors have contributed to the current level of homicides in Chicago.

After slavery was abolished, racism in the form of segregationist Jim Crow Laws and white supremacy led by the KKK contributed to the Great Migration of 6,000,000 black Americans who left their homes in the South and moved in large numbers to cities such as Chicago. Further racism occurred in the form of hatred in the industrial workplace, redlining which kept black Americans from home ownership, slum landlording which created sub-standard and overcrowded rental and school conditions, and gentrification which further displaced and forced black Americans into communities which experienced white flight, dis-investment, and loss of jobs. As hope of a good education, honest employment, and decent living conditions disappeared, some were led into gangs and drugs. Mass incarceration further exacerbated the family instability, economic hardship, and violence of predominantly black American areas of the city.

5.10.2. Data Visualization

Read in the homicide data for Cook County, which was obtained from the Cook County Medical Examiner Case Archive (https://datacatalog.cookcountyil.gov/Public-Safety/Medical-Examiner-Case-Archive/cjeq-bs86) on Feb 6, 2021 and stored in the Excel file homicides.xlsx. Drop rows that have missing data.

	Date of Incident	Age	Gender	Race	Primary Cause	Residence City	Incident Zip Code
0	2017-02-26 10:48:00	23.0	Male	Black	MULTIPLE GUNSHOT WOUNDS	Chicago	60623.0
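The read-and-clean step might look roughly like the sketch below. The lab's own cell is not shown, so this is only an illustration: the small DataFrame `raw` is a hypothetical stand-in for what `pd.read_excel("homicides.xlsx")` would return, and every row except the first is made up.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in rows; with the real file one would instead call
# something like raw = pd.read_excel("homicides.xlsx").
raw = pd.DataFrame({
    "Date of Incident": pd.to_datetime(
        ["2017-02-26 10:48:00", "2018-05-03 22:15:00", "2019-07-14 01:30:00"]),
    "Age": [23.0, np.nan, 31.0],          # second row has a missing age
    "Gender": ["Male", "Male", "Female"],
    "Race": ["Black", "White", "Black"],
    "Primary Cause": ["MULTIPLE GUNSHOT WOUNDS", "GUNSHOT WOUND", "STAB WOUND"],
    "Residence City": ["Chicago", "Chicago", "Evanston"],
    "Incident Zip Code": [60623.0, 60621.0, 60201.0],
})

# Keep only rows with no missing values, then renumber the rows from 0.
homicides = raw.dropna().reset_index(drop=True)
print(len(raw), "rows read,", len(homicides), "rows kept")
```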

Extract the month and year of homicide incidents.

	Date of Incident	Age	Gender	Race	Primary Cause	Residence City	Incident Zip Code	new_date	year	month	day
0	2017-02-26 10:48:00	23.0	Male	Black	MULTIPLE GUNSHOT WOUNDS	Chicago	60623.0	2017-02-26	2017	2	26
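One way to derive the new columns is through the pandas `.dt` accessor. This is a sketch under the assumption that "Date of Incident" is already a datetime column; the two timestamps are illustrative only.

```python
import pandas as pd

# Illustrative timestamps only; the first matches the row shown above.
df = pd.DataFrame({"Date of Incident": pd.to_datetime(
    ["2017-02-26 10:48:00", "2020-05-31 23:05:00"])})

# Split each timestamp into a calendar date and its year, month and day.
df["new_date"] = df["Date of Incident"].dt.date
df["year"] = df["Date of Incident"].dt.year
df["month"] = df["Date of Incident"].dt.month
df["day"] = df["Date of Incident"].dt.day
print(df)
```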

Get the Chicago data, select the columns ["Age", "Gender", "Race", "Primary Cause", "Incident Zip Code", "year", "month", "day"], and convert the numeric columns to integer format.

	Age	Gender	Race	Primary Cause	Incident Zip Code	year	month	day
0	23	Male	Black	MULTIPLE GUNSHOT WOUNDS	60623	2017	2	26
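A possible version of this step is sketched below. It assumes that "Chicago data" means rows whose "Residence City" is Chicago, which is the only city column shown above; the lab's hidden cell may filter differently. The second row is invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Age": [23.0, 40.0],
    "Gender": ["Male", "Female"],
    "Race": ["Black", "White"],
    "Primary Cause": ["MULTIPLE GUNSHOT WOUNDS", "STAB WOUND"],
    "Residence City": ["Chicago", "Evanston"],
    "Incident Zip Code": [60623.0, 60201.0],
    "year": [2017, 2018], "month": [2, 6], "day": [26, 9],
})

cols = ["Age", "Gender", "Race", "Primary Cause",
        "Incident Zip Code", "year", "month", "day"]

# ASSUMPTION: Chicago rows are identified by the "Residence City" column.
chicago = df.loc[df["Residence City"] == "Chicago", cols].copy()

# Cast the numeric columns from float to int (safe once missing rows are dropped).
int_cols = ["Age", "Incident Zip Code", "year", "month", "day"]
chicago = chicago.astype({c: int for c in int_cols})
print(chicago.dtypes)
```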

Get the homicide totals by year.

  669
  620
  566
  492
  433
  429
Name: year, dtype: int64

Sort the data by year.
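Counting and then sorting could be done with `value_counts` followed by `sort_index`, as in this sketch on a few made-up years. By default `value_counts` orders by count (largest first), so the chronological order has to be restored explicitly.

```python
import pandas as pd

# Made-up years, one entry per homicide record.
chicago = pd.DataFrame({"year": [2016, 2015, 2016, 2017, 2016, 2015]})

totals = chicago["year"].value_counts()   # ordered by count, largest first
by_year = totals.sort_index()             # ordered chronologically
print(by_year)
```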

Plot the homicides by month.


# Make plots
%matplotlib notebook
import numpy as np  # needed for np.arange below
import matplotlib as mpl
import matplotlib.pyplot as plt

plt.figure(figsize=[10,5])

classes=["JAN","FEB","MAR","APR","MAY","JUNE","JULY","AUG","SEP","OCT","NOV","DEC"]
ps = 1+np.arange(len(classes))
sps = A19  # bar heights: 2015-2019 monthly averages, defined in a cell not shown here
bars = plt.bar(ps, sps, align='center', linewidth=0, color='lightslategrey',alpha=.5)
plt.xticks(ps, classes, alpha=0.7)

for spine in plt.gca().spines.values():
    spine.set_visible(False)

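# tick_params expects booleans; the strings 'off'/'on' are not accepted as switches in current matplotlib
plt.tick_params(top=False, bottom=False, left=False, right=False, labelleft=False, labelbottom=True)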
 
# directly label each bar with Y axis values
for bar in bars:
    plt.gca().text(bar.get_x() + bar.get_width()/2, 2, str(int(bar.get_height())), ha='center', color='w', fontsize=11)

plt.plot(H20[0:8], 'o-',color='k',linewidth=3)
for i in np.arange(5,9,1):
    plt.text(i-.1,H20[i]+3.1, str(H20[i]),color='k')
plt.text(5.5,103,'2020',color='k',size=18)

plt.plot(H19, color='gray',linewidth=1,linestyle=(0, (5, 2, 1, 2)), dash_capstyle='round')
for i in [7,8]:
    plt.text(i-.1,H19[i]-5, str(H19[i]),color='gray',size=9)
plt.text(7.14,42,'2019',color='gray',size=10)

plt.plot(H18, color='gray',marker='h', markerfacecolor='gray', markeredgewidth=1,
         markersize=5, markevery=1)
plt.text(6.1,58.5,'2018',color='gray',size=11)

plt.plot(H17, color='k',linewidth=1,linestyle=':')
for i in [6,7]:
    plt.text(i-.1,H17[i]+1.75, str(H17[i]),color='k')
plt.text(6.15,80.1,'2017',color='k',size=13)

plt.plot(H16, color='k',linewidth=1,linestyle=(0, (5, 2, 1, 2)), dash_capstyle='round')
plt.text(8.1,H16[8]+1.6, str(H16[8]),color='k')
plt.text(6.2,67,'2016',color='k',size=13)

plt.plot(H15, color='#838B8B',linewidth=2,linestyle=':')
plt.text(5.8,40, str(H15[6]),color='#838B8B')
plt.text(4.8,41, str(H15[5]),color='#838B8B')
plt.text(7,50.5,'2015',color='#838B8B',size=10)

plt.xlabel("Bar Heights Indicate 5 Year Average (2015-2019)")

plt.title("2020 Chicago Homicides Compared with 2015-2019 Average",size=14)
plt.savefig("JNB1.png")
plt.show()
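The plotting code above uses the names H15 through H20 and A19, which are defined in cells that are not shown. One plausible way to build such series is sketched here: a per-year count for each calendar month, indexed 1 to 12, plus an average across the baseline years. The helper `monthly_counts` and the toy records are hypothetical, and only two baseline years are averaged to keep the example small.

```python
import pandas as pd

# Toy records, one row per homicide.
chicago = pd.DataFrame({
    "year":  [2015, 2015, 2016, 2016, 2016, 2020],
    "month": [1,    5,    5,    5,    7,    5],
})
months = range(1, 13)

def monthly_counts(df, year):
    """Homicides per calendar month for one year, with 0 for empty months."""
    counts = df.loc[df["year"] == year, "month"].value_counts()
    return counts.reindex(months, fill_value=0)

H15 = monthly_counts(chicago, 2015)
H16 = monthly_counts(chicago, 2016)
H20 = monthly_counts(chicago, 2020)

# Month-by-month average over the baseline years (the lab uses 2015-2019).
A = pd.concat([H15, H16], axis=1).mean(axis=1)
print(A)
```

Indexing the series by month number (1 to 12) matches the bar positions `ps` used in the plot, so lines and bars share the same x-coordinates.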

5.10.3. Monte Carlo Analysis

The Monte Carlo method uses pseudo-random numbers for statistical analysis. We will use this approach to create a histogram of homicide counts for 10,000 hypothetical months of May, based on the probabilities of daily homicide counts in the actual data for 2015-2019. For example, we first record that May 1, 2015 had 3 homicides, May 2, 2015 had 0 homicides, …, and May 31, 2019 had 5 homicides, and then make a frequency distribution called “maycount” from the numbers 3, 0, …, 5. In this way we obtain, for example, an empirical probability of 0.219 that a day in May has 0 homicides, since 21.9% of the days in May (for 2015-2019) had 0 homicides. We then use the empirical probabilities and pseudo-random numbers to generate a random sample of 31 daily homicide counts for May, and sum these daily counts to get a random monthly total. We repeat this 10,000 times and make a histogram of the results.

Get the number of Chicago homicides in May 2020.

Chicago Homicides In May 2020:  77

Get the number of homicides which occurred on each day of May between 2015 and 2019.

  5
  0
  1
  0
  2
dtype: int32
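A point that is easy to miss in this step is that days with no homicides do not appear in the records at all, so they have to be added back as zeros. The sketch below does this by reindexing over every (year, day) pair for May 2015-2019. The records are made up, and the variable names are assumptions rather than the lab's own.

```python
import pandas as pd

# Made-up incident records, one row per homicide.
chicago = pd.DataFrame({
    "year":  [2015, 2015, 2015, 2016, 2019, 2019, 2020],
    "month": [5,    5,    5,    5,    5,    5,    5],
    "day":   [1,    1,    3,    31,   31,   31,   4],
})

# Monthly total for May 2020 (the observed value to be tested later).
may_2020 = int(((chicago["year"] == 2020) & (chicago["month"] == 5)).sum())

# Daily counts for May 2015-2019, including days with zero homicides.
may = chicago[(chicago["month"] == 5) & chicago["year"].between(2015, 2019)]
all_days = pd.MultiIndex.from_product(
    [range(2015, 2020), range(1, 32)], names=["year", "day"])
daily = may.groupby(["year", "day"]).size().reindex(all_days, fill_value=0)
print(may_2020, len(daily), daily.sum())
```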

Get the frequency distribution for the number of homicides in a day.

  0.200000
  0.303226
  0.219355
  0.141935
  0.083871
  0.051613
dtype: float64
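An empirical distribution of this kind can be obtained with `value_counts(normalize=True)`, which divides each count by the number of days. The ten daily counts below are illustrative only; in the lab the input would be the 155 daily counts for May 2015-2019.

```python
import pandas as pd

# Illustrative daily homicide counts (the lab uses 155 days of real data).
daily = pd.Series([3, 0, 1, 0, 2, 1, 1, 0, 5, 2])

# Share of days with 0, 1, 2, ... homicides, ordered by the count value.
maycount = daily.value_counts(normalize=True).sort_index()
print(maycount)
```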

Do a Monte Carlo simulation for monthly homicide counts in May based on 31 random draws from the respective empirical distributions.

Make a histogram of the frequency distribution for May.

[Figure: histogram titled "10,000 Simulated May Homicide Counts based on 2015-2019 Data"]
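The simulation itself can be sketched with NumPy's random generator. The six probabilities are copied from the output above; the assumption that they correspond to 0, 1, …, 5 homicides per day (in that order) is mine, since the printed index was lost. The seed is arbitrary.

```python
import numpy as np

# ASSUMPTION: the six printed probabilities belong to 0..5 homicides per day.
counts = np.arange(6)
probs = np.array([0.200000, 0.303226, 0.219355, 0.141935, 0.083871, 0.051613])
probs = probs / probs.sum()   # remove rounding error in the printed values

rng = np.random.default_rng(seed=0)   # arbitrary seed for reproducibility
n_months, n_days = 10_000, 31

# Each row is one simulated May: 31 daily counts drawn from the distribution.
draws = rng.choice(counts, size=(n_months, n_days), p=probs)
monthly_totals = draws.sum(axis=1)    # one simulated monthly total per row
print(monthly_totals[:5], monthly_totals.mean())
```

Passing `monthly_totals` to `plt.hist` would then give a histogram like the one described above. Drawing all 10,000 × 31 values in a single call avoids an explicit Python loop.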

Find the empirical p-value of getting at least 77 homicides (the number in May 2020).

Probability that a random draw of 31 days from the May 2015-2019 distribution results in 77 or more homicides: 0.0038
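The empirical p-value is simply the fraction of simulated months whose total is at least the observed value. A minimal sketch, using ten made-up totals in place of the 10,000 simulated ones:

```python
import numpy as np

# Made-up simulated monthly totals (the lab uses 10,000 of them).
monthly_totals = np.array([50, 61, 77, 48, 80, 55, 59, 66, 52, 70])
observed = 77   # homicides in May 2020

# Fraction of simulated months with at least as many homicides as observed.
p_value = np.mean(monthly_totals >= observed)
print(p_value)
```

With only 10,000 simulations, the smallest nonzero p-value that can be resolved is 0.0001, so very small estimates should be read as approximate.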

Exercises

Exercise 1

a) Repeat the above analysis to obtain a histogram for July.

b) Find the empirical p-value of getting at least 108 homicides (the number in July 2020).

Exercise 2

a) Make a histogram of the difference (July homicide count) - (May homicide count).

b) Find the probability that July will have at least n more homicides than May, for \(0\le n \le 20\). Make a plot of these probabilities.

c) Estimate the probability that there will be 20 more homicides in July than in May.