Hide code cell source
# import libraries
# Always run this cell first!
import numpy as np
import pandas as pd
import scipy.stats as stats
from scipy.stats import powerlaw
from scipy import integrate
from scipy.integrate import odeint
import math
import random
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.colors import cnames
from matplotlib import animation
import scipy
import statsmodels.api # importing statsmodels.api loads the submodules (sm.regression, sm.tools) used below
import statsmodels as sm
from matplotlib import ticker
import seaborn as sns
import plotly.graph_objects as go
import plotly.offline
plotly.offline.init_notebook_mode(connected=True) # make plotly work with Jupyter Notebook using CDN
from scipy.stats import linregress
import folium 
from folium.plugins import MarkerCluster


# an extra function for plotting a straight line
def plot_abline(slope, intercept, color = None, x_name = "x", y_name = "y"):
    """Plot a line from slope and intercept"""
    axes = plt.gca()
    x_vals = np.array(axes.get_xlim())
    y_vals = intercept + slope * x_vals
    plt.plot(
        x_vals, y_vals, '--', color = color,
        label = f"${y_name} = {slope:.2f}{x_name} " + ("-" if intercept < 0 else "+") + f"{abs(intercept):.2f}$"
    )

15.6. JNB LAB: Complex Systems#

15.6.1. Bifurcations in the Rosenzweig-MacArthur Predator-Prey (RMPP) Equations#

In this section we will look at the Rosenzweig-MacArthur Predator-Prey (RMPP) equations.

(For background, see https://staff.fnwi.uva.nl/a.m.deroos/projects/QuantitativeBiology/43-HopfPoint-Rosenzweig.html)

Defining the equations: RMPP is a 2D nonlinear system of ODEs

\[ \frac{dF}{dt} = rF(1-\frac{F}{K}) - \frac{aF}{1+ahF}C \]
\[ \frac{dC}{dt} = e \frac{aF}{1+ahF}C-\mu C \]

where:

\(F=\) prey density (e.g., rabbits)

\(C=\) predator density (e.g., foxes)

\(\frac{dF}{dt}\) = dynamic change of prey

\(\frac{dC}{dt}\) = dynamic change of predator

\(r\) = prey natural growth rate

\(K\) = prey carrying capacity

\(a\) = attack rate

\(h\) = handling time (the predator needs \(h\) time units to consume a prey item)

\(e\) = conversion efficiency (predator offspring produced per unit of prey consumed)

\(\mu\) = predator death rate

The following cell simulates the solution to RMPP over the time interval \(0\le t\le 500\) with parameter values \(r=0.5\), \(K=0.05\), \(a=5\), \(h=3\), \(e=0.5\), \(\mu=0.1\).

Hide code cell source
# timestep determines the accuracy of the euler method of integration
timestep = 0.001
# amplitude of noise term
amp = 0.
# the time at which the simulation ends
end_time = 500

# creates a time vector from 0 to end_time, separated by a timestep
t = np.arange(0,end_time,timestep)

# initialize rabbits (x) and foxes (y) vectors
x = []
y = []

"""" parameters"""

K=.05
r = .5
a=5
h=3
e=.5
m=.1

""" euler integration """

# initial conditions for the rabbit (x) and fox (y) populations at time=0
x0=.2
y0=.2
x.append(x0)
y.append(y0) 

# forward euler method of integration
# (the amplitude amp is reserved for an optional noise term; with amp = 0 the integration is deterministic)
for index in range(1,len(t)):
    
    # evaluate the current differentials
    xd = r*x[index-1]*(1-x[index-1]/K) - a * x[index-1]*y[index-1]/(1+a*h*x[index-1])
    yd = e*a*x[index-1]*y[index-1]/(1+a*h*x[index-1])-m*y[index-1]
    
    # evaluate the next value of x and y using differentials
    next_x = x[index-1] + xd * timestep
    next_y = y[index-1] + yd * timestep
    
    x.append(next_x)
    y.append(next_y)

""" visualization """

if amp == 0:    
    # visualization of deterministic populations against time
    plt.figure(figsize=(2,1))
    plt.plot(x, y)
    plt.text(x0,y0,'o',ha='center', va='center',color='r')
    plt.text(x0,y0,' Initial Point',ha='left', va='center',color='r')
    plt.ylabel('Predator Density')
    plt.xlabel('Prey Density')
    plt.title('Predator-Prey K='+str(K))
    plt.show()
../../_images/1dfb71a3fa6a8821bffaa3967602100ee6e19078bd68e8d39bcb64d3070decb6.png
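As a sanity check on the hand-rolled Euler integration, the same trajectory can be reproduced with scipy's odeint. This is a minimal sketch; rmpp_deriv, t_check, and sol are names introduced here for illustration, and the parameter values match the cell above.

# right-hand side of the RMPP system with the parameter values used above
def rmpp_deriv(state, t, r=0.5, K=0.05, a=5, h=3, e=0.5, m=0.1):
    F, C = state
    dF = r*F*(1 - F/K) - a*F*C/(1 + a*h*F)
    dC = e*a*F*C/(1 + a*h*F) - m*C
    return [dF, dC]

t_check = np.linspace(0, 500, 5001)
sol = integrate.odeint(rmpp_deriv, [0.2, 0.2], t_check)

plt.plot(sol[:, 0], sol[:, 1])
plt.xlabel('Prey Density')
plt.ylabel('Predator Density')
plt.title('RMPP via odeint, K=0.05')
plt.show()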

Note that the predators go extinct and the prey density approaches the carrying capacity value of approximately 0.05.
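This can be checked directly from the equations: setting \(\frac{dC}{dt}=0\) shows that the predator can only grow if the prey density exceeds

\[ F^* = \frac{\mu}{a(e-\mu h)} = \frac{0.1}{5(0.5-0.1\cdot 3)} = 0.1, \]

which is larger than the carrying capacity \(K=0.05\). The prey can therefore never sustain the predator, so the predator density decays to zero while the prey settles near \(K\).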

References: https://staff.fnwi.uva.nl/a.m.deroos/projects/QuantitativeBiology/43-HopfPoint-Rosenzweig.html

Anthony Hills, https://github.com/INASIC/predator-prey_systems/blob/master/Modelling Predator-Prey Systems in Python.ipynb

Exercises#

Exercises

1 a) Define a function rmpp(x0,y0,K) whose inputs specify the initial point (x0,y0) and the prey carrying capacity K, and whose output is a graph of the solution to the RMPP equations for \(0\le t \le 500\).

b) Check that your output for rmpp(.2,.2,.05) agrees with the above plot.

  2. a) Make a chart showing the solution behavior as K is increased to .15, .25, and .3. (Note: for K=.3, also check the solution behavior for (x0,y0)=(.1,.15).) b) Describe the difference in the solution behavior for these K values.

  3. Explain why RMPP has a Hopf bifurcation (a bifurcation in which, at a critical parameter value, a stable equilibrium becomes unstable and a stable periodic solution is created).

15.6.2. Sensitive Dependence on Initial Conditions in a FitzHugh-Nagumo System#

Imagine you are a new grad student. You are interested in applied math, so you are taking a course in ordinary differential equations (ODEs). One day, your professor talks about his colleague Dr. John Rinzel, who at the time was head of the mathematical research branch at NIH (National Institutes of Health). Dr. Rinzel is an expert in mathematical neurobiology and has developed several models of how neurons fire (active phase) and then relax (interval of quiescence). One of the nonlinear models he investigated is a three-dimensional ODE system which we will refer to as the FitzHugh-Nagumo system (FHN):

\[ \frac{dv}{dt} = v-v^3/3-w+y+I \]
\[ \frac{dw}{dt} = \phi(v+a-bw) \]
\[ \frac{dy}{dt} = \theta(-v+c-dy) \]

\(v(t)\) is a membrane voltage variable which spikes when the neuron fires. \(w(t)\) represents the activity of several membrane channel proteins which help facilitate the neuron's firing. The parameter \(I\) is a bifurcation parameter corresponding to the magnitude of the stimulus current. The underlying 2D system (without the variable \(y\)) undergoes a Hopf bifurcation. For \(I<I_{crit}\), there is a stable spiral equilibrium point which corresponds to the interval of quiescence. For \(I>I_{crit}\), the equilibrium point loses its stability and a stable periodic solution appears which corresponds to the spiking of \(v\) during the active phase.

Note that in the 3D system, if \(I\) is fixed, the inclusion of \(y(t)\) effectively moves the 2D system back and forth across its bifurcation point: the first equation can be read as having an effective stimulus \(I+y(t)\), and because \(\theta\) is small, \(y\) varies slowly and sweeps this effective stimulus back and forth across \(I_{crit}\).

Using parameter values and an initial point specified by Dr. Rinzel, we solve the FHN system numerically and plot the solution to see the action potentials.

Hide code cell source
def solve_FHN(x0,N=10, max_time=1000.0, I=.3125, phi=.08, theta=.0006,a=.7,b=.8,c=-.775,d=1):

    def FHN_deriv(v_w_y,t0=0, x0=x0,phi=phi, theta=theta, I=I,a=a,b=b,c=c,d=d):
        """Compute the time-derivative of a FHN system."""
        v, w, y = v_w_y
        return [v-v**3/3-w+y+I, phi*(v+a-b*w),theta*(-v+c-d*y) ]

    # Solve for the trajectories
    t = np.linspace(0, max_time, int(250*max_time))
    x_t = integrate.odeint(FHN_deriv, x0, t)
    
    return t, x_t
[t,x_t]=solve_FHN([1.744,.298,.00872]) # initial conditions v=1.744, w=0.298, y=0.00872
plt.plot(t,x_t)
plt.legend(["v(t)","w(t)","y(t)"])
plt.savefig("FHN1.png")
../../_images/a00a2bfabab72745b25896e47e2cf0c3c323f792af70cbe17f3c63965c1dcd91.png

The periodic oscillation of v(t) has a stable burst pattern denoted 1100000, meaning that two large spikes (11) are followed by five small oscillations (00000), and this pattern recurs over and over.
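To see this burst pattern more clearly, the following minimal sketch (reusing t and x_t from the cell above) plots only the voltage trace v(t), the first column of x_t.

plt.plot(t, x_t[:, 0])
plt.xlim(0, 400)   # widen or narrow this window to frame one or two full burst cycles
plt.xlabel('t')
plt.ylabel('v(t)')
plt.show()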

Exercise#

Exercise

  1. Explain how the bursting behavior in FHN is related to a Hopf bifurcation in the underlying 2D system.

  2. Nonlinear ODE systems can exhibit complex behavior such as sensitive dependence on initial conditions.

a) Re-simulate FHN using the same parameter values and the initial condition x0=[1.725,.337,.00764]. What do you observe? b) Why does FHN exhibit sensitive dependence on initial conditions, and how is this reflected in your simulations?

15.6.3. Random Walks#

A random walk beginning at \(x_0=0\) is such that \(x_t\), the position at timestep \(t=0,1,2,...\), satisfies

\[ x_{t+1}=x_t+e_t \]

where \(e_t=\pm 1\) with equal probability. The random variables \(e_t\) and \(e_{t'}\) (\(t\neq t'\)) are assumed to be independent and identically distributed (IID).

The following cell simulates a random walk.

Hide code cell source
x=[0]
t=[0]
for i in np.arange(1,1000,1):
    x.append(x[i-1]+random.choice((-1, 1)))
    t.append(i)
plt.plot(t,x)   # plot once, after the walk is complete
../../_images/08125df8ca8ab3651aeeec11af9c7df028f24901c0bd45509d1ac34495913080.png
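An equivalent, vectorized way to simulate one walk (a sketch; steps and x_vec are names introduced here) uses np.cumsum instead of a Python loop:

# draw all +/-1 steps at once and accumulate them
steps = np.random.choice([-1, 1], size=999)
x_vec = np.concatenate(([0], np.cumsum(steps)))
plt.plot(np.arange(1000), x_vec)
plt.show()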

Although each individual random walk is completely unpredictable, a large ensemble of random walks has a predictable structure since

\[ <x_t>=0 \]
\[ <x_t^2>=t \]
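These ensemble averages follow directly from the independence of the steps: writing \(x_t=\sum_{s=0}^{t-1}e_s\) with \(<e_s>=0\) and \(<e_s^2>=1\),

\[ <x_t>=\sum_{s=0}^{t-1}<e_s>=0, \qquad <x_t^2>=\sum_{s=0}^{t-1}<e_s^2>+\sum_{s\neq s'}<e_s><e_{s'}>=t. \]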
Hide code cell source
x=[0]
t=[0]
xf=[]
xf2=[]
for run in np.arange(0,1000):
    for i in np.arange(1,5000,1):
        x.append(x[i-1]+random.choice((-1, 1)))
        t.append(i)
    xf.append(x[4999])
    xf2.append(x[4999]**2)
    plt.plot(t,x)
    x=[0]
    t=[0]
../../_images/60003dbda94950d432496c0d5184d8c188ed24a147c25d697ecc3bc19ff95845.png

Note that after 5000 steps, the mean of \(x_t\) (t=4999) over the 1000 random walks is close to 0, and the mean of \(x_t^2\) is close to \(t\). In other words, for a simple random walk with \(x_0=0\), the mean is zero (\(<x_t>=0\)) and the variance grows linearly with \(t\) (\(<x_t^2>=t\)):

Hide code cell source
print("Mean of x[4999]=", np.mean(xf))
print("Mean of x[4999]^2=", np.mean(xf2))
Mean of x[4999]= 0.566
Mean of x[4999]^2= 5195.856

Here is a graph of a 2D random walk with 1000 steps (both the x position and the y position are random walks).

Hide code cell source
# Set seed for reproducible results (optional)
np.random.seed(10)

# Initial particle position
current_xpos = 0
current_ypos = 0

# Initial list of positions
x = [current_xpos]
y = [current_ypos]

# Define number of particle steps
N = 1000

# Generate the full set of random steps in advance; np.random.randint(-1,2,N) gives steps of -1, 0, or +1
xstep = np.random.randint(-1,2,N)
ystep = np.random.randint(-1,2,N)

# Iterate and track the particle over each step
for i in range(N):
    
    # Update position
    current_xpos += xstep[i]
    current_ypos += ystep[i]
    
    # Append new position
    x.append(current_xpos)
    y.append(current_ypos)
    
# Plot random walk
plt.figure(figsize=(10,5))
plt.plot(x, y)

plt.show()    
../../_images/3d5693b64c888a67b4efa1f5aaa1659ff225a126f7260f2d47a8fd3f237ceab6.png

Brownian Motion

We can also create a random walk using variable step sizes by drawing from a standard normal distribution (np.random.normal(0, 1, 1000))

Hide code cell source
# Set seed for reproducible results (optional)
np.random.seed(10)

# Initial particle position
current_xpos = 0
current_ypos = 0

# Initial list of positions
x = [current_xpos]
y = [current_ypos]

# Define number of particle steps
N = 1000

# Generate set of random steps in advance (it could also be done inside the for loop)
xstep = np.random.normal(0,1,N)
ystep = np.random.normal(0,1,N)

# Iterate and track the particle over each step
for i in range(N):
    
    # Update position
    current_xpos += xstep[i]
    current_ypos += ystep[i]
    
    # Append new position
    x.append(current_xpos)
    y.append(current_ypos)
    
# Plot random walk
plt.figure(figsize=(10,5))
plt.plot(x, y)

plt.show()    
../../_images/ecae34a0e63ff5d9f41ac24bf13fbaf1b9fc0fd9a28fe8b67ea4542cc945e52f.png

Exercise#

Exercise

  1. a) A standard Cauchy distribution with p.d.f. \(f(x)=\frac{1}{\pi(1+x^2)}\) is a heavy-tailed distribution: it is symmetric about 0, but its mean and variance are undefined. Simulate a 2D random walk with \(N=1000\) steps where each step is determined by a random draw from a Cauchy distribution (np.random.standard_cauchy(N)).

b) What difference do you notice about this random walk compared with Brownian motion?

  2. a) Plot the probability density function of a standard Cauchy distribution and a standard normal distribution. Explain why the former is said to have a heavy tail.

b) What is the significance of ‘long flights’ in Cauchy random walks?

15.6.4. Scale Adjusted Metropolitan Index (SAMI)#

Urban scaling theory conceptualizes cities as interdependent socioeconomic and spatial (infrastructural) networks that coevolve to support each other. Major goals include:

  • Find empirical characteristics of cities as a whole (extensive properties), such as GDP or road surface area, which exhibit scale-invariant relations (vary on average with city size).

  • Develop an urban scaling theory to explain the observed patterns

  • Develop predictive models

An extensive property of cities has the form

\[ Y(N_i(t),t) = Y_i(t)=Y_0(t)N_i(t)^{\beta} e^{\xi_i(t)}>0 \]
  • \(i=1,...,N_c\) correspond to cities

  • \(N_i\) is a measure of system size that varies continuously with time \(t\)

  • \(Y_0(t)\) is a prefactor independent of scale

  • \(\beta\) is a scaling exponent (elasticity) expressing the average relative percentage change in \(Y\) given a percentage variation in \(N\) at a fixed time \(t\).

\[ \beta=\frac{\frac{dY}{Y}}{\frac{dN}{N}} \]
  • \(\xi_i(t)\) are residuals which account for deviations of city \(i\) from the average scaling pattern of all cities. The residuals satisfy

\[ \sum_{i=1}^{N_c} \xi_i=0. \]
  • Scaling laws arise when the variance \(\sigma^2=\frac{1}{N_c}\sum_{i=1}^{N_c} \xi_i^2\) is scale independent.

Scaling laws

A socio-economic variable (\(Y\)) follows a population (\(N\)) scaling law if

\[ Y(N)=G N^{\beta} \]

where \(G>0\) and \(\beta\) are constants independent of the value of \(N\).
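Taking logarithms of both sides turns the scaling law into a linear relationship,

\[ \ln Y = \ln G + \beta \ln N + \xi, \]

which is why the analysis below regresses \(\ln Y\) (LnGDP) against \(\ln N\) (LnPop): the fitted slope estimates \(\beta\), and each city's residual \(\xi_i\) from the fitted line is its scale-adjusted indicator (SAMI), plotted later in this section.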

Example

GDP of cities in China.

chinagdp = pd.read_excel('chinaGDP.xlsx')
chinagdp.head()
City Pop GDP
0 Shanghai 26875500 664
1 Beijing 21167303 619
2 Shenzhen 17633800 482
3 Chongqing 12313714 433
4 Guangzhou 18810600 429
chinagdp["LnPop"] = np.log(chinagdp["Pop"])
chinagdp["LnGDP"] = np.log(chinagdp["GDP"])
chinagdp.shape
(29, 5)
sns.lmplot(
    data = chinagdp,
    x = 'LnPop',
    y = 'LnGDP',
    ci = None
);
../../_images/2c67a5359ce81a5dbcd3f137ef75aaec0bb96d8e8a46255b473b49b990851f6d.png
x = 'LnPop'
y = 'LnGDP'

lin_mod = sm.regression.linear_model.OLS(
    chinagdp[y], # the dependent variable (y)
    sm.tools.tools.add_constant(chinagdp[x]) # the independent variable (x)
)

results = lin_mod.fit()
print(results.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                  LnGDP   R-squared:                       0.638
Model:                            OLS   Adj. R-squared:                  0.624
Method:                 Least Squares   F-statistic:                     47.51
Date:                Fri, 29 Dec 2023   Prob (F-statistic):           2.09e-07
Time:                        11:03:43   Log-Likelihood:                -4.2813
No. Observations:                  29   AIC:                             12.56
Df Residuals:                      27   BIC:                             15.30
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const         -3.5082      1.298     -2.703      0.012      -6.171      -0.846
LnPop          0.5689      0.083      6.893      0.000       0.400       0.738
==============================================================================
Omnibus:                        0.209   Durbin-Watson:                   1.699
Prob(Omnibus):                  0.901   Jarque-Bera (JB):                0.324
Skew:                           0.176   Prob(JB):                        0.850
Kurtosis:                       2.620   Cond. No.                         380.
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

Note that the data for China has 𝛽=.5689.

Hide code cell source
fig, axes = plt.subplots(1,1, sharex = True, sharey = True, figsize = (10,7))
slopes = [.5689]
intercepts = [-3.5082]
(scattercolor, trendcolor, residcolor) = sns.color_palette()[0:3]
# plot data
sns.scatterplot(
    data = chinagdp,
    x = 'LnPop',
    y = 'LnGDP',
    color = scattercolor,
    )
# plot the line with given slope and intercept
plot_abline(slopes[0], intercepts[0], color = trendcolor)
# plot residuals
for j, (x, y) in enumerate(zip(chinagdp['LnPop'], chinagdp['LnGDP'])):
    plt.plot(
        [x, x], [y, slopes[0]*x + intercepts[0]],
        color = residcolor, alpha = 0.5,
        label = "residuals" if j == 0 else "" # show legend entry only for first residual in each plot
    )
for j, (x, y) in enumerate(zip(chinagdp['LnPop'], chinagdp['LnGDP'])):
    plt.text(x,y,chinagdp.loc[j,"City"])
    plt.text(x,y-.05,str(np.round(y-(slopes[0]*x + intercepts[0]),2)))
plt.legend(loc = "upper left")
plt.savefig("ChinaSAMI.png")
plt.gca().set_title(f"SAMI with Trend Line")
Text(0.5, 1.0, 'SAMI with Trend Line')
../../_images/1e38cc89b93be1abd14b482bb78e9b18fc9bd5ba6ef91746e8b287da883410fa.png
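The SAMIs (the residuals plotted above) can also be computed directly from the fitted slope and intercept. This is a minimal sketch; the 'SAMI' column name is introduced here for illustration.

chinagdp["SAMI"] = chinagdp["LnGDP"] - (intercepts[0] + slopes[0]*chinagdp["LnPop"])
print(chinagdp[["City", "SAMI"]].sort_values("SAMI", ascending=False).head())
print("Sum of SAMIs:", chinagdp["SAMI"].sum())   # should be close to 0, since OLS residuals sum to zero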

Exercise#

Exercise

Where would Hong Kong fall if added to the SAMI graph above?

15.6.5. IDP#

15.6.6. Introduction#

A peace agreement was signed in Pretoria, South Africa, in November 2022, two years after a horrific war broke out in Tigray, the northernmost region of Ethiopia. An estimated four hundred thousand Tigrayan civilians were killed out of a population of six to seven million, and a hundred thousand women were raped. The entire region was traumatized. Thousands of schools were damaged or closed, and millions of children were out of school for over two years. The vast majority of medical facilities were rendered inoperative. Young people in large numbers escaped reality through drug and alcohol addictions.

One year after the peace agreement, fighting continued in western Tigray and 1 million internally displaced people (IDPs) were living in over 600 IDP camps scattered across Tigray. Most camps have sub-standard living conditions: acute food and clean water shortages; many residents living in severely overcrowded rooms in damaged schools or in tents; no livelihood; and nothing to return to even if peace is established back home.

In this lab, we look at IDP camp data collected in summer 2023. The data set is large (over 600 camp observations and 488 attribute columns), so we begin with exploratory data analysis to get a feel for the data as a whole. Our goal then shifts to addressing the question of how an NGO concerned with the plight of IDPs might prioritize its efforts based on (i) identifying areas within Tigray that are underserved and (ii) ranking the severity of need of individual camps within those underserved areas.

The IDP data we analyzed was obtained at https://data.humdata.org/dataset/ethiopia-displacement-northern-region-tigray-idps-site-assessment-iom-dtm?

  1. Let’s put the IDP data into a dataframe called “df”.

df = pd.read_excel("IDP.xlsx")
df.head(1)
1.1.a.1: Survey Date Country Country Code Reported Date 1.1.a.2: Survey Round 1.1.c.1: Site ID 1.1.d.1: Site Name 1.1.d.2: Site Alternate Name 1.4.a.2: Is site open? 1.1.e.1: Region ... S1723: Other internet sources (e.g. apps) S1723: Please specify which other internet sources S1723: Other S1723: If other source of news/information, please specify S1784: Is mobile network access available in the site? S1495: What % of HHs own a mobile phone? 11.3.a.1: Are members of the community discussing/advertising travel opportunities? 11.3.a.6: If Yes, to where? 11.3.a.2: Specify all locations M1712: Additional Comments / Observations
0 #date+occurred #country+name #country+code #date+reported NaN NaN NaN NaN NaN #adm1+name ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

1 rows × 488 columns

  2. Select columns to assess camp deficiencies.

Hide code cell source
#Pre-primary, secondary, resources,
#quality/satisfaction, girl attendance, boy attendance, teachers
columns_to_keep = ['1.1.d.1: Site Name',
                   '1.1.e.2: Zone',
                   '1.1.e.3: Woreda',
                   '1.1.f.1: GPS: Longitude',
                   '1.1.f.2: GPS: Latitude',
                   '2.1.b.7: Total Number of IDP Individuals',
                   'M1594: What is the severity of site/area overcrowding',
                   '1.2.a.1: Is there any registration activity?',
                   'What are the biggest priority need(s) for IDPs in this site?/Food',
                   'What are the biggest priority need(s) for IDPs in this site?/Shelter',
                   'What are the biggest priority need(s) for IDPs in this site?/WASH',
                   'What are the biggest priority need(s) for IDPs in this site?/Livelihoods',
                   'What are the biggest priority need(s) for IDPs in this site?/Healthcare',
                   'What are the biggest priority need(s) for IDPs in this site?/Protection services',
                   'S1225: Are IDPs satisfied with the standard of schools for children in their location',  
                   ]

data = df[columns_to_keep].copy()  # .copy() so later modifications don't trigger pandas' SettingWithCopyWarning
new_column_names = ['Name','Zone','Woreda','Long','Lat','Pop','Crowded','Registration','Food','Shelter','WASH','Livelihood','Healthcare','Protection','Education']
data.columns = new_column_names
data=data.dropna()
data.head(1)
Name Zone Woreda Long Lat Pop Crowded Registration Food Shelter WASH Livelihood Healthcare Protection Education
1 Hareze Seb'ata Eastern Erob 39.5828 14.43946 10410 High Yes, but the list is incomplete No Yes Yes Yes No No No
data.shape
(642, 15)
  3. Binary-code the deficiencies (1 = deficient).

Hide code cell source
# Assuming 'data' is your DataFrame with qualitative columns

# Define mapping functions for each column
def map_crowded(value):
    return 1 if 'High' in value else 0

def map_registration(value):
    return 1 if 'No' in value else 0

def map_food(value):
    return 1 if 'Yes' in value else 0

def map_shelter(value):
    return 1 if 'Yes' in value else 0

def map_wash(value):
    return 1 if 'Yes' in value else 0

def map_livelihood(value):
    return 1 if 'Yes' in value else 0

def map_healthcare(value):
    return 1 if 'Yes' in value else 0

def map_protection(value):
    return 1 if 'Yes' in value else 0

def map_education(value):
    return 1 if 'No' in value else 0

# Apply the mapping functions to each column
data['Crowded'] = data['Crowded'].apply(map_crowded)
data['Registration'] = data['Registration'].apply(map_registration)
data['Food'] = data['Food'].apply(map_food)
data['Shelter'] = data['Shelter'].apply(map_shelter)
data['WASH'] = data['WASH'].apply(map_wash)
data['Livelihood'] = data['Livelihood'].apply(map_livelihood)
data['Healthcare'] = data['Healthcare'].apply(map_healthcare)
data['Protection'] = data['Protection'].apply(map_protection)
data['Education'] = data['Education'].apply(map_education)


# Display the updated DataFrame
data.head()
Name Zone Woreda Long Lat Pop Crowded Registration Food Shelter WASH Livelihood Healthcare Protection Education
1 Hareze Seb'ata Eastern Erob 39.58280 14.43946 10410 1 0 0 1 1 1 0 0 1
2 Enda Mosa Eastern Erob 39.55493 14.42384 10230 1 0 1 1 0 1 0 0 1
3 Adi Abagie North Western Adi Daero 38.18660 14.26590 196 0 1 1 0 0 0 0 0 1
4 May Ambssa North Western Adi Daero 38.23350 14.22740 117 0 1 1 0 0 0 0 0 1
5 Hibret North Western Adi Daero 38.16930 14.32020 242 0 1 1 1 0 0 0 0 1
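The nine near-identical mapping functions above can be collapsed into a single loop over a keyword table. This is an alternative sketch meant to replace (not follow) the cell above, since after that cell runs the columns already hold 0/1 values; deficient_keyword is a name introduced here.

# flag a column as deficient (1) when its raw text contains the given keyword
deficient_keyword = {'Crowded': 'High', 'Registration': 'No', 'Food': 'Yes',
                     'Shelter': 'Yes', 'WASH': 'Yes', 'Livelihood': 'Yes',
                     'Healthcare': 'Yes', 'Protection': 'Yes', 'Education': 'No'}
for col, keyword in deficient_keyword.items():
    data[col] = data[col].apply(lambda v, kw=keyword: 1 if kw in str(v) else 0)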
  4. Find the total number of needs in each camp and the total number of IDPs in camps with i needs.

for i in data.index:
    data.loc[i,"sum"]=int(data.loc[i,'Crowded']+data.loc[i,'Registration']+data.loc[i,'Food']+data.loc[i,'Shelter']+data.loc[i,'WASH']+data.loc[i,'Livelihood']+data.loc[i,'Healthcare']+data.loc[i,'Protection']+data.loc[i,'Education'])
data.head(3)  
Name Zone Woreda Long Lat Pop Crowded Registration Food Shelter WASH Livelihood Healthcare Protection Education sum
1 Hareze Seb'ata Eastern Erob 39.58280 14.43946 10410 1 0 0 1 1 1 0 0 1 5.0
2 Enda Mosa Eastern Erob 39.55493 14.42384 10230 1 0 1 1 0 1 0 0 1 5.0
3 Adi Abagie North Western Adi Daero 38.18660 14.26590 196 0 1 1 0 0 0 0 0 1 3.0
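An equivalent vectorized computation of the per-camp need count (a sketch; need_cols is a name introduced here, and the result is an integer column rather than the float column produced by the loop above):

need_cols = ['Crowded','Registration','Food','Shelter','WASH',
             'Livelihood','Healthcare','Protection','Education']
data["sum"] = data[need_cols].sum(axis=1)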
Hide code cell source
def needlevel(data):
    # needlevel returns a list pop in which pop[i] is the total IDP population
    # living in camps with exactly i deficiencies (i = 0,...,9)
    pop = [0]*10
    for i in data.index:
        pop[int(data.loc[i,"sum"])] += data.loc[i,"Pop"]
    return pop
totalpop=needlevel(data)
np.sum(totalpop)
1021593
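The same tally can be obtained in one line with pandas (a sketch; need levels with no camps are simply absent from the result rather than reported as 0):

data.groupby("sum")["Pop"].sum()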
  5. Create dataframes with data for each of the 6 zones.

data["Zone"].value_counts()
Central          191
North Western    140
Eastern          130
South East        73
Mekelle           55
Southern          53
Name: Zone, dtype: int64
tot=642  #total number of camps
U=data["sum"].sum()/tot
U  #mean number of needs
3.1183800623052957
central_df = data[data['Zone'] == 'Central']
centralpop=needlevel(central_df)
central_df.head(1)
Name Zone Woreda Long Lat Pop Crowded Registration Food Shelter WASH Livelihood Healthcare Protection Education sum
9 Kewanit Central Tahtay Maychew 38.634 14.0922 136 0 0 1 0 1 0 1 0 1 4.0
centralpop
[0, 185, 66443, 98460, 34418, 24262, 0, 0, 0, 0]
NW_df = data[data['Zone'] == 'North Western']
NWpop=needlevel(NW_df)
NW_df.head(1)
Name Zone Woreda Long Lat Pop Crowded Registration Food Shelter WASH Livelihood Healthcare Protection Education sum
3 Adi Abagie North Western Adi Daero 38.1866 14.2659 196 0 1 1 0 0 0 0 0 1 3.0
NWpop
[0, 5917, 84960, 105586, 78965, 34967, 5242, 0, 0, 0]
eastern_df = data[data['Zone'] == 'Eastern']
easternpop=needlevel(eastern_df)
eastern_df.head(1)
Name Zone Woreda Long Lat Pop Crowded Registration Food Shelter WASH Livelihood Healthcare Protection Education sum
1 Hareze Seb'ata Eastern Erob 39.5828 14.43946 10410 1 0 0 1 1 1 0 0 1 5.0
easternpop
[0, 702, 44725, 47223, 14953, 24134, 23113, 0, 0, 0]
# Filter rows for the 'South East' zone
SE_df = data[data['Zone'] == 'South East']
SEpop=needlevel(SE_df)
SE_df.head(1)
Name Zone Woreda Long Lat Pop Crowded Registration Food Shelter WASH Livelihood Healthcare Protection Education sum
263 Adi-Keyh South East Wejerat 39.667 13.0126 326 0 0 1 0 0 0 0 0 1 2.0
SEpop
[0, 469, 11772, 9957, 12005, 0, 0, 0, 0, 0]
mekelle_df = data[data['Zone'] == 'Mekelle']
mekellepop=needlevel(mekelle_df)
mekelle_df.head(1)
Name Zone Woreda Long Lat Pop Crowded Registration Food Shelter WASH Livelihood Healthcare Protection Education sum
559 Hayelom Elementary School Mekelle Hawelti Sub City 39.45346 13.50069 1760 0 1 1 0 0 1 0 0 0 3.0
mekellepop
[0, 14009, 79426, 78206, 29365, 33478, 0, 0, 0, 0]
southern_df = data[data['Zone'] == 'Southern']
southernpop=needlevel(southern_df)
southern_df.head(1)
Name Zone Woreda Long Lat Pop Crowded Registration Food Shelter WASH Livelihood Healthcare Protection Education sum
295 Ebo Southern Raya Azebo 39.6911 12.8469 123 0 0 1 1 0 1 0 0 1 4.0
southernpop
[0, 0, 3203, 10645, 33313, 11490, 0, 0, 0, 0]
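The six per-zone computations above can also be produced in a single loop over the zones (a sketch reusing the needlevel function defined earlier):

for zone, zone_df in data.groupby("Zone"):
    print(zone, needlevel(zone_df))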

15.6.7. Exercises#

Exercises

  1. Compute the mean, weighted mean, entropy, and weighted entropy for each of the 6 zones.

  2. Make a histogram for each zone which shows the number of IDPs living in camps with i deficiencies (i=0,1,2,3,4,5,6,7,8,9).