JNB LAB: Patterns

1.2. JNB LAB: Patterns#

Note

This lab gives an overview of some of the features that can be included in a Jupyter Notebook (JNB). This can be reviewed quickly before proceeding to the next chapter.

In this lab, we use the theme of ‘Patterns’ to introduce a variety of tasks that can be carried out in JNBs, from the pre-college to the REU level:

I. PATTERNS IN NATURE (Pre-college)

1.1 Displaying an Image

1.2 Displaying a YouTube Video

II. PATTERNS IN SOCIAL DATA (Exploratory Data Analysis)

2.1 Reading in CPS Data from the Chicago Data Portal

2.2 Streamlining the Data

2.3 Simplifying the Column Names

2.4 Making a Scatterplot with OLS Regression Line

III. PATTERNS IN MATHEMATICS (Undergrad math)

3.1 Tracing a Parametric Curve

3.2 Plotting Quadric Surfaces and Level Curves

IV. APPLYING MATHEMATICAL PATTERNS TO NATURE AND SOCIETY (REU)

4.1 Identifying Streamlines of Polluted and Freshwater Flow

1.2.1. Patterns in Nature#

Patterns in nature are a beautiful and interesting phenomenon.

Displaying an Image#

a) Here is a NASA Hubble photograph of an exploding supernova. (The image files are available here).

from IPython.display import Image
Image(filename='supernova.png',width=100,height=100)

_images/625155db10e68c87c23db12533a24ddfc001a5f3a068d0ecdfba2462ec3934f5.png

Exercise 1a

Display another beautiful natural pattern shown in the file ‘MtFuji.png’.

Displaying a YouTube Video#

Patterns can arise in natural dynamical systems. For example, here is a scene of waves shown in the video “Ocean Waves.” YouTube, uploaded by BusyBoy Productions, 14 February 2020, https://www.youtube.com/watch?v=J7pBztjUqUc&t=14s. Permissions: YouTube Terms of Service

from IPython.display import YouTubeVideo
YouTubeVideo('J7pBztjUqUc',width=100,height=100)

Exercise 1b

Display another video of a dynamic pattern from nature shown in the video

“Waterfall Clip.” YouTube, uploaded by Bradley Erickson, 21 June 2016, https://www.youtube.com/watch?v=oYEtLQ3lEH0&t=5s. Permissions: YouTube Terms of Service

1.2.2. Patterns in Societal Data#

Individual people, groups, and societies, as a special part of the natural world, give rise to patterns that can be analyzed using data. Here is an example using schools in zip code 60623. See the section OLS Linear Regression.

Reading in CPS Data from the Chicago Data Portal#

We use the Pandas (Python Data Analysis) library (abbreviated as pd) to read in the data. The ‘.head(2)’ method displays the first two rows. We can also list the column names using the ‘.columns’ attribute.

import pandas as pd

raw_CPS_data = pd.read_json('https://data.cityofchicago.org/resource/kh4r-387c.json?$limit=100000')
raw_CPS_data.head(2)

	school_id	legacy_unit_id	finance_id	short_name	long_name	primary_category	is_high_school	is_middle_school	is_elementary_school	is_pre_school	...	fifth_contact_title	fifth_contact_name	seventh_contact_title	seventh_contact_name	refugee_services	visual_impairments	freshman_start_end_time	sixth_contact_title	sixth_contact_name	hard_of_hearing
0	609966	3750	23531	HAMMOND	Charles G Hammond Elementary School	ES	False	True	True	True	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
1	400069	4150	67081	POLARIS	Polaris Charter Academy	ES	False	True	True	False	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN

2 rows × 92 columns

raw_CPS_data.columns

Index(['school_id', 'legacy_unit_id', 'finance_id', 'short_name', 'long_name',
       'primary_category', 'is_high_school', 'is_middle_school',
       'is_elementary_school', 'is_pre_school', 'summary',
       'administrator_title', 'administrator', 'secondary_contact_title',
       'secondary_contact', 'address', 'city', 'state', 'zip', 'phone', 'fax',
       'cps_school_profile', 'website', 'facebook', 'attendance_boundaries',
       'grades_offered_all', 'grades_offered', 'student_count_total',
       'student_count_low_income', 'student_count_special_ed',
       'student_count_english_learners', 'student_count_black',
       'student_count_hispanic', 'student_count_white', 'student_count_asian',
       'student_count_native_american', 'student_count_other_ethnicity',
       'student_count_asian_pacific', 'student_count_multi',
       'student_count_hawaiian_pacific', 'student_count_ethnicity_not',
       'statistics_description', 'demographic_description', 'dress_code',
       'prek_school_day', 'kindergarten_school_day', 'school_hours',
       'after_school_hours', 'earliest_drop_off_time', 'classroom_languages',
       'bilingual_services', 'title_1_eligible', 'preschool_inclusive',
       'preschool_instructional', 'transportation_bus', 'transportation_el',
       'school_latitude', 'school_longitude', 'overall_rating',
       'rating_status', 'rating_statement', 'classification_description',
       'school_year', 'third_contact_title', 'third_contact_name', 'network',
       'is_gocps_participant', 'is_gocps_prek', 'is_gocps_elementary',
       'is_gocps_high_school', 'open_for_enrollment_date', 'twitter',
       'youtube', 'pinterest', 'college_enrollment_rate_school',
       'college_enrollment_rate_mean', 'graduation_rate_school',
       'graduation_rate_mean', 'significantly_modified',
       'transportation_metra', 'fourth_contact_title', 'fourth_contact_name',
       'fifth_contact_title', 'fifth_contact_name', 'seventh_contact_title',
       'seventh_contact_name', 'refugee_services', 'visual_impairments',
       'freshman_start_end_time', 'sixth_contact_title', 'sixth_contact_name',
       'hard_of_hearing'],
      dtype='object')

Streamlining the Data#

In the next cell, we first create a dataframe df1 that includes just five columns, [‘address’,‘student_count_total’,‘student_count_black’,‘student_count_hispanic’,‘zip’], and then filter to zip code 60623. We then drop rows with missing data and report the total number of schools, as well as the enrollments of the largest and the smallest schools.

Total number of CPS schools considered in 60623 is 35
Largest student_count_total =  1072
Smallest student_count_total =  96

	address	student_count_total	student_count_black	student_count_hispanic	zip
0	2819 W 21ST PL	342	33	304	60623
1	2345 S CHRISTIANA AVE	559	66	484	60623
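The streamlining step described above can be sketched roughly as follows. This is only a minimal illustration, not the lab's own cell: the helper name `streamline`, the constant `COLUMNS`, and the small `toy` dataframe (made-up rows standing in for raw_CPS_data) are all assumptions introduced here.

```python
import pandas as pd

# Columns kept in df1 (taken from the output table above).
COLUMNS = ["address", "student_count_total", "student_count_black",
           "student_count_hispanic", "zip"]

def streamline(raw, zip_code=60623):
    """Keep the columns of interest, filter to one zip code, drop incomplete rows."""
    df = raw[COLUMNS]
    # Compare as strings so the filter works whether 'zip' was parsed as int or str.
    df = df[df["zip"].astype(str) == str(zip_code)]
    return df.dropna().reset_index(drop=True)

# Hypothetical stand-in for raw_CPS_data (made-up rows, not real CPS records).
toy = pd.DataFrame({
    "address": ["1 A ST", "2 B AVE", "3 C RD", "4 D PL"],
    "student_count_total": [300, 500, None, 120],
    "student_count_black": [30, 60, 10, 100],
    "student_count_hispanic": [260, 430, 50, 15],
    "zip": [60623, 60623, 60623, 60608],
    "fax": [None, None, None, None],
})

df1 = streamline(toy)
print("Total number of CPS schools considered in 60623 is", len(df1))
print("Largest student_count_total = ", df1["student_count_total"].max())
print("Smallest student_count_total = ", df1["student_count_total"].min())
```

With the real data, the same call would be `streamline(raw_CPS_data)`; the toy rows are used here only so that the sketch runs without a network connection.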

Simplifying the Column Names#

We will abbreviate the column names to [“address”,“total”,“black”,“hispanic”,“zip”] and then add two new columns, ‘%black’ and ‘%hispanic’, which give each group's share of total enrollment as a percentage.

	address	total	black	hispanic	zip	%black	%hispanic
0	2819 W 21ST PL	342	33	304	60623	9.6	88.9
1	2345 S CHRISTIANA AVE	559	66	484	60623	11.8	86.6
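One way the renaming and the percentage columns could be produced is sketched below. The helper name `simplify` is hypothetical, and rounding to one decimal place is an assumption inferred from the displayed values; the two input rows are copied from the df1 output shown earlier.

```python
import pandas as pd

def simplify(df1):
    """Shorten the column names and add percentage columns (rounded to 1 decimal)."""
    df = df1.rename(columns={
        "student_count_total": "total",
        "student_count_black": "black",
        "student_count_hispanic": "hispanic",
    })
    df["%black"] = (100 * df["black"] / df["total"]).round(1)
    df["%hispanic"] = (100 * df["hispanic"] / df["total"]).round(1)
    return df

# The two rows of df1 displayed earlier in this section.
df1 = pd.DataFrame({
    "address": ["2819 W 21ST PL", "2345 S CHRISTIANA AVE"],
    "student_count_total": [342, 559],
    "student_count_black": [33, 66],
    "student_count_hispanic": [304, 484],
    "zip": [60623, 60623],
})

df = simplify(df1)
print(df)
```

For the first row, 100 × 33 / 342 ≈ 9.6 and 100 × 304 / 342 ≈ 88.9, which agrees with the table above.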

Making a Scatterplot with OLS Regression Line#

The plot shows %hispanic (y-axis) against %black (x-axis), with the OLS regression line overlaid.

Intercept is  [98.71284906]
Slope is  [[-0.99515441]]
R^2 for OLS is  0.9996943205528053

_images/8e24f933fb5e31a945c1adfb45266068b3f6f91d0ef3682852d7435d731f3b92.png

This scatterplot shows that the CPS schools considered in 60623 are either predominantly hispanic or predominantly black.
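A scatterplot with a fitted line of this kind can be sketched as follows. The bracketed output above suggests the lab may use a different fitting routine; this sketch swaps in `numpy.polyfit` for the least-squares fit, and the function names `ols_fit` and `scatter_with_ols` as well as the sample percentages are made up for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

def ols_fit(x, y):
    """Return (intercept, slope, R^2) of the least-squares line y = intercept + slope*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (intercept + slope * x)
    r2 = 1 - np.sum(residuals**2) / np.sum((y - y.mean())**2)
    return intercept, slope, r2

def scatter_with_ols(x, y, xlabel, ylabel):
    """Scatter the data and overlay the fitted OLS line."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    intercept, slope, _ = ols_fit(x, y)
    fig, ax = plt.subplots()
    ax.scatter(x, y)
    xs = np.linspace(x.min(), x.max(), 100)
    ax.plot(xs, intercept + slope * xs, color="red")
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    return ax

# Made-up percentages, for illustration only (not the CPS values).
pct_black = [5.0, 10.0, 50.0, 90.0, 95.0]
pct_hispanic = [94.0, 88.5, 49.0, 9.5, 4.0]

intercept, slope, r2 = ols_fit(pct_black, pct_hispanic)
print("Intercept is ", intercept)
print("Slope is ", slope)
print("R^2 for OLS is ", r2)
ax = scatter_with_ols(pct_black, pct_hispanic, "%black", "%hispanic")
```

A slope near -1 with an intercept near 100, as in the output above, means the two percentages nearly sum to a constant across schools.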

Exercise 2

Modify the steps above to create a dataframe df2 and a scatterplot of %black (x-axis) vs. ‘graduation_rate_school’ (y-axis) in zip 60623.

1.2.3. Patterns in Mathematics#

Interesting and beautiful patterns arise throughout mathematics.

Tracing Parametric Curves#

Various patterns can be created using sine, cosine, and exponential functions to define 2D parametric curves. (See the section [parametric equations](./Undergrad/Calculus/6 ParametricEquations.ipynb).)

The parametric equations

\(x(t)=\cos(t)\left[3.5-1.5\lvert\cos t\rvert \sqrt{1.3+\lvert\sin t\rvert}+\cos(2t)-3\sin(t)+0.7\cos(12.2t)\right]\)

\(y(t)=\sin(t)\left[3.5-1.5\lvert\cos t\rvert \sqrt{1.3+\lvert\sin t\rvert}+\cos(2t)-3\sin(t)+0.7\cos(12.2t)\right]\)

create a heart-shaped design.

_images/302adc1b49748f5f0d3e1bbdca49f169adc2994f12e5c6c1533f0aa597dd0176.png
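A curve like this can be traced by sampling the parameter and plotting the resulting points. The sketch below is an assumption-laden illustration: the function name `heart_curve` is hypothetical, and the parameter range 0 ≤ t ≤ 2π is assumed because the text does not state it.

```python
import numpy as np
import matplotlib.pyplot as plt

def heart_curve(t):
    """Return (x(t), y(t)) for the parametric equations given above."""
    r = (3.5
         - 1.5 * np.abs(np.cos(t)) * np.sqrt(1.3 + np.abs(np.sin(t)))
         + np.cos(2 * t)
         - 3 * np.sin(t)
         + 0.7 * np.cos(12.2 * t))
    return np.cos(t) * r, np.sin(t) * r

# Assumed parameter range; the lab text does not specify one.
t = np.linspace(0, 2 * np.pi, 2000)
x, y = heart_curve(t)

fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_aspect("equal")  # equal scaling so the shape is not distorted
ax.set_xlabel("x")
ax.set_ylabel("y")
```

Both coordinates share the same bracketed factor, so the curve is a polar plot with that factor as the radius; because of the cos(12.2t) term the radius is not exactly 2π-periodic, so the endpoints of the traced curve need not coincide.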

Exercise 3a

What familiar object does the parametric curve traced out by the following equations resemble?

\(x(t)=\sin(t)[e^{\cos t}-2\cos(4t)-\sin^5(t/12)]\),

\(y(t)=\cos(t)[e^{\cos t}-2\cos(4t)-\sin^5(t/12)]\)

Plotting Quadric Surfaces and Level Curves#

Quadric surfaces are another source of interesting patterns and can be constructed and analyzed using level curves. We can use the following function to plot quadric surfaces. The input variable fn is an expression in three variables (x,y,z). The function plot_implicit() plots the implicit relation fn=0 by plotting level curves parallel to the xy plane, xz plane, and yz plane.
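The definition of plot_implicit() is not shown in this section. The sketch below illustrates one way such a helper might work, drawing the zero contour of fn on slices parallel to each coordinate plane with matplotlib's 3D `contour`. The name `plot_implicit_sketch` and its parameters `bbox`, `n_grid`, and `n_slices` are hypothetical, and the lab's actual implementation may differ.

```python
import warnings
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (registers the 3d projection on older matplotlib)

def plot_implicit_sketch(fn, bbox=(-2.5, 2.5), n_grid=60, n_slices=15):
    """Plot fn(x, y, z) = 0 via level curves on slices parallel to the coordinate planes."""
    lo, hi = bbox
    fig = plt.figure()
    ax = fig.add_subplot(111, projection="3d")
    A = np.linspace(lo, hi, n_grid)    # resolution within each slice
    B = np.linspace(lo, hi, n_slices)  # positions of the slices
    A1, A2 = np.meshgrid(A, A)
    with warnings.catch_warnings():
        # Slices that miss the surface contain no contour; silence that warning.
        warnings.simplefilter("ignore")
        for z in B:  # slices parallel to the xy plane
            ax.contour(A1, A2, fn(A1, A2, z) + z, [z], zdir="z")
        for y in B:  # slices parallel to the xz plane
            ax.contour(A1, fn(A1, y, A2) + y, A2, [y], zdir="y")
        for x in B:  # slices parallel to the yz plane
            ax.contour(fn(x, A1, A2) + x, A1, A2, [x], zdir="x")
    ax.set_xlim(lo, hi)
    ax.set_ylim(lo, hi)
    ax.set_zlim(lo, hi)
    return ax

# Illustration with a sphere of radius 2 (not one of the lab's examples).
def sphere(x, y, z):
    return x**2 + y**2 + z**2 - 4

ax = plot_implicit_sketch(sphere)
```

Adding the slice position to fn and asking for that same level places the curve fn = 0 at the correct height along the slicing axis.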

For example, let us plot the elliptic paraboloid

\[ z=(\frac{x}{2})^2+(\frac{y}{3})^2 \tag{1.1} \]

def elliptic_paraboloid(x,y,z):
    # implicit form fn(x,y,z)=0 of z=(x/2)^2+(y/3)^2
    return (x/2)**2+(y/3)**2-z

plot_implicit(elliptic_paraboloid)

_images/54ca0b025846574a2ce3a18e2b45b1b5ac8a4ead228fd3dfd3a7d6d5f9fb0915.png

We can sketch the level curves of

\[ z=(\frac{x}{2})^2+(\frac{y}{3})^2 \]

as follows:

_images/dce093bde1080d41999e04470d048d6fae7a7185af07657d505d80985bd44918.png
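A level-curve sketch of this kind can be produced with matplotlib's `contour`. The following is a minimal illustration rather than the lab's own cell: the function name `paraboloid_height`, the plotting window, and the chosen levels are assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

def paraboloid_height(x, y):
    """z = (x/2)^2 + (y/3)^2."""
    return (x / 2) ** 2 + (y / 3) ** 2

# Assumed plotting window and levels, chosen for illustration.
x = np.linspace(-6, 6, 200)
y = np.linspace(-9, 9, 200)
X, Y = np.meshgrid(x, y)
Z = paraboloid_height(X, Y)

fig, ax = plt.subplots()
cs = ax.contour(X, Y, Z, levels=[1, 2, 4, 6, 8])
ax.clabel(cs)            # label each level curve with its z value
ax.set_aspect("equal")
ax.set_xlabel("x")
ax.set_ylabel("y")
```

Each level curve z = c is an ellipse with semi-axes 2√c along x and 3√c along y, so the curves are nested ellipses of the same shape.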

Exercise 3b

Make a sketch of the hyperboloid of one sheet defined by

\((\frac{z}{4})^2=(\frac{x}{2})^2+(\frac{y}{3})^2-1.\)

Then plot the level curves of the function

\(z=4\sqrt{(\frac{x}{2})^2+(\frac{y}{3})^2-1}.\)

1.2.4. Connecting Patterns in Mathematics to Patterns in Nature and Society#

In complex variables, we study functions of a complex variable \(f(z)=f(x+iy)=u(x,y)+iv(x,y)\), which map a complex variable \(z=x+iy\), represented by a point \((x,y)\) in the complex plane, to another complex variable \(u+iv\), modeled as a point \((u,v)\) in another complex plane.

Such functions have wide-ranging applications including applications to analysis of 2-dimensional fluid flow.

The following function \(\Omega(z)\), called a “complex potential,” can be used to model a scenario where there is a source of pollution located at (-1,0) as well as an extraction well or sink of equal strength located at (1,0).

\[ \Omega(z) = -z - 2\ln(z+1) + 2\ln(z-1) \]

The imaginary part of \(\Omega\) is called the stream function \(\Psi\). The level curves of \(\Psi\) are called streamlines, and model the path along which the fluid particles flow.

Identifying Streamlines of Polluted and Freshwater Flow#

We now make a plot of the streamlines. A Rankine Oval separates the freshwater streamlines from the polluted streamlines.

_images/44634cd5c98852454f3cda16b17821bfddba12f8e05b7bd19d70ba8980d04f77.png

Note that the extraction well is sufficiently strong to capture all the pollution streamlines. The boundary between the polluted and freshwater streamlines is called a “Rankine Oval.”
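The streamline plot can be approximated by contouring the imaginary part of \(\Omega\). The sketch below is an illustration under stated assumptions: the function names, the plotting window, and the contour levels are hypothetical choices, and the grid is chosen so that it avoids the real axis, where the logarithms are singular or discontinuous.

```python
import numpy as np
import matplotlib.pyplot as plt

def complex_potential(z):
    """Omega(z) = -z - 2 ln(z+1) + 2 ln(z-1), using the principal branch of ln."""
    return -z - 2 * np.log(z + 1) + 2 * np.log(z - 1)

def stream_function(x, y):
    """Psi(x, y) = Im(Omega(x + iy))."""
    return np.imag(complex_potential(x + 1j * y))

# Even point counts keep y = 0 (and hence the points z = -1 and z = 1) off the grid.
x = np.linspace(-4, 4, 400)
y = np.linspace(-3, 3, 300)
X, Y = np.meshgrid(x, y)
Psi = stream_function(X, Y)

fig, ax = plt.subplots()
ax.contour(X, Y, Psi, levels=np.linspace(-6, 6, 25))            # streamlines
ax.contour(X, Y, Psi, levels=[0], colors="black", linewidths=2)  # Psi = 0
ax.plot([-1, 1], [0, 0], "o", color="red")                       # source and sink
ax.set_aspect("equal")
ax.set_xlabel("x")
ax.set_ylabel("y")
```

For this potential the \(\Psi=0\) level set contains the dividing streamline, so the thick black contour outlines the oval. With the principal branch of the logarithm, \(\Psi\) jumps across the segment of the real axis between the source and the sink, so contour lines bunched along that segment are a plotting artifact rather than streamlines.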

Exercise 4

Read the paragraph “Model Output” in Section 4 “Contaminant Extraction Modeling” of the COMPLEX VARIABLES IN GROUNDWATER MODELING chapter. Then explain the difference between an Ineffective, Regular, and Inefficient system.