JNB Lab Solutions

6.8. JNB Lab Solutions

import pandas as pd
framingham = pd.read_csv("framingham.csv")

	AGE	SYSBP	DIABP	TOTCHOL	CURSMOKE	GLUCOSE	DEATH	ANYCHD
0	39	106.0	70.0	195.0	0	77.0	0	1
1	46	121.0	81.0	250.0	0	76.0	0	0
2	48	127.5	80.0	245.0	1	70.0	0	0
3	61	150.0	95.0	225.0	1	103.0	1	0
4	46	130.0	84.0	285.0	1	85.0	0	0

6.8.1. Part 1: Explore the data

Write a line or two of code to figure out how many people in the study have an occurrence of CHD and how many do not.

There are 954 people in the study with CHD and 2888 without CHD.

954 2888
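One possible way to get these counts is sketched below. It assumes that the ANYCHD column codes CHD as 1 and no CHD as 0, as the table preview suggests. The helper name `count_chd` and the toy frame are made up for illustration and are not part of the lab.

```python
import pandas as pd

def count_chd(df):
    """Return (number with CHD, number without CHD) from the ANYCHD column."""
    counts = df["ANYCHD"].value_counts()
    # .get with a default keeps this working if one group happens to be empty
    return int(counts.get(1, 0)), int(counts.get(0, 0))

# Tiny made-up frame, only to show the call; this is not the Framingham data
toy = pd.DataFrame({"ANYCHD": [1, 0, 0, 0, 1]})
print(count_chd(toy))  # (2, 3)
```

Calling `count_chd(framingham)` on the data loaded above should give the two counts reported here.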

Make a histogram of the cholesterol levels for both samples. Describe the center and shape of each distribution, and compare the two.

[Figure: histograms of total cholesterol for the CHD and non-CHD groups]
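A plot along these lines could be produced as in the following sketch. It assumes matplotlib is available and that TOTCHOL and ANYCHD are the relevant columns. The function name `plot_cholesterol_by_chd`, the bin count, and the choice to overlay the groups on a density scale are illustrative choices, not necessarily how the original figure was made.

```python
import matplotlib.pyplot as plt
import pandas as pd

def plot_cholesterol_by_chd(df, bins=30):
    """Overlay TOTCHOL histograms for the CHD and non-CHD groups."""
    # Split by CHD status and drop missing cholesterol readings
    chd = df.loc[df["ANYCHD"] == 1, "TOTCHOL"].dropna()
    no_chd = df.loc[df["ANYCHD"] == 0, "TOTCHOL"].dropna()

    fig, ax = plt.subplots()
    # density=True puts both groups on a comparable scale despite unequal sizes
    ax.hist(no_chd, bins=bins, alpha=0.5, density=True, label="No CHD")
    ax.hist(chd, bins=bins, alpha=0.5, density=True, label="CHD")
    ax.set_xlabel("Total cholesterol (TOTCHOL)")
    ax.set_ylabel("Density")
    ax.legend()
    return fig, ax
```

With the data loaded above, `plot_cholesterol_by_chd(framingham)` followed by `plt.show()` would display the two distributions on one set of axes, which makes it easier to compare their centers and shapes.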

6.8.2. Part 2: Two Sample T-tests

Describe the assumptions of the T-test and comment on whether they hold for this example. Your work in Part 1 should be enough.

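The two-sample t-test assumes that the two samples are independent random samples and that each sample mean is approximately normally distributed. The normality condition is reasonable here since we have a large sample size in both samples (\(n_1 = 954\) and \(n_2 = 2888\)), and both sample distributions are relatively mound shaped.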

Compute the test statistic and \(P\)-value using cm.ttest_ind.

Results of T-test: test statistic is 9.766 with 1545.774 degrees of freedom.
P-value is 0.000.
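The fractional degrees of freedom suggest that the unequal-variance (Welch) form of the test was used. As a rough cross-check of what that version computes, here is a minimal sketch written with NumPy and SciPy rather than with the `cm` object used in the lab. The function name `welch_t_test` is hypothetical, and `x` and `y` stand for the TOTCHOL values of the CHD and non-CHD groups with missing values removed.

```python
import numpy as np
from scipy import stats

def welch_t_test(x, y):
    """Two-sided Welch t-test: returns (t statistic, degrees of freedom, p-value)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Estimated variance of each sample mean: s^2 / n
    vx = x.var(ddof=1) / x.size
    vy = y.var(ddof=1) / y.size
    t_stat = (x.mean() - y.mean()) / np.sqrt(vx + vy)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (vx + vy) ** 2 / (vx ** 2 / (x.size - 1) + vy ** 2 / (y.size - 1))
    # Two-sided p-value from the t distribution's upper tail
    p_value = 2 * stats.t.sf(abs(t_stat), df)
    return t_stat, df, p_value
```

Under these assumptions, applying the function to the two cholesterol samples should give values close to the results printed above, up to rounding.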

Write your conclusion in a complete sentence.

Solution: Our test yields a test statistic of 9.766 on 1545.774 degrees of freedom, with a p-value that is 0 to three decimal places (that is, less than 0.0005). Thus, we reject the null hypothesis that the average total cholesterol levels are the same in both populations.

Give a confidence interval for the difference in average total cholesterol.

Solution: A 95% confidence interval for the difference is (13.295, 19.977), meaning that we are 95% confident that average total cholesterol is (roughly) 13 to 20 points higher in patients with CHD than in patients without CHD.

95% confidence interval is (13.295, 19.977).
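An interval of this kind has the form (difference in sample means) ± (t critical value) × (standard error). The sketch below shows that calculation for the unequal-variance case using NumPy and SciPy. It is an illustration under that assumption, not necessarily the call used in the lab, and the function name `welch_confint_diff` is hypothetical.

```python
import numpy as np
from scipy import stats

def welch_confint_diff(x, y, confidence=0.95):
    """Confidence interval for mean(x) - mean(y) without assuming equal variances."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Estimated variance of each sample mean: s^2 / n
    vx = x.var(ddof=1) / x.size
    vy = y.var(ddof=1) / y.size
    diff = x.mean() - y.mean()
    se = np.sqrt(vx + vy)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (vx + vy) ** 2 / (vx ** 2 / (x.size - 1) + vy ** 2 / (y.size - 1))
    # Critical value leaving (1 - confidence) / 2 in each tail
    t_crit = stats.t.ppf(1 - (1 - confidence) / 2, df)
    return diff - t_crit * se, diff + t_crit * se
```

Passing the CHD group as `x` and the non-CHD group as `y` gives an interval for how much higher the average is in the CHD group. An interval that lies entirely above 0 agrees with rejecting the null hypothesis of equal means at the matching significance level.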
