
STAT 218 - Week 4, Lecture 3
For a quantitative variable, we use different symbols than for a categorical variable to denote population parameters and sample statistics.
Parameters
\(\mu\) = population mean
\(\sigma\) = population standard deviation
Statistics
\(\bar{x}\) = sample mean
\(s\) = sample standard deviation
Observational Unit: A student in that college.
Variable: Cumulative GPA
Statistic: Sample mean

In previous chapters, we focused on inferences about a population proportion (categorical variable).
Now, we will focus on data consisting of a single quantitative variable.
We will make inferences about a population mean (our new parameter) by creating confidence intervals.
The general form of a confidence interval is
\[ statistic \pm multiplier \times (SD \ of \ statistic) \]
Here, our statistic will be the sample mean and the multiplier will come from the \(t\)-distribution.
\[ \bar{x} \pm t^* \times (s / \sqrt{n}) \]
The \(t\)-distribution is another bell-shaped, symmetric distribution. It is useful when we do not know the population standard deviation \(\sigma\) and have to estimate it with the sample standard deviation \(s\).
The \(t\)-distribution is always centered at zero and has a single parameter: degrees of freedom.
Broadly speaking, we use the \(t\)-distribution with \(df = n - 1\), where \(n\) is the sample size.

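Both the \(t\)-distribution and the standard normal distribution are symmetric and bell-shaped, but the \(t\)-distribution has a larger standard deviation (its tails are heavier).
The \(t\)-distribution has a single parameter: degrees of freedom.
The normal distribution has two parameters: \(\mu\) and \(\sigma\). The standard normal distribution fixes them at \(\mu = 0\) and \(\sigma = 1\).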
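The comparison between the two curves can be checked numerically. The sketch below is only an illustration: the function names are made up for this example, and \(df = 13\) is an assumed value (it corresponds to a sample of size 14). It uses the standard density formula for the \(t\)-distribution and the fact that its standard deviation is \(\sqrt{df / (df - 2)}\) when \(df > 2\).

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with `df` degrees of freedom at x."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def normal_pdf(x):
    """Density of the standard normal distribution (mu = 0, sigma = 1) at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def t_sd(df):
    """Standard deviation of the t-distribution; only defined for df > 2."""
    if df <= 2:
        raise ValueError("the standard deviation requires df > 2")
    return math.sqrt(df / (df - 2))

df = 13  # assumed for illustration: a sample of size n = 14
for x in (0.0, 1.0, 2.0, 3.0):
    print(f"x = {x:.1f}  t density = {t_pdf(x, df):.4f}  normal density = {normal_pdf(x):.4f}")
print(f"SD of t with df = {df}: {t_sd(df):.3f} (standard normal SD is 1)")
```

The \(t\) curve is lower than the normal curve at the center and higher in the tails, which is why its multipliers are larger than the corresponding normal ones.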
Let’s have a look at the wing areas of 14 male Monarch butterflies at Oceano Dunes State Park in California.
Suppose we consider these 14 observations as a random sample from a population.
For a 95% confidence level with \(df = 14 - 1 = 13\), the multiplier is given as
\[ multiplier = 2.160 \]
The 95% confidence interval (CI) for \(\mu\) can be calculated as follows:
\[ 95 \% \ CI = \bar{x} \pm multiplier \times SE_{\bar{x}} \\ 95 \% \ CI = 32.8143 \pm 2.160 \times 2.4757 / \sqrt{14} \]
\[ = 32.81 \pm 1.43 \\ 31.38 \ cm^2 < \mu < 34.24 \ cm^2 \]
The 90% confidence interval (CI) for \(\mu\) can be calculated as follows (multiplier: 1.771):
\[ 90 \% \ CI = \bar{x} \pm multiplier \times SE_{\bar{x}} \\ 90 \% \ CI = 32.8143 \pm 1.771 \times 2.4757 / \sqrt{14} \]
\[ = 32.81 \pm 1.17 \\ 31.64 \ cm^2 < \mu < 33.98 \ cm^2 \]
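The interval arithmetic can be reproduced in a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `t_interval` is hypothetical, and the multipliers are typed in as given above rather than looked up from a \(t\) table.

```python
import math

def t_interval(sample_mean, sample_sd, n, multiplier):
    """Return (lower, upper) for: sample_mean +/- multiplier * sample_sd / sqrt(n)."""
    se = sample_sd / math.sqrt(n)   # standard error of the sample mean
    margin = multiplier * se        # margin of error
    return sample_mean - margin, sample_mean + margin

# Summary statistics for the 14 butterflies, taken from the formulas above
x_bar, s, n = 32.8143, 2.4757, 14

ci95 = t_interval(x_bar, s, n, multiplier=2.160)  # t* for 95%, df = 13
ci90 = t_interval(x_bar, s, n, multiplier=1.771)  # t* for 90%, df = 13

print(f"95% CI: {ci95[0]:.2f} to {ci95[1]:.2f} cm^2")
print(f"90% CI: {ci90[0]:.2f} to {ci90[1]:.2f} cm^2")
```

The printed endpoints may differ from a hand calculation in the last decimal place, because a hand calculation usually rounds the mean and the margin of error before combining them. The 90% interval is narrower than the 95% interval because its multiplier is smaller.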
What are the differences between the 90% CI and the 95% CI?
And…
If we calculate a 95% confidence interval for each of these 100 samples, we will find that around 95% of these intervals capture the true population mean.
We are 95% confident that the true population mean is in this confidence interval.
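The repeated-sampling idea can be tried out with a small simulation. Everything about the population here is an assumption made for illustration: a normal population with hypothetical values \(\mu = 32.8\) and \(\sigma = 2.5\), samples of size 14, and the 95% multiplier 2.160 for \(df = 13\). The function name `coverage` and the seed are also made up for this sketch.

```python
import math
import random
import statistics

def coverage(n_samples, n=14, multiplier=2.160, mu=32.8, sigma=2.5, seed=218):
    """Fraction of simulated t-intervals that capture the (assumed) true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(n_samples):
        # Draw one random sample of size n from the assumed normal population
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = statistics.mean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)  # s / sqrt(n)
        if x_bar - multiplier * se <= mu <= x_bar + multiplier * se:
            hits += 1
    return hits / n_samples

print("share of 100 intervals capturing mu:   ", coverage(100))
print("share of 10,000 intervals capturing mu:", coverage(10_000))
```

With only 100 samples the share can land a few points above or below 0.95. With many more samples it settles close to 0.95, which is what the confidence level describes.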
Recall that
\[ SE_{\bar{x}} = \frac{s}{\sqrt{n}} \]
We can use this formula to determine our sample size as follows:
\[ Desired \ SE = \frac{Guessed \ SD}{\sqrt{n}} \]
Suppose the researcher is now planning a new study of Monarch butterflies at Oceano Dunes State Park in California and has decided that the SE should be no more than \(0.4 \ cm^2\).
\[ SE_{\bar{x}} = s / \sqrt{n} \]
\[ Desired \ SE = Guessed \ SD / \sqrt{n} \]
Using the earlier sample standard deviation, \(s \approx 2.48\), as the guessed SD:
\[ 2.48 / \sqrt{n} \le 0.4 \\ \sqrt{n} \ge 2.48 / 0.4 = 6.2 \\ n \ge 38.44 \]
\[ at \ least \ 39 \ butterflies \]
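The same calculation can be written as a short helper. This is a sketch under the assumptions already stated: the guessed SD is 2.48 and the desired SE is 0.4. The function name `required_sample_size` is hypothetical.

```python
import math

def required_sample_size(guessed_sd, desired_se):
    """Smallest whole n with guessed_sd / sqrt(n) <= desired_se."""
    if guessed_sd <= 0 or desired_se <= 0:
        raise ValueError("guessed_sd and desired_se must be positive")
    # Solve guessed_sd / sqrt(n) <= desired_se for n, then round up
    return math.ceil((guessed_sd / desired_se) ** 2)

n_needed = required_sample_size(guessed_sd=2.48, desired_se=0.4)
print(n_needed)  # 39
```

We round up rather than to the nearest whole number, because a sample of 38 would leave the SE slightly above the target.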