A public health researcher conducted a community-based nutrition intervention in a rural district, teaching families about balanced diets and meal planning. After six months, she measured the average change in daily vegetable consumption among 80 participating families. The sample showed an average increase of 1.8 servings per day—an encouraging finding that she planned to present to the regional health committee to secure funding for expansion.
However, when the committee chair asked, “How confident are you that this 1.8 servings represents the true population effect?”, she realized her point estimate told only part of the story. While 1.8 servings was her best estimate from the sample, the true population mean could plausibly lie anywhere in a range such as 1.3 to 2.3 servings, or perhaps an even wider one. This range—the confidence interval—communicates the precision of her estimate and the degree of uncertainty inherent in sampling.
Understanding confidence intervals is fundamental to statistical inference. They provide a range of plausible values for population parameters, acknowledging that our sample-based estimates are subject to sampling variability. Rather than offering a single number that falsely implies certainty, confidence intervals honestly communicate what we can reasonably conclude from our data.
A confidence interval is a range of values that likely contains an unknown population parameter with a specified level of confidence. Most commonly, we construct 95% confidence intervals, meaning that if we repeated our study many times and calculated a confidence interval each time, approximately 95% of those intervals would contain the true population parameter.
Confidence Level (1 - α): The probability that the interval contains the true parameter when the procedure is repeated many times. Common choices are 90%, 95%, and 99%.
Margin of Error: The distance from the point estimate to the endpoints of the interval, determined by the standard error and critical value.
Width of the Interval: Reflects precision—narrower intervals indicate more precise estimates. Width depends on:
- Sample size (larger n → narrower intervals)
- Variability in the data (larger SD → wider intervals)
- Confidence level (higher confidence → wider intervals)
INCORRECT interpretation: “There is a 95% probability that the true parameter lies within this specific interval.”
CORRECT interpretation: “If we repeated this study many times and calculated 95% CIs each time, approximately 95% of those intervals would contain the true parameter.”
The true parameter is fixed (though unknown); the interval is what varies across repeated samples.
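A quick simulation makes this repeated-sampling interpretation concrete. The population mean, SD, and sample size below are illustrative values, not taken from the scenarios in this chapter:
set.seed(123)
true_mean <- 50
true_sd <- 10
n_sample <- 40
n_reps <- 10000
# For each replicate, draw a sample, build a 95% CI, and check whether it covers the true mean
covered <- replicate(n_reps, {
ci <- t.test(rnorm(n_sample, true_mean, true_sd), conf.level = 0.95)$conf.int
ci[1] <= true_mean && true_mean <= ci[2]
})
mean(covered)  # long-run coverage of the procedure; should be close to 0.95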
For a population mean μ with known variance, the confidence interval is:
\[ \bar{X} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \]
When the population variance is unknown (the usual case), we use the t-distribution:
\[ \bar{X} \pm t_{\alpha/2, df} \cdot \frac{s}{\sqrt{n}} \]
where:
- \(\bar{X}\) is the sample mean
- \(t_{\alpha/2, df}\) is the critical value from the t-distribution with df = n - 1
- \(s\) is the sample standard deviation
- \(n\) is the sample size
For a population proportion p:
\[ \hat{p} \pm z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
where \(\hat{p}\) is the sample proportion.
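As a quick illustration of this formula (the counts below are made up), the Wald interval can be computed directly. Note that R’s prop.test() returns a score-based interval with continuity correction, so it will differ slightly from the Wald calculation:
successes <- 130; trials <- 200  # hypothetical counts
p_hat <- successes / trials
se_hat <- sqrt(p_hat * (1 - p_hat) / trials)
p_hat + c(-1, 1) * qnorm(0.975) * se_hat  # Wald 95% CI
prop.test(successes, trials)$conf.int     # score interval (with continuity correction), for comparison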
For independent samples:
\[ (\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2, df} \cdot \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \]
For a regression coefficient \(\beta_j\):
\[ \hat{\beta}_j \pm t_{\alpha/2, df} \cdot SE(\hat{\beta}_j) \]
where \(SE(\hat{\beta}_j)\) is the standard error of the coefficient.
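This formula can be verified against R’s confint() on a small simulated dataset (toy values, unrelated to the scenarios below):
set.seed(1)
x <- rnorm(30)
y <- 2 + 1.5 * x + rnorm(30)  # toy regression data
fit <- lm(y ~ x)
est <- coef(summary(fit))["x", "Estimate"]
se  <- coef(summary(fit))["x", "Std. Error"]
tc  <- qt(0.975, df = fit$df.residual)  # df = n - 2 for simple regression
c(est - tc * se, est + tc * se)  # manual 95% CI for the slope
confint(fit, "x", level = 0.95)  # matches the manual calculation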
We’ll create multiple datasets to demonstrate confidence interval construction across various statistical contexts.
set.seed(2025)
# Scenario 1: Single mean (Nutrition intervention - vegetable servings)
# Daily vegetable servings after intervention
vegetable_servings <- rnorm(80, mean = 1.8, sd = 0.9)
nutrition_data <- data.frame(
servings = vegetable_servings
)
# Scenario 2: Two-group comparison (Sleep study)
# Sleep quality scores (0-100 scale) for two interventions
meditation_group <- rnorm(50, mean = 72, sd = 12)
exercise_group <- rnorm(50, mean = 68, sd = 12)
sleep_data <- data.frame(
quality = c(meditation_group, exercise_group),
intervention = factor(rep(c("Meditation", "Exercise"), each = 50))
)
# Scenario 3: Proportion (Screening program)
# COVID-19 vaccination uptake in two communities
n_urban <- 200
n_rural <- 180
vaccinated_urban <- rbinom(1, n_urban, 0.78)
vaccinated_rural <- rbinom(1, n_rural, 0.65)
# Scenario 4: Regression (Stress and productivity)
n <- 120
stress_level <- rnorm(n, mean = 50, sd = 15)
productivity <- 85 - 0.6 * stress_level + rnorm(n, mean = 0, sd = 8)
stress_data <- data.frame(
stress = stress_level,
productivity = productivity
)
# Scenario 5: Correlation (Social support and well-being)
social_support <- rnorm(100, mean = 60, sd = 15)
wellbeing <- 30 + 0.7 * social_support + rnorm(100, mean = 0, sd = 10)
support_data <- data.frame(
support = social_support,
wellbeing = wellbeing
)
Before calculating CIs, let’s visualize our data to understand the distributions.
library(ggplot2)
# Plot 1: Distribution of vegetable servings
p1 <- ggplot(nutrition_data, aes(x = servings)) +
geom_histogram(bins = 20, fill = "#3498DB", alpha = 0.7, color = "black") +
geom_vline(aes(xintercept = mean(servings)), color = "#E74C3C",
linetype = "dashed", linewidth = 1) +
labs(title = "Distribution of Daily Vegetable Servings",
x = "Servings per Day", y = "Frequency") +
theme_minimal() +
theme(
panel.grid.major = element_line(color = "gray90", linetype = "dashed"),
panel.grid.minor = element_blank(),
axis.line.x = element_line(color = "black"),
axis.line.y = element_line(color = "black"),
panel.border = element_blank(),
axis.line.x.top = element_blank(),
axis.line.y.right = element_blank(),
plot.title = element_text(hjust = 0.5, face = "bold")
)
# Plot 2: Sleep quality by intervention
p2 <- ggplot(sleep_data, aes(x = intervention, y = quality, fill = intervention)) +
geom_boxplot(alpha = 0.7) +
geom_jitter(width = 0.2, alpha = 0.3) +
stat_summary(fun = mean, geom = "point", shape = 23, size = 3,
fill = "red", color = "black") +
labs(title = "Sleep Quality by Intervention Type",
x = "Intervention", y = "Sleep Quality Score") +
scale_fill_manual(values = c("#E74C3C", "#27AE60")) +
theme_minimal() +
theme(
panel.grid.major = element_line(color = "gray90", linetype = "dashed"),
panel.grid.minor = element_blank(),
axis.line.x = element_line(color = "black"),
axis.line.y = element_line(color = "black"),
panel.border = element_blank(),
axis.line.x.top = element_blank(),
axis.line.y.right = element_blank(),
plot.title = element_text(hjust = 0.5, face = "bold"),
legend.position = "none"
)
# Plot 3: Stress and productivity
p3 <- ggplot(stress_data, aes(x = stress, y = productivity)) +
geom_point(alpha = 0.6, color = "#9B59B6") +
geom_smooth(method = "lm", se = TRUE, color = "#E74C3C", fill = "#E74C3C", alpha = 0.2) +
labs(title = "Stress Level vs. Productivity",
x = "Stress Level", y = "Productivity Score") +
theme_minimal() +
theme(
panel.grid.major = element_line(color = "gray90", linetype = "dashed"),
panel.grid.minor = element_blank(),
axis.line.x = element_line(color = "black"),
axis.line.y = element_line(color = "black"),
panel.border = element_blank(),
axis.line.x.top = element_blank(),
axis.line.y.right = element_blank(),
plot.title = element_text(hjust = 0.5, face = "bold")
)
# Plot 4: Social support and wellbeing
p4 <- ggplot(support_data, aes(x = support, y = wellbeing)) +
geom_point(alpha = 0.6, color = "#F39C12") +
geom_smooth(method = "lm", se = TRUE, color = "#E74C3C", fill = "#E74C3C", alpha = 0.2) +
labs(title = "Social Support vs. Well-being",
x = "Social Support Score", y = "Well-being Score") +
theme_minimal() +
theme(
panel.grid.major = element_line(color = "gray90", linetype = "dashed"),
panel.grid.minor = element_blank(),
axis.line.x = element_line(color = "black"),
axis.line.y = element_line(color = "black"),
panel.border = element_blank(),
axis.line.x.top = element_blank(),
axis.line.y.right = element_blank(),
plot.title = element_text(hjust = 0.5, face = "bold")
)
# Arrange plots
library(gridExtra)
grid.arrange(p1, p2, p3, p4, ncol = 2)
Now we calculate confidence intervals for each scenario using actual R output.
# Calculate mean and CI using t.test
t_result <- t.test(nutrition_data$servings, conf.level = 0.95)
# Extract values
mean_servings <- mean(nutrition_data$servings)
sd_servings <- sd(nutrition_data$servings)
n_servings <- length(nutrition_data$servings)
se_servings <- sd_servings / sqrt(n_servings)
# Calculate CI manually to verify
df <- n_servings - 1
t_critical <- qt(0.975, df)
margin_error <- t_critical * se_servings
ci_lower_manual <- mean_servings - margin_error
ci_upper_manual <- mean_servings + margin_error
# Display results
cat("Confidence Interval for Mean Vegetable Servings:\n")
## Confidence Interval for Mean Vegetable Servings:
cat("Sample size:", n_servings, "\n")
## Sample size: 80
cat("Mean:", round(mean_servings, 3), "\n")
## Mean: 1.841
cat("SD:", round(sd_servings, 3), "\n")
## SD: 0.915
cat("SE:", round(se_servings, 4), "\n")
## SE: 0.1023
cat("t critical value (df =", df, "):", round(t_critical, 3), "\n")
## t critical value (df = 79 ): 1.99
cat("Margin of error:", round(margin_error, 4), "\n")
## Margin of error: 0.2037
cat("95% CI: [", round(t_result$conf.int[1], 3), ",",
round(t_result$conf.int[2], 3), "]\n")
## 95% CI: [ 1.637 , 2.044 ]
cat("t-statistic:", round(t_result$statistic, 3), "\n")
## t-statistic: 17.989
cat("p-value:", format(t_result$p.value, scientific = TRUE, digits = 3), "\n")
## p-value: 1.15e-29
# Independent samples t-test
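# Note: with the default alphabetical factor levels ("Exercise" < "Meditation"),
# t.test() reports the difference and its CI as Exercise - Meditation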
t_sleep <- t.test(quality ~ intervention, data = sleep_data,
var.equal = TRUE, conf.level = 0.95)
# Calculate statistics for each group
meditation_mean <- mean(sleep_data$quality[sleep_data$intervention == "Meditation"])
exercise_mean <- mean(sleep_data$quality[sleep_data$intervention == "Exercise"])
meditation_sd <- sd(sleep_data$quality[sleep_data$intervention == "Meditation"])
exercise_sd <- sd(sleep_data$quality[sleep_data$intervention == "Exercise"])
n_meditation <- sum(sleep_data$intervention == "Meditation")
n_exercise <- sum(sleep_data$intervention == "Exercise")
mean_diff <- meditation_mean - exercise_mean
# Display results
cat("Confidence Interval for Difference in Sleep Quality:\n")
## Confidence Interval for Difference in Sleep Quality:
cat("Meditation group: n =", n_meditation, ", M =", round(meditation_mean, 2),
", SD =", round(meditation_sd, 2), "\n")
## Meditation group: n = 50 , M = 68.79 , SD = 11.36
cat("Exercise group: n =", n_exercise, ", M =", round(exercise_mean, 2),
", SD =", round(exercise_sd, 2), "\n")
## Exercise group: n = 50 , M = 68.36 , SD = 12.36
cat("Mean difference:", round(mean_diff, 3), "\n")
## Mean difference: 0.424
cat("95% CI for difference: [", round(t_sleep$conf.int[1], 3), ",",
round(t_sleep$conf.int[2], 3), "]\n")
## 95% CI for difference: [ -5.134 , 4.285 ]
cat("t-statistic:", round(t_sleep$statistic, 3), "\n")
## t-statistic: -0.179
cat("df:", round(t_sleep$parameter, 1), "\n")
## df: 98
cat("p-value:", round(t_sleep$p.value, 4), "\n")
## p-value: 0.8585
# Urban vaccination rate
prop_test_urban <- prop.test(vaccinated_urban, n_urban, conf.level = 0.95)
p_urban <- vaccinated_urban / n_urban
se_urban <- sqrt(p_urban * (1 - p_urban) / n_urban)
# Rural vaccination rate
prop_test_rural <- prop.test(vaccinated_rural, n_rural, conf.level = 0.95)
p_rural <- vaccinated_rural / n_rural
se_rural <- sqrt(p_rural * (1 - p_rural) / n_rural)
# Difference in proportions
prop_diff_test <- prop.test(c(vaccinated_urban, vaccinated_rural),
c(n_urban, n_rural), conf.level = 0.95)
cat("Confidence Intervals for Vaccination Rates:\n\n")
## Confidence Intervals for Vaccination Rates:
cat("Urban community:\n")
## Urban community:
cat("Vaccinated:", vaccinated_urban, "out of", n_urban, "\n")
## Vaccinated: 153 out of 200
cat("Proportion:", round(p_urban, 4), "\n")
## Proportion: 0.765
cat("SE:", round(se_urban, 4), "\n")
## SE: 0.03
cat("95% CI: [", round(prop_test_urban$conf.int[1], 4), ",",
round(prop_test_urban$conf.int[2], 4), "]\n\n")
## 95% CI: [ 0.6989 , 0.8207 ]
cat("Rural community:\n")
## Rural community:
cat("Vaccinated:", vaccinated_rural, "out of", n_rural, "\n")
## Vaccinated: 122 out of 180
cat("Proportion:", round(p_rural, 4), "\n")
## Proportion: 0.6778
cat("SE:", round(se_rural, 4), "\n")
## SE: 0.0348
cat("95% CI: [", round(prop_test_rural$conf.int[1], 4), ",",
round(prop_test_rural$conf.int[2], 4), "]\n\n")
## 95% CI: [ 0.6035 , 0.7443 ]
cat("Difference in proportions:\n")
## Difference in proportions:
cat("Urban - Rural =", round(p_urban - p_rural, 4), "\n")
## Urban - Rural = 0.0872
cat("95% CI for difference: [", round(prop_diff_test$conf.int[1], 4), ",",
round(prop_diff_test$conf.int[2], 4), "]\n")
## 95% CI for difference: [ -0.0081 , 0.1826 ]
cat("χ² statistic:", round(prop_diff_test$statistic, 3), "\n")
## χ² statistic: 3.181
cat("p-value:", format(prop_diff_test$p.value, scientific = TRUE, digits = 3), "\n")
## p-value: 7.45e-02
# Fit linear regression model
stress_model <- lm(productivity ~ stress, data = stress_data)
model_summary <- summary(stress_model)
# Extract coefficients
coeffs <- coef(stress_model)
beta_0 <- coeffs[1]
beta_1 <- coeffs[2]
# Get confidence intervals
ci_coeffs <- confint(stress_model, level = 0.95)
# Extract standard errors
se_beta_0 <- model_summary$coefficients[1, "Std. Error"]
se_beta_1 <- model_summary$coefficients[2, "Std. Error"]
# Extract t-values and p-values
t_beta_0 <- model_summary$coefficients[1, "t value"]
t_beta_1 <- model_summary$coefficients[2, "t value"]
p_beta_0 <- model_summary$coefficients[1, "Pr(>|t|)"]
p_beta_1 <- model_summary$coefficients[2, "Pr(>|t|)"]
# R-squared
r_squared <- model_summary$r.squared
adj_r_squared <- model_summary$adj.r.squared
cat("Regression Model: Productivity ~ Stress\n\n")
## Regression Model: Productivity ~ Stress
cat("Intercept (β₀):\n")
## Intercept (β₀):
cat("Estimate:", round(beta_0, 3), "\n")
## Estimate: 86.447
cat("SE:", round(se_beta_0, 3), "\n")
## SE: 2.682
cat("95% CI: [", round(ci_coeffs[1, 1], 3), ",", round(ci_coeffs[1, 2], 3), "]\n")
## 95% CI: [ 81.137 , 91.757 ]
cat("t-value:", round(t_beta_0, 3), "\n")
## t-value: 32.238
cat("p-value:", format(p_beta_0, scientific = TRUE, digits = 3), "\n\n")
## p-value: 2.43e-60
cat("Stress coefficient (β₁):\n")
## Stress coefficient (β₁):
cat("Estimate:", round(beta_1, 4), "\n")
## Estimate: -0.6286
cat("SE:", round(se_beta_1, 4), "\n")
## SE: 0.0511
cat("95% CI: [", round(ci_coeffs[2, 1], 4), ",", round(ci_coeffs[2, 2], 4), "]\n")
## 95% CI: [ -0.7297 , -0.5274 ]
cat("t-value:", round(t_beta_1, 3), "\n")
## t-value: -12.306
cat("p-value:", format(p_beta_1, scientific = TRUE, digits = 3), "\n\n")
## p-value: 6.78e-23
cat("Model fit:\n")
## Model fit:
cat("R² =", round(r_squared, 4), "\n")
## R² = 0.5621
cat("Adjusted R² =", round(adj_r_squared, 4), "\n")
## Adjusted R² = 0.5583
# Pearson correlation test
cor_test <- cor.test(support_data$support, support_data$wellbeing,
conf.level = 0.95)
r_value <- cor_test$estimate
r_lower <- cor_test$conf.int[1]
r_upper <- cor_test$conf.int[2]
cat("Correlation: Social Support and Well-being\n\n")
## Correlation: Social Support and Well-being
cat("Pearson's r:", round(r_value, 4), "\n")
## Pearson's r: 0.702
cat("95% CI: [", round(r_lower, 4), ",", round(r_upper, 4), "]\n")
## 95% CI: [ 0.5864 , 0.7895 ]
cat("t-statistic:", round(cor_test$statistic, 3), "\n")
## t-statistic: 9.757
cat("df:", cor_test$parameter, "\n")
## df: 98
cat("p-value:", format(cor_test$p.value, scientific = TRUE, digits = 3), "\n")
## p-value: 4.08e-16
Let’s interpret each confidence interval in practical context.
We are 95% confident that the true mean increase in daily vegetable servings for the population is between 1.64 and 2.04 servings. Our point estimate of 1.84 servings represents our best guess, but the interval acknowledges sampling uncertainty.
Practical interpretation: Even at the lower bound (1.64 servings), the intervention appears beneficial. The interval does not include zero, consistent with the significant t-test (p < .001).
The difference in sleep quality between the meditation and exercise groups is estimated at 0.42 points (Meditation - Exercise). Because R's t.test reports the difference as Exercise - Meditation, its 95% CI is [-5.13, 4.29]; flipping the signs gives [-4.29, 5.13] for Meditation - Exercise.
Practical interpretation: Since the interval includes zero, we cannot confidently conclude that either intervention is superior. The true difference could plausibly range from a 4.29-point advantage for exercise to a 5.13-point advantage for meditation.
The urban vaccination rate is estimated at 76.5% (95% CI: [69.9%, 82.1%]), while the rural rate is 67.8% (95% CI: [60.4%, 74.4%]).
Practical interpretation: The two intervals overlap, and the 95% CI for the difference in proportions (8.7 percentage points, [-0.8%, 18.3%]) includes zero. The data are suggestive of higher uptake in the urban community, but they do not establish a difference.
For each 1-point increase in stress level, productivity decreases by 0.629 points (95% CI: [-0.73, -0.527]).
Practical interpretation: The entire confidence interval is negative, indicating that stress reliably predicts lower productivity. The effect could be as small as a 0.527-point decrease or as large as a 0.73-point decrease per unit of stress.
The correlation between social support and well-being is 0.702 (95% CI: [0.586, 0.79]).
Practical interpretation: We are 95% confident the true population correlation is between 0.586 and 0.79, indicating a strong positive relationship between social support and well-being.
Let’s create visualizations that display confidence intervals clearly.
# Create summary data for visualization
ci_data <- data.frame(
Parameter = c("Vegetable Servings\n(Mean)",
"Sleep Quality Diff\n(Med - Exe)",
"Urban Vaccination\n(Proportion)",
"Rural Vaccination\n(Proportion)",
"Stress Effect\n(β₁)"),
Estimate = c(mean_servings, mean_diff, p_urban, p_rural, beta_1),
Lower = c(t_result$conf.int[1], t_sleep$conf.int[1],
prop_test_urban$conf.int[1], prop_test_rural$conf.int[1],
ci_coeffs[2, 1]),
Upper = c(t_result$conf.int[2], t_sleep$conf.int[2],
prop_test_urban$conf.int[2], prop_test_rural$conf.int[2],
ci_coeffs[2, 2]),
Type = c("Mean", "Difference", "Proportion", "Proportion", "Slope")
)
# Plot 1: Forest plot of CIs
p_forest <- ggplot(ci_data, aes(x = Parameter, y = Estimate, color = Type)) +
geom_point(size = 4) +
geom_errorbar(aes(ymin = Lower, ymax = Upper), width = 0.3, linewidth = 1) +
geom_hline(yintercept = 0, linetype = "dashed", color = "gray50") +
labs(title = "Confidence Intervals Across Different Analyses",
subtitle = "Points show estimates; error bars show 95% CIs",
x = "Parameter", y = "Estimate Value") +
scale_color_manual(values = c("#E74C3C", "#3498DB", "#27AE60", "#F39C12")) +
coord_flip() +
theme_minimal() +
theme(
panel.grid.major = element_line(color = "gray90", linetype = "dashed"),
panel.grid.minor = element_blank(),
axis.line.x = element_line(color = "black"),
axis.line.y = element_line(color = "black"),
panel.border = element_blank(),
axis.line.x.top = element_blank(),
axis.line.y.right = element_blank(),
plot.title = element_text(hjust = 0.5, face = "bold"),
plot.subtitle = element_text(hjust = 0.5),
legend.position = "right"
)
# Plot 2: Regression with CI
p_reg_ci <- ggplot(stress_data, aes(x = stress, y = productivity)) +
geom_point(alpha = 0.5, color = "#3498DB") +
geom_smooth(method = "lm", se = TRUE, color = "#E74C3C",
fill = "#E74C3C", alpha = 0.2) +
labs(title = "Stress-Productivity Relationship with 95% CI",
subtitle = paste0("β₁ = ", round(beta_1, 3),
", 95% CI [", round(ci_coeffs[2, 1], 3),
", ", round(ci_coeffs[2, 2], 3), "]"),
x = "Stress Level", y = "Productivity Score") +
theme_minimal() +
theme(
panel.grid.major = element_line(color = "gray90", linetype = "dashed"),
panel.grid.minor = element_blank(),
axis.line.x = element_line(color = "black"),
axis.line.y = element_line(color = "black"),
panel.border = element_blank(),
axis.line.x.top = element_blank(),
axis.line.y.right = element_blank(),
plot.title = element_text(hjust = 0.5, face = "bold"),
plot.subtitle = element_text(hjust = 0.5, size = 9)
)
# Plot 3: Means with error bars (sleep study)
sleep_summary <- data.frame(
Intervention = c("Meditation", "Exercise"),
Mean = c(meditation_mean, exercise_mean),
SE = c(meditation_sd / sqrt(n_meditation), exercise_sd / sqrt(n_exercise))
)
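# Note: 1.96 is the large-sample z approximation; with n = 50 per group,
# qt(0.975, 49) ≈ 2.01 would give a slightly wider, more exact 95% CI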
sleep_summary$CI_Lower <- sleep_summary$Mean - 1.96 * sleep_summary$SE
sleep_summary$CI_Upper <- sleep_summary$Mean + 1.96 * sleep_summary$SE
p_means_ci <- ggplot(sleep_summary, aes(x = Intervention, y = Mean, fill = Intervention)) +
geom_bar(stat = "identity", alpha = 0.7, width = 0.6) +
geom_errorbar(aes(ymin = CI_Lower, ymax = CI_Upper),
width = 0.2, linewidth = 1) +
labs(title = "Sleep Quality by Intervention with 95% CI",
x = "Intervention", y = "Mean Sleep Quality Score") +
scale_fill_manual(values = c("#27AE60", "#E74C3C")) +
theme_minimal() +
theme(
panel.grid.major = element_line(color = "gray90", linetype = "dashed"),
panel.grid.minor = element_blank(),
axis.line.x = element_line(color = "black"),
axis.line.y = element_line(color = "black"),
panel.border = element_blank(),
axis.line.x.top = element_blank(),
axis.line.y.right = element_blank(),
plot.title = element_text(hjust = 0.5, face = "bold"),
legend.position = "none"
)
# Plot 4: Proportions with CIs
prop_summary <- data.frame(
Community = c("Urban", "Rural"),
Proportion = c(p_urban, p_rural),
Lower = c(prop_test_urban$conf.int[1], prop_test_rural$conf.int[1]),
Upper = c(prop_test_urban$conf.int[2], prop_test_rural$conf.int[2])
)
p_prop_ci <- ggplot(prop_summary, aes(x = Community, y = Proportion, fill = Community)) +
geom_bar(stat = "identity", alpha = 0.7, width = 0.6) +
geom_errorbar(aes(ymin = Lower, ymax = Upper),
width = 0.2, linewidth = 1) +
labs(title = "Vaccination Rates by Community with 95% CI",
x = "Community Type", y = "Vaccination Rate") +
scale_y_continuous(labels = scales::percent_format(accuracy = 1)) +
scale_fill_manual(values = c("#9B59B6", "#F39C12")) +
theme_minimal() +
theme(
panel.grid.major = element_line(color = "gray90", linetype = "dashed"),
panel.grid.minor = element_blank(),
axis.line.x = element_line(color = "black"),
axis.line.y = element_line(color = "black"),
panel.border = element_blank(),
axis.line.x.top = element_blank(),
axis.line.y.right = element_blank(),
plot.title = element_text(hjust = 0.5, face = "bold"),
legend.position = "none"
)
# Arrange plots
grid.arrange(p_forest, p_reg_ci, p_means_ci, p_prop_ci, ncol = 2)
Let’s demonstrate how sample size, variability, and confidence level affect CI width.
# Simulate data with different sample sizes
set.seed(2025)
sample_sizes <- c(20, 50, 100, 200, 500)
true_mean <- 50
true_sd <- 15
ci_widths <- data.frame()
for (n in sample_sizes) {
# Generate sample
sample_data <- rnorm(n, mean = true_mean, sd = true_sd)
# Calculate CI
t_res <- t.test(sample_data, conf.level = 0.95)
ci_widths <- rbind(ci_widths, data.frame(
n = n,
Mean = mean(sample_data),
Lower = t_res$conf.int[1],
Upper = t_res$conf.int[2],
Width = t_res$conf.int[2] - t_res$conf.int[1]
))
}
# Plot: CI width vs sample size
p_width <- ggplot(ci_widths, aes(x = n, y = Width)) +
geom_line(color = "#3498DB", linewidth = 1.2) +
geom_point(color = "#3498DB", size = 3) +
labs(title = "Effect of Sample Size on Confidence Interval Width",
subtitle = "95% CI for population mean",
x = "Sample Size (n)", y = "CI Width") +
scale_x_continuous(breaks = sample_sizes) +
theme_minimal() +
theme(
panel.grid.major = element_line(color = "gray90", linetype = "dashed"),
panel.grid.minor = element_blank(),
axis.line.x = element_line(color = "black"),
axis.line.y = element_line(color = "black"),
panel.border = element_blank(),
axis.line.x.top = element_blank(),
axis.line.y.right = element_blank(),
plot.title = element_text(hjust = 0.5, face = "bold"),
plot.subtitle = element_text(hjust = 0.5)
)
# Effect of confidence level
conf_levels <- c(0.80, 0.90, 0.95, 0.99)
sample_fixed <- rnorm(100, mean = true_mean, sd = true_sd)
ci_conf_levels <- data.frame()
for (conf in conf_levels) {
t_res <- t.test(sample_fixed, conf.level = conf)
ci_conf_levels <- rbind(ci_conf_levels, data.frame(
Confidence = paste0(conf * 100, "%"),
Conf_Numeric = conf,
Lower = t_res$conf.int[1],
Upper = t_res$conf.int[2],
Width = t_res$conf.int[2] - t_res$conf.int[1]
))
}
# Plot: CI width vs confidence level
p_conf <- ggplot(ci_conf_levels, aes(x = Conf_Numeric, y = Width)) +
geom_line(color = "#E74C3C", linewidth = 1.2) +
geom_point(color = "#E74C3C", size = 3) +
labs(title = "Effect of Confidence Level on CI Width",
subtitle = "Fixed sample size (n = 100)",
x = "Confidence Level", y = "CI Width") +
scale_x_continuous(labels = scales::percent_format(accuracy = 1)) +
theme_minimal() +
theme(
panel.grid.major = element_line(color = "gray90", linetype = "dashed"),
panel.grid.minor = element_blank(),
axis.line.x = element_line(color = "black"),
axis.line.y = element_line(color = "black"),
panel.border = element_blank(),
axis.line.x.top = element_blank(),
axis.line.y.right = element_blank(),
plot.title = element_text(hjust = 0.5, face = "bold"),
plot.subtitle = element_text(hjust = 0.5)
)
grid.arrange(p_width, p_conf, ncol = 2)
# Display numerical results
cat("Effect of Sample Size (95% CI):\n")
## Effect of Sample Size (95% CI):
print(ci_widths)
## n Mean Lower Upper Width
## 1 20 53.09899 46.72480 59.47317 12.748370
## 2 50 51.03550 46.57527 55.49572 8.920446
## 3 100 47.01111 44.14542 49.87680 5.731388
## 4 200 50.66637 48.58104 52.75170 4.170659
## 5 500 50.28285 48.95918 51.60652 2.647333
cat("\n\nEffect of Confidence Level (n = 100):\n")
##
##
## Effect of Confidence Level (n = 100):
print(ci_conf_levels[, c("Confidence", "Lower", "Upper", "Width")])
## Confidence Lower Upper Width
## 1 80% 47.34222 51.12783 3.785610
## 2 90% 46.79906 51.67100 4.871943
## 3 95% 46.32397 52.14609 5.822117
## 4 99% 45.38181 53.08825 7.706435
Key findings:
Sample size: As n increases from 20 to 500, CI width decreases from 12.75 to 2.65 (a 79.2% reduction).
Confidence level: Increasing confidence from 80% to 99% widens the CI from 3.79 to 7.71 (a 103.6% increase).
Here’s how to report each analysis with confidence intervals in APA format:
Single Mean (Nutrition Intervention):
The nutrition intervention led to an average increase of 1.84 servings of vegetables per day (SD = 0.92), which was significantly different from zero, t(79) = 17.99, p < .001, 95% CI [1.64, 2.04].
Two-Group Comparison (Sleep Quality):
An independent samples t-test found no significant difference in sleep quality between the meditation (M = 68.79, SD = 11.36) and exercise (M = 68.36, SD = 12.36) groups, t(98) = -0.18, p = .859, 95% CI for the difference [-5.13, 4.29]. The mean difference of 0.42 points is negligible, and the interval includes zero.
Proportion (Vaccination Rates):
Vaccination rates did not differ significantly between urban (76.5%, 95% CI [69.9%, 82.1%]) and rural (67.8%, 95% CI [60.4%, 74.4%]) communities, χ²(1) = 3.18, p = .074. The difference in rates was 8.7 percentage points, 95% CI [-0.8%, 18.3%], an interval that includes zero.
Regression Coefficient (Stress-Productivity):
Stress level significantly predicted productivity, b = -0.629, SE = 0.051, t(118) = -12.31, p < .001, 95% CI [-0.73, -0.527]. For each one-point increase in stress, productivity decreased by 0.629 points. The model explained 56.2% of the variance in productivity (R² = 0.562).
Correlation:
Social support and well-being were significantly positively correlated, r(98) = 0.702, p < .001, 95% CI [0.586, 0.79]. This indicates a strong positive relationship.
A 95% level is conventional but not mandatory; 90% or 99% intervals may be more appropriate depending on how costly it is to miss the true parameter.
Trade-off: Higher confidence → wider intervals → less precision.
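The extra width comes from the larger critical value required at higher confidence levels; a quick check:
conf <- c(0.80, 0.90, 0.95, 0.99)
data.frame(confidence = conf, z_critical = qnorm(1 - (1 - conf) / 2))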
To achieve a target CI width:
\[ n = \left(\frac{2 \cdot z_{\alpha/2} \cdot \sigma}{W}\right)^2 \]
where W is the desired total width.
Example: To estimate a mean within ±2 units with 95% confidence when σ = 10:
# Parameters
z_value <- qnorm(0.975) # 95% confidence
sigma <- 10
desired_width <- 4 # Total width (±2 on each side)
# Calculate required n
n_required <- ceiling((2 * z_value * sigma / desired_width)^2)
cat("Sample size needed for CI width of", desired_width, "units:\n")
## Sample size needed for CI width of 4 units:
cat("n =", n_required, "\n")
## n = 97
You would need n = 97 participants.
Confidence intervals are superior to p-values alone because they convey both the magnitude of the effect and the precision of the estimate, not just whether the effect differs from zero.
Example: A non-significant result with CI [-0.5, 8.2] suggests substantial uncertainty (the effect could be negligible or large), whereas [-0.1, 0.3] suggests a genuinely small effect.
WRONG: “There’s a 95% probability the true parameter is in this interval.”
RIGHT: “95% of intervals constructed this way will contain the true parameter.”
WRONG: “Values outside the CI are impossible.”
RIGHT: “Values outside the CI are less plausible given the data.”
WRONG: “Overlapping CIs mean no significant difference.”
RIGHT: “Use formal tests for comparisons; CI overlap is not equivalent to hypothesis testing.”
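A small simulation illustrates the last point. The group means and SDs below are chosen purely for illustration: the two 95% CIs overlap, yet the two-sample test is clearly significant.
set.seed(42)
n_per_group <- 100
# Rescale each sample to have exactly the chosen mean and SD (illustrative values)
a <- scale(rnorm(n_per_group))[, 1] * 10 + 50.0
b <- scale(rnorm(n_per_group))[, 1] * 10 + 53.5
t.test(a)$conf.int    # approximately [48.0, 52.0]
t.test(b)$conf.int    # approximately [51.5, 55.5]; overlaps the first interval
t.test(a, b)$p.value  # yet the two-sample test is significant (p < .05)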
For non-normal distributions or complex statistics:
# Bootstrap CI for median
set.seed(2025)
n_boot <- 10000
boot_medians <- numeric(n_boot)
for (i in 1:n_boot) {
boot_sample <- sample(nutrition_data$servings, replace = TRUE)
boot_medians[i] <- median(boot_sample)
}
# Calculate percentile CI
boot_ci_lower <- quantile(boot_medians, 0.025)
boot_ci_upper <- quantile(boot_medians, 0.975)
observed_median <- median(nutrition_data$servings)
cat("Bootstrap 95% CI for Median Vegetable Servings:\n")
## Bootstrap 95% CI for Median Vegetable Servings:
cat("Observed median:", round(observed_median, 3), "\n")
## Observed median: 1.756
cat("Bootstrap CI: [", round(boot_ci_lower, 3), ",", round(boot_ci_upper, 3), "]\n")
## Bootstrap CI: [ 1.49 , 2.157 ]
cat("Based on", n_boot, "bootstrap samples\n")
## Based on 10000 bootstrap samples
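For a more refined bootstrap interval, the boot package’s percentile and BCa methods are common alternatives to the manual loop above; a minimal sketch, applied to the same nutrition_data:
library(boot)
median_fun <- function(data, indices) median(data[indices])
boot_out <- boot(nutrition_data$servings, statistic = median_fun, R = 10000)
boot.ci(boot_out, conf = 0.95, type = c("perc", "bca"))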
When constructing multiple CIs, adjust confidence level:
# For 5 comparisons, maintain family-wise error rate of 0.05
n_comparisons <- 5
alpha_adjusted <- 0.05 / n_comparisons
conf_level_adjusted <- 1 - alpha_adjusted
cat("Multiple Comparisons Adjustment:\n")
## Multiple Comparisons Adjustment:
cat("Number of comparisons:", n_comparisons, "\n")
## Number of comparisons: 5
cat("Unadjusted α:", 0.05, "\n")
## Unadjusted α: 0.05
cat("Bonferroni-adjusted α:", round(alpha_adjusted, 4), "\n")
## Bonferroni-adjusted α: 0.01
cat("Adjusted confidence level:", round(conf_level_adjusted * 100, 2), "%\n")
## Adjusted confidence level: 99 %
# Example: CI for vegetable servings with Bonferroni correction
t_bonf <- t.test(nutrition_data$servings, conf.level = conf_level_adjusted)
cat("\nOriginal 95% CI: [", round(t_result$conf.int[1], 3), ",",
round(t_result$conf.int[2], 3), "]\n")
##
## Original 95% CI: [ 1.637 , 2.044 ]
cat("Bonferroni", round(conf_level_adjusted * 100, 2), "% CI: [",
round(t_bonf$conf.int[1], 3), ",", round(t_bonf$conf.int[2], 3), "]\n")
## Bonferroni 99 % CI: [ 1.571 , 2.111 ]
cat("Width increase:", round((t_bonf$conf.int[2] - t_bonf$conf.int[1]) -
(t_result$conf.int[2] - t_result$conf.int[1]), 3), "\n")
## Width increase: 0.133
Confidence intervals are indispensable tools for quantifying uncertainty in statistical estimates. They transform point estimates into ranges that communicate both our best guess and the precision of that estimate. Unlike p-values, which only indicate whether an effect exists, confidence intervals reveal the magnitude and plausibility of different effect sizes.
Key takeaways:
- A confidence interval is a statement about the procedure: across repeated samples, approximately 95% of 95% CIs will contain the true parameter.
- Interval width reflects precision and depends on sample size, variability, and the chosen confidence level.
- Report CIs alongside point estimates and p-values; they show both the plausible range of the effect and the precision of the estimate.
In our scenarios, confidence intervals revealed:
- a clearly beneficial nutrition intervention (the CI for the mean increase excluded zero),
- no detectable difference between the meditation and exercise interventions (the CI for the difference included zero),
- a suggestive but inconclusive urban-rural gap in vaccination uptake (the CI for the difference included zero),
- a reliably negative stress-productivity relationship (the entire CI for the slope was below zero), and
- a strong positive association between social support and well-being.
By embracing confidence intervals, we move beyond simplistic “significant/not significant” dichotomies to nuanced, informative quantification of uncertainty—the hallmark of rigorous, transparent scientific communication.