A public health researcher conducted a community-based nutrition intervention in a rural district, teaching families about balanced diets and meal planning. After six months, she measured the average change in daily vegetable consumption among 80 participating families. The sample showed an average increase of 1.8 servings per day—an encouraging finding that she planned to present to the regional health committee to secure funding for expansion.
However, when the committee chair asked, “How confident are you that this 1.8 servings represents the true population effect?”, she realized her point estimate told only part of the story. While 1.8 servings was her best estimate from the sample, the true population mean could plausibly lie anywhere in a range such as 1.3 to 2.3 servings, or perhaps an even wider one. This range—the confidence interval—communicates the precision of her estimate and the degree of uncertainty inherent in sampling.
Understanding confidence intervals is fundamental to statistical inference. They provide a range of plausible values for population parameters, acknowledging that our sample-based estimates are subject to sampling variability. Rather than offering a single number that falsely implies certainty, confidence intervals honestly communicate what we can reasonably conclude from our data.
A confidence interval is a range of values that likely contains an unknown population parameter with a specified level of confidence. Most commonly, we construct 95% confidence intervals, meaning that if we repeated our study many times and calculated a confidence interval each time, approximately 95% of those intervals would contain the true population parameter.
Confidence Level (1 - α): The probability that the interval contains the true parameter when the procedure is repeated many times. Common choices are 90%, 95%, and 99%.
Margin of Error: The distance from the point estimate to the endpoints of the interval, determined by the standard error and critical value.
Width of the Interval: Reflects precision—narrower intervals indicate more precise estimates. Width depends on:
- Sample size (larger n → narrower intervals)
- Variability in the data (larger SD → wider intervals)
- Confidence level (higher confidence → wider intervals)
INCORRECT interpretation: “There is a 95% probability that the true parameter lies within this specific interval.”
CORRECT interpretation: “If we repeated this study many times and calculated 95% CIs each time, approximately 95% of those intervals would contain the true parameter.”
The true parameter is fixed (though unknown); the interval is what varies across repeated samples.
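A quick simulation makes this repeated-sampling interpretation concrete. The population mean, SD, and sample size below are illustrative values, not taken from the scenarios in this chapter:
set.seed(123)
true_mean <- 50
true_sd <- 10
n_sample <- 40
n_reps <- 10000
# For each replicate, draw a sample, build a 95% CI, and check whether it covers the true mean
covered <- replicate(n_reps, {
ci <- t.test(rnorm(n_sample, true_mean, true_sd), conf.level = 0.95)$conf.int
ci[1] <= true_mean && true_mean <= ci[2]
})
mean(covered)  # long-run coverage of the procedure; should be close to 0.95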
For a population mean μ with known variance, the confidence interval is:
\[ \bar{X} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \]
When the population variance is unknown (the usual case), we use the t-distribution:
\[ \bar{X} \pm t_{\alpha/2, df} \cdot \frac{s}{\sqrt{n}} \]
where:
- \(\bar{X}\) is the sample mean
- \(t_{\alpha/2, df}\) is the critical value from the t-distribution with df = n - 1
- \(s\) is the sample standard deviation
- \(n\) is the sample size
For a population proportion p:
\[ \hat{p} \pm z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
where \(\hat{p}\) is the sample proportion.
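As a quick illustration of this formula (the counts below are made up), the Wald interval can be computed directly. Note that R’s prop.test() returns a score-based interval with continuity correction, so it will differ slightly from the Wald calculation:
successes <- 130; trials <- 200  # hypothetical counts
p_hat <- successes / trials
se_hat <- sqrt(p_hat * (1 - p_hat) / trials)
p_hat + c(-1, 1) * qnorm(0.975) * se_hat  # Wald 95% CI
prop.test(successes, trials)$conf.int     # score interval (with continuity correction), for comparison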
For independent samples:
\[ (\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2, df} \cdot \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \]
For a regression coefficient \(\beta_j\):
\[ \hat{\beta}_j \pm t_{\alpha/2, df} \cdot SE(\hat{\beta}_j) \]
where \(SE(\hat{\beta}_j)\) is the standard error of the coefficient.
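This formula can be verified against R’s confint() on a small simulated dataset (toy values, unrelated to the scenarios below):
set.seed(1)
x <- rnorm(30)
y <- 2 + 1.5 * x + rnorm(30)  # toy regression data
fit <- lm(y ~ x)
est <- coef(summary(fit))["x", "Estimate"]
se  <- coef(summary(fit))["x", "Std. Error"]
tc  <- qt(0.975, df = fit$df.residual)  # df = n - 2 for simple regression
c(est - tc * se, est + tc * se)  # manual 95% CI for the slope
confint(fit, "x", level = 0.95)  # matches the manual calculation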
We’ll create multiple datasets to demonstrate confidence interval construction across various statistical contexts.
set.seed(2025)
# Scenario 1: Single mean (Nutrition intervention - vegetable servings)
# Daily vegetable servings after intervention
vegetable_servings <- rnorm(80, mean = 1.8, sd = 0.9)
nutrition_data <- data.frame(
servings = vegetable_servings
)
# Scenario 2: Two-group comparison (Sleep study)
# Sleep quality scores (0-100 scale) for two interventions
meditation_group <- rnorm(50, mean = 72, sd = 12)
exercise_group <- rnorm(50, mean = 68, sd = 12)
sleep_data <- data.frame(
quality = c(meditation_group, exercise_group),
intervention = factor(rep(c("Meditation", "Exercise"), each = 50))
)
# Scenario 3: Proportion (Screening program)
# COVID-19 vaccination uptake in two communities
n_urban <- 200
n_rural <- 180
vaccinated_urban <- rbinom(1, n_urban, 0.78)
vaccinated_rural <- rbinom(1, n_rural, 0.65)
# Scenario 4: Regression (Stress and productivity)
n <- 120
stress_level <- rnorm(n, mean = 50, sd = 15)
productivity <- 85 - 0.6 * stress_level + rnorm(n, mean = 0, sd = 8)
stress_data <- data.frame(
stress = stress_level,
productivity = productivity
)
# Scenario 5: Correlation (Social support and well-being)
social_support <- rnorm(100, mean = 60, sd = 15)
wellbeing <- 30 + 0.7 * social_support + rnorm(100, mean = 0, sd = 10)
support_data <- data.frame(
support = social_support,
wellbeing = wellbeing
)
Before calculating CIs, let’s visualize our data to understand the distributions.
library(ggplot2)
# Plot 1: Distribution of vegetable servings
p1 <- ggplot(nutrition_data, aes(x = servings)) +
geom_histogram(bins = 20, fill = "#3498DB", alpha = 0.7, color = "black") +
geom_vline(aes(xintercept = mean(servings)), color = "#E74C3C",
linetype = "dashed", linewidth = 1) +
labs(title = "Distribution of Daily Vegetable Servings",
x = "Servings per Day", y = "Frequency") +
theme_minimal() +
theme(
panel.grid.major = element_line(color = "gray90", linetype = "dashed"),
panel.grid.minor = element_blank(),
axis.line.x = element_line(color = "black"),
axis.line.y = element_line(color = "black"),
panel.border = element_blank(),
axis.line.x.top = element_blank(),
axis.line.y.right = element_blank(),
plot.title = element_text(hjust = 0.5, face = "bold")
)
# Plot 2: Sleep quality by intervention
p2 <- ggplot(sleep_data, aes(x = intervention, y = quality, fill = intervention)) +
geom_boxplot(alpha = 0.7) +
geom_jitter(width = 0.2, alpha = 0.3) +
stat_summary(fun = mean, geom = "point", shape = 23, size = 3,
fill = "red", color = "black") +
labs(title = "Sleep Quality by Intervention Type",
x = "Intervention", y = "Sleep Quality Score") +
scale_fill_manual(values = c("#E74C3C", "#27AE60")) +
theme_minimal() +
theme(
panel.grid.major = element_line(color = "gray90", linetype = "dashed"),
panel.grid.minor = element_blank(),
axis.line.x = element_line(color = "black"),
axis.line.y = element_line(color = "black"),
panel.border = element_blank(),
axis.line.x.top = element_blank(),
axis.line.y.right = element_blank(),
plot.title = element_text(hjust = 0.5, face = "bold"),
legend.position = "none"
)
# Plot 3: Stress and productivity
p3 <- ggplot(stress_data, aes(x = stress, y = productivity)) +
geom_point(alpha = 0.6, color = "#9B59B6") +
geom_smooth(method = "lm", se = TRUE, color = "#E74C3C", fill = "#E74C3C", alpha = 0.2) +
labs(title = "Stress Level vs. Productivity",
x = "Stress Level", y = "Productivity Score") +
theme_minimal() +
theme(
panel.grid.major = element_line(color = "gray90", linetype = "dashed"),
panel.grid.minor = element_blank(),
axis.line.x = element_line(color = "black"),
axis.line.y = element_line(color = "black"),
panel.border = element_blank(),
axis.line.x.top = element_blank(),
axis.line.y.right = element_blank(),
plot.title = element_text(hjust = 0.5, face = "bold")
)
# Plot 4: Social support and wellbeing
p4 <- ggplot(support_data, aes(x = support, y = wellbeing)) +
geom_point(alpha = 0.6, color = "#F39C12") +
geom_smooth(method = "lm", se = TRUE, color = "#E74C3C", fill = "#E74C3C", alpha = 0.2) +
labs(title = "Social Support vs. Well-being",
x = "Social Support Score", y = "Well-being Score") +
theme_minimal() +
theme(
panel.grid.major = element_line(color = "gray90", linetype = "dashed"),
panel.grid.minor = element_blank(),
axis.line.x = element_line(color = "black"),
axis.line.y = element_line(color = "black"),
panel.border = element_blank(),
axis.line.x.top = element_blank(),
axis.line.y.right = element_blank(),
plot.title = element_text(hjust = 0.5, face = "bold")
)
# Arrange plots
library(gridExtra)
grid.arrange(p1, p2, p3, p4, ncol = 2)
Now we calculate confidence intervals for each scenario using actual R output.
# Calculate mean and CI using t.test
t_result <- t.test(nutrition_data$servings, conf.level = 0.95)
# Extract values
mean_servings <- mean(nutrition_data$servings)
sd_servings <- sd(nutrition_data$servings)
n_servings <- length(nutrition_data$servings)
se_servings <- sd_servings / sqrt(n_servings)
# Calculate CI manually to verify
df <- n_servings - 1
t_critical <- qt(0.975, df)
margin_error <- t_critical * se_servings
ci_lower_manual <- mean_servings - margin_error
ci_upper_manual <- mean_servings + margin_error
# Display results
cat("Confidence Interval for Mean Vegetable Servings:\n")
## Confidence Interval for Mean Vegetable Servings:
cat("Sample size:", n_servings, "\n")
## Sample size: 80
cat("Mean:", round(mean_servings, 3), "\n")
## Mean: 1.841
cat("SD:", round(sd_servings, 3), "\n")
## SD: 0.915
cat("SE:", round(se_servings, 4), "\n")
## SE: 0.1023
cat("t critical value (df =", df, "):", round(t_critical, 3), "\n")
## t critical value (df = 79 ): 1.99
cat("Margin of error:", round(margin_error, 4), "\n")
## Margin of error: 0.2037
cat("95% CI: [", round(t_result$conf.int[1], 3), ",",
round(t_result$conf.int[2], 3), "]\n")
## 95% CI: [ 1.637 , 2.044 ]
cat("t-statistic:", round(t_result$statistic, 3), "\n")
## t-statistic: 17.989
cat("p-value:", format(t_result$p.value, scientific = TRUE, digits = 3), "\n")
## p-value: 1.15e-29
# Independent samples t-test
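# Note: with the default alphabetical factor levels ("Exercise" < "Meditation"),
# t.test() reports the difference and its CI as Exercise - Meditation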
t_sleep <- t.test(quality ~ intervention, data = sleep_data,
var.equal = TRUE, conf.level = 0.95)
# Calculate statistics for each group
meditation_mean <- mean(sleep_data$quality[sleep_data$intervention == "Meditation"])
exercise_mean <- mean(sleep_data$quality[sleep_data$intervention == "Exercise"])
meditation_sd <- sd(sleep_data$quality[sleep_data$intervention == "Meditation"])
exercise_sd <- sd(sleep_data$quality[sleep_data$intervention == "Exercise"])
n_meditation <- sum(sleep_data$intervention == "Meditation")
n_exercise <- sum(sleep_data$intervention == "Exercise")
mean_diff <- meditation_mean - exercise_mean
# Display results
cat("Confidence Interval for Difference in Sleep Quality:\n")
## Confidence Interval for Difference in Sleep Quality:
cat("Meditation group: n =", n_meditation, ", M =", round(meditation_mean, 2),
", SD =", round(meditation_sd, 2), "\n")
## Meditation group: n = 50 , M = 68.79 , SD = 11.36
cat("Exercise group: n =", n_exercise, ", M =", round(exercise_mean, 2),
", SD =", round(exercise_sd, 2), "\n")
## Exercise group: n = 50 , M = 68.36 , SD = 12.36
cat("Mean difference:", round(mean_diff, 3), "\n")
## Mean difference: 0.424
cat("95% CI for difference: [", round(t_sleep$conf.int[1], 3), ",",
round(t_sleep$conf.int[2], 3), "]\n")
## 95% CI for difference: [ -5.134 , 4.285 ]
cat("t-statistic:", round(t_sleep$statistic, 3), "\n")
## t-statistic: -0.179
cat("df:", round(t_sleep$parameter, 1), "\n")
## df: 98
cat("p-value:", round(t_sleep$p.value, 4), "\n")
## p-value: 0.8585
# Urban vaccination rate
prop_test_urban <- prop.test(vaccinated_urban, n_urban, conf.level = 0.95)
p_urban <- vaccinated_urban / n_urban
se_urban <- sqrt(p_urban * (1 - p_urban) / n_urban)
# Rural vaccination rate
prop_test_rural <- prop.test(vaccinated_rural, n_rural, conf.level = 0.95)
p_rural <- vaccinated_rural / n_rural
se_rural <- sqrt(p_rural * (1 - p_rural) / n_rural)
# Difference in proportions
prop_diff_test <- prop.test(c(vaccinated_urban, vaccinated_rural),
c(n_urban, n_rural), conf.level = 0.95)
cat("Confidence Intervals for Vaccination Rates:\n\n")
## Confidence Intervals for Vaccination Rates:
cat("Urban community:\n")
## Urban community:
cat("Vaccinated:", vaccinated_urban, "out of", n_urban, "\n")
## Vaccinated: 153 out of 200
cat("Proportion:", round(p_urban, 4), "\n")
## Proportion: 0.765
cat("SE:", round(se_urban, 4), "\n")
## SE: 0.03
cat("95% CI: [", round(prop_test_urban$conf.int[1], 4), ",",
round(prop_test_urban$conf.int[2], 4), "]\n\n")
## 95% CI: [ 0.6989 , 0.8207 ]
cat("Rural community:\n")
## Rural community:
cat("Vaccinated:", vaccinated_rural, "out of", n_rural, "\n")
## Vaccinated: 122 out of 180
cat("Proportion:", round(p_rural, 4), "\n")
## Proportion: 0.6778
cat("SE:", round(se_rural, 4), "\n")
## SE: 0.0348
cat("95% CI: [", round(prop_test_rural$conf.int[1], 4), ",",
round(prop_test_rural$conf.int[2], 4), "]\n\n")
## 95% CI: [ 0.6035 , 0.7443 ]
cat("Difference in proportions:\n")
## Difference in proportions:
cat("Urban - Rural =", round(p_urban - p_rural, 4), "\n")
## Urban - Rural = 0.0872
cat("95% CI for difference: [", round(prop_diff_test$conf.int[1], 4), ",",
round(prop_diff_test$conf.int[2], 4), "]\n")
## 95% CI for difference: [ -0.0081 , 0.1826 ]
cat("χ² statistic:", round(prop_diff_test$statistic, 3), "\n")
## χ² statistic: 3.181
cat("p-value:", format(prop_diff_test$p.value, scientific = TRUE, digits = 3), "\n")
## p-value: 7.45e-02
# Fit linear regression model
stress_model <- lm(productivity ~ stress, data = stress_data)
model_summary <- summary(stress_model)
# Extract coefficients
coeffs <- coef(stress_model)
beta_0 <- coeffs[1]
beta_1 <- coeffs[2]
# Get confidence intervals
ci_coeffs <- confint(stress_model, level = 0.95)
# Extract standard errors
se_beta_0 <- model_summary$coefficients[1, "Std. Error"]
se_beta_1 <- model_summary$coefficients[2, "Std. Error"]
# Extract t-values and p-values
t_beta_0 <- model_summary$coefficients[1, "t value"]
t_beta_1 <- model_summary$coefficients[2, "t value"]
p_beta_0 <- model_summary$coefficients[1, "Pr(>|t|)"]
p_beta_1 <- model_summary$coefficients[2, "Pr(>|t|)"]
# R-squared
r_squared <- model_summary$r.squared
adj_r_squared <- model_summary$adj.r.squared
cat("Regression Model: Productivity ~ Stress\n\n")
## Regression Model: Productivity ~ Stress
cat("Intercept (β₀):\n")
## Intercept (β₀):
cat("Estimate:", round(beta_0, 3), "\n")
## Estimate: 86.447
cat("SE:", round(se_beta_0, 3), "\n")
## SE: 2.682
cat("95% CI: [", round(ci_coeffs[1, 1], 3), ",", round(ci_coeffs[1, 2], 3), "]\n")
## 95% CI: [ 81.137 , 91.757 ]
cat("t-value:", round(t_beta_0, 3), "\n")
## t-value: 32.238
cat("p-value:", format(p_beta_0, scientific = TRUE, digits = 3), "\n\n")
## p-value: 2.43e-60
cat("Stress coefficient (β₁):\n")
## Stress coefficient (β₁):
cat("Estimate:", round(beta_1, 4), "\n")
## Estimate: -0.6286
cat("SE:", round(se_beta_1, 4), "\n")
## SE: 0.0511
cat("95% CI: [", round(ci_coeffs[2, 1], 4), ",", round(ci_coeffs[2, 2], 4), "]\n")
## 95% CI: [ -0.7297 , -0.5274 ]
cat("t-value:", round(t_beta_1, 3), "\n")
## t-value: -12.306
cat("p-value:", format(p_beta_1, scientific = TRUE, digits = 3), "\n\n")
## p-value: 6.78e-23
cat("Model fit:\n")
## Model fit:
cat("R² =", round(r_squared, 4), "\n")
## R² = 0.5621
cat("Adjusted R² =", round(adj_r_squared, 4), "\n")
## Adjusted R² = 0.5583
# Pearson correlation test
cor_test <- cor.test(support_data$support, support_data$wellbeing,
conf.level = 0.95)
r_value <- cor_test$estimate
r_lower <- cor_test$conf.int[1]
r_upper <- cor_test$conf.int[2]
cat("Correlation: Social Support and Well-being\n\n")
## Correlation: Social Support and Well-being
cat("Pearson's r:", round(r_value, 4), "\n")
## Pearson's r: 0.702
cat("95% CI: [", round(r_lower, 4), ",", round(r_upper, 4), "]\n")
## 95% CI: [ 0.5864 , 0.7895 ]
cat("t-statistic:", round(cor_test$statistic, 3), "\n")
## t-statistic: 9.757
cat("df:", cor_test$parameter, "\n")
## df: 98
cat("p-value:", format(cor_test$p.value, scientific = TRUE, digits = 3), "\n")
## p-value: 4.08e-16
Let’s interpret each confidence interval in practical context.
We are 95% confident that the true mean increase in daily vegetable servings for the population is between 1.64 and 2.04 servings. Our point estimate of 1.84 servings represents our best guess, but the interval acknowledges sampling uncertainty.
Practical interpretation: Even at the lower bound (1.64 servings), the intervention appears beneficial. The interval does not include zero, consistent with the significant t-test (p < .001).
The difference in sleep quality between the meditation and exercise groups is estimated at 0.42 points (Meditation - Exercise). Because R's t.test reports the difference as Exercise - Meditation, its 95% CI is [-5.13, 4.29]; flipping the signs gives [-4.29, 5.13] for Meditation - Exercise.
Practical interpretation: Since the interval includes zero, we cannot confidently conclude that either intervention is superior. The true difference could plausibly range from a 4.29-point advantage for exercise to a 5.13-point advantage for meditation.
The urban vaccination rate is estimated at 76.5% (95% CI: [69.9%, 82.1%]), while the rural rate is 67.8% (95% CI: [60.4%, 74.4%]).
Practical interpretation: The two intervals overlap, and the 95% CI for the difference in proportions (8.7 percentage points, [-0.8%, 18.3%]) includes zero. The data are suggestive of higher uptake in the urban community, but they do not establish a difference.
For each 1-point increase in stress level, productivity decreases by 0.629 points (95% CI: [-0.73, -0.527]).
Practical interpretation: The entire confidence interval is negative, indicating that stress reliably predicts lower productivity. The effect could be as small as a 0.527-point decrease or as large as a 0.73-point decrease per unit of stress.
The correlation between social support and well-being is 0.702 (95% CI: [0.586, 0.79]).
Practical interpretation: We are 95% confident the true population correlation is between 0.586 and 0.79, indicating a strong positive relationship between social support and well-being.
Let’s create visualizations that display confidence intervals clearly.
# Create summary data for visualization
ci_data <- data.frame(
Parameter = c("Vegetable Servings\n(Mean)",
"Sleep Quality Diff\n(Med - Exe)",
"Urban Vaccination\n(Proportion)",
"Rural Vaccination\n(Proportion)",
"Stress Effect\n(β₁)"),
Estimate = c(mean_servings, mean_diff, p_urban, p_rural, beta_1),
Lower = c(t_result$conf.int[1], t_sleep$conf.int[1],
prop_test_urban$conf.int[1], prop_test_rural$conf.int[1],
ci_coeffs[2, 1]),
Upper = c(t_result$conf.int[2], t_sleep$conf.int[2],
prop_test_urban$conf.int[2], prop_test_rural$conf.int[2],
ci_coeffs[2, 2]),
Type = c("Mean", "Difference", "Proportion", "Proportion", "Slope")
)
# Plot 1: Forest plot of CIs
p_forest <- ggplot(ci_data, aes(x = Parameter, y = Estimate, color = Type)) +
geom_point(size = 4) +
geom_errorbar(aes(ymin = Lower, ymax = Upper), width = 0.3, linewidth = 1) +
geom_hline(yintercept = 0, linetype = "dashed", color = "gray50") +
labs(title = "Confidence Intervals Across Different Analyses",
subtitle = "Points show estimates; error bars show 95% CIs",
x = "Parameter", y = "Estimate Value") +
scale_color_manual(values = c("#E74C3C", "#3498DB", "#27AE60", "#F39C12")) +
coord_flip() +
theme_minimal() +
theme(
panel.grid.major = element_line(color = "gray90", linetype = "dashed"),
panel.grid.minor = element_blank(),
axis.line.x = element_line(color = "black"),
axis.line.y = element_line(color = "black"),
panel.border = element_blank(),
axis.line.x.top = element_blank(),
axis.line.y.right = element_blank(),
plot.title = element_text(hjust = 0.5, face = "bold"),
plot.subtitle = element_text(hjust = 0.5),
legend.position = "right"
)
# Plot 2: Regression with CI
p_reg_ci <- ggplot(stress_data, aes(x = stress, y = productivity)) +
geom_point(alpha = 0.5, color = "#3498DB") +
geom_smooth(method = "lm", se = TRUE, color = "#E74C3C",
fill = "#E74C3C", alpha = 0.2) +
labs(title = "Stress-Productivity Relationship with 95% CI",
subtitle = paste0("β₁ = ", round(beta_1, 3),
", 95% CI [", round(ci_coeffs[2, 1], 3),
", ", round(ci_coeffs[2, 2], 3), "]"),
x = "Stress Level", y = "Productivity Score") +
theme_minimal() +
theme(
panel.grid.major = element_line(color = "gray90", linetype = "dashed"),
panel.grid.minor = element_blank(),
axis.line.x = element_line(color = "black"),
axis.line.y = element_line(color = "black"),
panel.border = element_blank(),
axis.line.x.top = element_blank(),
axis.line.y.right = element_blank(),
plot.title = element_text(hjust = 0.5, face = "bold"),
plot.subtitle = element_text(hjust = 0.5, size = 9)
)
# Plot 3: Means with error bars (sleep study)
sleep_summary <- data.frame(
Intervention = c("Meditation", "Exercise"),
Mean = c(meditation_mean, exercise_mean),
SE = c(meditation_sd / sqrt(n_meditation), exercise_sd / sqrt(n_exercise))
)
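# Note: 1.96 is the large-sample z approximation; with n = 50 per group,
# qt(0.975, 49) ≈ 2.01 would give a slightly wider, more exact 95% CI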
sleep_summary$CI_Lower <- sleep_summary$Mean - 1.96 * sleep_summary$SE
sleep_summary$CI_Upper <- sleep_summary$Mean + 1.96 * sleep_summary$SE
p_means_ci <- ggplot(sleep_summary, aes(x = Intervention, y = Mean, fill = Intervention)) +
geom_bar(stat = "identity", alpha = 0.7, width = 0.6) +
geom_errorbar(aes(ymin = CI_Lower, ymax = CI_Upper),
width = 0.2, linewidth = 1) +
labs(title = "Sleep Quality by Intervention with 95% CI",
x = "Intervention", y = "Mean Sleep Quality Score") +
scale_fill_manual(values = c("#27AE60", "#E74C3C")) +
theme_minimal() +
theme(
panel.grid.major = element_line(color = "gray90", linetype = "dashed"),
panel.grid.minor = element_blank(),
axis.line.x = element_line(color = "black"),
axis.line.y = element_line(color = "black"),
panel.border = element_blank(),
axis.line.x.top = element_blank(),
axis.line.y.right = element_blank(),
plot.title = element_text(hjust = 0.5, face = "bold"),
legend.position = "none"
)
# Plot 4: Proportions with CIs
prop_summary <- data.frame(
Community = c("Urban", "Rural"),
Proportion = c(p_urban, p_rural),
Lower = c(prop_test_urban$conf.int[1], prop_test_rural$conf.int[1]),
Upper = c(prop_test_urban$conf.int[2], prop_test_rural$conf.int[2])
)
p_prop_ci <- ggplot(prop_summary, aes(x = Community, y = Proportion, fill = Community)) +
geom_bar(stat = "identity", alpha = 0.7, width = 0.6) +
geom_errorbar(aes(ymin = Lower, ymax = Upper),
width = 0.2, linewidth = 1) +
labs(title = "Vaccination Rates by Community with 95% CI",
x = "Community Type", y = "Vaccination Rate") +
scale_y_continuous(labels = scales::percent_format(accuracy = 1)) +
scale_fill_manual(values = c("#9B59B6", "#F39C12")) +
theme_minimal() +
theme(
panel.grid.major = element_line(color = "gray90", linetype = "dashed"),
panel.grid.minor = element_blank(),
axis.line.x = element_line(color = "black"),
axis.line.y = element_line(color = "black"),
panel.border = element_blank(),
axis.line.x.top = element_blank(),
axis.line.y.right = element_blank(),
plot.title = element_text(hjust = 0.5, face = "bold"),
legend.position = "none"
)
# Arrange plots
grid.arrange(p_forest, p_reg_ci, p_means_ci, p_prop_ci, ncol = 2)
Let’s demonstrate how sample size, variability, and confidence level affect CI width.
# Simulate data with different sample sizes
set.seed(2025)
sample_sizes <- c(20, 50, 100, 200, 500)
true_mean <- 50
true_sd <- 15
ci_widths <- data.frame()
for (n in sample_sizes) {
# Generate sample
sample_data <- rnorm(n, mean = true_mean, sd = true_sd)
# Calculate CI
t_res <- t.test(sample_data, conf.level = 0.95)
ci_widths <- rbind(ci_widths, data.frame(
n = n,
Mean = mean(sample_data),
Lower = t_res$conf.int[1],
Upper = t_res$conf.int[2],
Width = t_res$conf.int[2] - t_res$conf.int[1]
))
}
# Plot: CI width vs sample size
p_width <- ggplot(ci_widths, aes(x = n, y = Width)) +
geom_line(color = "#3498DB", linewidth = 1.2) +
geom_point(color = "#3498DB", size = 3) +
labs(title = "Effect of Sample Size on Confidence Interval Width",
subtitle = "95% CI for population mean",
x = "Sample Size (n)", y = "CI Width") +
scale_x_continuous(breaks = sample_sizes) +
theme_minimal() +
theme(
panel.grid.major = element_line(color = "gray90", linetype = "dashed"),
panel.grid.minor = element_blank(),
axis.line.x = element_line(color = "black"),
axis.line.y = element_line(color = "black"),
panel.border = element_blank(),
axis.line.x.top = element_blank(),
axis.line.y.right = element_blank(),
plot.title = element_text(hjust = 0.5, face = "bold"),
plot.subtitle = element_text(hjust = 0.5)
)
# Effect of confidence level
conf_levels <- c(0.80, 0.90, 0.95, 0.99)
sample_fixed <- rnorm(100, mean = true_mean, sd = true_sd)
ci_conf_levels <- data.frame()
for (conf in conf_levels) {
t_res <- t.test(sample_fixed, conf.level = conf)
ci_conf_levels <- rbind(ci_conf_levels, data.frame(
Confidence = paste0(conf * 100, "%"),
Conf_Numeric = conf,
Lower = t_res$conf.int[1],
Upper = t_res$conf.int[2],
Width = t_res$conf.int[2] - t_res$conf.int[1]
))
}
# Plot: CI width vs confidence level
p_conf <- ggplot(ci_conf_levels, aes(x = Conf_Numeric, y = Width)) +
geom_line(color = "#E74C3C", linewidth = 1.2) +
geom_point(color = "#E74C3C", size = 3) +
labs(title = "Effect of Confidence Level on CI Width",
subtitle = "Fixed sample size (n = 100)",
x = "Confidence Level", y = "CI Width") +
scale_x_continuous(labels = scales::percent_format(accuracy = 1)) +
theme_minimal() +
theme(
panel.grid.major = element_line(color = "gray90", linetype = "dashed"),
panel.grid.minor = element_blank(),
axis.line.x = element_line(color = "black"),
axis.line.y = element_line(color = "black"),
panel.border = element_blank(),
axis.line.x.top = element_blank(),
axis.line.y.right = element_blank(),
plot.title = element_text(hjust = 0.5, face = "bold"),
plot.subtitle = element_text(hjust = 0.5)
)
grid.arrange(p_width, p_conf, ncol = 2)
# Display numerical results
cat("Effect of Sample Size (95% CI):\n")
## Effect of Sample Size (95% CI):
print(ci_widths)
## n Mean Lower Upper Width
## 1 20 53.09899 46.72480 59.47317 12.748370
## 2 50 51.03550 46.57527 55.49572 8.920446
## 3 100 47.01111 44.14542 49.87680 5.731388
## 4 200 50.66637 48.58104 52.75170 4.170659
## 5 500 50.28285 48.95918 51.60652 2.647333
cat("\n\nEffect of Confidence Level (n = 100):\n")
##
##
## Effect of Confidence Level (n = 100):
print(ci_conf_levels[, c("Confidence", "Lower", "Upper", "Width")])
## Confidence Lower Upper Width
## 1 80% 47.34222 51.12783 3.785610
## 2 90% 46.79906 51.67100 4.871943
## 3 95% 46.32397 52.14609 5.822117
## 4 99% 45.38181 53.08825 7.706435
Key findings:
Sample size: As n increases from 20 to 500, CI width decreases from 12.75 to 2.65 (a 79.2% reduction).
Confidence level: Increasing confidence from 80% to 99% widens the CI from 3.79 to 7.71 (a 103.6% increase).
Here’s how to report each analysis with confidence intervals in APA format:
Single Mean (Nutrition Intervention):
The nutrition intervention led to an average increase of 1.84 servings of vegetables per day (SD = 0.92), which was significantly different from zero, t(79) = 17.99, p < .001, 95% CI [1.64, 2.04].
Two-Group Comparison (Sleep Quality):
An independent samples t-test found no significant difference in sleep quality between the meditation (M = 68.79, SD = 11.36) and exercise (M = 68.36, SD = 12.36) groups, t(98) = -0.18, p = .859, 95% CI for the difference [-5.13, 4.29]. The mean difference of 0.42 points is negligible, and the interval includes zero.
Proportion (Vaccination Rates):
Vaccination rates did not differ significantly between urban (76.5%, 95% CI [69.9%, 82.1%]) and rural (67.8%, 95% CI [60.4%, 74.4%]) communities, χ²(1) = 3.18, p = .074. The difference in rates was 8.7 percentage points, 95% CI [-0.8%, 18.3%], an interval that includes zero.
Regression Coefficient (Stress-Productivity):
Stress level significantly predicted productivity, b = -0.629, SE = 0.051, t(118) = -12.31, p < .001, 95% CI [-0.73, -0.527]. For each one-point increase in stress, productivity decreased by 0.629 points. The model explained 56.2% of the variance in productivity (R² = 0.562).
Correlation:
Social support and well-being were significantly positively correlated, r(98) = 0.702, p < .001, 95% CI [0.586, 0.79]. This indicates a strong positive relationship.
A 95% level is conventional but not mandatory; 90% or 99% intervals may be more appropriate depending on how costly it is to miss the true parameter.
Trade-off: Higher confidence → wider intervals → less precision.
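The extra width comes from the larger critical value required at higher confidence levels; a quick check:
conf <- c(0.80, 0.90, 0.95, 0.99)
data.frame(confidence = conf, z_critical = qnorm(1 - (1 - conf) / 2))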
To achieve a target CI width:
\[ n = \left(\frac{2 \cdot z_{\alpha/2} \cdot \sigma}{W}\right)^2 \]
where W is the desired total width.
Example: To estimate a mean within ±2 units with 95% confidence when σ = 10:
# Parameters
z_value <- qnorm(0.975) # 95% confidence
sigma <- 10
desired_width <- 4 # Total width (±2 on each side)
# Calculate required n
n_required <- ceiling((2 * z_value * sigma / desired_width)^2)
cat("Sample size needed for CI width of", desired_width, "units:\n")
## Sample size needed for CI width of 4 units:
cat("n =", n_required, "\n")
## n = 97
You would need n = 97 participants.
Confidence intervals are superior to p-values alone because they convey both the magnitude of the effect and the precision of the estimate, not just whether the effect differs from zero.
Example: A non-significant result with CI [-0.5, 8.2] suggests substantial uncertainty (the effect could be negligible or large), whereas [-0.1, 0.3] suggests a genuinely small effect.
WRONG: “There’s a 95% probability the true parameter is in this interval.”
RIGHT: “95% of intervals constructed this way will contain the true parameter.”
WRONG: “Values outside the CI are impossible.”
RIGHT: “Values outside the CI are less plausible given the data.”
WRONG: “Overlapping CIs mean no significant difference.”
RIGHT: “Use formal tests for comparisons; CI overlap is not equivalent to hypothesis testing.”
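A small simulation illustrates the last point. The group means and SDs below are chosen purely for illustration: the two 95% CIs overlap, yet the two-sample test is clearly significant.
set.seed(42)
n_per_group <- 100
# Rescale each sample to have exactly the chosen mean and SD (illustrative values)
a <- scale(rnorm(n_per_group))[, 1] * 10 + 50.0
b <- scale(rnorm(n_per_group))[, 1] * 10 + 53.5
t.test(a)$conf.int    # approximately [48.0, 52.0]
t.test(b)$conf.int    # approximately [51.5, 55.5]; overlaps the first interval
t.test(a, b)$p.value  # yet the two-sample test is significant (p < .05)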
For non-normal distributions or complex statistics:
# Bootstrap CI for median
set.seed(2025)
n_boot <- 10000
boot_medians <- numeric(n_boot)
for (i in 1:n_boot) {
boot_sample <- sample(nutrition_data$servings, replace = TRUE)
boot_medians[i] <- median(boot_sample)
}
# Calculate percentile CI
boot_ci_lower <- quantile(boot_medians, 0.025)
boot_ci_upper <- quantile(boot_medians, 0.975)
observed_median <- median(nutrition_data$servings)
cat("Bootstrap 95% CI for Median Vegetable Servings:\n")
## Bootstrap 95% CI for Median Vegetable Servings:
cat("Observed median:", round(observed_median, 3), "\n")
## Observed median: 1.756
cat("Bootstrap CI: [", round(boot_ci_lower, 3), ",", round(boot_ci_upper, 3), "]\n")
## Bootstrap CI: [ 1.49 , 2.157 ]
cat("Based on", n_boot, "bootstrap samples\n")
## Based on 10000 bootstrap samples
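For a more refined bootstrap interval, the boot package’s percentile and BCa methods are common alternatives to the manual loop above; a minimal sketch, applied to the same nutrition_data:
library(boot)
median_fun <- function(data, indices) median(data[indices])
boot_out <- boot(nutrition_data$servings, statistic = median_fun, R = 10000)
boot.ci(boot_out, conf = 0.95, type = c("perc", "bca"))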
When constructing multiple CIs, adjust confidence level:
# For 5 comparisons, maintain family-wise error rate of 0.05
n_comparisons <- 5
alpha_adjusted <- 0.05 / n_comparisons
conf_level_adjusted <- 1 - alpha_adjusted
cat("Multiple Comparisons Adjustment:\n")
## Multiple Comparisons Adjustment:
cat("Number of comparisons:", n_comparisons, "\n")
## Number of comparisons: 5
cat("Unadjusted α:", 0.05, "\n")
## Unadjusted α: 0.05
cat("Bonferroni-adjusted α:", round(alpha_adjusted, 4), "\n")
## Bonferroni-adjusted α: 0.01
cat("Adjusted confidence level:", round(conf_level_adjusted * 100, 2), "%\n")
## Adjusted confidence level: 99 %
# Example: CI for vegetable servings with Bonferroni correction
t_bonf <- t.test(nutrition_data$servings, conf.level = conf_level_adjusted)
cat("\nOriginal 95% CI: [", round(t_result$conf.int[1], 3), ",",
round(t_result$conf.int[2], 3), "]\n")
##
## Original 95% CI: [ 1.637 , 2.044 ]
cat("Bonferroni", round(conf_level_adjusted * 100, 2), "% CI: [",
round(t_bonf$conf.int[1], 3), ",", round(t_bonf$conf.int[2], 3), "]\n")
## Bonferroni 99 % CI: [ 1.571 , 2.111 ]
cat("Width increase:", round((t_bonf$conf.int[2] - t_bonf$conf.int[1]) -
(t_result$conf.int[2] - t_result$conf.int[1]), 3), "\n")
## Width increase: 0.133
Confidence intervals are indispensable tools for quantifying uncertainty in statistical estimates. They transform point estimates into ranges that communicate both our best guess and the precision of that estimate. Unlike p-values, which only indicate whether an effect exists, confidence intervals reveal the magnitude and plausibility of different effect sizes.
Key takeaways:
- A confidence interval is a statement about the procedure: across repeated samples, approximately 95% of 95% CIs will contain the true parameter.
- Interval width reflects precision and depends on sample size, variability, and the chosen confidence level.
- Report CIs alongside point estimates and p-values; they show both the plausible range of the effect and the precision of the estimate.
In our scenarios, confidence intervals revealed:
- a clearly beneficial nutrition intervention (the CI for the mean increase excluded zero),
- no detectable difference between the meditation and exercise interventions (the CI for the difference included zero),
- a suggestive but inconclusive urban-rural gap in vaccination uptake (the CI for the difference included zero),
- a reliably negative stress-productivity relationship (the entire CI for the slope was below zero), and
- a strong positive association between social support and well-being.
By embracing confidence intervals, we move beyond simplistic “significant/not significant” dichotomies to nuanced, informative quantification of uncertainty—the hallmark of rigorous, transparent scientific communication.