Handling Missing Data in Single-Case Experimental Design
The fill_missing function handles missing values in single-case experimental design (SCED) datasets. It interpolates missing values without adding new time points.
Filling Missing Values with fill_missing
The fill_missing function fills missing values in a single-case dataset using linear interpolation. Because it does not introduce new time points, the original structure of the dataset is preserved.
Required Arguments:
data: A pandas DataFrame containing SCED data.
Optional Arguments:
dvar: The column name of the dependent variable (default "values").
mvar: The column name representing measurement time (default "mt").
na_rm: If True, explicitly removes missing values before interpolation.
The function processes each case separately and interpolates missing values based on existing time points.
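The per-case behavior described above can be approximated in plain pandas and NumPy. The following is a minimal sketch under stated assumptions, not scia's actual implementation: the helper name interpolate_per_case and the cvar argument (the case column, assumed here to be "case") are hypothetical, and edge handling in fill_missing may differ.

```python
import numpy as np
import pandas as pd


def interpolate_per_case(data, dvar="values", mvar="mt", cvar="case"):
    """Hypothetical helper (not part of scia): fill NaNs in dvar by linear
    interpolation over mvar, one case at a time, without adding rows."""
    out = data.copy()
    for _, block in data.groupby(cvar):
        block = block.sort_values(mvar)
        known = block[dvar].notna()
        if not known.any():
            continue  # no observed values to interpolate from in this case
        # np.interp holds the nearest observed value for points outside
        # the observed range; it does not extrapolate.
        out.loc[block.index, dvar] = np.interp(
            block[mvar].to_numpy(dtype=float),
            block.loc[known, mvar].to_numpy(dtype=float),
            block.loc[known, dvar].to_numpy(dtype=float),
        )
    return out


sample = pd.DataFrame({
    "case": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "mt": [1, 2, 3, 4, 1, 2, 3, 4],
    "values": [2, np.nan, 5, 6, 1, np.nan, np.nan, 4],
})
filled = interpolate_per_case(sample)
print(filled)
```

Grouping by case before interpolating keeps a gap at the end of one case from being filled with values from the start of the next, and interpolating over the measurement-time column rather than the row position keeps the result meaningful when time points are unevenly spaced.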
import scia as sc
📖 scia 1.101.0.dev6 - For Documentation, visit: https://ahsankhodami.github.io/scia/intro.html
Example 1: Filling Missing Values in a Dataset
This example demonstrates how missing values in the dependent variable column (values) are automatically filled.
import pandas as pd
import numpy as np
# Create a sample dataset with missing values
df = pd.DataFrame({
"case": ["A", "A", "A", "A", "B", "B", "B", "B"],
"mt": [1, 2, 3, 4, 1, 2, 3, 4],
"values": [2, np.nan, 5, 6, 1, np.nan, np.nan, 4]
})
# Fill missing values
df_filled = sc.fill_missing(df)
print(df_filled)
case mt values
0 A 1 2.0
1 A 2 3.5
2 A 3 5.0
3 A 4 6.0
4 B 1 1.0
5 B 2 2.0
6 B 3 3.0
7 B 4 4.0
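The filled values in this output are consistent with standard linear interpolation between the nearest observed points on either side of each gap. As a worked check, assuming interpolation over the measurement time t with neighboring observations (t_0, y_0) and (t_1, y_1):

```latex
y(t) = y_0 + (y_1 - y_0)\,\frac{t - t_0}{t_1 - t_0}, \qquad t_0 \le t \le t_1

\text{Case A, } t = 2:\quad y = 2 + (5 - 2)\,\frac{2 - 1}{3 - 1} = 3.5

\text{Case B, } t = 2:\quad y = 1 + (4 - 1)\,\frac{2 - 1}{4 - 1} = 2.0

\text{Case B, } t = 3:\quad y = 1 + (4 - 1)\,\frac{3 - 1}{4 - 1} = 3.0
```

Because the time points in this example are evenly spaced, interpolating over row position would give the same numbers.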
Example 2: Keeping Explicit NA Values
Setting na_rm=False retains explicit NaN values in the dataset instead of removing them before interpolation. For this dataset, the filled result is the same as in Example 1.
df_filled = sc.fill_missing(df, na_rm=False)
print(df_filled)
case mt values
0 A 1 2.0
1 A 2 3.5
2 A 3 5.0
3 A 4 6.0
4 B 1 1.0
5 B 2 2.0
6 B 3 3.0
7 B 4 4.0