# 📖 describe

## Analyze EDF Trials and Extract Statistics

The `describe()` function analyzes eye-tracking trial data from an EDF (Eye Data Format) file. It computes key statistics for a single trial or for all trials, including:

- Total duration of the trial
- Number of fixations, saccades, and blinks
- Average fixation duration
- Average saccade amplitude

This function is useful for summarizing eye movement behavior from recorded trials.
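
To make the statistics listed above concrete, here is a minimal pandas sketch of how such per-trial numbers can be derived from an event table. It is not etformat's implementation: the column names (`trial`, `type`, `start_ms`, `end_ms`, `amplitude`), the toy values, and the definition of total duration as "last event end minus first event start" are all assumptions made for illustration.

```python
import pandas as pd

nan = float("nan")

# Hypothetical event table: column names and values are invented for this
# example and do not reflect etformat's actual export schema.
events = pd.DataFrame({
    "trial":     [1, 1, 1, 1, 2, 2, 2],
    "type":      ["fixation", "saccade", "fixation", "blink",
                  "fixation", "saccade", "fixation"],
    "start_ms":  [0, 200, 250, 600, 0, 300, 340],
    "end_ms":    [200, 250, 600, 700, 300, 340, 800],
    "amplitude": [nan, 2.0, nan, nan, nan, 1.5, nan],
})
events["duration_ms"] = events["end_ms"] - events["start_ms"]


def summarize_trial(trial_events):
    """Return summary statistics for the events of one trial."""
    fixations = trial_events[trial_events["type"] == "fixation"]
    saccades = trial_events[trial_events["type"] == "saccade"]
    blinks = trial_events[trial_events["type"] == "blink"]
    return {
        # Assumption: trial duration spans first event start to last event end.
        "Total Duration (ms)": trial_events["end_ms"].max() - trial_events["start_ms"].min(),
        "Number of Fixations": len(fixations),
        "Number of Saccades": len(saccades),
        "Number of Blinks": len(blinks),
        "Avg Fixation Duration (ms)": fixations["duration_ms"].mean(),
        "Avg Saccade Amplitude": saccades["amplitude"].mean(),
    }


# One row per trial, indexed by trial number.
rows = [{"Trial": trial, **summarize_trial(group)}
        for trial, group in events.groupby("trial")]
summary = pd.DataFrame(rows).set_index("Trial")
print(summary)
```

A trial with no fixations or saccades would yield `NaN` for the corresponding averages, because the mean of an empty selection is undefined.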



`describe` can work directly on an EDF file. Alternatively, you can first convert the data with the [export](export.md) function, load the resulting files, and pass them to `describe`.

In [2]:
import etformat as et
et.describe(r"D:\Github_web_page_website\test.EDF")

| | Trial | Total Duration (ms) | Total Samples | Number of Fixations | Number of Saccades | Number of Blinks | Avg Fixation Duration (ms) | Avg Saccade Amplitude |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | NaN | 0 | 0 | 0 | 0 | NaN | NaN |
| 1 | 2 | 5386.0 | 5387 | 14 | 14 | 1 | 384.714286 | 2.102397 |
| 2 | 3 | 2659.0 | 2660 | 5 | 5 | 0 | 531.800000 | 1.693147 |
| 3 | 4 | 2776.0 | 2777 | 8 | 8 | 0 | 347.000000 | 1.463418 |
| 4 | 5 | 2979.0 | 2980 | 6 | 6 | 1 | 496.500000 | 2.124911 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 475 | 476 | 2451.0 | 2452 | 2 | 2 | 0 | 1225.500000 | 0.967749 |
| 476 | 477 | 2523.0 | 2524 | 4 | 4 | 0 | 630.750000 | 0.707254 |
| 477 | 478 | 3084.0 | 3085 | 9 | 9 | 1 | 342.666667 | 2.083270 |
| 478 | 479 | 3127.0 | 3128 | 7 | 7 | 1 | 446.714286 | 2.679784 |


```{note}
A common question is why some rows contain NaN. This happens because metadata stored in the binary file, such as calibration details, is reproduced in the output. Use the `clean` function to clean your data and remove the unnecessary rows.
```
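
As a rough illustration of the kind of cleanup the note refers to, the sketch below uses plain pandas to drop rows without a trial number and then drop columns that are entirely NaN or entirely zero. It is not the `clean` function itself: the toy table, its values, and the order of the steps are assumptions chosen for the example.

```python
import pandas as pd

nan = float("nan")

# Toy samples table; the values are invented for illustration only.
samples = pd.DataFrame({
    "trial":   [nan, 1.0, 1.0, 2.0],           # first row has no trial (metadata)
    "time":    [100, 101, 102, 103],
    "gxR":     [nan, 1382.7, 1382.3, 1381.4],
    "gxL":     [nan, nan, nan, nan],           # column that is entirely NaN
    "buttons": [0, 0, 0, 0],                   # column that is entirely zero
})

# Step 1: remove rows that do not belong to any trial.
cleaned = samples.dropna(subset=["trial"])

# Step 2: find columns that carry no information after the row filter.
all_nan = [c for c in cleaned.columns if cleaned[c].isna().all()]
all_zero = [c for c in cleaned.columns
            if c not in all_nan and (cleaned[c] == 0).all()]

# Step 3: drop those columns.
cleaned = cleaned.drop(columns=all_nan + all_zero)
print(f"{samples.shape} -> {cleaned.shape}")
```

After this filtering, only rows tied to a trial and columns with actual data remain, which is why summaries computed afterwards no longer show the NaN row.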

```{warning}
If an error about the file appears on this page, it is a build artifact of `jupyter-book`, not a problem with the function. The code works in a normal Python environment.
```

If you have already exported the data, the conversion produces two separate files, `EDFfilename_events.csv` and `EDFfilename_samples.csv`, which you can load and analyze.

In [10]:
import pandas as pd

# Load the exported CSV files
samples = pd.read_csv(r"D:\Github_web_page_website\test_samples.csv")
events = pd.read_csv(r"D:\Github_web_page_website\test_events.csv")

In [6]:
et.clean(samples, events, copy=False)

🧹 Starting data cleaning for eye-tracking data
   📊 Samples: DataFrame with shape (1520309, 56)
   📊 Events: DataFrame with shape (18261, 38)

🔍 Processing SAMPLES data...
   Original samples shape: (1520309, 56)
   Columns: 56
   ❌ Removed 2522 rows with NaN trials
   ❌ Removed 15 columns:
      • time_rel (all_nan)
      • pxL (all_nan)
      • pyL (all_nan)
      • hxL (all_nan)
      • hyL (all_nan)
      • paL (all_nan)
      • gxL (all_nan)
      • gyL (all_nan)
      • hdata1 (all_zero)
      • hdata6 (all_zero)
      • hdata7 (all_zero)
      • input (all_zero)
      • buttons (all_zero)
      • htype (all_nan)
      • errors (all_zero)
   ✅ Samples cleaned: (1520309, 56) → (1517787, 41)

🔍 Processing EVENTS data...
   Original events shape: (18261, 38)
   Columns: 38
   ❌ Removed 83 rows with NaN trials
   ❌ Removed 7 columns:
      • time (all_zero)
      • sttime_rel (all_nan)
      • entime_rel (all_nan)
      ...

(         trial     time     pxR     pyR    hxR    hyR     paR     gxR    gyR  \
 2522       1.0   392207 -3860.0 -2051.0  342.0  479.0  5275.0  1382.7  839.0   
 2523       1.0   392208 -3862.0 -2048.0  341.0  481.0  5275.0  1382.3  839.7   
 2524       1.0   392209 -3867.0 -2044.0  338.0  484.0  5276.0  1381.4  840.5   
 2525       1.0   392210 -3873.0 -2042.0  334.0  486.0  5276.0  1380.2  841.1   
 2526       1.0   392211 -3885.0 -2041.0  327.0  486.0  5276.0  1378.1  841.2   
 ...        ...      ...     ...     ...    ...    ...     ...     ...    ...   
 1520304  480.0  1909989 -4790.0 -3160.0 -176.0 -336.0  4075.0  1226.6  594.4   
 1520305  480.0  1909990 -4790.0 -3155.0 -176.0 -332.0  4061.0  1226.6  595.5   
 1520306  480.0  1909991 -4794.0 -3135.0 -179.0 -317.0  4040.0  1225.7  599.9   
 1520307  480.0  1909992 -4799.0 -3115.0 -182.0 -303.0  4019.0  1224.8  604.3   
 1520308  480.0  1909993 -4803.0 -3095.0 -185.0 -288.0  3997.0  1224.0  608.7   
 
            rx  ...      f

In [13]:
# Describe trial #3 using CSV
trial_stats = et.describe((events, samples), trial_number=3)
trial_stats

{'Trial': 3.0,
 'Total Duration (ms)': 2659.0,
 'Total Samples': 2660.0,
 'Number of Fixations': 5.0,
 'Number of Saccades': 5.0,
 'Number of Blinks': 0.0,
 'Avg Fixation Duration (ms)': 531.8,
 'Avg Saccade Amplitude': 1.6931468949603112}
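
Since the single-trial result shown above is a plain dictionary, several such results can be stacked into one table for side-by-side comparison. The sketch below hard-codes two dictionaries so that it runs without an EDF file: the trial 3 values are copied from the output above, and the trial 2 values are copied from the all-trials table earlier on this page and written in the same dictionary form as an assumption. In practice each dictionary would come from a `describe` call like the one above.

```python
import pandas as pd

# Values copied from the outputs shown on this page (not recomputed here).
trial_2 = {
    "Trial": 2.0,
    "Total Duration (ms)": 5386.0,
    "Total Samples": 5387.0,
    "Number of Fixations": 14.0,
    "Number of Saccades": 14.0,
    "Number of Blinks": 1.0,
    "Avg Fixation Duration (ms)": 384.714286,
    "Avg Saccade Amplitude": 2.102397,
}
trial_3 = {
    "Trial": 3.0,
    "Total Duration (ms)": 2659.0,
    "Total Samples": 2660.0,
    "Number of Fixations": 5.0,
    "Number of Saccades": 5.0,
    "Number of Blinks": 0.0,
    "Avg Fixation Duration (ms)": 531.8,
    "Avg Saccade Amplitude": 1.6931468949603112,
}

# One row per trial, indexed by trial number, rounded for display.
table = pd.DataFrame([trial_2, trial_3]).set_index("Trial")
print(table.round(2))
```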