A Python toolkit for analysis of graphomotor data collected via Curious.
Welcome to graphomotor, a specialized Python library for analyzing graphomotor data collected via Curious. This toolkit aims to provide tools for processing, analyzing, and visualizing data from graphomotor assessment tasks, including spiral drawing, trail making, alphabetic writing, digit symbol substitution, and the Rey-Osterrieth Complex Figure Test.
Important
graphomotor is under active development. So far, the feature extraction and visualization components for the spiral drawing task are complete. The next steps involve implementing preprocessing for this task and extending support to other tasks.
The toolkit extracts 25 clinically relevant metrics from digitized drawing data. Currently implemented feature categories include:
- Velocity Features (15): Velocity analysis including linear, radial, and angular velocity components with statistical measures (sum, median, variation, skewness, kurtosis).
- Distance Features (8): Spatial accuracy measurements using Hausdorff distance metrics with temporal normalizations and segment-specific analysis.
- Drawing Error Features (1): Area under the curve (AUC) calculations between drawn paths and ideal reference trajectories to quantify spatial accuracy.
- Temporal Features (1): Task completion duration.
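As a rough illustration of how velocity and distance features of this kind can be derived from sampled points, the sketch below computes the linear speed between consecutive samples and a symmetric Hausdorff distance between a drawn path and a reference path. It uses only the standard library. The function names and sample data are hypothetical, and this is not graphomotor's own implementation.

```python
import math
import statistics


def linear_velocity(xs, ys, ts):
    """Return the speed between consecutive samples (distance / elapsed time)."""
    speeds = []
    for i in range(1, len(ts)):
        dt = ts[i] - ts[i - 1]
        if dt <= 0:
            continue  # skip duplicate or out-of-order timestamps
        step = math.hypot(xs[i] - xs[i - 1], ys[i] - ys[i - 1])
        speeds.append(step / dt)
    return speeds


def hausdorff_distance(path_a, path_b):
    """Return the symmetric Hausdorff distance between two lists of points."""

    def directed(src, dst):
        # Largest distance from any point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(path_a, path_b), directed(path_b, path_a))


# Made-up samples: x, y positions and elapsed seconds.
xs = [0.0, 3.0, 3.0]
ys = [0.0, 4.0, 8.0]
ts = [0.0, 1.0, 3.0]

speeds = linear_velocity(xs, ys, ts)
print("sum:", sum(speeds), "median:", statistics.median(speeds))

drawn = list(zip(xs, ys))
reference = [(0.0, 0.0), (3.0, 4.0), (3.0, 9.0)]  # hypothetical ideal path
print("hausdorff:", hausdorff_distance(drawn, reference))
```

The brute-force Hausdorff computation is quadratic in the number of points, which is acceptable for a sketch but may be slow for long recordings.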
The toolkit provides several plotting functions to visualize extracted features:
- Distribution Plots: Kernel density estimation plots showing feature distributions grouped by task type and hand.
- Trend Plots: Line plots displaying feature progression across task sequences with individual participant trajectories and group means.
- Box Plots: Box-and-whisker plots comparing feature distributions across different tasks and hand conditions.
- Cluster Heatmaps: Hierarchically clustered heatmaps of z-score standardized features to identify patterns across conditions.
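For readers unfamiliar with z-score standardization, the step applied before clustering can be sketched as follows. This is a minimal standard-library example with made-up values. The zscore helper is hypothetical, and whether graphomotor uses the population or the sample standard deviation is an assumption here.

```python
import statistics


def zscore(values):
    """Standardize values to zero mean and unit population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant column has no variation to scale.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


durations = [10.0, 12.0, 14.0]  # made-up feature values
standardized = zscore(durations)
print(standardized)
```

Standardizing each feature this way puts features with different units on a common scale, so no single feature dominates the heatmap colors or the clustering.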
The toolkit includes plotting functions to visualize raw spiral drawing trajectories for quality control and data inspection:
- Single Spiral Plots: Individual spiral trajectory visualization with optional reference spiral overlay and color-coded line segments.
- Batch Spiral Plots: Grid-based visualization of multiple spirals organized by participant, hand, and task with automatic labeling.
Install graphomotor from PyPI:
pip install graphomotor
Or install the latest development version directly from GitHub:
pip install git+https://github.com/childmindresearch/graphomotor
graphomotor provides two main functionalities: feature extraction from raw drawing data and visualization of extracted features. Both are available through a command-line interface (CLI) and as an importable Python library.
Caution
Input data must follow the Curious drawing responses format. See Data Format Requirements below.
To see all available commands and global options in the CLI:
graphomotor --help
The feature extraction functionality computes a set of clinically relevant metrics from digitized graphomotor drawing data. It processes raw CSV files exported from Curious, extracts participant metadata, and calculates features such as velocity, distance, drawing error, and temporal measures for each drawing. The resulting structured dataset can be used for further analysis or visualization, supporting both single-file and batch processing workflows.
To extract features from a single file:
graphomotor extract /path/to/data.csv /path/to/output/features.csv
To extract features from entire directories:
graphomotor extract /path/to/data_directory/ /path/to/output/features.csv
To see all available options for extract:
graphomotor extract --help
To extract features from a single file with the Python library:
from graphomotor.core import orchestrator
# Define input path
input_path = "/path/to/data.csv"
# Define output path to save results to disk
# If output path is a directory, file name will be auto-generated as
# `{participant_id}_{task}_{hand}_features_{YYYYMMDD_HHMM}.csv`
output_path = "/path/to/output/features.csv"
# Run the pipeline
results_df = orchestrator.run_pipeline(input_path=input_path, output_path=output_path)
To extract features from entire directories with the Python library:
from graphomotor.core import orchestrator
# Define input directory
input_dir = "/path/to/data_directory/"
# Define output path to save results to disk
# If output path is a directory, file name will be auto-generated as
# `batch_features_{YYYYMMDD_HHMM}.csv`
output_dir = "/path/to/output/"
# Run the pipeline
results_df = orchestrator.run_pipeline(input_path=input_dir, output_path=output_dir)
To access the results:
# run_pipeline() returns a DataFrame with extracted metadata and features
print(f"Processed {len(results_df)} files")
print(f"Extracted metadata and features: {results_df.columns.tolist()}")
# Get data for first file
# DataFrame is indexed by file path
file_path = results_df.index[0]
participant = results_df.loc[file_path, "participant_id"]
task = results_df.loc[file_path, "task"]
duration = results_df.loc[file_path, "duration"]
Note
For detailed configuration options and additional parameters for feature extraction, refer to the run_pipeline documentation.
The feature visualization module enables users to generate a variety of plots from batch feature output files produced by the feature extraction process. These output CSV files are expected to contain data from multiple participants, with the first five columns reserved for metadata (source_file, participant_id, task, hand, start_time), followed by columns representing numerical features. The visualization tools offer flexible selection of features and plot types, accessible via both the CLI and the Python library. All plotting functions return matplotlib.figure.Figure objects, facilitating further customization and integration into analysis workflows. This functionality streamlines exploratory data analysis and supports the identification of trends and outliers in graphomotor datasets.
Tip
Custom Features: You can add custom feature columns to your output CSV files alongside the standard graphomotor features. The plotting functions will automatically detect and include any additional columns after the first 5 metadata columns.
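One way to append such a custom column is sketched below using only the standard library. The CSV content is a tiny made-up stand-in for a batch features file, and both the add_custom_feature helper and the auc_per_second feature name are hypothetical.

```python
import csv
import io

# The five metadata columns that must come first in a batch features file.
METADATA_COLUMNS = ["source_file", "participant_id", "task", "hand", "start_time"]


def add_custom_feature(rows, name, compute):
    """Append a custom feature column computed from each row's existing values."""
    return [{**row, name: compute(row)} for row in rows]


# In-memory stand-in for a batch features CSV (all values are made up).
raw = io.StringIO(
    "source_file,participant_id,task,hand,start_time,duration,area_under_curve\n"
    "a.csv,5123456,spiral_trace1,Dom,2023-05-29 10:00:00,20.0,100.0\n"
)
reader = csv.DictReader(raw)
assert reader.fieldnames[:5] == METADATA_COLUMNS  # metadata columns stay first

rows = add_custom_feature(
    list(reader),
    "auc_per_second",  # hypothetical custom feature
    lambda row: float(row["area_under_curve"]) / float(row["duration"]),
)

# Write the rows back out; the new column lands after the existing features.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

Because the new column is appended at the end, the first five metadata columns keep their positions.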
To generate all available plot types with all features in the input file:
graphomotor plot-features /path/to/batch_features.csv /path/to/plots/
To generate only selected plot types for specific features:
graphomotor plot-features /path/to/batch_features.csv /path/to/plots/ -p dist -p trend -f area_under_curve -f duration
To see all available options for plot-features:
graphomotor plot-features --help
To generate distribution plots for specific features with the Python library and save them:
from graphomotor.plot import feature_plots
# Define paths to data, output directory and features to plot
data = "/path/to/batch_features.csv"
output_dir = "/path/to/plots/"
features = ["linear_velocity_median", "hausdorff_distance_maximum"]
# Generate and save distribution plots for selected features
feature_plots.plot_feature_distributions(
    data=data,
    output_path=output_dir,
    features=features,
)
To generate boxplots for all available features in a notebook:
from graphomotor.plot import feature_plots
# Use magic command for inline plotting in notebooks
%matplotlib inline
# Define the path to data
data = "/path/to/batch_features.csv"
# Generate boxplots for all available features and return the figure object
fig = feature_plots.plot_feature_boxplots(data=data)
# It is possible to customize the figure object further before displaying or saving it:
# Change the figure title
fig.suptitle("Custom Title", fontsize=24)
# Adjust the layout so that figure title does not overlap with subplots
fig.subplots_adjust(top=0.94)
# Customize each subplot
for ax in fig.get_axes():
    # Change the degree of rotation for x-tick labels
    ax.tick_params(axis="x", rotation=30)
    # Hide gridlines
    ax.grid(False)
    # Highlight outliers by changing their color and size
    for line in ax.get_lines():
        if line.get_marker() == "o":
            line.set_markerfacecolor("red")
            line.set_markeredgecolor("red")
            line.set_markersize(4)
# Save the figure after the changes
fig.savefig("path/to/customized_boxplots.png", dpi=300)
Note
For all available feature plotting options, refer to the feature_plots documentation.
Visualize raw spiral drawing trajectories from Curious CSVs for quick QC and exploratory review. You can plot a single file or batch-plot a whole folder into a structured grid. Plots can optionally include a reference spiral and color-coded line segments.
To plot a single spiral CSV file:
graphomotor plot-spiral /path/to/spiral.csv /path/to/plots/
To batch-plot all spiral CSVs in a directory as a grid with reference spiral and color-coded segments:
graphomotor plot-spiral /path/to/spiral_directory/ /path/to/plots/ -i -c
To plot a single spiral with the Python library, save it, and return a matplotlib.figure.Figure:
from graphomotor.plot import spiral_plots
fig = spiral_plots.plot_single_spiral(
    data="/path/to/spiral.csv",
    output_path="/path/to/plots/",
    include_reference=True,
    color_segments=True,
)
To batch-plot a directory of spirals into a grid figure in a notebook:
from graphomotor.plot import spiral_plots
# Use magic command for inline plotting in notebooks
%matplotlib inline
fig = spiral_plots.plot_batch_spirals(
    input_path="/path/to/spiral_directory/",
    include_reference=True,
)
Note
For all available spiral plotting options, refer to the spiral_plots documentation.
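Task | Preprocessing | Feature Extraction | Visualization |
---|---|---|---|
Spiral | Planned | Complete | Complete |
Rey-Osterrieth Complex Figure | Planned | Planned | Planned |
Alphabetic Writing | Planned | Planned | Planned |
Digit Symbol Substitution | Planned | Planned | Planned |
Trails Making | Planned | Planned | Planned |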
A drawing data export from Curious typically includes the following files:
- report.csv: Contains the participants' actual responses.
- activity_user_journey.csv: Logs the entire journey through the activity, including button actions like "Next", "Skip", "Back", and "Undo", regardless of whether a response was provided.
- drawing-responses-{date}.zip: A ZIP archive with raw drawing response CSV files for each participant (e.g., drawing-responses-Mon May 29 2023.zip).
- media-responses-{date}.zip: A ZIP archive containing SVG files for the drawing responses (e.g., media-responses-Mon May 29 2023.zip).
- trails-responses-{date}.zip: A ZIP archive with raw trail making response CSV files for each participant, if there are any (e.g., trails-responses-Mon May 29 2023.zip).
For Spiral tasks, the toolkit uses only the CSV files from the drawing responses ZIP. Support for additional tasks will be added in future releases.
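To locate those CSV files inside an export, something like the following standard-library sketch could be used. The list_drawing_csvs helper is hypothetical, and the archive is built in memory so the example runs without a real Curious export.

```python
import io
import zipfile


def list_drawing_csvs(zip_file):
    """Return the sorted names of CSV members in a drawing responses archive."""
    with zipfile.ZipFile(zip_file) as archive:
        return sorted(
            name for name in archive.namelist() if name.lower().endswith(".csv")
        )


# Build a small in-memory archive standing in for drawing-responses-{date}.zip.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "[5123456]a7f3b2e9-d4c8-f1a6-e5b9-c2d7f8a3e6b4-spiral_trace1_Dom.csv",
        "line_number,x,y,UTC_Timestamp,seconds,epoch_time_in_seconds_start\n",
    )
    archive.writestr("notes.txt", "not a drawing file")
buffer.seek(0)

csv_names = list_drawing_csvs(buffer)
print(csv_names)
```

With a real export, zipfile.ZipFile.extractall can unpack the archive into a directory, which can then be passed to the extract command as the input directory.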
Spiral data files must follow this naming convention:
[5123456]a7f3b2e9-d4c8-f1a6-e5b9-c2d7f8a3e6b4-spiral_trace1_Dom.csv
Where:
- Participant ID: Must be enclosed in brackets [] and be a 7-digit number starting with 5 (e.g., [5123456]) that matches the target_secret_id column in the report.csv file.
- Activity Submission ID: Must be a 32-character hexadecimal string (e.g., 18f2-45ea-a1e4-2334e07cc706) that matches the id column in the report.csv file.
- Task: Must be one of the following that matches the item column in the report.csv file:
  - spiral_trace1_Dom through spiral_trace5_Dom (dominant hand tracing tasks)
  - spiral_trace1_NonDom through spiral_trace5_NonDom (non-dominant hand tracing tasks)
  - spiral_recall1_Dom through spiral_recall3_Dom (dominant hand recall tasks)
  - spiral_recall1_NonDom through spiral_recall3_NonDom (non-dominant hand recall tasks)
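The naming convention can be checked before running extraction with a regular expression along these lines. This is a sketch, not graphomotor's own validation logic: the pattern assumes lowercase hexadecimal digits grouped 8-4-4-4-12 with hyphens, as in the example file name above, and parse_spiral_filename is a hypothetical helper.

```python
import re

FILENAME_PATTERN = re.compile(
    # Participant ID: 7 digits starting with 5, enclosed in brackets.
    r"^\[(?P<participant_id>5\d{6})\]"
    # Activity submission ID: 32 hex digits in hyphen-separated groups (assumed layout).
    r"(?P<submission_id>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})-"
    # Task: trace1-5 or recall1-3, for the dominant or non-dominant hand.
    r"(?P<task>spiral_(?:trace[1-5]|recall[1-3])_(?:Dom|NonDom))\.csv$"
)


def parse_spiral_filename(filename):
    """Return the parts of a spiral file name, or None if it does not conform."""
    match = FILENAME_PATTERN.match(filename)
    return match.groupdict() if match else None


parts = parse_spiral_filename(
    "[5123456]a7f3b2e9-d4c8-f1a6-e5b9-c2d7f8a3e6b4-spiral_trace1_Dom.csv"
)
print(parts)

# A recall task index above 3 does not match the convention.
print(parse_spiral_filename(
    "[5123456]a7f3b2e9-d4c8-f1a6-e5b9-c2d7f8a3e6b4-spiral_recall4_Dom.csv"
))
```

Matching the IDs against the target_secret_id and id columns of report.csv is a separate step that this sketch does not cover.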
Spiral data CSV files must contain the following columns:
line_number, x, y, UTC_Timestamp, seconds, epoch_time_in_seconds_start
These columns are the standard output described in the Curious drawing responses data dictionary.
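A quick header check for these columns might look like the sketch below. It reads only the first row of a CSV using the standard library. The missing_columns helper is hypothetical and the sample data is made up.

```python
import csv
import io

REQUIRED_COLUMNS = {
    "line_number",
    "x",
    "y",
    "UTC_Timestamp",
    "seconds",
    "epoch_time_in_seconds_start",
}


def missing_columns(csv_file):
    """Return the required column names absent from the CSV header, sorted."""
    header = next(csv.reader(csv_file), [])
    present = {name.strip() for name in header}
    return sorted(REQUIRED_COLUMNS - present)


complete = io.StringIO(
    "line_number,x,y,UTC_Timestamp,seconds,epoch_time_in_seconds_start\n"
    "0,50.0,50.0,1685354400.0,0.0,1685354400.0\n"
)
incomplete = io.StringIO("line_number,x,y\n0,50.0,50.0\n")

print(missing_columns(complete))
print(missing_columns(incomplete))
```

An empty list means that all required columns are present. Extra columns are ignored by this check.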
graphomotor is under active development. For more detailed information about upcoming features and development plans, please refer to our GitHub Issues page.
Contributions from the community are welcome! Please review the Contributing Guidelines for information on how to get started, coding standards, and the pull request process.
- Messan, K. S., Kia, S. M., Narayan, V. A., Redmond, S. J., Kogan, A., Hussain, M. A., McKhann, G. M. II, & Vahdat, S. (2022). Assessment of Smartphone-Based Spiral Tracing in Multiple Sclerosis Reveals Intra-Individual Reproducibility as a Major Determinant of the Clinical Utility of the Digital Test. Frontiers in Medical Technology, 3, 714682. https://doi.org/10.3389/fmedt.2021.714682