[BUG] Improve signature docs #2929

Open
wants to merge 22 commits into
base: main
@@ -35,9 +35,9 @@ def _rescale_path(path, depth):


def _rescale_signature(signature, channels, depth):
"""Rescals the output signature by multiplying the depth-d term by d!.
"""Rescales the output signature by multiplying the depth-d term by d!.

Ain is that every term become ~O(1).
The aim is that every term becomes ~O(1).

Parameters
----------
Expand Down
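As an illustration of the rescaling described in the docstring above, here is a minimal sketch; it is not the code touched by this PR. It assumes a hypothetical flat layout in which the signature is a plain Python list holding the depth-1 block (`channels` terms), then the depth-2 block (`channels ** 2` terms), and so on, with no leading constant term. The name `rescale_signature_sketch` is made up for this example. The motivation is that the depth-d term of a straight-line path decays like 1/d!, so multiplying by d! brings every level back to a comparable scale.

```python
from math import factorial


def rescale_signature_sketch(signature, channels, depth):
    """Multiply each depth-k block of a flat signature by k!.

    Hypothetical layout (an assumption of this sketch): blocks of size
    channels**k for k = 1..depth, concatenated, no leading 1.
    """
    expected = sum(channels**k for k in range(1, depth + 1))
    if len(signature) != expected:
        raise ValueError(f"expected {expected} terms, got {len(signature)}")

    rescaled = []
    start = 0
    for k in range(1, depth + 1):
        block_size = channels**k
        block = signature[start : start + block_size]
        # Every term at depth k is scaled by the same factor k!
        rescaled.extend(factorial(k) * term for term in block)
        start += block_size
    return rescaled


# Straight line with increment (1, 1): depth-1 terms are 1, depth-2 terms are 1/2.
line_signature = [1.0, 1.0, 0.5, 0.5, 0.5, 0.5]
rescaled_line = rescale_signature_sketch(line_signature, channels=2, depth=2)
print(rescaled_line)  # [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
```

After rescaling, both levels of the straight-line example have the same magnitude, which is the "~O(1)" behaviour the docstring aims for.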
96 changes: 57 additions & 39 deletions examples/transformations/signature_method.ipynb
@@ -4,56 +4,61 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# The Signature Method with aeon\n",
"# The Signature Method\n",
"\n",
"The ‘signature method’ refers to a collection of feature extraction techniques for multimodal sequential data, derived from the theory of controlled differential equations. In recent years, a large number of modifications have been suggested to the signature method so as to improve some aspect of it.\n",
"\n",
"In the paper [\"A Generalised Signature Method for Time-Series\"](https://arxiv.org/abs/2006.00873) [1] the authors collated the vast majority of these modifications into a single document and ran a large hyper-parameter study over the multivariate UEA datasets to build a generic signature algorithm that is expected to work well on a wide range of datasets. We implement the best practice results from this study as the default starting values for our hyperparameters in the `SignatureClassifier` module.\n"
"In the paper \n",
"[\"A Generalised Signature Method for Time-Series\"](https://arxiv.org/abs/2006.00873) [1] the authors collated the vast majority of these \n",
"modifications into a single document and ran a large hyper-parameter study over the \n",
"multivariate UEA datasets to build a generic signature algorithm that is expected to \n",
"work well on a wide range of datasets. We implement the best practice results from \n",
"this study as the default starting values for our hyperparameters in the \n",
"`SignatureClassifier` estimator.\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## The Path Signature\n",
"At the heart of the signature method is the so-called \"signature transform\".\n",
"\n",
"A path $X$ of finite length in $\\textit{d}$ dimensions can be described by the mapping $X:[a, b]\\rightarrow\\mathbb{R}$ $\\!\\!^d$, or in terms of coordinates $X=(X^1_t, X^2_t, ...,X^d_t)$, where each coordinate $X^i_t$ is real-valued and parameterised by $t\\in[a,b]$.\n",
"At the heart of the signature method is the so-called *signature transform*.\n",
"\n",
"A path $X$ of finite length in $d$ dimensions can be described by the mapping\n",
"$X:[a, b] \\rightarrow \\mathbb{R}^d$, or in terms of coordinates\n",
"$X = (X^1_t, X^2_t, \\ldots, X^d_t)$, where each coordinate $X^i_t$ is\n",
"real-valued and parameterised by $t \\in [a,b]$.\n",
"\n",
"The **signature transform** $S$ of a path $X$ is defined as an infinite sequence of values:\n",
"\n",
"\\begin{equation}\n",
" S(X)_{a, b} = (1, S(X)_{a, b}^1, S(X)_{a, b}^2, ..., S(X)_{a, b}^d, S(X)_{a,b}^{1, 1}, S(X)_{a,b}^{1, 2}, ...),\n",
" \\label{eq:path_signature}\n",
"\\end{equation}\n",
"\n",
"$$\n",
"S(X)_{a, b} = (1, S(X)_{a, b}^1, S(X)_{a, b}^2, \\ldots, S(X)_{a, b}^d, S(X)_{a,b}^{1,\n",
" 1}, S(X)_{a,b}^{1, 2}, \\ldots)\n",
"$$\n",
"\n",
"\n",
"where each term is a $k$-fold iterated integral of $X$ with multi-index $i_1,...,i_k$:\n",
"\\begin{equation}\n",
" S(X)_{a, b}^{i_1,...,i_k} = \\int_{a<t_k<b}...\\int_{a<t_1<t_2} \\mathrm{d}X_{t_1}^{i_1}...\\mathrm{d}X_{t_k}^{i_k}.\n",
" \\label{eq:sig_moments}\n",
"\\end{equation}\n",
"\n",
"$$\n",
"S(X)_{a, b}^{i_1,...,i_k} = \\int_{a<t_k<b}...\\int_{a<t_1<t_2} \\mathrm{d}X_{t_1}^{i_1}...\\mathrm{d}X_{t_k}^{i_k}.\n",
"$$\n",
"\n",
"This defines a graded sequence of numbers associated with a path which is known to characterise it up to a generalised form of reparameterisation [2]. One can think of the signature as a collection of summary statistics that determine a path (almost) uniquely. Furthermore, any continuous function on the path $X$ can be approximated arbitrarily well as a linear function on its signature [3]; the signature unravels the non-linearities of functions on the space of unparameterised paths."
]
},
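To make the iterated-integral definition in the cell above concrete, the following is a minimal pure-Python sketch that evaluates the depth-1 and depth-2 terms exactly for a piecewise-linear path. It is not aeon's implementation; the function name `signature_depth2` and the list-of-points input format are assumptions made for this illustration. On each linear segment with increment $\Delta$, the depth-2 integral contributes (displacement so far)$^i \cdot \Delta^k + \tfrac{1}{2}\Delta^i\Delta^k$.

```python
def signature_depth2(path):
    """Depth-1 and depth-2 signature terms of a piecewise-linear path.

    `path` is a sequence of points, each a sequence of d floats
    (hypothetical input format chosen for this sketch).
    Returns (level1, level2) where level1[i] = S^i and level2[i][k] = S^{i,k}.
    """
    d = len(path[0])
    level1 = [0.0] * d
    level2 = [[0.0] * d for _ in range(d)]
    for start, end in zip(path, path[1:]):
        delta = [e - s for s, e in zip(start, end)]
        for i in range(d):
            for k in range(d):
                # level1[i] still holds X^i - X^i_a at the start of this segment
                level2[i][k] += level1[i] * delta[k] + 0.5 * delta[i] * delta[k]
        # Only now advance the running displacement to the end of the segment
        for i in range(d):
            level1[i] += delta[i]
    return level1, level2


# Move right by 1, then up by 1.
path = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
level1, level2 = signature_depth2(path)
levy_area = 0.5 * (level2[0][1] - level2[1][0])

print(level1)     # [1.0, 1.0]
print(level2)     # [[0.5, 1.0], [0.0, 0.5]]
print(levy_area)  # 0.5
```

The depth-1 terms are the total increments of each coordinate, while the antisymmetric part of the depth-2 terms is the signed area between the path and its chord. Traversing the same two moves in the opposite order (up, then right) leaves `level1` unchanged but flips the sign of `levy_area`, which is how depth 2 distinguishes the order of events.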
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### A Visualisation\n",
"To give an idea of what the signature terms represent physically, we consider a patient in an ICU where we are tracking their systolic blood pressure (SBP) and heart rate (HR) changing in time. This can be represented as a path in $\\mathbb{R}^3$ (assuming time is included as a channel).\n",
"\n",
"[signature_visualisation](img/signatures_visualisation.png)\n",
"\n",
"The plot above sketches two scenarios of how such a path might look. We are assuming here an implicit time dimension for each plot such that the path is traversed from left to right along the blue line.\n",
"There is a good introduction to signatures \n",
"[here](https://medium.com/@ti.tomhortons/intuitive-understandings-of-signature-method-and-practical-examples-in-machine-learning-4586ecf24926).\n",
"\n",
"#### Depth 1:\n",
"The signature terms to depth 1 are simply the changes of each of the variables over the interval, in the image this is the $\\Delta \\text{HR}$ and $\\Delta \\text{SBP}$ terms. Note that these values are the same in each case.\n",
"\n",
"#### Depth 2:\n",
"The second level gives us the signed areas (the shaded orange regions), where the orientation of the left most plot is such that the negatively signed area is produced whereas the second gives the positive value, and thus, at order 2 in the signature we now have sufficient information to discriminate between these two situations where in the first rise in heart rate occurs before (or at least, initially faster than) the rise in blood pressure, and vice versa.\n",
"\n",
"\n",
"#### Depth > 2:\n",
"Depths larger than 2 become more difficult to visualise graphically, however the idea is similar to that of the depth 2 case where we saw that the signature produced information on whether the increase in HR or SBP appeared to be happening first, along with some numerical quantification of how much this was happening. At higher orders the signature is doing something similar, but now with three events, rather than two. The signature picks out structural information regarding the order in which events occur."
"James Morrill contributed this code as part of his \n",
"[PhD thesis](https://ora.ox.ac.uk/objects/uuid:44cb30f8-6dc8-4e0e-8347-14d40452c3e6/files/d8049g5436).\n"
]
},
{
@@ -64,9 +69,9 @@
"The signature is a natural tool to apply in problems related to time-series analysis. As described above it can convert multi-dimensional time-series data into static features that represent information about the sequential nature of the time-series, that can be fed through a standard machine learning model.\n",
"\n",
"A simplistic view of how this works is as follows:\n",
"\\begin{equation}\n",
" \\text{Model}(\\text{Signature}(\\text{Sequential data}))) = \\text{Predictions}\n",
"\\end{equation}"
"$$\n",
"\\text{Model}(\\text{Signature}(\\text{Sequential data})) = \\text{Predictions}\n",
"$$"
]
},
{
@@ -100,7 +105,6 @@
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": true,
"execution": {
@@ -111,48 +115,57 @@
},
"jupyter": {
"outputs_hidden": true
},
"ExecuteTime": {
"end_time": "2025-07-12T13:20:25.970545Z",
"start_time": "2025-07-12T13:20:21.197682Z"
}
},
"outputs": [],
"source": [
"# Some additional imports we will use\n",
"from sklearn.ensemble import RandomForestClassifier\n",
"from sklearn.metrics import accuracy_score\n",
"\n",
"from aeon.datasets import load_unit_test"
]
],
"outputs": [],
"execution_count": 1
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"execution": {
"iopub.execute_input": "2021-06-25T12:41:11.529792Z",
"iopub.status.busy": "2021-06-25T12:41:11.529157Z",
"iopub.status.idle": "2021-06-25T12:41:11.609172Z",
"shell.execute_reply": "2021-06-25T12:41:11.609562Z"
},
"ExecuteTime": {
"end_time": "2025-07-12T13:20:26.002672Z",
"start_time": "2025-07-12T13:20:25.977454Z"
}
},
"outputs": [],
"source": [
"# Load an example dataset\n",
"train_x, train_y = load_unit_test(split=\"train\")\n",
"test_x, test_y = load_unit_test(split=\"test\")"
]
],
"outputs": [],
"execution_count": 2
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Overview\n",
"We provide the following:\n",
"- **aeon.transformers.panel.signature_based.SignatureTransformer** - An sklearn transformer that provides the functionality to apply the signature method with some choice of variations as noted above.\n",
"- **aeon.transformations.collection.signature_based.SignatureTransformer** - An sklearn \n",
"transformer that provides the functionality to apply the signature method with some choice of variations as noted above.\n",
"- **aeon.classification.feature_based.SignatureClassifier** - This provides a simple interface to append a classifier to the SignatureTransformer class."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": true,
"execution": {
@@ -163,13 +176,18 @@
},
"jupyter": {
"outputs_hidden": true
},
"ExecuteTime": {
"end_time": "2025-07-12T13:20:27.077433Z",
"start_time": "2025-07-12T13:20:26.846292Z"
}
},
"outputs": [],
"source": [
"from aeon.classification.feature_based import SignatureClassifier\n",
"from aeon.transformations.collection.signature_based import SignatureTransformer"
]
],
"outputs": [],
"execution_count": 3
},
{
"cell_type": "markdown",