
Commit 6089a99

Authored by yarikoptic, MatthewMiddlehurst and baraline

[MNT] Add codespell support (config, workflow to detect/not fix) and make it fix a "few" typos (#2653)

* Add github action to codespell main on push and PRs
* Add rudimentary codespell config
* Add pre-commit definition for codespell
* Some skips for codespell -- lots of work todo
* [DATALAD RUNCMD] run codespell throughout fixing typos automagically (but ignoring overall fail due to ambiguous ones)

  === Do not change lines below ===
  { "chain": [], "cmd": "codespell -w || :", "exit": 0, "extra_inputs": [], "inputs": [], "outputs": [], "pwd": "." }
  ^^^ Do not change lines above ^^^

* codespell notebook and testing
* incorrect typos and notebooks
* more typos
* config and fixes
* Update similarity_search.ipynb: fix notebook typos
* update skip
* temp typo and annotation workflow
* annotations
* fix typos

Co-authored-by: Matthew Middlehurst <pfm15hbu@gmail.com>
Co-authored-by: Antoine Guillaume <antoine.guillaume45@gmail.com>
1 parent 76abbdd commit 6089a99

File tree

167 files changed: +2579 −1768 lines


.github/actions/numba_cache/action.yml

Lines changed: 1 addition & 1 deletion
@@ -62,6 +62,6 @@ runs:
         path: ${{ github.workspace }}/.numba_cache
         # Try restore using today's date
         key: numba-${{ inputs.cache_name }}-${{ inputs.runner_os }}-${{ inputs.python_version }}-${{ env.CURRENT_DATE }}
-        # If cant restore with today's date try another cache (without date)
+        # If can't restore with today's date try another cache (without date)
         restore-keys: |
           numba-${{ inputs.cache_name }}-${{ inputs.runner_os }}-${{ inputs.python_version }}-
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+fpr
+mape
+recuse
+strat

.github/workflows/pr_precommit.yml

Lines changed: 13 additions & 0 deletions
@@ -60,3 +60,16 @@ jobs:
         with:
           commit_message: Automatic `pre-commit` fixes
           commit_user_name: aeon-actions-bot[bot]
+
+  codespell-annotations:
+    runs-on: ubuntu-24.04
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
+      - name: Annotate locations with typos
+        uses: codespell-project/codespell-problem-matcher@v1
+
+      - name: Codespell
+        uses: codespell-project/actions-codespell@v2

.pre-commit-config.yaml

Lines changed: 8 additions & 0 deletions
@@ -78,3 +78,11 @@ repos:
     hooks:
       - id: check-manifest
         stages: [ manual ]
+
+  - repo: https://github.com/codespell-project/codespell
+    # Configuration for codespell is in pyproject.toml
+    rev: v2.4.1
+    hooks:
+      - id: codespell
+        additional_dependencies:
+          - tomli # for python_version < '3.11'
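The new hook's comment points at pyproject.toml for codespell's settings, but that file's contents are not part of this excerpt. As a hypothetical sketch only, a `[tool.codespell]` table typically looks like the following (the keys are real codespell options; the values here are illustrative, not aeon's actual configuration):

```toml
# Illustrative only -- aeon's real skip/ignore values are not shown in this commit.
[tool.codespell]
# Words codespell flags but the project uses legitimately (e.g. variable names).
ignore-words-list = "fpr,mape,recuse,strat"
# Globs/paths codespell should not scan.
skip = "*.ipynb,./.git,./build"
check-hidden = true
```

The `tomli` dependency in the hook above exists precisely so codespell can read this table on Python versions before 3.11, where `tomllib` is not in the standard library.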

aeon/anomaly_detection/series/distance_based/_kmeans.py

Lines changed: 1 addition & 1 deletion
@@ -42,7 +42,7 @@ class KMeansAD(BaseSeriesAnomalyDetector):

     stride : int, default=1
         The stride of the sliding window. The stride determines how many time points
-        the windows are spaced appart. A stride of 1 means that the window is moved one
+        the windows are spaced apart. A stride of 1 means that the window is moved one
         time point forward compared to the previous window. The larger the stride, the
         fewer windows are created, which leads to noisier anomaly scores.
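The corrected docstring describes how stride spaces consecutive windows. A minimal sketch of that behaviour (illustrative only, not aeon's internal implementation):

```python
import numpy as np

def sliding_windows(x, window_size, stride=1):
    """Extract sliding windows: consecutive windows are spaced `stride`
    time points apart, so a larger stride yields fewer windows."""
    starts = range(0, len(x) - window_size + 1, stride)
    return np.array([x[s:s + window_size] for s in starts])

x = np.arange(10)
print(sliding_windows(x, window_size=4, stride=1).shape)  # (7, 4): window moves one point at a time
print(sliding_windows(x, window_size=4, stride=3).shape)  # (3, 4): fewer windows, spaced 3 apart
```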

aeon/anomaly_detection/series/distance_based/_merlin.py

Lines changed: 5 additions & 5 deletions
@@ -102,26 +102,26 @@ def _predict(self, X):

         r = 2 * np.sqrt(self.min_length)
         distances = np.full(len(lengths), -1.0)
-        indicies = np.full(len(lengths), -1)
+        indices = np.full(len(lengths), -1)

-        indicies[0], distances[0] = self._find_index(X, lengths[0], r, np.multiply, 0.5)
+        indices[0], distances[0] = self._find_index(X, lengths[0], r, np.multiply, 0.5)

         for i in range(1, min(5, len(lengths))):
             r = distances[i - 1] * 0.99
-            indicies[i], distances[i] = self._find_index(
+            indices[i], distances[i] = self._find_index(
                 X, lengths[i], r, np.multiply, 0.99
             )

         for i in range(min(5, len(lengths)), len(lengths)):
             m = mean(distances[i - 5 : i])
             s = std(distances[i - 5 : i])
             r = m - 2 * s
-            indicies[i], distances[i] = self._find_index(
+            indices[i], distances[i] = self._find_index(
                 X, lengths[i], r, np.subtract, s
             )

         anomalies = np.zeros(X.shape[0], dtype=bool)
-        for i in indicies:
+        for i in indices:
             if i > -1:
                 anomalies[i] = True
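The second loop in the diff above seeds each search radius `r` from the mean and standard deviation of the previous five discord distances. That update can be isolated in a standalone sketch (illustrative only; the function name is made up and this is not the aeon implementation):

```python
import numpy as np

def next_radius(prev_distances):
    """Adaptive radius as in the loop above: mean minus two (population)
    standard deviations over the last five discord distances."""
    window = np.asarray(prev_distances[-5:], dtype=float)
    return window.mean() - 2 * window.std()

print(next_radius([4.0, 4.0, 4.0, 4.0, 4.0]))  # 4.0: zero spread keeps r at the mean
```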

aeon/anomaly_detection/series/outlier_detection/_stray.py

Lines changed: 2 additions & 2 deletions
@@ -20,7 +20,7 @@ class STRAY(BaseSeriesAnomalyDetector):
     ability to detect clusters of outliers in multidimensional data without
     requiring a model of the typical behavior of the system. However, it suffers
     from some limitations that affect its accuracy. STRAY is an extension of
-    HDoutliers that uses extreme value theory for the anomolous threshold
+    HDoutliers that uses extreme value theory for the anomalous threshold
     calculation, to deal with data streams that exhibit non-stationary behavior.

     Parameters
@@ -39,7 +39,7 @@ class STRAY(BaseSeriesAnomalyDetector):
         Proportion of possible candidates for outliers. This defines the starting point
         for the bottom up searching algorithm.
     size_threshold : int, default=50
-        Sample size to calculate an emperical threshold.
+        Sample size to calculate an empirical threshold.
     outlier_tail : str {"min", "max"}, default="max"
         Direction of the outlier tail.

aeon/base/_estimators/compose/collection_ensemble.py

Lines changed: 1 addition & 1 deletion
@@ -42,7 +42,7 @@ class BaseCollectionEnsemble(ComposableEstimatorMixin, BaseCollectionEstimator):
         Only used if weights is a float. The method used to generate a performance
         estimation from the training data set i.e. cross-validation.
         If None, predictions are made using that estimators fit_predict or
-        fit_predict_proba methods. These are somtimes overridden for efficient
+        fit_predict_proba methods. These are sometimes overridden for efficient
         performance evaluations, i.e. out-of-bag predictions.
         If int or sklearn object input, the parameter is passed directly to the cv
         parameter of the cross_val_predict function from sklearn.

aeon/benchmarking/results_loaders.py

Lines changed: 2 additions & 2 deletions
@@ -201,7 +201,7 @@ def estimator_alias(name: str) -> str:
 def get_available_estimators(
     task: str = "classification", as_list: bool = False
 ) -> Union[pd.DataFrame, list]:
-    """Get a DataFrame of estimators avialable for a specific learning task.
+    """Get a DataFrame of estimators available for a specific learning task.

     Parameters
     ----------
@@ -251,7 +251,7 @@ def get_estimator_results(

     Parameters
     ----------
-    estimators : str ot list of str
+    estimators : str or list of str
         Estimator name or list of estimator names to search for. See
         get_available_estimators, aeon.benchmarking.results_loading.NAME_ALIASES or
         the directory at path for valid options.

aeon/benchmarking/stats.py

Lines changed: 1 addition & 1 deletion
@@ -57,7 +57,7 @@ def nemenyi_test(ordered_avg_ranks, n_datasets, alpha):
     ordered_avg_ranks : np.array
         Average ranks of estimators.
     n_datasets : int
-        Mumber of datasets.
+        Number of datasets.
     alpha : float
         alpha level for Nemenyi test.
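The corrected docstring lists average ranks, the number of datasets and alpha, which are the inputs to the standard Nemenyi critical difference. A sketch of that standard formula (not aeon's code; here the studentized-range quantile `q_alpha` is passed in by the caller rather than looked up from a table):

```python
import math

def nemenyi_critical_difference(n_estimators, n_datasets, q_alpha):
    """Standard Nemenyi critical difference: two estimators differ
    significantly if their average ranks differ by more than this."""
    return q_alpha * math.sqrt(n_estimators * (n_estimators + 1) / (6 * n_datasets))

# q_alpha is roughly 2.343 for k=3 estimators at alpha=0.05.
print(nemenyi_critical_difference(3, 8, 2.343))
```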
