Commit 737e265

Implement the codespell pre-commit hook

Author: Alex Lee
Parent: 66c2623

13 files changed, 39 additions and 16 deletions

.codespell-whitelist.txt

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+nd
+cace

.pre-commit-config.yaml

Lines changed: 18 additions & 0 deletions
@@ -41,3 +41,21 @@ repos:
         # needed to make excludes in pyproject.toml work
         # see here https://github.com/econchick/interrogate/issues/60#issuecomment-735436566
         pass_filenames: false
+  - repo: https://github.com/codespell-project/codespell
+    rev: v2.3.0
+    hooks:
+      - id: codespell
+        args: [
+            "-S",
+            "*.csv",
+            "-S",
+            "*.ipynb",
+            "-S",
+            "pyproject.toml",
+            "--ignore-words=.codespell-whitelist.txt",
+            # Write changes in place
+            "-w",
+          ]
+        additional_dependencies:
+          # Support pyproject.toml configuration
+          - tomli

README.md

Lines changed: 1 addition & 1 deletion
@@ -89,7 +89,7 @@ This is appropriate when you have multiple units, one of which is treated. You b
 > The data (treated and untreated units), pre-treatment model fit, and counterfactual (i.e. the synthetic control) are plotted (top). The causal impact is shown as a blue shaded region. The Bayesian analysis shows shaded Bayesian credible regions of the model fit and counterfactual. Also shown is the causal impact (middle) and cumulative causal impact (bottom).

 ### Geographical lift (Geolift)
-We can also use synthetic control methods to analyse data from geographical lift studies. For example, we can try to evaluate the causal impact of an intervention (e.g. a marketing campaign) run in one geographical area by using control geographical areas which are similar to the intervention area but which did not recieve the specific marketing intervention.
+We can also use synthetic control methods to analyse data from geographical lift studies. For example, we can try to evaluate the causal impact of an intervention (e.g. a marketing campaign) run in one geographical area by using control geographical areas which are similar to the intervention area but which did not receive the specific marketing intervention.

 ### ANCOVA

causalpy/data/simulate_data.py

Lines changed: 2 additions & 2 deletions
@@ -291,7 +291,7 @@ def generate_ancova_data(
     N=200, pre_treatment_means=np.array([10, 12]), treatment_effect=2, sigma=1
 ):
     """
-    Generate ANCOVA eample data
+    Generate ANCOVA example data

     Example
     --------
@@ -440,7 +440,7 @@ def generate_seasonality(n=12, amplitude=1, length_scale=0.5):


 def periodic_kernel(x1, x2, period=1, length_scale=1, amplitude=1):
-    """Generate a periodic kernal for gaussian process"""
+    """Generate a periodic kernel for gaussian process"""
     return amplitude**2 * np.exp(
         -2 * np.sin(np.pi * np.abs(x1 - x2) / period) ** 2 / length_scale**2
     )

causalpy/experiments/instrumental_variable.py

Lines changed: 1 addition & 1 deletion
@@ -44,7 +44,7 @@ class InstrumentalVariable(BaseExperiment):
     :param model: A PyMC model
     :param priors: An optional dictionary of priors for the
         mus and sigmas of both regressions. If priors are not
-        specified we will substitue MLE estimates for the beta
+        specified we will substitute MLE estimates for the beta
         coefficients. Greater control can be achieved
         by specifying the priors directly e.g. priors = {
             "mus": [0, 0],

causalpy/experiments/inverse_propensity_weighting.py

Lines changed: 1 addition & 1 deletion
@@ -195,7 +195,7 @@ def make_doubly_robust_adjustment(self, ps):
         m1 = sk_lin_reg().fit(X[t == 1].astype(float), self.y[t == 1])
         m0_pred = m0.predict(X)
         m1_pred = m1.predict(X)
-        ## Compromise between outcome and treatement assignment model
+        ## Compromise between outcome and treatment assignment model
         weighted_outcome0 = (1 - t) * (self.y - m0_pred) / (1 - X["ps"]) + m0_pred
         weighted_outcome1 = t * (self.y - m1_pred) / X["ps"] + m1_pred
         return weighted_outcome0, weighted_outcome1, None, None

causalpy/experiments/prepostfit.py

Lines changed: 2 additions & 2 deletions
@@ -311,7 +311,7 @@ class InterruptedTimeSeries(PrePostFit):
     :param data:
         A pandas dataframe
     :param treatment_time:
-        The time when treatment occured, should be in reference to the data index
+        The time when treatment occurred, should be in reference to the data index
     :param formula:
         A statistical model formula
     :param model:
@@ -352,7 +352,7 @@ class SyntheticControl(PrePostFit):
     :param data:
         A pandas dataframe
     :param treatment_time:
-        The time when treatment occured, should be in reference to the data index
+        The time when treatment occurred, should be in reference to the data index
     :param formula:
         A statistical model formula
     :param model:

causalpy/plot_utils.py

Lines changed: 1 addition & 1 deletion
@@ -73,7 +73,7 @@ def plot_xY(
         ax=ax,
         **plot_hdi_kwargs,
     )
-    # Return handle to patch. We get a list of the childen of the axis. Filter for just
+    # Return handle to patch. We get a list of the children of the axis. Filter for just
     # the PolyCollection objects. Take the last one.
     h_patch = list(
         filter(lambda x: isinstance(x, PolyCollection), ax_hdi.get_children())

causalpy/pymc_models.py

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@


 class PyMCModel(pm.Model):
-    """A wraper class for PyMC models. This provides a scikit-learn like interface with
+    """A wrapper class for PyMC models. This provides a scikit-learn like interface with
     methods like `fit`, `predict`, and `score`. It also provides other methods which are
     useful for causal inference.

causalpy/tests/test_pymc_models.py

Lines changed: 1 addition & 1 deletion
@@ -142,7 +142,7 @@ def test_idata_property():
 @pytest.mark.parametrize("seed", seeds)
 def test_result_reproducibility(seed):
     """Test that we can reproduce the results from the model. We could in theory test
-    this with all the model and experiment types, but what is being targetted is
+    this with all the model and experiment types, but what is being targeted is
     the PyMCModel.fit method, so we should be safe testing with just one model. Here
     we use the DifferenceInDifferences experiment class."""
     # Load the data

docs/source/index.md

Lines changed: 1 addition & 1 deletion
@@ -98,7 +98,7 @@ This is appropriate when you have multiple units, one of which is treated. You b
 ![Synthetic Control](./_static/synthetic_control_pymc.svg)

 ### Geographical Lift / Geolift
-We can also use synthetic control methods to analyse data from geographical lift studies. For example, we can try to evaluate the causal impact of an intervention (e.g. a marketing campaign) run in one geographical area by using control geographical areas which are similar to the intervention area but which did not recieve the specific marketing intervention.
+We can also use synthetic control methods to analyse data from geographical lift studies. For example, we can try to evaluate the causal impact of an intervention (e.g. a marketing campaign) run in one geographical area by using control geographical areas which are similar to the intervention area but which did not receive the specific marketing intervention.

 ### ANCOVA

docs/source/knowledgebase/glossary.rst

Lines changed: 5 additions & 5 deletions
@@ -9,11 +9,11 @@ Glossary

    Average treatment effect
    ATE
-      The average treatement effect across all units.
+      The average treatment effect across all units.

    Average treatment effect on the treated
    ATT
-      The average effect of the treatment on the units that recieved it. Also called Treatment on the treated.
+      The average effect of the treatment on the units that received it. Also called Treatment on the treated.

    Change score analysis
       A statistical procedure where the outcome variable is the difference between the posttest and protest scores.
@@ -48,7 +48,7 @@ Glossary

    Local Average Treatment effect
    LATE
-      Also known asthe complier average causal effect (CACE), is the effect of a treatment for subjects who comply with the experimental treatment assigned to their sample group. It is the quantity we're estimating in IV designs.
+      Also known asthe compiler average causal effect (CACE), is the effect of a treatment for subjects who comply with the experimental treatment assigned to their sample group. It is the quantity we're estimating in IV designs.

    Non-equivalent group designs
    NEGD
@@ -76,7 +76,7 @@ Glossary
       Where units are assigned to conditions at random.

    Randomized experiment
-      An emprical comparison used to estimate the effects of treatments where units are assigned to treatment conditions randomly.
+      An empirical comparison used to estimate the effects of treatments where units are assigned to treatment conditions randomly.

    Regression discontinuity design
    RDD
@@ -96,7 +96,7 @@ Glossary

    Treatment on the treated effect
    TOT
-      The average effect of the treatment on the units that recieved it. Also called the average treatment effect on the treated (ATT).
+      The average effect of the treatment on the units that received it. Also called the average treatment effect on the treated (ATT).

    Treatment effect
       The difference in outcomes between what happened after a treatment is implemented and what would have happened (see Counterfactual) if the treatment had not been implemented, assuming everything else had been the same.
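
Review note: the LATE entry in this file changed "complier" to "compiler", which alters the meaning of the term, since CACE stands for the complier average causal effect. The hook runs with `-w`, so corrections are written in place unless the word appears in the ignore file. The toy model below sketches that interaction. The `CORRECTIONS` table and the `fix_text` helper are hypothetical illustrations, not codespell's dictionary or implementation.

```python
import re

# Hypothetical miniature correction table; codespell ships a much larger one.
CORRECTIONS = {
    "recieve": "receive",
    "complier": "compiler",
    "cace": "cache",
}


def fix_text(text, ignore_words=frozenset()):
    """Replace known misspellings, skipping any word listed in ignore_words."""
    ignored = {w.lower() for w in ignore_words}

    def replace(match):
        word = match.group(0)
        key = word.lower()
        if key in ignored or key not in CORRECTIONS:
            return word
        fixed = CORRECTIONS[key]
        # Keep all-caps words (such as acronyms) in upper case.
        return fixed.upper() if word.isupper() else fixed

    return re.sub(r"[A-Za-z]+", replace, text)


sentence = "the complier average causal effect (CACE)"

# Only "cace" is ignored, mirroring the two-entry ignore file in this commit.
only_cace = fix_text(sentence, ignore_words={"cace"})

# Ignoring "complier" as well leaves the glossary term untouched.
both = fix_text(sentence, ignore_words={"cace", "complier"})
```

Under these assumptions, adding `complier` to `.codespell-whitelist.txt` would be one way to keep the glossary term intact.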

pyproject.toml

Lines changed: 3 additions & 0 deletions
@@ -122,3 +122,6 @@ badge-format = "svg"
 extend-select = [
     "I", # isort
 ]
+
+[tools.codespell]
+ignore-words = ".codespell/ignore_words.txt"
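
Review note: codespell's documentation describes reading `pyproject.toml` settings from a `[tool.codespell]` table, so the `[tools.codespell]` table added here may not be picked up. Its `ignore-words` path also differs from the `.codespell-whitelist.txt` file passed in the hook arguments. The sketch below only checks where the table lands when parsed; it assumes Python 3.11+ for `tomllib`, with `tomli` as the fallback.

```python
try:
    import tomllib  # standard library in Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # backport, listed in the hook's additional_dependencies

# The table exactly as added in this commit.
snippet = '''
[tools.codespell]
ignore-words = ".codespell/ignore_words.txt"
'''

config = tomllib.loads(snippet)

# The settings end up under "tools", not under the conventional "tool" table.
found_under_tool = "codespell" in config.get("tool", {})
found_under_tools = "codespell" in config.get("tools", {})
ignore_words = config["tools"]["codespell"]["ignore-words"]
```

If the intent was for codespell to read this table, renaming it to `[tool.codespell]` and pointing `ignore-words` at the file that exists in the repository would likely be needed.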
