
Commit 8680b75

JoaoGranja and Ahmed Elghareeb authored
Use poetry (#19)
* Adds poetry template to replace setup.py and requirements.txt (#18). Added all library requirements from requirements.txt. Updated the aif360 package to a fork that contains a bug fix.
* Add PyPI README. Update pyproject.toml with urls, classifiers, license, and authors.

---------

Co-authored-by: Ahmed Elghareeb <ahmeds.elghareeb@gmail.com>
1 parent 25ff1a8 commit 8680b75

File tree

3 files changed: +1304 −0 lines changed

PyPI_README.md

Lines changed: 239 additions & 0 deletions

[![Continuous Integration](https://github.com/EqualityAI/EqualityML/actions/workflows/ci.yml/badge.svg)](https://github.com/EqualityAI/EqualityML/actions/workflows/ci.yml)
[![License](https://img.shields.io/github/license/EqualityAI/EqualityML.svg?color=blue)](https://github.com/EqualityAI/EqualityML/blob/main/LICENSE)
[![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-v2.0%20adopted-ff69b4.svg)](https://github.com/EqualityAI/EqualityML/blob/main/CODE_OF_CONDUCT.md)
<!---
[![Documentation](https://readthedocs.org/projects/aif360/badge/?version=latest)](http://aif360.readthedocs.io/en/latest/?badge=latest)
[![PyPI version](https://badge.fury.io/py/equalityml.svg)](https://badge.fury.io/py/equalityml)
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/equalityml)](https://cran.r-project.org/package=equalityml)
--->

# Equality AI `EqualityML`

### Let's end algorithmic bias together!

[Equality AI (EAI)](https://equalityai.com/) is a public-benefit corporation dedicated to providing developers with evidence-based tools to end algorithmic bias. Our tools are built by developers, for developers. We know that developers want their models to be fair, but we also understand that bias is <b>difficult and intimidating</b>.

The EAI `EqualityML` repository provides tools and guidance for incorporating fairness metrics and bias mitigation methods into model fitting, so as to safeguard the people on the receiving end of our models from bias.

If you like what we're doing, give us a star and join our [EAI Manifesto](https://equalityai.com/community/#manifesto)!</br>

> We have extended `EqualityML` to include other aspects of Responsible AI (see the full framework in <b>Figure 1</b>) and collaboration features to create our Beta MLOps Developer Studio. <b>Become a Beta user by going to our [website!](https://equalityai.com/)</b>

![](https://github.com/EqualityAI/EqualityML/blob/main/img/framework.png?raw=true)
<sub><b>Figure 1:</b> Full Responsible AI Framework.</sub>

## Introduction
Incorporating bias mitigation methods and fairness metrics into the traditional end-to-end MLOps workflow is called fairness-based machine learning (ML), or fair machine learning. However, fair ML comes with its own challenges. We assembled a diverse team of statisticians and ML experts to provide evidence-based guidance on fairness metric selection and use, and validated code to properly run bias mitigation methods.

<details>
<summary> Click to read our findings: </summary>

#### Fairness Metric
* A statistical measure of the output of a machine learning model, based on a mathematical definition of fairness.

> [Fairness Metric Guide:](https://github.com/EqualityAI/EqualityML/raw/main/Fairness%20Metrics%20User%20Manual.pdf)
We have combined fairness metrics and bias mitigation into a unified syntax.</br><sub> Statistical Parity | Conditional Statistical Parity | Negative Predictive Parity | Equal Opportunity | Balance for Positive Class | Predictive Parity | Well Calibration | Calibration | Conditional Use Accuracy | Predictive Equality | Balance for Negative Class | Equalized Odds | Overall Balance
</sub>
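
For intuition about what one of these metrics measures, here is a minimal sketch (plain `pandas`, not the `EqualityML` API) that computes a statistical parity ratio by hand, using one common convention: the positive-prediction rate of the unprivileged group divided by that of the privileged group. The column names (`sex`, `Y_pred`) and values are hypothetical.

```python
import pandas as pd

# Hypothetical model predictions with a binary sensitive attribute "sex"
# (1 = privileged class, 0 = unprivileged class).
predictions = pd.DataFrame({
    "sex":    [1, 1, 1, 1, 0, 0, 0, 0],
    "Y_pred": [1, 1, 0, 1, 1, 0, 0, 0],
})

# Positive-prediction rate for each level of the sensitive attribute.
rates = predictions.groupby("sex")["Y_pred"].mean()

# Statistical parity ratio: unprivileged rate / privileged rate.
# A value close to 1 indicates parity; values far from 1 suggest bias.
parity_ratio = rates[0] / rates[1]
print(f"Statistical parity ratio = {parity_ratio:.2f}")  # 0.33 for this toy data
```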

#### Bias Mitigation
* Methods or algorithms applied to a machine learning dataset or model to improve the fairness of the model output. Many mitigation methods have been proposed in the literature; they can be broadly classified by whether the method is applied to the data set (pre-processing), during model fitting (in-processing), or to the model predictions (post-processing).

> [Bias Mitigation Guide:](https://github.com/EqualityAI/EqualityML/blob/main/Fairness%20Metrics%20User%20Manual.pdf)</br>
<sub> Resampling | Reweighting | Disparate Impact Remover | Correlation Remover
</sub>

![](https://github.com/EqualityAI/EqualityML/blob/main/img/pre_in_post_nw.png?raw=true)
<sub><b>Figure 2:</b> Bias mitigation can be performed in the pre-processing, in-processing, and post-processing stages of a model.</sub>
<br>

> Need a specific metric or method? [Just let us know!](https://equalityai.slack.com/join/shared_invite/zt-1claqpebo-MnGnGoqCM9Do~40HqbSaww#/shared-invite/email)
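
To give a concrete sense of how a pre-processing method works, below is a minimal sketch of reweighting: each (group, label) combination receives a sample weight so that, after weighting, the sensitive attribute and the label look statistically independent. This is an illustrative sketch only, not the `EqualityML` implementation, and the column names are hypothetical.

```python
import pandas as pd

# Hypothetical training data with sensitive attribute "sex" and binary label "Y".
df = pd.DataFrame({
    "sex": [1, 1, 1, 1, 1, 1, 0, 0, 0, 0],
    "Y":   [1, 1, 1, 0, 1, 0, 1, 0, 0, 0],
})

# Weight = expected frequency of each (sex, Y) cell under independence,
# divided by its observed frequency.
p_sex = df["sex"].value_counts(normalize=True)
p_y = df["Y"].value_counts(normalize=True)
observed = df.groupby(["sex", "Y"]).size() / len(df)

weights = df.apply(
    lambda row: p_sex[row["sex"]] * p_y[row["Y"]] / observed[(row["sex"], row["Y"])],
    axis=1,
)

# The weights can then be passed to any estimator that accepts sample_weight,
# e.g. LogisticRegression().fit(X, y, sample_weight=weights).
print(weights.round(2).tolist())
```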

#### Potential Uses

* Bias mitigation methods are employed to address bias in data and/or machine learning models, and fairness metrics are needed to mathematically represent the fairness or bias levels of an ML model.

| Use | Description |
|:--------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| As a metric | Quantify a measure of fairness (a.k.a. a fairness metric) targeting a specific source of bias |
| Evaluate fairness | Fairness metrics can be used to mathematically represent the fairness levels of an ML model. This can also be used to monitor a model. |
| Create parity on fairness | Unlike model performance metrics (e.g., loss, accuracy, etc.), fairness metrics affect your final model selection by creating parity (i.e., equality) on appropriate fairness metrics before model deployment. |
| Select most fair model | Balance fairness with performance metrics when selecting the final model. |
| Apply methods to improve the fairness & performance tradeoff | Apply bias mitigation methods to improve fairness without compromising performance. |

<sub><b>Table 1:</b> The potential uses for fairness metrics and bias mitigation methods.
</sub>

<b>Note:</b> Parity is achieved when a fairness metric (such as the percent of positive predictions) has the same value across all levels of a sensitive attribute. <i>Sensitive attributes</i> are attributes such as race, gender, age, and other patient attributes that are of primary concern when it comes to fairness, and are typically protected by law.
<br></br>

Through these steps we <b>safeguard against bias</b> by:
> 1. Creating metrics targeting sources of bias to balance alongside our performance metrics in evaluation, model selection, and monitoring.
> 2. Applying bias mitigation methods to improve fairness without compromising performance.
<br></br>
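
One way to read the "Select most fair model" row above: when several candidate models have comparable performance, prefer the one whose fairness metric is closest to parity. A minimal, hypothetical sketch of such a selection rule follows; the model names, scores, and the 3-point accuracy tolerance are illustrative assumptions, not Equality AI guidance.

```python
# Hypothetical (accuracy, statistical parity ratio) results for candidate models.
candidates = {
    "logistic_regression": {"accuracy": 0.81, "parity_ratio": 0.62},
    "random_forest":       {"accuracy": 0.83, "parity_ratio": 0.55},
    "logreg_reweighted":   {"accuracy": 0.80, "parity_ratio": 0.91},
}

# Keep models within 3 percentage points of the best accuracy,
# then pick the one whose parity ratio is closest to 1 (perfect parity).
best_acc = max(m["accuracy"] for m in candidates.values())
shortlist = {name: m for name, m in candidates.items()
             if best_acc - m["accuracy"] <= 0.03}
selected = min(shortlist, key=lambda name: abs(1 - shortlist[name]["parity_ratio"]))
print(selected)  # -> logreg_reweighted
```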

</details>

## EAI `EqualityML` Workflow
We have conducted an extensive literature review and theoretical analysis of dozens of fairness metrics and mitigation methods. The theoretical properties of those mitigation methods were analyzed to determine their suitability under various conditions, and the result is our framework for a pre-processing workflow.

| Pre-processing Workflow | Tool or Guidance provided |
|:--------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 1. Select Fairness Metric | Use our [Fairness Metric Selection Questionnaire & Tree](https://github.com/EqualityAI/EqualityML/blob/main/Equality%20AI%20Fairness%20Metric%20Selection%20Questionnaire%20%26%20Tree.pdf) to determine appropriate fairness metric(s) |
| 2. Data Preparation | |
| 3. Fit Prediction Model | |
| 4. Compute Model Results and Evaluate Fairness Metric | Use `EqualityML` method `fairness_metric` to evaluate the fairness of a model |
| 5. Run Bias Mitigation | Use `EqualityML` method `bias_mitigation` to run various bias mitigation methods on your dataset |
| 6. Compute Model Results and Fairness Metric After Mitigation | `fairness_metric`, `bias_mitigation` |
| 7. Compare Model Results and Fairness Metric Before and After Mitigation | `fairness_metric`, `bias_mitigation` |

<sub><b>Table 2:</b> The Equality AI recommended pre-processing workflow, with the tools and guidance made available per step.
</sub> </br>

We recommend assessing the fairness of the same ML model after bias mitigation is applied. By comparing the predictions before and after mitigation, we can assess whether, and to what extent, fairness has improved, and examine the trade-off between the accuracy and fairness of the machine learning model.

> In-processing and Post-processing are still under development. Do you need this now? [Let us know!](https://equalityai.slack.com/join/shared_invite/zt-1claqpebo-MnGnGoqCM9Do~40HqbSaww#/shared-invite/email)

## Guidance on selecting Fairness Metrics
To make fairness metric selection easy, we have provided a few essential questions you must answer to identify the appropriate fairness metric for your use case. [Click here for the questionnaire](https://github.com/EqualityAI/EqualityML/blob/main/Equality%20AI%20Fairness%20Metric%20Selection%20Questionnaire%20%26%20Tree.pdf). Complete the questionnaire, then refer to the scoring guide to map your answers to the appropriate metrics.

After identifying the important fairness criteria, we recommend trying multiple bias mitigation strategies to optimize the fairness-performance tradeoff.</br>

## `EqualityML` Installation

## Python
The `EqualityML` Python package can be installed from [PyPI](https://pypi.org/project/equalityml/):

```bash
pip install equalityml
```

### Manual Installation
Clone the latest version of this repository:
```bash
git clone https://github.com/EqualityAI/EqualityML.git
```
In the root directory of the project, run the command:
```bash
pip install -e '.[all]'
```

### Package Testing
To run the test suite for the EqualityML package, install the test dependencies first and then call pytest:

```sh
pip install -e '.[tests]'
pytest tests
```

### Quick Tour

Check out the example below to see how EqualityML can be used to assess fairness metrics and mitigate unwanted bias in the dataset.

```python
from sklearn.linear_model import LogisticRegression
from equalityml import FAIR
import numpy as np
import pandas as pd

# Sample unfair dataset
random_col = np.random.normal(size=30)
sex_col = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
           0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
weight_col = [80, 75, 70, 65, 60, 85, 70, 75, 70, 70, 70, 80, 70, 70, 70, 80, 75, 70, 65, 70,
              70, 75, 80, 75, 75, 70, 65, 70, 75, 65]
target_col = [1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1,
              0, 1, 0, 1, 1, 0, 0, 1, 1, 0]
training_data = pd.DataFrame({"random": random_col, "sex": sex_col, "weight": weight_col,
                              "Y": target_col})

# Train a machine learning model (for example LogisticRegression)
ml_model = LogisticRegression()
ml_model.fit(training_data.drop(columns="Y"), training_data["Y"])

# Instantiate a FAIR object
fair_obj = FAIR(ml_model=ml_model,
                training_data=training_data,
                target_variable="Y",
                protected_variable="sex",
                privileged_class=1)

# Evaluate a fairness metric (for example statistical parity ratio)
metric_name = 'statistical_parity_ratio'
fairness_metric = fair_obj.fairness_metric(metric_name)

# In case the model is unfair in terms of the checked fairness metric (value is not close to 1),
# EqualityML provides a range of methods to try to mitigate bias in machine learning models.
# For example, we can use 'resampling' to perform mitigation on the training dataset.

mitigation_method = "resampling"
mitigation_result = fair_obj.bias_mitigation(mitigation_method)

# Now we can re-train the machine learning model on the mitigated data and
# evaluate the fairness metric again
mitigated_data = mitigation_result['training_data']
ml_model.fit(mitigated_data.drop(columns="Y"), mitigated_data["Y"])

fair_obj.update_classifier(ml_model)
new_fairness_metric = fair_obj.fairness_metric(metric_name)

# Print the unmitigated fairness metric
print(f"Unmitigated fairness metric = {fairness_metric}")

# Print the mitigated fairness metric
print(f"Mitigated fairness metric = {new_fairness_metric}")

# All available fairness metrics and bias mitigation methods can be printed by calling:
fair_obj.print_fairness_metrics()
fair_obj.print_bias_mitigation_methods()
```
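
Since the guidance above recommends trying multiple bias mitigation strategies and comparing the results, a natural extension of the quick tour is to loop over several candidate methods. The sketch below assumes the objects from the example (`fair_obj`, `metric_name`) are still in scope, and that each pre-processing method returns mitigated training data under the `'training_data'` key as `resampling` does above; the method list is illustrative, so check `fair_obj.print_bias_mitigation_methods()` for the exact identifiers available.

```python
# Compare several candidate mitigation strategies on the same FAIR object.
candidate_methods = ["resampling"]  # add further names from print_bias_mitigation_methods()

results = {}
for method in candidate_methods:
    mitigation_result = fair_obj.bias_mitigation(method)
    mitigated_data = mitigation_result['training_data']

    # Re-train the model on the mitigated data and re-evaluate fairness.
    ml_model = LogisticRegression()
    ml_model.fit(mitigated_data.drop(columns="Y"), mitigated_data["Y"])
    fair_obj.update_classifier(ml_model)
    results[method] = fair_obj.fairness_metric(metric_name)

for method, value in results.items():
    print(f"{method}: {metric_name} = {value}")
```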

## R
The `EqualityML` R package can be installed from [CRAN](https://cran.r-project.org/web/packages/equalityml/index.html):
```r
install.packages("equalityml")
```
or the development version from GitHub:
```r
devtools::install_github("EqualityAI/equalityml/equalityml-r")
```
For more details regarding the R package, please check [here](https://github.com/EqualityAI/EqualityML/tree/main/equalityml-r).

## Responsible AI Takes a Community
The connections and trade-offs between fairness, explainability, and privacy require a holistic approach to Responsible AI development in the machine learning community. We are starting with the principle of fairness and working towards a solution that incorporates multiple aspects of Responsible AI for data scientists and healthcare professionals. We have much more in the works, and we want to know: what do you need? Do you have a Responsible AI challenge you need to solve? [Drop us a line and let’s see how we can help!](https://equalityai.slack.com/join/shared_invite/zt-1claqpebo-MnGnGoqCM9Do~40HqbSaww#/shared-invite/email)

## Contributing to the project
Equality AI uses both GitHub and Slack to manage our open source community. To participate:

1. Join the Slack community (https://equalityai.com/slack)
    + Introduce yourself in the #Introductions channel. We're all friendly people!
2. Check out the [CONTRIBUTING](https://github.com/EqualityAI/EqualityML/blob/main/CONTRIBUTING.md) file to learn how to contribute to our project, report bugs, or make feature requests.
3. Try out [`EqualityML`](https://github.com/EqualityAI/EqualityML)
    + Hit the top right "star" button on GitHub to show your love!
    + Follow the recipe above to use the code.
4. Provide feedback on your experience using the [GitHub discussions](https://github.com/EqualityAI/EqualityML/discussions) or the [Slack #support](https://equalityai.slack.com/archives/C03HF7G4N0Y) channel
    + For any questions or problems, send a message on Slack, or send an email to support@equalityai.com.
