Commit 6c7c0a6

requirements.txt to pyproject.toml
1 parent 960f038 commit 6c7c0a6

5 files changed

+1616
-74
lines changed

.python-version

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+3.11

README.md

Lines changed: 55 additions & 67 deletions
@@ -1,7 +1,7 @@
 # vuln-data-science
 
 ![MIT License](https://img.shields.io/badge/License-MIT-yellow.svg)
-![Python Version](https://img.shields.io/badge/Python-3.7%2B-blue.svg)
+![Python Version](https://img.shields.io/badge/Python-3.11%2B-blue.svg)
 
 Welcome to the vuln-data-science repository! This project focuses on applying data science techniques to vulnerability
 management and analysis. Our goal is to explore, analyze, and share insights on vulnerabilities using data science
@@ -48,7 +48,7 @@ professionals.
 - **Data Cleaning**: Techniques to preprocess and clean the data for analysis.
 - **Exploratory Data Analysis**: Visualizations and insights into vulnerability trends.
 - **Predictive Analysis**: Models to predict future vulnerabilities and their potential impact.
-- **Tools & Libraries**: Utilization of tools like Pandas, Polars, Matplotlib, and Scikit-learn for data processing and
+- **Tools & Libraries**: Utilization of tools like Pandas, Matplotlib, Seaborn, and Scikit-learn for data processing and
 analysis.
 
 ## Getting Started
@@ -57,7 +57,7 @@ professionals.
 
 Before you begin, ensure you have the following software installed:
 
-- Python 3.7 or higher
+- Python 3.11 or higher
 
 ### Installation
 
@@ -93,20 +93,20 @@ Before you begin, ensure you have the following software installed:
 5. Install the required dependencies:
 
 ```bash
-pip install -r requirements.txt
+pip install .
+```
+
+Alternatively, if you use Hatch, you can set up the environment with:
+
+```bash
+hatch env create
+hatch shell
 ```
 
 ## Usage
 
 To start exploring the data and running the analyses, open the Jupyter notebooks in the `notebooks` directory. Each
-notebook focuses on a different aspect of the data pipeline:
-
-- `01_data_collection.ipynb`: Collects and aggregates data from various vulnerability sources.
-- `02_data_cleaning.ipynb`: Cleans and preprocesses the raw data for analysis.
-- `03_weighted_vulnerability_scoring.ipynb`: Applies weighted scoring to prioritize vulnerabilities based on multiple
-factors.
-- `04_analysis.ipynb`: Analyzes the processed data to identify trends and insights.
-- `05_summary.ipynb`: Summarizes the findings and prepares the final report.
+notebook focuses on a different aspect of the data pipeline.
 
 You can launch Jupyter Notebook with the following command:
 
@@ -116,71 +116,18 @@ jupyter notebook
 
 Navigate to the `notebooks` directory and open any notebook to get started.
 
-To keep the Markdown files in sync with the Jupyter notebooks, you can use the provided conversion script:
-
-```bash
-python scripts/nb_to_md.py
-```
-
-This script requires the `jupytext` package, which will be installed with the other dependencies.
-
 ## Project Structure
 
 ```
 vuln-data-science/
 ├── data/
-│ ├── raw/
-│ ├── processed/
 ├── notebooks/
-│ ├── patch_tuesday/
-│ │ ├── 01_data_collection.ipynb
-│ │ ├── 02_data_cleaning.ipynb
-│ │ ├── 03_weighted_vulnerability_scoring.ipynb
-│ │ ├── 04_analysis.ipynb
-│ │ ├── 05_summary.ipynb
-├── markdown/
 ├── scripts/
 │ ├── nb_to_md.py
 ├── README.md
-├── requirements.txt
 └── LICENSE
 ```
 
-- `data/`: Contains raw and processed data files, organized by project (e.g., `patch_tuesday`, `weekly_cve`).
-- `notebooks/`: Jupyter notebooks for data exploration, cleaning, and analysis.
-- `markdown/`: Markdown versions of the Jupyter notebooks.
-- `scripts/`: Python scripts for data processing and analysis tools.
-- `README.md`: Project documentation.
-- `requirements.txt`: List of dependencies.
-- `LICENSE`: License information.
-
-## Notebooks and Markdown
-
-Jupyter notebooks are located in the `/notebooks` directory. These contain code and analysis for various aspects of
-vulnerability management. For convenience, markdown versions are available in the `/markdown` directory.
-
-To keep the Markdown files in sync with the Jupyter notebooks, use the conversion script:
-
-```bash
-python scripts/nb_to_md.py
-```
-
-The `jupytext` package will be installed with the other dependencies.
-
-### Patch Tuesday
-
-#### Notebooks
-
-- [Data Collection Notebook](notebooks/patch_tuesday/01_data_collection.ipynb)
-- [Data Cleaning Notebook](notebooks/patch_tuesday/02_data_cleaning.ipynb)
-- [Vulnerability Analysis Notebook](notebooks/patch_tuesday/03_vulnerability_analysis.ipynb)
-
-#### Markdown
-
-- [Data Collection Markdown](markdown/patch_tuesday/01_data_collection.md)
-- [Data Cleaning Markdown](markdown/patch_tuesday/02_data_cleaning.md)
-- [Vulnerability Analysis Markdown](markdown/patch_tuesday/03_vulnerability_analysis.md)
-
 ## Contributing
 
 We welcome contributions! If you have ideas or find issues, please open a GitHub issue or submit a pull request.
@@ -203,10 +150,50 @@ We plan to expand the project with the following features:
 - **Advanced Analytics**: Machine learning models for predicting vulnerability exploitation likelihood.
 - **Visualization Dashboards**: Interactive dashboards for visualizing trends and insights.
 
-## Acknowledgments
+### Data Usage and Attribution
+
+This project uses data from various publicly available sources. Please ensure compliance with their respective usage
+agreements and attribution requirements if you use or redistribute the data.
+
+#### **NIST National Vulnerability Database (NVD)**
+
+- Website: [NVD Developers - Terms of Use](https://nvd.nist.gov/developers/terms-of-use)
+- **Attribution Requirement**:
+- Services utilizing the NVD API must display the following notice prominently:
+> "This product uses the NVD API but is not endorsed or certified by the NVD."
+- The NVD name may only be used to identify the source of API content and may not imply endorsement of any product
+or service.
+
+#### **CISA Known Exploited Vulnerabilities (KEV)**
+
+- Website: [CISA KEV License](https://www.cisa.gov/sites/default/files/licenses/kev/license.txt)
+- **License**:
+- The KEV database is distributed under the **Creative Commons 0 1.0 License**.
+- You may use this data in any legal manner, but note:
+- Information provided at any 3rd-party links included in the KEV database is bound by the policies and licenses
+of those third-party websites.
+- Use of the information does not authorize you to use the **CISA Logo** or **DHS Seal**, nor should such use be
+interpreted as an endorsement by CISA or DHS.
+
+#### **Exploit Prediction Scoring System (EPSS)**
+
+- Website: [EPSS - FIRST.org](https://www.first.org/epss)
+- **Usage Agreement**:
+- EPSS scores are freely available for public use.
+- **Attribution Requirement**:
+> "See EPSS at https://www.first.org/epss"
+> or
+> "Jay Jacobs, Sasha Romanosky, Benjamin Edwards, Michael Roytman, Idris Adjerid, (2021), Exploit Prediction
+Scoring System, Digital Threats Research and Practice, 2(3)."
+
+---
+
+### Acknowledgments
 
 We would like to acknowledge the work of researchers and contributors who are advancing the field of vulnerability data
-science. Their insights and tools have been instrumental in shaping this project.
+science. Their insights and tools have been instrumental in shaping this project. This project also draws inspiration
+from the broader cybersecurity and data science communities, whose collective efforts improve security practices and
+promote knowledge sharing.
 
 - **[Jay Jacobs](https://www.linkedin.com/in/jayjacobs1/)**
 Co-founder of the Cyentia Institute, focusing on security metrics and data-driven decision-making in vulnerability
@@ -226,3 +213,4 @@ science. Their insights and tools have been instrumental in shaping this project
 
 We also want to thank the broader cybersecurity and data science communities for their contributions. This project draws
 inspiration from collective efforts to improve security practices and promote knowledge sharing.
+

pyproject.toml

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
1+
[project]
2+
name = "vuln-data-science"
3+
version = "0.1.0"
4+
description = "Add your description here"
5+
readme = "README.md"
6+
requires-python = ">=3.11"
7+
dependencies = [
8+
"matplotlib>=3.10.0",
9+
"pandas>=2.2.3",
10+
"requests>=2.32.3",
11+
"seaborn>=0.13.2",
12+
]
13+
14+
[dependency-groups]
15+
dev = [
16+
"black>=24.10.0",
17+
"hatch>=1.14.0",
18+
"ipython>=8.31.0",
19+
"jupytext>=1.16.6",
20+
]

requirements.txt

Lines changed: 0 additions & 7 deletions
This file was deleted.
