This repository was archived by the owner on Oct 13, 2025. It is now read-only.

Commit 715b60e

Merge pull request #253 from dermatologist/feature/PR-ver-4
Merge branch 'feature/cluster-2'
2 parents f6614f1 + f572888 commit 715b60e

30 files changed: 1845 additions and 895 deletions

.github/workflows/docs.yml

Lines changed: 15 additions & 10 deletions
@@ -11,21 +11,26 @@ jobs:
     timeout-minutes: 15
     steps:
       - uses: actions/checkout@v4
-      - name: Set up Python
-        uses: actions/setup-python@v4
+      - name: Install uv
+        uses: astral-sh/setup-uv@v5
         with:
-          python-version: '3.11'
-      - name: Install dependencies
+          enable-cache: true
+      - name: "Set up Python"
+        uses: actions/setup-python@v5
+        with:
+          python-version-file: "pyproject.toml"
+
+      - name: Install the project
         run: |
-          python -m pip install --upgrade pip
-          pip install -r requirements.txt -r dev-requirements.txt
-          python -m spacy download en_core_web_sm
+          uv sync --all-extras --dev
+          uv pip install pip
+          uv run python -m spacy download en_core_web_sm
       - name: Create docs
         run: |
-          make -C docs/ html
+          uv run python -m sphinx -b html docs/ docs/_build/html
           cp docs/_config.yml docs/_build/html/_config.yml
       - name: Deploy Docs 🚀
-        uses: JamesIves/github-pages-deploy-action@v4.2.5
+        uses: JamesIves/github-pages-deploy-action@v4
         with:
           branch: gh-pages # The branch the action should deploy to.
-          folder: docs/_build/html # The folder the action should deploy.
+          folder: docs/_build/html # The folder the action should deploy.

.github/workflows/pr.yml

Lines changed: 14 additions & 14 deletions
@@ -1,4 +1,4 @@
-name: Pytest on PR
+name: Pytest using UV on PR
 on:
   push:
     branches:
@@ -13,27 +13,27 @@ jobs:
     strategy:
       max-parallel: 4
       matrix:
-        python-version: ["3.11"]
         os: [ubuntu-latest, macos-13, windows-latest]
     runs-on: ${{ matrix.os }}
     timeout-minutes: 20
     steps:
       - uses: actions/checkout@v4
-      - name: Set up Python ${{ matrix.python-version }}
-        uses: actions/setup-python@v4
+      - name: Install uv
+        uses: astral-sh/setup-uv@v5
         with:
-          python-version: ${{ matrix.python-version }}
-          cache: 'pip' # caching pip dependencies
+          enable-cache: true
+      - name: "Set up Python"
+        uses: actions/setup-python@v5
+        with:
+          python-version-file: "pyproject.toml"
       - name: run on mac
         if: startsWith(matrix.os, 'mac')
         run: |
           brew install libomp
-      - name: Install dependencies
-        run: |
-          python -m pip install --upgrade pip
-          pip install -r requirements.txt
-          python -m spacy download en_core_web_sm
-      - name: Test with pytest
+      - name: Install the project
         run: |
-          pip install pytest
-          pytest
+          uv sync --all-extras --dev
+          uv pip install pip
+          uv run python -m spacy download en_core_web_sm
+      - name: Run tests
+        run: uv run pytest tests

.github/workflows/publish.yml

Lines changed: 14 additions & 9 deletions
@@ -11,20 +11,25 @@ jobs:
     timeout-minutes: 20
     steps:
       - uses: actions/checkout@v4
-      - name: Set up Python
-        uses: actions/setup-python@v5.1.1
+      - name: Install uv
+        uses: astral-sh/setup-uv@v5
         with:
-          python-version: '3.11'
-      - name: Install dependencies
+          enable-cache: true
+      - name: "Set up Python"
+        uses: actions/setup-python@v5
+        with:
+          python-version-file: "pyproject.toml"
+      - name: Install the project
         run: |
-          python -m pip install --upgrade pip
-          pip install -r dev-requirements.txt
+          uv sync --all-extras --dev
+          uv pip install pip
+          uv run python -m spacy download en_core_web_sm
       - name: Build and publish
         run: |
-          python setup.py bdist_wheel
+          uv run python setup.py bdist_wheel
       - name: Publish distribution 📦 to PyPI
         if: startsWith(github.ref, 'refs/tags')
-        uses: pypa/gh-action-pypi-publish@master
+        uses: pypa/gh-action-pypi-publish@release/v1
         with:
           user: __token__
-          password: ${{ secrets.PYPI_API_TOKEN }}
+          password: ${{ secrets.PYPI_API_TOKEN }}

.github/workflows/tox.yml

Lines changed: 13 additions & 8 deletions
@@ -17,15 +17,20 @@ jobs:
 
     steps:
       - uses: actions/checkout@v4
-      - name: Set up Python ${{ matrix.python-version }}
-        uses: actions/setup-python@v5.1.1
+      - name: Install uv
+        uses: astral-sh/setup-uv@v5
         with:
-          python-version: ${{ matrix.python-version }}
-      - name: Install dependencies
+          enable-cache: true
+      - name: "Set up Python"
+        uses: actions/setup-python@v5
+        with:
+          python-version-file: "pyproject.toml"
+
+      - name: Install the project
         run: |
-          python -m pip install --upgrade pip
-          pip install -r dev-requirements.txt -r requirements.txt
-          python -m spacy download en_core_web_sm
+          uv sync --all-extras --dev
+          uv pip install pip
+          uv run python -m spacy download en_core_web_sm
       - name: Test with tox
         run: |
-          tox
+          uv run tox

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -20,6 +20,7 @@ __pycache__/*
 .idea
 .venv
 conda
+uv.lock
 
 # Package files
 *.egg

README.md

Lines changed: 26 additions & 32 deletions
@@ -1,52 +1,64 @@
-# :flashlight: QRMine
+# 🔍 QRMine
 */ˈkärmīn/*
 
 [![forthebadge made-with-python](http://ForTheBadge.com/images/badges/made-with-python.svg)](https://www.python.org/)[![PyPI download total](https://img.shields.io/pypi/dm/qrmine.svg)](https://pypi.python.org/pypi/qrmine/)
 ![Libraries.io SourceRank](https://img.shields.io/librariesio/sourcerank/pypi/qrmine)
 ![GitHub tag (latest by date)](https://img.shields.io/github/v/tag/dermatologist/nlp-qrmine)
 [![Documentation](https://badgen.net/badge/icon/documentation?icon=libraries&label)](https://dermatologist.github.io/nlp-qrmine/)
 
-QRMine is a suite of qualitative research (QR) data mining tools in Python using Natural Language Processing (NLP) and Machine Learning (ML). QRMine is work in progress. [Read More..](https://nuchange.ca/2017/09/grounded-theory-qualitative-research-python.html)
+Qualitative research involves the collection and analysis of textual data, such as interview transcripts, open-ended survey responses, and field notes. It is often used in social sciences, humanities, and health research to explore complex phenomena and understand human experiences. In addition to textual data, qualitative researchers may also collect quantitative data, such as survey responses or demographic information, to complement their qualitative findings.
 
-## What it does
+Qualitative research is often characterized by its inductive approach, where researchers aim to generate theories or concepts from the data rather than testing pre-existing hypotheses. This process is known as Grounded Theory, which emphasizes the importance of data-driven analysis and theory development.
 
-### NLP
+QRMine is a Python package for qualitative research and triangulation of textual and numeric data in Grounded Theory. It provides tools for Natural Language Processing (NLP) and Machine Learning (ML) to analyze qualitative data, such as interview transcripts, and quantitative data, such as survey responses for theorizing.
+
+Version 4.0 is a major update with new features and bug fixes. It moves some of the ML dependencies to an optional install. Version 4.0 is a prelude to version 5.0 that will introduce large language models (LLMs) for qualitative research.
+
+## ✨ Features
+
+### 🔧 NLP
 * Lists common categories for open coding.
 * Create a coding dictionary with categories, properties and dimensions.
 * Topic modelling.
 * Arrange docs according to topics.
 * Compare two documents/interviews.
 * Select documents/interviews by sentiment, category or title for further analysis.
 * Sentiment analysis
+* Clusters documents and creates visualizations.
+* Generate (non LLM) summary of documents/interviews.
 
 
-### ML
+### 🧠 ML
 * Accuracy of a neural network model trained using the data
 * Confusion matrix from an support vector machine classifier
 * K nearest neighbours of a given record
 * K-Means clustering
 * Principal Component Analysis (PCA)
 * Association rules
 
-## How to install
+## 🛠️ How to install
 
-* Requires Python 3.11 and a CPU that support AVX instructions
+* Requires Python 3.11
 ```text
-pip install uv
-uv pip install qrmine
+pip install qrmine
 python -m spacy download en_core_web_sm
 
 ```
 
+* For ML functions (neural networks & SVM), install the optional packages
+```text
+pip install qrmine[ml]
+```
+
 ### Mac users
 * Mac users, please install *libomp* for XGBoost
 ```
 brew install libomp
 ```
 
-## How to Use
+## 🚀 How to Use
 
-* input files are transcripts as txt files and a single csv file with numeric data. The output txt file can be specified.
+* Input files are transcripts as txt/pdf files and (optionally) a single csv file with numeric data. The output txt file can be specified. All transcripts can be in a single file separated by a break tag as described below.
 
 * The coding dictionary, topics and topic assignments can be created from the entire corpus (all documents) using the respective command line options.
 
@@ -140,33 +152,15 @@ index, obesity, bmi, exercise, income, bp, fbs, has_diabetes
 
 ## Author
 
-* [Bell Eapen](https://nuchange.ca) (McMaster U) | [Contact](https://nuchange.ca/contact) | [![Twitter Follow](https://img.shields.io/twitter/follow/beapen?style=social)](https://twitter.com/beapen)
+* [Bell Eapen](https://nuchange.ca) ([UIS](https://www.uis.edu/directory/bell-punneliparambil-eapen)) | [Contact](https://nuchange.ca/contact) | [![Twitter Follow](https://img.shields.io/twitter/follow/beapen?style=social)](https://twitter.com/beapen)
 
-* This software is developed and tested using [Compute Canada](http://www.computecanada.ca) resources.
-* See also: [:fire: The FHIRForm framework for managing healthcare eForms](https://github.com/E-Health/fhirform)
-* See also: [:eyes: Drishti | An mHealth sense-plan-act framework!](https://github.com/E-Health/drishti)
 
 ## Citation
 
-Please cite QRMine in your publications if it helped your research. Here
-is an example BibTeX entry [(Read paper on arXiv)](https://arxiv.org/abs/2003.13519):
-
-```
-
-@article{eapenbr2019qrmine,
-title={QRMine: A python package for triangulation in Grounded Theory},
-author={Eapen, Bell Raj and Archer, Norm and Sartpi, Kamran},
-journal={arXiv preprint arXiv:2003.13519 },
-year={2020}
-}
-
-```
-
-QRMine is inspired by [this work](https://github.com/lknelson/computational-grounded-theory) and the associated [paper](https://journals.sagepub.com/doi/abs/10.1177/0049124117729703).
+Please cite QRMine in your publications if it helped your research.
+Citation information will be available soon.
 
 ## Give us a star ⭐️
 If you find this project useful, give us a star. It helps others discover the project.
 
-## Demo
 
-[![QRMine](https://github.com/dermatologist/nlp-qrmine/blob/develop/notes/qrmine.gif)](https://github.com/dermatologist/nlp-qrmine/blob/develop/notes/qrmine.gif)
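
The README diff above moves the neural-network and SVM features behind an optional `qrmine[ml]` install. Code that relies on optional packages can check whether they are importable before using them; the sketch below does this with the standard-library `importlib.util.find_spec`. Which modules the `ml` extra actually installs is not shown in this diff, so the names passed in are placeholders, and `missing_modules` is a hypothetical helper for illustration, not part of QRMine.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Placeholder names: substitute the modules your code actually needs.
    missing = missing_modules(["json", "surely_not_installed_pkg_xyz"])
    if missing:
        print("Optional modules not installed:", ", ".join(missing))
```

A check like this lets a tool report a clear "install the optional packages" message instead of failing with an import error partway through an analysis.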

dev-requirements.in

Lines changed: 0 additions & 11 deletions
This file was deleted.
