This repository was archived by the owner on Oct 13, 2025. It is now read-only.
Merged
Changes from 1 commit
25 changes: 15 additions & 10 deletions .github/workflows/docs.yml
@@ -11,21 +11,26 @@ jobs:
timeout-minutes: 15
steps:
- uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v4
- name: Install uv
uses: astral-sh/setup-uv@v5
with:
python-version: '3.11'
- name: Install dependencies
enable-cache: true
- name: "Set up Python"
uses: actions/setup-python@v5
with:
python-version-file: "pyproject.toml"

- name: Install the project
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt -r dev-requirements.txt
python -m spacy download en_core_web_sm
uv sync --all-extras --dev
uv pip install pip
uv run python -m spacy download en_core_web_sm
- name: Create docs
run: |
make -C docs/ html
uv run python -m sphinx -b html docs/ docs/_build/html
cp docs/_config.yml docs/_build/html/_config.yml
- name: Deploy Docs 🚀
uses: JamesIves/github-pages-deploy-action@v4.2.5
uses: JamesIves/github-pages-deploy-action@v4
with:
branch: gh-pages # The branch the action should deploy to.
folder: docs/_build/html # The folder the action should deploy.
folder: docs/_build/html # The folder the action should deploy.
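
Because the hunk above interleaves removed and added lines without their +/- markers, it can help to see the steps as they would read once the change is applied. The following is a minimal sketch under that reading: the indentation is reconstructed, the job name and triggers are omitted because the hunk does not show them, and the old `pip`/`make` lines are assumed to be the removed side.

```yaml
# Sketch of the docs job's steps after the change (reconstructed, not copied from the repository).
steps:
  - uses: actions/checkout@v4
  - name: Install uv
    uses: astral-sh/setup-uv@v5
    with:
      enable-cache: true            # cache uv's downloads between runs
  - name: "Set up Python"
    uses: actions/setup-python@v5
    with:
      python-version-file: "pyproject.toml"   # version taken from the project metadata

  - name: Install the project
    run: |
      uv sync --all-extras --dev
      uv pip install pip            # presumably needed because spaCy's downloader calls pip
      uv run python -m spacy download en_core_web_sm
  - name: Create docs
    run: |
      uv run python -m sphinx -b html docs/ docs/_build/html
      cp docs/_config.yml docs/_build/html/_config.yml
  - name: Deploy Docs 🚀
    uses: JamesIves/github-pages-deploy-action@v4
    with:
      branch: gh-pages              # The branch the action should deploy to.
      folder: docs/_build/html      # The folder the action should deploy.
```

Calling Sphinx through `uv run python -m sphinx` instead of `make -C docs/ html` keeps the build inside the environment that `uv sync` created, which appears to be the intent of the change.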
28 changes: 14 additions & 14 deletions .github/workflows/pr.yml
@@ -1,4 +1,4 @@
name: Pytest on PR
name: Pytest using UV on PR
on:
push:
branches:
@@ -13,27 +13,27 @@ jobs:
strategy:
max-parallel: 4
matrix:
python-version: ["3.11"]
os: [ubuntu-latest, macos-13, windows-latest]
runs-on: ${{ matrix.os }}
timeout-minutes: 20
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
- name: Install uv
uses: astral-sh/setup-uv@v5
with:
python-version: ${{ matrix.python-version }}
cache: 'pip' # caching pip dependencies
enable-cache: true
- name: "Set up Python"
uses: actions/setup-python@v5
with:
python-version-file: "pyproject.toml"
- name: run on mac
if: startsWith(matrix.os, 'mac')
run: |
brew install libomp
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt
python -m spacy download en_core_web_sm
- name: Test with pytest
- name: Install the project
run: |
pip install pytest
pytest
uv sync --all-extras --dev
uv pip install pip
uv run python -m spacy download en_core_web_sm
- name: Run tests
run: uv run pytest tests
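
Read as a post-change fragment, the tail of the test job would look roughly like the sketch below. It assumes the uv and Python setup steps precede it as in the hunk, that `matrix.os` still comes from the `os` list shown above, and that the old `pip install` and bare `pytest` lines are the removed side. Indentation is reconstructed.

```yaml
# Sketch of the remaining steps in the test job after the change (reconstructed).
steps:
  # ... checkout, "Install uv" and "Set up Python" steps as listed in the hunk ...
  - name: run on mac
    if: startsWith(matrix.os, 'mac')   # only the macos-13 runner matches
    run: |
      brew install libomp
  - name: Install the project
    run: |
      uv sync --all-extras --dev
      uv pip install pip
      uv run python -m spacy download en_core_web_sm
  - name: Run tests
    run: uv run pytest tests           # tests run inside the uv-managed environment
```

Since the Python version now comes from `python-version-file`, the steps no longer reference `matrix.python-version`.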
23 changes: 14 additions & 9 deletions .github/workflows/publish.yml
@@ -11,20 +11,25 @@ jobs:
timeout-minutes: 20
steps:
- uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v5.1.1
- name: Install uv
uses: astral-sh/setup-uv@v5
with:
python-version: '3.11'
- name: Install dependencies
enable-cache: true
- name: "Set up Python"
uses: actions/setup-python@v5
with:
python-version-file: "pyproject.toml"
- name: Install the project
run: |
python -m pip install --upgrade pip
pip install -r dev-requirements.txt
uv sync --all-extras --dev
uv pip install pip
uv run python -m spacy download en_core_web_sm
- name: Build and publish
run: |
python setup.py bdist_wheel
uv run python setup.py bdist_wheel
- name: Publish distribution 📦 to PyPI
if: startsWith(github.ref, 'refs/tags')
uses: pypa/gh-action-pypi-publish@master
uses: pypa/gh-action-pypi-publish@release/v1
with:
user: __token__
password: ${{ secrets.PYPI_API_TOKEN }}
password: ${{ secrets.PYPI_API_TOKEN }}
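
For the publish job, the two closing steps after the change would read approximately as follows. This is a sketch with reconstructed indentation; it assumes the `@master` reference and the bare `python setup.py` call are the removed lines.

```yaml
# Sketch of the build and publish steps after the change (reconstructed).
steps:
  # ... checkout, uv setup, Python setup and "Install the project" steps as in the hunk ...
  - name: Build and publish
    run: |
      uv run python setup.py bdist_wheel
  - name: Publish distribution 📦 to PyPI
    if: startsWith(github.ref, 'refs/tags')          # publish only for tag pushes
    uses: pypa/gh-action-pypi-publish@release/v1     # release branch instead of master
    with:
      user: __token__
      password: ${{ secrets.PYPI_API_TOKEN }}
```

Pointing at `release/v1` rather than `master` follows the publish action's own versioning scheme and avoids tracking an unversioned branch.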
21 changes: 13 additions & 8 deletions .github/workflows/tox.yml
@@ -17,15 +17,20 @@ jobs:

steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5.1.1
- name: Install uv
uses: astral-sh/setup-uv@v5
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
enable-cache: true
- name: "Set up Python"
uses: actions/setup-python@v5
with:
python-version-file: "pyproject.toml"

- name: Install the project
run: |
python -m pip install --upgrade pip
pip install -r dev-requirements.txt -r requirements.txt
python -m spacy download en_core_web_sm
uv sync --all-extras --dev
uv pip install pip
uv run python -m spacy download en_core_web_sm
- name: Test with tox
run: |
tox
uv run tox
1 change: 1 addition & 0 deletions .gitignore
@@ -20,6 +20,7 @@ __pycache__/*
.idea
.venv
conda
uv.lock

# Package files
*.egg
58 changes: 26 additions & 32 deletions README.md
@@ -1,52 +1,64 @@
# :flashlight: QRMine
# 🔍 QRMine
*/ˈkärmīn/*

[![forthebadge made-with-python](http://ForTheBadge.com/images/badges/made-with-python.svg)](https://www.python.org/)[![PyPI download total](https://img.shields.io/pypi/dm/qrmine.svg)](https://pypi.python.org/pypi/qrmine/)
![Libraries.io SourceRank](https://img.shields.io/librariesio/sourcerank/pypi/qrmine)
![GitHub tag (latest by date)](https://img.shields.io/github/v/tag/dermatologist/nlp-qrmine)
[![Documentation](https://badgen.net/badge/icon/documentation?icon=libraries&label)](https://dermatologist.github.io/nlp-qrmine/)

QRMine is a suite of qualitative research (QR) data mining tools in Python using Natural Language Processing (NLP) and Machine Learning (ML). QRMine is work in progress. [Read More..](https://nuchange.ca/2017/09/grounded-theory-qualitative-research-python.html)
Qualitative research involves the collection and analysis of textual data, such as interview transcripts, open-ended survey responses, and field notes. It is often used in social sciences, humanities, and health research to explore complex phenomena and understand human experiences. In addition to textual data, qualitative researchers may also collect quantitative data, such as survey responses or demographic information, to complement their qualitative findings.

## What it does
Qualitative research is often characterized by its inductive approach, where researchers aim to generate theories or concepts from the data rather than testing pre-existing hypotheses. This process is known as Grounded Theory, which emphasizes the importance of data-driven analysis and theory development.

### NLP
QRMine is a Python package for qualitative research and triangulation of textual and numeric data in Grounded Theory. It provides Natural Language Processing (NLP) and Machine Learning (ML) tools to analyze qualitative data, such as interview transcripts, and quantitative data, such as survey responses, for theorizing.

Version 4.0 is a major update with new features and bug fixes. It moves some of the ML dependencies to an optional install. Version 4.0 is a prelude to version 5.0, which will introduce large language models (LLMs) for qualitative research.

## ✨ Features

### 🔧 NLP
* Lists common categories for open coding.
* Create a coding dictionary with categories, properties and dimensions.
* Topic modelling.
* Arrange docs according to topics.
* Compare two documents/interviews.
* Select documents/interviews by sentiment, category or title for further analysis.
* Sentiment analysis
* Clusters documents and creates visualizations.
* Generate (non-LLM) summary of documents/interviews.


### ML
### 🧠 ML
* Accuracy of a neural network model trained using the data
* Confusion matrix from a support vector machine classifier
* K nearest neighbours of a given record
* K-Means clustering
* Principal Component Analysis (PCA)
* Association rules

## How to install
## 🛠️ How to install

* Requires Python 3.11 and a CPU that supports AVX instructions
* Requires Python 3.11
```text
pip install uv
uv pip install qrmine
pip install qrmine
python -m spacy download en_core_web_sm

```

* For ML functions (neural networks & SVM), install the optional packages
```text
pip install qrmine[ml]
```

### Mac users
* Mac users, please install *libomp* for XGBoost
```
brew install libomp
```

## How to Use
## 🚀 How to Use

* input files are transcripts as txt files and a single csv file with numeric data. The output txt file can be specified.
* Input files are transcripts as txt/pdf files and (optionally) a single csv file with numeric data. The output txt file can be specified. All transcripts can be in a single file separated by a break tag as described below.

* The coding dictionary, topics and topic assignments can be created from the entire corpus (all documents) using the respective command line options.

@@ -140,33 +152,15 @@ index, obesity, bmi, exercise, income, bp, fbs, has_diabetes

## Author

* [Bell Eapen](https://nuchange.ca) (McMaster U) | [Contact](https://nuchange.ca/contact) | [![Twitter Follow](https://img.shields.io/twitter/follow/beapen?style=social)](https://twitter.com/beapen)
* [Bell Eapen](https://nuchange.ca) ([UIS](https://www.uis.edu/directory/bell-punneliparambil-eapen)) | [Contact](https://nuchange.ca/contact) | [![Twitter Follow](https://img.shields.io/twitter/follow/beapen?style=social)](https://twitter.com/beapen)

* This software is developed and tested using [Compute Canada](http://www.computecanada.ca) resources.
* See also: [:fire: The FHIRForm framework for managing healthcare eForms](https://github.com/E-Health/fhirform)
* See also: [:eyes: Drishti | An mHealth sense-plan-act framework!](https://github.com/E-Health/drishti)

## Citation

Please cite QRMine in your publications if it helped your research. Here
is an example BibTeX entry [(Read paper on arXiv)](https://arxiv.org/abs/2003.13519):

```

@article{eapenbr2019qrmine,
title={QRMine: A python package for triangulation in Grounded Theory},
author={Eapen, Bell Raj and Archer, Norm and Sartipi, Kamran},
journal={arXiv preprint arXiv:2003.13519 },
year={2020}
}

```

QRMine is inspired by [this work](https://github.com/lknelson/computational-grounded-theory) and the associated [paper](https://journals.sagepub.com/doi/abs/10.1177/0049124117729703).
Please cite QRMine in your publications if it helped your research.
Citation information will be available soon.

## Give us a star ⭐️
If you find this project useful, give us a star. It helps others discover the project.

## Demo

[![QRMine](https://github.com/dermatologist/nlp-qrmine/blob/develop/notes/qrmine.gif)](https://github.com/dermatologist/nlp-qrmine/blob/develop/notes/qrmine.gif)
11 changes: 0 additions & 11 deletions dev-requirements.in

This file was deleted.
