This repository was archived by the owner on Oct 13, 2025. It is now read-only.

@dermatologist
Owner

No description provided.

dermatologist and others added 20 commits April 30, 2025 15:20
…, update LDA model methods; add tests for topic printing
…matting; update tests to validate output structure
…ng top documents by topic; update tests for new functionality
… modify plot method to use instance data if no DataFrame is provided
…izing document word counts by dominant topic; update tests accordingly
…lusterDocs; enhance document processing capabilities
… visualize modules; enhance test readability in test files
@dermatologist requested a review from Copilot on April 30, 2025 21:58
Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds new test suites and updates core modules to enhance data visualization, file reading, neural network modeling with PyTorch, and clustering functionality.

  • Added/updated tests for visualization, NLP and numerical output.
  • Updated file reading to accept different input types and refactored ML modules to use PyTorch.
  • Introduced clustering extensions and updated packaging configuration.

Reviewed Changes

Copilot reviewed 15 out of 17 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tests/test_visualize.py | Added tests for QRVisualize plotting functions. |
| tests/test_readfiles.py | Updated tests to use a single filename instead of a list. |
| tests/test_num.py | Adjusted assertion to match updated output capitalization. |
| tests/test_nlp.py | Minor improvements in fixtures and assertions. |
| test.py | Added a standalone test for spaCy based NLP processing. |
| src/qrmine/visualize.py | Implements various plotting methods using matplotlib and wordcloud; includes wordcloud function. |
| src/qrmine/readfiles.py | Refactored read_file function to support file, folder, and URL inputs. |
| src/qrmine/mlqrmine.py | Replaces Keras-based NN with PyTorch implementation for predictions and evaluation. |
| src/qrmine/content.py | Minor addition: exposing tokens property for filtering processed tokens. |
| src/qrmine/cluster.py | New file added for clustering operations using LDA and topic representations. |
| src/qrmine/__init__.py | Updated to include new modules for clustering and visualization. |
| pyproject.toml | Updated project metadata and dependencies. |
| notes/*.md | Added/updated documentation notes on pip-tools and conda environment setup. |
Files not reviewed (2)
  • setup.cfg: Language not supported
  • src/qrmine/resources/df_dominant_topic.csv: Language not supported

Comment on lines +52 to +54
# if input is a folder name
elif isinstance(input, str):
import os

Copilot AI Apr 30, 2025

The type-check conditions in read_file are redundant since all inputs are checked as 'str'. Consider distinguishing file, folder, and URL cases with dedicated checks (e.g. os.path.isfile, os.path.isdir, or URL pattern matching) to ensure the correct branch is executed.

Suggested change
- # if input is a folder name
- elif isinstance(input, str):
- import os
+ # Check if input is a folder
+ elif os.path.isdir(input):
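
As a rough illustration of the dedicated checks this comment describes, here is a minimal sketch that tells a URL, a folder, and a file apart when all three arrive as strings. The helper name `classify_input` and its return labels are hypothetical and are not part of qrmine; the actual `read_file` may be structured differently.

```python
import os
from urllib.parse import urlparse


def classify_input(source: str) -> str:
    """Return 'url', 'folder', 'file', or 'unknown' for a string input.

    Hypothetical helper for illustration only; not the qrmine API.
    """
    # Check the URL case first: a URL is never a valid local path,
    # and parsing it does not touch the filesystem.
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    # os.path.isdir and os.path.isfile are both False for missing paths,
    # so an unknown string falls through to the final return.
    if os.path.isdir(source):
        return "folder"
    if os.path.isfile(source):
        return "file"
    return "unknown"
```

Because each branch tests a different property of the string, the order of the checks no longer decides which branch runs, which is the concern raised about the repeated `isinstance(input, str)` tests.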

Copilot uses AI. Check for mistakes.
height=180,
max_words=5,
colormap="tab10",
color_func=lambda *args, **kwargs: cols[i],

Copilot AI Apr 30, 2025

The lambda in the WordCloud instantiation captures 'i', which is undefined at that point. Capture the current index explicitly (e.g. using a default parameter like lambda *args, i=i, **kwargs: cols[i]) to fix the reference.

Suggested change
- color_func=lambda *args, **kwargs: cols[i],
+ color_func=lambda *args, i=i, **kwargs: cols[i],
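
The default-parameter idiom in this suggestion can be seen in isolation with a small sketch. It assumes the lambdas are created in a loop over `i`, which the snippet above does not show; `cols` here is a stand-in list of colour names, not the value used in `visualize.py`.

```python
cols = ["red", "green", "blue"]  # stand-in palette, assumed for illustration

# Late binding: each lambda looks up `i` when it is called, not when it is
# created, so after the loop finishes every lambda sees the final value of i.
late = [lambda *args, **kwargs: cols[i] for i in range(len(cols))]

# Default parameter: `i=i` is evaluated when the lambda is created, so each
# lambda keeps its own index.
bound = [lambda *args, i=i, **kwargs: cols[i] for i in range(len(cols))]

# Call with arbitrary arguments, as a color callback would be.
late_result = [f("word", font_size=10) for f in late]
bound_result = [f("word", font_size=10) for f in bound]

print(late_result)   # ['blue', 'blue', 'blue']
print(bound_result)  # ['red', 'green', 'blue']
```

The same idiom also helps if `i` is reassigned or goes out of scope before the callback is invoked, since the lambda no longer depends on the enclosing variable.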

@dermatologist reopened this on May 1, 2025

2 participants