GitHub - John4064/text-analysis: Using text analysis of pdfs converted to text using xpdf pdftotext

Text Analysis of books using C++

A text analysis application that processes and analyzes the text files of books extracted with XPDFREADER.
Explore the docs »

View Demo · Report Bug · Request Feature

Table of Contents

About The Project
- Built With
Getting Started
- Prerequisites
- Installation
Usage
Roadmap
Contributing
License
Contact
Acknowledgments

About The Project

In version one, the application takes a text file supplied in the inputs folder, scans and processes it using multithreading, and writes the results, along with efficiency statistics, to an output folder. By changing the number of threads, you can compare how efficiency varies with thread count. Model to implement: bag-of-words, see https://machinelearningmastery.com/gentle-introduction-bag-words-model/
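As a rough illustration of the idea described above, the following is a minimal sketch of multithreaded word counting with a timing helper, using only the C++17 standard library. It is not the project's actual implementation: the names `tokenize`, `countWords`, `countRange`, and `timeCountMs` are hypothetical, and the tokenization rule (lowercase ASCII letters only) is an assumption made for the example.

```cpp
#include <algorithm>
#include <cctype>
#include <chrono>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Bag-of-words representation: word -> number of occurrences.
using Counts = std::unordered_map<std::string, std::size_t>;

// Split text into lowercase alphabetic tokens (assumed tokenization rule).
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> words;
    std::string current;
    for (unsigned char c : text) {
        if (std::isalpha(c)) {
            current += static_cast<char>(std::tolower(c));
        } else if (!current.empty()) {
            words.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) words.push_back(current);
    return words;
}

// Count the words in words[begin, end) into a thread-local map.
void countRange(const std::vector<std::string>& words,
                std::size_t begin, std::size_t end, Counts& out) {
    for (std::size_t i = begin; i < end; ++i) ++out[words[i]];
}

// Split the word list into one chunk per thread, count each chunk
// independently, then merge the partial maps on the calling thread.
Counts countWords(const std::vector<std::string>& words, unsigned numThreads) {
    numThreads = std::max(1u, numThreads);
    std::vector<Counts> partial(numThreads);
    std::vector<std::thread> workers;
    const std::size_t chunk = (words.size() + numThreads - 1) / numThreads;
    for (unsigned t = 0; t < numThreads; ++t) {
        const std::size_t begin = std::min(words.size(), t * chunk);
        const std::size_t end = std::min(words.size(), begin + chunk);
        workers.emplace_back(countRange, std::cref(words), begin, end,
                             std::ref(partial[t]));
    }
    for (auto& w : workers) w.join();

    Counts total;
    for (const auto& p : partial)
        for (const auto& [word, n] : p) total[word] += n;
    return total;
}

// Wall-clock time in milliseconds for one counting run with numThreads threads.
double timeCountMs(const std::vector<std::string>& words, unsigned numThreads) {
    const auto start = std::chrono::steady_clock::now();
    const Counts result = countWords(words, numThreads);
    const auto stop = std::chrono::steady_clock::now();
    (void)result;  // only the elapsed time is of interest here
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `timeCountMs` on the same word list with different thread counts gives a simple way to compare configurations. Each thread writes only to its own map, so no locking is needed until the final merge.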

(back to top)

Built With

(back to top)

Getting Started

To get a local copy up and running, follow these steps.

Prerequisites

C++17

Dependencies

Standard Library

Installation

The installation guide will be updated once Version 1 is complete. For now, clone the repository with git and create a branch to work on it.

(back to top)

Usage

Use this space to show useful examples of how a project can be used. Additional screenshots, code examples and demos work well in this space. You may also link to more resources.

For more examples, please refer to the Documentation

(back to top)

Roadmap

[] Basic Functionality
- [] Optimize any potential algorithms (search/sort)
[] Documentation
- Readme
- [] Doxygen
- [] Performance Monitoring (Google Benchmark)
[] User Interface
- [] Interact with the application without programming, e.g. by selecting files
[] Report
- Word Count/Frequency
- [] Sentiment Analysis
- [] General Stats
[] Automation
- [] Ability to send multiple files for analysis (batch order)
- [] Ability to analyze the same file under multiple configurations (single vs. multithreading)
[] Futures
- [] Analyze different file types
- [] Analysis of non-book texts such as tweets/articles
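
For the Word Count/Frequency report item in the list above, one possible shape for the output is a list of the most frequent words. The sketch below assumes word counts are already available as a map from word to count; the function name `topWords` and the tie-breaking rule (alphabetical order) are hypothetical choices for illustration, not taken from the project.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Return up to n (word, count) pairs, most frequent first.
// Ties are broken alphabetically so the output is deterministic.
std::vector<std::pair<std::string, std::size_t>> topWords(
    const std::unordered_map<std::string, std::size_t>& counts, std::size_t n) {
    std::vector<std::pair<std::string, std::size_t>> entries(counts.begin(),
                                                             counts.end());
    std::sort(entries.begin(), entries.end(),
              [](const auto& a, const auto& b) {
                  if (a.second != b.second) return a.second > b.second;
                  return a.first < b.first;
              });
    if (entries.size() > n) entries.resize(n);
    return entries;
}
```

A full sort is the simplest option; for very large vocabularies, `std::partial_sort` over the first `n` entries would avoid ordering words that never appear in the report.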

See the open issues for a full list of proposed features (and known issues).

(back to top)

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

Fork the Project
Create your Feature Branch (git checkout -b feature/AmazingFeature)
Commit your Changes (git commit -m 'Add some AmazingFeature')
Push to the Branch (git push origin feature/AmazingFeature)
Open a Pull Request

(back to top)

License

Distributed under the MIT License. See the LICENSE file for more information.

(back to top)

Contact

John Parkhurst - jparkhurst120@gmail.com

Project Link: https://github.com/John4064/text-analysis

(back to top)

Acknowledgments

(back to top)

