
John4064/text-analysis





Text Analysis of books using C++

A text analysis application that processes text files of books extracted with XPDFREADER.


Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. License
  7. Contact
  8. Acknowledgments

About The Project

In version one, the application takes a text file supplied in the inputs folder, scans and processes it using multithreading, and writes the results, along with efficiency statistics, to an output folder. By changing the number of threads, you can compare how efficiency varies with thread count.

Model to implement: bag-of-words (https://machinelearningmastery.com/gentle-introduction-bag-words-model/)
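As a rough illustration of the approach described above, the following is a minimal sketch of a multithreaded bag-of-words count in C++17 using only the standard library. It is not the project's actual implementation: the names `WordCounts`, `count_range`, and `count_words` are hypothetical, and the sketch assumes the whole file has already been read into a `std::string` and that a "word" is a run of ASCII letters, compared case-insensitively.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical alias: maps each lowercase word to its number of occurrences.
using WordCounts = std::unordered_map<std::string, std::size_t>;

// Count the words found in text[begin, end).
WordCounts count_range(const std::string& text, std::size_t begin, std::size_t end) {
    WordCounts counts;
    std::string word;
    for (std::size_t i = begin; i < end; ++i) {
        const unsigned char c = static_cast<unsigned char>(text[i]);
        if (std::isalpha(c)) {
            word.push_back(static_cast<char>(std::tolower(c)));
        } else if (!word.empty()) {
            ++counts[word];
            word.clear();
        }
    }
    if (!word.empty()) ++counts[word];  // word that runs up to the end of the range
    return counts;
}

// Split the text into num_threads chunks, count each chunk on its own
// thread, then merge the partial results on the calling thread.
WordCounts count_words(const std::string& text, unsigned num_threads) {
    assert(num_threads > 0);

    // Chunk k covers text[bounds[k], bounds[k + 1]).
    std::vector<std::size_t> bounds{0};
    for (unsigned k = 1; k < num_threads; ++k) {
        std::size_t pos = std::max(bounds.back(), text.size() * k / num_threads);
        // Move the boundary forward so that no word is cut in half.
        while (pos < text.size() && std::isalpha(static_cast<unsigned char>(text[pos]))) ++pos;
        bounds.push_back(pos);
    }
    bounds.push_back(text.size());

    // Each worker writes only to its own slot, so no locking is needed.
    std::vector<WordCounts> partial(num_threads);
    std::vector<std::thread> workers;
    for (unsigned k = 0; k < num_threads; ++k) {
        workers.emplace_back([&, k] {
            partial[k] = count_range(text, bounds[k], bounds[k + 1]);
        });
    }
    for (auto& worker : workers) worker.join();

    WordCounts total;
    for (const auto& counts : partial)
        for (const auto& [word, n] : counts) total[word] += n;
    return total;
}
```

Calling `count_words(text, 4)` would count the words of `text` on four threads. Giving each thread a private map and merging once at the end avoids contention on a shared map, at the cost of extra memory for the partial results.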


Built With


Getting Started

To get a local copy up and running, follow these steps.

Prerequisites

  • C++17

Dependencies

  • Standard Library

Installation

  1. The installation guide will be updated once Version 1 is complete. For now, git clone the repository and create a branch to work on it.


Usage


For more examples, please refer to the Documentation


Roadmap

  • [] Basic Functionality
    • [] Optimize algorithms where possible (search/sort)
  • [] Documentation
    • Readme
    • [] Doxygen
    • [] Performance Monitoring (Google Benchmark)
  • [] User Interface
    • [] Interact with the application and select files without programming
  • [] Report
    • Word Count/Frequency
    • [] Sentiment Analysis
    • [] General Stats
  • [] Automation
    • [] Ability to send multiple files for analysis (batch order)
    • [] Ability to analyze the same file under multiple configurations (single vs. multithreading)
  • [] Future
    • [] Analyze different file types
    • [] Analysis of non-book texts such as tweets/articles

See the open issues for a full list of proposed features (and known issues).
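The roadmap item on analyzing the same file under multiple configurations (single vs. multithreading) could be approached with a small timing harness. The sketch below is only one possible shape for it, not something taken from the project: `TimingResult` and `time_configurations` are hypothetical names, and the job being timed is assumed to be any callable that accepts a thread count.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <vector>

// Hypothetical record: one measurement for one thread-count configuration.
struct TimingResult {
    unsigned threads;
    double milliseconds;
};

// Run job(n) once for every thread count n and record the wall-clock
// time each run took, using a monotonic clock.
std::vector<TimingResult> time_configurations(
        const std::function<void(unsigned)>& job,
        const std::vector<unsigned>& thread_counts) {
    assert(static_cast<bool>(job));  // the job must be callable

    std::vector<TimingResult> results;
    results.reserve(thread_counts.size());
    for (unsigned n : thread_counts) {
        const auto start = std::chrono::steady_clock::now();
        job(n);
        const auto stop = std::chrono::steady_clock::now();
        const std::chrono::duration<double, std::milli> elapsed = stop - start;
        results.push_back({n, elapsed.count()});
    }
    return results;
}
```

A call such as `time_configurations(job, {1, 2, 4, 8})` would return one entry per configuration, which could then be written to the output folder. A single run per configuration is noisy, so repeating each run and averaging, or using Google Benchmark as listed in the roadmap, would likely give more reliable numbers.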


Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request


License

Distributed under the MIT License. See license.md for more information.


Contact

John Parkhurst - jparkhurst120@gmail.com

Project Link: https://github.com/John4064/text-analysis


Acknowledgments


About

Text analysis of PDFs converted to text using xpdf pdftotext
