
Frangidha/Data-Processing-Template


Data-Integration-template

This project was created so that QC (quality control) labs and students can easily keep track of their products. It lets them record chemical information about their products and update it through an easy-to-use Google Sheets API. This example was built to check different types of olive oil.

An FTIR (Fourier transform infrared) spectrum of olive oil with attributed bands is displayed. The spectrum can change depending on the olive oil itself.

a spectrum of olive oil

Site Goals

This is a simple application that gives users a template they can quickly modify to keep track of their data and perform data manipulation on it.

User Guide

A quick user guide is available as a Google Sheets document at the following link.

Link to User Guide

Target Audience

Students or QC labs that need a user-friendly way to manipulate and keep track of the chemical data of their products.

User Stories

  • As a User, I would like to be able to easily input my data in a well-known environment which requires no/minimal training to use.
  • As a User, I would like to be able to check all the data steps because this could be important in case of an audit or to check for mistakes.
  • As a User, I would like to be able to visualize my acquired data.
  • As a User, I would like to be able to visualize my acquired data to see certain trends.
  • As a User, I would like to be able to process my acquired data in a quick way.

Features Planned

  • Simple, easy-to-use application in a familiar environment.
  • Simple storage of the data.
  • Visualization of the data.
  • Data manipulation to give more information to the user.
  • Looking for trends in the latest data set to see changes.

Structure

    USER STORY

    1. As a User, I would like to be able to easily input my data in a well-known environment which requires no/minimal training to use.

    IMPLEMENTATION

  • API to Google Sheets.
  • Excel-style spreadsheets are a familiar environment for many users, so Google Sheets is an ideal place for data input.

    A user guide is also provided (Link to User Guide).

    google sheets example input form

  • The user enters the data required by this application in the Raw_Data file in Google Sheets.
  • link to input file

    If invalid data has been entered, the user gets a notification to correct it and the application will not run. The application asks for new data and checks it again when "x" is pressed in the application.

  • Once the data has been entered, the user only needs to press "x" to run the application.
  • If the data is valid, the application runs completely.
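The connection to the Raw_Data worksheet could look roughly like the sketch below. It assumes the gspread and google.oauth2.service_account packages listed under Technologies. The helper names `open_worksheet` and `read_raw_rows`, the scope list, and the choice to skip empty rows are assumptions made for illustration and are not taken from the project's code.

```python
# Hypothetical sketch: helper names and scopes are assumptions, not the project's code.
SCOPE = [
    "https://www.googleapis.com/auth/spreadsheets",
    "https://www.googleapis.com/auth/drive.file",
    "https://www.googleapis.com/auth/drive",
]


def open_worksheet(creds_file, spreadsheet_name, sheet_name="Raw_Data"):
    """Authorize with a service account file and return one worksheet."""
    # Imported lazily so the helper below can be used without these packages.
    import gspread
    from google.oauth2.service_account import Credentials

    creds = Credentials.from_service_account_file(creds_file)
    client = gspread.authorize(creds.with_scopes(SCOPE))
    return client.open(spreadsheet_name).worksheet(sheet_name)


def read_raw_rows(worksheet):
    """Return every non-empty row of the worksheet as a list of strings."""
    rows = worksheet.get_all_values()
    return [row for row in rows if any(cell.strip() for cell in row)]
```

A call such as `read_raw_rows(open_worksheet("creds.json", "my_spreadsheet"))` would then return the raw rows; both the file name and the spreadsheet name here are placeholders.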

    USER STORY

    As a user, I would like to be able to visualize my acquired data.

    IMPLEMENTATION

  • The data added to the Google Sheets file is plotted using the plotext library.
  • Once the program starts running after the data check, it plots the spectrum data.

    plot of the Olive Oil from the data input

    USER STORY

  • Data manipulation to give more information to the user.
  • As a User, I would like to be able to check all the data steps because this could be important in case of an audit or to check for mistakes.

    IMPLEMENTATION

  • The NumPy library is used to integrate the data using trapezoidal integration.
  • The integration was tested against https://www.integral-calculator.com/ and against integration by hand. The function Test_Data() was used for this.

    Only a slight difference was observed between the results, so this is a viable option for this kind of application. Because there is no actual function that describes the spectrum, only an approximation of the integrated value can be calculated (More information).

  • Certain chemical vibrations are integrated to determine the presence of oxygenated groups and the branching of the olive oils.
  • The integration borders can be changed at the top of the program for flexibility.
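As a minimal sketch of the trapezoidal integration between two borders, the example below writes out the trapezoidal rule with NumPy arrays. NumPy also ships this rule as `np.trapz` (named `np.trapezoid` from NumPy 2.0 onwards). The function name `integrate_band`, the `BORDERS` dictionary, and the border values are hypothetical and only stand in for the settings at the top of the program.

```python
import numpy as np

# Hypothetical integration borders in cm^-1; the project's real values may differ.
BORDERS = {"carbonyl": (1700.0, 1800.0)}


def integrate_band(wavenumbers, absorbance, lower, upper):
    """Approximate the area under the spectrum between two borders."""
    x = np.asarray(wavenumbers, dtype=float)
    y = np.asarray(absorbance, dtype=float)

    # Sort by wavenumber so the result does not depend on the scan direction.
    order = np.argsort(x)
    x, y = x[order], y[order]

    # Keep only the points that fall inside the integration borders.
    inside = (x >= lower) & (x <= upper)
    xs, ys = x[inside], y[inside]

    # Trapezoidal rule: sum of 0.5 * (y_i + y_{i+1}) * (x_{i+1} - x_i).
    return float(np.sum(0.5 * (ys[1:] + ys[:-1]) * np.diff(xs)))
```

Because the spectrum is only known at sampled points, the result is an approximation whose accuracy depends on the spacing of those points.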

  • Afterwards, the integrated data is added to the Integrated_Data sheet so the actual integrated values can be seen.
  • This is very important for checking your data and for audit purposes because all steps can be traced back.

  • Afterwards, the ratio calculations are made to look at the ratio between the CH2 groups and the oxygenated groups.
  • These are displayed in table format.
  • display of table and data interpretation
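The ratio step can be illustrated with a short sketch. It assumes the ratio is the integrated area of an oxygenated band divided by the integrated area of the CH2 band. The direction of the ratio, the function names, and the table layout are all assumptions and may differ from the project's code.

```python
def band_ratio(oxygenated_area, ch2_area):
    """Return the oxygenated/CH2 area ratio (assumed direction)."""
    if ch2_area == 0:
        raise ValueError("CH2 band area is zero, so the ratio is undefined.")
    return oxygenated_area / ch2_area


def format_ratio_table(sample_names, ratios):
    """Build a plain-text table with one row per sample."""
    lines = [f"{'Sample':<12}{'Ratio':>8}"]
    for name, ratio in zip(sample_names, ratios):
        lines.append(f"{name:<12}{ratio:>8.3f}")
    return "\n".join(lines)
```

Calling `print(format_ratio_table(names, ratios))` writes the table to the terminal, which matches the terminal-based output the application uses.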

    USER STORY

  • Looking for trends in the latest data set to see changes.

    IMPLEMENTATION

  • The last 5 values are plotted to look for a change/trend in the most recent samples.
  • The plotext library is used to plot a bar chart for each ratio to reveal a trend.
  • Bar chart of the carbonyl index
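Selecting and plotting the most recent values might be sketched as follows. The helper names `last_values` and `plot_trend` are hypothetical. The plotting helper uses plotext's `bar`, `title`, and `show` functions and imports the library lazily, so the selection helper also works where plotext is not installed.

```python
def last_values(values, count=5):
    """Return the most recent `count` values (fewer if not enough exist)."""
    return list(values)[-count:]


def plot_trend(labels, values, title):
    """Draw a terminal bar chart of one ratio across samples."""
    import plotext as plt  # lazy import: only needed when plotting

    plt.bar(labels, values)
    plt.title(title)
    plt.show()
```

For example, `plot_trend(last_values(names), last_values(ratios), "Carbonyl index")` would chart the five latest samples, where `names` and `ratios` are placeholder lists.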

    USER STORY

    As a user, I would like to be able to visualize my acquired data.

    IMPLEMENTATION

  • A loop checks how many new spectra were added and then processes each newly added spectrum in turn. It is limited by the API's query limit (see Bugs and fixes): between 3 and 10 spectra can be analyzed at once. Important: do not add more than 10 new spectra to the data set in one go, as this increases the chance of a crash. The application has only been tested with up to 10 samples, and a crash sometimes occurred when adding 10 spectra.

    Error Handling

    Error handling was implemented throughout the application using try/except statements to handle exceptions raised for things like NaN values and wrong data input, for example when the data input has too many or too few values.

    To limit API quota problems, a 10-second delay per sample was added.
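The per-sample delay can be sketched with the standard `time` module. The function name `process_with_delay` and the injectable `sleep` parameter are assumptions added for illustration. In this sketch the pause happens between samples rather than before the first one, which may differ from the project's loop.

```python
import time


def process_with_delay(spectra, process, delay_seconds=10, sleep=time.sleep):
    """Process each spectrum, pausing between samples to spare the API quota."""
    results = []
    for index, spectrum in enumerate(spectra):
        if index > 0:
            # Wait before every sample except the first one.
            sleep(delay_seconds)
        results.append(process(spectrum))
    return results
```

Passing `sleep` as a parameter keeps the loop easy to test, since a test can substitute a function that records the requested delays instead of waiting.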

    Features Left to Implement

    As a future enhancement, I would like to add basic functionality for input via an Excel file that would be uploaded and read, for easier input. Secondly, instead of writing output to the terminal, it would be beneficial to have output as a PDF.

    Logical Flow

    flow-chart of application:

    flow chart

    Technologies

    • Python
    • Python was the main language used to build the application.

      Python packages used:

      • NumPy library for data integration
      • plotext library for graph plotting in the terminal
      • API to Google Sheets for user input
        • gspread
        • google.oauth2.service_account
      • The time module was imported to pause the code for 10 seconds so as not to overload the API

      Testing

      Testing results

      Error Handling
      1. Confirmation check:

        The code verifies if the user has pressed the 'x' key for program execution. If 'x' is not pressed, a ValueError is raised with the message "Please press 'x' if you want to run the program."

      2. Data addition check:

        If the confirmation variable is 'x', the code proceeds to check if new data has been added. It compares the values of old_data and new_data. If old_data is less than new_data, the code performs data processing operations. If old_data is greater than new_data, it raises a ValueError with the message "Did you remove a row from the raw data file? Please remove the same rows from the integrated_Data sheet?" If old_data is equal to new_data, it raises a ValueError with the message "Did you add new data?"

      3. Data correctness check:

        Within the data processing logic, several checks are performed on the data. These include:

        • Ensuring the absence of any strings in the data. If any strings are found, they are replaced with an empty string.
        • Removing commas from the data, if present.
        • Converting the data values to floating-point format.
        • Calculating the lengths of each data set and storing them in the lenght_data list.
        • Comparing the lengths with the initial length (lenght_data[0]). If any lengths differ, a ValueError is raised with the appropriate message.
      4. Exception handling:

        If any ValueError exception occurs within the try block, it is caught in the except ValueError block. The caught exception is assigned to the variable e, and an error message is printed, indicating the nature of the invalid data. The function then returns False.
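The data addition and data correctness checks described above can be sketched as follows. This is a simplified illustration, not the project's code: the function names are hypothetical, the length-mismatch message is invented, and non-numeric entries are rejected by `float()` here rather than first being replaced with an empty string. The identifier `lenght_data` is kept as spelled in the description.

```python
def count_new_rows(old_data, new_data):
    """Compare row counts and return how many new rows were added."""
    if old_data > new_data:
        raise ValueError(
            "Did you remove a row from the raw data file? "
            "Please remove the same rows from the integrated_Data sheet?"
        )
    if old_data == new_data:
        raise ValueError("Did you add new data?")
    return new_data - old_data


def clean_spectrum(raw_values):
    """Strip thousands separators and convert every entry to float."""
    return [float(str(value).replace(",", "")) for value in raw_values]


def validate_spectra(raw_spectra):
    """Return True if all spectra are numeric and equally long, else False."""
    try:
        spectra = [clean_spectrum(values) for values in raw_spectra]
        lenght_data = [len(values) for values in spectra]
        if any(length != lenght_data[0] for length in lenght_data):
            # Hypothetical message; the project's wording is not shown.
            raise ValueError("All spectra must contain the same number of values")
    except ValueError as e:
        print(f"Invalid data: {e}, please try again.")
        return False
    return True
```

Returning False instead of re-raising lets the calling code ask the user to correct the sheet and run the check again.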

      Pep8 Validation

      All Python code was run through the pep8online.com validator and any warnings or errors were fixed. The code then validated successfully.

      In Gitpod, the linter displayed a warning that a string statement has no effect at line 6, but this string is mainly used for documentation. It also reported unused variables, because one function returns 4 outputs and different variables are used in different functions; however, all variables are used. These warnings are therefore ignored.

      No Error PEP8

      Bugs and fixes

      There was a problem with the values of the data integration caused by the , separator in values larger than a thousand. A fix was implemented that removes the , separator so the values can be converted into floats.

      unfixed bugs

      Because of the API, the loop can process a maximum of 3 to 10 spectra at once, depending on external factors such as the changes you made in the file. This is a limitation of using a free API, which has a requests-per-100-seconds-per-user limit.

      Deployment

      Version Control

      The site was created using the Visual Studio Code editor and pushed to GitHub to the remote repository ‘history’.

      The following git commands were used throughout development to push code to the remote repo:

      git add - This command was used to add the file(s) to the staging area before they are committed.

      git commit -m "commit message" - This command was used to commit changes to the local repository queue ready for the final step.

      git push - This command was used to push all committed code to the remote repository on GitHub.

      Heroku Deployment

      The below steps were followed to deploy this project to Heroku:
      • Go to Heroku and click "New" to create a new app.
      • Choose an app name and region, then click "Create app".
      • Go to "Settings" and navigate to Config Vars. Add the following config variables:
      • PORT : 8000
      • Navigate to Buildpacks and add buildpacks for Python and NodeJS (in that order).
      • Navigate to "Deploy". Set the deployment method to GitHub, enter the repository name and connect.
      • Scroll down to Manual Deploy, select the "main" branch and click "Deploy Branch".
      • The app will now be deployed to Heroku.
      • link to application

      Clone Locally

      • Open your IDE of choice (git must be installed for the next steps).
      • Type the following into the IDE terminal:
      • git clone https://github.com/Frangidha/Data-Processing-Template.git
      • The project will now have been cloned to your local machine for use.

      Credits

      StackOverflow

      was used for certain bug fixes encountered during the programming process, to find out what the problem was and get guidance on a solution.

      Core Science Resources(link)

      The spectra used for this application were obtained by Henri S. Tapp, Marianne Defernez, and E. Katherine Kemsley.

      Code Institute

      The Code Institute curriculum was used to develop the entire application. In particular, the love-sandwiches project was a great inspiration for finding out how to connect the file to Google Sheets. link.
