jacj9/jobguardia

I. Introduction

Jobguardia is designed to help job seekers identify potentially fraudulent job postings. By analyzing job descriptions and job URLs, the tool provides a probability assessment of a job posting's legitimacy. The tool is publicly accessible to assist in evaluating job opportunities.

II. User Guide

2.1 Accessing the Tool

Users can access the Jobguardia web application directly at the following URL: https://jobguardia-92336f8e6d13.herokuapp.com/ From there, pick a job posting you want to check and submit it for analysis.

2.2 Analyzing Job Postings

To analyze a job posting, follow these steps:

2.2.1 Input Methods

Choose the input method by selecting either "Job Description" or "Job URL."

Job Description

If you select "Job Description," copy the complete job description text from the job posting (including details about the company, responsibilities, and requirements) and paste it into the provided text box.

Example:

[Screenshot: Job Description input example]

Job URL

If you select "Job URL," copy the full web address (URL) of the job posting from the website where it is listed and paste it into the provided text box.

Example:

[Screenshot: Job URL input example]

2.2.2 Analyze

Click the "Analyze" button to initiate the analysis. The tool will process the input and display the results below the buttons.

Examples:

[Screenshot: analysis result for a job description]

[Screenshot: analysis result for a job URL]

2.3 Understanding the Output

The tool provides a probability percentage indicating whether the job posting is likely fraudulent or legitimate.

  • A higher percentage (e.g., 80-100%) labeled as "Likely Fraudulent" suggests a strong possibility that the job posting is not genuine.

  • A lower percentage (e.g., 0-20%) labeled as "Likely Legit" indicates a higher probability that the job posting is authentic.

  • Percentages in the mid-range (e.g., 40-60%) should be interpreted with caution, and users should conduct further investigation.
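As a reading aid, the bands above can be sketched in Python. This is only an illustration: the function name `interpret_score` is hypothetical, and treating everything between 20% and 80% as uncertain is an assumption, since this guide gives example ranges rather than the tool's actual cutoffs.

```python
def interpret_score(fraud_percent: float) -> str:
    """Map a fraud probability (0-100) to the guidance in section 2.3.

    Assumption: the 20 and 80 cutoffs come from the example ranges in
    this guide, not from the tool's source code.
    """
    if not 0 <= fraud_percent <= 100:
        raise ValueError("fraud_percent must be between 0 and 100")
    if fraud_percent >= 80:
        return "Likely Fraudulent: strong possibility the posting is not genuine"
    if fraud_percent <= 20:
        return "Likely Legit: higher probability the posting is authentic"
    # Everything in between is treated as inconclusive (assumption).
    return "Uncertain: interpret with caution and investigate further"


if __name__ == "__main__":
    for score in (92.5, 50.0, 8.0):
        print(score, "->", interpret_score(score))
```

Whatever the exact cutoffs are, a score is a prompt for further checking, not a verdict.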

2.4 Using the "Clear" Button

The "Clear" button erases the text in the input box and the output area. This allows users to easily analyze another job description or URL.

[Video: using the "Clear" button]

2.5 Troubleshooting

The most common error is the tool failing to analyze certain URLs, such as links from Google URL shorteners or Indeed job posts.

  • Some shortened URLs cannot be read because the AI has difficulty processing certain links.
  • To work around the error, paste the job description using the "Job Description" method instead of submitting the job URL. (Check the video below for guidance.)

2.5.1 URL Analysis Errors:

The tool may encounter errors when analyzing certain URLs, particularly those from URL shortening services (like Google Shortener) or Indeed job postings. This is sometimes due to the AI's difficulty in correctly processing these types of links.

Example Error Message: "Error: Could not extract job description from the URL. Please try again or enter a job summary."

Solution: To bypass this, use the "Job Description" method. Copy the job description text and paste it into the text box instead of using the URL.  

Demo:

[Video: working around a URL analysis error]
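To show why some links produce this error, here is a minimal Python sketch of text extraction from a fetched page. It is not Jobguardia's actual extraction code: the names `extract_job_text` and `min_chars` are hypothetical, and the 50-character threshold is an assumption. A shortener's redirect page, or a page that renders its content with JavaScript, often contains little or no visible text, so an extractor like this finds nothing to analyze.

```python
from html.parser import HTMLParser

# Error text quoted from this guide.
ERROR_MESSAGE = (
    "Error: Could not extract job description from the URL. "
    "Please try again or enter a job summary."
)


class _TextCollector(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def extract_job_text(html: str, min_chars: int = 50) -> str:
    """Return the visible text of a page, or raise if too little is found."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    text = " ".join(collector.parts)
    if len(text) < min_chars:  # hypothetical threshold
        raise ValueError(ERROR_MESSAGE)
    return text
```

When extraction fails like this, pasting the description text bypasses the page fetch entirely, which is why the "Job Description" method is the recommended fallback.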

2.5.2 Input Limitations:

The tool does not enforce a fixed character limit on job description input; the practical limit depends on the user's browser and the server. If a very long job description fails to submit, shorten it to focus on the core details.

2.5.3 Browser Compatibility:

The tool is optimized for use with the latest versions of Chrome, Firefox, and Safari. If you encounter issues, please try using a supported browser.

2.6 Frequently Asked Questions (FAQ)

Q: How accurate is Jobguardia?

  • The tool provides a probability-based assessment and is not a guarantee of a job posting's legitimacy. Always exercise caution and conduct your own thorough research.

Q: Is my data kept private?

  • The job descriptions and URLs you submit are processed to provide the analysis. Currently, we do not store your input data after the analysis is complete.

Q: Who do I contact for support?

  • For technical issues or feedback, please message us here on GitHub.

III. How the Tool Was Made

Data Acquisition and Preparation

The process started with finding available job datasets online to train the model. A LinkedIn job posts dataset was downloaded for this purpose.

  • Each data entry in the dataset was manually reviewed and labeled as either "fraudulent" or "legitimate."

  • The dataset was iteratively updated with more data to continuously refine the machine learning model's accuracy.

3.1 Data Training

The tool's machine learning model was trained using a dataset of job postings.  

  • Job data was collected from online sources, including LinkedIn, through a Bright Data scraping request.  
  • Each job posting in the dataset was manually reviewed and labeled as either "fraudulent" or "legitimate."  
  • The dataset was iteratively expanded and refined to improve the model's accuracy.
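A manually labeled dataset like the one described above could be stored as a simple two-column CSV. The sketch below is an assumption about the layout, not the project's actual file: the column names `description` and `label`, the helper `load_labeled_postings`, and both sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; not taken from the real training data.
SAMPLE_CSV = """description,label
"Earn $5000 weekly from home. Send a registration fee to start.",fraudulent
"Software engineer to maintain internal billing services.",legitimate
"""

VALID_LABELS = {"fraudulent", "legitimate"}


def load_labeled_postings(handle):
    """Read (description, label) pairs and reject unknown labels."""
    rows = []
    for row in csv.DictReader(handle):
        label = row["label"].strip().lower()
        if label not in VALID_LABELS:
            raise ValueError(f"unexpected label: {label!r}")
        rows.append((row["description"].strip(), label))
    return rows


postings = load_labeled_postings(io.StringIO(SAMPLE_CSV))
# Checking class balance after each labeling round helps spot skew.
label_counts = Counter(label for _, label in postings)
print(label_counts)
```

Validating labels at load time catches typos made during manual review before they reach training.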

3.2 Backend (Flask Application)

The machine learning model was developed using Jupyter Notebook and trained with the job dataset.

  • The process involved cleaning and vectorizing the text data to make it suitable for the model.  
  • A Random Forest Classifier algorithm was used to train the model.  
  • Security measures, including the use of the bleach library, were implemented to sanitize user input and help prevent cross-site scripting (XSS) vulnerabilities.  
  • The Flask framework was used to create the web application and handle communication between the frontend and backend.
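The vectorize-then-classify steps above can be sketched with scikit-learn. This is a minimal sketch under stated assumptions, not the project's notebook: the README names the Random Forest Classifier but not the vectorizer, so TF-IDF is an assumption, and the six training texts, the hyperparameters, and the helper `fraud_percent` are made up for illustration.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline

# Hypothetical toy data; a real model needs far more labeled postings.
train_texts = [
    "Earn money fast from home, pay a small registration fee today",
    "No interview needed, send your bank details to receive payment",
    "Urgent hiring, wire a deposit to secure your remote position",
    "Backend engineer to maintain billing services, five years experience",
    "Registered nurse for night shifts at a regional hospital",
    "Accountant to prepare monthly financial statements and audits",
]
train_labels = ["fraudulent"] * 3 + ["legitimate"] * 3

model = Pipeline([
    # Assumed vectorizer: lowercases text and drops English stop words.
    ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english")),
    ("clf", RandomForestClassifier(n_estimators=100, random_state=0)),
])
model.fit(train_texts, train_labels)


def fraud_percent(text: str) -> float:
    """Return the predicted probability of fraud as a percentage."""
    classes = list(model.named_steps["clf"].classes_)
    proba = model.predict_proba([text])[0]
    return float(round(100 * proba[classes.index("fraudulent")], 1))


print(fraud_percent("Pay a fee now to start working from home"))
```

Looking up the "fraudulent" column through `classes_` avoids relying on the order in which scikit-learn happens to sort the labels.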

3.3 Frontend (HTML, CSS, JavaScript)

The frontend provides the user interface for interacting with the tool.  

  • Radio buttons allow users to select the input method (Job Description or Job URL).  
  • Flask facilitates the interaction between the frontend and backend, enabling the display of the analysis results.  
  • The "Analyze" and "Clear" buttons allow users to submit job postings for analysis and clear the input/output, respectively.
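The exchange between the frontend and Flask might look roughly like the sketch below. Everything specific in it is an assumption: the `/analyze` route, the `job_description` JSON field, the validation message, and the `score_posting` stub (which stands in for the trained model) are hypothetical names, and the real application may exchange form data instead of JSON. The `bleach.clean` call strips HTML tags from the input before it is processed.

```python
import bleach
from flask import Flask, jsonify, request

app = Flask(__name__)


def score_posting(text: str) -> float:
    """Stub standing in for the trained model; returns a fixed value."""
    return 50.0


@app.route("/analyze", methods=["POST"])  # hypothetical route name
def analyze():
    payload = request.get_json(silent=True) or {}
    raw = str(payload.get("job_description", ""))
    # Allow no tags at all and strip any markup the user pasted in.
    clean = bleach.clean(raw, tags=set(), strip=True).strip()
    if not clean:
        # Hypothetical validation message.
        return jsonify(error="Please enter a job description."), 400
    return jsonify(fraud_percent=score_posting(clean))


if __name__ == "__main__":
    app.run(debug=False)
```

In a layout like this, the "Analyze" button would send the input to the route and render the returned percentage, while "Clear" only resets the page and needs no server call.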

IV. Data and Model

The machine learning model is based on the Random Forest Classifier algorithm.

  • The model can be retrained with new data to enhance its performance and adapt to evolving trends in job scams.

Disclaimer

Jobguardia is a job scam detection tool intended to assist in evaluating job postings; its results may not always be accurate. It is crucial to independently verify the legitimacy of any job opportunity and exercise caution before making any decisions.

About

ML-Powered Job Scam Detection Tool
