Backend
The backend uses a public phishing website dataset taken from the UCI repository.
Download the dataset and save it as dataset.arff.
The preprocess.py script loads the ARFF file and converts it to a NumPy array. It then prints the dataset metadata and splits the dataset into training and testing sets, with 30% of the samples held out for testing.
Change the working directory to /backend/dataset and run the preprocessor with
python3 preprocess.py
The training and testing data are saved as *.npy files in the working directory.
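The preprocessing step described above can be sketched roughly as follows. This is not the actual preprocess.py, only a minimal illustration under stated assumptions: the ARFF file is dense and contains only integer values, the class label is the last column, and the output file names are hypothetical. The helper names load_arff_rows, split_dataset and preprocess are made up for this sketch. scikit-learn's train_test_split would do the same job as the hand-written split.

```python
import numpy as np


def load_arff_rows(path):
    """Read the rows of the @data section of a dense, all-integer ARFF file."""
    rows = []
    in_data = False
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("%"):
                continue  # skip blank lines and ARFF comments
            if line.lower().startswith("@data"):
                in_data = True  # everything after this marker is a data row
                continue
            if in_data:
                rows.append([int(v) for v in line.split(",")])
    return np.array(rows)


def split_dataset(data, test_fraction=0.3, seed=0):
    """Shuffle the rows, then split them into training and testing parts."""
    rng = np.random.default_rng(seed)
    shuffled = data[rng.permutation(len(data))]
    n_test = int(round(len(shuffled) * test_fraction))
    test, train = shuffled[:n_test], shuffled[n_test:]
    # Assumption: the class label is the last column of each row.
    return train[:, :-1], test[:, :-1], train[:, -1], test[:, -1]


def preprocess(arff_path="dataset.arff"):
    """Load the dataset, print basic metadata, and save the four splits."""
    data = load_arff_rows(arff_path)
    print("rows, columns:", data.shape)
    X_train, X_test, y_train, y_test = split_dataset(data)
    # Hypothetical output names; the real script's *.npy names may differ.
    np.save("X_train.npy", X_train)
    np.save("X_test.npy", X_test)
    np.save("y_train.npy", y_train)
    np.save("y_test.npy", y_test)
```

Calling preprocess() from a directory that contains dataset.arff would write the four *.npy files next to it. The fixed seed only makes the split reproducible between runs.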
A RandomForestClassifier (an ensemble learner) is fitted on the training set, and then the accuracy and cross-validation scores are printed.
The parameters of the learned model, such as the number of estimators and the tree parameters of each estimator (for example, the split thresholds), are dumped to a file named classifier.json.
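As a rough illustration of the training and export step, the sketch below fits a scikit-learn RandomForestClassifier, prints the scores, and writes the per-tree arrays to JSON. It is not the actual training.py: the JSON layout, the function names tree_to_dict, forest_to_dict and train_and_dump, and the choices n_estimators=10 and cv=5 are all assumptions made for the example.

```python
import json

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score


def tree_to_dict(estimator):
    """Collect the arrays that define one fitted decision tree."""
    tree = estimator.tree_
    return {
        "feature": tree.feature.tolist(),        # split feature per node (-2 at leaves)
        "threshold": tree.threshold.tolist(),    # split threshold per node
        "children_left": tree.children_left.tolist(),    # -1 at leaves
        "children_right": tree.children_right.tolist(),  # -1 at leaves
        "value": tree.value.tolist(),            # per-node class distribution
    }


def forest_to_dict(clf):
    """Hypothetical JSON layout: one entry per tree in the forest."""
    return {
        "n_estimators": len(clf.estimators_),
        "classes": clf.classes_.tolist(),
        "estimators": [tree_to_dict(est) for est in clf.estimators_],
    }


def train_and_dump(X_train, y_train, X_test, y_test, out_path="classifier.json"):
    """Fit the forest, print its scores, and dump its parameters as JSON."""
    clf = RandomForestClassifier(n_estimators=10, random_state=0)
    clf.fit(X_train, y_train)
    print("accuracy:", clf.score(X_test, y_test))
    # Assumption: cross-validation is run on the training split with 5 folds.
    print("cross-validation scores:", cross_val_score(clf, X_train, y_train, cv=5))
    with open(out_path, "w") as f:
        json.dump(forest_to_dict(clf), f)
    return clf
```

Exporting plain arrays rather than a pickled model keeps the file readable from non-Python code, which is presumably why the model is dumped as JSON for the plugin.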
Change the working directory to /backend/classifier and run
python3 training.py
classifier.json is created in the working directory.
Serve this classifier.json over HTTP and update the URL in the plugin settings.
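For local testing, one way to serve the generated classifier file is Python's built-in http.server module. The sketch below is only a minimal example, not part of the project: the names CORSRequestHandler and make_server, the port 8000, and the permissive Access-Control-Allow-Origin header are assumptions (the header is included on the guess that the plugin fetches the file from a different origin).

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class CORSRequestHandler(SimpleHTTPRequestHandler):
    """Static file handler that also allows cross-origin requests."""

    def end_headers(self):
        # Assumption: the plugin requests the file from another origin.
        self.send_header("Access-Control-Allow-Origin", "*")
        super().end_headers()


def make_server(directory=".", port=8000):
    """Build a server that serves files from `directory` on localhost."""
    handler = functools.partial(CORSRequestHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Running make_server("/backend/classifier").serve_forever() would make the files in that directory available under http://127.0.0.1:8000/, and the URL of the classifier file is what goes into the plugin settings. For anything beyond local testing, a proper web server or static hosting is the safer choice.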