alexanderbinnekamp/Master-Thesis-Documents

Contents: 

Linear_svm_bow_training.ipynb
	Jupyter notebook with the code for training the linear SVM bag-of-words (BoW) model

Linear_svm_wordembedding_training.ipynb
	Jupyter notebook with the code for training the linear SVM word embedding model

Preproccessed_documents.zip
	Corpus of all legal documents after preprocessing (tokenized into sentences, commas removed)

Translated_documents.zip
	Corpus of all legal documents, preprocessed and translated into English

analysis_of_results.ipynb
	Jupyter notebook with the code for applying the trained model to the entire corpus and analysing the results

csv_annotated_doc_sample.csv
	Annotated training set in CSV format

dataset_training.xlsx
	Annotated training set in XLSX format

json_annotated_docs_sample.json
	Annotated training set in JSON format

json_to_csv_export.ipynb
	Jupyter notebook with the code for converting the annotated training set from JSON format to CSV format

preprocessing_jupyter_notebook
	Jupyter notebook with the code for the preprocessing steps

translation_jupyter_notebook
	Jupyter notebook with the code for the translation steps
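
To illustrate the kind of conversion that json_to_csv_export.ipynb performs, here is a minimal sketch using only the Python standard library. It is not the notebook's actual code. The record layout (a JSON array of objects with `text` and `label` keys), the function name `json_to_csv`, and the sample sentences are all assumptions made for illustration; the real schema of json_annotated_docs_sample.json may differ.

```python
import csv
import io
import json

# Hypothetical column names; the real annotation export may use other keys.
FIELDS = ["text", "label"]


def json_to_csv(json_text: str) -> str:
    """Convert a JSON array of annotation records into CSV text."""
    records = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=FIELDS,
        extrasaction="ignore",  # silently drop keys that are not in FIELDS
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        # Missing keys become empty cells instead of raising an error.
        writer.writerow({field: record.get(field, "") for field in FIELDS})
    return buffer.getvalue()


# Made-up sample records, only to show the shape of the input and output.
sample = json.dumps([
    {"text": "The court dismissed the appeal.", "label": "decision"},
    {"text": "The claimant lives in Utrecht, the Netherlands.", "label": "fact"},
])
csv_text = json_to_csv(sample)
print(csv_text)
```

The `csv` module quotes fields that contain commas, so sentence text survives the round trip even when it includes punctuation.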

NOTE: The trained models (.pkl) and the final dataset (.csv) are not included in this repository because the files are too large (>25 MB).
They are available on OSF instead: https://osf.io/2uctj/?view_only=b996ceeb74cb4194ae1406d0b4c69333
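
Once a .pkl file has been downloaded from OSF, it can be loaded with Python's `pickle` module. The sketch below is an assumption about how that might be done, not code from the notebooks: the helper `load_model` and the file name in the usage comment are hypothetical. Unpickling also requires that the library used to train the model is installed, ideally in a compatible version.

```python
import pickle
from pathlib import Path


def load_model(path):
    """Load a pickled object (for example a trained classifier) from disk."""
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


# Usage (hypothetical file name; use the actual name of the downloaded file):
# model = load_model("linear_svm_bow_model.pkl")
```

Only unpickle files from a source you trust, since loading a pickle can execute arbitrary code.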

About

Storage of all raw and edited documents related to my Master Thesis
