mark-antal-csizmadia/finding-similar-items-textually-similar-documents

finding-similar-items-textually-similar-documents

Setup

Create a conda virtual environment from the environment.yml file and make it available as a kernel in Jupyter Notebook. Then start Jupyter Notebook, select that environment as the kernel, and run the main.ipynb notebook.

Reproducibility

The PYTHONHASHSEED environment variable is set to a fixed value so that Python's built-in hash() function returns the same values from one run to the next. Also pass a seed to the MinHashing class constructor, because min-hashing draws its random values with numpy.random.randint().
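The two sources of randomness above can be illustrated with a minimal sketch. It is not the repository's MinHashing class: the function names `string_hash_in_subprocess` and `minhash_signature`, the linear hash family, and the choice of prime are all assumptions made for illustration. The first function shows that hash() on a string is stable across interpreter runs once PYTHONHASHSEED is fixed (the variable has to be set before the interpreter starts); the second shows that a seeded NumPy generator yields the same min-hash signature every time.

```python
import os
import subprocess
import sys

import numpy as np


def string_hash_in_subprocess(text, hash_seed):
    """Return hash(text) computed by a fresh interpreter with PYTHONHASHSEED set."""
    # PYTHONHASHSEED is read at interpreter start-up, so a new process is used.
    env = dict(os.environ, PYTHONHASHSEED=str(hash_seed))
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


def minhash_signature(shingle_ids, n_hashes, seed, prime=2147483647):
    """Hypothetical min-hash over a non-empty set of non-negative integer shingle ids.

    Uses the assumed hash family h_i(x) = (a_i * x + b_i) mod prime and keeps,
    for each h_i, the minimum value over the set.
    """
    rng = np.random.RandomState(seed)  # seeded, so a and b are reproducible
    a = rng.randint(1, prime, size=n_hashes, dtype=np.int64)
    b = rng.randint(0, prime, size=n_hashes, dtype=np.int64)
    # Reduce ids modulo the prime so that a * x stays within int64 range.
    x = np.asarray(sorted(shingle_ids), dtype=np.int64) % prime
    hashed = (a[:, None] * x[None, :] + b[:, None]) % prime
    return hashed.min(axis=1)


if __name__ == "__main__":
    # Same PYTHONHASHSEED in two separate processes gives the same string hash.
    print(string_hash_in_subprocess("shingle", 0) == string_hash_in_subprocess("shingle", 0))
    # Same seed gives the same signature.
    print(np.array_equal(minhash_signature({3, 17, 42}, 8, seed=1),
                         minhash_signature({3, 17, 42}, 8, seed=1)))
```

Without a fixed PYTHONHASHSEED, string hashes generally differ between interpreter runs, so shingles hashed with hash() would map to different integers each time the notebook kernel restarts.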

Results
