Welcome! This Flask app is a small-scale information retrieval system built with Python and HTML templates rendered by Jinja2.
The system searches a collection of plain text documents in the docs folder and retrieves them using two models: a Boolean model and a vector model. It comes preloaded with 200 documents on the topic of aeronautics. You can provide extra documents yourself, as long as they are plain text.
Make sure you have the required libraries installed:
- navigate to the project folder
- run pip install -r requirements.txt

To start searching, start the app with: flask run
Example queries you can try:
- Engineer
- What is a wing?
- Lift on aeroplanes

The two retrieval models work as follows:
- Boolean model: query terms are treated as a Boolean AND query, and results are ordered by PageRank values.
- Vector model: queries and documents are treated as vectors, and results are ordered by cosine similarity.
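
As a rough illustration of the two ranking strategies described above, here is a minimal sketch. It is not the code used in this app: the function names, the tokenizer, the toy documents, and the PageRank values are all hypothetical, and the vectors use raw term counts as an assumed weighting (the actual system may weight terms differently).

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def boolean_and_search(query, docs, pagerank):
    """Return ids of documents containing every query term, highest PageRank first."""
    terms = set(tokenize(query))
    if not terms:
        return []
    matches = [doc_id for doc_id, text in docs.items()
               if terms <= set(tokenize(text))]
    return sorted(matches, key=lambda d: pagerank.get(d, 0.0), reverse=True)


def cosine_search(query, docs):
    """Return (doc_id, score) pairs ordered by cosine similarity to the query."""
    q_vec = Counter(tokenize(query))
    q_norm = math.sqrt(sum(c * c for c in q_vec.values()))
    results = []
    for doc_id, text in docs.items():
        d_vec = Counter(tokenize(text))
        d_norm = math.sqrt(sum(c * c for c in d_vec.values()))
        dot = sum(q_vec[t] * d_vec[t] for t in q_vec)
        if dot > 0:  # skip documents sharing no term with the query
            results.append((doc_id, dot / (q_norm * d_norm)))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy collection and PageRank values, for illustration only.
docs = {
    "a.txt": "The wing generates lift",
    "b.txt": "Lift on aeroplanes depends on wing shape and speed",
    "c.txt": "The engineer inspects the engine",
}
pagerank = {"a.txt": 0.2, "b.txt": 0.5, "c.txt": 0.3}

print(boolean_and_search("wing lift", docs, pagerank))
print(cosine_search("lift on aeroplanes", docs))
```

In this toy collection both queries rank b.txt ahead of a.txt: the Boolean search because b.txt has the higher assumed PageRank value, the vector search because b.txt shares more terms with the query.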
These features are not implemented in the current system:
- the user can set a minimum recommendation threshold (e.g., 0.6, or a level on a scale such as high, medium, or low). Only documents with a similarity greater than or equal to the threshold are displayed in the results.
- it can deal with structured documents, such as HTML pages.
- it shows metadata of documents in the ranked output.
- it implements some non-trivial form of personalization.
- visualization of the graph with the PageRank values.
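
For anyone who wants to add the threshold feature from the list above, one possible starting point is sketched below. This is only an assumption about how it could work: the function name, the `LEVELS` mapping, and its numeric values are hypothetical and would need tuning against real similarity scores.

```python
# Hypothetical mapping from named levels to numeric thresholds (assumed values).
LEVELS = {"low": 0.2, "medium": 0.4, "high": 0.6}


def filter_by_threshold(results, threshold):
    """Keep (doc_id, score) pairs whose score is at least the threshold.

    `threshold` is either a number or one of the level names in LEVELS.
    """
    if isinstance(threshold, str):
        threshold = LEVELS[threshold.lower()]
    return [(doc_id, score) for doc_id, score in results if score >= threshold]


# Made-up ranked output, for illustration only.
ranked = [("b.txt", 0.70), ("a.txt", 0.29), ("c.txt", 0.05)]
print(filter_by_threshold(ranked, 0.6))    # [('b.txt', 0.7)]
print(filter_by_threshold(ranked, "low"))  # [('b.txt', 0.7), ('a.txt', 0.29)]
```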