|
1 |
| -## Viewers |
| 1 | +# Viewers |
2 | 2 |
|
3 |
| -Module ```viewers``` provides information from a topic model allowing to estimate the model quality. Its advantage is in unified call ifrastucture to the topic model making the routine and tedious task of extracting the information easy. |
| 3 | +The `viewers` module extracts information from a topic model that helps to estimate the model quality. |
| 4 | +Its advantage is a unified call infrastructure for the topic model, which makes the routine and tedious task of extracting this information easy. |
4 | 5 |
|
5 |
| -Currently module contains following viewers: |
| 6 | +Currently the module contains the following viewers: |
6 | 7 |
|
| 8 | +* `base_viewer` (`BaseViewer`) — module responsible for base infrastructure. |
7 | 9 |
|
8 |
| -* ```base_viewer``` - module responsible for base infrastructure |
| 10 | +* `document_cluster` (`DocumentClusterViewer`) — module that visualizes the documents of a collection. It may be slow for large document collections because it uses the TSNE algorithm from the [sklearn](https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html) library. |
9 | 11 |
|
10 |
| -* ```spectrum``` - module contains heuristics for solving TSP to arrange topics minimizing total distance of the spectrun, |
11 |
| -```TopicSpectrumViewer``` |
| 12 | +<p> |
| 13 | + <div align="center"> |
| 14 | + <img src="../docs/images/doc_cluster__plot.png" width="80%" alt/> |
| 15 | + </div> |
| 16 | + <em> |
| 17 | + Visualisation of reduced document embeddings, colored by topic, produced by DocumentClusterViewer. |
| 18 | + </em> |
| 19 | +</p> |
12 | 20 |
|
13 |
| -* ```top_documents_viewer``` - module with functions that work with dataset document collections and class ```TopDocumentsViewer``` wrapping this functionality. |
| 21 | +* `spectrum` (`TopicSpectrumViewer`) — module contains heuristics for solving TSP to arrange topics minimizing total distance of the spectrum. |
14 | 22 |
|
15 |
| -* ```top_similar_documents_viewer``` - module containing class for finding simmilar document for a givenone. This viewer helps to estimate homogeneity of clusters given by the model |
| 23 | +<p> |
| 24 | + <div align="center"> |
| 25 | + <img src="../docs/images/topic_spectrum__refined_view.png" width="80%" alt/> |
| 26 | + </div> |
| 27 | + <em> |
| 28 | + Each point on the plot represents a topic. |
| 29 | + The viewer computes a route between topics in which each topic is connected to a similar one, and so on, until the route closes into a circle. |
| 30 | + </em> |
| 31 | +</p> |
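To make the idea behind the `spectrum` entry concrete, here is a minimal sketch of one simple TSP heuristic, the greedy nearest-neighbor ordering. It is not the code of `TopicSpectrumViewer`, and the source does not say which heuristics the module uses; the function names and the 2-D topic coordinates are made up for illustration.

```python
import math


def nearest_neighbor_order(points):
    """Greedy ordering: start at index 0, then keep jumping to the closest unvisited point."""
    if not points:
        return []
    order = [0]
    unvisited = set(range(1, len(points)))
    while unvisited:
        last = points[order[-1]]
        # Ties are broken by index so the result is deterministic.
        nxt = min(unvisited, key=lambda i: (math.dist(last, points[i]), i))
        order.append(nxt)
        unvisited.remove(nxt)
    return order


def route_length(points, order):
    """Length of the closed route, including the edge back to the starting point."""
    return sum(
        math.dist(points[order[k]], points[order[(k + 1) % len(order)]])
        for k in range(len(order))
    )


# Hypothetical topic coordinates (illustration only).
topics = [(0.0, 0.0), (5.0, 5.0), (1.0, 0.0), (5.0, 4.0)]
order = nearest_neighbor_order(topics)
print(order, route_length(topics, order))
```

The greedy route is usually not optimal, but it places similar topics next to each other, which is the property the plot relies on.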
16 | 32 |
|
17 |
| -* ```top_tokens_viewer``` - module with class for displaying the most relevant tokens in each topic of the model. |
| 33 | +* `top_documents_viewer` (`TopDocumentsViewer`) — module with functions that work with dataset document collections. |
18 | 34 |
|
19 |
| -* ```topic_mapping``` - module allowing to compare topics between two different models trained on the same collection. |
| 35 | +<p> |
| 36 | + <div align="center"> |
| 37 | + <img src="../docs/images/top_doc__view.png" width="80%" alt/> |
| 38 | + </div> |
| 39 | + <em> |
| 40 | + The viewer shows fragments of the top documents corresponding to a topic. |
| 41 | + </em> |
| 42 | +</p> |
20 | 43 |
|
21 |
| -* ```document_cluster``` - module containing class ```DocumentClusterViewer``` which allows to visualize collection documents. May be slow for large document collections as it uses TSNE algorithm from sklearn library. |
| 44 | +* `top_similar_documents_viewer` (`TopSimilarDocumentsViewer`) — module containing a class for finding documents similar to a given one. This viewer helps to estimate the homogeneity of clusters given by the model. |
22 | 45 |
|
| 46 | +<p> |
| 47 | + <div align="center"> |
| 48 | + <img src="../docs/images/top_sim_doc__refined_view.png" width="80%" alt/> |
| 49 | + </div> |
| 50 | + <em> |
| 51 | + A document from the text collection (on top) and the documents nearest to it according to the topic model. |
| 52 | + The viewer currently outputs only document names, but a picture like this is not difficult to produce from them. |
| 53 | + </em> |
| 54 | +</p> |
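As an illustration of what "documents nearest to it" can mean, the sketch below ranks documents by cosine similarity of their topic distributions. The metric, the function names and the toy data are assumptions; the source does not state how `TopSimilarDocumentsViewer` measures similarity.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(doc_id, doc_topics, k=2):
    """Return the names of the k documents whose topic vectors are closest to doc_id."""
    query = doc_topics[doc_id]
    scored = [
        (cosine_similarity(query, vec), other)
        for other, vec in doc_topics.items()
        if other != doc_id
    ]
    # Highest similarity first; ties are broken by document name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:k]]


# Hypothetical document-topic distributions (illustration only).
doc_topics = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.8, 0.2, 0.0],
    "doc_c": [0.0, 0.1, 0.9],
    "doc_d": [0.1, 0.8, 0.1],
}
nearest = most_similar("doc_a", doc_topics, k=2)
print(nearest)
```

Like the viewer described above, the sketch returns only document names; the neighbors of a document should mostly share its dominant topic if the clusters are homogeneous.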
23 | 55 |
|
| 56 | +* `top_tokens_viewer` (`TopTokensViewer`) — module with class for displaying the most relevant tokens in each topic of the model. |
24 | 57 |
|
25 |
| -* ```initial_doc_to_topic_viewer``` - first edition of ```TopDocumentsViewer``` - Deprecated |
| 58 | +<p> |
| 59 | + <div align="center"> |
| 60 | + <img src="../docs/images/top_tokens__view.png" width="80%" alt/> |
| 61 | + </div> |
| 62 | + <em> |
| 63 | + Output of the TopTokensViewer. A score within the topic is calculated for every token; the score function can be specified when the viewer is initialized. |
| 64 | + </em> |
| 65 | +</p> |
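The ranking described in the caption can be sketched in a few lines. This is an illustration under assumptions, not the `TopTokensViewer` API: `top_tokens`, `num_top` and `score_fn` are hypothetical names, and the token probabilities are invented.

```python
import math


def top_tokens(token_probs, num_top=2, score_fn=lambda p: p):
    """For every topic, score each token with score_fn and keep the num_top best ones."""
    result = {}
    for topic, probs in token_probs.items():
        scored = {token: score_fn(p) for token, p in probs.items()}
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(scored, key=lambda token: (-scored[token], token))
        result[topic] = ranked[:num_top]
    return result


# Hypothetical token probabilities per topic (illustration only).
token_probs = {
    "topic_0": {"cat": 0.30, "dog": 0.25, "fish": 0.05, "tree": 0.01},
    "topic_1": {"tree": 0.40, "leaf": 0.35, "root": 0.20, "cat": 0.02},
}
by_probability = top_tokens(token_probs, num_top=2)
by_log_probability = top_tokens(token_probs, num_top=2, score_fn=math.log)
print(by_probability)
```

Passing the score function as an argument mirrors the idea that the scoring can be chosen up front. A monotonic function such as `math.log` leaves the ranking unchanged, while a score that depends on the other topics would need a different signature.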
26 | 66 |
|
27 |
| -* ```tokens_viewer``` - first edition of ```TopTokensViewer``` - Deprecated |
| 67 | +* `topic_mapping` (`TopicMapViewer`) — module for comparing topics between two different models trained on the same collection. |
| 68 | + |
| 69 | +<p> |
| 70 | + <div align="center"> |
| 71 | + <img src="../docs/images/topic_map__view.png" width="80%" alt/> |
| 72 | + </div> |
| 73 | + <em> |
| 74 | + The mapping between topics of two models (currently only topic names are displayed). |
| 75 | + </em> |
| 76 | +</p> |
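One simple way to build such a mapping is to pair every topic of the first model with the closest topic of the second one. The sketch below does this with an L1 distance between token distributions. Both the distance and the matching rule are assumptions chosen for brevity, and all names and numbers are hypothetical; the source does not describe how `TopicMapViewer` matches topics.

```python
def l1_distance(p, q):
    """Sum of absolute differences between two token distributions stored as dicts."""
    vocabulary = set(p) | set(q)
    return sum(abs(p.get(token, 0.0) - q.get(token, 0.0)) for token in vocabulary)


def map_topics(model_a, model_b):
    """Map each topic of model_a to the name of its nearest topic in model_b."""
    mapping = {}
    for name_a, dist_a in model_a.items():
        mapping[name_a] = min(
            model_b,
            # Ties are broken by topic name so the result is deterministic.
            key=lambda name_b: (l1_distance(dist_a, model_b[name_b]), name_b),
        )
    return mapping


# Hypothetical topic-token distributions of two models (illustration only).
model_a = {"a0": {"cat": 0.6, "dog": 0.4}, "a1": {"tree": 0.7, "leaf": 0.3}}
model_b = {"b0": {"tree": 0.6, "leaf": 0.4}, "b1": {"cat": 0.5, "dog": 0.5}}
mapping = map_topics(model_a, model_b)
print(mapping)
```

This rule can map several topics of the first model to the same topic of the second. A one-to-one matching would need an assignment algorithm instead.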
| 77 | + |
| 78 | +## Deprecated |
| 79 | + |
| 80 | +* `initial_doc_to_topic_viewer` — first edition of `TopDocumentsViewer` |
| 81 | + |
| 82 | +* `tokens_viewer` - first edition of `TopTokensViewer` |