Commit b8e1778

committed
add pics
1 parent 607fbbf commit b8e1778

8 files changed

+68
-49
lines changed


topicnet/viewers/README-rus.md

Lines changed: 0 additions & 36 deletions
This file was deleted.

topicnet/viewers/README.md

Lines changed: 68 additions & 13 deletions
@@ -1,27 +1,82 @@
1-
## Viewers
1+
# Viewers
22

3-
Module ```viewers``` provides information from a topic model allowing to estimate the model quality. Its advantage is in unified call ifrastucture to the topic model making the routine and tedious task of extracting the information easy.
3+
The `viewers` module extracts information from a topic model that helps to estimate the model's quality.
4+
Its advantage is a unified call infrastructure for the topic model, which makes the routine and tedious task of extracting this information easy.
45

5-
Currently module contains following viewers:
6+
Currently the module contains the following viewers:
67

8+
* `base_viewer` (`BaseViewer`) — module responsible for base infrastructure.
79

8-
* ```base_viewer``` - module responsible for base infrastructure
10+
* `document_cluster` (`DocumentClusterViewer`) — module that visualizes the documents of a collection. It may be slow for large document collections because it uses the t-SNE algorithm from the [sklearn](https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html) library.
911

10-
* ```spectrum``` - module contains heuristics for solving TSP to arrange topics minimizing total distance of the spectrun,
11-
```TopicSpectrumViewer```
12+
<p>
13+
<div align="center">
14+
<img src="../docs/images/doc_cluster__plot.png" width="80%" alt/>
15+
</div>
16+
<em>
17+
Visualization of reduced document embeddings, colored by topic, produced by `DocumentClusterViewer`.
18+
</em>
19+
</p>
1220

13-
* ```top_documents_viewer``` - module with functions that work with dataset document collections and class ```TopDocumentsViewer``` wrapping this functionality.
21+
* `spectrum` (`TopicSpectrumViewer`) — module containing heuristics for solving the travelling salesman problem (TSP) in order to arrange topics so that the total distance along the spectrum is minimized.
1422

15-
* ```top_similar_documents_viewer``` - module containing class for finding simmilar document for a givenone. This viewer helps to estimate homogeneity of clusters given by the model
23+
<p>
24+
<div align="center">
25+
<img src="../docs/images/topic_spectrum__refined_view.png" width="80%" alt/>
26+
</div>
27+
<em>
28+
Each point on the plot represents some topic.
29+
The viewer computes a route through the topics in which each topic is connected to a similar one, and so on, forming a closed loop.
30+
</em>
31+
</p>
1632

17-
* ```top_tokens_viewer``` - module with class for displaying the most relevant tokens in each topic of the model.
33+
* `top_documents_viewer` (`TopDocumentsViewer`) — module with functions that work with dataset document collections.
1834

19-
* ```topic_mapping``` - module allowing to compare topics between two different models trained on the same collection.
35+
<p>
36+
<div align="center">
37+
<img src="../docs/images/top_doc__view.png" width="80%" alt/>
38+
</div>
39+
<em>
40+
The viewer shows fragments of top documents corresponding to some topic.
41+
</em>
42+
</p>
2043

21-
* ```document_cluster``` - module containing class ```DocumentClusterViewer``` which allows to visualize collection documents. May be slow for large document collections as it uses TSNE algorithm from sklearn library.
44+
* `top_similar_documents_viewer` (`TopSimilarDocumentsViewer`) — module containing a class for finding documents similar to a given one. This viewer helps to estimate the homogeneity of the clusters given by the model.
2245

46+
<p>
47+
<div align="center">
48+
<img src="../docs/images/top_sim_doc__refined_view.png" width="80%" alt/>
49+
</div>
50+
<em>
51+
A document from the text collection (top) and the documents nearest to it according to the topic model.
52+
The viewer currently outputs only document names, but a picture like this one is not difficult to produce.
53+
</em>
54+
</p>
2355

56+
* `top_tokens_viewer` (`TopTokensViewer`) — module with class for displaying the most relevant tokens in each topic of the model.
2457

25-
* ```initial_doc_to_topic_viewer``` - first edition of ```TopDocumentsViewer``` - Deprecated
58+
<p>
59+
<div align="center">
60+
<img src="../docs/images/top_tokens__view.png" width="80%" alt/>
61+
</div>
62+
<em>
63+
Output of `TopTokensViewer`. A score within the topic is calculated for every token; the score function can be specified when the viewer is initialized.
64+
</em>
65+
</p>
2666

27-
* ```tokens_viewer``` - first edition of ```TopTokensViewer``` - Deprecated
67+
* `topic_mapping` (`TopicMapViewer`) — module that compares topics between two different models trained on the same collection.
68+
69+
<p>
70+
<div align="center">
71+
<img src="../docs/images/topic_map__view.png" width="80%" alt/>
72+
</div>
73+
<em>
74+
The mapping between topics of two models (currently only topic names are displayed).
75+
</em>
76+
</p>
77+
78+
## Deprecated
79+
80+
* `initial_doc_to_topic_viewer` — first edition of `TopDocumentsViewer`
81+
82+
* `tokens_viewer` - first edition of `TopTokensViewer`
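
The README describes `spectrum` as using TSP heuristics to arrange topics, but it does not say which heuristic. As a rough illustration only, here is a minimal nearest-neighbour sketch in plain Python. The function names and the toy topic coordinates are hypothetical and are not part of the TopicNet API; the real `TopicSpectrumViewer` may work quite differently.

```python
import math


def nearest_neighbour_order(points):
    """Greedy TSP heuristic: start at index 0, then always move to the
    closest topic that has not been visited yet."""
    if not points:
        return []
    order = [0]
    unvisited = set(range(1, len(points)))
    while unvisited:
        last = points[order[-1]]
        # Ties are broken by the smaller index to keep the result deterministic.
        nxt = min(unvisited, key=lambda i: (math.dist(last, points[i]), i))
        order.append(nxt)
        unvisited.remove(nxt)
    return order


def tour_length(points, order):
    """Length of the closed route, including the step back to the start."""
    n = len(order)
    return sum(
        math.dist(points[order[i]], points[order[(i + 1) % n]])
        for i in range(n)
    )


# Hypothetical 2-D topic embeddings, used only for illustration.
topics = [(0.0, 0.0), (5.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
order = nearest_neighbour_order(topics)
print(order, tour_length(topics, order))
```

A greedy ordering like this is cheap but not optimal in general; it only shows the idea of placing similar topics next to each other on a closed route.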

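The idea behind `top_similar_documents_viewer`, finding the documents nearest to a given one under the topic model, can be sketched as follows. This is an assumption-laden illustration: the README does not state which similarity measure the viewer uses, so cosine similarity over per-document topic distributions is chosen here, and `most_similar` and the sample data are hypothetical names rather than TopicNet's API.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equally long vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(doc_id, doc_topics, top_n=2):
    """Return the names of the top_n documents closest to doc_id."""
    query = doc_topics[doc_id]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in doc_topics.items()
        if other != doc_id
    ]
    # Highest similarity first; document name breaks ties.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [other for other, _ in scored[:top_n]]


# Hypothetical topic distributions (one row per document, one column per topic).
doc_topics = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.8, 0.2, 0.0],
    "doc_c": [0.0, 0.1, 0.9],
    "doc_d": [0.5, 0.5, 0.0],
}
print(most_similar("doc_a", doc_topics))
```

If the nearest neighbours of a document mostly share its dominant topic, that suggests the clusters produced by the model are reasonably homogeneous.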