README.md: 19 additions & 16 deletions
@@ -38,7 +38,7 @@ Consider using TopicNet if:
38
38
* you want to build a good topic model quickly (out-of-box, with default parameters).
39
39
* you have an ARTM model at hand and you want to explore its topics.
40
40
41
-
`TopicNet` provides an infrastructure for your prototyping with the help of `Experiment` class and helps to observe results of your actions via `viewers` module.
41
+
`TopicNet` provides an infrastructure for your prototyping with the help of `Experiment` class and helps to observe results of your actions via [`viewers`](topicnet/viewers) module.
42
42
43
43
<p>
44
44
<div align="center">
@@ -159,7 +159,7 @@ Here we can finally get on the main part: making your own, best of them all, man
159
159
We need to load our data prepared previously with Dataset:
160
160
161
161
```python
162
-
data= Dataset('/Wiki_raw_set/wiki_data.csv')
162
+
dataset= Dataset('/Wiki_raw_set/wiki_data.csv')
163
163
```
164
164
165
165
### Make initial model
@@ -169,8 +169,8 @@ In case you want to start from a fresh model we suggest you use this code:
169
169
```python
170
170
from topicnet.cooking_machine.model_constructor import init_simple_default_model
Browsing the model is easy: create a viewer and call its `view()` method:
250
+
251
+
Browsing the model is easy: create a viewer and call its `view()` method (or `view_from_jupyter()`, which is recommended when working in Jupyter Notebook):
Module ```viewers``` provides information from a topic model allowing to estimate the model quality. Its advantage is in unified call ifrastucture to the topic model making the routine and tedious task of extracting the information easy.
3
+
The `viewers` module provides information from a topic model that helps to estimate the model quality.
4
+
Its advantage is a unified call infrastructure for the topic model, which makes the routine and tedious task of extracting information easy.
4
5
5
-
Currently module contains following viewers:
6
+
Currently the module contains the following viewers:
6
7
8
+
## `base_viewer` (`BaseViewer`)
7
9
8
-
*```base_viewer``` - module responsible for base infrastructure
10
+
Module responsible for base infrastructure.
9
11
10
-
*```spectrum``` - module contains heuristics for solving TSP to arrange topics minimizing total distance of the spectrun,
11
-
```TopicSpectrumViewer```
12
12
13
-
*```top_documents_viewer``` - module with functions that work with dataset document collections and class ```TopDocumentsViewer``` wrapping this functionality.
13
+
## `document_cluster` (`DocumentClusterViewer`)
14
14
15
-
*```top_similar_documents_viewer``` - module containing class for finding simmilar document for a givenone. This viewer helps to estimate homogeneity of clusters given by the model
15
+
Module that visualizes the documents of a collection. It may be slow for large document collections because it uses the t-SNE algorithm from the [sklearn](https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html) library.
16
16
17
-
*```top_tokens_viewer``` - module with class for displaying the most relevant tokens in each topic of the model.
Visualisation of reduced document embeddings colored according to their topic made by DocumentClusterViewer.
23
+
</em>
24
+
</p>
18
25
19
-
*```topic_mapping``` - module allowing to compare topics between two different models trained on the same collection.
20
26
21
-
*```document_cluster``` - module containing class ```DocumentClusterViewer``` which allows to visualize collection documents. May be slow for large document collections as it uses TSNE algorithm from sklearn library.
27
+
## `spectrum` (`TopicSpectrumViewer`)
22
28
29
+
Module containing heuristics for solving the travelling salesman problem (TSP) in order to arrange topics so that the total distance of the spectrum is minimized.
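
To illustrate what "arranging topics by solving TSP" can mean, here is a minimal sketch of one common heuristic, greedy nearest-neighbour ordering. It is not the code used by `TopicSpectrumViewer`; the function names `arrange_topics` and `spectrum_length`, the use of Euclidean distance, and the toy topic vectors are all assumptions made for the example.

```python
import math


def arrange_topics(topic_vectors):
    """Order topics greedily: always step to the nearest unvisited topic.

    Hypothetical helper, not part of TopicNet. Starts from topic 0.
    """
    if not topic_vectors:
        return []
    remaining = set(range(1, len(topic_vectors)))
    order = [0]
    while remaining:
        last = topic_vectors[order[-1]]
        # Ties are broken by the smaller index to keep the result deterministic.
        nearest = min(
            remaining,
            key=lambda i: (math.dist(last, topic_vectors[i]), i),
        )
        order.append(nearest)
        remaining.remove(nearest)
    return order


def spectrum_length(topic_vectors, order):
    """Total distance between consecutive topics in the given order."""
    return sum(
        math.dist(topic_vectors[a], topic_vectors[b])
        for a, b in zip(order, order[1:])
    )


# Toy "topics" represented as 2-D points (illustrative data only).
topics = [(0.0, 0.0), (5.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
order = arrange_topics(topics)
```

On this toy input the greedy order visits neighbouring points first, so the path is shorter than in the original index order. Nearest-neighbour ordering gives no optimality guarantee in general; it is only a cheap starting point that other heuristics can refine.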
Output of the TopTokensViewer. Token score in the topic is calculated for every token, score function can be specified at the stage of a viewer initialization.
81
+
</em>
82
+
</p>
83
+
84
+
85
+
## `topic_mapping` (`TopicMapViewer`)
86
+
87
+
Module for comparing topics between two different models trained on the same collection.