Commit 9185db9

Merge pull request #46 from machine-intelligence-laboratory/hotfix/readme_and_order
README updates and small fix in recipes
2 parents 0105191 + 8c21424 commit 9185db9

10 files changed (+52 −10 lines)

README.md

Lines changed: 16 additions & 1 deletion
@@ -159,7 +159,8 @@ Here we can finally get on the main part: making your own, best of them all, man
 We need to load our data prepared previously with Dataset:

 ```python
-dataset = Dataset('/Wiki_raw_set/wiki_data.csv')
+DATASET_PATH = '/Wiki_raw_set/wiki_data.csv'
+dataset = Dataset(DATASET_PATH)
 ```

 ### Make initial model
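The hunk above hoists a hard-coded path into a named constant so that later snippets can reuse it. A minimal, self-contained sketch of that refactor (`FakeDataset` is a hypothetical stand-in for topicnet's `Dataset` class, used here only so the example runs without the library):

```python
# Hoist the path into one named constant instead of repeating the literal.
DATASET_PATH = '/Wiki_raw_set/wiki_data.csv'

class FakeDataset:
    """Hypothetical stand-in for topicnet.cooking_machine.Dataset."""
    def __init__(self, data_path):
        # The real Dataset reads the prepared CSV from this path.
        self.data_path = data_path

dataset = FakeDataset(DATASET_PATH)
print(dataset.data_path)  # '/Wiki_raw_set/wiki_data.csv'
```

Every later call that needs the dataset location can now reference `DATASET_PATH`, which is exactly what the recipe snippet added further down in this commit does.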
@@ -246,6 +247,20 @@ perplexity_criterion = 'PerplexityScore@lemmatized -> min COLLECT 1'
 best_model = experiment.select(perplexity_criterion)
 ```

+### Alternatively: Use Recipes
+If you need a topic model now, you can use one of the code snippets we call *recipes*.
+```python
+from topicnet.cooking_machine.recipes import BaselineRecipe
+
+training_pipeline = BaselineRecipe()
+EXPERIMENT_PATH = '/home/user/experiment/'
+
+training_pipeline.format_recipe(dataset_path=DATASET_PATH)
+experiment, dataset = training_pipeline.build_experiment_environment(save_path=EXPERIMENT_PATH)
+```
+After that, you can expect the following result:
+![run_result](./docs/readme_images/experiment_train.gif)
+
 ### View the results

 Browsing the model is easy: create a viewer and call its `view()` method (or `view_from_jupyter()` — it is advised to use it if working in Jupyter Notebook):
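The recipe workflow added in the hunk above is a two-step pattern: format a recipe template with concrete parameters, then build the experiment environment at a chosen save path. A runnable sketch of that shape (the `FakeBaselineRecipe` class and its return values are hypothetical stand-ins, not the library's API):

```python
DATASET_PATH = '/Wiki_raw_set/wiki_data.csv'
EXPERIMENT_PATH = '/home/user/experiment/'

class FakeBaselineRecipe:
    """Hypothetical stand-in for topicnet...recipes.BaselineRecipe."""
    def __init__(self):
        self.recipe = None

    def format_recipe(self, dataset_path):
        # Step 1: fill the recipe template with concrete parameters.
        self.recipe = f'train on {dataset_path}'

    def build_experiment_environment(self, save_path):
        # Step 2: the real method returns (experiment, dataset) objects.
        return ('experiment@' + save_path, 'dataset')

pipeline = FakeBaselineRecipe()
pipeline.format_recipe(dataset_path=DATASET_PATH)
experiment, dataset = pipeline.build_experiment_environment(save_path=EXPERIMENT_PATH)
print(experiment)
```

The point of the pattern is that all path configuration happens before any training objects are constructed, so a recipe can be reformatted and rebuilt against a different dataset with no other changes.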

topicnet/cooking_machine/recipes/multimodal_exploratory_search_pipeline.py

Lines changed: 1 addition & 3 deletions
@@ -153,7 +153,6 @@ def format_recipe(
             background_topics=background_topics,
             modality_list=modality_list,
             num_iter=num_iter,
-            order=self._order
         )
         return self._recipe

@@ -256,8 +255,7 @@ def _make_multimodal_recipe(
         reg_forms = self._form_regularizers(modality_list)
         cube_forms = self._form_and_order_cubes(
             modality_list,
-            num_iter=num_iter,
-            order=self._order)
+            num_iter=num_iter)
         self._recipe = self.recipe_template.format(
             modality=modality,
             dataset_path=dataset_path,

topicnet/demos/20NG-Preprocessing.ipynb

Lines changed: 1 addition & 1 deletion
@@ -691,7 +691,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.6.8"
+"version": "3.6.9"
 }
 },
 "nbformat": 4,

topicnet/demos/Coherence-Maximization-Recipe.ipynb

Lines changed: 1 addition & 1 deletion
@@ -2032,7 +2032,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.6.8"
+"version": "3.6.9"
 }
 },
 "nbformat": 4,

topicnet/demos/PScience-Building-Topic-Model.ipynb renamed to topicnet/demos/PostNauka-Building-Topic-Model.ipynb

Lines changed: 1 addition & 1 deletion
@@ -4394,7 +4394,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.6.8"
+"version": "3.6.9"
 }
 },
 "nbformat": 4,

topicnet/demos/README.md

Lines changed: 30 additions & 1 deletion
@@ -1 +1,30 @@
-### This folder contains Jupyter notebooks with working examples of the code
+# Demo
+This section provides demonstrations of how to use this library in NLP tasks.
+
+1. [RTL-Wiki-Preprocessing](RTL-Wiki-Preprocessing.ipynb) -- notebook working with a dataset introduced in [1]. It serves as an example of a typical preprocessing pipeline: getting a dataset, lemmatizing it, extracting n-grams/collocations, and writing the data in VW format
+
+2. [RTL-Wiki-Building-Topic-Model](RTL-Wiki-Building-Topic-Model.ipynb) -- notebook with first steps to build a topic model by sequentially tuning its hyperparameters
+
+3. [Visualizing-Your-Model-Documents](Visualizing-Your-Model-Documents.ipynb) -- notebook providing a fresh outlook on an unstructured document collection with the help of a topic model
+
+4. [20NG-Preprocessing](20NG-Preprocessing.ipynb) -- preparing data from the well-known 20 Newsgroups dataset
+
+5. [20NG-GenSim-vs-TopicNet](20NG-GenSim-vs-TopicNet.ipynb) -- a comparison between two topic models built with Gensim and with the TopicNet library. In the notebook we compare model topics by calculating their UMass coherence with the help of [Palmetto](https://palmetto.demos.dice-research.org/) and by using the Jaccard measure to compare the diversity of topic top tokens
+
+6. [PostNauka-Building-Topic-Model](PostNauka-Building-Topic-Model.ipynb) -- an analog of the RTL-Wiki notebook performed on a corpus of Russian pop-science articles provided by postnauka.ru
+
+7. [PostNauka-Recipe](PostNauka-Recipe.ipynb) -- a demonstration of the rapid-prototyping methods provided by the library
+
+8. [Coherence-Maximization-Recipe](Coherence-Maximization-Recipe.ipynb) -- a recipe for hyperparameter search with respect to a custom coherence metric
+
+9. [Topic-Prior-Regularizer-Tutorial](Topic-Prior-Regularizer-Tutorial.ipynb) -- a demonstration of an approach to learning topics from an unbalanced corpus
+
+10. [Making-Decorrelation-and-Topic-Selection-Friends](Making-Decorrelation-and-Topic-Selection-Friends.ipynb) -- reproduction of a rather intricate experiment on automatically learning the optimal number of topics from a collection. The hurdle is that the two regularizers needed, when working together, nullify the token-topic matrix.
+
+----
+[1](https://dl.acm.org/doi/10.5555/2984093.2984126) Jonathan Chang, Jordan Boyd-Graber, Sean Gerrish, Chong Wang, and David M. Blei. 2009. Reading tea leaves: how humans interpret topic models. In Proceedings of the 22nd International Conference on Neural Information Processing Systems (NIPS'09). Curran Associates Inc., Red Hook, NY, USA, 288–296.
+
+----
+P.S. All the guides are supposed to contain **working** examples of the library code.
+If you happen to find code that no longer works, please report it in the library issues.
+We will try to resolve it as soon as possible and plan to include fixes in the nearest releases.
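Demo 5 in the list above compares topic top-token lists with the Jaccard measure. A minimal sketch of that computation (the function name is illustrative, not the library's API): the measure is the size of the intersection of the two token sets divided by the size of their union.

```python
def jaccard(tokens_a, tokens_b):
    """Jaccard similarity between two lists of top tokens."""
    a, b = set(tokens_a), set(tokens_b)
    return len(a & b) / len(a | b)

# Two topics sharing 2 of 5 distinct top tokens:
score = jaccard(['space', 'nasa', 'orbit'], ['space', 'star', 'orbit', 'moon'])
print(score)  # 0.4
```

Low pairwise Jaccard scores across a model's topics indicate diverse top tokens; a score near 1.0 flags near-duplicate topics.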

topicnet/demos/RTL-Wiki-Building-Topic-Model.ipynb

Lines changed: 1 addition & 1 deletion
@@ -2787,7 +2787,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.6.8"
+"version": "3.6.9"
 }
 },
 "nbformat": 4,

topicnet/demos/Topic-Prior-Regularizer-Tutorial.ipynb

Lines changed: 1 addition & 1 deletion
@@ -573,7 +573,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.6.8"
+"version": "3.6.9"
 }
 },
 "nbformat": 4,

0 commit comments