Commit 9185db9

Merge pull request #46 from machine-intelligence-laboratory/hotfix/readme_and_order
README updates and small fix in recipes
2 parents 0105191 + 8c21424 commit 9185db9

10 files changed (+52 −10 lines)

README.md

Lines changed: 16 additions & 1 deletion
@@ -159,7 +159,8 @@ Here we can finally get on the main part: making your own, best of them all, man
 We need to load our data prepared previously with Dataset:

 ```python
-dataset = Dataset('/Wiki_raw_set/wiki_data.csv')
+DATASET_PATH = '/Wiki_raw_set/wiki_data.csv'
+dataset = Dataset(DATASET_PATH)
 ```

 ### Make initial model
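The hunk above hoists a hard-coded path into a named constant so that later snippets can reuse it. A minimal, self-contained sketch of that refactor (`FakeDataset` is a hypothetical stand-in for topicnet's `Dataset` class, used here only so the example runs without the library):

```python
# Hoist the path into one named constant instead of repeating the literal.
DATASET_PATH = '/Wiki_raw_set/wiki_data.csv'

class FakeDataset:
    """Hypothetical stand-in for topicnet.cooking_machine.Dataset."""
    def __init__(self, data_path):
        # The real Dataset reads the prepared CSV from this path.
        self.data_path = data_path

dataset = FakeDataset(DATASET_PATH)
print(dataset.data_path)  # '/Wiki_raw_set/wiki_data.csv'
```

Every later call that needs the dataset location can now reference `DATASET_PATH`, which is exactly what the recipe snippet added further down in this commit does.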
@@ -246,6 +247,20 @@ perplexity_criterion = 'PerplexityScore@lemmatized -> min COLLECT 1'
 best_model = experiment.select(perplexity_criterion)
 ```

+### Alternatively: Use Recipes
+If you need a topic model now, you can use one of the code snippets we call *recipes*.
+```python
+from topicnet.cooking_machine.recipes import BaselineRecipe
+
+training_pipeline = BaselineRecipe()
+EXPERIMENT_PATH = '/home/user/experiment/'
+
+training_pipeline.format_recipe(dataset_path=DATASET_PATH)
+experiment, dataset = training_pipeline.build_experiment_environment(save_path=EXPERIMENT_PATH)
+```
+After that, you can expect the following result:
+![run_result](./docs/readme_images/experiment_train.gif)
+
 ### View the results

 Browsing the model is easy: create a viewer and call its `view()` method (or `view_from_jupyter()` — it is advised to use it if working in Jupyter Notebook):
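The recipe workflow added in the hunk above is a two-step pattern: format a recipe template with concrete parameters, then build the experiment environment at a chosen save path. A runnable sketch of that shape (the `FakeBaselineRecipe` class and its return values are hypothetical stand-ins, not the library's API):

```python
DATASET_PATH = '/Wiki_raw_set/wiki_data.csv'
EXPERIMENT_PATH = '/home/user/experiment/'

class FakeBaselineRecipe:
    """Hypothetical stand-in for topicnet...recipes.BaselineRecipe."""
    def __init__(self):
        self.recipe = None

    def format_recipe(self, dataset_path):
        # Step 1: fill the recipe template with concrete parameters.
        self.recipe = f'train on {dataset_path}'

    def build_experiment_environment(self, save_path):
        # Step 2: the real method returns (experiment, dataset) objects.
        return ('experiment@' + save_path, 'dataset')

pipeline = FakeBaselineRecipe()
pipeline.format_recipe(dataset_path=DATASET_PATH)
experiment, dataset = pipeline.build_experiment_environment(save_path=EXPERIMENT_PATH)
print(experiment)
```

The point of the pattern is that all path configuration happens before any training objects are constructed, so a recipe can be reformatted and rebuilt against a different dataset with no other changes.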

topicnet/cooking_machine/recipes/multimodal_exploratory_search_pipeline.py

Lines changed: 1 addition & 3 deletions
@@ -153,7 +153,6 @@ def format_recipe(
             background_topics=background_topics,
             modality_list=modality_list,
             num_iter=num_iter,
-            order=self._order
         )
         return self._recipe

@@ -256,8 +255,7 @@ def _make_multimodal_recipe(
         reg_forms = self._form_regularizers(modality_list)
         cube_forms = self._form_and_order_cubes(
             modality_list,
-            num_iter=num_iter,
-            order=self._order)
+            num_iter=num_iter)
         self._recipe = self.recipe_template.format(
             modality=modality,
             dataset_path=dataset_path,

topicnet/demos/20NG-Preprocessing.ipynb

Lines changed: 1 addition & 1 deletion
@@ -691,7 +691,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.6.8"
+"version": "3.6.9"
 }
 },
 "nbformat": 4,

topicnet/demos/Coherence-Maximization-Recipe.ipynb

Lines changed: 1 addition & 1 deletion
@@ -2032,7 +2032,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.6.8"
+"version": "3.6.9"
 }
 },
 "nbformat": 4,

topicnet/demos/PScience-Building-Topic-Model.ipynb renamed to topicnet/demos/PostNauka-Building-Topic-Model.ipynb

Lines changed: 1 addition & 1 deletion
@@ -4394,7 +4394,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.6.8"
+"version": "3.6.9"
 }
 },
 "nbformat": 4,

topicnet/demos/README.md

Lines changed: 30 additions & 1 deletion
@@ -1 +1,30 @@
-### This folder contains Jupyter notebooks with working examples of the code
+# Demo
+This section provides demonstrations of how to use this library in NLP tasks.
+
+1. [RTL-Wiki-Preprocessing](RTL-Wiki-Preprocessing.ipynb) -- notebook working with a dataset introduced in [1]. It serves as an example of a typical preprocessing pipeline: getting a dataset, lemmatizing it, extracting n-grams/collocations, and writing the data in VW format
+
+2. [RTL-Wiki-Building-Topic-Model](RTL-Wiki-Building-Topic-Model.ipynb) -- notebook with first steps to build a topic model by sequentially tuning its hyperparameters
+
+3. [Visualizing-Your-Model-Documents](Visualizing-Your-Model-Documents.ipynb) -- notebook providing a fresh outlook on an unstructured document collection with the help of a topic model
+
+4. [20NG-Preprocessing](20NG-Preprocessing.ipynb) -- preparing data from the well-known 20 Newsgroups dataset
+
+5. [20NG-GenSim-vs-TopicNet](20NG-GenSim-vs-TopicNet.ipynb) -- a comparison between two topic models built with Gensim and with the TopicNet library. In the notebook we compare model topics by calculating their UMass coherence with the help of [Palmetto](https://palmetto.demos.dice-research.org/) and by using the Jaccard measure to compare the diversity of topic top tokens
+
+6. [PostNauka-Building-Topic-Model](PostNauka-Building-Topic-Model.ipynb) -- an analog of the RTL-Wiki notebook performed on a corpus of Russian pop-science articles provided by postnauka.ru
+
+7. [PostNauka-Recipe](PostNauka-Recipe.ipynb) -- a demonstration of the rapid-prototyping methods provided by the library
+
+8. [Coherence-Maximization-Recipe](Coherence-Maximization-Recipe.ipynb) -- a recipe for hyperparameter search with respect to a custom coherence metric
+
+9. [Topic-Prior-Regularizer-Tutorial](Topic-Prior-Regularizer-Tutorial.ipynb) -- a demonstration of an approach to learning topics from an unbalanced corpus
+
+10. [Making-Decorrelation-and-Topic-Selection-Friends](Making-Decorrelation-and-Topic-Selection-Friends.ipynb) -- reproduction of a rather intricate experiment on automatically learning the optimal number of topics from a collection. The hurdle is that the two regularizers needed, when working together, nullify the token-topic matrix.
+
+----
+[1](https://dl.acm.org/doi/10.5555/2984093.2984126) Jonathan Chang, Jordan Boyd-Graber, Sean Gerrish, Chong Wang, and David M. Blei. 2009. Reading tea leaves: how humans interpret topic models. In Proceedings of the 22nd International Conference on Neural Information Processing Systems (NIPS'09). Curran Associates Inc., Red Hook, NY, USA, 288–296.
+
+----
+P.S. All the guides are supposed to contain **working** examples of the library code.
+If you happen to find code that no longer works, please report it in the library issues.
+We will try to resolve it as soon as possible and plan to include fixes in the nearest releases.
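Demo 5 in the list above compares topic top-token lists with the Jaccard measure. A minimal sketch of that computation (the function name is illustrative, not the library's API): the measure is the size of the intersection of the two token sets divided by the size of their union.

```python
def jaccard(tokens_a, tokens_b):
    """Jaccard similarity between two lists of top tokens."""
    a, b = set(tokens_a), set(tokens_b)
    return len(a & b) / len(a | b)

# Two topics sharing 2 of 5 distinct top tokens:
score = jaccard(['space', 'nasa', 'orbit'], ['space', 'star', 'orbit', 'moon'])
print(score)  # 0.4
```

Low pairwise Jaccard scores across a model's topics indicate diverse top tokens; a score near 1.0 flags near-duplicate topics.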

topicnet/demos/RTL-Wiki-Building-Topic-Model.ipynb

Lines changed: 1 addition & 1 deletion
@@ -2787,7 +2787,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.6.8"
+"version": "3.6.9"
 }
 },
 "nbformat": 4,

topicnet/demos/Topic-Prior-Regularizer-Tutorial.ipynb

Lines changed: 1 addition & 1 deletion
@@ -573,7 +573,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.6.8"
+"version": "3.6.9"
 }
 },
 "nbformat": 4,

0 commit comments