-### Demo
+# Demo
 This section provides demonstrations of how to use this library in NLP tasks.

-`1-RTL-Wiki-Preprocessing` -- notebook describing how to get a wikipedia dataset and write data in VW format.
+1. [RTL-Wiki-Preprocessing](RTL-Wiki-Preprocessing.ipynb) -- notebook describing how to get a Wikipedia dataset and write data in VW format.

-`2-RTL-Wiki-Building-Topic-Model` -- notebook with first steps to build topic model by consequently tuning it's hyperparameters
+2. [RTL-Wiki-Building-Topic-Mode](RTL-Wiki-Building-Topic-Model.ipynb) -- notebook with first steps to build topic model by consequently tuning its hyperparameters

-`3-Visualizing-Your-Model-Documents` -- notebook providing a fres outlook on unstructured document collection with the help of a topic model
+3. [Visualizing-Your-Model-Documents](Visualizing-Your-Model-Documents.ipynb) -- notebook providing a fresh outlook on unstructured document collection with the help of a topic model

-`4-20NG-Preprocessing` -- preparing data from a well-know 20 Newsgroups dataset
+4. [20NG-Preprocessing](20NG-Preprocessing.ipynb) -- preparing data from a well-know 20 Newsgroups dataset

-`5-20NG-GenSim-vs-TopicNet` -- a comparisson between two topic models build by Gensim and TopicNet library. In the notebook we compare model topics by calculating their [UMass coherence measure](https://palmetto.demos.dice-research.org/) and using Jaccard measure to compare topic top-tokens diversity
+5. [20NG-GenSim-vs-TopicNet](20NG-GenSim-vs-TopicNet.ipynb) -- a comparison between two topic models build by Gensim and TopicNet library. In the notebook we compare model topics by calculating their [UMass coherence measure](https://palmetto.demos.dice-research.org/) and using Jaccard measure to compare topic top-tokens diversity

-`6-Postnauka-Building-Topic-Model` -- an analog of the RTL-Wiki notebook performed on the corpus of Russian pop-science articles given by postnauka.ru
+6. [PostNauka-Building-Topic-Model](PostNauka-Building-Topic-Model.ipynb)-- an analog of the RTL-Wiki notebook performed on the corpus of Russian pop-science articles given by postnauka.ru

-`7-Postnauka-Recipe` -- a demonstration of rapid-prototyping methods provided by the library
+7. [PostNauka-Recipe](PostNauka-Recipe) -- a demonstration of rapid-prototyping methods provided by the library

-`8-Coherence-Maximization-Recipe` -- a recipe for hyperparameter search in regard to custom Coherence metric
+8. [Coherence-Maximization-Recipe](Coherence-Maximization-Recipe.ipynb) -- a recipe for hyperparameter search in regard to custom Coherence metric

-`9-Topic-Prior-Regularizer-Tutorial` -- a demonstration of the approach to learning topics from the unbalanced corpus
+9. [Topic-Prior-Regularizer-Tutorial](Topic-Prior-Regularizer-Tutorial.ipynb) -- a demonstration of the approach to learning topics from the unbalanced corpus

-`10-Making-Decorrelation-and-Topic-Selection-Friends` -- reproduction of a very complicated experiment on automatically learning optimal number of topics from the collection. Hurdle is - both needed regularizers when working together nullify token-topic matrix.
+10. [Making-Decorrelation-and-Topic-Selection-Friends](Making-Decorrelation-and-Topic-Selection-Friends.ipynb) -- reproduction of a very complicated experiment on automatically learning optimal number of topics from the collection. Hurdle is -- both needed regularizers when working together nullify token-topic matrix.

 ----
 P.S. All the guides are supposed to contain **working** examples of the library code.
-If you happen to find code that is no longer works please write about it in the library issues.
+If you happen to find code that is no longer works, please write about it in the library issues.
 We will try to resolve it as soon as possible and plan to include fixes in the nearest releases.