Commit fa0caca

put items away
1 parent bec9309 commit fa0caca

1 file changed: +9 -0 lines changed

topicnet/demos/README.md

Lines changed: 9 additions & 0 deletions
@@ -2,14 +2,23 @@
This section provides demonstrations of how to use this library in NLP tasks.

`1-RTL-Wiki-Preprocessing` -- notebook describing how to get a Wikipedia dataset and write the data in Vowpal Wabbit (VW) format.
+
`2-RTL-Wiki-Building-Topic-Model` -- notebook with the first steps of building a topic model by consecutively tuning its hyperparameters
7+
68
`3-Visualizing-Your-Model-Documents` -- notebook providing a fres outlook on unstructured document collection with the help of a topic model
9+
710
`4-20NG-Preprocessing` -- preparing data from a well-know 20 Newsgroups dataset
11+
812
`5-20NG-GenSim-vs-TopicNet` -- a comparisson between two topic models build by Gensim and TopicNet library. In the notebook we compare model topics by calculating their [UMass coherence measure](https://palmetto.demos.dice-research.org/) and using Jaccard measure to compare topic top-tokens diversity
13+
914
`6-Postnauka-Building-Topic-Model` -- an analog of the RTL-Wiki notebook performed on the corpus of Russian pop-science articles given by postnauka.ru
15+
1016
`7-Postnauka-Recipe` -- a demonstration of rapid-prototyping methods provided by the library
17+
1118
`8-Coherence-Maximization-Recipe` -- a recipe for hyperparameter search in regard to custom Coherence metric
19+
1220
`9-Topic-Prior-Regularizer-Tutorial` -- a demonstration of the approach to learning topics from the unbalanced corpus
21+
1322
`10-Making-Decorrelation-and-Topic-Selection-Friends` -- reproduction of a very complicated experiment on automatically learning optimal number of topics from the collection. Hurdle is - both needed regularizers when working together nullify token-topic matrix.
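
The Jaccard comparison of topic top-tokens mentioned for `5-20NG-GenSim-vs-TopicNet` can be illustrated with a minimal sketch in plain Python. It does not use the TopicNet or Gensim API; the helper names `jaccard` and `mean_pairwise_jaccard` and the toy `top_tokens` dictionary are hypothetical and are not taken from the notebook. A lower mean pairwise similarity suggests more diverse topics.

```python
from itertools import combinations


def jaccard(tokens_a, tokens_b):
    """Jaccard similarity |A & B| / |A | B| between two token lists."""
    a, b = set(tokens_a), set(tokens_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def mean_pairwise_jaccard(topics):
    """Average Jaccard similarity over all pairs of topics.

    `topics` maps a topic name to its list of top tokens.
    """
    pairs = list(combinations(topics.values(), 2))
    if not pairs:
        return 0.0
    return sum(jaccard(x, y) for x, y in pairs) / len(pairs)


# Toy top-token lists (made up for illustration, not model output).
top_tokens = {
    "topic_0": ["space", "nasa", "orbit", "launch"],
    "topic_1": ["space", "orbit", "moon", "shuttle"],
    "topic_2": ["game", "team", "season", "player"],
}

overlap_01 = jaccard(top_tokens["topic_0"], top_tokens["topic_1"])
diversity_score = mean_pairwise_jaccard(top_tokens)
print(overlap_01, diversity_score)
```

The same helpers could be applied to top-token lists exported from either model, assuming they are available as plain Python lists.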

----