Commit 57d3e75

Merge pull request #51 from machine-intelligence-laboratory/feature/add-colab-info-in-readme
Add info about Colab
2 parents aa4658e + e493615 commit 57d3e75

1 file changed: +43 −11 lines

README.md

Lines changed: 43 additions & 11 deletions
@@ -59,6 +59,7 @@ And here is sample code of the TopicNet baseline experiment:
from topicnet.cooking_machine.config_parser import build_experiment_environment_from_yaml_config
from topicnet.cooking_machine.recipes import ARTM_baseline as config_string

+
config_string = config_string.format(
    dataset_path = '/data/datasets/NIPS/dataset.csv',
    modality_list = ['@word'],
@@ -78,22 +79,22 @@ best_model = experiment.select('PerplexityScore@all -> min')[0]
```


-## How to start
+## How to Start

Define `TopicModel` from an ARTM model at hand or with help from the `model_constructor` module, where you can set the model's main parameters. Then create an `Experiment`, assigning a root position to this model and a path to store your experiment. Further, you can define a set of training stages using the functionality provided by the `cooking_machine.cubes` module.

Further, you can read the documentation [here](https://machine-intelligence-laboratory.github.io/TopicNet/).


-# Installation
+## Installation

**Core library functionality is based on the BigARTM library**.
So BigARTM should also be installed on the machine.
Fortunately, the installation process should not be too difficult now.
Below are the detailed explanations.


-## Via pip
+### Via Pip

The easiest way to install everything is via `pip` (but currently this works fine only for Linux users!)

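
For orientation, here is a hedged reassembly of the baseline-experiment snippet that the two hunks above touch. Only the imports, the first two `format()` fields, and the final `select()` call appear in the diff itself; the remaining field names and the arguments passed to `build_experiment_environment_from_yaml_config` are assumptions, not part of this commit.

```python
from topicnet.cooking_machine.config_parser import build_experiment_environment_from_yaml_config
from topicnet.cooking_machine.recipes import ARTM_baseline as config_string

config_string = config_string.format(
    dataset_path='/data/datasets/NIPS/dataset.csv',
    modality_list=['@word'],
    main_modality='@word',     # assumed field
    specific_topics=20,        # assumed field
    background_topics=1,       # assumed field
)

# Assumed call signature: build the experiment and dataset from the filled-in recipe
experiment, dataset = build_experiment_environment_from_yaml_config(
    yaml_string=config_string,
    experiment_id='baseline_experiment',
    save_path='experiments',
)

experiment.run(dataset)
best_model = experiment.select('PerplexityScore@all -> min')[0]
```
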
@@ -102,11 +103,15 @@ pip install topicnet
```

The command also installs the BigARTM library, not only TopicNet.
+However, the [BigARTM Command Line Utility](https://bigartm.readthedocs.io/en/stable/tutorials/bigartm_cli.html) will not be assembled.
+Pip installation makes it possible to use BigARTM only through the Python interface.

If working on Windows or Mac, you should install BigARTM by yourself first; then `pip install topicnet` will work just fine.
We are hoping to bring all-in-`pip` installation support to the mentioned systems.
However, right now you may find the following guide useful.

+### BigARTM for Non-Linux Users
+
To avoid installing BigARTM yourself, you can use [docker images](https://hub.docker.com/r/xtonev/bigartm/tags) with different preinstalled versions of the BigARTM library:

```bash
@@ -117,17 +122,23 @@ docker run -t -i xtonev/bigartm:v0.10.0
Checking if everything was installed successfully:

```bash
-python
+$ python

>>> import artm
>>> artm.version()
```

Alternatively, you can follow the [BigARTM installation manual](https://bigartm.readthedocs.io/en/stable/installation/index.html).
+There are also a few tips which may provide additional help for Windows users:
+
+1. Go to the [installation page for Windows](http://docs.bigartm.org/en/stable/installation/windows.html) and download the 7z archive from the Downloads section.
+2. Use Anaconda `conda install` to download all the Python packages that BigARTM requires.
+3. Path variables must be set through the GUI window of system variables, and, if the variable `PYTHONPATH` is missing, add it to the **system wide** variables. Close the GUI window.
+
After setting up the environment, you can fork this repository or use `pip install topicnet` to install the library.


-## From source
+### From Source

One can also install the library from GitHub, which may give more flexibility in development (for example, making one's own viewers or regularizers a part of the module as .py files).

@@ -137,6 +148,20 @@ cd topicnet
pip install .
```

+### Google Colab & Kaggle Notebooks
+
+As the Linux installation may be done solely via `pip`, TopicNet can also be used in such online services as
+[Google Colab](https://colab.research.google.com) and
+[Kaggle Notebooks](https://www.kaggle.com/kernels).
+All you need is to run the following command in a notebook cell:
+
+```bash
+! pip install topicnet
+```
+
+There is also a [notebook in Google Colab](https://colab.research.google.com/drive/1Tr1ZO03iPufj11HtIH3JjaWWU1Wyxkzv) made by [Nikolay Gerasimenko](https://github.com/Nikolay-Gerasimenko), where BigARTM is built from source.
+This may be useful, for example, if you want to use the BigARTM Command Line Utility.
+

# Usage

@@ -150,11 +175,11 @@ TopicNet does not perform data preprocessing itself.
Instead, it demands data being prepared by the user and loaded via the [Dataset](topicnet/cooking_machine/dataset.py) class.
Here is a basic example of how one can achieve that: [rtl_wiki_preprocessing](topicnet/demos/RTL-WIKI-PREPROCESSING.ipynb).

-## Training topic model
+## Training a Topic Model

Here we can finally get to the main part: making your own, best of them all, manually crafted topic model.

-### Get your data
+### Get Your Data

We need to load the data we prepared previously with Dataset:

@@ -163,7 +188,7 @@ DATASET_PATH = '/Wiki_raw_set/wiki_data.csv'
dataset = Dataset(DATASET_PATH)
```

-### Make initial model
+### Make an Initial Model

In case you want to start from a fresh model, we suggest you use this code:

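
The fresh-model code referred to above lies outside this hunk. As a point of reference, here is a minimal sketch of what it might look like using the `model_constructor` module mentioned in "How to Start"; the function name `init_simple_default_model` and its parameters are assumptions, not taken from this diff.

```python
from topicnet.cooking_machine.model_constructor import init_simple_default_model

# Build a default ARTM model on top of the loaded dataset;
# the modality weights and topic counts are placeholder values.
artm_model = init_simple_default_model(
    dataset=dataset,
    modalities_to_use={'@lemmatized': 1.0, '@bigram': 0.5},
    main_modality='@lemmatized',
    specific_topics=14,
    background_topics=1,
)
```
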
@@ -185,6 +210,7 @@ Further, if needed, one can define a custom score to be calculated during the mo
```python
from topicnet.cooking_machine.models.base_score import BaseScore

+
class CustomScore(BaseScore):
    def __init__(self):
        super().__init__()
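
    # The rest of the class falls outside this hunk. What follows is a hedged,
    # illustrative sketch of how such a score might be completed; the `call`
    # override and the Phi-sparsity computation are assumptions, not taken
    # from this diff, and `import numpy as np` is assumed at the top.
    def call(self, model, eps=1e-5, n_specific_topics=14):
        # Fraction of near-zero entries among the first n_specific_topics columns of Phi
        phi = model.get_phi().values[:, :n_specific_topics]
        specific_sparsity = np.sum(phi < eps) / np.sum(phi < 1)
        return specific_sparsity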
@@ -205,21 +231,24 @@ Now, `TopicModel` with custom score can be defined:
```python
from topicnet.cooking_machine.models.topic_model import TopicModel

+
custom_scores = {'SpecificSparsity': CustomScore()}
topic_model = TopicModel(artm_model, model_id='Groot', custom_scores=custom_scores)
```

-### Define experiment
+### Define an Experiment

For further model training and tuning, an `Experiment` is necessary:

```python
from topicnet.cooking_machine.experiment import Experiment

+
experiment = Experiment(experiment_id="simple_experiment", save_path="experiments", topic_model=topic_model)
```

-### Toy with the cubes
+### Toy with the Cubes
+
Defining the next stage of the model training, to select a decorrelator parameter:

```python
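# The cube definition itself is outside this hunk. Below is a hedged,
# illustrative sketch of a decorrelator-selection stage using
# RegularizersModifierCube; the parameter names and values here are
# assumptions, not taken from this diff.
import artm

from topicnet.cooking_machine.cubes import RegularizersModifierCube

# Try several decorrelation strengths and track a score for each candidate model
my_first_cube = RegularizersModifierCube(
    num_iter=5,
    tracked_score_function='PerplexityScore@all',
    regularizer_parameters={
        'regularizer': artm.DecorrelatorPhiRegularizer(name='decorrelation_phi', tau=1),
        'tau_grid': [0, 1, 2, 3, 4, 5],
    },
    reg_search='grid',
    verbose=True,
)
my_first_cube(topic_model, dataset)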
@@ -248,10 +277,12 @@ best_model = experiment.select(perplexity_criterion)
```

### Alternatively: Use Recipes
+
If you need a topic model now, you can use one of the code snippets we call *recipes*.
```python
from topicnet.cooking_machine.recipes import BaselineRecipe

+
training_pipeline = BaselineRecipe()
EXPERIMENT_PATH = '/home/user/experiment/'

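# The rest of the recipe snippet is outside this hunk (the next hunk header
# shows build_experiment_environment being called). A hedged sketch of how it
# might be completed; the format_recipe call and experiment.run usage are
# assumptions, not part of this diff.
DATASET_PATH = '/Wiki_raw_set/wiki_data.csv'  # assumed, reusing the earlier example path

training_pipeline.format_recipe(dataset_path=DATASET_PATH)
experiment, dataset = training_pipeline.build_experiment_environment(save_path=EXPERIMENT_PATH)
experiment.run(dataset)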
@@ -261,7 +292,8 @@ experiment, dataset = training_pipeline.build_experiment_environment(save_path=E
after that, you can expect the following result:
![run_result](./docs/readme_images/experiment_train.gif)

-### View the results
+
+### View the Results

Browsing the model is easy: create a viewer and call its `view()` method (or `view_from_jupyter()`, which is advised when working in Jupyter Notebook):

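The viewer code itself is not part of this diff. Here is a minimal sketch of browsing a model's top tokens, assuming a `TopTokensViewer` from `topicnet.viewers` and its `num_top_tokens` argument (both assumptions, not shown in the commit):

```python
from topicnet.viewers import TopTokensViewer

# Inspect the top tokens of each topic of the selected model
viewer = TopTokensViewer(best_model, num_top_tokens=10)
viewer.view_from_jupyter()  # or viewer.view() outside of Jupyter
```
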