Neural Language Model
Open Grammar is a biologically inspired language model.
```shell
pip install opengrammar
```
To start training the model, you need to set the path to the MiniPile dataset. You can do this either by defining a `config.toml` file (see the example below) or by setting the `OG_MINIPILE_ROOT` environment variable.
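For example, assuming the dataset has been downloaded to `~/datasets/minipile` (the path here is only illustrative), the environment variable can be set like so:

```shell
export OG_MINIPILE_ROOT=~/datasets/minipile
```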
Then simply run one of the following commands:

```shell
opengrammar train
```

or

```shell
opengrammar train --config /path/to/config.toml
```
Environment variables take precedence over `config.toml` files. For instance, you can set the WandB token like so:

```shell
export OG_WANDB_TOKEN=token_goes_here
```
| Config | Type | Default | Description |
|---|---|---|---|
| `minipile_root` | Path | Not Set | A path to the MiniPile dataset folder. [Mandatory] |
| `wandb_token` | str | Not Set | An API key for WandB. [Optional] |
| `batch` | int | 4 | The number of samples per batch. |
| `lr` | float | 0.00001 | The learning rate. |
| `epochs` | int | 10 | The number of epochs to train for. |
| `hidden_dim` | int | 128 | The number of hidden dimensions used for model layers. |
| `tensor_cores` | bool | true | Enable or disable usage of tensor cores on your GPU. |
| `devices` | int | 1 | The number of GPUs available to use. |
| `debug` | bool | false | Disables expensive code and prints debugging logs. |
An example `config.toml`:

```toml
minipile_root = "resources/minipile"
wandb_token = "very_secret_token"
batch = 4
lr = 0.00001
epochs = 10
hidden_dim = 16
tensor_cores = true
devices = 1
debug = false
```
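Putting the pieces together, a typical run points the CLI at this file; the environment variables shown earlier (`OG_MINIPILE_ROOT`, `OG_WANDB_TOKEN`) can still be exported and will take precedence over the corresponding values in the file. The token and config path below are placeholders:

```shell
export OG_WANDB_TOKEN=token_goes_here
opengrammar train --config /path/to/config.toml
```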