OpenGrammar

Neural Language Model

Description


Open Grammar is a biologically inspired language model.

Installation

pip install opengrammar

Usage

Training

To start training the model, you need to set the path to the Minipile dataset.

You can do this either by defining a config.toml file as shown below, or by setting the OG_MINIPILE_ROOT environment variable.

Then simply run one of the following commands:

opengrammar train

OR

opengrammar train --config /path/to/config.toml

Environment variables take precedence over config.toml files. For instance, you can set the WandB token like so:

export OG_WANDB_TOKEN=token_goes_here

Configuration

| Config | Type | Default | Description |
| --- | --- | --- | --- |
| minipile_root | Path | Not Set | A path to the Minipile dataset folder. [Mandatory] |
| wandb_token | str | Not Set | An API key for WandB. [Optional] |
| batch | int | 4 | The number of samples per batch. |
| lr | float | 0.00001 | The learning rate. |
| epochs | int | 10 | The number of epochs to train for. |
| hidden_dim | int | 128 | The number of hidden dimensions used for model layers. |
| tensor_cores | bool | true | Enable or disable usage of Tensor Cores in your GPU. |
| devices | int | 1 | The number of GPUs available to use. |
| debug | bool | false | Disables expensive code and prints debugging logs. |

Sample Config

minipile_root = "resources/minipile"
wandb_token = "very_secret_token"
batch = 4
lr = 0.00001
epochs = 10
hidden_dim = 16
tensor_cores = true
devices = 1
debug = false
