
Tensorflow Layer Normalization and Hyper Networks
=================================================

Tensorflow implementation of Layer Normalization and Hyper Networks.

This implementation contains:

1. Layer Normalization for GRU
2. Layer Normalization for LSTM
   - Currently, normalizing the cell state `c` produces a lot of NaNs in the model, so it is commented out for now.
3. Hyper Networks for LSTM
4. Layer Normalization and Hyper Networks (combined) for LSTM
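The core idea behind the layer-normalized cells above is to normalize each sample's pre-activations to zero mean and unit variance before applying a learned gain and bias. A minimal numpy sketch of that step (illustrative only; the function name and shapes here are assumptions, not the repository's implementation):

```python
import numpy as np

def layer_norm(x, gain, bias, eps=1e-5):
    # Normalize each row (one sample's pre-activations over the hidden
    # units) to zero mean and unit variance, then apply a learned
    # per-unit gain and bias.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gain * (x - mean) / np.sqrt(var + eps) + bias

# Toy GRU-style pre-activations: batch of 2, hidden size 4.
h = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 2.0, 2.0, 2.0]])
gain = np.ones(4)
bias = np.zeros(4)
out = layer_norm(h, gain, bias)
```

In a layer-normalized GRU or LSTM, this transform is applied separately to each gate's pre-activation; as noted above, applying it to the LSTM cell state `c` is currently disabled because it destabilizes training.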

*(figure: model demo)*

## Prerequisites

## MNIST

To evaluate the new models, we train them on MNIST. Here are the model and results using a Layer Normalized GRU:

*(figures: TensorBoard histogram and scalar summaries)*

## Usage

To train an MNIST model with different cell types:

```shell
$ python mnist.py --hidden 128 --summaries_dir log/ --cell_type LNGRU
```

To train an MNIST model with HyperNetworks:

```shell
$ python mnist.py --hidden 128 --summaries_dir log/ --cell_type HyperLnLSTMCell --layer_norm 0
```

To train an MNIST model with HyperNetworks and Layer Normalization:

```shell
$ python mnist.py --hidden 128 --summaries_dir log/ --cell_type HyperLnLSTMCell --layer_norm 1
```

Available `cell_type` values: `LNGRU`, `LNLSTM`, `LSTM`, `GRU`, `BasicRNN`, `HyperLnLSTMCell`.
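The `HyperLnLSTMCell` option combines the two techniques: a small "hyper" network emits per-unit scale vectors that modulate the main LSTM's pre-activations (the weight-scaling scheme from Ha et al.'s HyperNetworks). A minimal numpy sketch of that scaling step, with all names and shapes assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, hyper = 4, 2

# Main-network weights and input, plus the hyper-LSTM's hidden state z.
W = rng.standard_normal((hidden, hidden))
x = rng.standard_normal(hidden)
z = rng.standard_normal(hyper)

# The hypernetwork maps z to a per-unit scale vector d(z); biasing toward
# ones keeps the scales near identity early in training.
W_hz = 0.1 * rng.standard_normal((hidden, hyper))
b_hz = np.ones(hidden)
d = W_hz @ z + b_hz

# Dynamically scaled pre-activation for one gate of the main LSTM.
pre_act = d * (W @ x)
```

In the combined cell, layer normalization is then applied to these scaled pre-activations before the gate nonlinearities.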

To view the graph in TensorBoard:

```shell
$ tensorboard --logdir log/train/
```

## Todo

1. Add attention-based models (in progress).