Hey Abhi,

```python
# Create NBeatsBlock custom layer
class NBeatsBlock(tf.keras.layers.Layer):
  def __init__(self, # the constructor takes all the hyperparameters for the layer
               input_size: int,
               theta_size: int,
               horizon: int,
               n_neurons: int,
               n_layers: int,
               **kwargs): # the **kwargs argument takes care of all of the arguments for the parent class (input_shape, trainable, name)
    super().__init__(**kwargs)
    self.input_size = input_size
    self.theta_size = theta_size
    self.horizon = horizon
    self.n_neurons = n_neurons
    self.n_layers = n_layers

    # Block contains a stack of fully connected layers, each with ReLU activation
    self.hidden = [tf.keras.layers.Dense(n_neurons, activation="relu") for _ in range(n_layers)]
    # Output of block is a theta layer with linear activation
    self.theta_layer = tf.keras.layers.Dense(theta_size, activation="linear", name="theta")

  def call(self, inputs): # the call method is what runs when the layer is called
    x = inputs
    for layer in self.hidden: # pass inputs through each hidden layer
      x = layer(x)
    theta = self.theta_layer(x)
    # Output the backcast and forecast from theta
    backcast, forecast = theta[:, :self.input_size], theta[:, -self.horizon:]
    return backcast, forecast
```

A fully constructed block is equal to a stack. When constructing the full model, the block is repeated 30 times (see step 4 below), thus equalling 30 stacks.

## Stacking together the N-BEATS model
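Each block's theta split is what the stacking consumes: the first `INPUT_SIZE` values of theta become the backcast and the last `HORIZON` values become the forecast. A minimal numeric sketch of that split, using toy values (the hyperparameter values here are assumed for illustration):

```python
import tensorflow as tf

# Assumed example values: a window of 7 timesteps predicting 1 step ahead
INPUT_SIZE, HORIZON = 7, 1
THETA_SIZE = INPUT_SIZE + HORIZON

theta = tf.range(THETA_SIZE, dtype=tf.float32)[tf.newaxis, :]  # shape (1, 8)
backcast = theta[:, :INPUT_SIZE]  # first INPUT_SIZE values reconstruct the input window
forecast = theta[:, -HORIZON:]    # last HORIZON values are the prediction
print(backcast.shape, forecast.shape)  # (1, 7) (1, 1)
```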
```python
N_STACKS = 30

# 1. Setup N-BEATS Block layer
nbeats_block_layer = NBeatsBlock(input_size=INPUT_SIZE,
                                 theta_size=THETA_SIZE,
                                 horizon=HORIZON,
                                 n_neurons=N_NEURONS,
                                 n_layers=N_LAYERS,
                                 name="InitialBlock")

# 2. Create input to stacks
stack_input = layers.Input(shape=(INPUT_SIZE,), name="stack_input") # shape is a tuple, not including the batch size

# 3. Create initial backcast and forecast input (backwards predictions are referred to as residuals in the paper)
residuals, forecast = nbeats_block_layer(stack_input)

# 4. Create stacks of blocks
for i in range(N_STACKS-1): # first stack is already created in (3)

  # 5. Use the NBeatsBlock to calculate the backcast as well as the block forecast
  backcast, block_forecast = NBeatsBlock(
      input_size=INPUT_SIZE,
      theta_size=THETA_SIZE,
      horizon=HORIZON,
      n_neurons=N_NEURONS,
      n_layers=N_LAYERS,
      name=f"NBeatsBlock_{i}"
  )(residuals) # pass in the residuals (the backcast)

  # 6. Create the double residual stacking
  residuals = layers.subtract([residuals, backcast], name=f"subtract_{i}")
  forecast = layers.add([forecast, block_forecast], name=f"add_{i}")

# 7. Put the stack model together
model_7 = tf.keras.Model(inputs=stack_input,
                         outputs=forecast,
                         name="model_7_N-BEATS")
```

Glad to hear you're enjoying the course :)
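The double residual stacking in step 6 is just elementwise subtraction and addition: each block subtracts what it explained from the running residuals and adds its contribution to the running forecast. A minimal numeric sketch with toy values (the numbers are assumed for illustration, not the model's actual tensors):

```python
import tensorflow as tf

# Toy values standing in for the real tensors
residuals = tf.constant([[10., 20., 30.]])  # running backcast residuals
backcast = tf.constant([[1., 2., 3.]])      # this block's reconstruction of its input
forecast = tf.constant([[5.]])              # running forecast
block_forecast = tf.constant([[0.5]])       # this block's forecast contribution

# Double residual stacking: remove what the block explained, accumulate its forecast
residuals = residuals - backcast            # -> [[9., 18., 27.]]
forecast = forecast + block_forecast        # -> [[5.5]]
print(residuals.numpy(), forecast.numpy())
```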
Hey Daniel,
In Table 18 of the N-BEATS paper, the hyperparameters are listed as follows:

Here, the number of blocks, i.e. the `NBeatsBlock`, is 1 for each stack, with each such block having 4 Block-layers. This forms part of a `NBeatsStack`, of which there are listed to be 30. So as per my understanding, the model architecture is as follows:

- `model_inputs` are passed through the `NBeatsStack`s.
- Within each `NBeatsStack`, we have 1 `NBeatsBlock`, i.e. Block=1.
- Within each `NBeatsBlock`, we have 4 FC layers, i.e. Block-layers=4.

Each stack outputs a block residual (subtracted cumulatively) and a block forecast (aggregated cumulatively).

The hyperparameters:
I coded this as follows. First I create an abstraction for `NBeatsStack`:

The model architecture is coded as follows:
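For reference, here is a minimal sketch of what a `NBeatsStack` abstraction of this kind could look like. Everything here is hypothetical (the original snippets aren't shown in the thread): the class names, hyperparameter values, and the choice to return updated residuals from the stack are all assumptions, and `NBeatsBlock` is repeated inline so the sketch runs standalone.

```python
import tensorflow as tf

# NBeatsBlock as defined earlier in the thread (repeated so the sketch is self-contained)
class NBeatsBlock(tf.keras.layers.Layer):
  def __init__(self, input_size, theta_size, horizon, n_neurons, n_layers, **kwargs):
    super().__init__(**kwargs)
    self.input_size = input_size
    self.horizon = horizon
    self.hidden = [tf.keras.layers.Dense(n_neurons, activation="relu") for _ in range(n_layers)]
    self.theta_layer = tf.keras.layers.Dense(theta_size, activation="linear")

  def call(self, inputs):
    x = inputs
    for layer in self.hidden:
      x = layer(x)
    theta = self.theta_layer(x)
    return theta[:, :self.input_size], theta[:, -self.horizon:]

# Hypothetical stack abstraction: one block per stack (Block=1, per the Table 18 reading)
class NBeatsStack(tf.keras.layers.Layer):
  def __init__(self, input_size, theta_size, horizon, n_neurons, n_layers, **kwargs):
    super().__init__(**kwargs)
    self.block = NBeatsBlock(input_size, theta_size, horizon, n_neurons, n_layers)

  def call(self, residuals):
    # Each stack consumes the running residuals and emits updated residuals plus a forecast
    backcast, block_forecast = self.block(residuals)
    return residuals - backcast, block_forecast

# Quick shape check with assumed hyperparameters
INPUT_SIZE, HORIZON = 7, 1
stack = NBeatsStack(INPUT_SIZE, INPUT_SIZE + HORIZON, HORIZON, n_neurons=16, n_layers=4)
residuals, stack_forecast = stack(tf.random.normal((1, INPUT_SIZE)))
print(residuals.shape, stack_forecast.shape)  # (1, 7) (1, 1)
```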
But this way, I get much worse results, which makes me think I am doing something wrong. In your code, you repeat the number of blocks 30 times, but the paper mentions the number of stacks being 30. Is there something I am missing?
BTW, a huge thanks for this course. I have learned so much from it :)